[prev in list] [next in list] [prev in thread] [next in thread] 

List:       vdsm-devel
Subject:    =?utf-8?q?=5Bovirt-devel=5D?= Error on incremental backup
From:       francesco--- via Devel <devel () ovirt ! org>
Date:       2021-04-22 7:37:22
Message-ID: 20210422073722.17979.55355 () mail ! ovirt ! org
[Download RAW message or body]

Hi all,

I use a custom python script for performing full and incremental backup on few \
hundreds vms. I customized the example script in \
https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/backup_vm.py.

Everything was working fine, but starting from a few days ago, some vms starts to \
fail the backup. I upgraded the engine (self-hosted) from 4.4.4 to 4.4.5 yesterday \
afternoon and today most of the vms failed the incremental and full backup.

The error is triggered when trying to get the backp_service in the following \
function:

```
def backup_disk(disk, backups_service, from_checkpoint_id):
    # We need to set an array because backups_service.add need it
    disks = []
    disks.append(disk)
    add_error = False

    try:
        backup = backups_service.add(
                types.Backup(
                    disks=disks,
                    from_checkpoint_id=from_checkpoint_id
                )
            )
    except Exception as e:
        exception = str(e)
        logging.error("error on backups_service.add, set backup object to False. \
Error: {} ".format(exception))  add_error = True
        backup = False

    if not add_error:
        backup_service = backups_service.backup_service(backup.id)
        logging.info("Backup phase: {}".format(backup.phase))
        while backup.phase != types.BackupPhase.READY:
            time.sleep(1)
            try:
                backup = backup_service.get()
            except sdk.NotFoundError:
                logging.error("backup_service NotFoundError, set backup object to \
False")  backup = False
                break

        if backup and backup.to_checkpoint_id is not None:
            logging.info("Created checkpoint {}".format(backup.to_checkpoint_id))

    return backup
```

Here's the error in the engine logs. 

```
2021-04-21 11:31:21,750+02 INFO  \
[org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] \
(EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1) [] \
Thread pool 'default' is using 0 threads out of 1, 5 threads waiting for tasks. \
2021-04-21 11:31:21,750+02 INFO  \
[org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] \
(EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1) [] \
Thread pool 'engine' is using 0 threads out of 500, 50 threads waiting for tasks and \
0 tasks in queue. 2021-04-21 11:31:21,750+02 INFO  \
[org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] \
(EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1) [] \
Thread pool 'engineScheduledThreadPool' is using 2 threads out of 1, 98 threads \
waiting for tasks. 2021-04-21 11:31:21,750+02 INFO  \
[org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] \
(EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1) [] \
Thread pool 'engineThreadMonitoringThreadPool' is using 1 threads out of 1, 0 threads \
waiting for tasks. 2021-04-21 11:31:21,750+02 INFO  \
[org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] \
(EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1) [] \
Thread pool 'hostUpdatesChecker' is using 0 threads out of 5, 5 threads waiting for \
tasks. 2021-04-21 11:31:59,013+02 INFO  \
[org.ovirt.engine.core.sso.service.AuthenticationService] (default task-19) [] User \
admin@internal-authz with profile [internal] successfully logged in with scopes: \
ovirt-app-api ovirt-ext=token-info:authz-search \
ovirt-ext=token-info:public-authz-search ovirt-ext=token-info:validate \
ovirt-ext=token:password-access 2021-04-21 11:31:59,124+02 INFO  \
[org.ovirt.engine.core.bll.aaa.CreateUserSessionCommand] (default task-19) [43d5f322] \
Running command: CreateUserSessionCommand internal: false. 2021-04-21 11:31:59,157+02 \
INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default \
task-19) [43d5f322] EVENT_ID: USER_VDC_LOGIN(30), User admin@internal-authz \
connecting from '5.19.121.125' using session \
'VN/aBJxvgcG1WBY9OQSjUnHL7HaSPhbLSX52inLiYFGG/gubMeZiv9Tu+3D4htqVwrb/kqlvMzBL3WAQXb+3Lg==' \
logged in. 2021-04-21 11:31:59,534+02 WARN  \
[org.ovirt.engine.api.restapi.util.LinkHelper] (default task-19) [] Can't find \
relative path for class \
"org.ovirt.engine.api.resource.StorageDomainVmDiskAttachmentsResource", will return \
null 2021-04-21 11:31:59,535+02 WARN  [org.ovirt.engine.api.restapi.util.LinkHelper] \
(default task-19) [] Can't find relative path for class \
"org.ovirt.engine.api.resource.StorageDomainVmDiskAttachmentsResource", will return \
null 2021-04-21 11:32:00,670+02 INFO  \
[org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Lock Acquired to object \
'EngineLock:{exclusiveLocks='[0ff73c3a-fb49-4148-a3b4-6ce790523a49=DISK]', \
sharedLocks='[94ec9758-cfc1-41ba-ba71-e339d8df8cee=VM]'}' 2021-04-21 11:32:00,863+02 \
INFO  [org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default \
task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Running command: StartVmBackupCommand \
internal: false. Entities affected :  ID: 94ec9758-cfc1-41ba-ba71-e339d8df8cee Type: \
VMAction group BACKUP_DISK with role type ADMIN,  ID: \
0ff73c3a-fb49-4148-a3b4-6ce790523a49 Type: DiskAction group BACKUP_DISK with role \
type ADMIN 2021-04-21 11:32:00,863+02 INFO  \
[org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Creating VmBackup entity for VM \
'94ec9758-cfc1-41ba-ba71-e339d8df8cee' 2021-04-21 11:32:00,956+02 INFO  \
[org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Created VmBackup entity \
'ee02f949-ccb9-452f-a1bb-8e05d0684b0f' 2021-04-21 11:32:00,956+02 INFO  \
[org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Redefine previous VM checkpoints for VM \
'94ec9758-cfc1-41ba-ba71-e339d8df8cee' 2021-04-21 11:32:01,316+02 INFO  \
[org.ovirt.engine.core.bll.storage.backup.RedefineVmCheckpointCommand] (default \
task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Running command: \
RedefineVmCheckpointCommand internal: true. Entities affected :  ID: \
94ec9758-cfc1-41ba-ba71-e339d8df8cee Type: VMAction group BACKUP_DISK with role type \
ADMIN,  ID: 0ff73c3a-fb49-4148-a3b4-6ce790523a49 Type: DiskAction group BACKUP_DISK \
with role type ADMIN 2021-04-21 11:32:01,318+02 INFO  \
[org.ovirt.engine.core.bll.storage.backup.RedefineVmCheckpointCommand] (default \
task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] No previous VM checkpoints found for \
VM '94ec9758-cfc1-41ba-ba71-e339d8df8cee', skipping redefining VM checkpoints \
2021-04-21 11:32:01,567+02 INFO  \
[org.ovirt.engine.core.bll.storage.backup.RedefineVmCheckpointCommand] (default \
task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Lock freed to object \
'EngineLock:{exclusiveLocks='[0ff73c3a-fb49-4148-a3b4-6ce790523a49=DISK]', \
sharedLocks='[94ec9758-cfc1-41ba-ba71-e339d8df8cee=VM]'}' 2021-04-21 11:32:01,567+02 \
INFO  [org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default \
task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Successfully redefined previous VM \
checkpoints for VM '94ec9758-cfc1-41ba-ba71-e339d8df8cee' 2021-04-21 11:32:01,567+02 \
INFO  [org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default \
task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Creating VmCheckpoint entity for VM \
'94ec9758-cfc1-41ba-ba71-e339d8df8cee' 2021-04-21 11:32:01,602+02 INFO  \
[org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Created VmCheckpoint entity \
'b67a0436-42fb-4715-a0a5-6bfd9cb09214' 2021-04-21 11:32:01,736+02 INFO  \
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default \
task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] EVENT_ID: VM_BACKUP_STARTED(10,790), \
Backup ee02f949-ccb9-452f-a1bb-8e05d0684b0f for VM vm1.cloudserver.tld started (User: \
admin@internal-authz). 2021-04-21 11:32:03,789+02 INFO  \
[org.ovirt.engine.core.bll.storage.backup.CreateScratchDisksCommand] \
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Running command: CreateScratchDisksCommand \
internal: true. Entities affected :  ID: 94ec9758-cfc1-41ba-ba71-e339d8df8cee Type: \
VMAction group BACKUP_DISK with role type ADMIN 2021-04-21 11:32:03,910+02 INFO  \
[org.ovirt.engine.core.bll.storage.backup.CreateScratchDiskCommand] \
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Running command: CreateScratchDiskCommand \
internal: true. Entities affected :  ID: 4f6d464e-531a-431a-bfcc-ca9dfd1d0c9a Type: \
StorageAction group CREATE_DISK with role type USER 2021-04-21 11:32:03,910+02 INFO  \
[org.ovirt.engine.core.bll.storage.backup.CreateScratchDiskCommand] \
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Creating scratch disk for disk ID \
0ff73c3a-fb49-4148-a3b4-6ce790523a49 2021-04-21 11:32:03,977+02 INFO  \
[org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] \
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Running command: AddDiskCommand internal: \
true. Entities affected :  ID: 4f6d464e-531a-431a-bfcc-ca9dfd1d0c9a Type: \
StorageAction group CREATE_DISK with role type USER 2021-04-21 11:32:04,038+02 INFO  \
[org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand] \
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Running command: AddImageFromScratchCommand \
internal: true. Entities affected :  ID: 4f6d464e-531a-431a-bfcc-ca9dfd1d0c9a Type: \
Storage 2021-04-21 11:32:04,109+02 WARN  \
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] \
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] EVENT_ID: \
FAILED_TO_STORE_ENTIRE_DISK_FIELD_IN_DISK_DESCRIPTION_METADATA(1,026), Failed to \
store field DiskDescription as a part of VM vm1.cloudserver.tld backup \
ee02f949-ccb9-452f-a1bb-8e05d0684b0f scratch disk for vm1.cloudserver.tld-0's \
description metadata due to storage space limitations. The field DiskDescription will \
be truncated. 2021-04-21 11:32:04,111+02 INFO  \
[org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand] \
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] START, CreateVolumeVDSCommand( \
CreateVolumeVDSCommandParameters:{storagePoolId='0532fe8c-d841-49ae-9615-45a7b865f3e3', \
ignoreFailoverLimit='false', storageDomainId='4f6d464e-531a-431a-bfcc-ca9dfd1d0c9a', \
imageGroupId='a419fb9b-486d-4a92-83d0-ef6c845ae5ba', imageSizeInBytes='268435456000', \
volumeFormat='COW', newImageId='0cbd890e-5bdc-4e8e-9e46-53bf5093cfd0', \
imageType='Sparse', newImageDescription='{"DiskAlias":"VM vm1.cloudserver.tld backup \
ee02f949-ccb9-452f-a1bb-8e05d0684b0f scratch disk for \
vm1.cloudserver.tld-0","DiskDescription":"Backup ee02f949-ccb9-452f-a1bb-8e05d0684b0f \
scratch di"}', imageInitialSizeInBytes='0', \
imageId='00000000-0000-0000-0000-000000000000', \
sourceImageGroupId='00000000-0000-0000-0000-000000000000', \
shouldAddBitmaps='false'}), log id: 34e22243 2021-04-21 11:32:04,137+02 ERROR \
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] \
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] EVENT_ID: IRS_BROKER_COMMAND_FAILURE(10,803), \
VDSM command CreateVolumeVDS failed: Invalid parameter: 'DiskType=SCRD' 2021-04-21 \
11:32:04,137+02 ERROR \
[org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand] \
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Command 'CreateVolumeVDSCommand( \
CreateVolumeVDSCommandParameters:{storagePoolId='0532fe8c-d841-49ae-9615-45a7b865f3e3', \
ignoreFailoverLimit='false', storageDomainId='4f6d464e-531a-431a-bfcc-ca9dfd1d0c9a', \
imageGroupId='a419fb9b-486d-4a92-83d0-ef6c845ae5ba', imageSizeInBytes='268435456000', \
volumeFormat='COW', newImageId='0cbd890e-5bdc-4e8e-9e46-53bf5093cfd0', \
imageType='Sparse', newImageDescription='{"DiskAlias":"VM vm1.cloudserver.tld backup \
ee02f949-ccb9-452f-a1bb-8e05d0684b0f scratch disk for \
vm1.cloudserver.tld-0","DiskDescription":"Backup ee02f949-ccb9-452f-a1bb-8e05d0684b0f \
scratch di"}', imageInitialSizeInBytes='0', \
imageId='00000000-0000-0000-0000-000000000000', \
sourceImageGroupId='00000000-0000-0000-0000-000000000000', \
shouldAddBitmaps='false'})' execution failed: I  RSGenericException: \
IRSErrorException: Invalid parameter: 'DiskType=SCRD' 2021-04-21 11:32:04,137+02 INFO \
[org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand] \
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] FINISH, CreateVolumeVDSCommand, return: \
00000000-0000-0000-0000-000000000000, log id: 34e22243 2021-04-21 11:32:04,137+02 \
ERROR [org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand] \
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Command \
'org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand' failed: \
EngineException: org.ovirt.engine.core.vdsbroker.irsbroker.IrsOperationFailedNoFailoverException: \
IRSGenericException: IRSErrorException: Invalid parameter: 'DiskType=SCRD' (Failed \
with error InvalidParameterException and code 1000) 2021-04-21 11:32:04,170+02 INFO  \
[org.ovirt.engine.core.bll.tasks.AsyncTaskManager] \
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) \
[21bb0fd3-06fe-49c8-95ec-94856edabc41] Removed task \
'cb1f10fc-13e6-4ead-a33f-fad342a69ea2' from DataBase ```

Here's the SPM host vdsm logs:

```
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 887, in _run
    return fn(*args, **kargs)
  File "<decorator-gen-61>", line 2, in createVolume
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 1510, in \
createVolume  add_bitmaps=addBitmaps)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 1002, in \
validateCreateVolumeParams  add_bitmaps=add_bitmaps)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 649, in \
validateCreateVolumeParams  raise se.InvalidParameterException("DiskType", diskType)
vdsm.storage.exception.InvalidParameterException: Invalid parameter: 'DiskType=SCRD
```

It's a known bug in 4.4.4/4.4.5? Any suggestion?

If needed any other info lemme know, I'll provide you asap.

Thank you for your time
Francesco
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/WYWMHZFHBVKAVASDWLFSTHXXGAG26K5D/



[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic