Message ID | 20200429111539.42103-3-kwolf@redhat.com (mailing list archive)
---|---
State | New, archived
Series | backup: Make sure that source and target size match
29.04.2020 14:15, Kevin Wolf wrote:
> Since the introduction of a backup filter node, the backup block job
> crashes when the target image is smaller than the source image because
> it will try to write after the end of the target node without having
> BLK_PERM_RESIZE. (Previously, the BlockBackend layer would have caught
> this and errored out gracefully.)
>
> We can fix this and even do better than the old behaviour: Check that
> source and target have the same image size at the start of the block job
> and unshare BLK_PERM_RESIZE. This will immediately error out when
> starting the job instead of only when writing to a block that doesn't
> exist in the target.
>
> A longer target than source would technically work because we would
> never write to blocks that don't exist, but semantically it is invalid,
> too, because a backup is supposed to create a copy, not just an image
> that starts with a copy.
>
> The bugs were introduced in commits 2c8074c45 (BLK_PERM_RESIZE is shared

No, it was unshared by blks in block-copy since this commit...

> since this commit) and 00e30f05d (BdrvChild instead of BlockBackend
> turns I/O errors into assertion failures).

...and here it becomes shared. So it seems only 00e30f05d is broken and
introduces both problems.

> Fixes: 2c8074c453ff13a94bd08ec26061917670ec03be
> Fixes: 00e30f05de1d19586345ec373970ef4c192c6270
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1778593
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/backup-top.c | 12 ++++++++----
>  block/backup.c     | 14 +++++++++++++-
>  2 files changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/block/backup-top.c b/block/backup-top.c
> index 3b50c06e2c..0e515a7705 100644
> --- a/block/backup-top.c
> +++ b/block/backup-top.c
> @@ -148,8 +148,10 @@ static void backup_top_child_perm(BlockDriverState *bs, BdrvChild *c,
>           *
>           * Share write to target (child_file), to not interfere
>           * with guest writes to its disk which may be in target backing chain.
> +         * Can't resize during a backup block job because we check the size
> +         * only upfront.
>           */
> -        *nshared = BLK_PERM_ALL;
> +        *nshared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
>          *nperm = BLK_PERM_WRITE;
>      } else {
>          /* Source child */
> @@ -192,11 +194,13 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
>  {
>      Error *local_err = NULL;
>      BDRVBackupTopState *state;
> -    BlockDriverState *top = bdrv_new_open_driver(&bdrv_backup_top_filter,
> -                                                 filter_node_name,
> -                                                 BDRV_O_RDWR, errp);
> +    BlockDriverState *top;
>      bool appended = false;
>
> +    assert(source->total_sectors == target->total_sectors);

Is it correct to use total_sectors directly and not bdrv_getlength()?
Anyway, using bdrv_getlength() seems safer, and would help us if we move
to a byte-accurate block layer at some moment in the future. Hmm, but
total_sectors is used directly elsewhere in this function anyway, so OK.

> +
> +    top = bdrv_new_open_driver(&bdrv_backup_top_filter, filter_node_name,
> +                                   BDRV_O_RDWR, errp);

Alignment broken. With it fixed:
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

>      if (!top) {
>          return NULL;
>      }
> diff --git a/block/backup.c b/block/backup.c
> index c4c3b8cd46..4f13bb20a5 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -340,7 +340,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>                             BlockCompletionFunc *cb, void *opaque,
>                             JobTxn *txn, Error **errp)
>  {
> -    int64_t len;
> +    int64_t len, target_len;
>      BackupBlockJob *job = NULL;
>      int64_t cluster_size;
>      BdrvRequestFlags write_flags;
> @@ -405,6 +405,18 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>          goto error;
>      }
>
> +    target_len = bdrv_getlength(target);
> +    if (target_len < 0) {
> +        error_setg_errno(errp, -target_len, "Unable to get length for '%s'",
> +                         bdrv_get_device_or_node_name(bs));
> +        goto error;
> +    }
> +
> +    if (target_len != len) {
> +        error_setg(errp, "Source and target image have different sizes");
> +        goto error;
> +    }
> +
>      cluster_size = backup_calculate_cluster_size(target, errp);
>      if (cluster_size < 0) {
>          goto error;
diff --git a/block/backup-top.c b/block/backup-top.c
index 3b50c06e2c..0e515a7705 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -148,8 +148,10 @@ static void backup_top_child_perm(BlockDriverState *bs, BdrvChild *c,
          *
          * Share write to target (child_file), to not interfere
          * with guest writes to its disk which may be in target backing chain.
+         * Can't resize during a backup block job because we check the size
+         * only upfront.
          */
-        *nshared = BLK_PERM_ALL;
+        *nshared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
         *nperm = BLK_PERM_WRITE;
     } else {
         /* Source child */
@@ -192,11 +194,13 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
 {
     Error *local_err = NULL;
     BDRVBackupTopState *state;
-    BlockDriverState *top = bdrv_new_open_driver(&bdrv_backup_top_filter,
-                                                 filter_node_name,
-                                                 BDRV_O_RDWR, errp);
+    BlockDriverState *top;
     bool appended = false;
 
+    assert(source->total_sectors == target->total_sectors);
+
+    top = bdrv_new_open_driver(&bdrv_backup_top_filter, filter_node_name,
+                                   BDRV_O_RDWR, errp);
     if (!top) {
         return NULL;
     }
diff --git a/block/backup.c b/block/backup.c
index c4c3b8cd46..4f13bb20a5 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -340,7 +340,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
                            BlockCompletionFunc *cb, void *opaque,
                            JobTxn *txn, Error **errp)
 {
-    int64_t len;
+    int64_t len, target_len;
     BackupBlockJob *job = NULL;
     int64_t cluster_size;
     BdrvRequestFlags write_flags;
@@ -405,6 +405,18 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
         goto error;
     }
 
+    target_len = bdrv_getlength(target);
+    if (target_len < 0) {
+        error_setg_errno(errp, -target_len, "Unable to get length for '%s'",
+                         bdrv_get_device_or_node_name(bs));
+        goto error;
+    }
+
+    if (target_len != len) {
+        error_setg(errp, "Source and target image have different sizes");
+        goto error;
+    }
+
     cluster_size = backup_calculate_cluster_size(target, errp);
     if (cluster_size < 0) {
         goto error;
Since the introduction of a backup filter node, the backup block job
crashes when the target image is smaller than the source image because
it will try to write after the end of the target node without having
BLK_PERM_RESIZE. (Previously, the BlockBackend layer would have caught
this and errored out gracefully.)

We can fix this and even do better than the old behaviour: Check that
source and target have the same image size at the start of the block job
and unshare BLK_PERM_RESIZE. This will immediately error out when
starting the job instead of only when writing to a block that doesn't
exist in the target.

A longer target than source would technically work because we would
never write to blocks that don't exist, but semantically it is invalid,
too, because a backup is supposed to create a copy, not just an image
that starts with a copy.

The bugs were introduced in commits 2c8074c45 (BLK_PERM_RESIZE is shared
since this commit) and 00e30f05d (BdrvChild instead of BlockBackend
turns I/O errors into assertion failures).

Fixes: 2c8074c453ff13a94bd08ec26061917670ec03be
Fixes: 00e30f05de1d19586345ec373970ef4c192c6270
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1778593
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/backup-top.c | 12 ++++++++----
 block/backup.c     | 14 +++++++++++++-
 2 files changed, 21 insertions(+), 5 deletions(-)