
[v2,3/4] backup: Make sure that source and target size match

Message ID 20200430142755.315494-4-kwolf@redhat.com (mailing list archive)
State New, archived
Series: backup: Make sure that source and target size match

Commit Message

Kevin Wolf April 30, 2020, 2:27 p.m. UTC
Since the introduction of a backup filter node in commit 00e30f05d, the
backup block job crashes when the target image is smaller than the
source image because it will try to write after the end of the target
node without having BLK_PERM_RESIZE. (Previously, the BlockBackend layer
would have caught this and errored out gracefully.)

We can fix this and even do better than the old behaviour: Check that
source and target have the same image size at the start of the block job
and unshare BLK_PERM_RESIZE. (This permission was already unshared
before the same commit 00e30f05d, but the BlockBackend that was used to
make the restriction was removed without a replacement.) This will
immediately error out when starting the job instead of only when writing
to a block that doesn't exist in the target.

A target that is longer than the source would technically work because we
would never write to blocks that don't exist, but semantically this is
invalid, too, because a backup is supposed to create a copy, not just an
image that starts with a copy.

Fixes: 00e30f05de1d19586345ec373970ef4c192c6270
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1778593
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/backup-top.c | 14 +++++++++-----
 block/backup.c     | 14 +++++++++++++-
 2 files changed, 22 insertions(+), 6 deletions(-)

Comments

Vladimir Sementsov-Ogievskiy April 30, 2020, 6:21 p.m. UTC | #1
30.04.2020 17:27, Kevin Wolf wrote:
> [...]

I'm OK with it as is, as it fixes the bug:

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

still, some notes below


> ---
>   block/backup-top.c | 14 +++++++++-----
>   block/backup.c     | 14 +++++++++++++-
>   2 files changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/block/backup-top.c b/block/backup-top.c
> index 3b50c06e2c..79b268e6dc 100644
> --- a/block/backup-top.c
> +++ b/block/backup-top.c
> @@ -148,8 +148,10 @@ static void backup_top_child_perm(BlockDriverState *bs, BdrvChild *c,
>            *
>            * Share write to target (child_file), to not interfere
>            * with guest writes to its disk which may be in target backing chain.
> +         * Can't resize during a backup block job because we check the size
> +         * only upfront.
>            */
> -        *nshared = BLK_PERM_ALL;
> +        *nshared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
>           *nperm = BLK_PERM_WRITE;
>       } else {
>           /* Source child */
> @@ -159,7 +161,7 @@ static void backup_top_child_perm(BlockDriverState *bs, BdrvChild *c,
>           if (perm & BLK_PERM_WRITE) {
>               *nperm = *nperm | BLK_PERM_CONSISTENT_READ;
>           }
> -        *nshared &= ~BLK_PERM_WRITE;
> +        *nshared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE);
>       }
>   }
>   
> @@ -192,11 +194,13 @@ BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
>   {
>       Error *local_err = NULL;
>       BDRVBackupTopState *state;
> -    BlockDriverState *top = bdrv_new_open_driver(&bdrv_backup_top_filter,
> -                                                 filter_node_name,
> -                                                 BDRV_O_RDWR, errp);
> +    BlockDriverState *top;
>       bool appended = false;
>   
> +    assert(source->total_sectors == target->total_sectors);

Maybe it's better to error out, just to keep backup-top independent. Still,
it's not really needed now, as we have only one caller. And this function
has to be refactored anyway when this filter is made public (open() and
close() will have to appear, so this code will be rewritten anyway).

And another thought: the permissions we declared above will only be
activated after a successful bdrv_child_refresh_perms(). I think some kind
of race is possible, so that the size changes before the actual permission
activation. So it may be good to double-check the sizes after
bdrv_child_refresh_perms(). But that's a kind of paranoia.

Also, a third thought: the restricted permissions don't save us from
resizing the source through exactly this node, do they? Hmm, but your test
works somehow. And (I assume) it worked in a previous patch version
without unsharing on the source.

Ha, but bdrv_co_truncate just can't work on backup-top, because it doesn't
have a file child. But if we fix bdrv_co_truncate to skip filters, we'll
need to define .bdrv_co_truncate in backup_top, which would return
something like -EBUSY. Or just -ENOTSUP, it doesn't matter.
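
For illustration, a minimal stub could look something like this (a
hypothetical sketch, not part of this patch; the signature is modelled on
the .bdrv_co_truncate callback of this era, so treat the details as an
assumption):

/* Hypothetical stub: refuse to resize the source through backup-top.
 * Signature modelled on BlockDriver.bdrv_co_truncate; an assumption. */
static int coroutine_fn backup_top_co_truncate(BlockDriverState *bs,
                                               int64_t offset, bool exact,
                                               PreallocMode prealloc,
                                               Error **errp)
{
    error_setg(errp, "Cannot resize the source of an active backup job");
    return -ENOTSUP;
}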

> +
> +    top = bdrv_new_open_driver(&bdrv_backup_top_filter, filter_node_name,
> +                               BDRV_O_RDWR, errp);
>       if (!top) {
>           return NULL;
>       }
> diff --git a/block/backup.c b/block/backup.c
> index c4c3b8cd46..4f13bb20a5 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -340,7 +340,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>                     BlockCompletionFunc *cb, void *opaque,
>                     JobTxn *txn, Error **errp)
>   {
> -    int64_t len;
> +    int64_t len, target_len;
>       BackupBlockJob *job = NULL;
>       int64_t cluster_size;
>       BdrvRequestFlags write_flags;
> @@ -405,6 +405,18 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>           goto error;
>       }
>   
> +    target_len = bdrv_getlength(target);
> +    if (target_len < 0) {
> +        error_setg_errno(errp, -target_len, "Unable to get length for '%s'",
> +                         bdrv_get_device_or_node_name(target));
> +        goto error;
> +    }
> +
> +    if (target_len != len) {
> +        error_setg(errp, "Source and target image have different sizes");
> +        goto error;
> +    }
> +
>       cluster_size = backup_calculate_cluster_size(target, errp);
>       if (cluster_size < 0) {
>           goto error;
>
Kevin Wolf May 5, 2020, 10:03 a.m. UTC | #2
Am 30.04.2020 um 20:21 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 30.04.2020 17:27, Kevin Wolf wrote:
> > [...]
> 
> I'm OK with it as is, as it fixes the bug:
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> still, some notes below
> 
> 
> > [...]
> > +    assert(source->total_sectors == target->total_sectors);
> 
> Maybe it's better to error out, just to keep backup-top independent. Still,
> it's not really needed now, as we have only one caller. And this function
> has to be refactored anyway when this filter is made public (open() and
> close() will have to appear, so this code will be rewritten anyway).

Yes, the whole function only works because it's used in this restricted
context today. For example, we only know that total_sectors is up to
date because the caller has called bdrv_getlength() just a moment ago.

I think fixing this would be beyond the scope of this patch, but
certainly a good idea anyway.

> And another thought: the permissions we declared above will only be
> activated after a successful bdrv_child_refresh_perms(). I think some kind
> of race is possible, so that the size changes before the actual permission
> activation. So it may be good to double-check the sizes after
> bdrv_child_refresh_perms(). But that's a kind of paranoia.

We're not in coroutine context, so we can't yield. I don't see who could
change the size in parallel (apart from an external process, but an
external process can mess up anything).

When we make backup-top an independent driver, instead of double
checking (what would you do on error?), maybe we could move the size
initialisation (then with bdrv_getlength()) to after
bdrv_child_refresh_perms().
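
Roughly this order, I mean (a sketch of the idea only; the surrounding
variable names are assumptions about the future driver code):

/* Sketch: activate the permissions first, then pin the size. */
ret = bdrv_child_refresh_perms(top, top->backing, errp);
if (ret < 0) {
    goto fail;
}
state->len = bdrv_getlength(target);    /* size is stable from here on */
if (state->len < 0) {
    goto fail;
}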

> Also, a third thought: the restricted permissions don't save us from
> resizing the source through exactly this node, do they? Hmm, but your test
> works somehow. And (I assume) it worked in a previous patch version
> without unsharing on the source.
> 
> Ha, but bdrv_co_truncate just can't work on backup-top, because it doesn't
> have a file child. But if we fix bdrv_co_truncate to skip filters, we'll
> need to define .bdrv_co_truncate in backup_top, which would return
> something like -EBUSY. Or just -ENOTSUP, it doesn't matter.

Maybe this is a sign that bdrv_co_truncate shouldn't automatically skip
filters because filters might depend on a fixed size?

Or we could make the automatic skipping depend on having BLK_PERM_RESIZE
for the child. If the filter doesn't have the permission, we must not
call truncate for its child (it would crash). Then backup-top and
similar filters must just be careful not to take RESIZE permissions.
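
Something along these lines in bdrv_co_truncate, perhaps (a sketch of the
idea, not a tested change; the error code and message are placeholders):

/* Sketch: forward a truncate into a filter's child only if the filter
 * itself holds BLK_PERM_RESIZE on that child. */
if (bs->drv->is_filter && bs->file) {
    if (!(bs->file->perm & BLK_PERM_RESIZE)) {
        error_setg(errp, "Filter node '%s' does not allow resizing",
                   bdrv_get_node_name(bs));
        return -EPERM;
    }
    return bdrv_co_truncate(bs->file, offset, exact, prealloc, errp);
}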

Kevin
Vladimir Sementsov-Ogievskiy May 6, 2020, 6:07 a.m. UTC | #3
05.05.2020 13:03, Kevin Wolf wrote:
> Am 30.04.2020 um 20:21 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 30.04.2020 17:27, Kevin Wolf wrote:
> [...]
> 
> Maybe this is a sign that bdrv_co_truncate shouldn't automatically skip
> filters because filters might depend on a fixed size?
> 
> Or we could make the automatic skipping depend on having BLK_PERM_RESIZE
> for the child. If the filter doesn't have the permission, we must not
> call truncate for its child (it would crash). Then backup-top and
> similar filters must just be careful not to take RESIZE permissions.
> 

Hmm, this should work. Still, it's a workaround; it seems to fall
outside the concept of the permission system.

I think the problem is that .bdrv_child_perm can't return an error.
The handler answers the question:

- Hi, we are your owners and we want the following cumulative permissions on you. Then, which permissions do you want on your child?

And the handler can't answer: "Hi, you guys want too much, I refuse to play by your rules".
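
For reference, here is the callback in question, roughly as declared in
block_int.h at this time (treat the exact parameter list as an
approximation). Note the void return type: there is simply no way to
report an error.

void (*bdrv_child_perm)(BlockDriverState *bs, BdrvChild *c,
                        const BdrvChildRole *role,
                        BlockReopenQueue *reopen_queue,
                        uint64_t perm, uint64_t shared,
                        uint64_t *nperm, uint64_t *nshared);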
Kevin Wolf May 6, 2020, 8:02 a.m. UTC | #4
Am 06.05.2020 um 08:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 05.05.2020 13:03, Kevin Wolf wrote:
> > Am 30.04.2020 um 20:21 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > 30.04.2020 17:27, Kevin Wolf wrote:
> > [...]
> >
> > Maybe this is a sign that bdrv_co_truncate shouldn't automatically skip
> > filters because filters might depend on a fixed size?
> > 
> > Or we could make the automatic skipping depend on having BLK_PERM_RESIZE
> > for the child. If the filter doesn't have the permission, we must not
> > call truncate for its child (it would crash). Then backup-top and
> > similar filters must just be careful not to take RESIZE permissions.
> > 
> 
> Hmm, this should work. Still, it's a workaround; it seems to fall
> outside the concept of the permission system.

I'm not sure about this. I see the problem more with unconditionally
skipping the filter without checking whether the operation is even
allowed on the filtered child.

> I think the problem is that .bdrv_child_perm can't return an error.
> The handler answers the question:
> 
> - Hi, we are your owners and we want the following cumulative
> permissions on you. Then, which permissions do you want on your child?
> 
> And the handler can't answer: "Hi, you guys want too much, I refuse to
> play by your rules".

It can implement .bdrv_check_perm to do that. It's just a bit more work
for each driver.
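
For example, roughly (a hypothetical sketch; backup-top does not implement
this today, and the veto condition is just for illustration):

/* Hypothetical: veto an unwanted permission from .bdrv_check_perm. */
static int backup_top_check_perm(BlockDriverState *bs, uint64_t perm,
                                 uint64_t shared, Error **errp)
{
    if (perm & BLK_PERM_RESIZE) {
        error_setg(errp, "Resizing is not supported by backup-top");
        return -EBUSY;
    }
    return 0;
}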

Kevin
Vladimir Sementsov-Ogievskiy May 6, 2020, 8:21 a.m. UTC | #5
06.05.2020 11:02, Kevin Wolf wrote:
> Am 06.05.2020 um 08:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 05.05.2020 13:03, Kevin Wolf wrote:
>>> Am 30.04.2020 um 20:21 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> 30.04.2020 17:27, Kevin Wolf wrote:
>>> [...]
>>
>> Hmm, this should work. Still, it's a workaround; it seems to fall
>> outside the concept of the permission system.
> 
> I'm not sure about this. I see the problem more with unconditionally
> skipping the filter without checking whether the operation is even
> allowed on the filtered child.

Agree. Blindly skipping the filter is wrong anyway.

> 
>> I think the problem is that .bdrv_child_perm can't return an error.
>> The handler answers the question:
>>
>> - Hi, we are your owners and we want the following cumulative
>> permissions on you. Then, which permissions do you want on your child?
>>
>> And the handler can't answer: "Hi, you guys want too much, I refuse to
>> play by your rules".
> 
> It can implement .bdrv_check_perm to do that. It's just a bit more work
> for each driver.
> 

OK, so it actually can.

Patch

diff --git a/block/backup-top.c b/block/backup-top.c
index 3b50c06e2c..79b268e6dc 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -148,8 +148,10 @@  static void backup_top_child_perm(BlockDriverState *bs, BdrvChild *c,
          *
          * Share write to target (child_file), to not interfere
          * with guest writes to its disk which may be in target backing chain.
+         * Can't resize during a backup block job because we check the size
+         * only upfront.
          */
-        *nshared = BLK_PERM_ALL;
+        *nshared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
         *nperm = BLK_PERM_WRITE;
     } else {
         /* Source child */
@@ -159,7 +161,7 @@  static void backup_top_child_perm(BlockDriverState *bs, BdrvChild *c,
         if (perm & BLK_PERM_WRITE) {
             *nperm = *nperm | BLK_PERM_CONSISTENT_READ;
         }
-        *nshared &= ~BLK_PERM_WRITE;
+        *nshared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE);
     }
 }
 
@@ -192,11 +194,13 @@  BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
 {
     Error *local_err = NULL;
     BDRVBackupTopState *state;
-    BlockDriverState *top = bdrv_new_open_driver(&bdrv_backup_top_filter,
-                                                 filter_node_name,
-                                                 BDRV_O_RDWR, errp);
+    BlockDriverState *top;
     bool appended = false;
 
+    assert(source->total_sectors == target->total_sectors);
+
+    top = bdrv_new_open_driver(&bdrv_backup_top_filter, filter_node_name,
+                               BDRV_O_RDWR, errp);
     if (!top) {
         return NULL;
     }
diff --git a/block/backup.c b/block/backup.c
index c4c3b8cd46..4f13bb20a5 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -340,7 +340,7 @@  BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
                   BlockCompletionFunc *cb, void *opaque,
                   JobTxn *txn, Error **errp)
 {
-    int64_t len;
+    int64_t len, target_len;
     BackupBlockJob *job = NULL;
     int64_t cluster_size;
     BdrvRequestFlags write_flags;
@@ -405,6 +405,18 @@  BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
         goto error;
     }
 
+    target_len = bdrv_getlength(target);
+    if (target_len < 0) {
+        error_setg_errno(errp, -target_len, "Unable to get length for '%s'",
+                         bdrv_get_device_or_node_name(target));
+        goto error;
+    }
+
+    if (target_len != len) {
+        error_setg(errp, "Source and target image have different sizes");
+        goto error;
+    }
+
     cluster_size = backup_calculate_cluster_size(target, errp);
     if (cluster_size < 0) {
         goto error;