diff mbox series

[for-5.0,v2,15/23] mirror: Prevent loops

Message ID 20191111160216.197086-16-mreitz@redhat.com (mailing list archive)
State New, archived
Headers show
Series block: Fix check_to_replace_node() | expand

Commit Message

Max Reitz Nov. 11, 2019, 4:02 p.m. UTC
While bdrv_replace_node() will not follow through with it, a specific
@replaces asks the mirror job to create a loop.

For example, say both the source and the target share a child where the
source is a filter; by letting @replaces point to the common child, you
ask for a loop.

Or if you use @replaces in drive-mirror with sync=none and
mode=absolute-paths, you generally ask for a loop (@replaces must point
to a child of the source, and sync=none makes the source the backing
file of the target after the job).

bdrv_replace_node() will not create those loops, but by doing so, it
ignores the user-requested configuration, which is not ideally either.
(In the first example above, the target's child will remain what it was,
which may still be reasonable.  But in the second example, the target
will just not become a child of the source, which is precisely what was
requested with @replaces.)

So prevent such configurations, both before the job, and before it
actually completes.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c                   | 30 ++++++++++++++++++++++++
 block/mirror.c            | 19 +++++++++++++++-
 blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
 include/block/block_int.h |  3 +++
 4 files changed, 98 insertions(+), 2 deletions(-)

Comments

Vladimir Sementsov-Ogievskiy Nov. 29, 2019, 12:01 p.m. UTC | #1
11.11.2019 19:02, Max Reitz wrote:
> While bdrv_replace_node() will not follow through with it, a specific
> @replaces asks the mirror job to create a loop.
> 
> For example, say both the source and the target share a child where the
> source is a filter; by letting @replaces point to the common child, you
> ask for a loop.
> 
> Or if you use @replaces in drive-mirror with sync=none and
> mode=absolute-paths, you generally ask for a loop (@replaces must point
> to a child of the source, and sync=none makes the source the backing
> file of the target after the job).
> 
> bdrv_replace_node() will not create those loops, but by doing so, it
> ignores the user-requested configuration, which is not ideally either.
> (In the first example above, the target's child will remain what it was,
> which may still be reasonable.  But in the second example, the target
> will just not become a child of the source, which is precisely what was
> requested with @replaces.)
> 
> So prevent such configurations, both before the job, and before it
> actually completes.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block.c                   | 30 ++++++++++++++++++++++++
>   block/mirror.c            | 19 +++++++++++++++-
>   blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
>   include/block/block_int.h |  3 +++
>   4 files changed, 98 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 0159f8e510..e3922a0474 100644
> --- a/block.c
> +++ b/block.c
> @@ -6259,6 +6259,36 @@ out:
>       return to_replace_bs;
>   }
>   
> +/*
> + * Return true iff @child is a (recursive) child of @parent, with at
> + * least @min_level edges between them.
> + *
> + * (If @min_level == 0, return true if @child == @parent.  For
> + * @min_level == 1, @child needs to be at least a real child; for
> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
> + */
> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
> +                      int min_level)
> +{
> +    BdrvChild *c;
> +
> +    if (child == parent && min_level <= 0) {
> +        return true;
> +    }
> +
> +    if (!parent) {
> +        return false;
> +    }
> +
> +    QLIST_FOREACH(c, &parent->children, next) {
> +        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
> +            return true;
> +        }
> +    }
> +
> +    return false;
> +}
> +
>   /**
>    * Iterates through the list of runtime option keys that are said to
>    * be "strong" for a BDS.  An option is called "strong" if it changes
> diff --git a/block/mirror.c b/block/mirror.c
> index 68a4404666..b258c7e98b 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>            * there.
>            */
>           if (bdrv_recurse_can_replace(src, to_replace)) {
> -            bdrv_replace_node(to_replace, target_bs, &local_err);
> +            /*
> +             * It is OK for @to_replace to be an immediate child of
> +             * @target_bs, because that is what happens with
> +             * drive-mirror sync=none mode=absolute-paths: target_bs's
> +             * backing file will be the source node, which is also
> +             * to_replace (by default).
> +             * bdrv_replace_node() handles this case by not letting
> +             * target_bs->backing point to itself, but to the source
> +             * still.
> +             */

Hmm.. So, we want the following valid case:

(other parents of source) ----> source = to_replace <--- backing --- target

becomes

(other parents of source) ----> target --- backing ---> source

But it seems for me, that the following is not less valid:

(other parents of source) ----> source = to_replace <--- backing --- X <--- backing --- target

becomes

(other parents of source) ----> target --- backing ---> X --- backing ---> source

And what we actually want to prevent, is when to_replace is not source, but child (may be not direct)
of source..

Also, with your check you still allow silent no-change in the following case:

source --- backing --> to_replace <-- backing -- target

====

In other words, replacing make sense, only if to_replace has some other parents, which are not
children (may be not direct) of target.. And the only known such case is when in the same time
to_replace == source.

so, shouldn't the following be

if (to_replace == src || !bdrv_is_child_of(to_replace, target_bs, 1) {

or, may be, to allow also replace filters above src, keeping backing link :

if (bdrv_is_child_of(src, to_replace, 0) || !bdrv_is_child_of(to_replace, target_bs, 1) {

> +            if (!bdrv_is_child_of(to_replace, target_bs, 2)) {
> +                bdrv_replace_node(to_replace, target_bs, &local_err);
> +            } else {
> +                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
> +                           "because the former is now a child of the latter, "
> +                           "and doing so would thus create a loop",
> +                           to_replace->node_name, target_bs->node_name);
> +            }
>           } else {
>               error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>                          "because it can no longer be guaranteed that doing so "
> diff --git a/blockdev.c b/blockdev.c
> index 9dc2238bf3..d29f147f72 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -3824,7 +3824,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>       }
>   
>       if (has_replaces) {
> -        BlockDriverState *to_replace_bs;
> +        BlockDriverState *to_replace_bs, *target_backing_bs;
>           AioContext *replace_aio_context;
>           int64_t bs_size, replace_size;
>   
> @@ -3839,6 +3839,52 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>               return;
>           }
>   
> +        if (bdrv_is_child_of(to_replace_bs, target, 1)) {
> +            error_setg(errp, "Replacing %s by %s would result in a loop, "
> +                       "because the former is a child of the latter",
> +                       to_replace_bs->node_name, target->node_name);
> +            return;
> +        }
> +
> +        if (backing_mode == MIRROR_SOURCE_BACKING_CHAIN ||
> +            backing_mode == MIRROR_OPEN_BACKING_CHAIN)
> +        {
> +            /*
> +             * While we do not quite know what OPEN_BACKING_CHAIN
> +             * (used for mode=existing) will yield, it is probably
> +             * best to restrict it exactly like SOURCE_BACKING_CHAIN,
> +             * because that is our best guess.
> +             */
> +            switch (sync) {
> +            case MIRROR_SYNC_MODE_FULL:
> +                target_backing_bs = NULL;
> +                break;
> +
> +            case MIRROR_SYNC_MODE_TOP:
> +                target_backing_bs = backing_bs(bs);
> +                break;
> +
> +            case MIRROR_SYNC_MODE_NONE:
> +                target_backing_bs = bs;
> +                break;
> +
> +            default:
> +                abort();
> +            }
> +        } else {
> +            assert(backing_mode == MIRROR_LEAVE_BACKING_CHAIN);
> +            target_backing_bs = backing_bs(target);
> +        }
> +
> +        if (bdrv_is_child_of(to_replace_bs, target_backing_bs, 0)) {
> +            error_setg(errp, "Replacing '%s' by '%s' with this sync mode would "
> +                       "result in a loop, because the former would be a child "
> +                       "of the latter's backing file ('%s') after the mirror "
> +                       "job", to_replace_bs->node_name, target->node_name,
> +                       target_backing_bs->node_name);
> +            return;
> +        }
> +
>           replace_aio_context = bdrv_get_aio_context(to_replace_bs);
>           aio_context_acquire(replace_aio_context);
>           replace_size = bdrv_getlength(to_replace_bs);
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 589a797fab..7064a1a4fa 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -1266,6 +1266,9 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
>   bool bdrv_recurse_can_replace(BlockDriverState *bs,
>                                 BlockDriverState *to_replace);
>   
> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
> +                      int min_level);
> +
>   /*
>    * Default implementation for drivers to pass bdrv_co_block_status() to
>    * their file.
>
Max Reitz Nov. 29, 2019, 1:46 p.m. UTC | #2
On 29.11.19 13:01, Vladimir Sementsov-Ogievskiy wrote:
> 11.11.2019 19:02, Max Reitz wrote:
>> While bdrv_replace_node() will not follow through with it, a specific
>> @replaces asks the mirror job to create a loop.
>>
>> For example, say both the source and the target share a child where the
>> source is a filter; by letting @replaces point to the common child, you
>> ask for a loop.
>>
>> Or if you use @replaces in drive-mirror with sync=none and
>> mode=absolute-paths, you generally ask for a loop (@replaces must point
>> to a child of the source, and sync=none makes the source the backing
>> file of the target after the job).
>>
>> bdrv_replace_node() will not create those loops, but by doing so, it
>> ignores the user-requested configuration, which is not ideally either.
>> (In the first example above, the target's child will remain what it was,
>> which may still be reasonable.  But in the second example, the target
>> will just not become a child of the source, which is precisely what was
>> requested with @replaces.)
>>
>> So prevent such configurations, both before the job, and before it
>> actually completes.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block.c                   | 30 ++++++++++++++++++++++++
>>   block/mirror.c            | 19 +++++++++++++++-
>>   blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
>>   include/block/block_int.h |  3 +++
>>   4 files changed, 98 insertions(+), 2 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 0159f8e510..e3922a0474 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -6259,6 +6259,36 @@ out:
>>       return to_replace_bs;
>>   }
>>   
>> +/*
>> + * Return true iff @child is a (recursive) child of @parent, with at
>> + * least @min_level edges between them.
>> + *
>> + * (If @min_level == 0, return true if @child == @parent.  For
>> + * @min_level == 1, @child needs to be at least a real child; for
>> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
>> + */
>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>> +                      int min_level)
>> +{
>> +    BdrvChild *c;
>> +
>> +    if (child == parent && min_level <= 0) {
>> +        return true;
>> +    }
>> +
>> +    if (!parent) {
>> +        return false;
>> +    }
>> +
>> +    QLIST_FOREACH(c, &parent->children, next) {
>> +        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
>> +            return true;
>> +        }
>> +    }
>> +
>> +    return false;
>> +}
>> +
>>   /**
>>    * Iterates through the list of runtime option keys that are said to
>>    * be "strong" for a BDS.  An option is called "strong" if it changes
>> diff --git a/block/mirror.c b/block/mirror.c
>> index 68a4404666..b258c7e98b 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>>            * there.
>>            */
>>           if (bdrv_recurse_can_replace(src, to_replace)) {
>> -            bdrv_replace_node(to_replace, target_bs, &local_err);
>> +            /*
>> +             * It is OK for @to_replace to be an immediate child of
>> +             * @target_bs, because that is what happens with
>> +             * drive-mirror sync=none mode=absolute-paths: target_bs's
>> +             * backing file will be the source node, which is also
>> +             * to_replace (by default).
>> +             * bdrv_replace_node() handles this case by not letting
>> +             * target_bs->backing point to itself, but to the source
>> +             * still.
>> +             */
> 
> Hmm.. So, we want the following valid case:
> 
> (other parents of source) ----> source = to_replace <--- backing --- target
> 
> becomes
> 
> (other parents of source) ----> target --- backing ---> source
> 
> But it seems for me, that the following is not less valid:
> 
> (other parents of source) ----> source = to_replace <--- backing --- X <--- backing --- target
> 
> becomes
> 
> (other parents of source) ----> target --- backing ---> X --- backing ---> source

I think it is less valid.  The first case works with sync=none, because
target is initially empty and then you just copy all new data, so the
target keeps looking like the source.

But in the second case, there are intermediate nodes that mean that
target may well not look like the source.

(Yes, you have the same problem if you use sync=none or sync=full to a
completely independent node.  But that still means that while the first
case is always valid, the second may be problematic.)

> And what we actually want to prevent, is when to_replace is not source, but child (may be not direct)
> of source..
> 
> Also, with your check you still allow silent no-change in the following case:
> 
> source --- backing --> to_replace <-- backing -- target

You mean if source is a filter on to_replace?  (Because otherwise you
can’t replace that node.)

Is that really a no-change?  Shouldn’t we get

source --> target --> to_replace

?  (And what else would you expect?)

So maybe we don’t want to prevent that, because I think it can make sense.

Max

> ====
> 
> In other words, replacing make sense, only if to_replace has some other parents, which are not
> children (may be not direct) of target.. And the only known such case is when in the same time
> to_replace == source.
> 
> so, shouldn't the following be
> 
> if (to_replace == src || !bdrv_is_child_of(to_replace, target_bs, 1) {
> 
> or, may be, to allow also replace filters above src, keeping backing link :
> 
> if (bdrv_is_child_of(src, to_replace, 0) || !bdrv_is_child_of(to_replace, target_bs, 1) {
> 
>> +            if (!bdrv_is_child_of(to_replace, target_bs, 2)) {
>> +                bdrv_replace_node(to_replace, target_bs, &local_err);
>> +            } else {
>> +                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>> +                           "because the former is now a child of the latter, "
>> +                           "and doing so would thus create a loop",
>> +                           to_replace->node_name, target_bs->node_name);
>> +            }
>>           } else {
>>               error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>                          "because it can no longer be guaranteed that doing so "
>> diff --git a/blockdev.c b/blockdev.c
>> index 9dc2238bf3..d29f147f72 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -3824,7 +3824,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>       }
>>   
>>       if (has_replaces) {
>> -        BlockDriverState *to_replace_bs;
>> +        BlockDriverState *to_replace_bs, *target_backing_bs;
>>           AioContext *replace_aio_context;
>>           int64_t bs_size, replace_size;
>>   
>> @@ -3839,6 +3839,52 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>               return;
>>           }
>>   
>> +        if (bdrv_is_child_of(to_replace_bs, target, 1)) {
>> +            error_setg(errp, "Replacing %s by %s would result in a loop, "
>> +                       "because the former is a child of the latter",
>> +                       to_replace_bs->node_name, target->node_name);
>> +            return;
>> +        }
>> +
>> +        if (backing_mode == MIRROR_SOURCE_BACKING_CHAIN ||
>> +            backing_mode == MIRROR_OPEN_BACKING_CHAIN)
>> +        {
>> +            /*
>> +             * While we do not quite know what OPEN_BACKING_CHAIN
>> +             * (used for mode=existing) will yield, it is probably
>> +             * best to restrict it exactly like SOURCE_BACKING_CHAIN,
>> +             * because that is our best guess.
>> +             */
>> +            switch (sync) {
>> +            case MIRROR_SYNC_MODE_FULL:
>> +                target_backing_bs = NULL;
>> +                break;
>> +
>> +            case MIRROR_SYNC_MODE_TOP:
>> +                target_backing_bs = backing_bs(bs);
>> +                break;
>> +
>> +            case MIRROR_SYNC_MODE_NONE:
>> +                target_backing_bs = bs;
>> +                break;
>> +
>> +            default:
>> +                abort();
>> +            }
>> +        } else {
>> +            assert(backing_mode == MIRROR_LEAVE_BACKING_CHAIN);
>> +            target_backing_bs = backing_bs(target);
>> +        }
>> +
>> +        if (bdrv_is_child_of(to_replace_bs, target_backing_bs, 0)) {
>> +            error_setg(errp, "Replacing '%s' by '%s' with this sync mode would "
>> +                       "result in a loop, because the former would be a child "
>> +                       "of the latter's backing file ('%s') after the mirror "
>> +                       "job", to_replace_bs->node_name, target->node_name,
>> +                       target_backing_bs->node_name);
>> +            return;
>> +        }
>> +
>>           replace_aio_context = bdrv_get_aio_context(to_replace_bs);
>>           aio_context_acquire(replace_aio_context);
>>           replace_size = bdrv_getlength(to_replace_bs);
>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>> index 589a797fab..7064a1a4fa 100644
>> --- a/include/block/block_int.h
>> +++ b/include/block/block_int.h
>> @@ -1266,6 +1266,9 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
>>   bool bdrv_recurse_can_replace(BlockDriverState *bs,
>>                                 BlockDriverState *to_replace);
>>   
>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>> +                      int min_level);
>> +
>>   /*
>>    * Default implementation for drivers to pass bdrv_co_block_status() to
>>    * their file.
>>
> 
>
Vladimir Sementsov-Ogievskiy Nov. 29, 2019, 1:55 p.m. UTC | #3
29.11.2019 16:46, Max Reitz wrote:
> On 29.11.19 13:01, Vladimir Sementsov-Ogievskiy wrote:
>> 11.11.2019 19:02, Max Reitz wrote:
>>> While bdrv_replace_node() will not follow through with it, a specific
>>> @replaces asks the mirror job to create a loop.
>>>
>>> For example, say both the source and the target share a child where the
>>> source is a filter; by letting @replaces point to the common child, you
>>> ask for a loop.
>>>
>>> Or if you use @replaces in drive-mirror with sync=none and
>>> mode=absolute-paths, you generally ask for a loop (@replaces must point
>>> to a child of the source, and sync=none makes the source the backing
>>> file of the target after the job).
>>>
>>> bdrv_replace_node() will not create those loops, but by doing so, it
>>> ignores the user-requested configuration, which is not ideally either.
>>> (In the first example above, the target's child will remain what it was,
>>> which may still be reasonable.  But in the second example, the target
>>> will just not become a child of the source, which is precisely what was
>>> requested with @replaces.)
>>>
>>> So prevent such configurations, both before the job, and before it
>>> actually completes.
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>    block.c                   | 30 ++++++++++++++++++++++++
>>>    block/mirror.c            | 19 +++++++++++++++-
>>>    blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
>>>    include/block/block_int.h |  3 +++
>>>    4 files changed, 98 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/block.c b/block.c
>>> index 0159f8e510..e3922a0474 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -6259,6 +6259,36 @@ out:
>>>        return to_replace_bs;
>>>    }
>>>    
>>> +/*
>>> + * Return true iff @child is a (recursive) child of @parent, with at
>>> + * least @min_level edges between them.
>>> + *
>>> + * (If @min_level == 0, return true if @child == @parent.  For
>>> + * @min_level == 1, @child needs to be at least a real child; for
>>> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
>>> + */
>>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>>> +                      int min_level)
>>> +{
>>> +    BdrvChild *c;
>>> +
>>> +    if (child == parent && min_level <= 0) {
>>> +        return true;
>>> +    }
>>> +
>>> +    if (!parent) {
>>> +        return false;
>>> +    }
>>> +
>>> +    QLIST_FOREACH(c, &parent->children, next) {
>>> +        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
>>> +            return true;
>>> +        }
>>> +    }
>>> +
>>> +    return false;
>>> +}
>>> +
>>>    /**
>>>     * Iterates through the list of runtime option keys that are said to
>>>     * be "strong" for a BDS.  An option is called "strong" if it changes
>>> diff --git a/block/mirror.c b/block/mirror.c
>>> index 68a4404666..b258c7e98b 100644
>>> --- a/block/mirror.c
>>> +++ b/block/mirror.c
>>> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>>>             * there.
>>>             */
>>>            if (bdrv_recurse_can_replace(src, to_replace)) {
>>> -            bdrv_replace_node(to_replace, target_bs, &local_err);
>>> +            /*
>>> +             * It is OK for @to_replace to be an immediate child of
>>> +             * @target_bs, because that is what happens with
>>> +             * drive-mirror sync=none mode=absolute-paths: target_bs's
>>> +             * backing file will be the source node, which is also
>>> +             * to_replace (by default).
>>> +             * bdrv_replace_node() handles this case by not letting
>>> +             * target_bs->backing point to itself, but to the source
>>> +             * still.
>>> +             */
>>
>> Hmm.. So, we want the following valid case:
>>
>> (other parents of source) ----> source = to_replace <--- backing --- target
>>
>> becomes
>>
>> (other parents of source) ----> target --- backing ---> source
>>
>> But it seems for me, that the following is not less valid:
>>
>> (other parents of source) ----> source = to_replace <--- backing --- X <--- backing --- target
>>
>> becomes
>>
>> (other parents of source) ----> target --- backing ---> X --- backing ---> source
> 
> I think it is less valid.  The first case works with sync=none, because
> target is initially empty and then you just copy all new data, so the
> target keeps looking like the source.
> 
> But in the second case, there are intermediate nodes that mean that
> target may well not look like the source.

Maybe, it's valid if target node is a filter? Or, otherwise, it's backing is a filter,
but this seems less useful.

> 
> (Yes, you have the same problem if you use sync=none or sync=full to a
> completely independent node.  But that still means that while the first
> case is always valid, the second may be problematic.)
> 
>> And what we actually want to prevent, is when to_replace is not source, but child (may be not direct)
>> of source..
>>
>> Also, with your check you still allow silent no-change in the following case:
>>
>> source --- backing --> to_replace <-- backing -- target
> 
> You mean if source is a filter on to_replace?  (Because otherwise you
> can’t replace that node.)
> 
> Is that really a no-change?  Shouldn’t we get
> 
> source --> target --> to_replace

Ah, yes, it's OK.

> 
> ?  (And what else would you expect?)
> 
> So maybe we don’t want to prevent that, because I think it can make sense.
> 
> Max
> 
>> ====
>>
>> In other words, replacing make sense, only if to_replace has some other parents, which are not
>> children (may be not direct) of target.. And the only known such case is when in the same time
>> to_replace == source.
>>
>> so, shouldn't the following be
>>
>> if (to_replace == src || !bdrv_is_child_of(to_replace, target_bs, 1) {
>>
>> or, may be, to allow also replace filters above src, keeping backing link :
>>
>> if (bdrv_is_child_of(src, to_replace, 0) || !bdrv_is_child_of(to_replace, target_bs, 1) {
>>
>>> +            if (!bdrv_is_child_of(to_replace, target_bs, 2)) {
>>> +                bdrv_replace_node(to_replace, target_bs, &local_err);
>>> +            } else {
>>> +                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>> +                           "because the former is now a child of the latter, "
>>> +                           "and doing so would thus create a loop",
>>> +                           to_replace->node_name, target_bs->node_name);
>>> +            }
>>>            } else {
>>>                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>>                           "because it can no longer be guaranteed that doing so "
>>> diff --git a/blockdev.c b/blockdev.c
>>> index 9dc2238bf3..d29f147f72 100644
>>> --- a/blockdev.c
>>> +++ b/blockdev.c
>>> @@ -3824,7 +3824,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>>        }
>>>    
>>>        if (has_replaces) {
>>> -        BlockDriverState *to_replace_bs;
>>> +        BlockDriverState *to_replace_bs, *target_backing_bs;
>>>            AioContext *replace_aio_context;
>>>            int64_t bs_size, replace_size;
>>>    
>>> @@ -3839,6 +3839,52 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>>                return;
>>>            }
>>>    
>>> +        if (bdrv_is_child_of(to_replace_bs, target, 1)) {
>>> +            error_setg(errp, "Replacing %s by %s would result in a loop, "
>>> +                       "because the former is a child of the latter",
>>> +                       to_replace_bs->node_name, target->node_name);
>>> +            return;
>>> +        }
>>> +
>>> +        if (backing_mode == MIRROR_SOURCE_BACKING_CHAIN ||
>>> +            backing_mode == MIRROR_OPEN_BACKING_CHAIN)
>>> +        {
>>> +            /*
>>> +             * While we do not quite know what OPEN_BACKING_CHAIN
>>> +             * (used for mode=existing) will yield, it is probably
>>> +             * best to restrict it exactly like SOURCE_BACKING_CHAIN,
>>> +             * because that is our best guess.
>>> +             */
>>> +            switch (sync) {
>>> +            case MIRROR_SYNC_MODE_FULL:
>>> +                target_backing_bs = NULL;
>>> +                break;
>>> +
>>> +            case MIRROR_SYNC_MODE_TOP:
>>> +                target_backing_bs = backing_bs(bs);
>>> +                break;
>>> +
>>> +            case MIRROR_SYNC_MODE_NONE:
>>> +                target_backing_bs = bs;
>>> +                break;
>>> +
>>> +            default:
>>> +                abort();
>>> +            }
>>> +        } else {
>>> +            assert(backing_mode == MIRROR_LEAVE_BACKING_CHAIN);
>>> +            target_backing_bs = backing_bs(target);
>>> +        }
>>> +
>>> +        if (bdrv_is_child_of(to_replace_bs, target_backing_bs, 0)) {
>>> +            error_setg(errp, "Replacing '%s' by '%s' with this sync mode would "
>>> +                       "result in a loop, because the former would be a child "
>>> +                       "of the latter's backing file ('%s') after the mirror "
>>> +                       "job", to_replace_bs->node_name, target->node_name,
>>> +                       target_backing_bs->node_name);
>>> +            return;
>>> +        }
>>> +
>>>            replace_aio_context = bdrv_get_aio_context(to_replace_bs);
>>>            aio_context_acquire(replace_aio_context);
>>>            replace_size = bdrv_getlength(to_replace_bs);
>>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>>> index 589a797fab..7064a1a4fa 100644
>>> --- a/include/block/block_int.h
>>> +++ b/include/block/block_int.h
>>> @@ -1266,6 +1266,9 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
>>>    bool bdrv_recurse_can_replace(BlockDriverState *bs,
>>>                                  BlockDriverState *to_replace);
>>>    
>>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>>> +                      int min_level);
>>> +
>>>    /*
>>>     * Default implementation for drivers to pass bdrv_co_block_status() to
>>>     * their file.
>>>
>>
>>
> 
>
Max Reitz Nov. 29, 2019, 2:17 p.m. UTC | #4
On 29.11.19 14:55, Vladimir Sementsov-Ogievskiy wrote:
> 29.11.2019 16:46, Max Reitz wrote:
>> On 29.11.19 13:01, Vladimir Sementsov-Ogievskiy wrote:
>>> 11.11.2019 19:02, Max Reitz wrote:
>>>> While bdrv_replace_node() will not follow through with it, a specific
>>>> @replaces asks the mirror job to create a loop.
>>>>
>>>> For example, say both the source and the target share a child where the
>>>> source is a filter; by letting @replaces point to the common child, you
>>>> ask for a loop.
>>>>
>>>> Or if you use @replaces in drive-mirror with sync=none and
>>>> mode=absolute-paths, you generally ask for a loop (@replaces must point
>>>> to a child of the source, and sync=none makes the source the backing
>>>> file of the target after the job).
>>>>
>>>> bdrv_replace_node() will not create those loops, but by doing so, it
>>>> ignores the user-requested configuration, which is not ideally either.
>>>> (In the first example above, the target's child will remain what it was,
>>>> which may still be reasonable.  But in the second example, the target
>>>> will just not become a child of the source, which is precisely what was
>>>> requested with @replaces.)
>>>>
>>>> So prevent such configurations, both before the job, and before it
>>>> actually completes.
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>> ---
>>>>    block.c                   | 30 ++++++++++++++++++++++++
>>>>    block/mirror.c            | 19 +++++++++++++++-
>>>>    blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
>>>>    include/block/block_int.h |  3 +++
>>>>    4 files changed, 98 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/block.c b/block.c
>>>> index 0159f8e510..e3922a0474 100644
>>>> --- a/block.c
>>>> +++ b/block.c
>>>> @@ -6259,6 +6259,36 @@ out:
>>>>        return to_replace_bs;
>>>>    }
>>>>    
>>>> +/*
>>>> + * Return true iff @child is a (recursive) child of @parent, with at
>>>> + * least @min_level edges between them.
>>>> + *
>>>> + * (If @min_level == 0, return true if @child == @parent.  For
>>>> + * @min_level == 1, @child needs to be at least a real child; for
>>>> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
>>>> + */
>>>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>>>> +                      int min_level)
>>>> +{
>>>> +    BdrvChild *c;
>>>> +
>>>> +    if (child == parent && min_level <= 0) {
>>>> +        return true;
>>>> +    }
>>>> +
>>>> +    if (!parent) {
>>>> +        return false;
>>>> +    }
>>>> +
>>>> +    QLIST_FOREACH(c, &parent->children, next) {
>>>> +        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
>>>> +            return true;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return false;
>>>> +}
>>>> +
>>>>    /**
>>>>     * Iterates through the list of runtime option keys that are said to
>>>>     * be "strong" for a BDS.  An option is called "strong" if it changes
>>>> diff --git a/block/mirror.c b/block/mirror.c
>>>> index 68a4404666..b258c7e98b 100644
>>>> --- a/block/mirror.c
>>>> +++ b/block/mirror.c
>>>> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>>>>             * there.
>>>>             */
>>>>            if (bdrv_recurse_can_replace(src, to_replace)) {
>>>> -            bdrv_replace_node(to_replace, target_bs, &local_err);
>>>> +            /*
>>>> +             * It is OK for @to_replace to be an immediate child of
>>>> +             * @target_bs, because that is what happens with
>>>> +             * drive-mirror sync=none mode=absolute-paths: target_bs's
>>>> +             * backing file will be the source node, which is also
>>>> +             * to_replace (by default).
>>>> +             * bdrv_replace_node() handles this case by not letting
>>>> +             * target_bs->backing point to itself, but to the source
>>>> +             * still.
>>>> +             */
>>>
>>> Hmm.. So, we want the following valid case:
>>>
>>> (other parents of source) ----> source = to_replace <--- backing --- target
>>>
>>> becomes
>>>
>>> (other parents of source) ----> target --- backing ---> source
>>>
>>> But it seems for me, that the following is not less valid:
>>>
>>> (other parents of source) ----> source = to_replace <--- backing --- X <--- backing --- target
>>>
>>> becomes
>>>
>>> (other parents of source) ----> target --- backing ---> X --- backing ---> source
>>
>> I think it is less valid.  The first case works with sync=none, because
>> target is initially empty and then you just copy all new data, so the
>> target keeps looking like the source.
>>
>> But in the second case, there are intermediate nodes that mean that
>> target may well not look like the source.
> 
> Maybe, it's valid if target node is a filter? Or, otherwise, it's backing is a filter,
> but this seems less useful.

The question to me is whether it’s really useful.  The thing is that
maybe bdrv_replace_node() can make sense of it.  But still, from the
user’s perspective, they kind of are asking for a loop whenever
to_replace is a child of target.  It just so happens that we must allow
one of these cases because it’s the default case for sync=none.

So I’d rather forbid all such cases, because it should be understandable
to users why...

Max
Vladimir Sementsov-Ogievskiy Nov. 29, 2019, 2:26 p.m. UTC | #5
29.11.2019 17:17, Max Reitz wrote:
> On 29.11.19 14:55, Vladimir Sementsov-Ogievskiy wrote:
>> 29.11.2019 16:46, Max Reitz wrote:
>>> On 29.11.19 13:01, Vladimir Sementsov-Ogievskiy wrote:
>>>> 11.11.2019 19:02, Max Reitz wrote:
>>>>> While bdrv_replace_node() will not follow through with it, a specific
>>>>> @replaces asks the mirror job to create a loop.
>>>>>
>>>>> For example, say both the source and the target share a child where the
>>>>> source is a filter; by letting @replaces point to the common child, you
>>>>> ask for a loop.
>>>>>
>>>>> Or if you use @replaces in drive-mirror with sync=none and
>>>>> mode=absolute-paths, you generally ask for a loop (@replaces must point
>>>>> to a child of the source, and sync=none makes the source the backing
>>>>> file of the target after the job).
>>>>>
>>>>> bdrv_replace_node() will not create those loops, but by doing so, it
>>>>> ignores the user-requested configuration, which is not ideally either.
>>>>> (In the first example above, the target's child will remain what it was,
>>>>> which may still be reasonable.  But in the second example, the target
>>>>> will just not become a child of the source, which is precisely what was
>>>>> requested with @replaces.)
>>>>>
>>>>> So prevent such configurations, both before the job, and before it
>>>>> actually completes.
>>>>>
>>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>>> ---
>>>>>     block.c                   | 30 ++++++++++++++++++++++++
>>>>>     block/mirror.c            | 19 +++++++++++++++-
>>>>>     blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
>>>>>     include/block/block_int.h |  3 +++
>>>>>     4 files changed, 98 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/block.c b/block.c
>>>>> index 0159f8e510..e3922a0474 100644
>>>>> --- a/block.c
>>>>> +++ b/block.c
>>>>> @@ -6259,6 +6259,36 @@ out:
>>>>>         return to_replace_bs;
>>>>>     }
>>>>>     
>>>>> +/*
>>>>> + * Return true iff @child is a (recursive) child of @parent, with at
>>>>> + * least @min_level edges between them.
>>>>> + *
>>>>> + * (If @min_level == 0, return true if @child == @parent.  For
>>>>> + * @min_level == 1, @child needs to be at least a real child; for
>>>>> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
>>>>> + */
>>>>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>>>>> +                      int min_level)
>>>>> +{
>>>>> +    BdrvChild *c;
>>>>> +
>>>>> +    if (child == parent && min_level <= 0) {
>>>>> +        return true;
>>>>> +    }
>>>>> +
>>>>> +    if (!parent) {
>>>>> +        return false;
>>>>> +    }
>>>>> +
>>>>> +    QLIST_FOREACH(c, &parent->children, next) {
>>>>> +        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
>>>>> +            return true;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    return false;
>>>>> +}
>>>>> +
>>>>>     /**
>>>>>      * Iterates through the list of runtime option keys that are said to
>>>>>      * be "strong" for a BDS.  An option is called "strong" if it changes
>>>>> diff --git a/block/mirror.c b/block/mirror.c
>>>>> index 68a4404666..b258c7e98b 100644
>>>>> --- a/block/mirror.c
>>>>> +++ b/block/mirror.c
>>>>> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>>>>>              * there.
>>>>>              */
>>>>>             if (bdrv_recurse_can_replace(src, to_replace)) {
>>>>> -            bdrv_replace_node(to_replace, target_bs, &local_err);
>>>>> +            /*
>>>>> +             * It is OK for @to_replace to be an immediate child of
>>>>> +             * @target_bs, because that is what happens with
>>>>> +             * drive-mirror sync=none mode=absolute-paths: target_bs's
>>>>> +             * backing file will be the source node, which is also
>>>>> +             * to_replace (by default).
>>>>> +             * bdrv_replace_node() handles this case by not letting
>>>>> +             * target_bs->backing point to itself, but to the source
>>>>> +             * still.
>>>>> +             */
>>>>
>>>> Hmm.. So, we want the following valid case:
>>>>
>>>> (other parents of source) ----> source = to_replace <--- backing --- target
>>>>
>>>> becomes
>>>>
>>>> (other parents of source) ----> target --- backing ---> source
>>>>
>>>> But it seems for me, that the following is not less valid:
>>>>
>>>> (other parents of source) ----> source = to_replace <--- backing --- X <--- backing --- target
>>>>
>>>> becomes
>>>>
>>>> (other parents of source) ----> target --- backing ---> X --- backing ---> source
>>>
>>> I think it is less valid.  The first case works with sync=none, because
>>> target is initially empty and then you just copy all new data, so the
>>> target keeps looking like the source.
>>>
>>> But in the second case, there are intermediate nodes that mean that
>>> target may well not look like the source.
>>
>> Maybe, it's valid if target node is a filter? Or, otherwise, it's backing is a filter,
>> but this seems less useful.
> 
> The question to me is whether it’s really useful.  The thing is that
> maybe bdrv_replace_node() can make sense of it.  But still, from the
> user’s perspective, they kind of are asking for a loop whenever
> to_replace is a child of target.  It just so happens that we must allow
> one of these cases because it’s the default case for sync=none.
> 
> So I’d rather forbid all such cases, because it should be understandable
> to users why...
> 

Okay, I don't have more arguments:) Honestly, I just feel that relying on existing
of chains between nodes of some hardcoded length is not good generic criteria...

bdrv_replace_node never creates loops.. Maybe, just document this behavior in
qapi? And (maybe) return error, if we see that bdrv_replace_node will be noop?

And if it is not noop, may be user don't tries to create a loop, but instead,
user is powerful, knows how bdrv_replace_node works and wants exactly this
behavior?
Max Reitz Nov. 29, 2019, 2:38 p.m. UTC | #6
On 29.11.19 15:26, Vladimir Sementsov-Ogievskiy wrote:
> 29.11.2019 17:17, Max Reitz wrote:
>> On 29.11.19 14:55, Vladimir Sementsov-Ogievskiy wrote:
>>> 29.11.2019 16:46, Max Reitz wrote:
>>>> On 29.11.19 13:01, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 11.11.2019 19:02, Max Reitz wrote:
>>>>>> While bdrv_replace_node() will not follow through with it, a specific
>>>>>> @replaces asks the mirror job to create a loop.
>>>>>>
>>>>>> For example, say both the source and the target share a child where the
>>>>>> source is a filter; by letting @replaces point to the common child, you
>>>>>> ask for a loop.
>>>>>>
>>>>>> Or if you use @replaces in drive-mirror with sync=none and
>>>>>> mode=absolute-paths, you generally ask for a loop (@replaces must point
>>>>>> to a child of the source, and sync=none makes the source the backing
>>>>>> file of the target after the job).
>>>>>>
>>>>>> bdrv_replace_node() will not create those loops, but by doing so, it
>>>>>> ignores the user-requested configuration, which is not ideally either.
>>>>>> (In the first example above, the target's child will remain what it was,
>>>>>> which may still be reasonable.  But in the second example, the target
>>>>>> will just not become a child of the source, which is precisely what was
>>>>>> requested with @replaces.)
>>>>>>
>>>>>> So prevent such configurations, both before the job, and before it
>>>>>> actually completes.
>>>>>>
>>>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>>>> ---
>>>>>>     block.c                   | 30 ++++++++++++++++++++++++
>>>>>>     block/mirror.c            | 19 +++++++++++++++-
>>>>>>     blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
>>>>>>     include/block/block_int.h |  3 +++
>>>>>>     4 files changed, 98 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/block.c b/block.c
>>>>>> index 0159f8e510..e3922a0474 100644
>>>>>> --- a/block.c
>>>>>> +++ b/block.c
>>>>>> @@ -6259,6 +6259,36 @@ out:
>>>>>>         return to_replace_bs;
>>>>>>     }
>>>>>>     
>>>>>> +/*
>>>>>> + * Return true iff @child is a (recursive) child of @parent, with at
>>>>>> + * least @min_level edges between them.
>>>>>> + *
>>>>>> + * (If @min_level == 0, return true if @child == @parent.  For
>>>>>> + * @min_level == 1, @child needs to be at least a real child; for
>>>>>> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
>>>>>> + */
>>>>>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>>>>>> +                      int min_level)
>>>>>> +{
>>>>>> +    BdrvChild *c;
>>>>>> +
>>>>>> +    if (child == parent && min_level <= 0) {
>>>>>> +        return true;
>>>>>> +    }
>>>>>> +
>>>>>> +    if (!parent) {
>>>>>> +        return false;
>>>>>> +    }
>>>>>> +
>>>>>> +    QLIST_FOREACH(c, &parent->children, next) {
>>>>>> +        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
>>>>>> +            return true;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    return false;
>>>>>> +}
>>>>>> +
>>>>>>     /**
>>>>>>      * Iterates through the list of runtime option keys that are said to
>>>>>>      * be "strong" for a BDS.  An option is called "strong" if it changes
>>>>>> diff --git a/block/mirror.c b/block/mirror.c
>>>>>> index 68a4404666..b258c7e98b 100644
>>>>>> --- a/block/mirror.c
>>>>>> +++ b/block/mirror.c
>>>>>> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>>>>>>              * there.
>>>>>>              */
>>>>>>             if (bdrv_recurse_can_replace(src, to_replace)) {
>>>>>> -            bdrv_replace_node(to_replace, target_bs, &local_err);
>>>>>> +            /*
>>>>>> +             * It is OK for @to_replace to be an immediate child of
>>>>>> +             * @target_bs, because that is what happens with
>>>>>> +             * drive-mirror sync=none mode=absolute-paths: target_bs's
>>>>>> +             * backing file will be the source node, which is also
>>>>>> +             * to_replace (by default).
>>>>>> +             * bdrv_replace_node() handles this case by not letting
>>>>>> +             * target_bs->backing point to itself, but to the source
>>>>>> +             * still.
>>>>>> +             */
>>>>>
>>>>> Hmm.. So, we want the following valid case:
>>>>>
>>>>> (other parents of source) ----> source = to_replace <--- backing --- target
>>>>>
>>>>> becomes
>>>>>
>>>>> (other parents of source) ----> target --- backing ---> source
>>>>>
>>>>> But it seems for me, that the following is not less valid:
>>>>>
>>>>> (other parents of source) ----> source = to_replace <--- backing --- X <--- backing --- target
>>>>>
>>>>> becomes
>>>>>
>>>>> (other parents of source) ----> target --- backing ---> X --- backing ---> source
>>>>
>>>> I think it is less valid.  The first case works with sync=none, because
>>>> target is initially empty and then you just copy all new data, so the
>>>> target keeps looking like the source.
>>>>
>>>> But in the second case, there are intermediate nodes that mean that
>>>> target may well not look like the source.
>>>
>>> Maybe, it's valid if target node is a filter? Or, otherwise, it's backing is a filter,
>>> but this seems less useful.
>>
>> The question to me is whether it’s really useful.  The thing is that
>> maybe bdrv_replace_node() can make sense of it.  But still, from the
>> user’s perspective, they kind of are asking for a loop whenever
>> to_replace is a child of target.  It just so happens that we must allow
>> one of these cases because it’s the default case for sync=none.
>>
>> So I’d rather forbid all such cases, because it should be understandable
>> to users why...
>>
> 
> Okay, I don't have more arguments:) Honestly, I just feel that relying on existing
> of chains between nodes of some hardcoded length is not good generic criteria...
> 
> bdrv_replace_node never creates loops.. Maybe, just document this behavior in
> qapi? And (maybe) return error, if we see that bdrv_replace_node will be noop?
> 
> And if it is not noop, may be user don't tries to create a loop, but instead,
> user is powerful, knows how bdrv_replace_node works and wants exactly this
> behavior?

I don’t know whether that’s a good point.  We have strong restrictions
on @replaces anyway (that’s the point of this series, to fix them).  So
if we want to loosen those restrictions and allow the user to do
anything they want because it’s their job to be careful, that would be a
whole different series.

Also, one of the examples in the commit message is using @replaces with
drive-mirror sync=none mode=absolute-paths.  @replaces must be a child
of the source.  So what will happen is that it’s replaced and then we
can’t attach the source as the backing file of the target.  So the
target will probably just read garbage, given that it now lacks the
source as its backing file.

So I’m not sold on “If the user knows what’ll happen, it’s all good”.
Because I don’t think they’ll really know.  I’d rather keep it tight
until someone complains.

Max
Vladimir Sementsov-Ogievskiy Dec. 2, 2019, 12:12 p.m. UTC | #7
11.11.2019 19:02, Max Reitz wrote:
> While bdrv_replace_node() will not follow through with it, a specific
> @replaces asks the mirror job to create a loop.
> 
> For example, say both the source and the target share a child where the
> source is a filter; by letting @replaces point to the common child, you
> ask for a loop.
> 
> Or if you use @replaces in drive-mirror with sync=none and
> mode=absolute-paths, you generally ask for a loop (@replaces must point
> to a child of the source, and sync=none makes the source the backing
> file of the target after the job).
> 
> bdrv_replace_node() will not create those loops, but by doing so, it
> ignores the user-requested configuration, which is not ideally either.
> (In the first example above, the target's child will remain what it was,
> which may still be reasonable.  But in the second example, the target
> will just not become a child of the source, which is precisely what was
> requested with @replaces.)
> 
> So prevent such configurations, both before the job, and before it
> actually completes.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block.c                   | 30 ++++++++++++++++++++++++
>   block/mirror.c            | 19 +++++++++++++++-
>   blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
>   include/block/block_int.h |  3 +++
>   4 files changed, 98 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 0159f8e510..e3922a0474 100644
> --- a/block.c
> +++ b/block.c
> @@ -6259,6 +6259,36 @@ out:
>       return to_replace_bs;
>   }
>   
> +/*
> + * Return true iff @child is a (recursive) child of @parent, with at
> + * least @min_level edges between them.
> + *
> + * (If @min_level == 0, return true if @child == @parent.  For
> + * @min_level == 1, @child needs to be at least a real child; for
> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
> + */
> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
> +                      int min_level)
> +{
> +    BdrvChild *c;
> +
> +    if (child == parent && min_level <= 0) {
> +        return true;
> +    }
> +
> +    if (!parent) {
> +        return false;
> +    }
> +
> +    QLIST_FOREACH(c, &parent->children, next) {
> +        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
> +            return true;
> +        }
> +    }
> +
> +    return false;
> +}
> +
>   /**
>    * Iterates through the list of runtime option keys that are said to
>    * be "strong" for a BDS.  An option is called "strong" if it changes
> diff --git a/block/mirror.c b/block/mirror.c
> index 68a4404666..b258c7e98b 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>            * there.
>            */
>           if (bdrv_recurse_can_replace(src, to_replace)) {
> -            bdrv_replace_node(to_replace, target_bs, &local_err);
> +            /*
> +             * It is OK for @to_replace to be an immediate child of
> +             * @target_bs, because that is what happens with
> +             * drive-mirror sync=none mode=absolute-paths: target_bs's
> +             * backing file will be the source node, which is also
> +             * to_replace (by default).
> +             * bdrv_replace_node() handles this case by not letting
> +             * target_bs->backing point to itself, but to the source
> +             * still.
> +             */
> +            if (!bdrv_is_child_of(to_replace, target_bs, 2)) {
> +                bdrv_replace_node(to_replace, target_bs, &local_err);
> +            } else {
> +                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
> +                           "because the former is now a child of the latter, "
> +                           "and doing so would thus create a loop",
> +                           to_replace->node_name, target_bs->node_name);
> +            }

you may swap if and else branch, dropping "!" mark..

>           } else {
>               error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>                          "because it can no longer be guaranteed that doing so "
> diff --git a/blockdev.c b/blockdev.c
> index 9dc2238bf3..d29f147f72 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -3824,7 +3824,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>       }
>   
>       if (has_replaces) {
> -        BlockDriverState *to_replace_bs;
> +        BlockDriverState *to_replace_bs, *target_backing_bs;
>           AioContext *replace_aio_context;
>           int64_t bs_size, replace_size;
>   
> @@ -3839,6 +3839,52 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>               return;
>           }
>   
> +        if (bdrv_is_child_of(to_replace_bs, target, 1)) {
> +            error_setg(errp, "Replacing %s by %s would result in a loop, "
> +                       "because the former is a child of the latter",
> +                       to_replace_bs->node_name, target->node_name);
> +            return;
> +        }

here min_level=1, so we don't handle the case, described in mirror_exit_common..
I don't see why.. blockdev_mirror_common is called from qmp_drive_mirror,
including the case with MIRROR_SYNC_MODE_NONE and NEW_IMAGE_MODE_ABSOLUTE_PATHS..

What I'm missing?

> +
> +        if (backing_mode == MIRROR_SOURCE_BACKING_CHAIN ||
> +            backing_mode == MIRROR_OPEN_BACKING_CHAIN)
> +        {
> +            /*
> +             * While we do not quite know what OPEN_BACKING_CHAIN
> +             * (used for mode=existing) will yield, it is probably
> +             * best to restrict it exactly like SOURCE_BACKING_CHAIN,
> +             * because that is our best guess.
> +             */
> +            switch (sync) {
> +            case MIRROR_SYNC_MODE_FULL:
> +                target_backing_bs = NULL;
> +                break;
> +
> +            case MIRROR_SYNC_MODE_TOP:
> +                target_backing_bs = backing_bs(bs);
> +                break;
> +
> +            case MIRROR_SYNC_MODE_NONE:
> +                target_backing_bs = bs;
> +                break;
> +
> +            default:
> +                abort();
> +            }
> +        } else {
> +            assert(backing_mode == MIRROR_LEAVE_BACKING_CHAIN);
> +            target_backing_bs = backing_bs(target);
> +        }
> +
> +        if (bdrv_is_child_of(to_replace_bs, target_backing_bs, 0)) {
> +            error_setg(errp, "Replacing '%s' by '%s' with this sync mode would "
> +                       "result in a loop, because the former would be a child "
> +                       "of the latter's backing file ('%s') after the mirror "
> +                       "job", to_replace_bs->node_name, target->node_name,
> +                       target_backing_bs->node_name);
> +            return;
> +        }

hmm.. so for MODE_NONE we disallow to_replace == src?

> +
>           replace_aio_context = bdrv_get_aio_context(to_replace_bs);
>           aio_context_acquire(replace_aio_context);
>           replace_size = bdrv_getlength(to_replace_bs);
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 589a797fab..7064a1a4fa 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -1266,6 +1266,9 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
>   bool bdrv_recurse_can_replace(BlockDriverState *bs,
>                                 BlockDriverState *to_replace);
>   
> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
> +                      int min_level);
> +
>   /*
>    * Default implementation for drivers to pass bdrv_co_block_status() to
>    * their file.
>
Max Reitz Dec. 9, 2019, 2:43 p.m. UTC | #8
On 02.12.19 13:12, Vladimir Sementsov-Ogievskiy wrote:
> 11.11.2019 19:02, Max Reitz wrote:
>> While bdrv_replace_node() will not follow through with it, a specific
>> @replaces asks the mirror job to create a loop.
>>
>> For example, say both the source and the target share a child where the
>> source is a filter; by letting @replaces point to the common child, you
>> ask for a loop.
>>
>> Or if you use @replaces in drive-mirror with sync=none and
>> mode=absolute-paths, you generally ask for a loop (@replaces must point
>> to a child of the source, and sync=none makes the source the backing
>> file of the target after the job).
>>
>> bdrv_replace_node() will not create those loops, but by doing so, it
>> ignores the user-requested configuration, which is not ideally either.
>> (In the first example above, the target's child will remain what it was,
>> which may still be reasonable.  But in the second example, the target
>> will just not become a child of the source, which is precisely what was
>> requested with @replaces.)
>>
>> So prevent such configurations, both before the job, and before it
>> actually completes.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block.c                   | 30 ++++++++++++++++++++++++
>>   block/mirror.c            | 19 +++++++++++++++-
>>   blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
>>   include/block/block_int.h |  3 +++
>>   4 files changed, 98 insertions(+), 2 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 0159f8e510..e3922a0474 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -6259,6 +6259,36 @@ out:
>>       return to_replace_bs;
>>   }
>>   
>> +/*
>> + * Return true iff @child is a (recursive) child of @parent, with at
>> + * least @min_level edges between them.
>> + *
>> + * (If @min_level == 0, return true if @child == @parent.  For
>> + * @min_level == 1, @child needs to be at least a real child; for
>> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
>> + */
>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>> +                      int min_level)
>> +{
>> +    BdrvChild *c;
>> +
>> +    if (child == parent && min_level <= 0) {
>> +        return true;
>> +    }
>> +
>> +    if (!parent) {
>> +        return false;
>> +    }
>> +
>> +    QLIST_FOREACH(c, &parent->children, next) {
>> +        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
>> +            return true;
>> +        }
>> +    }
>> +
>> +    return false;
>> +}
>> +
>>   /**
>>    * Iterates through the list of runtime option keys that are said to
>>    * be "strong" for a BDS.  An option is called "strong" if it changes
>> diff --git a/block/mirror.c b/block/mirror.c
>> index 68a4404666..b258c7e98b 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>>            * there.
>>            */
>>           if (bdrv_recurse_can_replace(src, to_replace)) {
>> -            bdrv_replace_node(to_replace, target_bs, &local_err);
>> +            /*
>> +             * It is OK for @to_replace to be an immediate child of
>> +             * @target_bs, because that is what happens with
>> +             * drive-mirror sync=none mode=absolute-paths: target_bs's
>> +             * backing file will be the source node, which is also
>> +             * to_replace (by default).
>> +             * bdrv_replace_node() handles this case by not letting
>> +             * target_bs->backing point to itself, but to the source
>> +             * still.
>> +             */
>> +            if (!bdrv_is_child_of(to_replace, target_bs, 2)) {
>> +                bdrv_replace_node(to_replace, target_bs, &local_err);
>> +            } else {
>> +                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>> +                           "because the former is now a child of the latter, "
>> +                           "and doing so would thus create a loop",
>> +                           to_replace->node_name, target_bs->node_name);
>> +            }
> 
> you may swap if and else branch, dropping "!" mark..

Yes, but I just personally prefer to have the error case in the else branch.

>>           } else {
>>               error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>                          "because it can no longer be guaranteed that doing so "
>> diff --git a/blockdev.c b/blockdev.c
>> index 9dc2238bf3..d29f147f72 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -3824,7 +3824,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>       }
>>   
>>       if (has_replaces) {
>> -        BlockDriverState *to_replace_bs;
>> +        BlockDriverState *to_replace_bs, *target_backing_bs;
>>           AioContext *replace_aio_context;
>>           int64_t bs_size, replace_size;
>>   
>> @@ -3839,6 +3839,52 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>               return;
>>           }
>>   
>> +        if (bdrv_is_child_of(to_replace_bs, target, 1)) {
>> +            error_setg(errp, "Replacing %s by %s would result in a loop, "
>> +                       "because the former is a child of the latter",
>> +                       to_replace_bs->node_name, target->node_name);
>> +            return;
>> +        }
> 
> here min_level=1, so we don't handle the case, described in mirror_exit_common..
> I don't see why.. blockdev_mirror_common is called from qmp_drive_mirror,
> including the case with MIRROR_SYNC_MODE_NONE and NEW_IMAGE_MODE_ABSOLUTE_PATHS..
> 
> What I'm missing?

Hmm.  Well.

If it broke drive-mirror sync=none, I suppose I would have noticed by
running the iotests.  But I didn’t, and that’s because this code here is
reached only if the user actually specified @replaces.  (As opposed to
the mirror_exit_common code, where @to_replace may simply be @src if not
overridden by the user.)

The only reason why I allow it in mirror_exit_common is because we have
to.  But if the user manually specifies this configuration, we can’t
guarantee it’s safe.

OTOH, well, if we allow it for drive-mirror sync=none, why not allow it
when manually specified with blockdev-mirror?

What’s your opinion?

>> +
>> +        if (backing_mode == MIRROR_SOURCE_BACKING_CHAIN ||
>> +            backing_mode == MIRROR_OPEN_BACKING_CHAIN)
>> +        {
>> +            /*
>> +             * While we do not quite know what OPEN_BACKING_CHAIN
>> +             * (used for mode=existing) will yield, it is probably
>> +             * best to restrict it exactly like SOURCE_BACKING_CHAIN,
>> +             * because that is our best guess.
>> +             */
>> +            switch (sync) {
>> +            case MIRROR_SYNC_MODE_FULL:
>> +                target_backing_bs = NULL;
>> +                break;
>> +
>> +            case MIRROR_SYNC_MODE_TOP:
>> +                target_backing_bs = backing_bs(bs);
>> +                break;
>> +
>> +            case MIRROR_SYNC_MODE_NONE:
>> +                target_backing_bs = bs;
>> +                break;
>> +
>> +            default:
>> +                abort();
>> +            }
>> +        } else {
>> +            assert(backing_mode == MIRROR_LEAVE_BACKING_CHAIN);
>> +            target_backing_bs = backing_bs(target);
>> +        }
>> +
>> +        if (bdrv_is_child_of(to_replace_bs, target_backing_bs, 0)) {
>> +            error_setg(errp, "Replacing '%s' by '%s' with this sync mode would "
>> +                       "result in a loop, because the former would be a child "
>> +                       "of the latter's backing file ('%s') after the mirror "
>> +                       "job", to_replace_bs->node_name, target->node_name,
>> +                       target_backing_bs->node_name);
>> +            return;
>> +        }
> 
> hmm.. so for MODE_NONE we disallow to_replace == src?

I suppose that’s basically the same as above.  Should we allow this case
when specified explicitly by the user?

Max

>> +
>>           replace_aio_context = bdrv_get_aio_context(to_replace_bs);
>>           aio_context_acquire(replace_aio_context);
>>           replace_size = bdrv_getlength(to_replace_bs);
>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>> index 589a797fab..7064a1a4fa 100644
>> --- a/include/block/block_int.h
>> +++ b/include/block/block_int.h
>> @@ -1266,6 +1266,9 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
>>   bool bdrv_recurse_can_replace(BlockDriverState *bs,
>>                                 BlockDriverState *to_replace);
>>   
>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>> +                      int min_level);
>> +
>>   /*
>>    * Default implementation for drivers to pass bdrv_co_block_status() to
>>    * their file.
>>
> 
>
Vladimir Sementsov-Ogievskiy Dec. 13, 2019, 11:18 a.m. UTC | #9
09.12.2019 17:43, Max Reitz wrote:
> On 02.12.19 13:12, Vladimir Sementsov-Ogievskiy wrote:
>> 11.11.2019 19:02, Max Reitz wrote:
>>> While bdrv_replace_node() will not follow through with it, a specific
>>> @replaces asks the mirror job to create a loop.
>>>
>>> For example, say both the source and the target share a child where the
>>> source is a filter; by letting @replaces point to the common child, you
>>> ask for a loop.
>>>
>>> Or if you use @replaces in drive-mirror with sync=none and
>>> mode=absolute-paths, you generally ask for a loop (@replaces must point
>>> to a child of the source, and sync=none makes the source the backing
>>> file of the target after the job).
>>>
>>> bdrv_replace_node() will not create those loops, but by doing so, it
>>> ignores the user-requested configuration, which is not ideally either.
>>> (In the first example above, the target's child will remain what it was,
>>> which may still be reasonable.  But in the second example, the target
>>> will just not become a child of the source, which is precisely what was
>>> requested with @replaces.)
>>>
>>> So prevent such configurations, both before the job, and before it
>>> actually completes.
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>    block.c                   | 30 ++++++++++++++++++++++++
>>>    block/mirror.c            | 19 +++++++++++++++-
>>>    blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
>>>    include/block/block_int.h |  3 +++
>>>    4 files changed, 98 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/block.c b/block.c
>>> index 0159f8e510..e3922a0474 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -6259,6 +6259,36 @@ out:
>>>        return to_replace_bs;
>>>    }
>>>    
>>> +/*
>>> + * Return true iff @child is a (recursive) child of @parent, with at
>>> + * least @min_level edges between them.
>>> + *
>>> + * (If @min_level == 0, return true if @child == @parent.  For
>>> + * @min_level == 1, @child needs to be at least a real child; for
>>> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
>>> + */
>>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>>> +                      int min_level)
>>> +{
>>> +    BdrvChild *c;
>>> +
>>> +    if (child == parent && min_level <= 0) {
>>> +        return true;
>>> +    }
>>> +
>>> +    if (!parent) {
>>> +        return false;
>>> +    }
>>> +
>>> +    QLIST_FOREACH(c, &parent->children, next) {
>>> +        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
>>> +            return true;
>>> +        }
>>> +    }
>>> +
>>> +    return false;
>>> +}
>>> +
>>>    /**
>>>     * Iterates through the list of runtime option keys that are said to
>>>     * be "strong" for a BDS.  An option is called "strong" if it changes
>>> diff --git a/block/mirror.c b/block/mirror.c
>>> index 68a4404666..b258c7e98b 100644
>>> --- a/block/mirror.c
>>> +++ b/block/mirror.c
>>> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>>>             * there.
>>>             */
>>>            if (bdrv_recurse_can_replace(src, to_replace)) {
>>> -            bdrv_replace_node(to_replace, target_bs, &local_err);
>>> +            /*
>>> +             * It is OK for @to_replace to be an immediate child of
>>> +             * @target_bs, because that is what happens with
>>> +             * drive-mirror sync=none mode=absolute-paths: target_bs's
>>> +             * backing file will be the source node, which is also
>>> +             * to_replace (by default).
>>> +             * bdrv_replace_node() handles this case by not letting
>>> +             * target_bs->backing point to itself, but to the source
>>> +             * still.
>>> +             */
>>> +            if (!bdrv_is_child_of(to_replace, target_bs, 2)) {
>>> +                bdrv_replace_node(to_replace, target_bs, &local_err);
>>> +            } else {
>>> +                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>> +                           "because the former is now a child of the latter, "
>>> +                           "and doing so would thus create a loop",
>>> +                           to_replace->node_name, target_bs->node_name);
>>> +            }
>>
>> you may swap if and else branch, dropping "!" mark..
> 
> Yes, but I just personally prefer to have the error case in the else branch.
> 
>>>            } else {
>>>                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>>                           "because it can no longer be guaranteed that doing so "
>>> diff --git a/blockdev.c b/blockdev.c
>>> index 9dc2238bf3..d29f147f72 100644
>>> --- a/blockdev.c
>>> +++ b/blockdev.c
>>> @@ -3824,7 +3824,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>>        }
>>>    
>>>        if (has_replaces) {
>>> -        BlockDriverState *to_replace_bs;
>>> +        BlockDriverState *to_replace_bs, *target_backing_bs;
>>>            AioContext *replace_aio_context;
>>>            int64_t bs_size, replace_size;
>>>    
>>> @@ -3839,6 +3839,52 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>>                return;
>>>            }
>>>    
>>> +        if (bdrv_is_child_of(to_replace_bs, target, 1)) {
>>> +            error_setg(errp, "Replacing %s by %s would result in a loop, "
>>> +                       "because the former is a child of the latter",
>>> +                       to_replace_bs->node_name, target->node_name);
>>> +            return;
>>> +        }
>>
>> here min_level=1, so we don't handle the case, described in mirror_exit_common..
>> I don't see why.. blockdev_mirror_common is called from qmp_drive_mirror,
>> including the case with MIRROR_SYNC_MODE_NONE and NEW_IMAGE_MODE_ABSOLUTE_PATHS..
>>
>> What I'm missing?
> 
> Hmm.  Well.
> 
> If it broke drive-mirror sync=none, I suppose I would have noticed by
> running the iotests.  But I didn’t, and that’s because this code here is
> reached only if the user actually specified @replaces.  (As opposed to
> the mirror_exit_common code, where @to_replace may simply be @src if not
> overridden by the user.)
> 
> The only reason why I allow it in mirror_exit_common is because we have
> to.  But if the user manually specifies this configuration, we can’t
> guarantee it’s safe.
> 
> OTOH, well, if we allow it for drive-mirror sync=none, why not allow it
> when manually specified with blockdev-mirror?
> 
> What’s your opinion?

Hmm, I think, that allowing to_replaces to be direct backing child of target
(like in mirror_exit_common) is safe enough. User doesn't know that
such replacing includes also replacing own child of the target,
which leads to the loop.. It's not obvious. And behavior of
bdrv_replace_node() which just doesn't create this loop, doesn't
seem something too tricky. Hmm..

We could mention in qapi spec, that replacing doesn't break backing
link of the target, for it to be absolutely defined.

But should we allow replaces to be some other (not backing and not filtered)
child of target?..

> 
>>> +
>>> +        if (backing_mode == MIRROR_SOURCE_BACKING_CHAIN ||
>>> +            backing_mode == MIRROR_OPEN_BACKING_CHAIN)
>>> +        {
>>> +            /*
>>> +             * While we do not quite know what OPEN_BACKING_CHAIN
>>> +             * (used for mode=existing) will yield, it is probably
>>> +             * best to restrict it exactly like SOURCE_BACKING_CHAIN,
>>> +             * because that is our best guess.
>>> +             */
>>> +            switch (sync) {
>>> +            case MIRROR_SYNC_MODE_FULL:
>>> +                target_backing_bs = NULL;
>>> +                break;
>>> +
>>> +            case MIRROR_SYNC_MODE_TOP:
>>> +                target_backing_bs = backing_bs(bs);
>>> +                break;
>>> +
>>> +            case MIRROR_SYNC_MODE_NONE:
>>> +                target_backing_bs = bs;
>>> +                break;
>>> +
>>> +            default:
>>> +                abort();
>>> +            }
>>> +        } else {
>>> +            assert(backing_mode == MIRROR_LEAVE_BACKING_CHAIN);
>>> +            target_backing_bs = backing_bs(target);
>>> +        }
>>> +
>>> +        if (bdrv_is_child_of(to_replace_bs, target_backing_bs, 0)) {
>>> +            error_setg(errp, "Replacing '%s' by '%s' with this sync mode would "
>>> +                       "result in a loop, because the former would be a child "
>>> +                       "of the latter's backing file ('%s') after the mirror "
>>> +                       "job", to_replace_bs->node_name, target->node_name,
>>> +                       target_backing_bs->node_name);
>>> +            return;
>>> +        }
>>
>> hmm.. so for MODE_NONE we disallow to_replace == src?
> 
> I suppose that’s basically the same as above.  Should we allow this case
> when specified explicitly by the user?
> 

I'm a bit more closer to allowing it, for consistency with automatic path, with
unspecified replaces. Are we sure that nobody uses it?

> 
>>> +
>>>            replace_aio_context = bdrv_get_aio_context(to_replace_bs);
>>>            aio_context_acquire(replace_aio_context);
>>>            replace_size = bdrv_getlength(to_replace_bs);
>>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>>> index 589a797fab..7064a1a4fa 100644
>>> --- a/include/block/block_int.h
>>> +++ b/include/block/block_int.h
>>> @@ -1266,6 +1266,9 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
>>>    bool bdrv_recurse_can_replace(BlockDriverState *bs,
>>>                                  BlockDriverState *to_replace);
>>>    
>>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>>> +                      int min_level);
>>> +
>>>    /*
>>>     * Default implementation for drivers to pass bdrv_co_block_status() to
>>>     * their file.
>>>
>>
>>
> 
>
Max Reitz Dec. 20, 2019, 11:39 a.m. UTC | #10
On 13.12.19 12:18, Vladimir Sementsov-Ogievskiy wrote:
> 09.12.2019 17:43, Max Reitz wrote:
>> On 02.12.19 13:12, Vladimir Sementsov-Ogievskiy wrote:
>>> 11.11.2019 19:02, Max Reitz wrote:
>>>> While bdrv_replace_node() will not follow through with it, a specific
>>>> @replaces asks the mirror job to create a loop.
>>>>
>>>> For example, say both the source and the target share a child where the
>>>> source is a filter; by letting @replaces point to the common child, you
>>>> ask for a loop.
>>>>
>>>> Or if you use @replaces in drive-mirror with sync=none and
>>>> mode=absolute-paths, you generally ask for a loop (@replaces must point
>>>> to a child of the source, and sync=none makes the source the backing
>>>> file of the target after the job).
>>>>
>>>> bdrv_replace_node() will not create those loops, but by doing so, it
>>>> ignores the user-requested configuration, which is not ideally either.
>>>> (In the first example above, the target's child will remain what it was,
>>>> which may still be reasonable.  But in the second example, the target
>>>> will just not become a child of the source, which is precisely what was
>>>> requested with @replaces.)
>>>>
>>>> So prevent such configurations, both before the job, and before it
>>>> actually completes.
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>> ---
>>>>    block.c                   | 30 ++++++++++++++++++++++++
>>>>    block/mirror.c            | 19 +++++++++++++++-
>>>>    blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
>>>>    include/block/block_int.h |  3 +++
>>>>    4 files changed, 98 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/block.c b/block.c
>>>> index 0159f8e510..e3922a0474 100644
>>>> --- a/block.c
>>>> +++ b/block.c
>>>> @@ -6259,6 +6259,36 @@ out:
>>>>        return to_replace_bs;
>>>>    }
>>>>    
>>>> +/*
>>>> + * Return true iff @child is a (recursive) child of @parent, with at
>>>> + * least @min_level edges between them.
>>>> + *
>>>> + * (If @min_level == 0, return true if @child == @parent.  For
>>>> + * @min_level == 1, @child needs to be at least a real child; for
>>>> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
>>>> + */
>>>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>>>> +                      int min_level)
>>>> +{
>>>> +    BdrvChild *c;
>>>> +
>>>> +    if (child == parent && min_level <= 0) {
>>>> +        return true;
>>>> +    }
>>>> +
>>>> +    if (!parent) {
>>>> +        return false;
>>>> +    }
>>>> +
>>>> +    QLIST_FOREACH(c, &parent->children, next) {
>>>> +        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
>>>> +            return true;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return false;
>>>> +}
>>>> +
>>>>    /**
>>>>     * Iterates through the list of runtime option keys that are said to
>>>>     * be "strong" for a BDS.  An option is called "strong" if it changes
>>>> diff --git a/block/mirror.c b/block/mirror.c
>>>> index 68a4404666..b258c7e98b 100644
>>>> --- a/block/mirror.c
>>>> +++ b/block/mirror.c
>>>> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>>>>             * there.
>>>>             */
>>>>            if (bdrv_recurse_can_replace(src, to_replace)) {
>>>> -            bdrv_replace_node(to_replace, target_bs, &local_err);
>>>> +            /*
>>>> +             * It is OK for @to_replace to be an immediate child of
>>>> +             * @target_bs, because that is what happens with
>>>> +             * drive-mirror sync=none mode=absolute-paths: target_bs's
>>>> +             * backing file will be the source node, which is also
>>>> +             * to_replace (by default).
>>>> +             * bdrv_replace_node() handles this case by not letting
>>>> +             * target_bs->backing point to itself, but to the source
>>>> +             * still.
>>>> +             */
>>>> +            if (!bdrv_is_child_of(to_replace, target_bs, 2)) {
>>>> +                bdrv_replace_node(to_replace, target_bs, &local_err);
>>>> +            } else {
>>>> +                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>>> +                           "because the former is now a child of the latter, "
>>>> +                           "and doing so would thus create a loop",
>>>> +                           to_replace->node_name, target_bs->node_name);
>>>> +            }
>>>
>>> you may swap if and else branch, dropping "!" mark..
>>
>> Yes, but I just personally prefer to have the error case in the else branch.
>>
>>>>            } else {
>>>>                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>>>                           "because it can no longer be guaranteed that doing so "
>>>> diff --git a/blockdev.c b/blockdev.c
>>>> index 9dc2238bf3..d29f147f72 100644
>>>> --- a/blockdev.c
>>>> +++ b/blockdev.c
>>>> @@ -3824,7 +3824,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>>>        }
>>>>    
>>>>        if (has_replaces) {
>>>> -        BlockDriverState *to_replace_bs;
>>>> +        BlockDriverState *to_replace_bs, *target_backing_bs;
>>>>            AioContext *replace_aio_context;
>>>>            int64_t bs_size, replace_size;
>>>>    
>>>> @@ -3839,6 +3839,52 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>>>                return;
>>>>            }
>>>>    
>>>> +        if (bdrv_is_child_of(to_replace_bs, target, 1)) {
>>>> +            error_setg(errp, "Replacing %s by %s would result in a loop, "
>>>> +                       "because the former is a child of the latter",
>>>> +                       to_replace_bs->node_name, target->node_name);
>>>> +            return;
>>>> +        }
>>>
>>> here min_level=1, so we don't handle the case, described in mirror_exit_common..
>>> I don't see why.. blockdev_mirror_common is called from qmp_drive_mirror,
>>> including the case with MIRROR_SYNC_MODE_NONE and NEW_IMAGE_MODE_ABSOLUTE_PATHS..
>>>
>>> What I'm missing?
>>
>> Hmm.  Well.
>>
>> If it broke drive-mirror sync=none, I suppose I would have noticed by
>> running the iotests.  But I didn’t, and that’s because this code here is
>> reached only if the user actually specified @replaces.  (As opposed to
>> the mirror_exit_common code, where @to_replace may simply be @src if not
>> overridden by the user.)
>>
>> The only reason why I allow it in mirror_exit_common is because we have
>> to.  But if the user manually specifies this configuration, we can’t
>> guarantee it’s safe.
>>
>> OTOH, well, if we allow it for drive-mirror sync=none, why not allow it
>> when manually specified with blockdev-mirror?
>>
>> What’s your opinion?
> 
> Hmm, I think, that allowing to_replaces to be direct backing child of target
> (like in mirror_exit_common) is safe enough. User doesn't know that
> such replacing includes also replacing own child of the target,
> which leads to the loop.. It's not obvious. And behavior of
> bdrv_replace_node() which just doesn't create this loop, doesn't
> seem something too tricky. Hmm..
> 
> We could mention in qapi spec, that replacing doesn't break backing
> link of the target, for it to be absolutely defined.
> 
> But should we allow replaces to be some other (not backing and not filtered)
> child of target?..

Well, my opinion is that this is a bit of weird thing to do and that it
basically does ask for a loop.

I’m OK with excluding the sync=none case, because (1) that’s so
obviously a loop that it can’t be what the user honestly wants; (2) how
it’s resolved is rather obvious, too: There is exactly one edge that
causes the loop, so you simply don’t change that one; (3) drive-mirror
sync=none does this case automatically, so we should probably allow
users to do it manually with blockdev-mirror, too.

>>>> +
>>>> +        if (backing_mode == MIRROR_SOURCE_BACKING_CHAIN ||
>>>> +            backing_mode == MIRROR_OPEN_BACKING_CHAIN)
>>>> +        {
>>>> +            /*
>>>> +             * While we do not quite know what OPEN_BACKING_CHAIN
>>>> +             * (used for mode=existing) will yield, it is probably
>>>> +             * best to restrict it exactly like SOURCE_BACKING_CHAIN,
>>>> +             * because that is our best guess.
>>>> +             */
>>>> +            switch (sync) {
>>>> +            case MIRROR_SYNC_MODE_FULL:
>>>> +                target_backing_bs = NULL;
>>>> +                break;
>>>> +
>>>> +            case MIRROR_SYNC_MODE_TOP:
>>>> +                target_backing_bs = backing_bs(bs);
>>>> +                break;
>>>> +
>>>> +            case MIRROR_SYNC_MODE_NONE:
>>>> +                target_backing_bs = bs;
>>>> +                break;
>>>> +
>>>> +            default:
>>>> +                abort();
>>>> +            }
>>>> +        } else {
>>>> +            assert(backing_mode == MIRROR_LEAVE_BACKING_CHAIN);
>>>> +            target_backing_bs = backing_bs(target);
>>>> +        }
>>>> +
>>>> +        if (bdrv_is_child_of(to_replace_bs, target_backing_bs, 0)) {
>>>> +            error_setg(errp, "Replacing '%s' by '%s' with this sync mode would "
>>>> +                       "result in a loop, because the former would be a child "
>>>> +                       "of the latter's backing file ('%s') after the mirror "
>>>> +                       "job", to_replace_bs->node_name, target->node_name,
>>>> +                       target_backing_bs->node_name);
>>>> +            return;
>>>> +        }
>>>
>>> hmm.. so for MODE_NONE we disallow to_replace == src?
>>
>> I suppose that’s basically the same as above.  Should we allow this case
>> when specified explicitly by the user?
>>
> 
> I'm a bit more closer to allowing it, for consistency with automatic path, with
> unspecified replaces. Are we sure that nobody uses it?

Well, there are multiple cases, as shown in the commit message.  I think
that for drive-mirror sync=none, nobody uses @replaces, because it just
doesn’t work.

But, well, that’s just because drive-mirror does graph manipulation that
blockdev-mirror doesn’t (i.e., changing the target’s backing file on
completion).  So maybe we should just prevent loops for drive-mirror,
but let the user do what they want when they use blockdev-mirror?

Max
Vladimir Sementsov-Ogievskiy Dec. 20, 2019, 11:55 a.m. UTC | #11
20.12.2019 14:39, Max Reitz wrote:
> On 13.12.19 12:18, Vladimir Sementsov-Ogievskiy wrote:
>> 09.12.2019 17:43, Max Reitz wrote:
>>> On 02.12.19 13:12, Vladimir Sementsov-Ogievskiy wrote:
>>>> 11.11.2019 19:02, Max Reitz wrote:
>>>>> While bdrv_replace_node() will not follow through with it, a specific
>>>>> @replaces asks the mirror job to create a loop.
>>>>>
>>>>> For example, say both the source and the target share a child where the
>>>>> source is a filter; by letting @replaces point to the common child, you
>>>>> ask for a loop.
>>>>>
>>>>> Or if you use @replaces in drive-mirror with sync=none and
>>>>> mode=absolute-paths, you generally ask for a loop (@replaces must point
>>>>> to a child of the source, and sync=none makes the source the backing
>>>>> file of the target after the job).
>>>>>
>>>>> bdrv_replace_node() will not create those loops, but by doing so, it
>>>>> ignores the user-requested configuration, which is not ideally either.
>>>>> (In the first example above, the target's child will remain what it was,
>>>>> which may still be reasonable.  But in the second example, the target
>>>>> will just not become a child of the source, which is precisely what was
>>>>> requested with @replaces.)
>>>>>
>>>>> So prevent such configurations, both before the job, and before it
>>>>> actually completes.
>>>>>
>>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>>> ---
>>>>>     block.c                   | 30 ++++++++++++++++++++++++
>>>>>     block/mirror.c            | 19 +++++++++++++++-
>>>>>     blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
>>>>>     include/block/block_int.h |  3 +++
>>>>>     4 files changed, 98 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/block.c b/block.c
>>>>> index 0159f8e510..e3922a0474 100644
>>>>> --- a/block.c
>>>>> +++ b/block.c
>>>>> @@ -6259,6 +6259,36 @@ out:
>>>>>         return to_replace_bs;
>>>>>     }
>>>>>     
>>>>> +/*
>>>>> + * Return true iff @child is a (recursive) child of @parent, with at
>>>>> + * least @min_level edges between them.
>>>>> + *
>>>>> + * (If @min_level == 0, return true if @child == @parent.  For
>>>>> + * @min_level == 1, @child needs to be at least a real child; for
>>>>> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
>>>>> + */
>>>>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>>>>> +                      int min_level)
>>>>> +{
>>>>> +    BdrvChild *c;
>>>>> +
>>>>> +    if (child == parent && min_level <= 0) {
>>>>> +        return true;
>>>>> +    }
>>>>> +
>>>>> +    if (!parent) {
>>>>> +        return false;
>>>>> +    }
>>>>> +
>>>>> +    QLIST_FOREACH(c, &parent->children, next) {
>>>>> +        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
>>>>> +            return true;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    return false;
>>>>> +}
>>>>> +
>>>>>     /**
>>>>>      * Iterates through the list of runtime option keys that are said to
>>>>>      * be "strong" for a BDS.  An option is called "strong" if it changes
>>>>> diff --git a/block/mirror.c b/block/mirror.c
>>>>> index 68a4404666..b258c7e98b 100644
>>>>> --- a/block/mirror.c
>>>>> +++ b/block/mirror.c
>>>>> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>>>>>              * there.
>>>>>              */
>>>>>             if (bdrv_recurse_can_replace(src, to_replace)) {
>>>>> -            bdrv_replace_node(to_replace, target_bs, &local_err);
>>>>> +            /*
>>>>> +             * It is OK for @to_replace to be an immediate child of
>>>>> +             * @target_bs, because that is what happens with
>>>>> +             * drive-mirror sync=none mode=absolute-paths: target_bs's
>>>>> +             * backing file will be the source node, which is also
>>>>> +             * to_replace (by default).
>>>>> +             * bdrv_replace_node() handles this case by not letting
>>>>> +             * target_bs->backing point to itself, but to the source
>>>>> +             * still.
>>>>> +             */
>>>>> +            if (!bdrv_is_child_of(to_replace, target_bs, 2)) {
>>>>> +                bdrv_replace_node(to_replace, target_bs, &local_err);
>>>>> +            } else {
>>>>> +                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>>>> +                           "because the former is now a child of the latter, "
>>>>> +                           "and doing so would thus create a loop",
>>>>> +                           to_replace->node_name, target_bs->node_name);
>>>>> +            }
>>>>
>>>> you may swap if and else branch, dropping "!" mark..
>>>
>>> Yes, but I just personally prefer to have the error case in the else branch.
>>>
>>>>>             } else {
>>>>>                 error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>>>>                            "because it can no longer be guaranteed that doing so "
>>>>> diff --git a/blockdev.c b/blockdev.c
>>>>> index 9dc2238bf3..d29f147f72 100644
>>>>> --- a/blockdev.c
>>>>> +++ b/blockdev.c
>>>>> @@ -3824,7 +3824,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>>>>         }
>>>>>     
>>>>>         if (has_replaces) {
>>>>> -        BlockDriverState *to_replace_bs;
>>>>> +        BlockDriverState *to_replace_bs, *target_backing_bs;
>>>>>             AioContext *replace_aio_context;
>>>>>             int64_t bs_size, replace_size;
>>>>>     
>>>>> @@ -3839,6 +3839,52 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>>>>                 return;
>>>>>             }
>>>>>     
>>>>> +        if (bdrv_is_child_of(to_replace_bs, target, 1)) {
>>>>> +            error_setg(errp, "Replacing %s by %s would result in a loop, "
>>>>> +                       "because the former is a child of the latter",
>>>>> +                       to_replace_bs->node_name, target->node_name);
>>>>> +            return;
>>>>> +        }
>>>>
>>>> here min_level=1, so we don't handle the case, described in mirror_exit_common..
>>>> I don't see why.. blockdev_mirror_common is called from qmp_drive_mirror,
>>>> including the case with MIRROR_SYNC_MODE_NONE and NEW_IMAGE_MODE_ABSOLUTE_PATHS..
>>>>
>>>> What I'm missing?
>>>
>>> Hmm.  Well.
>>>
>>> If it broke drive-mirror sync=none, I suppose I would have noticed by
>>> running the iotests.  But I didn’t, and that’s because this code here is
>>> reached only if the user actually specified @replaces.  (As opposed to
>>> the mirror_exit_common code, where @to_replace may simply be @src if not
>>> overridden by the user.)
>>>
>>> The only reason why I allow it in mirror_exit_common is because we have
>>> to.  But if the user manually specifies this configuration, we can’t
>>> guarantee it’s safe.
>>>
>>> OTOH, well, if we allow it for drive-mirror sync=none, why not allow it
>>> when manually specified with blockdev-mirror?
>>>
>>> What’s your opinion?
>>
>> Hmm, I think, that allowing to_replaces to be direct backing child of target
>> (like in mirror_exit_common) is safe enough. User doesn't know that
>> such replacing includes also replacing own child of the target,
>> which leads to the loop.. It's not obvious. And behavior of
>> bdrv_replace_node() which just doesn't create this loop, doesn't
>> seem something too tricky. Hmm..
>>
>> We could mention in qapi spec, that replacing doesn't break backing
>> link of the target, for it to be absolutely defined.
>>
>> But should we allow replaces to be some other (not backing and not filtered)
>> child of target?..
> 
> Well, my opinion is that this is a bit of weird thing to do and that it
> basically does ask for a loop.
> 
> I’m OK with excluding the sync=none case, because (1) that’s so
> obviously a loop that it can’t be what the user honestly wants; (2) how
> it’s resolved is rather obvious, too: There is exactly one edge that
> causes the loop, so you simply don’t change that one; (3) drive-mirror
> sync=none does this case automatically, so we should probably allow
> users to do it manually with blockdev-mirror, too.
> 
>>>>> +
>>>>> +        if (backing_mode == MIRROR_SOURCE_BACKING_CHAIN ||
>>>>> +            backing_mode == MIRROR_OPEN_BACKING_CHAIN)
>>>>> +        {
>>>>> +            /*
>>>>> +             * While we do not quite know what OPEN_BACKING_CHAIN
>>>>> +             * (used for mode=existing) will yield, it is probably
>>>>> +             * best to restrict it exactly like SOURCE_BACKING_CHAIN,
>>>>> +             * because that is our best guess.
>>>>> +             */
>>>>> +            switch (sync) {
>>>>> +            case MIRROR_SYNC_MODE_FULL:
>>>>> +                target_backing_bs = NULL;
>>>>> +                break;
>>>>> +
>>>>> +            case MIRROR_SYNC_MODE_TOP:
>>>>> +                target_backing_bs = backing_bs(bs);
>>>>> +                break;
>>>>> +
>>>>> +            case MIRROR_SYNC_MODE_NONE:
>>>>> +                target_backing_bs = bs;
>>>>> +                break;
>>>>> +
>>>>> +            default:
>>>>> +                abort();
>>>>> +            }
>>>>> +        } else {
>>>>> +            assert(backing_mode == MIRROR_LEAVE_BACKING_CHAIN);
>>>>> +            target_backing_bs = backing_bs(target);
>>>>> +        }
>>>>> +
>>>>> +        if (bdrv_is_child_of(to_replace_bs, target_backing_bs, 0)) {
>>>>> +            error_setg(errp, "Replacing '%s' by '%s' with this sync mode would "
>>>>> +                       "result in a loop, because the former would be a child "
>>>>> +                       "of the latter's backing file ('%s') after the mirror "
>>>>> +                       "job", to_replace_bs->node_name, target->node_name,
>>>>> +                       target_backing_bs->node_name);
>>>>> +            return;
>>>>> +        }
>>>>
>>>> hmm.. so for MODE_NONE we disallow to_replace == src?
>>>
>>> I suppose that’s basically the same as above.  Should we allow this case
>>> when specified explicitly by the user?
>>>
>>
>> I'm a bit more closer to allowing it, for consistency with automatic path, with
>> unspecified replaces. Are we sure that nobody uses it?
> 
> Well, there are multiple cases, as shown in the commit message.  I think
> that for drive-mirror sync=none, nobody uses @replaces, because it just
> doesn’t work.
> 
> But, well, that’s just because drive-mirror does graph manipulation that
> blockdev-mirror doesn’t (i.e., changing the target’s backing file on
> completion).  So maybe we should just prevent loops for drive-mirror,
> but let the user do what they want when they use blockdev-mirror?
> 

Well, the question finally is, how much to restrict from things for which we
don't know are they useful or not. I don't know) I think, finally, I'm OK with
either way we discussed, or with this patch as is. If it breaks some existing
scenario it will be easy to fix.
Max Reitz Dec. 20, 2019, 12:10 p.m. UTC | #12
On 20.12.19 12:55, Vladimir Sementsov-Ogievskiy wrote:
> 20.12.2019 14:39, Max Reitz wrote:
>> On 13.12.19 12:18, Vladimir Sementsov-Ogievskiy wrote:
>>> 09.12.2019 17:43, Max Reitz wrote:
>>>> On 02.12.19 13:12, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 11.11.2019 19:02, Max Reitz wrote:
>>>>>> While bdrv_replace_node() will not follow through with it, a specific
>>>>>> @replaces asks the mirror job to create a loop.
>>>>>>
>>>>>> For example, say both the source and the target share a child where the
>>>>>> source is a filter; by letting @replaces point to the common child, you
>>>>>> ask for a loop.
>>>>>>
>>>>>> Or if you use @replaces in drive-mirror with sync=none and
>>>>>> mode=absolute-paths, you generally ask for a loop (@replaces must point
>>>>>> to a child of the source, and sync=none makes the source the backing
>>>>>> file of the target after the job).
>>>>>>
>>>>>> bdrv_replace_node() will not create those loops, but by doing so, it
>>>>>> ignores the user-requested configuration, which is not ideally either.
>>>>>> (In the first example above, the target's child will remain what it was,
>>>>>> which may still be reasonable.  But in the second example, the target
>>>>>> will just not become a child of the source, which is precisely what was
>>>>>> requested with @replaces.)
>>>>>>
>>>>>> So prevent such configurations, both before the job, and before it
>>>>>> actually completes.
>>>>>>
>>>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>>>> ---
>>>>>>     block.c                   | 30 ++++++++++++++++++++++++
>>>>>>     block/mirror.c            | 19 +++++++++++++++-
>>>>>>     blockdev.c                | 48 ++++++++++++++++++++++++++++++++++++++-
>>>>>>     include/block/block_int.h |  3 +++
>>>>>>     4 files changed, 98 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/block.c b/block.c
>>>>>> index 0159f8e510..e3922a0474 100644
>>>>>> --- a/block.c
>>>>>> +++ b/block.c
>>>>>> @@ -6259,6 +6259,36 @@ out:
>>>>>>         return to_replace_bs;
>>>>>>     }
>>>>>>     
>>>>>> +/*
>>>>>> + * Return true iff @child is a (recursive) child of @parent, with at
>>>>>> + * least @min_level edges between them.
>>>>>> + *
>>>>>> + * (If @min_level == 0, return true if @child == @parent.  For
>>>>>> + * @min_level == 1, @child needs to be at least a real child; for
>>>>>> + * @min_level == 2, it needs to be at least a grand-child; and so on.)
>>>>>> + */
>>>>>> +bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
>>>>>> +                      int min_level)
>>>>>> +{
>>>>>> +    BdrvChild *c;
>>>>>> +
>>>>>> +    if (child == parent && min_level <= 0) {
>>>>>> +        return true;
>>>>>> +    }
>>>>>> +
>>>>>> +    if (!parent) {
>>>>>> +        return false;
>>>>>> +    }
>>>>>> +
>>>>>> +    QLIST_FOREACH(c, &parent->children, next) {
>>>>>> +        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
>>>>>> +            return true;
>>>>>> +        }
>>>>>> +    }
>>>>>> +
>>>>>> +    return false;
>>>>>> +}
>>>>>> +
>>>>>>     /**
>>>>>>      * Iterates through the list of runtime option keys that are said to
>>>>>>      * be "strong" for a BDS.  An option is called "strong" if it changes
>>>>>> diff --git a/block/mirror.c b/block/mirror.c
>>>>>> index 68a4404666..b258c7e98b 100644
>>>>>> --- a/block/mirror.c
>>>>>> +++ b/block/mirror.c
>>>>>> @@ -701,7 +701,24 @@ static int mirror_exit_common(Job *job)
>>>>>>              * there.
>>>>>>              */
>>>>>>             if (bdrv_recurse_can_replace(src, to_replace)) {
>>>>>> -            bdrv_replace_node(to_replace, target_bs, &local_err);
>>>>>> +            /*
>>>>>> +             * It is OK for @to_replace to be an immediate child of
>>>>>> +             * @target_bs, because that is what happens with
>>>>>> +             * drive-mirror sync=none mode=absolute-paths: target_bs's
>>>>>> +             * backing file will be the source node, which is also
>>>>>> +             * to_replace (by default).
>>>>>> +             * bdrv_replace_node() handles this case by not letting
>>>>>> +             * target_bs->backing point to itself, but to the source
>>>>>> +             * still.
>>>>>> +             */
>>>>>> +            if (!bdrv_is_child_of(to_replace, target_bs, 2)) {
>>>>>> +                bdrv_replace_node(to_replace, target_bs, &local_err);
>>>>>> +            } else {
>>>>>> +                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>>>>> +                           "because the former is now a child of the latter, "
>>>>>> +                           "and doing so would thus create a loop",
>>>>>> +                           to_replace->node_name, target_bs->node_name);
>>>>>> +            }
>>>>>
>>>>> you may swap if and else branch, dropping "!" mark..
>>>>
>>>> Yes, but I just personally prefer to have the error case in the else branch.
>>>>
>>>>>>             } else {
>>>>>>                 error_setg(&local_err, "Can no longer replace '%s' by '%s', "
>>>>>>                            "because it can no longer be guaranteed that doing so "
>>>>>> diff --git a/blockdev.c b/blockdev.c
>>>>>> index 9dc2238bf3..d29f147f72 100644
>>>>>> --- a/blockdev.c
>>>>>> +++ b/blockdev.c
>>>>>> @@ -3824,7 +3824,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>>>>>         }
>>>>>>     
>>>>>>         if (has_replaces) {
>>>>>> -        BlockDriverState *to_replace_bs;
>>>>>> +        BlockDriverState *to_replace_bs, *target_backing_bs;
>>>>>>             AioContext *replace_aio_context;
>>>>>>             int64_t bs_size, replace_size;
>>>>>>     
>>>>>> @@ -3839,6 +3839,52 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>>>>>>                 return;
>>>>>>             }
>>>>>>     
>>>>>> +        if (bdrv_is_child_of(to_replace_bs, target, 1)) {
>>>>>> +            error_setg(errp, "Replacing %s by %s would result in a loop, "
>>>>>> +                       "because the former is a child of the latter",
>>>>>> +                       to_replace_bs->node_name, target->node_name);
>>>>>> +            return;
>>>>>> +        }
>>>>>
>>>>> here min_level=1, so we don't handle the case, described in mirror_exit_common..
>>>>> I don't see why.. blockdev_mirror_common is called from qmp_drive_mirror,
>>>>> including the case with MIRROR_SYNC_MODE_NONE and NEW_IMAGE_MODE_ABSOLUTE_PATHS..
>>>>>
>>>>> What I'm missing?
>>>>
>>>> Hmm.  Well.
>>>>
>>>> If it broke drive-mirror sync=none, I suppose I would have noticed by
>>>> running the iotests.  But I didn’t, and that’s because this code here is
>>>> reached only if the user actually specified @replaces.  (As opposed to
>>>> the mirror_exit_common code, where @to_replace may simply be @src if not
>>>> overridden by the user.)
>>>>
>>>> The only reason why I allow it in mirror_exit_common is because we have
>>>> to.  But if the user manually specifies this configuration, we can’t
>>>> guarantee it’s safe.
>>>>
>>>> OTOH, well, if we allow it for drive-mirror sync=none, why not allow it
>>>> when manually specified with blockdev-mirror?
>>>>
>>>> What’s your opinion?
>>>
>>> Hmm, I think, that allowing to_replaces to be direct backing child of target
>>> (like in mirror_exit_common) is safe enough. User doesn't know that
>>> such replacing includes also replacing own child of the target,
>>> which leads to the loop.. It's not obvious. And behavior of
>>> bdrv_replace_node() which just doesn't create this loop, doesn't
>>> seem something too tricky. Hmm..
>>>
>>> We could mention in qapi spec, that replacing doesn't break backing
>>> link of the target, for it to be absolutely defined.
>>>
>>> But should we allow replaces to be some other (not backing and not filtered)
>>> child of target?..
>>
>> Well, my opinion is that this is a bit of weird thing to do and that it
>> basically does ask for a loop.
>>
>> I’m OK with excluding the sync=none case, because (1) that’s so
>> obviously a loop that it can’t be what the user honestly wants; (2) how
>> it’s resolved is rather obvious, too: There is exactly one edge that
>> causes the loop, so you simply don’t change that one; (3) drive-mirror
>> sync=none does this case automatically, so we should probably allow
>> users to do it manually with blockdev-mirror, too.
>>
>>>>>> +
>>>>>> +        if (backing_mode == MIRROR_SOURCE_BACKING_CHAIN ||
>>>>>> +            backing_mode == MIRROR_OPEN_BACKING_CHAIN)
>>>>>> +        {
>>>>>> +            /*
>>>>>> +             * While we do not quite know what OPEN_BACKING_CHAIN
>>>>>> +             * (used for mode=existing) will yield, it is probably
>>>>>> +             * best to restrict it exactly like SOURCE_BACKING_CHAIN,
>>>>>> +             * because that is our best guess.
>>>>>> +             */
>>>>>> +            switch (sync) {
>>>>>> +            case MIRROR_SYNC_MODE_FULL:
>>>>>> +                target_backing_bs = NULL;
>>>>>> +                break;
>>>>>> +
>>>>>> +            case MIRROR_SYNC_MODE_TOP:
>>>>>> +                target_backing_bs = backing_bs(bs);
>>>>>> +                break;
>>>>>> +
>>>>>> +            case MIRROR_SYNC_MODE_NONE:
>>>>>> +                target_backing_bs = bs;
>>>>>> +                break;
>>>>>> +
>>>>>> +            default:
>>>>>> +                abort();
>>>>>> +            }
>>>>>> +        } else {
>>>>>> +            assert(backing_mode == MIRROR_LEAVE_BACKING_CHAIN);
>>>>>> +            target_backing_bs = backing_bs(target);
>>>>>> +        }
>>>>>> +
>>>>>> +        if (bdrv_is_child_of(to_replace_bs, target_backing_bs, 0)) {
>>>>>> +            error_setg(errp, "Replacing '%s' by '%s' with this sync mode would "
>>>>>> +                       "result in a loop, because the former would be a child "
>>>>>> +                       "of the latter's backing file ('%s') after the mirror "
>>>>>> +                       "job", to_replace_bs->node_name, target->node_name,
>>>>>> +                       target_backing_bs->node_name);
>>>>>> +            return;
>>>>>> +        }
>>>>>
>>>>> hmm.. so for MODE_NONE we disallow to_replace == src?
>>>>
>>>> I suppose that’s basically the same as above.  Should we allow this case
>>>> when specified explicitly by the user?
>>>>
>>>
>>> I'm a bit more closer to allowing it, for consistency with automatic path, with
>>> unspecified replaces. Are we sure that nobody uses it?
>>
>> Well, there are multiple cases, as shown in the commit message.  I think
>> that for drive-mirror sync=none, nobody uses @replaces, because it just
>> doesn’t work.
>>
>> But, well, that’s just because drive-mirror does graph manipulation that
>> blockdev-mirror doesn’t (i.e., changing the target’s backing file on
>> completion).  So maybe we should just prevent loops for drive-mirror,
>> but let the user do what they want when they use blockdev-mirror?
>>
> 
> Well, the question finally is, how much to restrict from things for which we
> don't know are they useful or not. I don't know) I think, finally, I'm OK with
> either way we discussed, or with this patch as is. If it breaks some existing
> scenario it will be easy to fix.

OK.  I hope next-year-me has a good and consistent idea on what to do.

Max
diff mbox series

Patch

diff --git a/block.c b/block.c
index 0159f8e510..e3922a0474 100644
--- a/block.c
+++ b/block.c
@@ -6259,6 +6259,36 @@  out:
     return to_replace_bs;
 }
 
+/*
+ * Return true iff @child is a (recursive) child of @parent, with at
+ * least @min_level edges between them.
+ *
+ * (If @min_level == 0, return true if @child == @parent.  For
+ * @min_level == 1, @child needs to be at least a real child; for
+ * @min_level == 2, it needs to be at least a grand-child; and so on.)
+ */
+bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
+                      int min_level)
+{
+    BdrvChild *c;
+
+    if (child == parent && min_level <= 0) {
+        return true;
+    }
+
+    if (!parent) {
+        return false;
+    }
+
+    QLIST_FOREACH(c, &parent->children, next) {
+        if (bdrv_is_child_of(child, c->bs, min_level - 1)) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
 /**
  * Iterates through the list of runtime option keys that are said to
  * be "strong" for a BDS.  An option is called "strong" if it changes
diff --git a/block/mirror.c b/block/mirror.c
index 68a4404666..b258c7e98b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -701,7 +701,24 @@  static int mirror_exit_common(Job *job)
          * there.
          */
         if (bdrv_recurse_can_replace(src, to_replace)) {
-            bdrv_replace_node(to_replace, target_bs, &local_err);
+            /*
+             * It is OK for @to_replace to be an immediate child of
+             * @target_bs, because that is what happens with
+             * drive-mirror sync=none mode=absolute-paths: target_bs's
+             * backing file will be the source node, which is also
+             * to_replace (by default).
+             * bdrv_replace_node() handles this case by not letting
+             * target_bs->backing point to itself, but to the source
+             * still.
+             */
+            if (!bdrv_is_child_of(to_replace, target_bs, 2)) {
+                bdrv_replace_node(to_replace, target_bs, &local_err);
+            } else {
+                error_setg(&local_err, "Can no longer replace '%s' by '%s', "
+                           "because the former is now a child of the latter, "
+                           "and doing so would thus create a loop",
+                           to_replace->node_name, target_bs->node_name);
+            }
         } else {
             error_setg(&local_err, "Can no longer replace '%s' by '%s', "
                        "because it can no longer be guaranteed that doing so "
diff --git a/blockdev.c b/blockdev.c
index 9dc2238bf3..d29f147f72 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3824,7 +3824,7 @@  static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
     }
 
     if (has_replaces) {
-        BlockDriverState *to_replace_bs;
+        BlockDriverState *to_replace_bs, *target_backing_bs;
         AioContext *replace_aio_context;
         int64_t bs_size, replace_size;
 
@@ -3839,6 +3839,52 @@  static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
             return;
         }
 
+        if (bdrv_is_child_of(to_replace_bs, target, 1)) {
+            error_setg(errp, "Replacing %s by %s would result in a loop, "
+                       "because the former is a child of the latter",
+                       to_replace_bs->node_name, target->node_name);
+            return;
+        }
+
+        if (backing_mode == MIRROR_SOURCE_BACKING_CHAIN ||
+            backing_mode == MIRROR_OPEN_BACKING_CHAIN)
+        {
+            /*
+             * While we do not quite know what OPEN_BACKING_CHAIN
+             * (used for mode=existing) will yield, it is probably
+             * best to restrict it exactly like SOURCE_BACKING_CHAIN,
+             * because that is our best guess.
+             */
+            switch (sync) {
+            case MIRROR_SYNC_MODE_FULL:
+                target_backing_bs = NULL;
+                break;
+
+            case MIRROR_SYNC_MODE_TOP:
+                target_backing_bs = backing_bs(bs);
+                break;
+
+            case MIRROR_SYNC_MODE_NONE:
+                target_backing_bs = bs;
+                break;
+
+            default:
+                abort();
+            }
+        } else {
+            assert(backing_mode == MIRROR_LEAVE_BACKING_CHAIN);
+            target_backing_bs = backing_bs(target);
+        }
+
+        if (bdrv_is_child_of(to_replace_bs, target_backing_bs, 0)) {
+            error_setg(errp, "Replacing '%s' by '%s' with this sync mode would "
+                       "result in a loop, because the former would be a child "
+                       "of the latter's backing file ('%s') after the mirror "
+                       "job", to_replace_bs->node_name, target->node_name,
+                       target_backing_bs->node_name);
+            return;
+        }
+
         replace_aio_context = bdrv_get_aio_context(to_replace_bs);
         aio_context_acquire(replace_aio_context);
         replace_size = bdrv_getlength(to_replace_bs);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 589a797fab..7064a1a4fa 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1266,6 +1266,9 @@  void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
 bool bdrv_recurse_can_replace(BlockDriverState *bs,
                               BlockDriverState *to_replace);
 
+bool bdrv_is_child_of(BlockDriverState *child, BlockDriverState *parent,
+                      int min_level);
+
 /*
  * Default implementation for drivers to pass bdrv_co_block_status() to
  * their file.