diff mbox series

[6/7] block: Let replace_child_noperm free children

Message ID 20211104103849.46855-7-hreitz@redhat.com (mailing list archive)
State New, archived
Headers show
Series block: Attempt on fixing 030-reported errors | expand

Commit Message

Hanna Czenczek Nov. 4, 2021, 10:38 a.m. UTC
In most of the block layer, especially when traversing down from other
BlockDriverStates, we assume that BdrvChild.bs can never be NULL.  When
it becomes NULL, it is expected that the corresponding BdrvChild pointer
also becomes NULL and the BdrvChild object is freed.

Therefore, once bdrv_replace_child_noperm() sets the BdrvChild.bs
pointer to NULL, it should also immediately set the corresponding
BdrvChild pointer (like bs->file or bs->backing) to NULL.

In that context, it also makes sense for this function to free the
child.  Sometimes we cannot do so, though, because it is called in a
transactional context where the caller might still want to reinstate the
child in the abort branch (and free it only on commit), so this behavior
has to remain optional.

In bdrv_replace_child_tran()'s abort handler, we now rely on the fact
that the BdrvChild passed to bdrv_replace_child_tran() must have had a
non-NULL .bs pointer initially.  Make a note of that and assert it.

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
---
 block.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 86 insertions(+), 16 deletions(-)

Comments

Kevin Wolf Nov. 5, 2021, 3:41 p.m. UTC | #1
Am 04.11.2021 um 11:38 hat Hanna Reitz geschrieben:
> In most of the block layer, especially when traversing down from other
> BlockDriverStates, we assume that BdrvChild.bs can never be NULL.  When
> it becomes NULL, it is expected that the corresponding BdrvChild pointer
> also becomes NULL and the BdrvChild object is freed.
> 
> Therefore, once bdrv_replace_child_noperm() sets the BdrvChild.bs
> pointer to NULL, it should also immediately set the corresponding
> BdrvChild pointer (like bs->file or bs->backing) to NULL.
> 
> In that context, it also makes sense for this function to free the
> child.  Sometimes we cannot do so, though, because it is called in a
> transactional context where the caller might still want to reinstate the
> child in the abort branch (and free it only on commit), so this behavior
> has to remain optional.
> 
> In bdrv_replace_child_tran()'s abort handler, we now rely on the fact
> that the BdrvChild passed to bdrv_replace_child_tran() must have had a
> non-NULL .bs pointer initially.  Make a note of that and assert it.
> 
> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
> ---
>  block.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 86 insertions(+), 16 deletions(-)
> 
> diff --git a/block.c b/block.c
> index ff45447686..0a85ede8dc 100644
> --- a/block.c
> +++ b/block.c
> @@ -87,8 +87,10 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
>  static bool bdrv_recurse_has_child(BlockDriverState *bs,
>                                     BlockDriverState *child);
>  
> +static void bdrv_child_free(BdrvChild *child);
>  static void bdrv_replace_child_noperm(BdrvChild **child,
> -                                      BlockDriverState *new_bs);
> +                                      BlockDriverState *new_bs,
> +                                      bool free_empty_child);
>  static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
>                                                BdrvChild *child,
>                                                Transaction *tran);
> @@ -2256,12 +2258,16 @@ typedef struct BdrvReplaceChildState {
>      BdrvChild *child;
>      BdrvChild **childp;
>      BlockDriverState *old_bs;
> +    bool free_empty_child;
>  } BdrvReplaceChildState;
>  
>  static void bdrv_replace_child_commit(void *opaque)
>  {
>      BdrvReplaceChildState *s = opaque;
>  
> +    if (s->free_empty_child && !s->child->bs) {
> +        bdrv_child_free(s->child);
> +    }
>      bdrv_unref(s->old_bs);
>  }
>  
> @@ -2270,8 +2276,23 @@ static void bdrv_replace_child_abort(void *opaque)
>      BdrvReplaceChildState *s = opaque;
>      BlockDriverState *new_bs = s->child->bs;
>  
> -    /* old_bs reference is transparently moved from @s to *s->childp */
> -    bdrv_replace_child_noperm(s->childp, s->old_bs);
> +    /*
> +     * old_bs reference is transparently moved from @s to s->child;
> +     * pass &s->child here instead of s->childp, because *s->childp
> +     * will have been cleared by bdrv_replace_child_tran()'s
> +     * bdrv_replace_child_noperm() call if new_bs is NULL, and we must
> +     * not pass a NULL *s->childp here.
> +     */
> +    bdrv_replace_child_noperm(&s->child, s->old_bs, true);

Passing free_empty_child=true with a non-NULL new_bs looks a bit
confusing because the child isn't supposed to become empty anyway.

> +    /*
> +     * The child was pre-existing, so s->old_bs must be non-NULL, and
> +     * s->child thus must not have been freed
> +     */
> +    assert(s->child != NULL);
> +    if (!new_bs) {
> +        /* As described above, *s->childp was cleared, so restore it */
> +        *s->childp = s->child;
> +    }

If it wasn't cleared, doesn't it still contain s->child, so this could
just be an unconditional rollback?

>      bdrv_unref(new_bs);
>  }

Kevin
Hanna Czenczek Nov. 8, 2021, 10:12 a.m. UTC | #2
On 05.11.21 16:41, Kevin Wolf wrote:
> Am 04.11.2021 um 11:38 hat Hanna Reitz geschrieben:
>> In most of the block layer, especially when traversing down from other
>> BlockDriverStates, we assume that BdrvChild.bs can never be NULL.  When
>> it becomes NULL, it is expected that the corresponding BdrvChild pointer
>> also becomes NULL and the BdrvChild object is freed.
>>
>> Therefore, once bdrv_replace_child_noperm() sets the BdrvChild.bs
>> pointer to NULL, it should also immediately set the corresponding
>> BdrvChild pointer (like bs->file or bs->backing) to NULL.
>>
>> In that context, it also makes sense for this function to free the
>> child.  Sometimes we cannot do so, though, because it is called in a
>> transactional context where the caller might still want to reinstate the
>> child in the abort branch (and free it only on commit), so this behavior
>> has to remain optional.
>>
>> In bdrv_replace_child_tran()'s abort handler, we now rely on the fact
>> that the BdrvChild passed to bdrv_replace_child_tran() must have had a
>> non-NULL .bs pointer initially.  Make a note of that and assert it.
>>
>> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
>> ---
>>   block.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++---------
>>   1 file changed, 86 insertions(+), 16 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index ff45447686..0a85ede8dc 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -87,8 +87,10 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
>>   static bool bdrv_recurse_has_child(BlockDriverState *bs,
>>                                      BlockDriverState *child);
>>   
>> +static void bdrv_child_free(BdrvChild *child);
>>   static void bdrv_replace_child_noperm(BdrvChild **child,
>> -                                      BlockDriverState *new_bs);
>> +                                      BlockDriverState *new_bs,
>> +                                      bool free_empty_child);
>>   static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
>>                                                 BdrvChild *child,
>>                                                 Transaction *tran);
>> @@ -2256,12 +2258,16 @@ typedef struct BdrvReplaceChildState {
>>       BdrvChild *child;
>>       BdrvChild **childp;
>>       BlockDriverState *old_bs;
>> +    bool free_empty_child;
>>   } BdrvReplaceChildState;
>>   
>>   static void bdrv_replace_child_commit(void *opaque)
>>   {
>>       BdrvReplaceChildState *s = opaque;
>>   
>> +    if (s->free_empty_child && !s->child->bs) {
>> +        bdrv_child_free(s->child);
>> +    }
>>       bdrv_unref(s->old_bs);
>>   }
>>   
>> @@ -2270,8 +2276,23 @@ static void bdrv_replace_child_abort(void *opaque)
>>       BdrvReplaceChildState *s = opaque;
>>       BlockDriverState *new_bs = s->child->bs;
>>   
>> -    /* old_bs reference is transparently moved from @s to *s->childp */
>> -    bdrv_replace_child_noperm(s->childp, s->old_bs);
>> +    /*
>> +     * old_bs reference is transparently moved from @s to s->child;
>> +     * pass &s->child here instead of s->childp, because *s->childp
>> +     * will have been cleared by bdrv_replace_child_tran()'s
>> +     * bdrv_replace_child_noperm() call if new_bs is NULL, and we must
>> +     * not pass a NULL *s->childp here.
>> +     */
>> +    bdrv_replace_child_noperm(&s->child, s->old_bs, true);
> Passing free_empty_child=true with a non-NULL new_bs looks a bit
> confusing because the child isn't supposed to become empty anyway.

I wasn’t sure what to do.  I decided to make the contract “Caller should 
pass false only if they will deal with freeing the child”, and so I 
ended up passing `true` in cases such as here.  I felt like `true` 
should kind of be the default case, and `false` the exception.

>> +    /*
>> +     * The child was pre-existing, so s->old_bs must be non-NULL, and
>> +     * s->child thus must not have been freed
>> +     */
>> +    assert(s->child != NULL);
>> +    if (!new_bs) {
>> +        /* As described above, *s->childp was cleared, so restore it */
>> +        *s->childp = s->child;
>> +    }
> If it wasn't cleared, doesn't it still contain s->child, so this could
> just be an unconditional rollback?

I think so, yes.  We’d still have to explain why the rollback is 
required, though, so would dropping the condition really make it 
simpler?  (Of course this also goes in the opposite direction: Is making 
it conditional simpler? :))

I just feel like the comment would be something like “Restore *s->childp 
in case it was cleared as described above”, and then I’d find it a bit 
strange if the “in case” isn’t part of the code...  *shrug*

Hanna
diff mbox series

Patch

diff --git a/block.c b/block.c
index ff45447686..0a85ede8dc 100644
--- a/block.c
+++ b/block.c
@@ -87,8 +87,10 @@  static BlockDriverState *bdrv_open_inherit(const char *filename,
 static bool bdrv_recurse_has_child(BlockDriverState *bs,
                                    BlockDriverState *child);
 
+static void bdrv_child_free(BdrvChild *child);
 static void bdrv_replace_child_noperm(BdrvChild **child,
-                                      BlockDriverState *new_bs);
+                                      BlockDriverState *new_bs,
+                                      bool free_empty_child);
 static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
                                               BdrvChild *child,
                                               Transaction *tran);
@@ -2256,12 +2258,16 @@  typedef struct BdrvReplaceChildState {
     BdrvChild *child;
     BdrvChild **childp;
     BlockDriverState *old_bs;
+    bool free_empty_child;
 } BdrvReplaceChildState;
 
 static void bdrv_replace_child_commit(void *opaque)
 {
     BdrvReplaceChildState *s = opaque;
 
+    if (s->free_empty_child && !s->child->bs) {
+        bdrv_child_free(s->child);
+    }
     bdrv_unref(s->old_bs);
 }
 
@@ -2270,8 +2276,23 @@  static void bdrv_replace_child_abort(void *opaque)
     BdrvReplaceChildState *s = opaque;
     BlockDriverState *new_bs = s->child->bs;
 
-    /* old_bs reference is transparently moved from @s to *s->childp */
-    bdrv_replace_child_noperm(s->childp, s->old_bs);
+    /*
+     * old_bs reference is transparently moved from @s to s->child;
+     * pass &s->child here instead of s->childp, because *s->childp
+     * will have been cleared by bdrv_replace_child_tran()'s
+     * bdrv_replace_child_noperm() call if new_bs is NULL, and we must
+     * not pass a NULL *s->childp here.
+     */
+    bdrv_replace_child_noperm(&s->child, s->old_bs, true);
+    /*
+     * The child was pre-existing, so s->old_bs must be non-NULL, and
+     * s->child thus must not have been freed
+     */
+    assert(s->child != NULL);
+    if (!new_bs) {
+        /* As described above, *s->childp was cleared, so restore it */
+        *s->childp = s->child;
+    }
     bdrv_unref(new_bs);
 }
 
@@ -2287,23 +2308,40 @@  static TransactionActionDrv bdrv_replace_child_drv = {
  * Note: real unref of old_bs is done only on commit.
  *
  * The function doesn't update permissions, caller is responsible for this.
+ *
+ * (*childp)->bs must not be NULL.
+ *
+ * If @free_empty_child is true and @new_bs is NULL, the BdrvChild is
+ * freed (on commit).  @free_empty_child should only be false if the
+ * caller will free the BDrvChild themselves (which may be important
+ * if this is in turn called in another transactional context).
  */
 static void bdrv_replace_child_tran(BdrvChild **childp,
                                     BlockDriverState *new_bs,
-                                    Transaction *tran)
+                                    Transaction *tran,
+                                    bool free_empty_child)
 {
     BdrvReplaceChildState *s = g_new(BdrvReplaceChildState, 1);
     *s = (BdrvReplaceChildState) {
         .child = *childp,
         .childp = childp,
         .old_bs = (*childp)->bs,
+        .free_empty_child = free_empty_child,
     };
     tran_add(tran, &bdrv_replace_child_drv, s);
 
+    /* The abort handler relies on this */
+    assert(s->old_bs != NULL);
+
     if (new_bs) {
         bdrv_ref(new_bs);
     }
-    bdrv_replace_child_noperm(childp, new_bs);
+    /*
+     * Pass free_empty_child=false, we will free the child (if
+     * necessary) in bdrv_replace_child_commit() (if our
+     * @free_empty_child parameter was true).
+     */
+    bdrv_replace_child_noperm(childp, new_bs, false);
     /* old_bs reference is transparently moved from *childp to @s */
 }
 
@@ -2675,8 +2713,22 @@  uint64_t bdrv_qapi_perm_to_blk_perm(BlockPermission qapi_perm)
     return permissions[qapi_perm];
 }
 
+/**
+ * Replace (*childp)->bs by @new_bs.
+ *
+ * If @new_bs is NULL, *childp will be set to NULL, too: BDS parents
+ * generally cannot handle a BdrvChild with .bs == NULL, so clearing
+ * BdrvChild.bs should generally immediately be followed by the
+ * BdrvChild pointer being cleared as well.
+ *
+ * If @free_empty_child is true and @new_bs is NULL, the BdrvChild is
+ * freed.  @free_empty_child should only be false if the caller will
+ * free the BdrvChild themselves (this may be important in a
+ * transactional context, where it may only be freed on commit).
+ */
 static void bdrv_replace_child_noperm(BdrvChild **childp,
-                                      BlockDriverState *new_bs)
+                                      BlockDriverState *new_bs,
+                                      bool free_empty_child)
 {
     BdrvChild *child = *childp;
     BlockDriverState *old_bs = child->bs;
@@ -2713,6 +2765,9 @@  static void bdrv_replace_child_noperm(BdrvChild **childp,
     }
 
     child->bs = new_bs;
+    if (!new_bs) {
+        *childp = NULL;
+    }
 
     if (new_bs) {
         QLIST_INSERT_HEAD(&new_bs->parents, child, next_parent);
@@ -2742,6 +2797,10 @@  static void bdrv_replace_child_noperm(BdrvChild **childp,
         bdrv_parent_drained_end_single(child);
         drain_saldo++;
     }
+
+    if (free_empty_child && !child->bs) {
+        bdrv_child_free(child);
+    }
 }
 
 /**
@@ -2771,7 +2830,14 @@  static void bdrv_attach_child_common_abort(void *opaque)
     BdrvChild *child = *s->child;
     BlockDriverState *bs = child->bs;
 
-    bdrv_replace_child_noperm(s->child, NULL);
+    /*
+     * Pass free_empty_child=false, because we still need the child
+     * for the AioContext operations on the parent below; those
+     * BdrvChildClass methods all work on a BdrvChild object, so we
+     * need to keep it as an empty shell (after this function, it will
+     * not be attached to any parent, and it will not have a .bs).
+     */
+    bdrv_replace_child_noperm(s->child, NULL, false);
 
     if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
         bdrv_try_set_aio_context(bs, s->old_child_ctx, &error_abort);
@@ -2793,7 +2859,6 @@  static void bdrv_attach_child_common_abort(void *opaque)
 
     bdrv_unref(bs);
     bdrv_child_free(child);
-    *s->child = NULL;
 }
 
 static TransactionActionDrv bdrv_attach_child_common_drv = {
@@ -2871,7 +2936,9 @@  static int bdrv_attach_child_common(BlockDriverState *child_bs,
     }
 
     bdrv_ref(child_bs);
-    bdrv_replace_child_noperm(&new_child, child_bs);
+    bdrv_replace_child_noperm(&new_child, child_bs, true);
+    /* child_bs was non-NULL, so new_child must not have been freed */
+    assert(new_child != NULL);
 
     *child = new_child;
 
@@ -2935,8 +3002,7 @@  static void bdrv_detach_child(BdrvChild **childp)
 {
     BlockDriverState *old_bs = (*childp)->bs;
 
-    bdrv_replace_child_noperm(childp, NULL);
-    bdrv_child_free(*childp);
+    bdrv_replace_child_noperm(childp, NULL, true);
 
     if (old_bs) {
         /*
@@ -4911,7 +4977,11 @@  static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
     }
 
     if (child->bs) {
-        bdrv_replace_child_tran(childp, NULL, tran);
+        /*
+         * Pass free_empty_child=false, we will free the child in
+         * bdrv_remove_filter_or_cow_child_commit()
+         */
+        bdrv_replace_child_tran(childp, NULL, tran, false);
     }
 
     s = g_new(BdrvRemoveFilterOrCowChild, 1);
@@ -4920,8 +4990,6 @@  static void bdrv_remove_file_or_backing_child(BlockDriverState *bs,
         .is_backing = (childp == &bs->backing),
     };
     tran_add(tran, &bdrv_remove_filter_or_cow_child_drv, s);
-
-    *childp = NULL;
 }
 
 /*
@@ -4957,7 +5025,7 @@  static int bdrv_replace_node_noperm(BlockDriverState *from,
                        c->name, from->node_name);
             return -EPERM;
         }
-        bdrv_replace_child_tran(&c, to, tran);
+        bdrv_replace_child_tran(&c, to, tran, true);
     }
 
     return 0;
@@ -5106,7 +5174,9 @@  int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
     bdrv_drained_begin(old_bs);
     bdrv_drained_begin(new_bs);
 
-    bdrv_replace_child_tran(&child, new_bs, tran);
+    bdrv_replace_child_tran(&child, new_bs, tran, true);
+    /* @new_bs must have been non-NULL, so @child must not have been freed */
+    assert(child != NULL);
 
     found = g_hash_table_new(NULL, NULL);
     refresh_list = bdrv_topological_dfs(refresh_list, found, old_bs);