diff mbox series

[3/3] block: Start/end drain on correct AioContext

Message ID 20220923125227.300202-4-hreitz@redhat.com (mailing list archive)
State New, archived
Headers show
Series blcok: Start/end drain on correct AioContext | expand

Commit Message

Hanna Czenczek Sept. 23, 2022, 12:52 p.m. UTC
bdrv_parent_drained_{begin,end}_single() are supposed to operate on the
parent, not on the child, so they should not attempt to get the context
to poll from the child but the parent instead.  BDRV_POLL_WHILE(c->bs)
does get the context from the child, so we should replace it with
AIO_WAIT_WHILE() on the parent's context instead.

This problem becomes apparent when bdrv_replace_child_noperm() invokes
bdrv_parent_drained_end_single() after removing a child from a subgraph
that is in an I/O thread.  By the time bdrv_parent_drained_end_single()
is called, child->bs is NULL, and so BDRV_POLL_WHILE(c->bs, ...) will
poll the main loop instead of the I/O thread; but anything that
bdrv_parent_drained_end_single_no_poll() may have scheduled is going to
want to run in the I/O thread, but because we poll the main loop, the
I/O thread is never unpaused, and nothing is run, resulting in a
deadlock.

Closes: https://gitlab.com/qemu-project/qemu/-/issues/1215
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
---
 block/io.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Kevin Wolf Oct. 7, 2022, 9:05 a.m. UTC | #1
Am 23.09.2022 um 14:52 hat Hanna Reitz geschrieben:
> bdrv_parent_drained_{begin,end}_single() are supposed to operate on the
> parent, not on the child, so they should not attempt to get the context
> to poll from the child but the parent instead.  BDRV_POLL_WHILE(c->bs)
> does get the context from the child, so we should replace it with
> AIO_WAIT_WHILE() on the parent's context instead.
> 
> This problem becomes apparent when bdrv_replace_child_noperm() invokes
> bdrv_parent_drained_end_single() after removing a child from a subgraph
> that is in an I/O thread.  By the time bdrv_parent_drained_end_single()
> is called, child->bs is NULL, and so BDRV_POLL_WHILE(c->bs, ...) will
> poll the main loop instead of the I/O thread; but anything that
> bdrv_parent_drained_end_single_no_poll() may have scheduled is going to
> want to run in the I/O thread, but because we poll the main loop, the
> I/O thread is never unpaused, and nothing is run, resulting in a
> deadlock.
> 
> Closes: https://gitlab.com/qemu-project/qemu/-/issues/1215
> Signed-off-by: Hanna Reitz <hreitz@redhat.com>

Hmm... Seeing a BdrvChild with child->bs == NULL outside of functions
directly manipulating it is surprising. I think we need to document that
at least for bdrv_parent_drained_begin/end_single(), that they can get
such BdrvChild objects passed.

Even then, polling with an incomplete BdrvChild in the graph doesn't
sound like the best idea either, but not sure how to avoid that best.
Maybe we would need a different order in bdrv_replace_child_noperm() if
either old_bs or new_bs is NULL.

Anyway, the code change itself looks reasonable:

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
diff mbox series

Patch

diff --git a/block/io.c b/block/io.c
index 0a8cbefe86..7df1129b3b 100644
--- a/block/io.c
+++ b/block/io.c
@@ -71,9 +71,10 @@  static void bdrv_parent_drained_end_single_no_poll(BdrvChild *c,
 void bdrv_parent_drained_end_single(BdrvChild *c)
 {
     int drained_end_counter = 0;
+    AioContext *ctx = bdrv_child_get_parent_aio_context(c);
     IO_OR_GS_CODE();
     bdrv_parent_drained_end_single_no_poll(c, &drained_end_counter);
-    BDRV_POLL_WHILE(c->bs, qatomic_read(&drained_end_counter) > 0);
+    AIO_WAIT_WHILE(ctx, qatomic_read(&drained_end_counter) > 0);
 }
 
 static void bdrv_parent_drained_end(BlockDriverState *bs, BdrvChild *ignore,
@@ -116,13 +117,14 @@  static bool bdrv_parent_drained_poll(BlockDriverState *bs, BdrvChild *ignore,
 
 void bdrv_parent_drained_begin_single(BdrvChild *c, bool poll)
 {
+    AioContext *ctx = bdrv_child_get_parent_aio_context(c);
     IO_OR_GS_CODE();
     c->parent_quiesce_counter++;
     if (c->klass->drained_begin) {
         c->klass->drained_begin(c);
     }
     if (poll) {
-        BDRV_POLL_WHILE(c->bs, bdrv_parent_drained_poll_single(c));
+        AIO_WAIT_WHILE(ctx, bdrv_parent_drained_poll_single(c));
     }
 }