diff mbox series

[for-5.0,2/2] block: Fix blk->in_flight during blk_wait_while_drained()

Message ID 20200403104415.20963-3-kwolf@redhat.com (mailing list archive)
State New, archived
Series block: Fix blk->in_flight during blk_wait_while_drained()

Commit Message

Kevin Wolf April 3, 2020, 10:44 a.m. UTC
Calling blk_wait_while_drained() while blk->in_flight is increased for
the current request is wrong because it will cause the drain operation
to deadlock.

Many callers of blk_wait_while_drained() have already increased
blk->in_flight when called in a blk_aio_*() path, but can also be called
in synchronous code paths where blk->in_flight isn't increased. This
means that these calls of blk_wait_while_drained() are wrong at least in
some cases.

In order to fix this, increase blk->in_flight even for synchronous
operations and temporarily decrease the counter again in
blk_wait_while_drained().

Fixes: cf3129323f900ef5ddbccbe86e4fa801e88c566e
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/block-backend.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
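
For readers who want to see the deadlock condition in isolation, here is a minimal toy model in C. It is not QEMU code: ToyBackend and the toy_*() helpers are hypothetical names invented for this illustration, and the model assumes only what the commit message states, namely that a drain cannot complete while a request is still counted in in_flight. The drop_ref flag stands in for the dec/wait/inc pattern around the wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the BlockBackend state discussed here; not QEMU code. */
typedef struct {
    int in_flight;        /* requests the drain still has to wait for */
    int quiesce_counter;  /* non-zero while a drained section is active */
} ToyBackend;

/* A request enters the backend and is counted as in flight. */
static void toy_request_begin(ToyBackend *blk)
{
    blk->in_flight++;
}

/*
 * The request finds the backend drained and parks itself in the queue.
 * With drop_ref set it gives up its in_flight reference for the duration
 * of the wait (the dec/wait/inc pattern); without it, the request keeps
 * the reference while it sleeps.
 */
static void toy_request_park(ToyBackend *blk, bool drop_ref)
{
    if (blk->quiesce_counter && drop_ref) {
        blk->in_flight--;
    }
}

/* In this model, drain can only complete once nothing is in flight. */
static bool toy_drain_can_complete(const ToyBackend *blk)
{
    return blk->in_flight == 0;
}
```

A parked request that keeps its reference leaves toy_drain_can_complete() false for as long as it sleeps, and it only wakes up once the drain ends, which is the circular wait described above. Dropping the reference while parked breaks that cycle.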

Comments

Max Reitz April 3, 2020, 12:42 p.m. UTC | #1
On 03.04.20 12:44, Kevin Wolf wrote:
> Calling blk_wait_while_drained() while blk->in_flight is increased for
> the current request is wrong because it will cause the drain operation
> to deadlock.
> 
> Many callers of blk_wait_while_drained() have already increased
> blk->in_flight when called in a blk_aio_*() path, but can also be called
> in synchronous code paths where blk->in_flight isn't increased. This
> means that these calls of blk_wait_while_drained() are wrong at least in
> some cases.
> 
> In order to fix this, increase blk->in_flight even for synchronous
> operations and temporarily decrease the counter again in
> blk_wait_while_drained().
> 
> Fixes: cf3129323f900ef5ddbccbe86e4fa801e88c566e
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/block-backend.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)

blk_co_pdiscard() and blk_co_flush() are called from outside of
block-backend.c (namely from mirror.c and nbd/server.c).  Is that OK?

Max
Kevin Wolf April 3, 2020, 2:50 p.m. UTC | #2
On 03.04.2020 at 14:42, Max Reitz wrote:
> On 03.04.20 12:44, Kevin Wolf wrote:
> > Calling blk_wait_while_drained() while blk->in_flight is increased for
> > the current request is wrong because it will cause the drain operation
> > to deadlock.
> > 
> > Many callers of blk_wait_while_drained() have already increased
> > blk->in_flight when called in a blk_aio_*() path, but can also be called
> > in synchronous code paths where blk->in_flight isn't increased. This
> > means that these calls of blk_wait_while_drained() are wrong at least in
> > some cases.
> > 
> > In order to fix this, increase blk->in_flight even for synchronous
> > operations and temporarily decrease the counter again in
> > blk_wait_while_drained().
> > 
> > Fixes: cf3129323f900ef5ddbccbe86e4fa801e88c566e
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block/block-backend.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> blk_co_pdiscard() and blk_co_flush() are called from outside of
> block-backend.c (namely from mirror.c and nbd/server.c).  Is that OK?

Hm... I think you're right that the NBD server has a problem now because
we might now decrease blk->in_flight without having increased it.
(Mirror should be fine anyway because it sets disable_request_queuing.)

At first I was going to suggest that we could do the opposite of this
patch and just move the dec/wait/inc sequence (which this patch removes
for read/write) to all coroutine entry functions, so direct calls
wouldn't incorrectly decrease the counter.

But this is not what we want either, we do want to queue requests for
drained BlockBackends even in the blk_co_*() API.

Do you have another idea or do we have to turn blk_co_*() into wrappers
around the existing functions, which would gain an additional bool
parameter that tells whether we need to dec/inc or not?

Kevin
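
The wrapper idea floated above could look roughly like the following toy sketch. It is a guess at the shape only, not a proposed patch: every toy_*() name is hypothetical, the in_flight_while_queued field exists purely so the counter can be inspected at the wait point, and the actual yield (qemu_co_queue_wait() in the real code) is replaced by a comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; field names mirror the thread, the rest is hypothetical. */
typedef struct {
    int in_flight;
    int quiesce_counter;
    bool disable_request_queuing;
    int in_flight_while_queued;  /* hook: counter value seen at the wait point */
} ToyBackend;

static void toy_wait_while_drained(ToyBackend *blk, bool caller_holds_ref)
{
    if (blk->quiesce_counter && !blk->disable_request_queuing) {
        if (caller_holds_ref) {
            blk->in_flight--;
        }
        /* Real code would yield here, e.g. via qemu_co_queue_wait(). */
        blk->in_flight_while_queued = blk->in_flight;
        if (caller_holds_ref) {
            blk->in_flight++;
        }
    }
}

/* Shared implementation that gains the extra bool parameter. */
static int toy_do_flush(ToyBackend *blk, bool caller_holds_ref)
{
    toy_wait_while_drained(blk, caller_holds_ref);
    return 0; /* the flush itself is omitted in this sketch */
}

/* Public coroutine-style entry point: external callers hold no reference. */
static int toy_co_flush(ToyBackend *blk)
{
    return toy_do_flush(blk, false);
}

/* Internal path that has already taken a reference around the request. */
static int toy_aio_flush(ToyBackend *blk)
{
    int ret;

    blk->in_flight++;
    ret = toy_do_flush(blk, true);
    blk->in_flight--;
    return ret;
}
```

In this model both entry points see a counter of zero while queued, so a drain could proceed, and an external caller never decrements a reference it did not take.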
Max Reitz April 6, 2020, 9:41 a.m. UTC | #3
On 03.04.20 16:50, Kevin Wolf wrote:
> On 03.04.2020 at 14:42, Max Reitz wrote:
>> On 03.04.20 12:44, Kevin Wolf wrote:
>>> Calling blk_wait_while_drained() while blk->in_flight is increased for
>>> the current request is wrong because it will cause the drain operation
>>> to deadlock.
>>>
>>> Many callers of blk_wait_while_drained() have already increased
>>> blk->in_flight when called in a blk_aio_*() path, but can also be called
>>> in synchronous code paths where blk->in_flight isn't increased. This
>>> means that these calls of blk_wait_while_drained() are wrong at least in
>>> some cases.
>>>
>>> In order to fix this, increase blk->in_flight even for synchronous
>>> operations and temporarily decrease the counter again in
>>> blk_wait_while_drained().
>>>
>>> Fixes: cf3129323f900ef5ddbccbe86e4fa801e88c566e
>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>>> ---
>>>  block/block-backend.c | 8 ++++----
>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> blk_co_pdiscard() and blk_co_flush() are called from outside of
>> block-backend.c (namely from mirror.c and nbd/server.c).  Is that OK?
> 
> Hm... I think you're right that the NBD server has a problem now because
> we might now decrease blk->in_flight without having increased it.
> (Mirror should be fine anyway because it sets disable_request_queuing.)
> 
> At first I was going to suggest that we could do the opposite of this
> patch and just move the dec/wait/inc sequence (which this patch removes
> for read/write) to all coroutine entry functions, so direct calls
> wouldn't incorrectly decrease the counter.
> 
> But this is not what we want either, we do want to queue requests for
> drained BlockBackends even in the blk_co_*() API.
> 
> Do you have another idea or do we have to turn blk_co_*() into wrappers
> around the existing functions, which would gain an additional bool
> parameter that tells whether we need to dec/inc or not?

So that whenever blk_co_* is called from outside of block-backend.c, we
don’t dec/inc?

Sounds reasonable to me.

The only alternative I see would be ensuring we call
blk_wait_while_drained() only outside of in_flight sections (without
having to dec/inc around it).  But we can’t call it in synchronous
sections.  And for those synchronous calls, we also have to wrap the
in_flight section around the whole asynchronous boilerplate, so there is
no place where they can call blk_wait_while_drained() without dec/inc
around it.

So I can’t think of another way either.

Max

Patch

diff --git a/block/block-backend.c b/block/block-backend.c
index 3124e367b3..7bd16402b8 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1143,7 +1143,9 @@  static int blk_check_byte_request(BlockBackend *blk, int64_t offset,
 static void coroutine_fn blk_wait_while_drained(BlockBackend *blk)
 {
     if (blk->quiesce_counter && !blk->disable_request_queuing) {
+        blk_dec_in_flight(blk);
         qemu_co_queue_wait(&blk->queued_requests, NULL);
+        blk_inc_in_flight(blk);
     }
 }
 
@@ -1260,6 +1262,7 @@  static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
         .ret    = NOT_DONE,
     };
 
+    blk_inc_in_flight(blk);
     if (qemu_in_coroutine()) {
         /* Fast-path if already in coroutine context */
         co_entry(&rwco);
@@ -1268,6 +1271,7 @@  static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
         bdrv_coroutine_enter(blk_bs(blk), co);
         BDRV_POLL_WHILE(blk_bs(blk), rwco.ret == NOT_DONE);
     }
+    blk_dec_in_flight(blk);
 
     return rwco.ret;
 }
@@ -1386,9 +1390,7 @@  static void blk_aio_read_entry(void *opaque)
     QEMUIOVector *qiov = rwco->iobuf;
 
     if (rwco->blk->quiesce_counter) {
-        blk_dec_in_flight(rwco->blk);
         blk_wait_while_drained(rwco->blk);
-        blk_inc_in_flight(rwco->blk);
     }
 
     assert(qiov->size == acb->bytes);
@@ -1404,9 +1406,7 @@  static void blk_aio_write_entry(void *opaque)
     QEMUIOVector *qiov = rwco->iobuf;
 
     if (rwco->blk->quiesce_counter) {
-        blk_dec_in_flight(rwco->blk);
         blk_wait_while_drained(rwco->blk);
-        blk_inc_in_flight(rwco->blk);
     }
 
     assert(!qiov || qiov->size == acb->bytes);