[for-5.0,v3,2/3] block: Increase BB.in_flight for coroutine and sync interfaces
diff mbox series

Message ID 20200407121259.21350-3-kwolf@redhat.com
State New
Headers show
Series
  • block: Fix blk->in_flight during blk_wait_while_drained()
Related show

Commit Message

Kevin Wolf April 7, 2020, 12:12 p.m. UTC
External callers of blk_co_*() and of the synchronous blk_*() functions
don't currently increase the BlockBackend.in_flight counter, but calls
from blk_aio_*() do, so there is an inconsistency whether the counter
has been increased or not.

This patch moves the actual operations to static functions that can
later know they will always be called with in_flight increased exactly
once, even for external callers using the blk_co_*() coroutine
interfaces.

If the public blk_co_*() interface is unused, remove it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/sysemu/block-backend.h |   1 -
 block/block-backend.c          | 103 +++++++++++++++++++++++++--------
 2 files changed, 80 insertions(+), 24 deletions(-)

Comments

Max Reitz April 7, 2020, 12:54 p.m. UTC | #1
On 07.04.20 14:12, Kevin Wolf wrote:
> External callers of blk_co_*() and of the synchronous blk_*() functions
> don't currently increase the BlockBackend.in_flight counter, but calls
> from blk_aio_*() do, so there is an inconsistency whether the counter
> has been increased or not.
> 
> This patch moves the actual operations to static functions that can
> later know they will always be called with in_flight increased exactly
> once, even for external callers using the blk_co_*() coroutine
> interfaces.
> 
> If the public blk_co_*() interface is unused, remove it.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/sysemu/block-backend.h |   1 -
>  block/block-backend.c          | 103 +++++++++++++++++++++++++--------
>  2 files changed, 80 insertions(+), 24 deletions(-)

Thanks!

Reviewed-by: Max Reitz <mreitz@redhat.com>
Vladimir Sementsov-Ogievskiy April 7, 2020, 2:22 p.m. UTC | #2
07.04.2020 15:12, Kevin Wolf wrote:
> External callers of blk_co_*() and of the synchronous blk_*() functions
> don't currently increase the BlockBackend.in_flight counter, but calls
> from blk_aio_*() do, so there is an inconsistency whether the counter
> has been increased or not.
> 
> This patch moves the actual operations to static functions that can
> later know they will always be called with in_flight increased exactly
> once, even for external callers using the blk_co_*() coroutine
> interfaces.
> 
> If the public blk_co_*() interface is unused, remove it.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

side question:

Should we inc/dec in blk_make_zero, blk_truncate?
Kevin Wolf April 7, 2020, 2:42 p.m. UTC | #3
Am 07.04.2020 um 16:22 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 07.04.2020 15:12, Kevin Wolf wrote:
> > External callers of blk_co_*() and of the synchronous blk_*() functions
> > don't currently increase the BlockBackend.in_flight counter, but calls
> > from blk_aio_*() do, so there is an inconsistency whether the counter
> > has been increased or not.
> > 
> > This patch moves the actual operations to static functions that can
> > later know they will always be called with in_flight increased exactly
> > once, even for external callers using the blk_co_*() coroutine
> > interfaces.
> > 
> > If the public blk_co_*() interface is unused, remove it.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> side question:
> 
> Should we inc/dec in blk_make_zero, blk_truncate?

I don't think it's necessary. They call into their bdrv_* counterpart
immediately, so the node-level counter should be enough.

Kevin
Vladimir Sementsov-Ogievskiy April 7, 2020, 2:56 p.m. UTC | #4
07.04.2020 17:42, Kevin Wolf wrote:
> Am 07.04.2020 um 16:22 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 07.04.2020 15:12, Kevin Wolf wrote:
>>> External callers of blk_co_*() and of the synchronous blk_*() functions
>>> don't currently increase the BlockBackend.in_flight counter, but calls
>>> from blk_aio_*() do, so there is an inconsistency whether the counter
>>> has been increased or not.
>>>
>>> This patch moves the actual operations to static functions that can
>>> later know they will always be called with in_flight increased exactly
>>> once, even for external callers using the blk_co_*() coroutine
>>> interfaces.
>>>
>>> If the public blk_co_*() interface is unused, remove it.
>>>
>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>
>> side question:
>>
>> Should we inc/dec in blk_make_zero, blk_truncate?
> 
> I don't think it's necessary. They call into their bdrv_* counterpart
> immediately, so the node-level counter should be enough.
> 

bdrv_make_zero is not one request, it does block_status/pwrite_zeroes in a loop. So drained section may occur during bdrv_make_zero. Possibly, nothing bad in it?

blk_truncate may do coroutine_enter before incrementing node-level counter, which may only schedule it..
Kevin Wolf April 7, 2020, 4:27 p.m. UTC | #5
Am 07.04.2020 um 16:56 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 07.04.2020 17:42, Kevin Wolf wrote:
> > Am 07.04.2020 um 16:22 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > 07.04.2020 15:12, Kevin Wolf wrote:
> > > > External callers of blk_co_*() and of the synchronous blk_*() functions
> > > > don't currently increase the BlockBackend.in_flight counter, but calls
> > > > from blk_aio_*() do, so there is an inconsistency whether the counter
> > > > has been increased or not.
> > > > 
> > > > This patch moves the actual operations to static functions that can
> > > > later know they will always be called with in_flight increased exactly
> > > > once, even for external callers using the blk_co_*() coroutine
> > > > interfaces.
> > > > 
> > > > If the public blk_co_*() interface is unused, remove it.
> > > > 
> > > > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > > 
> > > Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > > 
> > > side question:
> > > 
> > > Should we inc/dec in blk_make_zero, blk_truncate?
> > 
> > I don't think it's necessary. They call into their bdrv_* counterpart
> > immediately, so the node-level counter should be enough.
> > 
> 
> bdrv_make_zero is not one request, it does block_status/pwrite_zeroes
> in a loop. So drained section may occur during bdrv_make_zero.
> Possibly, nothing bad in it?

It would potentially be a problem if it were called in coroutine
context. But it's a synchronous function that must be called in the main
thread (and also only used in qemu-img), so I don't see how drain could
happen while it runs.

If we did want to make it safe for use in coroutine context, it would be
by using bdrv_inc/dec_in_flight() in bdrv_make_zero().

> blk_truncate may do coroutine_enter before incrementing node-level
> counter, which may only schedule it..

This is bdrv_truncate(), not blk_truncate(). If you address it in
blk_truncate(), you miss the direct callers of bdrv_truncate().

But you're right that it could potentially be a problem. Not sure if it
really is, but maybe better safe than sorry, so if you want to send a
patch, go ahead.

Kevin
Vladimir Sementsov-Ogievskiy April 7, 2020, 5:36 p.m. UTC | #6
07.04.2020 19:27, Kevin Wolf wrote:
> Am 07.04.2020 um 16:56 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 07.04.2020 17:42, Kevin Wolf wrote:
>>> Am 07.04.2020 um 16:22 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> 07.04.2020 15:12, Kevin Wolf wrote:
>>>>> External callers of blk_co_*() and of the synchronous blk_*() functions
>>>>> don't currently increase the BlockBackend.in_flight counter, but calls
>>>>> from blk_aio_*() do, so there is an inconsistency whether the counter
>>>>> has been increased or not.
>>>>>
>>>>> This patch moves the actual operations to static functions that can
>>>>> later know they will always be called with in_flight increased exactly
>>>>> once, even for external callers using the blk_co_*() coroutine
>>>>> interfaces.
>>>>>
>>>>> If the public blk_co_*() interface is unused, remove it.
>>>>>
>>>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>>>>
>>>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>
>>>> side question:
>>>>
>>>> Should we inc/dec in blk_make_zero, blk_truncate?
>>>
>>> I don't think it's necessary. They call into their bdrv_* counterpart
>>> immediately, so the node-level counter should be enough.
>>>
>>
>> bdrv_make_zero is not one request, it does block_status/pwrite_zeroes
>> in a loop. So drained section may occur during bdrv_make_zero.
>> Possibly, nothing bad in it?
> 
> It would potentially be a problem if it were called in coroutine
> context. But it's a synchronous function that must be called in the main
> thread (and also only used in qemu-img), so I don't see how drain could
> happen while it runs.
> 
> If we did want to make it safe for use in coroutine context, it would be
> by using bdrv_inc/dec_in_flight() in bdrv_make_zero().
> 
>> blk_truncate may do coroutine_enter before incrementing node-level
>> counter, which may only schedule it..
> 
> This is bdrv_truncate(), not blk_truncate(). If you address it in
> blk_truncate(), you miss the direct callers of bdrv_truncate().
> 
> But you're right that it could potentially be a problem. Not sure if it
> really is, but maybe better safe than sorry, so if you want to send a
> patch, go ahead.
> 

Hmm. Same thing may be said about all other coroutine-enter wrappers in block/io.c. OK, I'll make a patch.

Patch
diff mbox series

diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index b198deca0b..9bbdbd63d7 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -171,7 +171,6 @@  BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk, int64_t offset, int bytes,
                              BlockCompletionFunc *cb, void *opaque);
 void blk_aio_cancel(BlockAIOCB *acb);
 void blk_aio_cancel_async(BlockAIOCB *acb);
-int blk_co_ioctl(BlockBackend *blk, unsigned long int req, void *buf);
 int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf);
 BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
                           BlockCompletionFunc *cb, void *opaque);
diff --git a/block/block-backend.c b/block/block-backend.c
index 17b2e87afa..610dbfa0b2 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1147,9 +1147,10 @@  static void coroutine_fn blk_wait_while_drained(BlockBackend *blk)
     }
 }
 
-int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
-                               unsigned int bytes, QEMUIOVector *qiov,
-                               BdrvRequestFlags flags)
+/* To be called between exactly one pair of blk_inc/dec_in_flight() */
+static int coroutine_fn
+blk_do_preadv(BlockBackend *blk, int64_t offset, unsigned int bytes,
+              QEMUIOVector *qiov, BdrvRequestFlags flags)
 {
     int ret;
     BlockDriverState *bs;
@@ -1178,10 +1179,24 @@  int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
     return ret;
 }
 
-int coroutine_fn blk_co_pwritev_part(BlockBackend *blk, int64_t offset,
-                                     unsigned int bytes,
-                                     QEMUIOVector *qiov, size_t qiov_offset,
-                                     BdrvRequestFlags flags)
+int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
+                               unsigned int bytes, QEMUIOVector *qiov,
+                               BdrvRequestFlags flags)
+{
+    int ret;
+
+    blk_inc_in_flight(blk);
+    ret = blk_do_preadv(blk, offset, bytes, qiov, flags);
+    blk_dec_in_flight(blk);
+
+    return ret;
+}
+
+/* To be called between exactly one pair of blk_inc/dec_in_flight() */
+static int coroutine_fn
+blk_do_pwritev_part(BlockBackend *blk, int64_t offset, unsigned int bytes,
+                    QEMUIOVector *qiov, size_t qiov_offset,
+                    BdrvRequestFlags flags)
 {
     int ret;
     BlockDriverState *bs;
@@ -1214,6 +1229,20 @@  int coroutine_fn blk_co_pwritev_part(BlockBackend *blk, int64_t offset,
     return ret;
 }
 
+int coroutine_fn blk_co_pwritev_part(BlockBackend *blk, int64_t offset,
+                                     unsigned int bytes,
+                                     QEMUIOVector *qiov, size_t qiov_offset,
+                                     BdrvRequestFlags flags)
+{
+    int ret;
+
+    blk_inc_in_flight(blk);
+    ret = blk_do_pwritev_part(blk, offset, bytes, qiov, qiov_offset, flags);
+    blk_dec_in_flight(blk);
+
+    return ret;
+}
+
 int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
                                 unsigned int bytes, QEMUIOVector *qiov,
                                 BdrvRequestFlags flags)
@@ -1234,7 +1263,7 @@  static void blk_read_entry(void *opaque)
     BlkRwCo *rwco = opaque;
     QEMUIOVector *qiov = rwco->iobuf;
 
-    rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, qiov->size,
+    rwco->ret = blk_do_preadv(rwco->blk, rwco->offset, qiov->size,
                               qiov, rwco->flags);
     aio_wait_kick();
 }
@@ -1244,8 +1273,8 @@  static void blk_write_entry(void *opaque)
     BlkRwCo *rwco = opaque;
     QEMUIOVector *qiov = rwco->iobuf;
 
-    rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, qiov->size,
-                               qiov, rwco->flags);
+    rwco->ret = blk_do_pwritev_part(rwco->blk, rwco->offset, qiov->size,
+                                    qiov, 0, rwco->flags);
     aio_wait_kick();
 }
 
@@ -1262,6 +1291,7 @@  static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
         .ret    = NOT_DONE,
     };
 
+    blk_inc_in_flight(blk);
     if (qemu_in_coroutine()) {
         /* Fast-path if already in coroutine context */
         co_entry(&rwco);
@@ -1270,6 +1300,7 @@  static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
         bdrv_coroutine_enter(blk_bs(blk), co);
         BDRV_POLL_WHILE(blk_bs(blk), rwco.ret == NOT_DONE);
     }
+    blk_dec_in_flight(blk);
 
     return rwco.ret;
 }
@@ -1394,7 +1425,7 @@  static void blk_aio_read_entry(void *opaque)
     }
 
     assert(qiov->size == acb->bytes);
-    rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, acb->bytes,
+    rwco->ret = blk_do_preadv(rwco->blk, rwco->offset, acb->bytes,
                               qiov, rwco->flags);
     blk_aio_complete(acb);
 }
@@ -1412,8 +1443,8 @@  static void blk_aio_write_entry(void *opaque)
     }
 
     assert(!qiov || qiov->size == acb->bytes);
-    rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, acb->bytes,
-                               qiov, rwco->flags);
+    rwco->ret = blk_do_pwritev_part(rwco->blk, rwco->offset, acb->bytes,
+                                    qiov, 0, rwco->flags);
     blk_aio_complete(acb);
 }
 
@@ -1498,7 +1529,9 @@  void blk_aio_cancel_async(BlockAIOCB *acb)
     bdrv_aio_cancel_async(acb);
 }
 
-int blk_co_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
+/* To be called between exactly one pair of blk_inc/dec_in_flight() */
+static int coroutine_fn
+blk_do_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
 {
     blk_wait_while_drained(blk);
 
@@ -1514,8 +1547,7 @@  static void blk_ioctl_entry(void *opaque)
     BlkRwCo *rwco = opaque;
     QEMUIOVector *qiov = rwco->iobuf;
 
-    rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
-                             qiov->iov[0].iov_base);
+    rwco->ret = blk_do_ioctl(rwco->blk, rwco->offset, qiov->iov[0].iov_base);
     aio_wait_kick();
 }
 
@@ -1529,7 +1561,7 @@  static void blk_aio_ioctl_entry(void *opaque)
     BlkAioEmAIOCB *acb = opaque;
     BlkRwCo *rwco = &acb->rwco;
 
-    rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, rwco->iobuf);
+    rwco->ret = blk_do_ioctl(rwco->blk, rwco->offset, rwco->iobuf);
 
     blk_aio_complete(acb);
 }
@@ -1540,7 +1572,9 @@  BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
     return blk_aio_prwv(blk, req, 0, buf, blk_aio_ioctl_entry, 0, cb, opaque);
 }
 
-int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
+/* To be called between exactly one pair of blk_inc/dec_in_flight() */
+static int coroutine_fn
+blk_do_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
 {
     int ret;
 
@@ -1559,7 +1593,7 @@  static void blk_aio_pdiscard_entry(void *opaque)
     BlkAioEmAIOCB *acb = opaque;
     BlkRwCo *rwco = &acb->rwco;
 
-    rwco->ret = blk_co_pdiscard(rwco->blk, rwco->offset, acb->bytes);
+    rwco->ret = blk_do_pdiscard(rwco->blk, rwco->offset, acb->bytes);
     blk_aio_complete(acb);
 }
 
@@ -1571,12 +1605,23 @@  BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk,
                         cb, opaque);
 }
 
+int coroutine_fn blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
+{
+    int ret;
+
+    blk_inc_in_flight(blk);
+    ret = blk_do_pdiscard(blk, offset, bytes);
+    blk_dec_in_flight(blk);
+
+    return ret;
+}
+
 static void blk_pdiscard_entry(void *opaque)
 {
     BlkRwCo *rwco = opaque;
     QEMUIOVector *qiov = rwco->iobuf;
 
-    rwco->ret = blk_co_pdiscard(rwco->blk, rwco->offset, qiov->size);
+    rwco->ret = blk_do_pdiscard(rwco->blk, rwco->offset, qiov->size);
     aio_wait_kick();
 }
 
@@ -1585,7 +1630,8 @@  int blk_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
     return blk_prw(blk, offset, NULL, bytes, blk_pdiscard_entry, 0);
 }
 
-int blk_co_flush(BlockBackend *blk)
+/* To be called between exactly one pair of blk_inc/dec_in_flight() */
+static int coroutine_fn blk_do_flush(BlockBackend *blk)
 {
     blk_wait_while_drained(blk);
 
@@ -1601,7 +1647,7 @@  static void blk_aio_flush_entry(void *opaque)
     BlkAioEmAIOCB *acb = opaque;
     BlkRwCo *rwco = &acb->rwco;
 
-    rwco->ret = blk_co_flush(rwco->blk);
+    rwco->ret = blk_do_flush(rwco->blk);
     blk_aio_complete(acb);
 }
 
@@ -1611,10 +1657,21 @@  BlockAIOCB *blk_aio_flush(BlockBackend *blk,
     return blk_aio_prwv(blk, 0, 0, NULL, blk_aio_flush_entry, 0, cb, opaque);
 }
 
+int coroutine_fn blk_co_flush(BlockBackend *blk)
+{
+    int ret;
+
+    blk_inc_in_flight(blk);
+    ret = blk_do_flush(blk);
+    blk_dec_in_flight(blk);
+
+    return ret;
+}
+
 static void blk_flush_entry(void *opaque)
 {
     BlkRwCo *rwco = opaque;
-    rwco->ret = blk_co_flush(rwco->blk);
+    rwco->ret = blk_do_flush(rwco->blk);
     aio_wait_kick();
 }