diff mbox

[10/12] block: Drain throttling queue with BdrvChild callback

Message ID 1458660792-3035-11-git-send-email-kwolf@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Kevin Wolf March 22, 2016, 3:33 p.m. UTC
This removes the last part of I/O throttling from block/io.c and moves
it to the BlockBackend.

When draining the queue of a BlockDriverState, we must make sure that no
new requests can come in for it. Request sources from outside the block
layer are disabled with aio_disable_external(), but the throttling queue
must be handled separately.

The two obvious options we have are either to implement a similar
mechanism in BlockBackend that queues requests and avoids to pass them
to the BDS, or to flush the whole queue. The first option seems nicer
and could prevent bypassing the I/O limit, but the second is closer to
what we're already doing on the BDS level, so this patch keeps it this
way for now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/block-backend.c     | 29 +++++++++++++++++++++++++++--
 block/io.c                | 35 ++++++++---------------------------
 include/block/block_int.h |  4 ++--
 3 files changed, 37 insertions(+), 31 deletions(-)

Comments

Paolo Bonzini March 23, 2016, 9:29 p.m. UTC | #1
On 22/03/2016 16:33, Kevin Wolf wrote:
> This removes the last part of I/O throttling from block/io.c and moves
> it to the BlockBackend.
> 
> When draining the queue of a BlockDriverState, we must make sure that no
> new requests can come in for it. Request sources from outside the block
> layer are disabled with aio_disable_external(), but the throttling queue
> must be handled separately.

I have looked at the strategy we talked about today to implement request
cancellation (so that e.g. system reset doesn't take ages because of
throttled requests).  While that may be a worthwhile addition anyway, I
think throttling bdrv_drain() may impose an excessive cost for cases
such as live migration.  The risk of the guest using bdrv_drain() to
game throttling is low enough that we can keep on disabling throttling
during bdrv_drain().

So for now I think we can merge the two series just fine.  The strategy
I used in my patch, adding bdrv_no_throttling_begin and
bdrv_no_throttling_end around the bdrv_drain loop, can be adapted just
as use BdrvChildRole callbacks ->drained_begin and ->drained_end.

I will post v3 of my series tomorrow, adopting your patch 1/12 of this
series and removing the recursion on bdrv_no_throttling_begin and
bdrv_no_throttling_end, which is unnecessary.

Paolo
Kevin Wolf March 24, 2016, 8:25 a.m. UTC | #2
Am 23.03.2016 um 22:29 hat Paolo Bonzini geschrieben:
> 
> 
> On 22/03/2016 16:33, Kevin Wolf wrote:
> > This removes the last part of I/O throttling from block/io.c and moves
> > it to the BlockBackend.
> > 
> > When draining the queue of a BlockDriverState, we must make sure that no
> > new requests can come in for it. Request sources from outside the block
> > layer are disabled with aio_disable_external(), but the throttling queue
> > must be handled separately.
> 
> I have looked at the strategy we talked about today to implement request
> cancellation (so that e.g. system reset doesn't take ages because of
> throttled requests).  While that may be a worthwhile addition anyway, I
> think throttling bdrv_drain() may impose an excessive cost for cases
> such as live migration.  The risk of the guest using bdrv_drain() to
> game throttling is low enough that we can keep on disabling throttling
> during bdrv_drain().

I think your cancellation series (allows to) gets rid of most if not all
blk_drain() callers in the device emulation, so it becomes harder for
guests to trigger one. Ideally only the monitor should allow triggering
a drain.

On the other hand, your other series introduces bdrv_drain() calls where
we have open-coded nested event loops waiting for a single request
today. I'm pretty sure that these can be triggered by the guest and that
throttling the drain would be desirable therefore. Maybe we need a
different function there, and maybe we can even retain the behaviour
that it doesn't unnecessarily flush everything instead of just waiting
for the completion of a single request.

> So for now I think we can merge the two series just fine.  The strategy
> I used in my patch, adding bdrv_no_throttling_begin and
> bdrv_no_throttling_end around the bdrv_drain loop, can be adapted just
> as use BdrvChildRole callbacks ->drained_begin and ->drained_end.

Okay. Actually, such a pair of callbacks - not only into the
BlockBackend, but from there into the guest device - was a thought
already when we introduced aio_disable_external(). Do you think it would
make sense to change things in the mid term so that the users of a
BlockBackend just get drain_begin/end callbacks?

> I will post v3 of my series tomorrow, adopting your patch 1/12 of this
> series and removing the recursion on bdrv_no_throttling_begin and
> bdrv_no_throttling_end, which is unnecessary.

Okay, I'll try to rebase then.

Kevin
Paolo Bonzini March 24, 2016, 9:32 a.m. UTC | #3
On 24/03/2016 09:25, Kevin Wolf wrote:
> I think your cancellation series (allows to) gets rid of most if not all
> blk_drain() callers in the device emulation, so it becomes harder for
> guests to trigger one. Ideally only the monitor should allow triggering
> a drain.

More precesely you still need to call drain, but indeed they won't be
able to game throttling.

> On the other hand, your other series introduces bdrv_drain() calls where
> we have open-coded nested event loops waiting for a single request
> today. I'm pretty sure that these can be triggered by the guest and that
> throttling the drain would be desirable therefore.

The open-coded nested event loops are typically triggered only from
qemu-io, bdrv_create, etc.  They shouldn't be worse than the "disable
throttling for sync I/O" that we used to have.

> Okay. Actually, such a pair of callbacks - not only into the
> BlockBackend, but from there into the guest device - was a thought
> already when we introduced aio_disable_external(). Do you think it would
> make sense to change things in the mid term so that the users of a
> BlockBackend just get drain_begin/end callbacks?

Yes, aio_disable_external and aio_disable_internal can be moved to the
BdrvChildRole callbacks too.

Paolo
diff mbox

Patch

diff --git a/block/block-backend.c b/block/block-backend.c
index d2bd268..c71ce4d 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -90,9 +90,12 @@  static void blk_root_inherit_options(int *child_flags, QDict *child_options,
     /* We're not supposed to call this function for root nodes */
     abort();
 }
+static bool blk_drain_throttling_queue(BdrvChild *child);
 
 static const BdrvChildRole child_root = {
-    .inherit_options = blk_root_inherit_options,
+    .inherit_options    = blk_root_inherit_options,
+
+    .drain_queue        = blk_drain_throttling_queue,
 };
 
 /*
@@ -1677,7 +1680,7 @@  void blk_set_io_limits(BlockBackend *blk, ThrottleConfig *cfg)
 void blk_io_limits_disable(BlockBackend *blk)
 {
     blk->public.io_limits_enabled = false;
-    bdrv_start_throttled_reqs(blk_bs(blk));
+    blk_drain_throttling_queue(blk->root);
     throttle_group_unregister_blk(blk);
 }
 
@@ -1705,3 +1708,25 @@  void blk_io_limits_update_group(BlockBackend *blk, const char *group)
     blk_io_limits_disable(blk);
     blk_io_limits_enable(blk, group);
 }
+
+/* this function drain all the throttled IOs */
+static bool blk_drain_throttling_queue(BdrvChild *child)
+{
+    BlockBackend *blk = child->opaque;
+    BlockBackendPublic *blkp = &blk->public;
+    bool drained = false;
+    bool enabled = blkp->io_limits_enabled;
+    int i;
+
+    blkp->io_limits_enabled = false;
+
+    for (i = 0; i < 2; i++) {
+        while (qemu_co_enter_next(&blkp->throttled_reqs[i])) {
+            drained = true;
+        }
+    }
+
+    blkp->io_limits_enabled = enabled;
+
+    return drained;
+}
diff --git a/block/io.c b/block/io.c
index 22dbb43..f6edab8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -27,7 +27,6 @@ 
 #include "sysemu/block-backend.h"
 #include "block/blockjob.h"
 #include "block/block_int.h"
-#include "block/throttle-groups.h"
 #include "qemu/error-report.h"
 
 #define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */
@@ -56,31 +55,6 @@  static void coroutine_fn bdrv_co_do_rw(void *opaque);
 static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors, BdrvRequestFlags flags);
 
-/* this function drain all the throttled IOs */
-bool bdrv_start_throttled_reqs(BlockDriverState *bs)
-{
-    if (!bs->blk) {
-        return false;
-    }
-
-    BlockBackendPublic *blkp = blk_get_public(bs->blk);
-    bool drained = false;
-    bool enabled = blk_get_public(bs->blk)->io_limits_enabled;
-    int i;
-
-    blkp->io_limits_enabled = false;
-
-    for (i = 0; i < 2; i++) {
-        while (qemu_co_enter_next(&blkp->throttled_reqs[i])) {
-            drained = true;
-        }
-    }
-
-    blkp->io_limits_enabled = enabled;
-
-    return drained;
-}
-
 void bdrv_setup_io_funcs(BlockDriver *bdrv)
 {
     /* Block drivers without coroutine functions need emulation */
@@ -2668,12 +2642,19 @@  void bdrv_io_unplug(BlockDriverState *bs)
 void bdrv_flush_io_queue(BlockDriverState *bs)
 {
     BlockDriver *drv = bs->drv;
+    BdrvChild *c;
+
     if (drv && drv->bdrv_flush_io_queue) {
         drv->bdrv_flush_io_queue(bs);
     } else if (bs->file) {
         bdrv_flush_io_queue(bs->file->bs);
     }
-    bdrv_start_throttled_reqs(bs);
+
+    QLIST_FOREACH(c, &bs->parents, next_parent) {
+        if (c->role->drain_queue) {
+            c->role->drain_queue(c);
+        }
+    }
 }
 
 void bdrv_drained_begin(BlockDriverState *bs)
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 1b3c310..e064d9d 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -355,6 +355,8 @@  typedef struct BdrvAioNotifier {
 struct BdrvChildRole {
     void (*inherit_options)(int *child_flags, QDict *child_options,
                             int parent_flags, QDict *parent_options);
+
+    bool (*drain_queue)(BdrvChild *child);
 };
 
 extern const BdrvChildRole child_file;
@@ -508,8 +510,6 @@  int get_tmp_filename(char *filename, int size);
 BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
                             const char *filename);
 
-bool bdrv_start_throttled_reqs(BlockDriverState *bs);
-
 
 /**
  * bdrv_add_before_write_notifier: