
[PULL,3/9] aio-posix: completely stop polling when disabled

Message ID 20200311124045.277969-4-stefanha@redhat.com (mailing list archive)
State New, archived
Series [PULL,1/9] qemu/queue.h: clear linked list pointers on remove

Commit Message

Stefan Hajnoczi March 11, 2020, 12:40 p.m. UTC
One iteration of polling is always performed even when polling is
disabled.  This is done because:
1. Userspace polling is cheaper than making a syscall.  We might get
   lucky.
2. We must poll once more after polling has stopped in case an event
   occurred while stopping polling.

However, there are downsides:
1. Polling becomes a bottleneck when the number of event sources is very
   high.  It's more efficient to monitor fds in that case.
2. A high-frequency polling event source can starve non-polling event
   sources because ppoll(2)/epoll(7) is never invoked.

This patch removes the forced polling iteration so that poll_ns=0 really
means no polling.
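
The resulting control flow can be sketched as a standalone model.  This is
a simplified illustration, not the QEMU code: Handler, Ctx, Demo,
demo_poll and the model_* functions are hypothetical stand-ins for
AioHandler, AioContext, poll_set_started() and the tail of
try_poll_mode(), and the NULL check on io_poll is an assumption of the
sketch.  Locking and deleted-node handling are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for AioHandler and AioContext. */
typedef struct Handler {
    bool (*io_poll)(void *opaque);        /* true if progress was made */
    void (*io_poll_begin)(void *opaque);
    void (*io_poll_end)(void *opaque);
    void *opaque;
} Handler;

typedef struct Ctx {
    Handler *handlers;
    size_t n_handlers;
    bool poll_started;
} Ctx;

/* Demo event source: counts polls, reports progress when 'ready' is set. */
typedef struct Demo {
    int polls;
    bool ready;
} Demo;

static bool demo_poll(void *opaque)
{
    Demo *d = opaque;
    d->polls++;
    return d->ready;
}

/* Switch poll mode.  Only a real started -> stopped transition triggers
 * the final poll, so a context that never polled is left untouched. */
static bool model_poll_set_started(Ctx *ctx, bool started)
{
    bool progress = false;

    if (started == ctx->poll_started) {
        return false;
    }
    ctx->poll_started = started;

    for (size_t i = 0; i < ctx->n_handlers; i++) {
        Handler *h = &ctx->handlers[i];
        void (*fn)(void *) = started ? h->io_poll_begin : h->io_poll_end;

        if (fn) {
            fn(h->opaque);
        }
        /* Poll one last time in case io_poll_end raced with an event.
         * The NULL check is an assumption made for this sketch. */
        if (!started && h->io_poll) {
            progress = h->io_poll(h->opaque) || progress;
        }
    }
    return progress;
}

/* Models the end of try_poll_mode(): no unconditional poll iteration. */
static bool model_try_poll_mode_tail(Ctx *ctx, int64_t *timeout)
{
    if (model_poll_set_started(ctx, false)) {
        *timeout = 0;   /* progress was made, the caller must not block */
        return true;
    }
    return false;
}
```

In this model, calling model_try_poll_mode_tail() on a context whose
poll_started is already false never invokes io_poll and leaves *timeout
unchanged, so the caller goes straight to fd monitoring.  The single extra
poll happens only on the transition from started to stopped.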

IOPS increases from 10k to 60k when the guest has 100
virtio-blk-pci,num-queues=32 devices and 1 virtio-blk-pci,num-queues=1
device because the event loop is no longer slowed down by polling the
large number of event sources.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-2-stefanha@redhat.com
Message-Id: <20200305170806.1313245-2-stefanha@redhat.com>
---
 util/aio-posix.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

Patch

diff --git a/util/aio-posix.c b/util/aio-posix.c
index b339aab12c..65964a2597 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -361,12 +361,13 @@  void aio_set_event_notifier_poll(AioContext *ctx,
                     (IOHandler *)io_poll_end);
 }
 
-static void poll_set_started(AioContext *ctx, bool started)
+static bool poll_set_started(AioContext *ctx, bool started)
 {
     AioHandler *node;
+    bool progress = false;
 
     if (started == ctx->poll_started) {
-        return;
+        return false;
     }
 
     ctx->poll_started = started;
@@ -388,8 +389,15 @@  static void poll_set_started(AioContext *ctx, bool started)
         if (fn) {
             fn(node->opaque);
         }
+
+        /* Poll one last time in case ->io_poll_end() raced with the event */
+        if (!started) {
+            progress = node->io_poll(node->opaque) || progress;
+        }
     }
     qemu_lockcnt_dec(&ctx->list_lock);
+
+    return progress;
 }
 
 
@@ -670,12 +678,12 @@  static bool try_poll_mode(AioContext *ctx, int64_t *timeout)
         }
     }
 
-    poll_set_started(ctx, false);
+    if (poll_set_started(ctx, false)) {
+        *timeout = 0;
+        return true;
+    }
 
-    /* Even if we don't run busy polling, try polling once in case it can make
-     * progress and the caller will be able to avoid ppoll(2)/epoll_wait(2).
-     */
-    return run_poll_handlers_once(ctx, timeout);
+    return false;
 }
 
 bool aio_poll(AioContext *ctx, bool blocking)