Message ID | 20220106082057.968-2-hnarukaw@yahoo-corp.jp (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Patch to adjust coroutine pool size adaptively | expand |
Cc'ing Stefan / Paolo. On 6/1/22 09:20, Hiroki Narukawa wrote: > Coroutine pool size was 64 from long ago, and the basis was organized in the commit message in c740ad92. > > At that time, virtio-blk queue-size and num-queue were not configuable, and equivalent values were 128 and 1. > > Coroutine pool size 64 was fine then. > > Later queue-size and num-queue got configuable, and default values were increased. > > Coroutine pool with size 64 exhausts frequently with random disk IO in new size, and slows down. > > This commit adjusts coroutine pool size adaptively with new values. > > This commit adds 64 by default, but now coroutine is not only for block devices, > > and is not too much burdon comparing with new default. > > pool size of 128 * vCPUs. > > Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> > --- > hw/block/virtio-blk.c | 3 +++ > include/qemu/coroutine.h | 5 +++++ > util/qemu-coroutine.c | 15 +++++++++++---- > 3 files changed, 19 insertions(+), 4 deletions(-) > > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c > index f139cd7cc9..726dbe14de 100644 > --- a/hw/block/virtio-blk.c > +++ b/hw/block/virtio-blk.c > @@ -32,6 +32,7 @@ > #include "hw/virtio/virtio-bus.h" > #include "migration/qemu-file-types.h" > #include "hw/virtio/virtio-access.h" > +#include "qemu/coroutine.h" > > /* Config size before the discard support (hide associated config fields) */ > #define VIRTIO_BLK_CFG_SIZE offsetof(struct virtio_blk_config, \ > @@ -1222,6 +1223,8 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp) > for (i = 0; i < conf->num_queues; i++) { > virtio_add_queue(vdev, conf->queue_size, virtio_blk_handle_output); > } > + qemu_coroutine_increase_pool_batch_size(conf->num_queues * conf->queue_size > + / 2); > virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err); > if (err != NULL) { > error_propagate(errp, err); > diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h > index 4829ff373d..e52ed76ab2 100644 > --- a/include/qemu/coroutine.h > +++ b/include/qemu/coroutine.h > @@ -331,6 +331,11 @@ void qemu_co_sleep_wake(QemuCoSleep *w); > */ > void coroutine_fn yield_until_fd_readable(int fd); > > +/** > + * Increase coroutine pool size > + */ > +void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size); > + > #include "qemu/lockable.h" > > #endif /* QEMU_COROUTINE_H */ > diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c > index 38fb6d3084..080a1e0126 100644 > --- a/util/qemu-coroutine.c > +++ b/util/qemu-coroutine.c > @@ -20,12 +20,14 @@ > #include "qemu/coroutine_int.h" > #include "block/aio.h" > > +/** Initial batch size is 64, and is increased on demand */ > enum { > - POOL_BATCH_SIZE = 64, > + POOL_INITIAL_BATCH_SIZE = 64, > }; > > /** Free list to speed up creation */ > static QSLIST_HEAD(, Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool); > +static unsigned int pool_batch_size = POOL_INITIAL_BATCH_SIZE; > static unsigned int release_pool_size; > static __thread QSLIST_HEAD(, Coroutine) alloc_pool = QSLIST_HEAD_INITIALIZER(pool); > static __thread unsigned int alloc_pool_size; > @@ -49,7 +51,7 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque) > if (CONFIG_COROUTINE_POOL) { > co = QSLIST_FIRST(&alloc_pool); > if (!co) { > - if (release_pool_size > POOL_BATCH_SIZE) { > + if (release_pool_size > pool_batch_size) { > /* Slow path; a good place to register the destructor, too. */ > if (!coroutine_pool_cleanup_notifier.notify) { > coroutine_pool_cleanup_notifier.notify = coroutine_pool_cleanup; > @@ -86,12 +88,12 @@ static void coroutine_delete(Coroutine *co) > co->caller = NULL; > > if (CONFIG_COROUTINE_POOL) { > - if (release_pool_size < POOL_BATCH_SIZE * 2) { > + if (release_pool_size < pool_batch_size * 2) { > QSLIST_INSERT_HEAD_ATOMIC(&release_pool, co, pool_next); > qatomic_inc(&release_pool_size); > return; > } > - if (alloc_pool_size < POOL_BATCH_SIZE) { > + if (alloc_pool_size < pool_batch_size) { > QSLIST_INSERT_HEAD(&alloc_pool, co, pool_next); > alloc_pool_size++; > return; > @@ -202,3 +204,8 @@ AioContext *coroutine_fn qemu_coroutine_get_aio_context(Coroutine *co) > { > return co->ctx; > } > + > +void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size) > +{ > + qatomic_add(&pool_batch_size, additional_pool_size); > +}
On Thu, Jan 06, 2022 at 05:20:57PM +0900, Hiroki Narukawa wrote: Phil, thanks for notifying me. > Coroutine pool size was 64 from long ago, and the basis was organized in the commit message in c740ad92. > > At that time, virtio-blk queue-size and num-queue were not configuable, and equivalent values were 128 and 1. > > Coroutine pool size 64 was fine then. > > Later queue-size and num-queue got configuable, and default values were increased. > > Coroutine pool with size 64 exhausts frequently with random disk IO in new size, and slows down. > > This commit adjusts coroutine pool size adaptively with new values. > > This commit adds 64 by default, but now coroutine is not only for block devices, > > and is not too much burdon comparing with new default. > > pool size of 128 * vCPUs. > > Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> > --- > hw/block/virtio-blk.c | 3 +++ > include/qemu/coroutine.h | 5 +++++ > util/qemu-coroutine.c | 15 +++++++++++---- > 3 files changed, 19 insertions(+), 4 deletions(-) Have you measured with QEMU 6.1 or later? Commit d7ddd0a1618a75b31dc308bb37365ce1da972154 ("linux-aio: limit the batch size using `aio-max-batch` parameter") can hide this issue so it may not be apparent in recent QEMU releases. I like your approach better than what I tried recently (I ended up dropping the patch from my queue because it doesn't handle coroutines created in one thread and terminated in another thread correctly): https://patchew.org/QEMU/20210913153524.1190696-1-stefanha@redhat.com/ > > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c > index f139cd7cc9..726dbe14de 100644 > --- a/hw/block/virtio-blk.c > +++ b/hw/block/virtio-blk.c > @@ -32,6 +32,7 @@ > #include "hw/virtio/virtio-bus.h" > #include "migration/qemu-file-types.h" > #include "hw/virtio/virtio-access.h" > +#include "qemu/coroutine.h" > > /* Config size before the discard support (hide associated config fields) */ > #define VIRTIO_BLK_CFG_SIZE offsetof(struct virtio_blk_config, \ > @@ -1222,6 +1223,8 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp) > for (i = 0; i < conf->num_queues; i++) { > virtio_add_queue(vdev, conf->queue_size, virtio_blk_handle_output); > } > + qemu_coroutine_increase_pool_batch_size(conf->num_queues * conf->queue_size > + / 2); This over-provisions coroutine pools when IOThreads are configured, because --device virtio-blk-pci,iothread=iothread2 will only submit I/O requests in iothread2, for example. Other threads don't need to increase their limit. However, I think it's okay to use this inexact approach. It's still better than the current hardcoded 64 coroutine pool size. > virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err); > if (err != NULL) { > error_propagate(errp, err); > diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h > index 4829ff373d..e52ed76ab2 100644 > --- a/include/qemu/coroutine.h > +++ b/include/qemu/coroutine.h > @@ -331,6 +331,11 @@ void qemu_co_sleep_wake(QemuCoSleep *w); > */ > void coroutine_fn yield_until_fd_readable(int fd); > > +/** > + * Increase coroutine pool size > + */ > +void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size); > + > #include "qemu/lockable.h" > > #endif /* QEMU_COROUTINE_H */ > diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c > index 38fb6d3084..080a1e0126 100644 > --- a/util/qemu-coroutine.c > +++ b/util/qemu-coroutine.c > @@ -20,12 +20,14 @@ > #include "qemu/coroutine_int.h" > #include "block/aio.h" > > +/** Initial batch size is 64, and is increased on demand */ > enum { > - POOL_BATCH_SIZE = 64, > + POOL_INITIAL_BATCH_SIZE = 64, > }; > > /** Free list to speed up creation */ > static QSLIST_HEAD(, Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool); > +static unsigned int pool_batch_size = POOL_INITIAL_BATCH_SIZE; > static unsigned int release_pool_size; > static __thread QSLIST_HEAD(, Coroutine) alloc_pool = QSLIST_HEAD_INITIALIZER(pool); > static __thread unsigned int alloc_pool_size; > @@ -49,7 +51,7 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque) > if (CONFIG_COROUTINE_POOL) { > co = QSLIST_FIRST(&alloc_pool); > if (!co) { > - if (release_pool_size > POOL_BATCH_SIZE) { > + if (release_pool_size > pool_batch_size) { > /* Slow path; a good place to register the destructor, too. */ > if (!coroutine_pool_cleanup_notifier.notify) { > coroutine_pool_cleanup_notifier.notify = coroutine_pool_cleanup; > @@ -86,12 +88,12 @@ static void coroutine_delete(Coroutine *co) > co->caller = NULL; > > if (CONFIG_COROUTINE_POOL) { > - if (release_pool_size < POOL_BATCH_SIZE * 2) { > + if (release_pool_size < pool_batch_size * 2) { > QSLIST_INSERT_HEAD_ATOMIC(&release_pool, co, pool_next); > qatomic_inc(&release_pool_size); > return; > } > - if (alloc_pool_size < POOL_BATCH_SIZE) { > + if (alloc_pool_size < pool_batch_size) { > QSLIST_INSERT_HEAD(&alloc_pool, co, pool_next); > alloc_pool_size++; > return; > @@ -202,3 +204,8 @@ AioContext *coroutine_fn qemu_coroutine_get_aio_context(Coroutine *co) > { > return co->ctx; > } > + > +void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size) > +{ > + qatomic_add(&pool_batch_size, additional_pool_size); If atomic_add() is used to modify pool_batch_size then qatomic_read() should be used for loads. At a minimum it serves as documentation that this is an atomic variable.
> On Thu, Jan 06, 2022 at 05:20:57PM +0900, Hiroki Narukawa wrote: > > Phil, thanks for notifying me. > > > Coroutine pool size was 64 from long ago, and the basis was organized in the commit message in c740ad92. > > > > At that time, virtio-blk queue-size and num-queue were not configuable, and equivalent values were 128 and 1. > > > > Coroutine pool size 64 was fine then. > > > > Later queue-size and num-queue got configuable, and default values were increased. > > > > Coroutine pool with size 64 exhausts frequently with random disk IO in new size, and slows down. > > > > This commit adjusts coroutine pool size adaptively with new values. > > > > This commit adds 64 by default, but now coroutine is not only for > > block devices, > > > > and is not too much burdon comparing with new default. > > > > pool size of 128 * vCPUs. > > > > Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> > > --- > > hw/block/virtio-blk.c | 3 +++ > > include/qemu/coroutine.h | 5 +++++ > > util/qemu-coroutine.c | 15 +++++++++++---- > > 3 files changed, 19 insertions(+), 4 deletions(-) > > Have you measured with QEMU 6.1 or later? Commit > d7ddd0a1618a75b31dc308bb37365ce1da972154 ("linux-aio: limit the batch size using `aio-max-batch` parameter") can hide this issue so it may not be apparent in recent QEMU releases. > > I like your approach better than what I tried recently (I ended up dropping the patch from my queue because it doesn't handle coroutines created in one thread and terminated in another thread correctly): > https://patchew.org/QEMU/20210913153524.1190696-1-stefanha@redhat.com/ Yes, I measured with both QEMU-6.0 and QEMU-6.2, and both were affected by coroutine pool size. Two versions did not have so much difference in my measurement. > > > > > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index > > f139cd7cc9..726dbe14de 100644 > > --- a/hw/block/virtio-blk.c > > +++ b/hw/block/virtio-blk.c > > @@ -32,6 +32,7 @@ > > #include "hw/virtio/virtio-bus.h" > > #include "migration/qemu-file-types.h" > > #include "hw/virtio/virtio-access.h" > > +#include "qemu/coroutine.h" > > > > /* Config size before the discard support (hide associated config > > fields) */ #define VIRTIO_BLK_CFG_SIZE offsetof(struct > > virtio_blk_config, \ @@ -1222,6 +1223,8 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp) > > for (i = 0; i < conf->num_queues; i++) { > > virtio_add_queue(vdev, conf->queue_size, virtio_blk_handle_output); > > } > > + qemu_coroutine_increase_pool_batch_size(conf->num_queues * conf->queue_size > > + / 2); > > This over-provisions coroutine pools when IOThreads are configured, because --device virtio-blk-pci,iothread=iothread2 will only submit I/O requests in iothread2, for example. Other threads don't need to increase their limit. > > However, I think it's okay to use this inexact approach. It's still better than the current hardcoded 64 coroutine pool size. > Thank you for pointing out, Considering your comment, it seems to be better if pool_batch_size may be thread local variable. But I couldn't find a way to initialize pool size from the same thread of caller. > > virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err); > > if (err != NULL) { > > error_propagate(errp, err); > > diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h index > > 4829ff373d..e52ed76ab2 100644 > > --- a/include/qemu/coroutine.h > > +++ b/include/qemu/coroutine.h > > @@ -331,6 +331,11 @@ void qemu_co_sleep_wake(QemuCoSleep *w); > > */ > > void coroutine_fn yield_until_fd_readable(int fd); > > > > +/** > > + * Increase coroutine pool size > > + */ > > +void qemu_coroutine_increase_pool_batch_size(unsigned int > > +additional_pool_size); > > + > > #include "qemu/lockable.h" > > > > #endif /* QEMU_COROUTINE_H */ > > diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c index > > 38fb6d3084..080a1e0126 100644 > > --- a/util/qemu-coroutine.c > > +++ b/util/qemu-coroutine.c > > @@ -20,12 +20,14 @@ > > #include "qemu/coroutine_int.h" > > #include "block/aio.h" > > > > +/** Initial batch size is 64, and is increased on demand */ > > enum { > > - POOL_BATCH_SIZE = 64, > > + POOL_INITIAL_BATCH_SIZE = 64, > > }; > > > > /** Free list to speed up creation */ static QSLIST_HEAD(, > > Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool); > > +static unsigned int pool_batch_size = POOL_INITIAL_BATCH_SIZE; > > static unsigned int release_pool_size; static __thread QSLIST_HEAD(, > > Coroutine) alloc_pool = QSLIST_HEAD_INITIALIZER(pool); static > > __thread unsigned int alloc_pool_size; @@ -49,7 +51,7 @@ Coroutine > > *qemu_coroutine_create(CoroutineEntry *entry, void *opaque) > > if (CONFIG_COROUTINE_POOL) { > > co = QSLIST_FIRST(&alloc_pool); > > if (!co) { > > - if (release_pool_size > POOL_BATCH_SIZE) { > > + if (release_pool_size > pool_batch_size) { > > /* Slow path; a good place to register the destructor, too. */ > > if (!coroutine_pool_cleanup_notifier.notify) { > > coroutine_pool_cleanup_notifier.notify = > > coroutine_pool_cleanup; @@ -86,12 +88,12 @@ static void coroutine_delete(Coroutine *co) > > co->caller = NULL; > > > > if (CONFIG_COROUTINE_POOL) { > > - if (release_pool_size < POOL_BATCH_SIZE * 2) { > > + if (release_pool_size < pool_batch_size * 2) { > > QSLIST_INSERT_HEAD_ATOMIC(&release_pool, co, pool_next); > > qatomic_inc(&release_pool_size); > > return; > > } > > - if (alloc_pool_size < POOL_BATCH_SIZE) { > > + if (alloc_pool_size < pool_batch_size) { > > QSLIST_INSERT_HEAD(&alloc_pool, co, pool_next); > > alloc_pool_size++; > > return; > > @@ -202,3 +204,8 @@ AioContext *coroutine_fn > > qemu_coroutine_get_aio_context(Coroutine *co) { > > return co->ctx; > > } > > + > > +void qemu_coroutine_increase_pool_batch_size(unsigned int > > +additional_pool_size) { > > + qatomic_add(&pool_batch_size, additional_pool_size); > > If atomic_add() is used to modify pool_batch_size then qatomic_read() should be used for loads. At a minimum it serves as documentation that this is an atomic variable. > Fixed this in patch v2 and resent it.
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index f139cd7cc9..726dbe14de 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -32,6 +32,7 @@ #include "hw/virtio/virtio-bus.h" #include "migration/qemu-file-types.h" #include "hw/virtio/virtio-access.h" +#include "qemu/coroutine.h" /* Config size before the discard support (hide associated config fields) */ #define VIRTIO_BLK_CFG_SIZE offsetof(struct virtio_blk_config, \ @@ -1222,6 +1223,8 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp) for (i = 0; i < conf->num_queues; i++) { virtio_add_queue(vdev, conf->queue_size, virtio_blk_handle_output); } + qemu_coroutine_increase_pool_batch_size(conf->num_queues * conf->queue_size + / 2); virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err); if (err != NULL) { error_propagate(errp, err); diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h index 4829ff373d..e52ed76ab2 100644 --- a/include/qemu/coroutine.h +++ b/include/qemu/coroutine.h @@ -331,6 +331,11 @@ void qemu_co_sleep_wake(QemuCoSleep *w); */ void coroutine_fn yield_until_fd_readable(int fd); +/** + * Increase coroutine pool size + */ +void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size); + #include "qemu/lockable.h" #endif /* QEMU_COROUTINE_H */ diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c index 38fb6d3084..080a1e0126 100644 --- a/util/qemu-coroutine.c +++ b/util/qemu-coroutine.c @@ -20,12 +20,14 @@ #include "qemu/coroutine_int.h" #include "block/aio.h" +/** Initial batch size is 64, and is increased on demand */ enum { - POOL_BATCH_SIZE = 64, + POOL_INITIAL_BATCH_SIZE = 64, }; /** Free list to speed up creation */ static QSLIST_HEAD(, Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool); +static unsigned int pool_batch_size = POOL_INITIAL_BATCH_SIZE; static unsigned int release_pool_size; static __thread QSLIST_HEAD(, Coroutine) alloc_pool = QSLIST_HEAD_INITIALIZER(pool); static __thread unsigned int alloc_pool_size; @@ -49,7 +51,7 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque) if (CONFIG_COROUTINE_POOL) { co = QSLIST_FIRST(&alloc_pool); if (!co) { - if (release_pool_size > POOL_BATCH_SIZE) { + if (release_pool_size > pool_batch_size) { /* Slow path; a good place to register the destructor, too. */ if (!coroutine_pool_cleanup_notifier.notify) { coroutine_pool_cleanup_notifier.notify = coroutine_pool_cleanup; @@ -86,12 +88,12 @@ static void coroutine_delete(Coroutine *co) co->caller = NULL; if (CONFIG_COROUTINE_POOL) { - if (release_pool_size < POOL_BATCH_SIZE * 2) { + if (release_pool_size < pool_batch_size * 2) { QSLIST_INSERT_HEAD_ATOMIC(&release_pool, co, pool_next); qatomic_inc(&release_pool_size); return; } - if (alloc_pool_size < POOL_BATCH_SIZE) { + if (alloc_pool_size < pool_batch_size) { QSLIST_INSERT_HEAD(&alloc_pool, co, pool_next); alloc_pool_size++; return; @@ -202,3 +204,8 @@ AioContext *coroutine_fn qemu_coroutine_get_aio_context(Coroutine *co) { return co->ctx; } + +void qemu_coroutine_increase_pool_batch_size(unsigned int additional_pool_size) +{ + qatomic_add(&pool_batch_size, additional_pool_size); +}
Coroutine pool size was 64 from long ago, and the basis was organized in the commit message in c740ad92. At that time, virtio-blk queue-size and num-queue were not configuable, and equivalent values were 128 and 1. Coroutine pool size 64 was fine then. Later queue-size and num-queue got configuable, and default values were increased. Coroutine pool with size 64 exhausts frequently with random disk IO in new size, and slows down. This commit adjusts coroutine pool size adaptively with new values. This commit adds 64 by default, but now coroutine is not only for block devices, and is not too much burdon comparing with new default. pool size of 128 * vCPUs. Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> --- hw/block/virtio-blk.c | 3 +++ include/qemu/coroutine.h | 5 +++++ util/qemu-coroutine.c | 15 +++++++++++---- 3 files changed, 19 insertions(+), 4 deletions(-)