[RFC,0/2] Optimise io_uring completion waiting

Message ID cover.1568413210.git.asml.silence@gmail.com (mailing list archive)

Message

Pavel Begunkov Sept. 13, 2019, 10:28 p.m. UTC
From: Pavel Begunkov <asml.silence@gmail.com>

There can be a lot of overhead within the generic wait_event_*() when it is
used to wait for a large number of completions. The patchset removes much of
it by using a custom wait event (wait_threshold).
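
To illustrate the overhead in question, here is a minimal user-space sketch
in C. It is not the code from the patches. It assumes a single waiter that
needs `threshold` completions and a producer that issues one wake-up per
completion; simulate_generic() and simulate_threshold() are hypothetical
names. The first models a waiter that is woken on every completion and
rechecks its condition itself; the second models a wake callback that
compares the completion count against the threshold before waking anyone.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not kernel code: the task is woken on every
 * completion and only then rechecks whether enough completions arrived.
 * Returns the number of task wake-ups.
 */
static size_t simulate_generic(unsigned int threshold, unsigned int completions)
{
    size_t wakeups = 0;

    assert(threshold > 0);
    for (unsigned int done = 1; done <= completions; done++) {
        wakeups++;              /* woken unconditionally */
        if (done >= threshold)  /* condition rechecked by the woken task */
            break;
    }
    return wakeups;
}

/*
 * Hypothetical model, not kernel code: the wake callback checks the
 * threshold first, so the task is woken at most once.
 * Returns the number of task wake-ups.
 */
static size_t simulate_threshold(unsigned int threshold, unsigned int completions)
{
    size_t wakeups = 0;

    assert(threshold > 0);
    for (unsigned int done = 1; done <= completions; done++) {
        if (done < threshold)   /* filtered in the callback, no wake-up */
            continue;
        wakeups++;
        break;
    }
    return wakeups;
}
```

In this model, waiting for 8 completions costs 8 task wake-ups in the
generic case and 1 with the threshold check; the per-completion callback
still runs in both cases.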

A synthetic test showed a ~40% performance boost (see patch 2).

Pavel Begunkov (2):
  sched/wait: Add wait_threshold
  io_uring: Optimise cq waiting with wait_threshold

 fs/io_uring.c                  | 21 ++++++-----
 include/linux/wait_threshold.h | 64 ++++++++++++++++++++++++++++++++++
 kernel/sched/Makefile          |  2 +-
 kernel/sched/wait_threshold.c  | 26 ++++++++++++++
 4 files changed, 103 insertions(+), 10 deletions(-)
 create mode 100644 include/linux/wait_threshold.h
 create mode 100644 kernel/sched/wait_threshold.c

Comments

Jens Axboe Sept. 14, 2019, 12:31 a.m. UTC | #1
On 9/13/19 4:28 PM, Pavel Begunkov (Silence) wrote:
> From: Pavel Begunkov <asml.silence@gmail.com>
> 
> There can be a lot of overhead within the generic wait_event_*() when it is
> used to wait for a large number of completions. The patchset removes much of
> it by using a custom wait event (wait_threshold).
> 
> A synthetic test showed a ~40% performance boost (see patch 2).

Nifty, from an io_uring perspective, I like this a lot.

The core changes needed to support it look fine as well. I'll await
Peter/Ingo's comments on it.

Pavel Begunkov Sept. 14, 2019, 10:11 a.m. UTC | #2
It solves much of the problem, though there is still overhead from
traversing a wait queue plus indirect calls for the checks.

I've been thinking of either
1. creating n wait queues and bucketing waiters. E.g. log2(min_events)
bucketing would remove at least half of such calls for arbitrary
min_events, and all of them if min_events is a power of 2.

2. or digging deeper and adding a custom wake_up with perhaps a sorted
wait_queue. As I see it, it's pretty bulky and over-engineered, but maybe
somebody knows an easier way?

Anyway, I don't have performance numbers for that, so I don't know if this
would be justified.
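
The arithmetic behind option 1 can be sketched as follows. This is an
illustrative user-space model, not kernel code: bucket_of() and the two
failed_checks_*() helpers are hypothetical names. The model assumes that
completions are counted from 1 from the waiter's point of view, that one
check is made per completion, and that bucket b is only scanned once the
count reaches 2^b.

```c
#include <assert.h>

/* Bucket index for a waiter: floor(log2(min_events)), for min_events >= 1. */
static unsigned int bucket_of(unsigned int min_events)
{
    unsigned int b = 0;

    assert(min_events > 0);
    while ((min_events >>= 1) != 0)
        b++;
    return b;
}

/*
 * Failed threshold checks for one waiter on a single shared queue:
 * every completion before the last one triggers a check that fails.
 */
static unsigned int failed_checks_single_queue(unsigned int min_events)
{
    assert(min_events > 0);
    return min_events - 1;
}

/*
 * Failed threshold checks with log2 bucketing (assumed model): the
 * waiter's bucket is skipped until the count reaches 2^bucket, so only
 * the completions from 2^bucket up to min_events - 1 fail the check.
 */
static unsigned int failed_checks_bucketed(unsigned int min_events)
{
    unsigned int start = 1u << bucket_of(min_events);

    return min_events - start;
}
```

Under these assumptions a waiter with min_events = 13 lands in bucket 3 and
sees 5 failed checks instead of 12, while a waiter with min_events = 8 sees
none.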

