diff mbox series

io_uring: remove wait loop spurious wakeups

Message ID 936cd758d6c694fe1b8b9de050e24cfecdc2e60d.1570489620.git.asml.silence@gmail.com (mailing list archive)
State New, archived
Headers show
Series io_uring: remove wait loop spurious wakeups | expand

Commit Message

Pavel Begunkov Oct. 7, 2019, 11:18 p.m. UTC
From: Pavel Begunkov <asml.silence@gmail.com>

Any changes interesting to tasks waiting in io_cqring_wait() are
commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
also tries to do that but with no reason, that means spurious wakeups
every io_free_req() and io_uring_enter().

Just use percpu_ref_put() instead.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

Comments

Jens Axboe Oct. 8, 2019, 3:16 a.m. UTC | #1
On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
> From: Pavel Begunkov <asml.silence@gmail.com>
> 
> Any changes interesting to tasks waiting in io_cqring_wait() are
> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
> also tries to do that but with no reason, that means spurious wakeups
> every io_free_req() and io_uring_enter().
> 
> Just use percpu_ref_put() instead.

Looks good, this is a leftover from when the ctx teardown used
the waitqueue as well.
Pavel Begunkov Oct. 8, 2019, 4:43 p.m. UTC | #2
On 08/10/2019 06:16, Jens Axboe wrote:
> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>> From: Pavel Begunkov <asml.silence@gmail.com>
>>
>> Any changes interesting to tasks waiting in io_cqring_wait() are
>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>> also tries to do that but with no reason, that means spurious wakeups
>> every io_free_req() and io_uring_enter().
>>
>> Just use percpu_ref_put() instead.
> 
> Looks good, this is a leftover from when the ctx teardown used
> the waitqueue as well.
> 
BTW, is there a reason for ref-counting in struct io_kiocb? I understand
the idea behind submission reference, but don't see any actual part
needing it.

Tested with another ref-counting patch and got +5-8% to
nops performance.
Jens Axboe Oct. 8, 2019, 5 p.m. UTC | #3
On 10/8/19 10:43 AM, Pavel Begunkov wrote:
> On 08/10/2019 06:16, Jens Axboe wrote:
>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>> From: Pavel Begunkov <asml.silence@gmail.com>
>>>
>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>> also tries to do that but with no reason, that means spurious wakeups
>>> every io_free_req() and io_uring_enter().
>>>
>>> Just use percpu_ref_put() instead.
>>
>> Looks good, this is a leftover from when the ctx teardown used
>> the waitqueue as well.
>>
> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
> the idea behind submission reference, but don't see any actual part
> needing it.

In short, it's to prevent the completion running before we're done with
the iocb on the submission side.

> Tested with another ref-counting patch and got +5-8% to
> nops performance.
> 
>
Pavel Begunkov Oct. 8, 2019, 8:58 p.m. UTC | #4
On 08/10/2019 20:00, Jens Axboe wrote:
> On 10/8/19 10:43 AM, Pavel Begunkov wrote:
>> On 08/10/2019 06:16, Jens Axboe wrote:
>>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>>> From: Pavel Begunkov <asml.silence@gmail.com>
>>>>
>>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>>> also tries to do that but with no reason, that means spurious wakeups
>>>> every io_free_req() and io_uring_enter().
>>>>
>>>> Just use percpu_ref_put() instead.
>>>
>>> Looks good, this is a leftover from when the ctx teardown used
>>> the waitqueue as well.
>>>
>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>> the idea behind submission reference, but don't see any actual part
>> needing it.
> 
> In short, it's to prevent the completion running before we're done with
> the iocb on the submission side.

Yep, that's what I expected. Perhaps I missed something, but what I've
seen following code paths all the way down, it either
1. gets error / completes synchronously and then frees req locally
2. or passes it further (e.g. async list) and never accesses it after
Jens Axboe Oct. 8, 2019, 9:22 p.m. UTC | #5
On 10/8/19 2:58 PM, Pavel Begunkov wrote:
> On 08/10/2019 20:00, Jens Axboe wrote:
>> On 10/8/19 10:43 AM, Pavel Begunkov wrote:
>>> On 08/10/2019 06:16, Jens Axboe wrote:
>>>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>>>> From: Pavel Begunkov <asml.silence@gmail.com>
>>>>>
>>>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>>>> also tries to do that but with no reason, that means spurious wakeups
>>>>> every io_free_req() and io_uring_enter().
>>>>>
>>>>> Just use percpu_ref_put() instead.
>>>>
>>>> Looks good, this is a leftover from when the ctx teardown used
>>>> the waitqueue as well.
>>>>
>>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>>> the idea behind submission reference, but don't see any actual part
>>> needing it.
>>
>> In short, it's to prevent the completion running before we're done with
>> the iocb on the submission side.
> 
> Yep, that's what I expected. Perhaps I missed something, but what I've
> seen following code paths all the way down, it either
> 1. gets error / completes synchronously and then frees req locally
> 2. or passes it further (e.g. async list) and never accesses it after

As soon as the IO is passed on, it can complete. In fact, it can complete
even _before_ that call returns. That's the issue. Obviously this isn't
true for purely polled IO, but it is true for IRQ based IO.
Pavel Begunkov Oct. 8, 2019, 10:05 p.m. UTC | #6
On 09/10/2019 00:22, Jens Axboe wrote:
> On 10/8/19 2:58 PM, Pavel Begunkov wrote:
>> On 08/10/2019 20:00, Jens Axboe wrote:
>>> On 10/8/19 10:43 AM, Pavel Begunkov wrote:
>>>> On 08/10/2019 06:16, Jens Axboe wrote:
>>>>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>>>>> From: Pavel Begunkov <asml.silence@gmail.com>
>>>>>>
>>>>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>>>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>>>>> also tries to do that but with no reason, that means spurious wakeups
>>>>>> every io_free_req() and io_uring_enter().
>>>>>>
>>>>>> Just use percpu_ref_put() instead.
>>>>>
>>>>> Looks good, this is a leftover from when the ctx teardown used
>>>>> the waitqueue as well.
>>>>>
>>>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>>>> the idea behind submission reference, but don't see any actual part
>>>> needing it.
>>>
>>> In short, it's to prevent the completion running before we're done with
>>> the iocb on the submission side.
>>
>> Yep, that's what I expected. Perhaps I missed something, but what I've
>> seen following code paths all the way down, it either
>> 1. gets error / completes synchronously and then frees req locally
>> 2. or passes it further (e.g. async list) and never accesses it after
> 
> As soon as the IO is passed on, it can complete. In fact, it can complete
> even _before_ that call returns. That's the issue. Obviously this isn't
> true for purely polled IO, but it is true for IRQ based IO.

And the idea was to not use io_kiocb after submission. Except when we know,
that it won't complete asynchronously (e.g. error), that could be checked
with return code, I guess.

Anyway, thanks for the explanation!
Jens Axboe Oct. 9, 2019, 2:54 a.m. UTC | #7
On 10/8/19 4:05 PM, Pavel Begunkov wrote:
> On 09/10/2019 00:22, Jens Axboe wrote:
>> On 10/8/19 2:58 PM, Pavel Begunkov wrote:
>>> On 08/10/2019 20:00, Jens Axboe wrote:
>>>> On 10/8/19 10:43 AM, Pavel Begunkov wrote:
>>>>> On 08/10/2019 06:16, Jens Axboe wrote:
>>>>>> On 10/7/19 5:18 PM, Pavel Begunkov (Silence) wrote:
>>>>>>> From: Pavel Begunkov <asml.silence@gmail.com>
>>>>>>>
>>>>>>> Any changes interesting to tasks waiting in io_cqring_wait() are
>>>>>>> commited with io_cqring_ev_posted(). However, io_ring_drop_ctx_refs()
>>>>>>> also tries to do that but with no reason, that means spurious wakeups
>>>>>>> every io_free_req() and io_uring_enter().
>>>>>>>
>>>>>>> Just use percpu_ref_put() instead.
>>>>>>
>>>>>> Looks good, this is a leftover from when the ctx teardown used
>>>>>> the waitqueue as well.
>>>>>>
>>>>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>>>>> the idea behind submission reference, but don't see any actual part
>>>>> needing it.
>>>>
>>>> In short, it's to prevent the completion running before we're done with
>>>> the iocb on the submission side.
>>>
>>> Yep, that's what I expected. Perhaps I missed something, but what I've
>>> seen following code paths all the way down, it either
>>> 1. gets error / completes synchronously and then frees req locally
>>> 2. or passes it further (e.g. async list) and never accesses it after
>>
>> As soon as the IO is passed on, it can complete. In fact, it can complete
>> even _before_ that call returns. That's the issue. Obviously this isn't
>> true for purely polled IO, but it is true for IRQ based IO.
> 
> And the idea was to not use io_kiocb after submission. Except when we know,
> that it won't complete asynchronously (e.g. error), that could be checked
> with return code, I guess.

I think you're still missing the point. During the submission it can go
away, it can be deep in a call chain. So it's not enough to say "we
won't touch it after completion returns", we need to hold a reference to
ensure it doesn't go away WHILE being submitted.

Hope that helps!
Pavel Begunkov Oct. 9, 2019, 9:19 a.m. UTC | #8
On 10/9/2019 5:54 AM, Jens Axboe wrote:
>>>>>> BTW, is there a reason for ref-counting in struct io_kiocb? I understand
>>>>>> the idea behind submission reference, but don't see any actual part
>>>>>> needing it.
>>>>>
>>>>> In short, it's to prevent the completion running before we're done with
>>>>> the iocb on the submission side.
>>>>
>>>> Yep, that's what I expected. Perhaps I missed something, but what I've
>>>> seen following code paths all the way down, it either
>>>> 1. gets error / completes synchronously and then frees req locally
>>>> 2. or passes it further (e.g. async list) and never accesses it after
>>>
>>> As soon as the IO is passed on, it can complete. In fact, it can complete
>>> even _before_ that call returns. That's the issue. Obviously this isn't
>>> true for purely polled IO, but it is true for IRQ based IO.
>>
>> And the idea was to not use io_kiocb after submission. Except when we know,
>> that it won't complete asynchronously (e.g. error), that could be checked
>> with return code, I guess.
> 
> I think you're still missing the point. During the submission it can go
> away, it can be deep in a call chain. So it's not enough to say "we
> won't touch it after completion returns", we need to hold a reference to
> ensure it doesn't go away WHILE being submitted.
> 
> Hope that helps!

Now I get it, thanks Jens!
diff mbox series

Patch

diff --git a/fs/io_uring.c b/fs/io_uring.c
index c934f91c51e9..89d77a626063 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -591,14 +591,6 @@  static void io_cqring_add_event(struct io_ring_ctx *ctx, u64 user_data,
 	io_cqring_ev_posted(ctx);
 }
 
-static void io_ring_drop_ctx_refs(struct io_ring_ctx *ctx, unsigned refs)
-{
-	percpu_ref_put_many(&ctx->refs, refs);
-
-	if (waitqueue_active(&ctx->wait))
-		wake_up(&ctx->wait);
-}
-
 static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
 				   struct io_submit_state *state)
 {
@@ -646,7 +638,7 @@  static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
 	req->result = 0;
 	return req;
 out:
-	io_ring_drop_ctx_refs(ctx, 1);
+	percpu_ref_put(&ctx->refs);
 	return NULL;
 }
 
@@ -654,7 +646,7 @@  static void io_free_req_many(struct io_ring_ctx *ctx, void **reqs, int *nr)
 {
 	if (*nr) {
 		kmem_cache_free_bulk(req_cachep, *nr, reqs);
-		io_ring_drop_ctx_refs(ctx, *nr);
+		percpu_ref_put_many(&ctx->refs, *nr);
 		*nr = 0;
 	}
 }
@@ -663,7 +655,7 @@  static void __io_free_req(struct io_kiocb *req)
 {
 	if (req->file && !(req->flags & REQ_F_FIXED_FILE))
 		fput(req->file);
-	io_ring_drop_ctx_refs(req->ctx, 1);
+	percpu_ref_put(&req->ctx->refs);
 	kmem_cache_free(req_cachep, req);
 }
 
@@ -3630,7 +3622,7 @@  SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 		}
 	}
 
-	io_ring_drop_ctx_refs(ctx, 1);
+	percpu_ref_put(&ctx->refs);
 out_fput:
 	fdput(f);
 	return submitted ? submitted : ret;