
[for-next,1/4] io_uring: if a linked request has REQ_F_FORCE_ASYNC then run it async

Message ID 20230127135227.3646353-2-dylany@meta.com (mailing list archive)
State New
Series io_uring: force async only ops to go async

Commit Message

Dylan Yudaken Jan. 27, 2023, 1:52 p.m. UTC
REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
requests. Instead obey that flag.

Signed-off-by: Dylan Yudaken <dylany@meta.com>
---
 io_uring/io_uring.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
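
For context, a minimal liburing sketch of the situation the fix targets (illustrative only, not part of the patch; the helper name is made up and error handling is omitted). The tail of a link carries IOSQE_ASYNC, which maps to REQ_F_FORCE_ASYNC in the kernel; when the link is re-queued after the head completes, that flag was previously ignored and the tail could be issued inline, whereas with this patch it is punted to io-wq as requested.

#include <liburing.h>

/* Illustrative only: a two-request link whose tail asks to be run async. */
static void submit_linked_async(struct io_uring *ring, int fd,
				void *buf, unsigned len)
{
	struct io_uring_sqe *sqe;

	sqe = io_uring_get_sqe(ring);
	io_uring_prep_read(sqe, fd, buf, len, 0);
	sqe->flags |= IOSQE_IO_LINK;	/* head of the link */

	sqe = io_uring_get_sqe(ring);
	io_uring_prep_write(sqe, fd, buf, len, 0);
	sqe->flags |= IOSQE_ASYNC;	/* becomes REQ_F_FORCE_ASYNC in the kernel */

	io_uring_submit(ring);
}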

Comments

Jens Axboe Jan. 29, 2023, 10:57 p.m. UTC | #1
On 1/27/23 6:52 AM, Dylan Yudaken wrote:
> REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
> requests. Instead obey that flag.
> 
> Signed-off-by: Dylan Yudaken <dylany@meta.com>
> ---
>  io_uring/io_uring.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index db623b3185c8..980ba4fda101 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb *req, bool *locked)
>  {
>  	io_tw_lock(req->ctx, locked);
>  	/* req->task == current here, checking PF_EXITING is safe */
> -	if (likely(!(req->task->flags & PF_EXITING)))
> -		io_queue_sqe(req);
> -	else
> +	if (unlikely(req->task->flags & PF_EXITING))
>  		io_req_defer_failed(req, -EFAULT);
> +	else if (req->flags & REQ_F_FORCE_ASYNC)
> +		io_queue_iowq(req, locked);
> +	else
> +		io_queue_sqe(req);
>  }
>  
>  void io_req_task_queue_fail(struct io_kiocb *req, int ret)

This one causes a failure for me with test/multicqes_drain.t, which
doesn't quite make sense to me (just yet), but it is a reliable timeout.
Jens Axboe Jan. 29, 2023, 11:17 p.m. UTC | #2
On 1/29/23 3:57 PM, Jens Axboe wrote:
> On 1/27/23 6:52 AM, Dylan Yudaken wrote:
>> REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
>> requests. Instead obey that flag.
>>
>> Signed-off-by: Dylan Yudaken <dylany@meta.com>
>> ---
>>  io_uring/io_uring.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>> index db623b3185c8..980ba4fda101 100644
>> --- a/io_uring/io_uring.c
>> +++ b/io_uring/io_uring.c
>> @@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb *req, bool *locked)
>>  {
>>  	io_tw_lock(req->ctx, locked);
>>  	/* req->task == current here, checking PF_EXITING is safe */
>> -	if (likely(!(req->task->flags & PF_EXITING)))
>> -		io_queue_sqe(req);
>> -	else
>> +	if (unlikely(req->task->flags & PF_EXITING))
>>  		io_req_defer_failed(req, -EFAULT);
>> +	else if (req->flags & REQ_F_FORCE_ASYNC)
>> +		io_queue_iowq(req, locked);
>> +	else
>> +		io_queue_sqe(req);
>>  }
>>  
>>  void io_req_task_queue_fail(struct io_kiocb *req, int ret)
> 
> This one causes a failure for me with test/multicqes_drain.t, which
> doesn't quite make sense to me (just yet), but it is a reliable timeout.

OK, quick look and I think this is a bad assumption in the test case.
It's assuming that a POLL_ADD already succeeded, and hence that a
subsequent POLL_REMOVE will succeed. But now it's getting ENOENT as
we can't find it just yet, which means the cancelation itself isn't
being done. So we just end up waiting for something that doesn't happen.

Or could be an internal race with lookup/issue. In any case, it's
definitely being exposed by this patch.
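
For illustration, a minimal sketch of the assumption Jens describes (not the actual test code; assumes a recent liburing where io_uring_prep_poll_remove() takes a u64 and io_uring_sqe_set_data64() is available): a POLL_ADD is submitted and a POLL_REMOVE keyed on the same user_data follows straight away. If the POLL_ADD was punted async and hasn't been armed yet, the remove completes with -ENOENT and nothing gets cancelled.

#include <liburing.h>
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	int fds[2];

	io_uring_queue_init(8, &ring, 0);
	pipe(fds);

	/* Arm a poll on the read side of the pipe, tagged user_data 1. */
	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_poll_add(sqe, fds[0], POLLIN);
	io_uring_sqe_set_data64(sqe, 1);
	io_uring_submit(&ring);

	/* Immediately try to cancel it. If the POLL_ADD was punted to
	 * io-wq (e.g. due to drain-induced REQ_F_FORCE_ASYNC) and hasn't
	 * been armed yet, the lookup finds nothing. */
	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_poll_remove(sqe, 1);
	io_uring_sqe_set_data64(sqe, 2);
	io_uring_submit(&ring);

	io_uring_wait_cqe(&ring, &cqe);
	if (cqe->user_data == 2 && cqe->res == -ENOENT)
		fprintf(stderr, "poll not found yet: the race described above\n");
	io_uring_cqe_seen(&ring, cqe);

	io_uring_queue_exit(&ring);
	return 0;
}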
Dylan Yudaken Jan. 30, 2023, 10:45 a.m. UTC | #3
On Sun, 2023-01-29 at 16:17 -0700, Jens Axboe wrote:
> On 1/29/23 3:57 PM, Jens Axboe wrote:
> > On 1/27/23 6:52 AM, Dylan Yudaken wrote:
> > > REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
> > > requests. Instead obey that flag.
> > > 
> > > Signed-off-by: Dylan Yudaken <dylany@meta.com>
> > > ---
> > >  io_uring/io_uring.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> > > index db623b3185c8..980ba4fda101 100644
> > > --- a/io_uring/io_uring.c
> > > +++ b/io_uring/io_uring.c
> > > @@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb
> > > *req, bool *locked)
> > >  {
> > >         io_tw_lock(req->ctx, locked);
> > >         /* req->task == current here, checking PF_EXITING is safe
> > > */
> > > -       if (likely(!(req->task->flags & PF_EXITING)))
> > > -               io_queue_sqe(req);
> > > -       else
> > > +       if (unlikely(req->task->flags & PF_EXITING))
> > >                 io_req_defer_failed(req, -EFAULT);
> > > +       else if (req->flags & REQ_F_FORCE_ASYNC)
> > > +               io_queue_iowq(req, locked);
> > > +       else
> > > +               io_queue_sqe(req);
> > >  }
> > >  
> > >  void io_req_task_queue_fail(struct io_kiocb *req, int ret)
> > 
> > This one causes a failure for me with test/multicqes_drain.t, which
> > doesn't quite make sense to me (just yet), but it is a reliable
> > timeout.
> 
> OK, quick look and I think this is a bad assumption in the test case.
> It's assuming that a POLL_ADD already succeeded, and hence that a
> subsequent POLL_REMOVE will succeed. But now it's getting ENOENT as
> we can't find it just yet, which means the cancelation itself isn't
> being done. So we just end up waiting for something that doesn't
> happen.
> 
> Or could be an internal race with lookup/issue. In any case, it's
> definitely being exposed by this patch.
> 

That is a bit of an unpleasant test.
Essentially it triggers a pipe and then reads from the pipe immediately
after. The test expects to see a CQE for that trigger; however, if
anything ran asynchronously then there is a race between the read and
the poll logic running.

The attached patch fixes the test, but the reason my patches trigger it
is a bit weird.

This occurs on the second loop of the test, after the initial drain.
Essentially ctx->drain_active is still true when the second set of
polls are added, since drain_active is only cleared inside the next
io_drain_req. So then the first poll will have REQ_F_FORCE_ASYNC set.

Previously those FORCE_ASYNC's were being ignored, but now with
"io_uring: if a linked request has REQ_F_FORCE_ASYNC then run it async"
they get sent to the work thread, which causes the race. 

I wonder if drain_active should actually be cleared earlier? Perhaps
before setting the REQ_F_FORCE_ASYNC flag?
The drain logic is pretty complex though, so I am not terribly keen to
start changing it if it's not generally useful.
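
A sketch of the trigger/read sequence described above (illustrative, not the test itself; assume a poll on fds[0] was submitted earlier but may still be sitting in io-wq because drain_active set REQ_F_FORCE_ASYNC):

#include <unistd.h>

static void trigger_event(int fds[2])
{
	char c = 'x';

	write(fds[1], &c, 1);	/* should make the pending poll fire */
	read(fds[0], &c, 1);	/* ...but drains the pipe right away */

	/*
	 * If the poll request only executes on the io-wq worker after the
	 * read above has emptied the pipe, it sees no POLLIN, arms itself
	 * and sleeps -- so the CQE the test expects is never posted.
	 */
}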
Jens Axboe Jan. 30, 2023, 3:53 p.m. UTC | #4
On 1/30/23 3:45 AM, Dylan Yudaken wrote:
> On Sun, 2023-01-29 at 16:17 -0700, Jens Axboe wrote:
>> On 1/29/23 3:57 PM, Jens Axboe wrote:
>>> On 1/27/23 6:52 AM, Dylan Yudaken wrote:
>>>> REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
>>>> requests. Instead obey that flag.
>>>>
>>>> Signed-off-by: Dylan Yudaken <dylany@meta.com>
>>>> ---
>>>>  io_uring/io_uring.c | 8 +++++---
>>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>>> index db623b3185c8..980ba4fda101 100644
>>>> --- a/io_uring/io_uring.c
>>>> +++ b/io_uring/io_uring.c
>>>> @@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb
>>>> *req, bool *locked)
>>>>  {
>>>>         io_tw_lock(req->ctx, locked);
>>>>         /* req->task == current here, checking PF_EXITING is safe
>>>> */
>>>> -       if (likely(!(req->task->flags & PF_EXITING)))
>>>> -               io_queue_sqe(req);
>>>> -       else
>>>> +       if (unlikely(req->task->flags & PF_EXITING))
>>>>                 io_req_defer_failed(req, -EFAULT);
>>>> +       else if (req->flags & REQ_F_FORCE_ASYNC)
>>>> +               io_queue_iowq(req, locked);
>>>> +       else
>>>> +               io_queue_sqe(req);
>>>>  }
>>>>  
>>>>  void io_req_task_queue_fail(struct io_kiocb *req, int ret)
>>>
>>> This one causes a failure for me with test/multicqes_drain.t, which
>>> doesn't quite make sense to me (just yet), but it is a reliable
>>> timeout.
>>
>> OK, quick look and I think this is a bad assumption in the test case.
>> It's assuming that a POLL_ADD already succeeded, and hence that a
>> subsequent POLL_REMOVE will succeed. But now it's getting ENOENT as
>> we can't find it just yet, which means the cancelation itself isn't
>> being done. So we just end up waiting for something that doesn't
>> happen.
>>
>> Or could be an internal race with lookup/issue. In any case, it's
>> definitely being exposed by this patch.
>>
> 
> That is a bit of an unpleasant test.
> Essentially it triggers a pipe and then reads from the pipe immediately
> after. The test expects to see a CQE for that trigger; however, if
> anything ran asynchronously then there is a race between the read and
> the poll logic running.
> 
> The attached patch fixes the test, but the reason my patches trigger it
> is a bit weird.
> 
> This occurs on the second loop of the test, after the initial drain.
> Essentially ctx->drain_active is still true when the second set of
> polls are added, since drain_active is only cleared inside the next
> io_drain_req. So then the first poll will have REQ_F_FORCE_ASYNC set.
> 
> Previously those FORCE_ASYNC's were being ignored, but now with
> "io_uring: if a linked request has REQ_F_FORCE_ASYNC then run it async"
> they get sent to the work thread, which causes the race. 
> 
> I wonder if drain_active should actually be cleared earlier? Perhaps
> before setting the REQ_F_FORCE_ASYNC flag?
> The drain logic is pretty complex though, so I am not terribly keen to
> start changing it if it's not generally useful.

Pavel, any input on the drain logic? I think you know that part the
best.
Pavel Begunkov Jan. 30, 2023, 4:21 p.m. UTC | #5
On 1/30/23 15:53, Jens Axboe wrote:
> On 1/30/23 3:45 AM, Dylan Yudaken wrote:
>> On Sun, 2023-01-29 at 16:17 -0700, Jens Axboe wrote:
>>> On 1/29/23 3:57 PM, Jens Axboe wrote:
>>>> On 1/27/23 6:52 AM, Dylan Yudaken wrote:
>>>>> REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
>>>>> requests. Instead obey that flag.
>>>>>
>>>>> Signed-off-by: Dylan Yudaken <dylany@meta.com>
>>>>> ---
>>>>>   io_uring/io_uring.c | 8 +++++---
>>>>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>>>> index db623b3185c8..980ba4fda101 100644
>>>>> --- a/io_uring/io_uring.c
>>>>> +++ b/io_uring/io_uring.c
>>>>> @@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb
>>>>> *req, bool *locked)
>>>>>   {
>>>>>          io_tw_lock(req->ctx, locked);
>>>>>          /* req->task == current here, checking PF_EXITING is safe
>>>>> */
>>>>> -       if (likely(!(req->task->flags & PF_EXITING)))
>>>>> -               io_queue_sqe(req);
>>>>> -       else
>>>>> +       if (unlikely(req->task->flags & PF_EXITING))
>>>>>                  io_req_defer_failed(req, -EFAULT);
>>>>> +       else if (req->flags & REQ_F_FORCE_ASYNC)
>>>>> +               io_queue_iowq(req, locked);
>>>>> +       else
>>>>> +               io_queue_sqe(req);
>>>>>   }
>>>>>   
>>>>>   void io_req_task_queue_fail(struct io_kiocb *req, int ret)
>>>>
>>>> This one causes a failure for me with test/multicqes_drain.t, which
>>>> doesn't quite make sense to me (just yet), but it is a reliable
>>>> timeout.
>>>
>>> OK, quick look and I think this is a bad assumption in the test case.
>>> It's assuming that a POLL_ADD already succeeded, and hence that a
>>> subsequent POLL_REMOVE will succeed. But now it's getting ENOENT as
>>> we can't find it just yet, which means the cancelation itself isn't
>>> being done. So we just end up waiting for something that doesn't
>>> happen.
>>>
>>> Or could be an internal race with lookup/issue. In any case, it's
>>> definitely being exposed by this patch.
>>>
>>
>> That is a bit of an unpleasant test.
>> Essentially it triggers a pipe and then reads from the pipe immediately
>> after. The test expects to see a CQE for that trigger; however, if
>> anything ran asynchronously then there is a race between the read and
>> the poll logic running.
>>
>> The attached patch fixes the test, but the reason my patches trigger it
>> is a bit weird.
>>
>> This occurs on the second loop of the test, after the initial drain.
>> Essentially ctx->drain_active is still true when the second set of
>> polls are added, since drain_active is only cleared inside the next
>> io_drain_req. So then the first poll will have REQ_F_FORCE_ASYNC set.
>>
>> Previously those FORCE_ASYNC's were being ignored, but now with
>> "io_uring: if a linked request has REQ_F_FORCE_ASYNC then run it async"
>> they get sent to the work thread, which causes the race.

And that sounds like a userspace problem; any request might be executed
async on an io_uring whim.

>> I wonder if drain_active should actually be cleared earlier? Perhaps
>> before setting the REQ_F_FORCE_ASYNC flag?
>> The drain logic is pretty complex though, so I am not terribly keen to
>> start changing it if it's not generally useful.

No, that won't work. As a drain request forces the ring to delay all
future requests until all previous requests are completed, we can't
skip draining checks based on the state of the currently prepared request.

Draining is currently out of the hot paths, and I'm happy about it. There
might be some ways around it: we could use a new flag instead of
REQ_F_FORCE_ASYNC to force it into the slow path and return the request
to the normal path if draining is not needed, but we don't care and
really should not care. I'd argue even further: it's time to mark DRAIN
deprecated; it's too slow, doesn't fit the io_uring model well, and has
edge case behaviour problems.

> Pavel, any input on the drain logic? I think you know that part the
> best.
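
For reference, a short liburing sketch of the drain semantics described here (illustrative only; the helper name is made up): a request flagged IOSQE_IO_DRAIN is not issued until everything submitted before it has completed, and everything submitted after it is held back behind it.

#include <liburing.h>

static void queue_with_drain(struct io_uring *ring, int fd,
			     void *buf, unsigned len)
{
	struct io_uring_sqe *sqe;

	/* A: ordinary read, completes whenever it completes. */
	sqe = io_uring_get_sqe(ring);
	io_uring_prep_read(sqe, fd, buf, len, 0);

	/* B: drain barrier. Not issued until A (and any other prior
	 * request) has completed. */
	sqe = io_uring_get_sqe(ring);
	io_uring_prep_nop(sqe);
	sqe->flags |= IOSQE_IO_DRAIN;

	/* C: held back behind B. Per the discussion above, while a drain
	 * is pending ctx->drain_active stays set and requests get
	 * REQ_F_FORCE_ASYNC, which is what exposed the test race. */
	sqe = io_uring_get_sqe(ring);
	io_uring_prep_read(sqe, fd, buf, len, 0);

	io_uring_submit(ring);
}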

Patch

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index db623b3185c8..980ba4fda101 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb *req, bool *locked)
 {
 	io_tw_lock(req->ctx, locked);
 	/* req->task == current here, checking PF_EXITING is safe */
-	if (likely(!(req->task->flags & PF_EXITING)))
-		io_queue_sqe(req);
-	else
+	if (unlikely(req->task->flags & PF_EXITING))
 		io_req_defer_failed(req, -EFAULT);
+	else if (req->flags & REQ_F_FORCE_ASYNC)
+		io_queue_iowq(req, locked);
+	else
+		io_queue_sqe(req);
 }
 
 void io_req_task_queue_fail(struct io_kiocb *req, int ret)