Message ID | 20230127135227.3646353-2-dylany@meta.com
---|---
State | New |
Series | io_uring: force async only ops to go async
On 1/27/23 6:52 AM, Dylan Yudaken wrote:
> REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
> requests. Instead obey that flag.
>
> Signed-off-by: Dylan Yudaken <dylany@meta.com>
> ---
>  io_uring/io_uring.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index db623b3185c8..980ba4fda101 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb *req, bool *locked)
>  {
>         io_tw_lock(req->ctx, locked);
>         /* req->task == current here, checking PF_EXITING is safe */
> -       if (likely(!(req->task->flags & PF_EXITING)))
> -               io_queue_sqe(req);
> -       else
> +       if (unlikely(req->task->flags & PF_EXITING))
>                 io_req_defer_failed(req, -EFAULT);
> +       else if (req->flags & REQ_F_FORCE_ASYNC)
> +               io_queue_iowq(req, locked);
> +       else
> +               io_queue_sqe(req);
>  }
>
>  void io_req_task_queue_fail(struct io_kiocb *req, int ret)

This one causes a failure for me with test/multicqes_drain.t, which
doesn't quite make sense to me (just yet), but it is a reliable timeout.
On 1/29/23 3:57 PM, Jens Axboe wrote:
> On 1/27/23 6:52 AM, Dylan Yudaken wrote:
>> REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
>> requests. Instead obey that flag.
>>
>> Signed-off-by: Dylan Yudaken <dylany@meta.com>
>> ---
>>  io_uring/io_uring.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>> index db623b3185c8..980ba4fda101 100644
>> --- a/io_uring/io_uring.c
>> +++ b/io_uring/io_uring.c
>> @@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb *req, bool *locked)
>>  {
>>         io_tw_lock(req->ctx, locked);
>>         /* req->task == current here, checking PF_EXITING is safe */
>> -       if (likely(!(req->task->flags & PF_EXITING)))
>> -               io_queue_sqe(req);
>> -       else
>> +       if (unlikely(req->task->flags & PF_EXITING))
>>                 io_req_defer_failed(req, -EFAULT);
>> +       else if (req->flags & REQ_F_FORCE_ASYNC)
>> +               io_queue_iowq(req, locked);
>> +       else
>> +               io_queue_sqe(req);
>>  }
>>
>>  void io_req_task_queue_fail(struct io_kiocb *req, int ret)
>
> This one causes a failure for me with test/multicqes_drain.t, which
> doesn't quite make sense to me (just yet), but it is a reliable timeout.

OK, quick look and I think this is a bad assumption in the test case.
It's assuming that a POLL_ADD already succeeded, and hence that a
subsequent POLL_REMOVE will succeed. But now it's getting ENOENT as
we can't find it just yet, which means the cancelation itself isn't
being done. So we just end up waiting for something that doesn't
happen.

Or could be an internal race with lookup/issue. In any case, it's
definitely being exposed by this patch.
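[Editor's note: for context, here is a minimal sketch, not code from the thread, of the POLL_ADD/POLL_REMOVE pattern whose assumption breaks. It assumes liburing 2.2 or newer, where io_uring_prep_poll_remove() takes a 64-bit user_data; the fd and helper name are made up for illustration. If the poll request is punted to a worker and has not been armed yet, the removal's lookup can fail and its CQE carries -ENOENT, so a test that treats the cancel as guaranteed to succeed ends up waiting forever.]

/* Sketch only: why POLL_REMOVE can complete with -ENOENT. */
#include <liburing.h>
#include <poll.h>
#include <stdio.h>

#define POLL_UD 0x1234

static void poll_then_cancel(struct io_uring *ring, int fd)
{
        struct io_uring_sqe *sqe;
        struct io_uring_cqe *cqe;

        /* Queue a poll request keyed by POLL_UD. */
        sqe = io_uring_get_sqe(ring);
        io_uring_prep_poll_add(sqe, fd, POLLIN);
        io_uring_sqe_set_data64(sqe, POLL_UD);
        io_uring_submit(ring);

        /* Ask to cancel it. If the poll was sent to a worker and has not
         * been armed yet, the lookup can fail and the completion carries
         * -ENOENT rather than cancelling anything. */
        sqe = io_uring_get_sqe(ring);
        io_uring_prep_poll_remove(sqe, POLL_UD);
        io_uring_sqe_set_data64(sqe, 0);
        io_uring_submit(ring);

        io_uring_wait_cqe(ring, &cqe);
        if (cqe->user_data == 0 && cqe->res == -ENOENT)
                fprintf(stderr, "poll not armed yet; cancel must be retried\n");
        io_uring_cqe_seen(ring, cqe);
}

[A robust caller would retry the POLL_REMOVE, or simply wait for the poll CQE itself, when it sees -ENOENT.]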
On Sun, 2023-01-29 at 16:17 -0700, Jens Axboe wrote:
> On 1/29/23 3:57 PM, Jens Axboe wrote:
> > On 1/27/23 6:52 AM, Dylan Yudaken wrote:
> > > REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
> > > requests. Instead obey that flag.
> > >
> > > Signed-off-by: Dylan Yudaken <dylany@meta.com>
> > > ---
> > >  io_uring/io_uring.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> > > index db623b3185c8..980ba4fda101 100644
> > > --- a/io_uring/io_uring.c
> > > +++ b/io_uring/io_uring.c
> > > @@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb *req, bool *locked)
> > >  {
> > >         io_tw_lock(req->ctx, locked);
> > >         /* req->task == current here, checking PF_EXITING is safe */
> > > -       if (likely(!(req->task->flags & PF_EXITING)))
> > > -               io_queue_sqe(req);
> > > -       else
> > > +       if (unlikely(req->task->flags & PF_EXITING))
> > >                 io_req_defer_failed(req, -EFAULT);
> > > +       else if (req->flags & REQ_F_FORCE_ASYNC)
> > > +               io_queue_iowq(req, locked);
> > > +       else
> > > +               io_queue_sqe(req);
> > >  }
> > >
> > >  void io_req_task_queue_fail(struct io_kiocb *req, int ret)
> >
> > This one causes a failure for me with test/multicqes_drain.t, which
> > doesn't quite make sense to me (just yet), but it is a reliable
> > timeout.
>
> OK, quick look and I think this is a bad assumption in the test case.
> It's assuming that a POLL_ADD already succeeded, and hence that a
> subsequent POLL_REMOVE will succeed. But now it's getting ENOENT as
> we can't find it just yet, which means the cancelation itself isn't
> being done. So we just end up waiting for something that doesn't
> happen.
>
> Or could be an internal race with lookup/issue. In any case, it's
> definitely being exposed by this patch.
>

That is a bit of an unpleasant test.
Essentially it triggers a pipe, and reads from the pipe immediately
after. The test expects to see a CQE for that trigger; however, if
anything ran asynchronously then there is a race between the read and
the poll logic running.

The attached patch fixes the test, but the reason my patches trigger it
is a bit weird.

This occurs on the second loop of the test, after the initial drain.
Essentially ctx->drain_active is still true when the second set of
polls are added, since drain_active is only cleared inside the next
io_drain_req. So then the first poll will have REQ_F_FORCE_ASYNC set.

Previously those FORCE_ASYNC's were being ignored, but now with
"io_uring: if a linked request has REQ_F_FORCE_ASYNC then run it async"
they get sent to the work thread, which causes the race.

I wonder if drain_active should actually be cleared earlier? Perhaps
before setting the REQ_F_FORCE_ASYNC flag?
The drain logic is pretty complex though, so I am not terribly keen to
start changing it if it's not generally useful.
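[Editor's note: to make the race Dylan describes concrete, here is an illustrative liburing sketch, not the actual multicqes_drain.t code; the helper name and pipe setup are assumptions. A poll is armed on the read end of a pipe, the pipe is triggered, and the outcome depends on whether the pipe is drained before or after the poll logic runs.]

/* Sketch of the trigger/read ordering, under the assumptions above. */
#include <liburing.h>
#include <poll.h>
#include <unistd.h>

static void trigger_and_reap(struct io_uring *ring, int pipefd[2])
{
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
        struct io_uring_cqe *cqe;
        char c = 0;

        /* Arm a poll on the read side of the pipe. With a drained ring,
         * this may be punted to a worker instead of being armed inline. */
        io_uring_prep_poll_add(sqe, pipefd[0], POLLIN);
        io_uring_submit(ring);

        /* Trigger it. */
        write(pipefd[1], &c, 1);

        /* Racy order: read(pipefd[0], &c, 1) here could empty the pipe
         * before an asynchronously executed poll ever observes POLLIN,
         * so the expected CQE would never arrive. Waiting for the
         * completion first avoids that. */
        io_uring_wait_cqe(ring, &cqe);
        io_uring_cqe_seen(ring, cqe);

        /* Now it is safe to drain the pipe. */
        read(pipefd[0], &c, 1);
}

[Reaping the poll CQE before reading the pipe is one way a test can avoid depending on where the poll is executed.]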
On 1/30/23 3:45 AM, Dylan Yudaken wrote:
> On Sun, 2023-01-29 at 16:17 -0700, Jens Axboe wrote:
>> On 1/29/23 3:57 PM, Jens Axboe wrote:
>>> On 1/27/23 6:52 AM, Dylan Yudaken wrote:
>>>> REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
>>>> requests. Instead obey that flag.
>>>>
>>>> Signed-off-by: Dylan Yudaken <dylany@meta.com>
>>>> ---
>>>>  io_uring/io_uring.c | 8 +++++---
>>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>>> index db623b3185c8..980ba4fda101 100644
>>>> --- a/io_uring/io_uring.c
>>>> +++ b/io_uring/io_uring.c
>>>> @@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb *req, bool *locked)
>>>>  {
>>>>         io_tw_lock(req->ctx, locked);
>>>>         /* req->task == current here, checking PF_EXITING is safe */
>>>> -       if (likely(!(req->task->flags & PF_EXITING)))
>>>> -               io_queue_sqe(req);
>>>> -       else
>>>> +       if (unlikely(req->task->flags & PF_EXITING))
>>>>                 io_req_defer_failed(req, -EFAULT);
>>>> +       else if (req->flags & REQ_F_FORCE_ASYNC)
>>>> +               io_queue_iowq(req, locked);
>>>> +       else
>>>> +               io_queue_sqe(req);
>>>>  }
>>>>
>>>>  void io_req_task_queue_fail(struct io_kiocb *req, int ret)
>>>
>>> This one causes a failure for me with test/multicqes_drain.t, which
>>> doesn't quite make sense to me (just yet), but it is a reliable
>>> timeout.
>>
>> OK, quick look and I think this is a bad assumption in the test case.
>> It's assuming that a POLL_ADD already succeeded, and hence that a
>> subsequent POLL_REMOVE will succeed. But now it's getting ENOENT as
>> we can't find it just yet, which means the cancelation itself isn't
>> being done. So we just end up waiting for something that doesn't
>> happen.
>>
>> Or could be an internal race with lookup/issue. In any case, it's
>> definitely being exposed by this patch.
>>
>
> That is a bit of an unpleasant test.
> Essentially it triggers a pipe, and reads from the pipe immediately
> after. The test expects to see a CQE for that trigger; however, if
> anything ran asynchronously then there is a race between the read and
> the poll logic running.
>
> The attached patch fixes the test, but the reason my patches trigger it
> is a bit weird.
>
> This occurs on the second loop of the test, after the initial drain.
> Essentially ctx->drain_active is still true when the second set of
> polls are added, since drain_active is only cleared inside the next
> io_drain_req. So then the first poll will have REQ_F_FORCE_ASYNC set.
>
> Previously those FORCE_ASYNC's were being ignored, but now with
> "io_uring: if a linked request has REQ_F_FORCE_ASYNC then run it async"
> they get sent to the work thread, which causes the race.
>
> I wonder if drain_active should actually be cleared earlier? Perhaps
> before setting the REQ_F_FORCE_ASYNC flag?
> The drain logic is pretty complex though, so I am not terribly keen to
> start changing it if it's not generally useful.

Pavel, any input on the drain logic? I think you know that part the
best.
On 1/30/23 15:53, Jens Axboe wrote:
> On 1/30/23 3:45 AM, Dylan Yudaken wrote:
>> On Sun, 2023-01-29 at 16:17 -0700, Jens Axboe wrote:
>>> On 1/29/23 3:57 PM, Jens Axboe wrote:
>>>> On 1/27/23 6:52 AM, Dylan Yudaken wrote:
>>>>> REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
>>>>> requests. Instead obey that flag.
>>>>>
>>>>> Signed-off-by: Dylan Yudaken <dylany@meta.com>
>>>>> ---
>>>>>  io_uring/io_uring.c | 8 +++++---
>>>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>>>> index db623b3185c8..980ba4fda101 100644
>>>>> --- a/io_uring/io_uring.c
>>>>> +++ b/io_uring/io_uring.c
>>>>> @@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb *req, bool *locked)
>>>>>  {
>>>>>         io_tw_lock(req->ctx, locked);
>>>>>         /* req->task == current here, checking PF_EXITING is safe */
>>>>> -       if (likely(!(req->task->flags & PF_EXITING)))
>>>>> -               io_queue_sqe(req);
>>>>> -       else
>>>>> +       if (unlikely(req->task->flags & PF_EXITING))
>>>>>                 io_req_defer_failed(req, -EFAULT);
>>>>> +       else if (req->flags & REQ_F_FORCE_ASYNC)
>>>>> +               io_queue_iowq(req, locked);
>>>>> +       else
>>>>> +               io_queue_sqe(req);
>>>>>  }
>>>>>
>>>>>  void io_req_task_queue_fail(struct io_kiocb *req, int ret)
>>>>
>>>> This one causes a failure for me with test/multicqes_drain.t, which
>>>> doesn't quite make sense to me (just yet), but it is a reliable
>>>> timeout.
>>>
>>> OK, quick look and I think this is a bad assumption in the test case.
>>> It's assuming that a POLL_ADD already succeeded, and hence that a
>>> subsequent POLL_REMOVE will succeed. But now it's getting ENOENT as
>>> we can't find it just yet, which means the cancelation itself isn't
>>> being done. So we just end up waiting for something that doesn't
>>> happen.
>>>
>>> Or could be an internal race with lookup/issue. In any case, it's
>>> definitely being exposed by this patch.
>>>
>>
>> That is a bit of an unpleasant test.
>> Essentially it triggers a pipe, and reads from the pipe immediately
>> after. The test expects to see a CQE for that trigger; however, if
>> anything ran asynchronously then there is a race between the read and
>> the poll logic running.
>>
>> The attached patch fixes the test, but the reason my patches trigger it
>> is a bit weird.
>>
>> This occurs on the second loop of the test, after the initial drain.
>> Essentially ctx->drain_active is still true when the second set of
>> polls are added, since drain_active is only cleared inside the next
>> io_drain_req. So then the first poll will have REQ_F_FORCE_ASYNC set.
>>
>> Previously those FORCE_ASYNC's were being ignored, but now with
>> "io_uring: if a linked request has REQ_F_FORCE_ASYNC then run it async"
>> they get sent to the work thread, which causes the race.

And that sounds like a userspace problem; any request might be executed
async on an io_uring whim.

>> I wonder if drain_active should actually be cleared earlier? Perhaps
>> before setting the REQ_F_FORCE_ASYNC flag?
>> The drain logic is pretty complex though, so I am not terribly keen to
>> start changing it if it's not generally useful.

No, that won't work. Since a drain request forces the ring to delay all
future requests until all previous requests are completed, we can't skip
draining checks based on the state of the currently prepared request.
Draining is currently out of hot paths, and I'm happy about it.
There might be some ways around it: we could use a new flag instead of
REQ_F_FORCE_ASYNC to force the request into the slow path and return it
to the normal path if the draining turns out not to be needed, but we
don't care and really should not care. I'd argue even further that it's
time to mark DRAIN deprecated: it's too slow, doesn't fit the io_uring
model well, and has edge-case behaviour problems.

> Pavel, any input on the drain logic? I think you know that part the
> best.
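[Editor's note: for readers less familiar with draining, here is a small sketch of the IOSQE_IO_DRAIN semantics Pavel is referring to; it is an assumed example, not code from the thread, and the fd, buffer, and helper name are placeholders. The drained request is held back until everything submitted before it has completed, and everything submitted after it is ordered behind it, which is why draining pushes requests onto the slow path.]

/* Sketch of a drain barrier between two reads, under the assumptions above. */
#include <liburing.h>

static void drain_barrier_example(struct io_uring *ring, int fd,
                                  char *buf, unsigned len)
{
        struct io_uring_sqe *sqe;

        /* Some in-flight work. */
        sqe = io_uring_get_sqe(ring);
        io_uring_prep_read(sqe, fd, buf, len, 0);

        /* Drain barrier: held back until the read above has completed. */
        sqe = io_uring_get_sqe(ring);
        io_uring_prep_nop(sqe);
        sqe->flags |= IOSQE_IO_DRAIN;

        /* Submitted after the barrier, so ordered behind it as well. */
        sqe = io_uring_get_sqe(ring);
        io_uring_prep_read(sqe, fd, buf, len, 0);

        io_uring_submit(ring);
}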
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index db623b3185c8..980ba4fda101 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb *req, bool *locked)
 {
        io_tw_lock(req->ctx, locked);
        /* req->task == current here, checking PF_EXITING is safe */
-       if (likely(!(req->task->flags & PF_EXITING)))
-               io_queue_sqe(req);
-       else
+       if (unlikely(req->task->flags & PF_EXITING))
                io_req_defer_failed(req, -EFAULT);
+       else if (req->flags & REQ_F_FORCE_ASYNC)
+               io_queue_iowq(req, locked);
+       else
+               io_queue_sqe(req);
 }

 void io_req_task_queue_fail(struct io_kiocb *req, int ret)
REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
requests. Instead obey that flag.

Signed-off-by: Dylan Yudaken <dylany@meta.com>
---
 io_uring/io_uring.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
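[Editor's note: for reference, a hedged userspace sketch of the situation the commit message describes; the fds, buffer, and helper name are illustrative assumptions. IOSQE_ASYNC on an SQE is what sets REQ_F_FORCE_ASYNC in the kernel, and when such a request sits behind a link it is re-queued from task work once the head completes; with this patch that re-queue honours the flag and punts the request to io-wq instead of issuing it inline.]

/* Sketch: a linked request that also carries IOSQE_ASYNC. */
#include <liburing.h>

static void linked_forced_async(struct io_uring *ring, int fd,
                                char *buf, unsigned len)
{
        struct io_uring_sqe *sqe;

        sqe = io_uring_get_sqe(ring);
        io_uring_prep_read(sqe, fd, buf, len, 0);
        sqe->flags |= IOSQE_IO_LINK;    /* head of the link */

        sqe = io_uring_get_sqe(ring);
        io_uring_prep_write(sqe, fd, buf, len, 0);
        sqe->flags |= IOSQE_ASYNC;      /* force async execution */

        io_uring_submit(ring);
}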