[1/1] io_uring: more graceful request alloc OOM

Message ID 3e79156a106e8b5b3646672656f738ba157957ef.1684505086.git.asml.silence@gmail.com (mailing list archive)
State New
Series [1/1] io_uring: more graceful request alloc OOM

Commit Message

Pavel Begunkov May 19, 2023, 2:05 p.m. UTC
It's OK for io_uring request allocation to fail; however, there are
reports that it starts killing tasks instead of just returning an
error back to userspace. Add __GFP_NORETRY so it doesn't trigger the
OOM killer.

Cc: stable@vger.kernel.org
Fixes: 2b188cc1bb857 ("Add io_uring IO interface")
Reported-by: yang lan <lanyang0908@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/io_uring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
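
For context, here is a minimal sketch of the pattern the patch relies
on. It is illustrative only, not the kernel source: the fallback logic
is simplified, the batch size is a placeholder, and names such as
req_cachep and IO_REQ_ALLOC_BATCH are borrowed from the io_uring code
shown in the diff below. With __GFP_NORETRY, the allocator still
performs direct reclaim but gives up and returns failure under memory
pressure instead of retrying until the OOM killer fires:

#include <linux/kernel.h>
#include <linux/slab.h>

#define IO_REQ_ALLOC_BATCH 8		/* placeholder; real value is in io_uring */

static struct kmem_cache *req_cachep;	/* the io_kiocb slab cache */

static bool alloc_req_batch_sketch(void)
{
	gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
	void *reqs[IO_REQ_ALLOC_BATCH];
	int ret;

	ret = kmem_cache_alloc_bulk(req_cachep, gfp, ARRAY_SIZE(reqs), reqs);
	if (ret <= 0) {
		/* The bulk allocation failed quietly (no OOM kill, no
		 * warning); fall back to a single object before giving
		 * up and letting the error propagate to userspace.
		 */
		reqs[0] = kmem_cache_alloc(req_cachep, gfp);
		if (!reqs[0])
			return false;
		ret = 1;
	}
	/* ... stash the ret allocated requests in the submit-state
	 * cache ...
	 */
	return true;
}

The key point is that __GFP_NORETRY only bounds reclaim effort: the
allocation is not atomic, it is simply allowed to fail early so the
caller can degrade gracefully.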

Comments

Jens Axboe May 20, 2023, 1:57 a.m. UTC | #1
On Fri, 19 May 2023 15:05:14 +0100, Pavel Begunkov wrote:
> It's OK for io_uring request allocation to fail; however, there are
> reports that it starts killing tasks instead of just returning an
> error back to userspace. Add __GFP_NORETRY so it doesn't trigger the
> OOM killer.
> 

Applied, thanks!

[1/1] io_uring: more graceful request alloc OOM
      (no commit info)

Best regards,
Jens Axboe
yang lan May 20, 2023, 9:38 a.m. UTC | #2
Hi,

Thanks for your response.

But I applied this patch to LTS kernel 5.10.180, and it can still trigger this bug.

--- io_uring/io_uring.c.back    2023-05-20 17:11:25.870550438 +0800
+++ io_uring/io_uring.c 2023-05-20 16:35:24.265846283 +0800
@@ -1970,7 +1970,7 @@
static struct io_kiocb *io_alloc_req(struct io_ring_ctx *ctx)
        __must_hold(&ctx->uring_lock)
 {
        struct io_submit_state *state = &ctx->submit_state;
-       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
+       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
        int ret, i;

        BUILD_BUG_ON(ARRAY_SIZE(state->reqs) < IO_REQ_ALLOC_BATCH);

io_uring.c.back is the original file.
Did I apply the patch wrong?

Regards,

Yang

Pavel Begunkov May 22, 2023, 12:40 a.m. UTC | #3
On 5/20/23 10:38, yang lan wrote:
> Hi,
> 
> Thanks for your response.
> 
> But I applied this patch to LTS kernel 5.10.180, and it can still trigger this bug.
> 
> --- io_uring/io_uring.c.back    2023-05-20 17:11:25.870550438 +0800
> +++ io_uring/io_uring.c 2023-05-20 16:35:24.265846283 +0800
> @@ -1970,7 +1970,7 @@
> static struct io_kiocb *io_alloc_req(struct io_ring_ctx *ctx)
>          __must_hold(&ctx->uring_lock)
>   {
>          struct io_submit_state *state = &ctx->submit_state;
> -       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
> +       gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
>          int ret, i;
> 
>          BUILD_BUG_ON(ARRAY_SIZE(state->reqs) < IO_REQ_ALLOC_BATCH);
> 
> io_uring.c.back is the original file.
> Did I apply the patch wrong?

The patch looks fine. I ran a self-written test with 6.4 before
sending, and it worked as expected. I need to run the syz test;
maybe the failure shifted to another spot, e.g. into provided
buffers.
yang lan May 22, 2023, 7:55 a.m. UTC | #4
Hi,

Thanks. I'm also analyzing the root cause of this bug.

By the way, can I apply for a CVE? And should I submit a request to
an official organization, such as Openwall?

Regards,

Yang

Pavel Begunkov May 23, 2023, 12:08 p.m. UTC | #5
On 5/22/23 08:55, yang lan wrote:
> Hi,
> 
> Thanks. I'm also analyzing the root cause of this bug.

The repro does indeed still trigger, this time in another place.
Though when I patch all of those spots, it fails somewhere else
again, like in ext4 on a page fault.

We could add a couple more of those "don't OOM, just fail"
allocations and some niceness around them, but I think it's fine
as it is, since the allocations are accounted to cgroups. If the
admin cares about collisions between users, there should be
cgroups, so the allocations of one user don't affect another. And
if a user pushes its own cgroup to the limit and OOMs itself,
that should be fine.
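
As a minimal illustration of the accounting referred to above (a
sketch, assuming a slab cache created with SLAB_ACCOUNT, which is
how io_uring's request cache is set up as far as I know): every
object allocated from such a cache is charged to the memory cgroup
of the allocating task, so that cgroup's memory.max bounds what the
task can pin here:

/* Sketch: SLAB_ACCOUNT charges each object allocated from the cache
 * to the allocating task's memory cgroup, so one user's allocations
 * are bounded by its own cgroup limit and don't drain another's.
 */
req_cachep = kmem_cache_create("io_kiocb", sizeof(struct io_kiocb), 0,
			       SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT, NULL);

With that in place, hitting the limit produces an OOM contained to
the offending cgroup rather than affecting the whole system.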

> By the way, can I apply for a CVE? And should I submit a request to
> an official organization, such as Openwall?

Sorry, but we cannot help you here. We don't deal with CVEs.

That aside, I'm not even sure it's CVE-worthy, because it
shouldn't be exploitable in a properly configured environment
(unless it is). But I'm not an expert in that area, so I might be
wrong.
yang lan May 24, 2023, 3:15 a.m. UTC | #6
Hi,

Thanks.

Regards,

Yang

Patch

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index dab09f568294..ad34a4320dab 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1073,7 +1073,7 @@ static void io_flush_cached_locked_reqs(struct io_ring_ctx *ctx,
 __cold bool __io_alloc_req_refill(struct io_ring_ctx *ctx)
 	__must_hold(&ctx->uring_lock)
 {
-	gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
+	gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY;
 	void *reqs[IO_REQ_ALLOC_BATCH];
 	int ret, i;
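
To make the userspace-visible effect concrete, here is a small
liburing sketch (the calls below are liburing's public API; the exact
errno seen under memory pressure may vary by kernel version). With
the fix, a failed request allocation surfaces as a negative return
from submission instead of the task being OOM-killed:

#include <stdio.h>
#include <liburing.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	int ret;

	ret = io_uring_queue_init(64, &ring, 0);
	if (ret < 0) {
		fprintf(stderr, "queue_init failed: %d\n", ret);
		return 1;
	}

	sqe = io_uring_get_sqe(&ring);
	if (!sqe)
		return 1;	/* submission queue full */
	io_uring_prep_nop(sqe);

	/* Under memory pressure, request allocation in the kernel now
	 * fails gracefully, and submission returns a negative errno
	 * here rather than the process being killed by the OOM killer.
	 */
	ret = io_uring_submit(&ring);
	if (ret < 0)
		fprintf(stderr, "submit failed: %d\n", ret);

	io_uring_queue_exit(&ring);
	return 0;
}

Build with something like: cc test.c -o test -luring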