diff mbox series

side effect from "aio: don't zero entire aio_kiocb aio_get_req"

Message ID CAOg9mSRrtYEw62We=qfYj3tEsscOZ-pGaQHw6BWadBEU6k8CMA@mail.gmail.com (mailing list archive)
State New, archived
Headers show
Series side effect from "aio: don't zero entire aio_kiocb aio_get_req" | expand

Commit Message

Mike Marshall Feb. 2, 2019, 2:29 a.m. UTC
We've been working on our Orangefs page cache patch with blinders on,
and last week I took our patch set which was based on 4.19-rc7
and applied it to 5.0-rc3.

In the process I ran vanilla rc3, and rc3 plus an Orangefs related
patch set that Christoph Hellwig sent in, through the
suite of xfstests.

It turns out that a patch from one of Jens Axboe's patch sets
that came, I think, in the 5.0 merge window triggered a BUG_ON in Orangefs'
file.c. The particular patch is "aio: don't zero entire aio_kiocb aio_get_req".

This code is in Orangefs file.c in a couple of places:

BUG_ON(iocb->private);

Anywho... I can easily fix the Orangefs problem by removing the two
BUG_ON statements, I've researched how they got there and they
are vestigial, just the kind of thing that Linus hates :-).

The bigger question is that maybe there is other code in other filesystems
that checks iocb->private without failing in a way that is as obvious
as BUG_ON. I don't see any upstream code with grep other than a few
lines in ext4/inode.c that might be affected.

As a test, I "fixed" the Orangefs problem with this:

[hubcap@vm1 linux]# git diff
        INIT_LIST_HEAD(&req->ki_list);

So, the real fix for Orangefs is getting rid of the two BUG_ON lines,
and I'll do that, I just wanted to bring this up in case it matters
to anyone else...

-Mike

Comments

Jens Axboe Feb. 2, 2019, 4:35 p.m. UTC | #1
On 2/1/19 7:29 PM, Mike Marshall wrote:
> We've been working on our Orangefs page cache patch with blinders on,
> and last week I took our patch set which was based on 4.19-rc7
> and applied it to 5.0-rc3.
> 
> In the process I ran vanilla rc3, and rc3 plus an Orangefs related
> patch set that Christoph Hellwig sent in, through the
> suite of xfstests.
> 
> It turns out that a patch from one of Jens Axboe's patch sets
> that came, I think, in the 5.0 merge window triggered a BUG_ON in Orangefs'
> file.c. The particular patch is "aio: don't zero entire aio_kiocb aio_get_req".
> 
> This code is in Orangefs file.c in a couple of places:
> 
> BUG_ON(iocb->private);
> 
> Anywho... I can easily fix the Orangefs problem by removing the two
> BUG_ON statements, I've researched how they got there and they
> are vestigial, just the kind of thing that Linus hates :-).
> 
> The bigger question is that maybe there is other code in other filesystems
> that checks iocb->private without failing in a way that is as obvious
> as BUG_ON. I don't see any upstream code with grep other than a few
> lines in ext4/inode.c that might be affected.
> 
> As a test, I "fixed" the Orangefs problem with this:
> 
> [hubcap@vm1 linux]# git diff
> diff --git a/fs/aio.c b/fs/aio.c
> index b906ff70c90f..2605a4b1a3c9 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -1020,6 +1020,7 @@ static inline struct aio_kiocb
> *aio_get_req(struct kioctx *ctx)
>         if (unlikely(!req))
>                 return NULL;
> 
> +       req->rw.private = NULL;
>         percpu_ref_get(&ctx->reqs);
>         req->ki_ctx = ctx;
>         INIT_LIST_HEAD(&req->ki_list);
> 
> So, the real fix for Orangefs is getting rid of the two BUG_ON lines,
> and I'll do that, I just wanted to bring this up in case it matters
> to anyone else...

Let's just bring it back, I think your patch is fine. I don't see
any other issues with this in a git grep, but better safe than sorry.

Care to send this as a properly formatted patch so it can get
included?
Jens Axboe Feb. 2, 2019, 4:36 p.m. UTC | #2
On 2/2/19 9:35 AM, Jens Axboe wrote:
> On 2/1/19 7:29 PM, Mike Marshall wrote:
>> We've been working on our Orangefs page cache patch with blinders on,
>> and last week I took our patch set which was based on 4.19-rc7
>> and applied it to 5.0-rc3.
>>
>> In the process I ran vanilla rc3, and rc3 plus an Orangefs related
>> patch set that Christoph Hellwig sent in, through the
>> suite of xfstests.
>>
>> It turns out that a patch from one of Jens Axboe's patch sets
>> that came, I think, in the 5.0 merge window triggered a BUG_ON in Orangefs'
>> file.c. The particular patch is "aio: don't zero entire aio_kiocb aio_get_req".
>>
>> This code is in Orangefs file.c in a couple of places:
>>
>> BUG_ON(iocb->private);
>>
>> Anywho... I can easily fix the Orangefs problem by removing the two
>> BUG_ON statements, I've researched how they got there and they
>> are vestigial, just the kind of thing that Linus hates :-).
>>
>> The bigger question is that maybe there is other code in other filesystems
>> that checks iocb->private without failing in a way that is as obvious
>> as BUG_ON. I don't see any upstream code with grep other than a few
>> lines in ext4/inode.c that might be affected.
>>
>> As a test, I "fixed" the Orangefs problem with this:
>>
>> [hubcap@vm1 linux]# git diff
>> diff --git a/fs/aio.c b/fs/aio.c
>> index b906ff70c90f..2605a4b1a3c9 100644
>> --- a/fs/aio.c
>> +++ b/fs/aio.c
>> @@ -1020,6 +1020,7 @@ static inline struct aio_kiocb
>> *aio_get_req(struct kioctx *ctx)
>>         if (unlikely(!req))
>>                 return NULL;
>>
>> +       req->rw.private = NULL;
>>         percpu_ref_get(&ctx->reqs);
>>         req->ki_ctx = ctx;
>>         INIT_LIST_HEAD(&req->ki_list);
>>
>> So, the real fix for Orangefs is getting rid of the two BUG_ON lines,
>> and I'll do that, I just wanted to bring this up in case it matters
>> to anyone else...
> 
> Let's just bring it back, I think your patch is fine. I don't see
> any other issues with this in a git grep, but better safe than sorry.
> 
> Care to send this as a properly formatted patch so it can get
> included?

BTW, I'd probably initialize it in aio_prep_rw(), that's the more
logical place for it.


diff --git a/fs/aio.c b/fs/aio.c
index b906ff70c90f..aaaaf4d12c73 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1436,6 +1436,7 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
 	if (unlikely(!req->ki_filp))
 		return -EBADF;
 	req->ki_complete = aio_complete_rw;
+	req->private = NULL;
 	req->ki_pos = iocb->aio_offset;
 	req->ki_flags = iocb_flags(req->ki_filp);
 	if (iocb->aio_flags & IOCB_FLAG_RESFD)
Mike Marshall Feb. 2, 2019, 5:03 p.m. UTC | #3
I'll put it in aio_prep_rw like you suggest and run it through xfstests.
I work through gmail, so I'll fiddle with my git-sendmail-foo and send
the patch up after that.

Thanks!

-Mike

On Sat, Feb 2, 2019 at 11:37 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 2/2/19 9:35 AM, Jens Axboe wrote:
> > On 2/1/19 7:29 PM, Mike Marshall wrote:
> >> We've been working on our Orangefs page cache patch with blinders on,
> >> and last week I took our patch set which was based on 4.19-rc7
> >> and applied it to 5.0-rc3.
> >>
> >> In the process I ran vanilla rc3, and rc3 plus an Orangefs related
> >> patch set that Christoph Hellwig sent in, through the
> >> suite of xfstests.
> >>
> >> It turns out that a patch from one of Jens Axboe's patch sets
> >> that came, I think, in the 5.0 merge window triggered a BUG_ON in Orangefs'
> >> file.c. The particular patch is "aio: don't zero entire aio_kiocb aio_get_req".
> >>
> >> This code is in Orangefs file.c in a couple of places:
> >>
> >> BUG_ON(iocb->private);
> >>
> >> Anywho... I can easily fix the Orangefs problem by removing the two
> >> BUG_ON statements, I've researched how they got there and they
> >> are vestigial, just the kind of thing that Linus hates :-).
> >>
> >> The bigger question is that maybe there is other code in other filesystems
> >> that checks iocb->private without failing in a way that is as obvious
> >> as BUG_ON. I don't see any upstream code with grep other than a few
> >> lines in ext4/inode.c that might be affected.
> >>
> >> As a test, I "fixed" the Orangefs problem with this:
> >>
> >> [hubcap@vm1 linux]# git diff
> >> diff --git a/fs/aio.c b/fs/aio.c
> >> index b906ff70c90f..2605a4b1a3c9 100644
> >> --- a/fs/aio.c
> >> +++ b/fs/aio.c
> >> @@ -1020,6 +1020,7 @@ static inline struct aio_kiocb
> >> *aio_get_req(struct kioctx *ctx)
> >>         if (unlikely(!req))
> >>                 return NULL;
> >>
> >> +       req->rw.private = NULL;
> >>         percpu_ref_get(&ctx->reqs);
> >>         req->ki_ctx = ctx;
> >>         INIT_LIST_HEAD(&req->ki_list);
> >>
> >> So, the real fix for Orangefs is getting rid of the two BUG_ON lines,
> >> and I'll do that, I just wanted to bring this up in case it matters
> >> to anyone else...
> >
> > Let's just bring it back, I think your patch is fine. I don't see
> > any other issues with this in a git grep, but better safe than sorry.
> >
> > Care to send this as a properly formatted patch so it can get
> > included?
>
> BTW, I'd probably initialize it in aio_prep_rw(), that's the more
> logical place for it.
>
>
> diff --git a/fs/aio.c b/fs/aio.c
> index b906ff70c90f..aaaaf4d12c73 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -1436,6 +1436,7 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
>         if (unlikely(!req->ki_filp))
>                 return -EBADF;
>         req->ki_complete = aio_complete_rw;
> +       req->private = NULL;
>         req->ki_pos = iocb->aio_offset;
>         req->ki_flags = iocb_flags(req->ki_filp);
>         if (iocb->aio_flags & IOCB_FLAG_RESFD)
>
> --
> Jens Axboe
>
diff mbox series

Patch

diff --git a/fs/aio.c b/fs/aio.c
index b906ff70c90f..2605a4b1a3c9 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1020,6 +1020,7 @@  static inline struct aio_kiocb
*aio_get_req(struct kioctx *ctx)
        if (unlikely(!req))
                return NULL;

+       req->rw.private = NULL;
        percpu_ref_get(&ctx->reqs);
        req->ki_ctx = ctx;