
nfs: fix NULL deference in nfs4_get_valid_delegation

Message ID 20200508221935.GA11225@fieldses.org (mailing list archive)
State New, archived

Commit Message

J. Bruce Fields May 8, 2020, 10:19 p.m. UTC
From: "J. Bruce Fields" <bfields@redhat.com>

We add the new state to the nfsi->open_states list, making it
potentially visible to other threads, before we've finished initializing
it.

That wasn't a problem when all the readers were also taking the i_lock
(as we do here), but since we switched to RCU, there's now a possibility
that a reader could see the partially initialized state.

Symptoms observed were a crash when another thread called
nfs4_get_valid_delegation() on a NULL inode.

Fixes: 9ae075fdd190 "NFSv4: Convert open state lookup to use RCU"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfs/nfs4state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Masayoshi Mizuma May 11, 2020, 12:10 p.m. UTC | #1
On Fri, May 08, 2020 at 06:19:35PM -0400, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> We add the new state to the nfsi->open_states list, making it
> potentially visible to other threads, before we've finished initializing
> it.
> 
> That wasn't a problem when all the readers were also taking the i_lock
> (as we do here), but since we switched to RCU, there's now a possibility
> that a reader could see the partially initialized state.
> 
> Symptoms observed were a crash when another thread called
> nfs4_get_valid_delegation() on a NULL inode.
> 
> Fixes: 9ae075fdd190 "NFSv4: Convert open state lookup to use RCU"
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  fs/nfs/nfs4state.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index ac93715c05a4..a8dc25ce48bb 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -734,9 +734,9 @@ nfs4_get_open_state(struct inode *inode, struct nfs4_state_owner *owner)
>  		state = new;
>  		state->owner = owner;
>  		atomic_inc(&owner->so_count);
> -		list_add_rcu(&state->inode_states, &nfsi->open_states);
>  		ihold(inode);
>  		state->inode = inode;
> +		list_add_rcu(&state->inode_states, &nfsi->open_states);
>  		spin_unlock(&inode->i_lock);
>  		/* Note: The reclaim code dictates that we add stateless
>  		 * and read-only stateids to the end of the list */
> -- 

Thank you for posting the patch! It works on our box.
Please feel free to add:

        Reviewed-by: Seiichi Ikarashi <s.ikarashi@fujitsu.com>
        Tested-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
        Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>

Without the patch, our system, which is an NFSv4 client, crashed at
random. The panic log looks like this:

   BUG: unable to handle page fault for address: ffffffffffffffb0
   ...
   RIP: 0010:nfs4_get_valid_delegation+0x6/0x30 [nfsv4]
   ...
   Call Trace:
    nfs4_open_prepare+0x80/0x1c0 [nfsv4]
    __rpc_execute+0x75/0x390 [sunrpc]
    ? finish_task_switch+0x75/0x260
    rpc_async_schedule+0x29/0x40 [sunrpc]
    process_one_work+0x1ad/0x370
    worker_thread+0x30/0x390
    ? create_worker+0x1a0/0x1a0
    kthread+0x10c/0x130
    ? kthread_park+0x80/0x80
    ret_from_fork+0x22/0x30
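
One plausible reading of the fault address above: if nfs4_get_valid_delegation() converts a NULL inode pointer to its containing nfs_inode with container_of-style arithmetic and then reads a field, the access lands a small negative offset from zero. The sketch below reproduces that arithmetic with hypothetical toy structs. The 0x50 distance is inferred from the address ffffffffffffffb0 in the log, not taken from the real struct nfs_inode layout, which varies by kernel version and config.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct inode. */
struct toy_inode { long i_ino; };

/* Hypothetical stand-in for struct nfs_inode: the VFS inode is embedded
 * 0x50 bytes after the field being read (an assumed distance). */
struct toy_nfs_inode {
	void *delegation;                 /* offset 0 */
	char pad[0x50 - sizeof(void *)];  /* filler so vfs_inode sits at 0x50 */
	struct toy_inode vfs_inode;       /* offset 0x50 */
};

/* Address that would be read for a given inode address, using unsigned
 * integer arithmetic so a NULL input wraps around instead of being
 * dereferenced. */
static uintptr_t delegation_field_addr(uintptr_t inode_addr)
{
	uintptr_t outer = inode_addr - offsetof(struct toy_nfs_inode, vfs_inode);

	return outer + offsetof(struct toy_nfs_inode, delegation);
}
```

Calling delegation_field_addr(0) yields the two's-complement value of -0x50, which on a 64-bit machine has the same form as the faulting address in the log.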

After applying the patch, the panic is gone.

Thanks!
Masa
J. Bruce Fields May 11, 2020, 1:16 p.m. UTC | #2
Thanks, applying.--b.

Trond Myklebust May 11, 2020, 1:43 p.m. UTC | #3
On Mon, 2020-05-11 at 09:16 -0400, J. Bruce Fields wrote:
> Thanks, applying.--b.
> 

You're applying? So should I remove it from the NFS client bugfixes?

J. Bruce Fields May 11, 2020, 1:57 p.m. UTC | #4
On Mon, May 11, 2020 at 01:43:30PM +0000, Trond Myklebust wrote:
> On Mon, 2020-05-11 at 09:16 -0400, J. Bruce Fields wrote:
> > Thanks, applying.--b.
> > 
> 
> You're applying? So should I remove it from the NFS client bugfixes?

No.  Sorry, I responded to the wrong email!

--b.

J. Bruce Fields May 11, 2020, 2:01 p.m. UTC | #5
On Mon, May 11, 2020 at 09:57:45AM -0400, bfields@fieldses.org wrote:
> On Mon, May 11, 2020 at 01:43:30PM +0000, Trond Myklebust wrote:
> > On Mon, 2020-05-11 at 09:16 -0400, J. Bruce Fields wrote:
> > > Thanks, applying.--b.
> > > 
> > 
> > You're applying? So should I remove it from the NFS client bugfixes?
> 
> No.  Sorry, I responded to the wrong email!

I do have a patch including the tags and oops provided by Masayoshi
Mizuma, if you'd like to take that instead.  See followup.--b.

Patch

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index ac93715c05a4..a8dc25ce48bb 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -734,9 +734,9 @@  nfs4_get_open_state(struct inode *inode, struct nfs4_state_owner *owner)
 		state = new;
 		state->owner = owner;
 		atomic_inc(&owner->so_count);
-		list_add_rcu(&state->inode_states, &nfsi->open_states);
 		ihold(inode);
 		state->inode = inode;
+		list_add_rcu(&state->inode_states, &nfsi->open_states);
 		spin_unlock(&inode->i_lock);
 		/* Note: The reclaim code dictates that we add stateless
 		 * and read-only stateids to the end of the list */
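
The reordering above follows the usual RCU publication rule: finish initializing an object, then make it reachable. As a rough userspace analogue (not kernel code), the sketch below uses C11 atomics in place of list_add_rcu() and rcu_dereference(). All names (toy_inode, toy_state, publish_state, first_state_inode) are hypothetical, and it assumes a single writer at a time, which the kernel gets here by holding i_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct inode and struct nfs4_state. */
struct toy_inode { int ino; };

struct toy_state {
	struct toy_inode *inode;  /* must be valid before publication */
	struct toy_state *next;
};

/* List head that lockless readers traverse. */
static _Atomic(struct toy_state *) open_states;

/* Writer: fill in every field first, then publish with a release store.
 * The release store plays the role of the rcu_assign_pointer() that
 * list_add_rcu() performs. Assumes writers are serialized externally. */
static void publish_state(struct toy_state *st, struct toy_inode *inode)
{
	st->inode = inode;
	st->next = atomic_load_explicit(&open_states, memory_order_relaxed);
	atomic_store_explicit(&open_states, st, memory_order_release);
}

/* Reader: the acquire load pairs with the release store above, so a
 * reader that sees the new node also sees its initialized inode field. */
static struct toy_inode *first_state_inode(void)
{
	struct toy_state *st =
		atomic_load_explicit(&open_states, memory_order_acquire);

	return st ? st->inode : NULL;
}
```

If publish_state() stored to open_states before setting st->inode, a concurrent reader could observe the node with a NULL inode, which is the window the patch closes.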