diff mbox

Question on nfs40_discover_server_trunking.

Message ID 4FA345DA4F4AE44899BD2B03EEEC2FA915C0721A@sacexcmbx05-prd.hq.netapp.com (mailing list archive)
State New, archived
Headers show

Commit Message

Trond Myklebust Jan. 19, 2013, 1:01 a.m. UTC
On Fri, 2013-01-18 at 19:44 -0500, Trond Myklebust wrote:
> On Fri, 2013-01-18 at 18:14 -0500, Chuck Lever wrote:

> > On Jan 18, 2013, at 5:59 PM, "Myklebust, Trond" <Trond.Myklebust@netapp.com> wrote:

> > 

> > > On Fri, 2013-01-18 at 17:03 -0500, an unknown sender wrote:

> > >> On Fri, 2013-01-18 at 16:33 -0500, Chuck Lever wrote:

> > >>> On Jan 18, 2013, at 4:28 PM, Ben Greear <greearb@candelatech.com> wrote:

> > >>> 

> > >>>> Any chance the STALE_CLIENTID case needs a 'break'?

> > >>> 

> > >>> I don't think so.  LEASE_CONFIRM is set, and we want to wake the state renewal thread.

> > >>> 

> > >>>> 

> > >>>> Twice I've seen kernel crashes after the nfs40_walk_client_list

> > >>>> failed (though code comments say it should never fail).

> > >>> 

> > >>> nfs40_walk_client_list() is looking for an nfs_client that is supposed to already be in the nfs_client list.  If the search fails, that's a bug.

> > >>> 

> > >>> Eyeball the contents of your nfs_client list.  You should find an appropriate nfs_client in there, and then figure out why the search doesn't find it.

> > >> 

> > >> You have considered the fact that the call to

> > >> nfs4_proc_setclientid_confirm can potentially return

> > >> NFS4ERR_STALE_CLIENTID if the server rebooted while the client was

> > >> walking the list?

> > > 

> > > In fact, as far as I can see, the correct behaviour in

> > > nfs40_discover_server_trunking() should be to re-issue the setclientid

> > > call, and then walk the list again if nfs40_walk_client_list() returns

> > > NFS4ERR_STALE_CLIENTID.

> > 

> > When I wrote the server trunking detection logic, I think we hadn't clearly decided what needed to be done in the STALE_CLIENTID case.

> > 

> > > Something like the attached patch:

> > 

> > A couple of comments:

> > 

> >  o  nfs_get_client() already sticks the new client on the tail of the nfs_client list

> 

> OK. That allows us to get rid of 5-6 lines of code.

> 

> >  o  We don't want to get stuck in a loop here.  Should the "do {}" loop in nfs40_discover_server_trunking() be bounded by a retry count?

> 

> Not a number of retries, but possibly a timeout value.

> 

> > However, I haven't heard Ben say "oh, yes, my server had rebooted."  I'd like some confirmation that the match failed for an explainable and expected reason.

> 

> The spec allows the server to discard unconfirmed clientids, so our code

> should take that into account.

> 

> 

> BTW: I'm going insane trying to figure out how the refcounting in that

> code is supposed to work. The assumption that we can somehow safely

> return a non-referenced pointer from nfs4_discover_server_trunking and

> then bump the reference count in nfs4_init_client() drives me nuts...


Attached is v2 of the patch that addresses all of the above.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

Comments

Chuck Lever Jan. 19, 2013, 1:27 a.m. UTC | #1
On Jan 18, 2013, at 8:01 PM, "Myklebust, Trond" <Trond.Myklebust@netapp.com> wrote:

> On Fri, 2013-01-18 at 19:44 -0500, Trond Myklebust wrote:
>> On Fri, 2013-01-18 at 18:14 -0500, Chuck Lever wrote:
>>> On Jan 18, 2013, at 5:59 PM, "Myklebust, Trond" <Trond.Myklebust@netapp.com> wrote:
>>> 
>>>> On Fri, 2013-01-18 at 17:03 -0500, an unknown sender wrote:
>>>>> On Fri, 2013-01-18 at 16:33 -0500, Chuck Lever wrote:
>>>>>> On Jan 18, 2013, at 4:28 PM, Ben Greear <greearb@candelatech.com> wrote:
>>>>>> 
>>>>>>> Any chance the STALE_CLIENTID case needs a 'break'?
>>>>>> 
>>>>>> I don't think so.  LEASE_CONFIRM is set, and we want to wake the state renewal thread.
>>>>>> 
>>>>>>> 
>>>>>>> Twice I've seen kernel crashes after the nfs40_walk_client_list
>>>>>>> failed (though code comments say it should never fail).
>>>>>> 
>>>>>> nfs40_walk_client_list() is looking for an nfs_client that is supposed to already be in the nfs_client list.  If the search fails, that's a bug.
>>>>>> 
>>>>>> Eyeball the contents of your nfs_client list.  You should find an appropriate nfs_client in there, and then figure out why the search doesn't find it.
>>>>> 
>>>>> You have considered the fact that the call to
>>>>> nfs4_proc_setclientid_confirm can potentially return
>>>>> NFS4ERR_STALE_CLIENTID if the server rebooted while the client was
>>>>> walking the list?
>>>> 
>>>> In fact, as far as I can see, the correct behaviour in
>>>> nfs40_discover_server_trunking() should be to re-issue the setclientid
>>>> call, and then walk the list again if nfs40_walk_client_list() returns
>>>> NFS4ERR_STALE_CLIENTID.
>>> 
>>> When I wrote the server trunking detection logic, I think we hadn't clearly decided what needed to be done in the STALE_CLIENTID case.
>>> 
>>>> Something like the attached patch:
>>> 
>>> A couple of comments:
>>> 
>>> o  nfs_get_client() already sticks the new client on the tail of the nfs_client list
>> 
>> OK. That allows us to get rid of 5-6 lines of code.
>> 
>>> o  We don't want to get stuck in a loop here.  Should the "do {}" loop in nfs40_discover_server_trunking() be bounded by a retry count?
>> 
>> Not a number of retries, but possibly a timeout value.
>> 
>>> However, I haven't heard Ben say "oh, yes, my server had rebooted."  I'd like some confirmation that the match failed for an explainable and expected reason.
>> 
>> The spec allows the server to discard unconfirmed clientids, so our code
>> should take that into account.
>> 
>> 
>> BTW: I'm going insane trying to figure out how the refcounting in that
>> code is supposed to work. The assumption that we can somehow safely
>> return a non-referenced pointer from nfs4_discover_server_trunking and
>> then bump the reference count in nfs4_init_client() drives me nuts...
> 
> Attached is v2 of the patch that addresses all of the above.

At first glance, it looks plausible.  Well done.  Perhaps the reference counting fix should be split into a separate patch.

This:

"If walking the list in nfs40_walk_client_list fails, then the most
likely explanation is that the server rebooted."

I think in Ben's case the server may be dropping the unconfirmed client ID because it is so busy and the client is probably dealing with a thousand or more mount points, which causes trunking detection to take a lot of time.  I'd still like to see real evidence of this, but I can't think of any easy way for Ben to provide it.

The "No match found." comment is good.

Then:

"Ensure that we catch that case by testing our new nfs_client last."

"testing the new nfs_client last" is the way it's supposed to work -- that's why nfs_get_client() uses list_add_tail() now to insert the new client.  Adding the new client to the client list is supposed to guarantee that walk_client_list always finds an appropriate nfs_client on the list.

So you're not actually changing the order of the tests.  What you're really changing is the recovery behavior in the "our new clientid is already stale" case.  The patch description is probably inaccurate in this regard.

Do we need a similar change for STALE_CLIENTID in the NFSv4.1 path?

Finally, because you are replacing that bogus switch statement in nfs40_discover_server_trunking(), you are fixing Ben's crash.  His GPF back trace should be included in your patch description if you are sending this to stable.

I can test this a little more deeply in a week or two when I get back to the migration work.
diff mbox

Patch

From 5fdbe6129cc767b4d8926d4a53ce7ce075411156 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <Trond.Myklebust@netapp.com>
Date: Fri, 18 Jan 2013 17:54:34 -0500
Subject: [PATCH v2] NFSv4: Fix NFSv4.0 trunking discovery

If walking the list in nfs40_walk_client_list fails, then the most
likely explanation is that the server rebooted. Ensure that we catch
that case by testing our new nfs_client last.

Also fix reference counting issues that could lead to a memory leak
for nfs_client structures.

Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org [>=3.7]
---
 fs/nfs/nfs4client.c | 37 +++++++++++++++----------------------
 fs/nfs/nfs4state.c  | 22 ++++++++++------------
 2 files changed, 25 insertions(+), 34 deletions(-)

diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index acc3472..bd6efe8 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -234,13 +234,12 @@  struct nfs_client *nfs4_init_client(struct nfs_client *clp,
 		nfs_mark_client_ready(clp, NFS_CS_READY);
 
 	error = nfs4_discover_server_trunking(clp, &old);
+	nfs_put_client(clp);
 	if (error < 0)
 		goto error;
 	if (clp != old) {
 		clp->cl_preserve_clid = true;
-		nfs_put_client(clp);
 		clp = old;
-		atomic_inc(&clp->cl_count);
 	}
 
 	return clp;
@@ -332,40 +331,33 @@  int nfs40_walk_client_list(struct nfs_client *new,
 
 		if (prev)
 			nfs_put_client(prev);
+		prev = pos;
 
 		status = nfs4_proc_setclientid_confirm(pos, &clid, cred);
-		if (status == 0) {
+		switch (status) {
+		case -NFS4ERR_STALE_CLIENTID:
+			break;
+		case 0:
 			nfs4_swap_callback_idents(pos, new);
 
-			nfs_put_client(pos);
+			prev = NULL;
 			*result = pos;
 			dprintk("NFS: <-- %s using nfs_client = %p ({%d})\n",
 				__func__, pos, atomic_read(&pos->cl_count));
-			return 0;
-		}
-		if (status != -NFS4ERR_STALE_CLIENTID) {
-			nfs_put_client(pos);
-			dprintk("NFS: <-- %s status = %d, no result\n",
-				__func__, status);
-			return status;
+		default:
+			goto out;
 		}
 
 		spin_lock(&nn->nfs_client_lock);
-		prev = pos;
 	}
+	spin_unlock(&nn->nfs_client_lock);
 
-	/*
-	 * No matching nfs_client found.  This should be impossible,
-	 * because the new nfs_client has already been added to
-	 * nfs_client_list by nfs_get_client().
-	 *
-	 * Don't BUG(), since the caller is holding a mutex.
-	 */
+	/* No match found. The server lost our clientid */
+out:
 	if (prev)
 		nfs_put_client(prev);
-	spin_unlock(&nn->nfs_client_lock);
-	pr_err("NFS: %s Error: no matching nfs_client found\n", __func__);
-	return -NFS4ERR_STALE_CLIENTID;
+	dprintk("NFS: <-- %s status = %d\n", __func__, status);
+	return status;
 }
 
 #ifdef CONFIG_NFS_V4_1
@@ -473,6 +465,7 @@  int nfs41_walk_client_list(struct nfs_client *new,
 		if (!nfs4_match_serverowners(pos, new))
 			continue;
 
+		atomic_inc(&pos->cl_count);
 		spin_unlock(&nn->nfs_client_lock);
 		dprintk("NFS: <-- %s using nfs_client = %p ({%d})\n",
 			__func__, pos, atomic_read(&pos->cl_count));
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 9448c57..102885d 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -129,24 +129,22 @@  int nfs40_discover_server_trunking(struct nfs_client *clp,
 	if (clp->cl_addr.ss_family == AF_INET6)
 		port = nn->nfs_callback_tcpport6;
 
-	status = nfs4_proc_setclientid(clp, NFS4_CALLBACK, port, cred, &clid);
-	if (status != 0)
-		goto out;
-	clp->cl_clientid = clid.clientid;
-	clp->cl_confirm = clid.confirm;
+	do {
+		status = nfs4_proc_setclientid(clp, NFS4_CALLBACK, port, cred, &clid);
+		if (status != 0)
+			break;
+		clp->cl_clientid = clid.clientid;
+		clp->cl_confirm = clid.confirm;
 
-	status = nfs40_walk_client_list(clp, result, cred);
-	switch (status) {
-	case -NFS4ERR_STALE_CLIENTID:
-		set_bit(NFS4CLNT_LEASE_CONFIRM, &clp->cl_state);
-	case 0:
+		status = nfs40_walk_client_list(clp, result, cred);
+	} while (status == -NFS4ERR_STALE_CLIENTID);
+
+	if (status == 0) {
 		/* Sustain the lease, even if it's empty.  If the clientid4
 		 * goes stale it's of no use for trunking discovery. */
 		nfs4_schedule_state_renewal(*result);
-		break;
 	}
 
-out:
 	return status;
 }
 
-- 
1.8.1