[1/2] SUNRPC: Ensure that call_connect times out correctly

Message ID 1395250160.7168.1.camel@leira.trondhjem.org (mailing list archive)
State New, archived

Commit Message

Trond Myklebust March 19, 2014, 5:29 p.m. UTC
On Wed, 2014-03-19 at 13:10 -0400, Steve Dickson wrote:
> 
> On 03/19/2014 11:04 AM, Trond Myklebust wrote:
> > IOW: there is no way to make mount.nfs honour the ‘retry’ and/or ‘bg’
> > mount options in any consistent fashion by solely relying on kernel timeouts.
> I went back and took a look at how bg mounts worked in a number of
> older kernels, from f19 (3.12) all the way back to the RHEL6 kernel (2.6).
> 
> It turns out you are right. The bg mounts were not depending on
> timeouts; they depended on the mount failing with ECONNREFUSED
> on the very first attempt, which is the reason the bg mounts happened
> so fast...
> 
> It seems these days ECONNREFUSED is no longer returned
> as an error code. It is basically turned into a
> timeout... Just curious as to why ECONNREFUSED is
> no longer returned?
> 
> Again, thanks for the cycles!

If the server is down during the initial rpc client creation, then I’d
still expect that to fail with ECONNREFUSED due to the rpc_ping() call.

Does the following patch help?


8<------------------------------------------------------------
From dad628cc357a06cff8ce04300ba5c19bd92e73eb Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@primarydata.com>
Date: Wed, 19 Mar 2014 13:25:43 -0400
Subject: [PATCH] SUNRPC: Ensure call_status() deals correctly with SOFTCONN
 tasks

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 net/sunrpc/clnt.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Steve Dickson March 19, 2014, 6:22 p.m. UTC | #1
On 03/19/2014 01:29 PM, Trond Myklebust wrote:
> On Wed, 2014-03-19 at 13:10 -0400, Steve Dickson wrote:
>>
>> On 03/19/2014 11:04 AM, Trond Myklebust wrote:
>>> IOW: there is no way to make mount.nfs honour the ‘retry’ and/or ‘bg’
>>> mount options in any consistent fashion by solely relying on kernel timeouts.
>> I went back and took a look at how bg mounts worked in a number of
>> older kernels, from f19 (3.12) all the way back to the RHEL6 kernel (2.6).
>>
>> It turns out you are right. The bg mounts were not depending on
>> timeouts; they depended on the mount failing with ECONNREFUSED
>> on the very first attempt, which is the reason the bg mounts happened
>> so fast...
>>
>> It seems these days ECONNREFUSED is no longer returned
>> as an error code. It is basically turned into a
>> timeout... Just curious as to why ECONNREFUSED is
>> no longer returned?
>>
>> Again, thanks for the cycles!
> 
> If the server is down during the initial rpc client creation, then I’d
> still expect that to fail with ECONNREFUSED due to the rpc_ping() call.
> 
> Does the following patch help?
> 
> 
> 8<------------------------------------------------------------
> From dad628cc357a06cff8ce04300ba5c19bd92e73eb Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <trond.myklebust@primarydata.com>
> Date: Wed, 19 Mar 2014 13:25:43 -0400
> Subject: [PATCH] SUNRPC: Ensure call_status() deals correctly with SOFTCONN
>  tasks
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
> ---
>  net/sunrpc/clnt.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index cea1308a6fda..ef96568902c5 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -2004,6 +2004,10 @@ call_status(struct rpc_task *task)
>  	case -EHOSTDOWN:
>  	case -EHOSTUNREACH:
>  	case -ENETUNREACH:
> +		if (RPC_IS_SOFTCONN(task)) {
> +			rpc_exit(task, status);
> +			break;
> +		}
>  		/*
>  		 * Delay any retries for 3 seconds, then handle as if it
>  		 * were a timeout.
> 
No... but I do think that patch makes sense...

What's going on is that -ECONNREFUSED is being seen in call_connect_status()
and the task is not a soft connection. So call_timeout() is called, which
eventually times out the mount...

So just for fun I made the SETCLIENTID rpc soft, but for some
reason that didn't work either... I thought for sure it would...

steved.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Trond Myklebust March 19, 2014, 7:41 p.m. UTC | #2
On Mar 19, 2014, at 14:22, Steve Dickson <SteveD@redhat.com> wrote:
> 
> What's going on is that -ECONNREFUSED is being seen in call_connect_status()
> and the task is not a soft connection. So call_timeout() is called, which
> eventually times out the mount…

This is what is confusing me. I’d expect that the rpc_ping() would be the first thing to be sent on the wire by rpc_create(), and that ping should normally have the RPC_SOFTCONN flag set.

> So just for fun I made the SETCLIENTID rpc soft, but for some
> reason that didn't work either... I thought for sure it would...
> 
> steved.
Steve Dickson March 20, 2014, 2:12 p.m. UTC | #3
On 03/19/2014 03:41 PM, Trond Myklebust wrote:
> 
> On Mar 19, 2014, at 14:22, Steve Dickson <SteveD@redhat.com> wrote:
>>
>> What's going on is that -ECONNREFUSED is being seen in call_connect_status()
>> and the task is not a soft connection. So call_timeout() is called, which
>> eventually times out the mount…
> 
> This is what is confusing me. I’d expect that the rpc_ping() would be the 
> first thing to be sent on the wire by rpc_create(), and that ping 
> should normally have the RPC_SOFTCONN flag set.
> 
>> So just for fun I made the SETCLIENTID rpc soft, but for some
>> reason that didn't work either... I thought for sure it would...
I see your point... Using quick systemtap scripts, it turns out
rpc_ping is not failing (it returns 0x0), and it should because
the server is definitely down!

steved.


Steve Dickson March 20, 2014, 3:19 p.m. UTC | #4
On 03/20/2014 10:12 AM, Steve Dickson wrote:
> 
> 
> On 03/19/2014 03:41 PM, Trond Myklebust wrote:
>>
>> On Mar 19, 2014, at 14:22, Steve Dickson <SteveD@redhat.com> wrote:
>>>
>>> What's going on is that -ECONNREFUSED is being seen in call_connect_status()
>>> and the task is not a soft connection. So call_timeout() is called, which
>>> eventually times out the mount…
>>
>> This is what is confusing me. I’d expect that the rpc_ping() would be the 
>> first thing to be sent on the wire by rpc_create(), and that ping 
>> should normally have the RPC_SOFTCONN flag set.
>>
>>> So just for fun I made the SETCLIENTID rpc soft, but for some
>>> reason that didn't work either... I thought for sure it would...
> I see your point... Using quick systemtap scripts, it turns out
> rpc_ping is not failing (it returns 0x0), and it should because
> the server is definitely down!
Found it... The rpc_delay call in call_connect_status()
is causing a callback to be scheduled and run before
rpc_exit_task. Unfortunately that callback (__rpc_atrun)
clears the status...

I'll be posting my version of the fix shortly...

steved.
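
The ordering problem described in this message can be illustrated with a small userspace model. This is only a sketch and not kernel code: every name prefixed with sim_ is hypothetical, and the scheduler loop assumes, as the message states, that a pending callback runs before the exit step. It shows how a status set at exit time can be reset to 0 by a delay callback that was armed earlier, which would match rpc_ping appearing to return 0x0.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sim_task;
typedef void (*sim_fn)(struct sim_task *);

/* Hypothetical stand-in for an RPC task; not the kernel's struct rpc_task. */
struct sim_task {
	int    status;   /* models the task status field */
	sim_fn callback; /* one-shot callback, runs before the next action */
	sim_fn action;   /* next state-machine step, NULL when finished */
	int    result;   /* what the caller finally observes */
};

/* Models the wake-up callback named in the message: it resets the status. */
static void sim_atrun(struct sim_task *task)
{
	task->status = 0;
}

/* Models the final step: report whatever status is current. */
static void sim_exit_task(struct sim_task *task)
{
	task->result = task->status;
	task->action = NULL;
}

/* Models arming a delay: the callback is queued to run on wake-up. */
static void sim_delay(struct sim_task *task)
{
	task->callback = sim_atrun;
}

/* Models requesting exit with an error status. */
static void sim_exit(struct sim_task *task, int status)
{
	task->status = status;
	task->action = sim_exit_task;
}

/* Assumed scheduler rule: a pending callback runs before the next action. */
static int sim_execute(struct sim_task *task)
{
	assert(task != NULL);
	while (task->callback != NULL || task->action != NULL) {
		if (task->callback != NULL) {
			sim_fn cb = task->callback;

			task->callback = NULL;
			cb(task);
			continue;
		}
		task->action(task);
	}
	return task->result;
}

/* delay_first != 0 models arming the delay before requesting exit. */
static int sim_connect_refused(int delay_first)
{
	struct sim_task task = { 0 };

	if (delay_first)
		sim_delay(&task);
	sim_exit(&task, -ECONNREFUSED);
	return sim_execute(&task);
}
```

In this model, sim_connect_refused(1) returns 0 because the callback wipes the error before the exit step reads it, while sim_connect_refused(0) returns -ECONNREFUSED. That suggests one possible shape for a fix: decide whether to exit before arming the delay.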
Patch

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index cea1308a6fda..ef96568902c5 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -2004,6 +2004,10 @@  call_status(struct rpc_task *task)
 	case -EHOSTDOWN:
 	case -EHOSTUNREACH:
 	case -ENETUNREACH:
+		if (RPC_IS_SOFTCONN(task)) {
+			rpc_exit(task, status);
+			break;
+		}
 		/*
 		 * Delay any retries for 3 seconds, then handle as if it
 		 * were a timeout.
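
The effect of the hunk above can be sketched as a standalone userspace model. This is an illustration under stated assumptions, not the kernel implementation: struct toy_task, enum toy_outcome and toy_call_status() are hypothetical names, and only the three error cases touched by the patch are modelled. A soft-connect task exits immediately with the original error, while any other task keeps the existing "delay 3 seconds, then treat as a timeout" path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rpc_task; not the kernel definition. */
struct toy_task {
	bool softconn;   /* models what RPC_IS_SOFTCONN(task) would report */
	bool exited;     /* set once the task has been terminated */
	int  status;     /* error handed back to the caller on exit */
	int  delay_secs; /* pending retry delay, 0 if none */
};

enum toy_outcome { TOY_EXIT, TOY_RETRY_AFTER_DELAY, TOY_OTHER };

/* Models only the branch of call_status() that the patch modifies. */
static enum toy_outcome toy_call_status(struct toy_task *task, int status)
{
	assert(task != NULL);
	switch (status) {
	case -EHOSTDOWN:
	case -EHOSTUNREACH:
	case -ENETUNREACH:
		if (task->softconn) {
			/* Patched path: fail fast, keeping the original error. */
			task->status = status;
			task->exited = true;
			return TOY_EXIT;
		}
		/* Existing path: delay 3 seconds, then handle as a timeout. */
		task->delay_secs = 3;
		return TOY_RETRY_AFTER_DELAY;
	default:
		/* All other statuses are outside the scope of this sketch. */
		return TOY_OTHER;
	}
}
```

Calling toy_call_status() with -EHOSTUNREACH on a task whose softconn field is true yields TOY_EXIT with the error preserved. The same call on a task with softconn false yields TOY_RETRY_AFTER_DELAY with a 3 second delay recorded.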