
[15/15] NFS: Slow down state manager after an unhandled error

Message ID 20120711203151.3767.79006.stgit@degas.1015granger.net (mailing list archive)
State New, archived

Commit Message

Chuck Lever July 11, 2012, 8:31 p.m. UTC
If the state manager thread is not actually able to fully recover from
some situation, it wakes up waiters, who kick off a new state manager
thread.  Quite often the fresh invocation of the state manager is just
as successful.

This results in a livelock as the client dumps thousands of NFS
requests a second on the network in a vain attempt to recover.  Not
very friendly.

To mitigate this situation, add a delay in the state manager after
an unhandled error, so that the client sends just a few requests
every second in this case.
---

 fs/nfs/nfs4state.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)



Comments

Trond Myklebust July 17, 2012, 6:06 p.m. UTC | #1
On Wed, 2012-07-11 at 16:31 -0400, Chuck Lever wrote:
> If the state manager thread is not actually able to fully recover from
> some situation, it wakes up waiters, who kick off a new state manager
> thread.  Quite often the fresh invocation of the state manager is just
> as successful.
>
> This results in a livelock as the client dumps thousands of NFS
> requests a second on the network in a vain attempt to recover.  Not
> very friendly.
>
> To mitigate this situation, add a delay in the state manager after
> an unhandled error, so that the client sends just a few requests
> every second in this case.

I assume that this was intended to have a s-o-b line?

> ---
>
>  fs/nfs/nfs4state.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 5e3bebc..38959eb 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -2151,6 +2151,7 @@ static void nfs4_state_manager(struct nfs_client *clp)
>  out_error:
>  	pr_warn_ratelimited("NFS: state manager failed on NFSv4 server %s"
>  			" with error %d\n", clp->cl_hostname, -status);
> +	ssleep(1);
>  	nfs4_end_drain_session(clp);
>  	nfs4_clear_state_manager_bit(clp);
>  }
>


-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com
Chuck Lever July 17, 2012, 6:09 p.m. UTC | #2
On Jul 17, 2012, at 2:06 PM, Myklebust, Trond wrote:

> On Wed, 2012-07-11 at 16:31 -0400, Chuck Lever wrote:
>> If the state manager thread is not actually able to fully recover from
>> some situation, it wakes up waiters, who kick off a new state manager
>> thread.  Quite often the fresh invocation of the state manager is just
>> as successful.
>> 
>> This results in a livelock as the client dumps thousands of NFS
>> requests a second on the network in a vain attempt to recover.  Not
>> very friendly.
>> 
>> To mitigate this situation, add a delay in the state manager after
>> an unhandled error, so that the client sends just a few requests
>> every second in this case.
> 
> 
> I assume that this was intended to have a s-o-b line?

Yes.  Actually it was submitted mainly for discussion, which is why the S-O-B was accidentally omitted.  But if you feel it is appropriate for mainline now, I'm happy to see it merged.

> 
>> ---
>> 
>> fs/nfs/nfs4state.c |    1 +
>> 1 files changed, 1 insertions(+), 0 deletions(-)
>> 
>> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
>> index 5e3bebc..38959eb 100644
>> --- a/fs/nfs/nfs4state.c
>> +++ b/fs/nfs/nfs4state.c
>> @@ -2151,6 +2151,7 @@ static void nfs4_state_manager(struct nfs_client *clp)
>> out_error:
>> 	pr_warn_ratelimited("NFS: state manager failed on NFSv4 server %s"
>> 			" with error %d\n", clp->cl_hostname, -status);
>> +	ssleep(1);
>> 	nfs4_end_drain_session(clp);
>> 	nfs4_clear_state_manager_bit(clp);
>> }
>> 
> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer
> 
> NetApp
> Trond.Myklebust@netapp.com
> www.netapp.com
>

Patch

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 5e3bebc..38959eb 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2151,6 +2151,7 @@  static void nfs4_state_manager(struct nfs_client *clp)
 out_error:
 	pr_warn_ratelimited("NFS: state manager failed on NFSv4 server %s"
 			" with error %d\n", clp->cl_hostname, -status);
+	ssleep(1);
 	nfs4_end_drain_session(clp);
 	nfs4_clear_state_manager_bit(clp);
 }
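
For discussion, the effect of the one-second sleep can be illustrated with a small userspace model. This is only a sketch, not kernel code: the function name failed_passes_in_window() and the 500 microsecond cost per failed pass are assumptions made up for the illustration, and time is simulated rather than actually slept. It counts how many failing state manager passes fit into a time window with and without a delay after each failure.

```c
#include <stdio.h>

/*
 * Hypothetical model (not taken from fs/nfs/nfs4state.c): count how many
 * failing recovery passes start within window_us microseconds, when each
 * pass costs pass_cost_us and is followed by delay_us of sleep.
 * delay_us plays the role of ssleep(1) in the out_error path.
 */
static unsigned long failed_passes_in_window(unsigned long window_us,
					      unsigned long pass_cost_us,
					      unsigned long delay_us)
{
	unsigned long now = 0;
	unsigned long passes = 0;

	/* Guard against a loop that would never advance simulated time. */
	if (pass_cost_us == 0 && delay_us == 0)
		return 0;

	while (now < window_us) {
		now += pass_cost_us;	/* one pass that ends in out_error */
		passes++;
		now += delay_us;	/* back off before waiters restart us */
	}
	return passes;
}

/* Print the comparison for a 10 second window (assumed numbers). */
static void print_comparison(void)
{
	printf("no delay: %lu passes\n",
	       failed_passes_in_window(10000000UL, 500UL, 0UL));
	printf("1s delay: %lu passes\n",
	       failed_passes_in_window(10000000UL, 500UL, 1000000UL));
}
```

Under these assumed numbers the retry rate drops from thousands of passes per second to about one per second, which is the behaviour the commit message describes. The real rate depends on how long a failed pass takes and how many requests each pass sends.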