nfsd: Do not refuse to serve out of cache

Message ID 20180328161801.8360-1-trond.myklebust@primarydata.com (mailing list archive)
State New, archived

Commit Message

Trond Myklebust March 28, 2018, 4:18 p.m. UTC
Currently the knfsd replay cache appears to try to refuse replying to
retries that come within 200ms of the cache entry being created. That
makes limited sense in today's world of high speed TCP.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfsd/cache.h    | 5 -----
 fs/nfsd/nfscache.c | 6 ++----
 2 files changed, 2 insertions(+), 9 deletions(-)

Comments

Jeff Layton March 28, 2018, 7:20 p.m. UTC | #1
On Wed, 2018-03-28 at 12:18 -0400, Trond Myklebust wrote:
> Currently the knfsd replay cache appears to try to refuse replying to
> retries that come within 200ms of the cache entry being created. That
> makes limited sense in today's world of high speed TCP.
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
> ---
>  fs/nfsd/cache.h    | 5 -----
>  fs/nfsd/nfscache.c | 6 ++----
>  2 files changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h
> index 046b3f048757..b7559c6f2b97 100644
> --- a/fs/nfsd/cache.h
> +++ b/fs/nfsd/cache.h
> @@ -67,11 +67,6 @@ enum {
>  	RC_REPLBUFF,
>  };
>  
> -/*
> - * If requests are retransmitted within this interval, they're dropped.
> - */
> -#define RC_DELAY		(HZ/5)
> -
>  /* Cache entries expire after this time period */
>  #define RC_EXPIRE		(120 * HZ)
>  
> diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
> index 334f2ad60704..637f87c39183 100644
> --- a/fs/nfsd/nfscache.c
> +++ b/fs/nfsd/nfscache.c
> @@ -394,7 +394,6 @@ nfsd_cache_lookup(struct svc_rqst *rqstp)
>  	__wsum			csum;
>  	u32 hash = nfsd_cache_hash(xid);
>  	struct nfsd_drc_bucket *b = &drc_hashtbl[hash];
> -	unsigned long		age;
>  	int type = rqstp->rq_cachetype;
>  	int rtn = RC_DOIT;
>  
> @@ -461,12 +460,11 @@ nfsd_cache_lookup(struct svc_rqst *rqstp)
>  found_entry:
>  	nfsdstats.rchits++;
>  	/* We found a matching entry which is either in progress or done. */
> -	age = jiffies - rp->c_timestamp;
>  	lru_put_end(b, rp);
>  
>  	rtn = RC_DROPIT;
> -	/* Request being processed or excessive rexmits */
> -	if (rp->c_state == RC_INPROG || age < RC_DELAY)
> +	/* Request being processed */
> +	if (rp->c_state == RC_INPROG)
>  		goto out;
>  
>  	/* From the hall of fame of impractical attacks:

That condition always looked a bit suspicious to me.

Acked-by: Jeff Layton <jlayton@kernel.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
J. Bruce Fields March 28, 2018, 8:10 p.m. UTC | #2
Applying, thanks.

On Wed, Mar 28, 2018 at 03:20:45PM -0400, Jeff Layton wrote:
> On Wed, 2018-03-28 at 12:18 -0400, Trond Myklebust wrote:
> > Currently the knfsd replay cache appears to try to refuse replying to
> > retries that come within 200ms of the cache entry being created. That
> > makes limited sense in today's world of high speed TCP.

Trond gave me some helpful context in person, which I may tag onto the
changelog:

	After a TCP disconnection, a client can very easily reconnect
	and retry an rpc in less than 200ms.  If this logic drops that
	retry, however, the client may be quite slow to retry again.
	This logic is original to the first reply cache implementation
	in 2.1, and may have made more sense for UDP clients that
	retried much more frequently.

	We're still dropping on finding the original request still in
	progress, which can cause the same problem, though it's less
	likely.

	Note svc_check_conn_limits is often the cause of those
	disconnections.  We may want to fix that some day.

--b.

> > 
> > Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
> > ---
> >  fs/nfsd/cache.h    | 5 -----
> >  fs/nfsd/nfscache.c | 6 ++----
> >  2 files changed, 2 insertions(+), 9 deletions(-)
> > 
> > diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h
> > index 046b3f048757..b7559c6f2b97 100644
> > --- a/fs/nfsd/cache.h
> > +++ b/fs/nfsd/cache.h
> > @@ -67,11 +67,6 @@ enum {
> >  	RC_REPLBUFF,
> >  };
> >  
> > -/*
> > - * If requests are retransmitted within this interval, they're dropped.
> > - */
> > -#define RC_DELAY		(HZ/5)
> > -
> >  /* Cache entries expire after this time period */
> >  #define RC_EXPIRE		(120 * HZ)
> >  
> > diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
> > index 334f2ad60704..637f87c39183 100644
> > --- a/fs/nfsd/nfscache.c
> > +++ b/fs/nfsd/nfscache.c
> > @@ -394,7 +394,6 @@ nfsd_cache_lookup(struct svc_rqst *rqstp)
> >  	__wsum			csum;
> >  	u32 hash = nfsd_cache_hash(xid);
> >  	struct nfsd_drc_bucket *b = &drc_hashtbl[hash];
> > -	unsigned long		age;
> >  	int type = rqstp->rq_cachetype;
> >  	int rtn = RC_DOIT;
> >  
> > @@ -461,12 +460,11 @@ nfsd_cache_lookup(struct svc_rqst *rqstp)
> >  found_entry:
> >  	nfsdstats.rchits++;
> >  	/* We found a matching entry which is either in progress or done. */
> > -	age = jiffies - rp->c_timestamp;
> >  	lru_put_end(b, rp);
> >  
> >  	rtn = RC_DROPIT;
> > -	/* Request being processed or excessive rexmits */
> > -	if (rp->c_state == RC_INPROG || age < RC_DELAY)
> > +	/* Request being processed */
> > +	if (rp->c_state == RC_INPROG)
> >  		goto out;
> >  
> >  	/* From the hall of fame of impractical attacks:
> 
> That condition always looked a bit suspicious to me.
> 
> Acked-by: Jeff Layton <jlayton@kernel.org>

Patch

diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h
index 046b3f048757..b7559c6f2b97 100644
--- a/fs/nfsd/cache.h
+++ b/fs/nfsd/cache.h
@@ -67,11 +67,6 @@  enum {
 	RC_REPLBUFF,
 };
 
-/*
- * If requests are retransmitted within this interval, they're dropped.
- */
-#define RC_DELAY		(HZ/5)
-
 /* Cache entries expire after this time period */
 #define RC_EXPIRE		(120 * HZ)
 
diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
index 334f2ad60704..637f87c39183 100644
--- a/fs/nfsd/nfscache.c
+++ b/fs/nfsd/nfscache.c
@@ -394,7 +394,6 @@  nfsd_cache_lookup(struct svc_rqst *rqstp)
 	__wsum			csum;
 	u32 hash = nfsd_cache_hash(xid);
 	struct nfsd_drc_bucket *b = &drc_hashtbl[hash];
-	unsigned long		age;
 	int type = rqstp->rq_cachetype;
 	int rtn = RC_DOIT;
 
@@ -461,12 +460,11 @@  nfsd_cache_lookup(struct svc_rqst *rqstp)
 found_entry:
 	nfsdstats.rchits++;
 	/* We found a matching entry which is either in progress or done. */
-	age = jiffies - rp->c_timestamp;
 	lru_put_end(b, rp);
 
 	rtn = RC_DROPIT;
-	/* Request being processed or excessive rexmits */
-	if (rp->c_state == RC_INPROG || age < RC_DELAY)
+	/* Request being processed */
+	if (rp->c_state == RC_INPROG)
 		goto out;
 
 	/* From the hall of fame of impractical attacks:
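
The behavioural change in the found_entry hunk can be sketched as a small
user-space model. This is only an illustration, not kernel code: the
MODEL_* names, struct model_entry and both helper functions are
hypothetical, and MODEL_HZ is an assumed tick rate (RC_DELAY was HZ/5
ticks, which is 200ms whatever HZ is). It covers only the first check
after found_entry and ignores the later checks and reply handling in
nfsd_cache_lookup().

```c
#include <stdbool.h>

/* Hypothetical model; names echo the patch but nothing here is kernel API. */
#define MODEL_HZ	1000UL			/* assumed ticks per second */
#define MODEL_RC_DELAY	(MODEL_HZ / 5)		/* HZ/5 ticks == 200ms */

enum model_state { MODEL_RC_INPROG, MODEL_RC_DONE };

struct model_entry {
	enum model_state state;		/* stands in for rp->c_state */
	unsigned long timestamp;	/* stands in for rp->c_timestamp, in ticks */
};

/* Before the patch: drop if still in progress OR the entry is younger than RC_DELAY. */
static bool model_drop_before(const struct model_entry *e, unsigned long now)
{
	unsigned long age = now - e->timestamp;

	return e->state == MODEL_RC_INPROG || age < MODEL_RC_DELAY;
}

/* After the patch: drop only while the original request is still in progress. */
static bool model_drop_after(const struct model_entry *e, unsigned long now)
{
	(void)now;	/* the age of the entry no longer matters */
	return e->state == MODEL_RC_INPROG;
}
```

With a completed entry stamped at tick 1000, a retry arriving at tick 1050
(50ms later under the assumed MODEL_HZ) is dropped by model_drop_before()
but not by model_drop_after(), which is the fast TCP reconnect case
described in the thread. A retry that finds the entry still in progress is
dropped by both.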