diff mbox

nfsd delays between svc_recv and gss_check_seq_num

Message ID 20160425201002.GA21879@fieldses.org (mailing list archive)
State New, archived
Headers show

Commit Message

J. Bruce Fields April 25, 2016, 8:10 p.m. UTC
On Mon, Apr 25, 2016 at 03:23:56PM -0400, Benjamin Coddington wrote:
> On Mon, 25 Apr 2016, J. Bruce Fields wrote:
> 
> > On Sun, Apr 10, 2016 at 07:44:45AM -0400, Benjamin Coddington wrote:
> > > My client hangs on xfstests generic/074 on a krb5 mount, and I've found that
> > > the linux server is silently discarding one or more RPCs because the GSS
> > > sequence numbers are outside the sequence window.
> > >
> > > The reason is that sometimes one of the nfsd threads takes a long time
> > > between receiving the RPC and then checking if the sequence is within the
> > > window.  That delay allows the other nfsd threads to quickly move the window
> > > forward out of range.
> > >
> > > If the server discards the RPC then that causes then the client to wait
> > > forever for a response or until the connection is reset.
> > >
> > > By inserting tracepoints, I think I found two sources of delay:
> > >
> > >  1) gss_svc_searchbyctx() uses dup_to_netobj() which has a kmemdup with
> > > GFP_KERNEL.  It does this because presumabely it doesn't know how big the
> > > context handle should be.
> > >
> > >  2) gss_verify_mic() uses make_checksum() which eventually gets to
> > > crypto_alloc_hash() with GFP_KERNEL.
> > >
> > > For the first delay, can we assume the context handles are all going to be
> > > the same size?  It looks like the handle is assigned by the server, so it
> > > seems like we should be able to know beforehand how large they are.
> >
> > It's assigned by the server, but I believe that happens in userland,
> > either in svcgssd or gss-proxy.  On a quick look I can't find a limit
> > other than the rpc-imposed limit of 400 bytes for an rpc credential.  So
> > we'd need a documented agreement with svcgssd and gss-proxy for that.
> > Probably easy for the former, not sure about the latter.
> 
> OK, that gives me something to follow up.

Taking a look at the code....  That thing we're allocating is just a
temporary way to pass in the handle that we're looking for, so why not
just something like this?:


--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 1095be9c80ab..22886097d8ee 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -569,10 +569,9 @@  gss_svc_searchbyctx(struct cache_detail *cd, struct xdr_netobj *handle)
 	struct rsc *found;
 
 	memset(&rsci, 0, sizeof(rsci));
-	if (dup_to_netobj(&rsci.handle, handle->data, handle->len))
-		return NULL;
+	rsci.handle.data = handle->data;
+	rsci.handle.len = handle->len;
 	found = rsc_lookup(cd, &rsci);
-	rsc_free(&rsci);
 	if (!found)
 		return NULL;
 	if (cache_check(cd, &found->h, NULL))