Message ID | 20120820223746.GL5779@fieldses.org (mailing list archive) |
---|---|
State | New, archived |
On Mon, Aug 20, 2012 at 06:37:47PM -0400, bfields wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
>
> The rpc server tries to ensure that there will be room to send a reply
> before it receives a request.
>
> It does this by tracking, in xpt_reserved, an upper bound on the total
> size of the replies that it has already committed to for the socket.
>
> Currently it is adding in the estimate for a new reply *before* it
> checks whether there is space available.  If it finds that there is not
> space, it then subtracts the estimate back out.
>
> This may lead the subsequent svc_xprt_enqueue to decide that there is
> space after all.
>
> The result is a svc_recv() that will repeatedly return -EAGAIN, causing
> server threads to loop without doing any actual work.
>
> Cc: stable@vger.kernel.org
> Reported-by: Michael Tokarev <mjt@tls.msk.ru>
> Tested-by: Michael Tokarev <mjt@tls.msk.ru>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  net/sunrpc/svc_xprt.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> Queuing up for 3.6 absent any objections. --b.

By the way, one thing I'm still curious about is how this got
introduced.  mjt bisected it to f03d78db65085609938fdb686238867e65003181
"net: refine {udp|tcp|sctp}_mem limits", which looks like it just made
the problem a little more likely.

The last substantive change to the has_wspace logic was Trond's
47fcb03fefee2501e79176932a4184fc24d6f8ec, but I have a tough time
figuring out whether that would have affected it one way or the other.

As far as I can tell we've always added to xpt_reserved in this way, so
that svc_recv and svc_xprt_enqueue are comparing different things, and
surely this was always wrong even if the problem must have been harder
to trigger before.

But some of the wspace logic I don't understand, so cc'ing Neil and
Trond in case they see any other problem I missed.

--b.

> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 0d693a8..bac973a 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -316,7 +316,6 @@ static bool svc_xprt_has_something_to_do(struct svc_xprt *xprt)
>   */
>  void svc_xprt_enqueue(struct svc_xprt *xprt)
>  {
> -        struct svc_serv *serv = xprt->xpt_server;
>          struct svc_pool *pool;
>          struct svc_rqst *rqstp;
>          int cpu;
> @@ -362,8 +361,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
>                          rqstp, rqstp->rq_xprt);
>                  rqstp->rq_xprt = xprt;
>                  svc_xprt_get(xprt);
> -                rqstp->rq_reserved = serv->sv_max_mesg;
> -                atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
>                  pool->sp_stats.threads_woken++;
>                  wake_up(&rqstp->rq_wait);
>          } else {
> @@ -640,8 +637,6 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
>          if (xprt) {
>                  rqstp->rq_xprt = xprt;
>                  svc_xprt_get(xprt);
> -                rqstp->rq_reserved = serv->sv_max_mesg;
> -                atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
>
>                  /* As there is a shortage of threads and this request
>                   * had to be queued, don't allow the thread to wait so
> @@ -738,6 +733,8 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
>                  else
>                          len = xprt->xpt_ops->xpo_recvfrom(rqstp);
>                  dprintk("svc: got len=%d\n", len);
> +                rqstp->rq_reserved = serv->sv_max_mesg;
> +                atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
>          }
>          svc_xprt_received(xprt);
>
> --
> 1.7.9.5
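To make the ordering problem concrete, here is a minimal user-space
sketch of the pre-patch behaviour.  It is only an illustration: the
names (SEND_SPACE, REPLY_ESTIMATE, buggy_recv, and so on) and the
numbers are made up and this is not the sunrpc code.  It shows how a
receive path that counts its own freshly added reservation when it
checks for write space can disagree with the check the enqueue path
performs, so the transport keeps being handed to a thread and the recv
step keeps failing with -EAGAIN.

/*
 * Toy model of the pre-patch ordering bug (illustrative only).
 * Build with: cc -std=c99 -o wspace-model wspace-model.c
 */
#include <stdio.h>
#include <stdbool.h>

#define MODEL_EAGAIN    11
#define SEND_SPACE      1000    /* pretend socket send-buffer space */
#define REPLY_ESTIMATE  600     /* sv_max_mesg-style per-reply upper bound */

static int reserved;            /* models xpt_reserved */

static bool has_wspace(void)
{
        return SEND_SPACE - reserved >= REPLY_ESTIMATE;
}

/* enqueue path: checks write space without reserving anything first */
static bool enqueue_says_ready(void)
{
        return has_wspace();
}

/* pre-patch recv path: reserve first, check second, then back it out */
static int buggy_recv(void)
{
        reserved += REPLY_ESTIMATE;             /* reserve before the check */
        if (!has_wspace()) {
                reserved -= REPLY_ESTIMATE;     /* subtract it back out */
                return -MODEL_EAGAIN;
        }
        return 0;       /* reservation would stay until the reply is sent */
}

int main(void)
{
        reserved = 300;         /* one earlier reply still outstanding */

        for (int i = 0; i < 5; i++) {
                if (!enqueue_says_ready())
                        break;  /* never taken: 1000 - 300 >= 600 */
                /* prints -11 every pass: the thread loops doing no work */
                printf("pass %d: recv returned %d\n", i, buggy_recv());
        }
        return 0;
}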
On 21.08.2012 02:37, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
>
> The rpc server tries to ensure that there will be room to send a reply
> before it receives a request.
>
> It does this by tracking, in xpt_reserved, an upper bound on the total
> size of the replies that it has already committed to for the socket.
>
> Currently it is adding in the estimate for a new reply *before* it
> checks whether there is space available.  If it finds that there is not
> space, it then subtracts the estimate back out.
>
> This may lead the subsequent svc_xprt_enqueue to decide that there is
> space after all.
>
> The result is a svc_recv() that will repeatedly return -EAGAIN, causing
> server threads to loop without doing any actual work.
>
> Cc: stable@vger.kernel.org

This is applicable to all 3.0+ stable kernels.  The commit which made
this bug apparent (the change to the tcp/udp/sctp memory buffer limits)
went into 3.0-rc5; before that commit the bug is not triggered, at
least here.  So 3.0, 3.2, 3.4 and 3.5 (the stable series maintained at
the moment) should definitely include this fix.

Should it be applied to 2.6.32 & Co too?

/mjt

> Reported-by: Michael Tokarev <mjt@tls.msk.ru>
> Tested-by: Michael Tokarev <mjt@tls.msk.ru>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
On Tue, Aug 21, 2012 at 10:20:31AM +0400, Michael Tokarev wrote:
> On 21.08.2012 02:37, J. Bruce Fields wrote:
> > From: "J. Bruce Fields" <bfields@redhat.com>
> >
> > The rpc server tries to ensure that there will be room to send a reply
> > before it receives a request.
> >
> > It does this by tracking, in xpt_reserved, an upper bound on the total
> > size of the replies that it has already committed to for the socket.
> >
> > Currently it is adding in the estimate for a new reply *before* it
> > checks whether there is space available.  If it finds that there is not
> > space, it then subtracts the estimate back out.
> >
> > This may lead the subsequent svc_xprt_enqueue to decide that there is
> > space after all.
> >
> > The result is a svc_recv() that will repeatedly return -EAGAIN, causing
> > server threads to loop without doing any actual work.
> >
> > Cc: stable@vger.kernel.org
>
> This is applicable to all 3.0+ stable kernels.  The commit which made
> this bug apparent (the change to the tcp/udp/sctp memory buffer limits)
> went into 3.0-rc5; before that commit the bug is not triggered, at
> least here.  So 3.0, 3.2, 3.4 and 3.5 (the stable series maintained at
> the moment) should definitely include this fix.
>
> Should it be applied to 2.6.32 & Co too?

Probably.  It doesn't apply cleanly as is, though, so I suspect the
stable cc won't be enough to get it applied by default (that may be
true of some of the more recent kernels too, I haven't checked).  If
you want to be sure that happens, test and post a backported patch and
I'll make sure it gets to Greg.

--b.

> /mjt
>
> > Reported-by: Michael Tokarev <mjt@tls.msk.ru>
> > Tested-by: Michael Tokarev <mjt@tls.msk.ru>
> > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
On Mon, 20 Aug 2012 18:49:15 -0400 "J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Mon, Aug 20, 2012 at 06:37:47PM -0400, bfields wrote:
> > From: "J. Bruce Fields" <bfields@redhat.com>
> >
> > The rpc server tries to ensure that there will be room to send a reply
> > before it receives a request.
> >
> > It does this by tracking, in xpt_reserved, an upper bound on the total
> > size of the replies that it has already committed to for the socket.
> >
> > Currently it is adding in the estimate for a new reply *before* it
> > checks whether there is space available.  If it finds that there is not
> > space, it then subtracts the estimate back out.
> >
> > This may lead the subsequent svc_xprt_enqueue to decide that there is
> > space after all.
> >
> > The result is a svc_recv() that will repeatedly return -EAGAIN, causing
> > server threads to loop without doing any actual work.
> >
> > Cc: stable@vger.kernel.org
> > Reported-by: Michael Tokarev <mjt@tls.msk.ru>
> > Tested-by: Michael Tokarev <mjt@tls.msk.ru>
> > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > ---
> >  net/sunrpc/svc_xprt.c | 7 ++-----
> >  1 file changed, 2 insertions(+), 5 deletions(-)
> >
> > Queuing up for 3.6 absent any objections. --b.
>
> By the way, one thing I'm still curious about is how this got
> introduced.  mjt bisected it to f03d78db65085609938fdb686238867e65003181
> "net: refine {udp|tcp|sctp}_mem limits", which looks like it just made
> the problem a little more likely.
>
> The last substantive change to the has_wspace logic was Trond's
> 47fcb03fefee2501e79176932a4184fc24d6f8ec, but I have a tough time
> figuring out whether that would have affected it one way or the other.
>
> As far as I can tell we've always added to xpt_reserved in this way, so
> that svc_recv and svc_xprt_enqueue are comparing different things, and
> surely this was always wrong even if the problem must have been harder
> to trigger before.
>
> But some of the wspace logic I don't understand, so cc'ing Neil and
> Trond in case they see any other problem I missed.

Hi Bruce et al.

My guess is that

commit 9c335c0b8daf56b9f73479d00b1dd726e1fcca09
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Tue Oct 26 11:32:03 2010 -0400

    svcrpc: fix wspace-checking race

introduced this problem.  It moved the test for 'do we have enough
space' from before we add in the requirements of a new request, to
after.

But I think your patch looks good.

Thanks,
NeilBrown

>
> --b.
On Tue, Sep 25, 2012 at 03:54:57PM +1000, NeilBrown wrote:
> On Mon, 20 Aug 2012 18:49:15 -0400 "J. Bruce Fields" <bfields@fieldses.org>
> wrote:
>
> > On Mon, Aug 20, 2012 at 06:37:47PM -0400, bfields wrote:
> > > From: "J. Bruce Fields" <bfields@redhat.com>
> > >
> > > The rpc server tries to ensure that there will be room to send a reply
> > > before it receives a request.
> > >
> > > It does this by tracking, in xpt_reserved, an upper bound on the total
> > > size of the replies that it has already committed to for the socket.
> > >
> > > Currently it is adding in the estimate for a new reply *before* it
> > > checks whether there is space available.  If it finds that there is not
> > > space, it then subtracts the estimate back out.
> > >
> > > This may lead the subsequent svc_xprt_enqueue to decide that there is
> > > space after all.
> > >
> > > The result is a svc_recv() that will repeatedly return -EAGAIN, causing
> > > server threads to loop without doing any actual work.
> > >
> > > Cc: stable@vger.kernel.org
> > > Reported-by: Michael Tokarev <mjt@tls.msk.ru>
> > > Tested-by: Michael Tokarev <mjt@tls.msk.ru>
> > > Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > > ---
> > >  net/sunrpc/svc_xprt.c | 7 ++-----
> > >  1 file changed, 2 insertions(+), 5 deletions(-)
> > >
> > > Queuing up for 3.6 absent any objections. --b.
> >
> > By the way, one thing I'm still curious about is how this got
> > introduced.  mjt bisected it to f03d78db65085609938fdb686238867e65003181
> > "net: refine {udp|tcp|sctp}_mem limits", which looks like it just made
> > the problem a little more likely.
> >
> > The last substantive change to the has_wspace logic was Trond's
> > 47fcb03fefee2501e79176932a4184fc24d6f8ec, but I have a tough time
> > figuring out whether that would have affected it one way or the other.
> >
> > As far as I can tell we've always added to xpt_reserved in this way, so
> > that svc_recv and svc_xprt_enqueue are comparing different things, and
> > surely this was always wrong even if the problem must have been harder
> > to trigger before.
> >
> > But some of the wspace logic I don't understand, so cc'ing Neil and
> > Trond in case they see any other problem I missed.
>
> Hi Bruce et al.
>
> My guess is that
>
> commit 9c335c0b8daf56b9f73479d00b1dd726e1fcca09
> Author: J. Bruce Fields <bfields@redhat.com>
> Date:   Tue Oct 26 11:32:03 2010 -0400
>
>     svcrpc: fix wspace-checking race
>
> introduced this problem.  It moved the test for 'do we have enough
> space' from before we add in the requirements of a new request, to
> after.
>
> But I think your patch looks good.

Oops--thanks for spotting that!

All I wish I had now is an idea for a patch that would make that
mistake less likely in the future.  Slather on some comments, I
guess....

--b.
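One possible shape for such a comment, purely as an illustrative sketch
(nothing like this was posted in the thread), would be to spell out the
invariant next to the wspace logic in net/sunrpc/svc_xprt.c so the
ordering is harder to get wrong again:

/*
 * Illustrative sketch only -- not comments from an actual patch.
 *
 * xpt_reserved is an upper bound on the space the server has already
 * promised to replies on this transport.  Any write-space check must
 * be made before the estimate for a new request is added, and
 * svc_xprt_enqueue() and svc_recv() must compare the same quantity:
 * reserving first and checking second lets the two paths disagree and
 * makes svc_recv() spin returning -EAGAIN.
 */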
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 0d693a8..bac973a 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -316,7 +316,6 @@ static bool svc_xprt_has_something_to_do(struct svc_xprt *xprt)
  */
 void svc_xprt_enqueue(struct svc_xprt *xprt)
 {
-        struct svc_serv *serv = xprt->xpt_server;
         struct svc_pool *pool;
         struct svc_rqst *rqstp;
         int cpu;
@@ -362,8 +361,6 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
                         rqstp, rqstp->rq_xprt);
                 rqstp->rq_xprt = xprt;
                 svc_xprt_get(xprt);
-                rqstp->rq_reserved = serv->sv_max_mesg;
-                atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
                 pool->sp_stats.threads_woken++;
                 wake_up(&rqstp->rq_wait);
         } else {
@@ -640,8 +637,6 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
         if (xprt) {
                 rqstp->rq_xprt = xprt;
                 svc_xprt_get(xprt);
-                rqstp->rq_reserved = serv->sv_max_mesg;
-                atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);

                 /* As there is a shortage of threads and this request
                  * had to be queued, don't allow the thread to wait so
@@ -738,6 +733,8 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
                 else
                         len = xprt->xpt_ops->xpo_recvfrom(rqstp);
                 dprintk("svc: got len=%d\n", len);
+                rqstp->rq_reserved = serv->sv_max_mesg;
+                atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
         }
         svc_xprt_received(xprt);
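Read against the toy model shown earlier, the effect of the patch is
that the reply estimate is only charged to xpt_reserved once a request
has actually been read (after xpo_recvfrom), so the write-space check
done at enqueue time is no longer racing against a reservation that is
about to be backed out again.  A hedged sketch of the post-patch
ordering, reusing the illustrative names from that model (not the real
sunrpc code):

/* Post-patch ordering in the same toy model: no reservation is made
 * up front, so the enqueue-side has_wspace() check and the recv side
 * always see the same value of "reserved".
 */
static int fixed_recv(void)
{
        /* ...read the request here (xpo_recvfrom in the real code)... */
        reserved += REPLY_ESTIMATE;     /* charge the reply estimate only now */
        return 0;       /* the reservation is dropped once the reply goes out */
}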