Message ID | 5E7D6A55-B7F3-411D-A74B-E8BCE04BCF02@oracle.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Following up. > On Aug 3, 2016, at 3:40 PM, Chuck Lever <chuck.lever@oracle.com> wrote: > >> >> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote: >> >> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote: >>> Hi Bruce- >>> >>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f >>> Author: J. Bruce Fields <bfields@redhat.com> >>> AuthorDate: Tue May 17 12:38:21 2016 -0400 >>> >>> rpc: share one xps between all backchannels >>> >>> has added this piece of code: >>> >>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, >>> struct rpc_clnt *clnt = NULL; >>> struct rpc_xprt_switch *xps; >>> >>> - xps = xprt_switch_alloc(xprt, GFP_KERNEL); >>> - if (xps == NULL) { >>> - xprt_put(xprt); >>> - return ERR_PTR(-ENOMEM); >>> + if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) { >>> + WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >>> + xps = args->bc_xprt->xpt_bc_xps; >>> + xprt_switch_get(xps); >>> + } else { >>> >>> >>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA. >>> >>> Can you say why it was added? Is there something RPC/RDMA needs to >>> do to make the code safe? >> >> What is args->protocol in this case? >> >> Digging around... OK, I missed that BC_TCP and BC_RDMA were defined as >> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying >> transport. That makes sense. >> >> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant >> was "is this a backchannel". >> >> Does that fix the problem? > > This simple fix eliminates the log noise: > > diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c > index 2808d55..f94caf7 100644 > --- a/net/sunrpc/clnt.c > +++ b/net/sunrpc/clnt.c > @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) > char servername[48]; > > if (args->bc_xprt) { > - WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); > + WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC)); > xprt = args->bc_xprt->xpt_bc_xprt; > if (xprt) { > xprt_get(xprt); > > > This code seems to come from: > > commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 > Author: J. Bruce Fields <bfields@redhat.com> > AuthorDate: Mon May 16 17:03:42 2016 -0400 > > nfsd4/rpc: move backchannel create logic into rpc code > > > Where it may have been copied from: > > -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args) > -{ > - struct rpc_xprt *xprt; > - > - if (args->protocol != XPRT_TRANSPORT_BC_TCP) > - return rpc_create(args); > - > - xprt = args->bc_xprt->xpt_bc_xprt; > - if (xprt) { > - xprt_get(xprt); > - return rpc_create_xprt(args, xprt); > - } > - > - return rpc_create(args); > -} > > There's no warning here. In fact, protocol != BC_TCP seems to > be expected. > > I'm wondering if the warning is needed at all? Using NFSv4.1/RDMA against my v4.7 NFS server seems to result in a system deadlock in short order on the server. I haven't looked further into this because you mentioned you were going to have a look at these commits that change the backchannel code. -- Chuck Lever -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Aug 03, 2016 at 03:40:11PM -0400, Chuck Lever wrote: > > > On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote: > > > > On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote: > >> Hi Bruce- > >> > >> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f > >> Author: J. Bruce Fields <bfields@redhat.com> > >> AuthorDate: Tue May 17 12:38:21 2016 -0400 > >> > >> rpc: share one xps between all backchannels > >> > >> has added this piece of code: > >> > >> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, > >> struct rpc_clnt *clnt = NULL; > >> struct rpc_xprt_switch *xps; > >> > >> - xps = xprt_switch_alloc(xprt, GFP_KERNEL); > >> - if (xps == NULL) { > >> - xprt_put(xprt); > >> - return ERR_PTR(-ENOMEM); > >> + if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) { > >> + WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); > >> + xps = args->bc_xprt->xpt_bc_xps; > >> + xprt_switch_get(xps); > >> + } else { > >> > >> > >> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA. > >> > >> Can you say why it was added? Is there something RPC/RDMA needs to > >> do to make the code safe? > > > > What is args->protocol in this case? > > > > Digging around... OK, I missed that BC_TCP and BC_RDMA were defined as > > OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying > > transport. That makes sense. > > > > So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant > > was "is this a backchannel". > > > > Does that fix the problem? > > This simple fix eliminates the log noise: > > diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c > index 2808d55..f94caf7 100644 > --- a/net/sunrpc/clnt.c > +++ b/net/sunrpc/clnt.c > @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) > char servername[48]; > > if (args->bc_xprt) { > - WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); > + WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC)); > xprt = args->bc_xprt->xpt_bc_xprt; > if (xprt) { > xprt_get(xprt); > > > This code seems to come from: > > commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 > Author: J. Bruce Fields <bfields@redhat.com> > AuthorDate: Mon May 16 17:03:42 2016 -0400 > > nfsd4/rpc: move backchannel create logic into rpc code > > > Where it may have been copied from: > > -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args) > -{ > - struct rpc_xprt *xprt; > - > - if (args->protocol != XPRT_TRANSPORT_BC_TCP) > - return rpc_create(args); > - > - xprt = args->bc_xprt->xpt_bc_xprt; > - if (xprt) { > - xprt_get(xprt); > - return rpc_create_xprt(args, xprt); > - } > - > - return rpc_create(args); > -} > > There's no warning here. In fact, protocol != BC_TCP seems to > be expected. The protocol should be BC_TCP (OK, actually just BC) if and only if bc_xprt is set. (The BC_TCP case is the 4.1+ case, the other is the 4.0 case. In the 4.1+ case, the new client uses an existing (client-initiated) connection, in the 4.0 case, the new client must also have a new connection. In the 4.0 case we'll always create a new xprt, in the 4.1 case we might or might not--depends on whether that particular connection has been used for a backchannel previously.) --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Aug 10, 2016 at 02:01:20PM -0400, Chuck Lever wrote: > Following up. > > > > On Aug 3, 2016, at 3:40 PM, Chuck Lever <chuck.lever@oracle.com> wrote: > > > >> > >> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote: > >> > >> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote: > >>> Hi Bruce- > >>> > >>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f > >>> Author: J. Bruce Fields <bfields@redhat.com> > >>> AuthorDate: Tue May 17 12:38:21 2016 -0400 > >>> > >>> rpc: share one xps between all backchannels > >>> > >>> has added this piece of code: > >>> > >>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, > >>> struct rpc_clnt *clnt = NULL; > >>> struct rpc_xprt_switch *xps; > >>> > >>> - xps = xprt_switch_alloc(xprt, GFP_KERNEL); > >>> - if (xps == NULL) { > >>> - xprt_put(xprt); > >>> - return ERR_PTR(-ENOMEM); > >>> + if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) { > >>> + WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); > >>> + xps = args->bc_xprt->xpt_bc_xps; > >>> + xprt_switch_get(xps); > >>> + } else { > >>> > >>> > >>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA. > >>> > >>> Can you say why it was added? Is there something RPC/RDMA needs to > >>> do to make the code safe? > >> > >> What is args->protocol in this case? > >> > >> Digging around... OK, I missed that BC_TCP and BC_RDMA were defined as > >> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying > >> transport. That makes sense. > >> > >> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant > >> was "is this a backchannel". > >> > >> Does that fix the problem? > > > > This simple fix eliminates the log noise: > > > > diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c > > index 2808d55..f94caf7 100644 > > --- a/net/sunrpc/clnt.c > > +++ b/net/sunrpc/clnt.c > > @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) > > char servername[48]; > > > > if (args->bc_xprt) { > > - WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); > > + WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC)); > > xprt = args->bc_xprt->xpt_bc_xprt; > > if (xprt) { > > xprt_get(xprt); > > > > > > This code seems to come from: > > > > commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 > > Author: J. Bruce Fields <bfields@redhat.com> > > AuthorDate: Mon May 16 17:03:42 2016 -0400 > > > > nfsd4/rpc: move backchannel create logic into rpc code > > > > > > Where it may have been copied from: > > > > -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args) > > -{ > > - struct rpc_xprt *xprt; > > - > > - if (args->protocol != XPRT_TRANSPORT_BC_TCP) > > - return rpc_create(args); > > - > > - xprt = args->bc_xprt->xpt_bc_xprt; > > - if (xprt) { > > - xprt_get(xprt); > > - return rpc_create_xprt(args, xprt); > > - } > > - > > - return rpc_create(args); > > -} > > > > There's no warning here. In fact, protocol != BC_TCP seems to > > be expected. > > > > I'm wondering if the warning is needed at all? > > Using NFSv4.1/RDMA against my v4.7 NFS server seems to result > in a system deadlock in short order on the server. I haven't > looked further into this because you mentioned you were going > to have a look at these commits that change the backchannel > code. I'm not seeing an obvious bug in those commits, for what it's worth. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Aug 18, 2016, at 5:56 PM, J. Bruce Fields <bfields@fieldses.org> wrote: > > On Wed, Aug 10, 2016 at 02:01:20PM -0400, Chuck Lever wrote: >> Following up. >> >> >>> On Aug 3, 2016, at 3:40 PM, Chuck Lever <chuck.lever@oracle.com> wrote: >>> >>>> >>>> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote: >>>> >>>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote: >>>>> Hi Bruce- >>>>> >>>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f >>>>> Author: J. Bruce Fields <bfields@redhat.com> >>>>> AuthorDate: Tue May 17 12:38:21 2016 -0400 >>>>> >>>>> rpc: share one xps between all backchannels >>>>> >>>>> has added this piece of code: >>>>> >>>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, >>>>> struct rpc_clnt *clnt = NULL; >>>>> struct rpc_xprt_switch *xps; >>>>> >>>>> - xps = xprt_switch_alloc(xprt, GFP_KERNEL); >>>>> - if (xps == NULL) { >>>>> - xprt_put(xprt); >>>>> - return ERR_PTR(-ENOMEM); >>>>> + if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) { >>>>> + WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >>>>> + xps = args->bc_xprt->xpt_bc_xps; >>>>> + xprt_switch_get(xps); >>>>> + } else { >>>>> >>>>> >>>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA. >>>>> >>>>> Can you say why it was added? Is there something RPC/RDMA needs to >>>>> do to make the code safe? >>>> >>>> What is args->protocol in this case? >>>> >>>> Digging around... OK, I missed that BC_TCP and BC_RDMA were defined as >>>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying >>>> transport. That makes sense. >>>> >>>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant >>>> was "is this a backchannel". >>>> >>>> Does that fix the problem? >>> >>> This simple fix eliminates the log noise: >>> >>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c >>> index 2808d55..f94caf7 100644 >>> --- a/net/sunrpc/clnt.c >>> +++ b/net/sunrpc/clnt.c >>> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) >>> char servername[48]; >>> >>> if (args->bc_xprt) { >>> - WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >>> + WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC)); >>> xprt = args->bc_xprt->xpt_bc_xprt; >>> if (xprt) { >>> xprt_get(xprt); >>> >>> >>> This code seems to come from: >>> >>> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 >>> Author: J. Bruce Fields <bfields@redhat.com> >>> AuthorDate: Mon May 16 17:03:42 2016 -0400 >>> >>> nfsd4/rpc: move backchannel create logic into rpc code >>> >>> >>> Where it may have been copied from: >>> >>> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args) >>> -{ >>> - struct rpc_xprt *xprt; >>> - >>> - if (args->protocol != XPRT_TRANSPORT_BC_TCP) >>> - return rpc_create(args); >>> - >>> - xprt = args->bc_xprt->xpt_bc_xprt; >>> - if (xprt) { >>> - xprt_get(xprt); >>> - return rpc_create_xprt(args, xprt); >>> - } >>> - >>> - return rpc_create(args); >>> -} >>> >>> There's no warning here. In fact, protocol != BC_TCP seems to >>> be expected. >>> >>> I'm wondering if the warning is needed at all? >> >> Using NFSv4.1/RDMA against my v4.7 NFS server seems to result >> in a system deadlock in short order on the server. I haven't >> looked further into this because you mentioned you were going >> to have a look at these commits that change the backchannel >> code. > > I'm not seeing an obvious bug in those commits, for what it's worth. Thanks for checking. I've tracked the misbehavior down to a DMA mapping mismatch. It's not related to the backchannel at all. I'm putting together a fix right now. But I would like to use NFSv4.1/RDMA without that warning firing. Any reason to keep it (in either place) ? -- Chuck Lever -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Aug 18, 2016, at 5:56 PM, J. Bruce Fields <bfields@fieldses.org> wrote: > > On Wed, Aug 03, 2016 at 03:40:11PM -0400, Chuck Lever wrote: >> >>> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote: >>> >>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote: >>>> Hi Bruce- >>>> >>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f >>>> Author: J. Bruce Fields <bfields@redhat.com> >>>> AuthorDate: Tue May 17 12:38:21 2016 -0400 >>>> >>>> rpc: share one xps between all backchannels >>>> >>>> has added this piece of code: >>>> >>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, >>>> struct rpc_clnt *clnt = NULL; >>>> struct rpc_xprt_switch *xps; >>>> >>>> - xps = xprt_switch_alloc(xprt, GFP_KERNEL); >>>> - if (xps == NULL) { >>>> - xprt_put(xprt); >>>> - return ERR_PTR(-ENOMEM); >>>> + if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) { >>>> + WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >>>> + xps = args->bc_xprt->xpt_bc_xps; >>>> + xprt_switch_get(xps); >>>> + } else { >>>> >>>> >>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA. >>>> >>>> Can you say why it was added? Is there something RPC/RDMA needs to >>>> do to make the code safe? >>> >>> What is args->protocol in this case? >>> >>> Digging around... OK, I missed that BC_TCP and BC_RDMA were defined as >>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying >>> transport. That makes sense. >>> >>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant >>> was "is this a backchannel". >>> >>> Does that fix the problem? >> >> This simple fix eliminates the log noise: >> >> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c >> index 2808d55..f94caf7 100644 >> --- a/net/sunrpc/clnt.c >> +++ b/net/sunrpc/clnt.c >> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) >> char servername[48]; >> >> if (args->bc_xprt) { >> - WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >> + WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC)); >> xprt = args->bc_xprt->xpt_bc_xprt; >> if (xprt) { >> xprt_get(xprt); >> >> >> This code seems to come from: >> >> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 >> Author: J. Bruce Fields <bfields@redhat.com> >> AuthorDate: Mon May 16 17:03:42 2016 -0400 >> >> nfsd4/rpc: move backchannel create logic into rpc code >> >> >> Where it may have been copied from: >> >> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args) >> -{ >> - struct rpc_xprt *xprt; >> - >> - if (args->protocol != XPRT_TRANSPORT_BC_TCP) >> - return rpc_create(args); >> - >> - xprt = args->bc_xprt->xpt_bc_xprt; >> - if (xprt) { >> - xprt_get(xprt); >> - return rpc_create_xprt(args, xprt); >> - } >> - >> - return rpc_create(args); >> -} >> >> There's no warning here. In fact, protocol != BC_TCP seems to >> be expected. > > The protocol should be BC_TCP (OK, actually just BC) if and only if > bc_xprt is set. > > (The BC_TCP case is the 4.1+ case, the other is the 4.0 case. In the > 4.1+ case, the new client uses an existing (client-initiated) > connection, in the 4.0 case, the new client must also have a new > connection. > > In the 4.0 case we'll always create a new xprt, in the 4.1 case we might > or might not--depends on whether that particular connection has been > used for a backchannel previously.) OK, but why is a WARN_ON needed here? Why not return -EINVAL, for example (once you've corrected BC_TCP -> BC) ? -- Chuck Lever -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote: > > > On Aug 18, 2016, at 5:56 PM, J. Bruce Fields <bfields@fieldses.org> wrote: > > > > On Wed, Aug 03, 2016 at 03:40:11PM -0400, Chuck Lever wrote: > >> > >>> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote: > >>> > >>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote: > >>>> Hi Bruce- > >>>> > >>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f > >>>> Author: J. Bruce Fields <bfields@redhat.com> > >>>> AuthorDate: Tue May 17 12:38:21 2016 -0400 > >>>> > >>>> rpc: share one xps between all backchannels > >>>> > >>>> has added this piece of code: > >>>> > >>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, > >>>> struct rpc_clnt *clnt = NULL; > >>>> struct rpc_xprt_switch *xps; > >>>> > >>>> - xps = xprt_switch_alloc(xprt, GFP_KERNEL); > >>>> - if (xps == NULL) { > >>>> - xprt_put(xprt); > >>>> - return ERR_PTR(-ENOMEM); > >>>> + if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) { > >>>> + WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); > >>>> + xps = args->bc_xprt->xpt_bc_xps; > >>>> + xprt_switch_get(xps); > >>>> + } else { > >>>> > >>>> > >>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA. > >>>> > >>>> Can you say why it was added? Is there something RPC/RDMA needs to > >>>> do to make the code safe? > >>> > >>> What is args->protocol in this case? > >>> > >>> Digging around... OK, I missed that BC_TCP and BC_RDMA were defined as > >>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying > >>> transport. That makes sense. > >>> > >>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant > >>> was "is this a backchannel". > >>> > >>> Does that fix the problem? > >> > >> This simple fix eliminates the log noise: > >> > >> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c > >> index 2808d55..f94caf7 100644 > >> --- a/net/sunrpc/clnt.c > >> +++ b/net/sunrpc/clnt.c > >> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) > >> char servername[48]; > >> > >> if (args->bc_xprt) { > >> - WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); > >> + WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC)); > >> xprt = args->bc_xprt->xpt_bc_xprt; > >> if (xprt) { > >> xprt_get(xprt); > >> > >> > >> This code seems to come from: > >> > >> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 > >> Author: J. Bruce Fields <bfields@redhat.com> > >> AuthorDate: Mon May 16 17:03:42 2016 -0400 > >> > >> nfsd4/rpc: move backchannel create logic into rpc code > >> > >> > >> Where it may have been copied from: > >> > >> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args) > >> -{ > >> - struct rpc_xprt *xprt; > >> - > >> - if (args->protocol != XPRT_TRANSPORT_BC_TCP) > >> - return rpc_create(args); > >> - > >> - xprt = args->bc_xprt->xpt_bc_xprt; > >> - if (xprt) { > >> - xprt_get(xprt); > >> - return rpc_create_xprt(args, xprt); > >> - } > >> - > >> - return rpc_create(args); > >> -} > >> > >> There's no warning here. In fact, protocol != BC_TCP seems to > >> be expected. > > > > The protocol should be BC_TCP (OK, actually just BC) if and only if > > bc_xprt is set. > > > > (The BC_TCP case is the 4.1+ case, the other is the 4.0 case. In the > > 4.1+ case, the new client uses an existing (client-initiated) > > connection, in the 4.0 case, the new client must also have a new > > connection. > > > > In the 4.0 case we'll always create a new xprt, in the 4.1 case we might > > or might not--depends on whether that particular connection has been > > used for a backchannel previously.) > > OK, but why is a WARN_ON needed here? Why not return -EINVAL, > for example (once you've corrected BC_TCP -> BC) ? Well, it would be a programming bug, so I'd want a WARN_ON or similar somewhere, I don't care particularly where it is if you see a better way to organize things. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 18, 2016 at 05:59:12PM -0400, Chuck Lever wrote: > Thanks for checking. > > I've tracked the misbehavior down to a DMA mapping mismatch. It's > not related to the backchannel at all. I'm putting together a fix > right now. > > But I would like to use NFSv4.1/RDMA without that warning firing. > Any reason to keep it (in either place) ? I thought you said the BC_TCP->BC change fixed it? If so, we should just do that. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Aug 19, 2016, at 10:50 AM, J. Bruce Fields <bfields@redhat.com> wrote: > > On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote: >> >>> On Aug 18, 2016, at 5:56 PM, J. Bruce Fields <bfields@fieldses.org> wrote: >>> >>> On Wed, Aug 03, 2016 at 03:40:11PM -0400, Chuck Lever wrote: >>>> >>>>> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote: >>>>> >>>>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote: >>>>>> Hi Bruce- >>>>>> >>>>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f >>>>>> Author: J. Bruce Fields <bfields@redhat.com> >>>>>> AuthorDate: Tue May 17 12:38:21 2016 -0400 >>>>>> >>>>>> rpc: share one xps between all backchannels >>>>>> >>>>>> has added this piece of code: >>>>>> >>>>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, >>>>>> struct rpc_clnt *clnt = NULL; >>>>>> struct rpc_xprt_switch *xps; >>>>>> >>>>>> - xps = xprt_switch_alloc(xprt, GFP_KERNEL); >>>>>> - if (xps == NULL) { >>>>>> - xprt_put(xprt); >>>>>> - return ERR_PTR(-ENOMEM); >>>>>> + if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) { >>>>>> + WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >>>>>> + xps = args->bc_xprt->xpt_bc_xps; >>>>>> + xprt_switch_get(xps); >>>>>> + } else { >>>>>> >>>>>> >>>>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA. >>>>>> >>>>>> Can you say why it was added? Is there something RPC/RDMA needs to >>>>>> do to make the code safe? >>>>> >>>>> What is args->protocol in this case? >>>>> >>>>> Digging around... OK, I missed that BC_TCP and BC_RDMA were defined as >>>>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying >>>>> transport. That makes sense. >>>>> >>>>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant >>>>> was "is this a backchannel". >>>>> >>>>> Does that fix the problem? >>>> >>>> This simple fix eliminates the log noise: >>>> >>>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c >>>> index 2808d55..f94caf7 100644 >>>> --- a/net/sunrpc/clnt.c >>>> +++ b/net/sunrpc/clnt.c >>>> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) >>>> char servername[48]; >>>> >>>> if (args->bc_xprt) { >>>> - WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >>>> + WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC)); >>>> xprt = args->bc_xprt->xpt_bc_xprt; >>>> if (xprt) { >>>> xprt_get(xprt); >>>> >>>> >>>> This code seems to come from: >>>> >>>> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 >>>> Author: J. Bruce Fields <bfields@redhat.com> >>>> AuthorDate: Mon May 16 17:03:42 2016 -0400 >>>> >>>> nfsd4/rpc: move backchannel create logic into rpc code >>>> >>>> >>>> Where it may have been copied from: >>>> >>>> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args) >>>> -{ >>>> - struct rpc_xprt *xprt; >>>> - >>>> - if (args->protocol != XPRT_TRANSPORT_BC_TCP) >>>> - return rpc_create(args); >>>> - >>>> - xprt = args->bc_xprt->xpt_bc_xprt; >>>> - if (xprt) { >>>> - xprt_get(xprt); >>>> - return rpc_create_xprt(args, xprt); >>>> - } >>>> - >>>> - return rpc_create(args); >>>> -} >>>> >>>> There's no warning here. In fact, protocol != BC_TCP seems to >>>> be expected. >>> >>> The protocol should be BC_TCP (OK, actually just BC) if and only if >>> bc_xprt is set. >>> >>> (The BC_TCP case is the 4.1+ case, the other is the 4.0 case. In the >>> 4.1+ case, the new client uses an existing (client-initiated) >>> connection, in the 4.0 case, the new client must also have a new >>> connection. >>> >>> In the 4.0 case we'll always create a new xprt, in the 4.1 case we might >>> or might not--depends on whether that particular connection has been >>> used for a backchannel previously.) >> >> OK, but why is a WARN_ON needed here? Why not return -EINVAL, >> for example (once you've corrected BC_TCP -> BC) ? > > Well, it would be a programming bug, so I'd want a WARN_ON or similar > somewhere, I don't care particularly where it is if you see a better way > to organize things. The way it works now, the WARN_ON fires, but the logic goes ahead and creates the transport anyway. If this is a programming bug, it should fail and return an error, no transport should be created. I can see a WARN_ON being useful because it displays a backtrace which identifies the broken caller. If it is not a programming bug (which is implied by the fact that a transport is created anyway) then no WARN_ON is needed. If you think it is correct that a WARN_ON fires _and_ a transport is created, could a comment be added explaining that? The new logic seems less straightforward to me than what it replaces. -- Chuck Lever -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Aug 19, 2016, at 11:06 AM, Chuck Lever <chuck.lever@oracle.com> wrote: > >> >> On Aug 19, 2016, at 10:50 AM, J. Bruce Fields <bfields@redhat.com> wrote: >> >> On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote: >>> >>>> On Aug 18, 2016, at 5:56 PM, J. Bruce Fields <bfields@fieldses.org> wrote: >>>> >>>> On Wed, Aug 03, 2016 at 03:40:11PM -0400, Chuck Lever wrote: >>>>> >>>>>> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote: >>>>>> >>>>>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote: >>>>>>> Hi Bruce- >>>>>>> >>>>>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f >>>>>>> Author: J. Bruce Fields <bfields@redhat.com> >>>>>>> AuthorDate: Tue May 17 12:38:21 2016 -0400 >>>>>>> >>>>>>> rpc: share one xps between all backchannels >>>>>>> >>>>>>> has added this piece of code: >>>>>>> >>>>>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, >>>>>>> struct rpc_clnt *clnt = NULL; >>>>>>> struct rpc_xprt_switch *xps; >>>>>>> >>>>>>> - xps = xprt_switch_alloc(xprt, GFP_KERNEL); >>>>>>> - if (xps == NULL) { >>>>>>> - xprt_put(xprt); >>>>>>> - return ERR_PTR(-ENOMEM); >>>>>>> + if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) { >>>>>>> + WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >>>>>>> + xps = args->bc_xprt->xpt_bc_xps; >>>>>>> + xprt_switch_get(xps); >>>>>>> + } else { >>>>>>> >>>>>>> >>>>>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA. >>>>>>> >>>>>>> Can you say why it was added? Is there something RPC/RDMA needs to >>>>>>> do to make the code safe? >>>>>> >>>>>> What is args->protocol in this case? >>>>>> >>>>>> Digging around... OK, I missed that BC_TCP and BC_RDMA were defined as >>>>>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying >>>>>> transport. That makes sense. >>>>>> >>>>>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant >>>>>> was "is this a backchannel". >>>>>> >>>>>> Does that fix the problem? >>>>> >>>>> This simple fix eliminates the log noise: >>>>> >>>>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c >>>>> index 2808d55..f94caf7 100644 >>>>> --- a/net/sunrpc/clnt.c >>>>> +++ b/net/sunrpc/clnt.c >>>>> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) >>>>> char servername[48]; >>>>> >>>>> if (args->bc_xprt) { >>>>> - WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >>>>> + WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC)); >>>>> xprt = args->bc_xprt->xpt_bc_xprt; >>>>> if (xprt) { >>>>> xprt_get(xprt); >>>>> >>>>> >>>>> This code seems to come from: >>>>> >>>>> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 >>>>> Author: J. Bruce Fields <bfields@redhat.com> >>>>> AuthorDate: Mon May 16 17:03:42 2016 -0400 >>>>> >>>>> nfsd4/rpc: move backchannel create logic into rpc code >>>>> >>>>> >>>>> Where it may have been copied from: >>>>> >>>>> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args) >>>>> -{ >>>>> - struct rpc_xprt *xprt; >>>>> - >>>>> - if (args->protocol != XPRT_TRANSPORT_BC_TCP) >>>>> - return rpc_create(args); >>>>> - >>>>> - xprt = args->bc_xprt->xpt_bc_xprt; >>>>> - if (xprt) { >>>>> - xprt_get(xprt); >>>>> - return rpc_create_xprt(args, xprt); >>>>> - } >>>>> - >>>>> - return rpc_create(args); >>>>> -} >>>>> >>>>> There's no warning here. In fact, protocol != BC_TCP seems to >>>>> be expected. >>>> >>>> The protocol should be BC_TCP (OK, actually just BC) if and only if >>>> bc_xprt is set. >>>> >>>> (The BC_TCP case is the 4.1+ case, the other is the 4.0 case. In the >>>> 4.1+ case, the new client uses an existing (client-initiated) >>>> connection, in the 4.0 case, the new client must also have a new >>>> connection. >>>> >>>> In the 4.0 case we'll always create a new xprt, in the 4.1 case we might >>>> or might not--depends on whether that particular connection has been >>>> used for a backchannel previously.) >>> >>> OK, but why is a WARN_ON needed here? Why not return -EINVAL, >>> for example (once you've corrected BC_TCP -> BC) ? >> >> Well, it would be a programming bug, so I'd want a WARN_ON or similar >> somewhere, I don't care particularly where it is if you see a better way >> to organize things. > > The way it works now, the WARN_ON fires, but the logic goes ahead > and creates the transport anyway. > > If this is a programming bug, it should fail and return an error, > no transport should be created. I can see a WARN_ON being useful > because it displays a backtrace which identifies the broken > caller. > > If it is not a programming bug (which is implied by the fact that > a transport is created anyway) then no WARN_ON is needed. > > If you think it is correct that a WARN_ON fires _and_ a transport > is created, could a comment be added explaining that? The new > logic seems less straightforward to me than what it replaces. Also, WARN_ONCE might be preferable. -- Chuck Lever -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Aug 19, 2016 at 11:06:16AM -0400, Chuck Lever wrote: > > On Aug 19, 2016, at 10:50 AM, J. Bruce Fields <bfields@redhat.com> wrote: > > On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote: > >> OK, but why is a WARN_ON needed here? Why not return -EINVAL, > >> for example (once you've corrected BC_TCP -> BC) ? > > > > Well, it would be a programming bug, so I'd want a WARN_ON or similar > > somewhere, I don't care particularly where it is if you see a better way > > to organize things. > > The way it works now, the WARN_ON fires, but the logic goes ahead > and creates the transport anyway. > > If this is a programming bug, it should fail and return an error, I haven't been following that rule. Once upon a time, I would have put a BUG() there. Then Linus pointed out that sometimes a BUG() can bork the machine badly enough that the backtrace doesn't even make it to the logs, rendering it useless. (And I believe that could be the case here since this is running as a work item.) So, I stick a WARN() there instead and don't worry much what happens afterwards. > If it is not a programming bug (which is implied by the fact that > a transport is created anyway) then no WARN_ON is needed. So, could we just agree that WARN_ON means "there's a programming error", regardless of what happens next? Backtraces should never happen on a working kernel. And then ignore the following code path. Unless it's something that's obviously going to immediately oops in the warned case, in which case if we really want the warning then we should return if that looks safer. But I don't have really strong feelings about this case, the warning may be academic since setup_callback_client() makes this look obviously impossible, so if you want to reorganize this somehow, feel free to give it a shot. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Aug 19, 2016, at 11:47 AM, J. Bruce Fields <bfields@fieldses.org> wrote: > > On Fri, Aug 19, 2016 at 11:06:16AM -0400, Chuck Lever wrote: >>> On Aug 19, 2016, at 10:50 AM, J. Bruce Fields <bfields@redhat.com> wrote: >>> On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote: >>>> OK, but why is a WARN_ON needed here? Why not return -EINVAL, >>>> for example (once you've corrected BC_TCP -> BC) ? >>> >>> Well, it would be a programming bug, so I'd want a WARN_ON or similar >>> somewhere, I don't care particularly where it is if you see a better way >>> to organize things. >> >> The way it works now, the WARN_ON fires, but the logic goes ahead >> and creates the transport anyway. >> >> If this is a programming bug, it should fail and return an error, > > I haven't been following that rule. > > Once upon a time, I would have put a BUG() there. Then Linus pointed > out that sometimes a BUG() can bork the machine badly enough that the > backtrace doesn't even make it to the logs, rendering it useless. (And > I believe that could be the case here since this is running as a work > item.) So, I stick a WARN() there instead and don't worry much what > happens afterwards. I still don't understand. If you would have put a BUG here, then why does this logic continue and create the transport anyway? Well, it's a nit, so I'll drop it. >> If it is not a programming bug (which is implied by the fact that >> a transport is created anyway) then no WARN_ON is needed. > > So, could we just agree that WARN_ON means "there's a programming > error", regardless of what happens next? Backtraces should never happen > on a working kernel. > > And then ignore the following code path. Unless it's something that's > obviously going to immediately oops in the warned case, in which case if > we really want the warning then we should return if that looks safer. > > But I don't have really strong feelings about this case, the warning may > be academic since setup_callback_client() makes this look obviously > impossible, so if you want to reorganize this somehow, feel free to give > it a shot. Right, it's that obviously impossible part that made me wonder why there was a warning here in the first place. OK, the fix is to do BC_TCP -> BC and put our pencils down. Do you want me to send you a patch, or do you plan to take care of it? -- Chuck Lever -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Aug 19, 2016 at 11:51:13AM -0400, Chuck Lever wrote: > > > On Aug 19, 2016, at 11:47 AM, J. Bruce Fields <bfields@fieldses.org> wrote: > > > > On Fri, Aug 19, 2016 at 11:06:16AM -0400, Chuck Lever wrote: > >>> On Aug 19, 2016, at 10:50 AM, J. Bruce Fields <bfields@redhat.com> wrote: > >>> On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote: > >>>> OK, but why is a WARN_ON needed here? Why not return -EINVAL, > >>>> for example (once you've corrected BC_TCP -> BC) ? > >>> > >>> Well, it would be a programming bug, so I'd want a WARN_ON or similar > >>> somewhere, I don't care particularly where it is if you see a better way > >>> to organize things. > >> > >> The way it works now, the WARN_ON fires, but the logic goes ahead > >> and creates the transport anyway. > >> > >> If this is a programming bug, it should fail and return an error, > > > > I haven't been following that rule. > > > > Once upon a time, I would have put a BUG() there. Then Linus pointed > > out that sometimes a BUG() can bork the machine badly enough that the > > backtrace doesn't even make it to the logs, rendering it useless. (And > > I believe that could be the case here since this is running as a work > > item.) So, I stick a WARN() there instead and don't worry much what > > happens afterwards. > > I still don't understand. If you would have put a BUG here, then > why does this logic continue and create the transport anyway? It seemed worth 1 line of screen real estate to say "this should never happen", but not 3? > Well, it's a nit, so I'll drop it. But, I mean, I don't care that much either. > >> If it is not a programming bug (which is implied by the fact that > >> a transport is created anyway) then no WARN_ON is needed. > > > > So, could we just agree that WARN_ON means "there's a programming > > error", regardless of what happens next? Backtraces should never happen > > on a working kernel. > > > > And then ignore the following code path. Unless it's something that's > > obviously going to immediately oops in the warned case, in which case if > > we really want the warning then we should return if that looks safer. > > > > But I don't have really strong feelings about this case, the warning may > > be academic since setup_callback_client() makes this look obviously > > impossible, so if you want to reorganize this somehow, feel free to give > > it a shot. > > Right, it's that obviously impossible part that made me wonder why > there was a warning here in the first place. > > OK, the fix is to do BC_TCP -> BC and put our pencils down. Do you > want me to send you a patch That would be great.--b. > or do you plan to take care of it? -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 2808d55..f94caf7 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) char servername[48]; if (args->bc_xprt) { - WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); + WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC)); xprt = args->bc_xprt->xpt_bc_xprt; if (xprt) { xprt_get(xprt);