Message ID | 556F4A0C.2030804@oracle.com (mailing list archive) |
---|---|
State | New, archived |
On Jun 3, 2015, at 2:40 PM, Shirley Ma <shirley.ma@oracle.com> wrote:

> When removing underlying RDMA device, the rmmod will hang forever if there
> are any outstanding NFS/RDMA client mounts. The outstanding NFS/RDMA counts
> could also prevent the server from shutting down. Further debugging shows
> that the existing connections are not teared down and resource are not
> released when receiving RDMA_CM_EVENT_DEVICE_REMOVAL event. It seems the
> original code missing svc_xprt_put() in RDMA_CM_EVENT_REMOVAL event handler
> thus svc_xprt_free is never invoked to release the existing connection resources.
>
> The patch has been passed removing, adding device back and forth without
> stopping NFS/RDMA service. This will also allow a device to be unplugged
> and swapped out without shutting down NFS service.
>
> Signed-off-by: Shirley Ma <shirley.ma@oracle.com>

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>

> ---
>  net/sunrpc/xprtrdma/svc_rdma_transport.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> index f609c1c..2b82569 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> @@ -673,6 +673,7 @@ static int rdma_cma_handler(struct rdma_cm_id *cma_id,
>  		if (xprt) {
>  			set_bit(XPT_CLOSE, &xprt->xpt_flags);
>  			svc_xprt_enqueue(xprt);
> +			svc_xprt_put(xprt);
>  		}
>  		break;
>  	default:
>
> Shirley

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Jun 3, 2015, at 2:44 PM, Chuck Lever <chuck.lever@oracle.com> wrote:

> On Jun 3, 2015, at 2:40 PM, Shirley Ma <shirley.ma@oracle.com> wrote:
>
>> When removing underlying RDMA device, the rmmod will hang forever if there
>> are any outstanding NFS/RDMA client mounts. The outstanding NFS/RDMA counts
>> could also prevent the server from shutting down. Further debugging shows
>> that the existing connections are not teared down and resource are not
>> released when receiving RDMA_CM_EVENT_DEVICE_REMOVAL event. It seems the
>> original code missing svc_xprt_put() in RDMA_CM_EVENT_REMOVAL event handler
>> thus svc_xprt_free is never invoked to release the existing connection resources.
>>
>> The patch has been passed removing, adding device back and forth without
>> stopping NFS/RDMA service. This will also allow a device to be unplugged
>> and swapped out without shutting down NFS service.

And maybe also add:

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=252

here.

>> Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
>
> Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
>
>> ---
>>  net/sunrpc/xprtrdma/svc_rdma_transport.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
>> index f609c1c..2b82569 100644
>> --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
>> +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
>> @@ -673,6 +673,7 @@ static int rdma_cma_handler(struct rdma_cm_id *cma_id,
>>  		if (xprt) {
>>  			set_bit(XPT_CLOSE, &xprt->xpt_flags);
>>  			svc_xprt_enqueue(xprt);
>> +			svc_xprt_put(xprt);
>>  		}
>>  		break;
>>  	default:
>>
>> Shirley

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
On 06/03/2015 11:49 AM, Chuck Lever wrote:

> On Jun 3, 2015, at 2:44 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>
>> On Jun 3, 2015, at 2:40 PM, Shirley Ma <shirley.ma@oracle.com> wrote:
>>
>>> When removing underlying RDMA device, the rmmod will hang forever if there
>>> are any outstanding NFS/RDMA client mounts. The outstanding NFS/RDMA counts
>>> could also prevent the server from shutting down. Further debugging shows
>>> that the existing connections are not teared down and resource are not
>>> released when receiving RDMA_CM_EVENT_DEVICE_REMOVAL event. It seems the
>>> original code missing svc_xprt_put() in RDMA_CM_EVENT_REMOVAL event handler
>>> thus svc_xprt_free is never invoked to release the existing connection resources.
>>>
>>> The patch has been passed removing, adding device back and forth without
>>> stopping NFS/RDMA service. This will also allow a device to be unplugged
>>> and swapped out without shutting down NFS service.
>
> And maybe also add:
>
> BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=252

Yes, this patch addresses the above problem too. I forgot about this bug.

> here.
>
>>> Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
>>
>> Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
>>
>>> ---
>>>  net/sunrpc/xprtrdma/svc_rdma_transport.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
>>> index f609c1c..2b82569 100644
>>> --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
>>> +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
>>> @@ -673,6 +673,7 @@ static int rdma_cma_handler(struct rdma_cm_id *cma_id,
>>>  		if (xprt) {
>>>  			set_bit(XPT_CLOSE, &xprt->xpt_flags);
>>>  			svc_xprt_enqueue(xprt);
>>> +			svc_xprt_put(xprt);
>>>  		}
>>>  		break;
>>>  	default:
>>>
>>> Shirley
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index f609c1c..2b82569 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -673,6 +673,7 @@ static int rdma_cma_handler(struct rdma_cm_id *cma_id,
 		if (xprt) {
 			set_bit(XPT_CLOSE, &xprt->xpt_flags);
 			svc_xprt_enqueue(xprt);
+			svc_xprt_put(xprt);
 		}
 		break;
 	default:
When the underlying RDMA device is removed, rmmod hangs forever if there
are any outstanding NFS/RDMA client mounts. The outstanding NFS/RDMA counts
could also prevent the server from shutting down. Further debugging shows
that the existing connections are not torn down and their resources are not
released when the RDMA_CM_EVENT_DEVICE_REMOVAL event is received. It seems the
original code is missing a svc_xprt_put() call in the RDMA_CM_EVENT_DEVICE_REMOVAL
event handler, so svc_xprt_free() is never invoked to release the existing
connection resources.

The patch has been tested by repeatedly removing and re-adding the device
without stopping the NFS/RDMA service. This will also allow a device to be
unplugged and swapped out without shutting down the NFS service.

Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
---
 net/sunrpc/xprtrdma/svc_rdma_transport.c | 1 +
 1 file changed, 1 insertion(+)

Shirley
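
The reference-counting argument in the commit message can be illustrated with a small userspace sketch. This is not kernel code: `toy_xprt`, `toy_xprt_get`, `toy_xprt_put`, `toy_device_removal`, `toy_close` and `toy_run_scenario` are all hypothetical names made up for illustration, and the model assumes the transport holds exactly two references (one for the service layer and one taken when the connection was established), which is a simplification of whatever the real svcrdma code does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a transport; NOT the kernel's struct svc_xprt. */
struct toy_xprt {
	int refcount;        /* number of outstanding references */
	bool close_pending;  /* models the XPT_CLOSE flag */
	bool freed;          /* set once the release path has run */
};

/* Take an additional reference. */
static void toy_xprt_get(struct toy_xprt *x)
{
	x->refcount++;
}

/* Drop one reference; the release path runs only when the last one goes. */
static void toy_xprt_put(struct toy_xprt *x)
{
	assert(x->refcount > 0);
	if (--x->refcount == 0)
		x->freed = true; /* stands in for the free callback */
}

/*
 * Models the device-removal event handler. With drop_ref == false it
 * behaves like the unpatched code (mark for close only); with
 * drop_ref == true it also drops the connection's reference.
 */
static void toy_device_removal(struct toy_xprt *x, bool drop_ref)
{
	x->close_pending = true;
	if (drop_ref)
		toy_xprt_put(x);
}

/* Models the later close path, which drops the service layer's reference. */
static void toy_close(struct toy_xprt *x)
{
	if (x->close_pending)
		toy_xprt_put(x);
}

/* Runs one removal scenario and reports whether the transport was freed. */
static bool toy_run_scenario(bool drop_ref)
{
	struct toy_xprt x = { .refcount = 1, .close_pending = false, .freed = false };

	toy_xprt_get(&x);                 /* assumed: reference taken at connect time */
	toy_device_removal(&x, drop_ref); /* device goes away */
	toy_close(&x);                    /* service layer closes the transport */
	return x.freed;
}
```

In this model, `toy_run_scenario(false)` leaves one reference outstanding, so the release path never runs, while `toy_run_scenario(true)` lets the count reach zero. That mirrors the behaviour the commit message describes, under the two-reference assumption stated above.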