
[v1,05/13] xprtrdma: Don't drain CQs on transport disconnect

Message ID 20140623223942.1634.89063.stgit@manet.1015granger.net (mailing list archive)
State New, archived

Commit Message

Chuck Lever June 23, 2014, 10:39 p.m. UTC
CQs are not destroyed until unmount. By draining CQs on transport
disconnect, successful completions that can change the r.frmr.state
field can be missed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/verbs.c |    5 -----
 1 file changed, 5 deletions(-)
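
For context: rpcrdma_clean_cq() is the helper whose call sites this patch removes. A minimal sketch of what it does at this point in the series (reconstructed here for illustration; the real function lives in net/sunrpc/xprtrdma/verbs.c): it polls the CQ and throws every completion away unexamined, which is how successful FAST_REG_MR and LOCAL_INV completions get lost.

	static void
	rpcrdma_clean_cq(struct ib_cq *cq)
	{
		struct ib_wc wc;
		int count = 0;

		/* Empty the CQ, discarding each completion unexamined */
		while (ib_poll_cq(cq, 1, &wc) == 1)
			++count;

		if (count)
			dprintk("RPC:       %s: flushed %d events (last 0x%x)\n",
				__func__, count, wc.opcode);
	}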



Comments

Devesh Sharma July 2, 2014, 7:06 p.m. UTC | #1
This change is prone to generating poll_cq errors because of unprocessed completions that still
point to QPs that no longer exist. When these completions are polled on the new connection,
poll_cq will fail because the old QP pointer is already NULL.
Did anyone hit this situation during their testing?

> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-owner@vger.kernel.org] On Behalf Of Chuck Lever
> Sent: Tuesday, June 24, 2014 4:10 AM
> To: linux-rdma@vger.kernel.org; linux-nfs@vger.kernel.org
> Subject: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
>
> CQs are not destroyed until unmount. By draining CQs on transport
> disconnect, successful completions that can change the r.frmr.state field can
> be missed.
>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  net/sunrpc/xprtrdma/verbs.c |    5 -----
>  1 file changed, 5 deletions(-)
>
> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
> index 3c7f904..451e100 100644
> --- a/net/sunrpc/xprtrdma/verbs.c
> +++ b/net/sunrpc/xprtrdma/verbs.c
> @@ -873,9 +873,6 @@ retry:
>  			dprintk("RPC:       %s: rpcrdma_ep_disconnect"
>  				" status %i\n", __func__, rc);
>
> -		rpcrdma_clean_cq(ep->rep_attr.recv_cq);
> -		rpcrdma_clean_cq(ep->rep_attr.send_cq);
> -
>  		xprt = container_of(ia, struct rpcrdma_xprt, rx_ia);
>  		id = rpcrdma_create_id(xprt, ia,
>  				(struct sockaddr *)&xprt->rx_data.addr);
> @@ -985,8 +982,6 @@ rpcrdma_ep_disconnect(struct rpcrdma_ep *ep,
> struct rpcrdma_ia *ia)  {
>  	int rc;
>
> -	rpcrdma_clean_cq(ep->rep_attr.recv_cq);
> -	rpcrdma_clean_cq(ep->rep_attr.send_cq);
>  	rc = rdma_disconnect(ia->ri_id);
>  	if (!rc) {
>  		/* returns without wait if not connected */

Steve Wise July 2, 2014, 7:28 p.m. UTC | #2
On 7/2/2014 2:06 PM, Devesh Sharma wrote:
> This change is very much prone to generate poll_cq errors because of un-cleaned completions which still
> point to the non-existent QPs. On the new connection when these completions are polled, the poll_cq will
> fail because old QP pointer is already NULL.
> Did anyone hit this situation during their testing?

Hey Devesh,

iw_cxgb4 will silently toss CQEs if the QP is not active.


Chuck Lever July 2, 2014, 7:40 p.m. UTC | #3
On Jul 2, 2014, at 3:28 PM, Steve Wise <swise@opengridcomputing.com> wrote:

> On 7/2/2014 2:06 PM, Devesh Sharma wrote:
>> This change is very much prone to generate poll_cq errors because of un-cleaned completions which still
>> point to the non-existent QPs. On the new connection when these completions are polled, the poll_cq will
>> fail because old QP pointer is already NULL.
>> Did anyone hit this situation during their testing?

I tested this aggressively with a fault injector that triggers regular connection
disruption.

> Hey Devesh,
> 
> iw_cxgb4 will silently toss CQEs if the QP is not active.

xprtrdma relies on getting a completion (either successful or in error) for every
WR it has posted. The goal of this patch is to avoid throwing away queued
completions after a transport disconnect so we don't lose track of FRMR rkey
updates (FAST_REG_MR and LOCAL_INV completions) and we can capture all RPC
replies posted before the connection was lost.

Sounds like we also need to keep the QP around, even in error state, until all
known WRs on that QP have completed?
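
A minimal sketch of that idea, under assumed names (rpcrdma_qp_quiesce(), the rep_posted_wrs counter, and process_wc() are illustrations, not existing xprtrdma code): move the QP to the error state so the HCA flushes every outstanding WR, then keep the QP alive and poll both CQs until each posted WR has produced a completion.

	static void
	rpcrdma_qp_quiesce(struct rpcrdma_ep *ep, struct ib_qp *qp)
	{
		struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
		struct ib_wc wc;

		/* Force the HCA to complete every outstanding WR, either
		 * successfully or with IB_WC_WR_FLUSH_ERR status */
		ib_modify_qp(qp, &attr, IB_QP_STATE);

		/* rep_posted_wrs: hypothetical count of WRs posted but not
		 * yet completed. The QP must stay alive until it drains. */
		while (atomic_read(&ep->rep_posted_wrs) > 0) {
			if (ib_poll_cq(ep->rep_attr.send_cq, 1, &wc) == 1 ||
			    ib_poll_cq(ep->rep_attr.recv_cq, 1, &wc) == 1) {
				process_wc(&wc);	/* hand to the normal handler */
				atomic_dec(&ep->rep_posted_wrs);
			} else {
				cpu_relax();
			}
		}
	}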


Devesh Sharma July 2, 2014, 7:42 p.m. UTC | #4
> -----Original Message-----
> From: Steve Wise [mailto:swise@opengridcomputing.com]
> Sent: Thursday, July 03, 2014 12:59 AM
> To: Devesh Sharma; Chuck Lever; linux-rdma@vger.kernel.org; linux-nfs@vger.kernel.org
> Subject: Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
>
> On 7/2/2014 2:06 PM, Devesh Sharma wrote:
> > This change is very much prone to generate poll_cq errors because of
> > un-cleaned completions which still point to the non-existent QPs. On
> > the new connection when these completions are polled, the poll_cq will fail
> > because old QP pointer is already NULL.
> > Did anyone hit this situation during their testing?
>
> Hey Devesh,
>
> iw_cxgb4 will silently toss CQEs if the QP is not active.


Ya, I just now checked that in the mlx and cxgb4 driver code. On the other hand, ocrdma asserts a BUG_ON for such CQEs, causing a system panic.
Out of curiosity: how is this change useful here? Is it reducing the re-connection time? Anyhow, rpcrdma_clean_cq was discarding the completions (flush and successful both).

>
> >> -----Original Message-----
> >> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-owner@vger.kernel.org] On Behalf Of Chuck Lever
> >> Sent: Tuesday, June 24, 2014 4:10 AM
> >> To: linux-rdma@vger.kernel.org; linux-nfs@vger.kernel.org
> >> Subject: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
> >>
> >> CQs are not destroyed until unmount. By draining CQs on transport
> >> disconnect, successful completions that can change the r.frmr.state
> >> field can be missed.
Still, those are missed, aren't they? Those successful completions will still be dropped after
re-connection. Am I missing something in understanding the motivation?

Steve Wise July 2, 2014, 7:46 p.m. UTC | #5
> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever@oracle.com]
> Sent: Wednesday, July 02, 2014 2:40 PM
> To: Steve Wise; Devesh Sharma
> Cc: linux-rdma@vger.kernel.org; Linux NFS Mailing List
> Subject: Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
> 
> 
> On Jul 2, 2014, at 3:28 PM, Steve Wise <swise@opengridcomputing.com> wrote:
> 
> > On 7/2/2014 2:06 PM, Devesh Sharma wrote:
> >> This change is very much prone to generate poll_cq errors because of un-cleaned
> completions which still
> >> point to the non-existent QPs. On the new connection when these completions are polled,
> the poll_cq will
> >> fail because old QP pointer is already NULL.
> >> Did anyone hit this situation during their testing?
> 
> I tested this aggressively with a fault injector that triggers regular connection
> disruption.
> 
> > Hey Devesh,
> >
> > iw_cxgb4 will silently toss CQEs if the QP is not active.
> 
> xprtrdma relies on getting a completion (either successful or in error) for every
> WR it has posted. The goal of this patch is to avoid throwing away queued
> completions after a transport disconnect so we don't lose track of FRMR rkey
> updates (FAST_REG_MR and LOCAL_INV completions) and we can capture all RPC
> replies posted before the connection was lost.
> 
> Sounds like we also need to keep the QP around, even in error state, until all
> known WRs on that QP have completed?
>

Perhaps.
 
Devesh Sharma July 2, 2014, 7:48 p.m. UTC | #6
> -----Original Message-----
> From: Steve Wise [mailto:swise@opengridcomputing.com]
> Sent: Thursday, July 03, 2014 1:16 AM
> To: 'Chuck Lever'; Devesh Sharma
> Cc: linux-rdma@vger.kernel.org; 'Linux NFS Mailing List'
> Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
>
> > -----Original Message-----
> > From: Chuck Lever [mailto:chuck.lever@oracle.com]
> > Sent: Wednesday, July 02, 2014 2:40 PM
> > To: Steve Wise; Devesh Sharma
> > Cc: linux-rdma@vger.kernel.org; Linux NFS Mailing List
> > Subject: Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
> >
> > On Jul 2, 2014, at 3:28 PM, Steve Wise <swise@opengridcomputing.com> wrote:
> >
> > > On 7/2/2014 2:06 PM, Devesh Sharma wrote:
> > > > This change is very much prone to generate poll_cq errors because
> > > > of un-cleaned completions which still point to the non-existent
> > > > QPs. On the new connection when these completions are polled, the
> > > > poll_cq will fail because old QP pointer is already NULL.
> > > > Did anyone hit this situation during their testing?
> >
> > I tested this aggressively with a fault injector that triggers regular
> > connection disruption.
> >
> > > Hey Devesh,
> > >
> > > iw_cxgb4 will silently toss CQEs if the QP is not active.
> >
> > xprtrdma relies on getting a completion (either successful or in
> > error) for every WR it has posted. The goal of this patch is to avoid
> > throwing away queued completions after a transport disconnect so we
> > don't lose track of FRMR rkey updates (FAST_REG_MR and LOCAL_INV
> > completions) and we can capture all RPC replies posted before the
> > connection was lost.
> >
> > Sounds like we also need to keep the QP around, even in error state,
> > until all known WRs on that QP have completed?
> >

Why not poll and process every completion during rpcrdma_cq_cleanup()....

Steve Wise July 2, 2014, 7:50 p.m. UTC | #7
> -----Original Message-----
> From: Devesh Sharma [mailto:Devesh.Sharma@Emulex.Com]
> Sent: Wednesday, July 02, 2014 2:43 PM
> To: Steve Wise; Chuck Lever; linux-rdma@vger.kernel.org; linux-nfs@vger.kernel.org
> Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
> 
> > -----Original Message-----
> > From: Steve Wise [mailto:swise@opengridcomputing.com]
> > Sent: Thursday, July 03, 2014 12:59 AM
> > To: Devesh Sharma; Chuck Lever; linux-rdma@vger.kernel.org; linux-
> > nfs@vger.kernel.org
> > Subject: Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> > disconnect
> >
> > On 7/2/2014 2:06 PM, Devesh Sharma wrote:
> > > This change is very much prone to generate poll_cq errors because of
> > > un-cleaned completions which still point to the non-existent QPs. On
> > > the new connection when these completions are polled, the poll_cq will fail
> > because old QP pointer is already NULL.
> > > Did anyone hit this situation during their testing?
> >
> > Hey Devesh,
> >
> > iw_cxgb4 will silently toss CQEs if the QP is not active.
> 
> Ya, just now checked that in mlx and cxgb4 driver code. On the other hand ocrdma is asserting
> a BUG-ON for such CQEs causing system panic.
> Out of curiosity I am asking, how this change is useful here, is it reducing the re-connection
> time...Anyhow rpcrdma_clean_cq was discarding the completions (flush/successful both)
>

Well, I don't think there is anything restricting an application from destroying the QP with pending CQEs on its CQs.   So it definitely shouldn't cause a BUG_ON() I think.   I'll have to read up in the Verbs specs if destroying a QP kills all the pending CQEs...

 
Devesh Sharma July 2, 2014, 7:53 p.m. UTC | #8
> -----Original Message-----
> From: Steve Wise [mailto:swise@opengridcomputing.com]
> Sent: Thursday, July 03, 2014 1:21 AM
> To: Devesh Sharma; 'Chuck Lever'; linux-rdma@vger.kernel.org; linux-nfs@vger.kernel.org
> Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
>
> > Ya, just now checked that in mlx and cxgb4 driver code. On the other
> > hand ocrdma is asserting a BUG-ON for such CQEs causing system panic.
> > Out of curiosity I am asking, how this change is useful here, is it
> > reducing the re-connection time...Anyhow rpcrdma_clean_cq was
> > discarding the completions (flush/successful both)
>
> Well, I don't think there is anything restricting an application from destroying
> the QP with pending CQEs on its CQs.   So it definitely shouldn't cause a
> BUG_ON() I think.   I'll have to read up in the Verbs specs if destroying a QP
> kills all the pending CQEs...


Oh, confusion... let me clarify: in ocrdma, the BUG_ON is hit in poll_cq() after the re-connection happens and the CQ is polled again.
The first completion in the CQ still points to the old QP ID, for which ocrdma no longer has a valid QP pointer.

Devesh Sharma July 2, 2014, 7:56 p.m. UTC | #9
> -----Original Message-----
> From: Steve Wise [mailto:swise@opengridcomputing.com]
> Sent: Thursday, July 03, 2014 1:21 AM
> To: Devesh Sharma; 'Chuck Lever'; linux-rdma@vger.kernel.org; linux-nfs@vger.kernel.org
> Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
>
> Well, I don't think there is anything restricting an application from destroying
> the QP with pending CQEs on its CQs.   So it definitely shouldn't cause a
> BUG_ON() I think.   I'll have to read up in the Verbs specs if destroying a QP
> kills all the pending CQEs...


I will double-check the ocrdma_destroy_qp code.

Steve Wise July 2, 2014, 7:56 p.m. UTC | #10
> -----Original Message-----
> From: Devesh Sharma [mailto:Devesh.Sharma@Emulex.Com]
> Sent: Wednesday, July 02, 2014 2:54 PM
> To: Steve Wise; 'Chuck Lever'; linux-rdma@vger.kernel.org; linux-nfs@vger.kernel.org
> Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
> 
> 
> 
> > -----Original Message-----
> > From: Steve Wise [mailto:swise@opengridcomputing.com]
> > Sent: Thursday, July 03, 2014 1:21 AM
> > To: Devesh Sharma; 'Chuck Lever'; linux-rdma@vger.kernel.org; linux-
> > nfs@vger.kernel.org
> > Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> > disconnect
> >
> >
> >
> > > -----Original Message-----
> > > From: Devesh Sharma [mailto:Devesh.Sharma@Emulex.Com]
> > > Sent: Wednesday, July 02, 2014 2:43 PM
> > > To: Steve Wise; Chuck Lever; linux-rdma@vger.kernel.org;
> > > linux-nfs@vger.kernel.org
> > > Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> > > disconnect
> > >
> > > > -----Original Message-----
> > > > From: Steve Wise [mailto:swise@opengridcomputing.com]
> > > > Sent: Thursday, July 03, 2014 12:59 AM
> > > > To: Devesh Sharma; Chuck Lever; linux-rdma@vger.kernel.org; linux-
> > > > nfs@vger.kernel.org
> > > > Subject: Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> > > > disconnect
> > > >
> > > > On 7/2/2014 2:06 PM, Devesh Sharma wrote:
> > > > > This change is very much prone to generate poll_cq errors because
> > > > > of un-cleaned completions which still point to the non-existent
> > > > > QPs. On the new connection when these completions are polled, the
> > > > > poll_cq will fail
> > > > because old QP pointer is already NULL.
> > > > > Did anyone hit this situation during their testing?
> > > >
> > > > Hey Devesh,
> > > >
> > > > iw_cxgb4 will silently toss CQEs if the QP is not active.
> > >
> > > Ya, just now checked that in mlx and cxgb4 driver code. On the other
> > > hand ocrdma is asserting a BUG-ON for such CQEs causing system panic.
> > > Out of curiosity I am asking, how this change is useful here, is it
> > > reducing the re-connection time...Anyhow rpcrdma_clean_cq was
> > > discarding the completions (flush/successful both)
> > >
> >
> > Well, I don't think there is anything restricting an application from destroying
> > the QP with pending CQEs on its CQs.   So it definitely shouldn't cause a
> > BUG_ON() I think.   I'll have to read up in the Verbs specs if destroying a QP
> > kills all the pending CQEs...
> 
> Oh confusion...let me clarify: in ocrdma BUG ON is hit in poll_cq() after re-connection happens
> and cq is polled again.
> Now the first completion in CQ still points to old QP-ID for which ocrdma does not have valid
> QP pointer.
>

Right.  Which means it’s a stale CQE.  I don't think that should cause a BUG_ON. 
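
That suggests the shape of the provider fix: a poll routine has to tolerate CQEs whose QP has already been destroyed. A hedged sketch of the pattern follows, with every structure and helper name hypothetical (this is not actual ocrdma or iw_cxgb4 code): when the QP lookup for a CQE fails, consume and skip the CQE instead of asserting.

	/* All names here (example_cq, next_valid_cqe, qp_lookup,
	 * consume_cqe, fill_wc) are hypothetical illustrations. */
	static int
	example_poll_cq(struct example_cq *cq, int num_entries, struct ib_wc *wc)
	{
		int polled = 0;
		struct example_cqe *cqe;
		struct example_qp *qp;

		while (polled < num_entries &&
		       (cqe = next_valid_cqe(cq)) != NULL) {
			qp = qp_lookup(cq->dev, cqe->qpn);
			if (!qp) {
				/* Stale CQE left over from a QP destroyed
				 * after a disconnect: consume and skip it
				 * rather than BUG_ON(). */
				consume_cqe(cq, cqe);
				continue;
			}
			fill_wc(&wc[polled++], qp, cqe);
			consume_cqe(cq, cqe);
		}
		return polled;
	}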

 
Devesh Sharma July 2, 2014, 7:57 p.m. UTC | #11
> -----Original Message-----
> From: Steve Wise [mailto:swise@opengridcomputing.com]
> Sent: Thursday, July 03, 2014 1:27 AM
> To: Devesh Sharma; 'Chuck Lever'; linux-rdma@vger.kernel.org; linux-nfs@vger.kernel.org
> Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
>
> > Oh confusion...let me clarify: in ocrdma BUG ON is hit in poll_cq()
> > after re-connection happens and cq is polled again.
> > Now the first completion in CQ still points to old QP-ID for which
> > ocrdma does not have valid QP pointer.
>
> Right.  Which means it’s a stale CQE.  I don't think that should cause a
> BUG_ON.


Yes, this surely needs a fix in ocrdma.

Chuck Lever July 2, 2014, 7:59 p.m. UTC | #12
On Jul 2, 2014, at 3:48 PM, Devesh Sharma <Devesh.Sharma@Emulex.Com> wrote:

> 
> 
>> -----Original Message-----
>> From: Steve Wise [mailto:swise@opengridcomputing.com]
>> Sent: Thursday, July 03, 2014 1:16 AM
>> To: 'Chuck Lever'; Devesh Sharma
>> Cc: linux-rdma@vger.kernel.org; 'Linux NFS Mailing List'
>> Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
>> disconnect
>> 
>> 
>> 
>>> -----Original Message-----
>>> From: Chuck Lever [mailto:chuck.lever@oracle.com]
>>> Sent: Wednesday, July 02, 2014 2:40 PM
>>> To: Steve Wise; Devesh Sharma
>>> Cc: linux-rdma@vger.kernel.org; Linux NFS Mailing List
>>> Subject: Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
>>> disconnect
>>> 
>>> 
>>> On Jul 2, 2014, at 3:28 PM, Steve Wise <swise@opengridcomputing.com>
>> wrote:
>>> 
>>>> On 7/2/2014 2:06 PM, Devesh Sharma wrote:
>>>>> This change is very much prone to generate poll_cq errors because
>>>>> of un-cleaned
>>> completions which still
>>>>> point to the non-existent QPs. On the new connection when these
>>>>> completions are polled,
>>> the poll_cq will
>>>>> fail because old QP pointer is already NULL.
>>>>> Did anyone hit this situation during their testing?
>>> 
>>> I tested this aggressively with a fault injector that triggers regular
>>> connection disruption.
>>> 
>>>> Hey Devesh,
>>>> 
>>>> iw_cxgb4 will silently toss CQEs if the QP is not active.
>>> 
>>> xprtrdma relies on getting a completion (either successful or in
>>> error) for every WR it has posted. The goal of this patch is to avoid
>>> throwing away queued completions after a transport disconnect so we
>>> don't lose track of FRMR rkey updates (FAST_REG_MR and LOCAL_INV
>>> completions) and we can capture all RPC replies posted before the
>> connection was lost.
>>> 
>>> Sounds like we also need to keep the QP around, even in error state,
>>> until all known WRs on that QP have completed?
>>> 
> 
> Why not poll and process every completion during rpcrdma_cq_cleanup()….

Yes, I have a patch in the next version of this series that does that.
It just calls rpcrdma_sendcq_upcall() from the connect worker. I will
squash that change into this patch.

Maybe it needs to invoke rpcrdma_recvcq_upcall() there as well.
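
A sketch of that connect-worker cleanup (the rpcrdma_flush_cqs() wrapper is an assumed name; rpcrdma_sendcq_upcall() and rpcrdma_recvcq_upcall() are the completion handlers named above): instead of discarding residual CQEs, invoke both handlers so every queued completion is processed.

	/* Called from the connect worker after a disconnect, while the
	 * CQs still hold completions from the previous connection. */
	static void
	rpcrdma_flush_cqs(struct rpcrdma_ep *ep)
	{
		/* Send completions: FAST_REG_MR and LOCAL_INV rkey updates */
		rpcrdma_sendcq_upcall(ep->rep_attr.send_cq, ep);

		/* Receive completions: RPC replies posted before the loss */
		rpcrdma_recvcq_upcall(ep->rep_attr.recv_cq, ep);
	}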


Devesh Sharma July 3, 2014, 5:33 a.m. UTC | #13
> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever@oracle.com]
> Sent: Thursday, July 03, 2014 1:30 AM
> To: Devesh Sharma
> Cc: Steve Wise; linux-rdma@vger.kernel.org; Linux NFS Mailing List
> Subject: Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
>
> > Why not poll and process every completion during rpcrdma_cq_cleanup()….
>
> Yes, I have a patch in the next version of this series that does that.
> It just calls rpcrdma_sendcq_upcall() from the connect worker. I will squash
> that change into this patch.

Cool.

> Maybe it needs to invoke rpcrdma_recvcq_upcall() there as well.

Yes, that would be needed.

Patch

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 3c7f904..451e100 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -873,9 +873,6 @@  retry:
 			dprintk("RPC:       %s: rpcrdma_ep_disconnect"
 				" status %i\n", __func__, rc);
 
-		rpcrdma_clean_cq(ep->rep_attr.recv_cq);
-		rpcrdma_clean_cq(ep->rep_attr.send_cq);
-
 		xprt = container_of(ia, struct rpcrdma_xprt, rx_ia);
 		id = rpcrdma_create_id(xprt, ia,
 				(struct sockaddr *)&xprt->rx_data.addr);
@@ -985,8 +982,6 @@  rpcrdma_ep_disconnect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
 {
 	int rc;
 
-	rpcrdma_clean_cq(ep->rep_attr.recv_cq);
-	rpcrdma_clean_cq(ep->rep_attr.send_cq);
 	rc = rdma_disconnect(ia->ri_id);
 	if (!rc) {
 		/* returns without wait if not connected */