
xprtrdma: make sure MRs are unmapped before freeing them

Message ID 20200814173734.3271600-1-dan@kernelim.com
State New
Series xprtrdma: make sure MRs are unmapped before freeing them

Commit Message

Dan Aloni Aug. 14, 2020, 5:37 p.m. UTC
It was observed that on disconnection, DMA-mapped MRs are not unmapped
before they are freed. The relevant path is rpcrdma_mrs_destroy(),
called from rpcrdma_xprt_disconnect().

Signed-off-by: Dan Aloni <dan@kernelim.com>
---
 net/sunrpc/xprtrdma/frwr_ops.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

Comments

Chuck Lever Aug. 14, 2020, 6:12 p.m. UTC | #1
Hi Dan-

> On Aug 14, 2020, at 1:37 PM, Dan Aloni <dan@kernelim.com> wrote:
> 
> It was observed that on disconnection, DMA-mapped MRs are not unmapped
> before they are freed. The relevant path is rpcrdma_mrs_destroy(),
> called from rpcrdma_xprt_disconnect().

MRs are supposed to be unmapped right after they are used, so
during disconnect they should all be unmapped already. How often
do you see a DMA mapped MR in this code path? Do you have a
reproducer I can try?


> Signed-off-by: Dan Aloni <dan@kernelim.com>
> ---
> net/sunrpc/xprtrdma/frwr_ops.c | 22 ++++++++++++++++------
> 1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
> index 7f94c9a19fd3..3899a5310214 100644
> --- a/net/sunrpc/xprtrdma/frwr_ops.c
> +++ b/net/sunrpc/xprtrdma/frwr_ops.c
> @@ -49,6 +49,19 @@
> # define RPCDBG_FACILITY	RPCDBG_TRANS
> #endif
> 
> +static void frwr_mr_unmap(struct rpcrdma_mr *mr)
> +{
> +	struct rpcrdma_xprt *r_xprt = mr->mr_xprt;
> +
> +	if (mr->mr_dir == DMA_NONE)
> +		return;
> +
> +	trace_xprtrdma_mr_unmap(mr);
> +	ib_dma_unmap_sg(r_xprt->rx_ep->re_id->device,
> +			mr->mr_sg, mr->mr_nents, mr->mr_dir);
> +	mr->mr_dir = DMA_NONE;
> +}
> +
> /**
>  * frwr_release_mr - Destroy one MR
>  * @mr: MR allocated by frwr_mr_init
> @@ -58,6 +71,8 @@ void frwr_release_mr(struct rpcrdma_mr *mr)
> {
> 	int rc;
> 
> +	frwr_mr_unmap(mr);
> +
> 	rc = ib_dereg_mr(mr->frwr.fr_mr);
> 	if (rc)
> 		trace_xprtrdma_frwr_dereg(mr, rc);
> @@ -71,12 +86,7 @@ static void frwr_mr_recycle(struct rpcrdma_mr *mr)
> 
> 	trace_xprtrdma_mr_recycle(mr);
> 
> -	if (mr->mr_dir != DMA_NONE) {
> -		trace_xprtrdma_mr_unmap(mr);
> -		ib_dma_unmap_sg(r_xprt->rx_ep->re_id->device,
> -				mr->mr_sg, mr->mr_nents, mr->mr_dir);
> -		mr->mr_dir = DMA_NONE;
> -	}
> +	frwr_mr_unmap(mr);
> 
> 	spin_lock(&r_xprt->rx_buf.rb_lock);
> 	list_del(&mr->mr_all);
> -- 
> 2.25.4
> 

--
Chuck Lever
Dan Aloni Aug. 14, 2020, 7:10 p.m. UTC | #2
On Fri, Aug 14, 2020 at 02:12:48PM -0400, Chuck Lever wrote:
> Hi Dan-
> 
> > On Aug 14, 2020, at 1:37 PM, Dan Aloni <dan@kernelim.com> wrote:
> > 
> > It was observed that on disconnection, DMA-mapped MRs are not unmapped
> > before they are freed. The relevant path is rpcrdma_mrs_destroy(),
> > called from rpcrdma_xprt_disconnect().
> 
> MRs are supposed to be unmapped right after they are used, so
> during disconnect they should all be unmapped already. How often
> do you see a DMA mapped MR in this code path? Do you have a
> reproducer I can try?

These are not graceful disconnections but abnormal ones: many large
IOs are still in flight when the remote server suddenly breaks the
connection, and although the remote IP remains reachable, it refuses
to accept new connections for a few seconds.

Reconnection attempts may also be running in the background to recover
the xprt, so with short reconnect timeouts it can be enough for
xprt_rdma_close() to be triggered from xprt_rdma_connect_worker(),
leading up to rpcrdma_xprt_disconnect().
Chuck Lever Aug. 14, 2020, 8:21 p.m. UTC | #3
> On Aug 14, 2020, at 3:10 PM, Dan Aloni <dan@kernelim.com> wrote:
> 
> On Fri, Aug 14, 2020 at 02:12:48PM -0400, Chuck Lever wrote:
>> Hi Dan-
>> 
>>> On Aug 14, 2020, at 1:37 PM, Dan Aloni <dan@kernelim.com> wrote:
>>> 
>>> It was observed that on disconnection, DMA-mapped MRs are not unmapped
>>> before they are freed. The relevant path is rpcrdma_mrs_destroy(),
>>> called from rpcrdma_xprt_disconnect().
>> 
>> MRs are supposed to be unmapped right after they are used, so
>> during disconnect they should all be unmapped already. How often
>> do you see a DMA mapped MR in this code path? Do you have a
>> reproducer I can try?
> 
> These are not graceful disconnections but abnormal ones: many large
> IOs are still in flight when the remote server suddenly breaks the
> connection, and although the remote IP remains reachable, it refuses
> to accept new connections for a few seconds.

Ideally that's not supposed to matter. I'll see if I can reproduce
with my usual tricks.

Why is your server behaving this way?


> Reconnection attempts may also be running in the background to recover
> the xprt, so with short reconnect timeouts it can be enough for
> xprt_rdma_close() to be triggered from xprt_rdma_connect_worker(),
> leading up to rpcrdma_xprt_disconnect().

--
Chuck Lever
Dan Aloni Aug. 15, 2020, 5:45 a.m. UTC | #4
On Fri, Aug 14, 2020 at 04:21:54PM -0400, Chuck Lever wrote:
> 
> 
> > On Aug 14, 2020, at 3:10 PM, Dan Aloni <dan@kernelim.com> wrote:
> > 
> > On Fri, Aug 14, 2020 at 02:12:48PM -0400, Chuck Lever wrote:
> >> Hi Dan-
> >> 
> >>> On Aug 14, 2020, at 1:37 PM, Dan Aloni <dan@kernelim.com> wrote:
> >>> 
> >>> It was observed that on disconnection, DMA-mapped MRs are not unmapped
> >>> before they are freed. The relevant path is rpcrdma_mrs_destroy(),
> >>> called from rpcrdma_xprt_disconnect().
> >> 
> >> MRs are supposed to be unmapped right after they are used, so
> >> during disconnect they should all be unmapped already. How often
> >> do you see a DMA mapped MR in this code path? Do you have a
> >> reproducer I can try?
> > 
> > These are not graceful disconnections but abnormal ones: many large
> > IOs are still in flight when the remote server suddenly breaks the
> > connection, and although the remote IP remains reachable, it refuses
> > to accept new connections for a few seconds.
> 
> Ideally that's not supposed to matter. I'll see if I can reproduce
> with my usual tricks.
> 
> Why is your server behaving this way?

It's a dedicated storage cluster that implements floating IPs, under a
specific testing scenario. I haven't tried it, but the same scenario
can perhaps be reproduced with a standard single Linux NFSv3 server by
fiddling with nfsd's open ports.
Chuck Lever Aug. 16, 2020, 10:28 p.m. UTC | #5
> On Aug 15, 2020, at 1:45 AM, Dan Aloni <dan@kernelim.com> wrote:
> 
> On Fri, Aug 14, 2020 at 04:21:54PM -0400, Chuck Lever wrote:
>> 
>> 
>>> On Aug 14, 2020, at 3:10 PM, Dan Aloni <dan@kernelim.com> wrote:
>>> 
>>> On Fri, Aug 14, 2020 at 02:12:48PM -0400, Chuck Lever wrote:
>>>> Hi Dan-
>>>> 
>>>>> On Aug 14, 2020, at 1:37 PM, Dan Aloni <dan@kernelim.com> wrote:
>>>>> 
>>>>> It was observed that on disconnection, DMA-mapped MRs are not unmapped
>>>>> before they are freed. The relevant path is rpcrdma_mrs_destroy(),
>>>>> called from rpcrdma_xprt_disconnect().
>>>> 
>>>> MRs are supposed to be unmapped right after they are used, so
>>>> during disconnect they should all be unmapped already. How often
>>>> do you see a DMA mapped MR in this code path? Do you have a
>>>> reproducer I can try?
>>> 
>>> These are not graceful disconnections but abnormal ones: many large
>>> IOs are still in flight when the remote server suddenly breaks the
>>> connection, and although the remote IP remains reachable, it refuses
>>> to accept new connections for a few seconds.
>> 
>> Ideally that's not supposed to matter. I'll see if I can reproduce
>> with my usual tricks.
>> 
>> Why is your server behaving this way?
> 
> It's a dedicated storage cluster that implements floating IPs, under a
> specific testing scenario. I haven't tried it, but the same scenario
> can perhaps be reproduced with a standard single Linux NFSv3 server by
> fiddling with nfsd's open ports.

Hi Dan, I was able to reproduce the DMA-map leak with a simple server-side
disconnect injection test. I'll try some root cause analysis tomorrow.


--
Chuck Lever

Patch

diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index 7f94c9a19fd3..3899a5310214 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -49,6 +49,19 @@
 # define RPCDBG_FACILITY	RPCDBG_TRANS
 #endif
 
+static void frwr_mr_unmap(struct rpcrdma_mr *mr)
+{
+	struct rpcrdma_xprt *r_xprt = mr->mr_xprt;
+
+	if (mr->mr_dir == DMA_NONE)
+		return;
+
+	trace_xprtrdma_mr_unmap(mr);
+	ib_dma_unmap_sg(r_xprt->rx_ep->re_id->device,
+			mr->mr_sg, mr->mr_nents, mr->mr_dir);
+	mr->mr_dir = DMA_NONE;
+}
+
 /**
  * frwr_release_mr - Destroy one MR
  * @mr: MR allocated by frwr_mr_init
@@ -58,6 +71,8 @@ void frwr_release_mr(struct rpcrdma_mr *mr)
 {
 	int rc;
 
+	frwr_mr_unmap(mr);
+
 	rc = ib_dereg_mr(mr->frwr.fr_mr);
 	if (rc)
 		trace_xprtrdma_frwr_dereg(mr, rc);
@@ -71,12 +86,7 @@ static void frwr_mr_recycle(struct rpcrdma_mr *mr)
 
 	trace_xprtrdma_mr_recycle(mr);
 
-	if (mr->mr_dir != DMA_NONE) {
-		trace_xprtrdma_mr_unmap(mr);
-		ib_dma_unmap_sg(r_xprt->rx_ep->re_id->device,
-				mr->mr_sg, mr->mr_nents, mr->mr_dir);
-		mr->mr_dir = DMA_NONE;
-	}
+	frwr_mr_unmap(mr);
 
 	spin_lock(&r_xprt->rx_buf.rb_lock);
 	list_del(&mr->mr_all);