diff mbox

[v1,02/14] xprtrdma: Warn when there are orphaned IB objects

Message ID 20150504175700.3483.57728.stgit@manet.1015granger.net (mailing list archive)
State Rejected
Headers show

Commit Message

Chuck Lever May 4, 2015, 5:57 p.m. UTC
Print an error during transport destruction if ib_dealloc_pd()
fails. This is a sign that xprtrdma orphaned one or more RDMA API
objects at some point, which can pin lower layer kernel modules
and cause shutdown to hang.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/verbs.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Devesh Sharma May 6, 2015, 11:37 a.m. UTC | #1
On Mon, May 4, 2015 at 11:27 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>
> Print an error during transport destruction if ib_dealloc_pd()
> fails. This is a sign that xprtrdma orphaned one or more RDMA API
> objects at some point, which can pin lower layer kernel modules
> and cause shutdown to hang.
>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  net/sunrpc/xprtrdma/verbs.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
> index 4870d27..0cc4617 100644
> --- a/net/sunrpc/xprtrdma/verbs.c
> +++ b/net/sunrpc/xprtrdma/verbs.c
> @@ -710,8 +710,8 @@ rpcrdma_ia_close(struct rpcrdma_ia *ia)
>         }
>         if (ia->ri_pd != NULL && !IS_ERR(ia->ri_pd)) {
>                 rc = ib_dealloc_pd(ia->ri_pd);
> -               dprintk("RPC:       %s: ib_dealloc_pd returned %i\n",
> -                       __func__, rc);

Should we check for EBUSY explicitly? other then this is an error in
vendor specific ib_dealloc_pd()

> +               if (rc)
> +                       pr_warn("rpcrdma: ib_dealloc_pd status %i\n", rc);
>         }
>  }
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chuck Lever May 6, 2015, 1:24 p.m. UTC | #2
Hi Devesh-

On May 6, 2015, at 7:37 AM, Devesh Sharma <devesh.sharma@avagotech.com> wrote:

> On Mon, May 4, 2015 at 11:27 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>> 
>> Print an error during transport destruction if ib_dealloc_pd()
>> fails. This is a sign that xprtrdma orphaned one or more RDMA API
>> objects at some point, which can pin lower layer kernel modules
>> and cause shutdown to hang.
>> 
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> net/sunrpc/xprtrdma/verbs.c |    4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
>> index 4870d27..0cc4617 100644
>> --- a/net/sunrpc/xprtrdma/verbs.c
>> +++ b/net/sunrpc/xprtrdma/verbs.c
>> @@ -710,8 +710,8 @@ rpcrdma_ia_close(struct rpcrdma_ia *ia)
>>        }
>>        if (ia->ri_pd != NULL && !IS_ERR(ia->ri_pd)) {
>>                rc = ib_dealloc_pd(ia->ri_pd);
>> -               dprintk("RPC:       %s: ib_dealloc_pd returned %i\n",
>> -                       __func__, rc);
> 
> Should we check for EBUSY explicitly? other then this is an error in
> vendor specific ib_dealloc_pd()

Any error return means ib_dealloc_pd() has failed, right? Doesn’t that
mean the PD is still allocated, and could cause problems later?


>> +               if (rc)
>> +                       pr_warn("rpcrdma: ib_dealloc_pd status %i\n", rc);
>>        }
>> }

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg May 6, 2015, 2:05 p.m. UTC | #3
On 5/6/2015 4:24 PM, Chuck Lever wrote:
> Hi Devesh-
>
> On May 6, 2015, at 7:37 AM, Devesh Sharma <devesh.sharma@avagotech.com> wrote:
>
>> On Mon, May 4, 2015 at 11:27 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>>>
>>> Print an error during transport destruction if ib_dealloc_pd()
>>> fails. This is a sign that xprtrdma orphaned one or more RDMA API
>>> objects at some point, which can pin lower layer kernel modules
>>> and cause shutdown to hang.
>>>
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>> net/sunrpc/xprtrdma/verbs.c |    4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
>>> index 4870d27..0cc4617 100644
>>> --- a/net/sunrpc/xprtrdma/verbs.c
>>> +++ b/net/sunrpc/xprtrdma/verbs.c
>>> @@ -710,8 +710,8 @@ rpcrdma_ia_close(struct rpcrdma_ia *ia)
>>>         }
>>>         if (ia->ri_pd != NULL && !IS_ERR(ia->ri_pd)) {
>>>                 rc = ib_dealloc_pd(ia->ri_pd);
>>> -               dprintk("RPC:       %s: ib_dealloc_pd returned %i\n",
>>> -                       __func__, rc);
>>
>> Should we check for EBUSY explicitly? other then this is an error in
>> vendor specific ib_dealloc_pd()
>
> Any error return means ib_dealloc_pd() has failed, right? Doesn’t that
> mean the PD is still allocated, and could cause problems later?

AFAICT, the only non-zero rc that ib_dealloc_pd should return is EBUSY.
So I don't see value in verifying it at all.

So, Looks Good

Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Devesh Sharma May 6, 2015, 2:22 p.m. UTC | #4
On Wed, May 6, 2015 at 6:54 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> Hi Devesh-
>
> On May 6, 2015, at 7:37 AM, Devesh Sharma <devesh.sharma@avagotech.com> wrote:
>
>> On Mon, May 4, 2015 at 11:27 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>>>
>>> Print an error during transport destruction if ib_dealloc_pd()
>>> fails. This is a sign that xprtrdma orphaned one or more RDMA API
>>> objects at some point, which can pin lower layer kernel modules
>>> and cause shutdown to hang.
>>>
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>> net/sunrpc/xprtrdma/verbs.c |    4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
>>> index 4870d27..0cc4617 100644
>>> --- a/net/sunrpc/xprtrdma/verbs.c
>>> +++ b/net/sunrpc/xprtrdma/verbs.c
>>> @@ -710,8 +710,8 @@ rpcrdma_ia_close(struct rpcrdma_ia *ia)
>>>        }
>>>        if (ia->ri_pd != NULL && !IS_ERR(ia->ri_pd)) {
>>>                rc = ib_dealloc_pd(ia->ri_pd);
>>> -               dprintk("RPC:       %s: ib_dealloc_pd returned %i\n",
>>> -                       __func__, rc);
>>
>> Should we check for EBUSY explicitly? other then this is an error in
>> vendor specific ib_dealloc_pd()
>
> Any error return means ib_dealloc_pd() has failed, right? Doesn’t that
> mean the PD is still allocated, and could cause problems later?

Yes, you are correct, I was thinking ib_dealloc_pd() has a refcount
implemented in the core layer, thus if the PD is used by any resource,
it will always fail with -EBUSY.
.With emulex adapter it is possible to fail dealloc_pd with ENOMEM or
EIO in cases where device f/w is not responding etc. this situation do
not represent PD is actually in use.

>
>
>>> +               if (rc)
>>> +                       pr_warn("rpcrdma: ib_dealloc_pd status %i\n", rc);
>>>        }
>>> }
>
> --
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
>
>
>
Jason Gunthorpe May 6, 2015, 4:48 p.m. UTC | #5
On Wed, May 06, 2015 at 07:52:03PM +0530, Devesh Sharma wrote:
> >> Should we check for EBUSY explicitly? other then this is an error in
> >> vendor specific ib_dealloc_pd()
> >
> > Any error return means ib_dealloc_pd() has failed, right? Doesn’t that
> > mean the PD is still allocated, and could cause problems later?
> 
> Yes, you are correct, I was thinking ib_dealloc_pd() has a refcount
> implemented in the core layer, thus if the PD is used by any resource,
> it will always fail with -EBUSY.

.. and it will not be freed, which indicates a serious bug in the
caller, so the caller should respond to the failure with a BUG_ON or
WARN_ON.

> .With emulex adapter it is possible to fail dealloc_pd with ENOMEM or
> EIO in cases where device f/w is not responding etc. this situation do
> not represent PD is actually in use.

This is a really bad idea. If the pd was freed and from the consumer's
perspective everything is sane then it should return success.

If the driver detects an internal failure, then it should move the
driver to a failed state (whatever that means, but at a minimum it
means the firmware state and driver state must be resync'd), and still
succeed the dealloc.

There is absolutely nothing the caller can do about a driver level
failure here, and it doesn't indicate a caller bug.

Returning ENOMEM for dealloc is what we'd call an insane API. You
can't have failable memory allocations in a dealloc path.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Devesh Sharma May 7, 2015, 7:53 a.m. UTC | #6
> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com]
> Sent: Wednesday, May 06, 2015 10:18 PM
> To: Devesh Sharma
> Cc: Chuck Lever; linux-rdma@vger.kernel.org; Linux NFS Mailing List
> Subject: Re: [PATCH v1 02/14] xprtrdma: Warn when there are orphaned IB
> objects
>
> On Wed, May 06, 2015 at 07:52:03PM +0530, Devesh Sharma wrote:
> > >> Should we check for EBUSY explicitly? other then this is an error
> > >> in vendor specific ib_dealloc_pd()
> > >
> > > Any error return means ib_dealloc_pd() has failed, right? Doesn’t
> > > that mean the PD is still allocated, and could cause problems later?
> >
> > Yes, you are correct, I was thinking ib_dealloc_pd() has a refcount
> > implemented in the core layer, thus if the PD is used by any resource,
> > it will always fail with -EBUSY.
>
> .. and it will not be freed, which indicates a serious bug in the caller,
> so the
> caller should respond to the failure with a BUG_ON or WARN_ON.

Yes, that’s what this patch is doing.

>
> > .With emulex adapter it is possible to fail dealloc_pd with ENOMEM or
> > EIO in cases where device f/w is not responding etc. this situation do
> > not represent PD is actually in use.
>
> This is a really bad idea. If the pd was freed and from the consumer's
> perspective everything is sane then it should return success.
>
> If the driver detects an internal failure, then it should move the driver
> to a
> failed state (whatever that means, but at a minimum it means the firmware
> state and driver state must be resync'd), and still succeed the dealloc.

Makes sense.

>
> There is absolutely nothing the caller can do about a driver level failure
> here,
> and it doesn't indicate a caller bug.
>
> Returning ENOMEM for dealloc is what we'd call an insane API. You can't
> have
> failable memory allocations in a dealloc path.

I will supply a fix in ocrdma.

Reviewed-by: Devesh Sharma <devesh.sharma@avagotech.com>
>
> Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 4870d27..0cc4617 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -710,8 +710,8 @@  rpcrdma_ia_close(struct rpcrdma_ia *ia)
 	}
 	if (ia->ri_pd != NULL && !IS_ERR(ia->ri_pd)) {
 		rc = ib_dealloc_pd(ia->ri_pd);
-		dprintk("RPC:       %s: ib_dealloc_pd returned %i\n",
-			__func__, rc);
+		if (rc)
+			pr_warn("rpcrdma: ib_dealloc_pd status %i\n", rc);
 	}
 }