diff mbox

librdmacm: Return errno on create_qp failure

Message ID 20180325200902.13017-1-yuval.shaia@oracle.com (mailing list archive)
State Changes Requested
Delegated to: Leon Romanovsky
Headers show

Commit Message

Yuval Shaia March 25, 2018, 8:09 p.m. UTC
Upon QP creation failure provider's create_qp function sets errno with
error code and return NULL to caller.

Let's return this errno to caller to reflect the exact reason for the
error.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
---
 librdmacm/cma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Leon Romanovsky March 26, 2018, 1:44 p.m. UTC | #1
On Sun, Mar 25, 2018 at 11:09:02PM +0300, Yuval Shaia wrote:
> Upon QP creation failure provider's create_qp function sets errno with
> error code and return NULL to caller.
>
> Let's return this errno to caller to reflect the exact reason for the
> error.
>
> Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> ---
>  librdmacm/cma.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/librdmacm/cma.c b/librdmacm/cma.c
> index fa370650..e92a022c 100644
> --- a/librdmacm/cma.c
> +++ b/librdmacm/cma.c
> @@ -1369,7 +1369,7 @@ int rdma_create_qp_ex(struct rdma_cm_id *id,
>  		attr->srq = id->srq;
>  	qp = ibv_create_qp_ex(id->verbs, attr);
>  	if (!qp) {
> -		ret = ERR(ENOMEM);
> +		ret = ERR(errno);

It is not always correct, for example
ibv_create_qp_ex -> ibv_create_qp -> pd->context->ops.create_qp ->
mlx5_create_qp -> create_qp -> can return NULL without updating errno.

Thanks

>  		goto err1;
>  	}
>
> --
> 2.14.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe March 26, 2018, 3:30 p.m. UTC | #2
On Mon, Mar 26, 2018 at 04:44:35PM +0300, Leon Romanovsky wrote:
> On Sun, Mar 25, 2018 at 11:09:02PM +0300, Yuval Shaia wrote:
> > Upon QP creation failure provider's create_qp function sets errno with
> > error code and return NULL to caller.
> >
> > Let's return this errno to caller to reflect the exact reason for the
> > error.
> >
> > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> >  librdmacm/cma.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/librdmacm/cma.c b/librdmacm/cma.c
> > index fa370650..e92a022c 100644
> > +++ b/librdmacm/cma.c
> > @@ -1369,7 +1369,7 @@ int rdma_create_qp_ex(struct rdma_cm_id *id,
> >  		attr->srq = id->srq;
> >  	qp = ibv_create_qp_ex(id->verbs, attr);
> >  	if (!qp) {
> > -		ret = ERR(ENOMEM);
> > +		ret = ERR(errno);
> 
> It is not always correct, for example
> ibv_create_qp_ex -> ibv_create_qp -> pd->context->ops.create_qp ->
> mlx5_create_qp -> create_qp -> can return NULL without updating errno.

That is a driver bug..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yuval Shaia March 26, 2018, 3:51 p.m. UTC | #3
On Mon, Mar 26, 2018 at 09:30:07AM -0600, Jason Gunthorpe wrote:
> On Mon, Mar 26, 2018 at 04:44:35PM +0300, Leon Romanovsky wrote:
> > On Sun, Mar 25, 2018 at 11:09:02PM +0300, Yuval Shaia wrote:
> > > Upon QP creation failure provider's create_qp function sets errno with
> > > error code and return NULL to caller.
> > >
> > > Let's return this errno to caller to reflect the exact reason for the
> > > error.
> > >
> > > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> > >  librdmacm/cma.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/librdmacm/cma.c b/librdmacm/cma.c
> > > index fa370650..e92a022c 100644
> > > +++ b/librdmacm/cma.c
> > > @@ -1369,7 +1369,7 @@ int rdma_create_qp_ex(struct rdma_cm_id *id,
> > >  		attr->srq = id->srq;
> > >  	qp = ibv_create_qp_ex(id->verbs, attr);
> > >  	if (!qp) {
> > > -		ret = ERR(ENOMEM);
> > > +		ret = ERR(errno);
> > 
> > It is not always correct, for example
> > ibv_create_qp_ex -> ibv_create_qp -> pd->context->ops.create_qp ->
> > mlx5_create_qp -> create_qp -> can return NULL without updating errno.
> 
> That is a driver bug..
> 
> Jason

Agree.
Unfortunately there are more than one driver behaving like this. It will
take some time to find them all and fix them. I can try to do it but wanted
first to know if i'm on the right track.

Yuval

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe March 26, 2018, 4:32 p.m. UTC | #4
On Mon, Mar 26, 2018 at 06:51:29PM +0300, Yuval Shaia wrote:
> On Mon, Mar 26, 2018 at 09:30:07AM -0600, Jason Gunthorpe wrote:
> > On Mon, Mar 26, 2018 at 04:44:35PM +0300, Leon Romanovsky wrote:
> > > On Sun, Mar 25, 2018 at 11:09:02PM +0300, Yuval Shaia wrote:
> > > > Upon QP creation failure provider's create_qp function sets errno with
> > > > error code and return NULL to caller.
> > > >
> > > > Let's return this errno to caller to reflect the exact reason for the
> > > > error.
> > > >
> > > > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> > > >  librdmacm/cma.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/librdmacm/cma.c b/librdmacm/cma.c
> > > > index fa370650..e92a022c 100644
> > > > +++ b/librdmacm/cma.c
> > > > @@ -1369,7 +1369,7 @@ int rdma_create_qp_ex(struct rdma_cm_id *id,
> > > >  		attr->srq = id->srq;
> > > >  	qp = ibv_create_qp_ex(id->verbs, attr);
> > > >  	if (!qp) {
> > > > -		ret = ERR(ENOMEM);
> > > > +		ret = ERR(errno);
> > > 
> > > It is not always correct, for example
> > > ibv_create_qp_ex -> ibv_create_qp -> pd->context->ops.create_qp ->
> > > mlx5_create_qp -> create_qp -> can return NULL without updating errno.
> > 
> > That is a driver bug..
> > 
> > Jason
> 
> Agree.
> Unfortunately there are more than one driver behaving like this. It will
> take some time to find them all and fix them. I can try to do it but wanted
> first to know if i'm on the right track.

The man page needs updating too, but yes, the expectation is that
these functions return errno.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky March 26, 2018, 5:09 p.m. UTC | #5
On Mon, Mar 26, 2018 at 09:30:07AM -0600, Jason Gunthorpe wrote:
> On Mon, Mar 26, 2018 at 04:44:35PM +0300, Leon Romanovsky wrote:
> > On Sun, Mar 25, 2018 at 11:09:02PM +0300, Yuval Shaia wrote:
> > > Upon QP creation failure provider's create_qp function sets errno with
> > > error code and return NULL to caller.
> > >
> > > Let's return this errno to caller to reflect the exact reason for the
> > > error.
> > >
> > > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> > >  librdmacm/cma.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/librdmacm/cma.c b/librdmacm/cma.c
> > > index fa370650..e92a022c 100644
> > > +++ b/librdmacm/cma.c
> > > @@ -1369,7 +1369,7 @@ int rdma_create_qp_ex(struct rdma_cm_id *id,
> > >  		attr->srq = id->srq;
> > >  	qp = ibv_create_qp_ex(id->verbs, attr);
> > >  	if (!qp) {
> > > -		ret = ERR(ENOMEM);
> > > +		ret = ERR(errno);
> >
> > It is not always correct, for example
> > ibv_create_qp_ex -> ibv_create_qp -> pd->context->ops.create_qp ->
> > mlx5_create_qp -> create_qp -> can return NULL without updating errno.
>
> That is a driver bug..

It was an example, does patch author go to check and update ALL paths in
all drivers?

Thanks

>
> Jason
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yuval Shaia March 26, 2018, 5:28 p.m. UTC | #6
On Mon, Mar 26, 2018 at 08:09:22PM +0300, Leon Romanovsky wrote:
> On Mon, Mar 26, 2018 at 09:30:07AM -0600, Jason Gunthorpe wrote:
> > On Mon, Mar 26, 2018 at 04:44:35PM +0300, Leon Romanovsky wrote:
> > > On Sun, Mar 25, 2018 at 11:09:02PM +0300, Yuval Shaia wrote:
> > > > Upon QP creation failure provider's create_qp function sets errno with
> > > > error code and return NULL to caller.
> > > >
> > > > Let's return this errno to caller to reflect the exact reason for the
> > > > error.
> > > >
> > > > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> > > >  librdmacm/cma.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/librdmacm/cma.c b/librdmacm/cma.c
> > > > index fa370650..e92a022c 100644
> > > > +++ b/librdmacm/cma.c
> > > > @@ -1369,7 +1369,7 @@ int rdma_create_qp_ex(struct rdma_cm_id *id,
> > > >  		attr->srq = id->srq;
> > > >  	qp = ibv_create_qp_ex(id->verbs, attr);
> > > >  	if (!qp) {
> > > > -		ret = ERR(ENOMEM);
> > > > +		ret = ERR(errno);
> > >
> > > It is not always correct, for example
> > > ibv_create_qp_ex -> ibv_create_qp -> pd->context->ops.create_qp ->
> > > mlx5_create_qp -> create_qp -> can return NULL without updating errno.
> >
> > That is a driver bug..
> 
> It was an example, does patch author go to check and update ALL paths in
> all drivers?

No, only at specific scenario in mlx5.

ibv_create_qp_ex -> ibv_create_qp -> pd->context->ops.create_qp ->
mlx5_create_qp -> create_qp -> mlx5_calc_wq_size -> mlx5_calc_sq_size which
returns -EINVAL that was lost by rdma_create_qp_ex which eventually reports
the user that there is some memory problem.

> 
> Thanks
> 
> >
> > Jason
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe March 26, 2018, 9:13 p.m. UTC | #7
On Mon, Mar 26, 2018 at 08:28:32PM +0300, Yuval Shaia wrote:
> On Mon, Mar 26, 2018 at 08:09:22PM +0300, Leon Romanovsky wrote:
> > On Mon, Mar 26, 2018 at 09:30:07AM -0600, Jason Gunthorpe wrote:
> > > On Mon, Mar 26, 2018 at 04:44:35PM +0300, Leon Romanovsky wrote:
> > > > On Sun, Mar 25, 2018 at 11:09:02PM +0300, Yuval Shaia wrote:
> > > > > Upon QP creation failure provider's create_qp function sets errno with
> > > > > error code and return NULL to caller.
> > > > >
> > > > > Let's return this errno to caller to reflect the exact reason for the
> > > > > error.
> > > > >
> > > > > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> > > > >  librdmacm/cma.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/librdmacm/cma.c b/librdmacm/cma.c
> > > > > index fa370650..e92a022c 100644
> > > > > +++ b/librdmacm/cma.c
> > > > > @@ -1369,7 +1369,7 @@ int rdma_create_qp_ex(struct rdma_cm_id *id,
> > > > >  		attr->srq = id->srq;
> > > > >  	qp = ibv_create_qp_ex(id->verbs, attr);
> > > > >  	if (!qp) {
> > > > > -		ret = ERR(ENOMEM);
> > > > > +		ret = ERR(errno);
> > > >
> > > > It is not always correct, for example
> > > > ibv_create_qp_ex -> ibv_create_qp -> pd->context->ops.create_qp ->
> > > > mlx5_create_qp -> create_qp -> can return NULL without updating errno.
> > >
> > > That is a driver bug..
> > 
> > It was an example, does patch author go to check and update ALL paths in
> > all drivers?
> 
> No, only at specific scenario in mlx5.
> 
> ibv_create_qp_ex -> ibv_create_qp -> pd->context->ops.create_qp ->
> mlx5_create_qp -> create_qp -> mlx5_calc_wq_size -> mlx5_calc_sq_size which
> returns -EINVAL that was lost by rdma_create_qp_ex which eventually reports
> the user that there is some memory problem.

We don't have to fix every driver before fixing the common code.

Drivers are supposed to return errno, I'm sure lots and lots of them
have paths where they fail to do this.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hefty, Sean March 26, 2018, 10:06 p.m. UTC | #8
> Drivers are supposed to return errno, I'm sure lots and lots of them
> have paths where they fail to do this.

The code could check for errno, and assume if set that it's set to the correct error number.

if (!qp) {
	ret = errno ? ERR(errno) : ENOMEM;

or wrap the errno check/return in a macro 

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe March 26, 2018, 10:13 p.m. UTC | #9
On Mon, Mar 26, 2018 at 10:06:37PM +0000, Hefty, Sean wrote:
> > Drivers are supposed to return errno, I'm sure lots and lots of them
> > have paths where they fail to do this.
> 
> The code could check for errno, and assume if set that it's set to the correct error number.
> 
> if (!qp) {
> 	ret = errno ? ERR(errno) : ENOMEM;
> 
> or wrap the errno check/return in a macro 

And zero errno befor calling the driver.

We'd have to put that sort of guard in all the verbs calls that return
a pointer, I think.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky March 27, 2018, 3:25 a.m. UTC | #10
On Mon, Mar 26, 2018 at 04:13:15PM -0600, Jason Gunthorpe wrote:
> On Mon, Mar 26, 2018 at 10:06:37PM +0000, Hefty, Sean wrote:
> > > Drivers are supposed to return errno, I'm sure lots and lots of them
> > > have paths where they fail to do this.
> >
> > The code could check for errno, and assume if set that it's set to the correct error number.
> >
> > if (!qp) {
> > 	ret = errno ? ERR(errno) : ENOMEM;
> >
> > or wrap the errno check/return in a macro
>
> And zero errno befor calling the driver.
>
> We'd have to put that sort of guard in all the verbs calls that return
> a pointer, I think.

+1,

It is an excellent suggestion, at the end such code will allow us to unify
all returned "errno" and ensure that they are consistent between different
verbs calls.

Thanks

>
> Jason
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/librdmacm/cma.c b/librdmacm/cma.c
index fa370650..e92a022c 100644
--- a/librdmacm/cma.c
+++ b/librdmacm/cma.c
@@ -1369,7 +1369,7 @@  int rdma_create_qp_ex(struct rdma_cm_id *id,
 		attr->srq = id->srq;
 	qp = ibv_create_qp_ex(id->verbs, attr);
 	if (!qp) {
-		ret = ERR(ENOMEM);
+		ret = ERR(errno);
 		goto err1;
 	}