
[rdma-next,1/1] RDMA/mana_ib: indicate that inline data is not supported

Message ID: 1721126889-22770-1-git-send-email-kotaranov@linux.microsoft.com (mailing list archive)
State: Changes Requested
Delegated to: Leon Romanovsky
Series: [rdma-next,1/1] RDMA/mana_ib: indicate that inline data is not supported

Commit Message

Konstantin Taranov July 16, 2024, 10:48 a.m. UTC
From: Konstantin Taranov <kotaranov@microsoft.com>

Set max_inline_data to zero during RC QP creation.

Fixes: fdefb9184962 ("RDMA/mana_ib: Implement uapi to create and destroy RC QP")
Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/qp.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Leon Romanovsky July 16, 2024, 11:14 a.m. UTC | #1
On Tue, Jul 16, 2024 at 03:48:09AM -0700, Konstantin Taranov wrote:
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Set max_inline_data to zero during RC QP creation.
> 
> Fixes: fdefb9184962 ("RDMA/mana_ib: Implement uapi to create and destroy RC QP")
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> ---
>  drivers/infiniband/hw/mana/qp.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
> index 73d67c853b6f..d9f24a763e72 100644
> --- a/drivers/infiniband/hw/mana/qp.c
> +++ b/drivers/infiniband/hw/mana/qp.c
> @@ -426,6 +426,8 @@ static int mana_ib_create_rc_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
>  	u64 flags = 0;
>  	u32 doorbell;
>  
> +	/* inline data is not supported */
> +	attr->cap.max_inline_data = 0;

Can you please point me to the flow where attr is not zeroed before?

Thanks

>  	if (!udata || udata->inlen < sizeof(ucmd))
>  		return -EINVAL;
>  
> -- 
> 2.43.0
>
Konstantin Taranov July 16, 2024, 1:42 p.m. UTC | #2
> > Set max_inline_data to zero during RC QP creation.
> >
> > Fixes: fdefb9184962 ("RDMA/mana_ib: Implement uapi to create and
> > destroy RC QP")
> > Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> > ---
> >  drivers/infiniband/hw/mana/qp.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/infiniband/hw/mana/qp.c
> > b/drivers/infiniband/hw/mana/qp.c index 73d67c853b6f..d9f24a763e72
> > 100644
> > --- a/drivers/infiniband/hw/mana/qp.c
> > +++ b/drivers/infiniband/hw/mana/qp.c
> > @@ -426,6 +426,8 @@ static int mana_ib_create_rc_qp(struct ib_qp *ibqp,
> struct ib_pd *ibpd,
> >  	u64 flags = 0;
> >  	u32 doorbell;
> >
> > +	/* inline data is not supported */
> > +	attr->cap.max_inline_data = 0;
> 
> Can you please point me to the flow where attr is not zeroed before?
>

Sorry, I do not understand the question. I cannot point to something that is not in the code.

It is there to support the case when a user asks for x bytes of inline
data when creating a QP, and we respond with the actual allowed inline
data for the created QP (as defined in: "The function ibv_create_qp()
will update the qp_init_attr->cap struct with the actual QP values of
the QP that was created;").

The kernel logic is inside "static int create_qp(struct uverbs_attr_bundle *attrs, struct ib_uverbs_ex_create_qp *cmd)"
where we do the following:
attr.cap.max_inline_data = cmd->max_inline_data;
qp = ib_create_qp_user(..,&attr,..);
resp.base.max_inline_data = attr.cap.max_inline_data;

So, my change makes sure that the response will have 0 and not the value the user asked for,
as we do not support inlining. Without the fix, a user who asked for inlining would falsely
conclude that we support it (an example of such an application is rdma_server from librdmacm).
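
For illustration, such an application does roughly the following (a sketch in the
spirit of the librdmacm examples; setup and variable declarations abbreviated):

	struct ibv_qp_init_attr attr = {};	/* CQs, qp_type, other caps filled in as usual */
	attr.cap.max_inline_data = 16;		/* ask for 16 bytes of inline data */
	ret = rdma_create_ep(&id, res, NULL, &attr);
	if (attr.cap.max_inline_data >= 16)	/* without the fix, mana_ib echoes 16 back */
		send_flags = IBV_SEND_INLINE;	/* so the app wrongly enables inline sends */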

Thanks

> Thanks
> 
> >  	if (!udata || udata->inlen < sizeof(ucmd))
> >  		return -EINVAL;
> >
> > --
> > 2.43.0
> >
Leon Romanovsky July 16, 2024, 2:22 p.m. UTC | #3
On Tue, Jul 16, 2024 at 01:42:49PM +0000, Konstantin Taranov wrote:
> > > Set max_inline_data to zero during RC QP creation.
> > >
> > > Fixes: fdefb9184962 ("RDMA/mana_ib: Implement uapi to create and
> > > destroy RC QP")
> > > Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> > > ---
> > >  drivers/infiniband/hw/mana/qp.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/drivers/infiniband/hw/mana/qp.c
> > > b/drivers/infiniband/hw/mana/qp.c index 73d67c853b6f..d9f24a763e72
> > > 100644
> > > --- a/drivers/infiniband/hw/mana/qp.c
> > > +++ b/drivers/infiniband/hw/mana/qp.c
> > > @@ -426,6 +426,8 @@ static int mana_ib_create_rc_qp(struct ib_qp *ibqp,
> > struct ib_pd *ibpd,
> > >  	u64 flags = 0;
> > >  	u32 doorbell;
> > >
> > > +	/* inline data is not supported */
> > > +	attr->cap.max_inline_data = 0;
> > 
> > Can you please point me to the flow where attr is not zeroed before?
> >
> 
> Sorry, I do not understand the question. I cannot point to something that is not in the code.
> 
> It is to support the case when user asks for x bytes inlined
> when it creates a QP, and we respond with actual allowed inline
> data for the created QP. (as defined in: "The function ibv_create_qp()
> will update the qp_init_attr->cap struct with the actual QP values of
> the QP that was created;")
> 
> The kernel logic is inside "static int create_qp(struct uverbs_attr_bundle *attrs, struct ib_uverbs_ex_create_qp *cmd)"
> where we do the following:
> attr.cap.max_inline_data = cmd->max_inline_data;
> qp = ib_create_qp_user(..,&attr,..);

Awesome, ib_create_qp_user() is called exactly in two places, and in
both cases I see this line "struct ib_qp_init_attr attr = {}; "

It means that attr is zeroed.

Thanks

> resp.base.max_inline_data = attr.cap.max_inline_data;
> 
> So, my change makes sure that the response will have 0 and not the value the user asked,
> as we do not support inlining. So without the fix, the user who was asking for inlining was falsely
> seeing that we support it (example of such an application is rdma_server from librdmacm).
> 
> Thanks
> 
> > Thanks
> > 
> > >  	if (!udata || udata->inlen < sizeof(ucmd))
> > >  		return -EINVAL;
> > >
> > > --
> > > 2.43.0
> > >
Konstantin Taranov July 16, 2024, 2:55 p.m. UTC | #4
> > > > Set max_inline_data to zero during RC QP creation.
> > > >
> > > > Fixes: fdefb9184962 ("RDMA/mana_ib: Implement uapi to create and
> > > > destroy RC QP")
> > > > Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> > > > ---
> > > >  drivers/infiniband/hw/mana/qp.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/drivers/infiniband/hw/mana/qp.c
> > > > b/drivers/infiniband/hw/mana/qp.c index 73d67c853b6f..d9f24a763e72
> > > > 100644
> > > > --- a/drivers/infiniband/hw/mana/qp.c
> > > > +++ b/drivers/infiniband/hw/mana/qp.c
> > > > @@ -426,6 +426,8 @@ static int mana_ib_create_rc_qp(struct ib_qp
> > > > *ibqp,
> > > struct ib_pd *ibpd,
> > > >  	u64 flags = 0;
> > > >  	u32 doorbell;
> > > >
> > > > +	/* inline data is not supported */
> > > > +	attr->cap.max_inline_data = 0;
> > >
> > > Can you please point me to the flow where attr is not zeroed before?
> > >
> >
> > Sorry, I do not understand the question. I cannot point to something that is
> not in the code.
> >
> > It is to support the case when user asks for x bytes inlined when it
> > creates a QP, and we respond with actual allowed inline data for the
> > created QP. (as defined in: "The function ibv_create_qp() will update
> > the qp_init_attr->cap struct with the actual QP values of the QP that
> > was created;")
> >
> > The kernel logic is inside "static int create_qp(struct uverbs_attr_bundle
> *attrs, struct ib_uverbs_ex_create_qp *cmd)"
> > where we do the following:
> > attr.cap.max_inline_data = cmd->max_inline_data; qp =
> > ib_create_qp_user(..,&attr,..);
> 
> Awesome, ib_create_qp_user() is called exactly in two places, and in both
> cases I see this line "struct ib_qp_init_attr attr = {}; "
> 
> It means that attr is zeroed.
> 

I think there is some misunderstanding.
So, the attr is zeroed at init. Then it is filled with values from the user request.
With
	attr.cap.max_send_wr     = cmd->max_send_wr;
	attr.cap.max_recv_wr     = cmd->max_recv_wr;
	attr.cap.max_send_sge    = cmd->max_send_sge;
	attr.cap.max_recv_sge    = cmd->max_recv_sge;
	attr.cap.max_inline_data = cmd->max_inline_data;
or with:
	set_caps(&attr, &cap, true);

So at this moment there is a value from the user in the attr (it is not 0).
ib_create_qp_user is called with the attr.

The drivers are allowed to tune the cap values during QP creation.
So I want to set it to 0 as some other providers do (see pvrdma_create_qp,
rvt_create_qp, set_kernel_sq_size from mlx4).

After the driver call has happened, the modified cap values are copied to a temp variable:
	resp.base.max_recv_sge    = attr.cap.max_recv_sge;
	resp.base.max_send_sge    = attr.cap.max_send_sge;
	resp.base.max_recv_wr     = attr.cap.max_recv_wr;
	resp.base.max_send_wr     = attr.cap.max_send_wr;
	resp.base.max_inline_data = attr.cap.max_inline_data;
or with:
	set_caps(&attr, &cap, false);
And then copied to the user with:
	uverbs_response(attrs, &resp, sizeof(resp));
or with
	uverbs_copy_to_struct_or_zero(attrs,
					UVERBS_ATTR_CREATE_QP_RESP_CAP, &cap,
					sizeof(cap));

Do you agree that my patch solves a problem in mana_ib (there is no problem in core)?
Or do you think that I am trying to solve a non-existent problem?

Thanks

> Thanks
> 
> > resp.base.max_inline_data = attr.cap.max_inline_data;
> >
> > So, my change makes sure that the response will have 0 and not the
> > value the user asked, as we do not support inlining. So without the
> > fix, the user who was asking for inlining was falsely seeing that we support
> it (example of such an application is rdma_server from librdmacm).
> >
> > Thanks
> >
> > > Thanks
> > >
> > > >  	if (!udata || udata->inlen < sizeof(ucmd))
> > > >  		return -EINVAL;
> > > >
> > > > --
> > > > 2.43.0
> > > >
Leon Romanovsky July 16, 2024, 5:06 p.m. UTC | #5
On Tue, Jul 16, 2024 at 02:55:26PM +0000, Konstantin Taranov wrote:
> > > > > Set max_inline_data to zero during RC QP creation.
> > > > >
> > > > > Fixes: fdefb9184962 ("RDMA/mana_ib: Implement uapi to create and
> > > > > destroy RC QP")
> > > > > Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> > > > > ---
> > > > >  drivers/infiniband/hw/mana/qp.c | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > >
> > > > > diff --git a/drivers/infiniband/hw/mana/qp.c
> > > > > b/drivers/infiniband/hw/mana/qp.c index 73d67c853b6f..d9f24a763e72
> > > > > 100644
> > > > > --- a/drivers/infiniband/hw/mana/qp.c
> > > > > +++ b/drivers/infiniband/hw/mana/qp.c
> > > > > @@ -426,6 +426,8 @@ static int mana_ib_create_rc_qp(struct ib_qp
> > > > > *ibqp,
> > > > struct ib_pd *ibpd,
> > > > >  	u64 flags = 0;
> > > > >  	u32 doorbell;
> > > > >
> > > > > +	/* inline data is not supported */
> > > > > +	attr->cap.max_inline_data = 0;
> > > >
> > > > Can you please point me to the flow where attr is not zeroed before?
> > > >
> > >
> > > Sorry, I do not understand the question. I cannot point to something that is
> > not in the code.
> > >
> > > It is to support the case when user asks for x bytes inlined when it
> > > creates a QP, and we respond with actual allowed inline data for the
> > > created QP. (as defined in: "The function ibv_create_qp() will update
> > > the qp_init_attr->cap struct with the actual QP values of the QP that
> > > was created;")
> > >
> > > The kernel logic is inside "static int create_qp(struct uverbs_attr_bundle
> > *attrs, struct ib_uverbs_ex_create_qp *cmd)"
> > > where we do the following:
> > > attr.cap.max_inline_data = cmd->max_inline_data; qp =
> > > ib_create_qp_user(..,&attr,..);
> > 
> > Awesome, ib_create_qp_user() is called exactly in two places, and in both
> > cases I see this line "struct ib_qp_init_attr attr = {}; "
> > 
> > It means that attr is zeroed.
> > 
> 
> I think there is some misunderstanding.
> So, the attr is zeroed at init. Then it is filled with values from the user request.
> With
> 	attr.cap.max_send_wr     = cmd->max_send_wr;
> 	attr.cap.max_recv_wr     = cmd->max_recv_wr;
> 	attr.cap.max_send_sge    = cmd->max_send_sge;
> 	attr.cap.max_recv_sge    = cmd->max_recv_sge;
> 	attr.cap.max_inline_data = cmd->max_inline_data;
> or with:
> 	set_caps(&attr, &cap, true);
> 
> So at this moment there is a value from the user in the attr (it is not 0).
> ib_create_qp_user is called with the attr.
> 
> The drivers are allowed to tune the cap values during QP create.
> So I want to set it to 0 as some other provides (see pvrdma_create_qp,
> rvt_create_qp, set_kernel_sq_size from mlx4).
> 
> After the driver call has happened, the modified cap value are copied to temp variable:
> 	resp.base.max_recv_sge    = attr.cap.max_recv_sge;
> 	resp.base.max_send_sge    = attr.cap.max_send_sge;
> 	resp.base.max_recv_wr     = attr.cap.max_recv_wr;
> 	resp.base.max_send_wr     = attr.cap.max_send_wr;
> 	resp.base.max_inline_data = attr.cap.max_inline_data;
> or with:
> 	set_caps(&attr, &cap, false);
> And then copied to the user with:
> 	uverbs_response(attrs, &resp, sizeof(resp));
> or with
> 	uverbs_copy_to_struct_or_zero(attrs,
> 					UVERBS_ATTR_CREATE_QP_RESP_CAP, &cap,
> 					sizeof(cap));
> 
> Do you agree that my patch solves a problem in mana_ib (there is no problem in core)?
> Or do you think that I am trying to solve non-existent problem?

Yes, you are. If a user asks for specific functionality (max_inline_data != 0) and
your device doesn't support it, you should return an error.

pvrdma, mlx4 and rvt are not good examples; they should return an error as well, but because they are
legacy code, we won't change them.

Thanks

> 
> Thanks
> 
> > Thanks
> > 
> > > resp.base.max_inline_data = attr.cap.max_inline_data;
> > >
> > > So, my change makes sure that the response will have 0 and not the
> > > value the user asked, as we do not support inlining. So without the
> > > fix, the user who was asking for inlining was falsely seeing that we support
> > it (example of such an application is rdma_server from librdmacm).
> > >
> > > Thanks
> > >
> > > > Thanks
> > > >
> > > > >  	if (!udata || udata->inlen < sizeof(ucmd))
> > > > >  		return -EINVAL;
> > > > >
> > > > > --
> > > > > 2.43.0
> > > > >
Konstantin Taranov July 16, 2024, 5:25 p.m. UTC | #6
> 
> Yes, you are. If user asked for specific functionality (max_inline_data != 0) and
> your device doesn't support it, you should return an error.
> 
> pvrdma, mlx4 and rvt are not good examples, they should return an error as
> well, but because of being legacy code, we won't change them.
> 
> Thanks
> 

I see. So I guess we can return a larger value, but not smaller. Right?
I will send v2 that fails QP creation then.
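
Roughly like this, I think (just a sketch; the actual v2 may differ):

	/* in mana_ib_create_rc_qp(): reject the request instead of silently zeroing it */
	if (attr->cap.max_inline_data)
		return -EINVAL;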

In this case, may I submit a patch to rdma-core that queries device caps before
trying to create a qp in rdma_client.c and rdma_server.c, as that code violates
what you described?

Thanks
Leon Romanovsky July 17, 2024, 6:22 a.m. UTC | #7
On Tue, Jul 16, 2024 at 05:25:22PM +0000, Konstantin Taranov wrote:
> > 
> > Yes, you are. If user asked for specific functionality (max_inline_data != 0) and
> > your device doesn't support it, you should return an error.
> > 
> > pvrdma, mlx4 and rvt are not good examples, they should return an error as
> > well, but because of being legacy code, we won't change them.
> > 
> > Thanks
> > 
> 
> I see. So I guess we can return a larger value, but not smaller. Right?
> I will send v2 that fails QP creation then.
> 
> In this case, may I submit a patch to rdma-core that queries device caps before
> trying to create a qp in rdma_client.c and rdma_server.c? As that code violates
> what you described.

Let's ask Jason: why is that? Do we allow ignoring max_inline_data?

librdmacm/examples/rdma_client.c
  63         memset(&attr, 0, sizeof attr);
  64         attr.cap.max_send_wr = attr.cap.max_recv_wr = 1;
  65         attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
  66         attr.cap.max_inline_data = 16;
  67         attr.qp_context = id;
  68         attr.sq_sig_all = 1;
  69         ret = rdma_create_ep(&id, res, NULL, &attr);
  70         // Check to see if we got inline data allowed or not
  71         if (attr.cap.max_inline_data >= 16)
  72                 send_flags = IBV_SEND_INLINE;
  73         else
  74                 printf("rdma_client: device doesn't support IBV_SEND_INLINE, "
  75                        "using sge sends\n");

> 
> Thanks
> 
>  
>
Jason Gunthorpe July 17, 2024, 4:34 p.m. UTC | #8
On Wed, Jul 17, 2024 at 09:22:50AM +0300, Leon Romanovsky wrote:
> On Tue, Jul 16, 2024 at 05:25:22PM +0000, Konstantin Taranov wrote:
> > > 
> > > Yes, you are. If user asked for specific functionality (max_inline_data != 0) and
> > > your device doesn't support it, you should return an error.
> > > 
> > > pvrdma, mlx4 and rvt are not good examples, they should return an error as
> > > well, but because of being legacy code, we won't change them.
> > > 
> > > Thanks
> > > 
> > 
> > I see. So I guess we can return a larger value, but not smaller. Right?
> > I will send v2 that fails QP creation then.
> > 
> > In this case, may I submit a patch to rdma-core that queries device caps before
> > trying to create a qp in rdma_client.c and rdma_server.c? As that code violates
> > what you described.
> 
> Let's ask Jason, why is that? Do we allow to ignore max_inline_data?
> 
> librdmacm/examples/rdma_client.c
>   63         memset(&attr, 0, sizeof attr);
>   64         attr.cap.max_send_wr = attr.cap.max_recv_wr = 1;
>   65         attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
>   66         attr.cap.max_inline_data = 16;
>   67         attr.qp_context = id;
>   68         attr.sq_sig_all = 1;
>   69         ret = rdma_create_ep(&id, res, NULL, &attr);
>   70         // Check to see if we got inline data allowed or not
>   71         if (attr.cap.max_inline_data >= 16)
>   72                 send_flags = IBV_SEND_INLINE;
>   73         else
>   74                 printf("rdma_client: device doesn't support IBV_SEND_INLINE, "
>   75                        "using sge sends\n");

I think the idea expressed in this code is that if the requested
max_inline_data is too large, it would be limited to the device capability.

i.e. QP creation should limit the requested values to what the HW can do,
similar to how entries and other attributes work.

If the HW has no support it should return 0 for max_inline_data, not an
error, I guess?

Jason
Konstantin Taranov July 18, 2024, 3:05 p.m. UTC | #9
> > > > Yes, you are. If user asked for specific functionality
> > > > (max_inline_data != 0) and your device doesn't support it, you should
> return an error.
> > > >
> > > > pvrdma, mlx4 and rvt are not good examples, they should return an
> > > > error as well, but because of being legacy code, we won't change them.
> > > >
> > > > Thanks
> > > >
> > >
> > > I see. So I guess we can return a larger value, but not smaller. Right?
> > > I will send v2 that fails QP creation then.
> > >
> > > In this case, may I submit a patch to rdma-core that queries device
> > > caps before trying to create a qp in rdma_client.c and
> > > rdma_server.c? As that code violates what you described.
> >
> > Let's ask Jason, why is that? Do we allow to ignore max_inline_data?
> >
> > librdmacm/examples/rdma_client.c
> >   63         memset(&attr, 0, sizeof attr);
> >   64         attr.cap.max_send_wr = attr.cap.max_recv_wr = 1;
> >   65         attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
> >   66         attr.cap.max_inline_data = 16;
> >   67         attr.qp_context = id;
> >   68         attr.sq_sig_all = 1;
> >   69         ret = rdma_create_ep(&id, res, NULL, &attr);
> >   70         // Check to see if we got inline data allowed or not
> >   71         if (attr.cap.max_inline_data >= 16)
> >   72                 send_flags = IBV_SEND_INLINE;
> >   73         else
> >   74                 printf("rdma_client: device doesn't support
> IBV_SEND_INLINE, "
> >   75                        "using sge sends\n");
> 
> I think the idea expressed in this code is that if max_inline_data requested
> too much it would be limited to the device capability.
> 
> ie qp creation should limit the requests values to what the HW can do, similar
> to how entries and other work.
> 
> If the HW has no support it should return - for max_inline_data not an error,
> I guess?

Yes, this code implies that max_inline_data can be ignored at creation, while the manual of ibv_create_qp says:
"The function ibv_create_qp() will update the qp_init_attr->cap struct with the actual QP values of the QP that was created;
the values will be **greater than or equal to** the values requested."

I see two options:
1) Remove the code from the rdma examples that relies on ignoring max_inline; add a warning to libibverbs when drivers ignore that value (see the sketch below).
2) Add to the manual that max_inline_data might be ignored by drivers; and allow my current patch that ignores max_inline_data in mana_ib.

I am fine with implementing either of the two. Please reply with which one you think is more correct. I guess option 2 is the safest, but option 1
is more correct in my opinion.
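
For the warning in option 1, libibverbs could warn when a provider returns less than
was requested, which the man page forbids (purely a sketch; the placement and the
requested_inline_data variable inside libibverbs are hypothetical):

	/* after the provider's create_qp returns: the man page requires
	 * reporting values greater than or equal to what was requested */
	if (attr->cap.max_inline_data < requested_inline_data)
		fprintf(stderr,
			"libibverbs: provider reduced max_inline_data (%u -> %u)\n",
			requested_inline_data, attr->cap.max_inline_data);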

Thanks

> 
> Jason
Jason Gunthorpe July 18, 2024, 4:48 p.m. UTC | #10
On Thu, Jul 18, 2024 at 03:05:59PM +0000, Konstantin Taranov wrote:
> > > > > Yes, you are. If user asked for specific functionality
> > > > > (max_inline_data != 0) and your device doesn't support it, you should
> > return an error.
> > > > >
> > > > > pvrdma, mlx4 and rvt are not good examples, they should return an
> > > > > error as well, but because of being legacy code, we won't change them.
> > > > >
> > > > > Thanks
> > > > >
> > > >
> > > > I see. So I guess we can return a larger value, but not smaller. Right?
> > > > I will send v2 that fails QP creation then.
> > > >
> > > > In this case, may I submit a patch to rdma-core that queries device
> > > > caps before trying to create a qp in rdma_client.c and
> > > > rdma_server.c? As that code violates what you described.
> > >
> > > Let's ask Jason, why is that? Do we allow to ignore max_inline_data?
> > >
> > > librdmacm/examples/rdma_client.c
> > >   63         memset(&attr, 0, sizeof attr);
> > >   64         attr.cap.max_send_wr = attr.cap.max_recv_wr = 1;
> > >   65         attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
> > >   66         attr.cap.max_inline_data = 16;
> > >   67         attr.qp_context = id;
> > >   68         attr.sq_sig_all = 1;
> > >   69         ret = rdma_create_ep(&id, res, NULL, &attr);
> > >   70         // Check to see if we got inline data allowed or not
> > >   71         if (attr.cap.max_inline_data >= 16)
> > >   72                 send_flags = IBV_SEND_INLINE;
> > >   73         else
> > >   74                 printf("rdma_client: device doesn't support
> > IBV_SEND_INLINE, "
> > >   75                        "using sge sends\n");
> > 
> > I think the idea expressed in this code is that if max_inline_data requested
> > too much it would be limited to the device capability.
> > 
> > ie qp creation should limit the requests values to what the HW can do, similar
> > to how entries and other work.
> > 
> > If the HW has no support it should return - for max_inline_data not an error,
> > I guess?
> 
> Yes, this code implies that max_inline_data can be ignored at creation, while the manual of ibv_create_qp says:
> "The function ibv_create_qp() will update the qp_init_attr->cap struct with the actual QP values of the QP that was created;
> the values will be **greater than or equal to** the values
> requested."

Ah, well, that seems to be a misunderstanding then, yes.

> I see two options:
> 1) Remove code from rdma examples that rely on ignoring max_inline; add a warning to libibverbs when drivers ignore that value.
> 2) Add to manual that max_inline_data might be ignored by drivers; and allow my current patch that ignores max_inline_data in mana_ib.

I don't know; what do the majority of drivers do? If enough are
already doing 1 then let's force everyone into 1, otherwise we have to
document 2.

And a pyverbs test should be added to cover this weirdness

Jason
Konstantin Taranov July 19, 2024, 10:51 a.m. UTC | #11
> > > > > > Yes, you are. If user asked for specific functionality
> > > > > > (max_inline_data != 0) and your device doesn't support it, you
> > > > > > should
> > > return an error.
> > > > > >
> > > > > > pvrdma, mlx4 and rvt are not good examples, they should return
> > > > > > an error as well, but because of being legacy code, we won't change
> them.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > >
> > > > > I see. So I guess we can return a larger value, but not smaller. Right?
> > > > > I will send v2 that fails QP creation then.
> > > > >
> > > > > In this case, may I submit a patch to rdma-core that queries
> > > > > device caps before trying to create a qp in rdma_client.c and
> > > > > rdma_server.c? As that code violates what you described.
> > > >
> > > > Let's ask Jason, why is that? Do we allow to ignore max_inline_data?
> > > >
> > > > librdmacm/examples/rdma_client.c
> > > >   63         memset(&attr, 0, sizeof attr);
> > > >   64         attr.cap.max_send_wr = attr.cap.max_recv_wr = 1;
> > > >   65         attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
> > > >   66         attr.cap.max_inline_data = 16;
> > > >   67         attr.qp_context = id;
> > > >   68         attr.sq_sig_all = 1;
> > > >   69         ret = rdma_create_ep(&id, res, NULL, &attr);
> > > >   70         // Check to see if we got inline data allowed or not
> > > >   71         if (attr.cap.max_inline_data >= 16)
> > > >   72                 send_flags = IBV_SEND_INLINE;
> > > >   73         else
> > > >   74                 printf("rdma_client: device doesn't support
> > > IBV_SEND_INLINE, "
> > > >   75                        "using sge sends\n");
> > >
> > > I think the idea expressed in this code is that if max_inline_data
> > > requested too much it would be limited to the device capability.
> > >
> > > ie qp creation should limit the requests values to what the HW can
> > > do, similar to how entries and other work.
> > >
> > > If the HW has no support it should return - for max_inline_data not
> > > an error, I guess?
> >
> > Yes, this code implies that max_inline_data can be ignored at creation,
> while the manual of ibv_create_qp says:
> > "The function ibv_create_qp() will update the qp_init_attr->cap struct
> > with the actual QP values of the QP that was created; the values will
> > be **greater than or equal to** the values requested."
> 
> Ah, well that seems to be some misunderstandings then, yes.
> 
> > I see two options:
> > 1) Remove code from rdma examples that rely on ignoring max_inline; add
> a warning to libibverbs when drivers ignore that value.
> > 2) Add to manual that max_inline_data might be ignored by drivers; and
> allow my current patch that ignores max_inline_data in mana_ib.
> 
> I don't know, what do the majority of drivers do? If enough are already doing
> 1 then lets force everyone into 1, otherwise we have to document 2.
> 
> And a pyverbs test should be added to cover this weirdness

I quickly read the create_qp code of all providers and it seems that max_inline_data is ignored by hw/pvrdma and sw/rvt.
Other providers fail the creation when they cannot satisfy the inline_data cap.
Some drivers ignore it for GSI QPs, but I think that is reasonable.

Then I guess option 1 is better. Regarding pyverbs, should I add a test for option 1?
If yes, what should it test?

> 
> Jason
Leon Romanovsky July 21, 2024, 6:56 a.m. UTC | #12
On Fri, Jul 19, 2024 at 10:51:58AM +0000, Konstantin Taranov wrote:
> > > > > > > Yes, you are. If user asked for specific functionality
> > > > > > > (max_inline_data != 0) and your device doesn't support it, you
> > > > > > > should
> > > > return an error.
> > > > > > >
> > > > > > > pvrdma, mlx4 and rvt are not good examples, they should return
> > > > > > > an error as well, but because of being legacy code, we won't change
> > them.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > >
> > > > > > I see. So I guess we can return a larger value, but not smaller. Right?
> > > > > > I will send v2 that fails QP creation then.
> > > > > >
> > > > > > In this case, may I submit a patch to rdma-core that queries
> > > > > > device caps before trying to create a qp in rdma_client.c and
> > > > > > rdma_server.c? As that code violates what you described.
> > > > >
> > > > > Let's ask Jason, why is that? Do we allow to ignore max_inline_data?
> > > > >
> > > > > librdmacm/examples/rdma_client.c
> > > > >   63         memset(&attr, 0, sizeof attr);
> > > > >   64         attr.cap.max_send_wr = attr.cap.max_recv_wr = 1;
> > > > >   65         attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
> > > > >   66         attr.cap.max_inline_data = 16;
> > > > >   67         attr.qp_context = id;
> > > > >   68         attr.sq_sig_all = 1;
> > > > >   69         ret = rdma_create_ep(&id, res, NULL, &attr);
> > > > >   70         // Check to see if we got inline data allowed or not
> > > > >   71         if (attr.cap.max_inline_data >= 16)
> > > > >   72                 send_flags = IBV_SEND_INLINE;
> > > > >   73         else
> > > > >   74                 printf("rdma_client: device doesn't support
> > > > IBV_SEND_INLINE, "
> > > > >   75                        "using sge sends\n");
> > > >
> > > > I think the idea expressed in this code is that if max_inline_data
> > > > requested too much it would be limited to the device capability.
> > > >
> > > > ie qp creation should limit the requests values to what the HW can
> > > > do, similar to how entries and other work.
> > > >
> > > > If the HW has no support it should return - for max_inline_data not
> > > > an error, I guess?
> > >
> > > Yes, this code implies that max_inline_data can be ignored at creation,
> > while the manual of ibv_create_qp says:
> > > "The function ibv_create_qp() will update the qp_init_attr->cap struct
> > > with the actual QP values of the QP that was created; the values will
> > > be **greater than or equal to** the values requested."
> > 
> > Ah, well that seems to be some misunderstandings then, yes.
> > 
> > > I see two options:
> > > 1) Remove code from rdma examples that rely on ignoring max_inline; add
> > a warning to libibverbs when drivers ignore that value.
> > > 2) Add to manual that max_inline_data might be ignored by drivers; and
> > allow my current patch that ignores max_inline_data in mana_ib.
> > 
> > I don't know, what do the majority of drivers do? If enough are already doing
> > 1 then lets force everyone into 1, otherwise we have to document 2.
> > 
> > And a pyverbs test should be added to cover this weirdness
> 
> I quickly read create_qp code of all providers and it seems that max_inline_data is ignored by hw/pvrdma and sw/rvt.
> Other providers fail the creation when they cannot satisfy the inline_data cap.
> Some drivers ignore it for GSI, but I think it is reasonable. 
> 
> Then I guess the option 1 is better. Regarding pyverbs, should I add a test for the option 1?
> If yes, what should it test?

Probably, the test should check the max_inline_data value returned from the device caps and try to create a
QP with a higher value. If the QP creation fails, the test should pass. For hw/pvrdma and sw/rvt, the QP
should be successfully created, despite the requested value.
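
In C terms, the probe could look roughly like this (the real test belongs in
pyverbs; and since plain ibv_query_device() has no inline field, the sketch
simply requests an absurdly large value):

#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
	struct ibv_device **devs = ibv_get_device_list(NULL);
	struct ibv_context *ctx;
	struct ibv_pd *pd;
	struct ibv_cq *cq;
	struct ibv_qp *qp;
	struct ibv_qp_init_attr attr = {};

	if (!devs || !devs[0])
		return 1;
	ctx = ibv_open_device(devs[0]);
	if (!ctx)
		return 1;
	pd = ibv_alloc_pd(ctx);
	cq = ibv_create_cq(ctx, 1, NULL, NULL, 0);
	if (!pd || !cq)
		return 1;

	attr.send_cq = cq;
	attr.recv_cq = cq;
	attr.qp_type = IBV_QPT_RC;
	attr.cap.max_send_wr = 1;
	attr.cap.max_recv_wr = 1;
	attr.cap.max_send_sge = 1;
	attr.cap.max_recv_sge = 1;
	attr.cap.max_inline_data = 1U << 20;	/* far above any realistic cap */

	qp = ibv_create_qp(pd, &attr);
	if (!qp)
		printf("QP creation failed: driver rejects unsupported inline size\n");
	else
		printf("QP created; reported max_inline_data = %u\n",
		       attr.cap.max_inline_data);
	return 0;
}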

Thanks

> 
> > 
> > Jason

Patch

diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index 73d67c853b6f..d9f24a763e72 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -426,6 +426,8 @@ static int mana_ib_create_rc_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
 	u64 flags = 0;
 	u32 doorbell;
 
+	/* inline data is not supported */
+	attr->cap.max_inline_data = 0;
 	if (!udata || udata->inlen < sizeof(ucmd))
 		return -EINVAL;