diff mbox

[v2,2/5] IB/core, RDMA RW API: Do not exceed QP SGE send limit

Message ID 7b9f8bca-38fa-b289-a92f-19cf93955e32@sandisk.com (mailing list archive)
State Superseded
Headers show

Commit Message

Bart Van Assche June 30, 2016, 1:49 p.m. UTC
The SGE limit for a queue pair is typically lower than what is
defined by the HCA limits. Hence use the QP SGE send limit
instead of the HCA send limit.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: <stable@vger.kernel.org> #v4.7+
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Parav Pandit <pandit.parav@gmail.com>
Cc: Laurence Oberman <loberman@redhat.com>
---
 drivers/infiniband/core/rw.c    | 9 +--------
 drivers/infiniband/core/verbs.c | 2 ++
 include/rdma/ib_verbs.h         | 1 +
 3 files changed, 4 insertions(+), 8 deletions(-)

Comments

Sagi Grimberg July 13, 2016, 9:23 a.m. UTC | #1
Looks good,

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg July 13, 2016, 9:32 a.m. UTC | #2
> -static inline u32 rdma_rw_max_sge(struct ib_device *dev,
> -		enum dma_data_direction dir)
> -{
> -	return dir == DMA_TO_DEVICE ?
> -		dev->attrs.max_sge : dev->attrs.max_sge_rd;
> -}
> -

Wait, this looks wrong...

iWARP devices have max_sge_rd = 1, Steve, did you get to test this?

>   static inline u32 rdma_rw_fr_page_list_len(struct ib_device *dev)
>   {
>   	/* arbitrary limit to avoid allocating gigantic resources */
> @@ -186,7 +179,7 @@ static int rdma_rw_init_map_wrs(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
>   		u64 remote_addr, u32 rkey, enum dma_data_direction dir)
>   {
>   	struct ib_device *dev = qp->pd->device;
> -	u32 max_sge = rdma_rw_max_sge(dev, dir);
> +	u32 max_sge = qp->max_send_sge;

Here, rdma_read will use qp.max_send_sge = max_sge which is set by the
ULP.

If the ULP used max(max_sge, max_sge_rd) which is the right thing, then
we get a bug, if the ULP set min(max_sge, max_sge_rd) then it has
effectively a single sge even for writes (which is not good).

The QP user needs to set the maximum sge allowed for reads or writes,
but for reads use max_sge_rd and for writes use max_sge. Otherwise this
will break.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche July 13, 2016, 10:18 a.m. UTC | #3
On 07/13/2016 04:32 PM, Sagi Grimberg wrote:
>
>> -static inline u32 rdma_rw_max_sge(struct ib_device *dev,
>> -		enum dma_data_direction dir)
>> -{
>> -	return dir == DMA_TO_DEVICE ?
>> -		dev->attrs.max_sge : dev->attrs.max_sge_rd;
>> -}
>> -
>
> Wait, this looks wrong...
>
> iWARP devices have max_sge_rd = 1, Steve, did you get to test this?
>
>>   static inline u32 rdma_rw_fr_page_list_len(struct ib_device *dev)
>>   {
>>   	/* arbitrary limit to avoid allocating gigantic resources */
>> @@ -186,7 +179,7 @@ static int rdma_rw_init_map_wrs(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
>>   		u64 remote_addr, u32 rkey, enum dma_data_direction dir)
>>   {
>>   	struct ib_device *dev = qp->pd->device;
>> -	u32 max_sge = rdma_rw_max_sge(dev, dir);
>> +	u32 max_sge = qp->max_send_sge;
>
> Here, rdma_read will use qp.max_send_sge = max_sge which is set by the
> ULP.
>
> If the ULP used max(max_sge, max_sge_rd) which is the right thing, then
> we get a bug, if the ULP set min(max_sge, max_sge_rd) then it has
> effectively a single sge even for writes (which is not good).
>
> The QP user needs to set the maximum sge allowed for reads or writes,
> but for reads use max_sge_rd and for writes use max_sge. Otherwise this
> will break.

Hello Sagi,

How about keeping rdma_rw_max_sge() and using the minimum of 
qp->max_send_sge and dev->attrs.max_sge_rd for RDMA READs?

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg July 13, 2016, 10:32 a.m. UTC | #4
On 13/07/16 13:18, Bart Van Assche wrote:
> On 07/13/2016 04:32 PM, Sagi Grimberg wrote:
>>
>>> -static inline u32 rdma_rw_max_sge(struct ib_device *dev,
>>> -        enum dma_data_direction dir)
>>> -{
>>> -    return dir == DMA_TO_DEVICE ?
>>> -        dev->attrs.max_sge : dev->attrs.max_sge_rd;
>>> -}
>>> -
>>
>> Wait, this looks wrong...
>>
>> iWARP devices have max_sge_rd = 1, Steve, did you get to test this?
>>
>>>   static inline u32 rdma_rw_fr_page_list_len(struct ib_device *dev)
>>>   {
>>>       /* arbitrary limit to avoid allocating gigantic resources */
>>> @@ -186,7 +179,7 @@ static int rdma_rw_init_map_wrs(struct
>>> rdma_rw_ctx *ctx, struct ib_qp *qp,
>>>           u64 remote_addr, u32 rkey, enum dma_data_direction dir)
>>>   {
>>>       struct ib_device *dev = qp->pd->device;
>>> -    u32 max_sge = rdma_rw_max_sge(dev, dir);
>>> +    u32 max_sge = qp->max_send_sge;
>>
>> Here, rdma_read will use qp.max_send_sge = max_sge which is set by the
>> ULP.
>>
>> If the ULP used max(max_sge, max_sge_rd) which is the right thing, then
>> we get a bug, if the ULP set min(max_sge, max_sge_rd) then it has
>> effectively a single sge even for writes (which is not good).
>>
>> The QP user needs to set the maximum sge allowed for reads or writes,
>> but for reads use max_sge_rd and for writes use max_sge. Otherwise this
>> will break.
>
> Hello Sagi,
>
> How about keeping rdma_rw_max_sge() and using the minimum of
> qp->max_send_sge and dev->attrs.max_sge_rd for RDMA READs?

That would work.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise July 13, 2016, 2:22 p.m. UTC | #5
> 
> 
> > -static inline u32 rdma_rw_max_sge(struct ib_device *dev,
> > -		enum dma_data_direction dir)
> > -{
> > -	return dir == DMA_TO_DEVICE ?
> > -		dev->attrs.max_sge : dev->attrs.max_sge_rd;
> > -}
> > -
> 
> Wait, this looks wrong...
> 
> iWARP devices have max_sge_rd = 1, Steve, did you get to test this?
>

No I haven't. 
 
> >   static inline u32 rdma_rw_fr_page_list_len(struct ib_device *dev)
> >   {
> >   	/* arbitrary limit to avoid allocating gigantic resources */
> > @@ -186,7 +179,7 @@ static int rdma_rw_init_map_wrs(struct rdma_rw_ctx
> *ctx, struct ib_qp *qp,
> >   		u64 remote_addr, u32 rkey, enum dma_data_direction dir)
> >   {
> >   	struct ib_device *dev = qp->pd->device;
> > -	u32 max_sge = rdma_rw_max_sge(dev, dir);
> > +	u32 max_sge = qp->max_send_sge;
> 
> Here, rdma_read will use qp.max_send_sge = max_sge which is set by the
> ULP.
> 
> If the ULP used max(max_sge, max_sge_rd) which is the right thing, then
> we get a bug, if the ULP set min(max_sge, max_sge_rd) then it has
> effectively a single sge even for writes (which is not good).
> 
> The QP user needs to set the maximum sge allowed for reads or writes,
> but for reads use max_sge_rd and for writes use max_sge. Otherwise this
> will break.

I haven't paid enough attention to these changes, but definitely max_sge_rd is for reads and max_sge is for writes...


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise July 13, 2016, 2:29 p.m. UTC | #6
> 
> 
> On 13/07/16 13:18, Bart Van Assche wrote:
> > On 07/13/2016 04:32 PM, Sagi Grimberg wrote:
> >>
> >>> -static inline u32 rdma_rw_max_sge(struct ib_device *dev,
> >>> -        enum dma_data_direction dir)
> >>> -{
> >>> -    return dir == DMA_TO_DEVICE ?
> >>> -        dev->attrs.max_sge : dev->attrs.max_sge_rd;
> >>> -}
> >>> -
> >>
> >> Wait, this looks wrong...
> >>
> >> iWARP devices have max_sge_rd = 1, Steve, did you get to test this?
> >>
> >>>   static inline u32 rdma_rw_fr_page_list_len(struct ib_device *dev)
> >>>   {
> >>>       /* arbitrary limit to avoid allocating gigantic resources */
> >>> @@ -186,7 +179,7 @@ static int rdma_rw_init_map_wrs(struct
> >>> rdma_rw_ctx *ctx, struct ib_qp *qp,
> >>>           u64 remote_addr, u32 rkey, enum dma_data_direction dir)
> >>>   {
> >>>       struct ib_device *dev = qp->pd->device;
> >>> -    u32 max_sge = rdma_rw_max_sge(dev, dir);
> >>> +    u32 max_sge = qp->max_send_sge;
> >>
> >> Here, rdma_read will use qp.max_send_sge = max_sge which is set by the
> >> ULP.
> >>
> >> If the ULP used max(max_sge, max_sge_rd) which is the right thing, then
> >> we get a bug, if the ULP set min(max_sge, max_sge_rd) then it has
> >> effectively a single sge even for writes (which is not good).
> >>
> >> The QP user needs to set the maximum sge allowed for reads or writes,
> >> but for reads use max_sge_rd and for writes use max_sge. Otherwise this
> >> will break.
> >
> > Hello Sagi,
> >
> > How about keeping rdma_rw_max_sge() and using the minimum of
> > qp->max_send_sge and dev->attrs.max_sge_rd for RDMA READs?
> 
> That would work.


I'll be sure and test v3 of this series on cxgb4 (please CC me).

Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche July 14, 2016, 12:26 a.m. UTC | #7
On 07/13/2016 09:29 PM, Steve Wise wrote:
> I'll be sure and test v3 of this series on cxgb4 (please CC me).

Thanks Steve. I will CC you for v3 of this series.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core/rw.c
index 1ad2baa..425e711 100644
--- a/drivers/infiniband/core/rw.c
+++ b/drivers/infiniband/core/rw.c
@@ -58,13 +58,6 @@  static inline bool rdma_rw_io_needs_mr(struct ib_device *dev, u8 port_num,
 	return false;
 }
 
-static inline u32 rdma_rw_max_sge(struct ib_device *dev,
-		enum dma_data_direction dir)
-{
-	return dir == DMA_TO_DEVICE ?
-		dev->attrs.max_sge : dev->attrs.max_sge_rd;
-}
-
 static inline u32 rdma_rw_fr_page_list_len(struct ib_device *dev)
 {
 	/* arbitrary limit to avoid allocating gigantic resources */
@@ -186,7 +179,7 @@  static int rdma_rw_init_map_wrs(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
 		u64 remote_addr, u32 rkey, enum dma_data_direction dir)
 {
 	struct ib_device *dev = qp->pd->device;
-	u32 max_sge = rdma_rw_max_sge(dev, dir);
+	u32 max_sge = qp->max_send_sge;
 	struct ib_sge *sge;
 	u32 total_len = 0, i, j;
 
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 6298f54..c7f840e 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -814,6 +814,8 @@  struct ib_qp *ib_create_qp(struct ib_pd *pd,
 		}
 	}
 
+	qp->max_send_sge = qp_init_attr->cap.max_send_sge;
+
 	return qp;
 }
 EXPORT_SYMBOL(ib_create_qp);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 7e440d4..c44dbf6 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1449,6 +1449,7 @@  struct ib_qp {
 	void                  (*event_handler)(struct ib_event *, void *);
 	void		       *qp_context;
 	u32			qp_num;
+	u32			max_send_sge;
 	enum ib_qp_type		qp_type;
 };