diff mbox

[3/5] IB/isert: Limit the number of SG elements per work request

Message ID 885f39a6-9d75-9999-d582-e403f072bec1@sandisk.com (mailing list archive)
State Superseded
Headers show

Commit Message

Bart Van Assche June 28, 2016, 11:26 a.m. UTC
Limit the number of SG elements per work request to what the queue
pair supports.

Fixes: b99f8e4d7bcd ("IB/srpt: convert to the generic RDMA READ/WRITE API")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: <stable@vger.kernel.org> #v4.7+
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Parav Pandit <pandit.parav@gmail.com>
Cc: Laurence Oberman <loberman@redhat.com>
---
 drivers/infiniband/ulp/isert/ib_isert.c | 10 +++++-----
 drivers/infiniband/ulp/isert/ib_isert.h |  3 ++-
 2 files changed, 7 insertions(+), 6 deletions(-)

Comments

Steve Wise June 28, 2016, 2:28 p.m. UTC | #1
> 
> Limit the number of SG elements per work request to what the queue
> pair supports.
> 
> Fixes: b99f8e4d7bcd ("IB/srpt: convert to the generic RDMA READ/WRITE API")

I think the above is the wrong commit.  Maybe this one?

commit 38a2d0d429f1d87315c55d9139b8bdf66d51c4f4
Author: Christoph Hellwig <hch@lst.de>
Date:   Tue May 3 18:01:13 2016 +0200

    IB/isert: convert to the generic RDMA READ/WRITE API



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise June 28, 2016, 2:33 p.m. UTC | #2
> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> owner@vger.kernel.org] On Behalf Of Bart Van Assche
> Sent: Tuesday, June 28, 2016 6:27 AM
> To: Doug Ledford
> Cc: Christoph Hellwig; Sagi Grimberg; Nicholas A. Bellinger; Parav Pandit; Laurence
> Oberman; linux-rdma@vger.kernel.org
> Subject: [PATCH 3/5] IB/isert: Limit the number of SG elements per work request
> 
> Limit the number of SG elements per work request to what the queue
> pair supports.
> 

Hey Bart, do you think this patch fixes this error we're seeing?

isert: isert_rdma_rw_ctx_post: Cmd: ffff880f30eb0278 failed to post RDMA res
isert: isert_rdma_rw_ctx_post: Cmd: ffff880f30e85ef8 failed to post RDMA res
ABORT_TASK: Found referenced iSCSI task_tag: 14
ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 14
ABORT_TASK: Found referenced iSCSI task_tag: 40
ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 40
isert: isert_rdma_rw_ctx_post: Cmd: ffff880f311519f0 failed to post RDMA res
ABORT_TASK: Found referenced iSCSI task_tag: 70
ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 70
isert: isert_rdma_rw_ctx_post: Cmd: ffff880f311720a0 failed to post RDMA res
ABORT_TASK: Found referenced iSCSI task_tag: 86
ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 86
isert: isert_rdma_rw_ctx_post: Cmd: ffff880f31160528 failed to post RDMA res
isert: isert_rdma_rw_ctx_post: Cmd: ffff880f30ead238 failed to post RDMA res
isert: isert_rdma_rw_ctx_post: Cmd: ffff880f30ea8fe0 failed to post RDMA res
ABORT_TASK: Found referenced iSCSI task_tag: 124
ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 124
ABORT_TASK: Found referenced iSCSI task_tag: 120
ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 120
ABORT_TASK: Found referenced iSCSI task_tag: 111
ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 111

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche June 28, 2016, 2:58 p.m. UTC | #3
On 06/28/2016 04:28 PM, Steve Wise wrote:
>
>>
>> Limit the number of SG elements per work request to what the queue
>> pair supports.
>>
>> Fixes: b99f8e4d7bcd ("IB/srpt: convert to the generic RDMA READ/WRITE API")
>
> I think the above is the wrong commit.  Maybe this one?
>
> commit 38a2d0d429f1d87315c55d9139b8bdf66d51c4f4
> Author: Christoph Hellwig <hch@lst.de>
> Date:   Tue May 3 18:01:13 2016 +0200
>
>     IB/isert: convert to the generic RDMA READ/WRITE API

Thanks Steve. I will fix this copy/paste error.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise June 28, 2016, 3:01 p.m. UTC | #4
> 
> On 06/28/2016 04:28 PM, Steve Wise wrote:
> >
> >>
> >> Limit the number of SG elements per work request to what the queue
> >> pair supports.
> >>
> >> Fixes: b99f8e4d7bcd ("IB/srpt: convert to the generic RDMA READ/WRITE API")
> >
> > I think the above is the wrong commit.  Maybe this one?
> >
> > commit 38a2d0d429f1d87315c55d9139b8bdf66d51c4f4
> > Author: Christoph Hellwig <hch@lst.de>
> > Date:   Tue May 3 18:01:13 2016 +0200
> >
> >     IB/isert: convert to the generic RDMA READ/WRITE API
> 
> Thanks Steve. I will fix this copy/paste error.

Patch 2 has the same problem, I think.

Stevo


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche June 28, 2016, 3:03 p.m. UTC | #5
On 06/28/2016 04:33 PM, Steve Wise wrote:
>> -----Original Message-----
>> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
>> owner@vger.kernel.org] On Behalf Of Bart Van Assche
>> Sent: Tuesday, June 28, 2016 6:27 AM
>> To: Doug Ledford
>> Cc: Christoph Hellwig; Sagi Grimberg; Nicholas A. Bellinger; Parav Pandit; Laurence
>> Oberman; linux-rdma@vger.kernel.org
>> Subject: [PATCH 3/5] IB/isert: Limit the number of SG elements per work request
>>
>> Limit the number of SG elements per work request to what the queue
>> pair supports.
>>
>
> Hey Bart, do you think this patch fixes this error we're seeing?
>
> isert: isert_rdma_rw_ctx_post: Cmd: ffff880f30eb0278 failed to post RDMA res

Hello Steve,

Do you perhaps know how many SGE elements the failed work request 
contains? The reason I came up with the max_sge patch is because I 
noticed that the ib_srpt driver reported ib_post_send() failures for 
certain RDMA work requests. A debug printk() confirmed that 
ib_post_send() failed because the following test in the mlx4 driver was hit:

		if (unlikely(wr->num_sge > qp->sq.max_gs)) {
			err = -EINVAL;
			*bad_wr = wr;
			goto out;
		}

For the mlx4 HCA in my test setup max_sge == 32, max_sge_rd == 30 and 
with the QP creation parameters I used max_gs == 28.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise June 28, 2016, 3:09 p.m. UTC | #6
> On 06/28/2016 04:33 PM, Steve Wise wrote:
> >> -----Original Message-----
> >> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> >> owner@vger.kernel.org] On Behalf Of Bart Van Assche
> >> Sent: Tuesday, June 28, 2016 6:27 AM
> >> To: Doug Ledford
> >> Cc: Christoph Hellwig; Sagi Grimberg; Nicholas A. Bellinger; Parav Pandit;
> Laurence
> >> Oberman; linux-rdma@vger.kernel.org
> >> Subject: [PATCH 3/5] IB/isert: Limit the number of SG elements per work request
> >>
> >> Limit the number of SG elements per work request to what the queue
> >> pair supports.
> >>
> >
> > Hey Bart, do you think this patch fixes this error we're seeing?
> >
> > isert: isert_rdma_rw_ctx_post: Cmd: ffff880f30eb0278 failed to post RDMA res
> 
> Hello Steve,
> 
> Do you perhaps know how many SGE elements the failed work request
> contains? The reason I came up with the max_sge patch is because I
> noticed that the ib_srpt driver reported ib_post_send() failures for
> certain RDMA work requests. A debug printk() confirmed that
> ib_post_send() failed because the following test in the mlx4 driver was hit:
> 
> 		if (unlikely(wr->num_sge > qp->sq.max_gs)) {
> 			err = -EINVAL;
> 			*bad_wr = wr;
> 			goto out;
> 		}
> 
> For the mlx4 HCA in my test setup max_sge == 32, max_sge_rd == 30 and
> with the QP creation parameters I used max_gs == 28.
> 
> Bart.
> 

I'll let you know as I gather more data.  For now I'm retrying with patches 1-3 of this series to see if it alleviates the problem.  

Stevo

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig June 28, 2016, 3:46 p.m. UTC | #7
>  	attr.cap.max_send_sge = device->ib_device->attrs.max_sge;
> -	isert_conn->max_sge = min(device->ib_device->attrs.max_sge,
> -				  device->ib_device->attrs.max_sge_rd);
>  	attr.cap.max_recv_sge = 1;
>  	attr.sq_sig_type = IB_SIGNAL_REQ_WR;
>  	attr.qp_type = IB_QPT_RC;
> @@ -151,6 +149,9 @@ isert_create_qp(struct isert_conn *isert_conn,
>  		return ERR_PTR(ret);
>  	}
>  
> +	isert_conn->max_send_sge = attr.cap.max_send_sge;
> +	isert_conn->max_recv_sge = attr.cap.max_recv_sge;

Can you explain how cap.max_recv_sge fits in here?  My understanding
is that it is about the number of SGEs for a receive WR...

> @@ -2087,12 +2087,12 @@ isert_rdma_rw_ctx_post(struct isert_cmd *cmd, struct isert_conn *conn,
>  		addr = cmd->write_va;
>  		rkey = cmd->write_stag;
>  		offset = cmd->iscsi_cmd->write_data_done;
> -		max_sge = dev->attrs.max_sge_rd;
> +		max_sge = conn->max_recv_sge;

.. and not about the number of SGEs for a RDMA READ WR.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche June 28, 2016, 4:11 p.m. UTC | #8
On 06/28/2016 05:46 PM, Christoph Hellwig wrote:
>>  	attr.cap.max_send_sge = device->ib_device->attrs.max_sge;
>> -	isert_conn->max_sge = min(device->ib_device->attrs.max_sge,
>> -				  device->ib_device->attrs.max_sge_rd);
>>  	attr.cap.max_recv_sge = 1;
>>  	attr.sq_sig_type = IB_SIGNAL_REQ_WR;
>>  	attr.qp_type = IB_QPT_RC;
>> @@ -151,6 +149,9 @@ isert_create_qp(struct isert_conn *isert_conn,
>>  		return ERR_PTR(ret);
>>  	}
>>
>> +	isert_conn->max_send_sge = attr.cap.max_send_sge;
>> +	isert_conn->max_recv_sge = attr.cap.max_recv_sge;
>
> Can you explain how cap.max_recv_sge fits in here?  My understanding
> is that it is about the number of SGEs for a receive WR...
>
>> @@ -2087,12 +2087,12 @@ isert_rdma_rw_ctx_post(struct isert_cmd *cmd, struct isert_conn *conn,
>>  		addr = cmd->write_va;
>>  		rkey = cmd->write_stag;
>>  		offset = cmd->iscsi_cmd->write_data_done;
>> -		max_sge = dev->attrs.max_sge_rd;
>> +		max_sge = conn->max_recv_sge;
>
> .. and not about the number of SGEs for a RDMA READ WR.

Indeed. I will update the ib_srpt and ib_isert patches accordingly.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche June 28, 2016, 6:41 p.m. UTC | #9
On 06/28/2016 05:09 PM, Steve Wise wrote:
> I'll let you know as I gather more data.  For now I'm retrying with patches 1-3 of this series to see if it alleviates the problem.

Hi Steve,

Can you use the code from the for-next branch in the following 
repository for your tests: https://github.com/bvanassche/linux.

I will post these patches tomorrow.

Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise June 28, 2016, 7:13 p.m. UTC | #10
> Hi Steve,
> 
> Can you use the code from the for-next branch in the following
> repository for your tests: https://github.com/bvanassche/linux.
> 

Which branch?

> I will post these patches tomorrow.
> 
> Thanks,
> 
> Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche June 30, 2016, 1:47 p.m. UTC | #11
On 06/28/2016 09:13 PM, Steve Wise wrote:
>> Can you use the code from the for-next branch in the following
>> repository for your tests: https://github.com/bvanassche/linux.
>
> Which branch?

The for-next branch. I thought I had mentioned this in my previous message?

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 06a5671..b5d6556 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -137,8 +137,6 @@  isert_create_qp(struct isert_conn *isert_conn,
 	attr.cap.max_recv_wr = ISERT_QP_MAX_RECV_DTOS + 1;
 	attr.cap.max_rdma_ctxs = ISCSI_DEF_XMIT_CMDS_MAX;
 	attr.cap.max_send_sge = device->ib_device->attrs.max_sge;
-	isert_conn->max_sge = min(device->ib_device->attrs.max_sge,
-				  device->ib_device->attrs.max_sge_rd);
 	attr.cap.max_recv_sge = 1;
 	attr.sq_sig_type = IB_SIGNAL_REQ_WR;
 	attr.qp_type = IB_QPT_RC;
@@ -151,6 +149,9 @@  isert_create_qp(struct isert_conn *isert_conn,
 		return ERR_PTR(ret);
 	}
 
+	isert_conn->max_send_sge = attr.cap.max_send_sge;
+	isert_conn->max_recv_sge = attr.cap.max_recv_sge;
+
 	return cma_id->qp;
 }
 
@@ -2075,7 +2076,6 @@  static int
 isert_rdma_rw_ctx_post(struct isert_cmd *cmd, struct isert_conn *conn,
 		struct ib_cqe *cqe, struct ib_send_wr *chain_wr)
 {
-	struct ib_device *dev = conn->device->ib_device;
 	struct se_cmd *se_cmd = &cmd->iscsi_cmd->se_cmd;
 	enum dma_data_direction dir = target_reverse_dma_direction(se_cmd);
 	u8 port_num = conn->cm_id->port_num;
@@ -2087,12 +2087,12 @@  isert_rdma_rw_ctx_post(struct isert_cmd *cmd, struct isert_conn *conn,
 		addr = cmd->write_va;
 		rkey = cmd->write_stag;
 		offset = cmd->iscsi_cmd->write_data_done;
-		max_sge = dev->attrs.max_sge_rd;
+		max_sge = conn->max_recv_sge;
 	} else {
 		addr = cmd->read_va;
 		rkey = cmd->read_stag;
 		offset = 0;
-		max_sge = dev->attrs.max_sge;
+		max_sge = conn->max_send_sge;
 	}
 
 	if (isert_prot_cmd(conn, se_cmd)) {
diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h
index e512ba9..ccd6927 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.h
+++ b/drivers/infiniband/ulp/isert/ib_isert.h
@@ -138,7 +138,8 @@  struct isert_conn {
 	u32			responder_resources;
 	u32			initiator_depth;
 	bool			pi_support;
-	u32			max_sge;
+	u32			max_send_sge;
+	u32			max_recv_sge;
 	struct iser_rx_desc	*login_req_buf;
 	char			*login_rsp_buf;
 	u64			login_req_dma;