Message ID | 885f39a6-9d75-9999-d582-e403f072bec1@sandisk.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
> > Limit the number of SG elements per work request to what the queue > pair supports. > > Fixes: b99f8e4d7bcd ("IB/srpt: convert to the generic RDMA READ/WRITE API") I think the above is the wrong commit. Maybe this one? commit 38a2d0d429f1d87315c55d9139b8bdf66d51c4f4 Author: Christoph Hellwig <hch@lst.de> Date: Tue May 3 18:01:13 2016 +0200 IB/isert: convert to the generic RDMA READ/WRITE API -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma- > owner@vger.kernel.org] On Behalf Of Bart Van Assche > Sent: Tuesday, June 28, 2016 6:27 AM > To: Doug Ledford > Cc: Christoph Hellwig; Sagi Grimberg; Nicholas A. Bellinger; Parav Pandit; Laurence > Oberman; linux-rdma@vger.kernel.org > Subject: [PATCH 3/5] IB/isert: Limit the number of SG elements per work request > > Limit the number of SG elements per work request to what the queue > pair supports. > Hey Bart, do you think this patch fixes this error we're seeing? isert: isert_rdma_rw_ctx_post: Cmd: ffff880f30eb0278 failed to post RDMA res isert: isert_rdma_rw_ctx_post: Cmd: ffff880f30e85ef8 failed to post RDMA res ABORT_TASK: Found referenced iSCSI task_tag: 14 ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 14 ABORT_TASK: Found referenced iSCSI task_tag: 40 ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 40 isert: isert_rdma_rw_ctx_post: Cmd: ffff880f311519f0 failed to post RDMA res ABORT_TASK: Found referenced iSCSI task_tag: 70 ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 70 isert: isert_rdma_rw_ctx_post: Cmd: ffff880f311720a0 failed to post RDMA res ABORT_TASK: Found referenced iSCSI task_tag: 86 ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 86 isert: isert_rdma_rw_ctx_post: Cmd: ffff880f31160528 failed to post RDMA res isert: isert_rdma_rw_ctx_post: Cmd: ffff880f30ead238 failed to post RDMA res isert: isert_rdma_rw_ctx_post: Cmd: ffff880f30ea8fe0 failed to post RDMA res ABORT_TASK: Found referenced iSCSI task_tag: 124 ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 124 ABORT_TASK: Found referenced iSCSI task_tag: 120 ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 120 ABORT_TASK: Found referenced iSCSI task_tag: 111 ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 111 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 06/28/2016 04:28 PM, Steve Wise wrote: > >> >> Limit the number of SG elements per work request to what the queue >> pair supports. >> >> Fixes: b99f8e4d7bcd ("IB/srpt: convert to the generic RDMA READ/WRITE API") > > I think the above is the wrong commit. Maybe this one? > > commit 38a2d0d429f1d87315c55d9139b8bdf66d51c4f4 > Author: Christoph Hellwig <hch@lst.de> > Date: Tue May 3 18:01:13 2016 +0200 > > IB/isert: convert to the generic RDMA READ/WRITE API Thanks Steve. I will fix this copy/paste error. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > On 06/28/2016 04:28 PM, Steve Wise wrote: > > > >> > >> Limit the number of SG elements per work request to what the queue > >> pair supports. > >> > >> Fixes: b99f8e4d7bcd ("IB/srpt: convert to the generic RDMA READ/WRITE API") > > > > I think the above is the wrong commit. Maybe this one? > > > > commit 38a2d0d429f1d87315c55d9139b8bdf66d51c4f4 > > Author: Christoph Hellwig <hch@lst.de> > > Date: Tue May 3 18:01:13 2016 +0200 > > > > IB/isert: convert to the generic RDMA READ/WRITE API > > Thanks Steve. I will fix this copy/paste error. Patch 2 has the same problem, I think. Stevo -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 06/28/2016 04:33 PM, Steve Wise wrote: >> -----Original Message----- >> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma- >> owner@vger.kernel.org] On Behalf Of Bart Van Assche >> Sent: Tuesday, June 28, 2016 6:27 AM >> To: Doug Ledford >> Cc: Christoph Hellwig; Sagi Grimberg; Nicholas A. Bellinger; Parav Pandit; Laurence >> Oberman; linux-rdma@vger.kernel.org >> Subject: [PATCH 3/5] IB/isert: Limit the number of SG elements per work request >> >> Limit the number of SG elements per work request to what the queue >> pair supports. >> > > Hey Bart, do you think this patch fixes this error we're seeing? > > isert: isert_rdma_rw_ctx_post: Cmd: ffff880f30eb0278 failed to post RDMA res Hello Steve, Do you perhaps know how many SGE elements the failed work request contains? The reason I came up with the max_sge patch is because I noticed that the ib_srpt driver reported ib_post_send() failures for certain RDMA work requests. A debug printk() confirmed that ib_post_send() failed because the following test in the mlx4 driver was hit: if (unlikely(wr->num_sge > qp->sq.max_gs)) { err = -EINVAL; *bad_wr = wr; goto out; } For the mlx4 HCA in my test setup max_sge == 32, max_sge_rd == 30 and with the QP creation parameters I used max_gs == 28. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On 06/28/2016 04:33 PM, Steve Wise wrote: > >> -----Original Message----- > >> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma- > >> owner@vger.kernel.org] On Behalf Of Bart Van Assche > >> Sent: Tuesday, June 28, 2016 6:27 AM > >> To: Doug Ledford > >> Cc: Christoph Hellwig; Sagi Grimberg; Nicholas A. Bellinger; Parav Pandit; > Laurence > >> Oberman; linux-rdma@vger.kernel.org > >> Subject: [PATCH 3/5] IB/isert: Limit the number of SG elements per work request > >> > >> Limit the number of SG elements per work request to what the queue > >> pair supports. > >> > > > > Hey Bart, do you think this patch fixes this error we're seeing? > > > > isert: isert_rdma_rw_ctx_post: Cmd: ffff880f30eb0278 failed to post RDMA res > > Hello Steve, > > Do you perhaps know how many SGE elements the failed work request > contains? The reason I came up with the max_sge patch is because I > noticed that the ib_srpt driver reported ib_post_send() failures for > certain RDMA work requests. A debug printk() confirmed that > ib_post_send() failed because the following test in the mlx4 driver was hit: > > if (unlikely(wr->num_sge > qp->sq.max_gs)) { > err = -EINVAL; > *bad_wr = wr; > goto out; > } > > For the mlx4 HCA in my test setup max_sge == 32, max_sge_rd == 30 and > with the QP creation parameters I used max_gs == 28. > > Bart. > I'll let you know as I gather more data. For now I'm retrying with patches 1-3 of this series to see if it alleviates the problem. Stevo -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> attr.cap.max_send_sge = device->ib_device->attrs.max_sge; > - isert_conn->max_sge = min(device->ib_device->attrs.max_sge, > - device->ib_device->attrs.max_sge_rd); > attr.cap.max_recv_sge = 1; > attr.sq_sig_type = IB_SIGNAL_REQ_WR; > attr.qp_type = IB_QPT_RC; > @@ -151,6 +149,9 @@ isert_create_qp(struct isert_conn *isert_conn, > return ERR_PTR(ret); > } > > + isert_conn->max_send_sge = attr.cap.max_send_sge; > + isert_conn->max_recv_sge = attr.cap.max_recv_sge; Can you explain how cap.max_recv_sge fits in here? My understanding is that it is about the number of SGEs for a receive WR... > @@ -2087,12 +2087,12 @@ isert_rdma_rw_ctx_post(struct isert_cmd *cmd, struct isert_conn *conn, > addr = cmd->write_va; > rkey = cmd->write_stag; > offset = cmd->iscsi_cmd->write_data_done; > - max_sge = dev->attrs.max_sge_rd; > + max_sge = conn->max_recv_sge; .. and not about the number of SGEs for a RDMA READ WR. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 06/28/2016 05:46 PM, Christoph Hellwig wrote: >> attr.cap.max_send_sge = device->ib_device->attrs.max_sge; >> - isert_conn->max_sge = min(device->ib_device->attrs.max_sge, >> - device->ib_device->attrs.max_sge_rd); >> attr.cap.max_recv_sge = 1; >> attr.sq_sig_type = IB_SIGNAL_REQ_WR; >> attr.qp_type = IB_QPT_RC; >> @@ -151,6 +149,9 @@ isert_create_qp(struct isert_conn *isert_conn, >> return ERR_PTR(ret); >> } >> >> + isert_conn->max_send_sge = attr.cap.max_send_sge; >> + isert_conn->max_recv_sge = attr.cap.max_recv_sge; > > Can you explain how cap.max_recv_sge fits in here? My understanding > is that it is about the number of SGEs for a receive WR... > >> @@ -2087,12 +2087,12 @@ isert_rdma_rw_ctx_post(struct isert_cmd *cmd, struct isert_conn *conn, >> addr = cmd->write_va; >> rkey = cmd->write_stag; >> offset = cmd->iscsi_cmd->write_data_done; >> - max_sge = dev->attrs.max_sge_rd; >> + max_sge = conn->max_recv_sge; > > .. and not about the number of SGEs for a RDMA READ WR. Indeed. I will update the ib_srpt and ib_isert patches accordingly. Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 06/28/2016 05:09 PM, Steve Wise wrote:
> I'll let you know as I gather more data. For now I'm retrying with patches 1-3 of this series to see if it alleviates the problem.
Hi Steve,
Can you use the code from the for-next branch in the following
repository for your tests: https://github.com/bvanassche/linux.
I will post these patches tomorrow.
Thanks,
Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
> Hi Steve, > > Can you use the code from the for-next branch in the following > repository for your tests: https://github.com/bvanassche/linux. > Which branch? > I will post these patches tomorrow. > > Thanks, > > Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 06/28/2016 09:13 PM, Steve Wise wrote: >> Can you use the code from the for-next branch in the following >> repository for your tests: https://github.com/bvanassche/linux. > > Which branch? The for-next branch. I thought I had mentioned this in my previous message? Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 06a5671..b5d6556 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -137,8 +137,6 @@ isert_create_qp(struct isert_conn *isert_conn, attr.cap.max_recv_wr = ISERT_QP_MAX_RECV_DTOS + 1; attr.cap.max_rdma_ctxs = ISCSI_DEF_XMIT_CMDS_MAX; attr.cap.max_send_sge = device->ib_device->attrs.max_sge; - isert_conn->max_sge = min(device->ib_device->attrs.max_sge, - device->ib_device->attrs.max_sge_rd); attr.cap.max_recv_sge = 1; attr.sq_sig_type = IB_SIGNAL_REQ_WR; attr.qp_type = IB_QPT_RC; @@ -151,6 +149,9 @@ isert_create_qp(struct isert_conn *isert_conn, return ERR_PTR(ret); } + isert_conn->max_send_sge = attr.cap.max_send_sge; + isert_conn->max_recv_sge = attr.cap.max_recv_sge; + return cma_id->qp; } @@ -2075,7 +2076,6 @@ static int isert_rdma_rw_ctx_post(struct isert_cmd *cmd, struct isert_conn *conn, struct ib_cqe *cqe, struct ib_send_wr *chain_wr) { - struct ib_device *dev = conn->device->ib_device; struct se_cmd *se_cmd = &cmd->iscsi_cmd->se_cmd; enum dma_data_direction dir = target_reverse_dma_direction(se_cmd); u8 port_num = conn->cm_id->port_num; @@ -2087,12 +2087,12 @@ isert_rdma_rw_ctx_post(struct isert_cmd *cmd, struct isert_conn *conn, addr = cmd->write_va; rkey = cmd->write_stag; offset = cmd->iscsi_cmd->write_data_done; - max_sge = dev->attrs.max_sge_rd; + max_sge = conn->max_recv_sge; } else { addr = cmd->read_va; rkey = cmd->read_stag; offset = 0; - max_sge = dev->attrs.max_sge; + max_sge = conn->max_send_sge; } if (isert_prot_cmd(conn, se_cmd)) { diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h index e512ba9..ccd6927 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.h +++ b/drivers/infiniband/ulp/isert/ib_isert.h @@ -138,7 +138,8 @@ struct isert_conn { u32 responder_resources; u32 initiator_depth; bool pi_support; - u32 max_sge; + u32 max_send_sge; + u32 max_recv_sge; struct iser_rx_desc *login_req_buf; char *login_rsp_buf; u64 login_req_dma;
Limit the number of SG elements per work request to what the queue pair supports. Fixes: b99f8e4d7bcd ("IB/srpt: convert to the generic RDMA READ/WRITE API") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: <stable@vger.kernel.org> #v4.7+ Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Parav Pandit <pandit.parav@gmail.com> Cc: Laurence Oberman <loberman@redhat.com> --- drivers/infiniband/ulp/isert/ib_isert.c | 10 +++++----- drivers/infiniband/ulp/isert/ib_isert.h | 3 ++- 2 files changed, 7 insertions(+), 6 deletions(-)