Message ID | 462EF229174FDB4D92ACE4656EA561005DD2BD41@CMEXMB1.ad.emulex.com (mailing list archive) |
---|---|
State | Rejected |
On 3/6/2015 7:56 PM, Chris Moore wrote:
> isert_put_datain() always returns 1 and isert_get_dataout() always returns 0, even if
> ib_post_send() fails. They should return an error in this case so the caller can handle it.
> Also, in the case of an ib_post_send() failure, use isert_err instead of isert_warn.
>
> With these changes, these two functions handle errors from ib_post_send() in the
> same way as other functions within ib_isert.c
>

Hi Chris,

This is indeed needed, but I'm afraid this is not complete given the
rc is completely ignored by the callers (see
lio_queue_data_in/lio_write_pending).

Did you really see any difference with this patch?

> Signed-off-by: Chris Moore <chris.moore@emulex.com>
>
> ---
>
> diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
> index 075b19c..7394ba9 100644
> --- a/drivers/infiniband/ulp/isert/ib_isert.c
> +++ b/drivers/infiniband/ulp/isert/ib_isert.c
> @@ -2860,8 +2860,10 @@ isert_put_datain(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
>  	}
>
>  	rc = ib_post_send(isert_conn->conn_qp, wr->send_wr, &wr_failed);
> -	if (rc)
> -		isert_warn("ib_post_send() failed for IB_WR_RDMA_WRITE\n");
> +	if (rc) {
> +		isert_err("ib_post_send() failed for IB_WR_RDMA_WRITE\n");
> +		return rc;
> +	}
>
>  	if (!isert_prot_cmd(isert_conn, se_cmd))
>  		isert_dbg("Cmd: %p posted RDMA_WRITE + Response for iSER Data "
> @@ -2894,8 +2896,10 @@ isert_get_dataout(struct iscsi_conn *conn, struct iscsi_cmd *cmd, bool recovery)
>  	}
>
>  	rc = ib_post_send(isert_conn->conn_qp, wr->send_wr, &wr_failed);
> -	if (rc)
> -		isert_warn("ib_post_send() failed for IB_WR_RDMA_READ\n");
> +	if (rc) {
> +		isert_err("ib_post_send() failed for IB_WR_RDMA_READ\n");
> +		return rc;
> +	}
>
>  	isert_dbg("Cmd: %p posted RDMA_READ memory for ISER Data WRITE\n",
>  		  isert_cmd);
> ---
>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 2015-03-07 at 04:16 +0200, Sagi Grimberg wrote:
> On 3/6/2015 7:56 PM, Chris Moore wrote:
> > isert_put_datain() always returns 1 and isert_get_dataout() always returns 0, even if
> > ib_post_send() fails. They should return an error in this case so the caller can handle it.
> > Also, in the case of an ib_post_send() failure, use isert_err instead of isert_warn.
> >
> > With these changes, these two functions handle errors from ib_post_send() in the
> > same way as other functions within ib_isert.c
> >
>
> Hi Chris,
>
> This is indeed needed, but I'm afraid this is not complete given the
> rc is completely ignored by the callers (see
> lio_queue_data_in/lio_write_pending).
>

So lio_write_pending() is propagating up the return back to
transport_generic_new_cmd(). When the return is -EAGAIN or -ENOMEM,
it triggers transport_handle_queue_full() to retry ->write_pending()
from se_device->qf_work_queue context.

It's lio_queue_data_in() + lio_queue_status() that aren't propagating up
failures to trigger queue_full in target_complete_ok_work(). Looking at
this code again for traditional iscsi-target, I don't see a reason why
iscsit_add_cmd_to_response_queue() failure should not be triggering
queue_full logic to kick in..

On the iser-target side, is it OK for isert_put_datain() +
isert_put_response() to be re-invoked from transport_complete_qf()
context after ib_post_send() failure..?

--nab
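Editor's sketch: the propagation gap described in the message above can be illustrated in plain userspace C. Every name here (fake_post_send, queue_data_in_swallow, queue_data_in_propagate, complete_ok_work) is a hypothetical stand-in invented for illustration and is not the LIO or isert code; the only thing taken from the thread is that -EAGAIN and -ENOMEM are the return values that trigger the queue-full retry.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a transport post that can fail. */
static int fake_post_send(bool sq_full)
{
	return sq_full ? -ENOMEM : 0;
}

/* Swallows the error: the caller cannot tell that the post failed. */
static int queue_data_in_swallow(bool sq_full)
{
	(void)fake_post_send(sq_full);
	return 0;
}

/* Propagates the error so the caller can react to it. */
static int queue_data_in_propagate(bool sq_full)
{
	return fake_post_send(sq_full);
}

/*
 * Hypothetical completion path. Returns true when the response would be
 * handed to a queue-full retry, false when no retry is scheduled.
 */
static bool complete_ok_work(int (*queue_data_in)(bool), bool sq_full)
{
	int ret = queue_data_in(sq_full);

	return ret == -EAGAIN || ret == -ENOMEM;
}
```

With the swallowing variant the completion path never schedules a retry, even though the post failed; with the propagating variant it does.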
On 3/7/2015 9:19 AM, Nicholas A. Bellinger wrote:
> On Sat, 2015-03-07 at 04:16 +0200, Sagi Grimberg wrote:
>> On 3/6/2015 7:56 PM, Chris Moore wrote:
>>> isert_put_datain() always returns 1 and isert_get_dataout() always returns 0, even if
>>> ib_post_send() fails. They should return an error in this case so the caller can handle it.
>>> Also, in the case of an ib_post_send() failure, use isert_err instead of isert_warn.
>>>
>>> With these changes, these two functions handle errors from ib_post_send() in the
>>> same way as other functions within ib_isert.c
>>>
>>
>> Hi Chris,
>>
>> This is indeed needed, but I'm afraid this is not complete given the
>> rc is completely ignored by the callers (see
>> lio_queue_data_in/lio_write_pending).
>>
>
> So lio_write_pending() is propagating up the return back to
> transport_generic_new_cmd(). When the return is -EAGAIN or -ENOMEM,
> it triggers transport_handle_queue_full() to retry ->write_pending()
> from se_device->qf_work_queue context.

Ah, right...

>
> It's lio_queue_data_in() + lio_queue_status() that aren't propagating up
> failures to trigger queue_full in target_complete_ok_work(). Looking at
> this code again for traditional iscsi-target, I don't see a reason why
> iscsit_add_cmd_to_response_queue() failure should not be triggering
> queue_full logic to kick in..
>
> On the iser-target side, is it OK for isert_put_datain() +
> isert_put_response() to be re-invoked from transport_complete_qf()
> context after ib_post_send() failure..?

Well, generally the QP owner is obligated not to post more than the QP
size and/or to request a send completion once every SQ size. If we got
ENOMEM from ib_post_send, this usually indicates a bug, and there is no
sense in retrying later. I'm not aware of any provider that may return
EAGAIN at the moment, but maybe this can happen theoretically...

But I think the correct behavior from the iSCSI PoV is to have
ENOMEM/EAGAIN error codes from queue_data_in/queue_status trigger the
queue_full logic, and to terminate the session for any other
(non-transient) error code.

Sagi.
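Editor's sketch: the policy proposed in the message above (retry on ENOMEM/EAGAIN, terminate the session on anything else) can be written as a small classifier. The enum and function names are hypothetical and chosen only for illustration; this is a userspace sketch of the rule, not code from the target stack.

```c
#include <errno.h>

/* Hypothetical outcome of trying to queue data-in or status. */
enum post_action {
	POST_OK,        /* work request was accepted */
	POST_RETRY,     /* transient: hand off to queue-full retry */
	POST_TERMINATE, /* non-transient: tear the session down */
};

/* Map a negative-errno style return code to the action to take. */
static enum post_action classify_post_error(int rc)
{
	switch (rc) {
	case 0:
		return POST_OK;
	case -ENOMEM:
	case -EAGAIN:
		return POST_RETRY;
	default:
		return POST_TERMINATE;
	}
}
```

Keeping the decision in one place means each caller only has to act on three outcomes instead of interpreting raw error codes itself.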
> From: Sagi Grimberg [mailto:sagig@dev.mellanox.co.il]
> On 3/7/2015 9:19 AM, Nicholas A. Bellinger wrote:
> > On Sat, 2015-03-07 at 04:16 +0200, Sagi Grimberg wrote:
> >> On 3/6/2015 7:56 PM, Chris Moore wrote:
> >>> isert_put_datain() always returns 1 and isert_get_dataout() always returns 0, even if
> >>> ib_post_send() fails. They should return an error in this case so the caller can handle it.
> >>> Also, in the case of an ib_post_send() failure, use isert_err instead of isert_warn.
> >>>
> >>> With these changes, these two functions handle errors from
> >>> ib_post_send() in the same way as other functions within ib_isert.c
> >>>
> >>
> >> Hi Chris,
> >>
> >> This is indeed needed, but I'm afraid this is not complete given the
> >> rc is completely ignored by the callers (see
> >> lio_queue_data_in/lio_write_pending).
> >>
> >
> > So lio_write_pending() is propagating up the return back to
> > transport_generic_new_cmd(). When the return is -EAGAIN or -ENOMEM,
> > it triggers transport_handle_queue_full() to retry ->write_pending()
> > from se_device->qf_work_queue context.
>
> Ah, Right...
>
> >
> > It's lio_queue_data_in() + lio_queue_status() that aren't propagating
> > up failures to trigger queue_full in target_complete_ok_work().
> > Looking at this code again for traditional iscsi-target, I don't see a
> > reason why iscsit_add_cmd_to_response_queue() failure should not be
> > triggering queue_full logic to kick in..
> >
> > On the iser-target side, is it OK for isert_put_datain() +
> > isert_put_response() to be re-invoked from transport_complete_qf()
> > context after ib_post_send() failure..?
>
> Well, Generally the QP owner is obligated to not post more than the QP size
> and/or request for send completion once every SQ size. If we got ENOMEM
> from ib_post_send this usually indicates a bug, and there is no sense in
> retrying later, and I'm not aware of any provider that may return EAGAIN at
> the moment, but maybe this can happen theoretically...
>
> But I think the correct behavior from iSCSI PoV is to have ENOMEM/EAGAIN
> error codes from queue_data_in/queue_status trigger queue_full logic and
> terminate the session for any other (non-transient) error code.

Interesting, I missed that part. I am seeing ocrdma_post_send() fail
because it's out of QP entries. So maybe the real fix is to find out why
that's happening. Either the caller is posting more entries than it
should, or maybe ocrdma is reporting the wrong QP size. Any pointers to
where that gets checked?

If the target has received a SCSI READ, it's going to have to post back
one or more datain phases. Somewhere in the stack there has to be back
pressure so that the target layer doesn't try to send a datain if the QP
is full. I had assumed that was handled by the ENOMEM return and then
queue full processing, but it sounds like it should be caught before the
error even occurs.

Chris
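Editor's sketch: one way to picture the back pressure asked about above is a credit counter that refuses a post before the send queue can overflow. The structure and function names (sq_credits, sq_reserve, sq_complete) are hypothetical and the code is a userspace illustration only; it does not describe how isert or any RDMA provider actually accounts for send-queue entries.

```c
#include <errno.h>

/* Hypothetical send-queue accounting kept by the QP owner. */
struct sq_credits {
	unsigned int size;      /* send-queue depth chosen at QP creation */
	unsigned int in_flight; /* posted work requests not yet completed */
};

/* Reserve room for nr_wrs work requests before posting them. */
static int sq_reserve(struct sq_credits *sq, unsigned int nr_wrs)
{
	if (nr_wrs > sq->size - sq->in_flight)
		return -EAGAIN; /* caller should back off and retry later */
	sq->in_flight += nr_wrs;
	return 0;
}

/* Release credits when send completions are reaped. */
static void sq_complete(struct sq_credits *sq, unsigned int nr_wrs)
{
	/* Never release more credits than are outstanding. */
	if (nr_wrs > sq->in_flight)
		nr_wrs = sq->in_flight;
	sq->in_flight -= nr_wrs;
}
```

In this model the -EAGAIN comes from the owner's own bookkeeping, so a post is never attempted against a full queue and the provider's ENOMEM path is not reached in normal operation.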
On 3/9/2015 5:30 PM, Chris Moore wrote:
>> From: Sagi Grimberg [mailto:sagig@dev.mellanox.co.il]
>> On 3/7/2015 9:19 AM, Nicholas A. Bellinger wrote:
>>> On Sat, 2015-03-07 at 04:16 +0200, Sagi Grimberg wrote:
>>>> On 3/6/2015 7:56 PM, Chris Moore wrote:
>>>>> isert_put_datain() always returns 1 and isert_get_dataout() always returns 0, even if
>>>>> ib_post_send() fails. They should return an error in this case so the caller can handle it.
>>>>> Also, in the case of an ib_post_send() failure, use isert_err instead of isert_warn.
>>>>>
>>>>> With these changes, these two functions handle errors from
>>>>> ib_post_send() in the same way as other functions within ib_isert.c
>>>>>
>>>>
>>>> Hi Chris,
>>>>
>>>> This is indeed needed, but I'm afraid this is not complete given the
>>>> rc is completely ignored by the callers (see
>>>> lio_queue_data_in/lio_write_pending).
>>>>
>>>
>>> So lio_write_pending() is propagating up the return back to
>>> transport_generic_new_cmd(). When the return is -EAGAIN or -ENOMEM,
>>> it triggers transport_handle_queue_full() to retry ->write_pending()
>>> from se_device->qf_work_queue context.
>>
>> Ah, Right...
>>
>>>
>>> It's lio_queue_data_in() + lio_queue_status() that aren't propagating
>>> up failures to trigger queue_full in target_complete_ok_work().
>>> Looking at this code again for traditional iscsi-target, I don't see a
>>> reason why iscsit_add_cmd_to_response_queue() failure should not be
>>> triggering queue_full logic to kick in..
>>>
>>> On the iser-target side, is it OK for isert_put_datain() +
>>> isert_put_response() to be re-invoked from transport_complete_qf()
>>> context after ib_post_send() failure..?
>>
>> Well, Generally the QP owner is obligated to not post more than the QP size
>> and/or request for send completion once every SQ size. If we got ENOMEM
>> from ib_post_send this usually indicates a bug, and there is no sense in
>> retrying later, and I'm not aware of any provider that may return EAGAIN at
>> the moment, but maybe this can happen theoretically...
>>
>> But I think the correct behavior from iSCSI PoV is to have ENOMEM/EAGAIN
>> error codes from queue_data_in/queue_status trigger queue_full logic and
>> terminate the session for any other (non-transient) error code.
>
> Interesting, I missed that part. I am seeing ocrdma_post_send() fail because it's
> out of QP entries. So maybe the real fix is to find out why that's happening.

Right...

> Either the
> caller is posting more entries than it should, or maybe ocrdma is reporting the
> wrong QP size. Any pointers to where that gets checked?

The iser target ignores the device capabilities for SQ size at the
moment (BUG), so I doubt it has anything to do with the ocrdma QP size
report.

> If the target has received
> a SCSI READ it's going to have to post back one or more datain phases. Somewhere
> in the stack there has to be back pressure so that the target layer doesn't try to
> send a datain if the QP is full. I had assumed that was handled by the ENOMEM
> return and then queue full processing, but it sounds like it should be caught before
> the error even occurs.

I still think the error should be propagated and not ignored by iscsit.
As I mentioned, transient errors should be retried later and
non-transient errors should shut down the connection.

What was the initiator cmds_max you were running with? There is a bug I
know of where the initiator and target do not really sync the number of
inflight commands (see MaxOutstandingUnexpectedPDUs). That might be
related.

Sagi.
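Editor's sketch: the bug mentioned above (the SQ size being chosen without consulting device capabilities) amounts to a missing clamp. The snippet below is a userspace illustration under stated assumptions: struct dev_caps and pick_sq_depth are hypothetical names, and the single field is modeled on the max_qp_wr limit that the verbs device attributes report. It is not the isert QP setup code.

```c
/*
 * Hypothetical subset of device attributes. The field is named after
 * the verbs max_qp_wr limit (maximum outstanding work requests per
 * queue), which is the capability a QP owner would consult.
 */
struct dev_caps {
	unsigned int max_qp_wr;
};

/* Pick a send-queue depth that never exceeds what the device supports. */
static unsigned int pick_sq_depth(const struct dev_caps *caps,
				  unsigned int wanted)
{
	return wanted < caps->max_qp_wr ? wanted : caps->max_qp_wr;
}
```

A depth chosen this way is the value the owner's own in-flight accounting should then be checked against.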
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 075b19c..7394ba9 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -2860,8 +2860,10 @@ isert_put_datain(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
 	}
 
 	rc = ib_post_send(isert_conn->conn_qp, wr->send_wr, &wr_failed);
-	if (rc)
-		isert_warn("ib_post_send() failed for IB_WR_RDMA_WRITE\n");
+	if (rc) {
+		isert_err("ib_post_send() failed for IB_WR_RDMA_WRITE\n");
+		return rc;
+	}
 
 	if (!isert_prot_cmd(isert_conn, se_cmd))
 		isert_dbg("Cmd: %p posted RDMA_WRITE + Response for iSER Data "
@@ -2894,8 +2896,10 @@ isert_get_dataout(struct iscsi_conn *conn, struct iscsi_cmd *cmd, bool recovery)
 	}
 
 	rc = ib_post_send(isert_conn->conn_qp, wr->send_wr, &wr_failed);
-	if (rc)
-		isert_warn("ib_post_send() failed for IB_WR_RDMA_READ\n");
+	if (rc) {
+		isert_err("ib_post_send() failed for IB_WR_RDMA_READ\n");
+		return rc;
+	}
 
 	isert_dbg("Cmd: %p posted RDMA_READ memory for ISER Data WRITE\n",
 		  isert_cmd);
isert_put_datain() always returns 1 and isert_get_dataout() always returns 0, even if
ib_post_send() fails. They should return an error in this case so the caller can handle it.
Also, in the case of an ib_post_send() failure, use isert_err instead of isert_warn.

With these changes, these two functions handle errors from ib_post_send() in the
same way as other functions within ib_isert.c.

Signed-off-by: Chris Moore <chris.moore@emulex.com>