diff mbox

IB/isert: Return value of iser target transport handlers ignored by iscsi target

Message ID 8f3ad0ed-a17d-571a-7e1a-eb61caa263c9@grimberg.me (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Sagi Grimberg Aug. 7, 2016, 5:09 p.m. UTC
> Hi,

Hi Baharat,

> In iSER target during iwarp connection tear-down due to ping timeouts, the rdma
> queues are set to error state and subsequent queued iscsi session commands posted
> shall fail with corresponding errno returned by ib_post_send/recv.
> At this stage iser target handlers (Ex: isert_put_datain())
> return the error to iscsci target, but these errors are
> not being handled by the iscsi target handlers(Ex: lio_queue_status()).

Indeed looks like lio_queue_data_in() assumes that
->iscsit_queue_data_in() cannot fail. This either mean
that isert_put_data_in() should take care of
the extra put in this case, or the iscsi should correctly
handle a queue_full condition (because we should not be hitting
this in the normal IO flow).

Does this (untested) patch help?
--
         iscsit_build_rsp_pdu(cmd, conn, true, hdr);
@@ -1848,7 +1849,10 @@ isert_put_response(struct iscsi_conn *conn, 
struct iscsi_cmd *cmd)

         isert_dbg("Posting SCSI Response\n");

-       return isert_post_response(isert_conn, isert_cmd);
+       ret = isert_post_response(isert_conn, isert_cmd);
+       if (ret)
+               target_put_sess_cmd(&cmd->se_cmd);
+       return 0;
  }

  static void
@@ -2128,7 +2132,7 @@ isert_put_datain(struct iscsi_conn *conn, struct 
iscsi_cmd *cmd)
         struct isert_conn *isert_conn = conn->context;
         struct ib_cqe *cqe = NULL;
         struct ib_send_wr *chain_wr = NULL;
-       int rc;
+       int ret;

         isert_dbg("Cmd: %p RDMA_WRITE data_length: %u\n",
                  isert_cmd, se_cmd->data_length);
@@ -2148,34 +2152,41 @@ isert_put_datain(struct iscsi_conn *conn, struct 
iscsi_cmd *cmd)
                 isert_init_send_wr(isert_conn, isert_cmd,
                                    &isert_cmd->tx_desc.send_wr);

-               rc = isert_post_recv(isert_conn, isert_cmd->rx_desc);
-               if (rc) {
-                       isert_err("ib_post_recv failed with %d\n", rc);
-                       return rc;
+               ret = isert_post_recv(isert_conn, isert_cmd->rx_desc);
+               if (ret) {
+                       isert_err("ib_post_recv failed with %d\n", ret);
+                       goto out;
                 }

                 chain_wr = &isert_cmd->tx_desc.send_wr;
         }

-       isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, chain_wr);
+       ret = isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, chain_wr);
         isert_dbg("Cmd: %p posted RDMA_WRITE for iSER Data READ\n", 
isert_cmd);
-       return 1;
+out:
+       if (ret)
+               target_put_sess_cmd(&cmd->se_cmd);
+       return 0;
  }

  static int
  isert_get_dataout(struct iscsi_conn *conn, struct iscsi_cmd *cmd, bool 
recovery)
  {
         struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
+       int ret;

         isert_dbg("Cmd: %p RDMA_READ data_length: %u write_data_done: 
%u\n",
                  isert_cmd, cmd->se_cmd.data_length, cmd->write_data_done);

         isert_cmd->tx_desc.tx_cqe.done = isert_rdma_read_done;
-       isert_rdma_rw_ctx_post(isert_cmd, conn->context,
+       ret = isert_rdma_rw_ctx_post(isert_cmd, conn->context,
                         &isert_cmd->tx_desc.tx_cqe, NULL);

         isert_dbg("Cmd: %p posted RDMA_READ memory for ISER Data WRITE\n",
                  isert_cmd);+out:
+       if (ret)
+               target_put_sess_cmd(&cmd->se_cmd);
         return 0;
  }

--

>
> -> While closing the session in case of ping timeouts, isert_close_connection()->
> isert_wait_conn()->isert_wait4cmds() checks for the queued session commands
> and waits infinitely for command completion 'cmd_wait_comp' in target_wait_for_sess_cmds().
> 'cmd_wait_comp' will be never complete as the kref on the session command is
> not derefed, due to which target_release_cmd_kref() is not called by kref_put().
> Due to this, the older session is not cleared causing the next login negotiation to fail
> as the older session is still active(Older SID exists).

Makes sense...

> [Query 1] If the return value of ib_post_send/recv() are handled to deref the
> corresponding queued session commands, the wait on cmd_wait_comp will be
> complete and clears off the session successfully. Is this the rightway to
> do it here?

I think that given that ->iscsit_queue_data_in() and
->iscsit_queue_status() are never expected to fail we probably
should take care of it in isert...

Nic, any input on the iscsit side?

>
> [Query 2] An extra deref is done in case of transport_status CMD_T_TAS
> in target_wait_for_sess_cmds(), can similar
> implementation be done for transport state CMD_T_FABRIC_STOP?

I think that can work also given that target_sess_cmd_list_set_waiting()
is indeed set when we are starting teardown. Not sure how this will
affect other transports though...
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Potnuri Bharat Teja Aug. 18, 2016, 12:03 p.m. UTC | #1
On Sunday, August 08/07/16, 2016 at 20:09:08 +0300, Sagi Grimberg wrote:
> >Hi,
> 
> Hi Baharat,
> 
> >In iSER target during iwarp connection tear-down due to ping timeouts, the rdma
> >queues are set to error state and subsequent queued iscsi session commands posted
> >shall fail with corresponding errno returned by ib_post_send/recv.
> >At this stage iser target handlers (Ex: isert_put_datain())
> >return the error to iscsci target, but these errors are
> >not being handled by the iscsi target handlers(Ex: lio_queue_status()).
> 
> Indeed looks like lio_queue_data_in() assumes that
> ->iscsit_queue_data_in() cannot fail. This either mean
> that isert_put_data_in() should take care of
> the extra put in this case, or the iscsi should correctly
> handle a queue_full condition (because we should not be hitting
> this in the normal IO flow).
> 
> Does this (untested) patch help?
> --
> diff --git a/drivers/infiniband/ulp/isert/ib_isert.c
> b/drivers/infiniband/ulp/isert/ib_isert.c
> index a990c04208c9..ff5dfc0f7c50 100644
> --- a/drivers/infiniband/ulp/isert/ib_isert.c
> +++ b/drivers/infiniband/ulp/isert/ib_isert.c
> @@ -1810,6 +1810,7 @@ isert_put_response(struct iscsi_conn *conn, struct
> iscsi_cmd *cmd)
>         struct ib_send_wr *send_wr = &isert_cmd->tx_desc.send_wr;
>         struct iscsi_scsi_rsp *hdr = (struct iscsi_scsi_rsp *)
>                                 &isert_cmd->tx_desc.iscsi_header;
> +       int ret;
> 
>         isert_create_send_desc(isert_conn, isert_cmd, &isert_cmd->tx_desc);
>         iscsit_build_rsp_pdu(cmd, conn, true, hdr);
> @@ -1848,7 +1849,10 @@ isert_put_response(struct iscsi_conn *conn, struct
> iscsi_cmd *cmd)
> 
>         isert_dbg("Posting SCSI Response\n");
> 
> -       return isert_post_response(isert_conn, isert_cmd);
> +       ret = isert_post_response(isert_conn, isert_cmd);
> +       if (ret)
> +               target_put_sess_cmd(&cmd->se_cmd);
> +       return 0;
>  }
> 
>  static void
> @@ -2128,7 +2132,7 @@ isert_put_datain(struct iscsi_conn *conn, struct
> iscsi_cmd *cmd)
>         struct isert_conn *isert_conn = conn->context;
>         struct ib_cqe *cqe = NULL;
>         struct ib_send_wr *chain_wr = NULL;
> -       int rc;
> +       int ret;
> 
>         isert_dbg("Cmd: %p RDMA_WRITE data_length: %u\n",
>                  isert_cmd, se_cmd->data_length);
> @@ -2148,34 +2152,41 @@ isert_put_datain(struct iscsi_conn *conn, struct
> iscsi_cmd *cmd)
>                 isert_init_send_wr(isert_conn, isert_cmd,
>                                    &isert_cmd->tx_desc.send_wr);
> 
> -               rc = isert_post_recv(isert_conn, isert_cmd->rx_desc);
> -               if (rc) {
> -                       isert_err("ib_post_recv failed with %d\n", rc);
> -                       return rc;
> +               ret = isert_post_recv(isert_conn, isert_cmd->rx_desc);
> +               if (ret) {
> +                       isert_err("ib_post_recv failed with %d\n", ret);
> +                       goto out;
>                 }
> 
>                 chain_wr = &isert_cmd->tx_desc.send_wr;
>         }
> 
> -       isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, chain_wr);
> +       ret = isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, chain_wr);
>         isert_dbg("Cmd: %p posted RDMA_WRITE for iSER Data READ\n",
> isert_cmd);
> -       return 1;
> +out:
> +       if (ret)
> +               target_put_sess_cmd(&cmd->se_cmd);
> +       return 0;
>  }
> 
>  static int
>  isert_get_dataout(struct iscsi_conn *conn, struct iscsi_cmd *cmd, bool
> recovery)
>  {
>         struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
> +       int ret;
> 
>         isert_dbg("Cmd: %p RDMA_READ data_length: %u write_data_done: %u\n",
>                  isert_cmd, cmd->se_cmd.data_length, cmd->write_data_done);
> 
>         isert_cmd->tx_desc.tx_cqe.done = isert_rdma_read_done;
> -       isert_rdma_rw_ctx_post(isert_cmd, conn->context,
> +       ret = isert_rdma_rw_ctx_post(isert_cmd, conn->context,
>                         &isert_cmd->tx_desc.tx_cqe, NULL);
> 
>         isert_dbg("Cmd: %p posted RDMA_READ memory for ISER Data WRITE\n",
>                  isert_cmd);+out:
> +       if (ret)
> +               target_put_sess_cmd(&cmd->se_cmd);
>         return 0;
>  }
> 
> --
>
Hi Sagi,
We found another issue in cq poll mechanism within iwarp module,
masking the above failure, so had to fix it first. Now I am able to
reproduce the actual stall due to pending underefed commands in command
list.
The above patch doesn't hold good for all cases and I see the same stall
waiting for completion event. Debugging further there are few more post
commands like isert_put_nopin(), which are also needed to be modified to
handle the post send failures.
I tried with few more changes in addition to the above change to handle
all possible post send/recv failures during ping timeout. Here is the  
snippet of change I tried:

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index ba6be06..a96f16c 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
static void
@@ -1892,6 +1897,7 @@ isert_put_nopin(struct iscsi_cmd *cmd, struct
iscsi_conn *conn,
        struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
        struct isert_conn *isert_conn = conn->context;
        struct ib_send_wr *send_wr =
			 &isert_cmd->tx_desc.send_wr;
+       int ret;

	isert_create_send_desc(isert_conn,
				 isert_cmd, &isert_cmd->tx_desc);
	iscsit_build_nopin_rsp(cmd, conn, (struct iscsi_nopin *)
@@ -1902,7 +1908,11 @@
	isert_put_nopin(struct iscsi_cmd *cmd, struct
					 iscsi_conn *conn,

	isert_dbg("conn %p Posting NOPIN Response\n", isert_conn);

-       return isert_post_response(isert_conn, isert_cmd);
+       ret = isert_post_response(isert_conn, isert_cmd);
+       if (ret)
+		 target_put_sess_cmd(&cmd->se_cmd);
+
+       return ret;
}

Similar change in all function calls of switch case of
isert_response_queue().

Yet that doesnt solve it all. With all above changes I see random
crashes due to null pointer dereference. I believe these crashes are due
to some race between isert threads.

I am currently dubugging those failures to see if our changes are proper
fix for the issue.

Will update with more details.

> >
> >-> While closing the session in case of ping timeouts, isert_close_connection()->
> >isert_wait_conn()->isert_wait4cmds() checks for the queued session commands
> >and waits infinitely for command completion 'cmd_wait_comp' in target_wait_for_sess_cmds().
> >'cmd_wait_comp' will be never complete as the kref on the session command is
> >not derefed, due to which target_release_cmd_kref() is not called by kref_put().
> >Due to this, the older session is not cleared causing the next login negotiation to fail
> >as the older session is still active(Older SID exists).
> 
> Makes sense...
> 
> >[Query 1] If the return value of ib_post_send/recv() are handled to deref the
> >corresponding queued session commands, the wait on cmd_wait_comp will be
> >complete and clears off the session successfully. Is this the rightway to
> >do it here?
> 
> I think that given that ->iscsit_queue_data_in() and
> ->iscsit_queue_status() are never expected to fail we probably
> should take care of it in isert...
> 
> Nic, any input on the iscsit side?
It is the same on the iscsit side too, Most of the code doesnt expect
for return value or failures.
> 
> >
> >[Query 2] An extra deref is done in case of transport_status CMD_T_TAS
> >in target_wait_for_sess_cmds(), can similar
> >implementation be done for transport state CMD_T_FABRIC_STOP?
> 
> I think that can work also given that target_sess_cmd_list_set_waiting()
> is indeed set when we are starting teardown. Not sure how this will
> affect other transports though...
Thanks,
Bharat.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c 
b/drivers/infiniband/ulp/isert/ib_isert.c
index a990c04208c9..ff5dfc0f7c50 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -1810,6 +1810,7 @@  isert_put_response(struct iscsi_conn *conn, struct 
iscsi_cmd *cmd)
         struct ib_send_wr *send_wr = &isert_cmd->tx_desc.send_wr;
         struct iscsi_scsi_rsp *hdr = (struct iscsi_scsi_rsp *)
                                 &isert_cmd->tx_desc.iscsi_header;
+       int ret;

         isert_create_send_desc(isert_conn, isert_cmd, &isert_cmd->tx_desc);