Message ID | 20170519215344.2168-21-himanshu.madhani@cavium.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
On Fri, 2017-05-19 at 14:53 -0700, Himanshu Madhani wrote: > From: Quinn Tran <quinn.tran@cavium.com> > > During ABTS or Abort task, qla2xxx does a pre-search for > the se_cmd, based on command's tag. The same search is > performed by TCM. Remove the extra search from qla2xxx. > > Signed-off-by: Quinn Tran <quinn.tran@cavium.com> > Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> > --- > drivers/scsi/qla2xxx/qla_target.c | 29 ++++------------------------- > 1 file changed, 4 insertions(+), 25 deletions(-) > > diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c > index 21e8993baf4b..b8e609ae6cff 100644 > --- a/drivers/scsi/qla2xxx/qla_target.c > +++ b/drivers/scsi/qla2xxx/qla_target.c > @@ -1836,34 +1836,13 @@ static int __qlt_24xx_handle_abts(struct scsi_qla_host *vha, > struct abts_recv_from_24xx *abts, struct fc_port *sess) > { > struct qla_hw_data *ha = vha->hw; > - struct se_session *se_sess = sess->se_sess; > struct qla_tgt_mgmt_cmd *mcmd; > - struct se_cmd *se_cmd; > int rc; > - bool found_lun = false; > - unsigned long flags; > - > - spin_lock_irqsave(&se_sess->sess_cmd_lock, flags); > - list_for_each_entry(se_cmd, &se_sess->sess_cmd_list, se_cmd_list) { > - if (se_cmd->tag == abts->exchange_addr_to_abort) { > - found_lun = true; > - break; > - } > - } > - spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags); > > - /* cmd not in LIO lists, look in qla list */ > - if (!found_lun) { > - if (abort_cmd_for_tag(vha, abts->exchange_addr_to_abort)) { > - /* send TASK_ABORT response immediately */ > - qlt_24xx_send_abts_resp(vha, abts, FCP_TMF_CMPL, false); > - return 0; > - } else { > - ql_dbg(ql_dbg_tgt_mgt, vha, 0xf081, > - "unable to find cmd in driver or LIO for tag 0x%x\n", > - abts->exchange_addr_to_abort); > - return -ENOENT; > - } > + if (abort_cmd_for_tag(vha, abts->exchange_addr_to_abort)) { > + /* send TASK_ABORT response immediately */ > + qlt_24xx_send_abts_resp(vha, abts, FCP_TMF_CMPL, false); > + return 0; > } > > ql_dbg(ql_dbg_tgt_mgt, vha, 0xf00f, Hello Himanshu and Quinn, Please drop this patch. If a command has already been submitted to the LIO core and an ABTS is received then the LIO core should be requested to perform the abort. This patch changes the behavior of the qla2xxx target driver such that the LIO core is not informed at all if abort_cmd_for_tag() finds the command that has to be aborted in one of the command lists maintained by the qla2xxx driver. That can lead to the presence of overlapping writes in the command set on the target system and hence to data corruption. Please note that I had proposed a better approach on the target-devel mailing list and that I'm still waiting for someone from Cavium to review these patches: * [PATCH v6 09/33] target: Make it possible to specify I_T nexus for SCSI abort (http://www.spinics.net/lists/target-devel/msg14534.html). * [PATCH v6 10/33] tcm_qla2xxx: Let the target core look up the LUN of the aborted cmd (http://www.spinics.net/lists/target-devel/msg14563.html). Bart.
On Fri, 2017-05-19 at 23:43 +0000, Bart Van Assche wrote: > On Fri, 2017-05-19 at 14:53 -0700, Himanshu Madhani wrote: > > From: Quinn Tran <quinn.tran@cavium.com> > > > > During ABTS or Abort task, qla2xxx does a pre-search for > > the se_cmd, based on command's tag. The same search is > > performed by TCM. Remove the extra search from qla2xxx. > > > > Signed-off-by: Quinn Tran <quinn.tran@cavium.com> > > Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> > > --- > > drivers/scsi/qla2xxx/qla_target.c | 29 ++++------------------------- > > 1 file changed, 4 insertions(+), 25 deletions(-) > > > > diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c > > index 21e8993baf4b..b8e609ae6cff 100644 > > --- a/drivers/scsi/qla2xxx/qla_target.c > > +++ b/drivers/scsi/qla2xxx/qla_target.c > > @@ -1836,34 +1836,13 @@ static int __qlt_24xx_handle_abts(struct scsi_qla_host *vha, > > struct abts_recv_from_24xx *abts, struct fc_port *sess) > > { > > struct qla_hw_data *ha = vha->hw; > > - struct se_session *se_sess = sess->se_sess; > > struct qla_tgt_mgmt_cmd *mcmd; > > - struct se_cmd *se_cmd; > > int rc; > > - bool found_lun = false; > > - unsigned long flags; > > - > > - spin_lock_irqsave(&se_sess->sess_cmd_lock, flags); > > - list_for_each_entry(se_cmd, &se_sess->sess_cmd_list, se_cmd_list) { > > - if (se_cmd->tag == abts->exchange_addr_to_abort) { > > - found_lun = true; > > - break; > > - } > > - } > > - spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags); > > > > - /* cmd not in LIO lists, look in qla list */ > > - if (!found_lun) { > > - if (abort_cmd_for_tag(vha, abts->exchange_addr_to_abort)) { > > - /* send TASK_ABORT response immediately */ > > - qlt_24xx_send_abts_resp(vha, abts, FCP_TMF_CMPL, false); > > - return 0; > > - } else { > > - ql_dbg(ql_dbg_tgt_mgt, vha, 0xf081, > > - "unable to find cmd in driver or LIO for tag 0x%x\n", > > - abts->exchange_addr_to_abort); > > - return -ENOENT; > > - } > > + if (abort_cmd_for_tag(vha, abts->exchange_addr_to_abort)) { > > + /* send TASK_ABORT response immediately */ > > + qlt_24xx_send_abts_resp(vha, abts, FCP_TMF_CMPL, false); > > + return 0; > > } > > > > ql_dbg(ql_dbg_tgt_mgt, vha, 0xf00f, > > Hello Himanshu and Quinn, > > Please drop this patch. If a command has already been submitted to the LIO > core and an ABTS is received then the LIO core should be requested to perform > the abort. This patch changes the behavior of the qla2xxx target driver such > that the LIO core is not informed at all if abort_cmd_for_tag() finds the > command that has to be aborted in one of the command lists maintained by the > qla2xxx driver. That can lead to the presence of overlapping writes in the > command set on the target system and hence to data corruption. This analysis is wrong. The three lists abort_cmd_for_tag() walks from __qlt_24xx_handle_abts() are used to track descriptors only before __qlt_do_work() is reached, and before a descriptor is submitted into tcm_qla2xxx code. Or rather, the three lists in abort_cmd_for_tag() only contain qla_tgt_cmd or qla_tgt_sess_op descriptors that have not yet reached qla_tgt_func_tmpl->handle_cmd() code. Both qlt_do_work() and qlt_create_sess_from_atio() drop their respective descriptors from ->cmd_list before dispatching into tcm_qla2xxx -> target-core, which means there is no way for a descriptor to be part of internal lists once __qlt_do_work() is called. That said, the patch is correct and removes the redundant lookup. Acked-by: Nicholas Bellinger <nab@linux-iscsi.org>
On Sun, 2017-05-21 at 21:27 -0700, Nicholas A. Bellinger wrote: > The three lists abort_cmd_for_tag() walks from __qlt_24xx_handle_abts() > are used to track descriptors only before __qlt_do_work() is reached, > and before a descriptor is submitted into tcm_qla2xxx code. > > Or rather, the three lists in abort_cmd_for_tag() only contain > qla_tgt_cmd or qla_tgt_sess_op descriptors that have not yet reached > qla_tgt_func_tmpl->handle_cmd() code. > > Both qlt_do_work() and qlt_create_sess_from_atio() drop their respective > descriptors from ->cmd_list before dispatching into tcm_qla2xxx -> > target-core, which means there is no way for a descriptor to be part of > internal lists once __qlt_do_work() is called. > > That said, the patch is correct and removes the redundant lookup. I do not agree that this patch is makes ABTS handling fully robust. It seems like you have not noticed that the following race can occur with or without this patch applied: if abort_cmd_for_tag() and qlt_try_to_dequeue_unknown_atios() are called concurrently, since the latter function calls list_del() after qlt_send_term_exchange(), abort_cmd_for_tag() can return 1 and thereby trigger a call to qlt_24xx_send_abts_resp() while qlt_try_to_dequeue_unknown_atios() calls qlt_send_term_exchange() concurrently. I think an initiator could get really confused if it receives two responses for the same exchange. An existing issue that is not addressed by this patch is that if an ABTS is received after qlt_do_work() has called list_del() and before __qlt_do_work() triggers a call target_submit_cmd() that a command is not on any list and hence that abort_cmd_for_tag() won't be able to find it anywhere. Shouldn't these issues be addressed properly? Bart.
-----Original Message----- From: <linux-scsi-owner@vger.kernel.org> on behalf of Bart Van Assche <Bart.VanAssche@sandisk.com> Date: Monday, May 22, 2017 at 11:23 AM To: Nicholas Bellinger <nab@linux-iscsi.org> Cc: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>, "James.Bottomley@HansenPartnership.com" <James.Bottomley@HansenPartnership.com>, "Madhani, Himanshu" <Himanshu.Madhani@cavium.com>, "martin.petersen@oracle.com" <martin.petersen@oracle.com> Subject: Re: [PATCH 20/25] qla2xxx: Remove redundant code On Sun, 2017-05-21 at 21:27 -0700, Nicholas A. Bellinger wrote: > The three lists abort_cmd_for_tag() walks from __qlt_24xx_handle_abts() > are used to track descriptors only before __qlt_do_work() is reached, > and before a descriptor is submitted into tcm_qla2xxx code. > > Or rather, the three lists in abort_cmd_for_tag() only contain > qla_tgt_cmd or qla_tgt_sess_op descriptors that have not yet reached > qla_tgt_func_tmpl->handle_cmd() code. > > Both qlt_do_work() and qlt_create_sess_from_atio() drop their respective > descriptors from ->cmd_list before dispatching into tcm_qla2xxx -> > target-core, which means there is no way for a descriptor to be part of > internal lists once __qlt_do_work() is called. > > That said, the patch is correct and removes the redundant lookup. I do not agree that this patch is makes ABTS handling fully robust. It seems like you have not noticed that the following race can occur with or without this patch applied: if abort_cmd_for_tag() and qlt_try_to_dequeue_unknown_atios() are called concurrently, since the latter function calls list_del() after qlt_send_term_exchange(), abort_cmd_for_tag() can return 1 and thereby trigger a call to qlt_24xx_send_abts_resp() while qlt_try_to_dequeue_unknown_atios() calls qlt_send_term_exchange() concurrently. QT: “[PATCH 03/25] qla2xxx: Allow ABTS RX, RIDA on ATIOQ for ISP83XX/27XX” attempts to reduce the describe concurrency by moving the IOCB from the RSPQ thread to the ATIOQ thread. This helps preserve the ordering between SCSI cmd & ABTS. I think an initiator could get really confused if it receives two responses for the same exchange. QT: I do see the window you’re describing in the qlt_try_to_dequeue_unknown_atios(). Will have to follow up with another patch. This patch is not meant for this window you’ve just identify. An existing issue that is not addressed by this patch is that if an ABTS is received after qlt_do_work() has called list_del() and before __qlt_do_work() triggers a call target_submit_cmd() that a command is not on any list and hence that abort_cmd_for_tag() won't be able to find it anywhere. QT: Actually, we’re covered in this case. There 2 responds for 2 IOCB/transactions. 1) The abort_cmd_for_tag() finds the SCSI command on the qla_cmd_list and turns on the abort flag while cmd is in work queue & before __qlt_do_work is called. When __qlt_do_work is call and sees the aborted flag, the thread will terminate/cleanup the SCSI command and will not submit to TCM. As for the ABTS, the abort_cmd_for_tag() returns 1 after finding the command will trigger an ABTS respond. static void __qlt_do_work(struct qla_tgt_cmd *cmd) { if (cmd->aborted) { ql_dbg(ql_dbg_tgt_mgt, vha, 0xf082, "cmd with tag %u is aborted\n", cmd->atio.u.isp24.exchange_addr); goto out_term; } } Shouldn't these issues be addressed properly? QT: Will follow up with another patch for the new problem. thanks for the extra eyes. Bart.
On Mon, 2017-05-22 at 22:14 +0000, Tran, Quinn wrote: > > I think an initiator could get > really confused if it receives two responses for the same exchange. > > QT: I do see the window you’re describing in the qlt_try_to_dequeue_unknown_atios(). > Will have to follow up with another patch. This patch is not meant for this window > you’ve just identify. Hello Quinn, Has that patch been included in v2 of this series or will that patch perhaps be made available later? Thanks, Bart.
Bart, Will make available in the next upstream, once it has a chance to go through our internal test cycle. Regards, Quinn Tran -----Original Message----- From: Bart Van Assche <Bart.VanAssche@sandisk.com> Date: Wednesday, May 31, 2017 at 4:41 PM To: "Tran, Quinn" <Quinn.Tran@cavium.com>, Nicholas Bellinger <nab@linux-iscsi.org> Cc: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>, "James.Bottomley@HansenPartnership.com" <James.Bottomley@HansenPartnership.com>, "Madhani, Himanshu" <Himanshu.Madhani@cavium.com>, "martin.petersen@oracle.com" <martin.petersen@oracle.com> Subject: Re: [PATCH 20/25] qla2xxx: Remove redundant code On Mon, 2017-05-22 at 22:14 +0000, Tran, Quinn wrote: > > I think an initiator could get > really confused if it receives two responses for the same exchange. > > QT: I do see the window youʼre describing in the qlt_try_to_dequeue_unknown_atios(). > Will have to follow up with another patch. This patch is not meant for this window > youʼve just identify. Hello Quinn, Has that patch been included in v2 of this series or will that patch perhaps be made available later? Thanks, Bart.
diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c index 21e8993baf4b..b8e609ae6cff 100644 --- a/drivers/scsi/qla2xxx/qla_target.c +++ b/drivers/scsi/qla2xxx/qla_target.c @@ -1836,34 +1836,13 @@ static int __qlt_24xx_handle_abts(struct scsi_qla_host *vha, struct abts_recv_from_24xx *abts, struct fc_port *sess) { struct qla_hw_data *ha = vha->hw; - struct se_session *se_sess = sess->se_sess; struct qla_tgt_mgmt_cmd *mcmd; - struct se_cmd *se_cmd; int rc; - bool found_lun = false; - unsigned long flags; - - spin_lock_irqsave(&se_sess->sess_cmd_lock, flags); - list_for_each_entry(se_cmd, &se_sess->sess_cmd_list, se_cmd_list) { - if (se_cmd->tag == abts->exchange_addr_to_abort) { - found_lun = true; - break; - } - } - spin_unlock_irqrestore(&se_sess->sess_cmd_lock, flags); - /* cmd not in LIO lists, look in qla list */ - if (!found_lun) { - if (abort_cmd_for_tag(vha, abts->exchange_addr_to_abort)) { - /* send TASK_ABORT response immediately */ - qlt_24xx_send_abts_resp(vha, abts, FCP_TMF_CMPL, false); - return 0; - } else { - ql_dbg(ql_dbg_tgt_mgt, vha, 0xf081, - "unable to find cmd in driver or LIO for tag 0x%x\n", - abts->exchange_addr_to_abort); - return -ENOENT; - } + if (abort_cmd_for_tag(vha, abts->exchange_addr_to_abort)) { + /* send TASK_ABORT response immediately */ + qlt_24xx_send_abts_resp(vha, abts, FCP_TMF_CMPL, false); + return 0; } ql_dbg(ql_dbg_tgt_mgt, vha, 0xf00f,