diff mbox series

[v2,2/6] qla2xxx: Simplify the code for aborting SCSI commands

Message ID 20200123042345.23886-3-bvanassche@acm.org (mailing list archive)
State Changes Requested
Headers show
Series qla2xxx patches for kernel v5.6 | expand

Commit Message

Bart Van Assche Jan. 23, 2020, 4:23 a.m. UTC
Since the SCSI core does not reuse the tag of the SCSI command that is
being aborted by .eh_abort() before .eh_abort() has finished it is not
necessary to check from inside that callback whether or not the SCSI command
has already completed. Instead, rely on the firmware to return an error code
when attempting to abort a command that has already completed. Additionally,
rely on the firmware to return an error code when attempting to abort an
already aborted command.

In qla2x00_abort_srb(), use blk_mq_request_started() instead of
sp->completed and sp->aborted.

This patch eliminates several race conditions triggered by the removed member
variables.

Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Cc: Quinn Tran <qutran@marvell.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/qla2xxx/qla_def.h |  3 ---
 drivers/scsi/qla2xxx/qla_isr.c |  5 -----
 drivers/scsi/qla2xxx/qla_os.c  | 27 ++++++++++++++-------------
 3 files changed, 14 insertions(+), 21 deletions(-)

Comments

Daniel Wagner Jan. 23, 2020, 10:17 a.m. UTC | #1
On Wed, Jan 22, 2020 at 08:23:41PM -0800, Bart Van Assche wrote:
> Since the SCSI core does not reuse the tag of the SCSI command that is
> being aborted by .eh_abort() before .eh_abort() has finished it is not
> necessary to check from inside that callback whether or not the SCSI command
> has already completed. Instead, rely on the firmware to return an error code
> when attempting to abort a command that has already completed. Additionally,
> rely on the firmware to return an error code when attempting to abort an
> already aborted command.
> 
> In qla2x00_abort_srb(), use blk_mq_request_started() instead of
> sp->completed and sp->aborted.
> 
> This patch eliminates several race conditions triggered by the removed member
> variables.

I can only guess here what the races are but I agree removing the
logic here and relying on the SCSI layer to handle it correctly makes
sense. 

> Acked-by: Himanshu Madhani <hmadhani@marvell.com>
> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
> Cc: Quinn Tran <qutran@marvell.com>
> Cc: Martin Wilck <mwilck@suse.com>
> Cc: Daniel Wagner <dwagner@suse.de>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Reviewed-by: Daniel Wagner <dwagner@suse.de>

> +/*
> + * The caller must ensure that no completion interrupts will happen
> + * while this function is in progress.
> + */

So could we add something like WARN_ON(irqs_disabled())?
Bart Van Assche Jan. 24, 2020, 3:12 a.m. UTC | #2
On 2020-01-23 02:17, Daniel Wagner wrote:
> On Wed, Jan 22, 2020 at 08:23:41PM -0800, Bart Van Assche wrote:
>> Since the SCSI core does not reuse the tag of the SCSI command that is
>> being aborted by .eh_abort() before .eh_abort() has finished it is not
>> necessary to check from inside that callback whether or not the SCSI command
>> has already completed. Instead, rely on the firmware to return an error code
>> when attempting to abort a command that has already completed. Additionally,
>> rely on the firmware to return an error code when attempting to abort an
>> already aborted command.
>>
>> In qla2x00_abort_srb(), use blk_mq_request_started() instead of
>> sp->completed and sp->aborted.
>>
>> This patch eliminates several race conditions triggered by the removed member
>> variables.
> 
> I can only guess here what the races are but I agree removing the
> logic here and relying on the SCSI layer to handle it correctly makes
> sense. 

I will make the patch description more clear when I repost this patch.

>> +/*
>> + * The caller must ensure that no completion interrupts will happen
>> + * while this function is in progress.
>> + */
> 
> So could we add something like WARN_ON(irqs_disabled())?

qla2x00_abort_all_cmds() is typically called after the kernel driver
discovered that the firmware stopped processing commands. If the
firmware has stopped processing commands it is safe to call this
function without disabling interrupts. Even if the caller would disable
interrupts, that would only disable interrupts on the local CPU but not
on other CPUs. Additionally, disabling interrupts is not a proper
solution because if a completion interrupt arrives after a command has
been aborted that will cause trouble if the command handle has already
been associated with another command.

Bart.
Daniel Wagner Jan. 24, 2020, 10:44 a.m. UTC | #3
Hi Bart,

On Thu, Jan 23, 2020 at 07:12:04PM -0800, Bart Van Assche wrote:
> >> +/*
> >> + * The caller must ensure that no completion interrupts will happen
> >> + * while this function is in progress.
> >> + */
> > 
> > So could we add something like WARN_ON(irqs_disabled())?
> 
> qla2x00_abort_all_cmds() is typically called after the kernel driver
> discovered that the firmware stopped processing commands. If the
> firmware has stopped processing commands it is safe to call this
> function without disabling interrupts. Even if the caller would disable
> interrupts, that would only disable interrupts on the local CPU but not
> on other CPUs. Additionally, disabling interrupts is not a proper
> solution because if a completion interrupt arrives after a command has
> been aborted that will cause trouble if the command handle has already
> been associated with another command.

Thanks for the explenation. I understood the comment as 'interrupts'
have to be disabled when calling this function.

Thanks,
Daniel
diff mbox series

Patch

diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index ed32e9715794..c5a067f45005 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -597,9 +597,6 @@  typedef struct srb {
 	struct fc_port *fcport;
 	struct scsi_qla_host *vha;
 	unsigned int start_timer:1;
-	unsigned int abort:1;
-	unsigned int aborted:1;
-	unsigned int completed:1;
 
 	uint32_t handle;
 	uint16_t flags;
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index e7bad0bfffda..0c9bfe77ba8a 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -2504,11 +2504,6 @@  qla2x00_status_entry(scsi_qla_host_t *vha, struct rsp_que *rsp, void *pkt)
 		return;
 	}
 
-	if (sp->abort)
-		sp->aborted = 1;
-	else
-		sp->completed = 1;
-
 	if (sp->cmd_type != TYPE_SRB) {
 		req->outstanding_cmds[handle] = NULL;
 		ql_dbg(ql_dbg_io, vha, 0x3015,
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 79387ac8936f..a34f27b2d602 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1253,17 +1253,6 @@  qla2xxx_eh_abort(struct scsi_cmnd *cmd)
 		return SUCCESS;
 
 	spin_lock_irqsave(qpair->qp_lock_ptr, flags);
-	if (sp->completed) {
-		spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
-		return SUCCESS;
-	}
-
-	if (sp->abort || sp->aborted) {
-		spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
-		return FAILED;
-	}
-
-	sp->abort = 1;
 	sp->comp = &comp;
 	spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
 
@@ -1688,6 +1677,10 @@  qla2x00_loop_reset(scsi_qla_host_t *vha)
 	return QLA_SUCCESS;
 }
 
+/*
+ * The caller must ensure that no completion interrupts will happen
+ * while this function is in progress.
+ */
 static void qla2x00_abort_srb(struct qla_qpair *qp, srb_t *sp, const int res,
 			      unsigned long *flags)
 	__releases(qp->qp_lock_ptr)
@@ -1696,6 +1689,7 @@  static void qla2x00_abort_srb(struct qla_qpair *qp, srb_t *sp, const int res,
 	DECLARE_COMPLETION_ONSTACK(comp);
 	scsi_qla_host_t *vha = qp->vha;
 	struct qla_hw_data *ha = vha->hw;
+	struct scsi_cmnd *cmd = GET_CMD_SP(sp);
 	int rval;
 	bool ret_cmd;
 	uint32_t ratov_j;
@@ -1717,7 +1711,6 @@  static void qla2x00_abort_srb(struct qla_qpair *qp, srb_t *sp, const int res,
 		}
 
 		sp->comp = &comp;
-		sp->abort =  1;
 		spin_unlock_irqrestore(qp->qp_lock_ptr, *flags);
 
 		rval = ha->isp_ops->abort_command(sp);
@@ -1741,13 +1734,17 @@  static void qla2x00_abort_srb(struct qla_qpair *qp, srb_t *sp, const int res,
 		}
 
 		spin_lock_irqsave(qp->qp_lock_ptr, *flags);
-		if (ret_cmd && (!sp->completed || !sp->aborted))
+		if (ret_cmd && blk_mq_request_started(cmd->request))
 			sp->done(sp, res);
 	} else {
 		sp->done(sp, res);
 	}
 }
 
+/*
+ * The caller must ensure that no completion interrupts will happen
+ * while this function is in progress.
+ */
 static void
 __qla2x00_abort_all_cmds(struct qla_qpair *qp, int res)
 {
@@ -1794,6 +1791,10 @@  __qla2x00_abort_all_cmds(struct qla_qpair *qp, int res)
 	spin_unlock_irqrestore(qp->qp_lock_ptr, flags);
 }
 
+/*
+ * The caller must ensure that no completion interrupts will happen
+ * while this function is in progress.
+ */
 void
 qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
 {