diff mbox

lpfc: Fix race on command completion

Message ID 1452851303-89378-1-git-send-email-hare@suse.de (mailing list archive)
State Deferred, archived
Headers show

Commit Message

Hannes Reinecke Jan. 15, 2016, 9:48 a.m. UTC
Upon command completion the lpfc driver would call ->done()
on the scsi command before taking the host lock and
releasing the command internally.
This opens up a race window there this command might be re-used
after ->done(), leading to a double completion on the same command.

This patch takes the host lock before accessing the scsi command,
and disconnect the internal command under the same lock.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/lpfc/lpfc_scsi.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

Comments

Martin K. Petersen Jan. 20, 2016, 12:40 a.m. UTC | #1
>>>>> "Hannes" == Hannes Reinecke <hare@suse.de> writes:

Hannes> Upon command completion the lpfc driver would call ->done() on
Hannes> the scsi command before taking the host lock and releasing the
Hannes> command internally.  This opens up a race window there this
Hannes> command might be re-used after ->done(), leading to a double
Hannes> completion on the same command.

Hannes> This patch takes the host lock before accessing the scsi
Hannes> command, and disconnect the internal command under the same
Hannes> lock.

James/Dick: Please review.
Tomas Henzl Jan. 20, 2016, 4:26 p.m. UTC | #2
On 15.1.2016 10:48, Hannes Reinecke wrote:
> Upon command completion the lpfc driver would call ->done()
> on the scsi command before taking the host lock and
> releasing the command internally.
> This opens up a race window there this command might be re-used
> after ->done(), leading to a double completion on the same command.

I agree that a driver should clean up the command before calling
->done, but this driver uses a list based system where a command 
can't be reused only until it was returned to the list,
so I don't understand how a 'done' before internal free could
cause an issue other than a failed lpfc_get_scsi_buf in .queuecommand.
Is your issue related to the abort_handler 
(maybe cmd->host_scribble = NULL; changes the abort handler flow)?

Tomas 

>
> This patch takes the host lock before accessing the scsi command,
> and disconnect the internal command under the same lock.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/scsi/lpfc/lpfc_scsi.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
> index 4679ed4..974af28 100644
> --- a/drivers/scsi/lpfc/lpfc_scsi.c
> +++ b/drivers/scsi/lpfc/lpfc_scsi.c
> @@ -3908,9 +3908,16 @@ lpfc_scsi_cmd_iocb_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pIocbIn,
>  	uint32_t logit = LOG_FCP;
>  
>  	/* Sanity check on return of outstanding command */
> -	if (!(lpfc_cmd->pCmd))
> +	spin_lock_irqsave(&phba->hbalock, flags);
> +	if (!(lpfc_cmd->pCmd)) {
> +		spin_unlock_irqrestore(&phba->hbalock, flags);
>  		return;
> +	}
>  	cmd = lpfc_cmd->pCmd;
> +	cmd->host_scribble = NULL;
> +	lpfc_cmd->pCmd = NULL;
> +	spin_unlock_irqrestore(&phba->hbalock, flags);
> +
>  	shost = cmd->device->host;
>  
>  	lpfc_cmd->result = (pIocbOut->iocb.un.ulpWord[4] & IOERR_PARAM_MASK);
> @@ -4125,10 +4132,6 @@ lpfc_scsi_cmd_iocb_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pIocbIn,
>  	cmd->scsi_done(cmd);
>  
>  	if (phba->cfg_poll & ENABLE_FCP_RING_POLLING) {
> -		spin_lock_irqsave(&phba->hbalock, flags);
> -		lpfc_cmd->pCmd = NULL;
> -		spin_unlock_irqrestore(&phba->hbalock, flags);
> -
>  		/*
>  		 * If there is a thread waiting for command completion
>  		 * wake up the thread.
> @@ -4141,10 +4144,6 @@ lpfc_scsi_cmd_iocb_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pIocbIn,
>  		return;
>  	}
>  
> -	spin_lock_irqsave(&phba->hbalock, flags);
> -	lpfc_cmd->pCmd = NULL;
> -	spin_unlock_irqrestore(&phba->hbalock, flags);
> -
>  	/*
>  	 * If there is a thread waiting for command completion
>  	 * wake up the thread.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Reinecke Jan. 21, 2016, 3:21 p.m. UTC | #3
On 01/20/2016 05:26 PM, Tomas Henzl wrote:
> On 15.1.2016 10:48, Hannes Reinecke wrote:
>> Upon command completion the lpfc driver would call ->done()
>> on the scsi command before taking the host lock and
>> releasing the command internally.
>> This opens up a race window there this command might be re-used
>> after ->done(), leading to a double completion on the same command.
> 
> I agree that a driver should clean up the command before calling
> ->done, but this driver uses a list based system where a command 
> can't be reused only until it was returned to the list,
> so I don't understand how a 'done' before internal free could
> cause an issue other than a failed lpfc_get_scsi_buf in .queuecommand.
> Is your issue related to the abort_handler 
> (maybe cmd->host_scribble = NULL; changes the abort handler flow)?
> 
Yes, this was (originally) an issue with the abort handler. But it
seems to be gone with the upstream driver, so this patch should be
retracted.

Will be reposting if and when the issue resurfaces.

Cheers,

Hannes
diff mbox

Patch

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 4679ed4..974af28 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -3908,9 +3908,16 @@  lpfc_scsi_cmd_iocb_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pIocbIn,
 	uint32_t logit = LOG_FCP;
 
 	/* Sanity check on return of outstanding command */
-	if (!(lpfc_cmd->pCmd))
+	spin_lock_irqsave(&phba->hbalock, flags);
+	if (!(lpfc_cmd->pCmd)) {
+		spin_unlock_irqrestore(&phba->hbalock, flags);
 		return;
+	}
 	cmd = lpfc_cmd->pCmd;
+	cmd->host_scribble = NULL;
+	lpfc_cmd->pCmd = NULL;
+	spin_unlock_irqrestore(&phba->hbalock, flags);
+
 	shost = cmd->device->host;
 
 	lpfc_cmd->result = (pIocbOut->iocb.un.ulpWord[4] & IOERR_PARAM_MASK);
@@ -4125,10 +4132,6 @@  lpfc_scsi_cmd_iocb_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pIocbIn,
 	cmd->scsi_done(cmd);
 
 	if (phba->cfg_poll & ENABLE_FCP_RING_POLLING) {
-		spin_lock_irqsave(&phba->hbalock, flags);
-		lpfc_cmd->pCmd = NULL;
-		spin_unlock_irqrestore(&phba->hbalock, flags);
-
 		/*
 		 * If there is a thread waiting for command completion
 		 * wake up the thread.
@@ -4141,10 +4144,6 @@  lpfc_scsi_cmd_iocb_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pIocbIn,
 		return;
 	}
 
-	spin_lock_irqsave(&phba->hbalock, flags);
-	lpfc_cmd->pCmd = NULL;
-	spin_unlock_irqrestore(&phba->hbalock, flags);
-
 	/*
 	 * If there is a thread waiting for command completion
 	 * wake up the thread.