[v3,1/8] scsi: core: Fix a race between scsi_done() and scsi_timeout()

Message ID 20220929220021.247097-2-bvanassche@acm.org (mailing list archive)
State Superseded
Series Fix a deadlock in the UFS driver

Commit Message

Bart Van Assche Sept. 29, 2022, 10 p.m. UTC
If there is a race between scsi_done() and scsi_timeout() and if
scsi_timeout() loses the race, scsi_timeout() should not reset the
request timer. Hence change the return value for this case from
BLK_EH_RESET_TIMER into BLK_EH_DONE.

Although the block layer holds a reference on a request (req->ref) while
calling a timeout handler, restarting the timer (blk_add_timer()) while
a request is being completed is racy.

Cc: Keith Busch <kbusch@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 065990bd198e ("scsi: set timed out out mq requests to complete")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_error.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)
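
The arbitration described in the commit message can be modeled in user space. The sketch below is an illustration only, not kernel code: `cmd_model`, `model_done()`, `model_timeout()` and the `EH_*` values are hypothetical stand-ins for `struct scsi_cmnd`, `scsi_done()`, `scsi_timeout()` and the `BLK_EH_*` values, and C11 `atomic_fetch_or()` stands in for `test_and_set_bit()`. The `patched` flag selects between the old and the new return value for the case where the timeout path loses the race.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not the kernel definitions. */
enum eh_ret { EH_DONE, EH_RESET_TIMER };
enum owner { OWNER_NONE, OWNER_COMPLETION, OWNER_TIMEOUT };
#define STATE_COMPLETE (1u << 0)

struct cmd_model {
	atomic_uint state;	/* models scmd->state */
	enum owner owner;	/* which path claimed the command */
};

/* Models test_and_set_bit(): set the bit atomically, return its old value. */
static bool model_test_and_set(atomic_uint *word, unsigned int mask)
{
	return (atomic_fetch_or(word, mask) & mask) != 0;
}

/* Completion side: only the first caller to set the bit owns the command. */
static bool model_done(struct cmd_model *c)
{
	if (model_test_and_set(&c->state, STATE_COMPLETE))
		return false;	/* the timeout path already claimed it */
	c->owner = OWNER_COMPLETION;
	return true;
}

/* Timeout side: the branch that the patch changes. */
static enum eh_ret model_timeout(struct cmd_model *c, bool patched)
{
	if (model_test_and_set(&c->state, STATE_COMPLETE)) {
		/* Lost the race: leave *c alone. Old code re-armed the timer. */
		return patched ? EH_DONE : EH_RESET_TIMER;
	}
	c->owner = OWNER_TIMEOUT;	/* abort or error handling would start here */
	return EH_DONE;
}
```

In this model, calling `model_done()` first and `model_timeout()` second leaves the command owned by the completion path; with `patched` set the timeout side returns `EH_DONE` and does not ask for the timer to be restarted.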

Comments

Mike Christie Sept. 30, 2022, 12:17 a.m. UTC | #1
On 9/29/22 5:00 PM, Bart Van Assche wrote:
> If there is a race between scsi_done() and scsi_timeout() and if
> scsi_timeout() loses the race, scsi_timeout() should not reset the
> request timer. Hence change the return value for this case from
> BLK_EH_RESET_TIMER into BLK_EH_DONE.
> 
> Although the block layer holds a reference on a request (req->ref) while
> calling a timeout handler, restarting the timer (blk_add_timer()) while
> a request is being completed is racy.
> 
> Cc: Keith Busch <kbusch@kernel.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Mike Christie <michael.christie@oracle.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Reported-by: Adrian Hunter <adrian.hunter@intel.com>
> Fixes: 065990bd198e ("scsi: set timed out out mq requests to complete")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/scsi/scsi_error.c | 14 +++-----------
>  1 file changed, 3 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 16bd0adc2339..d1b07ff64a96 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -343,19 +343,11 @@ enum blk_eh_timer_return scsi_timeout(struct request *req)
>  
>  	if (rtn == BLK_EH_DONE) {
>  		/*
> -		 * Set the command to complete first in order to prevent a real
> -		 * completion from releasing the command while error handling
> -		 * is using it. If the command was already completed, then the
> -		 * lower level driver beat the timeout handler, and it is safe
> -		 * to return without escalating error recovery.
> -		 *
> -		 * If timeout handling lost the race to a real completion, the
> -		 * block layer may ignore that due to a fake timeout injection,
> -		 * so return RESET_TIMER to allow error handling another shot

I've been wondering about this code too.

I think the patch is correct for the normal cases, but I didn't understand the
old fake timeout comment case. From the comment it seemed like that was the reason
we did the RESET_TIMER. Does that not exist anymore or was it just bogus?

The commit you referenced actually was returning BLK_EH_DONE like we want. This
commit:

commit f1342709d18af97b0e71449d5696b8873d1a456c
Author: Keith Busch <keith.busch@intel.com>
Date:   Mon Nov 26 09:54:29 2018 -0700

    scsi: Do not rely on blk-mq for double completions


changed it to BLK_EH_RESET_TIMER and changed the above comment to mention
the fake timeout case. However, the commit message mentioned the patch was done
because we didn't want scsi digging into the block layer.

If the fake injection thingy is bogus, then it seems ok to me.

Reviewed-by: Mike Christie <michael.christie@oracle.com>


> -		 * at this command.
> +		 * If scsi_done() has already set SCMD_STATE_COMPLETE, do not
> +		 * modify *scmd.
>  		 */
>  		if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
> -			return BLK_EH_RESET_TIMER;
> +			return BLK_EH_DONE;
>  		if (scsi_abort_command(scmd) != SUCCESS) {
>  			set_host_byte(scmd, DID_TIME_OUT);
>  			scsi_eh_scmd_add(scmd);
Bart Van Assche Sept. 30, 2022, 12:32 a.m. UTC | #2
On 9/29/22 17:17, Mike Christie wrote:
> On 9/29/22 5:00 PM, Bart Van Assche wrote:
>>   	if (rtn == BLK_EH_DONE) {
>>   		/*
>> -		 * Set the command to complete first in order to prevent a real
>> -		 * completion from releasing the command while error handling
>> -		 * is using it. If the command was already completed, then the
>> -		 * lower level driver beat the timeout handler, and it is safe
>> -		 * to return without escalating error recovery.
>> -		 *
>> -		 * If timeout handling lost the race to a real completion, the
>> -		 * block layer may ignore that due to a fake timeout injection,
>> -		 * so return RESET_TIMER to allow error handling another shot
> 
> I've been wondering about this code too.
> 
> I think the patch is correct for the normal cases, but I didn't understand the
> old fake timeout comment case. From the comment it seemed like that was the reason
> we did the RESET_TIMER. Does that not exist anymore or was it just bogus?

Before commit 15f73f5b3e59 ("blk-mq: move failure injection out of
blk_mq_complete_request") the scsi_mq_done() function cleared the
SCMD_STATE_COMPLETE bit in case of fake timeout injection. I think
that commit made the above comment incorrect.

> The commit you referenced actually was returning BLK_EH_DONE like we want. This
> commit:
> 
> commit f1342709d18af97b0e71449d5696b8873d1a456c
> Author: Keith Busch <keith.busch@intel.com>
> Date:   Mon Nov 26 09:54:29 2018 -0700
> 
>      scsi: Do not rely on blk-mq for double completions
> 
> 
> changed it to BLK_EH_RESET_TIMER and changed the above comment to mention
> the fake timeout case. However, the commit message mentioned the patch was done
> because we didn't want scsi digging into the block layer.
> 
> If the fake injection thingy is bogus, then it seems ok to me.

Hmm ... I probably should modify the Fixes tag.

Thanks,

Bart.
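
The ordering change discussed in comment #2 can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel implementation: `done_old()` follows the behavior described in that comment (set the completion bit, then clear it again when fake timeout injection swallows the completion), while `done_new()` assumes that the injection check now runs before the bit is touched. The names `done_old`, `done_new` and `done_trace` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define STATE_COMPLETE (1u << 0)	/* models SCMD_STATE_COMPLETE */

/* Records what one call to a completion function did. */
struct done_trace {
	bool bit_ever_set;	/* the bit was visible as set at some point */
	bool completed;		/* the request was really completed */
};

/* Old ordering as described in the thread: set the bit, undo on injection. */
static void done_old(atomic_uint *state, bool fake_timeout, struct done_trace *t)
{
	if (atomic_fetch_or(state, STATE_COMPLETE) & STATE_COMPLETE)
		return;
	t->bit_ever_set = true;
	if (fake_timeout) {
		/* Completion swallowed: clear the bit so the timeout path can act. */
		atomic_fetch_and(state, ~STATE_COMPLETE);
		return;
	}
	t->completed = true;
}

/* Assumed new ordering: check for injection before touching the bit. */
static void done_new(atomic_uint *state, bool fake_timeout, struct done_trace *t)
{
	if (fake_timeout)
		return;
	if (atomic_fetch_or(state, STATE_COMPLETE) & STATE_COMPLETE)
		return;
	t->bit_ever_set = true;
	t->completed = true;
}
```

With `fake_timeout` set, both variants leave the bit clear on return, but only `done_old()` exposes a window in which a concurrent timeout handler could observe the bit as set. That window is the case the removed comment was written for; under the assumed new ordering it does not occur.
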
Patch

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 16bd0adc2339..d1b07ff64a96 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -343,19 +343,11 @@ enum blk_eh_timer_return scsi_timeout(struct request *req)
 
 	if (rtn == BLK_EH_DONE) {
 		/*
-		 * Set the command to complete first in order to prevent a real
-		 * completion from releasing the command while error handling
-		 * is using it. If the command was already completed, then the
-		 * lower level driver beat the timeout handler, and it is safe
-		 * to return without escalating error recovery.
-		 *
-		 * If timeout handling lost the race to a real completion, the
-		 * block layer may ignore that due to a fake timeout injection,
-		 * so return RESET_TIMER to allow error handling another shot
-		 * at this command.
+		 * If scsi_done() has already set SCMD_STATE_COMPLETE, do not
+		 * modify *scmd.
 		 */
 		if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
-			return BLK_EH_RESET_TIMER;
+			return BLK_EH_DONE;
 		if (scsi_abort_command(scmd) != SUCCESS) {
 			set_host_byte(scmd, DID_TIME_OUT);
 			scsi_eh_scmd_add(scmd);