diff mbox series

[2/3] scsi: Avoid that .queuecommand() gets called for a quiesced SCSI device

Message ID 20190605201435.233701-3-bvanassche@acm.org (mailing list archive)
State Superseded
Headers show
Series Avoid that .queuecommand() gets called for a quiesced SCSI device | expand

Commit Message

Bart Van Assche June 5, 2019, 8:14 p.m. UTC
Several SCSI transport and LLD drivers surround code that does not
tolerate concurrent calls of .queuecommand() with scsi_target_block() /
scsi_target_unblock(). These last two functions use
blk_mq_quiesce_queue() / blk_mq_unquiesce_queue() for scsi-mq request
queues to prevent concurrent .queuecommand() calls. However, that is
not sufficient to prevent .queuecommand() calls from scsi_send_eh_cmnd().
Hence surround the .queuecommand() call from the SCSI error handler with
code that avoids that .queuecommand() gets called in the quiesced state.

Note: converting the .queuecommand() call in scsi_send_eh_cmnd() into
code that calls blk_get_request() + blk_execute_rq() is not an option
since scsi_send_eh_cmnd() must be able to make forward progress even
if all requests have been allocated.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_error.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

Comments

Hannes Reinecke June 6, 2019, 5:50 a.m. UTC | #1
On 6/5/19 10:14 PM, Bart Van Assche wrote:
> Several SCSI transport and LLD drivers surround code that does not
> tolerate concurrent calls of .queuecommand() with scsi_target_block() /
> scsi_target_unblock(). These last two functions use
> blk_mq_quiesce_queue() / blk_mq_unquiesce_queue() for scsi-mq request
> queues to prevent concurrent .queuecommand() calls. However, that is
> not sufficient to prevent .queuecommand() calls from scsi_send_eh_cmnd().
> Hence surround the .queuecommand() call from the SCSI error handler with
> code that avoids that .queuecommand() gets called in the quiesced state.
> 
> Note: converting the .queuecommand() call in scsi_send_eh_cmnd() into
> code that calls blk_get_request() + blk_execute_rq() is not an option
> since scsi_send_eh_cmnd() must be able to make forward progress even
> if all requests have been allocated.
> 
Hmm. Have you actually observed this?
Typically, scsi_target_block()/scsi_target_unblock() is called prior to
invoking EH, to allow the system to settle and to guarantee that it's
fully quiesced. Only then EH is started.
Consequently, scsi_target_block()/scsi_target_unblock() really shouldn't
be called during EH; we're essentially single-threaded at this point, so
nothing else will be submitting command.
Can you explain why you need this?

Cheers,

Hannes
Bart Van Assche June 6, 2019, 8:40 p.m. UTC | #2
On 6/5/19 10:50 PM, Hannes Reinecke wrote:
> On 6/5/19 10:14 PM, Bart Van Assche wrote:
>> Several SCSI transport and LLD drivers surround code that does not
>> tolerate concurrent calls of .queuecommand() with scsi_target_block() /
>> scsi_target_unblock(). These last two functions use
>> blk_mq_quiesce_queue() / blk_mq_unquiesce_queue() for scsi-mq request
>> queues to prevent concurrent .queuecommand() calls. However, that is
>> not sufficient to prevent .queuecommand() calls from scsi_send_eh_cmnd().
>> Hence surround the .queuecommand() call from the SCSI error handler with
>> code that avoids that .queuecommand() gets called in the quiesced state.
>>
>> Note: converting the .queuecommand() call in scsi_send_eh_cmnd() into
>> code that calls blk_get_request() + blk_execute_rq() is not an option
>> since scsi_send_eh_cmnd() must be able to make forward progress even
>> if all requests have been allocated.
>>
> Hmm. Have you actually observed this?
> Typically, scsi_target_block()/scsi_target_unblock() is called prior to
> invoking EH, to allow the system to settle and to guarantee that it's
> fully quiesced. Only then EH is started.
> Consequently, scsi_target_block()/scsi_target_unblock() really shouldn't
> be called during EH; we're essentially single-threaded at this point, so
> nothing else will be submitting command.
> Can you explain why you need this?

Hi Hannes,

As one can see in the commit message of patch 3/3, I have observed a 
.queuecommand() call by the SCSI EH causing a crash.

The SCSI EH and blocking of SCSI devices have different triggers:
- As one can see in scsi_times_out(), if a SCSI command times out and an 
abort has already been scheduled for that command then that command is 
handed over to the SCSI error handler. After all commands that are in 
progress have failed the error handler thread is woken up.
- The iSCSI and SRP transport drivers call scsi_target_block() if a 
transport layer error has been observed. This can happen from another 
thread than the SCSI error handler thread and these functions can be 
called either before or after the SCSI error handler thread has been 
woken up.

Bart.
Hannes Reinecke June 7, 2019, 12:49 p.m. UTC | #3
On 6/6/19 10:40 PM, Bart Van Assche wrote:
> On 6/5/19 10:50 PM, Hannes Reinecke wrote:
>> On 6/5/19 10:14 PM, Bart Van Assche wrote:
>>> Several SCSI transport and LLD drivers surround code that does not
>>> tolerate concurrent calls of .queuecommand() with scsi_target_block() /
>>> scsi_target_unblock(). These last two functions use
>>> blk_mq_quiesce_queue() / blk_mq_unquiesce_queue() for scsi-mq request
>>> queues to prevent concurrent .queuecommand() calls. However, that is
>>> not sufficient to prevent .queuecommand() calls from
>>> scsi_send_eh_cmnd().
>>> Hence surround the .queuecommand() call from the SCSI error handler with
>>> code that avoids that .queuecommand() gets called in the quiesced state.
>>>
>>> Note: converting the .queuecommand() call in scsi_send_eh_cmnd() into
>>> code that calls blk_get_request() + blk_execute_rq() is not an option
>>> since scsi_send_eh_cmnd() must be able to make forward progress even
>>> if all requests have been allocated.
>>>
>> Hmm. Have you actually observed this?
>> Typically, scsi_target_block()/scsi_target_unblock() is called prior to
>> invoking EH, to allow the system to settle and to guarantee that it's
>> fully quiesced. Only then EH is started.
>> Consequently, scsi_target_block()/scsi_target_unblock() really shouldn't
>> be called during EH; we're essentially single-threaded at this point, so
>> nothing else will be submitting command.
>> Can you explain why you need this?
> 
> Hi Hannes,
> 
> As one can see in the commit message of patch 3/3, I have observed a
> .queuecommand() call by the SCSI EH causing a crash.
> 
> The SCSI EH and blocking of SCSI devices have different triggers:
> - As one can see in scsi_times_out(), if a SCSI command times out and an
> abort has already been scheduled for that command then that command is
> handed over to the SCSI error handler. After all commands that are in
> progress have failed the error handler thread is woken up.
> - The iSCSI and SRP transport drivers call scsi_target_block() if a
> transport layer error has been observed. This can happen from another
> thread than the SCSI error handler thread and these functions can be
> called either before or after the SCSI error handler thread has been
> woken up.
> 
But then I'd rather not quiesce the queue in these circumstances, like
in this (untested) patch:

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 34eaef631064..e0bdde025d1a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2641,8 +2641,17 @@ static int scsi_internal_device_block(struct
scsi_device *sdev)

        mutex_lock(&sdev->state_mutex);
        err = scsi_internal_device_block_nowait(sdev);
-       if (err == 0)
-               blk_mq_quiesce_queue(q);
+       if (err == 0) {
+               unsigned long flags;
+
+               spin_lock_irqsave(sdev->host->host_lock, flags);
+               if (sdev->host->shost_state != SHOST_RECOVERY &&
+                   sdev->host->shost_state != SHOST_CANCEL_RECOVERY) {
+                       spin_unlock_irqrestore(sdev->host->host_lock,
flags);
+                       blk_mq_quiesce_queue(q);
+               } else
+                       spin_unlock_irqrestore(sdev->host->host_lock,
flags);
+       }
        mutex_unlock(&sdev->state_mutex);

        return err;

Cheers,

Hannes
Bart Van Assche June 7, 2019, 3:21 p.m. UTC | #4
On 6/7/19 5:49 AM, Hannes Reinecke wrote:
> On 6/6/19 10:40 PM, Bart Van Assche wrote:
>> As one can see in the commit message of patch 3/3, I have observed a
>> .queuecommand() call by the SCSI EH causing a crash.
>>
>> The SCSI EH and blocking of SCSI devices have different triggers:
>> - As one can see in scsi_times_out(), if a SCSI command times out and an
>> abort has already been scheduled for that command then that command is
>> handed over to the SCSI error handler. After all commands that are in
>> progress have failed the error handler thread is woken up.
>> - The iSCSI and SRP transport drivers call scsi_target_block() if a
>> transport layer error has been observed. This can happen from another
>> thread than the SCSI error handler thread and these functions can be
>> called either before or after the SCSI error handler thread has been
>> woken up.
>>
> But then I'd rather not quiesce the queue in these circumstances, like
> in this (untested) patch:
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 34eaef631064..e0bdde025d1a 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2641,8 +2641,17 @@ static int scsi_internal_device_block(struct
> scsi_device *sdev)
> 
>          mutex_lock(&sdev->state_mutex);
>          err = scsi_internal_device_block_nowait(sdev);
> -       if (err == 0)
> -               blk_mq_quiesce_queue(q);
> +       if (err == 0) {
> +               unsigned long flags;
> +
> +               spin_lock_irqsave(sdev->host->host_lock, flags);
> +               if (sdev->host->shost_state != SHOST_RECOVERY &&
> +                   sdev->host->shost_state != SHOST_CANCEL_RECOVERY) {
> +                       spin_unlock_irqrestore(sdev->host->host_lock,
> flags);
> +                       blk_mq_quiesce_queue(q);
> +               } else
> +                       spin_unlock_irqrestore(sdev->host->host_lock,
> flags);
> +       }
>          mutex_unlock(&sdev->state_mutex);
> 
>          return err;
  A significant disadvantage of the above patch is that it makes it less 
likely that error handling will succeed. If the error handler is 
activated before the transport recovery code is activated, I think that 
transport recovery should really happen. The above patch makes it less 
likely that the SCSI error handler will succeed.

Additionally, scsi_target_block() ignores the value returned by 
scsi_internal_device_block() and it is nontrivial to make some of the 
scsi_target_block() callers handle scsi_target_block() failures.

Bart.
Christoph Hellwig June 8, 2019, 9:39 a.m. UTC | #5
Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
diff mbox series

Patch

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index f490994374f6..9f16304150b1 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1055,7 +1055,7 @@  static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 	struct scsi_device *sdev = scmd->device;
 	struct Scsi_Host *shost = sdev->host;
 	DECLARE_COMPLETION_ONSTACK(done);
-	unsigned long timeleft = timeout;
+	unsigned long timeleft = timeout, delay;
 	struct scsi_eh_save ses;
 	const unsigned long stall_for = msecs_to_jiffies(100);
 	int rtn;
@@ -1066,7 +1066,29 @@  static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 
 	scsi_log_send(scmd);
 	scmd->scsi_done = scsi_eh_done;
-	rtn = shost->hostt->queuecommand(shost, scmd);
+
+	/*
+	 * Lock sdev->state_mutex to avoid that scsi_device_quiesce() can
+	 * change the SCSI device state after we have examined it and before
+	 * .queuecommand() is called.
+	 */
+	mutex_lock(&sdev->state_mutex);
+	while (sdev->sdev_state == SDEV_BLOCK && timeleft > 0) {
+		mutex_unlock(&sdev->state_mutex);
+		SCSI_LOG_ERROR_RECOVERY(5, sdev_printk(KERN_DEBUG, sdev,
+			"%s: state %d <> %d\n", __func__, sdev->sdev_state,
+			SDEV_BLOCK));
+		delay = min(timeleft, stall_for);
+		timeleft -= delay;
+		msleep(jiffies_to_msecs(delay));
+		mutex_lock(&sdev->state_mutex);
+	}
+	if (sdev->sdev_state != SDEV_BLOCK)
+		rtn = shost->hostt->queuecommand(shost, scmd);
+	else
+		rtn = SCSI_MLQUEUE_DEVICE_BUSY;
+	mutex_unlock(&sdev->state_mutex);
+
 	if (rtn) {
 		if (timeleft > stall_for) {
 			scsi_eh_restore_cmnd(scmd, &ses);