diff mbox

[v3,3/4] sd: Make synchronize cache upon shutdown asynchronous

Message ID 1492968509.2414.6.camel@HansenPartnership.com (mailing list archive)
State Changes Requested, archived
Headers show

Commit Message

James Bottomley April 23, 2017, 5:28 p.m. UTC
On Thu, 2017-04-20 at 22:52 +0000, Bart Van Assche wrote:
> On Thu, 2017-04-20 at 15:13 -0700, James Bottomley wrote:
> > How is that possible?  Once the device goes into the CANCEL state,
> > it
> > no longer can be found by starget_for_each_device() because
> > scsi_device_get() returns NULL ...
> 
> scsi_target_block() is not serialized against __scsi_remove_device().
> I think
> the following sequence of events can cause a queue to be stopped for
> a device
> in the CANCEL state:
> (a) scsi_target_block() triggers a call to scsi_get_device().
> (b) __scsi_remove_device() is called from the context of another
> thread.
> (c) __scsi_remove_device() changes the device state into SDEV_CANCEL.
> (d) scsi_internal_device_block() calls blk_mq_stop_hw_queue().

OK, so the bug is that I didn't prevent the DEL->BLOCK transition which
allows us to go blocked after the new DEL if you were blocked before
when we reached cancel (corrected).  but looking at all of this, I
don't like the loose binding of block and queue stop, so this one makes
it more simplistic by deferring the queue stop until we try to execute
a blocked request.

In doing all of this, I folded in your start queue patch (I bet if we
fix this someone's going to want a backport to stable, so we shouldn't
make it difficult).

I also noticed the scsi_wait_for_queuecommand() should be event driven
(we're not supposed to do sleep driven condition checking) and there's
a spurious lock in scsi_request_fn_active() which means we could
eliminate the whole function.

However, the above is for another day.  This should work.

James

---

Comments

Bart Van Assche April 24, 2017, 9:46 p.m. UTC | #1
On Sun, 2017-04-23 at 12:28 -0500, James Bottomley wrote:
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1250,6 +1250,12 @@ scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
>  			break;
>  		case SDEV_BLOCK:
>  		case SDEV_CREATED_BLOCK:
> +			/* q lock is held only in the non-mq case */
> +			if (req->q->mq_ops)
> +				blk_mq_stop_hw_queues(req->q);
> +			else
> +				blk_stop_queue(req->q);
> +
>  			ret = BLKPREP_DEFER;
>  			break;
>  		case SDEV_QUIESCE:

Hello James,

This change swaps the order of changing the device state and the block layer
state. Sorry but I don't like this. What will happen if e.g. the disk event
checker decides to check for events just before __scsi_remove_device()
changes the device state? I think that can that cause sd_shutdown() to be
called with the block layer queue stopped and hence that with this approach
it is still possible that sd_shutdown() hangs.

> @@ -2611,7 +2617,6 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state)
>  		case SDEV_QUIESCE:
>  		case SDEV_OFFLINE:
>  		case SDEV_TRANSPORT_OFFLINE:
> -		case SDEV_BLOCK:
>  			break;
>  		default:
>  			goto illegal;

A previous patch made two changes to scsi_device_set_state(). Are you sure
that we do no longer have to enable the SDEV_BLOCK to SDEV_DEL transition?

> @@ -2844,10 +2849,12 @@ static int scsi_request_fn_active(struct scsi_device *sdev)
>   */
>  static void scsi_wait_for_queuecommand(struct scsi_device *sdev)
>  {
> -	WARN_ON_ONCE(sdev->host->use_blk_mq);
> -
> -	while (scsi_request_fn_active(sdev))
> -		msleep(20);
> +	if (sdev->request_queue->mq_ops) {
> +		synchronize_rcu();
> +	} else {
> +		while (scsi_request_fn_active(sdev))
> +			msleep(20);
> +	}
>  }

The above code makes an assumption about the block layer internals, namely
that calling synchronize_rcu() is sufficient to wait for outstanding requests
to finish. Please do not embed any assumptions about block layer internals in
SCSI code but keep code that relies on block layer internals in the block
layer. If you have a look at blk_mq_quiesce_queue() then you will see that
calling synchronize_rcu() is not sufficient for hardware queues for which
BLK_MQ_F_BLOCKING has been set. I am aware that today the SCSI core does not
set that flag. However, the dependency of the dependency of the
synchronize_rcu() call on BLK_MQ_F_BLOCKING not being set is nontrivial.

Thanks,

Bart.
diff mbox

Patch

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index e5a2d590a104..a9ec35d9c91b 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1250,6 +1250,12 @@  scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
 			break;
 		case SDEV_BLOCK:
 		case SDEV_CREATED_BLOCK:
+			/* q lock is held only in the non-mq case */
+			if (req->q->mq_ops)
+				blk_mq_stop_hw_queues(req->q);
+			else
+				blk_stop_queue(req->q);
+
 			ret = BLKPREP_DEFER;
 			break;
 		case SDEV_QUIESCE:
@@ -2611,7 +2617,6 @@  scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state)
 		case SDEV_QUIESCE:
 		case SDEV_OFFLINE:
 		case SDEV_TRANSPORT_OFFLINE:
-		case SDEV_BLOCK:
 			break;
 		default:
 			goto illegal;
@@ -2844,10 +2849,12 @@  static int scsi_request_fn_active(struct scsi_device *sdev)
  */
 static void scsi_wait_for_queuecommand(struct scsi_device *sdev)
 {
-	WARN_ON_ONCE(sdev->host->use_blk_mq);
-
-	while (scsi_request_fn_active(sdev))
-		msleep(20);
+	if (sdev->request_queue->mq_ops) {
+		synchronize_rcu();
+	} else {
+		while (scsi_request_fn_active(sdev))
+			msleep(20);
+	}
 }
 
 /**
@@ -2953,8 +2960,6 @@  EXPORT_SYMBOL(scsi_target_resume);
 int
 scsi_internal_device_block(struct scsi_device *sdev, bool wait)
 {
-	struct request_queue *q = sdev->request_queue;
-	unsigned long flags;
 	int err = 0;
 
 	err = scsi_device_set_state(sdev, SDEV_BLOCK);
@@ -2970,23 +2975,27 @@  scsi_internal_device_block(struct scsi_device *sdev, bool wait)
 	 * block layer from calling the midlayer with this device's
 	 * request queue. 
 	 */
+	if (wait)
+		scsi_wait_for_queuecommand(sdev);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(scsi_internal_device_block);
+
+void scsi_start_queue(struct scsi_device *sdev)
+{
+	struct request_queue *q = sdev->request_queue;
+	unsigned long flags;
+
 	if (q->mq_ops) {
-		if (wait)
-			blk_mq_quiesce_queue(q);
-		else
-			blk_mq_stop_hw_queues(q);
+		blk_mq_start_stopped_hw_queues(q, false);
 	} else {
 		spin_lock_irqsave(q->queue_lock, flags);
-		blk_stop_queue(q);
+		blk_start_queue(q);
 		spin_unlock_irqrestore(q->queue_lock, flags);
-		if (wait)
-			scsi_wait_for_queuecommand(sdev);
 	}
-
-	return 0;
 }
-EXPORT_SYMBOL_GPL(scsi_internal_device_block);
- 
+
 /**
  * scsi_internal_device_unblock - resume a device after a block request
  * @sdev:	device to resume
@@ -3007,9 +3016,6 @@  int
 scsi_internal_device_unblock(struct scsi_device *sdev,
 			     enum scsi_device_state new_state)
 {
-	struct request_queue *q = sdev->request_queue; 
-	unsigned long flags;
-
 	/*
 	 * Try to transition the scsi device to SDEV_RUNNING or one of the
 	 * offlined states and goose the device queue if successful.
@@ -3027,13 +3033,7 @@  scsi_internal_device_unblock(struct scsi_device *sdev,
 		 sdev->sdev_state != SDEV_OFFLINE)
 		return -EINVAL;
 
-	if (q->mq_ops) {
-		blk_mq_start_stopped_hw_queues(q, false);
-	} else {
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_start_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-	}
+	scsi_start_queue(sdev);
 
 	return 0;
 }
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index f11bd102d6d5..c7629e31a75b 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -89,6 +89,7 @@  extern void scsi_run_host_queues(struct Scsi_Host *shost);
 extern void scsi_requeue_run_queue(struct work_struct *work);
 extern struct request_queue *scsi_alloc_queue(struct scsi_device *sdev);
 extern struct request_queue *scsi_mq_alloc_queue(struct scsi_device *sdev);
+extern void scsi_start_queue(struct scsi_device *sdev);
 extern int scsi_mq_setup_tags(struct Scsi_Host *shost);
 extern void scsi_mq_destroy_tags(struct Scsi_Host *shost);
 extern int scsi_init_queue(void);
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 82dfe07b1d47..5e826801c84c 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1282,8 +1282,17 @@  void __scsi_remove_device(struct scsi_device *sdev)
 		return;
 
 	if (sdev->is_visible) {
-		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
-			return;
+		/*
+		 * If blocked, we go straight to DEL so any commands
+		 * issued during the driver shutdown (like sync cache)
+		 * are errored
+		 */
+		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0) {
+			if (scsi_device_set_state(sdev, SDEV_DEL) != 0)
+				return;
+
+			scsi_start_queue(sdev);
+		}
 
 		bsg_unregister_queue(sdev->request_queue);
 		device_unregister(&sdev->sdev_dev);