diff mbox

[3/3] scsi: proper state checking and module refcount handling in scsi_device_get

Message ID CAC5umyiSEmBOLT2MeztTq8z5-wE3koXpRQkD+tuX8b-PMg-e+A@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Akinobu Mita April 29, 2015, 1:17 a.m. UTC
2015-02-02 22:01 GMT+09:00 Christoph Hellwig <hch@lst.de>:
> This effectively reverts commits 85b6c7 ("[SCSI] sd: fix cache flushing on
> module removal (and individual device removal)" and dc4515ea ("scsi: always
> increment reference count").
>
> We now never call scsi_device_get from the shutdown path, and the fact
> that we started grabbing reference there in commit 85b6c7 turned out
> turned out to create more problems than it solves, and required
> workarounds for workarounds for workarounds. Move back to properly checking
> the device state and carefully handle module refcounting.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/scsi/scsi.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index 9b38299..9b7fd0b 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -979,18 +979,24 @@ EXPORT_SYMBOL(scsi_report_opcode);
>   * Description: Gets a reference to the scsi_device and increments the use count
>   * of the underlying LLDD module.  You must hold host_lock of the
>   * parent Scsi_Host or already have a reference when calling this.
> + *
> + * This will fail if a device is deleted or cancelled, or when the LLD module
> + * is in the process of being unloaded.
>   */
>  int scsi_device_get(struct scsi_device *sdev)

Hi Christoph,

This change broke ufs driver.

Because scsi_device_get() can be called while the module is being
unloaded with the device runtime suspended.
(i.e. driver_detach -> ... pm_runtime_get_sync() ... ->
ufshcd_runtime_resume -> ufshcd_resume -> ufshcd_set_dev_pwr_mode ->
scsi_device_get -> try_module_get -> return -ENXIO)

The reason for scsi_device_get() in ufshcd_set_dev_pwr_mode() is
to avoid manual delete of UFS device W-LUN by holding the reference
to it.  So can we acquire shost->scan_mutex lock instead of
scsi_device_get()?  I tried attached patch and it seems to be working,
but I would like to ask your opinion about this change.

>  {
> -       if (sdev->sdev_state == SDEV_DEL)
> -               return -ENXIO;
> +       if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)
> +               goto fail;
>         if (!get_device(&sdev->sdev_gendev))
> -               return -ENXIO;
> -       /* We can fail try_module_get if we're doing SCSI operations
> -        * from module exit (like cache flush) */
> -       __module_get(sdev->host->hostt->module);
> -
> +               goto fail;
> +       if (!try_module_get(sdev->host->hostt->module))
> +               goto fail_put_device;
>         return 0;
> +
> +fail_put_device:
> +       put_device(&sdev->sdev_gendev);
> +fail:
> +       return -ENXIO;
>  }
>  EXPORT_SYMBOL(scsi_device_get);
>
> --
> 1.9.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Christoph Hellwig April 29, 2015, 1:27 p.m. UTC | #1
On Wed, Apr 29, 2015 at 10:17:59AM +0900, Akinobu Mita wrote:
> This change broke ufs driver.

I'd claim the ufs driver, or rather more specifily the ufs spec
had already been broken.  That whole concept of keeping references
to scsi devices to send commands to for host state changes is just
fundamentally broken.  

> The reason for scsi_device_get() in ufshcd_set_dev_pwr_mode() is
> to avoid manual delete of UFS device W-LUN by holding the reference
> to it.  So can we acquire shost->scan_mutex lock instead of
> scsi_device_get()?  I tried attached patch and it seems to be working,
> but I would like to ask your opinion about this change.

It seems like a reasonable workaround.  But then again the concept
of the driver hanging on scsi_devices is and will stay broken.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

From a305a5284cac23adbf7f86b3014cc2e6325c7b88 Mon Sep 17 00:00:00 2001
From: Akinobu Mita <akinobu.mita@gmail.com>
Date: Wed, 29 Apr 2015 10:02:17 +0900
Subject: [PATCH] scsi: ufs: fix ufshcd_set_dev_pwr_mode() when unloading
 module

---
 drivers/scsi/ufs/ufshcd.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 540e00d..91cbc04 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -4695,23 +4695,19 @@  static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
 	struct scsi_sense_hdr sshdr;
 	struct scsi_device *sdp;
 	unsigned long flags;
-	int ret;
+	int ret = 0;
 
+	mutex_lock(&hba->host->scan_mutex);
 	spin_lock_irqsave(hba->host->host_lock, flags);
 	sdp = hba->sdev_ufs_device;
-	if (sdp) {
-		ret = scsi_device_get(sdp);
-		if (!ret && !scsi_device_online(sdp)) {
-			ret = -ENODEV;
-			scsi_device_put(sdp);
-		}
-	} else {
+	if (!sdp || !scsi_device_online(sdp))
 		ret = -ENODEV;
-	}
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
 
-	if (ret)
+	if (ret) {
+		mutex_unlock(&hba->host->scan_mutex);
 		return ret;
+	}
 
 	/*
 	 * If scsi commands fail, the scsi mid-layer schedules scsi error-
@@ -4748,7 +4744,7 @@  static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,
 	if (!ret)
 		hba->curr_dev_pwr_mode = pwr_mode;
 out:
-	scsi_device_put(sdp);
+	mutex_unlock(&hba->host->scan_mutex);
 	hba->host->eh_noresume = 0;
 	return ret;
 }
-- 
1.9.1