diff mbox series

scsi: ufs: Fix deadlocks between power management and error handler

Message ID 20220916184220.867535-1-bvanassche@acm.org (mailing list archive)
State New, archived
Headers show
Series scsi: ufs: Fix deadlocks between power management and error handler | expand

Commit Message

Bart Van Assche Sept. 16, 2022, 6:42 p.m. UTC
The following deadlocks have been observed on multiple test setups:

* ufshcd_wl_suspend() is waiting for blk_execute_rq() to complete while it
  holds host_sem.
* ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and the
  latter function tries to obtain host_sem.
This is a deadlock because blk_execute_rq() can't execute SCSI commands
while the host is in the SHOST_RECOVERY state and because the error
handler cannot make progress either.

* ufshcd_wl_runtime_resume() is waiting for blk_execute_rq() to finish
  while it holds host_sem.
* ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and the
  latter function calls pm_runtime_resume().
This is a deadlock because of the same reason as the previous scenario.

Fix both deadlocks by not obtaining host_sem from the power management
code paths. Removing the host_sem locking from the power management code
is safe because the ufshcd_err_handler() is already serialized against
SCSI command execution.

The ufshcd_rpm_get_sync() call at the start of
ufshcd_err_handling_prepare() may deadlock since calling scsi_execute()
is required by the UFS runtime resume implementation. Fixing that
deadlock falls outside the scope of this patch.

Cc: dh0421.hwang@samsung.com
Cc: Asutosh Das <asutoshd@codeaurora.org>
Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/core/ufshcd.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

Comments

Asutosh Das Sept. 19, 2022, 3:10 a.m. UTC | #1
Hello Bart,

On 9/16/2022 11:42 AM, Bart Van Assche wrote:
> The following deadlocks have been observed on multiple test setups:
> 
> * ufshcd_wl_suspend() is waiting for blk_execute_rq() to complete while it
>    holds host_sem.
> * ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and the
>    latter function tries to obtain host_sem.
> This is a deadlock because blk_execute_rq() can't execute SCSI commands
> while the host is in the SHOST_RECOVERY state and because the error
> handler cannot make progress either.
> 
> * ufshcd_wl_runtime_resume() is waiting for blk_execute_rq() to finish
>    while it holds host_sem.
> * ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and the
>    latter function calls pm_runtime_resume().
> This is a deadlock because of the same reason as the previous scenario.
> 
> Fix both deadlocks by not obtaining host_sem from the power management
> code paths. Removing the host_sem locking from the power management code
> is safe because the ufshcd_err_handler() is already serialized against
> SCSI command execution.
> 

Say, there's a PWR_FATAL error in ufshcd_wl_suspend().
Wouldn't there be a scenario in which the suspend and error handler may 
run simultaneously?
Do you see issues when that happens? How about when shutdown runs 
simulataneously with error handler?

-asd
Adrian Hunter Sept. 19, 2022, 11:34 a.m. UTC | #2
On 16/09/22 21:42, Bart Van Assche wrote:
> The following deadlocks have been observed on multiple test setups:
> 
> * ufshcd_wl_suspend() is waiting for blk_execute_rq() to complete while it
>   holds host_sem.
> * ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and the
>   latter function tries to obtain host_sem.
> This is a deadlock because blk_execute_rq() can't execute SCSI commands
> while the host is in the SHOST_RECOVERY state and because the error
> handler cannot make progress either.

Hi Bart

Did you consider something like:

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 7256e6c43ca6..dc83b38dfde9 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -7374,6 +7374,9 @@ static int ufshcd_eh_host_reset_handler(struct scsi_cmnd *cmd)
 
 	hba = shost_priv(cmd->device->host);
 
+	if (hba->pm_op_in_progress)
+		return FAST_IO_FAIL;
+
 	spin_lock_irqsave(hba->host->host_lock, flags);
 	hba->force_reset = true;
 	ufshcd_schedule_eh_work(hba);

> 
> * ufshcd_wl_runtime_resume() is waiting for blk_execute_rq() to finish
>   while it holds host_sem.
> * ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and the
>   latter function calls pm_runtime_resume().
> This is a deadlock because of the same reason as the previous scenario.
> 
> Fix both deadlocks by not obtaining host_sem from the power management
> code paths. Removing the host_sem locking from the power management code
> is safe because the ufshcd_err_handler() is already serialized against
> SCSI command execution.

The original commit for host_sem was aimed at sysfs (see commit below).
Did you consider how sysfs access is affected?

  commit 9cd20d3f473619d8d482551d15d4cebfb3ce73c8
  Author: Can Guo <cang@codeaurora.org>
  Date:   Wed Jan 13 19:13:28 2021 -0800

    scsi: ufs: Protect PM ops and err_handler from user access through sysfs
    
    User layer may access sysfs nodes when system PM ops or error handling is
    running. This can cause various problems. Rename eh_sem to host_sem and use
    it to protect PM ops and error handling from user layer intervention.

> 
> The ufshcd_rpm_get_sync() call at the start of
> ufshcd_err_handling_prepare() may deadlock since calling scsi_execute()
> is required by the UFS runtime resume implementation. Fixing that
> deadlock falls outside the scope of this patch.

Do you mean:

static void ufshcd_err_handling_prepare(struct ufs_hba *hba)
{
	ufshcd_rpm_get_sync(hba);

because that is the host controller, not the UFS device, that is
being resumed.

> 
> Cc: dh0421.hwang@samsung.com
> Cc: Asutosh Das <asutoshd@codeaurora.org>
> Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/ufs/core/ufshcd.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 7c15cbc737b4..cd3c2aa981c6 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -9254,16 +9254,13 @@ static int ufshcd_wl_suspend(struct device *dev)
>  	ktime_t start = ktime_get();
>  
>  	hba = shost_priv(sdev->host);
> -	down(&hba->host_sem);
>  
>  	if (pm_runtime_suspended(dev))
>  		goto out;
>  
>  	ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM);
> -	if (ret) {
> +	if (ret)
>  		dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__,  ret);
> -		up(&hba->host_sem);
> -	}
>  
>  out:
>  	if (!ret)
> @@ -9296,7 +9293,6 @@ static int ufshcd_wl_resume(struct device *dev)
>  		hba->curr_dev_pwr_mode, hba->uic_link_state);
>  	if (!ret)
>  		hba->is_sys_suspended = false;
> -	up(&hba->host_sem);
>  	return ret;
>  }
>  #endif
Bart Van Assche Sept. 19, 2022, 1:54 p.m. UTC | #3
On 9/19/22 04:34, Adrian Hunter wrote:
> Did you consider something like:
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 7256e6c43ca6..dc83b38dfde9 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -7374,6 +7374,9 @@ static int ufshcd_eh_host_reset_handler(struct scsi_cmnd *cmd)
>   
>   	hba = shost_priv(cmd->device->host);
>   
> +	if (hba->pm_op_in_progress)
> +		return FAST_IO_FAIL;
> +
>   	spin_lock_irqsave(hba->host->host_lock, flags);
>   	hba->force_reset = true;
>   	ufshcd_schedule_eh_work(hba);

The above change could cause error handling to be skipped if an error 
happened that requires the link to be reset. That seems wrong to me.

> The original commit for host_sem was aimed at sysfs (see commit below).
> Did you consider how sysfs access is affected?
> 
>    commit 9cd20d3f473619d8d482551d15d4cebfb3ce73c8
>    Author: Can Guo <cang@codeaurora.org>
>    Date:   Wed Jan 13 19:13:28 2021 -0800
> 
>      scsi: ufs: Protect PM ops and err_handler from user access through sysfs
>      
>      User layer may access sysfs nodes when system PM ops or error handling is
>      running. This can cause various problems. Rename eh_sem to host_sem and use
>      it to protect PM ops and error handling from user layer intervention.

The sysfs and debugfs attribute callback methods already call 
pm_runtime_get_sync() and pm_runtime_put_sync() so how could the power 
state change while a sysfs or debugfs attribute callback method is in 
progress?

>> The ufshcd_rpm_get_sync() call at the start of
>> ufshcd_err_handling_prepare() may deadlock since calling scsi_execute()
>> is required by the UFS runtime resume implementation. Fixing that
>> deadlock falls outside the scope of this patch.
> 
> Do you mean:
> 
> static void ufshcd_err_handling_prepare(struct ufs_hba *hba)
> {
> 	ufshcd_rpm_get_sync(hba);
> 
> because that is the host controller, not the UFS device, that is
> being resumed.

Hmm ... I think that ufshcd_rpm_get_sync() affects the power state of 
the UFS device and not the power state of the UFS host controller. From 
ufshcd-priv.h:

static inline int ufshcd_rpm_get_sync(struct ufs_hba *hba)
{
	return pm_runtime_get_sync(&hba->ufs_device_wlun->sdev_gendev);
}

Thanks,

Bart.
Adrian Hunter Sept. 19, 2022, 5:21 p.m. UTC | #4
On 19/09/22 16:54, Bart Van Assche wrote:
> On 9/19/22 04:34, Adrian Hunter wrote:
>> Did you consider something like:
>>
>> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
>> index 7256e6c43ca6..dc83b38dfde9 100644
>> --- a/drivers/ufs/core/ufshcd.c
>> +++ b/drivers/ufs/core/ufshcd.c
>> @@ -7374,6 +7374,9 @@ static int ufshcd_eh_host_reset_handler(struct scsi_cmnd *cmd)
>>         hba = shost_priv(cmd->device->host);
>>   +    if (hba->pm_op_in_progress)
>> +        return FAST_IO_FAIL;
>> +
>>       spin_lock_irqsave(hba->host->host_lock, flags);
>>       hba->force_reset = true;
>>       ufshcd_schedule_eh_work(hba);
> 
> The above change could cause error handling to be skipped if an error happened that requires the link to be reset. That seems wrong to me.

Hopefully a PM op with an error that really needed a host reset
would show up as a UFS error that the error handler could fix
successfully.

Alternatively, the same change but after scheduling the error handler?

> 
>> The original commit for host_sem was aimed at sysfs (see commit below).
>> Did you consider how sysfs access is affected?
>>
>>    commit 9cd20d3f473619d8d482551d15d4cebfb3ce73c8
>>    Author: Can Guo <cang@codeaurora.org>
>>    Date:   Wed Jan 13 19:13:28 2021 -0800
>>
>>      scsi: ufs: Protect PM ops and err_handler from user access through sysfs
>>           User layer may access sysfs nodes when system PM ops or error handling is
>>      running. This can cause various problems. Rename eh_sem to host_sem and use
>>      it to protect PM ops and error handling from user layer intervention.
> 
> The sysfs and debugfs attribute callback methods already call pm_runtime_get_sync() and pm_runtime_put_sync() so how could the power state change while a sysfs or debugfs attribute callback method is in progress?

Without PM holding host_sem, maybe it would give a similar
deadlock to what was described:

ufs_sysfs_read_desc_param
down(&hba->host_sem); <------------------------------------
ufshcd_rpm_get_sync(hba);
	waits for blk_execute_rq()
	waits for ufshcd_eh_host_reset_handler()
	waits for ufshcd_err_handler()
	waits for down(&hba->host_sem); <------------------

> 
>>> The ufshcd_rpm_get_sync() call at the start of
>>> ufshcd_err_handling_prepare() may deadlock since calling scsi_execute()
>>> is required by the UFS runtime resume implementation. Fixing that
>>> deadlock falls outside the scope of this patch.
>>
>> Do you mean:
>>
>> static void ufshcd_err_handling_prepare(struct ufs_hba *hba)
>> {
>>     ufshcd_rpm_get_sync(hba);
>>
>> because that is the host controller, not the UFS device, that is
>> being resumed.
> 
> Hmm ... I think that ufshcd_rpm_get_sync() affects the power state of the UFS device and not the power state of the UFS host controller. From ufshcd-priv.h:
> 
> static inline int ufshcd_rpm_get_sync(struct ufs_hba *hba)
> {
>     return pm_runtime_get_sync(&hba->ufs_device_wlun->sdev_gendev);
> }

Yes, I misread that, sorry.

I guess it goes unnoticed because it is very unlikely i.e. the UFS
device would need to be suspending but not yet have claimed host_sem.
There would not be any outstanding requests otherwise the suspend
would not have started, so chance of errors at that point is very low.

Maybe deadlock could be sidestepped by changing:

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 7256e6c43ca6..9cb04c6f8dc3 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -9258,7 +9261,10 @@ static int ufshcd_wl_suspend(struct device *dev)
 	ktime_t start = ktime_get();
 
 	hba = shost_priv(sdev->host);
-	down(&hba->host_sem);
+	if (down_trylock(&hba->host_sem)) {
+		ret = -EBUSY;
+		goto out;
+	}
 
 	if (pm_runtime_suspended(dev))
 		goto out;
Bart Van Assche Sept. 19, 2022, 11:17 p.m. UTC | #5
On 9/18/22 20:10, Asutosh Das (asd) wrote:
> Say, there's a PWR_FATAL error in ufshcd_wl_suspend().
> Wouldn't there be a scenario in which the suspend and error handler may 
> run simultaneously?

This is something I need to look further into.

> Do you see issues when that happens? How about when shutdown runs 
> simultaneously with error handler?

Hmm ... I don't think that my patch changes the interaction between the 
shutdown handler and the error handler?

Thanks,

Bart.
Bart Van Assche Sept. 19, 2022, 11:22 p.m. UTC | #6
On 9/19/22 10:21, Adrian Hunter wrote:
> I guess it goes unnoticed because it is very unlikely i.e. the UFS
> device would need to be suspending but not yet have claimed host_sem.
> There would not be any outstanding requests otherwise the suspend
> would not have started, so chance of errors at that point is very low.
> 
> Maybe deadlock could be sidestepped by changing:
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 7256e6c43ca6..9cb04c6f8dc3 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -9258,7 +9261,10 @@ static int ufshcd_wl_suspend(struct device *dev)
>   	ktime_t start = ktime_get();
>   
>   	hba = shost_priv(sdev->host);
> -	down(&hba->host_sem);
> +	if (down_trylock(&hba->host_sem)) {
> +		ret = -EBUSY;
> +		goto out;
> +	}
>   
>   	if (pm_runtime_suspended(dev))
>   		goto out;

Hi Adrian,

Unfortunately I don't think that the above change would help. My 
conclusion from the logs that I analyzed is that ufshcd_wl_suspend() is 
called first and also that the error handler is invoked while 
ufshcd_wl_suspend() holds host_sem, resulting in a deadlock.

Thanks,

Bart.
diff mbox series

Patch

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 7c15cbc737b4..cd3c2aa981c6 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -9254,16 +9254,13 @@  static int ufshcd_wl_suspend(struct device *dev)
 	ktime_t start = ktime_get();
 
 	hba = shost_priv(sdev->host);
-	down(&hba->host_sem);
 
 	if (pm_runtime_suspended(dev))
 		goto out;
 
 	ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM);
-	if (ret) {
+	if (ret)
 		dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__,  ret);
-		up(&hba->host_sem);
-	}
 
 out:
 	if (!ret)
@@ -9296,7 +9293,6 @@  static int ufshcd_wl_resume(struct device *dev)
 		hba->curr_dev_pwr_mode, hba->uic_link_state);
 	if (!ret)
 		hba->is_sys_suspended = false;
-	up(&hba->host_sem);
 	return ret;
 }
 #endif