diff mbox series

[v1] scsi: ufs: fix tm cmd timeout/ISR racing issue

Message ID 20211111094939.14991-1-peter.wang@mediatek.com (mailing list archive)
State New, archived
Headers show
Series [v1] scsi: ufs: fix tm cmd timeout/ISR racing issue | expand

Commit Message

Peter Wang (王信友) Nov. 11, 2021, 9:49 a.m. UTC
From: Peter Wang <peter.wang@mediatek.com>

When tmc 100 ms timeout and recevied tmc complete ISR concurrently,
Bug happen because complete NULL poiner and KE.
Fix this racing issue by check NULL and use host_lock protect.

Signed-off-by: Peter Wang <peter.wang@mediatek.com>
---
 drivers/scsi/ufs/ufshcd.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Bart Van Assche Nov. 15, 2021, 7:49 p.m. UTC | #1
On 11/11/21 1:49 AM, peter.wang@mediatek.com wrote:
> From: Peter Wang <peter.wang@mediatek.com>
> 
> When tmc 100 ms timeout and recevied tmc complete ISR concurrently,
> Bug happen because complete NULL poiner and KE.
> Fix this racing issue by check NULL and use host_lock protect.
> 
> Signed-off-by: Peter Wang <peter.wang@mediatek.com>
> ---
>   drivers/scsi/ufs/ufshcd.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 5c6a58a666d2..6821ceb6783e 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -6442,7 +6442,8 @@ static irqreturn_t ufshcd_tmc_handler(struct ufs_hba *hba)
>   		struct request *req = hba->tmf_rqs[tag];
>   		struct completion *c = req->end_io_data;
>   
> -		complete(c);
> +		if (c)
> +			complete(c);
>   		ret = IRQ_HANDLED;
>   	}
>   	spin_unlock_irqrestore(hba->host->host_lock, flags);
> @@ -6597,7 +6598,10 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
>   		 * Make sure that ufshcd_compl_tm() does not trigger a
>   		 * use-after-free.
>   		 */
> +		spin_lock_irqsave(hba->host->host_lock, flags);
>   		req->end_io_data = NULL;
> +		spin_unlock_irqrestore(hba->host->host_lock, flags);
> +
>   		ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
>   		dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
>   				__func__, tm_function);

Isn't this already addressed by Adrian Hunter's patches? See also
https://lore.kernel.org/linux-scsi/20211108064815.569494-1-adrian.hunter@intel.com/

Thanks,

Bart.
Bart Van Assche Nov. 16, 2021, 5:28 p.m. UTC | #2
On 11/15/21 22:57, Peter Wang wrote:
> By the way, we observe that 100ms TMC timeout value may not enough for
> some device, maybe we need enlarge this value?

Is that the TM_CMD_TIMEOUT constant? It surprises me that 100 ms is not 
enough. Will increasing that constant have a negative impact on the 
error handler in case it hits a task management timeout?

Thanks,

Bart.
diff mbox series

Patch

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5c6a58a666d2..6821ceb6783e 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6442,7 +6442,8 @@  static irqreturn_t ufshcd_tmc_handler(struct ufs_hba *hba)
 		struct request *req = hba->tmf_rqs[tag];
 		struct completion *c = req->end_io_data;
 
-		complete(c);
+		if (c)
+			complete(c);
 		ret = IRQ_HANDLED;
 	}
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
@@ -6597,7 +6598,10 @@  static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba,
 		 * Make sure that ufshcd_compl_tm() does not trigger a
 		 * use-after-free.
 		 */
+		spin_lock_irqsave(hba->host->host_lock, flags);
 		req->end_io_data = NULL;
+		spin_unlock_irqrestore(hba->host->host_lock, flags);
+
 		ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR);
 		dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n",
 				__func__, tm_function);