[v3] scsi: ufs: core: fix racing issue during ufshcd_mcq_abort

Message ID 20231121071128.7743-1-hy50.seo@samsung.com (mailing list archive)
State Changes Requested

Commit Message

SEO HOYOUNG Nov. 21, 2023, 7:11 a.m. UTC
If a CQ completion IRQ is raised during abort processing,
the command has already been completed by the time it is cleared.
Because the command whose abort was being handled has already completed,
the utag needed to clear the cmd cannot be obtained, as in the log below.

ufshcd_try_to_abort_task: cmd pending in the device. tag = 25
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000194
Mem abort info:
ESR = 0x0000000096000006
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x06: level 2 translation fault
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0

pc : blk_mq_unique_tag+0x8/0x14
lr : ufshcd_mcq_sq_cleanup+0x6c/0x1b8
sp : ffffffc03e3b3b10
x29: ffffffc03e3b3b10 x28: 0000000000000001 x27: ffffff8830b34f68
x26: ffffff8830b34f6c x25: ffffff8830b34040 x24: 0000000000000000
x23: 0000000000000f18 x22: ffffffc03e3b3bb8 x21: 0000000000000019
x20: 0000000000000019 x19: ffffff8830b309b0 x18: ffffffc00a1b5380
x17: 00000000529c6ef0 x16: 00000000529c6ef0 x15: 0000000000000000
x14: 0000000000000010 x13: 0000000000000032 x12: 0000001169e8a5bc
x11: 0000000000000001 x10: ffffff885dfc1588 x9 : 0000000000000019
x8 : 0000000000000000 x7 : 0000000000000001 x6 : fffffffdef706f28
x5 : 000000000000283d x4 : 0000000000000001 x3 : 0000000000000000
x2 : 0000000000000003 x1 : 0000000000000019 x0 : ffffff8855781200
Call trace:
blk_mq_unique_tag+0x8/0x14
ufshcd_clear_cmd+0x34/0x118
ufshcd_try_to_abort_task+0x1c4/0x4b0
ufshcd_err_handler+0x8d0/0xd24
process_one_work+0x1e4/0x43c
worker_thread+0x25c/0x430
kthread+0x104/0x1d4
ret_from_fork+0x10/0x20

v1 -> v2: fix build error

v2 -> v3: move to ufshcd_mcq_sq_cleanup() function

Bart said that lrbp->cmd could be changed before ufshcd_clear_cmd() was
called, so lrbp->cmd check was moved to ufshcd_clear_cmd().
In the case of legacy mode, spin_lock is used to protect before clear cmd,
but spin_lock cannot be used due to mcq mode, so it is necessary to check
the status of lrbp->cmd.

Change-Id: Id8412190e60286d00a30820591566835cefbf47e
Signed-off-by: SEO HOYOUNG <hy50.seo@samsung.com>
---
 drivers/ufs/core/ufs-mcq.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Bart Van Assche Nov. 21, 2023, 5:57 p.m. UTC | #1
On 11/20/23 23:11, SEO HOYOUNG wrote:
> Bart said that lrbp->cmd could be changed before ufshcd_clear_cmd() was
> called, so lrbp->cmd check was moved to ufshcd_clear_cmd().
> In the case of legacy mode, spin_lock is used to protect before clear cmd,
> but spin_lock cannot be used due to mcq mode, so it is necessary to check
> the status of lrbp->cmd.

Does this mean that the race that I mentioned has not been addressed at all?
ufshcd_mcq_sq_cleanup() is called by ufshcd_clear_cmd(). No locks are held by
ufshcd_eh_device_reset_handler() when it calls ufshcd_clear_cmd(). So I think
there is still a race between the code added by this patch and the completion
interrupt.
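
To make the check-then-use window concrete, here is a minimal user-space sketch that replays one fixed interleaving. It assumes a simplified model: `model_cmd`, `model_lrb`, and every `model_*` function are hypothetical stand-ins, not the kernel's structures or the ufshcd API. The completion is injected by hand between the check and the use, so the outcome is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's structures. */
struct model_cmd { int tag; };
struct model_lrb { struct model_cmd *cmd; };

/* Models the completion interrupt: the slot's command pointer is cleared. */
static void model_complete(struct model_lrb *lrbp)
{
	lrbp->cmd = NULL;
}

/* Step 1 of the abort path: is a command still attached to the slot? */
static bool model_check_inflight(const struct model_lrb *lrbp)
{
	return lrbp->cmd != NULL;
}

/*
 * Step 2 of the abort path: use the command.
 * Returns the tag, or -1 if the command vanished after the check.
 */
static int model_use_cmd(const struct model_lrb *lrbp)
{
	return lrbp->cmd ? lrbp->cmd->tag : -1;
}

/*
 * Replays one interleaving: check, optional completion, then use.
 * Returns 0 when the check says there is nothing to clean up.
 */
static int model_abort(struct model_lrb *lrbp, bool irq_between)
{
	if (!model_check_inflight(lrbp))
		return 0;
	if (irq_between)
		model_complete(lrbp); /* completion lands inside the window */
	return model_use_cmd(lrbp);
}
```

With a command of tag 25 attached, `model_abort(&lrb, false)` returns 25, while `model_abort(&lrb, true)` returns -1 even though the check passed. An unlocked check narrows the window but does not close it in this model.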

Thanks,

Bart.

> Change-Id: Id8412190e60286d00a30820591566835cefbf47e

No Change-Ids in patches that are posted on upstream mailing lists please.

> diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c
> index 2ba8ec254dce..deb6dac724c8 100644
> --- a/drivers/ufs/core/ufs-mcq.c
> +++ b/drivers/ufs/core/ufs-mcq.c
> @@ -507,6 +507,10 @@ int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag)
>   	if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
>   		return -ETIMEDOUT;
>   
> +	if (!ufshcd_cmd_inflight(cmd) ||
> +	    test_bit(SCMD_STATE_COMPLETE, &cmd->state))
> +		return 0;
> +
>   	if (task_tag != hba->nutrs - UFSHCD_NUM_RESERVED) {
>   		if (!cmd)
>   			return -EINVAL;

Thanks,

Bart.
Dan Carpenter Nov. 22, 2023, 9:23 a.m. UTC | #2
Hi SEO,

kernel test robot noticed the following build warnings:

https://git-scm.com/docs/git-format-patch#_base_tree_information

url:    https://github.com/intel-lab-lkp/linux/commits/SEO-HOYOUNG/scsi-ufs-core-fix-racing-issue-during-ufshcd_mcq_abort/20231121-151923
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
patch link:    https://lore.kernel.org/r/20231121071128.7743-1-hy50.seo%40samsung.com
patch subject: [PATCH v3] scsi: ufs: core: fix racing issue during ufshcd_mcq_abort
config: powerpc-randconfig-r071-20231122 (https://download.01.org/0day-ci/archive/20231122/202311220618.OnEhSic6-lkp@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 13.2.0
reproduce: (https://download.01.org/0day-ci/archive/20231122/202311220618.OnEhSic6-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <error27@gmail.com>
| Closes: https://lore.kernel.org/r/202311220618.OnEhSic6-lkp@intel.com/

smatch warnings:
drivers/ufs/core/ufs-mcq.c:515 ufshcd_mcq_sq_cleanup() warn: variable dereferenced before check 'cmd' (see line 511)

vim +/cmd +515 drivers/ufs/core/ufs-mcq.c

8d7290348992f2 Bao D. Nguyen   2023-05-29  498  int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag)
8d7290348992f2 Bao D. Nguyen   2023-05-29  499  {
8d7290348992f2 Bao D. Nguyen   2023-05-29  500  	struct ufshcd_lrb *lrbp = &hba->lrb[task_tag];
8d7290348992f2 Bao D. Nguyen   2023-05-29  501  	struct scsi_cmnd *cmd = lrbp->cmd;
8d7290348992f2 Bao D. Nguyen   2023-05-29  502  	struct ufs_hw_queue *hwq;
8d7290348992f2 Bao D. Nguyen   2023-05-29  503  	void __iomem *reg, *opr_sqd_base;
8d7290348992f2 Bao D. Nguyen   2023-05-29  504  	u32 nexus, id, val;
8d7290348992f2 Bao D. Nguyen   2023-05-29  505  	int err;
8d7290348992f2 Bao D. Nguyen   2023-05-29  506  
aa9d5d0015a8b7 Po-Wen Kao      2023-06-12  507  	if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
aa9d5d0015a8b7 Po-Wen Kao      2023-06-12  508  		return -ETIMEDOUT;
aa9d5d0015a8b7 Po-Wen Kao      2023-06-12  509  
5363c9d813101c SEO HOYOUNG     2023-11-21  510  	if (!ufshcd_cmd_inflight(cmd) ||
5363c9d813101c SEO HOYOUNG     2023-11-21 @511  	    test_bit(SCMD_STATE_COMPLETE, &cmd->state))
                                                                                          ^^^^^^^^^^^
The patch adds a new unchecked dereference

5363c9d813101c SEO HOYOUNG     2023-11-21  512  		return 0;
5363c9d813101c SEO HOYOUNG     2023-11-21  513  
8d7290348992f2 Bao D. Nguyen   2023-05-29  514  	if (task_tag != hba->nutrs - UFSHCD_NUM_RESERVED) {
8d7290348992f2 Bao D. Nguyen   2023-05-29 @515  		if (!cmd)
                                                                     ^^^
But the old code assumed "cmd" could be NULL

8d7290348992f2 Bao D. Nguyen   2023-05-29  516  			return -EINVAL;
8d7290348992f2 Bao D. Nguyen   2023-05-29  517  		hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(cmd));
8d7290348992f2 Bao D. Nguyen   2023-05-29  518  	} else {
8d7290348992f2 Bao D. Nguyen   2023-05-29  519  		hwq = hba->dev_cmd_queue;
8d7290348992f2 Bao D. Nguyen   2023-05-29  520  	}
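
One possible ordering that would avoid the smatch warning above is to test `cmd` for NULL before reading any of its fields. The sketch below is a user-space model only: `model_cmd`, `MODEL_STATE_COMPLETE`, and `model_sq_cleanup()` are hypothetical names, the reserved-tag branch is reduced to a boolean, and it is not the fix that was eventually applied upstream. It also does nothing about the race with the completion interrupt raised in comment #1.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_STATE_COMPLETE (1UL << 0) /* hypothetical "completed" flag bit */

/* Hypothetical stand-in for the command; NOT struct scsi_cmnd. */
struct model_cmd {
	unsigned long state;
	bool inflight;
};

/*
 * Returns -EINVAL for a missing command on a non-reserved tag,
 * 0 when the command is no longer in flight or already completed,
 * and 1 when the real function would go on to clean up the queue.
 */
static int model_sq_cleanup(const struct model_cmd *cmd, bool reserved_tag)
{
	if (!reserved_tag) {
		/* NULL check first, so the reads below are safe. */
		if (!cmd)
			return -EINVAL;
		if (!cmd->inflight || (cmd->state & MODEL_STATE_COMPLETE))
			return 0;
	}
	/* Reserved tag: cmd may legitimately be NULL and is never read. */
	return 1;
}
```

In this model a NULL `cmd` on a normal tag yields `-EINVAL` as the old code did, and a NULL `cmd` on the reserved tag still reaches the cleanup path without being dereferenced.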
Patch

diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c
index 2ba8ec254dce..deb6dac724c8 100644
--- a/drivers/ufs/core/ufs-mcq.c
+++ b/drivers/ufs/core/ufs-mcq.c
@@ -507,6 +507,10 @@  int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag)
 	if (hba->quirks & UFSHCD_QUIRK_MCQ_BROKEN_RTC)
 		return -ETIMEDOUT;
 
+	if (!ufshcd_cmd_inflight(cmd) ||
+	    test_bit(SCMD_STATE_COMPLETE, &cmd->state))
+		return 0;
+
 	if (task_tag != hba->nutrs - UFSHCD_NUM_RESERVED) {
 		if (!cmd)
 			return -EINVAL;