Message ID | 20250311195340.2358368-1-bvanassche@acm.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | scsi: ufs: core: Fix a race condition related to device commands | expand |
Hi, > There is a TOCTOU race in ufshcd_compl_one_cqe(): hba->dev_cmd.complete > may be cleared from another thread after it has been checked and before it is > used. Fix this race by moving the device command completion from the stack of > the device command submitter into struct ufs_hba. This patch fixes the following > kernel crash: Can you elaborate how this is possible if there is a single tag for device management commands, And it is obtained under lock? And why making the completion structure persistent beyond the function's scope solves the problem? Thanks, Avri > > Unable to handle kernel NULL pointer dereference at virtual address > 0000000000000008 Call trace: > _raw_spin_lock_irqsave+0x34/0x80 > complete+0x24/0xb8 > ufshcd_compl_one_cqe+0x13c/0x4f0 > ufshcd_mcq_poll_cqe_lock+0xb4/0x108 > ufshcd_intr+0x2f4/0x444 > __handle_irq_event_percpu+0xbc/0x250 > handle_irq_event+0x48/0xb0 > > Fixes: 5a0b0cb9bee7 ("[SCSI] ufs: Add support for sending NOP OUT UPIU") > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > --- > drivers/ufs/core/ufshcd.c | 24 +++++------------------- > include/ufs/ufshcd.h | 2 +- > 2 files changed, 6 insertions(+), 20 deletions(-) > > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index > 4e1e214fc5a2..23ba3f540f27 100644 > --- a/drivers/ufs/core/ufshcd.c > +++ b/drivers/ufs/core/ufshcd.c > @@ -3176,16 +3176,10 @@ static int ufshcd_wait_for_dev_cmd(struct ufs_hba > *hba, > int err; > > retry: > - time_left = wait_for_completion_timeout(hba->dev_cmd.complete, > + time_left = wait_for_completion_timeout(&hba->dev_cmd.complete, > time_left); > > if (likely(time_left)) { > - /* > - * The completion handler called complete() and the caller of > - * this function still owns the @lrbp tag so the code below does > - * not trigger any race conditions. > - */ > - hba->dev_cmd.complete = NULL; > err = ufshcd_get_tr_ocs(lrbp, NULL); > if (!err) > err = ufshcd_dev_cmd_completion(hba, lrbp); @@ - > 3199,7 +3193,6 @@ static int ufshcd_wait_for_dev_cmd(struct ufs_hba *hba, > /* successfully cleared the command, retry if needed */ > if (ufshcd_clear_cmd(hba, lrbp->task_tag) == 0) > err = -EAGAIN; > - hba->dev_cmd.complete = NULL; > return err; > } > > @@ -3215,11 +3208,9 @@ static int ufshcd_wait_for_dev_cmd(struct ufs_hba > *hba, > spin_lock_irqsave(&hba->outstanding_lock, flags); > pending = test_bit(lrbp->task_tag, > &hba->outstanding_reqs); > - if (pending) { > - hba->dev_cmd.complete = NULL; > + if (pending) > __clear_bit(lrbp->task_tag, > &hba->outstanding_reqs); > - } > spin_unlock_irqrestore(&hba->outstanding_lock, flags); > > if (!pending) { > @@ -3237,8 +3228,6 @@ static int ufshcd_wait_for_dev_cmd(struct ufs_hba > *hba, > spin_lock_irqsave(&hba->outstanding_lock, flags); > pending = test_bit(lrbp->task_tag, > &hba->outstanding_reqs); > - if (pending) > - hba->dev_cmd.complete = NULL; > spin_unlock_irqrestore(&hba->outstanding_lock, flags); > > if (!pending) { > @@ -3272,13 +3261,10 @@ static void ufshcd_dev_man_unlock(struct ufs_hba > *hba) static int ufshcd_issue_dev_cmd(struct ufs_hba *hba, struct ufshcd_lrb > *lrbp, > const u32 tag, int timeout) > { > - DECLARE_COMPLETION_ONSTACK(wait); > int err; > > - hba->dev_cmd.complete = &wait; > - > ufshcd_add_query_upiu_trace(hba, UFS_QUERY_SEND, lrbp- > >ucd_req_ptr); > - > + init_completion(&hba->dev_cmd.complete); > ufshcd_send_command(hba, tag, hba->dev_cmd_queue); > err = ufshcd_wait_for_dev_cmd(hba, lrbp, timeout); > > @@ -5585,12 +5571,12 @@ void ufshcd_compl_one_cqe(struct ufs_hba *hba, > int task_tag, > ufshcd_release_scsi_cmd(hba, lrbp); > /* Do not touch lrbp after scsi done */ > scsi_done(cmd); > - } else if (hba->dev_cmd.complete) { > + } else { > if (cqe) { > ocs = le32_to_cpu(cqe->status) & MASK_OCS; > lrbp->utr_descriptor_ptr->header.ocs = ocs; > } > - complete(hba->dev_cmd.complete); > + complete(&hba->dev_cmd.complete); > } > } > > diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h index > e3909cc691b2..f56050ce9445 100644 > --- a/include/ufs/ufshcd.h > +++ b/include/ufs/ufshcd.h > @@ -246,7 +246,7 @@ struct ufs_query { > struct ufs_dev_cmd { > enum dev_cmd_type type; > struct mutex lock; > - struct completion *complete; > + struct completion complete; > struct ufs_query query; > }; >
On Tue, 2025-03-11 at 12:53 -0700, Bart Van Assche wrote: > > @@ -3272,13 +3261,10 @@ static void ufshcd_dev_man_unlock(struct > ufs_hba *hba) > static int ufshcd_issue_dev_cmd(struct ufs_hba *hba, struct > ufshcd_lrb *lrbp, > const u32 tag, int timeout) > { > - DECLARE_COMPLETION_ONSTACK(wait); > int err; > > - hba->dev_cmd.complete = &wait; > - > ufshcd_add_query_upiu_trace(hba, UFS_QUERY_SEND, lrbp- > >ucd_req_ptr); > - > + init_completion(&hba->dev_cmd.complete); > ufshcd_send_command(hba, tag, hba->dev_cmd_queue); > err = ufshcd_wait_for_dev_cmd(hba, lrbp, timeout); > Hi Bart, This could calling init_completion on the same completion twice? Thanks. Peter
On 3/12/25 12:33 AM, Avri Altman wrote: > Hi, >> There is a TOCTOU race in ufshcd_compl_one_cqe(): hba->dev_cmd.complete >> may be cleared from another thread after it has been checked and before it is >> used. Fix this race by moving the device command completion from the stack of >> the device command submitter into struct ufs_hba. This patch fixes the following >> kernel crash: > > Can you elaborate how this is possible if there is a single tag for device management commands, > And it is obtained under lock? Which lock? The code that submits device management commands is serialized by the hba->dev_cmd.lock mutex while ufshcd_compl_one_cqe() calls are serialized by the hwq->cq_lock spinlock in case of MCQ. I'm not aware of any single synchronization object that serializes all hba->dev_cmd.complete accesses. > And why making the completion structure persistent beyond the function's scope solves the problem? Without my patch, hba->dev_cmd.complete is sometimes set and sometimes NULL. My patch ensures that the pointer passed to the complete() call is never NULL. Does this answer your question? Thanks, Bart.
On 3/12/25 12:40 AM, Peter Wang (王信友) wrote: > On Tue, 2025-03-11 at 12:53 -0700, Bart Van Assche wrote: >> >> @@ -3272,13 +3261,10 @@ static void ufshcd_dev_man_unlock(struct >> ufs_hba *hba) >> static int ufshcd_issue_dev_cmd(struct ufs_hba *hba, struct >> ufshcd_lrb *lrbp, >> const u32 tag, int timeout) >> { >> - DECLARE_COMPLETION_ONSTACK(wait); >> int err; >> >> - hba->dev_cmd.complete = &wait; >> - >> ufshcd_add_query_upiu_trace(hba, UFS_QUERY_SEND, lrbp- >>> ucd_req_ptr); >> - >> + init_completion(&hba->dev_cmd.complete); >> ufshcd_send_command(hba, tag, hba->dev_cmd_queue); >> err = ufshcd_wait_for_dev_cmd(hba, lrbp, timeout); >> > > Hi Bart, > > This could calling init_completion on the same completion twice? Hi Peter, My patch will cause init_completion() to be called as many times as device management commands are submitted. As far as I know the following sequence is allowed and does not trigger any race conditions: Thread 1 Thread 2 -------- -------- init_completion() wait_for_completion_timeout() is called complete() wait_for_completion_timeout() returns init_completion() wait_for_completion_timeout() is called complete() wait_for_completion_timeout() returns [ ... ] Thanks, Bart.
> > On 3/12/25 12:33 AM, Avri Altman wrote: > > Hi, > >> There is a TOCTOU race in ufshcd_compl_one_cqe(): hba->dev_cmd.complete > >> may be cleared from another thread after it has been checked and before it is > >> used. Fix this race by moving the device command completion from the stack > of > >> the device command submitter into struct ufs_hba. This patch fixes the > following > >> kernel crash: > > > > Can you elaborate how this is possible if there is a single tag for device > management commands, > > And it is obtained under lock? > > Which lock? The code that submits device management commands is > serialized by the hba->dev_cmd.lock mutex while ufshcd_compl_one_cqe() > calls are serialized by the hwq->cq_lock spinlock in case of MCQ. I'm > not aware of any single synchronization object that serializes all > hba->dev_cmd.complete accesses. > > > And why making the completion structure persistent beyond the function's > scope solves the problem? > > Without my patch, hba->dev_cmd.complete is sometimes set and sometimes > NULL. My patch ensures that the pointer passed to the complete() call > is never NULL. Does this answer your question? I was hoping for a root-cause analysis explaining why hba->dev_cmd.complete is sometimes set and sometimes NULL. Thanks, Avri > > Thanks, > > Bart.
On 3/12/25 7:25 AM, Avri Altman wrote: > I was hoping for a root-cause analysis explaining why hba- > >dev_cmd.complete is sometimes set and sometimes NULL. Hi Avri, Hasn't that already been explained in the patch description? It's not clear what additional information you need other than what has already been mentioned in the patch description. Thanks, Bart.
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index 4e1e214fc5a2..23ba3f540f27 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -3176,16 +3176,10 @@ static int ufshcd_wait_for_dev_cmd(struct ufs_hba *hba, int err; retry: - time_left = wait_for_completion_timeout(hba->dev_cmd.complete, + time_left = wait_for_completion_timeout(&hba->dev_cmd.complete, time_left); if (likely(time_left)) { - /* - * The completion handler called complete() and the caller of - * this function still owns the @lrbp tag so the code below does - * not trigger any race conditions. - */ - hba->dev_cmd.complete = NULL; err = ufshcd_get_tr_ocs(lrbp, NULL); if (!err) err = ufshcd_dev_cmd_completion(hba, lrbp); @@ -3199,7 +3193,6 @@ static int ufshcd_wait_for_dev_cmd(struct ufs_hba *hba, /* successfully cleared the command, retry if needed */ if (ufshcd_clear_cmd(hba, lrbp->task_tag) == 0) err = -EAGAIN; - hba->dev_cmd.complete = NULL; return err; } @@ -3215,11 +3208,9 @@ static int ufshcd_wait_for_dev_cmd(struct ufs_hba *hba, spin_lock_irqsave(&hba->outstanding_lock, flags); pending = test_bit(lrbp->task_tag, &hba->outstanding_reqs); - if (pending) { - hba->dev_cmd.complete = NULL; + if (pending) __clear_bit(lrbp->task_tag, &hba->outstanding_reqs); - } spin_unlock_irqrestore(&hba->outstanding_lock, flags); if (!pending) { @@ -3237,8 +3228,6 @@ static int ufshcd_wait_for_dev_cmd(struct ufs_hba *hba, spin_lock_irqsave(&hba->outstanding_lock, flags); pending = test_bit(lrbp->task_tag, &hba->outstanding_reqs); - if (pending) - hba->dev_cmd.complete = NULL; spin_unlock_irqrestore(&hba->outstanding_lock, flags); if (!pending) { @@ -3272,13 +3261,10 @@ static void ufshcd_dev_man_unlock(struct ufs_hba *hba) static int ufshcd_issue_dev_cmd(struct ufs_hba *hba, struct ufshcd_lrb *lrbp, const u32 tag, int timeout) { - DECLARE_COMPLETION_ONSTACK(wait); int err; - hba->dev_cmd.complete = &wait; - ufshcd_add_query_upiu_trace(hba, UFS_QUERY_SEND, lrbp->ucd_req_ptr); - + init_completion(&hba->dev_cmd.complete); ufshcd_send_command(hba, tag, hba->dev_cmd_queue); err = ufshcd_wait_for_dev_cmd(hba, lrbp, timeout); @@ -5585,12 +5571,12 @@ void ufshcd_compl_one_cqe(struct ufs_hba *hba, int task_tag, ufshcd_release_scsi_cmd(hba, lrbp); /* Do not touch lrbp after scsi done */ scsi_done(cmd); - } else if (hba->dev_cmd.complete) { + } else { if (cqe) { ocs = le32_to_cpu(cqe->status) & MASK_OCS; lrbp->utr_descriptor_ptr->header.ocs = ocs; } - complete(hba->dev_cmd.complete); + complete(&hba->dev_cmd.complete); } } diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h index e3909cc691b2..f56050ce9445 100644 --- a/include/ufs/ufshcd.h +++ b/include/ufs/ufshcd.h @@ -246,7 +246,7 @@ struct ufs_query { struct ufs_dev_cmd { enum dev_cmd_type type; struct mutex lock; - struct completion *complete; + struct completion complete; struct ufs_query query; };
There is a TOCTOU race in ufshcd_compl_one_cqe(): hba->dev_cmd.complete may be cleared from another thread after it has been checked and before it is used. Fix this race by moving the device command completion from the stack of the device command submitter into struct ufs_hba. This patch fixes the following kernel crash: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 Call trace: _raw_spin_lock_irqsave+0x34/0x80 complete+0x24/0xb8 ufshcd_compl_one_cqe+0x13c/0x4f0 ufshcd_mcq_poll_cqe_lock+0xb4/0x108 ufshcd_intr+0x2f4/0x444 __handle_irq_event_percpu+0xbc/0x250 handle_irq_event+0x48/0xb0 Fixes: 5a0b0cb9bee7 ("[SCSI] ufs: Add support for sending NOP OUT UPIU") Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- drivers/ufs/core/ufshcd.c | 24 +++++------------------- include/ufs/ufshcd.h | 2 +- 2 files changed, 6 insertions(+), 20 deletions(-)