diff mbox series

scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lock

Message ID 20230726125655.4197-1-dg573847474@gmail.com (mailing list archive)
State Accepted
Headers show
Series scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lock | expand

Commit Message

Chengfeng Ye July 26, 2023, 12:56 p.m. UTC
As &qedi_percpu->p_work_lock is acquired by hard irq qedi_msix_handler(),
other acquisition of the same lock under process context should disable
irq, otherwise deadlock could happen if the irq preempt the execution
while the lock is held in process context on the same CPU.

qedi_cpu_offline() is one such function acquires the lock on process
context.

[Deadlock Scenario]
qedi_cpu_offline()
    ->spin_lock(&p->p_work_lock)
        <irq>
        ->qedi_msix_handler()
        ->edi_process_completions()
        ->spin_lock_irqsave(&p->p_work_lock, flags); (deadlock here)

This flaw was found by an experimental static analysis tool I am
developing for irq-related deadlock.

The tentative patch fix the potential deadlock by spin_lock_irqsave()
under process context.

Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
---
 drivers/scsi/qedi/qedi_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Manish Rangankar Aug. 7, 2023, 12:20 p.m. UTC | #1
> -----Original Message-----
> From: Chengfeng Ye <dg573847474@gmail.com>
> Sent: Wednesday, July 26, 2023 6:27 PM
> To: Nilesh Javali <njavali@marvell.com>; Manish Rangankar
> <mrangankar@marvell.com>; GR-QLogic-Storage-Upstream <GR-QLogic-
> Storage-Upstream@marvell.com>; jejb@linux.ibm.com;
> martin.petersen@oracle.com
> Cc: linux-scsi@vger.kernel.org; linux-kernel@vger.kernel.org; Chengfeng Ye
> <dg573847474@gmail.com>
> Subject: [EXT] [PATCH] scsi: qedi: Fix potential deadlock on &qedi_percpu-
> >p_work_lock
> 
> External Email
> 
> ----------------------------------------------------------------------
> As &qedi_percpu->p_work_lock is acquired by hard irq qedi_msix_handler(),
> other acquisition of the same lock under process context should disable irq,
> otherwise deadlock could happen if the irq preempt the execution while the
> lock is held in process context on the same CPU.
> 
> qedi_cpu_offline() is one such function acquires the lock on process context.
> 
> [Deadlock Scenario]
> qedi_cpu_offline()
>     ->spin_lock(&p->p_work_lock)
>         <irq>
>         ->qedi_msix_handler()
>         ->edi_process_completions()
>         ->spin_lock_irqsave(&p->p_work_lock, flags); (deadlock here)
> 
> This flaw was found by an experimental static analysis tool I am developing
> for irq-related deadlock.
> 
> The tentative patch fix the potential deadlock by spin_lock_irqsave() under
> process context.
> 
> Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
> ---
>  drivers/scsi/qedi/qedi_main.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
> index 450522b204d6..77a56a136678 100644
> --- a/drivers/scsi/qedi/qedi_main.c
> +++ b/drivers/scsi/qedi/qedi_main.c
> @@ -1976,8 +1976,9 @@ static int qedi_cpu_offline(unsigned int cpu)
>  	struct qedi_percpu_s *p = this_cpu_ptr(&qedi_percpu);
>  	struct qedi_work *work, *tmp;
>  	struct task_struct *thread;
> +	unsigned long flags;
> 
> -	spin_lock_bh(&p->p_work_lock);
> +	spin_lock_irqsave(&p->p_work_lock, flags);
>  	thread = p->iothread;
>  	p->iothread = NULL;
> 
> @@ -1988,7 +1989,7 @@ static int qedi_cpu_offline(unsigned int cpu)
>  			kfree(work);
>  	}
> 
> -	spin_unlock_bh(&p->p_work_lock);
> +	spin_unlock_irqrestore(&p->p_work_lock, flags);
>  	if (thread)
>  		kthread_stop(thread);
>  	return 0;
> --
> 2.17.1


Thanks for the patch,

Acked-by: Manish Rangankar <mrangankar@marvell.com>
Martin K. Petersen Aug. 8, 2023, 2:43 a.m. UTC | #2
On Wed, 26 Jul 2023 12:56:55 +0000, Chengfeng Ye wrote:

> As &qedi_percpu->p_work_lock is acquired by hard irq qedi_msix_handler(),
> other acquisition of the same lock under process context should disable
> irq, otherwise deadlock could happen if the irq preempt the execution
> while the lock is held in process context on the same CPU.
> 
> qedi_cpu_offline() is one such function acquires the lock on process
> context.
> 
> [...]

Applied to 6.5/scsi-fixes, thanks!

[1/1] scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lock
      https://git.kernel.org/mkp/scsi/c/dd64f8058719
Chengfeng Ye Aug. 8, 2023, 6:23 a.m. UTC | #3
Thanks for the review.

Best,
Chengfeng
diff mbox series

Patch

diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
index 450522b204d6..77a56a136678 100644
--- a/drivers/scsi/qedi/qedi_main.c
+++ b/drivers/scsi/qedi/qedi_main.c
@@ -1976,8 +1976,9 @@  static int qedi_cpu_offline(unsigned int cpu)
 	struct qedi_percpu_s *p = this_cpu_ptr(&qedi_percpu);
 	struct qedi_work *work, *tmp;
 	struct task_struct *thread;
+	unsigned long flags;
 
-	spin_lock_bh(&p->p_work_lock);
+	spin_lock_irqsave(&p->p_work_lock, flags);
 	thread = p->iothread;
 	p->iothread = NULL;
 
@@ -1988,7 +1989,7 @@  static int qedi_cpu_offline(unsigned int cpu)
 			kfree(work);
 	}
 
-	spin_unlock_bh(&p->p_work_lock);
+	spin_unlock_irqrestore(&p->p_work_lock, flags);
 	if (thread)
 		kthread_stop(thread);
 	return 0;