Message ID | 20230313043711.13500-2-njavali@marvell.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 0367076b0817d5c75dfb83001ce7ce5c64d803a9 |
Headers | show |
Series | qla2xxx driver fixes | expand |
> On Mar 12, 2023, at 9:37 PM, Nilesh Javali <njavali@marvell.com> wrote: > > While adding and removing the controller, call trace was observed, > > WARNING: CPU: 3 PID: 623596 at kernel/dma/mapping.c:532 dma_free_attrs+0x33/0x50 > CPU: 3 PID: 623596 Comm: sh Kdump: loaded Not tainted 5.14.0-96.el9.x86_64 #1 > RIP: 0010:dma_free_attrs+0x33/0x50 > > Call Trace: > qla2x00_async_sns_sp_done+0x107/0x1b0 [qla2xxx] > qla2x00_abort_srb+0x8e/0x250 [qla2xxx] > ? ql_dbg+0x70/0x100 [qla2xxx] > __qla2x00_abort_all_cmds+0x108/0x190 [qla2xxx] > qla2x00_abort_all_cmds+0x24/0x70 [qla2xxx] > qla2x00_abort_isp_cleanup+0x305/0x3e0 [qla2xxx] > qla2x00_remove_one+0x364/0x400 [qla2xxx] > pci_device_remove+0x36/0xa0 > __device_release_driver+0x17a/0x230 > device_release_driver+0x24/0x30 > pci_stop_bus_device+0x68/0x90 > pci_stop_and_remove_bus_device_locked+0x16/0x30 > remove_store+0x75/0x90 > kernfs_fop_write_iter+0x11c/0x1b0 > new_sync_write+0x11f/0x1b0 > vfs_write+0x1eb/0x280 > ksys_write+0x5f/0xe0 > do_syscall_64+0x5c/0x80 > ? do_user_addr_fault+0x1d8/0x680 > ? do_syscall_64+0x69/0x80 > ? exc_page_fault+0x62/0x140 > ? asm_exc_page_fault+0x8/0x30 > entry_SYSCALL_64_after_hwframe+0x44/0xae > > The command was completed in abort path during driver unload > with a lock held causing the warning in abort path. > Hence complete the command without any lock held. > > Reported-by: Lin Li <lilin@redhat.com> > Tested-by: Lin Li <lilin@redhat.com> > Cc: stable@vger.kernel.org > Signed-off-by: Nilesh Javali <njavali@marvell.com> > --- > drivers/scsi/qla2xxx/qla_os.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c > index 545167627e48..d2f6c0546587 100644 > --- a/drivers/scsi/qla2xxx/qla_os.c > +++ b/drivers/scsi/qla2xxx/qla_os.c > @@ -1858,6 +1858,17 @@ __qla2x00_abort_all_cmds(struct qla_qpair *qp, int res) > for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) { > sp = req->outstanding_cmds[cnt]; > if (sp) { > + /* > + * perform lockless completion while driver unload > + */ > + if (qla2x00_chip_is_down(vha)) { > + req->outstanding_cmds[cnt] = NULL; > + spin_unlock_irqrestore(qp->qp_lock_ptr, flags); > + sp->done(sp, res); > + spin_lock_irqsave(qp->qp_lock_ptr, flags); > + continue; > + } > + > switch (sp->cmd_type) { > case TYPE_SRB: > qla2x00_abort_srb(qp, sp, res, &flags); > -- > 2.19.0.rc0 > Looks Good. Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com <mailto:himanshu.madhani@oracle.com>>
This patch has been tested at Red Hat and found to fix the reported problem. Test-by: Lin Li <lilin@redhat.com> Reviewed-by: John Meneghini <jmeneghi@redhat.com> On 3/13/23 00:37, Nilesh Javali wrote: > While adding and removing the controller, call trace was observed, > > WARNING: CPU: 3 PID: 623596 at kernel/dma/mapping.c:532 dma_free_attrs+0x33/0x50 > CPU: 3 PID: 623596 Comm: sh Kdump: loaded Not tainted 5.14.0-96.el9.x86_64 #1 > RIP: 0010:dma_free_attrs+0x33/0x50 > > Call Trace: > qla2x00_async_sns_sp_done+0x107/0x1b0 [qla2xxx] > qla2x00_abort_srb+0x8e/0x250 [qla2xxx] > ? ql_dbg+0x70/0x100 [qla2xxx] > __qla2x00_abort_all_cmds+0x108/0x190 [qla2xxx] > qla2x00_abort_all_cmds+0x24/0x70 [qla2xxx] > qla2x00_abort_isp_cleanup+0x305/0x3e0 [qla2xxx] > qla2x00_remove_one+0x364/0x400 [qla2xxx] > pci_device_remove+0x36/0xa0 > __device_release_driver+0x17a/0x230 > device_release_driver+0x24/0x30 > pci_stop_bus_device+0x68/0x90 > pci_stop_and_remove_bus_device_locked+0x16/0x30 > remove_store+0x75/0x90 > kernfs_fop_write_iter+0x11c/0x1b0 > new_sync_write+0x11f/0x1b0 > vfs_write+0x1eb/0x280 > ksys_write+0x5f/0xe0 > do_syscall_64+0x5c/0x80 > ? do_user_addr_fault+0x1d8/0x680 > ? do_syscall_64+0x69/0x80 > ? exc_page_fault+0x62/0x140 > ? asm_exc_page_fault+0x8/0x30 > entry_SYSCALL_64_after_hwframe+0x44/0xae > > The command was completed in abort path during driver unload > with a lock held causing the warning in abort path. > Hence complete the command without any lock held. > > Reported-by: Lin Li <lilin@redhat.com> > Tested-by: Lin Li <lilin@redhat.com> > Cc: stable@vger.kernel.org > Signed-off-by: Nilesh Javali <njavali@marvell.com> > --- > drivers/scsi/qla2xxx/qla_os.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c > index 545167627e48..d2f6c0546587 100644 > --- a/drivers/scsi/qla2xxx/qla_os.c > +++ b/drivers/scsi/qla2xxx/qla_os.c > @@ -1858,6 +1858,17 @@ __qla2x00_abort_all_cmds(struct qla_qpair *qp, int res) > for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) { > sp = req->outstanding_cmds[cnt]; > if (sp) { > + /* > + * perform lockless completion while driver unload > + */ > + if (qla2x00_chip_is_down(vha)) { > + req->outstanding_cmds[cnt] = NULL; > + spin_unlock_irqrestore(qp->qp_lock_ptr, flags); > + sp->done(sp, res); > + spin_lock_irqsave(qp->qp_lock_ptr, flags); > + continue; > + } > + > switch (sp->cmd_type) { > case TYPE_SRB: > qla2x00_abort_srb(qp, sp, res, &flags);
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 545167627e48..d2f6c0546587 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -1858,6 +1858,17 @@ __qla2x00_abort_all_cmds(struct qla_qpair *qp, int res) for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) { sp = req->outstanding_cmds[cnt]; if (sp) { + /* + * perform lockless completion while driver unload + */ + if (qla2x00_chip_is_down(vha)) { + req->outstanding_cmds[cnt] = NULL; + spin_unlock_irqrestore(qp->qp_lock_ptr, flags); + sp->done(sp, res); + spin_lock_irqsave(qp->qp_lock_ptr, flags); + continue; + } + switch (sp->cmd_type) { case TYPE_SRB: qla2x00_abort_srb(qp, sp, res, &flags);