diff mbox series

scsi: fnic: move flush_work initialization out of if block

Message ID 20240930133014.71615-1-mwilck@suse.com (mailing list archive)
State Accepted
Headers show
Series scsi: fnic: move flush_work initialization out of if block | expand

Commit Message

Martin Wilck Sept. 30, 2024, 1:30 p.m. UTC
(resending, sorry - I'd forgotten to add the mailing list)

After commit 379a58caa199 ("scsi: fnic: Move fnic_fnic_flush_tx() to a work
queue"), it can happen that a work item is sent to an uninitialized work
queue.  This may has the effect that the item being queued is never
actually queued, and any further actions depending on it will not proceed.

The following warning is observed while the fnic driver is loaded:

kernel: WARNING: CPU: 11 PID: 0 at ../kernel/workqueue.c:1524 __queue_work+0x373/0x410
kernel:  <IRQ>
kernel:  queue_work_on+0x3a/0x50
kernel:  fnic_wq_copy_cmpl_handler+0x54a/0x730 [fnic 62fbff0c42e7fb825c60a55cde2fb91facb2ed24]
kernel:  fnic_isr_msix_wq_copy+0x2d/0x60 [fnic 62fbff0c42e7fb825c60a55cde2fb91facb2ed24]
kernel:  __handle_irq_event_percpu+0x36/0x1a0
kernel:  handle_irq_event_percpu+0x30/0x70
kernel:  handle_irq_event+0x34/0x60
kernel:  handle_edge_irq+0x7e/0x1a0
kernel:  __common_interrupt+0x3b/0xb0
kernel:  common_interrupt+0x58/0xa0
kernel:  </IRQ>

It has been observed that this may break the rediscovery of fibre channel
devices after a temporary fabric failure.

This patch fixes it by moving the work queue initialization out of
an if block in fnic_probe().

Signed-off-by: Martin Wilck <mwilck@suse.com>

Fixes: 379a58caa199 ("scsi: fnic: Move fnic_fnic_flush_tx() to a work queue")
Cc: stable@vger.kernel.org
---
 drivers/scsi/fnic/fnic_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Karan Tilak Kumar (kartilak) Sept. 30, 2024, 4:43 p.m. UTC | #1
On Monday, September 30, 2024 6:30 AM, Martin Wilck <martin.wilck@suse.com> wrote:
>
> (resending, sorry - I'd forgotten to add the mailing list)
>
> After commit 379a58caa199 ("scsi: fnic: Move fnic_fnic_flush_tx() to a work
> queue"), it can happen that a work item is sent to an uninitialized work
> queue.  This may has the effect that the item being queued is never
> actually queued, and any further actions depending on it will not proceed.
>
> The following warning is observed while the fnic driver is loaded:
>
> kernel: WARNING: CPU: 11 PID: 0 at ../kernel/workqueue.c:1524 __queue_work+0x373/0x410
> kernel:  <IRQ>
> kernel:  queue_work_on+0x3a/0x50
> kernel:  fnic_wq_copy_cmpl_handler+0x54a/0x730 [fnic 62fbff0c42e7fb825c60a55cde2fb91facb2ed24]
> kernel:  fnic_isr_msix_wq_copy+0x2d/0x60 [fnic 62fbff0c42e7fb825c60a55cde2fb91facb2ed24]
> kernel:  __handle_irq_event_percpu+0x36/0x1a0
> kernel:  handle_irq_event_percpu+0x30/0x70
> kernel:  handle_irq_event+0x34/0x60
> kernel:  handle_edge_irq+0x7e/0x1a0
> kernel:  __common_interrupt+0x3b/0xb0
> kernel:  common_interrupt+0x58/0xa0
> kernel:  </IRQ>
>
> It has been observed that this may break the rediscovery of fibre channel
> devices after a temporary fabric failure.
>
> This patch fixes it by moving the work queue initialization out of
> an if block in fnic_probe().
>
> Signed-off-by: Martin Wilck <mwilck@suse.com>
>
> Fixes: 379a58caa199 ("scsi: fnic: Move fnic_fnic_flush_tx() to a work queue")
> Cc: stable@vger.kernel.org
> ---
> drivers/scsi/fnic/fnic_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/fnic/fnic_main.c b/drivers/scsi/fnic/fnic_main.c
> index 0044717d4486..adec0df24bc4 100644
> --- a/drivers/scsi/fnic/fnic_main.c
> +++ b/drivers/scsi/fnic/fnic_main.c
> @@ -830,7 +830,6 @@ static int fnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> spin_lock_init(&fnic->vlans_lock);
> INIT_WORK(&fnic->fip_frame_work, fnic_handle_fip_frame);
> INIT_WORK(&fnic->event_work, fnic_handle_event);
> -             INIT_WORK(&fnic->flush_work, fnic_flush_tx);
> skb_queue_head_init(&fnic->fip_frame_queue);
> INIT_LIST_HEAD(&fnic->evlist);
> INIT_LIST_HEAD(&fnic->vlans);
> @@ -948,6 +947,7 @@ static int fnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>
> INIT_WORK(&fnic->link_work, fnic_handle_link);
> INIT_WORK(&fnic->frame_work, fnic_handle_frame);
> +     INIT_WORK(&fnic->flush_work, fnic_flush_tx);
> skb_queue_head_init(&fnic->frame_queue);
> skb_queue_head_init(&fnic->tx_queue);
>
> --
> 2.46.1
>
>

Looks good to me.

Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com>

Regards,
Karan
Martin K. Petersen Oct. 4, 2024, 2:08 a.m. UTC | #2
On Mon, 30 Sep 2024 15:30:14 +0200, Martin Wilck wrote:

> (resending, sorry - I'd forgotten to add the mailing list)
> 
> After commit 379a58caa199 ("scsi: fnic: Move fnic_fnic_flush_tx() to a work
> queue"), it can happen that a work item is sent to an uninitialized work
> queue.  This may has the effect that the item being queued is never
> actually queued, and any further actions depending on it will not proceed.
> 
> [...]

Applied to 6.12/scsi-fixes, thanks!

[1/1] scsi: fnic: move flush_work initialization out of if block
      https://git.kernel.org/mkp/scsi/c/f30e5f77d2f2
diff mbox series

Patch

diff --git a/drivers/scsi/fnic/fnic_main.c b/drivers/scsi/fnic/fnic_main.c
index 0044717d4486..adec0df24bc4 100644
--- a/drivers/scsi/fnic/fnic_main.c
+++ b/drivers/scsi/fnic/fnic_main.c
@@ -830,7 +830,6 @@  static int fnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		spin_lock_init(&fnic->vlans_lock);
 		INIT_WORK(&fnic->fip_frame_work, fnic_handle_fip_frame);
 		INIT_WORK(&fnic->event_work, fnic_handle_event);
-		INIT_WORK(&fnic->flush_work, fnic_flush_tx);
 		skb_queue_head_init(&fnic->fip_frame_queue);
 		INIT_LIST_HEAD(&fnic->evlist);
 		INIT_LIST_HEAD(&fnic->vlans);
@@ -948,6 +947,7 @@  static int fnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	INIT_WORK(&fnic->link_work, fnic_handle_link);
 	INIT_WORK(&fnic->frame_work, fnic_handle_frame);
+	INIT_WORK(&fnic->flush_work, fnic_flush_tx);
 	skb_queue_head_init(&fnic->frame_queue);
 	skb_queue_head_init(&fnic->tx_queue);