diff mbox series

[net] ionic: fix use after netif_napi_del()

Message ID 20240610040706.1385890-1-ap420073@gmail.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net] ionic: fix use after netif_napi_del() | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/build_clang success Errors and warnings before: 868 this patch: 868
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 869 this patch: 869
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-06-10--15-00 (tests: 644)

Commit Message

Taehee Yoo June 10, 2024, 4:07 a.m. UTC
When queues are started, netif_napi_add() and napi_enable() are called.
If there are 4 queues but only 3 are used by the current configuration,
only those 3 queues' napi should be registered and enabled.
ionic_qcq_enable() checks whether the .poll pointer is non-NULL in
order to enable only the napi of queues that are in use: unused queues'
napi are never registered by netif_napi_add(), so their .poll pointer
stays NULL.
However, this check cannot tell whether a napi was registered and then
unregistered, because netif_napi_del() does not reset the .poll pointer
to NULL.
As a result, ionic_qcq_enable() may call napi_enable() on a queue whose
napi has already been unregistered by netif_napi_del().

Reproducer:
   ethtool -L <interface name> rx 1 tx 1 combined 0
   ethtool -L <interface name> rx 0 tx 0 combined 1
   ethtool -L <interface name> rx 0 tx 0 combined 4

Splat looks like:
kernel BUG at net/core/dev.c:6666!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16
Workqueue: events ionic_lif_deferred_work [ionic]
RIP: 0010:napi_enable+0x3b/0x40
Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f
RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28
RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20
FS:  0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0
PKRU: 55555554
Call Trace:
 <TASK>
 ? die+0x33/0x90
 ? do_trap+0xd9/0x100
 ? napi_enable+0x3b/0x40
 ? do_error_trap+0x83/0xb0
 ? napi_enable+0x3b/0x40
 ? napi_enable+0x3b/0x40
 ? exc_invalid_op+0x4e/0x70
 ? napi_enable+0x3b/0x40
 ? asm_exc_invalid_op+0x16/0x20
 ? napi_enable+0x3b/0x40
 ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
 ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
 ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
 ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
 process_one_work+0x145/0x360
 worker_thread+0x2bb/0x3d0
 ? __pfx_worker_thread+0x10/0x10
 kthread+0xcc/0x100
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2d/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30

Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
 drivers/net/ethernet/pensando/ionic/ionic_lif.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Nelson, Shannon June 10, 2024, 6:21 p.m. UTC | #1
On 6/9/2024 9:07 PM, Taehee Yoo wrote:
> 
> [... commit message and splat trimmed ...]
> 
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> index 24870da3f484..b66c907d88e6 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> @@ -304,7 +304,7 @@ static int ionic_qcq_enable(struct ionic_qcq *qcq)
>          if (ret)
>                  return ret;
> 
> -       if (qcq->napi.poll)
> +       if (test_bit(NAPI_STATE_LISTED, &qcq->napi.state))
>                  napi_enable(&qcq->napi);
> 
>          if (qcq->flags & IONIC_QCQ_F_INTR) {
> --
> 2.34.1
> 

I think a better solution would be to stay out of the napi internals 
altogether and rely on the IONIC_QCQ_F_INTR flag as in 
ionic_qcq_disable() and ionic_lif_qcq_deinit().

Thanks for catching this.  If I remember correctly, this is a vestige of 
an experimental feature that never went upstream, and eventually was 
dropped altogether anyway.

sln
Taehee Yoo June 11, 2024, 4:05 a.m. UTC | #2
On Tue, Jun 11, 2024 at 3:21 AM Nelson, Shannon <shannon.nelson@amd.com> wrote:
>

Hi Nelson,
Thanks a lot for the review!

> On 6/9/2024 9:07 PM, Taehee Yoo wrote:
> > [... commit message, splat, and patch trimmed ...]
>
> I think a better solution would be to stay out of the napi internals
> altogether and rely on the IONIC_QCQ_F_INTR flag as in
> ionic_qcq_disable() and ionic_lif_qcq_deinit().
>
> Thanks for catching this. If I remember correctly, this is a vestige of
> an experimental feature that never went upstream, and eventually was
> dropped altogether anyway.
>
> sln

Okay, I will try to use ionic internal flags like IONIC_QCQ_F_INTR.
And then I will send a v2 patch after some tests.

Thanks a lot!
Taehee Yoo

Patch

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
index 24870da3f484..b66c907d88e6 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
@@ -304,7 +304,7 @@  static int ionic_qcq_enable(struct ionic_qcq *qcq)
 	if (ret)
 		return ret;
 
-	if (qcq->napi.poll)
+	if (test_bit(NAPI_STATE_LISTED, &qcq->napi.state))
 		napi_enable(&qcq->napi);
 
 	if (qcq->flags & IONIC_QCQ_F_INTR) {