Message ID | 20240610040706.1385890-1-ap420073@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [net] ionic: fix use after netif_napi_del() |
On 6/9/2024 9:07 PM, Taehee Yoo wrote:
>
> When queues are started, netif_napi_add() and napi_enable() are called.
> If there are 4 queues and only 3 queues are used for the current
> configuration, only 3 queues' napi should be registered and enabled.
> The ionic_qcq_enable() checks whether the .poll pointer is not NULL for
> enabling only the using queue' napi. Unused queues' napi will not be
> registered by netif_napi_add(), so the .poll pointer indicates NULL.
> But it couldn't distinguish whether the napi was unregistered or not
> because netif_napi_del() doesn't reset the .poll pointer to NULL.
> So, ionic_qcq_enable() calls napi_enable() for the queue, which was
> unregistered by netif_napi_del().
>
> Reproducer:
>    ethtool -L <interface name> rx 1 tx 1 combined 0
>    ethtool -L <interface name> rx 0 tx 0 combined 1
>    ethtool -L <interface name> rx 0 tx 0 combined 4
>
> Splat looks like:
> kernel BUG at net/core/dev.c:6666!
> Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16
> Workqueue: events ionic_lif_deferred_work [ionic]
> RIP: 0010:napi_enable+0x3b/0x40
> Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f
> RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029
> RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28
> RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001
> R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
> R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20
> FS: 0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0
> PKRU: 55555554
> Call Trace:
>  <TASK>
>  ? die+0x33/0x90
>  ? do_trap+0xd9/0x100
>  ? napi_enable+0x3b/0x40
>  ? do_error_trap+0x83/0xb0
>  ? napi_enable+0x3b/0x40
>  ? napi_enable+0x3b/0x40
>  ? exc_invalid_op+0x4e/0x70
>  ? napi_enable+0x3b/0x40
>  ? asm_exc_invalid_op+0x16/0x20
>  ? napi_enable+0x3b/0x40
>  ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
>  ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
>  ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
>  ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
>  process_one_work+0x145/0x360
>  worker_thread+0x2bb/0x3d0
>  ? __pfx_worker_thread+0x10/0x10
>  kthread+0xcc/0x100
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork+0x2d/0x50
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork_asm+0x1a/0x30
>
> Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling")
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
> ---
>  drivers/net/ethernet/pensando/ionic/ionic_lif.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> index 24870da3f484..b66c907d88e6 100644
> --- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> +++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
> @@ -304,7 +304,7 @@ static int ionic_qcq_enable(struct ionic_qcq *qcq)
>  	if (ret)
>  		return ret;
>
> -	if (qcq->napi.poll)
> +	if (test_bit(NAPI_STATE_LISTED, &qcq->napi.state))
>  		napi_enable(&qcq->napi);
>
>  	if (qcq->flags & IONIC_QCQ_F_INTR) {
> --
> 2.34.1
>

I think a better solution would be to stay out of the napi internals
altogether and rely on the IONIC_QCQ_F_INTR flag as in
ionic_qcq_disable() and ionic_lif_qcq_deinit().

Thanks for catching this. If I remember correctly, this is a vestige of
an experimental feature that never went upstream, and eventually was
dropped altogether anyway.

sln

On Tue, Jun 11, 2024 at 3:21 AM Nelson, Shannon <shannon.nelson@amd.com> wrote:
>

Hi Nelson,
Thanks a lot for the review!

> On 6/9/2024 9:07 PM, Taehee Yoo wrote:
> >
> > [...]
> >
>
> I think a better solution would be to stay out of the napi internals
> altogether and rely on the IONIC_QCQ_F_INTR flag as in
> ionic_qcq_disable() and ionic_lif_qcq_deinit().
>
> Thanks for catching this. If I remember correctly, this is a vestige of
> an experimental feature that never went upstream, and eventually was
> dropped altogether anyway.
>
> sln

Okay, I will try to use ionic internal flags like IONIC_QCQ_F_INTR.
And then I will send a v2 patch after some tests.

Thanks a lot!
Taehee Yoo

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
index 24870da3f484..b66c907d88e6 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
@@ -304,7 +304,7 @@ static int ionic_qcq_enable(struct ionic_qcq *qcq)
 	if (ret)
 		return ret;
 
-	if (qcq->napi.poll)
+	if (test_bit(NAPI_STATE_LISTED, &qcq->napi.state))
 		napi_enable(&qcq->napi);
 
 	if (qcq->flags & IONIC_QCQ_F_INTR) {

When queues are started, netif_napi_add() and napi_enable() are called.
If there are 4 queues and only 3 queues are used for the current
configuration, only 3 queues' napi should be registered and enabled.
ionic_qcq_enable() checks whether the .poll pointer is non-NULL so that
only the napi of queues in use is enabled. The napi of an unused queue
is never registered by netif_napi_add(), so its .poll pointer stays NULL.
But this check cannot tell whether a napi has since been unregistered,
because netif_napi_del() doesn't reset the .poll pointer to NULL.
So, ionic_qcq_enable() calls napi_enable() for a queue whose napi was
unregistered by netif_napi_del().

Reproducer:
   ethtool -L <interface name> rx 1 tx 1 combined 0
   ethtool -L <interface name> rx 0 tx 0 combined 1
   ethtool -L <interface name> rx 0 tx 0 combined 4

Splat looks like:
kernel BUG at net/core/dev.c:6666!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16
Workqueue: events ionic_lif_deferred_work [ionic]
RIP: 0010:napi_enable+0x3b/0x40
Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f
RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28
RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20
FS: 0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0
PKRU: 55555554
Call Trace:
 <TASK>
 ? die+0x33/0x90
 ? do_trap+0xd9/0x100
 ? napi_enable+0x3b/0x40
 ? do_error_trap+0x83/0xb0
 ? napi_enable+0x3b/0x40
 ? napi_enable+0x3b/0x40
 ? exc_invalid_op+0x4e/0x70
 ? napi_enable+0x3b/0x40
 ? asm_exc_invalid_op+0x16/0x20
 ? napi_enable+0x3b/0x40
 ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
 ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
 ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
 ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
 process_one_work+0x145/0x360
 worker_thread+0x2bb/0x3d0
 ? __pfx_worker_thread+0x10/0x10
 kthread+0xcc/0x100
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2d/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30

Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
 drivers/net/ethernet/pensando/ionic/ionic_lif.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)