Message ID | 20230215072046.4000-1-quic_yingangl@quicinc.com (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: disable irq in napi_poll | expand |
On Wed, Feb 15, 2023 at 8:20 AM Kassey Li <quic_yingangl@quicinc.com> wrote: > > There is list_del action in napi_poll, fix race condition by > disable irq. > > similar report: > https://syzkaller.appspot.com/bug?id=309955e7f02812d7bfb828c22b517349d9f068 > bc > > list_del corruption. next->prev should be ffffff88ea0bd4c0, but was > ffffff8a787099c0 > ------------[ cut here ]------------ > kernel BUG at lib/list_debug.c:56! > > pstate: 62400005 (nZCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--) > pc : __list_del_entry_valid+0xa8/0xac > lr : __list_del_entry_valid+0xa8/0xac > sp : ffffffc0081bbce0 > x29: ffffffc0081bbce0 x28: 0000000000000018 x27: 0000000000000059 > x26: ffffffef130c6040 x25: ffffffc0081bbcf0 x24: ffffffc0081bbd00 > x23: 000000010003d37f x22: 000000000000012c x21: ffffffef130c9000 > x20: ffffff8a786cf9c0 x19: ffffff88ea0bd4c0 x18: ffffffc00816d030 > x17: ffffffffffffffff x16: 0000000000000004 x15: 0000000000000004 > x14: ffffffef131bae30 x13: 0000000000002b84 x12: 0000000000000003 > x11: 0000000100002b84 x10: c000000100002b84 x9 : 1f2ede939758e700 > x8 : 1f2ede939758e700 x7 : 205b5d3330383232 x6 : 302e33303331205b > x5 : ffffffef13750358 x4 : ffffffc0081bb9df x3 : 0000000000000000 > x2 : ffffff8a786be9c8 x1 : 0000000000000000 x0 : 000000000000007c > > Call trace: > __list_del_entry_valid+0xa8/0xac > net_rx_action+0xfc/0x3a0 > _stext+0x174/0x5f4 > run_ksoftirqd+0x34/0x74 > smpboot_thread_fn+0x1d8/0x464 > kthread+0x168/0x1dc > ret_from_fork+0x10/0x20 > Code: d4210000 f000cbc0 91161000 97de537a (d4210000) > ---[ end trace 8b3858d55ee59b7c ]--- > Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt > > Signed-off-by: Kassey Li <quic_yingangl@quicinc.com> No tags ? > --- > net/core/dev.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/net/core/dev.c b/net/core/dev.c > index b76fb37b381e..0c677a563232 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -6660,7 +6660,9 @@ static __latent_entropy void net_rx_action(struct softirq_action *h) > } > > n = list_first_entry(&list, struct napi_struct, poll_list); > + local_irq_disable(); > budget -= napi_poll(n, &repoll); > + local_irq_enable(); > > /* If softirq window is exhausted then punt. > * Allow this to run for 2 jiffies since which will allow > -- > 2.17.1 > Absolutely not. NAPI runs in softirq mode, not hard irq. You will have to spend more time finding the real bug.
On 2/15/2023 4:32 PM, Eric Dumazet wrote: > On Wed, Feb 15, 2023 at 8:20 AM Kassey Li <quic_yingangl@quicinc.com> wrote: >> >> There is list_del action in napi_poll, fix race condition by >> disable irq. >> >> similar report: >> https://syzkaller.appspot.com/bug?id=309955e7f02812d7bfb828c22b517349d9f068 >> bc >> >> list_del corruption. next->prev should be ffffff88ea0bd4c0, but was >> ffffff8a787099c0 >> ------------[ cut here ]------------ >> kernel BUG at lib/list_debug.c:56! >> >> pstate: 62400005 (nZCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--) >> pc : __list_del_entry_valid+0xa8/0xac >> lr : __list_del_entry_valid+0xa8/0xac >> sp : ffffffc0081bbce0 >> x29: ffffffc0081bbce0 x28: 0000000000000018 x27: 0000000000000059 >> x26: ffffffef130c6040 x25: ffffffc0081bbcf0 x24: ffffffc0081bbd00 >> x23: 000000010003d37f x22: 000000000000012c x21: ffffffef130c9000 >> x20: ffffff8a786cf9c0 x19: ffffff88ea0bd4c0 x18: ffffffc00816d030 >> x17: ffffffffffffffff x16: 0000000000000004 x15: 0000000000000004 >> x14: ffffffef131bae30 x13: 0000000000002b84 x12: 0000000000000003 >> x11: 0000000100002b84 x10: c000000100002b84 x9 : 1f2ede939758e700 >> x8 : 1f2ede939758e700 x7 : 205b5d3330383232 x6 : 302e33303331205b >> x5 : ffffffef13750358 x4 : ffffffc0081bb9df x3 : 0000000000000000 >> x2 : ffffff8a786be9c8 x1 : 0000000000000000 x0 : 000000000000007c >> >> Call trace: >> __list_del_entry_valid+0xa8/0xac >> net_rx_action+0xfc/0x3a0 >> _stext+0x174/0x5f4 >> run_ksoftirqd+0x34/0x74 >> smpboot_thread_fn+0x1d8/0x464 >> kthread+0x168/0x1dc >> ret_from_fork+0x10/0x20 >> Code: d4210000 f000cbc0 91161000 97de537a (d4210000) >> ---[ end trace 8b3858d55ee59b7c ]--- >> Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt >> >> Signed-off-by: Kassey Li <quic_yingangl@quicinc.com> > > No tags ? hi Eric, can you share a example of "tags" ? sorry I'm not aware of this. > >> --- >> net/core/dev.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/net/core/dev.c b/net/core/dev.c >> index b76fb37b381e..0c677a563232 100644 >> --- a/net/core/dev.c >> +++ b/net/core/dev.c >> @@ -6660,7 +6660,9 @@ static __latent_entropy void net_rx_action(struct softirq_action *h) >> } >> >> n = list_first_entry(&list, struct napi_struct, poll_list); >> + local_irq_disable(); >> budget -= napi_poll(n, &repoll); >> + local_irq_enable(); >> >> /* If softirq window is exhausted then punt. >> * Allow this to run for 2 jiffies since which will allow >> -- >> 2.17.1 >> > > Absolutely not. > > NAPI runs in softirq mode, not hard irq. > > You will have to spend more time finding the real bug. without this patch, issue is easy to reproduced when do phone clone( a new phone and old phone to clone data/app). with this path, issue is not seen. in the function of net_rx_action, list init/add is go with local_irq_disable/enable, however, napi_poll will call into list_del. there is no such local_irq_disable/enable. may you give some suggest on this to further narrow down ?
Greeting, FYI, we noticed WARNING:at_kernel/softirq.c:#__local_bh_enable_ip due to commit (built with gcc-11): commit: ed417c671fa6b56ce9f6db582d21c3a608838929 ("[PATCH] net: disable irq in napi_poll") url: https://github.com/intel-lab-lkp/linux/commits/Kassey-Li/net-disable-irq-in-napi_poll/20230215-152235 base: https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git 1ed32ad4a3cb7c6a8764510565e15ab46b5fdd19 patch link: https://lore.kernel.org/all/20230215072046.4000-1-quic_yingangl@quicinc.com/ patch subject: [PATCH] net: disable irq in napi_poll in testcase: trinity version: trinity-static-x86_64-x86_64-1c734c75-1_2020-01-06 with following parameters: runtime: 300s test-description: Trinity is a linux system call fuzz tester. test-url: http://codemonkey.org.uk/projects/trinity/ on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace): [ 22.476967][ C0] ------------[ cut here ]------------ [ 22.477861][ C0] WARNING: CPU: 0 PID: 56 at kernel/softirq.c:376 __local_bh_enable_ip (kernel/softirq.c:376) [ 22.479087][ C0] Modules linked in: [ 22.479648][ C0] CPU: 0 PID: 56 Comm: kworker/0:2 Tainted: G T 6.2.0-rc7-01664-ged417c671fa6 #41 [ 22.481139][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-5 04/01/2014 [ 22.482568][ C0] Workqueue: ipv6_addrconf addrconf_dad_work [ 22.483472][ C0] RIP: 0010:__local_bh_enable_ip (kernel/softirq.c:376) [ 22.486242][ C0] Code: c7 c7 ec 5c 58 84 e8 f7 9e 3f 00 83 3d e0 75 3f 03 00 74 18 65 8b 05 0b 92 05 7f 85 c0 75 0d 65 8b 05 4c 90 e9 7e 85 c0 75 02 <0f> 0b 9c 58 fa 0f ba e0 09 73 05 e8 6d 98 1b 00 65 8b 05 0d af 06 All code ======== 0: c7 c7 ec 5c 58 84 mov $0x84585cec,%edi 6: e8 f7 9e 3f 00 callq 0x3f9f02 b: 83 3d e0 75 3f 03 00 cmpl $0x0,0x33f75e0(%rip) # 0x33f75f2 12: 74 18 je 0x2c 14: 65 8b 05 0b 92 05 7f mov %gs:0x7f05920b(%rip),%eax # 0x7f059226 1b: 85 c0 test %eax,%eax 1d: 75 0d jne 0x2c 1f: 65 8b 05 4c 90 e9 7e mov %gs:0x7ee9904c(%rip),%eax # 0x7ee99072 26: 85 c0 test %eax,%eax 28: 75 02 jne 0x2c 2a:* 0f 0b ud2 <-- trapping instruction 2c: 9c pushfq 2d: 58 pop %rax 2e: fa cli 2f: 0f ba e0 09 bt $0x9,%eax 33: 73 05 jae 0x3a 35: e8 6d 98 1b 00 callq 0x1b98a7 3a: 65 gs 3b: 8b .byte 0x8b 3c: 05 .byte 0x5 3d: 0d .byte 0xd 3e: af scas %es:(%rdi),%eax 3f: 06 (bad) Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: 9c pushfq 3: 58 pop %rax 4: fa cli 5: 0f ba e0 09 bt $0x9,%eax 9: 73 05 jae 0x10 b: e8 6d 98 1b 00 callq 0x1b987d 10: 65 gs 11: 8b .byte 0x8b 12: 05 .byte 0x5 13: 0d .byte 0xd 14: af scas %es:(%rdi),%eax 15: 06 (bad) [ 22.489103][ C0] RSP: 0018:ffff8883aee096d0 EFLAGS: 00010046 [ 22.490095][ C0] RAX: 0000000000000000 RBX: 0000000000000200 RCX: 1ffffffff08b0b9d [ 22.491351][ C0] RDX: dffffc0000000000 RSI: 0000000000000200 RDI: ffffffff825bdeb5 [ 22.492537][ C0] RBP: ffff8883aee096e0 R08: 0000000000000000 R09: 0000000000000000 [ 22.493640][ C0] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff825bdeb5 [ 22.494729][ C0] R13: ffff88811c6d5b40 R14: 0000000000000000 R15: 0000000000000002 [ 22.495828][ C0] FS: 0000000000000000(0000) GS:ffff8883aee00000(0000) knlGS:0000000000000000 [ 22.497038][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 22.497951][ C0] CR2: 00007f2d57caffb0 CR3: 000000011cbe4000 CR4: 00000000000406f0 [ 22.499055][ C0] Call Trace: [ 22.499517][ C0] <IRQ> [ 22.499936][ C0] ip6_pol_route (net/ipv6/route.c:2259) [ 22.500722][ C0] ? ip6_pol_route_lookup (net/ipv6/route.c:2203) [ 22.501612][ C0] ? fib6_table_lookup (net/ipv6/route.c:2458) [ 22.502425][ C0] ip6_pol_route_input (net/ipv6/route.c:2276) [ 22.503215][ C0] ? ip6_pol_route (net/ipv6/route.c:2274) [ 22.504008][ C0] fib6_rule_lookup (net/ipv6/ip6_fib.c:320) [ 22.504800][ C0] ip6_route_input_lookup (net/ipv6/route.c:2288) [ 22.505648][ C0] ip6_route_input (net/ipv6/route.c:2583) [ 22.506387][ C0] ? ip6_validate_gw (net/ipv6/route.c:2558) [ 22.507176][ C0] ip6_rcv_finish_core+0x1b5/0x1c5 [ 22.508115][ C0] ip6_list_rcv_finish+0x364/0x46f [ 22.508987][ C0] ? rcu_lock_acquire (arch/x86/include/asm/preempt.h:80 include/linux/rcupdate.h:94 include/linux/rcupdate.h:762 kernel/notifier.c:224) [ 22.509817][ C0] ipv6_list_rcv (net/ipv6/ip6_input.c:325) [ 22.510495][ C0] ? ipv6_rcv (net/ipv6/ip6_input.c:325) [ 22.511069][ C0] ? do_xdp_generic (net/core/dev.c:5273) [ 22.511785][ C0] __netif_receive_skb_list_ptype (net/core/dev.c:5528) [ 22.512621][ C0] __netif_receive_skb_list_core (net/core/dev.c:5538) [ 22.513510][ C0] ? __netif_receive_skb_core+0xcdf/0xcdf [ 22.514483][ C0] ? dma_direct_map_page (kernel/dma/direct.h:101) [ 22.515243][ C0] ? dma_to_phys+0xad/0xad [ 22.516004][ C0] netif_receive_skb_list_internal (net/core/dev.c:5630 net/core/dev.c:5719) [ 22.516916][ C0] ? process_backlog (net/core/dev.c:5691) [ 22.517629][ C0] ? e1000_alloc_rx_buffers (drivers/net/ethernet/intel/e1000/e1000_main.c:4628) [ 22.518433][ C0] gro_normal_list (include/linux/list.h:37 include/net/gro.h:434) [ 22.519104][ C0] napi_complete_done (include/linux/list.h:292 net/core/dev.c:6061) [ 22.519830][ C0] ? gro_normal_list (net/core/dev.c:6026) [ 22.520517][ C0] e1000_clean (drivers/net/ethernet/intel/e1000/e1000_main.c:3811) [ 22.521135][ C0] ? hlock_class (arch/x86/include/asm/bitops.h:228 arch/x86/include/asm/bitops.h:240 include/asm-generic/bitops/instrumented-non-atomic.h:142 kernel/locking/lockdep.c:227) [ 22.521792][ C0] ? mark_lock (kernel/locking/lockdep.c:4612 (discriminator 3)) [ 22.522408][ C0] ? e1000_clean_tx_irq (drivers/net/ethernet/intel/e1000/e1000_main.c:3796) [ 22.523134][ C0] ? mark_held_locks (kernel/locking/lockdep.c:4236) [ 22.523849][ C0] __napi_poll+0xa4/0x2b8 [ 22.524628][ C0] net_rx_action (net/core/dev.c:6557 (discriminator 3) net/core/dev.c:6666 (discriminator 3)) [ 22.525304][ C0] ? __napi_poll+0x2b8/0x2b8 [ 22.526094][ C0] ? rcu_read_lock_sched_held (kernel/rcu/update.c:125) [ 22.526896][ C0] ? lock_is_held (include/linux/sched.h:1566) [ 22.527129][ T72] /dev/root: Can't open blockdev [ 22.527469][ C0] __do_softirq (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:572) [ 22.528738][ C0] ? rcu_read_unlock_bh (include/linux/rcupdate.h:330 include/linux/rcupdate.h:832) [ 22.529484][ C0] do_softirq (kernel/softirq.c:472) [ 22.530076][ C0] </IRQ> [ 22.530513][ C0] <TASK> [ 22.530954][ C0] __local_bh_enable_ip (arch/x86/include/asm/preempt.h:85 kernel/softirq.c:399) [ 22.531677][ C0] rcu_read_unlock_bh (include/linux/lockdep.h:191 include/linux/lockdep.h:198 include/linux/lockdep.h:204) [ 22.532368][ C0] ip6_finish_output2 (net/ipv6/ip6_output.c:135) [ 22.532953][ C0] ? ip6_mtu (net/ipv6/route.c:3206) [ 22.533439][ C0] ip6_output (net/ipv6/ip6_output.c:195 net/ipv6/ip6_output.c:206 include/linux/netfilter.h:403 net/ipv6/ip6_output.c:227) [ 22.533923][ C0] ndisc_send_skb (net/ipv6/ndisc.c:511 (discriminator 142)) [ 22.534453][ C0] ? ndisc_ifinfo_sysctl_change (net/ipv6/ndisc.c:472) [ 22.535116][ C0] ? ndisc_fill_addr_option (net/ipv6/ndisc.c:173) [ 22.535733][ C0] ndisc_send_rs (net/ipv6/ndisc.c:719) [ 22.536258][ C0] addrconf_dad_completed (net/ipv6/addrconf.c:4255) [ 22.536840][ C0] ? lock_is_held (include/linux/lockdep.h:284) [ 22.537335][ C0] ? addrconf_verify_work (net/ipv6/addrconf.c:4203) [ 22.537900][ C0] ? addrconf_dad_work (net/ipv6/addrconf.c:4154) [ 22.538466][ C0] addrconf_dad_work (net/ipv6/addrconf.c:4164) [ 22.539001][ C0] ? addrconf_dad_work (net/ipv6/addrconf.c:4164) [ 22.539570][ C0] ? addrconf_ifdown+0xae1/0xae1 [ 22.540302][ C0] ? rcu_read_lock_sched_held (kernel/rcu/update.c:125) [ 22.541088][ C0] ? lock_is_held (include/linux/sched.h:1566) [ 22.541681][ C0] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/workqueue.h:108 kernel/workqueue.c:2294) [ 22.542251][ C0] ? max_active_store (kernel/workqueue.c:2184) [ 22.542814][ C0] worker_thread (include/linux/list.h:292 kernel/workqueue.c:2437) [ 22.543313][ C0] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:51 (discriminator 22)) [ 22.543895][ C0] kthread (kernel/kthread.c:376) [ 22.544343][ C0] ? rescuer_thread (kernel/workqueue.c:2379) [ 22.544867][ C0] ? kthread_complete_and_exit (kernel/kthread.c:331) [ 22.545480][ C0] ret_from_fork (arch/x86/entry/entry_64.S:314) [ 22.546008][ C0] </TASK> [ 22.546347][ C0] irq event stamp: 265 [ 22.546787][ C0] hardirqs last enabled at (264): net_rx_action (arch/x86/include/asm/irqflags.h:45 arch/x86/include/asm/irqflags.h:80 net/core/dev.c:6651) [ 22.547781][ C0] hardirqs last disabled at (265): net_rx_action (net/core/dev.c:6551 (discriminator 3) net/core/dev.c:6666 (discriminator 3)) [ 22.548750][ C0] softirqs last enabled at (260): rcu_read_unlock_bh (include/linux/rcupdate.h:330 include/linux/rcupdate.h:832) [ 22.549741][ C0] softirqs last disabled at (261): do_softirq (kernel/softirq.c:472) [ 22.550664][ C0] ---[ end trace 0000000000000000 ]--- If you fix the issue, kindly add following tag | Reported-by: kernel test robot <yujie.liu@intel.com> | Link: https://lore.kernel.org/oe-lkp/202302211734.1d9915a-yujie.liu@intel.com To reproduce: # build kernel cd linux cp config-6.2.0-rc7-01664-ged417c671fa6 .config make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install cd <mod-install-dir> find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz git clone https://github.com/intel/lkp-tests.git cd lkp-tests bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email # if come across any failure that blocks the test, # please remove ~/.lkp and /lkp dir to run from a clean state.
diff --git a/net/core/dev.c b/net/core/dev.c index b76fb37b381e..0c677a563232 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6660,7 +6660,9 @@ static __latent_entropy void net_rx_action(struct softirq_action *h) } n = list_first_entry(&list, struct napi_struct, poll_list); + local_irq_disable(); budget -= napi_poll(n, &repoll); + local_irq_enable(); /* If softirq window is exhausted then punt. * Allow this to run for 2 jiffies since which will allow
There is list_del action in napi_poll, fix race condition by disable irq. similar report: https://syzkaller.appspot.com/bug?id=309955e7f02812d7bfb828c22b517349d9f068 bc list_del corruption. next->prev should be ffffff88ea0bd4c0, but was ffffff8a787099c0 ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:56! pstate: 62400005 (nZCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--) pc : __list_del_entry_valid+0xa8/0xac lr : __list_del_entry_valid+0xa8/0xac sp : ffffffc0081bbce0 x29: ffffffc0081bbce0 x28: 0000000000000018 x27: 0000000000000059 x26: ffffffef130c6040 x25: ffffffc0081bbcf0 x24: ffffffc0081bbd00 x23: 000000010003d37f x22: 000000000000012c x21: ffffffef130c9000 x20: ffffff8a786cf9c0 x19: ffffff88ea0bd4c0 x18: ffffffc00816d030 x17: ffffffffffffffff x16: 0000000000000004 x15: 0000000000000004 x14: ffffffef131bae30 x13: 0000000000002b84 x12: 0000000000000003 x11: 0000000100002b84 x10: c000000100002b84 x9 : 1f2ede939758e700 x8 : 1f2ede939758e700 x7 : 205b5d3330383232 x6 : 302e33303331205b x5 : ffffffef13750358 x4 : ffffffc0081bb9df x3 : 0000000000000000 x2 : ffffff8a786be9c8 x1 : 0000000000000000 x0 : 000000000000007c Call trace: __list_del_entry_valid+0xa8/0xac net_rx_action+0xfc/0x3a0 _stext+0x174/0x5f4 run_ksoftirqd+0x34/0x74 smpboot_thread_fn+0x1d8/0x464 kthread+0x168/0x1dc ret_from_fork+0x10/0x20 Code: d4210000 f000cbc0 91161000 97de537a (d4210000) ---[ end trace 8b3858d55ee59b7c ]--- Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt Signed-off-by: Kassey Li <quic_yingangl@quicinc.com> --- net/core/dev.c | 2 ++ 1 file changed, 2 insertions(+)