Message ID | Yg05duINKBqvnxUc@linutronix.de (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] net: Correct wrong BH disable in hard-interrupt. | expand |
Hi Andrzej, On 16.02.2022 18:50, Sebastian Andrzej Siewior wrote: > I missed the obvious case where netif_ix() is invoked from hard-IRQ > context. > > Disabling bottom halves is only needed in process context. This ensures > that the code remains on the current CPU and that the soft-interrupts > are processed at local_bh_enable() time. > In hard- and soft-interrupt context this is already the case and the > soft-interrupts will be processed once the context is left (at irq-exit > time). > > Disable bottom halves if neither hard-interrupts nor soft-interrupts are > disabled. Update the kernel-doc, mention that interrupts must be enabled > if invoked from process context. > > Fixes: baebdf48c3600 ("net: dev: Makes sure netif_rx() can be invoked in any context.") > Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > --- > Marek, does this work for you? Yes, this fixed the issue. Thanks! Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> > net/core/dev.c | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/net/core/dev.c b/net/core/dev.c > index 909fb38159108..87729491460fc 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -4860,7 +4860,9 @@ EXPORT_SYMBOL(__netif_rx); > * congestion control or by the protocol layers. > * The network buffer is passed via the backlog NAPI device. Modern NIC > * driver should use NAPI and GRO. > - * This function can used from any context. > + * This function can used from interrupt and from process context. The > + * caller from process context must not disable interrupts before invoking > + * this function. > * > * return values: > * NET_RX_SUCCESS (no congestion) > @@ -4870,12 +4872,15 @@ EXPORT_SYMBOL(__netif_rx); > int netif_rx(struct sk_buff *skb) > { > int ret; > + bool need_bh_off = !(hardirq_count() | softirq_count()); > > - local_bh_disable(); > + if (need_bh_off) > + local_bh_disable(); > trace_netif_rx_entry(skb); > ret = netif_rx_internal(skb); > trace_netif_rx_exit(ret); > - local_bh_enable(); > + if (need_bh_off) > + local_bh_enable(); > return ret; > } > EXPORT_SYMBOL(netif_rx); Best regards
Hi All, On 17.02.2022 07:35, Marek Szyprowski wrote: > Hi Andrzej, > > On 16.02.2022 18:50, Sebastian Andrzej Siewior wrote: >> I missed the obvious case where netif_ix() is invoked from hard-IRQ >> context. >> >> Disabling bottom halves is only needed in process context. This ensures >> that the code remains on the current CPU and that the soft-interrupts >> are processed at local_bh_enable() time. >> In hard- and soft-interrupt context this is already the case and the >> soft-interrupts will be processed once the context is left (at irq-exit >> time). >> >> Disable bottom halves if neither hard-interrupts nor soft-interrupts are >> disabled. Update the kernel-doc, mention that interrupts must be enabled >> if invoked from process context. >> >> Fixes: baebdf48c3600 ("net: dev: Makes sure netif_rx() can be invoked >> in any context.") >> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> >> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> >> --- >> Marek, does this work for you? > > Yes, this fixed the issue. Thanks! > > Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> I've just noticed that there is one more issue left to fix (the $subject patch is already applied) - this one comes from threaded irq (if I got the stack trace right): ------------[ cut here ]------------ WARNING: CPU: 0 PID: 147 at kernel/softirq.c:363 __local_bh_enable_ip+0xa8/0x1c0 Modules linked in: cpufreq_powersave cpufreq_conservative brcmfmac brcmutil cfg80211 s3fwrn5_i2c s3fwrn5 nci crct10dif_ce exynos_gsc nfc s5p_jpeg hci_uart btqca s 5p_mfc v4l2_mem2mem btbcm bluetooth videobuf2_dma_contig videobuf2_memops ecdh_generic videobuf2_v4l2 panfrost drm_shmem_helper ecc videobuf2_common gpu_sched rfkill videodev mc ip_tables x_tab les ipv6 CPU: 0 PID: 147 Comm: irq/150-dwc3 Not tainted 5.17.0-rc4-next-20220217+ #4557 Hardware name: Samsung TM2E board (DT) pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __local_bh_enable_ip+0xa8/0x1c0 lr : netif_rx+0xa4/0x2c0 ... Call trace: __local_bh_enable_ip+0xa8/0x1c0 netif_rx+0xa4/0x2c0 rx_complete+0x214/0x250 usb_gadget_giveback_request+0x58/0x170 dwc3_gadget_giveback+0xe4/0x200 dwc3_gadget_endpoint_trbs_complete+0x100/0x388 dwc3_thread_interrupt+0x46c/0xe20 irq_thread_fn+0x28/0x98 irq_thread+0x184/0x238 kthread+0x100/0x120 ret_from_fork+0x10/0x20 irq event stamp: 645 hardirqs last enabled at (643): [<ffff8000080c93b8>] finish_task_switch+0x98/0x288 hardirqs last disabled at (644): [<ffff8000090a6e34>] _raw_spin_lock_irqsave+0xb4/0x148 softirqs last enabled at (252): [<ffff800008010488>] _stext+0x488/0x5cc softirqs last disabled at (645): [<ffff800008ed71b0>] netif_rx+0x188/0x2c0 ---[ end trace 0000000000000000 ]--- > >> net/core/dev.c | 11 ++++++++--- >> 1 file changed, 8 insertions(+), 3 deletions(-) >> >> diff --git a/net/core/dev.c b/net/core/dev.c >> index 909fb38159108..87729491460fc 100644 >> --- a/net/core/dev.c >> +++ b/net/core/dev.c >> @@ -4860,7 +4860,9 @@ EXPORT_SYMBOL(__netif_rx); >> * congestion control or by the protocol layers. >> * The network buffer is passed via the backlog NAPI device. >> Modern NIC >> * driver should use NAPI and GRO. >> - * This function can used from any context. >> + * This function can used from interrupt and from process >> context. The >> + * caller from process context must not disable interrupts before >> invoking >> + * this function. >> * >> * return values: >> * NET_RX_SUCCESS (no congestion) >> @@ -4870,12 +4872,15 @@ EXPORT_SYMBOL(__netif_rx); >> int netif_rx(struct sk_buff *skb) >> { >> int ret; >> + bool need_bh_off = !(hardirq_count() | softirq_count()); >> - local_bh_disable(); >> + if (need_bh_off) >> + local_bh_disable(); >> trace_netif_rx_entry(skb); >> ret = netif_rx_internal(skb); >> trace_netif_rx_exit(ret); >> - local_bh_enable(); >> + if (need_bh_off) >> + local_bh_enable(); >> return ret; >> } >> EXPORT_SYMBOL(netif_rx); > > Best regards Best regards
On 2022-02-17 15:08:55 [+0100], Marek Szyprowski wrote: > Hi All, Hi, > I've just noticed that there is one more issue left to fix (the $subject > patch is already applied) - this one comes from threaded irq (if I got > the stack trace right): This is not `threadirqs' on the command line, right? Sebastian
On 17.02.2022 15:21, Sebastian Andrzej Siewior wrote: > On 2022-02-17 15:08:55 [+0100], Marek Szyprowski wrote: >> I've just noticed that there is one more issue left to fix (the $subject >> patch is already applied) - this one comes from threaded irq (if I got >> the stack trace right): > This is not `threadirqs' on the command line, right? I don't have 'threadirqs' in the kernel cmdline for that board. Best regards
On 2022-02-17 15:08:55 [+0100], Marek Szyprowski wrote: > Hi All, Hi, > >> Marek, does this work for you? > > > > Yes, this fixed the issue. Thanks! > > > > Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> > > I've just noticed that there is one more issue left to fix (the $subject > patch is already applied) - this one comes from threaded irq (if I got > the stack trace right): netif_rx() did only set the matching softirq bit and not more. Based on that I don't see why NOHZ shouldn't complain about a pending softirq once the CPU goes idle. Therefore I think the change I made is good since it uncovered that. > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 147 at kernel/softirq.c:363 > __local_bh_enable_ip+0xa8/0x1c0 … > CPU: 0 PID: 147 Comm: irq/150-dwc3 Not tainted 5.17.0-rc4-next-20220217+ > #4557 > Hardware name: Samsung TM2E board (DT) > pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) > pc : __local_bh_enable_ip+0xa8/0x1c0 > lr : netif_rx+0xa4/0x2c0 > ... > > Call trace: > __local_bh_enable_ip+0xa8/0x1c0 > netif_rx+0xa4/0x2c0 > rx_complete+0x214/0x250 > usb_gadget_giveback_request+0x58/0x170 > dwc3_gadget_giveback+0xe4/0x200 > dwc3_gadget_endpoint_trbs_complete+0x100/0x388 > dwc3_thread_interrupt+0x46c/0xe20 So dwc3_thread_interrupt() disables interrupts here. Felipe dropped it and then added it back in e5f68b4a3e7b0 ("Revert "usb: dwc3: gadget: remove unnecessary _irqsave()"") I would suggest to revert it (the above commit) and fixing the lockdep splat in the gadget driver and other. I don't see the g_ether warning Felipe mentioned. It might come from f_ncm (since it uses a timer) or something else in the network stack (that uses a timeout timer). But not now. As much as I hate it, I suggest: diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 183b90923f51b..a0c883f19a417 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -4160,9 +4160,11 @@ static irqreturn_t dwc3_thread_interrupt(int irq, void *_evt) unsigned long flags; irqreturn_t ret = IRQ_NONE; + local_bh_disable(); spin_lock_irqsave(&dwc->lock, flags); ret = dwc3_process_event_buf(evt); spin_unlock_irqrestore(&dwc->lock, flags); + local_bh_enable(); return ret; } In the long run I would drop that irqsave (along with bh_disable() since netif_rx() covers that) and make sure the there is no lockdep warning popping up. Marek, could you please give it a try? Sebastian
Hi, On 18.02.2022 10:34, Sebastian Andrzej Siewior wrote: > On 2022-02-17 15:08:55 [+0100], Marek Szyprowski wrote: >>>> Marek, does this work for you? >>> Yes, this fixed the issue. Thanks! >>> >>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> >> I've just noticed that there is one more issue left to fix (the $subject >> patch is already applied) - this one comes from threaded irq (if I got >> the stack trace right): > netif_rx() did only set the matching softirq bit and not more. Based on > that I don't see why NOHZ shouldn't complain about a pending softirq > once the CPU goes idle. Therefore I think the change I made is good > since it uncovered that. > >> ------------[ cut here ]------------ >> WARNING: CPU: 0 PID: 147 at kernel/softirq.c:363 >> __local_bh_enable_ip+0xa8/0x1c0 > … >> CPU: 0 PID: 147 Comm: irq/150-dwc3 Not tainted 5.17.0-rc4-next-20220217+ >> #4557 >> Hardware name: Samsung TM2E board (DT) >> pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) >> pc : __local_bh_enable_ip+0xa8/0x1c0 >> lr : netif_rx+0xa4/0x2c0 >> ... >> >> Call trace: >> __local_bh_enable_ip+0xa8/0x1c0 >> netif_rx+0xa4/0x2c0 >> rx_complete+0x214/0x250 >> usb_gadget_giveback_request+0x58/0x170 >> dwc3_gadget_giveback+0xe4/0x200 >> dwc3_gadget_endpoint_trbs_complete+0x100/0x388 >> dwc3_thread_interrupt+0x46c/0xe20 > So dwc3_thread_interrupt() disables interrupts here. Felipe dropped it > and then added it back in > e5f68b4a3e7b0 ("Revert "usb: dwc3: gadget: remove unnecessary _irqsave()"") > > I would suggest to revert it (the above commit) and fixing the lockdep > splat in the gadget driver and other. I don't see the g_ether warning > Felipe mentioned. It might come from f_ncm (since it uses a timer) or > something else in the network stack (that uses a timeout timer). > But not now. > As much as I hate it, I suggest: > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > index 183b90923f51b..a0c883f19a417 100644 > --- a/drivers/usb/dwc3/gadget.c > +++ b/drivers/usb/dwc3/gadget.c > @@ -4160,9 +4160,11 @@ static irqreturn_t dwc3_thread_interrupt(int irq, void *_evt) > unsigned long flags; > irqreturn_t ret = IRQ_NONE; > > + local_bh_disable(); > spin_lock_irqsave(&dwc->lock, flags); > ret = dwc3_process_event_buf(evt); > spin_unlock_irqrestore(&dwc->lock, flags); > + local_bh_enable(); > > return ret; > } > > > In the long run I would drop that irqsave (along with bh_disable() since > netif_rx() covers that) and make sure the there is no lockdep warning > popping up. > > Marek, could you please give it a try? Yes. The above change fixes the issue. I've also tried to revert the mentioned commit e5f68b4a3e7b, but this gives me the following lockdep warning: IPv6: ADDRCONF(NETDEV_CHANGE): usb0: link becomes ready ======================================================== WARNING: possible irq lock inversion dependency detected 5.17.0-rc4-next-20220217+ #4563 Not tainted -------------------------------------------------------- swapper/0/0 just changed the state of lock: ffff000025548098 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x2f0/0x360 but this lock took another, SOFTIRQ-unsafe lock in the past: (&dev->lock#2){+.+.}-{2:2} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&dev->lock#2); local_irq_disable(); lock(_xmit_ETHER#2); lock(&dev->lock#2); <Interrupt> lock(_xmit_ETHER#2); *** DEADLOCK *** 5 locks held by swapper/0/0: #0: ffff80000a5f39e0 ((&ndev->rs_timer)){+.-.}-{0:0}, at: call_timer_fn+0x0/0x3b0 #1: ffff80000a695e00 (rcu_read_lock){....}-{1:2}, at: ndisc_send_skb+0x138/0x730 [ipv6] #2: ffff80000a696290 (rcu_read_lock_bh){....}-{1:2}, at: ip6_finish_output2+0x94/0xbe8 [ipv6] #3: ffff80000a696290 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x64/0xec8 #4: ffff0000246cd258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_queue_xmit+0x9a4/0xec8 the shortest dependencies between 2nd lock and 1st lock: -> (&dev->lock#2){+.+.}-{2:2} { HARDIRQ-ON-W at: lock_acquire+0x120/0x3c8 _raw_spin_lock+0x58/0x78 gether_connect+0x118/0x1c0 ecm_set_alt+0xf4/0x170 composite_setup+0x8d0/0x1820 dwc3_ep0_delegate_req+0x40/0x68 dwc3_ep0_interrupt+0x5a4/0xf88 dwc3_thread_interrupt+0xb7c/0xe18 irq_thread_fn+0x28/0x98 irq_thread+0x184/0x238 kthread+0x100/0x120 ret_from_fork+0x10/0x20 SOFTIRQ-ON-W at: lock_acquire+0x120/0x3c8 _raw_spin_lock+0x58/0x78 gether_connect+0x118/0x1c0 ecm_set_alt+0xf4/0x170 composite_setup+0x8d0/0x1820 dwc3_ep0_delegate_req+0x40/0x68 dwc3_ep0_interrupt+0x5a4/0xf88 dwc3_thread_interrupt+0xb7c/0xe18 irq_thread_fn+0x28/0x98 irq_thread+0x184/0x238 kthread+0x100/0x120 ret_from_fork+0x10/0x20 INITIAL USE at: lock_acquire+0x120/0x3c8 _raw_spin_lock_irq+0x74/0xa0 eth_open+0x28/0x78 __dev_open+0x100/0x190 __dev_change_flags+0x16c/0x1b8 dev_change_flags+0x20/0x60 do_setlink+0x314/0xd28 __rtnl_newlink+0x418/0x798 rtnl_newlink+0x4c/0x78 rtnetlink_rcv_msg+0x2bc/0x4e0 netlink_rcv_skb+0xe8/0x130 rtnetlink_rcv+0x14/0x20 netlink_unicast+0x1d4/0x280 netlink_sendmsg+0x2cc/0x408 ____sys_sendmsg+0x258/0x290 ___sys_sendmsg+0x80/0xc0 __sys_sendmsg+0x60/0xb8 __arm64_sys_sendmsg+0x1c/0x28 invoke_syscall+0x40/0xf8 el0_svc_common.constprop.3+0x8c/0x120 do_el0_svc+0x20/0x98 el0_svc+0x48/0x100 el0t_64_sync_handler+0x8c/0xb0 el0t_64_sync+0x15c/0x160 } ... key at: [<ffff80000b5004b8>] __key.60862+0x0/0x10 ... acquired at: _raw_spin_lock_irqsave+0x78/0x148 eth_start_xmit+0x30/0x3a0 dev_hard_start_xmit+0x108/0x3d8 sch_direct_xmit+0xf4/0x360 __dev_queue_xmit+0xa10/0xec8 dev_queue_xmit+0x10/0x18 neigh_resolve_output+0x18c/0x290 ip6_finish_output2+0x20c/0xbe8 [ipv6] __ip6_finish_output+0x104/0x2b8 [ipv6] ip6_finish_output+0x30/0x108 [ipv6] ip6_output+0x80/0x360 [ipv6] mld_sendpack+0x570/0x5a0 [ipv6] mld_ifc_work+0x2a4/0x4a0 [ipv6] process_one_work+0x29c/0x6b0 worker_thread+0x48/0x420 kthread+0x100/0x120 ret_from_fork+0x10/0x20 -> (_xmit_ETHER#2){+.-.}-{2:2} { HARDIRQ-ON-W at: lock_acquire+0x120/0x3c8 _raw_spin_lock+0x58/0x78 sch_direct_xmit+0x2f0/0x360 __dev_queue_xmit+0xa10/0xec8 dev_queue_xmit+0x10/0x18 neigh_resolve_output+0x18c/0x290 ip6_finish_output2+0x20c/0xbe8 [ipv6] __ip6_finish_output+0x104/0x2b8 [ipv6] ip6_finish_output+0x30/0x108 [ipv6] ip6_output+0x80/0x360 [ipv6] mld_sendpack+0x570/0x5a0 [ipv6] mld_ifc_work+0x2a4/0x4a0 [ipv6] process_one_work+0x29c/0x6b0 worker_thread+0x48/0x420 kthread+0x100/0x120 ret_from_fork+0x10/0x20 IN-SOFTIRQ-W at: lock_acquire+0x120/0x3c8 _raw_spin_lock+0x58/0x78 sch_direct_xmit+0x2f0/0x360 __dev_queue_xmit+0xa10/0xec8 dev_queue_xmit+0x10/0x18 ip6_finish_output2+0x8bc/0xbe8 [ipv6] __ip6_finish_output+0x104/0x2b8 [ipv6] ip6_finish_output+0x30/0x108 [ipv6] ip6_output+0x80/0x360 [ipv6] ndisc_send_skb+0x464/0x730 [ipv6] ndisc_send_rs+0x58/0x138 [ipv6] addrconf_rs_timer+0x140/0x218 [ipv6] call_timer_fn+0xb4/0x3b0 run_timer_softirq+0x290/0x678 _stext+0x11c/0x5cc irq_exit_rcu+0x168/0x1a8 el1_interrupt+0x40/0x128 el1h_64_irq_handler+0x14/0x20 el1h_64_irq+0x64/0x68 arch_cpu_idle+0x14/0x20 default_idle_call+0x78/0x350 do_idle+0x1f0/0x280 cpu_startup_entry+0x24/0x80 rest_init+0x180/0x290 arch_call_rest_init+0xc/0x14 start_kernel+0x694/0x6cc __primary_switched+0xc0/0xc8 INITIAL USE at: lock_acquire+0x120/0x3c8 _raw_spin_lock+0x58/0x78 sch_direct_xmit+0x2f0/0x360 __dev_queue_xmit+0xa10/0xec8 dev_queue_xmit+0x10/0x18 neigh_resolve_output+0x18c/0x290 ip6_finish_output2+0x20c/0xbe8 [ipv6] __ip6_finish_output+0x104/0x2b8 [ipv6] ip6_finish_output+0x30/0x108 [ipv6] ip6_output+0x80/0x360 [ipv6] mld_sendpack+0x570/0x5a0 [ipv6] mld_ifc_work+0x2a4/0x4a0 [ipv6] process_one_work+0x29c/0x6b0 worker_thread+0x48/0x420 kthread+0x100/0x120 ret_from_fork+0x10/0x20 } ... key at: [<ffff80000b507ca8>] netdev_xmit_lock_key+0x10/0x390 ... acquired at: __lock_acquire+0x564/0x16f8 lock_acquire+0x120/0x3c8 _raw_spin_lock+0x58/0x78 sch_direct_xmit+0x2f0/0x360 __dev_queue_xmit+0xa10/0xec8 dev_queue_xmit+0x10/0x18 ip6_finish_output2+0x8bc/0xbe8 [ipv6] __ip6_finish_output+0x104/0x2b8 [ipv6] ip6_finish_output+0x30/0x108 [ipv6] ip6_output+0x80/0x360 [ipv6] ndisc_send_skb+0x464/0x730 [ipv6] ndisc_send_rs+0x58/0x138 [ipv6] addrconf_rs_timer+0x140/0x218 [ipv6] call_timer_fn+0xb4/0x3b0 run_timer_softirq+0x290/0x678 _stext+0x11c/0x5cc irq_exit_rcu+0x168/0x1a8 el1_interrupt+0x40/0x128 el1h_64_irq_handler+0x14/0x20 el1h_64_irq+0x64/0x68 arch_cpu_idle+0x14/0x20 default_idle_call+0x78/0x350 do_idle+0x1f0/0x280 cpu_startup_entry+0x24/0x80 rest_init+0x180/0x290 arch_call_rest_init+0xc/0x14 start_kernel+0x694/0x6cc __primary_switched+0xc0/0xc8 stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc4-next-20220217+ #4563 Hardware name: Samsung TM2E board (DT) Call trace: dump_backtrace.part.6+0xd8/0xe8 show_stack+0x14/0x60 dump_stack_lvl+0x88/0xb0 dump_stack+0x14/0x2c print_irq_inversion_bug+0x194/0x1f8 mark_lock.part.52+0x29c/0x440 __lock_acquire+0x564/0x16f8 lock_acquire+0x120/0x3c8 _raw_spin_lock+0x58/0x78 sch_direct_xmit+0x2f0/0x360 __dev_queue_xmit+0xa10/0xec8 dev_queue_xmit+0x10/0x18 ip6_finish_output2+0x8bc/0xbe8 [ipv6] __ip6_finish_output+0x104/0x2b8 [ipv6] ip6_finish_output+0x30/0x108 [ipv6] ip6_output+0x80/0x360 [ipv6] ndisc_send_skb+0x464/0x730 [ipv6] ndisc_send_rs+0x58/0x138 [ipv6] addrconf_rs_timer+0x140/0x218 [ipv6] call_timer_fn+0xb4/0x3b0 run_timer_softirq+0x290/0x678 _stext+0x11c/0x5cc irq_exit_rcu+0x168/0x1a8 el1_interrupt+0x40/0x128 el1h_64_irq_handler+0x14/0x20 el1h_64_irq+0x64/0x68 arch_cpu_idle+0x14/0x20 default_idle_call+0x78/0x350 do_idle+0x1f0/0x280 cpu_startup_entry+0x24/0x80 rest_init+0x180/0x290 arch_call_rest_init+0xc/0x14 start_kernel+0x694/0x6cc __primary_switched+0xc0/0xc8 Best regards
On 2022-02-18 11:02:35 [+0100], Marek Szyprowski wrote: > Hi, Hi, > > Marek, could you please give it a try? > > Yes. The above change fixes the issue. Awesome. I will make a patch and send today. > I've also tried to revert the mentioned commit e5f68b4a3e7b, but this > gives me the following lockdep warning: Oh oh. Thank you for that splat. Sebastian
Hi Sebastian, On Wed, 16 Feb 2022, Sebastian Andrzej Siewior wrote: > I missed the obvious case where netif_ix() is invoked from hard-IRQ > context. > > Disabling bottom halves is only needed in process context. This ensures > that the code remains on the current CPU and that the soft-interrupts > are processed at local_bh_enable() time. > In hard- and soft-interrupt context this is already the case and the > soft-interrupts will be processed once the context is left (at irq-exit > time). > > Disable bottom halves if neither hard-interrupts nor soft-interrupts are > disabled. Update the kernel-doc, mention that interrupts must be enabled > if invoked from process context. > > Fixes: baebdf48c3600 ("net: dev: Makes sure netif_rx() can be invoked in any context.") > Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Thanks, this fixes the issue on rbtx4927[1]. Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> [1] https://lore.kernel.org/all/alpine.DEB.2.22.394.2202221622570.372449@ramsan.of.borg/ Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On 2022-02-16 18:50:46 [+0100], Sebastian Andrzej Siewior wrote: > I missed the obvious case where netif_ix() is invoked from hard-IRQ > context. > > Disabling bottom halves is only needed in process context. This ensures > that the code remains on the current CPU and that the soft-interrupts > are processed at local_bh_enable() time. > In hard- and soft-interrupt context this is already the case and the > soft-interrupts will be processed once the context is left (at irq-exit > time). > > Disable bottom halves if neither hard-interrupts nor soft-interrupts are > disabled. Update the kernel-doc, mention that interrupts must be enabled > if invoked from process context. > > Fixes: baebdf48c3600 ("net: dev: Makes sure netif_rx() can be invoked in any context.") > Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> This patch is marked as "Changes Requested" in patchwork. Could someone please explain? The USB/dwc3 fallout reported by Marek was addressed in usb: dwc3: gadget: Let the interrupt handler disable bottom halves. https://lore.kernel.org/r/Yg/YPejVQH3KkRVd@linutronix.de and is not a shortcoming in this patch but a problem in dwc3 that was just noticed. Sebastian
On Wed, 23 Feb 2022 09:05:43 +0100 Sebastian Andrzej Siewior wrote: > On 2022-02-16 18:50:46 [+0100], Sebastian Andrzej Siewior wrote: > > I missed the obvious case where netif_ix() is invoked from hard-IRQ > > context. > > > > Disabling bottom halves is only needed in process context. This ensures > > that the code remains on the current CPU and that the soft-interrupts > > are processed at local_bh_enable() time. > > In hard- and soft-interrupt context this is already the case and the > > soft-interrupts will be processed once the context is left (at irq-exit > > time). > > > > Disable bottom halves if neither hard-interrupts nor soft-interrupts are > > disabled. Update the kernel-doc, mention that interrupts must be enabled > > if invoked from process context. > > > > Fixes: baebdf48c3600 ("net: dev: Makes sure netif_rx() can be invoked in any context.") > > Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > > This patch is marked as "Changes Requested" in patchwork. Could someone > please explain? > The USB/dwc3 fallout reported by Marek was addressed in > usb: dwc3: gadget: Let the interrupt handler disable bottom halves. > https://lore.kernel.org/r/Yg/YPejVQH3KkRVd@linutronix.de > > and is not a shortcoming in this patch but a problem in dwc3 that was > just noticed. Unclear, let me apply it now.
diff --git a/net/core/dev.c b/net/core/dev.c index 909fb38159108..87729491460fc 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4860,7 +4860,9 @@ EXPORT_SYMBOL(__netif_rx); * congestion control or by the protocol layers. * The network buffer is passed via the backlog NAPI device. Modern NIC * driver should use NAPI and GRO. - * This function can used from any context. + * This function can used from interrupt and from process context. The + * caller from process context must not disable interrupts before invoking + * this function. * * return values: * NET_RX_SUCCESS (no congestion) @@ -4870,12 +4872,15 @@ EXPORT_SYMBOL(__netif_rx); int netif_rx(struct sk_buff *skb) { int ret; + bool need_bh_off = !(hardirq_count() | softirq_count()); - local_bh_disable(); + if (need_bh_off) + local_bh_disable(); trace_netif_rx_entry(skb); ret = netif_rx_internal(skb); trace_netif_rx_exit(ret); - local_bh_enable(); + if (need_bh_off) + local_bh_enable(); return ret; } EXPORT_SYMBOL(netif_rx);
I missed the obvious case where netif_ix() is invoked from hard-IRQ context. Disabling bottom halves is only needed in process context. This ensures that the code remains on the current CPU and that the soft-interrupts are processed at local_bh_enable() time. In hard- and soft-interrupt context this is already the case and the soft-interrupts will be processed once the context is left (at irq-exit time). Disable bottom halves if neither hard-interrupts nor soft-interrupts are disabled. Update the kernel-doc, mention that interrupts must be enabled if invoked from process context. Fixes: baebdf48c3600 ("net: dev: Makes sure netif_rx() can be invoked in any context.") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- Marek, does this work for you? net/core/dev.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)