Message ID | 20230908112913.1701766-1-s.hauer@pengutronix.de (mailing list archive) |
---|---|
State | Accepted |
Commit | 403f0e771457e2b8811dc280719d11b9bacf10f4 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: macb: fix sleep inside spinlock | expand |
On Fri, 2023-09-08 at 13:29 +0200, Sascha Hauer wrote: > macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate() > which can sleep. This results in: > > > BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 > > pps pps1: new PPS source ptp1 > > in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3 > > preempt_count: 1, expected: 0 > > RCU nest depth: 0, expected: 0 > > 4 locks held by kworker/u4:3/40: > > #0: ffff000003409148 > > macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered. > > ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c > > #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c > > #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8 > > #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac > > irq event stamp: 113998 > > hardirqs last enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64 > > hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8 > > softirqs last enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4 > > softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c > > CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368 > > Hardware name: ... ZynqMP ... (DT) > > Workqueue: events_power_efficient phylink_resolve > > Call trace: > > dump_backtrace+0x98/0xf0 > > show_stack+0x18/0x24 > > dump_stack_lvl+0x60/0xac > > dump_stack+0x18/0x24 > > __might_resched+0x144/0x24c > > __might_sleep+0x48/0x98 > > __mutex_lock+0x58/0x7b0 > > mutex_lock_nested+0x24/0x30 > > clk_prepare_lock+0x4c/0xa8 > > clk_set_rate+0x24/0x8c > > macb_mac_link_up+0x25c/0x2ac > > phylink_resolve+0x178/0x4e8 > > process_one_work+0x1ec/0x51c > > worker_thread+0x1ec/0x3e4 > > kthread+0x120/0x124 > > ret_from_fork+0x10/0x20 > > The obvious fix is to move the call to macb_set_tx_clk() out of the > protected area. This seems safe as rx and tx are both disabled anyway at > this point. > It is however not entirely clear what the spinlock shall protect. It > could be the read-modify-write access to the NCFGR register, but this > is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well > without holding the spinlock. It could also be the register accesses > done in mog_init_rings() or macb_init_buffers(), but again these > functions are called without holding the spinlock in macb_hresp_error_task(). > The locking seems fishy in this driver and it might deserve another look > before this patch is applied. macb_set_tx_clk() moved under the bp->lock scope as a consequence of the blamed commit. Such commit moved a bunch of code originally in macb_mac_config() and under the bp->lock lock into macb_mac_link_up(). I guess that the lock was added to the latter function to be on the "safe side", but it looks like macb_set_tx_clk() does not need it. The patch LGTM, thanks! Paolo
Hello: This patch was applied to netdev/net.git (main) by Paolo Abeni <pabeni@redhat.com>: On Fri, 8 Sep 2023 13:29:13 +0200 you wrote: > macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate() > which can sleep. This results in: > > | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 > | pps pps1: new PPS source ptp1 > | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3 > | preempt_count: 1, expected: 0 > | RCU nest depth: 0, expected: 0 > | 4 locks held by kworker/u4:3/40: > | #0: ffff000003409148 > | macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered. > | ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c > | #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c > | #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8 > | #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac > | irq event stamp: 113998 > | hardirqs last enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64 > | hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8 > | softirqs last enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4 > | softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c > | CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368 > | Hardware name: ... ZynqMP ... (DT) > | Workqueue: events_power_efficient phylink_resolve > | Call trace: > | dump_backtrace+0x98/0xf0 > | show_stack+0x18/0x24 > | dump_stack_lvl+0x60/0xac > | dump_stack+0x18/0x24 > | __might_resched+0x144/0x24c > | __might_sleep+0x48/0x98 > | __mutex_lock+0x58/0x7b0 > | mutex_lock_nested+0x24/0x30 > | clk_prepare_lock+0x4c/0xa8 > | clk_set_rate+0x24/0x8c > | macb_mac_link_up+0x25c/0x2ac > | phylink_resolve+0x178/0x4e8 > | process_one_work+0x1ec/0x51c > | worker_thread+0x1ec/0x3e4 > | kthread+0x120/0x124 > | ret_from_fork+0x10/0x20 > > [...] Here is the summary with links: - net: macb: fix sleep inside spinlock https://git.kernel.org/netdev/net/c/403f0e771457 You are awesome, thank you!
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 31f664ee4d778..b940dcd3ace68 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -756,8 +756,6 @@ static void macb_mac_link_up(struct phylink_config *config, if (rx_pause) ctrl |= MACB_BIT(PAE); - macb_set_tx_clk(bp, speed); - /* Initialize rings & buffers as clearing MACB_BIT(TE) in link down * cleared the pipeline and control registers. */ @@ -777,6 +775,9 @@ static void macb_mac_link_up(struct phylink_config *config, spin_unlock_irqrestore(&bp->lock, flags); + if (!(bp->caps & MACB_CAPS_MACB_IS_EMAC)) + macb_set_tx_clk(bp, speed); + /* Enable Rx and Tx; Enable PTP unicast */ ctrl = macb_readl(bp, NCR); if (gem_has_ptp(bp))
macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate() which can sleep. This results in: | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 | pps pps1: new PPS source ptp1 | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3 | preempt_count: 1, expected: 0 | RCU nest depth: 0, expected: 0 | 4 locks held by kworker/u4:3/40: | #0: ffff000003409148 | macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered. | ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c | #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c | #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8 | #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac | irq event stamp: 113998 | hardirqs last enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64 | hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8 | softirqs last enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4 | softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c | CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368 | Hardware name: ... ZynqMP ... (DT) | Workqueue: events_power_efficient phylink_resolve | Call trace: | dump_backtrace+0x98/0xf0 | show_stack+0x18/0x24 | dump_stack_lvl+0x60/0xac | dump_stack+0x18/0x24 | __might_resched+0x144/0x24c | __might_sleep+0x48/0x98 | __mutex_lock+0x58/0x7b0 | mutex_lock_nested+0x24/0x30 | clk_prepare_lock+0x4c/0xa8 | clk_set_rate+0x24/0x8c | macb_mac_link_up+0x25c/0x2ac | phylink_resolve+0x178/0x4e8 | process_one_work+0x1ec/0x51c | worker_thread+0x1ec/0x3e4 | kthread+0x120/0x124 | ret_from_fork+0x10/0x20 The obvious fix is to move the call to macb_set_tx_clk() out of the protected area. This seems safe as rx and tx are both disabled anyway at this point. It is however not entirely clear what the spinlock shall protect. It could be the read-modify-write access to the NCFGR register, but this is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well without holding the spinlock. It could also be the register accesses done in mog_init_rings() or macb_init_buffers(), but again these functions are called without holding the spinlock in macb_hresp_error_task(). The locking seems fishy in this driver and it might deserve another look before this patch is applied. Fixes: 633e98a711ac0 ("net: macb: use resolved link config in mac_link_up()") Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> --- drivers/net/ethernet/cadence/macb_main.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)