diff mbox series

net: macb: fix sleep inside spinlock

Message ID 20230908112913.1701766-1-s.hauer@pengutronix.de (mailing list archive)
State Accepted
Commit 403f0e771457e2b8811dc280719d11b9bacf10f4
Delegated to: Netdev Maintainers
Headers show
Series net: macb: fix sleep inside spinlock | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1361 this patch: 1361
netdev/cc_maintainers fail 1 blamed authors not CCed: davem@davemloft.net; 5 maintainers not CCed: pabeni@redhat.com richardcochran@gmail.com davem@davemloft.net edumazet@google.com kuba@kernel.org
netdev/build_clang success Errors and warnings before: 1363 this patch: 1363
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1385 this patch: 1385
netdev/checkpatch warning WARNING: Please use correct Fixes: style 'Fixes: <12 chars of sha1> ("<title line>")' - ie: 'Fixes: 633e98a711ac ("net: macb: use resolved link config in mac_link_up()")'
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Sascha Hauer Sept. 8, 2023, 11:29 a.m. UTC
macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate()
which can sleep. This results in:

| BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
| pps pps1: new PPS source ptp1
| in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3
| preempt_count: 1, expected: 0
| RCU nest depth: 0, expected: 0
| 4 locks held by kworker/u4:3/40:
|  #0: ffff000003409148
| macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered.
|  ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
|  #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
|  #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8
|  #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac
| irq event stamp: 113998
| hardirqs last  enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64
| hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8
| softirqs last  enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4
| softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c
| CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368
| Hardware name: ... ZynqMP ... (DT)
| Workqueue: events_power_efficient phylink_resolve
| Call trace:
|  dump_backtrace+0x98/0xf0
|  show_stack+0x18/0x24
|  dump_stack_lvl+0x60/0xac
|  dump_stack+0x18/0x24
|  __might_resched+0x144/0x24c
|  __might_sleep+0x48/0x98
|  __mutex_lock+0x58/0x7b0
|  mutex_lock_nested+0x24/0x30
|  clk_prepare_lock+0x4c/0xa8
|  clk_set_rate+0x24/0x8c
|  macb_mac_link_up+0x25c/0x2ac
|  phylink_resolve+0x178/0x4e8
|  process_one_work+0x1ec/0x51c
|  worker_thread+0x1ec/0x3e4
|  kthread+0x120/0x124
|  ret_from_fork+0x10/0x20

The obvious fix is to move the call to macb_set_tx_clk() out of the
protected area. This seems safe as rx and tx are both disabled anyway at
this point.
It is however not entirely clear what the spinlock shall protect. It
could be the read-modify-write access to the NCFGR register, but this
is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well
without holding the spinlock. It could also be the register accesses
done in mog_init_rings() or macb_init_buffers(), but again these
functions are called without holding the spinlock in macb_hresp_error_task().
The locking seems fishy in this driver and it might deserve another look
before this patch is applied.

Fixes: 633e98a711ac0 ("net: macb: use resolved link config in mac_link_up()")
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 drivers/net/ethernet/cadence/macb_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Paolo Abeni Sept. 12, 2023, 7:08 a.m. UTC | #1
On Fri, 2023-09-08 at 13:29 +0200, Sascha Hauer wrote:
> macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate()
> which can sleep. This results in:
> 
> > BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
> > pps pps1: new PPS source ptp1
> > in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3
> > preempt_count: 1, expected: 0
> > RCU nest depth: 0, expected: 0
> > 4 locks held by kworker/u4:3/40:
> >  #0: ffff000003409148
> > macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered.
> >  ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
> >  #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
> >  #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8
> >  #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac
> > irq event stamp: 113998
> > hardirqs last  enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64
> > hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8
> > softirqs last  enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4
> > softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c
> > CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368
> > Hardware name: ... ZynqMP ... (DT)
> > Workqueue: events_power_efficient phylink_resolve
> > Call trace:
> >  dump_backtrace+0x98/0xf0
> >  show_stack+0x18/0x24
> >  dump_stack_lvl+0x60/0xac
> >  dump_stack+0x18/0x24
> >  __might_resched+0x144/0x24c
> >  __might_sleep+0x48/0x98
> >  __mutex_lock+0x58/0x7b0
> >  mutex_lock_nested+0x24/0x30
> >  clk_prepare_lock+0x4c/0xa8
> >  clk_set_rate+0x24/0x8c
> >  macb_mac_link_up+0x25c/0x2ac
> >  phylink_resolve+0x178/0x4e8
> >  process_one_work+0x1ec/0x51c
> >  worker_thread+0x1ec/0x3e4
> >  kthread+0x120/0x124
> >  ret_from_fork+0x10/0x20
> 
> The obvious fix is to move the call to macb_set_tx_clk() out of the
> protected area. This seems safe as rx and tx are both disabled anyway at
> this point.
> It is however not entirely clear what the spinlock shall protect. It
> could be the read-modify-write access to the NCFGR register, but this
> is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well
> without holding the spinlock. It could also be the register accesses
> done in mog_init_rings() or macb_init_buffers(), but again these
> functions are called without holding the spinlock in macb_hresp_error_task().
> The locking seems fishy in this driver and it might deserve another look
> before this patch is applied.

macb_set_tx_clk() moved under the bp->lock scope as a consequence of
the blamed commit. Such commit moved a bunch of code originally in
macb_mac_config() and under the bp->lock lock into macb_mac_link_up(). 

I guess that the lock was added to the latter function to be on the
"safe side", but it looks like macb_set_tx_clk() does not need it.

The patch LGTM, thanks!

Paolo
patchwork-bot+netdevbpf@kernel.org Sept. 12, 2023, 1:20 p.m. UTC | #2
Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Fri,  8 Sep 2023 13:29:13 +0200 you wrote:
> macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate()
> which can sleep. This results in:
> 
> | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
> | pps pps1: new PPS source ptp1
> | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3
> | preempt_count: 1, expected: 0
> | RCU nest depth: 0, expected: 0
> | 4 locks held by kworker/u4:3/40:
> |  #0: ffff000003409148
> | macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered.
> |  ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
> |  #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
> |  #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8
> |  #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac
> | irq event stamp: 113998
> | hardirqs last  enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64
> | hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8
> | softirqs last  enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4
> | softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c
> | CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368
> | Hardware name: ... ZynqMP ... (DT)
> | Workqueue: events_power_efficient phylink_resolve
> | Call trace:
> |  dump_backtrace+0x98/0xf0
> |  show_stack+0x18/0x24
> |  dump_stack_lvl+0x60/0xac
> |  dump_stack+0x18/0x24
> |  __might_resched+0x144/0x24c
> |  __might_sleep+0x48/0x98
> |  __mutex_lock+0x58/0x7b0
> |  mutex_lock_nested+0x24/0x30
> |  clk_prepare_lock+0x4c/0xa8
> |  clk_set_rate+0x24/0x8c
> |  macb_mac_link_up+0x25c/0x2ac
> |  phylink_resolve+0x178/0x4e8
> |  process_one_work+0x1ec/0x51c
> |  worker_thread+0x1ec/0x3e4
> |  kthread+0x120/0x124
> |  ret_from_fork+0x10/0x20
> 
> [...]

Here is the summary with links:
  - net: macb: fix sleep inside spinlock
    https://git.kernel.org/netdev/net/c/403f0e771457

You are awesome, thank you!
diff mbox series

Patch

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 31f664ee4d778..b940dcd3ace68 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -756,8 +756,6 @@  static void macb_mac_link_up(struct phylink_config *config,
 		if (rx_pause)
 			ctrl |= MACB_BIT(PAE);
 
-		macb_set_tx_clk(bp, speed);
-
 		/* Initialize rings & buffers as clearing MACB_BIT(TE) in link down
 		 * cleared the pipeline and control registers.
 		 */
@@ -777,6 +775,9 @@  static void macb_mac_link_up(struct phylink_config *config,
 
 	spin_unlock_irqrestore(&bp->lock, flags);
 
+	if (!(bp->caps & MACB_CAPS_MACB_IS_EMAC))
+		macb_set_tx_clk(bp, speed);
+
 	/* Enable Rx and Tx; Enable PTP unicast */
 	ctrl = macb_readl(bp, NCR);
 	if (gem_has_ptp(bp))