diff mbox series

[net] net: phy: skip LED triggers on PHYs on SFP modules

Message ID 102a9dce38bdf00215735d04cd4704458273ad9c.1702339354.git.daniel@makrotopia.org (mailing list archive)
State Accepted
Commit b1dfc0f76231bbf395c59d20a2070684620d5d0f
Delegated to: Netdev Maintainers
Headers show
Series [net] net: phy: skip LED triggers on PHYs on SFP modules | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present fail Series targets non-next tree, but doesn't contain any Fixes tags
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1115 this patch: 1115
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/build_clang success Errors and warnings before: 1142 this patch: 1142
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1142 this patch: 1142
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 18 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Daniel Golle Dec. 12, 2023, 12:05 a.m. UTC
Calling led_trigger_register() when attaching a PHY located on an SFP
module potentially (and practically) leads into a deadlock.
Fix this by not calling led_trigger_register() for PHYs localted on SFP
modules as such modules actually never got any LEDs.

======================================================
WARNING: possible circular locking dependency detected
6.7.0-rc4-next-20231208+ #0 Tainted: G           O
------------------------------------------------------
kworker/u8:2/43 is trying to acquire lock:
ffffffc08108c4e8 (triggers_list_lock){++++}-{3:3}, at: led_trigger_register+0x4c/0x1a8

but task is already holding lock:
ffffff80c5c6f318 (&sfp->sm_mutex){+.+.}-{3:3}, at: cleanup_module+0x2ba8/0x3120 [sfp]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (&sfp->sm_mutex){+.+.}-{3:3}:
       __mutex_lock+0x88/0x7a0
       mutex_lock_nested+0x20/0x28
       cleanup_module+0x2ae0/0x3120 [sfp]
       sfp_register_bus+0x5c/0x9c
       sfp_register_socket+0x48/0xd4
       cleanup_module+0x271c/0x3120 [sfp]
       platform_probe+0x64/0xb8
       really_probe+0x17c/0x3c0
       __driver_probe_device+0x78/0x164
       driver_probe_device+0x3c/0xd4
       __driver_attach+0xec/0x1f0
       bus_for_each_dev+0x60/0xa0
       driver_attach+0x20/0x28
       bus_add_driver+0x108/0x208
       driver_register+0x5c/0x118
       __platform_driver_register+0x24/0x2c
       init_module+0x28/0xa7c [sfp]
       do_one_initcall+0x70/0x2ec
       do_init_module+0x54/0x1e4
       load_module+0x1b78/0x1c8c
       __do_sys_init_module+0x1bc/0x2cc
       __arm64_sys_init_module+0x18/0x20
       invoke_syscall.constprop.0+0x4c/0xdc
       do_el0_svc+0x3c/0xbc
       el0_svc+0x34/0x80
       el0t_64_sync_handler+0xf8/0x124
       el0t_64_sync+0x150/0x154

-> #2 (rtnl_mutex){+.+.}-{3:3}:
       __mutex_lock+0x88/0x7a0
       mutex_lock_nested+0x20/0x28
       rtnl_lock+0x18/0x20
       set_device_name+0x30/0x130
       netdev_trig_activate+0x13c/0x1ac
       led_trigger_set+0x118/0x234
       led_trigger_write+0x104/0x17c
       sysfs_kf_bin_write+0x64/0x80
       kernfs_fop_write_iter+0x128/0x1b4
       vfs_write+0x178/0x2a4
       ksys_write+0x58/0xd4
       __arm64_sys_write+0x18/0x20
       invoke_syscall.constprop.0+0x4c/0xdc
       do_el0_svc+0x3c/0xbc
       el0_svc+0x34/0x80
       el0t_64_sync_handler+0xf8/0x124
       el0t_64_sync+0x150/0x154

-> #1 (&led_cdev->trigger_lock){++++}-{3:3}:
       down_write+0x4c/0x13c
       led_trigger_write+0xf8/0x17c
       sysfs_kf_bin_write+0x64/0x80
       kernfs_fop_write_iter+0x128/0x1b4
       vfs_write+0x178/0x2a4
       ksys_write+0x58/0xd4
       __arm64_sys_write+0x18/0x20
       invoke_syscall.constprop.0+0x4c/0xdc
       do_el0_svc+0x3c/0xbc
       el0_svc+0x34/0x80
       el0t_64_sync_handler+0xf8/0x124
       el0t_64_sync+0x150/0x154

-> #0 (triggers_list_lock){++++}-{3:3}:
       __lock_acquire+0x12a0/0x2014
       lock_acquire+0x100/0x2ac
       down_write+0x4c/0x13c
       led_trigger_register+0x4c/0x1a8
       phy_led_triggers_register+0x9c/0x214
       phy_attach_direct+0x154/0x36c
       phylink_attach_phy+0x30/0x60
       phylink_sfp_connect_phy+0x140/0x510
       sfp_add_phy+0x34/0x50
       init_module+0x15c/0xa7c [sfp]
       cleanup_module+0x1d94/0x3120 [sfp]
       cleanup_module+0x2bb4/0x3120 [sfp]
       process_one_work+0x1f8/0x4ec
       worker_thread+0x1e8/0x3d8
       kthread+0x104/0x110
       ret_from_fork+0x10/0x20

other info that might help us debug this:

Chain exists of:
  triggers_list_lock --> rtnl_mutex --> &sfp->sm_mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sfp->sm_mutex);
                               lock(rtnl_mutex);
                               lock(&sfp->sm_mutex);
  lock(triggers_list_lock);

 *** DEADLOCK ***

4 locks held by kworker/u8:2/43:
 #0: ffffff80c000f938 ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x150/0x4ec
 #1: ffffffc08214bde8 ((work_completion)(&(&sfp->timeout)->work)){+.+.}-{0:0}, at: process_one_work+0x150/0x4ec
 #2: ffffffc0810902f8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x18/0x20
 #3: ffffff80c5c6f318 (&sfp->sm_mutex){+.+.}-{3:3}, at: cleanup_module+0x2ba8/0x3120 [sfp]

stack backtrace:
CPU: 0 PID: 43 Comm: kworker/u8:2 Tainted: G           O       6.7.0-rc4-next-20231208+ #0
Hardware name: Bananapi BPI-R4 (DT)
Workqueue: events_power_efficient cleanup_module [sfp]
Call trace:
 dump_backtrace+0xa8/0x10c
 show_stack+0x14/0x1c
 dump_stack_lvl+0x5c/0xa0
 dump_stack+0x14/0x1c
 print_circular_bug+0x328/0x430
 check_noncircular+0x124/0x134
 __lock_acquire+0x12a0/0x2014
 lock_acquire+0x100/0x2ac
 down_write+0x4c/0x13c
 led_trigger_register+0x4c/0x1a8
 phy_led_triggers_register+0x9c/0x214
 phy_attach_direct+0x154/0x36c
 phylink_attach_phy+0x30/0x60
 phylink_sfp_connect_phy+0x140/0x510
 sfp_add_phy+0x34/0x50
 init_module+0x15c/0xa7c [sfp]
 cleanup_module+0x1d94/0x3120 [sfp]
 cleanup_module+0x2bb4/0x3120 [sfp]
 process_one_work+0x1f8/0x4ec
 worker_thread+0x1e8/0x3d8
 kthread+0x104/0x110
 ret_from_fork+0x10/0x20

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
---
 drivers/net/phy/phy_device.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Maxime Chevallier Dec. 12, 2023, 2:35 p.m. UTC | #1
Hi Daniel

On Tue, 12 Dec 2023 00:05:35 +0000
Daniel Golle <daniel@makrotopia.org> wrote:

> Calling led_trigger_register() when attaching a PHY located on an SFP
> module potentially (and practically) leads into a deadlock.
> Fix this by not calling led_trigger_register() for PHYs localted on SFP
> modules as such modules actually never got any LEDs.

While I don't have a fix for this issue, I think your justification
isn't good. This isn't about having LEDs on the module or not, but
rather the PHY triggering LED events for LEDS that can be located
somewhere else on the system (like the front pannel of a switch).

So I think it would be wiser to avoid the deadlock with a proper
analysis of what the locking scheme does. Maybe Andrew or Russell
have a better vision of what's going-on here, I tried to dive into
it but it doesn't look straightfoward to me :(

Maxime
Andrew Lunn Dec. 13, 2023, 9:08 a.m. UTC | #2
On Tue, Dec 12, 2023 at 03:35:12PM +0100, Maxime Chevallier wrote:
> Hi Daniel
> 
> On Tue, 12 Dec 2023 00:05:35 +0000
> Daniel Golle <daniel@makrotopia.org> wrote:
> 
> > Calling led_trigger_register() when attaching a PHY located on an SFP
> > module potentially (and practically) leads into a deadlock.
> > Fix this by not calling led_trigger_register() for PHYs localted on SFP
> > modules as such modules actually never got any LEDs.
> 
> While I don't have a fix for this issue, I think your justification
> isn't good. This isn't about having LEDs on the module or not, but
> rather the PHY triggering LED events for LEDS that can be located
> somewhere else on the system (like the front pannel of a switch).

SFP LEDs are very unlikely to be on the front panel, since there is no
such pins on the SFP cage.

Russell, in your collection of SFPs do you have any with LEDs?

> So I think it would be wiser to avoid the deadlock with a proper
> analysis of what the locking scheme does. Maybe Andrew or Russell
> have a better vision of what's going-on here, I tried to dive into
> it but it doesn't look straightfoward to me :(

I agree we should at least look at the deadlock, and see if we can
rearrange the locks to avoid it, even if there are no SFPs with LEDs.

	  Andrew
Maxime Chevallier Dec. 13, 2023, 10:06 a.m. UTC | #3
On Wed, 13 Dec 2023 10:08:25 +0100
Andrew Lunn <andrew@lunn.ch> wrote:

> On Tue, Dec 12, 2023 at 03:35:12PM +0100, Maxime Chevallier wrote:
> > Hi Daniel
> > 
> > On Tue, 12 Dec 2023 00:05:35 +0000
> > Daniel Golle <daniel@makrotopia.org> wrote:
> >   
> > > Calling led_trigger_register() when attaching a PHY located on an SFP
> > > module potentially (and practically) leads into a deadlock.
> > > Fix this by not calling led_trigger_register() for PHYs localted on SFP
> > > modules as such modules actually never got any LEDs.  
> > 
> > While I don't have a fix for this issue, I think your justification
> > isn't good. This isn't about having LEDs on the module or not, but
> > rather the PHY triggering LED events for LEDS that can be located
> > somewhere else on the system (like the front pannel of a switch).  
> 
> SFP LEDs are very unlikely to be on the front panel, since there is no
> such pins on the SFP cage.
> 
> Russell, in your collection of SFPs do you have any with LEDs?

I mean, aren't the led triggers generic so that it can trigger any
other LED to blink, and it's up to the user to decide ?

I do however see one good thing with this patch is that it makes the
behaviour coherent regarding triggers regardless if we have a
media-converter or not.

If we have a SFP bus with phylink as its upstream (SFP bus directly
connected to the MAC), then the phy is going to be attached through
phy_attach_direct(), and before this patch, led triggers will be
registered.

If we have an intermediate PHY acting as a media-converter and
connected to the SFP bus, then the phy isn't attached to the net_device,
and the triggers aren't registered.

So if in the end it doesn't make sense to register led triggers for
PHY in modules, it might be more coherent not registering them at all
as this patch does. What do you think ?

Thanks,

Maxime
Andrew Lunn Dec. 13, 2023, 10:47 a.m. UTC | #4
> > SFP LEDs are very unlikely to be on the front panel, since there is no
> > such pins on the SFP cage.
> > 
> > Russell, in your collection of SFPs do you have any with LEDs?
> 
> I mean, aren't the led triggers generic so that it can trigger any
> other LED to blink, and it's up to the user to decide ?

Correct. However, generic LEDs won't be registered via this code path,
so the deadlock is not an issue. Only LED controllers in a PHY within
an SFP, inside an SFP cage are the issue here. I don't have any Copper
SFP modules, so i've no idea if they are physically big enough to have
LEDs?

> I do however see one good thing with this patch is that it makes the
> behaviour coherent regarding triggers regardless if we have a
> media-converter or not.
> 
> If we have a SFP bus with phylink as its upstream (SFP bus directly
> connected to the MAC), then the phy is going to be attached through
> phy_attach_direct(), and before this patch, led triggers will be
> registered.
> 
> If we have an intermediate PHY acting as a media-converter and
> connected to the SFP bus, then the phy isn't attached to the net_device,
> and the triggers aren't registered.
> 
> So if in the end it doesn't make sense to register led triggers for
> PHY in modules, it might be more coherent not registering them at all
> as this patch does. What do you think ?

I think it is all messy. Say the media converter has its LEDs
connected to the front panel. You then get indications of activity on
the front panel. I've never seen a fibre SFP with LEDs, and its an
open question if Copper SFPs have LEDs. So you are limited to the MAC
LEDs or the media converter LEDs. Another scenario could be a PHY
which acts as a media switch, it can have an RJ-45 or an SFP cage,
first to get link wins. In such a situation, i would put the LEDs on
the front panel, since it would look odd for an empty RJ-45 socket
LEDs to blink for SFP activity.

So i think we should have the ability to describe all the LEDs in DT
and let it decided what gets registered.

	Andrew
Russell King (Oracle) Dec. 13, 2023, 3:27 p.m. UTC | #5
On Wed, Dec 13, 2023 at 10:08:25AM +0100, Andrew Lunn wrote:
> On Tue, Dec 12, 2023 at 03:35:12PM +0100, Maxime Chevallier wrote:
> > Hi Daniel
> > 
> > On Tue, 12 Dec 2023 00:05:35 +0000
> > Daniel Golle <daniel@makrotopia.org> wrote:
> > 
> > > Calling led_trigger_register() when attaching a PHY located on an SFP
> > > module potentially (and practically) leads into a deadlock.
> > > Fix this by not calling led_trigger_register() for PHYs localted on SFP
> > > modules as such modules actually never got any LEDs.
> > 
> > While I don't have a fix for this issue, I think your justification
> > isn't good. This isn't about having LEDs on the module or not, but
> > rather the PHY triggering LED events for LEDS that can be located
> > somewhere else on the system (like the front pannel of a switch).
> 
> SFP LEDs are very unlikely to be on the front panel, since there is no
> such pins on the SFP cage.
> 
> Russell, in your collection of SFPs do you have any with LEDs?

No, and we should _not_ mess around with the "LED" configuration on
PHYs on SFPs. It's possible that the LED output is wired to the LOS
pin on the module, and messing around with the configuration of that
would be asking for trouble.

In any case, I thought we didn't drive the LED configuration on PHYs
where the LED configuration isn't described by firmware - and as the
PHY on SFP modules would never be described by firmware, hooking
such a PHY up to the LED framework sounds like a waste of resources
to me.
Russell King (Oracle) Dec. 13, 2023, 3:35 p.m. UTC | #6
On Wed, Dec 13, 2023 at 11:47:50AM +0100, Andrew Lunn wrote:
> > > SFP LEDs are very unlikely to be on the front panel, since there is no
> > > such pins on the SFP cage.
> > > 
> > > Russell, in your collection of SFPs do you have any with LEDs?
> > 
> > I mean, aren't the led triggers generic so that it can trigger any
> > other LED to blink, and it's up to the user to decide ?
> 
> Correct. However, generic LEDs won't be registered via this code path,
> so the deadlock is not an issue. Only LED controllers in a PHY within
> an SFP, inside an SFP cage are the issue here. I don't have any Copper
> SFP modules, so i've no idea if they are physically big enough to have
> LEDs?

The ones I have, that is indeed the case - the RJ45 socket is the
absolute minimum size, with just enough metalwork around it to support
a RJ45 plug.

> I think it is all messy. Say the media converter has its LEDs
> connected to the front panel. You then get indications of activity on
> the front panel. I've never seen a fibre SFP with LEDs, and its an
> open question if Copper SFPs have LEDs.

A fibre module would only be able to repeat the information given via
RX_LOS and/or TX_FAULT if it had space to do so.

It's more normal in routers with SFP slots to see LEDs that are either
part of the SFP socket itself, or provided elsewhere.

> Another scenario could be a PHY
> which acts as a media switch, it can have an RJ-45 or an SFP cage,
> first to get link wins. In such a situation, i would put the LEDs on
> the front panel, since it would look odd for an empty RJ-45 socket
> LEDs to blink for SFP activity.

... and an example of this kind of setup would be the 88x3310 on
Macchiatobin - the LEDs are on the RJ45 socket, but they also
indicate for the status of the SFP connection if that is in use. I
don't remember off the top of my head what the LED configuration
register values allow one to select. We don't drive them because I
gave up on that idea - I don't believe that our LED framework is
"powerful" enough to be able to sensibly configure them... and I
personally use my own patches with register values in DT to
configure them to indicate sensibly.

However, from what I remember, configuring a LED to indicate for
1000M will mean that it will indicate whether the copper or fibre
interfaces are operating at 1000M.
Maxime Chevallier Dec. 13, 2023, 5:12 p.m. UTC | #7
Hi Andrew, Russell,

On Wed, 13 Dec 2023 15:27:28 +0000
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Wed, Dec 13, 2023 at 10:08:25AM +0100, Andrew Lunn wrote:
> > On Tue, Dec 12, 2023 at 03:35:12PM +0100, Maxime Chevallier wrote:  
> > > Hi Daniel
> > > 
> > > On Tue, 12 Dec 2023 00:05:35 +0000
> > > Daniel Golle <daniel@makrotopia.org> wrote:
> > >   
> > > > Calling led_trigger_register() when attaching a PHY located on an SFP
> > > > module potentially (and practically) leads into a deadlock.
> > > > Fix this by not calling led_trigger_register() for PHYs localted on SFP
> > > > modules as such modules actually never got any LEDs.  
> > > 
> > > While I don't have a fix for this issue, I think your justification
> > > isn't good. This isn't about having LEDs on the module or not, but
> > > rather the PHY triggering LED events for LEDS that can be located
> > > somewhere else on the system (like the front pannel of a switch).  
> > 
> > SFP LEDs are very unlikely to be on the front panel, since there is no
> > such pins on the SFP cage.
> > 
> > Russell, in your collection of SFPs do you have any with LEDs?  
> 
> No, and we should _not_ mess around with the "LED" configuration on
> PHYs on SFPs. It's possible that the LED output is wired to the LOS
> pin on the module, and messing around with the configuration of that
> would be asking for trouble.
> 
> In any case, I thought we didn't drive the LED configuration on PHYs
> where the LED configuration isn't described by firmware - and as the
> PHY on SFP modules would never be described by firmware, hooking
> such a PHY up to the LED framework sounds like a waste of resources
> to me.
> 

So it looks to me that the Daniel's patch does make sense then, even
without considering the underlying locking issue ?

Sorry for my misunderstanding of the LED driving that started this
discussion :/

Maxime
Daniel Golle Dec. 13, 2023, 7:01 p.m. UTC | #8
On Wed, Dec 13, 2023 at 03:27:28PM +0000, Russell King (Oracle) wrote:
> On Wed, Dec 13, 2023 at 10:08:25AM +0100, Andrew Lunn wrote:
> > On Tue, Dec 12, 2023 at 03:35:12PM +0100, Maxime Chevallier wrote:
> > > Hi Daniel
> > > 
> > > On Tue, 12 Dec 2023 00:05:35 +0000
> > > Daniel Golle <daniel@makrotopia.org> wrote:
> > > 
> > > > Calling led_trigger_register() when attaching a PHY located on an SFP
> > > > module potentially (and practically) leads into a deadlock.
> > > > Fix this by not calling led_trigger_register() for PHYs localted on SFP
> > > > modules as such modules actually never got any LEDs.
> > > 
> > > While I don't have a fix for this issue, I think your justification
> > > isn't good. This isn't about having LEDs on the module or not, but
> > > rather the PHY triggering LED events for LEDS that can be located
> > > somewhere else on the system (like the front pannel of a switch).
> > 
> > SFP LEDs are very unlikely to be on the front panel, since there is no
> > such pins on the SFP cage.
> > 
> > Russell, in your collection of SFPs do you have any with LEDs?
> 
> No, and we should _not_ mess around with the "LED" configuration on
> PHYs on SFPs. It's possible that the LED output is wired to the LOS
> pin on the module, and messing around with the configuration of that
> would be asking for trouble.
> 
> In any case, I thought we didn't drive the LED configuration on PHYs
> where the LED configuration isn't described by firmware - and as the
> PHY on SFP modules would never be described by firmware, hooking
> such a PHY up to the LED framework sounds like a waste of resources
> to me.

This was exactly my line of thought when posting the patch, however,
Maxime correctly pointed out that the issue with locking and also
what the patch prevents is registration of LED *triggers* rather than
the PHY-controlled LEDs themselves. And having the triggers available
is desirable even beyond the hardware offloaded case (which is probably
the aspect we both were dealing with the most recently and hence had in
mind). It is common to control another system SoC GPIO driven LED(s)
representing the link status and rx/tx traffic, for example.

So better we get to the core of it and fix the locking issue
(for example by registering LED trigger asynchronously using
delayed_work)...
Russell King (Oracle) Dec. 13, 2023, 8:23 p.m. UTC | #9
On Wed, Dec 13, 2023 at 07:01:29PM +0000, Daniel Golle wrote:
> On Wed, Dec 13, 2023 at 03:27:28PM +0000, Russell King (Oracle) wrote:
> > No, and we should _not_ mess around with the "LED" configuration on
> > PHYs on SFPs. It's possible that the LED output is wired to the LOS
> > pin on the module, and messing around with the configuration of that
> > would be asking for trouble.
> > 
> > In any case, I thought we didn't drive the LED configuration on PHYs
> > where the LED configuration isn't described by firmware - and as the
> > PHY on SFP modules would never be described by firmware, hooking
> > such a PHY up to the LED framework sounds like a waste of resources
> > to me.
> 
> This was exactly my line of thought when posting the patch, however,
> Maxime correctly pointed out that the issue with locking and also
> what the patch prevents is registration of LED *triggers* rather than
> the PHY-controlled LEDs themselves. And having the triggers available
> is desirable even beyond the hardware offloaded case (which is probably
> the aspect we both were dealing with the most recently and hence had in
> mind). It is common to control another system SoC GPIO driven LED(s)
> representing the link status and rx/tx traffic, for example.
> 
> So better we get to the core of it and fix the locking issue
> (for example by registering LED trigger asynchronously using
> delayed_work)...

I don't think a delayed work solves anything. Well, it solves the
registration problem, but when the PHY is removed, you have to
_synchronously_ cancel the delayed work, which could result in it
waiting on the called work function to complete. If that called work
function is waiting for the lock which we're holding on the remove
path, then we're still in a deadlock-prone situation.
Paolo Abeni Dec. 14, 2023, 9:48 a.m. UTC | #10
On Wed, 2023-12-13 at 19:01 +0000, Daniel Golle wrote:
> On Wed, Dec 13, 2023 at 03:27:28PM +0000, Russell King (Oracle) wrote:
> > On Wed, Dec 13, 2023 at 10:08:25AM +0100, Andrew Lunn wrote:
> > > On Tue, Dec 12, 2023 at 03:35:12PM +0100, Maxime Chevallier wrote:
> > > > Hi Daniel
> > > > 
> > > > On Tue, 12 Dec 2023 00:05:35 +0000
> > > > Daniel Golle <daniel@makrotopia.org> wrote:
> > > > 
> > > > > Calling led_trigger_register() when attaching a PHY located on an SFP
> > > > > module potentially (and practically) leads into a deadlock.
> > > > > Fix this by not calling led_trigger_register() for PHYs localted on SFP
> > > > > modules as such modules actually never got any LEDs.
> > > > 
> > > > While I don't have a fix for this issue, I think your justification
> > > > isn't good. This isn't about having LEDs on the module or not, but
> > > > rather the PHY triggering LED events for LEDS that can be located
> > > > somewhere else on the system (like the front pannel of a switch).
> > > 
> > > SFP LEDs are very unlikely to be on the front panel, since there is no
> > > such pins on the SFP cage.
> > > 
> > > Russell, in your collection of SFPs do you have any with LEDs?
> > 
> > No, and we should _not_ mess around with the "LED" configuration on
> > PHYs on SFPs. It's possible that the LED output is wired to the LOS
> > pin on the module, and messing around with the configuration of that
> > would be asking for trouble.
> > 
> > In any case, I thought we didn't drive the LED configuration on PHYs
> > where the LED configuration isn't described by firmware - and as the
> > PHY on SFP modules would never be described by firmware, hooking
> > such a PHY up to the LED framework sounds like a waste of resources
> > to me.
> 
> This was exactly my line of thought when posting the patch, however,
> Maxime correctly pointed out that the issue with locking and also
> what the patch prevents is registration of LED *triggers* rather than
> the PHY-controlled LEDs themselves. And having the triggers available
> is desirable even beyond the hardware offloaded case (which is probably
> the aspect we both were dealing with the most recently and hence had in
> mind). It is common to control another system SoC GPIO driven LED(s)
> representing the link status and rx/tx traffic, for example.
> 
> So better we get to the core of it and fix the locking issue
> (for example by registering LED trigger asynchronously using
> delayed_work)...

I understand you are looking for a different solution, so let me mark
this patch accordingly.

--
pw-bot: cr
Russell King (Oracle) Dec. 14, 2023, 4:52 p.m. UTC | #11
On Thu, Dec 14, 2023 at 10:48:10AM +0100, Paolo Abeni wrote:
> On Wed, 2023-12-13 at 19:01 +0000, Daniel Golle wrote:
> > On Wed, Dec 13, 2023 at 03:27:28PM +0000, Russell King (Oracle) wrote:
> > > On Wed, Dec 13, 2023 at 10:08:25AM +0100, Andrew Lunn wrote:
> > > > On Tue, Dec 12, 2023 at 03:35:12PM +0100, Maxime Chevallier wrote:
> > > > > Hi Daniel
> > > > > 
> > > > > On Tue, 12 Dec 2023 00:05:35 +0000
> > > > > Daniel Golle <daniel@makrotopia.org> wrote:
> > > > > 
> > > > > > Calling led_trigger_register() when attaching a PHY located on an SFP
> > > > > > module potentially (and practically) leads into a deadlock.
> > > > > > Fix this by not calling led_trigger_register() for PHYs localted on SFP
> > > > > > modules as such modules actually never got any LEDs.
> > > > > 
> > > > > While I don't have a fix for this issue, I think your justification
> > > > > isn't good. This isn't about having LEDs on the module or not, but
> > > > > rather the PHY triggering LED events for LEDS that can be located
> > > > > somewhere else on the system (like the front pannel of a switch).
> > > > 
> > > > SFP LEDs are very unlikely to be on the front panel, since there is no
> > > > such pins on the SFP cage.
> > > > 
> > > > Russell, in your collection of SFPs do you have any with LEDs?
> > > 
> > > No, and we should _not_ mess around with the "LED" configuration on
> > > PHYs on SFPs. It's possible that the LED output is wired to the LOS
> > > pin on the module, and messing around with the configuration of that
> > > would be asking for trouble.
> > > 
> > > In any case, I thought we didn't drive the LED configuration on PHYs
> > > where the LED configuration isn't described by firmware - and as the
> > > PHY on SFP modules would never be described by firmware, hooking
> > > such a PHY up to the LED framework sounds like a waste of resources
> > > to me.
> > 
> > This was exactly my line of thought when posting the patch, however,
> > Maxime correctly pointed out that the issue with locking and also
> > what the patch prevents is registration of LED *triggers* rather than
> > the PHY-controlled LEDs themselves. And having the triggers available
> > is desirable even beyond the hardware offloaded case (which is probably
> > the aspect we both were dealing with the most recently and hence had in
> > mind). It is common to control another system SoC GPIO driven LED(s)
> > representing the link status and rx/tx traffic, for example.
> > 
> > So better we get to the core of it and fix the locking issue
> > (for example by registering LED trigger asynchronously using
> > delayed_work)...
> 
> I understand you are looking for a different solution, so let me mark
> this patch accordingly.
> 
> --
> pw-bot: cr

I disagree with that - analysing the locking and coming up with a fix
is likely going to be a lengthy affair, meanwhile the mainline kernel
will deadlock on this. This patch prevents that deadlock at the
expense of removing the LED triggers for the PHY-on-SFP which I don't
think is a big deal considering the age of the PHY-based LED triggers.

So I personally would prefer this patch to be merged while a
different solution (that we have little idea at this point what it
would look like) is sought.
Jakub Kicinski Dec. 15, 2023, 2:31 a.m. UTC | #12
On Tue, 12 Dec 2023 00:05:35 +0000 Daniel Golle wrote:
> Calling led_trigger_register() when attaching a PHY located on an SFP
> module potentially (and practically) leads into a deadlock.
> Fix this by not calling led_trigger_register() for PHYs localted on SFP
> modules as such modules actually never got any LEDs.

Any suggestion of a Fixes tag?
Looks like the triggers were added a while back, are we only seeing it
now because we started exercising the code more?
Daniel Golle Dec. 15, 2023, 2:54 a.m. UTC | #13
On Thu, Dec 14, 2023 at 06:31:23PM -0800, Jakub Kicinski wrote:
> On Tue, 12 Dec 2023 00:05:35 +0000 Daniel Golle wrote:
> > Calling led_trigger_register() when attaching a PHY located on an SFP
> > module potentially (and practically) leads into a deadlock.
> > Fix this by not calling led_trigger_register() for PHYs localted on SFP
> > modules as such modules actually never got any LEDs.
> 
> Any suggestion of a Fixes tag?
> Looks like the triggers were added a while back, are we only seeing it
> now because we started exercising the code more?

I've noticed this as a consequence of commit 2f3ce7a56
"net: sfp: rework the RollBall PHY waiting code"
because (?) some PHYs on SFP+ now probe immediately as the previously
enforced minimum delay now became a timeout.

Blaming that commit would be wrong though, it's more that the problem
has probably always been there and was just previously augmented by
the delay.
Andrew Lunn Dec. 15, 2023, 9:46 a.m. UTC | #14
> I disagree with that - analysing the locking and coming up with a fix
> is likely going to be a lengthy affair, meanwhile the mainline kernel
> will deadlock on this. This patch prevents that deadlock at the
> expense of removing the LED triggers for the PHY-on-SFP which I don't
> think is a big deal considering the age of the PHY-based LED triggers.
> 
> So I personally would prefer this patch to be merged while a
> different solution (that we have little idea at this point what it
> would look like) is sought.

So, if i'm reading this patch correctly, it only affects PHYs within
SFPs.

The discussion went off on a tangent and also talked about media
converter PHYs, but from my reading of this patch, they are unaffected
by this patch. Do they however also suffer from this deadlock? Anybody
tested that?

     Andrew
Andrew Lunn Dec. 15, 2023, 9:53 a.m. UTC | #15
On Thu, Dec 14, 2023 at 06:31:23PM -0800, Jakub Kicinski wrote:
> On Tue, 12 Dec 2023 00:05:35 +0000 Daniel Golle wrote:
> > Calling led_trigger_register() when attaching a PHY located on an SFP
> > module potentially (and practically) leads into a deadlock.
> > Fix this by not calling led_trigger_register() for PHYs localted on SFP
> > modules as such modules actually never got any LEDs.
> 
> Any suggestion of a Fixes tag?
> Looks like the triggers were added a while back, are we only seeing it
> now because we started exercising the code more?

How about:

Fixes: 01e5b728e9e4 ("net: phy: Add a binding for PHY LEDs")

       Andrew
Maxime Chevallier Dec. 15, 2023, 9:59 a.m. UTC | #16
Hi Andrew,

On Fri, 15 Dec 2023 10:46:18 +0100
Andrew Lunn <andrew@lunn.ch> wrote:

> > I disagree with that - analysing the locking and coming up with a fix
> > is likely going to be a lengthy affair, meanwhile the mainline kernel
> > will deadlock on this. This patch prevents that deadlock at the
> > expense of removing the LED triggers for the PHY-on-SFP which I don't
> > think is a big deal considering the age of the PHY-based LED triggers.
> > 
> > So I personally would prefer this patch to be merged while a
> > different solution (that we have little idea at this point what it
> > would look like) is sought.  

I would agree, I feel bad about delaying it , although as Daniel
mentioned it's indeed the trigger registration that gets skipped.
 
> So, if i'm reading this patch correctly, it only affects PHYs within
> SFPs.
> 
> The discussion went off on a tangent and also talked about media
> converter PHYs, but from my reading of this patch, they are unaffected
> by this patch. Do they however also suffer from this deadlock? Anybody
> tested that?

I can give it a try today (in a few hours probably, I'm experiencing
a power outage right now...) and make sure the issue doesn't occur with
media converter PHYs.

Maxime

>      Andrew
Maxime Chevallier Dec. 15, 2023, 3:39 p.m. UTC | #17
Hi all,

On Fri, 15 Dec 2023 10:59:28 +0100
Maxime Chevallier <maxime.chevallier@bootlin.com> wrote:

> Hi Andrew,
> 
> On Fri, 15 Dec 2023 10:46:18 +0100
> Andrew Lunn <andrew@lunn.ch> wrote:
> > So, if i'm reading this patch correctly, it only affects PHYs within
> > SFPs.
> > 
> > The discussion went off on a tangent and also talked about media
> > converter PHYs, but from my reading of this patch, they are unaffected
> > by this patch. Do they however also suffer from this deadlock? Anybody
> > tested that?  
> 

I gave it a try with lockdep on a macchiatobin, but I couldn't even
reproduce the original issue (tested with the tip of net-next and net),
so I can't tell if the media-converter PHYs are affected as well. I got
no lockdep warnings regarding either paths.

Maxime
Russell King (Oracle) Dec. 15, 2023, 4:45 p.m. UTC | #18
On Tue, Dec 12, 2023 at 12:05:35AM +0000, Daniel Golle wrote:
> -> #3 (&sfp->sm_mutex){+.+.}-{3:3}:
>        __mutex_lock+0x88/0x7a0
>        mutex_lock_nested+0x20/0x28
>        cleanup_module+0x2ae0/0x3120 [sfp]
>        sfp_register_bus+0x5c/0x9c
>        sfp_register_socket+0x48/0xd4
>        cleanup_module+0x271c/0x3120 [sfp]
>        platform_probe+0x64/0xb8
>        really_probe+0x17c/0x3c0
>        __driver_probe_device+0x78/0x164
>        driver_probe_device+0x3c/0xd4
>        __driver_attach+0xec/0x1f0
>        bus_for_each_dev+0x60/0xa0
>        driver_attach+0x20/0x28
>        bus_add_driver+0x108/0x208
>        driver_register+0x5c/0x118
>        __platform_driver_register+0x24/0x2c
>        init_module+0x28/0xa7c [sfp]
>        do_one_initcall+0x70/0x2ec
>        do_init_module+0x54/0x1e4
>        load_module+0x1b78/0x1c8c
>        __do_sys_init_module+0x1bc/0x2cc
>        __arm64_sys_init_module+0x18/0x20
>        invoke_syscall.constprop.0+0x4c/0xdc
>        do_el0_svc+0x3c/0xbc
>        el0_svc+0x34/0x80
>        el0t_64_sync_handler+0xf8/0x124
>        el0t_64_sync+0x150/0x154

I suspect these backtraces aren't all that reliable, and look like they
are generated by walking through the stack and logging anything that
seems to be pointing into the text segment, which is rubbish for ARM32
and probably ARM64 as well.

We can see that this backtrace is a pile of lies because there is _no_
_way_ that sfp's cleanup_module() could ever be called while its
init_module() function is running.

In any case, I think this path is irrelevant.

> -> #2 (rtnl_mutex){+.+.}-{3:3}:
>        __mutex_lock+0x88/0x7a0
>        mutex_lock_nested+0x20/0x28
>        rtnl_lock+0x18/0x20
>        set_device_name+0x30/0x130
>        netdev_trig_activate+0x13c/0x1ac
>        led_trigger_set+0x118/0x234
>        led_trigger_write+0x104/0x17c
>        sysfs_kf_bin_write+0x64/0x80
>        kernfs_fop_write_iter+0x128/0x1b4
>        vfs_write+0x178/0x2a4
>        ksys_write+0x58/0xd4
>        __arm64_sys_write+0x18/0x20
>        invoke_syscall.constprop.0+0x4c/0xdc
>        do_el0_svc+0x3c/0xbc
>        el0_svc+0x34/0x80
>        el0t_64_sync_handler+0xf8/0x124
>        el0t_64_sync+0x150/0x154

This is one of the paths that matters. A userspace write is occuring
to the netdev trigger module. This path takes the following locks
(most recent first):

	rtnl_lock()
	trigger_lock (write)
	triggers_list_lock (read)

> -> #0 (triggers_list_lock){++++}-{3:3}:
>        __lock_acquire+0x12a0/0x2014
>        lock_acquire+0x100/0x2ac
>        down_write+0x4c/0x13c
>        led_trigger_register+0x4c/0x1a8
>        phy_led_triggers_register+0x9c/0x214
>        phy_attach_direct+0x154/0x36c
>        phylink_attach_phy+0x30/0x60
>        phylink_sfp_connect_phy+0x140/0x510
>        sfp_add_phy+0x34/0x50
>        init_module+0x15c/0xa7c [sfp]
>        cleanup_module+0x1d94/0x3120 [sfp]
>        cleanup_module+0x2bb4/0x3120 [sfp]
>        process_one_work+0x1f8/0x4ec
>        worker_thread+0x1e8/0x3d8
>        kthread+0x104/0x110
>        ret_from_fork+0x10/0x20

This path I suspect (but hard to see, we've got that cleanup_module
and init_module crud there again which is utter trash, sfp_add_phy
is not called from any of the functions previously listed)...
Manually going through the code instead, the locking order will be:

	triggers_list_lock (write)
	sm_mutex
	rtnl

I'm not sure that the lockdep report is accurate, as it seems to be
blaming the deadlock via three locks (triggers_list_lock --> rtnl_mutex
--> &sfp->sm_mutex) but I can't find a path where sm_mutex would be
involved (except the immediate above.)

It looks to me like the problem is in part caused by calling
phy_led_triggers_register() while holding the rtnl lock. Holding the
rtnl lock is fundamental to being able to safely remove and add PHY
devices to netdevs while they are up and running.

The other part that causes the problem is a write to a netdev trigger
that causes it to activate takes the rtnl and triggers_list_lock in
the opposite order.

I don't currently see a solution to this.
patchwork-bot+netdevbpf@kernel.org Dec. 16, 2023, 2 a.m. UTC | #19
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 12 Dec 2023 00:05:35 +0000 you wrote:
> Calling led_trigger_register() when attaching a PHY located on an SFP
> module potentially (and practically) leads into a deadlock.
> Fix this by not calling led_trigger_register() for PHYs localted on SFP
> modules as such modules actually never got any LEDs.
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.7.0-rc4-next-20231208+ #0 Tainted: G           O
> 
> [...]

Here is the summary with links:
  - [net] net: phy: skip LED triggers on PHYs on SFP modules
    https://git.kernel.org/netdev/net/c/b1dfc0f76231

You are awesome, thank you!
diff mbox series

Patch

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index d2caa89bf1fb8..08b87d7396eb9 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1558,7 +1558,8 @@  int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 		goto error;
 
 	phy_resume(phydev);
-	phy_led_triggers_register(phydev);
+	if (!phydev->is_on_sfp_module)
+		phy_led_triggers_register(phydev);
 
 	/**
 	 * If the external phy used by current mac interface is managed by
@@ -1827,7 +1828,8 @@  void phy_detach(struct phy_device *phydev)
 	}
 	phydev->phylink = NULL;
 
-	phy_led_triggers_unregister(phydev);
+	if (!phydev->is_on_sfp_module)
+		phy_led_triggers_unregister(phydev);
 
 	if (phydev->mdio.dev.driver)
 		module_put(phydev->mdio.dev.driver->owner);