Message ID | 20210105161136.250631-1-marex@denx.de (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: phy: Trigger link_change_notify on PHY_HALTED | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Guessed tree name to be net-next |
netdev/subject_prefix | warning | Target tree name not specified in the subject |
netdev/cc_maintainers | warning | 2 maintainers not CCed: linux@armlinux.org.uk kuba@kernel.org |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 24 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
netdev/stable | success | Stable not CCed |
On 05.01.2021 17:11, Marek Vasut wrote: > In case the PHY transitions to PHY_HALTED state in phy_stop(), the > link_change_notify callback is not triggered. That's because the > phydev->state = PHY_HALTED in phy_stop() is assigned first, and > phy_state_machine() is called afterward. For phy_state_machine(), > no state transition happens, because old_state = PHY_HALTED and > phy_dev->state = PHY_HALTED. > There are a few formal issues with this patch: - It misses a net/net-next annotation. If it's meant to be a fix, then the Fixes tag is missing. I just checked the existing link_change_notify handlers and nobody is interested in state transitions to PHY_HALTED. Therefore I think it's more of an improvement. However AFAICS net-next is still closed. - The maintainers should be in To: and the list(s) on cc. - Seems that Russell and Jakub are missing as maintainers. > Signed-off-by: Marek Vasut <marex@denx.de> > Cc: Andrew Lunn <andrew@lunn.ch> > Cc: David S. Miller <davem@davemloft.net> > Cc: Heiner Kallweit <hkallweit1@gmail.com> > --- > drivers/net/phy/phy.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c > index 45f75533c47c..fca8c3eebc5d 100644 > --- a/drivers/net/phy/phy.c > +++ b/drivers/net/phy/phy.c > @@ -1004,6 +1004,7 @@ EXPORT_SYMBOL(phy_free_interrupt); > void phy_stop(struct phy_device *phydev) > { > struct net_device *dev = phydev->attached_dev; > + enum phy_state old_state; > > if (!phy_is_started(phydev) && phydev->state != PHY_DOWN) { > WARN(1, "called from state %s\n", > @@ -1021,8 +1022,17 @@ void phy_stop(struct phy_device *phydev) > if (phydev->sfp_bus) > sfp_upstream_stop(phydev->sfp_bus); > > + old_state = phydev->state; > phydev->state = PHY_HALTED; > > + if (old_state != phydev->state) { This check shouldn't be needed because it shouldn't happen that phy_stop() is called from status PHY_HALTED. In this case the WARN() a few lines above would have fired already. > + phydev_err(phydev, "PHY state change %s -> %s\n", > + phy_state_to_str(old_state), > + phy_state_to_str(phydev->state)); > + if (phydev->drv && phydev->drv->link_change_notify) > + phydev->drv->link_change_notify(phydev); Instead of duplicating this code it could be factored out into a small helper that is used by phy_stop() and phy_state_machine(). > + } > + > mutex_unlock(&phydev->lock); > > phy_state_machine(&phydev->state_queue.work); >
On Tue, Jan 05, 2021 at 05:58:21PM +0100, Heiner Kallweit wrote: > On 05.01.2021 17:11, Marek Vasut wrote: > > @@ -1021,8 +1022,17 @@ void phy_stop(struct phy_device *phydev) > > if (phydev->sfp_bus) > > sfp_upstream_stop(phydev->sfp_bus); > > > > + old_state = phydev->state; > > phydev->state = PHY_HALTED; > > > > + if (old_state != phydev->state) { > > This check shouldn't be needed because it shouldn't happen that > phy_stop() is called from status PHY_HALTED. In this case the > WARN() a few lines above would have fired already. That is incorrect. If an error happens with the phy, phy_error() will be called, which sets phydev->state = PHY_HALTED. If you then subsequently take the interface down, phy_stop() will be called, but phydev->state will be set to PHY_HALTED. This is a long standing bug since you changed the code, and I think is something I've reported previously, since I've definitely encountered it.
On 05.01.2021 18:05, Russell King - ARM Linux admin wrote: > On Tue, Jan 05, 2021 at 05:58:21PM +0100, Heiner Kallweit wrote: >> On 05.01.2021 17:11, Marek Vasut wrote: >>> @@ -1021,8 +1022,17 @@ void phy_stop(struct phy_device *phydev) >>> if (phydev->sfp_bus) >>> sfp_upstream_stop(phydev->sfp_bus); >>> >>> + old_state = phydev->state; >>> phydev->state = PHY_HALTED; >>> >>> + if (old_state != phydev->state) { >> >> This check shouldn't be needed because it shouldn't happen that >> phy_stop() is called from status PHY_HALTED. In this case the >> WARN() a few lines above would have fired already. > > That is incorrect. If an error happens with the phy, phy_error() will > be called, which sets phydev->state = PHY_HALTED. If you then > subsequently take the interface down, phy_stop() will be called, but > phydev->state will be set to PHY_HALTED. > OK, so we have to fix the way phy_error() works. Still the check isn't needed here. So far nobody is interested in transitions to PHY_HALTED, so nothing is broken. > This is a long standing bug since you changed the code, and I think is > something I've reported previously, since I've definitely encountered > it. > IIRC there was a start of a discussion whether phy_error() is useful at all regarding how it works as of today. A single failed MDIO access (e.g. timeout) stops the state machine, when e.g. one missed PHY status update in polling mode doesn't really cause any harm. If we want to keep the functionality of phy_error(), then I'd say a separate error phy state (e.g. PHY_ERROR) would make sense, as it would make clear that the network was stopped due to an error.
On 05.01.2021 17:11, Marek Vasut wrote: > In case the PHY transitions to PHY_HALTED state in phy_stop(), the > link_change_notify callback is not triggered. That's because the > phydev->state = PHY_HALTED in phy_stop() is assigned first, and > phy_state_machine() is called afterward. For phy_state_machine(), > no state transition happens, because old_state = PHY_HALTED and > phy_dev->state = PHY_HALTED. > > Signed-off-by: Marek Vasut <marex@denx.de> > Cc: Andrew Lunn <andrew@lunn.ch> > Cc: David S. Miller <davem@davemloft.net> > Cc: Heiner Kallweit <hkallweit1@gmail.com> > --- > drivers/net/phy/phy.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c > index 45f75533c47c..fca8c3eebc5d 100644 > --- a/drivers/net/phy/phy.c > +++ b/drivers/net/phy/phy.c > @@ -1004,6 +1004,7 @@ EXPORT_SYMBOL(phy_free_interrupt); > void phy_stop(struct phy_device *phydev) > { > struct net_device *dev = phydev->attached_dev; > + enum phy_state old_state; > > if (!phy_is_started(phydev) && phydev->state != PHY_DOWN) { > WARN(1, "called from state %s\n", > @@ -1021,8 +1022,17 @@ void phy_stop(struct phy_device *phydev) > if (phydev->sfp_bus) > sfp_upstream_stop(phydev->sfp_bus); > > + old_state = phydev->state; > phydev->state = PHY_HALTED; > > + if (old_state != phydev->state) { > + phydev_err(phydev, "PHY state change %s -> %s\n", > + phy_state_to_str(old_state), > + phy_state_to_str(phydev->state)); > + if (phydev->drv && phydev->drv->link_change_notify) > + phydev->drv->link_change_notify(phydev); > + } > + > mutex_unlock(&phydev->lock); > > phy_state_machine(&phydev->state_queue.work); > Following is RFC as an additional idea. When requesting a new state from outside the state machine, we could simply provide the old state to the state machine in a new phy_device member. Then we shouldn't have to touch phy_stop(), and maybe phy_error(). And it looks cleaner to me than duplicating code from the state machine to functions like phy_stop(). --- drivers/net/phy/phy.c | 10 +++++++++- drivers/net/phy/phy_device.c | 1 + include/linux/phy.h | 3 +++ 3 files changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index 45f75533c..347e42d90 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -45,6 +45,7 @@ static const char *phy_state_to_str(enum phy_state st) { switch (st) { PHY_STATE_STR(DOWN) + PHY_STATE_STR(INVALID) PHY_STATE_STR(READY) PHY_STATE_STR(UP) PHY_STATE_STR(RUNNING) @@ -1021,6 +1022,7 @@ void phy_stop(struct phy_device *phydev) if (phydev->sfp_bus) sfp_upstream_stop(phydev->sfp_bus); + phydev->old_state = phydev->state; phydev->state = PHY_HALTED; mutex_unlock(&phydev->lock); @@ -1086,7 +1088,13 @@ void phy_state_machine(struct work_struct *work) mutex_lock(&phydev->lock); - old_state = phydev->state; + /* set if a new state is requested from outside the state machine */ + if (phydev->old_state != PHY_INVALID) { + old_state = phydev->old_state; + phydev->old_state = PHY_INVALID; + } else { + old_state = phydev->state; + } switch (phydev->state) { case PHY_DOWN: diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 80c2e646c..9f8d9f68b 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -620,6 +620,7 @@ struct phy_device *phy_device_create(struct mii_bus *bus, int addr, u32 phy_id, device_initialize(&mdiodev->dev); dev->state = PHY_DOWN; + dev->old_state = PHY_INVALID; mutex_init(&dev->lock); INIT_DELAYED_WORK(&dev->state_queue, phy_state_machine); diff --git a/include/linux/phy.h b/include/linux/phy.h index 9effb511a..df48ea88c 100644 --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -442,6 +442,7 @@ struct phy_device *mdiobus_scan(struct mii_bus *bus, int addr); */ enum phy_state { PHY_DOWN = 0, + PHY_INVALID, PHY_READY, PHY_HALTED, PHY_UP, @@ -566,6 +567,8 @@ struct phy_device { unsigned interrupts:1; enum phy_state state; + /* if a new state is requested from outside the state machine */ + enum phy_state old_state; u32 dev_flags;
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index 45f75533c47c..fca8c3eebc5d 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -1004,6 +1004,7 @@ EXPORT_SYMBOL(phy_free_interrupt); void phy_stop(struct phy_device *phydev) { struct net_device *dev = phydev->attached_dev; + enum phy_state old_state; if (!phy_is_started(phydev) && phydev->state != PHY_DOWN) { WARN(1, "called from state %s\n", @@ -1021,8 +1022,17 @@ void phy_stop(struct phy_device *phydev) if (phydev->sfp_bus) sfp_upstream_stop(phydev->sfp_bus); + old_state = phydev->state; phydev->state = PHY_HALTED; + if (old_state != phydev->state) { + phydev_err(phydev, "PHY state change %s -> %s\n", + phy_state_to_str(old_state), + phy_state_to_str(phydev->state)); + if (phydev->drv && phydev->drv->link_change_notify) + phydev->drv->link_change_notify(phydev); + } + mutex_unlock(&phydev->lock); phy_state_machine(&phydev->state_queue.work);
In case the PHY transitions to PHY_HALTED state in phy_stop(), the link_change_notify callback is not triggered. That's because the phydev->state = PHY_HALTED in phy_stop() is assigned first, and phy_state_machine() is called afterward. For phy_state_machine(), no state transition happens, because old_state = PHY_HALTED and phy_dev->state = PHY_HALTED. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: David S. Miller <davem@davemloft.net> Cc: Heiner Kallweit <hkallweit1@gmail.com> --- drivers/net/phy/phy.c | 10 ++++++++++ 1 file changed, 10 insertions(+)