diff mbox series

net: phy: Trigger link_change_notify on PHY_HALTED

Message ID 20210105161136.250631-1-marex@denx.de (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series net: phy: Trigger link_change_notify on PHY_HALTED | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Guessed tree name to be net-next
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cc_maintainers warning 2 maintainers not CCed: linux@armlinux.org.uk kuba@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 24 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Marek Vasut Jan. 5, 2021, 4:11 p.m. UTC
In case the PHY transitions to PHY_HALTED state in phy_stop(), the
link_change_notify callback is not triggered. That's because the
phydev->state = PHY_HALTED in phy_stop() is assigned first, and
phy_state_machine() is called afterward. For phy_state_machine(),
no state transition happens, because old_state = PHY_HALTED and
phy_dev->state = PHY_HALTED.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Heiner Kallweit Jan. 5, 2021, 4:58 p.m. UTC | #1
On 05.01.2021 17:11, Marek Vasut wrote:
> In case the PHY transitions to PHY_HALTED state in phy_stop(), the
> link_change_notify callback is not triggered. That's because the
> phydev->state = PHY_HALTED in phy_stop() is assigned first, and
> phy_state_machine() is called afterward. For phy_state_machine(),
> no state transition happens, because old_state = PHY_HALTED and
> phy_dev->state = PHY_HALTED.
> 

There are a few formal issues with this patch:
- It misses a net/net-next annotation. If it's meant to be a fix,
  then the Fixes tag is missing. I just checked the existing
  link_change_notify handlers and nobody is interested in state
  transitions to PHY_HALTED. Therefore I think it's more of an
  improvement. However AFAICS net-next is still closed.

- The maintainers should be in To: and the list(s) on cc.
- Seems that Russell and Jakub are missing as maintainers.


> Signed-off-by: Marek Vasut <marex@denx.de>
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Heiner Kallweit <hkallweit1@gmail.com>
> ---
>  drivers/net/phy/phy.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 45f75533c47c..fca8c3eebc5d 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -1004,6 +1004,7 @@ EXPORT_SYMBOL(phy_free_interrupt);
>  void phy_stop(struct phy_device *phydev)
>  {
>  	struct net_device *dev = phydev->attached_dev;
> +	enum phy_state old_state;
>  
>  	if (!phy_is_started(phydev) && phydev->state != PHY_DOWN) {
>  		WARN(1, "called from state %s\n",
> @@ -1021,8 +1022,17 @@ void phy_stop(struct phy_device *phydev)
>  	if (phydev->sfp_bus)
>  		sfp_upstream_stop(phydev->sfp_bus);
>  
> +	old_state = phydev->state;
>  	phydev->state = PHY_HALTED;
>  
> +	if (old_state != phydev->state) {

This check shouldn't be needed because it shouldn't happen that
phy_stop() is called from status PHY_HALTED. In this case the
WARN() a few lines above would have fired already.

> +		phydev_err(phydev, "PHY state change %s -> %s\n",
> +			   phy_state_to_str(old_state),
> +			   phy_state_to_str(phydev->state));
> +		if (phydev->drv && phydev->drv->link_change_notify)
> +			phydev->drv->link_change_notify(phydev);

Instead of duplicating this code it could be factored out into
a small helper that is used by phy_stop() and phy_state_machine().

> +	}
> +
>  	mutex_unlock(&phydev->lock);
>  
>  	phy_state_machine(&phydev->state_queue.work);
>
Russell King (Oracle) Jan. 5, 2021, 5:05 p.m. UTC | #2
On Tue, Jan 05, 2021 at 05:58:21PM +0100, Heiner Kallweit wrote:
> On 05.01.2021 17:11, Marek Vasut wrote:
> > @@ -1021,8 +1022,17 @@ void phy_stop(struct phy_device *phydev)
> >  	if (phydev->sfp_bus)
> >  		sfp_upstream_stop(phydev->sfp_bus);
> >  
> > +	old_state = phydev->state;
> >  	phydev->state = PHY_HALTED;
> >  
> > +	if (old_state != phydev->state) {
> 
> This check shouldn't be needed because it shouldn't happen that
> phy_stop() is called from status PHY_HALTED. In this case the
> WARN() a few lines above would have fired already.

That is incorrect. If an error happens with the phy, phy_error() will
be called, which sets phydev->state = PHY_HALTED. If you then
subsequently take the interface down, phy_stop() will be called, but
phydev->state will be set to PHY_HALTED.

This is a long standing bug since you changed the code, and I think is
something I've reported previously, since I've definitely encountered
it.
Heiner Kallweit Jan. 5, 2021, 9:06 p.m. UTC | #3
On 05.01.2021 18:05, Russell King - ARM Linux admin wrote:
> On Tue, Jan 05, 2021 at 05:58:21PM +0100, Heiner Kallweit wrote:
>> On 05.01.2021 17:11, Marek Vasut wrote:
>>> @@ -1021,8 +1022,17 @@ void phy_stop(struct phy_device *phydev)
>>>  	if (phydev->sfp_bus)
>>>  		sfp_upstream_stop(phydev->sfp_bus);
>>>  
>>> +	old_state = phydev->state;
>>>  	phydev->state = PHY_HALTED;
>>>  
>>> +	if (old_state != phydev->state) {
>>
>> This check shouldn't be needed because it shouldn't happen that
>> phy_stop() is called from status PHY_HALTED. In this case the
>> WARN() a few lines above would have fired already.
> 
> That is incorrect. If an error happens with the phy, phy_error() will
> be called, which sets phydev->state = PHY_HALTED. If you then
> subsequently take the interface down, phy_stop() will be called, but
> phydev->state will be set to PHY_HALTED.
> 
OK, so we have to fix the way phy_error() works. Still the check isn't
needed here. So far nobody is interested in transitions to PHY_HALTED,
so nothing is broken.

> This is a long standing bug since you changed the code, and I think is
> something I've reported previously, since I've definitely encountered
> it.
> 
IIRC there was a start of a discussion whether phy_error() is useful
at all regarding how it works as of today. A single failed MDIO access
(e.g. timeout) stops the state machine, when e.g. one missed PHY status
update in polling mode doesn't really cause any harm.
If we want to keep the functionality of phy_error(), then I'd say
a separate error phy state (e.g. PHY_ERROR) would make sense, as it
would make clear that the network was stopped due to an error.
Heiner Kallweit Jan. 5, 2021, 11:11 p.m. UTC | #4
On 05.01.2021 17:11, Marek Vasut wrote:
> In case the PHY transitions to PHY_HALTED state in phy_stop(), the
> link_change_notify callback is not triggered. That's because the
> phydev->state = PHY_HALTED in phy_stop() is assigned first, and
> phy_state_machine() is called afterward. For phy_state_machine(),
> no state transition happens, because old_state = PHY_HALTED and
> phy_dev->state = PHY_HALTED.
> 
> Signed-off-by: Marek Vasut <marex@denx.de>
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Heiner Kallweit <hkallweit1@gmail.com>
> ---
>  drivers/net/phy/phy.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 45f75533c47c..fca8c3eebc5d 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -1004,6 +1004,7 @@ EXPORT_SYMBOL(phy_free_interrupt);
>  void phy_stop(struct phy_device *phydev)
>  {
>  	struct net_device *dev = phydev->attached_dev;
> +	enum phy_state old_state;
>  
>  	if (!phy_is_started(phydev) && phydev->state != PHY_DOWN) {
>  		WARN(1, "called from state %s\n",
> @@ -1021,8 +1022,17 @@ void phy_stop(struct phy_device *phydev)
>  	if (phydev->sfp_bus)
>  		sfp_upstream_stop(phydev->sfp_bus);
>  
> +	old_state = phydev->state;
>  	phydev->state = PHY_HALTED;
>  
> +	if (old_state != phydev->state) {
> +		phydev_err(phydev, "PHY state change %s -> %s\n",
> +			   phy_state_to_str(old_state),
> +			   phy_state_to_str(phydev->state));
> +		if (phydev->drv && phydev->drv->link_change_notify)
> +			phydev->drv->link_change_notify(phydev);
> +	}
> +
>  	mutex_unlock(&phydev->lock);
>  
>  	phy_state_machine(&phydev->state_queue.work);
> 

Following is RFC as an additional idea. When requesting a new state
from outside the state machine, we could simply provide the old
state to the state machine in a new phy_device member.
Then we shouldn't have to touch phy_stop(), and maybe phy_error().
And it looks cleaner to me than duplicating code from the state
machine to functions like phy_stop().

---
 drivers/net/phy/phy.c        | 10 +++++++++-
 drivers/net/phy/phy_device.c |  1 +
 include/linux/phy.h          |  3 +++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 45f75533c..347e42d90 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -45,6 +45,7 @@ static const char *phy_state_to_str(enum phy_state st)
 {
 	switch (st) {
 	PHY_STATE_STR(DOWN)
+	PHY_STATE_STR(INVALID)
 	PHY_STATE_STR(READY)
 	PHY_STATE_STR(UP)
 	PHY_STATE_STR(RUNNING)
@@ -1021,6 +1022,7 @@ void phy_stop(struct phy_device *phydev)
 	if (phydev->sfp_bus)
 		sfp_upstream_stop(phydev->sfp_bus);
 
+	phydev->old_state = phydev->state;
 	phydev->state = PHY_HALTED;
 
 	mutex_unlock(&phydev->lock);
@@ -1086,7 +1088,13 @@ void phy_state_machine(struct work_struct *work)
 
 	mutex_lock(&phydev->lock);
 
-	old_state = phydev->state;
+	/* set if a new state is requested from outside the state machine */
+	if (phydev->old_state != PHY_INVALID) {
+		old_state = phydev->old_state;
+		phydev->old_state = PHY_INVALID;
+	} else {
+		old_state = phydev->state;
+	}
 
 	switch (phydev->state) {
 	case PHY_DOWN:
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 80c2e646c..9f8d9f68b 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -620,6 +620,7 @@ struct phy_device *phy_device_create(struct mii_bus *bus, int addr, u32 phy_id,
 	device_initialize(&mdiodev->dev);
 
 	dev->state = PHY_DOWN;
+	dev->old_state = PHY_INVALID;
 
 	mutex_init(&dev->lock);
 	INIT_DELAYED_WORK(&dev->state_queue, phy_state_machine);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 9effb511a..df48ea88c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -442,6 +442,7 @@ struct phy_device *mdiobus_scan(struct mii_bus *bus, int addr);
  */
 enum phy_state {
 	PHY_DOWN = 0,
+	PHY_INVALID,
 	PHY_READY,
 	PHY_HALTED,
 	PHY_UP,
@@ -566,6 +567,8 @@ struct phy_device {
 	unsigned interrupts:1;
 
 	enum phy_state state;
+	/* if a new state is requested from outside the state machine */
+	enum phy_state old_state;
 
 	u32 dev_flags;
diff mbox series

Patch

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 45f75533c47c..fca8c3eebc5d 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1004,6 +1004,7 @@  EXPORT_SYMBOL(phy_free_interrupt);
 void phy_stop(struct phy_device *phydev)
 {
 	struct net_device *dev = phydev->attached_dev;
+	enum phy_state old_state;
 
 	if (!phy_is_started(phydev) && phydev->state != PHY_DOWN) {
 		WARN(1, "called from state %s\n",
@@ -1021,8 +1022,17 @@  void phy_stop(struct phy_device *phydev)
 	if (phydev->sfp_bus)
 		sfp_upstream_stop(phydev->sfp_bus);
 
+	old_state = phydev->state;
 	phydev->state = PHY_HALTED;
 
+	if (old_state != phydev->state) {
+		phydev_err(phydev, "PHY state change %s -> %s\n",
+			   phy_state_to_str(old_state),
+			   phy_state_to_str(phydev->state));
+		if (phydev->drv && phydev->drv->link_change_notify)
+			phydev->drv->link_change_notify(phydev);
+	}
+
 	mutex_unlock(&phydev->lock);
 
 	phy_state_machine(&phydev->state_queue.work);