Message ID | 20250220-marvell-88q2xxx-hwmon-enable-at-probe-v2-2-78b2838a62da@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: phy: marvell-88q2xxx: Enable temperature measurement in probe again | expand |
On Thu, Feb 20, 2025 at 09:11:12AM +0100, Dimitri Fedrau wrote: > If the PHYs reset is asserted it returns 0xffff for any read operation. > This might happen if the user admins down the interface and wants to read > the temperature. Prevent reading the temperature in this case and return > with an network is down error. Write operations are ignored by the device > when reset is asserted, still return a network is down error in this > case to make the user aware of the operation gone wrong. If we look at where mdio_device_reset() is called from: 1. mdio_device_register() -> mdiobus_register_device() asserts reset before adding the device to the device layer (which will then cause the driver to be searched for and bound.) 2. mdio_probe(), deasserts the reset signal before calling the MDIO driver's ->probe method, which will be phy_probe(). 3. after a probe failure to re-assert the reset signal. 4. after ->remove has been called. That is the sum total. So, while the driver is bound to the device, phydev->mdio.reset_state is guaranteed to be zero. Therefore, is this patch fixing a real observed problem with the current driver?
Hi Russell, Am Thu, Feb 20, 2025 at 09:36:49AM +0000 schrieb Russell King (Oracle): > On Thu, Feb 20, 2025 at 09:11:12AM +0100, Dimitri Fedrau wrote: > > If the PHYs reset is asserted it returns 0xffff for any read operation. > > This might happen if the user admins down the interface and wants to read > > the temperature. Prevent reading the temperature in this case and return > > with an network is down error. Write operations are ignored by the device > > when reset is asserted, still return a network is down error in this > > case to make the user aware of the operation gone wrong. > > If we look at where mdio_device_reset() is called from: > > 1. mdio_device_register() -> mdiobus_register_device() asserts reset > before adding the device to the device layer (which will then > cause the driver to be searched for and bound.) > > 2. mdio_probe(), deasserts the reset signal before calling the MDIO > driver's ->probe method, which will be phy_probe(). > > 3. after a probe failure to re-assert the reset signal. > > 4. after ->remove has been called. > There is also phy_device_reset that calls mdio_device_reset. > That is the sum total. So, while the driver is bound to the device, > phydev->mdio.reset_state is guaranteed to be zero. > > Therefore, is this patch fixing a real observed problem with the > current driver? > Yes, when I admin up and afterwards down the network device then the PHYs reset is asserted. In this case phy_detach is called which calls phy_device_reset(phydev, 1), ... Best regards, Dimitri Fedrau
On Thu, Feb 20, 2025 at 04:22:14PM +0100, Dimitri Fedrau wrote: > Hi Russell, > > Am Thu, Feb 20, 2025 at 09:36:49AM +0000 schrieb Russell King (Oracle): > > On Thu, Feb 20, 2025 at 09:11:12AM +0100, Dimitri Fedrau wrote: > > > If the PHYs reset is asserted it returns 0xffff for any read operation. > > > This might happen if the user admins down the interface and wants to read > > > the temperature. Prevent reading the temperature in this case and return > > > with an network is down error. Write operations are ignored by the device > > > when reset is asserted, still return a network is down error in this > > > case to make the user aware of the operation gone wrong. > > > > If we look at where mdio_device_reset() is called from: > > > > 1. mdio_device_register() -> mdiobus_register_device() asserts reset > > before adding the device to the device layer (which will then > > cause the driver to be searched for and bound.) > > > > 2. mdio_probe(), deasserts the reset signal before calling the MDIO > > driver's ->probe method, which will be phy_probe(). > > > > 3. after a probe failure to re-assert the reset signal. > > > > 4. after ->remove has been called. > > > > There is also phy_device_reset that calls mdio_device_reset. Ok, thanks for pointing that out. > > That is the sum total. So, while the driver is bound to the device, > > phydev->mdio.reset_state is guaranteed to be zero. > > > > Therefore, is this patch fixing a real observed problem with the > > current driver? > > > Yes, when I admin up and afterwards down the network device then the PHYs > reset is asserted. In this case phy_detach is called which calls > phy_device_reset(phydev, 1), ... I'm still concerned that this solution is basically racy - the netdev can come up/down while hwmon is accessing the device. I'm also unconvinced that ENETDOWN is a good idea here. While I get the "describe the hardware" is there a real benefit to describing this? What I'm wondering is whether manipulating the reset signal in this case provides more pain than gain.
Am Thu, Feb 20, 2025 at 03:29:10PM +0000 schrieb Russell King (Oracle): > On Thu, Feb 20, 2025 at 04:22:14PM +0100, Dimitri Fedrau wrote: > > Hi Russell, > > > > Am Thu, Feb 20, 2025 at 09:36:49AM +0000 schrieb Russell King (Oracle): > > > On Thu, Feb 20, 2025 at 09:11:12AM +0100, Dimitri Fedrau wrote: > > > > If the PHYs reset is asserted it returns 0xffff for any read operation. > > > > This might happen if the user admins down the interface and wants to read > > > > the temperature. Prevent reading the temperature in this case and return > > > > with an network is down error. Write operations are ignored by the device > > > > when reset is asserted, still return a network is down error in this > > > > case to make the user aware of the operation gone wrong. > > > > > > If we look at where mdio_device_reset() is called from: > > > > > > 1. mdio_device_register() -> mdiobus_register_device() asserts reset > > > before adding the device to the device layer (which will then > > > cause the driver to be searched for and bound.) > > > > > > 2. mdio_probe(), deasserts the reset signal before calling the MDIO > > > driver's ->probe method, which will be phy_probe(). > > > > > > 3. after a probe failure to re-assert the reset signal. > > > > > > 4. after ->remove has been called. > > > > > > > There is also phy_device_reset that calls mdio_device_reset. > > Ok, thanks for pointing that out. > > > > That is the sum total. So, while the driver is bound to the device, > > > phydev->mdio.reset_state is guaranteed to be zero. > > > > > > Therefore, is this patch fixing a real observed problem with the > > > current driver? > > > > > Yes, when I admin up and afterwards down the network device then the PHYs > > reset is asserted. In this case phy_detach is called which calls > > phy_device_reset(phydev, 1), ... > > I'm still concerned that this solution is basically racy - the > netdev can come up/down while hwmon is accessing the device. I'm > also unconvinced that ENETDOWN is a good idea here. > Yes it is racy. A solution would be to set a flag or whatever in mdio_device_reset right before changes to the reset line happens and clear the flag right after the changes have been done. Add a function that return the state of the flag. Better go back to EIO ? At first I thought it was a good idea because the user gets at least a hint what the cause of the error is... > While I get the "describe the hardware" is there a real benefit to > describing this? > I can't follow. > What I'm wondering is whether manipulating the reset signal in this > case provides more pain than gain. > It seems so. I wasn't really aware of it :-) Best regards, Dimitri Fedrau
diff --git a/drivers/net/phy/marvell-88q2xxx.c b/drivers/net/phy/marvell-88q2xxx.c index 342a909a12a2785ad579656eb369c69acaace9d1..ea9a2a923146bf432a33ff46b606c08debb69a4f 100644 --- a/drivers/net/phy/marvell-88q2xxx.c +++ b/drivers/net/phy/marvell-88q2xxx.c @@ -698,6 +698,12 @@ static int mv88q2xxx_hwmon_read(struct device *dev, struct phy_device *phydev = dev_get_drvdata(dev); int ret; + /* If the PHYs reset is asserted it returns 0xffff for any read + * operation. Return with an network is down error in this case. + */ + if (phydev->mdio.reset_state == 1) + return -ENETDOWN; + switch (attr) { case hwmon_temp_input: ret = phy_read_mmd(phydev, MDIO_MMD_PCS, @@ -737,6 +743,14 @@ static int mv88q2xxx_hwmon_write(struct device *dev, { struct phy_device *phydev = dev_get_drvdata(dev); + /* If the PHYs reset is asserted it ignores any write operation, return + * with an network is down error in this case. Without returning an + * error the user would not know that writing the temperature threshold + * has gone wrong. + */ + if (phydev->mdio.reset_state == 1) + return -ENETDOWN; + switch (attr) { case hwmon_temp_max: clamp_val(val, -75000, 180000);
If the PHYs reset is asserted it returns 0xffff for any read operation. This might happen if the user admins down the interface and wants to read the temperature. Prevent reading the temperature in this case and return with an network is down error. Write operations are ignored by the device when reset is asserted, still return a network is down error in this case to make the user aware of the operation gone wrong. Signed-off-by: Dimitri Fedrau <dima.fedrau@gmail.com> --- drivers/net/phy/marvell-88q2xxx.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)