Message ID | 20240109205223.40219-1-wsa+renesas@sang-engineering.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC] net: phy: micrel: reset KSZ9x31 when resuming | expand |
> +static int ksz9x31_resume(struct phy_device *phydev) > +{ > + phy_device_reset(phydev, 1); > + phy_device_reset(phydev, 0); > + > + return kszphy_resume(phydev); > +} > + > static int kszphy_probe(struct phy_device *phydev) > { > const struct kszphy_type *type = phydev->drv->driver_data; > @@ -4778,7 +4786,7 @@ static struct phy_driver ksphy_driver[] = { > .get_strings = kszphy_get_strings, > .get_stats = kszphy_get_stats, > .suspend = kszphy_suspend, > - .resume = kszphy_resume, > + .resume = ksz9x31_resume, Humm, i'm not so sure about this. phy_resume() is called by mdio_bus_phy_resume(). That first does a call to phy_init_hw(), which will perform a soft reset on the PHY, call the drivers config_init() callback, and the config_intr() callback. Then it calls phy_resume(). Does phy_resume() hitting it with a reset clear out the configuration done by config_init() and the interrupt configuration performed by config_intr()? Andrew
- Philippe, email address is no longer valid. On Tue, Jan 09, 2024 at 09:52:22PM +0100, Wolfram Sang wrote: > On a Renesas Ebisu board, the KSZ9031 PHY is stalled after resuming if > the interface has not been brought up before suspending. If it had been > brought up before, phylib ensures that reset is asserted before > suspending. But if it had never been brought up, there is no instance > which could ensure that reset is asserted. And upon resume, the PHY is > in an unknown state without reset being asserted. To bring it back to a > known state, simply reset it when it is about to be resumed. > > This likely also helps another issue [1] where a KSZ9131 can be powered > using regulators. After switching power on again in resume, a reset is > also needed. > > [1] https://patchwork.kernel.org/project/netdevbpf/patch/20211214121638.138784-4-philippe.schenker@toradex.com/ > > Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> > --- > > This is a different solution to a problem I already tried to solve > here[2]. Back then, I added code to the MAC, but I now believe it should > be solved on PHY level. We never saw the problem with other PHYs. > Looking at [1], it seems that KSZ9x31 is also sensitive to being > powered down without reset being asserted. I know it is not a perfect > proof, but I guess these assumptions are all we have. > > Philippe, Francesco: do you still have access to machines with this > issue? Could you kindly test if so? I have access, however - Philippe is long gone from Toradex and he was the one looking into this topic - we did solve the issue in a different way, e.g. we no longer power-off the phy in suspend Therefore is not straightforward to provide valuable feedback to you now. > > Patch is based on 6.7. Looking forward for comments if this is the > correct layer to tackle the problem. Thanks! > > > [2] https://lore.kernel.org/all/20230321103357.18940-1-wsa+renesas@sang-engineering.com/ > > drivers/net/phy/micrel.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c > index 08e3915001c3..c38d7982c06c 100644 > --- a/drivers/net/phy/micrel.c > +++ b/drivers/net/phy/micrel.c > @@ -1984,6 +1984,14 @@ static int kszphy_resume(struct phy_device *phydev) > return 0; > } > > +static int ksz9x31_resume(struct phy_device *phydev) > +{ > + phy_device_reset(phydev, 1); > + phy_device_reset(phydev, 0); Is something like that fine? Don't we need to reconfigure the ethernet phy completely on resume if we do reset it? kszphy_config_reset() is taking care of something, but I think that the phy being reset on resume is not handled correctly. Francesco
Hi Francesco, > - Philippe, email address is no longer valid. OK, thanks for the heads up. > Therefore is not straightforward to provide valuable feedback to you > now. Thanks for answering anyhow. > > +static int ksz9x31_resume(struct phy_device *phydev) > > +{ > > + phy_device_reset(phydev, 1); > > + phy_device_reset(phydev, 0); > > Is something like that fine? > Don't we need to reconfigure the ethernet phy completely on resume > if we do reset it? kszphy_config_reset() is taking care of something, > but I think that the phy being reset on resume is not handled > correctly. If the interface is up before suspending, then suspend will assert the reset-line. If the interface is disabled before suspending, then close will assert the reset line. Deassertion will happen on resume/open. So, unless I am overlooking something, my code does not really add something new. It only makes sure that the reset line always gets asserted before deasserting. Because in the case that the interface has never been up before, there is no instance which could assert reset. Or, at least, I could not find one. Makes sense? Tests work fine here, at least. All the best, Wolfram
> Does phy_resume() hitting it with a reset clear out the configuration > done by config_init() and the interrupt configuration performed by > config_intr()? Hmm, I should have answered your mail first. Instrumentation says you are correct. Back to the drawing board :/
On Wed, Jan 17, 2024 at 02:33:53PM +0100, Wolfram Sang wrote: > > > +static int ksz9x31_resume(struct phy_device *phydev) > > > +{ > > > + phy_device_reset(phydev, 1); > > > + phy_device_reset(phydev, 0); > > > > Is something like that fine? > > Don't we need to reconfigure the ethernet phy completely on resume > > if we do reset it? kszphy_config_reset() is taking care of something, > > but I think that the phy being reset on resume is not handled > > correctly. > > If the interface is up before suspending, then suspend will assert the > reset-line. If the interface is disabled before suspending, then close > will assert the reset line. Deassertion will happen on resume/open. > > So, unless I am overlooking something, my code does not really add > something new. It only makes sure that the reset line always gets > asserted before deasserting. Because in the case that the interface has > never been up before, there is no instance which could assert reset. Or, > at least, I could not find one. > > Makes sense? Tests work fine here, at least. What I do not know, is what happen to any configuration that was done to the phy before. What if you have disabled gigabit ethernet from auto negotiation before suspend, it will be enabled again after the phy get out of reset. Is the ethernet phy subsystem taking care of this? Ensuring that this configuration is restored into the phy? I was starting to debug something around this a few days ago, with the difference that in that case the reset was asserted/de-asserted by the hardware and the end results was not really the expected one ... Francesco
On Tue, Feb 06, 2024 at 06:26:45PM +0100, Francesco Dolcini wrote: > On Wed, Jan 17, 2024 at 02:33:53PM +0100, Wolfram Sang wrote: > > > > +static int ksz9x31_resume(struct phy_device *phydev) > > > > +{ > > > > + phy_device_reset(phydev, 1); > > > > + phy_device_reset(phydev, 0); > > > > > > Is something like that fine? > > > Don't we need to reconfigure the ethernet phy completely on resume > > > if we do reset it? kszphy_config_reset() is taking care of something, > > > but I think that the phy being reset on resume is not handled > > > correctly. > > > > If the interface is up before suspending, then suspend will assert the > > reset-line. If the interface is disabled before suspending, then close > > will assert the reset line. Deassertion will happen on resume/open. > > > > So, unless I am overlooking something, my code does not really add > > something new. It only makes sure that the reset line always gets > > asserted before deasserting. Because in the case that the interface has > > never been up before, there is no instance which could assert reset. Or, > > at least, I could not find one. > > > > Makes sense? Tests work fine here, at least. > > What I do not know, is what happen to any configuration that was done to > the phy before. I'm assuming here WoL was not enabled, so the PHY did actually suspend. mdio_bus_phy_suspend() calls phy_stop_machine() which will set the state of the PHY to UP. During resume mdio_bus_phy_resume() calls phy_init_hw(). That should do a soft reset, call the config_init() callback, and configure interrupts. After that phy_resume() is called and then phy_state_machine(). Do to setting the state to UP, the state machine will kick off auto-negotiation, which will cause any auto-neg parameters to be written to the PHY. > What if you have disabled gigabit ethernet from auto negotiation before > suspend, it will be enabled again after the phy get out of reset. If you have set in fixed mode, the wrongly named phy_config_aneg() will set the fixed modes, same as it would set the auto-neg modes. So they should be preserved over suspend/resume. Andrew
On Tue, Feb 06, 2024 at 06:57:37PM +0100, Andrew Lunn wrote: > On Tue, Feb 06, 2024 at 06:26:45PM +0100, Francesco Dolcini wrote: > > On Wed, Jan 17, 2024 at 02:33:53PM +0100, Wolfram Sang wrote: > > > Makes sense? Tests work fine here, at least. > > > > What I do not know, is what happen to any configuration that was done to > > the phy before. > > I'm assuming here WoL was not enabled, so the PHY did actually > suspend. > > mdio_bus_phy_suspend() calls phy_stop_machine() which will set the > state of the PHY to UP. > > During resume mdio_bus_phy_resume() calls phy_init_hw(). That should > do a soft reset, call the config_init() callback, and configure > interrupts. After that phy_resume() is called and then > phy_state_machine(). Do to setting the state to UP, the state machine > will kick off auto-negotiation, which will cause any auto-neg > parameters to be written to the PHY. > > > What if you have disabled gigabit ethernet from auto negotiation before > > suspend, it will be enabled again after the phy get out of reset. > > If you have set in fixed mode, the wrongly named phy_config_aneg() > will set the fixed modes, same as it would set the auto-neg modes. So > they should be preserved over suspend/resume. Thanks for the detailed explanation Andrew. What if the configuration was done using ethtool? Francesco
> What if the configuration was done using ethtool?
The MAC driver when handling ethtool requests should call
phy_ethtool_ksettings_set(). That manipulates phydev->advertising,
phydev->autoneg etc. These are the values passed by phy_config_aneg()
to the PHY driver.
Andrew
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index 08e3915001c3..c38d7982c06c 100644 --- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -1984,6 +1984,14 @@ static int kszphy_resume(struct phy_device *phydev) return 0; } +static int ksz9x31_resume(struct phy_device *phydev) +{ + phy_device_reset(phydev, 1); + phy_device_reset(phydev, 0); + + return kszphy_resume(phydev); +} + static int kszphy_probe(struct phy_device *phydev) { const struct kszphy_type *type = phydev->drv->driver_data; @@ -4778,7 +4786,7 @@ static struct phy_driver ksphy_driver[] = { .get_strings = kszphy_get_strings, .get_stats = kszphy_get_stats, .suspend = kszphy_suspend, - .resume = kszphy_resume, + .resume = ksz9x31_resume, .cable_test_start = ksz9x31_cable_test_start, .cable_test_get_status = ksz9x31_cable_test_get_status, }, { @@ -4851,7 +4859,7 @@ static struct phy_driver ksphy_driver[] = { .get_strings = kszphy_get_strings, .get_stats = kszphy_get_stats, .suspend = kszphy_suspend, - .resume = kszphy_resume, + .resume = ksz9x31_resume, .cable_test_start = ksz9x31_cable_test_start, .cable_test_get_status = ksz9x31_cable_test_get_status, .get_features = ksz9477_get_features,
On a Renesas Ebisu board, the KSZ9031 PHY is stalled after resuming if the interface has not been brought up before suspending. If it had been brought up before, phylib ensures that reset is asserted before suspending. But if it had never been brought up, there is no instance which could ensure that reset is asserted. And upon resume, the PHY is in an unknown state without reset being asserted. To bring it back to a known state, simply reset it when it is about to be resumed. This likely also helps another issue [1] where a KSZ9131 can be powered using regulators. After switching power on again in resume, a reset is also needed. [1] https://patchwork.kernel.org/project/netdevbpf/patch/20211214121638.138784-4-philippe.schenker@toradex.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> --- This is a different solution to a problem I already tried to solve here[2]. Back then, I added code to the MAC, but I now believe it should be solved on PHY level. We never saw the problem with other PHYs. Looking at [1], it seems that KSZ9x31 is also sensitive to being powered down without reset being asserted. I know it is not a perfect proof, but I guess these assumptions are all we have. Philippe, Francesco: do you still have access to machines with this issue? Could you kindly test if so? Patch is based on 6.7. Looking forward for comments if this is the correct layer to tackle the problem. Thanks! [2] https://lore.kernel.org/all/20230321103357.18940-1-wsa+renesas@sang-engineering.com/ drivers/net/phy/micrel.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)