[net,v1,1/2] net: phy: set eee_cfg based on PHY configuration

Message ID 20241114081653.3939346-2-yong.liang.choong@linux.intel.com (mailing list archive)
State New
Series Fix 'ethtool --show-eee' during initial stage

Commit Message

Choong Yong Liang Nov. 14, 2024, 8:16 a.m. UTC
Not all PHYs have EEE enabled by default. For example, Marvell PHYs come
up with EEE disabled in hardware and must be explicitly configured to
turn it on.

This patch reads the PHY configuration and sets it as the initial value for
eee_cfg.tx_lpi_enabled and eee_cfg.eee_enabled instead of having them set to
true by default.

Fixes: 49168d1980e2 ("net: phy: Add phy_support_eee() indicating MAC support EEE")
Cc: <stable@vger.kernel.org>
Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
---
 drivers/net/phy/phy_device.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Russell King (Oracle) Nov. 14, 2024, 9:23 a.m. UTC | #1
On Thu, Nov 14, 2024 at 04:16:52PM +0800, Choong Yong Liang wrote:
> Not all PHYs have EEE enabled by default. For example, Marvell PHYs are
> designed to have EEE hardware disabled during the initial state, and it
> needs to be configured to turn it on again.
> 
> This patch reads the PHY configuration and sets it as the initial value for
> eee_cfg.tx_lpi_enabled and eee_cfg.eee_enabled instead of having them set to
> true by default.

eee_cfg.tx_lpi_enabled is something phylib tracks, and it merely means
that LPI needs to be enabled at the MAC if EEE was negotiated:

 * @tx_lpi_enabled: Whether the interface should assert its tx lpi, given
 *      that eee was negotiated.

eee_cfg.eee_enabled means that EEE mode was enabled - which is user
configuration:

 * @eee_enabled: EEE configured mode (enabled/disabled).

phy_probe() reads the initial PHY state and sets things up
appropriately.
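
To make the distinction between the two flags concrete, here is a minimal
standalone sketch. It is not the kernel's code: struct model_eee_cfg and
model_mac_should_tx_lpi() are hypothetical names, and the model assumes
that the negotiation result is available as a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the two user-configuration flags described
 * above. This is NOT the kernel's structure; names are illustrative. */
struct model_eee_cfg {
	bool tx_lpi_enabled; /* MAC should assert Tx LPI, given EEE was negotiated */
	bool eee_enabled;    /* EEE configured mode (enabled/disabled) */
};

/* Hypothetical helper: tx_lpi_enabled alone is not enough. The MAC only
 * asserts LPI when the flag is set AND EEE was actually negotiated. */
bool model_mac_should_tx_lpi(const struct model_eee_cfg *cfg,
			     bool eee_negotiated)
{
	assert(cfg != NULL);
	return cfg->tx_lpi_enabled && eee_negotiated;
}
```

In this model, tx_lpi_enabled = true with no negotiated EEE link yields
false, which is why defaulting the flag to true is harmless by itself.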

However, there is a point where the EEE configuration (advertisement,
and therefore eee_enabled state) is written to the PHY, and that should
be config_aneg(). The Marvell driver calls genphy_config_aneg(), which,
via __genphy_config_aneg(), eventually calls
genphy_c45_an_config_eee_aneg() to do this.

Please investigate why the hardware state is going out of sync with the
software state.

Thanks.

>  void phy_support_eee(struct phy_device *phydev)
>  {
> +	bool is_enabled = true;
> +
> +	genphy_c45_eee_is_active(phydev, NULL, NULL, &is_enabled);
>  	linkmode_copy(phydev->advertising_eee, phydev->supported_eee);
> -	phydev->eee_cfg.tx_lpi_enabled = true;
> -	phydev->eee_cfg.eee_enabled = true;
> +	phydev->eee_cfg.tx_lpi_enabled = is_enabled;
> +	phydev->eee_cfg.eee_enabled = is_enabled;

This is almost certainly incorrect, because eee_enabled should only
be set when phydev->advertising_eee (which should track the hardware
EEE advertisement programmed into the PHY) is non-zero.
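
The invariant described here (eee_enabled follows the advertisement) can
be sketched as a standalone model. Everything below is an assumption made
for illustration: the kernel uses link-mode bitmaps rather than a
uint32_t, and struct model_phy, the MODEL_EEE_* bits and
model_sync_eee_state() are hypothetical names. The sketch also masks off
unsupported modes before deriving the flag, which is a choice made here so
that the flag can never be true with an empty advertisement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit positions only. */
#define MODEL_EEE_100TX (UINT32_C(1) << 0)
#define MODEL_EEE_1000T (UINT32_C(1) << 1)
#define MODEL_EEE_10GT  (UINT32_C(1) << 2)

struct model_phy {
	uint32_t supported_eee;   /* modes the PHY is able to use */
	uint32_t advertising_eee; /* modes currently programmed into the PHY */
	bool eee_enabled;         /* derived from the advertisement */
};

/* Hypothetical probe-time sync: drop unsupported modes from what the
 * hardware advertises, then treat "advertising anything" as "enabled". */
void model_sync_eee_state(struct model_phy *phy, uint32_t hw_advertisement)
{
	assert(phy != NULL);
	phy->advertising_eee = hw_advertisement & phy->supported_eee;
	phy->eee_enabled = phy->advertising_eee != 0;
}
```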

Note that phy_support_eee() must be called _before_ phy_start(). I
haven't checked whether stmmac does this.

Thanks.

Russell King (Oracle) Nov. 14, 2024, 10:05 a.m. UTC | #2
On Thu, Nov 14, 2024 at 09:23:48AM +0000, Russell King (Oracle) wrote:
> Please investigate why the hardware state is going out of sync with the
> software state.

I think I've found the issue.

We have phydev->eee_enabled and phydev->eee_cfg.eee_enabled, which looks
like a bug to me. We write to phydev->eee_cfg.eee_enabled in
phy_support_eee(), leaving phydev->eee_enabled untouched.

However, most other places are using phydev->eee_enabled.

This is (a) confusing and (b) wrong. Having two members causes exactly
this confusion and makes the code more difficult to follow (unless one
has already noticed that there are two different things both called
eee_enabled).
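
The failure mode can be reproduced in a small standalone model. This is a
sketch under stated assumptions, not kernel code: the dup_* names are
hypothetical, link modes are reduced to a uint32_t, and hw_eee_adv stands
in for the PHY's EEE advertisement register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct dup_eee_cfg {
	bool eee_enabled; /* flag written by the "support EEE" path */
};

struct dup_phydev {
	uint32_t supported_eee;
	uint32_t advertising_eee;
	uint32_t hw_eee_adv; /* stand-in for the PHY advertisement register */
	bool eee_enabled;    /* second flag, read by the aneg path */
	struct dup_eee_cfg eee_cfg;
};

/* Same shape as phy_support_eee(): only eee_cfg.eee_enabled is updated. */
void dup_support_eee(struct dup_phydev *phy)
{
	assert(phy != NULL);
	phy->advertising_eee = phy->supported_eee;
	phy->eee_cfg.eee_enabled = true;
}

/* Aneg path consulting the untouched flag: the advertisement is cleared. */
void dup_config_eee_aneg_buggy(struct dup_phydev *phy)
{
	assert(phy != NULL);
	phy->hw_eee_adv = phy->eee_enabled ? phy->advertising_eee : 0;
}

/* Aneg path consulting eee_cfg.eee_enabled, as in the prototype fix. */
void dup_config_eee_aneg_fixed(struct dup_phydev *phy)
{
	assert(phy != NULL);
	phy->hw_eee_adv = phy->eee_cfg.eee_enabled ? phy->advertising_eee : 0;
}
```

Calling dup_support_eee() and then the buggy variant on a zero-initialised
device leaves hw_eee_adv at 0, while the fixed variant writes the full
supported set.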

This is my untested prototype patch to fix this - it may cause breakage
elsewhere:

diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
index c1b3576c307f..2d64d3f293e5 100644
--- a/drivers/net/phy/phy-c45.c
+++ b/drivers/net/phy/phy-c45.c
@@ -943,7 +943,7 @@ EXPORT_SYMBOL_GPL(genphy_c45_read_eee_abilities);
  */
 int genphy_c45_an_config_eee_aneg(struct phy_device *phydev)
 {
-	if (!phydev->eee_enabled) {
+	if (!phydev->eee_cfg.eee_enabled) {
 		__ETHTOOL_DECLARE_LINK_MODE_MASK(adv) = {};
 
 		return genphy_c45_write_eee_adv(phydev, adv);
@@ -1576,8 +1576,6 @@ int genphy_c45_ethtool_set_eee(struct phy_device *phydev,
 		}
 	}
 
-	phydev->eee_enabled = data->eee_enabled;
-
 	ret = genphy_c45_an_config_eee_aneg(phydev);
 	if (ret > 0) {
 		ret = phy_restart_aneg(phydev);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index bc24c9f2786b..b26bb33cd1d4 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -3589,12 +3589,12 @@ static int phy_probe(struct device *dev)
 	/* There is no "enabled" flag. If PHY is advertising, assume it is
 	 * kind of enabled.
 	 */
-	phydev->eee_enabled = !linkmode_empty(phydev->advertising_eee);
+	phydev->eee_cfg.eee_enabled = !linkmode_empty(phydev->advertising_eee);
 
 	/* Some PHYs may advertise, by default, not support EEE modes. So,
 	 * we need to clean them.
 	 */
-	if (phydev->eee_enabled)
+	if (phydev->eee_cfg.eee_enabled)
 		linkmode_and(phydev->advertising_eee, phydev->supported_eee,
 			     phydev->advertising_eee);
 
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 1e4127c495c0..33905e9672a7 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -601,7 +601,6 @@ struct macsec_ops;
  * @adv_old: Saved advertised while power saving for WoL
  * @supported_eee: supported PHY EEE linkmodes
  * @advertising_eee: Currently advertised EEE linkmodes
- * @eee_enabled: Flag indicating whether the EEE feature is enabled
  * @enable_tx_lpi: When True, MAC should transmit LPI to PHY
  * @eee_cfg: User configuration of EEE
  * @lp_advertising: Current link partner advertised linkmodes
@@ -721,7 +720,6 @@ struct phy_device {
 	/* used for eee validation and configuration*/
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(supported_eee);
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(advertising_eee);
-	bool eee_enabled;
 
 	/* Host supported PHY interface types. Should be ignored if empty. */
 	DECLARE_PHY_INTERFACE_MASK(host_interfaces);

Russell King (Oracle) Nov. 14, 2024, 10:16 a.m. UTC | #3
On Thu, Nov 14, 2024 at 10:05:52AM +0000, Russell King (Oracle) wrote:
> I think I've found the issue.
> 
> We have phydev->eee_enabled and phydev->eee_cfg.eee_enabled, which looks
> like a bug to me. We write to phydev->eee_cfg.eee_enabled in
> phy_support_eee(), leaving phydev->eee_enabled untouched.
> 
> However, most other places are using phydev->eee_enabled.
> 
> This is (a) confusing and (b) wrong, and having the two members leads
> to this confusion, and makes the code more difficult to follow (unless
> one has already clocked that there are these two different things both
> called eee_enabled).
> 
> This is my untested prototype patch to fix this - it may cause breakage
> elsewhere:

As mentioned in the other thread:

Without a call to phy_support_eee():

EEE settings for eth2:
        EEE status: disabled
        Tx LPI: disabled
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
        Advertised EEE link modes:  Not reported
        Link partner advertised EEE link modes:  100baseT/Full
                                                 1000baseT/Full

With a call to phy_support_eee():

EEE settings for eth2:
        EEE status: enabled - active
        Tx LPI: 0 (us)
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
        Advertised EEE link modes:  100baseT/Full
                                    1000baseT/Full
        Link partner advertised EEE link modes:  100baseT/Full
                                                 1000baseT/Full

So the EEE status is now behaving correctly, and the Marvell PHY is
being programmed with the advertisement correctly.

Patch

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 499797646580..b4fa40c2371a 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -3010,9 +3010,12 @@  EXPORT_SYMBOL_GPL(phy_advertise_eee_all);
  */
 void phy_support_eee(struct phy_device *phydev)
 {
+	bool is_enabled = true;
+
+	genphy_c45_eee_is_active(phydev, NULL, NULL, &is_enabled);
 	linkmode_copy(phydev->advertising_eee, phydev->supported_eee);
-	phydev->eee_cfg.tx_lpi_enabled = true;
-	phydev->eee_cfg.eee_enabled = true;
+	phydev->eee_cfg.tx_lpi_enabled = is_enabled;
+	phydev->eee_cfg.eee_enabled = is_enabled;
 }
 EXPORT_SYMBOL(phy_support_eee);