[net,v1,1/2] net: phy: Introduce phy_update_eee() to update eee_cfg values

Message ID 20241112072447.3238892-2-yong.liang.choong@linux.intel.com (mailing list archive)
State New
Series Fix ethtool --show-eee for stmmac

Commit Message

Choong Yong Liang Nov. 12, 2024, 7:24 a.m. UTC
The commit fe0d4fd9285e ("net: phy: Keep track of EEE configuration")
introduced eee_cfg, which is used to check the existing settings against
the requested changes. When the 'ethtool --show-eee' command is issued,
it reads the values from eee_cfg. However, the 'show-eee' command does
not show the correct result after system boot-up, link up, and link down.

For system boot-up, the commit 49168d1980e2
("net: phy: Add phy_support_eee() indicating MAC support EEE") introduced
phy_support_eee to set eee_cfg as the default value. However, the values
set were not always correct, as after autonegotiation or speed changes,
the selected speed might not be supported by EEE.

phy_update_eee() was introduced to update the correct values for eee_cfg
during link up and down, ensuring that 'ethtool --show-eee' shows
the correct status.

Fixes: fe0d4fd9285e ("net: phy: Keep track of EEE configuration")
Cc: <stable@vger.kernel.org>
Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
---
 drivers/net/phy/phy_device.c | 24 ++++++++++++++++++++++++
 include/linux/phy.h          |  2 ++
 2 files changed, 26 insertions(+)

Comments

Heiner Kallweit Nov. 12, 2024, 11:03 a.m. UTC | #1
On 12.11.2024 08:24, Choong Yong Liang wrote:
> The commit fe0d4fd9285e ("net: phy: Keep track of EEE configuration")
> introduced eee_cfg, which is used to check the existing settings against
> the requested changes. When the 'ethtool --show-eee' command is issued,
> it reads the values from eee_cfg. However, the 'show-eee' command does
> not show the correct result after system boot-up, link up, and link down.
> 

In stmmac_ethtool_op_get_eee() you have the following:

edata->tx_lpi_timer = priv->tx_lpi_timer;
edata->tx_lpi_enabled = priv->tx_lpi_enabled;
return phylink_ethtool_get_eee(priv->phylink, edata);

You have to call phylink_ethtool_get_eee() first, otherwise the manually
set values will be overridden. However setting tx_lpi_enabled shouldn't
be needed if you respect phydev->enable_tx_lpi.
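
The ordering problem can be reproduced outside the kernel with a small
userspace model. This is only a sketch: every mock_* type and function
below is a hypothetical stand-in, not the real stmmac or phylink code,
and mock_phylink_get_eee() simply assumes that the phylink call
overwrites the fields it owns, as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the ethtool EEE reply structure. */
struct mock_keee {
	bool eee_enabled;
	bool tx_lpi_enabled;
	uint32_t tx_lpi_timer;
};

/* Hypothetical stand-in for the MAC driver's private data. */
struct mock_priv {
	uint32_t tx_lpi_timer;		/* timer the MAC actually programmed */
	bool tx_lpi_enabled;
	struct mock_keee phylib_cfg;	/* configuration tracked by phylib */
};

/* Assumed behaviour: the phylink call overwrites edata with phylib state. */
int mock_phylink_get_eee(const struct mock_priv *priv, struct mock_keee *edata)
{
	*edata = priv->phylib_cfg;
	return 0;
}

/* Ordering from the quoted driver code: the MAC values are overwritten. */
int get_eee_mac_first(const struct mock_priv *priv, struct mock_keee *edata)
{
	assert(priv && edata);
	edata->tx_lpi_timer = priv->tx_lpi_timer;
	edata->tx_lpi_enabled = priv->tx_lpi_enabled;
	return mock_phylink_get_eee(priv, edata);
}

/* Suggested ordering: let phylink fill edata, then add the MAC timer. */
int get_eee_phylink_first(const struct mock_priv *priv, struct mock_keee *edata)
{
	int ret;

	assert(priv && edata);
	ret = mock_phylink_get_eee(priv, edata);
	if (ret)
		return ret;
	edata->tx_lpi_timer = priv->tx_lpi_timer;
	return 0;
}
```

In the second variant tx_lpi_enabled is left to the phylib side, which
matches the remark that setting it by hand should not be needed.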

> For system boot-up, the commit 49168d1980e2
> ("net: phy: Add phy_support_eee() indicating MAC support EEE") introduced
> phy_support_eee to set eee_cfg as the default value. However, the values
> set were not always correct, as after autonegotiation or speed changes,
> the selected speed might not be supported by EEE.
> 
> phy_update_eee() was introduced to update the correct values for eee_cfg
> during link up and down, ensuring that 'ethtool --show-eee' shows
> the correct status.
> 
> Fixes: fe0d4fd9285e ("net: phy: Keep track of EEE configuration")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
> ---
>  drivers/net/phy/phy_device.c | 24 ++++++++++++++++++++++++
>  include/linux/phy.h          |  2 ++
>  2 files changed, 26 insertions(+)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 499797646580..94dadf011ca6 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -3016,6 +3016,30 @@ void phy_support_eee(struct phy_device *phydev)
>  }
>  EXPORT_SYMBOL(phy_support_eee);
>  
> +/**
> + * phy_update_eee - Update the Energy Efficient Ethernet (EEE) settings
> + * @phydev: target phy_device struct
> + * @tx_lpi_enabled: boolean indicating if Low Power Idle (LPI) for
> + * transmission is enabled.
> + * @eee_enabled: boolean indicating if Energy Efficient Ethernet (EEE) is
> + * enabled.
> + * @tx_lpi_timer: the Low Power Idle (LPI) timer value (in microseconds) for
> + * transmission.
> + *
> + * Description:
> + * This function updates the Energy Efficient Ethernet (EEE) settings for the
> + * specified PHY device. It is typically called during link up and down events
> + * to configure the EEE parameters according to the current link state.
> + */
> +void phy_update_eee(struct phy_device *phydev, bool tx_lpi_enabled,
> +		    bool eee_enabled, u32 tx_lpi_timer)
> +{
> +	phydev->eee_cfg.tx_lpi_enabled = tx_lpi_enabled;
> +	phydev->eee_cfg.eee_enabled = eee_enabled;
> +	phydev->eee_cfg.tx_lpi_timer = tx_lpi_timer;
> +}
> +EXPORT_SYMBOL(phy_update_eee);
> +
>  /**
>   * phy_support_sym_pause - Enable support of symmetrical pause
>   * @phydev: target phy_device struct
> diff --git a/include/linux/phy.h b/include/linux/phy.h
> index a98bc91a0cde..6c300ba47a2d 100644
> --- a/include/linux/phy.h
> +++ b/include/linux/phy.h
> @@ -2004,6 +2004,8 @@ void phy_advertise_eee_all(struct phy_device *phydev);
>  void phy_support_sym_pause(struct phy_device *phydev);
>  void phy_support_asym_pause(struct phy_device *phydev);
>  void phy_support_eee(struct phy_device *phydev);
> +void phy_update_eee(struct phy_device *phydev, bool tx_lpi_enabled,
> +		    bool eee_enabled, u32 tx_lpi_timer);
>  void phy_set_sym_pause(struct phy_device *phydev, bool rx, bool tx,
>  		       bool autoneg);
>  void phy_set_asym_pause(struct phy_device *phydev, bool rx, bool tx);
Andrew Lunn Nov. 12, 2024, 1:04 p.m. UTC | #2
On Tue, Nov 12, 2024 at 12:03:15PM +0100, Heiner Kallweit wrote:
> On 12.11.2024 08:24, Choong Yong Liang wrote:
> > The commit fe0d4fd9285e ("net: phy: Keep track of EEE configuration")
> > introduced eee_cfg, which is used to check the existing settings against
> > the requested changes. When the 'ethtool --show-eee' command is issued,
> > it reads the values from eee_cfg. However, the 'show-eee' command does
> > not show the correct result after system boot-up, link up, and link down.
> > 
> 
> In stmmac_ethtool_op_get_eee() you have the following:
> 
> edata->tx_lpi_timer = priv->tx_lpi_timer;
> edata->tx_lpi_enabled = priv->tx_lpi_enabled;
> return phylink_ethtool_get_eee(priv->phylink, edata);
> 
> You have to call phylink_ethtool_get_eee() first, otherwise the manually
> set values will be overridden. However setting tx_lpi_enabled shouldn't
> be needed if you respect phydev->enable_tx_lpi.

I agree with Heiner here, this sounds like a bug somewhere, not
something which needs new code in phylib. Let's understand why it gives
the wrong results.

	Andrew
Choong Yong Liang Nov. 13, 2024, 10:10 a.m. UTC | #3
On 12/11/2024 9:04 pm, Andrew Lunn wrote:
> On Tue, Nov 12, 2024 at 12:03:15PM +0100, Heiner Kallweit wrote:
>> In stmmac_ethtool_op_get_eee() you have the following:
>>
>> edata->tx_lpi_timer = priv->tx_lpi_timer;
>> edata->tx_lpi_enabled = priv->tx_lpi_enabled;
>> return phylink_ethtool_get_eee(priv->phylink, edata);
>>
>> You have to call phylink_ethtool_get_eee() first, otherwise the manually
>> set values will be overridden. However setting tx_lpi_enabled shouldn't
>> be needed if you respect phydev->enable_tx_lpi.
> 
> I agree with Heiner here, this sounds like a bug somewhere, not
> something which needs new code in phylib. Lets understand why it gives
> the wrong results.
> 
> 	Andrew
Hi Russell, Andrew, and Heiner, thanks a lot for your valuable feedback.

The current implementation of the 'ethtool --show-eee' command heavily
relies on the phy_ethtool_get_eee() in phy.c. The eeecfg values are set by
the 'ethtool --set-eee' command and the phy_support_eee() during the
initial state. The phy_ethtool_get_eee() calls eeecfg_to_eee(), which
returns the eeecfg containing tx_lpi_timer, tx_lpi_enabled, and eee_enabled
for the 'ethtool --show-eee' command.

The tx_lpi_timer and tx_lpi_enabled values stored in the MAC or PHY driver 
are not retrieved by the 'ethtool --show-eee' command.

Currently, we are facing 3 issues:
1. When we boot up our system and do not issue the 'ethtool --set-eee'
command, and then directly issue the 'ethtool --show-eee' command, it
always shows that EEE is disabled because the eeecfg values are not set.
However, with the MaxLinear GPY PHY, EEE is enabled in the hardware. If we
try to disable EEE, nothing happens because the eeecfg already matches the
setting required to disable EEE in ethnl_set_eee(). The phy_support_eee()
was introduced to set the initial values of eee_enabled and tx_lpi_enabled
to true. This would allow 'ethtool --show-eee' to show that EEE is enabled
during the initial state. However, the Marvell PHY is designed to have EEE
disabled in hardware during the initial state. Users are required to use
ethtool to enable EEE. phy_support_eee() does not show the correct state
for the Marvell PHY.

2. The 'ethtool --show-eee' command does not display the correct status, 
even if the link is down or the speed changes to one that does not support EEE.

3. The tx_lpi_timer in 'ethtool --show-eee' always shows 0 if we have not 
used 'ethtool --set-eee' to set the values, even though the driver sets 
different values.

I appreciate Russell's point that eee_enabled is a user configuration bit, 
not a status bit. However, I am curious if tx_lpi_timer, tx_lpi_enabled, 
and other fields are also considered configuration bits.

According to the ethtool man page:
--show-eee
Queries the specified network device for its support of Energy-Efficient 
Ethernet (according to the IEEE 802.3az specifications)

It does not specify which fields are configuration bits and which are 
status bits.
Heiner Kallweit Nov. 13, 2024, 9:48 p.m. UTC | #4
On 13.11.2024 11:10, Choong Yong Liang wrote:
> 
> 
> On 12/11/2024 9:04 pm, Andrew Lunn wrote:
>> On Tue, Nov 12, 2024 at 12:03:15PM +0100, Heiner Kallweit wrote:
>>> In stmmac_ethtool_op_get_eee() you have the following:
>>>
>>> edata->tx_lpi_timer = priv->tx_lpi_timer;
>>> edata->tx_lpi_enabled = priv->tx_lpi_enabled;
>>> return phylink_ethtool_get_eee(priv->phylink, edata);
>>>
>>> You have to call phylink_ethtool_get_eee() first, otherwise the manually
>>> set values will be overridden. However setting tx_lpi_enabled shouldn't
>>> be needed if you respect phydev->enable_tx_lpi.
>>
>> I agree with Heiner here, this sounds like a bug somewhere, not
>> something which needs new code in phylib. Lets understand why it gives
>> the wrong results.
>>
>>     Andrew
> Hi Russell, Andrew, and Heiner, thanks a lot for your valuable feedback.
> 
> The current implementation of the 'ethtool --show-eee' command heavily relies on the phy_ethtool_get_eee() in phy.c. The eeecfg values are set by the 'ethtool --set-eee' command and the phy_support_eee() during the initial state. The phy_ethtool_get_eee() calls eeecfg_to_eee(), which returns the eeecfg containing tx_lpi_timer, tx_lpi_enabled, and eee_enable for the 'ethtool --show-eee' command.
> 
"relies on" may be the wrong term here. There's an API definition,
and phy_ethtool_get_eee() takes care of the PHY-related kernel part,
provided that the MAC driver uses phylib.
I say "PHY-related part", because tx_lpi_timer is something relevant
for the MAC only. Therefore phylib stores the master config timer value
only, not the actual value.
The MAC driver should populate tx_lpi_timer in the get_eee() callback,
in addition to what phy_ethtool_get_eee() populates.
This may result in the master config value being overwritten with actual
value in cases where the MAC doesn't support the master config value.

One special case of tx_lpi_timer handling (maybe there are more) is
Realtek chips, as they store the LPI timer in bytes. This means that
whenever the link speed changes, the actual timer value also changes
implicitly.
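
To illustrate why a byte-based timer changes with link speed, here is a
small worked calculation. It is a sketch under one assumption that the
text above does not spell out: that one timer unit corresponds to the
time needed to transmit one byte at the current speed. The function name
is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a byte-based LPI timer to microseconds.
 * Assumption: one timer unit is the wire time of one byte.
 * bits / (Mbit/s) gives microseconds directly.
 */
uint32_t lpi_bytes_to_usec(uint32_t timer_bytes, uint32_t speed_mbps)
{
	assert(speed_mbps != 0);
	return (uint32_t)(((uint64_t)timer_bytes * 8) / speed_mbps);
}
```

Under that assumption, the same register value of 1000 bytes is 8 us at
1000 Mbit/s but 80 us at 100 Mbit/s, so the effective timer grows
tenfold when the link drops to the lower speed.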

A few values exist twice: as a master config value, and as status.
struct phy_device has the status values:
@eee_enabled: Flag indicating whether the EEE feature is enabled
@enable_tx_lpi: When True, MAC should transmit LPI to PHY

And the master config values are in phydev->eee_cfg, which is a
struct eee_config:

struct eee_config {
	u32 tx_lpi_timer;
	bool tx_lpi_enabled;
	bool eee_enabled;
};

And yes, it may be a little misleading that eee_enabled exists twice,
you have to be careful which one you're referring to.

ethtool handles the master config values; only "active" is status
information.

So the MAC driver should:
- provide a link change handler in e.g. phy_connect_direct()
- this handler should:
  - use phydev->enable_tx_lpi to set whether MAC transmits LPI or not
  - use phydev->eee_cfg.tx_lpi_timer to set the timer (if the config
    value is set)
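
The handler described in the list above can be modelled as follows. This
is a userspace sketch, not kernel code: the mock_* structures only mirror
the field names mentioned in this mail (enable_tx_lpi, eee_cfg), and
struct mock_mac with its fields is a hypothetical placeholder for
whatever registers a real MAC driver would program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the fields of struct eee_config quoted in this mail. */
struct mock_eee_config {
	uint32_t tx_lpi_timer;
	bool tx_lpi_enabled;
	bool eee_enabled;
};

/* Only the phy_device fields relevant to this example. */
struct mock_phy_device {
	bool enable_tx_lpi;			/* status: MAC should send LPI */
	struct mock_eee_config eee_cfg;		/* master config values */
};

/* Hypothetical MAC state standing in for hardware registers. */
struct mock_mac {
	bool lpi_on;
	uint32_t lpi_timer_usec;
};

/* Link change handler: follow the status flag, not the config flags. */
void mock_mac_link_change(struct mock_mac *mac,
			  const struct mock_phy_device *phydev)
{
	assert(mac && phydev);

	mac->lpi_on = phydev->enable_tx_lpi;

	/* Take the configured timer only if one was set; else keep ours. */
	if (phydev->eee_cfg.tx_lpi_timer)
		mac->lpi_timer_usec = phydev->eee_cfg.tx_lpi_timer;
}
```

Note that the handler never looks at eee_cfg.tx_lpi_enabled or
eee_cfg.eee_enabled; in this model those are inputs to phylib, and the
MAC only consumes the resulting enable_tx_lpi status.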

Important note:
This describes how MAC drivers *should* behave. Some don't get it right.
So part of your confusion may be caused by misbehaving MAC drivers.
One example of a MAC driver bug is what I wrote earlier about 
stmmac_ethtool_op_get_eee().

And what I write here refers to plain phylib, I don't cover phylink as
an additional layer.


> The tx_lpi_timer and tx_lpi_enabled values stored in the MAC or PHY driver are not retrieved by the 'ethtool --show-eee' command.
> 
> Currently, we are facing 3 issues:
> 1. When we boot up our system and do not issue the 'ethtool --set-eee' command, and then directly issue the 'ethtool --show-eee' command, it always shows that EEE is disabled due to the eeecfg values not being set. However, in the Maxliner GPY PHY, the driver EEE is enabled. If we try to disable EEE, nothing happens because the eeecfg matches the setting required to disable EEE in ethnl_set_eee(). The phy_support_eee() was introduced to set the initial values to enable eee_enabled and tx_lpi_enabled. This would allow 'ethtool --show-eee' to show that EEE is enabled during the initial state. However, the Marvell PHY is designed to have hardware disabled EEE during the initial state. Users are required to use Ethtool to enable the EEE. phy_support_eee() does not show the correct for Marvell PHY.
> 
> 2. The 'ethtool --show-eee' command does not display the correct status, even if the link is down or the speed changes to one that does not support EEE.
> 
> 3. The tx_lpi_timer in 'ethtool --show-eee' always shows 0 if we have not used 'ethtool --set-eee' to set the values, even though the driver sets different values.
> 
> I appreciate Russell's point that eee_enabled is a user configuration bit, not a status bit. However, I am curious if tx_lpi_timer, tx_lpi_enabled, and other fields are also considered configuration bits.
> 
> According to the ethtool man page:
> --show-eee
> Queries the specified network device for its support of Energy-Efficient Ethernet (according to the IEEE 802.3az specifications)
> 
> It does not specify which fields are configuration bits and which are status bits.
Andrew Lunn Nov. 13, 2024, 11:05 p.m. UTC | #5
On Wed, Nov 13, 2024 at 06:10:55PM +0800, Choong Yong Liang wrote:
> 
> 
> On 12/11/2024 9:04 pm, Andrew Lunn wrote:
> > On Tue, Nov 12, 2024 at 12:03:15PM +0100, Heiner Kallweit wrote:
> > > In stmmac_ethtool_op_get_eee() you have the following:
> > > 
> > > edata->tx_lpi_timer = priv->tx_lpi_timer;
> > > edata->tx_lpi_enabled = priv->tx_lpi_enabled;
> > > return phylink_ethtool_get_eee(priv->phylink, edata);
> > > 
> > > You have to call phylink_ethtool_get_eee() first, otherwise the manually
> > > set values will be overridden. However setting tx_lpi_enabled shouldn't
> > > be needed if you respect phydev->enable_tx_lpi.
> > 
> > I agree with Heiner here, this sounds like a bug somewhere, not
> > something which needs new code in phylib. Lets understand why it gives
> > the wrong results.
> > 
> > 	Andrew
> Hi Russell, Andrew, and Heiner, thanks a lot for your valuable feedback.
> 
> The current implementation of the 'ethtool --show-eee' command heavily
> relies on the phy_ethtool_get_eee() in phy.c. The eeecfg values are set by
> the 'ethtool --set-eee' command and the phy_support_eee() during the initial
> state. The phy_ethtool_get_eee() calls eeecfg_to_eee(), which returns the
> eeecfg containing tx_lpi_timer, tx_lpi_enabled, and eee_enable for the
> 'ethtool --show-eee' command.
> 
> The tx_lpi_timer and tx_lpi_enabled values stored in the MAC or PHY driver
> are not retrieved by the 'ethtool --show-eee' command.

tx_lpi_timer is a MAC property, but phylib does track it across
--set-eee calls and will fill it in for get-eee. What is missing,
however, is setting its default value. There is currently no API the
MAC driver can call to let phylib know what default value it is using.
Either such an API could be added, e.g. as part of phy_support_eee(),
or we could hard code a value, probably again in phy_support_eee().

tx_lpi_enabled is filled in by phy_ethtool_get_eee(), and its default
value is set by phy_support_eee(). So I don't see what is wrong here.

> Currently, we are facing 3 issues:
> 1. When we boot up our system and do not issue the 'ethtool --set-eee'
> command, and then directly issue the 'ethtool --show-eee' command, it always
> shows that EEE is disabled due to the eeecfg values not being set. However,
> in the Maxliner GPY PHY, the driver EEE is enabled. If we try to disable
> EEE, nothing happens because the eeecfg matches the setting required to
> disable EEE in ethnl_set_eee(). The phy_support_eee() was introduced to set
> the initial values to enable eee_enabled and tx_lpi_enabled. This would
> allow 'ethtool --show-eee' to show that EEE is enabled during the initial
> state. However, the Marvell PHY is designed to have hardware disabled EEE
> during the initial state. Users are required to use Ethtool to enable the
> EEE. phy_support_eee() does not show the correct for Marvell PHY.

We discussed what to set the initial state to when we reworked the EEE
support. It is a hard problem, because changing anything could cause
regressions. Some users don't want EEE enabled, because it can add
latency and jitter, e.g. to PTP packets. Some users want it enabled
for the power savings.

We decided to leave the PHY untouched, and will read out its
configuration. If this is going wrong, that is a bug which should be
found and fixed.

We want the core to be fixed, not workarounds added to MAC
drivers. Please think about this when proposing future patches.

	Andrew
Choong Yong Liang Nov. 14, 2024, 4:35 a.m. UTC | #6
On 14/11/2024 5:48 am, Heiner Kallweit wrote:
> "relies on" may be the wrong term here. There's an API definition,
> and phy_ethtool_get_eee() takes care of the PHY-related kernel part,
> provided that the MAC driver uses phylib.
> I say "PHY-related part", because tx_lpi_timer is something relevant
> for the MAC only. Therefore phylib stores the master config timer value
> only, not the actual value.
> The MAC driver should populate tx_lpi_timer in the get_eee() callback,
> in addition to what phy_ethtool_get_eee() populates.
> This may result in the master config value being overwritten with actual
> value in cases where the MAC doesn't support the master config value.
> 
> One (maybe there are more) special case of tx_lpi_timer handling is
> Realtek chips, as they store the LPI timer in bytes. Means whenever
> the link speed changes, the actual timer value also changes implicitly.
> 
> Few values exist twice: As a master config value, and as status.
> struct phy_device has the status values:
> @eee_enabled: Flag indicating whether the EEE feature is enabled
> @enable_tx_lpi: When True, MAC should transmit LPI to PHY
> 
> And master config values are in struct eee_cfg:
> 
> struct eee_config {
> 	u32 tx_lpi_timer;
> 	bool tx_lpi_enabled;
> 	bool eee_enabled;
> };
> 
> And yes, it may be a little misleading that eee_enabled exists twice,
> you have to be careful which one you're referring to.
> 
> ethtool handles the master config values, only "active" is a status
> information.
> 
> So the MAC driver should:
> - provide a link change handler in e.g. phy_connect_direct()
> - this handler should:
>    - use phydev->enable_tx_lpi to set whether MAC transmits LPI or not
>    - use phydev->eee_cfg.tx_lpi_timer to set the timer (if the config
>      value is set)
> 
> Important note:
> This describes how MAC drivers *should* behave. Some don't get it right.
> So part of your confusion may be caused by misbehaving MAC drivers.
> One example of a MAC driver bug is what I wrote earlier about
> stmmac_ethtool_op_get_eee().
> 
> And what I write here refers to plain phylib, I don't cover phylink as
> additional layer.
> 

Thank you for your detailed explanation. It has been very helpful and has 
clarified how the code behaves.

Based on your and Andrew's input, I agree that phy_update_eee() is not needed.

I will ensure that our implementation follows these guidelines and will 
address any potential issues with misbehaving MAC drivers.

Thank you again for your valuable insights.
Choong Yong Liang Nov. 14, 2024, 4:37 a.m. UTC | #7
On 14/11/2024 7:05 am, Andrew Lunn wrote:

> tx_lpi_timer is a MAC property, but phylib does track it across
> --set-eee calls and will fill it in for get-eee. What however is
> missing it setting its default value. There is currently no API the
> MAC driver can call to let phylib know what default value it is using.
> Either such an API could be added, e.g. as part of phy_support_eee(),
> or we could hard code a value, probably again in phy_support_eee().
> 
> tx_lpi_enabled is filled in by phy_ethtool_get_eee(), and its default
> value is set by phy_support_eee(). So i don't see what is wrong here.
> 

Thank you for your detailed explanation. I will follow your suggestion to 
set the default value for tx_lpi_timer in phy_support_eee().

>> Currently, we are facing 3 issues:
>> 1. When we boot up our system and do not issue the 'ethtool --set-eee'
>> command, and then directly issue the 'ethtool --show-eee' command, it always
>> shows that EEE is disabled due to the eeecfg values not being set. However,
>> in the Maxliner GPY PHY, the driver EEE is enabled. If we try to disable
>> EEE, nothing happens because the eeecfg matches the setting required to
>> disable EEE in ethnl_set_eee(). The phy_support_eee() was introduced to set
>> the initial values to enable eee_enabled and tx_lpi_enabled. This would
>> allow 'ethtool --show-eee' to show that EEE is enabled during the initial
>> state. However, the Marvell PHY is designed to have hardware disabled EEE
>> during the initial state. Users are required to use Ethtool to enable the
>> EEE. phy_support_eee() does not show the correct for Marvell PHY.
> 
> We discussed what to set the initial state to when we reworked the EEE
> support. It is a hard problem, because changing anything could cause
> regressions. Some users don't want EEE enabled, because it can add
> latency and jitter, e.g. to PTP packets. Some users want it enabled
> for the power savings.
> 
> We decided to leave the PHY untouched, and will read out its
> configuration. If this is going wrong, that is a bug which should be
> found and fixed.
> 

I do agree with your point about leaving the PHY untouched and reading out 
its configuration as the default values in phy_support_eee() instead of 
setting the existing values to true for eee_enabled and tx_lpi_enabled.

> We want the core to be fixed, not workaround added to MAC
> drivers. Please think about this when proposing future patches.
> 
> 	Andrew
I will send separate small fix patches for each of these issues.
Thank you.
Russell King (Oracle) Nov. 14, 2024, 9:02 a.m. UTC | #8
On Wed, Nov 13, 2024 at 06:10:55PM +0800, Choong Yong Liang wrote:
> On 12/11/2024 9:04 pm, Andrew Lunn wrote:
> > On Tue, Nov 12, 2024 at 12:03:15PM +0100, Heiner Kallweit wrote:
> > > In stmmac_ethtool_op_get_eee() you have the following:
> > > 
> > > edata->tx_lpi_timer = priv->tx_lpi_timer;
> > > edata->tx_lpi_enabled = priv->tx_lpi_enabled;
> > > return phylink_ethtool_get_eee(priv->phylink, edata);
> > > 
> > > You have to call phylink_ethtool_get_eee() first, otherwise the manually
> > > set values will be overridden. However setting tx_lpi_enabled shouldn't
> > > be needed if you respect phydev->enable_tx_lpi.
> > 
> > I agree with Heiner here, this sounds like a bug somewhere, not
> > something which needs new code in phylib. Lets understand why it gives
> > the wrong results.
> > 
> > 	Andrew
> Hi Russell, Andrew, and Heiner, thanks a lot for your valuable feedback.
> 
> The current implementation of the 'ethtool --show-eee' command heavily
> relies on the phy_ethtool_get_eee() in phy.c. The eeecfg values are set by
> the 'ethtool --set-eee' command and the phy_support_eee() during the initial
> state. The phy_ethtool_get_eee() calls eeecfg_to_eee(), which returns the
> eeecfg containing tx_lpi_timer, tx_lpi_enabled, and eee_enable for the
> 'ethtool --show-eee' command.

These three members you mention are user configuration members.

> The tx_lpi_timer and tx_lpi_enabled values stored in the MAC or PHY driver
> are not retrieved by the 'ethtool --show-eee' command.

tx_lpi_timer is the only thing that the MAC driver should be concerned
with - it needs to program the MAC according to the timer value
specified. Whether LPI is enabled or not is determined by
phydev->enable_tx_lpi. The MAC should be using nothing else.

> Currently, we are facing 3 issues:
> 1. When we boot up our system and do not issue the 'ethtool --set-eee'
> command, and then directly issue the 'ethtool --show-eee' command, it always
> shows that EEE is disabled due to the eeecfg values not being set. However,
> in the Maxliner GPY PHY, the driver EEE is enabled.

So the software state is out of sync with the hardware state. This is a
bug in the GPY PHY driver.

If we look at the generic code, we can see that genphy_config_aneg()
calls __genphy_config_aneg() which then goes on to call
genphy_c45_an_config_eee_aneg(). genphy_c45_an_config_eee_aneg()
writes the current EEE configuration to the PHY.

Now if we look at gpy_config_aneg(), it doesn't do this. Therefore,
the GPY PHY is retaining its hardware state which is different from
the software state. This is wrong.

> 2. The 'ethtool --show-eee' command does not display the correct status,
> even if the link is down or the speed changes to one that does not support
> EEE.

"eee_enabled" means that the user has enabled EEE. It does not mean the
hardware is using EEE. It is a user configuration knob to turn EEE
on/off.

"eee_active" reports whether EEE has been negotiated, and thus will be
made use of.

There has been a lot of misinterpretation of the EEE API, and this is
one of them - some have thought that "eee_enabled" refers to whether EEE
has been negotiated, and "eee_active" means that the interface is
currently in low-power state. This is wrong.
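
The distinction can be made concrete with a small model. This is a
simplified sketch with made-up names and bit values, not the phylib
implementation: it assumes that "negotiated" means both ends advertise
EEE for the link mode that was actually resolved, and that disabling
the user knob withdraws the local advertisement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical link mode bits, for illustration only. */
#define MOCK_MODE_100TX		(1u << 0)
#define MOCK_MODE_1000T		(1u << 1)
#define MOCK_MODE_10T		(1u << 2)

struct mock_eee_state {
	bool eee_enabled;	/* user configuration knob */
	uint32_t adv;		/* modes we are willing to advertise EEE for */
	uint32_t lp_adv;	/* modes the link partner advertises EEE for */
	uint32_t link_mode;	/* resolved link mode bit, 0 when link is down */
};

/* "eee_active": EEE was negotiated for the current link. */
bool mock_eee_active(const struct mock_eee_state *s)
{
	uint32_t adv;

	assert(s);
	/* With the knob off, nothing is advertised. */
	adv = s->eee_enabled ? s->adv : 0;
	return (adv & s->lp_adv & s->link_mode) != 0;
}
```

In this model eee_enabled stays true when the link goes down or
resolves to a mode without EEE; only the value returned by
mock_eee_active() changes, which is the behaviour described above.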

> 3. The tx_lpi_timer in 'ethtool --show-eee' always shows 0 if we have not
> used 'ethtool --set-eee' to set the values, even though the driver sets
> different values.

The driver needs to set these when attaching the PHY.

> I appreciate Russell's point that eee_enabled is a user configuration bit,
> not a status bit. However, I am curious if tx_lpi_timer, tx_lpi_enabled, and
> other fields are also considered configuration bits.

tx_lpi_timer and tx_lpi_enabled are also user configuration.

> It does not specify which fields are configuration bits and which are status
> bits.

The documentation is in include/uapi/linux/ethtool.h:

 * @eee_active: Result of the eee auto negotiation.
 * @eee_enabled: EEE configured mode (enabled/disabled).
 * @tx_lpi_enabled: Whether the interface should assert its tx lpi, given
 *      that eee was negotiated.
 * @tx_lpi_timer: Time in microseconds the interface delays prior to asserting
 *      its tx lpi (after reaching 'idle' state). Effective only when eee
 *      was negotiated and tx_lpi_enabled was set.

and has been for a very long time, yet people in the past have
implemented it against the documentation, leading to the stupid
situation where using ethtool --set-eee on one network driver
works differently to another network driver. This has to stop,
and by implementing most of the logic for the interface in phylib,
it means there's less scope for misinterpretation.
Russell King (Oracle) Nov. 14, 2024, 9:12 a.m. UTC | #9
On Thu, Nov 14, 2024 at 09:02:47AM +0000, Russell King (Oracle) wrote:
> On Wed, Nov 13, 2024 at 06:10:55PM +0800, Choong Yong Liang wrote:
> > On 12/11/2024 9:04 pm, Andrew Lunn wrote:
> > > On Tue, Nov 12, 2024 at 12:03:15PM +0100, Heiner Kallweit wrote:
> > > > In stmmac_ethtool_op_get_eee() you have the following:
> > > > 
> > > > edata->tx_lpi_timer = priv->tx_lpi_timer;
> > > > edata->tx_lpi_enabled = priv->tx_lpi_enabled;
> > > > return phylink_ethtool_get_eee(priv->phylink, edata);
> > > > 
> > > > You have to call phylink_ethtool_get_eee() first, otherwise the manually
> > > > set values will be overridden. However setting tx_lpi_enabled shouldn't
> > > > be needed if you respect phydev->enable_tx_lpi.
> > > 
> > > I agree with Heiner here, this sounds like a bug somewhere, not
> > > something which needs new code in phylib. Lets understand why it gives
> > > the wrong results.
> > > 
> > > 	Andrew
> > Hi Russell, Andrew, and Heiner, thanks a lot for your valuable feedback.
> > 
> > The current implementation of the 'ethtool --show-eee' command heavily
> > relies on the phy_ethtool_get_eee() in phy.c. The eeecfg values are set by
> > the 'ethtool --set-eee' command and the phy_support_eee() during the initial
> > state. The phy_ethtool_get_eee() calls eeecfg_to_eee(), which returns the
> > eeecfg containing tx_lpi_timer, tx_lpi_enabled, and eee_enable for the
> > 'ethtool --show-eee' command.
> 
> These three members you mention are user configuration members.
> 
> > The tx_lpi_timer and tx_lpi_enabled values stored in the MAC or PHY driver
> > are not retrieved by the 'ethtool --show-eee' command.
> 
> tx_lpi_timer is the only thing that the MAC driver should be concerned
> with - it needs to program the MAC according to the timer value
> specified. Whether LPI is enabled or not is determined by
> phydev->enable_tx_lpi. The MAC should be using nothing else.
> 
> > Currently, we are facing 3 issues:
> > 1. When we boot up our system and do not issue the 'ethtool --set-eee'
> > command, and then directly issue the 'ethtool --show-eee' command, it always
> > shows that EEE is disabled due to the eeecfg values not being set. However,
> > in the Maxliner GPY PHY, the driver EEE is enabled.
> 
> So the software state is out of sync with the hardware state. This is a
> bug in the GPY PHY driver.
> 
> If we look at the generic code, we can see that genphy_config_aneg()
> calls __genphy_config_aneg() which then goes on to call
> genphy_c45_an_config_eee_aneg(). genphy_c45_an_config_eee_aneg()
> writes the current EEE configuration to the PHY.
> 
> Now if we look at gpy_config_aneg(), it doesn't do this. Therefore,
> the GPY PHY is retaining its hardware state which is different from
> the software state. This is wrong.

Also note that phy_probe() reads the current configuration from the
PHY. The supported mask is set via phydev->drv->get_features,
which calls genphy_c45_pma_read_abilities() via the GPY driver and
genphy_c45_read_eee_abilities().

phy_probe() then moves on to genphy_c45_read_eee_adv(), which reads
the advertisement mask. If the advertising mask is non-zero, then
EEE is set as enabled.

From your description, it sounds like this isn't working right, and
needs to be debugged. For example, is the PHY changing its EEE
advertisement between phy_probe() and when it is up and running?
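
As a debugging aid, the probe-time rule and the question above can be
written down as a tiny model. This is a hypothetical userspace sketch:
the structure and function names are invented for illustration, and the
values would have to come from reading the PHY at the two points in
time by whatever means the driver provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of what the PHY reports at one point in time. */
struct mock_eee_snapshot {
	uint32_t supported;	/* EEE abilities */
	uint32_t advertising;	/* EEE advertisement mask */
};

/* The rule described above: a non-zero advertisement means "enabled". */
bool mock_eee_enabled_from_adv(const struct mock_eee_snapshot *snap)
{
	assert(snap);
	return snap->advertising != 0;
}

/* True if the advertisement differs between probe and link up. */
bool mock_eee_adv_changed(const struct mock_eee_snapshot *at_probe,
			  const struct mock_eee_snapshot *at_link_up)
{
	assert(at_probe && at_link_up);
	return at_probe->advertising != at_link_up->advertising;
}
```

If the advertisement reads as zero at probe time but non-zero once the
link is up, software would record EEE as disabled while the hardware
advertises it, which would match the symptom reported for the GPY PHY.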
Russell King (Oracle) Nov. 14, 2024, 10:15 a.m. UTC | #10
On Thu, Nov 14, 2024 at 09:12:06AM +0000, Russell King (Oracle) wrote:
> On Thu, Nov 14, 2024 at 09:02:47AM +0000, Russell King (Oracle) wrote:
> > On Wed, Nov 13, 2024 at 06:10:55PM +0800, Choong Yong Liang wrote:
> > > On 12/11/2024 9:04 pm, Andrew Lunn wrote:
> > > > On Tue, Nov 12, 2024 at 12:03:15PM +0100, Heiner Kallweit wrote:
> > > > > In stmmac_ethtool_op_get_eee() you have the following:
> > > > > 
> > > > > edata->tx_lpi_timer = priv->tx_lpi_timer;
> > > > > edata->tx_lpi_enabled = priv->tx_lpi_enabled;
> > > > > return phylink_ethtool_get_eee(priv->phylink, edata);
> > > > > 
> > > > > You have to call phylink_ethtool_get_eee() first, otherwise the manually
> > > > > set values will be overridden. However setting tx_lpi_enabled shouldn't
> > > > > be needed if you respect phydev->enable_tx_lpi.
> > > > 
> > > > I agree with Heiner here, this sounds like a bug somewhere, not
> > > > > something which needs new code in phylib. Let's understand why it gives
> > > > the wrong results.
> > > > 
> > > > 	Andrew
> > > Hi Russell, Andrew, and Heiner, thanks a lot for your valuable feedback.
> > > 
> > > The current implementation of the 'ethtool --show-eee' command heavily
> > > relies on the phy_ethtool_get_eee() in phy.c. The eeecfg values are set by
> > > the 'ethtool --set-eee' command and the phy_support_eee() during the initial
> > > state. The phy_ethtool_get_eee() calls eeecfg_to_eee(), which returns the
> > > eeecfg containing tx_lpi_timer, tx_lpi_enabled, and eee_enable for the
> > > 'ethtool --show-eee' command.
> > 
> > These three members you mention are user configuration members.
> > 
> > > The tx_lpi_timer and tx_lpi_enabled values stored in the MAC or PHY driver
> > > are not retrieved by the 'ethtool --show-eee' command.
> > 
> > tx_lpi_timer is the only thing that the MAC driver should be concerned
> > with - it needs to program the MAC according to the timer value
> > specified. Whether LPI is enabled or not is determined by
> > phydev->enable_tx_lpi. The MAC should be using nothing else.
> > 
> > > Currently, we are facing 3 issues:
> > > 1. When we boot up our system and do not issue the 'ethtool --set-eee'
> > > command, and then directly issue the 'ethtool --show-eee' command, it always
> > > shows that EEE is disabled due to the eeecfg values not being set. However,
> > > > in the MaxLinear GPY PHY driver, EEE is enabled.
> > 
> > So the software state is out of sync with the hardware state. This is a
> > bug in the GPY PHY driver.
> > 
> > If we look at the generic code, we can see that genphy_config_aneg()
> > calls __genphy_config_aneg() which then goes on to call
> > genphy_c45_an_config_eee_aneg(). genphy_c45_an_config_eee_aneg()
> > writes the current EEE configuration to the PHY.
> > 
> > Now if we look at gpy_config_aneg(), it doesn't do this. Therefore,
> > the GPY PHY is retaining its hardware state which is different from
> > the software state. This is wrong.
> 
> Also note that phy_probe() reads the current configuration from the
> PHY. The supported mask is set via phydev->drv->get_features,
> which calls genphy_c45_pma_read_abilities() via the GPY driver and
> genphy_c45_read_eee_abilities().
> 
> phy_probe() then moves on to genphy_c45_read_eee_adv(), which reads
> the advertisement mask. If the advertising mask is non-zero, then
> EEE is set as enabled.
> 
> From your description, it sounds like this isn't working right, and
> needs to be debugged. For example, is the PHY changing its EEE
> advertisement between phy_probe() and when it is up and running?

For the benefit of this thread - phylib definitely has a problem. It's
got one too many things called eee_enabled, leading to confusion about
which should be used. I've now built this patch, and am testing it
with a Marvell PHY.

Without a call to phy_support_eee():

EEE settings for eth2:
        EEE status: disabled
        Tx LPI: disabled
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
        Advertised EEE link modes:  Not reported
        Link partner advertised EEE link modes:  100baseT/Full
                                                 1000baseT/Full

With a call to phy_support_eee():

EEE settings for eth2:
        EEE status: enabled - active
        Tx LPI: 0 (us)
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
        Advertised EEE link modes:  100baseT/Full
                                    1000baseT/Full
        Link partner advertised EEE link modes:  100baseT/Full
                                                 1000baseT/Full

So the EEE status is now behaving correctly, and the Marvell PHY is
being programmed with the advertisement correctly.
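
The "one too many eee_enabled" problem can be illustrated with a small userspace model. It is a sketch, not phylib code: struct model_phy and the model_*() helpers are hypothetical names, and the only assumption carried over from the thread is that the report path reads eee_cfg.eee_enabled while the old probe path wrote phydev->eee_enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; only the fields relevant to the discussion. */
struct model_eee_cfg {
	bool eee_enabled;  /* user configuration, read when reporting EEE */
};

struct model_phy {
	bool eee_enabled;  /* legacy flag, the one removed by the diff */
	struct model_eee_cfg eee_cfg;
	bool adv_nonempty; /* whether the PHY advertises any EEE mode */
};

/* Old behaviour: probe records the state in the legacy flag only. */
static void model_probe_old(struct model_phy *phy)
{
	assert(phy != NULL);
	phy->eee_enabled = phy->adv_nonempty;
}

/* New behaviour: probe records the state in eee_cfg. */
static void model_probe_new(struct model_phy *phy)
{
	assert(phy != NULL);
	phy->eee_cfg.eee_enabled = phy->adv_nonempty;
}

/* The value the report path returns in both cases. */
static bool model_show_eee(const struct model_phy *phy)
{
	assert(phy != NULL);
	return phy->eee_cfg.eee_enabled;
}
```

With a PHY that advertises EEE, model_probe_old() leaves model_show_eee() returning false, while model_probe_new() makes it return true, since only one flag is left to keep in sync.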

diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
index c1b3576c307f..2d64d3f293e5 100644
--- a/drivers/net/phy/phy-c45.c
+++ b/drivers/net/phy/phy-c45.c
@@ -943,7 +943,7 @@ EXPORT_SYMBOL_GPL(genphy_c45_read_eee_abilities);
  */
 int genphy_c45_an_config_eee_aneg(struct phy_device *phydev)
 {
-	if (!phydev->eee_enabled) {
+	if (!phydev->eee_cfg.eee_enabled) {
 		__ETHTOOL_DECLARE_LINK_MODE_MASK(adv) = {};
 
 		return genphy_c45_write_eee_adv(phydev, adv);
@@ -1576,8 +1576,6 @@ int genphy_c45_ethtool_set_eee(struct phy_device *phydev,
 		}
 	}
 
-	phydev->eee_enabled = data->eee_enabled;
-
 	ret = genphy_c45_an_config_eee_aneg(phydev);
 	if (ret > 0) {
 		ret = phy_restart_aneg(phydev);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index bc24c9f2786b..b26bb33cd1d4 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -3589,12 +3589,12 @@ static int phy_probe(struct device *dev)
 	/* There is no "enabled" flag. If PHY is advertising, assume it is
 	 * kind of enabled.
 	 */
-	phydev->eee_enabled = !linkmode_empty(phydev->advertising_eee);
+	phydev->eee_cfg.eee_enabled = !linkmode_empty(phydev->advertising_eee);
 
 	/* Some PHYs may advertise, by default, not support EEE modes. So,
 	 * we need to clean them.
 	 */
-	if (phydev->eee_enabled)
+	if (phydev->eee_cfg.eee_enabled)
 		linkmode_and(phydev->advertising_eee, phydev->supported_eee,
 			     phydev->advertising_eee);
 
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 1e4127c495c0..33905e9672a7 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -601,7 +601,6 @@ struct macsec_ops;
  * @adv_old: Saved advertised while power saving for WoL
  * @supported_eee: supported PHY EEE linkmodes
  * @advertising_eee: Currently advertised EEE linkmodes
- * @eee_enabled: Flag indicating whether the EEE feature is enabled
  * @enable_tx_lpi: When True, MAC should transmit LPI to PHY
  * @eee_cfg: User configuration of EEE
  * @lp_advertising: Current link partner advertised linkmodes
@@ -721,7 +720,6 @@ struct phy_device {
 	/* used for eee validation and configuration*/
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(supported_eee);
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(advertising_eee);
-	bool eee_enabled;
 
 	/* Host supported PHY interface types. Should be ignored if empty. */
 	DECLARE_PHY_INTERFACE_MASK(host_interfaces);

Patch

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 499797646580..94dadf011ca6 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -3016,6 +3016,30 @@  void phy_support_eee(struct phy_device *phydev)
 }
 EXPORT_SYMBOL(phy_support_eee);
 
+/**
+ * phy_update_eee - Update the Energy Efficient Ethernet (EEE) settings
+ * @phydev: target phy_device struct
+ * @tx_lpi_enabled: boolean indicating if Low Power Idle (LPI) for
+ * transmission is enabled.
+ * @eee_enabled: boolean indicating if Energy Efficient Ethernet (EEE) is
+ * enabled.
+ * @tx_lpi_timer: the Low Power Idle (LPI) timer value (in microseconds) for
+ * transmission.
+ *
+ * Description:
+ * This function updates the Energy Efficient Ethernet (EEE) settings for the
+ * specified PHY device. It is typically called during link up and down events
+ * to configure the EEE parameters according to the current link state.
+ */
+void phy_update_eee(struct phy_device *phydev, bool tx_lpi_enabled,
+		    bool eee_enabled, u32 tx_lpi_timer)
+{
+	phydev->eee_cfg.tx_lpi_enabled = tx_lpi_enabled;
+	phydev->eee_cfg.eee_enabled = eee_enabled;
+	phydev->eee_cfg.tx_lpi_timer = tx_lpi_timer;
+}
+EXPORT_SYMBOL(phy_update_eee);
+
 /**
  * phy_support_sym_pause - Enable support of symmetrical pause
  * @phydev: target phy_device struct
diff --git a/include/linux/phy.h b/include/linux/phy.h
index a98bc91a0cde..6c300ba47a2d 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -2004,6 +2004,8 @@  void phy_advertise_eee_all(struct phy_device *phydev);
 void phy_support_sym_pause(struct phy_device *phydev);
 void phy_support_asym_pause(struct phy_device *phydev);
 void phy_support_eee(struct phy_device *phydev);
+void phy_update_eee(struct phy_device *phydev, bool tx_lpi_enabled,
+		    bool eee_enabled, u32 tx_lpi_timer);
 void phy_set_sym_pause(struct phy_device *phydev, bool rx, bool tx,
 		       bool autoneg);
 void phy_set_asym_pause(struct phy_device *phydev, bool rx, bool tx);
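
A userspace sketch of how a MAC driver might call the proposed helper from its link handlers, assuming the same three assignments as phy_update_eee() above. The struct layout is a simplified stand-in, and model_mac_link_up() and model_mac_link_down() are hypothetical callers, not stmmac or phylink code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the three eee_cfg members the helper writes. */
struct model_eee_cfg {
	bool tx_lpi_enabled;
	bool eee_enabled;
	uint32_t tx_lpi_timer; /* microseconds */
};

struct model_phy {
	struct model_eee_cfg eee_cfg;
};

/* Same three assignments as phy_update_eee() in the patch. */
static void model_phy_update_eee(struct model_phy *phy, bool tx_lpi_enabled,
				 bool eee_enabled, uint32_t tx_lpi_timer)
{
	assert(phy != NULL);
	phy->eee_cfg.tx_lpi_enabled = tx_lpi_enabled;
	phy->eee_cfg.eee_enabled = eee_enabled;
	phy->eee_cfg.tx_lpi_timer = tx_lpi_timer;
}

/* Hypothetical link-up handler: report whether EEE is active at the
 * negotiated speed.
 */
static void model_mac_link_up(struct model_phy *phy, bool eee_active,
			      uint32_t timer_us)
{
	model_phy_update_eee(phy, eee_active, eee_active, timer_us);
}

/* Hypothetical link-down handler: EEE cannot be active without a link. */
static void model_mac_link_down(struct model_phy *phy, uint32_t timer_us)
{
	model_phy_update_eee(phy, false, false, timer_us);
}
```

In this model, a link-down event clears eee_enabled and tx_lpi_enabled even if the user had enabled them. That is the concern raised in the review: these members hold user configuration, so writing link status into them discards what the user asked for.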