mbox series

[RFC/RFTv3,00/24] net: ethernet: Rework EEE

Message ID 20230331005518.2134652-1-andrew@lunn.ch (mailing list archive)
Headers show
Series net: ethernet: Rework EEE | expand

Message

Andrew Lunn March 31, 2023, 12:54 a.m. UTC
Most MAC drivers get EEE wrong. The API to the PHY is not very
obvious, which is probably why. Rework the API, pushing most of the
EEE handling into phylib core, leaving the MAC drivers to just
enable/disable support for EEE in there change_link call back, or
phylink mac_link_up callback.

MAC drivers are now expect to indicate to phylib/phylink if they
support EEE. If not, no EEE link modes are advertised. If the MAC does
support EEE, on phy_start()/phylink_start() EEE advertisement is
configured.

v3
--
Rework phylink code to add a new callback.
Rework function to indicate clock should be stopped during LPI

Andrew Lunn (24):
  net: phy: Add phydev->eee_active to simplify adjust link callbacks
  net: phylink: Add mac_set_eee() callback
  net: phy: Add helper to set EEE Clock stop enable bit
  net: phy: Keep track of EEE tx_lpi_enabled
  net: phy: Immediately call adjust_link if only tx_lpi_enabled changes
  net: phylink: Handle change in EEE from phylib
  net: marvell: mvneta: Simplify EEE configuration
  net: stmmac: Drop usage of phy_init_eee()
  net: stmmac: Simplify ethtool get eee
  net: lan743x: Fixup EEE
  net: fec: Move fec_enet_eee_mode_set() and helper earlier
  net: FEC: Fixup EEE
  net: genet: Fixup EEE
  net: sxgdb: Fixup EEE
  net: dsa: mt7530: Swap to using phydev->eee_active
  net: dsa: b53: Swap to using phydev->eee_active
  net: phylink: Remove unused phylink_init_eee()
  net: phy: remove unused phy_init_eee()
  net: usb: lan78xx: Fixup EEE
  net: phy: Add phy_support_eee() indicating MAC support EEE
  net: phylink: Add MAC_EEE to mac_capabilites
  net: phylink: Extend mac_capabilities in MAC drivers which support EEE
  net: phylib: call phy_support_eee() in MAC drivers which support EEE
  net: phy: Disable EEE advertisement by default

 drivers/net/dsa/b53/b53_common.c              |  5 +-
 drivers/net/dsa/mt7530.c                      |  2 +-
 .../net/ethernet/broadcom/genet/bcmgenet.c    | 42 +++------
 .../net/ethernet/broadcom/genet/bcmgenet.h    |  3 +-
 drivers/net/ethernet/broadcom/genet/bcmmii.c  |  3 +
 drivers/net/ethernet/freescale/fec_main.c     | 88 ++++++++-----------
 drivers/net/ethernet/marvell/mvneta.c         | 28 +++---
 .../net/ethernet/microchip/lan743x_ethtool.c  | 22 -----
 drivers/net/ethernet/microchip/lan743x_main.c |  9 ++
 .../net/ethernet/samsung/sxgbe/sxgbe_common.h |  3 -
 .../ethernet/samsung/sxgbe/sxgbe_ethtool.c    | 21 +----
 .../net/ethernet/samsung/sxgbe/sxgbe_main.c   | 39 +++-----
 drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  1 -
 .../ethernet/stmicro/stmmac/stmmac_ethtool.c  |  7 --
 .../net/ethernet/stmicro/stmmac/stmmac_main.c | 10 ++-
 drivers/net/phy/phy-c45.c                     | 35 +++++++-
 drivers/net/phy/phy-core.c                    | 11 +++
 drivers/net/phy/phy.c                         | 83 ++++++++++-------
 drivers/net/phy/phy_device.c                  | 37 ++++----
 drivers/net/phy/phylink.c                     | 47 +++++-----
 drivers/net/usb/lan78xx.c                     | 44 +++++-----
 include/linux/phy.h                           | 11 ++-
 include/linux/phylink.h                       | 62 ++++++++-----
 include/uapi/linux/mdio.h                     |  1 +
 net/dsa/port.c                                |  3 +
 25 files changed, 308 insertions(+), 309 deletions(-)

Comments

Oleksij Rempel May 26, 2023, 8:56 a.m. UTC | #1
Hi Andrew,

On Fri, Mar 31, 2023 at 02:54:54AM +0200, Andrew Lunn wrote:
> Most MAC drivers get EEE wrong. The API to the PHY is not very
> obvious, which is probably why. Rework the API, pushing most of the
> EEE handling into phylib core, leaving the MAC drivers to just
> enable/disable support for EEE in there change_link call back, or
> phylink mac_link_up callback.
> 
> MAC drivers are now expect to indicate to phylib/phylink if they
> support EEE. If not, no EEE link modes are advertised. If the MAC does
> support EEE, on phy_start()/phylink_start() EEE advertisement is
> configured.
> 
> v3
> --
 
I was able to test some drivers and things seems to work ok so far. Do you
need more tests for a non RFC version?

Regards,
Oleksij
Andrew Lunn May 26, 2023, 12:08 p.m. UTC | #2
On Fri, May 26, 2023 at 10:56:04AM +0200, Oleksij Rempel wrote:
> Hi Andrew,
> 
> On Fri, Mar 31, 2023 at 02:54:54AM +0200, Andrew Lunn wrote:
> > Most MAC drivers get EEE wrong. The API to the PHY is not very
> > obvious, which is probably why. Rework the API, pushing most of the
> > EEE handling into phylib core, leaving the MAC drivers to just
> > enable/disable support for EEE in there change_link call back, or
> > phylink mac_link_up callback.
> > 
> > MAC drivers are now expect to indicate to phylib/phylink if they
> > support EEE. If not, no EEE link modes are advertised. If the MAC does
> > support EEE, on phy_start()/phylink_start() EEE advertisement is
> > configured.
> > 
> > v3
> > --
>  
> I was able to test some drivers and things seems to work ok so far. Do you
> need more tests for a non RFC version?

No, i just need time to rebase and post them. Plus check if there are
more drivers which added support for EEE and fix them up. There is a
new Broadcom driver which i think will need work.

Hopefully next week i can do this.

	  Andrew
Florian Fainelli May 26, 2023, 4:13 p.m. UTC | #3
On 5/26/23 05:08, Andrew Lunn wrote:
> On Fri, May 26, 2023 at 10:56:04AM +0200, Oleksij Rempel wrote:
>> Hi Andrew,
>>
>> On Fri, Mar 31, 2023 at 02:54:54AM +0200, Andrew Lunn wrote:
>>> Most MAC drivers get EEE wrong. The API to the PHY is not very
>>> obvious, which is probably why. Rework the API, pushing most of the
>>> EEE handling into phylib core, leaving the MAC drivers to just
>>> enable/disable support for EEE in there change_link call back, or
>>> phylink mac_link_up callback.
>>>
>>> MAC drivers are now expect to indicate to phylib/phylink if they
>>> support EEE. If not, no EEE link modes are advertised. If the MAC does
>>> support EEE, on phy_start()/phylink_start() EEE advertisement is
>>> configured.
>>>
>>> v3
>>> --
>>   
>> I was able to test some drivers and things seems to work ok so far. Do you
>> need more tests for a non RFC version?
> 
> No, i just need time to rebase and post them. Plus check if there are
> more drivers which added support for EEE and fix them up. There is a
> new Broadcom driver which i think will need work.

The Broadcom ASP driver should have EEE stripped off as we just found 
out that the MAC does not drive the PHY signals properly yet. This 
should allow your series to proceed through, without having to care 
about that particular driver.

Thanks for doing this work Andrew!
Florian Fainelli May 30, 2023, 6:31 p.m. UTC | #4
Hi Andrew, Russell,

On 3/30/23 17:54, Andrew Lunn wrote:
> Most MAC drivers get EEE wrong. The API to the PHY is not very
> obvious, which is probably why. Rework the API, pushing most of the
> EEE handling into phylib core, leaving the MAC drivers to just
> enable/disable support for EEE in there change_link call back, or
> phylink mac_link_up callback.
> 
> MAC drivers are now expect to indicate to phylib/phylink if they
> support EEE. If not, no EEE link modes are advertised. If the MAC does
> support EEE, on phy_start()/phylink_start() EEE advertisement is
> configured.

Thanks for doing this work, because it really is a happy mess out there. 
A few questions as I have been using mvneta as the reference for fixing 
GENET and its shortcomings.

In your new patches the decision to enable EEE is purely based upon the 
eee_active boolean and not eee_enabled && tx_lpi_enabled unlike what 
mvneta useed to do.

Russell, is there an use case for having eee_enabled while not having 
tx_lpi_enabled?

With the candidate patch attached, I have the following behavior on boot 
with no specific ethtool operation:

# ethtool --show-eee eth0
EEE Settings for eth0:
         EEE status: enabled - active
         Tx LPI: disabled
         Supported EEE link modes:  100baseT/Full
                                    1000baseT/Full
         Advertised EEE link modes:  100baseT/Full
                                     1000baseT/Full
         Link partner advertised EEE link modes:  100baseT/Full
                                                  1000baseT/Full
#

however EEE is not *really* enabled at the hardware level unless I do 
another:

# ethtool --set-eee eth0 tx-lpi on
# ethtool --show-eee eth0
EEE Settings for eth0:
         EEE status: enabled - active
         Tx LPI: 34 (us)
         Supported EEE link modes:  100baseT/Full
                                    1000baseT/Full
         Advertised EEE link modes:  100baseT/Full
                                     1000baseT/Full
         Link partner advertised EEE link modes:  100baseT/Full
                                                  1000baseT/Full

We have a global EEE enable bit which is described as "If set, the TX 
LPI policy control engine is enabled and the MAC inserts LPI_idle codes 
if the link is idle", so it would seem to me that we should require 
users to set both "eee on" and "tx-lpi on" in their ethtool command, but 
then unless we intentionally override tx_lpi_enabled in the driver upon 
probe, there will be any automagical configuration happening, and no 
power savings being realized.

Do you have any recommendations on what drivers should do? As you could 
see from the need of this patch series, we already all created our own 
little schisms of the cargo cult.

Thanks!
Russell King (Oracle) May 30, 2023, 7:35 p.m. UTC | #5
On Tue, May 30, 2023 at 11:31:04AM -0700, Florian Fainelli wrote:
> Hi Andrew, Russell,
> 
> On 3/30/23 17:54, Andrew Lunn wrote:
> > Most MAC drivers get EEE wrong. The API to the PHY is not very
> > obvious, which is probably why. Rework the API, pushing most of the
> > EEE handling into phylib core, leaving the MAC drivers to just
> > enable/disable support for EEE in there change_link call back, or
> > phylink mac_link_up callback.
> > 
> > MAC drivers are now expect to indicate to phylib/phylink if they
> > support EEE. If not, no EEE link modes are advertised. If the MAC does
> > support EEE, on phy_start()/phylink_start() EEE advertisement is
> > configured.
> 
> Thanks for doing this work, because it really is a happy mess out there. A
> few questions as I have been using mvneta as the reference for fixing GENET
> and its shortcomings.
> 
> In your new patches the decision to enable EEE is purely based upon the
> eee_active boolean and not eee_enabled && tx_lpi_enabled unlike what mvneta
> useed to do.
> 
> Russell, is there an use case for having eee_enabled while not having
> tx_lpi_enabled?

As I've been stating recently, LPI is entirely to do with the MAC, so
the LPI delay and the enable boolean are about controlling the MAC end
of things.

I've never really understood what "eee_enabled" is supposed to be doing,
since there's nowhere really to enable or disable EEE per-se. Through
recent patches, phylib has since defined "eee_enabled" to mean that the
advertisement is non-empty, which came as a surprise to me, but again
seems rather redundant and strange.

It's strange because as far as I can see from what documentation there
is, "eee_enabled" is supposed to be a control that users can set
independently of the advertisement, but with the phylib implementation,
eee_enabled read from get_eee() as I say is defined as whether the
advertisement is non-zero or not.

With fibre setups, there is no EEE advertisement, so in that situation
phylib's interpretation of eee_enabled via get_eee/set_eee would be
very much incorrect. The only control one has is whether LPI is enabled
and its timer setting.

That said, the current mvneta implementation doesn't actually enable
LPI for fibre... but there is a bug in that one can get the MAC to
enable LPI via ethtool's set_eee() !

I think for a fibre setup, eee_enabled && tx_lpi_enabled is reasonable,
given that eee_enabled is a seperate control from the advertisement
which will be zero in that case.

Going back to phylib, given this, things get even more "fun" if you have
a dual-media PHY. As there's no EEE capability bits for 1000base-X, but
a 1000base-X PCS optionally supports EEE. So, even with a zero EEE
advertisement with a dual-media PHY that would only affect the copper
side, and EEE may still be possible in the fibre side... which makes
phylib's new interpretation of "eee_enabled" rather odd.

In any case, "eee_enabled" I don't think has much meaning for the fibre
case because there's definitely no control beyond what "tx_lpi_enabled"
already offers.

I think this is a classic case where the EEE interface has been designed
solely around copper without checking what the situation is for fibre!
Andrew Lunn May 30, 2023, 7:48 p.m. UTC | #6
On Tue, May 30, 2023 at 11:31:04AM -0700, Florian Fainelli wrote:
> Hi Andrew, Russell,
> 
> On 3/30/23 17:54, Andrew Lunn wrote:
> > Most MAC drivers get EEE wrong. The API to the PHY is not very
> > obvious, which is probably why. Rework the API, pushing most of the
> > EEE handling into phylib core, leaving the MAC drivers to just
> > enable/disable support for EEE in there change_link call back, or
> > phylink mac_link_up callback.
> > 
> > MAC drivers are now expect to indicate to phylib/phylink if they
> > support EEE. If not, no EEE link modes are advertised. If the MAC does
> > support EEE, on phy_start()/phylink_start() EEE advertisement is
> > configured.
> 
> Thanks for doing this work, because it really is a happy mess out there. A
> few questions as I have been using mvneta as the reference for fixing GENET
> and its shortcomings.
> 
> In your new patches the decision to enable EEE is purely based upon the
> eee_active boolean and not eee_enabled && tx_lpi_enabled unlike what mvneta
> useed to do.

I don't really care much what we decide means 'enabled'. I just want
it moved out of MAC drivers and into the core so it is consistent.

Russel, if you want to propose something which works for both Copper
and Fibre, i'm happy to implement it. But as you pointed out, we need
to decide where. Maybe phylib handles copper, and phylink is layered
on top and handles fibre?

	  Andrew
Russell King (Oracle) May 30, 2023, 7:49 p.m. UTC | #7
On Tue, May 30, 2023 at 08:35:55PM +0100, Russell King (Oracle) wrote:
> Going back to phylib, given this, things get even more "fun" if you have
> a dual-media PHY. As there's no EEE capability bits for 1000base-X, but
> a 1000base-X PCS optionally supports EEE. So, even with a zero EEE
> advertisement with a dual-media PHY that would only affect the copper
> side, and EEE may still be possible in the fibre side... which makes
> phylib's new interpretation of "eee_enabled" rather odd.
> 
> In any case, "eee_enabled" I don't think has much meaning for the fibre
> case because there's definitely no control beyond what "tx_lpi_enabled"
> already offers.
> 
> I think this is a classic case where the EEE interface has been designed
> solely around copper without checking what the situation is for fibre!

Let me be a bit more explicit on this. If one does (e.g.) this:

# ethtool --set-eee eth0 advertise 0 tx-lpi on tx-timer 100

with a dual-media PHY, if the MAC is programmed to enable LPI, the
dual-media PHY is linked via fibre, and the remote end supports fibre
EEE, phylib will force "eee" to "off" purely on the grounds that the
advertisement was empty.

If one looks at the man page for ethtool, it says:

           eee on|off
                  Enables/disables the device support of EEE.

What does that mean, exactly, and how is it different from:

           tx-lpi on|off
                  Determines whether the device should assert its Tx LPI.

since the only control at the MAC is whether LPI can be asserted or
not and what the timer is.

The only control at the PHY end of things is what the advertisement
is, if an advertisement even exists for the media type in use.

So, honestly, I don't get what this ethtool interface actually intends
the "eee_enabled" control to do.
Florian Fainelli May 30, 2023, 8:03 p.m. UTC | #8
On 5/30/23 12:48, Andrew Lunn wrote:
> On Tue, May 30, 2023 at 11:31:04AM -0700, Florian Fainelli wrote:
>> Hi Andrew, Russell,
>>
>> On 3/30/23 17:54, Andrew Lunn wrote:
>>> Most MAC drivers get EEE wrong. The API to the PHY is not very
>>> obvious, which is probably why. Rework the API, pushing most of the
>>> EEE handling into phylib core, leaving the MAC drivers to just
>>> enable/disable support for EEE in there change_link call back, or
>>> phylink mac_link_up callback.
>>>
>>> MAC drivers are now expect to indicate to phylib/phylink if they
>>> support EEE. If not, no EEE link modes are advertised. If the MAC does
>>> support EEE, on phy_start()/phylink_start() EEE advertisement is
>>> configured.
>>
>> Thanks for doing this work, because it really is a happy mess out there. A
>> few questions as I have been using mvneta as the reference for fixing GENET
>> and its shortcomings.
>>
>> In your new patches the decision to enable EEE is purely based upon the
>> eee_active boolean and not eee_enabled && tx_lpi_enabled unlike what mvneta
>> useed to do.
> 
> I don't really care much what we decide means 'enabled'. I just want
> it moved out of MAC drivers and into the core so it is consistent.

Understood this is slightly out of the scope of what you are doing which 
is to have an unified behavior, but we also have a shot at defining a 
correct behavior.

> 
> Russel, if you want to propose something which works for both Copper
> and Fibre, i'm happy to implement it. But as you pointed out, we need
> to decide where. Maybe phylib handles copper, and phylink is layered
> on top and handles fibre?
> 
> 	  Andrew
Russell King (Oracle) May 30, 2023, 8:28 p.m. UTC | #9
On Tue, May 30, 2023 at 09:48:00PM +0200, Andrew Lunn wrote:
> On Tue, May 30, 2023 at 11:31:04AM -0700, Florian Fainelli wrote:
> > Hi Andrew, Russell,
> > 
> > On 3/30/23 17:54, Andrew Lunn wrote:
> > > Most MAC drivers get EEE wrong. The API to the PHY is not very
> > > obvious, which is probably why. Rework the API, pushing most of the
> > > EEE handling into phylib core, leaving the MAC drivers to just
> > > enable/disable support for EEE in there change_link call back, or
> > > phylink mac_link_up callback.
> > > 
> > > MAC drivers are now expect to indicate to phylib/phylink if they
> > > support EEE. If not, no EEE link modes are advertised. If the MAC does
> > > support EEE, on phy_start()/phylink_start() EEE advertisement is
> > > configured.
> > 
> > Thanks for doing this work, because it really is a happy mess out there. A
> > few questions as I have been using mvneta as the reference for fixing GENET
> > and its shortcomings.
> > 
> > In your new patches the decision to enable EEE is purely based upon the
> > eee_active boolean and not eee_enabled && tx_lpi_enabled unlike what mvneta
> > useed to do.
> 
> I don't really care much what we decide means 'enabled'. I just want
> it moved out of MAC drivers and into the core so it is consistent.
> 
> Russel, if you want to propose something which works for both Copper
> and Fibre, i'm happy to implement it. But as you pointed out, we need
> to decide where. Maybe phylib handles copper, and phylink is layered
> on top and handles fibre?

Phylib also handles fibre too with dual-media PHYs (such as 88E151x
and 88X3310), and as I've just pointed out, the recent attempts at
"fixing" phylib's handling particularly with eee_enabled have made it
rather odd.

That said, the 88E151x resolution of 1000BASE-X negotiation is also
rather odd, particularly with pause modes. So I don't trust one bit
that anyone is even using 88E151x in fibre setups - or if they are
they don't care about this odd behaviour.

Before we go any further, I think we need to hammer out eactly how the
ethtool EEE interface is supposed to work, because right now I can't
say that I fully understand it - and as I've said in my replies to
Florian recently, phylib's EEE implementation becomes utterly silly
when it comes to fibre.

In particular, we need to hammer out what the difference exactly is
between "eee_enabled" and "tx_lpi_enabled", and what they control,
and I suggest we look at it from the point of view of both copper
(where EEE is negotiated) and fibre (were EEE is optional, no
capability bits, no negotiation, so no advertisement.)

It seems fairly obvious to me that tx_lpi* are about the MAC
configuration, since that's the entity which is responsible for
signalling LPI towards the PHY(PCS) over GMII.

eee_active... what does "active" actually mean? From the API doc, it
means the "Result of the eee negotiation" which is fine for copper
links where EEE is negotiated, but in the case of fibre, there isn't
EEE negotiation, and EEE is optionally implemented in the PCS.

eee_enabled... doesn't seem to have a meaning as far as IEEE 802.3
goes, it's a Linux invention. Documentation says "EEE configured mode"
which is just as useful as a chocolate teapot for making tea, that
comment might as well be deleted for what use it is. To this day, I
have no idea what this setting is actually supposed to be doing.
It seemed sane to me that if eee_enabled is false, then we should
not report eee_active as true, nor should we allow the MAC to
generate LPI. Whether the advertisement gets programmed into the PHY
or not is something I never thought about, and I can't remember
phylib's old behaviour. Modern phylib treats eee_enabled = false to
program a zero advertisement, which means when reading back via
get_eee(), you get a zero advertisement back. Effectively, eee_active
in modern phylib means "allow the advertisement to be programmed
if enabled, otherwise clear the advertisement".

If it's simply there to zero the advertisement, then what if the
media type has no capability for EEE advertisement, but intrinsically
supports EEE. That's where phylib's interpretation falls down IMHO.

Maybe this ethtool interface doesn't work very well for cases where
there is EEE ability but no EEE advertisement? Not sure.

Until we get that settled, we can't begin to fathom how phylib (or
phylink) should make a decision as to whether the MAC should signal
LPI towards the media or not.
Russell King (Oracle) May 30, 2023, 8:36 p.m. UTC | #10
On Tue, May 30, 2023 at 01:03:15PM -0700, Florian Fainelli wrote:
> On 5/30/23 12:48, Andrew Lunn wrote:
> > I don't really care much what we decide means 'enabled'. I just want
> > it moved out of MAC drivers and into the core so it is consistent.
> 
> Understood this is slightly out of the scope of what you are doing which is
> to have an unified behavior, but we also have a shot at defining a correct
> behavior.

Yes, having unified behaviour is good only if we don't end up boxing
ourselves into a corner when trying to unify it. That said, I suspect
many are just broken in one way or another.

What always makes me rather nervous is that trying to fix this kind of
thing will inevitably change the user visible behaviour, and then we
hope that that doesn't cause people's setups to regress.

So, IMHO, if we can decide what the correct behaviour and use should
be before we unify it, the better chance we stand at not having
differing behaviours from future kernel versions as we end up revising
the unified behaviour.

I hope that makes sense! :)
Florian Fainelli May 30, 2023, 11:03 p.m. UTC | #11
On 5/30/23 13:28, Russell King (Oracle) wrote:
> On Tue, May 30, 2023 at 09:48:00PM +0200, Andrew Lunn wrote:
>> On Tue, May 30, 2023 at 11:31:04AM -0700, Florian Fainelli wrote:
>>> Hi Andrew, Russell,
>>>
>>> On 3/30/23 17:54, Andrew Lunn wrote:
>>>> Most MAC drivers get EEE wrong. The API to the PHY is not very
>>>> obvious, which is probably why. Rework the API, pushing most of the
>>>> EEE handling into phylib core, leaving the MAC drivers to just
>>>> enable/disable support for EEE in there change_link call back, or
>>>> phylink mac_link_up callback.
>>>>
>>>> MAC drivers are now expect to indicate to phylib/phylink if they
>>>> support EEE. If not, no EEE link modes are advertised. If the MAC does
>>>> support EEE, on phy_start()/phylink_start() EEE advertisement is
>>>> configured.
>>>
>>> Thanks for doing this work, because it really is a happy mess out there. A
>>> few questions as I have been using mvneta as the reference for fixing GENET
>>> and its shortcomings.
>>>
>>> In your new patches the decision to enable EEE is purely based upon the
>>> eee_active boolean and not eee_enabled && tx_lpi_enabled unlike what mvneta
>>> useed to do.
>>
>> I don't really care much what we decide means 'enabled'. I just want
>> it moved out of MAC drivers and into the core so it is consistent.
>>
>> Russel, if you want to propose something which works for both Copper
>> and Fibre, i'm happy to implement it. But as you pointed out, we need
>> to decide where. Maybe phylib handles copper, and phylink is layered
>> on top and handles fibre?
> 
> Phylib also handles fibre too with dual-media PHYs (such as 88E151x
> and 88X3310), and as I've just pointed out, the recent attempts at
> "fixing" phylib's handling particularly with eee_enabled have made it
> rather odd.
> 
> That said, the 88E151x resolution of 1000BASE-X negotiation is also
> rather odd, particularly with pause modes. So I don't trust one bit
> that anyone is even using 88E151x in fibre setups - or if they are
> they don't care about this odd behaviour.
> 
> Before we go any further, I think we need to hammer out eactly how the
> ethtool EEE interface is supposed to work, because right now I can't
> say that I fully understand it - and as I've said in my replies to
> Florian recently, phylib's EEE implementation becomes utterly silly
> when it comes to fibre.
> 
> In particular, we need to hammer out what the difference exactly is
> between "eee_enabled" and "tx_lpi_enabled", and what they control,
> and I suggest we look at it from the point of view of both copper
> (where EEE is negotiated) and fibre (were EEE is optional, no
> capability bits, no negotiation, so no advertisement.)
> 
> It seems fairly obvious to me that tx_lpi* are about the MAC
> configuration, since that's the entity which is responsible for
> signalling LPI towards the PHY(PCS) over GMII.

Yes that much we can agree on that tx_lpi* is from the perspective of 
the MAC.

> 
> eee_active... what does "active" actually mean? From the API doc, it
> means the "Result of the eee negotiation" which is fine for copper
> links where EEE is negotiated, but in the case of fibre, there isn't
> EEE negotiation, and EEE is optionally implemented in the PCS.

Is there any way to feed back whether EEE is actually being used at the 
time on a fiber link? In which case eee_active would reflect that state.

> 
> eee_enabled... doesn't seem to have a meaning as far as IEEE 802.3
> goes, it's a Linux invention. Documentation says "EEE configured mode"
> which is just as useful as a chocolate teapot for making tea, that
> comment might as well be deleted for what use it is. To this day, I
> have no idea what this setting is actually supposed to be doing.
> It seemed sane to me that if eee_enabled is false, then we should
> not report eee_active as true, nor should we allow the MAC to
> generate LPI. Whether the advertisement gets programmed into the PHY
> or not is something I never thought about, and I can't remember
> phylib's old behaviour. Modern phylib treats eee_enabled = false to
> program a zero advertisement, which means when reading back via
> get_eee(), you get a zero advertisement back. Effectively, eee_active
> in modern phylib means "allow the advertisement to be programmed
> if enabled, otherwise clear the advertisement". >
> If it's simply there to zero the advertisement, then what if the
> media type has no capability for EEE advertisement, but intrinsically
> supports EEE. That's where phylib's interpretation falls down IMHO.
> 
> Maybe this ethtool interface doesn't work very well for cases where
> there is EEE ability but no EEE advertisement? Not sure.

At the time it was introduced, there certainly was not much care being 
given to fiber use cases.

> 
> Until we get that settled, we can't begin to fathom how phylib (or
> phylink) should make a decision as to whether the MAC should signal
> LPI towards the media or not.

Now that we have SmartEEE and AutogrEEEn as additional supported modes 
by certain PHY drivers for MACs that lack any LPI signaling capability 
towards the PHY, there may be a reason for re-purposing or clarifying 
the meaning of eee_enabled=true with tx_lpi_enabled=false. Although in 
those cases, I would just issue a warning that tx_lpi_enabled is ignored 
because the MAC is incapable of LPI signaling.

It seems that eee_enabled is intended to be a local administrative 
parameter that provides an additional gating level to the local & remote 
advertisement resolution. As a driver you can only enable EEE at the MAC 
and/or PHY level if local & remote & enabled = true.
Oleksij Rempel May 31, 2023, 7:14 a.m. UTC | #12
Hi Russell,

On Tue, May 30, 2023 at 08:49:50PM +0100, Russell King (Oracle) wrote:
> On Tue, May 30, 2023 at 08:35:55PM +0100, Russell King (Oracle) wrote:
> > Going back to phylib, given this, things get even more "fun" if you have
> > a dual-media PHY. As there's no EEE capability bits for 1000base-X, but
> > a 1000base-X PCS optionally supports EEE. So, even with a zero EEE
> > advertisement with a dual-media PHY that would only affect the copper
> > side, and EEE may still be possible in the fibre side... which makes
> > phylib's new interpretation of "eee_enabled" rather odd.
> > 
> > In any case, "eee_enabled" I don't think has much meaning for the fibre
> > case because there's definitely no control beyond what "tx_lpi_enabled"
> > already offers.
> > 
> > I think this is a classic case where the EEE interface has been designed
> > solely around copper without checking what the situation is for fibre!
> 
> Let me be a bit more explicit on this. If one does (e.g.) this:
> 
> # ethtool --set-eee eth0 advertise 0 tx-lpi on tx-timer 100
> 
> with a dual-media PHY, if the MAC is programmed to enable LPI, the
> dual-media PHY is linked via fibre, and the remote end supports fibre
> EEE, phylib will force "eee" to "off" purely on the grounds that the
> advertisement was empty.
> 
> If one looks at the man page for ethtool, it says:
> 
>            eee on|off
>                   Enables/disables the device support of EEE.
> 
> What does that mean, exactly, and how is it different from:
> 
>            tx-lpi on|off
>                   Determines whether the device should assert its Tx LPI.
> 
> since the only control at the MAC is whether LPI can be asserted or
> not and what the timer is.
> 
> The only control at the PHY end of things is what the advertisement
> is, if an advertisement even exists for the media type in use.
> 
> So, honestly, I don't get what this ethtool interface actually intends
> the "eee_enabled" control to do.

Thank you for your insightful observations on the EEE interface and its
related complexities, particularly in the case of fiber interfaces.

Your comments regarding the functionality of eee_enabled and
tx_lpi_enabled commands have sparked a good amount of thought on the
topic. Based on my understanding and observations, I've put together a
table that outlines the interactions between these commands, and their
influence on the MAC LPI status, PHY EEE advertisement, and the overall
EEE status on the link level.

For Copper assuming link partner advertise EEE as well:
+------+--------+------------+----------------+--------------------------------+---------------------------------+
| eee  | tx-lpi | advertise  | MAC LPI Status | PHY EEE Advertisement Status  | EEE Status on Link Level        |
+------+--------+------------+----------------+--------------------------------+---------------------------------+
| on   | on     |   !=0      | Enabled        | Advertise EEE for supported   | EEE enabled for supported       |
|      |        |            |                | speeds                        | speeds (Full EEE operation)     |
| on   | off    |   !=0      | Disabled       | Advertise EEE for supported   | EEE enabled for RX, disabled    |
|      |        |            |                | speeds                        | for TX (Partial EEE operation)  |
| off  | on     |   !=0      | Disabled       | No EEE advertisement          | EEE disabled                    |
| off  | off    |   !=0      | Disabled       | No EEE advertisement          | EEE disabled                    |
| on   | on     |    0       | Enabled        | No EEE advertisement          | EEE TX enabled, RX depends on   |
|      |        |            |                |                               | link partner                    |
| on   | off    |    0       | Disabled       | No EEE advertisement          | EEE disabled                    |
| off  | on     |    0       | Disabled       | No EEE advertisement          | EEE disabled                    |
| off  | off    |    0       | Disabled       | No EEE advertisement          | EEE disabled                    |
+------+--------+------------+----------------+--------------------------------+---------------------------------+

For Fiber:
+-----------+-----------+-----------------+---------------------+-------------------------+
|     eee   |   tx-lpi  | PHY EEE Adv.    | MAC LPI Status      | EEE Status on Link Level|
+-----------+-----------+-----------------+---------------------+-------------------------+
|     on    |     on    |         NA      | Enabled             | EEE supported           |
|     on    |     off   |         NA      | Disabled            | EEE not supported       |
|     off   |     on    |         NA      | Disabled            | EEE not supported       |
|     off   |     off   |         NA      | Disabled            | EEE not supported       |
+-----------+-----------+-----------------+---------------------+-------------------------+

In my perspective, eee_enabled serves as a global administrative control for
all EEE properties, including PHY EEE advertisement and MAC LPI status. When
EEE is turned off (eee_enabled = off), both PHY EEE advertisement and MAC LPI
status should be disabled, regardless of the tx_lpi_enabled setting.

On the other hand, advertise retains the EEE advertisement configuration, even
when EEE is turned off. This way, users can temporarily disable EEE without
losing their specific advertisement settings, which can then be reinstated when
EEE is turned back on.

In the context of fiber interfaces, where there is no concept of advertisement,
the eee and tx-lpi commands may appear redundant. However, maintaining both
commands could offer consistency across different media types in the ethtool
interface.

Regards,
Oleksij
Russell King (Oracle) May 31, 2023, 2:41 p.m. UTC | #13
On Wed, May 31, 2023 at 09:14:19AM +0200, Oleksij Rempel wrote:
> Hi Russell,
> 
> On Tue, May 30, 2023 at 08:49:50PM +0100, Russell King (Oracle) wrote:
> > On Tue, May 30, 2023 at 08:35:55PM +0100, Russell King (Oracle) wrote:
> > > Going back to phylib, given this, things get even more "fun" if you have
> > > a dual-media PHY. As there's no EEE capability bits for 1000base-X, but
> > > a 1000base-X PCS optionally supports EEE. So, even with a zero EEE
> > > advertisement with a dual-media PHY that would only affect the copper
> > > side, and EEE may still be possible in the fibre side... which makes
> > > phylib's new interpretation of "eee_enabled" rather odd.
> > > 
> > > In any case, "eee_enabled" I don't think has much meaning for the fibre
> > > case because there's definitely no control beyond what "tx_lpi_enabled"
> > > already offers.
> > > 
> > > I think this is a classic case where the EEE interface has been designed
> > > solely around copper without checking what the situation is for fibre!
> > 
> > Let me be a bit more explicit on this. If one does (e.g.) this:
> > 
> > # ethtool --set-eee eth0 advertise 0 tx-lpi on tx-timer 100
> > 
> > with a dual-media PHY, if the MAC is programmed to enable LPI, the
> > dual-media PHY is linked via fibre, and the remote end supports fibre
> > EEE, phylib will force "eee" to "off" purely on the grounds that the
> > advertisement was empty.
> > 
> > If one looks at the man page for ethtool, it says:
> > 
> >            eee on|off
> >                   Enables/disables the device support of EEE.
> > 
> > What does that mean, exactly, and how is it different from:
> > 
> >            tx-lpi on|off
> >                   Determines whether the device should assert its Tx LPI.
> > 
> > since the only control at the MAC is whether LPI can be asserted or
> > not and what the timer is.
> > 
> > The only control at the PHY end of things is what the advertisement
> > is, if an advertisement even exists for the media type in use.
> > 
> > So, honestly, I don't get what this ethtool interface actually intends
> > the "eee_enabled" control to do.
> 
> Thank you for your insightful observations on the EEE interface and its
> related complexities, particularly in the case of fiber interfaces.
> 
> Your comments regarding the functionality of eee_enabled and
> tx_lpi_enabled commands have sparked a good amount of thought on the
> topic. Based on my understanding and observations, I've put together a
> table that outlines the interactions between these commands, and their
> influence on the MAC LPI status, PHY EEE advertisement, and the overall
> EEE status on the link level.
> 
> For Copper assuming link partner advertise EEE as well:
> +------+--------+------------+----------------+--------------------------------+---------------------------------+
> | eee  | tx-lpi | advertise  | MAC LPI Status | PHY EEE Advertisement Status  | EEE Status on Link Level        |
> +------+--------+------------+----------------+--------------------------------+---------------------------------+
> | on   | on     |   !=0      | Enabled        | Advertise EEE for supported   | EEE enabled for supported       |
> |      |        |            |                | speeds                        | speeds (Full EEE operation)     |
> | on   | off    |   !=0      | Disabled       | Advertise EEE for supported   | EEE enabled for RX, disabled    |
> |      |        |            |                | speeds                        | for TX (Partial EEE operation)  |
> | off  | on     |   !=0      | Disabled       | No EEE advertisement          | EEE disabled                    |
> | off  | off    |   !=0      | Disabled       | No EEE advertisement          | EEE disabled                    |
> | on   | on     |    0       | Enabled        | No EEE advertisement          | EEE TX enabled, RX depends on   |
> |      |        |            |                |                               | link partner                    |
> | on   | off    |    0       | Disabled       | No EEE advertisement          | EEE disabled                    |
> | off  | on     |    0       | Disabled       | No EEE advertisement          | EEE disabled                    |
> | off  | off    |    0       | Disabled       | No EEE advertisement          | EEE disabled                    |
> +------+--------+------------+----------------+--------------------------------+---------------------------------+
> 
> For Fiber:
> +-----------+-----------+-----------------+---------------------+-------------------------+
> |     eee   |   tx-lpi  | PHY EEE Adv.    | MAC LPI Status      | EEE Status on Link Level|
> +-----------+-----------+-----------------+---------------------+-------------------------+
> |     on    |     on    |         NA      | Enabled             | EEE supported           |
> |     on    |     off   |         NA      | Disabled            | EEE not supported       |
> |     off   |     on    |         NA      | Disabled            | EEE not supported       |
> |     off   |     off   |         NA      | Disabled            | EEE not supported       |
> +-----------+-----------+-----------------+---------------------+-------------------------+
> 
> In my perspective, eee_enabled serves as a global administrative control for
> all EEE properties, including PHY EEE advertisement and MAC LPI status. When
> EEE is turned off (eee_enabled = off), both PHY EEE advertisement and MAC LPI
> status should be disabled, regardless of the tx_lpi_enabled setting.
> 
> On the other hand, advertise retains the EEE advertisement configuration, even
> when EEE is turned off. This way, users can temporarily disable EEE without
> losing their specific advertisement settings, which can then be reinstated when
> EEE is turned back on.

I agree. I've found phylib's interpretation to be weird, particularly
that the advertisement settings are lost if one sets eee_enabled to be
false. That alone makes eee_enabled entirely redundant.

To me, sane behaviour is exactly as you describe - if the user sets
eee_enabled false, but sets an advertisement, the kernel should store
the advertisement setting so it can be retrieved via get_eee(), ready
for use when eee_enabled is subsequently set true.

> In the context of fiber interfaces, where there is no concept of advertisement,
> the eee and tx-lpi commands may appear redundant. However, maintaining both
> commands could offer consistency across different media types in the ethtool
> interface.

Yes, because having a consistent behaviour is important for a user API.
Having a user API randomly change behaviour depending on what's behind
it leads to user confusion.

Consider the case of a dual-media PHY - should the behaviour of the
EEE setter/getter change depending on what media is in use? Clearly
not, since one normally configures the EEE _policy_ when configuring
the network device, rather than each time the link comes up.

Thanks for your input. I now have some patches that add:

1. generic EEE configuration helpers, which I hope phylib and phylink
   can both use for consistent behaviour.
2. phylink gaining infrastructure for EEE management both of existing
   phylib and also fibre setups.
3. mvneta converted to use phylink's implementation. I may also do
   mvpp2 as well as I have patches for EEE support there as well.

I hope to post the patches later today.