diff mbox series

[v4,net-next,3/8] net: phy: bcm84881: move the in-band capability check where it belongs

Message ID 20221118000124.2754581-4-vladimir.oltean@nxp.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series Let phylink manage in-band AN for the PHY | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 10 of 10 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 53 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Vladimir Oltean Nov. 18, 2022, 12:01 a.m. UTC
Now that there is a generic interface through which phylink can query
PHY drivers whether they support various forms of in-band autoneg, use
that and delete the special case from phylink.c.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v3->v4:
- since we are now in phylink_sfp_config_phy() rather than
  phylink_sfp_config(), drop "if (phy)" check
- s/inband_aneg/an_inband/

 drivers/net/phy/bcm84881.c | 10 ++++++++++
 drivers/net/phy/phylink.c  | 17 ++---------------
 2 files changed, 12 insertions(+), 15 deletions(-)

Comments

Russell King (Oracle) Nov. 22, 2022, 9:38 a.m. UTC | #1
On Fri, Nov 18, 2022 at 02:01:19AM +0200, Vladimir Oltean wrote:
> Now that there is a generic interface through which phylink can query
> PHY drivers whether they support various forms of in-band autoneg, use
> that and delete the special case from phylink.c.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

I think I'd prefer to see patch 2 and patch 3 first in this series
(patch 3 without the phylink change). Possibly followed by other PHY
driver patches adding the validate_an_inband function, but that's not
important. Then the next patch can be patch 1 and the phylink part of
this patch combined - which makes the changes to phylink smaller as
there's no need to move the phylink_phy_no_inband() function and then
delete it a few patches later.

Also, if we get the Marvell driver implementing validate_an_inband()
then I believe we can get rid of other parts of this patch - 88E1111 is
the commonly used accessible PHY on gigabit SFPs, as this PHY implements
I2C access natively. As I mentioned, Marvell PHYs can be set to no
inband, requiring inband, or inband with bypass mode enabled. So we
need to decide how we deal with that - especially if we're going to be
changing the mode from 1000base-X to SGMII (which we do on some SFP
modules so they work at 10/100/1000.)

In that regard, I'm not entirely convinced that validate_an_inband()
covers the functionality we need - as reading the config register on
Marvell hardware doesn't guarantee that we're reading the right mode -
the PHY may be in 1000base-X, and we might change it to
SGMII-with-bypass - I'd need to go through the PHY datasheets to check
what we actually do.

Changing what the PHY driver does would be a recipe for regressions,
especially for drivers that do not use phylink.

Sorry, the above is a bit rambling, but are my thoughts on the current
approach.
Vladimir Oltean Nov. 22, 2022, 10:01 a.m. UTC | #2
On Tue, Nov 22, 2022 at 09:38:43AM +0000, Russell King (Oracle) wrote:
> Also, if we get the Marvell driver implementing validate_an_inband()
> then I believe we can get rid of other parts of this patch - 88E1111 is
> the commonly used accessible PHY on gigabit SFPs, as this PHY implements
> I2C access natively. As I mentioned, Marvell PHYs can be set to no
> inband, requiring inband, or inband with bypass mode enabled. So we
> need to decide how we deal with that - especially if we're going to be
> changing the mode from 1000base-X to SGMII (which we do on some SFP
> modules so they work at 10/100/1000.)

Not clear which parts of this patch we could ged rid of, if we
implemented in-band AN reporting/configuration for the 88E1111.
Based on your previous description, it sounds like it would be just more
functionality for software rather than less.

> In that regard, I'm not entirely convinced that validate_an_inband()
> covers the functionality we need - as reading the config register on
> Marvell hardware doesn't guarantee that we're reading the right mode -
> the PHY may be in 1000base-X, and we might change it to
> SGMII-with-bypass - I'd need to go through the PHY datasheets to check
> what we actually do.
> 
> Changing what the PHY driver does would be a recipe for regressions,
> especially for drivers that do not use phylink.

I believe that currently, the "interface" passed to phy_validate_an_inband()
and phy_config_an_inband() is always also the phydev->interface.
We could strive to keep that being the case, and put a phydev_warn() in
the Marvell PHY driver if we get a query for interface != phydev->interface,
and report PHY_AN_INBAND_UNKNOWN.

It's also one of the reasons why I didn't yet want to jump right into
figuring out what should be done with PHYs that change SERDES protocols,
and when exactly we query the in-band capability for the new one. Right
now, phylink will assume that the in-band capability reported for the
first SERDES protocol will continue to be the same for all subsequent
protocols. Obviously this might not be the case.
Russell King (Oracle) Nov. 22, 2022, 11:16 a.m. UTC | #3
On Tue, Nov 22, 2022 at 09:38:43AM +0000, Russell King (Oracle) wrote:
> Also, if we get the Marvell driver implementing validate_an_inband()
> then I believe we can get rid of other parts of this patch - 88E1111 is
> the commonly used accessible PHY on gigabit SFPs, as this PHY implements
> I2C access natively. As I mentioned, Marvell PHYs can be set to no
> inband, requiring inband, or inband with bypass mode enabled. So we
> need to decide how we deal with that - especially if we're going to be
> changing the mode from 1000base-X to SGMII (which we do on some SFP
> modules so they work at 10/100/1000.)

For the Marvell 88E1111:

- If switching into 1000base-X mode, then bypass mode is enabled by
m88e1111_config_init_1000basex(). However, if AN is disabled in the
fibre page BMCR (e.g. by firmware), then AN won't be used.

- If switching into SGMII mode, then bypass mode is left however it was
originally set by m88e1111_config_init_sgmii() - so it may be allowed
or it may be disallowed, which means it's whatever the hardware
defaulted to or firmware set it as.

For the 88e151x (x=0,2,8) it looks like bypass mode defaults to being
allowed on hardware reset, but firmware may change that.

I don't think we make much of an effort to configure other Marvell
PHYs, relying on their hardware reset defaults or firmware to set
them appropriately.

So, I think for 88e151x, we should implement something like:

	int mode, bmcr, fscr2;

	/* RGMII too? I believe RGMII can signal inband as well, so we
	 * may need to handle that as well.
	 */
	if (interface != PHY_INTERFACE_MODE_SGMII &&
	    interface != PHY_INTERFACE_MODE_1000BASE_X)
		return PHY_AN_INBAND_UNKNOWN;

	bmcr = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
	if (bmcr < 0)
		return SOME_ERROR?

	mode = PHY_AN_INBAND_OFF;

	if (bmcr & BMCR_ANENABLE) {
		mode = PHY_AN_INBAND_ON;

		fscr2 = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE,
				       0x1a);
		if (fscr2 & BIT(6))
			mode |= PHY_AN_INBAND_TIMEOUT;
	}

	return mode;

Obviously adding register definitions for BIT(6) and 01a.

For the 88E1111:

	int mode, hwcfg;

	/* If operating in 1000base-X mode, we always turn on inband
	 * and allow bypass.
	 */
	if (interface == PHY_INTERFACE_MODE_1000BASEX)
		return PHY_AN_INBAND_ON | PHY_AN_INBAND_TIMEOUT;

	if (interface == PHY_INTERFACE_MODE_SGMII) {
		hwcfg = phy_read(phydev, MII_M1111_PHY_EXT_SR);
		if (hwcfg < 0)
			return SOME_ERROR?

		mode = PHY_AN_INBAND_ON;
		if (hwcfg & MII_M1111_HWCFG_SERIAL_AN_BYPASS)
			mode |= PHY_AN_INBAND_TIMEOUT;

		return mode;
	}

	return PHY_AN_INBAND_UNKNOWN;

Maybe?
Vladimir Oltean Nov. 22, 2022, 12:11 p.m. UTC | #4
On Tue, Nov 22, 2022 at 11:16:07AM +0000, Russell King (Oracle) wrote:
> On Tue, Nov 22, 2022 at 09:38:43AM +0000, Russell King (Oracle) wrote:
> > Also, if we get the Marvell driver implementing validate_an_inband()
> > then I believe we can get rid of other parts of this patch - 88E1111 is
> > the commonly used accessible PHY on gigabit SFPs, as this PHY implements
> > I2C access natively. As I mentioned, Marvell PHYs can be set to no
> > inband, requiring inband, or inband with bypass mode enabled. So we
> > need to decide how we deal with that - especially if we're going to be
> > changing the mode from 1000base-X to SGMII (which we do on some SFP
> > modules so they work at 10/100/1000.)
> 
> For the Marvell 88E1111:
> 
> - If switching into 1000base-X mode, then bypass mode is enabled by
> m88e1111_config_init_1000basex(). However, if AN is disabled in the
> fibre page BMCR (e.g. by firmware), then AN won't be used.
> 
> - If switching into SGMII mode, then bypass mode is left however it was
> originally set by m88e1111_config_init_sgmii() - so it may be allowed
> or it may be disallowed, which means it's whatever the hardware
> defaulted to or firmware set it as.
> 
> For the 88e151x (x=0,2,8) it looks like bypass mode defaults to being
> allowed on hardware reset, but firmware may change that.
> 
> I don't think we make much of an effort to configure other Marvell
> PHYs, relying on their hardware reset defaults or firmware to set
> them appropriately.
> 
> So, I think for 88e151x, we should implement something like:
> 
> 	int mode, bmcr, fscr2;
> 
> 	/* RGMII too? I believe RGMII can signal inband as well, so we
> 	 * may need to handle that as well.
> 	 */
> 	if (interface != PHY_INTERFACE_MODE_SGMII &&
> 	    interface != PHY_INTERFACE_MODE_1000BASE_X)
> 		return PHY_AN_INBAND_UNKNOWN;
> 
> 	bmcr = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
> 	if (bmcr < 0)
> 		return SOME_ERROR?

There's a limitation in the API presented here, you can't return
SOME_ERROR, you have to return PHY_AN_INBAND_UNKNOWN and maybe log the
error to the console. If the error persists, other PHY methods will
eventually catch it.

> 
> 	mode = PHY_AN_INBAND_OFF;
> 
> 	if (bmcr & BMCR_ANENABLE) {
> 		mode = PHY_AN_INBAND_ON;
> 
> 		fscr2 = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE,
> 				       0x1a);
> 		if (fscr2 & BIT(6))
> 			mode |= PHY_AN_INBAND_TIMEOUT;
> 	}
> 
> 	return mode;
> 
> Obviously adding register definitions for BIT(6) and 01a.
> 
> For the 88E1111:
> 
> 	int mode, hwcfg;
> 
> 	/* If operating in 1000base-X mode, we always turn on inband
> 	 * and allow bypass.
> 	 */
> 	if (interface == PHY_INTERFACE_MODE_1000BASEX)
> 		return PHY_AN_INBAND_ON | PHY_AN_INBAND_TIMEOUT;
> 
> 	if (interface == PHY_INTERFACE_MODE_SGMII) {
> 		hwcfg = phy_read(phydev, MII_M1111_PHY_EXT_SR);
> 		if (hwcfg < 0)
> 			return SOME_ERROR?
> 
> 		mode = PHY_AN_INBAND_ON;
> 		if (hwcfg & MII_M1111_HWCFG_SERIAL_AN_BYPASS)
> 			mode |= PHY_AN_INBAND_TIMEOUT;
> 
> 		return mode;
> 	}
> 
> 	return PHY_AN_INBAND_UNKNOWN;
> 
> Maybe?

Hmm, not quite (neither for 88E151x not 88E1111). The intention with the
validate()/config() split is that you either implement just validate(),
or both. If you implement just validate(), you should report just one
bit, corresponding to what the hardware is configured for (so either
PHY_AN_INBAND_ON, *or* PHY_AN_INBAND_TIMEOUT). This is because you'd
otherwise tell phylink that 2 modes are supported, but provide no way to
choose between them, and you don't make it clear which one is in use
either. This will force phylink to adapt to MLO_AN_PHY or MLO_AN_INBAND,
depending on what has a chance of working.

If you implement config_an_inband() too, then the validate procedure
becomes a simple report of what can be configured for that PHY
(OFF | ON | ON_TIMEOUT for 88E151x, and ON | ON_TIMEOUT for 88E1111).
It's then the config_an_inband() procedure that applies to hardware the
mode that is selected by phylink. From config_an_inband() you can return
a negative error code on PHY I/O failure.

If you can prepare some more formal patches for these PHYs for which I
don't have documentation, I think I have a copper SFP module which uses
SGMII and 88E1111, and I can plug it into the Honeycomb and see what
happens.
Vladimir Oltean Nov. 22, 2022, 12:24 p.m. UTC | #5
On Tue, Nov 22, 2022 at 09:38:43AM +0000, Russell King (Oracle) wrote:
> 88E1111 is the commonly used accessible PHY on gigabit SFPs, as this
> PHY implements I2C access natively.

As a side question, I suppose it would be possible to put PHYs on copper
SFP modules even if they don't natively implement I2C access. In that case,
if configuration from the host is at all available, how does that happen,
is there some sort of protocol translator (I2C -> MDIO) on the module?
Do you know of any part number for an I2C controlled MDIO controller?
Russell King (Oracle) Nov. 22, 2022, 4:58 p.m. UTC | #6
On Tue, Nov 22, 2022 at 02:11:22PM +0200, Vladimir Oltean wrote:
> On Tue, Nov 22, 2022 at 11:16:07AM +0000, Russell King (Oracle) wrote:
> > On Tue, Nov 22, 2022 at 09:38:43AM +0000, Russell King (Oracle) wrote:
> > > Also, if we get the Marvell driver implementing validate_an_inband()
> > > then I believe we can get rid of other parts of this patch - 88E1111 is
> > > the commonly used accessible PHY on gigabit SFPs, as this PHY implements
> > > I2C access natively. As I mentioned, Marvell PHYs can be set to no
> > > inband, requiring inband, or inband with bypass mode enabled. So we
> > > need to decide how we deal with that - especially if we're going to be
> > > changing the mode from 1000base-X to SGMII (which we do on some SFP
> > > modules so they work at 10/100/1000.)
> > 
> > For the Marvell 88E1111:
> > 
> > - If switching into 1000base-X mode, then bypass mode is enabled by
> > m88e1111_config_init_1000basex(). However, if AN is disabled in the
> > fibre page BMCR (e.g. by firmware), then AN won't be used.
> > 
> > - If switching into SGMII mode, then bypass mode is left however it was
> > originally set by m88e1111_config_init_sgmii() - so it may be allowed
> > or it may be disallowed, which means it's whatever the hardware
> > defaulted to or firmware set it as.
> > 
> > For the 88e151x (x=0,2,8) it looks like bypass mode defaults to being
> > allowed on hardware reset, but firmware may change that.
> > 
> > I don't think we make much of an effort to configure other Marvell
> > PHYs, relying on their hardware reset defaults or firmware to set
> > them appropriately.
> > 
> > So, I think for 88e151x, we should implement something like:
> > 
> > 	int mode, bmcr, fscr2;
> > 
> > 	/* RGMII too? I believe RGMII can signal inband as well, so we
> > 	 * may need to handle that as well.
> > 	 */
> > 	if (interface != PHY_INTERFACE_MODE_SGMII &&
> > 	    interface != PHY_INTERFACE_MODE_1000BASE_X)
> > 		return PHY_AN_INBAND_UNKNOWN;
> > 
> > 	bmcr = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
> > 	if (bmcr < 0)
> > 		return SOME_ERROR?
> 
> There's a limitation in the API presented here, you can't return
> SOME_ERROR, you have to return PHY_AN_INBAND_UNKNOWN and maybe log the
> error to the console. If the error persists, other PHY methods will
> eventually catch it.

Right.

> > 
> > 	mode = PHY_AN_INBAND_OFF;
> > 
> > 	if (bmcr & BMCR_ANENABLE) {
> > 		mode = PHY_AN_INBAND_ON;
> > 
> > 		fscr2 = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE,
> > 				       0x1a);
> > 		if (fscr2 & BIT(6))
> > 			mode |= PHY_AN_INBAND_TIMEOUT;
> > 	}
> > 
> > 	return mode;
> > 
> > Obviously adding register definitions for BIT(6) and 01a.
> > 
> > For the 88E1111:
> > 
> > 	int mode, hwcfg;
> > 
> > 	/* If operating in 1000base-X mode, we always turn on inband
> > 	 * and allow bypass.
> > 	 */
> > 	if (interface == PHY_INTERFACE_MODE_1000BASEX)
> > 		return PHY_AN_INBAND_ON | PHY_AN_INBAND_TIMEOUT;
> > 
> > 	if (interface == PHY_INTERFACE_MODE_SGMII) {
> > 		hwcfg = phy_read(phydev, MII_M1111_PHY_EXT_SR);
> > 		if (hwcfg < 0)
> > 			return SOME_ERROR?
> > 
> > 		mode = PHY_AN_INBAND_ON;
> > 		if (hwcfg & MII_M1111_HWCFG_SERIAL_AN_BYPASS)
> > 			mode |= PHY_AN_INBAND_TIMEOUT;
> > 
> > 		return mode;
> > 	}
> > 
> > 	return PHY_AN_INBAND_UNKNOWN;
> > 
> > Maybe?
> 
> Hmm, not quite (neither for 88E151x not 88E1111). The intention with the
> validate()/config() split is that you either implement just validate(),
> or both.

The intention of the above is to report how the PHY is going to behave
with the current code when the PHY is in operation, which I believe to
be the intention of the validate callback. I'm not proposing at the
moment to add the config() part, although that can be done later.

As I stated, with the 88e1111, if we are asked to operate in 1000base-X
mode, when we configure the PHY we will allow bypass mode and I believe
in-band will be enabled, because this is what config_init() does today
when operating in 1000base-X mode. If we add support for your config()
method, then we will need a way to prevent a later config_init()
changing that configuration.

For SGMII, things are a lot more complicated, the result depends on how
the PHY has been setup by firmware or possibly reset pin strapping, so
we need to read registers to work out how it's going to behave. So,
unless we do that, we just can't report anything with certainty. We
probably ought to be reading a register to check that in-band is indeed
enabled.

Note that a soft-reset doesn't change any of this - it won't enable
in-band if it was disabled, and it won't disable it if it was previously
set to be enabled.

> If you implement just validate(), you should report just one
> bit, corresponding to what the hardware is configured for (so either
> PHY_AN_INBAND_ON, *or* PHY_AN_INBAND_TIMEOUT). This is because you'd
> otherwise tell phylink that 2 modes are supported, but provide no way to
> choose between them, and you don't make it clear which one is in use
> either. This will force phylink to adapt to MLO_AN_PHY or MLO_AN_INBAND,
> depending on what has a chance of working.

Don't we have the same problem with PHY_AN_INBAND_TIMEOUT? If a PHY
reports that, do we use MLO_AN_INBAND or MLO_AN_PHY?

> If you implement config_an_inband() too, then the validate procedure
> becomes a simple report of what can be configured for that PHY
> (OFF | ON | ON_TIMEOUT for 88E151x, and ON | ON_TIMEOUT for 88E1111).
> It's then the config_an_inband() procedure that applies to hardware the
> mode that is selected by phylink. From config_an_inband() you can return
> a negative error code on PHY I/O failure.

So it sounds like the decision about which mode to use needs to be
coupled with "does the PHY driver implement config_an_inband()"

> If you can prepare some more formal patches for these PHYs for which I
> don't have documentation, I think I have a copper SFP module which uses
> SGMII and 88E1111, and I can plug it into the Honeycomb and see what
> happens.

I'm away from home at the moment, which means I don't have a way to
do any in-depth tests other than with the SFPs that are plugged into
my Honeycomb - which does include some copper SFPs but they're not
connected to anything. So I can't test to see if data passes until
I'm back home next week.
Russell King (Oracle) Nov. 22, 2022, 5:51 p.m. UTC | #7
On Tue, Nov 22, 2022 at 02:24:51PM +0200, Vladimir Oltean wrote:
> On Tue, Nov 22, 2022 at 09:38:43AM +0000, Russell King (Oracle) wrote:
> > 88E1111 is the commonly used accessible PHY on gigabit SFPs, as this
> > PHY implements I2C access natively.
> 
> As a side question, I suppose it would be possible to put PHYs on copper
> SFP modules even if they don't natively implement I2C access. In that case,
> if configuration from the host is at all available, how does that happen,
> is there some sort of protocol translator (I2C -> MDIO) on the module?

Modules that provide access to the PHY seem to do so on I2C address 0x56
and require specific I2C access patterns. I think the 88e1111 set a kind
of "standard" and many follow that or similar.

Some just don't give any kind of access to the PHY - for example, the
Mikrotik modules have an AR803x PHY on them, but there is no way to
access it.

> Do you know of any part number for an I2C controlled MDIO controller?

I think many that don't use a microcontroller - for example, when
someone puts an 88x3310, bcm84881, the protocols are very similar but
timing may vary.

Some of the patches from Marek provided a different way - via the 0x50
or 0x51 "eeprom" addresses, accessing specific addresses with in those
spaces, and if I remember correctly, requiring a "password" to enable
access.
Vladimir Oltean Nov. 22, 2022, 5:56 p.m. UTC | #8
On Tue, Nov 22, 2022 at 04:58:45PM +0000, Russell King (Oracle) wrote:
> The intention of the above is to report how the PHY is going to behave
> with the current code when the PHY is in operation, which I believe to
> be the intention of the validate callback. I'm not proposing at the
> moment to add the config() part, although that can be done later.

Again, all I'm saying is that with validate() but no config(), you should
report a single bit, not PHY_AN_INBAND_ON | PHY_AN_INBAND_TIMEOUT.
The code which you posted does report multiple bits, and we won't know
how to make sense out of them. We'd select what is the most convenient
to us, and we'll call phy_config_an_inband() with that, but it'll return
-EOPNOTSUPP, which will be perfectly reasonable to us. Except that the
PHY may actually not currently operate in the mode it reported as a
supported capability. So things are broken.

> As I stated, with the 88e1111, if we are asked to operate in 1000base-X
> mode, when we configure the PHY we will allow bypass mode and I believe
> in-band will be enabled, because this is what config_init() does today
> when operating in 1000base-X mode. If we add support for your config()
> method, then we will need a way to prevent a later config_init()
> changing that configuration.

If config_init() is going to be kept being called only from
phy_attach_direct() -> phy_init_hw(), then, at least with phylink, we
only call phy_config_an_inband() from phylink_bringup_phy(), so after
the phy_attach_direct() call. So we overwrite what the driver does by
default.

The problem is not phy_config_an_inband() but phy_validate_an_inband().
We call that earlier than phy_attach_direct(), so if the PHY driver is
going to read a register from HW which hasn't yet been written, we get
an incorrect report of the current capabilities.

I'll see if I can fix that and delay phy_validate_an_inband() a bit,
maybe up until before right before phy_config_an_inband().

> 
> For SGMII, things are a lot more complicated, the result depends on how
> the PHY has been setup by firmware or possibly reset pin strapping, so
> we need to read registers to work out how it's going to behave. So,
> unless we do that, we just can't report anything with certainty. We
> probably ought to be reading a register to check that in-band is indeed
> enabled.
> 
> Note that a soft-reset doesn't change any of this - it won't enable
> in-band if it was disabled, and it won't disable it if it was previously
> set to be enabled.

Similar to VSC8514 and the U-Boot preconfiguration issue. Soft reset,
but the setting sticks.

> > If you implement just validate(), you should report just one
> > bit, corresponding to what the hardware is configured for (so either
> > PHY_AN_INBAND_ON, *or* PHY_AN_INBAND_TIMEOUT). This is because you'd
> > otherwise tell phylink that 2 modes are supported, but provide no way to
> > choose between them, and you don't make it clear which one is in use
> > either. This will force phylink to adapt to MLO_AN_PHY or MLO_AN_INBAND,
> > depending on what has a chance of working.
> 
> Don't we have the same problem with PHY_AN_INBAND_TIMEOUT? If a PHY
> reports that, do we use MLO_AN_INBAND or MLO_AN_PHY?

Well, I haven't yet written any logic for it.

To your question: "PHY_AN_INBAND_ON_TIMEOUT => MLO_AN_PHY or MLO_AN_INBAND"?
I'd say either depending on situation, since my expectation is that it's
compatible with both.

Always give preference to what's in the device tree if it can work
somehow. If it can work in fully compatible modes (MLO_AN_PHY with
PHY_AN_INBAND_OFF; MLO_AN_INBAND with PHY_AN_INBAND_ON), perfect.
If not, but what's in the device tree can work with PHY_AN_INBAND_ON_TIMEOUT,
also good => use ON_TIMEOUT.

If what's in the device tree needs to be changed, it pretty much means
that ON_TIMEOUT isn't supported by the PHY.

Concretely, I would propose something like this (a modification of the
function added by the patch set, notice the extra "an_inband" argument,
as well as the new checks for PHY_AN_INBAND_ON_TIMEOUT):

static void phylink_sync_an_inband(struct phylink *pl, struct phy_device *phy,
				   enum phy_an_inband *an_inband)
{
	unsigned int mode = pl->cfg_link_an_mode;
	int ret;

	if (!pl->config->sync_an_inband)
		return;

	ret = phy_validate_an_inband(phy, pl->link_config.interface);
	if (ret == PHY_AN_INBAND_UNKNOWN) {
		phylink_dbg(pl,
			    "PHY driver does not report in-band autoneg capability, assuming %s\n",
			    phylink_autoneg_inband(mode) ? "true" : "false");

		*an_inband = PHY_AN_INBAND_UNKNOWN;
	} else if (phylink_autoneg_inband(mode) &&
		   !(ret & PHY_AN_INBAND_ON) &&
		   !(ret & PHY_AN_INBAND_ON_TIMEOUT)) {
		phylink_err(pl,
			    "Requested in-band autoneg but driver does not support this, disabling it.\n");

		mode = MLO_AN_PHY;
		*an_inband = PHY_AN_INBAND_OFF;
	} else if (!phylink_autoneg_inband(mode) &&
		   !(ret & PHY_AN_INBAND_OFF) &&
		   !(ret & PHY_AN_INBAND_ON_TIMEOUT)) {
		phylink_dbg(pl,
			    "PHY driver requests in-band autoneg, force-enabling it.\n");

		mode = MLO_AN_INBAND;
		*an_inband = PHY_AN_INBAND_ON;
	} else {
		/* For the checks below, we've found a common operating
		 * mode with the PHY, just need to figure out if we
		 * agree fully or if we have to rely on the PHY's
		 * timeout ability
		 */
		if (phylink_autoneg_inband(mode)) {
			*an_inband = !!(ret & PHY_AN_INBAND_ON) ? PHY_AN_INBAND_ON :
					PHY_AN_INBAND_ON_TIMEOUT;
		} else {
			*an_inband = !!(ret & PHY_AN_INBAND_OFF) ? PHY_AN_INBAND_OFF :
					PHY_AN_INBAND_ON_TIMEOUT;
		}
	}

	pl->cur_link_an_mode = mode;
}

then call phy_config_an_inband() with "an_inband" as the mode to use.

As per Sean's feedback, we force the PHY to report at least one valid
capability, otherwise, 0 is PHY_AN_INBAND_UNKNOWN and it's also treated
correctly.

> > If you implement config_an_inband() too, then the validate procedure
> > becomes a simple report of what can be configured for that PHY
> > (OFF | ON | ON_TIMEOUT for 88E151x, and ON | ON_TIMEOUT for 88E1111).
> > It's then the config_an_inband() procedure that applies to hardware the
> > mode that is selected by phylink. From config_an_inband() you can return
> > a negative error code on PHY I/O failure.
> 
> So it sounds like the decision about which mode to use needs to be
> coupled with "does the PHY driver implement config_an_inband()"

So do you recommend that I should put a WARN_ON() somewhere, which
asserts something like this?

"if the weight (number of bits set) in the return code of
phy_validate_an_inband() is larger than 1, then phydev->drv->phy_config_an_inband()
must be a non-NULL pointer, to allow selecting between them"

> > If you can prepare some more formal patches for these PHYs for which I
> > don't have documentation, I think I have a copper SFP module which uses
> > SGMII and 88E1111, and I can plug it into the Honeycomb and see what
> > happens.
> 
> I'm away from home at the moment, which means I don't have a way to
> do any in-depth tests other than with the SFPs that are plugged into
> my Honeycomb - which does include some copper SFPs but they're not
> connected to anything. So I can't test to see if data passes until
> I'm back home next week.

I actually meant that I can test on a Solidrun Honeycomb board that I
happen to have access to, if you have some Marvell PHY code, even untested,
that I could try out. I'm pretty much in the dark when it comes to their
hardware documentation.
Vladimir Oltean Nov. 22, 2022, 6:14 p.m. UTC | #9
On Tue, Nov 22, 2022 at 07:56:25PM +0200, Vladimir Oltean wrote:
> On Tue, Nov 22, 2022 at 04:58:45PM +0000, Russell King (Oracle) wrote:
> > > I think I have a copper SFP module which uses SGMII and 88E1111,
> > > and I can plug it into the Honeycomb and see what happens.
> > 
> > I don't have a way to do any in-depth tests other than with the SFPs
> > that are plugged into my Honeycomb
> 
> I actually meant that I can test on a Solidrun Honeycomb board that I
> happen to have access to

I may have misunderstood what you said here. Subconsciously I thought
that you replied to what I said, but it looks unrelated. Disregard.

Anyway, considering I can't really patch the Marvell PHY driver
(knowing what I'm doing), not sure what to choose between waiting for
some help there (to have more coverage => less chances for regression)
and simply not converting dpaa1's ovr_an_inband to sync_an_inband.
I don't think the other NXP drivers are quite in the same risk of
regressions as dpaa1, as their device trees were introduced much more
recently compared to their conversion to phylink.
Russell King (Oracle) Nov. 22, 2022, 6:28 p.m. UTC | #10
On Tue, Nov 22, 2022 at 07:56:25PM +0200, Vladimir Oltean wrote:
> The problem is not phy_config_an_inband() but phy_validate_an_inband().
> We call that earlier than phy_attach_direct(), so if the PHY driver is
> going to read a register from HW which hasn't yet been written, we get
> an incorrect report of the current capabilities.

Why would it be "incorrect" ?

What the code I'm proposing correctly reports back what inband mode(s)
will be in use should we select the proposed interface mode. Let's
ignore whether we report the TIMEOUT or not for that statement, because
I think that's confusing the discussion.

If we _do_ want to report whether the TIMEOUT mode is going to be used
or not, the code I proposed is what will be necessary, because it
depends on (a) how the PHY is strapped and (b) how firmware or external
EEPROM has setup the device. If we want a single bit, then we would
report just _ON_TIMEOUT if bypass is enabled - but we still need to
read registers to come to a conclusion about whether it's enabled or
not. As I say, we can't blindly say "if interface is X, then bypass
will be enabled" for any X - and what may be correct for one board will
not be correct for another.

Moreover, in the 88e1111 case on a SFP, what's right for one SFP is not
right for another - there are SFPs where the 88e1111 registers are
preloaded from an EEPROM, so whether bypass is enabled or not in SGMII
mode is up to the contents of the EEPROM - the marvell PHY driver does
not interfere with that setting for SGMII.

Hence, to report how the PHY will behave in SGMII mode, with lack of
explicit configuration, we _have_ to read registers and use them to
determine the outcome.

> > > If you implement just validate(), you should report just one
> > > bit, corresponding to what the hardware is configured for (so either
> > > PHY_AN_INBAND_ON, *or* PHY_AN_INBAND_TIMEOUT). This is because you'd
> > > otherwise tell phylink that 2 modes are supported, but provide no way to
> > > choose between them, and you don't make it clear which one is in use
> > > either. This will force phylink to adapt to MLO_AN_PHY or MLO_AN_INBAND,
> > > depending on what has a chance of working.
> > 
> > Don't we have the same problem with PHY_AN_INBAND_TIMEOUT? If a PHY
> > reports that, do we use MLO_AN_INBAND or MLO_AN_PHY?
> 
> Well, I haven't yet written any logic for it.
> 
> To your question: "PHY_AN_INBAND_ON_TIMEOUT => MLO_AN_PHY or MLO_AN_INBAND"?
> I'd say either depending on situation, since my expectation is that it's
> compatible with both.

Yes, it would be compatible with both.

> Always give preference to what's in the device tree if it can work
> somehow. If it can work in fully compatible modes (MLO_AN_PHY with
> PHY_AN_INBAND_OFF; MLO_AN_INBAND with PHY_AN_INBAND_ON), perfect.
> If not, but what's in the device tree can work with PHY_AN_INBAND_ON_TIMEOUT,
> also good => use ON_TIMEOUT.

What do we do for a SFP module with a Marvell PHY on - we need to cover
that in this thought process, especially as 88e1111 is one of the most
popular PHYs on Gigabit copper SFPs. We can't really say "whatever
DT/ACPI firmware says" because that's not relevant to SFPs (we always
override firmware for SFPs.)

> If what's in the device tree needs to be changed, it pretty much means
> that ON_TIMEOUT isn't supported by the PHY.
> 
> Concretely, I would propose something like this (a modification of the
> function added by the patch set, notice the extra "an_inband" argument,
> as well as the new checks for PHY_AN_INBAND_ON_TIMEOUT):
> 
> static void phylink_sync_an_inband(struct phylink *pl, struct phy_device *phy,
> 				   enum phy_an_inband *an_inband)
> {
> 	unsigned int mode = pl->cfg_link_an_mode;
> 	int ret;
> 
> 	if (!pl->config->sync_an_inband)
> 		return;
> 
> 	ret = phy_validate_an_inband(phy, pl->link_config.interface);
> 	if (ret == PHY_AN_INBAND_UNKNOWN) {
> 		phylink_dbg(pl,
> 			    "PHY driver does not report in-band autoneg capability, assuming %s\n",
> 			    phylink_autoneg_inband(mode) ? "true" : "false");
> 
> 		*an_inband = PHY_AN_INBAND_UNKNOWN;
> 	} else if (phylink_autoneg_inband(mode) &&
> 		   !(ret & PHY_AN_INBAND_ON) &&
> 		   !(ret & PHY_AN_INBAND_ON_TIMEOUT)) {
> 		phylink_err(pl,
> 			    "Requested in-band autoneg but driver does not support this, disabling it.\n");
> 
> 		mode = MLO_AN_PHY;
> 		*an_inband = PHY_AN_INBAND_OFF;
> 	} else if (!phylink_autoneg_inband(mode) &&
> 		   !(ret & PHY_AN_INBAND_OFF) &&
> 		   !(ret & PHY_AN_INBAND_ON_TIMEOUT)) {
> 		phylink_dbg(pl,
> 			    "PHY driver requests in-band autoneg, force-enabling it.\n");
> 
> 		mode = MLO_AN_INBAND;
> 		*an_inband = PHY_AN_INBAND_ON;
> 	} else {
> 		/* For the checks below, we've found a common operating
> 		 * mode with the PHY, just need to figure out if we
> 		 * agree fully or if we have to rely on the PHY's
> 		 * timeout ability
> 		 */
> 		if (phylink_autoneg_inband(mode)) {
> 			*an_inband = !!(ret & PHY_AN_INBAND_ON) ? PHY_AN_INBAND_ON :
> 					PHY_AN_INBAND_ON_TIMEOUT;
> 		} else {
> 			*an_inband = !!(ret & PHY_AN_INBAND_OFF) ? PHY_AN_INBAND_OFF :
> 					PHY_AN_INBAND_ON_TIMEOUT;
> 		}
> 	}
> 
> 	pl->cur_link_an_mode = mode;
> }
> 
> then call phy_config_an_inband() with "an_inband" as the mode to use.
> 
> As per Sean's feedback, we force the PHY to report at least one valid
> capability, otherwise, 0 is PHY_AN_INBAND_UNKNOWN and it's also treated
> correctly.
> 
> > > If you implement config_an_inband() too, then the validate procedure
> > > becomes a simple report of what can be configured for that PHY
> > > (OFF | ON | ON_TIMEOUT for 88E151x, and ON | ON_TIMEOUT for 88E1111).
> > > It's then the config_an_inband() procedure that applies to hardware the
> > > mode that is selected by phylink. From config_an_inband() you can return
> > > a negative error code on PHY I/O failure.
> > 
> > So it sounds like the decision about which mode to use needs to be
> > coupled with "does the PHY driver implement config_an_inband()"
> 
> So do you recommend that I should put a WARN_ON() somewhere, which
> asserts something like this?
> 
> "if the weight (number of bits set) in the return code of
> phy_validate_an_inband() is larger than 1, then phydev->drv->phy_config_an_inband()
> must be a non-NULL pointer, to allow selecting between them"

I think that would be a good idea, otherwise we are going to have to
rely on reviewers spotting this error, which given the way patches are
picked up on netdev, is prone to being missed. So, I think the more we
can encourage people to do it correctly first time, the better.

> > > If you can prepare some more formal patches for these PHYs for which I
> > > don't have documentation, I think I have a copper SFP module which uses
> > > SGMII and 88E1111, and I can plug it into the Honeycomb and see what
> > > happens.
> > 
> > I'm away from home at the moment, which means I don't have a way to
> > do any in-depth tests other than with the SFPs that are plugged into
> > my Honeycomb - which does include some copper SFPs but they're not
> > connected to anything. So I can't test to see if data passes until
> > I'm back home next week.
> 
> I actually meant that I can test on a Solidrun Honeycomb board that I
> happen to have access to, if you have some Marvell PHY code, even untested,
> that I could try out. I'm pretty much in the dark when it comes to their
> hardware documentation.

If we can agree on the reading-registers approach I suggested (with
the multi-bit return values corrected), then I can turn that into a
patch, but I think we need to come to agreement on that first.
Vladimir Oltean Nov. 22, 2022, 7:36 p.m. UTC | #11
On Tue, Nov 22, 2022 at 06:28:38PM +0000, Russell King (Oracle) wrote:
> On Tue, Nov 22, 2022 at 07:56:25PM +0200, Vladimir Oltean wrote:
> > The problem is not phy_config_an_inband() but phy_validate_an_inband().
> > We call that earlier than phy_attach_direct(), so if the PHY driver is
> > going to read a register from HW which hasn't yet been written, we get
> > an incorrect report of the current capabilities.
> 
> Why would it be "incorrect" ?
> 
> What the code I'm proposing correctly reports back what inband mode(s)
> will be in use should we select the proposed interface mode. Let's
> ignore whether we report the TIMEOUT or not for that statement, because
> I think that's confusing the discussion.
> 
> If we _do_ want to report whether the TIMEOUT mode is going to be used
> or not, the code I proposed is what will be necessary, because it
> depends on (a) how the PHY is strapped and (b) how firmware or external
> EEPROM has setup the device. If we want a single bit, then we would
> report just _ON_TIMEOUT if bypass is enabled - but we still need to
> read registers to come to a conclusion about whether it's enabled or
> not. As I say, we can't blindly say "if interface is X, then bypass
> will be enabled" for any X - and what may be correct for one board will
> not be correct for another.
> 
> Moreover, in the 88e1111 case on a SFP, what's right for one SFP is not
> right for another - there are SFPs where the 88e1111 registers are
> preloaded from an EEPROM, so whether bypass is enabled or not in SGMII
> mode is up to the contents of the EEPROM - the marvell PHY driver does
> not interfere with that setting for SGMII.
> 
> Hence, to report how the PHY will behave in SGMII mode, with lack of
> explicit configuration, we _have_ to read registers and use them to
> determine the outcome.
> 

I'll re-read this tomorrow to make sure I didn't miss something because
of being tired.

I may have mixed up interface modes in the validate() code for MV88E1111
that you posted. I was under the impression that PHY_AN_INBAND_TIMEOUT
always gets reported based on reading a hardware register, the same
hardware register that gets overwritten to MII_M1111_HWCFG_SERIAL_AN_BYPASS
in m88e1111_config_init_1000basex().

But your proposed code is actually a mix between reading the existing
hardware configuration for SGMII, and returning something hardcoded for
1000base-x. For 1000base-x, we will return PHY_AN_INBAND_TIMEOUT, not
because the hardware is currently configured like that, but because it
will be, later. And the timing of the validate() call isn't going to be
a problem, so there isn't a reason to move it.

I'm okay with that, I just didn't understand.

> > Always give preference to what's in the device tree if it can work
> > somehow. If it can work in fully compatible modes (MLO_AN_PHY with
> > PHY_AN_INBAND_OFF; MLO_AN_INBAND with PHY_AN_INBAND_ON), perfect.
> > If not, but what's in the device tree can work with PHY_AN_INBAND_ON_TIMEOUT,
> > also good => use ON_TIMEOUT.
> 
> What do we do for a SFP module with a Marvell PHY on - we need to cover
> that in this thought process, especially as 88e1111 is one of the most
> popular PHYs on Gigabit copper SFPs. We can't really say "whatever
> DT/ACPI firmware says" because that's not relevant to SFPs (we always
> override firmware for SFPs.)

Ok, I only answered to part of the question - which is how do we
interpret phy_validate_an_inband()'s result from phylink_sync_an_inband() -
the on-board PHY code path.

If the code path we're talking about is from phylink_sfp_config_phy(),
then the modified code, to account for TIMEOUT, would look like this:

	/* Select whether to operate in in-band mode or not, based on the
	 * capability of the PHY in the current link mode.
	 */
	ret = phy_validate_an_inband(phy, iface);
	if (ret == PHY_AN_INBAND_UNKNOWN) {
		mode = MLO_AN_INBAND;

		phylink_dbg(pl,
			    "PHY driver does not report in-band autoneg capability, assuming true\n");
	} else if (ret & (PHY_AN_INBAND_ON | PHY_AN_INBAND_ON_TIMEOUT)) {
		mode = MLO_AN_INBAND;
	} else {
		mode = MLO_AN_PHY;
	}

or in words, essentially prefer MLO_AN_INBAND except when the PHY driver
says that it requires in-band disabled.

At least that's for now, because we assume that the PCS always supports
MLO_AN_INBAND. For the purpose of this series, let's assume that's a given.

> > > > If you can prepare some more formal patches for these PHYs for which I
> > > > don't have documentation, I think I have a copper SFP module which uses
> > > > SGMII and 88E1111, and I can plug it into the Honeycomb and see what
> > > > happens.
> > > 
> > > I'm away from home at the moment, which means I don't have a way to
> > > do any in-depth tests other than with the SFPs that are plugged into
> > > my Honeycomb - which does include some copper SFPs but they're not
> > > connected to anything. So I can't test to see if data passes until
> > > I'm back home next week.
> > 
> > I actually meant that I can test on a Solidrun Honeycomb board that I
> > happen to have access to, if you have some Marvell PHY code, even untested,
> > that I could try out. I'm pretty much in the dark when it comes to their
> > hardware documentation.
> 
> If we can agree on the reading-registers approach I suggested (with
> the multi-bit return values corrected), then I can turn that into a
> patch, but I think we need to come to agreement on that first.

I think we're in agreement, but please let's wait until tomorrow, I need
to take a break for today.
Russell King (Oracle) Nov. 23, 2022, 12:08 p.m. UTC | #12
On Tue, Nov 22, 2022 at 09:36:59PM +0200, Vladimir Oltean wrote:
> I think we're in agreement, but please let's wait until tomorrow, I need
> to take a break for today.

I think we do have a sort of agreement... but lets give this a go. The
following should be sufficient for copper SFP modules using the 88E1111
PHY. However, I haven't build-tested this patch yet.

Reading through the documentation has brought up some worms in this
area. :(

It may be worth printing the fiber page BMCR and extsr at various
strategic points in this driver and reporting back if things don't
seem to be working right for your modules. In the mean time, I'll try
to see how the modules in the Honeycomb appear to be setup at power-up
and after the driver has configured the PHY... assuming I left both
MicroUSBs connected and the board has a network connection via the
main ethernet jack.

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index 3c54d7d0f17f..9d537a2bccda 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -669,6 +669,54 @@ static int marvell_config_aneg_fiber(struct phy_device *phydev)
 	return genphy_check_and_restart_aneg(phydev, changed);
 }
 
+static int m88e1111_validate_an_inband(struct phy_device *phydev,
+				       phy_interface_t interface)
+{
+	int hwcfg_mode, extsr, bmcr;
+
+	if (interface != PHY_INTERFACE_MODE_1000BASEX &&
+	    interface != PHY_INTERFACE_MODE_SGMII)
+		return PHY_AN_INBAND_UNKNOWN;
+
+	extsr = phy_read(phydev, MII_M1111_PHY_EXT_SR);
+	if (extsr < 0)
+		return PHY_AN_INBAND_UNKNOWN;
+
+	hwcfg_mode = extsr & MII_M1111_HWCFG_MODE_MASK;
+
+	/* If we are in 1000base-X no-AN hwcfg_mode,
+	 * m88e1111_config_init_1000basex() will allegedly enable AN and
+	 * allow AN bypass - but there is a question over whether that is
+	 * actually the case. Let's follow what the comment says, and assume
+	 * that it was actually tested.
+	 */
+	if (interface == PHY_INTERFACE_MODE_1000BASEX &&
+	    hwcfg_mode == MII_M1111_HWCFG_MODE_COPPER_1000X_NOAN)
+		return PHY_AN_INBAND_ON_TIMEOUT;
+
+	/* Otherwise, we leave the AN enable bit and the AN bypass bit
+	 * alone, so we need to read the registers to determine how the
+	 * MAC facing side of the PHY has been setup by firmware and/or
+	 * hardware reset.
+	 *
+	 * If the AN enable bit is clear, then all in-band signalling
+	 * on the SGMII/1000base-X side is disabled. Otherwise, AN is
+	 * enabled. If the bypass bit is set, AN can complete without
+	 * a response from the partner (MAC).
+	 */
+	bmcr = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
+	if (bmcr < 0)
+		return PHY_AN_INBAND_UNKNOWN;
+
+	if (!(bmcr & BMCR_ANENABLE))
+		return PHY_AN_INBAND_OFF;
+
+	if (extsr & MII_M1111_HWCFG_SERIAL_AN_BYPASS)
+		return PHY_AN_INBAND_ON_TIMEOUT;
+
+	return PHY_AN_INBAND_ON;
+}
+
 static int m88e1111_config_aneg(struct phy_device *phydev)
 {
 	int extsr = phy_read(phydev, MII_M1111_PHY_EXT_SR);
@@ -915,7 +963,10 @@ static int m88e1111_config_init_1000basex(struct phy_device *phydev)
 	if (extsr < 0)
 		return extsr;
 
-	/* If using copper mode, ensure 1000BaseX auto-negotiation is enabled */
+	/* If using copper mode, ensure 1000BaseX auto-negotiation is enabled.
+	 * FIXME: does this actually enable 1000BaseX auto-negotiation if it
+	 * was previously disabled in the Fiber BMCR? 2.3.1.6 suggests not!
+	 */
 	mode = extsr & MII_M1111_HWCFG_MODE_MASK;
 	if (mode == MII_M1111_HWCFG_MODE_COPPER_1000X_NOAN) {
 		err = phy_modify(phydev, MII_M1111_PHY_EXT_SR,
@@ -2997,6 +3048,7 @@ static struct phy_driver marvell_drivers[] = {
 		.get_stats = marvell_get_stats,
 		.get_tunable = m88e1111_get_tunable,
 		.set_tunable = m88e1111_set_tunable,
+		.validate_an_inband = m88e1111_validate_an_inband,
 	},
 	{
 		.phy_id = MARVELL_PHY_ID_88E1111_FINISAR,
Russell King (Oracle) Nov. 23, 2022, 1:11 p.m. UTC | #13
On Wed, Nov 23, 2022 at 12:08:11PM +0000, Russell King (Oracle) wrote:
> On Tue, Nov 22, 2022 at 09:36:59PM +0200, Vladimir Oltean wrote:
> > I think we're in agreement, but please let's wait until tomorrow, I need
> > to take a break for today.
> 
> I think we do have a sort of agreement... but lets give this a go. The
> following should be sufficient for copper SFP modules using the 88E1111
> PHY. However, I haven't build-tested this patch yet.
> 
> Reading through the documentation has brought up some worms in this
> area. :(
> 
> It may be worth printing the fiber page BMCR and extsr at various
> strategic points in this driver and reporting back if things don't
> seem to be working right for your modules. In the mean time, I'll try
> to see how the modules in the Honeycomb appear to be setup at power-up
> and after the driver has configured the PHY... assuming I left both
> MicroUSBs connected and the board has a network connection via the
> main ethernet jack.

Unfortunately, I don't have a SFP with an 88e1111 plugged in, only the
bcm84881, so I can't test my patch remotely. However, it builds fine
when the appropriate TIMEOUT definition is added.
Vladimir Oltean Nov. 25, 2022, 12:30 p.m. UTC | #14
Hi Russell,

On Wed, Nov 23, 2022 at 01:11:23PM +0000, Russell King (Oracle) wrote:
> On Wed, Nov 23, 2022 at 12:08:11PM +0000, Russell King (Oracle) wrote:
> > On Tue, Nov 22, 2022 at 09:36:59PM +0200, Vladimir Oltean wrote:
> > > I think we're in agreement, but please let's wait until tomorrow, I need
> > > to take a break for today.
> > 
> > I think we do have a sort of agreement... but lets give this a go. The
> > following should be sufficient for copper SFP modules using the 88E1111
> > PHY. However, I haven't build-tested this patch yet.
> > 
> > Reading through the documentation has brought up some worms in this
> > area. :(
> > 
> > It may be worth printing the fiber page BMCR and extsr at various
> > strategic points in this driver and reporting back if things don't
> > seem to be working right for your modules. In the mean time, I'll try
> > to see how the modules in the Honeycomb appear to be setup at power-up
> > and after the driver has configured the PHY... assuming I left both
> > MicroUSBs connected and the board has a network connection via the
> > main ethernet jack.
> 
> Unfortunately, I don't have a SFP with an 88e1111 plugged in, only the
> bcm84881, so I can't test my patch remotely. However, it builds fine
> when the appropriate TIMEOUT definition is added.

Sorry for the delay. Had to do something else yesterday and the day before.

I tested the patch and it does detect the operating mode of my PHY.

My modules are these:

[    6.465788] sfp sfp-0: module UBNT  UF-RJ45-1G  rev 1.0  sn X20072804742  dc 200617
ethtool -m dpmac7
        Identifier              : 0x03 (SFP)
        Extended identifier     : 0x04 (GBIC/SFP defined by 2-wire interface ID)
        Connector               : 0x00 (unknown or unspecified)
        Transceiver codes       : 0x00 0x00 0x00 0x08 0x00 0x00 0x00 0x00 0x00
        Transceiver type        : Ethernet: 1000BASE-T
        Encoding                : 0x01 (8B/10B)
        BR, Nominal             : 1300MBd
        Rate identifier         : 0x00 (unspecified)
        Length (SMF,km)         : 0km
        Length (SMF)            : 0m
        Length (50um)           : 0m
        Length (62.5um)         : 0m
        Length (Copper)         : 100m
        Length (OM3)            : 0m
        Laser wavelength        : 0nm
        Vendor name             : UBNT
        Vendor OUI              : 00:00:00
        Vendor PN               : UF-RJ45-1G
        Vendor rev              : 1.0
        Option values           : 0x00 0x1a
        Option                  : RX_LOS implemented
        Option                  : TX_FAULT implemented
        Option                  : TX_DISABLE implemented
        BR margin, max          : 0%
        BR margin, min          : 0%
        Vendor SN               : X20072804742
        Date code               : 200617

Here is how the PHY driver does a few things:

[ 3079.596985] fsl_dpaa2_eth dpni.1 dpmac7: configuring for inband/sgmii link mode
[ 3079.689892] fsl_dpaa2_eth dpni.1 dpmac7: PHY driver reported AN inband 0x4 // PHY_AN_INBAND_ON_TIMEOUT
[ 3079.696826] fsl_dpaa2_eth dpni.1 dpmac7: switched to phy/sgmii link mode
[ 3079.779656] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_sgmii: EXT_SR before 0x9088 after 0x9084, fiber page BMCR 0x1140
[ 3079.865386] fsl_dpaa2_eth dpni.1 dpmac7: PHY [i2c:sfp-0:16] driver [Marvell 88E1111] (irq=POLL)

So the default EXT_SR is being changed by the PHY driver from 0x9088 to
0x9084 (MII_M1111_HWCFG_MODE_COPPER_1000X_AN -> MII_M1111_HWCFG_MODE_SGMII_NO_CLK).

I don't know if it's possible to force these modules to operate in
1000BASE-X mode. If you're interested in the results there, please give
me some guidance.

I was curious if the fiber page BMCR has an effect for in-band autoneg,
and at least for SGMII it surely does. If I add this to m88e1111_config_init_sgmii():

	phy_modify_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR,
			 BMCR_ANENABLE, 0);

(and force an intentional mismatch) then I am able to break the link
with my Lynx PCS.

If my hunch is correct that this also holds true for 1000BASE-X, then
you are also correct that m88e1111_config_init_1000basex() only allows
AN bypass but does not seem to enable in-band AN in itself, if it wasn't
enabled.

The implication here is that maybe we should test for the fiber page
BMCR in both SGMII and 1000BASE-X modes?

Should we call m88e1111_validate_an_inband() also for the Finisar
variant of the 88E1111? What about 88E1112?
Russell King (Oracle) Nov. 25, 2022, 1:43 p.m. UTC | #15
Hi Vladimir,

On Fri, Nov 25, 2022 at 02:30:42PM +0200, Vladimir Oltean wrote:
> Hi Russell,
> 
> Sorry for the delay. Had to do something else yesterday and the day before.

I think there was some kind of celebration going on in the US for at
least one of those days...

> I tested the patch and it does detect the operating mode of my PHY.

Thanks for testing.

> My modules are these:
> 
> [    6.465788] sfp sfp-0: module UBNT  UF-RJ45-1G  rev 1.0  sn X20072804742  dc 200617
> ethtool -m dpmac7
>         Identifier              : 0x03 (SFP)
>         Extended identifier     : 0x04 (GBIC/SFP defined by 2-wire interface ID)
>         Connector               : 0x00 (unknown or unspecified)
>         Transceiver codes       : 0x00 0x00 0x00 0x08 0x00 0x00 0x00 0x00 0x00
>         Transceiver type        : Ethernet: 1000BASE-T
>         Encoding                : 0x01 (8B/10B)
>         BR, Nominal             : 1300MBd
>         Rate identifier         : 0x00 (unspecified)
>         Length (SMF,km)         : 0km
>         Length (SMF)            : 0m
>         Length (50um)           : 0m
>         Length (62.5um)         : 0m
>         Length (Copper)         : 100m
>         Length (OM3)            : 0m
>         Laser wavelength        : 0nm
>         Vendor name             : UBNT
>         Vendor OUI              : 00:00:00
>         Vendor PN               : UF-RJ45-1G
>         Vendor rev              : 1.0
>         Option values           : 0x00 0x1a
>         Option                  : RX_LOS implemented
>         Option                  : TX_FAULT implemented
>         Option                  : TX_DISABLE implemented
>         BR margin, max          : 0%
>         BR margin, min          : 0%
>         Vendor SN               : X20072804742
>         Date code               : 200617
> 
> Here is how the PHY driver does a few things:
> 
> [ 3079.596985] fsl_dpaa2_eth dpni.1 dpmac7: configuring for inband/sgmii link mode
> [ 3079.689892] fsl_dpaa2_eth dpni.1 dpmac7: PHY driver reported AN inband 0x4 // PHY_AN_INBAND_ON_TIMEOUT
> [ 3079.696826] fsl_dpaa2_eth dpni.1 dpmac7: switched to phy/sgmii link mode
> [ 3079.779656] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_sgmii: EXT_SR before 0x9088 after 0x9084, fiber page BMCR 0x1140
> [ 3079.865386] fsl_dpaa2_eth dpni.1 dpmac7: PHY [i2c:sfp-0:16] driver [Marvell 88E1111] (irq=POLL)
> 
> So the default EXT_SR is being changed by the PHY driver from 0x9088 to
> 0x9084 (MII_M1111_HWCFG_MODE_COPPER_1000X_AN -> MII_M1111_HWCFG_MODE_SGMII_NO_CLK).
> 
> I don't know if it's possible to force these modules to operate in
> 1000BASE-X mode. If you're interested in the results there, please give
> me some guidance.

The value of "EXT_SR before" is 1000base-X, so if you change sfp-bus.c::
sfp_select_interface() to use 1000BASEX instead of SGMII then you'll be
using 1000BASEX instead (and it should work, although at fixed 1G
speeds). The only reason the module is working in SGMII mode is because,
as you've noticed above, we switch it to SGMII mode in
m88e1111_config_init_sgmii().

> I was curious if the fiber page BMCR has an effect for in-band autoneg,
> and at least for SGMII it surely does. If I add this to
> m88e1111_config_init_sgmii():
> 
> 	phy_modify_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR,
> 			 BMCR_ANENABLE, 0);
> 
> (and force an intentional mismatch) then I am able to break the link
> with my Lynx PCS.

Yes, the fiber page is re-used for the host side of the link when
operating in SGMII and 1000baseX modes, so changes there have the
expected effect.

> If my hunch is correct that this also holds true for 1000BASE-X, then
> you are also correct that m88e1111_config_init_1000basex() only allows
> AN bypass but does not seem to enable in-band AN in itself, if it wasn't
> enabled.
> 
> The implication here is that maybe we should test for the fiber page
> BMCR in both SGMII and 1000BASE-X modes?

I think a more comprehensive test would be to write the fiber page
BMCR with 0x140 before changing the mode from 1000baseX to SGMII and
see whether the BMCR changes value. My suspicion is it won't, and
the hwcfg_mode only has an effect on the settings in the fiber page
under hardware reset conditions, and mode changes have no effect on
the fiber page.

If that is correct, then I think we have two options:
1. make m88e1111_config_init_1000basex() actually do what the comment
   says, and enable AN in the fiber page (risky). That would make it
   conform with how I implemented the validate_an_inband() function.
2. update the comment in m88e1111_config_init_1000basex() to reflect
   what's actually happening with the hardware, and make
   validate_an_inband() read the fiber page BMCR for 1000base-X too.

> Should we call m88e1111_validate_an_inband() also for the Finisar
> variant of the 88E1111? What about 88E1112?

Entirely probable - this patch is minimalist in order for your tests,
I didn't bother with the 151x devices. I'll prepare a more
comprehensive patch next week.
Vladimir Oltean Nov. 25, 2022, 3:35 p.m. UTC | #16
On Fri, Nov 25, 2022 at 01:43:34PM +0000, Russell King (Oracle) wrote:
> Hi Vladimir,
> 
> On Fri, Nov 25, 2022 at 02:30:42PM +0200, Vladimir Oltean wrote:
> > Hi Russell,
> > 
> > Sorry for the delay. Had to do something else yesterday and the day before.
> 
> I think there was some kind of celebration going on in the US for at
> least one of those days...

Yeah, but I don't celebrate that. I had to write some documentation.

> > So the default EXT_SR is being changed by the PHY driver from 0x9088 to
> > 0x9084 (MII_M1111_HWCFG_MODE_COPPER_1000X_AN -> MII_M1111_HWCFG_MODE_SGMII_NO_CLK).
> > 
> > I don't know if it's possible to force these modules to operate in
> > 1000BASE-X mode. If you're interested in the results there, please give
> > me some guidance.
> 
> The value of "EXT_SR before" is 1000base-X, so if you change sfp-bus.c::
> sfp_select_interface() to use 1000BASEX instead of SGMII then you'll be
> using 1000BASEX instead (and it should work, although at fixed 1G
> speeds). The only reason the module is working in SGMII mode is because,
> as you've noticed above, we switch it to SGMII mode in
> m88e1111_config_init_sgmii().
> 

Which is an interesting thing, because m88e1111_config_init_1000basex()
does not change the HWCFG_MODE_MASK to something with 1000X in it.

Anyway, with sfp_select_interface() hacked, I can confirm it works in
1000BASE-X with AN enabled too.

[   69.746643] fsl_dpaa2_eth dpni.1 dpmac7: configuring for inband/sgmii link mode
[   69.845784] fsl_dpaa2_eth dpni.1 dpmac7: PHY driver reported AN inband 0x4 // PHY_AN_INBAND_ON_TIMEOUT
[   69.852764] fsl_dpaa2_eth dpni.1 dpmac7: switched to inband/1000base-x link mode // MLO_AN_INBAND
[   69.934191] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_1000basex: EXT_SR before 0x9088 after 0x9088, fiber page BMCR before 0x1140 after 0x1140
[   70.015735] fsl_dpaa2_eth dpni.1 dpmac7: PHY [i2c:sfp-0:16] driver [Marvell 88E1111] (irq=POLL)

ping 192.168.100.2
PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
64 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.874 ms
64 bytes from 192.168.100.2: icmp_seq=2 ttl=64 time=0.225 ms
64 bytes from 192.168.100.2: icmp_seq=3 ttl=64 time=0.216 ms
^C
--- 192.168.100.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2019ms
rtt min/avg/max/mdev = 0.216/0.438/0.874/0.308 ms

printed with code:

static int m88e1111_config_init_1000basex(struct phy_device *phydev)
{
	int extsr = phy_read(phydev, MII_M1111_PHY_EXT_SR);
	int fiber_bmcr_before, fiber_bmcr_after;
	int ext_sr_after;
	int err, mode;

	if (extsr < 0)
		return extsr;

	fiber_bmcr_before = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
	if (fiber_bmcr_before < 0)
		return fiber_bmcr_before;

	/* If using copper mode, ensure 1000BaseX auto-negotiation is enabled.
	 * FIXME: does this actually enable 1000BaseX auto-negotiation if it
	 * was previously disabled in the Fiber BMCR? 2.3.1.6 suggests not!
	 */
	mode = extsr & MII_M1111_HWCFG_MODE_MASK;
	if (mode == MII_M1111_HWCFG_MODE_COPPER_1000X_NOAN) {
		err = phy_modify(phydev, MII_M1111_PHY_EXT_SR,
				 MII_M1111_HWCFG_MODE_MASK |
				 MII_M1111_HWCFG_SERIAL_AN_BYPASS,
				 MII_M1111_HWCFG_MODE_COPPER_1000X_AN |
				 MII_M1111_HWCFG_SERIAL_AN_BYPASS);
		if (err < 0)
			return err;
	}

	ext_sr_after = phy_read(phydev, MII_M1111_PHY_EXT_SR);
	if (ext_sr_after < 0)
		return ext_sr_after;

	fiber_bmcr_after = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
	if (fiber_bmcr_after < 0)
		return fiber_bmcr_after;

	phydev_err(phydev, "%s: EXT_SR before 0x%x after 0x%x, fiber page BMCR before 0x%x after 0x%x\n",
		   __func__, extsr, ext_sr_after,
		   fiber_bmcr_before, fiber_bmcr_after);
	return 0;
}

Furthermore, I can confirm that if the fiber page BMCR has BMCR_ANENABLE
disabled, the link with my Lynx PCS in MLO_AN_INBAND is broken (and that
the write to EXT_SR doesn't change the value of the BMCR).



But there's actually a problem (or maybe two problems).

First is that if I make phylink treat the ON_TIMEOUT capability by using
MLO_AN_PHY (basically like this):

phylink_sfp_config_phy():

	/* Select whether to operate in in-band mode or not, based on the
	 * capability of the PHY in the current link mode.
	 */
	ret = phy_validate_an_inband(phy, iface);
	phylink_err(pl, "PHY driver reported AN inband 0x%x\n", ret);
	if (ret == PHY_AN_INBAND_UNKNOWN) {
		mode = MLO_AN_INBAND;

		phylink_dbg(pl,
			    "PHY driver does not report in-band autoneg capability, assuming true\n");
//	} else if (ret & (PHY_AN_INBAND_ON | PHY_AN_INBAND_ON_TIMEOUT)) {
	} else if (ret & PHY_AN_INBAND_ON) {
		mode = MLO_AN_INBAND;
	} else {
		mode = MLO_AN_PHY;
	}

[   30.059923] fsl_dpaa2_eth dpni.1 dpmac7: PHY driver reported AN inband 0x4 // PHY_AN_INBAND_ON_TIMEOUT
[   30.066867] fsl_dpaa2_eth dpni.1 dpmac7: switched to phy/1000base-x link mode // MLO_AN_PHY
[   30.153350] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_1000basex: EXT_SR before 0x9088 after 0x9088, fiber page BMCR before 0x1140 after 0x1140
[   30.238970] fsl_dpaa2_eth dpni.1 dpmac7: PHY [i2c:sfp-0:16] driver [Marvell 88E1111] (irq=POLL)

then pinging is broken with mismatched in-band AN settings ("TIMEOUT" in
PHY, "OFF" in PCS). I triple-checked this.

ping 192.168.100.2
PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
From 192.168.100.1 icmp_seq=1 Destination Host Unreachable
From 192.168.100.1 icmp_seq=2 Destination Host Unreachable
From 192.168.100.1 icmp_seq=3 Destination Host Unreachable
From 192.168.100.1 icmp_seq=4 Destination Host Unreachable
From 192.168.100.1 icmp_seq=5 Destination Host Unreachable
From 192.168.100.1 icmp_seq=6 Destination Host Unreachable
^C
--- 192.168.100.2 ping statistics ---
9 packets transmitted, 0 received, +6 errors, 100% packet loss, time 8170ms


However, if using the same phylink code (to force a mismatch), I unhack
sfp_select_interface() and use SGMII mode, the timeout feature does
actually work:

[   30.262979] fsl_dpaa2_eth dpni.1 dpmac7: PHY driver reported AN inband 0x4 // PHY_AN_INBAND_ON_TIMEOUT
[   30.270349] fsl_dpaa2_eth dpni.1 dpmac7: switched to phy/sgmii link mode // MLO_AN_PHY
[   30.351066] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_sgmii: EXT_SR before 0x9088 after 0x9084, fiber page BMCR before 0x1140 after 0x1140
[   30.433236] fsl_dpaa2_eth dpni.1 dpmac7: PHY [i2c:sfp-0:16] driver [Marvell 88E1111] (irq=POLL)

this is a functional link despite the mismatched settings.

ping 192.168.100.2
PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
64 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.885 ms
64 bytes from 192.168.100.2: icmp_seq=2 ttl=64 time=0.221 ms
64 bytes from 192.168.100.2: icmp_seq=3 ttl=64 time=0.216 ms
64 bytes from 192.168.100.2: icmp_seq=4 ttl=64 time=0.217 ms
64 bytes from 192.168.100.2: icmp_seq=5 ttl=64 time=0.238 ms
^C
--- 192.168.100.2 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4062ms
rtt min/avg/max/mdev = 0.216/0.355/0.885/0.264 ms


The second problem is that not even *matched* settings work if I turn
off BMCR_ANENABLE in the PHY fiber page.

[   30.809869] fsl_dpaa2_eth dpni.1 dpmac7: configuring for inband/sgmii link mode
[   30.817936] mdio_bus 0x0000000008c1f000:00: MII_BMCR 0x1140 MII_BMSR 0x9 MII_ADVERTISE 0x1 MII_LPA 0x0 IF_MODE 0x3 // PCS registers at the end of lynx_pcs_config_giga()
[   30.917651] fsl_dpaa2_eth dpni.1 dpmac7: PHY driver reported AN inband 0x4 // ignore; m88e1111_validate_an_inband() is hardcoded for this and does not detect BMCR for BASE-X
[   30.924571] fsl_dpaa2_eth dpni.1 dpmac7: switched to phy/1000base-x link mode
[   30.932441] mdio_bus 0x0000000008c1f000:00: MII_BMCR 0x140 MII_BMSR 0xd MII_ADVERTISE 0x1 MII_LPA 0x0 IF_MODE 0x1
[   31.032547] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_1000basex: EXT_SR before 0x9088 after 0x9088, fiber page BMCR before 0x140 after 0x140
[   31.117668] fsl_dpaa2_eth dpni.1 dpmac7: PHY [i2c:sfp-0:16] driver [Marvell 88E1111] (irq=POLL)

ping 192.168.100.2
PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
^C
--- 192.168.100.2 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3058ms

What's common is that if in-band autoneg is turned off (either forced
off or via timeout), 1000BASE-X between the Lynx PCS and the 88E1111
simply doesn't work.

> > I was curious if the fiber page BMCR has an effect for in-band autoneg,
> > and at least for SGMII it surely does. If I add this to
> > m88e1111_config_init_sgmii():
> > 
> > 	phy_modify_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR,
> > 			 BMCR_ANENABLE, 0);
> > 
> > (and force an intentional mismatch) then I am able to break the link
> > with my Lynx PCS.
> 
> Yes, the fiber page is re-used for the host side of the link when
> operating in SGMII and 1000baseX modes, so changes there have the
> expected effect.
> 
> > If my hunch is correct that this also holds true for 1000BASE-X, then
> > you are also correct that m88e1111_config_init_1000basex() only allows
> > AN bypass but does not seem to enable in-band AN in itself, if it wasn't
> > enabled.
> > 
> > The implication here is that maybe we should test for the fiber page
> > BMCR in both SGMII and 1000BASE-X modes?
> 
> I think a more comprehensive test would be to write the fiber page
> BMCR with 0x140 before changing the mode from 1000baseX to SGMII and
> see whether the BMCR changes value. My suspicion is it won't, and
> the hwcfg_mode only has an effect on the settings in the fiber page
> under hardware reset conditions, and mode changes have no effect on
> the fiber page.

Confirmed that changes to the EXT_SR register don't cause changes to the
MII_BMCR register:

[   28.587838] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_sgmii: EXT_SR before 0x9088 after 0x9084, fiber page BMCR before 0x140 after 0x140

generated by:

static int m88e1111_config_init_sgmii(struct phy_device *phydev)
{
	int fiber_bmcr_before, fiber_bmcr_after;
	int ext_sr_before, ext_sr_after;
	int err;

	ext_sr_before = phy_read(phydev, MII_M1111_PHY_EXT_SR);
	if (ext_sr_before < 0)
		return ext_sr_before;

	err = phy_modify_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR,
			       BMCR_ANENABLE, 0);
	if (err < 0)
		return err;

	fiber_bmcr_before = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
	if (fiber_bmcr_before < 0)
		return fiber_bmcr_before;

	err = m88e1111_config_init_hwcfg_mode(
		phydev,
		MII_M1111_HWCFG_MODE_SGMII_NO_CLK,
		MII_M1111_HWCFG_FIBER_COPPER_AUTO);
	if (err < 0)
		return err;

	ext_sr_after = phy_read(phydev, MII_M1111_PHY_EXT_SR);
	if (ext_sr_after < 0)
		return ext_sr_after;

	fiber_bmcr_after = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
	if (fiber_bmcr_after < 0)
		return fiber_bmcr_after;

	phydev_err(phydev, "%s: EXT_SR before 0x%x after 0x%x, fiber page BMCR before 0x%x after 0x%x\n",
		   __func__, ext_sr_before, ext_sr_after,
		   fiber_bmcr_before, fiber_bmcr_after);

	/* make sure copper is selected */
	return marvell_set_page(phydev, MII_MARVELL_COPPER_PAGE);
}
Russell King (Oracle) Nov. 27, 2022, 10:14 p.m. UTC | #17
On Fri, Nov 25, 2022 at 05:35:55PM +0200, Vladimir Oltean wrote:
> On Fri, Nov 25, 2022 at 01:43:34PM +0000, Russell King (Oracle) wrote:
> > The value of "EXT_SR before" is 1000base-X, so if you change sfp-bus.c::
> > sfp_select_interface() to use 1000BASEX instead of SGMII then you'll be
> > using 1000BASEX instead (and it should work, although at fixed 1G
> > speeds). The only reason the module is working in SGMII mode is because,
> > as you've noticed above, we switch it to SGMII mode in
> > m88e1111_config_init_sgmii().
> 
> Which is an interesting thing, because m88e1111_config_init_1000basex()
> does not change the HWCFG_MODE_MASK to something with 1000X in it.

It only changes the hwcfg mode if it was using 1000base-X no-AN -
switching it instead to be 1000base-X with AN, but as we've established
the comment above that code describes something which doesn't happen,
as the fibre page BMCR is unaffected by this change.

Anyway, with my SourcePhotonics SPGBTXCNFC module (which is a SGMII
module) I get:

Marvell 88E1111 i2c:sfp-3:16: extsr: 8084 fiber bmcr: 1140

although the first time I plugged it in, BMCR was 1940 (pdown set).
Key thing is this module doesn't have bypass permitted.

> But there's actually a problem (or maybe two problems).
> 
> First is that if I make phylink treat the ON_TIMEOUT capability by using
> MLO_AN_PHY (basically like this):
> 
> phylink_sfp_config_phy():
> 
> 	/* Select whether to operate in in-band mode or not, based on the
> 	 * capability of the PHY in the current link mode.
> 	 */
> 	ret = phy_validate_an_inband(phy, iface);
> 	phylink_err(pl, "PHY driver reported AN inband 0x%x\n", ret);
> 	if (ret == PHY_AN_INBAND_UNKNOWN) {
> 		mode = MLO_AN_INBAND;
> 
> 		phylink_dbg(pl,
> 			    "PHY driver does not report in-band autoneg capability, assuming true\n");
> //	} else if (ret & (PHY_AN_INBAND_ON | PHY_AN_INBAND_ON_TIMEOUT)) {
> 	} else if (ret & PHY_AN_INBAND_ON) {
> 		mode = MLO_AN_INBAND;
> 	} else {
> 		mode = MLO_AN_PHY;
> 	}
> 
> [   30.059923] fsl_dpaa2_eth dpni.1 dpmac7: PHY driver reported AN inband 0x4 // PHY_AN_INBAND_ON_TIMEOUT
> [   30.066867] fsl_dpaa2_eth dpni.1 dpmac7: switched to phy/1000base-x link mode // MLO_AN_PHY
> [   30.153350] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_1000basex: EXT_SR before 0x9088 after 0x9088, fiber page BMCR before 0x1140 after 0x1140
> [   30.238970] fsl_dpaa2_eth dpni.1 dpmac7: PHY [i2c:sfp-0:16] driver [Marvell 88E1111] (irq=POLL)
> 
> then pinging is broken with mismatched in-band AN settings ("TIMEOUT" in
> PHY, "OFF" in PCS). I triple-checked this.
> 
> ping 192.168.100.2
> PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
> From 192.168.100.1 icmp_seq=1 Destination Host Unreachable
> From 192.168.100.1 icmp_seq=2 Destination Host Unreachable
> From 192.168.100.1 icmp_seq=3 Destination Host Unreachable
> From 192.168.100.1 icmp_seq=4 Destination Host Unreachable
> From 192.168.100.1 icmp_seq=5 Destination Host Unreachable
> From 192.168.100.1 icmp_seq=6 Destination Host Unreachable
> ^C
> --- 192.168.100.2 ping statistics ---
> 9 packets transmitted, 0 received, +6 errors, 100% packet loss, time 8170ms
> 
> 
> However, if using the same phylink code (to force a mismatch), I unhack
> sfp_select_interface() and use SGMII mode, the timeout feature does
> actually work:
> 
> [   30.262979] fsl_dpaa2_eth dpni.1 dpmac7: PHY driver reported AN inband 0x4 // PHY_AN_INBAND_ON_TIMEOUT
> [   30.270349] fsl_dpaa2_eth dpni.1 dpmac7: switched to phy/sgmii link mode // MLO_AN_PHY
> [   30.351066] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_sgmii: EXT_SR before 0x9088 after 0x9084, fiber page BMCR before 0x1140 after 0x1140
> [   30.433236] fsl_dpaa2_eth dpni.1 dpmac7: PHY [i2c:sfp-0:16] driver [Marvell 88E1111] (irq=POLL)
> 
> this is a functional link despite the mismatched settings.
> 
> ping 192.168.100.2
> PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
> 64 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.885 ms
> 64 bytes from 192.168.100.2: icmp_seq=2 ttl=64 time=0.221 ms
> 64 bytes from 192.168.100.2: icmp_seq=3 ttl=64 time=0.216 ms
> 64 bytes from 192.168.100.2: icmp_seq=4 ttl=64 time=0.217 ms
> 64 bytes from 192.168.100.2: icmp_seq=5 ttl=64 time=0.238 ms
> ^C
> --- 192.168.100.2 ping statistics ---
> 5 packets transmitted, 5 received, 0% packet loss, time 4062ms
> rtt min/avg/max/mdev = 0.216/0.355/0.885/0.264 ms
> 
> 
> The second problem is that not even *matched* settings work if I turn
> off BMCR_ANENABLE in the PHY fiber page.
> 
> [   30.809869] fsl_dpaa2_eth dpni.1 dpmac7: configuring for inband/sgmii link mode
> [   30.817936] mdio_bus 0x0000000008c1f000:00: MII_BMCR 0x1140 MII_BMSR 0x9 MII_ADVERTISE 0x1 MII_LPA 0x0 IF_MODE 0x3 // PCS registers at the end of lynx_pcs_config_giga()
> [   30.917651] fsl_dpaa2_eth dpni.1 dpmac7: PHY driver reported AN inband 0x4 // ignore; m88e1111_validate_an_inband() is hardcoded for this and does not detect BMCR for BASE-X
> [   30.924571] fsl_dpaa2_eth dpni.1 dpmac7: switched to phy/1000base-x link mode
> [   30.932441] mdio_bus 0x0000000008c1f000:00: MII_BMCR 0x140 MII_BMSR 0xd MII_ADVERTISE 0x1 MII_LPA 0x0 IF_MODE 0x1
> [   31.032547] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_1000basex: EXT_SR before 0x9088 after 0x9088, fiber page BMCR before 0x140 after 0x140
> [   31.117668] fsl_dpaa2_eth dpni.1 dpmac7: PHY [i2c:sfp-0:16] driver [Marvell 88E1111] (irq=POLL)
> 
> ping 192.168.100.2
> PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
> ^C
> --- 192.168.100.2 ping statistics ---
> 4 packets transmitted, 0 received, 100% packet loss, time 3058ms
> 
> What's common is that if in-band autoneg is turned off (either forced
> off or via timeout), 1000BASE-X between the Lynx PCS and the 88E1111
> simply doesn't work.

I've just tried an experiment here with my SourcePhotonics module.

I made m88e1111_validate_an_inband() set the SERIAL_AN_BYPASS bit,
and then the bit I think you're probably unaware of - the PHY needs
to be soft-reset in order for that change to take effect. Calling
genphy_soft_reset() is sufficient.

Then I made m88e1111_validate_an_inband() return PHY_AN_INBAND_OFF.
So we now have the PHY setup with BMCR=1140 and EXTSR=9084.

# ping -I eth4 fe80::222:68ff:fe15:37dd
ping: Warning: source address might be selected on device other than: eth4
PING fe80::222:68ff:fe15:37dd(fe80::222:68ff:fe15:37dd) from :: eth4: 56 data bytes
64 bytes from fe80::222:68ff:fe15:37dd%eth4: icmp_seq=1 ttl=64 time=0.281 ms
^C
--- fe80::222:68ff:fe15:37dd ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.281/0.281/0.281/0.000 ms

(yes, it did use eth4's source address, I checked with tcpdump on the
target machine.)

So the link appears to be functional. Using a highly modified mii-diag
tool that allows me to read/write registers on the PHY, if I read the
EXT_SR register, it now contains:

Reading 0x001b=0x9884

and bit 11 being set means the PHY went into bypass mode. In other
words, it didn't see the SGMII acknowledgement from the MAC and
decided to bring the link up in bypass mode.

However, I've just tripped over some information in the 88E1111
manual which states that in SGMII mode, if bypass mode is used, then
the PHY will apparently renegotiate on the copper side advertising
1000baseT HD and FD only, no pause. So I checked what my link partner
is seeing, and it was seeing the original advertisement.

So I then triggered a renegotiate from the partner, and it now shows
only 1000baseT/Half 1000baseT/Full being advertised by the 88E1111.
Reading the advertisement register, it still contains 0x0d41, which
shows pause modes, 100FD, 10FD - so the advertisement register doesn't
reflect what was actually adfertised in this case.

Also, presumably, based on this observation, it will only renegotiate
if the copper side hadn't resolved to gigabit. If correct, what this
means is that when operating in SGMII mode, the the PHY becomes
gigabit-only if bypass mode gets used.

Given this behaviour, the fact that it switches to gigabit only when
the SGMII side enters bypass mode, I think we should _positively_ be
disabling inband bypass in SGMII mode. This change in advertisement
is not what phylib would expect, and I suspect could lead to surprises
e.g. if phylib was told to advertise non-gigabit speeds only.

However, I'll try this test with 1000base-X mode tomorrow.

> > I think a more comprehensive test would be to write the fiber page
> > BMCR with 0x140 before changing the mode from 1000baseX to SGMII and
> > see whether the BMCR changes value. My suspicion is it won't, and
> > the hwcfg_mode only has an effect on the settings in the fiber page
> > under hardware reset conditions, and mode changes have no effect on
> > the fiber page.
> 
> Confirmed that changes to the EXT_SR register don't cause changes to the
> MII_BMCR register:
> 
> [   28.587838] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_sgmii: EXT_SR before 0x9088 after 0x9084, fiber page BMCR before 0x140 after 0x140
> 
> generated by:
> 
> static int m88e1111_config_init_sgmii(struct phy_device *phydev)
> {
> 	int fiber_bmcr_before, fiber_bmcr_after;
> 	int ext_sr_before, ext_sr_after;
> 	int err;
> 
> 	ext_sr_before = phy_read(phydev, MII_M1111_PHY_EXT_SR);
> 	if (ext_sr_before < 0)
> 		return ext_sr_before;
> 
> 	err = phy_modify_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR,
> 			       BMCR_ANENABLE, 0);
> 	if (err < 0)
> 		return err;
> 
> 	fiber_bmcr_before = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
> 	if (fiber_bmcr_before < 0)
> 		return fiber_bmcr_before;
> 
> 	err = m88e1111_config_init_hwcfg_mode(
> 		phydev,
> 		MII_M1111_HWCFG_MODE_SGMII_NO_CLK,
> 		MII_M1111_HWCFG_FIBER_COPPER_AUTO);
> 	if (err < 0)
> 		return err;
> 
> 	ext_sr_after = phy_read(phydev, MII_M1111_PHY_EXT_SR);
> 	if (ext_sr_after < 0)
> 		return ext_sr_after;
> 
> 	fiber_bmcr_after = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
> 	if (fiber_bmcr_after < 0)
> 		return fiber_bmcr_after;
> 
> 	phydev_err(phydev, "%s: EXT_SR before 0x%x after 0x%x, fiber page BMCR before 0x%x after 0x%x\n",
> 		   __func__, ext_sr_before, ext_sr_after,
> 		   fiber_bmcr_before, fiber_bmcr_after);
> 
> 	/* make sure copper is selected */
> 	return marvell_set_page(phydev, MII_MARVELL_COPPER_PAGE);
> }

Thanks for testing. So that means m88e1111_config_init_sgmii() will not
enable in-band if it was previously disabled. So we need to check the
fiber ANENABLE bit and unconditionally return PHY_AN_INBAND_OFF if it is
clear before evaluating anything else.

Also, given this behaviour of bypass mode, it seems it would only make
sense if the PHY were operating in 1000base-X mode, which we don't do
with SFPs, so maybe it makes no sense to support the ON_TIMEOUT as an
option right now - and as I say above, maybe we should be focing the
AN bypass allow bit to be clear in SGMII mode.

I think maybe Andrew needs to be involved in that last bit though.
Russell King (Oracle) Nov. 29, 2022, 1:40 p.m. UTC | #18
On Sun, Nov 27, 2022 at 10:14:45PM +0000, Russell King (Oracle) wrote:
> On Fri, Nov 25, 2022 at 05:35:55PM +0200, Vladimir Oltean wrote:
> > On Fri, Nov 25, 2022 at 01:43:34PM +0000, Russell King (Oracle) wrote:
> > > The value of "EXT_SR before" is 1000base-X, so if you change sfp-bus.c::
> > > sfp_select_interface() to use 1000BASEX instead of SGMII then you'll be
> > > using 1000BASEX instead (and it should work, although at fixed 1G
> > > speeds). The only reason the module is working in SGMII mode is because,
> > > as you've noticed above, we switch it to SGMII mode in
> > > m88e1111_config_init_sgmii().
> > 
> > Which is an interesting thing, because m88e1111_config_init_1000basex()
> > does not change the HWCFG_MODE_MASK to something with 1000X in it.
> 
> It only changes the hwcfg mode if it was using 1000base-X no-AN -
> switching it instead to be 1000base-X with AN, but as we've established
> the comment above that code describes something which doesn't happen,
> as the fibre page BMCR is unaffected by this change.
> 
> Anyway, with my SourcePhotonics SPGBTXCNFC module (which is a SGMII
> module) I get:
> 
> Marvell 88E1111 i2c:sfp-3:16: extsr: 8084 fiber bmcr: 1140
> 
> although the first time I plugged it in, BMCR was 1940 (pdown set).
> Key thing is this module doesn't have bypass permitted.
> 
> > But there's actually a problem (or maybe two problems).
> > 
> > First is that if I make phylink treat the ON_TIMEOUT capability by using
> > MLO_AN_PHY (basically like this):
> > 
> > phylink_sfp_config_phy():
> > 
> > 	/* Select whether to operate in in-band mode or not, based on the
> > 	 * capability of the PHY in the current link mode.
> > 	 */
> > 	ret = phy_validate_an_inband(phy, iface);
> > 	phylink_err(pl, "PHY driver reported AN inband 0x%x\n", ret);
> > 	if (ret == PHY_AN_INBAND_UNKNOWN) {
> > 		mode = MLO_AN_INBAND;
> > 
> > 		phylink_dbg(pl,
> > 			    "PHY driver does not report in-band autoneg capability, assuming true\n");
> > //	} else if (ret & (PHY_AN_INBAND_ON | PHY_AN_INBAND_ON_TIMEOUT)) {
> > 	} else if (ret & PHY_AN_INBAND_ON) {
> > 		mode = MLO_AN_INBAND;
> > 	} else {
> > 		mode = MLO_AN_PHY;
> > 	}
> > 
> > [   30.059923] fsl_dpaa2_eth dpni.1 dpmac7: PHY driver reported AN inband 0x4 // PHY_AN_INBAND_ON_TIMEOUT
> > [   30.066867] fsl_dpaa2_eth dpni.1 dpmac7: switched to phy/1000base-x link mode // MLO_AN_PHY
> > [   30.153350] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_1000basex: EXT_SR before 0x9088 after 0x9088, fiber page BMCR before 0x1140 after 0x1140
> > [   30.238970] fsl_dpaa2_eth dpni.1 dpmac7: PHY [i2c:sfp-0:16] driver [Marvell 88E1111] (irq=POLL)
> > 
> > then pinging is broken with mismatched in-band AN settings ("TIMEOUT" in
> > PHY, "OFF" in PCS). I triple-checked this.
> > 
> > ping 192.168.100.2
> > PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
> > From 192.168.100.1 icmp_seq=1 Destination Host Unreachable
> > From 192.168.100.1 icmp_seq=2 Destination Host Unreachable
> > From 192.168.100.1 icmp_seq=3 Destination Host Unreachable
> > From 192.168.100.1 icmp_seq=4 Destination Host Unreachable
> > From 192.168.100.1 icmp_seq=5 Destination Host Unreachable
> > From 192.168.100.1 icmp_seq=6 Destination Host Unreachable
> > ^C
> > --- 192.168.100.2 ping statistics ---
> > 9 packets transmitted, 0 received, +6 errors, 100% packet loss, time 8170ms
> > 
> > 
> > However, if using the same phylink code (to force a mismatch), I unhack
> > sfp_select_interface() and use SGMII mode, the timeout feature does
> > actually work:
> > 
> > [   30.262979] fsl_dpaa2_eth dpni.1 dpmac7: PHY driver reported AN inband 0x4 // PHY_AN_INBAND_ON_TIMEOUT
> > [   30.270349] fsl_dpaa2_eth dpni.1 dpmac7: switched to phy/sgmii link mode // MLO_AN_PHY
> > [   30.351066] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_sgmii: EXT_SR before 0x9088 after 0x9084, fiber page BMCR before 0x1140 after 0x1140
> > [   30.433236] fsl_dpaa2_eth dpni.1 dpmac7: PHY [i2c:sfp-0:16] driver [Marvell 88E1111] (irq=POLL)
> > 
> > this is a functional link despite the mismatched settings.
> > 
> > ping 192.168.100.2
> > PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
> > 64 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.885 ms
> > 64 bytes from 192.168.100.2: icmp_seq=2 ttl=64 time=0.221 ms
> > 64 bytes from 192.168.100.2: icmp_seq=3 ttl=64 time=0.216 ms
> > 64 bytes from 192.168.100.2: icmp_seq=4 ttl=64 time=0.217 ms
> > 64 bytes from 192.168.100.2: icmp_seq=5 ttl=64 time=0.238 ms
> > ^C
> > --- 192.168.100.2 ping statistics ---
> > 5 packets transmitted, 5 received, 0% packet loss, time 4062ms
> > rtt min/avg/max/mdev = 0.216/0.355/0.885/0.264 ms
> > 
> > 
> > The second problem is that not even *matched* settings work if I turn
> > off BMCR_ANENABLE in the PHY fiber page.
> > 
> > [   30.809869] fsl_dpaa2_eth dpni.1 dpmac7: configuring for inband/sgmii link mode
> > [   30.817936] mdio_bus 0x0000000008c1f000:00: MII_BMCR 0x1140 MII_BMSR 0x9 MII_ADVERTISE 0x1 MII_LPA 0x0 IF_MODE 0x3 // PCS registers at the end of lynx_pcs_config_giga()
> > [   30.917651] fsl_dpaa2_eth dpni.1 dpmac7: PHY driver reported AN inband 0x4 // ignore; m88e1111_validate_an_inband() is hardcoded for this and does not detect BMCR for BASE-X
> > [   30.924571] fsl_dpaa2_eth dpni.1 dpmac7: switched to phy/1000base-x link mode
> > [   30.932441] mdio_bus 0x0000000008c1f000:00: MII_BMCR 0x140 MII_BMSR 0xd MII_ADVERTISE 0x1 MII_LPA 0x0 IF_MODE 0x1
> > [   31.032547] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_1000basex: EXT_SR before 0x9088 after 0x9088, fiber page BMCR before 0x140 after 0x140
> > [   31.117668] fsl_dpaa2_eth dpni.1 dpmac7: PHY [i2c:sfp-0:16] driver [Marvell 88E1111] (irq=POLL)
> > 
> > ping 192.168.100.2
> > PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
> > ^C
> > --- 192.168.100.2 ping statistics ---
> > 4 packets transmitted, 0 received, 100% packet loss, time 3058ms
> > 
> > What's common is that if in-band autoneg is turned off (either forced
> > off or via timeout), 1000BASE-X between the Lynx PCS and the 88E1111
> > simply doesn't work.
> 
> I've just tried an experiment here with my SourcePhotonics module.
> 
> I made m88e1111_validate_an_inband() set the SERIAL_AN_BYPASS bit,
> and then the bit I think you're probably unaware of - the PHY needs
> to be soft-reset in order for that change to take effect. Calling
> genphy_soft_reset() is sufficient.
> 
> Then I made m88e1111_validate_an_inband() return PHY_AN_INBAND_OFF.
> So we now have the PHY setup with BMCR=1140 and EXTSR=9084.
> 
> # ping -I eth4 fe80::222:68ff:fe15:37dd
> ping: Warning: source address might be selected on device other than: eth4
> PING fe80::222:68ff:fe15:37dd(fe80::222:68ff:fe15:37dd) from :: eth4: 56 data bytes
> 64 bytes from fe80::222:68ff:fe15:37dd%eth4: icmp_seq=1 ttl=64 time=0.281 ms
> ^C
> --- fe80::222:68ff:fe15:37dd ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.281/0.281/0.281/0.000 ms
> 
> (yes, it did use eth4's source address, I checked with tcpdump on the
> target machine.)
> 
> So the link appears to be functional. Using a highly modified mii-diag
> tool that allows me to read/write registers on the PHY, if I read the
> EXT_SR register, it now contains:
> 
> Reading 0x001b=0x9884
> 
> and bit 11 being set means the PHY went into bypass mode. In other
> words, it didn't see the SGMII acknowledgement from the MAC and
> decided to bring the link up in bypass mode.
> 
> However, I've just tripped over some information in the 88E1111
> manual which states that in SGMII mode, if bypass mode is used, then
> the PHY will apparently renegotiate on the copper side advertising
> 1000baseT HD and FD only, no pause. So I checked what my link partner
> is seeing, and it was seeing the original advertisement.
> 
> So I then triggered a renegotiate from the partner, and it now shows
> only 1000baseT/Half 1000baseT/Full being advertised by the 88E1111.
> Reading the advertisement register, it still contains 0x0d41, which
> shows pause modes, 100FD, 10FD - so the advertisement register doesn't
> reflect what was actually adfertised in this case.
> 
> Also, presumably, based on this observation, it will only renegotiate
> if the copper side hadn't resolved to gigabit. If correct, what this
> means is that when operating in SGMII mode, the the PHY becomes
> gigabit-only if bypass mode gets used.
> 
> Given this behaviour, the fact that it switches to gigabit only when
> the SGMII side enters bypass mode, I think we should _positively_ be
> disabling inband bypass in SGMII mode. This change in advertisement
> is not what phylib would expect, and I suspect could lead to surprises
> e.g. if phylib was told to advertise non-gigabit speeds only.
> 
> However, I'll try this test with 1000base-X mode tomorrow.
> 
> > > I think a more comprehensive test would be to write the fiber page
> > > BMCR with 0x140 before changing the mode from 1000baseX to SGMII and
> > > see whether the BMCR changes value. My suspicion is it won't, and
> > > the hwcfg_mode only has an effect on the settings in the fiber page
> > > under hardware reset conditions, and mode changes have no effect on
> > > the fiber page.
> > 
> > Confirmed that changes to the EXT_SR register don't cause changes to the
> > MII_BMCR register:
> > 
> > [   28.587838] Marvell 88E1111 i2c:sfp-0:16: m88e1111_config_init_sgmii: EXT_SR before 0x9088 after 0x9084, fiber page BMCR before 0x140 after 0x140
> > 
> > generated by:
> > 
> > static int m88e1111_config_init_sgmii(struct phy_device *phydev)
> > {
> > 	int fiber_bmcr_before, fiber_bmcr_after;
> > 	int ext_sr_before, ext_sr_after;
> > 	int err;
> > 
> > 	ext_sr_before = phy_read(phydev, MII_M1111_PHY_EXT_SR);
> > 	if (ext_sr_before < 0)
> > 		return ext_sr_before;
> > 
> > 	err = phy_modify_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR,
> > 			       BMCR_ANENABLE, 0);
> > 	if (err < 0)
> > 		return err;
> > 
> > 	fiber_bmcr_before = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
> > 	if (fiber_bmcr_before < 0)
> > 		return fiber_bmcr_before;
> > 
> > 	err = m88e1111_config_init_hwcfg_mode(
> > 		phydev,
> > 		MII_M1111_HWCFG_MODE_SGMII_NO_CLK,
> > 		MII_M1111_HWCFG_FIBER_COPPER_AUTO);
> > 	if (err < 0)
> > 		return err;
> > 
> > 	ext_sr_after = phy_read(phydev, MII_M1111_PHY_EXT_SR);
> > 	if (ext_sr_after < 0)
> > 		return ext_sr_after;
> > 
> > 	fiber_bmcr_after = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
> > 	if (fiber_bmcr_after < 0)
> > 		return fiber_bmcr_after;
> > 
> > 	phydev_err(phydev, "%s: EXT_SR before 0x%x after 0x%x, fiber page BMCR before 0x%x after 0x%x\n",
> > 		   __func__, ext_sr_before, ext_sr_after,
> > 		   fiber_bmcr_before, fiber_bmcr_after);
> > 
> > 	/* make sure copper is selected */
> > 	return marvell_set_page(phydev, MII_MARVELL_COPPER_PAGE);
> > }
> 
> Thanks for testing. So that means m88e1111_config_init_sgmii() will not
> enable in-band if it was previously disabled. So we need to check the
> fiber ANENABLE bit and unconditionally return PHY_AN_INBAND_OFF if it is
> clear before evaluating anything else.
> 
> Also, given this behaviour of bypass mode, it seems it would only make
> sense if the PHY were operating in 1000base-X mode, which we don't do
> with SFPs, so maybe it makes no sense to support the ON_TIMEOUT as an
> option right now - and as I say above, maybe we should be focing the
> AN bypass allow bit to be clear in SGMII mode.
> 
> I think maybe Andrew needs to be involved in that last bit though.

Here's an updated patch.
8<===
From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Subject: [PATCH] net: phy: marvell: add validate_an_inband() method

Add the validate_an_inband() method for Marvell 88E1111, the Finisar
version of the 88E1111, and 88E1112.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/marvell.c | 54 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index 3c54d7d0f17f..1d7e00c4d97a 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -669,6 +669,52 @@ static int marvell_config_aneg_fiber(struct phy_device *phydev)
 	return genphy_check_and_restart_aneg(phydev, changed);
 }
 
+static int m88e1111_validate_an_inband(struct phy_device *phydev,
+				       phy_interface_t interface)
+{
+	int hwcfg_mode, extsr, bmcr;
+
+	if (interface != PHY_INTERFACE_MODE_1000BASEX &&
+	    interface != PHY_INTERFACE_MODE_SGMII)
+		return PHY_AN_INBAND_UNKNOWN;
+
+	extsr = phy_read(phydev, MII_M1111_PHY_EXT_SR);
+	bmcr = phy_read_paged(phydev, MII_MARVELL_FIBER_PAGE, MII_BMCR);
+	if (extsr < 0 || bmcr < 0)
+		return PHY_AN_INBAND_UNKNOWN;
+
+	/* We make no efforts to enable the ANENABLE bit when switching mode.
+	 * If this bit is clear, then we will not be using in-band signalling.
+	 */
+	if (!(bmcr & BMCR_ANENABLE))
+		return PHY_AN_INBAND_OFF;
+
+	hwcfg_mode = extsr & MII_M1111_HWCFG_MODE_MASK;
+
+	/* If we are in 1000base-X no-AN hwcfg_mode,
+	 * m88e1111_config_init_1000basex() will allow AN bypass, but does not
+	 * enable AN.
+	 */
+	if (interface == PHY_INTERFACE_MODE_1000BASEX &&
+	    hwcfg_mode == MII_M1111_HWCFG_MODE_COPPER_1000X_NOAN)
+		return PHY_AN_INBAND_ON_TIMEOUT;
+
+	/* Otherwise, we leave the AN enable bit and the AN bypass bit
+	 * alone, so we need to read the registers to determine how the
+	 * MAC facing side of the PHY has been setup by firmware and/or
+	 * hardware reset.
+	 *
+	 * If the AN enable bit is clear, then all in-band signalling
+	 * on the SGMII/1000base-X side is disabled. Otherwise, AN is
+	 * enabled. If the bypass bit is set, AN can complete without
+	 * a response from the partner (MAC).
+	 */
+	if (extsr & MII_M1111_HWCFG_SERIAL_AN_BYPASS)
+		return PHY_AN_INBAND_ON_TIMEOUT;
+
+	return PHY_AN_INBAND_ON;
+}
+
 static int m88e1111_config_aneg(struct phy_device *phydev)
 {
 	int extsr = phy_read(phydev, MII_M1111_PHY_EXT_SR);
@@ -915,7 +961,10 @@ static int m88e1111_config_init_1000basex(struct phy_device *phydev)
 	if (extsr < 0)
 		return extsr;
 
-	/* If using copper mode, ensure 1000BaseX auto-negotiation is enabled */
+	/* If using copper mode, ensure 1000BaseX auto-negotiation is enabled.
+	 * FIXME: this does not actually enable 1000BaseX auto-negotiation if
+	 * it was previously disabled in the Fiber BMCR!
+	 */
 	mode = extsr & MII_M1111_HWCFG_MODE_MASK;
 	if (mode == MII_M1111_HWCFG_MODE_COPPER_1000X_NOAN) {
 		err = phy_modify(phydev, MII_M1111_PHY_EXT_SR,
@@ -2978,6 +3027,7 @@ static struct phy_driver marvell_drivers[] = {
 		.get_stats = marvell_get_stats,
 		.get_tunable = m88e1011_get_tunable,
 		.set_tunable = m88e1011_set_tunable,
+		.validate_an_inband = m88e1111_validate_an_inband,
 	},
 	{
 		.phy_id = MARVELL_PHY_ID_88E1111,
@@ -2999,6 +3049,7 @@ static struct phy_driver marvell_drivers[] = {
 		.get_stats = marvell_get_stats,
 		.get_tunable = m88e1111_get_tunable,
 		.set_tunable = m88e1111_set_tunable,
+		.validate_an_inband = m88e1111_validate_an_inband,
 	},
 	{
 		.phy_id = MARVELL_PHY_ID_88E1111_FINISAR,
@@ -3020,6 +3071,7 @@ static struct phy_driver marvell_drivers[] = {
 		.get_stats = marvell_get_stats,
 		.get_tunable = m88e1111_get_tunable,
 		.set_tunable = m88e1111_set_tunable,
+		.validate_an_inband = m88e1111_validate_an_inband,
 	},
 	{
 		.phy_id = MARVELL_PHY_ID_88E1118,
Russell King (Oracle) Nov. 29, 2022, 1:43 p.m. UTC | #19
On Tue, Nov 29, 2022 at 01:40:58PM +0000, Russell King (Oracle) wrote:
> Here's an updated patch.
> 8<===
> From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
> Subject: [PATCH] net: phy: marvell: add validate_an_inband() method
> 
> Add the validate_an_inband() method for Marvell 88E1111, the Finisar
> version of the 88E1111, and 88E1112.
> 
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

And this is what I was using on top to force a mismatch with bypass
enabled:

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index fbf881cd4a38..c7a0389320cd 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -683,6 +683,14 @@ static int m88e1111_validate_an_inband(struct phy_device *phydev,
 	if (extsr < 0 || bmcr < 0)
 		return PHY_AN_INBAND_UNKNOWN;
 
+	/* <<=== HACK */
+	phydev_info(phydev, "extsr: %04x fiber bmcr: %04x\n", extsr, bmcr);
+	phy_write(phydev, MII_M1111_PHY_EXT_SR, extsr |
+		  MII_M1111_HWCFG_SERIAL_AN_BYPASS);
+	genphy_soft_reset(phydev);
+	return PHY_AN_INBAND_OFF;
+	/* ===>> HACK */
+
 	/* We make no efforts to enable the ANENABLE bit when switching mode.
 	 * If this bit is clear, then we will not be using in-band signalling.
 	 */
Andrew Lunn Nov. 29, 2022, 2:07 p.m. UTC | #20
> > However, I've just tripped over some information in the 88E1111
> > manual which states that in SGMII mode, if bypass mode is used, then
> > the PHY will apparently renegotiate on the copper side advertising
> > 1000baseT HD and FD only, no pause. So I checked what my link partner
> > is seeing, and it was seeing the original advertisement.
> > 
> > So I then triggered a renegotiate from the partner, and it now shows
> > only 1000baseT/Half 1000baseT/Full being advertised by the 88E1111.
> > Reading the advertisement register, it still contains 0x0d41, which
> > shows pause modes, 100FD, 10FD - so the advertisement register doesn't
> > reflect what was actually adfertised in this case.
> > 
> > Also, presumably, based on this observation, it will only renegotiate
> > if the copper side hadn't resolved to gigabit. If correct, what this
> > means is that when operating in SGMII mode, the the PHY becomes
> > gigabit-only if bypass mode gets used.
> > 
> > Given this behaviour, the fact that it switches to gigabit only when
> > the SGMII side enters bypass mode, I think we should _positively_ be
> > disabling inband bypass in SGMII mode. This change in advertisement
> > is not what phylib would expect, and I suspect could lead to surprises
> > e.g. if phylib was told to advertise non-gigabit speeds only.

Sorry, i've not been reading this thread.

This behaviour is not very nice. So i agree, it should be avoided if
possible.

> > Thanks for testing. So that means m88e1111_config_init_sgmii() will not
> > enable in-band if it was previously disabled. So we need to check the
> > fiber ANENABLE bit and unconditionally return PHY_AN_INBAND_OFF if it is
> > clear before evaluating anything else.
> > 
> > Also, given this behaviour of bypass mode, it seems it would only make
> > sense if the PHY were operating in 1000base-X mode, which we don't do
> > with SFPs, so maybe it makes no sense to support the ON_TIMEOUT as an
> > option right now - and as I say above, maybe we should be focing the
> > AN bypass allow bit to be clear in SGMII mode.
> > 
> > I think maybe Andrew needs to be involved in that last bit though.

As you say, 1000Base-X does not make much sense for a copper
SFP. SGMII does odd things, so just not supporting ON_TIMEOUT seems
reasonable.

	Andrew
diff mbox series

Patch

diff --git a/drivers/net/phy/bcm84881.c b/drivers/net/phy/bcm84881.c
index 9717a1626f3f..ab03acee77ea 100644
--- a/drivers/net/phy/bcm84881.c
+++ b/drivers/net/phy/bcm84881.c
@@ -223,6 +223,15 @@  static int bcm84881_read_status(struct phy_device *phydev)
 	return genphy_c45_read_mdix(phydev);
 }
 
+/* The Broadcom BCM84881 in the Methode DM7052 is unable to provide a SGMII
+ * or 802.3z control word, so inband will not work.
+ */
+static int bcm84881_validate_an_inband(struct phy_device *phydev,
+				       phy_interface_t interface)
+{
+	return PHY_AN_INBAND_OFF;
+}
+
 static struct phy_driver bcm84881_drivers[] = {
 	{
 		.phy_id		= 0xae025150,
@@ -234,6 +243,7 @@  static struct phy_driver bcm84881_drivers[] = {
 		.config_aneg	= bcm84881_config_aneg,
 		.aneg_done	= bcm84881_aneg_done,
 		.read_status	= bcm84881_read_status,
+		.validate_an_inband = bcm84881_validate_an_inband,
 	},
 };
 
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 40b7e730fb33..bf2a5ebfc4f4 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -2811,15 +2811,6 @@  int phylink_speed_up(struct phylink *pl)
 }
 EXPORT_SYMBOL_GPL(phylink_speed_up);
 
-/* The Broadcom BCM84881 in the Methode DM7052 is unable to provide a SGMII
- * or 802.3z control word, so inband will not work.
- */
-static bool phylink_phy_no_inband(struct phy_device *phy)
-{
-	return phy->is_c45 &&
-		(phy->c45_ids.device_ids[1] & 0xfffffff0) == 0xae025150;
-}
-
 static void phylink_sfp_attach(void *upstream, struct sfp_bus *bus)
 {
 	struct phylink *pl = upstream;
@@ -2941,14 +2932,10 @@  static int phylink_sfp_config_phy(struct phylink *pl, struct phy_device *phy)
 	 */
 	ret = phy_validate_an_inband(phy, iface);
 	if (ret == PHY_AN_INBAND_UNKNOWN) {
-		if (phylink_phy_no_inband(phy))
-			mode = MLO_AN_PHY;
-		else
-			mode = MLO_AN_INBAND;
+		mode = MLO_AN_INBAND;
 
 		phylink_dbg(pl,
-			    "PHY driver does not report in-band autoneg capability, assuming %s\n",
-			    phylink_autoneg_inband(mode) ? "true" : "false");
+			    "PHY driver does not report in-band autoneg capability, assuming true\n");
 	} else if (ret & PHY_AN_INBAND_ON) {
 		mode = MLO_AN_INBAND;
 	} else {