[net-next] net: phy: marvell10g: add downshift tunable support

Message ID E1mREEN-001yxo-Da@rmk-PC.armlinux.org.uk (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/cover_letter             success  Link
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Clearly marked for net-next
netdev/subject_prefix           success  Link
netdev/cc_maintainers           warning  1 maintainers not CCed: linux@armlinux.org.uk
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               fail     ERROR: space required after that ',' (ctx:VxV)
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/header_inline            success  Link

Commit Message

Russell King (Oracle) Sept. 17, 2021, 1:48 p.m. UTC
Add support for the downshift tunable for the Marvell 88x3310 PHY.
Downshift is only usable with firmware 0.3.5.0 and later.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/marvell10g.c | 101 ++++++++++++++++++++++++++++++++++-
 1 file changed, 100 insertions(+), 1 deletion(-)
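
For readers unfamiliar with the firmware gate mentioned in the commit
message, here is a minimal standalone sketch of how a four-part version
such as 0.3.5.0 can be packed into one 32-bit value and compared. The
names fw_version() and fw_has_downshift() are hypothetical and exist only
for illustration; the patch itself uses the MV_VERSION() macro and
mv3310_has_downshift().

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack a.b.c.d into one 32-bit value, most significant part first. */
static inline uint32_t fw_version(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return (uint32_t)a << 24 | (uint32_t)b << 16 | (uint32_t)c << 8 | d;
}

/*
 * Hypothetical helper mirroring the check in the patch: downshift is
 * treated as usable only with firmware 0.3.5.0 or later.
 */
static inline bool fw_has_downshift(uint32_t firmware_ver)
{
	return firmware_ver >= fw_version(0, 3, 5, 0);
}
```

Because the most significant part sits in the top byte, a plain integer
comparison orders versions correctly, so 0.3.4.255 compares lower than
0.3.5.0.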

Comments

Andrew Lunn Sept. 17, 2021, 2:45 p.m. UTC | #1
> +static int mv3310_set_downshift(struct phy_device *phydev, u8 ds)
> +{
> +	struct mv3310_priv *priv = dev_get_drvdata(&phydev->mdio.dev);
> +	u16 val;
> +	int err;
> +
> +	if (!priv->has_downshift)
> +		return -EOPNOTSUPP;
> +
> +	if (ds == DOWNSHIFT_DEV_DISABLE)
> +		return phy_clear_bits_mmd(phydev, MDIO_MMD_PCS, MV_PCS_DSC1,
> +					  MV_PCS_DSC1_ENABLE);
> +
> +	/* FIXME: The default is disabled, so should we disable? */
> +	if (ds == DOWNSHIFT_DEV_DEFAULT_COUNT)
> +		ds = 2;

Hi Russell

Rather than a FIXME, maybe just document that the hardware default is
disabled, which does not make too much sense, so default to 2 attempts?

      Andrew
Russell King (Oracle) Sept. 17, 2021, 2:58 p.m. UTC | #2
On Fri, Sep 17, 2021 at 04:45:03PM +0200, Andrew Lunn wrote:
> > +static int mv3310_set_downshift(struct phy_device *phydev, u8 ds)
> > +{
> > +	struct mv3310_priv *priv = dev_get_drvdata(&phydev->mdio.dev);
> > +	u16 val;
> > +	int err;
> > +
> > +	if (!priv->has_downshift)
> > +		return -EOPNOTSUPP;
> > +
> > +	if (ds == DOWNSHIFT_DEV_DISABLE)
> > +		return phy_clear_bits_mmd(phydev, MDIO_MMD_PCS, MV_PCS_DSC1,
> > +					  MV_PCS_DSC1_ENABLE);
> > +
> > +	/* FIXME: The default is disabled, so should we disable? */
> > +	if (ds == DOWNSHIFT_DEV_DEFAULT_COUNT)
> > +		ds = 2;
> 
> Hi Russell
> 
> Rather than a FIXME, maybe just document that the hardware default is
> disabled, which does not make too much sense, so default to 2 attempts?

Sadly, the downshift parameters aren't documented at all in the kernel,
and one has to dig into the ethtool source to find out what they mean:

DOWNSHIFT_DEV_DEFAULT_COUNT -
	ethtool --set-phy-tunable ethN downshift on
DOWNSHIFT_DEV_DISABLE -
	ethtool --set-phy-tunable ethN downshift off
otherwise:
	ethtool --set-phy-tunable ethN downshift count N

This really needs to be documented somewhere in the kernel.
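
As a rough illustration of the mapping above, the three ethtool forms
arrive at the driver as a single u8. The sketch below repeats the two
constants so that it builds on its own; the enum and ds_classify() are
hypothetical names used only here, not kernel API.

```c
#include <stdint.h>

/* Repeated from the ethtool UAPI header so the sketch builds standalone. */
#define DOWNSHIFT_DEV_DEFAULT_COUNT	0xff
#define DOWNSHIFT_DEV_DISABLE		0

enum ds_request {
	DS_OFF,		/* ethtool ... downshift off */
	DS_ON_DEFAULT,	/* ethtool ... downshift on */
	DS_ON_COUNT,	/* ethtool ... downshift count N */
};

/* Hypothetical helper: classify the u8 passed to a set_downshift(). */
static enum ds_request ds_classify(uint8_t ds)
{
	if (ds == DOWNSHIFT_DEV_DISABLE)
		return DS_OFF;
	if (ds == DOWNSHIFT_DEV_DEFAULT_COUNT)
		return DS_ON_DEFAULT;
	return DS_ON_COUNT;	/* ds itself is the requested count */
}
```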
Russell King (Oracle) Sept. 22, 2021, noon UTC | #3
On Fri, Sep 17, 2021 at 03:58:01PM +0100, Russell King (Oracle) wrote:
> On Fri, Sep 17, 2021 at 04:45:03PM +0200, Andrew Lunn wrote:
> > > +static int mv3310_set_downshift(struct phy_device *phydev, u8 ds)
> > > +{
> > > +	struct mv3310_priv *priv = dev_get_drvdata(&phydev->mdio.dev);
> > > +	u16 val;
> > > +	int err;
> > > +
> > > +	if (!priv->has_downshift)
> > > +		return -EOPNOTSUPP;
> > > +
> > > +	if (ds == DOWNSHIFT_DEV_DISABLE)
> > > +		return phy_clear_bits_mmd(phydev, MDIO_MMD_PCS, MV_PCS_DSC1,
> > > +					  MV_PCS_DSC1_ENABLE);
> > > +
> > > +	/* FIXME: The default is disabled, so should we disable? */
> > > +	if (ds == DOWNSHIFT_DEV_DEFAULT_COUNT)
> > > +		ds = 2;
> > 
> > Hi Russell
> > 
> > Rather than a FIXME, maybe just document that the hardware default is
> > disabled, which does not make too much sense, so default to 2 attempts?
> 
> Sadly, the downshift parameters aren't documented at all in the kernel,
> and one has to dig into the ethtool source to find out what they mean:
> 
> DOWNSHIFT_DEV_DEFAULT_COUNT -
> 	ethtool --set-phy-tunable ethN downshift on
> DOWNSHIFT_DEV_DISABLE -
> 	ethtool --set-phy-tunable ethN downshift off
> otherwise:
> 	ethtool --set-phy-tunable ethN downshift count N
> 
> This really needs to be documented somewhere in the kernel.

I was hoping that this would cause further discussion on what the
exact meaning of "DOWNSHIFT_DEV_DEFAULT_COUNT" is. Clearly, it's
meant to turn downshift on, but what does "default" actually mean?

If we define "default" as "whatever the hardware defaults to" then
for this phy, that would be turning off downshift.

So, should we rename "DOWNSHIFT_DEV_DEFAULT_COUNT" to be
"DOWNSHIFT_DEV_ENABLE" rather than trying to imply that it's
some kind of default that may need to be made up?
Andrew Lunn Sept. 22, 2021, 11:56 p.m. UTC | #4
On Wed, Sep 22, 2021 at 01:00:31PM +0100, Russell King (Oracle) wrote:
> On Fri, Sep 17, 2021 at 03:58:01PM +0100, Russell King (Oracle) wrote:
> > On Fri, Sep 17, 2021 at 04:45:03PM +0200, Andrew Lunn wrote:
> > > > +static int mv3310_set_downshift(struct phy_device *phydev, u8 ds)
> > > > +{
> > > > +	struct mv3310_priv *priv = dev_get_drvdata(&phydev->mdio.dev);
> > > > +	u16 val;
> > > > +	int err;
> > > > +
> > > > +	if (!priv->has_downshift)
> > > > +		return -EOPNOTSUPP;
> > > > +
> > > > +	if (ds == DOWNSHIFT_DEV_DISABLE)
> > > > +		return phy_clear_bits_mmd(phydev, MDIO_MMD_PCS, MV_PCS_DSC1,
> > > > +					  MV_PCS_DSC1_ENABLE);
> > > > +
> > > > +	/* FIXME: The default is disabled, so should we disable? */
> > > > +	if (ds == DOWNSHIFT_DEV_DEFAULT_COUNT)
> > > > +		ds = 2;
> > > 
> > > Hi Russell
> > > 
> > > Rather than a FIXME, maybe just document that the hardware default is
> > > disabled, which does not make too much sense, so default to 2 attempts?
> > 
> > Sadly, the downshift parameters aren't documented at all in the kernel,
> > and one has to dig into the ethtool source to find out what they mean:
> > 
> > DOWNSHIFT_DEV_DEFAULT_COUNT -
> > 	ethtool --set-phy-tunable ethN downshift on
> > DOWNSHIFT_DEV_DISABLE -
> > 	ethtool --set-phy-tunable ethN downshift off
> > otherwise:
> > 	ethtool --set-phy-tunable ethN downshift count N
> > 
> > This really needs to be documented somewhere in the kernel.
> 
> I was hoping that this would cause further discussion on what the
> exact meaning of "DOWNSHIFT_DEV_DEFAULT_COUNT" is. Clearly, it's
> meant to turn downshift on, but what does "default" actually mean?

I guess this comes from the fact that every other PHY has a bit to enable
downshift, and a counter for saying how many attempts to make. And
the counter has a documented default value.

> If we define "default" as "whatever the hardware defaults to" then
> for this phy, that would be turning off downshift.

Which does not make sense.

> So, should we rename "DOWNSHIFT_DEV_DEFAULT_COUNT" to be
> "DOWNSHIFT_DEV_ENABLE" rather than trying to imply that it's
> some kind of default that may need to be made up?

The value is made up anyway. Normally the silicon vendor picks a
value, and that is what you get after a reset. Does it make that much
difference in this case if you pick the value, rather than Marvell?
None of this is standardised as far as I know, there is no correct
value.

	Andrew
Russell King (Oracle) Sept. 23, 2021, 10:15 a.m. UTC | #5
On Thu, Sep 23, 2021 at 01:56:22AM +0200, Andrew Lunn wrote:
> On Wed, Sep 22, 2021 at 01:00:31PM +0100, Russell King (Oracle) wrote:
> > On Fri, Sep 17, 2021 at 03:58:01PM +0100, Russell King (Oracle) wrote:
> > > On Fri, Sep 17, 2021 at 04:45:03PM +0200, Andrew Lunn wrote:
> > > > > +static int mv3310_set_downshift(struct phy_device *phydev, u8 ds)
> > > > > +{
> > > > > +	struct mv3310_priv *priv = dev_get_drvdata(&phydev->mdio.dev);
> > > > > +	u16 val;
> > > > > +	int err;
> > > > > +
> > > > > +	if (!priv->has_downshift)
> > > > > +		return -EOPNOTSUPP;
> > > > > +
> > > > > +	if (ds == DOWNSHIFT_DEV_DISABLE)
> > > > > +		return phy_clear_bits_mmd(phydev, MDIO_MMD_PCS, MV_PCS_DSC1,
> > > > > +					  MV_PCS_DSC1_ENABLE);
> > > > > +
> > > > > +	/* FIXME: The default is disabled, so should we disable? */
> > > > > +	if (ds == DOWNSHIFT_DEV_DEFAULT_COUNT)
> > > > > +		ds = 2;
> > > > 
> > > > Hi Russell
> > > > 
> > > > Rather than a FIXME, maybe just document that the hardware default is
> > > > disabled, which does not make too much sense, so default to 2 attempts?
> > > 
> > > Sadly, the downshift parameters aren't documented at all in the kernel,
> > > and one has to dig into the ethtool source to find out what they mean:
> > > 
> > > DOWNSHIFT_DEV_DEFAULT_COUNT -
> > > 	ethtool --set-phy-tunable ethN downshift on
> > > DOWNSHIFT_DEV_DISABLE -
> > > 	ethtool --set-phy-tunable ethN downshift off
> > > otherwise:
> > > 	ethtool --set-phy-tunable ethN downshift count N
> > > 
> > > This really needs to be documented somewhere in the kernel.
> > 
> > I was hoping that this would cause further discussion on what the
> > exact meaning of "DOWNSHIFT_DEV_DEFAULT_COUNT" is. Clearly, it's
> > meant to turn downshift on, but what does "default" actually mean?
> 
> I guess this comes from the fact that every other PHY has a bit to enable
> downshift, and a counter for saying how many attempts to make. And
> the counter has a documented default value.

Having looked at the data sheet again, the same is actually true here,
but the default settings are for a downshift of 2 but with the enable
bit in disabled mode.

Do other PHYs default to having the enable bit set? It seems not, so
there is no difference here.

Marvell 88E151x documentation says that the downshift counter is set
to 3, which means it attempts 4 times (the value programmed into the
register is one less than the number of attempts, just like 88x3310.)
However, whenever the PHY's config_init() is called, the downshift is
force-set to 3 attempts, which sets the register value to 2. So it
seems that's a randomly picked value that's different from the
manufacturer default. I'm guessing the actual default depends on the
exact model of PHY - indeed, looking at 88E1111, the default appears
to be 7 attempts with a register value of 6.
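
To make the "register value is one less than the number of attempts"
convention concrete, here is a small standalone sketch. It assumes the
88x3310 layout from the patch (enable in bit 9, a three-bit 10GBASE-T
count in bits 8:6); dsc1_encode() and dsc1_decode() are hypothetical
names and only one of the per-speed fields is handled.

```c
#include <stdint.h>

#define DSC1_ENABLE		(1u << 9)	/* MV_PCS_DSC1_ENABLE in the patch */
#define DSC1_10GBT		0x01c0u		/* MV_PCS_DSC1_10GBT in the patch */
#define DSC1_10GBT_SHIFT	6

/* Encode 1..8 attempts into the 3-bit field as (attempts - 1). */
static uint16_t dsc1_encode(uint8_t attempts)
{
	uint16_t field = (uint16_t)(attempts - 1) << DSC1_10GBT_SHIFT;

	return (uint16_t)(DSC1_ENABLE | (field & DSC1_10GBT));
}

/* Decode back to attempts; 0 means the enable bit is clear. */
static uint8_t dsc1_decode(uint16_t reg)
{
	if (!(reg & DSC1_ENABLE))
		return 0;

	return (uint8_t)(1 + ((reg & DSC1_10GBT) >> DSC1_10GBT_SHIFT));
}
```

With this layout a register holding a count field of 1 but with the
enable bit clear decodes as disabled, which is the reset state described
above.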

Looking deeper, DOWNSHIFT_DEV_DEFAULT_COUNT is not handled at all.
This has a value of 255, and m88e1111_set_downshift() will error that
out:

        if (cnt > MII_M1111_PHY_EXT_CR_DOWNSHIFT_MAX)
                return -E2BIG;

where "MII_M1111_PHY_EXT_CR_DOWNSHIFT_MAX" is 8.
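
A minimal standalone sketch of the problem and of one possible way
around it: translate the "on" request into a driver-chosen count before
the range check. DS_DRIVER_DEFAULT and both function names are
hypothetical, and the value 3 is only an example.

```c
#include <errno.h>
#include <stdint.h>

#define DOWNSHIFT_DEV_DEFAULT_COUNT	0xff
#define DS_MAX_ATTEMPTS			8	/* the limit quoted above */
#define DS_DRIVER_DEFAULT		3	/* hypothetical driver default */

/* Range check only: the "downshift on" request (255) is rejected. */
static int ds_validate_naive(uint8_t cnt)
{
	if (cnt > DS_MAX_ATTEMPTS)
		return -E2BIG;

	return cnt;
}

/* Map the "downshift on" request to a default before range checking. */
static int ds_validate(uint8_t cnt)
{
	if (cnt == DOWNSHIFT_DEV_DEFAULT_COUNT)
		cnt = DS_DRIVER_DEFAULT;

	return ds_validate_naive(cnt);
}
```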

AR8035 on the other hand documents that "smartspeed" is enabled by
default, with a default of 5 attempts, which is exactly what the
driver implements.

The adin PHY driver is similar to the Marvell case.
adin_set_downshift() doesn't handle DOWNSHIFT_DEV_DEFAULT_COUNT,
erroring out if it is used. I don't have the datasheet to compare with,
but the code looks somewhat suspicious. It has a separate enable field
and a three bit counter. DOWNSHIFT_DEV_DISABLE is zero, and the logic
in the function means that the three-bit field can never be set to
zero. If programming a zero value were to disable downshift, there
would be no need for a separate ADIN1300_DOWNSPEEDS_EN bit.

aqr107_set_downshift(), dp83867_set_downshift() and
dp83869_set_downshift() are similarly buggy, rejecting
DOWNSHIFT_DEV_DEFAULT_COUNT.

bcm54140_set_downshift() looks like it is probably correct.

So, it seems most implementations are sadly buggy in one way or
another. Two implementations look like they're correct, one is probably
a compromise of various default values from the manufacturer, and four
reject DOWNSHIFT_DEV_DEFAULT_COUNT probably indicating that the feature
addition was never tested with "ethtool --set-phy-tunable ethN
downshift on".

So, some further questions: should we be calling the set_downshift
implementation from the .config_init as the Marvell driver does to
ensure that downshift is correctly enabled? Is .config_init really
the best place to do this? So many things with Marvell PHYs seem to
require a reset, which bounces the link. So if one brings up the
network interface, then sets EEE (you get a link bounce) and then
sets downshift, you get another link bounce. Each link bounce takes
more than a second, which means the more features that need to be
configured after bringing the interface up, the longer it takes for
the network to become usable. Note that Marvell downshift will cause
the link to bounce even if the values programmed into the register
were already there - there is no check to see if we actually changed
anything before calling genphy_soft_reset() which seems suboptimal
given that we have phy_modify_changed() which can tell us that.
Andrew Lunn Sept. 26, 2021, 3:24 p.m. UTC | #6
> So, some further questions: should we be calling the set_downshift
> implementation from the .config_init as the Marvell driver does to
> ensure that downshift is correctly enabled?

The bootloader might have messed it up, so it does not seem unreasonable
to set it somewhere at startup.

> Is .config_init really
> the best place to do this? So many things with Marvell PHYs seem to
> require a reset, which bounces the link. So if one brings up the
> network interface, then sets EEE (you get a link bounce) and then
> sets downshift, you get another link bounce. Each link bounce takes
> more than a second, which means the more features that need to be
> configured after bringing the interface up, the longer it takes for
> the network to become usable. Note that Marvell downshift will cause
> the link to bounce even if the values programmed into the register
> were already there - there is no check to see if we actually changed
> anything before calling genphy_soft_reset() which seems suboptimal
> given that we have phy_modify_changed() which can tell us that.

This can clearly be optimized. Add a test if the values are being
changed. Skip the reset if it is being done as part of .config_init
and there is a guarantee a later stage will perform the reset, etc.

    Andrew
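
The "only reset if something changed" idea can be sketched outside the
kernel with a simulated register. Everything below is hypothetical
scaffolding: struct fake_phy stands in for the PHY,
fake_modify_changed() imitates the changed/unchanged result that
phy_modify_changed() reports, and the resets counter stands in for a
call to genphy_soft_reset().

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated PHY: one 16-bit register and a count of soft resets issued. */
struct fake_phy {
	uint16_t reg;
	unsigned int resets;
};

/* Read-modify-write that reports whether the register value changed. */
static bool fake_modify_changed(struct fake_phy *phy, uint16_t mask,
				uint16_t set)
{
	uint16_t updated = (uint16_t)((phy->reg & ~mask) | set);

	if (updated == phy->reg)
		return false;

	phy->reg = updated;
	return true;
}

/* Write the downshift bits, bouncing the link only when needed. */
static void fake_set_downshift(struct fake_phy *phy, uint16_t mask,
			       uint16_t val)
{
	if (fake_modify_changed(phy, mask, val))
		phy->resets++;	/* stands in for genphy_soft_reset() */
}
```

Calling fake_set_downshift() twice with the same value bumps the reset
counter only once, which is the behaviour being asked for.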
Patch

diff --git a/drivers/net/phy/marvell10g.c b/drivers/net/phy/marvell10g.c
index bd310e8d5e43..dffd71def9e3 100644
--- a/drivers/net/phy/marvell10g.c
+++ b/drivers/net/phy/marvell10g.c
@@ -22,6 +22,7 @@ 
  * If both the fiber and copper ports are connected, the first to gain
  * link takes priority and the other port is completely locked out.
  */
+#include <linux/bitfield.h>
 #include <linux/ctype.h>
 #include <linux/delay.h>
 #include <linux/hwmon.h>
@@ -33,6 +34,8 @@ 
 #define MV_PHY_ALASKA_NBT_QUIRK_MASK	0xfffffffe
 #define MV_PHY_ALASKA_NBT_QUIRK_REV	(MARVELL_PHY_ID_88X3310 | 0xa)
 
+#define MV_VERSION(a, b, c, d) ((a) << 24 | (b) << 16 | (c) << 8 | (d))
+
 enum {
 	MV_PMA_FW_VER0		= 0xc011,
 	MV_PMA_FW_VER1		= 0xc012,
@@ -62,6 +65,15 @@  enum {
 	MV_PCS_CSCR1_MDIX_MDIX	= 0x0020,
 	MV_PCS_CSCR1_MDIX_AUTO	= 0x0060,
 
+	MV_PCS_DSC1		= 0x8003,
+	MV_PCS_DSC1_ENABLE	= BIT(9),
+	MV_PCS_DSC1_10GBT	= 0x01c0,
+	MV_PCS_DSC1_1GBR	= 0x0038,
+	MV_PCS_DSC1_100BTX	= 0x0007,
+	MV_PCS_DSC2		= 0x8004,
+	MV_PCS_DSC2_2P5G	= 0xf000,
+	MV_PCS_DSC2_5G		= 0x0f00,
+
 	MV_PCS_CSSR1		= 0x8008,
 	MV_PCS_CSSR1_SPD1_MASK	= 0xc000,
 	MV_PCS_CSSR1_SPD1_SPD2	= 0xc000,
@@ -125,6 +137,7 @@  enum {
 };
 
 struct mv3310_chip {
+	bool (*has_downshift)(struct phy_device *phydev);
 	void (*init_supported_interfaces)(unsigned long *mask);
 	int (*get_mactype)(struct phy_device *phydev);
 	int (*init_interface)(struct phy_device *phydev, int mactype);
@@ -138,6 +151,7 @@  struct mv3310_priv {
 	DECLARE_BITMAP(supported_interfaces, PHY_INTERFACE_MODE_MAX);
 
 	u32 firmware_ver;
+	bool has_downshift;
 	bool rate_match;
 	phy_interface_t const_interface;
 
@@ -330,6 +344,65 @@  static int mv3310_reset(struct phy_device *phydev, u32 unit)
 					 5000, 100000, true);
 }
 
+static int mv3310_get_downshift(struct phy_device *phydev, u8 *ds)
+{
+	struct mv3310_priv *priv = dev_get_drvdata(&phydev->mdio.dev);
+	int val;
+
+	if (!priv->has_downshift)
+		return -EOPNOTSUPP;
+
+	val = phy_read_mmd(phydev, MDIO_MMD_PCS, MV_PCS_DSC1);
+	if (val < 0)
+		return val;
+
+	if (val & MV_PCS_DSC1_ENABLE)
+		/* assume that all fields are the same */
+		*ds = 1 + FIELD_GET(MV_PCS_DSC1_10GBT, (u16)val);
+	else
+		*ds = DOWNSHIFT_DEV_DISABLE;
+
+	return 0;
+}
+
+static int mv3310_set_downshift(struct phy_device *phydev, u8 ds)
+{
+	struct mv3310_priv *priv = dev_get_drvdata(&phydev->mdio.dev);
+	u16 val;
+	int err;
+
+	if (!priv->has_downshift)
+		return -EOPNOTSUPP;
+
+	if (ds == DOWNSHIFT_DEV_DISABLE)
+		return phy_clear_bits_mmd(phydev, MDIO_MMD_PCS, MV_PCS_DSC1,
+					  MV_PCS_DSC1_ENABLE);
+
+	/* FIXME: The default is disabled, so should we disable? */
+	if (ds == DOWNSHIFT_DEV_DEFAULT_COUNT)
+		ds = 2;
+
+	if (ds > 8)
+		return -E2BIG;
+
+	ds -= 1;
+	val = FIELD_PREP(MV_PCS_DSC2_2P5G, ds);
+	val |= FIELD_PREP(MV_PCS_DSC2_5G, ds);
+	err = phy_modify_mmd(phydev, MDIO_MMD_PCS, MV_PCS_DSC2,
+			     MV_PCS_DSC2_2P5G | MV_PCS_DSC2_5G, val);
+	if (err < 0)
+		return err;
+
+	val = MV_PCS_DSC1_ENABLE;
+	val |= FIELD_PREP(MV_PCS_DSC1_10GBT, ds);
+	val |= FIELD_PREP(MV_PCS_DSC1_1GBR, ds);
+	val |= FIELD_PREP(MV_PCS_DSC1_100BTX, ds);
+
+	return phy_modify_mmd(phydev, MDIO_MMD_PCS, MV_PCS_DSC1,
+			      MV_PCS_DSC1_ENABLE | MV_PCS_DSC1_10GBT |
+			      MV_PCS_DSC1_1GBR | MV_PCS_DSC1_100BTX, val);
+}
+
 static int mv3310_get_edpd(struct phy_device *phydev, u16 *edpd)
 {
 	int val;
@@ -448,6 +521,9 @@  static int mv3310_probe(struct phy_device *phydev)
 		    priv->firmware_ver >> 24, (priv->firmware_ver >> 16) & 255,
 		    (priv->firmware_ver >> 8) & 255, priv->firmware_ver & 255);
 
+	if (chip->has_downshift)
+		priv->has_downshift = chip->has_downshift(phydev);
+
 	/* Powering down the port when not in use saves about 600mW */
 	ret = mv3310_power_down(phydev);
 	if (ret)
@@ -616,7 +692,16 @@  static int mv3310_config_init(struct phy_device *phydev)
 	}
 
 	/* Enable EDPD mode - saving 600mW */
-	return mv3310_set_edpd(phydev, ETHTOOL_PHY_EDPD_DFLT_TX_MSECS);
+	err = mv3310_set_edpd(phydev, ETHTOOL_PHY_EDPD_DFLT_TX_MSECS);
+	if (err)
+		return err;
+
+	/* Allow downshift */
+	err = mv3310_set_downshift(phydev, DOWNSHIFT_DEV_DEFAULT_COUNT);
+	if (err && err != -EOPNOTSUPP)
+		return err;
+
+	return 0;
 }
 
 static int mv3310_get_features(struct phy_device *phydev)
@@ -886,6 +971,8 @@  static int mv3310_get_tunable(struct phy_device *phydev,
 			      struct ethtool_tunable *tuna, void *data)
 {
 	switch (tuna->id) {
+	case ETHTOOL_PHY_DOWNSHIFT:
+		return mv3310_get_downshift(phydev, data);
 	case ETHTOOL_PHY_EDPD:
 		return mv3310_get_edpd(phydev, data);
 	default:
@@ -897,6 +984,8 @@  static int mv3310_set_tunable(struct phy_device *phydev,
 			      struct ethtool_tunable *tuna, const void *data)
 {
 	switch (tuna->id) {
+	case ETHTOOL_PHY_DOWNSHIFT:
+		return mv3310_set_downshift(phydev, *(u8 *)data);
 	case ETHTOOL_PHY_EDPD:
 		return mv3310_set_edpd(phydev, *(u16 *)data);
 	default:
@@ -904,6 +993,14 @@  static int mv3310_set_tunable(struct phy_device *phydev,
 	}
 }
 
+static bool mv3310_has_downshift(struct phy_device *phydev)
+{
+	struct mv3310_priv *priv = dev_get_drvdata(&phydev->mdio.dev);
+
+	/* Fails to downshift with firmware older than v0.3.5.0 */
+	return priv->firmware_ver >= MV_VERSION(0, 3, 5, 0);
+}
+
 static void mv3310_init_supported_interfaces(unsigned long *mask)
 {
 	__set_bit(PHY_INTERFACE_MODE_SGMII, mask);
@@ -943,6 +1040,7 @@  static void mv2111_init_supported_interfaces(unsigned long *mask)
 }
 
 static const struct mv3310_chip mv3310_type = {
+	.has_downshift = mv3310_has_downshift,
 	.init_supported_interfaces = mv3310_init_supported_interfaces,
 	.get_mactype = mv3310_get_mactype,
 	.init_interface = mv3310_init_interface,
@@ -953,6 +1051,7 @@  static const struct mv3310_chip mv3310_type = {
 };
 
 static const struct mv3310_chip mv3340_type = {
+	.has_downshift = mv3310_has_downshift,
 	.init_supported_interfaces = mv3340_init_supported_interfaces,
 	.get_mactype = mv3310_get_mactype,
 	.init_interface = mv3340_init_interface,