diff mbox series

[net-next] net: phy: microchip_t1: SQI support for LAN887x

Message ID 20240904102606.136874-1-tarun.alle@microchip.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net-next] net: phy: microchip_t1: SQI support for LAN887x | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 16 this patch: 16
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 9 of 9 maintainers
netdev/build_clang success Errors and warnings before: 16 this patch: 16
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 16 this patch: 16
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 168 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-09-04--15-00 (tests: 718)

Commit Message

Tarun Alle Sept. 4, 2024, 10:26 a.m. UTC
From: Tarun Alle <Tarun.Alle@microchip.com>

Add support for measuring Signal Quality Index for LAN887x T1 PHY.
Signal Quality Index (SQI) is measure of Link Channel Quality from
0 to 7, with 7 as the best. By default, a link loss event shall
indicate an SQI of 0.

Signed-off-by: Tarun Alle <Tarun.Alle@microchip.com>
---
 drivers/net/phy/microchip_t1.c | 144 +++++++++++++++++++++++++++++++++
 1 file changed, 144 insertions(+)

Comments

Andrew Lunn Sept. 5, 2024, 12:38 p.m. UTC | #1
> +	/* Get 200 SQI raw readings */
> +	for (int i = 0; i < 200; i++) {

Please replace all the hard coded 200 with ARRAY_SIZE(rawtable). That
makes it easier to tune the size of the table without causing buffer
overrun bugs.

> +		rc = phy_write_mmd(phydev, MDIO_MMD_VEND1,
> +				   LAN887X_POKE_PEEK_100,
> +				   LAN887X_POKE_PEEK_100_EN);
> +		if (rc < 0)
> +			return rc;
> +
> +		rc = phy_read_mmd(phydev, MDIO_MMD_VEND1,
> +				  LAN887X_SQI_MSE_100);
> +		if (rc < 0)
> +			return rc;
> +
> +		rawtable[i] = rc;
> +		rc = genphy_c45_read_link(phydev);
> +		if (rc < 0)
> +			return rc;
> +
> +		if (!phydev->link)
> +			return -ENETDOWN;
> +	}

How long does this take?

genphy_c45_read_link() takes a few MDIO transaction, plus the two you
see here. So maybe 1000 MDIO bus transactions? Which could be
3000-4000 if it needs to use C45 over C22.

Do you have any data on the accuracy, with say 10, 20, 40, 80, 160
samples?

Can the genphy_c45_read_link() be moved out of the loop? If the link
is lost, is the sample totally random, or does it have a well defined
value? Looking at the link status every iteration, rather than before
and after collecting the samples, you are trying to protect against
the link going down and back up again. If it is taking a couple of
seconds to collect all the samples, i suppose that is possible, but if
its 50ms, do you really have to worry?

> +static int lan887x_get_sqi(struct phy_device *phydev)
> +{
> +	int rc, val;
> +
> +	if (phydev->speed != SPEED_1000 &&
> +	    phydev->speed != SPEED_100) {
> +		return -EINVAL;
> +	}

Can that happen? Does the PHY support SPEED_10? Or are you trying to
protect against SPEED_UNKOWN because the link is down? ENETDOWN might
be more appropriate that EINVAL.

	Andrew
Tarun Alle Sept. 6, 2024, 5:45 a.m. UTC | #2
Hi Andrew,
Thanks for the review comments.

> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Thursday, September 5, 2024 6:09 PM
> To: Tarun Alle - I68638 <Tarun.Alle@microchip.com>
> Cc: Arun Ramadoss - I17769 <Arun.Ramadoss@microchip.com>; UNGLinuxDriver
> <UNGLinuxDriver@microchip.com>; hkallweit1@gmail.com;
> linux@armlinux.org.uk; davem@davemloft.net; edumazet@google.com;
> kuba@kernel.org; pabeni@redhat.com; netdev@vger.kernel.org; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH net-next] net: phy: microchip_t1: SQI support for LAN887x
> 
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the
> content is safe
> 
> > +     /* Get 200 SQI raw readings */
> > +     for (int i = 0; i < 200; i++) {
> 
> Please replace all the hard coded 200 with ARRAY_SIZE(rawtable). That makes it
> easier to tune the size of the table without causing buffer overrun bugs.
> 

Will fix in next version.

> > +             rc = phy_write_mmd(phydev, MDIO_MMD_VEND1,
> > +                                LAN887X_POKE_PEEK_100,
> > +                                LAN887X_POKE_PEEK_100_EN);
> > +             if (rc < 0)
> > +                     return rc;
> > +
> > +             rc = phy_read_mmd(phydev, MDIO_MMD_VEND1,
> > +                               LAN887X_SQI_MSE_100);
> > +             if (rc < 0)
> > +                     return rc;
> > +
> > +             rawtable[i] = rc;
> > +             rc = genphy_c45_read_link(phydev);
> > +             if (rc < 0)
> > +                     return rc;
> > +
> > +             if (!phydev->link)
> > +                     return -ENETDOWN;
> > +     }
> 
> How long does this take?
> 

~76ms

> genphy_c45_read_link() takes a few MDIO transaction, plus the two you see
> here. So maybe 1000 MDIO bus transactions? Which could be
> 3000-4000 if it needs to use C45 over C22.
> 
> Do you have any data on the accuracy, with say 10, 20, 40, 80, 160 samples?
>

Here number of samples are suggested by our compliance test data.
There is an APP Note regarding SQI samples and calculations.
No, the number of samples are only 200 as any other count was not
consistent in terms of accuracy.
 
> Can the genphy_c45_read_link() be moved out of the loop? If the link is lost, is the
> sample totally random, or does it have a well defined value? Looking at the link
> status every iteration, rather than before and after collecting the samples, you are
> trying to protect against the link going down and back up again. If it is taking a
> couple of seconds to collect all the samples, i suppose that is possible, but if its
> 50ms, do you really have to worry?
> 


Sampling data is random. If the link is down at any point during
the data sampling we are discarding the entire set.
If we check the link status before and after the data collection, there could
be an invalidate SQI derivation in very worst-case scenario.

Just to improve instead of register read can I change it to use phydev->link variable?
This link variable is update by PHY state machine.

> > +static int lan887x_get_sqi(struct phy_device *phydev) {
> > +     int rc, val;
> > +
> > +     if (phydev->speed != SPEED_1000 &&
> > +         phydev->speed != SPEED_100) {
> > +             return -EINVAL;
> > +     }
> 
> Can that happen? Does the PHY support SPEED_10? Or are you trying to protect
> against SPEED_UNKOWN because the link is down? ENETDOWN might be more
> appropriate that EINVAL.
>

LAN887x supports SPEED_100 and SPEED_1000 only. This condition is to 
address the unknow speed. Will fix the error code in next version.
 
>         Andrew

Thanks,
Tarun Alle.
Andrew Lunn Sept. 6, 2024, 12:57 p.m. UTC | #3
> > How long does this take?
> > 
> 
> ~76ms

That is faster than i expected. You have a pretty efficient MDIO bus
implementation.

> > genphy_c45_read_link() takes a few MDIO transaction, plus the two you see
> > here. So maybe 1000 MDIO bus transactions? Which could be
> > 3000-4000 if it needs to use C45 over C22.
> > 
> > Do you have any data on the accuracy, with say 10, 20, 40, 80, 160 samples?
> >
> 
> Here number of samples are suggested by our compliance test data.
> There is an APP Note regarding SQI samples and calculations.
> No, the number of samples are only 200 as any other count was not
> consistent in terms of accuracy.
>  
> > Can the genphy_c45_read_link() be moved out of the loop? If the link is lost, is the
> > sample totally random, or does it have a well defined value? Looking at the link
> > status every iteration, rather than before and after collecting the samples, you are
> > trying to protect against the link going down and back up again. If it is taking a
> > couple of seconds to collect all the samples, i suppose that is possible, but if its
> > 50ms, do you really have to worry?
> > 
> 
> 
> Sampling data is random. If the link is down at any point during
> the data sampling we are discarding the entire set.
> If we check the link status before and after the data collection, there could
> be an invalidate SQI derivation in very worst-case scenario.
> 
> Just to improve instead of register read can I change it to use phydev->link variable?
> This link variable is update by PHY state machine.

Which won't get to run because the driver is actively doing SQI. There
is no preemption here, this code will run to completion, and then
phylib will deal with any interrupts for link down, or do its once per
second poll to check the link status.

With this only taking 76ms, what is the likelihood of link down and
link up again within 76ms? For a 1000BaseT PHY, they don't report link
down for 1 second, and it takes another 1 second to perform autoneg
before the link is up again. Now this is an automotive PHY, so the
timing is different. What does the data sheet say about how fast it
detects and reports link down and up?

	Andrew
Tarun Alle Sept. 9, 2024, 6:46 a.m. UTC | #4
Hi Andrew,
Thanks for your review and comments.

> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Friday, September 6, 2024 6:28 PM
> To: Tarun Alle - I68638 <Tarun.Alle@microchip.com>
> Cc: Arun Ramadoss - I17769 <Arun.Ramadoss@microchip.com>; UNGLinuxDriver
> <UNGLinuxDriver@microchip.com>; hkallweit1@gmail.com;
> linux@armlinux.org.uk; davem@davemloft.net; edumazet@google.com;
> kuba@kernel.org; pabeni@redhat.com; netdev@vger.kernel.org; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH net-next] net: phy: microchip_t1: SQI support for LAN887x
> 
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the
> content is safe
> 
> > > How long does this take?
> > >
> >
> > ~76ms
> 
> That is faster than i expected. You have a pretty efficient MDIO bus
> implementation.
> 
> > > genphy_c45_read_link() takes a few MDIO transaction, plus the two
> > > you see here. So maybe 1000 MDIO bus transactions? Which could be
> > > 3000-4000 if it needs to use C45 over C22.
> > >
> > > Do you have any data on the accuracy, with say 10, 20, 40, 80, 160 samples?
> > >
> >
> > Here number of samples are suggested by our compliance test data.
> > There is an APP Note regarding SQI samples and calculations.
> > No, the number of samples are only 200 as any other count was not
> > consistent in terms of accuracy.
> >
> > > Can the genphy_c45_read_link() be moved out of the loop? If the link
> > > is lost, is the sample totally random, or does it have a well
> > > defined value? Looking at the link status every iteration, rather
> > > than before and after collecting the samples, you are trying to
> > > protect against the link going down and back up again. If it is
> > > taking a couple of seconds to collect all the samples, i suppose that is possible,
> but if its 50ms, do you really have to worry?
> > >
> >
> >
> > Sampling data is random. If the link is down at any point during the
> > data sampling we are discarding the entire set.
> > If we check the link status before and after the data collection,
> > there could be an invalidate SQI derivation in very worst-case scenario.
> >
> > Just to improve instead of register read can I change it to use phydev->link
> variable?
> > This link variable is update by PHY state machine.
> 
> Which won't get to run because the driver is actively doing SQI. There is no
> preemption here, this code will run to completion, and then phylib will deal with
> any interrupts for link down, or do its once per second poll to check the link
> status.
> 
> With this only taking 76ms, what is the likelihood of link down and link up again
> within 76ms? For a 1000BaseT PHY, they don't report link down for 1 second, and
> it takes another 1 second to perform autoneg before the link is up again. Now this
> is an automotive PHY, so the timing is different. What does the data sheet say
> about how fast it detects and reports link down and up?
> 

For 1000M this sampling procedure will not be run rather we use SQI hardware register to read the value.
as this procedure is only for 100M and linkup time is ~100ms we can check link status before starting the sampling and after 
completing the sampling. This would ensure that link is not down before calculating SQI.

>         Andrew

Thanks,
Tarun Alle.
Andrew Lunn Sept. 9, 2024, 1:17 p.m. UTC | #5
> > With this only taking 76ms, what is the likelihood of link down and link up again
> > within 76ms? For a 1000BaseT PHY, they don't report link down for 1 second, and
> > it takes another 1 second to perform autoneg before the link is up again. Now this
> > is an automotive PHY, so the timing is different. What does the data sheet say
> > about how fast it detects and reports link down and up?
> > 
> 
> For 1000M this sampling procedure will not be run rather we use SQI hardware register to read the value.
> as this procedure is only for 100M and linkup time is ~100ms we can check link status before starting the sampling and after 
> completing the sampling. This would ensure that link is not down before calculating SQI.

Great. That will help users who have a slower MDIO bus.

    Andrew

---
pw-bot: cr
diff mbox series

Patch

diff --git a/drivers/net/phy/microchip_t1.c b/drivers/net/phy/microchip_t1.c
index 5732ad65e7f9..01e0e71fca48 100644
--- a/drivers/net/phy/microchip_t1.c
+++ b/drivers/net/phy/microchip_t1.c
@@ -2,6 +2,7 @@ 
 // Copyright (C) 2018 Microchip Technology
 
 #include <linux/kernel.h>
+#include <linux/sort.h>
 #include <linux/module.h>
 #include <linux/delay.h>
 #include <linux/mii.h>
@@ -188,6 +189,20 @@ 
 #define LAN887X_EFUSE_READ_DAT9_SGMII_DIS	BIT(9)
 #define LAN887X_EFUSE_READ_DAT9_MAC_MODE	GENMASK(1, 0)
 
+#define LAN887X_COEFF_PWR_DN_CONFIG_100		0x0404
+#define LAN887X_COEFF_PWR_DN_CONFIG_100_V	0x16d6
+#define LAN887X_SQI_CONFIG_100			0x042e
+#define LAN887X_SQI_CONFIG_100_V		0x9572
+#define LAN887X_SQI_MSE_100			0x483
+
+#define LAN887X_POKE_PEEK_100			0x040d
+#define LAN887X_POKE_PEEK_100_EN		BIT(0)
+
+#define LAN887X_COEFF_MOD_CONFIG		0x080d
+#define LAN887X_COEFF_MOD_CONFIG_DCQ_COEFF_EN	BIT(8)
+
+#define LAN887X_DCQ_SQI_STATUS			0x08b2
+
 #define DRIVER_AUTHOR	"Nisar Sayed <nisar.sayed@microchip.com>"
 #define DRIVER_DESC	"Microchip LAN87XX/LAN937x/LAN887x T1 PHY driver"
 
@@ -1420,6 +1435,133 @@  static void lan887x_get_strings(struct phy_device *phydev, u8 *data)
 		ethtool_puts(&data, lan887x_hw_stats[i].string);
 }
 
+/* Compare block to sort in ascending order */
+static int data_compare(const void *a, const void *b)
+{
+	return  *(u16 *)a - *(u16 *)b;
+}
+
+static int lan887x_get_sqi_100M(struct phy_device *phydev)
+{
+	u16 rawtable[200];
+	u32 sqiavg = 0;
+	u8 sqinum;
+	int rc;
+
+	/* Configuration of SQI 100M */
+	rc = phy_write_mmd(phydev, MDIO_MMD_VEND1,
+			   LAN887X_COEFF_PWR_DN_CONFIG_100,
+			   LAN887X_COEFF_PWR_DN_CONFIG_100_V);
+	if (rc < 0)
+		return rc;
+
+	rc = phy_write_mmd(phydev, MDIO_MMD_VEND1, LAN887X_SQI_CONFIG_100,
+			   LAN887X_SQI_CONFIG_100_V);
+	if (rc < 0)
+		return rc;
+
+	rc = phy_read_mmd(phydev, MDIO_MMD_VEND1, LAN887X_SQI_CONFIG_100);
+	if (rc != LAN887X_SQI_CONFIG_100_V)
+		return -EINVAL;
+
+	rc = phy_modify_mmd(phydev, MDIO_MMD_VEND1, LAN887X_POKE_PEEK_100,
+			    LAN887X_POKE_PEEK_100_EN,
+			    LAN887X_POKE_PEEK_100_EN);
+	if (rc < 0)
+		return rc;
+
+	/* Required before reading register
+	 * otherwise it will return high value
+	 */
+	msleep(50);
+
+	/* Get 200 SQI raw readings */
+	for (int i = 0; i < 200; i++) {
+		rc = phy_write_mmd(phydev, MDIO_MMD_VEND1,
+				   LAN887X_POKE_PEEK_100,
+				   LAN887X_POKE_PEEK_100_EN);
+		if (rc < 0)
+			return rc;
+
+		rc = phy_read_mmd(phydev, MDIO_MMD_VEND1,
+				  LAN887X_SQI_MSE_100);
+		if (rc < 0)
+			return rc;
+
+		rawtable[i] = rc;
+		rc = genphy_c45_read_link(phydev);
+		if (rc < 0)
+			return rc;
+
+		if (!phydev->link)
+			return -ENETDOWN;
+	}
+
+	/* Sort SQI raw readings in ascending order */
+	sort(rawtable, 200, sizeof(u16), data_compare, NULL);
+
+	/* Keep inliers and discard outliers */
+	for (int i = 40; i < 160; i++)
+		sqiavg += rawtable[i];
+
+	/* Get SQI average */
+	sqiavg /= 120;
+
+	if (sqiavg < 75)
+		sqinum = 7;
+	else if (sqiavg < 94)
+		sqinum = 6;
+	else if (sqiavg < 119)
+		sqinum = 5;
+	else if (sqiavg < 150)
+		sqinum = 4;
+	else if (sqiavg < 189)
+		sqinum = 3;
+	else if (sqiavg < 237)
+		sqinum = 2;
+	else if (sqiavg < 299)
+		sqinum = 1;
+	else
+		sqinum = 0;
+
+	return sqinum;
+}
+
+static int lan887x_get_sqi(struct phy_device *phydev)
+{
+	int rc, val;
+
+	if (phydev->speed != SPEED_1000 &&
+	    phydev->speed != SPEED_100) {
+		return -EINVAL;
+	}
+
+	if (phydev->speed == SPEED_100)
+		return lan887x_get_sqi_100M(phydev);
+
+	/* Writing DCQ_COEFF_EN to trigger a SQI read */
+	rc = phy_set_bits_mmd(phydev, MDIO_MMD_VEND1,
+			      LAN887X_COEFF_MOD_CONFIG,
+			      LAN887X_COEFF_MOD_CONFIG_DCQ_COEFF_EN);
+	if (rc < 0)
+		return rc;
+
+	/* Wait for DCQ done */
+	rc = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
+				       LAN887X_COEFF_MOD_CONFIG, val, ((val &
+				       LAN887X_COEFF_MOD_CONFIG_DCQ_COEFF_EN) !=
+				       LAN887X_COEFF_MOD_CONFIG_DCQ_COEFF_EN),
+				       10, 200, true);
+	if (rc < 0)
+		return rc;
+
+	rc = phy_read_mmd(phydev, MDIO_MMD_VEND1, LAN887X_DCQ_SQI_STATUS);
+	if (rc < 0)
+		return rc;
+
+	return FIELD_GET(T1_DCQ_SQI_MSK, rc);
+}
+
 static struct phy_driver microchip_t1_phy_driver[] = {
 	{
 		PHY_ID_MATCH_MODEL(PHY_ID_LAN87XX),
@@ -1468,6 +1610,8 @@  static struct phy_driver microchip_t1_phy_driver[] = {
 		.suspend	= genphy_suspend,
 		.resume		= genphy_resume,
 		.read_status	= genphy_c45_read_status,
+		.get_sqi	= lan887x_get_sqi,
+		.get_sqi_max	= lan87xx_get_sqi_max,
 	}
 };