diff mbox series

[net-next] net: dsa: lan9303: ensure chip reset and wait for READY status

Message ID 20241001090151.876200-1-alexander.sverdlin@siemens.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net-next] net: dsa: lan9303: ensure chip reset and wait for READY status | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 9 this patch: 9
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 3 maintainers not CCed: pabeni@redhat.com kuba@kernel.org edumazet@google.com
netdev/build_clang success Errors and warnings before: 9 this patch: 9
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 12 this patch: 12
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 53 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Sverdlin, Alexander Oct. 1, 2024, 9:01 a.m. UTC
From: Anatolij Gustschin <agust@denx.de>

Accessing device registers seems to be not reliable, the chip
revision is sometimes detected wrongly (0 instead of expected 1).

Ensure that the chip reset is performed via reset GPIO and then
wait for 'Device Ready' status in HW_CFG register before doing
any register initializations.

Signed-off-by: Anatolij Gustschin <agust@denx.de>
[alex: added msleep() + justification for tout]
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
---
 drivers/net/dsa/lan9303-core.c | 38 ++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

Comments

Parthiban Veerasooran Oct. 1, 2024, 11:30 a.m. UTC | #1
Hi,

I think the subject line should have "net" tag instead of "net-next" as 
it is an update on the existing driver in the netdev source tree.

https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt

Best regards,
Parthiban V

On 01/10/24 2:31 pm, A. Sverdlin wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> 
> From: Anatolij Gustschin <agust@denx.de>
> 
> Accessing device registers seems to be not reliable, the chip
> revision is sometimes detected wrongly (0 instead of expected 1).
> 
> Ensure that the chip reset is performed via reset GPIO and then
> wait for 'Device Ready' status in HW_CFG register before doing
> any register initializations.
> 
> Signed-off-by: Anatolij Gustschin <agust@denx.de>
> [alex: added msleep() + justification for tout]
> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
> ---
>   drivers/net/dsa/lan9303-core.c | 38 ++++++++++++++++++++++++++++++++++
>   1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
> index 268949939636a..5744e7ac436fb 100644
> --- a/drivers/net/dsa/lan9303-core.c
> +++ b/drivers/net/dsa/lan9303-core.c
> @@ -839,6 +839,8 @@ static void lan9303_handle_reset(struct lan9303 *chip)
>          if (!chip->reset_gpio)
>                  return;
> 
> +       gpiod_set_value_cansleep(chip->reset_gpio, 1);
> +
>          if (chip->reset_duration != 0)
>                  msleep(chip->reset_duration);
> 
> @@ -863,9 +865,45 @@ static int lan9303_disable_processing(struct lan9303 *chip)
> 
>   static int lan9303_check_device(struct lan9303 *chip)
>   {
> +       /*
> +        * Loading of the largest supported EEPROM is expected to take at least
> +        * 5.9s
> +        */
> +       int tout = 6000 / 30;
>          int ret;
>          u32 reg;
> 
> +       do {
> +               ret = lan9303_read(chip->regmap, LAN9303_HW_CFG, &reg);
> +               if (ret) {
> +                       dev_err(chip->dev, "failed to read HW_CFG reg: %d\n",
> +                               ret);
> +               }
> +               tout--;
> +
> +               dev_dbg(chip->dev, "%s: HW_CFG: 0x%08x\n", __func__, reg);
> +               if ((reg & LAN9303_HW_CFG_READY) || !tout)
> +                       break;
> +
> +               /*
> +                * In I2C-managed configurations this polling loop will clash
> +                * with switch's reading of EEPROM right after reset and this
> +                * behaviour is not configurable. While lan9303_read() already
> +                * has quite long retry timeout, seems not all cases are being
> +                * detected as arbitration error.
> +                *
> +                * According to datasheet, EEPROM loader has 30ms timeout
> +                * (in case of missing EEPROM).
> +                */
> +               msleep(30);
> +       } while (true);
> +
> +       if (!tout) {
> +               dev_err(chip->dev, "%s: HW_CFG not ready: 0x%08x\n",
> +                       __func__, reg);
> +               return -ENODEV;
> +       }
> +
>          ret = lan9303_read(chip->regmap, LAN9303_CHIP_REV, &reg);
>          if (ret) {
>                  dev_err(chip->dev, "failed to read chip revision register: %d\n",
> --
> 2.46.0
> 
>
Sverdlin, Alexander Oct. 1, 2024, 11:57 a.m. UTC | #2
Hi Parthiban!

On Tue, 2024-10-01 at 11:30 +0000, Parthiban.Veerasooran@microchip.com wrote:
> I think the subject line should have "net" tag instead of "net-next" as 
> it is an update on the existing driver in the netdev source tree.
> 
> https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt

I explicitly didn't target it for -stable because seems that nobody
else is affected by this issue (or have their downstream workarounds).

However, it's up to maintainers and shall actually apply cleanly to both
trees.
Vladimir Oltean Oct. 1, 2024, 12:02 p.m. UTC | #3
On Tue, Oct 01, 2024 at 11:01:48AM +0200, A. Sverdlin wrote:
> From: Anatolij Gustschin <agust@denx.de>
> 
> Accessing device registers seems to be not reliable, the chip
> revision is sometimes detected wrongly (0 instead of expected 1).
> 
> Ensure that the chip reset is performed via reset GPIO and then
> wait for 'Device Ready' status in HW_CFG register before doing
> any register initializations.
> 
> Signed-off-by: Anatolij Gustschin <agust@denx.de>
> [alex: added msleep() + justification for tout]
> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
> ---
>  drivers/net/dsa/lan9303-core.c | 38 ++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
> index 268949939636a..5744e7ac436fb 100644
> --- a/drivers/net/dsa/lan9303-core.c
> +++ b/drivers/net/dsa/lan9303-core.c
> @@ -839,6 +839,8 @@ static void lan9303_handle_reset(struct lan9303 *chip)
>  	if (!chip->reset_gpio)
>  		return;
>  
> +	gpiod_set_value_cansleep(chip->reset_gpio, 1);
> +
>  	if (chip->reset_duration != 0)
>  		msleep(chip->reset_duration);
>  
> @@ -863,9 +865,45 @@ static int lan9303_disable_processing(struct lan9303 *chip)
>  
>  static int lan9303_check_device(struct lan9303 *chip)
>  {
> +	/*
> +	 * Loading of the largest supported EEPROM is expected to take at least
> +	 * 5.9s
> +	 */
> +	int tout = 6000 / 30;
>  	int ret;
>  	u32 reg;
>  
> +	do {
> +		ret = lan9303_read(chip->regmap, LAN9303_HW_CFG, &reg);
> +		if (ret) {
> +			dev_err(chip->dev, "failed to read HW_CFG reg: %d\n",
> +				ret);
> +		}
> +		tout--;
> +
> +		dev_dbg(chip->dev, "%s: HW_CFG: 0x%08x\n", __func__, reg);
> +		if ((reg & LAN9303_HW_CFG_READY) || !tout)
> +			break;
> +
> +		/*
> +		 * In I2C-managed configurations this polling loop will clash
> +		 * with switch's reading of EEPROM right after reset and this
> +		 * behaviour is not configurable. While lan9303_read() already
> +		 * has quite long retry timeout, seems not all cases are being
> +		 * detected as arbitration error.
> +		 *
> +		 * According to datasheet, EEPROM loader has 30ms timeout
> +		 * (in case of missing EEPROM).
> +		 */
> +		msleep(30);
> +	} while (true);
> +
> +	if (!tout) {
> +		dev_err(chip->dev, "%s: HW_CFG not ready: 0x%08x\n",
> +			__func__, reg);
> +		return -ENODEV;
> +	}
> +

Can this be written with read_poll_timeout()?

>  	ret = lan9303_read(chip->regmap, LAN9303_CHIP_REV, &reg);
>  	if (ret) {
>  		dev_err(chip->dev, "failed to read chip revision register: %d\n",
> -- 
> 2.46.0
>
Vladimir Oltean Oct. 1, 2024, 12:04 p.m. UTC | #4
On Tue, Oct 01, 2024 at 11:57:15AM +0000, Sverdlin, Alexander wrote:
> Hi Parthiban!
> 
> On Tue, 2024-10-01 at 11:30 +0000, Parthiban.Veerasooran@microchip.com wrote:
> > I think the subject line should have "net" tag instead of "net-next" as 
> > it is an update on the existing driver in the netdev source tree.
> > 
> > https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
> 
> I explicitly didn't target it for -stable because seems that nobody
> else is affected by this issue (or have their downstream workarounds).

Explain a bit more why nobody 'else' is affected by the issue, and why
you're not part of the 'else' people that would justify backporting to
stable?
Parthiban Veerasooran Oct. 1, 2024, 12:09 p.m. UTC | #5
Hi Alexander Sverdlin,

On 01/10/24 5:27 pm, Sverdlin, Alexander wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> 
> Hi Parthiban!
> 
> On Tue, 2024-10-01 at 11:30 +0000, Parthiban.Veerasooran@microchip.com wrote:
>> I think the subject line should have "net" tag instead of "net-next" as
>> it is an update on the existing driver in the netdev source tree.
>>
>> https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
> 
> I explicitly didn't target it for -stable because seems that nobody
> else is affected by this issue (or have their downstream workarounds).
I thought it is a fix for the existing driver so proposed the change. If 
not just ignore the comment.

Best regards,
Parthiban V
> 
> However, it's up to maintainers and shall actually apply cleanly to both
> trees.
> 
> --
> Alexander Sverdlin
> Siemens AG
> www.siemens.com
Andrew Lunn Oct. 1, 2024, 12:16 p.m. UTC | #6
On Tue, Oct 01, 2024 at 11:57:15AM +0000, Sverdlin, Alexander wrote:
> Hi Parthiban!
> 
> On Tue, 2024-10-01 at 11:30 +0000, Parthiban.Veerasooran@microchip.com wrote:
> > I think the subject line should have "net" tag instead of "net-next" as 
> > it is an update on the existing driver in the netdev source tree.
> > 
> > https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
> 
> I explicitly didn't target it for -stable because seems that nobody
> else is affected by this issue (or have their downstream workarounds).

This is the sort of information which should be placed under the ---
marker. It would then avoid reviewers asking the question...

	Andrew
Andrew Lunn Oct. 1, 2024, 12:20 p.m. UTC | #7
On Tue, Oct 01, 2024 at 03:04:55PM +0300, Vladimir Oltean wrote:
> On Tue, Oct 01, 2024 at 11:57:15AM +0000, Sverdlin, Alexander wrote:
> > Hi Parthiban!
> > 
> > On Tue, 2024-10-01 at 11:30 +0000, Parthiban.Veerasooran@microchip.com wrote:
> > > I think the subject line should have "net" tag instead of "net-next" as 
> > > it is an update on the existing driver in the netdev source tree.
> > > 
> > > https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
> > 
> > I explicitly didn't target it for -stable because seems that nobody
> > else is affected by this issue (or have their downstream workarounds).
> 
> Explain a bit more why nobody 'else' is affected by the issue, and why
> you're not part of the 'else' people that would justify backporting to
> stable?

At a guess, they have a lot of stuff in the EEPROM, which other
don't. I've seen this before with Marvell switches. But i always worry
that the EEPROM contents are going to be break an assumption of the
driver.

	Andrew

---
pw-bot: cr
Sverdlin, Alexander Oct. 1, 2024, 12:51 p.m. UTC | #8
Hi Andrew!

On Tue, 2024-10-01 at 14:20 +0200, Andrew Lunn wrote:
> > > > I think the subject line should have "net" tag instead of "net-next" as 
> > > > it is an update on the existing driver in the netdev source tree.
> > > > 
> > > > https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
> > > 
> > > I explicitly didn't target it for -stable because seems that nobody
> > > else is affected by this issue (or have their downstream workarounds).
> > 
> > Explain a bit more why nobody 'else' is affected by the issue, and why
> > you're not part of the 'else' people that would justify backporting to
> > stable?
> 
> At a guess, they have a lot of stuff in the EEPROM, which other
> don't. I've seen this before with Marvell switches. But i always worry
> that the EEPROM contents are going to be break an assumption of the
> driver.

The issue is real, but I suppose most of the designs use MDIO with this switch.
And it also depends, if people implement reset GPIO or not -- the switch
starts EEPROM read right after reset -- which will conflict with driver
probe. But this will not be visible on simplified schematics.

Nevertheless, IMO this is indeed a fix, let's target it for -stable,
I'll prepare v2 thinking about read_poll_timeout().
diff mbox series

Patch

diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index 268949939636a..5744e7ac436fb 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -839,6 +839,8 @@  static void lan9303_handle_reset(struct lan9303 *chip)
 	if (!chip->reset_gpio)
 		return;
 
+	gpiod_set_value_cansleep(chip->reset_gpio, 1);
+
 	if (chip->reset_duration != 0)
 		msleep(chip->reset_duration);
 
@@ -863,9 +865,45 @@  static int lan9303_disable_processing(struct lan9303 *chip)
 
 static int lan9303_check_device(struct lan9303 *chip)
 {
+	/*
+	 * Loading of the largest supported EEPROM is expected to take at least
+	 * 5.9s
+	 */
+	int tout = 6000 / 30;
 	int ret;
 	u32 reg;
 
+	do {
+		ret = lan9303_read(chip->regmap, LAN9303_HW_CFG, &reg);
+		if (ret) {
+			dev_err(chip->dev, "failed to read HW_CFG reg: %d\n",
+				ret);
+		}
+		tout--;
+
+		dev_dbg(chip->dev, "%s: HW_CFG: 0x%08x\n", __func__, reg);
+		if ((reg & LAN9303_HW_CFG_READY) || !tout)
+			break;
+
+		/*
+		 * In I2C-managed configurations this polling loop will clash
+		 * with switch's reading of EEPROM right after reset and this
+		 * behaviour is not configurable. While lan9303_read() already
+		 * has quite long retry timeout, seems not all cases are being
+		 * detected as arbitration error.
+		 *
+		 * According to datasheet, EEPROM loader has 30ms timeout
+		 * (in case of missing EEPROM).
+		 */
+		msleep(30);
+	} while (true);
+
+	if (!tout) {
+		dev_err(chip->dev, "%s: HW_CFG not ready: 0x%08x\n",
+			__func__, reg);
+		return -ENODEV;
+	}
+
 	ret = lan9303_read(chip->regmap, LAN9303_CHIP_REV, &reg);
 	if (ret) {
 		dev_err(chip->dev, "failed to read chip revision register: %d\n",