diff mbox series

net: dsa: mt7530: disable LEDs before reset

Message ID 20240305043952.21590-1-justin.swartz@risingedge.co.za (mailing list archive)
State Accepted
Commit 2920dd92b980c73d15c1a187b4cb7820521cf9fa
Delegated to: Netdev Maintainers
Headers show
Series net: dsa: mt7530: disable LEDs before reset | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 940 this patch: 940
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 15 of 15 maintainers
netdev/build_clang success Errors and warnings before: 956 this patch: 956
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 956 this patch: 956
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 12 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-03-05--21-00 (tests: 892)

Commit Message

Justin Swartz March 5, 2024, 4:39 a.m. UTC
Disable LEDs just before resetting the MT7530 to avoid
situations where the ESW_P4_LED_0 and ESW_P3_LED_0 pin
states may cause an unintended external crystal frequency
to be selected.

The HT_XTAL_FSEL (External Crystal Frequency Selection)
field of HWTRAP (the Hardware Trap register) stores a
2-bit value that represents the state of the ESW_P4_LED_0
and ESW_P4_LED_0 pins (seemingly) sampled just after the
MT7530 has been reset, as:

    ESW_P4_LED_0    ESW_P3_LED_0    Frequency
    -----------------------------------------
    0               1               20MHz
    1               0               40MHz
    1               1               25MHz

The value of HT_XTAL_FSEL is bootstrapped by pulling
ESW_P4_LED_0 and ESW_P3_LED_0 up or down accordingly,
but:

  if a 40MHz crystal has been selected and
  the ESW_P3_LED_0 pin is high during reset,

  or a 20MHz crystal has been selected and
  the ESW_P4_LED_0 pin is high during reset,

  then the value of HT_XTAL_FSEL will indicate
  that a 25MHz crystal is present.

By default, the state of the LED pins is PHY controlled
to reflect the link state.

To illustrate, if a board has:

  5 ports with active low LED control,
  and HT_XTAL_FSEL bootstrapped for 40MHz.

When the MT7530 is powered up without any external
connection, only the LED associated with Port 3 is
illuminated as ESW_P3_LED_0 is low.

In this state, directly after mt7530_setup()'s reset
is performed, the HWTRAP register (0x7800) reflects
the intended HT_XTAL_FSEL (HWTRAP bits 10:9) of 40MHz:

  mt7530-mdio mdio-bus:1f: mt7530_read: 00007800 == 00007dcf

  >>> bin(0x7dcf >> 9 & 0b11)
  '0b10'

But if a cable is connected to Port 3 and the link
is active before mt7530_setup()'s reset takes place,
then HT_XTAL_FSEL seems to be set for 25MHz:

  mt7530-mdio mdio-bus:1f: mt7530_read: 00007800 == 00007fcf

  >>> bin(0x7fcf >> 9 & 0b11)
  '0b11'

Once HT_XTAL_FSEL reflects 25MHz, none of the ports
are functional until the MT7621 (or MT7530 itself)
is reset.

By disabling the LED pins just before reset, the chance
of an unintended HT_XTAL_FSEL value is reduced.

Signed-off-by: Justin Swartz <justin.swartz@risingedge.co.za>
---
 drivers/net/dsa/mt7530.c | 6 ++++++
 1 file changed, 6 insertions(+)

--

Comments

Arınç ÜNAL March 8, 2024, 9:51 a.m. UTC | #1
Hey Justin.

I couldn't find anything on the MT7621 Giga Switch Programming Guide v0.3
document regarding which pin corresponds to which bit on the HWTRAP
register. There's only this mention on the LED controller section,
"hardware traps and LEDs share the same pins in GSW". But page 16 of the
schematics document for Banana Pi BPI-R2 [1] fully documents this.

The HWTRAP register is populated right after power comes back after the
switch chip is reset [2]. This means any active link before the reset will
go away so the high/low state of the pins will go back to being dictated by
the bootstrapping design of the board. The HWTRAP register will be
populated before a link can be set up.

In conclusion, I don't see any need to disable the LED controller before
resetting the switch chip.

[1] https://wiki.banana-pi.org/Banana_Pi_BPI-R2#Documents

[2] I've tested it on my MT7621AT board with a 40MHz XTAL frequency and a
board with standalone MT7530 with 25MHz XTAL frequency.

While the kernel was booting, before the DSA subdriver kicks in:
- For the board with 40 MHz XTAL: I've connected a Vcc pin to ESW_P3_LED_0
   to set it high.
- For the board with 25 MHz XTAL: I've connected a GND pin to ESW_P3_LED_0
   to set it low.

Board with 40 MHz XTAL:
[    2.359428] mt7530-mdio mdio-bus:1f: MT7530 adapts as multi-chip module
[    2.374918] mt7530-mdio mdio-bus:1f: xtal is 25MHz

Board with 25 MHz XTAL:
[    4.324672] mt7530-mdio mdio-bus:1f: xtal is 40MHz

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index 51d7b816dd02..beab5e5558d0 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -2216,6 +2216,15 @@ mt7530_setup(struct dsa_switch *ds)
  		return ret;
  	}
  
+	if ((val & HWTRAP_XTAL_MASK) == HWTRAP_XTAL_25MHZ)
+		dev_info(priv->dev, "xtal is 25MHz\n");
+
+	if ((val & HWTRAP_XTAL_MASK) == HWTRAP_XTAL_40MHZ)
+		dev_info(priv->dev, "xtal is 40MHz\n");
+
+	if ((val & HWTRAP_XTAL_MASK) == HWTRAP_XTAL_20MHZ)
+		dev_info(priv->dev, "xtal is 20MHz\n");
+
  	id = mt7530_read(priv, MT7530_CREV);
  	id >>= CHIP_NAME_SHIFT;
  	if (id != MT7530_ID) {

Arınç

On 5.03.2024 07:39, Justin Swartz wrote:
> Disable LEDs just before resetting the MT7530 to avoid
> situations where the ESW_P4_LED_0 and ESW_P3_LED_0 pin
> states may cause an unintended external crystal frequency
> to be selected.
> 
> The HT_XTAL_FSEL (External Crystal Frequency Selection)
> field of HWTRAP (the Hardware Trap register) stores a
> 2-bit value that represents the state of the ESW_P4_LED_0
> and ESW_P4_LED_0 pins (seemingly) sampled just after the
> MT7530 has been reset, as:
> 
>      ESW_P4_LED_0    ESW_P3_LED_0    Frequency
>      -----------------------------------------
>      0               1               20MHz
>      1               0               40MHz
>      1               1               25MHz
> 
> The value of HT_XTAL_FSEL is bootstrapped by pulling
> ESW_P4_LED_0 and ESW_P3_LED_0 up or down accordingly,
> but:
> 
>    if a 40MHz crystal has been selected and
>    the ESW_P3_LED_0 pin is high during reset,
> 
>    or a 20MHz crystal has been selected and
>    the ESW_P4_LED_0 pin is high during reset,
> 
>    then the value of HT_XTAL_FSEL will indicate
>    that a 25MHz crystal is present.
> 
> By default, the state of the LED pins is PHY controlled
> to reflect the link state.
> 
> To illustrate, if a board has:
> 
>    5 ports with active low LED control,
>    and HT_XTAL_FSEL bootstrapped for 40MHz.
> 
> When the MT7530 is powered up without any external
> connection, only the LED associated with Port 3 is
> illuminated as ESW_P3_LED_0 is low.
> 
> In this state, directly after mt7530_setup()'s reset
> is performed, the HWTRAP register (0x7800) reflects
> the intended HT_XTAL_FSEL (HWTRAP bits 10:9) of 40MHz:
> 
>    mt7530-mdio mdio-bus:1f: mt7530_read: 00007800 == 00007dcf
> 
>    >>> bin(0x7dcf >> 9 & 0b11)
>    '0b10'
> 
> But if a cable is connected to Port 3 and the link
> is active before mt7530_setup()'s reset takes place,
> then HT_XTAL_FSEL seems to be set for 25MHz:
> 
>    mt7530-mdio mdio-bus:1f: mt7530_read: 00007800 == 00007fcf
> 
>    >>> bin(0x7fcf >> 9 & 0b11)
>    '0b11'
> 
> Once HT_XTAL_FSEL reflects 25MHz, none of the ports
> are functional until the MT7621 (or MT7530 itself)
> is reset.
> 
> By disabling the LED pins just before reset, the chance
> of an unintended HT_XTAL_FSEL value is reduced.
> 
> Signed-off-by: Justin Swartz <justin.swartz@risingedge.co.za>
> ---
>   drivers/net/dsa/mt7530.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
> index 3c1f65759..8fa113126 100644
> --- a/drivers/net/dsa/mt7530.c
> +++ b/drivers/net/dsa/mt7530.c
> @@ -2238,6 +2238,12 @@ mt7530_setup(struct dsa_switch *ds)
>   		}
>   	}
>   
> +	/* Disable LEDs before reset to prevent the MT7530 sampling a
> +	 * potentially incorrect HT_XTAL_FSEL value.
> +	 */
> +	mt7530_write(priv, MT7530_LED_EN, 0);
> +	usleep_range(1000, 1100);
> +
>   	/* Reset whole chip through gpio pin or memory-mapped registers for
>   	 * different type of hardware
>   	 */
Daniel Golle March 8, 2024, 2:46 p.m. UTC | #2
On Fri, Mar 08, 2024 at 12:51:15PM +0300, Arınç ÜNAL wrote:
> Hey Justin.
> 
> I couldn't find anything on the MT7621 Giga Switch Programming Guide v0.3
> document regarding which pin corresponds to which bit on the HWTRAP
> register. There's only this mention on the LED controller section,
> "hardware traps and LEDs share the same pins in GSW". But page 16 of the
> schematics document for Banana Pi BPI-R2 [1] fully documents this.
> 
> The HWTRAP register is populated right after power comes back after the
> switch chip is reset [2]. This means any active link before the reset will
> go away so the high/low state of the pins will go back to being dictated by
> the bootstrapping design of the board. The HWTRAP register will be
> populated before a link can be set up.
> 
> In conclusion, I don't see any need to disable the LED controller before
> resetting the switch chip.
> 
> [1] https://wiki.banana-pi.org/Banana_Pi_BPI-R2#Documents
> 
> [2] I've tested it on my MT7621AT board with a 40MHz XTAL frequency and a
> board with standalone MT7530 with 25MHz XTAL frequency.

How many times did you repeat this test to conclude that there is no
need to disable LEDs before reset?

As Justin is decribing a probabilistic phenomenon ("[...] chance of an
unintended HT_XTAL_FSEL value is reduced.") I believe running a single
test is not enough to conlcude anything.

I have a hard time believing that he was working on this patch without
any reason to do so...


> 
> While the kernel was booting, before the DSA subdriver kicks in:
> - For the board with 40 MHz XTAL: I've connected a Vcc pin to ESW_P3_LED_0
>   to set it high.
> - For the board with 25 MHz XTAL: I've connected a GND pin to ESW_P3_LED_0
>   to set it low.
> 
> Board with 40 MHz XTAL:
> [    2.359428] mt7530-mdio mdio-bus:1f: MT7530 adapts as multi-chip module
> [    2.374918] mt7530-mdio mdio-bus:1f: xtal is 25MHz
> 
> Board with 25 MHz XTAL:
> [    4.324672] mt7530-mdio mdio-bus:1f: xtal is 40MHz
> 
> diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
> index 51d7b816dd02..beab5e5558d0 100644
> --- a/drivers/net/dsa/mt7530.c
> +++ b/drivers/net/dsa/mt7530.c
> @@ -2216,6 +2216,15 @@ mt7530_setup(struct dsa_switch *ds)
>  		return ret;
>  	}
> +	if ((val & HWTRAP_XTAL_MASK) == HWTRAP_XTAL_25MHZ)
> +		dev_info(priv->dev, "xtal is 25MHz\n");
> +
> +	if ((val & HWTRAP_XTAL_MASK) == HWTRAP_XTAL_40MHZ)
> +		dev_info(priv->dev, "xtal is 40MHz\n");
> +
> +	if ((val & HWTRAP_XTAL_MASK) == HWTRAP_XTAL_20MHZ)
> +		dev_info(priv->dev, "xtal is 20MHz\n");
> +
>  	id = mt7530_read(priv, MT7530_CREV);
>  	id >>= CHIP_NAME_SHIFT;
>  	if (id != MT7530_ID) {
> 
> Arınç
> 
> On 5.03.2024 07:39, Justin Swartz wrote:
> > Disable LEDs just before resetting the MT7530 to avoid
> > situations where the ESW_P4_LED_0 and ESW_P3_LED_0 pin
> > states may cause an unintended external crystal frequency
> > to be selected.
> > 
> > The HT_XTAL_FSEL (External Crystal Frequency Selection)
> > field of HWTRAP (the Hardware Trap register) stores a
> > 2-bit value that represents the state of the ESW_P4_LED_0
> > and ESW_P4_LED_0 pins (seemingly) sampled just after the
> > MT7530 has been reset, as:
> > 
> >      ESW_P4_LED_0    ESW_P3_LED_0    Frequency
> >      -----------------------------------------
> >      0               1               20MHz
> >      1               0               40MHz
> >      1               1               25MHz
> > 
> > The value of HT_XTAL_FSEL is bootstrapped by pulling
> > ESW_P4_LED_0 and ESW_P3_LED_0 up or down accordingly,
> > but:
> > 
> >    if a 40MHz crystal has been selected and
> >    the ESW_P3_LED_0 pin is high during reset,
> > 
> >    or a 20MHz crystal has been selected and
> >    the ESW_P4_LED_0 pin is high during reset,
> > 
> >    then the value of HT_XTAL_FSEL will indicate
> >    that a 25MHz crystal is present.
> > 
> > By default, the state of the LED pins is PHY controlled
> > to reflect the link state.
> > 
> > To illustrate, if a board has:
> > 
> >    5 ports with active low LED control,
> >    and HT_XTAL_FSEL bootstrapped for 40MHz.
> > 
> > When the MT7530 is powered up without any external
> > connection, only the LED associated with Port 3 is
> > illuminated as ESW_P3_LED_0 is low.
> > 
> > In this state, directly after mt7530_setup()'s reset
> > is performed, the HWTRAP register (0x7800) reflects
> > the intended HT_XTAL_FSEL (HWTRAP bits 10:9) of 40MHz:
> > 
> >    mt7530-mdio mdio-bus:1f: mt7530_read: 00007800 == 00007dcf
> > 
> >    >>> bin(0x7dcf >> 9 & 0b11)
> >    '0b10'
> > 
> > But if a cable is connected to Port 3 and the link
> > is active before mt7530_setup()'s reset takes place,
> > then HT_XTAL_FSEL seems to be set for 25MHz:
> > 
> >    mt7530-mdio mdio-bus:1f: mt7530_read: 00007800 == 00007fcf
> > 
> >    >>> bin(0x7fcf >> 9 & 0b11)
> >    '0b11'
> > 
> > Once HT_XTAL_FSEL reflects 25MHz, none of the ports
> > are functional until the MT7621 (or MT7530 itself)
> > is reset.
> > 
> > By disabling the LED pins just before reset, the chance
> > of an unintended HT_XTAL_FSEL value is reduced.
> > 
> > Signed-off-by: Justin Swartz <justin.swartz@risingedge.co.za>
> > ---
> >   drivers/net/dsa/mt7530.c | 6 ++++++
> >   1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
> > index 3c1f65759..8fa113126 100644
> > --- a/drivers/net/dsa/mt7530.c
> > +++ b/drivers/net/dsa/mt7530.c
> > @@ -2238,6 +2238,12 @@ mt7530_setup(struct dsa_switch *ds)
> >   		}
> >   	}
> > +	/* Disable LEDs before reset to prevent the MT7530 sampling a
> > +	 * potentially incorrect HT_XTAL_FSEL value.
> > +	 */
> > +	mt7530_write(priv, MT7530_LED_EN, 0);
> > +	usleep_range(1000, 1100);
> > +
> >   	/* Reset whole chip through gpio pin or memory-mapped registers for
> >   	 * different type of hardware
> >   	 */
patchwork-bot+netdevbpf@kernel.org March 11, 2024, 9:10 p.m. UTC | #3
Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue,  5 Mar 2024 06:39:51 +0200 you wrote:
> Disable LEDs just before resetting the MT7530 to avoid
> situations where the ESW_P4_LED_0 and ESW_P3_LED_0 pin
> states may cause an unintended external crystal frequency
> to be selected.
> 
> The HT_XTAL_FSEL (External Crystal Frequency Selection)
> field of HWTRAP (the Hardware Trap register) stores a
> 2-bit value that represents the state of the ESW_P4_LED_0
> and ESW_P4_LED_0 pins (seemingly) sampled just after the
> MT7530 has been reset, as:
> 
> [...]

Here is the summary with links:
  - net: dsa: mt7530: disable LEDs before reset
    https://git.kernel.org/netdev/net-next/c/2920dd92b980

You are awesome, thank you!
Arınç ÜNAL March 11, 2024, 9:22 p.m. UTC | #4
Why was this applied? I already explained it did not achieve anything.

Arınç

On 12.03.2024 00:10, patchwork-bot+netdevbpf@kernel.org wrote:
> Hello:
> 
> This patch was applied to netdev/net-next.git (main)
> by Jakub Kicinski <kuba@kernel.org>:
> 
> On Tue,  5 Mar 2024 06:39:51 +0200 you wrote:
>> Disable LEDs just before resetting the MT7530 to avoid
>> situations where the ESW_P4_LED_0 and ESW_P3_LED_0 pin
>> states may cause an unintended external crystal frequency
>> to be selected.
>>
>> The HT_XTAL_FSEL (External Crystal Frequency Selection)
>> field of HWTRAP (the Hardware Trap register) stores a
>> 2-bit value that represents the state of the ESW_P4_LED_0
>> and ESW_P4_LED_0 pins (seemingly) sampled just after the
>> MT7530 has been reset, as:
>>
>> [...]
> 
> Here is the summary with links:
>    - net: dsa: mt7530: disable LEDs before reset
>      https://git.kernel.org/netdev/net-next/c/2920dd92b980
> 
> You are awesome, thank you!
Jakub Kicinski March 11, 2024, 9:58 p.m. UTC | #5
On Tue, 12 Mar 2024 00:22:48 +0300 Arınç ÜNAL wrote:
> Why was this applied? I already explained it did not achieve anything.

Daniel disagreed and you didn't reply to him.

Please don't top post.
Daniel Golle March 11, 2024, 9:58 p.m. UTC | #6
On Tue, Mar 12, 2024 at 12:22:48AM +0300, Arınç ÜNAL wrote:
> Why was this applied? I already explained it did not achieve anything.

I agree that we were still debating about it, however, I do believe
Justin that he truely observed this problem and the fix seemed
appropriate to me.

I've explained this in my previous email which you did not notice
or at least haven't repied to:

https://patchwork.kernel.org/project/netdevbpf/patch/20240305043952.21590-1-justin.swartz@risingedge.co.za/#25753421

In the end it probably depends on the electric capacity of the circuit
connecting each LED, so it may not be reproducible on all boards and/or
under all circumstances (temperature, humindity, ...).

Disabling the LEDs and waiting for around 1mS before reset seems like
a sensible thing to do, and I'm glad Justin took care of it.


> 
> Arınç
> 
> On 12.03.2024 00:10, patchwork-bot+netdevbpf@kernel.org wrote:
> > Hello:
> > 
> > This patch was applied to netdev/net-next.git (main)
> > by Jakub Kicinski <kuba@kernel.org>:
> > 
> > On Tue,  5 Mar 2024 06:39:51 +0200 you wrote:
> > > Disable LEDs just before resetting the MT7530 to avoid
> > > situations where the ESW_P4_LED_0 and ESW_P3_LED_0 pin
> > > states may cause an unintended external crystal frequency
> > > to be selected.
> > > 
> > > The HT_XTAL_FSEL (External Crystal Frequency Selection)
> > > field of HWTRAP (the Hardware Trap register) stores a
> > > 2-bit value that represents the state of the ESW_P4_LED_0
> > > and ESW_P4_LED_0 pins (seemingly) sampled just after the
> > > MT7530 has been reset, as:
> > > 
> > > [...]
> > 
> > Here is the summary with links:
> >    - net: dsa: mt7530: disable LEDs before reset
> >      https://git.kernel.org/netdev/net-next/c/2920dd92b980
> > 
> > You are awesome, thank you!
Arınç ÜNAL March 11, 2024, 11:27 p.m. UTC | #7
On 12.03.2024 00:58, Daniel Golle wrote:
> On Tue, Mar 12, 2024 at 12:22:48AM +0300, Arınç ÜNAL wrote:
>> Why was this applied? I already explained it did not achieve anything.
> 
> I agree that we were still debating about it, however, I do believe
> Justin that he truely observed this problem and the fix seemed
> appropriate to me.
> 
> I've explained this in my previous email which you did not notice
> or at least haven't repied to:
> 
> https://patchwork.kernel.org/project/netdevbpf/patch/20240305043952.21590-1-justin.swartz@risingedge.co.za/#25753421

I did read that and I did not respond because you did not argue over any of
the technical points I've made. All you said was did I repeat the test
enough, on a technical matter that I consider adding two and two together
and expecting a result other than four.

How I interpreted your response was: I don't know much about this, maybe
you're wrong. Justin must've made this patch for a reason so let's have
them elaborate further.

> 
> In the end it probably depends on the electric capacity of the circuit
> connecting each LED, so it may not be reproducible on all boards and/or
> under all circumstances (temperature, humindity, ...).

I'm sorry, this makes no sense to me. I simply fail to see how this fits
here. Could you base your argument over my points please?

Do you agree that the LED controller starts manipulating the state of the
pins used for LEDs and bootstrapping after a link is established?

Do you agree that after power is cut from the switch IC and then given
back, any active link from before will go away, meaning the pins will go
back to the state that is being dictated by the bootstrapping design of the
board?

Do you agree that with power given back, the HWTRAP register will be
populated before a link is established?

> 
> Disabling the LEDs and waiting for around 1mS before reset seems like
> a sensible thing to do, and I'm glad Justin took care of it.

Let's ask Justin if they tested this on a standalone MT7530. Because I did.
The switch chip won't even be powered on before the switch chip reset
operation is done. So the operation this patch brings does not do anything
at all for standalone MT7530.

My conclusion to this patch is Justin tested this only on an MCM MT7530
where the switch IC still has power before the DSA subdriver kicks in. And
assumed that disabling the LED controller before switch chip reset would
"reduce" the possibility of having these pins continue being manipulated by
the LED controller AFTER power is cut off and given back to the switch
chip, where the state of these pins would be back to being dictated by the
bootstrapping design of the board.

Jakub, please revert this. And please next time do not apply any patch that
modifies this driver without my approval if I've already made an argument
against it. I'm actively maintaining this driver, if there's a need to
respond, I will do so.

This patch did not have any ACKs. It also did not have the tree described
on the subject. More reasons as to why this shouldn't have been applied in
its current state.

Arınç
Daniel Golle March 11, 2024, 11:43 p.m. UTC | #8
On Tue, Mar 12, 2024 at 02:27:25AM +0300, Arınç ÜNAL wrote:
> On 12.03.2024 00:58, Daniel Golle wrote:
> > On Tue, Mar 12, 2024 at 12:22:48AM +0300, Arınç ÜNAL wrote:
> > > Why was this applied? I already explained it did not achieve anything.
> > 
> > I agree that we were still debating about it, however, I do believe
> > Justin that he truely observed this problem and the fix seemed
> > appropriate to me.
> > 
> > I've explained this in my previous email which you did not notice
> > or at least haven't repied to:
> > 
> > https://patchwork.kernel.org/project/netdevbpf/patch/20240305043952.21590-1-justin.swartz@risingedge.co.za/#25753421
> 
> I did read that and I did not respond because you did not argue over any of
> the technical points I've made. All you said was did I repeat the test
> enough, on a technical matter that I consider adding two and two together
> and expecting a result other than four.
> 
> How I interpreted your response was: I don't know much about this, maybe
> you're wrong. Justin must've made this patch for a reason so let's have
> them elaborate further.
> 
> > 
> > In the end it probably depends on the electric capacity of the circuit
> > connecting each LED, so it may not be reproducible on all boards and/or
> > under all circumstances (temperature, humindity, ...).
> 
> I'm sorry, this makes no sense to me. I simply fail to see how this fits
> here. Could you base your argument over my points please?

Sure, will happily do so.

> 
> Do you agree that the LED controller starts manipulating the state of the
> pins used for LEDs and bootstrapping after a link is established?

Yes. But a reset may happen while a link is already up because the switch IC
was initialized and in use by the bootloader. And hence LED may be powered
by the LED controller in that moment **just before reset**.

> 
> Do you agree that after power is cut from the switch IC and then given
> back, any active link from before will go away, meaning the pins will go
> back to the state that is being dictated by the bootstrapping design of the
> board?

I don't see how this could be related. We are not talking about power cuts
here, but rather use of a RESET signal (typically a GPIO on standalone MT753x
or reset controller of the CPU-part of the MCM).

> 
> Do you agree that with power given back, the HWTRAP register will be
> populated before a link is established?

Yes sure, but see above.

> 
> > 
> > Disabling the LEDs and waiting for around 1mS before reset seems like
> > a sensible thing to do, and I'm glad Justin took care of it.
> 
> Let's ask Justin if they tested this on a standalone MT7530. Because I did.
> The switch chip won't even be powered on before the switch chip reset
> operation is done. So the operation this patch brings does not do anything
> at all for standalone MT7530.

This is not true in case the bootloader has already powered on the
switch in order to load firmware via TFTP. In this case the link may
be up (and hence LEDs may be powered on) at the moment the reset
triggerd by probe of the DSA driver kicks in.

> 
> My conclusion to this patch is Justin tested this only on an MCM MT7530
> where the switch IC still has power before the DSA subdriver kicks in. And
> assumed that disabling the LED controller before switch chip reset would
> "reduce" the possibility of having these pins continue being manipulated by
> the LED controller AFTER power is cut off and given back to the switch
> chip, where the state of these pins would be back to being dictated by the
> bootstrapping design of the board.
> 
> Jakub, please revert this. And please next time do not apply any patch that
> modifies this driver without my approval if I've already made an argument
> against it. I'm actively maintaining this driver, if there's a need to
> respond, I will do so.
> 
> This patch did not have any ACKs. It also did not have the tree described
> on the subject. More reasons as to why this shouldn't have been applied in
> its current state.

It was clearly recognizable as a fix.

However, I agree that applying it after Ack from an active maintainer
would have been better.

I don't see a need to revert it before this debate (which starts to
look like an argument over authority) has concluded.
Justin Swartz March 12, 2024, 12:07 a.m. UTC | #9
Hi Arınç

This reply will be best read with a fixed-width font.

On 2024-03-08 11:51, Arınç ÜNAL wrote:
> Hey Justin.
> 
> I couldn't find anything on the MT7621 Giga Switch Programming Guide 
> v0.3
> document regarding which pin corresponds to which bit on the HWTRAP
> register. There's only this mention on the LED controller section,
> "hardware traps and LEDs share the same pins in GSW". But page 16 of 
> the
> schematics document for Banana Pi BPI-R2 [1] fully documents this.

There is also a table in the "MT7530 Giga Switch Programming
Guide" that describes which pins correspond to the bits of
HWTRAP, but beware of typos.


> The HWTRAP register is populated right after power comes back after the
> switch chip is reset [2]. This means any active link before the reset 
> will
> go away so the high/low state of the pins will go back to being 
> dictated by
> the bootstrapping design of the board. The HWTRAP register will be
> populated before a link can be set up.

> In conclusion, I don't see any need to disable the LED controller 
> before
> resetting the switch chip.

I should have included more evidence to support my claim.


> [1] https://wiki.banana-pi.org/Banana_Pi_BPI-R2#Documents
> 
> [2] I've tested it on my MT7621AT board with a 40MHz XTAL frequency and 
> a
> board with standalone MT7530 with 25MHz XTAL frequency.
> 
> While the kernel was booting, before the DSA subdriver kicks in:
> - For the board with 40 MHz XTAL: I've connected a Vcc pin to 
> ESW_P3_LED_0
>   to set it high.

My board has a 40MHz crystal between XPTL_XI and XPTL_XO,
ESW_P4_LED_0 has a 4.7k pull up to 3.3V, and ESW_P3_LED_0
has a 4.7k pull down to GND.

For testing, I'm relying on the MT7530 itself to change the
ESW_P3_LED_0 level according to the link state.


> - For the board with 25 MHz XTAL: I've connected a GND pin to 
> ESW_P3_LED_0
>   to set it low.
> 
> Board with 40 MHz XTAL:
> [    2.359428] mt7530-mdio mdio-bus:1f: MT7530 adapts as multi-chip 
> module
> [    2.374918] mt7530-mdio mdio-bus:1f: xtal is 25MHz
> 
> Board with 25 MHz XTAL:
> [    4.324672] mt7530-mdio mdio-bus:1f: xtal is 40MHz
> 
> diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
> index 51d7b816dd02..beab5e5558d0 100644
> --- a/drivers/net/dsa/mt7530.c
> +++ b/drivers/net/dsa/mt7530.c
> @@ -2216,6 +2216,15 @@ mt7530_setup(struct dsa_switch *ds)
>  		return ret;
>  	}
>  +	if ((val & HWTRAP_XTAL_MASK) == HWTRAP_XTAL_25MHZ)
> +		dev_info(priv->dev, "xtal is 25MHz\n");
> +
> +	if ((val & HWTRAP_XTAL_MASK) == HWTRAP_XTAL_40MHZ)
> +		dev_info(priv->dev, "xtal is 40MHz\n");
> +
> +	if ((val & HWTRAP_XTAL_MASK) == HWTRAP_XTAL_20MHZ)
> +		dev_info(priv->dev, "xtal is 20MHz\n");
> +
>  	id = mt7530_read(priv, MT7530_CREV);
>  	id >>= CHIP_NAME_SHIFT;
>  	if (id != MT7530_ID) {
> 
> Arınç

I took a similar approach, with calls to dev_info()
throughout mt7530_setup() plus mt7530_write(), mt7530_read()
and mt7530_rmw() to get an idea of the flow of execution and
which registers were being manipulated.

Calls to dev_info() affected timing, so I switched to using
trace_printk() and then read the output from the debugfs's
tracing/trace file instead of from the console.

I attached a logic analyzer to ESW_P4_LED_0 and ESW_P3_LED_0,
and manually triggered sampling as soon as execution of the
kernel was reported on UART1.


-- Test #1 -----------------------------------------------------------

For the sake of maximal reproducability, I removed the delay
between the assertion and deassertion of reset and added
HWTRAP value tracing:

---%---
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -2243,7 +2243,6 @@ mt7530_setup(struct dsa_switch *ds)
	 */
	if (priv->mcm) {
		reset_control_assert(priv->rstc);
-		usleep_range(1000, 1100);
		reset_control_deassert(priv->rstc);
	} else {
		gpiod_set_value_cansleep(priv->reset, 0);
@@ -2260,6 +2259,10 @@ mt7530_setup(struct dsa_switch *ds)
		return ret;
	}

+	static char ht_xtal_fsel[4][6] = { "?????", "20MHz", "40MHz", "25MHz" 
};
+	trace_printk("HWTRAP = %x, HT_XTAL_FSEL = %s\n",
+		val, ht_xtal_fsel[(val & HWTRAP_XTAL_MASK) >> 9]);
+
         id = mt7530_read(priv, MT7530_CREV);
         id >>= CHIP_NAME_SHIFT;
         if (id != MT7530_ID) {
---%---

(a) Without a cable connected to Port 3 (lan4) before reset, the
correct external crystal frequency is selected:

               [3]      [4]     [6]
               :        :       :
               _________         ______________________________________
ESW_P4_LED_0           |_______|
                         _______
ESW_P3_LED_0  _________|       |______________________________________

                         :     :
                         [5]...:

[3] Port 4 LED Off. Port 3 LED On.
[4] Signals inverted. (reset_control_assert; reset_control_deassert)
[5] Period of 310 usec.
[6] Signals reflect the bootstrapped configuration.


~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
     2.432646: mt7530_setup: HWTRAP = 7dcf, HT_XTAL_FSEL = 40MHz


(b) When a cable attached to an active peer is connected to
Port 3 (lan4) before reset, the incorrect crystal frequency
selection (0b11 = 25MHz) always occurs:

               [7]      [8]     [10]    [12]
               :        :       :       :
               _________         ______________________________________
ESW_P4_LED_0           |_______|
               _________         _______
ESW_P3_LED_0           |_______|       |______________________________

                         :     : :     :
                         [9]...: [11]..:

  [7] Port 4 LED Off. Port 3 LED Off.
  [8] Signals inverted. (reset_control_assert; reset_control_deassert)
  [9] Period of 320 usec.
[10] Signals inverted.
[11] Period of 300 usec.
[12] Signals reflect the bootstrapped configuration.

~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
     2.432884: mt7530_setup: HWTRAP = 7fcf, HT_XTAL_FSEL = 25MHz


-- Test #2 -----------------------------------------------------------

To attempt to determine which transitions are associated
with asserting and deasserting reset, I performed another
test with a delay of what I intended to be a 1 sec period
where the MT7530 is held in reset:

---%---
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -2243,7 +2243,7 @@ mt7530_setup(struct dsa_switch *ds)
	 */
	if (priv->mcm) {
		reset_control_assert(priv->rstc);
-		usleep_range(1000, 1100);
+		usleep_range(1000000, 1000000);
		reset_control_deassert(priv->rstc);
	} else {
		gpiod_set_value_cansleep(priv->reset, 0);
@@ -2260,6 +2260,10 @@ mt7530_setup(struct dsa_switch *ds)
		return ret;
	}

+	static char ht_xtal_fsel[4][6] = { "?????", "20MHz", "40MHz", "25MHz" 
};
+	trace_printk("HWTRAP = %x, HT_XTAL_FSEL = %s\n",
+		val, ht_xtal_fsel[(val & HWTRAP_XTAL_MASK) >> 9]);
+
	id = mt7530_read(priv, MT7530_CREV);
	id >>= CHIP_NAME_SHIFT;
	if (id != MT7530_ID) {
---%---

(a) Without a cable connected to Port 3 (lan4) before reset, the
correct external crystal frequency is again selected:

               [13]     [14]    [16]
               :        :       :
               _________         ______________________________________
ESW_P4_LED_0           |_______|
                         _______
ESW_P3_LED_0  _________|       |______________________________________

                         :     :
                         [15]..:

[13] Port 4 LED Off. Port 3 LED On.
[14] Signals inverted. (reset_control_deassert?)
[15] Period of 310 usec.
[16] Signals reflect the bootstrapped configuration.


~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
     3.437461: mt7530_setup: HWTRAP = 7dcf, HT_XTAL_FSEL = 40MHz


No difference is apparent in the timing diagram, compared
to the result from Test #1a, but it is obvious that the code
which follows the reset was executed about 1 second later.


(b) With a cable from an active peer connected to Port 3
(lan4) before reset, the correct crystal frequency
(0b10 = 40MHz) is selected:

               [17]     [18]                    [20]   [22]
               :        :                       :      :
               ______________________ \ \ ______        _______________
ESW_P4_LED_0                         / /       |______|
               _________              \ \        ______
ESW_P3_LED_0           |____________ / / ______|      |_______________
                                      \ \
                         :                     : :    :
                         [19]..................: [21].:

[17] Port 4 LED Off. Port 3 LED Off.
[18] ESW_P3_LED_0 set low. (reset_control_assert?)
[19] Period of 1000.325 msec.
[20] Signals inverted. (reset_control_deassert?)
[21] Period of 310 usec.
[22] Signals reflect the bootstrapped configuration.


~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
     3.433235: mt7530_setup: HWTRAP = 7dcf, HT_XTAL_FSEL = 40MHz


So it appears that when reset is deasserted, the ESW_P4_LED_0
and ESW_P3_LED_0 levels are inverted for a period of about
300 - 310 usec in Test #1a, #1b, #2a, and #2b.

Co-incidentally, these inverted levels are the active low
representation of what ends up in HT_XTAL_FSEL. And in all
four examples, have been the inversion of what immediately
preceded reset_control_deassert().


-- Test #3 -----------------------------------------------------------

Now it seems that there is a signature that can be used
to identify when reset_control_deassert() is executed,
the time of reset_control_assert()'s execution should be
between (at most) 1100 and (at least) 1000 usec prior
based on the unmodified reset logic.

So this patch only includes the HT_XTAL_FSEL trace.

---%---
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -2260,6 +2260,10 @@ mt7530_setup(struct dsa_switch *ds)
                 return ret;
         }

+       static char ht_xtal_fsel[4][6] = { "?????", "20MHz", "40MHz", 
"25MHz" };
+       trace_printk("HWTRAP = %x, HT_XTAL_FSEL = %s\n",
+               val, ht_xtal_fsel[(val & HWTRAP_XTAL_MASK) >> 9]);
+
         id = mt7530_read(priv, MT7530_CREV);
         id >>= CHIP_NAME_SHIFT;
         if (id != MT7530_ID) {
---%---

The purpose of the following examples are to show the
variance in how long it takes for ESW_P3_LED_0 to go low.

(a) With a cable from an active peer connected to Port 3
(lan4) before reset, the correct crystal frequency
(0b10 = 40MHz) is selected.

               [23]    [24]       [26]      [28]    [30]
               :       :          :         :       :
               _____________________________         __________________
ESW_P4_LED_0                               |_______|
               ___________________           _______
ESW_P3_LED_0                     |_________|       |__________________

                        :          :       : :     :
                        :          [27]....: [29]..:
                        [25]...............:

[23] Port 4 LED Off. Port 3 LED Off.
[24] Approximate point of reset_control_assert?
[25] Period of approximately 1000 - 1100 usec.
[26] ESW_P3_LED_0 finally goes low.
[27] Period of 415 usec.
[28] Signals inverted. (reset_control_deassert?)
[29] Period of 310 usec.
[30] Signals reflect the bootstrapped configuration.


(b) With a cable from an active peer connected to Port 3
(lan4) before reset, the correct crystal frequency
(0b10 = 40MHz) is selected.

               [31]    [32]            [34] [36]    [38]
               :       :                  : :       :
               _____________________________         __________________
ESW_P4_LED_0                               |_______|
               ___________________________   _______
ESW_P3_LED_0                             |_|       |__________________

                        :                  : :     :
                        :               [35] [37]..:
                        :                  :
                        [33]...............:

[31] Port 4 LED Off. Port 3 LED Off.
[32] Approximate point of reset_control_assert?
[33] Period of approximately 1000 - 1100 usec.
[34] ESW_P3_LED_0 finally goes low.
[35] Period of 50 usec.
[36] Signals inverted. (reset_control_deassert?)
[37] Period of 310 usec.
[38] Signals reflect the bootstrapped configuration.


(c) With a cable from an active peer connected to Port 3
(lan4) before reset, an incorrect crystal frequency
(0b11 = 25MHz) is selected.

               [45]    [46]                 [48]    [50]
               :       :                    :       :
               _____________________________         __________________
ESW_P4_LED_0                               |_______|
               _____________________________
ESW_P3_LED_0                               |__________________________

                        :                  : :     :
                        :                  : [49]..:
                        :                  :
                        [47]...............:

[45] Port 4 LED Off. Port 3 LED Off.
[46] Approximate point of reset_control_assert?
[47] Period of approximately 1000 - 1100 usec.
[48] Signals inverted. (reset_control_deassert?)
[49] Period of 315 usec.
[50] Signals reflect the bootstrapped configuration.

~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
     2.617819: mt7530_setup: HWTRAP = 7fcf, HT_XTAL_FSEL = 25MHz

~ # cat /proc/cmdline
console=ttyS0,57600 loglevel=7 printk.time=1 root=/dev/ram0 debug 
rdinit=/linuxrc


-- End of Tests ------------------------------------------------------

It seems that the incorrect crystal frequency is selected more
often when debugging messages are present (such as printk()
based approaches) and especially when loglevel=7 and printk.time=1
are specified on the command line.

For the sake of reference: I disabled ethernet support in my build
of (mainline) U-boot, and my kernel configuration is a very lean
selection of options suited for IP routing and a few peripherals
on the I2C and SPI buses.

My userland is confined to an initramfs built around busybox.

I also disable gmac1 because I need a few of the rgmii2 pins for
modem control signalling.

Regards
Justin

PS: Yes, I only have access to MT7621AT SoCs.


> On 5.03.2024 07:39, Justin Swartz wrote:
>> Disable LEDs just before resetting the MT7530 to avoid
>> situations where the ESW_P4_LED_0 and ESW_P3_LED_0 pin
>> states may cause an unintended external crystal frequency
>> to be selected.
>> 
>> The HT_XTAL_FSEL (External Crystal Frequency Selection)
>> field of HWTRAP (the Hardware Trap register) stores a
>> 2-bit value that represents the state of the ESW_P4_LED_0
>> and ESW_P4_LED_0 pins (seemingly) sampled just after the
>> MT7530 has been reset, as:
>> 
>>      ESW_P4_LED_0    ESW_P3_LED_0    Frequency
>>      -----------------------------------------
>>      0               1               20MHz
>>      1               0               40MHz
>>      1               1               25MHz
>> 
>> The value of HT_XTAL_FSEL is bootstrapped by pulling
>> ESW_P4_LED_0 and ESW_P3_LED_0 up or down accordingly,
>> but:
>> 
>>    if a 40MHz crystal has been selected and
>>    the ESW_P3_LED_0 pin is high during reset,
>> 
>>    or a 20MHz crystal has been selected and
>>    the ESW_P4_LED_0 pin is high during reset,
>> 
>>    then the value of HT_XTAL_FSEL will indicate
>>    that a 25MHz crystal is present.
>> 
>> By default, the state of the LED pins is PHY controlled
>> to reflect the link state.
>> 
>> To illustrate, if a board has:
>> 
>>    5 ports with active low LED control,
>>    and HT_XTAL_FSEL bootstrapped for 40MHz.
>> 
>> When the MT7530 is powered up without any external
>> connection, only the LED associated with Port 3 is
>> illuminated as ESW_P3_LED_0 is low.
>> 
>> In this state, directly after mt7530_setup()'s reset
>> is performed, the HWTRAP register (0x7800) reflects
>> the intended HT_XTAL_FSEL (HWTRAP bits 10:9) of 40MHz:
>> 
>>    mt7530-mdio mdio-bus:1f: mt7530_read: 00007800 == 00007dcf
>> 
>>    >>> bin(0x7dcf >> 9 & 0b11)
>>    '0b10'
>> 
>> But if a cable is connected to Port 3 and the link
>> is active before mt7530_setup()'s reset takes place,
>> then HT_XTAL_FSEL seems to be set for 25MHz:
>> 
>>    mt7530-mdio mdio-bus:1f: mt7530_read: 00007800 == 00007fcf
>> 
>>    >>> bin(0x7fcf >> 9 & 0b11)
>>    '0b11'
>> 
>> Once HT_XTAL_FSEL reflects 25MHz, none of the ports
>> are functional until the MT7621 (or MT7530 itself)
>> is reset.
>> 
>> By disabling the LED pins just before reset, the chance
>> of an unintended HT_XTAL_FSEL value is reduced.
>> 
>> Signed-off-by: Justin Swartz <justin.swartz@risingedge.co.za>
>> ---
>>   drivers/net/dsa/mt7530.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>> 
>> diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
>> index 3c1f65759..8fa113126 100644
>> --- a/drivers/net/dsa/mt7530.c
>> +++ b/drivers/net/dsa/mt7530.c
>> @@ -2238,6 +2238,12 @@ mt7530_setup(struct dsa_switch *ds)
>>   		}
>>   	}
>>   +	/* Disable LEDs before reset to prevent the MT7530 sampling a
>> +	 * potentially incorrect HT_XTAL_FSEL value.
>> +	 */
>> +	mt7530_write(priv, MT7530_LED_EN, 0);
>> +	usleep_range(1000, 1100);
>> +
>>   	/* Reset whole chip through gpio pin or memory-mapped registers for
>>   	 * different type of hardware
>>   	 */
Andrew Lunn March 12, 2024, 1:03 a.m. UTC | #10
> (b) When a cable attached to an active peer is connected to
> Port 3 (lan4) before reset, the incorrect crystal frequency
> selection (0b11 = 25MHz) always occurs:
> 
>               [7]      [8]     [10]    [12]
>               :        :       :       :
>               _________         ______________________________________
> ESW_P4_LED_0           |_______|
>               _________         _______
> ESW_P3_LED_0           |_______|       |______________________________
> 
>                         :     : :     :
>                         [9]...: [11]..:
> 
>  [7] Port 4 LED Off. Port 3 LED Off.
>  [8] Signals inverted. (reset_control_assert; reset_control_deassert)
>  [9] Period of 320 usec.
> [10] Signals inverted.
> [11] Period of 300 usec.
> [12] Signals reflect the bootstrapped configuration.

Shame the patch has already been accepted. The text in this email
would of made a cool commit message.

      Andrew
Arınç ÜNAL March 12, 2024, 2:41 a.m. UTC | #11
On 12.03.2024 03:07, Justin Swartz wrote:
> I took a similar approach, with calls to dev_info()
> throughout mt7530_setup() plus mt7530_write(), mt7530_read()
> and mt7530_rmw() to get an idea of the flow of execution and
> which registers were being manipulated.
> 
> Calls to dev_info() affected timing, so I switched to using
> trace_printk() and then read the output from the debugfs's
> tracing/trace file instead of from the console.
> 
> I attached a logic analyzer to ESW_P4_LED_0 and ESW_P3_LED_0,
> and manually triggered sampling as soon as execution of the
> kernel was reported on UART1.
> 
> 
> -- Test #1 -----------------------------------------------------------
> 
> For the sake of maximal reproducability, I removed the delay
> between the assertion and deassertion of reset and added
> HWTRAP value tracing:
> 
> ---%---
> --- a/drivers/net/dsa/mt7530.c
> +++ b/drivers/net/dsa/mt7530.c
> @@ -2243,7 +2243,6 @@ mt7530_setup(struct dsa_switch *ds)
>       */
>      if (priv->mcm) {
>          reset_control_assert(priv->rstc);
> -        usleep_range(1000, 1100);
>          reset_control_deassert(priv->rstc);
>      } else {
>          gpiod_set_value_cansleep(priv->reset, 0);
> @@ -2260,6 +2259,10 @@ mt7530_setup(struct dsa_switch *ds)
>          return ret;
>      }
> 
> +    static char ht_xtal_fsel[4][6] = { "?????", "20MHz", "40MHz", "25MHz" };
> +    trace_printk("HWTRAP = %x, HT_XTAL_FSEL = %s\n",
> +        val, ht_xtal_fsel[(val & HWTRAP_XTAL_MASK) >> 9]);
> +
>          id = mt7530_read(priv, MT7530_CREV);
>          id >>= CHIP_NAME_SHIFT;
>          if (id != MT7530_ID) {
> ---%---
> 
> (a) Without a cable connected to Port 3 (lan4) before reset, the
> correct external crystal frequency is selected:
> 
>                [3]      [4]     [6]
>                :        :       :
>                _________         ______________________________________
> ESW_P4_LED_0           |_______|
>                          _______
> ESW_P3_LED_0  _________|       |______________________________________
> 
>                          :     :
>                          [5]...:
> 
> [3] Port 4 LED Off. Port 3 LED On.
> [4] Signals inverted. (reset_control_assert; reset_control_deassert)
> [5] Period of 310 usec.
> [6] Signals reflect the bootstrapped configuration.
> 
> 
> ~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
>      2.432646: mt7530_setup: HWTRAP = 7dcf, HT_XTAL_FSEL = 40MHz
> 
> 
> (b) When a cable attached to an active peer is connected to
> Port 3 (lan4) before reset, the incorrect crystal frequency
> selection (0b11 = 25MHz) always occurs:
> 
>                [7]      [8]     [10]    [12]
>                :        :       :       :
>                _________         ______________________________________
> ESW_P4_LED_0           |_______|
>                _________         _______
> ESW_P3_LED_0           |_______|       |______________________________
> 
>                          :     : :     :
>                          [9]...: [11]..:
> 
>   [7] Port 4 LED Off. Port 3 LED Off.
>   [8] Signals inverted. (reset_control_assert; reset_control_deassert)
>   [9] Period of 320 usec.
> [10] Signals inverted.
> [11] Period of 300 usec.
> [12] Signals reflect the bootstrapped configuration.
> 
> ~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
>      2.432884: mt7530_setup: HWTRAP = 7fcf, HT_XTAL_FSEL = 25MHz
> 
> 
> -- Test #2 -----------------------------------------------------------
> 
> To attempt to determine which transitions are associated
> with asserting and deasserting reset, I performed another
> test with a delay of what I intended to be a 1 sec period
> where the MT7530 is held in reset:
> 
> ---%---
> --- a/drivers/net/dsa/mt7530.c
> +++ b/drivers/net/dsa/mt7530.c
> @@ -2243,7 +2243,7 @@ mt7530_setup(struct dsa_switch *ds)
>       */
>      if (priv->mcm) {
>          reset_control_assert(priv->rstc);
> -        usleep_range(1000, 1100);
> +        usleep_range(1000000, 1000000);
>          reset_control_deassert(priv->rstc);
>      } else {
>          gpiod_set_value_cansleep(priv->reset, 0);
> @@ -2260,6 +2260,10 @@ mt7530_setup(struct dsa_switch *ds)
>          return ret;
>      }
> 
> +    static char ht_xtal_fsel[4][6] = { "?????", "20MHz", "40MHz", "25MHz" };
> +    trace_printk("HWTRAP = %x, HT_XTAL_FSEL = %s\n",
> +        val, ht_xtal_fsel[(val & HWTRAP_XTAL_MASK) >> 9]);
> +
>      id = mt7530_read(priv, MT7530_CREV);
>      id >>= CHIP_NAME_SHIFT;
>      if (id != MT7530_ID) {
> ---%---
> 
> (a) Without a cable connected to Port 3 (lan4) before reset, the
> correct external crystal frequency is again selected:
> 
>                [13]     [14]    [16]
>                :        :       :
>                _________         ______________________________________
> ESW_P4_LED_0           |_______|
>                          _______
> ESW_P3_LED_0  _________|       |______________________________________
> 
>                          :     :
>                          [15]..:
> 
> [13] Port 4 LED Off. Port 3 LED On.
> [14] Signals inverted. (reset_control_deassert?)
> [15] Period of 310 usec.
> [16] Signals reflect the bootstrapped configuration.
> 
> 
> ~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
>      3.437461: mt7530_setup: HWTRAP = 7dcf, HT_XTAL_FSEL = 40MHz
> 
> 
> No difference is apparent in the timing diagram, compared
> to the result from Test #1a, but it is obvious that the code
> which follows the reset was executed about 1 second later.
> 
> 
> (b) With a cable from an active peer connected to Port 3
> (lan4) before reset, the correct crystal frequency
> (0b10 = 40MHz) is selected:
> 
>                [17]     [18]                    [20]   [22]
>                :        :                       :      :
>                ______________________ \ \ ______        _______________
> ESW_P4_LED_0                         / /       |______|
>                _________              \ \        ______
> ESW_P3_LED_0           |____________ / / ______|      |_______________
>                                       \ \
>                          :                     : :    :
>                          [19]..................: [21].:
> 
> [17] Port 4 LED Off. Port 3 LED Off.
> [18] ESW_P3_LED_0 set low. (reset_control_assert?)
> [19] Period of 1000.325 msec.
> [20] Signals inverted. (reset_control_deassert?)
> [21] Period of 310 usec.
> [22] Signals reflect the bootstrapped configuration.
> 
> 
> ~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
>      3.433235: mt7530_setup: HWTRAP = 7dcf, HT_XTAL_FSEL = 40MHz
> 
> 
> So it appears that when reset is deasserted, the ESW_P4_LED_0
> and ESW_P3_LED_0 levels are inverted for a period of about
> 300 - 310 usec in Test #1a, #1b, #2a, and #2b.
> 
> Co-incidentally, these inverted levels are the active low
> representation of what ends up in HT_XTAL_FSEL. And in all
> four examples, have been the inversion of what immediately
> preceded reset_control_deassert().
> 
> 
> -- Test #3 -----------------------------------------------------------
> 
> Now it seems that there is a signature that can be used
> to identify when reset_control_deassert() is executed,
> the time of reset_control_assert()'s execution should be
> between (at most) 1100 and (at least) 1000 usec prior
> based on the unmodified reset logic.
> 
> So this patch only includes the HT_XTAL_FSEL trace.
> 
> ---%---
> --- a/drivers/net/dsa/mt7530.c
> +++ b/drivers/net/dsa/mt7530.c
> @@ -2260,6 +2260,10 @@ mt7530_setup(struct dsa_switch *ds)
>                  return ret;
>          }
> 
> +       static char ht_xtal_fsel[4][6] = { "?????", "20MHz", "40MHz", "25MHz" };
> +       trace_printk("HWTRAP = %x, HT_XTAL_FSEL = %s\n",
> +               val, ht_xtal_fsel[(val & HWTRAP_XTAL_MASK) >> 9]);
> +
>          id = mt7530_read(priv, MT7530_CREV);
>          id >>= CHIP_NAME_SHIFT;
>          if (id != MT7530_ID) {
> ---%---
> 
> The purpose of the following examples are to show the
> variance in how long it takes for ESW_P3_LED_0 to go low.
> 
> (a) With a cable from an active peer connected to Port 3
> (lan4) before reset, the correct crystal frequency
> (0b10 = 40MHz) is selected.
> 
>                [23]    [24]       [26]      [28]    [30]
>                :       :          :         :       :
>                _____________________________         __________________
> ESW_P4_LED_0                               |_______|
>                ___________________           _______
> ESW_P3_LED_0                     |_________|       |__________________
> 
>                         :          :       : :     :
>                         :          [27]....: [29]..:
>                         [25]...............:
> 
> [23] Port 4 LED Off. Port 3 LED Off.
> [24] Approximate point of reset_control_assert?
> [25] Period of approximately 1000 - 1100 usec.
> [26] ESW_P3_LED_0 finally goes low.
> [27] Period of 415 usec.
> [28] Signals inverted. (reset_control_deassert?)
> [29] Period of 310 usec.
> [30] Signals reflect the bootstrapped configuration.
> 
> 
> (b) With a cable from an active peer connected to Port 3
> (lan4) before reset, the correct crystal frequency
> (0b10 = 40MHz) is selected.
> 
>                [31]    [32]            [34] [36]    [38]
>                :       :                  : :       :
>                _____________________________         __________________
> ESW_P4_LED_0                               |_______|
>                ___________________________   _______
> ESW_P3_LED_0                             |_|       |__________________
> 
>                         :                  : :     :
>                         :               [35] [37]..:
>                         :                  :
>                         [33]...............:
> 
> [31] Port 4 LED Off. Port 3 LED Off.
> [32] Approximate point of reset_control_assert?
> [33] Period of approximately 1000 - 1100 usec.
> [34] ESW_P3_LED_0 finally goes low.
> [35] Period of 50 usec.
> [36] Signals inverted. (reset_control_deassert?)
> [37] Period of 310 usec.
> [38] Signals reflect the bootstrapped configuration.
> 
> 
> (c) With a cable from an active peer connected to Port 3
> (lan4) before reset, an incorrect crystal frequency
> (0b11 = 25MHz) is selected.
> 
>                [45]    [46]                 [48]    [50]
>                :       :                    :       :
>                _____________________________         __________________
> ESW_P4_LED_0                               |_______|
>                _____________________________
> ESW_P3_LED_0                               |__________________________
> 
>                         :                  : :     :
>                         :                  : [49]..:
>                         :                  :
>                         [47]...............:
> 
> [45] Port 4 LED Off. Port 3 LED Off.
> [46] Approximate point of reset_control_assert?
> [47] Period of approximately 1000 - 1100 usec.
> [48] Signals inverted. (reset_control_deassert?)
> [49] Period of 315 usec.
> [50] Signals reflect the bootstrapped configuration.
> 
> ~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
>      2.617819: mt7530_setup: HWTRAP = 7fcf, HT_XTAL_FSEL = 25MHz
> 
> ~ # cat /proc/cmdline
> console=ttyS0,57600 loglevel=7 printk.time=1 root=/dev/ram0 debug rdinit=/linuxrc
> 
> 
> -- End of Tests ------------------------------------------------------
> 
> It seems that the incorrect crystal frequency is selected more
> often when debugging messages are present (such as printk()
> based approaches) and especially when loglevel=7 and printk.time=1
> are specified on the command line.
> 
> For the sake of reference: I disabled ethernet support in my build
> of (mainline) U-boot, and my kernel configuration is a very lean
> selection of options suited for IP routing and a few peripherals
> on the I2C and SPI buses.
> 
> My userland is confined to an initramfs built around busybox.
> 
> I also disable gmac1 because I need a few of the rgmii2 pins for
> modem control signalling.
> 
> Regards
> Justin
> 
> PS: Yes, I only have access to MT7621AT SoCs.

Thanks for the detailed presentation. I've simplified it to this, from what
I understood.

--- Currently -------------------------------------------------------------

With a cable from an active peer connected to Port 3 (lan4) before
reset:

Test 1, the correct crystal frequency (0b10 = 40MHz) is selected.

                       [1]        [2-1]     [3]     [5]
                       :          :         :       :
               _____________________________         __________________
ESW_P4_LED_0                               |_______|
               ___________________           _______
ESW_P3_LED_0                     |_________|       |__________________

                        :          :       : :     :
                        :          [2-2]...: [4]...:
                        [2]................:

[1] reset_control_assert.
[2] Period of 1000 - 1100 usec.
[2-1] ESW_P3_LED_0 goes low.
[2-2] Period of 415 usec.
[3] reset_control_deassert.
[4] Period of 310 usec. HWTRAP register is populated with bootstrapped
     XTAL frequency.
[5] Signals reflect the bootstrapped configuration.

Test 2, an incorrect crystal frequency (0b11 = 25MHz) is selected.

                       [2]                  [4]     [6]
                       :                    :       :
               _____________________________         __________________
ESW_P4_LED_0                               |_______|
               _____________________________
ESW_P3_LED_0                               |__________________________

                        :                  : :     :
                        :                  : [5]...:
                        :                  :
                        [3]................:

[1] reset_control_assert.
[2] Period of 1000 - 1100 usec.
[3] reset_control_deassert.
[5] Period of 315 usec. HWTRAP register is populated with incorrect
     XTAL frequency.
[6] Signals reflect the bootstrapped configuration.

--- 1 second delay between assert and deassert ----------------------------

With a cable from an active peer connected to Port 3 (lan4) before
reset, the correct crystal frequency (0b10 = 40MHz) is selected:

                       [1]        [2-1]     [3]     [5]
                       :          :         :       :
               _____________________________         __________________
ESW_P4_LED_0                               |_______|
               ___________________           _______
ESW_P3_LED_0                     |_________|       |__________________

                        :          :       : :     :
                        :          [2-2]...: [4]...:
                        [2]................:

[1] reset_control_assert.
[3] Period of 1000.325 msec.
[2-1] ESW_P3_LED_0 goes low.
[2-2] Remaining period of 1000.325 msec.
[3] reset_control_deassert.
[4] Period of 310 usec. HWTRAP register is populated with bootstrapped
     XTAL frequency.
[5] Signals reflect the bootstrapped configuration.

---

Wouldn't it be a better idea to increase the delay between
reset_control_assert() and reset_control_deassert(), and
gpiod_set_value_cansleep(priv->reset, 0) and
gpiod_set_value_cansleep(priv->reset, 1) instead of disabling the LED
controller and delaying before reset?

One MT7531 pin used for an LED is also used to decide the crystal frequency
between 40MHz and 25Mhz. We should implement this there as well.

Arınç
Arınç ÜNAL March 12, 2024, 3:17 a.m. UTC | #12
On 12.03.2024 02:43, Daniel Golle wrote:
> On Tue, Mar 12, 2024 at 02:27:25AM +0300, Arınç ÜNAL wrote:
>> On 12.03.2024 00:58, Daniel Golle wrote:
>>> On Tue, Mar 12, 2024 at 12:22:48AM +0300, Arınç ÜNAL wrote:
>>>> Why was this applied? I already explained it did not achieve anything.
>>>
>>> I agree that we were still debating about it, however, I do believe
>>> Justin that he truely observed this problem and the fix seemed
>>> appropriate to me.
>>>
>>> I've explained this in my previous email which you did not notice
>>> or at least haven't repied to:
>>>
>>> https://patchwork.kernel.org/project/netdevbpf/patch/20240305043952.21590-1-justin.swartz@risingedge.co.za/#25753421
>>
>> I did read that and I did not respond because you did not argue over any of
>> the technical points I've made. All you said was did I repeat the test
>> enough, on a technical matter that I consider adding two and two together
>> and expecting a result other than four.
>>
>> How I interpreted your response was: I don't know much about this, maybe
>> you're wrong. Justin must've made this patch for a reason so let's have
>> them elaborate further.
>>
>>>
>>> In the end it probably depends on the electric capacity of the circuit
>>> connecting each LED, so it may not be reproducible on all boards and/or
>>> under all circumstances (temperature, humindity, ...).
>>
>> I'm sorry, this makes no sense to me. I simply fail to see how this fits
>> here. Could you base your argument over my points please?
> 
> Sure, will happily do so.
> 
>>
>> Do you agree that the LED controller starts manipulating the state of the
>> pins used for LEDs and bootstrapping after a link is established?
> 
> Yes. But a reset may happen while a link is already up because the switch IC
> was initialized and in use by the bootloader. And hence LED may be powered
> by the LED controller in that moment **just before reset**.

Yes.

> 
>>
>> Do you agree that after power is cut from the switch IC and then given
>> back, any active link from before will go away, meaning the pins will go
>> back to the state that is being dictated by the bootstrapping design of the
>> board?
> 
> I don't see how this could be related. We are not talking about power cuts
> here, but rather use of a RESET signal (typically a GPIO on standalone MT753x
> or reset controller of the CPU-part of the MCM).

Ok that makes sense. Thanks for clearing that up for me.

> 
>>
>> Do you agree that with power given back, the HWTRAP register will be
>> populated before a link is established?
> 
> Yes sure, but see above.
> 
>>
>>>
>>> Disabling the LEDs and waiting for around 1mS before reset seems like
>>> a sensible thing to do, and I'm glad Justin took care of it.
>>
>> Let's ask Justin if they tested this on a standalone MT7530. Because I did.
>> The switch chip won't even be powered on before the switch chip reset
>> operation is done. So the operation this patch brings does not do anything
>> at all for standalone MT7530.
> 
> This is not true in case the bootloader has already powered on the
> switch in order to load firmware via TFTP. In this case the link may
> be up (and hence LEDs may be powered on) at the moment the reset
> triggerd by probe of the DSA driver kicks in.

This diff works on MCM MT7530 on MT7621AT. Not on standalone MT7530 on
MT7623NI SoC. Also, I believe the LEDs turn off at the first mt7530_probe()
which returns -EPROBE_DEFER.

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index b012cbb249e4..f1a5673baa55 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -2247,6 +2247,24 @@ mt7530_setup(struct dsa_switch *ds)
  		}
  	}
  
+	/* Waiting for MT7530 got to stable */
+	INIT_MT7530_DUMMY_POLL(&p, priv, MT7530_HWTRAP);
+	ret = readx_poll_timeout(_mt7530_read, &p, val, val != 0,
+				 20, 1000000);
+	if (ret < 0) {
+		dev_err(priv->dev, "reset timeout\n");
+		return ret;
+	}
+
+	if ((val & HWTRAP_XTAL_MASK) == HWTRAP_XTAL_25MHZ)
+		dev_info(priv->dev, "xtal is 25MHz\n");
+
+	if ((val & HWTRAP_XTAL_MASK) == HWTRAP_XTAL_40MHZ)
+		dev_info(priv->dev, "xtal is 40MHz\n");
+
+	if ((val & HWTRAP_XTAL_MASK) == HWTRAP_XTAL_20MHZ)
+		dev_info(priv->dev, "xtal is 20MHz\n");
+
  	/* Reset whole chip through gpio pin or memory-mapped registers for
  	 * different type of hardware
  	 */

[    5.319534] mt7530-mdio mdio-bus:1f: reset timeout
[    5.324371] mt7530-mdio: probe of mdio-bus:1f failed with error -110
[    5.330803] ------------[ cut here ]------------
[    5.335427] WARNING: CPU: 2 PID: 36 at drivers/regulator/core.c:2398 _regulator_put+0x15c/0x164
[    5.344180] Modules linked in:
[    5.347248] CPU: 2 PID: 36 Comm: kworker/u19:0 Not tainted 6.8.0-rc7-next-20240306+ #36
[    5.355264] Hardware name: Mediatek Cortex-A7 (Device Tree)
[    5.360841] Workqueue: events_unbound deferred_probe_work_func
[    5.366694] Call trace:
[    5.366710]  unwind_backtrace from show_stack+0x10/0x14
[    5.374487]  show_stack from dump_stack_lvl+0x54/0x68
[    5.379551]  dump_stack_lvl from __warn+0x94/0xc0
[    5.384272]  __warn from warn_slowpath_fmt+0x184/0x18c
[    5.389431]  warn_slowpath_fmt from _regulator_put+0x15c/0x164
[    5.395283]  _regulator_put from regulator_put+0x1c/0x2c
[    5.400611]  regulator_put from devres_release_all+0x98/0xfc
[    5.406290]  devres_release_all from device_unbind_cleanup+0xc/0x60
[    5.412577]  device_unbind_cleanup from really_probe+0x25c/0x3f4
[    5.418603]  really_probe from __driver_probe_device+0x9c/0x130
[    5.424543]  __driver_probe_device from driver_probe_device+0x30/0xc0
[    5.431003]  driver_probe_device from __device_attach_driver+0xa8/0x120
[    5.437639]  __device_attach_driver from bus_for_each_drv+0x90/0xe4
[    5.443927]  bus_for_each_drv from __device_attach+0xa8/0x1d4
[    5.449692]  __device_attach from bus_probe_device+0x88/0x8c
[    5.455370]  bus_probe_device from deferred_probe_work_func+0x8c/0xd4
[    5.461829]  deferred_probe_work_func from process_one_work+0x158/0x2e4
[    5.468465]  process_one_work from worker_thread+0x264/0x488
[    5.474142]  worker_thread from kthread+0xd4/0xd8
[    5.478865]  kthread from ret_from_fork+0x14/0x28
[    5.483583] Exception stack(0xf08bdfb0 to 0xf08bdff8)
[    5.488642] dfa0:                                     00000000 00000000 00000000 00000000
[    5.496829] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    5.505015] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[    5.511688] ---[ end trace 0000000000000000 ]---
[    5.516493] ------------[ cut here ]------------
[    5.521153] WARNING: CPU: 2 PID: 36 at drivers/regulator/core.c:2398 _regulator_put+0x15c/0x164
[    5.529904] Modules linked in:
[    5.532970] CPU: 2 PID: 36 Comm: kworker/u19:0 Tainted: G        W          6.8.0-rc7-next-20240306+ #36
[    5.542462] Hardware name: Mediatek Cortex-A7 (Device Tree)
[    5.548038] Workqueue: events_unbound deferred_probe_work_func
[    5.553890] Call trace:
[    5.553900]  unwind_backtrace from show_stack+0x10/0x14
[    5.561673]  show_stack from dump_stack_lvl+0x54/0x68
[    5.566737]  dump_stack_lvl from __warn+0x94/0xc0
[    5.571457]  __warn from warn_slowpath_fmt+0x184/0x18c
[    5.576615]  warn_slowpath_fmt from _regulator_put+0x15c/0x164
[    5.582465]  _regulator_put from regulator_put+0x1c/0x2c
[    5.587794]  regulator_put from devres_release_all+0x98/0xfc
[    5.593471]  devres_release_all from device_unbind_cleanup+0xc/0x60
[    5.599757]  device_unbind_cleanup from really_probe+0x25c/0x3f4
[    5.605784]  really_probe from __driver_probe_device+0x9c/0x130
[    5.611724]  __driver_probe_device from driver_probe_device+0x30/0xc0
[    5.618184]  driver_probe_device from __device_attach_driver+0xa8/0x120
[    5.624819]  __device_attach_driver from bus_for_each_drv+0x90/0xe4
[    5.631106]  bus_for_each_drv from __device_attach+0xa8/0x1d4
[    5.636871]  __device_attach from bus_probe_device+0x88/0x8c
[    5.642549]  bus_probe_device from deferred_probe_work_func+0x8c/0xd4
[    5.649009]  deferred_probe_work_func from process_one_work+0x158/0x2e4
[    5.655644]  process_one_work from worker_thread+0x264/0x488
[    5.661321]  worker_thread from kthread+0xd4/0xd8
[    5.666042]  kthread from ret_from_fork+0x14/0x28
[    5.670759] Exception stack(0xf08bdfb0 to 0xf08bdff8)
[    5.675818] dfa0:                                     00000000 00000000 00000000 00000000
[    5.684004] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    5.692189] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[    5.698858] ---[ end trace 0000000000000000 ]---

> 
>>
>> My conclusion to this patch is Justin tested this only on an MCM MT7530
>> where the switch IC still has power before the DSA subdriver kicks in. And
>> assumed that disabling the LED controller before switch chip reset would
>> "reduce" the possibility of having these pins continue being manipulated by
>> the LED controller AFTER power is cut off and given back to the switch
>> chip, where the state of these pins would be back to being dictated by the
>> bootstrapping design of the board.
>>
>> Jakub, please revert this. And please next time do not apply any patch that
>> modifies this driver without my approval if I've already made an argument
>> against it. I'm actively maintaining this driver, if there's a need to
>> respond, I will do so.
>>
>> This patch did not have any ACKs. It also did not have the tree described
>> on the subject. More reasons as to why this shouldn't have been applied in
>> its current state.
> 
> It was clearly recognizable as a fix.

I see that it was applied to the net-next tree[1], not net[2]. Looks like
it wasn't clear enough. Let's follow the rules.

> 
> However, I agree that applying it after Ack from an active maintainer
> would have been better.
> 
> I don't see a need to revert it before this debate (which starts to
> look like an argument over authority) has concluded.

Let's hope this patch makes its way to the net tree.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/log/drivers/net/dsa/mt7530.c
[2] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/log/drivers/net/dsa/mt7530.c

Arınç
Arınç ÜNAL March 12, 2024, 8:38 a.m. UTC | #13
On 12.03.2024 00:10, patchwork-bot+netdevbpf@kernel.org wrote:
> Hello:
> 
> This patch was applied to netdev/net-next.git (main)
> by Jakub Kicinski <kuba@kernel.org>:
> 
> On Tue,  5 Mar 2024 06:39:51 +0200 you wrote:
>> Disable LEDs just before resetting the MT7530 to avoid
>> situations where the ESW_P4_LED_0 and ESW_P3_LED_0 pin
>> states may cause an unintended external crystal frequency
>> to be selected.
>>
>> The HT_XTAL_FSEL (External Crystal Frequency Selection)
>> field of HWTRAP (the Hardware Trap register) stores a
>> 2-bit value that represents the state of the ESW_P4_LED_0
>> and ESW_P4_LED_0 pins (seemingly) sampled just after the
>> MT7530 has been reset, as:
>>
>> [...]
> 
> Here is the summary with links:
>    - net: dsa: mt7530: disable LEDs before reset
>      https://git.kernel.org/netdev/net-next/c/2920dd92b980
> 
> You are awesome, thank you!

I am once again calling for this patch to be reverted on the net-next tree
on the basis of:

- This patch did not go through a proper reviewing process. There are
   proposed changes on the code it changes regarding the scope and the
   method of the patch, and improvements to be made on the patch log.

- This patch should be backported to stable releases, therefore it
   shouldn't be on the net-next tree and should be submitted to the net tree
   instead.

Arınç
Paolo Abeni March 12, 2024, 10:46 a.m. UTC | #14
On Tue, 2024-03-12 at 11:38 +0300, Arınç ÜNAL wrote:
> On 12.03.2024 00:10, patchwork-bot+netdevbpf@kernel.org wrote:
> > Hello:
> > 
> > This patch was applied to netdev/net-next.git (main)
> > by Jakub Kicinski <kuba@kernel.org>:
> > 
> > On Tue,  5 Mar 2024 06:39:51 +0200 you wrote:
> > > Disable LEDs just before resetting the MT7530 to avoid
> > > situations where the ESW_P4_LED_0 and ESW_P3_LED_0 pin
> > > states may cause an unintended external crystal frequency
> > > to be selected.
> > > 
> > > The HT_XTAL_FSEL (External Crystal Frequency Selection)
> > > field of HWTRAP (the Hardware Trap register) stores a
> > > 2-bit value that represents the state of the ESW_P4_LED_0
> > > and ESW_P4_LED_0 pins (seemingly) sampled just after the
> > > MT7530 has been reset, as:
> > > 
> > > [...]
> > 
> > Here is the summary with links:
> >    - net: dsa: mt7530: disable LEDs before reset
> >      https://git.kernel.org/netdev/net-next/c/2920dd92b980
> > 
> > You are awesome, thank you!
> 
> I am once again calling for this patch to be reverted on the net-next tree
> on the basis of:
> 
> - This patch did not go through a proper reviewing process. There are
>    proposed changes on the code it changes regarding the scope and the
>    method of the patch, and improvements to be made on the patch log.
> 
> - This patch should be backported to stable releases, therefore it
>    shouldn't be on the net-next tree and should be submitted to the net tree
>    instead.

The net-next pull request is out:

https://lore.kernel.org/netdev/20240312042504.1835743-1-kuba@kernel.org/

at this point I believe we can't retract it unless there is a very
serious regression affecting most/all users. This does not look such
case.

I think the better option is follow-up on net with follow-up fixes if
any.

All the relevant patches could be sent to the stable tree later:

https://elixir.bootlin.com/linux/latest/source/Documentation/process/stable-kernel-rules.rst#L47

To try to reduce the possibilities of this kind of situation in the
future, may I kindly ask you to invest some more little time to help
the reviewers and the maintainers? e.g. trimming the replies explicitly
cutting all the unneeded parts in the quoted code/text would make the
whole conversation much easier to follow (at least to me). The netdev
volume is insane, it's very easy to get lost in a given thread and miss
relevant part of it.

Cheers,

Paolo
Arınç ÜNAL March 12, 2024, 11:21 a.m. UTC | #15
On 12.03.2024 13:46, Paolo Abeni wrote:
> On Tue, 2024-03-12 at 11:38 +0300, Arınç ÜNAL wrote:
>> I am once again calling for this patch to be reverted on the net-next tree
>> on the basis of:
>>
>> - This patch did not go through a proper reviewing process. There are
>>     proposed changes on the code it changes regarding the scope and the
>>     method of the patch, and improvements to be made on the patch log.
>>
>> - This patch should be backported to stable releases, therefore it
>>     shouldn't be on the net-next tree and should be submitted to the net tree
>>     instead.
> 
> The net-next pull request is out:
> 
> https://lore.kernel.org/netdev/20240312042504.1835743-1-kuba@kernel.org/
> 
> at this point I believe we can't retract it unless there is a very
> serious regression affecting most/all users. This does not look such
> case.

It seems so. This patch was not tested on standalone MT7530 at submission.
Whilst the patch is useless for standalone MT7530, it doesn't seem to break
anything either, from my simple test on a board with it.

> 
> I think the better option is follow-up on net with follow-up fixes if
> any.
> 
> All the relevant patches could be sent to the stable tree later:
> 
> https://elixir.bootlin.com/linux/latest/source/Documentation/process/stable-kernel-rules.rst#L47
> 
> To try to reduce the possibilities of this kind of situation in the
> future, may I kindly ask you to invest some more little time to help
> the reviewers and the maintainers? e.g. trimming the replies explicitly
> cutting all the unneeded parts in the quoted code/text would make the
> whole conversation much easier to follow (at least to me). The netdev
> volume is insane, it's very easy to get lost in a given thread and miss
> relevant part of it.

I already try to do this. Here's my proposal that would not reduce but
completely avoid this kind of situation in the future, and at the same time
reduce the workload of the netdev maintainers. Do not apply patches without
ACKs. Ask for reviews at least once if the patch had been stale for a
while, and wait a bit for reviews. Only then apply the patch with/without
ACKs.

Arınç
Justin Swartz March 12, 2024, 12:01 p.m. UTC | #16
Arınç

On 2024-03-12 04:41, Arınç ÜNAL wrote:
> On 12.03.2024 03:07, Justin Swartz wrote:
>> I took a similar approach, with calls to dev_info()
>> throughout mt7530_setup() plus mt7530_write(), mt7530_read()
>> and mt7530_rmw() to get an idea of the flow of execution and
>> which registers were being manipulated.
>> 
>> Calls to dev_info() affected timing, so I switched to using
>> trace_printk() and then read the output from the debugfs's
>> tracing/trace file instead of from the console.
>> 
>> I attached a logic analyzer to ESW_P4_LED_0 and ESW_P3_LED_0,
>> and manually triggered sampling as soon as execution of the
>> kernel was reported on UART1.
>> 
>> 
>> -- Test #1 -----------------------------------------------------------
>> 
>> For the sake of maximal reproducability, I removed the delay
>> between the assertion and deassertion of reset and added
>> HWTRAP value tracing:
>> 
>> ---%---
>> --- a/drivers/net/dsa/mt7530.c
>> +++ b/drivers/net/dsa/mt7530.c
>> @@ -2243,7 +2243,6 @@ mt7530_setup(struct dsa_switch *ds)
>>       */
>>      if (priv->mcm) {
>>          reset_control_assert(priv->rstc);
>> -        usleep_range(1000, 1100);
>>          reset_control_deassert(priv->rstc);
>>      } else {
>>          gpiod_set_value_cansleep(priv->reset, 0);
>> @@ -2260,6 +2259,10 @@ mt7530_setup(struct dsa_switch *ds)
>>          return ret;
>>      }
>> 
>> +    static char ht_xtal_fsel[4][6] = { "?????", "20MHz", "40MHz", 
>> "25MHz" };
>> +    trace_printk("HWTRAP = %x, HT_XTAL_FSEL = %s\n",
>> +        val, ht_xtal_fsel[(val & HWTRAP_XTAL_MASK) >> 9]);
>> +
>>          id = mt7530_read(priv, MT7530_CREV);
>>          id >>= CHIP_NAME_SHIFT;
>>          if (id != MT7530_ID) {
>> ---%---
>> 
>> (a) Without a cable connected to Port 3 (lan4) before reset, the
>> correct external crystal frequency is selected:
>> 
>>                [3]      [4]     [6]
>>                :        :       :
>>                _________         
>> ______________________________________
>> ESW_P4_LED_0           |_______|
>>                          _______
>> ESW_P3_LED_0  _________|       |______________________________________
>> 
>>                          :     :
>>                          [5]...:
>> 
>> [3] Port 4 LED Off. Port 3 LED On.
>> [4] Signals inverted. (reset_control_assert; reset_control_deassert)
>> [5] Period of 310 usec.
>> [6] Signals reflect the bootstrapped configuration.
>> 
>> 
>> ~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
>>      2.432646: mt7530_setup: HWTRAP = 7dcf, HT_XTAL_FSEL = 40MHz
>> 
>> 
>> (b) When a cable attached to an active peer is connected to
>> Port 3 (lan4) before reset, the incorrect crystal frequency
>> selection (0b11 = 25MHz) always occurs:
>> 
>>                [7]      [8]     [10]    [12]
>>                :        :       :       :
>>                _________         
>> ______________________________________
>> ESW_P4_LED_0           |_______|
>>                _________         _______
>> ESW_P3_LED_0           |_______|       |______________________________
>> 
>>                          :     : :     :
>>                          [9]...: [11]..:
>> 
>>   [7] Port 4 LED Off. Port 3 LED Off.
>>   [8] Signals inverted. (reset_control_assert; reset_control_deassert)
>>   [9] Period of 320 usec.
>> [10] Signals inverted.
>> [11] Period of 300 usec.
>> [12] Signals reflect the bootstrapped configuration.
>> 
>> ~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
>>      2.432884: mt7530_setup: HWTRAP = 7fcf, HT_XTAL_FSEL = 25MHz
>> 
>> 
>> -- Test #2 -----------------------------------------------------------
>> 
>> To attempt to determine which transitions are associated
>> with asserting and deasserting reset, I performed another
>> test with a delay of what I intended to be a 1 sec period
>> where the MT7530 is held in reset:
>> 
>> ---%---
>> --- a/drivers/net/dsa/mt7530.c
>> +++ b/drivers/net/dsa/mt7530.c
>> @@ -2243,7 +2243,7 @@ mt7530_setup(struct dsa_switch *ds)
>>       */
>>      if (priv->mcm) {
>>          reset_control_assert(priv->rstc);
>> -        usleep_range(1000, 1100);
>> +        usleep_range(1000000, 1000000);
>>          reset_control_deassert(priv->rstc);
>>      } else {
>>          gpiod_set_value_cansleep(priv->reset, 0);
>> @@ -2260,6 +2260,10 @@ mt7530_setup(struct dsa_switch *ds)
>>          return ret;
>>      }
>> 
>> +    static char ht_xtal_fsel[4][6] = { "?????", "20MHz", "40MHz", 
>> "25MHz" };
>> +    trace_printk("HWTRAP = %x, HT_XTAL_FSEL = %s\n",
>> +        val, ht_xtal_fsel[(val & HWTRAP_XTAL_MASK) >> 9]);
>> +
>>      id = mt7530_read(priv, MT7530_CREV);
>>      id >>= CHIP_NAME_SHIFT;
>>      if (id != MT7530_ID) {
>> ---%---
>> 
>> (a) Without a cable connected to Port 3 (lan4) before reset, the
>> correct external crystal frequency is again selected:
>> 
>>                [13]     [14]    [16]
>>                :        :       :
>>                _________         
>> ______________________________________
>> ESW_P4_LED_0           |_______|
>>                          _______
>> ESW_P3_LED_0  _________|       |______________________________________
>> 
>>                          :     :
>>                          [15]..:
>> 
>> [13] Port 4 LED Off. Port 3 LED On.
>> [14] Signals inverted. (reset_control_deassert?)
>> [15] Period of 310 usec.
>> [16] Signals reflect the bootstrapped configuration.
>> 
>> 
>> ~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
>>      3.437461: mt7530_setup: HWTRAP = 7dcf, HT_XTAL_FSEL = 40MHz
>> 
>> 
>> No difference is apparent in the timing diagram, compared
>> to the result from Test #1a, but it is obvious that the code
>> which follows the reset was executed about 1 second later.
>> 
>> 
>> (b) With a cable from an active peer connected to Port 3
>> (lan4) before reset, the correct crystal frequency
>> (0b10 = 40MHz) is selected:
>> 
>>                [17]     [18]                    [20]   [22]
>>                :        :                       :      :
>>                ______________________ \ \ ______        
>> _______________
>> ESW_P4_LED_0                         / /       |______|
>>                _________              \ \        ______
>> ESW_P3_LED_0           |____________ / / ______|      |_______________
>>                                       \ \
>>                          :                     : :    :
>>                          [19]..................: [21].:
>> 
>> [17] Port 4 LED Off. Port 3 LED Off.
>> [18] ESW_P3_LED_0 set low. (reset_control_assert?)
>> [19] Period of 1000.325 msec.
>> [20] Signals inverted. (reset_control_deassert?)
>> [21] Period of 310 usec.
>> [22] Signals reflect the bootstrapped configuration.
>> 
>> 
>> ~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
>>      3.433235: mt7530_setup: HWTRAP = 7dcf, HT_XTAL_FSEL = 40MHz
>> 
>> 
>> So it appears that when reset is deasserted, the ESW_P4_LED_0
>> and ESW_P3_LED_0 levels are inverted for a period of about
>> 300 - 310 usec in Test #1a, #1b, #2a, and #2b.
>> 
>> Co-incidentally, these inverted levels are the active low
>> representation of what ends up in HT_XTAL_FSEL. And in all
>> four examples, have been the inversion of what immediately
>> preceded reset_control_deassert().
>> 
>> 
>> -- Test #3 -----------------------------------------------------------
>> 
>> Now it seems that there is a signature that can be used
>> to identify when reset_control_deassert() is executed,
>> the time of reset_control_assert()'s execution should be
>> between (at most) 1100 and (at least) 1000 usec prior
>> based on the unmodified reset logic.
>> 
>> So this patch only includes the HT_XTAL_FSEL trace.
>> 
>> ---%---
>> --- a/drivers/net/dsa/mt7530.c
>> +++ b/drivers/net/dsa/mt7530.c
>> @@ -2260,6 +2260,10 @@ mt7530_setup(struct dsa_switch *ds)
>>                  return ret;
>>          }
>> 
>> +       static char ht_xtal_fsel[4][6] = { "?????", "20MHz", "40MHz", 
>> "25MHz" };
>> +       trace_printk("HWTRAP = %x, HT_XTAL_FSEL = %s\n",
>> +               val, ht_xtal_fsel[(val & HWTRAP_XTAL_MASK) >> 9]);
>> +
>>          id = mt7530_read(priv, MT7530_CREV);
>>          id >>= CHIP_NAME_SHIFT;
>>          if (id != MT7530_ID) {
>> ---%---
>> 
>> The purpose of the following examples are to show the
>> variance in how long it takes for ESW_P3_LED_0 to go low.
>> 
>> (a) With a cable from an active peer connected to Port 3
>> (lan4) before reset, the correct crystal frequency
>> (0b10 = 40MHz) is selected.
>> 
>>                [23]    [24]       [26]      [28]    [30]
>>                :       :          :         :       :
>>                _____________________________         
>> __________________
>> ESW_P4_LED_0                               |_______|
>>                ___________________           _______
>> ESW_P3_LED_0                     |_________|       |__________________
>> 
>>                         :          :       : :     :
>>                         :          [27]....: [29]..:
>>                         [25]...............:
>> 
>> [23] Port 4 LED Off. Port 3 LED Off.
>> [24] Approximate point of reset_control_assert?
>> [25] Period of approximately 1000 - 1100 usec.
>> [26] ESW_P3_LED_0 finally goes low.
>> [27] Period of 415 usec.
>> [28] Signals inverted. (reset_control_deassert?)
>> [29] Period of 310 usec.
>> [30] Signals reflect the bootstrapped configuration.
>> 
>> 
>> (b) With a cable from an active peer connected to Port 3
>> (lan4) before reset, the correct crystal frequency
>> (0b10 = 40MHz) is selected.
>> 
>>                [31]    [32]            [34] [36]    [38]
>>                :       :                  : :       :
>>                _____________________________         
>> __________________
>> ESW_P4_LED_0                               |_______|
>>                ___________________________   _______
>> ESW_P3_LED_0                             |_|       |__________________
>> 
>>                         :                  : :     :
>>                         :               [35] [37]..:
>>                         :                  :
>>                         [33]...............:
>> 
>> [31] Port 4 LED Off. Port 3 LED Off.
>> [32] Approximate point of reset_control_assert?
>> [33] Period of approximately 1000 - 1100 usec.
>> [34] ESW_P3_LED_0 finally goes low.
>> [35] Period of 50 usec.
>> [36] Signals inverted. (reset_control_deassert?)
>> [37] Period of 310 usec.
>> [38] Signals reflect the bootstrapped configuration.
>> 
>> 
>> (c) With a cable from an active peer connected to Port 3
>> (lan4) before reset, an incorrect crystal frequency
>> (0b11 = 25MHz) is selected.
>> 
>>                [45]    [46]                 [48]    [50]
>>                :       :                    :       :
>>                _____________________________         
>> __________________
>> ESW_P4_LED_0                               |_______|
>>                _____________________________
>> ESW_P3_LED_0                               |__________________________
>> 
>>                         :                  : :     :
>>                         :                  : [49]..:
>>                         :                  :
>>                         [47]...............:
>> 
>> [45] Port 4 LED Off. Port 3 LED Off.
>> [46] Approximate point of reset_control_assert?
>> [47] Period of approximately 1000 - 1100 usec.
>> [48] Signals inverted. (reset_control_deassert?)
>> [49] Period of 315 usec.
>> [50] Signals reflect the bootstrapped configuration.
>> 
>> ~ # sed -n -e '$s/^.*\. //p' /sys/kernel/debug/tracing/trace
>>      2.617819: mt7530_setup: HWTRAP = 7fcf, HT_XTAL_FSEL = 25MHz
>> 
>> ~ # cat /proc/cmdline
>> console=ttyS0,57600 loglevel=7 printk.time=1 root=/dev/ram0 debug 
>> rdinit=/linuxrc
>> 
>> 
>> -- End of Tests ------------------------------------------------------
>> 
>> It seems that the incorrect crystal frequency is selected more
>> often when debugging messages are present (such as printk()
>> based approaches) and especially when loglevel=7 and printk.time=1
>> are specified on the command line.
>> 
>> For the sake of reference: I disabled ethernet support in my build
>> of (mainline) U-boot, and my kernel configuration is a very lean
>> selection of options suited for IP routing and a few peripherals
>> on the I2C and SPI buses.
>> 
>> My userland is confined to an initramfs built around busybox.
>> 
>> I also disable gmac1 because I need a few of the rgmii2 pins for
>> modem control signalling.
>> 
>> Regards
>> Justin
>> 
>> PS: Yes, I only have access to MT7621AT SoCs.
> 
> Thanks for the detailed presentation. I've simplified it to this, from 
> what
> I understood.
> 
> --- Currently 
> -------------------------------------------------------------
> 
> With a cable from an active peer connected to Port 3 (lan4) before
> reset:
> 
> Test 1, the correct crystal frequency (0b10 = 40MHz) is selected.
> 
>                       [1]        [2-1]     [3]     [5]
>                       :          :         :       :
>               _____________________________         __________________
> ESW_P4_LED_0                               |_______|
>               ___________________           _______
> ESW_P3_LED_0                     |_________|       |__________________
> 
>                        :          :       : :     :
>                        :          [2-2]...: [4]...:
>                        [2]................:
> 
> [1] reset_control_assert.
> [2] Period of 1000 - 1100 usec.
> [2-1] ESW_P3_LED_0 goes low.
> [2-2] Period of 415 usec.
> [3] reset_control_deassert.
> [4] Period of 310 usec. HWTRAP register is populated with bootstrapped
>     XTAL frequency.
> [5] Signals reflect the bootstrapped configuration.
> 
> Test 2, an incorrect crystal frequency (0b11 = 25MHz) is selected.
> 
>                       [2]                  [4]     [6]
>                       :                    :       :
>               _____________________________         __________________
> ESW_P4_LED_0                               |_______|
>               _____________________________
> ESW_P3_LED_0                               |__________________________
> 
>                        :                  : :     :
>                        :                  : [5]...:
>                        :                  :
>                        [3]................:
> 
> [1] reset_control_assert.
> [2] Period of 1000 - 1100 usec.
> [3] reset_control_deassert.
> [5] Period of 315 usec. HWTRAP register is populated with incorrect
>     XTAL frequency.
> [6] Signals reflect the bootstrapped configuration.
> 
> --- 1 second delay between assert and deassert 
> ----------------------------
> 
> With a cable from an active peer connected to Port 3 (lan4) before
> reset, the correct crystal frequency (0b10 = 40MHz) is selected:
> 
>                       [1]        [2-1]     [3]     [5]
>                       :          :         :       :
>               _____________________________         __________________
> ESW_P4_LED_0                               |_______|
>               ___________________           _______
> ESW_P3_LED_0                     |_________|       |__________________
> 
>                        :          :       : :     :
>                        :          [2-2]...: [4]...:
>                        [2]................:
> 
> [1] reset_control_assert.
> [3] Period of 1000.325 msec.
> [2-1] ESW_P3_LED_0 goes low.
> [2-2] Remaining period of 1000.325 msec.
> [3] reset_control_deassert.
> [4] Period of 310 usec. HWTRAP register is populated with bootstrapped
>     XTAL frequency.
> [5] Signals reflect the bootstrapped configuration.
> 
> ---
> 
> Wouldn't it be a better idea to increase the delay between
> reset_control_assert() and reset_control_deassert(), and
> gpiod_set_value_cansleep(priv->reset, 0) and
> gpiod_set_value_cansleep(priv->reset, 1) instead of disabling the LED
> controller and delaying before reset?

I've done some additional tests to see what the difference would be
between increasing the reset hold period vs. disabling the LEDs then
sleeping before reset.


I wanted to know:

When an active link is present on Port 3 at boot, what are the
minimum, maximum and average periods between ESW_P3_LED_0
going low and the signal inversion that seems to co-incide with
reset_control_deassert() for each approach?


I created two patches:

WITH INCREASED RESET DELAY

As I submitted a patch that added an intended 1000 - 1100 usec delay
before reset, I thought it would be fair to increase the reset hold
period by the same value.

---%---
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -2243,11 +2243,11 @@ mt7530_setup(struct dsa_switch *ds)
          */
         if (priv->mcm) {
                 reset_control_assert(priv->rstc);
-               usleep_range(1000, 1100);
+               usleep_range(2000, 2200);
                 reset_control_deassert(priv->rstc);
         } else {
                 gpiod_set_value_cansleep(priv->reset, 0);
-               usleep_range(1000, 1100);
+               usleep_range(2000, 2200);
                 gpiod_set_value_cansleep(priv->reset, 1);
         }
---%---


DISABLE LEDS BEFORE RESET

Reset hold period unchanged from the intended 1000 - 1100 usec.

---%---
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -2238,6 +2238,12 @@ mt7530_setup(struct dsa_switch *ds)
                 }
         }

+       /* Disable LEDs before reset to prevent the MT7530 sampling a
+        * potentially incorrect HT_XTAL_FSEL value.
+        */
+       mt7530_write(priv, MT7530_LED_EN, 0);
+       usleep_range(1000, 1100);
+
         /* Reset whole chip through gpio pin or memory-mapped registers 
for
          * different type of hardware
          */
---%---


I ran 20 tests per patch, applied exclusively. 40 tests in total.

      <-- ESW_P3_LED_0 Low Period before Reset Deassertion -->

   TEST    WITH INCREASED RESET DELAY    DISABLE LEDS BEFORE RESET
      #                        (usec)                       (usec)
-------------------------------------------------------------------
      1                           182                         4790
      2                           370                         3470
      3                           240                         4635
      4                          1475                         4850
      5                            70                         4775
      6                          2730                         4575
      7                          3180                         4565
      8                           265                         5650
      9                           270                         4690
     10                          1525                         4450
     11                          3210                         4735
     12                           120                         4690
     13                           185                         4625
     14                           305                         7020
     15                          2975                         4720
     16                           245                         4675
     17                           350                         4740
     18                            80                        17920
     19                           150                        17665
     20                           100                         4620

    Min                            70                         3470
    Max                          3210                        17920

   Mean                           270                         4720
    Avg                       923.421                     6161.579


Every test resulted in a 40MHz HT_XTAL_FSEL, but after seeing 70 usec
and 80 usec periods I wondered how many more tests it may take before
an 25MHz HT_XTAL_FSEL appears.

I was also surprised by the 17920 usec and 17665 usec periods listed
under the DISABLED LEDS BEFORE RESET column. Nothing unusual seemed
to be happening, at least as far as the kernel message output was
concerned.

What do you make of these results?


> One MT7531 pin used for an LED is also used to decide the crystal 
> frequency
> between 40MHz and 25Mhz. We should implement this there as well.
> 
> Arınç
Arınç ÜNAL March 12, 2024, 2:06 p.m. UTC | #17
On 12.03.2024 15:01, Justin Swartz wrote:
> Arınç
> 
> On 2024-03-12 04:41, Arınç ÜNAL wrote:
>> Wouldn't it be a better idea to increase the delay between
>> reset_control_assert() and reset_control_deassert(), and
>> gpiod_set_value_cansleep(priv->reset, 0) and
>> gpiod_set_value_cansleep(priv->reset, 1) instead of disabling the LED
>> controller and delaying before reset?
> 
> I've done some additional tests to see what the difference would be
> between increasing the reset hold period vs. disabling the LEDs then
> sleeping before reset.
> 
> 
> I wanted to know:
> 
> When an active link is present on Port 3 at boot, what are the
> minimum, maximum and average periods between ESW_P3_LED_0
> going low and the signal inversion that seems to co-incide with
> reset_control_deassert() for each approach?
> 
> 
> I created two patches:
> 
> WITH INCREASED RESET DELAY
> 
> As I submitted a patch that added an intended 1000 - 1100 usec delay
> before reset, I thought it would be fair to increase the reset hold
> period by the same value.
> 
> ---%---
> --- a/drivers/net/dsa/mt7530.c
> +++ b/drivers/net/dsa/mt7530.c
> @@ -2243,11 +2243,11 @@ mt7530_setup(struct dsa_switch *ds)
>           */
>          if (priv->mcm) {
>                  reset_control_assert(priv->rstc);
> -               usleep_range(1000, 1100);
> +               usleep_range(2000, 2200);
>                  reset_control_deassert(priv->rstc);
>          } else {
>                  gpiod_set_value_cansleep(priv->reset, 0);
> -               usleep_range(1000, 1100);
> +               usleep_range(2000, 2200);
>                  gpiod_set_value_cansleep(priv->reset, 1);
>          }
> ---%---
> 
> 
> DISABLE LEDS BEFORE RESET
> 
> Reset hold period unchanged from the intended 1000 - 1100 usec.
> 
> ---%---
> --- a/drivers/net/dsa/mt7530.c
> +++ b/drivers/net/dsa/mt7530.c
> @@ -2238,6 +2238,12 @@ mt7530_setup(struct dsa_switch *ds)
>                  }
>          }
> 
> +       /* Disable LEDs before reset to prevent the MT7530 sampling a
> +        * potentially incorrect HT_XTAL_FSEL value.
> +        */
> +       mt7530_write(priv, MT7530_LED_EN, 0);
> +       usleep_range(1000, 1100);
> +
>          /* Reset whole chip through gpio pin or memory-mapped registers for
>           * different type of hardware
>           */
> ---%---
> 
> 
> I ran 20 tests per patch, applied exclusively. 40 tests in total.
> 
>       <-- ESW_P3_LED_0 Low Period before Reset Deassertion -->
> 
>    TEST    WITH INCREASED RESET DELAY    DISABLE LEDS BEFORE RESET
>       #                        (usec)                       (usec)
> -------------------------------------------------------------------
>       1                           182                         4790
>       2                           370                         3470
>       3                           240                         4635
>       4                          1475                         4850
>       5                            70                         4775
>       6                          2730                         4575
>       7                          3180                         4565
>       8                           265                         5650
>       9                           270                         4690
>      10                          1525                         4450
>      11                          3210                         4735
>      12                           120                         4690
>      13                           185                         4625
>      14                           305                         7020
>      15                          2975                         4720
>      16                           245                         4675
>      17                           350                         4740
>      18                            80                        17920
>      19                           150                        17665
>      20                           100                         4620
> 
>     Min                            70                         3470
>     Max                          3210                        17920
> 
>    Mean                           270                         4720
>     Avg                       923.421                     6161.579
> 
> 
> Every test resulted in a 40MHz HT_XTAL_FSEL, but after seeing 70 usec
> and 80 usec periods I wondered how many more tests it may take before
> an 25MHz HT_XTAL_FSEL appears.
> 
> I was also surprised by the 17920 usec and 17665 usec periods listed
> under the DISABLED LEDS BEFORE RESET column. Nothing unusual seemed
> to be happening, at least as far as the kernel message output was
> concerned.
> 
> What do you make of these results?

What I see is setting ESW_P3_LED_0 low via reset assertion is much more
efficient than doing so by setting the LED controller off. So I'd prefer
increasing the delay between assertion and reassertion. For example, the
Realtek DSA subdriver has a much more generous delay. 25ms after reset
assertion [1].

Looking at your results, 2000 - 2200 usec delay still seems too close, so
let's agree on an amount that will ensure that boards in any circumstances
will have these pins back to the bootstrapping configuration before reset
deassertion. What about 5000 - 5100 usec?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/dsa/realtek/rtl83xx.c#n205

Arınç
Justin Swartz March 12, 2024, 3:25 p.m. UTC | #18
On 2024-03-12 16:06, Arınç ÜNAL wrote:
> On 12.03.2024 15:01, Justin Swartz wrote:
>> Arınç
>> 
>> On 2024-03-12 04:41, Arınç ÜNAL wrote:
>>> Wouldn't it be a better idea to increase the delay between
>>> reset_control_assert() and reset_control_deassert(), and
>>> gpiod_set_value_cansleep(priv->reset, 0) and
>>> gpiod_set_value_cansleep(priv->reset, 1) instead of disabling the LED
>>> controller and delaying before reset?
>> 
>> I've done some additional tests to see what the difference would be
>> between increasing the reset hold period vs. disabling the LEDs then
>> sleeping before reset.
>> 
>> 
>> I wanted to know:
>> 
>> When an active link is present on Port 3 at boot, what are the
>> minimum, maximum and average periods between ESW_P3_LED_0
>> going low and the signal inversion that seems to co-incide with
>> reset_control_deassert() for each approach?
>> 
>> 
>> I created two patches:
>> 
>> WITH INCREASED RESET DELAY
>> 
>> As I submitted a patch that added an intended 1000 - 1100 usec delay
>> before reset, I thought it would be fair to increase the reset hold
>> period by the same value.
>> 
>> ---%---
>> --- a/drivers/net/dsa/mt7530.c
>> +++ b/drivers/net/dsa/mt7530.c
>> @@ -2243,11 +2243,11 @@ mt7530_setup(struct dsa_switch *ds)
>>           */
>>          if (priv->mcm) {
>>                  reset_control_assert(priv->rstc);
>> -               usleep_range(1000, 1100);
>> +               usleep_range(2000, 2200);
>>                  reset_control_deassert(priv->rstc);
>>          } else {
>>                  gpiod_set_value_cansleep(priv->reset, 0);
>> -               usleep_range(1000, 1100);
>> +               usleep_range(2000, 2200);
>>                  gpiod_set_value_cansleep(priv->reset, 1);
>>          }
>> ---%---
>> 
>> 
>> DISABLE LEDS BEFORE RESET
>> 
>> Reset hold period unchanged from the intended 1000 - 1100 usec.
>> 
>> ---%---
>> --- a/drivers/net/dsa/mt7530.c
>> +++ b/drivers/net/dsa/mt7530.c
>> @@ -2238,6 +2238,12 @@ mt7530_setup(struct dsa_switch *ds)
>>                  }
>>          }
>> 
>> +       /* Disable LEDs before reset to prevent the MT7530 sampling a
>> +        * potentially incorrect HT_XTAL_FSEL value.
>> +        */
>> +       mt7530_write(priv, MT7530_LED_EN, 0);
>> +       usleep_range(1000, 1100);
>> +
>>          /* Reset whole chip through gpio pin or memory-mapped 
>> registers for
>>           * different type of hardware
>>           */
>> ---%---
>> 
>> 
>> I ran 20 tests per patch, applied exclusively. 40 tests in total.
>> 
>>       <-- ESW_P3_LED_0 Low Period before Reset Deassertion -->
>> 
>>    TEST    WITH INCREASED RESET DELAY    DISABLE LEDS BEFORE RESET
>>       #                        (usec)                       (usec)
>> -------------------------------------------------------------------
>>       1                           182                         4790
>>       2                           370                         3470
>>       3                           240                         4635
>>       4                          1475                         4850
>>       5                            70                         4775
>>       6                          2730                         4575
>>       7                          3180                         4565
>>       8                           265                         5650
>>       9                           270                         4690
>>      10                          1525                         4450
>>      11                          3210                         4735
>>      12                           120                         4690
>>      13                           185                         4625
>>      14                           305                         7020
>>      15                          2975                         4720
>>      16                           245                         4675
>>      17                           350                         4740
>>      18                            80                        17920
>>      19                           150                        17665
>>      20                           100                         4620
>> 
>>     Min                            70                         3470
>>     Max                          3210                        17920
>> 
>>    Mean                           270                         4720
-1s/  Mean/Median/

>>     Avg                       923.421                     6161.579
>> 
>> 
>> Every test resulted in a 40MHz HT_XTAL_FSEL, but after seeing 70 usec
>> and 80 usec periods I wondered how many more tests it may take before
>> an 25MHz HT_XTAL_FSEL appears.
>> 
>> I was also surprised by the 17920 usec and 17665 usec periods listed
>> under the DISABLED LEDS BEFORE RESET column. Nothing unusual seemed
>> to be happening, at least as far as the kernel message output was
>> concerned.
>> 
>> What do you make of these results?
> 
> What I see is setting ESW_P3_LED_0 low via reset assertion is much more
> efficient than doing so by setting the LED controller off. So I'd 
> prefer
> increasing the delay between assertion and reassertion. For example, 
> the
> Realtek DSA subdriver has a much more generous delay. 25ms after reset
> assertion [1].
> 
> Looking at your results, 2000 - 2200 usec delay still seems too close, 
> so
> let's agree on an amount that will ensure that boards in any 
> circumstances
> will have these pins back to the bootstrapping configuration before 
> reset
> deassertion. What about 5000 - 5100 usec?

    TEST    ESW_P3_LED_0 LOW PERIOD
       #                     (usec)
----------------------------------
       1                       5410
       2                       5440
       3                       4375
       4                       5490
       5                       5475
       6                       4335
       7                       4370
       8                       5435
       9                       4205
      10                       4335
      11                       3750
      12                       3170
      13                       4395
      14                       4375
      15                       3515
      16                       4335
      17                       4220
      18                       4175
      19                       4175
      20                       4350

     Min                       3170
     Max                       5490

  Median                   4342.500
     Avg                   4466.500

Looks reasonable to me.


> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/dsa/realtek/rtl83xx.c#n205
> 
> Arınç
Justin Swartz March 12, 2024, 4:35 p.m. UTC | #19
On 2024-03-12 17:25, Justin Swartz wrote:
> On 2024-03-12 16:06, Arınç ÜNAL wrote:
>> On 12.03.2024 15:01, Justin Swartz wrote:
>>> Arınç
>>> 
>>> On 2024-03-12 04:41, Arınç ÜNAL wrote:
>>>> Wouldn't it be a better idea to increase the delay between
>>>> reset_control_assert() and reset_control_deassert(), and
>>>> gpiod_set_value_cansleep(priv->reset, 0) and
>>>> gpiod_set_value_cansleep(priv->reset, 1) instead of disabling the 
>>>> LED
>>>> controller and delaying before reset?
>>> 
>>> I've done some additional tests to see what the difference would be
>>> between increasing the reset hold period vs. disabling the LEDs then
>>> sleeping before reset.
>>> 
>>> 
>>> I wanted to know:
>>> 
>>> When an active link is present on Port 3 at boot, what are the
>>> minimum, maximum and average periods between ESW_P3_LED_0
>>> going low and the signal inversion that seems to co-incide with
>>> reset_control_deassert() for each approach?
>>> 
>>> 
>>> I created two patches:
>>> 
>>> WITH INCREASED RESET DELAY
>>> 
>>> As I submitted a patch that added an intended 1000 - 1100 usec delay
>>> before reset, I thought it would be fair to increase the reset hold
>>> period by the same value.
>>> 
>>> ---%---
>>> --- a/drivers/net/dsa/mt7530.c
>>> +++ b/drivers/net/dsa/mt7530.c
>>> @@ -2243,11 +2243,11 @@ mt7530_setup(struct dsa_switch *ds)
>>>           */
>>>          if (priv->mcm) {
>>>                  reset_control_assert(priv->rstc);
>>> -               usleep_range(1000, 1100);
>>> +               usleep_range(2000, 2200);
>>>                  reset_control_deassert(priv->rstc);
>>>          } else {
>>>                  gpiod_set_value_cansleep(priv->reset, 0);
>>> -               usleep_range(1000, 1100);
>>> +               usleep_range(2000, 2200);
>>>                  gpiod_set_value_cansleep(priv->reset, 1);
>>>          }
>>> ---%---
>>> 
>>> 
>>> DISABLE LEDS BEFORE RESET
>>> 
>>> Reset hold period unchanged from the intended 1000 - 1100 usec.
>>> 
>>> ---%---
>>> --- a/drivers/net/dsa/mt7530.c
>>> +++ b/drivers/net/dsa/mt7530.c
>>> @@ -2238,6 +2238,12 @@ mt7530_setup(struct dsa_switch *ds)
>>>                  }
>>>          }
>>> 
>>> +       /* Disable LEDs before reset to prevent the MT7530 sampling a
>>> +        * potentially incorrect HT_XTAL_FSEL value.
>>> +        */
>>> +       mt7530_write(priv, MT7530_LED_EN, 0);
>>> +       usleep_range(1000, 1100);
>>> +
>>>          /* Reset whole chip through gpio pin or memory-mapped 
>>> registers for
>>>           * different type of hardware
>>>           */
>>> ---%---
>>> 
>>> 
>>> I ran 20 tests per patch, applied exclusively. 40 tests in total.
>>> 
>>>       <-- ESW_P3_LED_0 Low Period before Reset Deassertion -->
>>> 
>>>    TEST    WITH INCREASED RESET DELAY    DISABLE LEDS BEFORE RESET
>>>       #                        (usec)                       (usec)
>>> -------------------------------------------------------------------
>>>       1                           182                         4790
>>>       2                           370                         3470
>>>       3                           240                         4635
>>>       4                          1475                         4850
>>>       5                            70                         4775
>>>       6                          2730                         4575
>>>       7                          3180                         4565
>>>       8                           265                         5650
>>>       9                           270                         4690
>>>      10                          1525                         4450
>>>      11                          3210                         4735
>>>      12                           120                         4690
>>>      13                           185                         4625
>>>      14                           305                         7020
>>>      15                          2975                         4720
>>>      16                           245                         4675
>>>      17                           350                         4740
>>>      18                            80                        17920
>>>      19                           150                        17665
>>>      20                           100                         4620
>>> 
>>>     Min                            70                         3470
>>>     Max                          3210                        17920
>>> 
>>>    Mean                           270                         4720
> -1s/  Mean/Median/
> 
>>>     Avg                       923.421                     6161.579
>>> 
>>> 
>>> Every test resulted in a 40MHz HT_XTAL_FSEL, but after seeing 70 usec
>>> and 80 usec periods I wondered how many more tests it may take before
>>> an 25MHz HT_XTAL_FSEL appears.
>>> 
>>> I was also surprised by the 17920 usec and 17665 usec periods listed
>>> under the DISABLED LEDS BEFORE RESET column. Nothing unusual seemed
>>> to be happening, at least as far as the kernel message output was
>>> concerned.
>>> 
>>> What do you make of these results?
>> 
>> What I see is setting ESW_P3_LED_0 low via reset assertion is much 
>> more
>> efficient than doing so by setting the LED controller off. So I'd 
>> prefer
>> increasing the delay between assertion and reassertion. For example, 
>> the
>> Realtek DSA subdriver has a much more generous delay. 25ms after reset
>> assertion [1].
>> 
>> Looking at your results, 2000 - 2200 usec delay still seems too close, 
>> so
>> let's agree on an amount that will ensure that boards in any 
>> circumstances
>> will have these pins back to the bootstrapping configuration before 
>> reset
>> deassertion. What about 5000 - 5100 usec?
> 
>    TEST    ESW_P3_LED_0 LOW PERIOD
>       #                     (usec)
> ----------------------------------
>       1                       5410
>       2                       5440
>       3                       4375
>       4                       5490
>       5                       5475
>       6                       4335
>       7                       4370
>       8                       5435
>       9                       4205
>      10                       4335
>      11                       3750
>      12                       3170
>      13                       4395
>      14                       4375
>      15                       3515
>      16                       4335
>      17                       4220
>      18                       4175
>      19                       4175
>      20                       4350
> 
>     Min                       3170
>     Max                       5490
> 
>  Median                   4342.500
>     Avg                   4466.500
> 
> Looks reasonable to me.

I had messed up the Median/Average calculation with the data
I had pasted earlier after copying and pasting formulas without
double checking how they had been affected.

So here's the table again, now with the 5000 - 5100 usec test
run tacked on for easier comparison:

              2000 usec                      5000 usec
            - 2200 usec    DISABLE LEDS    - 5100 usec
    TEST     RESET HOLD    BEFORE RESET     RESET HOLD
       #         (usec)          (usec)         (usec)
--------------------------------------------------------
       1            182            4790           5410
       2            370            3470           5440
       3            240            4635           4375
       4           1475            4850           5490
       5             70            4775           5475
       6           2730            4575           4335
       7           3180            4565           4370
       8            265            5650           5435
       9            270            4690           4205
      10           1525            4450           4335
      11           3210            4735           3750
      12            120            4690           3170
      13            185            4625           4395
      14            305            7020           4375
      15           2975            4720           3515
      16            245            4675           4335
      17            350            4740           4220
      18             80           17920           4175
      19            150           17665           4175
      20            100            4620           4350

     Min             70            3470           3170
     Max           3210           17920           5490

  Median        267.500            4705       4342.500
     Avg        901.350            6093       4466.500


>> [1]
>> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/dsa/realtek/rtl83xx.c#n205
>> 
>> Arınç
diff mbox series

Patch

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index 3c1f65759..8fa113126 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -2238,6 +2238,12 @@  mt7530_setup(struct dsa_switch *ds)
 		}
 	}
 
+	/* Disable LEDs before reset to prevent the MT7530 sampling a
+	 * potentially incorrect HT_XTAL_FSEL value.
+	 */
+	mt7530_write(priv, MT7530_LED_EN, 0);
+	usleep_range(1000, 1100);
+
 	/* Reset whole chip through gpio pin or memory-mapped registers for
 	 * different type of hardware
 	 */