diff mbox series

[1/3] phy: rockchip: emmc: Enable pulldown for strobe line

Message ID 20240326-rk-default-enable-strobe-pulldown-v1-1-f410c71605c0@folker-schwesinger.de (mailing list archive)
State New, archived
Headers show
Series phy: rockchip: emmc: Enable internal strobe pull-down by default | expand

Commit Message

Folker Schwesinger via B4 Relay March 26, 2024, 6:54 p.m. UTC
From: Folker Schwesinger <dev@folker-schwesinger.de>

Restore the behavior of the Rockchip kernel that undconditionally
enables the internal strobe pulldown.
As the DT property rockchip,enable-strobe-pulldown is obsolete now,
replace it with a property to disable the internal pulldown.

This fixes I/O errors observed on various Rock Pi 4 and NanoPi4 series
boards with some eMMC modules. Other boards may also be affected.

An example of these errors is as follows:

[  290.060817] mmc1: running CQE recovery
[  290.061337] blk_update_request: I/O error, dev mmcblk1, sector 1411072 op 0x1:(WRITE) flags 0x800 phys_seg 36 prio class 0
[  290.061370] EXT4-fs warning (device mmcblk1p1): ext4_end_bio:348: I/O error 10 writing to inode 29547 starting block 176466)
[  290.061484] Buffer I/O error on device mmcblk1p1, logical block 172288

Fixes: 8b5c2b45b8f0 ("phy: rockchip: set pulldown for strobe line in dts")
Signed-off-by: Folker Schwesinger <dev@folker-schwesinger.de>
---
 drivers/phy/rockchip/phy-rockchip-emmc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Conor Dooley March 26, 2024, 7:46 p.m. UTC | #1
On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker Schwesinger via B4 Relay wrote:
> From: Folker Schwesinger <dev@folker-schwesinger.de>
> 
> Restore the behavior of the Rockchip kernel that undconditionally
> enables the internal strobe pulldown.

What do you mean "restore the behaviour of the rockchip kernel"? Did
mainline behave the same as the rockchip kernel previously? If not,
using "restore" here is misleading. "Unconditionally" is also incorrect,
because you have a property that disables it.

> As the DT property rockchip,enable-strobe-pulldown is obsolete now,
> replace it with a property to disable the internal pulldown.
> 
> This fixes I/O errors observed on various Rock Pi 4 and NanoPi4 series
> boards with some eMMC modules. Other boards may also be affected.
> 
> An example of these errors is as follows:
> 
> [  290.060817] mmc1: running CQE recovery
> [  290.061337] blk_update_request: I/O error, dev mmcblk1, sector 1411072 op 0x1:(WRITE) flags 0x800 phys_seg 36 prio class 0
> [  290.061370] EXT4-fs warning (device mmcblk1p1): ext4_end_bio:348: I/O error 10 writing to inode 29547 starting block 176466)
> [  290.061484] Buffer I/O error on device mmcblk1p1, logical block 172288
> 
> Fixes: 8b5c2b45b8f0 ("phy: rockchip: set pulldown for strobe line in dts")
> Signed-off-by: Folker Schwesinger <dev@folker-schwesinger.de>
> ---
>  drivers/phy/rockchip/phy-rockchip-emmc.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/phy/rockchip/phy-rockchip-emmc.c b/drivers/phy/rockchip/phy-rockchip-emmc.c
> index 20023f6eb994..6e637f3e1b19 100644
> --- a/drivers/phy/rockchip/phy-rockchip-emmc.c
> +++ b/drivers/phy/rockchip/phy-rockchip-emmc.c
> @@ -376,14 +376,14 @@ static int rockchip_emmc_phy_probe(struct platform_device *pdev)
>  	rk_phy->reg_offset = reg_offset;
>  	rk_phy->reg_base = grf;
>  	rk_phy->drive_impedance = PHYCTRL_DR_50OHM;
> -	rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_DISABLE;
> +	rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_ENABLE;
>  	rk_phy->output_tapdelay_select = PHYCTRL_OTAPDLYSEL_DEFAULT;
>  
>  	if (!of_property_read_u32(dev->of_node, "drive-impedance-ohm", &val))
>  		rk_phy->drive_impedance = convert_drive_impedance_ohm(pdev, val);
>  
> -	if (of_property_read_bool(dev->of_node, "rockchip,enable-strobe-pulldown"))
> -		rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_ENABLE;
> +	if (of_property_read_bool(dev->of_node, "rockchip,disable-strobe-pulldown"))
> +		rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_DISABLE;

Unfortunately you cannot do this.
Previously no property at all meant disabled and a property was required
to enable it. With this change the absence of a property means that it
will be enabled.
An old devicetree is that wanted this to be disabled would have no
property and will now end up with it enabled. This is an ABI break and is
clearly not backwards compatible, that's a NAK unless it is demonstrable
that noone actually wants to disable it at all.

If this patch fixes a problem on a board that you have, I would suggest
that you add the property to enable it, as the binding tells you to.

Thanks,
Conor.

>  	if (!of_property_read_u32(dev->of_node, "rockchip,output-tapdelay-select", &val)) {
>  		if (val <= PHYCTRL_OTAPDLYSEL_MAXVALUE)
> 
> -- 
> 2.44.0
> 
>
Dragan Simic March 26, 2024, 7:55 p.m. UTC | #2
Hello Conor and Folker,

On 2024-03-26 20:46, Conor Dooley wrote:
> On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker Schwesinger via B4 
> Relay wrote:
>> From: Folker Schwesinger <dev@folker-schwesinger.de>
>> 
>> Restore the behavior of the Rockchip kernel that undconditionally
>> enables the internal strobe pulldown.
> 
> What do you mean "restore the behaviour of the rockchip kernel"? Did
> mainline behave the same as the rockchip kernel previously? If not,
> using "restore" here is misleading. "Unconditionally" is also 
> incorrect,
> because you have a property that disables it.
> 
>> As the DT property rockchip,enable-strobe-pulldown is obsolete now,
>> replace it with a property to disable the internal pulldown.
>> 
>> This fixes I/O errors observed on various Rock Pi 4 and NanoPi4 series
>> boards with some eMMC modules. Other boards may also be affected.
>> 
>> An example of these errors is as follows:
>> 
>> [  290.060817] mmc1: running CQE recovery
>> [  290.061337] blk_update_request: I/O error, dev mmcblk1, sector 
>> 1411072 op 0x1:(WRITE) flags 0x800 phys_seg 36 prio class 0
>> [  290.061370] EXT4-fs warning (device mmcblk1p1): ext4_end_bio:348: 
>> I/O error 10 writing to inode 29547 starting block 176466)
>> [  290.061484] Buffer I/O error on device mmcblk1p1, logical block 
>> 172288
>> 
>> Fixes: 8b5c2b45b8f0 ("phy: rockchip: set pulldown for strobe line in 
>> dts")
>> Signed-off-by: Folker Schwesinger <dev@folker-schwesinger.de>
>> ---
>>  drivers/phy/rockchip/phy-rockchip-emmc.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/phy/rockchip/phy-rockchip-emmc.c 
>> b/drivers/phy/rockchip/phy-rockchip-emmc.c
>> index 20023f6eb994..6e637f3e1b19 100644
>> --- a/drivers/phy/rockchip/phy-rockchip-emmc.c
>> +++ b/drivers/phy/rockchip/phy-rockchip-emmc.c
>> @@ -376,14 +376,14 @@ static int rockchip_emmc_phy_probe(struct 
>> platform_device *pdev)
>>  	rk_phy->reg_offset = reg_offset;
>>  	rk_phy->reg_base = grf;
>>  	rk_phy->drive_impedance = PHYCTRL_DR_50OHM;
>> -	rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_DISABLE;
>> +	rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_ENABLE;
>>  	rk_phy->output_tapdelay_select = PHYCTRL_OTAPDLYSEL_DEFAULT;
>> 
>>  	if (!of_property_read_u32(dev->of_node, "drive-impedance-ohm", 
>> &val))
>>  		rk_phy->drive_impedance = convert_drive_impedance_ohm(pdev, val);
>> 
>> -	if (of_property_read_bool(dev->of_node, 
>> "rockchip,enable-strobe-pulldown"))
>> -		rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_ENABLE;
>> +	if (of_property_read_bool(dev->of_node, 
>> "rockchip,disable-strobe-pulldown"))
>> +		rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_DISABLE;
> 
> Unfortunately you cannot do this.
> Previously no property at all meant disabled and a property was 
> required
> to enable it. With this change the absence of a property means that it
> will be enabled.
> An old devicetree is that wanted this to be disabled would have no
> property and will now end up with it enabled. This is an ABI break and 
> is
> clearly not backwards compatible, that's a NAK unless it is 
> demonstrable
> that noone actually wants to disable it at all.

Moreover, as I already explained some time ago, [1] some boards and
devices are unfortunately miswired, and we don't want to enable the
DATA STROBE pull-down on such boards.

[1] 
https://lore.kernel.org/linux-rockchip/ca5b7cad01f645c7c559ab26a8db8085@manjaro.org/#t

> If this patch fixes a problem on a board that you have, I would suggest
> that you add the property to enable it, as the binding tells you to.
> 
> Thanks,
> Conor.
> 
>>  	if (!of_property_read_u32(dev->of_node, 
>> "rockchip,output-tapdelay-select", &val)) {
>>  		if (val <= PHYCTRL_OTAPDLYSEL_MAXVALUE)
>> 
>> --
>> 2.44.0
>> 
>> 
> 
> _______________________________________________
> Linux-rockchip mailing list
> Linux-rockchip@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-rockchip
Folker Schwesinger March 27, 2024, 4:21 p.m. UTC | #3
Hi Conor and Dragan,

thanks for your feedback!

On Tue Mar 26, 2024 at 8:55 PM CET, Dragan Simic wrote:
> On 2024-03-26 20:46, Conor Dooley wrote:
> > On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker Schwesinger via B4
> > Relay wrote:
> >> From: Folker Schwesinger <dev@folker-schwesinger.de>
> >>
> >> Restore the behavior of the Rockchip kernel that undconditionally
> >> enables the internal strobe pulldown.
> >
> > What do you mean "restore the behaviour of the rockchip kernel"? Did
> > mainline behave the same as the rockchip kernel previously? If not,
> > using "restore" here is misleading. "Unconditionally" is also
> > incorrect,
> > because you have a property that disables it.

Apologizes for the misleading commit message. Prior to 5.11 the Linux
kernel did not touch the pull-down registers. However, it seems the
register's (factory?) default was set to enable the pull-down. As it
was mentioned elsewhere that was the configuration recommended by
Rockchip. The 4.4 vendor (Rockchip) kernel reflects that by enabling the
pull-down in its kernel.
Of course, this has nothing to do with the Linux kernel, so "restore"
was a bad choice here.

I previously had split the driver patch into two separate patches, one
for changing the default (unconditionally at that point), the other for
adding the disable property. As both changes were minimal I decided to
squash the commits. I updated the cover letter, but forgot to update the
commit message. Sorry.

> >> Fixes: 8b5c2b45b8f0 ("phy: rockchip: set pulldown for strobe line in
> >> dts")
> >> Signed-off-by: Folker Schwesinger <dev@folker-schwesinger.de>
> >> ---
> >>  drivers/phy/rockchip/phy-rockchip-emmc.c | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/phy/rockchip/phy-rockchip-emmc.c
> >> b/drivers/phy/rockchip/phy-rockchip-emmc.c
> >> index 20023f6eb994..6e637f3e1b19 100644
> >> --- a/drivers/phy/rockchip/phy-rockchip-emmc.c
> >> +++ b/drivers/phy/rockchip/phy-rockchip-emmc.c
> >> @@ -376,14 +376,14 @@ static int rockchip_emmc_phy_probe(struct
> >> platform_device *pdev)
> >>  	rk_phy->reg_offset = reg_offset;
> >>  	rk_phy->reg_base = grf;
> >>  	rk_phy->drive_impedance = PHYCTRL_DR_50OHM;
> >> -	rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_DISABLE;
> >> +	rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_ENABLE;
> >>  	rk_phy->output_tapdelay_select = PHYCTRL_OTAPDLYSEL_DEFAULT;
> >>
> >>  	if (!of_property_read_u32(dev->of_node, "drive-impedance-ohm",
> >> &val))
> >>  		rk_phy->drive_impedance = convert_drive_impedance_ohm(pdev, val);
> >>
> >> -	if (of_property_read_bool(dev->of_node,
> >> "rockchip,enable-strobe-pulldown"))
> >> -		rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_ENABLE;
> >> +	if (of_property_read_bool(dev->of_node,
> >> "rockchip,disable-strobe-pulldown"))
> >> +		rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_DISABLE;
> >
> > Unfortunately you cannot do this.
> > Previously no property at all meant disabled and a property was
> > required
> > to enable it. With this change the absence of a property means that it
> > will be enabled.
> > An old devicetree is that wanted this to be disabled would have no
> > property and will now end up with it enabled. This is an ABI break and
> > is
> > clearly not backwards compatible, that's a NAK unless it is
> > demonstrable
> > that noone actually wants to disable it at all.
> >
>
> Moreover, as I already explained some time ago, [1] some boards and
> devices are unfortunately miswired, and we don't want to enable the
> DATA STROBE pull-down on such boards.
>
> [1]
> https://lore.kernel.org/linux-rockchip/ca5b7cad01f645c7c559ab26a8db8085@manjaro.org/#t
>
> > If this patch fixes a problem on a board that you have, I would suggest
> > that you add the property to enable it, as the binding tells you to.

I agree, I'll post the patches later.

Best regards

Folker
Conor Dooley March 27, 2024, 5:21 p.m. UTC | #4
On Wed, Mar 27, 2024 at 04:21:45PM +0000, Folker Schwesinger wrote:
> Hi Conor and Dragan,
> 
> thanks for your feedback!
> 
> On Tue Mar 26, 2024 at 8:55 PM CET, Dragan Simic wrote:
> > On 2024-03-26 20:46, Conor Dooley wrote:
> > > On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker Schwesinger via B4
> > > Relay wrote:
> > >> From: Folker Schwesinger <dev@folker-schwesinger.de>
> > >>
> > >> Restore the behavior of the Rockchip kernel that undconditionally
> > >> enables the internal strobe pulldown.
> > >
> > > What do you mean "restore the behaviour of the rockchip kernel"? Did
> > > mainline behave the same as the rockchip kernel previously? If not,
> > > using "restore" here is misleading. "Unconditionally" is also
> > > incorrect,
> > > because you have a property that disables it.
> 
> Apologizes for the misleading commit message. Prior to 5.11 the Linux
> kernel did not touch the pull-down registers. However, it seems the
> register's (factory?) default was set to enable the pull-down. As it
> was mentioned elsewhere that was the configuration recommended by
> Rockchip. The 4.4 vendor (Rockchip) kernel reflects that by enabling the
> pull-down in its kernel.

Yeah, seems like a bit of a sticky situation. Probably the wrong
polarity was chosen when the property was implemented and the property
should have been the one you wanted to switch to given the default
before it existed was the factory defaults.

> Of course, this has nothing to do with the Linux kernel, so "restore"
> was a bad choice here.
>
> I previously had split the driver patch into two separate patches, one
> for changing the default (unconditionally at that point), the other for
> adding the disable property. As both changes were minimal I decided to
> squash the commits. I updated the cover letter, but forgot to update the
> commit message. Sorry.

No worries. Squashing them was probably the right thing to do anyway.
Alban Browaeys March 28, 2024, 5 p.m. UTC | #5
Le mardi 26 mars 2024 à 19:46 +0000, Conor Dooley a écrit :
> On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker Schwesinger via B4
> Relay wrote:
> > From: Folker Schwesinger <dev@folker-schwesinger.de>
> > -	if (of_property_read_bool(dev->of_node, "rockchip,enable-
> > strobe-pulldown"))
> > -		rk_phy->enable_strobe_pulldown =
> > PHYCTRL_REN_STRB_ENABLE;
> > +	if (of_property_read_bool(dev->of_node, "rockchip,disable-
> > strobe-pulldown"))
> > +		rk_phy->enable_strobe_pulldown =
> > PHYCTRL_REN_STRB_DISABLE;
> 
> Unfortunately you cannot do this.
> Previously no property at all meant disabled and a property was
> required
> to enable it. With this change the absence of a property means that
> it
> will be enabled.
> An old devicetree is that wanted this to be disabled would have no
> property and will now end up with it enabled. This is an ABI break
> and is
> clearly not backwards compatible, that's a NAK unless it is
> demonstrable
> that noone actually wants to disable it at all.


But the patch that introduced the new default to disable the pulldown
explicitely introduced a regression for at least 4 boards.
It took time to sort out that the default to disable pulldown was the
culprit but still.
Will we carry this new behavor that breaks the default design for
rk3399 because since the regression was introduced new board definition
might have expceted this new behavior.

Could the best option be to revert to énot set a default enable/disable
pulldown" (as before the commit that introduced the regression) and
allow one to force the pulldown via the enable/disable pulldown
property?
I mean the commit that introduced a default value for the pulldown did
not seem to be about fixing anything. But it broke a lot. ANd it was
really really hard to find the description of this commit to understand
that one had to enable pulldown to restore hs400.

In more than 3 years, only one board maintainer noticed that this
property was required to get back HS400  and thanks to a user telling
me that this board was working I found from this board that this
property was "missing" from most board definitions (while it was not
required before).


I am all for not breaking ABI. But what about not reverting a patch
that already broke ABI because this patch introduced a new ABI that we
don't want to break?
I mean shouldn't a new commit with a new ABI that regressed the kernel
be reverted?

Mind fixing the initial regression 8b5c2b45b8f0 "phy: rockchip: set
pulldown for strobe line in dts" does not necessarily mean changing the
default to the opposite value but could also be reverting to not
setting a default.
Though I don't know if there are pros to setting a default.


> If this patch fixes a problem on a board that you have, I would
> suggest
> that you add the property to enable it, as the binding tells you to.
> 
> Thanks,
> Conor.


Regards,
Alban
Conor Dooley March 28, 2024, 6:01 p.m. UTC | #6
On Thu, Mar 28, 2024 at 06:00:03PM +0100, Alban Browaeys wrote:
> Le mardi 26 mars 2024 à 19:46 +0000, Conor Dooley a écrit :
> > On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker Schwesinger via B4
> > Relay wrote:
> > > From: Folker Schwesinger <dev@folker-schwesinger.de>
> > > -	if (of_property_read_bool(dev->of_node, "rockchip,enable-
> > > strobe-pulldown"))
> > > -		rk_phy->enable_strobe_pulldown =
> > > PHYCTRL_REN_STRB_ENABLE;
> > > +	if (of_property_read_bool(dev->of_node, "rockchip,disable-
> > > strobe-pulldown"))
> > > +		rk_phy->enable_strobe_pulldown =
> > > PHYCTRL_REN_STRB_DISABLE;
> > 
> > Unfortunately you cannot do this.
> > Previously no property at all meant disabled and a property was
> > required
> > to enable it. With this change the absence of a property means that
> > it
> > will be enabled.
> > An old devicetree is that wanted this to be disabled would have no
> > property and will now end up with it enabled. This is an ABI break
> > and is
> > clearly not backwards compatible, that's a NAK unless it is
> > demonstrable
> > that noone actually wants to disable it at all.
> 
> 
> But the patch that introduced the new default to disable the pulldown
> explicitely introduced a regression for at least 4 boards.
> It took time to sort out that the default to disable pulldown was the
> culprit but still.
> Will we carry this new behavor that breaks the default design for
> rk3399 because since the regression was introduced new board definition
> might have expceted this new behavior.
> 
> Could the best option be to revert to énot set a default enable/disable
> pulldown" (as before the commit that introduced the regression) and
> allow one to force the pulldown via the enable/disable pulldown
> property?
> I mean the commit that introduced a default value for the pulldown did
> not seem to be about fixing anything. But it broke a lot. ANd it was
> really really hard to find the description of this commit to understand
> that one had to enable pulldown to restore hs400.
> 
> In more than 3 years, only one board maintainer noticed that this
> property was required to get back HS400  and thanks to a user telling
> me that this board was working I found from this board that this
> property was "missing" from most board definitions (while it was not
> required before).
> 
> 
> I am all for not breaking ABI. But what about not reverting a patch
> that already broke ABI because this patch introduced a new ABI that we
> don't want to break?
> I mean shouldn't a new commit with a new ABI that regressed the kernel
> be reverted?

I think I said it after OP replied to me yesterday, but this is a pretty
shitty situation in that the original default picked for the property
was actually incorrect. Given it's been like this for four years before
anyone noticed, and others probably depend on the current behaviour,
that's hard to justify.

> Mind fixing the initial regression 8b5c2b45b8f0 "phy: rockchip: set
> pulldown for strobe line in dts" does not necessarily mean changing the
> default to the opposite value but could also be reverting to not
> setting a default.

That's also problematic, as the only way to do this is make setting
one of the enabled or disabled properties required, which is also an ABI
break, since you'd then be rejecting probe if one is not present.

> Though I don't know if there are pros to setting a default.

What you probably have to weigh up is the cons of each side. If what you
lose is HS400 mode with what's in the kernel right now but switching to
what's been proposed would entirely break some boards, I know which
I think the lesser of two evils is.

It's probably up to the platform maintainer to weigh in at this point.

Hope that helps?
Conor.
Dragan Simic March 31, 2024, 7:26 p.m. UTC | #7
Hello Alban,

On 2024-03-28 18:00, Alban Browaeys wrote:
> Le mardi 26 mars 2024 à 19:46 +0000, Conor Dooley a écrit :
>> On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker Schwesinger via B4
>> Relay wrote:
>> > From: Folker Schwesinger <dev@folker-schwesinger.de>
>> > -	if (of_property_read_bool(dev->of_node, "rockchip,enable-
>> > strobe-pulldown"))
>> > -		rk_phy->enable_strobe_pulldown =
>> > PHYCTRL_REN_STRB_ENABLE;
>> > +	if (of_property_read_bool(dev->of_node, "rockchip,disable-
>> > strobe-pulldown"))
>> > +		rk_phy->enable_strobe_pulldown =
>> > PHYCTRL_REN_STRB_DISABLE;
>> 
>> Unfortunately you cannot do this.
>> Previously no property at all meant disabled and a property was
>> required
>> to enable it. With this change the absence of a property means that
>> it
>> will be enabled.
>> An old devicetree is that wanted this to be disabled would have no
>> property and will now end up with it enabled. This is an ABI break
>> and is
>> clearly not backwards compatible, that's a NAK unless it is
>> demonstrable
>> that noone actually wants to disable it at all.
> 
> But the patch that introduced the new default to disable the pulldown
> explicitely introduced a regression for at least 4 boards.
> It took time to sort out that the default to disable pulldown was the
> culprit but still.
> Will we carry this new behavor that breaks the default design for
> rk3399 because since the regression was introduced new board definition
> might have expceted this new behavior.
> 
> Could the best option be to revert to énot set a default enable/disable
> pulldown" (as before the commit that introduced the regression) and
> allow one to force the pulldown via the enable/disable pulldown
> property?
> I mean the commit that introduced a default value for the pulldown did
> not seem to be about fixing anything. But it broke a lot. ANd it was
> really really hard to find the description of this commit to understand
> that one had to enable pulldown to restore hs400.

Quite frankly, I think it's better to leave the default as-is, and
to fix the dts files for the boards that have been (or will be) tested
to work as expected and reliably in the HS400 mode.  Perhaps this is
also a good opportunity to revisit the reliability of the HS400 mode
on various boards.

In other words, it could be that some boards now rely on the pull-down
being disabled by default, so enabling it by default might actually
break such boards.  I know, the troublesome commit that disabled the
pull-down caused breakage, but fixing that might actually cause more
breakage at this point.

> In more than 3 years, only one board maintainer noticed that this
> property was required to get back HS400  and thanks to a user telling
> me that this board was working I found from this board that this
> property was "missing" from most board definitions (while it was not
> required before).

A couple of years ago I've also spent some time debugging HS400 not
working on a Rock 4, but ended up with limiting the speed to HS200 as
a workaround, so I agree about the whole thing being a mess.

> I am all for not breaking ABI. But what about not reverting a patch
> that already broke ABI because this patch introduced a new ABI that we
> don't want to break?
> I mean shouldn't a new commit with a new ABI that regressed the kernel
> be reverted?
> 
> Mind fixing the initial regression 8b5c2b45b8f0 "phy: rockchip: set
> pulldown for strobe line in dts" does not necessarily mean changing the
> default to the opposite value but could also be reverting to not
> setting a default.
> Though I don't know if there are pros to setting a default.
> 
> 
>> If this patch fixes a problem on a board that you have, I would
>> suggest
>> that you add the property to enable it, as the binding tells you to.
>> 
>> Thanks,
>> Conor.
> 
> 
> Regards,
> Alban
> 
> _______________________________________________
> Linux-rockchip mailing list
> Linux-rockchip@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-rockchip
Alban Browaeys April 10, 2024, 6:28 p.m. UTC | #8
Le jeudi 28 mars 2024 à 18:01 +0000, Conor Dooley a écrit :
> On Thu, Mar 28, 2024 at 06:00:03PM +0100, Alban Browaeys wrote:
> > Le mardi 26 mars 2024 à 19:46 +0000, Conor Dooley a écrit :
> > > On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker Schwesinger via
> > > B4
> > > Relay wrote:
> > > > From: Folker Schwesinger <dev@folker-schwesinger.de>
> > > > -	if (of_property_read_bool(dev->of_node,
> > > > "rockchip,enable-
> > > > strobe-pulldown"))
> > > > -		rk_phy->enable_strobe_pulldown =
> > > > PHYCTRL_REN_STRB_ENABLE;
> > > > +	if (of_property_read_bool(dev->of_node,
> > > > "rockchip,disable-
> > > > strobe-pulldown"))
> > > > +		rk_phy->enable_strobe_pulldown =
> > > > PHYCTRL_REN_STRB_DISABLE;
> > > 
> > > Unfortunately you cannot do this.
> > > Previously no property at all meant disabled and a property was
> > > required
> > > to enable it. With this change the absence of a property means
> > > that
> > > it
> > > will be enabled.
> > > An old devicetree is that wanted this to be disabled would have
> > > no
> > > property and will now end up with it enabled. This is an ABI
> > > break
> > > and is
> > > clearly not backwards compatible, that's a NAK unless it is
> > > demonstrable
> > > that noone actually wants to disable it at all.
> > 
> > 
> > But the patch that introduced the new default to disable the
> > pulldown
> > explicitely introduced a regression for at least 4 boards.
> > It took time to sort out that the default to disable pulldown was
> > the
> > culprit but still.
> > Will we carry this new behavor that breaks the default design for
> > rk3399 because since the regression was introduced new board
> > definition
> > might have expceted this new behavior.
> > 
> > Could the best option be to revert to énot set a default
> > enable/disable
> > pulldown" (as before the commit that introduced the regression) and
> > allow one to force the pulldown via the enable/disable pulldown
> > property?
> > I mean the commit that introduced a default value for the pulldown
> > did
> > not seem to be about fixing anything. But it broke a lot. ANd it
> > was
> > really really hard to find the description of this commit to
> > understand
> > that one had to enable pulldown to restore hs400.
> > 
> > In more than 3 years, only one board maintainer noticed that this
> > property was required to get back HS400  and thanks to a user
> > telling
> > me that this board was working I found from this board that this
> > property was "missing" from most board definitions (while it was
> > not
> > required before).
> > 
> > 
> > I am all for not breaking ABI. But what about not reverting a patch
> > that already broke ABI because this patch introduced a new ABI that
> > we
> > don't want to break?
> > I mean shouldn't a new commit with a new ABI that regressed the
> > kernel
> > be reverted?
> 
> I think I said it after OP replied to me yesterday, but this is a
> pretty
> shitty situation in that the original default picked for the property
> was actually incorrect. Given it's been like this for four years
> before
> anyone noticed, and others probably depend on the current behaviour,
> that's hard to justify.
> 

A lot of people noticed fast that HS400 was broken in the 5.10 branch
but due to another commit (more later, ie double regulator init that
messed up emmc) this second breakage was not detected. But mostly
downstream. And most if not all rk3399 boards in Armbian had HS400
disabled.


It took 3 years to detect that HS400 was broken on a few boards like
Rock Pi4 in the upstream kernel. Any might still be broken.
I would not count on the fact that keeping the current behavior equals
no more broken boards.

From the previous exchanges the boards that requires the pulldown to be
disabled seems well known.

Though I am fine with adding a property to set enable pulldown to any
board definition file where that is required.

Only I do not believe keeping the statu quo equal everything works
because it has been 3 years.
In fact this commit reached the downstream kernels way later. Any
stayed with the 5.10 branch for years.

But on the other side the disable pulldown by default is alraedy in
stable/linux-rolling-lts .



> > Mind fixing the initial regression 8b5c2b45b8f0 "phy: rockchip: set
> > pulldown for strobe line in dts" does not necessarily mean changing
> > the
> > default to the opposite value but could also be reverting to not
> > setting a default.
> 
> That's also problematic, as the only way to do this is make setting
> one of the enabled or disabled properties required, which is also an
> ABI
> break, since you'd then be rejecting probe if one is not present.


I don't understand.
How reverting to not set either pulldown enabled or disabled by default
force all board to set either enabled or disabled.
I was telling about making the pulldown set by kernel optional be it
enabled or disabled to revert to the previous behavior. 

I mean before the patch to set a default pulldown value (to disabled)
there were no forced value.



> > Though I don't know if there are pros to setting a default.
> 
> What you probably have to weigh up is the cons of each side. If what
> you
> lose is HS400 mode with what's in the kernel right now but switching
> to
> what's been proposed would entirely break some boards, I know which
> I think the lesser of two evils is.

More boards (even if not the most wide spread it seems) are broken by
the current behavior.

I agree that only HS400 is broken by keeping the status quo. But as far
as I understand only HS400 will be broken either way.
Be that by keeping the current disable pulldown which break the boards
based on the rockchip default design or the boards that are non-
standard or have a broken design.
Both case this lead to data corruption on boot to eMMC.

The only pro of keeping the current value the default is that most
board broken by the new default introduced in 2020 "might" already be
fixed (but that is just a guess).



> It's probably up to the platform maintainer to weigh in at this
> point.

I am not knowledged into the delegation scope. You mean that from now
on it is up to the rockchip maintainer?
I am fine with it either way.

I just wanted to point out that maybe we don't have to set a pulldown
value after all. And that then all boards will be fine as before
setting the pulldown explicitly was introduced.


In fact I am more eager to get this fixed be it by adding a enable-
pulldown property to the board definitions, than to change the current
behavior.
Just wanted to sort out if that was not the wrong way to fix this
issue. (ie if adding a setting on most boards was wrong).


> Hope that helps?
> Conor.


During more than 2 years, I tried various patches and discussed on
forums about the HS400 breakage. I had bisected the regulator init
issue in the 5.10 branch. Sadly it took so much time for this issue to
be understood that when the force pulldown to disable commit was
introduced downstream before the first issue go fixed.
This only made the matter worse because when one fixed the double
regulator init issue HS400 was still broken, this time because the
pulldown was forced to disable. But nobody noticed this commit that
forced a default pulldown state (that was older than the regulator
commit from 5.13 backported to the 5.10 stable branch commit, but that
reached downstream later due to not having been backported to 5.10 from
5.11).
Otherwise we would have emailed immeditaly.
Bisecting was only able to catch the first breakage (as it was only
fixed after the second breakage was introduced).

Maybe the problem is that me and others did not complained to the
kernel upstream ML because we were using heavily patched downstream
kernels (like most if not all downstream ARM kernels). So sadly, the
forums from back then are filled with complaints but nothing seemed to
have reached the Linux ML.



About the regulator double init, stable downstream branches were hit by
a bug in the 5.10 stable branch in May 2021 before they switched to
5.11 were this default pulldown was introduced. Thus they could not
detect that this pulldown broke HS400 because HS400 was already broken
by a double regulator init, backported in 5.10 from 5.13:
"
commit 06653ebc0ad2e0b7d799cd71a5c2933ed2fb7a66
Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Date:   Thu May 20 01:12:23 2021 +0300

	regulator: core: resolve supply for boot-on/always-on
regulators
   
	commit 98e48cd9283dbac0e1445ee780889f10b3d1db6a upstream.
   
	For the boot-on/always-on regulators the
set_machine_constrainst() is
	called before resolving rdev->supply. Thus the code would try
to enable
	rdev before enabling supplying regulator. Enforce resolving
supply
	regulator before enabling rdev.
   
	Fixes: aea6cb99703e ("regulator: resolve supply after creating
regulator")
	Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
	Link:
https://lore.kernel.org/r/20210519221224.2868496-1-dmitry.baryshkov@linaro.org
	Signed-off-by: Mark Brown <broonie@kernel.org>
	Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
"
and which to my knowledge was only fixed in v6.1-rc1 "
commit 8a866d527ac0441c0eb14a991fa11358b476b11d
Author: Christian Kohlschütter <christian@kohlschutter.com>
Date:   Thu Aug 18 12:46:47 2022 +0000

    regulator: core: Resolve supply name earlier to prevent double-init
    
    Previously, an unresolved regulator supply reference upon calling
    regulator_register on an always-on or boot-on regulator caused
    set_machine_constraints to be called twice.
    
    This in turn may initialize the regulator twice, leading to voltage
    glitches that are timing-dependent. A simple, unrelated
configuration
    change may be enough to hide this problem, only to be surfaced by
    chance.
    
    One such example is the SD-Card voltage regulator in a NanoPI R4S
that
    would not initialize reliably unless the registration flow was just
    complex enough to allow the regulator to properly reset between
calls.
    
    Fix this by re-arranging regulator_register, trying resolve the
    regulator's supply early enough that set_machine_constraints does
not
    need to be called twice.
    
    Signed-off-by: Christian Kohlschütter <christian@kohlschutter.com>
    Link:
https://lore.kernel.org/r/20220818124646.6005-1-christian@kohlschutter.com
    Signed-off-by: Mark Brown <broonie@kernel.org>

"

So most boards were already broken when the commit to force a pulldown
value was introduced.




Regards
Alban
Conor Dooley April 11, 2024, 3:42 p.m. UTC | #9
On Wed, Apr 10, 2024 at 08:28:57PM +0200, Alban Browaeys wrote:
> Le jeudi 28 mars 2024 à 18:01 +0000, Conor Dooley a écrit :
> > On Thu, Mar 28, 2024 at 06:00:03PM +0100, Alban Browaeys wrote:
> > > Le mardi 26 mars 2024 à 19:46 +0000, Conor Dooley a écrit :
> > > > On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker Schwesinger via
> > > > B4
> > > > Relay wrote:
> > > > > From: Folker Schwesinger <dev@folker-schwesinger.de>
> > > > > -	if (of_property_read_bool(dev->of_node,
> > > > > "rockchip,enable-
> > > > > strobe-pulldown"))
> > > > > -		rk_phy->enable_strobe_pulldown =
> > > > > PHYCTRL_REN_STRB_ENABLE;
> > > > > +	if (of_property_read_bool(dev->of_node,
> > > > > "rockchip,disable-
> > > > > strobe-pulldown"))
> > > > > +		rk_phy->enable_strobe_pulldown =
> > > > > PHYCTRL_REN_STRB_DISABLE;
> > > > 
> > > > Unfortunately you cannot do this.
> > > > Previously no property at all meant disabled and a property was
> > > > required
> > > > to enable it. With this change the absence of a property means
> > > > that
> > > > it
> > > > will be enabled.
> > > > An old devicetree is that wanted this to be disabled would have
> > > > no
> > > > property and will now end up with it enabled. This is an ABI
> > > > break
> > > > and is
> > > > clearly not backwards compatible, that's a NAK unless it is
> > > > demonstrable
> > > > that noone actually wants to disable it at all.
> > > 
> > > 
> > > But the patch that introduced the new default to disable the
> > > pulldown
> > > explicitely introduced a regression for at least 4 boards.
> > > It took time to sort out that the default to disable pulldown was
> > > the
> > > culprit but still.
> > > Will we carry this new behavor that breaks the default design for
> > > rk3399 because since the regression was introduced new board
> > > definition
> > > might have expceted this new behavior.
> > > 
> > > Could the best option be to revert to énot set a default
> > > enable/disable
> > > pulldown" (as before the commit that introduced the regression) and
> > > allow one to force the pulldown via the enable/disable pulldown
> > > property?
> > > I mean the commit that introduced a default value for the pulldown
> > > did
> > > not seem to be about fixing anything. But it broke a lot. ANd it
> > > was
> > > really really hard to find the description of this commit to
> > > understand
> > > that one had to enable pulldown to restore hs400.
> > > 
> > > In more than 3 years, only one board maintainer noticed that this
> > > property was required to get back HS400  and thanks to a user
> > > telling
> > > me that this board was working I found from this board that this
> > > property was "missing" from most board definitions (while it was
> > > not
> > > required before).
> > > 
> > > 
> > > I am all for not breaking ABI. But what about not reverting a patch
> > > that already broke ABI because this patch introduced a new ABI that
> > > we
> > > don't want to break?
> > > I mean shouldn't a new commit with a new ABI that regressed the
> > > kernel
> > > be reverted?
> > 
> > I think I said it after OP replied to me yesterday, but this is a
> > pretty
> > shitty situation in that the original default picked for the property
> > was actually incorrect. Given it's been like this for four years
> > before
> > anyone noticed, and others probably depend on the current behaviour,
> > that's hard to justify.
> > 
> 
> A lot of people noticed fast that HS400 was broken in the 5.10 branch
> but due to another commit (more later, ie double regulator init that
> messed up emmc) this second breakage was not detected. But mostly
> downstream. And most if not all rk3399 boards in Armbian had HS400
> disabled.
> 
> 
> It took 3 years to detect that HS400 was broken on a few boards like
> Rock Pi4 in the upstream kernel. Any might still be broken.
> I would not count on the fact that keeping the current behavior equals
> no more broken boards.
> 
> From the previous exchanges the boards that requires the pulldown to be
> disabled seems well known.
> 
> Though I am fine with adding a property to set enable pulldown to any
> board definition file where that is required.
> 
> Only I do not believe keeping the statu quo equal everything works
> because it has been 3 years.

FWIW, I didn't say this. Clearly if that was the case, this patch would
never have arrived.

> In fact this commit reached the downstream kernels way later. Any
> stayed with the 5.10 branch for years.
> 
> But on the other side the disable pulldown by default is alraedy in
> stable/linux-rolling-lts .
> 
> > > Mind fixing the initial regression 8b5c2b45b8f0 "phy: rockchip: set
> > > pulldown for strobe line in dts" does not necessarily mean changing
> > > the
> > > default to the opposite value but could also be reverting to not
> > > setting a default.
> > 
> > That's also problematic, as the only way to do this is make setting
> > one of the enabled or disabled properties required, which is also an
> > ABI
> > break, since you'd then be rejecting probe if one is not present.
> 
> 
> I don't understand.
> How reverting to not set either pulldown enabled or disabled by default
> force all board to set either enabled or disabled.
> I was telling about making the pulldown set by kernel optional be it
> enabled or disabled to revert to the previous behavior. 
> 
> I mean before the patch to set a default pulldown value (to disabled)
> there were no forced value.

Ah, maybe I misunderstood what the code originally did. Did the original
code leave the bit however the bootloader or reset value had left it?
In that case, probe wouldn't be rejected and you'd not have the sort of
issue that I mentioned above.

> > > Though I don't know if there are pros to setting a default.
> > 
> > What you probably have to weigh up is the cons of each side. If what
> > you
> > lose is HS400 mode with what's in the kernel right now but switching
> > to
> > what's been proposed would entirely break some boards, I know which
> > I think the lesser of two evils is.
> 
> More boards (even if not the most wide spread it seems) are broken by
> the current behavior.
> 
> I agree that only HS400 is broken by keeping the status quo. But as far
> as I understand only HS400 will be broken either way.
> Be that by keeping the current disable pulldown which break the boards
> based on the rockchip default design or the boards that are non-
> standard or have a broken design.
> Both case this lead to data corruption on boot to eMMC.
> 
> The only pro of keeping the current value the default is that most
> board broken by the new default introduced in 2020 "might" already be
> fixed (but that is just a guess).
> 
> > It's probably up to the platform maintainer to weigh in at this
> > point.
> 
> I am not knowledged into the delegation scope. You mean that from now
> on it is up to the rockchip maintainer?
> I am fine with it either way.

Yes, I meant the rockchip maintainer. I'm only a lowly bindings
maintainer, without any knowledge of rockchip specfics or the type of
boards we're talking about being broken here. Someone has to make a
judgement call about which "no property" behaviour is used going forward
and I don't want that to be me!

> I just wanted to point out that maybe we don't have to set a pulldown
> value after all. And that then all boards will be fine as before
> setting the pulldown explicitly was introduced.

By "all boards will be fine" you mean "all boards that expected the
kernel didn't touch this bit will be fine". The boards that need the
kernel to set this bit because it {comes out of reset,is set by firmware}
incorrectly are going to need a property added if we revert the default
behaviour to not touching the bit.

> In fact I am more eager to get this fixed be it by adding a enable-
> pulldown property to the board definitions, than to change the current
> behavior.
> Just wanted to sort out if that was not the wrong way to fix this
> issue. (ie if adding a setting on most boards was wrong).

> During more than 2 years, I tried various patches and discussed on
> forums about the HS400 breakage. I had bisected the regulator init
> issue in the 5.10 branch. Sadly it took so much time for this issue to
> be understood that when the force pulldown to disable commit was
> introduced downstream before the first issue go fixed.
> This only made the matter worse because when one fixed the double
> regulator init issue HS400 was still broken, this time because the
> pulldown was forced to disable. But nobody noticed this commit that
> forced a default pulldown state (that was older than the regulator
> commit from 5.13 backported to the 5.10 stable branch commit, but that
> reached downstream later due to not having been backported to 5.10 from
> 5.11).
> Otherwise we would have emailed immeditaly.
> Bisecting was only able to catch the first breakage (as it was only
> fixed after the second breakage was introduced).
> 
> Maybe the problem is that me and others did not complained to the
> kernel upstream ML because we were using heavily patched downstream
> kernels (like most if not all downstream ARM kernels). So sadly, the
> forums from back then are filled with complaints but nothing seemed to
> have reached the Linux ML.

Aye, and all I can really say there is to buy boards from a vendor that
doesn't use some horribly hacked downstream kernel, which I know is
clearly an unsatisfactory suggestion. That said, we probably should have
caught that the new default behaviour when the changes were made was not
the default before. There was only one DT maintainer then though, and
things just slip by :/
Heiko Stuebner April 19, 2024, 2:31 p.m. UTC | #10
Am Donnerstag, 11. April 2024, 17:42:24 CEST schrieb Conor Dooley:
> On Wed, Apr 10, 2024 at 08:28:57PM +0200, Alban Browaeys wrote:
> > Le jeudi 28 mars 2024 à 18:01 +0000, Conor Dooley a écrit :
> > > On Thu, Mar 28, 2024 at 06:00:03PM +0100, Alban Browaeys wrote:
> > > > Le mardi 26 mars 2024 à 19:46 +0000, Conor Dooley a écrit :
> > > > > On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker Schwesinger via
> > > > > B4
> > > > > Relay wrote:
> > > > > > From: Folker Schwesinger <dev@folker-schwesinger.de>
> > > > > > -	if (of_property_read_bool(dev->of_node,
> > > > > > "rockchip,enable-
> > > > > > strobe-pulldown"))
> > > > > > -		rk_phy->enable_strobe_pulldown =
> > > > > > PHYCTRL_REN_STRB_ENABLE;
> > > > > > +	if (of_property_read_bool(dev->of_node,
> > > > > > "rockchip,disable-
> > > > > > strobe-pulldown"))
> > > > > > +		rk_phy->enable_strobe_pulldown =
> > > > > > PHYCTRL_REN_STRB_DISABLE;
> > > > > 
> > > > > Unfortunately you cannot do this.
> > > > > Previously no property at all meant disabled and a property was
> > > > > required
> > > > > to enable it. With this change the absence of a property means
> > > > > that
> > > > > it
> > > > > will be enabled.
> > > > > An old devicetree is that wanted this to be disabled would have
> > > > > no
> > > > > property and will now end up with it enabled. This is an ABI
> > > > > break
> > > > > and is
> > > > > clearly not backwards compatible, that's a NAK unless it is
> > > > > demonstrable
> > > > > that noone actually wants to disable it at all.
> > > > 
> > > > 
> > > > But the patch that introduced the new default to disable the
> > > > pulldown
> > > > explicitely introduced a regression for at least 4 boards.
> > > > It took time to sort out that the default to disable pulldown was
> > > > the
> > > > culprit but still.
> > > > Will we carry this new behavor that breaks the default design for
> > > > rk3399 because since the regression was introduced new board
> > > > definition
> > > > might have expceted this new behavior.
> > > > 
> > > > Could the best option be to revert to énot set a default
> > > > enable/disable
> > > > pulldown" (as before the commit that introduced the regression) and
> > > > allow one to force the pulldown via the enable/disable pulldown
> > > > property?
> > > > I mean the commit that introduced a default value for the pulldown
> > > > did
> > > > not seem to be about fixing anything. But it broke a lot. ANd it
> > > > was
> > > > really really hard to find the description of this commit to
> > > > understand
> > > > that one had to enable pulldown to restore hs400.
> > > > 
> > > > In more than 3 years, only one board maintainer noticed that this
> > > > property was required to get back HS400  and thanks to a user
> > > > telling
> > > > me that this board was working I found from this board that this
> > > > property was "missing" from most board definitions (while it was
> > > > not
> > > > required before).
> > > > 
> > > > 
> > > > I am all for not breaking ABI. But what about not reverting a patch
> > > > that already broke ABI because this patch introduced a new ABI that
> > > > we
> > > > don't want to break?
> > > > I mean shouldn't a new commit with a new ABI that regressed the
> > > > kernel
> > > > be reverted?
> > > 
> > > I think I said it after OP replied to me yesterday, but this is a
> > > pretty
> > > shitty situation in that the original default picked for the property
> > > was actually incorrect. Given it's been like this for four years
> > > before
> > > anyone noticed, and others probably depend on the current behaviour,
> > > that's hard to justify.
> > > 
> > 
> > A lot of people noticed fast that HS400 was broken in the 5.10 branch
> > but due to another commit (more later, ie double regulator init that
> > messed up emmc) this second breakage was not detected. But mostly
> > downstream. And most if not all rk3399 boards in Armbian had HS400
> > disabled.
> > 
> > 
> > It took 3 years to detect that HS400 was broken on a few boards like
> > Rock Pi4 in the upstream kernel. Any might still be broken.
> > I would not count on the fact that keeping the current behavior equals
> > no more broken boards.
> > 
> > From the previous exchanges the boards that requires the pulldown to be
> > disabled seems well known.
> > 
> > Though I am fine with adding a property to set enable pulldown to any
> > board definition file where that is required.
> > 
> > Only I do not believe keeping the statu quo equal everything works
> > because it has been 3 years.
> 
> FWIW, I didn't say this. Clearly if that was the case, this patch would
> never have arrived.
> 
> > In fact this commit reached the downstream kernels way later. Any
> > stayed with the 5.10 branch for years.
> > 
> > But on the other side the disable pulldown by default is alraedy in
> > stable/linux-rolling-lts .
> > 
> > > > Mind fixing the initial regression 8b5c2b45b8f0 "phy: rockchip: set
> > > > pulldown for strobe line in dts" does not necessarily mean changing
> > > > the
> > > > default to the opposite value but could also be reverting to not
> > > > setting a default.
> > > 
> > > That's also problematic, as the only way to do this is make setting
> > > one of the enabled or disabled properties required, which is also an
> > > ABI
> > > break, since you'd then be rejecting probe if one is not present.
> > 
> > 
> > I don't understand.
> > How reverting to not set either pulldown enabled or disabled by default
> > force all board to set either enabled or disabled.
> > I was telling about making the pulldown set by kernel optional be it
> > enabled or disabled to revert to the previous behavior. 
> > 
> > I mean before the patch to set a default pulldown value (to disabled)
> > there were no forced value.
> 
> Ah, maybe I misunderstood what the code originally did. Did the original
> code leave the bit however the bootloader or reset value had left it?
> In that case, probe wouldn't be rejected and you'd not have the sort of
> issue that I mentioned above.
> 
> > > > Though I don't know if there are pros to setting a default.
> > > 
> > > What you probably have to weigh up is the cons of each side. If what
> > > you
> > > lose is HS400 mode with what's in the kernel right now but switching
> > > to
> > > what's been proposed would entirely break some boards, I know which
> > > I think the lesser of two evils is.
> > 
> > More boards (even if not the most wide spread it seems) are broken by
> > the current behavior.
> > 
> > I agree that only HS400 is broken by keeping the status quo. But as far
> > as I understand only HS400 will be broken either way.
> > Be that by keeping the current disable pulldown which break the boards
> > based on the rockchip default design or the boards that are non-
> > standard or have a broken design.
> > Both case this lead to data corruption on boot to eMMC.
> > 
> > The only pro of keeping the current value the default is that most
> > board broken by the new default introduced in 2020 "might" already be
> > fixed (but that is just a guess).

which I guess are the least stale boards too.

> > > It's probably up to the platform maintainer to weigh in at this
> > > point.
> > 
> > I am not knowledged into the delegation scope. You mean that from now
> > on it is up to the rockchip maintainer?
> > I am fine with it either way.
> 
> Yes, I meant the rockchip maintainer. I'm only a lowly bindings
> maintainer, without any knowledge of rockchip specfics or the type of
> boards we're talking about being broken here. Someone has to make a
> judgement call about which "no property" behaviour is used going forward
> and I don't want that to be me!

I'm somehow all for not changing defaults again.

I think in the past there was a similar example in some other kernel part,
where some change broke the ABI, but meanwhile another ABI depended
on the changed behaviour, so a revert was not possible.

I think it's somewhat similar here. If the change has been in the kernel
for 3-4 years now, I do think that ship has sailed somehow.

As was said above, board introduced since 2020 might already be fixed
and essentially for boards that weren't, it does look like these didn't run
a mainline kernel for like 4 years now.

So if it comes down to deciding who to keep working, I'm more in favor of
those that did run on mainline in the years since.


Though not sure if I understood all the details here yet.


Heiko

> 
> > I just wanted to point out that maybe we don't have to set a pulldown
> > value after all. And that then all boards will be fine as before
> > setting the pulldown explicitly was introduced.
> 
> By "all boards will be fine" you mean "all boards that expected the
> kernel didn't touch this bit will be fine". The boards that need the
> kernel to set this bit because it {comes out of reset,is set by firmware}
> incorrectly are going to need a property added if we revert the default
> behaviour to not touching the bit.
> 
> > In fact I am more eager to get this fixed be it by adding a enable-
> > pulldown property to the board definitions, than to change the current
> > behavior.
> > Just wanted to sort out if that was not the wrong way to fix this
> > issue. (ie if adding a setting on most boards was wrong).
> 
> > During more than 2 years, I tried various patches and discussed on
> > forums about the HS400 breakage. I had bisected the regulator init
> > issue in the 5.10 branch. Sadly it took so much time for this issue to
> > be understood that when the force pulldown to disable commit was
> > introduced downstream before the first issue go fixed.
> > This only made the matter worse because when one fixed the double
> > regulator init issue HS400 was still broken, this time because the
> > pulldown was forced to disable. But nobody noticed this commit that
> > forced a default pulldown state (that was older than the regulator
> > commit from 5.13 backported to the 5.10 stable branch commit, but that
> > reached downstream later due to not having been backported to 5.10 from
> > 5.11).
> > Otherwise we would have emailed immeditaly.
> > Bisecting was only able to catch the first breakage (as it was only
> > fixed after the second breakage was introduced).
> > 
> > Maybe the problem is that me and others did not complained to the
> > kernel upstream ML because we were using heavily patched downstream
> > kernels (like most if not all downstream ARM kernels). So sadly, the
> > forums from back then are filled with complaints but nothing seemed to
> > have reached the Linux ML.
> 
> Aye, and all I can really say there is to buy boards from a vendor that
> doesn't use some horribly hacked downstream kernel, which I know is
> clearly an unsatisfactory suggestion. That said, we probably should have
> caught that the new default behaviour when the changes were made was not
> the default before. There was only one DT maintainer then though, and
> things just slip by :/
>
Alban Browaeys June 11, 2024, 7:38 p.m. UTC | #11
This is quite a detailed replied. in summary I believe the enable-
strobe property should be added as soon as possible to restore hs400
support.

In the process it will be easier to see if any upstream dts has hs400
enabled and does not require to force this enable-strobe property. 

As far as I understood the only boards that do not support enable-
strobe pulldown by default do not support hs400 either (because they do
not support strobe pulldown enabled due to wiring issue, but that also
prevetn them from enabling hs400).

And then if I managed to convince you that the best policy is to
restore previous behaviour of not setting a default in the kernel (and
provides enable and disable properties for those that want to force
ether way be it due to wiring issues), remove these properties once the
default is removed.


Le vendredi 19 avril 2024 à 16:31 +0200, Heiko Stübner a écrit :
> Am Donnerstag, 11. April 2024, 17:42:24 CEST schrieb Conor Dooley:
> > On Wed, Apr 10, 2024 at 08:28:57PM +0200, Alban Browaeys wrote:
> > > Le jeudi 28 mars 2024 à 18:01 +0000, Conor Dooley a écrit :
> > > > On Thu, Mar 28, 2024 at 06:00:03PM +0100, Alban Browaeys wrote:
> > > > > Le mardi 26 mars 2024 à 19:46 +0000, Conor Dooley a écrit :
> > > > > > On Tue, Mar 26, 2024 at 07:54:35PM +0100, Folker
> > > > > > Schwesinger via
> > > > > > B4
> > > > > > Relay wrote:
> > > > > > > From: Folker Schwesinger <dev@folker-schwesinger.de>
> > > > > > > -	if (of_property_read_bool(dev->of_node,
> > > > > > > "rockchip,enable-
> > > > > > > strobe-pulldown"))
> > > > > > > -		rk_phy->enable_strobe_pulldown =
> > > > > > > PHYCTRL_REN_STRB_ENABLE;
> > > > > > > +	if (of_property_read_bool(dev->of_node,
> > > > > > > "rockchip,disable-
> > > > > > > strobe-pulldown"))
> > > > > > > +		rk_phy->enable_strobe_pulldown =
> > > > > > > PHYCTRL_REN_STRB_DISABLE;
> > > > > > 
> > > > > > Unfortunately you cannot do this.
> > > > > > Previously no property at all meant disabled and a property
> > > > > > was
> > > > > > required
> > > > > > to enable it. With this change the absence of a property
> > > > > > means
> > > > > > that
> > > > > > it
> > > > > > will be enabled.
> > > > > > An old devicetree is that wanted this to be disabled would
> > > > > > have
> > > > > > no
> > > > > > property and will now end up with it enabled. This is an
> > > > > > ABI
> > > > > > break
> > > > > > and is
> > > > > > clearly not backwards compatible, that's a NAK unless it is
> > > > > > demonstrable
> > > > > > that noone actually wants to disable it at all.
> > > > > 
> > > > > 
> > > > > But the patch that introduced the new default to disable the
> > > > > pulldown
> > > > > explicitely introduced a regression for at least 4 boards.
> > > > > It took time to sort out that the default to disable pulldown
> > > > > was
> > > > > the
> > > > > culprit but still.
> > > > > Will we carry this new behavor that breaks the default design
> > > > > for
> > > > > rk3399 because since the regression was introduced new board
> > > > > definition
> > > > > might have expceted this new behavior.
> > > > > 
> > > > > Could the best option be to revert to énot set a default
> > > > > enable/disable
> > > > > pulldown" (as before the commit that introduced the
> > > > > regression) and
> > > > > allow one to force the pulldown via the enable/disable
> > > > > pulldown
> > > > > property?
> > > > > I mean the commit that introduced a default value for the
> > > > > pulldown
> > > > > did
> > > > > not seem to be about fixing anything. But it broke a lot. ANd
> > > > > it
> > > > > was
> > > > > really really hard to find the description of this commit to
> > > > > understand
> > > > > that one had to enable pulldown to restore hs400.
> > > > > 
> > > > > In more than 3 years, only one board maintainer noticed that
> > > > > this
> > > > > property was required to get back HS400  and thanks to a user
> > > > > telling
> > > > > me that this board was working I found from this board that
> > > > > this
> > > > > property was "missing" from most board definitions (while it
> > > > > was
> > > > > not
> > > > > required before).
> > > > > 
> > > > > 
> > > > > I am all for not breaking ABI. But what about not reverting a
> > > > > patch
> > > > > that already broke ABI because this patch introduced a new
> > > > > ABI that
> > > > > we
> > > > > don't want to break?
> > > > > I mean shouldn't a new commit with a new ABI that regressed
> > > > > the
> > > > > kernel
> > > > > be reverted?
> > > > 
> > > > I think I said it after OP replied to me yesterday, but this is
> > > > a
> > > > pretty
> > > > shitty situation in that the original default picked for the
> > > > property
> > > > was actually incorrect. Given it's been like this for four
> > > > years
> > > > before
> > > > anyone noticed, and others probably depend on the current
> > > > behaviour,
> > > > that's hard to justify.
> > > > 
> > > 
> > > A lot of people noticed fast that HS400 was broken in the 5.10
> > > branch
> > > but due to another commit (more later, ie double regulator init
> > > that
> > > messed up emmc) this second breakage was not detected. But mostly
> > > downstream. And most if not all rk3399 boards in Armbian had
> > > HS400
> > > disabled.
> > > 
> > > 
> > > It took 3 years to detect that HS400 was broken on a few boards
> > > like
> > > Rock Pi4 in the upstream kernel. Any might still be broken.
> > > I would not count on the fact that keeping the current behavior
> > > equals
> > > no more broken boards.
> > > 
> > > From the previous exchanges the boards that requires the pulldown
> > > to be
> > > disabled seems well known.
> > > 
> > > Though I am fine with adding a property to set enable pulldown to
> > > any
> > > board definition file where that is required.
> > > 
> > > Only I do not believe keeping the statu quo equal everything
> > > works
> > > because it has been 3 years.
> > 
> > FWIW, I didn't say this. Clearly if that was the case, this patch
> > would
> > never have arrived.
> > 
> > > In fact this commit reached the downstream kernels way later. Any
> > > stayed with the 5.10 branch for years.
> > > 
> > > But on the other side the disable pulldown by default is alraedy
> > > in
> > > stable/linux-rolling-lts .
> > > 
> > > > > Mind fixing the initial regression 8b5c2b45b8f0 "phy:
> > > > > rockchip: set
> > > > > pulldown for strobe line in dts" does not necessarily mean
> > > > > changing
> > > > > the
> > > > > default to the opposite value but could also be reverting to
> > > > > not
> > > > > setting a default.
> > > > 
> > > > That's also problematic, as the only way to do this is make
> > > > setting
> > > > one of the enabled or disabled properties required, which is
> > > > also an
> > > > ABI
> > > > break, since you'd then be rejecting probe if one is not
> > > > present.
> > > 
> > > 
> > > I don't understand.
> > > How reverting to not set either pulldown enabled or disabled by
> > > default
> > > force all board to set either enabled or disabled.
> > > I was telling about making the pulldown set by kernel optional be
> > > it
> > > enabled or disabled to revert to the previous behavior. 
> > > 
> > > I mean before the patch to set a default pulldown value (to
> > > disabled)
> > > there were no forced value.
> > 
> > Ah, maybe I misunderstood what the code originally did. Did the
> > original
> > code leave the bit however the bootloader or reset value had left
> > it?
> > In that case, probe wouldn't be rejected and you'd not have the
> > sort of
> > issue that I mentioned above.
> > 
> > > > > Though I don't know if there are pros to setting a default.
> > > > 
> > > > What you probably have to weigh up is the cons of each side. If
> > > > what
> > > > you
> > > > lose is HS400 mode with what's in the kernel right now but
> > > > switching
> > > > to
> > > > what's been proposed would entirely break some boards, I know
> > > > which
> > > > I think the lesser of two evils is.
> > > 
> > > More boards (even if not the most wide spread it seems) are
> > > broken by
> > > the current behavior.
> > > 
> > > I agree that only HS400 is broken by keeping the status quo. But
> > > as far
> > > as I understand only HS400 will be broken either way.
> > > Be that by keeping the current disable pulldown which break the
> > > boards
> > > based on the rockchip default design or the boards that are non-
> > > standard or have a broken design.
> > > Both case this lead to data corruption on boot to eMMC.
> > > 
> > > The only pro of keeping the current value the default is that
> > > most
> > > board broken by the new default introduced in 2020 "might"
> > > already be
> > > fixed (but that is just a guess).
> 
> which I guess are the least stale boards too.
> 
> > > > It's probably up to the platform maintainer to weigh in at this
> > > > point.
> > > 
> > > I am not knowledged into the delegation scope. You mean that from
> > > now
> > > on it is up to the rockchip maintainer?
> > > I am fine with it either way.
> > 
> > Yes, I meant the rockchip maintainer. I'm only a lowly bindings
> > maintainer, without any knowledge of rockchip specfics or the type
> > of
> > boards we're talking about being broken here. Someone has to make a
> > judgement call about which "no property" behaviour is used going
> > forward
> > and I don't want that to be me!
> 
> I'm somehow all for not changing defaults again.
> 
> I think in the past there was a similar example in some other kernel
> part,
> where some change broke the ABI, but meanwhile another ABI depended
> on the changed behaviour, so a revert was not possible.
> 
> I think it's somewhat similar here. If the change has been in the
> kernel
> for 3-4 years now, I do think that ship has sailed somehow.
> 
> As was said above, board introduced since 2020 might already be fixed
> and essentially for boards that weren't, it does look like these
> didn't run
> a mainline kernel for like 4 years now.


I agree that all board have been fixed. But not by adpating to the new
default since 2020 and relying on it nowadays.
Most if not all have disabled hs400es support (and the patch to disable
hs400 are only reaching upstream recently, ie only a year ago for the
latest ones).

The hs400es support was disabled as soon as the new default was
introduced. And in fact earlier due to a double init in regulator core,
when probe deferred was introduced, that also broke emmc on rk3399 a
few month before this new default was pushed to stable branch in 2020.
Ie it could be that until when this double init was fixed in august
2022 by 
https://github.com/torvalds/linux/commit/8a866d527ac0441c0eb14a991fa11358b476b11d,
no rk3399 board was able to test hs400es support with the new 2020
default.

The main thing that made it impossible to find this new default was bad
is that emmc hs400 support was already broken when the commit reached
the stable kernel.

Also a lot of users and devs use SD instead of EMMC.

Maybe before discussing if the new default should be kept for
stability, we should check if any upstream rk3399 board that still has
hs400es support without the "rockchip,enable-strobe-pulldown" (ie out
of:
arch/arm64/boot/dts/rockchip/rk3399-nanopc-t4.dts via arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi
arch/arm64/boot/dts/rockchip/rk3399-nanopi-m4.dts via arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi
arch/arm64/boot/dts/rockchip/rk3399-nanopi-neo4.dts via arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi
arch/arm64/boot/dts/rockchip/rk3399-nanopi-r4s.dts via arch/arm64/boot/dts/rockchip/rk3399-nanopi4.dtsi
) has working emmc hs400.
Could be there are none that have a working hs400 emmc.

Could we ask for these boards to be checked for working emmc with the
new 2020 default? (as we cannot assume that as even well maintained
board only found upstream they were broken in mid 2023, ie commit
2bd1d2dd808c60532283e9cf05110bf1bf2f9079 and
cee572756aa2cb46e959e9797ad4b730b78a050b for ROCK 4C+ and ROCK Pi 4)? 
Then we will be confident the new 2020 default regression is required
by anything.

Ie, we assume they were working as nobody complains as most complains
never reach upstream ML... but the armbian forum were flooed with
complaints in these days. I was one of the complainer until I decided
to abandon other projects because this issue was too big to me (wel I
ended up and am still on SD since it took too long and I needed the
board, as did all other users. But I want this to change).
Only downstream disabled hs400es to get the board working. All rk3399
boards in Armbian are patched to remove hs400 support since at least
two years as far as I know.

These current boards with hs400es enabled upstream are:
arch/arm64/boot/dts/rockchip/rk3399-evb.dts
arch/arm64/boot/dts/rockchip/rk3399-firefly.dts
arch/arm64/boot/dts/rockchip/rk3399-gru-bob.dts via arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dts via arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
arch/arm64/boot/dts/rockchip/rk3399-gru-scarlet-dumo.dts via arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
arch/arm64/boot/dts/rockchip/rk3399-gru-scarlet-inx.dts via arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
arch/arm64/boot/dts/rockchip/rk3399-gru-scarlet-kd.dts via arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
arch/arm64/boot/dts/rockchip/rk3399-hugsun-x99.dts
arch/arm64/boot/dts/rockchip/rk3399-khadas-edge-captain.dts via arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dtsi
arch/arm64/boot/dts/rockchip/rk3399-khadas-edge-v.dts via arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dtsi
arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dts via arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dtsi
arch/arm64/boot/dts/rockchip/rk3399-leez-p710.dts
arch/arm64/boot/dts/rockchip/rk3399-nanopc-t4.dts
arch/arm64/boot/dts/rockchip/rk3399-orangepi.dts
arch/arm64/boot/dts/rockchip/rk3399-puma-haikou.dts via arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi
arch/arm64/boot/dts/rockchip/rk3399-ficus.dts via arch/arm64/boot/dts/rockchip/rk3399-rock960.dtsi
arch/arm64/boot/dts/rockchip/rk3399-rock960.dts via arch/arm64/boot/dts/rockchip/rk3399-rock960.dtsi
arch/arm64/boot/dts/rockchip/rk3399-sapphire-excavator.dts via arch/arm64/boot/dts/rockchip/rk3399-sapphire.dtsi
arch/arm64/boot/dts/rockchip/rk3399-sapphire.dts via arch/arm64/boot/dts/rockchip/rk3399-sapphire.dtsi
arch/arm64/boot/dts/rockchip/rk3399pro-rock-pi-n10.dts via arch/arm64/boot/dts/rockchip/rk3399pro-vmarc-som.dtsi

These board were broken by the new default though they only disabled hs enhanced strobe but left basic hs400 support:
arch/arm64/boot/dts/rockchip/rk3399-khadas-edge-captain.dts via arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dtsi
arch/arm64/boot/dts/rockchip/rk3399-khadas-edge-v.dts via arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dtsi
arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dts via arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dtsi
ie commit 6dd0053683804427529ef3523f7872f473440a19 in november of 2021.


Note: I have not checked that my board is working with mmc-hs400-1_8v
enabled and mmc-hs400-enhanced-strobe disabled. Could be that the new
default only break the enhanced strobe support, not hs400 per se. Back
then Armbian contributors disabled hs400 support.


Finally, it could be that one RK3399 board currently upstream depends
on this new default but I guess none.


!!!!!
The main issue and what bother me is whether any boards depend on the
previous default. This is the real issue.
That is previously the kernel value was left to the one set by the
bootloader or the hardware reset.
The Rockchip kernel default to enable strobe was told by Dragan Simic 
to break RockPro64 and the Pinebook Pro due to them behing miswired. 
But it was not the upstream behavior before the 2020 new disable by
default behavior was introduced. The previous behavior to leave teh
pulldown as is.
!!!!

The dev for commit 8b5c2b45b8f0a11c9072da0f7baf9ee986d3151e that
introduced the new disable-strobe by default tells he tested on his
"rk3399 customized board".




I wanted to sort out if it made sense to add "rockchip,enable-strobe-
pulldown" to most rk339 boards. But I should have first added this
property then discussed if we should revert the new 2020 default.
We could just remove all the "rockchip,enable-strobe-pulldown" later on
if we revert either to the previous behavior of not setting a default
or even to the Rockchip tree default to set it enable-strobe by
default.
I should not have kept the board broken in the mean time. It is not as
if I was introducing a new API that could be wrong. The API is already
there even if wrong.

I will try to carry forward the discussion. But unbreaking the board is
higher priority.

Could you test the enable-strobe property on all the dts that  disabled
hs400es due to this new default (I lack the hardware to test the
patches).
At least arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dtsi
arch/arm64/boot/dts/rockchip/rk3399-rock-4c-plus.dts
arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
and ping the board tester for other boards to test if they require
"rockchip,enable-strobe-pulldown" for EMMC HS400 write support (read is
fine even with the new default).


About:
On Tue, 2024-02-27 at 10:11 +0000, Folker Schwesinger wrote:
> with the following applied, the EMMC related errors are gone. dmesg
> only
> shows "Purging ... bytes" during my tests:
> 
> diff --git a/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
> b/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
> index f2279aa6ca9e..ae0fb87e1a8b 100644
> --- a/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
> +++ b/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
> @@ -647,8 +647,10 @@ &saradc {
>  &sdhci {
>         max-frequency = <150000000>;
>         bus-width = <8>;
> -       mmc-hs200-1_8v;
> +       mmc-hs400-1_8v;
> +       mmc-hs400-enhanced-strobe;
>         non-removable;
> +       rockchip,enable-strobe-pulldown;
>         status = "okay";
>  };
> 
> For testing I ran dd three times in a row:
> 
> dd if=/dev/zero of=./zero.bin bs=1M count=5000
> 
> I tested this on both a Rock 4SE board and a Rock Pi 4B+ board with
> the
> same results.

Folker, are you confident "Rock 4SE board and Rock Pi 4B+" were fixed
with above patch?
Ie the "rockchip,enable-strobe-pulldown;" should be under an
"rockchip,rk3399-emmc-phy" compaible node, that is &emmc_phy in
arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi, not sdhci.




Since the new default bahvior was introduced the only new board dts
with hs400es support introduced were:
- arm64/boot/dts/rockchip/rk3399-rock-4c-plus.dts in
246450344dad087a121befbed1aba776dba3d377 (but as shown above the
support was disabled in 2023 due to hs400es not working with the new
default. I believe the dts was designed against a kernel before the new
default disable-strobe).
- arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi in
b5edb04673700125bfd1d13e6c14747b1ecba522
which these board dts includes:
arch/arm64/boot/dts/rockchip/rk3399-rock-4se.dts
arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4a-plus.dts
arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4a.dts
arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4b-plus.dts
arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4b.dts
arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4c.dts
but the hs400es support was disabled
incee572756aa2cb46e959e9797ad4b730b78a050b due to not working with the
new disable-strobe

As far as I know no board with working hs400 was introduced after the
new behavior. We can assume that not setting a default as before this
new behavior will not break any board any more than it was before this
commit was introduced (and this commit introducing the new behavior
does not acknowledge fixing any boards).


Either way, I believe we should add "rockchip,enable-strobe-pulldown"
to any board dts with hs400 broken asap.


> So if it comes down to deciding who to keep working, I'm more in
> favor of
> those that did run on mainline in the years since.
> 
> 
> Though not sure if I understood all the details here yet.
> 
> 
> Heiko
> 
> > 
> > > I just wanted to point out that maybe we don't have to set a
> > > pulldown
> > > value after all. And that then all boards will be fine as before
> > > setting the pulldown explicitly was introduced.
> > 
> > By "all boards will be fine" you mean "all boards that expected the
> > kernel didn't touch this bit will be fine". The boards that need
> > the
> > kernel to set this bit because it {comes out of reset,is set by
> > firmware}
> > incorrectly are going to need a property added if we revert the
> > default
> > behaviour to not touching the bit.
> > 
> > > In fact I am more eager to get this fixed be it by adding a
> > > enable-
> > > pulldown property to the board definitions, than to change the
> > > current
> > > behavior.
> > > Just wanted to sort out if that was not the wrong way to fix this
> > > issue. (ie if adding a setting on most boards was wrong).
> > 
> > > During more than 2 years, I tried various patches and discussed
> > > on
> > > forums about the HS400 breakage. I had bisected the regulator
> > > init
> > > issue in the 5.10 branch. Sadly it took so much time for this
> > > issue to
> > > be understood that when the force pulldown to disable commit was
> > > introduced downstream before the first issue go fixed.
> > > This only made the matter worse because when one fixed the double
> > > regulator init issue HS400 was still broken, this time because
> > > the
> > > pulldown was forced to disable. But nobody noticed this commit
> > > that
> > > forced a default pulldown state (that was older than the
> > > regulator
> > > commit from 5.13 backported to the 5.10 stable branch commit, but
> > > that
> > > reached downstream later due to not having been backported to
> > > 5.10 from
> > > 5.11).
> > > Otherwise we would have emailed immeditaly.
> > > Bisecting was only able to catch the first breakage (as it was
> > > only
> > > fixed after the second breakage was introduced).
> > > 
> > > Maybe the problem is that me and others did not complained to the
> > > kernel upstream ML because we were using heavily patched
> > > downstream
> > > kernels (like most if not all downstream ARM kernels). So sadly,
> > > the
> > > forums from back then are filled with complaints but nothing
> > > seemed to
> > > have reached the Linux ML.
> > 
> > Aye, and all I can really say there is to buy boards from a vendor
> > that
> > doesn't use some horribly hacked downstream kernel, which I know is
> > clearly an unsatisfactory suggestion. That said, we probably should
> > have
> > caught that the new default behaviour when the changes were made
> > was not
> > the default before. There was only one DT maintainer then though,
> > and
> > things just slip by :/
> > 


The sad thing is the vendor indeed worked to upstream the code (though
it has since closed doors).
But as told previously, the emmc hs400 support was already broken by
the time the new default disable-strobe pulldown by default reached
stable. This due to another tricky to debug unrelated double init in
regulator core introduced a few months before the commit to disable-
strobe was introduced. No one could have caught that the new behavior
was broken on most rk3399 board by testing ... because it was not
possible to test hs400 emmc before the double init in regulator core
was fixed in august 2022.
I had bisected the breakage to this regulator core deferred probe
(commit "06653ebc0ad2e0b7d799cd71a5c2933ed2fb7a66"  "regulator: core:
resolve supply for boot-on/always-on regulators") when it broke in
5.10.60. But the new default behavior was no backported to the 5.10
branch so it confused me that the revert that got back hs400es on
helios64 in the 5.10 branch when finally fixed in the 5.16 branch (the
double init fixup) then emmc was still broken.
Only when arguing with an Armbian user that all rk3399 had emmc hs400es
broken did he argue that his board was working. And indeed it was the
only one which add "rockchip,enable-strobe-pulldown" to its dts as they
found about the new default behavior commit, cf
https://lore.kernel.org/lkml/eafd8d8c0cbcaca1b190f84ec17a1dcd7aec0306.camel@collabora.com/T/
"arm64: dts: rockchip: add enable-strobe-pulldown to emmc phy on
rk3399"


Heiko, you then asked:
On Tue, Aug 23, 2022 at 4:53 AM Heiko Stübner <heiko@sntech.de> wrote:
> I'm just wondering as the "offending" patch is from 2020, why this
only turns up now. Any ideas?


It turns out because this commit was not backported to stable branches.
And when it did most boards broke (it was panic in the forums with data
loss issues) and ended up with hs400 isabled in downstream kernel
(Armbian for one).
This discussion was in 2022. Only two dts files with hs400 support fo
rk3399 were added since then and had to disable hs400 later one because
of this default enable-strobe policy.


Also in this thread:
On Tue, Aug 23, 2022 at 10:13 PM Doug Anderson <dianders@chromium.org>
wrote:
> Ah, I see. So before the offending patch we used to just leave the
> pull state at whatever the default was when the kernel was booted.
> After the offending patch we chose a default.
> 
> On kevin I see an external pull down on this line. Enabling both the
> internal and external is probably not a huge deal, it'll just affect
> the strength of the pull.
> 
> On bob I _think_ the external pull down is also stuffed.
> 
> ...so I guess that would explain why it didn't cause a problem for at
> least those two boards?
> 
> -Doug


So it could be that disable by default broke most board but enabling by
default would break none, or only very specific board (that from my git
log inspection would have already been broken before the new behavior
was introduced).

then:
> I realized that only some devices may be affected, so I considered
> modifying rk3399-nanopi4.dtsi only,
> but other boards without external pull-down should still need this
> patch.
> 
> 
> BR,
> Jensen




It was pointed out that RockPro64 and Pinebook Pro had the strobe pin
miswired. 

From: Dragan Simic @ 2024-02-27 16:00 UTC (permalink / raw)
> When it comes to HS400 support on the RockPro64 and the Pinebook Pro,
> they're unfortunately miswired in a hopeless way, causing the DATA
> STROBE signal from the eMMC chip (i.e. the eMMC module) not to even
> reach the right ball on the RK3399 SoC.
> 
> Here are a few screenshots from the schematics that illustrate this
> issue (the first one is from the eMMC module schematic, the remaining
> two are from the RockPro64 schematic, which are pretty much exactly
> the same in the Pinebook Pro schematic):
> 
> - https://i.imgur.com/MqD7rcG.png
> - https://i.imgur.com/hrlQBck.png
> - https://i.imgur.com/mtYvc9s.png
> 
> There have been some Pine64 community attempts to have this fixed in
> new RockPro64 and Pinebook Pro hardware revisions, but such attempts
> unfortunately failed. :/
> 
> Thus, we'll unfortunately have to deal with the whole thing on the
> board level, instead of making changes on the SoC level.


On Tue Mar 26, 2024 at 8:55 PM CET, Dragan Simic wrote:
> Moreover, as I already explained some time ago, [1] some boards and
> devices are unfortunately miswired, and we don't want to enable the
> DATA STROBE pull-down on such boards.



Maybe reverting to the previous of not setting a default for this
strobe pulldown in the kernel is the best option (the commit
introducing a default does not tell why it makes sense to set a default
in the kernel out of "the Rockchip kernel does it").


and:
On Wed, Aug 24, 2022 at 10:58 PM Doug Anderson <dianders@chromium.org>
wrote:
> I guess the other alternative would be to change how the dt property
> works. You could say:
> 
> 1. If `enable-strobe-pulldown` is set then enable the strobe
> pulldown.
> 
> 2. If `enable-strobe-pulldown` is not set then don't touch the pin in
> the kernel.
> 
> 3. If someone later needs to explicitly disable the strobe pulldown
> they could add a new property like `disable-strobe-pulldown`.
> 
> 
> Obviously there are tradeoffs between that and what you've done and
> I'm happy to let others make the call of which they'd prefer.


This is a sane way policy to me.

Maybe Rockchip had a rationale to enable this strobe pulldown by
default. But I don't think there is any rationale to force this
pulldown to disabled by default.


In a secondary step, with extensive testing, if ever, I believe we
could still change the default to enable but add a disable-strobe for
RockPro64 and Pinebook Pro (broken due to a design mistake that is
specific to these boards as far as I understood).
But this can be worked on after any default is removed and merged only
when supported boards have been tested to be fine with this default.
As far as I understood setting the strobe pulldown to board with an
external pulldown is harmless, so only boards with miswired lines would
be broken and could be handled by adding disable-pulldown (which would
also document them not supporting hs400 in the dts instead of relying
on developers ot carry on this piece of information).

In fact I believe it would be good to add disable-strobe-pulldown
properties to the dts of boards which cannot support internal strobe
pulldown to be enabled, if only to document they cannot support it.


By the way how does the Rockchip kernel handles RockPro64 and Pinebook
Pro?
Are both RockPro64 V2.0 and V2.1 with DATA STROBE lines miswired?

Regards,
Alban
Folker Schwesinger June 14, 2024, 6:34 p.m. UTC | #12
Hi Alban,

thanks for aggregating all the background information about the issue.

On Tue Jun 11, 2024 at 9:38 PM CEST, Alban Browaeys wrote:
>
> Could you test the enable-strobe property on all the dts that  disabled
> hs400es due to this new default (I lack the hardware to test the
> patches).
> At least arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dtsi
> arch/arm64/boot/dts/rockchip/rk3399-rock-4c-plus.dts
> arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
> and ping the board tester for other boards to test if they require
> "rockchip,enable-strobe-pulldown" for EMMC HS400 write support (read is
> fine even with the new default).
>

I tested some of the boards that include rk3399-rock-pi-4.dtsi (see
below).

> On Tue, 2024-02-27 at 10:11 +0000, Folker Schwesinger wrote:
> > with the following applied, the EMMC related errors are gone. dmesg
> > only
> > shows "Purging ... bytes" during my tests:
> > 
> > diff --git a/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
> > b/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
> > index f2279aa6ca9e..ae0fb87e1a8b 100644
> > --- a/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
> > +++ b/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
> > @@ -647,8 +647,10 @@ &saradc {
> >  &sdhci {
> >         max-frequency = <150000000>;
> >         bus-width = <8>;
> > -       mmc-hs200-1_8v;
> > +       mmc-hs400-1_8v;
> > +       mmc-hs400-enhanced-strobe;
> >         non-removable;
> > +       rockchip,enable-strobe-pulldown;
> >         status = "okay";
> >  };
> > 
> > For testing I ran dd three times in a row:
> > 
> > dd if=/dev/zero of=./zero.bin bs=1M count=5000
> > 
> > I tested this on both a Rock 4SE board and a Rock Pi 4B+ board with
> > the
> > same results.
>
> Folker, are you confident "Rock 4SE board and Rock Pi 4B+" were fixed
> with above patch?
> Ie the "rockchip,enable-strobe-pulldown;" should be under an
> "rockchip,rk3399-emmc-phy" compaible node, that is &emmc_phy in
> arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi, not sdhci.
>

The above diff was just a quick shot at testing the
"rockchip,enable-strobe-pulldown" property when I first learned about
it. I later realized that the property belongs under the &emmc_phy node
as you suggested. That's what I did in the other patchset I sent a bit
later, which was accepted and applied:

https://lists.infradead.org/pipermail/linux-rockchip/2024-March/045723.html
f720dd9b8b6d8b2160beda789429d5489ce8a099
c1b1f340dd7db11f273e426e110697551c9f501f

So yes, the Rock 4SE, Rock Pi 4B and Rock Pi 4B+ boards all were fixed
with the patch. I regularly have the Rock 4SE and Rock 4B running from
EMMC, always with the patch applied since I sent it.
diff mbox series

Patch

diff --git a/drivers/phy/rockchip/phy-rockchip-emmc.c b/drivers/phy/rockchip/phy-rockchip-emmc.c
index 20023f6eb994..6e637f3e1b19 100644
--- a/drivers/phy/rockchip/phy-rockchip-emmc.c
+++ b/drivers/phy/rockchip/phy-rockchip-emmc.c
@@ -376,14 +376,14 @@  static int rockchip_emmc_phy_probe(struct platform_device *pdev)
 	rk_phy->reg_offset = reg_offset;
 	rk_phy->reg_base = grf;
 	rk_phy->drive_impedance = PHYCTRL_DR_50OHM;
-	rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_DISABLE;
+	rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_ENABLE;
 	rk_phy->output_tapdelay_select = PHYCTRL_OTAPDLYSEL_DEFAULT;
 
 	if (!of_property_read_u32(dev->of_node, "drive-impedance-ohm", &val))
 		rk_phy->drive_impedance = convert_drive_impedance_ohm(pdev, val);
 
-	if (of_property_read_bool(dev->of_node, "rockchip,enable-strobe-pulldown"))
-		rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_ENABLE;
+	if (of_property_read_bool(dev->of_node, "rockchip,disable-strobe-pulldown"))
+		rk_phy->enable_strobe_pulldown = PHYCTRL_REN_STRB_DISABLE;
 
 	if (!of_property_read_u32(dev->of_node, "rockchip,output-tapdelay-select", &val)) {
 		if (val <= PHYCTRL_OTAPDLYSEL_MAXVALUE)