diff mbox series

[RFC] brcmfmac: add 43751 SDIO ids and initialization

Message ID c1a215cf-94be-871b-2a8a-3cc381588f83@free.fr (mailing list archive)
State New, archived
Headers show
Series [RFC] brcmfmac: add 43751 SDIO ids and initialization | expand

Commit Message

Marc Gonzalez March 6, 2023, 10:24 a.m. UTC
OF_NAME=wifi
OF_FULLNAME=/soc/sd@ffe03000/wifi@1
OF_COMPATIBLE_0=brcm,bcm4329-fmac
OF_COMPATIBLE_N=1
SDIO_CLASS=00
SDIO_ID=02D0:AAE7
SDIO_REVISION=0.0
MODALIAS=sdio:c00v02D0dAAE7

0xaae7 = 43751

Isn't there some overlap between:
include/linux/mmc/sdio_ids.h and
drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h ?

Reference used: commit d2587c57ffd8dc
"brcmfmac: add 43752 SDIO ids and initialization"

Looking at all the occurrences of 43752 to insert a 43751,
could someone comment whether this looks reasonable?
---
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c     | 1 +
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c       | 2 ++
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c       | 4 ++++
 drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h | 1 +
 include/linux/mmc/sdio_ids.h                                  | 1 +
 5 files changed, 9 insertions(+)

Comments

Marc Gonzalez March 6, 2023, 10:41 a.m. UTC | #1
On 06/03/2023 11:24, Marc Gonzalez wrote:

> OF_NAME=wifi
> OF_FULLNAME=/soc/sd@ffe03000/wifi@1
> OF_COMPATIBLE_0=brcm,bcm4329-fmac
> OF_COMPATIBLE_N=1
> SDIO_CLASS=00
> SDIO_ID=02D0:AAE7
> SDIO_REVISION=0.0
> MODALIAS=sdio:c00v02D0dAAE7
> 
> 0xaae7 = 43751
> 
> Isn't there some overlap between:
> include/linux/mmc/sdio_ids.h and
> drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h ?
> 
> Reference used: commit d2587c57ffd8dc
> "brcmfmac: add 43752 SDIO ids and initialization"
> 
> Looking at all the occurrences of 43752 to insert a 43751,
> could someone comment whether this looks reasonable?

Obviously, the commit message would have to be rewritten
for a real patch.

> ---
>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c     | 1 +
>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c       | 2 ++
>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c       | 4 ++++
>  drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h | 1 +
>  include/linux/mmc/sdio_ids.h                                  | 1 +
>  5 files changed, 9 insertions(+)

This code (vendor upstream?) is probably relevant:
https://github.com/StreamUnlimited/broadcom-bcmdhd-4359/blob/master/include/bcmdevs.h

#define BCM4362_CHIP_ID		0x4362          /* 4362 chipcommon chipid */
#define BCM43751_CHIP_ID	0xAAE7          /* 43751 chipcommon chipid */
#define BCM43752_CHIP_ID	0xAAE8          /* 43752 chipcommon chipid */


#define BCM4362_CHIP(chipid)	((CHIPID(chipid) == BCM4362_CHIP_ID) || \
				(CHIPID(chipid) == BCM43751_CHIP_ID) || \
				(CHIPID(chipid) == BCM43752_CHIP_ID))

#define BCM4362_CHIP_GRPID	BCM4362_CHIP_ID: \
				case BCM43751_CHIP_ID: \
				case BCM43752_CHIP_ID
Marc Gonzalez March 7, 2023, 2:40 p.m. UTC | #2
On 06/03/2023 11:41, Marc Gonzalez wrote:

> On 06/03/2023 11:24, Marc Gonzalez wrote:
> 
>> # cat /sys/bus/sdio/devices/mmc2:0001:1/uevent
>> OF_NAME=wifi
>> OF_FULLNAME=/soc/sd@ffe03000/wifi@1
>> OF_COMPATIBLE_0=brcm,bcm4329-fmac
>> OF_COMPATIBLE_N=1
>> SDIO_CLASS=00
>> SDIO_ID=02D0:AAE7
>> SDIO_REVISION=0.0
>> MODALIAS=sdio:c00v02D0dAAE7
>>
>> 0xaae7 = 43751
>>
>> Isn't there some overlap between:
>> include/linux/mmc/sdio_ids.h and
>> drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h ?
>>
>> Reference used: commit d2587c57ffd8dc
>> "brcmfmac: add 43752 SDIO ids and initialization"
>>
>> Looking at all the occurrences of 43752 to insert a 43751,
>> could someone comment whether this looks reasonable?
> 
> Obviously, the commit message would have to be rewritten
> for a real patch.

Help. I am thoroughly confused... :(


Through the SDIO bus, the WiFi chip reports 0xaae7 (i.e. 43751)
hence the /sys/bus/sdio/devices output above.

sdio_read_func_cis() -> sdio_read_cis() which sets
func->vendor/func->device to 2d0/aae7


But when brcmf_chip_recognition() calls ci->ops->read32()
i.e. brcmf_sdio_buscore_read32()
[ vs brcmf_sdiod_readl() in brcmf_sdio_probe_attach() ]

[    1.177283] brcmfmac: F1 signature read @0x18000000=0x1042aae8
[    1.182912] found AXI chip: BCM43752/2
[    1.186384] BCM43752/2: chip=aae8 rev=2 type=1

Here it reports 0xaae8 (i.e. 43752)

Why the discrepancy?
Can it cause issues?
(Sometimes, the whole SDIO bus doesn't probe at boot.
I am still investigating these intermittent problems.)

Should I use 43751 or 43752 firmware...?

Regards.
Franky Lin March 7, 2023, 11:15 p.m. UTC | #3
On Tue, Mar 7, 2023 at 6:40 AM Marc Gonzalez <marc.w.gonzalez@free.fr> wrote:
>
> On 06/03/2023 11:41, Marc Gonzalez wrote:
>
> > On 06/03/2023 11:24, Marc Gonzalez wrote:
> >
> >> # cat /sys/bus/sdio/devices/mmc2:0001:1/uevent
> >> OF_NAME=wifi
> >> OF_FULLNAME=/soc/sd@ffe03000/wifi@1
> >> OF_COMPATIBLE_0=brcm,bcm4329-fmac
> >> OF_COMPATIBLE_N=1
> >> SDIO_CLASS=00
> >> SDIO_ID=02D0:AAE7
> >> SDIO_REVISION=0.0
> >> MODALIAS=sdio:c00v02D0dAAE7
> >>
> >> 0xaae7 = 43751
> >>
> >> Isn't there some overlap between:
> >> include/linux/mmc/sdio_ids.h and
> >> drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h ?
> >>
> >> Reference used: commit d2587c57ffd8dc
> >> "brcmfmac: add 43752 SDIO ids and initialization"
> >>
> >> Looking at all the occurrences of 43752 to insert a 43751,
> >> could someone comment whether this looks reasonable?
> >
> > Obviously, the commit message would have to be rewritten
> > for a real patch.
>
> Help. I am thoroughly confused... :(
>
>
> Through the SDIO bus, the WiFi chip reports 0xaae7 (i.e. 43751)
> hence the /sys/bus/sdio/devices output above.
>
> sdio_read_func_cis() -> sdio_read_cis() which sets
> func->vendor/func->device to 2d0/aae7
>
>
> But when brcmf_chip_recognition() calls ci->ops->read32()
> i.e. brcmf_sdio_buscore_read32()
> [ vs brcmf_sdiod_readl() in brcmf_sdio_probe_attach() ]
>
> [    1.177283] brcmfmac: F1 signature read @0x18000000=0x1042aae8
> [    1.182912] found AXI chip: BCM43752/2
> [    1.186384] BCM43752/2: chip=aae8 rev=2 type=1
>
> Here it reports 0xaae8 (i.e. 43752)
>
> Why the discrepancy?
> Can it cause issues?
> (Sometimes, the whole SDIO bus doesn't probe at boot.
> I am still investigating these intermittent problems.)
>
> Should I use 43751 or 43752 firmware...?

This question should be answered by the Cypress/Infineon folks but
unfortunately they have been quiet for a long time. In general we use
the id read from 0x18000000 to decide which firmware to load. But be
aware that the rev also matters. There are some examples in
brcmf_sdio_fwnames table that the different firmware name can be
derived from the same chip common id but different rev.

However sdio device enumeration happens before firmware download so
the intermittent problem you are facing probably is not related to
firmware version.

Regards,
- Franky

>
> Regards.
>
>
Marc Gonzalez March 8, 2023, 12:47 p.m. UTC | #4
On 08/03/2023 00:15, Franky Lin wrote:

> On Tue, Mar 7, 2023 at 6:40 AM Marc Gonzalez wrote:
> 
>> Through the SDIO bus, the WiFi chip reports 0xaae7 (i.e. 43751)
>> hence the /sys/bus/sdio/devices output above.
>>
>> sdio_read_func_cis() -> sdio_read_cis() which sets
>> func->vendor/func->device to 2d0/aae7
>>
>>
>> But when brcmf_chip_recognition() calls ci->ops->read32()
>> i.e. brcmf_sdio_buscore_read32()
>> [ vs brcmf_sdiod_readl() in brcmf_sdio_probe_attach() ]
>>
>> [    1.177283] brcmfmac: F1 signature read @0x18000000=0x1042aae8
>> [    1.182912] found AXI chip: BCM43752/2
>> [    1.186384] BCM43752/2: chip=aae8 rev=2 type=1
>>
>> Here it reports 0xaae8 (i.e. 43752)
>>
>> Why the discrepancy?
>> Can it cause issues?
>> (Sometimes, the whole SDIO bus doesn't probe at boot.
>> I am still investigating these intermittent problems.)
>>
>> Should I use 43751 or 43752 firmware...?
> 
> This question should be answered by the Cypress/Infineon folks but
> unfortunately they have been quiet for a long time. In general we use
> the id read from 0x18000000 to decide which firmware to load. But be
> aware that the rev also matters. There are some examples in
> brcmf_sdio_fwnames table that the different firmware name can be
> derived from the same chip common id but different rev.
> 
> However sdio device enumeration happens before firmware download so
> the intermittent problem you are facing probably is not related to
> firmware version.

Disclosure: My knowledge of SDIO device enumeration is 0.

When the host sends the equivalent of an "identify yourself" message
on the SDIO bus, doesn't the reply come from the WiFi device?
Why would the device reply 0xaae7 instead of 0xaae8?

In other words, who is replying 0xaae7?

Regards.
Franky Lin March 8, 2023, 5:21 p.m. UTC | #5
On Wed, Mar 8, 2023 at 4:48 AM Marc Gonzalez <marc.w.gonzalez@free.fr> wrote:
>
> On 08/03/2023 00:15, Franky Lin wrote:
>
> > On Tue, Mar 7, 2023 at 6:40 AM Marc Gonzalez wrote:
> >
> >> Through the SDIO bus, the WiFi chip reports 0xaae7 (i.e. 43751)
> >> hence the /sys/bus/sdio/devices output above.
> >>
> >> sdio_read_func_cis() -> sdio_read_cis() which sets
> >> func->vendor/func->device to 2d0/aae7
> >>
> >>
> >> But when brcmf_chip_recognition() calls ci->ops->read32()
> >> i.e. brcmf_sdio_buscore_read32()
> >> [ vs brcmf_sdiod_readl() in brcmf_sdio_probe_attach() ]
> >>
> >> [    1.177283] brcmfmac: F1 signature read @0x18000000=0x1042aae8
> >> [    1.182912] found AXI chip: BCM43752/2
> >> [    1.186384] BCM43752/2: chip=aae8 rev=2 type=1
> >>
> >> Here it reports 0xaae8 (i.e. 43752)
> >>
> >> Why the discrepancy?
> >> Can it cause issues?
> >> (Sometimes, the whole SDIO bus doesn't probe at boot.
> >> I am still investigating these intermittent problems.)
> >>
> >> Should I use 43751 or 43752 firmware...?
> >
> > This question should be answered by the Cypress/Infineon folks but
> > unfortunately they have been quiet for a long time. In general we use
> > the id read from 0x18000000 to decide which firmware to load. But be
> > aware that the rev also matters. There are some examples in
> > brcmf_sdio_fwnames table that the different firmware name can be
> > derived from the same chip common id but different rev.
> >
> > However sdio device enumeration happens before firmware download so
> > the intermittent problem you are facing probably is not related to
> > firmware version.
>
> Disclosure: My knowledge of SDIO device enumeration is 0.
>
> When the host sends the equivalent of an "identify yourself" message
> on the SDIO bus, doesn't the reply come from the WiFi device?
> Why would the device reply 0xaae7 instead of 0xaae8?
>
> In other words, who is replying 0xaae7?

The enumeration response is done by the HW sdio core on the chip.  No
software involved.

Regards,
- Franky
Neil Armstrong March 8, 2023, 5:37 p.m. UTC | #6
On 08/03/2023 18:21, Franky Lin wrote:
> On Wed, Mar 8, 2023 at 4:48 AM Marc Gonzalez <marc.w.gonzalez@free.fr> wrote:
>>
>> On 08/03/2023 00:15, Franky Lin wrote:
>>
>>> On Tue, Mar 7, 2023 at 6:40 AM Marc Gonzalez wrote:
>>>
>>>> Through the SDIO bus, the WiFi chip reports 0xaae7 (i.e. 43751)
>>>> hence the /sys/bus/sdio/devices output above.
>>>>
>>>> sdio_read_func_cis() -> sdio_read_cis() which sets
>>>> func->vendor/func->device to 2d0/aae7
>>>>
>>>>
>>>> But when brcmf_chip_recognition() calls ci->ops->read32()
>>>> i.e. brcmf_sdio_buscore_read32()
>>>> [ vs brcmf_sdiod_readl() in brcmf_sdio_probe_attach() ]
>>>>
>>>> [    1.177283] brcmfmac: F1 signature read @0x18000000=0x1042aae8
>>>> [    1.182912] found AXI chip: BCM43752/2
>>>> [    1.186384] BCM43752/2: chip=aae8 rev=2 type=1
>>>>
>>>> Here it reports 0xaae8 (i.e. 43752)
>>>>
>>>> Why the discrepancy?
>>>> Can it cause issues?
>>>> (Sometimes, the whole SDIO bus doesn't probe at boot.
>>>> I am still investigating these intermittent problems.)
>>>>
>>>> Should I use 43751 or 43752 firmware...?
>>>
>>> This question should be answered by the Cypress/Infineon folks but
>>> unfortunately they have been quiet for a long time. In general we use
>>> the id read from 0x18000000 to decide which firmware to load. But be
>>> aware that the rev also matters. There are some examples in
>>> brcmf_sdio_fwnames table that the different firmware name can be
>>> derived from the same chip common id but different rev.
>>>
>>> However sdio device enumeration happens before firmware download so
>>> the intermittent problem you are facing probably is not related to
>>> firmware version.
>>
>> Disclosure: My knowledge of SDIO device enumeration is 0.
>>
>> When the host sends the equivalent of an "identify yourself" message
>> on the SDIO bus, doesn't the reply come from the WiFi device?
>> Why would the device reply 0xaae7 instead of 0xaae8?
>>
>> In other words, who is replying 0xaae7?
> 
> The enumeration response is done by the HW sdio core on the chip.  No
> software involved.

Mark,

This would be the:
	BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_CYPRESS_43751, CYW),
which is returned by the SDIO bus and associated the SDIO device with the brcmf driver.

All the other IDs are returned by reading registers/send commands to the SDIO device.

Neil
> 
> Regards,
> - Franky
Marc Gonzalez March 9, 2023, 10:16 a.m. UTC | #7
On 06/03/2023 11:24, Marc Gonzalez wrote:

> # cat /sys/bus/sdio/devices/mmc2:0001:1/uevent
> OF_NAME=wifi
> OF_FULLNAME=/soc/sd@ffe03000/wifi@1
> OF_COMPATIBLE_0=brcm,bcm4329-fmac
> OF_COMPATIBLE_N=1
> SDIO_CLASS=00
> SDIO_ID=02D0:AAE7
> SDIO_REVISION=0.0
> MODALIAS=sdio:c00v02D0dAAE7
> 
> NB: 0xaae7 = 43751

I have run into another issue.

The WiFi device (and the mmc2 bus it sits on) don't show up at all
in the kernel log *unless* I add lots of debug output, such as with
#define DEBUG in drivers/base/dd.c

I think this points to some kind of race condition?

Neil suggested that maybe the host probes the mmc2 bus "too soon",
when the WiFi device is still powering up, which makes the entire
probe fail.

This patch appears to solve the symptom:

diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
index 6e5ea0213b477..999b3843c0d0b 100644
--- a/drivers/mmc/host/meson-gx-mmc.c
+++ b/drivers/mmc/host/meson-gx-mmc.c
@@ -1400,7 +1400,7 @@ static struct platform_driver meson_mmc_driver = {
 	.remove		= meson_mmc_remove,
 	.driver		= {
 		.name = DRIVER_NAME,
-		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
+		.probe_type = PROBE_FORCE_SYNCHRONOUS,
 		.of_match_table = meson_mmc_of_match,
 	},
 };

But this might just be delaying the probe enough for the device
to become ready?

Ideas/Suggestions?

Regards
Marc Gonzalez March 9, 2023, 12:13 p.m. UTC | #8
On 09/03/2023 11:16, Marc Gonzalez wrote:

> On 06/03/2023 11:24, Marc Gonzalez wrote:
> 
>> # cat /sys/bus/sdio/devices/mmc2:0001:1/uevent
>> OF_NAME=wifi
>> OF_FULLNAME=/soc/sd@ffe03000/wifi@1
>> OF_COMPATIBLE_0=brcm,bcm4329-fmac
>> OF_COMPATIBLE_N=1
>> SDIO_CLASS=00
>> SDIO_ID=02D0:AAE7
>> SDIO_REVISION=0.0
>> MODALIAS=sdio:c00v02D0dAAE7
>>
>> NB: 0xaae7 = 43751
> 
> I have run into another issue.
> 
> The WiFi device (and the mmc2 bus it sits on) don't show up at all
> in the kernel log *unless* I add lots of debug output, such as with
> #define DEBUG in drivers/base/dd.c
> 
> I think this points to some kind of race condition?
> 
> Neil suggested that maybe the host probes the mmc2 bus "too soon",
> when the WiFi device is still powering up, which makes the entire
> probe fail.
> 
> This patch appears to solve the symptom:
> 
> diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
> index 6e5ea0213b477..999b3843c0d0b 100644
> --- a/drivers/mmc/host/meson-gx-mmc.c
> +++ b/drivers/mmc/host/meson-gx-mmc.c
> @@ -1400,7 +1400,7 @@ static struct platform_driver meson_mmc_driver = {
>  	.remove		= meson_mmc_remove,
>  	.driver		= {
>  		.name = DRIVER_NAME,
> -		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
> +		.probe_type = PROBE_FORCE_SYNCHRONOUS,
>  		.of_match_table = meson_mmc_of_match,
>  	},
>  };
> 
> But this might just be delaying the probe enough for the device
> to become ready?

FWIW, the relevant device tree nodes are:

/* decompiled DTS */

		sd@ffe03000 {
			compatible = "amlogic,meson-axg-mmc";
			reg = <0x0 0xffe03000 0x0 0x800>;
			interrupts = <0x0 0xbd 0x4>;
			status = "okay";
			clocks = <0x2 0x21 0x2 0x3c 0x2 0x2>;
			clock-names = "core", "clkin0", "clkin1";
			resets = <0x5 0x2c>;
			amlogic,dram-access-quirk;
			pinctrl-0 = <0x2c>;
			pinctrl-1 = <0x2d>;
			pinctrl-names = "default", "clk-gate";
			#address-cells = <0x1>;
			#size-cells = <0x0>;
			bus-width = <0x4>;
			cap-sd-highspeed;
			sd-uhs-sdr50;
			max-frequency = <0x5f5e100>;
			non-removable;
			disable-wp;
			keep-power-in-suspend;
			mmc-pwrseq = <0x2e>;
			vmmc-supply = <0x2b>;
			vqmmc-supply = <0x21>;

			wifi@1 {
				reg = <0x1>;
				compatible = "brcm,bcm4329-fmac";
			};
		};


/* original DTS */

		sd_emmc_a: sd@ffe03000 {
			compatible = "amlogic,meson-axg-mmc";
			reg = <0x0 0xffe03000 0x0 0x800>;
			interrupts = <GIC_SPI 189 IRQ_TYPE_LEVEL_HIGH>;
			status = "disabled";
			clocks = <&clkc CLKID_SD_EMMC_A>,
				 <&clkc CLKID_SD_EMMC_A_CLK0>,
				 <&clkc CLKID_FCLK_DIV2>;
			clock-names = "core", "clkin0", "clkin1";
			resets = <&reset RESET_SD_EMMC_A>;
		};

&sd_emmc_a {
	status = "okay";
	pinctrl-0 = <&sdio_pins>;
	pinctrl-1 = <&sdio_clk_gate_pins>;
	pinctrl-names = "default", "clk-gate";
	#address-cells = <1>;
	#size-cells = <0>;

	bus-width = <4>;
	cap-sd-highspeed;
	sd-uhs-sdr50;
	max-frequency = <100000000>;

	non-removable;
	disable-wp;

	/* WiFi firmware requires power to be kept while in suspend */
	keep-power-in-suspend;

	mmc-pwrseq = <&sdio_pwrseq>;

	vmmc-supply = <&vddao_3v3>;
	vqmmc-supply = <&vddio_ao1v8>;

	brcmf: wifi@1 {
		reg = <1>;
		compatible = "brcm,bcm4329-fmac";
	};
};

With an asynchronous probe, meson_mmc_probe() always succeeds,
yet the WiFi card is not detected later on, even if I sleep
1-2 seconds in meson_mmc_probe().

[    0.879756] YO: meson_mmc_probe: ffe03000.sd
[    0.914320] YO: meson_mmc_probe: ffe03000.sd ALL OK
[    1.199170] YO: meson_mmc_probe: ffe07000.mmc
[    1.232734] YO: meson_mmc_probe: ffe07000.mmc ALL OK

Confused again...

Regards
Ulf Hansson March 9, 2023, 2:29 p.m. UTC | #9
On Thu, 9 Mar 2023 at 13:13, Marc Gonzalez <marc.w.gonzalez@free.fr> wrote:
>
> On 09/03/2023 11:16, Marc Gonzalez wrote:
>
> > On 06/03/2023 11:24, Marc Gonzalez wrote:
> >
> >> # cat /sys/bus/sdio/devices/mmc2:0001:1/uevent
> >> OF_NAME=wifi
> >> OF_FULLNAME=/soc/sd@ffe03000/wifi@1
> >> OF_COMPATIBLE_0=brcm,bcm4329-fmac
> >> OF_COMPATIBLE_N=1
> >> SDIO_CLASS=00
> >> SDIO_ID=02D0:AAE7
> >> SDIO_REVISION=0.0
> >> MODALIAS=sdio:c00v02D0dAAE7
> >>
> >> NB: 0xaae7 = 43751
> >
> > I have run into another issue.
> >
> > The WiFi device (and the mmc2 bus it sits on) don't show up at all
> > in the kernel log *unless* I add lots of debug output, such as with
> > #define DEBUG in drivers/base/dd.c
> >
> > I think this points to some kind of race condition?
> >
> > Neil suggested that maybe the host probes the mmc2 bus "too soon",
> > when the WiFi device is still powering up, which makes the entire
> > probe fail.

Ideally, the WiFi device/driver should not need to be initialized to
allow the SDIO card to be detected properly. Looks like there is
something fishy going on.

> >
> > This patch appears to solve the symptom:
> >
> > diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
> > index 6e5ea0213b477..999b3843c0d0b 100644
> > --- a/drivers/mmc/host/meson-gx-mmc.c
> > +++ b/drivers/mmc/host/meson-gx-mmc.c
> > @@ -1400,7 +1400,7 @@ static struct platform_driver meson_mmc_driver = {
> >       .remove         = meson_mmc_remove,
> >       .driver         = {
> >               .name = DRIVER_NAME,
> > -             .probe_type = PROBE_PREFER_ASYNCHRONOUS,
> > +             .probe_type = PROBE_FORCE_SYNCHRONOUS,
> >               .of_match_table = meson_mmc_of_match,
> >       },
> >  };
> >
> > But this might just be delaying the probe enough for the device
> > to become ready?
>
> FWIW, the relevant device tree nodes are:
>
> /* decompiled DTS */
>
>                 sd@ffe03000 {
>                         compatible = "amlogic,meson-axg-mmc";
>                         reg = <0x0 0xffe03000 0x0 0x800>;
>                         interrupts = <0x0 0xbd 0x4>;
>                         status = "okay";
>                         clocks = <0x2 0x21 0x2 0x3c 0x2 0x2>;
>                         clock-names = "core", "clkin0", "clkin1";
>                         resets = <0x5 0x2c>;
>                         amlogic,dram-access-quirk;
>                         pinctrl-0 = <0x2c>;
>                         pinctrl-1 = <0x2d>;
>                         pinctrl-names = "default", "clk-gate";
>                         #address-cells = <0x1>;
>                         #size-cells = <0x0>;
>                         bus-width = <0x4>;
>                         cap-sd-highspeed;
>                         sd-uhs-sdr50;
>                         max-frequency = <0x5f5e100>;
>                         non-removable;
>                         disable-wp;
>                         keep-power-in-suspend;
>                         mmc-pwrseq = <0x2e>;
>                         vmmc-supply = <0x2b>;
>                         vqmmc-supply = <0x21>;
>
>                         wifi@1 {
>                                 reg = <0x1>;
>                                 compatible = "brcm,bcm4329-fmac";
>                         };
>                 };
>
>
> /* original DTS */
>
>                 sd_emmc_a: sd@ffe03000 {
>                         compatible = "amlogic,meson-axg-mmc";
>                         reg = <0x0 0xffe03000 0x0 0x800>;
>                         interrupts = <GIC_SPI 189 IRQ_TYPE_LEVEL_HIGH>;
>                         status = "disabled";
>                         clocks = <&clkc CLKID_SD_EMMC_A>,
>                                  <&clkc CLKID_SD_EMMC_A_CLK0>,
>                                  <&clkc CLKID_FCLK_DIV2>;
>                         clock-names = "core", "clkin0", "clkin1";
>                         resets = <&reset RESET_SD_EMMC_A>;
>                 };
>
> &sd_emmc_a {
>         status = "okay";
>         pinctrl-0 = <&sdio_pins>;
>         pinctrl-1 = <&sdio_clk_gate_pins>;
>         pinctrl-names = "default", "clk-gate";
>         #address-cells = <1>;
>         #size-cells = <0>;
>
>         bus-width = <4>;
>         cap-sd-highspeed;
>         sd-uhs-sdr50;
>         max-frequency = <100000000>;
>
>         non-removable;
>         disable-wp;
>
>         /* WiFi firmware requires power to be kept while in suspend */
>         keep-power-in-suspend;
>
>         mmc-pwrseq = <&sdio_pwrseq>;

This one is particularly interesting. Can you share the content of the
sdio_pwrseq node too?

>
>         vmmc-supply = <&vddao_3v3>;
>         vqmmc-supply = <&vddio_ao1v8>;
>
>         brcmf: wifi@1 {
>                 reg = <1>;
>                 compatible = "brcm,bcm4329-fmac";
>         };
> };
>
> With an asynchronous probe, meson_mmc_probe() always succeeds,
> yet the WiFi card is not detected later on, even if I sleep
> 1-2 seconds in meson_mmc_probe().
>
> [    0.879756] YO: meson_mmc_probe: ffe03000.sd
> [    0.914320] YO: meson_mmc_probe: ffe03000.sd ALL OK
> [    1.199170] YO: meson_mmc_probe: ffe07000.mmc
> [    1.232734] YO: meson_mmc_probe: ffe07000.mmc ALL OK

To narrow down the problem, I would start by preventing the WiFi
driver from being insmoded. To make sure it doesn't affect the SDIO
card detection process.

The point is, the SDIO card should be detected properly, no matter
whether there is a corresponding SDIO func driver (WiFi driver)
available for it. For a detected SDIO/eMMC/SD card, mmc_add_card()
prints a message about the card in the log during initialization. It
could look like the below print, for example:

"mmc2: new ultra high speed SDR104 SDIO card at address 0001".

>
> Confused again...
>
> Regards
>

Kind regards
Uffe
Marc Gonzalez March 9, 2023, 4:45 p.m. UTC | #10
On 09/03/2023 15:29, Ulf Hansson wrote:

> On Thu, 9 Mar 2023 at 13:13, Marc Gonzalez wrote:
>
>> On 09/03/2023 11:16, Marc Gonzalez wrote:
>>
>>> On 06/03/2023 11:24, Marc Gonzalez wrote:
>>>
>>>> # cat /sys/bus/sdio/devices/mmc2:0001:1/uevent
>>>> OF_NAME=wifi
>>>> OF_FULLNAME=/soc/sd@ffe03000/wifi@1
>>>> OF_COMPATIBLE_0=brcm,bcm4329-fmac
>>>> OF_COMPATIBLE_N=1
>>>> SDIO_CLASS=00
>>>> SDIO_ID=02D0:AAE7
>>>> SDIO_REVISION=0.0
>>>> MODALIAS=sdio:c00v02D0dAAE7
>>>>
>>>> NB: 0xaae7 = 43751
>>>
>>> I have run into another issue.
>>>
>>> The WiFi device (and the mmc2 bus it sits on) don't show up at all
>>> in the kernel log *unless* I add lots of debug output, such as with
>>> #define DEBUG in drivers/base/dd.c
>>>
>>> I think this points to some kind of race condition?
>>>
>>> Neil suggested that maybe the host probes the mmc2 bus "too soon",
>>> when the WiFi device is still powering up, which makes the entire
>>> probe fail.
> 
> Ideally, the WiFi device/driver should not need to be initialized to
> allow the SDIO card to be detected properly. Looks like there is
> something fishy going on.

Something fishy indeed ;)


>>> This patch appears to solve the symptom:
>>>
>>> diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
>>> index 6e5ea0213b477..999b3843c0d0b 100644
>>> --- a/drivers/mmc/host/meson-gx-mmc.c
>>> +++ b/drivers/mmc/host/meson-gx-mmc.c
>>> @@ -1400,7 +1400,7 @@ static struct platform_driver meson_mmc_driver = {
>>>       .remove         = meson_mmc_remove,
>>>       .driver         = {
>>>               .name = DRIVER_NAME,
>>> -             .probe_type = PROBE_PREFER_ASYNCHRONOUS,
>>> +             .probe_type = PROBE_FORCE_SYNCHRONOUS,
>>>               .of_match_table = meson_mmc_of_match,
>>>       },
>>>  };
>>>
>>> But this might just be delaying the probe enough for the device
>>> to become ready?
>>
>> FWIW, the relevant device tree nodes are:
>>
>> /* decompiled DTS */
>>
>>                 sd@ffe03000 {
>>                         compatible = "amlogic,meson-axg-mmc";
>>                         reg = <0x0 0xffe03000 0x0 0x800>;
>>                         interrupts = <0x0 0xbd 0x4>;
>>                         status = "okay";
>>                         clocks = <0x2 0x21 0x2 0x3c 0x2 0x2>;
>>                         clock-names = "core", "clkin0", "clkin1";
>>                         resets = <0x5 0x2c>;
>>                         amlogic,dram-access-quirk;
>>                         pinctrl-0 = <0x2c>;
>>                         pinctrl-1 = <0x2d>;
>>                         pinctrl-names = "default", "clk-gate";
>>                         #address-cells = <0x1>;
>>                         #size-cells = <0x0>;
>>                         bus-width = <0x4>;
>>                         cap-sd-highspeed;
>>                         sd-uhs-sdr50;
>>                         max-frequency = <0x5f5e100>;
>>                         non-removable;
>>                         disable-wp;
>>                         keep-power-in-suspend;
>>                         mmc-pwrseq = <0x2e>;
>>                         vmmc-supply = <0x2b>;
>>                         vqmmc-supply = <0x21>;
>>
>>                         wifi@1 {
>>                                 reg = <0x1>;
>>                                 compatible = "brcm,bcm4329-fmac";
>>                         };
>>                 };
>>
>>
>> /* original DTS */
>>
>>                 sd_emmc_a: sd@ffe03000 {
>>                         compatible = "amlogic,meson-axg-mmc";
>>                         reg = <0x0 0xffe03000 0x0 0x800>;
>>                         interrupts = <GIC_SPI 189 IRQ_TYPE_LEVEL_HIGH>;
>>                         status = "disabled";
>>                         clocks = <&clkc CLKID_SD_EMMC_A>,
>>                                  <&clkc CLKID_SD_EMMC_A_CLK0>,
>>                                  <&clkc CLKID_FCLK_DIV2>;
>>                         clock-names = "core", "clkin0", "clkin1";
>>                         resets = <&reset RESET_SD_EMMC_A>;
>>                 };
>>
>> &sd_emmc_a {
>>         status = "okay";
>>         pinctrl-0 = <&sdio_pins>;
>>         pinctrl-1 = <&sdio_clk_gate_pins>;
>>         pinctrl-names = "default", "clk-gate";
>>         #address-cells = <1>;
>>         #size-cells = <0>;
>>
>>         bus-width = <4>;
>>         cap-sd-highspeed;
>>         sd-uhs-sdr50;
>>         max-frequency = <100000000>;
>>
>>         non-removable;
>>         disable-wp;
>>
>>         /* WiFi firmware requires power to be kept while in suspend */
>>         keep-power-in-suspend;
>>
>>         mmc-pwrseq = <&sdio_pwrseq>;
> 
> This one is particularly interesting. Can you share the content of the
> sdio_pwrseq node too?


	sdio_pwrseq: sdio-pwrseq {
		compatible = "mmc-pwrseq-simple";
		reset-gpios = <&gpio GPIOX_6 GPIO_ACTIVE_LOW>;
		clocks = <&wifi32k>;
		clock-names = "ext_clock";
	};

	wifi32k: wifi32k {
		compatible = "pwm-clock";
		#clock-cells = <0>;
		clock-frequency = <32768>;
		pwms = <&pwm_ef 0 30518 0>; /* PWM_E at 32.768KHz */
	};

			pwm_ef: pwm@19000 {
				compatible = "amlogic,meson-g12a-ee-pwm";
				reg = <0x0 0x19000 0x0 0x20>;
				#pwm-cells = <3>;
				status = "disabled";
			};

&pwm_ef {
	status = "okay";
	pinctrl-0 = <&pwm_e_pins>;
	pinctrl-names = "default";
	clocks = <&xtal>;
	clock-names = "clkin0";
};

					pwm_e_pins: pwm-e {
						mux {
							groups = "pwm_e";
							function = "pwm_e";
							bias-disable;
						};
					};



>>         vmmc-supply = <&vddao_3v3>;
>>         vqmmc-supply = <&vddio_ao1v8>;
>>
>>         brcmf: wifi@1 {
>>                 reg = <1>;
>>                 compatible = "brcm,bcm4329-fmac";
>>         };
>> };
>>
>> With an asynchronous probe, meson_mmc_probe() always succeeds,
>> yet the WiFi card is not detected later on, even if I sleep
>> 1-2 seconds in meson_mmc_probe().
>>
>> [    0.879756] YO: meson_mmc_probe: ffe03000.sd
>> [    0.914320] YO: meson_mmc_probe: ffe03000.sd ALL OK
>> [    1.199170] YO: meson_mmc_probe: ffe07000.mmc
>> [    1.232734] YO: meson_mmc_probe: ffe07000.mmc ALL OK
> 
> To narrow down the problem, I would start by preventing the WiFi
> driver from being insmoded. To make sure it doesn't affect the SDIO
> card detection process.

So far, I always embedded all device drivers in the kernel (=y)
You are suggesting to build the WiFi driver as a module, correct?


> The point is, the SDIO card should be detected properly, no matter
> whether there is a corresponding SDIO func driver (WiFi driver)
> available for it. For a detected SDIO/eMMC/SD card, mmc_add_card()
> prints a message about the card in the log during initialization. It
> could look like the below print, for example:
> 
> "mmc2: new ultra high speed SDR104 SDIO card at address 0001".

When everything works, I get:
mmc2: new ultra high speed SDR50 SDIO card at address 0001

When it doesn't, I don't get that line.

I'm trying to debug with printk's in
mmc_rescan_try_freq, mmc_rescan, mmc_attach_sdio, meson_mmc_probe

The problem is that adding "too many" printks makes the
probe work! (Debugging this issue is nerve-racking.)

Thanks, I will test your suggestions!

Regards
Marc Gonzalez March 9, 2023, 5:51 p.m. UTC | #11
On 09/03/2023 15:29, Ulf Hansson wrote:

> To narrow down the problem, I would start by preventing the WiFi
> driver from being insmoded. To make sure it doesn't affect the SDIO
> card detection process.
> 
> The point is, the SDIO card should be detected properly, no matter
> whether there is a corresponding SDIO func driver (WiFi driver)
> available for it. For a detected SDIO/eMMC/SD card, mmc_add_card()
> prints a message about the card in the log during initialization. It
> could look like the below print, for example:
> 
> "mmc2: new ultra high speed SDR104 SDIO card at address 0001".

OK, I built the WiFi driver as a module which is NOT loaded at boot time.



Here is a kernel log where mmc2 probing went WRONG
(no "mmc2: new ultra high speed SDR50 SDIO card at address 0001")


[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
[    0.000000] Linux version 6.2.0-rc8 (mgonzalez@venus) (aarch64-none-linux-gnu-gcc (GNU Toolchain for the A-profile Architecture 10.3-2021.07 (arm-10.29)) 10.3.1 20210621, GNU ld (GNU Toolchain for the A-profile Architecture 10.3-2021.07 (arm-10.29)) 2.36.1.20210621) #247 SMP PREEMPT Thu Mar  9 18:23:56 CET 2023
[    0.000000] Machine model: SEI Robotics SEI510
[    0.000000] [Firmware Bug]: Kernel image misaligned at boot, please fix your bootloader!
[    0.000000] Reserved memory: created CMA memory pool at 0x0000000068000000, size 384 MiB
[    0.000000] OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000000000-0x000000007fffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x0000000004ffffff]
[    0.000000]   node   0: [mem 0x0000000005000000-0x00000000053fffff]
[    0.000000]   node   0: [mem 0x0000000005400000-0x000000007fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
[    0.000000] psci: probing for conduit method from DT.
[    0.000000] psci: PSCIv1.0 detected in firmware.
[    0.000000] psci: Using standard PSCI v0.2 function IDs
[    0.000000] psci: Trusted OS migration not required
[    0.000000] psci: SMC Calling Convention v1.1
[    0.000000] percpu: Embedded 19 pages/cpu s38632 r8192 d31000 u77824
[    0.000000] pcpu-alloc: s38632 r8192 d31000 u77824 alloc=19*4096
[    0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 
[    0.000000] Detected VIPT I-cache on CPU0
[    0.000000] CPU features: kernel page table isolation forced ON by KASLR
[    0.000000] CPU features: detected: Kernel page table isolation (KPTI)
[    0.000000] CPU features: detected: ARM erratum 845719
[    0.000000] alternatives: applying boot alternatives
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 516096
[    0.000000] Kernel command line: console=ttyAML0 debug
[    0.000000] Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes, linear)
[    0.000000] Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 1623888K/2097152K available (8384K kernel code, 1062K rwdata, 2656K rodata, 22592K init, 411K bss, 80048K reserved, 393216K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] rcu: Preemptible hierarchical RCU implementation.
[    0.000000] rcu:     RCU event tracing is enabled.
[    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=4.
[    0.000000]  Trampoline variant of Tasks RCU enabled.
[    0.000000]  Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] Root IRQ handler: gic_handle_irq
[    0.000000] GIC: Using split EOI/Deactivate mode
[    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.000000] arch_timer: cp15 timer(s) running at 24.00MHz (phys).
[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x588fe9dc0, max_idle_ns: 440795202592 ns
[    0.000000] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 4398046511097ns
[    0.000188] Console: colour dummy device 80x25
[    0.000219] Calibrating delay loop (skipped), value calculated using timer frequency.. 48.00 BogoMIPS (lpj=96000)
[    0.000231] pid_max: default: 32768 minimum: 301
[    0.000366] Mount-cache hash table entries: 4096 (order: 3, 32768 bytes, linear)
[    0.000382] Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes, linear)
[    0.001822] cblist_init_generic: Setting adjustable number of callback queues.
[    0.001837] cblist_init_generic: Setting shift to 2 and lim to 1.
[    0.001903] cblist_init_generic: Setting shift to 2 and lim to 1.
[    0.002042] rcu: Hierarchical SRCU implementation.
[    0.002046] rcu:     Max phase no-delay instances is 1000.
[    0.002709] smp: Bringing up secondary CPUs ...
[    0.003212] Detected VIPT I-cache on CPU1
[    0.003329] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
[    0.003862] Detected VIPT I-cache on CPU2
[    0.003936] CPU2: Booted secondary processor 0x0000000002 [0x410fd034]
[    0.004413] Detected VIPT I-cache on CPU3
[    0.004486] CPU3: Booted secondary processor 0x0000000003 [0x410fd034]
[    0.004541] smp: Brought up 1 node, 4 CPUs
[    0.004547] SMP: Total of 4 processors activated.
[    0.004551] CPU features: detected: 32-bit EL0 Support
[    0.004555] CPU features: detected: CRC32 instructions
[    0.004612] CPU: All CPU(s) started at EL2
[    0.004615] alternatives: applying system-wide alternatives
[    0.005677] devtmpfs: initialized
[    0.014358] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.014382] futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
[    0.023672] pinctrl core: initialized pinctrl subsystem
[    0.024497] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    0.025626] DMA: preallocated 256 KiB GFP_KERNEL pool for atomic allocations
[    0.025826] DMA: preallocated 256 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
[    0.025959] DMA: preallocated 256 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[    0.026022] audit: initializing netlink subsys (disabled)
[    0.026131] audit: type=2000 audit(0.024:1): state=initialized audit_enabled=0 res=1
[    0.026425] thermal_sys: Registered thermal governor 'step_wise'
[    0.026457] cpuidle: using governor menu
[    0.026528] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[    0.026606] ASID allocator initialised with 32768 entries
[    0.039376] platform ff900000.vpu: Fixing up cyclic dependency with ff600000.hdmi-tx
[    0.042947] platform hdmi-connector: Fixing up cyclic dependency with ff600000.hdmi-tx
[    0.045156] KASLR enabled
[    0.047484] iommu: Default domain type: Translated 
[    0.047494] iommu: DMA domain TLB invalidation policy: strict mode 
[    0.047738] SCSI subsystem initialized
[    0.047842] usbcore: registered new interface driver usbfs
[    0.047871] usbcore: registered new interface driver hub
[    0.047897] usbcore: registered new device driver usb
[    0.048551] FPGA manager framework
[    0.049473] clocksource: Switched to clocksource arch_sys_counter
[    0.055073] NET: Registered PF_INET protocol family
[    0.055266] IP idents hash table entries: 32768 (order: 6, 262144 bytes, linear)
[    0.056810] tcp_listen_portaddr_hash hash table entries: 1024 (order: 2, 16384 bytes, linear)
[    0.056859] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.056873] TCP established hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.056985] TCP bind hash table entries: 16384 (order: 7, 524288 bytes, linear)
[    0.057400] TCP: Hash tables configured (established 16384 bind 16384)
[    0.057537] UDP hash table entries: 1024 (order: 3, 32768 bytes, linear)
[    0.057591] UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes, linear)
[    0.057737] NET: Registered PF_UNIX/PF_LOCAL protocol family
[    0.066871] Initialise system trusted keyrings
[    0.067114] workingset: timestamp_bits=46 max_order=19 bucket_order=0
[    0.103569] Key type asymmetric registered
[    0.103587] Asymmetric key parser 'x509' registered
[    0.103692] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 248)
[    0.103699] io scheduler mq-deadline registered
[    0.103703] io scheduler kyber registered
[    0.104140] irq_meson_gpio: 100 to 8 gpio interrupt mux initialized
[    0.104681] simple-pm-bus: probe of ff63c000.system-controller rejects match -19
[    0.105299] simple-pm-bus: probe of ff800000.sys-ctrl rejects match -19
[    0.115611] soc soc0: Amlogic Meson G12A (S905X2) Revision 28:e (40:2) Detected
[    0.117430] ff803000.serial: ttyAML0 at MMIO 0xff803000 (irq = 14, base_baud = 1500000) is a meson_uart
[    0.117503] printk: console [ttyAML0] enabled
[    0.854455] loop: module loaded
[    0.855954] g12a-mdio_mux ff64c000.mdio-multiplexer: Driver g12a-mdio_mux requests probe deferral
[    0.862491] usbcore: registered new interface driver usb-storage
[    0.867279] input: gpio-keys-polled as /devices/platform/gpio-keys-polled/input/input0
[    0.875824] meson-vrtc ff8000a8.rtc: registered as rtc0
[    0.880002] meson-vrtc ff8000a8.rtc: setting system clock to 1970-01-01T00:00:00 UTC (0)
[    0.888109] i2c_dev: i2c /dev entries driver
[    0.894735] psci-cpuidle: probe of psci-cpuidle rejects match -19
[    0.899022] YO: meson_mmc_probe: ffe03000.sd
[    0.899350] meson-gx-mmc ffe03000.sd: allocated mmc-pwrseq
[    0.907943] ledtrig-cpu: registered to indicate activity on CPUs
[    0.914215] meson-sm: secure-monitor enabled
[    0.918257] hid: raw HID events driver (C) Jiri Kosina
[    0.923603] usbcore: registered new interface driver usbhid
[    0.928717] usbhid: USB HID core driver
[    0.934114] YO: meson_mmc_probe: ffe03000.sd ALL OK
[    0.935084] optee: probing for conduit method.
[    0.941807] optee: revision 2.4
[    0.942796] optee: initialized driver
[    0.949566] NET: Registered PF_PACKET protocol family
[    0.953499] Key type dns_resolver registered
[    0.961849] registered taskstats version 1
[    0.961962] Loading compiled-in X.509 certificates
[    1.059493] g12a-mdio_mux ff64c000.mdio-multiplexer: Driver g12a-mdio_mux requests probe deferral
[    1.063406] meson8b-dwmac ff3f0000.ethernet: IRQ eth_wake_irq not found
[    1.069301] meson8b-dwmac ff3f0000.ethernet: IRQ eth_lpi not found
[    1.075465] meson8b-dwmac ff3f0000.ethernet: PTP uses main clock
[    1.081970] meson8b-dwmac ff3f0000.ethernet: User ID: 0x11, Synopsys ID: 0x37
[    1.088462] meson8b-dwmac ff3f0000.ethernet:         DWMAC1000
[    1.093599] meson8b-dwmac ff3f0000.ethernet: DMA HW capability register supported
[    1.101016] meson8b-dwmac ff3f0000.ethernet: RX Checksum Offload Engine supported
[    1.108431] meson8b-dwmac ff3f0000.ethernet: COE Type 2
[    1.113610] meson8b-dwmac ff3f0000.ethernet: TX Checksum insertion supported
[    1.120592] meson8b-dwmac ff3f0000.ethernet: Wake-Up On Lan supported
[    1.127036] meson8b-dwmac ff3f0000.ethernet: Normal descriptors
[    1.132841] meson8b-dwmac ff3f0000.ethernet: Ring mode enabled
[    1.138620] meson8b-dwmac ff3f0000.ethernet: Enable RX Mitigation via HW Watchdog Timer
[    1.148085] dwc3-meson-g12a ffe09000.usb: USB2 ports: 2
[    1.151738] dwc3-meson-g12a ffe09000.usb: USB3 ports: 1
[    1.160071] xhci-hcd xhci-hcd.1.auto: xHCI Host Controller
[    1.162353] xhci-hcd xhci-hcd.1.auto: new USB bus registered, assigned bus number 1
[    1.170044] xhci-hcd xhci-hcd.1.auto: hcc params 0x0228fe6c hci version 0x110 quirks 0x0000000000010010
[    1.179281] xhci-hcd xhci-hcd.1.auto: irq 20, io mem 0xff500000
[    1.185228] xhci-hcd xhci-hcd.1.auto: xHCI Host Controller
[    1.190548] xhci-hcd xhci-hcd.1.auto: new USB bus registered, assigned bus number 2
[    1.198137] xhci-hcd xhci-hcd.1.auto: Host supports USB 3.0 SuperSpeed
[    1.205117] hub 1-0:1.0: USB hub found
[    1.208336] hub 1-0:1.0: 2 ports detected
[    1.212595] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[    1.220667] hub 2-0:1.0: USB hub found
[    1.224026] hub 2-0:1.0: 1 port detected
[    1.228908] YO: meson_mmc_probe: ffe07000.mmc
[    1.229222] meson-gx-mmc ffe07000.mmc: allocated mmc-pwrseq
[    1.245361] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[    1.255085] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[    1.261508] YO: meson_mmc_probe: ffe07000.mmc ALL OK
[    1.265592] platform regulatory.0: loading /lib/firmware/updates/6.2.0-rc8/regulatory.db failed for no such file or directory.
[    1.271080] Freeing unused kernel memory: 22592K
[    1.276338] platform regulatory.0: loading /lib/firmware/updates/regulatory.db failed for no such file or directory.
[    1.291323] platform regulatory.0: loading /lib/firmware/6.2.0-rc8/regulatory.db failed for no such file or directory.
[    1.301921] platform regulatory.0: Loading firmware from /lib/firmware/regulatory.db
[    1.309555] platform regulatory.0: loading /lib/firmware/updates/6.2.0-rc8/regulatory.db.p7s failed for no such file or directory.
[    1.321185] platform regulatory.0: loading /lib/firmware/updates/regulatory.db.p7s failed for no such file or directory.
[    1.331965] platform regulatory.0: loading /lib/firmware/6.2.0-rc8/regulatory.db.p7s failed for no such file or directory.
[    1.342932] platform regulatory.0: Loading firmware from /lib/firmware/regulatory.db.p7s
[    1.389536] Run /init as init process
[    1.389557]   with arguments:
[    1.390496]     /init
[    1.392826]   with environment:
[    1.395946]     HOME=/
[    1.398258]     TERM=linux
[    1.406077] mmc1: new HS200 MMC card at address 0001
[    1.406841] mmcblk1: mmc1:0001 SCA16G 14.7 GiB 
[    1.413262] mmcblk1boot0: mmc1:0001 SCA16G 4.00 MiB 
[    1.416665] mmcblk1boot1: mmc1:0001 SCA16G 4.00 MiB 
[    1.421334] mmcblk1rpmb: mmc1:0001 SCA16G 4.00 MiB, chardev (246:0)
[    4.041482] random: crng init done




Here is a kernel log where mmc2 probing went RIGHT
(the added printk's changed some timings, I assume)


[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
[    0.000000] Linux version 6.2.0-rc8 (mgonzalez@venus) (aarch64-none-linux-gnu-gcc (GNU Toolchain for the A-profile Architecture 10.3-2021.07 (arm-10.29)) 10.3.1 20210621, GNU ld (GNU Toolchain for the A-profile Architecture 10.3-2021.07 (arm-10.29)) 2.36.1.20210621) #249 SMP PREEMPT Thu Mar  9 18:37:17 CET 2023
[    0.000000] Machine model: SEI Robotics SEI510
[    0.000000] [Firmware Bug]: Kernel image misaligned at boot, please fix your bootloader!
[    0.000000] Reserved memory: created CMA memory pool at 0x0000000068000000, size 384 MiB
[    0.000000] OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000000000-0x000000007fffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x0000000004ffffff]
[    0.000000]   node   0: [mem 0x0000000005000000-0x00000000053fffff]
[    0.000000]   node   0: [mem 0x0000000005400000-0x000000007fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]
[    0.000000] psci: probing for conduit method from DT.
[    0.000000] psci: PSCIv1.0 detected in firmware.
[    0.000000] psci: Using standard PSCI v0.2 function IDs
[    0.000000] psci: Trusted OS migration not required
[    0.000000] psci: SMC Calling Convention v1.1
[    0.000000] percpu: Embedded 19 pages/cpu s38632 r8192 d31000 u77824
[    0.000000] pcpu-alloc: s38632 r8192 d31000 u77824 alloc=19*4096
[    0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 
[    0.000000] Detected VIPT I-cache on CPU0
[    0.000000] CPU features: kernel page table isolation forced ON by KASLR
[    0.000000] CPU features: detected: Kernel page table isolation (KPTI)
[    0.000000] CPU features: detected: ARM erratum 845719
[    0.000000] alternatives: applying boot alternatives
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 516096
[    0.000000] Kernel command line: console=ttyAML0 debug
[    0.000000] Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes, linear)
[    0.000000] Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 1623888K/2097152K available (8384K kernel code, 1062K rwdata, 2656K rodata, 22592K init, 411K bss, 80048K reserved, 393216K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] rcu: Preemptible hierarchical RCU implementation.
[    0.000000] rcu:     RCU event tracing is enabled.
[    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=4.
[    0.000000]  Trampoline variant of Tasks RCU enabled.
[    0.000000]  Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] Root IRQ handler: gic_handle_irq
[    0.000000] GIC: Using split EOI/Deactivate mode
[    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.000000] arch_timer: cp15 timer(s) running at 24.00MHz (phys).
[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x588fe9dc0, max_idle_ns: 440795202592 ns
[    0.000000] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 4398046511097ns
[    0.000189] Console: colour dummy device 80x25
[    0.000220] Calibrating delay loop (skipped), value calculated using timer frequency.. 48.00 BogoMIPS (lpj=96000)
[    0.000231] pid_max: default: 32768 minimum: 301
[    0.000366] Mount-cache hash table entries: 4096 (order: 3, 32768 bytes, linear)
[    0.000381] Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes, linear)
[    0.001831] cblist_init_generic: Setting adjustable number of callback queues.
[    0.001844] cblist_init_generic: Setting shift to 2 and lim to 1.
[    0.001910] cblist_init_generic: Setting shift to 2 and lim to 1.
[    0.002053] rcu: Hierarchical SRCU implementation.
[    0.002056] rcu:     Max phase no-delay instances is 1000.
[    0.002710] smp: Bringing up secondary CPUs ...
[    0.003214] Detected VIPT I-cache on CPU1
[    0.003334] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
[    0.003869] Detected VIPT I-cache on CPU2
[    0.003947] CPU2: Booted secondary processor 0x0000000002 [0x410fd034]
[    0.004433] Detected VIPT I-cache on CPU3
[    0.004505] CPU3: Booted secondary processor 0x0000000003 [0x410fd034]
[    0.004556] smp: Brought up 1 node, 4 CPUs
[    0.004563] SMP: Total of 4 processors activated.
[    0.004567] CPU features: detected: 32-bit EL0 Support
[    0.004571] CPU features: detected: CRC32 instructions
[    0.004634] CPU: All CPU(s) started at EL2
[    0.004636] alternatives: applying system-wide alternatives
[    0.005697] devtmpfs: initialized
[    0.014641] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.014665] futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
[    0.023935] pinctrl core: initialized pinctrl subsystem
[    0.024793] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    0.025930] DMA: preallocated 256 KiB GFP_KERNEL pool for atomic allocations
[    0.026129] DMA: preallocated 256 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
[    0.026260] DMA: preallocated 256 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[    0.026323] audit: initializing netlink subsys (disabled)
[    0.026432] audit: type=2000 audit(0.024:1): state=initialized audit_enabled=0 res=1
[    0.026732] thermal_sys: Registered thermal governor 'step_wise'
[    0.026765] cpuidle: using governor menu
[    0.026840] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[    0.026919] ASID allocator initialised with 32768 entries
[    0.039724] platform ff900000.vpu: Fixing up cyclic dependency with ff600000.hdmi-tx
[    0.043299] platform hdmi-connector: Fixing up cyclic dependency with ff600000.hdmi-tx
[    0.045533] KASLR enabled
[    0.047864] iommu: Default domain type: Translated 
[    0.047874] iommu: DMA domain TLB invalidation policy: strict mode 
[    0.048146] SCSI subsystem initialized
[    0.048255] usbcore: registered new interface driver usbfs
[    0.048283] usbcore: registered new interface driver hub
[    0.048310] usbcore: registered new device driver usb
[    0.048941] FPGA manager framework
[    0.049867] clocksource: Switched to clocksource arch_sys_counter
[    0.055483] NET: Registered PF_INET protocol family
[    0.055675] IP idents hash table entries: 32768 (order: 6, 262144 bytes, linear)
[    0.057226] tcp_listen_portaddr_hash hash table entries: 1024 (order: 2, 16384 bytes, linear)
[    0.057278] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.057292] TCP established hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.057403] TCP bind hash table entries: 16384 (order: 7, 524288 bytes, linear)
[    0.057814] TCP: Hash tables configured (established 16384 bind 16384)
[    0.057943] UDP hash table entries: 1024 (order: 3, 32768 bytes, linear)
[    0.057997] UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes, linear)
[    0.058139] NET: Registered PF_UNIX/PF_LOCAL protocol family
[    0.067289] Initialise system trusted keyrings
[    0.067550] workingset: timestamp_bits=46 max_order=19 bucket_order=0
[    0.102639] Key type asymmetric registered
[    0.102658] Asymmetric key parser 'x509' registered
[    0.102771] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 248)
[    0.102778] io scheduler mq-deadline registered
[    0.102782] io scheduler kyber registered
[    0.103224] irq_meson_gpio: 100 to 8 gpio interrupt mux initialized
[    0.103782] simple-pm-bus: probe of ff63c000.system-controller rejects match -19
[    0.104417] simple-pm-bus: probe of ff800000.sys-ctrl rejects match -19
[    0.114877] soc soc0: Amlogic Meson G12A (S905X2) Revision 28:e (40:2) Detected
[    0.116692] ff803000.serial: ttyAML0 at MMIO 0xff803000 (irq = 14, base_baud = 1500000) is a meson_uart
[    0.116739] printk: console [ttyAML0] enabled
[    0.853697] loop: module loaded
[    0.855199] g12a-mdio_mux ff64c000.mdio-multiplexer: Driver g12a-mdio_mux requests probe deferral
[    0.861696] usbcore: registered new interface driver usb-storage
[    0.866517] input: gpio-keys-polled as /devices/platform/gpio-keys-polled/input/input0
[    0.875019] meson-vrtc ff8000a8.rtc: registered as rtc0
[    0.879234] meson-vrtc ff8000a8.rtc: setting system clock to 1970-01-01T00:00:00 UTC (0)
[    0.887343] i2c_dev: i2c /dev entries driver
[    0.893907] psci-cpuidle: probe of psci-cpuidle rejects match -19
[    0.898254] YO: meson_mmc_probe: ffe03000.sd
[    0.898582] meson-gx-mmc ffe03000.sd: allocated mmc-pwrseq
[    0.907186] ledtrig-cpu: registered to indicate activity on CPUs
[    0.913444] meson-sm: secure-monitor enabled
[    0.917494] hid: raw HID events driver (C) Jiri Kosina
[    0.922837] usbcore: registered new interface driver usbhid
[    0.927953] usbhid: USB HID core driver
[    0.933271] YO mmc_rescan: mmc2
[    0.933361] YO: meson_mmc_probe: ffe03000.sd ALL OK
[    0.934415] optee: probing for conduit method.
[    0.940795] YO mmc_attach_sdio: mmc2
[    0.944118] optee: revision 2.4
[    0.948523] optee: initialized driver
[    0.955324] NET: Registered PF_PACKET protocol family
[    0.959391] Key type dns_resolver registered
[    0.968932] registered taskstats version 1
[    0.969166] Loading compiled-in X.509 certificates
[    1.078067] g12a-mdio_mux ff64c000.mdio-multiplexer: Driver g12a-mdio_mux requests probe deferral
[    1.082159] meson8b-dwmac ff3f0000.ethernet: IRQ eth_wake_irq not found
[    1.087852] meson8b-dwmac ff3f0000.ethernet: IRQ eth_lpi not found
[    1.089252] sdio_read_cis: vendor=2d0 dev=aae7
[    1.094043] meson8b-dwmac ff3f0000.ethernet: PTP uses main clock
[    1.094607] sdio_read_cis: vendor=2d0 dev=aae7
[    1.099105] meson8b-dwmac ff3f0000.ethernet: User ID: 0x11, Synopsys ID: 0x37
[    1.104334] mmc2: new ultra high speed SDR50 SDIO card at address 0001
[    1.108761] meson8b-dwmac ff3f0000.ethernet:         DWMAC1000
[    1.115947] mmcblk: probe of mmc2:0001 rejects match -19
[    1.122260] meson8b-dwmac ff3f0000.ethernet: DMA HW capability register supported
[    1.127571] YO mmc_attach_sdio mmc2=0
[    1.132695] meson8b-dwmac ff3f0000.ethernet: RX Checksum Offload Engine supported
[    1.151152] meson8b-dwmac ff3f0000.ethernet: COE Type 2
[    1.156327] meson8b-dwmac ff3f0000.ethernet: TX Checksum insertion supported
[    1.163313] meson8b-dwmac ff3f0000.ethernet: Wake-Up On Lan supported
[    1.169746] meson8b-dwmac ff3f0000.ethernet: Normal descriptors
[    1.175564] meson8b-dwmac ff3f0000.ethernet: Ring mode enabled
[    1.181340] meson8b-dwmac ff3f0000.ethernet: Enable RX Mitigation via HW Watchdog Timer
[    1.190835] dwc3-meson-g12a ffe09000.usb: USB2 ports: 2
[    1.194456] dwc3-meson-g12a ffe09000.usb: USB3 ports: 1
[    1.202810] xhci-hcd xhci-hcd.1.auto: xHCI Host Controller
[    1.205064] xhci-hcd xhci-hcd.1.auto: new USB bus registered, assigned bus number 1
[    1.212772] xhci-hcd xhci-hcd.1.auto: hcc params 0x0228fe6c hci version 0x110 quirks 0x0000000000010010
[    1.222000] xhci-hcd xhci-hcd.1.auto: irq 20, io mem 0xff500000
[    1.227940] xhci-hcd xhci-hcd.1.auto: xHCI Host Controller
[    1.233270] xhci-hcd xhci-hcd.1.auto: new USB bus registered, assigned bus number 2
[    1.240858] xhci-hcd xhci-hcd.1.auto: Host supports USB 3.0 SuperSpeed
[    1.247839] hub 1-0:1.0: USB hub found
[    1.251053] hub 1-0:1.0: 2 ports detected
[    1.255315] usb usb2: We don't know the algorithms for LPM for this host, disabling LPM.
[    1.263393] hub 2-0:1.0: USB hub found
[    1.266747] hub 2-0:1.0: 1 port detected
[    1.271625] YO: meson_mmc_probe: ffe07000.mmc
[    1.271940] meson-gx-mmc ffe07000.mmc: allocated mmc-pwrseq
[    1.288191] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[    1.297735] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[    1.304321] platform regulatory.0: loading /lib/firmware/updates/6.2.0-rc8/regulatory.db failed for no such file or directory.
[    1.310070] platform regulatory.0: loading /lib/firmware/updates/regulatory.db failed for no such file or directory.
[    1.310072] YO mmc_rescan: mmc1
[    1.310372] YO: meson_mmc_probe: ffe07000.mmc ALL OK
[    1.320590] platform regulatory.0: loading /lib/firmware/6.2.0-rc8/regulatory.db failed for no such file or directory.
[    1.332130] Freeing unused kernel memory: 22592K
[    1.339285] platform regulatory.0: Loading firmware from /lib/firmware/regulatory.db
[    1.351444] YO mmc_attach_sdio: mmc1
[    1.351545] platform regulatory.0: loading /lib/firmware/updates/6.2.0-rc8/regulatory.db.p7s failed for no such file or directory.
[    1.366588] platform regulatory.0: loading /lib/firmware/updates/regulatory.db.p7s failed for no such file or directory.
[    1.377411] platform regulatory.0: loading /lib/firmware/6.2.0-rc8/regulatory.db.p7s failed for no such file or directory.
[    1.377952] Run /init as init process
[    1.388447] platform regulatory.0: Loading firmware from /lib/firmware/regulatory.db.p7s
[    1.391974]   with arguments:
[    1.402862]     /init
[    1.402869] YO mmc_attach_sdio mmc1=-110
[    1.405082]   with environment:
[    1.412076]     HOME=/
[    1.414403]     TERM=linux
[    1.509991] mmc1: new HS200 MMC card at address 0001
[    1.510678] mmcblk1: mmc1:0001 SCA16G 14.7 GiB 
[    1.516496] mmcblk1boot0: mmc1:0001 SCA16G 4.00 MiB 
[    1.520045] mmcblk1boot1: mmc1:0001 SCA16G 4.00 MiB 
[    1.524705] mmcblk1rpmb: mmc1:0001 SCA16G 4.00 MiB, chardev (246:0)
[    4.081875] random: crng init done


I don't think it's enough to pinpoint the issue though?

Regards
Marc Gonzalez March 13, 2023, 1:55 p.m. UTC | #12
On 09/03/2023 18:51, Marc Gonzalez wrote:

> On 09/03/2023 15:29, Ulf Hansson wrote:
> 
>> To narrow down the problem, I would start by preventing the WiFi
>> driver from being insmoded. To make sure it doesn't affect the SDIO
>> card detection process.
>>
>> The point is, the SDIO card should be detected properly, no matter
>> whether there is a corresponding SDIO func driver (WiFi driver)
>> available for it. For a detected SDIO/eMMC/SD card, mmc_add_card()
>> prints a message about the card in the log during initialization. It
>> could look like the below print, for example:
>>
>> "mmc2: new ultra high speed SDR104 SDIO card at address 0001".
> 
> OK, I built the WiFi driver as a module which is NOT loaded at boot time.

Still trying to bisect this heisenbug into submission... :(

So far, I've pared it down to mmc_attach_sdio()

When probe WORKS, mmc_attach_sdio() returns 0.
When probe FAILS, mmc_attach_sdio() returns ETIMEDOUT
via mmc_send_io_op_cond(host, 0, &ocr);

Wrapping mmc_send_io_op_cond() in a loop
makes it work on the second try.

Would appreciate additional guidance. Am mostly stabbing in the dark :)

Regards
Marc Gonzalez March 13, 2023, 2:41 p.m. UTC | #13
On 13/03/2023 14:55, Marc Gonzalez wrote:

> On 09/03/2023 18:51, Marc Gonzalez wrote:
> 
>> On 09/03/2023 15:29, Ulf Hansson wrote:
>>
>>> To narrow down the problem, I would start by preventing the WiFi
>>> driver from being insmoded. To make sure it doesn't affect the SDIO
>>> card detection process.
>>>
>>> The point is, the SDIO card should be detected properly, no matter
>>> whether there is a corresponding SDIO func driver (WiFi driver)
>>> available for it. For a detected SDIO/eMMC/SD card, mmc_add_card()
>>> prints a message about the card in the log during initialization. It
>>> could look like the below print, for example:
>>>
>>> "mmc2: new ultra high speed SDR104 SDIO card at address 0001".
>>
>> OK, I built the WiFi driver as a module which is NOT loaded at boot time.
> 
> Still trying to bisect this heisenbug into submission... :(
> 
> So far, I've pared it down to mmc_attach_sdio()
> 
> When probe WORKS, mmc_attach_sdio() returns 0.
> When probe FAILS, mmc_attach_sdio() returns ETIMEDOUT
> via mmc_send_io_op_cond(host, 0, &ocr);
> 
> Wrapping mmc_send_io_op_cond() in a loop
> makes it work on the second try.
> 
> Would appreciate additional guidance. Am mostly stabbing in the dark :)

Caught the race "in the act" (in flagrante delicto)

Using this patch:

diff --git a/drivers/mmc/core/sdio.c b/drivers/mmc/core/sdio.c
index f64b9ac76a5cd..eb2c95721e32c 100644
--- a/drivers/mmc/core/sdio.c
+++ b/drivers/mmc/core/sdio.c
@@ -1204,11 +1204,24 @@ int mmc_attach_sdio(struct mmc_host *host)
 	struct mmc_card *card;
 
 	WARN_ON(!host->claimed);
-
+	//printk("YO %s: %s", __func__, mmc_hostname(host));
+	//msleep(500);
+
+#if 1
+	for (i = 0; i < 10; ++i) {
+		err = mmc_send_io_op_cond(host, 0, &ocr);
+		printk("YO %s", mmc_hostname(host));
+		if (!err) goto all_good;
+	}
+	printk("%s failed with %d", __func__, err);
+	return err;
+#else
 	err = mmc_send_io_op_cond(host, 0, &ocr);
 	if (err)
 		return err;
+#endif
 
+all_good:
 	mmc_attach_bus(host, &mmc_sdio_ops);
 	if (host->ocr_avail_sdio)
 		host->ocr_avail = host->ocr_avail_sdio;



Two boots of the same kernel:


/* THIS RUN PROBES ONCE */

[    0.846591] loop: module loaded
[    0.848079] g12a-mdio_mux ff64c000.mdio-multiplexer: Driver g12a-mdio_mux requests probe deferral
[    0.853901] usbcore: registered new interface driver usb-storage
[    0.859421] input: gpio-keys-polled as /devices/platform/gpio-keys-polled/input/input0
[    0.867880] meson-vrtc ff8000a8.rtc: registered as rtc0
[    0.872148] meson-vrtc ff8000a8.rtc: setting system clock to 1970-01-01T00:00:00 UTC (0)
[    0.880251] i2c_dev: i2c /dev entries driver
[    0.886780] psci-cpuidle: probe of psci-cpuidle rejects match -19
[    0.891279] ledtrig-cpu: registered to indicate activity on CPUs
[    0.891505] meson-gx-mmc ffe03000.sd: allocated mmc-pwrseq
[    0.896644] meson-sm: secure-monitor enabled
[    0.906242] hid: raw HID events driver (C) Jiri Kosina
[    0.911506] usbcore: registered new interface driver usbhid
[    0.916638] usbhid: USB HID core driver
[    0.922897] optee: probing for conduit method.
[    0.924837] optee: revision 2.4
[    0.925770] optee: initialized driver
[    0.932832] NET: Registered PF_PACKET protocol family
[    0.936627] Key type dns_resolver registered
[    0.941251] YO mmc2
[    0.945287] registered taskstats version 1
[    0.946989] Loading compiled-in X.509 certificates
[    1.074273] sdio_read_cis: vendor=2d0 dev=aae7
[    1.074824] sdio_read_cis: vendor=2d0 dev=aae7
[    1.074862] mmc2: new ultra high speed SDR50 SDIO card at address 0001


/* THIS RUN REQUIRES TWO PROBES */

[    0.855536] loop: module loaded
[    0.857064] g12a-mdio_mux ff64c000.mdio-multiplexer: Driver g12a-mdio_mux requests probe deferral
[    0.862952] usbcore: registered new interface driver usb-storage
[    0.868480] input: gpio-keys-polled as /devices/platform/gpio-keys-polled/input/input0
[    0.877000] meson-vrtc ff8000a8.rtc: registered as rtc0
[    0.881206] meson-vrtc ff8000a8.rtc: setting system clock to 1970-01-01T00:00:00 UTC (0)
[    0.889310] i2c_dev: i2c /dev entries driver
[    0.895850] psci-cpuidle: probe of psci-cpuidle rejects match -19
[    0.900536] meson-gx-mmc ffe03000.sd: allocated mmc-pwrseq
[    0.905438] ledtrig-cpu: registered to indicate activity on CPUs
[    0.911212] meson-sm: secure-monitor enabled
[    0.915235] hid: raw HID events driver (C) Jiri Kosina
[    0.920576] usbcore: registered new interface driver usbhid
[    0.925700] usbhid: USB HID core driver
[    0.931999] optee: probing for conduit method.
[    0.933912] optee: revision 2.4
[    0.934875] optee: initialized driver
[    0.941567] YO mmc2
[    0.941764] NET: Registered PF_PACKET protocol family
[    0.942884] YO mmc2
[    0.947738] Key type dns_resolver registered
[    0.958079] registered taskstats version 1
[    0.958190] Loading compiled-in X.509 certificates
[    1.068023] g12a-mdio_mux ff64c000.mdio-multiplexer: Driver g12a-mdio_mux requests probe deferral
[    1.072434] g12a-mdio_mux ff64c000.mdio-multiplexer: Driver g12a-mdio_mux requests probe deferral
[    1.072766] meson-gx-mmc ffe07000.mmc: allocated mmc-pwrseq
[    1.081297] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[    1.101201] sdio_read_cis: vendor=2d0 dev=aae7
[    1.101755] sdio_read_cis: vendor=2d0 dev=aae7
[    1.101794] mmc2: new ultra high speed SDR50 SDIO card at address 0001



NOTA BENE: for mmc1, in both runs, kernel probes 10 times and fails with:

[    1.265231] mmc_attach_sdio failed with -110
[    1.373006] mmc1: new HS200 MMC card at address 0001
[    1.377242] mmcblk1: mmc1:0001 SCA16G 14.7 GiB 
[    1.383718] mmcblk1boot0: mmc1:0001 SCA16G 4.00 MiB 
[    1.387257] mmcblk1boot1: mmc1:0001 SCA16G 4.00 MiB 
[    1.391912] mmcblk1rpmb: mmc1:0001 SCA16G 4.00 MiB, chardev (246:0)

mmc_attach_sdio() fails but mmc_add_card() succeeds anyway...?
Confused.
mmc_attach_sdio() doesn't call mmc_add_card() if it exits prematurely.

Regards
Marc Gonzalez March 13, 2023, 2:53 p.m. UTC | #14
On 13/03/2023 15:41, Marc Gonzalez wrote:

> NOTA BENE: for mmc1, in both runs, kernel probes 10 times and fails with:
> 
> [    1.265231] mmc_attach_sdio failed with -110
> [    1.373006] mmc1: new HS200 MMC card at address 0001
> [    1.377242] mmcblk1: mmc1:0001 SCA16G 14.7 GiB 
> [    1.383718] mmcblk1boot0: mmc1:0001 SCA16G 4.00 MiB 
> [    1.387257] mmcblk1boot1: mmc1:0001 SCA16G 4.00 MiB 
> [    1.391912] mmcblk1rpmb: mmc1:0001 SCA16G 4.00 MiB, chardev (246:0)
> 
> mmc_attach_sdio() fails but mmc_add_card() succeeds anyway...?
> Confused.
> mmc_attach_sdio() doesn't call mmc_add_card() if it exits prematurely.

This might be because mmc1 and mmc2 are of "different nature".

[    2.083053] mmc2: new ultra high speed SDR50 SDIO card at address 0001
[    2.093918] CPU: 1 PID: 35 Comm: kworker/1:1 Not tainted 6.2.0-rc8 #300
[    2.100464] Hardware name: SEI Robotics SEI510 (DT)
[    2.105296] Workqueue: events_freezable mmc_rescan
[    2.110038] Call trace:
[    2.126856]  mmc_add_card+0xfc/0x2d0
[    2.130392]  mmc_attach_sdio+0x22c/0x3c0
[    2.134273]  mmc_rescan+0x258/0x2f0
[    2.137723]  process_one_work+0x1cc/0x320
[    2.141691]  worker_thread+0x14c/0x450
[    2.145399]  kthread+0x10c/0x110
[    2.148591]  ret_from_fork+0x10/0x20


/* below message from call_driver_probe() getting ENODEV */
[    2.152299] mmcblk: probe of mmc2:0001 rejects match -19
[    2.176589] YO mmc1
[    2.180263] YO mmc1
[    2.183937] YO mmc1
[    2.187621] YO mmc1
[    2.191299] YO mmc1
[    2.194972] YO mmc1
[    2.198649] YO mmc1
[    2.202326] YO mmc1
[    2.206000] YO mmc1
[    2.209674] YO mmc1
[    2.209687] mmc_attach_sdio failed with -110
[    2.318954] mmc1: new HS200 MMC card at address 0001
[    2.322515] CPU: 1 PID: 20 Comm: kworker/1:0 Not tainted 6.2.0-rc8 #300
[    2.329061] Hardware name: SEI Robotics SEI510 (DT)
[    2.333892] Workqueue: events_freezable mmc_rescan
[    2.338635] Call trace:
[    2.355453]  mmc_add_card+0xfc/0x2d0
[    2.358989]  mmc_attach_mmc+0xf0/0x180
[    2.362698]  mmc_rescan+0x280/0x2f0
[    2.366148]  process_one_work+0x1cc/0x320
[    2.370115]  worker_thread+0x14c/0x450
[    2.373824]  kthread+0x10c/0x110
[    2.377015]  ret_from_fork+0x10/0x20
[    2.381265] mmcblk1: mmc1:0001 SCA16G 14.7 GiB 
[    2.389945] mmcblk1boot0: mmc1:0001 SCA16G 4.00 MiB 
[    2.391199] mmcblk1boot1: mmc1:0001 SCA16G 4.00 MiB 
[    2.395897] mmcblk1rpmb: mmc1:0001 SCA16G 4.00 MiB, chardev (246:0)
Marc Gonzalez March 13, 2023, 5:09 p.m. UTC | #15
On 13/03/2023 14:55, Marc Gonzalez wrote:

> Still trying to bisect this heisenbug into submission... :(
> 
> So far, I've pared it down to mmc_attach_sdio()
> 
> When probe WORKS, mmc_attach_sdio() returns 0.
> When probe FAILS, mmc_attach_sdio() returns ETIMEDOUT
> via mmc_send_io_op_cond(host, 0, &ocr);
> 
> Wrapping mmc_send_io_op_cond() in a loop
> makes it work on the second try.

Almost there, I think, I hope :)

DT prop "post-power-on-delay-ms" looks like what I needed all along.
It exists both for host (default 10 ms) and for pwrseq_simple (default 0 apparently).

/*
 * Apply power to the MMC stack.  This is a two-stage process.
 * First, we enable power to the card without the clock running.
 * We then wait a bit for the power to stabilise.  Finally,
 * enable the bus drivers and clock to the card.
 *
 * We must _NOT_ enable the clock prior to power stablising.
 *
 * If a host does all the power sequencing itself, ignore the
 * initial MMC_POWER_UP stage.
 */
void mmc_power_up(struct mmc_host *host, u32 ocr)

Calls:

mmc_delay(host->ios.power_delay_ms);
mmc_pwrseq_post_power_on(host);
  => msleep(pwrseq->post_power_on_delay_ms);
...
mmc_delay(host->ios.power_delay_ms);


QUESTION:
It's not clear to me why we sleep twice for host->ios.power_delay_ms?


Looks like all I need is to add post-power-on-delay-ms = <100>; to

	sdio_pwrseq: sdio-pwrseq {
		compatible = "mmc-pwrseq-simple";
		reset-gpios = <&gpio GPIOX_6 GPIO_ACTIVE_LOW>;
		clocks = <&wifi32k>;
		clock-names = "ext_clock";
	};

Will revert all my stabs in the dark, and boot the board 100 times
to check if this does the trick.

Regards
Martin Blumenstingl March 13, 2023, 8:04 p.m. UTC | #16
Hi Marc,

On Mon, Mar 13, 2023 at 6:09 PM Marc Gonzalez <marc.w.gonzalez@free.fr> wrote:
[...]
> > Wrapping mmc_send_io_op_cond() in a loop
> > makes it work on the second try.
>
> Almost there, I think, I hope :)
>
> DT prop "post-power-on-delay-ms" looks like what I needed all along.
> It exists both for host (default 10 ms) and for pwrseq_simple (default 0 apparently).
I think you're on the right track here!

[...]
> QUESTION:
> It's not clear to me why we sleep twice for host->ios.power_delay_ms?
I'm not sure but I think host->ios.power_delay_ms is independent from
the post-power-on-delay-ms property of "mmc-pwrseq-simple".

> Looks like all I need is to add post-power-on-delay-ms = <100>; to
>
>         sdio_pwrseq: sdio-pwrseq {
>                 compatible = "mmc-pwrseq-simple";
>                 reset-gpios = <&gpio GPIOX_6 GPIO_ACTIVE_LOW>;
>                 clocks = <&wifi32k>;
>                 clock-names = "ext_clock";
>         };
I found the following code in the vendor RTL8822CS driver [0].
Simplified version:
    for (i = 0; i <= 50; i++) {
       msleep(10);
        if (sdhci_device_attached())
           break;
   }
(so a delay of up to 500ms)

Also I found this code [1] in the same repo:
    mmc_pm_gpio_ctrl("rtl8189es_vdd_en", 1);
    udelay(100);
    mmc_pm_gpio_ctrl("rtl8189es_vcc_en", 1);
    udelay(50);

So like I said above: I think you're going in the right direction!


Best regards,
Martin


[0] https://github.com/chewitt/RTL8822CS/blob/main/platform/platform_sprd_sdio.c#L55-L62
[1] https://github.com/chewitt/RTL8822CS/blob/main/platform/platform_ARM_SUNxI_sdio.c#L35-L43
Marc Gonzalez March 13, 2023, 9:42 p.m. UTC | #17
On 13/03/2023 21:04, Martin Blumenstingl wrote:

> On Mon, Mar 13, 2023 at 6:09 PM Marc Gonzalez wrote:
> [...]
>> QUESTION:
>> It's not clear to me why we sleep twice for host->ios.power_delay_ms?
>
> I'm not sure but I think host->ios.power_delay_ms is independent from
> the post-power-on-delay-ms property of "mmc-pwrseq-simple".

They are distinct indeed (the props have the same name & similar purpose though).

host->ios.power_delay_ms is set via:

drivers/mmc/core/host.c:        device_property_read_u32(dev, "post-power-on-delay-ms",
drivers/mmc/core/host.c-                                 &host->ios.power_delay_ms);


pwrseq->post_power_on_delay_ms is set via:

drivers/mmc/core/pwrseq_simple.c:       device_property_read_u32(dev, "post-power-on-delay-ms",
drivers/mmc/core/pwrseq_simple.c-                                &pwrseq->post_power_on_delay_ms);


However, they are both used to delay mmc_power_up(),
and the host delay is used to sleep twice, which I
found confusing ;)

Regards
Ulf Hansson March 14, 2023, 7:27 a.m. UTC | #18
On Mon, 13 Mar 2023 at 22:42, Marc Gonzalez <marc.w.gonzalez@free.fr> wrote:
>
> On 13/03/2023 21:04, Martin Blumenstingl wrote:
>
> > On Mon, Mar 13, 2023 at 6:09 PM Marc Gonzalez wrote:
> > [...]
> >> QUESTION:
> >> It's not clear to me why we sleep twice for host->ios.power_delay_ms?
> >
> > I'm not sure but I think host->ios.power_delay_ms is independent from
> > the post-power-on-delay-ms property of "mmc-pwrseq-simple".
>
> They are distinct indeed (the props have the same name & similar purpose though).

Correct.

>
> host->ios.power_delay_ms is set via:
>
> drivers/mmc/core/host.c:        device_property_read_u32(dev, "post-power-on-delay-ms",

This is a property that may be specified in the host's OF node. See
Documentation/devicetree/bindings/mmc/mmc-controller.yaml.

> drivers/mmc/core/host.c-                                 &host->ios.power_delay_ms);

Note that the default value is set to 10 ms, in mmc_alloc_host(). So,
if you need a different value, that can be specified in DT.

>
>
> pwrseq->post_power_on_delay_ms is set via:
>
> drivers/mmc/core/pwrseq_simple.c:       device_property_read_u32(dev, "post-power-on-delay-ms",

This is a property that may be specified in the pwrseq OF node. See
Documentation/devicetree/bindings/mmc/mmc-pwrseq-simple.yaml.

> drivers/mmc/core/pwrseq_simple.c-                                &pwrseq->post_power_on_delay_ms);
>
>
> However, they are both used to delay mmc_power_up(),
> and the host delay is used to sleep twice, which I
> found confusing ;)

It's certainly a bit hairy to follow, but I think you have understood
things correctly.

>
> Regards
>

Looks like you are getting closer to the solution, at least for the
first step to make sure the SDIO card is getting detected properly.
Nice!

Kind regards
Uffe
Marc Gonzalez March 14, 2023, 8:32 a.m. UTC | #19
On 14/03/2023 08:27, Ulf Hansson wrote:

> On Mon, 13 Mar 2023 at 22:42, Marc Gonzalez wrote:
>>
>> On 13/03/2023 21:04, Martin Blumenstingl wrote:
>>
>>> On Mon, Mar 13, 2023 at 6:09 PM Marc Gonzalez wrote:
>>> [...]
>>>> QUESTION:
>>>> It's not clear to me why we sleep twice for host->ios.power_delay_ms?
>>>
>>> I'm not sure but I think host->ios.power_delay_ms is independent from
>>> the post-power-on-delay-ms property of "mmc-pwrseq-simple".
>>
>> They are distinct indeed (the props have the same name & similar purpose though).
> 
> Correct.
> 
>>
>> host->ios.power_delay_ms is set via:
>>
>> drivers/mmc/core/host.c:        device_property_read_u32(dev, "post-power-on-delay-ms",
> 
> This is a property that may be specified in the host's OF node. See
> Documentation/devicetree/bindings/mmc/mmc-controller.yaml.
> 
>> drivers/mmc/core/host.c-                                 &host->ios.power_delay_ms);
> 
> Note that the default value is set to 10 ms, in mmc_alloc_host(). So,
> if you need a different value, that can be specified in DT.
> 
>>
>>
>> pwrseq->post_power_on_delay_ms is set via:
>>
>> drivers/mmc/core/pwrseq_simple.c:       device_property_read_u32(dev, "post-power-on-delay-ms",
> 
> This is a property that may be specified in the pwrseq OF node. See
> Documentation/devicetree/bindings/mmc/mmc-pwrseq-simple.yaml.
> 
>> drivers/mmc/core/pwrseq_simple.c-                                &pwrseq->post_power_on_delay_ms);
>>
>>
>> However, they are both used to delay mmc_power_up(),
>> and the host delay is used to sleep twice, which I
>> found confusing ;)
> 
> It's certainly a bit hairy to follow, but I think you have understood
> things correctly.
> 
>>
>> Regards
>>
> 
> Looks like you are getting closer to the solution, at least for the
> first step to make sure the SDIO card is getting detected properly.
> Nice!

What I'm confused about is:

Why would I specify the delay in the host rather than in the pwrseq?

I'll have to study mmc_power_up() a bit more ;)

Regards
Ulf Hansson March 14, 2023, 11:08 a.m. UTC | #20
On Tue, 14 Mar 2023 at 09:33, Marc Gonzalez <marc.w.gonzalez@free.fr> wrote:
>
> On 14/03/2023 08:27, Ulf Hansson wrote:
>
> > On Mon, 13 Mar 2023 at 22:42, Marc Gonzalez wrote:
> >>
> >> On 13/03/2023 21:04, Martin Blumenstingl wrote:
> >>
> >>> On Mon, Mar 13, 2023 at 6:09 PM Marc Gonzalez wrote:
> >>> [...]
> >>>> QUESTION:
> >>>> It's not clear to me why we sleep twice for host->ios.power_delay_ms?
> >>>
> >>> I'm not sure but I think host->ios.power_delay_ms is independent from
> >>> the post-power-on-delay-ms property of "mmc-pwrseq-simple".
> >>
> >> They are distinct indeed (the props have the same name & similar purpose though).
> >
> > Correct.
> >
> >>
> >> host->ios.power_delay_ms is set via:
> >>
> >> drivers/mmc/core/host.c:        device_property_read_u32(dev, "post-power-on-delay-ms",
> >
> > This is a property that may be specified in the host's OF node. See
> > Documentation/devicetree/bindings/mmc/mmc-controller.yaml.
> >
> >> drivers/mmc/core/host.c-                                 &host->ios.power_delay_ms);
> >
> > Note that the default value is set to 10 ms, in mmc_alloc_host(). So,
> > if you need a different value, that can be specified in DT.
> >
> >>
> >>
> >> pwrseq->post_power_on_delay_ms is set via:
> >>
> >> drivers/mmc/core/pwrseq_simple.c:       device_property_read_u32(dev, "post-power-on-delay-ms",
> >
> > This is a property that may be specified in the pwrseq OF node. See
> > Documentation/devicetree/bindings/mmc/mmc-pwrseq-simple.yaml.
> >
> >> drivers/mmc/core/pwrseq_simple.c-                                &pwrseq->post_power_on_delay_ms);
> >>
> >>
> >> However, they are both used to delay mmc_power_up(),
> >> and the host delay is used to sleep twice, which I
> >> found confusing ;)
> >
> > It's certainly a bit hairy to follow, but I think you have understood
> > things correctly.
> >
> >>
> >> Regards
> >>
> >
> > Looks like you are getting closer to the solution, at least for the
> > first step to make sure the SDIO card is getting detected properly.
> > Nice!
>
> What I'm confused about is:
>
> Why would I specify the delay in the host rather than in the pwrseq?

If the delay is to manage vmmc and vqmmc, which is somewhat part of
the generic specifications (SD/MMC), then it should be described in
the host's node.

A pwrseq is something special, which is also platform and device
specific (the SDIO device). If the delays correspond to this, it
should be part of the pwrseq node.

>
> I'll have to study mmc_power_up() a bit more ;)

:-)

Kind regards
Uffe
Marc Gonzalez March 14, 2023, 4:37 p.m. UTC | #21
On 14/03/2023 12:08, Ulf Hansson wrote:

> If the delay is to manage vmmc and vqmmc, which is somewhat part of
> the generic specifications (SD/MMC), then it should be described in
> the host's node.
> 
> A pwrseq is something special, which is also platform and device
> specific (the SDIO device). If the delays correspond to this, it
> should be part of the pwrseq node.

Uffe,

Wouldn't it make sense to warn, when probing for a non-removable card
returns nothing?

Something along these lines (which I can spin into a formal patch)

--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -2257,6 +2257,9 @@ void mmc_rescan(struct work_struct *work)
                        break;
        }
 
+       if (!mmc_card_is_removable(host) && !host->card)
+               dev_warn(mmc_dev(host), "no device found");
+
        /*
         * Ignore the command timeout errors observed during
         * the card init as those are excepted.

Regards
Ulf Hansson March 15, 2023, 8:55 p.m. UTC | #22
On Tue, 14 Mar 2023 at 17:38, Marc Gonzalez <marc.w.gonzalez@free.fr> wrote:
>
> On 14/03/2023 12:08, Ulf Hansson wrote:
>
> > If the delay is to manage vmmc and vqmmc, which is somewhat part of
> > the generic specifications (SD/MMC), then it should be described in
> > the host's node.
> >
> > A pwrseq is something special, which is also platform and device
> > specific (the SDIO device). If the delays correspond to this, it
> > should be part of the pwrseq node.
>
> Uffe,
>
> Wouldn't it make sense to warn, when probing for a non-removable card
> returns nothing?
>
> Something along these lines (which I can spin into a formal patch)
>
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -2257,6 +2257,9 @@ void mmc_rescan(struct work_struct *work)
>                         break;
>         }
>
> +       if (!mmc_card_is_removable(host) && !host->card)
> +               dev_warn(mmc_dev(host), "no device found");
> +
>         /*
>          * Ignore the command timeout errors observed during
>          * the card init as those are excepted.
>
> Regards
>

Seems reasonable to me! Please post a patch.

Kind regards
Uffe
Marc Gonzalez March 22, 2023, 5:40 p.m. UTC | #23
On 08/03/2023 18:21, Franky Lin wrote:

> On Wed, Mar 8, 2023 at 4:48 AM Marc Gonzalez wrote:
>>
>> On 08/03/2023 00:15, Franky Lin wrote:
>>
>>> On Tue, Mar 7, 2023 at 6:40 AM Marc Gonzalez wrote:
>>>
>>>> Through the SDIO bus, the WiFi chip reports 0xaae7 (i.e. 43751)
>>>> hence the /sys/bus/sdio/devices output above.
>>>>
>>>> sdio_read_func_cis() -> sdio_read_cis() which sets
>>>> func->vendor/func->device to 2d0/aae7
>>>>
>>>>
>>>> But when brcmf_chip_recognition() calls ci->ops->read32()
>>>> i.e. brcmf_sdio_buscore_read32()
>>>> [ vs brcmf_sdiod_readl() in brcmf_sdio_probe_attach() ]
>>>>
>>>> [    1.177283] brcmfmac: F1 signature read @0x18000000=0x1042aae8
>>>> [    1.182912] found AXI chip: BCM43752/2
>>>> [    1.186384] BCM43752/2: chip=aae8 rev=2 type=1
>>>>
>>>> Here it reports 0xaae8 (i.e. 43752)
>>>>
>>>> Why the discrepancy?
>>>> Can it cause issues?
>>>> (Sometimes, the whole SDIO bus doesn't probe at boot.
>>>> I am still investigating these intermittent problems.)
>>>>
>>>> Should I use 43751 or 43752 firmware...?
>>>
>>> This question should be answered by the Cypress/Infineon folks but
>>> unfortunately they have been quiet for a long time. In general we use
>>> the id read from 0x18000000 to decide which firmware to load. But be
>>> aware that the rev also matters. There are some examples in
>>> brcmf_sdio_fwnames table that the different firmware name can be
>>> derived from the same chip common id but different rev.
>>>
>>> However sdio device enumeration happens before firmware download so
>>> the intermittent problem you are facing probably is not related to
>>> firmware version.
>>
>> When the host sends the equivalent of an "identify yourself" message
>> on the SDIO bus, doesn't the reply come from the WiFi device?
>> Why would the device reply 0xaae7 instead of 0xaae8?
>>
>> In other words, who is replying 0xaae7?
> 
> The enumeration response is done by the HW sdio core on the chip.
> No software involved.

Hello Franky et al,

I have made some progress understanding some issues.
But I remain baffled by many others :(

At my disposal, two S905X2 boards with SDIO-based brcm modules
(according to the labels on the boards) :

Board A embeds "AP6398SR3"  which identifies as
	x vendor=2d0 dev=4359 over sdio_read_cis
	x brcmfmac: F1 signature read @0x18000000=0x17294359
	x brcmfmac: BCM4359/9: chip=4359 rev=9

Board B embeds "AP6398SR32" which identifies as
	x vendor=2d0 dev=aae7 over sdio_read_cis
	x F1 signature read @0x18000000=0x1042aae8
	x brcmfmac: BCM43752/2: chip=aae8 rev=2

(I still do not understand why AP6398SR32 on board B
identifies as AAE7 on the SDIO bus, but as AAE8 via MMIO.
That does not make sense to me.)

On board A, WiFi module probes correctly with the default timing.
On board B, WiFi module requires waiting 10 ms more,
otherwise mmc_send_io_op_cond() (which sends CMD5) times out,
and the mmc core assumes that there is no card in the "slot".

Using the same firmware files in mainline that vendor's kernel uses,
WiFi on board A works as expected.
WiFi on board B logs a scan error (-52) every second,
and eventually panics the kernel.

NB: using the vendor's kernel, both modules work as expected.


[ 1033.293100] ieee80211 phy0: brcmf_run_escan: error (-52)
[ 1033.293130] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 1034.304785] ieee80211 phy0: brcmf_run_escan: error (-52)
[ 1034.304811] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 1035.316452] ieee80211 phy0: brcmf_run_escan: error (-52)
[ 1035.316480] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 1036.328251] ieee80211 phy0: brcmf_run_escan: error (-52)
[ 1036.328289] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 1037.340004] ieee80211 phy0: brcmf_run_escan: error (-52)
[ 1037.340030] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 1038.351667] ieee80211 phy0: brcmf_run_escan: error (-52)
[ 1038.351692] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 1039.363331] ieee80211 phy0: brcmf_run_escan: error (-52)
[ 1039.363357] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 1040.374874] ieee80211 phy0: brcmf_run_escan: error (-52)
[ 1040.374900] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 1041.386538] ieee80211 phy0: brcmf_run_escan: error (-52)
[ 1041.386563] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
[ 1042.398166] SError Interrupt on CPU1, code 0x00000000bf000000 -- SError
[ 1042.398175] CPU: 1 PID: 61 Comm: kworker/u9:0 Not tainted 6.2.0-rc8 #431
[ 1042.398181] Hardware name: SEI Robotics SEI510 (DT)
[ 1042.398184] Workqueue: brcmf_wq/mmc2:0001:1 brcmf_sdio_dataworker
[ 1042.398201] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1042.398206] pc : brcmf_sdio_dataworker+0x2090/0x23d0
[ 1042.398213] lr : brcmf_sdio_dataworker+0x1fe8/0x23d0
[ 1042.398219] sp : ffff80000a1d3c60
[ 1042.398220] x29: ffff80000a1d3c70 x28: 0000000000000000 x27: ffff8000088ac558
[ 1042.398227] x26: ffff800009ed1ab0 x25: ffff000004108000 x24: ffff80000b4f1000
[ 1042.398233] x23: ffff800009f25b68 x22: 00000000000000a8 x21: 00000000000000e2
[ 1042.398238] x20: 000000000000000c x19: ffff000000b3a000 x18: 0000000000000000
[ 1042.398244] x17: 0000000000000000 x16: 0000000000000000 x15: d099d095d090d08c
[ 1042.398250] x14: 0000000000002912 x13: ffff000067b5c400 x12: ffff000067b8db00
[ 1042.398256] x11: ffff000067b8db00 x10: 0000000000000000 x9 : 0000000000000000
[ 1042.398262] x8 : 000000000000005a x7 : ffff800009cc3400 x6 : 0000000000000200
[ 1042.398268] x5 : 0000000000002a10 x4 : 00000000000000d6 x3 : 0000000000000200
[ 1042.398273] x2 : 0000000000000002 x1 : 0000000000000008 x0 : ffff000000b39800
[ 1042.398280] Kernel panic - not syncing: Asynchronous SError Interrupt
[ 1042.398283] CPU: 1 PID: 61 Comm: kworker/u9:0 Not tainted 6.2.0-rc8 #431
[ 1042.398287] Hardware name: SEI Robotics SEI510 (DT)
[ 1042.398289] Workqueue: brcmf_wq/mmc2:0001:1 brcmf_sdio_dataworker
[ 1042.398296] Call trace:
[ 1042.398298]  dump_backtrace.part.0+0xe0/0xf0
[ 1042.398307]  show_stack+0x18/0x30
[ 1042.398313]  dump_stack_lvl+0x68/0x84
[ 1042.398320]  dump_stack+0x18/0x34
[ 1042.398324]  panic+0x184/0x344
[ 1042.398330]  nmi_panic+0xac/0xb0
[ 1042.398335]  arm64_serror_panic+0x6c/0x80
[ 1042.398339]  do_serror+0x58/0x60
[ 1042.398343]  el1h_64_error_handler+0x30/0x50
[ 1042.398348]  el1h_64_error+0x64/0x68
[ 1042.398352]  brcmf_sdio_dataworker+0x2090/0x23d0
[ 1042.398357]  process_one_work+0x1cc/0x320
[ 1042.398363]  worker_thread+0x14c/0x450
[ 1042.398367]  kthread+0xfc/0x100
[ 1042.398371]  ret_from_fork+0x10/0x20
[ 1042.398378] SMP: stopping secondary CPUs
[ 1042.398383] Kernel Offset: 0x80000 from 0xffff800008000000
[ 1042.398385] PHYS_OFFSET: 0x0
[ 1042.398387] CPU features: 0x00000,01200100,0000420b
[ 1042.398390] Memory Limit: none
[ 1042.608339] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---


Has anyone seen a similar panic on a SDIO-based WiFi module?


I was wondering:
Has anyone evaluated an X2/X3 meson board with a PCIe (not SIO) WiFi module?
Neil told PCIe is expected to be less CPU-intensive.

Has anyone evaluated an SDIO-based board that doesn't require the
amlogic,dram-access-quirk?

Regards
Marc Gonzalez April 4, 2023, 3:59 p.m. UTC | #24
On 22/03/2023 18:40, Marc Gonzalez wrote:

> At my disposal, two S905X2 boards with SDIO-based brcm modules
> (according to the labels on the boards) :
> 
> Board A embeds "AP6398SR3"  which identifies as
> 	x vendor=2d0 dev=4359 over sdio_read_cis
> 	x brcmfmac: F1 signature read @0x18000000=0x17294359
> 	x brcmfmac: BCM4359/9: chip=4359 rev=9
> 
> Board B embeds "AP6398SR32" which identifies as
> 	x vendor=2d0 dev=aae7 over sdio_read_cis
> 	x F1 signature read @0x18000000=0x1042aae8
> 	x brcmfmac: BCM43752/2: chip=aae8 rev=2

Pali Rohár pointed out lssdio.
Here's the verbose output, for the record.


BOARD A
# lssdio -v
Host 2 Address 0001 Function 1: ID 02d0:4359 
    Common Address: 0x0001
    Common VDD voltage: 3.2-3.4V
    Common Type: SDIO
    Common Vendor ID: 0x02d0
    Common Device ID: 0x4359
    Common Revision: 0.0
    Function ID: 1
    Function Vendor ID: 0x02d0
    Function Device ID: 0x4359
    Function Class ID: 0x00
    Function Revision: 0.0
    Function Driver in use: brcmfmac
Host 2 Address 0001 Function 2: ID 02d0:4359 
    Common Address: 0x0001
    Common VDD voltage: 3.2-3.4V
    Common Type: SDIO
    Common Vendor ID: 0x02d0
    Common Device ID: 0x4359
    Common Revision: 0.0
    Function ID: 2
    Function Vendor ID: 0x02d0
    Function Device ID: 0x4359
    Function Class ID: 0x00
    Function Revision: 0.0
    Function Driver in use: brcmfmac
Host 2 Address 0001 Function 3: ID 02d0:4359 (Class 02)
    Common Address: 0x0001
    Common VDD voltage: 3.2-3.4V
    Common Type: SDIO
    Common Vendor ID: 0x02d0
    Common Device ID: 0x4359
    Common Revision: 0.0
    Function ID: 3
    Function Vendor ID: 0x02d0
    Function Device ID: 0x4359
    Function Class ID: 0x02
    Function Revision: 0.0
    Function Driver in use: (none)


BOARD B
# lssdio -v
Host 2 Address 0001 Function 1: ID 02d0:aae7 
    Common Address: 0x0001
    Common VDD voltage: 3.2-3.4V
    Common Type: SDIO
    Common Vendor ID: 0x02d0
    Common Device ID: 0xaae7
    Common Revision: 0.0
    Function ID: 1
    Function Vendor ID: 0x02d0
    Function Device ID: 0xaae7
    Function Class ID: 0x00
    Function Revision: 0.0
    Function Driver in use: (none)
Host 2 Address 0001 Function 2: ID 02d0:aae7 
    Common Address: 0x0001
    Common VDD voltage: 3.2-3.4V
    Common Type: SDIO
    Common Vendor ID: 0x02d0
    Common Device ID: 0xaae7
    Common Revision: 0.0
    Function ID: 2
    Function Vendor ID: 0x02d0
    Function Device ID: 0xaae7
    Function Class ID: 0x00
    Function Revision: 0.0
    Function Driver in use: (none)
diff mbox series

Patch

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
index 1967160f211eb..89bbad598782e 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
@@ -987,6 +987,7 @@  static const struct sdio_device_id brcmf_sdmmc_ids[] = {
 	BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_CYPRESS_4373, CYW),
 	BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_CYPRESS_43012, CYW),
 	BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_CYPRESS_43439, CYW),
+	BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_CYPRESS_43751, CYW),
 	BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_CYPRESS_43752, CYW),
 	BRCMF_SDIO_DEVICE(SDIO_DEVICE_ID_BROADCOM_CYPRESS_89359, CYW),
 	{ /* end: all zeroes */ }
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c
index 121893bbaa1d7..cab9257272dd4 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c
@@ -731,6 +731,7 @@  static u32 brcmf_chip_tcm_rambase(struct brcmf_chip_priv *ci)
 	case BRCM_CC_4364_CHIP_ID:
 	case CY_CC_4373_CHIP_ID:
 		return 0x160000;
+	case CY_CC_43751_CHIP_ID:
 	case CY_CC_43752_CHIP_ID:
 		return 0x170000;
 	case BRCM_CC_4378_CHIP_ID:
@@ -1433,6 +1434,7 @@  bool brcmf_chip_sr_capable(struct brcmf_chip *pub)
 		reg = chip->ops->read32(chip->ctx, addr);
 		return (reg & CC_SR_CTL0_ENABLE_MASK) != 0;
 	case BRCM_CC_4359_CHIP_ID:
+	case CY_CC_43751_CHIP_ID:
 	case CY_CC_43752_CHIP_ID:
 	case CY_CC_43012_CHIP_ID:
 		addr = CORE_CC_REG(pmu->base, retention_ctl);
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
index 6b38d9de71af6..55cc7b65081fd 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
@@ -624,6 +624,7 @@  BRCMF_FW_CLM_DEF(4356, "brcmfmac4356-sdio");
 BRCMF_FW_DEF(4359, "brcmfmac4359-sdio");
 BRCMF_FW_CLM_DEF(4373, "brcmfmac4373-sdio");
 BRCMF_FW_CLM_DEF(43012, "brcmfmac43012-sdio");
+BRCMF_FW_CLM_DEF(43751, "brcmfmac43751-sdio");
 BRCMF_FW_CLM_DEF(43752, "brcmfmac43752-sdio");
 
 /* firmware config files */
@@ -657,6 +658,7 @@  static const struct brcmf_firmware_mapping brcmf_sdio_fwnames[] = {
 	BRCMF_FW_ENTRY(CY_CC_4373_CHIP_ID, 0xFFFFFFFF, 4373),
 	BRCMF_FW_ENTRY(CY_CC_43012_CHIP_ID, 0xFFFFFFFF, 43012),
 	BRCMF_FW_ENTRY(CY_CC_43439_CHIP_ID, 0xFFFFFFFF, 43439),
+	BRCMF_FW_ENTRY(CY_CC_43751_CHIP_ID, 0xFFFFFFFF, 43751),
 	BRCMF_FW_ENTRY(CY_CC_43752_CHIP_ID, 0xFFFFFFFF, 43752)
 };
 
@@ -3425,6 +3427,7 @@  static int brcmf_sdio_download_firmware(struct brcmf_sdio *bus,
 static bool brcmf_sdio_aos_no_decode(struct brcmf_sdio *bus)
 {
 	if (bus->ci->chip == CY_CC_43012_CHIP_ID ||
+	    bus->ci->chip == CY_CC_43751_CHIP_ID ||
 	    bus->ci->chip == CY_CC_43752_CHIP_ID)
 		return true;
 	else
@@ -4274,6 +4277,7 @@  static void brcmf_sdio_firmware_callback(struct device *dev, int err,
 
 		switch (sdiod->func1->device) {
 		case SDIO_DEVICE_ID_BROADCOM_CYPRESS_4373:
+		case SDIO_DEVICE_ID_BROADCOM_CYPRESS_43751:
 		case SDIO_DEVICE_ID_BROADCOM_CYPRESS_43752:
 			brcmf_dbg(INFO, "set F2 watermark to 0x%x*4 bytes\n",
 				  CY_4373_F2_WATERMARK);
diff --git a/drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h b/drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h
index f4939cf627672..8b21da3e66291 100644
--- a/drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h
+++ b/drivers/net/wireless/broadcom/brcm80211/include/brcm_hw_ids.h
@@ -55,6 +55,7 @@ 
 #define CY_CC_4373_CHIP_ID		0x4373
 #define CY_CC_43012_CHIP_ID		43012
 #define CY_CC_43439_CHIP_ID		43439
+#define CY_CC_43751_CHIP_ID		43751
 #define CY_CC_43752_CHIP_ID		43752
 #define CY_CC_89459_CHIP_ID		0x4355
 
diff --git a/include/linux/mmc/sdio_ids.h b/include/linux/mmc/sdio_ids.h
index 74f9d9a6d3307..80e84958e570f 100644
--- a/include/linux/mmc/sdio_ids.h
+++ b/include/linux/mmc/sdio_ids.h
@@ -76,6 +76,7 @@ 
 #define SDIO_DEVICE_ID_BROADCOM_43430		0xa9a6
 #define SDIO_DEVICE_ID_BROADCOM_CYPRESS_43439	0xa9af
 #define SDIO_DEVICE_ID_BROADCOM_43455		0xa9bf
+#define SDIO_DEVICE_ID_BROADCOM_CYPRESS_43751	0xaae7
 #define SDIO_DEVICE_ID_BROADCOM_CYPRESS_43752	0xaae8
 
 #define SDIO_VENDOR_ID_MARVELL			0x02df