diff mbox series

Issue with booting multi_v5_defconfig kernel with GCC 11

Message ID YcS4xVWs6bQlQSPC@archlinux-ax161 (mailing list archive)
State New, archived
Headers show
Series Issue with booting multi_v5_defconfig kernel with GCC 11 | expand

Commit Message

Nathan Chancellor Dec. 23, 2021, 5:58 p.m. UTC
Hi,

My apologies if this has already been reported somewhere, I did a search
of lore.kernel.org and did not really come up with anything.

I was recently investigating a regression in booting a
multi_v5_defconfig kernel on -next [1] and I noticed that building the
kernel with GCC 11 causes it to no longer boot due to a panic in
da850_evm_config_emac().

$ qemu-system-arm \
    -initrd rootfs.cpio \
    -machine palmetto-bmc \
    -no-reboot \
    -dtb arch/arm/boot/dts/aspeed-bmc-opp-palmetto.dtb \
    -display none \
    -kernel arch/arm/boot/zImage \
    -m 512m \
    -nodefaults \
    -serial mon:stdio
...
8<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 00000020
pgd = (ptrval)
[00000020] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 5.15.0 #1
Hardware name: Generic DT based system
PC is at da850_evm_config_emac+0x1c/0x120
LR is at do_one_initcall+0x50/0x1e0
pc : [<c0b8f830>]    lr : [<c00098f4>]    psr: a0000153
sp : c106dee8  ip : c0b863e0  fp : c0b863e0
r10: c0dd3000  r9 : c0b4bea0  r8 : c0bc8830
r7 : 00000000  r6 : c0dd3288  r5 : ffffe000  r4 : c0b8f814
r3 : 00000000  r2 : 0000086d  r1 : ffffffff  r0 : 00000000
Flags: NzCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment none
Control: 00093177  Table: 40004000  DAC: 00000053
Register r0 information: NULL pointer
Register r1 information: non-paged memory
Register r2 information: non-paged memory
Register r3 information: NULL pointer
Register r4 information: non-slab/vmalloc memory
Register r5 information: non-paged memory
Register r6 information: non-slab/vmalloc memory
Register r7 information: NULL pointer
Register r8 information: non-slab/vmalloc memory
Register r9 information: non-slab/vmalloc memory
Register r10 information: non-slab/vmalloc memory
Register r11 information: non-slab/vmalloc memory
Register r12 information: non-slab/vmalloc memory
Process swapper (pid: 1, stack limit = 0x(ptrval))
Stack: (0xc106dee8 to 0xc106e000)
dee0:                   c0b8f814 ffffe000 c0dd3000 00000000 c0bc8830 c00098f4
df00: c10ad600 c10ad608 c10ad5ff c003d8d8 00000000 c0ada9f0 000000f5 00000000
df20: 00000000 c0b863e0 00000006 00000006 00000011 c10ad615 00000000 6d86ee9a
df40: c1001100 00000006 000000f5 6d86ee9a c0bc884c c0bd9808 00000007 c10ad600
df60: c0bc8850 c0b87134 00000006 00000006 00000000 c0b863e0 00000000 000000f5
df80: c106c000 00000000 c08fa5e8 00000000 00000000 00000000 00000000 00000000
dfa0: 00000000 c08fa5f8 00000000 c0008548 00000000 00000000 00000000 00000000
dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[<c0b8f830>] (da850_evm_config_emac) from [<c00098f4>] (do_one_initcall+0x50/0x1e0)
[<c00098f4>] (do_one_initcall) from [<c0b87134>] (kernel_init_freeable+0x17c/0x204)
[<c0b87134>] (kernel_init_freeable) from [<c08fa5f8>] (kernel_init+0x10/0x108)
[<c08fa5f8>] (kernel_init) from [<c0008548>] (ret_from_fork+0x14/0x2c)
Exception stack(0xc106dfb0 to 0xc106dff8)
dfa0:                                     00000000 00000000 00000000 00000000
dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
Code: e59f20dc e5931000 e5963044 e1510002 (e5d37020)
Unpacking initramfs...
---[ end trace 4a15a77f3ebdc580 ]---

Initially, I thought this might be a GCC regression so I bisected and
landed on commit dae673abd37 ("tree-optimization/97307 - improve sinking
of loads"), which I really think just exposes something here.

If I add some rudimentary debug prints, I can see that
soc_info->emac_pdata is NULL, both with GCC 10 and GCC 11.

GCC 11:

da850_evm_config_emac: soc_info: c0dd3288
da850_evm_config_emac: emac_pdata: 00000000

GCC 10:

da850_evm_config_emac: soc_info: c0dd7288
da850_evm_config_emac: emac_pdata: 00000000

I don't really see how that is possible, as I can see this member
assigned in the davinci_soc_info_* variables but maybe there is some
race during initialization or problem because the QEMU machine is not
one of these machines?

I think that the GCC commit just makes it so that the rmii_en member is
read earlier, rather than at the point of use.

Regardless, something like this resolves this panic for me but I have no
idea if it is proper or not, hence just the report :)


[1]: https://lore.kernel.org/r/YcNsG0Lp94V13whH@archlinux-ax161/

Cheers,
Nathan

Comments

Arnd Bergmann Dec. 23, 2021, 9:48 p.m. UTC | #1
On Thu, Dec 23, 2021 at 6:58 PM Nathan Chancellor <nathan@kernel.org> wrote:

> da850_evm_config_emac: soc_info: c0dd7288
> da850_evm_config_emac: emac_pdata: 00000000
>
> I don't really see how that is possible, as I can see this member
> assigned in the davinci_soc_info_* variables but maybe there is some
> race during initialization or problem because the QEMU machine is not
> one of these machines?

Right, the problem is that da850_evm_config_emac() gets called
unconditionally for all machines, but davinci_soc_info only gets
populated on davinci machines.

> I think that the GCC commit just makes it so that the rmii_en member is
> read earlier, rather than at the point of use.
>
> Regardless, something like this resolves this panic for me but I have no
> idea if it is proper or not, hence just the report :)
>
> diff --git a/arch/arm/mach-davinci/board-da850-evm.c b/arch/arm/mach-davinci/board-da850-evm.c
> index 428012687a80..7f7f6bae21c2 100644
> --- a/arch/arm/mach-davinci/board-da850-evm.c
> +++ b/arch/arm/mach-davinci/board-da850-evm.c
> @@ -1101,11 +1101,13 @@ static int __init da850_evm_config_emac(void)
>         int ret;
>         u32 val;
>         struct davinci_soc_info *soc_info = &davinci_soc_info;
> -       u8 rmii_en = soc_info->emac_pdata->rmii_en;
> +       u8 rmii_en;
>
>         if (!machine_is_davinci_da850_evm())
>                 return 0;
>
> +       rmii_en = soc_info->emac_pdata->rmii_en;
> +
>         cfg_chip3_base = DA8XX_SYSCFG0_VIRT(DA8XX_CFGCHIP3_REG);
>
>         val = __raw_readl(cfg_chip3_base);
>

This is the correct fix.

Fixes: bae105879f2f ("davinci: DA850/OMAP-L138 EVM: implement
autodetect of RMII PHY")
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
diff mbox series

Patch

diff --git a/arch/arm/mach-davinci/board-da850-evm.c b/arch/arm/mach-davinci/board-da850-evm.c
index 428012687a80..7f7f6bae21c2 100644
--- a/arch/arm/mach-davinci/board-da850-evm.c
+++ b/arch/arm/mach-davinci/board-da850-evm.c
@@ -1101,11 +1101,13 @@  static int __init da850_evm_config_emac(void)
        int ret;
        u32 val;
        struct davinci_soc_info *soc_info = &davinci_soc_info;
-       u8 rmii_en = soc_info->emac_pdata->rmii_en;
+       u8 rmii_en;

        if (!machine_is_davinci_da850_evm())
                return 0;

+       rmii_en = soc_info->emac_pdata->rmii_en;
+
        cfg_chip3_base = DA8XX_SYSCFG0_VIRT(DA8XX_CFGCHIP3_REG);

        val = __raw_readl(cfg_chip3_base);