diff mbox series

[v5,01/14] clk: microchip: mpfs: fix clk_cfg array bounds violation

Message ID 20220909123123.2699583-2-conor.dooley@microchip.com (mailing list archive)
State Awaiting Upstream, archived
Headers show
Series PolarFire SoC reset controller & clock cleanups | expand

Commit Message

Conor Dooley Sept. 9, 2022, 12:31 p.m. UTC
There is an array bounds violation present during clock registration,
triggered by current code by only specific toolchains. This seems to
fail gracefully in v6.0-rc1, using a toolchain build from the riscv-
gnu-toolchain repo and with clang-15, and life carries on. While
converting the driver to use standard clock structs/ops, kernel panics
were seen during boot when built with clang-15:

[    0.581754] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b1
[    0.591520] Oops [#1]
[    0.594045] Modules linked in:
[    0.597435] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00011-g8e1459cf4eca #1
[    0.606188] Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
[    0.613012] epc : __clk_register+0x4a6/0x85c
[    0.617759]  ra : __clk_register+0x49e/0x85c
[    0.622489] epc : ffffffff803faf7c ra : ffffffff803faf74 sp : ffffffc80400b720
[    0.630466]  gp : ffffffff810e93f8 tp : ffffffe77fe60000 t0 : ffffffe77ffb3800
[    0.638443]  t1 : 000000000000000a t2 : ffffffffffffffff s0 : ffffffc80400b7c0
[    0.646420]  s1 : 0000000000000001 a0 : 0000000000000001 a1 : 0000000000000000
[    0.654396]  a2 : 0000000000000001 a3 : 0000000000000000 a4 : 0000000000000000
[    0.662373]  a5 : ffffffff803a5810 a6 : 0000000200000022 a7 : 0000000000000006
[    0.670350]  s2 : ffffffff81099d48 s3 : ffffffff80d6e28e s4 : 0000000000000028
[    0.678327]  s5 : ffffffff810ed3c8 s6 : ffffffff810ed3d0 s7 : ffffffe77ffbc100
[    0.686304]  s8 : ffffffe77ffb1540 s9 : ffffffe77ffb1540 s10: 0000000000000008
[    0.694281]  s11: 0000000000000000 t3 : 00000000000000c6 t4 : 0000000000000007
[    0.702258]  t5 : ffffffff810c78c0 t6 : ffffffe77ff88cd0
[    0.708125] status: 0000000200000120 badaddr: 00000000000000b1 cause: 000000000000000d
[    0.716869] [<ffffffff803fb892>] devm_clk_hw_register+0x62/0xaa
[    0.723420] [<ffffffff80403412>] mpfs_clk_probe+0x1e0/0x244

In v6.0-rc1 and later, this issue is visible without the follow on
patches doing the conversion using toolchains provided by our Yocto
meta layer too.

It fails on "clk_periph_timer" - which uses a different parent, that it
tries to find using the macro:
\#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT].cfg.hw)

If parent is RTCREF, so the macro becomes: &mpfs_cfg_clks[33].cfg.hw
which is well beyond the end of the array. Amazingly, builds with GCC
11.1 see no problem here, booting correctly and hooking the parent up
etc. Builds with clang-15 do not, with the above panic.

Change the macro to use specific offsets depending on the parent rather
than the dt-binding's clock IDs.

Fixes: 1c6a7ea32b8c ("clk: microchip: mpfs: add RTCREF clock control")
CC: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
---
 drivers/clk/microchip/clk-mpfs.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Claudiu Beznea Sept. 12, 2022, 7:40 a.m. UTC | #1
On 09.09.2022 15:31, Conor Dooley wrote:
> There is an array bounds violation present during clock registration,
> triggered by current code by only specific toolchains. This seems to
> fail gracefully in v6.0-rc1, using a toolchain build from the riscv-
> gnu-toolchain repo and with clang-15, and life carries on. While
> converting the driver to use standard clock structs/ops, kernel panics
> were seen during boot when built with clang-15:
> 
> [    0.581754] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b1
> [    0.591520] Oops [#1]
> [    0.594045] Modules linked in:
> [    0.597435] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00011-g8e1459cf4eca #1
> [    0.606188] Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
> [    0.613012] epc : __clk_register+0x4a6/0x85c
> [    0.617759]  ra : __clk_register+0x49e/0x85c
> [    0.622489] epc : ffffffff803faf7c ra : ffffffff803faf74 sp : ffffffc80400b720
> [    0.630466]  gp : ffffffff810e93f8 tp : ffffffe77fe60000 t0 : ffffffe77ffb3800
> [    0.638443]  t1 : 000000000000000a t2 : ffffffffffffffff s0 : ffffffc80400b7c0
> [    0.646420]  s1 : 0000000000000001 a0 : 0000000000000001 a1 : 0000000000000000
> [    0.654396]  a2 : 0000000000000001 a3 : 0000000000000000 a4 : 0000000000000000
> [    0.662373]  a5 : ffffffff803a5810 a6 : 0000000200000022 a7 : 0000000000000006
> [    0.670350]  s2 : ffffffff81099d48 s3 : ffffffff80d6e28e s4 : 0000000000000028
> [    0.678327]  s5 : ffffffff810ed3c8 s6 : ffffffff810ed3d0 s7 : ffffffe77ffbc100
> [    0.686304]  s8 : ffffffe77ffb1540 s9 : ffffffe77ffb1540 s10: 0000000000000008
> [    0.694281]  s11: 0000000000000000 t3 : 00000000000000c6 t4 : 0000000000000007
> [    0.702258]  t5 : ffffffff810c78c0 t6 : ffffffe77ff88cd0
> [    0.708125] status: 0000000200000120 badaddr: 00000000000000b1 cause: 000000000000000d
> [    0.716869] [<ffffffff803fb892>] devm_clk_hw_register+0x62/0xaa
> [    0.723420] [<ffffffff80403412>] mpfs_clk_probe+0x1e0/0x244
> 
> In v6.0-rc1 and later, this issue is visible without the follow on
> patches doing the conversion using toolchains provided by our Yocto
> meta layer too.
> 
> It fails on "clk_periph_timer" - which uses a different parent, that it
> tries to find using the macro:
> \#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT].cfg.hw)
> 
> If parent is RTCREF, so the macro becomes: &mpfs_cfg_clks[33].cfg.hw
> which is well beyond the end of the array. Amazingly, builds with GCC
> 11.1 see no problem here, booting correctly and hooking the parent up
> etc. Builds with clang-15 do not, with the above panic.
> 
> Change the macro to use specific offsets depending on the parent rather
> than the dt-binding's clock IDs.
> 
> Fixes: 1c6a7ea32b8c ("clk: microchip: mpfs: add RTCREF clock control")
> CC: Nathan Chancellor <nathan@kernel.org>
> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>


> ---
>  drivers/clk/microchip/clk-mpfs.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/clk/microchip/clk-mpfs.c b/drivers/clk/microchip/clk-mpfs.c
> index 070c3b896559..f0f9c9a1cc48 100644
> --- a/drivers/clk/microchip/clk-mpfs.c
> +++ b/drivers/clk/microchip/clk-mpfs.c
> @@ -239,6 +239,11 @@ static const struct clk_ops mpfs_clk_cfg_ops = {
>  	.hw.init = CLK_HW_INIT(_name, _parent, &mpfs_clk_cfg_ops, 0),			\
>  }
>  
> +#define CLK_CPU_OFFSET		0u
> +#define CLK_AXI_OFFSET		1u
> +#define CLK_AHB_OFFSET		2u
> +#define CLK_RTCREF_OFFSET	3u
> +
>  static struct mpfs_cfg_hw_clock mpfs_cfg_clks[] = {
>  	CLK_CFG(CLK_CPU, "clk_cpu", "clk_msspll", 0, 2, mpfs_div_cpu_axi_table, 0,
>  		REG_CLOCK_CONFIG_CR),
> @@ -362,7 +367,7 @@ static const struct clk_ops mpfs_periph_clk_ops = {
>  				  _flags),					\
>  }
>  
> -#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT].hw)
> +#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT##_OFFSET].hw)
>  
>  /*
>   * Critical clocks:
Stephen Boyd Sept. 29, 2022, 12:30 a.m. UTC | #2
Quoting Conor Dooley (2022-09-09 05:31:10)
> There is an array bounds violation present during clock registration,
> triggered by current code by only specific toolchains. This seems to
> fail gracefully in v6.0-rc1, using a toolchain build from the riscv-
> gnu-toolchain repo and with clang-15, and life carries on. While
> converting the driver to use standard clock structs/ops, kernel panics
> were seen during boot when built with clang-15:
> 
[...]
> 
> If parent is RTCREF, so the macro becomes: &mpfs_cfg_clks[33].cfg.hw
> which is well beyond the end of the array. Amazingly, builds with GCC
> 11.1 see no problem here, booting correctly and hooking the parent up
> etc. Builds with clang-15 do not, with the above panic.
> 
> Change the macro to use specific offsets depending on the parent rather
> than the dt-binding's clock IDs.
> 
> Fixes: 1c6a7ea32b8c ("clk: microchip: mpfs: add RTCREF clock control")
> CC: Nathan Chancellor <nathan@kernel.org>
> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
> ---

I'll merge this patch over to clk-fixes as well.
Stephen Boyd Sept. 29, 2022, 12:32 a.m. UTC | #3
Quoting Stephen Boyd (2022-09-28 17:30:28)
> Quoting Conor Dooley (2022-09-09 05:31:10)
> > There is an array bounds violation present during clock registration,
> > triggered by current code by only specific toolchains. This seems to
> > fail gracefully in v6.0-rc1, using a toolchain build from the riscv-
> > gnu-toolchain repo and with clang-15, and life carries on. While
> > converting the driver to use standard clock structs/ops, kernel panics
> > were seen during boot when built with clang-15:
> > 
> [...]
> > 
> > If parent is RTCREF, so the macro becomes: &mpfs_cfg_clks[33].cfg.hw
> > which is well beyond the end of the array. Amazingly, builds with GCC
> > 11.1 see no problem here, booting correctly and hooking the parent up
> > etc. Builds with clang-15 do not, with the above panic.
> > 
> > Change the macro to use specific offsets depending on the parent rather
> > than the dt-binding's clock IDs.
> > 
> > Fixes: 1c6a7ea32b8c ("clk: microchip: mpfs: add RTCREF clock control")
> > CC: Nathan Chancellor <nathan@kernel.org>
> > Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
> > ---
> 
> I'll merge this patch over to clk-fixes as well.

Great I see it's already split out and on fixes branch. Thanks!
diff mbox series

Patch

diff --git a/drivers/clk/microchip/clk-mpfs.c b/drivers/clk/microchip/clk-mpfs.c
index 070c3b896559..f0f9c9a1cc48 100644
--- a/drivers/clk/microchip/clk-mpfs.c
+++ b/drivers/clk/microchip/clk-mpfs.c
@@ -239,6 +239,11 @@  static const struct clk_ops mpfs_clk_cfg_ops = {
 	.hw.init = CLK_HW_INIT(_name, _parent, &mpfs_clk_cfg_ops, 0),			\
 }
 
+#define CLK_CPU_OFFSET		0u
+#define CLK_AXI_OFFSET		1u
+#define CLK_AHB_OFFSET		2u
+#define CLK_RTCREF_OFFSET	3u
+
 static struct mpfs_cfg_hw_clock mpfs_cfg_clks[] = {
 	CLK_CFG(CLK_CPU, "clk_cpu", "clk_msspll", 0, 2, mpfs_div_cpu_axi_table, 0,
 		REG_CLOCK_CONFIG_CR),
@@ -362,7 +367,7 @@  static const struct clk_ops mpfs_periph_clk_ops = {
 				  _flags),					\
 }
 
-#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT].hw)
+#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT##_OFFSET].hw)
 
 /*
  * Critical clocks: