diff mbox series

arm64: fdt: fix membock add/cap ordering

Message ID 20211213134745.43505-1-kernelfans@gmail.com (mailing list archive)
State New, archived
Headers show
Series arm64: fdt: fix membock add/cap ordering | expand

Commit Message

Pingfan Liu Dec. 13, 2021, 1:47 p.m. UTC
During kdump kernel saves vmcore, it runs into the following bug:
...
[   15.148919] usercopy: Kernel memory exposure attempt detected from SLUB object 'kmem_cache_node' (offset 0, size 4096)!
[   15.159707] ------------[ cut here ]------------
[   15.164311] kernel BUG at mm/usercopy.c:99!
[   15.168482] Internal error: Oops - BUG: 0 [#1] SMP
[   15.173261] Modules linked in: xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce sbsa_gwdt ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm drm nvme nvme_core xgene_hwmon i2c_designware_platform i2c_designware_core dm_mirror dm_region_hash dm_log dm_mod overlay squashfs zstd_decompress loop
[   15.206186] CPU: 0 PID: 542 Comm: cp Not tainted 5.16.0-rc4 #1
[   15.212006] Hardware name: GIGABYTE R272-P30-JG/MP32-AR0-JG, BIOS F12 (SCP: 1.5.20210426) 05/13/2021
[   15.221125] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   15.228073] pc : usercopy_abort+0x9c/0xa0
[   15.232074] lr : usercopy_abort+0x9c/0xa0
[   15.236070] sp : ffff8000121abba0
[   15.239371] x29: ffff8000121abbb0 x28: 0000000000003000 x27: 0000000000000000
[   15.246494] x26: 0000000080000400 x25: 0000ffff885c7000 x24: 0000000000000000
[   15.253617] x23: 000007ff80400000 x22: ffff07ff80401000 x21: 0000000000000001
[   15.260739] x20: 0000000000001000 x19: ffff07ff80400000 x18: ffffffffffffffff
[   15.267861] x17: 656a626f2042554c x16: 53206d6f72662064 x15: 6574636574656420
[   15.274983] x14: 74706d6574746120 x13: 2129363930342065 x12: 7a6973202c302074
[   15.282105] x11: ffffc8b041d1b148 x10: 00000000ffff8000 x9 : ffffc8b04012812c
[   15.289228] x8 : 00000000ffff7fff x7 : ffffc8b041d1b148 x6 : 0000000000000000
[   15.296349] x5 : 0000000000000000 x4 : 0000000000007fff x3 : 0000000000000000
[   15.303471] x2 : 0000000000000000 x1 : ffff07ff8c064800 x0 : 000000000000006b
[   15.310593] Call trace:
[   15.313027]  usercopy_abort+0x9c/0xa0
[   15.316677]  __check_heap_object+0xd4/0xf0
[   15.320762]  __check_object_size.part.0+0x160/0x1e0
[   15.325628]  __check_object_size+0x2c/0x40
[   15.329711]  copy_oldmem_page+0x7c/0x140
[   15.333623]  read_from_oldmem.part.0+0xfc/0x1c0
[   15.338142]  __read_vmcore.constprop.0+0x23c/0x350
[   15.342920]  read_vmcore+0x28/0x34
[   15.346309]  proc_reg_read+0xb4/0xf0
[   15.349871]  vfs_read+0xb8/0x1f0
[   15.353088]  ksys_read+0x74/0x100
[   15.356390]  __arm64_sys_read+0x28/0x34
...

This bug introduced by commit b261dba2fdb2 ("arm64: kdump: Remove custom
linux,usable-memory-range handling"), which moves
memblock_cap_memory_range() to fdt, but it breaches the rules that
memblock_cap_memory_range() should come after memblock_add() etc as said
in commit e888fa7bb882 ("memblock: Check memory add/cap ordering").

As a consequence, the virtual address set up by copy_oldmem_page() does
not bail out from the test of virt_addr_valid() in check_heap_object(),
and finally hits the BUG_ON().

Since memblock allocator has no idea about the time point of the full
memblock's population, resolving this issue at arch level code by
calling a new interface early_init_dt_cap_memory_range() exposed by fdt.

Fixes: b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Nick Terrell <terrelln@fb.com>
To: linux-arm-kernel@lists.infradead.org
To: devicetree@vger.kernel.org
---
 arch/arm64/kernel/setup.c | 1 +
 drivers/of/fdt.c          | 5 +++++
 include/linux/of_fdt.h    | 1 +
 3 files changed, 7 insertions(+)

Comments

Rob Herring Dec. 13, 2021, 3:31 p.m. UTC | #1
+Zhen Lei

On Mon, Dec 13, 2021 at 7:48 AM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> During kdump kernel saves vmcore, it runs into the following bug:
> ...
> [   15.148919] usercopy: Kernel memory exposure attempt detected from SLUB object 'kmem_cache_node' (offset 0, size 4096)!
> [   15.159707] ------------[ cut here ]------------
> [   15.164311] kernel BUG at mm/usercopy.c:99!
> [   15.168482] Internal error: Oops - BUG: 0 [#1] SMP
> [   15.173261] Modules linked in: xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce sbsa_gwdt ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm drm nvme nvme_core xgene_hwmon i2c_designware_platform i2c_designware_core dm_mirror dm_region_hash dm_log dm_mod overlay squashfs zstd_decompress loop
> [   15.206186] CPU: 0 PID: 542 Comm: cp Not tainted 5.16.0-rc4 #1
> [   15.212006] Hardware name: GIGABYTE R272-P30-JG/MP32-AR0-JG, BIOS F12 (SCP: 1.5.20210426) 05/13/2021
> [   15.221125] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   15.228073] pc : usercopy_abort+0x9c/0xa0
> [   15.232074] lr : usercopy_abort+0x9c/0xa0
> [   15.236070] sp : ffff8000121abba0
> [   15.239371] x29: ffff8000121abbb0 x28: 0000000000003000 x27: 0000000000000000
> [   15.246494] x26: 0000000080000400 x25: 0000ffff885c7000 x24: 0000000000000000
> [   15.253617] x23: 000007ff80400000 x22: ffff07ff80401000 x21: 0000000000000001
> [   15.260739] x20: 0000000000001000 x19: ffff07ff80400000 x18: ffffffffffffffff
> [   15.267861] x17: 656a626f2042554c x16: 53206d6f72662064 x15: 6574636574656420
> [   15.274983] x14: 74706d6574746120 x13: 2129363930342065 x12: 7a6973202c302074
> [   15.282105] x11: ffffc8b041d1b148 x10: 00000000ffff8000 x9 : ffffc8b04012812c
> [   15.289228] x8 : 00000000ffff7fff x7 : ffffc8b041d1b148 x6 : 0000000000000000
> [   15.296349] x5 : 0000000000000000 x4 : 0000000000007fff x3 : 0000000000000000
> [   15.303471] x2 : 0000000000000000 x1 : ffff07ff8c064800 x0 : 000000000000006b
> [   15.310593] Call trace:
> [   15.313027]  usercopy_abort+0x9c/0xa0
> [   15.316677]  __check_heap_object+0xd4/0xf0
> [   15.320762]  __check_object_size.part.0+0x160/0x1e0
> [   15.325628]  __check_object_size+0x2c/0x40
> [   15.329711]  copy_oldmem_page+0x7c/0x140
> [   15.333623]  read_from_oldmem.part.0+0xfc/0x1c0
> [   15.338142]  __read_vmcore.constprop.0+0x23c/0x350
> [   15.342920]  read_vmcore+0x28/0x34
> [   15.346309]  proc_reg_read+0xb4/0xf0
> [   15.349871]  vfs_read+0xb8/0x1f0
> [   15.353088]  ksys_read+0x74/0x100
> [   15.356390]  __arm64_sys_read+0x28/0x34
> ...
>
> This bug introduced by commit b261dba2fdb2 ("arm64: kdump: Remove custom
> linux,usable-memory-range handling"), which moves
> memblock_cap_memory_range() to fdt, but it breaches the rules that
> memblock_cap_memory_range() should come after memblock_add() etc as said
> in commit e888fa7bb882 ("memblock: Check memory add/cap ordering").

Presumably only when using EFI boot which throws out any DT memblock setup.

>
> As a consequence, the virtual address set up by copy_oldmem_page() does
> not bail out from the test of virt_addr_valid() in check_heap_object(),
> and finally hits the BUG_ON().
>
> Since memblock allocator has no idea about the time point of the full
> memblock's population, resolving this issue at arch level code by
> calling a new interface early_init_dt_cap_memory_range() exposed by fdt.
>
> Fixes: b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: Geert Uytterhoeven <geert+renesas@glider.be>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: Frank Rowand <frowand.list@gmail.com>
> Cc: Nick Terrell <terrelln@fb.com>
> To: linux-arm-kernel@lists.infradead.org
> To: devicetree@vger.kernel.org
> ---
>  arch/arm64/kernel/setup.c | 1 +
>  drivers/of/fdt.c          | 5 +++++
>  include/linux/of_fdt.h    | 1 +
>  3 files changed, 7 insertions(+)
>
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index be5f85b0a24d..353e5171a66c 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -331,6 +331,7 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
>
>         xen_early_init();
>         efi_init();
> +       early_init_dt_cap_memory_range();

As it is efi_init() that does all the memblock setup, I think this
belongs in efi_init() because it would be needed for any arch that
uses EFI for memory setup.

>
>         if (!efi_enabled(EFI_BOOT) && ((u64)_text % MIN_KIMG_ALIGN) != 0)
>              pr_warn(FW_BUG "Kernel image misaligned at boot, please fix your bootloader!");
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index bdca35284ceb..bb7e8fc3a334 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -1278,6 +1278,11 @@ void __init early_init_dt_scan_nodes(void)
>         memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
>  }
>
> +void __init early_init_dt_cap_memory_range(void)
> +{
> +       memblock_cap_memory_range(cap_mem_addr, cap_mem_size);

This code is changing in [1] which I think should make fixing this
issue easier. I'd suggest making
early_init_dt_check_for_usable_mem_range() non-static and the EFI code
can call it.

I'm also curious how folks testing that series don't hit this issue.

Rob

[1] https://lore.kernel.org/all/20211210065533.2023-9-thunder.leizhen@huawei.com/
Pingfan Liu Dec. 14, 2021, 3:06 a.m. UTC | #2
On Mon, Dec 13, 2021 at 09:31:41AM -0600, Rob Herring wrote:
> +Zhen Lei
> 
> On Mon, Dec 13, 2021 at 7:48 AM Pingfan Liu <kernelfans@gmail.com> wrote:
> >
> > During kdump kernel saves vmcore, it runs into the following bug:
> > ...
> > [   15.148919] usercopy: Kernel memory exposure attempt detected from SLUB object 'kmem_cache_node' (offset 0, size 4096)!
> > [   15.159707] ------------[ cut here ]------------
> > [   15.164311] kernel BUG at mm/usercopy.c:99!
> > [   15.168482] Internal error: Oops - BUG: 0 [#1] SMP
> > [   15.173261] Modules linked in: xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce sbsa_gwdt ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm drm nvme nvme_core xgene_hwmon i2c_designware_platform i2c_designware_core dm_mirror dm_region_hash dm_log dm_mod overlay squashfs zstd_decompress loop
> > [   15.206186] CPU: 0 PID: 542 Comm: cp Not tainted 5.16.0-rc4 #1
> > [   15.212006] Hardware name: GIGABYTE R272-P30-JG/MP32-AR0-JG, BIOS F12 (SCP: 1.5.20210426) 05/13/2021
> > [   15.221125] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [   15.228073] pc : usercopy_abort+0x9c/0xa0
> > [   15.232074] lr : usercopy_abort+0x9c/0xa0
> > [   15.236070] sp : ffff8000121abba0
> > [   15.239371] x29: ffff8000121abbb0 x28: 0000000000003000 x27: 0000000000000000
> > [   15.246494] x26: 0000000080000400 x25: 0000ffff885c7000 x24: 0000000000000000
> > [   15.253617] x23: 000007ff80400000 x22: ffff07ff80401000 x21: 0000000000000001
> > [   15.260739] x20: 0000000000001000 x19: ffff07ff80400000 x18: ffffffffffffffff
> > [   15.267861] x17: 656a626f2042554c x16: 53206d6f72662064 x15: 6574636574656420
> > [   15.274983] x14: 74706d6574746120 x13: 2129363930342065 x12: 7a6973202c302074
> > [   15.282105] x11: ffffc8b041d1b148 x10: 00000000ffff8000 x9 : ffffc8b04012812c
> > [   15.289228] x8 : 00000000ffff7fff x7 : ffffc8b041d1b148 x6 : 0000000000000000
> > [   15.296349] x5 : 0000000000000000 x4 : 0000000000007fff x3 : 0000000000000000
> > [   15.303471] x2 : 0000000000000000 x1 : ffff07ff8c064800 x0 : 000000000000006b
> > [   15.310593] Call trace:
> > [   15.313027]  usercopy_abort+0x9c/0xa0
> > [   15.316677]  __check_heap_object+0xd4/0xf0
> > [   15.320762]  __check_object_size.part.0+0x160/0x1e0
> > [   15.325628]  __check_object_size+0x2c/0x40
> > [   15.329711]  copy_oldmem_page+0x7c/0x140
> > [   15.333623]  read_from_oldmem.part.0+0xfc/0x1c0
> > [   15.338142]  __read_vmcore.constprop.0+0x23c/0x350
> > [   15.342920]  read_vmcore+0x28/0x34
> > [   15.346309]  proc_reg_read+0xb4/0xf0
> > [   15.349871]  vfs_read+0xb8/0x1f0
> > [   15.353088]  ksys_read+0x74/0x100
> > [   15.356390]  __arm64_sys_read+0x28/0x34
> > ...
> >
> > This bug introduced by commit b261dba2fdb2 ("arm64: kdump: Remove custom
> > linux,usable-memory-range handling"), which moves
> > memblock_cap_memory_range() to fdt, but it breaches the rules that
> > memblock_cap_memory_range() should come after memblock_add() etc as said
> > in commit e888fa7bb882 ("memblock: Check memory add/cap ordering").
> 
> Presumably only when using EFI boot which throws out any DT memblock setup.
> 
Yes, I do think so.

> >
> > As a consequence, the virtual address set up by copy_oldmem_page() does
> > not bail out from the test of virt_addr_valid() in check_heap_object(),
> > and finally hits the BUG_ON().
> >
> > Since memblock allocator has no idea about the time point of the full
> > memblock's population, resolving this issue at arch level code by
> > calling a new interface early_init_dt_cap_memory_range() exposed by fdt.
> >
> > Fixes: b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
> > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Mike Rapoport <rppt@kernel.org>
> > Cc: Geert Uytterhoeven <geert+renesas@glider.be>
> > Cc: Rob Herring <robh+dt@kernel.org>
> > Cc: Frank Rowand <frowand.list@gmail.com>
> > Cc: Nick Terrell <terrelln@fb.com>
> > To: linux-arm-kernel@lists.infradead.org
> > To: devicetree@vger.kernel.org
> > ---
> >  arch/arm64/kernel/setup.c | 1 +
> >  drivers/of/fdt.c          | 5 +++++
> >  include/linux/of_fdt.h    | 1 +
> >  3 files changed, 7 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> > index be5f85b0a24d..353e5171a66c 100644
> > --- a/arch/arm64/kernel/setup.c
> > +++ b/arch/arm64/kernel/setup.c
> > @@ -331,6 +331,7 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
> >
> >         xen_early_init();
> >         efi_init();
> > +       early_init_dt_cap_memory_range();
> 
> As it is efi_init() that does all the memblock setup, I think this
> belongs in efi_init() because it would be needed for any arch that
> uses EFI for memory setup.
> 

Reasonable, and efi_init() are shared by arm64 and risc-v now.

> >
> >         if (!efi_enabled(EFI_BOOT) && ((u64)_text % MIN_KIMG_ALIGN) != 0)
> >              pr_warn(FW_BUG "Kernel image misaligned at boot, please fix your bootloader!");
> > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> > index bdca35284ceb..bb7e8fc3a334 100644
> > --- a/drivers/of/fdt.c
> > +++ b/drivers/of/fdt.c
> > @@ -1278,6 +1278,11 @@ void __init early_init_dt_scan_nodes(void)
> >         memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
> >  }
> >
> > +void __init early_init_dt_cap_memory_range(void)
> > +{
> > +       memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
> 
> This code is changing in [1] which I think should make fixing this
> issue easier. I'd suggest making
> early_init_dt_check_for_usable_mem_range() non-static and the EFI code
> can call it.
> 
Thank you for the good advice. I will go in that way.

> I'm also curious how folks testing that series don't hit this issue.
> 

Maybe CONFIG_EFI=n.


Thanks,

	Pingfan

> Rob
> 
> [1] https://lore.kernel.org/all/20211210065533.2023-9-thunder.leizhen@huawei.com/
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
diff mbox series

Patch

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index be5f85b0a24d..353e5171a66c 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -331,6 +331,7 @@  void __init __no_sanitize_address setup_arch(char **cmdline_p)
 
 	xen_early_init();
 	efi_init();
+	early_init_dt_cap_memory_range();
 
 	if (!efi_enabled(EFI_BOOT) && ((u64)_text % MIN_KIMG_ALIGN) != 0)
 	     pr_warn(FW_BUG "Kernel image misaligned at boot, please fix your bootloader!");
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index bdca35284ceb..bb7e8fc3a334 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -1278,6 +1278,11 @@  void __init early_init_dt_scan_nodes(void)
 	memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
 }
 
+void __init early_init_dt_cap_memory_range(void)
+{
+	memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
+}
+
 bool __init early_init_dt_scan(void *params)
 {
 	bool status;
diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
index cf48983d3c86..3c8a01627c02 100644
--- a/include/linux/of_fdt.h
+++ b/include/linux/of_fdt.h
@@ -75,6 +75,7 @@  extern int early_init_dt_scan_root(unsigned long node, const char *uname,
 extern bool early_init_dt_scan(void *params);
 extern bool early_init_dt_verify(void *params);
 extern void early_init_dt_scan_nodes(void);
+extern void early_init_dt_cap_memory_range(void);
 
 extern const char *of_flat_dt_get_machine_name(void);
 extern const void *of_flat_dt_match_machine(const void *default_match,