diff mbox series

arm64: fix .idmap.text assertion for large kernels

Message ID 20230220162317.1581208-1-mark.rutland@arm.com (mailing list archive)
State New, archived
Headers show
Series arm64: fix .idmap.text assertion for large kernels | expand

Commit Message

Mark Rutland Feb. 20, 2023, 4:23 p.m. UTC
When building a kernel with many debug options enabled (which happens in
test configurations use by myself and syzbot), the kernel can become
large enough that portions of .text can be more than 128M away from
.idmap.text (which is placed inside the .rodata section). Where idmap
code branches into .text, the linker will place veneers in the
.idmap.text section to make those branches possible.

Unfortunately, as Ard reports, GNU LD has bseen observed to add 4K of
padding when adding such veneers, e.g.

| .idmap.text    0xffffffc01e48e5c0      0x32c arch/arm64/mm/proc.o
|                0xffffffc01e48e5c0                idmap_cpu_replace_ttbr1
|                0xffffffc01e48e600                idmap_kpti_install_ng_mappings
|                0xffffffc01e48e800                __cpu_setup
| *fill*         0xffffffc01e48e8ec        0x4
| .idmap.text.stub
|                0xffffffc01e48e8f0       0x18 linker stubs
|                0xffffffc01e48f8f0                __idmap_text_end = .
|                0xffffffc01e48f000                . = ALIGN (0x1000)
| *fill*         0xffffffc01e48f8f0      0x710
|                0xffffffc01e490000                idmap_pg_dir = .

This makes the __idmap_text_start .. __idmap_text_end region bigger than
the 4K we require it to fit within, and triggers an assertion in arm64's
vmlinux.lds.S, which breaks the build:

| LD      .tmp_vmlinux.kallsyms1
| aarch64-linux-gnu-ld: ID map text too big or misaligned
| make[1]: *** [scripts/Makefile.vmlinux:35: vmlinux] Error 1
| make: *** [Makefile:1264: vmlinux] Error 2

Avoid this by using an `ADRP+ADD+BLR` sequence for branches out of
.idmap.text, which avoids the need for veneers. These branches are only
executed once per boot, and only when the MMU is on, so there should be
no noticeable performance penalty in replacing `BL` with `ADRP+ADD+BLR`.

At the same time, remove the "x" and "w" attributes when placing code in
.idmap.text, as these are not necessary, and this will prevent the
linker from assuming that it is safe to place PLTs into .idmap.text,
causing it to warn if and when there are out-of-range branches within
.idmap.text, e.g.

|   LD      .tmp_vmlinux.kallsyms1
| arch/arm64/kernel/head.o: in function `primary_entry':
| (.idmap.text+0x1c): relocation truncated to fit: R_AARCH64_CALL26 against symbol `dcache_clean_poc' defined in .text section in arch/arm64/mm/cache.o
| arch/arm64/kernel/head.o: in function `init_el2':
| (.idmap.text+0x88): relocation truncated to fit: R_AARCH64_CALL26 against symbol `dcache_clean_poc' defined in .text section in arch/arm64/mm/cache.o
| make[1]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1
| make: *** [Makefile:1252: vmlinux] Error 2

Thus, if future changes add out-of-range branches in .idmap.text, it
should be easy enough to identify those from the resulting linker
errors.

Reported-by: syzbot+f8ac312e31226e23302b@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-arm-kernel/00000000000028ea4105f4e2ef54@google.com/
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/head.S | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

Comments

Ard Biesheuvel Feb. 20, 2023, 6:01 p.m. UTC | #1
On Mon, 20 Feb 2023 at 17:23, Mark Rutland <mark.rutland@arm.com> wrote:
>
> When building a kernel with many debug options enabled (which happens in
> test configurations use by myself and syzbot), the kernel can become
> large enough that portions of .text can be more than 128M away from
> .idmap.text (which is placed inside the .rodata section). Where idmap
> code branches into .text, the linker will place veneers in the
> .idmap.text section to make those branches possible.
>
> Unfortunately, as Ard reports, GNU LD has bseen observed to add 4K of
> padding when adding such veneers, e.g.
>
> | .idmap.text    0xffffffc01e48e5c0      0x32c arch/arm64/mm/proc.o
> |                0xffffffc01e48e5c0                idmap_cpu_replace_ttbr1
> |                0xffffffc01e48e600                idmap_kpti_install_ng_mappings
> |                0xffffffc01e48e800                __cpu_setup
> | *fill*         0xffffffc01e48e8ec        0x4
> | .idmap.text.stub
> |                0xffffffc01e48e8f0       0x18 linker stubs
> |                0xffffffc01e48f8f0                __idmap_text_end = .
> |                0xffffffc01e48f000                . = ALIGN (0x1000)
> | *fill*         0xffffffc01e48f8f0      0x710
> |                0xffffffc01e490000                idmap_pg_dir = .
>
> This makes the __idmap_text_start .. __idmap_text_end region bigger than
> the 4K we require it to fit within,

This is *really* weird, and smells like a ld.bfd bug tbh

> and triggers an assertion in arm64's
> vmlinux.lds.S, which breaks the build:
>
> | LD      .tmp_vmlinux.kallsyms1
> | aarch64-linux-gnu-ld: ID map text too big or misaligned
> | make[1]: *** [scripts/Makefile.vmlinux:35: vmlinux] Error 1
> | make: *** [Makefile:1264: vmlinux] Error 2
>
> Avoid this by using an `ADRP+ADD+BLR` sequence for branches out of
> .idmap.text, which avoids the need for veneers. These branches are only
> executed once per boot, and only when the MMU is on, so there should be
> no noticeable performance penalty in replacing `BL` with `ADRP+ADD+BLR`.
>

It is unfortunate that we need this, but I don't have a better idea
save for fixing the linker. Hopefully, it will be a while before
allyesconfig exceeds the range of ADRP/ADD.

> At the same time, remove the "x" and "w" attributes when placing code in
> .idmap.text, as these are not necessary, and this will prevent the
> linker from assuming that it is safe to place PLTs into .idmap.text,
> causing it to warn if and when there are out-of-range branches within
> .idmap.text, e.g.
>

This by itself seems like a worthwhile change, and perhaps we need
something similar for HYP so that we don't get veneers in unexpected
places.

> |   LD      .tmp_vmlinux.kallsyms1
> | arch/arm64/kernel/head.o: in function `primary_entry':
> | (.idmap.text+0x1c): relocation truncated to fit: R_AARCH64_CALL26 against symbol `dcache_clean_poc' defined in .text section in arch/arm64/mm/cache.o
> | arch/arm64/kernel/head.o: in function `init_el2':
> | (.idmap.text+0x88): relocation truncated to fit: R_AARCH64_CALL26 against symbol `dcache_clean_poc' defined in .text section in arch/arm64/mm/cache.o
> | make[1]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1
> | make: *** [Makefile:1252: vmlinux] Error 2
>
> Thus, if future changes add out-of-range branches in .idmap.text, it
> should be easy enough to identify those from the resulting linker
> errors.
>
> Reported-by: syzbot+f8ac312e31226e23302b@syzkaller.appspotmail.com
> Link: https://lore.kernel.org/linux-arm-kernel/00000000000028ea4105f4e2ef54@google.com/
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Tested-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>



> ---
>  arch/arm64/kernel/head.S | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 212d93aca5e61..b98970907226b 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -70,7 +70,7 @@
>
>         __EFI_PE_HEADER
>
> -       .section ".idmap.text","awx"
> +       .section ".idmap.text","a"
>
>         /*
>          * The following callee saved general purpose registers are used on the
> @@ -99,7 +99,8 @@ SYM_CODE_START(primary_entry)
>         cbz     x19, 0f
>         adrp    x0, __idmap_text_start
>         adr_l   x1, __idmap_text_end
> -       bl      dcache_clean_poc
> +       adr_l   x2, dcache_clean_poc
> +       blr     x2
>  0:     mov     x0, x19
>         bl      init_kernel_el                  // w0=cpu_boot_mode
>         mov     x20, x0
> @@ -527,7 +528,7 @@ SYM_FUNC_END(__primary_switched)
>   * end early head section, begin head code that is also used for
>   * hotplug and needs to have the same protections as the text region
>   */
> -       .section ".idmap.text","awx"
> +       .section ".idmap.text","a"
>
>  /*
>   * Starting from EL2 or EL1, configure the CPU to execute at the highest
> @@ -566,7 +567,8 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
>         cbz     x0, 0f
>         adrp    x0, __hyp_idmap_text_start
>         adr_l   x1, __hyp_text_end
> -       bl      dcache_clean_poc
> +       adr_l   x2, dcache_clean_poc
> +       blr     x2
>  0:
>         mov_q   x0, HCR_HOST_NVHE_FLAGS
>         msr     hcr_el2, x0
> @@ -728,7 +730,7 @@ SYM_FUNC_END(set_cpu_boot_mode_flag)
>   * Checks if the selected granule size is supported by the CPU.
>   * If it isn't, park the CPU
>   */
> -       .section ".idmap.text","awx"
> +       .section ".idmap.text","a"
>  SYM_FUNC_START(__enable_mmu)
>         mrs     x3, ID_AA64MMFR0_EL1
>         ubfx    x3, x3, #ID_AA64MMFR0_EL1_TGRAN_SHIFT, 4
> --
> 2.30.2
>
Catalin Marinas Feb. 20, 2023, 6:25 p.m. UTC | #2
On Mon, 20 Feb 2023 16:23:17 +0000, Mark Rutland wrote:
> When building a kernel with many debug options enabled (which happens in
> test configurations use by myself and syzbot), the kernel can become
> large enough that portions of .text can be more than 128M away from
> .idmap.text (which is placed inside the .rodata section). Where idmap
> code branches into .text, the linker will place veneers in the
> .idmap.text section to make those branches possible.
> 
> [...]

Applied to arm64 (for-next/core), thanks!

[1/1] arm64: fix .idmap.text assertion for large kernels
      https://git.kernel.org/arm64/c/d54170812ef1
diff mbox series

Patch

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 212d93aca5e61..b98970907226b 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -70,7 +70,7 @@ 
 
 	__EFI_PE_HEADER
 
-	.section ".idmap.text","awx"
+	.section ".idmap.text","a"
 
 	/*
 	 * The following callee saved general purpose registers are used on the
@@ -99,7 +99,8 @@  SYM_CODE_START(primary_entry)
 	cbz	x19, 0f
 	adrp	x0, __idmap_text_start
 	adr_l	x1, __idmap_text_end
-	bl	dcache_clean_poc
+	adr_l	x2, dcache_clean_poc
+	blr	x2
 0:	mov	x0, x19
 	bl	init_kernel_el			// w0=cpu_boot_mode
 	mov	x20, x0
@@ -527,7 +528,7 @@  SYM_FUNC_END(__primary_switched)
  * end early head section, begin head code that is also used for
  * hotplug and needs to have the same protections as the text region
  */
-	.section ".idmap.text","awx"
+	.section ".idmap.text","a"
 
 /*
  * Starting from EL2 or EL1, configure the CPU to execute at the highest
@@ -566,7 +567,8 @@  SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
 	cbz	x0, 0f
 	adrp	x0, __hyp_idmap_text_start
 	adr_l	x1, __hyp_text_end
-	bl	dcache_clean_poc
+	adr_l	x2, dcache_clean_poc
+	blr	x2
 0:
 	mov_q	x0, HCR_HOST_NVHE_FLAGS
 	msr	hcr_el2, x0
@@ -728,7 +730,7 @@  SYM_FUNC_END(set_cpu_boot_mode_flag)
  * Checks if the selected granule size is supported by the CPU.
  * If it isn't, park the CPU
  */
-	.section ".idmap.text","awx"
+	.section ".idmap.text","a"
 SYM_FUNC_START(__enable_mmu)
 	mrs	x3, ID_AA64MMFR0_EL1
 	ubfx	x3, x3, #ID_AA64MMFR0_EL1_TGRAN_SHIFT, 4