arm64: Restrict CPU_BIG_ENDIAN to GNU as or LLVM IAS 15.x or newer

Message ID 20231025-disable-arm64-be-ias-b4-llvm-15-v1-1-b25263ed8b23@kernel.org (mailing list archive)
State New, archived

Commit Message

Nathan Chancellor Oct. 25, 2023, 5:21 p.m. UTC
A recent refactoring in the arm64 tree exposed an assembler bug in LLVM
with regards to the generation of NOPs for arm64 big endian, resulting
in near-immediate crashes on boot in QEMU.

Restrict CONFIG_CPU_BIG_ENDIAN to a known good assembler, which is
either GNU as or LLVM's IAS 15.0.0 and newer, which contains the linked
commit.

Closes: https://github.com/ClangBuiltLinux/linux/issues/1948
Link: https://github.com/llvm/llvm-project/commit/1379b150991f70a5782e9a143c2ba5308da1161c
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
---
 arch/arm64/Kconfig | 2 ++
 1 file changed, 2 insertions(+)


---
base-commit: 22e877699642285c47f5d7d83b2d59815c29ebe8
change-id: 20231025-disable-arm64-be-ias-b4-llvm-15-b6f30f3f24be

Best regards,

Comments

Mark Rutland Oct. 25, 2023, 6:01 p.m. UTC | #1
Hi Nathan,

On Wed, Oct 25, 2023 at 10:21:28AM -0700, Nathan Chancellor wrote:
> A recent refactoring in the arm64 tree exposed an assembler bug in LLVM
> with regards to the generation of NOPs for arm64 big endian, resulting
> in near-immediate crashes on boot in QEMU.

Could we please put a bit more detail into the commit message about what
exactly went wrong and how this was detected? I know that can be found
from the github links below, but having to go chase that is a bit of a
pain.

Would you be happy with the below? I've also added a Cc stable, since
this is a potential state corruption issue.

Assuming you're happy with that text:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

---->8----
Prior to LLVM 15.0.0, LLVM's integrated assemble would incorrectly
byte-swap NOP when compiling for big-endian, and the resulting series of
bytes happened to match the encoding of FNMADD S21, S30, S0, S0.
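
As a rough illustration of the claim above, the following sketch byte-swaps the A64 NOP encoding (0xd503201f) and splits the result into the fields of the floating-point "data-processing (3 source)" instruction class. The helper names are hypothetical and the field layout is written from the Arm architecture's encoding tables as an assumption; this is not code from the kernel or from LLVM.

```python
import struct

NOP = 0xD503201F  # A64 encoding of NOP (instructions are fetched little-endian)


def byteswap32(word: int) -> int:
    """Reverse the byte order of a 32-bit word."""
    return struct.unpack("<I", struct.pack(">I", word))[0]


def decode_fp_3src(word: int) -> dict:
    """Split a word into the fields of the FP data-processing (3 source) class.

    Hypothetical helper: it only extracts bit fields and does not check
    that the word really belongs to this instruction class.
    """
    return {
        "top": (word >> 24) & 0xFF,   # 0x1f for this class
        "ftype": (word >> 22) & 0x3,  # 0 selects single precision (S registers)
        "o1": (word >> 21) & 0x1,     # o1=1, o0=0 selects FNMADD
        "rm": (word >> 16) & 0x1F,
        "o0": (word >> 15) & 0x1,
        "ra": (word >> 10) & 0x1F,
        "rn": (word >> 5) & 0x1F,
        "rd": word & 0x1F,
    }


swapped = byteswap32(NOP)
fields = decode_fp_3src(swapped)

# Operand order in assembly is Sd, Sn, Sm, Sa.
text = "FNMADD S{rd}, S{rn}, S{rm}, S{ra}".format(**fields)
print(hex(swapped), text)
```

Under these assumptions the swapped word is 0x1f2003d5, whose fields decode to Rd=21, Rn=30, Rm=0, Ra=0 with o1=1 and o0=0, which is consistent with the FNMADD S21, S30, S0, S0 reading given above.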

This went unnoticed until commit:

  34f66c4c4d5518c1 ("arm64: Use a positive cpucap for FP/SIMD")

Prior to that commit, the kernel would always enable the use of FPSIMD
early in boot when __cpu_setup() initialized CPACR_EL1, and so usage of
FNMADD within the kernel was not detected, but could result in the
corruption of user or kernel FPSIMD state.

After that commit, the instructions happen to trap during boot prior to
FPSIMD being detected and enabled, e.g.

| Unhandled 64-bit el1h sync exception on CPU0, ESR 0x000000001fe00000 -- ASIMD
| CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.0-rc3-00013-g34f66c4c4d55 #1
| Hardware name: linux,dummy-virt (DT)
| pstate: 400000c9 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : __pi_strcmp+0x1c/0x150
| lr : populate_properties+0xe4/0x254
| sp : ffffd014173d3ad0
| x29: ffffd014173d3af0 x28: fffffbfffddffcb8 x27: 0000000000000000
| x26: 0000000000000058 x25: fffffbfffddfe054 x24: 0000000000000008
| x23: fffffbfffddfe000 x22: fffffbfffddfe000 x21: fffffbfffddfe044
| x20: ffffd014173d3b70 x19: 0000000000000001 x18: 0000000000000005
| x17: 0000000000000010 x16: 0000000000000000 x15: 00000000413e7000
| x14: 0000000000000000 x13: 0000000000001bcc x12: 0000000000000000
| x11: 00000000d00dfeed x10: ffffd414193f2cd0 x9 : 0000000000000000
| x8 : 0101010101010101 x7 : ffffffffffffffc0 x6 : 0000000000000000
| x5 : 0000000000000000 x4 : 0101010101010101 x3 : 000000000000002a
| x2 : 0000000000000001 x1 : ffffd014171f2988 x0 : fffffbfffddffcb8
| Kernel panic - not syncing: Unhandled exception
| CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.0-rc3-00013-g34f66c4c4d55 #1
| Hardware name: linux,dummy-virt (DT)
| Call trace:
|  dump_backtrace+0xec/0x108
|  show_stack+0x18/0x2c
|  dump_stack_lvl+0x50/0x68
|  dump_stack+0x18/0x24
|  panic+0x13c/0x340
|  el1t_64_irq_handler+0x0/0x1c
|  el1_abort+0x0/0x5c
|  el1h_64_sync+0x64/0x68
|  __pi_strcmp+0x1c/0x150
|  unflatten_dt_nodes+0x1e8/0x2d8
|  __unflatten_device_tree+0x5c/0x15c
|  unflatten_device_tree+0x38/0x50
|  setup_arch+0x164/0x1e0
|  start_kernel+0x64/0x38c
|  __primary_switched+0xbc/0xc4
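
The ESR value in the first line of the splat can be unpacked by hand to see why it is reported as "ASIMD". The sketch below is a minimal illustration, assuming the usual ESR_ELx layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]) and assuming that exception class 0x07 is the FP/ASIMD access trap; the function name is hypothetical.

```python
ESR = 0x1FE00000  # value taken from the panic message above

EC_FP_ASIMD = 0x07  # assumed: trapped access to FP/Advanced SIMD


def split_esr(esr: int) -> dict:
    """Split an ESR_ELx value into its top-level fields."""
    return {
        "ec": (esr >> 26) & 0x3F,  # exception class
        "il": (esr >> 25) & 0x1,   # instruction length (1 = 32-bit)
        "iss": esr & 0x1FFFFFF,    # instruction specific syndrome
    }


esr_fields = split_esr(ESR)
is_fp_trap = esr_fields["ec"] == EC_FP_ASIMD
print(esr_fields, is_fp_trap)
```

With that layout the exception class comes out as 0x07, which matches a trapped FP/SIMD instruction executed at EL1 before FPSIMD access had been enabled.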

Restrict CONFIG_CPU_BIG_ENDIAN to a known good assembler, which is
either GNU as or LLVM's IAS 15.0.0 and newer, which contains the linked
commit.

Closes: https://github.com/ClangBuiltLinux/linux/issues/1948
Link: https://github.com/llvm/llvm-project/commit/1379b150991f70a5782e9a143c2ba5308da1161c
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Cc: stable@vger.kernel.org

> Restrict CONFIG_CPU_BIG_ENDIAN to a known good assembler, which is
> either GNU as or LLVM's IAS 15.0.0 and newer, which contains the linked
> commit.
> 
> Closes: https://github.com/ClangBuiltLinux/linux/issues/1948
> Link: https://github.com/llvm/llvm-project/commit/1379b150991f70a5782e9a143c2ba5308da1161c
> Signed-off-by: Nathan Chancellor <nathan@kernel.org>
> ---
>  arch/arm64/Kconfig | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index adf2f8a327be..92d33ece4c45 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1360,6 +1360,8 @@ choice
>  config CPU_BIG_ENDIAN
>  	bool "Build big-endian kernel"
>  	depends on !LD_IS_LLD || LLD_VERSION >= 130000
> +	# https://github.com/llvm/llvm-project/commit/1379b150991f70a5782e9a143c2ba5308da1161c
> +	depends on AS_IS_GNU || AS_VERSION >= 150000
>  	help
>  	  Say Y if you plan on running a kernel with a big-endian userspace.
>  
> 
> ---
> base-commit: 22e877699642285c47f5d7d83b2d59815c29ebe8
> change-id: 20231025-disable-arm64-be-ias-b4-llvm-15-b6f30f3f24be
> 
> Best regards,
> -- 
> Nathan Chancellor <nathan@kernel.org>
>
Nathan Chancellor Oct. 25, 2023, 6:31 p.m. UTC | #2
On Wed, Oct 25, 2023 at 07:01:53PM +0100, Mark Rutland wrote:
> Hi Nathan,
> 
> On Wed, Oct 25, 2023 at 10:21:28AM -0700, Nathan Chancellor wrote:
> > A recent refactoring in the arm64 tree exposed an assembler bug in LLVM
> > with regards to the generation of NOPs for arm64 big endian, resulting
> > in near-immediate crashes on boot in QEMU.
> 
> Could we please put a bit more detail into the commit message about what
> exactly went wrong and how this was detected? I know that can be found
> from the github links below, but having to go chase that is a bit of a
> pain.

Sure, sorry for leaving that out of the initial revision.

> Would you be happy with the below? I've also added a Cc stable, since
> this is a potential state corruption issue.

That text looks much better to me, especially since it explains exactly
what goes wrong here (which I was unsure of, this helps). Thanks a lot!

Will / Catalin, would you like a v2 with that text or could it just be
copied and pasted from Mark's mail during application time? I am happy
to do whatever.

> Assuming you're happy with that text:
> 
> Acked-by: Mark Rutland <mark.rutland@arm.com>
> 
> Mark.
> 
Catalin Marinas Oct. 26, 2023, 3:28 p.m. UTC | #3
On Wed, Oct 25, 2023 at 11:31:14AM -0700, Nathan Chancellor wrote:
> On Wed, Oct 25, 2023 at 07:01:53PM +0100, Mark Rutland wrote:
> > On Wed, Oct 25, 2023 at 10:21:28AM -0700, Nathan Chancellor wrote:
> > > A recent refactoring in the arm64 tree exposed an assembler bug in LLVM
> > > with regards to the generation of NOPs for arm64 big endian, resulting
> > > in near-immediate crashes on boot in QEMU.
> > 
> > Could we please put a bit more detail into the commit message about what
> > exactly went wrong and how this was detected? I know that can be found
> > from the github links below, but having to go chase that is a bit of a
> > pain.
> 
> Sure, sorry for leaving that out of the initial revision.
> 
> > Would you be happy with the below? I've also added a Cc stable, since
> > this is a potential state corruption issue.
> 
> That text looks much better to me, especially since it explains exactly
> what goes wrong here (which I was unsure of, this helps). Thanks a lot!
> 
> Will / Catalin, would you like a v2 with that text or could it just be
> copied and pasted from Mark's mail during application time? I am happy
> to do whatever.

I'll copy/paste Mark's text, no need for v2. Thanks.
Mark Rutland Oct. 26, 2023, 3:31 p.m. UTC | #4
On Thu, Oct 26, 2023 at 04:28:45PM +0100, Catalin Marinas wrote:
> On Wed, Oct 25, 2023 at 11:31:14AM -0700, Nathan Chancellor wrote:
> > On Wed, Oct 25, 2023 at 07:01:53PM +0100, Mark Rutland wrote:
> > > On Wed, Oct 25, 2023 at 10:21:28AM -0700, Nathan Chancellor wrote:
> > > > A recent refactoring in the arm64 tree exposed an assembler bug in LLVM
> > > > with regards to the generation of NOPs for arm64 big endian, resulting
> > > > in near-immediate crashes on boot in QEMU.
> > > 
> > > Could we please put a bit more detail into the commit message about what
> > > exactly went wrong and how this was detected? I know that can be found
> > > from the github links below, but having to go chase that is a bit of a
> > > pain.
> > 
> > Sure, sorry for leaving that out of the initial revision.
> > 
> > > Would you be happy with the below? I've also added a Cc stable, since
> > > this is a potential state corruption issue.
> > 
> > That text looks much better to me, especially since it explains exactly
> > what goes wrong here (which I was unsure of, this helps). Thanks a lot!
> > 
> > Will / Catalin, would you like a v2 with that text or could it just be
> > copied and pasted from Mark's mail during application time? I am happy
> > to do whatever.
> 
> I'll copy/paste Mark's text, no need for v2. Thanks.

If you do, could you fix my typo in the first line? I accidentally wrote:

	LLVM's integrated assemble

When that should have been:

	LLVM's integrated assembler

If you've already picked the patch it's not worth worrying about.

Mark.
Catalin Marinas Oct. 26, 2023, 3:33 p.m. UTC | #5
On Thu, Oct 26, 2023 at 04:31:59PM +0100, Mark Rutland wrote:
> On Thu, Oct 26, 2023 at 04:28:45PM +0100, Catalin Marinas wrote:
> > On Wed, Oct 25, 2023 at 11:31:14AM -0700, Nathan Chancellor wrote:
> > > On Wed, Oct 25, 2023 at 07:01:53PM +0100, Mark Rutland wrote:
> > > > On Wed, Oct 25, 2023 at 10:21:28AM -0700, Nathan Chancellor wrote:
> > > > > A recent refactoring in the arm64 tree exposed an assembler bug in LLVM
> > > > > with regards to the generation of NOPs for arm64 big endian, resulting
> > > > > in near-immediate crashes on boot in QEMU.
> > > > 
> > > > Could we please put a bit more detail into the commit message about what
> > > > exactly went wrong and how this was detected? I know that can be found
> > > > from the github links below, but having to go chase that is a bit of a
> > > > pain.
> > > 
> > > Sure, sorry for leaving that out of the initial revision.
> > > 
> > > > Would you be happy with the below? I've also added a Cc stable, since
> > > > this is a potential state corruption issue.
> > > 
> > > That text looks much better to me, especially since it explains exactly
> > > what goes wrong here (which I was unsure of, this helps). Thanks a lot!
> > > 
> > > Will / Catalin, would you like a v2 with that text or could it just be
> > > copied and pasted from Mark's mail during application time? I am happy
> > > to do whatever.
> > 
> > I'll copy/paste Mark's text, no need for v2. Thanks.
> 
> If you do, could you fix my typo in the first line? I accidentally wrote:
> 
> 	LLVM's integrated assemble
> 
> When that should have been:
> 
> 	LLVM's integrated assembler
> 
> If you've already picked the patch it's not worth worrying about.

I can still fix it, I haven't pushed it out yet.
Catalin Marinas Oct. 26, 2023, 4:36 p.m. UTC | #6
On Wed, 25 Oct 2023 10:21:28 -0700, Nathan Chancellor wrote:
> A recent refactoring in the arm64 tree exposed an assembler bug in LLVM
> with regards to the generation of NOPs for arm64 big endian, resulting
> in near-immediate crashes on boot in QEMU.
> 
> Restrict CONFIG_CPU_BIG_ENDIAN to a known good assembler, which is
> either GNU as or LLVM's IAS 15.0.0 and newer, which contains the linked
> commit.
> 
> [...]

Applied to arm64 (for-next/misc), thanks!

[1/1] arm64: Restrict CPU_BIG_ENDIAN to GNU as or LLVM IAS 15.x or newer
      https://git.kernel.org/arm64/c/146a15b87335
Patch

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index adf2f8a327be..92d33ece4c45 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1360,6 +1360,8 @@  choice
 config CPU_BIG_ENDIAN
 	bool "Build big-endian kernel"
 	depends on !LD_IS_LLD || LLD_VERSION >= 130000
+	# https://github.com/llvm/llvm-project/commit/1379b150991f70a5782e9a143c2ba5308da1161c
+	depends on AS_IS_GNU || AS_VERSION >= 150000
 	help
 	  Say Y if you plan on running a kernel with a big-endian userspace.
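
The new dependency can be read as a small predicate over the assembler type and its version. The sketch below models it in Python, assuming that AS_VERSION uses the canonical major * 10000 + minor * 100 + patch form (so 15.0.0 becomes 150000) and that it reflects the Clang version when the integrated assembler is in use. Both function names are hypothetical, and the existing LLD dependency is deliberately not modelled.

```python
def canonical_version(version: str) -> int:
    """Convert "major.minor.patch" into the assumed canonical integer form."""
    parts = [int(p) for p in version.split(".")]
    # Missing minor/patch components are treated as zero.
    major, minor, patch = (parts + [0, 0])[:3]
    return 10000 * major + 100 * minor + patch


def big_endian_allowed(as_is_gnu: bool, as_version: int) -> bool:
    """Model of: depends on AS_IS_GNU || AS_VERSION >= 150000."""
    return as_is_gnu or as_version >= 150000


# LLVM IAS 14.0.6 is rejected, LLVM IAS 15.0.0 is accepted,
# and GNU as is accepted regardless of its version number.
print(big_endian_allowed(False, canonical_version("14.0.6")))
print(big_endian_allowed(False, canonical_version("15.0.0")))
print(big_endian_allowed(True, canonical_version("2.41")))
```

Under these assumptions, only the combination of LLVM's integrated assembler and a version older than 15.0.0 hides CPU_BIG_ENDIAN.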