diff mbox series

[2/3] arm64: use XPACLRI to strip PAC

Message ID 20230314162044.888747-3-mark.rutland@arm.com (mailing list archive)
State New, archived
Headers show
Series arm64: PAC stripping cleanups | expand

Commit Message

Mark Rutland March 14, 2023, 4:20 p.m. UTC
Currently we strip the PAC from pointers using C code, which requires
generating bitmasks, and conditionally clearing/setting bits depending
on bit 55. We can do better by using XPACLRI directly.

When the logic was originally written to strip PACs from user pointers,
contemporary toolchains used for the kernel had assemblers which were
unaware of the PAC instructions. As stripping the PAC from userspace
pointers required unconditional clearing of a fixed set of bits (which
could be performed with a single instruction), it was simpler to
implement the masking in C than it was to make use of XPACI or XPACLRI.

When support for in-kernel pointer authentication was added, the
stripping logic was extended to cover TTBR1 pointers, requiring several
instructions to handle whether to clear/set bits dependent on bit 55 of
the pointer.

These days, all supported toolchains have assemblers which are aware of
the XPACI and XPACLRI instructions, and contemporary compilers use
XPACLRI to strip the PAC in __builtin_return_address().

Do likewise within the kernel, and use XPACLRI to strip PACs from
pointers. This lives in the HINT encoding space, and so can be use
unconditionally. Using XPACLRI saves a few instructions (especially
where tracepoints are enabled, as these can make heavy use of
__builtin_return_address()), and in theory should be lower overhead. At
the same time, I've split ptrauth_strip_insn_pac() into
ptrauth_strip_user_insn_pac() and ptrauth_strip_kernel_insn_pac()
helpers so that we can avoid unnecessary PAC stripping when pointer
authentication is not in use in userspace or kernel respectively.

The underlying xpaclri() macro uses inline assembly which clobbers x30.
The clobber causes the compiler to save/restore the original x30 value
in a frame record (protected with PACIASP and AUTIASP when in-kernel
authentication is enabled), so this does not provide a gadget to alter
the return address. Similarly this does not adversely affect unwinding
due to the presence of the frame record.

The ptrauth_user_pac_mask() and ptrauth_kernel_pac_mask() are exported
from the kernel in ptrace and core dumps, so these are retained. A
subsequent patch will move them out of <asm/compiler.h>.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/compiler.h     | 32 +++++++++++++++++++++------
 arch/arm64/include/asm/pointer_auth.h |  6 -----
 arch/arm64/kernel/perf_callchain.c    |  2 +-
 arch/arm64/kernel/process.c           |  2 +-
 arch/arm64/kernel/stacktrace.c        |  2 +-
 5 files changed, 28 insertions(+), 16 deletions(-)

Comments

Will Deacon April 6, 2023, 3:30 p.m. UTC | #1
On Tue, Mar 14, 2023 at 04:20:43PM +0000, Mark Rutland wrote:
> Currently we strip the PAC from pointers using C code, which requires
> generating bitmasks, and conditionally clearing/setting bits depending
> on bit 55. We can do better by using XPACLRI directly.
> 
> When the logic was originally written to strip PACs from user pointers,
> contemporary toolchains used for the kernel had assemblers which were
> unaware of the PAC instructions. As stripping the PAC from userspace
> pointers required unconditional clearing of a fixed set of bits (which
> could be performed with a single instruction), it was simpler to
> implement the masking in C than it was to make use of XPACI or XPACLRI.
> 
> When support for in-kernel pointer authentication was added, the
> stripping logic was extended to cover TTBR1 pointers, requiring several
> instructions to handle whether to clear/set bits dependent on bit 55 of
> the pointer.
> 
> These days, all supported toolchains have assemblers which are aware of
> the XPACI and XPACLRI instructions, and contemporary compilers use
> XPACLRI to strip the PAC in __builtin_return_address().

I'm struggling slightly with the reasoning here... presumably there are
_still_ assemblers out there which don't support XPACLRI, so what happens
if somebody tries to build the kernel with one? Is your argument that
the compiler itself is going to generate XPACLRI in the builtin so we're
no worse rolling it ourselves?

If so, it still feels like a step backwards from where we are today (i.e.
we don't require the assembler support) and failing the build due to a
assembler error rather than a Kconfig version check and a helpful
diagnostic is pretty horrible.

Will
Mark Rutland April 6, 2023, 3:48 p.m. UTC | #2
On Thu, Apr 06, 2023 at 04:30:36PM +0100, Will Deacon wrote:
> On Tue, Mar 14, 2023 at 04:20:43PM +0000, Mark Rutland wrote:
> > Currently we strip the PAC from pointers using C code, which requires
> > generating bitmasks, and conditionally clearing/setting bits depending
> > on bit 55. We can do better by using XPACLRI directly.
> > 
> > When the logic was originally written to strip PACs from user pointers,
> > contemporary toolchains used for the kernel had assemblers which were
> > unaware of the PAC instructions. As stripping the PAC from userspace
> > pointers required unconditional clearing of a fixed set of bits (which
> > could be performed with a single instruction), it was simpler to
> > implement the masking in C than it was to make use of XPACI or XPACLRI.
> > 
> > When support for in-kernel pointer authentication was added, the
> > stripping logic was extended to cover TTBR1 pointers, requiring several
> > instructions to handle whether to clear/set bits dependent on bit 55 of
> > the pointer.
> > 
> > These days, all supported toolchains have assemblers which are aware of
> > the XPACI and XPACLRI instructions, and contemporary compilers use
> > XPACLRI to strip the PAC in __builtin_return_address().
> 
> I'm struggling slightly with the reasoning here... presumably there are
> _still_ assemblers out there which don't support XPACLRI, so what happens
> if somebody tries to build the kernel with one? Is your argument that
> the compiler itself is going to generate XPACLRI in the builtin so we're
> no worse rolling it ourselves?

Basically, yes. We only generate these instructions when in-kernel
authentication is enabled, at which point the compiler will generate these
instructions.

It's expected that GCC is shipped with a suitably up-to-date binutils, and all
supported versions for LLVM (11.0.0+) support the instructions in the
integrated assembler.

> If so, it still feels like a step backwards from where we are today (i.e.
> we don't require the assembler support) and failing the build due to a
> assembler error rather than a Kconfig version check and a helpful
> diagnostic is pretty horrible.

I don't think anyone's using such a mismatched binutils and GCC, but I could
use the hint encoding of XPACLRI directly (which all older assemblers support
regardless), if you prefer?

Thanks,
Mark.
Will Deacon April 11, 2023, 5:12 p.m. UTC | #3
On Thu, Apr 06, 2023 at 04:48:23PM +0100, Mark Rutland wrote:
> On Thu, Apr 06, 2023 at 04:30:36PM +0100, Will Deacon wrote:
> > On Tue, Mar 14, 2023 at 04:20:43PM +0000, Mark Rutland wrote:
> > > Currently we strip the PAC from pointers using C code, which requires
> > > generating bitmasks, and conditionally clearing/setting bits depending
> > > on bit 55. We can do better by using XPACLRI directly.
> > > 
> > > When the logic was originally written to strip PACs from user pointers,
> > > contemporary toolchains used for the kernel had assemblers which were
> > > unaware of the PAC instructions. As stripping the PAC from userspace
> > > pointers required unconditional clearing of a fixed set of bits (which
> > > could be performed with a single instruction), it was simpler to
> > > implement the masking in C than it was to make use of XPACI or XPACLRI.
> > > 
> > > When support for in-kernel pointer authentication was added, the
> > > stripping logic was extended to cover TTBR1 pointers, requiring several
> > > instructions to handle whether to clear/set bits dependent on bit 55 of
> > > the pointer.
> > > 
> > > These days, all supported toolchains have assemblers which are aware of
> > > the XPACI and XPACLRI instructions, and contemporary compilers use
> > > XPACLRI to strip the PAC in __builtin_return_address().
> > 
> > I'm struggling slightly with the reasoning here... presumably there are
> > _still_ assemblers out there which don't support XPACLRI, so what happens
> > if somebody tries to build the kernel with one? Is your argument that
> > the compiler itself is going to generate XPACLRI in the builtin so we're
> > no worse rolling it ourselves?
> 
> Basically, yes. We only generate these instructions when in-kernel
> authentication is enabled, at which point the compiler will generate these
> instructions.
> 
> It's expected that GCC is shipped with a suitably up-to-date binutils, and all
> supported versions for LLVM (11.0.0+) support the instructions in the
> integrated assembler.

Is that expectation enforced by e.g. the gcc build system? We'd had plenty
of cases in the past where people mix and match binutils and the compiler,
so I'm not sure why this case is different.

> 
> > If so, it still feels like a step backwards from where we are today (i.e.
> > we don't require the assembler support) and failing the build due to a
> > assembler error rather than a Kconfig version check and a helpful
> > diagnostic is pretty horrible.
> 
> I don't think anyone's using such a mismatched binutils and GCC, but I could
> use the hint encoding of XPACLRI directly (which all older assemblers support
> regardless), if you prefer?

That would work, or add an 'as-instr' check to make the dependency explicit
in a new AS_HAS_XPACLRI Kconfig option.

Will
Mark Rutland April 12, 2023, 4:01 p.m. UTC | #4
On Tue, Apr 11, 2023 at 06:12:01PM +0100, Will Deacon wrote:
> On Thu, Apr 06, 2023 at 04:48:23PM +0100, Mark Rutland wrote:
> > On Thu, Apr 06, 2023 at 04:30:36PM +0100, Will Deacon wrote:
> > > On Tue, Mar 14, 2023 at 04:20:43PM +0000, Mark Rutland wrote:
> > > > Currently we strip the PAC from pointers using C code, which requires
> > > > generating bitmasks, and conditionally clearing/setting bits depending
> > > > on bit 55. We can do better by using XPACLRI directly.
> > > > 
> > > > When the logic was originally written to strip PACs from user pointers,
> > > > contemporary toolchains used for the kernel had assemblers which were
> > > > unaware of the PAC instructions. As stripping the PAC from userspace
> > > > pointers required unconditional clearing of a fixed set of bits (which
> > > > could be performed with a single instruction), it was simpler to
> > > > implement the masking in C than it was to make use of XPACI or XPACLRI.
> > > > 
> > > > When support for in-kernel pointer authentication was added, the
> > > > stripping logic was extended to cover TTBR1 pointers, requiring several
> > > > instructions to handle whether to clear/set bits dependent on bit 55 of
> > > > the pointer.
> > > > 
> > > > These days, all supported toolchains have assemblers which are aware of
> > > > the XPACI and XPACLRI instructions, and contemporary compilers use
> > > > XPACLRI to strip the PAC in __builtin_return_address().
> > > 
> > > I'm struggling slightly with the reasoning here... presumably there are
> > > _still_ assemblers out there which don't support XPACLRI, so what happens
> > > if somebody tries to build the kernel with one? Is your argument that
> > > the compiler itself is going to generate XPACLRI in the builtin so we're
> > > no worse rolling it ourselves?
> > 
> > Basically, yes. We only generate these instructions when in-kernel
> > authentication is enabled, at which point the compiler will generate these
> > instructions.
> > 
> > It's expected that GCC is shipped with a suitably up-to-date binutils, and all
> > supported versions for LLVM (11.0.0+) support the instructions in the
> > integrated assembler.
> 
> Is that expectation enforced by e.g. the gcc build system? We'd had plenty
> of cases in the past where people mix and match binutils and the compiler,
> so I'm not sure why this case is different.

AFAIK it is not enforced.

Practically speaking, if someone mixes that up, defconfig (or any config with
CONFIG_ARM64_PTR_AUTH_KERNEL=y) would already be failing to build, and so I
suspect this is unlikely.

Admittedly it is possible that someone is building CONFIG_ARM64_PTR_AUTH=y &&
CONFIG_ARM64_PTR_AUTH_KERNEL=n with such a combination, and that would happen
to work today.

> > > If so, it still feels like a step backwards from where we are today (i.e.
> > > we don't require the assembler support) and failing the build due to a
> > > assembler error rather than a Kconfig version check and a helpful
> > > diagnostic is pretty horrible.
> > 
> > I don't think anyone's using such a mismatched binutils and GCC, but I could
> > use the hint encoding of XPACLRI directly (which all older assemblers support
> > regardless), if you prefer?
> 
> That would work, or add an 'as-instr' check to make the dependency explicit
> in a new AS_HAS_XPACLRI Kconfig option.

It's a bit simpler to use the hint encoding ("hint #7") directly, and that
won't regress the CONFIG_ARM64_PTR_AUTH=y && CONFIG_ARM64_PTR_AUTH_KERNEL=n
case, with such a toolchain, so I'll do that for v2.

Thanks,
Mark.
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/compiler.h b/arch/arm64/include/asm/compiler.h
index 9033a0b2f46a..2476914e59a1 100644
--- a/arch/arm64/include/asm/compiler.h
+++ b/arch/arm64/include/asm/compiler.h
@@ -15,15 +15,33 @@ 
 #define ptrauth_user_pac_mask()		GENMASK_ULL(54, vabits_actual)
 #define ptrauth_kernel_pac_mask()	GENMASK_ULL(63, vabits_actual)
 
-/* Valid for EL0 TTBR0 and EL1 TTBR1 instruction pointers */
-#define ptrauth_clear_pac(ptr)						\
-	((ptr & BIT_ULL(55)) ? (ptr | ptrauth_kernel_pac_mask()) :	\
-			       (ptr & ~ptrauth_user_pac_mask()))
+#define xpaclri(ptr)							\
+({									\
+	register unsigned long __xpaclri_ptr asm("x30") = (ptr);	\
+									\
+	asm(								\
+	ARM64_ASM_PREAMBLE						\
+	"	xpaclri\n"						\
+	: "+r" (__xpaclri_ptr));					\
+									\
+	__xpaclri_ptr;							\
+})
 
-#if defined(CONFIG_ARM64_PTR_AUTH_KERNEL) &&				\
-    !defined(CONFIG_BUILTIN_RETURN_ADDRESS_STRIPS_PAC)
+#ifdef CONFIG_ARM64_PTR_AUTH_KERNEL
+#define ptrauth_strip_kernel_insn_pac(ptr)	xpaclri(ptr)
+#else
+#define ptrauth_strip_kernel_insn_pac(ptr)	(ptr)
+#endif
+
+#ifdef CONFIG_ARM64_PTR_AUTH
+#define ptrauth_strip_user_insn_pac(ptr)	xpaclri(ptr)
+#else
+#define ptrauth_strip_user_insn_pac(ptr)	(ptr)
+#endif
+
+#if !defined(CONFIG_BUILTIN_RETURN_ADDRESS_STRIPS_PAC)
 #define __builtin_return_address(val)					\
-	(void *)(ptrauth_clear_pac((unsigned long)__builtin_return_address(val)))
+	(void *)(ptrauth_strip_kernel_insn_pac((unsigned long)__builtin_return_address(val)))
 #endif
 
 #endif /* __ASM_COMPILER_H */
diff --git a/arch/arm64/include/asm/pointer_auth.h b/arch/arm64/include/asm/pointer_auth.h
index efb098de3a84..b0665db86aa6 100644
--- a/arch/arm64/include/asm/pointer_auth.h
+++ b/arch/arm64/include/asm/pointer_auth.h
@@ -97,11 +97,6 @@  extern int ptrauth_set_enabled_keys(struct task_struct *tsk, unsigned long keys,
 				    unsigned long enabled);
 extern int ptrauth_get_enabled_keys(struct task_struct *tsk);
 
-static inline unsigned long ptrauth_strip_insn_pac(unsigned long ptr)
-{
-	return ptrauth_clear_pac(ptr);
-}
-
 static __always_inline void ptrauth_enable(void)
 {
 	if (!system_supports_address_auth())
@@ -133,7 +128,6 @@  static __always_inline void ptrauth_enable(void)
 #define ptrauth_prctl_reset_keys(tsk, arg)	(-EINVAL)
 #define ptrauth_set_enabled_keys(tsk, keys, enabled)	(-EINVAL)
 #define ptrauth_get_enabled_keys(tsk)	(-EINVAL)
-#define ptrauth_strip_insn_pac(lr)	(lr)
 #define ptrauth_suspend_exit()
 #define ptrauth_thread_init_user()
 #define ptrauth_thread_switch_user(tsk)
diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
index 65b196e3ca6c..6d157f32187b 100644
--- a/arch/arm64/kernel/perf_callchain.c
+++ b/arch/arm64/kernel/perf_callchain.c
@@ -38,7 +38,7 @@  user_backtrace(struct frame_tail __user *tail,
 	if (err)
 		return NULL;
 
-	lr = ptrauth_strip_insn_pac(buftail.lr);
+	lr = ptrauth_strip_user_insn_pac(buftail.lr);
 
 	perf_callchain_store(entry, lr);
 
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 71d59b5abede..b5bed62483cb 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -217,7 +217,7 @@  void __show_regs(struct pt_regs *regs)
 
 	if (!user_mode(regs)) {
 		printk("pc : %pS\n", (void *)regs->pc);
-		printk("lr : %pS\n", (void *)ptrauth_strip_insn_pac(lr));
+		printk("lr : %pS\n", (void *)ptrauth_strip_kernel_insn_pac(lr));
 	} else {
 		printk("pc : %016llx\n", regs->pc);
 		printk("lr : %016llx\n", lr);
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 83154303e682..fcc2d269787e 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -90,7 +90,7 @@  static int notrace unwind_next(struct unwind_state *state)
 	if (err)
 		return err;
 
-	state->pc = ptrauth_strip_insn_pac(state->pc);
+	state->pc = ptrauth_strip_kernel_insn_pac(state->pc);
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	if (tsk->ret_stack &&