
arm64: mte: switch GCR_EL1 on task switch rather than entry/exit

Message ID 20210702031922.1291398-1-pcc@google.com (mailing list archive)
State New, archived
Series arm64: mte: switch GCR_EL1 on task switch rather than entry/exit

Commit Message

Peter Collingbourne July 2, 2021, 3:19 a.m. UTC
Accessing GCR_EL1 and issuing an ISB can be expensive on some
microarchitectures. To avoid taking this performance hit on every
kernel entry/exit, switch GCR_EL1 on task switch rather than
entry/exit. This is essentially a revert of commit bad1e1c663e0
("arm64: mte: switch GCR_EL1 in kernel entry and exit").

This requires changing how we generate random tags for HW tag-based
KASAN, since at this point IRG would use the user's exclusion mask,
which may not be suitable for kernel use. In this patch I chose to
derive the tag from CNTVCT_EL0 modulo the number of available tags;
alternative approaches are possible.
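
In outline, the new generator looks like this (condensed from the hunks
below; mte_tag_mod is set once from KASAN's max_tag in mte_init_tags()):

u64 mte_tag_mod __ro_after_init;

void mte_init_tags(u64 max_tag)
{
	/* max_tag is in KASAN format (0xFx); keep the MTE nibble and turn it into a modulus. */
	mte_tag_mod = (max_tag & 0xF) + 1;
}

static inline u8 mte_get_random_tag(void)
{
	u64 cntvct;

	/* The virtual counter is cheap to read compared with switching GCR_EL1. */
	asm("mrs %0, cntvct_el0" : "=r"(cntvct));

	/* Build a KASAN-format tag: 0xF top nibble, counter-derived low nibble. */
	return 0xF0 | (cntvct % mte_tag_mod);
}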

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I560a190a74176ca4cc5191dad08f77f6b1577c75
---
 arch/arm64/include/asm/mte-kasan.h | 15 ++++--
 arch/arm64/include/asm/mte.h       |  6 ---
 arch/arm64/kernel/entry.S          | 41 ----------------
 arch/arm64/kernel/mte.c            | 76 ++++++++++++------------------
 4 files changed, 40 insertions(+), 98 deletions(-)

Comments

Catalin Marinas July 2, 2021, 5:44 p.m. UTC | #1
On Thu, Jul 01, 2021 at 08:19:22PM -0700, Peter Collingbourne wrote:
> Accessing GCR_EL1 and issuing an ISB can be expensive on some
> microarchitectures. To avoid taking this performance hit on every
> kernel entry/exit, switch GCR_EL1 on task switch rather than
> entry/exit.

Is it the ISB that's causing issues or the MRS/MSR as well? I think we
can avoid the ISB when PtrAuth is enabled by shuffling the entry code a
bit. We can also simplify the mte_set_gcr macro to avoid an MRS.

Peter Collingbourne July 2, 2021, 6:39 p.m. UTC | #2
On Fri, Jul 2, 2021 at 10:44 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Thu, Jul 01, 2021 at 08:19:22PM -0700, Peter Collingbourne wrote:
> > Accessing GCR_EL1 and issuing an ISB can be expensive on some
> > microarchitectures. To avoid taking this performance hit on every
> > kernel entry/exit, switch GCR_EL1 on task switch rather than
> > entry/exit.
>
> Is it the ISB that's causing issues or the MRS/MSR as well? I think we
> can avoid the ISB when PtrAuth is enabled by shuffling the entry code a
> bit. We can also simplify the mte_set_gcr macro to avoid an MRS.

This was the first thing that I tried on our hardware:

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 7312eafec946..8699ab28a924 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -204,7 +204,6 @@ alternative_else_nop_endif
        ldr_l   \tmp, gcr_kernel_excl

        mte_set_gcr \tmp, \tmp2
-       isb
 1:
 #endif
        .endm
@@ -277,13 +276,13 @@ alternative_if ARM64_HAS_ADDRESS_AUTH
        orr     x0, x0, SCTLR_ELx_ENIA
        msr     sctlr_el1, x0
 2:
-       isb
 alternative_else_nop_endif
 #endif

        apply_ssbd 1, x22, x23

        mte_set_kernel_gcr x22, x23
+       isb

        scs_load tsk, x20
        .else

However, on most of the cores this led to only around half of the
performance improvement of the patch that I sent, which is somewhat
surprising, but it is what it is.

But I would like to get IRG out of the kernel (at least in production
kernels) for other reasons. I would at some point like to add a
deterministic IRG mode (to support record/replay debugging). This will
require setting RRND=0 and a per-task RGSR. If we then allow IRG in
the kernel, we would need to manually switch RGSR here as well.
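
Purely as a sketch of where that would lead (nothing below exists
today; thread.rgsr and mte_switch_rgsr() are made-up names), the
context switch would need something like:

/* Hypothetical: save/restore the per-task IRG seed state when RRND=0. */
static void mte_switch_rgsr(struct task_struct *prev,
			    struct task_struct *next)
{
	/* Remember the outgoing task's seed so its IRG sequence can resume. */
	prev->thread.rgsr = read_sysreg_s(SYS_RGSR_EL1);
	/* Install the incoming task's seed for deterministic replay. */
	write_sysreg_s(next->thread.rgsr, SYS_RGSR_EL1);
}

and the same pair of accesses would be needed on kernel entry/exit if
the kernel kept using IRG, which is the cost I would like to avoid.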

Peter

Catalin Marinas July 5, 2021, 2:17 p.m. UTC | #3
On Fri, Jul 02, 2021 at 11:39:33AM -0700, Peter Collingbourne wrote:
> On Fri, Jul 2, 2021 at 10:44 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Thu, Jul 01, 2021 at 08:19:22PM -0700, Peter Collingbourne wrote:
> > > Accessing GCR_EL1 and issuing an ISB can be expensive on some
> > > microarchitectures. To avoid taking this performance hit on every
> > > kernel entry/exit, switch GCR_EL1 on task switch rather than
> > > entry/exit.
> >
> > Is it the ISB that's causing issues or the MRS/MSR as well? I think we
> > can avoid the ISB when PtrAuth is enabled by shuffling the entry code a
> > bit. We can also simplify the mte_set_gcr macro to avoid an MRS.
> 
> This was the first thing that I tried on our hardware:
> 
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 7312eafec946..8699ab28a924 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -204,7 +204,6 @@ alternative_else_nop_endif
>         ldr_l   \tmp, gcr_kernel_excl
> 
>         mte_set_gcr \tmp, \tmp2
> -       isb
>  1:
>  #endif
>         .endm
> @@ -277,13 +276,13 @@ alternative_if ARM64_HAS_ADDRESS_AUTH
>         orr     x0, x0, SCTLR_ELx_ENIA
>         msr     sctlr_el1, x0
>  2:
> -       isb
>  alternative_else_nop_endif
>  #endif
> 
>         apply_ssbd 1, x22, x23
> 
>         mte_set_kernel_gcr x22, x23
> +       isb
> 
>         scs_load tsk, x20
>         .else
> 
> However, on most of the cores this led to only around half of the
> performance improvement of the patch that I sent, which is somewhat
> surprising, but it is what it is.

BTW, can you also modify mte_set_kernel_gcr to only do a write to the
GCR_EL1 register rather than a read-modify-write?
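
In C terms, something like the below (the asm macro would reduce to a
ldr_l plus msr_s); gcr_kernel_all and mte_init_kernel_gcr() are just
names I made up for the precomputed value and its setup:

/* Precompute the complete kernel GCR_EL1 value once: RRND plus the
 * exclusion mask is all the register holds, so nothing needs preserving. */
static u64 gcr_kernel_all __ro_after_init;

void mte_init_kernel_gcr(u64 excl)
{
	gcr_kernel_all = SYS_GCR_EL1_RRND | (excl & SYS_GCR_EL1_EXCL_MASK);
}

static inline void mte_write_kernel_gcr(void)
{
	/* Plain write on entry: no MRS, no read-modify-write. */
	write_sysreg_s(gcr_kernel_all, SYS_GCR_EL1);
}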

Patch

diff --git a/arch/arm64/include/asm/mte-kasan.h b/arch/arm64/include/asm/mte-kasan.h
index ddd4d17cf9a0..e9b3c1bdbba3 100644
--- a/arch/arm64/include/asm/mte-kasan.h
+++ b/arch/arm64/include/asm/mte-kasan.h
@@ -13,6 +13,8 @@ 
 
 #ifdef CONFIG_ARM64_MTE
 
+extern u64 mte_tag_mod;
+
 /*
  * These functions are meant to be only used from KASAN runtime through
  * the arch_*() interface defined in asm/memory.h.
@@ -37,15 +39,18 @@  static inline u8 mte_get_mem_tag(void *addr)
 	return mte_get_ptr_tag(addr);
 }
 
-/* Generate a random tag. */
+/*
+ * Generate a random tag. We can't use IRG because the user's GCR_EL1 is still
+ * installed for performance reasons. Instead, take the modulus of the
+ * architected timer which should be random enough for our purposes.
+ */
 static inline u8 mte_get_random_tag(void)
 {
-	void *addr;
+	u64 cntvct;
 
-	asm(__MTE_PREAMBLE "irg %0, %0"
-		: "=r" (addr));
+	asm("mrs %0, cntvct_el0" : "=r"(cntvct));
 
-	return mte_get_ptr_tag(addr);
+	return 0xF0 | (cntvct % mte_tag_mod);
 }
 
 /*
diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index 719687412798..412b94efcb11 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -16,8 +16,6 @@ 
 
 #include <asm/pgtable-types.h>
 
-extern u64 gcr_kernel_excl;
-
 void mte_clear_page_tags(void *addr);
 unsigned long mte_copy_tags_from_user(void *to, const void __user *from,
 				      unsigned long n);
@@ -40,7 +38,6 @@  void mte_free_tag_storage(char *storage);
 void mte_sync_tags(pte_t *ptep, pte_t pte);
 void mte_copy_page_tags(void *kto, const void *kfrom);
 void mte_thread_init_user(void);
-void mte_update_sctlr_user(struct task_struct *task);
 void mte_thread_switch(struct task_struct *next);
 void mte_suspend_enter(void);
 void mte_suspend_exit(void);
@@ -63,9 +60,6 @@  static inline void mte_copy_page_tags(void *kto, const void *kfrom)
 static inline void mte_thread_init_user(void)
 {
 }
-static inline void mte_update_sctlr_user(struct task_struct *task)
-{
-}
 static inline void mte_thread_switch(struct task_struct *next)
 {
 }
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index ce59280355c5..c95bfe145639 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -175,43 +175,6 @@  alternative_else_nop_endif
 #endif
 	.endm
 
-	.macro mte_set_gcr, tmp, tmp2
-#ifdef CONFIG_ARM64_MTE
-	/*
-	 * Calculate and set the exclude mask preserving
-	 * the RRND (bit[16]) setting.
-	 */
-	mrs_s	\tmp2, SYS_GCR_EL1
-	bfxil	\tmp2, \tmp, #MTE_CTRL_GCR_USER_EXCL_SHIFT, #16
-	msr_s	SYS_GCR_EL1, \tmp2
-#endif
-	.endm
-
-	.macro mte_set_kernel_gcr, tmp, tmp2
-#ifdef CONFIG_KASAN_HW_TAGS
-alternative_if_not ARM64_MTE
-	b	1f
-alternative_else_nop_endif
-	ldr_l	\tmp, gcr_kernel_excl
-
-	mte_set_gcr \tmp, \tmp2
-	isb
-1:
-#endif
-	.endm
-
-	.macro mte_set_user_gcr, tsk, tmp, tmp2
-#ifdef CONFIG_ARM64_MTE
-alternative_if_not ARM64_MTE
-	b	1f
-alternative_else_nop_endif
-	ldr	\tmp, [\tsk, #THREAD_MTE_CTRL]
-
-	mte_set_gcr \tmp, \tmp2
-1:
-#endif
-	.endm
-
 	.macro	kernel_entry, el, regsize = 64
 	.if	\regsize == 32
 	mov	w0, w0				// zero upper 32 bits of x0
@@ -273,8 +236,6 @@  alternative_if ARM64_HAS_ADDRESS_AUTH
 alternative_else_nop_endif
 #endif
 
-	mte_set_kernel_gcr x22, x23
-
 	scs_load tsk, x20
 	.else
 	add	x21, sp, #PT_REGS_SIZE
@@ -398,8 +359,6 @@  alternative_if ARM64_HAS_ADDRESS_AUTH
 alternative_else_nop_endif
 #endif
 
-	mte_set_user_gcr tsk, x0, x1
-
 	apply_ssbd 0, x0, x1
 	.endif
 
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 9c82e27b30f9..b8d3e0b20702 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -23,7 +23,7 @@ 
 #include <asm/ptrace.h>
 #include <asm/sysreg.h>
 
-u64 gcr_kernel_excl __ro_after_init;
+u64 mte_tag_mod __ro_after_init;
 
 static bool report_fault_once = true;
 
@@ -98,22 +98,7 @@  int memcmp_pages(struct page *page1, struct page *page2)
 
 void mte_init_tags(u64 max_tag)
 {
-	static bool gcr_kernel_excl_initialized;
-
-	if (!gcr_kernel_excl_initialized) {
-		/*
-		 * The format of the tags in KASAN is 0xFF and in MTE is 0xF.
-		 * This conversion extracts an MTE tag from a KASAN tag.
-		 */
-		u64 incl = GENMASK(FIELD_GET(MTE_TAG_MASK >> MTE_TAG_SHIFT,
-					     max_tag), 0);
-
-		gcr_kernel_excl = ~incl & SYS_GCR_EL1_EXCL_MASK;
-		gcr_kernel_excl_initialized = true;
-	}
-
-	/* Enable the kernel exclude mask for random tags generation. */
-	write_sysreg_s(SYS_GCR_EL1_RRND | gcr_kernel_excl, SYS_GCR_EL1);
+	mte_tag_mod = (max_tag & 0xF) + 1;
 }
 
 static inline void __mte_enable_kernel(const char *mode, unsigned long tcf)
@@ -188,8 +173,25 @@  void mte_check_tfsr_el1(void)
 }
 #endif
 
-static void update_gcr_el1_excl(u64 excl)
+static void mte_sync_ctrl(struct task_struct *task)
 {
+	/*
+	 * This can only be called on the current or next task since the CPU
+	 * must match where the thread is going to run.
+	 */
+	unsigned long sctlr = task->thread.sctlr_user;
+	unsigned long mte_ctrl = task->thread.mte_ctrl;
+	unsigned long pref, resolved_mte_tcf;
+
+	preempt_disable();
+	pref = __this_cpu_read(mte_tcf_preferred);
+	resolved_mte_tcf = (mte_ctrl & pref) ? pref : mte_ctrl;
+	sctlr &= ~SCTLR_EL1_TCF0_MASK;
+	if (resolved_mte_tcf & MTE_CTRL_TCF_ASYNC)
+		sctlr |= SCTLR_EL1_TCF0_ASYNC;
+	else if (resolved_mte_tcf & MTE_CTRL_TCF_SYNC)
+		sctlr |= SCTLR_EL1_TCF0_SYNC;
+	task->thread.sctlr_user = sctlr;
 
 	/*
 	 * Note that the mask controlled by the user via prctl() is an
@@ -197,7 +199,11 @@  static void update_gcr_el1_excl(u64 excl)
 	 * No need for ISB since this only affects EL0 currently, implicit
 	 * with ERET.
 	 */
-	sysreg_clear_set_s(SYS_GCR_EL1, SYS_GCR_EL1_EXCL_MASK, excl);
+	sysreg_clear_set_s(SYS_GCR_EL1, SYS_GCR_EL1_EXCL_MASK,
+			   (mte_ctrl & MTE_CTRL_GCR_USER_EXCL_MASK) >>
+				   MTE_CTRL_GCR_USER_EXCL_SHIFT);
+
+	preempt_enable();
 }
 
 void mte_thread_init_user(void)
@@ -211,35 +217,13 @@  void mte_thread_init_user(void)
 	clear_thread_flag(TIF_MTE_ASYNC_FAULT);
 	/* disable tag checking and reset tag generation mask */
 	current->thread.mte_ctrl = MTE_CTRL_GCR_USER_EXCL_MASK;
-	mte_update_sctlr_user(current);
+	mte_sync_ctrl(current);
 	set_task_sctlr_el1(current->thread.sctlr_user);
 }
 
-void mte_update_sctlr_user(struct task_struct *task)
-{
-	/*
-	 * This can only be called on the current or next task since the CPU
-	 * must match where the thread is going to run.
-	 */
-	unsigned long sctlr = task->thread.sctlr_user;
-	unsigned long mte_ctrl = task->thread.mte_ctrl;
-	unsigned long pref, resolved_mte_tcf;
-
-	preempt_disable();
-	pref = __this_cpu_read(mte_tcf_preferred);
-	resolved_mte_tcf = (mte_ctrl & pref) ? pref : mte_ctrl;
-	sctlr &= ~SCTLR_EL1_TCF0_MASK;
-	if (resolved_mte_tcf & MTE_CTRL_TCF_ASYNC)
-		sctlr |= SCTLR_EL1_TCF0_ASYNC;
-	else if (resolved_mte_tcf & MTE_CTRL_TCF_SYNC)
-		sctlr |= SCTLR_EL1_TCF0_SYNC;
-	task->thread.sctlr_user = sctlr;
-	preempt_enable();
-}
-
 void mte_thread_switch(struct task_struct *next)
 {
-	mte_update_sctlr_user(next);
+	mte_sync_ctrl(next);
 
 	/*
 	 * Check if an async tag exception occurred at EL1.
@@ -273,7 +257,7 @@  void mte_suspend_exit(void)
 	if (!system_supports_mte())
 		return;
 
-	update_gcr_el1_excl(gcr_kernel_excl);
+	mte_sync_ctrl(current);
 }
 
 long set_mte_ctrl(struct task_struct *task, unsigned long arg)
@@ -291,7 +275,7 @@  long set_mte_ctrl(struct task_struct *task, unsigned long arg)
 
 	task->thread.mte_ctrl = mte_ctrl;
 	if (task == current) {
-		mte_update_sctlr_user(task);
+		mte_sync_ctrl(task);
 		set_task_sctlr_el1(task->thread.sctlr_user);
 	}
 
@@ -467,7 +451,7 @@  static ssize_t mte_tcf_preferred_show(struct device *dev,
 
 static void sync_sctlr(void *arg)
 {
-	mte_update_sctlr_user(current);
+	mte_sync_ctrl(current);
 	set_task_sctlr_el1(current->thread.sctlr_user);
 }