diff mbox series

[v10,19/40] arm64/gcs: Context switch GCS state for EL0

Message ID 20240801-arm64-gcs-v10-19-699e2bd2190b@kernel.org (mailing list archive)
State Superseded
Headers show
Series arm64/gcs: Provide support for GCS in userspace | expand

Checks

Context Check Description
conchuod/vmtest-fixes-PR fail merge-conflict

Commit Message

Mark Brown Aug. 1, 2024, 12:06 p.m. UTC
There are two registers controlling the GCS state of EL0, GCSPR_EL0 which
is the current GCS pointer and GCSCRE0_EL1 which has enable bits for the
specific GCS functionality enabled for EL0. Manage these on context switch
and process lifetime events, GCS is reset on exec().  Also ensure that
any changes to the GCS memory are visible to other PEs and that changes
from other PEs are visible on this one by issuing a GCSB DSYNC when
moving to or from a thread with GCS.

Since the current GCS configuration of a thread will be visible to
userspace we store the configuration in the format used with userspace
and provide a helper which configures the system register as needed.

On systems that support GCS we always allow access to GCSPR_EL0, this
facilitates reporting of GCS faults if userspace implements disabling of
GCS on error - the GCS can still be discovered and examined even if GCS
has been disabled.

Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/gcs.h       | 24 ++++++++++++++++
 arch/arm64/include/asm/processor.h |  6 ++++
 arch/arm64/kernel/process.c        | 56 ++++++++++++++++++++++++++++++++++++++
 arch/arm64/mm/Makefile             |  1 +
 arch/arm64/mm/gcs.c                | 39 ++++++++++++++++++++++++++
 5 files changed, 126 insertions(+)

Comments

Catalin Marinas Aug. 19, 2024, 11:46 a.m. UTC | #1
On Thu, Aug 01, 2024 at 01:06:46PM +0100, Mark Brown wrote:
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 4ae31b7af6c3..5f00cb0da9c3 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
[...]
> +static void gcs_thread_switch(struct task_struct *next)
> +{
> +	if (!system_supports_gcs())
> +		return;
> +
> +	gcs_preserve_current_state();
> +
> +	gcs_set_el0_mode(next);
> +	write_sysreg_s(next->thread.gcspr_el0, SYS_GCSPR_EL0);
> +
> +	/*
> +	 * Ensure that GCS changes are observable by/from other PEs in
> +	 * case of migration.
> +	 */
> +	if (task_gcs_el0_enabled(current) || task_gcs_el0_enabled(next))
> +		gcsb_dsync();

Could we do the sysreg writing under this 'if' block? If no app is using
GCS (which would be the case for a while), it looks like unnecessary
sysreg accesses.

What's the GCSB DSYNC supposed to do here? The Arm ARM talks about
ordering between GCS memory effects and other memory effects. I haven't
looked at the memory model in detail yet (D11.9.1) but AFAICT it has
nothing to do with the system registers. We'll need this barrier when
ordering is needed between explicit or implicit (e.g. BL) GCS accesses
and the explicit classic memory accesses. Paging comes to mind, so maybe
flush_dcache_page() would need this barrier. ptrace() is another case if
the memory accessed is a GCS page. I can see you added it in other
places, I'll have a look as I go through the rest. But I don't think one
is needed here.
Mark Brown Aug. 19, 2024, 3:44 p.m. UTC | #2
On Mon, Aug 19, 2024 at 12:46:13PM +0100, Catalin Marinas wrote:
> On Thu, Aug 01, 2024 at 01:06:46PM +0100, Mark Brown wrote:

> > +	/*
> > +	 * Ensure that GCS changes are observable by/from other PEs in
> > +	 * case of migration.
> > +	 */
> > +	if (task_gcs_el0_enabled(current) || task_gcs_el0_enabled(next))
> > +		gcsb_dsync();

> Could we do the sysreg writing under this 'if' block? If no app is using
> GCS (which would be the case for a while), it looks like unnecessary
> sysreg accesses.

Yes, that should be fine I think.

> What's the GCSB DSYNC supposed to do here? The Arm ARM talks about
> ordering between GCS memory effects and other memory effects. I haven't
> looked at the memory model in detail yet (D11.9.1) but AFAICT it has
> nothing to do with the system registers. We'll need this barrier when
> ordering is needed between explicit or implicit (e.g. BL) GCS accesses
> and the explicit classic memory accesses. Paging comes to mind, so maybe
> flush_dcache_page() would need this barrier. ptrace() is another case if
> the memory accessed is a GCS page. I can see you added it in other
> places, I'll have a look as I go through the rest. But I don't think one
> is needed here.

It's not particuarly for the system registers, is there's so that
anything else that looks at the task's GCS sees the current state.  I'm
pretty confident this excessive, the goal was to err on the side of
correctness and then relax later.
Catalin Marinas Aug. 20, 2024, 5:07 p.m. UTC | #3
On Mon, Aug 19, 2024 at 04:44:42PM +0100, Mark Brown wrote:
> On Mon, Aug 19, 2024 at 12:46:13PM +0100, Catalin Marinas wrote:
> > On Thu, Aug 01, 2024 at 01:06:46PM +0100, Mark Brown wrote:
> > > +	/*
> > > +	 * Ensure that GCS changes are observable by/from other PEs in
> > > +	 * case of migration.
> > > +	 */
> > > +	if (task_gcs_el0_enabled(current) || task_gcs_el0_enabled(next))
> > > +		gcsb_dsync();
[...]
> > What's the GCSB DSYNC supposed to do here? The Arm ARM talks about
> > ordering between GCS memory effects and other memory effects. I haven't
> > looked at the memory model in detail yet (D11.9.1) but AFAICT it has
> > nothing to do with the system registers. We'll need this barrier when
> > ordering is needed between explicit or implicit (e.g. BL) GCS accesses
> > and the explicit classic memory accesses. Paging comes to mind, so maybe
> > flush_dcache_page() would need this barrier. ptrace() is another case if
> > the memory accessed is a GCS page. I can see you added it in other
> > places, I'll have a look as I go through the rest. But I don't think one
> > is needed here.
> 
> It's not particuarly for the system registers, is there's so that
> anything else that looks at the task's GCS sees the current state.

Ah, so that's the to ensure that any writes on the CPU to the GCS stack
would be observable if the task appears on a different CPU (together
with the additional classic ordering/spinlocks used for the run queues).
Maybe update the comment to say "GCS memory effects" instead of "GCS
changes". I read the latter as GCS sysreg changes. Something like below
would make it clearer:

	/*
	 * Ensure that GCS memory effects of the 'prev' thread are
	 * ordered before other memory accesses with release semantics
	 * (or preceded by a DMB) on the current PE. In addition, any
	 * memory accesses with acquire semantics (or succeeded by a
	 * DMB) are ordered before GCS memory effects of the 'next'
	 * thread. This will ensure that the GCS memory effects are
	 * visible to other PEs in case of migration.
	 */

Feel free to rephrase as you see fit.

> I'm pretty confident this excessive, the goal was to err on the side
> of correctness and then relax later.

I think we are missing some. Paging should be ok as we have a pte change
and TLBI and IIRC the same rules as for standard memory accesses apply.
ptrace() memory accesses may need something though I'm fine with
considering this a best effort (we can't guarantee anyway if any
accesses are on different CPUs). I haven't got to the signal handling
patch yet.
Mark Brown Aug. 20, 2024, 5:56 p.m. UTC | #4
On Mon, Aug 19, 2024 at 04:44:52PM +0100, Mark Brown wrote:
> On Mon, Aug 19, 2024 at 12:46:13PM +0100, Catalin Marinas wrote:
> > On Thu, Aug 01, 2024 at 01:06:46PM +0100, Mark Brown wrote:

> > > +	/*
> > > +	 * Ensure that GCS changes are observable by/from other PEs in
> > > +	 * case of migration.
> > > +	 */
> > > +	if (task_gcs_el0_enabled(current) || task_gcs_el0_enabled(next))
> > > +		gcsb_dsync();

> > Could we do the sysreg writing under this 'if' block? If no app is using
> > GCS (which would be the case for a while), it looks like unnecessary
> > sysreg accesses.

> Yes, that should be fine I think.

I forgot when writing the above that we always allow reads from
GCSPR_EL0 in order to avoid corner cases for unwinders in the case of
asynchronous disable.  I'd expect that to be cheap to access though.
Catalin Marinas Aug. 21, 2024, 8:50 a.m. UTC | #5
On Tue, Aug 20, 2024 at 06:56:19PM +0100, Mark Brown wrote:
> On Mon, Aug 19, 2024 at 04:44:52PM +0100, Mark Brown wrote:
> > On Mon, Aug 19, 2024 at 12:46:13PM +0100, Catalin Marinas wrote:
> > > On Thu, Aug 01, 2024 at 01:06:46PM +0100, Mark Brown wrote:
> 
> > > > +	/*
> > > > +	 * Ensure that GCS changes are observable by/from other PEs in
> > > > +	 * case of migration.
> > > > +	 */
> > > > +	if (task_gcs_el0_enabled(current) || task_gcs_el0_enabled(next))
> > > > +		gcsb_dsync();
> 
> > > Could we do the sysreg writing under this 'if' block? If no app is using
> > > GCS (which would be the case for a while), it looks like unnecessary
> > > sysreg accesses.
> 
> > Yes, that should be fine I think.
> 
> I forgot when writing the above that we always allow reads from
> GCSPR_EL0 in order to avoid corner cases for unwinders in the case of
> asynchronous disable.  I'd expect that to be cheap to access though.

But then gcs_preserve_current_state() doesn't save the GCSPR_EL0 value
if the shadow stack was disabled. At the subsequent switch to this task,
we write some stale value.
Mark Brown Aug. 21, 2024, 12:48 p.m. UTC | #6
On Wed, Aug 21, 2024 at 09:50:22AM +0100, Catalin Marinas wrote:
> On Tue, Aug 20, 2024 at 06:56:19PM +0100, Mark Brown wrote:

> > I forgot when writing the above that we always allow reads from
> > GCSPR_EL0 in order to avoid corner cases for unwinders in the case of
> > asynchronous disable.  I'd expect that to be cheap to access though.

> But then gcs_preserve_current_state() doesn't save the GCSPR_EL0 value
> if the shadow stack was disabled. At the subsequent switch to this task,
> we write some stale value.

True, we should make the disable save the current value.
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/gcs.h b/arch/arm64/include/asm/gcs.h
index 7c5e95218db6..04594ef59dad 100644
--- a/arch/arm64/include/asm/gcs.h
+++ b/arch/arm64/include/asm/gcs.h
@@ -48,4 +48,28 @@  static inline u64 gcsss2(void)
 	return Xt;
 }
 
+#ifdef CONFIG_ARM64_GCS
+
+static inline bool task_gcs_el0_enabled(struct task_struct *task)
+{
+	return current->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE;
+}
+
+void gcs_set_el0_mode(struct task_struct *task);
+void gcs_free(struct task_struct *task);
+void gcs_preserve_current_state(void);
+
+#else
+
+static inline bool task_gcs_el0_enabled(struct task_struct *task)
+{
+	return false;
+}
+
+static inline void gcs_set_el0_mode(struct task_struct *task) { }
+static inline void gcs_free(struct task_struct *task) { }
+static inline void gcs_preserve_current_state(void) { }
+
+#endif
+
 #endif
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index f77371232d8c..c55e3600604a 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -184,6 +184,12 @@  struct thread_struct {
 	u64			sctlr_user;
 	u64			svcr;
 	u64			tpidr2_el0;
+#ifdef CONFIG_ARM64_GCS
+	unsigned int		gcs_el0_mode;
+	u64			gcspr_el0;
+	u64			gcs_base;
+	u64			gcs_size;
+#endif
 };
 
 static inline unsigned int thread_get_vl(struct thread_struct *thread,
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 4ae31b7af6c3..5f00cb0da9c3 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -48,6 +48,7 @@ 
 #include <asm/cacheflush.h>
 #include <asm/exec.h>
 #include <asm/fpsimd.h>
+#include <asm/gcs.h>
 #include <asm/mmu_context.h>
 #include <asm/mte.h>
 #include <asm/processor.h>
@@ -271,12 +272,32 @@  static void flush_tagged_addr_state(void)
 		clear_thread_flag(TIF_TAGGED_ADDR);
 }
 
+#ifdef CONFIG_ARM64_GCS
+
+static void flush_gcs(void)
+{
+	if (!system_supports_gcs())
+		return;
+
+	gcs_free(current);
+	current->thread.gcs_el0_mode = 0;
+	write_sysreg_s(0, SYS_GCSCRE0_EL1);
+	write_sysreg_s(0, SYS_GCSPR_EL0);
+}
+
+#else
+
+static void flush_gcs(void) { }
+
+#endif
+
 void flush_thread(void)
 {
 	fpsimd_flush_thread();
 	tls_thread_flush();
 	flush_ptrace_hw_breakpoint(current);
 	flush_tagged_addr_state();
+	flush_gcs();
 }
 
 void arch_release_task_struct(struct task_struct *tsk)
@@ -471,6 +492,40 @@  static void entry_task_switch(struct task_struct *next)
 	__this_cpu_write(__entry_task, next);
 }
 
+#ifdef CONFIG_ARM64_GCS
+
+void gcs_preserve_current_state(void)
+{
+	if (task_gcs_el0_enabled(current))
+		current->thread.gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0);
+}
+
+static void gcs_thread_switch(struct task_struct *next)
+{
+	if (!system_supports_gcs())
+		return;
+
+	gcs_preserve_current_state();
+
+	gcs_set_el0_mode(next);
+	write_sysreg_s(next->thread.gcspr_el0, SYS_GCSPR_EL0);
+
+	/*
+	 * Ensure that GCS changes are observable by/from other PEs in
+	 * case of migration.
+	 */
+	if (task_gcs_el0_enabled(current) || task_gcs_el0_enabled(next))
+		gcsb_dsync();
+}
+
+#else
+
+static void gcs_thread_switch(struct task_struct *next)
+{
+}
+
+#endif
+
 /*
  * ARM erratum 1418040 handling, affecting the 32bit view of CNTVCT.
  * Ensure access is disabled when switching to a 32bit task, ensure
@@ -530,6 +585,7 @@  struct task_struct *__switch_to(struct task_struct *prev,
 	ssbs_thread_switch(next);
 	erratum_1418040_thread_switch(next);
 	ptrauth_thread_switch_user(next);
+	gcs_thread_switch(next);
 
 	/*
 	 * Complete any pending TLB or cache maintenance on this CPU in case
diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
index 60454256945b..1a7b3a2f21e6 100644
--- a/arch/arm64/mm/Makefile
+++ b/arch/arm64/mm/Makefile
@@ -11,6 +11,7 @@  obj-$(CONFIG_TRANS_TABLE)	+= trans_pgd.o
 obj-$(CONFIG_TRANS_TABLE)	+= trans_pgd-asm.o
 obj-$(CONFIG_DEBUG_VIRTUAL)	+= physaddr.o
 obj-$(CONFIG_ARM64_MTE)		+= mteswap.o
+obj-$(CONFIG_ARM64_GCS)		+= gcs.o
 KASAN_SANITIZE_physaddr.o	+= n
 
 obj-$(CONFIG_KASAN)		+= kasan_init.o
diff --git a/arch/arm64/mm/gcs.c b/arch/arm64/mm/gcs.c
new file mode 100644
index 000000000000..b0a67efc522b
--- /dev/null
+++ b/arch/arm64/mm/gcs.c
@@ -0,0 +1,39 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/syscalls.h>
+#include <linux/types.h>
+
+#include <asm/cpufeature.h>
+#include <asm/page.h>
+
+/*
+ * Apply the GCS mode configured for the specified task to the
+ * hardware.
+ */
+void gcs_set_el0_mode(struct task_struct *task)
+{
+	u64 gcscre0_el1 = GCSCRE0_EL1_nTR;
+
+	if (task->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE)
+		gcscre0_el1 |= GCSCRE0_EL1_RVCHKEN | GCSCRE0_EL1_PCRSEL;
+
+	if (task->thread.gcs_el0_mode & PR_SHADOW_STACK_WRITE)
+		gcscre0_el1 |= GCSCRE0_EL1_STREn;
+
+	if (task->thread.gcs_el0_mode & PR_SHADOW_STACK_PUSH)
+		gcscre0_el1 |= GCSCRE0_EL1_PUSHMEn;
+
+	write_sysreg_s(gcscre0_el1, SYS_GCSCRE0_EL1);
+}
+
+void gcs_free(struct task_struct *task)
+{
+	if (task->thread.gcs_base)
+		vm_munmap(task->thread.gcs_base, task->thread.gcs_size);
+
+	task->thread.gcspr_el0 = 0;
+	task->thread.gcs_base = 0;
+	task->thread.gcs_size = 0;
+}