| Message ID | 20240801-arm64-gcs-v10-23-699e2bd2190b@kernel.org (mailing list archive) |
|---|---|
| State | New |
| Series | arm64/gcs: Provide support for GCS in userspace |
On Thu, Aug 01, 2024 at 01:06:50PM +0100, Mark Brown wrote:
> When invoking a signal handler we use the GCS configuration and stack
> for the current thread.
>
> Since we implement signal return by calling the signal handler with a
> return address set up pointing to a trampoline in the vDSO we need to
> also configure any active GCS for this by pushing a frame for the
> trampoline onto the GCS. If we do not do this then signal return will
> generate a GCS protection fault.
>
> In order to guard against attempts to bypass GCS protections via signal
> return we only allow returning with GCSPR_EL0 pointing to an address
> where it was previously preempted by a signal. We do this by pushing a
> cap onto the GCS, this takes the form of an architectural GCS cap token
> with the top bit set and token type of 0 which we add on signal entry
> and validate and pop off on signal return. The combination of the top
> bit being set and the token type mean that this can't be interpreted as
> a valid token or address.
>
> Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
> Signed-off-by: Mark Brown <broonie@kernel.org>
> ---
>  arch/arm64/include/asm/gcs.h |   1 +
>  arch/arm64/kernel/signal.c   | 134 +++++++++++++++++++++++++++++++++++++++++--
>  arch/arm64/mm/gcs.c          |   1 +
>  3 files changed, 131 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c

[...]

> +#ifdef CONFIG_ARM64_GCS
> +
> +static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig)
> +{
> +        unsigned long __user *gcspr_el0;
> +        int ret = 0;
> +
> +        if (!system_supports_gcs())
> +                return 0;
> +
> +        if (!task_gcs_el0_enabled(current))
> +                return 0;
> +
> +        /*
> +         * We are entering a signal handler, current register state is
> +         * active.
> +         */
> +        gcspr_el0 = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0);
> +
> +        /*
> +         * Push a cap and the GCS entry for the trampoline onto the GCS.
> +         */
> +        put_user_gcs((unsigned long)sigtramp, gcspr_el0 - 2, &ret);
> +        put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 1), gcspr_el0 - 1, &ret);
> +        if (ret != 0)
> +                return ret;

What happens if we went wrong here, or if the signal we are delivering
was caused by a GCS overrun or bad GCSPR_EL0 in the first place?

It feels like a program has no way to rescue itself from excessive
recursion in some thread. Is there something equivalent to
sigaltstack()?

Or is the shadow stack always supposed to be big enough to cope with
recursion that exhausts the main stack and alternate signal stack (and
if so, how is this ensured)?

> +
> +        gcsb_dsync();
> +
> +        gcspr_el0 -= 2;
> +        write_sysreg_s((unsigned long)gcspr_el0, SYS_GCSPR_EL0);
> +
> +        return 0;
> +}

[...]

Cheers
---Dave
On Wed, Aug 14, 2024 at 03:51:42PM +0100, Dave Martin wrote:
> On Thu, Aug 01, 2024 at 01:06:50PM +0100, Mark Brown wrote:

> > +        put_user_gcs((unsigned long)sigtramp, gcspr_el0 - 2, &ret);
> > +        put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 1), gcspr_el0 - 1, &ret);
> > +        if (ret != 0)
> > +                return ret;

> What happens if we went wrong here, or if the signal we are delivering
> was caused by a GCS overrun or bad GCSPR_EL0 in the first place?

> It feels like a program has no way to rescue itself from excessive
> recursion in some thread. Is there something equivalent to
> sigaltstack()?

> Or is the shadow stack always supposed to be big enough to cope with
> recursion that exhausts the main stack and alternate signal stack (and
> if so, how is this ensured)?

There's no sigaltstack() for GCS, this is also the ABI with the existing
shadow stack on x86 and should be addressed in a cross architecture
fashion. There have been some discussions about providing a shadow alt
stack but they've generally been circular and inconclusive, there were a
bunch of tradeoffs for corner cases and nobody had a clear sense as to
what a good solution should be. It was a bit unclear that actively
doing anything was worthwhile. The issues were IIRC around unwinders
and disjoint shadow stacks, compatibility with non-shadow stacks and
behaviour when we overflow the shadow stack. I think there were also
some applications trying to be very clever with alt stacks that needed
to be interacted with and complicated everything but I could be
misremembering there.

Practically speaking since we're only storing return addresses the
default GCS should be extremely large so it's unlikely to come up
without first encountering and handling issues on the normal stack.
Users allocating their own shadow stacks should be careful. This isn't
really satisfying but is probably fine in practice, there's certainly
not been any pressure yet from the existing x86 deployments (though at
present nobody can explicitly select their own shadow stack size,
perhaps it'll become more of an issue when the clone3() stuff is in).
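For context, a minimal sketch of what allocating your own shadow stack looks
like from userspace, via the map_shadow_stack() system call (present for x86
shadow stacks and added for arm64 GCS by this series). The syscall number and
the SHADOW_STACK_SET_TOKEN flag value below are assumptions rather than values
taken from these patches; treat the snippet as illustrative only.

/*
 * Illustrative only: allocate a shadow stack explicitly from userspace.
 * __NR_map_shadow_stack and SHADOW_STACK_SET_TOKEN are assumed values;
 * check the uapi headers for the real ones.
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_map_shadow_stack
#define __NR_map_shadow_stack   453
#endif
#ifndef SHADOW_STACK_SET_TOKEN
#define SHADOW_STACK_SET_TOKEN  (1UL << 0)      /* place a cap token at the top */
#endif

int main(void)
{
        unsigned long size = 64 * 1024;         /* caller picks the size - and owns the risk */
        void *ss = (void *)syscall(__NR_map_shadow_stack, 0UL, size,
                                   SHADOW_STACK_SET_TOKEN);

        if (ss == (void *)-1) {
                perror("map_shadow_stack");
                return 1;
        }

        printf("shadow stack at %p, %lu bytes\n", ss, size);
        /* actually switching to it (e.g. a new thread via clone3()) is out of scope here */
        return 0;
}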
On Wed, Aug 14, 2024 at 05:00:23PM +0100, Mark Brown wrote:
> On Wed, Aug 14, 2024 at 03:51:42PM +0100, Dave Martin wrote:
> > On Thu, Aug 01, 2024 at 01:06:50PM +0100, Mark Brown wrote:
>
> > > +        put_user_gcs((unsigned long)sigtramp, gcspr_el0 - 2, &ret);
> > > +        put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 1), gcspr_el0 - 1, &ret);
> > > +        if (ret != 0)
> > > +                return ret;
>
> > What happens if we went wrong here, or if the signal we are delivering
> > was caused by a GCS overrun or bad GCSPR_EL0 in the first place?
>
> > It feels like a program has no way to rescue itself from excessive
> > recursion in some thread. Is there something equivalent to
> > sigaltstack()?
>
> > Or is the shadow stack always supposed to be big enough to cope with
> > recursion that exhausts the main stack and alternate signal stack (and
> > if so, how is this ensured)?
>
> There's no sigaltstack() for GCS, this is also the ABI with the existing
> shadow stack on x86 and should be addressed in a cross architecture
> fashion. There have been some discussions about providing a shadow alt
> stack but they've generally been circular and inconclusive, there were a
> bunch of tradeoffs for corner cases and nobody had a clear sense as to
> what a good solution should be. It was a bit unclear that actively
> doing anything was worthwhile. The issues were IIRC around unwinders
> and disjoint shadow stacks, compatibility with non-shadow stacks and
> behaviour when we overflow the shadow stack. I think there were also
> some applications trying to be very clever with alt stacks that needed
> to be interacted with and complicated everything but I could be
> misremembering there.
>
> Practically speaking since we're only storing return addresses the
> default GCS should be extremely large so it's unlikely to come up
> without first encountering and handling issues on the normal stack.
> Users allocating their own shadow stacks should be careful. This isn't
> really satisfying but is probably fine in practice, there's certainly
> not been any pressure yet from the existing x86 deployments (though at
> present nobody can explicitly select their own shadow stack size,
> perhaps it'll become more of an issue when the clone3() stuff is in).

Ack, if this is a known limitation then I guess it makes sense just to
follow other arches.

I see that we default the shadow stack size to half the main stack
size, which should indeed count as "huge". I guess this makes shadow
stack overrun unlikely at least (at least, not before the main stack
overruns).

Hopping to an alternate (main) stack while continuing to push on the
same shadow stack doesn't sound broken in principle.

Is there a test for taking and returning from a signal on an alternate
(main) stack, when a shadow stack is in use? Sounds like something
that would be good to check if not.

Cheers
---Dave
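For reference, the "half the main stack size" default can be approximated from
userspace as below. This is only a paraphrase of the sizing logic elsewhere in
the series; the page alignment and the upper clamp used here are assumptions,
not values copied from the patches.

/*
 * Rough userspace paraphrase of the default per-thread GCS sizing:
 * half of RLIMIT_STACK, page-aligned, clamped to a sane range.
 * The clamp bounds are assumptions for illustration.
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/resource.h>

int main(void)
{
        struct rlimit rl;
        unsigned long page = (unsigned long)sysconf(_SC_PAGESIZE);
        unsigned long size;

        if (getrlimit(RLIMIT_STACK, &rl))
                return 1;

        size = rl.rlim_cur / 2;                 /* half the main stack limit */
        if (size > (2UL << 30))                 /* assumed upper clamp (also handles RLIM_INFINITY) */
                size = 2UL << 30;
        size = (size + page - 1) & ~(page - 1); /* page align */
        if (size < page)
                size = page;

        printf("default GCS size would be ~%lu bytes\n", size);
        return 0;
}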
On Thu, Aug 15, 2024 at 02:37:22PM +0100, Dave Martin wrote:

> Is there a test for taking and returning from a signal on an alternate
> (main) stack, when a shadow stack is in use? Sounds like something
> that would be good to check if not.

Not specifically for any of the architectures.
On Thu, Aug 15, 2024 at 03:45:45PM +0100, Mark Brown wrote:
> On Thu, Aug 15, 2024 at 02:37:22PM +0100, Dave Martin wrote:
>
> > Is there a test for taking and returning from a signal on an alternate
> > (main) stack, when a shadow stack is in use? Sounds like something
> > that would be good to check if not.
>
> Not specifically for any of the architectures.

Can you see any reason why this shouldn't work?

Maybe I'll hack up a test if I get around to it, but don't take this
as a promise!

Cheers
---Dave
On Thu, Aug 15, 2024 at 04:11:56PM +0100, Dave Martin wrote:
> On Thu, Aug 15, 2024 at 03:45:45PM +0100, Mark Brown wrote:
> > On Thu, Aug 15, 2024 at 02:37:22PM +0100, Dave Martin wrote:

> > > Is there a test for taking and returning from a signal on an alternate
> > > (main) stack, when a shadow stack is in use? Sounds like something
> > > that would be good to check if not.

> > Not specifically for any of the architectures.

> Can you see any reason why this shouldn't work?

No, it's expected to work - I'm just not specifically aware of an
explicit test for it. Possibly some of the userspace bringup work
might've covered it? Any libc tests for altstack support should've
exercised it for example.

> Maybe I'll hack up a test if I get around to it, but don't take this
> as a promise!

Thanks for your firm commitment! :P
On Thu, Aug 15, 2024 at 04:29:54PM +0100, Mark Brown wrote:
> On Thu, Aug 15, 2024 at 04:11:56PM +0100, Dave Martin wrote:
> > On Thu, Aug 15, 2024 at 03:45:45PM +0100, Mark Brown wrote:
> > > On Thu, Aug 15, 2024 at 02:37:22PM +0100, Dave Martin wrote:
>
> > > > Is there a test for taking and returning from a signal on an alternate
> > > > (main) stack, when a shadow stack is in use? Sounds like something
> > > > that would be good to check if not.
>
> > > Not specifically for any of the architectures.
>
> > Can you see any reason why this shouldn't work?
>
> No, it's expected to work - I'm just not specifically aware of an
> explicit test for it. Possibly some of the userspace bringup work
> might've covered it? Any libc tests for altstack support should've
> exercised it for example.

That's true; if libc is built to use shadow stack, generic API tests
ought to cover this.

> > Maybe I'll hack up a test if I get around to it, but don't take this
> > as a promise!
>
> Thanks for your firm commitment! :P

No problem...

Cheers
---Dave
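For illustration, a rough sketch of the kind of test being discussed: take and
return from a signal delivered on an alternate (main) stack. It assumes the
process is already running with GCS/shadow stack enabled (for example by the C
library or startup code) and only exercises the generic sigaltstack() path;
the names and checks here are illustrative, not part of the posted patches.

/*
 * Sketch: deliver a signal on an alternate (main) stack while a shadow
 * stack is in use; signal return must pop the GCS cap pushed at entry.
 */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static stack_t ss;
static volatile sig_atomic_t on_altstack;

static void handler(int sig)
{
        char c;

        /* crude check that the handler frame lives inside the altstack */
        on_altstack = ((char *)&c >= (char *)ss.ss_sp &&
                       (char *)&c < (char *)ss.ss_sp + ss.ss_size);
}

int main(void)
{
        struct sigaction sa;

        ss.ss_sp = malloc(SIGSTKSZ);
        ss.ss_size = SIGSTKSZ;
        ss.ss_flags = 0;
        if (sigaltstack(&ss, NULL))
                return 1;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = handler;
        sa.sa_flags = SA_ONSTACK;       /* deliver on the alternate stack */
        sigaction(SIGUSR1, &sa, NULL);

        raise(SIGUSR1);                 /* returning from here exercises the GCS cap pop */

        printf("handler ran on altstack: %d\n", (int)on_altstack);
        return on_altstack ? 0 : 1;
}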
On Thu, Aug 01, 2024 at 01:06:50PM +0100, Mark Brown wrote:
> @@ -860,6 +892,50 @@ static int restore_sigframe(struct pt_regs *regs,
>  	return err;
>  }
> 
> +#ifdef CONFIG_ARM64_GCS
> +static int gcs_restore_signal(void)
> +{
> +        u64 gcspr_el0, cap;

Nitpick: use 'unsigned long __user *gcspr_el0' as in the
gcs_signal_entry(). It's more consistent and probably less casting.

> +        int ret;
> +
> +        if (!system_supports_gcs())
> +                return 0;
> +
> +        if (!(current->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE))
> +                return 0;
> +
> +        gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0);
> +
> +        /*
> +         * GCSPR_EL0 should be pointing at a capped GCS, read the cap...
> +         */
> +        gcsb_dsync();
> +        ret = copy_from_user(&cap, (__user void*)gcspr_el0, sizeof(cap));
> +        if (ret)
> +                return -EFAULT;

Can the user change GCSPR_EL0 to a non-shadow-stack region, fake the
cap before sigreturn? copy_from_user() cannot check it's a GCS page.
Does it actually matter?

> @@ -1130,7 +1209,50 @@ static int get_sigframe(struct rt_sigframe_user_layout *user,
>  	return 0;
>  }
> 
> -static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
> +#ifdef CONFIG_ARM64_GCS
> +
> +static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig)
> +{
> +        unsigned long __user *gcspr_el0;
> +        int ret = 0;
> +
> +        if (!system_supports_gcs())
> +                return 0;
> +
> +        if (!task_gcs_el0_enabled(current))
> +                return 0;
> +
> +        /*
> +         * We are entering a signal handler, current register state is
> +         * active.
> +         */
> +        gcspr_el0 = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0);
> +
> +        /*
> +         * Push a cap and the GCS entry for the trampoline onto the GCS.
> +         */
> +        put_user_gcs((unsigned long)sigtramp, gcspr_el0 - 2, &ret);
> +        put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 1), gcspr_el0 - 1, &ret);
> +        if (ret != 0)
> +                return ret;

Doesn't the second put_user_gcs() override the previous ret?

> +
> +        gcsb_dsync();

Wondering if we need the barrier both for entry and restore. If the
restore happens on another CPU, we have the barriers in the context
switch code already. If it's only the kernel writing the caps with
GCSSTTR on setting up the stack and checking it on return, a single
barrier is sufficient (can be this one). If the user can write
something on the stack or maybe doing a sigreturn without fully
unwinding the stack, we may need both.

Either way, it would help to add some comments on these barriers.
On Wed, Aug 21, 2024 at 06:28:49PM +0100, Catalin Marinas wrote:
> On Thu, Aug 01, 2024 at 01:06:50PM +0100, Mark Brown wrote:

> > +        ret = copy_from_user(&cap, (__user void*)gcspr_el0, sizeof(cap));
> > +        if (ret)
> > +                return -EFAULT;

> Can the user change GCSPR_EL0 to a non-shadow-stack region, fake the
> cap before sigreturn? copy_from_user() cannot check it's a GCS page.
> Does it actually matter?

We don't take any steps to prevent that since I'm not clear that it
matters, as soon as userspace tries to use the non-GCS page as a GCS it
will fault. Given the abundance of ways in which a signal handler can
cause a crash it didn't seem worth specific code, the cap token check
is about protecting an actual GCS.

> > +        /*
> > +         * Push a cap and the GCS entry for the trampoline onto the GCS.
> > +         */
> > +        put_user_gcs((unsigned long)sigtramp, gcspr_el0 - 2, &ret);
> > +        put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 1), gcspr_el0 - 1, &ret);
> > +        if (ret != 0)
> > +                return ret;

> Doesn't the second put_user_gcs() override the previous ret?

No, we only set ret on error - if the first one faults it'll set ret
then the second one will either leave it unchanged or write the same
error code depending on if it fails. This idiom is used quite a lot in
the signal code.
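A stand-alone illustration of that idiom: the error pointer is written only on
failure, so several stores can share one variable and be checked once at the
end. do_store() below is a made-up stand-in for the real uaccess helper, not
kernel code.

/*
 * Illustration of the "write err only on failure" idiom used by
 * put_user_gcs() and much of the signal code.
 */
#include <errno.h>
#include <stdio.h>

static int do_store(unsigned long val, unsigned long *addr)
{
        if (!addr)              /* pretend a NULL destination faults */
                return -EFAULT;
        *addr = val;
        return 0;
}

static void store_checked(unsigned long val, unsigned long *addr, int *err)
{
        if (do_store(val, addr))
                *err = -EFAULT; /* only failures touch *err ... */
        /* ... success deliberately leaves any earlier error in place */
}

int main(void)
{
        unsigned long slot[2];
        int err = 0;

        store_checked(1, &slot[0], &err);       /* succeeds */
        store_checked(2, NULL, &err);           /* fails, err = -EFAULT */
        store_checked(3, &slot[1], &err);       /* succeeds, err stays -EFAULT */

        printf("err = %d\n", err);
        return 0;
}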
On Wed, Aug 21, 2024 at 07:03:13PM +0100, Mark Brown wrote:
> On Wed, Aug 21, 2024 at 06:28:49PM +0100, Catalin Marinas wrote:
> > On Thu, Aug 01, 2024 at 01:06:50PM +0100, Mark Brown wrote:
> > > +        /*
> > > +         * Push a cap and the GCS entry for the trampoline onto the GCS.
> > > +         */
> > > +        put_user_gcs((unsigned long)sigtramp, gcspr_el0 - 2, &ret);
> > > +        put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 1), gcspr_el0 - 1, &ret);
> > > +        if (ret != 0)
> > > +                return ret;
>
> > Doesn't the second put_user_gcs() override the previous ret?
>
> No, we only set ret on error - if the first one faults it'll set ret
> then the second one will either leave it unchanged or write the same
> error code depending on if it fails. This idiom is used quite a lot in
> the signal code.

You are right, I missed that it's called 'err' in put_user_gcs() and
thought it was overridden.
diff --git a/arch/arm64/include/asm/gcs.h b/arch/arm64/include/asm/gcs.h
index 48c97e63e56a..f50660603ecf 100644
--- a/arch/arm64/include/asm/gcs.h
+++ b/arch/arm64/include/asm/gcs.h
@@ -9,6 +9,7 @@
 #include <asm/uaccess.h>
 
 struct kernel_clone_args;
+struct ksignal;
 
 static inline void gcsb_dsync(void)
 {
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 4a77f4976e11..a1e0aa38bff9 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -25,6 +25,7 @@
 #include <asm/elf.h>
 #include <asm/exception.h>
 #include <asm/cacheflush.h>
+#include <asm/gcs.h>
 #include <asm/ucontext.h>
 #include <asm/unistd.h>
 #include <asm/fpsimd.h>
@@ -34,6 +35,37 @@
 #include <asm/traps.h>
 #include <asm/vdso.h>
 
+#ifdef CONFIG_ARM64_GCS
+/* Extra bit set in the address distinguishing a signal cap token. */
+#define GCS_SIGNAL_CAP_FLAG     BIT(63)
+
+#define GCS_SIGNAL_CAP(addr) ((((unsigned long)addr) & GCS_CAP_ADDR_MASK) | \
+                              GCS_SIGNAL_CAP_FLAG)
+
+static bool gcs_signal_cap_valid(u64 addr, u64 val)
+{
+        /*
+         * The top bit should be set, this is an invalid address for
+         * EL0 and will only be set for caps created by signals.
+         */
+        if (!(val & GCS_SIGNAL_CAP_FLAG))
+                return false;
+
+        /* The rest should be a standard architectural cap token. */
+        val &= ~GCS_SIGNAL_CAP_FLAG;
+
+        /* The cap must not have a token set */
+        if (GCS_CAP_TOKEN(val) != 0)
+                return false;
+
+        /* The cap must store the VA the cap was stored at */
+        if (GCS_CAP_ADDR(addr) != GCS_CAP_ADDR(val))
+                return false;
+
+        return true;
+}
+#endif
+
 /*
  * Do a signal return; undo the signal stack. These are aligned to 128-bit.
  */
@@ -860,6 +892,50 @@ static int restore_sigframe(struct pt_regs *regs,
 	return err;
 }
 
+#ifdef CONFIG_ARM64_GCS
+static int gcs_restore_signal(void)
+{
+        u64 gcspr_el0, cap;
+        int ret;
+
+        if (!system_supports_gcs())
+                return 0;
+
+        if (!(current->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE))
+                return 0;
+
+        gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0);
+
+        /*
+         * GCSPR_EL0 should be pointing at a capped GCS, read the cap...
+         */
+        gcsb_dsync();
+        ret = copy_from_user(&cap, (__user void*)gcspr_el0, sizeof(cap));
+        if (ret)
+                return -EFAULT;
+
+        /*
+         * ...then check that the cap is the actual GCS before
+         * restoring it.
+         */
+        if (!gcs_signal_cap_valid(gcspr_el0, cap))
+                return -EINVAL;
+
+        /* Invalidate the token to prevent reuse */
+        put_user_gcs(0, (__user void*)gcspr_el0, &ret);
+        if (ret != 0)
+                return -EFAULT;
+
+        current->thread.gcspr_el0 = gcspr_el0 + sizeof(cap);
+        write_sysreg_s(current->thread.gcspr_el0, SYS_GCSPR_EL0);
+
+        return 0;
+}
+
+#else
+static int gcs_restore_signal(void) { return 0; }
+#endif
+
 SYSCALL_DEFINE0(rt_sigreturn)
 {
 	struct pt_regs *regs = current_pt_regs();
@@ -886,6 +962,9 @@ SYSCALL_DEFINE0(rt_sigreturn)
 	if (restore_altstack(&frame->uc.uc_stack))
 		goto badframe;
 
+	if (gcs_restore_signal())
+		goto badframe;
+
 	return regs->regs[0];
 
 badframe:
@@ -1130,7 +1209,50 @@ static int get_sigframe(struct rt_sigframe_user_layout *user,
 	return 0;
 }
 
-static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
+#ifdef CONFIG_ARM64_GCS
+
+static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig)
+{
+        unsigned long __user *gcspr_el0;
+        int ret = 0;
+
+        if (!system_supports_gcs())
+                return 0;
+
+        if (!task_gcs_el0_enabled(current))
+                return 0;
+
+        /*
+         * We are entering a signal handler, current register state is
+         * active.
+         */
+        gcspr_el0 = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0);
+
+        /*
+         * Push a cap and the GCS entry for the trampoline onto the GCS.
+         */
+        put_user_gcs((unsigned long)sigtramp, gcspr_el0 - 2, &ret);
+        put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 1), gcspr_el0 - 1, &ret);
+        if (ret != 0)
+                return ret;
+
+        gcsb_dsync();
+
+        gcspr_el0 -= 2;
+        write_sysreg_s((unsigned long)gcspr_el0, SYS_GCSPR_EL0);
+
+        return 0;
+}
+#else
+
+static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig)
+{
+        return 0;
+}
+
+#endif
+
+static int setup_return(struct pt_regs *regs, struct ksignal *ksig,
 			 struct rt_sigframe_user_layout *user, int usig)
 {
 	__sigrestore_t sigtramp;
@@ -1138,7 +1260,7 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
 	regs->regs[0] = usig;
 	regs->sp = (unsigned long)user->sigframe;
 	regs->regs[29] = (unsigned long)&user->next_frame->fp;
-	regs->pc = (unsigned long)ka->sa.sa_handler;
+	regs->pc = (unsigned long)ksig->ka.sa.sa_handler;
 
 	/*
 	 * Signal delivery is a (wacky) indirect function call in
@@ -1178,12 +1300,14 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
 		sme_smstop();
 	}
 
-	if (ka->sa.sa_flags & SA_RESTORER)
-		sigtramp = ka->sa.sa_restorer;
+	if (ksig->ka.sa.sa_flags & SA_RESTORER)
+		sigtramp = ksig->ka.sa.sa_restorer;
 	else
 		sigtramp = VDSO_SYMBOL(current->mm->context.vdso, sigtramp);
 
 	regs->regs[30] = (unsigned long)sigtramp;
+
+	return gcs_signal_entry(sigtramp, ksig);
 }
 
 static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
@@ -1206,7 +1330,7 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
 	err |= __save_altstack(&frame->uc.uc_stack, regs->sp);
 	err |= setup_sigframe(&user, regs, set);
 	if (err == 0) {
-		setup_return(regs, &ksig->ka, &user, usig);
+		err = setup_return(regs, ksig, &user, usig);
 		if (ksig->ka.sa.sa_flags & SA_SIGINFO) {
 			err |= copy_siginfo_to_user(&frame->info, &ksig->info);
 			regs->regs[1] = (unsigned long)&frame->info;
diff --git a/arch/arm64/mm/gcs.c b/arch/arm64/mm/gcs.c
index 6703c70581a4..30614f4ad164 100644
--- a/arch/arm64/mm/gcs.c
+++ b/arch/arm64/mm/gcs.c
@@ -6,6 +6,7 @@
 #include <linux/types.h>
 
 #include <asm/cpufeature.h>
+#include <asm/gcs.h>
 #include <asm/page.h>
 
 static unsigned long alloc_gcs(unsigned long addr, unsigned long size)
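To make the cap encoding in the patch concrete, here is a small stand-alone
illustration of how GCS_SIGNAL_CAP() and gcs_signal_cap_valid() relate. The
mask values are assumptions standing in for the real GCS_CAP_* definitions,
which live elsewhere in the series; the address used is made up.

/*
 * Illustration of the signal cap token layout: bits [63:12] hold the
 * address, bits [11:0] the token type (0 for a signal cap), and the
 * signal cap additionally sets bit 63.  Mask values are assumptions.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define CAP_ADDR_MASK   0xfffffffffffff000ULL   /* bits [63:12], assumed */
#define CAP_TOKEN_MASK  0x0000000000000fffULL   /* bits [11:0], assumed */
#define SIGNAL_CAP_FLAG (1ULL << 63)

static uint64_t signal_cap(uint64_t addr)
{
        return (addr & CAP_ADDR_MASK) | SIGNAL_CAP_FLAG;
}

static bool signal_cap_valid(uint64_t addr, uint64_t val)
{
        if (!(val & SIGNAL_CAP_FLAG))           /* must be a signal cap... */
                return false;
        val &= ~SIGNAL_CAP_FLAG;
        if (val & CAP_TOKEN_MASK)               /* ...with token type 0... */
                return false;
        return (addr & CAP_ADDR_MASK) == (val & CAP_ADDR_MASK); /* ...stored at its own VA */
}

int main(void)
{
        uint64_t gcspr = 0x0000ffffdeadb000ULL; /* made-up GCS pointer */
        uint64_t cap = signal_cap(gcspr - 8);   /* cap slot just below the old top */

        printf("cap = 0x%016llx valid = %d\n",
               (unsigned long long)cap,
               signal_cap_valid(gcspr - 8, cap));
        return 0;
}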