Message ID: 20200427160018.243569-1-samitolvanen@google.com (mailing list archive)
Series: add support for Clang's Shadow Call Stack
On Mon, 27 Apr 2020 at 18:00, Sami Tolvanen <samitolvanen@google.com> wrote:
>
> This patch series adds support for Clang's Shadow Call Stack (SCS)
> mitigation, which uses a separately allocated shadow stack to protect
> against return address overwrites. More information can be found here:
>
> https://clang.llvm.org/docs/ShadowCallStack.html
>
> SCS provides better protection against traditional buffer overflows
> than CONFIG_STACKPROTECTOR_*, but it should be noted that SCS security
> guarantees in the kernel differ from the ones documented for user
> space. The kernel must store addresses of shadow stacks in memory,
> which means an attacker capable of reading and writing arbitrary
> memory may be able to locate them and hijack control flow by modifying
> the shadow stacks.
>
> SCS is currently supported only on arm64, where the compiler requires
> the x18 register to be reserved for holding the current task's shadow
> stack pointer.
>
> With -fsanitize=shadow-call-stack, the compiler injects instructions
> into all non-leaf C functions to store the return address to the
> shadow stack, and unconditionally load it again before returning. As a
> result, SCS is incompatible with features that rely on modifying
> function return addresses in the kernel stack to alter control flow. A
> copy of the return address is still kept in the kernel stack for
> compatibility with stack unwinding, for example.
>
> SCS has a minimal performance overhead, but allocating shadow stacks
> increases kernel memory usage. The feature is therefore mostly useful
> on hardware that lacks support for PAC instructions.
>
> Changes in v13:
> - Changed thread_info::shadow_call_stack to a base address and an
>   offset instead, and removed the now unneeded __scs_base() and
>   scs_save().
> - Removed alignment from the kmem_cache and static allocations.
> - Removed the task_set_scs() helper function.
> - Moved the assembly code for loading and storing the offset in
>   thread_info to scs_load/save macros.
> - Added offset checking to scs_corrupted().
> - Switched to cmpxchg_relaxed() in scs_check_usage().

OK, so one thing that came up in an offline discussion about SCS is the
way it interacts with the vmap'ed stack.

The vmap'ed stack is great for robustness, but it only works if things
don't explode for other reasons in the meantime. This means the
ordinary-to-shadow-call-stack size ratio should be chosen such that it
is *really* unlikely you could ever overflow the shadow call stack and
corrupt another task's call stack before hitting the vmap stack's guard
region.

Alternatively, I wonder if there is a way we could let the SCS and
ordinary stack share the [bottom of] the vmap'ed region. That would
give rather nasty results if the ordinary stack overflows into the SCS,
but for cases where we really recurse out of control, we could catch
this occurrence on either stack, whichever one occurs first. And the
nastiness, when it does occur, will not corrupt any state beyond the
stack of the current task.
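To make the instrumentation described in the cover letter concrete, the
following is a rough C model of what -fsanitize=shadow-call-stack does.
It is an illustrative sketch paraphrased from the Clang documentation,
not code from this series; scs_sp stands in for the reserved x18
register.

/*
 * Illustrative sketch only. On arm64 the compiler keeps the shadow call
 * stack pointer in x18 and emits roughly "str x30, [x18], #8" in the
 * prologue and "ldr x30, [x18, #-8]!" before returning; the C below
 * models that behaviour with an ordinary pointer.
 */
static unsigned long *scs_sp;	/* stand-in for the reserved x18 register */

static void instrumented_function(void)
{
	/* prologue: push a copy of the return address to the shadow stack */
	*scs_sp++ = (unsigned long)__builtin_return_address(0);

	/*
	 * ... function body: overflowing a local buffer can clobber the
	 * return address spilled to the ordinary stack, but not this copy ...
	 */

	/* epilogue: the return uses the shadow copy, not the stack copy */
	--scs_sp;
}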
On Mon, 27 Apr 2020 at 19:39, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 27 Apr 2020 at 18:00, Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > This patch series adds support for Clang's Shadow Call Stack (SCS)
> > mitigation, which uses a separately allocated shadow stack to protect
> > against return address overwrites. More information can be found here:
> >
> > https://clang.llvm.org/docs/ShadowCallStack.html
> >
> > [...]
>
> OK, so one thing that came up in an offline discussion about SCS is the
> way it interacts with the vmap'ed stack.
>
> The vmap'ed stack is great for robustness, but it only works if things
> don't explode for other reasons in the meantime. This means the
> ordinary-to-shadow-call-stack size ratio should be chosen such that it
> is *really* unlikely you could ever overflow the shadow call stack and
> corrupt another task's call stack before hitting the vmap stack's guard
> region.
>
> Alternatively, I wonder if there is a way we could let the SCS and
> ordinary stack share the [bottom of] the vmap'ed region. That would
> give rather nasty results if the ordinary stack overflows into the SCS,
> but for cases where we really recurse out of control, we could catch
> this occurrence on either stack, whichever one occurs first. And the
> nastiness, when it does occur, will not corrupt any state beyond the
> stack of the current task.

Hmm, I guess that would make it quite hard to keep the SCS address
secret though :-(
On Mon, Apr 27, 2020 at 10:50:34PM +0200, Ard Biesheuvel wrote:
> > OK, so one thing that came up in an offline discussion about SCS is
> > the way it interacts with the vmap'ed stack.
> >
> > The vmap'ed stack is great for robustness, but it only works if
> > things don't explode for other reasons in the meantime. This means
> > the ordinary-to-shadow-call-stack size ratio should be chosen such
> > that it is *really* unlikely you could ever overflow the shadow call
> > stack and corrupt another task's call stack before hitting the vmap
> > stack's guard region.
> >
> > Alternatively, I wonder if there is a way we could let the SCS and
> > ordinary stack share the [bottom of] the vmap'ed region. That would
> > give rather nasty results if the ordinary stack overflows into the
> > SCS, but for cases where we really recurse out of control, we could
> > catch this occurrence on either stack, whichever one occurs first.
> > And the nastiness, when it does occur, will not corrupt any state
> > beyond the stack of the current task.
>
> Hmm, I guess that would make it quite hard to keep the SCS address
> secret though :-(

Yes, and the stack potentially overflowing into the SCS sort of defeats
the purpose. I'm fine with increasing the SCS size to something safer,
but using a vmapped shadow stack seems like the correct solution to this
problem, at least on devices where allocating a full page isn't an
issue.

Sami
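A vmapped shadow stack along the lines Sami mentions could look roughly
like the sketch below. This is illustrative only and not part of the
series; it assumes one full page per shadow stack so that the unmapped
guard page in the vmalloc area catches overflows instead of letting them
spill into a neighbouring allocation.

#include <linux/gfp.h>
#include <linux/vmalloc.h>

/*
 * Illustrative sketch: allocate the shadow stack from the vmalloc area
 * so that a runaway shadow stack faults on a guard page rather than
 * corrupting another task's data. Costs a full page per task.
 */
static void *scs_alloc_vmapped(int node)
{
	return __vmalloc_node_range(PAGE_SIZE, PAGE_SIZE,
				    VMALLOC_START, VMALLOC_END,
				    GFP_KERNEL | __GFP_ZERO, PAGE_KERNEL,
				    0, node, __builtin_return_address(0));
}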
From: Sami Tolvanen
> Sent: 27 April 2020 23:10
...
> > > Alternatively, I wonder if there is a way we could let the SCS and
> > > ordinary stack share the [bottom of] the vmap'ed region. That would
> > > give rather nasty results if the ordinary stack overflows into the
> > > SCS, but for cases where we really recurse out of control, we could
> > > catch this occurrence on either stack, whichever one occurs first.
> > > And the nastiness, when it does occur, will not corrupt any state
> > > beyond the stack of the current task.
> >
> > Hmm, I guess that would make it quite hard to keep the SCS address
> > secret though :-(
>
> Yes, and the stack potentially overflowing into the SCS sort of defeats
> the purpose. I'm fine with increasing the SCS size to something safer,
> but using a vmapped shadow stack seems like the correct solution to
> this problem, at least on devices where allocating a full page isn't an
> issue.

Wouldn't you do it the other way around, so that a shadow stack overflow
corrupts the bottom of the normal stack? That can be detected 'after the
fact' in a few places (e.g. process switch and return to user).

Actually you might want to do syscall entry at the base of the stack
area, then (effectively) allocate an on-stack buffer for the shadow
stack.

I'd have thought that kernel code could get the shadow stack address by
just reading x18? Userspace isn't supposed to be able to get the main
kernel stack address either.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
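On David's last point: with x18 reserved for the shadow stack pointer,
any kernel C code can indeed recover the current value, for example with
a hypothetical helper like the one below (not from the series), so the
address is only hidden from attackers who cannot already run arbitrary
kernel code.

/*
 * Hypothetical helper, not part of the series: read the current shadow
 * call stack pointer straight out of the reserved x18 register.
 */
static inline unsigned long *current_scs_sp(void)
{
	unsigned long *sp;

	asm volatile("mov %0, x18" : "=r" (sp));
	return sp;
}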
Hi Sami,

On Mon, Apr 27, 2020 at 09:00:06AM -0700, Sami Tolvanen wrote:
> This patch series adds support for Clang's Shadow Call Stack (SCS)
> mitigation, which uses a separately allocated shadow stack to protect
> against return address overwrites. More information can be found here:
>
> https://clang.llvm.org/docs/ShadowCallStack.html

I'm planning to queue this with the (mostly cosmetic) diff below folded
in. I also have some extra patches on top which I'll send out shortly
for review.

However, I really think we need to get to the bottom of the size issue,
since I'm highly sceptical about not being able to afford a full page
for the shadow stack allocation. We can change this later so it needn't
hold up the patchset, but given that Android is the only user, I'd like
to make sure that if we change to use a full page upstream then that is
also acceptable in AOSP.

Thanks,

Will

--->8

diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 18fc4d29ef27..790c0c6b8552 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -45,6 +45,4 @@
 
 #if __has_feature(shadow_call_stack)
 # define __noscs        __attribute__((__no_sanitize__("shadow-call-stack")))
-#else
-# define __noscs
 #endif
diff --git a/include/linux/scs.h b/include/linux/scs.h
index 060eeb3d1390..3f3662621a27 100644
--- a/include/linux/scs.h
+++ b/include/linux/scs.h
@@ -11,7 +11,7 @@
 #include <linux/gfp.h>
 #include <linux/poison.h>
 #include <linux/sched.h>
-#include <asm/page.h>
+#include <linux/sizes.h>
 
 #ifdef CONFIG_SHADOW_CALL_STACK
 
@@ -20,7 +20,7 @@
  * architecture) provided ~40% safety margin on stack usage while keeping
  * memory allocation overhead reasonable.
  */
-#define SCS_SIZE        1024UL
+#define SCS_SIZE        SZ_1K
 #define GFP_SCS         (GFP_KERNEL | __GFP_ZERO)
 
 /* An illegal pointer value to mark the end of the shadow stack. */
@@ -29,7 +29,9 @@
 #define task_scs(tsk)           (task_thread_info(tsk)->scs_base)
 #define task_scs_offset(tsk)    (task_thread_info(tsk)->scs_offset)
 
-extern void scs_init(void);
+void scs_init(void);
+int scs_prepare(struct task_struct *tsk, int node);
+void scs_release(struct task_struct *tsk);
 
 static inline void scs_task_reset(struct task_struct *tsk)
 {
@@ -40,8 +42,6 @@ static inline void scs_task_reset(struct task_struct *tsk)
         task_scs_offset(tsk) = 0;
 }
 
-extern int scs_prepare(struct task_struct *tsk, int node);
-
 static inline unsigned long *__scs_magic(void *s)
 {
         return (unsigned long *)(s + SCS_SIZE) - 1;
@@ -55,12 +55,8 @@ static inline bool scs_corrupted(struct task_struct *tsk)
                READ_ONCE_NOCHECK(*magic) != SCS_END_MAGIC);
 }
 
-extern void scs_release(struct task_struct *tsk);
-
 #else /* CONFIG_SHADOW_CALL_STACK */
 
-#define task_scs(tsk)   NULL
-
 static inline void scs_init(void) {}
 static inline void scs_task_reset(struct task_struct *tsk) {}
 static inline int scs_prepare(struct task_struct *tsk, int node) { return 0; }
diff --git a/kernel/scs.c b/kernel/scs.c
index 2a96573f2b1b..9389c28f0853 100644
--- a/kernel/scs.c
+++ b/kernel/scs.c
@@ -55,45 +55,37 @@ static void scs_account(struct task_struct *tsk, int account)
 
 int scs_prepare(struct task_struct *tsk, int node)
 {
-        void *s;
+        void *s = scs_alloc(node);
 
-        s = scs_alloc(node);
         if (!s)
                 return -ENOMEM;
 
         task_scs(tsk) = s;
         task_scs_offset(tsk) = 0;
         scs_account(tsk, 1);
-
         return 0;
 }
 
-#ifdef CONFIG_DEBUG_STACK_USAGE
-static unsigned long __scs_used(struct task_struct *tsk)
+static void scs_check_usage(struct task_struct *tsk)
 {
-        unsigned long *p = task_scs(tsk);
-        unsigned long *end = __scs_magic(p);
-        unsigned long s = (unsigned long)p;
+        static unsigned long highest;
 
-        while (p < end && READ_ONCE_NOCHECK(*p))
-                p++;
+        unsigned long *p, prev, curr = highest, used = 0;
 
-        return (unsigned long)p - s;
-}
+        if (!IS_ENABLED(CONFIG_DEBUG_STACK_USAGE))
+                return;
 
-static void scs_check_usage(struct task_struct *tsk)
-{
-        static unsigned long highest;
-        unsigned long used = __scs_used(tsk);
-        unsigned long prev;
-        unsigned long curr = highest;
+        for (p = task_scs(tsk); p < __scs_magic(tsk); ++p) {
+                if (!READ_ONCE_NOCHECK(*p))
+                        break;
+                used++;
+        }
 
         while (used > curr) {
                 prev = cmpxchg_relaxed(&highest, curr, used);
 
                 if (prev == curr) {
-                        pr_info("%s (%d): highest shadow stack usage: "
-                                "%lu bytes\n",
+                        pr_info("%s (%d): highest shadow stack usage: %lu bytes\n",
                                 tsk->comm, task_pid_nr(tsk), used);
                         break;
                 }
@@ -101,21 +93,16 @@ static void scs_check_usage(struct task_struct *tsk)
                 curr = prev;
         }
 }
-#else
-static inline void scs_check_usage(struct task_struct *tsk) {}
-#endif
 
 void scs_release(struct task_struct *tsk)
 {
-        void *s;
+        void *s = task_scs(tsk);
 
-        s = task_scs(tsk);
         if (!s)
                 return;
 
-        WARN_ON(scs_corrupted(tsk));
+        WARN(scs_corrupted(tsk), "corrupted shadow stack detected when freeing task\n");
         scs_check_usage(tsk);
-
         scs_account(tsk, -1);
         scs_free(s);
 }
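One note on the scs_check_usage() change in the diff above: replacing
the #ifdef CONFIG_DEBUG_STACK_USAGE block with an IS_ENABLED() check
removes the need for a stub implementation while still letting the
optimizer discard the body when the option is off. A minimal
illustration of the pattern, not taken from the patch, is:

#include <linux/kconfig.h>
#include <linux/printk.h>

/*
 * IS_ENABLED(CONFIG_FOO) expands to a compile-time 1 or 0, so the early
 * return below lets dead-code elimination drop the rest of the function
 * when the option is disabled, while the code is still type-checked in
 * every configuration.
 */
static void report_if_debugging(void)
{
	if (!IS_ENABLED(CONFIG_DEBUG_STACK_USAGE))
		return;

	pr_info("stack usage debugging is enabled\n");
}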