Message ID: 20191018161033.261971-19-samitolvanen@google.com (mailing list archive)
State: New, archived
Series: add support for Clang's Shadow Call Stack
On Fri, Oct 18, 2019 at 6:16 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> This change implements shadow stack switching, initial SCS set-up,
> and interrupt shadow stacks for arm64.
[...]
> +static inline void scs_save(struct task_struct *tsk)
> +{
> +	void *s;
> +
> +	asm volatile("mov %0, x18" : "=r" (s));
> +	task_set_scs(tsk, s);
> +}
> +
> +static inline void scs_load(struct task_struct *tsk)
> +{
> +	asm volatile("mov x18, %0" : : "r" (task_scs(tsk)));
> +	task_set_scs(tsk, NULL);
> +}

These things should probably be __always_inline or something like
that? If the compiler decides not to inline them (e.g. when called
from scs_thread_switch()), stuff will blow up, right?

> +static inline void scs_thread_switch(struct task_struct *prev,
> +				     struct task_struct *next)
> +{
> +	scs_save(prev);
> +	scs_load(next);
> +
> +	if (unlikely(scs_corrupted(prev)))
> +		panic("corrupted shadow stack detected inside scheduler\n");
> +}
[...]
> +#ifdef CONFIG_SHADOW_CALL_STACK
> +DECLARE_PER_CPU(unsigned long *, irq_shadow_call_stack_ptr);
> +#endif

If an attacker has a leak of some random function pointer to find the
ASLR base address, they'll know where irq_shadow_call_stack_ptr is.
With an arbitrary read (to read irq_shadow_call_stack_ptr[sched_getcpu()])
followed by an arbitrary write, they'd be able to overwrite the shadow
stack. Or with just an arbitrary write without a read, they could
change irq_shadow_call_stack_ptr[sched_getcpu()] to point elsewhere.

This is different from the intended protection level according to
<https://clang.llvm.org/docs/ShadowCallStack.html#security>, which
talks about "a runtime that avoids exposing the address of the shadow
call stack to attackers that can read arbitrary memory". Of course,
that's extremely hard to implement in the context of the kernel, where
you can see all the memory management data structures and all physical
memory.

You might want to write something in the cover letter about what the
benefits of this mechanism compared to STACKPROTECTOR are in the
context of the kernel, including a specific description of which types
of attacker capabilities this is supposed to defend against.
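[For reference, a minimal sketch of the fix Jann is suggesting: the same scs_save()/scs_load() helpers from the patch, with the annotation strengthened to __always_inline so the compiler can never emit them as out-of-line functions — an out-of-line copy would push and pop its own return address across the x18 switch and return through the wrong shadow stack. The bodies are unchanged from the patch above; only the inline annotation differs.]

static __always_inline void scs_save(struct task_struct *tsk)
{
	void *s;

	/* x18 holds the current task's shadow call stack pointer */
	asm volatile("mov %0, x18" : "=r" (s));
	task_set_scs(tsk, s);
}

static __always_inline void scs_load(struct task_struct *tsk)
{
	/* install the next task's shadow stack pointer into x18 */
	asm volatile("mov x18, %0" : : "r" (task_scs(tsk)));
	/* clear the stored copy while the live value is in x18 */
	task_set_scs(tsk, NULL);
}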
On Fri, Oct 18, 2019 at 10:13 AM Jann Horn <jannh@google.com> wrote:
> These things should probably be __always_inline or something like
> that? If the compiler decides not to inline them (e.g. when called
> from scs_thread_switch()), stuff will blow up, right?

Correct. I'll change these to __always_inline in v2. I think there
might be other places in the kernel where not inlining a static inline
function would break things, but there's no need to add more.

> This is different from the intended protection level according to
> <https://clang.llvm.org/docs/ShadowCallStack.html#security>, which
> talks about "a runtime that avoids exposing the address of the shadow
> call stack to attackers that can read arbitrary memory". Of course,
> that's extremely hard to implement in the context of the kernel, where
> you can see all the memory management data structures and all physical
> memory.

Yes, the security guarantees in the kernel are different as hiding
shadow stack pointers is more challenging.

> You might want to write something in the cover letter about what the
> benefits of this mechanism compared to STACKPROTECTOR are in the
> context of the kernel, including a specific description of which types
> of attacker capabilities this is supposed to defend against.

Sure, I'll add something about that in v2. Thanks.

Sami
On Fri, Oct 18, 2019 at 07:12:52PM +0200, Jann Horn wrote:
> On Fri, Oct 18, 2019 at 6:16 PM Sami Tolvanen <samitolvanen@google.com> wrote:
> > This change implements shadow stack switching, initial SCS set-up,
> > and interrupt shadow stacks for arm64.
> [...]
> > +static inline void scs_save(struct task_struct *tsk)
> > +{
> > +	void *s;
> > +
> > +	asm volatile("mov %0, x18" : "=r" (s));
> > +	task_set_scs(tsk, s);
> > +}
> > +
> > +static inline void scs_load(struct task_struct *tsk)
> > +{
> > +	asm volatile("mov x18, %0" : : "r" (task_scs(tsk)));
> > +	task_set_scs(tsk, NULL);
> > +}
>
> These things should probably be __always_inline or something like
> that? If the compiler decides not to inline them (e.g. when called
> from scs_thread_switch()), stuff will blow up, right?

I think scs_save() would better live in assembly in cpu_switch_to(),
where we switch the stack and current. It shouldn't matter whether
scs_load() is inlined or not, since the x18 value _should_ be invariant
from the PoV of the task.

We just need to add a TSK_TI_SCS to asm-offsets.c, and then insert a
single LDR at the end:

	mov	sp, x9
	msr	sp_el0, x1
#ifdef CONFIG_SHADOW_CALL_STACK
	ldr	x18, [x1, TSK_TI_SCS]
#endif
	ret

Thanks,
Mark.
On Fri, Oct 18, 2019 at 10:23 AM Mark Rutland <mark.rutland@arm.com> wrote:
> I think scs_save() would better live in assembly in cpu_switch_to(),
> where we switch the stack and current. It shouldn't matter whether
> scs_load() is inlined or not, since the x18 value _should_ be invariant
> from the PoV of the task.

Note that there's also a call to scs_save in cpu_die, because the
current task's shadow stack pointer is only stored in x18 and we don't
want to lose it.

> We just need to add a TSK_TI_SCS to asm-offsets.c, and then insert a
> single LDR at the end:
>
>	mov	sp, x9
>	msr	sp_el0, x1
> #ifdef CONFIG_SHADOW_CALL_STACK
>	ldr	x18, [x1, TSK_TI_SCS]
> #endif
>	ret

TSK_TI_SCS is already defined, so yes, we could move this to
cpu_switch_to. I would still prefer to have the overflow check that's
in scs_thread_switch though.

Sami
On Fri, Oct 18, 2019 at 10:35:49AM -0700, Sami Tolvanen wrote:
> On Fri, Oct 18, 2019 at 10:23 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > I think scs_save() would better live in assembly in cpu_switch_to(),
> > where we switch the stack and current. It shouldn't matter whether
> > scs_load() is inlined or not, since the x18 value _should_ be invariant
> > from the PoV of the task.
>
> Note that there's also a call to scs_save in cpu_die, because the
> current task's shadow stack pointer is only stored in x18 and we don't
> want to lose it.
>
> > We just need to add a TSK_TI_SCS to asm-offsets.c, and then insert a
> > single LDR at the end:
> >
> >	mov	sp, x9
> >	msr	sp_el0, x1
> > #ifdef CONFIG_SHADOW_CALL_STACK
> >	ldr	x18, [x1, TSK_TI_SCS]
> > #endif
> >	ret
>
> TSK_TI_SCS is already defined, so yes, we could move this to
> cpu_switch_to. I would still prefer to have the overflow check that's
> in scs_thread_switch though.

The only bit that I think needs to be in cpu_switch_to() is the install
of the next task's shadow addr into x18. Having a separate
scs_check_overflow() sounds fine to me, as that only has to read from
the shadow stack. IIUC that's also for the prev task, not next, in the
current patches.

Thanks,
Mark.
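[For reference, a sketch of the split being discussed: the x18 save/load would move into cpu_switch_to() in assembly, while the overflow check stays in C as the separate scs_check_overflow() helper Mark mentions, called with the prev task from __switch_to(). This only illustrates the suggestion and is not code from the posted series; scs_corrupted() and the panic message are taken from the patch below.]

static inline void scs_check_overflow(struct task_struct *prev)
{
	/* only reads the prev task's shadow stack; no x18 access needed */
	if (unlikely(scs_corrupted(prev)))
		panic("corrupted shadow stack detected inside scheduler\n");
}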
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 3f047afb982c..9bf179db5da9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -66,6 +66,7 @@ config ARM64
 	select ARCH_USE_QUEUED_RWLOCKS
 	select ARCH_USE_QUEUED_SPINLOCKS
 	select ARCH_SUPPORTS_MEMORY_FAILURE
+	select ARCH_SUPPORTS_SHADOW_CALL_STACK
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_SUPPORTS_INT128 if GCC_VERSION >= 50000 || CC_IS_CLANG
 	select ARCH_SUPPORTS_NUMA_BALANCING
diff --git a/arch/arm64/include/asm/scs.h b/arch/arm64/include/asm/scs.h
new file mode 100644
index 000000000000..14ba192dc6f0
--- /dev/null
+++ b/arch/arm64/include/asm/scs.h
@@ -0,0 +1,60 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SCS_H
+#define _ASM_SCS_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/scs.h>
+
+#ifdef CONFIG_SHADOW_CALL_STACK
+
+extern void scs_init_irq(void);
+
+static inline void scs_save(struct task_struct *tsk)
+{
+	void *s;
+
+	asm volatile("mov %0, x18" : "=r" (s));
+	task_set_scs(tsk, s);
+}
+
+static inline void scs_load(struct task_struct *tsk)
+{
+	asm volatile("mov x18, %0" : : "r" (task_scs(tsk)));
+	task_set_scs(tsk, NULL);
+}
+
+static inline void scs_thread_switch(struct task_struct *prev,
+				     struct task_struct *next)
+{
+	scs_save(prev);
+	scs_load(next);
+
+	if (unlikely(scs_corrupted(prev)))
+		panic("corrupted shadow stack detected inside scheduler\n");
+}
+
+#else /* CONFIG_SHADOW_CALL_STACK */
+
+static inline void scs_init_irq(void)
+{
+}
+
+static inline void scs_save(struct task_struct *tsk)
+{
+}
+
+static inline void scs_load(struct task_struct *tsk)
+{
+}
+
+static inline void scs_thread_switch(struct task_struct *prev,
+				     struct task_struct *next)
+{
+}
+
+#endif /* CONFIG_SHADOW_CALL_STACK */
+
+#endif /* __ASSEMBLY __ */
+
+#endif /* _ASM_SCS_H */
diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 4d9b1f48dc39..b6cf32fb4efe 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -68,6 +68,10 @@ extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk);
 
 DECLARE_PER_CPU(unsigned long *, irq_stack_ptr);
 
+#ifdef CONFIG_SHADOW_CALL_STACK
+DECLARE_PER_CPU(unsigned long *, irq_shadow_call_stack_ptr);
+#endif
+
 static inline bool on_irq_stack(unsigned long sp,
 				struct stack_info *info)
 {
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index f0cec4160136..8c73764b9ed2 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -41,6 +41,9 @@ struct thread_info {
 #endif
 		} preempt;
 	};
+#ifdef CONFIG_SHADOW_CALL_STACK
+	void *shadow_call_stack;
+#endif
 };
 
 #define thread_saved_pc(tsk)	\
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 478491f07b4f..b3995329d9e5 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -63,6 +63,7 @@ obj-$(CONFIG_CRASH_CORE)		+= crash_core.o
 obj-$(CONFIG_ARM_SDE_INTERFACE)		+= sdei.o
 obj-$(CONFIG_ARM64_SSBD)		+= ssbd.o
 obj-$(CONFIG_ARM64_PTR_AUTH)		+= pointer_auth.o
+obj-$(CONFIG_SHADOW_CALL_STACK)		+= scs.o
 
 obj-y					+= vdso/ probes/
 obj-$(CONFIG_COMPAT_VDSO)		+= vdso32/
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 214685760e1c..f6762b9ae1e1 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -33,6 +33,9 @@ int main(void)
   DEFINE(TSK_TI_ADDR_LIMIT,	offsetof(struct task_struct, thread_info.addr_limit));
 #ifdef CONFIG_ARM64_SW_TTBR0_PAN
   DEFINE(TSK_TI_TTBR0,		offsetof(struct task_struct, thread_info.ttbr0));
+#endif
+#ifdef CONFIG_SHADOW_CALL_STACK
+  DEFINE(TSK_TI_SCS,		offsetof(struct task_struct, thread_info.shadow_call_stack));
 #endif
   DEFINE(TSK_STACK,		offsetof(struct task_struct, stack));
 #ifdef CONFIG_STACKPROTECTOR
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index cf3bd2976e57..ca49938b99d0 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -172,6 +172,10 @@ alternative_cb_end
 
 	apply_ssbd 1, x22, x23
 
+#ifdef CONFIG_SHADOW_CALL_STACK
+	ldr	x18, [tsk, #TSK_TI_SCS]		// Restore shadow call stack
+	str	xzr, [tsk, #TSK_TI_SCS]
+#endif
 	.else
 	add	x21, sp, #S_FRAME_SIZE
 	get_current_task tsk
@@ -278,6 +282,12 @@ alternative_else_nop_endif
 	ct_user_enter
 	.endif
 
+#ifdef CONFIG_SHADOW_CALL_STACK
+	.if	\el == 0
+	str	x18, [tsk, #TSK_TI_SCS]		// Save shadow call stack
+	.endif
+#endif
+
 #ifdef CONFIG_ARM64_SW_TTBR0_PAN
 	/*
 	 * Restore access to TTBR0_EL1. If returning to EL0, no need for SPSR
@@ -383,6 +393,9 @@ alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0
 
 	.macro	irq_stack_entry
 	mov	x19, sp			// preserve the original sp
+#ifdef CONFIG_SHADOW_CALL_STACK
+	mov	x20, x18		// preserve the original shadow stack
+#endif
 
 	/*
 	 * Compare sp with the base of the task stack.
@@ -400,6 +413,12 @@ alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0
 
 	/* switch to the irq stack */
 	mov	sp, x26
+
+#ifdef CONFIG_SHADOW_CALL_STACK
+	/* also switch to the irq shadow stack */
+	ldr_this_cpu x18, irq_shadow_call_stack_ptr, x26
+#endif
+
 9998:
 	.endm
 
@@ -409,6 +428,10 @@ alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0
 	 */
 	.macro	irq_stack_exit
 	mov	sp, x19
+#ifdef CONFIG_SHADOW_CALL_STACK
+	/* x20 is also preserved */
+	mov	x18, x20
+#endif
 	.endm
 
 /* GPRs used by entry code */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 989b1944cb71..2be977c6496f 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -27,6 +27,7 @@
 #include <asm/pgtable-hwdef.h>
 #include <asm/pgtable.h>
 #include <asm/page.h>
+#include <asm/scs.h>
 #include <asm/smp.h>
 #include <asm/sysreg.h>
 #include <asm/thread_info.h>
@@ -424,6 +425,10 @@ __primary_switched:
 	stp	xzr, x30, [sp, #-16]!
 	mov	x29, sp
 
+#ifdef CONFIG_SHADOW_CALL_STACK
+	adr_l	x18, init_shadow_call_stack	// Set shadow call stack
+#endif
+
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
 
 	ldr_l	x4, kimage_vaddr		// Save the offset between
@@ -731,6 +736,10 @@ __secondary_switched:
 	ldr	x2, [x0, #CPU_BOOT_TASK]
 	cbz	x2, __secondary_too_slow
 	msr	sp_el0, x2
+#ifdef CONFIG_SHADOW_CALL_STACK
+	ldr	x18, [x2, #TSK_TI_SCS]		// Set shadow call stack
+	str	xzr, [x2, #TSK_TI_SCS]
+#endif
 	mov	x29, #0
 	mov	x30, #0
 	b	secondary_start_kernel
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 04a327ccf84d..fe0ca522ff60 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -21,6 +21,7 @@
 #include <linux/vmalloc.h>
 #include <asm/daifflags.h>
 #include <asm/vmap_stack.h>
+#include <asm/scs.h>
 
 unsigned long irq_err_count;
 
@@ -63,6 +64,7 @@ static void init_irq_stacks(void)
 void __init init_IRQ(void)
 {
 	init_irq_stacks();
+	scs_init_irq();
 	irqchip_init();
 	if (!handle_arch_irq)
 		panic("No interrupt controller found.");
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 71f788cd2b18..4490632047d6 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -52,6 +52,7 @@
 #include <asm/mmu_context.h>
 #include <asm/processor.h>
 #include <asm/pointer_auth.h>
+#include <asm/scs.h>
 #include <asm/stacktrace.h>
 
 #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
@@ -508,6 +509,8 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
 	ptrauth_thread_switch(next);
 	ssbs_thread_switch(next);
 
+	scs_thread_switch(prev, next);
+
 	/*
 	 * Complete any pending TLB or cache maintenance on this CPU in case
 	 * the thread migrates to a different CPU.
diff --git a/arch/arm64/kernel/scs.c b/arch/arm64/kernel/scs.c
new file mode 100644
index 000000000000..6f255072c9a9
--- /dev/null
+++ b/arch/arm64/kernel/scs.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Shadow Call Stack support.
+ *
+ * Copyright (C) 2019 Google LLC
+ */
+
+#include <linux/percpu.h>
+#include <linux/vmalloc.h>
+#include <asm/scs.h>
+
+DEFINE_PER_CPU(unsigned long *, irq_shadow_call_stack_ptr);
+
+#ifndef CONFIG_SHADOW_CALL_STACK_VMAP
+DEFINE_PER_CPU(unsigned long [SCS_SIZE/sizeof(long)], irq_shadow_call_stack)
+	__aligned(SCS_SIZE);
+#endif
+
+void scs_init_irq(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+#ifdef CONFIG_SHADOW_CALL_STACK_VMAP
+		unsigned long *p;
+
+		p = __vmalloc_node_range(SCS_SIZE, SCS_SIZE,
+					 VMALLOC_START, VMALLOC_END,
+					 SCS_GFP, PAGE_KERNEL,
+					 0, cpu_to_node(cpu),
+					 __builtin_return_address(0));
+
+		per_cpu(irq_shadow_call_stack_ptr, cpu) = p;
+#else
+		per_cpu(irq_shadow_call_stack_ptr, cpu) =
+			per_cpu(irq_shadow_call_stack, cpu);
+#endif /* CONFIG_SHADOW_CALL_STACK_VMAP */
+	}
+}
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index dc9fe879c279..cc1938a585d2 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -44,6 +44,7 @@
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
 #include <asm/processor.h>
+#include <asm/scs.h>
 #include <asm/smp_plat.h>
 #include <asm/sections.h>
 #include <asm/tlbflush.h>
@@ -357,6 +358,9 @@ void cpu_die(void)
 {
 	unsigned int cpu = smp_processor_id();
 
+	/* Save the shadow stack pointer before exiting the idle task */
+	scs_save(current);
+
 	idle_task_exit();
 
 	local_daif_mask();
This change implements shadow stack switching, initial SCS set-up, and
interrupt shadow stacks for arm64.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/arm64/Kconfig                   |  1 +
 arch/arm64/include/asm/scs.h         | 60 ++++++++++++++++++++++++++++
 arch/arm64/include/asm/stacktrace.h  |  4 ++
 arch/arm64/include/asm/thread_info.h |  3 ++
 arch/arm64/kernel/Makefile           |  1 +
 arch/arm64/kernel/asm-offsets.c      |  3 ++
 arch/arm64/kernel/entry.S            | 23 +++++++++++
 arch/arm64/kernel/head.S             |  9 +++++
 arch/arm64/kernel/irq.c              |  2 +
 arch/arm64/kernel/process.c          |  3 ++
 arch/arm64/kernel/scs.c              | 39 ++++++++++++++++++
 arch/arm64/kernel/smp.c              |  4 ++
 12 files changed, 152 insertions(+)
 create mode 100644 arch/arm64/include/asm/scs.h
 create mode 100644 arch/arm64/kernel/scs.c