From patchwork Tue Jan 28 18:49:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sami Tolvanen X-Patchwork-Id: 11354989 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 33487139A for ; Tue, 28 Jan 2020 18:50:01 +0000 (UTC) Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by mail.kernel.org (Postfix) with SMTP id 3A73C24683 for ; Tue, 28 Jan 2020 18:50:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="ZFPCNGo4" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3A73C24683 Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kernel-hardening-return-17622-patchwork-kernel-hardening=patchwork.kernel.org@lists.openwall.com Received: (qmail 9688 invoked by uid 550); 28 Jan 2020 18:49:54 -0000 Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Delivered-To: mailing list kernel-hardening@lists.openwall.com Received: (qmail 9579 invoked from network); 28 Jan 2020 18:49:53 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=1yNw3GdbdPlo3jgIfcz5hmf/TUIflW/fBN3zPsI/gZA=; b=ZFPCNGo4Ngz2tIj4D2MEYX3CDfhK10cQ0AXPh0Dn5gEOAx2ksdPLLMVMrtdSkdtROd 7N/hK5l+eHpo5Rh76BUGtvgNAX89/Z/qigejYE55sXyc5R0fatem1LyFatPpyX9IpICJ h4MlALErb5ajzZEJsWwDyV/2d/h83L3YMkoZrrq3Dlqu9hKTrsDbyUcEZ/fEMG/pa0mS kczt9gdpfXF8nd+J9UVSXXcPifOpw6RF7ybyBkYTg5FQ3Algi4j0xW9a/cGAeErWl8zh PePIyHvx7BKqiSmkqs05lu14zfpiKRWplsQwvJAdEeNADyxoyKaEXG13HW1/UZZ5AX5D MyYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=1yNw3GdbdPlo3jgIfcz5hmf/TUIflW/fBN3zPsI/gZA=; b=pjkwauX8KKrNaGDg2ar0wr/3+iTJSLdIGLe9GiZRYhfHkzSlPH9ElBFnw1XlHXYOEl GI7Ah3Ia1IPDJJCatHSNbZE2SMht3cZWsizfWFHUT2L8wGlccI3CvRb5hPet4B+BDq5C eTFcOaIMzD5tulbT0eNwn/aZtsaDhtyRs3ySjNfL8FZJkgryujmL15Qqsj/mp+gtZeQQ qgbDBX90BIA9uR/7GYM0p0LIpC5YP8wXKS1wgtBmm7ecqK4mVGhIEDaC8MVJrrv/gWif zQ0A4fIuQwCXUmD0jvksi64uktMBHp0CtvOMPTAbUP2MJcrsUgr+Q6PE+mq6U8zjDjl7 8LFw== X-Gm-Message-State: APjAAAVW38Z8Hr+i6BSQLKwwnDXCzBy4P5sWgUCcTOse3C0jH5uCwe4z EqGuBRrDt+F1GEb84Zd54hS9IEJf8ItmLR0Tr2s= X-Google-Smtp-Source: APXvYqxbwQfwyhCFL/kzfD3rxrqUPWXg8wo6OfKZCYlYesOhHszbH44Kz4CoCMPJ79SUBDa0GT9Kx2oczjfBs3QRAoc= X-Received: by 2002:a63:4d1b:: with SMTP id a27mr25633490pgb.352.1580237381257; Tue, 28 Jan 2020 10:49:41 -0800 (PST) Date: Tue, 28 Jan 2020 10:49:24 -0800 In-Reply-To: <20200128184934.77625-1-samitolvanen@google.com> Message-Id: <20200128184934.77625-2-samitolvanen@google.com> Mime-Version: 1.0 References: <20191018161033.261971-1-samitolvanen@google.com> <20200128184934.77625-1-samitolvanen@google.com> X-Mailer: git-send-email 2.25.0.341.g760bfbb309-goog Subject: [PATCH v7 01/11] add support for Clang's Shadow Call Stack (SCS) From: Sami Tolvanen To: Will Deacon , Catalin Marinas , Steven Rostedt , Masami Hiramatsu , Ard Biesheuvel , Mark Rutland , james.morse@arm.com Cc: Dave Martin , Kees Cook , Laura Abbott , Marc Zyngier , Nick Desaulniers , Jann Horn , Miguel Ojeda , Masahiro Yamada , clang-built-linux@googlegroups.com, kernel-hardening@lists.openwall.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Sami Tolvanen This change adds generic support for Clang's Shadow Call Stack, which uses a shadow stack to protect return addresses from being overwritten by an attacker. Details are available here: https://clang.llvm.org/docs/ShadowCallStack.html Note that security guarantees in the kernel differ from the ones documented for user space. The kernel must store addresses of shadow stacks used by other tasks and interrupt handlers in memory, which means an attacker capable reading and writing arbitrary memory may be able to locate them and hijack control flow by modifying shadow stacks that are not currently in use. Signed-off-by: Sami Tolvanen Reviewed-by: Kees Cook Reviewed-by: Miguel Ojeda --- Makefile | 6 ++ arch/Kconfig | 34 ++++++ include/linux/compiler-clang.h | 6 ++ include/linux/compiler_types.h | 4 + include/linux/scs.h | 57 ++++++++++ init/init_task.c | 8 ++ kernel/Makefile | 1 + kernel/fork.c | 9 ++ kernel/sched/core.c | 2 + kernel/scs.c | 187 +++++++++++++++++++++++++++++++++ 10 files changed, 314 insertions(+) create mode 100644 include/linux/scs.h create mode 100644 kernel/scs.c diff --git a/Makefile b/Makefile index 6a01b073915e..b2a1e5b704f4 100644 --- a/Makefile +++ b/Makefile @@ -846,6 +846,12 @@ ifdef CONFIG_LIVEPATCH KBUILD_CFLAGS += $(call cc-option, -flive-patching=inline-clone) endif +ifdef CONFIG_SHADOW_CALL_STACK +CC_FLAGS_SCS := -fsanitize=shadow-call-stack +KBUILD_CFLAGS += $(CC_FLAGS_SCS) +export CC_FLAGS_SCS +endif + # arch Makefile may override CC so keep this after arch Makefile is included NOSTDINC_FLAGS += -nostdinc -isystem $(shell $(CC) -print-file-name=include) diff --git a/arch/Kconfig b/arch/Kconfig index 48b5e103bdb0..1b16aa9a3fe5 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -521,6 +521,40 @@ config STACKPROTECTOR_STRONG about 20% of all kernel functions, which increases the kernel code size by about 2%. +config ARCH_SUPPORTS_SHADOW_CALL_STACK + bool + help + An architecture should select this if it supports Clang's Shadow + Call Stack, has asm/scs.h, and implements runtime support for shadow + stack switching. + +config SHADOW_CALL_STACK + bool "Clang Shadow Call Stack" + depends on ARCH_SUPPORTS_SHADOW_CALL_STACK + help + This option enables Clang's Shadow Call Stack, which uses a + shadow stack to protect function return addresses from being + overwritten by an attacker. More information can be found from + Clang's documentation: + + https://clang.llvm.org/docs/ShadowCallStack.html + + Note that security guarantees in the kernel differ from the ones + documented for user space. The kernel must store addresses of shadow + stacks used by other tasks and interrupt handlers in memory, which + means an attacker capable reading and writing arbitrary memory may + be able to locate them and hijack control flow by modifying shadow + stacks that are not currently in use. + +config SHADOW_CALL_STACK_VMAP + bool "Use virtually mapped shadow call stacks" + depends on SHADOW_CALL_STACK + help + Use virtually mapped shadow call stacks. Selecting this option + provides better stack exhaustion protection, but increases per-thread + memory consumption as a full page is allocated for each shadow stack. + + config HAVE_ARCH_WITHIN_STACK_FRAMES bool help diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h index 333a6695a918..18fc4d29ef27 100644 --- a/include/linux/compiler-clang.h +++ b/include/linux/compiler-clang.h @@ -42,3 +42,9 @@ * compilers, like ICC. */ #define barrier() __asm__ __volatile__("" : : : "memory") + +#if __has_feature(shadow_call_stack) +# define __noscs __attribute__((__no_sanitize__("shadow-call-stack"))) +#else +# define __noscs +#endif diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h index 72393a8c1a6c..be5d5be4b1ae 100644 --- a/include/linux/compiler_types.h +++ b/include/linux/compiler_types.h @@ -202,6 +202,10 @@ struct ftrace_likely_data { # define randomized_struct_fields_end #endif +#ifndef __noscs +# define __noscs +#endif + #ifndef asm_volatile_goto #define asm_volatile_goto(x...) asm goto(x) #endif diff --git a/include/linux/scs.h b/include/linux/scs.h new file mode 100644 index 000000000000..c5572fd770b0 --- /dev/null +++ b/include/linux/scs.h @@ -0,0 +1,57 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Shadow Call Stack support. + * + * Copyright (C) 2019 Google LLC + */ + +#ifndef _LINUX_SCS_H +#define _LINUX_SCS_H + +#include +#include +#include + +#ifdef CONFIG_SHADOW_CALL_STACK + +/* + * In testing, 1 KiB shadow stack size (i.e. 128 stack frames on a 64-bit + * architecture) provided ~40% safety margin on stack usage while keeping + * memory allocation overhead reasonable. + */ +#define SCS_SIZE 1024UL +#define GFP_SCS (GFP_KERNEL | __GFP_ZERO) + +/* + * A random number outside the kernel's virtual address space to mark the + * end of the shadow stack. + */ +#define SCS_END_MAGIC 0xaf0194819b1635f6UL + +#define task_scs(tsk) (task_thread_info(tsk)->shadow_call_stack) + +static inline void task_set_scs(struct task_struct *tsk, void *s) +{ + task_scs(tsk) = s; +} + +extern void scs_init(void); +extern void scs_task_reset(struct task_struct *tsk); +extern int scs_prepare(struct task_struct *tsk, int node); +extern bool scs_corrupted(struct task_struct *tsk); +extern void scs_release(struct task_struct *tsk); + +#else /* CONFIG_SHADOW_CALL_STACK */ + +#define task_scs(tsk) NULL + +static inline void task_set_scs(struct task_struct *tsk, void *s) {} +static inline void scs_init(void) {} +static inline void scs_task_reset(struct task_struct *tsk) {} +static inline int scs_prepare(struct task_struct *tsk, int node) { return 0; } +static inline bool scs_corrupted(struct task_struct *tsk) { return false; } +static inline void scs_release(struct task_struct *tsk) {} + +#endif /* CONFIG_SHADOW_CALL_STACK */ + +#endif /* _LINUX_SCS_H */ diff --git a/init/init_task.c b/init/init_task.c index 9e5cbe5eab7b..cbd40460e903 100644 --- a/init/init_task.c +++ b/init/init_task.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include @@ -184,6 +185,13 @@ struct task_struct init_task }; EXPORT_SYMBOL(init_task); +#ifdef CONFIG_SHADOW_CALL_STACK +unsigned long init_shadow_call_stack[SCS_SIZE / sizeof(long)] __init_task_data + __aligned(SCS_SIZE) = { + [(SCS_SIZE / sizeof(long)) - 1] = SCS_END_MAGIC +}; +#endif + /* * Initial thread structure. Alignment of this is handled by a special * linker map entry. diff --git a/kernel/Makefile b/kernel/Makefile index f2cc0d118a0b..06231f936a7a 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -102,6 +102,7 @@ obj-$(CONFIG_TRACEPOINTS) += trace/ obj-$(CONFIG_IRQ_WORK) += irq_work.o obj-$(CONFIG_CPU_PM) += cpu_pm.o obj-$(CONFIG_BPF) += bpf/ +obj-$(CONFIG_SHADOW_CALL_STACK) += scs.o obj-$(CONFIG_PERF_EVENTS) += events/ diff --git a/kernel/fork.c b/kernel/fork.c index ef82feb4bddc..2809e9f9f46b 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -94,6 +94,7 @@ #include #include #include +#include #include #include @@ -454,6 +455,8 @@ void put_task_stack(struct task_struct *tsk) void free_task(struct task_struct *tsk) { + scs_release(tsk); + #ifndef CONFIG_THREAD_INFO_IN_TASK /* * The task is finally done with both the stack and thread_info, @@ -837,6 +840,8 @@ void __init fork_init(void) NULL, free_vm_stack_cache); #endif + scs_init(); + lockdep_init_task(&init_task); uprobes_init(); } @@ -896,6 +901,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node) if (err) goto free_stack; + err = scs_prepare(tsk, node); + if (err) + goto free_stack; + #ifdef CONFIG_SECCOMP /* * We must handle setting up seccomp filters once we're under diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 90e4b00ace89..a181c536e12e 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -11,6 +11,7 @@ #include #include +#include #include #include @@ -6038,6 +6039,7 @@ void init_idle(struct task_struct *idle, int cpu) idle->se.exec_start = sched_clock(); idle->flags |= PF_IDLE; + scs_task_reset(idle); kasan_unpoison_task_stack(idle); #ifdef CONFIG_SMP diff --git a/kernel/scs.c b/kernel/scs.c new file mode 100644 index 000000000000..28abed21950c --- /dev/null +++ b/kernel/scs.c @@ -0,0 +1,187 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Shadow Call Stack support. + * + * Copyright (C) 2019 Google LLC + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +static inline void *__scs_base(struct task_struct *tsk) +{ + /* + * To minimize risk the of exposure, architectures may clear a + * task's thread_info::shadow_call_stack while that task is + * running, and only save/restore the active shadow call stack + * pointer when the usual register may be clobbered (e.g. across + * context switches). + * + * The shadow call stack is aligned to SCS_SIZE, and grows + * upwards, so we can mask out the low bits to extract the base + * when the task is not running. + */ + return (void *)((unsigned long)task_scs(tsk) & ~(SCS_SIZE - 1)); +} + +static inline unsigned long *scs_magic(void *s) +{ + return (unsigned long *)(s + SCS_SIZE) - 1; +} + +static inline void scs_set_magic(void *s) +{ + *scs_magic(s) = SCS_END_MAGIC; +} + +#ifdef CONFIG_SHADOW_CALL_STACK_VMAP + +/* Matches NR_CACHED_STACKS for VMAP_STACK */ +#define NR_CACHED_SCS 2 +static DEFINE_PER_CPU(void *, scs_cache[NR_CACHED_SCS]); + +static void *scs_alloc(int node) +{ + int i; + void *s; + + for (i = 0; i < NR_CACHED_SCS; i++) { + s = this_cpu_xchg(scs_cache[i], NULL); + if (s) { + memset(s, 0, SCS_SIZE); + goto out; + } + } + + /* + * We allocate a full page for the shadow stack, which should be + * more than we need. Check the assumption nevertheless. + */ + BUILD_BUG_ON(SCS_SIZE > PAGE_SIZE); + + s = __vmalloc_node_range(PAGE_SIZE, SCS_SIZE, + VMALLOC_START, VMALLOC_END, + GFP_SCS, PAGE_KERNEL, 0, + node, __builtin_return_address(0)); + +out: + if (s) + scs_set_magic(s); + /* TODO: poison for KASAN, unpoison in scs_free */ + + return s; +} + +static void scs_free(void *s) +{ + int i; + + for (i = 0; i < NR_CACHED_SCS; i++) + if (this_cpu_cmpxchg(scs_cache[i], 0, s) == NULL) + return; + + vfree_atomic(s); +} + +static int scs_cleanup(unsigned int cpu) +{ + int i; + void **cache = per_cpu_ptr(scs_cache, cpu); + + for (i = 0; i < NR_CACHED_SCS; i++) { + vfree(cache[i]); + cache[i] = NULL; + } + + return 0; +} + +void __init scs_init(void) +{ + WARN_ON(cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "scs:scs_cache", NULL, + scs_cleanup) < 0); +} + +#else /* !CONFIG_SHADOW_CALL_STACK_VMAP */ + +static struct kmem_cache *scs_cache; + +static inline void *scs_alloc(int node) +{ + void *s; + + s = kmem_cache_alloc_node(scs_cache, GFP_SCS, node); + if (s) { + scs_set_magic(s); + /* + * Poison the allocation to catch unintentional accesses to + * the shadow stack when KASAN is enabled. + */ + kasan_poison_object_data(scs_cache, s); + } + + return s; +} + +static inline void scs_free(void *s) +{ + kasan_unpoison_object_data(scs_cache, s); + kmem_cache_free(scs_cache, s); +} + +void __init scs_init(void) +{ + scs_cache = kmem_cache_create("scs_cache", SCS_SIZE, SCS_SIZE, + 0, NULL); + WARN_ON(!scs_cache); +} + +#endif /* CONFIG_SHADOW_CALL_STACK_VMAP */ + +void scs_task_reset(struct task_struct *tsk) +{ + /* + * Reset the shadow stack to the base address in case the task + * is reused. + */ + task_set_scs(tsk, __scs_base(tsk)); +} + +int scs_prepare(struct task_struct *tsk, int node) +{ + void *s; + + s = scs_alloc(node); + if (!s) + return -ENOMEM; + + task_set_scs(tsk, s); + return 0; +} + +bool scs_corrupted(struct task_struct *tsk) +{ + unsigned long *magic = scs_magic(__scs_base(tsk)); + + return READ_ONCE_NOCHECK(*magic) != SCS_END_MAGIC; +} + +void scs_release(struct task_struct *tsk) +{ + void *s; + + s = __scs_base(tsk); + if (!s) + return; + + WARN_ON(scs_corrupted(tsk)); + + task_set_scs(tsk, NULL); + scs_free(s); +}