Message ID: 20210330205750.428816-4-keescook@chromium.org (mailing list archive)
State: New, archived
Series: Optionally randomize kernel stack offset each syscall
On Tue, Mar 30 2021 at 13:57, Kees Cook wrote:
> +/*
> + * Do not use this anywhere else in the kernel. This is used here because
> + * it provides an arch-agnostic way to grow the stack with correct
> + * alignment. Also, since this use is being explicitly masked to a max of
> + * 10 bits, stack-clash style attacks are unlikely. For more details see
> + * "VLAs" in Documentation/process/deprecated.rst
> + * The asm statement is designed to convince the compiler to keep the
> + * allocation around even after "ptr" goes out of scope.

Nit. That explanation of "ptr" might be better placed right at the
add_random...() macro.

> + */
> +void *__builtin_alloca(size_t size);
> +/*
> + * Use, at most, 10 bits of entropy. We explicitly cap this to keep the
> + * "VLA" from being unbounded (see above). 10 bits leaves enough room for
> + * per-arch offset masks to reduce entropy (by removing higher bits, since
> + * high entropy may overly constrain usable stack space), and for
> + * compiler/arch-specific stack alignment to remove the lower bits.
> + */
> +#define KSTACK_OFFSET_MAX(x)	((x) & 0x3FF)
> +
> +/*
> + * These macros must be used during syscall entry when interrupts and
> + * preempt are disabled, and after user registers have been stored to
> + * the stack.
> + */
> +#define add_random_kstack_offset() do {					\
> +	if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,	\
> +				&randomize_kstack_offset)) {		\
> +		u32 offset = __this_cpu_read(kstack_offset);		\
> +		u8 *ptr = __builtin_alloca(KSTACK_OFFSET_MAX(offset));	\
> +		asm volatile("" : "=m"(*ptr) :: "memory");		\
> +	}								\
> +} while (0)

Other than that.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
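As a quick illustration of the cap discussed above (not code from the
patch), masking with 0x3FF bounds the alloca to at most 1023 bytes no
matter what the raw per-cpu entropy value is; 0xDEADBEEF here is an
arbitrary stand-in:

#include <stdio.h>

/* Same mask as in the patch: keep only the low 10 bits. */
#define KSTACK_OFFSET_MAX(x)	((x) & 0x3FF)

int main(void)
{
	unsigned int raw = 0xDEADBEEF;	/* arbitrary stand-in entropy */

	/* 0xDEADBEEF & 0x3FF == 0x2EF == 751: always within 0..1023. */
	printf("offset = %u\n", KSTACK_OFFSET_MAX(raw));
	return 0;
}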
On Wed, Mar 31, 2021 at 09:53:26AM +0200, Thomas Gleixner wrote:
> On Tue, Mar 30 2021 at 13:57, Kees Cook wrote:
> > +/*
> > + * Do not use this anywhere else in the kernel. This is used here because
> > + * it provides an arch-agnostic way to grow the stack with correct
> > + * alignment. Also, since this use is being explicitly masked to a max of
> > + * 10 bits, stack-clash style attacks are unlikely. For more details see
> > + * "VLAs" in Documentation/process/deprecated.rst
> > + * The asm statement is designed to convince the compiler to keep the
> > + * allocation around even after "ptr" goes out of scope.
>
> Nit. That explanation of "ptr" might be better placed right at the
> add_random...() macro.

Ah, yes! Fixed in v9.

> Other than that.
>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

Thank you for the reviews! Do you want to take this via -tip (and leave
off the arm64 patch until it is acked), or would you rather it go via
arm64? (I've sent v9 now...)
On Wed, Mar 31 2021 at 14:54, Kees Cook wrote:
> On Wed, Mar 31, 2021 at 09:53:26AM +0200, Thomas Gleixner wrote:
>> On Tue, Mar 30 2021 at 13:57, Kees Cook wrote:
>> > +/*
>> > + * Do not use this anywhere else in the kernel. This is used here because
>> > + * it provides an arch-agnostic way to grow the stack with correct
>> > + * alignment. Also, since this use is being explicitly masked to a max of
>> > + * 10 bits, stack-clash style attacks are unlikely. For more details see
>> > + * "VLAs" in Documentation/process/deprecated.rst
>> > + * The asm statement is designed to convince the compiler to keep the
>> > + * allocation around even after "ptr" goes out of scope.
>>
>> Nit. That explanation of "ptr" might be better placed right at the
>> add_random...() macro.
>
> Ah, yes! Fixed in v9.

Hmm, looking at V9 the "ptr" thing got lost ....

> +/*
> + * Do not use this anywhere else in the kernel. This is used here because
> + * it provides an arch-agnostic way to grow the stack with correct
> + * alignment. Also, since this use is being explicitly masked to a max of
> + * 10 bits, stack-clash style attacks are unlikely. For more details see
> + * "VLAs" in Documentation/process/deprecated.rst
> + */
> +void *__builtin_alloca(size_t size);
> +/*
> + * Use, at most, 10 bits of entropy. We explicitly cap this to keep the
> + * "VLA" from being unbounded (see above). 10 bits leaves enough room for
> + * per-arch offset masks to reduce entropy (by removing higher bits, since
> + * high entropy may overly constrain usable stack space), and for
> + * compiler/arch-specific stack alignment to remove the lower bits.
> + */
> +#define KSTACK_OFFSET_MAX(x)	((x) & 0x3FF)
> +
> +/*
> + * These macros must be used during syscall entry when interrupts and
> + * preempt are disabled, and after user registers have been stored to
> + * the stack.
> + */
> +#define add_random_kstack_offset() do {					\

> Do you want to take this via -tip (and leave off the arm64 patch until
> it is acked), or would you rather it go via arm64? (I've sent v9 now...)

Either way is fine.

Thanks,

        tglx
On Thu, Apr 01, 2021 at 12:38:31AM +0200, Thomas Gleixner wrote:
> On Wed, Mar 31 2021 at 14:54, Kees Cook wrote:
> > On Wed, Mar 31, 2021 at 09:53:26AM +0200, Thomas Gleixner wrote:
> >> On Tue, Mar 30 2021 at 13:57, Kees Cook wrote:
> >> > +/*
> >> > + * Do not use this anywhere else in the kernel. This is used here because
> >> > + * it provides an arch-agnostic way to grow the stack with correct
> >> > + * alignment. Also, since this use is being explicitly masked to a max of
> >> > + * 10 bits, stack-clash style attacks are unlikely. For more details see
> >> > + * "VLAs" in Documentation/process/deprecated.rst
> >> > + * The asm statement is designed to convince the compiler to keep the
> >> > + * allocation around even after "ptr" goes out of scope.
> >>
> >> Nit. That explanation of "ptr" might be better placed right at the
> >> add_random...() macro.
> >
> > Ah, yes! Fixed in v9.
>
> Hmm, looking at V9 the "ptr" thing got lost ....

I put the comment inline in the macro directly above the asm().

> > Do you want to take this via -tip (and leave off the arm64 patch until
> > it is acked), or would you rather it go via arm64? (I've sent v9 now...)
>
> Either way is fine.

Since the arm64 folks have been a bit busy, can you just put this in
-tip and leave off the arm64 patch for now?

Thanks!
On Tue, Mar 30, 2021 at 01:57:47PM -0700, Kees Cook wrote:
> diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
> new file mode 100644
> index 000000000000..351520803006
> --- /dev/null
> +++ b/include/linux/randomize_kstack.h
> @@ -0,0 +1,55 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +#ifndef _LINUX_RANDOMIZE_KSTACK_H
> +#define _LINUX_RANDOMIZE_KSTACK_H
> +
> +#include <linux/kernel.h>
> +#include <linux/jump_label.h>
> +#include <linux/percpu-defs.h>
> +
> +DECLARE_STATIC_KEY_MAYBE(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
> +			 randomize_kstack_offset);
> +DECLARE_PER_CPU(u32, kstack_offset);
> +
> +/*
> + * Do not use this anywhere else in the kernel. This is used here because
> + * it provides an arch-agnostic way to grow the stack with correct
> + * alignment. Also, since this use is being explicitly masked to a max of
> + * 10 bits, stack-clash style attacks are unlikely. For more details see
> + * "VLAs" in Documentation/process/deprecated.rst
> + * The asm statement is designed to convince the compiler to keep the
> + * allocation around even after "ptr" goes out of scope.
> + */
> +void *__builtin_alloca(size_t size);
> +/*
> + * Use, at most, 10 bits of entropy. We explicitly cap this to keep the
> + * "VLA" from being unbounded (see above). 10 bits leaves enough room for
> + * per-arch offset masks to reduce entropy (by removing higher bits, since
> + * high entropy may overly constrain usable stack space), and for
> + * compiler/arch-specific stack alignment to remove the lower bits.
> + */
> +#define KSTACK_OFFSET_MAX(x)	((x) & 0x3FF)
> +
> +/*
> + * These macros must be used during syscall entry when interrupts and
> + * preempt are disabled, and after user registers have been stored to
> + * the stack.
> + */
> +#define add_random_kstack_offset() do {					\
> +	if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,	\
> +				&randomize_kstack_offset)) {		\
> +		u32 offset = __this_cpu_read(kstack_offset);		\
> +		u8 *ptr = __builtin_alloca(KSTACK_OFFSET_MAX(offset));	\
> +		asm volatile("" : "=m"(*ptr) :: "memory");		\

Using the "m" constraint here is dangerous if you don't actually evaluate it
inside the asm. For example, if the compiler decides to generate an
addressing mode relative to the stack but with writeback (autodecrement), then
the stack pointer will be off by 8 bytes. Can you use "o" instead?

Will
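For context, a minimal standalone sketch (not from the series) of the two
constraint choices being discussed; the comments follow Will's reasoning
above and are a reading of the GCC constraint semantics, not text from the
patch:

#include <stddef.h>

void constraint_demo(size_t size, unsigned int mask)
{
	unsigned char *ptr = __builtin_alloca(size & mask);

	/*
	 * "=m": the compiler may pick any memory addressing mode for
	 * *ptr, including (on some architectures) writeback forms whose
	 * side effect never happens because the empty asm template
	 * performs no access -- the hazard raised above.
	 */
	asm volatile("" : "=m"(*ptr) :: "memory");

	/*
	 * "=o": restricted to plain offsettable addresses, which carry
	 * no side effects, so an empty template is safe.
	 */
	asm volatile("" : "=o"(*ptr) :: "memory");
}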
From: Will Deacon
> Sent: 01 April 2021 09:31
...
> > +/*
> > + * These macros must be used during syscall entry when interrupts and
> > + * preempt are disabled, and after user registers have been stored to
> > + * the stack.
> > + */
> > +#define add_random_kstack_offset() do {					\
> > +	if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,	\
> > +				&randomize_kstack_offset)) {		\
> > +		u32 offset = __this_cpu_read(kstack_offset);		\
> > +		u8 *ptr = __builtin_alloca(KSTACK_OFFSET_MAX(offset));	\
> > +		asm volatile("" : "=m"(*ptr) :: "memory");		\
>
> Using the "m" constraint here is dangerous if you don't actually evaluate it
> inside the asm. For example, if the compiler decides to generate an
> addressing mode relative to the stack but with writeback (autodecrement), then
> the stack pointer will be off by 8 bytes. Can you use "o" instead?

Is it allowed to use such a mode?
It would have to know that the "m" was substituted exactly once.
I think there are quite a few examples with 'strange' uses of memory
asm arguments.

However, in this case, isn't it enough to ensure the address is 'saved'?
So:
	asm volatile("" : "=r"(ptr) );
should be enough.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Thu, Apr 01, 2021 at 11:15:43AM +0000, David Laight wrote:
> From: Will Deacon
> > Sent: 01 April 2021 09:31
> ...
> > > +/*
> > > + * These macros must be used during syscall entry when interrupts and
> > > + * preempt are disabled, and after user registers have been stored to
> > > + * the stack.
> > > + */
> > > +#define add_random_kstack_offset() do {					\
> > > +	if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,	\
> > > +				&randomize_kstack_offset)) {		\
> > > +		u32 offset = __this_cpu_read(kstack_offset);		\
> > > +		u8 *ptr = __builtin_alloca(KSTACK_OFFSET_MAX(offset));	\
> > > +		asm volatile("" : "=m"(*ptr) :: "memory");		\
> >
> > Using the "m" constraint here is dangerous if you don't actually evaluate it
> > inside the asm. For example, if the compiler decides to generate an
> > addressing mode relative to the stack but with writeback (autodecrement), then
> > the stack pointer will be off by 8 bytes. Can you use "o" instead?

I see other examples of empty asm, but it's true, none are using "=m"
read constraints. But, yes, using "o" appears to work happily.

> Is it allowed to use such a mode?
> It would have to know that the "m" was substituted exactly once.
> I think there are quite a few examples with 'strange' uses of memory
> asm arguments.
>
> However, in this case, isn't it enough to ensure the address is 'saved'?
> So:
> 	asm volatile("" : "=r"(ptr) );
> should be enough.

It isn't, it seems. Here's a comparison:
https://godbolt.org/z/xYGn9GfGY

So, I'll resend with "o", and with raw_cpu_*(). Thanks!
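A hedged sketch of the difference the godbolt link is comparing; the
function names are illustrative and the comments paraphrase the outcome
reported above rather than the compiler's documented guarantees:

#include <stddef.h>

/*
 * With only the pointer as an asm output, nothing tells the compiler
 * that the alloca'd memory itself is live, so the allocation may still
 * be optimized away (the "isn't enough" case above).
 */
void via_register(size_t sz)
{
	unsigned char *ptr = __builtin_alloca(sz & 0x3FF);

	asm volatile("" : "=r"(ptr));
}

/*
 * Making the memory an offsettable output forces the allocation to
 * stay around, without permitting writeback addressing modes.
 */
void via_offsettable(size_t sz)
{
	unsigned char *ptr = __builtin_alloca(sz & 0x3FF);

	asm volatile("" : "=o"(*ptr) :: "memory");
}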
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 04545725f187..bee8644a192e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4061,6 +4061,17 @@
 			fully seed the kernel's CRNG. Default is controlled
 			by CONFIG_RANDOM_TRUST_CPU.
 
+	randomize_kstack_offset=
+			[KNL] Enable or disable kernel stack offset
+			randomization, which provides roughly 5 bits of
+			entropy, frustrating memory corruption attacks
+			that depend on stack address determinism or
+			cross-syscall address exposures. This is only
+			available on architectures that have defined
+			CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET.
+			Format: <bool> (1/Y/y=enable, 0/N/n=disable)
+			Default is CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT.
+
 	ras=option[,option,...]	[KNL] RAS-specific options
 
 		cec_disable	[X86]
diff --git a/Makefile b/Makefile
index 31dcdb3d61fa..8a959a264588 100644
--- a/Makefile
+++ b/Makefile
@@ -811,6 +811,10 @@ KBUILD_CFLAGS += -ftrivial-auto-var-init=zero
 KBUILD_CFLAGS += -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
 endif
 
+# While VLAs have been removed, GCC produces unreachable stack probes
+# for the randomize_kstack_offset feature. Disable it for all compilers.
+KBUILD_CFLAGS += $(call cc-option, -fno-stack-clash-protection)
+
 DEBUG_CFLAGS :=
 
 # Workaround for GCC versions < 5.0
diff --git a/arch/Kconfig b/arch/Kconfig
index 2bb30673d8e6..4fe6b047fcbc 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1055,6 +1055,29 @@ config VMAP_STACK
 	  backing virtual mappings with real shadow memory, and KASAN_VMALLOC
 	  must be enabled.
 
+config HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
+	def_bool n
+	help
+	  An arch should select this symbol if it can support kernel stack
+	  offset randomization with calls to add_random_kstack_offset()
+	  during syscall entry and choose_random_kstack_offset() during
+	  syscall exit. Careful removal of -fstack-protector-strong and
+	  -fstack-protector should also be applied to the entry code and
+	  closely examined, as the artificial stack bump looks like an array
+	  to the compiler, so it will attempt to add canary checks regardless
+	  of the static branch state.
+
+config RANDOMIZE_KSTACK_OFFSET_DEFAULT
+	bool "Randomize kernel stack offset on syscall entry"
+	depends on HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
+	help
+	  The kernel stack offset can be randomized (after pt_regs) by
+	  roughly 5 bits of entropy, frustrating memory corruption
+	  attacks that depend on stack address determinism or
+	  cross-syscall address exposures. This feature is controlled
+	  by kernel boot param "randomize_kstack_offset=on/off", and this
+	  config chooses the default boot state.
+
 config ARCH_OPTIONAL_KERNEL_RWX
 	def_bool n
 
diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h
new file mode 100644
index 000000000000..351520803006
--- /dev/null
+++ b/include/linux/randomize_kstack.h
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _LINUX_RANDOMIZE_KSTACK_H
+#define _LINUX_RANDOMIZE_KSTACK_H
+
+#include <linux/kernel.h>
+#include <linux/jump_label.h>
+#include <linux/percpu-defs.h>
+
+DECLARE_STATIC_KEY_MAYBE(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
+			 randomize_kstack_offset);
+DECLARE_PER_CPU(u32, kstack_offset);
+
+/*
+ * Do not use this anywhere else in the kernel. This is used here because
+ * it provides an arch-agnostic way to grow the stack with correct
+ * alignment. Also, since this use is being explicitly masked to a max of
+ * 10 bits, stack-clash style attacks are unlikely. For more details see
+ * "VLAs" in Documentation/process/deprecated.rst
+ * The asm statement is designed to convince the compiler to keep the
+ * allocation around even after "ptr" goes out of scope.
+ */
+void *__builtin_alloca(size_t size);
+/*
+ * Use, at most, 10 bits of entropy. We explicitly cap this to keep the
+ * "VLA" from being unbounded (see above). 10 bits leaves enough room for
+ * per-arch offset masks to reduce entropy (by removing higher bits, since
+ * high entropy may overly constrain usable stack space), and for
+ * compiler/arch-specific stack alignment to remove the lower bits.
+ */
+#define KSTACK_OFFSET_MAX(x)	((x) & 0x3FF)
+
+/*
+ * These macros must be used during syscall entry when interrupts and
+ * preempt are disabled, and after user registers have been stored to
+ * the stack.
+ */
+#define add_random_kstack_offset() do {					\
+	if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,	\
+				&randomize_kstack_offset)) {		\
+		u32 offset = __this_cpu_read(kstack_offset);		\
+		u8 *ptr = __builtin_alloca(KSTACK_OFFSET_MAX(offset));	\
+		asm volatile("" : "=m"(*ptr) :: "memory");		\
+	}								\
+} while (0)
+
+#define choose_random_kstack_offset(rand) do {				\
+	if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,	\
+				&randomize_kstack_offset)) {		\
+		u32 offset = __this_cpu_read(kstack_offset);		\
+		offset ^= (rand);					\
+		__this_cpu_write(kstack_offset, offset);		\
+	}								\
+} while (0)
+
+#endif
diff --git a/init/main.c b/init/main.c
index 53b278845b88..f498aac26e8c 100644
--- a/init/main.c
+++ b/init/main.c
@@ -844,6 +844,29 @@ static void __init mm_init(void)
 	pti_init();
 }
 
+#ifdef CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
+DEFINE_STATIC_KEY_MAYBE_RO(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
+			   randomize_kstack_offset);
+DEFINE_PER_CPU(u32, kstack_offset);
+
+static int __init early_randomize_kstack_offset(char *buf)
+{
+	int ret;
+	bool bool_result;
+
+	ret = kstrtobool(buf, &bool_result);
+	if (ret)
+		return ret;
+
+	if (bool_result)
+		static_branch_enable(&randomize_kstack_offset);
+	else
+		static_branch_disable(&randomize_kstack_offset);
+	return 0;
+}
+early_param("randomize_kstack_offset", early_randomize_kstack_offset);
+#endif
+
 void __init __weak arch_call_rest_init(void)
 {
 	rest_init();
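For a sense of how the pieces fit together at runtime, here is a minimal
sketch of an architecture wiring the two macros into its syscall path;
arch_do_syscall() and read_arch_timer() are illustrative names, not
functions from this series:

#include <linux/ptrace.h>
#include <linux/randomize_kstack.h>

/* Hypothetical arch hooks, declared only to keep the sketch whole. */
extern void arch_do_syscall(struct pt_regs *regs);
extern u32 read_arch_timer(void);

/*
 * Illustrative only. Called with interrupts/preemption disabled and
 * user registers already saved to the stack, per the header comment.
 */
static void arch_syscall_handler(struct pt_regs *regs)
{
	/* Grow the stack by the masked per-cpu offset before dispatch. */
	add_random_kstack_offset();

	arch_do_syscall(regs);

	/*
	 * Mix fresh entropy in for the *next* syscall, so an offset
	 * observed during this syscall is not reused by the next one.
	 */
	choose_random_kstack_offset(read_arch_timer() & 0x3FF);
}

Booting with randomize_kstack_offset=1 flips the static branch on via the
early_param() handler above; when the parameter is absent, the branch
state defaults to CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT.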