Message ID | 20191018161033.261971-1-samitolvanen@google.com (mailing list archive) |
---|---|
Series | add support for Clang's Shadow Call Stack |
Hi,

On Fri, 18 Oct 2019 09:10:15 -0700
Sami Tolvanen <samitolvanen@google.com> wrote:

> This patch series adds support for Clang's Shadow Call Stack (SCS)
> mitigation, which uses a separately allocated shadow stack to protect
> against return address overwrites. More information can be found here:
>
> https://clang.llvm.org/docs/ShadowCallStack.html

Looks interesting, and like what function-graph tracing does...

>
> SCS is currently supported only on arm64, where the compiler requires
> the x18 register to be reserved for holding the current task's shadow
> stack pointer. Because of this, the series includes four patches from
> Ard to remove x18 usage from assembly code and to reserve the register
> from general allocation.
>
> With -fsanitize=shadow-call-stack, the compiler injects instructions
> to all non-leaf C functions to store the return address to the shadow
> stack and unconditionally load it again before returning. As a result,
> SCS is incompatible with features that rely on modifying function
> return addresses to alter control flow, such as function graph tracing
> and kretprobes. A copy of the return address is still kept in the
> kernel stack for compatibility with stack unwinding, for example.

Is it possible that kretprobes and function graph tracing modify the
SCS directly instead of changing the real stack in that case?

Thank you,
This patch series adds support for Clang's Shadow Call Stack (SCS) mitigation, which uses a separately allocated shadow stack to protect against return address overwrites. More information can be found here:

https://clang.llvm.org/docs/ShadowCallStack.html

SCS provides better protection against traditional buffer overflows than CONFIG_STACKPROTECTOR_*, but it should be noted that SCS security guarantees in the kernel differ from the ones documented for user space. The kernel must store addresses of shadow stacks used by other tasks and interrupt handlers in memory, which means an attacker capable of reading and writing arbitrary memory may be able to locate them and hijack control flow by modifying shadow stacks that are not currently in use.

SCS is currently supported only on arm64, where the compiler requires the x18 register to be reserved for holding the current task's shadow stack pointer. Because of this, the series includes patches from Ard to remove x18 usage from assembly code.

With -fsanitize=shadow-call-stack, the compiler injects instructions into all non-leaf C functions to store the return address to the shadow stack, and unconditionally load it again before returning. As a result, SCS is currently incompatible with features that rely on modifying function return addresses to alter control flow, such as function graph tracing and kretprobes, although it may be possible to later change these features to modify the shadow stack instead. A copy of the return address is still kept in the kernel stack for compatibility with stack unwinding, for example.

SCS has a minimal performance overhead, but allocating shadow stacks increases kernel memory usage. The feature is therefore mostly useful on hardware that lacks support for PAC instructions.
Changes in v2:
- Changed Ard's KVM patch to use x29 instead of x18 for the guest context, which makes restore_callee_saved_regs cleaner
- Updated help text (and commit messages) to point out differences in security properties compared to user space SCS
- Cleaned up config options: removed the ROP protection choice, replaced the CC_IS_CLANG dependency with an arch-specific cc-option test, and moved disabling of incompatible config options to an arch-specific Kconfig
- Added CC_FLAGS_SCS, which are filtered out where needed instead of using DISABLE_SCS
- Added a __has_feature guard around __noscs for older clang versions
- Changed the shadow stack overflow check for vmapped SCS to use SCS_SIZE to avoid surprises when changing configs
- Renamed SCS_GFP to GFP_SCS
- Dropped the patch to reserve x18 unconditionally, it's now only reserved with SCS
- Added a clarification why restoring x18 in the EFI RT wrapper is safe
- Added a missing change to arch/arm64/include/asm/suspend.h, and a comment to arch/arm64/mm/proc.S to remind that struct cpu_suspend_ctx must be kept in sync with the code
- Moved x18 loading/storing during a context switch to cpu_switch_to(), renamed scs_thread_switch() to scs_overflow_check(), and removed the now unused scs_load()
- Added compile-time initialization for init_shadow_call_stack and removed scs_set_init_magic()

Ard Biesheuvel (2):
  arm64/lib: copy_page: avoid x18 register in assembler code
  arm64: kernel: avoid x18 as an arbitrary temp register

Sami Tolvanen (15):
  arm64: mm: don't use x18 in idmap_kpti_install_ng_mappings
  arm64: kvm: stop treating register x18 as caller save
  add support for Clang's Shadow Call Stack (SCS)
  scs: add accounting
  scs: add support for stack usage debugging
  kprobes: fix compilation without CONFIG_KRETPROBES
  arm64: disable function graph tracing with SCS
  arm64: disable kretprobes with SCS
  arm64: reserve x18 from general allocation with SCS
  arm64: preserve x18 when CPU is suspended
  arm64: efi: restore x18 if it was corrupted
  arm64: vdso: disable Shadow Call Stack
  arm64: kprobes: fix kprobes without CONFIG_KRETPROBES
  arm64: disable SCS for hypervisor code
  arm64: implement Shadow Call Stack

 Makefile | 6 +
 arch/Kconfig | 33 +++++
 arch/arm64/Kconfig | 9 +-
 arch/arm64/Makefile | 4 +
 arch/arm64/include/asm/scs.h | 45 ++++++
 arch/arm64/include/asm/stacktrace.h | 4 +
 arch/arm64/include/asm/suspend.h | 2 +-
 arch/arm64/include/asm/thread_info.h | 3 +
 arch/arm64/kernel/Makefile | 1 +
 arch/arm64/kernel/asm-offsets.c | 3 +
 arch/arm64/kernel/cpu-reset.S | 4 +-
 arch/arm64/kernel/efi-rt-wrapper.S | 7 +-
 arch/arm64/kernel/entry.S | 28 ++++
 arch/arm64/kernel/head.S | 9 ++
 arch/arm64/kernel/irq.c | 2 +
 arch/arm64/kernel/probes/kprobes.c | 2 +
 arch/arm64/kernel/process.c | 2 +
 arch/arm64/kernel/scs.c | 39 +++++
 arch/arm64/kernel/smp.c | 4 +
 arch/arm64/kernel/vdso/Makefile | 2 +-
 arch/arm64/kvm/hyp/Makefile | 3 +
 arch/arm64/kvm/hyp/entry.S | 41 +++--
 arch/arm64/lib/copy_page.S | 38 ++---
 arch/arm64/mm/proc.S | 72 +++++----
 drivers/base/node.c | 6 +
 fs/proc/meminfo.c | 4 +
 include/linux/compiler-clang.h | 6 +
 include/linux/compiler_types.h | 4 +
 include/linux/mmzone.h | 3 +
 include/linux/scs.h | 78 ++++++++++
 init/init_task.c | 8 +
 kernel/Makefile | 1 +
 kernel/fork.c | 9 ++
 kernel/kprobes.c | 38 ++---
 kernel/sched/core.c | 2 +
 kernel/sched/sched.h | 1 +
 kernel/scs.c | 214 +++++++++++++++++++++++++++
 mm/page_alloc.c | 6 +
 mm/vmstat.c | 3 +
 39 files changed, 649 insertions(+), 97 deletions(-)
 create mode 100644 arch/arm64/include/asm/scs.h
 create mode 100644 arch/arm64/kernel/scs.c
 create mode 100644 include/linux/scs.h
 create mode 100644 kernel/scs.c
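For readers unfamiliar with the instrumentation described in the cover letter, here is a minimal user-space sketch of the idea in C. It is only a model under stated assumptions: the struct and function names (model_cpu, model_prologue, model_epilogue) are hypothetical and do not appear in the series, and the two instructions named in the comments are the arm64 prologue/epilogue sequences given in the Clang ShadowCallStack documentation. What it tries to show is that a return address overwritten in the regular stack frame is ignored, because the epilogue reloads it from the shadow stack.

```c
#include <stdint.h>

/* Hypothetical user-space model; none of these names come from the series. */
#define MODEL_SCS_DEPTH 16

struct model_cpu {
	uintptr_t lr;                   /* link register (x30 on arm64) */
	uintptr_t *scs_sp;              /* shadow stack pointer (x18 on arm64) */
	uintptr_t scs[MODEL_SCS_DEPTH]; /* separately allocated shadow stack */
	uintptr_t frame_lr;             /* copy of lr kept in the regular stack frame */
};

static void model_init(struct model_cpu *cpu)
{
	cpu->lr = 0;
	cpu->frame_lr = 0;
	cpu->scs_sp = cpu->scs;         /* shadow stack starts empty */
}

/*
 * Prologue of a non-leaf function, modelled on `str x30, [x18], #8`.
 * The model does no bounds checking; callers must stay within
 * MODEL_SCS_DEPTH nested calls.
 */
static void model_prologue(struct model_cpu *cpu)
{
	*cpu->scs_sp++ = cpu->lr;       /* push return address to the shadow stack */
	cpu->frame_lr = cpu->lr;        /* regular copy, still used by unwinders */
}

/*
 * Epilogue, modelled on `ldr x30, [x18, #-8]!`. The return address is
 * loaded unconditionally from the shadow stack; frame_lr is not consulted.
 */
static uintptr_t model_epilogue(struct model_cpu *cpu)
{
	cpu->lr = *--cpu->scs_sp;
	return cpu->lr;                 /* address that control returns to */
}
```

Calling model_prologue() with lr set to 0x1000, clobbering frame_lr, and then calling model_epilogue() still yields 0x1000 in this model. The same property explains why code that redirects returns by patching the on-stack return address, such as function graph tracing, has no effect under SCS.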
On Fri, Oct 25, 2019 at 7:52 AM <samitolvanen@google.com> wrote:
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
>  arch/arm64/kvm/hyp/Makefile | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/arm64/kvm/hyp/Makefile b/arch/arm64/kvm/hyp/Makefile
> index ea710f674cb6..8289ea086e5e 100644
> --- a/arch/arm64/kvm/hyp/Makefile
> +++ b/arch/arm64/kvm/hyp/Makefile
> @@ -28,3 +28,6 @@ GCOV_PROFILE := n
>  KASAN_SANITIZE := n
>  UBSAN_SANITIZE := n
>  KCOV_INSTRUMENT := n
> +
> +ORIG_CFLAGS := $(KBUILD_CFLAGS)
> +KBUILD_CFLAGS = $(subst $(CC_FLAGS_SCS),,$(ORIG_CFLAGS))

$(subst ... ) is not the correct use here.

It works like sed, s/$(CC_FLAGS_SCS)//
instead of matching by word.

KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))

is more correct, and simpler.
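The difference between the two Make functions can be modelled in plain C. This is a sketch, not GNU Make itself: model_subst_remove() and model_filter_out() are hypothetical helpers that mimic $(subst from,,text) and $(filter-out words,text) for patterns without '%', and the flag values in the note below are assumed for illustration only.

```c
#include <stddef.h>
#include <string.h>

/*
 * Model of $(subst from,,text): delete every occurrence of the substring
 * `from`, wherever it appears, including in the middle of a word.
 * `outsz` must be at least 1.
 */
static void model_subst_remove(const char *from, const char *text,
			       char *out, size_t outsz)
{
	size_t flen = strlen(from), n = 0;

	while (*text && n + 1 < outsz) {
		if (flen && strncmp(text, from, flen) == 0) {
			text += flen;   /* skip the match, even mid-word */
			continue;
		}
		out[n++] = *text++;
	}
	out[n] = '\0';
}

/* Return 1 if the word [w, w + len) equals a whitespace-separated word in list. */
static int model_word_in_list(const char *w, size_t len, const char *list)
{
	while (*list) {
		size_t l;

		list += strspn(list, " \t");
		l = strcspn(list, " \t");
		if (l && l == len && strncmp(list, w, len) == 0)
			return 1;
		list += l;
	}
	return 0;
}

/*
 * Model of $(filter-out words,text) for patterns without '%': drop each
 * whole word of `text` that equals one of `words`, and join the rest
 * with single spaces. `outsz` must be at least 1.
 */
static void model_filter_out(const char *words, const char *text,
			     char *out, size_t outsz)
{
	size_t n = 0;

	out[0] = '\0';
	for (;;) {
		size_t len;

		text += strspn(text, " \t");
		len = strcspn(text, " \t");
		if (!len)
			break;
		if (!model_word_in_list(text, len, words) &&
		    n + len + 2 <= outsz) {
			if (n)
				out[n++] = ' ';
			memcpy(out + n, text, len);
			n += len;
			out[n] = '\0';
		}
		text += len;
	}
}
```

With the text "-O2 -ffixed-x18 -fsanitize=shadow-call-stack -Wall" and the flag list "-fsanitize=shadow-call-stack -ffixed-x18", model_subst_remove() leaves the text unchanged because the two-word string never occurs as one contiguous substring, while model_filter_out() yields "-O2 -Wall". Conversely, removing "-pg" from "-pg -pgo" by substring also mangles the second word, whereas word matching leaves "-pgo" alone.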
On Fri, 25 Oct 2019 10:29:47 +0900
Masahiro Yamada <yamada.masahiro@socionext.com> wrote:

> On Fri, Oct 25, 2019 at 7:52 AM <samitolvanen@google.com> wrote:
> >
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> > ---
> >  arch/arm64/kvm/hyp/Makefile | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/hyp/Makefile b/arch/arm64/kvm/hyp/Makefile
> > index ea710f674cb6..8289ea086e5e 100644
> > --- a/arch/arm64/kvm/hyp/Makefile
> > +++ b/arch/arm64/kvm/hyp/Makefile
> > @@ -28,3 +28,6 @@ GCOV_PROFILE := n
> >  KASAN_SANITIZE := n
> >  UBSAN_SANITIZE := n
> >  KCOV_INSTRUMENT := n
> > +
> > +ORIG_CFLAGS := $(KBUILD_CFLAGS)
> > +KBUILD_CFLAGS = $(subst $(CC_FLAGS_SCS),,$(ORIG_CFLAGS))
>
> $(subst ... ) is not the correct use here.
>
> It works like sed, s/$(CC_FLAGS_SCS)//
> instead of matching by word.
>
> KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))
>
> is more correct, and simpler.

I guess that would work too. Not sure why I never used it. I see mips
used it for their -pg flags.

-- Steve
On Thu, Oct 24, 2019 at 03:51:16PM -0700, samitolvanen@google.com wrote:
> idmap_kpti_install_ng_mappings uses x18 as a temporary register, which
> will result in a conflict when x18 is reserved. Use x16 and x17 instead
> where needed.
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

AFAICT the new register assignment is sound, so FWIW:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

I was going to suggest adding mnemonics for the remaining raw register
names, but after having a go locally I think it's cleaner as-is given
the registers are used in different widths for multiple purposes.

Thanks,
Mark.

> ---
>  arch/arm64/mm/proc.S | 63 ++++++++++++++++++++++----------------------
>  1 file changed, 32 insertions(+), 31 deletions(-)
>
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index a1e0592d1fbc..fdabf40a83c8 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -250,15 +250,15 @@ ENTRY(idmap_kpti_install_ng_mappings)
>  	/* We're the boot CPU. Wait for the others to catch up */
>  	sevl
>  1:	wfe
> -	ldaxr	w18, [flag_ptr]
> -	eor	w18, w18, num_cpus
> -	cbnz	w18, 1b
> +	ldaxr	w17, [flag_ptr]
> +	eor	w17, w17, num_cpus
> +	cbnz	w17, 1b
>
>  	/* We need to walk swapper, so turn off the MMU. */
>  	pre_disable_mmu_workaround
> -	mrs	x18, sctlr_el1
> -	bic	x18, x18, #SCTLR_ELx_M
> -	msr	sctlr_el1, x18
> +	mrs	x17, sctlr_el1
> +	bic	x17, x17, #SCTLR_ELx_M
> +	msr	sctlr_el1, x17
>  	isb
>
>  	/* Everybody is enjoying the idmap, so we can rewrite swapper. */
> @@ -281,9 +281,9 @@ skip_pgd:
>  	isb
>
>  	/* We're done: fire up the MMU again */
> -	mrs	x18, sctlr_el1
> -	orr	x18, x18, #SCTLR_ELx_M
> -	msr	sctlr_el1, x18
> +	mrs	x17, sctlr_el1
> +	orr	x17, x17, #SCTLR_ELx_M
> +	msr	sctlr_el1, x17
>  	isb
>
>  	/*
> @@ -353,46 +353,47 @@ skip_pte:
>  	b.ne	do_pte
>  	b	next_pmd
>
> +	.unreq	cpu
> +	.unreq	num_cpus
> +	.unreq	swapper_pa
> +	.unreq	cur_pgdp
> +	.unreq	end_pgdp
> +	.unreq	pgd
> +	.unreq	cur_pudp
> +	.unreq	end_pudp
> +	.unreq	pud
> +	.unreq	cur_pmdp
> +	.unreq	end_pmdp
> +	.unreq	pmd
> +	.unreq	cur_ptep
> +	.unreq	end_ptep
> +	.unreq	pte
> +
>  	/* Secondary CPUs end up here */
>  __idmap_kpti_secondary:
>  	/* Uninstall swapper before surgery begins */
> -	__idmap_cpu_set_reserved_ttbr1 x18, x17
> +	__idmap_cpu_set_reserved_ttbr1 x16, x17
>
>  	/* Increment the flag to let the boot CPU we're ready */
> -1:	ldxr	w18, [flag_ptr]
> -	add	w18, w18, #1
> -	stxr	w17, w18, [flag_ptr]
> +1:	ldxr	w16, [flag_ptr]
> +	add	w16, w16, #1
> +	stxr	w17, w16, [flag_ptr]
>  	cbnz	w17, 1b
>
>  	/* Wait for the boot CPU to finish messing around with swapper */
>  	sevl
>  1:	wfe
> -	ldxr	w18, [flag_ptr]
> -	cbnz	w18, 1b
> +	ldxr	w16, [flag_ptr]
> +	cbnz	w16, 1b
>
>  	/* All done, act like nothing happened */
> -	offset_ttbr1 swapper_ttb, x18
> +	offset_ttbr1 swapper_ttb, x16
>  	msr	ttbr1_el1, swapper_ttb
>  	isb
>  	ret
>
> -	.unreq	cpu
> -	.unreq	num_cpus
> -	.unreq	swapper_pa
>  	.unreq	swapper_ttb
>  	.unreq	flag_ptr
> -	.unreq	cur_pgdp
> -	.unreq	end_pgdp
> -	.unreq	pgd
> -	.unreq	cur_pudp
> -	.unreq	end_pudp
> -	.unreq	pud
> -	.unreq	cur_pmdp
> -	.unreq	end_pmdp
> -	.unreq	pmd
> -	.unreq	cur_ptep
> -	.unreq	end_ptep
> -	.unreq	pte
>  ENDPROC(idmap_kpti_install_ng_mappings)
>  .popsection
>  #endif
> --
> 2.24.0.rc0.303.g954a862665-goog
>
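The flag handshake in the quoted patch is a load-exclusive/store-exclusive retry loop on the secondary CPUs and an acquire-load spin on the boot CPU. A rough C11 analogue is sketched below. It is only an illustration: model_flag_inc() and model_all_cpus_ready() are hypothetical names, and the memory orderings are an assumed mapping of ldxr/stxr (no ordering) and ldaxr (acquire), not something stated in the patch.

```c
#include <stdatomic.h>

/*
 * Model of the secondary-CPU increment:
 *   1: ldxr w16, [flag_ptr]
 *      add  w16, w16, #1
 *      stxr w17, w16, [flag_ptr]
 *      cbnz w17, 1b
 * compare_exchange_weak may fail spuriously, much like stxr, and on
 * failure it reloads `old`, so the loop simply retries.
 */
static void model_flag_inc(atomic_uint *flag)
{
	unsigned int old = atomic_load_explicit(flag, memory_order_relaxed);

	while (!atomic_compare_exchange_weak_explicit(flag, &old, old + 1,
						      memory_order_relaxed,
						      memory_order_relaxed))
		;
}

/*
 * Model of one iteration of the boot-CPU wait:
 *   ldaxr w17, [flag_ptr]
 *   eor   w17, w17, num_cpus
 *   cbnz  w17, 1b
 * Returns nonzero once the flag equals num_cpus.
 */
static int model_all_cpus_ready(atomic_uint *flag, unsigned int num_cpus)
{
	return atomic_load_explicit(flag, memory_order_acquire) == num_cpus;
}
```

The register choice (w16/w17 instead of w18) has no counterpart in the C model; it only matters in the assembly, where x18 must stay free for the shadow stack pointer.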
On Thu, Oct 24, 2019 at 6:31 PM Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
> $(subst ... ) is not the correct use here.
>
> It works like sed, s/$(CC_FLAGS_SCS)//
> instead of matching by word.
>
> KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))
>
> is more correct, and simpler.

Thanks, I will change this in v3.

Sami
This patch series adds support for Clang's Shadow Call Stack (SCS) mitigation, which uses a separately allocated shadow stack to protect against return address overwrites. More information can be found here:

https://clang.llvm.org/docs/ShadowCallStack.html

SCS provides better protection against traditional buffer overflows than CONFIG_STACKPROTECTOR_*, but it should be noted that SCS security guarantees in the kernel differ from the ones documented for user space. The kernel must store addresses of shadow stacks used by other tasks and interrupt handlers in memory, which means an attacker capable of reading and writing arbitrary memory may be able to locate them and hijack control flow by modifying shadow stacks that are not currently in use.

SCS is currently supported only on arm64, where the compiler requires the x18 register to be reserved for holding the current task's shadow stack pointer. Because of this, the series includes patches from Ard to remove x18 usage from assembly code.

With -fsanitize=shadow-call-stack, the compiler injects instructions into all non-leaf C functions to store the return address to the shadow stack, and unconditionally load it again before returning. As a result, SCS is currently incompatible with features that rely on modifying function return addresses to alter control flow, such as function graph tracing and kretprobes, although it may be possible to later change these features to modify the shadow stack instead. A copy of the return address is still kept in the kernel stack for compatibility with stack unwinding, for example.

SCS has a minimal performance overhead, but allocating shadow stacks increases kernel memory usage. The feature is therefore mostly useful on hardware that lacks support for PAC instructions.
Changes in v3:
- Switched to filter-out for removing SCS flags in Makefiles
- Changed the __noscs attribute to use __no_sanitize__("...") instead of no_sanitize("...")
- Cleaned up inline function definitions and moved task_scs() into a macro
- Cleaned up scs_free() and scs_magic()
- Moved SCS initialization into dup_task_struct() and removed the now unused scs_task_init()
- Added comments to __scs_base() and scs_task_reset() to better document design choices
- Changed copy_page to make the offset and bias explicit

Changes in v2:
- Changed Ard's KVM patch to use x29 instead of x18 for the guest context, which makes restore_callee_saved_regs cleaner
- Updated help text (and commit messages) to point out differences in security properties compared to user space SCS
- Cleaned up config options: removed the ROP protection choice, replaced the CC_IS_CLANG dependency with an arch-specific cc-option test, and moved disabling of incompatible config options to an arch-specific Kconfig
- Added CC_FLAGS_SCS, which are filtered out where needed instead of using DISABLE_SCS
- Added a __has_feature guard around __noscs for older clang versions

Ard Biesheuvel (1):
  arm64: kernel: avoid x18 __cpu_soft_restart

Sami Tolvanen (16):
  arm64: mm: avoid x18 in idmap_kpti_install_ng_mappings
  arm64/lib: copy_page: avoid x18 register in assembler code
  arm64: kvm: stop treating register x18 as caller save
  add support for Clang's Shadow Call Stack (SCS)
  scs: add accounting
  scs: add support for stack usage debugging
  kprobes: fix compilation without CONFIG_KRETPROBES
  arm64: kprobes: fix kprobes without CONFIG_KRETPROBES
  arm64: disable kretprobes with SCS
  arm64: disable function graph tracing with SCS
  arm64: reserve x18 from general allocation with SCS
  arm64: preserve x18 when CPU is suspended
  arm64: efi: restore x18 if it was corrupted
  arm64: vdso: disable Shadow Call Stack
  arm64: disable SCS for hypervisor code
  arm64: implement Shadow Call Stack

 Makefile | 6 +
 arch/Kconfig | 33 ++++
 arch/arm64/Kconfig | 9 +-
 arch/arm64/Makefile | 4 +
 arch/arm64/include/asm/scs.h | 37 +++++
 arch/arm64/include/asm/stacktrace.h | 4 +
 arch/arm64/include/asm/suspend.h | 2 +-
 arch/arm64/include/asm/thread_info.h | 3 +
 arch/arm64/kernel/Makefile | 1 +
 arch/arm64/kernel/asm-offsets.c | 3 +
 arch/arm64/kernel/cpu-reset.S | 4 +-
 arch/arm64/kernel/efi-rt-wrapper.S | 7 +-
 arch/arm64/kernel/entry.S | 28 ++++
 arch/arm64/kernel/head.S | 9 ++
 arch/arm64/kernel/irq.c | 2 +
 arch/arm64/kernel/probes/kprobes.c | 2 +
 arch/arm64/kernel/process.c | 2 +
 arch/arm64/kernel/scs.c | 39 +++++
 arch/arm64/kernel/smp.c | 4 +
 arch/arm64/kernel/vdso/Makefile | 2 +-
 arch/arm64/kvm/hyp/Makefile | 3 +
 arch/arm64/kvm/hyp/entry.S | 41 +++--
 arch/arm64/lib/copy_page.S | 38 ++---
 arch/arm64/mm/proc.S | 72 +++++----
 drivers/base/node.c | 6 +
 fs/proc/meminfo.c | 4 +
 include/linux/compiler-clang.h | 6 +
 include/linux/compiler_types.h | 4 +
 include/linux/mmzone.h | 3 +
 include/linux/scs.h | 54 +++++++
 init/init_task.c | 8 +
 kernel/Makefile | 1 +
 kernel/fork.c | 9 ++
 kernel/kprobes.c | 38 ++---
 kernel/sched/core.c | 2 +
 kernel/sched/sched.h | 1 +
 kernel/scs.c | 227 +++++++++++++++++++++++++++
 mm/page_alloc.c | 6 +
 mm/vmstat.c | 3 +
 39 files changed, 630 insertions(+), 97 deletions(-)
 create mode 100644 arch/arm64/include/asm/scs.h
 create mode 100644 arch/arm64/kernel/scs.c
 create mode 100644 include/linux/scs.h
 create mode 100644 kernel/scs.c
This patch series adds support for Clang's Shadow Call Stack (SCS) mitigation, which uses a separately allocated shadow stack to protect against return address overwrites. More information can be found here:

https://clang.llvm.org/docs/ShadowCallStack.html

SCS provides better protection against traditional buffer overflows than CONFIG_STACKPROTECTOR_*, but it should be noted that SCS security guarantees in the kernel differ from the ones documented for user space. The kernel must store addresses of shadow stacks used by inactive tasks and interrupt handlers in memory, which means an attacker capable of reading and writing arbitrary memory may be able to locate them and hijack control flow by modifying shadow stacks that are not currently in use.

SCS is currently supported only on arm64, where the compiler requires the x18 register to be reserved for holding the current task's shadow stack pointer. Because of this, the series includes patches from Ard to remove x18 usage from assembly code.

With -fsanitize=shadow-call-stack, the compiler injects instructions into all non-leaf C functions to store the return address to the shadow stack, and unconditionally load it again before returning. As a result, SCS is currently incompatible with features that rely on modifying function return addresses in the kernel stack to alter control flow, such as function graph tracing, although it may be possible to later change these features to modify the shadow stack instead. A copy of the return address is still kept in the kernel stack for compatibility with stack unwinding, for example.

SCS has a minimal performance overhead, but allocating shadow stacks increases kernel memory usage. The feature is therefore mostly useful on hardware that lacks support for PAC instructions.
Changes in v5:
- Updated the comment in __scs_base() to Mark's suggestion
- Changed all instances of uintptr_t to unsigned long
- Added allocation poisoning for KASAN to catch unintentional shadow stack accesses; moved set_set_magic before poisoning and switched scs_used() and scs_corrupted() to access the buffer using READ_ONCE_NOCHECK() instead
- Changed scs_free() to check for NULL instead of zero
- Renamed SCS_CACHE_SIZE to NR_CACHED_SCS
- Added a warning if cpuhp_setup_state fails in scs_init()
- Dropped patches disabling kretprobes after confirming there's no functional conflict with SCS instrumentation
- Added an explanation to the commit message why function graph tracing and SCS are incompatible
- Removed the ifdefs from arch/arm64/mm/proc.S and added comments explaining why we are saving and restoring x18
- Updated scs_check_usage format to include process information

Changes in v4:
- Fixed authorship for Ard's patches
- Added missing commit messages
- Commented code that clears SCS from thread_info
- Added a comment about SCS_END_MAGIC being non-canonical

Changes in v3:
- Switched to filter-out for removing SCS flags in Makefiles
- Changed the __noscs attribute to use __no_sanitize__("...") instead of no_sanitize("...")
- Cleaned up inline function definitions and moved task_scs() into a macro
- Cleaned up scs_free() and scs_magic()
- Moved SCS initialization into dup_task_struct() and removed the now unused scs_task_init()
- Added comments to __scs_base() and scs_task_reset() to better document design choices
- Changed copy_page to make the offset and bias explicit

Changes in v2:
- Changed Ard's KVM patch to use x29 instead of x18 for the guest context, which makes restore_callee_saved_regs cleaner
- Updated help text (and commit messages) to point out differences in security properties compared to user space SCS
- Cleaned up config options: removed the ROP protection choice, replaced the CC_IS_CLANG dependency with an arch-specific cc-option test, and moved disabling of incompatible config options to an arch-specific Kconfig
- Added CC_FLAGS_SCS, which are filtered out where needed instead of using DISABLE_SCS
- Added a __has_feature guard around __noscs for older clang versions

Ard Biesheuvel (3):
  arm64/lib: copy_page: avoid x18 register in assembler code
  arm64: kvm: stop treating register x18 as caller save
  arm64: kernel: avoid x18 in __cpu_soft_restart

Sami Tolvanen (11):
  arm64: mm: avoid x18 in idmap_kpti_install_ng_mappings
  add support for Clang's Shadow Call Stack (SCS)
  scs: add accounting
  scs: add support for stack usage debugging
  arm64: disable function graph tracing with SCS
  arm64: reserve x18 from general allocation with SCS
  arm64: preserve x18 when CPU is suspended
  arm64: efi: restore x18 if it was corrupted
  arm64: vdso: disable Shadow Call Stack
  arm64: disable SCS for hypervisor code
  arm64: implement Shadow Call Stack

 Makefile | 6 +
 arch/Kconfig | 33 ++++
 arch/arm64/Kconfig | 7 +-
 arch/arm64/Makefile | 4 +
 arch/arm64/include/asm/scs.h | 37 ++++
 arch/arm64/include/asm/stacktrace.h | 4 +
 arch/arm64/include/asm/suspend.h | 2 +-
 arch/arm64/include/asm/thread_info.h | 3 +
 arch/arm64/kernel/Makefile | 1 +
 arch/arm64/kernel/asm-offsets.c | 3 +
 arch/arm64/kernel/cpu-reset.S | 4 +-
 arch/arm64/kernel/efi-rt-wrapper.S | 7 +-
 arch/arm64/kernel/entry.S | 28 +++
 arch/arm64/kernel/head.S | 9 +
 arch/arm64/kernel/irq.c | 2 +
 arch/arm64/kernel/process.c | 2 +
 arch/arm64/kernel/scs.c | 39 +++++
 arch/arm64/kernel/smp.c | 4 +
 arch/arm64/kernel/vdso/Makefile | 2 +-
 arch/arm64/kvm/hyp/Makefile | 3 +
 arch/arm64/kvm/hyp/entry.S | 45 ++---
 arch/arm64/lib/copy_page.S | 38 ++---
 arch/arm64/mm/proc.S | 77 +++++----
 drivers/base/node.c | 6 +
 fs/proc/meminfo.c | 4 +
 include/linux/compiler-clang.h | 6 +
 include/linux/compiler_types.h | 4 +
 include/linux/mmzone.h | 3 +
 include/linux/scs.h | 57 +++++++
 init/init_task.c | 8 +
 kernel/Makefile | 1 +
 kernel/fork.c | 9 +
 kernel/sched/core.c | 2 +
 kernel/scs.c | 246 +++++++++++++++++++++++++++
 mm/page_alloc.c | 6 +
 mm/vmstat.c | 3 +
 36 files changed, 638 insertions(+), 77 deletions(-)
 create mode 100644 arch/arm64/include/asm/scs.h
 create mode 100644 arch/arm64/kernel/scs.c
 create mode 100644 include/linux/scs.h
 create mode 100644 kernel/scs.c

base-commit: 26bc672134241a080a83b2ab9aa8abede8d30e1c
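The v5 changelog mentions scs_used(), scs_corrupted() and an SCS_END_MAGIC value without showing how such checks might work. The following user-space sketch illustrates one plausible scheme: a magic word in the last slot detects overflow, and a scan for the first zero slot estimates usage. Everything here is an assumption made for illustration; the names, the stack size, the magic value, and the zero-fill convention are hypothetical and are not taken from kernel/scs.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes and values; not the ones used by the series. */
#define MODEL_SCS_WORDS 128
#define MODEL_END_MAGIC 0x5ca1ab1eUL

struct model_scs {
	unsigned long slot[MODEL_SCS_WORDS];
};

/* Zero the stack and place the magic word in the last slot. */
static void model_scs_init(struct model_scs *s)
{
	memset(s->slot, 0, sizeof(s->slot));
	s->slot[MODEL_SCS_WORDS - 1] = MODEL_END_MAGIC;
}

/* An overflow that runs to the end of the stack overwrites the magic word. */
static int model_scs_corrupted(const struct model_scs *s)
{
	return s->slot[MODEL_SCS_WORDS - 1] != MODEL_END_MAGIC;
}

/*
 * Bytes used so far, assuming the stack grows upward from slot[0] and
 * that a pushed return address is never zero.
 */
static size_t model_scs_used(const struct model_scs *s)
{
	size_t i = 0;

	while (i < MODEL_SCS_WORDS - 1 && s->slot[i] != 0)
		i++;
	return i * sizeof(unsigned long);
}
```

In this model, pushing two return addresses reports two words in use, and writing past the last usable slot flips model_scs_corrupted() to true.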
On Tue, Nov 05, 2019 at 03:55:54PM -0800, Sami Tolvanen wrote:
> This patch series adds support for Clang's Shadow Call Stack
> (SCS) mitigation, which uses a separately allocated shadow stack
> to protect against return address overwrites. More information

Will, Catalin, Mark,

What's the next step here? I *think* all the comments have been
addressed. Is it possible to land this via the arm tree for v5.5?

Thanks!
On Tue, Nov 12, 2019 at 03:44:42PM -0800, Kees Cook wrote:
> On Tue, Nov 05, 2019 at 03:55:54PM -0800, Sami Tolvanen wrote:
> > This patch series adds support for Clang's Shadow Call Stack
> > (SCS) mitigation, which uses a separately allocated shadow stack
> > to protect against return address overwrites. More information
>
> Will, Catalin, Mark,
>
> What's the next step here? I *think* all the comments have been
> addressed. Is it possible to land this via the arm tree for v5.5?

I was planning to queue this for 5.6, given that I'd really like it to
spend some quality time in linux-next.

Will
On Wed, Nov 13, 2019 at 12:03:38PM +0000, Will Deacon wrote:
> On Tue, Nov 12, 2019 at 03:44:42PM -0800, Kees Cook wrote:
> > On Tue, Nov 05, 2019 at 03:55:54PM -0800, Sami Tolvanen wrote:
> > > This patch series adds support for Clang's Shadow Call Stack
> > > (SCS) mitigation, which uses a separately allocated shadow stack
> > > to protect against return address overwrites. More information
> >
> > Will, Catalin, Mark,
> >
> > What's the next step here? I *think* all the comments have been
> > addressed. Is it possible to land this via the arm tree for v5.5?
>
> I was planning to queue this for 5.6, given that I'd really like it to
> spend some quality time in linux-next.

Sounds fine to me; I just wanted to have an idea what to expect. :)

Thanks!
On Tue, Nov 12, 2019 at 03:44:42PM -0800, Kees Cook wrote:
> On Tue, Nov 05, 2019 at 03:55:54PM -0800, Sami Tolvanen wrote:
> > This patch series adds support for Clang's Shadow Call Stack
> > (SCS) mitigation, which uses a separately allocated shadow stack
> > to protect against return address overwrites. More information
>
> Will, Catalin, Mark,
>
> What's the next step here? I *think* all the comments have been
> addressed.

I'm hoping to look over the remaining bits in the next week or so, and
to throw my test boxes at this shortly.

Thanks,
Mark.
This patch series adds support for Clang's Shadow Call Stack (SCS) mitigation, which uses a separately allocated shadow stack to protect against return address overwrites. More information can be found here: https://clang.llvm.org/docs/ShadowCallStack.html SCS provides better protection against traditional buffer overflows than CONFIG_STACKPROTECTOR_*, but it should be noted that SCS security guarantees in the kernel differ from the ones documented for user space. The kernel must store addresses of shadow stacks used by inactive tasks and interrupt handlers in memory, which means an attacker capable reading and writing arbitrary memory may be able to locate them and hijack control flow by modifying shadow stacks that are not currently in use. SCS is currently supported only on arm64, where the compiler requires the x18 register to be reserved for holding the current task's shadow stack pointer. Because of this, the series includes patches from Ard to remove x18 usage from assembly code. With -fsanitize=shadow-call-stack, the compiler injects instructions to all non-leaf C functions to store the return address to the shadow stack, and unconditionally load it again before returning. As a result, SCS is currently incompatible with features that rely on modifying function return addresses in the kernel stack to alter control flow, such as function graph tracing, although it may be possible to later change these features to modify the shadow stack instead. A copy of the return address is still kept in the kernel stack for compatibility with stack unwinding, for example. SCS has a minimal performance overhead, but allocating shadow stacks increases kernel memory usage. The feature is therefore mostly useful on hardware that lacks support for PAC instructions. Changes in v6: - Updated comment in the EFI RT wrapper to include the explanation from the commit message. 
- Fixed the SHADOW_CALL_STACK_VMAP config option and the compilation errors in scs_init_irq() - Updated the comment in entry.S to Mark's suggestion - Fixed the WARN_ON in scs_init() to trip only when the return value for cpuhp_setup_state() is < 0. - Removed ifdefs from the code in arch/arm64/kernel/scs.c and added separate shadow stacks for the SDEI handler Changes in v5: - Updated the comment in __scs_base() to Mark's suggestion - Changed all instances of uintptr_t to unsigned long - Added allocation poisoning for KASAN to catch unintentional shadow stack accesses; moved set_set_magic before poisoning and switched scs_used() and scs_corrupted() to access the buffer using READ_ONCE_NOCHECK() instead - Changed scs_free() to check for NULL instead of zero - Renamed SCS_CACHE_SIZE to NR_CACHED_SCS - Added a warning if cpuhp_setup_state fails in scs_init() - Dropped patches disabling kretprobes after confirming there's no functional conflict with SCS instrumentation - Added an explanation to the commit message why function graph tracing and SCS are incompatible - Removed the ifdefs from arch/arm64/mm/proc.S and added comments explaining why we are saving and restoring x18 - Updated scs_check_usage format to include process information Changes in v4: - Fixed authorship for Ard's patches - Added missing commit messages - Commented code that clears SCS from thread_info - Added a comment about SCS_END_MAGIC being non-canonical Changes in v3: - Switched to filter-out for removing SCS flags in Makefiles - Changed the __noscs attribute to use __no_sanitize__("...") instead of no_sanitize("...") - Cleaned up inline function definitions and moved task_scs() into a macro - Cleaned up scs_free() and scs_magic() - Moved SCS initialization into dup_task_struct() and removed the now unused scs_task_init() - Added comments to __scs_base() and scs_task_reset() to better document design choices - Changed copy_page to make the offset and bias explicit Changes in v2: - Changed Ard's KVM 
patch to use x29 instead of x18 for the guest context, which makes restore_callee_saved_regs cleaner - Updated help text (and commit messages) to point out differences in security properties compared to user space SCS - Cleaned up config options: removed the ROP protection choice, replaced the CC_IS_CLANG dependency with an arch-specific cc-option test, and moved disabling of incompatible config options to an arch-specific Kconfig - Added CC_FLAGS_SCS, which are filtered out where needed instead of using DISABLE_SCS - Added a __has_feature guard around __noscs for older clang versions Ard Biesheuvel (3): arm64/lib: copy_page: avoid x18 register in assembler code arm64: kvm: stop treating register x18 as caller save arm64: kernel: avoid x18 in __cpu_soft_restart Sami Tolvanen (12): arm64: mm: avoid x18 in idmap_kpti_install_ng_mappings add support for Clang's Shadow Call Stack (SCS) scs: add accounting scs: add support for stack usage debugging arm64: disable function graph tracing with SCS arm64: reserve x18 from general allocation with SCS arm64: preserve x18 when CPU is suspended arm64: efi: restore x18 if it was corrupted arm64: vdso: disable Shadow Call Stack arm64: disable SCS for hypervisor code arm64: implement Shadow Call Stack arm64: scs: add shadow stacks for SDEI Makefile | 6 + arch/Kconfig | 34 ++++ arch/arm64/Kconfig | 7 +- arch/arm64/Makefile | 4 + arch/arm64/include/asm/scs.h | 39 +++++ arch/arm64/include/asm/suspend.h | 2 +- arch/arm64/include/asm/thread_info.h | 3 + arch/arm64/kernel/Makefile | 1 + arch/arm64/kernel/asm-offsets.c | 3 + arch/arm64/kernel/cpu-reset.S | 4 +- arch/arm64/kernel/efi-rt-wrapper.S | 11 +- arch/arm64/kernel/entry.S | 45 ++++- arch/arm64/kernel/head.S | 9 + arch/arm64/kernel/irq.c | 2 + arch/arm64/kernel/process.c | 2 + arch/arm64/kernel/scs.c | 114 +++++++++++++ arch/arm64/kernel/sdei.c | 7 + arch/arm64/kernel/smp.c | 4 + arch/arm64/kernel/vdso/Makefile | 2 +- arch/arm64/kvm/hyp/Makefile | 3 + arch/arm64/kvm/hyp/entry.S | 
45 ++--- arch/arm64/lib/copy_page.S | 38 ++--- arch/arm64/mm/proc.S | 77 +++++---- drivers/base/node.c | 6 + fs/proc/meminfo.c | 4 + include/linux/compiler-clang.h | 6 + include/linux/compiler_types.h | 4 + include/linux/mmzone.h | 3 + include/linux/scs.h | 57 +++++++ init/init_task.c | 8 + kernel/Makefile | 1 + kernel/fork.c | 9 + kernel/sched/core.c | 2 + kernel/scs.c | 246 +++++++++++++++++++++++++++ mm/page_alloc.c | 6 + mm/vmstat.c | 3 + 36 files changed, 737 insertions(+), 80 deletions(-) create mode 100644 arch/arm64/include/asm/scs.h create mode 100644 arch/arm64/kernel/scs.c create mode 100644 include/linux/scs.h create mode 100644 kernel/scs.c base-commit: 3cf2890f29ab6fe491361761df558ef9191cb468
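The v5 changelog above mentions scs_used(), scs_corrupted() and an end-of-stack magic value. As a rough user-space illustration of that idea (not the code from kernel/scs.c), the sketch below keeps a marker in the last slot of a shadow stack buffer, reports corruption when the marker changes, and estimates usage by scanning for the first unused slot. All `demo_` names, the buffer size and the marker value are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical size and marker value, chosen only for this sketch. */
#define DEMO_SCS_WORDS     16
#define DEMO_SCS_END_MAGIC 0xdeadbeefcafef00dULL

struct demo_scs {
	uint64_t slot[DEMO_SCS_WORDS];
};

/* The last slot holds the marker; a push that runs off the end overwrites it. */
static uint64_t *demo_scs_magic(struct demo_scs *s)
{
	return &s->slot[DEMO_SCS_WORDS - 1];
}

/* Zero the buffer and place the marker in the final slot. */
static void demo_scs_init(struct demo_scs *s)
{
	memset(s->slot, 0, sizeof(s->slot));
	*demo_scs_magic(s) = DEMO_SCS_END_MAGIC;
}

/* True if the marker no longer holds the expected value. */
static bool demo_scs_corrupted(struct demo_scs *s)
{
	return *demo_scs_magic(s) != DEMO_SCS_END_MAGIC;
}

/* Bytes in use, found by scanning for the first zero slot before the marker. */
static size_t demo_scs_used(struct demo_scs *s)
{
	size_t i;

	for (i = 0; i < DEMO_SCS_WORDS - 1 && s->slot[i] != 0; i++)
		;
	return i * sizeof(s->slot[0]);
}
```

A caller would run demo_scs_init() once per buffer and check demo_scs_corrupted() when the buffer is released. The scan in demo_scs_used() only works because the buffer starts out zeroed and return addresses are never zero.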
On Fri, Dec 06, 2019 at 02:13:36PM -0800, Sami Tolvanen wrote:
> This patch series adds support for Clang's Shadow Call Stack
> (SCS) mitigation, which uses a separately allocated shadow stack
> to protect against return address overwrites. More information
> can be found here:
>
>   https://clang.llvm.org/docs/ShadowCallStack.html

I've queued the first four via arm64.

Will
This patch series adds support for Clang's Shadow Call Stack (SCS) mitigation, which uses a separately allocated shadow stack to protect against return address overwrites. More information can be found here:

  https://clang.llvm.org/docs/ShadowCallStack.html

SCS provides better protection against traditional buffer overflows than CONFIG_STACKPROTECTOR_*, but it should be noted that SCS security guarantees in the kernel differ from the ones documented for user space. The kernel must store addresses of shadow stacks used by inactive tasks and interrupt handlers in memory, which means an attacker capable of reading and writing arbitrary memory may be able to locate them and hijack control flow by modifying shadow stacks that are not currently in use.

SCS is currently supported only on arm64, where the compiler requires the x18 register to be reserved for holding the current task's shadow stack pointer.

With -fsanitize=shadow-call-stack, the compiler injects instructions into all non-leaf C functions to store the return address to the shadow stack, and unconditionally load it again before returning. As a result, SCS is currently incompatible with features that rely on modifying function return addresses in the kernel stack to alter control flow, such as function graph tracing, although it may be possible to later change these features to modify the shadow stack instead. A copy of the return address is still kept in the kernel stack for compatibility with stack unwinding, for example.

SCS has a minimal performance overhead, but allocating shadow stacks increases kernel memory usage. The feature is therefore mostly useful on hardware that lacks support for PAC instructions.

Changes in v7:
- Changed irq_stack_entry/exit to store the shadow stack pointer in x24 instead of x20 as kernel_entry uses x20-x23 to store data that can be used later. Updated the comment as well.
- Changed the Makefile in arch/arm64/kvm/hyp to also filter out -ffixed-x18.
- Changed SHADOW_CALL_STACK to depend on !FUNCTION_GRAPH_TRACER instead of not selecting HAVE_FUNCTION_GRAPH_TRACER with SCS. - Removed ifdefs from the EFI wrapper and updated the comment to explain why we are restoring x18. - Rebased as Ard's x18 patches that were part of this series have already been merged. Changes in v6: - Updated comment in the EFI RT wrapper to include the explanation from the commit message. - Fixed the SHADOW_CALL_STACK_VMAP config option and the compilation errors in scs_init_irq() - Updated the comment in entry.S to Mark's suggestion - Fixed the WARN_ON in scs_init() to trip only when the return value for cpuhp_setup_state() is < 0. - Removed ifdefs from the code in arch/arm64/kernel/scs.c and added separate shadow stacks for the SDEI handler Changes in v5: - Updated the comment in __scs_base() to Mark's suggestion - Changed all instances of uintptr_t to unsigned long - Added allocation poisoning for KASAN to catch unintentional shadow stack accesses; moved set_set_magic before poisoning and switched scs_used() and scs_corrupted() to access the buffer using READ_ONCE_NOCHECK() instead - Changed scs_free() to check for NULL instead of zero - Renamed SCS_CACHE_SIZE to NR_CACHED_SCS - Added a warning if cpuhp_setup_state fails in scs_init() - Dropped patches disabling kretprobes after confirming there's no functional conflict with SCS instrumentation - Added an explanation to the commit message why function graph tracing and SCS are incompatible - Removed the ifdefs from arch/arm64/mm/proc.S and added comments explaining why we are saving and restoring x18 - Updated scs_check_usage format to include process information Changes in v4: - Fixed authorship for Ard's patches - Added missing commit messages - Commented code that clears SCS from thread_info - Added a comment about SCS_END_MAGIC being non-canonical Changes in v3: - Switched to filter-out for removing SCS flags in Makefiles - Changed the __noscs attribute to use 
__no_sanitize__("...") instead of no_sanitize("...") - Cleaned up inline function definitions and moved task_scs() into a macro - Cleaned up scs_free() and scs_magic() - Moved SCS initialization into dup_task_struct() and removed the now unused scs_task_init() - Added comments to __scs_base() and scs_task_reset() to better document design choices - Changed copy_page to make the offset and bias explicit Changes in v2: - Changed Ard's KVM patch to use x29 instead of x18 for the guest context, which makes restore_callee_saved_regs cleaner - Updated help text (and commit messages) to point out differences in security properties compared to user space SCS - Cleaned up config options: removed the ROP protection choice, replaced the CC_IS_CLANG dependency with an arch-specific cc-option test, and moved disabling of incompatible config options to an arch-specific Kconfig - Added CC_FLAGS_SCS, which are filtered out where needed instead of using DISABLE_SCS - Added a __has_feature guard around __noscs for older clang versions Sami Tolvanen (11): add support for Clang's Shadow Call Stack (SCS) scs: add accounting scs: add support for stack usage debugging scs: disable when function graph tracing is enabled arm64: reserve x18 from general allocation with SCS arm64: preserve x18 when CPU is suspended arm64: efi: restore x18 if it was corrupted arm64: vdso: disable Shadow Call Stack arm64: disable SCS for hypervisor code arm64: implement Shadow Call Stack arm64: scs: add shadow stacks for SDEI Makefile | 6 + arch/Kconfig | 35 ++++ arch/arm64/Kconfig | 5 + arch/arm64/Makefile | 4 + arch/arm64/include/asm/scs.h | 39 +++++ arch/arm64/include/asm/suspend.h | 2 +- arch/arm64/include/asm/thread_info.h | 3 + arch/arm64/kernel/Makefile | 1 + arch/arm64/kernel/asm-offsets.c | 3 + arch/arm64/kernel/efi-rt-wrapper.S | 11 +- arch/arm64/kernel/entry.S | 46 ++++- arch/arm64/kernel/head.S | 9 + arch/arm64/kernel/irq.c | 2 + arch/arm64/kernel/process.c | 2 + arch/arm64/kernel/scs.c | 114 
+++++++++++++ arch/arm64/kernel/sdei.c | 7 + arch/arm64/kernel/smp.c | 4 + arch/arm64/kernel/vdso/Makefile | 2 +- arch/arm64/kvm/hyp/Makefile | 3 + arch/arm64/mm/proc.S | 14 ++ drivers/base/node.c | 6 + fs/proc/meminfo.c | 4 + include/linux/compiler-clang.h | 6 + include/linux/compiler_types.h | 4 + include/linux/mmzone.h | 3 + include/linux/scs.h | 57 +++++++ init/init_task.c | 8 + kernel/Makefile | 1 + kernel/fork.c | 9 + kernel/sched/core.c | 2 + kernel/scs.c | 246 +++++++++++++++++++++++++++ mm/page_alloc.c | 6 + mm/vmstat.c | 3 + 33 files changed, 661 insertions(+), 6 deletions(-) create mode 100644 arch/arm64/include/asm/scs.h create mode 100644 arch/arm64/kernel/scs.c create mode 100644 include/linux/scs.h create mode 100644 kernel/scs.c base-commit: b0be0eff1a5ab77d588b76bd8b1c92d5d17b3f73
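The cover letter above says the compiler stores the return address on the shadow stack at function entry and unconditionally loads it again before returning, while a copy stays on the regular stack for unwinding. The following is a small user-space model of that behaviour, not compiler output or kernel code; the `demo_` names and the fixed depth are assumptions for illustration. It shows why a feature that rewrites only the regular-stack copy cannot redirect the return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DEPTH 8 /* hypothetical maximum call depth for this model */

struct demo_cpu {
	uint64_t stack[DEMO_DEPTH];  /* regular stack: copy kept for unwinding */
	uint64_t shadow[DEMO_DEPTH]; /* shadow stack: copy used for returning */
	size_t sp;                   /* next free regular-stack slot */
	size_t scs_sp;               /* stands in for the pointer held in x18 */
};

/* Function entry: save the return address in both places. */
static void demo_prologue(struct demo_cpu *c, uint64_t ret)
{
	assert(c->sp < DEMO_DEPTH && c->scs_sp < DEMO_DEPTH);
	c->stack[c->sp++] = ret;
	c->shadow[c->scs_sp++] = ret;
}

/* Function exit: the return address always comes from the shadow stack. */
static uint64_t demo_epilogue(struct demo_cpu *c)
{
	assert(c->sp > 0 && c->scs_sp > 0);
	c->sp--; /* the regular-stack copy is discarded, not used */
	return c->shadow[--c->scs_sp];
}
```

In this model, overwriting an entry in `stack[]` between demo_prologue() and demo_epilogue() leaves the value returned by demo_epilogue() unchanged, which mirrors the incompatibility with function graph tracing described in the cover letter.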
Hi Sami,

On 28/01/2020 18:49, Sami Tolvanen wrote:
> This patch series adds support for Clang's Shadow Call Stack
> (SCS) mitigation, which uses a separately allocated shadow stack
> to protect against return address overwrites. More information
> can be found here:
>
>   https://clang.llvm.org/docs/ShadowCallStack.html
>
> SCS provides better protection against traditional buffer
> overflows than CONFIG_STACKPROTECTOR_*, but it should be noted
> that SCS security guarantees in the kernel differ from the ones
> documented for user space. The kernel must store addresses of
> shadow stacks used by inactive tasks and interrupt handlers in
> memory, which means an attacker capable reading and writing
> arbitrary memory may be able to locate them and hijack control
> flow by modifying shadow stacks that are not currently in use.
>
> SCS is currently supported only on arm64, where the compiler
> requires the x18 register to be reserved for holding the current
> task's shadow stack pointer.

I found I had to add:
| KBUILD_CFLAGS := $(filter-out -ffixed-x18 $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))

to drivers/firmware/efi/libstub/Makefile, to get this going.

I don't think there is much point supporting SCS for the EFI stub. It's already isolated from the rest of the kernel's C code by the __efistub symbol prefix machinery, and trying to use it would expose us to buggy firmware at a point where we can't handle it!

I can send a patch if it's easier for you,

Thanks,

James
On Tue, Feb 11, 2020 at 5:57 AM James Morse <james.morse@arm.com> wrote:
> I found I had to add:
> | KBUILD_CFLAGS := $(filter-out -ffixed-x18 $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))
>
> to drivers/firmware/efi/libstub/Makefile, to get this going.

Ah, good catch!

> I don't think there is much point supporting SCS for the EFIstub, its already isolated
> from the rest of the kernel's C code by the __efistub symbol prefix machinery, and trying
> to use it would expose us to buggy firmware at a point we can't handle it!

Yes, fully agreed.

> I can send a patch if its easier for you,

It's not a problem, I will include a patch for this in v8.

Sami
This patch series adds support for Clang's Shadow Call Stack (SCS) mitigation, which uses a separately allocated shadow stack to protect against return address overwrites. More information can be found here:

  https://clang.llvm.org/docs/ShadowCallStack.html

SCS provides better protection against traditional buffer overflows than CONFIG_STACKPROTECTOR_*, but it should be noted that SCS security guarantees in the kernel differ from the ones documented for user space. The kernel must store addresses of shadow stacks used by inactive tasks and interrupt handlers in memory, which means an attacker capable of reading and writing arbitrary memory may be able to locate them and hijack control flow by modifying shadow stacks that are not currently in use.

SCS is currently supported only on arm64, where the compiler requires the x18 register to be reserved for holding the current task's shadow stack pointer.

With -fsanitize=shadow-call-stack, the compiler injects instructions into all non-leaf C functions to store the return address to the shadow stack, and unconditionally load it again before returning. As a result, SCS is currently incompatible with features that rely on modifying function return addresses in the kernel stack to alter control flow, such as function graph tracing, although it may be possible to later change these features to modify the shadow stack instead. A copy of the return address is still kept in the kernel stack for compatibility with stack unwinding, for example.

SCS has a minimal performance overhead, but allocating shadow stacks increases kernel memory usage. The feature is therefore mostly useful on hardware that lacks support for PAC instructions.

Changes in v8:
- Added __noscs to __hyp_text instead of filtering SCS flags from the entire arch/arm64/kvm/hyp directory.
- Added a patch to filter out -ffixed-x18 and SCS flags from the EFI stub.
Changes in v7: - Changed irq_stack_entry/exit to store the shadow stack pointer in x24 instead of x20 as kernel_entry uses x20-x23 to store data that can be used later. Updated the comment as well. - Changed the Makefile in arch/arm64/kvm/hyp to also filter out -ffixed-x18. - Changed SHADOW_CALL_STACK to depend on !FUNCTION_GRAPH_TRACER instead of not selecting HAVE_FUNCTION_GRAPH_TRACER with SCS. - Removed ifdefs from the EFI wrapper and updated the comment to explain why we are restoring x18. - Rebased as Ard's x18 patches that were part of this series have already been merged. Changes in v6: - Updated comment in the EFI RT wrapper to include the explanation from the commit message. - Fixed the SHADOW_CALL_STACK_VMAP config option and the compilation errors in scs_init_irq() - Updated the comment in entry.S to Mark's suggestion - Fixed the WARN_ON in scs_init() to trip only when the return value for cpuhp_setup_state() is < 0. - Removed ifdefs from the code in arch/arm64/kernel/scs.c and added separate shadow stacks for the SDEI handler Changes in v5: - Updated the comment in __scs_base() to Mark's suggestion - Changed all instances of uintptr_t to unsigned long - Added allocation poisoning for KASAN to catch unintentional shadow stack accesses; moved set_set_magic before poisoning and switched scs_used() and scs_corrupted() to access the buffer using READ_ONCE_NOCHECK() instead - Changed scs_free() to check for NULL instead of zero - Renamed SCS_CACHE_SIZE to NR_CACHED_SCS - Added a warning if cpuhp_setup_state fails in scs_init() - Dropped patches disabling kretprobes after confirming there's no functional conflict with SCS instrumentation - Added an explanation to the commit message why function graph tracing and SCS are incompatible - Removed the ifdefs from arch/arm64/mm/proc.S and added comments explaining why we are saving and restoring x18 - Updated scs_check_usage format to include process information Changes in v4: - Fixed authorship for Ard's patches - 
Added missing commit messages - Commented code that clears SCS from thread_info - Added a comment about SCS_END_MAGIC being non-canonical Changes in v3: - Switched to filter-out for removing SCS flags in Makefiles - Changed the __noscs attribute to use __no_sanitize__("...") instead of no_sanitize("...") - Cleaned up inline function definitions and moved task_scs() into a macro - Cleaned up scs_free() and scs_magic() - Moved SCS initialization into dup_task_struct() and removed the now unused scs_task_init() - Added comments to __scs_base() and scs_task_reset() to better document design choices - Changed copy_page to make the offset and bias explicit Changes in v2: - Changed Ard's KVM patch to use x29 instead of x18 for the guest context, which makes restore_callee_saved_regs cleaner - Updated help text (and commit messages) to point out differences in security properties compared to user space SCS - Cleaned up config options: removed the ROP protection choice, replaced the CC_IS_CLANG dependency with an arch-specific cc-option test, and moved disabling of incompatible config options to an arch-specific Kconfig - Added CC_FLAGS_SCS, which are filtered out where needed instead of using DISABLE_SCS - Added a __has_feature guard around __noscs for older clang versions Sami Tolvanen (12): add support for Clang's Shadow Call Stack (SCS) scs: add accounting scs: add support for stack usage debugging scs: disable when function graph tracing is enabled arm64: reserve x18 from general allocation with SCS arm64: preserve x18 when CPU is suspended arm64: efi: restore x18 if it was corrupted arm64: vdso: disable Shadow Call Stack arm64: disable SCS for hypervisor code arm64: implement Shadow Call Stack arm64: scs: add shadow stacks for SDEI efi/libstub: disable SCS Makefile | 6 + arch/Kconfig | 35 ++++ arch/arm64/Kconfig | 5 + arch/arm64/Makefile | 4 + arch/arm64/include/asm/kvm_hyp.h | 2 +- arch/arm64/include/asm/scs.h | 39 ++++ arch/arm64/include/asm/suspend.h | 2 +- 
arch/arm64/include/asm/thread_info.h | 3 + arch/arm64/kernel/Makefile | 1 + arch/arm64/kernel/asm-offsets.c | 3 + arch/arm64/kernel/efi-rt-wrapper.S | 11 +- arch/arm64/kernel/entry.S | 46 ++++- arch/arm64/kernel/head.S | 9 + arch/arm64/kernel/irq.c | 2 + arch/arm64/kernel/process.c | 2 + arch/arm64/kernel/scs.c | 114 ++++++++++++ arch/arm64/kernel/sdei.c | 7 + arch/arm64/kernel/smp.c | 4 + arch/arm64/kernel/vdso/Makefile | 2 +- arch/arm64/mm/proc.S | 14 ++ drivers/base/node.c | 6 + drivers/firmware/efi/libstub/Makefile | 3 + fs/proc/meminfo.c | 4 + include/linux/compiler-clang.h | 6 + include/linux/compiler_types.h | 4 + include/linux/mmzone.h | 3 + include/linux/scs.h | 57 ++++++ init/init_task.c | 8 + kernel/Makefile | 1 + kernel/fork.c | 9 + kernel/sched/core.c | 2 + kernel/scs.c | 246 ++++++++++++++++++++++++++ mm/page_alloc.c | 6 + mm/vmstat.c | 3 + 34 files changed, 662 insertions(+), 7 deletions(-) create mode 100644 arch/arm64/include/asm/scs.h create mode 100644 arch/arm64/kernel/scs.c create mode 100644 include/linux/scs.h create mode 100644 kernel/scs.c base-commit: 2b72104b8c12504176fb5fc1442d6e54e31e338b
Hi Sami,

(CC: +Marc)

On 19/02/2020 00:08, Sami Tolvanen wrote:
> This patch series adds support for Clang's Shadow Call Stack
> (SCS) mitigation, which uses a separately allocated shadow stack
> to protect against return address overwrites.

I took this for a spin on some real hardware. cpu-idle, kexec, hibernate etc. all work great... but starting a KVM guest causes the CPU to get stuck in EL2.

With CONFIG_SHADOW_CALL_STACK disabled, this doesn't happen ... so it's something about the feature being enabled.

I'm using clang-9 from debian bullseye/sid. (I tried to build tip of tree ... that doesn't go so well on arm64)

KVM takes an instruction abort from EL2 to EL2, because some of the code it runs is not mapped at EL2:

| ffffa00011588308 <__kvm_tlb_flush_local_vmid>:
| ffffa00011588308: d10103ff sub sp, sp, #0x40
| ffffa0001158830c: f90013f3 str x19, [sp, #32]
| ffffa00011588310: a9037bfd stp x29, x30, [sp, #48]
| ffffa00011588314: 9100c3fd add x29, sp, #0x30
| ffffa00011588318: 97ae18bf bl ffffa0001010e614 <__kern_hyp_va>

INSTRUCTION ABORT!

| ffffa0001158831c: f9400000 ldr x0, [x0]
| ffffa00011588320: 97ae18bd bl ffffa0001010e614 <__kern_hyp_va>
| ffffa00011588324: aa0003f3 mov x19, x0
| ffffa00011588328: 97ae18c1 bl ffffa0001010e62c <has_vhe>

__kern_hyp_va() is a static inline that is patched at boot, wherever it appears, with the EL2 ASLR values; it converts a kernel linear-map address to its EL2 KVM alias:

| ffffa0001010dc5c <__kern_hyp_va>:
| ffffa0001010dc5c: 92400000 and x0, x0, #0x1
| ffffa0001010dc60: 93c00400 ror x0, x0, #1
| ffffa0001010dc64: 91000000 add x0, x0, #0x0
| ffffa0001010dc68: 91400000 add x0, x0, #0x0, lsl #12
| ffffa0001010dc6c: 93c0fc00 ror x0, x0, #63
| ffffa0001010dc70: d65f03c0 ret

The problem here is where __kern_hyp_va() is. It's outside the __hyp_text section:

| morse@eglon:~/kernel/linux-pigs$ nm -s vmlinux | grep hyp_text
| ffffa0001158b800 T __hyp_text_end
| ffffa000115838a0 T __hyp_text_start

If I disable CONFIG_SHADOW_CALL_STACK in Kconfig, I get:

| ffffa00011527fe0 <__kvm_tlb_flush_local_vmid>:
| ffffa00011527fe0: d100c3ff sub sp, sp, #0x30
| ffffa00011527fe4: a9027bfd stp x29, x30, [sp, #32]
| ffffa00011527fe8: 910083fd add x29, sp, #0x20
| ffffa00011527fec: 92400000 and x0, x0, #0x1
| ffffa00011527ff0: 93c00400 ror x0, x0, #1
| ffffa00011527ff4: 91000000 add x0, x0, #0x0
| ffffa00011527ff8: 91400000 add x0, x0, #0x0, lsl #12
| ffffa00011527ffc: 93c0fc00 ror x0, x0, #63
| ffffa00011528000: f9400000 ldr x0, [x0]
| ffffa00011528004: 910023e1 add x1, sp, #0x8
| ffffa00011528008: 92400000 and x0, x0, #0x1
| ffffa0001152800c: 93c00400 ror x0, x0, #1
| ffffa00011528010: 91000000 add x0, x0, #0x0
| ffffa00011528014: 91400000 add x0, x0, #0x0, lsl #12
| ffffa00011528018: 93c0fc00 ror x0, x0, #63
| ffffa0001152801c: 97ffff78 bl ffffa00011527dfc <__tlb_switch_>
| ffffa00011528020: d508871f tlbi vmalle1
| ffffa00011528024: d503201f nop

This looks like reserving x18 is causing Clang to not-inline the __kern_hyp_va() calls, losing the vitally important section information. (I can see why the compiler thinks this is fair)

Is this a known, er, thing, with clang-9?

From eyeballing the disassembly __always_inline on __kern_hyp_va() is enough of a hint to stop this, ... with this configuration of clang-9. But KVM still doesn't work, so it isn't the only inlining decision KVM relies on that is changed by SCS.

I suspect repainting all KVM's 'inline' with __always_inline will fix it. (yuck!) I'll try tomorrow.

I don't think keeping the compiler-flags as they are today for KVM is the right thing to do, it could lead to x18 getting corrupted with the shared vhe/non-vhe code. Splitting that code up would lead to duplication.

(hopefully objtool will be able to catch these at build time)

Thanks,

James

> SCS is currently supported only on arm64, where the compiler
> requires the x18 register to be reserved for holding the current
> task's shadow stack pointer.

> Changes in v8:
> - Added __noscs to __hyp_text instead of filtering SCS flags from
>   the entire arch/arm64/kvm/hyp directory
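To make the section argument in the report above concrete, here is a small range check using the section bounds from the quoted nm output and the addresses from the disassembly. It is only a sketch of the reasoning: the `demo_` names are made up for this example, and nothing like this function is implied to exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Section bounds copied from the nm output quoted in the report. */
#define DEMO_HYP_TEXT_START 0xffffa000115838a0ULL
#define DEMO_HYP_TEXT_END   0xffffa0001158b800ULL

/* True if addr lies inside [__hyp_text_start, __hyp_text_end). */
static bool demo_in_hyp_text(uint64_t addr)
{
	return addr >= DEMO_HYP_TEXT_START && addr < DEMO_HYP_TEXT_END;
}
```

With these bounds, the caller __kvm_tlb_flush_local_vmid at ffffa00011588308 falls inside the range, while the out-of-line __kern_hyp_va branch target at ffffa0001010e614 does not, which matches the instruction abort described in the report.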
On Wed, 19 Feb 2020 at 19:38, James Morse <james.morse@arm.com> wrote: > > Hi Sami, > > (CC: +Marc) > > On 19/02/2020 00:08, Sami Tolvanen wrote: > > This patch series adds support for Clang's Shadow Call Stack > > (SCS) mitigation, which uses a separately allocated shadow stack > > to protect against return address overwrites. > > I took this for a spin on some real hardware. cpu-idle, kexec hibernate etc all work > great... but starting a KVM guest causes the CPU to get stuck in EL2. > > With CONFIG_SHADOW_CALL_STACK disabled, this doesn't happen ... so its something about the > feature being enabled. > > > I'm using clang-9 from debian bullseye/sid. (I tried to build tip of tree ... that doesn't > go so well on arm64) > > KVM takes an instruction abort from EL2 to EL2, because some of the code it runs is not > mapped at EL2: > > | ffffa00011588308 <__kvm_tlb_flush_local_vmid>: > | ffffa00011588308: d10103ff sub sp, sp, #0x40 > | ffffa0001158830c: f90013f3 str x19, [sp, #32] > | ffffa00011588310: a9037bfd stp x29, x30, [sp, #48] > | ffffa00011588314: 9100c3fd add x29, sp, #0x30 > | ffffa00011588318: 97ae18bf bl ffffa0001010e614 <__kern_hyp_va> > > INSTRUCTION ABORT! > > | ffffa0001158831c: f9400000 ldr x0, [x0] > | ffffa00011588320: 97ae18bd bl ffffa0001010e614 <__kern_hyp_va> > | ffffa00011588324: aa0003f3 mov x19, x0 > | ffffa00011588328: 97ae18c1 bl ffffa0001010e62c <has_vhe> > > > __kern_hyp_va() is static-inline which is patched wherever it appears at boot with the EL2 > ASLR values, it converts a kernel linear-map address to its EL2 KVM alias: > > | ffffa0001010dc5c <__kern_hyp_va>: > | ffffa0001010dc5c: 92400000 and x0, x0, #0x1 > | ffffa0001010dc60: 93c00400 ror x0, x0, #1 > | ffffa0001010dc64: 91000000 add x0, x0, #0x0 > | ffffa0001010dc68: 91400000 add x0, x0, #0x0, lsl #12 > | ffffa0001010dc6c: 93c0fc00 ror x0, x0, #63 > | ffffa0001010dc70: d65f03c0 ret > > > The problem here is where __kern_hyp_va() is. 
Its outside the __hyp_text section: > | morse@eglon:~/kernel/linux-pigs$ nm -s vmlinux | grep hyp_text > | ffffa0001158b800 T __hyp_text_end > | ffffa000115838a0 T __hyp_text_start > > > If I disable CONFIG_SHADOW_CALL_STACK in Kconfig, I get: > | ffffa00011527fe0 <__kvm_tlb_flush_local_vmid>: > | ffffa00011527fe0: d100c3ff sub sp, sp, #0x30 > | ffffa00011527fe4: a9027bfd stp x29, x30, [sp, #32] > | ffffa00011527fe8: 910083fd add x29, sp, #0x20 > | ffffa00011527fec: 92400000 and x0, x0, #0x1 > | ffffa00011527ff0: 93c00400 ror x0, x0, #1 > | ffffa00011527ff4: 91000000 add x0, x0, #0x0 > | ffffa00011527ff8: 91400000 add x0, x0, #0x0, lsl #12 > | ffffa00011527ffc: 93c0fc00 ror x0, x0, #63 > | ffffa00011528000: f9400000 ldr x0, [x0] > | ffffa00011528004: 910023e1 add x1, sp, #0x8 > | ffffa00011528008: 92400000 and x0, x0, #0x1 > | ffffa0001152800c: 93c00400 ror x0, x0, #1 > | ffffa00011528010: 91000000 add x0, x0, #0x0 > | ffffa00011528014: 91400000 add x0, x0, #0x0, lsl #12 > | ffffa00011528018: 93c0fc00 ror x0, x0, #63 > | ffffa0001152801c: 97ffff78 bl ffffa00011527dfc <__tlb_switch_> > | ffffa00011528020: d508871f tlbi vmalle1 > | ffffa00011528024: d503201f nop > > > This looks like reserving x18 is causing Clang to not-inline the __kern_hyp_va() calls, > losing the vitally important section information. (I can see why the compiler thinks this > is fair) > > Is this a known, er, thing, with clang-9? > > From eyeballing the disassembly __always_inline on __kern_hyp_va() is enough of a hint to > stop this, ... with this configuration of clang-9. But KVM still doesn't work, so it isn't > the only inlining decision KVM relies on that is changed by SCS. > > I suspect repainting all KVM's 'inline' with __always_inline will fix it. (yuck!) I'll try > tomorrow. > If we are relying on the inlining for correctness, these should have been __always_inline to begin with, and yuckness aside, I don't think there's anything wrong with that. 
> I don't think keeping the compiler-flags as they are today for KVM is the right thing to > do, it could lead to x18 getting corrupted with the shared vhe/non-vhe code. Splitting > that code up would lead to duplication. > > (hopefully objtool will be able to catch these at build time) > I don't see why we should selectively en/disable the reservation of x18 (as I argued in the context of the EFI libstub patch as well). Just reserving it everywhere shouldn't hurt performance, and removes the need to prove that we reserved it in all the right places.
On Wed, Feb 19, 2020 at 10:38 AM James Morse <james.morse@arm.com> wrote:
> This looks like reserving x18 is causing Clang to not-inline the __kern_hyp_va() calls,
> losing the vitally important section information. (I can see why the compiler thinks this
> is fair)

Thanks for catching this. This doesn't appear to be caused by reserving x18, it looks like SCS itself is causing clang to avoid inlining these. If I add __noscs to __kern_hyp_va(), clang inlines the function again. __always_inline also works, as you pointed out.

> Is this a known, er, thing, with clang-9?

I can reproduce this with ToT clang as well.

> I suspect repainting all KVM's 'inline' with __always_inline will fix it. (yuck!) I'll try
> tomorrow.

I think switching to __always_inline is the correct solution here.

Sami
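For readers unfamiliar with the two annotations discussed in this reply, the sketch below shows roughly how such macros can be spelled for GCC-compatible compilers. The `demo_` macro names are stand-ins, the guard follows the `__has_feature` approach mentioned in the v2 changelog, and the helper body is a placeholder rather than the real address translation. Whether a given compiler version inlines without the attribute is not something this example can demonstrate.

```c
#include <assert.h>
#include <stdint.h>

/* Compilers without __has_feature get a fallback that reports "no". */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Only emit the attribute when the compiler reports SCS as enabled. */
#if __has_feature(shadow_call_stack)
#define demo_noscs __attribute__((__no_sanitize__("shadow-call-stack")))
#else
#define demo_noscs
#endif

/* Force inlining instead of leaving the decision to the optimizer. */
#define demo_always_inline inline __attribute__((__always_inline__))

/* Hypothetical helper whose callers depend on it being inlined. */
static demo_always_inline uint64_t demo_hyp_va(uint64_t v, uint64_t mask,
					       uint64_t tag)
{
	return (v & mask) | tag;
}

/* A caller that opts out of SCS instrumentation for itself. */
static demo_noscs uint64_t demo_caller(uint64_t v)
{
	return demo_hyp_va(v, 0xffffULL, 0x10000ULL);
}
```

The design point from the thread is that `inline` alone is a hint, so code that is only correct when a helper's body ends up in the caller's section needs the forced form.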
On 2020-02-19 18:53, Ard Biesheuvel wrote: > On Wed, 19 Feb 2020 at 19:38, James Morse <james.morse@arm.com> wrote: >> >> Hi Sami, >> >> (CC: +Marc) >> >> On 19/02/2020 00:08, Sami Tolvanen wrote: >> > This patch series adds support for Clang's Shadow Call Stack >> > (SCS) mitigation, which uses a separately allocated shadow stack >> > to protect against return address overwrites. >> >> I took this for a spin on some real hardware. cpu-idle, kexec >> hibernate etc all work >> great... but starting a KVM guest causes the CPU to get stuck in EL2. >> >> With CONFIG_SHADOW_CALL_STACK disabled, this doesn't happen ... so its >> something about the >> feature being enabled. >> >> >> I'm using clang-9 from debian bullseye/sid. (I tried to build tip of >> tree ... that doesn't >> go so well on arm64) >> >> KVM takes an instruction abort from EL2 to EL2, because some of the >> code it runs is not >> mapped at EL2: >> >> | ffffa00011588308 <__kvm_tlb_flush_local_vmid>: >> | ffffa00011588308: d10103ff sub sp, sp, #0x40 >> | ffffa0001158830c: f90013f3 str x19, [sp, #32] >> | ffffa00011588310: a9037bfd stp x29, x30, [sp, #48] >> | ffffa00011588314: 9100c3fd add x29, sp, #0x30 >> | ffffa00011588318: 97ae18bf bl ffffa0001010e614 >> <__kern_hyp_va> >> >> INSTRUCTION ABORT! 
>> >> | ffffa0001158831c: f9400000 ldr x0, [x0] >> | ffffa00011588320: 97ae18bd bl ffffa0001010e614 >> <__kern_hyp_va> >> | ffffa00011588324: aa0003f3 mov x19, x0 >> | ffffa00011588328: 97ae18c1 bl ffffa0001010e62c >> <has_vhe> >> >> >> __kern_hyp_va() is static-inline which is patched wherever it appears >> at boot with the EL2 >> ASLR values, it converts a kernel linear-map address to its EL2 KVM >> alias: >> >> | ffffa0001010dc5c <__kern_hyp_va>: >> | ffffa0001010dc5c: 92400000 and x0, x0, #0x1 >> | ffffa0001010dc60: 93c00400 ror x0, x0, #1 >> | ffffa0001010dc64: 91000000 add x0, x0, #0x0 >> | ffffa0001010dc68: 91400000 add x0, x0, #0x0, lsl >> #12 >> | ffffa0001010dc6c: 93c0fc00 ror x0, x0, #63 >> | ffffa0001010dc70: d65f03c0 ret >> >> >> The problem here is where __kern_hyp_va() is. Its outside the >> __hyp_text section: >> | morse@eglon:~/kernel/linux-pigs$ nm -s vmlinux | grep hyp_text >> | ffffa0001158b800 T __hyp_text_end >> | ffffa000115838a0 T __hyp_text_start >> >> >> If I disable CONFIG_SHADOW_CALL_STACK in Kconfig, I get: >> | ffffa00011527fe0 <__kvm_tlb_flush_local_vmid>: >> | ffffa00011527fe0: d100c3ff sub sp, sp, #0x30 >> | ffffa00011527fe4: a9027bfd stp x29, x30, [sp, #32] >> | ffffa00011527fe8: 910083fd add x29, sp, #0x20 >> | ffffa00011527fec: 92400000 and x0, x0, #0x1 >> | ffffa00011527ff0: 93c00400 ror x0, x0, #1 >> | ffffa00011527ff4: 91000000 add x0, x0, #0x0 >> | ffffa00011527ff8: 91400000 add x0, x0, #0x0, lsl >> #12 >> | ffffa00011527ffc: 93c0fc00 ror x0, x0, #63 >> | ffffa00011528000: f9400000 ldr x0, [x0] >> | ffffa00011528004: 910023e1 add x1, sp, #0x8 >> | ffffa00011528008: 92400000 and x0, x0, #0x1 >> | ffffa0001152800c: 93c00400 ror x0, x0, #1 >> | ffffa00011528010: 91000000 add x0, x0, #0x0 >> | ffffa00011528014: 91400000 add x0, x0, #0x0, lsl >> #12 >> | ffffa00011528018: 93c0fc00 ror x0, x0, #63 >> | ffffa0001152801c: 97ffff78 bl ffffa00011527dfc >> <__tlb_switch_> >> | ffffa00011528020: d508871f tlbi vmalle1 >> | 
ffffa00011528024: d503201f nop >> >> >> This looks like reserving x18 is causing Clang to not-inline the >> __kern_hyp_va() calls, >> losing the vitally important section information. (I can see why the >> compiler thinks this >> is fair) >> >> Is this a known, er, thing, with clang-9? >> >> From eyeballing the disassembly __always_inline on __kern_hyp_va() is >> enough of a hint to >> stop this, ... with this configuration of clang-9. But KVM still >> doesn't work, so it isn't >> the only inlining decision KVM relies on that is changed by SCS. >> >> I suspect repainting all KVM's 'inline' with __always_inline will fix >> it. (yuck!) I'll try >> tomorrow. >> > > If we are relying on the inlining for correctness, these should have > been __always_inline to begin with, and yuckness aside, I don't think > there's anything wrong with that. Agreed. Not having __always_inline is definitely an oversight, and we should fix it ASAP (hell knows what another compiler could produce...). And the whole EL2 aliasing is utter yuck already, this isn't going to make things much worse... I can queue something today for __kern_hyp_va(), but I'd like to make sure there isn't other silly mistakes like this one somewhere... >> I don't think keeping the compiler-flags as they are today for KVM is >> the right thing to >> do, it could lead to x18 getting corrupted with the shared vhe/non-vhe >> code. Splitting >> that code up would lead to duplication. >> >> (hopefully objtool will be able to catch these at build time) >> > > I don't see why we should selectively en/disable the reservation of > x18 (as I argued in the context of the EFI libstub patch as well). > Just reserving it everywhere shouldn't hurt performance, and removes > the need to prove that we reserved it in all the right places. I'd certainly like to keep things simple if we can. M.
This patch series adds support for Clang's Shadow Call Stack (SCS) mitigation, which uses a separately allocated shadow stack to protect against return address overwrites. More information can be found here:

  https://clang.llvm.org/docs/ShadowCallStack.html

SCS provides better protection against traditional buffer overflows than CONFIG_STACKPROTECTOR_*, but it should be noted that SCS security guarantees in the kernel differ from the ones documented for user space. The kernel must store addresses of shadow stacks used by inactive tasks and interrupt handlers in memory, which means an attacker capable of reading and writing arbitrary memory may be able to locate them and hijack control flow by modifying shadow stacks that are not currently in use.

SCS is currently supported only on arm64, where the compiler requires the x18 register to be reserved for holding the current task's shadow stack pointer.

With -fsanitize=shadow-call-stack, the compiler injects instructions into all non-leaf C functions to store the return address to the shadow stack, and unconditionally load it again before returning. As a result, SCS is currently incompatible with features that rely on modifying function return addresses in the kernel stack to alter control flow. A copy of the return address is still kept in the kernel stack for compatibility with stack unwinding, for example.

SCS has a minimal performance overhead, but allocating shadow stacks increases kernel memory usage. The feature is therefore mostly useful on hardware that lacks support for PAC instructions.

Changes in v9:
- Fixed grammar in the Kconfig help text.
- Changed Kconfig to allow SCS to be selected with the patchable-function-entry graph tracer.
- Changed the EFI stub patch to not filter out -ffixed-x18, only SCS flags.

Changes in v8:
- Added __noscs to __hyp_text instead of filtering SCS flags from the entire arch/arm64/kvm/hyp directory.
- Added a patch to filter out -ffixed-x18 and SCS flags from the EFI stub.
Changes in v7: - Changed irq_stack_entry/exit to store the shadow stack pointer in x24 instead of x20 as kernel_entry uses x20-x23 to store data that can be used later. Updated the comment as well. - Changed the Makefile in arch/arm64/kvm/hyp to also filter out -ffixed-x18. - Changed SHADOW_CALL_STACK to depend on !FUNCTION_GRAPH_TRACER instead of not selecting HAVE_FUNCTION_GRAPH_TRACER with SCS. - Removed ifdefs from the EFI wrapper and updated the comment to explain why we are restoring x18. - Rebased as Ard's x18 patches that were part of this series have already been merged. Changes in v6: - Updated comment in the EFI RT wrapper to include the explanation from the commit message. - Fixed the SHADOW_CALL_STACK_VMAP config option and the compilation errors in scs_init_irq() - Updated the comment in entry.S to Mark's suggestion - Fixed the WARN_ON in scs_init() to trip only when the return value for cpuhp_setup_state() is < 0. - Removed ifdefs from the code in arch/arm64/kernel/scs.c and added separate shadow stacks for the SDEI handler Changes in v5: - Updated the comment in __scs_base() to Mark's suggestion - Changed all instances of uintptr_t to unsigned long - Added allocation poisoning for KASAN to catch unintentional shadow stack accesses; moved set_set_magic before poisoning and switched scs_used() and scs_corrupted() to access the buffer using READ_ONCE_NOCHECK() instead - Changed scs_free() to check for NULL instead of zero - Renamed SCS_CACHE_SIZE to NR_CACHED_SCS - Added a warning if cpuhp_setup_state fails in scs_init() - Dropped patches disabling kretprobes after confirming there's no functional conflict with SCS instrumentation - Added an explanation to the commit message why function graph tracing and SCS are incompatible - Removed the ifdefs from arch/arm64/mm/proc.S and added comments explaining why we are saving and restoring x18 - Updated scs_check_usage format to include process information Changes in v4: - Fixed authorship for Ard's patches - 
Added missing commit messages - Commented code that clears SCS from thread_info - Added a comment about SCS_END_MAGIC being non-canonical Changes in v3: - Switched to filter-out for removing SCS flags in Makefiles - Changed the __noscs attribute to use __no_sanitize__("...") instead of no_sanitize("...") - Cleaned up inline function definitions and moved task_scs() into a macro - Cleaned up scs_free() and scs_magic() - Moved SCS initialization into dup_task_struct() and removed the now unused scs_task_init() - Added comments to __scs_base() and scs_task_reset() to better document design choices - Changed copy_page to make the offset and bias explicit Changes in v2: - Changed Ard's KVM patch to use x29 instead of x18 for the guest context, which makes restore_callee_saved_regs cleaner - Updated help text (and commit messages) to point out differences in security properties compared to user space SCS - Cleaned up config options: removed the ROP protection choice, replaced the CC_IS_CLANG dependency with an arch-specific cc-option test, and moved disabling of incompatible config options to an arch-specific Kconfig - Added CC_FLAGS_SCS, which are filtered out where needed instead of using DISABLE_SCS - Added a __has_feature guard around __noscs for older clang versions Sami Tolvanen (12): add support for Clang's Shadow Call Stack (SCS) scs: add accounting scs: add support for stack usage debugging scs: disable when function graph tracing is enabled arm64: reserve x18 from general allocation with SCS arm64: preserve x18 when CPU is suspended arm64: efi: restore x18 if it was corrupted arm64: vdso: disable Shadow Call Stack arm64: disable SCS for hypervisor code arm64: implement Shadow Call Stack arm64: scs: add shadow stacks for SDEI efi/libstub: disable SCS Makefile | 6 + arch/Kconfig | 35 ++++ arch/arm64/Kconfig | 5 + arch/arm64/Makefile | 4 + arch/arm64/include/asm/kvm_hyp.h | 2 +- arch/arm64/include/asm/scs.h | 39 ++++ arch/arm64/include/asm/suspend.h | 2 +- 
arch/arm64/include/asm/thread_info.h | 3 + arch/arm64/kernel/Makefile | 1 + arch/arm64/kernel/asm-offsets.c | 3 + arch/arm64/kernel/efi-rt-wrapper.S | 11 +- arch/arm64/kernel/entry.S | 46 ++++- arch/arm64/kernel/head.S | 9 + arch/arm64/kernel/irq.c | 2 + arch/arm64/kernel/process.c | 2 + arch/arm64/kernel/scs.c | 114 ++++++++++++ arch/arm64/kernel/sdei.c | 7 + arch/arm64/kernel/smp.c | 4 + arch/arm64/kernel/vdso/Makefile | 2 +- arch/arm64/mm/proc.S | 14 ++ drivers/base/node.c | 6 + drivers/firmware/efi/libstub/Makefile | 3 + fs/proc/meminfo.c | 4 + include/linux/compiler-clang.h | 6 + include/linux/compiler_types.h | 4 + include/linux/mmzone.h | 3 + include/linux/scs.h | 57 ++++++ init/init_task.c | 8 + kernel/Makefile | 1 + kernel/fork.c | 9 + kernel/sched/core.c | 2 + kernel/scs.c | 246 ++++++++++++++++++++++++++ mm/page_alloc.c | 6 + mm/vmstat.c | 3 + 34 files changed, 662 insertions(+), 7 deletions(-) create mode 100644 arch/arm64/include/asm/scs.h create mode 100644 arch/arm64/kernel/scs.c create mode 100644 include/linux/scs.h create mode 100644 kernel/scs.c base-commit: 63623fd44972d1ed2bfb6e0fb631dfcf547fd1e7
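The cover letter describes the instrumentation only in prose, so a small user-space model may help show why features that rewrite the on-stack return address stop working with SCS. This is only a sketch: the real instrumentation is a pair of instructions using x18, emitted by the compiler, and every name below (frame_sketch, prologue_sketch(), epilogue_sketch(), SCS_DEPTH_SKETCH) is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SCS_DEPTH_SKETCH 16

/* Models the copy of the return address kept on the regular stack. */
struct frame_sketch {
	unsigned long saved_ret;
};

/* Models the separately allocated shadow stack. */
static unsigned long shadow_stack[SCS_DEPTH_SKETCH];
/* Plays the role of x18: the current shadow stack position. */
static size_t shadow_top;

/* Function entry: the return address is stored in both places. */
void prologue_sketch(struct frame_sketch *f, unsigned long ret)
{
	assert(shadow_top < SCS_DEPTH_SKETCH);
	f->saved_ret = ret;               /* still available to unwinders */
	shadow_stack[shadow_top++] = ret; /* the copy that is trusted */
}

/*
 * Function exit: the return address is loaded unconditionally from the
 * shadow stack, so whatever is in the frame copy is ignored.
 */
unsigned long epilogue_sketch(const struct frame_sketch *f)
{
	(void)f; /* intentionally unused: the frame copy is not consulted */
	assert(shadow_top > 0);
	return shadow_stack[--shadow_top];
}
```

In this model, overwriting f->saved_ret between the two calls has no effect on the value epilogue_sketch() returns. That is the protection SCS provides, and it is also the reason a tracer that hooks returns by editing the frame copy would be silently bypassed.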
This patch series adds support for Clang's Shadow Call Stack (SCS)
mitigation, which uses a separately allocated shadow stack to protect
against return address overwrites. More information can be found here:

  https://clang.llvm.org/docs/ShadowCallStack.html

SCS provides better protection against traditional buffer overflows
than CONFIG_STACKPROTECTOR_*, but it should be noted that SCS security
guarantees in the kernel differ from the ones documented for user
space. The kernel must store addresses of shadow stacks used by
inactive tasks and interrupt handlers in memory, which means an
attacker capable of reading and writing arbitrary memory may be able
to locate them and hijack control flow by modifying shadow stacks that
are not currently in use.

SCS is currently supported only on arm64, where the compiler requires
the x18 register to be reserved for holding the current task's shadow
stack pointer.

With -fsanitize=shadow-call-stack, the compiler injects instructions
into all non-leaf C functions to store the return address to the
shadow stack, and unconditionally load it again before returning. As a
result, SCS is currently incompatible with features that rely on
modifying function return addresses in the kernel stack to alter
control flow. A copy of the return address is still kept in the kernel
stack for compatibility with stack unwinding, for example.

SCS has a minimal performance overhead, but allocating shadow stacks
increases kernel memory usage. The feature is therefore mostly useful
on hardware that lacks support for PAC instructions.

Changes in v10:
 - Removed an unnecessary <asm/scs.h> include from head.S.

Changes in v9:
 - Fixed grammar in the Kconfig help text.
 - Changed Kconfig to allow SCS to be selected with the
   patchable-function-entry graph tracer.
 - Changed the EFI stub patch to not filter out -ffixed-x18, only
   SCS flags.

Changes in v8:
 - Added __noscs to __hyp_text instead of filtering SCS flags from
   the entire arch/arm64/kvm/hyp directory.
 - Added a patch to filter out -ffixed-x18 and SCS flags from the
   EFI stub.

Changes in v7:
 - Changed irq_stack_entry/exit to store the shadow stack pointer in
   x24 instead of x20 as kernel_entry uses x20-x23 to store data
   that can be used later. Updated the comment as well.
 - Changed the Makefile in arch/arm64/kvm/hyp to also filter out
   -ffixed-x18.
 - Changed SHADOW_CALL_STACK to depend on !FUNCTION_GRAPH_TRACER
   instead of not selecting HAVE_FUNCTION_GRAPH_TRACER with SCS.
 - Removed ifdefs from the EFI wrapper and updated the comment to
   explain why we are restoring x18.
 - Rebased as Ard's x18 patches that were part of this series have
   already been merged.

Changes in v6:
 - Updated comment in the EFI RT wrapper to include the
   explanation from the commit message.
 - Fixed the SHADOW_CALL_STACK_VMAP config option and the
   compilation errors in scs_init_irq()
 - Updated the comment in entry.S to Mark's suggestion
 - Fixed the WARN_ON in scs_init() to trip only when the return
   value for cpuhp_setup_state() is < 0.
 - Removed ifdefs from the code in arch/arm64/kernel/scs.c and
   added separate shadow stacks for the SDEI handler

Changes in v5:
 - Updated the comment in __scs_base() to Mark's suggestion
 - Changed all instances of uintptr_t to unsigned long
 - Added allocation poisoning for KASAN to catch unintentional
   shadow stack accesses; moved scs_set_magic before poisoning
   and switched scs_used() and scs_corrupted() to access the
   buffer using READ_ONCE_NOCHECK() instead
 - Changed scs_free() to check for NULL instead of zero
 - Renamed SCS_CACHE_SIZE to NR_CACHED_SCS
 - Added a warning if cpuhp_setup_state fails in scs_init()
 - Dropped patches disabling kretprobes after confirming there's
   no functional conflict with SCS instrumentation
 - Added an explanation to the commit message why function graph
   tracing and SCS are incompatible
 - Removed the ifdefs from arch/arm64/mm/proc.S and added
   comments explaining why we are saving and restoring x18
 - Updated scs_check_usage format to include process information

Changes in v4:
 - Fixed authorship for Ard's patches
 - Added missing commit messages
 - Commented code that clears SCS from thread_info
 - Added a comment about SCS_END_MAGIC being non-canonical

Changes in v3:
 - Switched to filter-out for removing SCS flags in Makefiles
 - Changed the __noscs attribute to use __no_sanitize__("...")
   instead of no_sanitize("...")
 - Cleaned up inline function definitions and moved task_scs()
   into a macro
 - Cleaned up scs_free() and scs_magic()
 - Moved SCS initialization into dup_task_struct() and removed
   the now unused scs_task_init()
 - Added comments to __scs_base() and scs_task_reset() to better
   document design choices
 - Changed copy_page to make the offset and bias explicit

Changes in v2:
 - Changed Ard's KVM patch to use x29 instead of x18 for the
   guest context, which makes restore_callee_saved_regs cleaner
 - Updated help text (and commit messages) to point out
   differences in security properties compared to user space SCS
 - Cleaned up config options: removed the ROP protection choice,
   replaced the CC_IS_CLANG dependency with an arch-specific
   cc-option test, and moved disabling of incompatible config
   options to an arch-specific Kconfig
 - Added CC_FLAGS_SCS, which are filtered out where needed
   instead of using DISABLE_SCS
 - Added a __has_feature guard around __noscs for older clang
   versions

Sami Tolvanen (12):
  add support for Clang's Shadow Call Stack (SCS)
  scs: add accounting
  scs: add support for stack usage debugging
  scs: disable when function graph tracing is enabled
  arm64: reserve x18 from general allocation with SCS
  arm64: preserve x18 when CPU is suspended
  arm64: efi: restore x18 if it was corrupted
  arm64: vdso: disable Shadow Call Stack
  arm64: disable SCS for hypervisor code
  arm64: implement Shadow Call Stack
  arm64: scs: add shadow stacks for SDEI
  efi/libstub: disable SCS

 Makefile                              |   6 +
 arch/Kconfig                          |  35 ++++
 arch/arm64/Kconfig                    |   5 +
 arch/arm64/Makefile                   |   4 +
 arch/arm64/include/asm/kvm_hyp.h      |   2 +-
 arch/arm64/include/asm/scs.h          |  39 ++++
 arch/arm64/include/asm/suspend.h      |   2 +-
 arch/arm64/include/asm/thread_info.h  |   3 +
 arch/arm64/kernel/Makefile            |   1 +
 arch/arm64/kernel/asm-offsets.c       |   3 +
 arch/arm64/kernel/efi-rt-wrapper.S    |  11 +-
 arch/arm64/kernel/entry.S             |  47 ++++-
 arch/arm64/kernel/head.S              |   8 +
 arch/arm64/kernel/irq.c               |   2 +
 arch/arm64/kernel/process.c           |   2 +
 arch/arm64/kernel/scs.c               | 114 ++++++++++++
 arch/arm64/kernel/sdei.c              |   7 +
 arch/arm64/kernel/smp.c               |   4 +
 arch/arm64/kernel/vdso/Makefile       |   2 +-
 arch/arm64/mm/proc.S                  |  14 ++
 drivers/base/node.c                   |   6 +
 drivers/firmware/efi/libstub/Makefile |   3 +
 fs/proc/meminfo.c                     |   4 +
 include/linux/compiler-clang.h        |   6 +
 include/linux/compiler_types.h        |   4 +
 include/linux/mmzone.h                |   3 +
 include/linux/scs.h                   |  57 ++++++
 init/init_task.c                      |   8 +
 kernel/Makefile                       |   1 +
 kernel/fork.c                         |   9 +
 kernel/sched/core.c                   |   2 +
 kernel/scs.c                          | 246 ++++++++++++++++++++++++++
 mm/page_alloc.c                       |   6 +
 mm/vmstat.c                           |   3 +
 34 files changed, 662 insertions(+), 7 deletions(-)
 create mode 100644 arch/arm64/include/asm/scs.h
 create mode 100644 arch/arm64/kernel/scs.c
 create mode 100644 include/linux/scs.h
 create mode 100644 kernel/scs.c

base-commit: a10c9c710f9ecea87b9f4bbb837467893b4bef01
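The v2 and v3 changelog entries mention a __noscs attribute wrapped in a __has_feature guard. A sketch of what such a guard could look like follows; it is pieced together from those entries and may not match the patch text exactly. The function early_setup_sketch() and the SCS_BUILD_SKETCH macro are hypothetical names, added only so that the fragment compiles on its own.

```c
/* __has_feature is a Clang extension; fall back to 0 where it is missing. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(shadow_call_stack)
/* Built with -fsanitize=shadow-call-stack: allow a per-function opt-out. */
#define __noscs __attribute__((__no_sanitize__("shadow-call-stack")))
#define SCS_BUILD_SKETCH 1
#else
/* No SCS instrumentation in this build: the annotation expands to nothing. */
#define __noscs
#define SCS_BUILD_SKETCH 0
#endif

/*
 * Hypothetical example of code that must not be instrumented, for
 * instance because it runs before x18 points at a valid shadow stack.
 */
__noscs int early_setup_sketch(int x)
{
	return x + 1;
}

/* Reports whether this translation unit was built with SCS enabled. */
int scs_build_sketch(void)
{
	return SCS_BUILD_SKETCH;
}
```

With a compiler that lacks the feature, the guard keeps the attribute from being seen at all, so the same source builds either way.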