[v5,0/3] arm64: dynamic shadow call stack support

Message ID 20220822095058.2912704-1-ardb@kernel.org (mailing list archive)

Message

Ard Biesheuvel Aug. 22, 2022, 9:50 a.m. UTC
Generic kernel images such as Android's GKI usually enable all available
security features, which are typically implemented in such a way that
they only take effect if the underlying hardware can support them, but
don't interfere with correct and efficient operation otherwise.

For shadow call stack support, which is always supported by the
hardware, it means it will be enabled even if pointer authentication is
also supported, and enabled for signing return addresses stored on the
stack. The additional security provided by shadow call stack is only
marginal in this case, whereas the performance overhead is not.

Given that return address signing is based on PACIASP/AUTIASP
instructions that implicitly operate on the return address register
(X30) and are not idempotent (i.e., each needs to be emitted exactly
once before the return address is stored on the ordinary stack and after
it has been retrieved from it), we can convert these instructions 1:1
into shadow call stack pushes and pops involving the register X30.
As this is something that can be done at runtime rather than build time,
we can do this conditionally based on whether or not return address
signing is supported on the underlying hardware.
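The 1:1 conversion described above can be sketched in plain C as follows.
This is only an illustration, not the code from the series: the helper
names scs_convert_insn() and scs_convert_locs() and the hw_has_pac flag
are hypothetical, and the sketch omits the I-cache maintenance that real
code patching needs. The four instruction words are the A64 encodings of
PACIASP, AUTIASP, "str x30, [x18], #8" and "ldr x30, [x18, #-8]!".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A64 encodings of the instructions involved. */
#define PACIASP   0xd503233fu   /* sign X30, using SP as the modifier */
#define AUTIASP   0xd50323bfu   /* authenticate X30 */
#define SCS_PUSH  0xf800865eu   /* str x30, [x18], #8 */
#define SCS_POP   0xf85f8e5eu   /* ldr x30, [x18, #-8]! */

/*
 * Hypothetical helper: rewrite one instruction word in place.
 * Returns 1 if the word was a PAC instruction and got replaced,
 * 0 if it was left untouched.
 */
static int scs_convert_insn(uint32_t *insn)
{
	assert(insn != NULL);

	switch (*insn) {
	case PACIASP:
		*insn = SCS_PUSH;
		return 1;
	case AUTIASP:
		*insn = SCS_POP;
		return 1;
	default:
		return 0;
	}
}

/*
 * Hypothetical driver: visit the candidate instruction indices in
 * 'locs' and convert them, but only when the hardware cannot sign
 * return addresses (hw_has_pac == 0). Returns the number of words
 * that were rewritten.
 */
static size_t scs_convert_locs(uint32_t *text, const size_t *locs,
			       size_t nlocs, int hw_has_pac)
{
	size_t i, patched = 0;

	if (hw_has_pac)
		return 0;	/* keep PACIASP/AUTIASP as emitted */

	for (i = 0; i < nlocs; i++)
		patched += scs_convert_insn(&text[locs[i]]);

	return patched;
}
```

Because each PACIASP/AUTIASP occupies exactly one instruction slot and
each shadow call stack push or pop is also a single instruction, the
rewrite never changes code size or branch offsets.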

In order to allow runtimes to unwind call stacks that involve return
address signing, we track whether or not the return address is currently
signed by means of DWARF CFI directives in the unwinding metadata. This
means we can use this information to locate all PACIASP/AUTIASP
instructions in the binary, instead of having to use brute force and go
over all instructions in the entire program.
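As a rough sketch of how the unwind metadata can be used for this, the
code below walks the CFI instruction stream of a single FDE, tracks the
current code location, and records a site each time it sees the AArch64
DW_CFA_negate_ra_state opcode (0x2d). It assumes that the directive
takes effect immediately after the PACIASP/AUTIASP it describes, so the
instruction itself sits 4 bytes before the tracked location. The
function name find_pac_sites(), its parameters, and the small subset of
opcodes handled are assumptions made for illustration; anything
unrecognised makes the walk fail rather than guess.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DW_CFA_nop              0x00
#define DW_CFA_advance_loc1     0x02
#define DW_CFA_advance_loc2     0x03
#define DW_CFA_def_cfa          0x0c
#define DW_CFA_def_cfa_register 0x0d
#define DW_CFA_def_cfa_offset   0x0e
#define DW_CFA_negate_ra_state  0x2d	/* AArch64 vendor opcode */
#define DW_CFA_advance_loc      0x40	/* low six bits hold the delta */
#define DW_CFA_offset           0x80	/* low six bits hold the register */
#define DW_CFA_restore          0xc0	/* low six bits hold the register */

/* Skip one ULEB128 operand and return the position following it. */
static size_t skip_uleb128(const uint8_t *p, size_t pos, size_t len)
{
	while (pos < len && (p[pos++] & 0x80))
		;
	return pos;
}

/*
 * Walk 'len' bytes of CFI instructions. 'pc' is the start address of
 * the function and 'caf' the code alignment factor from the CIE.
 * Stores the address of each presumed PACIASP/AUTIASP in 'sites' and
 * returns how many were found, or -1 on an unhandled opcode, truncated
 * operand, or overflow of 'sites'.
 */
static int find_pac_sites(const uint8_t *cfi, size_t len, uint64_t pc,
			  unsigned int caf, uint64_t *sites, int max_sites)
{
	size_t pos = 0;
	int n = 0;

	assert(cfi != NULL && sites != NULL);

	while (pos < len) {
		uint8_t op = cfi[pos++];

		/* Opcodes that carry their operand in the low six bits. */
		switch (op & 0xc0) {
		case DW_CFA_advance_loc:
			pc += (uint64_t)(op & 0x3f) * caf;
			continue;
		case DW_CFA_offset:
			pos = skip_uleb128(cfi, pos, len);
			continue;
		case DW_CFA_restore:
			continue;
		}

		switch (op) {
		case DW_CFA_nop:
			break;
		case DW_CFA_advance_loc1:
			if (pos + 1 > len)
				return -1;
			pc += (uint64_t)cfi[pos] * caf;
			pos += 1;
			break;
		case DW_CFA_advance_loc2:
			if (pos + 2 > len)
				return -1;
			pc += (uint64_t)(cfi[pos] | (cfi[pos + 1] << 8)) * caf;
			pos += 2;
			break;
		case DW_CFA_def_cfa:
			pos = skip_uleb128(cfi, pos, len);
			pos = skip_uleb128(cfi, pos, len);
			break;
		case DW_CFA_def_cfa_register:
		case DW_CFA_def_cfa_offset:
			pos = skip_uleb128(cfi, pos, len);
			break;
		case DW_CFA_negate_ra_state:
			if (n == max_sites)
				return -1;
			sites[n++] = pc - 4;	/* instruction just executed */
			break;
		default:
			return -1;	/* refuse to guess */
		}
	}
	return n;
}
```

Only the handful of bytes in each FDE needs to be decoded, so the cost
scales with the size of the unwind tables rather than with the size of
the kernel text.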

This series implements this approach for Clang, which has been vetted
(and fixed in release 15) to ensure that the unwind metadata is 100%
accurate when it comes to PACIASP/AUTIASP occurrences. Sadly, GCC does
not always get that quite right, so this series is Clang-only for the
moment.

Changes since v4 [1]:
- rebase onto v6.0-rc2
- use SYS_FIELD_GET for AA64ISAR1/2 sysreg field accesses
- add Sami's Rb/Tb

Changes since v3 [2]:
- rebase onto arm64/for-next/core
- fix init value of dynamic_scs_enabled static key
- don't discard .eh_frame sections (to work around a bug in an older
  Clang version) if we are keeping them for dynamic SCS patching,
- print a diagnostic if dynamic SCS patching is enabled,
- apply build fix suggested by Sami and add his ack to patch #2

Changes since v2 [3]:
- don't enable unwind table generation for nVHE code - it cannot be
  patched anyway so it has no use for it;
- drop checks for ID reg overrides
- fix some remaining TODOs regarding augmentation data and the code
  alignment factor
- disable PAC for leaf functions when dynamic SCS is configured, so that
  we don't end up with SCS pushes and pops in all leaf functions too;
- add I-cache maintenance after code patching
- add Rb's from Nick and Kees.

Changes since RFC v1 [0]:
- implement boot time check for PAC/BTI support, and only enable dynamic
  SCS if neither is supported;
- implement module patching as well;
- switch to Clang, and drop workaround for GCC bug;

[0] https://lore.kernel.org/linux-arm-kernel/20211013152243.2216899-1-ardb@kernel.org/
[1] https://lore.kernel.org/linux-arm-kernel/20220701152724.3343599-1-ardb@kernel.org/
[2] https://lore.kernel.org/linux-arm-kernel/20220613134008.3760481-1-ardb@kernel.org/
[3] https://lore.kernel.org/linux-arm-kernel/20220505161011.1801596-1-ardb@kernel.org/

Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Kees Cook <keescook@chromium.org>

Ard Biesheuvel (3):
  arm64: unwind: add asynchronous unwind tables to kernel and modules
  scs: add support for dynamic shadow call stacks
  arm64: implement dynamic shadow call stack for Clang

 Makefile                              |   2 +
 arch/Kconfig                          |   7 +
 arch/arm64/Kconfig                    |  12 +
 arch/arm64/Makefile                   |  15 +-
 arch/arm64/include/asm/module.lds.h   |   8 +
 arch/arm64/include/asm/scs.h          |  49 ++++
 arch/arm64/kernel/Makefile            |   2 +
 arch/arm64/kernel/head.S              |   3 +
 arch/arm64/kernel/irq.c               |   2 +-
 arch/arm64/kernel/module.c            |   8 +
 arch/arm64/kernel/patch-scs.c         | 257 ++++++++++++++++++++
 arch/arm64/kernel/pi/Makefile         |   1 +
 arch/arm64/kernel/sdei.c              |   2 +-
 arch/arm64/kernel/setup.c             |   4 +
 arch/arm64/kernel/vmlinux.lds.S       |  13 +
 arch/arm64/kvm/hyp/nvhe/Makefile      |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 include/asm-generic/vmlinux.lds.h     |   9 +-
 include/linux/scs.h                   |  18 ++
 kernel/scs.c                          |  14 +-
 scripts/module.lds.S                  |   8 +-
 21 files changed, 427 insertions(+), 9 deletions(-)
 create mode 100644 arch/arm64/kernel/patch-scs.c

Comments

Kees Cook Sept. 7, 2022, 5:25 p.m. UTC | #1
On Mon, Aug 22, 2022 at 11:50:55AM +0200, Ard Biesheuvel wrote:
> Generic kernel images such as Android's GKI usually enable all available
> security features, which are typically implemented in such a way that
> they only take effect if the underlying hardware can support it, but
> don't interfere with correct and efficient operation otherwise.
> 
> For shadow call stack support, which is always supported by the
> hardware, it means it will be enabled even if pointer authentication is
> also supported, and enabled for signing return addresses stored on the
> stack. The additional security provided by shadow call stack is only
> marginal in this case, whereas the performance overhead is not.
> 
> Given that return address signing is based on PACIASP/AUTIASP
> instructions that implicitly operate on the return address register
> (X30) and are not idempotent (i.e., each needs to be emitted exactly
> once before the return address is stored on the ordinary stack and after
> it has been retrieved from it), we can convert these instructions 1:1
> into shadow call stack pushes and pops involving the register X30.
> As this is something that can be done at runtime rather than build time,
> we can do this conditionally based on whether or not return address
> signing is supported on the underlying hardware.
> 
> In order to allow runtimes to unwind call stacks that involve return
> address signing, we track whether or not the return address is currently
> signed by means of DWARF CFI directives in the unwinding metadata. This
> means we can use this information to locate all PACIASP/AUTIASP
> instructions in the binary, instead of having to use brute force and go
> over all instructions in the entire program.
> 
> This series implements this approach for Clang, which has been vetted
> (and fixed in release 15) to ensure that the unwind metadata is 100%
> accurate when it comes to PACIASP/AUTIASP occurrences. Sadly, GCC does
> not always get that quite right, so this series is Clang-only for the
> moment.

Will, Catalin, what's left for this series? I'd really like to get this
landed -- it's reviewed and tested, and will be used on real devices.

Thanks!

-Kees