[RFC,0/9] arm64: use unwind data on GCC for shadow call stack

Message ID 20211013152243.2216899-1-ardb@kernel.org (mailing list archive)

Message

Ard Biesheuvel Oct. 13, 2021, 3:22 p.m. UTC
This series is a proof-of-concept implementation that uses unwind tables
to locate PACIASP/AUTIASP instructions in the code and patches them
into shadow call stack pushes/pops at boot time if the platform in
question does not support pointer authentication in hardware. This way,
the overhead of the shadow call stack is only imposed if it actually
gives any benefit. It also means that the compiler does not need to
generate the code, so this works with GCC as well.
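
As a rough illustration of the patching step described above (a sketch
only, not the code from patch-scs.c in this series): the helper name
scs_patch_insn() is hypothetical, and the sketch assumes the caller has
already located the instruction and made it writable. The four
instruction words are the standard A64 encodings, with x18 assumed to
hold the shadow call stack pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A64 instruction encodings (fixed 32-bit words). */
#define PACIASP  0xd503233fu	/* hint #25: sign LR with SP as modifier */
#define AUTIASP  0xd50323bfu	/* hint #29: authenticate LR             */
#define SCS_PUSH 0xf800865eu	/* str x30, [x18], #8                    */
#define SCS_POP  0xf85f8e5eu	/* ldr x30, [x18, #-8]!                  */

/*
 * Hypothetical helper: rewrite one instruction word in place.
 * Returns 1 if the word was a PAC sign/authenticate instruction and
 * has been replaced, 0 if it was left untouched.
 *
 * Assumption: real kernel code would also need cache maintenance
 * after modifying instructions; that is omitted here.
 */
static int scs_patch_insn(uint32_t *insn)
{
	assert(insn != NULL);

	switch (*insn) {
	case PACIASP:
		*insn = SCS_PUSH;	/* function entry: push LR */
		return 1;
	case AUTIASP:
		*insn = SCS_POP;	/* function exit: pop LR */
		return 1;
	default:
		return 0;		/* not a PAC instruction */
	}
}
```

Since both the PAC instructions and their replacements are single
32-bit words, the substitution does not change the code layout.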

In fact, it only works with GCC at the moment, as Clang does not seem to
implement DW_CFA_negate_ra_state correctly: the directive should be
emitted after each PACIASP or AUTIASP instruction, but Clang only emits
it after the former.
However, GCC does not appear to get it quite right either, as it emits
the directive in the wrong place in some cases (but in a way that can be
worked around).
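
To make the role of the directive concrete, here is a minimal sketch of
scanning the CFA instruction stream of one FDE for
DW_CFA_negate_ra_state (opcode 0x2d on AArch64). It is not the parser
from this series: the function name find_signed_lr_insns() is
hypothetical, only a small subset of CFA opcodes is handled, and it
assumes the directive applies to the instruction immediately before the
current location (hence loc - 4). The code alignment factor is passed
in by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DW_CFA_nop              0x00
#define DW_CFA_advance_loc1     0x02
#define DW_CFA_advance_loc2     0x03
#define DW_CFA_remember_state   0x0a
#define DW_CFA_restore_state    0x0b
#define DW_CFA_def_cfa          0x0c
#define DW_CFA_def_cfa_register 0x0d
#define DW_CFA_def_cfa_offset   0x0e
#define DW_CFA_negate_ra_state  0x2d	/* AArch64 vendor opcode */

/* Skip one ULEB128 operand; returns the index just past it. */
static size_t skip_uleb128(const uint8_t *p, size_t i, size_t len)
{
	while (i < len && (p[i++] & 0x80))
		;
	return i;
}

/*
 * Walk the CFA opcodes of one FDE and record the code offset (in bytes,
 * relative to the function start) of each instruction that is followed
 * by DW_CFA_negate_ra_state. Returns the number of offsets stored, or
 * -1 on an unsupported opcode, truncated input, or full output buffer.
 */
static int find_signed_lr_insns(const uint8_t *cfi, size_t len,
				uint32_t code_align,
				uint32_t *out, size_t max_out)
{
	uint32_t loc = 0;
	size_t i = 0;
	int n = 0;

	while (i < len) {
		uint8_t op = cfi[i++];

		switch (op & 0xc0) {
		case 0x40:	/* DW_CFA_advance_loc: delta in low 6 bits */
			loc += (op & 0x3f) * code_align;
			continue;
		case 0x80:	/* DW_CFA_offset: one ULEB128 operand */
			i = skip_uleb128(cfi, i, len);
			continue;
		case 0xc0:	/* DW_CFA_restore: no operand */
			continue;
		}

		switch (op) {
		case DW_CFA_nop:
		case DW_CFA_remember_state:
		case DW_CFA_restore_state:
			break;
		case DW_CFA_advance_loc1:
			if (i + 1 > len)
				return -1;
			loc += cfi[i] * code_align;
			i += 1;
			break;
		case DW_CFA_advance_loc2:	/* little-endian operand */
			if (i + 2 > len)
				return -1;
			loc += (uint32_t)(cfi[i] | (cfi[i + 1] << 8)) * code_align;
			i += 2;
			break;
		case DW_CFA_def_cfa:		/* two ULEB128 operands */
			i = skip_uleb128(cfi, i, len);
			i = skip_uleb128(cfi, i, len);
			break;
		case DW_CFA_def_cfa_register:
		case DW_CFA_def_cfa_offset:	/* one ULEB128 operand */
			i = skip_uleb128(cfi, i, len);
			break;
		case DW_CFA_negate_ra_state:
			if (loc < 4 || (size_t)n >= max_out)
				return -1;
			out[n++] = loc - 4;	/* the PACIASP/AUTIASP itself */
			break;
		default:
			return -1;		/* opcode outside this sketch */
		}
	}
	return n;
}
```

For example, with a code alignment factor of 4, the byte sequence
41 2d 41 0e 10 9d 02 9e 01 44 de dd 0e 00 41 2d yields offsets 0 and
24. A directive emitted in the wrong place would show up here as an
offset that does not point at a PACIASP or AUTIASP instruction.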

Note that this only implements it for the core kernel. Modules should be
straightforward, and most of the code can be reused. Also, the
transformation is applied unconditionally, even if the hardware does
implement PAC, but this does not really matter for a PoC.

One obvious downside is the size of the unwind tables (3 MiB for
defconfig), although there are plenty of use cases where this does not
really matter (and I haven't checked the compressed size). However,
there may be other reasons why we'd want to have access to these unwind
tables (reliable stack traces), so this will need to be discussed before
I take this any further.

Cc: Kees Cook <keescook@google.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Dan Li <ashimida@linux.alibaba.com>

Ard Biesheuvel (9):
  arm64: assembler: enable PAC for non-leaf assembler routines
  arm64: cache: use ALIAS version of linkage macros for local aliases
  arm64: crypto: avoid overlapping linkage definitions for AES-CBC
  arm64: aes-neonbs: move frame pop to end of function
  arm64: chacha-neon: move frame pop forward
  arm64: smccc: create proper stack frames for HVC/SMC calls
  arm64: assembler: add unwind annotations to frame push/pop macros
  arm64: unwind: add asynchronous unwind tables to the kernel proper
  arm64: implement dynamic shadow call stack for GCC

 Makefile                              |   4 +-
 arch/Kconfig                          |   4 +-
 arch/arm64/Kconfig                    |  11 +-
 arch/arm64/Makefile                   |   7 +-
 arch/arm64/crypto/aes-modes.S         |   4 +-
 arch/arm64/crypto/aes-neonbs-core.S   |   8 +-
 arch/arm64/crypto/chacha-neon-core.S  |   9 +-
 arch/arm64/include/asm/assembler.h    |  32 ++-
 arch/arm64/include/asm/linkage.h      |  16 +-
 arch/arm64/kernel/Makefile            |   2 +
 arch/arm64/kernel/head.S              |   3 +
 arch/arm64/kernel/patch-scs.c         | 223 ++++++++++++++++++++
 arch/arm64/kernel/smccc-call.S        |  40 ++--
 arch/arm64/kernel/vmlinux.lds.S       |  20 ++
 arch/arm64/mm/cache.S                 |   8 +-
 drivers/firmware/efi/libstub/Makefile |   1 +
 16 files changed, 347 insertions(+), 45 deletions(-)
 create mode 100644 arch/arm64/kernel/patch-scs.c

Comments

Ard Biesheuvel Oct. 13, 2021, 5:52 p.m. UTC | #1
On Wed, 13 Oct 2021 at 17:22, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> This series is a proof of concept implementation of using unwind tables
> to locate PACIASP/AUTIASP instructions in the code, and patching them
> into shadow call stack pushes/pops at boot time if the platform in
> question does not support pointer authentication in hardware. This way,
> the overhead of the shadow call stack is only imposed if it actually
> gives any benefit. It also means that the compiler does not need to
> generate the code, so this works with GCC as well.
>
> In fact, it only works with GCC at the moment, as Clang does not seem to
> implement the DW_CFA_negate_ra_state correctly, which is emitted after
> each PACIASP or AUTIASP instruction (Clang only does the former).
> However, GCC does not appear to get it quite right either, as it emits
> the directive in the wrong place in some cases (but in a way that can be
> worked around).
>
> Note that this only implements it for the core kernel. Modules should be
> straightforward, and most of the code can be reused. Also, the
> transformation is applied unconditionally, even if the hardware does
> implement PAC, but this does not really matter for a PoC.
>
> One obvious downside is the size of the unwind tables (3 MiB for
> defconfig), although there are plenty of use cases where this does not
> really matter (and I haven't checked the compressed size). However,
> there may be other reasons why we'd want to have access to these unwind
> tables (reliable stack traces), so this will need to be discussed before
> I take this any further.
>
> Cc: Kees Cook <keescook@google.com>
> Cc: Sami Tolvanen <samitolvanen@google.com>
> Cc: Fangrui Song <maskray@google.com>
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Dan Li <ashimida@linux.alibaba.com>
>

Apologies - I failed to pass --cc-cover, so the cc'ees above have only
received this cover letter.

The lore thread is here:
https://lore.kernel.org/r/20211013152243.2216899-1-ardb@kernel.org/

Nick Desaulniers Oct. 13, 2021, 6:01 p.m. UTC | #2
On Wed, Oct 13, 2021 at 8:22 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> This series is a proof of concept implementation of using unwind tables
> to locate PACIASP/AUTIASP instructions in the code, and patching them
> into shadow call stack pushes/pops at boot time if the platform in
> question does not support pointer authentication in hardware. This way,
> the overhead of the shadow call stack is only imposed if it actually
> gives any benefit. It also means that the compiler does not need to
> generate the code, so this works with GCC as well.
>
> In fact, it only works with GCC at the moment, as Clang does not seem to
> implement the DW_CFA_negate_ra_state correctly, which is emitted after
> each PACIASP or AUTIASP instruction (Clang only does the former).
> However, GCC does not appear to get it quite right either, as it emits
> the directive in the wrong place in some cases (but in a way that can be
> worked around).

Can we work on getting bug reports to the compiler vendors? Then we
can have something free of workarounds and more portable across
toolchains.