[v4,00/13] KVM: arm64: Add support for hypervisor kCFI

Message ID 20240529121251.1993135-1-ptosi@google.com (mailing list archive)

Message

Pierre-Clément Tosi May 29, 2024, 12:12 p.m. UTC
CONFIG_CFI_CLANG ("kernel Control Flow Integrity") makes the compiler inject
runtime type checks before any indirect function call. On AArch64, it generates
a BRK instruction to be executed on type mismatch and encodes the indices of the
registers holding the branch target and expected type in the immediate of the
instruction. As a result, a synchronous exception gets triggered on kCFI failure
and the fault handler can retrieve the immediate (and indices) from ESR_ELx.
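
For readers unfamiliar with kCFI, the check described above can be modelled in
plain C. This is only an illustrative sketch, not the compiler's actual code
generation: struct typed_fn, checked_call() and the hash values are
hypothetical names and numbers made up for the example. Real kCFI stores the
type hash next to the function's entry point and executes a BRK on mismatch
instead of returning an error.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int (*int_fn)(int);

/* Hypothetical model: a function pointer paired with its type hash. */
struct typed_fn {
	uint32_t type_hash;
	int_fn fn;
};

/* Made-up hash values, for illustration only. */
#define TYPE_INT_INT	0x7e0c52au
#define TYPE_OTHER	0x1234567u

static int increment(int x)
{
	return x + 1;
}

/*
 * Model of the injected check: compare the callee's hash with the hash
 * the call site expects, and only branch if they match.
 */
static bool checked_call(const struct typed_fn *callee, uint32_t expected,
			 int arg, int *result)
{
	if (callee->type_hash != expected)
		return false;	/* real kCFI would execute BRK here */

	*result = callee->fn(arg);
	return true;
}
```

Calling checked_call() on a callee tagged TYPE_INT_INT with the same expected
hash performs the call, while a callee tagged TYPE_OTHER is rejected before
the branch is taken.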

This feature has been supported at EL1 ("host") since it was introduced by
b26e484b8bb3 ("arm64: Add CFI error handling"), where cfi_handler() decodes
ESR_EL1, giving informative panic messages such as

  [   21.885179] CFI failure at lkdtm_indirect_call+0x2c/0x44 [lkdtm]
  (target: lkdtm_increment_int+0x0/0x1c [lkdtm]; expected type: 0x7e0c52a)
  [   21.886593] Internal error: Oops - CFI: 0 [#1] PREEMPT SMP

However, it is either unsupported or only partially supported at EL2: in nVHE (or pKVM),
CONFIG_CFI_CLANG gets filtered out at build time, preventing the compiler from
injecting the checks. In VHE, EL2 code gets compiled with the checks but the
handlers in VBAR_EL2 are not aware of kCFI and will produce a generic and
not-so-helpful panic message such as

  [   36.456088][  T200] Kernel panic - not syncing: HYP panic:
  [   36.456088][  T200] PS:204003c9 PC:ffffffc080092310 ESR:f2008228
  [   36.456088][  T200] FAR:0000000081a50000 HPFAR:000000000081a500 PAR:1de7ec7edbadc0de
  [   36.456088][  T200] VCPU:00000000e189c7cf
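
The ESR value in the panic above already carries the kCFI information; it is
just not decoded. The following standalone sketch shows how the register
indices could be recovered from it. The macro names and decode_cfi_brk() are
hypothetical and local to this example. The bit positions (EC in bits [31:26],
a 16-bit BRK immediate, base 0x8000, target index in bits [4:0] and type index
in bits [9:5]) reflect my understanding of the arm64 encoding and should be
checked against the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of ESR_ELx for a BRK taken from AArch64 state. */
#define ESR_EC_SHIFT	26
#define ESR_EC_MASK	0x3fu
#define ESR_EC_BRK64	0x3cu
#define ESR_BRK_COMMENT	0xffffu	/* BRK immediate, low 16 bits of the ISS */

/* Assumed layout of the kCFI BRK immediate. */
#define CFI_BRK_BASE	0x8000u
#define CFI_BRK_TARGET	0x001fu	/* bits [4:0]: register holding the target */
#define CFI_BRK_TYPE	0x03e0u	/* bits [9:5]: register holding the type */

struct cfi_brk_info {
	unsigned int target_reg;
	unsigned int type_reg;
};

/* Returns true and fills *info if esr describes a kCFI BRK. */
static bool decode_cfi_brk(uint64_t esr, struct cfi_brk_info *info)
{
	uint32_t ec = (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
	uint32_t imm = esr & ESR_BRK_COMMENT;

	if (ec != ESR_EC_BRK64)
		return false;

	/* Everything outside the two index fields must equal the base. */
	if ((imm & ~(CFI_BRK_TARGET | CFI_BRK_TYPE)) != CFI_BRK_BASE)
		return false;

	info->target_reg = imm & CFI_BRK_TARGET;
	info->type_reg = (imm & CFI_BRK_TYPE) >> 5;
	return true;
}
```

Under these assumptions, ESR:f2008228 decodes to a kCFI BRK with the branch
target in x8 and the expected type in w17, which the generic HYP panic banner
does not report.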

To address this,

- [01/13] fixes an existing bug where ELR_EL2 was getting clobbered on
  synchronous exceptions, causing the wrong "PC" to be reported by
  nvhe_hyp_panic_handler() or __hyp_call_panic(). This is particularly limiting
  for kCFI, as it would mask the location of the failed type check.
- [02/13] fixes a minor C/asm ABI mismatch which would trigger a kCFI failure
- [03/13] to [09/13] prepare nVHE for CONFIG_CFI_CLANG and [10/13] enables it
- [11/13] improves kCFI error messages by saving then parsing the CPU context
- [12/13] adds a kCFI test module for VHE and [13/13] extends it to nVHE & pKVM

As a result, an informative kCFI panic message is printed by or on behalf of EL2
giving the expected type and target address (possibly resolved to a symbol) for
VHE, nVHE, and pKVM (iff CONFIG_NVHE_EL2_DEBUG=y).

Note that kCFI errors remain fatal at EL2, even when CONFIG_CFI_PERMISSIVE=y.

Changes in v4:
  - Addressed Will's comments on v3:
  - Removed save/restore of x0-x1 & used __guest_exit_panic ABI for new routines
  - Reworked __pkvm_init_switch_pgd new API with separate args
  - Moved cosmetic changes (renaming to __hyp_panic) into dedicated commit
  - Further clarified the commit message regarding R_AARCH64_ABS32
  - early_brk64() uses esr_is_cfi_brk() (now introduced along esr_brk_comment())
  - Added helper to display nVHE panic banner
  - Moved the test module to the end of the series

Pierre-Clément Tosi (13):
  KVM: arm64: Fix clobbered ELR in sync abort/SError
  KVM: arm64: Fix __pkvm_init_switch_pgd call ABI
  KVM: arm64: nVHE: Simplify __guest_exit_panic path
  KVM: arm64: nVHE: Add EL2h sync exception handler
  KVM: arm64: Rename __guest_exit_panic __hyp_panic
  KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32
  KVM: arm64: VHE: Mark __hyp_call_panic __noreturn
  arm64: Introduce esr_comment() & esr_is_cfi_brk()
  KVM: arm64: Introduce print_nvhe_hyp_panic helper
  KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2
  KVM: arm64: Improve CONFIG_CFI_CLANG error message
  KVM: arm64: VHE: Add test module for hyp kCFI
  KVM: arm64: nVHE: Support test module for hyp kCFI

 arch/arm64/include/asm/esr.h            |  11 ++
 arch/arm64/include/asm/kvm_asm.h        |   3 +
 arch/arm64/include/asm/kvm_cfi.h        |  38 +++++++
 arch/arm64/include/asm/kvm_hyp.h        |   3 +-
 arch/arm64/kernel/asm-offsets.c         |   1 +
 arch/arm64/kernel/debug-monitors.c      |   4 +-
 arch/arm64/kernel/traps.c               |   8 +-
 arch/arm64/kvm/Kconfig                  |  20 ++++
 arch/arm64/kvm/Makefile                 |   3 +
 arch/arm64/kvm/handle_exit.c            |  50 ++++++++-
 arch/arm64/kvm/hyp/cfi.c                |  37 +++++++
 arch/arm64/kvm/hyp/entry.S              |  34 +++++-
 arch/arm64/kvm/hyp/hyp-entry.S          |   4 +-
 arch/arm64/kvm/hyp/include/hyp/cfi.h    |  47 +++++++++
 arch/arm64/kvm/hyp/include/hyp/switch.h |   5 +-
 arch/arm64/kvm/hyp/nvhe/Makefile        |   7 +-
 arch/arm64/kvm/hyp/nvhe/gen-hyprel.c    |   6 ++
 arch/arm64/kvm/hyp/nvhe/host.S          |  21 ++--
 arch/arm64/kvm/hyp/nvhe/hyp-init.S      |  23 ++--
 arch/arm64/kvm/hyp/nvhe/hyp-main.c      |  19 ++++
 arch/arm64/kvm/hyp/nvhe/setup.c         |   4 +-
 arch/arm64/kvm/hyp/nvhe/switch.c        |   7 ++
 arch/arm64/kvm/hyp/vhe/Makefile         |   1 +
 arch/arm64/kvm/hyp/vhe/switch.c         |  34 +++++-
 arch/arm64/kvm/hyp_cfi_test.c           |  75 +++++++++++++
 arch/arm64/kvm/hyp_cfi_test_module.c    | 135 ++++++++++++++++++++++++
 26 files changed, 553 insertions(+), 47 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_cfi.h
 create mode 100644 arch/arm64/kvm/hyp/cfi.c
 create mode 100644 arch/arm64/kvm/hyp/include/hyp/cfi.h
 create mode 100644 arch/arm64/kvm/hyp_cfi_test.c
 create mode 100644 arch/arm64/kvm/hyp_cfi_test_module.c

Comments

Will Deacon June 3, 2024, 1:59 p.m. UTC | #1
On Wed, May 29, 2024 at 01:12:06PM +0100, Pierre-Clément Tosi wrote:
> [...]
> 
> Changes in v4:
>   - Addressed Will's comments on v3:

nit: but please keep reviewers on CC when you post a new version. I
missed this initially.

Will