mbox series

[v4,00/21] KCFI support

Message ID 20220830233129.30610-1-samitolvanen@google.com (mailing list archive)
Headers show
Series KCFI support | expand

Message

Sami Tolvanen Aug. 30, 2022, 11:31 p.m. UTC
KCFI is a forward-edge control-flow integrity scheme in the upcoming
Clang 16 release, which is more suitable for kernel use than the
existing CFI scheme used by CONFIG_CFI_CLANG. KCFI doesn't require
LTO, doesn't alter function references to point to a jump table, and
won't break function address equality.

This series replaces the current arm64 CFI implementation with KCFI
and adds support for x86_64.

KCFI requires assembly functions that are indirectly called from C
code to be annotated with type identifiers. As type information is
only available in C, the compiler emits expected type identifiers
into the symbol table, so they can be referenced from assembly
without having to hardcode type hashes. Patch 6 adds helper macros
for annotating functions, and patches 9 and 19 add annotations.

In case of a type mismatch, KCFI always traps. To support error
handling, the compiler generates a .kcfi_traps section for x86_64,
which contains the locations of each trap, and for arm64, encodes
the necessary register information to the ESR. Patches 10 and 21 add
arch-specific error handlers.

To test this series, you'll a ToT Clang toolchain. The series is
also available in GitHub:

  https://github.com/samitolvanen/linux/commits/kcfi-v4

---
Changes in v4:
- Dropped the RFC now that Clang support is merged.

- Changed the x86_64 function preamble to match the the preamble
  generated by the compiler, and fixed a code generation issue,
  which Peter pointed out.

- Added a patch to fix arm64 psci_initcall_t type mismatch based
  on Mark's suggestion.

Changes in v3:
- Merged the patches that split CC_FLAGS_CFI from CC_FLAGS_LTO.

- Dropped the psci_initcall_t patch as Mark volunteered to send a
  patch for this. Note that this patch is still needed to boot a
  CFI kernel on certain arm64 systems:
  https://lore.kernel.org/lkml/YoNhKaTT3EDukxXY@FVFF77S0Q05N/

- Added a patch to remove the now unnecessary workarounds with
  CFI+ThinLTO in kallsyms.

- Added an lkdtm patch to ensure the test actually generates an
  indirect call.

- Changed report_cfi_failure to clearly indicate if we failed to
  decode target address.

- Switched to relative offsets for .kcfi_traps.

- On x86_64, moved CFI error handling from traps.c to cfi.c, and
  as we only call memcpy indirectly w/ CONFIG_MODULES, ensured that
  the compiler emits __kcfi_typeid_memcpy also without modules.

- On x86_64, added a check for the cmpl REX prefix to handle the
  case where the compiler might not use r8-r15 registers for the
  call target.

- On the compiler side, ensured that on x86_64 calls are emitted
  immediately after the CFI check, fixed the __cfi_ preamble
  linkage, and changed the compiler to emit relative offsets in
  .kcfi_traps.

Changes in v2:
- Changed the compiler patch to encode arm64 target and type details
  in the ESR, and updated the kernel error handling patch accordingly.

- Changed the compiler patch to embed the x86_64 type hash in a valid
  instruction to avoid special casing objtool instruction decoding, and
  added a __cfi_ symbol for the preamble. Changed the kernel error
  handling and manual type annotations to match.

- Dropped the .kcfi_types section as that’s no longer needed by
  objtool, and changed the objtool patch to simply ignore the __cfi_
  preambles falling through.

- Dropped the .kcfi_traps section on arm64 as it’s no longer needed,
  and moved the trap look-up code behind CONFIG_ARCH_USES_CFI_TRAPS,
  which is selected only for x86_64.

- Dropped __nocfi attributes from arm64 code where CFI was disabled
  due to address space confusion issues, and added type annotations to
  relevant assembly functions.

- Dropped __nocfi from __init.

Sami Tolvanen (21):
  treewide: Filter out CC_FLAGS_CFI
  scripts/kallsyms: Ignore __kcfi_typeid_
  cfi: Remove CONFIG_CFI_CLANG_SHADOW
  cfi: Drop __CFI_ADDRESSABLE
  cfi: Switch to -fsanitize=kcfi
  cfi: Add type helper macros
  lkdtm: Emit an indirect call for CFI tests
  psci: Fix the function type for psci_initcall_t
  arm64: Add types to indirect called assembly functions
  arm64: Add CFI error handling
  arm64: Drop unneeded __nocfi attributes
  init: Drop __nocfi from __init
  treewide: Drop function_nocfi
  treewide: Drop WARN_ON_FUNCTION_MISMATCH
  treewide: Drop __cficanonical
  objtool: Disable CFI warnings
  kallsyms: Drop CONFIG_CFI_CLANG workarounds
  x86/tools/relocs: Ignore __kcfi_typeid_ relocations
  x86: Add types to indirectly called assembly functions
  x86/purgatory: Disable CFI
  x86: Add support for CONFIG_CFI_CLANG

 Makefile                                  |  13 +-
 arch/Kconfig                              |  18 +-
 arch/arm64/crypto/ghash-ce-core.S         |   5 +-
 arch/arm64/crypto/sm3-ce-core.S           |   3 +-
 arch/arm64/include/asm/brk-imm.h          |   6 +
 arch/arm64/include/asm/ftrace.h           |   2 +-
 arch/arm64/include/asm/mmu_context.h      |   4 +-
 arch/arm64/kernel/acpi_parking_protocol.c |   2 +-
 arch/arm64/kernel/alternative.c           |   2 +-
 arch/arm64/kernel/cpu-reset.S             |   5 +-
 arch/arm64/kernel/cpufeature.c            |   4 +-
 arch/arm64/kernel/ftrace.c                |   2 +-
 arch/arm64/kernel/machine_kexec.c         |   2 +-
 arch/arm64/kernel/psci.c                  |   2 +-
 arch/arm64/kernel/smp_spin_table.c        |   2 +-
 arch/arm64/kernel/traps.c                 |  47 ++-
 arch/arm64/kernel/vdso/Makefile           |   3 +-
 arch/arm64/mm/proc.S                      |   5 +-
 arch/x86/Kconfig                          |   2 +
 arch/x86/crypto/blowfish-x86_64-asm_64.S  |   5 +-
 arch/x86/entry/vdso/Makefile              |   3 +-
 arch/x86/include/asm/cfi.h                |  22 ++
 arch/x86/include/asm/linkage.h            |   9 +
 arch/x86/kernel/Makefile                  |   2 +
 arch/x86/kernel/cfi.c                     |  85 ++++++
 arch/x86/kernel/traps.c                   |   4 +-
 arch/x86/lib/memcpy_64.S                  |   3 +-
 arch/x86/purgatory/Makefile               |   4 +
 arch/x86/tools/relocs.c                   |   1 +
 drivers/firmware/efi/libstub/Makefile     |   2 +
 drivers/firmware/psci/psci.c              |  12 +-
 drivers/misc/lkdtm/cfi.c                  |  15 +-
 drivers/misc/lkdtm/usercopy.c             |   2 +-
 include/asm-generic/bug.h                 |  16 -
 include/asm-generic/vmlinux.lds.h         |  37 +--
 include/linux/cfi.h                       |  59 ++--
 include/linux/cfi_types.h                 |  57 ++++
 include/linux/compiler-clang.h            |  14 +-
 include/linux/compiler.h                  |  16 +-
 include/linux/compiler_types.h            |   4 -
 include/linux/init.h                      |   6 +-
 include/linux/module.h                    |  10 +-
 include/linux/pci.h                       |   4 +-
 kernel/cfi.c                              | 352 ++++------------------
 kernel/kallsyms.c                         |  17 --
 kernel/kthread.c                          |   3 +-
 kernel/module/main.c                      |  50 +--
 kernel/workqueue.c                        |   2 +-
 scripts/kallsyms.c                        |   1 +
 scripts/module.lds.S                      |  23 +-
 tools/objtool/check.c                     |   7 +-
 51 files changed, 423 insertions(+), 553 deletions(-)
 create mode 100644 arch/x86/include/asm/cfi.h
 create mode 100644 arch/x86/kernel/cfi.c
 create mode 100644 include/linux/cfi_types.h


base-commit: dcf8e5633e2e69ad60b730ab5905608b756a032f

Comments

Kees Cook Aug. 31, 2022, 10:04 p.m. UTC | #1
On Tue, Aug 30, 2022 at 04:31:08PM -0700, Sami Tolvanen wrote:
> KCFI is a forward-edge control-flow integrity scheme in the upcoming
> Clang 16 release, which is more suitable for kernel use than the
> existing CFI scheme used by CONFIG_CFI_CLANG. KCFI doesn't require
> LTO, doesn't alter function references to point to a jump table, and
> won't break function address equality.
> 
> This series replaces the current arm64 CFI implementation with KCFI
> and adds support for x86_64.
> 
> KCFI requires assembly functions that are indirectly called from C
> code to be annotated with type identifiers. As type information is
> only available in C, the compiler emits expected type identifiers
> into the symbol table, so they can be referenced from assembly
> without having to hardcode type hashes. Patch 6 adds helper macros
> for annotating functions, and patches 9 and 19 add annotations.
> 
> In case of a type mismatch, KCFI always traps. To support error
> handling, the compiler generates a .kcfi_traps section for x86_64,
> which contains the locations of each trap, and for arm64, encodes
> the necessary register information to the ESR. Patches 10 and 21 add
> arch-specific error handlers.
> 
> To test this series, you'll a ToT Clang toolchain. The series is
> also available in GitHub:
> 
>   https://github.com/samitolvanen/linux/commits/kcfi-v4

This happily runs through all my initial testing hoops. :)

Tested-by: Kees Cook <keescook@chromium.org>
Nathan Chancellor Sept. 1, 2022, 9:19 p.m. UTC | #2
Hi Sami,

On Tue, Aug 30, 2022 at 04:31:08PM -0700, Sami Tolvanen wrote:
> KCFI is a forward-edge control-flow integrity scheme in the upcoming
> Clang 16 release, which is more suitable for kernel use than the
> existing CFI scheme used by CONFIG_CFI_CLANG. KCFI doesn't require
> LTO, doesn't alter function references to point to a jump table, and
> won't break function address equality.
> 
> This series replaces the current arm64 CFI implementation with KCFI
> and adds support for x86_64.
> 
> KCFI requires assembly functions that are indirectly called from C
> code to be annotated with type identifiers. As type information is
> only available in C, the compiler emits expected type identifiers
> into the symbol table, so they can be referenced from assembly
> without having to hardcode type hashes. Patch 6 adds helper macros
> for annotating functions, and patches 9 and 19 add annotations.
> 
> In case of a type mismatch, KCFI always traps. To support error
> handling, the compiler generates a .kcfi_traps section for x86_64,
> which contains the locations of each trap, and for arm64, encodes
> the necessary register information to the ESR. Patches 10 and 21 add
> arch-specific error handlers.
> 
> To test this series, you'll a ToT Clang toolchain. The series is
> also available in GitHub:
> 
>   https://github.com/samitolvanen/linux/commits/kcfi-v4

I took this series for a spin on arm64 and x86_64.

I did not see any runtime issues on my arm64 or AMD test machines but I
do see a set of failures on my two Intel test machines when accessing
the files in /sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/gt0:

  $ ls -1 /sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/gt0
  id
  punit_req_freq_mhz
  rc6_enable
  rc6_residency_ms
  rps_act_freq_mhz
  rps_boost_freq_mhz
  rps_cur_freq_mhz
  rps_max_freq_mhz
  rps_min_freq_mhz
  rps_rp0_freq_mhz
  rps_rp1_freq_mhz
  rps_rpn_freq_mhz
  throttle_reason_pl1
  throttle_reason_pl2
  throttle_reason_pl4
  throttle_reason_prochot
  throttle_reason_ratl
  throttle_reason_status
  throttle_reason_thermal
  throttle_reason_vr_tdc
  throttle_reason_vr_thermalert

  $ cat /sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/gt0/*
  ...

  $ sudo dmesg | rg "CFI failure at"
  [  481.234522] CFI failure at kobj_attr_show+0x19/0x30 (target: max_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  481.234699] CFI failure at kobj_attr_show+0x19/0x30 (target: act_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  481.235067] CFI failure at kobj_attr_show+0x19/0x30 (target: boost_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  481.235194] CFI failure at kobj_attr_show+0x19/0x30 (target: min_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  481.235320] CFI failure at kobj_attr_show+0x19/0x30 (target: punit_req_freq_mhz_show+0x0/0x40 [i915]; expected type: 0xc527b809)
  [  481.235447] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  481.235570] CFI failure at kobj_attr_show+0x19/0x30 (target: id_show+0x0/0x70 [i915]; expected type: 0xc527b809)
  [  481.235694] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  481.235821] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  481.235945] CFI failure at kobj_attr_show+0x19/0x30 (target: cur_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  481.236075] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  481.236201] CFI failure at kobj_attr_show+0x19/0x30 (target: rc6_enable_show+0x0/0x40 [i915]; expected type: 0xc527b809)
  [  481.236327] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  481.236453] CFI failure at kobj_attr_show+0x19/0x30 (target: RP0_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  481.236582] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  481.236707] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  481.236836] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  481.236958] CFI failure at kobj_attr_show+0x19/0x30 (target: RPn_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  481.237079] CFI failure at kobj_attr_show+0x19/0x30 (target: RP1_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809)
  [  481.237224] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809)
  [  481.237377] CFI failure at kobj_attr_show+0x19/0x30 (target: rc6_residency_ms_show+0x0/0x270 [i915]; expected type: 0xc527b809)

The source of those is drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c. I
have not looked too closely yet but the fix should be something along
the lines of commit 58606220a2f1 ("drm/i915: Fix CFI violation with
show_dynamic_id()"). I will note that the 'dev' variable does appear to
be used so the fix might not be as trivial as that one. Not sure I will
have a chance to look into it before Plumbers but I am happy to test a
patch if you happen to see an obvious fix.

Interestingly, I do not see the KVM failure [1] that I reported anymore.
I do not see an obvious fix for it in this series or -next though, could
it have been an issue with an earlier revision of kCFI on the compiler
side?

I do see a few new objtool warnings as well:

vmlinux.o: warning: objtool: apply_relocate_add+0x34: relocation to !ENDBR: memcpy+0x0
vmlinux.o: warning: objtool: ___ksymtab+__memcpy+0x0: data relocation to !ENDBR: memcpy+0x0
vmlinux.o: warning: objtool: ___ksymtab+memcpy+0x0: data relocation to !ENDBR: memcpy+0x0

As a result of the above:

Tested-by: Nathan Chancellor <nathan@kernel.org>

[1]: https://github.com/ClangBuiltLinux/linux/issues/1644

Cheers,
Nathan
Sami Tolvanen Sept. 2, 2022, 12:33 a.m. UTC | #3
Hi Nathan,

On Thu, Sep 1, 2022 at 2:19 PM Nathan Chancellor <nathan@kernel.org> wrote:
> I took this series for a spin on arm64 and x86_64.

Thanks for testing!

> I did not see any runtime issues on my arm64 or AMD test machines but I
> do see a set of failures on my two Intel test machines when accessing
> the files in /sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/gt0:

Yes, I suspect there are a few sysfs type mismatches left to fix
still. I believe Kees was looking into these earlier.

> The source of those is drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c. I
> have not looked too closely yet but the fix should be something along
> the lines of commit 58606220a2f1 ("drm/i915: Fix CFI violation with
> show_dynamic_id()").

I don't have hardware for testing this driver, but that looks like a
typical kobj_attribute / device_attribute mismatch, which happens to
work because struct device starts with a kobject and the attribute
structure is identical. I can take a look at this next week.

> Interestingly, I do not see the KVM failure [1] that I reported anymore.
> I do not see an obvious fix for it in this series or -next though, could
> it have been an issue with an earlier revision of kCFI on the compiler
> side?

Most likely the compiler either converted it into a direct call, or
inlined it. There are a few type mismatches in the kernel still that
don't trip KCFI because they're optimized into direct calls.

> I do see a few new objtool warnings as well:
>
> vmlinux.o: warning: objtool: apply_relocate_add+0x34: relocation to !ENDBR: memcpy+0x0
> vmlinux.o: warning: objtool: ___ksymtab+__memcpy+0x0: data relocation to !ENDBR: memcpy+0x0
> vmlinux.o: warning: objtool: ___ksymtab+memcpy+0x0: data relocation to !ENDBR: memcpy+0x0

That's interesting. I can only reproduce this warning with
allmodconfig+LTO, even though the relocation exists in all builds (the
code makes an indirect call to memcpy) and memcpy (aliased to
__memcpy) doesn't start with endbr. I'll have to take a closer look at
why this warning only appears with LTO.

Sami
Peter Zijlstra Sept. 2, 2022, 7:51 a.m. UTC | #4
On Thu, Sep 01, 2022 at 05:33:29PM -0700, Sami Tolvanen wrote:

> > I do see a few new objtool warnings as well:
> >
> > vmlinux.o: warning: objtool: apply_relocate_add+0x34: relocation to !ENDBR: memcpy+0x0
> > vmlinux.o: warning: objtool: ___ksymtab+__memcpy+0x0: data relocation to !ENDBR: memcpy+0x0
> > vmlinux.o: warning: objtool: ___ksymtab+memcpy+0x0: data relocation to !ENDBR: memcpy+0x0
> 
> That's interesting. I can only reproduce this warning with
> allmodconfig+LTO, even though the relocation exists in all builds (the
> code makes an indirect call to memcpy) and memcpy (aliased to
> __memcpy) doesn't start with endbr. I'll have to take a closer look at
> why this warning only appears with LTO.

From just looking at the patches I'd say patch #19 breaks it. IIRC you
forgot to make the SYM_TYPED_FUNC things emit ENDBR.

Look at how x86/asm/linkage.h is overriding SYM_FUNC_START*().

You might have the same bug vs ARM64 BTI, they do the same thing.
Sami Tolvanen Sept. 2, 2022, 3:39 p.m. UTC | #5
On Fri, Sep 2, 2022 at 12:51 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Sep 01, 2022 at 05:33:29PM -0700, Sami Tolvanen wrote:
>
> > > I do see a few new objtool warnings as well:
> > >
> > > vmlinux.o: warning: objtool: apply_relocate_add+0x34: relocation to !ENDBR: memcpy+0x0
> > > vmlinux.o: warning: objtool: ___ksymtab+__memcpy+0x0: data relocation to !ENDBR: memcpy+0x0
> > > vmlinux.o: warning: objtool: ___ksymtab+memcpy+0x0: data relocation to !ENDBR: memcpy+0x0
> >
> > That's interesting. I can only reproduce this warning with
> > allmodconfig+LTO, even though the relocation exists in all builds (the
> > code makes an indirect call to memcpy) and memcpy (aliased to
> > __memcpy) doesn't start with endbr. I'll have to take a closer look at
> > why this warning only appears with LTO.
>
> From just looking at the patches I'd say patch #19 breaks it. IIRC you
> forgot to make the SYM_TYPED_FUNC things emit ENDBR.
>
> Look at how x86/asm/linkage.h is overriding SYM_FUNC_START*().

Yes, that's the reason, I'll fix this next week. I was mostly
wondering why I'm not getting this warning with my other test configs,
but it looks like IBT isn't enabled by default.

Sami