[RFC,00/28] x86: Rely on toolchain for relocatable code

Message ID 20240925150059.3955569-30-ardb+git@google.com (mailing list archive)

Ard Biesheuvel Sept. 25, 2024, 3:01 p.m. UTC
From: Ard Biesheuvel <ardb@kernel.org>

The x86_64 port has a number of historical quirks that result in a
reliance on toolchain features that are either poorly specified or
basically implementation details of the toolchain:

- the 'kernel' code model implemented by the compiler is intended for
  position dependent code residing in the 'negative' 2 GiB of the
  virtual address space, but is used to create a position independent
  executable (for virtual KASLR);

- the 'kernel' code model has other properties that are not written
  down anywhere, and may therefore deviate between compilers and
  versions, a set that now includes the Rust compiler too (e.g., the use
  of %gs rather than %fs for per-CPU references);

- the relocation format used to perform the PIE relocation at boot is
  complicated and non-standard, as it deals with 3 types of
  displacements, including 32-bit negative displacements for
  RIP-relative per-CPU references that are not subject to relocation
  fixups (as they are placed in a separate, disjoint address space);

- the relocation table is generated from static relocation metadata
  taken from the ELF objects passed to the linker, and these describe
  the input, not the output: relaxations or other linker tweaks may
  result in a mismatch between the two, and GNU ld and LLD display
  different behavior here;

- this disjoint per-CPU address space requires elaborate hacks in the
  linker script and the startup code;

- some of the startup code executes from a 1:1 mapping of memory, where
  RIP-relative references are mandatory, whereas RIP-relative per-CPU
  variable references can only work correctly from the kernel virtual
  mapping (as they need to wrap around from the negative 2 GiB space
  into the 0x0 based per-CPU region).
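
To make the wrap-around in the last item concrete, here is a minimal
sketch of the address arithmetic. The function names and all addresses
used with them are made up for illustration; nothing here is taken from
the series itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Effective address of a RIP-relative operand: the address of the next
 * instruction plus a sign-extended 32-bit displacement, modulo 2^64.
 */
uint64_t rip_relative_target(uint64_t next_rip, int32_t disp32)
{
	return next_rip + (uint64_t)(int64_t)disp32;
}

/*
 * Displacement that would be encoded for a given link-time RIP and
 * target (hypothetical helper). Asserts that the wrapped difference
 * fits in 32 signed bits.
 */
int32_t link_time_disp32(uint64_t next_rip, uint64_t target)
{
	int64_t diff = (int64_t)(target - next_rip);

	assert(diff >= INT32_MIN && diff <= INT32_MAX);
	return (int32_t)diff;
}
```

Assuming an instruction linked at 0xffffffff81000010 that refers to a
per-CPU variable at 0x1000, the encoded displacement is 0x7f000ff0, and
the sum wraps past the top of the address space to 0x1000. If the same
instruction runs from a 1:1 mapping at, say, 0x1000010, the unchanged
displacement yields 0x80001000 instead, which is not the per-CPU
variable.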

The reason for this odd situation with respect to per-CPU variable
addressing is that we rely on the user-space TLS arrangement for
per-task stack cookies, which was implemented using a fixed offset of
40 bytes from %gs. If we bump the minimum GCC version to 8.1, we can
switch to symbol-based stack cookie references, allowing the same
arrangement to be adopted as on other architectures, i.e., one where the
CPU register carries the per-CPU offset, and UP or boot-time per-CPU
references point into the per-CPU load area directly (using an offset of
0x0).
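
The difference between the two stack cookie schemes can be sketched in
plain C as follows. This is only a model of the address computation:
the struct and function names are hypothetical, and the real access is
a single segment-relative load emitted by the compiler (with GCC this
is, as far as I know, selected via -mstack-protector-guard-symbol=).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the area %gs points at in the legacy scheme. */
struct legacy_gs_area {
	char		reserved[40];	/* pushes the cookie to offset 40 */
	unsigned long	stack_canary;	/* read by the compiler as %gs:40 */
};

static_assert(offsetof(struct legacy_gs_area, stack_canary) == 40,
	      "legacy cookie must sit at offset 40");

/* Legacy scheme: the cookie address is always the segment base plus 40. */
uintptr_t legacy_cookie_addr(uintptr_t gs_base)
{
	return gs_base + offsetof(struct legacy_gs_area, stack_canary);
}

/*
 * Symbol-based scheme: the segment base carries the per-CPU offset and
 * the instruction references the guard symbol itself.
 */
uintptr_t symbol_cookie_addr(uintptr_t percpu_offset, uintptr_t guard_sym)
{
	return percpu_offset + guard_sym;
}
```

With a per-CPU offset of 0x0 (UP or early boot), symbol_cookie_addr()
returns the address of the guard symbol in the load area unchanged,
which is the property the paragraph above relies on.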

With that out of the way, we can untangle this whole thing, and replace
the bespoke tooling and relocation formats with ordinary, linker
generated relocation tables, using the RELR format that reduces the
memory footprint of the relocation table by 20x. The compilers can
efficiently generate position independent code these days, without
unnecessary indirections via the Global Offset Table (GOT) except for a
handful of special cases (see the KVM patch for an example where a
GOT-based indirection is the best choice for pushing the absolute
address of a symbol onto the stack in a position independent manner when
there are no free GPRs).

It also brings us much closer to the ordinary PIE relocation model used
for most of user space, which is therefore much better supported and
less likely to create problems as we increase the range of compilers and
linkers that need to be supported.
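
For readers unfamiliar with RELR, the following is a minimal sketch of
how a 64-bit RELR table is decoded into the list of addresses that need
the load offset added. It follows the generic encoding (an even entry
is an address, an odd entry is a bitmap for the next 63 words); the
function name and signature are made up here and this is not the code
from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode 'n' RELR entries and store the address of every word to be
 * relocated into 'out' (at most 'max' of them). Returns the total
 * number of relocations described by the table.
 */
size_t relr_decode(const uint64_t *relr, size_t n, uint64_t *out, size_t max)
{
	uint64_t where = 0;	/* next address covered by a bitmap entry */
	size_t count = 0;

	assert(out != NULL || max == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t entry = relr[i];

		if ((entry & 1) == 0) {
			/* Even entry: the address of one relocated word. */
			if (count < max)
				out[count] = entry;
			count++;
			where = entry + 8;
			continue;
		}

		/* Odd entry: bit k+1 set means word 'where + 8 * k'. */
		uint64_t bits = entry >> 1;

		for (unsigned int k = 0; bits != 0; k++, bits >>= 1) {
			if (bits & 1) {
				if (count < max)
					out[count] = where + 8 * k;
				count++;
			}
		}
		where += 63 * 8;
	}
	return count;
}
```

A single 8-byte bitmap entry can describe up to 63 relocations, whereas
an Elf64_Rela entry takes 24 bytes for one, which is where the space
saving comes from. The actual ratio depends on how densely the
relocated words are packed.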

Tested with GCC 8 - 14 and Clang 15 - 17, using EFI and bare metal boot
via a variety of entry points (decompressor, EFI stub, XenPV, PVH).
 
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Kees Cook <kees@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Keith Packard <keithp@keithp.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: linux-efi@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-sparse@vger.kernel.org
Cc: linux-kbuild@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Cc: rust-for-linux@vger.kernel.org
Cc: llvm@lists.linux.dev

Ard Biesheuvel (28):
  x86/pvh: Call C code via the kernel virtual mapping
  Documentation: Bump minimum GCC version to 8.1
  x86/tools: Use mmap() to simplify relocs host tool
  x86/boot: Permit GOTPCREL relocations for x86_64 builds
  x86: Define the stack protector guard symbol explicitly
  x86/percpu: Get rid of absolute per-CPU variable placement
  scripts/kallsyms: Avoid 0x0 as the relative base
  scripts/kallsyms: Remove support for absolute per-CPU variables
  x86/tools: Remove special relocation handling for per-CPU variables
  x86/xen: Avoid relocatable quantities in Xen ELF notes
  x86/pvh: Avoid absolute symbol references in .head.text
  x86/pm-trace: Use RIP-relative accesses for .tracedata
  x86/kvm: Use RIP-relative addressing
  x86/rethook: Use RIP-relative reference for return address
  x86/sync_core: Use RIP-relative addressing
  x86/entry_64: Use RIP-relative addressing
  x86/hibernate: Prefer RIP-relative accesses
  x86/boot/64: Determine VA/PA offset before entering C code
  x86/boot/64: Avoid intentional absolute symbol references in
    .head.text
  x64/acpi: Use PIC-compatible references in wakeup_64.S
  x86/head: Use PIC-compatible symbol references in startup code
  asm-generic: Treat PIC .data.rel.ro sections as .rodata
  tools/objtool: Mark generated sections as writable
  tools/objtool: Treat indirect ftrace calls as direct calls
  x86: Use PIE codegen for the core kernel
  x86/boot: Implement support for ELF RELA/RELR relocations
  x86/kernel: Switch to PIE linking for the core kernel
  x86/tools: Drop x86_64 support from 'relocs' tool

 Documentation/admin-guide/README.rst    |   2 +-
 Documentation/arch/x86/zero-page.rst    |   3 +-
 Documentation/process/changes.rst       |   2 +-
 arch/x86/Kconfig                        |   3 +-
 arch/x86/Makefile                       |  22 +-
 arch/x86/boot/Makefile                  |   1 +
 arch/x86/boot/compressed/Makefile       |   2 +-
 arch/x86/boot/compressed/misc.c         |  16 +-
 arch/x86/entry/calling.h                |   9 +-
 arch/x86/entry/entry_64.S               |  12 +-
 arch/x86/entry/vdso/Makefile            |   1 +
 arch/x86/include/asm/desc.h             |   1 -
 arch/x86/include/asm/init.h             |   2 +-
 arch/x86/include/asm/percpu.h           |  22 -
 arch/x86/include/asm/pm-trace.h         |   4 +-
 arch/x86/include/asm/processor.h        |  14 +-
 arch/x86/include/asm/setup.h            |   3 +-
 arch/x86/include/asm/stackprotector.h   |   4 -
 arch/x86/include/asm/sync_core.h        |   3 +-
 arch/x86/include/uapi/asm/bootparam.h   |   2 +-
 arch/x86/kernel/acpi/wakeup_64.S        |  11 +-
 arch/x86/kernel/head64.c                |  76 +++-
 arch/x86/kernel/head_64.S               |  40 +-
 arch/x86/kernel/irq_64.c                |   1 -
 arch/x86/kernel/kvm.c                   |   8 +-
 arch/x86/kernel/relocate_kernel_64.S    |   6 +-
 arch/x86/kernel/rethook.c               |   3 +-
 arch/x86/kernel/setup_percpu.c          |   9 +-
 arch/x86/kernel/vmlinux.lds.S           |  75 ++--
 arch/x86/platform/pvh/head.S            |  57 ++-
 arch/x86/power/hibernate_asm_64.S       |   4 +-
 arch/x86/realmode/rm/Makefile           |   1 +
 arch/x86/tools/Makefile                 |   2 +-
 arch/x86/tools/relocs.c                 | 425 +++-----------------
 arch/x86/tools/relocs.h                 |  11 +-
 arch/x86/tools/relocs_64.c              |  18 -
 arch/x86/tools/relocs_common.c          |  11 +-
 arch/x86/xen/xen-head.S                 |  16 +-
 drivers/base/power/trace.c              |   6 +-
 drivers/firmware/efi/libstub/x86-stub.c |   2 +
 include/asm-generic/vmlinux.lds.h       |  10 +-
 include/linux/compiler.h                |   2 +-
 init/Kconfig                            |   5 -
 kernel/kallsyms.c                       |  12 +-
 scripts/kallsyms.c                      |  53 +--
 scripts/link-vmlinux.sh                 |   4 -
 tools/objtool/check.c                   |  43 +-
 tools/objtool/elf.c                     |   2 +-
 tools/objtool/include/objtool/special.h |   2 +-
 tools/perf/util/annotate.c              |   4 +-
 50 files changed, 380 insertions(+), 667 deletions(-)
 delete mode 100644 arch/x86/tools/relocs_64.c