[v5,00/29] Add support for Clang LTO

Message ID 20201009161338.657380-1-samitolvanen@google.com (mailing list archive)

Sami Tolvanen Oct. 9, 2020, 4:13 p.m. UTC
This patch series adds support for building x86_64 and arm64 kernels
with Clang's Link Time Optimization (LTO).

In addition to performance, the primary motivation for LTO is
to allow Clang's Control-Flow Integrity (CFI) to be used in the
kernel. Google has shipped millions of Pixel devices running three
major kernel versions with LTO+CFI since 2018.

Most of the patches are build system changes for handling LLVM
bitcode, which Clang produces with LTO instead of ELF object files,
postponing ELF processing until a later stage, and ensuring initcall
ordering.
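
As a rough illustration of what "ensuring initcall ordering" has to
preserve, the sketch below orders initcalls first by level and then by the
link order of the object file that defines them. It is not the logic of
the script added by this series: the level list is a simplified subset,
and the object names, function names, and the order_initcalls() helper are
all hypothetical.

```python
# Hypothetical illustration only: simplified levels, invented object and
# function names. This is not the script added by the series.
LEVELS = ["early", "0", "1", "2", "3", "4", "5", "rootfs", "6", "7"]


def order_initcalls(link_order, initcalls):
    """Sort initcalls by level, then by the link order of their object file.

    link_order: object file names in the order they are linked.
    initcalls:  (object_file, level, function_name) tuples in any order.
    """
    obj_rank = {obj: i for i, obj in enumerate(link_order)}
    level_rank = {lvl: i for i, lvl in enumerate(LEVELS)}
    # sorted() is stable, so initcalls that share an object file and a
    # level keep the relative order they had in the input.
    return sorted(initcalls, key=lambda c: (level_rank[c[1]], obj_rank[c[0]]))


link_order = ["init/main.o", "kernel/fork.o", "drivers/foo.o"]
shuffled = [
    ("drivers/foo.o", "6", "foo_init"),
    ("kernel/fork.o", "1", "fork_core_init"),
    ("init/main.o", "6", "main_dev_init"),
    ("init/main.o", "early", "main_early"),
]

for obj, level, func in order_initcalls(link_order, shuffled):
    print(level, obj, func)
```

The point of the sketch is only that the resulting order depends on the
original link order, so it has to be recorded before object boundaries
disappear at LTO link time.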

Note that this version is based on tip/master to reduce the number
of prerequisite patches, and to make it easier to manage changes to
objtool. Patch 1 is from Masahiro's kbuild tree, and while it's not
directly related to LTO, it makes the module linker script changes
cleaner.

Furthermore, patches 2-6 include Peter's patch for generating
__mcount_loc with objtool, and build system changes to enable it on
x86. With these patches, we no longer need to annotate functions
that have non-call references to __fentry__ with LTO, which greatly
simplifies supporting dynamic ftrace.

You can also pull this series from

  https://github.com/samitolvanen/linux.git lto-v5

---
Changes in v5:

  - Rebased on top of tip/master.

  - Changed the command line for objtool to use --vmlinux --duplicate
    to disable warnings about retpoline thunks and to fix .orc_unwind
    generation for vmlinux.o.

  - Added --noinstr flag to objtool, so we can use --vmlinux without
    also enabling noinstr validation.

  - Disabled objtool's unreachable instruction warnings with LTO to
    avoid false positives for the int3 padding in vmlinux.o.

  - Added ANNOTATE_RETPOLINE_SAFE annotations to the indirect jumps
    in x86 assembly code to fix objtool warnings with retpoline.

  - Fixed modpost warnings about missing version information with
    CONFIG_MODVERSIONS.

  - Included Makefile.lib into Makefile.modpost for ld_flags. Thanks
    to Sedat for pointing this out.

  - Updated the help text for ThinLTO to better explain the trade-offs.

  - Updated commit messages with better explanations.

Changes in v4:

  - Fixed a typo in Makefile.lib to correctly pass --no-fp to objtool.

  - Moved ftrace configs related to generating __mcount_loc to Kconfig,
    so they are available also in Makefile.modfinal.

  - Dropped two prerequisite patches that were merged to Linus' tree.

Changes in v3:

  - Added a separate patch to remove the unused DISABLE_LTO treewide,
    as filtering out CC_FLAGS_LTO instead is preferred.

  - Updated the Kconfig help to explain why LTO is behind a choice
    and disabled by default.

  - Dropped CC_FLAGS_LTO_CLANG, compiler-specific LTO flags are now
    appended directly to CC_FLAGS_LTO.

  - Updated $(AR) flags as KBUILD_ARFLAGS was removed earlier.

  - Fixed ThinLTO cache handling for external module builds.

  - Rebased on top of Masahiro's patch for preprocessing module.lds,
    and moved the contents of module-lto.lds to module.lds.S.

  - Moved objtool_args to Makefile.lib to avoid duplication of the
    command line parameters in Makefile.modfinal.

  - Clarified in the commit message for the initcall ordering patch
    that the initcall order remains the same as without LTO.

  - Changed link-vmlinux.sh to use jobserver-exec to control the
    number of jobs started by generate_initcall_order.pl.

  - Dropped the x86/relocs patch to whitelist L4_PAGE_OFFSET as it's
    no longer needed with ToT kernel.

  - Disabled LTO for arch/x86/power/cpu.c to work around a Clang bug
    with stack protector attributes.

Changes in v2:

  - Fixed -Wmissing-prototypes warnings with W=1.

  - Dropped cc-option from -fsplit-lto-unit and added .thinlto-cache
    scrubbing to make distclean.

  - Added a comment about Clang >=11 being required.

  - Added a patch to disable LTO for the arm64 KVM nVHE code.

  - Disabled objtool's noinstr validation with LTO unless enabled.

  - Included Peter's proposed objtool mcount patch in the series
    and replaced recordmcount with the objtool pass to avoid
    whitelisting relocations that are not calls.

  - Updated several commit messages with better explanations.


Masahiro Yamada (1):
  kbuild: preprocess module linker script

Peter Zijlstra (1):
  objtool: Add a pass for generating __mcount_loc

Sami Tolvanen (27):
  objtool: Don't autodetect vmlinux.o
  tracing: move function tracer options to Kconfig
  tracing: add support for objtool mcount
  x86, build: use objtool mcount
  treewide: remove DISABLE_LTO
  kbuild: add support for Clang LTO
  kbuild: lto: fix module versioning
  objtool: Split noinstr validation from --vmlinux
  kbuild: lto: postpone objtool
  kbuild: lto: limit inlining
  kbuild: lto: merge module sections
  kbuild: lto: remove duplicate dependencies from .mod files
  init: lto: ensure initcall ordering
  init: lto: fix PREL32 relocations
  PCI: Fix PREL32 relocations for LTO
  modpost: lto: strip .lto from module names
  scripts/mod: disable LTO for empty.c
  efi/libstub: disable LTO
  drivers/misc/lkdtm: disable LTO for rodata.o
  arm64: vdso: disable LTO
  KVM: arm64: disable LTO for the nVHE directory
  arm64: disable recordmcount with DYNAMIC_FTRACE_WITH_REGS
  arm64: allow LTO_CLANG and THINLTO to be selected
  x86/asm: annotate indirect jumps
  x86, vdso: disable LTO only for vDSO
  x86, cpu: disable LTO for cpu.c
  x86, build: allow LTO_CLANG and THINLTO to be selected

 .gitignore                                    |   1 +
 Makefile                                      |  68 +++--
 arch/Kconfig                                  |  74 +++++
 arch/arm/Makefile                             |   4 -
 .../module.lds => include/asm/module.lds.h}   |   2 +
 arch/arm64/Kconfig                            |   4 +
 arch/arm64/Makefile                           |   4 -
 .../module.lds => include/asm/module.lds.h}   |   2 +
 arch/arm64/kernel/vdso/Makefile               |   4 +-
 arch/arm64/kvm/hyp/nvhe/Makefile              |   4 +-
 arch/ia64/Makefile                            |   1 -
 .../{module.lds => include/asm/module.lds.h}  |   0
 arch/m68k/Makefile                            |   1 -
 .../module.lds => include/asm/module.lds.h}   |   0
 arch/powerpc/Makefile                         |   1 -
 .../module.lds => include/asm/module.lds.h}   |   0
 arch/riscv/Makefile                           |   3 -
 .../module.lds => include/asm/module.lds.h}   |   3 +-
 arch/sparc/vdso/Makefile                      |   2 -
 arch/um/include/asm/Kbuild                    |   1 +
 arch/x86/Kconfig                              |   3 +
 arch/x86/Makefile                             |   5 +
 arch/x86/entry/vdso/Makefile                  |   5 +-
 arch/x86/kernel/acpi/wakeup_64.S              |   2 +
 arch/x86/platform/pvh/head.S                  |   2 +
 arch/x86/power/Makefile                       |   4 +
 arch/x86/power/hibernate_asm_64.S             |   3 +
 drivers/firmware/efi/libstub/Makefile         |   2 +
 drivers/misc/lkdtm/Makefile                   |   1 +
 include/asm-generic/Kbuild                    |   1 +
 include/asm-generic/module.lds.h              |  10 +
 include/asm-generic/vmlinux.lds.h             |  11 +-
 include/linux/init.h                          |  79 ++++-
 include/linux/pci.h                           |  19 +-
 kernel/Makefile                               |   3 -
 kernel/trace/Kconfig                          |  29 ++
 scripts/.gitignore                            |   1 +
 scripts/Makefile                              |   3 +
 scripts/Makefile.build                        |  69 +++--
 scripts/Makefile.lib                          |  17 +-
 scripts/Makefile.modfinal                     |  29 +-
 scripts/Makefile.modpost                      |  25 +-
 scripts/generate_initcall_order.pl            | 270 ++++++++++++++++++
 scripts/link-vmlinux.sh                       |  98 ++++++-
 scripts/mod/Makefile                          |   1 +
 scripts/mod/modpost.c                         |  16 +-
 scripts/mod/modpost.h                         |   9 +
 scripts/mod/sumversion.c                      |   6 +-
 scripts/{module-common.lds => module.lds.S}   |  31 ++
 scripts/package/builddeb                      |   2 +-
 tools/objtool/builtin-check.c                 |  10 +-
 tools/objtool/check.c                         |  84 +++++-
 tools/objtool/include/objtool/builtin.h       |   2 +-
 tools/objtool/include/objtool/check.h         |   1 +
 tools/objtool/include/objtool/objtool.h       |   1 +
 tools/objtool/objtool.c                       |   1 +
 56 files changed, 903 insertions(+), 131 deletions(-)
 rename arch/arm/{kernel/module.lds => include/asm/module.lds.h} (72%)
 rename arch/arm64/{kernel/module.lds => include/asm/module.lds.h} (76%)
 rename arch/ia64/{module.lds => include/asm/module.lds.h} (100%)
 rename arch/m68k/{kernel/module.lds => include/asm/module.lds.h} (100%)
 rename arch/powerpc/{kernel/module.lds => include/asm/module.lds.h} (100%)
 rename arch/riscv/{kernel/module.lds => include/asm/module.lds.h} (84%)
 create mode 100644 include/asm-generic/module.lds.h
 create mode 100755 scripts/generate_initcall_order.pl
 rename scripts/{module-common.lds => module.lds.S} (59%)


base-commit: 80396d76da65fc8b82581c0260c25a6aa0a495a3

Comments

Sedat Dilek Oct. 9, 2020, 4:30 p.m. UTC | #1
On Fri, Oct 9, 2020 at 6:13 PM 'Sami Tolvanen' via Clang Built Linux
<clang-built-linux@googlegroups.com> wrote:
>
> This patch series adds support for building x86_64 and arm64 kernels
> with Clang's Link Time Optimization (LTO).
>
> In addition to performance, the primary motivation for LTO is
> to allow Clang's Control-Flow Integrity (CFI) to be used in the
> kernel. Google has shipped millions of Pixel devices running three
> major kernel versions with LTO+CFI since 2018.
>
> Most of the patches are build system changes for handling LLVM
> bitcode, which Clang produces with LTO instead of ELF object files,
> postponing ELF processing until a later stage, and ensuring initcall
> ordering.
>
> Note that this version is based on tip/master to reduce the number
> of prerequisite patches, and to make it easier to manage changes to
> objtool. Patch 1 is from Masahiro's kbuild tree, and while it's not
> directly related to LTO, it makes the module linker script changes
> cleaner.
>
> Furthermore, patches 2-6 include Peter's patch for generating
> __mcount_loc with objtool, and build system changes to enable it on
> x86. With these patches, we no longer need to annotate functions
> that have non-call references to __fentry__ with LTO, which greatly
> simplifies supporting dynamic ftrace.
>
> You can also pull this series from
>
>   https://github.com/samitolvanen/linux.git lto-v5
>
> ---
> Changes in v5:
>
>   - Rebased on top of tip/master.
>

What are the plans to get this into mainline?
Linux v5.10 :-) too early - needs more review/testing?

Will clang-cfi be based on this, too?

>   - Changed the command line for objtool to use --vmlinux --duplicate
>     to disable warnings about retpoline thunks and to fix .orc_unwind
>     generation for vmlinux.o.
>
>   - Added --noinstr flag to objtool, so we can use --vmlinux without
>     also enabling noinstr validation.
>
>   - Disabled objtool's unreachable instruction warnings with LTO to
>     disable false positives for the int3 padding in vmlinux.o.
>
>   - Added ANNOTATE_RETPOLINE_SAFE annotations to the indirect jumps
>     in x86 assembly code to fix objtool warnings with retpoline.
>
>   - Fixed modpost warnings about missing version information with
>     CONFIG_MODVERSIONS.
>
>   - Included Makefile.lib into Makefile.modpost for ld_flags. Thanks
>     to Sedat for pointing this out.
>

It took a long time to track this down, because I was generating very
large Debian Linux debug packages with CONFIG_DEBUG_INFO_COMPRESSED=y.

Thanks for v5 of clang-lto.

- Sedat -

[1] https://github.com/ClangBuiltLinux/linux/issues/1086#issuecomment-705754002

Steven Rostedt Oct. 9, 2020, 7:35 p.m. UTC | #2
On Fri,  9 Oct 2020 09:13:09 -0700
Sami Tolvanen <samitolvanen@google.com> wrote:

> This patch series adds support for building x86_64 and arm64 kernels
> with Clang's Link Time Optimization (LTO).
> 
> In addition to performance, the primary motivation for LTO is
> to allow Clang's Control-Flow Integrity (CFI) to be used in the
> kernel. Google has shipped millions of Pixel devices running three
> major kernel versions with LTO+CFI since 2018.
> 
> Most of the patches are build system changes for handling LLVM
> bitcode, which Clang produces with LTO instead of ELF object files,
> postponing ELF processing until a later stage, and ensuring initcall
> ordering.
> 
> Note that this version is based on tip/master to reduce the number
> of prerequisite patches, and to make it easier to manage changes to
> objtool. Patch 1 is from Masahiro's kbuild tree, and while it's not
> directly related to LTO, it makes the module linker script changes
> cleaner.
> 

I went to test this, but it appears that the latest tip/master fails to
build for me. This error is on tip/master, before I even applied a single
patch.

(config attached)

-- Steve

  SYSMAP  System.map
  HOSTCC  arch/x86/tools/insn_decoder_test
  HOSTCC  arch/x86/tools/insn_sanity
  MODPOST Module.symvers
In file included from /work/git/linux-test.git/include/uapi/linux/byteorder/little_endian.h:12,
                 from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:5,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/tools/include/linux/types.h:30:18: error: conflicting types for ‘u64’
   30 | typedef uint64_t u64;
      |                  ^~~
In file included from /usr/include/asm-generic/types.h:7,
                 from /usr/include/asm/types.h:1,
                 from /work/git/linux-test.git/tools/include/linux/types.h:10,
                 from /work/git/linux-test.git/include/uapi/linux/byteorder/little_endian.h:12,
                 from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:5,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/asm-generic/int-ll64.h:23:15: note: previous declaration of ‘u64’ was here
   23 | typedef __u64 u64;
      |               ^~~
In file included from /work/git/linux-test.git/include/uapi/linux/byteorder/little_endian.h:12,
                 from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:5,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/tools/include/linux/types.h:31:17: error: conflicting types for ‘s64’
   31 | typedef int64_t s64;
      |                 ^~~
In file included from /usr/include/asm-generic/types.h:7,
                 from /usr/include/asm/types.h:1,
                 from /work/git/linux-test.git/tools/include/linux/types.h:10,
                 from /work/git/linux-test.git/include/uapi/linux/byteorder/little_endian.h:12,
                 from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:5,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/asm-generic/int-ll64.h:22:15: note: previous declaration of ‘s64’ was here
   22 | typedef __s64 s64;
      |               ^~~
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:87: warning: "cpu_to_le16" redefined
   87 | #define cpu_to_le16
      | 
In file included from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:11,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/linux/byteorder/generic.h:90: note: this is the location of the previous definition
   90 | #define cpu_to_le16 __cpu_to_le16
      | 
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:88: warning: "cpu_to_le32" redefined
   88 | #define cpu_to_le32
      | 
In file included from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:11,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/linux/byteorder/generic.h:88: note: this is the location of the previous definition
   88 | #define cpu_to_le32 __cpu_to_le32
      | 
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:89: warning: "cpu_to_le64" redefined
   89 | #define cpu_to_le64
      | 
In file included from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:11,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/linux/byteorder/generic.h:86: note: this is the location of the previous definition
   86 | #define cpu_to_le64 __cpu_to_le64
      | 
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:90: warning: "le16_to_cpu" redefined
   90 | #define le16_to_cpu
      | 
In file included from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:11,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/linux/byteorder/generic.h:91: note: this is the location of the previous definition
   91 | #define le16_to_cpu __le16_to_cpu
      | 
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:91: warning: "le32_to_cpu" redefined
   91 | #define le32_to_cpu
      | 
In file included from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:11,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/linux/byteorder/generic.h:89: note: this is the location of the previous definition
   89 | #define le32_to_cpu __le32_to_cpu
      | 
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:92: warning: "le64_to_cpu" redefined
   92 | #define le64_to_cpu
      | 
In file included from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:11,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/linux/byteorder/generic.h:87: note: this is the location of the previous definition
   87 | #define le64_to_cpu __le64_to_cpu
      | 
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:93: warning: "cpu_to_be16" redefined
   93 | #define cpu_to_be16 bswap_16
      | 
In file included from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:11,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/linux/byteorder/generic.h:96: note: this is the location of the previous definition
   96 | #define cpu_to_be16 __cpu_to_be16
      | 
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:94: warning: "cpu_to_be32" redefined
   94 | #define cpu_to_be32 bswap_32
      | 
In file included from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:11,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/linux/byteorder/generic.h:94: note: this is the location of the previous definition
   94 | #define cpu_to_be32 __cpu_to_be32
      | 
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:95: warning: "cpu_to_be64" redefined
   95 | #define cpu_to_be64 bswap_64
      | 
In file included from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:11,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/linux/byteorder/generic.h:92: note: this is the location of the previous definition
   92 | #define cpu_to_be64 __cpu_to_be64
      | 
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:96: warning: "be16_to_cpu" redefined
   96 | #define be16_to_cpu bswap_16
      | 
In file included from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:11,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/linux/byteorder/generic.h:97: note: this is the location of the previous definition
   97 | #define be16_to_cpu __be16_to_cpu
      | 
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:97: warning: "be32_to_cpu" redefined
   97 | #define be32_to_cpu bswap_32
      | 
In file included from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:11,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/linux/byteorder/generic.h:95: note: this is the location of the previous definition
   95 | #define be32_to_cpu __be32_to_cpu
      | 
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:98: warning: "be64_to_cpu" redefined
   98 | #define be64_to_cpu bswap_64
      | 
In file included from /work/git/linux-test.git/include/linux/byteorder/little_endian.h:11,
                 from /usr/include/asm/byteorder.h:5,
                 from /work/git/linux-test.git/arch/x86/include/asm/insn.h:10,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:21:
/work/git/linux-test.git/include/linux/byteorder/generic.h:93: note: this is the location of the previous definition
   93 | #define be64_to_cpu __be64_to_cpu
      | 
In file included from /work/git/linux-test.git/arch/x86/lib/insn.c:8,
                 from /work/git/linux-test.git/arch/x86/tools/insn_sanity.c:23:
/work/git/linux-test.git/tools/include/linux/kernel.h:105: warning: "ARRAY_SIZE" redefined
  105 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
      | 
/work/git/linux-test.git/arch/x86/tools/insn_sanity.c:19: note: this is the location of the previous definition
   19 | #define ARRAY_SIZE(a) (sizeof(a)/sizeof(a[0]))
      | 
make[2]: *** [scripts/Makefile.host:95: arch/x86/tools/insn_sanity] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [arch/x86/Makefile:267: bzImage] Error 2
make[1]: *** Waiting for unfinished jobs....
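
The "redefined" warnings in the log above share one pattern: the same macro name is defined twice with different bodies, and the preprocessor reports the second definition. As a minimal sketch (an illustration only, not the fix that went into tip/master), guarding the later definition with #ifndef keeps whichever definition came first. The first macro body is copied from the log; the array and the helper sample_opcode_count() are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stddef.h>

/* First definition, as reported for arch/x86/tools/insn_sanity.c:19. */
#define ARRAY_SIZE(a) (sizeof(a)/sizeof(a[0]))

/*
 * A later header that defines the macro unconditionally with a different
 * body produces the '"ARRAY_SIZE" redefined' warning. Guarding the later
 * definition means it is skipped when the name is already defined.
 */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
#endif

/* Hypothetical sample table, only here to exercise the macro. */
static const unsigned char sample_opcodes[] = { 0x90, 0xcc, 0xc3 };

/* Hypothetical helper: number of entries in the sample table. */
static size_t sample_opcode_count(void)
{
	assert(ARRAY_SIZE(sample_opcodes) > 0);
	return ARRAY_SIZE(sample_opcodes);
}
```

The guard only silences the redefinition; it does not make the two bodies equivalent, so it is appropriate only when either definition is acceptable to the including file.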
Sami Tolvanen Oct. 9, 2020, 8:50 p.m. UTC | #3
On Fri, Oct 09, 2020 at 06:30:24PM +0200, Sedat Dilek wrote:
> Will clang-cfi be based on this, too?

At least until the prerequisite patches are merged into mainline. In the
meantime, I have a CFI tree based on this series here:

  https://github.com/samitolvanen/linux/tree/tip/clang-lto

Sami
Sami Tolvanen Oct. 9, 2020, 9:05 p.m. UTC | #4
On Fri, Oct 09, 2020 at 03:35:12PM -0400, Steven Rostedt wrote:
> On Fri,  9 Oct 2020 09:13:09 -0700
> Sami Tolvanen <samitolvanen@google.com> wrote:
> 
> > This patch series adds support for building x86_64 and arm64 kernels
> > with Clang's Link Time Optimization (LTO).
> > 
> > In addition to performance, the primary motivation for LTO is
> > to allow Clang's Control-Flow Integrity (CFI) to be used in the
> > kernel. Google has shipped millions of Pixel devices running three
> > major kernel versions with LTO+CFI since 2018.
> > 
> > Most of the patches are build system changes for handling LLVM
> > bitcode, which Clang produces with LTO instead of ELF object files,
> > postponing ELF processing until a later stage, and ensuring initcall
> > ordering.
> > 
> > Note that this version is based on tip/master to reduce the number
> > of prerequisite patches, and to make it easier to manage changes to
> > objtool. Patch 1 is from Masahiro's kbuild tree, and while it's not
> > directly related to LTO, it makes the module linker script changes
> > cleaner.
> > 
> 
> I went to test this, but it appears that the latest tip/master fails to
> build for me. This error is on tip/master, before I even applied a single
> patch.
> 
> (config attached)

Ah yes, X86_DECODER_SELFTEST seems to be broken in tip/master. If you
prefer, I have these patches on top of mainline here:

  https://github.com/samitolvanen/linux/tree/clang-lto

Testing your config with LTO on this tree, it does build and boot for
me, although I saw a couple of new objtool warnings, and with LLVM=1,
one warning from llvm-objdump.

Sami
Steven Rostedt Oct. 9, 2020, 11:38 p.m. UTC | #5
On Fri, 9 Oct 2020 14:05:48 -0700
Sami Tolvanen <samitolvanen@google.com> wrote:

> Ah yes, X86_DECODER_SELFTEST seems to be broken in tip/master. If you
> prefer, I have these patches on top of mainline here:
> 
>   https://github.com/samitolvanen/linux/tree/clang-lto
> 
> Testing your config with LTO on this tree, it does build and boot for
> me, although I saw a couple of new objtool warnings, and with LLVM=1,
> one warning from llvm-objdump.

Thanks, I disabled X86_DECODER_SELFTEST and it now builds.

I forced the objtool mcount logic with the below patch, which produces:

CONFIG_FTRACE_MCOUNT_RECORD=y
CONFIG_FTRACE_MCOUNT_USE_OBJTOOL=y

But I don't see the __mcount_loc sections being created.

I applied patches 1 - 6.

-- Steve

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 89263210ab26..3042619e21b7 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -606,7 +606,7 @@ config FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
 
 config FTRACE_MCOUNT_USE_CC
 	def_bool y
-	depends on $(cc-option,-mrecord-mcount)
+	depends on $(cc-option,-mrecord-mcount1)
 	depends on !FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
 	depends on FTRACE_MCOUNT_RECORD
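
For readers unfamiliar with __mcount_loc: it is an ELF section that collects the addresses of the mcount/__fentry__ call sites so that ftrace can find and patch them. The sketch below is a userspace analogy of that "collect addresses in a named section, then walk it" technique, not kernel or objtool code. It assumes GCC or Clang on an ELF target with GNU ld or lld, which provide __start_<name> and __stop_<name> symbols for sections whose name is a valid C identifier. The section name demo_mcount_loc, the RECORD() macro and recorded_count() are hypothetical, and the example records function addresses rather than call-site addresses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type; the real section stores call-site addresses. */
typedef void (*traced_fn)(void);

static void foo(void) {}
static void bar(void) {}

/* Emit one pointer per function into the section "demo_mcount_loc". */
#define RECORD(fn) \
	static traced_fn const record_##fn \
	__attribute__((used, section("demo_mcount_loc"))) = fn

RECORD(foo);
RECORD(bar);

/* Section bounds; assumed to be provided by the linker (GNU ld, lld). */
extern traced_fn const __start_demo_mcount_loc[];
extern traced_fn const __stop_demo_mcount_loc[];

/* Number of entries the linker gathered into the section. */
static size_t recorded_count(void)
{
	return (size_t)(__stop_demo_mcount_loc - __start_demo_mcount_loc);
}
```

If nothing emits entries into the section, the bounds symbols are not defined and the link fails, which is loosely analogous to the missing __mcount_loc sections reported above.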
Sami Tolvanen Oct. 10, 2020, midnight UTC | #6
On Fri, Oct 9, 2020 at 4:38 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 9 Oct 2020 14:05:48 -0700
> Sami Tolvanen <samitolvanen@google.com> wrote:
>
> > Ah yes, X86_DECODER_SELFTEST seems to be broken in tip/master. If you
> > prefer, I have these patches on top of mainline here:
> >
> >   https://github.com/samitolvanen/linux/tree/clang-lto
> >
> > Testing your config with LTO on this tree, it does build and boot for
> > me, although I saw a couple of new objtool warnings, and with LLVM=1,
> > one warning from llvm-objdump.
>
> Thanks, I disabled X86_DECODER_SELFTEST and it now builds.
>
> I forced the objtool mcount logic with the below patch, which produces:
>
> CONFIG_FTRACE_MCOUNT_RECORD=y
> CONFIG_FTRACE_MCOUNT_USE_OBJTOOL=y
>
> But I don't see the __mcount_loc sections being created.
>
> I applied patches 1 - 6.

Patch 6 is missing the part where we actually pass --mcount to
objtool; that part is in patch 11 ("kbuild: lto: postpone objtool"). I'll
fix this in v6. In the meantime, please apply patches 1-11 to test the
objtool change. Do you have any thoughts about the approach otherwise?

Sami