Message ID | 20241102175115.1769468-2-xur@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Add AutoFDO and Propeller support for Clang build | expand |
On 02.11.24 18:51, Rong Xu wrote: > Add the build support for using Clang's AutoFDO. Building the kernel > with AutoFDO does not reduce the optimization level from the > compiler. AutoFDO uses hardware sampling to gather information about > the frequency of execution of different code paths within a binary. > This information is then used to guide the compiler's optimization > decisions, resulting in a more efficient binary. Experiments > showed that the kernel can improve up to 10% in latency. > > The support requires a Clang compiler after LLVM 17. This submission > is limited to x86 platforms that support PMU features like LBR on > Intel machines and AMD Zen3 BRS. Support for SPE on ARM 1, > and BRBE on ARM 1 is part of planned future work. > > Here is an example workflow for AutoFDO kernel: > > 1) Build the kernel on the host machine with LLVM enabled, for example, > $ make menuconfig LLVM=1 > Turn on AutoFDO build config: > CONFIG_AUTOFDO_CLANG=y > With a configuration that has LLVM enabled, use the following > command: > scripts/config -e AUTOFDO_CLANG > After getting the config, build with > $ make LLVM=1 > > 2) Install the kernel on the test machine. > > 3) Run the load tests. The '-c' option in perf specifies the sample > event period. We suggest using a suitable prime number, > like 500009, for this purpose. > For Intel platforms: > $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> \ > -o <perf_file> -- <loadtest> > For AMD platforms: > The supported system are: Zen3 with BRS, or Zen4 with amd_lbr_v2 > For Zen3: > $ cat proc/cpuinfo | grep " brs" > For Zen4: > $ cat proc/cpuinfo | grep amd_lbr_v2 > $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a \ > -N -b -c <count> -o <perf_file> -- <loadtest> > > 4) (Optional) Download the raw perf file to the host machine. > > 5) To generate an AutoFDO profile, two offline tools are available: > create_llvm_prof and llvm_profgen. The create_llvm_prof tool is part > of the AutoFDO project and can be found on GitHub > (https://github.com/google/autofdo), version v0.30.1 or later. The > llvm_profgen tool is included in the LLVM compiler itself. It's > important to note that the version of llvm_profgen doesn't need to > match the version of Clang. It needs to be the LLVM 19 release or > later, or from the LLVM trunk. > $ llvm-profgen --kernel --binary=<vmlinux> --perfdata=<perf_file> \ > -o <profile_file> > or > $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \ > --format=extbinary --out=<profile_file> > > Note that multiple AutoFDO profile files can be merged into one via: > $ llvm-profdata merge -o <profile_file> <profile_1> ... <profile_n> > > 6) Rebuild the kernel using the AutoFDO profile file with the same config > as step 1, (Note CONFIG_AUTOFDO_CLANG needs to be enabled): > $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file> > > Co-developed-by: Han Shen<shenhan@google.com> > Signed-off-by: Han Shen<shenhan@google.com> > Signed-off-by: Rong Xu<xur@google.com> > Suggested-by: Sriraman Tallam<tmsriram@google.com> > Suggested-by: Krzysztof Pszeniczny<kpszeniczny@google.com> > Suggested-by: Nick Desaulniers<ndesaulniers@google.com> > Suggested-by: Stephane Eranian<eranian@google.com> > Tested-by: Yonghong Song<yonghong.song@linux.dev> > Tested-by: Yabin Cui<yabinc@google.com> > Tested-by: Nathan Chancellor<nathan@kernel.org> > Reviewed-by: Kees Cook<kees@kernel.org> Tested-by: Peter Jung <ptr1337@cachyos.org>
On 02.11.24 20:46, Peter Jung wrote: > > > On 02.11.24 18:51, Rong Xu wrote: >> Add the build support for using Clang's AutoFDO. Building the kernel >> with AutoFDO does not reduce the optimization level from the >> compiler. AutoFDO uses hardware sampling to gather information about >> the frequency of execution of different code paths within a binary. >> This information is then used to guide the compiler's optimization >> decisions, resulting in a more efficient binary. Experiments >> showed that the kernel can improve up to 10% in latency. >> >> The support requires a Clang compiler after LLVM 17. This submission >> is limited to x86 platforms that support PMU features like LBR on >> Intel machines and AMD Zen3 BRS. Support for SPE on ARM 1, >> and BRBE on ARM 1 is part of planned future work. >> >> Here is an example workflow for AutoFDO kernel: >> >> 1) Build the kernel on the host machine with LLVM enabled, for example, >> $ make menuconfig LLVM=1 >> Turn on AutoFDO build config: >> CONFIG_AUTOFDO_CLANG=y >> With a configuration that has LLVM enabled, use the following >> command: >> scripts/config -e AUTOFDO_CLANG >> After getting the config, build with >> $ make LLVM=1 >> >> 2) Install the kernel on the test machine. >> >> 3) Run the load tests. The '-c' option in perf specifies the sample >> event period. We suggest using a suitable prime number, >> like 500009, for this purpose. >> For Intel platforms: >> $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c >> <count> \ >> -o <perf_file> -- <loadtest> >> For AMD platforms: >> The supported system are: Zen3 with BRS, or Zen4 with amd_lbr_v2 >> For Zen3: >> $ cat proc/cpuinfo | grep " brs" >> For Zen4: >> $ cat proc/cpuinfo | grep amd_lbr_v2 >> $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k >> -a \ >> -N -b -c <count> -o <perf_file> -- <loadtest> >> >> 4) (Optional) Download the raw perf file to the host machine. >> >> 5) To generate an AutoFDO profile, two offline tools are available: >> create_llvm_prof and llvm_profgen. The create_llvm_prof tool is part >> of the AutoFDO project and can be found on GitHub >> (https://github.com/google/autofdo), version v0.30.1 or later. The >> llvm_profgen tool is included in the LLVM compiler itself. It's >> important to note that the version of llvm_profgen doesn't need to >> match the version of Clang. It needs to be the LLVM 19 release or >> later, or from the LLVM trunk. >> $ llvm-profgen --kernel --binary=<vmlinux> -- >> perfdata=<perf_file> \ >> -o <profile_file> >> or >> $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \ >> --format=extbinary --out=<profile_file> >> >> Note that multiple AutoFDO profile files can be merged into one via: >> $ llvm-profdata merge -o <profile_file> <profile_1> ... >> <profile_n> >> >> 6) Rebuild the kernel using the AutoFDO profile file with the same config >> as step 1, (Note CONFIG_AUTOFDO_CLANG needs to be enabled): >> $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file> >> >> Co-developed-by: Han Shen<shenhan@google.com> >> Signed-off-by: Han Shen<shenhan@google.com> >> Signed-off-by: Rong Xu<xur@google.com> >> Suggested-by: Sriraman Tallam<tmsriram@google.com> >> Suggested-by: Krzysztof Pszeniczny<kpszeniczny@google.com> >> Suggested-by: Nick Desaulniers<ndesaulniers@google.com> >> Suggested-by: Stephane Eranian<eranian@google.com> >> Tested-by: Yonghong Song<yonghong.song@linux.dev> >> Tested-by: Yabin Cui<yabinc@google.com> >> Tested-by: Nathan Chancellor<nathan@kernel.org> >> Reviewed-by: Kees Cook<kees@kernel.org> > > Tested-by: Peter Jung <ptr1337@cachyos.org> > The compilations and testing with the "make pacman-pkg" function from the kernel worked fine. One problem I do face: When I apply a AutoFDO profile together with the PKGBUILD [1] from archlinux im running into issues at "module_install" at the packaging. See following log: ``` make[2]: *** [scripts/Makefile.modinst:125: /tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/arch/x86/kvm/kvm.ko] Error 1 make[2]: *** Deleting file '/tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/arch/x86/kvm/kvm.ko' INSTALL /tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/crypto/cryptd.ko make[2]: *** Waiting for unfinished jobs.... ``` This can be fixed with removed "INSTALL_MOD_STRIP=1" to the passed parameters of module_install. This explicitly only happens, if a profile is passed - otherwise the packaging works without problems. Regards, Peter Jung
Hi Peter, thanks for reporting the issue. I am trying to reproduce it in the up-to-date archlinux environment. Below is what I have: 0. pacman -Syu 1. cloned archlinux build files from https://aur.archlinux.org/linux-mainline.git the newest mainline version is 6.12rc5-1. 2. changed the PKGBUILD file to include the patches series 3. changed the "config" to turn on clang autofdo 4. collected afdo profiles 5. MAKEFLAGS="-j48 V=1 LLVM=1 CLANG_AUTOFDO_PROFILE=$(pwd)/perf.afdo" \ makepkg -s --skipinteg --skippgp 6. install and reboot The above steps succeeded. You mentioned the error happens at "module_install", can you instruct me how to execute the "module_install" step? Thanks, Han On Sat, Nov 2, 2024 at 12:53 PM Peter Jung <ptr1337@cachyos.org> wrote: > > > > On 02.11.24 20:46, Peter Jung wrote: > > > > > > On 02.11.24 18:51, Rong Xu wrote: > >> Add the build support for using Clang's AutoFDO. Building the kernel > >> with AutoFDO does not reduce the optimization level from the > >> compiler. AutoFDO uses hardware sampling to gather information about > >> the frequency of execution of different code paths within a binary. > >> This information is then used to guide the compiler's optimization > >> decisions, resulting in a more efficient binary. Experiments > >> showed that the kernel can improve up to 10% in latency. > >> > >> The support requires a Clang compiler after LLVM 17. This submission > >> is limited to x86 platforms that support PMU features like LBR on > >> Intel machines and AMD Zen3 BRS. Support for SPE on ARM 1, > >> and BRBE on ARM 1 is part of planned future work. > >> > >> Here is an example workflow for AutoFDO kernel: > >> > >> 1) Build the kernel on the host machine with LLVM enabled, for example, > >> $ make menuconfig LLVM=1 > >> Turn on AutoFDO build config: > >> CONFIG_AUTOFDO_CLANG=y > >> With a configuration that has LLVM enabled, use the following > >> command: > >> scripts/config -e AUTOFDO_CLANG > >> After getting the config, build with > >> $ make LLVM=1 > >> > >> 2) Install the kernel on the test machine. > >> > >> 3) Run the load tests. The '-c' option in perf specifies the sample > >> event period. We suggest using a suitable prime number, > >> like 500009, for this purpose. > >> For Intel platforms: > >> $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c > >> <count> \ > >> -o <perf_file> -- <loadtest> > >> For AMD platforms: > >> The supported system are: Zen3 with BRS, or Zen4 with amd_lbr_v2 > >> For Zen3: > >> $ cat proc/cpuinfo | grep " brs" > >> For Zen4: > >> $ cat proc/cpuinfo | grep amd_lbr_v2 > >> $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k > >> -a \ > >> -N -b -c <count> -o <perf_file> -- <loadtest> > >> > >> 4) (Optional) Download the raw perf file to the host machine. > >> > >> 5) To generate an AutoFDO profile, two offline tools are available: > >> create_llvm_prof and llvm_profgen. The create_llvm_prof tool is part > >> of the AutoFDO project and can be found on GitHub > >> (https://github.com/google/autofdo), version v0.30.1 or later. The > >> llvm_profgen tool is included in the LLVM compiler itself. It's > >> important to note that the version of llvm_profgen doesn't need to > >> match the version of Clang. It needs to be the LLVM 19 release or > >> later, or from the LLVM trunk. > >> $ llvm-profgen --kernel --binary=<vmlinux> -- > >> perfdata=<perf_file> \ > >> -o <profile_file> > >> or > >> $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \ > >> --format=extbinary --out=<profile_file> > >> > >> Note that multiple AutoFDO profile files can be merged into one via: > >> $ llvm-profdata merge -o <profile_file> <profile_1> ... > >> <profile_n> > >> > >> 6) Rebuild the kernel using the AutoFDO profile file with the same config > >> as step 1, (Note CONFIG_AUTOFDO_CLANG needs to be enabled): > >> $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file> > >> > >> Co-developed-by: Han Shen<shenhan@google.com> > >> Signed-off-by: Han Shen<shenhan@google.com> > >> Signed-off-by: Rong Xu<xur@google.com> > >> Suggested-by: Sriraman Tallam<tmsriram@google.com> > >> Suggested-by: Krzysztof Pszeniczny<kpszeniczny@google.com> > >> Suggested-by: Nick Desaulniers<ndesaulniers@google.com> > >> Suggested-by: Stephane Eranian<eranian@google.com> > >> Tested-by: Yonghong Song<yonghong.song@linux.dev> > >> Tested-by: Yabin Cui<yabinc@google.com> > >> Tested-by: Nathan Chancellor<nathan@kernel.org> > >> Reviewed-by: Kees Cook<kees@kernel.org> > > > > Tested-by: Peter Jung <ptr1337@cachyos.org> > > > > The compilations and testing with the "make pacman-pkg" function from > the kernel worked fine. > > One problem I do face: > When I apply a AutoFDO profile together with the PKGBUILD [1] from > archlinux im running into issues at "module_install" at the packaging. > > See following log: > ``` > make[2]: *** [scripts/Makefile.modinst:125: > /tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/arch/x86/kvm/kvm.ko] > Error 1 > make[2]: *** Deleting file > '/tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/arch/x86/kvm/kvm.ko' > INSTALL > /tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/crypto/cryptd.ko > make[2]: *** Waiting for unfinished jobs.... > ``` > > > This can be fixed with removed "INSTALL_MOD_STRIP=1" to the passed > parameters of module_install. > > This explicitly only happens, if a profile is passed - otherwise the > packaging works without problems. > > Regards, > > Peter Jung >
Hi Han, Thanks for the test. I will look into my current setup again and will let you know. Please do not see this as blocker for now. :) Regards, Peter On 04.11.24 05:50, Han Shen wrote: > Hi Peter, thanks for reporting the issue. I am trying to reproduce it > in the up-to-date archlinux environment. Below is what I have: > 0. pacman -Syu > 1. cloned archlinux build files from > https://aur.archlinux.org/linux-mainline.git the newest mainline > version is 6.12rc5-1. > 2. changed the PKGBUILD file to include the patches series > 3. changed the "config" to turn on clang autofdo > 4. collected afdo profiles > 5. MAKEFLAGS="-j48 V=1 LLVM=1 CLANG_AUTOFDO_PROFILE=$(pwd)/perf.afdo" \ > makepkg -s --skipinteg --skippgp > 6. install and reboot > The above steps succeeded. > You mentioned the error happens at "module_install", can you instruct > me how to execute the "module_install" step? > > Thanks, > Han > > On Sat, Nov 2, 2024 at 12:53 PM Peter Jung<ptr1337@cachyos.org> wrote: >> >> >> On 02.11.24 20:46, Peter Jung wrote: >>> >>> On 02.11.24 18:51, Rong Xu wrote: >>>> Add the build support for using Clang's AutoFDO. Building the kernel >>>> with AutoFDO does not reduce the optimization level from the >>>> compiler. AutoFDO uses hardware sampling to gather information about >>>> the frequency of execution of different code paths within a binary. >>>> This information is then used to guide the compiler's optimization >>>> decisions, resulting in a more efficient binary. Experiments >>>> showed that the kernel can improve up to 10% in latency. >>>> >>>> The support requires a Clang compiler after LLVM 17. This submission >>>> is limited to x86 platforms that support PMU features like LBR on >>>> Intel machines and AMD Zen3 BRS. Support for SPE on ARM 1, >>>> and BRBE on ARM 1 is part of planned future work. >>>> >>>> Here is an example workflow for AutoFDO kernel: >>>> >>>> 1) Build the kernel on the host machine with LLVM enabled, for example, >>>> $ make menuconfig LLVM=1 >>>> Turn on AutoFDO build config: >>>> CONFIG_AUTOFDO_CLANG=y >>>> With a configuration that has LLVM enabled, use the following >>>> command: >>>> scripts/config -e AUTOFDO_CLANG >>>> After getting the config, build with >>>> $ make LLVM=1 >>>> >>>> 2) Install the kernel on the test machine. >>>> >>>> 3) Run the load tests. The '-c' option in perf specifies the sample >>>> event period. We suggest using a suitable prime number, >>>> like 500009, for this purpose. >>>> For Intel platforms: >>>> $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c >>>> <count> \ >>>> -o <perf_file> -- <loadtest> >>>> For AMD platforms: >>>> The supported system are: Zen3 with BRS, or Zen4 with amd_lbr_v2 >>>> For Zen3: >>>> $ cat proc/cpuinfo | grep " brs" >>>> For Zen4: >>>> $ cat proc/cpuinfo | grep amd_lbr_v2 >>>> $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k >>>> -a \ >>>> -N -b -c <count> -o <perf_file> -- <loadtest> >>>> >>>> 4) (Optional) Download the raw perf file to the host machine. >>>> >>>> 5) To generate an AutoFDO profile, two offline tools are available: >>>> create_llvm_prof and llvm_profgen. The create_llvm_prof tool is part >>>> of the AutoFDO project and can be found on GitHub >>>> (https://github.com/google/autofdo), version v0.30.1 or later. The >>>> llvm_profgen tool is included in the LLVM compiler itself. It's >>>> important to note that the version of llvm_profgen doesn't need to >>>> match the version of Clang. It needs to be the LLVM 19 release or >>>> later, or from the LLVM trunk. >>>> $ llvm-profgen --kernel --binary=<vmlinux> -- >>>> perfdata=<perf_file> \ >>>> -o <profile_file> >>>> or >>>> $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \ >>>> --format=extbinary --out=<profile_file> >>>> >>>> Note that multiple AutoFDO profile files can be merged into one via: >>>> $ llvm-profdata merge -o <profile_file> <profile_1> ... >>>> <profile_n> >>>> >>>> 6) Rebuild the kernel using the AutoFDO profile file with the same config >>>> as step 1, (Note CONFIG_AUTOFDO_CLANG needs to be enabled): >>>> $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file> >>>> >>>> Co-developed-by: Han Shen<shenhan@google.com> >>>> Signed-off-by: Han Shen<shenhan@google.com> >>>> Signed-off-by: Rong Xu<xur@google.com> >>>> Suggested-by: Sriraman Tallam<tmsriram@google.com> >>>> Suggested-by: Krzysztof Pszeniczny<kpszeniczny@google.com> >>>> Suggested-by: Nick Desaulniers<ndesaulniers@google.com> >>>> Suggested-by: Stephane Eranian<eranian@google.com> >>>> Tested-by: Yonghong Song<yonghong.song@linux.dev> >>>> Tested-by: Yabin Cui<yabinc@google.com> >>>> Tested-by: Nathan Chancellor<nathan@kernel.org> >>>> Reviewed-by: Kees Cook<kees@kernel.org> >>> Tested-by: Peter Jung<ptr1337@cachyos.org> >>> >> The compilations and testing with the "make pacman-pkg" function from >> the kernel worked fine. >> >> One problem I do face: >> When I apply a AutoFDO profile together with the PKGBUILD [1] from >> archlinux im running into issues at "module_install" at the packaging. >> >> See following log: >> ``` >> make[2]: *** [scripts/Makefile.modinst:125: >> /tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/arch/x86/kvm/kvm.ko] >> Error 1 >> make[2]: *** Deleting file >> '/tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/arch/x86/kvm/kvm.ko' >> INSTALL >> /tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/crypto/cryptd.ko >> make[2]: *** Waiting for unfinished jobs.... >> ``` >> >> >> This can be fixed with removed "INSTALL_MOD_STRIP=1" to the passed >> parameters of module_install. >> >> This explicitly only happens, if a profile is passed - otherwise the >> packaging works without problems. >> >> Regards, >> >> Peter Jung >>
Hi Han, I have tested your provided method, but the AutoFDO profile (lld does not get lto-sample-profile=$pathtoprofile passed) nor Clang as compiler gets used. Please replace following PKGBUILD and config from linux-mainline with the provided one in the gist. The patch is also included there. https://gist.github.com/ptr1337/c92728bb273f7dbc2817db75eedec9ed The main change I am doing here, is passing following to the build array and replacing "make all": make LLVM=1 LLVM_IAS=1 CLANG_AUTOFDO_PROFILE=${srcdir}/perf.afdo all When compiling the kernel with makepkg, this results at the packaging to following issue and can be reliable reproduced. Regards, Peter On 04.11.24 05:50, Han Shen wrote: > Hi Peter, thanks for reporting the issue. I am trying to reproduce it > in the up-to-date archlinux environment. Below is what I have: > 0. pacman -Syu > 1. cloned archlinux build files from > https://aur.archlinux.org/linux-mainline.git the newest mainline > version is 6.12rc5-1. > 2. changed the PKGBUILD file to include the patches series > 3. changed the "config" to turn on clang autofdo > 4. collected afdo profiles > 5. MAKEFLAGS="-j48 V=1 LLVM=1 CLANG_AUTOFDO_PROFILE=$(pwd)/perf.afdo" \ > makepkg -s --skipinteg --skippgp > 6. install and reboot > The above steps succeeded. > You mentioned the error happens at "module_install", can you instruct > me how to execute the "module_install" step? > > Thanks, > Han > > On Sat, Nov 2, 2024 at 12:53 PM Peter Jung<ptr1337@cachyos.org> wrote: >> >> >> On 02.11.24 20:46, Peter Jung wrote: >>> >>> On 02.11.24 18:51, Rong Xu wrote: >>>> Add the build support for using Clang's AutoFDO. Building the kernel >>>> with AutoFDO does not reduce the optimization level from the >>>> compiler. AutoFDO uses hardware sampling to gather information about >>>> the frequency of execution of different code paths within a binary. >>>> This information is then used to guide the compiler's optimization >>>> decisions, resulting in a more efficient binary. Experiments >>>> showed that the kernel can improve up to 10% in latency. >>>> >>>> The support requires a Clang compiler after LLVM 17. This submission >>>> is limited to x86 platforms that support PMU features like LBR on >>>> Intel machines and AMD Zen3 BRS. Support for SPE on ARM 1, >>>> and BRBE on ARM 1 is part of planned future work. >>>> >>>> Here is an example workflow for AutoFDO kernel: >>>> >>>> 1) Build the kernel on the host machine with LLVM enabled, for example, >>>> $ make menuconfig LLVM=1 >>>> Turn on AutoFDO build config: >>>> CONFIG_AUTOFDO_CLANG=y >>>> With a configuration that has LLVM enabled, use the following >>>> command: >>>> scripts/config -e AUTOFDO_CLANG >>>> After getting the config, build with >>>> $ make LLVM=1 >>>> >>>> 2) Install the kernel on the test machine. >>>> >>>> 3) Run the load tests. The '-c' option in perf specifies the sample >>>> event period. We suggest using a suitable prime number, >>>> like 500009, for this purpose. >>>> For Intel platforms: >>>> $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c >>>> <count> \ >>>> -o <perf_file> -- <loadtest> >>>> For AMD platforms: >>>> The supported system are: Zen3 with BRS, or Zen4 with amd_lbr_v2 >>>> For Zen3: >>>> $ cat proc/cpuinfo | grep " brs" >>>> For Zen4: >>>> $ cat proc/cpuinfo | grep amd_lbr_v2 >>>> $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k >>>> -a \ >>>> -N -b -c <count> -o <perf_file> -- <loadtest> >>>> >>>> 4) (Optional) Download the raw perf file to the host machine. >>>> >>>> 5) To generate an AutoFDO profile, two offline tools are available: >>>> create_llvm_prof and llvm_profgen. The create_llvm_prof tool is part >>>> of the AutoFDO project and can be found on GitHub >>>> (https://github.com/google/autofdo), version v0.30.1 or later. The >>>> llvm_profgen tool is included in the LLVM compiler itself. It's >>>> important to note that the version of llvm_profgen doesn't need to >>>> match the version of Clang. It needs to be the LLVM 19 release or >>>> later, or from the LLVM trunk. >>>> $ llvm-profgen --kernel --binary=<vmlinux> -- >>>> perfdata=<perf_file> \ >>>> -o <profile_file> >>>> or >>>> $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \ >>>> --format=extbinary --out=<profile_file> >>>> >>>> Note that multiple AutoFDO profile files can be merged into one via: >>>> $ llvm-profdata merge -o <profile_file> <profile_1> ... >>>> <profile_n> >>>> >>>> 6) Rebuild the kernel using the AutoFDO profile file with the same config >>>> as step 1, (Note CONFIG_AUTOFDO_CLANG needs to be enabled): >>>> $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file> >>>> >>>> Co-developed-by: Han Shen<shenhan@google.com> >>>> Signed-off-by: Han Shen<shenhan@google.com> >>>> Signed-off-by: Rong Xu<xur@google.com> >>>> Suggested-by: Sriraman Tallam<tmsriram@google.com> >>>> Suggested-by: Krzysztof Pszeniczny<kpszeniczny@google.com> >>>> Suggested-by: Nick Desaulniers<ndesaulniers@google.com> >>>> Suggested-by: Stephane Eranian<eranian@google.com> >>>> Tested-by: Yonghong Song<yonghong.song@linux.dev> >>>> Tested-by: Yabin Cui<yabinc@google.com> >>>> Tested-by: Nathan Chancellor<nathan@kernel.org> >>>> Reviewed-by: Kees Cook<kees@kernel.org> >>> Tested-by: Peter Jung<ptr1337@cachyos.org> >>> >> The compilations and testing with the "make pacman-pkg" function from >> the kernel worked fine. >> >> One problem I do face: >> When I apply a AutoFDO profile together with the PKGBUILD [1] from >> archlinux im running into issues at "module_install" at the packaging. >> >> See following log: >> ``` >> make[2]: *** [scripts/Makefile.modinst:125: >> /tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/arch/x86/kvm/kvm.ko] >> Error 1 >> make[2]: *** Deleting file >> '/tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/arch/x86/kvm/kvm.ko' >> INSTALL >> /tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/crypto/cryptd.ko >> make[2]: *** Waiting for unfinished jobs.... >> ``` >> >> >> This can be fixed with removed "INSTALL_MOD_STRIP=1" to the passed >> parameters of module_install. >> >> This explicitly only happens, if a profile is passed - otherwise the >> packaging works without problems. >> >> Regards, >> >> Peter Jung >>
Hi Peter, Thanks for providing the detailed reproduce. Now I can see the error (after I synced to 6.12.0-rc6, I was using rc5). I'll look into that and report back. > I have tested your provided method, but the AutoFDO profile (lld does not get lto-sample-profile=$pathtoprofile passed) I see. You also turned on ThinLTO, which I didn't, so the profile was only used during compilation, not passed to lld. Thanks, Han On Mon, Nov 4, 2024 at 9:31 AM Peter Jung <ptr1337@cachyos.org> wrote: > > Hi Han, > > I have tested your provided method, but the AutoFDO profile (lld does > not get lto-sample-profile=$pathtoprofile passed) nor Clang as compiler > gets used. > Please replace following PKGBUILD and config from linux-mainline with > the provided one in the gist. The patch is also included there. > > https://gist.github.com/ptr1337/c92728bb273f7dbc2817db75eedec9ed > > The main change I am doing here, is passing following to the build array > and replacing "make all": > > make LLVM=1 LLVM_IAS=1 CLANG_AUTOFDO_PROFILE=${srcdir}/perf.afdo all > > When compiling the kernel with makepkg, this results at the packaging to > following issue and can be reliable reproduced. > > Regards, > > Peter > > > On 04.11.24 05:50, Han Shen wrote: > > Hi Peter, thanks for reporting the issue. I am trying to reproduce it > > in the up-to-date archlinux environment. Below is what I have: > > 0. pacman -Syu > > 1. cloned archlinux build files from > > https://aur.archlinux.org/linux-mainline.git the newest mainline > > version is 6.12rc5-1. > > 2. changed the PKGBUILD file to include the patches series > > 3. changed the "config" to turn on clang autofdo > > 4. collected afdo profiles > > 5. MAKEFLAGS="-j48 V=1 LLVM=1 CLANG_AUTOFDO_PROFILE=$(pwd)/perf.afdo" \ > > makepkg -s --skipinteg --skippgp > > 6. install and reboot > > The above steps succeeded. > > You mentioned the error happens at "module_install", can you instruct > > me how to execute the "module_install" step? > > > > Thanks, > > Han > > > > On Sat, Nov 2, 2024 at 12:53 PM Peter Jung<ptr1337@cachyos.org> wrote: > >> > >> > >> On 02.11.24 20:46, Peter Jung wrote: > >>> > >>> On 02.11.24 18:51, Rong Xu wrote: > >>>> Add the build support for using Clang's AutoFDO. Building the kernel > >>>> with AutoFDO does not reduce the optimization level from the > >>>> compiler. AutoFDO uses hardware sampling to gather information about > >>>> the frequency of execution of different code paths within a binary. > >>>> This information is then used to guide the compiler's optimization > >>>> decisions, resulting in a more efficient binary. Experiments > >>>> showed that the kernel can improve up to 10% in latency. > >>>> > >>>> The support requires a Clang compiler after LLVM 17. This submission > >>>> is limited to x86 platforms that support PMU features like LBR on > >>>> Intel machines and AMD Zen3 BRS. Support for SPE on ARM 1, > >>>> and BRBE on ARM 1 is part of planned future work. > >>>> > >>>> Here is an example workflow for AutoFDO kernel: > >>>> > >>>> 1) Build the kernel on the host machine with LLVM enabled, for example, > >>>> $ make menuconfig LLVM=1 > >>>> Turn on AutoFDO build config: > >>>> CONFIG_AUTOFDO_CLANG=y > >>>> With a configuration that has LLVM enabled, use the following > >>>> command: > >>>> scripts/config -e AUTOFDO_CLANG > >>>> After getting the config, build with > >>>> $ make LLVM=1 > >>>> > >>>> 2) Install the kernel on the test machine. > >>>> > >>>> 3) Run the load tests. The '-c' option in perf specifies the sample > >>>> event period. We suggest using a suitable prime number, > >>>> like 500009, for this purpose. > >>>> For Intel platforms: > >>>> $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c > >>>> <count> \ > >>>> -o <perf_file> -- <loadtest> > >>>> For AMD platforms: > >>>> The supported system are: Zen3 with BRS, or Zen4 with amd_lbr_v2 > >>>> For Zen3: > >>>> $ cat proc/cpuinfo | grep " brs" > >>>> For Zen4: > >>>> $ cat proc/cpuinfo | grep amd_lbr_v2 > >>>> $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k > >>>> -a \ > >>>> -N -b -c <count> -o <perf_file> -- <loadtest> > >>>> > >>>> 4) (Optional) Download the raw perf file to the host machine. > >>>> > >>>> 5) To generate an AutoFDO profile, two offline tools are available: > >>>> create_llvm_prof and llvm_profgen. The create_llvm_prof tool is part > >>>> of the AutoFDO project and can be found on GitHub > >>>> (https://github.com/google/autofdo), version v0.30.1 or later. The > >>>> llvm_profgen tool is included in the LLVM compiler itself. It's > >>>> important to note that the version of llvm_profgen doesn't need to > >>>> match the version of Clang. It needs to be the LLVM 19 release or > >>>> later, or from the LLVM trunk. > >>>> $ llvm-profgen --kernel --binary=<vmlinux> -- > >>>> perfdata=<perf_file> \ > >>>> -o <profile_file> > >>>> or > >>>> $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \ > >>>> --format=extbinary --out=<profile_file> > >>>> > >>>> Note that multiple AutoFDO profile files can be merged into one via: > >>>> $ llvm-profdata merge -o <profile_file> <profile_1> ... > >>>> <profile_n> > >>>> > >>>> 6) Rebuild the kernel using the AutoFDO profile file with the same config > >>>> as step 1, (Note CONFIG_AUTOFDO_CLANG needs to be enabled): > >>>> $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file> > >>>> > >>>> Co-developed-by: Han Shen<shenhan@google.com> > >>>> Signed-off-by: Han Shen<shenhan@google.com> > >>>> Signed-off-by: Rong Xu<xur@google.com> > >>>> Suggested-by: Sriraman Tallam<tmsriram@google.com> > >>>> Suggested-by: Krzysztof Pszeniczny<kpszeniczny@google.com> > >>>> Suggested-by: Nick Desaulniers<ndesaulniers@google.com> > >>>> Suggested-by: Stephane Eranian<eranian@google.com> > >>>> Tested-by: Yonghong Song<yonghong.song@linux.dev> > >>>> Tested-by: Yabin Cui<yabinc@google.com> > >>>> Tested-by: Nathan Chancellor<nathan@kernel.org> > >>>> Reviewed-by: Kees Cook<kees@kernel.org> > >>> Tested-by: Peter Jung<ptr1337@cachyos.org> > >>> > >> The compilations and testing with the "make pacman-pkg" function from > >> the kernel worked fine. > >> > >> One problem I do face: > >> When I apply a AutoFDO profile together with the PKGBUILD [1] from > >> archlinux im running into issues at "module_install" at the packaging. > >> > >> See following log: > >> ``` > >> make[2]: *** [scripts/Makefile.modinst:125: > >> /tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/arch/x86/kvm/kvm.ko] > >> Error 1 > >> make[2]: *** Deleting file > >> '/tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/arch/x86/kvm/kvm.ko' > >> INSTALL > >> /tmp/makepkg/linux-cachyos-rc-autofdo/pkg/linux-cachyos-rc-autofdo/usr/lib/modules/6.12.0-rc5-5-cachyos-rc-autofdo/kernel/crypto/cryptd.ko > >> make[2]: *** Waiting for unfinished jobs.... > >> ``` > >> > >> > >> This can be fixed with removed "INSTALL_MOD_STRIP=1" to the passed > >> parameters of module_install. > >> > >> This explicitly only happens, if a profile is passed - otherwise the > >> packaging works without problems. > >> > >> Regards, > >> > >> Peter Jung > >> >
diff --git a/Documentation/dev-tools/autofdo.rst b/Documentation/dev-tools/autofdo.rst new file mode 100644 index 0000000000000..1f0a451e9ccd3 --- /dev/null +++ b/Documentation/dev-tools/autofdo.rst @@ -0,0 +1,168 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=================================== +Using AutoFDO with the Linux kernel +=================================== + +This enables AutoFDO build support for the kernel when using +the Clang compiler. AutoFDO (Auto-Feedback-Directed Optimization) +is a type of profile-guided optimization (PGO) used to enhance the +performance of binary executables. It gathers information about the +frequency of execution of various code paths within a binary using +hardware sampling. This data is then used to guide the compiler's +optimization decisions, resulting in a more efficient binary. AutoFDO +is a powerful optimization technique, and data indicates that it can +significantly improve kernel performance. It's especially beneficial +for workloads affected by front-end stalls. + +For AutoFDO builds, unlike non-FDO builds, the user must supply a +profile. Acquiring an AutoFDO profile can be done in several ways. +AutoFDO profiles are created by converting hardware sampling using +the "perf" tool. It is crucial that the workload used to create these +perf files is representative; they must exhibit runtime +characteristics similar to the workloads that are intended to be +optimized. Failure to do so will result in the compiler optimizing +for the wrong objective. + +The AutoFDO profile often encapsulates the program's behavior. If the +performance-critical codes are architecture-independent, the profile +can be applied across platforms to achieve performance gains. For +instance, using the profile generated on Intel architecture to build +a kernel for AMD architecture can also yield performance improvements. + +There are two methods for acquiring a representative profile: +(1) Sample real workloads using a production environment. +(2) Generate the profile using a representative load test. +When enabling the AutoFDO build configuration without providing an +AutoFDO profile, the compiler only modifies the dwarf information in +the kernel without impacting runtime performance. It's advisable to +use a kernel binary built with the same AutoFDO configuration to +collect the perf profile. While it's possible to use a kernel built +with different options, it may result in inferior performance. + +One can collect profiles using AutoFDO build for the previous kernel. +AutoFDO employs relative line numbers to match the profiles, offering +some tolerance for source changes. This mode is commonly used in a +production environment for profile collection. + +In a profile collection based on a load test, the AutoFDO collection +process consists of the following steps: + +#. Initial build: The kernel is built with AutoFDO options + without a profile. + +#. Profiling: The above kernel is then run with a representative + workload to gather execution frequency data. This data is + collected using hardware sampling, via perf. AutoFDO is most + effective on platforms supporting advanced PMU features like + LBR on Intel machines. + +#. AutoFDO profile generation: Perf output file is converted to + the AutoFDO profile via offline tools. + +The support requires a Clang compiler LLVM 17 or later. + +Preparation +=========== + +Configure the kernel with:: + + CONFIG_AUTOFDO_CLANG=y + +Customization +============= + +The default CONFIG_AUTOFDO_CLANG setting covers kernel space objects for +AutoFDO builds. One can, however, enable or disable AutoFDO build for +individual files and directories by adding a line similar to the following +to the respective kernel Makefile: + +- For enabling a single file (e.g. foo.o) :: + + AUTOFDO_PROFILE_foo.o := y + +- For enabling all files in one directory :: + + AUTOFDO_PROFILE := y + +- For disabling one file :: + + AUTOFDO_PROFILE_foo.o := n + +- For disabling all files in one directory :: + + AUTOFDO_PROFILE := n + +Workflow +======== + +Here is an example workflow for AutoFDO kernel: + +1) Build the kernel on the host machine with LLVM enabled, + for example, :: + + $ make menuconfig LLVM=1 + + Turn on AutoFDO build config:: + + CONFIG_AUTOFDO_CLANG=y + + With a configuration that with LLVM enabled, use the following command:: + + $ scripts/config -e AUTOFDO_CLANG + + After getting the config, build with :: + + $ make LLVM=1 + +2) Install the kernel on the test machine. + +3) Run the load tests. The '-c' option in perf specifies the sample + event period. We suggest using a suitable prime number, like 500009, + for this purpose. + + - For Intel platforms:: + + $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> -o <perf_file> -- <loadtest> + + - For AMD platforms: + + The supported systems are: Zen3 with BRS, or Zen4 with amd_lbr_v2. To check, + + For Zen3:: + + $ cat proc/cpuinfo | grep " brs" + + For Zen4:: + + $ cat proc/cpuinfo | grep amd_lbr_v2 + + The following command generated the perf data file:: + + $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a -N -b -c <count> -o <perf_file> -- <loadtest> + +4) (Optional) Download the raw perf file to the host machine. + +5) To generate an AutoFDO profile, two offline tools are available: + create_llvm_prof and llvm_profgen. The create_llvm_prof tool is part + of the AutoFDO project and can be found on GitHub + (https://github.com/google/autofdo), version v0.30.1 or later. + The llvm_profgen tool is included in the LLVM compiler itself. It's + important to note that the version of llvm_profgen doesn't need to match + the version of Clang. It needs to be the LLVM 19 release of Clang + or later, or just from the LLVM trunk. :: + + $ llvm-profgen --kernel --binary=<vmlinux> --perfdata=<perf_file> -o <profile_file> + + or :: + + $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> --format=extbinary --out=<profile_file> + + Note that multiple AutoFDO profile files can be merged into one via:: + + $ llvm-profdata merge -o <profile_file> <profile_1> <profile_2> ... <profile_n> + +6) Rebuild the kernel using the AutoFDO profile file with the same config as step 1, + (Note CONFIG_AUTOFDO_CLANG needs to be enabled):: + + $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file> diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst index 53d4d124f9c52..6945644f7008a 100644 --- a/Documentation/dev-tools/index.rst +++ b/Documentation/dev-tools/index.rst @@ -34,6 +34,7 @@ Documentation/dev-tools/testing-overview.rst ktap checkuapi gpio-sloppy-logic-analyzer + autofdo .. only:: subproject and html diff --git a/MAINTAINERS b/MAINTAINERS index a274079502426..d6ea49433747a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3665,6 +3665,13 @@ F: kernel/audit* F: lib/*audit.c K: \baudit_[a-z_0-9]\+\b +AUTOFDO BUILD +M: Rong Xu <xur@google.com> +M: Han Shen <shenhan@google.com> +S: Supported +F: Documentation/dev-tools/autofdo.rst +F: scripts/Makefile.autofdo + AUXILIARY BUS DRIVER M: Greg Kroah-Hartman <gregkh@linuxfoundation.org> R: Dave Ertman <david.m.ertman@intel.com> diff --git a/Makefile b/Makefile index 5e04e4abffd88..b89d87b5dca79 100644 --- a/Makefile +++ b/Makefile @@ -1018,6 +1018,7 @@ include-$(CONFIG_KMSAN) += scripts/Makefile.kmsan include-$(CONFIG_UBSAN) += scripts/Makefile.ubsan include-$(CONFIG_KCOV) += scripts/Makefile.kcov include-$(CONFIG_RANDSTRUCT) += scripts/Makefile.randstruct +include-$(CONFIG_AUTOFDO_CLANG) += scripts/Makefile.autofdo include-$(CONFIG_GCC_PLUGINS) += scripts/Makefile.gcc-plugins include $(addprefix $(srctree)/, $(include-y)) diff --git a/arch/Kconfig b/arch/Kconfig index bd9f095d69fa0..8dca3b5e6ef53 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -811,6 +811,26 @@ config LTO_CLANG_THIN If unsure, say Y. endchoice +config ARCH_SUPPORTS_AUTOFDO_CLANG + bool + +config AUTOFDO_CLANG + bool "Enable Clang's AutoFDO build (EXPERIMENTAL)" + depends on ARCH_SUPPORTS_AUTOFDO_CLANG + depends on CC_IS_CLANG && CLANG_VERSION >= 170000 + help + This option enables Clang’s AutoFDO build. When + an AutoFDO profile is specified in variable + CLANG_AUTOFDO_PROFILE during the build process, + Clang uses the profile to optimize the kernel. + + If no profile is specified, AutoFDO options are + still passed to Clang to facilitate the collection + of perf data for creating an AutoFDO profile in + subsequent builds. + + If unsure, say N. + config ARCH_SUPPORTS_CFI_CLANG bool help diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 16354dfa6d965..9dc87661fb373 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -126,6 +126,7 @@ config X86 select ARCH_SUPPORTS_LTO_CLANG select ARCH_SUPPORTS_LTO_CLANG_THIN select ARCH_SUPPORTS_RT + select ARCH_SUPPORTS_AUTOFDO_CLANG select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_CMPXCHG_LOCKREF if X86_CMPXCHG64 select ARCH_USE_MEMTEST diff --git a/scripts/Makefile.autofdo b/scripts/Makefile.autofdo new file mode 100644 index 0000000000000..ff96a63fea7cd --- /dev/null +++ b/scripts/Makefile.autofdo @@ -0,0 +1,22 @@ +# SPDX-License-Identifier: GPL-2.0 + +# Enable available and selected Clang AutoFDO features. + +CFLAGS_AUTOFDO_CLANG := -fdebug-info-for-profiling -mllvm -enable-fs-discriminator=true -mllvm -improved-fs-discriminator=true + +ifndef CONFIG_DEBUG_INFO + CFLAGS_AUTOFDO_CLANG += -gmlt +endif + +ifdef CLANG_AUTOFDO_PROFILE + CFLAGS_AUTOFDO_CLANG += -fprofile-sample-use=$(CLANG_AUTOFDO_PROFILE) +endif + +ifdef CONFIG_LTO_CLANG_THIN + ifdef CLANG_AUTOFDO_PROFILE + KBUILD_LDFLAGS += --lto-sample-profile=$(CLANG_AUTOFDO_PROFILE) + endif + KBUILD_LDFLAGS += --mllvm=-enable-fs-discriminator=true --mllvm=-improved-fs-discriminator=true -plugin-opt=thinlto +endif + +export CFLAGS_AUTOFDO_CLANG diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib index 01a9f567d5af4..2d0942c1a0277 100644 --- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -191,6 +191,16 @@ _c_flags += $(if $(patsubst n%,, \ -D__KCSAN_INSTRUMENT_BARRIERS__) endif +# +# Enable AutoFDO build flags except some files or directories we don't want to +# enable (depends on variables AUTOFDO_PROFILE_obj.o and AUTOFDO_PROFILE). +# +ifeq ($(CONFIG_AUTOFDO_CLANG),y) +_c_flags += $(if $(patsubst n%,, \ + $(AUTOFDO_PROFILE_$(target-stem).o)$(AUTOFDO_PROFILE)$(is-kernel-object)), \ + $(CFLAGS_AUTOFDO_CLANG)) +endif + # $(src) for including checkin headers from generated source files # $(obj) for including generated headers from checkin source files ifeq ($(KBUILD_EXTMOD),) diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 6604f5d038aad..4c5229991e1e0 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -4557,6 +4557,7 @@ static int validate_ibt(struct objtool_file *file) !strcmp(sec->name, "__jump_table") || !strcmp(sec->name, "__mcount_loc") || !strcmp(sec->name, ".kcfi_traps") || + !strcmp(sec->name, ".llvm.call-graph-profile") || strstr(sec->name, "__patchable_function_entries")) continue;