Message ID | 20200702233929.181409-1-danny@kdrag0n.dev (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | kbuild: Allow Clang global merging if !MODULES | expand |
Hi Danny, On Thu, Jul 02, 2020 at 04:39:29PM -0700, Danny Lin wrote: > The old reasoning for disabling Clang's global merging optimization is > that it breaks modpost by coalescing many symbols into _MergedGlobals. > However, modpost is only used in builds with dynamic modules; > vmlinux.symvers is still created during standalone builds, but it's > effectively just an empty dummy file. > > Enabling the optimization whenever possible allows us to reap the > benefits of reduced register pressure when many global variables are > used in the same function. Have you run into any place within the kernel that this is the case or this more of a "could help if this ever happens" type of deal? > An x86 defconfig kernel built with this optimization boots fine in qemu, > and a downstream 4.14 kernel has been used on arm64 for nearly a year > without any issues caused by this optimization. If I am reading LLVM's source correctly, this option only seems relevant for ARM and AArch64? $ rg --no-heading createGlobalMergePass llvm/lib/CodeGen/GlobalMerge.cpp:679:Pass *llvm::createGlobalMergePass(const TargetMachine *TM, unsigned Offset, llvm/lib/Target/AArch64/AArch64TargetMachine.cpp:524: addPass(createGlobalMergePass(TM, 4095, OnlyOptimizeForSize, llvm/lib/Target/ARM/ARMTargetMachine.cpp:456: addPass(createGlobalMergePass(TM, 127, OnlyOptimizeForSize, llvm/include/llvm/CodeGen/Passes.h:419: Pass *createGlobalMergePass(const TargetMachine *TM, unsigned MaximalOffset, Otherwise, I think this is probably okay. According to [1], when the optimization level is less than -O3, we get a less aggressive version of this optimization level, which is good for code size: https://github.com/llvm/llvm-project/commit/8207641251706ea808df6d2a1ea8f87b8ee04c6d However, we do potentially get merging of extern globals if we do not specify -mglobal-merge (if I am reading the source correctly), which this commit claims might hurt performance? Not sure if there is any way to test or verify that? https://github.com/llvm/llvm-project/commit/de73404b8c4332190750537eb93ce0d5b6451300 > Signed-off-by: Danny Lin <danny@kdrag0n.dev> > --- > Makefile | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/Makefile b/Makefile > index a60c98519c37..f04c3639cf61 100644 > --- a/Makefile > +++ b/Makefile > @@ -772,10 +772,13 @@ ifdef CONFIG_CC_IS_CLANG > KBUILD_CPPFLAGS += -Qunused-arguments > KBUILD_CFLAGS += -Wno-format-invalid-specifier > KBUILD_CFLAGS += -Wno-gnu > + > +ifdef CONFIG_MODULES > # CLANG uses a _MergedGlobals as optimization, but this breaks modpost, as the > # source of a reference will be _MergedGlobals and not on of the whitelisted names. > # See modpost pattern 2 > KBUILD_CFLAGS += -mno-global-merge > +endif > else > > # These warnings generated too much noise in a regular build. > -- > 2.27.0 > Cheers, Nathan
On Sun, Jul 5, 2020 at 11:43 AM Nathan Chancellor <natechancellor@gmail.com> wrote: > > Hi Danny, > > On Thu, Jul 02, 2020 at 04:39:29PM -0700, Danny Lin wrote: > > The old reasoning for disabling Clang's global merging optimization is > > that it breaks modpost by coalescing many symbols into _MergedGlobals. > > However, modpost is only used in builds with dynamic modules; > > vmlinux.symvers is still created during standalone builds, but it's > > effectively just an empty dummy file. I am not convinced with this statement. modpost is used even if CONFIG_MODULES=n You can see the following log for allnoconfig. MODPOST vmlinux.symvers Yes, vmlinux.symver is empty. This is because EXPORT_SYMBOL is no-op when CONFIG_MODULES=n. > > > > Enabling the optimization whenever possible allows us to reap the > > benefits of reduced register pressure when many global variables are > > used in the same function. > > Have you run into any place within the kernel that this is the case or > this more of a "could help if this ever happens" type of deal? > > > An x86 defconfig kernel built with this optimization boots fine in qemu, > > and a downstream 4.14 kernel has been used on arm64 for nearly a year > > without any issues caused by this optimization. > > If I am reading LLVM's source correctly, this option only seems relevant > for ARM and AArch64? > > $ rg --no-heading createGlobalMergePass > llvm/lib/CodeGen/GlobalMerge.cpp:679:Pass *llvm::createGlobalMergePass(const TargetMachine *TM, unsigned Offset, > llvm/lib/Target/AArch64/AArch64TargetMachine.cpp:524: addPass(createGlobalMergePass(TM, 4095, OnlyOptimizeForSize, > llvm/lib/Target/ARM/ARMTargetMachine.cpp:456: addPass(createGlobalMergePass(TM, 127, OnlyOptimizeForSize, > llvm/include/llvm/CodeGen/Passes.h:419: Pass *createGlobalMergePass(const TargetMachine *TM, unsigned MaximalOffset, > > Otherwise, I think this is probably okay. According to [1], when the > optimization level is less than -O3, we get a less aggressive version of > this optimization level, which is good for code size: > > https://github.com/llvm/llvm-project/commit/8207641251706ea808df6d2a1ea8f87b8ee04c6d I do not understand how -mglobal-merge works. Do you have test code to generate _MergedGlobals ? Thanks. > However, we do potentially get merging of extern globals if we do not > specify -mglobal-merge (if I am reading the source correctly), which > this commit claims might hurt performance? Not sure if there is any way > to test or verify that? > > https://github.com/llvm/llvm-project/commit/de73404b8c4332190750537eb93ce0d5b6451300 > > > Signed-off-by: Danny Lin <danny@kdrag0n.dev> > > --- > > Makefile | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/Makefile b/Makefile > > index a60c98519c37..f04c3639cf61 100644 > > --- a/Makefile > > +++ b/Makefile > > @@ -772,10 +772,13 @@ ifdef CONFIG_CC_IS_CLANG > > KBUILD_CPPFLAGS += -Qunused-arguments > > KBUILD_CFLAGS += -Wno-format-invalid-specifier > > KBUILD_CFLAGS += -Wno-gnu > > + > > +ifdef CONFIG_MODULES > > # CLANG uses a _MergedGlobals as optimization, but this breaks modpost, as the > > # source of a reference will be _MergedGlobals and not on of the whitelisted names. > > # See modpost pattern 2 > > KBUILD_CFLAGS += -mno-global-merge > > +endif > > else > > > > # These warnings generated too much noise in a regular build. > > -- > > 2.27.0 > > > > Cheers, > Nathan -- Best Regards Masahiro Yamada
diff --git a/Makefile b/Makefile index a60c98519c37..f04c3639cf61 100644 --- a/Makefile +++ b/Makefile @@ -772,10 +772,13 @@ ifdef CONFIG_CC_IS_CLANG KBUILD_CPPFLAGS += -Qunused-arguments KBUILD_CFLAGS += -Wno-format-invalid-specifier KBUILD_CFLAGS += -Wno-gnu + +ifdef CONFIG_MODULES # CLANG uses a _MergedGlobals as optimization, but this breaks modpost, as the # source of a reference will be _MergedGlobals and not on of the whitelisted names. # See modpost pattern 2 KBUILD_CFLAGS += -mno-global-merge +endif else # These warnings generated too much noise in a regular build.
The old reasoning for disabling Clang's global merging optimization is that it breaks modpost by coalescing many symbols into _MergedGlobals. However, modpost is only used in builds with dynamic modules; vmlinux.symvers is still created during standalone builds, but it's effectively just an empty dummy file. Enabling the optimization whenever possible allows us to reap the benefits of reduced register pressure when many global variables are used in the same function. An x86 defconfig kernel built with this optimization boots fine in qemu, and a downstream 4.14 kernel has been used on arm64 for nearly a year without any issues caused by this optimization. Signed-off-by: Danny Lin <danny@kdrag0n.dev> --- Makefile | 3 +++ 1 file changed, 3 insertions(+)