Message ID | alpine.DEB.2.21.1907162135590.1767@nanos.tec.linutronix.de (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] kbuild: Fail if gold linker is detected |
On Tue, Jul 16, 2019 at 09:47:27PM +0200, Thomas Gleixner wrote:
> The gold linker has known issues of failing the build both in random and in
> predictable ways:
>
>  - The x86/X32 VDSO build fails with:
>
>    arch/x86/entry/vdso/vclock_gettime-x32.o:vclock_gettime.c:function do_hres:
>    error: relocation overflow: reference to 'hvclock_page'
>
>    That's a known issue for years and the usual workaround is to disable
>    CONFIG_X86_32
>
>  - A recent build failure is caused by turning a relocation into an
>    absolute one for unknown reasons. See link below.
>
>  - There are a couple of gold workarounds applied already, but reports
>    about broken builds with ld.gold keep coming in on a regular basis and in
>    most cases the root cause is unclear.
>
> In context of the most recent fail H.J. stated:
>
>   "Since building a workable kernel for different kernel configurations
>    isn't a requirement for gold, I don't recommend gold for kernel."
>
> So instead of dealing with attempts to duct tape gold support without
> understanding the root cause and without support from the gold folks, fail
> the build when gold is detected.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: https://lore.kernel.org/r/CAMe9rOqMqkQ0LNpm25yE_Yt0FKp05WmHOrwc0aRDb53miFKM+w@mail.gmail.com

Based on the crude little testing script I wrote below:

Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>

$ cat test.sh
#!/bin/bash

# ld.bfd (expected to pass)
make distclean defconfig || exit ${?}

# ld.gold explicitly (expected to fail)
make LD=ld.gold distclean defconfig && exit ${?}

# ld.gold as if it were the system linker (expected to fail)
ln -fs /usr/bin/ld.gold ld
PATH=${PWD}:${PATH} make distclean defconfig && exit ${?}

# ld.lld (expected to pass)
make LD=ld.lld distclean defconfig || exit ${?}

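One detail of the script above: in the two "expected to fail" steps, `make ... && exit ${?}` exits with status 0 if make unexpectedly succeeds, so a caller cannot tell that case apart from a clean run. A slightly stricter variant might look like the sketch below. The helper names `expect_pass`, `expect_fail` and `run_linker_checks`, and the `RUN_LINKER_CHECKS` switch, are made up for this illustration and are not part of the posted script.

```shell
#!/bin/bash
# Hypothetical helpers; not part of the posted test.sh.

# Run a command that is expected to succeed; return non-zero if it fails.
expect_pass() {
    if "$@"; then
        return 0
    fi
    echo "FAIL: expected success: $*" >&2
    return 1
}

# Run a command that is expected to fail; return non-zero if it succeeds.
expect_fail() {
    if "$@"; then
        echo "FAIL: expected failure: $*" >&2
        return 1
    fi
    return 0
}

# The same four checks as test.sh, expressed with the helpers above.
# Meant to be run from the top of a kernel tree.
run_linker_checks() {
    # ld.bfd (expected to pass)
    expect_pass make distclean defconfig || return 1
    # ld.gold explicitly (expected to fail)
    expect_fail make LD=ld.gold distclean defconfig || return 1
    # ld.gold as if it were the system linker (expected to fail)
    ln -fs /usr/bin/ld.gold ld
    expect_fail env PATH="${PWD}:${PATH}" make distclean defconfig || return 1
    # ld.lld (expected to pass)
    expect_pass make LD=ld.lld distclean defconfig || return 1
}

# Only touch the tree when explicitly asked to (RUN_LINKER_CHECKS is a
# made-up switch for this sketch).
if [ "${RUN_LINKER_CHECKS:-0}" = "1" ]; then
    run_linker_checks
fi
```

With this shape, `RUN_LINKER_CHECKS=1 ./test.sh` returns non-zero whenever any of the four expectations is violated, including the case where a gold build goes through.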
On Tue, 16 Jul 2019 at 21:00, Nathan Chancellor <natechancellor@gmail.com> wrote:
>
> On Tue, Jul 16, 2019 at 09:47:27PM +0200, Thomas Gleixner wrote:
> > The gold linker has known issues of failing the build both in random and in
> > predictable ways:
> >
> >  - The x86/X32 VDSO build fails with:
> >
> >    arch/x86/entry/vdso/vclock_gettime-x32.o:vclock_gettime.c:function do_hres:
> >    error: relocation overflow: reference to 'hvclock_page'
> >
> >    That's a known issue for years and the usual workaround is to disable
> >    CONFIG_X86_32
> >
> >  - A recent build failure is caused by turning a relocation into an
> >    absolute one for unknown reasons. See link below.
> >
> >  - There are a couple of gold workarounds applied already, but reports
> >    about broken builds with ld.gold keep coming in on a regular basis and in
> >    most cases the root cause is unclear.
> >
> > In context of the most recent fail H.J. stated:
> >
> >   "Since building a workable kernel for different kernel configurations
> >    isn't a requirement for gold, I don't recommend gold for kernel."
> >
> > So instead of dealing with attempts to duct tape gold support without
> > understanding the root cause and without support from the gold folks, fail
> > the build when gold is detected.
> >
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Link: https://lore.kernel.org/r/CAMe9rOqMqkQ0LNpm25yE_Yt0FKp05WmHOrwc0aRDb53miFKM+w@mail.gmail.com
>
> Based on the crude little testing script I wrote below:
>
> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
> Tested-by: Nathan Chancellor <natechancellor@gmail.com>
>
> $ cat test.sh
> #!/bin/bash
>
> # ld.bfd (expected to pass)
> make distclean defconfig || exit ${?}
>
> # ld.gold explicitly (expected to fail)
> make LD=ld.gold distclean defconfig && exit ${?}
>
> # ld.gold as if it were the system linker (expected to fail)
> ln -fs /usr/bin/ld.gold ld
> PATH=${PWD}:${PATH} make distclean defconfig && exit ${?}
>
> # ld.lld (expected to pass)
> make LD=ld.lld distclean defconfig || exit ${?}

Hi

Would it be possible to force ld.bfd with -fuse-ld=bfd when gold is detected?

Are there gold bug reports for any of the issues that have been seen
with gold? It's been my default system linker for years and I've had
very few issues with it and it's a big improvement when linking with
LTO

Cheers

Mike

Mike,

On Tue, 16 Jul 2019, Mike Lothian wrote:
> On Tue, 16 Jul 2019 at 21:00, Nathan Chancellor wrote
>
> Would it be possible to force ld.bfd with -fuse-ld=bfd when gold is detected?

It's probably possible but way beyond my kbuild foo. Adding LD=ld.bfd to
the make invocation is the trivial workaround.

> Are there gold bug reports for any of the issues that have been seen
> with gold?

Yes. Some got resolved, some not.

> It's been my default system linker for years and I've had very few issues
> with it and it's a big improvement when linking with LTO

I understand, but the fact that you need to turn off config options in
order to build a kernel and the clear statement that it's not recommended
makes it truly unsuitable and unmaintainable for us.

If the gold people are interested to link a kernel and resolve all issues,
this surely can be revisited.

We work with tooling folks and we try to accommodate different tools, see the
ongoing efforts for clang, but that requires commitment from the tooling
side.

Thanks,

        tglx

On Wed, Jul 17, 2019 at 12:25:14AM +0200, Thomas Gleixner wrote:
> > It's been my default system linker for years and I've had very few issues
> > with it and it's a big improvement when linking with LTO
>
> I understand, but the fact that you need to turn off config options in
> order to build a kernel and the clear statement that it's not recommended
> makes it truly unsuitable and unmaintainable for us.

Or if you work for a cloud company who is willing to make the gold
linker work for your specific use case and configuration (and ideally,
have gold toolchain experts on staff who will work with you), then it
might be OK, but just for that particular use case. (Just as Android
kernels worked with Clang when Clang was still miscompiling kernel on
different architectures and configurations.)

In those cases, you can just carry a patch to force the gold linker to
work. The point though is the teams that were using alternative,
not-always-reliable toolchains, were big boys and girls, and they
weren't asking the upstream kernel devs for support. And they only
cared about a few specific configurations, and not something that
would work for all or even most configurations and hardware platforms.

        - Ted

On Wed, Jul 17, 2019 at 4:47 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> The gold linker has known issues of failing the build both in random and in
> predictable ways:
>
>  - The x86/X32 VDSO build fails with:
>
>    arch/x86/entry/vdso/vclock_gettime-x32.o:vclock_gettime.c:function do_hres:
>    error: relocation overflow: reference to 'hvclock_page'
>
>    That's a known issue for years and the usual workaround is to disable
>    CONFIG_X86_32
>
>  - A recent build failure is caused by turning a relocation into an
>    absolute one for unknown reasons. See link below.
>
>  - There are a couple of gold workarounds applied already, but reports
>    about broken builds with ld.gold keep coming in on a regular basis and in
>    most cases the root cause is unclear.
>
> In context of the most recent fail H.J. stated:
>
>   "Since building a workable kernel for different kernel configurations
>    isn't a requirement for gold, I don't recommend gold for kernel."
>
> So instead of dealing with attempts to duct tape gold support without
> understanding the root cause and without support from the gold folks, fail
> the build when gold is detected.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: https://lore.kernel.org/r/CAMe9rOqMqkQ0LNpm25yE_Yt0FKp05WmHOrwc0aRDb53miFKM+w@mail.gmail.com
> ---

The code looks OK from the build system point of view.

Please let me confirm this, just in case:
For now, we give up all architectures, not only x86, right?

I have not heard much from other arch maintainers.

On Wed, 17 Jul 2019, Masahiro Yamada wrote:
> On Wed, Jul 17, 2019 at 4:47 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > So instead of dealing with attempts to duct tape gold support without
> > understanding the root cause and without support from the gold folks, fail
> > the build when gold is detected.
> >
>
> The code looks OK from the build system point of view.
>
> Please let me confirm this, just in case:
> For now, we give up all architectures, not only x86, right?

Well, that's the logical consequence of a statement which says: don't use
gold for the kernel.

> I have not heard much from other arch maintainers.

Cc'ed linux-arch for that matter.

Thanks,

        tglx

On Wed, 17 Jul 2019 at 08:57, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, 17 Jul 2019, Masahiro Yamada wrote:
> > On Wed, Jul 17, 2019 at 4:47 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > > So instead of dealing with attempts to duct tape gold support without
> > > understanding the root cause and without support from the gold folks, fail
> > > the build when gold is detected.
> > >
> >
> > The code looks OK from the build system point of view.
> >
> > Please let me confirm this, just in case:
> > For now, we give up all architectures, not only x86, right?
>
> Well, that's the logical consequence of a statement which says: don't use
> gold for the kernel.
>
> > I have not heard much from other arch maintainers.
>
> Cc'ed linux-arch for that matter.
>
> Thanks,
>
>         tglx

Hi

I've done a bit more digging, I had a second machine that was building
Linus's tree just fine with ld.gold

I tried forcing ld.bfd on the problem machine and got this:

ld.bfd: arch/x86/boot/compressed/head_64.o: warning: relocation in
read-only section `.head.text'
ld.bfd: warning: creating a DT_TEXTREL in object

I had a look at the differences in the kernel configs and noticed this:

CONFIG_RANDOMIZE_BASE=y
CONFIG_X86_NEED_RELOCS=y
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_DYNAMIC_MEMORY_LAYOUT=y
CONFIG_RANDOMIZE_MEMORY=y
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0x0

Unsetting CONFIG_RANDOMIZE_BASE=y gets things working for me with ld.gold again

In light of this - can we drop this patch?

Cheers

Mike

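For readers who want to repeat the config comparison described in the message above, one possible way is sketched below. `config_diff` is a hypothetical helper, not something used in the thread; it only compares lines that start with `CONFIG_`, so an option that is "not set" in one file simply shows up as a line missing on that side. The kernel tree also ships `scripts/diffconfig` for the same purpose.

```shell
#!/bin/sh
# config_diff is a made-up helper name for this sketch.
# Usage: config_diff CONFIG_A CONFIG_B
# Prints the CONFIG_ lines that differ between two kernel .config files.
# Returns 0 if the enabled options are identical, 1 if they differ.
config_diff() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }

    # Keep only the option assignments and sort them so that the
    # ordering of the two files does not matter.
    grep '^CONFIG_' "$1" | sort > "$a"
    grep '^CONFIG_' "$2" | sort > "$b"

    diff "$a" "$b"
    status=$?

    rm -f "$a" "$b"
    return "$status"
}
```

A call such as `config_diff good-machine.config problem-machine.config` (both file names are placeholders) would list lines like `CONFIG_RANDOMIZE_BASE=y` prefixed with `<` or `>` depending on which file contains them.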
On Sat, 20 Jul 2019, Mike Lothian wrote:
> On Wed, 17 Jul 2019 at 08:57, Thomas Gleixner <tglx@linutronix.de> wrote:
> I've done a bit more digging, I had a second machine that was building
> Linus's tree just fine with ld.gold
>
> I tried forcing ld.bfd on the problem machine and got this:
>
> ld.bfd: arch/x86/boot/compressed/head_64.o: warning: relocation in
> read-only section `.head.text'
> ld.bfd: warning: creating a DT_TEXTREL in object
>
> I had a look at the differences in the kernel configs and noticed this:
>
> CONFIG_RANDOMIZE_BASE=y
> CONFIG_X86_NEED_RELOCS=y
> CONFIG_PHYSICAL_ALIGN=0x1000000
> CONFIG_DYNAMIC_MEMORY_LAYOUT=y
> CONFIG_RANDOMIZE_MEMORY=y
> CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0x0
>
> Unsetting CONFIG_RANDOMIZE_BASE=y gets things working for me with ld.gold again

Can you please provide the full config? I have the above set here and it
builds just fine.

> In light of this - can we drop this patch?

No. I'm not going to deal with unsupported tools.

Thanks,

        tglx

On Sat, 20 Jul 2019 at 10:34, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Sat, 20 Jul 2019, Mike Lothian wrote:
> > On Wed, 17 Jul 2019 at 08:57, Thomas Gleixner <tglx@linutronix.de> wrote:
> > I've done a bit more digging, I had a second machine that was building
> > Linus's tree just fine with ld.gold
> >
> > I tried forcing ld.bfd on the problem machine and got this:
> >
> > ld.bfd: arch/x86/boot/compressed/head_64.o: warning: relocation in
> > read-only section `.head.text'
> > ld.bfd: warning: creating a DT_TEXTREL in object
> >
> > I had a look at the differences in the kernel configs and noticed this:
> >
> > CONFIG_RANDOMIZE_BASE=y
> > CONFIG_X86_NEED_RELOCS=y
> > CONFIG_PHYSICAL_ALIGN=0x1000000
> > CONFIG_DYNAMIC_MEMORY_LAYOUT=y
> > CONFIG_RANDOMIZE_MEMORY=y
> > CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0x0
> >
> > Unsetting CONFIG_RANDOMIZE_BASE=y gets things working for me with ld.gold again
>
> Can you please provide the full config? I have the above set here and it
> builds just fine.
>
> > In light of this - can we drop this patch?
>
> No. I'm not going to deal with unsupported tools.
>
> Thanks,
>
>         tglx

Hi

Here is my config

https://github.com/FireBurn/KernelStuff/blob/9b7e96581598d50b266f9df258e7de764949147a/dot_config_tip

Regards

Mike

On Sat, 20 Jul 2019, Mike Lothian wrote:
> Here is my config
>
> https://github.com/FireBurn/KernelStuff/blob/9b7e96581598d50b266f9df258e7de764949147a/dot_config_tip
>

Builds perfectly fine.

On Sat, 20 Jul 2019 at 11:54, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Sat, 20 Jul 2019, Mike Lothian wrote:
> > Here is my config
> >
> > https://github.com/FireBurn/KernelStuff/blob/9b7e96581598d50b266f9df258e7de764949147a/dot_config_tip
> >
>
> Builds perfectly fine.

Sorry top posted from my phone

Are you using gold? And which versions of GCC & binutils are you using?

On Sat, Jul 20, 2019 at 6:12 PM Mike Lothian <mike@fireburn.co.uk> wrote:
>
> On Wed, 17 Jul 2019 at 08:57, Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > On Wed, 17 Jul 2019, Masahiro Yamada wrote:
> > > On Wed, Jul 17, 2019 at 4:47 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > > > So instead of dealing with attempts to duct tape gold support without
> > > > understanding the root cause and without support from the gold folks, fail
> > > > the build when gold is detected.
> > > >
> > >
> > > The code looks OK from the build system point of view.
> > >
> > > Please let me confirm this, just in case:
> > > For now, we give up all architectures, not only x86, right?
> >
> > Well, that's the logical consequence of a statement which says: don't use
> > gold for the kernel.
> >
> > > I have not heard much from other arch maintainers.
> >
> > Cc'ed linux-arch for that matter.
> >
> > Thanks,
> >
> >         tglx
>
> Hi
>
> I've done a bit more digging, I had a second machine that was building
> Linus's tree just fine with ld.gold
>
> I tried forcing ld.bfd on the problem machine and got this:
>
> ld.bfd: arch/x86/boot/compressed/head_64.o: warning: relocation in
> read-only section `.head.text'
> ld.bfd: warning: creating a DT_TEXTREL in object
>
> I had a look at the differences in the kernel configs and noticed this:
>
> CONFIG_RANDOMIZE_BASE=y
> CONFIG_X86_NEED_RELOCS=y
> CONFIG_PHYSICAL_ALIGN=0x1000000
> CONFIG_DYNAMIC_MEMORY_LAYOUT=y
> CONFIG_RANDOMIZE_MEMORY=y
> CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0x0
>
> Unsetting CONFIG_RANDOMIZE_BASE=y gets things working for me with ld.gold again

Right.
I was able to build with ld.gold

So, we can use gold, depending on the kernel configuration.

> In light of this - can we drop this patch?

On Tue, 23 Jul 2019, Masahiro Yamada wrote:
> Right.
> I was able to build with ld.gold
>
> So, we can use gold, depending on the kernel configuration.

That's exactly the problem. It breaks with random kernel configurations
which is not acceptable except for people who know what they are doing.

I'm tired of dealing with half-baked fixes and 'regression' reports. Either
there is an effort to fix the issues with gold like the clang people fix
their issues or it needs to be disabled. We have a clear statement that
gold developers have other priorities.

Thanks,

        tglx

On Tue, 23 Jul 2019, Thomas Gleixner wrote:
> On Tue, 23 Jul 2019, Masahiro Yamada wrote:
> > Right.
> > I was able to build with ld.gold
> >
> > So, we can use gold, depending on the kernel configuration.
>
> That's exactly the problem. It breaks with random kernel configurations
> which is not acceptable except for people who know what they are doing.
>
> I'm tired of dealing with half-baked fixes and 'regression' reports. Either
> there is an effort to fix the issues with gold like the clang people fix
> their issues or it needs to be disabled. We have a clear statement that
> gold developers have other priorities.

That said, I'm perfectly happy to move this to x86 and leave it alone for
other architectures, but it does not make sense to me.

If the gold fans care enough, then we can add something like
CONFIG_I_WANT_TO_USE_GOLD_AND_DEAL_WITH_THE_FALLOUT_MYSELF.

Thanks,

        tglx

On Tue, Jul 23, 2019 at 5:17 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Tue, 23 Jul 2019, Thomas Gleixner wrote:
> > On Tue, 23 Jul 2019, Masahiro Yamada wrote:
> > > Right.
> > > I was able to build with ld.gold
> > >
> > > So, we can use gold, depending on the kernel configuration.
> >
> > That's exactly the problem. It breaks with random kernel configurations
> > which is not acceptable except for people who know what they are doing.
> >
> > I'm tired of dealing with half-baked fixes and 'regression' reports. Either
> > there is an effort to fix the issues with gold like the clang people fix
> > their issues or it needs to be disabled. We have a clear statement that
> > gold developers have other priorities.
>
> That said, I'm perfectly happy to move this to x86 and leave it alone for
> other architectures, but it does not make sense to me.

I did not see opposition from other arch maintainers.

> If the gold fans care enough, then we can add something like
> CONFIG_I_WANT_TO_USE_GOLD_AND_DEAL_WITH_THE_FALLOUT_MYSELF.

Let's apply this and see.

If somebody really wants to use gold at their own risk,
I will consider such a config option.

Applied to linux-kbuild. Thanks.

--- a/scripts/Kconfig.include
+++ b/scripts/Kconfig.include
@@ -35,5 +35,8 @@ ld-option = $(success,$(LD) -v $(1))
 $(error-if,$(failure,command -v $(CC)),compiler '$(CC)' not found)
 $(error-if,$(failure,command -v $(LD)),linker '$(LD)' not found)
 
+# Fail if the linker is gold as it's not capable of linking the kernel proper
+$(error-if,$(success, $(LD) -v | grep -q gold), gold linker '$(LD)' not supported)
+
 # gcc version including patch level
 gcc-version := $(shell,$(srctree)/scripts/gcc-version.sh $(CC))
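
The added Kconfig line boils down to running `$(LD) -v` and checking whether the output mentions "gold". Outside of Kconfig, roughly the same test could be written as the shell sketch below. The function names `is_gold` and `check_linker` are made up for this illustration, and the match is a plain substring search, so it would also trigger on any other linker whose version string happens to contain "gold".

```shell
#!/bin/sh
# Hypothetical stand-alone version of the check added by the patch.

# is_gold LINKER: succeed if "LINKER -v" prints something containing "gold".
is_gold() {
    "$1" -v 2>/dev/null | grep -q gold
}

# check_linker LINKER: print an error in the style of the patch and
# return 1 if LINKER looks like gold, return 0 otherwise.
check_linker() {
    if is_gold "$1"; then
        echo "gold linker '$1' not supported" >&2
        return 1
    fi
    return 0
}
```

A build wrapper could call `check_linker "${LD:-ld}" || exit 1` before invoking make, which approximates what the `error-if` line does at Kconfig time.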