[v2] kbuild: Fail if gold linker is detected

Message ID alpine.DEB.2.21.1907162135590.1767@nanos.tec.linutronix.de (mailing list archive)
State New, archived

Commit Message

Thomas Gleixner July 16, 2019, 7:47 p.m. UTC
The gold linker has known issues of failing the build both in random and in
predictable ways:

 - The x86/X32 VDSO build fails with:

   arch/x86/entry/vdso/vclock_gettime-x32.o:vclock_gettime.c:function do_hres:
   error: relocation overflow: reference to 'hvclock_page'

   That's a known issue for years and the usual workaround is to disable
   CONFIG_X86_X32

 - A recent build failure is caused by turning a relocation into an
   absolute one for unknown reasons. See link below.

 - There are a couple of gold workarounds applied already, but reports
   about broken builds with ld.gold keep coming in on a regular basis and in
   most cases the root cause is unclear.

In context of the most recent fail H.J. stated:

  "Since building a workable kernel for different kernel configurations
   isn't a requirement for gold, I don't recommend gold for kernel."

So instead of dealing with attempts to duct tape gold support without
understanding the root cause and without support from the gold folks, fail
the build when gold is detected.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/CAMe9rOqMqkQ0LNpm25yE_Yt0FKp05WmHOrwc0aRDb53miFKM+w@mail.gmail.com
---
V2: Better changelog
    Use the proper check as pointed out by Nathan
---
 scripts/Kconfig.include |    3 +++
 1 file changed, 3 insertions(+)

Comments

Nathan Chancellor July 16, 2019, 7:59 p.m. UTC | #1
On Tue, Jul 16, 2019 at 09:47:27PM +0200, Thomas Gleixner wrote:
> The gold linker has known issues of failing the build both in random and in
> predictable ways:
> 
>  - The x86/X32 VDSO build fails with:
> 
>    arch/x86/entry/vdso/vclock_gettime-x32.o:vclock_gettime.c:function do_hres:
>    error: relocation overflow: reference to 'hvclock_page'
> 
>    That's a known issue for years and the usual workaround is to disable
>    CONFIG_X86_32
> 
>  - A recent build failure is caused by turning a relocation into an
>    absolute one for unknown reasons. See link below.
> 
>  - There are a couple of gold workarounds applied already, but reports
>    about broken builds with ld.gold keep coming in on a regular basis and in
>    most cases the root cause is unclear.
> 
> In context of the most recent fail H.J. stated:
> 
>   "Since building a workable kernel for different kernel configurations
>    isn't a requirement for gold, I don't recommend gold for kernel."
> 
> So instead of dealing with attempts to duct tape gold support without
> understanding the root cause and without support from the gold folks, fail
> the build when gold is detected.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: https://lore.kernel.org/r/CAMe9rOqMqkQ0LNpm25yE_Yt0FKp05WmHOrwc0aRDb53miFKM+w@mail.gmail.com

Based on the crude little testing script I wrote below:

Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>

$ cat test.sh
#!/bin/bash

# ld.bfd (expected to pass)
make distclean defconfig || exit ${?}

# ld.gold explicitly (expected to fail; exit non-zero if it unexpectedly passes)
make LD=ld.gold distclean defconfig && exit 1

# ld.gold as if it were the system linker (expected to fail)
ln -fs /usr/bin/ld.gold ld
PATH=${PWD}:${PATH} make distclean defconfig && exit 1

# ld.lld (expected to pass)
make LD=ld.lld distclean defconfig || exit ${?}
Mike Lothian July 16, 2019, 9:20 p.m. UTC | #2
On Tue, 16 Jul 2019 at 21:00, Nathan Chancellor
<natechancellor@gmail.com> wrote:
>
> On Tue, Jul 16, 2019 at 09:47:27PM +0200, Thomas Gleixner wrote:
> > The gold linker has known issues of failing the build both in random and in
> > predictable ways:
> >
> >  - The x86/X32 VDSO build fails with:
> >
> >    arch/x86/entry/vdso/vclock_gettime-x32.o:vclock_gettime.c:function do_hres:
> >    error: relocation overflow: reference to 'hvclock_page'
> >
> >    That's a known issue for years and the usual workaround is to disable
> >    CONFIG_X86_32
> >
> >  - A recent build failure is caused by turning a relocation into an
> >    absolute one for unknown reasons. See link below.
> >
> >  - There are a couple of gold workarounds applied already, but reports
> >    about broken builds with ld.gold keep coming in on a regular basis and in
> >    most cases the root cause is unclear.
> >
> > In context of the most recent fail H.J. stated:
> >
> >   "Since building a workable kernel for different kernel configurations
> >    isn't a requirement for gold, I don't recommend gold for kernel."
> >
> > So instead of dealing with attempts to duct tape gold support without
> > understanding the root cause and without support from the gold folks, fail
> > the build when gold is detected.
> >
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Link: https://lore.kernel.org/r/CAMe9rOqMqkQ0LNpm25yE_Yt0FKp05WmHOrwc0aRDb53miFKM+w@mail.gmail.com
>
> Based on the crude little testing script I wrote below:
>
> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
> Tested-by: Nathan Chancellor <natechancellor@gmail.com>
>
> $ cat test.sh
> #!/bin/bash
>
> # ld.bfd (expected to pass)
> make distclean defconfig || exit ${?}
>
> # ld.gold explicitly (expected to fail)
> make LD=ld.gold distclean defconfig && exit ${?}
>
> # ld.gold as if it were the system linker (expected to fail)
> ln -fs /usr/bin/ld.gold ld
> PATH=${PWD}:${PATH} make distclean defconfig && exit ${?}
>
> # ld.lld (expected to pass)
> make LD=ld.lld distclean defconfig || exit ${?}

Hi

Would it be possible to force ld.bfd with -fuse-ld=bfd when gold is detected?

Are there gold bug reports for any of the issues that have been seen
with gold? It's been my default system linker for years and I've had
very few issues with it and it's a big improvement when linking with
LTO

Cheers

Mike
Thomas Gleixner July 16, 2019, 10:25 p.m. UTC | #3
Mike,

On Tue, 16 Jul 2019, Mike Lothian wrote:
> On Tue, 16 Jul 2019 at 21:00, Nathan Chancellor wrote
> 
> Would it be possible to force ld.bfd with -fuse-ld=bfd when gold is detected?

It's probably possible but way beyond my kbuild foo.

Adding LD=ld.bfd to the make invocation is the trivial workaround.
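
A minimal sketch of how that workaround could be automated outside of
kbuild. The helper name ld_override_for and the sample banner strings are
assumptions made for illustration; nothing like this exists in the tree.

```shell
#!/bin/sh
# Hypothetical helper, not part of kbuild: print the make override to use
# for a given linker banner (the text printed by "$LD -v").
ld_override_for() {
    case "$1" in
        *gold*) echo "LD=ld.bfd" ;;  # gold detected: fall back to ld.bfd
        *)      echo "" ;;           # anything else: no override needed
    esac
}

# Sample banners (assumed examples of what "ld -v" may print).
ld_override_for "GNU gold (GNU Binutils 2.32) 1.16"
ld_override_for "GNU ld (GNU Binutils) 2.32"
```

It could then be used as: make $(ld_override_for "$(ld -v)") defconfig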

> Are there gold bug reports for any of the issues that have been seen
> with gold?

Yes. Some got resolved, some not.

> It's been my default system linker for years and I've had very few issues
> with it and it's a big improvement when linking with LTO

I understand, but the fact that you need to turn off config options in
order to build a kernel and the clear statement that it's not recommended
makes it truly unsuitable and unmaintainable for us.

If the gold people are interested in linking a kernel and resolving all issues,
this surely can be revisited. We work with tooling folks and we try to
accommodate different tools, see the ongoing efforts for clang, but that
requires commitment from the tooling side.

Thanks,

	tglx
Theodore Ts'o July 16, 2019, 11:37 p.m. UTC | #4
On Wed, Jul 17, 2019 at 12:25:14AM +0200, Thomas Gleixner wrote:
> > It's been my default system linker for years and I've had very few issues
> > with it and it's a big improvement when linking with LTO
> 
> I understand, but the fact that you need to turn off config options in
> order to build a kernel and the clear statement that it's not recommended
> makes it truly unsuitable and unmaintainable for us.

Or if you work for a cloud company who is willing to make the gold
linker work for your specific use case and configuration (and ideally,
have gold toolchain experts on staff who will work with you), then it
might be OK, but just for that particular use case.  (Just as Android
kernels worked with Clang when Clang was still miscompiling kernel on
different architectures and configurations.)  In those cases, you can
just carry a patch to force the gold linker to work.

The point though is the teams that were using alternative,
not-always-reliable toolchains, were big boys and girls, and they
weren't asking the upstream kernel devs for support.  And they only
cared about a few specific configurations, and not something that
would work for all or even most configurations and hardware platforms.

	      	       	      	   	- Ted
Masahiro Yamada July 17, 2019, 6:54 a.m. UTC | #5
On Wed, Jul 17, 2019 at 4:47 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> The gold linker has known issues of failing the build both in random and in
> predictable ways:
>
>  - The x86/X32 VDSO build fails with:
>
>    arch/x86/entry/vdso/vclock_gettime-x32.o:vclock_gettime.c:function do_hres:
>    error: relocation overflow: reference to 'hvclock_page'
>
>    That's a known issue for years and the usual workaround is to disable
>    CONFIG_X86_32
>
>  - A recent build failure is caused by turning a relocation into an
>    absolute one for unknown reasons. See link below.
>
>  - There are a couple of gold workarounds applied already, but reports
>    about broken builds with ld.gold keep coming in on a regular basis and in
>    most cases the root cause is unclear.
>
> In context of the most recent fail H.J. stated:
>
>   "Since building a workable kernel for different kernel configurations
>    isn't a requirement for gold, I don't recommend gold for kernel."
>
> So instead of dealing with attempts to duct tape gold support without
> understanding the root cause and without support from the gold folks, fail
> the build when gold is detected.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: https://lore.kernel.org/r/CAMe9rOqMqkQ0LNpm25yE_Yt0FKp05WmHOrwc0aRDb53miFKM+w@mail.gmail.com
> ---

The code looks OK in the build system point of view.

Please let me confirm this, just in case:
For now, we give up all architectures, not only x86, right?

I have not heard much from other arch maintainers.
Thomas Gleixner July 17, 2019, 7:57 a.m. UTC | #6
On Wed, 17 Jul 2019, Masahiro Yamada wrote:
> On Wed, Jul 17, 2019 at 4:47 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > So instead of dealing with attempts to duct tape gold support without
> > understanding the root cause and without support from the gold folks, fail
> > the build when gold is detected.
> >
> 
> The code looks OK in the build system point of view.
> 
> Please let me confirm this, just in case:
> For now, we give up all architectures, not only x86, right?

Well, that's the logical consequence of a statement which says: don't use
gold for the kernel.

> I have not heard much from other arch maintainers.

Cc'ed linux-arch for that matter.

Thanks,

	tglx
Mike Lothian July 20, 2019, 9:12 a.m. UTC | #7
On Wed, 17 Jul 2019 at 08:57, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, 17 Jul 2019, Masahiro Yamada wrote:
> > On Wed, Jul 17, 2019 at 4:47 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > > So instead of dealing with attempts to duct tape gold support without
> > > understanding the root cause and without support from the gold folks, fail
> > > the build when gold is detected.
> > >
> >
> > The code looks OK in the build system point of view.
> >
> > Please let me confirm this, just in case:
> > For now, we give up all architectures, not only x86, right?
>
> Well, that's the logical consequence of a statement which says: don't use
> gold for the kernel.
>
> > I have not heard much from other arch maintainers.
>
> Cc'ed linux-arch for that matter.
>
> Thanks,
>
>         tglx

Hi

I've done a bit more digging, I had a second machine that was building
Linus's tree just fine with ld.gold

I tried forcing ld.bfd on the problem machine and got this:

ld.bfd: arch/x86/boot/compressed/head_64.o: warning: relocation in
read-only section `.head.text'
ld.bfd: warning: creating a DT_TEXTREL in object

I had a look at the differences in the kernel configs and noticed this:

CONFIG_RANDOMIZE_BASE=y
CONFIG_X86_NEED_RELOCS=y
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_DYNAMIC_MEMORY_LAYOUT=y
CONFIG_RANDOMIZE_MEMORY=y
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0x0
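
A comparison like this can be sketched with standard tools. The two config
files below are made-up samples, not the actual configs from either
machine, and only_in_second is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list CONFIG_ lines present in one kernel config but not another.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Made-up sample configs for illustration only.
cat > "$workdir/good.config" <<'EOF'
CONFIG_PHYSICAL_ALIGN=0x1000000
# CONFIG_RANDOMIZE_BASE is not set
EOF

cat > "$workdir/bad.config" <<'EOF'
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_RANDOMIZE_BASE=y
CONFIG_RANDOMIZE_MEMORY=y
EOF

# Print the CONFIG_ lines that appear only in the second file.
only_in_second() {
    LC_ALL=C sort "$1" > "$workdir/a.sorted"
    LC_ALL=C sort "$2" > "$workdir/b.sorted"
    # comm -13 drops lines unique to the first file and lines common to both.
    LC_ALL=C comm -13 "$workdir/a.sorted" "$workdir/b.sorted" | grep '^CONFIG_'
}

only_in_second "$workdir/good.config" "$workdir/bad.config"
```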

Unsetting CONFIG_RANDOMIZE_BASE=y gets things working for me with ld.gold again

In light of this - can we drop this patch?

Cheers

Mike
Thomas Gleixner July 20, 2019, 9:34 a.m. UTC | #8
On Sat, 20 Jul 2019, Mike Lothian wrote:
> On Wed, 17 Jul 2019 at 08:57, Thomas Gleixner <tglx@linutronix.de> wrote:
> I've done a bit more digging, I had a second machine that was building
> Linus's tree just fine with ld.gold
> 
> I tried forcing ld.bfd on the problem machine and got this:
> 
> ld.bfd: arch/x86/boot/compressed/head_64.o: warning: relocation in
> read-only section `.head.text'
> ld.bfd: warning: creating a DT_TEXTREL in object
> 
> I had a look at the differences in the kernel configs and noticed this:
> 
> CONFIG_RANDOMIZE_BASE=y
> CONFIG_X86_NEED_RELOCS=y
> CONFIG_PHYSICAL_ALIGN=0x1000000
> CONFIG_DYNAMIC_MEMORY_LAYOUT=y
> CONFIG_RANDOMIZE_MEMORY=y
> CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0x0
> 
> Unsetting CONFIG_RANDOMIZE_BASE=y gets things working for me with ld.gold again

Can you please provide the full config? I have the above set here and it
builds just fine.

> In light of this - can we drop this patch?

No. I'm not going to deal with unsupported tools.

Thanks,

	tglx
Mike Lothian July 20, 2019, 10:13 a.m. UTC | #9
On Sat, 20 Jul 2019 at 10:34, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Sat, 20 Jul 2019, Mike Lothian wrote:
> > On Wed, 17 Jul 2019 at 08:57, Thomas Gleixner <tglx@linutronix.de> wrote:
> > I've done a bit more digging, I had a second machine that was building
> > Linus's tree just fine with ld.gold
> >
> > I tried forcing ld.bfd on the problem machine and got this:
> >
> > ld.bfd: arch/x86/boot/compressed/head_64.o: warning: relocation in
> > read-only section `.head.text'
> > ld.bfd: warning: creating a DT_TEXTREL in object
> >
> > I had a look at the differences in the kernel configs and noticed this:
> >
> > CONFIG_RANDOMIZE_BASE=y
> > CONFIG_X86_NEED_RELOCS=y
> > CONFIG_PHYSICAL_ALIGN=0x1000000
> > CONFIG_DYNAMIC_MEMORY_LAYOUT=y
> > CONFIG_RANDOMIZE_MEMORY=y
> > CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0x0
> >
> > Unsetting CONFIG_RANDOMIZE_BASE=y gets things working for me with ld.gold again
>
> Can you please provide the full config? I have the above set here and it
> builds just fine.
>
> > In light of this - can we drop this patch?
>
> No. I'm not going to deal with unsupported tools.
>
> Thanks,
>
>         tglx

Hi

Here is my config

https://github.com/FireBurn/KernelStuff/blob/9b7e96581598d50b266f9df258e7de764949147a/dot_config_tip

Regards

Mike
Thomas Gleixner July 20, 2019, 10:54 a.m. UTC | #10
On Sat, 20 Jul 2019, Mike Lothian wrote:
> Here is my config
> 
> https://github.com/FireBurn/KernelStuff/blob/9b7e96581598d50b266f9df258e7de764949147a/dot_config_tip
> 

Builds perfectly fine.
Mike Lothian July 20, 2019, 10:59 a.m. UTC | #11
On Sat, 20 Jul 2019 at 11:54, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Sat, 20 Jul 2019, Mike Lothian wrote:
> > Here is my config
> >
> > https://github.com/FireBurn/KernelStuff/blob/9b7e96581598d50b266f9df258e7de764949147a/dot_config_tip
> >
>
> Builds perfectly fine.

Sorry top posted from my phone

Are you using gold? And which versions of GCC & binutils are you using?
Masahiro Yamada July 23, 2019, 1:30 a.m. UTC | #12
On Sat, Jul 20, 2019 at 6:12 PM Mike Lothian <mike@fireburn.co.uk> wrote:
>
> On Wed, 17 Jul 2019 at 08:57, Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > On Wed, 17 Jul 2019, Masahiro Yamada wrote:
> > > On Wed, Jul 17, 2019 at 4:47 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > > > So instead of dealing with attempts to duct tape gold support without
> > > > understanding the root cause and without support from the gold folks, fail
> > > > the build when gold is detected.
> > > >
> > >
> > > The code looks OK in the build system point of view.
> > >
> > > Please let me confirm this, just in case:
> > > For now, we give up all architectures, not only x86, right?
> >
> > Well, that's the logical consequence of a statement which says: don't use
> > gold for the kernel.
> >
> > > I have not heard much from other arch maintainers.
> >
> > Cc'ed linux-arch for that matter.
> >
> > Thanks,
> >
> >         tglx
>
> Hi
>
> I've done a bit more digging, I had a second machine that was building
> Linus's tree just fine with ld.gold
>
> I tried forcing ld.bfd on the problem machine and got this:
>
> ld.bfd: arch/x86/boot/compressed/head_64.o: warning: relocation in
> read-only section `.head.text'
> ld.bfd: warning: creating a DT_TEXTREL in object
>
> I had a look at the differences in the kernel configs and noticed this:
>
> CONFIG_RANDOMIZE_BASE=y
> CONFIG_X86_NEED_RELOCS=y
> CONFIG_PHYSICAL_ALIGN=0x1000000
> CONFIG_DYNAMIC_MEMORY_LAYOUT=y
> CONFIG_RANDOMIZE_MEMORY=y
> CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0x0
>
> Unsetting CONFIG_RANDOMIZE_BASE=y gets things working for me with ld.gold again


Right.
I was able to build with ld.gold

So, we can use gold, depending on the kernel configuration.


> In light of this - can we drop this patch?
Thomas Gleixner July 23, 2019, 6:41 a.m. UTC | #13
On Tue, 23 Jul 2019, Masahiro Yamada wrote:
> Right.
> I was able to build with ld.gold
> 
> So, we can use gold, depending on the kernel configuration.

That's exactly the problem. It breaks with random kernel configurations
which is not acceptable except for people who know what they are doing.

I'm tired of dealing with half-baked fixes and 'regression' reports. Either
there is an effort to fix the issues with gold like the clang people fix
their issues or it needs to be disabled. We have a clear statement that
gold developers have other priorities.

Thanks,

	tglx
Thomas Gleixner July 23, 2019, 8:17 a.m. UTC | #14
On Tue, 23 Jul 2019, Thomas Gleixner wrote:
> On Tue, 23 Jul 2019, Masahiro Yamada wrote:
> > Right.
> > I was able to build with ld.gold
> > 
> > So, we can use gold, depending on the kernel configuration.
> 
> That's exactly the problem. It breaks with random kernel configurations
> which is not acceptable except for people who know what they are doing.
> 
> I'm tired of dealing with half-baked fixes and 'regression' reports. Either
> there is an effort to fix the issues with gold like the clang people fix
> their issues or it needs to be disabled. We have a clear statement that
> gold developers have other priorities.

That said, I'm perfectly happy to move this to x86 and leave it alone for
other architectures, but it does not make sense to me.

If the gold fans care enough, then we can add something like
CONFIG_I_WANT_TO_USE_GOLD_AND_DEAL_WITH_THE_FALLOUT_MYSELF.

Thanks,

	tglx
Masahiro Yamada July 29, 2019, 2:27 a.m. UTC | #15
On Tue, Jul 23, 2019 at 5:17 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Tue, 23 Jul 2019, Thomas Gleixner wrote:
> > On Tue, 23 Jul 2019, Masahiro Yamada wrote:
> > > Right.
> > > I was able to build with ld.gold
> > >
> > > So, we can use gold, depending on the kernel configuration.
> >
> > That's exactly the problem. It breaks with random kernel configurations
> > which is not acceptable except for people who know what they are doing.
> >
> > I'm tired of dealing with half-baked fixes and 'regression' reports. Either
> > there is an effort to fix the issues with gold like the clang people fix
> > their issues or it needs to be disabled. We have a clear statement that
> > gold developers have other priorities.
>
> That said, I'm perfectly happy to move this to x86 and leave it alone for
> other architectures, but it does not make sense to me.


I did not see opposition from other arch maintainers.


> If the gold fans care enough, then we can add something like
> CONFIG_I_WANT_TO_USE_GOLD_AND_DEAL_WITH_THE_FALLOUT_MYSELF.

Let's apply this and see.

If somebody really wants to use gold at their own risk,
I will consider such a config option.


Applied to linux-kbuild.
Thanks.

Patch

--- a/scripts/Kconfig.include
+++ b/scripts/Kconfig.include
@@ -35,5 +35,8 @@  ld-option = $(success,$(LD) -v $(1))
 $(error-if,$(failure,command -v $(CC)),compiler '$(CC)' not found)
 $(error-if,$(failure,command -v $(LD)),linker '$(LD)' not found)
 
+# Fail if the linker is gold as it's not capable of linking the kernel proper
+$(error-if,$(success, $(LD) -v | grep -q gold), gold linker '$(LD)' not supported)
+
 # gcc version including patch level
 gcc-version := $(shell,$(srctree)/scripts/gcc-version.sh $(CC))
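
The check added above can be tried outside of Kconfig with a rough shell
sketch. linker_is_gold, fake_gold and fake_bfd are hypothetical names, and
the banner strings are assumed examples of "-v" output; the stand-in
functions let the sketch run without binutils installed.

```shell
#!/bin/sh
# Mirrors the patch's test: run "<linker> -v" and grep the banner for "gold".
linker_is_gold() {
    "$1" -v | grep -q gold
}

# Stand-in "linkers" implemented as shell functions (assumed banners).
fake_gold() { echo "GNU gold (GNU Binutils 2.32) 1.16"; }
fake_bfd()  { echo "GNU ld (GNU Binutils) 2.32"; }

for ld in fake_gold fake_bfd; do
    if linker_is_gold "$ld"; then
        echo "gold linker '$ld' not supported"
    else
        echo "linker '$ld' accepted"
    fi
done
```

Because the match is a plain substring search, any linker whose banner
happens to contain "gold" would be rejected as well.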