kbuild: reuse vmlinux.o in vmlinux_link

Message ID 20200521202716.193316-1-samitolvanen@google.com
State New

Commit Message

Sami Tolvanen May 21, 2020, 8:27 p.m. UTC
Instead of linking all compilation units again each time vmlinux_link is
called, reuse vmlinux.o from modpost_link.

With x86_64 allyesconfig, vmlinux_link is called three times and reusing
vmlinux.o reduces the build time ~38 seconds on my system (59% reduction
in the time spent in vmlinux_link).

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 scripts/link-vmlinux.sh | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)


base-commit: b85051e755b0e9d6dd8f17ef1da083851b83287d

Comments

Kees Cook May 21, 2020, 10:08 p.m. UTC | #1
On Thu, May 21, 2020 at 01:27:16PM -0700, Sami Tolvanen wrote:
> Instead of linking all compilation units again each time vmlinux_link is
> called, reuse vmlinux.o from modpost_link.
> 
> With x86_64 allyesconfig, vmlinux_link is called three times and reusing
> vmlinux.o reduces the build time ~38 seconds on my system (59% reduction
> in the time spent in vmlinux_link).

Nice! Any time savings at final link is a big cumulative win.

> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
>  scripts/link-vmlinux.sh | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> index d09ab4afbda4..c6cc4305950c 100755
> --- a/scripts/link-vmlinux.sh
> +++ b/scripts/link-vmlinux.sh
> @@ -77,11 +77,8 @@ vmlinux_link()
>  
>  	if [ "${SRCARCH}" != "um" ]; then
>  		objects="--whole-archive			\
> -			${KBUILD_VMLINUX_OBJS}			\
> +			vmlinux.o				\
>  			--no-whole-archive			\
> -			--start-group				\
> -			${KBUILD_VMLINUX_LIBS}			\
> -			--end-group				\
>  			${@}"
>  
>  		${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux}	\

I think the "um" case can be updated as well, yes?

Also, I think the comment above modpost_link() needs to be updated now
to reflect the nature of how vmlinux.o gets used after this patch.
Kees Cook May 21, 2020, 10:18 p.m. UTC | #2
On Thu, May 21, 2020 at 01:27:16PM -0700, Sami Tolvanen wrote:
> Instead of linking all compilation units again each time vmlinux_link is
> called, reuse vmlinux.o from modpost_link.
> 
> With x86_64 allyesconfig, vmlinux_link is called three times and reusing
> vmlinux.o reduces the build time ~38 seconds on my system (59% reduction
> in the time spent in vmlinux_link).

BTW, I'll see this most in that it knocks about 6% off my "I changed
1 .c file and now I'm rebuilding" workflow time (which is obviously
dominated by linking), from 25 seconds to 23.5 seconds. And since most
of those seconds are spent staring at the build, it feels like a lot
more. ;)
Masahiro Yamada May 22, 2020, 5:41 p.m. UTC | #3
On Fri, May 22, 2020 at 5:27 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> Instead of linking all compilation units again each time vmlinux_link is
> called, reuse vmlinux.o from modpost_link.
>
> With x86_64 allyesconfig, vmlinux_link is called three times and reusing
> vmlinux.o reduces the build time ~38 seconds on my system (59% reduction
> in the time spent in vmlinux_link).


I like this patch irrespective of CLANG_LTO, but
unfortunately, my build test failed.


ARCH=powerpc failed to build as follows:



  MODPOST vmlinux.o
  MODINFO modules.builtin.modinfo
  GEN     modules.builtin
  LD      .tmp_vmlinux.kallsyms1
vmlinux.o:(__ftr_alt_97+0x20): relocation truncated to fit:
R_PPC64_REL14 against `.text'+4b1c
vmlinux.o:(__ftr_alt_97+0x164): relocation truncated to fit:
R_PPC64_REL14 against `.text'+1cf78
vmlinux.o:(__ftr_alt_97+0x288): relocation truncated to fit:
R_PPC64_REL14 against `.text'+1dac4
vmlinux.o:(__ftr_alt_97+0x2f0): relocation truncated to fit:
R_PPC64_REL14 against `.text'+1e254
make: *** [Makefile:1125: vmlinux] Error 1



I used powerpc-linux-gcc
available at
https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/9.2.0/


Build command:

make -j24 ARCH=powerpc  CROSS_COMPILE=powerpc-linux-  defconfig all


Could you check it please?



I will apply it to my test branch.
Perhaps, 0-day bot may find more failure cases.


--
Best Regards
Masahiro Yamada
Masahiro Yamada May 22, 2020, 5:44 p.m. UTC | #4
+ Michael, and PPC ML.

They may know something about the reason for the failure.


Masahiro Yamada May 22, 2020, 6:16 p.m. UTC | #5
On Fri, May 22, 2020 at 7:08 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Thu, May 21, 2020 at 01:27:16PM -0700, Sami Tolvanen wrote:
> > Instead of linking all compilation units again each time vmlinux_link is
> > called, reuse vmlinux.o from modpost_link.
> >
> > With x86_64 allyesconfig, vmlinux_link is called three times and reusing
> > vmlinux.o reduces the build time ~38 seconds on my system (59% reduction
> > in the time spent in vmlinux_link).
>
> Nice! Any time savings at final link is a big cumulative win.
>
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> > ---
> >  scripts/link-vmlinux.sh | 5 +----
> >  1 file changed, 1 insertion(+), 4 deletions(-)
> >
> > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> > index d09ab4afbda4..c6cc4305950c 100755
> > --- a/scripts/link-vmlinux.sh
> > +++ b/scripts/link-vmlinux.sh
> > @@ -77,11 +77,8 @@ vmlinux_link()
> >
> >       if [ "${SRCARCH}" != "um" ]; then
> >               objects="--whole-archive                        \
> > -                     ${KBUILD_VMLINUX_OBJS}                  \
> > +                     vmlinux.o                               \
> >                       --no-whole-archive                      \
> > -                     --start-group                           \
> > -                     ${KBUILD_VMLINUX_LIBS}                  \
> > -                     --end-group                             \
> >                       ${@}"
> >
> >               ${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux}      \
>
> I think the "um" case can be updated as well too, yes?

I agree.

I changed the um part as well, and the ARCH=um build succeeds.







> Also, I think the comment above modpost_link() needs to be updated now
> to reflect the nature of how vmlinux.o gets used after this patch.
>
> --
> Kees Cook
Nicholas Piggin May 23, 2020, 10:06 a.m. UTC | #6
Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am:
> + Michael, and PPC ML.
> 
> They may know something about the reason of failure.

The linker can't put branch stubs within object code sections, so when
the incrementally linked objects get too large, the linker can't resolve
branches into other object files.
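
To make the range limit concrete: R_PPC64_REL14 is used for conditional
branches whose displacement field is 14 bits wide with two implied zero
bits, so the target has to lie within a signed 16-bit byte offset (roughly
+/-32 KiB) of the branch. The sketch below only illustrates that
arithmetic; fits_rel14 is a hypothetical helper name and is not part of
kbuild or binutils.

```shell
#!/bin/sh
# Illustrative only: report whether a branch displacement (in bytes) can be
# encoded in a 14-bit, word-aligned relative branch field (R_PPC64_REL14).
# fits_rel14 is a hypothetical helper, not something kbuild provides.
fits_rel14()
{
	disp=$1
	# 14-bit field plus 2 implied zero bits = signed 16-bit byte offset
	min=-32768
	max=32764

	if [ $((disp % 4)) -ne 0 ]; then
		echo "unaligned"
	elif [ "$disp" -ge "$min" ] && [ "$disp" -le "$max" ]; then
		echo "fits"
	else
		echo "truncated"
	fi
}
```

When the target is further away than that and no stub can be placed in
between, the linker has nothing left to do but report "relocation
truncated to fit", as in the log above.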

This is why we added incremental linking in the first place. I suppose 
it could be made conditional for platforms that can use this 
optimization.
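
One possible shape for such a conditional, as a rough sketch only:
REUSE_VMLINUX_O is a hypothetical switch invented for this example (it
could be set for LTO builds or for architectures known to cope with one
large object), and select_vmlinux_objects is likewise a made-up name.
KBUILD_VMLINUX_OBJS and KBUILD_VMLINUX_LIBS are the lists already used by
vmlinux_link in the patch.

```shell
#!/bin/sh
# Sketch only: choose the linker inputs for vmlinux_link.
# REUSE_VMLINUX_O is a hypothetical switch, not an existing kbuild variable.
select_vmlinux_objects()
{
	if [ "${REUSE_VMLINUX_O}" = "y" ]; then
		# Reuse the object already produced by modpost_link
		echo "--whole-archive vmlinux.o --no-whole-archive"
	else
		# Fall back to linking the individual archives again
		echo "--whole-archive ${KBUILD_VMLINUX_OBJS} --no-whole-archive" \
		     "--start-group ${KBUILD_VMLINUX_LIBS} --end-group"
	fi
}
```

The fallback branch keeps the existing behaviour, so platforms that need
the linker to insert stubs between input sections would be unaffected.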

What'd be really nice is if we could somehow build and link kallsyms 
without relinking everything twice, and if we could do section mismatch 
analysis without making that vmlinux.o as well. I had a few ideas but 
not enough time to do much work on it.

Thanks,
Nick

Masahiro Yamada May 23, 2020, 3:12 p.m. UTC | #7
Hi Nicholas,
(+CC: Sam Ravnborg)


On Sat, May 23, 2020 at 7:06 PM Nicholas Piggin <npiggin@gmail.com> wrote:
>
> Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am:
> > + Michael, and PPC ML.
> >
> > They may know something about the reason of failure.
>
> Because the linker can't put branch stubs within object code sections,
> so when you incrementally link them too large, the linker can't resolve
> branches into other object files.


Ah, you are right.

So, this is a problem not only for PPC
but also for ARM (both 32 and 64 bit), etc.

ARM needs to insert a veneer to jump far.

Prior to thin archive, we could not compile
ARCH=arm allyesconfig because
drivers/built-in.o was too large.

This patch gets us back to the too large
incremental object situation.

With my quick compile-testing,
ARCH=arm allyesconfig
and ARCH=arm64 allyesconfig are broken.


> This is why we added incremental linking in the first place. I suppose
> it could be made conditional for platforms that can use this
> optimization.
>
> What'd be really nice is if we could somehow build and link kallsyms
> without relinking everything twice, and if we could do section mismatch
> analysis without making that vmlinux.o as well. I had a few ideas but
> not enough time to do much work on it.


Right, kallsyms links 3 times. (not twice)


Hmm, I think Sami's main motivation is Clang LTO.

LTO is very time-consuming.
So, the android common kernel implements Clang LTO
in the pre modpost stage:


1) LTO against vmlinux.o

2) modpost against vmlinux.o

3) Link vmlinux.o + kallsyms into vmlinux
   (this requires linking 3 times)



If we move LTO to 3), we need to do LTO 3 times.

And, this was how GCC LTO was implemented in 2014,
(then rejected by Linus).


How to do modpost without making vmlinux.o ?

In old days, the section mismatch analysis was done
against the final vmlinux.


85bd2fddd68e757da8e1af98f857f61a3c9ce647 changed
it to run modpost for individual .o files.

Then, 741f98fe298a73c9d47ed53703c1279a29718581
introduced vmlinux.o to use it for modpost.


These are the following two commits.
I did not fully understand the background, though.

I CC'ed Sam in case he may add some comments.





commit 85bd2fddd68e757da8e1af98f857f61a3c9ce647
Author: Sam Ravnborg <sam@ravnborg.org>
Date:   Mon Feb 26 15:33:52 2007 +0100

    kbuild: fix section mismatch check for vmlinux

    vmlinux does not contain relocation entries which is
    used by the section mismatch checks.
    Reported by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>

    Use the individual objects as inputs to overcome
    this limitation.
    In modpost check the .o files and skip non-ELF files.

    Signed-off-by: Sam Ravnborg <sam@ravnborg.org>





commit 741f98fe298a73c9d47ed53703c1279a29718581
Author: Sam Ravnborg <sam@ravnborg.org>
Date:   Tue Jul 17 10:54:06 2007 +0200

    kbuild: do section mismatch check on full vmlinux

    Previously we did do the check on the .o files used to link
    vmlinux but that failed to find questionable references across
    the .o files.
    Create a dedicated vmlinux.o file used only for section mismatch checks
    that uses the defualt linker script so section does not get renamed.

    The vmlinux.o may later be used as part of the the final link of vmlinux
    but for now it is used fo section mismatch only.
    For a defconfig build this is instant but for an allyesconfig this
    add two minutes to a full build (that anyways takes ~2 hours).

    Signed-off-by: Sam Ravnborg <sam@ravnborg.org>















--
Best Regards
Masahiro Yamada
Sam Ravnborg May 23, 2020, 4:53 p.m. UTC | #8
Hi Masahiro.

On Sun, May 24, 2020 at 12:12:35AM +0900, Masahiro Yamada wrote:
> Hi Nicholas,
> (+CC: Sam Ravnborg)
> 
> 
> On Sat, May 23, 2020 at 7:06 PM Nicholas Piggin <npiggin@gmail.com> wrote:
> >
> > Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am:
> > > + Michael, and PPC ML.
> > >
> > > They may know something about the reason of failure.
> >
> > Because the linker can't put branch stubs within object code sections,
> > so when you incrementally link them too large, the linker can't resolve
> > branches into other object files.
> 
> 
> Ah, you are right.
> 
> So, this is a problem not only for PPC
> but also for ARM (both 32 and 64 bit), etc.
> 
> ARM needs to insert a veneer to jump far.
> 
> Prior to thin archive, we could not compile
> ARCH=arm allyesconfig because
> drivers/built-in.o was too large.
> 
> This patch gets us back to the too large
> incremental object situation.
> 
> With my quick compile-testing,
> ARCH=arm allyesconfig
> and ARCH=arm64 allyesconfig are broken.
> 
> 
> > This is why we added incremental linking in the first place. I suppose
> > it could be made conditional for platforms that can use this
> > optimization.
> >
> > What'd be really nice is if we could somehow build and link kallsyms
> > without relinking everything twice, and if we could do section mismatch
> > analysis without making that vmlinux.o as well. I had a few ideas but
> > not enough time to do much work on it.
> 
> 
> Right, kallsyms links 3 times. (not twice)
> 
> 
> Hmm, I think Sami's main motivation is Clang LTO.
> 
> LTO is very time-consuming.
> So, the android common kernel implements Clang LTO
> in the pre modpost stage:
> 
> 
> 1) LTO against vmlinux.o
> 
> 2) modpost against vmlinux.o
> 
> 3) Link vmlinux.o + kallsyms into vmlinux
>    (this requires linking 3 times)

With kallsyms we had to link three times because the linking
increased the object a little in size, so the symbols did not match.
The last pass was added more or less only to check that we did
have stable symbol addresses.

All this predates LTO stuff which we only introduced later.

The reason for doing modpost on vmlinux.o was that we had cases
where everything in drivers/ was fine but there were section mismatch
references from arch/* to drivers/*.
This is back when there were many more drivers in arch/ than what we
have today.
And back then we also had much more to check, as we had CPU hotplug
that could really cause section mismatches - this is no longer the case,
which is a good thing.



...
> 
> The following two commits.
> I did not fully understand the background, though.
> 
> I CC'ed Sam in case he may add some comments.
> 
> commit 85bd2fddd68e757da8e1af98f857f61a3c9ce647
> Author: Sam Ravnborg <sam@ravnborg.org>
> Date:   Mon Feb 26 15:33:52 2007 +0100
> 
>     kbuild: fix section mismatch check for vmlinux
> 
>     vmlinux does not contain relocation entries which is
>     used by the section mismatch checks.
>     Reported by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
> 
>     Use the individual objects as inputs to overcome
>     this limitation.
>     In modpost check the .o files and skip non-ELF files.
> 
>     Signed-off-by: Sam Ravnborg <sam@ravnborg.org>


So we checked vmlinux - but vmlinux had too much stripped away,
so in reality nothing was checked.
To allow the warnings to be as precise as possible, the checks were moved
out to the individual .o files.
Sometimes the names were mangled a little, so if warnings were only
reported at the vmlinux level it could be difficult to track down the
offender.
This would then also do the check on .o files that had all the
relocation symbols required.

> 
> commit 741f98fe298a73c9d47ed53703c1279a29718581
> Author: Sam Ravnborg <sam@ravnborg.org>
> Date:   Tue Jul 17 10:54:06 2007 +0200
> 
>     kbuild: do section mismatch check on full vmlinux
> 
>     Previously we did do the check on the .o files used to link
>     vmlinux but that failed to find questionable references across
>     the .o files.
>     Create a dedicated vmlinux.o file used only for section mismatch checks
>     that uses the defualt linker script so section does not get renamed.
> 
>     The vmlinux.o may later be used as part of the the final link of vmlinux
>     but for now it is used fo section mismatch only.
>     For a defconfig build this is instant but for an allyesconfig this
>     add two minutes to a full build (that anyways takes ~2 hours).
> 
>     Signed-off-by: Sam Ravnborg <sam@ravnborg.org>

But when we introduced the check of the individual .o files we missed the
cases where the references spanned outside the .o files, as explained previously.
So we included a link of vmlinux.o that did NOT drop the relocations
so we could use it to check for the remaining section mismatch warnings.

Remember - back when we started this we had many hundred warnings
and it was a fight to keep that number low.
But we also wanted to report as much as possible.

Back then there were several discussions about whether this was really
worth the effort: how much was gained from discarding the memory where the
section mismatch warnings were triggered?
In other words - how about just keeping the init code in memory so there
were no illegal references anymore.
That is something that is maybe worth considering again, as we have even
less memory we save by throwing away the init code.
But I think this is a topic for another mail thread.

	Sam
Masahiro Yamada May 25, 2020, 6:13 a.m. UTC | #9
Hi Sam,

Thanks for the comments.

On Sun, May 24, 2020 at 1:54 AM Sam Ravnborg <sam@ravnborg.org> wrote:
>
> Hi Masahiro.
>
> On Sun, May 24, 2020 at 12:12:35AM +0900, Masahiro Yamada wrote:
> > Hi Nicholas,
> > (+CC: Sam Ravnborg)
> >
> >
> > On Sat, May 23, 2020 at 7:06 PM Nicholas Piggin <npiggin@gmail.com> wrote:
> > >
> > > Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am:
> > > > + Michael, and PPC ML.
> > > >
> > > > They may know something about the reason of failure.
> > >
> > > Because the linker can't put branch stubs within object code sections,
> > > so when you incrementally link them too large, the linker can't resolve
> > > branches into other object files.
> >
> >
> > Ah, you are right.
> >
> > So, this is a problem not only for PPC
> > but also for ARM (both 32 and 64 bit), etc.
> >
> > ARM needs to insert a veneer to jump far.
> >
> > Prior to thin archive, we could not compile
> > ARCH=arm allyesconfig because
> > drivers/built-in.o was too large.
> >
> > This patch gets us back to the too large
> > incremental object situation.
> >
> > With my quick compile-testing,
> > ARCH=arm allyesconfig
> > and ARCH=arm64 allyesconfig are broken.
> >
> >
> > > This is why we added incremental linking in the first place. I suppose
> > > it could be made conditional for platforms that can use this
> > > optimization.
> > >
> > > What'd be really nice is if we could somehow build and link kallsyms
> > > without relinking everything twice, and if we could do section mismatch
> > > analysis without making that vmlinux.o as well. I had a few ideas but
> > > not enough time to do much work on it.
> >
> >
> > Right, kallsyms links 3 times. (not twice)
> >
> >
> > Hmm, I think Sami's main motivation is Clang LTO.
> >
> > LTO is very time-consuming.
> > So, the android common kernel implements Clang LTO
> > in the pre modpost stage:
> >
> >
> > 1) LTO against vmlinux.o
> >
> > 2) modpost against vmlinux.o
> >
> > 3) Link vmlinux.o + kallsyms into vmlinux
> >    (this requires linking 3 times)
>
> With kallsyms we had to link three times because the linking
> increased the object a little in size, so the symbols did not match.
> The last pass was added more or less only to check that we did
> have stable symbol addresses.


Usually vmlinux_link is invoked 3 times if CONFIG_KALLSYMS=y.

(kallsyms_step 1, kallsyms_step 2, and final vmlinux_link)

If the elf size does not match after kallsyms_step 2,
kallsyms_step 3 is invoked.

So, 4 times including the extra check pass.

If CONFIG_DEBUG_INFO_BTF=y, vmlinux_link is invoked
one more time.

So, linked 5 times at most.
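
The counting above can be written down as a small sketch.
count_vmlinux_links and its three y/n arguments are hypothetical names
used only for this illustration; the case without CONFIG_KALLSYMS (a
single final link) is an assumption, since the mail only covers the
CONFIG_KALLSYMS=y case.

```shell
#!/bin/sh
# Illustrative only: count how often vmlinux_link would run for a build.
# count_vmlinux_links is a hypothetical helper, not part of link-vmlinux.sh.
#   $1: y if CONFIG_KALLSYMS is enabled
#   $2: y if CONFIG_DEBUG_INFO_BTF is enabled
#   $3: y if the ELF size changed after kallsyms_step 2 (extra pass needed)
count_vmlinux_links()
{
	kallsyms=$1
	btf=$2
	extra_pass=$3

	links=1					# final vmlinux link
	if [ "$kallsyms" = "y" ]; then
		links=$((links + 2))		# kallsyms_step 1 and 2
		if [ "$extra_pass" = "y" ]; then
			links=$((links + 1))	# kallsyms_step 3
		fi
	fi
	if [ "$btf" = "y" ]; then
		links=$((links + 1))		# extra link for BTF
	fi
	echo "$links"
}
```

For example, `count_vmlinux_links y y y` corresponds to the worst case of
five links described above.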



>
> All this predates LTO stuff which we only introduced later.
>
> The reason for doing modpost on vmlinux.o was that we had cases
> where everything in drivers/ was fine but there were section mismatch
> references from arch/* to drivers/*.
> This is back when there were many more drivers in arch/ than what we
> have today.
> And back then we also had much more to check, as we had CPU hotplug
> that could really cause section mismatches - this is no longer the case,
> which is a good thing.
>
>
>
> ...
> >
> > The following two commits.
> > I did not fully understand the background, though.
> >
> > I CC'ed Sam in case he may add some comments.
> >
> > commit 85bd2fddd68e757da8e1af98f857f61a3c9ce647
> > Author: Sam Ravnborg <sam@ravnborg.org>
> > Date:   Mon Feb 26 15:33:52 2007 +0100
> >
> >     kbuild: fix section mismatch check for vmlinux
> >
> >     vmlinux does not contain relocation entries which is
> >     used by the section mismatch checks.
> >     Reported by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
> >
> >     Use the individual objects as inputs to overcome
> >     this limitation.
> >     In modpost check the .o files and skip non-ELF files.
> >
> >     Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
>
>
> So we checked vmlinux - but vmlinux had too much stripped away,
> so in reality nothing was checked.
> To allow the warnings to be as precise as possible, the checks were moved
> out to the individual .o files.
> Sometimes the names were mangled a little, so if warnings were only
> reported at the vmlinux level it could be difficult to track down the
> offender.
> This would then also do the check on .o files that had all the
> relocation symbols required.
>
> >
> > commit 741f98fe298a73c9d47ed53703c1279a29718581
> > Author: Sam Ravnborg <sam@ravnborg.org>
> > Date:   Tue Jul 17 10:54:06 2007 +0200
> >
> >     kbuild: do section mismatch check on full vmlinux
> >
> >     Previously we did do the check on the .o files used to link
> >     vmlinux but that failed to find questionable references across
> >     the .o files.
> >     Create a dedicated vmlinux.o file used only for section mismatch checks
> >     that uses the defualt linker script so section does not get renamed.
> >
> >     The vmlinux.o may later be used as part of the the final link of vmlinux
> >     but for now it is used fo section mismatch only.
> >     For a defconfig build this is instant but for an allyesconfig this
> >     add two minutes to a full build (that anyways takes ~2 hours).
> >
> >     Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
>
> But when we introduced the check of the individual .o files we missed the
> cases where the references spanned outside the .o files, as explained previously.
> So we included a link of vmlinux.o that did NOT drop the relocations
> so we could use it to check for the remaining section mismatch warnings.
>
> Remember - back when we started this we had many hundred warnings
> and it was a fight to keep that number low.
> But we also wanted to report as much as possible.
>
> Back then there were several discussions about whether this was really
> worth the effort: how much was gained from discarding the memory where the
> section mismatch warnings were triggered?
> In other words - how about just keeping the init code in memory so there
> were no illegal references anymore.
> That is something that is maybe worth considering again, as we have even
> less memory we save by throwing away the init code.
> But I think this is a topic for another mail thread.


I am not sure we want to go as far as dropping __init.
I want to reuse memory after initialization.

Anyway, the section mismatch checks rely heavily on
REL or RELA sections.

The REL(A) sections do not exist in the final vmlinux,
or are at least useless there. So, checking the final vmlinux
does not work for most architectures.

If we use individual .o files, modpost cannot check
function calls to a different object file.

So, the conclusion is we definitely need vmlinux.o for section
mismatch checks.
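
As a rough illustration of the REL(A) point, one could count the relocation
sections that "readelf -S -W" reports for an object. This is only a sketch:
the helper name count_reloc_sections is made up for this example, and it
assumes the usual binutils section-header layout ("[Nr] Name Type ...").

```shell
#!/bin/sh
# Hypothetical helper, not part of kbuild.
# Reads `readelf -S -W <file>` output on stdin and prints how many
# section headers have type REL or RELA.
count_reloc_sections() {
	# Strip the leading "[Nr]" index so the section name is always
	# field 1 and the section type is always field 2, then count
	# the lines whose type is REL or RELA.
	sed -n 's/^ *\[ *[0-9]*\] *//p' |
		awk '$2 == "REL" || $2 == "RELA" { n++ } END { print n + 0 }'
}
```

It would be used as "readelf -S -W vmlinux.o | count_reloc_sections" and
again for vmlinux. I would expect a non-zero count for the relocatable
vmlinux.o, and few or none for a fully linked vmlinux unless the
architecture deliberately keeps relocations.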




--
Best Regards
Masahiro Yamada
Sami Tolvanen June 15, 2020, 9:47 p.m. UTC | #10
On Sat, May 23, 2020 at 8:13 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> Hi Nicholas,
> (+CC: Sam Ravnborg)
>
>
> On Sat, May 23, 2020 at 7:06 PM Nicholas Piggin <npiggin@gmail.com> wrote:
> >
> > Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am:
> > > + Michael, and PPC ML.
> > >
> > > They may know something about the reason of failure.
> >
> > Because the linker can't put branch stubs within object code sections,
> > so when you incrementally link them too large, the linker can't resolve
> > branches into other object files.
>
>
> Ah, you are right.
>
> So, this is a problem not only for PPC
> but also for ARM (both 32 and 64 bit), etc.
>
> ARM needs to insert a veneer to jump far.
>
> Prior to thin archive, we could not compile
> ARCH=arm allyesconfig because
> drivers/built-in.o was too large.
>
> This patch gets us back to the too large
> incremental object situation.
>
> With my quick compile-testing,
> ARCH=arm allyesconfig
> and ARCH=arm64 allyesconfig are broken.

Thanks for looking into this! Clang doesn't appear to have this issue
with LTO because it always enables both -ffunction-sections and
-fdata-sections. I confirmed that -ffunction-sections also fixes arm64
allyesconfig with this patch. While I'm fine with reusing vmlinux.o
only with LTO, how would you feel about enabling -ffunction-sections
in the kernel by default?

Sami
Masahiro Yamada June 16, 2020, 3:22 a.m. UTC | #11
On Tue, Jun 16, 2020 at 6:47 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> On Sat, May 23, 2020 at 8:13 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> >
> > Hi Nicholas,
> > (+CC: Sam Ravnborg)
> >
> >
> > On Sat, May 23, 2020 at 7:06 PM Nicholas Piggin <npiggin@gmail.com> wrote:
> > >
> > > Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am:
> > > > + Michael, and PPC ML.
> > > >
> > > > They may know something about the reason of failure.
> > >
> > > Because the linker can't put branch stubs within object code sections,
> > > so when you incrementally link them too large, the linker can't resolve
> > > branches into other object files.
> >
> >
> > Ah, you are right.
> >
> > So, this is a problem not only for PPC
> > but also for ARM (both 32 and 64 bit), etc.
> >
> > ARM needs to insert a veneer to jump far.
> >
> > Prior to thin archive, we could not compile
> > ARCH=arm allyesconfig because
> > drivers/built-in.o was too large.
> >
> > This patch gets us back to the too large
> > incremental object situation.
> >
> > With my quick compile-testing,
> > ARCH=arm allyesconfig
> > and ARCH=arm64 allyesconfig are broken.
>
> Thanks for looking into this! Clang doesn't appear to have this issue
> with LTO because it always enables both -ffunction-sections and
> -fdata-sections. I confirmed that -ffunction-sections also fixes arm64
> allyesconfig with this patch. While I'm fine with reusing vmlinux.o
> only with LTO, how would you feel about enabling -ffunction-sections
> in the kernel by default?


I am OK if it works.

Please do compile tests for some architectures.
(especially, ARCH=powerpc defconfig, and ARCH=arm(64) allyesconfig)
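
For reference, the compile tests I have in mind could be generated along
these lines. This is a sketch only: the function names, the CROSS_COMPILE
prefixes and the O= directories are assumptions and need adjusting to the
local toolchain. It only prints the make commands; it does not run them.

```shell
#!/bin/sh
# Hypothetical helper: map an ARCH value to a cross-compiler prefix.
# The prefixes are assumptions; use whatever the local toolchain provides.
cross_prefix() {
	case "$1" in
	powerpc) echo "powerpc64-linux-gnu-" ;;
	arm)     echo "arm-linux-gnueabi-" ;;
	arm64)   echo "aarch64-linux-gnu-" ;;
	*)       echo "" ;;
	esac
}

# Print two make commands per arch:config pair, one to generate the
# configuration and one to build vmlinux with -ffunction-sections
# appended to the C flags via KCFLAGS.
print_compile_tests() {
	for pair in powerpc:defconfig arm:allyesconfig arm64:allyesconfig; do
		arch=${pair%%:*}	# text before the colon
		config=${pair#*:}	# text after the colon
		base="make ARCH=$arch CROSS_COMPILE=$(cross_prefix "$arch") O=build/$arch-$config"
		echo "$base $config"
		echo "$base KCFLAGS=-ffunction-sections vmlinux"
	done
}
```

Running "print_compile_tests | sh -e" from the top of the source tree would
then execute the builds one after another and stop at the first failure.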


Thank you.

Patch

diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index d09ab4afbda4..c6cc4305950c 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -77,11 +77,8 @@  vmlinux_link()
 
 	if [ "${SRCARCH}" != "um" ]; then
 		objects="--whole-archive			\
-			${KBUILD_VMLINUX_OBJS}			\
+			vmlinux.o				\
 			--no-whole-archive			\
-			--start-group				\
-			${KBUILD_VMLINUX_LIBS}			\
-			--end-group				\
 			${@}"
 
 		${LD} ${KBUILD_LDFLAGS} ${LDFLAGS_vmlinux}	\