kbuild: move -pipe to global KBUILD_CFLAGS

Message ID 20200222003820.220854-1-alex_y_xu@yahoo.ca
State New

Commit Message

Alex Xu (Hello71) Feb. 22, 2020, 12:38 a.m. UTC
-pipe reduces unnecessary disk wear for systems where /tmp is not a
tmpfs, slightly increases compilation speed, and avoids leaving behind
files when gcc crashes.

According to the gcc manual, "this fails to work on some systems where
the assembler is unable to read from a pipe; but the GNU assembler has
no trouble". We already require GNU ld on all platforms, so this is not
an additional dependency. LLVM as also supports pipes.

-pipe has always been used for most architectures; this change
standardizes it globally. Most notably, arm, arm64, riscv, and x86 are
affected.

Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
---
 Makefile               | 2 +-
 arch/alpha/Makefile    | 2 +-
 arch/arc/Makefile      | 2 +-
 arch/arm/Makefile      | 1 -
 arch/csky/Makefile     | 1 -
 arch/ia64/Makefile     | 2 +-
 arch/m68k/Makefile     | 2 +-
 arch/mips/Makefile     | 2 +-
 arch/nios2/Makefile    | 2 +-
 arch/openrisc/Makefile | 2 +-
 arch/parisc/Makefile   | 2 +-
 arch/powerpc/Makefile  | 2 +-
 arch/s390/Makefile     | 2 +-
 arch/sh/Makefile       | 2 +-
 arch/sparc/Makefile    | 4 ++--
 arch/xtensa/Makefile   | 2 +-
 16 files changed, 15 insertions(+), 17 deletions(-)
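
For readers unfamiliar with the flag: the commit message contrasts handing
the compiler's output to the assembler through a temporary file with handing
it over through a pipe. The sketch below is only a shell analogy, not gcc
itself; `tr` and `wc -c` are stand-ins picked for illustration.

```shell
#!/bin/sh
# Analogy only: stage1 plays the compiler proper, `wc -c` plays the assembler.
stage1() { printf 'mov r0, r1\n' | tr 'a-z' 'A-Z'; }

# Without -pipe: stage 1 writes a temporary file that stage 2 reads back.
# If the script died between these steps, the file would be left behind.
tmp=$(mktemp)
stage1 > "$tmp"
via_file=$(wc -c < "$tmp")
rm -f "$tmp"

# With -pipe: both stages run concurrently, connected by a pipe, no file.
via_pipe=$(stage1 | wc -c)

echo "bytes via temp file: $via_file"
echo "bytes via pipe:      $via_pipe"
```

Both paths deliver the same data; the difference is only in where the
intermediate bytes live and whether the stages overlap in time.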

Comments

Masahiro Yamada Feb. 22, 2020, 2:07 a.m. UTC | #1
On Sat, Feb 22, 2020 at 9:40 AM Alex Xu (Hello71) <alex_y_xu@yahoo.ca> wrote:
>
> -pipe reduces unnecessary disk wear for systems where /tmp is not a
> tmpfs, slightly increases compilation speed, and avoids leaving behind
> files when gcc crashes.
>
> According to the gcc manual, "this fails to work on some systems where
> the assembler is unable to read from a pipe; but the GNU assembler has
> no trouble". We already require GNU ld on all platforms, so this is not
> an additional dependency. LLVM as also supports pipes.
>
> -pipe has always been used for most architectures, this change
> standardizes it globally. Most notably, arm, arm64, riscv, and x86 are
> affected.
>
> Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>

<snip>

> diff --git a/arch/arc/Makefile b/arch/arc/Makefile
> index 20e9ab6cc521..b6a2f553771c 100644
> --- a/arch/arc/Makefile
> +++ b/arch/arc/Makefile
> @@ -9,7 +9,7 @@ ifeq ($(CROSS_COMPILE),)
>  CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux-)
>  endif
>
> -cflags-y       += -fno-common -pipe -fno-builtin -mmedium-calls -D__linux__
> +cflags-y       += -fno-common -fno-builtin -mmedium-calls -D__linux__
>  cflags-$(CONFIG_ISA_ARCOMPACT) += -mA7
>  cflags-$(CONFIG_ISA_ARCV2)     += -mcpu=hs38
>
> diff --git a/arch/arm/Makefile b/arch/arm/Makefile
> index db857d07114f..7711467e0797 100644
> --- a/arch/arm/Makefile
> +++ b/arch/arm/Makefile
> @@ -21,7 +21,6 @@ KBUILD_LDS_MODULE     += $(srctree)/arch/arm/kernel/module.lds
>  endif
>
>  GZFLAGS                :=-9
> -#KBUILD_CFLAGS +=-pipe


This was commented out by a very old commit,
which is available in the historical git tree.

https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=ce20ed858a20f6f04de475cae79e40d3697f4776

However, I could not work out the reason from the commit message.
Russell, do you remember why?


If the arch maintainers are fine with this change,
I can pick it up.

Thanks.



--
Best Regards

Masahiro Yamada

Nathan Chancellor Feb. 22, 2020, 2:16 a.m. UTC | #2
Hi Alex,

On Fri, Feb 21, 2020 at 07:38:20PM -0500, Alex Xu (Hello71) wrote:
> -pipe reduces unnecessary disk wear for systems where /tmp is not a
> tmpfs, slightly increases compilation speed, and avoids leaving behind
> files when gcc crashes.
> 
> According to the gcc manual, "this fails to work on some systems where
> the assembler is unable to read from a pipe; but the GNU assembler has
> no trouble". We already require GNU ld on all platforms, so this is not
> an additional dependency. LLVM as also supports pipes.
> 
> -pipe has always been used for most architectures, this change
> standardizes it globally. Most notably, arm, arm64, riscv, and x86 are
> affected.
> 
> Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>

Do you have any numbers to show this is actually beneficial from a
compilation time perspective? I ask because I saw an improvement in
compilation time when removing -pipe from x86's KBUILD_CFLAGS in
commit 437e88ab8f9e ("x86/build: Remove -pipe from KBUILD_CFLAGS").

For what it's worth, clang ignores -pipe so this does not actually
matter for its integrated assembler.

That type of change could have been a fluke but I guarantee people
will care more about any change in compilation time than any of the
other things that you mention so it might be wise to check on major
architectures to make sure that it doesn't hurt.

Cheers,
Nathan

Alex Xu (Hello71) Feb. 22, 2020, 4:01 a.m. UTC | #3
Excerpts from Nathan Chancellor's message of February 21, 2020 9:16 pm:
> Hi Alex,
> 
> On Fri, Feb 21, 2020 at 07:38:20PM -0500, Alex Xu (Hello71) wrote:
>> -pipe reduces unnecessary disk wear for systems where /tmp is not a
>> tmpfs, slightly increases compilation speed, and avoids leaving behind
>> files when gcc crashes.
>> 
>> According to the gcc manual, "this fails to work on some systems where
>> the assembler is unable to read from a pipe; but the GNU assembler has
>> no trouble". We already require GNU ld on all platforms, so this is not
>> an additional dependency. LLVM as also supports pipes.
>> 
>> -pipe has always been used for most architectures, this change
>> standardizes it globally. Most notably, arm, arm64, riscv, and x86 are
>> affected.
>> 
>> Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
> 
> Do you have any numbers to show this is actually beneficial from a
> compilation time perspective? I ask because I saw an improvement in
> compilation time when removing -pipe from x86's KBUILD_CFLAGS in
> commit 437e88ab8f9e ("x86/build: Remove -pipe from KBUILD_CFLAGS").
> 
> For what it's worth, clang ignores -pipe so this does not actually
> matter for its integrated assembler.
> 
> That type of change could have been a fluke but I guarantee people
> will care more about any change in compilation time than any of the
> other things that you mention so it might be wise to check on major
> architectures to make sure that it doesn't hurt.
> 
> Cheers,
> Nathan
> 

Sorry, I should've checked the performance first. I have now run:

cd /tmp/linux # previously: make O=/tmp/linux
export MAKEFLAGS=-j12 # Ryzen 1600, 6 cores, 12 threads
make allnoconfig
for i in {1..10}; do
    make clean >/dev/null
    time make XPIPE=-pipe >/dev/null
    make clean >/dev/null
    time make >/dev/null
done

after patching -pipe to $(XPIPE) in Makefile.
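
The Makefile edit itself is not shown in this message. One way it might have
been done, assuming the patch under discussion is already applied so that the
flags line ends in `-std=gnu89 -pipe`, is sketched here on a scratch file
rather than on the real top-level Makefile.

```shell
#!/bin/sh
# Sketch only: works on a scratch file holding the one relevant line,
# not on the kernel's top-level Makefile.
scratch=$(mktemp)
printf '\t\t   -std=gnu89 -pipe\n' > "$scratch"

# Swap the hard-coded flag for a make variable that is empty by default,
# so "make XPIPE=-pipe" enables the flag and plain "make" leaves it off.
patched=$(sed 's/-std=gnu89 -pipe$/-std=gnu89 $(XPIPE)/' "$scratch")
rm -f "$scratch"

printf '%s\n' "$patched"
```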

Results (without ld warnings):

make > /dev/null  130.54s user 10.41s system 969% cpu 14.537 total
make XPIPE=-pipe > /dev/null  129.83s user 9.95s system 977% cpu 14.296 total
make > /dev/null  129.73s user 10.28s system 966% cpu 14.493 total
make XPIPE=-pipe > /dev/null  130.04s user 10.63s system 986% cpu 14.252 total
make > /dev/null  129.53s user 10.28s system 972% cpu 14.379 total
make XPIPE=-pipe > /dev/null  130.29s user 10.17s system 983% cpu 14.288 total
make > /dev/null  130.19s user 10.52s system 968% cpu 14.530 total
make XPIPE=-pipe > /dev/null  129.90s user 10.47s system 978% cpu 14.343 total
make > /dev/null  129.50s user 10.81s system 959% cpu 14.620 total
make XPIPE=-pipe > /dev/null  130.37s user 10.60s system 975% cpu 14.446 total
make > /dev/null  129.63s user 10.18s system 972% cpu 14.374 total
make XPIPE=-pipe > /dev/null  131.29s user 9.92s system 1016% cpu 13.899 total
make > /dev/null  129.96s user 10.39s system 961% cpu 14.596 total
make XPIPE=-pipe > /dev/null  131.63s user 10.16s system 1011% cpu 14.015 total
make > /dev/null  129.33s user 10.54s system 970% cpu 14.405 total
make XPIPE=-pipe > /dev/null  129.70s user 10.40s system 976% cpu 14.349 total
make > /dev/null  129.53s user 10.25s system 964% cpu 14.494 total
make XPIPE=-pipe > /dev/null  130.38s user 10.62s system 973% cpu 14.479 total
make > /dev/null  130.73s user 10.08s system 957% cpu 14.704 total
make XPIPE=-pipe > /dev/null  130.43s user 10.62s system 985% cpu 14.309 total
make > /dev/null  130.54s user 10.41s system 969% cpu 14.537 total

There is a fair bit of variance, probably due to cpufreq, schedutil, CPU 
temperature, CPU scheduler, motherboard power delivery, etc. But, I 
think it can be clearly seen that -pipe is, on average, about 0.1 to 0.2 
seconds faster.
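
To put a number on that estimate, the wall-clock totals from the listing
above can be averaged. This is a rough sketch: the values are copied by hand,
and the final line of the listing, which is identical to the first, is left
out on the assumption that it is a repeat.

```shell
#!/bin/sh
# Wall-clock totals in seconds, copied from the allnoconfig listing.
plain="14.537 14.493 14.379 14.530 14.620 14.374 14.596 14.405 14.494 14.704"
piped="14.296 14.252 14.288 14.343 14.446 13.899 14.015 14.349 14.479 14.309"

mean() {
    # Print the arithmetic mean of the arguments, to three decimals.
    printf '%s\n' "$@" | awk '{ s += $1; n++ } END { if (n) printf "%.3f\n", s / n }'
}

# The lists are deliberately unquoted so the shell splits them into arguments.
plain_mean=$(mean $plain)
piped_mean=$(mean $piped)

echo "without -pipe: ${plain_mean} s"
echo "with -pipe:    ${piped_mean} s"
```

A plain mean ignores the run-to-run variance mentioned above, so it should be
read as a summary of this one machine, not as a general result.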

I also tried "make defconfig":

make > /dev/null  1238.26s user 102.39s system 1095% cpu 2:02.33 total
make XPIPE=-pipe > /dev/null  1231.33s user 102.52s system 1081% cpu 2:03.29 total
make > /dev/null  1232.92s user 102.07s system 1096% cpu 2:01.71 total
make XPIPE=-pipe > /dev/null  1239.59s user 102.30s system 1096% cpu 2:02.39 total
make > /dev/null  1229.81s user 101.72s system 1093% cpu 2:01.74 total
make XPIPE=-pipe > /dev/null  1234.64s user 101.30s system 1098% cpu 2:01.64 total
make > /dev/null  1228.50s user 104.39s system 1093% cpu 2:01.91 total
make XPIPE=-pipe > /dev/null  1238.78s user 102.57s system 1099% cpu 2:01.99 total
make > /dev/null  1238.26s user 102.39s system 1095% cpu 2:02.33 total

I stopped after this because I needed to use the machine for other 
tasks. The results are less clear, but I think there's not a big 
difference one way or another, at least on my machine.

CPU: Ryzen 1600, overclocked to ~3.8 GHz
RAM: Corsair Vengeance, overclocked to ~3300 MHz, forgot timings
Motherboard: ASRock B450 Pro4

I would speculate that the recent pipe changes have caused a change in 
the relative speed compared to 2018. I am using 5.6.0-rc2 with -O3 
-march=native patches.

Regards,
Alex.

Nathan Chancellor Feb. 22, 2020, 8:01 a.m. UTC | #4
On Fri, Feb 21, 2020 at 11:01:24PM -0500, Alex Xu (Hello71) wrote:
> <snip>

I used hyperfine [1] to run a quick benchmark with a freshly built
GCC 9.2.0 for x86 and aarch64 and here are the results:

$ hyperfine -w 1 -r 25 \
            -p 'rm -rf out.x86_64' \
            'make -j$(nproc) -s CROSS_COMPILE=x86_64-linux- O=out.x86_64 defconfig all' \
            'make -j$(nproc) -s CROSS_COMPILE=x86_64-linux- KCFLAGS=-pipe O=out.x86_64 defconfig all'

Benchmark #1: make -j$(nproc) -s CROSS_COMPILE=x86_64-linux- O=out.x86_64 defconfig all
  Time (mean ± σ):     68.535 s ±  0.275 s    [User: 2241.681 s, System: 185.454 s]
  Range (min … max):   67.855 s … 68.953 s    25 runs

Benchmark #2: make -j$(nproc) -s CROSS_COMPILE=x86_64-linux- KCFLAGS=-pipe O=out.x86_64 defconfig all
  Time (mean ± σ):     68.922 s ±  0.095 s    [User: 2264.168 s, System: 190.297 s]
  Range (min … max):   68.781 s … 69.126 s    25 runs

Summary
  'make -j$(nproc) -s CROSS_COMPILE=x86_64-linux- O=out.x86_64 defconfig all' ran
    1.01 ± 0.00 times faster than 'make -j$(nproc) -s CROSS_COMPILE=x86_64-linux- KCFLAGS=-pipe O=out.x86_64 defconfig all'

$ hyperfine -w 1 -r 25 \
            -p 'rm -rf out.aarch64' \
            'make -j$(nproc) -s ARCH=arm64 CROSS_COMPILE=aarch64-linux- O=out.aarch64 defconfig all' \
            'make -j$(nproc) -s ARCH=arm64 CROSS_COMPILE=aarch64-linux- KCFLAGS=-pipe O=out.aarch64 defconfig all'

Benchmark #1: make -j$(nproc) -s ARCH=arm64 CROSS_COMPILE=aarch64-linux- O=out.aarch64 defconfig all
  Time (mean ± σ):     166.732 s ±  0.594 s    [User: 5654.780 s, System: 475.493 s]
  Range (min … max):   165.873 s … 167.859 s    25 runs

Benchmark #2: make -j$(nproc) -s ARCH=arm64 CROSS_COMPILE=aarch64-linux- KCFLAGS=-pipe O=out.aarch64 defconfig all
  Time (mean ± σ):     168.047 s ±  0.428 s    [User: 5734.031 s, System: 488.392 s]
  Range (min … max):   167.328 s … 168.959 s    25 runs

Summary
  'make -j$(nproc) -s ARCH=arm64 CROSS_COMPILE=aarch64-linux- O=out.aarch64 defconfig all' ran
    1.01 ± 0.00 times faster than 'make -j$(nproc) -s ARCH=arm64 CROSS_COMPILE=aarch64-linux- KCFLAGS=-pipe O=out.aarch64 defconfig all'

In both cases performance appears to regress (only by about 1%, but
still). Maybe it is my machine, although this benchmark was run on a
different machine from the one I used for my commit back in 2018.

I am not sure I would write off these results, since I did the benchmark
25 times on each one back to back, eliminating most of the variance that
you described.
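
As a cross-check of the "1%" figure, the relative change can be computed
from the mean times hyperfine reported above. A minimal sketch, with the
means copied by hand and `slowdown` being a helper name made up for this
example:

```shell
#!/bin/sh
slowdown() {
    # $1 = mean without -pipe, $2 = mean with -pipe; prints the percent change.
    awk -v base="$1" -v piped="$2" \
        'BEGIN { printf "%.2f\n", (piped - base) / base * 100 }'
}

# Mean wall-clock times in seconds from the two hyperfine runs.
x86_pct=$(slowdown 68.535 68.922)
arm64_pct=$(slowdown 166.732 168.047)

echo "x86_64:  +${x86_pct}%"
echo "aarch64: +${arm64_pct}%"
```

On these means the change is below 1% for both architectures, which is
consistent with hyperfine rounding its summary to 1.01.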

[1]: https://github.com/sharkdp/hyperfine

Cheers,
Nathan

Russell King - ARM Linux admin Feb. 22, 2020, 9:01 a.m. UTC | #5
On Sat, Feb 22, 2020 at 11:07:14AM +0900, Masahiro Yamada wrote:
> On Sat, Feb 22, 2020 at 9:40 AM Alex Xu (Hello71) <alex_y_xu@yahoo.ca> wrote:
> >
> > -pipe reduces unnecessary disk wear for systems where /tmp is not a
> > tmpfs, slightly increases compilation speed, and avoids leaving behind
> > files when gcc crashes.
> >
> > According to the gcc manual, "this fails to work on some systems where
> > the assembler is unable to read from a pipe; but the GNU assembler has
> > no trouble". We already require GNU ld on all platforms, so this is not
> > an additional dependency. LLVM as also supports pipes.
> >
> > -pipe has always been used for most architectures, this change
> > standardizes it globally. Most notably, arm, arm64, riscv, and x86 are
> > affected.
> >
> > Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
> 
> <snip>
> 
> > diff --git a/arch/arc/Makefile b/arch/arc/Makefile
> > index 20e9ab6cc521..b6a2f553771c 100644
> > --- a/arch/arc/Makefile
> > +++ b/arch/arc/Makefile
> > @@ -9,7 +9,7 @@ ifeq ($(CROSS_COMPILE),)
> >  CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux-)
> >  endif
> >
> > -cflags-y       += -fno-common -pipe -fno-builtin -mmedium-calls -D__linux__
> > +cflags-y       += -fno-common -fno-builtin -mmedium-calls -D__linux__
> >  cflags-$(CONFIG_ISA_ARCOMPACT) += -mA7
> >  cflags-$(CONFIG_ISA_ARCV2)     += -mcpu=hs38
> >
> > diff --git a/arch/arm/Makefile b/arch/arm/Makefile
> > index db857d07114f..7711467e0797 100644
> > --- a/arch/arm/Makefile
> > +++ b/arch/arm/Makefile
> > @@ -21,7 +21,6 @@ KBUILD_LDS_MODULE     += $(srctree)/arch/arm/kernel/module.lds
> >  endif
> >
> >  GZFLAGS                :=-9
> > -#KBUILD_CFLAGS +=-pipe
> 
> 
> This was commented out by a very old commit,
> which is available in the historical git tree.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=ce20ed858a20f6f04de475cae79e40d3697f4776
> 
> But, I could not parse the reason from the commit message.
> Russell, do you remember why?

-pipe may reduce the disk load but increases the CPU load, so it's an
option that's up to the build environment.  One may wish to pass a
lower parallelism to make when using -pipe to mitigate that, but both
options are up to the build environment to decide upon.

If we unconditionally add -pipe, then we take away choice.
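
One way to leave that choice with the build environment, without touching
any Makefile, is the KCFLAGS variable that the hyperfine commands earlier in
this thread already use. The wrapper below is a hypothetical sketch: USE_PIPE
and JOBS are made-up names, not kbuild variables, and the script only prints
the command it would run.

```shell
#!/bin/sh
# Hypothetical knobs for illustration; neither is a kbuild variable.
USE_PIPE=${USE_PIPE:-0}
JOBS=${JOBS:-4}

build_cmd() {
    # $1 = 1 to request -pipe, anything else to leave it out; $2 = job count.
    if [ "$1" = 1 ]; then
        echo "make -j$2 KCFLAGS=-pipe"
    else
        echo "make -j$2"
    fi
}

# Print the command instead of running it, so this is safe outside a tree.
build_cmd "$USE_PIPE" "$JOBS"
```

A lower JOBS value together with USE_PIPE=1 corresponds to the reduced
parallelism suggested above.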
Alex Xu (Hello71) Feb. 22, 2020, 2:24 p.m. UTC | #6
Excerpts from Nathan Chancellor's message of February 22, 2020 3:01 am:
> I used hyperfine [1] to run a quick benchmark with a freshly built
> GCC 9.2.0 for x86 and aarch64 and here are the results:
> 
> In both cases it seems like performance regresses (by 1% but still) but
> maybe it is my machine, even though this benchmark was done on a
> different machine than the one from my commit back in 2018.
> 
> I am not sure I would write off these results, since I did the benchmark
> 25 times on each one back to back, eliminating most of the variance that
> you described.
> 
> [1]: https://github.com/sharkdp/hyperfine
> 
> Cheers,
> Nathan
> 

What kernel version are you running? Do you have the 5.6 pipe reworks?

Nathan Chancellor Feb. 22, 2020, 6:12 p.m. UTC | #7
On Sat, Feb 22, 2020 at 09:24:14AM -0500, Alex Xu (Hello71) wrote:
> Excerpts from Nathan Chancellor's message of February 22, 2020 3:01 am:
> > I used hyperfine [1] to run a quick benchmark with a freshly built
> > GCC 9.2.0 for x86 and aarch64 and here are the results:
> > 
> > In both cases it seems like performance regresses (by 1% but still) but
> > maybe it is my machine, even though this benchmark was done on a
> > different machine than the one from my commit back in 2018.
> > 
> > I am not sure I would write off these results, since I did the benchmark
> > 25 times on each one back to back, eliminating most of the variance that
> > you described.
> > 
> > [1]: https://github.com/sharkdp/hyperfine
> > 
> > Cheers,
> > Nathan
> > 
> 
> What kernel version are you running? Do you have the 5.6 pipe reworks?

No, it is running a stock Ubuntu 18.04 kernel, 4.15.0.

$ uname -a
Linux c2-medium-x86 4.15.0-50-generic #54-Ubuntu SMP Mon May 6 18:46:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
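
For anyone repeating the comparison, a quick way to tell whether the build
host is at least on 5.6 is to compare the release string. This is a sketch
with a made-up helper name, and it is only an approximation: it assumes the
pipe changes in question arrived in 5.6, as stated earlier in the thread,
and it cannot see distribution backports.

```shell
#!/bin/sh
has_pipe_rework() {
    # $1 is a release string such as the output of `uname -r`.
    major=${1%%.*}              # text before the first dot
    rest=${1#*.}                # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 6 ]; }
}

for rel in 4.15.0-50-generic 5.6.0-rc2; do
    if has_pipe_rework "$rel"; then
        echo "$rel: 5.6 or newer"
    else
        echo "$rel: older than 5.6"
    fi
done
```

Passing "$(uname -r)" instead of the two sample strings checks the running
kernel.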

If you are curious about the specs:

$ neofetch --stdout
nathan@c2-medium-x86 
-------------------- 
OS: Ubuntu 18.04.3 LTS x86_64 
Host: PowerEdge R6415 
Kernel: 4.15.0-50-generic 
Uptime: 126 days, 12 hours, 39 mins 
Packages: 686 
Shell: zsh 5.4.2 
Terminal: /dev/pts/0 
CPU: AMD EPYC 7401P 24- (48) @ 2.794GHz 
Memory: 2974MiB / 64018MiB 

Cheers,
Nathan

Patch

diff --git a/Makefile b/Makefile
index aab38cb02b24..782c12267151 100644
--- a/Makefile
+++ b/Makefile
@@ -459,7 +459,7 @@  KBUILD_CFLAGS   := -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs \
 		   -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE \
 		   -Werror=implicit-function-declaration -Werror=implicit-int \
 		   -Wno-format-security \
-		   -std=gnu89
+		   -std=gnu89 -pipe
 KBUILD_CPPFLAGS := -D__KERNEL__
 KBUILD_AFLAGS_KERNEL :=
 KBUILD_CFLAGS_KERNEL :=
diff --git a/arch/alpha/Makefile b/arch/alpha/Makefile
index 12dee59b011c..b40a9be72d9b 100644
--- a/arch/alpha/Makefile
+++ b/arch/alpha/Makefile
@@ -12,7 +12,7 @@  NM := $(NM) -B
 
 LDFLAGS_vmlinux	:= -static -N #-relax
 CHECKFLAGS	+= -D__alpha__
-cflags-y	:= -pipe -mno-fp-regs -ffixed-8
+cflags-y	:= -mno-fp-regs -ffixed-8
 cflags-y	+= $(call cc-option, -fno-jump-tables)
 
 cpuflags-$(CONFIG_ALPHA_EV4)		:= -mcpu=ev4
diff --git a/arch/arc/Makefile b/arch/arc/Makefile
index 20e9ab6cc521..b6a2f553771c 100644
--- a/arch/arc/Makefile
+++ b/arch/arc/Makefile
@@ -9,7 +9,7 @@  ifeq ($(CROSS_COMPILE),)
 CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux-)
 endif
 
-cflags-y	+= -fno-common -pipe -fno-builtin -mmedium-calls -D__linux__
+cflags-y	+= -fno-common -fno-builtin -mmedium-calls -D__linux__
 cflags-$(CONFIG_ISA_ARCOMPACT)	+= -mA7
 cflags-$(CONFIG_ISA_ARCV2)	+= -mcpu=hs38
 
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index db857d07114f..7711467e0797 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -21,7 +21,6 @@  KBUILD_LDS_MODULE	+= $(srctree)/arch/arm/kernel/module.lds
 endif
 
 GZFLAGS		:=-9
-#KBUILD_CFLAGS	+=-pipe
 
 # Never generate .eh_frame
 KBUILD_CFLAGS	+= $(call cc-option,-fno-dwarf2-cfi-asm)
diff --git a/arch/csky/Makefile b/arch/csky/Makefile
index fb1bbbd91954..3ba8d936122c 100644
--- a/arch/csky/Makefile
+++ b/arch/csky/Makefile
@@ -42,7 +42,6 @@  KBUILD_CFLAGS += -msoft-float -mdiv
 KBUILD_CFLAGS += -fno-tree-vectorize
 endif
 
-KBUILD_CFLAGS += -pipe
 ifeq ($(CSKYABI),abiv2)
 KBUILD_CFLAGS += -mno-stack-size
 endif
diff --git a/arch/ia64/Makefile b/arch/ia64/Makefile
index 32240000dc0c..554dc20873d8 100644
--- a/arch/ia64/Makefile
+++ b/arch/ia64/Makefile
@@ -24,7 +24,7 @@  KBUILD_LDS_MODULE += $(srctree)/arch/ia64/module.lds
 KBUILD_AFLAGS_KERNEL := -mconstant-gp
 EXTRA		:=
 
-cflags-y	:= -pipe $(EXTRA) -ffixed-r13 -mfixed-range=f12-f15,f32-f127 \
+cflags-y	:= $(EXTRA) -ffixed-r13 -mfixed-range=f12-f15,f32-f127 \
 		   -falign-functions=32 -frename-registers -fno-optimize-sibling-calls
 KBUILD_CFLAGS_KERNEL := -mconstant-gp
 
diff --git a/arch/m68k/Makefile b/arch/m68k/Makefile
index 5d9288384096..99a226bbd06c 100644
--- a/arch/m68k/Makefile
+++ b/arch/m68k/Makefile
@@ -60,7 +60,7 @@  cpuflags-$(CONFIG_M5206)	:= $(call cc-option,-mcpu=5206,-m5200)
 KBUILD_AFLAGS += $(cpuflags-y)
 KBUILD_CFLAGS += $(cpuflags-y)
 
-KBUILD_CFLAGS += -pipe -ffreestanding
+KBUILD_CFLAGS += -ffreestanding
 
 ifdef CONFIG_MMU
 # without -fno-strength-reduce the 53c7xx.c driver fails ;-(
diff --git a/arch/mips/Makefile b/arch/mips/Makefile
index e1c44aed8156..05eb77328a13 100644
--- a/arch/mips/Makefile
+++ b/arch/mips/Makefile
@@ -95,7 +95,7 @@  all-$(CONFIG_SYS_SUPPORTS_ZBOOT)+= vmlinuz
 # machines may also.  Since BFD is incredibly buggy with respect to
 # crossformat linking we rely on the elf2ecoff tool for format conversion.
 #
-cflags-y			+= -G 0 -mno-abicalls -fno-pic -pipe
+cflags-y			+= -G 0 -mno-abicalls -fno-pic
 cflags-y			+= -msoft-float
 LDFLAGS_vmlinux			+= -G 0 -static -n -nostdlib
 KBUILD_AFLAGS_MODULE		+= -mlong-calls
diff --git a/arch/nios2/Makefile b/arch/nios2/Makefile
index 52c03e60b114..3205cb5fd143 100644
--- a/arch/nios2/Makefile
+++ b/arch/nios2/Makefile
@@ -24,7 +24,7 @@  LIBGCC         := $(shell $(CC) $(KBUILD_CFLAGS) $(KCFLAGS) -print-libgcc-file-n
 
 KBUILD_AFLAGS += -march=r$(CONFIG_NIOS2_ARCH_REVISION)
 
-KBUILD_CFLAGS += -pipe -D__linux__ -D__ELF__
+KBUILD_CFLAGS += -D__linux__ -D__ELF__
 KBUILD_CFLAGS += -march=r$(CONFIG_NIOS2_ARCH_REVISION)
 KBUILD_CFLAGS += $(if $(CONFIG_NIOS2_HW_MUL_SUPPORT),-mhw-mul,-mno-hw-mul)
 KBUILD_CFLAGS += $(if $(CONFIG_NIOS2_HW_MULX_SUPPORT),-mhw-mulx,-mno-hw-mulx)
diff --git a/arch/openrisc/Makefile b/arch/openrisc/Makefile
index bf10141c7426..86075fc673d9 100644
--- a/arch/openrisc/Makefile
+++ b/arch/openrisc/Makefile
@@ -22,7 +22,7 @@  KBUILD_DEFCONFIG := or1ksim_defconfig
 OBJCOPYFLAGS    := -O binary -R .note -R .comment -S
 LIBGCC 		:= $(shell $(CC) $(KBUILD_CFLAGS) -print-libgcc-file-name)
 
-KBUILD_CFLAGS	+= -pipe -ffixed-r10 -D__linux__
+KBUILD_CFLAGS	+= -ffixed-r10 -D__linux__
 
 ifeq ($(CONFIG_OPENRISC_HAVE_INST_MUL),y)
 	KBUILD_CFLAGS += $(call cc-option,-mhard-mul)
diff --git a/arch/parisc/Makefile b/arch/parisc/Makefile
index dca8f2de8cf5..88bee828400d 100644
--- a/arch/parisc/Makefile
+++ b/arch/parisc/Makefile
@@ -64,7 +64,7 @@  endif
 
 OBJCOPY_FLAGS =-O binary -R .note -R .comment -S
 
-cflags-y	:= -pipe
+cflags-y	:=
 
 # These flags should be implied by an hppa-linux configuration, but they
 # are not in gcc 3.2.
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index f35730548e42..0550b976157c 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -215,7 +215,7 @@  asinstr := $(call as-instr,lis 9$(comma)foo@high,-DHAVE_AS_ATHIGH=1)
 KBUILD_CPPFLAGS	+= -I $(srctree)/arch/$(ARCH) $(asinstr)
 KBUILD_AFLAGS	+= $(AFLAGS-y)
 KBUILD_CFLAGS	+= $(call cc-option,-msoft-float)
-KBUILD_CFLAGS	+= -pipe $(CFLAGS-y)
+KBUILD_CFLAGS	+= $(CFLAGS-y)
 CPP		= $(CC) -E $(KBUILD_CFLAGS)
 
 CHECKFLAGS	+= -m$(BITS) -D__powerpc__ -D__powerpc$(BITS)__
diff --git a/arch/s390/Makefile b/arch/s390/Makefile
index e0e3a465bbfd..3ca3e3a29864 100644
--- a/arch/s390/Makefile
+++ b/arch/s390/Makefile
@@ -118,7 +118,7 @@  endif
 cfi := $(call as-instr,.cfi_startproc\n.cfi_val_offset 15$(comma)-160\n.cfi_endproc,-DCONFIG_AS_CFI_VAL_OFFSET=1)
 
 KBUILD_CFLAGS	+= -mbackchain -msoft-float $(cflags-y)
-KBUILD_CFLAGS	+= -pipe -Wno-sign-compare
+KBUILD_CFLAGS	+= -Wno-sign-compare
 KBUILD_CFLAGS	+= -fno-asynchronous-unwind-tables $(cfi)
 KBUILD_AFLAGS	+= $(aflags-y) $(cfi)
 export KBUILD_AFLAGS_DECOMPRESSOR
diff --git a/arch/sh/Makefile b/arch/sh/Makefile
index b4a86f27e048..2e224b2a436b 100644
--- a/arch/sh/Makefile
+++ b/arch/sh/Makefile
@@ -194,7 +194,7 @@  drivers-$(CONFIG_OPROFILE)	+= arch/sh/oprofile/
 cflags-y	+= $(foreach d, $(cpuincdir-y), -I $(srctree)/arch/sh/include/$(d)) \
 		   $(foreach d, $(machdir-y), -I $(srctree)/arch/sh/include/$(d))
 
-KBUILD_CFLAGS		+= -pipe $(cflags-y)
+KBUILD_CFLAGS		+= $(cflags-y)
 KBUILD_CPPFLAGS		+= $(cflags-y)
 KBUILD_AFLAGS		+= $(cflags-y)
 
diff --git a/arch/sparc/Makefile b/arch/sparc/Makefile
index 4a0919581697..ad30e7e668e0 100644
--- a/arch/sparc/Makefile
+++ b/arch/sparc/Makefile
@@ -29,7 +29,7 @@  UTS_MACHINE    := sparc
 # versions of gcc.  Some gcc versions won't pass -Av8 to binutils when you
 # give -mcpu=v8.  This silently worked with older bintutils versions but
 # does not any more.
-KBUILD_CFLAGS  += -m32 -mcpu=v8 -pipe -mno-fpu -fcall-used-g5 -fcall-used-g7
+KBUILD_CFLAGS  += -m32 -mcpu=v8 -mno-fpu -fcall-used-g5 -fcall-used-g7
 KBUILD_CFLAGS  += -Wa,-Av8
 
 KBUILD_AFLAGS  += -m32 -Wa,-Av8
@@ -44,7 +44,7 @@  KBUILD_LDFLAGS := -m elf64_sparc
 export BITS   := 64
 UTS_MACHINE   := sparc64
 
-KBUILD_CFLAGS += -m64 -pipe -mno-fpu -mcpu=ultrasparc -mcmodel=medlow
+KBUILD_CFLAGS += -m64 -mno-fpu -mcpu=ultrasparc -mcmodel=medlow
 KBUILD_CFLAGS += -ffixed-g4 -ffixed-g5 -fcall-used-g7 -Wno-sign-compare
 KBUILD_CFLAGS += -Wa,--undeclared-regs
 KBUILD_CFLAGS += $(call cc-option,-mtune=ultrasparc3)
diff --git a/arch/xtensa/Makefile b/arch/xtensa/Makefile
index 67a7d151d1e7..fdaa588c504c 100644
--- a/arch/xtensa/Makefile
+++ b/arch/xtensa/Makefile
@@ -42,7 +42,7 @@  export PLATFORM
 
 # temporarily until string.h is fixed
 KBUILD_CFLAGS += -ffreestanding -D__linux__
-KBUILD_CFLAGS += -pipe -mlongcalls -mtext-section-literals
+KBUILD_CFLAGS += -mlongcalls -mtext-section-literals
 KBUILD_CFLAGS += $(call cc-option,-mforce-no-pic,)
 KBUILD_CFLAGS += $(call cc-option,-mno-serialize-volatile,)