From patchwork Sat Aug 6 20:10:45 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sam Ravnborg
X-Patchwork-Id: 9265987
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 6874B6075A
	for ; Sat, 6 Aug 2016 20:19:48 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 57A7D27E22
	for ; Sat, 6 Aug 2016 20:19:48 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 49CD42840A; Sat, 6 Aug 2016 20:19:48 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI
	autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7A59527E22
	for ; Sat, 6 Aug 2016 20:19:46 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751501AbcHFUTp (ORCPT ); Sat, 6 Aug 2016 16:19:45 -0400
Received: from asavdk4.altibox.net ([109.247.116.15]:33587 "EHLO
	asavdk4.altibox.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750980AbcHFUTo (ORCPT );
	Sat, 6 Aug 2016 16:19:44 -0400
Received: from ravnborg.org (unknown [188.228.89.252])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by asavdk4.altibox.net (Postfix) with ESMTPS id 66C23801EE;
	Sat, 6 Aug 2016 22:10:46 +0200 (CEST)
Date: Sat, 6 Aug 2016 22:10:45 +0200
From: Sam Ravnborg
To: Nicholas Piggin
Cc: linux-kbuild@vger.kernel.org, linux-arch@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, Stephen Rothwell, Arnd Bergmann,
	Nicolas Pitre, Segher Boessenkool, Alan Modra
Subject: Re: [PATCH 1/5] kbuild: allow architectures to use
	thin archives instead of ld -r
Message-ID: <20160806201045.GA25821@ravnborg.org>
References: <1470399123-8455-1-git-send-email-npiggin@gmail.com>
	<1470399123-8455-2-git-send-email-npiggin@gmail.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <1470399123-8455-2-git-send-email-npiggin@gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-CMAE-Score: 0
X-CMAE-Analysis: v=2.2 cv=eqGd9chX c=1 sm=1 tr=0
	a=Ij76tQDYWdb01v2+RnYW5w==:117 a=Ij76tQDYWdb01v2+RnYW5w==:17
	a=kj9zAlcOel0A:10 a=rOUgymgbAAAA:8 a=L_NHb9P_84ZGpTuQR0kA:9
	a=h7x0XkwgiJaIFzxv:21 a=tsDl-KaUqITJIdc-:21 a=Nw73OXTocoMA:10
	a=MP9ZtiD8KjrkvI0BhSjB:22
Sender: linux-kbuild-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kbuild@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Hi Nicholas.

On Fri, Aug 05, 2016 at 10:11:59PM +1000, Nicholas Piggin wrote:
> From: Stephen Rothwell
>
> ld -r is an incremental link used to create built-in.o files in build
> subdirectories. It produces relocatable object files containing all
> its input files, and these are then pulled together and relocated
> in the final link. Aside from the bloat, this constrains the final
> link relocations, which has bitten large powerpc builds with
> unresolvable relocations in the final link.
>
> Alan Modra has recommended the kernel use thin archives for linking.
> This is an alternative and means that the linker has more information
> available to it when it links the kernel.
>
> This patch enables a config option architectures can select,

If we want to do this, then I suggest reversing the logic: enable it
by default, and give architectures that for some reason cannot use it
the possibility to opt out.

> which
> causes all built-in.o files to be built as thin archives. built-in.o
> files in subdirectories do not get symbol table or index attached,
> which improves speed and size.
> The final link pass creates a
> built-in.o archive in the root output directory which includes the
> symbol table and index. The linker then takes this file to link.
>
> The --whole-archive linker option is required, because the linker now
> has visibility to every individual object file, and it will otherwise
> just completely avoid including those without external references
> (consider a file with EXPORT_SYMBOL or initcall or hardware exceptions
> as its only entry points). The traditional build works "by luck" as
> built-in.o files are large enough that they're going to get external
> references. However this optimisation is unpredictable for the kernel
> (due to above external references), ineffective at culling unused, and
> costly because the .o files have to be searched for references.
> Superior alternatives for link-time culling should be used instead.
>
> Build characteristics for inclink vs thinarc, on a small powerpc64le
> pseries VM with a modest .config:
>
>                                     inclink       thinarc
> sizes
> vmlinux                          15 618 680    15 625 028
> sum of all built-in.o            56 091 808     1 054 334
> sum excluding root built-in.o                     151 430
>
> find -name built-in.o | xargs rm ; time make vmlinux
> real                                22.772s       21.143s
> user                                13.280s       13.430s
> sys                                  4.310s        2.750s
>
> - Final kernel pulled in only about 6K more, which shows how
>   ineffective the object file culling is.
> - Build performance looks improved due to less pagecache activity.
>   On IO constrained systems it could be a bigger win.
> - Build size saving is significant.

Good to see this old proposal picked up again!

Did you by any chance evaluate the use of INPUT in linker scripts?
Back then Stephen (again based on a proposal from Alan Modra) also
made an implementation using INPUT.

See below for an updated, simple patch on top of mainline.
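
To make the --whole-archive point in the quoted text concrete, here is
a small Python sketch of how a linker picks members out of an archive.
It is a deliberately simplified model and not the actual ld algorithm;
the member names, symbol names and the link_members() helper are all
made up for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical model: each archive member is (defined symbols, undefined symbols).
Member = Tuple[Set[str], Set[str]]


def link_members(archive: Dict[str, Member], roots: Set[str],
                 whole_archive: bool = False) -> List[str]:
    """Return the members a simplified linker would pull from the archive."""
    if whole_archive:
        # --whole-archive: every member is included unconditionally.
        return sorted(archive)
    included: Set[str] = set()
    defined: Set[str] = set()
    wanted: Set[str] = set(roots)
    changed = True
    while changed:  # rescan until no new member gets pulled in
        changed = False
        for name, (defs, undefs) in archive.items():
            if name in included:
                continue
            # A member is pulled in only if it resolves a pending reference.
            if defs & (wanted - defined):
                included.add(name)
                defined |= defs
                wanted |= undefs
                changed = True
    return sorted(included)


archive = {
    "main.o": ({"start_kernel"}, {"printk"}),
    "printk.o": ({"printk"}, set()),
    # Assumed driver whose only entry point is an initcall: nothing refers to it.
    "driver.o": (set(), {"printk"}),
}

default_link = link_members(archive, {"start_kernel"})
whole_link = link_members(archive, {"start_kernel"}, whole_archive=True)
print(default_link)
print(whole_link)
```

In this model driver.o is silently dropped from the default link
because no other object references a symbol it defines, which is the
failure mode the quoted text describes for files reachable only via
initcalls or EXPORT_SYMBOL.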

Build statistics for "make defconfig" on my i7 box:

find -name built-in.o | xargs rm; time make -j16 vmlinux

        standard     singlelink   delta
real    0m6.368s     0m7.040s     +672ms
user    0m15.577s    0m14.960s    -617ms
sys     0m7.601s     0m6.226s     -1375ms

vmlinux size:

        standard     singlelink   delta
text    10.250.675   10.250.675   0
data    4.369.632    4.374.816    +5184
bss     1.110.016    1.110.016    0

I had expected to see an improvement in build time, but we serialize
the heavy link phase, so the wall-clock time is actually slower.
I did not investigate why the data section got larger, but I think you
already touched on the reasons.

The patch does not change how we link modules.

Please consider whether this approach is better or worse than using
archives.

Note that this patch removes the possibility to run section mismatch
analysis on a per-directory basis.

	Sam

---
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 11602e5..954e7cb 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -360,10 +360,9 @@ $(sort $(subdir-obj-y)): $(subdir-ym) ;
 ifdef builtin-target
 quiet_cmd_link_o_target = LD      $@
 # If the list of objects to link is empty, just create an empty built-in.o
-cmd_link_o_target = $(if $(strip $(obj-y)),\
-		      $(LD) $(ld_flags) -r -o $@ $(filter $(obj-y), $^) \
-		      $(cmd_secanalysis),\
-		      rm -f $@; $(AR) rcs$(KBUILD_ARFLAGS) $@)
+cmd_link_o_target = $(if $(filter $(obj-y), $^), \
+		      echo INPUT\($(filter $(obj-y), $^)\) > $@, \
+		      echo "/* empty */" > $@)
 
 $(builtin-target): $(obj-y) FORCE
 	$(call if_changed,link_o_target)
@@ -414,10 +413,10 @@ $($(subst $(obj)/,,$(@:.o=-y)))       \
 	   $($(subst $(obj)/,,$(@:.o=-m)))), $^)
 
 quiet_cmd_link_multi-y = LD      $@
-cmd_link_multi-y = $(LD) $(ld_flags) -r -o $@ $(link_multi_deps) $(cmd_secanalysis)
+cmd_link_multi-y = echo INPUT\($(link_multi_deps)\) > $@
 
 quiet_cmd_link_multi-m = LD [M]  $@
-cmd_link_multi-m = $(cmd_link_multi-y)
+cmd_link_multi-m = $(LD) $(ld_flags) -r -o $@ $(link_multi_deps)
 
 $(multi-used-y): FORCE
 	$(call if_changed,link_multi-y)
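
For readers unfamiliar with INPUT: with the patch above, each
built-in.o becomes a one-line text file naming its inputs, and the
final link follows those files down to the real objects. The Python
sketch below only illustrates that idea. The linker does the expansion
itself; the builtin_script() and expand() helpers, the "top" entry and
the file names are assumptions made up for this example.

```python
import re
from typing import Dict, List


def builtin_script(objs: List[str]) -> str:
    """Mirror the patched cmd_link_o_target: INPUT(...) or an empty comment."""
    return f"INPUT({' '.join(objs)})" if objs else "/* empty */"


def expand(path: str, files: Dict[str, str]) -> List[str]:
    """Flatten nested INPUT() scripts into the list of real object files."""
    text = files.get(path)
    if text is None:
        # Not a script in this model, so treat it as a real object file.
        return [path]
    match = re.fullmatch(r"INPUT\((.*)\)", text.strip())
    if match is None:
        # The "/* empty */" case: a directory with nothing to link.
        return []
    flat: List[str] = []
    for entry in match.group(1).split():
        flat.extend(expand(entry, files))
    return flat


# Hypothetical tree: "top" stands in for whatever the final link is handed.
files = {
    "top": builtin_script(["init/built-in.o", "drivers/built-in.o"]),
    "init/built-in.o": builtin_script(["init/main.o", "init/version.o"]),
    "drivers/built-in.o": builtin_script([]),
}

flat = expand("top", files)
print(files["init/built-in.o"])
print(flat)
```

Because every object reaches the final link as an explicit input
rather than as an archive member, nothing is culled, and all the real
linking work ends up in that single serialized step.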