Message ID | E1gK1ae-0000or-7R@rmk-PC.armlinux.org.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Enable deadcode elimination at link time | expand |
On 6 November 2018 at 14:40, Russell King <rmk+kernel@armlinux.org.uk> wrote: > When building with -fdata-sections, the BSS data is placed into separate > sections, so we need to ask the linker to group these together into the > .bss output section, so that they're correctly placed. > > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> > --- > arch/arm/boot/compressed/vmlinux.lds.S | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S > index 2b963d8e76dd..3b91bc3c606f 100644 > --- a/arch/arm/boot/compressed/vmlinux.lds.S > +++ b/arch/arm/boot/compressed/vmlinux.lds.S > @@ -118,7 +118,7 @@ SECTIONS > > . = BSS_START; > __bss_start = .; > - .bss : { *(.bss) } > + .bss : { *(.bss) *(.bss.*) } Would it make sense to sort these by alignment? Otherwise, I suspect you may get a lot of padding holes due to the number of different input sections, each with its own alignment. > _end = .; > > . = ALIGN(8); /* the stack must be 64-bit aligned */ > -- > 2.7.4 > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Tue, Nov 06, 2018 at 03:08:06PM +0100, Ard Biesheuvel wrote: > On 6 November 2018 at 14:40, Russell King <rmk+kernel@armlinux.org.uk> wrote: > > When building with -fdata-sections, the BSS data is placed into separate > > sections, so we need to ask the linker to group these together into the > > .bss output section, so that they're correctly placed. > > > > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> > > --- > > arch/arm/boot/compressed/vmlinux.lds.S | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S > > index 2b963d8e76dd..3b91bc3c606f 100644 > > --- a/arch/arm/boot/compressed/vmlinux.lds.S > > +++ b/arch/arm/boot/compressed/vmlinux.lds.S > > @@ -118,7 +118,7 @@ SECTIONS > > > > . = BSS_START; > > __bss_start = .; > > - .bss : { *(.bss) } > > + .bss : { *(.bss) *(.bss.*) } > > Would it make sense to sort these by alignment? Otherwise, I suspect > you may get a lot of padding holes due to the number of different > input sections, each with its own alignment. We don't bother elsewhere in the kernel linker script - do you have a case where we get lots of padding?
On 6 November 2018 at 15:10, Russell King - ARM Linux <linux@armlinux.org.uk> wrote: > On Tue, Nov 06, 2018 at 03:08:06PM +0100, Ard Biesheuvel wrote: >> On 6 November 2018 at 14:40, Russell King <rmk+kernel@armlinux.org.uk> wrote: >> > When building with -fdata-sections, the BSS data is placed into separate >> > sections, so we need to ask the linker to group these together into the >> > .bss output section, so that they're correctly placed. >> > >> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> >> > --- >> > arch/arm/boot/compressed/vmlinux.lds.S | 2 +- >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> > >> > diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S >> > index 2b963d8e76dd..3b91bc3c606f 100644 >> > --- a/arch/arm/boot/compressed/vmlinux.lds.S >> > +++ b/arch/arm/boot/compressed/vmlinux.lds.S >> > @@ -118,7 +118,7 @@ SECTIONS >> > >> > . = BSS_START; >> > __bss_start = .; >> > - .bss : { *(.bss) } >> > + .bss : { *(.bss) *(.bss.*) } >> >> Would it make sense to sort these by alignment? Otherwise, I suspect >> you may get a lot of padding holes due to the number of different >> input sections, each with its own alignment. > > We don't bother elsewhere in the kernel linker script - do you have > a case where we get lots of padding? > With -fdata-sections? How else are we ensuring that each .bss item in its own section is not placed such that its alignment results in a padding hole? WIthout -fdata-sections, this is done per object, and GCC takes care of this itself.
On Tue, Nov 06, 2018 at 03:13:07PM +0100, Ard Biesheuvel wrote: > On 6 November 2018 at 15:10, Russell King - ARM Linux > <linux@armlinux.org.uk> wrote: > > On Tue, Nov 06, 2018 at 03:08:06PM +0100, Ard Biesheuvel wrote: > >> On 6 November 2018 at 14:40, Russell King <rmk+kernel@armlinux.org.uk> wrote: > >> > When building with -fdata-sections, the BSS data is placed into separate > >> > sections, so we need to ask the linker to group these together into the > >> > .bss output section, so that they're correctly placed. > >> > > >> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> > >> > --- > >> > arch/arm/boot/compressed/vmlinux.lds.S | 2 +- > >> > 1 file changed, 1 insertion(+), 1 deletion(-) > >> > > >> > diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S > >> > index 2b963d8e76dd..3b91bc3c606f 100644 > >> > --- a/arch/arm/boot/compressed/vmlinux.lds.S > >> > +++ b/arch/arm/boot/compressed/vmlinux.lds.S > >> > @@ -118,7 +118,7 @@ SECTIONS > >> > > >> > . = BSS_START; > >> > __bss_start = .; > >> > - .bss : { *(.bss) } > >> > + .bss : { *(.bss) *(.bss.*) } > >> > >> Would it make sense to sort these by alignment? Otherwise, I suspect > >> you may get a lot of padding holes due to the number of different > >> input sections, each with its own alignment. > > > > We don't bother elsewhere in the kernel linker script - do you have > > a case where we get lots of padding? > > > > With -fdata-sections? How else are we ensuring that each .bss item in > its own section is not placed such that its alignment results in a > padding hole? Quite simply, we don't care (at the moment). The alignment does not come from the size of each individual section, but the alignment requirements of data within the section. See include/asm-generic/vmlinux.lds.h and the definition of *_MAIN symbols from around line 60.
On 6 November 2018 at 15:29, Russell King - ARM Linux <linux@armlinux.org.uk> wrote: > On Tue, Nov 06, 2018 at 03:13:07PM +0100, Ard Biesheuvel wrote: >> On 6 November 2018 at 15:10, Russell King - ARM Linux >> <linux@armlinux.org.uk> wrote: >> > On Tue, Nov 06, 2018 at 03:08:06PM +0100, Ard Biesheuvel wrote: >> >> On 6 November 2018 at 14:40, Russell King <rmk+kernel@armlinux.org.uk> wrote: >> >> > When building with -fdata-sections, the BSS data is placed into separate >> >> > sections, so we need to ask the linker to group these together into the >> >> > .bss output section, so that they're correctly placed. >> >> > >> >> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> >> >> > --- >> >> > arch/arm/boot/compressed/vmlinux.lds.S | 2 +- >> >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> >> > >> >> > diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S >> >> > index 2b963d8e76dd..3b91bc3c606f 100644 >> >> > --- a/arch/arm/boot/compressed/vmlinux.lds.S >> >> > +++ b/arch/arm/boot/compressed/vmlinux.lds.S >> >> > @@ -118,7 +118,7 @@ SECTIONS >> >> > >> >> > . = BSS_START; >> >> > __bss_start = .; >> >> > - .bss : { *(.bss) } >> >> > + .bss : { *(.bss) *(.bss.*) } >> >> >> >> Would it make sense to sort these by alignment? Otherwise, I suspect >> >> you may get a lot of padding holes due to the number of different >> >> input sections, each with its own alignment. >> > >> > We don't bother elsewhere in the kernel linker script - do you have >> > a case where we get lots of padding? >> > >> >> With -fdata-sections? How else are we ensuring that each .bss item in >> its own section is not placed such that its alignment results in a >> padding hole? > > Quite simply, we don't care (at the moment). The alignment does not > come from the size of each individual section, but the alignment > requirements of data within the section. > > See include/asm-generic/vmlinux.lds.h and the definition of *_MAIN > symbols from around line 60. > This seems like an oversight to me: consider something like* unsigned char bar = 1; unsigned long long foo = 4; build it with arm-linux-gnueabihf-gcc -fdata-sections -o /tmp/foo.o -c /tmp/foo.c and we end up with the following object file Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [ 0] NULL 00000000 000000 000000 00 0 0 0 [ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 1 [ 2] .data PROGBITS 00000000 000034 000000 00 WA 0 0 1 [ 3] .bss NOBITS 00000000 000034 000000 00 WA 0 0 1 [ 4] .data.bar PROGBITS 00000000 000034 000001 00 WA 0 0 1 [ 5] .data.foo PROGBITS 00000000 000038 000008 00 WA 0 0 8 ... Symbol table '.symtab' contains 13 entries: Num: Value Size Type Bind Vis Ndx Name 11: 00000000 8 OBJECT GLOBAL DEFAULT 5 foo 12: 00000000 1 OBJECT GLOBAL DEFAULT 4 bar whereas if I remove the -fdata-sections argument, I get Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [ 0] NULL 00000000 000000 000000 00 0 0 0 [ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 1 [ 2] .data PROGBITS 00000000 000038 000009 00 WA 0 0 8 [ 3] .bss NOBITS 00000000 000041 000000 00 WA 0 0 1 ... and Symbol table '.symtab' contains 11 entries: Num: Value Size Type Bind Vis Ndx Name ... 9: 00000000 8 OBJECT GLOBAL DEFAULT 2 foo 10: 00000008 1 OBJECT GLOBAL DEFAULT 2 bar In other words, we end up with a 7 byte padding hole by switching to -fdata-sections, unless we sort the input objects by alignment. * for my test compiile, -fdata-sections would not produce different input section so I am using .data for this example.
On Tue, Nov 06, 2018 at 03:39:58PM +0100, Ard Biesheuvel wrote: > On 6 November 2018 at 15:29, Russell King - ARM Linux > <linux@armlinux.org.uk> wrote: > > On Tue, Nov 06, 2018 at 03:13:07PM +0100, Ard Biesheuvel wrote: > >> On 6 November 2018 at 15:10, Russell King - ARM Linux > >> <linux@armlinux.org.uk> wrote: > >> > On Tue, Nov 06, 2018 at 03:08:06PM +0100, Ard Biesheuvel wrote: > >> >> On 6 November 2018 at 14:40, Russell King <rmk+kernel@armlinux.org.uk> wrote: > >> >> > When building with -fdata-sections, the BSS data is placed into separate > >> >> > sections, so we need to ask the linker to group these together into the > >> >> > .bss output section, so that they're correctly placed. > >> >> > > >> >> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> > >> >> > --- > >> >> > arch/arm/boot/compressed/vmlinux.lds.S | 2 +- > >> >> > 1 file changed, 1 insertion(+), 1 deletion(-) > >> >> > > >> >> > diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S > >> >> > index 2b963d8e76dd..3b91bc3c606f 100644 > >> >> > --- a/arch/arm/boot/compressed/vmlinux.lds.S > >> >> > +++ b/arch/arm/boot/compressed/vmlinux.lds.S > >> >> > @@ -118,7 +118,7 @@ SECTIONS > >> >> > > >> >> > . = BSS_START; > >> >> > __bss_start = .; > >> >> > - .bss : { *(.bss) } > >> >> > + .bss : { *(.bss) *(.bss.*) } > >> >> > >> >> Would it make sense to sort these by alignment? Otherwise, I suspect > >> >> you may get a lot of padding holes due to the number of different > >> >> input sections, each with its own alignment. > >> > > >> > We don't bother elsewhere in the kernel linker script - do you have > >> > a case where we get lots of padding? > >> > > >> > >> With -fdata-sections? How else are we ensuring that each .bss item in > >> its own section is not placed such that its alignment results in a > >> padding hole? > > > > Quite simply, we don't care (at the moment). The alignment does not > > come from the size of each individual section, but the alignment > > requirements of data within the section. > > > > See include/asm-generic/vmlinux.lds.h and the definition of *_MAIN > > symbols from around line 60. > > > > This seems like an oversight to me: consider something like* > > unsigned char bar = 1; > unsigned long long foo = 4; > > build it with > > arm-linux-gnueabihf-gcc -fdata-sections -o /tmp/foo.o -c /tmp/foo.c > > and we end up with the following object file > > Section Headers: > [Nr] Name Type Addr Off Size ES Flg Lk Inf Al > [ 0] NULL 00000000 000000 000000 00 0 0 0 > [ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 1 > [ 2] .data PROGBITS 00000000 000034 000000 00 WA 0 0 1 > [ 3] .bss NOBITS 00000000 000034 000000 00 WA 0 0 1 > [ 4] .data.bar PROGBITS 00000000 000034 000001 00 WA 0 0 1 > [ 5] .data.foo PROGBITS 00000000 000038 000008 00 WA 0 0 8 > ... > > Symbol table '.symtab' contains 13 entries: > Num: Value Size Type Bind Vis Ndx Name > 11: 00000000 8 OBJECT GLOBAL DEFAULT 5 foo > 12: 00000000 1 OBJECT GLOBAL DEFAULT 4 bar > > whereas if I remove the -fdata-sections argument, I get > > Section Headers: > [Nr] Name Type Addr Off Size ES Flg Lk Inf Al > [ 0] NULL 00000000 000000 000000 00 0 0 0 > [ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 1 > [ 2] .data PROGBITS 00000000 000038 000009 00 WA 0 0 8 > [ 3] .bss NOBITS 00000000 000041 000000 00 WA 0 0 1 > ... > > and > > Symbol table '.symtab' contains 11 entries: > Num: Value Size Type Bind Vis Ndx Name > ... > 9: 00000000 8 OBJECT GLOBAL DEFAULT 2 foo > 10: 00000008 1 OBJECT GLOBAL DEFAULT 2 bar > > In other words, we end up with a 7 byte padding hole by switching to > -fdata-sections, unless we sort the input objects by alignment. > > * for my test compiile, -fdata-sections would not produce different > input section so I am using .data for this example. Nevertheless, changing this is not a subject for a patch series that targets ARM. What you've identified is a deficiency in the generic cross-arch support for this which needs a wider audience to be involved.
On 6 November 2018 at 15:45, Russell King - ARM Linux <linux@armlinux.org.uk> wrote: > On Tue, Nov 06, 2018 at 03:39:58PM +0100, Ard Biesheuvel wrote: >> On 6 November 2018 at 15:29, Russell King - ARM Linux >> <linux@armlinux.org.uk> wrote: >> > On Tue, Nov 06, 2018 at 03:13:07PM +0100, Ard Biesheuvel wrote: >> >> On 6 November 2018 at 15:10, Russell King - ARM Linux >> >> <linux@armlinux.org.uk> wrote: >> >> > On Tue, Nov 06, 2018 at 03:08:06PM +0100, Ard Biesheuvel wrote: >> >> >> On 6 November 2018 at 14:40, Russell King <rmk+kernel@armlinux.org.uk> wrote: >> >> >> > When building with -fdata-sections, the BSS data is placed into separate >> >> >> > sections, so we need to ask the linker to group these together into the >> >> >> > .bss output section, so that they're correctly placed. >> >> >> > >> >> >> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> >> >> >> > --- >> >> >> > arch/arm/boot/compressed/vmlinux.lds.S | 2 +- >> >> >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> >> >> > >> >> >> > diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S >> >> >> > index 2b963d8e76dd..3b91bc3c606f 100644 >> >> >> > --- a/arch/arm/boot/compressed/vmlinux.lds.S >> >> >> > +++ b/arch/arm/boot/compressed/vmlinux.lds.S >> >> >> > @@ -118,7 +118,7 @@ SECTIONS >> >> >> > >> >> >> > . = BSS_START; >> >> >> > __bss_start = .; >> >> >> > - .bss : { *(.bss) } >> >> >> > + .bss : { *(.bss) *(.bss.*) } >> >> >> >> >> >> Would it make sense to sort these by alignment? Otherwise, I suspect >> >> >> you may get a lot of padding holes due to the number of different >> >> >> input sections, each with its own alignment. >> >> > >> >> > We don't bother elsewhere in the kernel linker script - do you have >> >> > a case where we get lots of padding? >> >> > >> >> >> >> With -fdata-sections? How else are we ensuring that each .bss item in >> >> its own section is not placed such that its alignment results in a >> >> padding hole? >> > >> > Quite simply, we don't care (at the moment). The alignment does not >> > come from the size of each individual section, but the alignment >> > requirements of data within the section. >> > >> > See include/asm-generic/vmlinux.lds.h and the definition of *_MAIN >> > symbols from around line 60. >> > >> >> This seems like an oversight to me: consider something like* >> >> unsigned char bar = 1; >> unsigned long long foo = 4; >> >> build it with >> >> arm-linux-gnueabihf-gcc -fdata-sections -o /tmp/foo.o -c /tmp/foo.c >> >> and we end up with the following object file >> >> Section Headers: >> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al >> [ 0] NULL 00000000 000000 000000 00 0 0 0 >> [ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 1 >> [ 2] .data PROGBITS 00000000 000034 000000 00 WA 0 0 1 >> [ 3] .bss NOBITS 00000000 000034 000000 00 WA 0 0 1 >> [ 4] .data.bar PROGBITS 00000000 000034 000001 00 WA 0 0 1 >> [ 5] .data.foo PROGBITS 00000000 000038 000008 00 WA 0 0 8 >> ... >> >> Symbol table '.symtab' contains 13 entries: >> Num: Value Size Type Bind Vis Ndx Name >> 11: 00000000 8 OBJECT GLOBAL DEFAULT 5 foo >> 12: 00000000 1 OBJECT GLOBAL DEFAULT 4 bar >> >> whereas if I remove the -fdata-sections argument, I get >> >> Section Headers: >> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al >> [ 0] NULL 00000000 000000 000000 00 0 0 0 >> [ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 1 >> [ 2] .data PROGBITS 00000000 000038 000009 00 WA 0 0 8 >> [ 3] .bss NOBITS 00000000 000041 000000 00 WA 0 0 1 >> ... >> >> and >> >> Symbol table '.symtab' contains 11 entries: >> Num: Value Size Type Bind Vis Ndx Name >> ... >> 9: 00000000 8 OBJECT GLOBAL DEFAULT 2 foo >> 10: 00000008 1 OBJECT GLOBAL DEFAULT 2 bar >> >> In other words, we end up with a 7 byte padding hole by switching to >> -fdata-sections, unless we sort the input objects by alignment. >> >> * for my test compiile, -fdata-sections would not produce different >> input section so I am using .data for this example. > > Nevertheless, changing this is not a subject for a patch series that > targets ARM. What you've identified is a deficiency in the generic > cross-arch support for this which needs a wider audience to be > involved. > Fair enough. For the record, when I build the arm64 defconfig kernel with LD_DEAD_CODE_ELIMINATION turned on, adding --sort-section=alignment reduces the image size from text data bss dec hex filename 11951684 6936332 385040 19273056 1261560 vmlinux to text data bss dec hex filename 11938452 6930404 383625 19252481 125c501 vmlinux
On Tue, Nov 06, 2018 at 04:18:46PM +0100, Ard Biesheuvel wrote: > On 6 November 2018 at 15:45, Russell King - ARM Linux > <linux@armlinux.org.uk> wrote: > > On Tue, Nov 06, 2018 at 03:39:58PM +0100, Ard Biesheuvel wrote: > >> On 6 November 2018 at 15:29, Russell King - ARM Linux > >> <linux@armlinux.org.uk> wrote: > >> > On Tue, Nov 06, 2018 at 03:13:07PM +0100, Ard Biesheuvel wrote: > >> >> On 6 November 2018 at 15:10, Russell King - ARM Linux > >> >> <linux@armlinux.org.uk> wrote: > >> >> > On Tue, Nov 06, 2018 at 03:08:06PM +0100, Ard Biesheuvel wrote: > >> >> >> On 6 November 2018 at 14:40, Russell King <rmk+kernel@armlinux.org.uk> wrote: > >> >> >> > When building with -fdata-sections, the BSS data is placed into separate > >> >> >> > sections, so we need to ask the linker to group these together into the > >> >> >> > .bss output section, so that they're correctly placed. > >> >> >> > > >> >> >> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> > >> >> >> > --- > >> >> >> > arch/arm/boot/compressed/vmlinux.lds.S | 2 +- > >> >> >> > 1 file changed, 1 insertion(+), 1 deletion(-) > >> >> >> > > >> >> >> > diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S > >> >> >> > index 2b963d8e76dd..3b91bc3c606f 100644 > >> >> >> > --- a/arch/arm/boot/compressed/vmlinux.lds.S > >> >> >> > +++ b/arch/arm/boot/compressed/vmlinux.lds.S > >> >> >> > @@ -118,7 +118,7 @@ SECTIONS > >> >> >> > > >> >> >> > . = BSS_START; > >> >> >> > __bss_start = .; > >> >> >> > - .bss : { *(.bss) } > >> >> >> > + .bss : { *(.bss) *(.bss.*) } > >> >> >> > >> >> >> Would it make sense to sort these by alignment? Otherwise, I suspect > >> >> >> you may get a lot of padding holes due to the number of different > >> >> >> input sections, each with its own alignment. > >> >> > > >> >> > We don't bother elsewhere in the kernel linker script - do you have > >> >> > a case where we get lots of padding? > >> >> > > >> >> > >> >> With -fdata-sections? How else are we ensuring that each .bss item in > >> >> its own section is not placed such that its alignment results in a > >> >> padding hole? > >> > > >> > Quite simply, we don't care (at the moment). The alignment does not > >> > come from the size of each individual section, but the alignment > >> > requirements of data within the section. > >> > > >> > See include/asm-generic/vmlinux.lds.h and the definition of *_MAIN > >> > symbols from around line 60. > >> > > >> > >> This seems like an oversight to me: consider something like* > >> > >> unsigned char bar = 1; > >> unsigned long long foo = 4; > >> > >> build it with > >> > >> arm-linux-gnueabihf-gcc -fdata-sections -o /tmp/foo.o -c /tmp/foo.c > >> > >> and we end up with the following object file > >> > >> Section Headers: > >> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al > >> [ 0] NULL 00000000 000000 000000 00 0 0 0 > >> [ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 1 > >> [ 2] .data PROGBITS 00000000 000034 000000 00 WA 0 0 1 > >> [ 3] .bss NOBITS 00000000 000034 000000 00 WA 0 0 1 > >> [ 4] .data.bar PROGBITS 00000000 000034 000001 00 WA 0 0 1 > >> [ 5] .data.foo PROGBITS 00000000 000038 000008 00 WA 0 0 8 > >> ... > >> > >> Symbol table '.symtab' contains 13 entries: > >> Num: Value Size Type Bind Vis Ndx Name > >> 11: 00000000 8 OBJECT GLOBAL DEFAULT 5 foo > >> 12: 00000000 1 OBJECT GLOBAL DEFAULT 4 bar > >> > >> whereas if I remove the -fdata-sections argument, I get > >> > >> Section Headers: > >> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al > >> [ 0] NULL 00000000 000000 000000 00 0 0 0 > >> [ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 1 > >> [ 2] .data PROGBITS 00000000 000038 000009 00 WA 0 0 8 > >> [ 3] .bss NOBITS 00000000 000041 000000 00 WA 0 0 1 > >> ... > >> > >> and > >> > >> Symbol table '.symtab' contains 11 entries: > >> Num: Value Size Type Bind Vis Ndx Name > >> ... > >> 9: 00000000 8 OBJECT GLOBAL DEFAULT 2 foo > >> 10: 00000008 1 OBJECT GLOBAL DEFAULT 2 bar > >> > >> In other words, we end up with a 7 byte padding hole by switching to > >> -fdata-sections, unless we sort the input objects by alignment. > >> > >> * for my test compiile, -fdata-sections would not produce different > >> input section so I am using .data for this example. > > > > Nevertheless, changing this is not a subject for a patch series that > > targets ARM. What you've identified is a deficiency in the generic > > cross-arch support for this which needs a wider audience to be > > involved. > > > > Fair enough. > > For the record, when I build the arm64 defconfig kernel with > LD_DEAD_CODE_ELIMINATION turned on, adding --sort-section=alignment > reduces the image size from > > text data bss dec hex filename > 11951684 6936332 385040 19273056 1261560 vmlinux > > to > > text data bss dec hex filename > 11938452 6930404 383625 19252481 125c501 vmlinux I'm not sure that's relevant when this thread is about 32bit ARM. It's a little like building an x86 kernel and posting the results. Two entirely different architectures with different results. Here's the numbers for an imx6 ARM kernel: text data bss dec hex filename 8438980 3255060 9640528 21334568 1458a28 imx6-unpatched/vmlinux 6164770 84 4120 6168974 5e218e imx6-unpatched/arch/arm/boot/compressed/vmlinux 8438980 3255060 9640528 21334568 1458a28 imx6-dc-dis/vmlinux 6164770 84 4120 6168974 5e218e imx6-dc-dis/arch/arm/boot/compressed/vmlinux 8448960 3253388 9639992 21342340 145a884 imx6-dc-ena/vmlinux 6175233 84 4120 6179437 5e4a6d imx6-dc-ena/arch/arm/boot/compressed/vmlinux 8446700 3245700 9639212 21331612 1457e9c imx6-dc-sort/vmlinux 6172284 84 4120 6176488 5e3ee8 imx6-dc-sort/arch/arm/boot/compressed/vmlinux The interesting thing here is that while the data sizes are reduced by enabling this option, but the text size actually _increases_ by more than we save. So the result is an overall bigger kernel. Presumably this is because of the literal pools needing to be larger in the .text segments, and more instructions necessary to access the data as the compiler can no longer know the relative displacement between neighbouring data items in the object file. So, it seems dead-code elimination is not that useful on 32-bit ARM.
On 6 November 2018 at 17:06, Russell King - ARM Linux <linux@armlinux.org.uk> wrote: > On Tue, Nov 06, 2018 at 04:18:46PM +0100, Ard Biesheuvel wrote: >> On 6 November 2018 at 15:45, Russell King - ARM Linux >> <linux@armlinux.org.uk> wrote: >> > On Tue, Nov 06, 2018 at 03:39:58PM +0100, Ard Biesheuvel wrote: >> >> On 6 November 2018 at 15:29, Russell King - ARM Linux >> >> <linux@armlinux.org.uk> wrote: >> >> > On Tue, Nov 06, 2018 at 03:13:07PM +0100, Ard Biesheuvel wrote: >> >> >> On 6 November 2018 at 15:10, Russell King - ARM Linux >> >> >> <linux@armlinux.org.uk> wrote: >> >> >> > On Tue, Nov 06, 2018 at 03:08:06PM +0100, Ard Biesheuvel wrote: >> >> >> >> On 6 November 2018 at 14:40, Russell King <rmk+kernel@armlinux.org.uk> wrote: >> >> >> >> > When building with -fdata-sections, the BSS data is placed into separate >> >> >> >> > sections, so we need to ask the linker to group these together into the >> >> >> >> > .bss output section, so that they're correctly placed. >> >> >> >> > >> >> >> >> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> >> >> >> >> > --- >> >> >> >> > arch/arm/boot/compressed/vmlinux.lds.S | 2 +- >> >> >> >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> >> >> >> > >> >> >> >> > diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S >> >> >> >> > index 2b963d8e76dd..3b91bc3c606f 100644 >> >> >> >> > --- a/arch/arm/boot/compressed/vmlinux.lds.S >> >> >> >> > +++ b/arch/arm/boot/compressed/vmlinux.lds.S >> >> >> >> > @@ -118,7 +118,7 @@ SECTIONS >> >> >> >> > >> >> >> >> > . = BSS_START; >> >> >> >> > __bss_start = .; >> >> >> >> > - .bss : { *(.bss) } >> >> >> >> > + .bss : { *(.bss) *(.bss.*) } >> >> >> >> >> >> >> >> Would it make sense to sort these by alignment? Otherwise, I suspect >> >> >> >> you may get a lot of padding holes due to the number of different >> >> >> >> input sections, each with its own alignment. >> >> >> > >> >> >> > We don't bother elsewhere in the kernel linker script - do you have >> >> >> > a case where we get lots of padding? >> >> >> > >> >> >> >> >> >> With -fdata-sections? How else are we ensuring that each .bss item in >> >> >> its own section is not placed such that its alignment results in a >> >> >> padding hole? >> >> > >> >> > Quite simply, we don't care (at the moment). The alignment does not >> >> > come from the size of each individual section, but the alignment >> >> > requirements of data within the section. >> >> > >> >> > See include/asm-generic/vmlinux.lds.h and the definition of *_MAIN >> >> > symbols from around line 60. >> >> > >> >> >> >> This seems like an oversight to me: consider something like* >> >> >> >> unsigned char bar = 1; >> >> unsigned long long foo = 4; >> >> >> >> build it with >> >> >> >> arm-linux-gnueabihf-gcc -fdata-sections -o /tmp/foo.o -c /tmp/foo.c >> >> >> >> and we end up with the following object file >> >> >> >> Section Headers: >> >> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al >> >> [ 0] NULL 00000000 000000 000000 00 0 0 0 >> >> [ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 1 >> >> [ 2] .data PROGBITS 00000000 000034 000000 00 WA 0 0 1 >> >> [ 3] .bss NOBITS 00000000 000034 000000 00 WA 0 0 1 >> >> [ 4] .data.bar PROGBITS 00000000 000034 000001 00 WA 0 0 1 >> >> [ 5] .data.foo PROGBITS 00000000 000038 000008 00 WA 0 0 8 >> >> ... >> >> >> >> Symbol table '.symtab' contains 13 entries: >> >> Num: Value Size Type Bind Vis Ndx Name >> >> 11: 00000000 8 OBJECT GLOBAL DEFAULT 5 foo >> >> 12: 00000000 1 OBJECT GLOBAL DEFAULT 4 bar >> >> >> >> whereas if I remove the -fdata-sections argument, I get >> >> >> >> Section Headers: >> >> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al >> >> [ 0] NULL 00000000 000000 000000 00 0 0 0 >> >> [ 1] .text PROGBITS 00000000 000034 000000 00 AX 0 0 1 >> >> [ 2] .data PROGBITS 00000000 000038 000009 00 WA 0 0 8 >> >> [ 3] .bss NOBITS 00000000 000041 000000 00 WA 0 0 1 >> >> ... >> >> >> >> and >> >> >> >> Symbol table '.symtab' contains 11 entries: >> >> Num: Value Size Type Bind Vis Ndx Name >> >> ... >> >> 9: 00000000 8 OBJECT GLOBAL DEFAULT 2 foo >> >> 10: 00000008 1 OBJECT GLOBAL DEFAULT 2 bar >> >> >> >> In other words, we end up with a 7 byte padding hole by switching to >> >> -fdata-sections, unless we sort the input objects by alignment. >> >> >> >> * for my test compiile, -fdata-sections would not produce different >> >> input section so I am using .data for this example. >> > >> > Nevertheless, changing this is not a subject for a patch series that >> > targets ARM. What you've identified is a deficiency in the generic >> > cross-arch support for this which needs a wider audience to be >> > involved. >> > >> >> Fair enough. >> >> For the record, when I build the arm64 defconfig kernel with >> LD_DEAD_CODE_ELIMINATION turned on, adding --sort-section=alignment >> reduces the image size from >> >> text data bss dec hex filename >> 11951684 6936332 385040 19273056 1261560 vmlinux >> >> to >> >> text data bss dec hex filename >> 11938452 6930404 383625 19252481 125c501 vmlinux > > I'm not sure that's relevant when this thread is about 32bit ARM. > It's a little like building an x86 kernel and posting the results. > Two entirely different architectures with different results. > You said it was a generic issue not an ARM issue, to which I replied 'fair enough', after which I shared some information that is relevant to the generic issue under discussion. > Here's the numbers for an imx6 ARM kernel: > > text data bss dec hex filename > 8438980 3255060 9640528 21334568 1458a28 imx6-unpatched/vmlinux > 6164770 84 4120 6168974 5e218e imx6-unpatched/arch/arm/boot/compressed/vmlinux > 8438980 3255060 9640528 21334568 1458a28 imx6-dc-dis/vmlinux > 6164770 84 4120 6168974 5e218e imx6-dc-dis/arch/arm/boot/compressed/vmlinux > 8448960 3253388 9639992 21342340 145a884 imx6-dc-ena/vmlinux > 6175233 84 4120 6179437 5e4a6d imx6-dc-ena/arch/arm/boot/compressed/vmlinux > 8446700 3245700 9639212 21331612 1457e9c imx6-dc-sort/vmlinux > 6172284 84 4120 6176488 5e3ee8 imx6-dc-sort/arch/arm/boot/compressed/vmlinux > > The interesting thing here is that while the data sizes are reduced > by enabling this option, but the text size actually _increases_ by > more than we save. So the result is an overall bigger kernel. > > Presumably this is because of the literal pools needing to be larger > in the .text segments, and more instructions necessary to access the > data as the compiler can no longer know the relative displacement > between neighbouring data items in the object file. > I guess the compiler can no longer share literal pools between functions, so they get duplicated into each one. > So, it seems dead-code elimination is not that useful on 32-bit ARM. > Agreed.
diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S index 2b963d8e76dd..3b91bc3c606f 100644 --- a/arch/arm/boot/compressed/vmlinux.lds.S +++ b/arch/arm/boot/compressed/vmlinux.lds.S @@ -118,7 +118,7 @@ SECTIONS . = BSS_START; __bss_start = .; - .bss : { *(.bss) } + .bss : { *(.bss) *(.bss.*) } _end = .; . = ALIGN(8); /* the stack must be 64-bit aligned */
When building with -fdata-sections, the BSS data is placed into separate sections, so we need to ask the linker to group these together into the .bss output section, so that they're correctly placed. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> --- arch/arm/boot/compressed/vmlinux.lds.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)