Message ID | 20230109135828.879136-2-mark.rutland@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64/ftrace: Add support for DYNAMIC_FTRACE_WITH_CALL_OPS | expand |
On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote: > > As far as I can tell, GCC doesn't respect '-falign-functions=N': > > * When the __weak__ attribute is used > > GCC seems to forget the alignment specified by '-falign-functions=N', > but will respect the '__aligned__(N)' function attribute. Thus, we can > work around this by explciitly setting the alignment for weak > functions. > > * When the __cold__ attribute is used > > GCC seems to forget the alignment specified by '-falign-functions=N', > and also doesn't seem to respect the '__aligned__(N)' function > attribute. The only way to work around this is to not use the __cold__ > attibute. If you happen to have a reduced case, then it would be nice to link it in the commit. A bug report to GCC would also be nice. I gave it a very quick try in Compiler Explorer, but I couldn't reproduce it, so I guess it depends on flags, non-trivial functions or something else. > + * '-falign-functions=N', and require alignment to be specificed via a function Nit: specificed -> specified > +#if CONFIG_FUNCTION_ALIGNMENT > 0 > +#define __function_aligned __aligned(CONFIG_FUNCTION_ALIGNMENT) > +#else > +#define __function_aligned > +#endif Currently, the file is intended for attributes that do not depend on `CONFIG_*` options. What I usually mention is that we could change that policy, but otherwise these would go into e.g. `compiler_types.h`. > +#if !defined(CONFIG_CC_IS_GCC) || (CONFIG_FUNCTION_ALIGNMENT == 0) > #define __cold __attribute__((__cold__)) > +#else > +#define __cold > +#endif Similarly, in this case this could go into `compiler-gcc.h` / `compiler-clang.h` etc., since the definition will be different for each. Cheers, Miguel
On Mon, Jan 09, 2023 at 03:43:16PM +0100, Miguel Ojeda wrote: > On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > As far as I can tell, GCC doesn't respect '-falign-functions=N': > > > > * When the __weak__ attribute is used > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > but will respect the '__aligned__(N)' function attribute. Thus, we can > > work around this by explciitly setting the alignment for weak Whoops: s/explciitly/explicitly/ here too; I'll go re-proofread the series. > > functions. > > > > * When the __cold__ attribute is used > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > and also doesn't seem to respect the '__aligned__(N)' function > > attribute. The only way to work around this is to not use the __cold__ > > attibute. Whoops: s/attibute/attribute/ > If you happen to have a reduced case, then it would be nice to link it > in the commit. A bug report to GCC would also be nice. > > I gave it a very quick try in Compiler Explorer, but I couldn't > reproduce it, so I guess it depends on flags, non-trivial functions or > something else. Sorry, that is something I had intendeed to do but I hadn't extracted a reproducer yet. I'll try to come up with something that can be included in the commit message and reported to GCC folk (and double-check at the same time that there's not another hidden cause) With this series applied and this patch reverted, it's possible to see when building defconfig + CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y, where scanning /proc/kallsyms with: $ grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' ... will show a bunch of cold functions (and their callees/callers), largely init/exit functions (so I'll double-check whether section handling as an effect), e.g. ffffdf08be173b8c t snd_soc_exit ffffdf08be173bc4 t apple_mca_driver_exit ffffdf08be173be8 t failover_exit ffffdf08be173c10 t inet_diag_exit ffffdf08be173c60 t tcp_diag_exit ffffdf08be173c84 t cubictcp_unregister ffffdf08be173cac t af_unix_exit ffffdf08be173cf4 t packet_exit ffffdf08be173d3c t cleanup_sunrpc ffffdf08be173d8c t exit_rpcsec_gss ffffdf08be173dc4 t exit_p9 ffffdf08be173dec T p9_client_exit ffffdf08be173e10 t p9_trans_fd_exit ffffdf08be173e58 t p9_virtio_cleanup ffffdf08be173e90 t exit_dns_resolver > > + * '-falign-functions=N', and require alignment to be specificed via a function > > Nit: specificed -> specified Thanks, fixed > > +#if CONFIG_FUNCTION_ALIGNMENT > 0 > > +#define __function_aligned __aligned(CONFIG_FUNCTION_ALIGNMENT) > > +#else > > +#define __function_aligned > > +#endif > > Currently, the file is intended for attributes that do not depend on > `CONFIG_*` options. > > What I usually mention is that we could change that policy, but > otherwise these would go into e.g. `compiler_types.h`. I'm happy to move these, I just wasn't sure what the policy would be w.r.t. the existing __weak and __cold defitions since those end up depending upon __function_aligned. I assume I should move them all? i.e. move __weak as well? > > +#if !defined(CONFIG_CC_IS_GCC) || (CONFIG_FUNCTION_ALIGNMENT == 0) > > #define __cold __attribute__((__cold__)) > > +#else > > +#define __cold > > +#endif > > Similarly, in this case this could go into `compiler-gcc.h` / > `compiler-clang.h` etc., since the definition will be different for > each. Sure, can do. Thanks, Mark.
On Mon, Jan 9, 2023 at 6:06 PM Mark Rutland <mark.rutland@arm.com> wrote: > > Sorry, that is something I had intendeed to do but I hadn't extracted a > reproducer yet. I'll try to come up with something that can be included in the > commit message and reported to GCC folk (and double-check at the same time that > there's not another hidden cause) Yeah, no worries :) I suggested it because from my quick test it didn't appear to be reproducible trivially, so I thought having the reproducer would be nice. > I'm happy to move these, I just wasn't sure what the policy would be w.r.t. the > existing __weak and __cold defitions since those end up depending upon > __function_aligned. > > I assume I should move them all? i.e. move __weak as well? Yeah, with the current policy, all should be moved since their behavior now depends on the config (eventually). Cheers, Miguel
On Mon, Jan 09, 2023 at 03:43:16PM +0100, Miguel Ojeda wrote: > On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > As far as I can tell, GCC doesn't respect '-falign-functions=N': > > > > * When the __weak__ attribute is used > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > but will respect the '__aligned__(N)' function attribute. Thus, we can > > work around this by explciitly setting the alignment for weak > > functions. > > > > * When the __cold__ attribute is used > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > and also doesn't seem to respect the '__aligned__(N)' function > > attribute. The only way to work around this is to not use the __cold__ > > attibute. > > If you happen to have a reduced case, then it would be nice to link it > in the commit. A bug report to GCC would also be nice. > > I gave it a very quick try in Compiler Explorer, but I couldn't > reproduce it, so I guess it depends on flags, non-trivial functions or > something else. So having spent today coming up with tests, it turns out it's not quite as I described above, but in a sense worse. I'm posting a summary here for posterity; I'll try to get this to compiler folk shortly. GCC appears to not align cold functions to the alignment specified by `-falign-functions=N` when compiling at `-O1` or above. Alignment *can* be restored with explicit attributes on each function, but due to some interprocedural analysis, callees can be implicitly marked as cold (losing their default alignment), which means we don't have a reliable mechanism to ensure functions are always aligned short of annotating *every* function explicitly (and I suspect that's not sufficient due to some interprocedural optimizations). I've tested with the 12.1.0 binary release from the kernel.org cross toolchains page). LLVM always seems to repsect `-falign-functions=N` at both `-O1` and `-O2` (I tested the 10.0.0, 11.0.0, 11.0.1, 15.0.6 binary releases from llvm.org). For example: | [mark@lakrids:/mnt/data/tests/gcc-alignment]% cat test-cold.c | #define __cold \ | __attribute__((cold)) | | #define EXPORT_FUNC_PTR(func) \ | typeof((func)) *__ptr_##func = (func) | | __cold | void cold_func_a(void) { } | | __cold | void cold_func_b(void) { } | | __cold | void cold_func_c(void) { } | | static __cold | void static_cold_func_a(void) { } | EXPORT_FUNC_PTR(static_cold_func_a); | | static __cold | void static_cold_func_b(void) { } | EXPORT_FUNC_PTR(static_cold_func_b); | | static __cold | void static_cold_func_c(void) { } | EXPORT_FUNC_PTR(static_cold_func_c); | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-gcc -falign-functions=16 -c test-cold.c -O1 | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -d test-cold.o | | test-cold.o: file format elf64-littleaarch64 | | | Disassembly of section .text: | | 0000000000000000 <static_cold_func_a>: | 0: d65f03c0 ret | | 0000000000000004 <static_cold_func_b>: | 4: d65f03c0 ret | | 0000000000000008 <static_cold_func_c>: | 8: d65f03c0 ret | | 000000000000000c <cold_func_a>: | c: d65f03c0 ret | | 0000000000000010 <cold_func_b>: | 10: d65f03c0 ret | | 0000000000000014 <cold_func_c>: | 14: d65f03c0 ret | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -h test-cold.o | | test-cold.o: file format elf64-littleaarch64 | | Sections: | Idx Name Size VMA LMA File off Algn | 0 .text 00000018 0000000000000000 0000000000000000 00000040 2**2 | CONTENTS, ALLOC, LOAD, READONLY, CODE | 1 .data 00000018 0000000000000000 0000000000000000 00000058 2**3 | CONTENTS, ALLOC, LOAD, RELOC, DATA | 2 .bss 00000000 0000000000000000 0000000000000000 00000070 2**0 | ALLOC | 3 .comment 00000013 0000000000000000 0000000000000000 00000070 2**0 | CONTENTS, READONLY | 4 .note.GNU-stack 00000000 0000000000000000 0000000000000000 00000083 2**0 | CONTENTS, READONLY | 5 .eh_frame 00000090 0000000000000000 0000000000000000 00000088 2**3 | CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA In simple cases, alignment *can* be restored if an explicit function attribute is used. For example: | [mark@lakrids:/mnt/data/tests/gcc-alignment]% cat test-aligned-cold.c | #define __aligned(n) \ | __attribute__((aligned(n))) | | #define __cold \ | __attribute__((cold)) __aligned(16) | | #define EXPORT_FUNC_PTR(func) \ | typeof((func)) *__ptr_##func = (func) | | __cold | void cold_func_a(void) { } | | __cold | void cold_func_b(void) { } | | __cold | void cold_func_c(void) { } | | static __cold | void static_cold_func_a(void) { } | EXPORT_FUNC_PTR(static_cold_func_a); | | static __cold | void static_cold_func_b(void) { } | EXPORT_FUNC_PTR(static_cold_func_b); | | static __cold | void static_cold_func_c(void) { } | EXPORT_FUNC_PTR(static_cold_func_c); | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-gcc -falign-functions=16 -c test-aligned-cold.c -O1 | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -d test-aligned-cold.o | | test-aligned-cold.o: file format elf64-littleaarch64 | | | Disassembly of section .text: | | 0000000000000000 <static_cold_func_a>: | 0: d65f03c0 ret | 4: d503201f nop | 8: d503201f nop | c: d503201f nop | | 0000000000000010 <static_cold_func_b>: | 10: d65f03c0 ret | 14: d503201f nop | 18: d503201f nop | 1c: d503201f nop | | 0000000000000020 <static_cold_func_c>: | 20: d65f03c0 ret | 24: d503201f nop | 28: d503201f nop | 2c: d503201f nop | | 0000000000000030 <cold_func_a>: | 30: d65f03c0 ret | 34: d503201f nop | 38: d503201f nop | 3c: d503201f nop | | 0000000000000040 <cold_func_b>: | 40: d65f03c0 ret | 44: d503201f nop | 48: d503201f nop | 4c: d503201f nop | | 0000000000000050 <cold_func_c>: | 50: d65f03c0 ret | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -h test-aligned-cold.o | | test-aligned-cold.o: file format elf64-littleaarch64 | | Sections: | Idx Name Size VMA LMA File off Algn | 0 .text 00000054 0000000000000000 0000000000000000 00000040 2**4 | CONTENTS, ALLOC, LOAD, READONLY, CODE | 1 .data 00000018 0000000000000000 0000000000000000 00000098 2**3 | CONTENTS, ALLOC, LOAD, RELOC, DATA | 2 .bss 00000000 0000000000000000 0000000000000000 000000b0 2**0 | ALLOC | 3 .comment 00000013 0000000000000000 0000000000000000 000000b0 2**0 | CONTENTS, READONLY | 4 .note.GNU-stack 00000000 0000000000000000 0000000000000000 000000c3 2**0 | CONTENTS, READONLY | 5 .eh_frame 00000090 0000000000000000 0000000000000000 000000c8 2**3 | CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA Unfortunately it appears that some interprocedural analysis determines that if a callee is only called/referenced from cold callers, the callee is marked as cold, and the alignment it would have got from the command line option is dropped. If it's given an explicit alignment attribute, the alignment is retained. For example: | [mark@lakrids:/mnt/data/tests/gcc-alignment]% cat test-aligned-cold-caller.c | #define noinline \ | __attribute__((noinline)) | | #define __aligned(n) \ | __attribute__((aligned(n))) | | #define __cold \ | __attribute__((cold)) __aligned(16) | | #define EXPORT_FUNC_PTR(func) \ | typeof((func)) *__ptr_##func = (func) | | static noinline void callee_a(void) | { | asm volatile("// callee_a\n" ::: "memory"); | } | | static noinline void callee_b(void) | { | asm volatile("// callee_b\n" ::: "memory"); | } | | static noinline void callee_c(void) | { | asm volatile("// callee_c\n" ::: "memory"); | } | __cold | void cold_func_a(void) { callee_a(); } | | __cold | void cold_func_b(void) { callee_b(); } | | __cold | void cold_func_c(void) { callee_c(); } | | static __cold | void static_cold_func_a(void) { callee_a(); } | EXPORT_FUNC_PTR(static_cold_func_a); | | static __cold | void static_cold_func_b(void) { callee_b(); } | EXPORT_FUNC_PTR(static_cold_func_b); | | static __cold | void static_cold_func_c(void) { callee_c(); } | EXPORT_FUNC_PTR(static_cold_func_c); | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-gcc -falign-functions=16 -c test-aligned-cold-caller.c -O1 | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -d test-aligned-cold-caller.o | | test-aligned-cold-caller.o: file format elf64-littleaarch64 | | | Disassembly of section .text: | | 0000000000000000 <callee_a>: | 0: d65f03c0 ret | | 0000000000000004 <callee_b>: | 4: d65f03c0 ret | | 0000000000000008 <callee_c>: | 8: d65f03c0 ret | c: d503201f nop | | 0000000000000010 <static_cold_func_a>: | 10: a9bf7bfd stp x29, x30, [sp, #-16]! | 14: 910003fd mov x29, sp | 18: 97fffffa bl 0 <callee_a> | 1c: a8c17bfd ldp x29, x30, [sp], #16 | 20: d65f03c0 ret | 24: d503201f nop | 28: d503201f nop | 2c: d503201f nop | | 0000000000000030 <static_cold_func_b>: | 30: a9bf7bfd stp x29, x30, [sp, #-16]! | 34: 910003fd mov x29, sp | 38: 97fffff3 bl 4 <callee_b> | 3c: a8c17bfd ldp x29, x30, [sp], #16 | 40: d65f03c0 ret | 44: d503201f nop | 48: d503201f nop | 4c: d503201f nop | | 0000000000000050 <static_cold_func_c>: | 50: a9bf7bfd stp x29, x30, [sp, #-16]! | 54: 910003fd mov x29, sp | 58: 97ffffec bl 8 <callee_c> | 5c: a8c17bfd ldp x29, x30, [sp], #16 | 60: d65f03c0 ret | 64: d503201f nop | 68: d503201f nop | 6c: d503201f nop | | 0000000000000070 <cold_func_a>: | 70: a9bf7bfd stp x29, x30, [sp, #-16]! | 74: 910003fd mov x29, sp | 78: 97ffffe2 bl 0 <callee_a> | 7c: a8c17bfd ldp x29, x30, [sp], #16 | 80: d65f03c0 ret | 84: d503201f nop | 88: d503201f nop | 8c: d503201f nop | | 0000000000000090 <cold_func_b>: | 90: a9bf7bfd stp x29, x30, [sp, #-16]! | 94: 910003fd mov x29, sp | 98: 97ffffdb bl 4 <callee_b> | 9c: a8c17bfd ldp x29, x30, [sp], #16 | a0: d65f03c0 ret | a4: d503201f nop | a8: d503201f nop | ac: d503201f nop | | 00000000000000b0 <cold_func_c>: | b0: a9bf7bfd stp x29, x30, [sp, #-16]! | b4: 910003fd mov x29, sp | b8: 97ffffd4 bl 8 <callee_c> | bc: a8c17bfd ldp x29, x30, [sp], #16 | c0: d65f03c0 ret | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -h test-aligned-cold-caller.o | | test-aligned-cold-caller.o: file format elf64-littleaarch64 | | Sections: | Idx Name Size VMA LMA File off Algn | 0 .text 000000c4 0000000000000000 0000000000000000 00000040 2**4 | CONTENTS, ALLOC, LOAD, READONLY, CODE | 1 .data 00000018 0000000000000000 0000000000000000 00000108 2**3 | CONTENTS, ALLOC, LOAD, RELOC, DATA | 2 .bss 00000000 0000000000000000 0000000000000000 00000120 2**0 | ALLOC | 3 .comment 00000013 0000000000000000 0000000000000000 00000120 2**0 | CONTENTS, READONLY | 4 .note.GNU-stack 00000000 0000000000000000 0000000000000000 00000133 2**0 | CONTENTS, READONLY | 5 .eh_frame 00000110 0000000000000000 0000000000000000 00000138 2**3 | CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA Thanks, Mark.
On Wed, Jan 11, 2023 at 06:27:53PM +0000, Mark Rutland wrote: > On Mon, Jan 09, 2023 at 03:43:16PM +0100, Miguel Ojeda wrote: > > On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > As far as I can tell, GCC doesn't respect '-falign-functions=N': > > > > > > * When the __weak__ attribute is used > > > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > > but will respect the '__aligned__(N)' function attribute. Thus, we can > > > work around this by explciitly setting the alignment for weak > > > functions. > > > > > > * When the __cold__ attribute is used > > > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > > and also doesn't seem to respect the '__aligned__(N)' function > > > attribute. The only way to work around this is to not use the __cold__ > > > attibute. > > > > If you happen to have a reduced case, then it would be nice to link it > > in the commit. A bug report to GCC would also be nice. > > > > I gave it a very quick try in Compiler Explorer, but I couldn't > > reproduce it, so I guess it depends on flags, non-trivial functions or > > something else. > > So having spent today coming up with tests, it turns out it's not quite as I > described above, but in a sense worse. I'm posting a summary here for > posterity; I'll try to get this to compiler folk shortly. I've added the cold bits to an existing ticket: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345 I have not been able to reproduce the issue with __weak__, so I'll go dig into that some more; it's likely I was mistaken there. Thanks, Mark.
On Thu, Jan 12, 2023 at 11:38:17AM +0000, Mark Rutland wrote: > On Wed, Jan 11, 2023 at 06:27:53PM +0000, Mark Rutland wrote: > > On Mon, Jan 09, 2023 at 03:43:16PM +0100, Miguel Ojeda wrote: > > > On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > As far as I can tell, GCC doesn't respect '-falign-functions=N': > > > > > > > > * When the __weak__ attribute is used > > > > > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > > > but will respect the '__aligned__(N)' function attribute. Thus, we can > > > > work around this by explciitly setting the alignment for weak > > > > functions. > > > > > > > > * When the __cold__ attribute is used > > > > > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > > > and also doesn't seem to respect the '__aligned__(N)' function > > > > attribute. The only way to work around this is to not use the __cold__ > > > > attibute. > > > > > > If you happen to have a reduced case, then it would be nice to link it > > > in the commit. A bug report to GCC would also be nice. > > > > > > I gave it a very quick try in Compiler Explorer, but I couldn't > > > reproduce it, so I guess it depends on flags, non-trivial functions or > > > something else. > > > > So having spent today coming up with tests, it turns out it's not quite as I > > described above, but in a sense worse. I'm posting a summary here for > > posterity; I'll try to get this to compiler folk shortly. > > I've added the cold bits to an existing ticket: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345 > > I have not been able to reproduce the issue with __weak__, so I'll go dig into > that some more; it's likely I was mistaken there. It turns out that was a red herring; GCC is actually implicitly marking the abort() function as cold, and as Linux's implementation happened to be marked as weak I assumed that was the culprit. I'll drop the changes to weak and update our abort implementation specifically, with a comment. I'll also go update the ticket above. Thanks, Mark.
On Fri, Jan 13, 2023 at 1:49 PM Mark Rutland <mark.rutland@arm.com> wrote: > > It turns out that was a red herring; GCC is actually implicitly marking the > abort() function as cold, and as Linux's implementation happened to be marked > as weak I assumed that was the culprit. That and your previous message probably explains probably why I couldn't reproduce it. Thanks a lot for all the details -- the `cold` issue is reproducible since gcc 4.6 at least: https://godbolt.org/z/PoxazzT9T The `abort` case appears to happen since gcc 8.1. Cheers, Miguel
diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h index 898b3458b24a0..dcb7ac67b764f 100644 --- a/include/linux/compiler_attributes.h +++ b/include/linux/compiler_attributes.h @@ -33,6 +33,17 @@ #define __aligned(x) __attribute__((__aligned__(x))) #define __aligned_largest __attribute__((__aligned__)) +/* + * Contemporary versions of GCC (e.g. 12.2.0) don't always respect + * '-falign-functions=N', and require alignment to be specificed via a function + * attribute in some cases. + */ +#if CONFIG_FUNCTION_ALIGNMENT > 0 +#define __function_aligned __aligned(CONFIG_FUNCTION_ALIGNMENT) +#else +#define __function_aligned +#endif + /* * Note: do not use this directly. Instead, use __alloc_size() since it is conditionally * available and includes other attributes. For GCC < 9.1, __alloc_size__ gets undefined @@ -78,8 +89,15 @@ /* * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-cold-function-attribute * gcc: https://gcc.gnu.org/onlinedocs/gcc/Label-Attributes.html#index-cold-label-attribute + * + * GCC drops function alignment when the __cold__ attribute is used. Avoid the + * __cold__ attribute if function alignment is required. */ +#if !defined(CONFIG_CC_IS_GCC) || (CONFIG_FUNCTION_ALIGNMENT == 0) #define __cold __attribute__((__cold__)) +#else +#define __cold +#endif /* * Note the long name. @@ -369,8 +387,11 @@ /* * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-weak-function-attribute * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-weak-variable-attribute + * + * GCC drops function alignment when the __weak__ attribute is used. This can + * be restored with function attributes. */ -#define __weak __attribute__((__weak__)) +#define __weak __attribute__((__weak__)) __function_aligned /* * Used by functions that use '__builtin_return_address'. These function
From local testing, contemporary versions of of GCC (e.g. GCC 12.2.0) don't respect '-falign-functions=N' in all cases. This is unfortunate, as (for non-zero values of N) CONFIG_FUNCTION_ALIGNMENT=N will set '-falign-functions=N', but this won't take effect for all functions. LLVM appears to respect '-falign-functions=N' in call cases. Today, for x86 this turns out to be functionally benign, though it does somewhat undermine the CONFIG_FUNCTION_ALIGNMENT option, and it means that CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not as robust as we'd like. On arm64 we'd like to use CONFIG_FUNCTION_ALIGNMENT to implement ftrace functionality, and we'll require function alignment to be respected for functional reasons. As far as I can tell, GCC doesn't respect '-falign-functions=N': * When the __weak__ attribute is used GCC seems to forget the alignment specified by '-falign-functions=N', but will respect the '__aligned__(N)' function attribute. Thus, we can work around this by explciitly setting the alignment for weak functions. * When the __cold__ attribute is used GCC seems to forget the alignment specified by '-falign-functions=N', and also doesn't seem to respect the '__aligned__(N)' function attribute. The only way to work around this is to not use the __cold__ attibute. This patch implements workarounds for these two cases, using a function attribute to set the alignment of __weak__ functions, and preventing the use of the __cold__ attribute when CONFIG_FUNCTION_ALIGNMENT is non-zero. I've tested this by selecting CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y, building and booting a kernel, and looking for misaligned text symbols: * arm64: Before: # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l 4939 After: # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l 908 * x86_64: Before: # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l 7969 After: # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l 2057 With the patch applied, the remaining unaligned text labels are a combination of static call trampolines, non-function labels in assembly, and ACPICA functions, which will be dealt with in subsequent patches. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Florent Revest <revest@chromium.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Deacon <will@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> --- include/linux/compiler_attributes.h | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-)