Message ID | 1497887596-8731-1-git-send-email-psodagud@codeaurora.org (mailing list archive) |
---|---|
State | New, archived |
On Mon, 19 Jun 2017, Prasad Sodagudi wrote:

> Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
> static inline functions") re-defining the 'inline' macro but
> __attribute__((always_inline)) is missing. Some compilers may
> not honor inline hint if always_iniline attribute not there.
> So add always_inline attribute to inline as done by
> compiler-gcc.h file.
>

IIUC, __attribute__((always_inline)) was only needed for gcc versions < 4
and that the inlining decision making is improved in >= 4. To make a
change like this, I would think that we would need to show that clang is
making suboptimal decisions. I don't think there's a downside to making
CONFIG_OPTIMIZE_INLINING specific only to gcc.

If it is shown that __attribute__((always_inline)) is needed for clang as
well, this should be done as part of compiler-gcc.h to avoid duplicated
code.

> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
> ---
>  include/linux/compiler-clang.h | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> index d614c5e..deb65b3 100644
> --- a/include/linux/compiler-clang.h
> +++ b/include/linux/compiler-clang.h
> @@ -22,4 +22,9 @@
>   * directives. Suppress the warning in clang as well.
>   */
>  #undef inline
> +#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
> +    !defined(CONFIG_OPTIMIZE_INLINING)
> +#define inline inline __attribute__((always_inline)) __attribute__((unused)) notrace
> +#else
>  #define inline inline __attribute__((unused)) notrace
> +#endif
On 2017-06-19 13:25, David Rientjes wrote:
> On Mon, 19 Jun 2017, Prasad Sodagudi wrote:
>
>> Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
>> static inline functions") re-defining the 'inline' macro but
>> __attribute__((always_inline)) is missing. Some compilers may
>> not honor inline hint if always_iniline attribute not there.
>> So add always_inline attribute to inline as done by
>> compiler-gcc.h file.
>>
>
> IIUC, __attribute__((always_inline)) was only needed for gcc versions < 4
> and that the inlining decision making is improved in >= 4. To make a
> change like this, I would think that we would need to show that clang is
> making suboptimal decisions. I don't think there's a downside to making
> CONFIG_OPTIMIZE_INLINING specific only to gcc.
>
> If it is shown that __attribute__((always_inline)) is needed for clang as
> well, this should be done as part of compiler-gcc.h to avoid duplicated
> code.

Hi David,

Here is the discussion about this change -
https://lkml.org/lkml/2017/6/15/396
Please check Mark's and Will's comments.

-Thanks, Prasad

>
>> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
>> ---
>>  include/linux/compiler-clang.h | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
>> index d614c5e..deb65b3 100644
>> --- a/include/linux/compiler-clang.h
>> +++ b/include/linux/compiler-clang.h
>> @@ -22,4 +22,9 @@
>>   * directives. Suppress the warning in clang as well.
>>   */
>>  #undef inline
>> +#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
>> +    !defined(CONFIG_OPTIMIZE_INLINING)
>> +#define inline inline __attribute__((always_inline)) __attribute__((unused)) notrace
>> +#else
>>  #define inline inline __attribute__((unused)) notrace
>> +#endif
On Mon, 19 Jun 2017, Sodagudi Prasad wrote:

> > > Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
> > > static inline functions") re-defining the 'inline' macro but
> > > __attribute__((always_inline)) is missing. Some compilers may
> > > not honor inline hint if always_iniline attribute not there.
> > > So add always_inline attribute to inline as done by
> > > compiler-gcc.h file.
> > >
> >
> > IIUC, __attribute__((always_inline)) was only needed for gcc versions < 4
> > and that the inlining decision making is improved in >= 4. To make a
> > change like this, I would think that we would need to show that clang is
> > making suboptimal decisions. I don't think there's a downside to making
> > CONFIG_OPTIMIZE_INLINING specific only to gcc.
> >
> > If it is shown that __attribute__((always_inline)) is needed for clang as
> > well, this should be done as part of compiler-gcc.h to avoid duplicated
> > code.
>
> Hi David,
>
> Here is the discussion about this change -
> https://lkml.org/lkml/2017/6/15/396
> Please check mark and will's comments.
>

Yes, the arch/arm64/include/asm/cmpxchg.h instance appears to need
__always_inline as several other functions need __always_inline in
arch/arm64/include/*. It's worth making that change as you suggested in
your original patch.

The concern, however, is inlining all "inline" functions forcefully. The
only reason this is done for gcc is because of suboptimal inlining
decisions in gcc < 4.

So the question is whether this is a single instance that can be fixed
where clang un-inlining causes problems or whether that instance suggests
all possible inline usage for clang absolutely requires __always_inline
due to a suboptimal compiler implementation. I would suggest the former.
On 2017-06-19 14:42, David Rientjes wrote:
> On Mon, 19 Jun 2017, Sodagudi Prasad wrote:
>
>> > > Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
>> > > static inline functions") re-defining the 'inline' macro but
>> > > __attribute__((always_inline)) is missing. Some compilers may
>> > > not honor inline hint if always_iniline attribute not there.
>> > > So add always_inline attribute to inline as done by
>> > > compiler-gcc.h file.
>> > >
>> >
>> > IIUC, __attribute__((always_inline)) was only needed for gcc versions < 4
>> > and that the inlining decision making is improved in >= 4. To make a
>> > change like this, I would think that we would need to show that clang is
>> > making suboptimal decisions. I don't think there's a downside to making
>> > CONFIG_OPTIMIZE_INLINING specific only to gcc.
>> >
>> > If it is shown that __attribute__((always_inline)) is needed for clang as
>> > well, this should be done as part of compiler-gcc.h to avoid duplicated
>> > code.
>>
>> Hi David,
>>
>> Here is the discussion about this change -
>> https://lkml.org/lkml/2017/6/15/396
>> Please check mark and will's comments.
>>
>
> Yes, the arch/arm64/include/asm/cmpxchg.h instance appears to need
> __always_inline as several other functions need __always_inline in
> arch/arm64/include/*. It's worth making that change as you suggested in
> your original patch.
>
> The concern, however, is inlining all "inline" functions forcefully. The
> only reason this is done for gcc is because of suboptimal inlining
> decisions in gcc < 4.
>
> So the question is whether this is a single instance that can be fixed
> where clang un-inlining causes problems or whether that instance suggests
> all possible inline usage for clang absolutely requires __always_inline
> due to a suboptimal compiler implementation. I would suggest the former.

Hi David,

I am not 100% sure about the best approach for this problem. We may have
to replace inline with always_inline for all inline functions where
BUILD_BUG() is used.

So far, inline has meant always_inline on ARM64. If we do not keep the
same setting, won't there be performance differences?

Hi Will and Mark,

Please suggest the best solution to this problem. Currently only
__xchg_mb has an issue, depending on the compiler's -inline-threshold
configuration. But there are many other instances in arch/arm64/* where
BUILD_BUG() is used in inline functions and which may fail later.

-Thanks, Prasad
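For readers following the thread, the kind of helper Prasad is worried about can be sketched roughly as below. This is a simplified userspace analogue, not the arm64 code: demo_xchg_sized, demo_xchg and demo_always_inline are hypothetical names, the exchange is deliberately non-atomic, and abort() stands in for the kernel's BUILD_BUG(). In the kernel, the default branch would only compile away if the helper is inlined into a caller that passes a constant size, which is why a plain inline hint may not be enough.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's __always_inline. */
#define demo_always_inline inline __attribute__((always_inline))

/*
 * Size-dispatching helper (illustrative only, NOT atomic).
 * The kernel would place BUILD_BUG() in the default branch and rely on
 * inlining plus a constant 'size' to discard it at compile time; abort()
 * is used here so the sketch links at any optimization level.
 */
static demo_always_inline uint64_t demo_xchg_sized(volatile void *ptr,
                                                   uint64_t val, int size)
{
    uint64_t old = 0;

    switch (size) {
    case 1:
        old = *(volatile uint8_t *)ptr;
        *(volatile uint8_t *)ptr = (uint8_t)val;
        break;
    case 2:
        old = *(volatile uint16_t *)ptr;
        *(volatile uint16_t *)ptr = (uint16_t)val;
        break;
    case 4:
        old = *(volatile uint32_t *)ptr;
        *(volatile uint32_t *)ptr = (uint32_t)val;
        break;
    case 8:
        old = *(volatile uint64_t *)ptr;
        *(volatile uint64_t *)ptr = val;
        break;
    default:
        abort(); /* unsupported size; BUILD_BUG() in the kernel pattern */
    }
    return old;
}

/* The wrapper passes sizeof() so every call site sees a constant size. */
#define demo_xchg(ptr, val) \
    ((__typeof__(*(ptr)))demo_xchg_sized((ptr), (uint64_t)(val), \
                                         (int)sizeof(*(ptr))))
```

A call such as demo_xchg(&v, 9) on a uint32_t stores 9 and returns the previous value. If the helper were not inlined, the compiler would have to keep every branch, including the default one, which is the situation that turns a BUILD_BUG() into a build failure.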
On Mon, Jun 19, 2017 at 02:42:23PM -0700, David Rientjes wrote:
> On Mon, 19 Jun 2017, Sodagudi Prasad wrote:
>
> > > > Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
> > > > static inline functions") re-defining the 'inline' macro but
> > > > __attribute__((always_inline)) is missing. Some compilers may
> > > > not honor inline hint if always_iniline attribute not there.
> > > > So add always_inline attribute to inline as done by
> > > > compiler-gcc.h file.
> > > >
> > >
> > > IIUC, __attribute__((always_inline)) was only needed for gcc versions < 4
> > > and that the inlining decision making is improved in >= 4. To make a
> > > change like this, I would think that we would need to show that clang is
> > > making suboptimal decisions. I don't think there's a downside to making
> > > CONFIG_OPTIMIZE_INLINING specific only to gcc.
> > >
> > > If it is shown that __attribute__((always_inline)) is needed for clang as
> > > well, this should be done as part of compiler-gcc.h to avoid duplicated
> > > code.
> >
> > Hi David,
> >
> > Here is the discussion about this change -
> > https://lkml.org/lkml/2017/6/15/396
> > Please check mark and will's comments.
> >
>
> Yes, the arch/arm64/include/asm/cmpxchg.h instance appears to need
> __always_inline as several other functions need __always_inline in
> arch/arm64/include/*. It's worth making that change as you suggested in
> your original patch.
>
> The concern, however, is inlining all "inline" functions forcefully. The
> only reason this is done for gcc is because of suboptimal inlining
> decisions in gcc < 4.
>
> So the question is whether this is a single instance that can be fixed
> where clang un-inlining causes problems or whether that instance suggests
> all possible inline usage for clang absolutely requires __always_inline
> due to a suboptimal compiler implementation. I would suggest the former.

My concern here is that code has been written with the implicit
assumption that inline means __always_inline, since that's been the case
for years with GCC, when !ARCH_SUPPORTS_OPTIMIZED_INLINING ||
!CONFIG_OPTIMIZE_INLINING, (i.e. for every !x86 arch).

While this is the only breakage seen so far, it seems likely that similar
breakage may exist elsewhere, and such breakage may easily be introduced
by those only using GCC.

I'd prefer to use the same guards for clang here, since that ensures that
such code works by default across both compilers. That gives us the
chance to test and fixup code without a violent flag day.

Once we've fixed up the core arm64 code, we can select
ARCH_SUPPORTS_OPTIMIZED_INLINING, and allow users to optionally select
CONFIG_OPTIMIZE_INLINING (with either compiler).

Once that's seen some testing, and if there's a benefit, then we can try
to align with x86 and default to selecting CONFIG_OPTIMIZE_INLINING,
and/or drop the config options entirely and only check the GCC version.

Thanks,
Mark.
On Mon, Jun 19, 2017 at 03:19:27PM -0700, Sodagudi Prasad wrote:
> On 2017-06-19 14:42, David Rientjes wrote:
> >Yes, the arch/arm64/include/asm/cmpxchg.h instance appears to need
> >__always_inline as several other functions need __always_inline in
> >arch/arm64/include/*. It's worth making that change as you
> >suggested in your original patch.
> >
> >The concern, however, is inlining all "inline" functions
> >forcefully. The only reason this is done for gcc is because of
> >suboptimal inlining decisions in gcc < 4.
> >
> >So the question is whether this is a single instance that can be fixed
> >where clang un-inlining causes problems or whether that instance
> >suggests all possible inline usage for clang absolutely requires
> >__always_inline due to a suboptimal compiler implementation. I would
> >suggest the former.
>
> Hi David,
>
> I am not 100% sure about the best approach for this problem. We may
> have to replace inline with always_inline for all inline functions
> where BUILD_BUG() used.
>
> So far inline as always_inline for ARM64, if we do not continue same
> settings, will there not be any performance differences?
>
> Hi Will and Mark,
>
> Please suggest the best solution to this problem. Currently
> __xchg_mb is only having issue based on compiler -inline-threshold
> configuration. But there are many other instances in arch/arm64/*
> where BUILD_BUG() used for inline functions and which may fail later.

As with my reply to David, my preference would be that we:

1) Align compiler-clang.h with the compiler-gcc.h inlining behaviour, so
   that things work by default.

2) Fix up the arm64 core code (and drivers for architected / common
   peripherals) to use __always_inline where we always require inlining.

3) Have arm64 select CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING, and have
   people test-build configurations with CONFIG_OPTIMIZE_INLINING, with
   both GCC and clang.

4) Fix up drivers, etc, as appropriate.

5) Once that's largely stable, and if there's a benefit, have arm64
   select CONFIG_OPTIMIZE_INLINING by default.

That should avoid undue breakage, while enabling this ASAP.

Thanks,
Mark.
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index d614c5e..deb65b3 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -22,4 +22,9 @@
  * directives. Suppress the warning in clang as well.
  */
 #undef inline
+#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
+    !defined(CONFIG_OPTIMIZE_INLINING)
+#define inline inline __attribute__((always_inline)) __attribute__((unused)) notrace
+#else
 #define inline inline __attribute__((unused)) notrace
+#endif
Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
static inline functions") redefines the 'inline' macro, but
__attribute__((always_inline)) is missing. Some compilers may
not honor the inline hint if the always_inline attribute is not there.
So add the always_inline attribute to inline, as done in the
compiler-gcc.h file.

Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
---
 include/linux/compiler-clang.h | 5 +++++
 1 file changed, 5 insertions(+)
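
The guard the patch adds can be tried outside the kernel with a small sketch like the one below. It is only an approximation under stated assumptions: DEMO_ARCH_SUPPORTS_OPTIMIZED_INLINING and DEMO_OPTIMIZE_INLINING are hypothetical stand-ins for the CONFIG_* symbols, demo_inline stands in for the redefined 'inline' keyword (the keyword itself is left alone here), and demo_notrace is assumed to expand to the no_instrument_function attribute.

```c
#include <assert.h>

/* Assumed expansion of the kernel's notrace; hypothetical name. */
#define demo_notrace __attribute__((no_instrument_function))

/*
 * Same shape as the guard in the patch: unless BOTH symbols are defined,
 * every demo_inline function is force-inlined.
 */
#if !defined(DEMO_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
    !defined(DEMO_OPTIMIZE_INLINING)
#define demo_inline inline __attribute__((always_inline)) \
        __attribute__((unused)) demo_notrace
#define DEMO_FORCED_INLINE 1
#else
/* Both symbols defined: leave the inlining decision to the compiler. */
#define demo_inline inline __attribute__((unused)) demo_notrace
#define DEMO_FORCED_INLINE 0
#endif

/* An ordinary helper written with the macro, as kernel code would be. */
static demo_inline int demo_add_one(int x)
{
    return x + 1;
}

/* Out-of-line caller, so there is a call site for the helper. */
static int demo_call_add_one(int x)
{
    return demo_add_one(x);
}
```

Building with both -DDEMO_ARCH_SUPPORTS_OPTIMIZED_INLINING and -DDEMO_OPTIMIZE_INLINING selects the second definition. The result of demo_add_one() is the same either way; what changes is whether the compiler is allowed to emit it out of line.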