compiler, clang: Add always_inline attribute to inline

Message ID 1497887596-8731-1-git-send-email-psodagud@codeaurora.org (mailing list archive)
State New, archived

Commit Message

Prasad Sodagudi June 19, 2017, 3:53 p.m. UTC
Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
static inline functions") redefines the 'inline' macro, but
__attribute__((always_inline)) is missing. Some compilers may
not honor the inline hint if the always_inline attribute is absent.
So add the always_inline attribute to inline, as is done in the
compiler-gcc.h file.

Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
---
 include/linux/compiler-clang.h | 5 +++++
 1 file changed, 5 insertions(+)

Comments

David Rientjes June 19, 2017, 8:25 p.m. UTC | #1
On Mon, 19 Jun 2017, Prasad Sodagudi wrote:

> Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
> static inline functions") redefines the 'inline' macro, but
> __attribute__((always_inline)) is missing. Some compilers may
> not honor the inline hint if the always_inline attribute is absent.
> So add the always_inline attribute to inline, as is done in the
> compiler-gcc.h file.
> 

IIUC, __attribute__((always_inline)) was only needed for gcc versions < 4 
and that the inlining decision making is improved in >= 4.  To make a 
change like this, I would think that we would need to show that clang is 
making suboptimal decisions.  I don't think there's a downside to making 
CONFIG_OPTIMIZE_INLINING specific only to gcc.

If it is shown that __attribute__((always_inline)) is needed for clang as 
well, this should be done as part of compiler-gcc.h to avoid duplicated 
code.

> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
> ---
>  include/linux/compiler-clang.h | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> index d614c5e..deb65b3 100644
> --- a/include/linux/compiler-clang.h
> +++ b/include/linux/compiler-clang.h
> @@ -22,4 +22,9 @@
>   * directives.  Suppress the warning in clang as well.
>   */
>  #undef inline
> +#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) ||		\
> +    !defined(CONFIG_OPTIMIZE_INLINING)
> +#define inline inline __attribute__((always_inline)) __attribute__((unused)) notrace
> +#else
>  #define inline inline __attribute__((unused)) notrace
> +#endif
Prasad Sodagudi June 19, 2017, 9:14 p.m. UTC | #2
On 2017-06-19 13:25, David Rientjes wrote:
> On Mon, 19 Jun 2017, Prasad Sodagudi wrote:
> 
>> Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
>> static inline functions") redefines the 'inline' macro, but
>> __attribute__((always_inline)) is missing. Some compilers may
>> not honor the inline hint if the always_inline attribute is absent.
>> So add the always_inline attribute to inline, as is done in the
>> compiler-gcc.h file.
>> 
> 
> IIUC, __attribute__((always_inline)) was only needed for gcc versions < 4
> and that the inlining decision making is improved in >= 4.  To make a
> change like this, I would think that we would need to show that clang is
> making suboptimal decisions.  I don't think there's a downside to making
> CONFIG_OPTIMIZE_INLINING specific only to gcc.
> 
> If it is shown that __attribute__((always_inline)) is needed for clang as
> well, this should be done as part of compiler-gcc.h to avoid duplicated
> code.

Hi David,

Here is the discussion about this change -  
https://lkml.org/lkml/2017/6/15/396
Please check Mark's and Will's comments.

-Thanks, Prasad

> 
>> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
>> ---
>>  include/linux/compiler-clang.h | 5 +++++
>>  1 file changed, 5 insertions(+)
>> 
>> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
>> index d614c5e..deb65b3 100644
>> --- a/include/linux/compiler-clang.h
>> +++ b/include/linux/compiler-clang.h
>> @@ -22,4 +22,9 @@
>>   * directives.  Suppress the warning in clang as well.
>>   */
>>  #undef inline
>> +#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) ||		\
>> +    !defined(CONFIG_OPTIMIZE_INLINING)
>> +#define inline inline __attribute__((always_inline)) __attribute__((unused)) notrace
>> +#else
>>  #define inline inline __attribute__((unused)) notrace
>> +#endif
David Rientjes June 19, 2017, 9:42 p.m. UTC | #3
On Mon, 19 Jun 2017, Sodagudi Prasad wrote:

> > > Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
> > > static inline functions") redefines the 'inline' macro, but
> > > __attribute__((always_inline)) is missing. Some compilers may
> > > not honor the inline hint if the always_inline attribute is absent.
> > > So add the always_inline attribute to inline, as is done in the
> > > compiler-gcc.h file.
> > > 
> > 
> > IIUC, __attribute__((always_inline)) was only needed for gcc versions < 4
> > and that the inlining decision making is improved in >= 4.  To make a
> > change like this, I would think that we would need to show that clang is
> > making suboptimal decisions.  I don't think there's a downside to making
> > CONFIG_OPTIMIZE_INLINING specific only to gcc.
> > 
> > If it is shown that __attribute__((always_inline)) is needed for clang as
> > well, this should be done as part of compiler-gcc.h to avoid duplicated
> > code.
> 
> Hi David,
> 
> Here is the discussion about this change -
> https://lkml.org/lkml/2017/6/15/396
> Please check Mark's and Will's comments.
> 

Yes, the arch/arm64/include/asm/cmpxchg.h instance appears to need 
__always_inline as several other functions need __always_inline in 
arch/arm64/include/*.  It's worth making that change as you suggested in 
your original patch.

The concern, however, is inlining all "inline" functions forcefully.  The 
only reason this is done for gcc is because of suboptimal inlining 
decisions in gcc < 4.

So the question is whether this is a single instance that can be fixed 
where clang un-inlining causes problems or whether that instance suggests 
all possible inline usage for clang absolutely requires __always_inline 
due to a suboptimal compiler implementation.  I would suggest the former.
Prasad Sodagudi June 19, 2017, 10:19 p.m. UTC | #4
On 2017-06-19 14:42, David Rientjes wrote:
> On Mon, 19 Jun 2017, Sodagudi Prasad wrote:
> 
>> > > Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
>> > > static inline functions") redefines the 'inline' macro, but
>> > > __attribute__((always_inline)) is missing. Some compilers may
>> > > not honor the inline hint if the always_inline attribute is absent.
>> > > So add the always_inline attribute to inline, as is done in the
>> > > compiler-gcc.h file.
>> > >
>> >
>> > IIUC, __attribute__((always_inline)) was only needed for gcc versions < 4
>> > and that the inlining decision making is improved in >= 4.  To make a
>> > change like this, I would think that we would need to show that clang is
>> > making suboptimal decisions.  I don't think there's a downside to making
>> > CONFIG_OPTIMIZE_INLINING specific only to gcc.
>> >
>> > If it is shown that __attribute__((always_inline)) is needed for clang as
>> > well, this should be done as part of compiler-gcc.h to avoid duplicated
>> > code.
>> 
>> Hi David,
>> 
>> Here is the discussion about this change -
>> https://lkml.org/lkml/2017/6/15/396
>> Please check Mark's and Will's comments.
>> 
> 
> Yes, the arch/arm64/include/asm/cmpxchg.h instance appears to need
> __always_inline as several other functions need __always_inline in
> arch/arm64/include/*.  It's worth making that change as you suggested in
> your original patch.
> 
> The concern, however, is inlining all "inline" functions forcefully.  The
> only reason this is done for gcc is because of suboptimal inlining
> decisions in gcc < 4.
> 
> So the question is whether this is a single instance that can be fixed
> where clang un-inlining causes problems or whether that instance suggests
> all possible inline usage for clang absolutely requires __always_inline
> due to a suboptimal compiler implementation.  I would suggest the former.

Hi David,

  I am not 100% sure about the best approach for this problem. We may have
to replace inline with always_inline for all inline functions where
BUILD_BUG() is used.

So far, inline has meant always_inline for ARM64. If we do not keep the
same setting, won't there be performance differences?

Hi Will and Mark,

Please suggest the best solution to this problem. Currently only __xchg_mb
has an issue, depending on the compiler's -inline-threshold configuration.
But there are many other instances in arch/arm64/* where BUILD_BUG() is
used in inline functions, and which may fail later.

-Thanks, Prasad
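
For readers following the thread, here is a minimal userspace sketch of the failure mode described above: a size-dispatching helper whose invalid-size branch has to be eliminated at compile time. It rests on several assumptions. All demo_* names are hypothetical, the exchange is deliberately non-atomic and is not the arm64 __xchg_mb implementation, and demo_build_bug() aborts at run time, whereas the kernel's BUILD_BUG() is meant to break the build when a call to it survives optimisation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace spelling of the kernel's __always_inline. */
#define demo_always_inline inline __attribute__((always_inline))

/*
 * Stand-in for BUILD_BUG(). It aborts at run time so that this sketch
 * links at any optimisation level; the kernel macro is meant to break
 * the build instead when a call to it is not optimised away.
 */
static void demo_build_bug(void)
{
	abort();
}

/*
 * Non-atomic exchange that dispatches on the operand size. With forced
 * inlining, 'size' is a constant at every call site below, so an
 * optimising compiler can discard the default branch. If the function
 * were emitted out of line, 'size' would be a run-time value and the
 * call in the default branch would have to stay.
 */
static demo_always_inline uint64_t demo_xchg(uint64_t x, volatile void *ptr,
					     size_t size)
{
	uint64_t old;

	switch (size) {
	case 1:
		old = *(volatile uint8_t *)ptr;
		*(volatile uint8_t *)ptr = (uint8_t)x;
		return old;
	case 2:
		old = *(volatile uint16_t *)ptr;
		*(volatile uint16_t *)ptr = (uint16_t)x;
		return old;
	case 4:
		old = *(volatile uint32_t *)ptr;
		*(volatile uint32_t *)ptr = (uint32_t)x;
		return old;
	case 8:
		old = *(volatile uint64_t *)ptr;
		*(volatile uint64_t *)ptr = x;
		return old;
	default:
		demo_build_bug();
		return 0;
	}
}

/* Type-generic wrapper: the size comes from sizeof(), a constant. */
#define demo_xchg_val(ptr, val) \
	((__typeof__(*(ptr)))demo_xchg((uint64_t)(val), (ptr), sizeof(*(ptr))))
```

With `uint32_t w = 5;`, the expression `demo_xchg_val(&w, 9)` returns 5 and leaves 9 in `w`. The annotation sits on the one function that needs it, which is the per-function style of fix discussed in the thread, as opposed to forcing every `inline` function inline.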
Mark Rutland June 20, 2017, 10:52 a.m. UTC | #5
On Mon, Jun 19, 2017 at 02:42:23PM -0700, David Rientjes wrote:
> On Mon, 19 Jun 2017, Sodagudi Prasad wrote:
> 
> > > > Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
> > > > static inline functions") redefines the 'inline' macro, but
> > > > __attribute__((always_inline)) is missing. Some compilers may
> > > > not honor the inline hint if the always_inline attribute is absent.
> > > > So add the always_inline attribute to inline, as is done in the
> > > > compiler-gcc.h file.
> > > > 
> > > 
> > > IIUC, __attribute__((always_inline)) was only needed for gcc versions < 4
> > > and that the inlining decision making is improved in >= 4.  To make a
> > > change like this, I would think that we would need to show that clang is
> > > making suboptimal decisions.  I don't think there's a downside to making
> > > CONFIG_OPTIMIZE_INLINING specific only to gcc.
> > > 
> > > If it is shown that __attribute__((always_inline)) is needed for clang as
> > > well, this should be done as part of compiler-gcc.h to avoid duplicated
> > > code.
> > 
> > Hi David,
> > 
> > Here is the discussion about this change -
> > https://lkml.org/lkml/2017/6/15/396
> > Please check Mark's and Will's comments.
> > 
> 
> Yes, the arch/arm64/include/asm/cmpxchg.h instance appears to need 
> __always_inline as several other functions need __always_inline in 
> arch/arm64/include/*.  It's worth making that change as you suggested in 
> your original patch.
> 
> The concern, however, is inlining all "inline" functions forcefully.  The 
> only reason this is done for gcc is because of suboptimal inlining 
> decisions in gcc < 4.
>
> So the question is whether this is a single instance that can be fixed 
> where clang un-inlining causes problems or whether that instance suggests 
> all possible inline usage for clang absolutely requires __always_inline 
> due to a suboptimal compiler implementation.  I would suggest the former.

My concern here is that code has been written with the implicit
assumption that inline means __always_inline, since that's been the case
for years with GCC, when !ARCH_SUPPORTS_OPTIMIZED_INLINING ||
!CONFIG_OPTIMIZE_INLINING, (i.e. for every !x86 arch).

While this is the only breakage seen so far, it seems likely that
similar breakage may exist elsewhere, and such breakage may easily be
introduced by those only using GCC.

I'd prefer to use the same guards for clang here, since that ensures
that such code works by default across both compilers. That gives us the
chance to test and fixup code without a violent flag day.

Once we've fixed up the core arm64 code, we can select
ARCH_SUPPORTS_OPTIMIZED_INLINING, and allow users to optionally select
CONFIG_OPTIMIZE_INLINING (with either compiler).

Once that's seen some testing, and if there's a benefit, then we can try
to align with x86 and default to selecting CONFIG_OPTIMIZE_INLINING,
and/or drop the config options entirely and only check the GCC version.

Thanks,
Mark.
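
The guard discussed above can be tried outside the kernel tree. The sketch below mirrors the #if from the patch, under stated assumptions: demo_inline and DEMO_INLINE_IS_FORCED are hypothetical names used here instead of redefining the inline keyword, the CONFIG_* symbols would normally come from Kconfig and are left undefined, and the kernel's notrace annotation is omitted.

```c
/*
 * In a kernel build these symbols come from Kconfig. They are left
 * undefined here, which is the situation the thread describes for
 * architectures that do not select ARCH_SUPPORTS_OPTIMIZED_INLINING.
 */
/* #define CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING 1 */
/* #define CONFIG_OPTIMIZE_INLINING 1 */

#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
    !defined(CONFIG_OPTIMIZE_INLINING)
/* Either symbol missing: every demo_inline function is force-inlined. */
#define demo_inline inline __attribute__((always_inline)) __attribute__((unused))
#define DEMO_INLINE_IS_FORCED 1
#else
/* Both symbols set: inlining is left to the compiler's heuristics. */
#define demo_inline inline __attribute__((unused))
#define DEMO_INLINE_IS_FORCED 0
#endif

/* Example user of the macro; the body is arbitrary. */
static demo_inline int demo_add_one(int x)
{
	return x + 1;
}

/* Reports which branch of the guard was taken in this translation unit. */
static demo_inline int demo_inline_is_forced(void)
{
	return DEMO_INLINE_IS_FORCED;
}
```

Defining both CONFIG_* symbols ahead of the #if, for example with -D on the compiler command line, selects the second branch, so the same source can be built both ways while code is being fixed up.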
Mark Rutland June 20, 2017, 10:59 a.m. UTC | #6
On Mon, Jun 19, 2017 at 03:19:27PM -0700, Sodagudi Prasad wrote:
> On 2017-06-19 14:42, David Rientjes wrote:
> >Yes, the arch/arm64/include/asm/cmpxchg.h instance appears to need
> >__always_inline as several other functions need __always_inline in
> >arch/arm64/include/*.  It's worth making that change as you suggested in
> >your original patch.
> >
> >The concern, however, is inlining all "inline" functions forcefully.  The
> >only reason this is done for gcc is because of suboptimal inlining
> >decisions in gcc < 4.
> >
> >So the question is whether this is a single instance that can be fixed
> >where clang un-inlining causes problems or whether that instance suggests
> >all possible inline usage for clang absolutely requires __always_inline
> >due to a suboptimal compiler implementation.  I would suggest the former.
> 
> Hi David,
> 
>  I am not 100% sure about the best approach for this problem. We may have
> to replace inline with always_inline for all inline functions where
> BUILD_BUG() is used.
> 
> So far, inline has meant always_inline for ARM64. If we do not keep the
> same setting, won't there be performance differences?
> 
> Hi Will and Mark,
> 
> Please suggest the best solution to this problem. Currently only __xchg_mb
> has an issue, depending on the compiler's -inline-threshold configuration.
> But there are many other instances in arch/arm64/* where BUILD_BUG() is
> used in inline functions, and which may fail later.

As with my reply to David, my preference would be that we:

1) Align compiler-clang.h with the compiler-gcc.h inlining behaviour, so
   that things work by default.

2) Fix up the arm64 core code (and drivers for architected / common
   peripherals) to use __always_inline where we always require inlining.

3) Have arm64 select CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING, and have
   people test-build configurations with CONFIG_OPTIMIZE_INLINING, with
   both GCC and clang.

4) Fix up drivers, etc, as appropriate.

5) Once that's largely stable, and if there's a benefit, have arm64
   select CONFIG_OPTIMIZE_INLINING by default.

That should avoid undue breakage, while enabling this ASAP.

Thanks,
Mark.

Patch

diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index d614c5e..deb65b3 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -22,4 +22,9 @@ 
  * directives.  Suppress the warning in clang as well.
  */
 #undef inline
+#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) ||		\
+    !defined(CONFIG_OPTIMIZE_INLINING)
+#define inline inline __attribute__((always_inline)) __attribute__((unused)) notrace
+#else
 #define inline inline __attribute__((unused)) notrace
+#endif