ARM: xor-neon: Replace __GNUC__ checks with CONFIG_CC_IS_GCC

Message ID 20190528235742.105510-1-natechancellor@gmail.com (mailing list archive)
State New, archived
Series ARM: xor-neon: Replace __GNUC__ checks with CONFIG_CC_IS_GCC

Commit Message

Nathan Chancellor May 28, 2019, 11:57 p.m. UTC
Currently, when compiling this code with clang, the following warning is
emitted:

    CC      arch/arm/lib/xor-neon.o
  arch/arm/lib/xor-neon.c:33:2: warning: This code requires at least
  version 4.6 of GCC [-W#warnings]

This is because clang poses as GCC 4.2.1 with its __GNUC__ conditionals
for glibc compatibility[1]:

$ echo | clang -dM -E -x c /dev/null | grep GNUC | awk '{print $2" "$3}'
__GNUC_MINOR__ 2
__GNUC_PATCHLEVEL__ 1
__GNUC_STDC_INLINE__ 1
__GNUC__ 4
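
To make the effect of those values concrete, here is a minimal sketch (not kernel code) that evaluates the condition being removed for an arbitrary major/minor pair. The helper name gcc_version_check_passes is hypothetical and exists only for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the kernel: mirrors the preprocessor
 * test "__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)" as a
 * run-time function so that different version pairs can be tried.
 */
static bool gcc_version_check_passes(int major, int minor)
{
	return major > 4 || (major == 4 && minor >= 6);
}
```

With the 4.2 pair that clang reports above, the check fails, which is why the #warning branch is taken; any pair from 4.6 onward passes.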

As pointed out by Ard Biesheuvel and Arnd Bergmann in an earlier
thread[2], the oldest version of GCC that is currently supported is gcc
4.6 after commit cafa0010cd51 ("Raise the minimum required gcc version
to 4.6") so we do not need to check for anything older anymore.

However, just removing the version check is not enough to silence clang
because it does not recognize '#pragma GCC optimize':

  arch/arm/lib/xor-neon.c:25:13: warning: unknown pragma ignored
  [-Wunknown-pragmas]
  #pragma GCC optimize "tree-vectorize"

Looking into it further, -ftree-vectorize (which '#pragma GCC optimize
"tree-vectorize"' enables) is an alias in clang for -fvectorize[3],
which according to the documentation is on by default[4] (at least at
-O2 or -Os).

Just add the pragma when compiling with GCC so that clang does not
unnecessarily warn.
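
For readers outside the kernel tree, the guard can be sketched roughly as follows. CONFIG_CC_IS_GCC comes from Kconfig and is not available in userspace, so this sketch assumes a stand-in test on __GNUC__ and __clang__; the function xor_words is a hypothetical example loop, not the code in xor-neon.c.

```c
#include <stddef.h>

/*
 * Assumed userspace stand-in for CONFIG_CC_IS_GCC: clang defines
 * __GNUC__ as well, so it has to be excluded explicitly.
 */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC optimize "tree-vectorize"
#endif

/* Hypothetical example loop: XOR n words of src into dst. */
static void xor_words(unsigned long *dst, const unsigned long *src, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] ^= src[i];
}
```

Built with GCC, the pragma applies to the functions that follow it; built with clang, the pragma is never seen, so no -Wunknown-pragmas warning is emitted.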

[1]: https://reviews.llvm.org/D51011#1206981
[2]: https://lore.kernel.org/lkml/CAK8P3a3NjTCgFd2dQ9KbHP8DpXf6s-ULfeU6acAYC4SDi+2qvw@mail.gmail.com/
[3]: https://github.com/llvm/llvm-project/blob/eafe8ef6f2b44ba/clang/include/clang/Driver/Options.td#L1729
[4]: https://llvm.org/docs/Vectorizers.html#usage

Link: https://github.com/ClangBuiltLinux/linux/issues/496
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
---
 arch/arm/lib/xor-neon.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

Comments

Nick Desaulniers May 30, 2019, 11 p.m. UTC | #1
On Tue, May 28, 2019 at 4:57 PM Nathan Chancellor
<natechancellor@gmail.com> wrote:
>
> Currently, when compiling this code with clang, the following warning is
> emitted:
>
>     CC      arch/arm/lib/xor-neon.o
>   arch/arm/lib/xor-neon.c:33:2: warning: This code requires at least
>   version 4.6 of GCC [-W#warnings]
>
> This is because clang poses as GCC 4.2.1 with its __GNUC__ conditionals
> for glibc compatibility[1]:
>
> $ echo | clang -dM -E -x c /dev/null | grep GNUC | awk '{print $2" "$3}'
> __GNUC_MINOR__ 2
> __GNUC_PATCHLEVEL__ 1
> __GNUC_STDC_INLINE__ 1
> __GNUC__ 4
>
> As pointed out by Ard Biesheuvel and Arnd Bergmann in an earlier
> thread[2], the oldest version of GCC that is currently supported is gcc
> 4.6 after commit cafa0010cd51 ("Raise the minimum required gcc version
> to 4.6") so we do not need to check for anything older anymore.
>
> However, just removing the version check is not enough to silence clang
> because it does not recognize '#pragma GCC optimize':
>
>   arch/arm/lib/xor-neon.c:25:13: warning: unknown pragma ignored
>   [-Wunknown-pragmas]
>   #pragma GCC optimize "tree-vectorize"
>
> Looking into it further, -ftree-vectorize (which '#pragma GCC optimize
> "tree-vectorize"' enables) is an alias in clang for -fvectorize[3],
> which according to the documentation is on by default[4] (at least at
> -O2 or -Os).
>
> Just add the pragma when compiling with GCC so that clang does not
> unnecessarily warn.
>
> [1]: https://reviews.llvm.org/D51011#1206981
> [2]: https://lore.kernel.org/lkml/CAK8P3a3NjTCgFd2dQ9KbHP8DpXf6s-ULfeU6acAYC4SDi+2qvw@mail.gmail.com/
> [3]: https://github.com/llvm/llvm-project/blob/eafe8ef6f2b44ba/clang/include/clang/Driver/Options.td#L1729
> [4]: https://llvm.org/docs/Vectorizers.html#usage
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/496
> Reported-by: Nick Desaulniers <ndesaulniers@google.com>
> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

LGTM, thanks Nathan.
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

> ---
>  arch/arm/lib/xor-neon.c | 9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index c691b901092f..d532bc072ee4 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -22,15 +22,8 @@ MODULE_LICENSE("GPL");
>   * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
>   * NEON instructions.
>   */
> -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
> +#ifdef CONFIG_CC_IS_GCC
>  #pragma GCC optimize "tree-vectorize"
> -#else
> -/*
> - * While older versions of GCC do not generate incorrect code, they fail to
> - * recognize the parallel nature of these functions, and emit plain ARM code,
> - * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
> - */
> -#warning This code requires at least version 4.6 of GCC
>  #endif
>
>  #pragma GCC diagnostic ignored "-Wunused-variable"
> --
> 2.22.0.rc1
>

Arnd Bergmann May 31, 2019, 8:05 a.m. UTC | #2
On Wed, May 29, 2019 at 1:57 AM Nathan Chancellor
<natechancellor@gmail.com> wrote:
>
> Currently, when compiling this code with clang, the following warning is
> emitted:
>
>     CC      arch/arm/lib/xor-neon.o
>   arch/arm/lib/xor-neon.c:33:2: warning: This code requires at least
>   version 4.6 of GCC [-W#warnings]
>
> This is because clang poses as GCC 4.2.1 with its __GNUC__ conditionals
> for glibc compatibility[1]:
>
> $ echo | clang -dM -E -x c /dev/null | grep GNUC | awk '{print $2" "$3}'
> __GNUC_MINOR__ 2
> __GNUC_PATCHLEVEL__ 1
> __GNUC_STDC_INLINE__ 1
> __GNUC__ 4
>
> As pointed out by Ard Biesheuvel and Arnd Bergmann in an earlier
> thread[2], the oldest version of GCC that is currently supported is gcc
> 4.6 after commit cafa0010cd51 ("Raise the minimum required gcc version
> to 4.6") so we do not need to check for anything older anymore.
>
> However, just removing the version check is not enough to silence clang
> because it does not recognize '#pragma GCC optimize':
>
>   arch/arm/lib/xor-neon.c:25:13: warning: unknown pragma ignored
>   [-Wunknown-pragmas]
>   #pragma GCC optimize "tree-vectorize"
>
> Looking into it further, -ftree-vectorize (which '#pragma GCC optimize
> "tree-vectorize"' enables) is an alias in clang for -fvectorize[3],
> which according to the documentation is on by default[4] (at least at
> -O2 or -Os).
>
> Just add the pragma when compiling with GCC so that clang does not
> unnecessarily warn.

If I remember correctly, we also had the same issue with older versions
of clang, possibly even newer ones. Shouldn't we check for a minimum
compiler version when building with clang to ensure that the code is
really vectorized?

       Arnd

Nathan Chancellor May 31, 2019, 6:32 p.m. UTC | #3
On Fri, May 31, 2019 at 10:05:22AM +0200, Arnd Bergmann wrote:
> If I remember correctly, we also had the same issue with older versions
> of clang, possibly even newer ones. Shouldn't we check for a minimum
> compiler version when building with clang to ensure that the code is
> really vectorized?
> 
>        Arnd

Even on tip of tree, it doesn't look like vectorization happens
properly. With -S -Rpass-missed='.*' added to the xor-neon.c command:

/home/nathan/cbl/linux-next/include/asm-generic/xor.h:15:2: remark: the cost-model indicates that interleaving is not beneficial [-Rpass-missed=loop-vectorize]
/home/nathan/cbl/linux-next/include/asm-generic/xor.h:11:1: remark: List vectorization was possible but not beneficial with cost 0 >= 0 [-Rpass-missed=slp-vectorizer]
xor_8regs_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
^
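
For anyone who wants to look at remarks like these outside the kernel build, a standalone loop of roughly the same shape may be enough. The sketch below is only loosely modeled on the function named in the remark and is not a copy of include/asm-generic/xor.h; the name xor_8words_2 is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the kind of loop the remarks refer to:
 * XOR p2 into p1, eight words per iteration. Unlike a do/while form,
 * this version does nothing when bytes is smaller than eight words.
 */
static void xor_8words_2(unsigned long bytes, unsigned long *p1,
			 const unsigned long *p2)
{
	long lines = (long)(bytes / sizeof(unsigned long) / 8);

	while (lines-- > 0) {
		int i;

		for (i = 0; i < 8; i++)
			p1[i] ^= p2[i];
		p1 += 8;
		p2 += 8;
	}
}
```

Compiling a file containing this with the same -Rpass-missed='.*' option should show whether the vectorizers consider the loop profitable for a given target, though the result may differ from what the kernel build reports.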

So right now, it doesn't look like there is a minimum version for clang
and I don't think adding a warning for clang is productive (what is a
user supposed to do?)

Cheers,
Nathan

Arnd Bergmann May 31, 2019, 7:21 p.m. UTC | #4
On Fri, May 31, 2019 at 8:32 PM Nathan Chancellor
<natechancellor@gmail.com> wrote:
>
> On Fri, May 31, 2019 at 10:05:22AM +0200, Arnd Bergmann wrote:
> > If I remember correctly, we also had the same issue with older versions
> > of clang, possibly even newer ones. Shouldn't we check for a minimum
> > compiler version when building with clang to ensure that the code is
> > really vectorized?
> >
> >        Arnd
>
> Even on tip of tree, it doesn't look like vectorization happens
> properly. With -S -Rpass-missed='.*' added to the xor-neon.c command:
>
> /home/nathan/cbl/linux-next/include/asm-generic/xor.h:15:2: remark: the cost-model indicates that interleaving is not beneficial [-Rpass-missed=loop-vectorize]
> /home/nathan/cbl/linux-next/include/asm-generic/xor.h:11:1: remark: List vectorization was possible but not beneficial with cost 0 >= 0 [-Rpass-missed=slp-vectorizer]
> xor_8regs_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
> ^
>
> So right now, it doesn't look like there is a minimum version for clang
> and I don't think adding a warning for clang is productive (what is a
> user supposed to do?)

I see. If we don't have a vectorized version of the xor module with
clang, I would suggest dropping your patch then, and instead adding
a Kconfig dependency on CC_IS_GCC, until clang is able to do it right.

       Arnd

Nick Desaulniers May 31, 2019, 8:06 p.m. UTC | #5
On Fri, May 31, 2019 at 12:21 PM Arnd Bergmann <arnd@arndb.de> wrote:
> clang, I would suggest dropping your patch then, and instead adding

I disagree.  The minimum version of gcc required to build the kernel
is 4.6, so the comment about older versions of gcc is irrelevant and
should be removed.

Nathan's -Rpass warnings are warning that vectorization was not
calculated to be profitable **for 1 of the 4 functions** by LLVM.
Surely we wouldn't disable NEON opts for XOR because 1 of 4 was not
vectorized?

Nathan Chancellor May 31, 2019, 8:26 p.m. UTC | #6
On Fri, May 31, 2019 at 01:06:13PM -0700, Nick Desaulniers wrote:
> On Fri, May 31, 2019 at 12:21 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > clang, I would suggest dropping your patch then, and instead adding
> 
> I disagree.  The minimum version of gcc required to build the kernel
> is 4.6, so the comment about older versions of gcc is irrelevant and
> should be removed.
> 
> Nathan's -Rpass warnings are warning that vectorization was not
> calculated to be profitable **for 1 of the 4 functions** by LLVM.
> Surely we wouldn't disable NEON opts for XOR because 1 of 4 was not
> vectorized?

Well I kept it short but clang warns that all of the loops are not
profitable.

However, the config option for xor-neon.c is CONFIG_XOR_BLOCKS, which
also controls the arm64 implementation. We wouldn't want to disable it
for clang altogether if it works on arm64 fine.

If it turns out to be broken for both, I suppose I would be okay with
disabling CONFIG_XOR_BLOCKS for clang but it should be done in a
separate patch as this one should be applied regardless of clang working
or not (because this warning will appear again when clang is fixed).

Cheers,
Nathan

Arnd Bergmann May 31, 2019, 9:03 p.m. UTC | #7
On Fri, May 31, 2019 at 10:06 PM 'Nick Desaulniers' via Clang Built
Linux <clang-built-linux@googlegroups.com> wrote:
>
> On Fri, May 31, 2019 at 12:21 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > clang, I would suggest dropping your patch then, and instead adding
>
> I disagree.  The minimum version of gcc required to build the kernel
> is 4.6, so the comment about older versions of gcc is irrelevant and
> should be removed.

Sure, that's ok. It just feels wrong to remove a warning that points
to a real problem that still exists and can be detected at the moment.

If we think that clang-9 is going to be fixed before its release,
the warning could be changed to test for that version as a minimum,
and point to the bugzilla entry for more details.

      Arnd

Nathan Chancellor June 1, 2019, 12:28 a.m. UTC | #8
On Fri, May 31, 2019 at 11:03:19PM +0200, Arnd Bergmann wrote:
> On Fri, May 31, 2019 at 10:06 PM 'Nick Desaulniers' via Clang Built
> Linux <clang-built-linux@googlegroups.com> wrote:
> >
> > On Fri, May 31, 2019 at 12:21 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > clang, I would suggest dropping your patch then, and instead adding
> >
> > I disagree.  The minimum version of gcc required to build the kernel
> > is 4.6, so the comment about older versions of gcc is irrelevant and
> > should be removed.
> 
> Sure, that's ok. It just feels wrong to remove a warning that points
> to a real problem that still exists and can be detected at the moment.
> 
> If we think that clang-9 is going to be fixed before its release,
> the warning could be changed to test for that version as a minimum,
> and point to the bugzilla entry for more details.
> 
>       Arnd

I just tested the arm64 implementation and it shows the same warnings
about cost as arm.

However, I see a warning as something that can be resolved by the user.
The GCC warning's solution is to just use a newer version of GCC
(something fairly easily attainable). This new warning currently has no
solution other than don't use clang.

It is up to you and Nick, but I would say that unless we are going to
prioritize fixing this, we shouldn't add a warning for it. I'd say it is
more appropriate to fix it and then add a warning telling users to upgrade
to the fixed version, like the GCC one (though I don't necessarily hate
adding the warning now, assuming that clang 9 will have it fixed).

Cheers,
Nathan

Patch

diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index c691b901092f..d532bc072ee4 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -22,15 +22,8 @@  MODULE_LICENSE("GPL");
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
  */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
 #endif
 
 #pragma GCC diagnostic ignored "-Wunused-variable"