diff mbox series

arm64: lse: fix LSE atomics with LLVM's integrated assembler

Message ID 20191007201452.208067-1-samitolvanen@google.com (mailing list archive)
State New, archived
Headers show
Series arm64: lse: fix LSE atomics with LLVM's integrated assembler | expand

Commit Message

Sami Tolvanen Oct. 7, 2019, 8:14 p.m. UTC
Unlike gcc, clang considers each inline assembly block to be independent
and therefore, when using the integrated assembler for inline assembly,
any preambles that enable features must be repeated in each block.

Instead of changing all inline assembly blocks that use LSE, this change
adds -march=armv8-a+lse to KBUILD_CFLAGS, which works with both clang
and gcc.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/arm64/Makefile          | 2 ++
 arch/arm64/include/asm/lse.h | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

Nick Desaulniers Oct. 7, 2019, 8:28 p.m. UTC | #1
On Mon, Oct 7, 2019 at 1:14 PM 'Sami Tolvanen' via Clang Built Linux
<clang-built-linux@googlegroups.com> wrote:
>
> Unlike gcc, clang considers each inline assembly block to be independent
> and therefore, when using the integrated assembler for inline assembly,
> any preambles that enable features must be repeated in each block.
>
> Instead of changing all inline assembly blocks that use LSE, this change
> adds -march=armv8-a+lse to KBUILD_CFLAGS, which works with both clang
> and gcc.
>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>

Thanks Sami, looks like this addresses:
Link: https://github.com/ClangBuiltLinux/linux/issues/671
I tried adding `.arch armv8-a+lse` directives to all of the inline asm:
https://github.com/ClangBuiltLinux/linux/issues/573#issuecomment-535098996
though that quickly ran aground in some failed assertion using the
alternatives system that I don't quite yet understand.

One thing to be careful about is that blankets the entire kernel in
`+lse`, allowing LSE atomics to be selected at any point.  The
assembler directive in the header (or per inline asm block) is more
fine grained.  I'm not sure that they would be guarded by the
alternative system.  Maybe that's not an issue, but it is the reason I
tried to localize the assembler directive first.

Note that Clang really wants the target arch specified by either:
1. command line argument (as in this patch)
2. per function function attribute
3. per asm statement assembler directive

1 is the smallest incision.

> ---
>  arch/arm64/Makefile          | 2 ++
>  arch/arm64/include/asm/lse.h | 2 --
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
> index 84a3d502c5a5..7a7c0cb8ed60 100644
> --- a/arch/arm64/Makefile
> +++ b/arch/arm64/Makefile
> @@ -36,6 +36,8 @@ lseinstr := $(call as-instr,.arch_extension lse,-DCONFIG_AS_LSE=1)
>  ifeq ($(CONFIG_ARM64_LSE_ATOMICS), y)
>    ifeq ($(lseinstr),)
>  $(warning LSE atomics not supported by binutils)
> +  else
> +KBUILD_CFLAGS  += -march=armv8-a+lse
>    endif
>  endif
>
> diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
> index 80b388278149..8603a9881529 100644
> --- a/arch/arm64/include/asm/lse.h
> +++ b/arch/arm64/include/asm/lse.h
> @@ -14,8 +14,6 @@
>  #include <asm/atomic_lse.h>
>  #include <asm/cpucaps.h>
>
> -__asm__(".arch_extension       lse");
> -
>  extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
>  extern struct static_key_false arm64_const_caps_ready;
>
> --
> 2.23.0.581.g78d2f28ef7-goog
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20191007201452.208067-1-samitolvanen%40google.com.
Sami Tolvanen Oct. 7, 2019, 9:24 p.m. UTC | #2
On Mon, Oct 7, 2019 at 1:28 PM 'Nick Desaulniers' via Clang Built
Linux <clang-built-linux@googlegroups.com> wrote:
> I tried adding `.arch armv8-a+lse` directives to all of the inline asm:
> https://github.com/ClangBuiltLinux/linux/issues/573#issuecomment-535098996

Yes, I had a similar patch earlier. I feel like using a command line
parameter here is cleaner, but I'm fine with either solution.

> One thing to be careful about is that blankets the entire kernel in
> `+lse`, allowing LSE atomics to be selected at any point.

Is that a problem? The current code allows LSE instructions with gcc
in any file that includes <asm/lse.h>, which turns out to be quite a
few places.

Sami
Nick Desaulniers Oct. 7, 2019, 9:46 p.m. UTC | #3
On Mon, Oct 7, 2019 at 2:24 PM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> On Mon, Oct 7, 2019 at 1:28 PM 'Nick Desaulniers' via Clang Built
> Linux <clang-built-linux@googlegroups.com> wrote:
> > I tried adding `.arch armv8-a+lse` directives to all of the inline asm:
> > https://github.com/ClangBuiltLinux/linux/issues/573#issuecomment-535098996
>
> Yes, I had a similar patch earlier. I feel like using a command line
> parameter here is cleaner, but I'm fine with either solution.
>
> > One thing to be careful about is that blankets the entire kernel in
> > `+lse`, allowing LSE atomics to be selected at any point.
>
> Is that a problem? The current code allows LSE instructions with gcc
> in any file that includes <asm/lse.h>, which turns out to be quite a
> few places.

I may be mistaken, but I don't think inline asm directives allow the C
compiler to change what instructions it selects for C code, but
command line arguments to the C compiler do.

Grepping the kernel for some of the functions and memory orderings
turns up a few hits:
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

I'm worried that one of these might lower to LSE atomics without
ALTERNATIVE guards by blanketing all C code with `-march=armv8-a+lse`.
But I did just boot test this patch but using GAS in QEMU (on a -cpu
cortex-a72 which I suspect should not have lse instructions by default
IIUC), FWIW.
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Maybe the maintainers have more thoughts?
Andrew Murray Oct. 8, 2019, 8:37 a.m. UTC | #4
On Mon, Oct 07, 2019 at 01:28:19PM -0700, Nick Desaulniers wrote:
> On Mon, Oct 7, 2019 at 1:14 PM 'Sami Tolvanen' via Clang Built Linux
> <clang-built-linux@googlegroups.com> wrote:
> >
> > Unlike gcc, clang considers each inline assembly block to be independent
> > and therefore, when using the integrated assembler for inline assembly,
> > any preambles that enable features must be repeated in each block.
> >
> > Instead of changing all inline assembly blocks that use LSE, this change
> > adds -march=armv8-a+lse to KBUILD_CFLAGS, which works with both clang
> > and gcc.
> >
> > Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> 
> Thanks Sami, looks like this addresses:
> Link: https://github.com/ClangBuiltLinux/linux/issues/671
> I tried adding `.arch armv8-a+lse` directives to all of the inline asm:
> https://github.com/ClangBuiltLinux/linux/issues/573#issuecomment-535098996
> though that quickly ran aground in some failed assertion using the
> alternatives system that I don't quite yet understand.

I think these issues somehow are related to the strange way we used to
jump to the out-of-line fallbacks. Since around addfc38672c7 ("arm64:
atomics: avoid out-of-line ll/sc atomics") this approach was removed.

I reproduced your patch on 5.4-rc2 and no longer get the clang build
errors. Therefore this approach is viable option.

> 
> One thing to be careful about is that blankets the entire kernel in
> `+lse`, allowing LSE atomics to be selected at any point.  The
> assembler directive in the header (or per inline asm block) is more
> fine grained.  I'm not sure that they would be guarded by the
> alternative system.  Maybe that's not an issue, but it is the reason I
> tried to localize the assembler directive first.
> 
> Note that Clang really wants the target arch specified by either:
> 1. command line argument (as in this patch)

This makes me nervous, as we're telling the compiler that the machine
we're building for has LSE - obviously it would be perfectly acceptable
for the compiler to then throw in some LSE instructions at some point.
Thus something may break further down the line.

> 2. per function function attribute
> 3. per asm statement assembler directive

Keen to hear Will's thoughts - but I'd suggest this is probably the
safest way forward.

Thanks,

Andrew Murray

> 
> 1 is the smallest incision.
> 
> > ---
> >  arch/arm64/Makefile          | 2 ++
> >  arch/arm64/include/asm/lse.h | 2 --
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
> > index 84a3d502c5a5..7a7c0cb8ed60 100644
> > --- a/arch/arm64/Makefile
> > +++ b/arch/arm64/Makefile
> > @@ -36,6 +36,8 @@ lseinstr := $(call as-instr,.arch_extension lse,-DCONFIG_AS_LSE=1)
> >  ifeq ($(CONFIG_ARM64_LSE_ATOMICS), y)
> >    ifeq ($(lseinstr),)
> >  $(warning LSE atomics not supported by binutils)
> > +  else
> > +KBUILD_CFLAGS  += -march=armv8-a+lse
> >    endif
> >  endif
> >
> > diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
> > index 80b388278149..8603a9881529 100644
> > --- a/arch/arm64/include/asm/lse.h
> > +++ b/arch/arm64/include/asm/lse.h
> > @@ -14,8 +14,6 @@
> >  #include <asm/atomic_lse.h>
> >  #include <asm/cpucaps.h>
> >
> > -__asm__(".arch_extension       lse");
> > -
> >  extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
> >  extern struct static_key_false arm64_const_caps_ready;
> >
> > --
> > 2.23.0.581.g78d2f28ef7-goog
> >
> > --
> > You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
> > To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20191007201452.208067-1-samitolvanen%40google.com.
> 
> 
> 
> -- 
> Thanks,
> ~Nick Desaulniers
Sami Tolvanen Oct. 8, 2019, 3:22 p.m. UTC | #5
On Mon, Oct 7, 2019 at 2:46 PM 'Nick Desaulniers' via Clang Built
Linux <clang-built-linux@googlegroups.com> wrote:
> I'm worried that one of these might lower to LSE atomics without
> ALTERNATIVE guards by blanketing all C code with `-march=armv8-a+lse`.

True, that's a valid concern. I think adding the directive to each
assembly block is the way forward then, assuming the maintainers are
fine with that.

Sami
Robin Murphy Oct. 8, 2019, 4:18 p.m. UTC | #6
On 08/10/2019 16:22, Sami Tolvanen wrote:
> On Mon, Oct 7, 2019 at 2:46 PM 'Nick Desaulniers' via Clang Built
> Linux <clang-built-linux@googlegroups.com> wrote:
>> I'm worried that one of these might lower to LSE atomics without
>> ALTERNATIVE guards by blanketing all C code with `-march=armv8-a+lse`.
> 
> True, that's a valid concern. I think adding the directive to each
> assembly block is the way forward then, assuming the maintainers are
> fine with that.

It's definitely a valid concern in principle, but in practice note that 
lse.h ends up included in ~99% of C files, so the extension is enabled 
more or less everywhere already.

(based on a quick hack involving '#pragma message' and grepping the 
build logs)

Robin.
Ard Biesheuvel Oct. 8, 2019, 9:03 p.m. UTC | #7
On Tue, 8 Oct 2019 at 18:19, Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 08/10/2019 16:22, Sami Tolvanen wrote:
> > On Mon, Oct 7, 2019 at 2:46 PM 'Nick Desaulniers' via Clang Built
> > Linux <clang-built-linux@googlegroups.com> wrote:
> >> I'm worried that one of these might lower to LSE atomics without
> >> ALTERNATIVE guards by blanketing all C code with `-march=armv8-a+lse`.
> >
> > True, that's a valid concern. I think adding the directive to each
> > assembly block is the way forward then, assuming the maintainers are
> > fine with that.
>
> It's definitely a valid concern in principle, but in practice note that
> lse.h ends up included in ~99% of C files, so the extension is enabled
> more or less everywhere already.
>

lse.h currently does

__asm__(".arch_extension        lse");

which instructs the assembler to permit the use of LSE opcodes, but it
does not instruct the compiler to emit them, so this is not quite the
same thing.
Robin Murphy Oct. 9, 2019, 10:03 a.m. UTC | #8
On 2019-10-08 10:03 pm, Ard Biesheuvel wrote:
> On Tue, 8 Oct 2019 at 18:19, Robin Murphy <robin.murphy@arm.com> wrote:
>>
>> On 08/10/2019 16:22, Sami Tolvanen wrote:
>>> On Mon, Oct 7, 2019 at 2:46 PM 'Nick Desaulniers' via Clang Built
>>> Linux <clang-built-linux@googlegroups.com> wrote:
>>>> I'm worried that one of these might lower to LSE atomics without
>>>> ALTERNATIVE guards by blanketing all C code with `-march=armv8-a+lse`.
>>>
>>> True, that's a valid concern. I think adding the directive to each
>>> assembly block is the way forward then, assuming the maintainers are
>>> fine with that.
>>
>> It's definitely a valid concern in principle, but in practice note that
>> lse.h ends up included in ~99% of C files, so the extension is enabled
>> more or less everywhere already.
>>
> 
> lse.h currently does
> 
> __asm__(".arch_extension        lse");
> 
> which instructs the assembler to permit the use of LSE opcodes, but it
> does not instruct the compiler to emit them, so this is not quite the
> same thing.

Derp, of course it isn't. And IIRC we can't just pass the option through 
with -Wa either because at least some versions of GCC emit an explicit 
.arch directive at the top of the output. Oh well; sorry for the 
distraction.

Robin.
diff mbox series

Patch

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 84a3d502c5a5..7a7c0cb8ed60 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -36,6 +36,8 @@  lseinstr := $(call as-instr,.arch_extension lse,-DCONFIG_AS_LSE=1)
 ifeq ($(CONFIG_ARM64_LSE_ATOMICS), y)
   ifeq ($(lseinstr),)
 $(warning LSE atomics not supported by binutils)
+  else
+KBUILD_CFLAGS	+= -march=armv8-a+lse
   endif
 endif
 
diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
index 80b388278149..8603a9881529 100644
--- a/arch/arm64/include/asm/lse.h
+++ b/arch/arm64/include/asm/lse.h
@@ -14,8 +14,6 @@ 
 #include <asm/atomic_lse.h>
 #include <asm/cpucaps.h>
 
-__asm__(".arch_extension	lse");
-
 extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
 extern struct static_key_false arm64_const_caps_ready;