diff mbox series

ARM: mtd-xip: work around clang/llvm bug

Message ID 20190708203049.3484750-1-arnd@arndb.de (mailing list archive)
State New, archived
Headers show
Series ARM: mtd-xip: work around clang/llvm bug | expand

Commit Message

Arnd Bergmann July 8, 2019, 8:30 p.m. UTC
llvm gets confused by inline asm with .rep directives, which
can lead to miscalculating the number of instructions inside it,
and in turn lead to an overflow for relative address calculation:

/tmp/cfi_cmdset_0002-539a47.s: Assembler messages:
/tmp/cfi_cmdset_0002-539a47.s:11288: Error: bad immediate value for offset (4100)
/tmp/cfi_cmdset_0002-539a47.s:11289: Error: bad immediate value for offset (4100)

This might be fixed in future clang versions, but is not hard
to work around by just replacing the .rep with a series of
eight unrolled nop instructions.

Link: https://bugs.llvm.org/show_bug.cgi?id=42539
https://godbolt.org/z/DSM2Jy
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/include/asm/mtd-xip.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Nick Desaulniers July 8, 2019, 8:54 p.m. UTC | #1
On Mon, Jul 8, 2019 at 1:31 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> llvm gets confused by inline asm with .rep directives, which
> can lead to miscalculating the number of instructions inside it,
> and in turn lead to an overflow for relative address calculation:
>
> /tmp/cfi_cmdset_0002-539a47.s: Assembler messages:
> /tmp/cfi_cmdset_0002-539a47.s:11288: Error: bad immediate value for offset (4100)
> /tmp/cfi_cmdset_0002-539a47.s:11289: Error: bad immediate value for offset (4100)
>
> This might be fixed in future clang versions, but is not hard
> to work around by just replacing the .rep with a series of
> eight unrolled nop instructions.

Seems like this is fixable on the Clang side as well. For now, thanks
for the patch.
Acked-by: Nick Desaulniers <ndesaulniers@google.com>

>
> Link: https://bugs.llvm.org/show_bug.cgi?id=42539
> https://godbolt.org/z/DSM2Jy

^ prefix with Link: on both lines?
also, linking to clang trunk will become stale once the issue is
fixed.  When possible, link to a release version of clang that's not
prone to change over time.

> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/arm/include/asm/mtd-xip.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/include/asm/mtd-xip.h b/arch/arm/include/asm/mtd-xip.h
> index dfcef0152e3d..5ad0325604e4 100644
> --- a/arch/arm/include/asm/mtd-xip.h
> +++ b/arch/arm/include/asm/mtd-xip.h
> @@ -15,6 +15,8 @@
>  #include <mach/mtd-xip.h>
>
>  /* fill instruction prefetch */
> -#define xip_iprefetch()        do { asm volatile (".rep 8; nop; .endr"); } while (0)
> +#define xip_iprefetch()        do {                                            \
> +        asm volatile ("nop; nop; nop; nop; nop; nop; nop; nop;");      \
> +} while (0)                                                            \
Linus Walleij July 9, 2019, 8:41 a.m. UTC | #2
On Mon, Jul 8, 2019 at 10:31 PM Arnd Bergmann <arnd@arndb.de> wrote:

> llvm gets confused by inline asm with .rep directives,

Are the LLVM developers aware of the bug?
It seems like something we can work around but should
eventually be fixed properly in LLVM, right?

> which
> can lead to miscalculating the number of instructions inside it,
> and in turn lead to an overflow for relative address calculation:
>
> /tmp/cfi_cmdset_0002-539a47.s: Assembler messages:
> /tmp/cfi_cmdset_0002-539a47.s:11288: Error: bad immediate value for offset (4100)
> /tmp/cfi_cmdset_0002-539a47.s:11289: Error: bad immediate value for offset (4100)
>
> This might be fixed in future clang versions, but is not hard
> to work around by just replacing the .rep with a series of
> eight unrolled nop instructions.
>
> Link: https://bugs.llvm.org/show_bug.cgi?id=42539
> https://godbolt.org/z/DSM2Jy
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

I guess this brings up the old question whether the compiler should
be worked around or just considered immature, but as it happens this
other day I was grep:ing around to find "the 8 NOP" that is so
compulsively inserted in ARM executables (like at the very start of
the kernel execution) and I couldn't find them and now I see why.
Spelling them out makes it easier to find so:

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij
Russell King (Oracle) July 9, 2019, 9:17 a.m. UTC | #3
On Tue, Jul 09, 2019 at 10:41:05AM +0200, Linus Walleij wrote:
> I guess this brings up the old question whether the compiler should
> be worked around or just considered immature, but as it happens this
> other day I was grep:ing around to find "the 8 NOP" that is so
> compulsively inserted in ARM executables (like at the very start of
> the kernel execution)

The NOPs at the start of the kernel executable have nothing what so ever
to do with this.  They are there to align the kernel entry with the old
a.out format that was used (which had a 32 byte header).  Consequently,
there are boot loaders around that jump to 32 bytes into the kernel
header.

There are other places that we insert 10 NOPs (at cpu_relax()) due to a
CPU errata (otherwise a tight loop basically stalls other CPUs.)
Linus Walleij July 9, 2019, 11:48 a.m. UTC | #4
On Tue, Jul 9, 2019 at 11:17 AM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
> On Tue, Jul 09, 2019 at 10:41:05AM +0200, Linus Walleij wrote:
> > I guess this brings up the old question whether the compiler should
> > be worked around or just considered immature, but as it happens this
> > other day I was grep:ing around to find "the 8 NOP" that is so
> > compulsively inserted in ARM executables (like at the very start of
> > the kernel execution)
>
> The NOPs at the start of the kernel executable have nothing what so ever
> to do with this.  They are there to align the kernel entry with the old
> a.out format that was used (which had a 32 byte header).  Consequently,
> there are boot loaders around that jump to 32 bytes into the kernel
> header.

Wow! Finally the puzzle pieces come together. And it makes a lot
of sense.

> There are other places that we insert 10 NOPs (at cpu_relax()) due to a
> CPU errata (otherwise a tight loop basically stalls other CPUs.)

Pretty interesting too!

I try to learn a bit more intrinsics of the Arm architecture (been doing
assembly experiments recent days) so getting to know things like
this is very valuable.

Yours,
Linus Walleij
Linus Walleij July 9, 2019, 12:17 p.m. UTC | #5
On Mon, Jul 8, 2019 at 10:31 PM Arnd Bergmann <arnd@arndb.de> wrote:

> -#define xip_iprefetch()        do { asm volatile (".rep 8; nop; .endr"); } while (0)
> +#define xip_iprefetch()        do {                                            \
> +        asm volatile ("nop; nop; nop; nop; nop; nop; nop; nop;");      \
> +} while (0)                                                            \

This is certainly an OK fix since we use a row of inline nop at
other places.

However after Russell explained the other nops I didn't understand I located
these in boot/compressed/head.S as this in __start:

                .rept   7
                __nop
                .endr
#ifndef CONFIG_THUMB2_KERNEL
                mov     r0, r0
#else

And certainly this gets compiled, right?

So does .rept/.endr work better than .rep/.endr, is it simply mis-spelled?

I.e. s/.rep/.rept/g
?

In that case we should explain in the commit that .rep doesn't work
but .rept does.

Yours,
Linus Walleij
Russell King (Oracle) July 9, 2019, 12:25 p.m. UTC | #6
On Tue, Jul 09, 2019 at 02:17:58PM +0200, Linus Walleij wrote:
> On Mon, Jul 8, 2019 at 10:31 PM Arnd Bergmann <arnd@arndb.de> wrote:
> 
> > -#define xip_iprefetch()        do { asm volatile (".rep 8; nop; .endr"); } while (0)
> > +#define xip_iprefetch()        do {                                            \
> > +        asm volatile ("nop; nop; nop; nop; nop; nop; nop; nop;");      \
> > +} while (0)                                                            \
> 
> This is certainly an OK fix since we use a row of inline nop at
> other places.
> 
> However after Russell explained the other nops I didn't understand I located
> these in boot/compressed/head.S as this in __start:
> 
>                 .rept   7
>                 __nop
>                 .endr
> #ifndef CONFIG_THUMB2_KERNEL
>                 mov     r0, r0
> #else
> 
> And certainly this gets compiled, right?
> 
> So does .rept/.endr work better than .rep/.endr, is it simply mis-spelled?
> 
> I.e. s/.rep/.rept/g
> ?
> 
> In that case we should explain in the commit that .rep doesn't work
> but .rept does.

According to the info pages for gas:

7.96 `.rept COUNT'
==================

Repeat the sequence of lines between the `.rept' directive and the next
`.endr' directive COUNT times.

So yes, ".rep" is mis-spelled, and it brings up the obvious question:
why isn't gas issuing an error for ".rep"?  There is no mention of
".rep" in the manual.
Nick Desaulniers July 9, 2019, 6:07 p.m. UTC | #7
On Tue, Jul 9, 2019 at 1:41 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Mon, Jul 8, 2019 at 10:31 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> > llvm gets confused by inline asm with .rep directives,
>
> Are the LLVM developers aware of the bug?
> It seems like something we can work around but should
> eventually be fixed properly in LLVM, right?

Arnd filed the bug yesterday.  I looked at it; so someone working on
LLVM is aware of it.  The kernel is definitely exercising weak points
in our inline assembly support.

> > Link: https://bugs.llvm.org/show_bug.cgi?id=42539

> I guess this brings up the old question whether the compiler should
> be worked around or just considered immature, but as it happens this

Definitely a balancing act; we prioritize work based on what's
feasible to work around vs must be implemented.  A lot of my time is
going into validation of asm goto right now, but others are ramping up
on the integrated assembler (clang itself can be invoked as a
substitute for GNU AS; but there's not enough support to do `make
AS=clang` for the kernel just yet).
Nick Desaulniers July 9, 2019, 6:11 p.m. UTC | #8
On Tue, Jul 9, 2019 at 5:26 AM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Tue, Jul 09, 2019 at 02:17:58PM +0200, Linus Walleij wrote:
> > On Mon, Jul 8, 2019 at 10:31 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > > -#define xip_iprefetch()        do { asm volatile (".rep 8; nop; .endr"); } while (0)
> > > +#define xip_iprefetch()        do {                                            \
> > > +        asm volatile ("nop; nop; nop; nop; nop; nop; nop; nop;");      \
> > > +} while (0)                                                            \
> >
> > This is certainly an OK fix since we use a row of inline nop at
> > other places.
> >
> > However after Russell explained the other nops I didn't understand I located
> > these in boot/compressed/head.S as this in __start:
> >
> >                 .rept   7
> >                 __nop
> >                 .endr
> > #ifndef CONFIG_THUMB2_KERNEL
> >                 mov     r0, r0
> > #else
> >
> > And certainly this gets compiled, right?
> >
> > So does .rept/.endr work better than .rep/.endr, is it simply mis-spelled?
> >
> > I.e. s/.rep/.rept/g
> > ?
> >
> > In that case we should explain in the commit that .rep doesn't work
> > but .rept does.
>
> According to the info pages for gas:
>
> 7.96 `.rept COUNT'
> ==================
>
> Repeat the sequence of lines between the `.rept' directive and the next
> `.endr' directive COUNT times.
>
> So yes, ".rep" is mis-spelled, and it brings up the obvious question:
> why isn't gas issuing an error for ".rep"?  There is no mention of
> ".rep" in the manual.

I swear I had looked this up somewhere and found that GNU as and
clang's integrated assembler supported alternative spellings for
assembly directives.  Just checked the manual
https://ftp.gnu.org/old-gnu/Manuals/gas-2.9.1/html_chapter/as_7.html#SEC116
and indeed no mention of the alternatives...must have been looking at
the source...
Arnd Bergmann July 9, 2019, 6:40 p.m. UTC | #9
On Tue, Jul 9, 2019 at 8:08 PM 'Nick Desaulniers' via Clang Built
Linux <clang-built-linux@googlegroups.com> wrote:
> On Tue, Jul 9, 2019 at 1:41 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> > I guess this brings up the old question whether the compiler should
> > be worked around or just considered immature, but as it happens this
>
> Definitely a balancing act; we prioritize work based on what's
> feasible to work around vs must be implemented.  A lot of my time is
> going into validation of asm goto right now, but others are ramping up
> on the integrated assembler (clang itself can be invoked as a
> substitute for GNU AS; but there's not enough support to do `make
> AS=clang` for the kernel just yet).

Note that this bug is the same with both gas and AS=clang, which also
indicates that clang implemented the undocumented .rep directive
for compatibility.

Overall I think we are at the point where building the kernel with clang-8
is mature enough that we should work around bugs like this where it is
easy enough rather than waiting for clang-9.

      Arnd
Russell King (Oracle) July 9, 2019, 10:25 p.m. UTC | #10
On Tue, Jul 09, 2019 at 08:40:51PM +0200, Arnd Bergmann wrote:
> On Tue, Jul 9, 2019 at 8:08 PM 'Nick Desaulniers' via Clang Built
> Linux <clang-built-linux@googlegroups.com> wrote:
> > On Tue, Jul 9, 2019 at 1:41 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> >
> > > I guess this brings up the old question whether the compiler should
> > > be worked around or just considered immature, but as it happens this
> >
> > Definitely a balancing act; we prioritize work based on what's
> > feasible to work around vs must be implemented.  A lot of my time is
> > going into validation of asm goto right now, but others are ramping up
> > on the integrated assembler (clang itself can be invoked as a
> > substitute for GNU AS; but there's not enough support to do `make
> > AS=clang` for the kernel just yet).
> 
> Note that this bug is the same with both gas and AS=clang, which also
> indicates that clang implemented the undocumented .rep directive
> for compatibility.
> 
> Overall I think we are at the point where building the kernel with clang-8
> is mature enough that we should work around bugs like this where it is
> easy enough rather than waiting for clang-9.

While both assemblers seem to support both .rept and .rep, might it
be an idea to check what the clang-8 situation is with .rept ?
Arnd Bergmann July 10, 2019, 8:33 a.m. UTC | #11
On Wed, Jul 10, 2019 at 12:25 AM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Tue, Jul 09, 2019 at 08:40:51PM +0200, Arnd Bergmann wrote:
> > On Tue, Jul 9, 2019 at 8:08 PM 'Nick Desaulniers' via Clang Built
> > Linux <clang-built-linux@googlegroups.com> wrote:
> > > On Tue, Jul 9, 2019 at 1:41 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> > >
> > > > I guess this brings up the old question whether the compiler should
> > > > be worked around or just considered immature, but as it happens this
> > >
> > > Definitely a balancing act; we prioritize work based on what's
> > > feasible to work around vs must be implemented.  A lot of my time is
> > > going into validation of asm goto right now, but others are ramping up
> > > on the integrated assembler (clang itself can be invoked as a
> > > substitute for GNU AS; but there's not enough support to do `make
> > > AS=clang` for the kernel just yet).
> >
> > Note that this bug is the same with both gas and AS=clang, which also
> > indicates that clang implemented the undocumented .rep directive
> > for compatibility.
> >
> > Overall I think we are at the point where building the kernel with clang-8
> > is mature enough that we should work around bugs like this where it is
> > easy enough rather than waiting for clang-9.
>
> While both assemblers seem to support both .rept and .rep, might it
> be an idea to check what the clang-8 situation is with .rept ?

Good idea. I tried this patch now:

--- a/arch/arm/include/asm/mtd-xip.h
+++ b/arch/arm/include/asm/mtd-xip.h
@@ -15,6 +15,6 @@
 #include <mach/mtd-xip.h>

 /* fill instruction prefetch */
-#define xip_iprefetch()        do { asm volatile (".rep 8; nop;
.endr"); } while (0)
+#define xip_iprefetch()        do { asm volatile (".rept 8; nop;
.endr"); } while (0)

 #endif /* __ARM_MTD_XIP_H__ */

Unfortunately that has no effect, clang treats them both the same way.

      Arnd
Russell King (Oracle) July 10, 2019, 8:58 a.m. UTC | #12
On Wed, Jul 10, 2019 at 10:33:31AM +0200, Arnd Bergmann wrote:
> On Wed, Jul 10, 2019 at 12:25 AM Russell King - ARM Linux admin
> <linux@armlinux.org.uk> wrote:
> >
> > On Tue, Jul 09, 2019 at 08:40:51PM +0200, Arnd Bergmann wrote:
> > > On Tue, Jul 9, 2019 at 8:08 PM 'Nick Desaulniers' via Clang Built
> > > Linux <clang-built-linux@googlegroups.com> wrote:
> > > > On Tue, Jul 9, 2019 at 1:41 AM Linus Walleij <linus.walleij@linaro.org> wrote:
> > > >
> > > > > I guess this brings up the old question whether the compiler should
> > > > > be worked around or just considered immature, but as it happens this
> > > >
> > > > Definitely a balancing act; we prioritize work based on what's
> > > > feasible to work around vs must be implemented.  A lot of my time is
> > > > going into validation of asm goto right now, but others are ramping up
> > > > on the integrated assembler (clang itself can be invoked as a
> > > > substitute for GNU AS; but there's not enough support to do `make
> > > > AS=clang` for the kernel just yet).
> > >
> > > Note that this bug is the same with both gas and AS=clang, which also
> > > indicates that clang implemented the undocumented .rep directive
> > > for compatibility.
> > >
> > > Overall I think we are at the point where building the kernel with clang-8
> > > is mature enough that we should work around bugs like this where it is
> > > easy enough rather than waiting for clang-9.
> >
> > While both assemblers seem to support both .rept and .rep, might it
> > be an idea to check what the clang-8 situation is with .rept ?
> 
> Good idea. I tried this patch now:
> 
> --- a/arch/arm/include/asm/mtd-xip.h
> +++ b/arch/arm/include/asm/mtd-xip.h
> @@ -15,6 +15,6 @@
>  #include <mach/mtd-xip.h>
> 
>  /* fill instruction prefetch */
> -#define xip_iprefetch()        do { asm volatile (".rep 8; nop;
> .endr"); } while (0)
> +#define xip_iprefetch()        do { asm volatile (".rept 8; nop;
> .endr"); } while (0)
> 
>  #endif /* __ARM_MTD_XIP_H__ */
> 
> Unfortunately that has no effect, clang treats them both the same way.

In any case, good to know.
diff mbox series

Patch

diff --git a/arch/arm/include/asm/mtd-xip.h b/arch/arm/include/asm/mtd-xip.h
index dfcef0152e3d..5ad0325604e4 100644
--- a/arch/arm/include/asm/mtd-xip.h
+++ b/arch/arm/include/asm/mtd-xip.h
@@ -15,6 +15,8 @@ 
 #include <mach/mtd-xip.h>
 
 /* fill instruction prefetch */
-#define xip_iprefetch() 	do { asm volatile (".rep 8; nop; .endr"); } while (0)
+#define xip_iprefetch()	do {						\
+	 asm volatile ("nop; nop; nop; nop; nop; nop; nop; nop;");	\
+} while (0)								\
 
 #endif /* __ARM_MTD_XIP_H__ */