
ARM: reinsert ARCH_MULTI_V4 Kconfig option

Message ID CACmBeS3kC4dsdOrfANifgu8oQhLauFKcXhEu=iz8xbUxdhMq_A@mail.gmail.com (mailing list archive)
State New, archived

Commit Message

Jonas Jensen April 9, 2014, 2:54 p.m. UTC
On 13 December 2013 12:39, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> I see what's causing this: the kuser helpers are using "bx lr" to return
> which will be undefined on non-Thumb CPUs.  We generally cope fine with
> non-Thumb CPUs, conditionalising where necessary on HWCAP_THUMB or the
> T bit in the PSR being set.
>
> However, it looks like the kuser helpers got missed.  As a check, please
> look at arch/arm/kernel/entry-armv.S, find the line with:
>
>         .macro  usr_ret, reg
>
> and ensure that the mov pc, \reg case always gets used.  Please report
> back.
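
[For illustration, the two return idioms being contrasted here, as they
behave on different cores; this is a sketch, not code from the patch:

	bx	\reg		@ ARMv4T and later: branch-and-exchange;
				@ bit 0 of \reg selects ARM or Thumb state.
				@ UNDEFINED on plain ARMv4 (no Thumb).
	mov	pc, \reg	@ works on every ARM core: a plain branch
				@ that never switches to Thumb state.
]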

Uwe and Arnd came up with a solution, but it doesn't work when I test it.

The suggested patch is quoted in the replies below and reproduced under
"Patch" at the bottom of this page.

With this applied, and one of the following two configurations in next-20140403:

[1] ARCH_MULTI_V4 only (no CONFIG_ARM_THUMB)
     CONFIG_CPU_32v4=y

[2] ARCH_MULTI_V4 and ARCH_MULTI_V4T
     CONFIG_ARM_THUMB=y
     CONFIG_CPU_32v4=y
     CONFIG_CPU_32v4T=y


Booting a kernel with either configuration [1] or [2] yields the following results:

[1] works
[2] hangs after "[    2.730000] Freeing unused kernel memory: 104K
(c02f5000 - c030f000)"


Any help understanding why the moveq doesn't work would be much appreciated.


The commit where this was tested:
https://bitbucket.org/Kasreyn/linux-next/commits/48320cbe0d5a1fb6ada52cea9efd87c7712367cb


Regards,
Jonas

Comments

Uwe Kleine-König April 9, 2014, 3:04 p.m. UTC | #1
On Wed, Apr 09, 2014 at 04:54:16PM +0200, Jonas Jensen wrote:
> [...]
> Any help understanding why the moveq doesn't work would be much appreciated.
Doing s/moveq/mov/ does the trick on the machine in question, but that
is obviously not an option for mainline. It does mean, however, that even
on this non-Thumb-capable machine \reg contains an address that is not
aligned.

Where does \reg come from? Is it provided by userspace? If so, is it a
userspace bug?

Best regards
Uwe
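
[On the question of where \reg comes from: the kuser helpers are entered
by an ordinary userspace call, so \reg is the caller's return address in
lr. A hedged sketch of an ARM-state caller, following the convention in
Documentation/arm/kernel_user_helpers.txt; the label is made up:

	.text
	.globl	my_get_tls
my_get_tls:
	str	lr, [sp, #-4]!		@ save our own return address
	mvn	r3, #0xf000		@ r3 = 0xffff0fff
	sub	r3, r3, #0x1f		@ r3 = 0xffff0fe0 (__kuser_get_tls)
	mov	lr, pc			@ lr = return address: 4-byte
					@ aligned, bit 0 clear
	mov	pc, r3			@ pre-BLX indirect call; the helper
					@ comes back through usr_ret lr
	ldr	pc, [sp], #4		@ return to our caller, r0 = TLS

A Thumb-state caller instead reaches the helper with bit 0 of lr set,
since bl/blx in Thumb state record the caller's state there.]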
Russell King - ARM Linux April 9, 2014, 3:06 p.m. UTC | #2
On Wed, Apr 09, 2014 at 04:54:16PM +0200, Jonas Jensen wrote:
> [...]
>         .macro  usr_ret, reg
>  #ifdef CONFIG_ARM_THUMB
> +       /*
> +        * Having CONFIG_ARM_THUMB set is no guarantee that the CPU supports
> +        * Thumb and hence the bx instruction. Use a mov if the address to
> +        * jump to is 32-bit aligned. (Note that this code is compiled in ARM
> +        * mode, so this is the right test.)
> +        */
> +#if defined(CONFIG_CPU_32v4)
> +       tst     \reg, #3
> +       moveq   pc, \reg
> +       b       .
> +#endif
> +
>         bx      \reg

What's wrong with:
	tst	\reg, #3
	moveq	pc, \reg
	bx	\reg

rather than ending in an infinite loop?
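
[Annotated, Russell's suggested fall-through covers both caller types; a
sketch, not the submitted patch:

	tst	\reg, #3	@ Z set iff \reg is 4-byte aligned
	moveq	pc, \reg	@ ARM caller: plain return, no bx needed
	bx	\reg		@ reached only for a Thumb caller (bit 0
				@ set), which implies a CPU that has bx
]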
Russell King - ARM Linux April 9, 2014, 3:09 p.m. UTC | #3
On Wed, Apr 09, 2014 at 05:04:48PM +0200, Uwe Kleine-König wrote:
> [...]
> Doing s/moveq/mov/ does the trick on the machine in question, but that
> is obviously not an option for mainline. It does mean, however, that even
> on this non-Thumb-capable machine \reg contains an address that is not
> aligned.
> 
> Where does \reg come from? Is it provided by userspace? If so, is it a
> userspace bug?

Well, in case (2), if usr_ret is returning to a caller running Thumb code,
it will lock up, because the register will not be aligned, so the moveq will
not be taken, and the next instruction is a branch which only ever branches
to itself.
Uwe Kleine-König April 9, 2014, 3:13 p.m. UTC | #4
Hello Russell,

On Wed, Apr 09, 2014 at 04:06:40PM +0100, Russell King - ARM Linux wrote:
> [...]
> What's wrong with:
> 	tst	\reg, #3
> 	moveq	pc, \reg
> 	bx	\reg
> 
> rather than ending in an infinite loop?
The added b . was a test to check whether the machine then hangs instead
of crashing. (And yes, that was the case, so a return to a non-aligned
address was indeed attempted.)

Best regards
Uwe
Russell King - ARM Linux April 9, 2014, 3:27 p.m. UTC | #5
On Wed, Apr 09, 2014 at 05:13:26PM +0200, Uwe Kleine-König wrote:
> Hello Russell,
> 
> On Wed, Apr 09, 2014 at 04:06:40PM +0100, Russell King - ARM Linux wrote:
> > [...]
> > 
> > What's wrong with:
> > 	tst	\reg, #3
> > 	moveq	pc, \reg
> > 	bx	\reg
> > 
> > rather than ending in an infinite loop?
> The added b . was a test to check whether the machine then hangs instead
> of crashing. (And yes, that was the case, so a return to a non-aligned
> address was indeed attempted.)

If it's called from ARM code, then \reg will contain a 4-byte aligned
address.  If it's called from Thumb code, \reg will contain a 2-byte
aligned address with bit 0 *always* set.

So, with the code originally quoted above, if the helper is called from
Thumb code, and CONFIG_CPU_32v4 is enabled, then we end up falling past
the moveq to the "b ." and entering an infinite loop.
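
[A worked example of the two cases, with hypothetical return addresses:

	@ ARM caller: lr = 0x00010824, 4-byte aligned
	tst	lr, #3		@ 0x00010824 & 3 == 0  ->  Z = 1
	moveq	pc, lr		@ taken: clean return

	@ Thumb caller: lr = 0x00010825, bit 0 set by bl/blx
	tst	lr, #3		@ 0x00010825 & 3 == 1  ->  Z = 0
	moveq	pc, lr		@ not taken
	b	.		@ the debug branch spins forever
]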
Arnd Bergmann April 9, 2014, 3:50 p.m. UTC | #6
On Wednesday 09 April 2014 16:27:11 Russell King - ARM Linux wrote:
> On Wed, Apr 09, 2014 at 05:13:26PM +0200, Uwe Kleine-König wrote:
> > On Wed, Apr 09, 2014 at 04:06:40PM +0100, Russell King - ARM Linux wrote:
> > > [...]
> > > 
> > > What's wrong with:
> > >     tst     \reg, #3
> > >     moveq   pc, \reg
> > >     bx      \reg
> > > 
> > > rather than ending in an infinite loop?
> > The added b . was a test to check whether the machine then hangs instead
> > of crashing. (And yes, that was the case, so a return to a non-aligned
> > address was indeed attempted.)
> 
> If it's called from ARM code, then \reg will contain a 4-byte aligned
> address.  If it's called from Thumb code, \reg will contain a 2-byte
> aligned address with bit 0 *always* set.

Right, that is the assumption.

> So, with the code originally quoted above, if the helper is called from
> thumb code, and CONFIG_CPU_32v4 is enabled, then we end up falling past
> the moveq to the "b ." and entering an infinite loop.

As Uwe said, that "b ." was not meant to be in the patch used for
submission; it was there to check what goes wrong when running this code
on ARMv4 -- either crashing user space or hanging in an infinite loop.
I forgot what the result of that experiment was.

The trouble is that this code:

       .macro  usr_ret, reg
#ifdef CONFIG_ARM_THUMB
#ifdef CONFIG_CPU_32v4
	tst     \reg, #3
	moveq   pc, \reg
#endif
        bx      \reg
#else
	mov	pc, \reg
#endif
	.endm

for some reason does the wrong thing when running on ARMv4 (fa526) with
non-Thumb user space when both CONFIG_CPU_32v4 and CONFIG_ARM_THUMB are
enabled: it still tries to execute the 'bx' and triggers an invalid opcode
exception.

	Arnd
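
[For clarity, how the macro above expands in Jonas's two test
configurations; a sketch of the preprocessed output:

	@ config [1]: CONFIG_CPU_32v4=y, CONFIG_ARM_THUMB unset
	mov	pc, \reg	@ the only instruction emitted; works

	@ config [2]: CONFIG_CPU_32v4=y and CONFIG_ARM_THUMB=y
	tst	\reg, #3
	moveq	pc, \reg	@ should catch every caller on the
				@ non-Thumb fa526 ...
	bx	\reg		@ ... yet the report says this still
				@ executes and faults
]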

Patch

diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 1879e8d..de15bfd 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -739,6 +739,18 @@  ENDPROC(__switch_to)

        .macro  usr_ret, reg
 #ifdef CONFIG_ARM_THUMB
+       /*
+        * Having CONFIG_ARM_THUMB set is no guarantee that the CPU supports
+        * Thumb and hence the bx instruction. Use a mov if the address to
+        * jump to is 32-bit aligned. (Note that this code is compiled in ARM
+        * mode, so this is the right test.)
+        */
+#if defined(CONFIG_CPU_32v4)
+       tst     \reg, #3
+       moveq   pc, \reg
+       b       .
+#endif
+
        bx      \reg
 #else
        mov     pc, \reg