diff mbox series

__div64_32(): straighten up inline asm constraints

Message ID pr6q9q72-6n62-236q-s59n-7osq71o285r9@syhkavp.arg (mailing list archive)
State New, archived
Headers show
Series __div64_32(): straighten up inline asm constraints | expand

Commit Message

Nicolas Pitre Nov. 30, 2020, 7:05 p.m. UTC
The ARM version of __div64_32() encapsulates a call to __do_div64 with 
non-standard argument passing. In particular, __n is a 64-bit input 
argument assigned to r0-r1 and __rem is an output argument sharing half 
of that 40-r1 register pair.

With __n being an input argument, the compiler is in its right to 
presume that r0-r1 would still hold the value of __n past the inline 
assembly statement. Normally, the compiler would have assigned non 
overlapping registers to __n and __rem if the value for __n is needed 
again.

However, here we enforce our own register assignment and gcc fails to 
notice the conflict. In practice this doesn't cause any problem as __n 
is considered dead after the asm statement and *n is overwritten. 
However this is not always guaranteed and clang rightfully complains.

Let's fix it properly by making __n into an input-output variable. This 
makes it clear that those registers representing __n have been modified. 
Then we can extract __rem as the high part of __n with plain C code.

This asm constraint "abuse" was likely relied upon back when gcc didn't 
handle 64-bit values optimally Turns out that gcc is now able to 
optimize things and produces the same code with this patch applied.

Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
---

This is related to the thread titled "[RESEND,PATCH] ARM: fix 
__div64_32() error when compiling with clang". My limited compile test 
with clang appears to make it happy. If no more comments I'll push this 
to RMK's patch system.

Comments

Nick Desaulniers Nov. 30, 2020, 7:33 p.m. UTC | #1
On Mon, Nov 30, 2020 at 11:05 AM Nicolas Pitre <nico@fluxnic.net> wrote:
>
> The ARM version of __div64_32() encapsulates a call to __do_div64 with
> non-standard argument passing. In particular, __n is a 64-bit input
> argument assigned to r0-r1 and __rem is an output argument sharing half
> of that 40-r1 register pair.

Should `40` be `r0`?

>
> With __n being an input argument, the compiler is in its right to
> presume that r0-r1 would still hold the value of __n past the inline
> assembly statement. Normally, the compiler would have assigned non
> overlapping registers to __n and __rem if the value for __n is needed
> again.
>
> However, here we enforce our own register assignment and gcc fails to
> notice the conflict. In practice this doesn't cause any problem as __n
> is considered dead after the asm statement and *n is overwritten.
> However this is not always guaranteed and clang rightfully complains.
>
> Let's fix it properly by making __n into an input-output variable. This
> makes it clear that those registers representing __n have been modified.
> Then we can extract __rem as the high part of __n with plain C code.
>
> This asm constraint "abuse" was likely relied upon back when gcc didn't
> handle 64-bit values optimally Turns out that gcc is now able to

^ Missing punctuation (period after `optimally`).

> optimize things and produces the same code with this patch applied.
>
> Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
> Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
> Tested-by: Ard Biesheuvel <ardb@kernel.org>

Reported-by: Antony Yu <swpenim@gmail.com>


> ---
>
> This is related to the thread titled "[RESEND,PATCH] ARM: fix
> __div64_32() error when compiling with clang". My limited compile test
> with clang appears to make it happy. If no more comments I'll push this
> to RMK's patch system.
>
> diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
> index 898e9c78a7..595e538f5b 100644
> --- a/arch/arm/include/asm/div64.h
> +++ b/arch/arm/include/asm/div64.h
> @@ -21,29 +21,20 @@
>   * assembly implementation with completely non standard calling convention
>   * for arguments and results (beware).
>   */
> -
> -#ifdef __ARMEB__
> -#define __xh "r0"
> -#define __xl "r1"
> -#else
> -#define __xl "r0"
> -#define __xh "r1"
> -#endif
> -
>  static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
>  {
>         register unsigned int __base      asm("r4") = base;
>         register unsigned long long __n   asm("r0") = *n;
>         register unsigned long long __res asm("r2");
> -       register unsigned int __rem       asm(__xh);
> -       asm(    __asmeq("%0", __xh)
> +       unsigned int __rem;
> +       asm(    __asmeq("%0", "r0")
>                 __asmeq("%1", "r2")
> -               __asmeq("%2", "r0")
> -               __asmeq("%3", "r4")
> +               __asmeq("%2", "r4")
>                 "bl     __do_div64"
> -               : "=r" (__rem), "=r" (__res)
> -               : "r" (__n), "r" (__base)
> +               : "+r" (__n), "=r" (__res)
> +               : "r" (__base)
>                 : "ip", "lr", "cc");
> +       __rem = __n >> 32;
>         *n = __res;
>         return __rem;

The above 3 statement could be:

```
*n = __res;
return __n >> 32;
```

>  }
Nicolas Pitre Nov. 30, 2020, 8:27 p.m. UTC | #2
On Mon, 30 Nov 2020, Nick Desaulniers wrote:

> On Mon, Nov 30, 2020 at 11:05 AM Nicolas Pitre <nico@fluxnic.net> wrote:
> 
> > +       __rem = __n >> 32;
> >         *n = __res;
> >         return __rem;
> 
> The above 3 statement could be:
> 
> ```
> *n = __res;
> return __n >> 32;
> ```

They could. However the compiler doesn't care, and the extra line makes 
it more obvious that the reminder is the high part of __n. So, 
semantically the extra line has value.

Thanks for the review.


Nicolas
diff mbox series

Patch

diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
index 898e9c78a7..595e538f5b 100644
--- a/arch/arm/include/asm/div64.h
+++ b/arch/arm/include/asm/div64.h
@@ -21,29 +21,20 @@ 
  * assembly implementation with completely non standard calling convention
  * for arguments and results (beware).
  */
-
-#ifdef __ARMEB__
-#define __xh "r0"
-#define __xl "r1"
-#else
-#define __xl "r0"
-#define __xh "r1"
-#endif
-
 static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
 {
 	register unsigned int __base      asm("r4") = base;
 	register unsigned long long __n   asm("r0") = *n;
 	register unsigned long long __res asm("r2");
-	register unsigned int __rem       asm(__xh);
-	asm(	__asmeq("%0", __xh)
+	unsigned int __rem;
+	asm(	__asmeq("%0", "r0")
 		__asmeq("%1", "r2")
-		__asmeq("%2", "r0")
-		__asmeq("%3", "r4")
+		__asmeq("%2", "r4")
 		"bl	__do_div64"
-		: "=r" (__rem), "=r" (__res)
-		: "r" (__n), "r" (__base)
+		: "+r" (__n), "=r" (__res)
+		: "r" (__base)
 		: "ip", "lr", "cc");
+	__rem = __n >> 32;
 	*n = __res;
 	return __rem;
 }