diff mbox

[PULL,03/13] target-ppc: Use 32-bit rotate instead of deposit + 64-bit rotate

Message ID 2fe8f40e-27ba-23b7-5d11-f75eda95568d@twiddle.net (mailing list archive)
State New, archived
Headers show

Commit Message

Richard Henderson June 16, 2016, 7:04 p.m. UTC
On 06/15/2016 10:19 PM, David Gibson wrote:
> On Wed, Jun 15, 2016 at 10:17:19PM +1000, Anton Blanchard wrote:
>> Hi,
>>
>>> From: Richard Henderson <rth@twiddle.net>
>>>
>>> A 32-bit rotate insn is more common on hosts than a deposit insn,
>>> and if the host has neither the result is truely horrific.
>>>
>>> At the same time, tidy up the temporaries within these functions,
>>> drop the over-use of "likely", drop some checks for identity that
>>> will also be checked by tcg-op.c functions, and special case mask
>>> without rotate within rlwinm.
>>
>> This breaks masks that wrap:
>>
>>         li      r3,-1
>>         li      r4,-1
>>         rlwnm   r3,r3,r4,22,8
>>
>> We expect:
>>
>> ffffffffff8003ff
>>
>> But get:
>>
>> ff8003ff
>>
>> Anton
> 
> Bother.  I've tentatively put a revert into ppc-for-2.7.  Richard, do
> you have a better idea how to fix it?

Please try the following.


r~

Comments

Anton Blanchard June 17, 2016, 2:27 p.m. UTC | #1
Hi rth,

> > Bother.  I've tentatively put a revert into ppc-for-2.7.  Richard,
> > do you have a better idea how to fix it?  
> 
> Please try the following.

Thanks! This passes my tests. Feel free to add:

Tested-by: Anton Blanchard <anton@samba.org>

Anton
Anton Blanchard June 18, 2016, 4:02 a.m. UTC | #2
Hi,

> > > Bother.  I've tentatively put a revert into ppc-for-2.7.  Richard,
> > > do you have a better idea how to fix it?    
> > 
> > Please try the following.  
> 
> Thanks! This passes my tests. Feel free to add:
> 
> Tested-by: Anton Blanchard <anton@samba.org>

Actually I think I've found a problem:

        lis     r4,0x7fffffff@h
        ori     r4,r4,0x7fffffff@l
        rlwinm  r3,r4,0,25,1

32 bit rotate is defined as a 64 bit rotate of 2 copies of the 32 bit
value, so we expect 0x7fffffff4000007f, but get 0x4000007f.

Not sure if anything out there depends on it though.

Anton
Richard Henderson June 18, 2016, 5:10 a.m. UTC | #3
On 06/17/2016 09:02 PM, Anton Blanchard wrote:
>         lis     r4,0x7fffffff@h
>         ori     r4,r4,0x7fffffff@l
>         rlwinm  r3,r4,0,25,1

Ah, with zero rotate.  I see.  New patch coming up.


r~
Thomas Huth June 20, 2016, 8:21 a.m. UTC | #4
On 18.06.2016 06:02, Anton Blanchard wrote:
> Hi,
> 
>>>> Bother.  I've tentatively put a revert into ppc-for-2.7.  Richard,
>>>> do you have a better idea how to fix it?    
>>>
>>> Please try the following.  
>>
>> Thanks! This passes my tests. Feel free to add:
>>
>> Tested-by: Anton Blanchard <anton@samba.org>
> 
> Actually I think I've found a problem:
> 
>         lis     r4,0x7fffffff@h
>         ori     r4,r4,0x7fffffff@l
>         rlwinm  r3,r4,0,25,1
> 
> 32 bit rotate is defined as a 64 bit rotate of 2 copies of the 32 bit
> value, so we expect 0x7fffffff4000007f, but get 0x4000007f.
> 
> Not sure if anything out there depends on it though.

Would it maybe make sense to add some tests for this stuff to the
tests/tcg/ folder to make sure that such regressions do not sneak back
in in the future?

 Thomas
Peter Maydell June 20, 2016, 8:56 a.m. UTC | #5
On 20 June 2016 at 09:21, Thomas Huth <thuth@redhat.com> wrote:
> Would it maybe make sense to add some tests for this stuff to the
> tests/tcg/ folder to make sure that such regressions do not sneak back
> in in the future?

Wouldn't help very much, because tests/tcg doesn't get run.
(The underlying problem is "how do you have a good test
framework for all the TCG targets when you can't guarantee
to have the toolchains to build the test code for them?".)

thanks
-- PMM
Thomas Huth June 20, 2016, 9:08 a.m. UTC | #6
On 20.06.2016 10:56, Peter Maydell wrote:
> On 20 June 2016 at 09:21, Thomas Huth <thuth@redhat.com> wrote:
>> Would it maybe make sense to add some tests for this stuff to the
>> tests/tcg/ folder to make sure that such regressions do not sneak back
>> in in the future?
> 
> Wouldn't help very much, because tests/tcg doesn't get run.
> (The underlying problem is "how do you have a good test
> framework for all the TCG targets when you can't guarantee
> to have the toolchains to build the test code for them?".)

Oh, I see ... that's a pity.

And I guess storing short binaries in the repository is also something
we do not want, do we?

... so maybe, one far day in the future, we should detect potential
cross compilers in the configure script, too, and enable those tests if
the matching cross-compiler has been found... sounds like a task for
cold and long winter nights... ;-)

 Thomas
diff mbox

Patch

>From f6059bc0e1303d898be2132a444bd58478a0eba0 Mon Sep 17 00:00:00 2001
From: Richard Henderson <rth@twiddle.net>
Date: Thu, 16 Jun 2016 19:00:12 +0000
Subject: [PATCH] target-ppc: Fix rlwimi, rlwinm, rlwnm

In 63ae0915f8ec, I arranged to use a 32-bit rotate, without
considering the effect of a mask value that wraps around to
the high bits of the word.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-ppc/translate.c | 73 +++++++++++++++++++++++++++++++++++---------------
 1 file changed, 52 insertions(+), 21 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index b689475..12cfa37 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -1636,7 +1636,6 @@  static void gen_rlwimi(DisasContext *ctx)
         tcg_gen_deposit_tl(t_ra, t_ra, t_rs, sh, me - mb + 1);
     } else {
         target_ulong mask;
-        TCGv_i32 t0;
         TCGv t1;
 
 #if defined(TARGET_PPC64)
@@ -1645,12 +1644,21 @@  static void gen_rlwimi(DisasContext *ctx)
 #endif
         mask = MASK(mb, me);
 
-        t0 = tcg_temp_new_i32();
         t1 = tcg_temp_new();
-        tcg_gen_trunc_tl_i32(t0, t_rs);
-        tcg_gen_rotli_i32(t0, t0, sh);
-        tcg_gen_extu_i32_tl(t1, t0);
-        tcg_temp_free_i32(t0);
+        if (mask <= 0xffffffffu) {
+            TCGv_i32 t0 = tcg_temp_new_i32();
+            tcg_gen_trunc_tl_i32(t0, t_rs);
+            tcg_gen_rotli_i32(t0, t0, sh);
+            tcg_gen_extu_i32_tl(t1, t0);
+            tcg_temp_free_i32(t0);
+        } else {
+#if defined(TARGET_PPC64)
+            tcg_gen_deposit_i64(t1, t_rs, t_rs, 32, 32);
+            tcg_gen_rotli_i64(t1, t1, sh);
+#else
+            g_assert_not_reached();
+#endif
+        }
 
         tcg_gen_andi_tl(t1, t1, mask);
         tcg_gen_andi_tl(t_ra, t_ra, ~mask);
@@ -1678,20 +1686,30 @@  static void gen_rlwinm(DisasContext *ctx)
         tcg_gen_ext32u_tl(t_ra, t_rs);
         tcg_gen_shri_tl(t_ra, t_ra, mb);
     } else {
+        target_ulong mask;
 #if defined(TARGET_PPC64)
         mb += 32;
         me += 32;
 #endif
+        mask = MASK(mb, me);
+
         if (sh == 0) {
-            tcg_gen_andi_tl(t_ra, t_rs, MASK(mb, me));
-        } else {
+            tcg_gen_andi_tl(t_ra, t_rs, mask);
+        } else if (mask <= 0xffffffffu) {
             TCGv_i32 t0 = tcg_temp_new_i32();
-
             tcg_gen_trunc_tl_i32(t0, t_rs);
             tcg_gen_rotli_i32(t0, t0, sh);
-            tcg_gen_andi_i32(t0, t0, MASK(mb, me));
+            tcg_gen_andi_i32(t0, t0, mask);
             tcg_gen_extu_i32_tl(t_ra, t0);
             tcg_temp_free_i32(t0);
+        } else {
+#if defined(TARGET_PPC64)
+            tcg_gen_deposit_i64(t_ra, t_rs, t_rs, 32, 32);
+            tcg_gen_rotli_i64(t_ra, t_ra, sh);
+            tcg_gen_andi_i64(t_ra, t_ra, mask);
+#else
+            g_assert_not_reached();
+#endif
         }
     }
     if (unlikely(Rc(ctx->opcode) != 0)) {
@@ -1707,24 +1725,37 @@  static void gen_rlwnm(DisasContext *ctx)
     TCGv t_rb = cpu_gpr[rB(ctx->opcode)];
     uint32_t mb = MB(ctx->opcode);
     uint32_t me = ME(ctx->opcode);
-    TCGv_i32 t0, t1;
+    target_ulong mask;
 
 #if defined(TARGET_PPC64)
     mb += 32;
     me += 32;
 #endif
+    mask = MASK(mb, me);
 
-    t0 = tcg_temp_new_i32();
-    t1 = tcg_temp_new_i32();
-    tcg_gen_trunc_tl_i32(t0, t_rb);
-    tcg_gen_trunc_tl_i32(t1, t_rs);
-    tcg_gen_andi_i32(t0, t0, 0x1f);
-    tcg_gen_rotl_i32(t1, t1, t0);
-    tcg_temp_free_i32(t0);
+    if (mask <= 0xffffffffu) {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+        TCGv_i32 t1 = tcg_temp_new_i32();
+        tcg_gen_trunc_tl_i32(t0, t_rb);
+        tcg_gen_trunc_tl_i32(t1, t_rs);
+        tcg_gen_andi_i32(t0, t0, 0x1f);
+        tcg_gen_rotl_i32(t1, t1, t0);
+        tcg_gen_extu_i32_tl(t_ra, t1);
+        tcg_temp_free_i32(t0);
+        tcg_temp_free_i32(t1);
+    } else {
+#if defined(TARGET_PPC64)
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        tcg_gen_andi_i64(t0, t_rb, 0x1f);
+        tcg_gen_deposit_i64(t_ra, t_rs, t_rs, 32, 32);
+        tcg_gen_rotl_i64(t_ra, t_ra, t0);
+        tcg_temp_free_i64(t0);
+#else
+        g_assert_not_reached();
+#endif
+    }
 
-    tcg_gen_andi_i32(t1, t1, MASK(mb, me));
-    tcg_gen_extu_i32_tl(t_ra, t1);
-    tcg_temp_free_i32(t1);
+    tcg_gen_andi_tl(t_ra, t_ra, mask);
 
     if (unlikely(Rc(ctx->opcode) != 0)) {
         gen_set_Rc0(ctx, t_ra);
-- 
2.5.5