diff mbox series

[14/30] tcg/loongarch: Implement bswap32_i32/bswap64_i64

Message ID 20210920080451.408655-15-git@xen0n.name (mailing list archive)
State New, archived
Headers show
Series 64-bit LoongArch port of QEMU TCG | expand

Commit Message

WANG Xuerui Sept. 20, 2021, 8:04 a.m. UTC
Signed-off-by: WANG Xuerui <git@xen0n.name>
---
 tcg/loongarch/tcg-target.c.inc | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Richard Henderson Sept. 20, 2021, 3:11 p.m. UTC | #1
On 9/20/21 1:04 AM, WANG Xuerui wrote:
> +    case INDEX_op_bswap32_i32:
> +        tcg_out_opc_revb_2h(s, a0, a1);
> +        tcg_out_opc_rotri_w(s, a0, a0, 16);
> +        break;
> +    case INDEX_op_bswap64_i64:
> +        tcg_out_opc_revb_d(s, a0, a1);
> +        break;

You're missing INDEX_op_bswap32_i64, which in addition has a third argument consisting of 
TCG_BSWAP_* bits.

I would have expected revb_2w to be the preferred implementation of bswap32.  I would 
expect something like


     case INDEX_op_bswap32_i32:
         /* All 32-bit values are computed sign-extended in the register. */
         a2 = TCG_BSWAP_OS;
         /* fall through */
     case INDEX_op_bswap32_i64:
         tcg_out_opc_revb_2w(s, a0, a1);
         if (a2 & TCG_BSWAP_OS) {
             tcg_out_ext32s(s, a0, a0);
         } else if (a2 & TCG_BSWAP_OZ) {
             tcg_out_ext32u(s, a0, a0);
         }
         break;


r~
Richard Henderson Sept. 20, 2021, 6:20 p.m. UTC | #2
On 9/20/21 8:11 AM, Richard Henderson wrote:
>          } else if (a2 & TCG_BSWAP_OZ) {
>              tcg_out_ext32u(s, a0, a0);
>          }

Actually,

   if ((a2 & (TCG_BSWAP_IZ | TCG_BSWAP_OZ)) == TCG_BSWAP_OZ)

If the input is zero-extended, the output of revb_2w will also be zero-extended already.


r~
WANG Xuerui Sept. 21, 2021, 6:37 a.m. UTC | #3
Hi Richard,

On 9/20/21 23:11, Richard Henderson wrote:
> On 9/20/21 1:04 AM, WANG Xuerui wrote:
>> +    case INDEX_op_bswap32_i32:
>> +        tcg_out_opc_revb_2h(s, a0, a1);
>> +        tcg_out_opc_rotri_w(s, a0, a0, 16);
>> +        break;
>> +    case INDEX_op_bswap64_i64:
>> +        tcg_out_opc_revb_d(s, a0, a1);
>> +        break;
>
> You're missing INDEX_op_bswap32_i64, which in addition has a third 
> argument consisting of TCG_BSWAP_* bits.
>
> I would have expected revb_2w to be the preferred implementation of 
> bswap32.  I would expect something like
>
>
>     case INDEX_op_bswap32_i32:
>         /* All 32-bit values are computed sign-extended in the 
> register. */
>         a2 = TCG_BSWAP_OS;
>         /* fall through */
>     case INDEX_op_bswap32_i64:
>         tcg_out_opc_revb_2w(s, a0, a1);
>         if (a2 & TCG_BSWAP_OS) {
>             tcg_out_ext32s(s, a0, a0);
>         } else if (a2 & TCG_BSWAP_OZ) {
>             tcg_out_ext32u(s, a0, a0);
>         }
>         break;
>
You're right when we're supporting only 64-bit hosts. While I was 
writing that code I hadn't decided whether to remove support for 32-bit 
hosts, so I didn't make use of 64-bit instructions for the 32-bit ops. 
I'll fix this in v2.
>
> r~
diff mbox series

Patch

diff --git a/tcg/loongarch/tcg-target.c.inc b/tcg/loongarch/tcg-target.c.inc
index e5356bdaf8..d617b833e5 100644
--- a/tcg/loongarch/tcg-target.c.inc
+++ b/tcg/loongarch/tcg-target.c.inc
@@ -480,6 +480,14 @@  static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_opc_bstrins_d(s, a0, a2, args[3], args[3] + args[4] - 1);
         break;
 
+    case INDEX_op_bswap32_i32:
+        tcg_out_opc_revb_2h(s, a0, a1);
+        tcg_out_opc_rotri_w(s, a0, a0, 16);
+        break;
+    case INDEX_op_bswap64_i64:
+        tcg_out_opc_revb_d(s, a0, a1);
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     default:
@@ -511,6 +519,8 @@  static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_not_i64:
     case INDEX_op_extract_i32:
     case INDEX_op_extract_i64:
+    case INDEX_op_bswap32_i32:
+    case INDEX_op_bswap64_i64:
         return C_O1_I1(r, r);
 
     case INDEX_op_nor_i32: