[36/67] target/arm: Introduce gen_gvec_cls, gen_gvec_clz


Message ID

20241201150607.12812-37-richard.henderson@linaro.org (mailing list archive)

State

New

Headers

From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH 36/67] target/arm: Introduce gen_gvec_cls, gen_gvec_clz
Date: Sun,  1 Dec 2024 09:05:35 -0600
Message-ID: <20241201150607.12812-37-richard.henderson@linaro.org>
In-Reply-To: <20241201150607.12812-1-richard.henderson@linaro.org>
References: <20241201150607.12812-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Received-SPF: pass client-ip=2607:f8b0:4864:20::32e;
 envelope-from=richard.henderson@linaro.org; helo=mail-ot1-x32e.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

Series

target/arm: AArch64 decodetree conversion, final part

Commit Message

Richard Henderson Dec. 1, 2024, 3:05 p.m. UTC

Add gvec interfaces for CLS and CLZ operations.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate.h      |  5 +++++
 target/arm/tcg/gengvec.c        | 35 +++++++++++++++++++++++++++++++++
 target/arm/tcg/translate-a64.c  | 29 +++++++--------------------
 target/arm/tcg/translate-neon.c | 29 ++-------------------------
 4 files changed, 49 insertions(+), 49 deletions(-)

Comments

Philippe Mathieu-Daudé Dec. 2, 2024, 4:29 p.m. UTC | #1

On 1/12/24 16:05, Richard Henderson wrote:
> Add gvec interfaces for CLS and CLZ operations.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/arm/tcg/translate.h      |  5 +++++
>   target/arm/tcg/gengvec.c        | 35 +++++++++++++++++++++++++++++++++
>   target/arm/tcg/translate-a64.c  | 29 +++++++--------------------
>   target/arm/tcg/translate-neon.c | 29 ++-------------------------
>   4 files changed, 49 insertions(+), 49 deletions(-)


> +void gen_gvec_cls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
> +                  uint32_t opr_sz, uint32_t max_sz)
> +{
> +    static const GVecGen2 g[] = {
> +        { .fni4 = gen_helper_neon_cls_s8,
> +          .vece = MO_8 },
> +        { .fni4 = gen_helper_neon_cls_s16,
> +          .vece = MO_16 },
> +        { .fni4 = tcg_gen_clrsb_i32,

Why do we have tcg_gen_clrsb_i32(), ...

> +          .vece = MO_32 },
> +    };
> +    assert(vece <= MO_32);
> +    tcg_gen_gvec_2(rd_ofs, rn_ofs, opr_sz, max_sz, &g[vece]);
> +}
> +
> +static void gen_clz32_i32(TCGv_i32 d, TCGv_i32 n)

... but not tcg_gen_clz32_i32()?

> +{
> +    tcg_gen_clzi_i32(d, n, 32);
> +}

Anyhow,

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

Richard Henderson Dec. 2, 2024, 5:56 p.m. UTC | #2

On 12/2/24 10:29, Philippe Mathieu-Daudé wrote:
> On 1/12/24 16:05, Richard Henderson wrote:
>> Add gvec interfaces for CLS and CLZ operations.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   target/arm/tcg/translate.h      |  5 +++++
>>   target/arm/tcg/gengvec.c        | 35 +++++++++++++++++++++++++++++++++
>>   target/arm/tcg/translate-a64.c  | 29 +++++++--------------------
>>   target/arm/tcg/translate-neon.c | 29 ++-------------------------
>>   4 files changed, 49 insertions(+), 49 deletions(-)
> 
> 
>> +void gen_gvec_cls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
>> +                  uint32_t opr_sz, uint32_t max_sz)
>> +{
>> +    static const GVecGen2 g[] = {
>> +        { .fni4 = gen_helper_neon_cls_s8,
>> +          .vece = MO_8 },
>> +        { .fni4 = gen_helper_neon_cls_s16,
>> +          .vece = MO_16 },
>> +        { .fni4 = tcg_gen_clrsb_i32,
> 
> Why do we have tcg_gen_clrsb_i32(), ...
> 
>> +          .vece = MO_32 },
>> +    };
>> +    assert(vece <= MO_32);
>> +    tcg_gen_gvec_2(rd_ofs, rn_ofs, opr_sz, max_sz, &g[vece]);
>> +}
>> +
>> +static void gen_clz32_i32(TCGv_i32 d, TCGv_i32 n)
> 
> ... but not tcg_gen_clz32_i32()?

The tcg_gen_clz_i32 primitive has a third argument to specify the result for n == 0. 
Passing that third argument is exactly what gen_clz32_i32 is doing.


r~
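
To illustrate the point in the reply above, here is a minimal reference model in plain C. It is not QEMU code: the names model_clz32 and model_clrsb32 are hypothetical, and the sketch only assumes the semantics described in the thread, namely that a count-leading-zeros operation needs a caller-chosen result for n == 0, while a count-leading-redundant-sign-bits operation is defined for every input and so needs no extra argument.

```c
#include <stdint.h>

/*
 * Hypothetical model of a 32-bit count-leading-zeros with an explicit
 * result for zero input. The patch passes 32 as that value.
 */
static uint32_t model_clz32(uint32_t n, uint32_t zero_val)
{
    uint32_t count = 0;

    if (n == 0) {
        return zero_val;
    }
    while (!(n & 0x80000000u)) {
        n <<= 1;
        count++;
    }
    return count;
}

/*
 * Hypothetical model of count-leading-redundant-sign-bits: the number
 * of bits directly below the sign bit that are equal to it.
 */
static uint32_t model_clrsb32(uint32_t n)
{
    /* Invert negative inputs so the run of sign bits becomes a run of zeros. */
    uint32_t x = (n & 0x80000000u) ? ~n : n;

    /* x always has its top bit clear, so the count is at least 1. */
    return model_clz32(x, 32) - 1;
}
```

With this model, model_clz32(0, 32) yields 32, which is the behaviour the 32-bit CLZ entry relies on, whereas model_clrsb32 gives 31 for both 0 and 0xffffffff without any special-case argument.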

diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index 20cd0e851c..5c6c24f057 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -578,6 +578,11 @@  void gen_gvec_umaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 void gen_gvec_uminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                     uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
+void gen_gvec_cls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                  uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_clz(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                  uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
index f652520b65..834b2961c0 100644
--- a/target/arm/tcg/gengvec.c
+++ b/target/arm/tcg/gengvec.c
@@ -2358,3 +2358,38 @@  void gen_gvec_urhadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
     assert(vece <= MO_32);
     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &g[vece]);
 }
+
+void gen_gvec_cls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                  uint32_t opr_sz, uint32_t max_sz)
+{
+    static const GVecGen2 g[] = {
+        { .fni4 = gen_helper_neon_cls_s8,
+          .vece = MO_8 },
+        { .fni4 = gen_helper_neon_cls_s16,
+          .vece = MO_16 },
+        { .fni4 = tcg_gen_clrsb_i32,
+          .vece = MO_32 },
+    };
+    assert(vece <= MO_32);
+    tcg_gen_gvec_2(rd_ofs, rn_ofs, opr_sz, max_sz, &g[vece]);
+}
+
+static void gen_clz32_i32(TCGv_i32 d, TCGv_i32 n)
+{
+    tcg_gen_clzi_i32(d, n, 32);
+}
+
+void gen_gvec_clz(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                  uint32_t opr_sz, uint32_t max_sz)
+{
+    static const GVecGen2 g[] = {
+        { .fni4 = gen_helper_neon_clz_u8,
+          .vece = MO_8 },
+        { .fni4 = gen_helper_neon_clz_u16,
+          .vece = MO_16 },
+        { .fni4 = gen_clz32_i32,
+          .vece = MO_32 },
+    };
+    assert(vece <= MO_32);
+    tcg_gen_gvec_2(rd_ofs, rn_ofs, opr_sz, max_sz, &g[vece]);
+}
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index c519f82452..4abc786cf6 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -10325,6 +10325,13 @@  static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
     }
 
     switch (opcode) {
+    case 0x4: /* CLZ, CLS */
+        if (u) {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clz, size);
+        } else {
+            gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cls, size);
+        }
+        return;
     case 0x5:
         if (u && size == 0) { /* NOT */
             gen_gvec_fn2(s, is_q, rd, rn, tcg_gen_gvec_not, 0);
@@ -10383,13 +10390,6 @@  static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
             if (size == 2) {
                 /* Special cases for 32 bit elements */
                 switch (opcode) {
-                case 0x4: /* CLS */
-                    if (u) {
-                        tcg_gen_clzi_i32(tcg_res, tcg_op, 32);
-                    } else {
-                        tcg_gen_clrsb_i32(tcg_res, tcg_op);
-                    }
-                    break;
                 case 0x2f: /* FABS */
                     gen_vfp_abss(tcg_res, tcg_op);
                     break;
@@ -10454,21 +10454,6 @@  static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
                         gen_helper_neon_cnt_u8(tcg_res, tcg_op);
                     }
                     break;
-                case 0x4: /* CLS, CLZ */
-                    if (u) {
-                        if (size == 0) {
-                            gen_helper_neon_clz_u8(tcg_res, tcg_op);
-                        } else {
-                            gen_helper_neon_clz_u16(tcg_res, tcg_op);
-                        }
-                    } else {
-                        if (size == 0) {
-                            gen_helper_neon_cls_s8(tcg_res, tcg_op);
-                        } else {
-                            gen_helper_neon_cls_s16(tcg_res, tcg_op);
-                        }
-                    }
-                    break;
                 default:
                 case 0x7: /* SQABS, SQNEG */
                     g_assert_not_reached();
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index 9c8829ad7d..1c89a53272 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -3120,6 +3120,8 @@  DO_2MISC_VEC(VCGT0, gen_gvec_cgt0)
 DO_2MISC_VEC(VCLE0, gen_gvec_cle0)
 DO_2MISC_VEC(VCGE0, gen_gvec_cge0)
 DO_2MISC_VEC(VCLT0, gen_gvec_clt0)
+DO_2MISC_VEC(VCLS, gen_gvec_cls)
+DO_2MISC_VEC(VCLZ, gen_gvec_clz)
 
 static bool trans_VMVN(DisasContext *s, arg_2misc *a)
 {
@@ -3227,33 +3229,6 @@  static bool trans_VREV16(DisasContext *s, arg_2misc *a)
     return do_2misc(s, a, gen_rev16);
 }
 
-static bool trans_VCLS(DisasContext *s, arg_2misc *a)
-{
-    static NeonGenOneOpFn * const fn[] = {
-        gen_helper_neon_cls_s8,
-        gen_helper_neon_cls_s16,
-        gen_helper_neon_cls_s32,
-        NULL,
-    };
-    return do_2misc(s, a, fn[a->size]);
-}
-
-static void do_VCLZ_32(TCGv_i32 rd, TCGv_i32 rm)
-{
-    tcg_gen_clzi_i32(rd, rm, 32);
-}
-
-static bool trans_VCLZ(DisasContext *s, arg_2misc *a)
-{
-    static NeonGenOneOpFn * const fn[] = {
-        gen_helper_neon_clz_u8,
-        gen_helper_neon_clz_u16,
-        do_VCLZ_32,
-        NULL,
-    };
-    return do_2misc(s, a, fn[a->size]);
-}
-
 static bool trans_VCNT(DisasContext *s, arg_2misc *a)
 {
     if (a->size != 0) {
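
For readers who want to check the per-element results the new expanders are expected to produce, the following is a rough, self-contained model in plain C. It is a sketch under stated assumptions, not the QEMU implementation: lane_clz, lane_cls and apply_lanes are hypothetical names, and the model assumes that each 32-bit chunk holds packed lanes of 8 << vece bits, with vece limited to the three sizes the patch asserts (MO_8, MO_16, MO_32).

```c
#include <assert.h>
#include <stdint.h>

/* Leading zeros within the low `bits` bits of x; returns `bits` for x == 0. */
static unsigned lane_clz(uint32_t x, unsigned bits)
{
    unsigned count = 0;

    for (uint32_t mask = 1u << (bits - 1); mask != 0 && !(x & mask); mask >>= 1) {
        count++;
    }
    return count;
}

/* Leading redundant sign bits within a lane of `bits` bits. */
static unsigned lane_cls(uint32_t x, unsigned bits)
{
    uint32_t lane_mask = bits == 32 ? 0xffffffffu : (1u << bits) - 1;

    if ((x >> (bits - 1)) & 1) {
        x = ~x & lane_mask;   /* turn the run of sign bits into zeros */
    }
    return lane_clz(x, bits) - 1;
}

typedef unsigned (*lane_fn)(uint32_t, unsigned);

/* Apply fn to every (8 << vece)-bit lane packed into one 32-bit word. */
static uint32_t apply_lanes(uint32_t word, unsigned vece, lane_fn fn)
{
    assert(vece <= 2);   /* mirrors the patch's assert(vece <= MO_32) */

    unsigned bits = 8u << vece;
    uint32_t lane_mask = bits == 32 ? 0xffffffffu : (1u << bits) - 1;
    uint32_t result = 0;

    for (unsigned shift = 0; shift < 32; shift += bits) {
        uint32_t lane = (word >> shift) & lane_mask;
        result |= (uint32_t)fn(lane, bits) << shift;
    }
    return result;
}
```

For example, apply_lanes(0x00010080, 0, lane_clz) treats the word as four byte lanes and gives 0x08070800, and apply_lanes(0xffff0001, 1, lane_cls) treats it as two halfword lanes and gives 0x000f000e.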
