[PULL,08/51] target/arm: Implement MVE VCADD

Message ID: 20210901103653.13435-9-peter.maydell@linaro.org (mailing list archive)
State: New, archived

Headers

DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 1006360724
From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PULL 08/51] target/arm: Implement MVE VCADD
Date: Wed,  1 Sep 2021 11:36:10 +0100
Message-Id: <20210901103653.13435-9-peter.maydell@linaro.org>
In-Reply-To: <20210901103653.13435-1-peter.maydell@linaro.org>
References: <20210901103653.13435-1-peter.maydell@linaro.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Received-SPF: pass client-ip=2a00:1450:4864:20::435;
 envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x435.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"
 <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org>

Commit Message

Peter Maydell Sept. 1, 2021, 10:36 a.m. UTC

Implement the MVE VCADD insn.  Note that here the size bit has the
opposite sense to that in the other 2-operand fp insns.
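
To illustrate the reversed size bit, here is a minimal standalone sketch
of the two decodings of bit 20. It is not the QEMU decoder: the names
SIZE_16, SIZE_32, fp_size() and fp_size_rev() are hypothetical, and the
encoding of the size field as 1 for 16 bit and 2 for 32 bit is an
assumption made for the example.

```c
/* Assumed element-size encoding for this sketch: 1 = 16 bit, 2 = 32 bit. */
enum { SIZE_16 = 1, SIZE_32 = 2 };

/*
 * Usual 2-operand fp insns: bit 20 is 1 for 16 bit, 0 for 32 bit.
 * (Hypothetical helper, named for illustration only.)
 */
int fp_size(int bit20)
{
    return SIZE_32 - bit20;
}

/*
 * VCADD: bit 20 is 0 for 16 bit, 1 for 32 bit, so adding one to the
 * bit gives the size directly.
 */
int fp_size_rev(int bit20)
{
    return bit20 + 1;
}
```

With these definitions the same bit value selects opposite element sizes
in the two encodings, which is why VCADD gets its own format in
mve.decode rather than reusing @2op_fp.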

We don't check for the sz == 1 && Qd == Qm UNPREDICTABLE case,
because that would mean we can't use the DO_2OP_FP macro in
translate-mve.c.
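
As a rough sketch of what the helper computes, the function below does
the per-lane add/subtract pairing on four 32-bit floats, buffering all
results before writing them back, as the helper in this patch does. It
is an illustration only: vcadd_f32_sketch() and its rot270 parameter are
hypothetical names, and predication and float_status handling are left
out entirely.

```c
#include <stddef.h>

/*
 * Illustrative only: complex add with rotation on four floats laid out
 * as interleaved (even, odd) lane pairs. rot270 == 0 selects the
 * 90 degree form, nonzero selects the 270 degree form.
 * No predication and no fp exception flags are modelled.
 */
void vcadd_f32_sketch(float *d, const float *n, const float *m, int rot270)
{
    float r[4];
    size_t e;

    /* Calculate all results first so that d may alias n or m. */
    for (e = 0; e < 4; e++) {
        if (!(e & 1)) {
            /* Even lane pairs with the odd lane of m. */
            r[e] = rot270 ? n[e] + m[e + 1] : n[e] - m[e + 1];
        } else {
            /* Odd lane pairs with the even lane of m. */
            r[e] = rot270 ? n[e] - m[e - 1] : n[e] + m[e - 1];
        }
    }
    for (e = 0; e < 4; e++) {
        d[e] = r[e];
    }
}
```

Because the results go through the temporary array, calling the sketch
with d pointing at the same storage as m gives the same values as
calling it with a separate destination.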

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/helper-mve.h    |  6 ++++++
 target/arm/mve.decode      |  8 ++++++++
 target/arm/mve_helper.c    | 40 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-mve.c |  4 +++-
 4 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
index 370876d7934..42eba8ea96d 100644
--- a/target/arm/helper-mve.h
+++ b/target/arm/helper-mve.h
@@ -428,6 +428,12 @@  DEF_HELPER_FLAGS_4(mve_vmaxnms, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
 DEF_HELPER_FLAGS_4(mve_vminnmh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
 DEF_HELPER_FLAGS_4(mve_vminnms, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
 
+DEF_HELPER_FLAGS_4(mve_vfcadd90h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+DEF_HELPER_FLAGS_4(mve_vfcadd90s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+
+DEF_HELPER_FLAGS_4(mve_vfcadd270h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+DEF_HELPER_FLAGS_4(mve_vfcadd270s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr)
+
 DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
index cdbfaa4245b..c728c7089ac 100644
--- a/target/arm/mve.decode
+++ b/target/arm/mve.decode
@@ -29,6 +29,8 @@ 
 # 2 operand fp insns have size in bit 20: 1 for 16 bit, 0 for 32 bit,
 # like Neon FP insns.
 %2op_fp_size 20:1 !function=neon_3same_fp_size
+# VCADD is an exception, where bit 20 is 0 for 16 bit and 1 for 32 bit
+%2op_fp_size_rev 20:1 !function=plus_1
 
 # 1imm format immediate
 %imm_28_16_0 28:1 16:3 0:4
@@ -125,6 +127,9 @@ 
 @2op_fp .... .... .... .... .... .... .... .... &2op \
         qd=%qd qn=%qn qm=%qm size=%2op_fp_size
 
+@2op_fp_size_rev .... .... .... .... .... .... .... .... &2op \
+                 qd=%qd qn=%qn qm=%qm size=%2op_fp_size_rev
+
 # Vector loads and stores
 
 # Widening loads and narrowing stores:
@@ -631,3 +636,6 @@  VABD_fp           1111 1111 0 . 1 . ... 0 ... 0 1101 . 1 . 0 ... 0 @2op_fp
 
 VMAXNM            1111 1111 0 . 0 . ... 0 ... 0 1111 . 1 . 1 ... 0 @2op_fp
 VMINNM            1111 1111 0 . 1 . ... 0 ... 0 1111 . 1 . 1 ... 0 @2op_fp
+
+VCADD90_fp        1111 1100 1 . 0 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev
+VCADD270_fp       1111 1101 1 . 0 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
index d6bc686c985..2cc8b3e11b7 100644
--- a/target/arm/mve_helper.c
+++ b/target/arm/mve_helper.c
@@ -2854,3 +2854,43 @@  static inline float32 float32_abd(float32 a, float32 b, float_status *s)
 DO_2OP_FP_ALL(vfabd, abd)
 DO_2OP_FP_ALL(vmaxnm, maxnum)
 DO_2OP_FP_ALL(vminnm, minnum)
+
+#define DO_VCADD_FP(OP, ESIZE, TYPE, FN0, FN1)                          \
+    void HELPER(glue(mve_, OP))(CPUARMState *env,                       \
+                                void *vd, void *vn, void *vm)           \
+    {                                                                   \
+        TYPE *d = vd, *n = vn, *m = vm;                                 \
+        TYPE r[16 / ESIZE];                                             \
+        uint16_t tm, mask = mve_element_mask(env);                      \
+        unsigned e;                                                     \
+        float_status *fpst;                                             \
+        float_status scratch_fpst;                                      \
+        /* Calculate all results first to avoid overwriting inputs */   \
+        for (e = 0, tm = mask; e < 16 / ESIZE; e++, tm >>= ESIZE) {     \
+            if ((tm & MAKE_64BIT_MASK(0, ESIZE)) == 0) {                \
+                r[e] = 0;                                               \
+                continue;                                               \
+            }                                                           \
+            fpst = (ESIZE == 2) ? &env->vfp.standard_fp_status_f16 :    \
+                &env->vfp.standard_fp_status;                           \
+            if (!(tm & 1)) {                                            \
+                /* We need the result but without updating flags */     \
+                scratch_fpst = *fpst;                                   \
+                fpst = &scratch_fpst;                                   \
+            }                                                           \
+            if (!(e & 1)) {                                             \
+                r[e] = FN0(n[H##ESIZE(e)], m[H##ESIZE(e + 1)], fpst);   \
+            } else {                                                    \
+                r[e] = FN1(n[H##ESIZE(e)], m[H##ESIZE(e - 1)], fpst);   \
+            }                                                           \
+        }                                                               \
+        for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) {              \
+            mergemask(&d[H##ESIZE(e)], r[e], mask);                     \
+        }                                                               \
+        mve_advance_vpt(env);                                           \
+    }
+
+DO_VCADD_FP(vfcadd90h, 2, float16, float16_sub, float16_add)
+DO_VCADD_FP(vfcadd90s, 4, float32, float32_sub, float32_add)
+DO_VCADD_FP(vfcadd270h, 2, float16, float16_add, float16_sub)
+DO_VCADD_FP(vfcadd270s, 4, float32, float32_add, float32_sub)
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
index 98282335820..6203e3ff916 100644
--- a/target/arm/translate-mve.c
+++ b/target/arm/translate-mve.c
@@ -852,6 +852,8 @@  DO_2OP_FP(VMUL_fp, vfmul)
 DO_2OP_FP(VABD_fp, vfabd)
 DO_2OP_FP(VMAXNM, vmaxnm)
 DO_2OP_FP(VMINNM, vminnm)
+DO_2OP_FP(VCADD90_fp, vfcadd90)
+DO_2OP_FP(VCADD270_fp, vfcadd270)
 
 static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
                           MVEGenTwoOpScalarFn fn)
@@ -883,7 +885,7 @@  static bool do_2op_scalar(DisasContext *s, arg_2scalar *a,
     return true;
 }
 
-#define DO_2OP_SCALAR(INSN, FN) \
+#define DO_2OP_SCALAR(INSN, FN)                                 \
     static bool trans_##INSN(DisasContext *s, arg_2scalar *a)   \
     {                                                           \
         static MVEGenTwoOpScalarFn * const fns[] = {            \
