From patchwork Tue Jun 20 09:37:29 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285446
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D6BA4EB64D7 for ; Tue, 20 Jun 2023 09:39:34 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXop-000752-Df; Tue, 20 Jun 2023 05:38:35 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXol-00072h-CL for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:31 -0400
Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXoe-0006F1-35 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:28 -0400
Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cxe+qIc5FkgiUHAA--.14662S3; Tue, 20 Jun 2023 17:38:16 +0800 (CST)
Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S3; Tue, 20 Jun 2023 17:38:15 +0800 (CST)
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 01/46] target/loongarch: Add LASX data type XReg
Date: Tue, 20 Jun 2023 17:37:29 +0800
Message-Id: <20230620093814.123650-2-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>
MIME-Version: 1.0
X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S3
X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/
X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7
 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx
 nUUI43ZEXa7xR_UUUUUUUUU==
Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

Signed-off-by: Song Gao
---
 linux-user/loongarch64/signal.c |  1 +
 target/loongarch/cpu.c          |  1 +
 target/loongarch/cpu.h          | 14 +++++++++
 target/loongarch/gdbstub.c      |  1 +
 target/loongarch/internals.h    | 22 --------------
 target/loongarch/lsx_helper.c   |  1 +
 target/loongarch/machine.c      | 40 ++++++++++++++++++++++++--
 target/loongarch/vec.h          | 51 +++++++++++++++++++++++++++++++++
 8 files changed, 106 insertions(+), 25 deletions(-)
 create mode 100644 target/loongarch/vec.h

diff --git a/linux-user/loongarch64/signal.c b/linux-user/loongarch64/signal.c
index bb8efb1172..39572c1190 100644
--- a/linux-user/loongarch64/signal.c
+++ b/linux-user/loongarch64/signal.c
@@ -12,6 +12,7 @@
 #include "linux-user/trace.h"

 #include "target/loongarch/internals.h"
+#include "target/loongarch/vec.h"

 /* FP context was used */
 #define SC_USED_FP (1 << 0)
diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index ad93ecac92..5037cfc02c 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -18,6 +18,7 @@
 #include "cpu-csr.h"
 #include "sysemu/reset.h"
 #include "tcg/tcg.h"
+#include "vec.h"

 const char * const regnames[32] = {
     "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h
index b23f38c3d5..347950b4d0 100644
--- a/target/loongarch/cpu.h
+++ b/target/loongarch/cpu.h
@@ -259,9 +259,23 @@ typedef union VReg {
     Int128 Q[LSX_LEN / 128];
 }VReg;

+#define LASX_LEN (256)
+typedef union XReg {
+    int8_t XB[LASX_LEN / 8];
+    int16_t XH[LASX_LEN / 16];
+    int32_t XW[LASX_LEN / 32];
+    int64_t XD[LASX_LEN / 64];
+    uint8_t UXB[LASX_LEN / 8];
+    uint16_t UXH[LASX_LEN / 16];
+    uint32_t UXW[LASX_LEN / 32];
+    uint64_t UXD[LASX_LEN / 64];
+    Int128 XQ[LASX_LEN / 128];
+} XReg;
+
 typedef union fpr_t fpr_t;
 union fpr_t {
     VReg vreg;
+    XReg xreg;
 };

 struct LoongArchTLB {
diff --git a/target/loongarch/gdbstub.c b/target/loongarch/gdbstub.c
index 0752fff924..94c427f4da 100644
--- a/target/loongarch/gdbstub.c
+++ b/target/loongarch/gdbstub.c
@@ -11,6 +11,7 @@
 #include "internals.h"
 #include "exec/gdbstub.h"
 #include "gdbstub/helpers.h"
+#include "vec.h"

 uint64_t read_fcc(CPULoongArchState *env)
 {
diff --git a/target/loongarch/internals.h b/target/loongarch/internals.h
index 7b0f29c942..c492863cc5 100644
--- a/target/loongarch/internals.h
+++ b/target/loongarch/internals.h
@@ -21,28 +21,6 @@
 /* Global bit for huge page */
 #define LOONGARCH_HGLOBAL_SHIFT 12

-#if HOST_BIG_ENDIAN
-#define B(x) B[15 - (x)]
-#define H(x) H[7 - (x)]
-#define W(x) W[3 - (x)]
-#define D(x) D[1 - (x)]
-#define UB(x) UB[15 - (x)]
-#define UH(x) UH[7 - (x)]
-#define UW(x) UW[3 - (x)]
-#define UD(x) UD[1 -(x)]
-#define Q(x) Q[x]
-#else
-#define B(x) B[x]
-#define H(x) H[x]
-#define W(x) W[x]
-#define D(x) D[x]
-#define UB(x) UB[x]
-#define UH(x) UH[x]
-#define UW(x) UW[x]
-#define UD(x) UD[x]
-#define Q(x) Q[x]
-#endif
-
 void loongarch_translate_init(void);

 void loongarch_cpu_dump_state(CPUState *cpu, FILE *f, int flags);
diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c
index 9571f0aef0..b231a2798b 100644
--- a/target/loongarch/lsx_helper.c
+++ b/target/loongarch/lsx_helper.c
@@ -12,6 +12,7 @@
 #include "fpu/softfloat.h"
 #include "internals.h"
 #include "tcg/tcg.h"
+#include "vec.h"

 #define DO_ADD(a, b) (a + b)
 #define DO_SUB(a, b) (a - b)
diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
index d8ac99c9a4..3fbf68d7ff 100644
--- a/target/loongarch/machine.c
+++ b/target/loongarch/machine.c
@@ -8,7 +8,7 @@
 #include "qemu/osdep.h"
 #include "cpu.h"
 #include "migration/cpu.h"
-#include "internals.h"
+#include "vec.h"

 static const VMStateDescription vmstate_fpu_reg = {
     .name = "fpu_reg",
@@ -76,6 +76,39 @@ static const VMStateDescription vmstate_lsx = {
     },
 };

+static const VMStateDescription vmstate_lasxh_reg = {
+    .name = "lasxh_reg",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT64(UXD(2), XReg),
+        VMSTATE_UINT64(UXD(3), XReg),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+#define VMSTATE_LASXH_REGS(_field, _state, _start) \
+    VMSTATE_STRUCT_SUB_ARRAY(_field, _state, _start, 32, 0, \
+                             vmstate_lasxh_reg, fpr_t)
+
+static bool lasx_needed(void *opaque)
+{
+    LoongArchCPU *cpu = opaque;
+
+    return FIELD_EX64(cpu->env.cpucfg[2], CPUCFG2, LASX);
+}
+
+static const VMStateDescription vmstate_lasx = {
+    .name = "cpu/lasx",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = lasx_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_LASXH_REGS(env.fpr, LoongArchCPU, 0),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 /* TLB state */
 const VMStateDescription vmstate_tlb = {
     .name = "cpu/tlb",
@@ -92,8 +125,8 @@ const VMStateDescription vmstate_tlb = {
 /* LoongArch CPU state */
 const VMStateDescription vmstate_loongarch_cpu = {
     .name = "cpu",
-    .version_id = 1,
-    .minimum_version_id = 1,
+    .version_id = 2,
+    .minimum_version_id = 2,
     .fields = (VMStateField[]) {
         VMSTATE_UINTTL_ARRAY(env.gpr, LoongArchCPU, 32),
         VMSTATE_UINTTL(env.pc, LoongArchCPU),
@@ -163,6 +196,7 @@ const VMStateDescription vmstate_loongarch_cpu = {
     .subsections = (const VMStateDescription*[]) {
         &vmstate_fpu,
         &vmstate_lsx,
+        &vmstate_lasx,
         NULL
     }
 };
diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
new file mode 100644
index 0000000000..a89cdb8d45
--- /dev/null
+++ b/target/loongarch/vec.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * QEMU LoongArch vector utilities
+ *
+ * Copyright (c) 2023 Loongson Technology Corporation Limited
+ */
+
+#ifndef LOONGARCH_VEC_H
+#define LOONGARCH_VEC_H
+
+#if HOST_BIG_ENDIAN
+#define B(x) B[15 - (x)]
+#define H(x) H[7 - (x)]
+#define W(x) W[3 - (x)]
+#define D(x) D[1 - (x)]
+#define UB(x) UB[15 - (x)]
+#define UH(x) UH[7 - (x)]
+#define UW(x) UW[3 - (x)]
+#define UD(x) UD[1 - (x)]
+#define Q(x) Q[x]
+#define XB(x) XB[31 - (x)]
+#define XH(x) XH[15 - (x)]
+#define XW(x) XW[7 - (x)]
+#define XD(x) XD[3 - (x)]
+#define UXB(x) UXB[31 - (x)]
+#define UXH(x) UXH[15 - (x)]
+#define UXW(x) UXW[7 - (x)]
+#define UXD(x) UXD[3 - (x)]
+#define XQ(x) XQ[1 - (x)]
+#else
+#define B(x) B[x]
+#define H(x) H[x]
+#define W(x) W[x]
+#define D(x) D[x]
+#define UB(x) UB[x]
+#define UH(x) UH[x]
+#define UW(x) UW[x]
+#define UD(x) UD[x]
+#define Q(x) Q[x]
+#define XB(x) XB[x]
+#define XH(x) XH[x]
+#define XW(x) XW[x]
+#define XD(x) XD[x]
+#define UXB(x) UXB[x]
+#define UXH(x) UXH[x]
+#define UXW(x) UXW[x]
+#define UXD(x) UXD[x]
+#define XQ(x) XQ[x]
+#endif /* HOST_BIG_ENDIAN */
+
+#endif /* LOONGARCH_VEC_H */

From patchwork Tue Jun 20 09:37:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285448
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 02/46] target/loongarch: meson.build support build LASX
Date: Tue, 20 Jun 2023 17:37:30 +0800
Message-Id: <20230620093814.123650-3-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

Signed-off-by: Song Gao
---
 target/loongarch/insn_trans/trans_lasx.c.inc | 6 ++++++
 target/loongarch/lasx_helper.c               | 6 ++++++
 target/loongarch/meson.build                 | 1 +
 target/loongarch/translate.c                 | 1 +
 4 files changed, 14 insertions(+)
 create mode 100644 target/loongarch/insn_trans/trans_lasx.c.inc
 create mode 100644 target/loongarch/lasx_helper.c

diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
new file mode 100644
index 0000000000..56a9839255
--- /dev/null
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * LASX translate functions
+ * Copyright (c) 2023 Loongson Technology Corporation Limited
+ */
+
diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c
new file mode 100644
index 0000000000..1754790a3a
--- /dev/null
+++ b/target/loongarch/lasx_helper.c
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * QEMU LoongArch LASX helper functions.
+ *
+ * Copyright (c) 2023 Loongson Technology Corporation Limited
+ */
diff --git a/target/loongarch/meson.build b/target/loongarch/meson.build
index 1117a51c52..90a5a21977 100644
--- a/target/loongarch/meson.build
+++ b/target/loongarch/meson.build
@@ -12,6 +12,7 @@ loongarch_tcg_ss.add(files(
   'translate.c',
   'gdbstub.c',
   'lsx_helper.c',
+  'lasx_helper.c',
 ))
 loongarch_tcg_ss.add(zlib)

diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c
index 3146a2d4ac..6bf2d726d6 100644
--- a/target/loongarch/translate.c
+++ b/target/loongarch/translate.c
@@ -220,6 +220,7 @@ static void set_fpr(int reg_num, TCGv val)
 #include "insn_trans/trans_branch.c.inc"
 #include "insn_trans/trans_privileged.c.inc"
 #include "insn_trans/trans_lsx.c.inc"
+#include "insn_trans/trans_lasx.c.inc"

 static void loongarch_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
 {

From patchwork Tue Jun 20 09:37:31 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285475
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 03/46] target/loongarch: Add CHECK_ASXE macro to check LASX enable
Date: Tue, 20 Jun 2023 17:37:31 +0800
Message-Id: <20230620093814.123650-4-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

Signed-off-by: Song Gao
Reviewed-by: Richard Henderson
---
 target/loongarch/cpu.c                       |  2 ++
 target/loongarch/cpu.h                       |  2 ++
 target/loongarch/insn_trans/trans_lasx.c.inc | 10 ++++++++++
 3 files changed, 14 insertions(+)

diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index 5037cfc02c..c9f9cbb19d 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -54,6 +54,7 @@ static const char * const excp_names[] = {
     [EXCCODE_DBP] = "Debug breakpoint",
     [EXCCODE_BCE] = "Bound Check Exception",
     [EXCCODE_SXD] = "128 bit vector instructions Disable exception",
+    [EXCCODE_ASXD] = "256 bit vector instructions Disable exception",
 };

 const char *loongarch_exception_name(int32_t exception)
@@ -189,6 +190,7 @@ static void loongarch_cpu_do_interrupt(CPUState *cs)
     case EXCCODE_FPD:
     case EXCCODE_FPE:
     case EXCCODE_SXD:
+    case EXCCODE_ASXD:
         env->CSR_BADV = env->pc;
         QEMU_FALLTHROUGH;
     case EXCCODE_BCE:
diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h
index 347950b4d0..6e8d247ae0 100644
--- a/target/loongarch/cpu.h
+++ b/target/loongarch/cpu.h
@@ -440,6 +440,7 @@ static inline int cpu_mmu_index(CPULoongArchState *env, bool ifetch)
 #define HW_FLAGS_CRMD_PG R_CSR_CRMD_PG_MASK /* 0x10 */
 #define HW_FLAGS_EUEN_FPE 0x04
 #define HW_FLAGS_EUEN_SXE 0x08
+#define HW_FLAGS_EUEN_ASXE 0x10

 static inline void cpu_get_tb_cpu_state(CPULoongArchState *env,
                                         target_ulong *pc,
@@ -451,6 +452,7 @@ static inline void cpu_get_tb_cpu_state(CPULoongArchState *env,
     *flags = env->CSR_CRMD & (R_CSR_CRMD_PLV_MASK | R_CSR_CRMD_PG_MASK);
     *flags |= FIELD_EX64(env->CSR_EUEN, CSR_EUEN, FPE) * HW_FLAGS_EUEN_FPE;
     *flags |= FIELD_EX64(env->CSR_EUEN, CSR_EUEN, SXE) * HW_FLAGS_EUEN_SXE;
+    *flags |= FIELD_EX64(env->CSR_EUEN, CSR_EUEN, ASXE) * HW_FLAGS_EUEN_ASXE;
 }

 void loongarch_cpu_list(void);
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 56a9839255..75a77f5dce 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -4,3 +4,13 @@
  * Copyright (c) 2023 Loongson Technology Corporation Limited
  */

+#ifndef CONFIG_USER_ONLY
+#define CHECK_ASXE do { \
+    if ((ctx->base.tb->flags & HW_FLAGS_EUEN_ASXE) == 0) { \
+        generate_exception(ctx, EXCCODE_ASXD); \
+        return true; \
+    } \
+} while (0)
+#else
+#define CHECK_ASXE
+#endif

From patchwork Tue Jun 20 09:37:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285474
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 04/46] target/loongarch: Implement xvadd/xvsub
Date: Tue, 20 Jun 2023 17:37:32 +0800
Message-Id: <20230620093814.123650-5-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

This patch includes:
- XVADD.{B/H/W/D/Q};
- XVSUB.{B/H/W/D/Q}.

Signed-off-by: Song Gao
---
 target/loongarch/disas.c                     | 23 ++++++++
 target/loongarch/insn_trans/trans_lasx.c.inc | 59 ++++++++++++++++++++
 target/loongarch/insns.decode                | 23 ++++++++
 target/loongarch/translate.c                 | 17 ++++++
 4 files changed, 122 insertions(+)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 5c402d944d..696f78c491 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1695,3 +1695,26 @@ INSN_LSX(vstelm_d, vr_ii)
 INSN_LSX(vstelm_w, vr_ii)
 INSN_LSX(vstelm_h, vr_ii)
 INSN_LSX(vstelm_b, vr_ii)
+
+#define INSN_LASX(insn, type) \
+static bool trans_##insn(DisasContext *ctx, arg_##type * a) \
+{ \
+    output_##type(ctx, a, #insn); \
+    return true; \
+}
+
+static void output_xxx(DisasContext *ctx, arg_xxx * a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "x%d, x%d, x%d", a->xd, a->xj, a->xk);
+}
+
+INSN_LASX(xvadd_b, xxx)
+INSN_LASX(xvadd_h, xxx)
+INSN_LASX(xvadd_w, xxx)
+INSN_LASX(xvadd_d, xxx)
+INSN_LASX(xvadd_q, xxx)
+INSN_LASX(xvsub_b, xxx)
+INSN_LASX(xvsub_h, xxx)
+INSN_LASX(xvsub_w, xxx)
+INSN_LASX(xvsub_d, xxx)
+INSN_LASX(xvsub_q, xxx)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 75a77f5dce..c918522f96 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -14,3 +14,62 @@
 #else
 #define CHECK_ASXE
 #endif
+
+static bool gvec_xxx(DisasContext *ctx, arg_xxx *a, MemOp mop,
+                     void (*func)(unsigned, uint32_t, uint32_t,
+                                  uint32_t, uint32_t, uint32_t))
+{
+    uint32_t xd_ofs, xj_ofs, xk_ofs;
+
+    CHECK_ASXE;
+
+    xd_ofs = vec_full_offset(a->xd);
+    xj_ofs = vec_full_offset(a->xj);
+    xk_ofs = vec_full_offset(a->xk);
+
+    func(mop, xd_ofs, xj_ofs, xk_ofs, 32, ctx->vl / 8);
+    return true;
+}
+
+TRANS(xvadd_b, gvec_xxx, MO_8, tcg_gen_gvec_add)
+TRANS(xvadd_h, gvec_xxx, MO_16, tcg_gen_gvec_add)
+TRANS(xvadd_w, gvec_xxx, MO_32, tcg_gen_gvec_add)
+TRANS(xvadd_d, gvec_xxx, MO_64, tcg_gen_gvec_add)
+
+#define XVADDSUB_Q(NAME) \
+static bool trans_xv## NAME ##_q(DisasContext *ctx, arg_xxx *a) \
+{ \
+    TCGv_i64 rh, rl, ah, al, bh, bl; \
+    int i; \
+ \
+    CHECK_ASXE; \
+ \
+    rh = tcg_temp_new_i64(); \
+    rl = tcg_temp_new_i64(); \
+    ah = tcg_temp_new_i64(); \
+    al = tcg_temp_new_i64(); \
+    bh = tcg_temp_new_i64(); \
+    bl = tcg_temp_new_i64(); \
+ \
+    for (i = 0; i < 2; i++) { \
+        get_xreg64(ah, a->xj, 1 + i * 2); \
+        get_xreg64(al, a->xj, 0 + i * 2); \
+        get_xreg64(bh, a->xk, 1 + i * 2); \
+        get_xreg64(bl, a->xk, 0 + i * 2); \
+ \
+        tcg_gen_## NAME ##2_i64(rl, rh, al, ah, bl, bh); \
+ \
+        set_xreg64(rh, a->xd, 1 + i * 2); \
+        set_xreg64(rl, a->xd, 0 + i * 2); \
+    } \
+ \
+    return true; \
+}
+
+XVADDSUB_Q(add)
+XVADDSUB_Q(sub)
+
+TRANS(xvsub_b, gvec_xxx, MO_8, tcg_gen_gvec_sub)
+TRANS(xvsub_h, gvec_xxx, MO_16, tcg_gen_gvec_sub)
+TRANS(xvsub_w, gvec_xxx, MO_32, tcg_gen_gvec_sub)
+TRANS(xvsub_d, gvec_xxx, MO_64, tcg_gen_gvec_sub)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index c9c3bc2c73..bac1903975 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1296,3 +1296,26 @@ vstelm_d 0011 00010001 0 . ........ ..... ..... @vr_i8i1
 vstelm_w 0011 00010010 .. ........ ..... ..... @vr_i8i2
 vstelm_h 0011 0001010 ... ........ ..... ..... @vr_i8i3
 vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4
+
+#
+# LASX Argument sets
+#
+
+&xxx xd xj xk
+
+#
+# LASX Formats
+#
+
+@xxx .... ........ ..... xk:5 xj:5 xd:5 &xxx
+
+xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx
+xvadd_h 0111 01000000 10101 ..... ..... ..... @xxx
+xvadd_w 0111 01000000 10110 ..... ..... ..... @xxx
+xvadd_d 0111 01000000 10111 ..... ..... ..... @xxx
+xvadd_q 0111 01010010 11010 ..... ..... ..... @xxx
+xvsub_b 0111 01000000 11000 ..... ..... ..... @xxx
+xvsub_h 0111 01000000 11001 ..... ..... ..... @xxx
+xvsub_w 0111 01000000 11010 ..... ..... ..... @xxx
+xvsub_d 0111 01000000 11011 ..... ..... ..... @xxx
+xvsub_q 0111 01010010 11011 ..... ..... ..... @xxx
diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c
index 6bf2d726d6..5300e14815 100644
--- a/target/loongarch/translate.c
+++ b/target/loongarch/translate.c
@@ -18,6 +18,7 @@
 #include "fpu/softfloat.h"
 #include "translate.h"
 #include "internals.h"
+#include "vec.h"

 /* Global register indices */
 TCGv cpu_gpr[32], cpu_pc;
@@ -48,6 +49,18 @@ static inline void set_vreg64(TCGv_i64 src, int regno, int index)
                    offsetof(CPULoongArchState, fpr[regno].vreg.D(index)));
 }

+static inline void get_xreg64(TCGv_i64 dest, int regno, int index)
+{
+    tcg_gen_ld_i64(dest, cpu_env,
+                   offsetof(CPULoongArchState, fpr[regno].xreg.XD(index)));
+}
+
+static inline void set_xreg64(TCGv_i64 src, int regno, int index)
+{
+    tcg_gen_st_i64(src, cpu_env,
+                   offsetof(CPULoongArchState, fpr[regno].xreg.XD(index)));
+}
+
 static inline int plus_1(DisasContext *ctx, int x)
 {
     return x + 1;
@@ -119,6 +132,10 @@ static void loongarch_tr_init_disas_context(DisasContextBase *dcbase,
         ctx->vl = LSX_LEN;
     }

+    if (FIELD_EX64(env->cpucfg[2], CPUCFG2, LASX)) {
+        ctx->vl = LASX_LEN;
+    }
+
     ctx->zero = tcg_constant_tl(0);
 }


From patchwork Tue Jun 20 09:37:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285477
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 05/46] target/loongarch: Implement xvreplgr2vr
Date: Tue, 20 Jun 2023 17:37:33 +0800
Message-Id: <20230620093814.123650-6-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

This patch includes:
- XVREPLGR2VR.{B/H/W/D}.

Signed-off-by: Song Gao
---
 target/loongarch/disas.c                     | 10 ++++++++++
 target/loongarch/insn_trans/trans_lasx.c.inc | 16 ++++++++++++++++
 target/loongarch/insns.decode                |  8 ++++++++
 3 files changed, 34 insertions(+)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 696f78c491..78e1fd19ac 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1708,6 +1708,11 @@ static void output_xxx(DisasContext *ctx, arg_xxx * a, const char *mnemonic)
     output(ctx, mnemonic, "x%d, x%d, x%d", a->xd, a->xj, a->xk);
 }

+static void output_xr(DisasContext *ctx, arg_xr *a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "x%d, r%d", a->xd, a->rj);
+}
+
 INSN_LASX(xvadd_b, xxx)
 INSN_LASX(xvadd_h, xxx)
 INSN_LASX(xvadd_w, xxx)
@@ -1718,3 +1723,8 @@ INSN_LASX(xvsub_h, xxx)
 INSN_LASX(xvsub_w, xxx)
 INSN_LASX(xvsub_d, xxx)
 INSN_LASX(xvsub_q, xxx)
+
+INSN_LASX(xvreplgr2vr_b, xr)
+INSN_LASX(xvreplgr2vr_h, xr)
+INSN_LASX(xvreplgr2vr_w, xr)
+INSN_LASX(xvreplgr2vr_d, xr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index c918522f96..d394a4f40a 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -73,3 +73,19 @@ TRANS(xvsub_b, gvec_xxx, MO_8, tcg_gen_gvec_sub)
 TRANS(xvsub_h, gvec_xxx, MO_16, tcg_gen_gvec_sub)
 TRANS(xvsub_w, gvec_xxx, MO_32, tcg_gen_gvec_sub)
 TRANS(xvsub_d, gvec_xxx, MO_64, tcg_gen_gvec_sub)
+
+static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop)
+{
+    TCGv src = gpr_src(ctx, a->rj, EXT_NONE);
+
+    CHECK_ASXE;
+
+    tcg_gen_gvec_dup_i64(mop, vec_full_offset(a->xd),
+                         32, ctx->vl / 8, src);
+    return true;
+}
+
+TRANS(xvreplgr2vr_b, gvec_dupx, MO_8)
+TRANS(xvreplgr2vr_h, gvec_dupx, MO_16)
+TRANS(xvreplgr2vr_w, gvec_dupx, MO_32)
+TRANS(xvreplgr2vr_d, gvec_dupx, MO_64)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index bac1903975..2eab7f6a98 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1302,12 +1302,15 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4
 #

 &xxx xd xj xk
+&xr xd rj
+

 #
 # LASX Formats
 #

 @xxx .... ........ ..... xk:5 xj:5 xd:5 &xxx
+@xr .... ........ ..... ..... rj:5 xd:5 &xr

 xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx
 xvadd_h 0111 01000000 10101 ..... ..... ..... @xxx
@@ -1319,3 +1322,8 @@ xvsub_h 0111 01000000 11001 ..... ..... ..... @xxx
 xvsub_w 0111 01000000 11010 ..... ..... ..... @xxx
 xvsub_d 0111 01000000 11011 ..... ..... ..... @xxx
 xvsub_q 0111 01010010 11011 ..... ..... ..... @xxx
+
+xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr
+xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr
+xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr
+xvreplgr2vr_d 0111 01101001 11110 00011 ..... ..... @xr

From patchwork Tue Jun 20 09:37:34 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285482
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 06/46] target/loongarch: Implement xvaddi/xvsubi
Date: Tue, 20 Jun 2023 17:37:34 +0800
Message-Id: <20230620093814.123650-7-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>
X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVADDI.{B/H/W/D}U; - XVSUBI.{B/H/W/D}U. Signed-off-by: Song Gao --- target/loongarch/disas.c | 14 ++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 37 ++++++++++++++++++++ target/loongarch/insns.decode | 12 ++++++- 3 files changed, 62 insertions(+), 1 deletion(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 78e1fd19ac..7b84766fa8 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1708,6 +1708,11 @@ static void output_xxx(DisasContext *ctx, arg_xxx * a, const char *mnemonic) output(ctx, mnemonic, "x%d, x%d, x%d", a->xd, a->xj, a->xk); } +static void output_xx_i(DisasContext *ctx, arg_xx_i *a, const char *mnemonic) +{ + output(ctx, mnemonic, "x%d, x%d, 0x%x", a->xd, a->xj, a->imm); +} + static void output_xr(DisasContext *ctx, arg_xr *a, const char *mnemonic) { output(ctx, mnemonic, "x%d, r%d", a->xd, a->rj); @@ -1724,6 +1729,15 @@ INSN_LASX(xvsub_w, xxx) INSN_LASX(xvsub_d, xxx) INSN_LASX(xvsub_q, xxx) +INSN_LASX(xvaddi_bu, xx_i) +INSN_LASX(xvaddi_hu, xx_i) +INSN_LASX(xvaddi_wu, xx_i) +INSN_LASX(xvaddi_du, xx_i) +INSN_LASX(xvsubi_bu, xx_i) +INSN_LASX(xvsubi_hu, xx_i) +INSN_LASX(xvsubi_wu, xx_i) 
+INSN_LASX(xvsubi_du, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index d394a4f40a..a42e92f930 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -31,6 +31,34 @@ static bool gvec_xxx(DisasContext *ctx, arg_xxx *a, MemOp mop, return true; } +static bool gvec_xx_i(DisasContext *ctx, arg_xx_i *a, MemOp mop, + void (*func)(unsigned, uint32_t, uint32_t, + int64_t, uint32_t, uint32_t)) +{ + uint32_t xd_ofs, xj_ofs; + + CHECK_ASXE; + + xd_ofs = vec_full_offset(a->xd); + xj_ofs = vec_full_offset(a->xj); + + func(mop, xd_ofs, xj_ofs, a->imm , 32, ctx->vl / 8); + return true; +} + +static bool gvec_xsubi(DisasContext *ctx, arg_xx_i *a, MemOp mop) +{ + uint32_t xd_ofs, xj_ofs; + + CHECK_ASXE; + + xd_ofs = vec_full_offset(a->xd); + xj_ofs = vec_full_offset(a->xj); + + tcg_gen_gvec_addi(mop, xd_ofs, xj_ofs, -a->imm, 32, ctx->vl / 8); + return true; +} + TRANS(xvadd_b, gvec_xxx, MO_8, tcg_gen_gvec_add) TRANS(xvadd_h, gvec_xxx, MO_16, tcg_gen_gvec_add) TRANS(xvadd_w, gvec_xxx, MO_32, tcg_gen_gvec_add) @@ -74,6 +102,15 @@ TRANS(xvsub_h, gvec_xxx, MO_16, tcg_gen_gvec_sub) TRANS(xvsub_w, gvec_xxx, MO_32, tcg_gen_gvec_sub) TRANS(xvsub_d, gvec_xxx, MO_64, tcg_gen_gvec_sub) +TRANS(xvaddi_bu, gvec_xx_i, MO_8, tcg_gen_gvec_addi) +TRANS(xvaddi_hu, gvec_xx_i, MO_16, tcg_gen_gvec_addi) +TRANS(xvaddi_wu, gvec_xx_i, MO_32, tcg_gen_gvec_addi) +TRANS(xvaddi_du, gvec_xx_i, MO_64, tcg_gen_gvec_addi) +TRANS(xvsubi_bu, gvec_xsubi, MO_8) +TRANS(xvsubi_hu, gvec_xsubi, MO_16) +TRANS(xvsubi_wu, gvec_xsubi, MO_32) +TRANS(xvsubi_du, gvec_xsubi, MO_64) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 2eab7f6a98..0bed748216 100644 --- 
a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1303,7 +1303,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 &xxx xd xj xk &xr xd rj - +&xx_i xd xj imm # # LASX Formats @@ -1311,6 +1311,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 @xxx .... ........ ..... xk:5 xj:5 xd:5 &xxx @xr .... ........ ..... ..... rj:5 xd:5 &xr +@xx_ui5 .... ........ ..... imm:5 xj:5 xd:5 &xx_i xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx xvadd_h 0111 01000000 10101 ..... ..... ..... @xxx @@ -1323,6 +1324,15 @@ xvsub_w 0111 01000000 11010 ..... ..... ..... @xxx xvsub_d 0111 01000000 11011 ..... ..... ..... @xxx xvsub_q 0111 01010010 11011 ..... ..... ..... @xxx +xvaddi_bu 0111 01101000 10100 ..... ..... ..... @xx_ui5 +xvaddi_hu 0111 01101000 10101 ..... ..... ..... @xx_ui5 +xvaddi_wu 0111 01101000 10110 ..... ..... ..... @xx_ui5 +xvaddi_du 0111 01101000 10111 ..... ..... ..... @xx_ui5 +xvsubi_bu 0111 01101000 11000 ..... ..... ..... @xx_ui5 +xvsubi_hu 0111 01101000 11001 ..... ..... ..... @xx_ui5 +xvsubi_wu 0111 01101000 11010 ..... ..... ..... @xx_ui5 +xvsubi_du 0111 01101000 11011 ..... ..... ..... @xx_ui5 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr From patchwork Tue Jun 20 09:37:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285473 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 42837EB64D7 for ; Tue, 20 Jun 2023 09:41:26 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXoo-00074a-VZ; Tue, 20 Jun 2023 05:38:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXoi-00072F-Qp for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:31 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXoe-0006HG-J4 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:28 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8DxCeqKc5FkjiUHAA--.14722S3; Tue, 20 Jun 2023 17:38:18 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S9; Tue, 20 Jun 2023 17:38:18 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 07/46] target/loongarch: Implement xvneg Date: Tue, 20 Jun 2023 17:37:35 +0800 Message-Id: <20230620093814.123650-8-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S9 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ 
X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVNEG.{B/H/W/D}. Signed-off-by: Song Gao --- target/loongarch/disas.c | 10 ++++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 20 ++++++++++++++++++++ target/loongarch/insns.decode | 7 +++++++ 3 files changed, 37 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 7b84766fa8..eefd16e3f1 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1713,6 +1713,11 @@ static void output_xx_i(DisasContext *ctx, arg_xx_i *a, const char *mnemonic) output(ctx, mnemonic, "x%d, x%d, 0x%x", a->xd, a->xj, a->imm); } +static void output_xx(DisasContext *ctx, arg_xx *a, const char *mnemonic) +{ + output(ctx, mnemonic, "x%d, x%d", a->xd, a->xj); +} + static void output_xr(DisasContext *ctx, arg_xr *a, const char *mnemonic) { output(ctx, mnemonic, "x%d, r%d", a->xd, a->rj); @@ -1738,6 +1743,11 @@ INSN_LASX(xvsubi_hu, xx_i) INSN_LASX(xvsubi_wu, xx_i) INSN_LASX(xvsubi_du, xx_i) +INSN_LASX(xvneg_b, xx) +INSN_LASX(xvneg_h, xx) +INSN_LASX(xvneg_w, xx) +INSN_LASX(xvneg_d, xx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc 
b/target/loongarch/insn_trans/trans_lasx.c.inc index a42e92f930..cea944c3ba 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -59,6 +59,21 @@ static bool gvec_xsubi(DisasContext *ctx, arg_xx_i *a, MemOp mop) return true; } +static bool gvec_xx(DisasContext *ctx, arg_xx *a, MemOp mop, + void (*func)(unsigned, uint32_t, uint32_t, + uint32_t, uint32_t)) +{ + uint32_t xd_ofs, xj_ofs; + + CHECK_ASXE; + + xd_ofs = vec_full_offset(a->xd); + xj_ofs = vec_full_offset(a->xj); + + func(mop, xd_ofs, xj_ofs, 32, ctx->vl / 8); + return true; +} + TRANS(xvadd_b, gvec_xxx, MO_8, tcg_gen_gvec_add) TRANS(xvadd_h, gvec_xxx, MO_16, tcg_gen_gvec_add) TRANS(xvadd_w, gvec_xxx, MO_32, tcg_gen_gvec_add) @@ -111,6 +126,11 @@ TRANS(xvsubi_hu, gvec_xsubi, MO_16) TRANS(xvsubi_wu, gvec_xsubi, MO_32) TRANS(xvsubi_du, gvec_xsubi, MO_64) +TRANS(xvneg_b, gvec_xx, MO_8, tcg_gen_gvec_neg) +TRANS(xvneg_h, gvec_xx, MO_16, tcg_gen_gvec_neg) +TRANS(xvneg_w, gvec_xx, MO_32, tcg_gen_gvec_neg) +TRANS(xvneg_d, gvec_xx, MO_64, tcg_gen_gvec_neg) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 0bed748216..78452c622c 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1301,6 +1301,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 # LASX Argument sets # +&xx xd xj &xxx xd xj xk &xr xd rj &xx_i xd xj imm @@ -1309,6 +1310,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 # LASX Formats # +@xx .... ........ ..... ..... xj:5 xd:5 &xx @xxx .... ........ ..... xk:5 xj:5 xd:5 &xxx @xr .... ........ ..... ..... rj:5 xd:5 &xr @xx_ui5 .... ........ ..... imm:5 xj:5 xd:5 &xx_i @@ -1333,6 +1335,11 @@ xvsubi_hu 0111 01101000 11001 ..... ..... ..... @xx_ui5 xvsubi_wu 0111 01101000 11010 ..... ..... ..... @xx_ui5 xvsubi_du 0111 01101000 11011 ..... ..... ..... 
@xx_ui5 +xvneg_b 0111 01101001 11000 01100 ..... ..... @xx +xvneg_h 0111 01101001 11000 01101 ..... ..... @xx +xvneg_w 0111 01101001 11000 01110 ..... ..... @xx +xvneg_d 0111 01101001 11000 01111 ..... ..... @xx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr From patchwork Tue Jun 20 09:37:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285479 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7BE80EB64D7 for ; Tue, 20 Jun 2023 09:42:53 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXom-00073N-8C; Tue, 20 Jun 2023 05:38:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXoi-00072C-DK for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:31 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXod-0006Ho-EF for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:28 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxX+uKc5FkkCUHAA--.14745S3; Tue, 20 Jun 2023 17:38:18 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S10; Tue, 20 Jun 2023 17:38:18 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 08/46] 
target/loongarch: Implement xvsadd/xvssub Date: Tue, 20 Jun 2023 17:37:36 +0800 Message-Id: <20230620093814.123650-9-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S10 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVSADD.{B/H/W/D}[U]; - XVSSUB.{B/H/W/D}[U]. 
Signed-off-by: Song Gao --- target/loongarch/disas.c | 17 +++++++++++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 17 +++++++++++++++++ target/loongarch/insns.decode | 18 ++++++++++++++++++ 3 files changed, 52 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index eefd16e3f1..2a2993cb95 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1748,6 +1748,23 @@ INSN_LASX(xvneg_h, xx) INSN_LASX(xvneg_w, xx) INSN_LASX(xvneg_d, xx) +INSN_LASX(xvsadd_b, xxx) +INSN_LASX(xvsadd_h, xxx) +INSN_LASX(xvsadd_w, xxx) +INSN_LASX(xvsadd_d, xxx) +INSN_LASX(xvsadd_bu, xxx) +INSN_LASX(xvsadd_hu, xxx) +INSN_LASX(xvsadd_wu, xxx) +INSN_LASX(xvsadd_du, xxx) +INSN_LASX(xvssub_b, xxx) +INSN_LASX(xvssub_h, xxx) +INSN_LASX(xvssub_w, xxx) +INSN_LASX(xvssub_d, xxx) +INSN_LASX(xvssub_bu, xxx) +INSN_LASX(xvssub_hu, xxx) +INSN_LASX(xvssub_wu, xxx) +INSN_LASX(xvssub_du, xxx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index cea944c3ba..ec68193686 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -131,6 +131,23 @@ TRANS(xvneg_h, gvec_xx, MO_16, tcg_gen_gvec_neg) TRANS(xvneg_w, gvec_xx, MO_32, tcg_gen_gvec_neg) TRANS(xvneg_d, gvec_xx, MO_64, tcg_gen_gvec_neg) +TRANS(xvsadd_b, gvec_xxx, MO_8, tcg_gen_gvec_ssadd) +TRANS(xvsadd_h, gvec_xxx, MO_16, tcg_gen_gvec_ssadd) +TRANS(xvsadd_w, gvec_xxx, MO_32, tcg_gen_gvec_ssadd) +TRANS(xvsadd_d, gvec_xxx, MO_64, tcg_gen_gvec_ssadd) +TRANS(xvsadd_bu, gvec_xxx, MO_8, tcg_gen_gvec_usadd) +TRANS(xvsadd_hu, gvec_xxx, MO_16, tcg_gen_gvec_usadd) +TRANS(xvsadd_wu, gvec_xxx, MO_32, tcg_gen_gvec_usadd) +TRANS(xvsadd_du, gvec_xxx, MO_64, tcg_gen_gvec_usadd) +TRANS(xvssub_b, gvec_xxx, MO_8, tcg_gen_gvec_sssub) +TRANS(xvssub_h, gvec_xxx, MO_16, tcg_gen_gvec_sssub) +TRANS(xvssub_w, gvec_xxx, MO_32, 
tcg_gen_gvec_sssub) +TRANS(xvssub_d, gvec_xxx, MO_64, tcg_gen_gvec_sssub) +TRANS(xvssub_bu, gvec_xxx, MO_8, tcg_gen_gvec_ussub) +TRANS(xvssub_hu, gvec_xxx, MO_16, tcg_gen_gvec_ussub) +TRANS(xvssub_wu, gvec_xxx, MO_32, tcg_gen_gvec_ussub) +TRANS(xvssub_du, gvec_xxx, MO_64, tcg_gen_gvec_ussub) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 78452c622c..be706fe0f7 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1340,6 +1340,24 @@ xvneg_h 0111 01101001 11000 01101 ..... ..... @xx xvneg_w 0111 01101001 11000 01110 ..... ..... @xx xvneg_d 0111 01101001 11000 01111 ..... ..... @xx +xvsadd_b 0111 01000100 01100 ..... ..... ..... @xxx +xvsadd_h 0111 01000100 01101 ..... ..... ..... @xxx +xvsadd_w 0111 01000100 01110 ..... ..... ..... @xxx +xvsadd_d 0111 01000100 01111 ..... ..... ..... @xxx +xvsadd_bu 0111 01000100 10100 ..... ..... ..... @xxx +xvsadd_hu 0111 01000100 10101 ..... ..... ..... @xxx +xvsadd_wu 0111 01000100 10110 ..... ..... ..... @xxx +xvsadd_du 0111 01000100 10111 ..... ..... ..... @xxx + +xvssub_b 0111 01000100 10000 ..... ..... ..... @xxx +xvssub_h 0111 01000100 10001 ..... ..... ..... @xxx +xvssub_w 0111 01000100 10010 ..... ..... ..... @xxx +xvssub_d 0111 01000100 10011 ..... ..... ..... @xxx +xvssub_bu 0111 01000100 11000 ..... ..... ..... @xxx +xvssub_hu 0111 01000100 11001 ..... ..... ..... @xxx +xvssub_wu 0111 01000100 11010 ..... ..... ..... @xxx +xvssub_du 0111 01000100 11011 ..... ..... ..... @xxx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr From patchwork Tue Jun 20 09:37:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285498 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 34A53EB64D7 for ; Tue, 20 Jun 2023 09:45:41 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqW-00031f-F9; Tue, 20 Jun 2023 05:40:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXqQ-0002hO-5A for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:14 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqM-0006aQ-DP for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:12 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Bxb+uMc5FkkiUHAA--.14737S3; Tue, 20 Jun 2023 17:38:20 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S11; Tue, 20 Jun 2023 17:38:18 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 09/46] target/loongarch: Implement xvhaddw/xvhsubw Date: Tue, 20 Jun 2023 17:37:37 +0800 Message-Id: <20230620093814.123650-10-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S11 X-CM-SenderInfo: 
5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVHADDW.{H.B/W.H/D.W/Q.D/HU.BU/WU.HU/DU.WU/QU.DU}; - XVHSUBW.{H.B/W.H/D.W/Q.D/HU.BU/WU.HU/DU.WU/QU.DU}. Signed-off-by: Song Gao --- target/loongarch/disas.c | 17 ++++ target/loongarch/helper.h | 18 ++++ target/loongarch/insn_trans/trans_lasx.c.inc | 30 +++++++ target/loongarch/insns.decode | 18 ++++ target/loongarch/lasx_helper.c | 90 ++++++++++++++++++++ target/loongarch/lsx_helper.c | 3 - target/loongarch/vec.h | 3 + 7 files changed, 176 insertions(+), 3 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 2a2993cb95..770359524e 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1765,6 +1765,23 @@ INSN_LASX(xvssub_hu, xxx) INSN_LASX(xvssub_wu, xxx) INSN_LASX(xvssub_du, xxx) +INSN_LASX(xvhaddw_h_b, xxx) +INSN_LASX(xvhaddw_w_h, xxx) +INSN_LASX(xvhaddw_d_w, xxx) +INSN_LASX(xvhaddw_q_d, xxx) +INSN_LASX(xvhaddw_hu_bu, xxx) +INSN_LASX(xvhaddw_wu_hu, xxx) +INSN_LASX(xvhaddw_du_wu, xxx) +INSN_LASX(xvhaddw_qu_du, xxx) +INSN_LASX(xvhsubw_h_b, xxx) +INSN_LASX(xvhsubw_w_h, xxx) +INSN_LASX(xvhsubw_d_w, xxx) +INSN_LASX(xvhsubw_q_d, xxx) +INSN_LASX(xvhsubw_hu_bu, xxx) +INSN_LASX(xvhsubw_wu_hu, 
xxx) +INSN_LASX(xvhsubw_du_wu, xxx) +INSN_LASX(xvhsubw_qu_du, xxx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index b9de77d926..db2deaff79 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -696,3 +696,21 @@ DEF_HELPER_4(vextrins_b, void, env, i32, i32, i32) DEF_HELPER_4(vextrins_h, void, env, i32, i32, i32) DEF_HELPER_4(vextrins_w, void, env, i32, i32, i32) DEF_HELPER_4(vextrins_d, void, env, i32, i32, i32) + +/* LoongArch LASX */ +DEF_HELPER_4(xvhaddw_h_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvhaddw_w_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvhaddw_d_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvhaddw_q_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvhaddw_hu_bu, void, env, i32, i32, i32) +DEF_HELPER_4(xvhaddw_wu_hu, void, env, i32, i32, i32) +DEF_HELPER_4(xvhaddw_du_wu, void, env, i32, i32, i32) +DEF_HELPER_4(xvhaddw_qu_du, void, env, i32, i32, i32) +DEF_HELPER_4(xvhsubw_h_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvhsubw_w_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvhsubw_d_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvhsubw_q_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvhsubw_hu_bu, void, env, i32, i32, i32) +DEF_HELPER_4(xvhsubw_wu_hu, void, env, i32, i32, i32) +DEF_HELPER_4(xvhsubw_du_wu, void, env, i32, i32, i32) +DEF_HELPER_4(xvhsubw_qu_du, void, env, i32, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index ec68193686..aa0e35b228 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -15,6 +15,19 @@ #define CHECK_ASXE #endif +static bool gen_xxx(DisasContext *ctx, arg_xxx *a, + void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) +{ + TCGv_i32 xd = tcg_constant_i32(a->xd); + TCGv_i32 xj = tcg_constant_i32(a->xj); + TCGv_i32 xk = tcg_constant_i32(a->xk); + + CHECK_ASXE; + + func(cpu_env, xd, xj, 
xk); + return true; +} + static bool gvec_xxx(DisasContext *ctx, arg_xxx *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t)) @@ -148,6 +161,23 @@ TRANS(xvssub_hu, gvec_xxx, MO_16, tcg_gen_gvec_ussub) TRANS(xvssub_wu, gvec_xxx, MO_32, tcg_gen_gvec_ussub) TRANS(xvssub_du, gvec_xxx, MO_64, tcg_gen_gvec_ussub) +TRANS(xvhaddw_h_b, gen_xxx, gen_helper_xvhaddw_h_b) +TRANS(xvhaddw_w_h, gen_xxx, gen_helper_xvhaddw_w_h) +TRANS(xvhaddw_d_w, gen_xxx, gen_helper_xvhaddw_d_w) +TRANS(xvhaddw_q_d, gen_xxx, gen_helper_xvhaddw_q_d) +TRANS(xvhaddw_hu_bu, gen_xxx, gen_helper_xvhaddw_hu_bu) +TRANS(xvhaddw_wu_hu, gen_xxx, gen_helper_xvhaddw_wu_hu) +TRANS(xvhaddw_du_wu, gen_xxx, gen_helper_xvhaddw_du_wu) +TRANS(xvhaddw_qu_du, gen_xxx, gen_helper_xvhaddw_qu_du) +TRANS(xvhsubw_h_b, gen_xxx, gen_helper_xvhsubw_h_b) +TRANS(xvhsubw_w_h, gen_xxx, gen_helper_xvhsubw_w_h) +TRANS(xvhsubw_d_w, gen_xxx, gen_helper_xvhsubw_d_w) +TRANS(xvhsubw_q_d, gen_xxx, gen_helper_xvhsubw_q_d) +TRANS(xvhsubw_hu_bu, gen_xxx, gen_helper_xvhsubw_hu_bu) +TRANS(xvhsubw_wu_hu, gen_xxx, gen_helper_xvhsubw_wu_hu) +TRANS(xvhsubw_du_wu, gen_xxx, gen_helper_xvhsubw_du_wu) +TRANS(xvhsubw_qu_du, gen_xxx, gen_helper_xvhsubw_qu_du) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index be706fe0f7..48556b2267 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1358,6 +1358,24 @@ xvssub_hu 0111 01000100 11001 ..... ..... ..... @xxx xvssub_wu 0111 01000100 11010 ..... ..... ..... @xxx xvssub_du 0111 01000100 11011 ..... ..... ..... @xxx +xvhaddw_h_b 0111 01000101 01000 ..... ..... ..... @xxx +xvhaddw_w_h 0111 01000101 01001 ..... ..... ..... @xxx +xvhaddw_d_w 0111 01000101 01010 ..... ..... ..... @xxx +xvhaddw_q_d 0111 01000101 01011 ..... ..... ..... @xxx +xvhaddw_hu_bu 0111 01000101 10000 ..... ..... ..... 
@xxx +xvhaddw_wu_hu 0111 01000101 10001 ..... ..... ..... @xxx +xvhaddw_du_wu 0111 01000101 10010 ..... ..... ..... @xxx +xvhaddw_qu_du 0111 01000101 10011 ..... ..... ..... @xxx + +xvhsubw_h_b 0111 01000101 01100 ..... ..... ..... @xxx +xvhsubw_w_h 0111 01000101 01101 ..... ..... ..... @xxx +xvhsubw_d_w 0111 01000101 01110 ..... ..... ..... @xxx +xvhsubw_q_d 0111 01000101 01111 ..... ..... ..... @xxx +xvhsubw_hu_bu 0111 01000101 10100 ..... ..... ..... @xxx +xvhsubw_wu_hu 0111 01000101 10101 ..... ..... ..... @xxx +xvhsubw_du_wu 0111 01000101 10110 ..... ..... ..... @xxx +xvhsubw_qu_du 0111 01000101 10111 ..... ..... ..... @xxx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 1754790a3a..d86381ff8a 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -4,3 +4,93 @@ * * Copyright (c) 2023 Loongson Technology Corporation Limited */ + +#include "qemu/osdep.h" +#include "cpu.h" +#include "exec/exec-all.h" +#include "exec/helper-proto.h" +#include "internals.h" +#include "vec.h" + +#define XDO_ODD_EVEN(NAME, BIT, E1, E2, DO_OP) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + typedef __typeof(Xd->E1(0)) TD; \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E1(i) = DO_OP((TD)Xj->E2(2 * i + 1), (TD)Xk->E2(2 * i)); \ + } \ +} + +XDO_ODD_EVEN(xvhaddw_h_b, 16, XH, XB, DO_ADD) +XDO_ODD_EVEN(xvhaddw_w_h, 32, XW, XH, DO_ADD) +XDO_ODD_EVEN(xvhaddw_d_w, 64, XD, XW, DO_ADD) + +void HELPER(xvhaddw_q_d)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t xk) +{ + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + XReg *Xk = &(env->fpr[xk].xreg); + + 
Xd->XQ(0) = int128_add(int128_makes64(Xj->XD(1)), + int128_makes64(Xk->XD(0))); + Xd->XQ(1) = int128_add(int128_makes64(Xj->XD(3)), + int128_makes64(Xk->XD(2))); +} + +XDO_ODD_EVEN(xvhsubw_h_b, 16, XH, XB, DO_SUB) +XDO_ODD_EVEN(xvhsubw_w_h, 32, XW, XH, DO_SUB) +XDO_ODD_EVEN(xvhsubw_d_w, 64, XD, XW, DO_SUB) + +void HELPER(xvhsubw_q_d)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t xk) +{ + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + XReg *Xk = &(env->fpr[xk].xreg); + + Xd->XQ(0) = int128_sub(int128_makes64(Xj->XD(1)), + int128_makes64(Xk->XD(0))); + Xd->XQ(1) = int128_sub(int128_makes64(Xj->XD(3)), + int128_makes64(Xk->XD(2))); +} + +XDO_ODD_EVEN(xvhaddw_hu_bu, 16, UXH, UXB, DO_ADD) +XDO_ODD_EVEN(xvhaddw_wu_hu, 32, UXW, UXH, DO_ADD) +XDO_ODD_EVEN(xvhaddw_du_wu, 64, UXD, UXW, DO_ADD) + +void HELPER(xvhaddw_qu_du)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t xk) +{ + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + XReg *Xk = &(env->fpr[xk].xreg); + + Xd->XQ(0) = int128_add(int128_make64(Xj->UXD(1)), + int128_make64(Xk->UXD(0))); + Xd->XQ(1) = int128_add(int128_make64(Xj->UXD(3)), + int128_make64(Xk->UXD(2))); +} + +XDO_ODD_EVEN(xvhsubw_hu_bu, 16, UXH, UXB, DO_SUB) +XDO_ODD_EVEN(xvhsubw_wu_hu, 32, UXW, UXH, DO_SUB) +XDO_ODD_EVEN(xvhsubw_du_wu, 64, UXD, UXW, DO_SUB) + +void HELPER(xvhsubw_qu_du)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t xk) +{ + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + XReg *Xk = &(env->fpr[xk].xreg); + + Xd->XQ(0) = int128_sub(int128_make64(Xj->UXD(1)), + int128_make64(Xk->UXD(0))); + Xd->XQ(1) = int128_sub(int128_make64(Xj->UXD(3)), + int128_make64(Xk->UXD(2))); +} diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index b231a2798b..d79a65dfe2 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ -14,9 +14,6 @@ #include "tcg/tcg.h" #include "vec.h" -#define DO_ADD(a, b) (a + 
b) -#define DO_SUB(a, b) (a - b) - #define DO_ODD_EVEN(NAME, BIT, E1, E2, DO_OP) \ void HELPER(NAME)(CPULoongArchState *env, \ uint32_t vd, uint32_t vj, uint32_t vk) \ diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index a89cdb8d45..7e71035e50 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -48,4 +48,7 @@ #define XQ(x) XQ[x] #endif /* HOST_BIG_ENDIAN */ +#define DO_ADD(a, b) (a + b) +#define DO_SUB(a, b) (a - b) + #endif /* LOONGARCH_VEC_H */ From patchwork Tue Jun 20 09:37:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285480 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 48E41EB64D8 for ; Tue, 20 Jun 2023 09:42:55 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXoq-00075M-Eg; Tue, 20 Jun 2023 05:38:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXoi-00072B-Bd for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:31 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXod-0006JK-A9 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:28 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cx8OiNc5FkliUHAA--.12756S3; Tue, 20 Jun 2023 17:38:21 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S12; Tue, 20 Jun 2023 17:38:20 +0800 (CST) From: Song Gao 
To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 10/46] target/loongarch: Implement xvaddw/xvsubw Date: Tue, 20 Jun 2023 17:37:38 +0800 Message-Id: <20230620093814.123650-11-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S12 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVADDW{EV/OD}.{H.B/W.H/D.W/Q.D}[U]; - XVSUBW{EV/OD}.{H.B/W.H/D.W/Q.D}[U]; - XVADDW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}. 
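For reviewers unfamiliar with the even/odd widening forms, the element-wise behaviour of the .H.B variants can be sketched in plain C as below. This is only an illustration under assumptions: the sketch_* names and SKETCH_LASX_BYTES are hypothetical and not part of this patch, the register is modelled as a flat array rather than XReg, and host endianness (which the XB()/XH() accessors in vec.h handle) is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: a 256-bit LASX register viewed as 32 bytes. */
#define SKETCH_LASX_BYTES 32

/*
 * Even form, modelled on xvaddwev.h.b: each 16-bit result is the sum of
 * the sign-extended even-indexed bytes of both sources.
 */
static void sketch_addwev_h_b(int16_t dst[SKETCH_LASX_BYTES / 2],
                              const int8_t xj[SKETCH_LASX_BYTES],
                              const int8_t xk[SKETCH_LASX_BYTES])
{
    assert(dst && xj && xk);
    for (size_t i = 0; i < SKETCH_LASX_BYTES / 2; i++) {
        /* Widen before adding so that 127 + 127 does not wrap. */
        dst[i] = (int16_t)((int16_t)xj[2 * i] + (int16_t)xk[2 * i]);
    }
}

/*
 * Odd form, modelled on xvaddwod.h.b: same as above, but both operands
 * come from the odd-indexed bytes.
 */
static void sketch_addwod_h_b(int16_t dst[SKETCH_LASX_BYTES / 2],
                              const int8_t xj[SKETCH_LASX_BYTES],
                              const int8_t xk[SKETCH_LASX_BYTES])
{
    assert(dst && xj && xk);
    for (size_t i = 0; i < SKETCH_LASX_BYTES / 2; i++) {
        dst[i] = (int16_t)((int16_t)xj[2 * i + 1] + (int16_t)xk[2 * i + 1]);
    }
}
```

The loops run over all 16 destination elements, i.e. the full 256-bit register; the unsigned and mixed-sign variants differ only in how the source elements are extended before the add or subtract.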
Signed-off-by: Song Gao --- target/loongarch/disas.c | 43 ++ target/loongarch/helper.h | 45 ++ target/loongarch/insn_trans/trans_lasx.c.inc | 410 +++++++++++++++++++ target/loongarch/insns.decode | 45 ++ target/loongarch/lasx_helper.c | 214 ++++++++++ 5 files changed, 757 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 770359524e..6e790f0959 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1782,6 +1782,49 @@ INSN_LASX(xvhsubw_wu_hu, xxx) INSN_LASX(xvhsubw_du_wu, xxx) INSN_LASX(xvhsubw_qu_du, xxx) +INSN_LASX(xvaddwev_h_b, xxx) +INSN_LASX(xvaddwev_w_h, xxx) +INSN_LASX(xvaddwev_d_w, xxx) +INSN_LASX(xvaddwev_q_d, xxx) +INSN_LASX(xvaddwod_h_b, xxx) +INSN_LASX(xvaddwod_w_h, xxx) +INSN_LASX(xvaddwod_d_w, xxx) +INSN_LASX(xvaddwod_q_d, xxx) +INSN_LASX(xvsubwev_h_b, xxx) +INSN_LASX(xvsubwev_w_h, xxx) +INSN_LASX(xvsubwev_d_w, xxx) +INSN_LASX(xvsubwev_q_d, xxx) +INSN_LASX(xvsubwod_h_b, xxx) +INSN_LASX(xvsubwod_w_h, xxx) +INSN_LASX(xvsubwod_d_w, xxx) +INSN_LASX(xvsubwod_q_d, xxx) + +INSN_LASX(xvaddwev_h_bu, xxx) +INSN_LASX(xvaddwev_w_hu, xxx) +INSN_LASX(xvaddwev_d_wu, xxx) +INSN_LASX(xvaddwev_q_du, xxx) +INSN_LASX(xvaddwod_h_bu, xxx) +INSN_LASX(xvaddwod_w_hu, xxx) +INSN_LASX(xvaddwod_d_wu, xxx) +INSN_LASX(xvaddwod_q_du, xxx) +INSN_LASX(xvsubwev_h_bu, xxx) +INSN_LASX(xvsubwev_w_hu, xxx) +INSN_LASX(xvsubwev_d_wu, xxx) +INSN_LASX(xvsubwev_q_du, xxx) +INSN_LASX(xvsubwod_h_bu, xxx) +INSN_LASX(xvsubwod_w_hu, xxx) +INSN_LASX(xvsubwod_d_wu, xxx) +INSN_LASX(xvsubwod_q_du, xxx) + +INSN_LASX(xvaddwev_h_bu_b, xxx) +INSN_LASX(xvaddwev_w_hu_h, xxx) +INSN_LASX(xvaddwev_d_wu_w, xxx) +INSN_LASX(xvaddwev_q_du_d, xxx) +INSN_LASX(xvaddwod_h_bu_b, xxx) +INSN_LASX(xvaddwod_w_hu_h, xxx) +INSN_LASX(xvaddwod_d_wu_w, xxx) +INSN_LASX(xvaddwod_q_du_d, xxx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index db2deaff79..2034576d87 100644 
--- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -714,3 +714,48 @@ DEF_HELPER_4(xvhsubw_hu_bu, void, env, i32, i32, i32) DEF_HELPER_4(xvhsubw_wu_hu, void, env, i32, i32, i32) DEF_HELPER_4(xvhsubw_du_wu, void, env, i32, i32, i32) DEF_HELPER_4(xvhsubw_qu_du, void, env, i32, i32, i32) + +DEF_HELPER_FLAGS_4(xvaddwev_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwev_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwev_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwev_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwod_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwod_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwod_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwod_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvsubwev_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwev_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwev_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwev_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwod_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwod_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwod_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwod_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvaddwev_h_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwev_w_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwev_d_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwev_q_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwod_h_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwod_w_hu, TCG_CALL_NO_RWG, void, ptr, ptr, 
ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwod_d_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwod_q_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvsubwev_h_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwev_w_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwev_d_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwev_q_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwod_h_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwod_w_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwod_d_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsubwod_q_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvaddwev_h_bu_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwev_w_hu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwev_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwev_q_du_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwod_h_bu_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwod_w_hu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwod_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvaddwod_q_du_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index aa0e35b228..0a574182db 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -178,6 +178,416 @@ TRANS(xvhsubw_wu_hu, gen_xxx, gen_helper_xvhsubw_wu_hu) TRANS(xvhsubw_du_wu, gen_xxx, gen_helper_xvhsubw_du_wu) TRANS(xvhsubw_qu_du, gen_xxx, gen_helper_xvhsubw_qu_du) +static void do_xvaddwev_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + 
static const TCGOpcode vecop_list[] = { + INDEX_op_shli_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vaddwev_s, + .fno = gen_helper_xvaddwev_h_b, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vaddwev_w_h, + .fniv = gen_vaddwev_s, + .fno = gen_helper_xvaddwev_w_h, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vaddwev_d_w, + .fniv = gen_vaddwev_s, + .fno = gen_helper_xvaddwev_d_w, + .opt_opc = vecop_list, + .vece = MO_64 + }, + { + .fno = gen_helper_xvaddwev_q_d, + .vece = MO_128 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvaddwev_h_b, gvec_xxx, MO_8, do_xvaddwev_s) +TRANS(xvaddwev_w_h, gvec_xxx, MO_16, do_xvaddwev_s) +TRANS(xvaddwev_d_w, gvec_xxx, MO_32, do_xvaddwev_s) +TRANS(xvaddwev_q_d, gvec_xxx, MO_64, do_xvaddwev_s) + +static void do_xvaddwod_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vaddwod_s, + .fno = gen_helper_xvaddwod_h_b, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vaddwod_w_h, + .fniv = gen_vaddwod_s, + .fno = gen_helper_xvaddwod_w_h, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vaddwod_d_w, + .fniv = gen_vaddwod_s, + .fno = gen_helper_xvaddwod_d_w, + .opt_opc = vecop_list, + .vece = MO_64 + }, + { + .fno = gen_helper_xvaddwod_q_d, + .vece = MO_128 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvaddwod_h_b, gvec_xxx, MO_8, do_xvaddwod_s) +TRANS(xvaddwod_w_h, gvec_xxx, MO_16, do_xvaddwod_s) +TRANS(xvaddwod_d_w, gvec_xxx, MO_32, do_xvaddwod_s) +TRANS(xvaddwod_q_d, gvec_xxx, MO_64, do_xvaddwod_s) + +static void do_xvsubwev_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static 
const TCGOpcode vecop_list[] = { + INDEX_op_shli_vec, INDEX_op_sari_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vsubwev_s, + .fno = gen_helper_xvsubwev_h_b, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vsubwev_w_h, + .fniv = gen_vsubwev_s, + .fno = gen_helper_xvsubwev_w_h, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vsubwev_d_w, + .fniv = gen_vsubwev_s, + .fno = gen_helper_xvsubwev_d_w, + .opt_opc = vecop_list, + .vece = MO_64 + }, + { + .fno = gen_helper_xvsubwev_q_d, + .vece = MO_128 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvsubwev_h_b, gvec_xxx, MO_8, do_xvsubwev_s) +TRANS(xvsubwev_w_h, gvec_xxx, MO_16, do_xvsubwev_s) +TRANS(xvsubwev_d_w, gvec_xxx, MO_32, do_xvsubwev_s) +TRANS(xvsubwev_q_d, gvec_xxx, MO_64, do_xvsubwev_s) + +static void do_xvsubwod_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vsubwod_s, + .fno = gen_helper_xvsubwod_h_b, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vsubwod_w_h, + .fniv = gen_vsubwod_s, + .fno = gen_helper_xvsubwod_w_h, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vsubwod_d_w, + .fniv = gen_vsubwod_s, + .fno = gen_helper_xvsubwod_d_w, + .opt_opc = vecop_list, + .vece = MO_64 + }, + { + .fno = gen_helper_xvsubwod_q_d, + .vece = MO_128 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvsubwod_h_b, gvec_xxx, MO_8, do_xvsubwod_s) +TRANS(xvsubwod_w_h, gvec_xxx, MO_16, do_xvsubwod_s) +TRANS(xvsubwod_d_w, gvec_xxx, MO_32, do_xvsubwod_s) +TRANS(xvsubwod_q_d, gvec_xxx, MO_64, do_xvsubwod_s) + +static void do_xvaddwev_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const 
TCGOpcode vecop_list[] = { + INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vaddwev_u, + .fno = gen_helper_xvaddwev_h_bu, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vaddwev_w_hu, + .fniv = gen_vaddwev_u, + .fno = gen_helper_xvaddwev_w_hu, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vaddwev_d_wu, + .fniv = gen_vaddwev_u, + .fno = gen_helper_xvaddwev_d_wu, + .opt_opc = vecop_list, + .vece = MO_64 + }, + { + .fno = gen_helper_xvaddwev_q_du, + .vece = MO_128 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvaddwev_h_bu, gvec_xxx, MO_8, do_xvaddwev_u) +TRANS(xvaddwev_w_hu, gvec_xxx, MO_16, do_xvaddwev_u) +TRANS(xvaddwev_d_wu, gvec_xxx, MO_32, do_xvaddwev_u) +TRANS(xvaddwev_q_du, gvec_xxx, MO_64, do_xvaddwev_u) + +static void do_xvaddwod_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vaddwod_u, + .fno = gen_helper_xvaddwod_h_bu, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vaddwod_w_hu, + .fniv = gen_vaddwod_u, + .fno = gen_helper_xvaddwod_w_hu, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vaddwod_d_wu, + .fniv = gen_vaddwod_u, + .fno = gen_helper_xvaddwod_d_wu, + .opt_opc = vecop_list, + .vece = MO_64 + }, + { + .fno = gen_helper_xvaddwod_q_du, + .vece = MO_128 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvaddwod_h_bu, gvec_xxx, MO_8, do_xvaddwod_u) +TRANS(xvaddwod_w_hu, gvec_xxx, MO_16, do_xvaddwod_u) +TRANS(xvaddwod_d_wu, gvec_xxx, MO_32, do_xvaddwod_u) +TRANS(xvaddwod_q_du, gvec_xxx, MO_64, do_xvaddwod_u) + +static void do_xvsubwev_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + 
INDEX_op_sub_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vsubwev_u, + .fno = gen_helper_xvsubwev_h_bu, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vsubwev_w_hu, + .fniv = gen_vsubwev_u, + .fno = gen_helper_xvsubwev_w_hu, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vsubwev_d_wu, + .fniv = gen_vsubwev_u, + .fno = gen_helper_xvsubwev_d_wu, + .opt_opc = vecop_list, + .vece = MO_64 + }, + { + .fno = gen_helper_xvsubwev_q_du, + .vece = MO_128 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvsubwev_h_bu, gvec_xxx, MO_8, do_xvsubwev_u) +TRANS(xvsubwev_w_hu, gvec_xxx, MO_16, do_xvsubwev_u) +TRANS(xvsubwev_d_wu, gvec_xxx, MO_32, do_xvsubwev_u) +TRANS(xvsubwev_q_du, gvec_xxx, MO_64, do_xvsubwev_u) + +static void do_xvsubwod_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vsubwod_u, + .fno = gen_helper_xvsubwod_h_bu, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vsubwod_w_hu, + .fniv = gen_vsubwod_u, + .fno = gen_helper_xvsubwod_w_hu, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vsubwod_d_wu, + .fniv = gen_vsubwod_u, + .fno = gen_helper_xvsubwod_d_wu, + .opt_opc = vecop_list, + .vece = MO_64 + }, + { + .fno = gen_helper_xvsubwod_q_du, + .vece = MO_128 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvsubwod_h_bu, gvec_xxx, MO_8, do_xvsubwod_u) +TRANS(xvsubwod_w_hu, gvec_xxx, MO_16, do_xvsubwod_u) +TRANS(xvsubwod_d_wu, gvec_xxx, MO_32, do_xvsubwod_u) +TRANS(xvsubwod_q_du, gvec_xxx, MO_64, do_xvsubwod_u) + +static void do_xvaddwev_u_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shli_vec, 
INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vaddwev_u_s, + .fno = gen_helper_xvaddwev_h_bu_b, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vaddwev_w_hu_h, + .fniv = gen_vaddwev_u_s, + .fno = gen_helper_xvaddwev_w_hu_h, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vaddwev_d_wu_w, + .fniv = gen_vaddwev_u_s, + .fno = gen_helper_xvaddwev_d_wu_w, + .opt_opc = vecop_list, + .vece = MO_64 + }, + { + .fno = gen_helper_xvaddwev_q_du_d, + .vece = MO_128 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvaddwev_h_bu_b, gvec_xxx, MO_8, do_xvaddwev_u_s) +TRANS(xvaddwev_w_hu_h, gvec_xxx, MO_16, do_xvaddwev_u_s) +TRANS(xvaddwev_d_wu_w, gvec_xxx, MO_32, do_xvaddwev_u_s) +TRANS(xvaddwev_q_du_d, gvec_xxx, MO_64, do_xvaddwev_u_s) + +static void do_xvaddwod_u_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vaddwod_u_s, + .fno = gen_helper_xvaddwod_h_bu_b, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vaddwod_w_hu_h, + .fniv = gen_vaddwod_u_s, + .fno = gen_helper_xvaddwod_w_hu_h, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vaddwod_d_wu_w, + .fniv = gen_vaddwod_u_s, + .fno = gen_helper_xvaddwod_d_wu_w, + .opt_opc = vecop_list, + .vece = MO_64 + }, + { + .fno = gen_helper_xvaddwod_q_du_d, + .vece = MO_128 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvaddwod_h_bu_b, gvec_xxx, MO_8, do_xvaddwod_u_s) +TRANS(xvaddwod_w_hu_h, gvec_xxx, MO_16, do_xvaddwod_u_s) +TRANS(xvaddwod_d_wu_w, gvec_xxx, MO_32, do_xvaddwod_u_s) +TRANS(xvaddwod_q_du_d, gvec_xxx, MO_64, do_xvaddwod_u_s) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, 
a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 48556b2267..1d177f9676 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1376,6 +1376,51 @@ xvhsubw_wu_hu 0111 01000101 10101 ..... ..... ..... @xxx xvhsubw_du_wu 0111 01000101 10110 ..... ..... ..... @xxx xvhsubw_qu_du 0111 01000101 10111 ..... ..... ..... @xxx +xvaddwev_h_b 0111 01000001 11100 ..... ..... ..... @xxx +xvaddwev_w_h 0111 01000001 11101 ..... ..... ..... @xxx +xvaddwev_d_w 0111 01000001 11110 ..... ..... ..... @xxx +xvaddwev_q_d 0111 01000001 11111 ..... ..... ..... @xxx +xvaddwod_h_b 0111 01000010 00100 ..... ..... ..... @xxx +xvaddwod_w_h 0111 01000010 00101 ..... ..... ..... @xxx +xvaddwod_d_w 0111 01000010 00110 ..... ..... ..... @xxx +xvaddwod_q_d 0111 01000010 00111 ..... ..... ..... @xxx + +xvsubwev_h_b 0111 01000010 00000 ..... ..... ..... @xxx +xvsubwev_w_h 0111 01000010 00001 ..... ..... ..... @xxx +xvsubwev_d_w 0111 01000010 00010 ..... ..... ..... @xxx +xvsubwev_q_d 0111 01000010 00011 ..... ..... ..... @xxx +xvsubwod_h_b 0111 01000010 01000 ..... ..... ..... @xxx +xvsubwod_w_h 0111 01000010 01001 ..... ..... ..... @xxx +xvsubwod_d_w 0111 01000010 01010 ..... ..... ..... @xxx +xvsubwod_q_d 0111 01000010 01011 ..... ..... ..... @xxx + +xvaddwev_h_bu 0111 01000010 11100 ..... ..... ..... @xxx +xvaddwev_w_hu 0111 01000010 11101 ..... ..... ..... @xxx +xvaddwev_d_wu 0111 01000010 11110 ..... ..... ..... @xxx +xvaddwev_q_du 0111 01000010 11111 ..... ..... ..... @xxx +xvaddwod_h_bu 0111 01000011 00100 ..... ..... ..... @xxx +xvaddwod_w_hu 0111 01000011 00101 ..... ..... ..... @xxx +xvaddwod_d_wu 0111 01000011 00110 ..... ..... ..... @xxx +xvaddwod_q_du 0111 01000011 00111 ..... ..... ..... @xxx + +xvsubwev_h_bu 0111 01000011 00000 ..... ..... ..... @xxx +xvsubwev_w_hu 0111 01000011 00001 ..... ..... ..... @xxx +xvsubwev_d_wu 0111 01000011 00010 ..... ..... ..... @xxx +xvsubwev_q_du 0111 01000011 00011 ..... ..... 
..... @xxx +xvsubwod_h_bu 0111 01000011 01000 ..... ..... ..... @xxx +xvsubwod_w_hu 0111 01000011 01001 ..... ..... ..... @xxx +xvsubwod_d_wu 0111 01000011 01010 ..... ..... ..... @xxx +xvsubwod_q_du 0111 01000011 01011 ..... ..... ..... @xxx + +xvaddwev_h_bu_b 0111 01000011 11100 ..... ..... ..... @xxx +xvaddwev_w_hu_h 0111 01000011 11101 ..... ..... ..... @xxx +xvaddwev_d_wu_w 0111 01000011 11110 ..... ..... ..... @xxx +xvaddwev_q_du_d 0111 01000011 11111 ..... ..... ..... @xxx +xvaddwod_h_bu_b 0111 01000100 00000 ..... ..... ..... @xxx +xvaddwod_w_hu_h 0111 01000100 00001 ..... ..... ..... @xxx +xvaddwod_d_wu_w 0111 01000100 00010 ..... ..... ..... @xxx +xvaddwod_q_du_d 0111 01000100 00011 ..... ..... ..... @xxx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index d86381ff8a..8e830e1f3c 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -94,3 +94,217 @@ void HELPER(xvhsubw_qu_du)(CPULoongArchState *env, Xd->XQ(1) = int128_sub(int128_make64(Xj->UXD(3)), int128_make64(Xk->UXD(2))); } + +#define XDO_EVEN(NAME, BIT, E1, E2, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + typedef __typeof(Xd->E1(0)) TD; \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E1(i) = DO_OP((TD)Xj->E2(2 * i), (TD)Xk->E2(2 * i)); \ + } \ +} + +#define XDO_ODD(NAME, BIT, E1, E2, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + typedef __typeof(Xd->E1(0)) TD; \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E1(i) = DO_OP((TD)Xj->E2(2 * i + 1), (TD)Xk->E2(2 * i + 1)); \ + } \ +} + +void HELPER(xvaddwev_q_d)(void *xd, void 
*xj, void *xk, uint32_t v) +{ + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + XReg *Xk = (XReg *)xk; + + Xd->XQ(0) = int128_add(int128_makes64(Xj->XD(0)), + int128_makes64(Xk->XD(0))); + Xd->XQ(1) = int128_add(int128_makes64(Xj->XD(2)), + int128_makes64(Xk->XD(2))); +} + +XDO_EVEN(xvaddwev_h_b, 16, XH, XB, DO_ADD) +XDO_EVEN(xvaddwev_w_h, 32, XW, XH, DO_ADD) +XDO_EVEN(xvaddwev_d_w, 64, XD, XW, DO_ADD) + +void HELPER(xvaddwod_q_d)(void *xd, void *xj, void *xk, uint32_t v) +{ + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + XReg *Xk = (XReg *)xk; + + Xd->XQ(0) = int128_add(int128_makes64(Xj->XD(1)), + int128_makes64(Xk->XD(1))); + Xd->XQ(1) = int128_add(int128_makes64(Xj->XD(3)), + int128_makes64(Xk->XD(3))); +} + +XDO_ODD(xvaddwod_h_b, 16, XH, XB, DO_ADD) +XDO_ODD(xvaddwod_w_h, 32, XW, XH, DO_ADD) +XDO_ODD(xvaddwod_d_w, 64, XD, XW, DO_ADD) + +void HELPER(xvsubwev_q_d)(void *xd, void *xj, void *xk, uint32_t v) +{ + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + XReg *Xk = (XReg *)xk; + + Xd->XQ(0) = int128_sub(int128_makes64(Xj->XD(0)), + int128_makes64(Xk->XD(0))); + Xd->XQ(1) = int128_sub(int128_makes64(Xj->XD(2)), + int128_makes64(Xk->XD(2))); +} + +XDO_EVEN(xvsubwev_h_b, 16, XH, XB, DO_SUB) +XDO_EVEN(xvsubwev_w_h, 32, XW, XH, DO_SUB) +XDO_EVEN(xvsubwev_d_w, 64, XD, XW, DO_SUB) + +void HELPER(xvsubwod_q_d)(void *xd, void *xj, void *xk, uint32_t v) +{ + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + XReg *Xk = (XReg *)xk; + + Xd->XQ(0) = int128_sub(int128_makes64(Xj->XD(1)), + int128_makes64(Xk->XD(1))); + Xd->XQ(1) = int128_sub(int128_makes64(Xj->XD(3)), + int128_makes64(Xk->XD(3))); +} + +XDO_ODD(xvsubwod_h_b, 16, XH, XB, DO_SUB) +XDO_ODD(xvsubwod_w_h, 32, XW, XH, DO_SUB) +XDO_ODD(xvsubwod_d_w, 64, XD, XW, DO_SUB) + +void HELPER(xvaddwev_q_du)(void *xd, void *xj, void *xk, uint32_t v) +{ + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + XReg *Xk = (XReg *)xk; + + Xd->XQ(0) = int128_add(int128_make64(Xj->UXD(0)), + int128_make64(Xk->UXD(0))); + 
Xd->XQ(1) = int128_add(int128_make64(Xj->UXD(2)), + int128_make64(Xk->UXD(2))); +} + +XDO_EVEN(xvaddwev_h_bu, 16, UXH, UXB, DO_ADD) +XDO_EVEN(xvaddwev_w_hu, 32, UXW, UXH, DO_ADD) +XDO_EVEN(xvaddwev_d_wu, 64, UXD, UXW, DO_ADD) + +void HELPER(xvaddwod_q_du)(void *xd, void *xj, void *xk, uint32_t v) +{ + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + XReg *Xk = (XReg *)xk; + + Xd->XQ(0) = int128_add(int128_make64(Xj->UXD(1)), + int128_make64(Xk->UXD(1))); + Xd->XQ(1) = int128_add(int128_make64(Xj->UXD(3)), + int128_make64(Xk->UXD(3))); +} + +XDO_ODD(xvaddwod_h_bu, 16, UXH, UXB, DO_ADD) +XDO_ODD(xvaddwod_w_hu, 32, UXW, UXH, DO_ADD) +XDO_ODD(xvaddwod_d_wu, 64, UXD, UXW, DO_ADD) + +void HELPER(xvsubwev_q_du)(void *xd, void *xj, void *xk, uint32_t v) +{ + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + XReg *Xk = (XReg *)xk; + + Xd->XQ(0) = int128_sub(int128_make64(Xj->UXD(0)), + int128_make64(Xk->UXD(0))); + Xd->XQ(1) = int128_sub(int128_make64(Xj->UXD(2)), + int128_make64(Xk->UXD(2))); +} + +XDO_EVEN(xvsubwev_h_bu, 16, UXH, UXB, DO_SUB) +XDO_EVEN(xvsubwev_w_hu, 32, UXW, UXH, DO_SUB) +XDO_EVEN(xvsubwev_d_wu, 64, UXD, UXW, DO_SUB) + +void HELPER(xvsubwod_q_du)(void *xd, void *xj, void *xk, uint32_t v) +{ + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + XReg *Xk = (XReg *)xk; + + Xd->XQ(0) = int128_sub(int128_make64(Xj->UXD(1)), + int128_make64(Xk->UXD(1))); + Xd->XQ(1) = int128_sub(int128_make64(Xj->UXD(3)), + int128_make64(Xk->UXD(3))); +} + +XDO_ODD(xvsubwod_h_bu, 16, UXH, UXB, DO_SUB) +XDO_ODD(xvsubwod_w_hu, 32, UXW, UXH, DO_SUB) +XDO_ODD(xvsubwod_d_wu, 64, UXD, UXW, DO_SUB) + +#define XDO_EVEN_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + typedef __typeof(Xd->ES1(0)) TDS; \ + typedef __typeof(Xd->EU1(0)) TDU; \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->ES1(i) = DO_OP((TDU)Xj->EU2(2 * i), 
(TDS)Xk->ES2(2 * i)); \ + } \ +} + +#define XDO_ODD_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + typedef __typeof(Xd->ES1(0)) TDS; \ + typedef __typeof(Xd->EU1(0)) TDU; \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->ES1(i) = DO_OP((TDU)Xj->EU2(2 * i + 1), (TDS)Xk->ES2(2 * i + 1)); \ + } \ +} + +void HELPER(xvaddwev_q_du_d)(void *xd, void *xj, void *xk, uint32_t v) +{ + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + XReg *Xk = (XReg *)xk; + + Xd->XQ(0) = int128_add(int128_make64(Xj->UXD(0)), + int128_makes64(Xk->XD(0))); + Xd->XQ(1) = int128_add(int128_make64(Xj->UXD(2)), + int128_makes64(Xk->XD(2))); +} + +XDO_EVEN_U_S(xvaddwev_h_bu_b, 16, XH, UXH, XB, UXB, DO_ADD) +XDO_EVEN_U_S(xvaddwev_w_hu_h, 32, XW, UXW, XH, UXH, DO_ADD) +XDO_EVEN_U_S(xvaddwev_d_wu_w, 64, XD, UXD, XW, UXW, DO_ADD) + +void HELPER(xvaddwod_q_du_d)(void *xd, void *xj, void *xk, uint32_t v) +{ + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + XReg *Xk = (XReg *)xk; + + Xd->XQ(0) = int128_add(int128_make64(Xj->UXD(1)), + int128_makes64(Xk->XD(1))); + Xd->XQ(1) = int128_add(int128_make64(Xj->UXD(3)), + int128_makes64(Xk->XD(3))); +} + +XDO_ODD_U_S(xvaddwod_h_bu_b, 16, XH, UXH, XB, UXB, DO_ADD) +XDO_ODD_U_S(xvaddwod_w_hu_h, 32, XW, UXW, XH, UXH, DO_ADD) +XDO_ODD_U_S(xvaddwod_d_wu_w, 64, XD, UXD, XW, UXW, DO_ADD) From patchwork Tue Jun 20 09:37:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285492 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2BA39EB64D7
for ; Tue, 20 Jun 2023 09:44:45 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqX-0003Dx-R9; Tue, 20 Jun 2023 05:40:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXqR-0002jL-NO for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:16 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqM-0006aO-Jk for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:15 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxW+qNc5FknyUHAA--.14683S3; Tue, 20 Jun 2023 17:38:21 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S13; Tue, 20 Jun 2023 17:38:21 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 11/46] target/loongarch: Implement xavg/xvagr Date: Tue, 20 Jun 2023 17:37:39 +0800 Message-Id: <20230620093814.123650-12-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S13 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVAVG.{B/H/W/D}[U]; - XVAVGR.{B/H/W/D}[U]. Signed-off-by: Song Gao --- target/loongarch/disas.c | 17 ++ target/loongarch/helper.h | 18 +++ target/loongarch/insn_trans/trans_lasx.c.inc | 162 +++++++++++++++++++ target/loongarch/insns.decode | 17 ++ target/loongarch/lasx_helper.c | 29 ++++ target/loongarch/vec.h | 3 + 6 files changed, 246 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 6e790f0959..d804caaee0 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1825,6 +1825,23 @@ INSN_LASX(xvaddwod_w_hu_h, xxx) INSN_LASX(xvaddwod_d_wu_w, xxx) INSN_LASX(xvaddwod_q_du_d, xxx) +INSN_LASX(xvavg_b, xxx) +INSN_LASX(xvavg_h, xxx) +INSN_LASX(xvavg_w, xxx) +INSN_LASX(xvavg_d, xxx) +INSN_LASX(xvavg_bu, xxx) +INSN_LASX(xvavg_hu, xxx) +INSN_LASX(xvavg_wu, xxx) +INSN_LASX(xvavg_du, xxx) +INSN_LASX(xvavgr_b, xxx) +INSN_LASX(xvavgr_h, xxx) +INSN_LASX(xvavgr_w, xxx) +INSN_LASX(xvavgr_d, xxx) +INSN_LASX(xvavgr_bu, xxx) +INSN_LASX(xvavgr_hu, xxx) +INSN_LASX(xvavgr_wu, xxx) +INSN_LASX(xvavgr_du, xxx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 2034576d87..feeaa92447 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -759,3 +759,21 @@ DEF_HELPER_FLAGS_4(xvaddwod_h_bu_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvaddwod_w_hu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvaddwod_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvaddwod_q_du_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvavg_b, TCG_CALL_NO_RWG, void, ptr, ptr,
ptr, i32) +DEF_HELPER_FLAGS_4(xvavg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavg_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavg_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavg_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavg_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavg_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvavgr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavgr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavgr_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavgr_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavgr_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavgr_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavgr_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvavgr_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 0a574182db..4a8bcf618f 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -588,6 +588,168 @@ TRANS(xvaddwod_w_hu_h, gvec_xxx, MO_16, do_xvaddwod_u_s) TRANS(xvaddwod_d_wu_w, gvec_xxx, MO_32, do_xvaddwod_u_s) TRANS(xvaddwod_q_du_d, gvec_xxx, MO_64, do_xvaddwod_u_s) +static void do_xvavg_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vavg_s, + .fno = gen_helper_xvavg_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vavg_s, + .fno = gen_helper_xvavg_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + 
.fniv = gen_vavg_s, + .fno = gen_helper_xvavg_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vavg_s, + .fno = gen_helper_xvavg_d, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +static void do_xvavg_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vavg_u, + .fno = gen_helper_xvavg_bu, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vavg_u, + .fno = gen_helper_xvavg_hu, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vavg_u, + .fno = gen_helper_xvavg_wu, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vavg_u, + .fno = gen_helper_xvavg_du, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvavg_b, gvec_xxx, MO_8, do_xvavg_s) +TRANS(xvavg_h, gvec_xxx, MO_16, do_xvavg_s) +TRANS(xvavg_w, gvec_xxx, MO_32, do_xvavg_s) +TRANS(xvavg_d, gvec_xxx, MO_64, do_xvavg_s) +TRANS(xvavg_bu, gvec_xxx, MO_8, do_xvavg_u) +TRANS(xvavg_hu, gvec_xxx, MO_16, do_xvavg_u) +TRANS(xvavg_wu, gvec_xxx, MO_32, do_xvavg_u) +TRANS(xvavg_du, gvec_xxx, MO_64, do_xvavg_u) + +static void do_xvavgr_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vavgr_s, + .fno = gen_helper_xvavgr_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vavgr_s, + .fno = gen_helper_xvavgr_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vavgr_s, + .fno = gen_helper_xvavgr_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vavgr_s, + .fno = gen_helper_xvavgr_d, + .opt_opc 
= vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +static void do_xvavgr_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vavgr_u, + .fno = gen_helper_xvavgr_bu, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vavgr_u, + .fno = gen_helper_xvavgr_hu, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vavgr_u, + .fno = gen_helper_xvavgr_wu, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vavgr_u, + .fno = gen_helper_xvavgr_du, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvavgr_b, gvec_xxx, MO_8, do_xvavgr_s) +TRANS(xvavgr_h, gvec_xxx, MO_16, do_xvavgr_s) +TRANS(xvavgr_w, gvec_xxx, MO_32, do_xvavgr_s) +TRANS(xvavgr_d, gvec_xxx, MO_64, do_xvavgr_s) +TRANS(xvavgr_bu, gvec_xxx, MO_8, do_xvavgr_u) +TRANS(xvavgr_hu, gvec_xxx, MO_16, do_xvavgr_u) +TRANS(xvavgr_wu, gvec_xxx, MO_32, do_xvavgr_u) +TRANS(xvavgr_du, gvec_xxx, MO_64, do_xvavgr_u) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 1d177f9676..0057aaf1d4 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1421,6 +1421,23 @@ xvaddwod_w_hu_h 0111 01000100 00001 ..... ..... ..... @xxx xvaddwod_d_wu_w 0111 01000100 00010 ..... ..... ..... @xxx xvaddwod_q_du_d 0111 01000100 00011 ..... ..... ..... @xxx +xvavg_b 0111 01000110 01000 ..... ..... ..... @xxx +xvavg_h 0111 01000110 01001 ..... ..... ..... @xxx +xvavg_w 0111 01000110 01010 ..... ..... ..... @xxx +xvavg_d 0111 01000110 01011 ..... ..... ..... @xxx +xvavg_bu 0111 01000110 01100 ..... ..... ..... 
@xxx +xvavg_hu 0111 01000110 01101 ..... ..... ..... @xxx +xvavg_wu 0111 01000110 01110 ..... ..... ..... @xxx +xvavg_du 0111 01000110 01111 ..... ..... ..... @xxx +xvavgr_b 0111 01000110 10000 ..... ..... ..... @xxx +xvavgr_h 0111 01000110 10001 ..... ..... ..... @xxx +xvavgr_w 0111 01000110 10010 ..... ..... ..... @xxx +xvavgr_d 0111 01000110 10011 ..... ..... ..... @xxx +xvavgr_bu 0111 01000110 10100 ..... ..... ..... @xxx +xvavgr_hu 0111 01000110 10101 ..... ..... ..... @xxx +xvavgr_wu 0111 01000110 10110 ..... ..... ..... @xxx +xvavgr_du 0111 01000110 10111 ..... ..... ..... @xxx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 8e830e1f3c..8e1bcdb764 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -308,3 +308,32 @@ void HELPER(xvaddwod_q_du_d)(void *xd, void *xj, void *xk, uint32_t v) XDO_ODD_U_S(xvaddwod_h_bu_b, 16, XH, UXH, XB, UXB, DO_ADD) XDO_ODD_U_S(xvaddwod_w_hu_h, 32, XW, UXW, XH, UXH, DO_ADD) XDO_ODD_U_S(xvaddwod_d_wu_w, 64, XD, UXD, XW, UXW, DO_ADD) + +#define XDO_3OP(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = DO_OP(Xj->E(i), Xk->E(i)); \ + } \ +} + +XDO_3OP(xvavg_b, 8, XB, DO_VAVG) +XDO_3OP(xvavg_h, 16, XH, DO_VAVG) +XDO_3OP(xvavg_w, 32, XW, DO_VAVG) +XDO_3OP(xvavg_d, 64, XD, DO_VAVG) +XDO_3OP(xvavgr_b, 8, XB, DO_VAVGR) +XDO_3OP(xvavgr_h, 16, XH, DO_VAVGR) +XDO_3OP(xvavgr_w, 32, XW, DO_VAVGR) +XDO_3OP(xvavgr_d, 64, XD, DO_VAVGR) +XDO_3OP(xvavg_bu, 8, UXB, DO_VAVG) +XDO_3OP(xvavg_hu, 16, UXH, DO_VAVG) +XDO_3OP(xvavg_wu, 32, UXW, DO_VAVG) +XDO_3OP(xvavg_du, 64, UXD, DO_VAVG) +XDO_3OP(xvavgr_bu, 8, UXB, DO_VAVGR) 
+XDO_3OP(xvavgr_hu, 16, UXH, DO_VAVGR) +XDO_3OP(xvavgr_wu, 32, UXW, DO_VAVGR) +XDO_3OP(xvavgr_du, 64, UXD, DO_VAVGR) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 7e71035e50..2a9c312e3d 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -51,4 +51,7 @@ #define DO_ADD(a, b) (a + b) #define DO_SUB(a, b) (a - b) +#define DO_VAVG(a, b) ((a >> 1) + (b >> 1) + (a & b & 1)) +#define DO_VAVGR(a, b) ((a >> 1) + (b >> 1) + ((a | b) & 1)) + #endif /* LOONGARCH_VEC_H */ From patchwork Tue Jun 20 09:37:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285445 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 74DB1EB64D7 for ; Tue, 20 Jun 2023 09:39:14 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXos-000761-SX; Tue, 20 Jun 2023 05:38:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXor-00075g-9b for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:37 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXoo-0006K8-GR for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:37 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8DxRumOc5FkoSUHAA--.12862S3; Tue, 20 Jun 2023 17:38:22 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S14; Tue, 20 Jun 2023 17:38:21 +0800 
(CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 12/46] target/loongarch: Implement xvabsd Date: Tue, 20 Jun 2023 17:37:40 +0800 Message-Id: <20230620093814.123650-13-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S14 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVABSD.{B/H/W/D}[U]. 
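Illustrative note (not part of the posted patch): the per-element behaviour of XVABSD can be sketched in plain C as below. This is a minimal sketch under stated assumptions: the names absd_s8, absd_u8 and LASX_BYTES are hypothetical, the 256-bit register is modelled as 32 byte lanes, and only the byte variants are shown. It mirrors the comparison-and-subtract shape of DO_VABSD rather than reproducing the QEMU helper.

```c
#include <stdint.h>

/* Assumption: a 256-bit LASX register viewed as 32 byte-sized lanes. */
#define LASX_BYTES 32

/* Hypothetical scalar reference: signed byte-lane absolute difference.
 * Operands are widened to int first so the subtraction cannot overflow
 * before the result is narrowed back to the lane type. */
static void absd_s8(int8_t *d, const int8_t *j, const int8_t *k)
{
    for (int i = 0; i < LASX_BYTES; i++) {
        int a = j[i];
        int b = k[i];
        d[i] = (int8_t)((a > b) ? (a - b) : (b - a));
    }
}

/* Hypothetical scalar reference: unsigned byte-lane absolute difference.
 * Same shape as above, but the lanes are compared as unsigned values. */
static void absd_u8(uint8_t *d, const uint8_t *j, const uint8_t *k)
{
    for (int i = 0; i < LASX_BYTES; i++) {
        int a = j[i];
        int b = k[i];
        d[i] = (uint8_t)((a > b) ? (a - b) : (b - a));
    }
}
```

The same bit pattern gives different results depending on signedness: the byte 0xC8 is 200 when unsigned and -56 when signed, so its absolute difference from 10 is 190 in the first case and 66 in the second. That is why the series registers separate signed and unsigned variants.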
Signed-off-by: Song Gao --- target/loongarch/disas.c | 9 +++ target/loongarch/helper.h | 9 +++ target/loongarch/insn_trans/trans_lasx.c.inc | 81 ++++++++++++++++++++ target/loongarch/insns.decode | 9 +++ target/loongarch/lasx_helper.c | 9 +++ target/loongarch/lsx_helper.c | 2 - target/loongarch/vec.h | 2 + 7 files changed, 119 insertions(+), 2 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index d804caaee0..d6b6b8ddd6 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1842,6 +1842,15 @@ INSN_LASX(xvavgr_hu, xxx) INSN_LASX(xvavgr_wu, xxx) INSN_LASX(xvavgr_du, xxx) +INSN_LASX(xvabsd_b, xxx) +INSN_LASX(xvabsd_h, xxx) +INSN_LASX(xvabsd_w, xxx) +INSN_LASX(xvabsd_d, xxx) +INSN_LASX(xvabsd_bu, xxx) +INSN_LASX(xvabsd_hu, xxx) +INSN_LASX(xvabsd_wu, xxx) +INSN_LASX(xvabsd_du, xxx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index feeaa92447..3ec7717c88 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -777,3 +777,12 @@ DEF_HELPER_FLAGS_4(xvavgr_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvavgr_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvavgr_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvavgr_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvabsd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvabsd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvabsd_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvabsd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvabsd_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvabsd_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvabsd_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvabsd_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff 
--git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 4a8bcf618f..8f7ff2cba6 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -750,6 +750,87 @@ TRANS(xvavgr_hu, gvec_xxx, MO_16, do_xvavgr_u) TRANS(xvavgr_wu, gvec_xxx, MO_32, do_xvavgr_u) TRANS(xvavgr_du, gvec_xxx, MO_64, do_xvavgr_u) +static void do_xvabsd_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_smax_vec, INDEX_op_smin_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vabsd_s, + .fno = gen_helper_xvabsd_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vabsd_s, + .fno = gen_helper_xvabsd_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vabsd_s, + .fno = gen_helper_xvabsd_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vabsd_s, + .fno = gen_helper_xvabsd_d, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +static void do_xvabsd_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_umax_vec, INDEX_op_umin_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vabsd_u, + .fno = gen_helper_xvabsd_bu, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vabsd_u, + .fno = gen_helper_xvabsd_hu, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vabsd_u, + .fno = gen_helper_xvabsd_wu, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vabsd_u, + .fno = gen_helper_xvabsd_du, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvabsd_b, gvec_xxx, MO_8, do_xvabsd_s) +TRANS(xvabsd_h, gvec_xxx, 
MO_16, do_xvabsd_s) +TRANS(xvabsd_w, gvec_xxx, MO_32, do_xvabsd_s) +TRANS(xvabsd_d, gvec_xxx, MO_64, do_xvabsd_s) +TRANS(xvabsd_bu, gvec_xxx, MO_8, do_xvabsd_u) +TRANS(xvabsd_hu, gvec_xxx, MO_16, do_xvabsd_u) +TRANS(xvabsd_wu, gvec_xxx, MO_32, do_xvabsd_u) +TRANS(xvabsd_du, gvec_xxx, MO_64, do_xvabsd_u) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 0057aaf1d4..8bd029a6e8 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1438,6 +1438,15 @@ xvavgr_hu 0111 01000110 10101 ..... ..... ..... @xxx xvavgr_wu 0111 01000110 10110 ..... ..... ..... @xxx xvavgr_du 0111 01000110 10111 ..... ..... ..... @xxx +xvabsd_b 0111 01000110 00000 ..... ..... ..... @xxx +xvabsd_h 0111 01000110 00001 ..... ..... ..... @xxx +xvabsd_w 0111 01000110 00010 ..... ..... ..... @xxx +xvabsd_d 0111 01000110 00011 ..... ..... ..... @xxx +xvabsd_bu 0111 01000110 00100 ..... ..... ..... @xxx +xvabsd_hu 0111 01000110 00101 ..... ..... ..... @xxx +xvabsd_wu 0111 01000110 00110 ..... ..... ..... @xxx +xvabsd_du 0111 01000110 00111 ..... ..... ..... @xxx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 8e1bcdb764..e9d38d83bc 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -337,3 +337,12 @@ XDO_3OP(xvavgr_bu, 8, UXB, DO_VAVGR) XDO_3OP(xvavgr_hu, 16, UXH, DO_VAVGR) XDO_3OP(xvavgr_wu, 32, UXW, DO_VAVGR) XDO_3OP(xvavgr_du, 64, UXD, DO_VAVGR) + +XDO_3OP(xvabsd_b, 8, XB, DO_VABSD) +XDO_3OP(xvabsd_h, 16, XH, DO_VABSD) +XDO_3OP(xvabsd_w, 32, XW, DO_VABSD) +XDO_3OP(xvabsd_d, 64, XD, DO_VABSD) +XDO_3OP(xvabsd_bu, 8, UXB, DO_VABSD) +XDO_3OP(xvabsd_hu, 16, UXH, DO_VABSD) +XDO_3OP(xvabsd_wu, 32, UXW, DO_VABSD) +XDO_3OP(xvabsd_du, 64, UXD, DO_VABSD) diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index d79a65dfe2..72e2306f0c 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ -309,8 +309,6 @@ DO_3OP(vavgr_hu, 16, UH, DO_VAVGR) DO_3OP(vavgr_wu, 32, UW, DO_VAVGR) DO_3OP(vavgr_du, 64, UD, DO_VAVGR) -#define DO_VABSD(a, b) ((a > b) ? (a -b) : (b-a)) - DO_3OP(vabsd_b, 8, B, DO_VABSD) DO_3OP(vabsd_h, 16, H, DO_VABSD) DO_3OP(vabsd_w, 32, W, DO_VABSD) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 2a9c312e3d..652d46c157 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -54,4 +54,6 @@ #define DO_VAVG(a, b) ((a >> 1) + (b >> 1) + (a & b & 1)) #define DO_VAVGR(a, b) ((a >> 1) + (b >> 1) + ((a | b) & 1)) +#define DO_VABSD(a, b) ((a > b) ? 
(a - b) : (b - a)) + #endif /* LOONGARCH_VEC_H */ From patchwork Tue Jun 20 09:37:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285485 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 280E3EB64D7 for ; Tue, 20 Jun 2023 09:43:35 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXot-00076N-G5; Tue, 20 Jun 2023 05:38:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXor-00075f-8Y for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:37 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXop-0006KH-2s for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:37 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxMuiOc5FkoyUHAA--.655S3; Tue, 20 Jun 2023 17:38:22 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S15; Tue, 20 Jun 2023 17:38:22 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 13/46] target/loongarch: Implement xvadda Date: Tue, 20 Jun 2023 17:37:41 +0800 Message-Id: <20230620093814.123650-14-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S15 
X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVADDA.{B/H/W/D}. Signed-off-by: Song Gao --- target/loongarch/disas.c | 5 +++ target/loongarch/helper.h | 5 +++ target/loongarch/insn_trans/trans_lasx.c.inc | 41 ++++++++++++++++++++ target/loongarch/insns.decode | 5 +++ target/loongarch/lasx_helper.c | 17 ++++++++ target/loongarch/lsx_helper.c | 2 - target/loongarch/vec.h | 2 + 7 files changed, 75 insertions(+), 2 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index d6b6b8ddd6..cc92f0e763 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1851,6 +1851,11 @@ INSN_LASX(xvabsd_hu, xxx) INSN_LASX(xvabsd_wu, xxx) INSN_LASX(xvabsd_du, xxx) +INSN_LASX(xvadda_b, xxx) +INSN_LASX(xvadda_h, xxx) +INSN_LASX(xvadda_w, xxx) +INSN_LASX(xvadda_d, xxx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 3ec7717c88..67ef7491c4 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -786,3 +786,8 @@ DEF_HELPER_FLAGS_4(xvabsd_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvabsd_hu, 
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvabsd_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvabsd_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvadda_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvadda_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvadda_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvadda_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 8f7ff2cba6..4b2e50de68 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -831,6 +831,47 @@ TRANS(xvabsd_hu, gvec_xxx, MO_16, do_xvabsd_u) TRANS(xvabsd_wu, gvec_xxx, MO_32, do_xvabsd_u) TRANS(xvabsd_du, gvec_xxx, MO_64, do_xvabsd_u) +static void do_xvadda(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_abs_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vadda, + .fno = gen_helper_xvadda_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vadda, + .fno = gen_helper_xvadda_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vadda, + .fno = gen_helper_xvadda_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vadda, + .fno = gen_helper_xvadda_d, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvadda_b, gvec_xxx, MO_8, do_xvadda) +TRANS(xvadda_h, gvec_xxx, MO_16, do_xvadda) +TRANS(xvadda_w, gvec_xxx, MO_32, do_xvadda) +TRANS(xvadda_d, gvec_xxx, MO_64, do_xvadda) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 
8bd029a6e8..f8a17f262a 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1447,6 +1447,11 @@ xvabsd_hu 0111 01000110 00101 ..... ..... ..... @xxx xvabsd_wu 0111 01000110 00110 ..... ..... ..... @xxx xvabsd_du 0111 01000110 00111 ..... ..... ..... @xxx +xvadda_b 0111 01000101 11000 ..... ..... ..... @xxx +xvadda_h 0111 01000101 11001 ..... ..... ..... @xxx +xvadda_w 0111 01000101 11010 ..... ..... ..... @xxx +xvadda_d 0111 01000101 11011 ..... ..... ..... @xxx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index e9d38d83bc..52c230a681 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -346,3 +346,20 @@ XDO_3OP(xvabsd_bu, 8, UXB, DO_VABSD) XDO_3OP(xvabsd_hu, 16, UXH, DO_VABSD) XDO_3OP(xvabsd_wu, 32, UXW, DO_VABSD) XDO_3OP(xvabsd_du, 64, UXD, DO_VABSD) + +#define XDO_VADDA(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = DO_OP(Xj->E(i)) + DO_OP(Xk->E(i)); \ + } \ +} + +XDO_VADDA(xvadda_b, 8, XB, DO_VABS) +XDO_VADDA(xvadda_h, 16, XH, DO_VABS) +XDO_VADDA(xvadda_w, 32, XW, DO_VABS) +XDO_VADDA(xvadda_d, 64, XD, DO_VABS) diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index 72e2306f0c..72120c04a4 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ -318,8 +318,6 @@ DO_3OP(vabsd_hu, 16, UH, DO_VABSD) DO_3OP(vabsd_wu, 32, UW, DO_VABSD) DO_3OP(vabsd_du, 64, UD, DO_VABSD) -#define DO_VABS(a) ((a < 0) ? 
(-a) : (a)) - #define DO_VADDA(NAME, BIT, E, DO_OP) \ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ { \ diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 652d46c157..20b86c3119 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -56,4 +56,6 @@ #define DO_VABSD(a, b) ((a > b) ? (a - b) : (b - a)) +#define DO_VABS(a) ((a < 0) ? (-a) : (a)) + #endif /* LOONGARCH_VEC_H */ From patchwork Tue Jun 20 09:37:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285488 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1938AEB64DD for ; Tue, 20 Jun 2023 09:43:56 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqb-0003oO-PU; Tue, 20 Jun 2023 05:40:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXqV-0002sK-7F for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:19 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqM-0006aS-PK for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:17 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Ax0OiOc5FkpiUHAA--.12766S3; Tue, 20 Jun 2023 17:38:22 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S16; Tue, 20 Jun 2023 17:38:22 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org 
Subject: [PATCH v1 14/46] target/loongarch: Implement xvmax/xvmin Date: Tue, 20 Jun 2023 17:37:42 +0800 Message-Id: <20230620093814.123650-15-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S16 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVMAX[I].{B/H/W/D}[U]; - XVMIN[I].{B/H/W/D}[U]. 
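Illustrative note (not part of the posted patch): the immediate forms XVMAXI/XVMINI compare every element of the source register against one immediate value. A minimal sketch in plain C follows, assuming 32-bit lanes and using hypothetical names (maxi_s32, mini_u32, LASX_WORDS). The immediate is passed as int64_t to match the helper prototypes in this patch; how the immediate is encoded and extended in the instruction word is not modelled here.

```c
#include <stdint.h>

/* Assumption: a 256-bit LASX register viewed as eight 32-bit lanes. */
#define LASX_WORDS 8

/* Hypothetical scalar reference: signed maximum against an immediate.
 * Each lane is widened to int64_t for the comparison. */
static void maxi_s32(int32_t *d, const int32_t *j, int64_t imm)
{
    for (int i = 0; i < LASX_WORDS; i++) {
        d[i] = ((int64_t)j[i] > imm) ? j[i] : (int32_t)imm;
    }
}

/* Hypothetical scalar reference: unsigned minimum against an immediate.
 * The immediate is narrowed to the lane type before comparing. */
static void mini_u32(uint32_t *d, const uint32_t *j, int64_t imm)
{
    uint32_t m = (uint32_t)imm;

    for (int i = 0; i < LASX_WORDS; i++) {
        d[i] = (j[i] < m) ? j[i] : m;
    }
}
```

With an immediate of 4, maxi_s32 maps the lanes {-5, 3, 10} to {4, 4, 10}. With an immediate of 7, mini_u32 maps {1, 0xFFFFFFFF, 7} to {1, 7, 7}.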
Signed-off-by: Song Gao --- target/loongarch/disas.c | 33 ++++ target/loongarch/helper.h | 18 ++ target/loongarch/insn_trans/trans_lasx.c.inc | 180 +++++++++++++++++++ target/loongarch/insns.decode | 37 ++++ target/loongarch/lasx_helper.c | 30 ++++ target/loongarch/lsx_helper.c | 3 - target/loongarch/vec.h | 3 + 7 files changed, 301 insertions(+), 3 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index cc92f0e763..ff22fcb90e 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1856,6 +1856,39 @@ INSN_LASX(xvadda_h, xxx) INSN_LASX(xvadda_w, xxx) INSN_LASX(xvadda_d, xxx) +INSN_LASX(xvmax_b, xxx) +INSN_LASX(xvmax_h, xxx) +INSN_LASX(xvmax_w, xxx) +INSN_LASX(xvmax_d, xxx) +INSN_LASX(xvmin_b, xxx) +INSN_LASX(xvmin_h, xxx) +INSN_LASX(xvmin_w, xxx) +INSN_LASX(xvmin_d, xxx) +INSN_LASX(xvmax_bu, xxx) +INSN_LASX(xvmax_hu, xxx) +INSN_LASX(xvmax_wu, xxx) +INSN_LASX(xvmax_du, xxx) +INSN_LASX(xvmin_bu, xxx) +INSN_LASX(xvmin_hu, xxx) +INSN_LASX(xvmin_wu, xxx) +INSN_LASX(xvmin_du, xxx) +INSN_LASX(xvmaxi_b, xx_i) +INSN_LASX(xvmaxi_h, xx_i) +INSN_LASX(xvmaxi_w, xx_i) +INSN_LASX(xvmaxi_d, xx_i) +INSN_LASX(xvmini_b, xx_i) +INSN_LASX(xvmini_h, xx_i) +INSN_LASX(xvmini_w, xx_i) +INSN_LASX(xvmini_d, xx_i) +INSN_LASX(xvmaxi_bu, xx_i) +INSN_LASX(xvmaxi_hu, xx_i) +INSN_LASX(xvmaxi_wu, xx_i) +INSN_LASX(xvmaxi_du, xx_i) +INSN_LASX(xvmini_bu, xx_i) +INSN_LASX(xvmini_hu, xx_i) +INSN_LASX(xvmini_wu, xx_i) +INSN_LASX(xvmini_du, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 67ef7491c4..d5ebc0b963 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -791,3 +791,21 @@ DEF_HELPER_FLAGS_4(xvadda_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvadda_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvadda_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvadda_d, 
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvmini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmini_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmini_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmini_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmini_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmini_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(xvmaxi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmaxi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmaxi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmaxi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmaxi_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmaxi_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmaxi_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvmaxi_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 4b2e50de68..cdf3dcc161 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -872,6 +872,186 @@ TRANS(xvadda_h, gvec_xxx, MO_16, do_xvadda) TRANS(xvadda_w, gvec_xxx, MO_32, do_xvadda) TRANS(xvadda_d, gvec_xxx, MO_64, do_xvadda) +TRANS(xvmax_b, gvec_xxx, MO_8, tcg_gen_gvec_smax) +TRANS(xvmax_h, gvec_xxx, MO_16, tcg_gen_gvec_smax) +TRANS(xvmax_w, gvec_xxx, MO_32, tcg_gen_gvec_smax) +TRANS(xvmax_d, gvec_xxx, MO_64, tcg_gen_gvec_smax) +TRANS(xvmax_bu, gvec_xxx, MO_8, tcg_gen_gvec_umax) +TRANS(xvmax_hu, gvec_xxx, MO_16, tcg_gen_gvec_umax) +TRANS(xvmax_wu, gvec_xxx, MO_32, tcg_gen_gvec_umax) 
+TRANS(xvmax_du, gvec_xxx, MO_64, tcg_gen_gvec_umax) + +TRANS(xvmin_b, gvec_xxx, MO_8, tcg_gen_gvec_smin) +TRANS(xvmin_h, gvec_xxx, MO_16, tcg_gen_gvec_smin) +TRANS(xvmin_w, gvec_xxx, MO_32, tcg_gen_gvec_smin) +TRANS(xvmin_d, gvec_xxx, MO_64, tcg_gen_gvec_smin) +TRANS(xvmin_bu, gvec_xxx, MO_8, tcg_gen_gvec_umin) +TRANS(xvmin_hu, gvec_xxx, MO_16, tcg_gen_gvec_umin) +TRANS(xvmin_wu, gvec_xxx, MO_32, tcg_gen_gvec_umin) +TRANS(xvmin_du, gvec_xxx, MO_64, tcg_gen_gvec_umin) + +static void do_xvmini_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + int64_t imm, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_smin_vec, 0 + }; + static const GVecGen2i op[4] = { + { + .fniv = gen_vmini_s, + .fnoi = gen_helper_xvmini_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vmini_s, + .fnoi = gen_helper_xvmini_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vmini_s, + .fnoi = gen_helper_xvmini_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vmini_s, + .fnoi = gen_helper_xvmini_d, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_2i(xd_ofs, xj_ofs, oprsz, maxsz, imm, &op[vece]); +} + +static void do_xvmini_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + int64_t imm, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_umin_vec, 0 + }; + static const GVecGen2i op[4] = { + { + .fniv = gen_vmini_u, + .fnoi = gen_helper_xvmini_bu, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vmini_u, + .fnoi = gen_helper_xvmini_hu, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vmini_u, + .fnoi = gen_helper_xvmini_wu, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vmini_u, + .fnoi = gen_helper_xvmini_du, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_2i(xd_ofs, xj_ofs, oprsz, maxsz, imm, &op[vece]); +} + +TRANS(xvmini_b, gvec_xx_i, MO_8, do_xvmini_s) +TRANS(xvmini_h, gvec_xx_i, 
MO_16, do_xvmini_s) +TRANS(xvmini_w, gvec_xx_i, MO_32, do_xvmini_s) +TRANS(xvmini_d, gvec_xx_i, MO_64, do_xvmini_s) +TRANS(xvmini_bu, gvec_xx_i, MO_8, do_xvmini_u) +TRANS(xvmini_hu, gvec_xx_i, MO_16, do_xvmini_u) +TRANS(xvmini_wu, gvec_xx_i, MO_32, do_xvmini_u) +TRANS(xvmini_du, gvec_xx_i, MO_64, do_xvmini_u) + +static void do_xvmaxi_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + int64_t imm, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_smax_vec, 0 + }; + static const GVecGen2i op[4] = { + { + .fniv = gen_vmaxi_s, + .fnoi = gen_helper_xvmaxi_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vmaxi_s, + .fnoi = gen_helper_xvmaxi_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vmaxi_s, + .fnoi = gen_helper_xvmaxi_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vmaxi_s, + .fnoi = gen_helper_xvmaxi_d, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_2i(xd_ofs, xj_ofs, oprsz, maxsz, imm, &op[vece]); +} + +static void do_xvmaxi_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + int64_t imm, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_umax_vec, 0 + }; + static const GVecGen2i op[4] = { + { + .fniv = gen_vmaxi_u, + .fnoi = gen_helper_xvmaxi_bu, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vmaxi_u, + .fnoi = gen_helper_xvmaxi_hu, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vmaxi_u, + .fnoi = gen_helper_xvmaxi_wu, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vmaxi_u, + .fnoi = gen_helper_xvmaxi_du, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_2i(xd_ofs, xj_ofs, oprsz, maxsz, imm, &op[vece]); +} + +TRANS(xvmaxi_b, gvec_xx_i, MO_8, do_xvmaxi_s) +TRANS(xvmaxi_h, gvec_xx_i, MO_16, do_xvmaxi_s) +TRANS(xvmaxi_w, gvec_xx_i, MO_32, do_xvmaxi_s) +TRANS(xvmaxi_d, gvec_xx_i, MO_64, do_xvmaxi_s) +TRANS(xvmaxi_bu, gvec_xx_i, MO_8, do_xvmaxi_u) 
+TRANS(xvmaxi_hu, gvec_xx_i, MO_16, do_xvmaxi_u) +TRANS(xvmaxi_wu, gvec_xx_i, MO_32, do_xvmaxi_u) +TRANS(xvmaxi_du, gvec_xx_i, MO_64, do_xvmaxi_u) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index f8a17f262a..29666f7925 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1313,6 +1313,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 @xx .... ........ ..... ..... xj:5 xd:5 &xx @xxx .... ........ ..... xk:5 xj:5 xd:5 &xxx @xr .... ........ ..... ..... rj:5 xd:5 &xr +@xx_i5 .... ........ ..... imm:s5 xj:5 xd:5 &xx_i @xx_ui5 .... ........ ..... imm:5 xj:5 xd:5 &xx_i xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx @@ -1452,6 +1453,42 @@ xvadda_h 0111 01000101 11001 ..... ..... ..... @xxx xvadda_w 0111 01000101 11010 ..... ..... ..... @xxx xvadda_d 0111 01000101 11011 ..... ..... ..... @xxx +xvmax_b 0111 01000111 00000 ..... ..... ..... @xxx +xvmax_h 0111 01000111 00001 ..... ..... ..... @xxx +xvmax_w 0111 01000111 00010 ..... ..... ..... @xxx +xvmax_d 0111 01000111 00011 ..... ..... ..... @xxx +xvmax_bu 0111 01000111 01000 ..... ..... ..... @xxx +xvmax_hu 0111 01000111 01001 ..... ..... ..... @xxx +xvmax_wu 0111 01000111 01010 ..... ..... ..... @xxx +xvmax_du 0111 01000111 01011 ..... ..... ..... @xxx + +xvmaxi_b 0111 01101001 00000 ..... ..... ..... @xx_i5 +xvmaxi_h 0111 01101001 00001 ..... ..... ..... @xx_i5 +xvmaxi_w 0111 01101001 00010 ..... ..... ..... @xx_i5 +xvmaxi_d 0111 01101001 00011 ..... ..... ..... @xx_i5 +xvmaxi_bu 0111 01101001 01000 ..... ..... ..... @xx_ui5 +xvmaxi_hu 0111 01101001 01001 ..... ..... ..... @xx_ui5 +xvmaxi_wu 0111 01101001 01010 ..... ..... ..... @xx_ui5 +xvmaxi_du 0111 01101001 01011 ..... ..... ..... @xx_ui5 + +xvmin_b 0111 01000111 00100 ..... ..... ..... @xxx +xvmin_h 0111 01000111 00101 ..... ..... ..... @xxx +xvmin_w 0111 01000111 00110 ..... ..... 
..... @xxx +xvmin_d 0111 01000111 00111 ..... ..... ..... @xxx +xvmin_bu 0111 01000111 01100 ..... ..... ..... @xxx +xvmin_hu 0111 01000111 01101 ..... ..... ..... @xxx +xvmin_wu 0111 01000111 01110 ..... ..... ..... @xxx +xvmin_du 0111 01000111 01111 ..... ..... ..... @xxx + +xvmini_b 0111 01101001 00100 ..... ..... ..... @xx_i5 +xvmini_h 0111 01101001 00101 ..... ..... ..... @xx_i5 +xvmini_w 0111 01101001 00110 ..... ..... ..... @xx_i5 +xvmini_d 0111 01101001 00111 ..... ..... ..... @xx_i5 +xvmini_bu 0111 01101001 01100 ..... ..... ..... @xx_ui5 +xvmini_hu 0111 01101001 01101 ..... ..... ..... @xx_ui5 +xvmini_wu 0111 01101001 01110 ..... ..... ..... @xx_ui5 +xvmini_du 0111 01101001 01111 ..... ..... ..... @xx_ui5 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 52c230a681..486cf9f7f1 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -363,3 +363,33 @@ XDO_VADDA(xvadda_b, 8, XB, DO_VABS) XDO_VADDA(xvadda_h, 16, XH, DO_VABS) XDO_VADDA(xvadda_w, 32, XW, DO_VABS) XDO_VADDA(xvadda_d, 64, XD, DO_VABS) + +#define XVMINMAXI(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, uint64_t imm, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + typedef __typeof(Xd->E(0)) TD; \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = DO_OP(Xj->E(i), (TD)imm); \ + } \ +} + +XVMINMAXI(xvmini_b, 8, XB, DO_MIN) +XVMINMAXI(xvmini_h, 16, XH, DO_MIN) +XVMINMAXI(xvmini_w, 32, XW, DO_MIN) +XVMINMAXI(xvmini_d, 64, XD, DO_MIN) +XVMINMAXI(xvmaxi_b, 8, XB, DO_MAX) +XVMINMAXI(xvmaxi_h, 16, XH, DO_MAX) +XVMINMAXI(xvmaxi_w, 32, XW, DO_MAX) +XVMINMAXI(xvmaxi_d, 64, XD, DO_MAX) +XVMINMAXI(xvmini_bu, 8, UXB, DO_MIN) +XVMINMAXI(xvmini_hu, 16, UXH, DO_MIN) +XVMINMAXI(xvmini_wu, 32, UXW, DO_MIN) +XVMINMAXI(xvmini_du, 
64, UXD, DO_MIN) +XVMINMAXI(xvmaxi_bu, 8, UXB, DO_MAX) +XVMINMAXI(xvmaxi_hu, 16, UXH, DO_MAX) +XVMINMAXI(xvmaxi_wu, 32, UXW, DO_MAX) +XVMINMAXI(xvmaxi_du, 64, UXD, DO_MAX) diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index 72120c04a4..192cdb164c 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ -335,9 +335,6 @@ DO_VADDA(vadda_h, 16, H, DO_VABS) DO_VADDA(vadda_w, 32, W, DO_VABS) DO_VADDA(vadda_d, 64, D, DO_VABS) -#define DO_MIN(a, b) (a < b ? a : b) -#define DO_MAX(a, b) (a > b ? a : b) - #define VMINMAXI(NAME, BIT, E, DO_OP) \ void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t v) \ { \ diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 20b86c3119..96f216d569 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -58,4 +58,7 @@ #define DO_VABS(a) ((a < 0) ? (-a) : (a)) +#define DO_MIN(a, b) (a < b ? a : b) +#define DO_MAX(a, b) (a > b ? a : b) + #endif /* LOONGARCH_VEC_H */

From patchwork Tue Jun 20 09:37:43 2023
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285512
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 15/46] target/loongarch: Implement xvmul/xvmuh/xvmulw{ev/od}
Date: Tue, 20 Jun 2023 17:37:43 +0800
Message-Id: <20230620093814.123650-16-gaosong@loongson.cn>
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

This patch includes:
- XVMUL.{B/H/W/D};
- XVMUH.{B/H/W/D}[U];
- XVMULW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVMULW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}.
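For readers unfamiliar with the instruction groups listed above, the following is a minimal per-lane sketch of what the byte-sized variants compute. It is not the QEMU implementation: the function names (`muh_b`, `muh_bu`, `mulw_h_b`, `mulw_h_bu_b`) and the plain-array register model are hypothetical, the 256-bit register is modelled as 32 byte lanes or 16 halfword lanes, and the treatment of the `.BU.B` form (first source unsigned, second source signed) is an assumption read off the mnemonic suffix.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lane counts for a 256-bit register. */
#define BYTE_LANES 32
#define HALF_LANES 16

/*
 * XVMUH.B: keep the high 8 bits of the signed 16-bit product.
 * Right-shifting a negative value is implementation-defined in C;
 * this sketch assumes an arithmetic shift (as GCC and Clang provide).
 */
static int8_t muh_b(int8_t a, int8_t b)
{
    return (int8_t)(((int16_t)a * (int16_t)b) >> 8);
}

/* XVMUH.BU: same idea with unsigned lanes. */
static uint8_t muh_bu(uint8_t a, uint8_t b)
{
    return (uint8_t)(((uint16_t)a * (uint16_t)b) >> 8);
}

/*
 * XVMULWEV.H.B (odd == 0) and XVMULWOD.H.B (odd == 1):
 * multiply the even- or odd-numbered byte lanes and store each
 * full 16-bit product in a halfword lane of the destination.
 */
static void mulw_h_b(int16_t xd[HALF_LANES], const int8_t xj[BYTE_LANES],
                     const int8_t xk[BYTE_LANES], int odd)
{
    assert(odd == 0 || odd == 1);
    for (int i = 0; i < HALF_LANES; i++) {
        xd[i] = (int16_t)(xj[2 * i + odd] * xk[2 * i + odd]);
    }
}

/*
 * XVMULW{EV/OD}.H.BU.B: mixed-sign widening multiply.
 * Assumption: xj lanes are unsigned, xk lanes are signed.
 */
static void mulw_h_bu_b(int16_t xd[HALF_LANES], const uint8_t xj[BYTE_LANES],
                        const int8_t xk[BYTE_LANES], int odd)
{
    assert(odd == 0 || odd == 1);
    for (int i = 0; i < HALF_LANES; i++) {
        xd[i] = (int16_t)((int)xj[2 * i + odd] * (int)xk[2 * i + odd]);
    }
}
```

The widening forms never overflow their destination lane, because an 8-bit by 8-bit product always fits in 16 bits; that is why no saturation or wrap-around handling appears in the sketch.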
Signed-off-by: Song Gao --- target/loongarch/disas.c | 38 +++ target/loongarch/helper.h | 30 ++ target/loongarch/insn_trans/trans_lasx.c.inc | 311 +++++++++++++++++++ target/loongarch/insns.decode | 38 +++ target/loongarch/lasx_helper.c | 74 +++++ target/loongarch/lsx_helper.c | 2 - target/loongarch/vec.h | 2 + 7 files changed, 493 insertions(+), 2 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index ff22fcb90e..e7c46bc3a2 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1889,6 +1889,44 @@ INSN_LASX(xvmini_hu, xx_i) INSN_LASX(xvmini_wu, xx_i) INSN_LASX(xvmini_du, xx_i) +INSN_LASX(xvmul_b, xxx) +INSN_LASX(xvmul_h, xxx) +INSN_LASX(xvmul_w, xxx) +INSN_LASX(xvmul_d, xxx) +INSN_LASX(xvmuh_b, xxx) +INSN_LASX(xvmuh_h, xxx) +INSN_LASX(xvmuh_w, xxx) +INSN_LASX(xvmuh_d, xxx) +INSN_LASX(xvmuh_bu, xxx) +INSN_LASX(xvmuh_hu, xxx) +INSN_LASX(xvmuh_wu, xxx) +INSN_LASX(xvmuh_du, xxx) + +INSN_LASX(xvmulwev_h_b, xxx) +INSN_LASX(xvmulwev_w_h, xxx) +INSN_LASX(xvmulwev_d_w, xxx) +INSN_LASX(xvmulwev_q_d, xxx) +INSN_LASX(xvmulwod_h_b, xxx) +INSN_LASX(xvmulwod_w_h, xxx) +INSN_LASX(xvmulwod_d_w, xxx) +INSN_LASX(xvmulwod_q_d, xxx) +INSN_LASX(xvmulwev_h_bu, xxx) +INSN_LASX(xvmulwev_w_hu, xxx) +INSN_LASX(xvmulwev_d_wu, xxx) +INSN_LASX(xvmulwev_q_du, xxx) +INSN_LASX(xvmulwod_h_bu, xxx) +INSN_LASX(xvmulwod_w_hu, xxx) +INSN_LASX(xvmulwod_d_wu, xxx) +INSN_LASX(xvmulwod_q_du, xxx) +INSN_LASX(xvmulwev_h_bu_b, xxx) +INSN_LASX(xvmulwev_w_hu_h, xxx) +INSN_LASX(xvmulwev_d_wu_w, xxx) +INSN_LASX(xvmulwev_q_du_d, xxx) +INSN_LASX(xvmulwod_h_bu_b, xxx) +INSN_LASX(xvmulwod_w_hu_h, xxx) +INSN_LASX(xvmulwod_d_wu_w, xxx) +INSN_LASX(xvmulwod_q_du_d, xxx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index d5ebc0b963..88ae707027 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -809,3 +809,33 @@ DEF_HELPER_FLAGS_4(xvmaxi_bu, 
TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(xvmaxi_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(xvmaxi_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(xvmaxi_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(xvmuh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmuh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmuh_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmuh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmuh_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmuh_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmuh_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmuh_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvmulwev_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwev_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwev_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwod_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwod_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwod_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvmulwev_h_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwev_w_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwev_d_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwod_h_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwod_w_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwod_d_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvmulwev_h_bu_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwev_w_hu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwev_d_wu_w, TCG_CALL_NO_RWG, void, 
ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwod_h_bu_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwod_w_hu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmulwod_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index cdf3dcc161..d57d867f17 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -1052,6 +1052,317 @@ TRANS(xvmaxi_hu, gvec_xx_i, MO_16, do_xvmaxi_u) TRANS(xvmaxi_wu, gvec_xx_i, MO_32, do_xvmaxi_u) TRANS(xvmaxi_du, gvec_xx_i, MO_64, do_xvmaxi_u) +TRANS(xvmul_b, gvec_xxx, MO_8, tcg_gen_gvec_mul) +TRANS(xvmul_h, gvec_xxx, MO_16, tcg_gen_gvec_mul) +TRANS(xvmul_w, gvec_xxx, MO_32, tcg_gen_gvec_mul) +TRANS(xvmul_d, gvec_xxx, MO_64, tcg_gen_gvec_mul) + +static void do_xvmuh_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen3 op[4] = { + { + .fno = gen_helper_xvmuh_b, + .vece = MO_8 + }, + { + .fno = gen_helper_xvmuh_h, + .vece = MO_16 + }, + { + .fni4 = gen_vmuh_w, + .fno = gen_helper_xvmuh_w, + .vece = MO_32 + }, + { + .fni8 = gen_vmuh_d, + .fno = gen_helper_xvmuh_d, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvmuh_b, gvec_xxx, MO_8, do_xvmuh_s) +TRANS(xvmuh_h, gvec_xxx, MO_16, do_xvmuh_s) +TRANS(xvmuh_w, gvec_xxx, MO_32, do_xvmuh_s) +TRANS(xvmuh_d, gvec_xxx, MO_64, do_xvmuh_s) + +static void do_xvmuh_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen3 op[4] = { + { + .fno = gen_helper_xvmuh_bu, + .vece = MO_8 + }, + { + .fno = gen_helper_xvmuh_hu, + .vece = MO_16 + }, + { + .fni4 = gen_vmuh_wu, + .fno = gen_helper_xvmuh_wu, + .vece = MO_32 + }, + { + .fni8 = gen_vmuh_du, + .fno = gen_helper_xvmuh_du, + .vece = MO_64 + }, + }; + + 
tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvmuh_bu, gvec_xxx, MO_8, do_xvmuh_u) +TRANS(xvmuh_hu, gvec_xxx, MO_16, do_xvmuh_u) +TRANS(xvmuh_wu, gvec_xxx, MO_32, do_xvmuh_u) +TRANS(xvmuh_du, gvec_xxx, MO_64, do_xvmuh_u) + +static void do_xvmulwev_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shli_vec, INDEX_op_sari_vec, INDEX_op_mul_vec, 0 + }; + static const GVecGen3 op[3] = { + { + .fniv = gen_vmulwev_s, + .fno = gen_helper_xvmulwev_h_b, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmulwev_w_h, + .fniv = gen_vmulwev_s, + .fno = gen_helper_xvmulwev_w_h, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmulwev_d_w, + .fniv = gen_vmulwev_s, + .fno = gen_helper_xvmulwev_d_w, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvmulwev_h_b, gvec_xxx, MO_8, do_xvmulwev_s) +TRANS(xvmulwev_w_h, gvec_xxx, MO_16, do_xvmulwev_s) +TRANS(xvmulwev_d_w, gvec_xxx, MO_32, do_xvmulwev_s) + +#define XVMUL_Q(NAME, FN, idx1, idx2) \ +static bool trans_## NAME(DisasContext *ctx, arg_xxx * a) \ +{ \ + TCGv_i64 rh, rl, arg1, arg2; \ + int i; \ + \ + rh = tcg_temp_new_i64(); \ + rl = tcg_temp_new_i64(); \ + arg1 = tcg_temp_new_i64(); \ + arg2 = tcg_temp_new_i64(); \ + \ + for (i = 0; i < 2; i++) { \ + get_xreg64(arg1, a->xj, idx1 + i * 2); \ + get_xreg64(arg2, a->xk, idx2 + i * 2); \ + \ + tcg_gen_## FN ##_i64(rl, rh, arg1, arg2); \ + \ + set_xreg64(rh, a->xd, 1 + i * 2); \ + set_xreg64(rl, a->xd, 0 + i * 2); \ + } \ + \ + return true; \ +} + +XVMUL_Q(xvmulwev_q_d, muls2, 0, 0) +XVMUL_Q(xvmulwod_q_d, muls2, 1, 1) +XVMUL_Q(xvmulwev_q_du, mulu2, 0, 0) +XVMUL_Q(xvmulwod_q_du, mulu2, 1, 1) +XVMUL_Q(xvmulwev_q_du_d, mulus2, 0, 0) +XVMUL_Q(xvmulwod_q_du_d, mulus2, 1, 1) + +static void do_xvmulwod_s(unsigned vece, uint32_t xd_ofs, uint32_t 
xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_mul_vec, 0 + }; + static const GVecGen3 op[3] = { + { + .fniv = gen_vmulwod_s, + .fno = gen_helper_xvmulwod_h_b, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmulwod_w_h, + .fniv = gen_vmulwod_s, + .fno = gen_helper_xvmulwod_w_h, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmulwod_d_w, + .fniv = gen_vmulwod_s, + .fno = gen_helper_xvmulwod_d_w, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} +TRANS(xvmulwod_h_b, gvec_xxx, MO_8, do_xvmulwod_s) +TRANS(xvmulwod_w_h, gvec_xxx, MO_16, do_xvmulwod_s) +TRANS(xvmulwod_d_w, gvec_xxx, MO_32, do_xvmulwod_s) + +static void do_xvmulwev_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_mul_vec, 0 + }; + static const GVecGen3 op[3] = { + { + .fniv = gen_vmulwev_u, + .fno = gen_helper_xvmulwev_h_bu, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmulwev_w_hu, + .fniv = gen_vmulwev_u, + .fno = gen_helper_xvmulwev_w_hu, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmulwev_d_wu, + .fniv = gen_vmulwev_u, + .fno = gen_helper_xvmulwev_d_wu, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} +TRANS(xvmulwev_h_bu, gvec_xxx, MO_8, do_xvmulwev_u) +TRANS(xvmulwev_w_hu, gvec_xxx, MO_16, do_xvmulwev_u) +TRANS(xvmulwev_d_wu, gvec_xxx, MO_32, do_xvmulwev_u) + +static void do_xvmulwod_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_mul_vec, 0 + }; + static const GVecGen3 op[3] = { + { + .fniv = gen_vmulwod_u, + .fno = gen_helper_xvmulwod_h_bu, + .opt_opc = 
vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmulwod_w_hu, + .fniv = gen_vmulwod_u, + .fno = gen_helper_xvmulwod_w_hu, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmulwod_d_wu, + .fniv = gen_vmulwod_u, + .fno = gen_helper_xvmulwod_d_wu, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} +TRANS(xvmulwod_h_bu, gvec_xxx, MO_8, do_xvmulwod_u) +TRANS(xvmulwod_w_hu, gvec_xxx, MO_16, do_xvmulwod_u) +TRANS(xvmulwod_d_wu, gvec_xxx, MO_32, do_xvmulwod_u) + +static void do_xvmulwev_u_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shli_vec, INDEX_op_sari_vec, INDEX_op_mul_vec, 0 + }; + static const GVecGen3 op[3] = { + { + .fniv = gen_vmulwev_u_s, + .fno = gen_helper_xvmulwev_h_bu_b, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmulwev_w_hu_h, + .fniv = gen_vmulwev_u_s, + .fno = gen_helper_xvmulwev_w_hu_h, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmulwev_d_wu_w, + .fniv = gen_vmulwev_u_s, + .fno = gen_helper_xvmulwev_d_wu_w, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} +TRANS(xvmulwev_h_bu_b, gvec_xxx, MO_8, do_xvmulwev_u_s) +TRANS(xvmulwev_w_hu_h, gvec_xxx, MO_16, do_xvmulwev_u_s) +TRANS(xvmulwev_d_wu_w, gvec_xxx, MO_32, do_xvmulwev_u_s) + +static void do_xvmulwod_u_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_mul_vec, 0 + }; + static const GVecGen3 op[3] = { + { + .fniv = gen_vmulwod_u_s, + .fno = gen_helper_xvmulwod_h_bu_b, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmulwod_w_hu_h, + .fniv = gen_vmulwod_u_s, + .fno = gen_helper_xvmulwod_w_hu_h, + .opt_opc = vecop_list, + .vece = 
MO_32 + }, + { + .fni8 = gen_vmulwod_d_wu_w, + .fniv = gen_vmulwod_u_s, + .fno = gen_helper_xvmulwod_d_wu_w, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} +TRANS(xvmulwod_h_bu_b, gvec_xxx, MO_8, do_xvmulwod_u_s) +TRANS(xvmulwod_w_hu_h, gvec_xxx, MO_16, do_xvmulwod_u_s) +TRANS(xvmulwod_d_wu_w, gvec_xxx, MO_32, do_xvmulwod_u_s) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 29666f7925..872eeed7a8 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1489,6 +1489,44 @@ xvmini_hu 0111 01101001 01101 ..... ..... ..... @xx_ui5 xvmini_wu 0111 01101001 01110 ..... ..... ..... @xx_ui5 xvmini_du 0111 01101001 01111 ..... ..... ..... @xx_ui5 +xvmul_b 0111 01001000 01000 ..... ..... ..... @xxx +xvmul_h 0111 01001000 01001 ..... ..... ..... @xxx +xvmul_w 0111 01001000 01010 ..... ..... ..... @xxx +xvmul_d 0111 01001000 01011 ..... ..... ..... @xxx +xvmuh_b 0111 01001000 01100 ..... ..... ..... @xxx +xvmuh_h 0111 01001000 01101 ..... ..... ..... @xxx +xvmuh_w 0111 01001000 01110 ..... ..... ..... @xxx +xvmuh_d 0111 01001000 01111 ..... ..... ..... @xxx +xvmuh_bu 0111 01001000 10000 ..... ..... ..... @xxx +xvmuh_hu 0111 01001000 10001 ..... ..... ..... @xxx +xvmuh_wu 0111 01001000 10010 ..... ..... ..... @xxx +xvmuh_du 0111 01001000 10011 ..... ..... ..... @xxx + +xvmulwev_h_b 0111 01001001 00000 ..... ..... ..... @xxx +xvmulwev_w_h 0111 01001001 00001 ..... ..... ..... @xxx +xvmulwev_d_w 0111 01001001 00010 ..... ..... ..... @xxx +xvmulwev_q_d 0111 01001001 00011 ..... ..... ..... @xxx +xvmulwod_h_b 0111 01001001 00100 ..... ..... ..... @xxx +xvmulwod_w_h 0111 01001001 00101 ..... ..... ..... @xxx +xvmulwod_d_w 0111 01001001 00110 ..... ..... ..... @xxx +xvmulwod_q_d 0111 01001001 00111 ..... ..... ..... 
@xxx +xvmulwev_h_bu 0111 01001001 10000 ..... ..... ..... @xxx +xvmulwev_w_hu 0111 01001001 10001 ..... ..... ..... @xxx +xvmulwev_d_wu 0111 01001001 10010 ..... ..... ..... @xxx +xvmulwev_q_du 0111 01001001 10011 ..... ..... ..... @xxx +xvmulwod_h_bu 0111 01001001 10100 ..... ..... ..... @xxx +xvmulwod_w_hu 0111 01001001 10101 ..... ..... ..... @xxx +xvmulwod_d_wu 0111 01001001 10110 ..... ..... ..... @xxx +xvmulwod_q_du 0111 01001001 10111 ..... ..... ..... @xxx +xvmulwev_h_bu_b 0111 01001010 00000 ..... ..... ..... @xxx +xvmulwev_w_hu_h 0111 01001010 00001 ..... ..... ..... @xxx +xvmulwev_d_wu_w 0111 01001010 00010 ..... ..... ..... @xxx +xvmulwev_q_du_d 0111 01001010 00011 ..... ..... ..... @xxx +xvmulwod_h_bu_b 0111 01001010 00100 ..... ..... ..... @xxx +xvmulwod_w_hu_h 0111 01001010 00101 ..... ..... ..... @xxx +xvmulwod_d_wu_w 0111 01001010 00110 ..... ..... ..... @xxx +xvmulwod_q_du_d 0111 01001010 00111 ..... ..... ..... @xxx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 486cf9f7f1..4c342b06e5 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -393,3 +393,77 @@ XVMINMAXI(xvmaxi_bu, 8, UXB, DO_MAX) XVMINMAXI(xvmaxi_hu, 16, UXH, DO_MAX) XVMINMAXI(xvmaxi_wu, 32, UXW, DO_MAX) XVMINMAXI(xvmaxi_du, 64, UXD, DO_MAX) + +#define DO_XVMUH(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + typedef __typeof(Xd->E1(0)) T; \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E2(i) = ((T)Xj->E2(i)) * ((T)Xk->E2(i)) >> BIT; \ + } \ +} + +void HELPER(xvmuh_d)(void *xd, void *xj, void *xk, uint32_t v) +{ + uint64_t l, h; + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + XReg *Xk = (XReg *)xk; + int i; + + for (i = 0; i < 4; i++) { + muls64(&l, &h, Xj->XD(i), Xk->XD(i)); + Xd->XD(i) = h; + } +} + +DO_XVMUH(xvmuh_b, 8, XH, XB) +DO_XVMUH(xvmuh_h, 16, XW, XH) +DO_XVMUH(xvmuh_w, 32, XD, XW) + +void HELPER(xvmuh_du)(void *xd, void *xj, void *xk, uint32_t v) +{ + uint64_t l, h; + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + XReg *Xk = (XReg *)xk; + int i; + + for (i = 0; i < 4; i++) { + mulu64(&l, &h, Xj->XD(i), Xk->XD(i)); + Xd->XD(i) = h; + } +} + +DO_XVMUH(xvmuh_bu, 8, UXH, UXB) +DO_XVMUH(xvmuh_hu, 16, UXW, UXH) +DO_XVMUH(xvmuh_wu, 32, UXD, UXW) + +XDO_EVEN(xvmulwev_h_b, 16, XH, XB, DO_MUL) +XDO_EVEN(xvmulwev_w_h, 32, XW, XH, DO_MUL) +XDO_EVEN(xvmulwev_d_w, 64, XD, XW, DO_MUL) + +XDO_ODD(xvmulwod_h_b, 16, XH, XB, DO_MUL) +XDO_ODD(xvmulwod_w_h, 32, XW, XH, DO_MUL) +XDO_ODD(xvmulwod_d_w, 64, XD, XW, DO_MUL) + +XDO_EVEN(xvmulwev_h_bu, 16, UXH, UXB, DO_MUL) +XDO_EVEN(xvmulwev_w_hu, 32, UXW, UXH, DO_MUL) +XDO_EVEN(xvmulwev_d_wu, 64, UXD, UXW, DO_MUL) + +XDO_ODD(xvmulwod_h_bu, 16, UXH, UXB, DO_MUL) +XDO_ODD(xvmulwod_w_hu, 32, UXW, UXH, DO_MUL) +XDO_ODD(xvmulwod_d_wu, 64, UXD, UXW, DO_MUL) + 
+XDO_EVEN_U_S(xvmulwev_h_bu_b, 16, XH, UXH, XB, UXB, DO_MUL) +XDO_EVEN_U_S(xvmulwev_w_hu_h, 32, XW, UXW, XH, UXH, DO_MUL) +XDO_EVEN_U_S(xvmulwev_d_wu_w, 64, XD, UXD, XW, UXW, DO_MUL) + +XDO_ODD_U_S(xvmulwod_h_bu_b, 16, XH, UXH, XB, UXB, DO_MUL) +XDO_ODD_U_S(xvmulwod_w_hu_h, 32, XW, UXW, XH, UXH, DO_MUL) +XDO_ODD_U_S(xvmulwod_d_wu_w, 64, XD, UXD, XW, UXW, DO_MUL) diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index 192cdb164c..d384fbef3a 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ -415,8 +415,6 @@ DO_VMUH(vmuh_bu, 8, UH, UB, DO_MUH) DO_VMUH(vmuh_hu, 16, UW, UH, DO_MUH) DO_VMUH(vmuh_wu, 32, UD, UW, DO_MUH) -#define DO_MUL(a, b) (a * b) - DO_EVEN(vmulwev_h_b, 16, H, B, DO_MUL) DO_EVEN(vmulwev_w_h, 32, W, H, DO_MUL) DO_EVEN(vmulwev_d_w, 64, D, W, DO_MUL) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 96f216d569..e3dbf0f893 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -61,4 +61,6 @@ #define DO_MIN(a, b) (a < b ? a : b) #define DO_MAX(a, b) (a > b ? 
a : b) +#define DO_MUL(a, b) (a * b) + #endif /* LOONGARCH_VEC_H */ From patchwork Tue Jun 20 09:37:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285501 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 33EEFEB64D8 for ; Tue, 20 Jun 2023 09:45:49 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqa-0003cP-Nl; Tue, 20 Jun 2023 05:40:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXqS-0002mv-Uo for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:18 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqM-0006aL-Gc for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:15 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxGuqPc5FkqiUHAA--.14772S3; Tue, 20 Jun 2023 17:38:23 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S18; Tue, 20 Jun 2023 17:38:23 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 16/46] target/loongarch: Implement xvmadd/xvmsub/xvmaddw{ev/od} Date: Tue, 20 Jun 2023 17:37:44 +0800 Message-Id: <20230620093814.123650-17-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 
This patch includes:
- XVMADD.{B/H/W/D};
- XVMSUB.{B/H/W/D};
- XVMADDW{EV/OD}.{H.B/W.H/D.W/Q.D}[U];
- XVMADDW{EV/OD}.{H.BU.B/W.HU.H/D.WU.W/Q.DU.D}.
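
For reviewers, the widening even/odd multiply-accumulate can be modelled in plain C. This is only a rough sketch of what the XVMADDWEV/XVMADDWOD helper loops in this patch compute for the H.B case. The names model_xvmaddw_h_b and LASX_BITS are made up for illustration, and the sketch indexes plain arrays, so it ignores any host-endianness handling done by the real element accessors.

```c
#include <stdint.h>

#define LASX_BITS 256 /* hypothetical stand-in for the 256-bit register width */

/*
 * Scalar model of XVMADDWEV.H.B (odd == 0) and XVMADDWOD.H.B (odd == 1):
 * each 16-bit destination element accumulates the product of one pair of
 * sign-extended source bytes, taken from the even or odd byte positions.
 */
static void model_xvmaddw_h_b(int16_t xd[LASX_BITS / 16],
                              const int8_t xj[LASX_BITS / 8],
                              const int8_t xk[LASX_BITS / 8],
                              int odd)
{
    for (int i = 0; i < LASX_BITS / 16; i++) {
        int16_t a = xj[2 * i + odd]; /* widen before multiplying */
        int16_t b = xk[2 * i + odd];
        /* The sum is truncated back to 16 bits, as an int16_t store would do. */
        xd[i] = (int16_t)(xd[i] + a * b);
    }
}
```

The unsigned and mixed-sign forms would differ only in how the source bytes are widened (zero-extension instead of sign-extension for the "u" operands).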
Signed-off-by: Song Gao --- target/loongarch/disas.c | 34 ++ target/loongarch/helper.h | 30 ++ target/loongarch/insn_trans/trans_lasx.c.inc | 367 +++++++++++++++++++ target/loongarch/insns.decode | 34 ++ target/loongarch/lasx_helper.c | 104 ++++++ target/loongarch/vec.h | 3 + 6 files changed, 572 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index e7c46bc3a2..ddfc4921b9 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1927,6 +1927,40 @@ INSN_LASX(xvmulwod_w_hu_h, xxx) INSN_LASX(xvmulwod_d_wu_w, xxx) INSN_LASX(xvmulwod_q_du_d, xxx) +INSN_LASX(xvmadd_b, xxx) +INSN_LASX(xvmadd_h, xxx) +INSN_LASX(xvmadd_w, xxx) +INSN_LASX(xvmadd_d, xxx) +INSN_LASX(xvmsub_b, xxx) +INSN_LASX(xvmsub_h, xxx) +INSN_LASX(xvmsub_w, xxx) +INSN_LASX(xvmsub_d, xxx) + +INSN_LASX(xvmaddwev_h_b, xxx) +INSN_LASX(xvmaddwev_w_h, xxx) +INSN_LASX(xvmaddwev_d_w, xxx) +INSN_LASX(xvmaddwev_q_d, xxx) +INSN_LASX(xvmaddwod_h_b, xxx) +INSN_LASX(xvmaddwod_w_h, xxx) +INSN_LASX(xvmaddwod_d_w, xxx) +INSN_LASX(xvmaddwod_q_d, xxx) +INSN_LASX(xvmaddwev_h_bu, xxx) +INSN_LASX(xvmaddwev_w_hu, xxx) +INSN_LASX(xvmaddwev_d_wu, xxx) +INSN_LASX(xvmaddwev_q_du, xxx) +INSN_LASX(xvmaddwod_h_bu, xxx) +INSN_LASX(xvmaddwod_w_hu, xxx) +INSN_LASX(xvmaddwod_d_wu, xxx) +INSN_LASX(xvmaddwod_q_du, xxx) +INSN_LASX(xvmaddwev_h_bu_b, xxx) +INSN_LASX(xvmaddwev_w_hu_h, xxx) +INSN_LASX(xvmaddwev_d_wu_w, xxx) +INSN_LASX(xvmaddwev_q_du_d, xxx) +INSN_LASX(xvmaddwod_h_bu_b, xxx) +INSN_LASX(xvmaddwod_w_hu_h, xxx) +INSN_LASX(xvmaddwod_d_wu_w, xxx) +INSN_LASX(xvmaddwod_q_du_d, xxx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 88ae707027..0dc4cc18da 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -839,3 +839,33 @@ DEF_HELPER_FLAGS_4(xvmulwev_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvmulwod_h_bu_b, TCG_CALL_NO_RWG, void, ptr, 
ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvmulwod_w_hu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvmulwod_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvmadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmadd_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmsub_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmsub_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmsub_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvmaddwev_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwev_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwev_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwod_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwod_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwod_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvmaddwev_h_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwev_w_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwev_d_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwod_h_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwod_w_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwod_d_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(xvmaddwev_h_bu_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwev_w_hu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwev_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwod_h_bu_b, 
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwod_w_hu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvmaddwod_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index d57d867f17..78ba31b8c2 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -1363,6 +1363,373 @@ TRANS(xvmulwod_h_bu_b, gvec_xxx, MO_8, do_xvmulwod_u_s) TRANS(xvmulwod_w_hu_h, gvec_xxx, MO_16, do_xvmulwod_u_s) TRANS(xvmulwod_d_wu_w, gvec_xxx, MO_32, do_xvmulwod_u_s) +static void do_xvmadd(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_mul_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vmadd, + .fno = gen_helper_xvmadd_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vmadd, + .fno = gen_helper_xvmadd_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmadd_w, + .fniv = gen_vmadd, + .fno = gen_helper_xvmadd_w, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmadd_d, + .fniv = gen_vmadd, + .fno = gen_helper_xvmadd_d, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvmadd_b, gvec_xxx, MO_8, do_xvmadd) +TRANS(xvmadd_h, gvec_xxx, MO_16, do_xvmadd) +TRANS(xvmadd_w, gvec_xxx, MO_32, do_xvmadd) +TRANS(xvmadd_d, gvec_xxx, MO_64, do_xvmadd) + +static void do_xvmsub(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_mul_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vmsub, + .fno = gen_helper_xvmsub_b, + .load_dest = 
true, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vmsub, + .fno = gen_helper_xvmsub_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmsub_w, + .fniv = gen_vmsub, + .fno = gen_helper_xvmsub_w, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmsub_d, + .fniv = gen_vmsub, + .fno = gen_helper_xvmsub_d, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvmsub_b, gvec_xxx, MO_8, do_xvmsub) +TRANS(xvmsub_h, gvec_xxx, MO_16, do_xvmsub) +TRANS(xvmsub_w, gvec_xxx, MO_32, do_xvmsub) +TRANS(xvmsub_d, gvec_xxx, MO_64, do_xvmsub) + +static void do_xvmaddwev_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shli_vec, INDEX_op_sari_vec, + INDEX_op_mul_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[3] = { + { + .fniv = gen_vmaddwev_s, + .fno = gen_helper_xvmaddwev_h_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmaddwev_w_h, + .fniv = gen_vmaddwev_s, + .fno = gen_helper_xvmaddwev_w_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmaddwev_d_w, + .fniv = gen_vmaddwev_s, + .fno = gen_helper_xvmaddwev_d_w, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvmaddwev_h_b, gvec_xxx, MO_8, do_xvmaddwev_s) +TRANS(xvmaddwev_w_h, gvec_xxx, MO_16, do_xvmaddwev_s) +TRANS(xvmaddwev_d_w, gvec_xxx, MO_32, do_xvmaddwev_s) + +#define XVMADD_Q(NAME, FN, idx1, idx2) \ +static bool trans_## NAME(DisasContext *ctx, arg_xxx * a) \ +{ \ + TCGv_i64 rh, rl, arg1, arg2, th, tl; \ + int i; \ + \ + rh = tcg_temp_new_i64(); \ + rl = tcg_temp_new_i64(); \ + arg1 = tcg_temp_new_i64(); \ + arg2 = tcg_temp_new_i64(); \ + 
th = tcg_temp_new_i64(); \ + tl = tcg_temp_new_i64(); \ + \ + for (i = 0; i < 2; i++) { \ + get_xreg64(arg1, a->xj, idx1 + i * 2); \ + get_xreg64(arg2, a->xk, idx2 + i * 2); \ + get_xreg64(rh, a->xd, 1 + i * 2); \ + get_xreg64(rl, a->xd, 0 + i * 2); \ + \ + tcg_gen_## FN ##_i64(tl, th, arg1, arg2); \ + tcg_gen_add2_i64(rl, rh, rl, rh, tl, th); \ + \ + set_xreg64(rh, a->xd, 1 + i * 2); \ + set_xreg64(rl, a->xd, 0 + i * 2); \ + } \ + \ + return true; \ +} + +XVMADD_Q(xvmaddwev_q_d, muls2, 0, 0) +XVMADD_Q(xvmaddwod_q_d, muls2, 1, 1) +XVMADD_Q(xvmaddwev_q_du, mulu2, 0, 0) +XVMADD_Q(xvmaddwod_q_du, mulu2, 1, 1) +XVMADD_Q(xvmaddwev_q_du_d, mulus2, 0, 0) +XVMADD_Q(xvmaddwod_q_du_d, mulus2, 1, 1) + +static void do_xvmaddwod_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_mul_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[3] = { + { + .fniv = gen_vmaddwod_s, + .fno = gen_helper_xvmaddwod_h_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmaddwod_w_h, + .fniv = gen_vmaddwod_s, + .fno = gen_helper_xvmaddwod_w_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmaddwod_d_w, + .fniv = gen_vmaddwod_s, + .fno = gen_helper_xvmaddwod_d_w, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvmaddwod_h_b, gvec_xxx, MO_8, do_xvmaddwod_s) +TRANS(xvmaddwod_w_h, gvec_xxx, MO_16, do_xvmaddwod_s) +TRANS(xvmaddwod_d_w, gvec_xxx, MO_32, do_xvmaddwod_s) + +static void do_xvmaddwev_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_mul_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[3] = { + { + .fniv = gen_vmaddwev_u, + .fno = gen_helper_xvmaddwev_h_bu, + .load_dest 
= true, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmaddwev_w_hu, + .fniv = gen_vmaddwev_u, + .fno = gen_helper_xvmaddwev_w_hu, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmaddwev_d_wu, + .fniv = gen_vmaddwev_u, + .fno = gen_helper_xvmaddwev_d_wu, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvmaddwev_h_bu, gvec_xxx, MO_8, do_xvmaddwev_u) +TRANS(xvmaddwev_w_hu, gvec_xxx, MO_16, do_xvmaddwev_u) +TRANS(xvmaddwev_d_wu, gvec_xxx, MO_32, do_xvmaddwev_u) + +static void do_xvmaddwod_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_mul_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[3] = { + { + .fniv = gen_vmaddwod_u, + .fno = gen_helper_xvmaddwod_h_bu, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmaddwod_w_hu, + .fniv = gen_vmaddwod_u, + .fno = gen_helper_xvmaddwod_w_hu, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmaddwod_d_wu, + .fniv = gen_vmaddwod_u, + .fno = gen_helper_xvmaddwod_d_wu, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvmaddwod_h_bu, gvec_xxx, MO_8, do_xvmaddwod_u) +TRANS(xvmaddwod_w_hu, gvec_xxx, MO_16, do_xvmaddwod_u) +TRANS(xvmaddwod_d_wu, gvec_xxx, MO_32, do_xvmaddwod_u) + +static void do_xvmaddwev_u_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shli_vec, INDEX_op_sari_vec, + INDEX_op_mul_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[3] = { + { + .fniv = gen_vmaddwev_u_s, + .fno = gen_helper_xvmaddwev_h_bu_b, + .load_dest = true, + .opt_opc = 
vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmaddwev_w_hu_h, + .fniv = gen_vmaddwev_u_s, + .fno = gen_helper_xvmaddwev_w_hu_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmaddwev_d_wu_w, + .fniv = gen_vmaddwev_u_s, + .fno = gen_helper_xvmaddwev_d_wu_w, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvmaddwev_h_bu_b, gvec_xxx, MO_8, do_xvmaddwev_u_s) +TRANS(xvmaddwev_w_hu_h, gvec_xxx, MO_16, do_xvmaddwev_u_s) +TRANS(xvmaddwev_d_wu_w, gvec_xxx, MO_32, do_xvmaddwev_u_s) + +static void do_xvmaddwod_u_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_sari_vec, + INDEX_op_mul_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 op[3] = { + { + .fniv = gen_vmaddwod_u_s, + .fno = gen_helper_xvmaddwod_h_bu_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fni4 = gen_vmaddwod_w_hu_h, + .fniv = gen_vmaddwod_u_s, + .fno = gen_helper_xvmaddwod_w_hu_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fni8 = gen_vmaddwod_d_wu_w, + .fniv = gen_vmaddwod_u_s, + .fno = gen_helper_xvmaddwod_d_wu_w, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvmaddwod_h_bu_b, gvec_xxx, MO_8, do_xvmaddwod_u_s) +TRANS(xvmaddwod_w_hu_h, gvec_xxx, MO_16, do_xvmaddwod_u_s) +TRANS(xvmaddwod_d_wu_w, gvec_xxx, MO_32, do_xvmaddwod_u_s) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 872eeed7a8..cc210314ff 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1527,6 +1527,40 @@ xvmulwod_w_hu_h 0111 01001010 
00101 ..... ..... ..... @xxx xvmulwod_d_wu_w 0111 01001010 00110 ..... ..... ..... @xxx xvmulwod_q_du_d 0111 01001010 00111 ..... ..... ..... @xxx +xvmadd_b 0111 01001010 10000 ..... ..... ..... @xxx +xvmadd_h 0111 01001010 10001 ..... ..... ..... @xxx +xvmadd_w 0111 01001010 10010 ..... ..... ..... @xxx +xvmadd_d 0111 01001010 10011 ..... ..... ..... @xxx +xvmsub_b 0111 01001010 10100 ..... ..... ..... @xxx +xvmsub_h 0111 01001010 10101 ..... ..... ..... @xxx +xvmsub_w 0111 01001010 10110 ..... ..... ..... @xxx +xvmsub_d 0111 01001010 10111 ..... ..... ..... @xxx + +xvmaddwev_h_b 0111 01001010 11000 ..... ..... ..... @xxx +xvmaddwev_w_h 0111 01001010 11001 ..... ..... ..... @xxx +xvmaddwev_d_w 0111 01001010 11010 ..... ..... ..... @xxx +xvmaddwev_q_d 0111 01001010 11011 ..... ..... ..... @xxx +xvmaddwod_h_b 0111 01001010 11100 ..... ..... ..... @xxx +xvmaddwod_w_h 0111 01001010 11101 ..... ..... ..... @xxx +xvmaddwod_d_w 0111 01001010 11110 ..... ..... ..... @xxx +xvmaddwod_q_d 0111 01001010 11111 ..... ..... ..... @xxx +xvmaddwev_h_bu 0111 01001011 01000 ..... ..... ..... @xxx +xvmaddwev_w_hu 0111 01001011 01001 ..... ..... ..... @xxx +xvmaddwev_d_wu 0111 01001011 01010 ..... ..... ..... @xxx +xvmaddwev_q_du 0111 01001011 01011 ..... ..... ..... @xxx +xvmaddwod_h_bu 0111 01001011 01100 ..... ..... ..... @xxx +xvmaddwod_w_hu 0111 01001011 01101 ..... ..... ..... @xxx +xvmaddwod_d_wu 0111 01001011 01110 ..... ..... ..... @xxx +xvmaddwod_q_du 0111 01001011 01111 ..... ..... ..... @xxx +xvmaddwev_h_bu_b 0111 01001011 11000 ..... ..... ..... @xxx +xvmaddwev_w_hu_h 0111 01001011 11001 ..... ..... ..... @xxx +xvmaddwev_d_wu_w 0111 01001011 11010 ..... ..... ..... @xxx +xvmaddwev_q_du_d 0111 01001011 11011 ..... ..... ..... @xxx +xvmaddwod_h_bu_b 0111 01001011 11100 ..... ..... ..... @xxx +xvmaddwod_w_hu_h 0111 01001011 11101 ..... ..... ..... @xxx +xvmaddwod_d_wu_w 0111 01001011 11110 ..... ..... ..... @xxx +xvmaddwod_q_du_d 0111 01001011 11111 ..... ..... ..... 
@xxx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 4c342b06e5..df85fa04f0 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -467,3 +467,107 @@ XDO_EVEN_U_S(xvmulwev_d_wu_w, 64, XD, UXD, XW, UXW, DO_MUL) XDO_ODD_U_S(xvmulwod_h_bu_b, 16, XH, UXH, XB, UXB, DO_MUL) XDO_ODD_U_S(xvmulwod_w_hu_h, 32, XW, UXW, XH, UXH, DO_MUL) XDO_ODD_U_S(xvmulwod_d_wu_w, 64, XD, UXD, XW, UXW, DO_MUL) + +#define XVMADDSUB(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = DO_OP(Xd->E(i), Xj->E(i), Xk->E(i)); \ + } \ +} + +XVMADDSUB(xvmadd_b, 8, XB, DO_MADD) +XVMADDSUB(xvmadd_h, 16, XH, DO_MADD) +XVMADDSUB(xvmadd_w, 32, XW, DO_MADD) +XVMADDSUB(xvmadd_d, 64, XD, DO_MADD) +XVMADDSUB(xvmsub_b, 8, XB, DO_MSUB) +XVMADDSUB(xvmsub_h, 16, XH, DO_MSUB) +XVMADDSUB(xvmsub_w, 32, XW, DO_MSUB) +XVMADDSUB(xvmsub_d, 64, XD, DO_MSUB) + +#define XVMADDWEV(NAME, BIT, E1, E2, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + typedef __typeof(Xd->E1(0)) TD; \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E1(i) += DO_OP((TD)Xj->E2(2 * i), (TD)Xk->E2(2 * i)); \ + } \ +} + +XVMADDWEV(xvmaddwev_h_b, 16, XH, XB, DO_MUL) +XVMADDWEV(xvmaddwev_w_h, 32, XW, XH, DO_MUL) +XVMADDWEV(xvmaddwev_d_w, 64, XD, XW, DO_MUL) +XVMADDWEV(xvmaddwev_h_bu, 16, UXH, UXB, DO_MUL) +XVMADDWEV(xvmaddwev_w_hu, 32, UXW, UXH, DO_MUL) +XVMADDWEV(xvmaddwev_d_wu, 64, UXD, UXW, DO_MUL) + +#define XVMADDWOD(NAME, BIT, E1, E2, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ 
+ int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + typedef __typeof(Xd->E1(0)) TD; \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E1(i) += DO_OP((TD)Xj->E2(2 * i + 1), \ + (TD)Xk->E2(2 * i + 1)); \ + } \ +} + +XVMADDWOD(xvmaddwod_h_b, 16, XH, XB, DO_MUL) +XVMADDWOD(xvmaddwod_w_h, 32, XW, XH, DO_MUL) +XVMADDWOD(xvmaddwod_d_w, 64, XD, XW, DO_MUL) +XVMADDWOD(xvmaddwod_h_bu, 16, UXH, UXB, DO_MUL) +XVMADDWOD(xvmaddwod_w_hu, 32, UXW, UXH, DO_MUL) +XVMADDWOD(xvmaddwod_d_wu, 64, UXD, UXW, DO_MUL) + +#define XVMADDWEV_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + typedef __typeof(Xd->ES1(0)) TS1; \ + typedef __typeof(Xd->EU1(0)) TU1; \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->ES1(i) += DO_OP((TU1)Xj->EU2(2 * i), \ + (TS1)Xk->ES2(2 * i)); \ + } \ +} + +XVMADDWEV_U_S(xvmaddwev_h_bu_b, 16, XH, UXH, XB, UXB, DO_MUL) +XVMADDWEV_U_S(xvmaddwev_w_hu_h, 32, XW, UXW, XH, UXH, DO_MUL) +XVMADDWEV_U_S(xvmaddwev_d_wu_w, 64, XD, UXD, XW, UXW, DO_MUL) + +#define XVMADDWOD_U_S(NAME, BIT, ES1, EU1, ES2, EU2, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + typedef __typeof(Xd->ES1(0)) TS1; \ + typedef __typeof(Xd->EU1(0)) TU1; \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->ES1(i) += DO_OP((TU1)Xj->EU2(2 * i + 1), \ + (TS1)Xk->ES2(2 * i + 1)); \ + } \ +} + +XVMADDWOD_U_S(xvmaddwod_h_bu_b, 16, XH, UXH, XB, UXB, DO_MUL) +XVMADDWOD_U_S(xvmaddwod_w_hu_h, 32, XW, UXW, XH, UXH, DO_MUL) +XVMADDWOD_U_S(xvmaddwod_d_wu_w, 64, XD, UXD, XW, UXW, DO_MUL) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index e3dbf0f893..06992410ad 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -63,4 +63,7 @@ #define DO_MUL(a, b) (a * b) 
+#define DO_MADD(a, b, c) (a + b * c) +#define DO_MSUB(a, b, c) (a - b * c) + #endif /* LOONGARCH_VEC_H */ From patchwork Tue Jun 20 09:37:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285487 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C422BEB64DD for ; Tue, 20 Jun 2023 09:43:40 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXou-00076i-3i; Tue, 20 Jun 2023 05:38:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXot-00076J-9B for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:39 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXoq-0006La-Vm for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:39 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8DxzOqQc5FkrSUHAA--.14640S3; Tue, 20 Jun 2023 17:38:24 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S19; Tue, 20 Jun 2023 17:38:23 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 17/46] target/loongarch: Implement xvdiv/xvmod Date: Tue, 20 Jun 2023 17:37:45 +0800 Message-Id: <20230620093814.123650-18-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn>
MIME-Version: 1.0

This patch includes:
- XVDIV.{B/H/W/D}[U];
- XVMOD.{B/H/W/D}[U].

Signed-off-by: Song Gao --- target/loongarch/disas.c | 17 +++++++++++ target/loongarch/helper.h | 17 +++++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 17 +++++++++++ target/loongarch/insns.decode | 17 +++++++++++ target/loongarch/lasx_helper.c | 30 ++++++++++++++++++++ target/loongarch/lsx_helper.c | 7 ----- target/loongarch/vec.h | 7 +++++ 7 files changed, 105 insertions(+), 7 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index ddfc4921b9..83efde440f 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1961,6 +1961,23 @@ INSN_LASX(xvmaddwod_w_hu_h, xxx) INSN_LASX(xvmaddwod_d_wu_w, xxx) INSN_LASX(xvmaddwod_q_du_d, xxx) +INSN_LASX(xvdiv_b, xxx) +INSN_LASX(xvdiv_h, xxx) +INSN_LASX(xvdiv_w, xxx) +INSN_LASX(xvdiv_d, xxx) +INSN_LASX(xvdiv_bu, xxx) +INSN_LASX(xvdiv_hu, xxx) +INSN_LASX(xvdiv_wu, xxx) +INSN_LASX(xvdiv_du, xxx) +INSN_LASX(xvmod_b, xxx) +INSN_LASX(xvmod_h, xxx) +INSN_LASX(xvmod_w, xxx) +INSN_LASX(xvmod_d, xxx) +INSN_LASX(xvmod_bu, xxx)
+INSN_LASX(xvmod_hu, xxx) +INSN_LASX(xvmod_wu, xxx) +INSN_LASX(xvmod_du, xxx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 0dc4cc18da..95c7ecba3b 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -869,3 +869,20 @@ DEF_HELPER_FLAGS_4(xvmaddwev_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvmaddwod_h_bu_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvmaddwod_w_hu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvmaddwod_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_4(xvdiv_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvdiv_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvdiv_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvdiv_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvdiv_bu, void, env, i32, i32, i32) +DEF_HELPER_4(xvdiv_hu, void, env, i32, i32, i32) +DEF_HELPER_4(xvdiv_wu, void, env, i32, i32, i32) +DEF_HELPER_4(xvdiv_du, void, env, i32, i32, i32) +DEF_HELPER_4(xvmod_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvmod_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvmod_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvmod_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvmod_bu, void, env, i32, i32, i32) +DEF_HELPER_4(xvmod_hu, void, env, i32, i32, i32) +DEF_HELPER_4(xvmod_wu, void, env, i32, i32, i32) +DEF_HELPER_4(xvmod_du, void, env, i32, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 78ba31b8c2..930872c939 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -1730,6 +1730,23 @@ TRANS(xvmaddwod_h_bu_b, gvec_xxx, MO_8, do_xvmaddwod_u_s) TRANS(xvmaddwod_w_hu_h, gvec_xxx, MO_16, do_xvmaddwod_u_s) TRANS(xvmaddwod_d_wu_w, gvec_xxx, MO_32, do_xvmaddwod_u_s) +TRANS(xvdiv_b, gen_xxx, gen_helper_xvdiv_b) +TRANS(xvdiv_h, gen_xxx, 
gen_helper_xvdiv_h) +TRANS(xvdiv_w, gen_xxx, gen_helper_xvdiv_w) +TRANS(xvdiv_d, gen_xxx, gen_helper_xvdiv_d) +TRANS(xvdiv_bu, gen_xxx, gen_helper_xvdiv_bu) +TRANS(xvdiv_hu, gen_xxx, gen_helper_xvdiv_hu) +TRANS(xvdiv_wu, gen_xxx, gen_helper_xvdiv_wu) +TRANS(xvdiv_du, gen_xxx, gen_helper_xvdiv_du) +TRANS(xvmod_b, gen_xxx, gen_helper_xvmod_b) +TRANS(xvmod_h, gen_xxx, gen_helper_xvmod_h) +TRANS(xvmod_w, gen_xxx, gen_helper_xvmod_w) +TRANS(xvmod_d, gen_xxx, gen_helper_xvmod_d) +TRANS(xvmod_bu, gen_xxx, gen_helper_xvmod_bu) +TRANS(xvmod_hu, gen_xxx, gen_helper_xvmod_hu) +TRANS(xvmod_wu, gen_xxx, gen_helper_xvmod_wu) +TRANS(xvmod_du, gen_xxx, gen_helper_xvmod_du) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index cc210314ff..0bd4e7709a 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1561,6 +1561,23 @@ xvmaddwod_w_hu_h 0111 01001011 11101 ..... ..... ..... @xxx xvmaddwod_d_wu_w 0111 01001011 11110 ..... ..... ..... @xxx xvmaddwod_q_du_d 0111 01001011 11111 ..... ..... ..... @xxx +xvdiv_b 0111 01001110 00000 ..... ..... ..... @xxx +xvdiv_h 0111 01001110 00001 ..... ..... ..... @xxx +xvdiv_w 0111 01001110 00010 ..... ..... ..... @xxx +xvdiv_d 0111 01001110 00011 ..... ..... ..... @xxx +xvmod_b 0111 01001110 00100 ..... ..... ..... @xxx +xvmod_h 0111 01001110 00101 ..... ..... ..... @xxx +xvmod_w 0111 01001110 00110 ..... ..... ..... @xxx +xvmod_d 0111 01001110 00111 ..... ..... ..... @xxx +xvdiv_bu 0111 01001110 01000 ..... ..... ..... @xxx +xvdiv_hu 0111 01001110 01001 ..... ..... ..... @xxx +xvdiv_wu 0111 01001110 01010 ..... ..... ..... @xxx +xvdiv_du 0111 01001110 01011 ..... ..... ..... @xxx +xvmod_bu 0111 01001110 01100 ..... ..... ..... @xxx +xvmod_hu 0111 01001110 01101 ..... ..... ..... @xxx +xvmod_wu 0111 01001110 01110 ..... ..... ..... @xxx +xvmod_du 0111 01001110 01111 ..... ..... ..... 
@xxx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index df85fa04f0..d4a4a7659a 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -571,3 +571,33 @@ void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ XVMADDWOD_U_S(xvmaddwod_h_bu_b, 16, XH, UXH, XB, UXB, DO_MUL) XVMADDWOD_U_S(xvmaddwod_w_hu_h, 32, XW, UXW, XH, UXH, DO_MUL) XVMADDWOD_U_S(xvmaddwod_d_wu_w, 64, XD, UXD, XW, UXW, DO_MUL) + +#define XVDIV(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = DO_OP(Xj->E(i), Xk->E(i)); \ + } \ +} + +XVDIV(xvdiv_b, 8, XB, DO_DIV) +XVDIV(xvdiv_h, 16, XH, DO_DIV) +XVDIV(xvdiv_w, 32, XW, DO_DIV) +XVDIV(xvdiv_d, 64, XD, DO_DIV) +XVDIV(xvdiv_bu, 8, UXB, DO_DIVU) +XVDIV(xvdiv_hu, 16, UXH, DO_DIVU) +XVDIV(xvdiv_wu, 32, UXW, DO_DIVU) +XVDIV(xvdiv_du, 64, UXD, DO_DIVU) +XVDIV(xvmod_b, 8, XB, DO_REM) +XVDIV(xvmod_h, 16, XH, DO_REM) +XVDIV(xvmod_w, 32, XW, DO_REM) +XVDIV(xvmod_d, 64, XD, DO_REM) +XVDIV(xvmod_bu, 8, UXB, DO_REMU) +XVDIV(xvmod_hu, 16, UXH, DO_REMU) +XVDIV(xvmod_wu, 32, UXW, DO_REMU) +XVDIV(xvmod_du, 64, UXD, DO_REMU) diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index d384fbef3a..5aac0c9ef5 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ -546,13 +546,6 @@ VMADDWOD_U_S(vmaddwod_h_bu_b, 16, H, UH, B, UB, DO_MUL) VMADDWOD_U_S(vmaddwod_w_hu_h, 32, W, UW, H, UH, DO_MUL) VMADDWOD_U_S(vmaddwod_d_wu_w, 64, D, UD, W, UW, DO_MUL) -#define DO_DIVU(N, M) (unlikely(M == 0) ? 0 : N / M) -#define DO_REMU(N, M) (unlikely(M == 0) ? 
0 : N % M) -#define DO_DIV(N, M) (unlikely(M == 0) ? 0 :\ - unlikely((N == -N) && (M == (__typeof(N))(-1))) ? N : N / M) -#define DO_REM(N, M) (unlikely(M == 0) ? 0 :\ - unlikely((N == -N) && (M == (__typeof(N))(-1))) ? 0 : N % M) - #define VDIV(NAME, BIT, E, DO_OP) \ void HELPER(NAME)(CPULoongArchState *env, \ uint32_t vd, uint32_t vj, uint32_t vk) \ diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 06992410ad..c748957158 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -66,4 +66,11 @@ #define DO_MADD(a, b, c) (a + b * c) #define DO_MSUB(a, b, c) (a - b * c) +#define DO_DIVU(N, M) (unlikely(M == 0) ? 0 : N / M) +#define DO_REMU(N, M) (unlikely(M == 0) ? 0 : N % M) +#define DO_DIV(N, M) (unlikely(M == 0) ? 0 :\ + unlikely((N == -N) && (M == (__typeof(N))(-1))) ? N : N / M) +#define DO_REM(N, M) (unlikely(M == 0) ? 0 :\ + unlikely((N == -N) && (M == (__typeof(N))(-1))) ? 0 : N % M) + #endif /* LOONGARCH_VEC_H */

From patchwork Tue Jun 20 09:37:46 2023
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285497
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 18/46] target/loongarch: Implement xvsat
Date: Tue, 20 Jun 2023 17:37:46 +0800
Message-Id: <20230620093814.123650-19-gaosong@loongson.cn>
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

This patch includes:
- XVSAT.{B/H/W/D}[U].
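For review purposes, the per-element behaviour of the XVSAT_S/XVSAT_U helpers and the bound computed in do_xvsat_s/do_xvsat_u can be sketched in standalone C. This is an illustration for 8-bit elements only, not code from the patch: the names sat_s8, sat_u8, sat_s_max and sat_u_max are hypothetical, and the sketch shifts an unsigned constant where the patch uses 1ll.

```c
#include <assert.h>
#include <stdint.h>

/* Signed bound as passed to the helper: 2^imm - 1 (imm is 0..7 for bytes). */
static uint64_t sat_s_max(unsigned imm)
{
    return (1ull << imm) - 1;
}

/* Unsigned bound: 2^(imm + 1) - 1, with imm == 0x3f special-cased so the
 * shift count never reaches 64. */
static uint64_t sat_u_max(unsigned imm)
{
    return imm == 0x3f ? UINT64_MAX : (1ull << (imm + 1)) - 1;
}

/* Clamp one signed byte to [~max, max], like the XVSAT_S loop body.
 * The narrowing casts rely on the usual two's-complement truncation. */
static int8_t sat_s8(int8_t x, uint64_t max)
{
    int8_t hi = (int8_t)max;
    int8_t lo = (int8_t)~max;

    return x > hi ? hi : x < lo ? lo : x;
}

/* Clamp one unsigned byte to [0, max], like the XVSAT_U loop body. */
static uint8_t sat_u8(uint8_t x, uint64_t max)
{
    uint8_t hi = (uint8_t)max;

    return x > hi ? hi : x;
}
```

With imm = 3 the signed range is [-8, 7] and the unsigned range is [0, 15], so values outside those ranges are replaced by the nearest bound and values inside are left unchanged.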
Signed-off-by: Song Gao --- target/loongarch/disas.c | 9 ++ target/loongarch/helper.h | 9 ++ target/loongarch/insn_trans/trans_lasx.c.inc | 86 ++++++++++++++++++++ target/loongarch/insns.decode | 13 +++ target/loongarch/lasx_helper.c | 37 +++++++++ 5 files changed, 154 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 83efde440f..18fa454be8 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1978,6 +1978,15 @@ INSN_LASX(xvmod_hu, xxx) INSN_LASX(xvmod_wu, xxx) INSN_LASX(xvmod_du, xxx) +INSN_LASX(xvsat_b, xx_i) +INSN_LASX(xvsat_h, xx_i) +INSN_LASX(xvsat_w, xx_i) +INSN_LASX(xvsat_d, xx_i) +INSN_LASX(xvsat_bu, xx_i) +INSN_LASX(xvsat_hu, xx_i) +INSN_LASX(xvsat_wu, xx_i) +INSN_LASX(xvsat_du, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 95c7ecba3b..741872a24d 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -886,3 +886,12 @@ DEF_HELPER_4(xvmod_bu, void, env, i32, i32, i32) DEF_HELPER_4(xvmod_hu, void, env, i32, i32, i32) DEF_HELPER_4(xvmod_wu, void, env, i32, i32, i32) DEF_HELPER_4(xvmod_du, void, env, i32, i32, i32) + +DEF_HELPER_FLAGS_4(xvsat_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvsat_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvsat_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvsat_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvsat_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvsat_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvsat_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvsat_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 930872c939..350d575a6a 100644 --- 
a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -1747,6 +1747,92 @@ TRANS(xvmod_hu, gen_xxx, gen_helper_xvmod_hu) TRANS(xvmod_wu, gen_xxx, gen_helper_xvmod_wu) TRANS(xvmod_du, gen_xxx, gen_helper_xvmod_du) +static void do_xvsat_s(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + int64_t imm, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_smax_vec, INDEX_op_smin_vec, 0 + }; + static const GVecGen2s op[4] = { + { + .fniv = gen_vsat_s, + .fno = gen_helper_xvsat_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vsat_s, + .fno = gen_helper_xvsat_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vsat_s, + .fno = gen_helper_xvsat_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vsat_s, + .fno = gen_helper_xvsat_d, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_2s(xd_ofs, xj_ofs, oprsz, maxsz, + tcg_constant_i64((1ll << imm) - 1), &op[vece]); +} + +TRANS(xvsat_b, gvec_xx_i, MO_8, do_xvsat_s) +TRANS(xvsat_h, gvec_xx_i, MO_16, do_xvsat_s) +TRANS(xvsat_w, gvec_xx_i, MO_32, do_xvsat_s) +TRANS(xvsat_d, gvec_xx_i, MO_64, do_xvsat_s) + +static void do_xvsat_u(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + int64_t imm, uint32_t oprsz, uint32_t maxsz) +{ + uint64_t max; + static const TCGOpcode vecop_list[] = { + INDEX_op_umin_vec, 0 + }; + static const GVecGen2s op[4] = { + { + .fniv = gen_vsat_u, + .fno = gen_helper_xvsat_bu, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vsat_u, + .fno = gen_helper_xvsat_hu, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vsat_u, + .fno = gen_helper_xvsat_wu, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vsat_u, + .fno = gen_helper_xvsat_du, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + max = (imm == 0x3f) ? 
UINT64_MAX : (1ull << (imm + 1)) - 1; + tcg_gen_gvec_2s(xd_ofs, xj_ofs, oprsz, maxsz, + tcg_constant_i64(max), &op[vece]); +} + +TRANS(xvsat_bu, gvec_xx_i, MO_8, do_xvsat_u) +TRANS(xvsat_hu, gvec_xx_i, MO_16, do_xvsat_u) +TRANS(xvsat_wu, gvec_xx_i, MO_32, do_xvsat_u) +TRANS(xvsat_du, gvec_xx_i, MO_64, do_xvsat_u) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 0bd4e7709a..9efb5f2032 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1314,7 +1314,11 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 @xxx .... ........ ..... xk:5 xj:5 xd:5 &xxx @xr .... ........ ..... ..... rj:5 xd:5 &xr @xx_i5 .... ........ ..... imm:s5 xj:5 xd:5 &xx_i +@xx_ui3 .... ........ ..... .. imm:3 xj:5 xd:5 &xx_i +@xx_ui4 .... ........ ..... . imm:4 xj:5 xd:5 &xx_i @xx_ui5 .... ........ ..... imm:5 xj:5 xd:5 &xx_i +@xx_ui6 .... ........ .... imm:6 xj:5 xd:5 &xx_i + xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx xvadd_h 0111 01000000 10101 ..... ..... ..... @xxx @@ -1578,6 +1582,15 @@ xvmod_hu 0111 01001110 01101 ..... ..... ..... @xxx xvmod_wu 0111 01001110 01110 ..... ..... ..... @xxx xvmod_du 0111 01001110 01111 ..... ..... ..... @xxx +xvsat_b 0111 01110010 01000 01 ... ..... ..... @xx_ui3 +xvsat_h 0111 01110010 01000 1 .... ..... ..... @xx_ui4 +xvsat_w 0111 01110010 01001 ..... ..... ..... @xx_ui5 +xvsat_d 0111 01110010 0101 ...... ..... ..... @xx_ui6 +xvsat_bu 0111 01110010 10000 01 ... ..... ..... @xx_ui3 +xvsat_hu 0111 01110010 10000 1 .... ..... ..... @xx_ui4 +xvsat_wu 0111 01110010 10001 ..... ..... ..... @xx_ui5 +xvsat_du 0111 01110010 1001 ...... ..... ..... @xx_ui6 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index d4a4a7659a..33da60f2d8 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -601,3 +601,40 @@ XVDIV(xvmod_bu, 8, UXB, DO_REMU) XVDIV(xvmod_hu, 16, UXH, DO_REMU) XVDIV(xvmod_wu, 32, UXW, DO_REMU) XVDIV(xvmod_du, 64, UXD, DO_REMU) + +#define XVSAT_S(NAME, BIT, E) \ +void HELPER(NAME)(void *xd, void *xj, uint64_t max, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + typedef __typeof(Xd->E(0)) TD; \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = Xj->E(i) > (TD)max ? (TD)max : \ + Xj->E(i) < (TD)~max ? (TD)~max : Xj->E(i); \ + } \ +} + +XVSAT_S(xvsat_b, 8, XB) +XVSAT_S(xvsat_h, 16, XH) +XVSAT_S(xvsat_w, 32, XW) +XVSAT_S(xvsat_d, 64, XD) + +#define XVSAT_U(NAME, BIT, E) \ +void HELPER(NAME)(void *xd, void *xj, uint64_t max, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + typedef __typeof(Xd->E(0)) TD; \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = Xj->E(i) > (TD)max ? 
(TD)max : Xj->E(i); \ + } \ +} + +XVSAT_U(xvsat_bu, 8, UXB) +XVSAT_U(xvsat_hu, 16, UXH) +XVSAT_U(xvsat_wu, 32, UXW) +XVSAT_U(xvsat_du, 64, UXD)

From patchwork Tue Jun 20 09:37:47 2023
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285495
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 19/46] target/loongarch: Implement xvexth
Date: Tue, 20 Jun 2023 17:37:47 +0800
Message-Id: <20230620093814.123650-20-gaosong@loongson.cn>
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

This patch includes:
- XVEXTH.{H.B/W.H/D.W/Q.D};
- XVEXTH.{HU.BU/WU.HU/DU.WU/QU.DU}.
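The lane-wise element selection of XVEXTH may be easier to check against a standalone sketch. The code below covers the H.B case only and is an illustration, not code from the patch: xreg_sketch and xvexth_h_b_sketch are hypothetical stand-ins for XReg and the generated helper, and the sketch writes through a temporary so that source and destination may be the same register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 256-bit register viewed as 32 bytes or 16 halfwords. */
typedef union {
    int8_t  b[32];
    int16_t h[16];
} xreg_sketch;

/* XVEXTH.H.B: within each 128-bit lane, sign-extend the upper eight
 * bytes to halfwords and store them across that whole lane. */
static void xvexth_h_b_sketch(xreg_sketch *xd, const xreg_sketch *xj)
{
    xreg_sketch tmp;
    int i;

    for (i = 0; i < 8; i++) {
        tmp.h[i]     = xj->b[i + 8];    /* low lane: bytes 8..15 */
        tmp.h[i + 8] = xj->b[i + 24];   /* high lane: bytes 24..31 */
    }
    *xd = tmp;
}
```

The two source indices correspond to `i + max` and `i + max * 3` in the XVEXTH macro with `max = LASX_LEN / (BIT * 2) = 8` for 16-bit destination elements.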
Signed-off-by: Song Gao --- target/loongarch/disas.c | 9 +++++ target/loongarch/helper.h | 9 +++++ target/loongarch/insn_trans/trans_lasx.c.inc | 20 ++++++++++ target/loongarch/insns.decode | 9 +++++ target/loongarch/lasx_helper.c | 39 ++++++++++++++++++++ 5 files changed, 86 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 18fa454be8..5ac374bc63 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1987,6 +1987,15 @@ INSN_LASX(xvsat_hu, xx_i) INSN_LASX(xvsat_wu, xx_i) INSN_LASX(xvsat_du, xx_i) +INSN_LASX(xvexth_h_b, xx) +INSN_LASX(xvexth_w_h, xx) +INSN_LASX(xvexth_d_w, xx) +INSN_LASX(xvexth_q_d, xx) +INSN_LASX(xvexth_hu_bu, xx) +INSN_LASX(xvexth_wu_hu, xx) +INSN_LASX(xvexth_du_wu, xx) +INSN_LASX(xvexth_qu_du, xx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 741872a24d..17e54eb29a 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -895,3 +895,12 @@ DEF_HELPER_FLAGS_4(xvsat_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(xvsat_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(xvsat_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(xvsat_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_3(xvexth_h_b, void, env, i32, i32) +DEF_HELPER_3(xvexth_w_h, void, env, i32, i32) +DEF_HELPER_3(xvexth_d_w, void, env, i32, i32) +DEF_HELPER_3(xvexth_q_d, void, env, i32, i32) +DEF_HELPER_3(xvexth_hu_bu, void, env, i32, i32) +DEF_HELPER_3(xvexth_wu_hu, void, env, i32, i32) +DEF_HELPER_3(xvexth_du_wu, void, env, i32, i32) +DEF_HELPER_3(xvexth_qu_du, void, env, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 350d575a6a..5110cf9a33 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -28,6 +28,17 @@ static 
bool gen_xxx(DisasContext *ctx, arg_xxx *a, return true; } +static bool gen_xx(DisasContext *ctx, arg_xx *a, + void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32)) +{ + TCGv_i32 xd = tcg_constant_i32(a->xd); + TCGv_i32 xj = tcg_constant_i32(a->xj); + + CHECK_SXE; + func(cpu_env, xd, xj); + return true; +} + static bool gvec_xxx(DisasContext *ctx, arg_xxx *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t)) @@ -1833,6 +1844,15 @@ TRANS(xvsat_hu, gvec_xx_i, MO_16, do_xvsat_u) TRANS(xvsat_wu, gvec_xx_i, MO_32, do_xvsat_u) TRANS(xvsat_du, gvec_xx_i, MO_64, do_xvsat_u) +TRANS(xvexth_h_b, gen_xx, gen_helper_xvexth_h_b) +TRANS(xvexth_w_h, gen_xx, gen_helper_xvexth_w_h) +TRANS(xvexth_d_w, gen_xx, gen_helper_xvexth_d_w) +TRANS(xvexth_q_d, gen_xx, gen_helper_xvexth_q_d) +TRANS(xvexth_hu_bu, gen_xx, gen_helper_xvexth_hu_bu) +TRANS(xvexth_wu_hu, gen_xx, gen_helper_xvexth_wu_hu) +TRANS(xvexth_du_wu, gen_xx, gen_helper_xvexth_du_wu) +TRANS(xvexth_qu_du, gen_xx, gen_helper_xvexth_qu_du) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 9efb5f2032..98de616846 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1591,6 +1591,15 @@ xvsat_hu 0111 01110010 10000 1 .... ..... ..... @xx_ui4 xvsat_wu 0111 01110010 10001 ..... ..... ..... @xx_ui5 xvsat_du 0111 01110010 1001 ...... ..... ..... @xx_ui6 +xvexth_h_b 0111 01101001 11101 11000 ..... ..... @xx +xvexth_w_h 0111 01101001 11101 11001 ..... ..... @xx +xvexth_d_w 0111 01101001 11101 11010 ..... ..... @xx +xvexth_q_d 0111 01101001 11101 11011 ..... ..... @xx +xvexth_hu_bu 0111 01101001 11101 11100 ..... ..... @xx +xvexth_wu_hu 0111 01101001 11101 11101 ..... ..... @xx +xvexth_du_wu 0111 01101001 11101 11110 ..... ..... @xx +xvexth_qu_du 0111 01101001 11101 11111 ..... ..... @xx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... 
@xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 33da60f2d8..ca74263c6e 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -638,3 +638,42 @@ XVSAT_U(xvsat_bu, 8, UXB) XVSAT_U(xvsat_hu, 16, UXH) XVSAT_U(xvsat_wu, 32, UXW) XVSAT_U(xvsat_du, 64, UXD) + +#define XVEXTH(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, uint32_t xd, uint32_t xj) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = Xj->E2(i + max); \ + Xd->E1(i + max) = Xj->E2(i + max * 3); \ + } \ +} + +void HELPER(xvexth_q_d)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + Xd->XQ(0) = int128_makes64(Xj->XD(1)); + Xd->XQ(1) = int128_makes64(Xj->XD(3)); +} + +void HELPER(xvexth_qu_du)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + Xd->XQ(0) = int128_make64(Xj->UXD(1)); + Xd->XQ(1) = int128_make64(Xj->UXD(3)); +} + +XVEXTH(xvexth_h_b, 16, XH, XB) +XVEXTH(xvexth_w_h, 32, XW, XH) +XVEXTH(xvexth_d_w, 64, XD, XW) +XVEXTH(xvexth_hu_bu, 16, UXH, UXB) +XVEXTH(xvexth_wu_hu, 32, UXW, UXH) +XVEXTH(xvexth_du_wu, 64, UXD, UXW)

From patchwork Tue Jun 20 09:37:48 2023
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285447
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 20/46] target/loongarch: Implement vext2xv
Date: Tue, 20 Jun 2023 17:37:48 +0800
Message-Id: <20230620093814.123650-21-gaosong@loongson.cn>
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

This patch includes:
- VEXT2XV.{H/W/D}.B, VEXT2XV.{HU/WU/DU}.BU;
- VEXT2XV.{W/D}.H, VEXT2XV.{WU/DU}.HU;
- VEXT2XV.D.W, VEXT2XV.DU.WU.

Signed-off-by: Song Gao --- target/loongarch/disas.c | 13 ++++++++++ target/loongarch/helper.h | 13 ++++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 13 ++++++++++ target/loongarch/insns.decode | 13 ++++++++++ target/loongarch/lasx_helper.c | 27 ++++++++++++++++++++ 5 files changed, 79 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 5ac374bc63..1897aa7ba1 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1996,6 +1996,19 @@ INSN_LASX(xvexth_wu_hu, xx) INSN_LASX(xvexth_du_wu, xx) INSN_LASX(xvexth_qu_du, xx) +INSN_LASX(vext2xv_h_b, xx) +INSN_LASX(vext2xv_w_b, xx) +INSN_LASX(vext2xv_d_b, xx) +INSN_LASX(vext2xv_w_h, xx) +INSN_LASX(vext2xv_d_h, xx) +INSN_LASX(vext2xv_d_w, xx) +INSN_LASX(vext2xv_hu_bu, xx) +INSN_LASX(vext2xv_wu_bu, xx) +INSN_LASX(vext2xv_du_bu, xx) +INSN_LASX(vext2xv_wu_hu, xx) +INSN_LASX(vext2xv_du_hu, xx) +INSN_LASX(vext2xv_du_wu, xx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 17e54eb29a..7a303ee3f1 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -904,3 +904,16 @@ DEF_HELPER_3(xvexth_hu_bu, void, env, i32, i32) DEF_HELPER_3(xvexth_wu_hu, void, env, i32, i32) DEF_HELPER_3(xvexth_du_wu, void, env, i32, i32) DEF_HELPER_3(xvexth_qu_du, void, env, i32, i32) + +DEF_HELPER_3(vext2xv_h_b, void, env, i32, i32) +DEF_HELPER_3(vext2xv_w_b, void, env, i32, i32) +DEF_HELPER_3(vext2xv_d_b, void, env, i32, i32) +DEF_HELPER_3(vext2xv_w_h, void,
env, i32, i32) +DEF_HELPER_3(vext2xv_d_h, void, env, i32, i32) +DEF_HELPER_3(vext2xv_d_w, void, env, i32, i32) +DEF_HELPER_3(vext2xv_hu_bu, void, env, i32, i32) +DEF_HELPER_3(vext2xv_wu_bu, void, env, i32, i32) +DEF_HELPER_3(vext2xv_du_bu, void, env, i32, i32) +DEF_HELPER_3(vext2xv_wu_hu, void, env, i32, i32) +DEF_HELPER_3(vext2xv_du_hu, void, env, i32, i32) +DEF_HELPER_3(vext2xv_du_wu, void, env, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 5110cf9a33..c04469af75 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -1853,6 +1853,19 @@ TRANS(xvexth_wu_hu, gen_xx, gen_helper_xvexth_wu_hu) TRANS(xvexth_du_wu, gen_xx, gen_helper_xvexth_du_wu) TRANS(xvexth_qu_du, gen_xx, gen_helper_xvexth_qu_du) +TRANS(vext2xv_h_b, gen_xx, gen_helper_vext2xv_h_b) +TRANS(vext2xv_w_b, gen_xx, gen_helper_vext2xv_w_b) +TRANS(vext2xv_d_b, gen_xx, gen_helper_vext2xv_d_b) +TRANS(vext2xv_w_h, gen_xx, gen_helper_vext2xv_w_h) +TRANS(vext2xv_d_h, gen_xx, gen_helper_vext2xv_d_h) +TRANS(vext2xv_d_w, gen_xx, gen_helper_vext2xv_d_w) +TRANS(vext2xv_hu_bu, gen_xx, gen_helper_vext2xv_hu_bu) +TRANS(vext2xv_wu_bu, gen_xx, gen_helper_vext2xv_wu_bu) +TRANS(vext2xv_du_bu, gen_xx, gen_helper_vext2xv_du_bu) +TRANS(vext2xv_wu_hu, gen_xx, gen_helper_vext2xv_wu_hu) +TRANS(vext2xv_du_hu, gen_xx, gen_helper_vext2xv_du_hu) +TRANS(vext2xv_du_wu, gen_xx, gen_helper_vext2xv_du_wu) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 98de616846..9f1cb04368 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1600,6 +1600,19 @@ xvexth_wu_hu 0111 01101001 11101 11101 ..... ..... @xx xvexth_du_wu 0111 01101001 11101 11110 ..... ..... @xx xvexth_qu_du 0111 01101001 11101 11111 ..... ..... 
@xx +vext2xv_h_b 0111 01101001 11110 00100 ..... ..... @xx +vext2xv_w_b 0111 01101001 11110 00101 ..... ..... @xx +vext2xv_d_b 0111 01101001 11110 00110 ..... ..... @xx +vext2xv_w_h 0111 01101001 11110 00111 ..... ..... @xx +vext2xv_d_h 0111 01101001 11110 01000 ..... ..... @xx +vext2xv_d_w 0111 01101001 11110 01001 ..... ..... @xx +vext2xv_hu_bu 0111 01101001 11110 01010 ..... ..... @xx +vext2xv_wu_bu 0111 01101001 11110 01011 ..... ..... @xx +vext2xv_du_bu 0111 01101001 11110 01100 ..... ..... @xx +vext2xv_wu_hu 0111 01101001 11110 01101 ..... ..... @xx +vext2xv_du_hu 0111 01101001 11110 01110 ..... ..... @xx +vext2xv_du_wu 0111 01101001 11110 01111 ..... ..... @xx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index ca74263c6e..ca82d03ff4 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -677,3 +677,30 @@ XVEXTH(xvexth_d_w, 64, XD, XW) XVEXTH(xvexth_hu_bu, 16, UXH, UXB) XVEXTH(xvexth_wu_hu, 32, UXW, UXH) XVEXTH(xvexth_du_wu, 64, UXD, UXW) + +#define VEXT2XV(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, uint32_t xd, uint32_t xj) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg temp; \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + temp.E1(i) = Xj->E2(i); \ + } \ + *Xd = temp; \ +} + +VEXT2XV(vext2xv_h_b, 16, XH, XB) +VEXT2XV(vext2xv_w_b, 32, XW, XB) +VEXT2XV(vext2xv_d_b, 64, XD, XB) +VEXT2XV(vext2xv_w_h, 32, XW, XH) +VEXT2XV(vext2xv_d_h, 64, XD, XH) +VEXT2XV(vext2xv_d_w, 64, XD, XW) +VEXT2XV(vext2xv_hu_bu, 16, UXH, UXB) +VEXT2XV(vext2xv_wu_bu, 32, UXW, UXB) +VEXT2XV(vext2xv_du_bu, 64, UXD, UXB) +VEXT2XV(vext2xv_wu_hu, 32, UXW, UXH) +VEXT2XV(vext2xv_du_hu, 64, UXD, UXH) +VEXT2XV(vext2xv_du_wu, 64, UXD, UXW)

From patchwork Tue Jun 20 09:37:49 2023
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285455
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 21/46] target/loongarch: Implement xvsigncov
Date: Tue, 20 Jun 2023 17:37:49 +0800
Message-Id: <20230620093814.123650-22-gaosong@loongson.cn>
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

This patch includes:
- XVSIGNCOV.{B/H/W/D}.

Signed-off-by: Song Gao --- target/loongarch/disas.c | 5 +++ target/loongarch/helper.h | 5 +++ target/loongarch/insn_trans/trans_lasx.c.inc | 41 ++++++++++++++++++++ target/loongarch/insns.decode | 5 +++ target/loongarch/lasx_helper.c | 5 +++ target/loongarch/lsx_helper.c | 2 - target/loongarch/vec.h | 2 + 7 files changed, 63 insertions(+), 2 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 1897aa7ba1..d0ccf3e86c 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2009,6 +2009,11 @@ INSN_LASX(vext2xv_wu_hu, xx) INSN_LASX(vext2xv_du_hu, xx) INSN_LASX(vext2xv_du_wu, xx) +INSN_LASX(xvsigncov_b, xxx) +INSN_LASX(xvsigncov_h, xxx) +INSN_LASX(xvsigncov_w, xxx) +INSN_LASX(xvsigncov_d, xxx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 7a303ee3f1..53a33703b3 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -917,3 +917,8 @@ DEF_HELPER_3(vext2xv_du_bu, void, env, i32, i32) DEF_HELPER_3(vext2xv_wu_hu, void, env, i32, i32) DEF_HELPER_3(vext2xv_du_hu, void, env, i32, i32)
DEF_HELPER_3(vext2xv_du_wu, void, env, i32, i32) + +DEF_HELPER_FLAGS_4(xvsigncov_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsigncov_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsigncov_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvsigncov_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index c04469af75..9c24e82ac0 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -1866,6 +1866,47 @@ TRANS(vext2xv_wu_hu, gen_xx, gen_helper_vext2xv_wu_hu) TRANS(vext2xv_du_hu, gen_xx, gen_helper_vext2xv_du_hu) TRANS(vext2xv_du_wu, gen_xx, gen_helper_vext2xv_du_wu) +static void do_xvsigncov(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_neg_vec, INDEX_op_cmpsel_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vsigncov, + .fno = gen_helper_xvsigncov_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vsigncov, + .fno = gen_helper_xvsigncov_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vsigncov, + .fno = gen_helper_xvsigncov_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vsigncov, + .fno = gen_helper_xvsigncov_d, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvsigncov_b, gvec_xxx, MO_8, do_xvsigncov) +TRANS(xvsigncov_h, gvec_xxx, MO_16, do_xvsigncov) +TRANS(xvsigncov_w, gvec_xxx, MO_32, do_xvsigncov) +TRANS(xvsigncov_d, gvec_xxx, MO_64, do_xvsigncov) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 9f1cb04368..887d7f5a90 100644 --- 
a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1613,6 +1613,11 @@ vext2xv_wu_hu 0111 01101001 11110 01101 ..... ..... @xx vext2xv_du_hu 0111 01101001 11110 01110 ..... ..... @xx vext2xv_du_wu 0111 01101001 11110 01111 ..... ..... @xx +xvsigncov_b 0111 01010010 11100 ..... ..... ..... @xxx +xvsigncov_h 0111 01010010 11101 ..... ..... ..... @xxx +xvsigncov_w 0111 01010010 11110 ..... ..... ..... @xxx +xvsigncov_d 0111 01010010 11111 ..... ..... ..... @xxx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index ca82d03ff4..db7905fa4d 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -704,3 +704,8 @@ VEXT2XV(vext2xv_du_bu, 64, UXD, UXB) VEXT2XV(vext2xv_wu_hu, 32, UXW, UXH) VEXT2XV(vext2xv_du_hu, 64, UXD, UXH) VEXT2XV(vext2xv_du_wu, 64, UXD, UXW) + +XDO_3OP(xvsigncov_b, 8, XB, DO_SIGNCOV) +XDO_3OP(xvsigncov_h, 16, XH, DO_SIGNCOV) +XDO_3OP(xvsigncov_w, 32, XW, DO_SIGNCOV) +XDO_3OP(xvsigncov_d, 64, XD, DO_SIGNCOV) diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index 5aac0c9ef5..dadba47513 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ -648,8 +648,6 @@ VEXTH(vexth_hu_bu, 16, UH, UB) VEXTH(vexth_wu_hu, 32, UW, UH) VEXTH(vexth_du_wu, 64, UD, UW) -#define DO_SIGNCOV(a, b) (a == 0 ? 0 : a < 0 ? -b : b) - DO_3OP(vsigncov_b, 8, B, DO_SIGNCOV) DO_3OP(vsigncov_h, 16, H, DO_SIGNCOV) DO_3OP(vsigncov_w, 32, W, DO_SIGNCOV) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index c748957158..f6ad3f78dd 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -73,4 +73,6 @@ #define DO_REM(N, M) (unlikely(M == 0) ? 0 :\ unlikely((N == -N) && (M == (__typeof(N))(-1))) ? 0 : N % M) +#define DO_SIGNCOV(a, b) (a == 0 ? 0 : a < 0 ? 
-b : b) + #endif /* LOONGARCH_VEC_H */ From patchwork Tue Jun 20 09:37:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285472 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 98087EB64D7 for ; Tue, 20 Jun 2023 09:41:20 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXp4-0007Ai-Dm; Tue, 20 Jun 2023 05:38:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXp1-00079E-Hn for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:47 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXov-0006Mg-Jy for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:44 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxvOqTc5FktyUHAA--.14671S3; Tue, 20 Jun 2023 17:38:27 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S24; Tue, 20 Jun 2023 17:38:27 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 22/46] target/loongarch: Implement xvmskltz/xvmskgez/xvmsknz Date: Tue, 20 Jun 2023 17:37:50 +0800 Message-Id: <20230620093814.123650-23-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: 
AQAAf8BxduSGc5FkzIQhAA--.28394S24 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVMSKLTZ.{B/H/W/D}; - XVMSKGEZ.B; - XVMSKNZ.B. Signed-off-by: Song Gao --- target/loongarch/disas.c | 7 ++ target/loongarch/helper.h | 7 ++ target/loongarch/insn_trans/trans_lasx.c.inc | 7 ++ target/loongarch/insns.decode | 7 ++ target/loongarch/lasx_helper.c | 95 ++++++++++++++++++++ target/loongarch/lsx_helper.c | 10 +-- target/loongarch/vec.h | 6 ++ 7 files changed, 134 insertions(+), 5 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index d0ccf3e86c..5a3c14f33d 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2014,6 +2014,13 @@ INSN_LASX(xvsigncov_h, xxx) INSN_LASX(xvsigncov_w, xxx) INSN_LASX(xvsigncov_d, xxx) +INSN_LASX(xvmskltz_b, xx) +INSN_LASX(xvmskltz_h, xx) +INSN_LASX(xvmskltz_w, xx) +INSN_LASX(xvmskltz_d, xx) +INSN_LASX(xvmskgez_b, xx) +INSN_LASX(xvmsknz_b, xx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 53a33703b3..b7ba78ee06 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ 
-922,3 +922,10 @@ DEF_HELPER_FLAGS_4(xvsigncov_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvsigncov_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvsigncov_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(xvsigncov_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_3(xvmskltz_b, void, env, i32, i32) +DEF_HELPER_3(xvmskltz_h, void, env, i32, i32) +DEF_HELPER_3(xvmskltz_w, void, env, i32, i32) +DEF_HELPER_3(xvmskltz_d, void, env, i32, i32) +DEF_HELPER_3(xvmskgez_b, void, env, i32, i32) +DEF_HELPER_3(xvmsknz_b, void, env, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 9c24e82ac0..b0aad21a9d 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -1907,6 +1907,13 @@ TRANS(xvsigncov_h, gvec_xxx, MO_16, do_xvsigncov) TRANS(xvsigncov_w, gvec_xxx, MO_32, do_xvsigncov) TRANS(xvsigncov_d, gvec_xxx, MO_64, do_xvsigncov) +TRANS(xvmskltz_b, gen_xx, gen_helper_xvmskltz_b) +TRANS(xvmskltz_h, gen_xx, gen_helper_xvmskltz_h) +TRANS(xvmskltz_w, gen_xx, gen_helper_xvmskltz_w) +TRANS(xvmskltz_d, gen_xx, gen_helper_xvmskltz_d) +TRANS(xvmskgez_b, gen_xx, gen_helper_xvmskgez_b) +TRANS(xvmsknz_b, gen_xx, gen_helper_xvmsknz_b) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 887d7f5a90..b792a68fdf 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1618,6 +1618,13 @@ xvsigncov_h 0111 01010010 11101 ..... ..... ..... @xxx xvsigncov_w 0111 01010010 11110 ..... ..... ..... @xxx xvsigncov_d 0111 01010010 11111 ..... ..... ..... @xxx +xvmskltz_b 0111 01101001 11000 10000 ..... ..... @xx +xvmskltz_h 0111 01101001 11000 10001 ..... ..... @xx +xvmskltz_w 0111 01101001 11000 10010 ..... ..... 
@xx +xvmskltz_d 0111 01101001 11000 10011 ..... ..... @xx +xvmskgez_b 0111 01101001 11000 10100 ..... ..... @xx +xvmsknz_b 0111 01101001 11000 11000 ..... ..... @xx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index db7905fa4d..6aec554645 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -709,3 +709,98 @@ XDO_3OP(xvsigncov_b, 8, XB, DO_SIGNCOV) XDO_3OP(xvsigncov_h, 16, XH, DO_SIGNCOV) XDO_3OP(xvsigncov_w, 32, XW, DO_SIGNCOV) XDO_3OP(xvsigncov_d, 64, XD, DO_SIGNCOV) + +void HELPER(xvmskltz_b)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + uint16_t temp; + int i; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + for (i = 0; i < 2; i++) { + temp = 0; + temp = do_vmskltz_b(Xj->XD(2 * i)); + temp |= (do_vmskltz_b(Xj->XD(2 * i + 1)) << 8); + Xd->XD(2 * i) = temp; + Xd->XD(2 * i + 1) = 0; + } +} + +void HELPER(xvmskltz_h)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + uint16_t temp; + int i; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + for (i = 0; i < 2; i++) { + temp = 0; + temp = do_vmskltz_h(Xj->XD(2 * i)); + temp |= (do_vmskltz_h(Xj->XD(2 * i + 1)) << 4); + Xd->XD(2 * i) = temp; + Xd->XD(2 * i + 1) = 0; + } +} + +void HELPER(xvmskltz_w)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + uint16_t temp; + int i; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + for (i = 0; i < 2; i++) { + temp = do_vmskltz_w(Xj->XD(2 * i)); + temp |= (do_vmskltz_w(Xj->XD(2 * i + 1)) << 2); + Xd->XD(2 * i) = temp; + Xd->XD(2 * i + 1) = 0; + } +} + +void HELPER(xvmskltz_d)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + uint16_t temp; + int i; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + for (i = 0; i < 2; i++) { + temp = 0; + 
temp = do_vmskltz_d(Xj->XD(2 * i)); + temp |= (do_vmskltz_d(Xj->XD(2 * i + 1)) << 1); + Xd->XD(2 * i) = temp; + Xd->XD(2 * i + 1) = 0; + } +} + +void HELPER(xvmskgez_b)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + uint16_t temp; + int i; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + for (i = 0; i < 2; i++) { + temp = 0; + temp = do_vmskltz_b(Xj->XD(2 * i)); + temp |= (do_vmskltz_b(Xj->XD(2 * i + 1)) << 8); + Xd->XD(2 * i) = (uint16_t)(~temp); + Xd->XD(2 * i + 1) = 0; + } +} + +void HELPER(xvmsknz_b)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + uint16_t temp; + int i; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + for (i = 0; i < 2; i++) { + temp = 0; + temp = do_vmskez_b(Xj->XD(2 * i)); + temp |= (do_vmskez_b(Xj->XD(2 * i + 1)) << 8); + Xd->XD(2 * i) = (uint16_t)(~temp); + Xd->XD(2 * i + 1) = 0; + } +} diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index dadba47513..e64155f38c 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ -653,7 +653,7 @@ DO_3OP(vsigncov_h, 16, H, DO_SIGNCOV) DO_3OP(vsigncov_w, 32, W, DO_SIGNCOV) DO_3OP(vsigncov_d, 64, D, DO_SIGNCOV) -static uint64_t do_vmskltz_b(int64_t val) +uint64_t do_vmskltz_b(int64_t val) { uint64_t m = 0x8080808080808080ULL; uint64_t c = val & m; @@ -675,7 +675,7 @@ void HELPER(vmskltz_b)(CPULoongArchState *env, uint32_t vd, uint32_t vj) Vd->D(1) = 0; } -static uint64_t do_vmskltz_h(int64_t val) +uint64_t do_vmskltz_h(int64_t val) { uint64_t m = 0x8000800080008000ULL; uint64_t c = val & m; @@ -696,7 +696,7 @@ void HELPER(vmskltz_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) Vd->D(1) = 0; } -static uint64_t do_vmskltz_w(int64_t val) +uint64_t do_vmskltz_w(int64_t val) { uint64_t m = 0x8000000080000000ULL; uint64_t c = val & m; @@ -716,7 +716,7 @@ void HELPER(vmskltz_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) Vd->D(1) = 0; } -static uint64_t do_vmskltz_d(int64_t val) 
+uint64_t do_vmskltz_d(int64_t val) { return (uint64_t)val >> 63; } @@ -744,7 +744,7 @@ void HELPER(vmskgez_b)(CPULoongArchState *env, uint32_t vd, uint32_t vj) Vd->D(1) = 0; } -static uint64_t do_vmskez_b(uint64_t a) +uint64_t do_vmskez_b(uint64_t a) { uint64_t m = 0x7f7f7f7f7f7f7f7fULL; uint64_t c = ~(((a & m) + m) | a | m); diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index f6ad3f78dd..d5a880b3fd 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -75,4 +75,10 @@ #define DO_SIGNCOV(a, b) (a == 0 ? 0 : a < 0 ? -b : b) +uint64_t do_vmskltz_b(int64_t val); +uint64_t do_vmskltz_h(int64_t val); +uint64_t do_vmskltz_w(int64_t val); +uint64_t do_vmskltz_d(int64_t val); +uint64_t do_vmskez_b(uint64_t val); + #endif /* LOONGARCH_VEC_H */ From patchwork Tue Jun 20 09:37:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285490 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 73E71EB64D8 for ; Tue, 20 Jun 2023 09:44:35 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXp3-0007A5-AL; Tue, 20 Jun 2023 05:38:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXp1-00079D-Hb for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:47 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXov-0006Mk-J0 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:44 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway 
(Coremail) with SMTP id _____8Cxc+iUc5FkuSUHAA--.594S3; Tue, 20 Jun 2023 17:38:28 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S25; Tue, 20 Jun 2023 17:38:27 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 23/46] target/loongarch: Implement xvldi Date: Tue, 20 Jun 2023 17:37:51 +0800 Message-Id: <20230620093814.123650-24-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S25 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVLDI.
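For reviewers, a minimal sketch of how the 13-bit XVLDI immediate is split when imm[12] is clear, mirroring the arithmetic in trans_xvldi() in this patch. The type XvldiSimple and the function xvldi_decode_simple() are hypothetical names used only for illustration and are not part of QEMU. The imm[12] == 1 case is handled by vldi_get_value(), which is not shown in this patch, so it is deliberately left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not a QEMU structure. */
typedef struct {
    int handled;    /* 0 when imm[12] is set (that path needs vldi_get_value) */
    unsigned vece;  /* log2 of the element size in bytes, taken from imm[11:10] */
    int64_t value;  /* imm[9:0] sign-extended to 64 bits */
} XvldiSimple;

/* Decode the imm[12] == 0 form of the XVLDI immediate (illustrative only). */
static XvldiSimple xvldi_decode_simple(uint32_t imm13)
{
    XvldiSimple r = { 0, 0, 0 };

    /* The decode format (@x_i13) only ever supplies 13 bits. */
    assert(imm13 < (1u << 13));

    if ((imm13 >> 12) & 0x1) {
        /* Selector bit set: pattern expansion is out of scope for this sketch. */
        return r;
    }

    r.handled = 1;
    r.vece = (imm13 >> 10) & 0x3;
    /* Portable sign extension of the low 10 bits (same result as << 22 >> 22). */
    r.value = (int64_t)((imm13 & 0x3ff) ^ 0x200) - 0x200;
    return r;
}
```

The XOR/subtract form is used instead of shifting a signed int so that the sketch does not rely on -fwrapv behaviour; the decoded value is the same as in the translator.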
Signed-off-by: Song Gao --- target/loongarch/disas.c | 7 +++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 21 ++++++++++++++++++++ target/loongarch/insns.decode | 5 ++++- 3 files changed, 32 insertions(+), 1 deletion(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 5a3c14f33d..82a9826eb7 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1703,6 +1703,11 @@ static bool trans_##insn(DisasContext *ctx, arg_##type * a) \ return true; \ } +static void output_x_i(DisasContext *ctx, arg_x_i *a, const char *mnemonic) +{ + output(ctx, mnemonic, "x%d, 0x%x", a->xd, a->imm); +} + static void output_xxx(DisasContext *ctx, arg_xxx * a, const char *mnemonic) { output(ctx, mnemonic, "x%d, x%d, x%d", a->xd, a->xj, a->xk); @@ -2021,6 +2026,8 @@ INSN_LASX(xvmskltz_d, xx) INSN_LASX(xvmskgez_b, xx) INSN_LASX(xvmsknz_b, xx) +INSN_LASX(xvldi, x_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index b0aad21a9d..bf277e1fd9 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -1914,6 +1914,27 @@ TRANS(xvmskltz_d, gen_xx, gen_helper_xvmskltz_d) TRANS(xvmskgez_b, gen_xx, gen_helper_xvmskgez_b) TRANS(xvmsknz_b, gen_xx, gen_helper_xvmsknz_b) +static bool trans_xvldi(DisasContext *ctx, arg_xvldi * a) +{ + int sel, vece; + uint64_t value; + CHECK_ASXE; + + sel = (a->imm >> 12) & 0x1; + + if (sel) { + value = vldi_get_value(ctx, a->imm); + vece = MO_64; + } else { + value = ((int32_t)(a->imm << 22)) >> 22; + vece = (a->imm >> 10) & 0x3; + } + + tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->xd), 32, ctx->vl / 8, + tcg_constant_i64(value)); + return true; +} + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 
b792a68fdf..fbd0dd229a 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1305,11 +1305,13 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 &xxx xd xj xk &xr xd rj &xx_i xd xj imm +&x_i xd imm # # LASX Formats # +@x_i13 .... ........ .. imm:13 xd:5 &x_i @xx .... ........ ..... ..... xj:5 xd:5 &xx @xxx .... ........ ..... xk:5 xj:5 xd:5 &xxx @xr .... ........ ..... ..... rj:5 xd:5 &xr @@ -1319,7 +1321,6 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 @xx_ui5 .... ........ ..... imm:5 xj:5 xd:5 &xx_i @xx_ui6 .... ........ .... imm:6 xj:5 xd:5 &xx_i - xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx xvadd_h 0111 01000000 10101 ..... ..... ..... @xxx xvadd_w 0111 01000000 10110 ..... ..... ..... @xxx @@ -1625,6 +1626,8 @@ xvmskltz_d 0111 01101001 11000 10011 ..... ..... @xx xvmskgez_b 0111 01101001 11000 10100 ..... ..... @xx xvmsknz_b 0111 01101001 11000 11000 ..... ..... @xx +xvldi 0111 01111110 00 ............. ..... @x_i13 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr From patchwork Tue Jun 20 09:37:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285450 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A540FEB64D8 for ; Tue, 20 Jun 2023 09:40:16 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXp2-00079n-NA; Tue, 20 Jun 2023 05:38:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXp1-00079F-Ij for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:47 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXov-0006Mp-RK for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:44 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxNumUc5FkuyUHAA--.12676S3; Tue, 20 Jun 2023 17:38:28 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S26; Tue, 20 Jun 2023 17:38:28 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 24/46] target/loongarch: Implement LASX logic instructions Date: Tue, 20 Jun 2023 17:37:52 +0800 Message-Id: <20230620093814.123650-25-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S26 X-CM-SenderInfo: 
5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XV{AND/OR/XOR/NOR/ANDN/ORN}.V; - XV{AND/OR/XOR/NOR}I.B. Signed-off-by: Song Gao --- target/loongarch/disas.c | 12 ++++++ target/loongarch/helper.h | 2 + target/loongarch/insn_trans/trans_lasx.c.inc | 42 ++++++++++++++++++++ target/loongarch/insns.decode | 13 ++++++ target/loongarch/lasx_helper.c | 11 +++++ 5 files changed, 80 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 82a9826eb7..2f1da9db80 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2028,6 +2028,18 @@ INSN_LASX(xvmsknz_b, xx) INSN_LASX(xvldi, x_i) +INSN_LASX(xvand_v, xxx) +INSN_LASX(xvor_v, xxx) +INSN_LASX(xvxor_v, xxx) +INSN_LASX(xvnor_v, xxx) +INSN_LASX(xvandn_v, xxx) +INSN_LASX(xvorn_v, xxx) + +INSN_LASX(xvandi_b, xx_i) +INSN_LASX(xvori_b, xx_i) +INSN_LASX(xvxori_b, xx_i) +INSN_LASX(xvnori_b, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index b7ba78ee06..4e0a900318 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -929,3 +929,5 @@ DEF_HELPER_3(xvmskltz_w, void, env, i32, 
i32) DEF_HELPER_3(xvmskltz_d, void, env, i32, i32) DEF_HELPER_3(xvmskgez_b, void, env, i32, i32) DEF_HELPER_3(xvmsknz_b, void, env, i32, i32) + +DEF_HELPER_FLAGS_4(xvnori_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index bf277e1fd9..d48f76f118 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -1935,6 +1935,48 @@ static bool trans_xvldi(DisasContext *ctx, arg_xvldi * a) return true; } +TRANS(xvand_v, gvec_xxx, MO_64, tcg_gen_gvec_and) +TRANS(xvor_v, gvec_xxx, MO_64, tcg_gen_gvec_or) +TRANS(xvxor_v, gvec_xxx, MO_64, tcg_gen_gvec_xor) +TRANS(xvnor_v, gvec_xxx, MO_64, tcg_gen_gvec_nor) + +static bool trans_xvandn_v(DisasContext *ctx, arg_xxx * a) +{ + uint32_t xd_ofs, xj_ofs, xk_ofs; + + CHECK_ASXE; + + xd_ofs = vec_full_offset(a->xd); + xj_ofs = vec_full_offset(a->xj); + xk_ofs = vec_full_offset(a->xk); + + tcg_gen_gvec_andc(MO_64, xd_ofs, xk_ofs, xj_ofs, 32, ctx->vl / 8); + return true; +} +TRANS(xvorn_v, gvec_xxx, MO_64, tcg_gen_gvec_orc) +TRANS(xvandi_b, gvec_xx_i, MO_8, tcg_gen_gvec_andi) +TRANS(xvori_b, gvec_xx_i, MO_8, tcg_gen_gvec_ori) +TRANS(xvxori_b, gvec_xx_i, MO_8, tcg_gen_gvec_xori) + +static void do_xvnori_b(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + int64_t imm, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_nor_vec, 0 + }; + static const GVecGen2i op = { + .fni8 = gen_vnori_b, + .fniv = gen_vnori, + .fnoi = gen_helper_xvnori_b, + .opt_opc = vecop_list, + .vece = MO_8 + }; + + tcg_gen_gvec_2i(xd_ofs, xj_ofs, oprsz, maxsz, imm, &op); +} + +TRANS(xvnori_b, gvec_xx_i, MO_8, do_xvnori_b) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index fbd0dd229a..ce2ad47b88 100644 --- a/target/loongarch/insns.decode +++ 
b/target/loongarch/insns.decode @@ -1320,6 +1320,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 @xx_ui4 .... ........ ..... . imm:4 xj:5 xd:5 &xx_i @xx_ui5 .... ........ ..... imm:5 xj:5 xd:5 &xx_i @xx_ui6 .... ........ .... imm:6 xj:5 xd:5 &xx_i +@xx_ui8 .... ........ .. imm:8 xj:5 xd:5 &xx_i xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx xvadd_h 0111 01000000 10101 ..... ..... ..... @xxx @@ -1628,6 +1629,18 @@ xvmsknz_b 0111 01101001 11000 11000 ..... ..... @xx xvldi 0111 01111110 00 ............. ..... @x_i13 +xvand_v 0111 01010010 01100 ..... ..... ..... @xxx +xvor_v 0111 01010010 01101 ..... ..... ..... @xxx +xvxor_v 0111 01010010 01110 ..... ..... ..... @xxx +xvnor_v 0111 01010010 01111 ..... ..... ..... @xxx +xvandn_v 0111 01010010 10000 ..... ..... ..... @xxx +xvorn_v 0111 01010010 10001 ..... ..... ..... @xxx + +xvandi_b 0111 01111101 00 ........ ..... ..... @xx_ui8 +xvori_b 0111 01111101 01 ........ ..... ..... @xx_ui8 +xvxori_b 0111 01111101 10 ........ ..... ..... @xx_ui8 +xvnori_b 0111 01111101 11 ........ ..... ..... @xx_ui8 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 6aec554645..8e8860c1bb 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -804,3 +804,14 @@ void HELPER(xvmsknz_b)(CPULoongArchState *env, uint32_t xd, uint32_t xj) Xd->XD(2 * i + 1) = 0; } } + +void HELPER(xvnori_b)(void *xd, void *xj, uint64_t imm, uint32_t v) +{ + int i; + XReg *Xd = (XReg *)xd; + XReg *Xj = (XReg *)xj; + + for (i = 0; i < LASX_LEN / 8; i++) { + Xd->XB(i) = ~(Xj->XB(i) | (uint8_t)imm); + } +} From patchwork Tue Jun 20 09:37:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285491 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D74C8EB64D7 for ; Tue, 20 Jun 2023 09:44:35 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXp7-0007C8-70; Tue, 20 Jun 2023 05:38:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXp3-00079u-0d for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:49 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXox-0006N5-0j for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:48 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cx_eqVc5FkviUHAA--.14882S3; Tue, 20 Jun 2023 17:38:29 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id 
AQAAf8BxduSGc5FkzIQhAA--.28394S27; Tue, 20 Jun 2023 17:38:28 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 25/46] target/loongarch: Implement xvsll xvsrl xvsra xvrotr Date: Tue, 20 Jun 2023 17:37:53 +0800 Message-Id: <20230620093814.123650-26-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S27 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVSLL[I].{B/H/W/D}; - XVSRL[I].{B/H/W/D}; - XVSRA[I].{B/H/W/D}; - XVROTR[I].{B/H/W/D}. 
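As a reading aid, a small reference model of the byte variant XVROTR.B over a 256-bit register. It assumes the per-element rotate count is taken modulo the element width (8 for bytes), which matches the ui3/ui4/ui5/ui6 immediate widths used by the immediate forms but is not stated explicitly in this patch. The names rotr8() and xvrotr_b_ref() are hypothetical and do not exist in QEMU; the real translation simply maps to tcg_gen_gvec_rotrv.

```c
#include <stdint.h>

#define LASX_BYTES 32  /* 256-bit register viewed as 32 byte lanes */

/* Rotate one 8-bit element right; the count is reduced modulo 8 (assumption). */
static uint8_t rotr8(uint8_t x, unsigned n)
{
    n &= 7;
    return (uint8_t)((x >> n) | (x << ((8 - n) & 7)));
}

/* Lane-wise reference: d[i] = rotr8(j[i], k[i]) for every byte lane. */
static void xvrotr_b_ref(uint8_t d[LASX_BYTES],
                         const uint8_t j[LASX_BYTES],
                         const uint8_t k[LASX_BYTES])
{
    for (int i = 0; i < LASX_BYTES; i++) {
        d[i] = rotr8(j[i], k[i]);
    }
}
```

A model like this can be compared against the emulated instruction in a test program; the wider element sizes would follow the same pattern with masks of 15, 31 and 63.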
Signed-off-by: Song Gao --- target/loongarch/disas.c | 36 ++++++++++++++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 36 ++++++++++++++++++++ target/loongarch/insns.decode | 33 ++++++++++++++++++ 3 files changed, 105 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 2f1da9db80..0c1c7a7e6e 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2040,6 +2040,42 @@ INSN_LASX(xvori_b, xx_i) INSN_LASX(xvxori_b, xx_i) INSN_LASX(xvnori_b, xx_i) +INSN_LASX(xvsll_b, xxx) +INSN_LASX(xvsll_h, xxx) +INSN_LASX(xvsll_w, xxx) +INSN_LASX(xvsll_d, xxx) +INSN_LASX(xvslli_b, xx_i) +INSN_LASX(xvslli_h, xx_i) +INSN_LASX(xvslli_w, xx_i) +INSN_LASX(xvslli_d, xx_i) + +INSN_LASX(xvsrl_b, xxx) +INSN_LASX(xvsrl_h, xxx) +INSN_LASX(xvsrl_w, xxx) +INSN_LASX(xvsrl_d, xxx) +INSN_LASX(xvsrli_b, xx_i) +INSN_LASX(xvsrli_h, xx_i) +INSN_LASX(xvsrli_w, xx_i) +INSN_LASX(xvsrli_d, xx_i) + +INSN_LASX(xvsra_b, xxx) +INSN_LASX(xvsra_h, xxx) +INSN_LASX(xvsra_w, xxx) +INSN_LASX(xvsra_d, xxx) +INSN_LASX(xvsrai_b, xx_i) +INSN_LASX(xvsrai_h, xx_i) +INSN_LASX(xvsrai_w, xx_i) +INSN_LASX(xvsrai_d, xx_i) + +INSN_LASX(xvrotr_b, xxx) +INSN_LASX(xvrotr_h, xxx) +INSN_LASX(xvrotr_w, xxx) +INSN_LASX(xvrotr_d, xxx) +INSN_LASX(xvrotri_b, xx_i) +INSN_LASX(xvrotri_h, xx_i) +INSN_LASX(xvrotri_w, xx_i) +INSN_LASX(xvrotri_d, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index d48f76f118..5d7deb312e 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -1977,6 +1977,42 @@ static void do_xvnori_b(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, TRANS(xvnori_b, gvec_xx_i, MO_8, do_xvnori_b) +TRANS(xvsll_b, gvec_xxx, MO_8, tcg_gen_gvec_shlv) +TRANS(xvsll_h, gvec_xxx, MO_16, tcg_gen_gvec_shlv) +TRANS(xvsll_w, gvec_xxx, MO_32, tcg_gen_gvec_shlv) +TRANS(xvsll_d, 
gvec_xxx, MO_64, tcg_gen_gvec_shlv) +TRANS(xvslli_b, gvec_xx_i, MO_8, tcg_gen_gvec_shli) +TRANS(xvslli_h, gvec_xx_i, MO_16, tcg_gen_gvec_shli) +TRANS(xvslli_w, gvec_xx_i, MO_32, tcg_gen_gvec_shli) +TRANS(xvslli_d, gvec_xx_i, MO_64, tcg_gen_gvec_shli) + +TRANS(xvsrl_b, gvec_xxx, MO_8, tcg_gen_gvec_shrv) +TRANS(xvsrl_h, gvec_xxx, MO_16, tcg_gen_gvec_shrv) +TRANS(xvsrl_w, gvec_xxx, MO_32, tcg_gen_gvec_shrv) +TRANS(xvsrl_d, gvec_xxx, MO_64, tcg_gen_gvec_shrv) +TRANS(xvsrli_b, gvec_xx_i, MO_8, tcg_gen_gvec_shri) +TRANS(xvsrli_h, gvec_xx_i, MO_16, tcg_gen_gvec_shri) +TRANS(xvsrli_w, gvec_xx_i, MO_32, tcg_gen_gvec_shri) +TRANS(xvsrli_d, gvec_xx_i, MO_64, tcg_gen_gvec_shri) + +TRANS(xvsra_b, gvec_xxx, MO_8, tcg_gen_gvec_sarv) +TRANS(xvsra_h, gvec_xxx, MO_16, tcg_gen_gvec_sarv) +TRANS(xvsra_w, gvec_xxx, MO_32, tcg_gen_gvec_sarv) +TRANS(xvsra_d, gvec_xxx, MO_64, tcg_gen_gvec_sarv) +TRANS(xvsrai_b, gvec_xx_i, MO_8, tcg_gen_gvec_sari) +TRANS(xvsrai_h, gvec_xx_i, MO_16, tcg_gen_gvec_sari) +TRANS(xvsrai_w, gvec_xx_i, MO_32, tcg_gen_gvec_sari) +TRANS(xvsrai_d, gvec_xx_i, MO_64, tcg_gen_gvec_sari) + +TRANS(xvrotr_b, gvec_xxx, MO_8, tcg_gen_gvec_rotrv) +TRANS(xvrotr_h, gvec_xxx, MO_16, tcg_gen_gvec_rotrv) +TRANS(xvrotr_w, gvec_xxx, MO_32, tcg_gen_gvec_rotrv) +TRANS(xvrotr_d, gvec_xxx, MO_64, tcg_gen_gvec_rotrv) +TRANS(xvrotri_b, gvec_xx_i, MO_8, tcg_gen_gvec_rotri) +TRANS(xvrotri_h, gvec_xx_i, MO_16, tcg_gen_gvec_rotri) +TRANS(xvrotri_w, gvec_xx_i, MO_32, tcg_gen_gvec_rotri) +TRANS(xvrotri_d, gvec_xx_i, MO_64, tcg_gen_gvec_rotri) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index ce2ad47b88..03c3aa0019 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1641,6 +1641,39 @@ xvori_b 0111 01111101 01 ........ ..... ..... @xx_ui8 xvxori_b 0111 01111101 10 ........ ..... ..... @xx_ui8 xvnori_b 0111 01111101 11 ........ 
..... ..... @xx_ui8 +xvsll_b 0111 01001110 10000 ..... ..... ..... @xxx +xvsll_h 0111 01001110 10001 ..... ..... ..... @xxx +xvsll_w 0111 01001110 10010 ..... ..... ..... @xxx +xvsll_d 0111 01001110 10011 ..... ..... ..... @xxx +xvslli_b 0111 01110010 11000 01 ... ..... ..... @xx_ui3 +xvslli_h 0111 01110010 11000 1 .... ..... ..... @xx_ui4 +xvslli_w 0111 01110010 11001 ..... ..... ..... @xx_ui5 +xvslli_d 0111 01110010 1101 ...... ..... ..... @xx_ui6 +xvsrl_b 0111 01001110 10100 ..... ..... ..... @xxx +xvsrl_h 0111 01001110 10101 ..... ..... ..... @xxx +xvsrl_w 0111 01001110 10110 ..... ..... ..... @xxx +xvsrl_d 0111 01001110 10111 ..... ..... ..... @xxx +xvsrli_b 0111 01110011 00000 01 ... ..... ..... @xx_ui3 +xvsrli_h 0111 01110011 00000 1 .... ..... ..... @xx_ui4 +xvsrli_w 0111 01110011 00001 ..... ..... ..... @xx_ui5 +xvsrli_d 0111 01110011 0001 ...... ..... ..... @xx_ui6 +xvsra_b 0111 01001110 11000 ..... ..... ..... @xxx +xvsra_h 0111 01001110 11001 ..... ..... ..... @xxx +xvsra_w 0111 01001110 11010 ..... ..... ..... @xxx +xvsra_d 0111 01001110 11011 ..... ..... ..... @xxx +xvsrai_b 0111 01110011 01000 01 ... ..... ..... @xx_ui3 +xvsrai_h 0111 01110011 01000 1 .... ..... ..... @xx_ui4 +xvsrai_w 0111 01110011 01001 ..... ..... ..... @xx_ui5 +xvsrai_d 0111 01110011 0101 ...... ..... ..... @xx_ui6 +xvrotr_b 0111 01001110 11100 ..... ..... ..... @xxx +xvrotr_h 0111 01001110 11101 ..... ..... ..... @xxx +xvrotr_w 0111 01001110 11110 ..... ..... ..... @xxx +xvrotr_d 0111 01001110 11111 ..... ..... ..... @xxx +xvrotri_b 0111 01101010 00000 01 ... ..... ..... @xx_ui3 +xvrotri_h 0111 01101010 00000 1 .... ..... ..... @xx_ui4 +xvrotri_w 0111 01101010 00001 ..... ..... ..... @xx_ui5 +xvrotri_d 0111 01101010 0001 ...... ..... ..... @xx_ui6 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr From patchwork Tue Jun 20 09:37:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285484 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7D2AEEB64D8 for ; Tue, 20 Jun 2023 09:43:27 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXp3-0007AB-Cg; Tue, 20 Jun 2023 05:38:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXp1-00079C-Gm for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:47 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXow-0006NB-En for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:44 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8DxSuqWc5FkwCUHAA--.14508S3; Tue, 20 Jun 2023 17:38:30 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S28; Tue, 20 Jun 2023 17:38:29 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 26/46] target/loongarch: Implement xvsllwil xvextl Date: Tue, 20 Jun 2023 17:37:54 +0800 Message-Id: <20230620093814.123650-27-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S28 X-CM-SenderInfo: 
5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVSLLWIL.{H.B/W.H/D.W}; - XVSLLWIL.{HU.BU/WU.HU/DU.WU}; - XVEXTL.Q.D, XVEXTL.QU.DU. Signed-off-by: Song Gao --- target/loongarch/disas.c | 9 ++++ target/loongarch/helper.h | 9 ++++ target/loongarch/insn_trans/trans_lasx.c.inc | 21 +++++++++ target/loongarch/insns.decode | 9 ++++ target/loongarch/lasx_helper.c | 45 ++++++++++++++++++++ 5 files changed, 93 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 0c1c7a7e6e..b6940e6389 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2076,6 +2076,15 @@ INSN_LASX(xvrotri_h, xx_i) INSN_LASX(xvrotri_w, xx_i) INSN_LASX(xvrotri_d, xx_i) +INSN_LASX(xvsllwil_h_b, xx_i) +INSN_LASX(xvsllwil_w_h, xx_i) +INSN_LASX(xvsllwil_d_w, xx_i) +INSN_LASX(xvextl_q_d, xx) +INSN_LASX(xvsllwil_hu_bu, xx_i) +INSN_LASX(xvsllwil_wu_hu, xx_i) +INSN_LASX(xvsllwil_du_wu, xx_i) +INSN_LASX(xvextl_qu_du, xx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 4e0a900318..672a5f8988 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -931,3
+931,12 @@ DEF_HELPER_3(xvmskgez_b, void, env, i32, i32) DEF_HELPER_3(xvmsknz_b, void, env, i32, i32) DEF_HELPER_FLAGS_4(xvnori_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_4(xvsllwil_h_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvsllwil_w_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsllwil_d_w, void, env, i32, i32, i32) +DEF_HELPER_3(xvextl_q_d, void, env, i32, i32) +DEF_HELPER_4(xvsllwil_hu_bu, void, env, i32, i32, i32) +DEF_HELPER_4(xvsllwil_wu_hu, void, env, i32, i32, i32) +DEF_HELPER_4(xvsllwil_du_wu, void, env, i32, i32, i32) +DEF_HELPER_3(xvextl_qu_du, void, env, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 5d7deb312e..53631cea63 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -39,6 +39,18 @@ static bool gen_xx(DisasContext *ctx, arg_xx *a, return true; } +static bool gen_xx_i(DisasContext *ctx, arg_xx_i *a, + void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) +{ + TCGv_i32 xd = tcg_constant_i32(a->xd); + TCGv_i32 xj = tcg_constant_i32(a->xj); + TCGv_i32 imm = tcg_constant_i32(a->imm); + + CHECK_SXE; + func(cpu_env, xd, xj, imm); + return true; +} + static bool gvec_xxx(DisasContext *ctx, arg_xxx *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t)) @@ -2013,6 +2025,15 @@ TRANS(xvrotri_h, gvec_xx_i, MO_16, tcg_gen_gvec_rotri) TRANS(xvrotri_w, gvec_xx_i, MO_32, tcg_gen_gvec_rotri) TRANS(xvrotri_d, gvec_xx_i, MO_64, tcg_gen_gvec_rotri) +TRANS(xvsllwil_h_b, gen_xx_i, gen_helper_xvsllwil_h_b) +TRANS(xvsllwil_w_h, gen_xx_i, gen_helper_xvsllwil_w_h) +TRANS(xvsllwil_d_w, gen_xx_i, gen_helper_xvsllwil_d_w) +TRANS(xvextl_q_d, gen_xx, gen_helper_xvextl_q_d) +TRANS(xvsllwil_hu_bu, gen_xx_i, gen_helper_xvsllwil_hu_bu) +TRANS(xvsllwil_wu_hu, gen_xx_i, gen_helper_xvsllwil_wu_hu) +TRANS(xvsllwil_du_wu, gen_xx_i, gen_helper_xvsllwil_du_wu) +TRANS(xvextl_qu_du, gen_xx, 
gen_helper_xvextl_qu_du) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 03c3aa0019..ebaddb94ea 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1674,6 +1674,15 @@ xvrotri_h 0111 01101010 00000 1 .... ..... ..... @xx_ui4 xvrotri_w 0111 01101010 00001 ..... ..... ..... @xx_ui5 xvrotri_d 0111 01101010 0001 ...... ..... ..... @xx_ui6 +xvsllwil_h_b 0111 01110000 10000 01 ... ..... ..... @xx_ui3 +xvsllwil_w_h 0111 01110000 10000 1 .... ..... ..... @xx_ui4 +xvsllwil_d_w 0111 01110000 10001 ..... ..... ..... @xx_ui5 +xvextl_q_d 0111 01110000 10010 00000 ..... ..... @xx +xvsllwil_hu_bu 0111 01110000 11000 01 ... ..... ..... @xx_ui3 +xvsllwil_wu_hu 0111 01110000 11000 1 .... ..... ..... @xx_ui4 +xvsllwil_du_wu 0111 01110000 11001 ..... ..... ..... @xx_ui5 +xvextl_qu_du 0111 01110000 11010 00000 ..... ..... @xx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 8e8860c1bb..cd0e18ac3c 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -815,3 +815,48 @@ void HELPER(xvnori_b)(void *xd, void *xj, uint64_t imm, uint32_t v) Xd->XB(i) = ~(Xj->XB(i) | (uint8_t)imm); } } + +#define XVSLLWIL(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + typedef __typeof(temp.E1(0)) TD; \ + \ + temp.XQ(0) = int128_zero(); \ + temp.XQ(1) = int128_zero(); \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = (TD)Xj->E2(i) << (imm % BIT); \ + temp.E1(i + max) = (TD)Xj->E2(i + max * 2) << (imm % BIT); \ + } \ + *Xd = temp; \ +} + +void HELPER(xvextl_q_d)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + Xd->XQ(0) = int128_makes64(Xj->XD(0)); + Xd->XQ(1) = int128_makes64(Xj->XD(2)); +} + +void HELPER(xvextl_qu_du)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + Xd->XQ(0) = int128_make64(Xj->UXD(0)); + Xd->XQ(1) = int128_make64(Xj->UXD(2)); +} + +XVSLLWIL(xvsllwil_h_b, 16, XH, XB) +XVSLLWIL(xvsllwil_w_h, 32, XW, XH) +XVSLLWIL(xvsllwil_d_w, 64, XD, XW) +XVSLLWIL(xvsllwil_hu_bu, 16, UXH, UXB) +XVSLLWIL(xvsllwil_wu_hu, 32, UXW, UXH) +XVSLLWIL(xvsllwil_du_wu, 64, UXD, UXW) From patchwork Tue Jun 20 09:37:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285493 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 
(256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8E60AEB64DB for ; Tue, 20 Jun 2023 09:44:45 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXp5-0007B8-EP; Tue, 20 Jun 2023 05:38:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXp3-00079w-4J for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:49 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXoy-0006NV-8N for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:48 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxtOiWc5FkwiUHAA--.4340S3; Tue, 20 Jun 2023 17:38:30 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S29; Tue, 20 Jun 2023 17:38:30 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 27/46] target/loongarch: Implement xvsrlr xvsrar Date: Tue, 20 Jun 2023 17:37:55 +0800 Message-Id: <20230620093814.123650-28-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S29 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, 
T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVSRLR[I].{B/H/W/D}; - XVSRAR[I].{B/H/W/D}. Signed-off-by: Song Gao --- target/loongarch/disas.c | 18 ++++ target/loongarch/helper.h | 18 ++++ target/loongarch/insn_trans/trans_lasx.c.inc | 18 ++++ target/loongarch/insns.decode | 17 +++ target/loongarch/lasx_helper.c | 104 +++++++++++++++++++ 5 files changed, 175 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index b6940e6389..a63ba6d6ee 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2085,6 +2085,24 @@ INSN_LASX(xvsllwil_wu_hu, xx_i) INSN_LASX(xvsllwil_du_wu, xx_i) INSN_LASX(xvextl_qu_du, xx) +INSN_LASX(xvsrlr_b, xxx) +INSN_LASX(xvsrlr_h, xxx) +INSN_LASX(xvsrlr_w, xxx) +INSN_LASX(xvsrlr_d, xxx) +INSN_LASX(xvsrlri_b, xx_i) +INSN_LASX(xvsrlri_h, xx_i) +INSN_LASX(xvsrlri_w, xx_i) +INSN_LASX(xvsrlri_d, xx_i) + +INSN_LASX(xvsrar_b, xxx) +INSN_LASX(xvsrar_h, xxx) +INSN_LASX(xvsrar_w, xxx) +INSN_LASX(xvsrar_d, xxx) +INSN_LASX(xvsrari_b, xx_i) +INSN_LASX(xvsrari_h, xx_i) +INSN_LASX(xvsrari_w, xx_i) +INSN_LASX(xvsrari_d, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 672a5f8988..6bb30ddd31 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -940,3 +940,21 @@ DEF_HELPER_4(xvsllwil_hu_bu, void, env, i32, i32, i32) DEF_HELPER_4(xvsllwil_wu_hu, void, env, i32, i32, i32) DEF_HELPER_4(xvsllwil_du_wu, void, env, i32, i32, i32) DEF_HELPER_3(xvextl_qu_du, void, env, i32, i32) + +DEF_HELPER_4(xvsrlr_b, void, env, i32, i32, i32) 
+DEF_HELPER_4(xvsrlr_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlr_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlr_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlri_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlri_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlri_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlri_d, void, env, i32, i32, i32) + +DEF_HELPER_4(xvsrar_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrar_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrar_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrar_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrari_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrari_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrari_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrari_d, void, env, i32, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 53631cea63..602ba0c800 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -2034,6 +2034,24 @@ TRANS(xvsllwil_wu_hu, gen_xx_i, gen_helper_xvsllwil_wu_hu) TRANS(xvsllwil_du_wu, gen_xx_i, gen_helper_xvsllwil_du_wu) TRANS(xvextl_qu_du, gen_xx, gen_helper_xvextl_qu_du) +TRANS(xvsrlr_b, gen_xxx, gen_helper_xvsrlr_b) +TRANS(xvsrlr_h, gen_xxx, gen_helper_xvsrlr_h) +TRANS(xvsrlr_w, gen_xxx, gen_helper_xvsrlr_w) +TRANS(xvsrlr_d, gen_xxx, gen_helper_xvsrlr_d) +TRANS(xvsrlri_b, gen_xx_i, gen_helper_xvsrlri_b) +TRANS(xvsrlri_h, gen_xx_i, gen_helper_xvsrlri_h) +TRANS(xvsrlri_w, gen_xx_i, gen_helper_xvsrlri_w) +TRANS(xvsrlri_d, gen_xx_i, gen_helper_xvsrlri_d) + +TRANS(xvsrar_b, gen_xxx, gen_helper_xvsrar_b) +TRANS(xvsrar_h, gen_xxx, gen_helper_xvsrar_h) +TRANS(xvsrar_w, gen_xxx, gen_helper_xvsrar_w) +TRANS(xvsrar_d, gen_xxx, gen_helper_xvsrar_d) +TRANS(xvsrari_b, gen_xx_i, gen_helper_xvsrari_b) +TRANS(xvsrari_h, gen_xx_i, gen_helper_xvsrari_h) +TRANS(xvsrari_w, gen_xx_i, gen_helper_xvsrari_w) +TRANS(xvsrari_d, gen_xx_i, gen_helper_xvsrari_d) + static bool 
gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index ebaddb94ea..d901ddf063 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1683,6 +1683,23 @@ xvsllwil_wu_hu 0111 01110000 11000 1 .... ..... ..... @xx_ui4 xvsllwil_du_wu 0111 01110000 11001 ..... ..... ..... @xx_ui5 xvextl_qu_du 0111 01110000 11010 00000 ..... ..... @xx +xvsrlr_b 0111 01001111 00000 ..... ..... ..... @xxx +xvsrlr_h 0111 01001111 00001 ..... ..... ..... @xxx +xvsrlr_w 0111 01001111 00010 ..... ..... ..... @xxx +xvsrlr_d 0111 01001111 00011 ..... ..... ..... @xxx +xvsrlri_b 0111 01101010 01000 01 ... ..... ..... @xx_ui3 +xvsrlri_h 0111 01101010 01000 1 .... ..... ..... @xx_ui4 +xvsrlri_w 0111 01101010 01001 ..... ..... ..... @xx_ui5 +xvsrlri_d 0111 01101010 0101 ...... ..... ..... @xx_ui6 +xvsrar_b 0111 01001111 00100 ..... ..... ..... @xxx +xvsrar_h 0111 01001111 00101 ..... ..... ..... @xxx +xvsrar_w 0111 01001111 00110 ..... ..... ..... @xxx +xvsrar_d 0111 01001111 00111 ..... ..... ..... @xxx +xvsrari_b 0111 01101010 10000 01 ... ..... ..... @xx_ui3 +xvsrari_h 0111 01101010 10000 1 .... ..... ..... @xx_ui4 +xvsrari_w 0111 01101010 10001 ..... ..... ..... @xx_ui5 +xvsrari_d 0111 01101010 1001 ...... ..... ..... @xx_ui6 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index cd0e18ac3c..ebbbf014f7 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -860,3 +860,107 @@ XVSLLWIL(xvsllwil_d_w, 64, XD, XW) XVSLLWIL(xvsllwil_hu_bu, 16, UXH, UXB) XVSLLWIL(xvsllwil_wu_hu, 32, UXW, UXH) XVSLLWIL(xvsllwil_du_wu, 64, UXD, UXW) + +#define do_xvsrlr(E, T) \ +static T do_xvsrlr_ ##E(T s1, int sh) \ +{ \ + if (sh == 0) { \ + return s1; \ + } else { \ + return (s1 >> sh) + ((s1 >> (sh - 1)) & 0x1); \ + } \ +} + +do_xvsrlr(XB, uint8_t) +do_xvsrlr(XH, uint16_t) +do_xvsrlr(XW, uint32_t) +do_xvsrlr(XD, uint64_t) + +#define XVSRLR(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E1(i) = do_xvsrlr_ ## E1(Xj->E1(i), (Xk->E2(i)) % BIT); \ + } \ +} + +XVSRLR(xvsrlr_b, 8, XB, UXB) +XVSRLR(xvsrlr_h, 16, XH, UXH) +XVSRLR(xvsrlr_w, 32, XW, UXW) +XVSRLR(xvsrlr_d, 64, XD, UXD) + +#define XVSRLRI(NAME, BIT, E) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = do_xvsrlr_ ## E(Xj->E(i), imm); \ + } \ +} + +XVSRLRI(xvsrlri_b, 8, XB) +XVSRLRI(xvsrlri_h, 16, XH) +XVSRLRI(xvsrlri_w, 32, XW) +XVSRLRI(xvsrlri_d, 64, XD) + +#define do_xvsrar(E, T) \ +static T do_xvsrar_ ##E(T s1, int sh) \ +{ \ + if (sh == 0) { \ + return s1; \ + } else { \ + return (s1 >> sh) + ((s1 >> (sh - 1)) & 0x1); \ + } \ +} + +do_xvsrar(XB, int8_t) +do_xvsrar(XH, int16_t) +do_xvsrar(XW, int32_t) +do_xvsrar(XD, int64_t) + +#define XVSRAR(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i; \ + 
XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E1(i) = do_xvsrar_ ## E1(Xj->E1(i), (Xk->E2(i)) % BIT); \ + } \ +} + +XVSRAR(xvsrar_b, 8, XB, UXB) +XVSRAR(xvsrar_h, 16, XH, UXH) +XVSRAR(xvsrar_w, 32, XW, UXW) +XVSRAR(xvsrar_d, 64, XD, UXD) + +#define XVSRARI(NAME, BIT, E) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = do_xvsrar_ ## E(Xj->E(i), imm); \ + } \ +} + +XVSRARI(xvsrari_b, 8, XB) +XVSRARI(xvsrari_h, 16, XH) +XVSRARI(xvsrari_w, 32, XW) +XVSRARI(xvsrari_d, 64, XD) From patchwork Tue Jun 20 09:37:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285508 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E9960EB64D7 for ; Tue, 20 Jun 2023 09:46:10 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqW-0002uC-6E; Tue, 20 Jun 2023 05:40:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXqQ-0002hK-3t for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:14 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqM-0006aH-I9 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:13 -0400 Received: from loongson.cn (unknown [10.2.5.185]) 
by gateway (Coremail) with SMTP id _____8BxKuqXc5FkxCUHAA--.14734S3; Tue, 20 Jun 2023 17:38:31 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S30; Tue, 20 Jun 2023 17:38:30 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 28/46] target/loongarch: Implement xvsrln xvsran Date: Tue, 20 Jun 2023 17:37:56 +0800 Message-Id: <20230620093814.123650-29-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S30 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVSRLN.{B.H/H.W/W.D}; - XVSRAN.{B.H/H.W/W.D}; - XVSRLNI.{B.H/H.W/W.D/D.Q}; - XVSRANI.{B.H/H.W/W.D/D.Q}. 
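Note on semantics: the narrowing shifts listed above work per 128-bit lane. Each wide source element is shifted right by a count taken modulo the source width, the low half of the result is kept, and the upper 64 bits of each destination lane are cleared. The following is a minimal stand-alone sketch of XVSRLN.B.H under that reading of the helper in this patch. It is not QEMU code: model_xvsrln_b_h and demo_xvsrln_b_h are hypothetical names, and the register is modelled as plain arrays rather than XReg.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of XVSRLN.B.H on a 256-bit register:
 * 16 halfwords in (xj, xk), 32 bytes out (xd). Illustration only.
 */
static void model_xvsrln_b_h(uint8_t xd[32], const uint16_t xj[16],
                             const uint16_t xk[16])
{
    for (int lane = 0; lane < 2; lane++) {
        for (int i = 0; i < 8; i++) {
            /* Shift count is taken modulo the source element width. */
            unsigned sh = xk[lane * 8 + i] % 16;
            /* Logical shift right, then keep the low 8 bits. */
            xd[lane * 16 + i] = (uint8_t)(xj[lane * 8 + i] >> sh);
        }
        /* The upper 64 bits of each 128-bit lane are cleared. */
        for (int i = 8; i < 16; i++) {
            xd[lane * 16 + i] = 0;
        }
    }
}

/* Example: 0x1234 >> 4 = 0x0123, which narrows to 0x23. */
static uint8_t demo_xvsrln_b_h(void)
{
    uint16_t xj[16] = { 0x1234 };
    uint16_t xk[16] = { 4 };
    uint8_t xd[32];

    model_xvsrln_b_h(xd, xj, xk);
    assert(xd[8] == 0);    /* cleared upper half of lane 0 */
    return xd[0];
}
```

The arithmetic variant (XVSRAN) would differ only in reading the source elements as signed values before shifting.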
Signed-off-by: Song Gao --- target/loongarch/disas.c | 16 +++ target/loongarch/helper.h | 16 +++ target/loongarch/insn_trans/trans_lasx.c.inc | 16 +++ target/loongarch/insns.decode | 17 +++ target/loongarch/lasx_helper.c | 128 +++++++++++++++++++ target/loongarch/lsx_helper.c | 2 - target/loongarch/vec.h | 2 + 7 files changed, 195 insertions(+), 2 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index a63ba6d6ee..5ea713075f 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2103,6 +2103,22 @@ INSN_LASX(xvsrari_h, xx_i) INSN_LASX(xvsrari_w, xx_i) INSN_LASX(xvsrari_d, xx_i) +INSN_LASX(xvsrln_b_h, xxx) +INSN_LASX(xvsrln_h_w, xxx) +INSN_LASX(xvsrln_w_d, xxx) +INSN_LASX(xvsran_b_h, xxx) +INSN_LASX(xvsran_h_w, xxx) +INSN_LASX(xvsran_w_d, xxx) + +INSN_LASX(xvsrlni_b_h, xx_i) +INSN_LASX(xvsrlni_h_w, xx_i) +INSN_LASX(xvsrlni_w_d, xx_i) +INSN_LASX(xvsrlni_d_q, xx_i) +INSN_LASX(xvsrani_b_h, xx_i) +INSN_LASX(xvsrani_h_w, xx_i) +INSN_LASX(xvsrani_w_d, xx_i) +INSN_LASX(xvsrani_d_q, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 6bb30ddd31..c41f8e2bc9 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -958,3 +958,19 @@ DEF_HELPER_4(xvsrari_b, void, env, i32, i32, i32) DEF_HELPER_4(xvsrari_h, void, env, i32, i32, i32) DEF_HELPER_4(xvsrari_w, void, env, i32, i32, i32) DEF_HELPER_4(xvsrari_d, void, env, i32, i32, i32) + +DEF_HELPER_4(xvsrln_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrln_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrln_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvsran_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsran_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvsran_w_d, void, env, i32, i32, i32) + +DEF_HELPER_4(xvsrlni_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlni_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlni_w_d, void, env, i32, i32, i32) 
+DEF_HELPER_4(xvsrlni_d_q, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrani_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrani_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrani_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrani_d_q, void, env, i32, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 602ba0c800..9a3c2114eb 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -2052,6 +2052,22 @@ TRANS(xvsrari_h, gen_xx_i, gen_helper_xvsrari_h) TRANS(xvsrari_w, gen_xx_i, gen_helper_xvsrari_w) TRANS(xvsrari_d, gen_xx_i, gen_helper_xvsrari_d) +TRANS(xvsrln_b_h, gen_xxx, gen_helper_xvsrln_b_h) +TRANS(xvsrln_h_w, gen_xxx, gen_helper_xvsrln_h_w) +TRANS(xvsrln_w_d, gen_xxx, gen_helper_xvsrln_w_d) +TRANS(xvsran_b_h, gen_xxx, gen_helper_xvsran_b_h) +TRANS(xvsran_h_w, gen_xxx, gen_helper_xvsran_h_w) +TRANS(xvsran_w_d, gen_xxx, gen_helper_xvsran_w_d) + +TRANS(xvsrlni_b_h, gen_xx_i, gen_helper_xvsrlni_b_h) +TRANS(xvsrlni_h_w, gen_xx_i, gen_helper_xvsrlni_h_w) +TRANS(xvsrlni_w_d, gen_xx_i, gen_helper_xvsrlni_w_d) +TRANS(xvsrlni_d_q, gen_xx_i, gen_helper_xvsrlni_d_q) +TRANS(xvsrani_b_h, gen_xx_i, gen_helper_xvsrani_b_h) +TRANS(xvsrani_h_w, gen_xx_i, gen_helper_xvsrani_h_w) +TRANS(xvsrani_w_d, gen_xx_i, gen_helper_xvsrani_w_d) +TRANS(xvsrani_d_q, gen_xx_i, gen_helper_xvsrani_d_q) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index d901ddf063..45f15e3be2 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1320,6 +1320,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 @xx_ui4 .... ........ ..... . imm:4 xj:5 xd:5 &xx_i @xx_ui5 .... ........ ..... imm:5 xj:5 xd:5 &xx_i @xx_ui6 .... ........ .... imm:6 xj:5 xd:5 &xx_i +@xx_ui7 .... ........ ... 
imm:7 xj:5 xd:5 &xx_i @xx_ui8 .... ........ .. imm:8 xj:5 xd:5 &xx_i xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx @@ -1700,6 +1701,22 @@ xvsrari_h 0111 01101010 10000 1 .... ..... ..... @xx_ui4 xvsrari_w 0111 01101010 10001 ..... ..... ..... @xx_ui5 xvsrari_d 0111 01101010 1001 ...... ..... ..... @xx_ui6 +xvsrln_b_h 0111 01001111 01001 ..... ..... ..... @xxx +xvsrln_h_w 0111 01001111 01010 ..... ..... ..... @xxx +xvsrln_w_d 0111 01001111 01011 ..... ..... ..... @xxx +xvsran_b_h 0111 01001111 01101 ..... ..... ..... @xxx +xvsran_h_w 0111 01001111 01110 ..... ..... ..... @xxx +xvsran_w_d 0111 01001111 01111 ..... ..... ..... @xxx + +xvsrlni_b_h 0111 01110100 00000 1 .... ..... ..... @xx_ui4 +xvsrlni_h_w 0111 01110100 00001 ..... ..... ..... @xx_ui5 +xvsrlni_w_d 0111 01110100 0001 ...... ..... ..... @xx_ui6 +xvsrlni_d_q 0111 01110100 001 ....... ..... ..... @xx_ui7 +xvsrani_b_h 0111 01110101 10000 1 .... ..... ..... @xx_ui4 +xvsrani_h_w 0111 01110101 10001 ..... ..... ..... @xx_ui5 +xvsrani_w_d 0111 01110101 1001 ...... ..... ..... @xx_ui6 +xvsrani_d_q 0111 01110101 101 ....... ..... ..... @xx_ui7 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index ebbbf014f7..02550646d7 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -964,3 +964,131 @@ XVSRARI(xvsrari_b, 8, XB) XVSRARI(xvsrari_h, 16, XH) XVSRARI(xvsrari_w, 32, XW) XVSRARI(xvsrari_d, 64, XD) + +#define XVSRLN(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = R_SHIFT(Xj->E2(i), (Xk->E2(i)) % BIT); \ + Xd->E1(i + max * 2) = R_SHIFT(Xj->E2(i + max), \ + Xk->E2(i + max) % BIT); \ + } \ + Xd->XD(1) = 0; \ + Xd->XD(3) = 0; \ +} + +XVSRLN(xvsrln_b_h, 16, XB, UXH) +XVSRLN(xvsrln_h_w, 32, XH, UXW) +XVSRLN(xvsrln_w_d, 64, XW, UXD) + +#define XVSRAN(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = R_SHIFT(Xj->E2(i), (Xk->E3(i)) % BIT); \ + Xd->E1(i + max * 2) = R_SHIFT(Xj->E2(i + max), \ + Xk->E3(i + max) % BIT); \ + } \ + Xd->XD(1) = 0; \ + Xd->XD(3) = 0; \ +} + +XVSRAN(xvsran_b_h, 16, XB, XH, UXH) +XVSRAN(xvsran_h_w, 32, XH, XW, UXW) +XVSRAN(xvsran_w_d, 64, XW, XD, UXD) + +#define XVSRLNI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + temp.XQ(0) = int128_zero(); \ + temp.XQ(1) = int128_zero(); \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = R_SHIFT(Xj->E2(i), imm); \ + temp.E1(i + max) = 
R_SHIFT(Xd->E2(i), imm); \ + temp.E1(i + max * 2) = R_SHIFT(Xj->E2(i + max), imm); \ + temp.E1(i + max * 3) = R_SHIFT(Xd->E2(i + max), imm); \ + } \ + *Xd = temp; \ +} + +void HELPER(xvsrlni_d_q)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t imm) +{ + XReg temp; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + temp.XQ(0) = int128_zero(); + temp.XQ(1) = int128_zero(); + temp.XD(0) = int128_getlo(int128_urshift(Xj->XQ(0), imm % 128)); + temp.XD(1) = int128_getlo(int128_urshift(Xd->XQ(0), imm % 128)); + temp.XD(2) = int128_getlo(int128_urshift(Xj->XQ(1), imm % 128)); + temp.XD(3) = int128_getlo(int128_urshift(Xd->XQ(1), imm % 128)); + *Xd = temp; +} + +XVSRLNI(xvsrlni_b_h, 16, XB, UXH) +XVSRLNI(xvsrlni_h_w, 32, XH, UXW) +XVSRLNI(xvsrlni_w_d, 64, XW, UXD) + +#define XVSRANI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + temp.XQ(0) = int128_zero(); \ + temp.XQ(1) = int128_zero(); \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = R_SHIFT(Xj->E2(i), imm); \ + temp.E1(i + max) = R_SHIFT(Xd->E2(i), imm); \ + temp.E1(i + max * 2) = R_SHIFT(Xj->E2(i + max), imm); \ + temp.E1(i + max * 3) = R_SHIFT(Xd->E2(i + max), imm); \ + } \ + *Xd = temp; \ +} + +void HELPER(xvsrani_d_q)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t imm) +{ + XReg temp; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + temp.XQ(0) = int128_zero(); + temp.XQ(1) = int128_zero(); + temp.XD(0) = int128_getlo(int128_rshift(Xj->XQ(0), imm % 128)); + temp.XD(1) = int128_getlo(int128_rshift(Xd->XQ(0), imm % 128)); + temp.XD(2) = int128_getlo(int128_rshift(Xj->XQ(1), imm % 128)); + temp.XD(3) = int128_getlo(int128_rshift(Xd->XQ(1), imm % 128)); + *Xd = temp; +} + +XVSRANI(xvsrani_b_h, 16, XB, XH) +XVSRANI(xvsrani_h_w, 32, XH, XW) 
+XVSRANI(xvsrani_w_d, 64, XW, XD) diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index e64155f38c..d21e4006f2 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ -922,8 +922,6 @@ VSRARI(vsrari_h, 16, H) VSRARI(vsrari_w, 32, W) VSRARI(vsrari_d, 64, D) -#define R_SHIFT(a, b) (a >> b) - #define VSRLN(NAME, BIT, T, E1, E2) \ void HELPER(NAME)(CPULoongArchState *env, \ uint32_t vd, uint32_t vj, uint32_t vk) \ diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index d5a880b3fd..b5cdb4b470 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -75,6 +75,8 @@ #define DO_SIGNCOV(a, b) (a == 0 ? 0 : a < 0 ? -b : b) +#define R_SHIFT(a, b) (a >> b) + uint64_t do_vmskltz_b(int64_t val); uint64_t do_vmskltz_h(int64_t val); uint64_t do_vmskltz_w(int64_t val); From patchwork Tue Jun 20 09:37:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285453 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C59F9EB64D8 for ; Tue, 20 Jun 2023 09:40:37 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXpQ-0007FK-26; Tue, 20 Jun 2023 05:39:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXp4-0007Ar-SP for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:50 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXoy-0006Nc-O4 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:50 -0400 
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 29/46] target/loongarch: Implement xvsrlrn xvsrarn
Date: Tue, 20 Jun 2023 17:37:57 +0800
Message-Id: <20230620093814.123650-30-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

This patch includes:
- XVSRLRN.{B.H/H.W/W.D};
- XVSRARN.{B.H/H.W/W.D};
- XVSRLRNI.{B.H/H.W/W.D/D.Q};
- XVSRARNI.{B.H/H.W/W.D/D.Q}.
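
For readers unfamiliar with these instructions, the per-element operation can be sketched in plain C as below. This is only an illustration under stated assumptions, not code from the patch: the names srlr_h, srar_h, srlrn_b_h and srarn_b_h are hypothetical stand-ins for the do_xvsrlr_* / do_xvsrar_* helpers, and only the 16-bit to 8-bit case is shown. The rounding term follows what the _d_q helpers in this patch compute with Int128: the last bit shifted out is added back to the shifted value, and the low half of the result is kept. The sketch also assumes that right-shifting a negative signed value is an arithmetic shift and that narrowing casts wrap, which holds for the compilers QEMU supports but is implementation-defined in ISO C.

```c
#include <stdint.h>

/* Hypothetical helper: logical shift right by sa with rounding. */
static uint16_t srlr_h(uint16_t x, unsigned sa)
{
    if (sa == 0) {
        return x;
    }
    /* Add back the last bit that was shifted out. */
    return (uint16_t)((x >> sa) + ((x >> (sa - 1)) & 1));
}

/* Hypothetical helper: arithmetic shift right by sa with rounding.
 * Assumes >> on a negative value is an arithmetic shift. */
static int16_t srar_h(int16_t x, unsigned sa)
{
    if (sa == 0) {
        return x;
    }
    return (int16_t)((x >> sa) + ((x >> (sa - 1)) & 1));
}

/* XVSRLRN.B.H-style element: shift amount taken modulo the source
 * width (the "% BIT" in the macros), result truncated to 8 bits. */
static uint8_t srlrn_b_h(uint16_t x, unsigned sa)
{
    return (uint8_t)srlr_h(x, sa % 16);
}

/* XVSRARN.B.H-style element: same, with an arithmetic shift. */
static int8_t srarn_b_h(int16_t x, unsigned sa)
{
    return (int8_t)srar_h(x, sa % 16);
}
```

The register forms in this patch take the shift amount from the matching element of xk, while the immediate forms use imm for every element and interleave results computed from xj and xd into the destination.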
Signed-off-by: Song Gao --- target/loongarch/disas.c | 16 ++ target/loongarch/helper.h | 16 ++ target/loongarch/insn_trans/trans_lasx.c.inc | 16 ++ target/loongarch/insns.decode | 16 ++ target/loongarch/lasx_helper.c | 150 +++++++++++++++++++ 5 files changed, 214 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 5ea713075f..515d99aa1f 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2119,6 +2119,22 @@ INSN_LASX(xvsrani_h_w, xx_i) INSN_LASX(xvsrani_w_d, xx_i) INSN_LASX(xvsrani_d_q, xx_i) +INSN_LASX(xvsrlrn_b_h, xxx) +INSN_LASX(xvsrlrn_h_w, xxx) +INSN_LASX(xvsrlrn_w_d, xxx) +INSN_LASX(xvsrarn_b_h, xxx) +INSN_LASX(xvsrarn_h_w, xxx) +INSN_LASX(xvsrarn_w_d, xxx) + +INSN_LASX(xvsrlrni_b_h, xx_i) +INSN_LASX(xvsrlrni_h_w, xx_i) +INSN_LASX(xvsrlrni_w_d, xx_i) +INSN_LASX(xvsrlrni_d_q, xx_i) +INSN_LASX(xvsrarni_b_h, xx_i) +INSN_LASX(xvsrarni_h_w, xx_i) +INSN_LASX(xvsrarni_w_d, xx_i) +INSN_LASX(xvsrarni_d_q, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index c41f8e2bc9..09ae21edd6 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -974,3 +974,19 @@ DEF_HELPER_4(xvsrani_b_h, void, env, i32, i32, i32) DEF_HELPER_4(xvsrani_h_w, void, env, i32, i32, i32) DEF_HELPER_4(xvsrani_w_d, void, env, i32, i32, i32) DEF_HELPER_4(xvsrani_d_q, void, env, i32, i32, i32) + +DEF_HELPER_4(xvsrlrn_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlrn_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlrn_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrarn_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrarn_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrarn_w_d, void, env, i32, i32, i32) + +DEF_HELPER_4(xvsrlrni_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlrni_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlrni_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrlrni_d_q, void, env, i32, i32, i32) 
+DEF_HELPER_4(xvsrarni_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrarni_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrarni_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvsrarni_d_q, void, env, i32, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 9a3c2114eb..5cd241bafa 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -2068,6 +2068,22 @@ TRANS(xvsrani_h_w, gen_xx_i, gen_helper_xvsrani_h_w) TRANS(xvsrani_w_d, gen_xx_i, gen_helper_xvsrani_w_d) TRANS(xvsrani_d_q, gen_xx_i, gen_helper_xvsrani_d_q) +TRANS(xvsrlrn_b_h, gen_xxx, gen_helper_xvsrlrn_b_h) +TRANS(xvsrlrn_h_w, gen_xxx, gen_helper_xvsrlrn_h_w) +TRANS(xvsrlrn_w_d, gen_xxx, gen_helper_xvsrlrn_w_d) +TRANS(xvsrarn_b_h, gen_xxx, gen_helper_xvsrarn_b_h) +TRANS(xvsrarn_h_w, gen_xxx, gen_helper_xvsrarn_h_w) +TRANS(xvsrarn_w_d, gen_xxx, gen_helper_xvsrarn_w_d) + +TRANS(xvsrlrni_b_h, gen_xx_i, gen_helper_xvsrlrni_b_h) +TRANS(xvsrlrni_h_w, gen_xx_i, gen_helper_xvsrlrni_h_w) +TRANS(xvsrlrni_w_d, gen_xx_i, gen_helper_xvsrlrni_w_d) +TRANS(xvsrlrni_d_q, gen_xx_i, gen_helper_xvsrlrni_d_q) +TRANS(xvsrarni_b_h, gen_xx_i, gen_helper_xvsrarni_b_h) +TRANS(xvsrarni_h_w, gen_xx_i, gen_helper_xvsrarni_h_w) +TRANS(xvsrarni_w_d, gen_xx_i, gen_helper_xvsrarni_w_d) +TRANS(xvsrarni_d_q, gen_xx_i, gen_helper_xvsrarni_d_q) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 45f15e3be2..0273576ada 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1717,6 +1717,22 @@ xvsrani_h_w 0111 01110101 10001 ..... ..... ..... @xx_ui5 xvsrani_w_d 0111 01110101 1001 ...... ..... ..... @xx_ui6 xvsrani_d_q 0111 01110101 101 ....... ..... ..... @xx_ui7 +xvsrlrn_b_h 0111 01001111 10001 ..... ..... ..... @xxx +xvsrlrn_h_w 0111 01001111 10010 ..... ..... 
..... @xxx +xvsrlrn_w_d 0111 01001111 10011 ..... ..... ..... @xxx +xvsrarn_b_h 0111 01001111 10101 ..... ..... ..... @xxx +xvsrarn_h_w 0111 01001111 10110 ..... ..... ..... @xxx +xvsrarn_w_d 0111 01001111 10111 ..... ..... ..... @xxx + +xvsrlrni_b_h 0111 01110100 01000 1 .... ..... ..... @xx_ui4 +xvsrlrni_h_w 0111 01110100 01001 ..... ..... ..... @xx_ui5 +xvsrlrni_w_d 0111 01110100 0101 ...... ..... ..... @xx_ui6 +xvsrlrni_d_q 0111 01110100 011 ....... ..... ..... @xx_ui7 +xvsrarni_b_h 0111 01110101 11000 1 .... ..... ..... @xx_ui4 +xvsrarni_h_w 0111 01110101 11001 ..... ..... ..... @xx_ui5 +xvsrarni_w_d 0111 01110101 1101 ...... ..... ..... @xx_ui6 +xvsrarni_d_q 0111 01110101 111 ....... ..... ..... @xx_ui7 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 02550646d7..b0d5f93a97 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -1092,3 +1092,153 @@ void HELPER(xvsrani_d_q)(CPULoongArchState *env, XVSRANI(xvsrani_b_h, 16, XB, XH) XVSRANI(xvsrani_h_w, 32, XH, XW) XVSRANI(xvsrani_w_d, 64, XW, XD) + +#define XVSRLRN(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = do_xvsrlr_ ## E2(Xj->E2(i), (Xk->E3(i)) % BIT); \ + Xd->E1(i + max * 2) = do_xvsrlr_## E2(Xj->E2(i + max), \ + Xk->E3(i + max) % BIT); \ + } \ + Xd->XD(1) = 0; \ + Xd->XD(3) = 0; \ +} + +XVSRLRN(xvsrlrn_b_h, 16, XB, XH, UXH) +XVSRLRN(xvsrlrn_h_w, 32, XH, XW, UXW) +XVSRLRN(xvsrlrn_w_d, 64, XW, XD, UXD) + +#define XVSRARN(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t 
xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = do_xvsrar_ ## E2(Xj->E2(i), (Xk->E3(i)) % BIT); \ + Xd->E1(i + max * 2) = do_xvsrar_## E2(Xj->E2(i + max), \ + Xk->E3(i + max) % BIT); \ + } \ + Xd->XD(1) = 0; \ + Xd->XD(3) = 0; \ +} + +XVSRARN(xvsrarn_b_h, 16, XB, XH, UXH) +XVSRARN(xvsrarn_h_w, 32, XH, XW, UXW) +XVSRARN(xvsrarn_w_d, 64, XW, XD, UXD) + +#define XVSRLRNI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + temp.XQ(0) = int128_zero(); \ + temp.XQ(1) = int128_zero(); \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = do_xvsrlr_ ## E2(Xj->E2(i), imm); \ + temp.E1(i + max) = do_xvsrlr_ ## E2(Xd->E2(i), imm); \ + temp.E1(i + max * 2) = do_xvsrlr_## E2(Xj->E2(i + max), imm); \ + temp.E1(i + max * 3) = do_xvsrlr_## E2(Xd->E2(i + max), imm); \ + } \ + *Xd = temp; \ +} + +void HELPER(xvsrlrni_d_q)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t imm) +{ + XReg temp; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + Int128 r1, r2, r3, r4; + + if (imm == 0) { + temp.XD(0) = int128_getlo(Xj->XQ(0)); + temp.XD(1) = int128_getlo(Xd->XQ(0)); + temp.XD(2) = int128_getlo(Xj->XQ(1)); + temp.XD(3) = int128_getlo(Xd->XQ(1)); + } else { + r1 = int128_and(int128_urshift(Xj->XQ(0), (imm - 1)), int128_one()); + r2 = int128_and(int128_urshift(Xd->XQ(0), (imm - 1)), int128_one()); + r3 = int128_and(int128_urshift(Xj->XQ(1), (imm - 1)), int128_one()); + r4 = int128_and(int128_urshift(Xd->XQ(1), (imm - 1)), int128_one()); + + temp.XD(0) = int128_getlo(int128_add(int128_urshift(Xj->XQ(0), imm), r1)); + temp.XD(1) = 
int128_getlo(int128_add(int128_urshift(Xd->XQ(0), imm), r2)); + temp.XD(2) = int128_getlo(int128_add(int128_urshift(Xj->XQ(1), imm), r3)); + temp.XD(3) = int128_getlo(int128_add(int128_urshift(Xd->XQ(1), imm), r4)); + } + *Xd = temp; +} + +XVSRLRNI(xvsrlrni_b_h, 16, XB, XH) +XVSRLRNI(xvsrlrni_h_w, 32, XH, XW) +XVSRLRNI(xvsrlrni_w_d, 64, XW, XD) + +#define XVSRARNI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + temp.XQ(0) = int128_zero(); \ + temp.XQ(1) = int128_zero(); \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = do_xvsrar_ ## E2(Xj->E2(i), imm); \ + temp.E1(i + max) = do_xvsrar_ ## E2(Xd->E2(i), imm); \ + temp.E1(i + max * 2) = do_xvsrar_## E2(Xj->E2(i + max), imm); \ + temp.E1(i + max * 3) = do_xvsrar_## E2(Xd->E2(i + max), imm); \ + } \ + *Xd = temp; \ +} + +void HELPER(xvsrarni_d_q)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t imm) +{ + XReg temp; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + Int128 r1, r2, r3, r4; + + if (imm == 0) { + temp.XD(0) = int128_getlo(Xj->XQ(0)); + temp.XD(1) = int128_getlo(Xd->XQ(0)); + temp.XD(2) = int128_getlo(Xj->XQ(1)); + temp.XD(3) = int128_getlo(Xd->XQ(1)); + } else { + r1 = int128_and(int128_rshift(Xj->XQ(0), (imm - 1)), int128_one()); + r2 = int128_and(int128_rshift(Xd->XQ(0), (imm - 1)), int128_one()); + r3 = int128_and(int128_rshift(Xj->XQ(1), (imm - 1)), int128_one()); + r4 = int128_and(int128_rshift(Xd->XQ(1), (imm - 1)), int128_one()); + + temp.XD(0) = int128_getlo(int128_add(int128_rshift(Xj->XQ(0), imm), r1)); + temp.XD(1) = int128_getlo(int128_add(int128_rshift(Xd->XQ(0), imm), r2)); + temp.XD(2) = int128_getlo(int128_add(int128_rshift(Xj->XQ(1), imm), r3)); + temp.XD(3) = int128_getlo(int128_add(int128_rshift(Xd->XQ(1), imm), r4)); + } + *Xd = temp; +} + 
+XVSRARNI(xvsrarni_b_h, 16, XB, XH)
+XVSRARNI(xvsrarni_h_w, 32, XH, XW)
+XVSRARNI(xvsrarni_w_d, 64, XW, XD)

From patchwork Tue Jun 20 09:37:58 2023
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285511
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 30/46] target/loongarch: Implement xvssrln xvssran
Date: Tue, 20 Jun 2023 17:37:58 +0800
Message-Id: <20230620093814.123650-31-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References:
<20230620093814.123650-1-gaosong@loongson.cn>

This patch includes:
- XVSSRLN.{B.H/H.W/W.D};
- XVSSRAN.{B.H/H.W/W.D};
- XVSSRLN.{BU.H/HU.W/WU.D};
- XVSSRAN.{BU.H/HU.W/WU.D};
- XVSSRLNI.{B.H/H.W/W.D/D.Q};
- XVSSRANI.{B.H/H.W/W.D/D.Q};
- XVSSRLNI.{BU.H/HU.W/WU.D/DU.Q};
- XVSSRANI.{BU.H/HU.W/WU.D/DU.Q}.
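
As a reading aid, the four saturating variants can be sketched per element in plain C as below. This is a simplified illustration, not the patch's code: the function names are hypothetical, only the 16-bit to 8-bit case is shown, and the clamp bounds are written with the <stdint.h> limits instead of the (1 << sh) - 1 mask used by do_xssrlns_*, do_xssrans_*, do_xssrlnu_* and do_xssranu_*. As in the helpers, the shift amount is reduced modulo the source width, and a right shift of a negative value is assumed to be arithmetic.

```c
#include <stdint.h>

/* XVSSRLN.B.H-style: logical shift, clamp to the signed 8-bit maximum. */
static int8_t ssrln_b_h(uint16_t x, unsigned sa)
{
    uint16_t r = (uint16_t)(x >> (sa % 16));
    return (int8_t)(r > INT8_MAX ? INT8_MAX : r);
}

/* XVSSRAN.B.H-style: arithmetic shift, clamp to [INT8_MIN, INT8_MAX]. */
static int8_t ssran_b_h(int16_t x, unsigned sa)
{
    int16_t r = (int16_t)(x >> (sa % 16));
    if (r > INT8_MAX) {
        return INT8_MAX;
    }
    if (r < INT8_MIN) {
        return INT8_MIN;
    }
    return (int8_t)r;
}

/* XVSSRLN.BU.H-style: logical shift, clamp to the unsigned 8-bit maximum. */
static uint8_t ssrln_bu_h(uint16_t x, unsigned sa)
{
    uint16_t r = (uint16_t)(x >> (sa % 16));
    return (uint8_t)(r > UINT8_MAX ? UINT8_MAX : r);
}

/* XVSSRAN.BU.H-style: negative inputs saturate to 0, the rest are
 * shifted and clamped to the unsigned 8-bit maximum. */
static uint8_t ssran_bu_h(int16_t x, unsigned sa)
{
    int16_t r;

    if (x < 0) {
        return 0;
    }
    r = (int16_t)(x >> (sa % 16));
    return (uint8_t)(r > UINT8_MAX ? UINT8_MAX : r);
}
```

The immediate forms apply the same per-element operation with imm as the shift amount and fill the destination from both xj and xd; the _d_q and _du_q forms do the equivalent clamping on Int128 values.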
Signed-off-by: Song Gao --- target/loongarch/disas.c | 30 ++ target/loongarch/helper.h | 30 ++ target/loongarch/insn_trans/trans_lasx.c.inc | 30 ++ target/loongarch/insns.decode | 30 ++ target/loongarch/lasx_helper.c | 428 +++++++++++++++++++ 5 files changed, 548 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 515d99aa1f..1f40f3aaca 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2135,6 +2135,36 @@ INSN_LASX(xvsrarni_h_w, xx_i) INSN_LASX(xvsrarni_w_d, xx_i) INSN_LASX(xvsrarni_d_q, xx_i) +INSN_LASX(xvssrln_b_h, xxx) +INSN_LASX(xvssrln_h_w, xxx) +INSN_LASX(xvssrln_w_d, xxx) +INSN_LASX(xvssran_b_h, xxx) +INSN_LASX(xvssran_h_w, xxx) +INSN_LASX(xvssran_w_d, xxx) +INSN_LASX(xvssrln_bu_h, xxx) +INSN_LASX(xvssrln_hu_w, xxx) +INSN_LASX(xvssrln_wu_d, xxx) +INSN_LASX(xvssran_bu_h, xxx) +INSN_LASX(xvssran_hu_w, xxx) +INSN_LASX(xvssran_wu_d, xxx) + +INSN_LASX(xvssrlni_b_h, xx_i) +INSN_LASX(xvssrlni_h_w, xx_i) +INSN_LASX(xvssrlni_w_d, xx_i) +INSN_LASX(xvssrlni_d_q, xx_i) +INSN_LASX(xvssrani_b_h, xx_i) +INSN_LASX(xvssrani_h_w, xx_i) +INSN_LASX(xvssrani_w_d, xx_i) +INSN_LASX(xvssrani_d_q, xx_i) +INSN_LASX(xvssrlni_bu_h, xx_i) +INSN_LASX(xvssrlni_hu_w, xx_i) +INSN_LASX(xvssrlni_wu_d, xx_i) +INSN_LASX(xvssrlni_du_q, xx_i) +INSN_LASX(xvssrani_bu_h, xx_i) +INSN_LASX(xvssrani_hu_w, xx_i) +INSN_LASX(xvssrani_wu_d, xx_i) +INSN_LASX(xvssrani_du_q, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 09ae21edd6..2d76916049 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -990,3 +990,33 @@ DEF_HELPER_4(xvsrarni_b_h, void, env, i32, i32, i32) DEF_HELPER_4(xvsrarni_h_w, void, env, i32, i32, i32) DEF_HELPER_4(xvsrarni_w_d, void, env, i32, i32, i32) DEF_HELPER_4(xvsrarni_d_q, void, env, i32, i32, i32) + +DEF_HELPER_4(xvssrln_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrln_h_w, void, env, 
i32, i32, i32) +DEF_HELPER_4(xvssrln_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssran_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssran_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssran_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrln_bu_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrln_hu_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrln_wu_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssran_bu_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssran_hu_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssran_wu_d, void, env, i32, i32, i32) + +DEF_HELPER_4(xvssrlni_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlni_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlni_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlni_d_q, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrani_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrani_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrani_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrani_d_q, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlni_bu_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlni_hu_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlni_wu_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlni_du_q, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrani_bu_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrani_hu_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrani_wu_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrani_du_q, void, env, i32, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 5cd241bafa..b6c2ced30c 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -2084,6 +2084,36 @@ TRANS(xvsrarni_h_w, gen_xx_i, gen_helper_xvsrarni_h_w) TRANS(xvsrarni_w_d, gen_xx_i, gen_helper_xvsrarni_w_d) TRANS(xvsrarni_d_q, gen_xx_i, gen_helper_xvsrarni_d_q) +TRANS(xvssrln_b_h, gen_xxx, gen_helper_xvssrln_b_h) +TRANS(xvssrln_h_w, gen_xxx, gen_helper_xvssrln_h_w) +TRANS(xvssrln_w_d, gen_xxx, 
gen_helper_xvssrln_w_d) +TRANS(xvssran_b_h, gen_xxx, gen_helper_xvssran_b_h) +TRANS(xvssran_h_w, gen_xxx, gen_helper_xvssran_h_w) +TRANS(xvssran_w_d, gen_xxx, gen_helper_xvssran_w_d) +TRANS(xvssrln_bu_h, gen_xxx, gen_helper_xvssrln_bu_h) +TRANS(xvssrln_hu_w, gen_xxx, gen_helper_xvssrln_hu_w) +TRANS(xvssrln_wu_d, gen_xxx, gen_helper_xvssrln_wu_d) +TRANS(xvssran_bu_h, gen_xxx, gen_helper_xvssran_bu_h) +TRANS(xvssran_hu_w, gen_xxx, gen_helper_xvssran_hu_w) +TRANS(xvssran_wu_d, gen_xxx, gen_helper_xvssran_wu_d) + +TRANS(xvssrlni_b_h, gen_xx_i, gen_helper_xvssrlni_b_h) +TRANS(xvssrlni_h_w, gen_xx_i, gen_helper_xvssrlni_h_w) +TRANS(xvssrlni_w_d, gen_xx_i, gen_helper_xvssrlni_w_d) +TRANS(xvssrlni_d_q, gen_xx_i, gen_helper_xvssrlni_d_q) +TRANS(xvssrani_b_h, gen_xx_i, gen_helper_xvssrani_b_h) +TRANS(xvssrani_h_w, gen_xx_i, gen_helper_xvssrani_h_w) +TRANS(xvssrani_w_d, gen_xx_i, gen_helper_xvssrani_w_d) +TRANS(xvssrani_d_q, gen_xx_i, gen_helper_xvssrani_d_q) +TRANS(xvssrlni_bu_h, gen_xx_i, gen_helper_xvssrlni_bu_h) +TRANS(xvssrlni_hu_w, gen_xx_i, gen_helper_xvssrlni_hu_w) +TRANS(xvssrlni_wu_d, gen_xx_i, gen_helper_xvssrlni_wu_d) +TRANS(xvssrlni_du_q, gen_xx_i, gen_helper_xvssrlni_du_q) +TRANS(xvssrani_bu_h, gen_xx_i, gen_helper_xvssrani_bu_h) +TRANS(xvssrani_hu_w, gen_xx_i, gen_helper_xvssrani_hu_w) +TRANS(xvssrani_wu_d, gen_xx_i, gen_helper_xvssrani_wu_d) +TRANS(xvssrani_du_q, gen_xx_i, gen_helper_xvssrani_du_q) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 0273576ada..cf3803c230 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1733,6 +1733,36 @@ xvsrarni_h_w 0111 01110101 11001 ..... ..... ..... @xx_ui5 xvsrarni_w_d 0111 01110101 1101 ...... ..... ..... @xx_ui6 xvsrarni_d_q 0111 01110101 111 ....... ..... ..... @xx_ui7 +xvssrln_b_h 0111 01001111 11001 ..... ..... ..... 
@xxx +xvssrln_h_w 0111 01001111 11010 ..... ..... ..... @xxx +xvssrln_w_d 0111 01001111 11011 ..... ..... ..... @xxx +xvssran_b_h 0111 01001111 11101 ..... ..... ..... @xxx +xvssran_h_w 0111 01001111 11110 ..... ..... ..... @xxx +xvssran_w_d 0111 01001111 11111 ..... ..... ..... @xxx +xvssrln_bu_h 0111 01010000 01001 ..... ..... ..... @xxx +xvssrln_hu_w 0111 01010000 01010 ..... ..... ..... @xxx +xvssrln_wu_d 0111 01010000 01011 ..... ..... ..... @xxx +xvssran_bu_h 0111 01010000 01101 ..... ..... ..... @xxx +xvssran_hu_w 0111 01010000 01110 ..... ..... ..... @xxx +xvssran_wu_d 0111 01010000 01111 ..... ..... ..... @xxx + +xvssrlni_b_h 0111 01110100 10000 1 .... ..... ..... @xx_ui4 +xvssrlni_h_w 0111 01110100 10001 ..... ..... ..... @xx_ui5 +xvssrlni_w_d 0111 01110100 1001 ...... ..... ..... @xx_ui6 +xvssrlni_d_q 0111 01110100 101 ....... ..... ..... @xx_ui7 +xvssrani_b_h 0111 01110110 00000 1 .... ..... ..... @xx_ui4 +xvssrani_h_w 0111 01110110 00001 ..... ..... ..... @xx_ui5 +xvssrani_w_d 0111 01110110 0001 ...... ..... ..... @xx_ui6 +xvssrani_d_q 0111 01110110 001 ....... ..... ..... @xx_ui7 +xvssrlni_bu_h 0111 01110100 11000 1 .... ..... ..... @xx_ui4 +xvssrlni_hu_w 0111 01110100 11001 ..... ..... ..... @xx_ui5 +xvssrlni_wu_d 0111 01110100 1101 ...... ..... ..... @xx_ui6 +xvssrlni_du_q 0111 01110100 111 ....... ..... ..... @xx_ui7 +xvssrani_bu_h 0111 01110110 01000 1 .... ..... ..... @xx_ui4 +xvssrani_hu_w 0111 01110110 01001 ..... ..... ..... @xx_ui5 +xvssrani_wu_d 0111 01110110 0101 ...... ..... ..... @xx_ui6 +xvssrani_du_q 0111 01110110 011 ....... ..... ..... @xx_ui7 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index b0d5f93a97..b42f412c02 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -1242,3 +1242,431 @@ void HELPER(xvsrarni_d_q)(CPULoongArchState *env, XVSRARNI(xvsrarni_b_h, 16, XB, XH) XVSRARNI(xvsrarni_h_w, 32, XH, XW) XVSRARNI(xvsrarni_w_d, 64, XW, XD) + +#define XSSRLNS(NAME, T1, T2, T3) \ +static T1 do_xssrlns_ ## NAME(T2 e2, int sa, int sh) \ +{ \ + T1 shft_res; \ + if (sa == 0) { \ + shft_res = e2; \ + } else { \ + shft_res = (((T1)e2) >> sa); \ + } \ + T3 mask; \ + mask = (1ull << sh) - 1; \ + if (shft_res > mask) { \ + return mask; \ + } else { \ + return shft_res; \ + } \ +} + +XSSRLNS(XB, uint16_t, int16_t, uint8_t) +XSSRLNS(XH, uint32_t, int32_t, uint16_t) +XSSRLNS(XW, uint64_t, int64_t, uint32_t) + +#define XVSSRLN(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = do_xssrlns_ ## E1(Xj->E2(i), \ + Xk->E3(i) % BIT, (BIT / 2) - 1); \ + Xd->E1(i + max * 2) = do_xssrlns_## E1(Xj->E2(i + max), \ + Xk->E3(i + max) % BIT, \ + (BIT / 2) - 1); \ + } \ + Xd->XD(1) = 0; \ + Xd->XD(3) = 0; \ +} + +XVSSRLN(xvssrln_b_h, 16, XB, XH, UXH) +XVSSRLN(xvssrln_h_w, 32, XH, XW, UXW) +XVSSRLN(xvssrln_w_d, 64, XW, XD, UXD) + +#define XSSRANS(E, T1, T2) \ +static T1 do_xssrans_ ## E(T1 e2, int sa, int sh) \ +{ \ + T1 shft_res; \ + if (sa == 0) { \ + shft_res = e2; \ + } else { \ + shft_res = e2 >> sa; \ + } \ + T2 mask; \ + mask = (1ll << sh) - 1; \ + if (shft_res > mask) { \ + return mask; \ + } else if (shft_res < -(mask + 1)) { \ + return ~mask; \ + } else { \ + return shft_res; \ + } \ +} + +XSSRANS(XB, int16_t, int8_t) +XSSRANS(XH, int32_t, int16_t) +XSSRANS(XW, int64_t, int32_t) + 
+#define XVSSRAN(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = do_xssrans_ ## E1(Xj->E2(i), \ + Xk->E3(i) % BIT, (BIT / 2) - 1); \ + Xd->E1(i + max * 2) = do_xssrans_## E1(Xj->E2(i + max), \ + Xk->E3(i + max) % BIT, \ + (BIT / 2) - 1); \ + } \ + Xd->XD(1) = 0; \ + Xd->XD(3) = 0; \ +} + +XVSSRAN(xvssran_b_h, 16, XB, XH, UXH) +XVSSRAN(xvssran_h_w, 32, XH, XW, UXW) +XVSSRAN(xvssran_w_d, 64, XW, XD, UXD) + +#define XSSRLNU(E, T1, T2, T3) \ +static T1 do_xssrlnu_ ## E(T3 e2, int sa, int sh) \ +{ \ + T1 shft_res; \ + if (sa == 0) { \ + shft_res = e2; \ + } else { \ + shft_res = (((T1)e2) >> sa); \ + } \ + T2 mask; \ + mask = (1ull << sh) - 1; \ + if (shft_res > mask) { \ + return mask; \ + } else { \ + return shft_res; \ + } \ +} + +XSSRLNU(XB, uint16_t, uint8_t, int16_t) +XSSRLNU(XH, uint32_t, uint16_t, int32_t) +XSSRLNU(XW, uint64_t, uint32_t, int64_t) + +#define XVSSRLNU(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = do_xssrlnu_ ## E1(Xj->E2(i), Xk->E3(i) % BIT, BIT / 2); \ + Xd->E1(i + max * 2) = do_xssrlnu_## E1(Xj->E2(i + max), \ + Xk->E3(i + max) % BIT, \ + BIT / 2); \ + } \ + Xd->XD(1) = 0; \ + Xd->XD(3) = 0; \ +} + +XVSSRLNU(xvssrln_bu_h, 16, XB, XH, UXH) +XVSSRLNU(xvssrln_hu_w, 32, XH, XW, UXW) +XVSSRLNU(xvssrln_wu_d, 64, XW, XD, UXD) + +#define XSSRANU(E, T1, T2, T3) \ +static T1 do_xssranu_ ## E(T3 e2, int sa, int sh) \ +{ \ + T1 shft_res; \ + if (sa == 0) { \ + shft_res = e2; \ + } else { \ + shft_res = e2 
>> sa; \ + } \ + if (e2 < 0) { \ + shft_res = 0; \ + } \ + T2 mask; \ + mask = (1ull << sh) - 1; \ + if (shft_res > mask) { \ + return mask; \ + } else { \ + return shft_res; \ + } \ +} + +XSSRANU(XB, uint16_t, uint8_t, int16_t) +XSSRANU(XH, uint32_t, uint16_t, int32_t) +XSSRANU(XW, uint64_t, uint32_t, int64_t) + +#define XVSSRANU(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = do_xssranu_ ## E1(Xj->E2(i), Xk->E3(i) % BIT, BIT / 2); \ + Xd->E1(i + max * 2) = do_xssranu_## E1(Xj->E2(i + max), \ + Xk->E3(i + max) % BIT, \ + BIT / 2); \ + } \ + Xd->XD(1) = 0; \ + Xd->XD(3) = 0; \ +} + +XVSSRANU(xvssran_bu_h, 16, XB, XH, UXH) +XVSSRANU(xvssran_hu_w, 32, XH, XW, UXW) +XVSSRANU(xvssran_wu_d, 64, XW, XD, UXD) + +#define XVSSRLNI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = do_xssrlns_ ## E1(Xj->E2(i), imm, (BIT / 2) - 1); \ + temp.E1(i + max) = do_xssrlns_ ## E1(Xd->E2(i), imm, (BIT / 2) - 1); \ + temp.E1(i + max * 2) = do_xssrlns_## E1(Xj->E2(i + max), \ + imm, (BIT / 2) - 1); \ + temp.E1(i + max * 3) = do_xssrlns_## E1(Xd->E2(i + max), \ + imm, (BIT / 2) - 1); \ + } \ + *Xd = temp; \ +} + +void HELPER(xvssrlni_d_q)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t imm) +{ + int i; + Int128 shft_res[4], mask; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + if (imm == 0) { + shft_res[0] = Xj->XQ(0); + shft_res[1] = Xd->XQ(0); + shft_res[2] = Xj->XQ(1); + shft_res[3] = Xd->XQ(1); + } else { + shft_res[0] = 
int128_urshift(Xj->XQ(0), imm); + shft_res[1] = int128_urshift(Xd->XQ(0), imm); + shft_res[2] = int128_urshift(Xj->XQ(1), imm); + shft_res[3] = int128_urshift(Xd->XQ(1), imm); + } + mask = int128_sub(int128_lshift(int128_one(), 63), int128_one()); + + for (i = 0; i < 4; i++) { + if (int128_ult(mask, shft_res[i])) { + Xd->XD(i) = int128_getlo(mask); + } else { + Xd->XD(i) = int128_getlo(shft_res[i]); + } + } +} + +XVSSRLNI(xvssrlni_b_h, 16, XB, XH) +XVSSRLNI(xvssrlni_h_w, 32, XH, XW) +XVSSRLNI(xvssrlni_w_d, 64, XW, XD) + +#define XVSSRANI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = do_xssrans_ ## E1(Xj->E2(i), imm, (BIT / 2) - 1); \ + temp.E1(i + max) = do_xssrans_ ## E1(Xd->E2(i), imm, (BIT / 2) - 1); \ + temp.E1(i + max * 2) = do_xssrans_## E1(Xj->E2(i + max), \ + imm, (BIT / 2) - 1); \ + temp.E1(i + max * 3) = do_xssrans_## E1(Xd->E2(i + max), \ + imm, (BIT / 2) - 1); \ + } \ + *Xd = temp; \ +} + +void HELPER(xvssrani_d_q)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t imm) +{ + int i; + Int128 shft_res[4], mask, min; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + if (imm == 0) { + shft_res[0] = Xj->XQ(0); + shft_res[1] = Xd->XQ(0); + shft_res[2] = Xj->XQ(1); + shft_res[3] = Xd->XQ(1); + } else { + shft_res[0] = int128_rshift(Xj->XQ(0), imm); + shft_res[1] = int128_rshift(Xd->XQ(0), imm); + shft_res[2] = int128_rshift(Xj->XQ(1), imm); + shft_res[3] = int128_rshift(Xd->XQ(1), imm); + } + mask = int128_sub(int128_lshift(int128_one(), 63), int128_one()); + min = int128_lshift(int128_one(), 63); + + for (i = 0; i < 4; i++) { + if (int128_gt(shft_res[i], mask)) { + Xd->XD(i) = int128_getlo(mask); + } else if (int128_lt(shft_res[i], int128_neg(min))) { + Xd->XD(i) = 
int128_getlo(min); + } else { + Xd->XD(i) = int128_getlo(shft_res[i]); + } + } +} + +XVSSRANI(xvssrani_b_h, 16, XB, XH) +XVSSRANI(xvssrani_h_w, 32, XH, XW) +XVSSRANI(xvssrani_w_d, 64, XW, XD) + +#define XVSSRLNUI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = do_xssrlnu_ ## E1(Xj->E2(i), imm, BIT / 2); \ + temp.E1(i + max) = do_xssrlnu_ ## E1(Xd->E2(i), imm, BIT / 2); \ + temp.E1(i + max * 2) = do_xssrlnu_## E1(Xj->E2(i + max), \ + imm, BIT / 2); \ + temp.E1(i + max * 3) = do_xssrlnu_## E1(Xd->E2(i + max), \ + imm, BIT / 2); \ + } \ + *Xd = temp; \ +} + +void HELPER(xvssrlni_du_q)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t imm) +{ + int i; + Int128 shft_res[4], mask; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + if (imm == 0) { + shft_res[0] = Xj->XQ(0); + shft_res[1] = Xd->XQ(0); + shft_res[2] = Xj->XQ(1); + shft_res[3] = Xd->XQ(1); + } else { + shft_res[0] = int128_urshift(Xj->XQ(0), imm); + shft_res[1] = int128_urshift(Xd->XQ(0), imm); + shft_res[2] = int128_urshift(Xj->XQ(1), imm); + shft_res[3] = int128_urshift(Xd->XQ(1), imm); + } + mask = int128_sub(int128_lshift(int128_one(), 64), int128_one()); + + for (i = 0; i < 4; i++) { + if (int128_ult(mask, shft_res[i])) { + Xd->XD(i) = int128_getlo(mask); + } else { + Xd->XD(i) = int128_getlo(shft_res[i]); + } + } +} + +XVSSRLNUI(xvssrlni_bu_h, 16, XB, XH) +XVSSRLNUI(xvssrlni_hu_w, 32, XH, XW) +XVSSRLNUI(xvssrlni_wu_d, 64, XW, XD) + +#define XVSSRANUI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < 
max; i++) { \ + temp.E1(i) = do_xssranu_ ## E1(Xj->E2(i), imm, BIT / 2); \ + temp.E1(i + max) = do_xssranu_ ## E1(Xd->E2(i), imm, BIT / 2); \ + temp.E1(i + max * 2) = do_xssranu_## E1(Xj->E2(i + max), \ + imm, BIT / 2); \ + temp.E1(i + max * 3) = do_xssranu_## E1(Xd->E2(i + max), \ + imm, BIT / 2); \ + } \ + *Xd = temp; \ +} + +void HELPER(xvssrani_du_q)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t imm) +{ + int i; + Int128 shft_res[4], mask; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + if (imm == 0) { + shft_res[0] = Xj->XQ(0); + shft_res[1] = Xd->XQ(0); + shft_res[2] = Xj->XQ(1); + shft_res[3] = Xd->XQ(1); + } else { + shft_res[0] = int128_rshift(Xj->XQ(0), imm); + shft_res[1] = int128_rshift(Xd->XQ(0), imm); + shft_res[2] = int128_rshift(Xj->XQ(1), imm); + shft_res[3] = int128_rshift(Xd->XQ(1), imm); + } + + if (int128_lt(Xj->XQ(0), int128_zero())) { + shft_res[0] = int128_zero(); + } + if (int128_lt(Xd->XQ(0), int128_zero())) { + shft_res[1] = int128_zero(); + } + if (int128_lt(Xj->XQ(1), int128_zero())) { + shft_res[2] = int128_zero(); + } + if (int128_lt(Xd->XQ(1), int128_zero())) { + shft_res[3] = int128_zero(); + } + + mask = int128_sub(int128_lshift(int128_one(), 64), int128_one()); + + for (i = 0; i < 4; i++) { + if (int128_ult(mask, shft_res[i])) { + Xd->XD(i) = int128_getlo(mask); + } else { + Xd->XD(i) = int128_getlo(shft_res[i]); + } + } +} + +XVSSRANUI(xvssrani_bu_h, 16, XB, XH) +XVSSRANUI(xvssrani_hu_w, 32, XH, XW) +XVSSRANUI(xvssrani_wu_d, 64, XW, XD) From patchwork Tue Jun 20 09:37:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285486 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate 
requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EFA53EB64DB for ; Tue, 20 Jun 2023 09:43:36 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXpH-0007Cu-2s; Tue, 20 Jun 2023 05:39:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXp5-0007BF-Gf for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:51 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXp1-0006Nu-7g for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:51 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cx+emZc5FkyiUHAA--.14781S3; Tue, 20 Jun 2023 17:38:33 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S33; Tue, 20 Jun 2023 17:38:33 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 31/46] target/loongarch: Implement xvssrlrn xvssrarn Date: Tue, 20 Jun 2023 17:37:59 +0800 Message-Id: <20230620093814.123650-32-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S33 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham 
autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVSSRLRN.{B.H/H.W/W.D}; - XVSSRARN.{B.H/H.W/W.D}; - XVSSRLRN.{BU.H/HU.W/WU.D}; - XVSSRARN.{BU.H/HU.W/WU.D}; - XVSSRLRNI.{B.H/H.W/W.D/D.Q}; - XVSSRARNI.{B.H/H.W/W.D/D.Q}; - XVSSRLRNI.{BU.H/HU.W/WU.D/DU.Q}; - XVSSRARNI.{BU.H/HU.W/WU.D/DU.Q}. Signed-off-by: Song Gao --- target/loongarch/disas.c | 30 ++ target/loongarch/helper.h | 30 ++ target/loongarch/insn_trans/trans_lasx.c.inc | 30 ++ target/loongarch/insns.decode | 30 ++ target/loongarch/lasx_helper.c | 411 +++++++++++++++++++ 5 files changed, 531 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 1f40f3aaca..da07b56dee 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2165,6 +2165,36 @@ INSN_LASX(xvssrani_hu_w, xx_i) INSN_LASX(xvssrani_wu_d, xx_i) INSN_LASX(xvssrani_du_q, xx_i) +INSN_LASX(xvssrlrn_b_h, xxx) +INSN_LASX(xvssrlrn_h_w, xxx) +INSN_LASX(xvssrlrn_w_d, xxx) +INSN_LASX(xvssrarn_b_h, xxx) +INSN_LASX(xvssrarn_h_w, xxx) +INSN_LASX(xvssrarn_w_d, xxx) +INSN_LASX(xvssrlrn_bu_h, xxx) +INSN_LASX(xvssrlrn_hu_w, xxx) +INSN_LASX(xvssrlrn_wu_d, xxx) +INSN_LASX(xvssrarn_bu_h, xxx) +INSN_LASX(xvssrarn_hu_w, xxx) +INSN_LASX(xvssrarn_wu_d, xxx) + +INSN_LASX(xvssrlrni_b_h, xx_i) +INSN_LASX(xvssrlrni_h_w, xx_i) +INSN_LASX(xvssrlrni_w_d, xx_i) +INSN_LASX(xvssrlrni_d_q, xx_i) +INSN_LASX(xvssrlrni_bu_h, xx_i) +INSN_LASX(xvssrlrni_hu_w, xx_i) +INSN_LASX(xvssrlrni_wu_d, xx_i) +INSN_LASX(xvssrlrni_du_q, xx_i) +INSN_LASX(xvssrarni_b_h, xx_i) +INSN_LASX(xvssrarni_h_w, xx_i) +INSN_LASX(xvssrarni_w_d, xx_i) +INSN_LASX(xvssrarni_d_q, xx_i) +INSN_LASX(xvssrarni_bu_h, xx_i) +INSN_LASX(xvssrarni_hu_w, xx_i) 
+INSN_LASX(xvssrarni_wu_d, xx_i) +INSN_LASX(xvssrarni_du_q, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 2d76916049..b5d1cff1f0 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -1020,3 +1020,33 @@ DEF_HELPER_4(xvssrani_bu_h, void, env, i32, i32, i32) DEF_HELPER_4(xvssrani_hu_w, void, env, i32, i32, i32) DEF_HELPER_4(xvssrani_wu_d, void, env, i32, i32, i32) DEF_HELPER_4(xvssrani_du_q, void, env, i32, i32, i32) + +DEF_HELPER_4(xvssrlrn_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlrn_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlrn_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarn_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarn_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarn_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlrn_bu_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlrn_hu_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlrn_wu_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarn_bu_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarn_hu_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarn_wu_d, void, env, i32, i32, i32) + +DEF_HELPER_4(xvssrlrni_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlrni_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlrni_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlrni_d_q, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarni_b_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarni_h_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarni_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarni_d_q, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlrni_bu_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlrni_hu_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlrni_wu_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrlrni_du_q, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarni_bu_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarni_hu_w, void, env, i32, i32, i32) 
+DEF_HELPER_4(xvssrarni_wu_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvssrarni_du_q, void, env, i32, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index b6c2ced30c..aa145c850b 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -2114,6 +2114,36 @@ TRANS(xvssrani_hu_w, gen_xx_i, gen_helper_xvssrani_hu_w) TRANS(xvssrani_wu_d, gen_xx_i, gen_helper_xvssrani_wu_d) TRANS(xvssrani_du_q, gen_xx_i, gen_helper_xvssrani_du_q) +TRANS(xvssrlrn_b_h, gen_xxx, gen_helper_xvssrlrn_b_h) +TRANS(xvssrlrn_h_w, gen_xxx, gen_helper_xvssrlrn_h_w) +TRANS(xvssrlrn_w_d, gen_xxx, gen_helper_xvssrlrn_w_d) +TRANS(xvssrarn_b_h, gen_xxx, gen_helper_xvssrarn_b_h) +TRANS(xvssrarn_h_w, gen_xxx, gen_helper_xvssrarn_h_w) +TRANS(xvssrarn_w_d, gen_xxx, gen_helper_xvssrarn_w_d) +TRANS(xvssrlrn_bu_h, gen_xxx, gen_helper_xvssrlrn_bu_h) +TRANS(xvssrlrn_hu_w, gen_xxx, gen_helper_xvssrlrn_hu_w) +TRANS(xvssrlrn_wu_d, gen_xxx, gen_helper_xvssrlrn_wu_d) +TRANS(xvssrarn_bu_h, gen_xxx, gen_helper_xvssrarn_bu_h) +TRANS(xvssrarn_hu_w, gen_xxx, gen_helper_xvssrarn_hu_w) +TRANS(xvssrarn_wu_d, gen_xxx, gen_helper_xvssrarn_wu_d) + +TRANS(xvssrlrni_b_h, gen_xx_i, gen_helper_xvssrlrni_b_h) +TRANS(xvssrlrni_h_w, gen_xx_i, gen_helper_xvssrlrni_h_w) +TRANS(xvssrlrni_w_d, gen_xx_i, gen_helper_xvssrlrni_w_d) +TRANS(xvssrlrni_d_q, gen_xx_i, gen_helper_xvssrlrni_d_q) +TRANS(xvssrarni_b_h, gen_xx_i, gen_helper_xvssrarni_b_h) +TRANS(xvssrarni_h_w, gen_xx_i, gen_helper_xvssrarni_h_w) +TRANS(xvssrarni_w_d, gen_xx_i, gen_helper_xvssrarni_w_d) +TRANS(xvssrarni_d_q, gen_xx_i, gen_helper_xvssrarni_d_q) +TRANS(xvssrlrni_bu_h, gen_xx_i, gen_helper_xvssrlrni_bu_h) +TRANS(xvssrlrni_hu_w, gen_xx_i, gen_helper_xvssrlrni_hu_w) +TRANS(xvssrlrni_wu_d, gen_xx_i, gen_helper_xvssrlrni_wu_d) +TRANS(xvssrlrni_du_q, gen_xx_i, gen_helper_xvssrlrni_du_q) +TRANS(xvssrarni_bu_h, gen_xx_i, gen_helper_xvssrarni_bu_h) 
+TRANS(xvssrarni_hu_w, gen_xx_i, gen_helper_xvssrarni_hu_w) +TRANS(xvssrarni_wu_d, gen_xx_i, gen_helper_xvssrarni_wu_d) +TRANS(xvssrarni_du_q, gen_xx_i, gen_helper_xvssrarni_du_q) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index cf3803c230..3aed69b766 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1763,6 +1763,36 @@ xvssrani_hu_w 0111 01110110 01001 ..... ..... ..... @xx_ui5 xvssrani_wu_d 0111 01110110 0101 ...... ..... ..... @xx_ui6 xvssrani_du_q 0111 01110110 011 ....... ..... ..... @xx_ui7 +xvssrlrn_b_h 0111 01010000 00001 ..... ..... ..... @xxx +xvssrlrn_h_w 0111 01010000 00010 ..... ..... ..... @xxx +xvssrlrn_w_d 0111 01010000 00011 ..... ..... ..... @xxx +xvssrarn_b_h 0111 01010000 00101 ..... ..... ..... @xxx +xvssrarn_h_w 0111 01010000 00110 ..... ..... ..... @xxx +xvssrarn_w_d 0111 01010000 00111 ..... ..... ..... @xxx +xvssrlrn_bu_h 0111 01010000 10001 ..... ..... ..... @xxx +xvssrlrn_hu_w 0111 01010000 10010 ..... ..... ..... @xxx +xvssrlrn_wu_d 0111 01010000 10011 ..... ..... ..... @xxx +xvssrarn_bu_h 0111 01010000 10101 ..... ..... ..... @xxx +xvssrarn_hu_w 0111 01010000 10110 ..... ..... ..... @xxx +xvssrarn_wu_d 0111 01010000 10111 ..... ..... ..... @xxx + +xvssrlrni_b_h 0111 01110101 00000 1 .... ..... ..... @xx_ui4 +xvssrlrni_h_w 0111 01110101 00001 ..... ..... ..... @xx_ui5 +xvssrlrni_w_d 0111 01110101 0001 ...... ..... ..... @xx_ui6 +xvssrlrni_d_q 0111 01110101 001 ....... ..... ..... @xx_ui7 +xvssrarni_b_h 0111 01110110 10000 1 .... ..... ..... @xx_ui4 +xvssrarni_h_w 0111 01110110 10001 ..... ..... ..... @xx_ui5 +xvssrarni_w_d 0111 01110110 1001 ...... ..... ..... @xx_ui6 +xvssrarni_d_q 0111 01110110 101 ....... ..... ..... @xx_ui7 +xvssrlrni_bu_h 0111 01110101 01000 1 .... ..... ..... @xx_ui4 +xvssrlrni_hu_w 0111 01110101 01001 ..... ..... ..... 
@xx_ui5 +xvssrlrni_wu_d 0111 01110101 0101 ...... ..... ..... @xx_ui6 +xvssrlrni_du_q 0111 01110101 011 ....... ..... ..... @xx_ui7 +xvssrarni_bu_h 0111 01110110 11000 1 .... ..... ..... @xx_ui4 +xvssrarni_hu_w 0111 01110110 11001 ..... ..... ..... @xx_ui5 +xvssrarni_wu_d 0111 01110110 1101 ...... ..... ..... @xx_ui6 +xvssrarni_du_q 0111 01110110 111 ....... ..... ..... @xx_ui7 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index b42f412c02..0e223601de 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -1670,3 +1670,414 @@ void HELPER(xvssrani_du_q)(CPULoongArchState *env, XVSSRANUI(xvssrani_bu_h, 16, XB, XH) XVSSRANUI(xvssrani_hu_w, 32, XH, XW) XVSSRANUI(xvssrani_wu_d, 64, XW, XD) + +#define XSSRLRNS(E1, E2, T1, T2, T3) \ +static T1 do_xssrlrns_ ## E1(T2 e2, int sa, int sh) \ +{ \ + T1 shft_res; \ + \ + shft_res = do_xvsrlr_ ## E2(e2, sa); \ + T1 mask; \ + mask = (1ull << sh) - 1; \ + if (shft_res > mask) { \ + return mask; \ + } else { \ + return shft_res; \ + } \ +} + +XSSRLRNS(XB, XH, uint16_t, int16_t, uint8_t) +XSSRLRNS(XH, XW, uint32_t, int32_t, uint16_t) +XSSRLRNS(XW, XD, uint64_t, int64_t, uint32_t) + +#define XVSSRLRN(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = do_xssrlrns_ ## E1(Xj->E2(i), \ + Xk->E3(i) % BIT, (BIT / 2) - 1); \ + Xd->E1(i + max * 2) = do_xssrlrns_## E1(Xj->E2(i + max), \ + Xk->E3(i + max) % BIT, \ + (BIT / 2) - 1); \ + } \ + Xd->XD(1) = 0; \ + Xd->XD(3) = 0; \ +} + +XVSSRLRN(xvssrlrn_b_h, 16, XB, XH, UXH) +XVSSRLRN(xvssrlrn_h_w, 
32, XH, XW, UXW) +XVSSRLRN(xvssrlrn_w_d, 64, XW, XD, UXD) + +#define XSSRARNS(E1, E2, T1, T2) \ +static T1 do_xssrarns_ ## E1(T1 e2, int sa, int sh) \ +{ \ + T1 shft_res; \ + \ + shft_res = do_xvsrar_ ## E2(e2, sa); \ + T2 mask; \ + mask = (1ll << sh) - 1; \ + if (shft_res > mask) { \ + return mask; \ + } else if (shft_res < -(mask + 1)) { \ + return ~mask; \ + } else { \ + return shft_res; \ + } \ +} + +XSSRARNS(XB, XH, int16_t, int8_t) +XSSRARNS(XH, XW, int32_t, int16_t) +XSSRARNS(XW, XD, int64_t, int32_t) + +#define XVSSRARN(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = do_xssrarns_ ## E1(Xj->E2(i), \ + Xk->E3(i) % BIT, (BIT / 2) - 1); \ + Xd->E1(i + max * 2) = do_xssrarns_## E1(Xj->E2(i + max), \ + Xk->E3(i + max) % BIT, \ + (BIT / 2) - 1); \ + } \ + Xd->XD(1) = 0; \ + Xd->XD(3) = 0; \ +} + +XVSSRARN(xvssrarn_b_h, 16, XB, XH, UXH) +XVSSRARN(xvssrarn_h_w, 32, XH, XW, UXW) +XVSSRARN(xvssrarn_w_d, 64, XW, XD, UXD) + +#define XSSRLRNU(E1, E2, T1, T2, T3) \ +static T1 do_xssrlrnu_ ## E1(T3 e2, int sa, int sh) \ +{ \ + T1 shft_res; \ + \ + shft_res = do_xvsrlr_ ## E2(e2, sa); \ + \ + T2 mask; \ + mask = (1ull << sh) - 1; \ + if (shft_res > mask) { \ + return mask; \ + } else { \ + return shft_res; \ + } \ +} + +XSSRLRNU(XB, XH, uint16_t, uint8_t, int16_t) +XSSRLRNU(XH, XW, uint32_t, uint16_t, int32_t) +XSSRLRNU(XW, XD, uint64_t, uint32_t, int64_t) + +#define XVSSRLRNU(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = 
do_xssrlrnu_ ## E1(Xj->E2(i), Xk->E3(i) % BIT, BIT / 2); \ + Xd->E1(i + max * 2) = do_xssrlrnu_## E1(Xj->E2(i + max), \ + Xk->E3(i + max) % BIT, \ + BIT / 2); \ + } \ + Xd->XD(1) = 0; \ + Xd->XD(3) = 0; \ +} + +XVSSRLRNU(xvssrlrn_bu_h, 16, XB, XH, UXH) +XVSSRLRNU(xvssrlrn_hu_w, 32, XH, XW, UXW) +XVSSRLRNU(xvssrlrn_wu_d, 64, XW, XD, UXD) + +#define XSSRARNU(E1, E2, T1, T2, T3) \ +static T1 do_xssrarnu_ ## E1(T3 e2, int sa, int sh) \ +{ \ + T1 shft_res; \ + \ + if (e2 < 0) { \ + shft_res = 0; \ + } else { \ + shft_res = do_xvsrar_ ## E2(e2, sa); \ + } \ + T2 mask; \ + mask = (1ull << sh) - 1; \ + if (shft_res > mask) { \ + return mask; \ + } else { \ + return shft_res; \ + } \ +} + +XSSRARNU(XB, XH, uint16_t, uint8_t, int16_t) +XSSRARNU(XH, XW, uint32_t, uint16_t, int32_t) +XSSRARNU(XW, XD, uint64_t, uint32_t, int64_t) + +#define XVSSRARNU(NAME, BIT, E1, E2, E3) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + Xd->E1(i) = do_xssrarnu_ ## E1(Xj->E2(i), Xk->E3(i) % BIT, BIT / 2); \ + Xd->E1(i + max * 2) = do_xssrarnu_## E1(Xj->E2(i + max), \ + Xk->E3(i + max) % BIT, \ + BIT / 2); \ + } \ + Xd->XD(1) = 0; \ + Xd->XD(3) = 0; \ +} + +XVSSRARNU(xvssrarn_bu_h, 16, XB, XH, UXH) +XVSSRARNU(xvssrarn_hu_w, 32, XH, XW, UXW) +XVSSRARNU(xvssrarn_wu_d, 64, XW, XD, UXD) + +#define XVSSRLRNI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = do_xssrlrns_ ## E1(Xj->E2(i), imm, (BIT / 2) - 1); \ + temp.E1(i + max) = do_xssrlrns_ ## E1(Xd->E2(i), imm, (BIT / 2) - 1); \ + temp.E1(i + max * 2) = 
do_xssrlrns_## E1(Xj->E2(i + max), \ + imm, (BIT / 2) - 1); \ + temp.E1(i + max * 3) = do_xssrlrns_## E1(Xd->E2(i + max), \ + imm, (BIT / 2) - 1); \ + } \ + *Xd = temp; \ +} + +#define XVSSRLRNI_Q(NAME, sh) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i; \ + Int128 shft_res[4], r[4], mask; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + if (imm == 0) { \ + shft_res[0] = Xj->XQ(0); \ + shft_res[1] = Xd->XQ(0); \ + shft_res[2] = Xj->XQ(1); \ + shft_res[3] = Xd->XQ(1); \ + } else { \ + r[0] = int128_and(int128_urshift(Xj->XQ(0), (imm - 1)), int128_one()); \ + r[1] = int128_and(int128_urshift(Xd->XQ(0), (imm - 1)), int128_one()); \ + r[2] = int128_and(int128_urshift(Xj->XQ(1), (imm - 1)), int128_one()); \ + r[3] = int128_and(int128_urshift(Xd->XQ(1), (imm - 1)), int128_one()); \ + \ + shft_res[0] = (int128_add(int128_urshift(Xj->XQ(0), imm), r[0])); \ + shft_res[1] = (int128_add(int128_urshift(Xd->XQ(0), imm), r[1])); \ + shft_res[2] = (int128_add(int128_urshift(Xj->XQ(1), imm), r[2])); \ + shft_res[3] = (int128_add(int128_urshift(Xd->XQ(1), imm), r[3])); \ + } \ + \ + mask = int128_sub(int128_lshift(int128_one(), sh), int128_one()); \ + \ + for (i = 0; i < 4; i++) { \ + if (int128_ult(mask, shft_res[i])) { \ + Xd->XD(i) = int128_getlo(mask); \ + } else { \ + Xd->XD(i) = int128_getlo(shft_res[i]); \ + } \ + } \ +} + +XVSSRLRNI(xvssrlrni_b_h, 16, XB, XH) +XVSSRLRNI(xvssrlrni_h_w, 32, XH, XW) +XVSSRLRNI(xvssrlrni_w_d, 64, XW, XD) +XVSSRLRNI_Q(xvssrlrni_d_q, 63) + +#define XVSSRARNI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = do_xssrarns_ ## E1(Xj->E2(i), imm, (BIT / 2) - 1); \ + temp.E1(i + max) = do_xssrarns_ ## E1(Xd->E2(i), 
imm, (BIT / 2) - 1); \ + temp.E1(i + max * 2) = do_xssrarns_## E1(Xj->E2(i + max), \ + imm, (BIT / 2) - 1); \ + temp.E1(i + max * 3) = do_xssrarns_## E1(Xd->E2(i + max), \ + imm, (BIT / 2) - 1); \ + } \ + *Xd = temp; \ +} + +void HELPER(xvssrarni_d_q)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t imm) +{ + int i; + Int128 shft_res[4], r[4], mask1, mask2; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + if (imm == 0) { + shft_res[0] = Xj->XQ(0); + shft_res[1] = Xd->XQ(0); + shft_res[2] = Xj->XQ(1); + shft_res[3] = Xd->XQ(1); + } else { + r[0] = int128_and(int128_rshift(Xj->XQ(0), (imm - 1)), int128_one()); + r[1] = int128_and(int128_rshift(Xd->XQ(0), (imm - 1)), int128_one()); + r[2] = int128_and(int128_rshift(Xj->XQ(1), (imm - 1)), int128_one()); + r[3] = int128_and(int128_rshift(Xd->XQ(1), (imm - 1)), int128_one()); + + shft_res[0] = int128_add(int128_rshift(Xj->XQ(0), imm), r[0]); + shft_res[1] = int128_add(int128_rshift(Xd->XQ(0), imm), r[1]); + shft_res[2] = int128_add(int128_rshift(Xj->XQ(1), imm), r[2]); + shft_res[3] = int128_add(int128_rshift(Xd->XQ(1), imm), r[3]); + } + + mask1 = int128_sub(int128_lshift(int128_one(), 63), int128_one()); + mask2 = int128_lshift(int128_one(), 63); + + for (i = 0; i < 4; i++) { + if (int128_gt(shft_res[i], mask1)) { + Xd->XD(i) = int128_getlo(mask1); + } else if (int128_lt(shft_res[i], int128_neg(mask2))) { + Xd->XD(i) = int128_getlo(mask2); + } else { + Xd->XD(i) = int128_getlo(shft_res[i]); + } + } +} + +XVSSRARNI(xvssrarni_b_h, 16, XB, XH) +XVSSRARNI(xvssrarni_h_w, 32, XH, XW) +XVSSRARNI(xvssrarni_w_d, 64, XW, XD) + +#define XVSSRLRNUI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = do_xssrlrnu_ ## E1(Xj->E2(i), imm, BIT / 2); \ + 
temp.E1(i + max) = do_xssrlrnu_ ## E1(Xd->E2(i), imm, BIT / 2); \ + temp.E1(i + max * 2) = do_xssrlrnu_## E1(Xj->E2(i + max), \ + imm, BIT / 2); \ + temp.E1(i + max * 3) = do_xssrlrnu_## E1(Xd->E2(i + max), \ + imm, BIT / 2); \ + } \ + *Xd = temp; \ +} + +XVSSRLRNUI(xvssrlrni_bu_h, 16, XB, XH) +XVSSRLRNUI(xvssrlrni_hu_w, 32, XH, XW) +XVSSRLRNUI(xvssrlrni_wu_d, 64, XW, XD) +XVSSRLRNI_Q(xvssrlrni_du_q, 64) + +#define XVSSRARNUI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = do_xssrarnu_ ## E1(Xj->E2(i), imm, BIT / 2); \ + temp.E1(i + max) = do_xssrarnu_ ## E1(Xd->E2(i), imm, BIT / 2); \ + temp.E1(i + max * 2) = do_xssrarnu_## E1(Xj->E2(i + max), \ + imm, BIT / 2); \ + temp.E1(i + max * 3) = do_xssrarnu_## E1(Xd->E2(i + max), \ + imm, BIT / 2); \ + } \ + *Xd = temp; \ +} + +void HELPER(xvssrarni_du_q)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t imm) +{ + int i; + Int128 shft_res[4], r[4], mask1, mask2; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + if (imm == 0) { + shft_res[0] = Xj->XQ(0); + shft_res[1] = Xd->XQ(0); + shft_res[2] = Xj->XQ(1); + shft_res[3] = Xd->XQ(1); + } else { + r[0] = int128_and(int128_rshift(Xj->XQ(0), (imm - 1)), int128_one()); + r[1] = int128_and(int128_rshift(Xd->XQ(0), (imm - 1)), int128_one()); + r[2] = int128_and(int128_rshift(Xj->XQ(1), (imm - 1)), int128_one()); + r[3] = int128_and(int128_rshift(Xd->XQ(1), (imm - 1)), int128_one()); + + shft_res[0] = int128_add(int128_rshift(Xj->XQ(0), imm), r[0]); + shft_res[1] = int128_add(int128_rshift(Xd->XQ(0), imm), r[1]); + shft_res[2] = int128_add(int128_rshift(Xj->XQ(1), imm), r[2]); + shft_res[3] = int128_add(int128_rshift(Xd->XQ(1), imm), r[3]); + } + + if (int128_lt(Xj->XQ(0), int128_zero())) { + 
shft_res[0] = int128_zero(); + } + if (int128_lt(Xd->XQ(0), int128_zero())) { + shft_res[1] = int128_zero(); + } + if (int128_lt(Xj->XQ(1), int128_zero())) { + shft_res[2] = int128_zero(); + } + if (int128_lt(Xd->XQ(1), int128_zero())) { + shft_res[3] = int128_zero(); + } + + mask1 = int128_sub(int128_lshift(int128_one(), 64), int128_one()); + mask2 = int128_lshift(int128_one(), 64); + + for (i = 0; i < 4; i++) { + if (int128_gt(shft_res[i], mask1)) { + Xd->XD(i) = int128_getlo(mask1); + } else if (int128_lt(shft_res[i], int128_neg(mask2))) { + Xd->XD(i) = int128_getlo(mask2); + } else { + Xd->XD(i) = int128_getlo(shft_res[i]); + } + } +} + +XVSSRARNUI(xvssrarni_bu_h, 16, XB, XH) +XVSSRARNUI(xvssrarni_hu_w, 32, XH, XW) +XVSSRARNUI(xvssrarni_wu_d, 64, XW, XD) From patchwork Tue Jun 20 09:38:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285476 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5629DEB64DB for ; Tue, 20 Jun 2023 09:42:19 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqe-000465-6j; Tue, 20 Jun 2023 05:40:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXqT-0002nO-1t for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:18 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqM-0006aV-Op for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:16 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway 
(Coremail) with SMTP id _____8DxDeuac5FkzCUHAA--.14735S3; Tue, 20 Jun 2023 17:38:34 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S34; Tue, 20 Jun 2023 17:38:33 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 32/46] target/loongarch: Implement xvclo xvclz Date: Tue, 20 Jun 2023 17:38:00 +0800 Message-Id: <20230620093814.123650-33-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S34 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVCLO.{B/H/W/D}; - XVCLZ.{B/H/W/D}. 
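For readers unfamiliar with the two operations, here is a minimal standalone sketch of the per-element semantics for the byte variants. It is an illustration only, not the code in this patch: the names clz8, clo8 and vec_clo_b are hypothetical, and a portable bit loop stands in for QEMU's clz32(). The patch's DO_CLO_B(N) computes the same value as clz32(~N & 0xff) - 24; the 0xff mask plays the role of the uint8_t cast below.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count leading zero bits of one 8-bit element.
 * Returns 8 when v == 0. */
int clz8(uint8_t v)
{
    int n = 0;
    uint8_t bit;

    /* Walk from the most significant bit down until a set bit is found. */
    for (bit = 0x80; bit != 0 && (v & bit) == 0; bit >>= 1) {
        n++;
    }
    return n;
}

/* Hypothetical helper: count leading one bits, i.e. the leading zeros
 * of the complement truncated back to 8 bits. */
int clo8(uint8_t v)
{
    return clz8((uint8_t)~v);
}

/* Apply clo8 to each of the 32 byte elements of a 256-bit vector,
 * mirroring the element-wise loop shape of a 2-operand helper. */
void vec_clo_b(uint8_t dst[32], const uint8_t src[32])
{
    size_t i;

    for (i = 0; i < 32; i++) {
        dst[i] = (uint8_t)clo8(src[i]);
    }
}
```

The wider element sizes follow the same pattern with 16, 32 and 64-bit element types.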
Signed-off-by: Song Gao --- target/loongarch/disas.c | 9 +++++++++ target/loongarch/helper.h | 9 +++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 9 +++++++++ target/loongarch/insns.decode | 9 +++++++++ target/loongarch/lasx_helper.c | 21 ++++++++++++++++++++ target/loongarch/lsx_helper.c | 9 --------- target/loongarch/vec.h | 9 +++++++++ 7 files changed, 66 insertions(+), 9 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index da07b56dee..99636ca56c 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2195,6 +2195,15 @@ INSN_LASX(xvssrarni_hu_w, xx_i) INSN_LASX(xvssrarni_wu_d, xx_i) INSN_LASX(xvssrarni_du_q, xx_i) +INSN_LASX(xvclo_b, xx) +INSN_LASX(xvclo_h, xx) +INSN_LASX(xvclo_w, xx) +INSN_LASX(xvclo_d, xx) +INSN_LASX(xvclz_b, xx) +INSN_LASX(xvclz_h, xx) +INSN_LASX(xvclz_w, xx) +INSN_LASX(xvclz_d, xx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index b5d1cff1f0..950a73ec6f 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -1050,3 +1050,12 @@ DEF_HELPER_4(xvssrarni_bu_h, void, env, i32, i32, i32) DEF_HELPER_4(xvssrarni_hu_w, void, env, i32, i32, i32) DEF_HELPER_4(xvssrarni_wu_d, void, env, i32, i32, i32) DEF_HELPER_4(xvssrarni_du_q, void, env, i32, i32, i32) + +DEF_HELPER_3(xvclo_b, void, env, i32, i32) +DEF_HELPER_3(xvclo_h, void, env, i32, i32) +DEF_HELPER_3(xvclo_w, void, env, i32, i32) +DEF_HELPER_3(xvclo_d, void, env, i32, i32) +DEF_HELPER_3(xvclz_b, void, env, i32, i32) +DEF_HELPER_3(xvclz_h, void, env, i32, i32) +DEF_HELPER_3(xvclz_w, void, env, i32, i32) +DEF_HELPER_3(xvclz_d, void, env, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index aa145c850b..fa7dafa7f9 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -2144,6 +2144,15 @@ 
TRANS(xvssrarni_hu_w, gen_xx_i, gen_helper_xvssrarni_hu_w) TRANS(xvssrarni_wu_d, gen_xx_i, gen_helper_xvssrarni_wu_d) TRANS(xvssrarni_du_q, gen_xx_i, gen_helper_xvssrarni_du_q) +TRANS(xvclo_b, gen_xx, gen_helper_xvclo_b) +TRANS(xvclo_h, gen_xx, gen_helper_xvclo_h) +TRANS(xvclo_w, gen_xx, gen_helper_xvclo_w) +TRANS(xvclo_d, gen_xx, gen_helper_xvclo_d) +TRANS(xvclz_b, gen_xx, gen_helper_xvclz_b) +TRANS(xvclz_h, gen_xx, gen_helper_xvclz_h) +TRANS(xvclz_w, gen_xx, gen_helper_xvclz_w) +TRANS(xvclz_d, gen_xx, gen_helper_xvclz_d) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 3aed69b766..91de5a3815 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1793,6 +1793,15 @@ xvssrarni_hu_w 0111 01110110 11001 ..... ..... ..... @xx_ui5 xvssrarni_wu_d 0111 01110110 1101 ...... ..... ..... @xx_ui6 xvssrarni_du_q 0111 01110110 111 ....... ..... ..... @xx_ui7 +xvclo_b 0111 01101001 11000 00000 ..... ..... @xx +xvclo_h 0111 01101001 11000 00001 ..... ..... @xx +xvclo_w 0111 01101001 11000 00010 ..... ..... @xx +xvclo_d 0111 01101001 11000 00011 ..... ..... @xx +xvclz_b 0111 01101001 11000 00100 ..... ..... @xx +xvclz_h 0111 01101001 11000 00101 ..... ..... @xx +xvclz_w 0111 01101001 11000 00110 ..... ..... @xx +xvclz_d 0111 01101001 11000 00111 ..... ..... @xx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 0e223601de..122c460fb5 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -2081,3 +2081,24 @@ void HELPER(xvssrarni_du_q)(CPULoongArchState *env, XVSSRARNUI(xvssrarni_bu_h, 16, XB, XH) XVSSRARNUI(xvssrarni_hu_w, 32, XH, XW) XVSSRARNUI(xvssrarni_wu_d, 64, XW, XD) + +#define XDO_2OP(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(CPULoongArchState *env, uint32_t xd, uint32_t xj) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = DO_OP(Xj->E(i)); \ + } \ +} + +XDO_2OP(xvclo_b, 8, UXB, DO_CLO_B) +XDO_2OP(xvclo_h, 16, UXH, DO_CLO_H) +XDO_2OP(xvclo_w, 32, UXW, DO_CLO_W) +XDO_2OP(xvclo_d, 64, UXD, DO_CLO_D) +XDO_2OP(xvclz_b, 8, UXB, DO_CLZ_B) +XDO_2OP(xvclz_h, 16, UXH, DO_CLZ_H) +XDO_2OP(xvclz_w, 32, UXW, DO_CLZ_W) +XDO_2OP(xvclz_d, 64, UXD, DO_CLZ_D) diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index d21e4006f2..e1b448a2e6 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ -1910,15 +1910,6 @@ void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ } \ } -#define DO_CLO_B(N) (clz32(~N & 0xff) - 24) -#define DO_CLO_H(N) (clz32(~N & 0xffff) - 16) -#define DO_CLO_W(N) (clz32(~N)) -#define DO_CLO_D(N) (clz64(~N)) -#define DO_CLZ_B(N) (clz32(N) - 24) -#define DO_CLZ_H(N) (clz32(N) - 16) -#define DO_CLZ_W(N) (clz32(N)) -#define DO_CLZ_D(N) (clz64(N)) - DO_2OP(vclo_b, 8, UB, DO_CLO_B) DO_2OP(vclo_h, 16, UH, DO_CLO_H) DO_2OP(vclo_w, 32, UW, DO_CLO_W) diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index b5cdb4b470..db5704dd05 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -77,6 +77,15 @@ #define R_SHIFT(a, b) (a >> b) +#define DO_CLO_B(N) (clz32(~N & 0xff) - 24) +#define DO_CLO_H(N) (clz32(~N & 0xffff) - 16) +#define DO_CLO_W(N) (clz32(~N)) +#define DO_CLO_D(N) 
(clz64(~N)) +#define DO_CLZ_B(N) (clz32(N) - 24) +#define DO_CLZ_H(N) (clz32(N) - 16) +#define DO_CLZ_W(N) (clz32(N)) +#define DO_CLZ_D(N) (clz64(N)) + uint64_t do_vmskltz_b(int64_t val); uint64_t do_vmskltz_h(int64_t val); uint64_t do_vmskltz_w(int64_t val); From patchwork Tue Jun 20 09:38:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285452 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BFAE7EB64D8 for ; Tue, 20 Jun 2023 09:40:27 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqY-0003Iz-Fp; Tue, 20 Jun 2023 05:40:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXqR-0002j5-EN for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:16 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqM-0006aM-Gc for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:15 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxIuibc5FkzyUHAA--.621S3; Tue, 20 Jun 2023 17:38:35 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S35; Tue, 20 Jun 2023 17:38:34 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 33/46] target/loongarch: Implement xvpcnt Date: Tue, 20 Jun 2023 17:38:01 +0800 Message-Id: <20230620093814.123650-34-gaosong@loongson.cn> 
X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S35 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVPCNT.{B/H/W/D}.
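For reviewers: each xvpcnt variant writes, into every element of xd, the number of set bits in the matching element of xj, across all 256 bits of the register. The standalone sketch below illustrates that behaviour outside QEMU. XRegSketch, SKETCH_LASX_LEN and the sketch_* functions are hypothetical names used only here, and the hand-written bit counter stands in for the ctpop8/16/32/64 calls that the helpers in this patch use.

```c
#include <stdint.h>

/* Assumed register width for this sketch: one 256-bit LASX register. */
#define SKETCH_LASX_LEN 256

/* Hypothetical stand-in for QEMU's XReg; not the real definition. */
typedef union {
    uint8_t  b[SKETCH_LASX_LEN / 8];
    uint64_t d[SKETCH_LASX_LEN / 64];
} XRegSketch;

/* Portable population count (the patch itself relies on ctpopN). */
static unsigned sketch_popcount(uint64_t v)
{
    unsigned n = 0;

    while (v) {
        v &= v - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Byte variant: 32 independent 8-bit elements. */
static void sketch_xvpcnt_b(XRegSketch *xd, const XRegSketch *xj)
{
    for (int i = 0; i < SKETCH_LASX_LEN / 8; i++) {
        xd->b[i] = (uint8_t)sketch_popcount(xj->b[i]);
    }
}

/* Doubleword variant: 4 independent 64-bit elements. */
static void sketch_xvpcnt_d(XRegSketch *xd, const XRegSketch *xj)
{
    for (int i = 0; i < SKETCH_LASX_LEN / 64; i++) {
        xd->d[i] = sketch_popcount(xj->d[i]);
    }
}
```

The loop bound mirrors the LASX_LEN / BIT expression in the XVPCNT macro, so only the element width changes between the .b, .h, .w and .d forms.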
Signed-off-by: Song Gao --- target/loongarch/disas.c | 5 +++++ target/loongarch/helper.h | 5 +++++ target/loongarch/insn_trans/trans_lasx.c.inc | 5 +++++ target/loongarch/insns.decode | 5 +++++ target/loongarch/lasx_helper.c | 17 +++++++++++++++++ 5 files changed, 37 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 99636ca56c..b7a322651f 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2204,6 +2204,11 @@ INSN_LASX(xvclz_h, xx) INSN_LASX(xvclz_w, xx) INSN_LASX(xvclz_d, xx) +INSN_LASX(xvpcnt_b, xx) +INSN_LASX(xvpcnt_h, xx) +INSN_LASX(xvpcnt_w, xx) +INSN_LASX(xvpcnt_d, xx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 950a73ec6f..a434443819 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -1059,3 +1059,8 @@ DEF_HELPER_3(xvclz_b, void, env, i32, i32) DEF_HELPER_3(xvclz_h, void, env, i32, i32) DEF_HELPER_3(xvclz_w, void, env, i32, i32) DEF_HELPER_3(xvclz_d, void, env, i32, i32) + +DEF_HELPER_3(xvpcnt_b, void, env, i32, i32) +DEF_HELPER_3(xvpcnt_h, void, env, i32, i32) +DEF_HELPER_3(xvpcnt_w, void, env, i32, i32) +DEF_HELPER_3(xvpcnt_d, void, env, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index fa7dafa7f9..616d296432 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -2153,6 +2153,11 @@ TRANS(xvclz_h, gen_xx, gen_helper_xvclz_h) TRANS(xvclz_w, gen_xx, gen_helper_xvclz_w) TRANS(xvclz_d, gen_xx, gen_helper_xvclz_d) +TRANS(xvpcnt_b, gen_xx, gen_helper_xvpcnt_b) +TRANS(xvpcnt_h, gen_xx, gen_helper_xvpcnt_h) +TRANS(xvpcnt_w, gen_xx, gen_helper_xvpcnt_w) +TRANS(xvpcnt_d, gen_xx, gen_helper_xvpcnt_d) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode 
b/target/loongarch/insns.decode index 91de5a3815..7d49ddb0ea 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1802,6 +1802,11 @@ xvclz_h 0111 01101001 11000 00101 ..... ..... @xx xvclz_w 0111 01101001 11000 00110 ..... ..... @xx xvclz_d 0111 01101001 11000 00111 ..... ..... @xx +xvpcnt_b 0111 01101001 11000 01000 ..... ..... @xx +xvpcnt_h 0111 01101001 11000 01001 ..... ..... @xx +xvpcnt_w 0111 01101001 11000 01010 ..... ..... @xx +xvpcnt_d 0111 01101001 11000 01011 ..... ..... @xx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 122c460fb5..f04817984b 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -2102,3 +2102,20 @@ XDO_2OP(xvclz_b, 8, UXB, DO_CLZ_B) XDO_2OP(xvclz_h, 16, UXH, DO_CLZ_H) XDO_2OP(xvclz_w, 32, UXW, DO_CLZ_W) XDO_2OP(xvclz_d, 64, UXD, DO_CLZ_D) + +#define XVPCNT(NAME, BIT, E, FN) \ +void HELPER(NAME)(CPULoongArchState *env, uint32_t xd, uint32_t xj) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = FN(Xj->E(i)); \ + } \ +} + +XVPCNT(xvpcnt_b, 8, UXB, ctpop8) +XVPCNT(xvpcnt_h, 16, UXH, ctpop16) +XVPCNT(xvpcnt_w, 32, UXW, ctpop32) +XVPCNT(xvpcnt_d, 64, UXD, ctpop64) From patchwork Tue Jun 20 09:38:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285483 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 
14D02EB64D7 for ; Tue, 20 Jun 2023 09:43:28 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXpQ-0007H0-Nh; Tue, 20 Jun 2023 05:39:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXp5-0007B2-6e for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:51 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXp1-0006O8-Lu for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:50 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxHuubc5Fk0SUHAA--.14789S3; Tue, 20 Jun 2023 17:38:35 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S36; Tue, 20 Jun 2023 17:38:35 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 34/46] target/loongarch: Implement xvbitclr xvbitset xvbitrev Date: Tue, 20 Jun 2023 17:38:02 +0800 Message-Id: <20230620093814.123650-35-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S36 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action 
X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVBITCLR[I].{B/H/W/D}; - XVBITSET[I].{B/H/W/D}; - XVBITREV[I].{B/H/W/D}. Signed-off-by: Song Gao --- target/loongarch/disas.c | 25 ++ target/loongarch/helper.h | 27 ++ target/loongarch/insn_trans/trans_lasx.c.inc | 246 +++++++++++++++++++ target/loongarch/insns.decode | 27 ++ target/loongarch/lasx_helper.c | 51 ++++ target/loongarch/lsx_helper.c | 4 - target/loongarch/vec.h | 4 + 7 files changed, 380 insertions(+), 4 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index b7a322651f..60d265a9f2 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2209,6 +2209,31 @@ INSN_LASX(xvpcnt_h, xx) INSN_LASX(xvpcnt_w, xx) INSN_LASX(xvpcnt_d, xx) +INSN_LASX(xvbitclr_b, xxx) +INSN_LASX(xvbitclr_h, xxx) +INSN_LASX(xvbitclr_w, xxx) +INSN_LASX(xvbitclr_d, xxx) +INSN_LASX(xvbitclri_b, xx_i) +INSN_LASX(xvbitclri_h, xx_i) +INSN_LASX(xvbitclri_w, xx_i) +INSN_LASX(xvbitclri_d, xx_i) +INSN_LASX(xvbitset_b, xxx) +INSN_LASX(xvbitset_h, xxx) +INSN_LASX(xvbitset_w, xxx) +INSN_LASX(xvbitset_d, xxx) +INSN_LASX(xvbitseti_b, xx_i) +INSN_LASX(xvbitseti_h, xx_i) +INSN_LASX(xvbitseti_w, xx_i) +INSN_LASX(xvbitseti_d, xx_i) +INSN_LASX(xvbitrev_b, xxx) +INSN_LASX(xvbitrev_h, xxx) +INSN_LASX(xvbitrev_w, xxx) +INSN_LASX(xvbitrev_d, xxx) +INSN_LASX(xvbitrevi_b, xx_i) +INSN_LASX(xvbitrevi_h, xx_i) +INSN_LASX(xvbitrevi_w, xx_i) +INSN_LASX(xvbitrevi_d, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index a434443819..294ac477fc 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -1064,3 +1064,30 @@ 
DEF_HELPER_3(xvpcnt_b, void, env, i32, i32) DEF_HELPER_3(xvpcnt_h, void, env, i32, i32) DEF_HELPER_3(xvpcnt_w, void, env, i32, i32) DEF_HELPER_3(xvpcnt_d, void, env, i32, i32) + +DEF_HELPER_FLAGS_4(xvbitclr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvbitclr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvbitclr_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvbitclr_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvbitclri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvbitclri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvbitclri_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvbitclri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(xvbitset_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvbitset_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvbitset_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvbitset_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvbitseti_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvbitseti_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvbitseti_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvbitseti_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(xvbitrev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvbitrev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvbitrev_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvbitrev_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(xvbitrevi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvbitrevi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvbitrevi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvbitrevi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff 
--git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 616d296432..e87e000478 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -2158,6 +2158,252 @@ TRANS(xvpcnt_h, gen_xx, gen_helper_xvpcnt_h) TRANS(xvpcnt_w, gen_xx, gen_helper_xvpcnt_w) TRANS(xvpcnt_d, gen_xx, gen_helper_xvpcnt_d) +static void do_xvbitclr(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shlv_vec, INDEX_op_andc_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vbitclr, + .fno = gen_helper_xvbitclr_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vbitclr, + .fno = gen_helper_xvbitclr_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vbitclr, + .fno = gen_helper_xvbitclr_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vbitclr, + .fno = gen_helper_xvbitclr_d, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvbitclr_b, gvec_xxx, MO_8, do_xvbitclr) +TRANS(xvbitclr_h, gvec_xxx, MO_16, do_xvbitclr) +TRANS(xvbitclr_w, gvec_xxx, MO_32, do_xvbitclr) +TRANS(xvbitclr_d, gvec_xxx, MO_64, do_xvbitclr) + +static void do_xvbitclri(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + int64_t imm, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shli_vec, INDEX_op_andc_vec, 0 + }; + static const GVecGen2i op[4] = { + { + .fniv = gen_vbitclri, + .fnoi = gen_helper_xvbitclri_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vbitclri, + .fnoi = gen_helper_xvbitclri_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vbitclri, + .fnoi = gen_helper_xvbitclri_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vbitclri, + .fnoi = gen_helper_xvbitclri_d, + .opt_opc = 
vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_2i(xd_ofs, xj_ofs, oprsz, maxsz, imm, &op[vece]); +} + +TRANS(xvbitclri_b, gvec_xx_i, MO_8, do_xvbitclri) +TRANS(xvbitclri_h, gvec_xx_i, MO_16, do_xvbitclri) +TRANS(xvbitclri_w, gvec_xx_i, MO_32, do_xvbitclri) +TRANS(xvbitclri_d, gvec_xx_i, MO_64, do_xvbitclri) + +static void do_xvbitset(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shlv_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vbitset, + .fno = gen_helper_xvbitset_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vbitset, + .fno = gen_helper_xvbitset_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vbitset, + .fno = gen_helper_xvbitset_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vbitset, + .fno = gen_helper_xvbitset_d, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvbitset_b, gvec_xxx, MO_8, do_xvbitset) +TRANS(xvbitset_h, gvec_xxx, MO_16, do_xvbitset) +TRANS(xvbitset_w, gvec_xxx, MO_32, do_xvbitset) +TRANS(xvbitset_d, gvec_xxx, MO_64, do_xvbitset) + +static void do_xvbitseti(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + int64_t imm, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shli_vec, 0 + }; + static const GVecGen2i op[4] = { + { + .fniv = gen_vbitseti, + .fnoi = gen_helper_xvbitseti_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vbitseti, + .fnoi = gen_helper_xvbitseti_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vbitseti, + .fnoi = gen_helper_xvbitseti_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vbitseti, + .fnoi = gen_helper_xvbitseti_d, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_2i(xd_ofs, xj_ofs, oprsz, maxsz, imm, &op[vece]); +} + 
+TRANS(xvbitseti_b, gvec_xx_i, MO_8, do_xvbitseti) +TRANS(xvbitseti_h, gvec_xx_i, MO_16, do_xvbitseti) +TRANS(xvbitseti_w, gvec_xx_i, MO_32, do_xvbitseti) +TRANS(xvbitseti_d, gvec_xx_i, MO_64, do_xvbitseti) + +static void do_xvbitrev(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + uint32_t xk_ofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shlv_vec, 0 + }; + static const GVecGen3 op[4] = { + { + .fniv = gen_vbitrev, + .fno = gen_helper_xvbitrev_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vbitrev, + .fno = gen_helper_xvbitrev_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vbitrev, + .fno = gen_helper_xvbitrev_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vbitrev, + .fno = gen_helper_xvbitrev_d, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_3(xd_ofs, xj_ofs, xk_ofs, oprsz, maxsz, &op[vece]); +} + +TRANS(xvbitrev_b, gvec_xxx, MO_8, do_xvbitrev) +TRANS(xvbitrev_h, gvec_xxx, MO_16, do_xvbitrev) +TRANS(xvbitrev_w, gvec_xxx, MO_32, do_xvbitrev) +TRANS(xvbitrev_d, gvec_xxx, MO_64, do_xvbitrev) + +static void do_xvbitrevi(unsigned vece, uint32_t xd_ofs, uint32_t xj_ofs, + int64_t imm, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shli_vec, 0 + }; + static const GVecGen2i op[4] = { + { + .fniv = gen_vbitrevi, + .fnoi = gen_helper_xvbitrevi_b, + .opt_opc = vecop_list, + .vece = MO_8 + }, + { + .fniv = gen_vbitrevi, + .fnoi = gen_helper_xvbitrevi_h, + .opt_opc = vecop_list, + .vece = MO_16 + }, + { + .fniv = gen_vbitrevi, + .fnoi = gen_helper_xvbitrevi_w, + .opt_opc = vecop_list, + .vece = MO_32 + }, + { + .fniv = gen_vbitrevi, + .fnoi = gen_helper_xvbitrevi_d, + .opt_opc = vecop_list, + .vece = MO_64 + }, + }; + + tcg_gen_gvec_2i(xd_ofs, xj_ofs, oprsz, maxsz, imm, &op[vece]); +} + +TRANS(xvbitrevi_b, gvec_xx_i, MO_8, do_xvbitrevi) +TRANS(xvbitrevi_h, gvec_xx_i, MO_16, do_xvbitrevi) +TRANS(xvbitrevi_w, 
gvec_xx_i, MO_32, do_xvbitrevi) +TRANS(xvbitrevi_d, gvec_xx_i, MO_64, do_xvbitrevi) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 7d49ddb0ea..47374054c6 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1807,6 +1807,33 @@ xvpcnt_h 0111 01101001 11000 01001 ..... ..... @xx xvpcnt_w 0111 01101001 11000 01010 ..... ..... @xx xvpcnt_d 0111 01101001 11000 01011 ..... ..... @xx +xvbitclr_b 0111 01010000 11000 ..... ..... ..... @xxx +xvbitclr_h 0111 01010000 11001 ..... ..... ..... @xxx +xvbitclr_w 0111 01010000 11010 ..... ..... ..... @xxx +xvbitclr_d 0111 01010000 11011 ..... ..... ..... @xxx +xvbitclri_b 0111 01110001 00000 01 ... ..... ..... @xx_ui3 +xvbitclri_h 0111 01110001 00000 1 .... ..... ..... @xx_ui4 +xvbitclri_w 0111 01110001 00001 ..... ..... ..... @xx_ui5 +xvbitclri_d 0111 01110001 0001 ...... ..... ..... @xx_ui6 + +xvbitset_b 0111 01010000 11100 ..... ..... ..... @xxx +xvbitset_h 0111 01010000 11101 ..... ..... ..... @xxx +xvbitset_w 0111 01010000 11110 ..... ..... ..... @xxx +xvbitset_d 0111 01010000 11111 ..... ..... ..... @xxx +xvbitseti_b 0111 01110001 01000 01 ... ..... ..... @xx_ui3 +xvbitseti_h 0111 01110001 01000 1 .... ..... ..... @xx_ui4 +xvbitseti_w 0111 01110001 01001 ..... ..... ..... @xx_ui5 +xvbitseti_d 0111 01110001 0101 ...... ..... ..... @xx_ui6 + +xvbitrev_b 0111 01010001 00000 ..... ..... ..... @xxx +xvbitrev_h 0111 01010001 00001 ..... ..... ..... @xxx +xvbitrev_w 0111 01010001 00010 ..... ..... ..... @xxx +xvbitrev_d 0111 01010001 00011 ..... ..... ..... @xxx +xvbitrevi_b 0111 01110001 10000 01 ... ..... ..... @xx_ui3 +xvbitrevi_h 0111 01110001 10000 1 .... ..... ..... @xx_ui4 +xvbitrevi_w 0111 01110001 10001 ..... ..... ..... @xx_ui5 +xvbitrevi_d 0111 01110001 1001 ...... ..... ..... @xx_ui6 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... 
@xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index f04817984b..7092835d30 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -2119,3 +2119,54 @@ XVPCNT(xvpcnt_b, 8, UXB, ctpop8) XVPCNT(xvpcnt_h, 16, UXH, ctpop16) XVPCNT(xvpcnt_w, 32, UXW, ctpop32) XVPCNT(xvpcnt_d, 64, UXD, ctpop64) + +#define XDO_BIT(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, void *xk, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + XReg *Xk = (XReg *)xk; \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = DO_OP(Xj->E(i), Xk->E(i) % BIT); \ + } \ +} + +XDO_BIT(xvbitclr_b, 8, UXB, DO_BITCLR) +XDO_BIT(xvbitclr_h, 16, UXH, DO_BITCLR) +XDO_BIT(xvbitclr_w, 32, UXW, DO_BITCLR) +XDO_BIT(xvbitclr_d, 64, UXD, DO_BITCLR) +XDO_BIT(xvbitset_b, 8, UXB, DO_BITSET) +XDO_BIT(xvbitset_h, 16, UXH, DO_BITSET) +XDO_BIT(xvbitset_w, 32, UXW, DO_BITSET) +XDO_BIT(xvbitset_d, 64, UXD, DO_BITSET) +XDO_BIT(xvbitrev_b, 8, UXB, DO_BITREV) +XDO_BIT(xvbitrev_h, 16, UXH, DO_BITREV) +XDO_BIT(xvbitrev_w, 32, UXW, DO_BITREV) +XDO_BIT(xvbitrev_d, 64, UXD, DO_BITREV) + +#define XDO_BITI(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *xd, void *xj, uint64_t imm, uint32_t v) \ +{ \ + int i; \ + XReg *Xd = (XReg *)xd; \ + XReg *Xj = (XReg *)xj; \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = DO_OP(Xj->E(i), imm); \ + } \ +} + +XDO_BITI(xvbitclri_b, 8, UXB, DO_BITCLR) +XDO_BITI(xvbitclri_h, 16, UXH, DO_BITCLR) +XDO_BITI(xvbitclri_w, 32, UXW, DO_BITCLR) +XDO_BITI(xvbitclri_d, 64, UXD, DO_BITCLR) +XDO_BITI(xvbitseti_b, 8, UXB, DO_BITSET) +XDO_BITI(xvbitseti_h, 16, UXH, DO_BITSET) +XDO_BITI(xvbitseti_w, 32, UXW, DO_BITSET) +XDO_BITI(xvbitseti_d, 64, UXD, DO_BITSET) +XDO_BITI(xvbitrevi_b, 8, UXB, DO_BITREV) +XDO_BITI(xvbitrevi_h, 16, UXH, DO_BITREV) +XDO_BITI(xvbitrevi_w, 32, UXW, 
DO_BITREV) +XDO_BITI(xvbitrevi_d, 64, UXD, DO_BITREV) diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index e1b448a2e6..b9fdcd3ed7 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ -1937,10 +1937,6 @@ VPCNT(vpcnt_h, 16, UH, ctpop16) VPCNT(vpcnt_w, 32, UW, ctpop32) VPCNT(vpcnt_d, 64, UD, ctpop64) -#define DO_BITCLR(a, bit) (a & ~(1ull << bit)) -#define DO_BITSET(a, bit) (a | 1ull << bit) -#define DO_BITREV(a, bit) (a ^ (1ull << bit)) - #define DO_BIT(NAME, BIT, E, DO_OP) \ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t v) \ { \ diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index db5704dd05..4d9c4eb85f 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -86,6 +86,10 @@ #define DO_CLZ_W(N) (clz32(N)) #define DO_CLZ_D(N) (clz64(N)) +#define DO_BITCLR(a, bit) (a & ~(1ull << bit)) +#define DO_BITSET(a, bit) (a | 1ull << bit) +#define DO_BITREV(a, bit) (a ^ (1ull << bit)) + uint64_t do_vmskltz_b(int64_t val); uint64_t do_vmskltz_h(int64_t val); uint64_t do_vmskltz_w(int64_t val); From patchwork Tue Jun 20 09:38:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285494 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1D8FAEB64D8 for ; Tue, 20 Jun 2023 09:44:54 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXp7-0007CS-Aq; Tue, 20 Jun 2023 05:38:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 
1qBXp4-0007Aj-H5 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:50 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXp2-0006OK-9l for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:50 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxNumbc5Fk0iUHAA--.12679S3; Tue, 20 Jun 2023 17:38:35 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S37; Tue, 20 Jun 2023 17:38:35 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 35/46] target/loongarch: Implement xvfrstp Date: Tue, 20 Jun 2023 17:38:03 +0800 Message-Id: <20230620093814.123650-36-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S37 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVFRSTP[I].{B/H}. 
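For reviewers: as the helpers in this patch implement it, xvfrstp treats each 128-bit half of the register separately. It looks for the first negative element of xj within that half, takes its index (or the element count of the half when none is negative), and stores that index into the xd element selected by the low bits of the first xk element of the same half. All other xd elements keep their previous value. The standalone sketch below shows the byte variant only; XRegBytesSketch, SKETCH_LANE_BYTES and sketch_xvfrstp_b are hypothetical names, not QEMU definitions.

```c
#include <stdint.h>

/* Assumed layout for this sketch: two 128-bit lanes of 16 signed bytes. */
#define SKETCH_LANE_BYTES 16
#define SKETCH_LANES      2

/* Hypothetical stand-in for the signed-byte view of an XReg. */
typedef struct {
    int8_t b[SKETCH_LANE_BYTES * SKETCH_LANES];
} XRegBytesSketch;

static void sketch_xvfrstp_b(XRegBytesSketch *xd,
                             const XRegBytesSketch *xj,
                             const XRegBytesSketch *xk)
{
    for (int lane = 0; lane < SKETCH_LANES; lane++) {
        int base = lane * SKETCH_LANE_BYTES;
        int first = 0;

        /* Index of the first negative byte, or 16 if there is none. */
        while (first < SKETCH_LANE_BYTES && xj->b[base + first] >= 0) {
            first++;
        }

        /* Destination slot comes from the lane's first xk byte, masked to 0..15. */
        int slot = xk->b[base] & 0xf;
        xd->b[base + slot] = (int8_t)first;
    }
}
```

The immediate forms follow the same pattern, except that the slot is derived once from the immediate (imm % max in the XVFRSTPI macro) and reused for both halves.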
Signed-off-by: Song Gao --- target/loongarch/disas.c | 5 ++ target/loongarch/helper.h | 5 ++ target/loongarch/insn_trans/trans_lasx.c.inc | 5 ++ target/loongarch/insns.decode | 5 ++ target/loongarch/lasx_helper.c | 56 ++++++++++++++++++++ 5 files changed, 76 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 60d265a9f2..5340609e6f 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2234,6 +2234,11 @@ INSN_LASX(xvbitrevi_h, xx_i) INSN_LASX(xvbitrevi_w, xx_i) INSN_LASX(xvbitrevi_d, xx_i) +INSN_LASX(xvfrstp_b, xxx) +INSN_LASX(xvfrstp_h, xxx) +INSN_LASX(xvfrstpi_b, xx_i) +INSN_LASX(xvfrstpi_h, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 294ac477fc..4db0cd25d3 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -1091,3 +1091,8 @@ DEF_HELPER_FLAGS_4(xvbitrevi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(xvbitrevi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(xvbitrevi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(xvbitrevi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_4(xvfrstp_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvfrstp_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvfrstpi_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvfrstpi_h, void, env, i32, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index e87e000478..beeb9b3ff8 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -2404,6 +2404,11 @@ TRANS(xvbitrevi_h, gvec_xx_i, MO_16, do_xvbitrevi) TRANS(xvbitrevi_w, gvec_xx_i, MO_32, do_xvbitrevi) TRANS(xvbitrevi_d, gvec_xx_i, MO_64, do_xvbitrevi) +TRANS(xvfrstp_b, gen_xxx, gen_helper_xvfrstp_b) +TRANS(xvfrstp_h, gen_xxx, gen_helper_xvfrstp_h) +TRANS(xvfrstpi_b, gen_xx_i, 
gen_helper_xvfrstpi_b) +TRANS(xvfrstpi_h, gen_xx_i, gen_helper_xvfrstpi_h) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 47374054c6..387c1e5776 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1834,6 +1834,11 @@ xvbitrevi_h 0111 01110001 10000 1 .... ..... ..... @xx_ui4 xvbitrevi_w 0111 01110001 10001 ..... ..... ..... @xx_ui5 xvbitrevi_d 0111 01110001 1001 ...... ..... ..... @xx_ui6 +xvfrstp_b 0111 01010010 10110 ..... ..... ..... @xxx +xvfrstp_h 0111 01010010 10111 ..... ..... ..... @xxx +xvfrstpi_b 0111 01101001 10100 ..... ..... ..... @xx_ui5 +xvfrstpi_h 0111 01101001 10101 ..... ..... ..... @xx_ui5 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 7092835d30..011eab46fb 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -2170,3 +2170,59 @@ XDO_BITI(xvbitrevi_b, 8, UXB, DO_BITREV) XDO_BITI(xvbitrevi_h, 16, UXH, DO_BITREV) XDO_BITI(xvbitrevi_w, 32, UXW, DO_BITREV) XDO_BITI(xvbitrevi_d, 64, UXD, DO_BITREV) + +#define XVFRSTP(NAME, BIT, MASK, E) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, j, m1, m2, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + m1 = Xk->E(0) & MASK; \ + for (i = 0; i < max; i++) { \ + if (Xj->E(i) < 0) { \ + break; \ + } \ + } \ + Xd->E(m1) = i; \ + for (j = 0; j < max; j++) { \ + if (Xj->E(j + max) < 0) { \ + break; \ + } \ + } \ + m2 = Xk->E(max) & MASK; \ + Xd->E(m2 + max) = j; \ +} + +XVFRSTP(xvfrstp_b, 8, 0xf, XB) +XVFRSTP(xvfrstp_h, 16, 0x7, XH) + +#define 
XVFRSTPI(NAME, BIT, E) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i, j, m, max; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + max = LASX_LEN / (BIT * 2); \ + m = imm % (max); \ + for (i = 0; i < max; i++) { \ + if (Xj->E(i) < 0) { \ + break; \ + } \ + } \ + Xd->E(m) = i; \ + for (j = 0; j < max; j++) { \ + if (Xj->E(j + max) < 0) { \ + break; \ + } \ + } \ + Xd->E(m + max) = j; \ +} + +XVFRSTPI(xvfrstpi_b, 8, XB) +XVFRSTPI(xvfrstpi_h, 16, XH) From patchwork Tue Jun 20 09:38:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285489 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A9EAFEB64D8 for ; Tue, 20 Jun 2023 09:44:15 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXpR-0007IY-DE; Tue, 20 Jun 2023 05:39:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXp7-0007CO-1b for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:53 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXp3-0006OZ-Di for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:52 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Cx8Oicc5Fk1SUHAA--.12760S3; Tue, 20 Jun 2023 17:38:36 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id 
AQAAf8BxduSGc5FkzIQhAA--.28394S38; Tue, 20 Jun 2023 17:38:35 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 36/46] target/loongarch: Implement LASX fpu arith instructions Date: Tue, 20 Jun 2023 17:38:04 +0800 Message-Id: <20230620093814.123650-37-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S38 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVF{ADD/SUB/MUL/DIV}.{S/D}; - XVF{MADD/MSUB/NMADD/NMSUB}.{S/D}; - XVF{MAX/MIN}.{S/D}; - XVF{MAXA/MINA}.{S/D}; - XVFLOGB.{S/D}; - XVFCLASS.{S/D}; - XVF{SQRT/RECIP/RSQRT}.{S/D}. 
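Note on the XVF{MADD/MSUB/NMADD/NMSUB} group listed above: the four variants share one per-lane multiply-add and differ only in whether the addend and/or the result is negated. The sketch below illustrates that mapping for the eight single-precision lanes of a 256-bit register. It is a minimal illustration, not the helper code from this series: XRegSketch, the SKETCH_* flags, muladd_lane and xv_muladd_s_sketch are hypothetical names, and the arithmetic uses a plain C multiply and add instead of softfloat, so it does not model fused rounding or FCSR exception flags.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LASX_LEN 256 /* LASX register width in bits */

/* Hypothetical stand-in for XReg: one 256-bit register viewed as 32-bit lanes. */
typedef struct {
    uint32_t w[LASX_LEN / 32];
} XRegSketch;

/* Hypothetical flags mirroring float_muladd_negate_c / float_muladd_negate_result. */
enum {
    SKETCH_NEGATE_C = 1 << 0,
    SKETCH_NEGATE_RESULT = 1 << 1,
};

static float f32_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}

static uint32_t f32_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));
    return bits;
}

/*
 * One lane of the 4-operand form. Written as a separate multiply and add,
 * so unlike a softfloat muladd it is not guaranteed to round only once.
 */
static float muladd_lane(float j, float k, float a, int flags)
{
    float r;

    assert((flags & ~(SKETCH_NEGATE_C | SKETCH_NEGATE_RESULT)) == 0);
    if (flags & SKETCH_NEGATE_C) {
        a = -a;
    }
    r = j * k + a;
    return (flags & SKETCH_NEGATE_RESULT) ? -r : r;
}

/* Apply the lane operation to all LASX_LEN / 32 single-precision lanes. */
static void xv_muladd_s_sketch(XRegSketch *xd, const XRegSketch *xj,
                               const XRegSketch *xk, const XRegSketch *xa,
                               int flags)
{
    int i;

    for (i = 0; i < LASX_LEN / 32; i++) {
        xd->w[i] = f32_to_bits(muladd_lane(f32_from_bits(xj->w[i]),
                                           f32_from_bits(xk->w[i]),
                                           f32_from_bits(xa->w[i]), flags));
    }
}
```

With every lane set to j = 2, k = 3, a = 1, flags of 0 correspond to the MADD form (7), SKETCH_NEGATE_C to MSUB (5), SKETCH_NEGATE_RESULT to NMADD (-7), and both flags together to NMSUB (-5).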
Signed-off-by: Song Gao --- target/loongarch/disas.c | 46 +++++++++ target/loongarch/helper.h | 41 ++++++++ target/loongarch/insn_trans/trans_lasx.c.inc | 55 +++++++++++ target/loongarch/insns.decode | 43 +++++++++ target/loongarch/lasx_helper.c | 99 ++++++++++++++++++++ target/loongarch/lsx_helper.c | 51 +++++----- target/loongarch/vec.h | 13 +++ 7 files changed, 322 insertions(+), 26 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 5340609e6f..0e4ec2bd03 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -1708,6 +1708,11 @@ static void output_x_i(DisasContext *ctx, arg_x_i *a, const char *mnemonic) output(ctx, mnemonic, "x%d, 0x%x", a->xd, a->imm); } +static void output_xxxx(DisasContext *ctx, arg_xxxx *a, const char *mnemonic) +{ + output(ctx, mnemonic, "x%d, x%d, x%d, x%d", a->xd, a->xj, a->xk, a->xa); +} + static void output_xxx(DisasContext *ctx, arg_xxx * a, const char *mnemonic) { output(ctx, mnemonic, "x%d, x%d, x%d", a->xd, a->xj, a->xk); @@ -2239,6 +2244,47 @@ INSN_LASX(xvfrstp_h, xxx) INSN_LASX(xvfrstpi_b, xx_i) INSN_LASX(xvfrstpi_h, xx_i) +INSN_LASX(xvfadd_s, xxx) +INSN_LASX(xvfadd_d, xxx) +INSN_LASX(xvfsub_s, xxx) +INSN_LASX(xvfsub_d, xxx) +INSN_LASX(xvfmul_s, xxx) +INSN_LASX(xvfmul_d, xxx) +INSN_LASX(xvfdiv_s, xxx) +INSN_LASX(xvfdiv_d, xxx) + +INSN_LASX(xvfmadd_s, xxxx) +INSN_LASX(xvfmadd_d, xxxx) +INSN_LASX(xvfmsub_s, xxxx) +INSN_LASX(xvfmsub_d, xxxx) +INSN_LASX(xvfnmadd_s, xxxx) +INSN_LASX(xvfnmadd_d, xxxx) +INSN_LASX(xvfnmsub_s, xxxx) +INSN_LASX(xvfnmsub_d, xxxx) + +INSN_LASX(xvfmax_s, xxx) +INSN_LASX(xvfmax_d, xxx) +INSN_LASX(xvfmin_s, xxx) +INSN_LASX(xvfmin_d, xxx) + +INSN_LASX(xvfmaxa_s, xxx) +INSN_LASX(xvfmaxa_d, xxx) +INSN_LASX(xvfmina_s, xxx) +INSN_LASX(xvfmina_d, xxx) + +INSN_LASX(xvflogb_s, xx) +INSN_LASX(xvflogb_d, xx) + +INSN_LASX(xvfclass_s, xx) +INSN_LASX(xvfclass_d, xx) + +INSN_LASX(xvfsqrt_s, xx) +INSN_LASX(xvfsqrt_d, xx) +INSN_LASX(xvfrecip_s, xx) +INSN_LASX(xvfrecip_d, xx) 
+INSN_LASX(xvfrsqrt_s, xx) +INSN_LASX(xvfrsqrt_d, xx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 4db0cd25d3..2e6e3f2fd3 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -1096,3 +1096,44 @@ DEF_HELPER_4(xvfrstp_b, void, env, i32, i32, i32) DEF_HELPER_4(xvfrstp_h, void, env, i32, i32, i32) DEF_HELPER_4(xvfrstpi_b, void, env, i32, i32, i32) DEF_HELPER_4(xvfrstpi_h, void, env, i32, i32, i32) + +DEF_HELPER_4(xvfadd_s, void, env, i32, i32, i32) +DEF_HELPER_4(xvfadd_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvfsub_s, void, env, i32, i32, i32) +DEF_HELPER_4(xvfsub_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvfmul_s, void, env, i32, i32, i32) +DEF_HELPER_4(xvfmul_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvfdiv_s, void, env, i32, i32, i32) +DEF_HELPER_4(xvfdiv_d, void, env, i32, i32, i32) + +DEF_HELPER_5(xvfmadd_s, void, env, i32, i32, i32, i32) +DEF_HELPER_5(xvfmadd_d, void, env, i32, i32, i32, i32) +DEF_HELPER_5(xvfmsub_s, void, env, i32, i32, i32, i32) +DEF_HELPER_5(xvfmsub_d, void, env, i32, i32, i32, i32) +DEF_HELPER_5(xvfnmadd_s, void, env, i32, i32, i32, i32) +DEF_HELPER_5(xvfnmadd_d, void, env, i32, i32, i32, i32) +DEF_HELPER_5(xvfnmsub_s, void, env, i32, i32, i32, i32) +DEF_HELPER_5(xvfnmsub_d, void, env, i32, i32, i32, i32) + +DEF_HELPER_4(xvfmax_s, void, env, i32, i32, i32) +DEF_HELPER_4(xvfmax_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvfmin_s, void, env, i32, i32, i32) +DEF_HELPER_4(xvfmin_d, void, env, i32, i32, i32) + +DEF_HELPER_4(xvfmaxa_s, void, env, i32, i32, i32) +DEF_HELPER_4(xvfmaxa_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvfmina_s, void, env, i32, i32, i32) +DEF_HELPER_4(xvfmina_d, void, env, i32, i32, i32) + +DEF_HELPER_3(xvflogb_s, void, env, i32, i32) +DEF_HELPER_3(xvflogb_d, void, env, i32, i32) + +DEF_HELPER_3(xvfclass_s, void, env, i32, i32) +DEF_HELPER_3(xvfclass_d, void, env, i32, i32) + 
+DEF_HELPER_3(xvfsqrt_s, void, env, i32, i32) +DEF_HELPER_3(xvfsqrt_d, void, env, i32, i32) +DEF_HELPER_3(xvfrecip_s, void, env, i32, i32) +DEF_HELPER_3(xvfrecip_d, void, env, i32, i32) +DEF_HELPER_3(xvfrsqrt_s, void, env, i32, i32) +DEF_HELPER_3(xvfrsqrt_d, void, env, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index beeb9b3ff8..b9785be6c5 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -15,6 +15,20 @@ #define CHECK_ASXE #endif +static bool gen_xxxx(DisasContext *ctx, arg_xxxx *a, + void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, + TCGv_i32, TCGv_i32)) +{ + TCGv_i32 xd = tcg_constant_i32(a->xd); + TCGv_i32 xj = tcg_constant_i32(a->xj); + TCGv_i32 xk = tcg_constant_i32(a->xk); + TCGv_i32 xa = tcg_constant_i32(a->xa); + + CHECK_ASXE; + func(cpu_env, xd, xj, xk, xa); + return true; +} + static bool gen_xxx(DisasContext *ctx, arg_xxx *a, void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) { @@ -2409,6 +2423,47 @@ TRANS(xvfrstp_h, gen_xxx, gen_helper_xvfrstp_h) TRANS(xvfrstpi_b, gen_xx_i, gen_helper_xvfrstpi_b) TRANS(xvfrstpi_h, gen_xx_i, gen_helper_xvfrstpi_h) +TRANS(xvfadd_s, gen_xxx, gen_helper_xvfadd_s) +TRANS(xvfadd_d, gen_xxx, gen_helper_xvfadd_d) +TRANS(xvfsub_s, gen_xxx, gen_helper_xvfsub_s) +TRANS(xvfsub_d, gen_xxx, gen_helper_xvfsub_d) +TRANS(xvfmul_s, gen_xxx, gen_helper_xvfmul_s) +TRANS(xvfmul_d, gen_xxx, gen_helper_xvfmul_d) +TRANS(xvfdiv_s, gen_xxx, gen_helper_xvfdiv_s) +TRANS(xvfdiv_d, gen_xxx, gen_helper_xvfdiv_d) + +TRANS(xvfmadd_s, gen_xxxx, gen_helper_xvfmadd_s) +TRANS(xvfmadd_d, gen_xxxx, gen_helper_xvfmadd_d) +TRANS(xvfmsub_s, gen_xxxx, gen_helper_xvfmsub_s) +TRANS(xvfmsub_d, gen_xxxx, gen_helper_xvfmsub_d) +TRANS(xvfnmadd_s, gen_xxxx, gen_helper_xvfnmadd_s) +TRANS(xvfnmadd_d, gen_xxxx, gen_helper_xvfnmadd_d) +TRANS(xvfnmsub_s, gen_xxxx, gen_helper_xvfnmsub_s) +TRANS(xvfnmsub_d, gen_xxxx, gen_helper_xvfnmsub_d) + 
+TRANS(xvfmax_s, gen_xxx, gen_helper_xvfmax_s) +TRANS(xvfmax_d, gen_xxx, gen_helper_xvfmax_d) +TRANS(xvfmin_s, gen_xxx, gen_helper_xvfmin_s) +TRANS(xvfmin_d, gen_xxx, gen_helper_xvfmin_d) + +TRANS(xvfmaxa_s, gen_xxx, gen_helper_xvfmaxa_s) +TRANS(xvfmaxa_d, gen_xxx, gen_helper_xvfmaxa_d) +TRANS(xvfmina_s, gen_xxx, gen_helper_xvfmina_s) +TRANS(xvfmina_d, gen_xxx, gen_helper_xvfmina_d) + +TRANS(xvflogb_s, gen_xx, gen_helper_xvflogb_s) +TRANS(xvflogb_d, gen_xx, gen_helper_xvflogb_d) + +TRANS(xvfclass_s, gen_xx, gen_helper_xvfclass_s) +TRANS(xvfclass_d, gen_xx, gen_helper_xvfclass_d) + +TRANS(xvfsqrt_s, gen_xx, gen_helper_xvfsqrt_s) +TRANS(xvfsqrt_d, gen_xx, gen_helper_xvfsqrt_d) +TRANS(xvfrecip_s, gen_xx, gen_helper_xvfrecip_s) +TRANS(xvfrecip_d, gen_xx, gen_helper_xvfrecip_d) +TRANS(xvfrsqrt_s, gen_xx, gen_helper_xvfrsqrt_s) +TRANS(xvfrsqrt_d, gen_xx, gen_helper_xvfrsqrt_d) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 387c1e5776..8a5d6a8d45 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1306,6 +1306,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 &xr xd rj &xx_i xd xj imm &x_i xd imm +&xxxx xd xj xk xa # # LASX Formats @@ -1322,6 +1323,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 @xx_ui6 .... ........ .... imm:6 xj:5 xd:5 &xx_i @xx_ui7 .... ........ ... imm:7 xj:5 xd:5 &xx_i @xx_ui8 .... ........ .. imm:8 xj:5 xd:5 &xx_i +@xxxx .... ........ xa:5 xk:5 xj:5 xd:5 &xxxx xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx xvadd_h 0111 01000000 10101 ..... ..... ..... @xxx @@ -1839,6 +1841,47 @@ xvfrstp_h 0111 01010010 10111 ..... ..... ..... @xxx xvfrstpi_b 0111 01101001 10100 ..... ..... ..... @xx_ui5 xvfrstpi_h 0111 01101001 10101 ..... ..... ..... @xx_ui5 +xvfadd_s 0111 01010011 00001 ..... ..... ..... @xxx +xvfadd_d 0111 01010011 00010 ..... ..... ..... 
@xxx +xvfsub_s 0111 01010011 00101 ..... ..... ..... @xxx +xvfsub_d 0111 01010011 00110 ..... ..... ..... @xxx +xvfmul_s 0111 01010011 10001 ..... ..... ..... @xxx +xvfmul_d 0111 01010011 10010 ..... ..... ..... @xxx +xvfdiv_s 0111 01010011 10101 ..... ..... ..... @xxx +xvfdiv_d 0111 01010011 10110 ..... ..... ..... @xxx + +xvfmadd_s 0000 10100001 ..... ..... ..... ..... @xxxx +xvfmadd_d 0000 10100010 ..... ..... ..... ..... @xxxx +xvfmsub_s 0000 10100101 ..... ..... ..... ..... @xxxx +xvfmsub_d 0000 10100110 ..... ..... ..... ..... @xxxx +xvfnmadd_s 0000 10101001 ..... ..... ..... ..... @xxxx +xvfnmadd_d 0000 10101010 ..... ..... ..... ..... @xxxx +xvfnmsub_s 0000 10101101 ..... ..... ..... ..... @xxxx +xvfnmsub_d 0000 10101110 ..... ..... ..... ..... @xxxx + +xvfmax_s 0111 01010011 11001 ..... ..... ..... @xxx +xvfmax_d 0111 01010011 11010 ..... ..... ..... @xxx +xvfmin_s 0111 01010011 11101 ..... ..... ..... @xxx +xvfmin_d 0111 01010011 11110 ..... ..... ..... @xxx + +xvfmaxa_s 0111 01010100 00001 ..... ..... ..... @xxx +xvfmaxa_d 0111 01010100 00010 ..... ..... ..... @xxx +xvfmina_s 0111 01010100 00101 ..... ..... ..... @xxx +xvfmina_d 0111 01010100 00110 ..... ..... ..... @xxx + +xvflogb_s 0111 01101001 11001 10001 ..... ..... @xx +xvflogb_d 0111 01101001 11001 10010 ..... ..... @xx + +xvfclass_s 0111 01101001 11001 10101 ..... ..... @xx +xvfclass_d 0111 01101001 11001 10110 ..... ..... @xx + +xvfsqrt_s 0111 01101001 11001 11001 ..... ..... @xx +xvfsqrt_d 0111 01101001 11001 11010 ..... ..... @xx +xvfrecip_s 0111 01101001 11001 11101 ..... ..... @xx +xvfrecip_d 0111 01101001 11001 11110 ..... ..... @xx +xvfrsqrt_s 0111 01101001 11010 00001 ..... ..... @xx +xvfrsqrt_d 0111 01101001 11010 00010 ..... ..... @xx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 011eab46fb..316ebd3463 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -9,6 +9,7 @@ #include "cpu.h" #include "exec/exec-all.h" #include "exec/helper-proto.h" +#include "fpu/softfloat.h" #include "internals.h" #include "vec.h" @@ -2226,3 +2227,101 @@ void HELPER(NAME)(CPULoongArchState *env, \ XVFRSTPI(xvfrstpi_b, 8, XB) XVFRSTPI(xvfrstpi_h, 16, XH) + +#define XDO_3OP_F(NAME, BIT, E, FN) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + vec_clear_cause(env); \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = FN(Xj->E(i), Xk->E(i), &env->fp_status); \ + vec_update_fcsr0(env, GETPC()); \ + } \ +} + +XDO_3OP_F(xvfadd_s, 32, UXW, float32_add) +XDO_3OP_F(xvfadd_d, 64, UXD, float64_add) +XDO_3OP_F(xvfsub_s, 32, UXW, float32_sub) +XDO_3OP_F(xvfsub_d, 64, UXD, float64_sub) +XDO_3OP_F(xvfmul_s, 32, UXW, float32_mul) +XDO_3OP_F(xvfmul_d, 64, UXD, float64_mul) +XDO_3OP_F(xvfdiv_s, 32, UXW, float32_div) +XDO_3OP_F(xvfdiv_d, 64, UXD, float64_div) +XDO_3OP_F(xvfmax_s, 32, UXW, float32_maxnum) +XDO_3OP_F(xvfmax_d, 64, UXD, float64_maxnum) +XDO_3OP_F(xvfmin_s, 32, UXW, float32_minnum) +XDO_3OP_F(xvfmin_d, 64, UXD, float64_minnum) +XDO_3OP_F(xvfmaxa_s, 32, UXW, float32_maxnummag) +XDO_3OP_F(xvfmaxa_d, 64, UXD, float64_maxnummag) +XDO_3OP_F(xvfmina_s, 32, UXW, float32_minnummag) +XDO_3OP_F(xvfmina_d, 64, UXD, float64_minnummag) + +#define XDO_4OP_F(NAME, BIT, E, FN, flags) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk, uint32_t xa) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + XReg *Xa = &(env->fpr[xa].xreg); \ + \ + vec_clear_cause(env); \ + for (i = 0; i < 
LASX_LEN / BIT; i++) { \ + Xd->E(i) = FN(Xj->E(i), Xk->E(i), Xa->E(i), flags, &env->fp_status); \ + vec_update_fcsr0(env, GETPC()); \ + } \ +} + +XDO_4OP_F(xvfmadd_s, 32, UXW, float32_muladd, 0) +XDO_4OP_F(xvfmadd_d, 64, UXD, float64_muladd, 0) +XDO_4OP_F(xvfmsub_s, 32, UXW, float32_muladd, float_muladd_negate_c) +XDO_4OP_F(xvfmsub_d, 64, UXD, float64_muladd, float_muladd_negate_c) +XDO_4OP_F(xvfnmadd_s, 32, UXW, float32_muladd, float_muladd_negate_result) +XDO_4OP_F(xvfnmadd_d, 64, UXD, float64_muladd, float_muladd_negate_result) +XDO_4OP_F(xvfnmsub_s, 32, UXW, float32_muladd, + float_muladd_negate_c | float_muladd_negate_result) +XDO_4OP_F(xvfnmsub_d, 64, UXD, float64_muladd, + float_muladd_negate_c | float_muladd_negate_result) + +#define XDO_2OP_F(NAME, BIT, E, FN) \ +void HELPER(NAME)(CPULoongArchState *env, uint32_t xd, uint32_t xj) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + vec_clear_cause(env); \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = FN(env, Xj->E(i)); \ + } \ +} + +#define XFCLASS(NAME, BIT, E, FN) \ +void HELPER(NAME)(CPULoongArchState *env, uint32_t xd, uint32_t xj) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = FN(env, Xj->E(i)); \ + } \ +} + +XFCLASS(xvfclass_s, 32, UXW, helper_fclass_s) +XFCLASS(xvfclass_d, 64, UXD, helper_fclass_d) + +XDO_2OP_F(xvflogb_s, 32, UXW, do_flogb_32) +XDO_2OP_F(xvflogb_d, 64, UXD, do_flogb_64) +XDO_2OP_F(xvfsqrt_s, 32, UXW, do_fsqrt_32) +XDO_2OP_F(xvfsqrt_d, 64, UXD, do_fsqrt_64) +XDO_2OP_F(xvfrecip_s, 32, UXW, do_frecip_32) +XDO_2OP_F(xvfrecip_d, 64, UXD, do_frecip_64) +XDO_2OP_F(xvfrsqrt_s, 32, UXW, do_frsqrt_32) +XDO_2OP_F(xvfrsqrt_d, 64, UXD, do_frsqrt_64) diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c index b9fdcd3ed7..446a1bdfe3 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/lsx_helper.c @@ 
-2029,8 +2029,7 @@ void HELPER(NAME)(CPULoongArchState *env, \ VFRSTPI(vfrstpi_b, 8, B) VFRSTPI(vfrstpi_h, 16, H) -static void vec_update_fcsr0_mask(CPULoongArchState *env, - uintptr_t pc, int mask) +void vec_update_fcsr0_mask(CPULoongArchState *env, uintptr_t pc, int mask) { int flags = get_float_exception_flags(&env->fp_status); @@ -2050,12 +2049,12 @@ static void vec_update_fcsr0_mask(CPULoongArchState *env, } } -static void vec_update_fcsr0(CPULoongArchState *env, uintptr_t pc) +void vec_update_fcsr0(CPULoongArchState *env, uintptr_t pc) { vec_update_fcsr0_mask(env, pc, 0); } -static inline void vec_clear_cause(CPULoongArchState *env) +inline void vec_clear_cause(CPULoongArchState *env) { SET_FP_CAUSE(env->fcsr0, 0); } @@ -2134,19 +2133,19 @@ void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ } \ } -#define FLOGB(BIT, T) \ -static T do_flogb_## BIT(CPULoongArchState *env, T fj) \ -{ \ - T fp, fd; \ - float_status *status = &env->fp_status; \ - FloatRoundMode old_mode = get_float_rounding_mode(status); \ - \ - set_float_rounding_mode(float_round_down, status); \ - fp = float ## BIT ##_log2(fj, status); \ - fd = float ## BIT ##_round_to_int(fp, status); \ - set_float_rounding_mode(old_mode, status); \ - vec_update_fcsr0_mask(env, GETPC(), float_flag_inexact); \ - return fd; \ +#define FLOGB(BIT, T) \ +T do_flogb_## BIT(CPULoongArchState *env, T fj) \ +{ \ + T fp, fd; \ + float_status *status = &env->fp_status; \ + FloatRoundMode old_mode = get_float_rounding_mode(status); \ + \ + set_float_rounding_mode(float_round_down, status); \ + fp = float ## BIT ##_log2(fj, status); \ + fd = float ## BIT ##_round_to_int(fp, status); \ + set_float_rounding_mode(old_mode, status); \ + vec_update_fcsr0_mask(env, GETPC(), float_flag_inexact); \ + return fd; \ } FLOGB(32, uint32_t) @@ -2167,20 +2166,20 @@ void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ FCLASS(vfclass_s, 32, UW, helper_fclass_s) FCLASS(vfclass_d, 64, UD, helper_fclass_d) 
-#define FSQRT(BIT, T) \ -static T do_fsqrt_## BIT(CPULoongArchState *env, T fj) \ -{ \ - T fd; \ - fd = float ## BIT ##_sqrt(fj, &env->fp_status); \ - vec_update_fcsr0(env, GETPC()); \ - return fd; \ +#define FSQRT(BIT, T) \ +T do_fsqrt_## BIT(CPULoongArchState *env, T fj) \ +{ \ + T fd; \ + fd = float ## BIT ##_sqrt(fj, &env->fp_status); \ + vec_update_fcsr0(env, GETPC()); \ + return fd; \ } FSQRT(32, uint32_t) FSQRT(64, uint64_t) #define FRECIP(BIT, T) \ -static T do_frecip_## BIT(CPULoongArchState *env, T fj) \ +T do_frecip_## BIT(CPULoongArchState *env, T fj) \ { \ T fd; \ fd = float ## BIT ##_div(float ## BIT ##_one, fj, &env->fp_status); \ @@ -2192,7 +2191,7 @@ FRECIP(32, uint32_t) FRECIP(64, uint64_t) #define FRSQRT(BIT, T) \ -static T do_frsqrt_## BIT(CPULoongArchState *env, T fj) \ +T do_frsqrt_## BIT(CPULoongArchState *env, T fj) \ { \ T fd, fp; \ fp = float ## BIT ##_sqrt(fj, &env->fp_status); \ diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h index 4d9c4eb85f..583997d576 100644 --- a/target/loongarch/vec.h +++ b/target/loongarch/vec.h @@ -96,4 +96,17 @@ uint64_t do_vmskltz_w(int64_t val); uint64_t do_vmskltz_d(int64_t val); uint64_t do_vmskez_b(uint64_t val); +void vec_update_fcsr0_mask(CPULoongArchState *env, uintptr_t pc, int mask); +void vec_update_fcsr0(CPULoongArchState *env, uintptr_t pc); +void vec_clear_cause(CPULoongArchState *env); + +uint32_t do_flogb_32(CPULoongArchState *env, uint32_t fj); +uint64_t do_flogb_64(CPULoongArchState *env, uint64_t fj); +uint32_t do_fsqrt_32(CPULoongArchState *env, uint32_t fj); +uint64_t do_fsqrt_64(CPULoongArchState *env, uint64_t fj); +uint32_t do_frecip_32(CPULoongArchState *env, uint32_t fj); +uint64_t do_frecip_64(CPULoongArchState *env, uint64_t fj); +uint32_t do_frsqrt_32(CPULoongArchState *env, uint32_t fj); +uint64_t do_frsqrt_64(CPULoongArchState *env, uint64_t fj); + #endif /* LOONGARCH_VEC_H */ From patchwork Tue Jun 20 09:38:05 2023 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285449 From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 37/46] target/loongarch: Implement LASX fpu fcvt instructions Date: Tue, 20 Jun 2023 17:38:05 +0800 Message-Id: <20230620093814.123650-38-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> X-Coremail-Antispam:
1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: - XVFCVT{L/H}.{S.H/D.S}; - XVFCVT.{H.S/S.D}; - XVFRINT[{RNE/RZ/RP/RM}].{S/D}; - XVFTINT[{RNE/RZ/RP/RM}].{W.S/L.D}; - XVFTINT[RZ].{WU.S/LU.D}; - XVFTINT[{RNE/RZ/RP/RM}].W.D; - XVFTINT[{RNE/RZ/RP/RM}]{L/H}.L.S; - XVFFINT.{S.W/D.L}[U]; - XVFFINT.S.L, XVFFINT{L/H}.D.W.
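Note on the RNE/RZ/RP/RM suffixes used in the list above: they select round-to-nearest-even, round-toward-zero, round-toward-plus-infinity and round-toward-minus-infinity respectively, while the unsuffixed forms follow the rounding mode currently set in FCSR. The sketch below shows what the four explicit modes do to a single value when converting to an integer. It is only an illustration under stated assumptions: RoundSketch and ftint_sketch are hypothetical names, the input is restricted to finite values in the int32_t range, and none of the NaN, overflow, or exception-flag handling done by the real helpers through softfloat is modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the four explicit rounding suffixes. */
typedef enum {
    ROUND_RNE, /* to nearest, ties to even */
    ROUND_RZ,  /* toward zero */
    ROUND_RP,  /* toward +infinity */
    ROUND_RM,  /* toward -infinity */
} RoundSketch;

/*
 * Round x to an integer under the given mode. The result is returned as
 * int64_t without saturation; x must be finite and within int32_t range.
 */
static int64_t ftint_sketch(double x, RoundSketch mode)
{
    int64_t t, f;
    double frac;

    /* NaN fails both comparisons, so it is rejected here as well. */
    assert(x > -2147483649.0 && x < 2147483648.0);

    t = (int64_t)x;                  /* a C cast truncates toward zero */
    f = (x < (double)t) ? t - 1 : t; /* floor(x) */

    switch (mode) {
    case ROUND_RZ:
        return t;
    case ROUND_RM:
        return f;
    case ROUND_RP:
        return (x > (double)f) ? f + 1 : f;
    case ROUND_RNE:
    default:
        frac = x - (double)f;        /* fractional part, in [0, 1) */
        if (frac > 0.5 || (frac == 0.5 && (f & 1))) {
            f++;                     /* round up, or break a tie toward even */
        }
        return f;
    }
}
```

For example, 2.5 converts to 2 under RNE, RZ and RM but to 3 under RP, while -2.5 converts to -2 under RNE, RZ and RP but to -3 under RM.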
Signed-off-by: Song Gao --- target/loongarch/disas.c | 56 +++ target/loongarch/helper.h | 56 +++ target/loongarch/insn_trans/trans_lasx.c.inc | 56 +++ target/loongarch/insns.decode | 58 +++ target/loongarch/lasx_helper.c | 398 +++++++++++++++++++ 5 files changed, 624 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 0e4ec2bd03..65eccc8598 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2285,6 +2285,62 @@ INSN_LASX(xvfrecip_d, xx) INSN_LASX(xvfrsqrt_s, xx) INSN_LASX(xvfrsqrt_d, xx) +INSN_LASX(xvfcvtl_s_h, xx) +INSN_LASX(xvfcvth_s_h, xx) +INSN_LASX(xvfcvtl_d_s, xx) +INSN_LASX(xvfcvth_d_s, xx) +INSN_LASX(xvfcvt_h_s, xxx) +INSN_LASX(xvfcvt_s_d, xxx) + +INSN_LASX(xvfrint_s, xx) +INSN_LASX(xvfrint_d, xx) +INSN_LASX(xvfrintrm_s, xx) +INSN_LASX(xvfrintrm_d, xx) +INSN_LASX(xvfrintrp_s, xx) +INSN_LASX(xvfrintrp_d, xx) +INSN_LASX(xvfrintrz_s, xx) +INSN_LASX(xvfrintrz_d, xx) +INSN_LASX(xvfrintrne_s, xx) +INSN_LASX(xvfrintrne_d, xx) + +INSN_LASX(xvftint_w_s, xx) +INSN_LASX(xvftint_l_d, xx) +INSN_LASX(xvftintrm_w_s, xx) +INSN_LASX(xvftintrm_l_d, xx) +INSN_LASX(xvftintrp_w_s, xx) +INSN_LASX(xvftintrp_l_d, xx) +INSN_LASX(xvftintrz_w_s, xx) +INSN_LASX(xvftintrz_l_d, xx) +INSN_LASX(xvftintrne_w_s, xx) +INSN_LASX(xvftintrne_l_d, xx) +INSN_LASX(xvftint_wu_s, xx) +INSN_LASX(xvftint_lu_d, xx) +INSN_LASX(xvftintrz_wu_s, xx) +INSN_LASX(xvftintrz_lu_d, xx) +INSN_LASX(xvftint_w_d, xxx) +INSN_LASX(xvftintrm_w_d, xxx) +INSN_LASX(xvftintrp_w_d, xxx) +INSN_LASX(xvftintrz_w_d, xxx) +INSN_LASX(xvftintrne_w_d, xxx) +INSN_LASX(xvftintl_l_s, xx) +INSN_LASX(xvftinth_l_s, xx) +INSN_LASX(xvftintrml_l_s, xx) +INSN_LASX(xvftintrmh_l_s, xx) +INSN_LASX(xvftintrpl_l_s, xx) +INSN_LASX(xvftintrph_l_s, xx) +INSN_LASX(xvftintrzl_l_s, xx) +INSN_LASX(xvftintrzh_l_s, xx) +INSN_LASX(xvftintrnel_l_s, xx) +INSN_LASX(xvftintrneh_l_s, xx) + +INSN_LASX(xvffint_s_w, xx) +INSN_LASX(xvffint_s_wu, xx) +INSN_LASX(xvffint_d_l, xx) +INSN_LASX(xvffint_d_lu, xx) 
+INSN_LASX(xvffintl_d_w, xx) +INSN_LASX(xvffinth_d_w, xx) +INSN_LASX(xvffint_s_l, xxx) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 2e6e3f2fd3..d30ea7f6a4 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -1137,3 +1137,59 @@ DEF_HELPER_3(xvfrecip_s, void, env, i32, i32) DEF_HELPER_3(xvfrecip_d, void, env, i32, i32) DEF_HELPER_3(xvfrsqrt_s, void, env, i32, i32) DEF_HELPER_3(xvfrsqrt_d, void, env, i32, i32) + +DEF_HELPER_3(xvfcvtl_s_h, void, env, i32, i32) +DEF_HELPER_3(xvfcvth_s_h, void, env, i32, i32) +DEF_HELPER_3(xvfcvtl_d_s, void, env, i32, i32) +DEF_HELPER_3(xvfcvth_d_s, void, env, i32, i32) +DEF_HELPER_4(xvfcvt_h_s, void, env, i32, i32, i32) +DEF_HELPER_4(xvfcvt_s_d, void, env, i32, i32, i32) + +DEF_HELPER_3(xvfrintrne_s, void, env, i32, i32) +DEF_HELPER_3(xvfrintrne_d, void, env, i32, i32) +DEF_HELPER_3(xvfrintrz_s, void, env, i32, i32) +DEF_HELPER_3(xvfrintrz_d, void, env, i32, i32) +DEF_HELPER_3(xvfrintrp_s, void, env, i32, i32) +DEF_HELPER_3(xvfrintrp_d, void, env, i32, i32) +DEF_HELPER_3(xvfrintrm_s, void, env, i32, i32) +DEF_HELPER_3(xvfrintrm_d, void, env, i32, i32) +DEF_HELPER_3(xvfrint_s, void, env, i32, i32) +DEF_HELPER_3(xvfrint_d, void, env, i32, i32) + +DEF_HELPER_3(xvftintrne_w_s, void, env, i32, i32) +DEF_HELPER_3(xvftintrne_l_d, void, env, i32, i32) +DEF_HELPER_3(xvftintrz_w_s, void, env, i32, i32) +DEF_HELPER_3(xvftintrz_l_d, void, env, i32, i32) +DEF_HELPER_3(xvftintrp_w_s, void, env, i32, i32) +DEF_HELPER_3(xvftintrp_l_d, void, env, i32, i32) +DEF_HELPER_3(xvftintrm_w_s, void, env, i32, i32) +DEF_HELPER_3(xvftintrm_l_d, void, env, i32, i32) +DEF_HELPER_3(xvftint_w_s, void, env, i32, i32) +DEF_HELPER_3(xvftint_l_d, void, env, i32, i32) +DEF_HELPER_3(xvftintrz_wu_s, void, env, i32, i32) +DEF_HELPER_3(xvftintrz_lu_d, void, env, i32, i32) +DEF_HELPER_3(xvftint_wu_s, void, env, i32, i32) 
+DEF_HELPER_3(xvftint_lu_d, void, env, i32, i32) +DEF_HELPER_4(xvftintrne_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvftintrz_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvftintrp_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvftintrm_w_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvftint_w_d, void, env, i32, i32, i32) +DEF_HELPER_3(xvftintrnel_l_s, void, env, i32, i32) +DEF_HELPER_3(xvftintrneh_l_s, void, env, i32, i32) +DEF_HELPER_3(xvftintrzl_l_s, void, env, i32, i32) +DEF_HELPER_3(xvftintrzh_l_s, void, env, i32, i32) +DEF_HELPER_3(xvftintrpl_l_s, void, env, i32, i32) +DEF_HELPER_3(xvftintrph_l_s, void, env, i32, i32) +DEF_HELPER_3(xvftintrml_l_s, void, env, i32, i32) +DEF_HELPER_3(xvftintrmh_l_s, void, env, i32, i32) +DEF_HELPER_3(xvftintl_l_s, void, env, i32, i32) +DEF_HELPER_3(xvftinth_l_s, void, env, i32, i32) + +DEF_HELPER_3(xvffint_s_w, void, env, i32, i32) +DEF_HELPER_3(xvffint_d_l, void, env, i32, i32) +DEF_HELPER_3(xvffint_s_wu, void, env, i32, i32) +DEF_HELPER_3(xvffint_d_lu, void, env, i32, i32) +DEF_HELPER_3(xvffintl_d_w, void, env, i32, i32) +DEF_HELPER_3(xvffinth_d_w, void, env, i32, i32) +DEF_HELPER_4(xvffint_s_l, void, env, i32, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index b9785be6c5..998c07b358 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -2464,6 +2464,62 @@ TRANS(xvfrecip_d, gen_xx, gen_helper_xvfrecip_d) TRANS(xvfrsqrt_s, gen_xx, gen_helper_xvfrsqrt_s) TRANS(xvfrsqrt_d, gen_xx, gen_helper_xvfrsqrt_d) +TRANS(xvfcvtl_s_h, gen_xx, gen_helper_xvfcvtl_s_h) +TRANS(xvfcvth_s_h, gen_xx, gen_helper_xvfcvth_s_h) +TRANS(xvfcvtl_d_s, gen_xx, gen_helper_xvfcvtl_d_s) +TRANS(xvfcvth_d_s, gen_xx, gen_helper_xvfcvth_d_s) +TRANS(xvfcvt_h_s, gen_xxx, gen_helper_xvfcvt_h_s) +TRANS(xvfcvt_s_d, gen_xxx, gen_helper_xvfcvt_s_d) + +TRANS(xvfrintrne_s, gen_xx, gen_helper_xvfrintrne_s) +TRANS(xvfrintrne_d, gen_xx, 
gen_helper_xvfrintrne_d) +TRANS(xvfrintrz_s, gen_xx, gen_helper_xvfrintrz_s) +TRANS(xvfrintrz_d, gen_xx, gen_helper_xvfrintrz_d) +TRANS(xvfrintrp_s, gen_xx, gen_helper_xvfrintrp_s) +TRANS(xvfrintrp_d, gen_xx, gen_helper_xvfrintrp_d) +TRANS(xvfrintrm_s, gen_xx, gen_helper_xvfrintrm_s) +TRANS(xvfrintrm_d, gen_xx, gen_helper_xvfrintrm_d) +TRANS(xvfrint_s, gen_xx, gen_helper_xvfrint_s) +TRANS(xvfrint_d, gen_xx, gen_helper_xvfrint_d) + +TRANS(xvftintrne_w_s, gen_xx, gen_helper_xvftintrne_w_s) +TRANS(xvftintrne_l_d, gen_xx, gen_helper_xvftintrne_l_d) +TRANS(xvftintrz_w_s, gen_xx, gen_helper_xvftintrz_w_s) +TRANS(xvftintrz_l_d, gen_xx, gen_helper_xvftintrz_l_d) +TRANS(xvftintrp_w_s, gen_xx, gen_helper_xvftintrp_w_s) +TRANS(xvftintrp_l_d, gen_xx, gen_helper_xvftintrp_l_d) +TRANS(xvftintrm_w_s, gen_xx, gen_helper_xvftintrm_w_s) +TRANS(xvftintrm_l_d, gen_xx, gen_helper_xvftintrm_l_d) +TRANS(xvftint_w_s, gen_xx, gen_helper_xvftint_w_s) +TRANS(xvftint_l_d, gen_xx, gen_helper_xvftint_l_d) +TRANS(xvftintrz_wu_s, gen_xx, gen_helper_xvftintrz_wu_s) +TRANS(xvftintrz_lu_d, gen_xx, gen_helper_xvftintrz_lu_d) +TRANS(xvftint_wu_s, gen_xx, gen_helper_xvftint_wu_s) +TRANS(xvftint_lu_d, gen_xx, gen_helper_xvftint_lu_d) +TRANS(xvftintrne_w_d, gen_xxx, gen_helper_xvftintrne_w_d) +TRANS(xvftintrz_w_d, gen_xxx, gen_helper_xvftintrz_w_d) +TRANS(xvftintrp_w_d, gen_xxx, gen_helper_xvftintrp_w_d) +TRANS(xvftintrm_w_d, gen_xxx, gen_helper_xvftintrm_w_d) +TRANS(xvftint_w_d, gen_xxx, gen_helper_xvftint_w_d) +TRANS(xvftintrnel_l_s, gen_xx, gen_helper_xvftintrnel_l_s) +TRANS(xvftintrneh_l_s, gen_xx, gen_helper_xvftintrneh_l_s) +TRANS(xvftintrzl_l_s, gen_xx, gen_helper_xvftintrzl_l_s) +TRANS(xvftintrzh_l_s, gen_xx, gen_helper_xvftintrzh_l_s) +TRANS(xvftintrpl_l_s, gen_xx, gen_helper_xvftintrpl_l_s) +TRANS(xvftintrph_l_s, gen_xx, gen_helper_xvftintrph_l_s) +TRANS(xvftintrml_l_s, gen_xx, gen_helper_xvftintrml_l_s) +TRANS(xvftintrmh_l_s, gen_xx, gen_helper_xvftintrmh_l_s) +TRANS(xvftintl_l_s, gen_xx, 
gen_helper_xvftintl_l_s) +TRANS(xvftinth_l_s, gen_xx, gen_helper_xvftinth_l_s) + +TRANS(xvffint_s_w, gen_xx, gen_helper_xvffint_s_w) +TRANS(xvffint_d_l, gen_xx, gen_helper_xvffint_d_l) +TRANS(xvffint_s_wu, gen_xx, gen_helper_xvffint_s_wu) +TRANS(xvffint_d_lu, gen_xx, gen_helper_xvffint_d_lu) +TRANS(xvffintl_d_w, gen_xx, gen_helper_xvffintl_d_w) +TRANS(xvffinth_d_w, gen_xx, gen_helper_xvffinth_d_w) +TRANS(xvffint_s_l, gen_xxx, gen_helper_xvffint_s_l) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 8a5d6a8d45..59b79573e5 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1882,6 +1882,64 @@ xvfrecip_d 0111 01101001 11001 11110 ..... ..... @xx xvfrsqrt_s 0111 01101001 11010 00001 ..... ..... @xx xvfrsqrt_d 0111 01101001 11010 00010 ..... ..... @xx +xvfcvtl_s_h 0111 01101001 11011 11010 ..... ..... @xx +xvfcvth_s_h 0111 01101001 11011 11011 ..... ..... @xx +xvfcvtl_d_s 0111 01101001 11011 11100 ..... ..... @xx +xvfcvth_d_s 0111 01101001 11011 11101 ..... ..... @xx +xvfcvt_h_s 0111 01010100 01100 ..... ..... ..... @xxx +xvfcvt_s_d 0111 01010100 01101 ..... ..... ..... @xxx + +xvfrintrne_s 0111 01101001 11010 11101 ..... ..... @xx +xvfrintrne_d 0111 01101001 11010 11110 ..... ..... @xx +xvfrintrz_s 0111 01101001 11010 11001 ..... ..... @xx +xvfrintrz_d 0111 01101001 11010 11010 ..... ..... @xx +xvfrintrp_s 0111 01101001 11010 10101 ..... ..... @xx +xvfrintrp_d 0111 01101001 11010 10110 ..... ..... @xx +xvfrintrm_s 0111 01101001 11010 10001 ..... ..... @xx +xvfrintrm_d 0111 01101001 11010 10010 ..... ..... @xx +xvfrint_s 0111 01101001 11010 01101 ..... ..... @xx +xvfrint_d 0111 01101001 11010 01110 ..... ..... @xx + +xvftintrne_w_s 0111 01101001 11100 10100 ..... ..... @xx +xvftintrne_l_d 0111 01101001 11100 10101 ..... ..... @xx +xvftintrz_w_s 0111 01101001 11100 10010 ..... ..... 
@xx +xvftintrz_l_d 0111 01101001 11100 10011 ..... ..... @xx +xvftintrp_w_s 0111 01101001 11100 10000 ..... ..... @xx +xvftintrp_l_d 0111 01101001 11100 10001 ..... ..... @xx +xvftintrm_w_s 0111 01101001 11100 01110 ..... ..... @xx +xvftintrm_l_d 0111 01101001 11100 01111 ..... ..... @xx +xvftint_w_s 0111 01101001 11100 01100 ..... ..... @xx +xvftint_l_d 0111 01101001 11100 01101 ..... ..... @xx +xvftintrz_wu_s 0111 01101001 11100 11100 ..... ..... @xx +xvftintrz_lu_d 0111 01101001 11100 11101 ..... ..... @xx +xvftint_wu_s 0111 01101001 11100 10110 ..... ..... @xx +xvftint_lu_d 0111 01101001 11100 10111 ..... ..... @xx + +xvftintrne_w_d 0111 01010100 10111 ..... ..... ..... @xxx +xvftintrz_w_d 0111 01010100 10110 ..... ..... ..... @xxx +xvftintrp_w_d 0111 01010100 10101 ..... ..... ..... @xxx +xvftintrm_w_d 0111 01010100 10100 ..... ..... ..... @xxx +xvftint_w_d 0111 01010100 10011 ..... ..... ..... @xxx + +xvftintrnel_l_s 0111 01101001 11101 01000 ..... ..... @xx +xvftintrneh_l_s 0111 01101001 11101 01001 ..... ..... @xx +xvftintrzl_l_s 0111 01101001 11101 00110 ..... ..... @xx +xvftintrzh_l_s 0111 01101001 11101 00111 ..... ..... @xx +xvftintrpl_l_s 0111 01101001 11101 00100 ..... ..... @xx +xvftintrph_l_s 0111 01101001 11101 00101 ..... ..... @xx +xvftintrml_l_s 0111 01101001 11101 00010 ..... ..... @xx +xvftintrmh_l_s 0111 01101001 11101 00011 ..... ..... @xx +xvftintl_l_s 0111 01101001 11101 00000 ..... ..... @xx +xvftinth_l_s 0111 01101001 11101 00001 ..... ..... @xx + +xvffint_s_w 0111 01101001 11100 00000 ..... ..... @xx +xvffint_d_l 0111 01101001 11100 00010 ..... ..... @xx +xvffint_s_wu 0111 01101001 11100 00001 ..... ..... @xx +xvffint_d_lu 0111 01101001 11100 00011 ..... ..... @xx +xvffintl_d_w 0111 01101001 11100 00100 ..... ..... @xx +xvffinth_d_w 0111 01101001 11100 00101 ..... ..... @xx +xvffint_s_l 0111 01010100 10000 ..... ..... ..... @xxx + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... 
@xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 316ebd3463..5cc917fdc3 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -2325,3 +2325,401 @@ XDO_2OP_F(xvfrecip_s, 32, UXW, do_frecip_32) XDO_2OP_F(xvfrecip_d, 64, UXD, do_frecip_64) XDO_2OP_F(xvfrsqrt_s, 32, UXW, do_frsqrt_32) XDO_2OP_F(xvfrsqrt_d, 64, UXD, do_frsqrt_64) + +void HELPER(xvfcvtl_s_h)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + int i, max; + XReg temp; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + max = LASX_LEN / (32 * 2); + vec_clear_cause(env); + for (i = 0; i < max; i++) { + temp.UXW(i) = float16_to_float32(Xj->UXH(i), true, &env->fp_status); + temp.UXW(i + max) = float16_to_float32(Xj->UXH(i + max * 2), + true, &env->fp_status); + vec_update_fcsr0(env, GETPC()); + } + *Xd = temp; +} + +void HELPER(xvfcvtl_d_s)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + int i, max; + XReg temp; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + max = LASX_LEN / (64 * 2); + vec_clear_cause(env); + for (i = 0; i < max; i++) { + temp.UXD(i) = float32_to_float64(Xj->UXW(i), &env->fp_status); + temp.UXD(i + max) = float32_to_float64(Xj->UXW(i + max * 2), + &env->fp_status); + vec_update_fcsr0(env, GETPC()); + } + *Xd = temp; +} + +void HELPER(xvfcvth_s_h)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + int i, max; + XReg temp; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + max = LASX_LEN / (32 * 2); + vec_clear_cause(env); + for (i = 0; i < max; i++) { + temp.UXW(i) = float16_to_float32(Xj->UXH(i + max), + true, &env->fp_status); + temp.UXW(i + max) = float16_to_float32(Xj->UXH(i + max * 3), + true, &env->fp_status); + vec_update_fcsr0(env, GETPC()); + } + *Xd = temp; +} + +void HELPER(xvfcvth_d_s)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + int i, max; + XReg temp; + XReg *Xd 
= &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + max = LASX_LEN / (64 * 2); + vec_clear_cause(env); + for (i = 0; i < max; i++) { + temp.UXD(i) = float32_to_float64(Xj->UXW(i + max), &env->fp_status); + temp.UXD(i + max) = float32_to_float64(Xj->UXW(i + max * 3), + &env->fp_status); + vec_update_fcsr0(env, GETPC()); + } + *Xd = temp; +} + +void HELPER(xvfcvt_h_s)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t xk) +{ + int i, max; + XReg temp; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + XReg *Xk = &(env->fpr[xk].xreg); + + max = LASX_LEN / (32 * 2); + vec_clear_cause(env); + for (i = 0; i < max; i++) { + temp.UXH(i + max) = float32_to_float16(Xj->UXW(i), + true, &env->fp_status); + temp.UXH(i) = float32_to_float16(Xk->UXW(i), true, &env->fp_status); + temp.UXH(i + max * 3) = float32_to_float16(Xj->UXW(i + max), + true, &env->fp_status); + temp.UXH(i + max * 2) = float32_to_float16(Xk->UXW(i + max), + true, &env->fp_status); + vec_update_fcsr0(env, GETPC()); + } + *Xd = temp; +} + +void HELPER(xvfcvt_s_d)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t xk) +{ + int i, max; + XReg temp; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + XReg *Xk = &(env->fpr[xk].xreg); + + max = LASX_LEN / (64 * 2); + vec_clear_cause(env); + for (i = 0; i < max; i++) { + temp.UXW(i + max) = float64_to_float32(Xj->UXD(i), &env->fp_status); + temp.UXW(i) = float64_to_float32(Xk->UXD(i), &env->fp_status); + temp.UXW(i + max * 3) = float64_to_float32(Xj->UXD(i + max), + &env->fp_status); + temp.UXW(i + max * 2) = float64_to_float32(Xk->UXD(i + max), + &env->fp_status); + vec_update_fcsr0(env, GETPC()); + } + *Xd = temp; +} + +void HELPER(xvfrint_s)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + int i; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + vec_clear_cause(env); + for (i = 0; i < LASX_LEN / 32; i++) { + Xd->XW(i) = float32_round_to_int(Xj->UXW(i), 
&env->fp_status); + vec_update_fcsr0(env, GETPC()); + } +} + +void HELPER(xvfrint_d)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + int i; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + vec_clear_cause(env); + for (i = 0; i < LASX_LEN / 64; i++) { + Xd->XD(i) = float64_round_to_int(Xj->UXD(i), &env->fp_status); + vec_update_fcsr0(env, GETPC()); + } +} + +#define XFCVT_2OP(NAME, BIT, E, MODE) \ +void HELPER(NAME)(CPULoongArchState *env, uint32_t xd, uint32_t xj) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + vec_clear_cause(env); \ + for (i = 0; i < LASX_LEN / BIT; i++) { \ + FloatRoundMode old_mode = get_float_rounding_mode(&env->fp_status); \ + set_float_rounding_mode(MODE, &env->fp_status); \ + Xd->E(i) = float## BIT ## _round_to_int(Xj->E(i), &env->fp_status); \ + set_float_rounding_mode(old_mode, &env->fp_status); \ + vec_update_fcsr0(env, GETPC()); \ + } \ +} + +XFCVT_2OP(xvfrintrne_s, 32, UXW, float_round_nearest_even) +XFCVT_2OP(xvfrintrne_d, 64, UXD, float_round_nearest_even) +XFCVT_2OP(xvfrintrz_s, 32, UXW, float_round_to_zero) +XFCVT_2OP(xvfrintrz_d, 64, UXD, float_round_to_zero) +XFCVT_2OP(xvfrintrp_s, 32, UXW, float_round_up) +XFCVT_2OP(xvfrintrp_d, 64, UXD, float_round_up) +XFCVT_2OP(xvfrintrm_s, 32, UXW, float_round_down) +XFCVT_2OP(xvfrintrm_d, 64, UXD, float_round_down) + +#define XFTINT(NAME, FMT1, FMT2, T1, T2, MODE) \ +static T2 do_xftint ## NAME(CPULoongArchState *env, T1 fj) \ +{ \ + T2 fd; \ + FloatRoundMode old_mode = get_float_rounding_mode(&env->fp_status); \ + \ + set_float_rounding_mode(MODE, &env->fp_status); \ + fd = do_## FMT1 ##_to_## FMT2(env, fj); \ + set_float_rounding_mode(old_mode, &env->fp_status); \ + return fd; \ +} + +#define XDO_FTINT(FMT1, FMT2, T1, T2) \ +static T2 do_## FMT1 ##_to_## FMT2(CPULoongArchState *env, T1 fj) \ +{ \ + T2 fd; \ + \ + fd = FMT1 ##_to_## FMT2(fj, &env->fp_status); \ + if (get_float_exception_flags(&env->fp_status) 
& (float_flag_invalid)) { \ + if (FMT1 ##_is_any_nan(fj)) { \ + fd = 0; \ + } \ + } \ + vec_update_fcsr0(env, GETPC()); \ + return fd; \ +} + +XDO_FTINT(float32, int32, uint32_t, uint32_t) +XDO_FTINT(float64, int64, uint64_t, uint64_t) +XDO_FTINT(float32, uint32, uint32_t, uint32_t) +XDO_FTINT(float64, uint64, uint64_t, uint64_t) +XDO_FTINT(float64, int32, uint64_t, uint32_t) +XDO_FTINT(float32, int64, uint32_t, uint64_t) + +XFTINT(rne_w_s, float32, int32, uint32_t, uint32_t, float_round_nearest_even) +XFTINT(rne_l_d, float64, int64, uint64_t, uint64_t, float_round_nearest_even) +XFTINT(rp_w_s, float32, int32, uint32_t, uint32_t, float_round_up) +XFTINT(rp_l_d, float64, int64, uint64_t, uint64_t, float_round_up) +XFTINT(rz_w_s, float32, int32, uint32_t, uint32_t, float_round_to_zero) +XFTINT(rz_l_d, float64, int64, uint64_t, uint64_t, float_round_to_zero) +XFTINT(rm_w_s, float32, int32, uint32_t, uint32_t, float_round_down) +XFTINT(rm_l_d, float64, int64, uint64_t, uint64_t, float_round_down) + +XDO_2OP_F(xvftintrne_w_s, 32, UXW, do_xftintrne_w_s) +XDO_2OP_F(xvftintrne_l_d, 64, UXD, do_xftintrne_l_d) +XDO_2OP_F(xvftintrp_w_s, 32, UXW, do_xftintrp_w_s) +XDO_2OP_F(xvftintrp_l_d, 64, UXD, do_xftintrp_l_d) +XDO_2OP_F(xvftintrz_w_s, 32, UXW, do_xftintrz_w_s) +XDO_2OP_F(xvftintrz_l_d, 64, UXD, do_xftintrz_l_d) +XDO_2OP_F(xvftintrm_w_s, 32, UXW, do_xftintrm_w_s) +XDO_2OP_F(xvftintrm_l_d, 64, UXD, do_xftintrm_l_d) +XDO_2OP_F(xvftint_w_s, 32, UXW, do_float32_to_int32) +XDO_2OP_F(xvftint_l_d, 64, UXD, do_float64_to_int64) + +XFTINT(rz_wu_s, float32, uint32, uint32_t, uint32_t, float_round_to_zero) +XFTINT(rz_lu_d, float64, uint64, uint64_t, uint64_t, float_round_to_zero) + +XDO_2OP_F(xvftintrz_wu_s, 32, UXW, do_xftintrz_wu_s) +XDO_2OP_F(xvftintrz_lu_d, 64, UXD, do_xftintrz_lu_d) +XDO_2OP_F(xvftint_wu_s, 32, UXW, do_float32_to_uint32) +XDO_2OP_F(xvftint_lu_d, 64, UXD, do_float64_to_uint64) + +XFTINT(rm_w_d, float64, int32, uint64_t, uint32_t, float_round_down) +XFTINT(rp_w_d, 
float64, int32, uint64_t, uint32_t, float_round_up) +XFTINT(rz_w_d, float64, int32, uint64_t, uint32_t, float_round_to_zero) +XFTINT(rne_w_d, float64, int32, uint64_t, uint32_t, float_round_nearest_even) + +#define XFTINT_W_D(NAME, FN) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (64 * 2); \ + vec_clear_cause(env); \ + for (i = 0; i < max; i++) { \ + temp.XW(i + max) = FN(env, Xj->UXD(i)); \ + temp.XW(i) = FN(env, Xk->UXD(i)); \ + temp.XW(i + max * 3) = FN(env, Xj->UXD(i + max)); \ + temp.XW(i + max * 2) = FN(env, Xk->UXD(i + max)); \ + } \ + *Xd = temp; \ +} + +XFTINT_W_D(xvftint_w_d, do_float64_to_int32) +XFTINT_W_D(xvftintrm_w_d, do_xftintrm_w_d) +XFTINT_W_D(xvftintrp_w_d, do_xftintrp_w_d) +XFTINT_W_D(xvftintrz_w_d, do_xftintrz_w_d) +XFTINT_W_D(xvftintrne_w_d, do_xftintrne_w_d) + +XFTINT(rml_l_s, float32, int64, uint32_t, uint64_t, float_round_down) +XFTINT(rpl_l_s, float32, int64, uint32_t, uint64_t, float_round_up) +XFTINT(rzl_l_s, float32, int64, uint32_t, uint64_t, float_round_to_zero) +XFTINT(rnel_l_s, float32, int64, uint32_t, uint64_t, float_round_nearest_even) +XFTINT(rmh_l_s, float32, int64, uint32_t, uint64_t, float_round_down) +XFTINT(rph_l_s, float32, int64, uint32_t, uint64_t, float_round_up) +XFTINT(rzh_l_s, float32, int64, uint32_t, uint64_t, float_round_to_zero) +XFTINT(rneh_l_s, float32, int64, uint32_t, uint64_t, float_round_nearest_even) + +#define XFTINTL_L_S(NAME, FN) \ +void HELPER(NAME)(CPULoongArchState *env, uint32_t xd, uint32_t xj) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + max = LASX_LEN / (64 * 2); \ + vec_clear_cause(env); \ + for (i = 0; i < max; i++) { \ + temp.XD(i) = FN(env, Xj->UXW(i)); \ + temp.XD(i + max) = FN(env, Xj->UXW(i + max * 2)); \ + 
} \ + *Xd = temp; \ +} + +XFTINTL_L_S(xvftintl_l_s, do_float32_to_int64) +XFTINTL_L_S(xvftintrml_l_s, do_xftintrml_l_s) +XFTINTL_L_S(xvftintrpl_l_s, do_xftintrpl_l_s) +XFTINTL_L_S(xvftintrzl_l_s, do_xftintrzl_l_s) +XFTINTL_L_S(xvftintrnel_l_s, do_xftintrnel_l_s) + +#define XFTINTH_L_S(NAME, FN) \ +void HELPER(NAME)(CPULoongArchState *env, uint32_t xd, uint32_t xj) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + max = LASX_LEN / (64 * 2); \ + vec_clear_cause(env); \ + for (i = 0; i < max; i++) { \ + temp.XD(i) = FN(env, Xj->UXW(i + max)); \ + temp.XD(i + max) = FN(env, Xj->UXW(i + max * 3)); \ + } \ + *Xd = temp; \ +} + +XFTINTH_L_S(xvftinth_l_s, do_float32_to_int64) +XFTINTH_L_S(xvftintrmh_l_s, do_xftintrmh_l_s) +XFTINTH_L_S(xvftintrph_l_s, do_xftintrph_l_s) +XFTINTH_L_S(xvftintrzh_l_s, do_xftintrzh_l_s) +XFTINTH_L_S(xvftintrneh_l_s, do_xftintrneh_l_s) + +#define XFFINT(NAME, FMT1, FMT2, T1, T2) \ +static T2 do_xffint_ ## NAME(CPULoongArchState *env, T1 fj) \ +{ \ + T2 fd; \ + \ + fd = FMT1 ##_to_## FMT2(fj, &env->fp_status); \ + vec_update_fcsr0(env, GETPC()); \ + return fd; \ +} + +XFFINT(s_w, int32, float32, int32_t, uint32_t) +XFFINT(d_l, int64, float64, int64_t, uint64_t) +XFFINT(s_wu, uint32, float32, uint32_t, uint32_t) +XFFINT(d_lu, uint64, float64, uint64_t, uint64_t) + +XDO_2OP_F(xvffint_s_w, 32, XW, do_xffint_s_w) +XDO_2OP_F(xvffint_d_l, 64, XD, do_xffint_d_l) +XDO_2OP_F(xvffint_s_wu, 32, UXW, do_xffint_s_wu) +XDO_2OP_F(xvffint_d_lu, 64, UXD, do_xffint_d_lu) + +void HELPER(xvffintl_d_w)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + int i, max; + XReg temp; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + max = LASX_LEN / (64 * 2); + vec_clear_cause(env); + for (i = 0; i < max; i++) { + temp.XD(i) = int32_to_float64(Xj->XW(i), &env->fp_status); + temp.XD(i + max) = int32_to_float64(Xj->XW(i + max * 2), + &env->fp_status); + vec_update_fcsr0(env, 
GETPC()); + } + *Xd = temp; +} + +void HELPER(xvffinth_d_w)(CPULoongArchState *env, uint32_t xd, uint32_t xj) +{ + int i, max; + XReg temp; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + + max = LASX_LEN / (64 * 2); + vec_clear_cause(env); + for (i = 0; i < max; i++) { + temp.XD(i) = int32_to_float64(Xj->XW(i + max), &env->fp_status); + temp.XD(i + max) = int32_to_float64(Xj->XW(i + max * 3), + &env->fp_status); + vec_update_fcsr0(env, GETPC()); + } + *Xd = temp; +} + +void HELPER(xvffint_s_l)(CPULoongArchState *env, + uint32_t xd, uint32_t xj, uint32_t xk) +{ + int i, max; + XReg temp; + XReg *Xd = &(env->fpr[xd].xreg); + XReg *Xj = &(env->fpr[xj].xreg); + XReg *Xk = &(env->fpr[xk].xreg); + + max = LASX_LEN / (64 * 2); + vec_clear_cause(env); + for (i = 0; i < max; i++) { + temp.XW(i + max) = int64_to_float32(Xj->XD(i), &env->fp_status); + temp.XW(i) = int64_to_float32(Xk->XD(i), &env->fp_status); + temp.XW(i + max * 3) = int64_to_float32(Xj->XD(i + max), &env->fp_status); + temp.XW(i + max * 2) = int64_to_float32(Xk->XD(i + max), &env->fp_status); + vec_update_fcsr0(env, GETPC()); + } + *Xd = temp; +} From patchwork Tue Jun 20 09:38:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285500 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AF4AEEB64D7 for ; Tue, 20 Jun 2023 09:45:47 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqN-0002XU-V7; Tue, 20 Jun 2023 05:40:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXqF-0001sh-Da for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:03 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqC-0006aJ-4P for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:02 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Bxb+udc5Fk2SUHAA--.14743S3; Tue, 20 Jun 2023 17:38:37 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S40; Tue, 20 Jun 2023 17:38:36 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 38/46] target/loongarch: Implement xvseq xvsle xvslt Date: Tue, 20 Jun 2023 17:38:06 +0800 Message-Id: <20230620093814.123650-39-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: <20230620093814.123650-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S40 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This patch includes: 
- XVSEQ[I].{B/H/W/D}; - XVSLE[I].{B/H/W/D}[U]; - XVSLT[I].{B/H/W/D}[U]. Signed-off-by: Song Gao --- target/loongarch/disas.c | 43 ++++++ target/loongarch/helper.h | 23 +++ target/loongarch/insn_trans/trans_lasx.c.inc | 154 +++++++++++++++++++ target/loongarch/insns.decode | 43 ++++++ target/loongarch/lasx_helper.c | 34 ++++ target/loongarch/lsx_helper.c | 4 - target/loongarch/vec.h | 4 + 7 files changed, 301 insertions(+), 4 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 65eccc8598..5d3904402d 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2341,6 +2341,49 @@ INSN_LASX(xvffintl_d_w, xx) INSN_LASX(xvffinth_d_w, xx) INSN_LASX(xvffint_s_l, xxx) +INSN_LASX(xvseq_b, xxx) +INSN_LASX(xvseq_h, xxx) +INSN_LASX(xvseq_w, xxx) +INSN_LASX(xvseq_d, xxx) +INSN_LASX(xvseqi_b, xx_i) +INSN_LASX(xvseqi_h, xx_i) +INSN_LASX(xvseqi_w, xx_i) +INSN_LASX(xvseqi_d, xx_i) + +INSN_LASX(xvsle_b, xxx) +INSN_LASX(xvsle_h, xxx) +INSN_LASX(xvsle_w, xxx) +INSN_LASX(xvsle_d, xxx) +INSN_LASX(xvslei_b, xx_i) +INSN_LASX(xvslei_h, xx_i) +INSN_LASX(xvslei_w, xx_i) +INSN_LASX(xvslei_d, xx_i) +INSN_LASX(xvsle_bu, xxx) +INSN_LASX(xvsle_hu, xxx) +INSN_LASX(xvsle_wu, xxx) +INSN_LASX(xvsle_du, xxx) +INSN_LASX(xvslei_bu, xx_i) +INSN_LASX(xvslei_hu, xx_i) +INSN_LASX(xvslei_wu, xx_i) +INSN_LASX(xvslei_du, xx_i) + +INSN_LASX(xvslt_b, xxx) +INSN_LASX(xvslt_h, xxx) +INSN_LASX(xvslt_w, xxx) +INSN_LASX(xvslt_d, xxx) +INSN_LASX(xvslti_b, xx_i) +INSN_LASX(xvslti_h, xx_i) +INSN_LASX(xvslti_w, xx_i) +INSN_LASX(xvslti_d, xx_i) +INSN_LASX(xvslt_bu, xxx) +INSN_LASX(xvslt_hu, xxx) +INSN_LASX(xvslt_wu, xxx) +INSN_LASX(xvslt_du, xxx) +INSN_LASX(xvslti_bu, xx_i) +INSN_LASX(xvslti_hu, xx_i) +INSN_LASX(xvslti_wu, xx_i) +INSN_LASX(xvslti_du, xx_i) + INSN_LASX(xvreplgr2vr_b, xr) INSN_LASX(xvreplgr2vr_h, xr) INSN_LASX(xvreplgr2vr_w, xr) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index d30ea7f6a4..fbfd15d711 100644 --- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h @@ -1193,3 +1193,26 @@ DEF_HELPER_3(xvffint_d_lu, void, env, i32, i32) DEF_HELPER_3(xvffintl_d_w, void, env, i32, i32) DEF_HELPER_3(xvffinth_d_w, void, env, i32, i32) DEF_HELPER_4(xvffint_s_l, void, env, i32, i32, i32) + +DEF_HELPER_FLAGS_4(xvseqi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvseqi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvseqi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvseqi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(xvslei_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslei_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslei_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslei_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslei_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslei_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslei_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslei_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(xvslti_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslti_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslti_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslti_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslti_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslti_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslti_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(xvslti_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 998c07b358..cc1b4fd42a 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -2520,6 +2520,160 @@ 
TRANS(xvffintl_d_w, gen_xx, gen_helper_xvffintl_d_w) TRANS(xvffinth_d_w, gen_xx, gen_helper_xvffinth_d_w) TRANS(xvffint_s_l, gen_xxx, gen_helper_xvffint_s_l) +static bool do_xcmp(DisasContext *ctx, arg_xxx * a, MemOp mop, TCGCond cond) +{ + uint32_t xd_ofs, xj_ofs, xk_ofs; + + CHECK_ASXE; + + xd_ofs = vec_full_offset(a->xd); + xj_ofs = vec_full_offset(a->xj); + xk_ofs = vec_full_offset(a->xk); + + tcg_gen_gvec_cmp(cond, mop, xd_ofs, xj_ofs, xk_ofs, 32, ctx->vl / 8); + return true; +} + +#define DO_XCMPI_S(NAME) \ +static bool do_x## NAME ##_s(DisasContext *ctx, arg_xx_i * a, MemOp mop) \ +{ \ + uint32_t xd_ofs, xj_ofs; \ + \ + CHECK_ASXE; \ + \ + static const TCGOpcode vecop_list[] = { \ + INDEX_op_cmp_vec, 0 \ + }; \ + static const GVecGen2i op[4] = { \ + { \ + .fniv = gen_## NAME ##_s_vec, \ + .fnoi = gen_helper_x## NAME ##_b, \ + .opt_opc = vecop_list, \ + .vece = MO_8 \ + }, \ + { \ + .fniv = gen_## NAME ##_s_vec, \ + .fnoi = gen_helper_x## NAME ##_h, \ + .opt_opc = vecop_list, \ + .vece = MO_16 \ + }, \ + { \ + .fniv = gen_## NAME ##_s_vec, \ + .fnoi = gen_helper_x## NAME ##_w, \ + .opt_opc = vecop_list, \ + .vece = MO_32 \ + }, \ + { \ + .fniv = gen_## NAME ##_s_vec, \ + .fnoi = gen_helper_x## NAME ##_d, \ + .opt_opc = vecop_list, \ + .vece = MO_64 \ + } \ + }; \ + \ + xd_ofs = vec_full_offset(a->xd); \ + xj_ofs = vec_full_offset(a->xj); \ + \ + tcg_gen_gvec_2i(xd_ofs, xj_ofs, 32, ctx->vl / 8, a->imm, &op[mop]); \ + \ + return true; \ +} + +DO_XCMPI_S(vseqi) +DO_XCMPI_S(vslei) +DO_XCMPI_S(vslti) + +#define DO_XCMPI_U(NAME) \ +static bool do_x## NAME ##_u(DisasContext *ctx, arg_xx_i * a, MemOp mop) \ +{ \ + uint32_t xd_ofs, xj_ofs; \ + \ + CHECK_ASXE; \ + \ + static const TCGOpcode vecop_list[] = { \ + INDEX_op_cmp_vec, 0 \ + }; \ + static const GVecGen2i op[4] = { \ + { \ + .fniv = gen_## NAME ##_u_vec, \ + .fnoi = gen_helper_x## NAME ##_bu, \ + .opt_opc = vecop_list, \ + .vece = MO_8 \ + }, \ + { \ + .fniv = gen_## NAME ##_u_vec, \ + .fnoi = gen_helper_x## 
NAME ##_hu, \ + .opt_opc = vecop_list, \ + .vece = MO_16 \ + }, \ + { \ + .fniv = gen_## NAME ##_u_vec, \ + .fnoi = gen_helper_x## NAME ##_wu, \ + .opt_opc = vecop_list, \ + .vece = MO_32 \ + }, \ + { \ + .fniv = gen_## NAME ##_u_vec, \ + .fnoi = gen_helper_x## NAME ##_du, \ + .opt_opc = vecop_list, \ + .vece = MO_64 \ + } \ + }; \ + \ + xd_ofs = vec_full_offset(a->xd); \ + xj_ofs = vec_full_offset(a->xj); \ + \ + tcg_gen_gvec_2i(xd_ofs, xj_ofs, 32, ctx->vl / 8, a->imm, &op[mop]); \ + \ + return true; \ +} + +DO_XCMPI_U(vslei) +DO_XCMPI_U(vslti) + +TRANS(xvseq_b, do_xcmp, MO_8, TCG_COND_EQ) +TRANS(xvseq_h, do_xcmp, MO_16, TCG_COND_EQ) +TRANS(xvseq_w, do_xcmp, MO_32, TCG_COND_EQ) +TRANS(xvseq_d, do_xcmp, MO_64, TCG_COND_EQ) +TRANS(xvseqi_b, do_xvseqi_s, MO_8) +TRANS(xvseqi_h, do_xvseqi_s, MO_16) +TRANS(xvseqi_w, do_xvseqi_s, MO_32) +TRANS(xvseqi_d, do_xvseqi_s, MO_64) + +TRANS(xvsle_b, do_xcmp, MO_8, TCG_COND_LE) +TRANS(xvsle_h, do_xcmp, MO_16, TCG_COND_LE) +TRANS(xvsle_w, do_xcmp, MO_32, TCG_COND_LE) +TRANS(xvsle_d, do_xcmp, MO_64, TCG_COND_LE) +TRANS(xvslei_b, do_xvslei_s, MO_8) +TRANS(xvslei_h, do_xvslei_s, MO_16) +TRANS(xvslei_w, do_xvslei_s, MO_32) +TRANS(xvslei_d, do_xvslei_s, MO_64) +TRANS(xvsle_bu, do_xcmp, MO_8, TCG_COND_LEU) +TRANS(xvsle_hu, do_xcmp, MO_16, TCG_COND_LEU) +TRANS(xvsle_wu, do_xcmp, MO_32, TCG_COND_LEU) +TRANS(xvsle_du, do_xcmp, MO_64, TCG_COND_LEU) +TRANS(xvslei_bu, do_xvslei_u, MO_8) +TRANS(xvslei_hu, do_xvslei_u, MO_16) +TRANS(xvslei_wu, do_xvslei_u, MO_32) +TRANS(xvslei_du, do_xvslei_u, MO_64) + +TRANS(xvslt_b, do_xcmp, MO_8, TCG_COND_LT) +TRANS(xvslt_h, do_xcmp, MO_16, TCG_COND_LT) +TRANS(xvslt_w, do_xcmp, MO_32, TCG_COND_LT) +TRANS(xvslt_d, do_xcmp, MO_64, TCG_COND_LT) +TRANS(xvslti_b, do_xvslti_s, MO_8) +TRANS(xvslti_h, do_xvslti_s, MO_16) +TRANS(xvslti_w, do_xvslti_s, MO_32) +TRANS(xvslti_d, do_xvslti_s, MO_64) +TRANS(xvslt_bu, do_xcmp, MO_8, TCG_COND_LTU) +TRANS(xvslt_hu, do_xcmp, MO_16, TCG_COND_LTU) +TRANS(xvslt_wu, do_xcmp, MO_32, 
TCG_COND_LTU) +TRANS(xvslt_du, do_xcmp, MO_64, TCG_COND_LTU) +TRANS(xvslti_bu, do_xvslti_u, MO_8) +TRANS(xvslti_hu, do_xvslti_u, MO_16) +TRANS(xvslti_wu, do_xvslti_u, MO_32) +TRANS(xvslti_du, do_xvslti_u, MO_64) + static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop) { TCGv src = gpr_src(ctx, a->rj, EXT_NONE); diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 59b79573e5..4e1f0b30a0 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1940,6 +1940,49 @@ xvffintl_d_w 0111 01101001 11100 00100 ..... ..... @xx xvffinth_d_w 0111 01101001 11100 00101 ..... ..... @xx xvffint_s_l 0111 01010100 10000 ..... ..... ..... @xxx +xvseq_b 0111 01000000 00000 ..... ..... ..... @xxx +xvseq_h 0111 01000000 00001 ..... ..... ..... @xxx +xvseq_w 0111 01000000 00010 ..... ..... ..... @xxx +xvseq_d 0111 01000000 00011 ..... ..... ..... @xxx +xvseqi_b 0111 01101000 00000 ..... ..... ..... @xx_i5 +xvseqi_h 0111 01101000 00001 ..... ..... ..... @xx_i5 +xvseqi_w 0111 01101000 00010 ..... ..... ..... @xx_i5 +xvseqi_d 0111 01101000 00011 ..... ..... ..... @xx_i5 + +xvsle_b 0111 01000000 00100 ..... ..... ..... @xxx +xvsle_h 0111 01000000 00101 ..... ..... ..... @xxx +xvsle_w 0111 01000000 00110 ..... ..... ..... @xxx +xvsle_d 0111 01000000 00111 ..... ..... ..... @xxx +xvslei_b 0111 01101000 00100 ..... ..... ..... @xx_i5 +xvslei_h 0111 01101000 00101 ..... ..... ..... @xx_i5 +xvslei_w 0111 01101000 00110 ..... ..... ..... @xx_i5 +xvslei_d 0111 01101000 00111 ..... ..... ..... @xx_i5 +xvsle_bu 0111 01000000 01000 ..... ..... ..... @xxx +xvsle_hu 0111 01000000 01001 ..... ..... ..... @xxx +xvsle_wu 0111 01000000 01010 ..... ..... ..... @xxx +xvsle_du 0111 01000000 01011 ..... ..... ..... @xxx +xvslei_bu 0111 01101000 01000 ..... ..... ..... @xx_ui5 +xvslei_hu 0111 01101000 01001 ..... ..... ..... @xx_ui5 +xvslei_wu 0111 01101000 01010 ..... ..... ..... @xx_ui5 +xvslei_du 0111 01101000 01011 ..... ..... ..... 
@xx_ui5 + +xvslt_b 0111 01000000 01100 ..... ..... ..... @xxx +xvslt_h 0111 01000000 01101 ..... ..... ..... @xxx +xvslt_w 0111 01000000 01110 ..... ..... ..... @xxx +xvslt_d 0111 01000000 01111 ..... ..... ..... @xxx +xvslti_b 0111 01101000 01100 ..... ..... ..... @xx_i5 +xvslti_h 0111 01101000 01101 ..... ..... ..... @xx_i5 +xvslti_w 0111 01101000 01110 ..... ..... ..... @xx_i5 +xvslti_d 0111 01101000 01111 ..... ..... ..... @xx_i5 +xvslt_bu 0111 01000000 10000 ..... ..... ..... @xxx +xvslt_hu 0111 01000000 10001 ..... ..... ..... @xxx +xvslt_wu 0111 01000000 10010 ..... ..... ..... @xxx +xvslt_du 0111 01000000 10011 ..... ..... ..... @xxx +xvslti_bu 0111 01101000 10000 ..... ..... ..... @xx_ui5 +xvslti_hu 0111 01101000 10001 ..... ..... ..... @xx_ui5 +xvslti_wu 0111 01101000 10010 ..... ..... ..... @xx_ui5 +xvslti_du 0111 01101000 10011 ..... ..... ..... @xx_ui5 + xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... 
@xr
diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c
index 5cc917fdc3..d0bc02de72 100644
--- a/target/loongarch/lasx_helper.c
+++ b/target/loongarch/lasx_helper.c
@@ -2723,3 +2723,37 @@ void HELPER(xvffint_s_l)(CPULoongArchState *env,
     }
     *Xd = temp;
 }
+
+#define XVCMPI(NAME, BIT, E, DO_OP) \
+void HELPER(NAME)(void *xd, void *xj, uint64_t imm, uint32_t v) \
+{ \
+    int i; \
+    XReg *Xd = (XReg *)xd; \
+    XReg *Xj = (XReg *)xj; \
+    typedef __typeof(Xd->E(0)) TD; \
+    \
+    for (i = 0; i < LASX_LEN / BIT; i++) { \
+        Xd->E(i) = DO_OP(Xj->E(i), (TD)imm); \
+    } \
+}
+
+XVCMPI(xvseqi_b, 8, XB, VSEQ)
+XVCMPI(xvseqi_h, 16, XH, VSEQ)
+XVCMPI(xvseqi_w, 32, XW, VSEQ)
+XVCMPI(xvseqi_d, 64, XD, VSEQ)
+XVCMPI(xvslei_b, 8, XB, VSLE)
+XVCMPI(xvslei_h, 16, XH, VSLE)
+XVCMPI(xvslei_w, 32, XW, VSLE)
+XVCMPI(xvslei_d, 64, XD, VSLE)
+XVCMPI(xvslei_bu, 8, UXB, VSLE)
+XVCMPI(xvslei_hu, 16, UXH, VSLE)
+XVCMPI(xvslei_wu, 32, UXW, VSLE)
+XVCMPI(xvslei_du, 64, UXD, VSLE)
+XVCMPI(xvslti_b, 8, XB, VSLT)
+XVCMPI(xvslti_h, 16, XH, VSLT)
+XVCMPI(xvslti_w, 32, XW, VSLT)
+XVCMPI(xvslti_d, 64, XD, VSLT)
+XVCMPI(xvslti_bu, 8, UXB, VSLT)
+XVCMPI(xvslti_hu, 16, UXH, VSLT)
+XVCMPI(xvslti_wu, 32, UXW, VSLT)
+XVCMPI(xvslti_du, 64, UXD, VSLT)
diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c
index 446a1bdfe3..22d71cb39e 100644
--- a/target/loongarch/lsx_helper.c
+++ b/target/loongarch/lsx_helper.c
@@ -2588,10 +2588,6 @@ void HELPER(vffint_s_l)(CPULoongArchState *env,
     *Vd = temp;
 }
 
-#define VSEQ(a, b) (a == b ? -1 : 0)
-#define VSLE(a, b) (a <= b ? -1 : 0)
-#define VSLT(a, b) (a < b ? -1 : 0)
-
 #define VCMPI(NAME, BIT, E, DO_OP) \
 void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t v) \
 { \
diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 583997d576..54fd2689f3 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -90,6 +90,10 @@
 #define DO_BITSET(a, bit) (a | 1ull << bit)
 #define DO_BITREV(a, bit) (a ^ (1ull << bit))
 
+#define VSEQ(a, b) (a == b ? -1 : 0)
+#define VSLE(a, b) (a <= b ? -1 : 0)
+#define VSLT(a, b) (a < b ? -1 : 0)
+
 uint64_t do_vmskltz_b(int64_t val);
 uint64_t do_vmskltz_h(int64_t val);
 uint64_t do_vmskltz_w(int64_t val);

From patchwork Tue Jun 20 09:38:07 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285478
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id 51D67EB64D7
 for ; Tue, 20 Jun 2023 09:42:48 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qBXpS-0007LC-FW; Tue, 20 Jun 2023 05:39:14 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qBXp8-0007Cq-3D
 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:39:01 -0400
Received: from mail.loongson.cn ([114.242.206.163])
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qBXp5-0006Oy-2j
 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:53 -0400
Received: from loongson.cn (unknown [10.2.5.185])
 by gateway (Coremail) with SMTP id _____8Bx4Oiec5Fk2yUHAA--.12806S3;
 Tue, 20 Jun 2023 17:38:38 +0800 (CST)
Received: from localhost.localdomain (unknown [10.2.5.185])
 by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S41;
 Tue, 20 Jun 2023 17:38:37 +0800 (CST)
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 39/46] target/loongarch: Implement xvfcmp
Date: Tue, 20 Jun 2023 17:38:07 +0800
Message-Id: <20230620093814.123650-40-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>
MIME-Version: 1.0
X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S41
X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/
X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7
 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx
 nUUI43ZEXa7xR_UUUUUUUUU==
Received-SPF: pass client-ip=114.242.206.163;
 envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham
 autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

This patch includes:
- XVFCMP.cond.{S/D}.

Signed-off-by: Song Gao
---
 target/loongarch/disas.c                     | 94 ++++++++++++++++++++
 target/loongarch/helper.h                    |  5 ++
 target/loongarch/insn_trans/trans_lasx.c.inc | 32 +++++++
 target/loongarch/insns.decode                |  5 ++
 target/loongarch/lasx_helper.c               | 25 ++++++
 target/loongarch/lsx_helper.c                |  4 +-
 target/loongarch/vec.h                       |  5 ++
 7 files changed, 168 insertions(+), 2 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 5d3904402d..c3bcb9d84a 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -2384,6 +2384,100 @@ INSN_LASX(xvslti_hu, xx_i)
 INSN_LASX(xvslti_wu, xx_i)
 INSN_LASX(xvslti_du, xx_i)
 
+#define output_xvfcmp(C, PREFIX, SUFFIX) \
+{ \
+    (C)->info->fprintf_func((C)->info->stream, "%08x %s%s\tx%d, x%d, x%d", \
+                            (C)->insn, PREFIX, SUFFIX, a->xd, \
+                            a->xj, a->xk); \
+}
+
+static bool output_xxx_fcond(DisasContext *ctx, arg_xxx_fcond * a,
+                             const char *suffix)
+{
+    bool ret = true;
+    switch (a->fcond) {
+    case 0x0:
+        output_xvfcmp(ctx, "xvfcmp_caf_", suffix);
+        break;
+    case 0x1:
+        output_xvfcmp(ctx, "xvfcmp_saf_", suffix);
+        break;
+    case 0x2:
+        output_xvfcmp(ctx, "xvfcmp_clt_", suffix);
+        break;
+    case 0x3:
+        output_xvfcmp(ctx, "xvfcmp_slt_", suffix);
+        break;
+    case 0x4:
+        output_xvfcmp(ctx, "xvfcmp_ceq_", suffix);
+        break;
+    case 0x5:
+        output_xvfcmp(ctx, "xvfcmp_seq_", suffix);
+        break;
+    case 0x6:
+        output_xvfcmp(ctx, "xvfcmp_cle_", suffix);
+        break;
+    case 0x7:
+        output_xvfcmp(ctx, "xvfcmp_sle_", suffix);
+        break;
+    case 0x8:
+        output_xvfcmp(ctx, "xvfcmp_cun_", suffix);
+        break;
+    case 0x9:
+        output_xvfcmp(ctx, "xvfcmp_sun_", suffix);
+        break;
+    case 0xA:
+        output_xvfcmp(ctx, "xvfcmp_cult_", suffix);
+        break;
+    case 0xB:
+        output_xvfcmp(ctx, "xvfcmp_sult_", suffix);
+        break;
+    case 0xC:
+        output_xvfcmp(ctx, "xvfcmp_cueq_", suffix);
+        break;
+    case 0xD:
+        output_xvfcmp(ctx, "xvfcmp_sueq_", suffix);
+        break;
+    case 0xE:
+        output_xvfcmp(ctx, "xvfcmp_cule_", suffix);
+        break;
+    case 0xF:
+        output_xvfcmp(ctx, "xvfcmp_sule_", suffix);
+        break;
+    case 0x10:
+        output_xvfcmp(ctx, "xvfcmp_cne_", suffix);
+        break;
+    case 0x11:
+        output_xvfcmp(ctx, "xvfcmp_sne_", suffix);
+        break;
+    case 0x14:
+        output_xvfcmp(ctx, "xvfcmp_cor_", suffix);
+        break;
+    case 0x15:
+        output_xvfcmp(ctx, "xvfcmp_sor_", suffix);
+        break;
+    case 0x18:
+        output_xvfcmp(ctx, "xvfcmp_cune_", suffix);
+        break;
+    case 0x19:
+        output_xvfcmp(ctx, "xvfcmp_sune_", suffix);
+        break;
+    default:
+        ret = false;
+    }
+    return ret;
+}
+
+#define LASX_FCMP_INSN(suffix) \
+static bool trans_xvfcmp_cond_##suffix(DisasContext *ctx, \
+                                       arg_xxx_fcond * a) \
+{ \
+    return output_xxx_fcond(ctx, a, #suffix); \
+}
+
+LASX_FCMP_INSN(s)
+LASX_FCMP_INSN(d)
+
 INSN_LASX(xvreplgr2vr_b, xr)
 INSN_LASX(xvreplgr2vr_h, xr)
 INSN_LASX(xvreplgr2vr_w, xr)
diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h
index fbfd15d711..665bcb812a 100644
--- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h
@@ -1216,3 +1216,8 @@ DEF_HELPER_FLAGS_4(xvslti_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(xvslti_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(xvslti_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(xvslti_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+
+DEF_HELPER_5(xvfcmp_c_s, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(xvfcmp_s_s, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(xvfcmp_c_d, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(xvfcmp_s_d, void, env, i32, i32, i32, i32)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index cc1b4fd42a..cdcd4a279a 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -2674,6 +2674,38 @@ TRANS(xvslti_hu, do_xvslti_u, MO_16)
 TRANS(xvslti_wu, do_xvslti_u, MO_32)
 TRANS(xvslti_du, do_xvslti_u, MO_64)
 
+static bool trans_xvfcmp_cond_s(DisasContext *ctx, arg_xxx_fcond * a)
+{
+    uint32_t flags;
+    void (*fn)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32);
+    TCGv_i32 xd = tcg_constant_i32(a->xd);
+    TCGv_i32 xj = tcg_constant_i32(a->xj);
+    TCGv_i32 xk = tcg_constant_i32(a->xk);
+
+    CHECK_SXE;
+
+    fn = (a->fcond & 1 ? gen_helper_xvfcmp_s_s : gen_helper_xvfcmp_c_s);
+    flags = get_fcmp_flags(a->fcond >> 1);
+    fn(cpu_env, xd, xj, xk, tcg_constant_i32(flags));
+
+    return true;
+}
+
+static bool trans_xvfcmp_cond_d(DisasContext *ctx, arg_xxx_fcond *a)
+{
+    uint32_t flags;
+    void (*fn)(TCGv_env, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32);
+    TCGv_i32 xd = tcg_constant_i32(a->xd);
+    TCGv_i32 xj = tcg_constant_i32(a->xj);
+    TCGv_i32 xk = tcg_constant_i32(a->xk);
+
+    fn = (a->fcond & 1 ? gen_helper_xvfcmp_s_d : gen_helper_xvfcmp_c_d);
+    flags = get_fcmp_flags(a->fcond >> 1);
+    fn(cpu_env, xd, xj, xk, tcg_constant_i32(flags));
+
+    return true;
+}
+
 static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop)
 {
     TCGv src = gpr_src(ctx, a->rj, EXT_NONE);
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 4e1f0b30a0..df45dc3d76 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1307,6 +1307,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4
 &xx_i xd xj imm
 &x_i xd imm
 &xxxx xd xj xk xa
+&xxx_fcond xd xj xk fcond
 
 #
 # LASX Formats
@@ -1324,6 +1325,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4
 @xx_ui7 .... ........ ... imm:7 xj:5 xd:5 &xx_i
 @xx_ui8 .... ........ .. imm:8 xj:5 xd:5 &xx_i
 @xxxx .... ........ xa:5 xk:5 xj:5 xd:5 &xxxx
+@xxx_fcond .... ........ fcond:5 xk:5 xj:5 xd:5 &xxx_fcond
 
 xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx
 xvadd_h 0111 01000000 10101 ..... ..... ..... @xxx
@@ -1983,6 +1985,9 @@ xvslti_hu 0111 01101000 10001 ..... ..... ..... @xx_ui5
 xvslti_wu 0111 01101000 10010 ..... ..... ..... @xx_ui5
 xvslti_du 0111 01101000 10011 ..... ..... ..... @xx_ui5
 
+xvfcmp_cond_s 0000 11001001 ..... ..... ..... ..... @xxx_fcond
+xvfcmp_cond_d 0000 11001010 ..... ..... ..... ..... @xxx_fcond
+
 xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr
 xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr
 xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr
diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c
index d0bc02de72..1d56fe7b22 100644
--- a/target/loongarch/lasx_helper.c
+++ b/target/loongarch/lasx_helper.c
@@ -2757,3 +2757,28 @@ XVCMPI(xvslti_bu, 8, UXB, VSLT)
 XVCMPI(xvslti_hu, 16, UXH, VSLT)
 XVCMPI(xvslti_wu, 32, UXW, VSLT)
 XVCMPI(xvslti_du, 64, UXD, VSLT)
+
+#define XVFCMP(NAME, BIT, E, FN) \
+void HELPER(NAME)(CPULoongArchState *env, \
+                  uint32_t xd, uint32_t xj, uint32_t xk, uint32_t flags) \
+{ \
+    int i; \
+    XReg t; \
+    XReg *Xd = &(env->fpr[xd].xreg); \
+    XReg *Xj = &(env->fpr[xj].xreg); \
+    XReg *Xk = &(env->fpr[xk].xreg); \
+    \
+    vec_clear_cause(env); \
+    for (i = 0; i < LASX_LEN / BIT ; i++) { \
+        FloatRelation cmp; \
+        cmp = FN(Xj->E(i), Xk->E(i), &env->fp_status); \
+        t.E(i) = vfcmp_common(env, cmp, flags); \
+        vec_update_fcsr0(env, GETPC()); \
+    } \
+    *Xd = t; \
+}
+
+XVFCMP(xvfcmp_c_s, 32, UXW, float32_compare_quiet)
+XVFCMP(xvfcmp_s_s, 32, UXW, float32_compare)
+XVFCMP(xvfcmp_c_d, 64, UXD, float64_compare_quiet)
+XVFCMP(xvfcmp_s_d, 64, UXD, float64_compare)
diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c
index 22d71cb39e..4a5c1a47a1 100644
--- a/target/loongarch/lsx_helper.c
+++ b/target/loongarch/lsx_helper.c
@@ -2622,8 +2622,8 @@ VCMPI(vslti_hu, 16, UH, VSLT)
 VCMPI(vslti_wu, 32, UW, VSLT)
 VCMPI(vslti_du, 64, UD, VSLT)
 
-static uint64_t vfcmp_common(CPULoongArchState *env,
-                             FloatRelation cmp, uint32_t flags)
+uint64_t vfcmp_common(CPULoongArchState *env,
+                      FloatRelation cmp, uint32_t flags)
 {
     uint64_t ret = 0;
 
diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 54fd2689f3..134dd265bf 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -8,6 +8,8 @@
 #ifndef LOONGARCH_VEC_H
 #define LOONGARCH_VEC_H
 
+#include "fpu/softfloat.h"
+
 #if HOST_BIG_ENDIAN
 #define B(x) B[15 - (x)]
 #define H(x) H[7 - (x)]
@@ -113,4 +115,7 @@ uint64_t do_frecip_64(CPULoongArchState *env, uint64_t fj);
 uint32_t do_frsqrt_32(CPULoongArchState *env, uint32_t fj);
 uint64_t do_frsqrt_64(CPULoongArchState *env, uint64_t fj);
 
+uint64_t vfcmp_common(CPULoongArchState *env,
+                      FloatRelation cmp, uint32_t flags);
+
 #endif /* LOONGARCH_VEC_H */

From patchwork Tue Jun 20 09:38:08 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285499
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id 984BFEB64DB
 for ; Tue, 20 Jun 2023 09:45:47 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qBXpT-0007Md-BH; Tue, 20 Jun 2023 05:39:15 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qBXp8-0007Cr-3L
 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:39:01 -0400
Received: from mail.loongson.cn ([114.242.206.163])
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qBXp5-0006P3-BR
 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:53 -0400
Received: from loongson.cn (unknown [10.2.5.185])
 by gateway (Coremail) with SMTP id _____8CxvOqec5Fk3CUHAA--.14674S3;
 Tue, 20 Jun 2023 17:38:38 +0800 (CST)
Received: from localhost.localdomain (unknown [10.2.5.185])
 by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S42;
 Tue, 20 Jun 2023 17:38:38 +0800 (CST)
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 40/46] target/loongarch: Implement xvbitsel xvset
Date: Tue, 20 Jun 2023 17:38:08 +0800
Message-Id: <20230620093814.123650-41-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>
MIME-Version: 1.0
X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S42
X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/
X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7
 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx
 nUUI43ZEXa7xR_UUUUUUUUU==
Received-SPF: pass client-ip=114.242.206.163;
 envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham
 autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

This patch includes:
- XVBITSEL.V;
- XVBITSELI.B;
- XVSET{EQZ/NEZ}.V;
- XVSETANYEQZ.{B/H/W/D};
- XVSETALLNEZ.{B/H/W/D}.

Signed-off-by: Song Gao
---
 target/loongarch/disas.c                     | 19 +++++
 target/loongarch/helper.h                    | 11 +++
 target/loongarch/insn_trans/trans_lasx.c.inc | 76 ++++++++++++++++++++
 target/loongarch/insns.decode                | 17 +++++
 target/loongarch/lasx_helper.c               | 37 ++++++++++
 target/loongarch/lsx_helper.c                |  2 +-
 target/loongarch/vec.h                       |  2 +
 7 files changed, 163 insertions(+), 1 deletion(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index c3bcb9d84a..5c2a81ee80 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1703,6 +1703,11 @@ static bool trans_##insn(DisasContext *ctx, arg_##type * a) \
     return true; \
 }
 
+static void output_cx(DisasContext *ctx, arg_cx *a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "fcc%d, x%d", a->cd, a->xj);
+}
+
 static void output_x_i(DisasContext *ctx, arg_x_i *a, const char *mnemonic)
 {
     output(ctx, mnemonic, "x%d, 0x%x", a->xd, a->imm);
@@ -2478,6 +2483,20 @@ static bool trans_xvfcmp_cond_##suffix(DisasContext *ctx, \
 LASX_FCMP_INSN(s)
 LASX_FCMP_INSN(d)
 
+INSN_LASX(xvbitsel_v, xxxx)
+INSN_LASX(xvbitseli_b, xx_i)
+
+INSN_LASX(xvseteqz_v, cx)
+INSN_LASX(xvsetnez_v, cx)
+INSN_LASX(xvsetanyeqz_b, cx)
+INSN_LASX(xvsetanyeqz_h, cx)
+INSN_LASX(xvsetanyeqz_w, cx)
+INSN_LASX(xvsetanyeqz_d, cx)
+INSN_LASX(xvsetallnez_b, cx)
+INSN_LASX(xvsetallnez_h, cx)
+INSN_LASX(xvsetallnez_w, cx)
+INSN_LASX(xvsetallnez_d, cx)
+
 INSN_LASX(xvreplgr2vr_b, xr)
 INSN_LASX(xvreplgr2vr_h, xr)
 INSN_LASX(xvreplgr2vr_w, xr)
diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h
index 665bcb812a..f6d64bfde5 100644
--- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h
@@ -1221,3 +1221,14 @@ DEF_HELPER_5(xvfcmp_c_s, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(xvfcmp_s_s, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(xvfcmp_c_d, void, env, i32, i32, i32, i32)
 DEF_HELPER_5(xvfcmp_s_d, void, env, i32, i32, i32, i32)
+
+DEF_HELPER_FLAGS_4(xvbitseli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+
+DEF_HELPER_3(xvsetanyeqz_b, void, env, i32, i32)
+DEF_HELPER_3(xvsetanyeqz_h, void, env, i32, i32)
+DEF_HELPER_3(xvsetanyeqz_w, void, env, i32, i32)
+DEF_HELPER_3(xvsetanyeqz_d, void, env, i32, i32)
+DEF_HELPER_3(xvsetallnez_b, void, env, i32, i32)
+DEF_HELPER_3(xvsetallnez_h, void, env, i32, i32)
+DEF_HELPER_3(xvsetallnez_w, void, env, i32, i32)
+DEF_HELPER_3(xvsetallnez_d, void, env, i32, i32)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index cdcd4a279a..cefb6a4973 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -65,6 +65,17 @@ static bool gen_xx_i(DisasContext *ctx, arg_xx_i *a,
     return true;
 }
 
+static bool gen_cx(DisasContext *ctx, arg_cx *a,
+                   void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32))
+{
+    TCGv_i32 xj = tcg_constant_i32(a->xj);
+    TCGv_i32 cd = tcg_constant_i32(a->cd);
+
+    CHECK_ASXE;
+    func(cpu_env, cd, xj);
+    return true;
+}
+
 static bool gvec_xxx(DisasContext *ctx, arg_xxx *a, MemOp mop,
                      void (*func)(unsigned, uint32_t, uint32_t,
                                   uint32_t, uint32_t, uint32_t))
@@ -2706,6 +2717,71 @@ static bool trans_xvfcmp_cond_d(DisasContext *ctx, arg_xxx_fcond *a)
     return true;
 }
 
+static bool trans_xvbitsel_v(DisasContext *ctx, arg_xxxx *a)
+{
+    CHECK_ASXE;
+
+    tcg_gen_gvec_bitsel(MO_64, vec_full_offset(a->xd), vec_full_offset(a->xa),
+                        vec_full_offset(a->xk), vec_full_offset(a->xj),
+                        32, ctx->vl / 8);
+    return true;
+}
+
+static bool trans_xvbitseli_b(DisasContext *ctx, arg_xx_i *a)
+{
+    static const GVecGen2i op = {
+        .fniv = gen_vbitseli,
+        .fnoi = gen_helper_xvbitseli_b,
+        .vece = MO_8,
+        .load_dest = true
+    };
+
+    CHECK_ASXE;
+
+    tcg_gen_gvec_2i(vec_full_offset(a->xd), vec_full_offset(a->xj),
+                    32, ctx->vl / 8, a->imm, &op);
+    return true;
+}
+
+#define XVSET(NAME, COND) \
+static bool trans_## NAME(DisasContext *ctx, arg_cx * a) \
+{ \
+    TCGv_i64 t1, t2, d[4]; \
+    \
+    d[0] = tcg_temp_new_i64(); \
+    d[1] = tcg_temp_new_i64(); \
+    d[2] = tcg_temp_new_i64(); \
+    d[3] = tcg_temp_new_i64(); \
+    t1 = tcg_temp_new_i64(); \
+    t2 = tcg_temp_new_i64(); \
+    \
+    get_xreg64(d[0], a->xj, 0); \
+    get_xreg64(d[1], a->xj, 1); \
+    get_xreg64(d[2], a->xj, 2); \
+    get_xreg64(d[3], a->xj, 3); \
+    \
+    CHECK_ASXE; \
+    tcg_gen_or_i64(t1, d[0], d[1]); \
+    tcg_gen_or_i64(t2, d[2], d[3]); \
+    tcg_gen_or_i64(t1, t2, t1); \
+    tcg_gen_setcondi_i64(COND, t1, t1, 0); \
+    tcg_gen_st8_tl(t1, cpu_env, offsetof(CPULoongArchState, cf[a->cd & 0x7])); \
+    \
+    return true; \
+}
+
+XVSET(xvseteqz_v, TCG_COND_EQ)
+XVSET(xvsetnez_v, TCG_COND_NE)
+
+TRANS(xvsetanyeqz_b, gen_cx, gen_helper_xvsetanyeqz_b)
+TRANS(xvsetanyeqz_h, gen_cx, gen_helper_xvsetanyeqz_h)
+TRANS(xvsetanyeqz_w, gen_cx, gen_helper_xvsetanyeqz_w)
+TRANS(xvsetanyeqz_d, gen_cx, gen_helper_xvsetanyeqz_d)
+TRANS(xvsetallnez_b, gen_cx, gen_helper_xvsetallnez_b)
+TRANS(xvsetallnez_h, gen_cx, gen_helper_xvsetallnez_h)
+TRANS(xvsetallnez_w, gen_cx, gen_helper_xvsetallnez_w)
+TRANS(xvsetallnez_d, gen_cx, gen_helper_xvsetallnez_d)
+
 static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop)
 {
     TCGv src = gpr_src(ctx, a->rj, EXT_NONE);
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index df45dc3d76..b696d99577 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1308,6 +1308,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4
 &x_i xd imm
 &xxxx xd xj xk xa
 &xxx_fcond xd xj xk fcond
+&cx cd xj
 
 #
 # LASX Formats
@@ -1326,6 +1327,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4
 @xx_ui8 .... ........ .. imm:8 xj:5 xd:5 &xx_i
 @xxxx .... ........ xa:5 xk:5 xj:5 xd:5 &xxxx
 @xxx_fcond .... ........ fcond:5 xk:5 xj:5 xd:5 &xxx_fcond
+@cx .... ........ ..... ..... xj:5 .. cd:3 &cx
 
 xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx
 xvadd_h 0111 01000000 10101 ..... ..... ..... @xxx
@@ -1988,6 +1990,21 @@ xvslti_du 0111 01101000 10011 ..... ..... ..... @xx_ui5
 xvfcmp_cond_s 0000 11001001 ..... ..... ..... ..... @xxx_fcond
 xvfcmp_cond_d 0000 11001010 ..... ..... ..... ..... @xxx_fcond
 
+xvbitsel_v 0000 11010010 ..... ..... ..... ..... @xxxx
+
+xvbitseli_b 0111 01111100 01 ........ ..... ..... @xx_ui8
+
+xvseteqz_v 0111 01101001 11001 00110 ..... 00 ... @cx
+xvsetnez_v 0111 01101001 11001 00111 ..... 00 ... @cx
+xvsetanyeqz_b 0111 01101001 11001 01000 ..... 00 ... @cx
+xvsetanyeqz_h 0111 01101001 11001 01001 ..... 00 ... @cx
+xvsetanyeqz_w 0111 01101001 11001 01010 ..... 00 ... @cx
+xvsetanyeqz_d 0111 01101001 11001 01011 ..... 00 ... @cx
+xvsetallnez_b 0111 01101001 11001 01100 ..... 00 ... @cx
+xvsetallnez_h 0111 01101001 11001 01101 ..... 00 ... @cx
+xvsetallnez_w 0111 01101001 11001 01110 ..... 00 ... @cx
+xvsetallnez_d 0111 01101001 11001 01111 ..... 00 ... @cx
+
 xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr
 xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr
 xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr
diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c
index 1d56fe7b22..56dfe10a0d 100644
--- a/target/loongarch/lasx_helper.c
+++ b/target/loongarch/lasx_helper.c
@@ -2782,3 +2782,40 @@ XVFCMP(xvfcmp_c_s, 32, UXW, float32_compare_quiet)
 XVFCMP(xvfcmp_s_s, 32, UXW, float32_compare)
 XVFCMP(xvfcmp_c_d, 64, UXD, float64_compare_quiet)
 XVFCMP(xvfcmp_s_d, 64, UXD, float64_compare)
+
+void HELPER(xvbitseli_b)(void *xd, void *xj, uint64_t imm, uint32_t v)
+{
+    int i;
+    XReg *Xd = (XReg *)xd;
+    XReg *Xj = (XReg *)xj;
+
+    for (i = 0; i < LASX_LEN / 8; i++) {
+        Xd->XB(i) = (~Xd->XB(i) & Xj->XB(i)) | (Xd->XB(i) & imm);
+    }
+}
+
+#define XSETANYEQZ(NAME, MO) \
+void HELPER(NAME)(CPULoongArchState *env, uint32_t cd, uint32_t xj) \
+{ \
+    XReg *Xj = &(env->fpr[xj].xreg); \
+    \
+    env->cf[cd & 0x7] = do_match2(0, Xj->XD(0), Xj->XD(1), MO) || \
+                        do_match2(0, Xj->XD(2), Xj->XD(3), MO); \
+}
+XSETANYEQZ(xvsetanyeqz_b, MO_8)
+XSETANYEQZ(xvsetanyeqz_h, MO_16)
+XSETANYEQZ(xvsetanyeqz_w, MO_32)
+XSETANYEQZ(xvsetanyeqz_d, MO_64)
+
+#define XSETALLNEZ(NAME, MO) \
+void HELPER(NAME)(CPULoongArchState *env, uint32_t cd, uint32_t xj) \
+{ \
+    XReg *Xj = &(env->fpr[xj].xreg); \
+    \
+    env->cf[cd & 0x7] = !do_match2(0, Xj->XD(0), Xj->XD(1), MO) && \
+                        !do_match2(0, Xj->XD(2), Xj->XD(3), MO); \
+}
+XSETALLNEZ(xvsetallnez_b, MO_8)
+XSETALLNEZ(xvsetallnez_h, MO_16)
+XSETALLNEZ(xvsetallnez_w, MO_32)
+XSETALLNEZ(xvsetallnez_d, MO_64)
diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c
index 4a5c1a47a1..00c9835948 100644
--- a/target/loongarch/lsx_helper.c
+++ b/target/loongarch/lsx_helper.c
@@ -2688,7 +2688,7 @@ void HELPER(vbitseli_b)(void *vd, void *vj, uint64_t imm, uint32_t v)
 }
 
 /* Copy from target/arm/tcg/sve_helper.c */
-static inline bool do_match2(uint64_t n, uint64_t m0, uint64_t m1, int esz)
+bool do_match2(uint64_t n, uint64_t m0, uint64_t m1, int esz)
 {
     uint64_t bits = 8 << esz;
     uint64_t ones = dup_const(esz, 1);
diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index 134dd265bf..cfac1c0e1c 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -118,4 +118,6 @@ uint64_t do_frsqrt_64(CPULoongArchState *env, uint64_t fj);
 uint64_t vfcmp_common(CPULoongArchState *env,
                       FloatRelation cmp, uint32_t flags);
 
+bool do_match2(uint64_t n, uint64_t m0, uint64_t m1, int esz);
+
 #endif /* LOONGARCH_VEC_H */

From patchwork Tue Jun 20 09:38:09 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285496
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id 252EFEB64D7
 for ; Tue, 20 Jun 2023 09:45:22 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qBXpT-0007NR-Sz; Tue, 20 Jun 2023 05:39:15 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qBXp8-0007Cp-2g
 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:39:00 -0400
Received: from mail.loongson.cn ([114.242.206.163])
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qBXp5-0006PA-J8
 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:53 -0400
Received: from loongson.cn (unknown [10.2.5.185])
 by gateway (Coremail) with SMTP id _____8Cx8Oifc5Fk3yUHAA--.12762S3;
 Tue, 20 Jun 2023 17:38:39 +0800 (CST)
Received: from localhost.localdomain (unknown [10.2.5.185])
 by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S43;
 Tue, 20 Jun 2023 17:38:38 +0800 (CST)
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 41/46] target/loongarch: Implement xvinsgr2vr xvpickve2gr
Date: Tue, 20 Jun 2023 17:38:09 +0800
Message-Id: <20230620093814.123650-42-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>
MIME-Version: 1.0
X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S43
X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/
X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7
 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx
 nUUI43ZEXa7xR_UUUUUUUUU==
Received-SPF: pass client-ip=114.242.206.163;
 envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham
 autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

This patch includes:
- XVINSGR2VR.{W/D};
- XVPICKVE2GR.{W/D}[U].

Signed-off-by: Song Gao
---
 target/loongarch/disas.c                     | 17 ++++++
 target/loongarch/insn_trans/trans_lasx.c.inc | 54 ++++++++++++++++++++
 target/loongarch/insns.decode                | 13 +++++
 3 files changed, 84 insertions(+)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 5c2a81ee80..fd7d459921 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1738,6 +1738,16 @@ static void output_xr(DisasContext *ctx, arg_xr *a, const char *mnemonic)
     output(ctx, mnemonic, "x%d, r%d", a->xd, a->rj);
 }
 
+static void output_xr_i(DisasContext *ctx, arg_xr_i *a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "x%d, r%d, 0x%x", a->xd, a->rj, a->imm);
+}
+
+static void output_rx_i(DisasContext *ctx, arg_rx_i *a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "r%d, x%d, 0x%x", a->rd, a->xj, a->imm);
+}
+
 INSN_LASX(xvadd_b, xxx)
 INSN_LASX(xvadd_h, xxx)
 INSN_LASX(xvadd_w, xxx)
@@ -2497,6 +2507,13 @@ INSN_LASX(xvsetallnez_h, cx)
 INSN_LASX(xvsetallnez_w, cx)
 INSN_LASX(xvsetallnez_d, cx)
 
+INSN_LASX(xvinsgr2vr_w, xr_i)
+INSN_LASX(xvinsgr2vr_d, xr_i)
+INSN_LASX(xvpickve2gr_w, rx_i)
+INSN_LASX(xvpickve2gr_d, rx_i)
+INSN_LASX(xvpickve2gr_wu, rx_i)
+INSN_LASX(xvpickve2gr_du, rx_i)
+
 INSN_LASX(xvreplgr2vr_b, xr)
 INSN_LASX(xvreplgr2vr_h, xr)
 INSN_LASX(xvreplgr2vr_w, xr)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index cefb6a4973..0fc26023d1 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -2782,6 +2782,60 @@ TRANS(xvsetallnez_h, gen_cx, gen_helper_xvsetallnez_h)
 TRANS(xvsetallnez_w, gen_cx, gen_helper_xvsetallnez_w)
 TRANS(xvsetallnez_d, gen_cx, gen_helper_xvsetallnez_d)
 
+static bool trans_xvinsgr2vr_w(DisasContext *ctx, arg_xr_i *a)
+{
+    TCGv src = gpr_src(ctx, a->rj, EXT_NONE);
+    CHECK_ASXE;
+    tcg_gen_st32_i64(src, cpu_env,
+                     offsetof(CPULoongArchState, fpr[a->xd].xreg.XW(a->imm)));
+    return true;
+}
+
+static bool trans_xvinsgr2vr_d(DisasContext *ctx, arg_xr_i *a)
+{
+    TCGv src = gpr_src(ctx, a->rj, EXT_NONE);
+    CHECK_ASXE;
+    tcg_gen_st_i64(src, cpu_env,
+                   offsetof(CPULoongArchState, fpr[a->xd].xreg.XD(a->imm)));
+    return true;
+}
+
+static bool trans_xvpickve2gr_w(DisasContext *ctx, arg_rx_i *a)
+{
+    TCGv dst = gpr_dst(ctx, a->rd, EXT_NONE);
+    CHECK_ASXE;
+    tcg_gen_ld32s_i64(dst, cpu_env,
+                      offsetof(CPULoongArchState, fpr[a->xj].xreg.XW(a->imm)));
+    return true;
+}
+
+static bool trans_xvpickve2gr_d(DisasContext *ctx, arg_rx_i *a)
+{
+    TCGv dst = gpr_dst(ctx, a->rd, EXT_NONE);
+    CHECK_ASXE;
+    tcg_gen_ld_i64(dst, cpu_env,
+                   offsetof(CPULoongArchState, fpr[a->xj].xreg.XD(a->imm)));
+    return true;
+}
+
+static bool trans_xvpickve2gr_wu(DisasContext *ctx, arg_rx_i *a)
+{
+    TCGv dst = gpr_dst(ctx, a->rd, EXT_NONE);
+    CHECK_ASXE;
+    tcg_gen_ld32u_i64(dst, cpu_env,
+                      offsetof(CPULoongArchState, fpr[a->xj].xreg.XW(a->imm)));
+    return true;
+}
+
+static bool trans_xvpickve2gr_du(DisasContext *ctx, arg_rx_i *a)
+{
+    TCGv dst = gpr_dst(ctx, a->rd, EXT_NONE);
+    CHECK_ASXE;
+    tcg_gen_ld_i64(dst, cpu_env,
+                   offsetof(CPULoongArchState, fpr[a->xj].xreg.XD(a->imm)));
+    return true;
+}
+
 static bool gvec_dupx(DisasContext *ctx, arg_xr *a, MemOp mop)
 {
     TCGv src = gpr_src(ctx, a->rj, EXT_NONE);
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index b696d99577..8c87b3f840 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1309,6 +1309,8 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4
 &xxxx xd xj xk xa
 &xxx_fcond xd xj xk fcond
 &cx cd xj
+&xr_i xd rj imm
+&rx_i rd xj imm
 
 #
 # LASX Formats
@@ -1328,6 +1330,10 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4
 @xxxx .... ........ xa:5 xk:5 xj:5 xd:5 &xxxx
 @xxx_fcond .... ........ fcond:5 xk:5 xj:5 xd:5 &xxx_fcond
 @cx .... ........ ..... ..... xj:5 .. cd:3 &cx
+@xr_ui3 .... ........ ..... .. imm:3 rj:5 xd:5 &xr_i
+@xr_ui2 .... ........ ..... ... imm:2 rj:5 xd:5 &xr_i
+@rx_ui3 .... ........ ..... .. imm:3 xj:5 rd:5 &rx_i
+@rx_ui2 .... ........ ..... ... imm:2 xj:5 rd:5 &rx_i
 
 xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx
 xvadd_h 0111 01000000 10101 ..... ..... ..... @xxx
@@ -2005,6 +2011,13 @@ xvsetallnez_h 0111 01101001 11001 01101 ..... 00 ... @cx
 xvsetallnez_w 0111 01101001 11001 01110 ..... 00 ... @cx
 xvsetallnez_d 0111 01101001 11001 01111 ..... 00 ... @cx
 
+xvinsgr2vr_w 0111 01101110 10111 10 ... ..... ..... @xr_ui3
+xvinsgr2vr_d 0111 01101110 10111 110 .. ..... ..... @xr_ui2
+xvpickve2gr_w 0111 01101110 11111 10 ... ..... ..... @rx_ui3
+xvpickve2gr_d 0111 01101110 11111 110 .. ..... ..... @rx_ui2
+xvpickve2gr_wu 0111 01101111 00111 10 ... ..... ..... @rx_ui3
+xvpickve2gr_du 0111 01101111 00111 110 .. ..... ..... @rx_ui2
+
 xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr
 xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr
 xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr

From patchwork Tue Jun 20 09:38:10 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285454
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id B214DEB64D7
 for ; Tue, 20 Jun 2023 09:40:44 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qBXpS-0007KU-63; Tue, 20 Jun 2023 05:39:14 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1)
 (envelope-from )
 id 1qBXpD-0007DX-W2
 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:39:03 -0400
Received: from mail.loongson.cn ([114.242.206.163])
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from )
 id 1qBXpB-0006Pf-Gb
 for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:38:59 -0400
Received: from loongson.cn (unknown [10.2.5.185])
 by gateway (Coremail) with SMTP id _____8BxIuihc5Fk4SUHAA--.623S3;
 Tue, 20 Jun 2023 17:38:41 +0800 (CST)
Received: from localhost.localdomain (unknown [10.2.5.185])
 by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S44;
 Tue, 20 Jun 2023 17:38:39 +0800 (CST)
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 42/46] target/loongarch: Implement xvreplve xvinsve0 xvpickve xvb{sll/srl}v
Date: Tue, 20 Jun 2023 17:38:10 +0800
Message-Id: <20230620093814.123650-43-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>
MIME-Version: 1.0
X-CM-TRANSID: AQAAf8BxduSGc5FkzIQhAA--.28394S44
X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/
X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7
 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx
 nUUI43ZEXa7xR_UUUUUUUUU==
Received-SPF: pass client-ip=114.242.206.163;
 envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham
 autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

This patch includes:
- XVREPLVE.{B/H/W/D};
- XVREPL128VEI.{B/H/W/D};
- XVREPLVE0.{B/H/W/D/Q};
- XVINSVE0.{W/D};
- XVPICKVE.{W/D};
- XVBSLL.V, XVBSRL.V.

Signed-off-by: Song Gao
---
 target/loongarch/disas.c                     |  29 +++
 target/loongarch/helper.h                    |   5 +
 target/loongarch/insn_trans/trans_lasx.c.inc | 205 +++++++++++++++++++
 target/loongarch/insns.decode                |  29 +++
 target/loongarch/lasx_helper.c               |  29 +++
 5 files changed, 297 insertions(+)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index fd7d459921..3b89a5df87 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1748,6 +1748,11 @@ static void output_rx_i(DisasContext *ctx, arg_rx_i *a, const char *mnemonic)
     output(ctx, mnemonic, "r%d, x%d, 0x%x", a->rd, a->xj, a->imm);
 }
 
+static void output_xxr(DisasContext *ctx, arg_xxr *a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "x%d, x%d, r%d", a->xd, a->xj, a->rk);
+}
+
 INSN_LASX(xvadd_b, xxx)
 INSN_LASX(xvadd_h, xxx)
 INSN_LASX(xvadd_w, xxx)
@@ -2518,3 +2523,27 @@ INSN_LASX(xvreplgr2vr_b, xr)
 INSN_LASX(xvreplgr2vr_h, xr)
 INSN_LASX(xvreplgr2vr_w, xr)
 INSN_LASX(xvreplgr2vr_d, xr)
+
+INSN_LASX(xvreplve_b, xxr)
+INSN_LASX(xvreplve_h, xxr) +INSN_LASX(xvreplve_w, xxr) +INSN_LASX(xvreplve_d, xxr) +INSN_LASX(xvrepl128vei_b, xx_i) +INSN_LASX(xvrepl128vei_h, xx_i) +INSN_LASX(xvrepl128vei_w, xx_i) +INSN_LASX(xvrepl128vei_d, xx_i) + +INSN_LASX(xvreplve0_b, xx) +INSN_LASX(xvreplve0_h, xx) +INSN_LASX(xvreplve0_w, xx) +INSN_LASX(xvreplve0_d, xx) +INSN_LASX(xvreplve0_q, xx) + +INSN_LASX(xvinsve0_w, xx_i) +INSN_LASX(xvinsve0_d, xx_i) + +INSN_LASX(xvpickve_w, xx_i) +INSN_LASX(xvpickve_d, xx_i) + +INSN_LASX(xvbsll_v, xx_i) +INSN_LASX(xvbsrl_v, xx_i) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index f6d64bfde5..6c4525a413 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -1232,3 +1232,8 @@ DEF_HELPER_3(xvsetallnez_b, void, env, i32, i32) DEF_HELPER_3(xvsetallnez_h, void, env, i32, i32) DEF_HELPER_3(xvsetallnez_w, void, env, i32, i32) DEF_HELPER_3(xvsetallnez_d, void, env, i32, i32) + +DEF_HELPER_4(xvinsve0_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvinsve0_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvpickve_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvpickve_d, void, env, i32, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index 0fc26023d1..e63b1c67c9 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -2851,3 +2851,208 @@ TRANS(xvreplgr2vr_b, gvec_dupx, MO_8) TRANS(xvreplgr2vr_h, gvec_dupx, MO_16) TRANS(xvreplgr2vr_w, gvec_dupx, MO_32) TRANS(xvreplgr2vr_d, gvec_dupx, MO_64) + +static bool gen_xvreplve(DisasContext *ctx, arg_xxr *a, int vece, int bit, + void (*func)(TCGv_i64, TCGv_ptr, tcg_target_long)) +{ + TCGv_i64 t0 = tcg_temp_new_i64(); + TCGv_ptr t1 = tcg_temp_new_ptr(); + TCGv_i64 t2 = tcg_temp_new_i64(); + + CHECK_ASXE; + + tcg_gen_andi_i64(t0, gpr_src(ctx, a->rk, EXT_NONE), (LSX_LEN / bit) - 1); + tcg_gen_shli_i64(t0, t0, vece); + if (HOST_BIG_ENDIAN) { + tcg_gen_xori_i64(t0, t0, vece << ((LSX_LEN / bit) - 
1)); + } + + tcg_gen_trunc_i64_ptr(t1, t0); + tcg_gen_add_ptr(t1, t1, cpu_env); + func(t2, t1, vec_full_offset(a->xj)); + tcg_gen_gvec_dup_i64(vece, vec_full_offset(a->xd), 16, 16, t2); + func(t2, t1, offsetof(CPULoongArchState, fpr[a->xj].xreg.XQ(1))); + tcg_gen_gvec_dup_i64(vece, + offsetof(CPULoongArchState, fpr[a->xd].xreg.XQ(1)), + 16, 16, t2); + return true; +} + +TRANS(xvreplve_b, gen_xvreplve, MO_8, 8, tcg_gen_ld8u_i64) +TRANS(xvreplve_h, gen_xvreplve, MO_16, 16, tcg_gen_ld16u_i64) +TRANS(xvreplve_w, gen_xvreplve, MO_32, 32, tcg_gen_ld32u_i64) +TRANS(xvreplve_d, gen_xvreplve, MO_64, 64, tcg_gen_ld_i64) + +static bool trans_xvrepl128vei_b(DisasContext *ctx, arg_xx_i * a) +{ + CHECK_ASXE; + + tcg_gen_gvec_dup_mem(MO_8, + offsetof(CPULoongArchState, fpr[a->xd].xreg.XB(0)), + offsetof(CPULoongArchState, + fpr[a->xj].xreg.XB((a->imm))), + 16, 16); + tcg_gen_gvec_dup_mem(MO_8, + offsetof(CPULoongArchState, fpr[a->xd].xreg.XB(16)), + offsetof(CPULoongArchState, + fpr[a->xj].xreg.XB((a->imm + 16))), + 16, 16); + return true; +} + +static bool trans_xvrepl128vei_h(DisasContext *ctx, arg_xx_i *a) +{ + CHECK_ASXE; + + tcg_gen_gvec_dup_mem(MO_16, + offsetof(CPULoongArchState, fpr[a->xd].xreg.XH(0)), + offsetof(CPULoongArchState, + fpr[a->xj].xreg.XH((a->imm))), + 16, 16); + tcg_gen_gvec_dup_mem(MO_16, + offsetof(CPULoongArchState, fpr[a->xd].xreg.XH(8)), + offsetof(CPULoongArchState, + fpr[a->xj].xreg.XH((a->imm + 8))), + 16, 16); + return true; +} + +static bool trans_xvrepl128vei_w(DisasContext *ctx, arg_xx_i *a) +{ + CHECK_ASXE; + + tcg_gen_gvec_dup_mem(MO_32, + offsetof(CPULoongArchState, fpr[a->xd].xreg.XW(0)), + offsetof(CPULoongArchState, + fpr[a->xj].xreg.XW((a->imm))), + 16, 16); + tcg_gen_gvec_dup_mem(MO_32, + offsetof(CPULoongArchState, fpr[a->xd].xreg.XW(4)), + offsetof(CPULoongArchState, + fpr[a->xj].xreg.XW((a->imm + 4))), + 16, 16); + return true; +} + +static bool trans_xvrepl128vei_d(DisasContext *ctx, arg_xx_i *a) +{ + CHECK_ASXE; + + 
tcg_gen_gvec_dup_mem(MO_64, + offsetof(CPULoongArchState, fpr[a->xd].xreg.XD(0)), + offsetof(CPULoongArchState, + fpr[a->xj].xreg.XD((a->imm))), + 16, 16); + tcg_gen_gvec_dup_mem(MO_64, + offsetof(CPULoongArchState, fpr[a->xd].xreg.XD(2)), + offsetof(CPULoongArchState, + fpr[a->xj].xreg.XD((a->imm + 2))), + 16, 16); + return true; +} + +#define XVREPLVE0(NAME, MOP) \ +static bool trans_## NAME(DisasContext *ctx, arg_xx * a) \ +{ \ + CHECK_ASXE; \ + \ + tcg_gen_gvec_dup_mem(MOP, vec_full_offset(a->xd), vec_full_offset(a->xj), \ + 32, 32); \ + return true; \ +} + +XVREPLVE0(xvreplve0_b, MO_8) +XVREPLVE0(xvreplve0_h, MO_16) +XVREPLVE0(xvreplve0_w, MO_32) +XVREPLVE0(xvreplve0_d, MO_64) +XVREPLVE0(xvreplve0_q, MO_128) + +TRANS(xvinsve0_w, gen_xx_i, gen_helper_xvinsve0_w) +TRANS(xvinsve0_d, gen_xx_i, gen_helper_xvinsve0_d) + +TRANS(xvpickve_w, gen_xx_i, gen_helper_xvpickve_w) +TRANS(xvpickve_d, gen_xx_i, gen_helper_xvpickve_d) + +static bool trans_xvbsll_v(DisasContext *ctx, arg_xx_i *a) +{ + int ofs; + TCGv_i64 desthigh[2], destlow[2], high[2], low[2]; + + CHECK_ASXE; + + desthigh[0] = tcg_temp_new_i64(); + desthigh[1] = tcg_temp_new_i64(); + destlow[0] = tcg_temp_new_i64(); + destlow[1] = tcg_temp_new_i64(); + high[0] = tcg_temp_new_i64(); + high[1] = tcg_temp_new_i64(); + low[0] = tcg_temp_new_i64(); + low[1] = tcg_temp_new_i64(); + + get_xreg64(low[0], a->xj, 0); + get_xreg64(low[1], a->xj, 2); + + ofs = ((a->imm) & 0xf) * 8; + if (ofs < 64) { + get_xreg64(high[0], a->xj, 1); + get_xreg64(high[1], a->xj, 3); + tcg_gen_extract2_i64(desthigh[0], low[0], high[0], 64 - ofs); + tcg_gen_extract2_i64(desthigh[1], low[1], high[1], 64 - ofs); + tcg_gen_shli_i64(destlow[0], low[0], ofs); + tcg_gen_shli_i64(destlow[1], low[1], ofs); + } else { + tcg_gen_shli_i64(desthigh[0], low[0], ofs - 64); + tcg_gen_shli_i64(desthigh[1], low[1], ofs - 64); + destlow[0] = tcg_constant_i64(0); + destlow[1] = tcg_constant_i64(0); + } + + set_xreg64(desthigh[0], a->xd, 1); + 
set_xreg64(destlow[0], a->xd, 0); + set_xreg64(desthigh[1], a->xd, 3); + set_xreg64(destlow[1], a->xd, 2); + + return true; +} + +static bool trans_xvbsrl_v(DisasContext *ctx, arg_xx_i *a) +{ + TCGv_i64 desthigh[2], destlow[2], high[2], low[2]; + int ofs; + + CHECK_ASXE; + + desthigh[0] = tcg_temp_new_i64(); + desthigh[1] = tcg_temp_new_i64(); + destlow[0] = tcg_temp_new_i64(); + destlow[1] = tcg_temp_new_i64(); + high[0] = tcg_temp_new_i64(); + high[1] = tcg_temp_new_i64(); + low[0] = tcg_temp_new_i64(); + low[1] = tcg_temp_new_i64(); + + get_xreg64(high[0], a->xj, 1); + get_xreg64(high[1], a->xj, 3); + + ofs = ((a->imm) & 0xf) * 8; + if (ofs < 64) { + get_xreg64(low[0], a->xj, 0); + get_xreg64(low[1], a->xj, 2); + tcg_gen_extract2_i64(destlow[0], low[0], high[0], ofs); + tcg_gen_extract2_i64(destlow[1], low[1], high[1], ofs); + tcg_gen_shri_i64(desthigh[0], high[0], ofs); + tcg_gen_shri_i64(desthigh[1], high[1], ofs); + } else { + tcg_gen_shri_i64(destlow[0], high[0], ofs - 64); + tcg_gen_shri_i64(destlow[1], high[1], ofs - 64); + desthigh[0] = tcg_constant_i64(0); + desthigh[1] = tcg_constant_i64(0); + } + + set_xreg64(desthigh[0], a->xd, 1); + set_xreg64(destlow[0], a->xd, 0); + set_xreg64(desthigh[1], a->xd, 3); + set_xreg64(destlow[1], a->xd, 2); + + return true; +} diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 8c87b3f840..697087e6ef 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -1311,6 +1311,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 &cx cd xj &xr_i xd rj imm &rx_i rd xj imm +&xxr xd xj rk # # LASX Formats @@ -1321,6 +1322,8 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 @xxx .... ........ ..... xk:5 xj:5 xd:5 &xxx @xr .... ........ ..... ..... rj:5 xd:5 &xr @xx_i5 .... ........ ..... imm:s5 xj:5 xd:5 &xx_i +@xx_ui1 .... ........ ..... .... imm:1 xj:5 xd:5 &xx_i +@xx_ui2 .... ........ ..... ... imm:2 xj:5 xd:5 &xx_i @xx_ui3 .... ........ ..... .. 
imm:3 xj:5 xd:5 &xx_i @xx_ui4 .... ........ ..... . imm:4 xj:5 xd:5 &xx_i @xx_ui5 .... ........ ..... imm:5 xj:5 xd:5 &xx_i @@ -1334,6 +1337,7 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4 @xr_ui2 .... ........ ..... ... imm:2 rj:5 xd:5 &xr_i @rx_ui3 .... ........ ..... .. imm:3 xj:5 rd:5 &rx_i @rx_ui2 .... ........ ..... ... imm:2 xj:5 rd:5 &rx_i +@xxr .... ........ ..... rk:5 xj:5 xd:5 &xxr xvadd_b 0111 01000000 10100 ..... ..... ..... @xxx xvadd_h 0111 01000000 10101 ..... ..... ..... @xxx @@ -2022,3 +2026,28 @@ xvreplgr2vr_b 0111 01101001 11110 00000 ..... ..... @xr xvreplgr2vr_h 0111 01101001 11110 00001 ..... ..... @xr xvreplgr2vr_w 0111 01101001 11110 00010 ..... ..... @xr xvreplgr2vr_d 0111 01101001 11110 00011 ..... ..... @xr + +xvreplve_b 0111 01010010 00100 ..... ..... ..... @xxr +xvreplve_h 0111 01010010 00101 ..... ..... ..... @xxr +xvreplve_w 0111 01010010 00110 ..... ..... ..... @xxr +xvreplve_d 0111 01010010 00111 ..... ..... ..... @xxr + +xvrepl128vei_b 0111 01101111 01111 0 .... ..... ..... @xx_ui4 +xvrepl128vei_h 0111 01101111 01111 10 ... ..... ..... @xx_ui3 +xvrepl128vei_w 0111 01101111 01111 110 .. ..... ..... @xx_ui2 +xvrepl128vei_d 0111 01101111 01111 1110 . ..... ..... @xx_ui1 + +xvreplve0_b 0111 01110000 01110 00000 ..... ..... @xx +xvreplve0_h 0111 01110000 01111 00000 ..... ..... @xx +xvreplve0_w 0111 01110000 01111 10000 ..... ..... @xx +xvreplve0_d 0111 01110000 01111 11000 ..... ..... @xx +xvreplve0_q 0111 01110000 01111 11100 ..... ..... @xx + +xvinsve0_w 0111 01101111 11111 10 ... ..... ..... @xx_ui3 +xvinsve0_d 0111 01101111 11111 110 .. ..... ..... @xx_ui2 + +xvpickve_w 0111 01110000 00111 10 ... ..... ..... @xx_ui3 +xvpickve_d 0111 01110000 00111 110 .. ..... ..... @xx_ui2 + +xvbsll_v 0111 01101000 11100 ..... ..... ..... @xx_ui5 +xvbsrl_v 0111 01101000 11101 ..... ..... ..... 
@xx_ui5 diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 56dfe10a0d..4422c1292e 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -2819,3 +2819,32 @@ XSETALLNEZ(xvsetallnez_b, MO_8) XSETALLNEZ(xvsetallnez_h, MO_16) XSETALLNEZ(xvsetallnez_w, MO_32) XSETALLNEZ(xvsetallnez_d, MO_64) + +#define XVINSVE0(NAME, E, MASK) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + Xd->E(imm & MASK) = Xj->E(0); \ +} + +XVINSVE0(xvinsve0_w, XW, 0x7) +XVINSVE0(xvinsve0_d, XD, 0x3) + +#define XVPICKVE(NAME, E, BIT, MASK) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t imm) \ +{ \ + int i; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + \ + Xd->E(0) = Xj->E(imm & MASK); \ + for (i = 1; i < LASX_LEN / BIT; i++) { \ + Xd->E(i) = 0; \ + } \ +} + +XVPICKVE(xvpickve_w, XW, 32, 0x7) +XVPICKVE(xvpickve_d, XD, 64, 0x3) From patchwork Tue Jun 20 09:38:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285481 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 71430EB64DB for ; Tue, 20 Jun 2023 09:43:05 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXpR-0007IX-48; Tue, 20 Jun 2023 05:39:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXpD-0007DW-V0 for qemu-devel@nongnu.org; 
Tue, 20 Jun 2023 05:39:03 -0400
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 43/46] target/loongarch: Implement xvpack xvpick xvilv{l/h}
Date: Tue, 20 Jun 2023 17:38:11 +0800
Message-Id: <20230620093814.123650-44-gaosong@loongson.cn>
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

This patch includes:
- XVPACK{EV/OD}.{B/H/W/D};
- XVPICK{EV/OD}.{B/H/W/D};
- XVILV{L/H}.{B/H/W/D}.
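
As a reading aid, the per-128-bit-half behaviour of XVILVL.W can be
modelled in standalone C. This is only a minimal sketch that mirrors the
XVILVL helper macro in this patch; the ModelXReg type, the MODEL_WORDS
constant and the model_xvilvl_w name are hypothetical stand-ins and are
not QEMU definitions.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_WORDS 8   /* 256-bit register viewed as eight 32-bit elements */

/* Hypothetical stand-in for a LASX register; this is not QEMU's XReg. */
typedef struct {
    uint32_t w[MODEL_WORDS];
} ModelXReg;

/*
 * Model of XVILVL.W: within each 128-bit half, interleave the two low
 * words of xk (even result slots) with the two low words of xj (odd slots).
 */
void model_xvilvl_w(ModelXReg *xd, const ModelXReg *xj, const ModelXReg *xk)
{
    const int max = MODEL_WORDS / 4;    /* words taken per source, per half */
    ModelXReg temp;                     /* temp lets xd alias xj or xk */

    assert(xd && xj && xk);
    for (int i = 0; i < max; i++) {
        temp.w[2 * i + 1] = xj->w[i];
        temp.w[2 * i] = xk->w[i];
        temp.w[2 * i + 1 + max * 2] = xj->w[i + max * 2];
        temp.w[2 * i + max * 2] = xk->w[i + max * 2];
    }
    *xd = temp;
}
```

With xj = {10, ..., 17} and xk = {20, ..., 27}, the model yields
{20, 10, 21, 11, 24, 14, 25, 15}: each 128-bit half is interleaved
independently and elements never cross the half boundary.
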
Signed-off-by: Song Gao --- target/loongarch/disas.c | 27 ++++ target/loongarch/helper.h | 27 ++++ target/loongarch/insn_trans/trans_lasx.c.inc | 27 ++++ target/loongarch/insns.decode | 27 ++++ target/loongarch/lasx_helper.c | 144 +++++++++++++++++++ 5 files changed, 252 insertions(+) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 3b89a5df87..4b815c86b8 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2547,3 +2547,30 @@ INSN_LASX(xvpickve_d, xx_i) INSN_LASX(xvbsll_v, xx_i) INSN_LASX(xvbsrl_v, xx_i) + +INSN_LASX(xvpackev_b, xxx) +INSN_LASX(xvpackev_h, xxx) +INSN_LASX(xvpackev_w, xxx) +INSN_LASX(xvpackev_d, xxx) +INSN_LASX(xvpackod_b, xxx) +INSN_LASX(xvpackod_h, xxx) +INSN_LASX(xvpackod_w, xxx) +INSN_LASX(xvpackod_d, xxx) + +INSN_LASX(xvpickev_b, xxx) +INSN_LASX(xvpickev_h, xxx) +INSN_LASX(xvpickev_w, xxx) +INSN_LASX(xvpickev_d, xxx) +INSN_LASX(xvpickod_b, xxx) +INSN_LASX(xvpickod_h, xxx) +INSN_LASX(xvpickod_w, xxx) +INSN_LASX(xvpickod_d, xxx) + +INSN_LASX(xvilvl_b, xxx) +INSN_LASX(xvilvl_h, xxx) +INSN_LASX(xvilvl_w, xxx) +INSN_LASX(xvilvl_d, xxx) +INSN_LASX(xvilvh_b, xxx) +INSN_LASX(xvilvh_h, xxx) +INSN_LASX(xvilvh_w, xxx) +INSN_LASX(xvilvh_d, xxx) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 6c4525a413..dc5ab59f8e 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -1237,3 +1237,30 @@ DEF_HELPER_4(xvinsve0_w, void, env, i32, i32, i32) DEF_HELPER_4(xvinsve0_d, void, env, i32, i32, i32) DEF_HELPER_4(xvpickve_w, void, env, i32, i32, i32) DEF_HELPER_4(xvpickve_d, void, env, i32, i32, i32) + +DEF_HELPER_4(xvpackev_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvpackev_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvpackev_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvpackev_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvpackod_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvpackod_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvpackod_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvpackod_d, 
void, env, i32, i32, i32) + +DEF_HELPER_4(xvpickev_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvpickev_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvpickev_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvpickev_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvpickod_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvpickod_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvpickod_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvpickod_d, void, env, i32, i32, i32) + +DEF_HELPER_4(xvilvl_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvilvl_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvilvl_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvilvl_d, void, env, i32, i32, i32) +DEF_HELPER_4(xvilvh_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvilvh_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvilvh_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvilvh_d, void, env, i32, i32, i32) diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc index e63b1c67c9..75ac0ae1f1 100644 --- a/target/loongarch/insn_trans/trans_lasx.c.inc +++ b/target/loongarch/insn_trans/trans_lasx.c.inc @@ -3056,3 +3056,30 @@ static bool trans_xvbsrl_v(DisasContext *ctx, arg_xx_i *a) return true; } + +TRANS(xvpackev_b, gen_xxx, gen_helper_xvpackev_b) +TRANS(xvpackev_h, gen_xxx, gen_helper_xvpackev_h) +TRANS(xvpackev_w, gen_xxx, gen_helper_xvpackev_w) +TRANS(xvpackev_d, gen_xxx, gen_helper_xvpackev_d) +TRANS(xvpackod_b, gen_xxx, gen_helper_xvpackod_b) +TRANS(xvpackod_h, gen_xxx, gen_helper_xvpackod_h) +TRANS(xvpackod_w, gen_xxx, gen_helper_xvpackod_w) +TRANS(xvpackod_d, gen_xxx, gen_helper_xvpackod_d) + +TRANS(xvpickev_b, gen_xxx, gen_helper_xvpickev_b) +TRANS(xvpickev_h, gen_xxx, gen_helper_xvpickev_h) +TRANS(xvpickev_w, gen_xxx, gen_helper_xvpickev_w) +TRANS(xvpickev_d, gen_xxx, gen_helper_xvpickev_d) +TRANS(xvpickod_b, gen_xxx, gen_helper_xvpickod_b) +TRANS(xvpickod_h, gen_xxx, gen_helper_xvpickod_h) +TRANS(xvpickod_w, gen_xxx, gen_helper_xvpickod_w) +TRANS(xvpickod_d, gen_xxx, 
gen_helper_xvpickod_d) + +TRANS(xvilvl_b, gen_xxx, gen_helper_xvilvl_b) +TRANS(xvilvl_h, gen_xxx, gen_helper_xvilvl_h) +TRANS(xvilvl_w, gen_xxx, gen_helper_xvilvl_w) +TRANS(xvilvl_d, gen_xxx, gen_helper_xvilvl_d) +TRANS(xvilvh_b, gen_xxx, gen_helper_xvilvh_b) +TRANS(xvilvh_h, gen_xxx, gen_helper_xvilvh_h) +TRANS(xvilvh_w, gen_xxx, gen_helper_xvilvh_w) +TRANS(xvilvh_d, gen_xxx, gen_helper_xvilvh_d) diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode index 697087e6ef..5c3a18fbe2 100644 --- a/target/loongarch/insns.decode +++ b/target/loongarch/insns.decode @@ -2051,3 +2051,30 @@ xvpickve_d 0111 01110000 00111 110 .. ..... ..... @xx_ui2 xvbsll_v 0111 01101000 11100 ..... ..... ..... @xx_ui5 xvbsrl_v 0111 01101000 11101 ..... ..... ..... @xx_ui5 + +xvpackev_b 0111 01010001 01100 ..... ..... ..... @xxx +xvpackev_h 0111 01010001 01101 ..... ..... ..... @xxx +xvpackev_w 0111 01010001 01110 ..... ..... ..... @xxx +xvpackev_d 0111 01010001 01111 ..... ..... ..... @xxx +xvpackod_b 0111 01010001 10000 ..... ..... ..... @xxx +xvpackod_h 0111 01010001 10001 ..... ..... ..... @xxx +xvpackod_w 0111 01010001 10010 ..... ..... ..... @xxx +xvpackod_d 0111 01010001 10011 ..... ..... ..... @xxx + +xvpickev_b 0111 01010001 11100 ..... ..... ..... @xxx +xvpickev_h 0111 01010001 11101 ..... ..... ..... @xxx +xvpickev_w 0111 01010001 11110 ..... ..... ..... @xxx +xvpickev_d 0111 01010001 11111 ..... ..... ..... @xxx +xvpickod_b 0111 01010010 00000 ..... ..... ..... @xxx +xvpickod_h 0111 01010010 00001 ..... ..... ..... @xxx +xvpickod_w 0111 01010010 00010 ..... ..... ..... @xxx +xvpickod_d 0111 01010010 00011 ..... ..... ..... @xxx + +xvilvl_b 0111 01010001 10100 ..... ..... ..... @xxx +xvilvl_h 0111 01010001 10101 ..... ..... ..... @xxx +xvilvl_w 0111 01010001 10110 ..... ..... ..... @xxx +xvilvl_d 0111 01010001 10111 ..... ..... ..... @xxx +xvilvh_b 0111 01010001 11000 ..... ..... ..... @xxx +xvilvh_h 0111 01010001 11001 ..... ..... ..... 
@xxx +xvilvh_w 0111 01010001 11010 ..... ..... ..... @xxx +xvilvh_d 0111 01010001 11011 ..... ..... ..... @xxx diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c index 4422c1292e..50991998bf 100644 --- a/target/loongarch/lasx_helper.c +++ b/target/loongarch/lasx_helper.c @@ -2848,3 +2848,147 @@ void HELPER(NAME)(CPULoongArchState *env, \ XVPICKVE(xvpickve_w, XW, 32, 0x7) XVPICKVE(xvpickve_d, XD, 64, 0x3) + +#define XVPACKEV(NAME, BIT, E) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + for (i = 0; i < LASX_LEN / (BIT * 2); i++) { \ + temp.E(2 * i + 1) = Xj->E(2 * i); \ + temp.E(2 * i) = Xk->E(2 * i); \ + } \ + *Xd = temp; \ +} + +XVPACKEV(xvpackev_b, 8, XB) +XVPACKEV(xvpackev_h, 16, XH) +XVPACKEV(xvpackev_w, 32, XW) +XVPACKEV(xvpackev_d, 64, XD) + +#define XVPACKOD(NAME, BIT, E) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + for (i = 0; i < LASX_LEN / (BIT * 2); i++) { \ + temp.E(2 * i + 1) = Xj->E(2 * i + 1); \ + temp.E(2 * i) = Xk->E(2 * i + 1); \ + } \ + *Xd = temp; \ +} + +XVPACKOD(xvpackod_b, 8, XB) +XVPACKOD(xvpackod_h, 16, XH) +XVPACKOD(xvpackod_w, 32, XW) +XVPACKOD(xvpackod_d, 64, XD) + +#define XVPICKEV(NAME, BIT, E) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 4); \ + for (i = 0; i < max; i++) { \ + temp.E(i + max) = Xj->E(2 * i); \ + temp.E(i) = Xk->E(2 * i); \ + temp.E(i + max * 3) = Xj->E(2 * i + max * 2); \ + temp.E(i + max * 2) = 
Xk->E(2 * i + max * 2); \ + } \ + *Xd = temp; \ +} + +XVPICKEV(xvpickev_b, 8, XB) +XVPICKEV(xvpickev_h, 16, XH) +XVPICKEV(xvpickev_w, 32, XW) +XVPICKEV(xvpickev_d, 64, XD) + +#define XVPICKOD(NAME, BIT, E) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 4); \ + for (i = 0; i < max; i++) { \ + temp.E(i + max) = Xj->E(2 * i + 1); \ + temp.E(i) = Xk->E(2 * i + 1); \ + temp.E(i + max * 3) = Xj->E(2 * i + 1 + max * 2); \ + temp.E(i + max * 2) = Xk->E(2 * i + 1 + max * 2); \ + } \ + *Xd = temp; \ +} + +XVPICKOD(xvpickod_b, 8, XB) +XVPICKOD(xvpickod_h, 16, XH) +XVPICKOD(xvpickod_w, 32, XW) +XVPICKOD(xvpickod_d, 64, XD) + +#define XVILVL(NAME, BIT, E) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 4); \ + for (i = 0; i < max; i++) { \ + temp.E(2 * i + 1) = Xj->E(i); \ + temp.E(2 * i) = Xk->E(i); \ + temp.E(2 * i + 1 + max * 2) = Xj->E(i + max * 2); \ + temp.E(2 * i + max * 2) = Xk->E(i + max * 2); \ + } \ + *Xd = temp; \ +} + +XVILVL(xvilvl_b, 8, XB) +XVILVL(xvilvl_h, 16, XH) +XVILVL(xvilvl_w, 32, XW) +XVILVL(xvilvl_d, 64, XD) + +#define XVILVH(NAME, BIT, E) \ +void HELPER(NAME)(CPULoongArchState *env, \ + uint32_t xd, uint32_t xj, uint32_t xk) \ +{ \ + int i, max; \ + XReg temp; \ + XReg *Xd = &(env->fpr[xd].xreg); \ + XReg *Xj = &(env->fpr[xj].xreg); \ + XReg *Xk = &(env->fpr[xk].xreg); \ + \ + max = LASX_LEN / (BIT * 4); \ + for (i = 0; i < max; i++) { \ + temp.E(2 * i + 1) = Xj->E(i + max); \ + temp.E(2 * i) = Xk->E(i + max); \ + temp.E(2 * i + 1 + max * 2) = Xj->E(i + max * 3); \ + temp.E(2 * i + max * 2) = Xk->E(i + max * 3); \ + } \ + 
*Xd = temp; \ +} + +XVILVH(xvilvh_b, 8, XB) +XVILVH(xvilvh_h, 16, XH) +XVILVH(xvilvh_w, 32, XW) +XVILVH(xvilvh_d, 64, XD) From patchwork Tue Jun 20 09:38:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13285471 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C47EDEB64D8 for ; Tue, 20 Jun 2023 09:41:16 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqY-0003K9-Oq; Tue, 20 Jun 2023 05:40:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qBXqV-0002sM-8f for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:19 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qBXqM-0006aR-PK for qemu-devel@nongnu.org; Tue, 20 Jun 2023 05:40:17 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxEemjc5Fk5iUHAA--.12645S3; Tue, 20 Jun 2023 17:38:43 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxduSGc5FkzIQhAA--.28394S46; Tue, 20 Jun 2023 17:38:42 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org Subject: [PATCH v1 44/46] target/loongarch: Implement xvshuf xvperm{i} xvshuf4i xvextrins Date: Tue, 20 Jun 2023 17:38:12 +0800 Message-Id: <20230620093814.123650-45-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn> References: 
<20230620093814.123650-1-gaosong@loongson.cn>

This patch includes:
- XVSHUF.{B/H/W/D};
- XVPERM.W;
- XVSHUF4I.{B/H/W/D};
- XVPERMI.{W/D/Q};
- XVEXTRINS.{B/H/W/D}.
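
As a reading aid, the element selection done by XVPERMI.D can be modelled
in standalone C. This is only a minimal sketch that mirrors the
xvpermi_d helper in this patch; the ModelXRegD type and the
model_xvpermi_d name are hypothetical stand-ins and are not QEMU
definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a LASX register viewed as four 64-bit elements. */
typedef struct {
    uint64_t d[4];
} ModelXRegD;

/*
 * Model of XVPERMI.D: result element n is the source element selected by
 * bits [2n+1:2n] of the 8-bit immediate.
 */
void model_xvpermi_d(ModelXRegD *xd, const ModelXRegD *xj, uint32_t imm)
{
    ModelXRegD temp;    /* temp lets xd alias xj */

    assert(xd && xj);
    for (int n = 0; n < 4; n++) {
        temp.d[n] = xj->d[(imm >> (2 * n)) & 0x3];
    }
    *xd = temp;
}
```

In this model an immediate of 0x1b reverses the four doublewords, and an
immediate of 0x00 broadcasts element 0 to every position.
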
Signed-off-by: Song Gao --- target/loongarch/disas.c | 21 +++ target/loongarch/helper.h | 21 +++ target/loongarch/insn_trans/trans_lasx.c.inc | 21 +++ target/loongarch/insns.decode | 21 +++ target/loongarch/lasx_helper.c | 168 +++++++++++++++++++ target/loongarch/lsx_helper.c | 3 +- target/loongarch/vec.h | 2 + 7 files changed, 255 insertions(+), 2 deletions(-) diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c index 4b815c86b8..9af1c95641 100644 --- a/target/loongarch/disas.c +++ b/target/loongarch/disas.c @@ -2574,3 +2574,24 @@ INSN_LASX(xvilvh_b, xxx) INSN_LASX(xvilvh_h, xxx) INSN_LASX(xvilvh_w, xxx) INSN_LASX(xvilvh_d, xxx) + +INSN_LASX(xvshuf_b, xxxx) +INSN_LASX(xvshuf_h, xxx) +INSN_LASX(xvshuf_w, xxx) +INSN_LASX(xvshuf_d, xxx) + +INSN_LASX(xvperm_w, xxx) + +INSN_LASX(xvshuf4i_b, xx_i) +INSN_LASX(xvshuf4i_h, xx_i) +INSN_LASX(xvshuf4i_w, xx_i) +INSN_LASX(xvshuf4i_d, xx_i) + +INSN_LASX(xvpermi_w, xx_i) +INSN_LASX(xvpermi_d, xx_i) +INSN_LASX(xvpermi_q, xx_i) + +INSN_LASX(xvextrins_d, xx_i) +INSN_LASX(xvextrins_w, xx_i) +INSN_LASX(xvextrins_h, xx_i) +INSN_LASX(xvextrins_b, xx_i) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index dc5ab59f8e..1058a7de75 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -1264,3 +1264,24 @@ DEF_HELPER_4(xvilvh_b, void, env, i32, i32, i32) DEF_HELPER_4(xvilvh_h, void, env, i32, i32, i32) DEF_HELPER_4(xvilvh_w, void, env, i32, i32, i32) DEF_HELPER_4(xvilvh_d, void, env, i32, i32, i32) + +DEF_HELPER_5(xvshuf_b, void, env, i32, i32, i32, i32) +DEF_HELPER_4(xvshuf_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvshuf_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvshuf_d, void, env, i32, i32, i32) + +DEF_HELPER_4(xvperm_w, void, env, i32, i32, i32) + +DEF_HELPER_4(xvshuf4i_b, void, env, i32, i32, i32) +DEF_HELPER_4(xvshuf4i_h, void, env, i32, i32, i32) +DEF_HELPER_4(xvshuf4i_w, void, env, i32, i32, i32) +DEF_HELPER_4(xvshuf4i_d, void, env, i32, i32, i32) + +DEF_HELPER_4(xvpermi_w, void, 
env, i32, i32, i32)
+DEF_HELPER_4(xvpermi_d, void, env, i32, i32, i32)
+DEF_HELPER_4(xvpermi_q, void, env, i32, i32, i32)
+
+DEF_HELPER_4(xvextrins_d, void, env, i32, i32, i32)
+DEF_HELPER_4(xvextrins_w, void, env, i32, i32, i32)
+DEF_HELPER_4(xvextrins_h, void, env, i32, i32, i32)
+DEF_HELPER_4(xvextrins_b, void, env, i32, i32, i32)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 75ac0ae1f1..1344f75113 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -3083,3 +3083,24 @@ TRANS(xvilvh_b, gen_xxx, gen_helper_xvilvh_b)
 TRANS(xvilvh_h, gen_xxx, gen_helper_xvilvh_h)
 TRANS(xvilvh_w, gen_xxx, gen_helper_xvilvh_w)
 TRANS(xvilvh_d, gen_xxx, gen_helper_xvilvh_d)
+
+TRANS(xvshuf_b, gen_xxxx, gen_helper_xvshuf_b)
+TRANS(xvshuf_h, gen_xxx, gen_helper_xvshuf_h)
+TRANS(xvshuf_w, gen_xxx, gen_helper_xvshuf_w)
+TRANS(xvshuf_d, gen_xxx, gen_helper_xvshuf_d)
+
+TRANS(xvperm_w, gen_xxx, gen_helper_xvperm_w)
+
+TRANS(xvshuf4i_b, gen_xx_i, gen_helper_xvshuf4i_b)
+TRANS(xvshuf4i_h, gen_xx_i, gen_helper_xvshuf4i_h)
+TRANS(xvshuf4i_w, gen_xx_i, gen_helper_xvshuf4i_w)
+TRANS(xvshuf4i_d, gen_xx_i, gen_helper_xvshuf4i_d)
+
+TRANS(xvpermi_w, gen_xx_i, gen_helper_xvpermi_w)
+TRANS(xvpermi_d, gen_xx_i, gen_helper_xvpermi_d)
+TRANS(xvpermi_q, gen_xx_i, gen_helper_xvpermi_q)
+
+TRANS(xvextrins_b, gen_xx_i, gen_helper_xvextrins_b)
+TRANS(xvextrins_h, gen_xx_i, gen_helper_xvextrins_h)
+TRANS(xvextrins_w, gen_xx_i, gen_helper_xvextrins_w)
+TRANS(xvextrins_d, gen_xx_i, gen_helper_xvextrins_d)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 5c3a18fbe2..9c6a6037e9 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -2078,3 +2078,24 @@ xvilvh_b 0111 01010001 11000 ..... ..... ..... @xxx
 xvilvh_h         0111 01010001 11001 ..... ..... .....    @xxx
 xvilvh_w         0111 01010001 11010 ..... ..... .....    @xxx
 xvilvh_d         0111 01010001 11011 ..... ..... .....    @xxx
+
+xvshuf_b         0000 11010110 ..... ..... ..... .....    @xxxx
+xvshuf_h         0111 01010111 10101 ..... ..... .....    @xxx
+xvshuf_w         0111 01010111 10110 ..... ..... .....    @xxx
+xvshuf_d         0111 01010111 10111 ..... ..... .....    @xxx
+
+xvperm_w         0111 01010111 11010 ..... ..... .....    @xxx
+
+xvshuf4i_b       0111 01111001 00 ........ ..... .....    @xx_ui8
+xvshuf4i_h       0111 01111001 01 ........ ..... .....    @xx_ui8
+xvshuf4i_w       0111 01111001 10 ........ ..... .....    @xx_ui8
+xvshuf4i_d       0111 01111001 11 ........ ..... .....    @xx_ui8
+
+xvpermi_w        0111 01111110 01 ........ ..... .....    @xx_ui8
+xvpermi_d        0111 01111110 10 ........ ..... .....    @xx_ui8
+xvpermi_q        0111 01111110 11 ........ ..... .....    @xx_ui8
+
+xvextrins_d      0111 01111000 00 ........ ..... .....    @xx_ui8
+xvextrins_w      0111 01111000 01 ........ ..... .....    @xx_ui8
+xvextrins_h      0111 01111000 10 ........ ..... .....    @xx_ui8
+xvextrins_b      0111 01111000 11 ........ ..... .....    @xx_ui8
diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c
index 50991998bf..a0338dfa6d 100644
--- a/target/loongarch/lasx_helper.c
+++ b/target/loongarch/lasx_helper.c
@@ -2992,3 +2992,171 @@ XVILVH(xvilvh_b, 8, XB)
 XVILVH(xvilvh_h, 16, XH)
 XVILVH(xvilvh_w, 32, XW)
 XVILVH(xvilvh_d, 64, XD)
+
+void HELPER(xvshuf_b)(CPULoongArchState *env,
+                      uint32_t xd, uint32_t xj, uint32_t xk, uint32_t xa)
+{
+    int i, m;
+    XReg temp;
+    XReg *Xd = &(env->fpr[xd].xreg);
+    XReg *Xj = &(env->fpr[xj].xreg);
+    XReg *Xk = &(env->fpr[xk].xreg);
+    XReg *Xa = &(env->fpr[xa].xreg);
+
+    m = LASX_LEN / (8 * 2);
+    for (i = 0; i < 2 * m ; i++) {
+        uint64_t k = (uint8_t)Xa->XB(i) % (2 * m);
+        if (i < m) {
+            temp.XB(i) = k < m ? Xk->XB(k) : Xj->XB(k - m);
+        } else {
+            temp.XB(i) = k < m ? Xk->XB(k + m) : Xj->XB(k);
+        }
+    }
+    *Xd = temp;
+}
+
+#define XVSHUF(NAME, BIT, E) \
+void HELPER(NAME)(CPULoongArchState *env, \
+                  uint32_t xd, uint32_t xj, uint32_t xk) \
+{ \
+    int i, m; \
+    XReg temp; \
+    XReg *Xd = &(env->fpr[xd].xreg); \
+    XReg *Xj = &(env->fpr[xj].xreg); \
+    XReg *Xk = &(env->fpr[xk].xreg); \
+ \
+    m = LASX_LEN / (BIT * 2); \
+    for (i = 0; i < m * 2; i++) { \
+        uint64_t k = (uint8_t)Xd->E(i) % (2 * m); \
+        if (i < m) { \
+            temp.E(i) = k < m ? Xk->E(k) : Xj->E(k - m); \
+        } else { \
+            temp.E(i) = k < m ? Xk->E(k + m) : Xj->E(k); \
+        } \
+    } \
+    *Xd = temp; \
+}
+
+XVSHUF(xvshuf_h, 16, XH)
+XVSHUF(xvshuf_w, 32, XW)
+XVSHUF(xvshuf_d, 64, XD)
+
+void HELPER(xvperm_w)(CPULoongArchState *env,
+                      uint32_t xd, uint32_t xj, uint32_t xk)
+{
+    int i, m;
+    XReg temp;
+    XReg *Xd = &(env->fpr[xd].xreg);
+    XReg *Xj = &(env->fpr[xj].xreg);
+    XReg *Xk = &(env->fpr[xk].xreg);
+
+    m = LASX_LEN / 32;
+    for (i = 0; i < m ; i++) {
+        uint64_t k = (uint8_t)Xk->XW(i) % 8;
+        temp.XW(i) = Xj->XW(k);
+    }
+    *Xd = temp;
+}
+
+#define XVSHUF4I(NAME, BIT, E) \
+void HELPER(NAME)(CPULoongArchState *env, \
+                  uint32_t xd, uint32_t xj, uint32_t imm) \
+{ \
+    int i, m; \
+    XReg temp; \
+    XReg *Xd = &(env->fpr[xd].xreg); \
+    XReg *Xj = &(env->fpr[xj].xreg); \
+ \
+    m = LASX_LEN / BIT; \
+    for (i = 0; i < m; i++) { \
+        if (i < (m / 2)) { \
+            temp.E(i) = Xj->E(SHF_POS(i, imm)); \
+        } else { \
+            temp.E(i) = Xj->E(SHF_POS(i - (m / 2), imm) + (m / 2)); \
+        } \
+    } \
+    *Xd = temp; \
+}
+
+XVSHUF4I(xvshuf4i_b, 8, XB)
+XVSHUF4I(xvshuf4i_h, 16, XH)
+XVSHUF4I(xvshuf4i_w, 32, XW)
+
+void HELPER(xvshuf4i_d)(CPULoongArchState *env,
+                        uint32_t xd, uint32_t xj, uint32_t imm)
+{
+    XReg *Xd = &(env->fpr[xd].xreg);
+    XReg *Xj = &(env->fpr[xj].xreg);
+
+    XReg temp;
+    temp.XD(0) = (imm & 2 ? Xj : Xd)->XD(imm & 1);
+    temp.XD(1) = (imm & 8 ? Xj : Xd)->XD((imm >> 2) & 1);
+    temp.XD(2) = (imm & 2 ? Xj : Xd)->XD((imm & 1) + 2);
+    temp.XD(3) = (imm & 8 ? Xj : Xd)->XD(((imm >> 2) & 1) + 2);
+    *Xd = temp;
+}
+
+void HELPER(xvpermi_w)(CPULoongArchState *env,
+                       uint32_t xd, uint32_t xj, uint32_t imm)
+{
+    XReg temp;
+    XReg *Xd = &(env->fpr[xd].xreg);
+    XReg *Xj = &(env->fpr[xj].xreg);
+
+    temp.XW(0) = Xj->XW(imm & 0x3);
+    temp.XW(1) = Xj->XW((imm >> 2) & 0x3);
+    temp.XW(2) = Xd->XW((imm >> 4) & 0x3);
+    temp.XW(3) = Xd->XW((imm >> 6) & 0x3);
+    temp.XW(4) = Xj->XW((imm & 0x3) + 4);
+    temp.XW(5) = Xj->XW(((imm >> 2) & 0x3) + 4);
+    temp.XW(6) = Xd->XW(((imm >> 4) & 0x3) + 4);
+    temp.XW(7) = Xd->XW(((imm >> 6) & 0x3) + 4);
+    *Xd = temp;
+}
+
+void HELPER(xvpermi_d)(CPULoongArchState *env,
+                       uint32_t xd, uint32_t xj, uint32_t imm)
+{
+    XReg temp;
+    XReg *Xd = &(env->fpr[xd].xreg);
+    XReg *Xj = &(env->fpr[xj].xreg);
+
+    temp.XD(0) = Xj->XD(imm & 0x3);
+    temp.XD(1) = Xj->XD((imm >> 2) & 0x3);
+    temp.XD(2) = Xj->XD((imm >> 4) & 0x3);
+    temp.XD(3) = Xj->XD((imm >> 6) & 0x3);
+    *Xd = temp;
+}
+
+void HELPER(xvpermi_q)(CPULoongArchState *env,
+                       uint32_t xd, uint32_t xj, uint32_t imm)
+{
+    XReg temp;
+    XReg *Xd = &(env->fpr[xd].xreg);
+    XReg *Xj = &(env->fpr[xj].xreg);
+
+    temp.XQ(0) = (imm & 0x3) > 1 ? Xd->XQ((imm & 0x3) - 2) : Xj->XQ(imm & 0x3);
+    temp.XQ(1) = ((imm >> 4) & 0x3) > 1 ? Xd->XQ(((imm >> 4) & 0x3) - 2) :
+                 Xj->XQ((imm >> 4) & 0x3);
+    *Xd = temp;
+}
+
+#define XVEXTRINS(NAME, BIT, E, MASK) \
+void HELPER(NAME)(CPULoongArchState *env, \
+                  uint32_t xd, uint32_t xj, uint32_t imm) \
+{ \
+    int ins, extr, m; \
+    XReg *Xd = &(env->fpr[xd].xreg); \
+    XReg *Xj = &(env->fpr[xj].xreg); \
+ \
+    m = LASX_LEN / (BIT * 2); \
+    ins = (imm >> 4) & MASK; \
+    extr = imm & MASK; \
+    Xd->E(ins) = Xj->E(extr); \
+    Xd->E(ins + m) = Xj->E(extr + m); \
+}
+
+XVEXTRINS(xvextrins_b, 8, XB, 0xf)
+XVEXTRINS(xvextrins_h, 16, XH, 0x7)
+XVEXTRINS(xvextrins_w, 32, XW, 0x3)
+XVEXTRINS(xvextrins_d, 64, XD, 0x1)
diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/lsx_helper.c
index 00c9835948..c40e0d65ca 100644
--- a/target/loongarch/lsx_helper.c
+++ b/target/loongarch/lsx_helper.c
@@ -2909,8 +2909,7 @@ void HELPER(NAME)(CPULoongArchState *env, \
     VReg *Vj = &(env->fpr[vj].vreg); \
  \
     for (i = 0; i < LSX_LEN/BIT; i++) { \
-        temp.E(i) = Vj->E(((i) & 0xfc) + (((imm) >> \
-                           (2 * ((i) & 0x03))) & 0x03)); \
+        temp.E(i) = Vj->E(SHF_POS(i, imm)); \
     } \
     *Vd = temp; \
 }
diff --git a/target/loongarch/vec.h b/target/loongarch/vec.h
index cfac1c0e1c..09d070a865 100644
--- a/target/loongarch/vec.h
+++ b/target/loongarch/vec.h
@@ -96,6 +96,8 @@
 #define VSLE(a, b) (a <= b ? -1 : 0)
 #define VSLT(a, b) (a < b ? -1 : 0)
 
+#define SHF_POS(i, imm) (((i) & 0xfc) + (((imm) >> (2 * ((i) & 0x03))) & 0x03))
+
 uint64_t do_vmskltz_b(int64_t val);
 uint64_t do_vmskltz_h(int64_t val);
 uint64_t do_vmskltz_w(int64_t val);

From patchwork Tue Jun 20 09:38:13 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285509
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 45/46] target/loongarch: Implement xvld xvst
Date: Tue, 20 Jun 2023 17:38:13 +0800
Message-Id: <20230620093814.123650-46-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

This patch includes:
- XVLD[X], XVST[X];
- XVLDREPL.{B/H/W/D};
- XVSTELM.{B/H/W/D}.

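For readers new to these instruction groups, the list above can be read against a rough C model. This is only a sketch under stated assumptions: the register is modelled as 32 raw bytes in little-endian lane order, the function names `model_xvldrepl` and `model_xvstelm` are hypothetical and do not exist in QEMU, and address calculation, alignment and fault handling are left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { LASX_BYTES = 32 };   /* one 256-bit LASX register as raw bytes */

/*
 * Hypothetical model of XVLDREPL.{B/H/W/D}: read one element of `esz`
 * bytes from `mem` and replicate it across every lane of the register.
 */
static void model_xvldrepl(uint8_t xd[LASX_BYTES], const uint8_t *mem,
                           size_t esz)
{
    assert(esz == 1 || esz == 2 || esz == 4 || esz == 8);
    for (size_t off = 0; off < LASX_BYTES; off += esz) {
        memcpy(xd + off, mem, esz);
    }
}

/*
 * Hypothetical model of XVSTELM.{B/H/W/D}: write only element `idx` of
 * the source register to `mem`; the other lanes are not stored.
 */
static void model_xvstelm(const uint8_t xj[LASX_BYTES], uint8_t *mem,
                          size_t esz, unsigned idx)
{
    assert(esz == 1 || esz == 2 || esz == 4 || esz == 8);
    assert(idx < LASX_BYTES / esz);
    memcpy(mem, xj + (size_t)idx * esz, esz);
}
```

In this model, replicating the four bytes 11 22 33 44 with `esz = 4` fills all eight word lanes with the same value, and `model_xvstelm(reg, out, 2, 5)` copies bytes 10 and 11 of the register to `out`. XVLD[X] and XVST[X] are the plain 32-byte load and store and need no separate model.
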
Signed-off-by: Song Gao
---
 target/loongarch/disas.c                     | 24 +++++
 target/loongarch/helper.h                    |  3 +
 target/loongarch/insn_trans/trans_lasx.c.inc | 97 ++++++++++++++++++++
 target/loongarch/insns.decode                | 25 +++++
 target/loongarch/lasx_helper.c               | 59 ++++++++++++
 5 files changed, 208 insertions(+)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 9af1c95641..4403669047 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -1753,6 +1753,16 @@ static void output_xxr(DisasContext *ctx, arg_xxr *a, const char *mnemonic)
     output(ctx, mnemonic, "x%d, x%d, r%d", a->xd, a->xj, a->rk);
 }
 
+static void output_xrr(DisasContext *ctx, arg_xrr *a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "x%d, r%d, r%d", a->xd, a->rj, a->rk);
+}
+
+static void output_xr_ii(DisasContext *ctx, arg_xr_ii *a, const char *mnemonic)
+{
+    output(ctx, mnemonic, "x%d, r%d, 0x%x, 0x%x", a->xd, a->rj, a->imm, a->imm2);
+}
+
 INSN_LASX(xvadd_b, xxx)
 INSN_LASX(xvadd_h, xxx)
 INSN_LASX(xvadd_w, xxx)
@@ -2595,3 +2605,17 @@ INSN_LASX(xvextrins_d, xx_i)
 INSN_LASX(xvextrins_w, xx_i)
 INSN_LASX(xvextrins_h, xx_i)
 INSN_LASX(xvextrins_b, xx_i)
+
+INSN_LASX(xvld, xr_i)
+INSN_LASX(xvst, xr_i)
+INSN_LASX(xvldx, xrr)
+INSN_LASX(xvstx, xrr)
+
+INSN_LASX(xvldrepl_d, xr_i)
+INSN_LASX(xvldrepl_w, xr_i)
+INSN_LASX(xvldrepl_h, xr_i)
+INSN_LASX(xvldrepl_b, xr_i)
+INSN_LASX(xvstelm_d, xr_ii)
+INSN_LASX(xvstelm_w, xr_ii)
+INSN_LASX(xvstelm_h, xr_ii)
+INSN_LASX(xvstelm_b, xr_ii)
diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h
index 1058a7de75..adeb181407 100644
--- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h
@@ -1285,3 +1285,6 @@ DEF_HELPER_4(xvextrins_d, void, env, i32, i32, i32)
 DEF_HELPER_4(xvextrins_w, void, env, i32, i32, i32)
 DEF_HELPER_4(xvextrins_h, void, env, i32, i32, i32)
 DEF_HELPER_4(xvextrins_b, void, env, i32, i32, i32)
+
+DEF_HELPER_3(xvld_b, void, env, i32, tl)
+DEF_HELPER_3(xvst_b, void, env, i32, tl)
diff --git a/target/loongarch/insn_trans/trans_lasx.c.inc b/target/loongarch/insn_trans/trans_lasx.c.inc
index 1344f75113..761f227c76 100644
--- a/target/loongarch/insn_trans/trans_lasx.c.inc
+++ b/target/loongarch/insn_trans/trans_lasx.c.inc
@@ -3104,3 +3104,100 @@ TRANS(xvextrins_b, gen_xx_i, gen_helper_xvextrins_b)
 TRANS(xvextrins_h, gen_xx_i, gen_helper_xvextrins_h)
 TRANS(xvextrins_w, gen_xx_i, gen_helper_xvextrins_w)
 TRANS(xvextrins_d, gen_xx_i, gen_helper_xvextrins_d)
+
+static bool gen_lasx_memory(DisasContext *ctx, arg_xr_i * a,
+                            void (*func)(TCGv_ptr, TCGv_i32, TCGv))
+{
+    TCGv_i32 xd = tcg_constant_i32(a->xd);
+    TCGv addr = gpr_src(ctx, a->rj, EXT_NONE);
+    TCGv temp = NULL;
+
+    CHECK_ASXE;
+
+    if (a->imm) {
+        temp = tcg_temp_new();
+        tcg_gen_addi_tl(temp, addr, a->imm);
+        addr = temp;
+    }
+
+    func(cpu_env, xd, addr);
+    return true;
+}
+
+TRANS(xvld, gen_lasx_memory, gen_helper_xvld_b)
+TRANS(xvst, gen_lasx_memory, gen_helper_xvst_b)
+
+static bool gen_lasx_memoryx(DisasContext *ctx, arg_xrr *a,
+                             void (*func)(TCGv_ptr, TCGv_i32, TCGv))
+{
+    TCGv_i32 xd = tcg_constant_i32(a->xd);
+    TCGv src1 = gpr_src(ctx, a->rj, EXT_NONE);
+    TCGv src2 = gpr_src(ctx, a->rk, EXT_NONE);
+    TCGv addr = tcg_temp_new();
+
+    CHECK_ASXE;
+
+    tcg_gen_add_tl(addr, src1, src2);
+    func(cpu_env, xd, addr);
+    return true;
+}
+
+TRANS(xvldx, gen_lasx_memoryx, gen_helper_xvld_b)
+TRANS(xvstx, gen_lasx_memoryx, gen_helper_xvst_b)
+
+#define XVLDREPL(NAME, MO) \
+static bool trans_## NAME(DisasContext *ctx, arg_xr_i * a) \
+{ \
+    TCGv addr, temp; \
+    TCGv_i64 val; \
+ \
+    CHECK_ASXE; \
+ \
+    addr = gpr_src(ctx, a->rj, EXT_NONE); \
+    val = tcg_temp_new_i64(); \
+ \
+    if (a->imm) { \
+        temp = tcg_temp_new(); \
+        tcg_gen_addi_tl(temp, addr, a->imm); \
+        addr = temp; \
+    } \
+ \
+    tcg_gen_qemu_ld_i64(val, addr, ctx->mem_idx, MO); \
+    tcg_gen_gvec_dup_i64(MO, vec_full_offset(a->xd), 32, ctx->vl / 8, val); \
+ \
+    return true; \
+}
+
+XVLDREPL(xvldrepl_b, MO_8)
+XVLDREPL(xvldrepl_h, MO_16)
+XVLDREPL(xvldrepl_w, MO_32)
+XVLDREPL(xvldrepl_d, MO_64)
+
+#define XVSTELM(NAME, MO, E) \
+static bool trans_## NAME(DisasContext *ctx, arg_xr_ii * a) \
+{ \
+    TCGv addr, temp; \
+    TCGv_i64 val; \
+ \
+    CHECK_ASXE; \
+ \
+    addr = gpr_src(ctx, a->rj, EXT_NONE); \
+    val = tcg_temp_new_i64(); \
+ \
+    if (a->imm) { \
+        temp = tcg_temp_new(); \
+        tcg_gen_addi_tl(temp, addr, a->imm); \
+        addr = temp; \
+    } \
+ \
+    tcg_gen_ld_i64(val, cpu_env, \
+                   offsetof(CPULoongArchState, fpr[a->xd].xreg.E(a->imm2))); \
+    tcg_gen_qemu_st_i64(val, addr, ctx->mem_idx, MO); \
+ \
+    return true; \
+}
+
+XVSTELM(xvstelm_b, MO_8, XB)
+XVSTELM(xvstelm_h, MO_16, XH)
+XVSTELM(xvstelm_w, MO_32, XW)
+XVSTELM(xvstelm_d, MO_64, XD)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 9c6a6037e9..b7940e4c23 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -1312,6 +1312,8 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4
 &xr_i         xd rj imm
 &rx_i         rd xj imm
 &xxr          xd xj rk
+&xrr          xd rj rk
+&xr_ii        xd rj imm imm2
 
 #
 # LASX Formats
@@ -1338,6 +1340,15 @@ vstelm_b 0011 000110 .... ........ ..... ..... @vr_i8i4
 @rx_ui3       .... ........ ..... .. imm:3 xj:5 rd:5    &rx_i
 @rx_ui2       .... ........ ..... ... imm:2 xj:5 rd:5    &rx_i
 @xxr          .... ........ ..... rk:5 xj:5 xd:5    &xxr
+@xr_i9        .... ........ . ......... rj:5 xd:5    &xr_i imm=%i9s3
+@xr_i10       .... ........ .......... rj:5 xd:5    &xr_i imm=%i10s2
+@xr_i11       .... ....... ........... rj:5 xd:5    &xr_i imm=%i11s1
+@xr_i12       .... ...... imm:s12 rj:5 xd:5    &xr_i
+@xr_i8i2      .... ........ imm2:2 ........ rj:5 xd:5    &xr_ii imm=%i8s3
+@xr_i8i3      .... ....... imm2:3 ........ rj:5 xd:5    &xr_ii imm=%i8s2
+@xr_i8i4      .... ...... imm2:4 ........ rj:5 xd:5    &xr_ii imm=%i8s1
+@xr_i8i5      .... ..... imm2:5 imm:s8 rj:5 xd:5    &xr_ii
+@xrr          .... ........ ..... rk:5 rj:5 xd:5    &xrr
 
 xvadd_b          0111 01000000 10100 ..... ..... .....    @xxx
 xvadd_h          0111 01000000 10101 ..... ..... .....    @xxx
@@ -2099,3 +2110,17 @@ xvextrins_d 0111 01111000 00 ........ ..... ..... @xx_ui8
 xvextrins_w      0111 01111000 01 ........ ..... .....    @xx_ui8
 xvextrins_h      0111 01111000 10 ........ ..... .....    @xx_ui8
 xvextrins_b      0111 01111000 11 ........ ..... .....    @xx_ui8
+
+xvld             0010 110010 ............ ..... .....    @xr_i12
+xvst             0010 110011 ............ ..... .....    @xr_i12
+xvldx            0011 10000100 10000 ..... ..... .....    @xrr
+xvstx            0011 10000100 11000 ..... ..... .....    @xrr
+
+xvldrepl_d       0011 00100001 0 ......... ..... .....    @xr_i9
+xvldrepl_w       0011 00100010 .......... ..... .....    @xr_i10
+xvldrepl_h       0011 0010010 ........... ..... .....    @xr_i11
+xvldrepl_b       0011 001010 ............ ..... .....    @xr_i12
+xvstelm_d        0011 00110001 .. ........ ..... .....    @xr_i8i2
+xvstelm_w        0011 0011001 ... ........ ..... .....    @xr_i8i3
+xvstelm_h        0011 001101 .... ........ ..... .....    @xr_i8i4
+xvstelm_b        0011 00111 ..... ........ ..... .....    @xr_i8i5
diff --git a/target/loongarch/lasx_helper.c b/target/loongarch/lasx_helper.c
index a0338dfa6d..16346f218c 100644
--- a/target/loongarch/lasx_helper.c
+++ b/target/loongarch/lasx_helper.c
@@ -12,6 +12,9 @@
 #include "fpu/softfloat.h"
 #include "internals.h"
 #include "vec.h"
+#include "tcg/tcg.h"
+#include "exec/cpu_ldst.h"
+#include "tcg/tcg-ldst.h"
 
 #define XDO_ODD_EVEN(NAME, BIT, E1, E2, DO_OP) \
 void HELPER(NAME)(CPULoongArchState *env, \
@@ -3160,3 +3163,59 @@ XVEXTRINS(xvextrins_b, 8, XB, 0xf)
 XVEXTRINS(xvextrins_h, 16, XH, 0x7)
 XVEXTRINS(xvextrins_w, 32, XW, 0x3)
 XVEXTRINS(xvextrins_d, 64, XD, 0x1)
+
+void helper_xvld_b(CPULoongArchState *env, uint32_t xd, target_ulong addr)
+{
+    int i;
+    XReg *Xd = &(env->fpr[xd].xreg);
+#if !defined(CONFIG_USER_ONLY)
+    MemOpIdx oi = make_memop_idx(MO_TE | MO_UNALN, cpu_mmu_index(env, false));
+
+    for (i = 0; i < LASX_LEN / 8; i++) {
+        Xd->XB(i) = helper_ldub_mmu(env, addr + i, oi, GETPC());
+    }
+#else
+    for (i = 0; i < LASX_LEN / 8; i++) {
+        Xd->XB(i) = cpu_ldub_data(env, addr + i);
+    }
+#endif
+}
+
+#define LASX_PAGESPAN(x) \
+    ((((x) & ~TARGET_PAGE_MASK) + (LASX_LEN / 8) - 1) >= TARGET_PAGE_SIZE)
+
+static inline void ensure_lasx_writable_pages(CPULoongArchState *env,
+                                              target_ulong addr,
+                                              int mmu_idx,
+                                              uintptr_t retaddr)
+{
+#ifndef CONFIG_USER_ONLY
+    /* FIXME: Probe the actual accesses (pass and use a size) */
+    if (unlikely(LASX_PAGESPAN(addr))) {
+        /* first page */
+        probe_write(env, addr, 0, mmu_idx, retaddr);
+        /* second page */
+        addr = (addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
+        probe_write(env, addr, 0, mmu_idx, retaddr);
+    }
+#endif
+}
+
+void helper_xvst_b(CPULoongArchState *env, uint32_t xd, target_ulong addr)
+{
+    int i;
+    XReg *Xd = &(env->fpr[xd].xreg);
+    int mmu_idx = cpu_mmu_index(env, false);
+
+    ensure_lasx_writable_pages(env, addr, mmu_idx, GETPC());
+#if !defined(CONFIG_USER_ONLY)
+    MemOpIdx oi = make_memop_idx(MO_TE | MO_UNALN, mmu_idx);
+    for (i = 0; i < LASX_LEN / 8; i++) {
+        helper_stb_mmu(env, addr + i, Xd->XB(i), oi, GETPC());
+    }
+#else
+    for (i = 0; i < LASX_LEN / 8; i++) {
+        cpu_stb_data(env, addr + i, Xd->XB(i));
+    }
+#endif
+}

From patchwork Tue Jun 20 09:38:14 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Song Gao
X-Patchwork-Id: 13285510
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org
Subject: [PATCH v1 46/46] target/loongarch: CPUCFG support LASX
Date: Tue, 20 Jun 2023 17:38:14 +0800
Message-Id: <20230620093814.123650-47-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230620093814.123650-1-gaosong@loongson.cn>
References: <20230620093814.123650-1-gaosong@loongson.cn>

Signed-off-by: Song Gao
---
 target/loongarch/cpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index c9f9cbb19d..aeccbb42e6 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -392,6 +392,7 @@ static void loongarch_la464_initfn(Object *obj)
     data = FIELD_DP32(data, CPUCFG2, FP_DP, 1);
     data = FIELD_DP32(data, CPUCFG2, FP_VER, 1);
     data = FIELD_DP32(data, CPUCFG2, LSX, 1),
+    data = FIELD_DP32(data, CPUCFG2, LASX, 1);
     data = FIELD_DP32(data, CPUCFG2, LLFTP, 1);
     data = FIELD_DP32(data, CPUCFG2, LLFTP_VER, 1);
     data = FIELD_DP32(data, CPUCFG2, LAM, 1);
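
The hunk above sets a one-bit field in the CPUCFG2 word through FIELD_DP32. As a minimal sketch of what such a deposit amounts to, the plain C below may help. It is not QEMU's macro: `deposit32_sketch` and `cpucfg2_has_lasx_sketch` are hypothetical names, and the bit positions (LSX at bit 6, LASX at bit 7) are an assumption that does not come from this patch and should be checked against the LoongArch reference manual.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a FIELD_DP32-style deposit: replace the
 * `len`-bit field starting at bit `shift` of `word` with the low bits
 * of `val`, leaving every other bit untouched.
 */
static uint32_t deposit32_sketch(uint32_t word, unsigned shift, unsigned len,
                                 uint32_t val)
{
    assert(len >= 1 && len <= 32 && shift <= 32 - len);
    uint32_t mask = (len == 32 ? UINT32_MAX : (UINT32_C(1) << len) - 1) << shift;
    return (word & ~mask) | ((val << shift) & mask);
}

/* Assumed bit positions within CPUCFG word 2; not taken from this patch. */
enum { SKETCH_CPUCFG2_LSX_SHIFT = 6, SKETCH_CPUCFG2_LASX_SHIFT = 7 };

/* Returns 1 when the (assumed) LASX bit is set in a CPUCFG2 value. */
static int cpucfg2_has_lasx_sketch(uint32_t cpucfg2)
{
    return (cpucfg2 >> SKETCH_CPUCFG2_LASX_SHIFT) & 1;
}
```

A guest that probes CPUCFG2 would test the LASX bit in much this way before using any of the xv* instructions added earlier in the series.
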