From patchwork Thu Sep 7 08:08:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13376213 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 04AAFEE14DB for ; Thu, 7 Sep 2023 08:12:42 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA57-0003yM-2L; Thu, 07 Sep 2023 04:09:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeA4v-0003v8-2G for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:29 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA4p-0003Kw-58 for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:28 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxV_EuhflkQTQhAA--.1361S3; Thu, 07 Sep 2023 16:09:18 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxviMthflkXE1wAA--.31585S3; Thu, 07 Sep 2023 16:09:17 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v5 01/57] target/loongarch: Renamed lsx*.c to vec*.c Date: Thu, 7 Sep 2023 16:08:20 +0800 Message-Id: <20230907080916.3974502-2-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230907080916.3974502-1-gaosong@loongson.cn> References: <20230907080916.3974502-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8DxviMthflkXE1wAA--.31585S3 X-CM-SenderInfo:
5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Renamed lsx_helper.c to vec_helper.c and trans_lsx.c.inc to trans_vec.c.inc so that LASX can use them. Signed-off-by: Song Gao --- target/loongarch/translate.c | 2 +- target/loongarch/{lsx_helper.c => vec_helper.c} | 2 +- .../loongarch/insn_trans/{trans_lsx.c.inc => trans_vec.c.inc} | 2 +- target/loongarch/meson.build | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) rename target/loongarch/{lsx_helper.c => vec_helper.c} (99%) rename target/loongarch/insn_trans/{trans_lsx.c.inc => trans_vec.c.inc} (99%) diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c index fd393ed76d..288727181b 100644 --- a/target/loongarch/translate.c +++ b/target/loongarch/translate.c @@ -261,7 +261,7 @@ static uint64_t make_address_pc(DisasContext *ctx, uint64_t addr) #include "insn_trans/trans_fmemory.c.inc" #include "insn_trans/trans_branch.c.inc" #include "insn_trans/trans_privileged.c.inc" -#include "insn_trans/trans_lsx.c.inc" +#include "insn_trans/trans_vec.c.inc" static void loongarch_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs) { diff --git a/target/loongarch/lsx_helper.c b/target/loongarch/vec_helper.c similarity index 99% rename from target/loongarch/lsx_helper.c
rename to target/loongarch/vec_helper.c index 9571f0aef0..73f0974744 100644 --- a/target/loongarch/lsx_helper.c +++ b/target/loongarch/vec_helper.c @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0-or-later */ /* - * QEMU LoongArch LSX helper functions. + * QEMU LoongArch vector helper functions. * * Copyright (c) 2022-2023 Loongson Technology Corporation Limited */ diff --git a/target/loongarch/insn_trans/trans_lsx.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc similarity index 99% rename from target/loongarch/insn_trans/trans_lsx.c.inc rename to target/loongarch/insn_trans/trans_vec.c.inc index 5fbf2718f7..aed5bac5bc 100644 --- a/target/loongarch/insn_trans/trans_lsx.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0-or-later */ /* - * LSX translate functions + * LoongArch vector translate functions * Copyright (c) 2022-2023 Loongson Technology Corporation Limited */ diff --git a/target/loongarch/meson.build b/target/loongarch/meson.build index b7a27df5a9..7fbf045a5d 100644 --- a/target/loongarch/meson.build +++ b/target/loongarch/meson.build @@ -11,7 +11,7 @@ loongarch_tcg_ss.add(files( 'op_helper.c', 'translate.c', 'gdbstub.c', - 'lsx_helper.c', + 'vec_helper.c', )) loongarch_tcg_ss.add(zlib) From patchwork Thu Sep 7 08:08:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13376207 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 38E31EE14D0 for ; Thu, 7 Sep 2023 08:10:37 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA4z-0003v7-ER; Thu, 07 
Sep 2023 04:09:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeA4t-0003ub-5f for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:27 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA4p-0003L3-12 for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:26 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8DxVugvhflkQzQhAA--.31073S3; Thu, 07 Sep 2023 16:09:19 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxviMthflkXE1wAA--.31585S4; Thu, 07 Sep 2023 16:09:18 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v5 02/57] target/loongarch: Implement gvec_*_vl functions Date: Thu, 7 Sep 2023 16:08:21 +0800 Message-Id: <20230907080916.3974502-3-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230907080916.3974502-1-gaosong@loongson.cn> References: <20230907080916.3974502-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8DxviMthflkXE1wAA--.31585S4 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Using gvec_*_vl functions hides oprsz. We can use gvec_v* for oprsz 16 and gvec_x* for oprsz 32. Signed-off-by: Song Gao --- target/loongarch/insn_trans/trans_vec.c.inc | 68 +++++++++++++-------- 1 file changed, 44 insertions(+), 24 deletions(-) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index aed5bac5bc..aeeb2df41c 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -76,34 +76,58 @@ static bool gen_cv(DisasContext *ctx, arg_cv *a, return true; } +static bool gvec_vvv_vl(DisasContext *ctx, arg_vvv *a, + uint32_t oprsz, MemOp mop, + void (*func)(unsigned, uint32_t, uint32_t, + uint32_t, uint32_t, uint32_t)) +{ + uint32_t vd_ofs = vec_full_offset(a->vd); + uint32_t vj_ofs = vec_full_offset(a->vj); + uint32_t vk_ofs = vec_full_offset(a->vk); + + func(mop, vd_ofs, vj_ofs, vk_ofs, oprsz, ctx->vl / 8); + return true; +} + static bool gvec_vvv(DisasContext *ctx, arg_vvv *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t)) { - uint32_t vd_ofs, vj_ofs, vk_ofs; - CHECK_SXE; + return gvec_vvv_vl(ctx, a, 16, mop, func); +} - vd_ofs = vec_full_offset(a->vd); - vj_ofs = vec_full_offset(a->vj); - vk_ofs = vec_full_offset(a->vk); - func(mop, vd_ofs, vj_ofs, vk_ofs, 16, ctx->vl/8); +static bool gvec_vv_vl(DisasContext *ctx, arg_vv *a, + uint32_t oprsz, MemOp mop, + void (*func)(unsigned, uint32_t, uint32_t, + uint32_t, uint32_t)) +{ + uint32_t vd_ofs = vec_full_offset(a->vd); + uint32_t vj_ofs = vec_full_offset(a->vj); + + func(mop, vd_ofs, vj_ofs, oprsz, ctx->vl / 8); return true; } + static bool gvec_vv(DisasContext *ctx, arg_vv *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t)) { - uint32_t vd_ofs, vj_ofs; - CHECK_SXE; + return gvec_vv_vl(ctx, a, 16, mop, func); +} - 
vd_ofs = vec_full_offset(a->vd); - vj_ofs = vec_full_offset(a->vj); +static bool gvec_vv_i_vl(DisasContext *ctx, arg_vv_i *a, + uint32_t oprsz, MemOp mop, + void (*func)(unsigned, uint32_t, uint32_t, + int64_t, uint32_t, uint32_t)) +{ + uint32_t vd_ofs = vec_full_offset(a->vd); + uint32_t vj_ofs = vec_full_offset(a->vj); - func(mop, vd_ofs, vj_ofs, 16, ctx->vl/8); + func(mop, vd_ofs, vj_ofs, a->imm, oprsz, ctx->vl / 8); return true; } @@ -111,28 +135,24 @@ static bool gvec_vv_i(DisasContext *ctx, arg_vv_i *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, int64_t, uint32_t, uint32_t)) { - uint32_t vd_ofs, vj_ofs; - CHECK_SXE; + return gvec_vv_i_vl(ctx, a, 16, mop, func); +} - vd_ofs = vec_full_offset(a->vd); - vj_ofs = vec_full_offset(a->vj); +static bool gvec_subi_vl(DisasContext *ctx, arg_vv_i *a, + uint32_t oprsz, MemOp mop) +{ + uint32_t vd_ofs = vec_full_offset(a->vd); + uint32_t vj_ofs = vec_full_offset(a->vj); - func(mop, vd_ofs, vj_ofs, a->imm , 16, ctx->vl/8); + tcg_gen_gvec_addi(mop, vd_ofs, vj_ofs, -a->imm, oprsz, ctx->vl / 8); return true; } static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, MemOp mop) { - uint32_t vd_ofs, vj_ofs; - CHECK_SXE; - - vd_ofs = vec_full_offset(a->vd); - vj_ofs = vec_full_offset(a->vj); - - tcg_gen_gvec_addi(mop, vd_ofs, vj_ofs, -a->imm, 16, ctx->vl/8); - return true; + return gvec_subi_vl(ctx, a, 16, mop); } TRANS(vadd_b, LSX, gvec_vvv, MO_8, tcg_gen_gvec_add) From patchwork Thu Sep 7 08:08:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13376215 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D4C8DEE14DC for ; Thu, 7 Sep 2023 08:12:44 
+0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA5A-0003yd-N9; Thu, 07 Sep 2023 04:09:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeA4v-0003vE-Va for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:31 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA4p-0003L4-5t for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:29 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxDOsvhflkRzQhAA--.60899S3; Thu, 07 Sep 2023 16:09:19 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxviMthflkXE1wAA--.31585S5; Thu, 07 Sep 2023 16:09:18 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v5 03/57] target/loongarch: Use gen_helper_gvec_4_ptr for 4OP + env vector instructions Date: Thu, 7 Sep 2023 16:08:22 +0800 Message-Id: <20230907080916.3974502-4-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230907080916.3974502-1-gaosong@loongson.cn> References: <20230907080916.3974502-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8DxviMthflkXE1wAA--.31585S5 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Song Gao --- target/loongarch/helper.h | 16 +++++----- target/loongarch/vec_helper.c | 12 +++---- target/loongarch/insn_trans/trans_vec.c.inc | 35 ++++++++++++++++----- 3 files changed, 41 insertions(+), 22 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index ffb1e0b0bf..ead16567c2 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -528,14 +528,14 @@ DEF_HELPER_4(vfmul_d, void, env, i32, i32, i32) DEF_HELPER_4(vfdiv_s, void, env, i32, i32, i32) DEF_HELPER_4(vfdiv_d, void, env, i32, i32, i32) -DEF_HELPER_5(vfmadd_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfmadd_d, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfmsub_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfmsub_d, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfnmadd_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfnmadd_d, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfnmsub_s, void, env, i32, i32, i32, i32) -DEF_HELPER_5(vfnmsub_d, void, env, i32, i32, i32, i32) +DEF_HELPER_FLAGS_6(vfmadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_6(vfmadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_6(vfmsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_6(vfmsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_6(vfnmadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_6(vfnmadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_6(vfnmsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_6(vfnmsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32) DEF_HELPER_4(vfmax_s, void, env, i32, i32, i32) 
DEF_HELPER_4(vfmax_d, void, env, i32, i32, i32) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 73f0974744..3a7a620227 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2129,14 +2129,14 @@ DO_3OP_F(vfmina_s, 32, UW, float32_minnummag) DO_3OP_F(vfmina_d, 64, UD, float64_minnummag) #define DO_4OP_F(NAME, BIT, E, FN, flags) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk, uint32_t va) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, void *va, \ + CPULoongArchState *env, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - VReg *Va = &(env->fpr[va].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + VReg *Va = (VReg *)va; \ \ vec_clear_cause(env); \ for (i = 0; i < LSX_LEN/BIT; i++) { \ diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index aeeb2df41c..85bc8670a7 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -15,6 +15,25 @@ #define CHECK_SXE #endif +static bool gen_vvvv_ptr_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, + gen_helper_gvec_4_ptr *fn) +{ + tcg_gen_gvec_4_ptr(vec_full_offset(a->vd), + vec_full_offset(a->vj), + vec_full_offset(a->vk), + vec_full_offset(a->va), + cpu_env, + oprsz, ctx->vl / 8, oprsz, fn); + return true; +} + +static bool gen_vvvv_ptr(DisasContext *ctx, arg_vvvv *a, + gen_helper_gvec_4_ptr *fn) +{ + CHECK_SXE; + return gen_vvvv_ptr_vl(ctx, a, 16, fn); +} + static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32)) @@ -3634,14 +3653,14 @@ TRANS(vfmul_d, LSX, gen_vvv, gen_helper_vfmul_d) TRANS(vfdiv_s, LSX, gen_vvv, gen_helper_vfdiv_s) TRANS(vfdiv_d, LSX, gen_vvv, gen_helper_vfdiv_d) -TRANS(vfmadd_s, LSX, gen_vvvv, gen_helper_vfmadd_s) 
-TRANS(vfmadd_d, LSX, gen_vvvv, gen_helper_vfmadd_d) -TRANS(vfmsub_s, LSX, gen_vvvv, gen_helper_vfmsub_s) -TRANS(vfmsub_d, LSX, gen_vvvv, gen_helper_vfmsub_d) -TRANS(vfnmadd_s, LSX, gen_vvvv, gen_helper_vfnmadd_s) -TRANS(vfnmadd_d, LSX, gen_vvvv, gen_helper_vfnmadd_d) -TRANS(vfnmsub_s, LSX, gen_vvvv, gen_helper_vfnmsub_s) -TRANS(vfnmsub_d, LSX, gen_vvvv, gen_helper_vfnmsub_d) +TRANS(vfmadd_s, LSX, gen_vvvv_ptr, gen_helper_vfmadd_s) +TRANS(vfmadd_d, LSX, gen_vvvv_ptr, gen_helper_vfmadd_d) +TRANS(vfmsub_s, LSX, gen_vvvv_ptr, gen_helper_vfmsub_s) +TRANS(vfmsub_d, LSX, gen_vvvv_ptr, gen_helper_vfmsub_d) +TRANS(vfnmadd_s, LSX, gen_vvvv_ptr, gen_helper_vfnmadd_s) +TRANS(vfnmadd_d, LSX, gen_vvvv_ptr, gen_helper_vfnmadd_d) +TRANS(vfnmsub_s, LSX, gen_vvvv_ptr, gen_helper_vfnmsub_s) +TRANS(vfnmsub_d, LSX, gen_vvvv_ptr, gen_helper_vfnmsub_d) TRANS(vfmax_s, LSX, gen_vvv, gen_helper_vfmax_s) TRANS(vfmax_d, LSX, gen_vvv, gen_helper_vfmax_d) From patchwork Thu Sep 7 08:08:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13376208 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 18CB1EE14D4 for ; Thu, 7 Sep 2023 08:10:41 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA5B-0003z6-Ox; Thu, 07 Sep 2023 04:09:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeA4v-0003vD-VI for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:31 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 
4.90_1) (envelope-from ) id 1qeA4p-0003L5-9P for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:28 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxHOsvhflkSjQhAA--.61682S3; Thu, 07 Sep 2023 16:09:19 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxviMthflkXE1wAA--.31585S6; Thu, 07 Sep 2023 16:09:19 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v5 04/57] target/loongarch: Use gen_helper_gvec_4 for 4OP vector instructions Date: Thu, 7 Sep 2023 16:08:23 +0800 Message-Id: <20230907080916.3974502-5-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230907080916.3974502-1-gaosong@loongson.cn> References: <20230907080916.3974502-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8DxviMthflkXE1wAA--.31585S6 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Song Gao --- target/loongarch/helper.h | 2 +- target/loongarch/vec_helper.c | 11 +++++------ target/loongarch/insn_trans/trans_vec.c.inc | 22 ++++++++++++--------- 3 files changed, 19 insertions(+), 16 deletions(-) diff --git 
a/target/loongarch/helper.h b/target/loongarch/helper.h index ead16567c2..727ccfb32c 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -682,7 +682,7 @@ DEF_HELPER_4(vilvh_h, void, env, i32, i32, i32) DEF_HELPER_4(vilvh_w, void, env, i32, i32, i32) DEF_HELPER_4(vilvh_d, void, env, i32, i32, i32) -DEF_HELPER_5(vshuf_b, void, env, i32, i32, i32, i32) +DEF_HELPER_FLAGS_5(vshuf_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_4(vshuf_h, void, env, i32, i32, i32) DEF_HELPER_4(vshuf_w, void, env, i32, i32, i32) DEF_HELPER_4(vshuf_d, void, env, i32, i32, i32) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 3a7a620227..7078c4c845 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2899,15 +2899,14 @@ VILVH(vilvh_h, 32, H) VILVH(vilvh_w, 64, W) VILVH(vilvh_d, 128, D) -void HELPER(vshuf_b)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk, uint32_t va) +void HELPER(vshuf_b)(void *vd, void *vj, void *vk, void *va, uint32_t desc) { int i, m; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); - VReg *Vk = &(env->fpr[vk].vreg); - VReg *Va = &(env->fpr[va].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; + VReg *Vk = (VReg *)vk; + VReg *Va = (VReg *)va; m = LSX_LEN/8; for (i = 0; i < m ; i++) { diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index 85bc8670a7..6f45296987 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -34,18 +34,22 @@ static bool gen_vvvv_ptr(DisasContext *ctx, arg_vvvv *a, return gen_vvvv_ptr_vl(ctx, a, 16, fn); } -static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, - TCGv_i32, TCGv_i32)) +static bool gen_vvvv_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, + gen_helper_gvec_4 *fn) { - TCGv_i32 vd = tcg_constant_i32(a->vd); - TCGv_i32 vj = 
tcg_constant_i32(a->vj); - TCGv_i32 vk = tcg_constant_i32(a->vk); - TCGv_i32 va = tcg_constant_i32(a->va); + tcg_gen_gvec_4_ool(vec_full_offset(a->vd), + vec_full_offset(a->vj), + vec_full_offset(a->vk), + vec_full_offset(a->va), + oprsz, ctx->vl / 8, oprsz, fn); + return true; +} +static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, + gen_helper_gvec_4 *fn) +{ CHECK_SXE; - func(cpu_env, vd, vj, vk, va); - return true; + return gen_vvvv_vl(ctx, a, 16, fn); } static bool gen_vvv(DisasContext *ctx, arg_vvv *a, From patchwork Thu Sep 7 08:08:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13376216 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 83AACEE14DA for ; Thu, 7 Sep 2023 08:12:43 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA5C-0003z7-RU; Thu, 07 Sep 2023 04:09:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeA4w-0003vF-08 for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:31 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA4p-0003LB-8b for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:29 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8Dxg_AwhflkTDQhAA--.1190S3; Thu, 07 Sep 2023 16:09:20 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxviMthflkXE1wAA--.31585S7; Thu, 07 Sep 2023 
16:09:19 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v5 05/57] target/loongarch: Use gen_helper_gvec_3_ptr for 3OP + env vector instructions Date: Thu, 7 Sep 2023 16:08:24 +0800 Message-Id: <20230907080916.3974502-6-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230907080916.3974502-1-gaosong@loongson.cn> References: <20230907080916.3974502-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8DxviMthflkXE1wAA--.31585S7 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Song Gao --- target/loongarch/helper.h | 48 +++++++-------- target/loongarch/vec_helper.c | 50 ++++++++-------- target/loongarch/insn_trans/trans_vec.c.inc | 66 +++++++++++++-------- 3 files changed, 91 insertions(+), 73 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 727ccfb32c..bcf82597aa 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -519,14 +519,14 @@ DEF_HELPER_4(vfrstp_h, void, env, i32, i32, i32) DEF_HELPER_4(vfrstpi_b, void, env, i32, i32, i32) DEF_HELPER_4(vfrstpi_h, void, env, i32, i32, i32) -DEF_HELPER_4(vfadd_s, void, env, i32, i32, i32) 
-DEF_HELPER_4(vfadd_d, void, env, i32, i32, i32) -DEF_HELPER_4(vfsub_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfsub_d, void, env, i32, i32, i32) -DEF_HELPER_4(vfmul_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfmul_d, void, env, i32, i32, i32) -DEF_HELPER_4(vfdiv_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfdiv_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_5(vfadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfdiv_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfdiv_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_6(vfmadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_6(vfmadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32) @@ -537,15 +537,15 @@ DEF_HELPER_FLAGS_6(vfnmadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i3 DEF_HELPER_FLAGS_6(vfnmsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_6(vfnmsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, env, i32) -DEF_HELPER_4(vfmax_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfmax_d, void, env, i32, i32, i32) -DEF_HELPER_4(vfmin_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfmin_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_5(vfmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmax_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmin_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmin_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) -DEF_HELPER_4(vfmaxa_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfmaxa_d, void, env, i32, i32, i32) 
-DEF_HELPER_4(vfmina_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfmina_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_5(vfmaxa_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmaxa_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmina_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfmina_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_3(vflogb_s, void, env, i32, i32) DEF_HELPER_3(vflogb_d, void, env, i32, i32) @@ -564,8 +564,8 @@ DEF_HELPER_3(vfcvtl_s_h, void, env, i32, i32) DEF_HELPER_3(vfcvth_s_h, void, env, i32, i32) DEF_HELPER_3(vfcvtl_d_s, void, env, i32, i32) DEF_HELPER_3(vfcvth_d_s, void, env, i32, i32) -DEF_HELPER_4(vfcvt_h_s, void, env, i32, i32, i32) -DEF_HELPER_4(vfcvt_s_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_5(vfcvt_h_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vfcvt_s_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_3(vfrintrne_s, void, env, i32, i32) DEF_HELPER_3(vfrintrne_d, void, env, i32, i32) @@ -592,11 +592,11 @@ DEF_HELPER_3(vftintrz_wu_s, void, env, i32, i32) DEF_HELPER_3(vftintrz_lu_d, void, env, i32, i32) DEF_HELPER_3(vftint_wu_s, void, env, i32, i32) DEF_HELPER_3(vftint_lu_d, void, env, i32, i32) -DEF_HELPER_4(vftintrne_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vftintrz_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vftintrp_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vftintrm_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vftint_w_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_5(vftintrne_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vftintrz_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vftintrp_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vftintrm_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_5(vftint_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_3(vftintrnel_l_s, void, env, i32, i32) 
DEF_HELPER_3(vftintrneh_l_s, void, env, i32, i32) DEF_HELPER_3(vftintrzl_l_s, void, env, i32, i32) @@ -614,7 +614,7 @@ DEF_HELPER_3(vffint_s_wu, void, env, i32, i32) DEF_HELPER_3(vffint_d_lu, void, env, i32, i32) DEF_HELPER_3(vffintl_d_w, void, env, i32, i32) DEF_HELPER_3(vffinth_d_w, void, env, i32, i32) -DEF_HELPER_4(vffint_s_l, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_5(vffint_s_l, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_4(vseqi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vseqi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 7078c4c845..eab94a8b76 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2096,13 +2096,13 @@ static inline void vec_clear_cause(CPULoongArchState *env) } #define DO_3OP_F(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, \ + CPULoongArchState *env, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ vec_clear_cause(env); \ for (i = 0; i < LSX_LEN/BIT; i++) { \ @@ -2326,14 +2326,14 @@ void HELPER(vfcvth_d_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj) *Vd = temp; } -void HELPER(vfcvt_h_s)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vfcvt_h_s)(void *vd, void *vj, void *vk, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); - VReg *Vk = &(env->fpr[vk].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; + VReg *Vk = (VReg *)vk; vec_clear_cause(env); for(i = 0; i < LSX_LEN/32; i++) { @@ -2344,14 +2344,14 @@ void HELPER(vfcvt_h_s)(CPULoongArchState *env, *Vd = temp; } -void 
HELPER(vfcvt_s_d)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vfcvt_s_d)(void *vd, void *vj, void *vk, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); - VReg *Vk = &(env->fpr[vk].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; + VReg *Vk = (VReg *)vk; vec_clear_cause(env); for(i = 0; i < LSX_LEN/64; i++) { @@ -2482,14 +2482,14 @@ FTINT(rz_w_d, float64, int32, uint64_t, uint32_t, float_round_to_zero) FTINT(rne_w_d, float64, int32, uint64_t, uint32_t, float_round_nearest_even) #define FTINT_W_D(NAME, FN) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, \ + CPULoongArchState *env, uint32_t desc) \ { \ int i; \ VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ vec_clear_cause(env); \ for (i = 0; i < 2; i++) { \ @@ -2606,14 +2606,14 @@ void HELPER(vffinth_d_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) *Vd = temp; } -void HELPER(vffint_s_l)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vffint_s_l)(void *vd, void *vj, void *vk, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); - VReg *Vk = &(env->fpr[vk].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; + VReg *Vk = (VReg *)vk; vec_clear_cause(env); for (i = 0; i < 2; i++) { diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index 6f45296987..eae1929f44 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -52,6 +52,24 @@ static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, return gen_vvvv_vl(ctx, a, 16, fn); } +static bool 
gen_vvv_ptr_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, + gen_helper_gvec_3_ptr *fn) +{ + tcg_gen_gvec_3_ptr(vec_full_offset(a->vd), + vec_full_offset(a->vj), + vec_full_offset(a->vk), + cpu_env, + oprsz, ctx->vl / 8, oprsz, fn); + return true; +} + +static bool gen_vvv_ptr(DisasContext *ctx, arg_vvv *a, + gen_helper_gvec_3_ptr *fn) +{ + CHECK_SXE; + return gen_vvv_ptr_vl(ctx, a, 16, fn); +} + static bool gen_vvv(DisasContext *ctx, arg_vvv *a, void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) { @@ -3648,14 +3666,14 @@ TRANS(vfrstp_h, LSX, gen_vvv, gen_helper_vfrstp_h) TRANS(vfrstpi_b, LSX, gen_vv_i, gen_helper_vfrstpi_b) TRANS(vfrstpi_h, LSX, gen_vv_i, gen_helper_vfrstpi_h) -TRANS(vfadd_s, LSX, gen_vvv, gen_helper_vfadd_s) -TRANS(vfadd_d, LSX, gen_vvv, gen_helper_vfadd_d) -TRANS(vfsub_s, LSX, gen_vvv, gen_helper_vfsub_s) -TRANS(vfsub_d, LSX, gen_vvv, gen_helper_vfsub_d) -TRANS(vfmul_s, LSX, gen_vvv, gen_helper_vfmul_s) -TRANS(vfmul_d, LSX, gen_vvv, gen_helper_vfmul_d) -TRANS(vfdiv_s, LSX, gen_vvv, gen_helper_vfdiv_s) -TRANS(vfdiv_d, LSX, gen_vvv, gen_helper_vfdiv_d) +TRANS(vfadd_s, LSX, gen_vvv_ptr, gen_helper_vfadd_s) +TRANS(vfadd_d, LSX, gen_vvv_ptr, gen_helper_vfadd_d) +TRANS(vfsub_s, LSX, gen_vvv_ptr, gen_helper_vfsub_s) +TRANS(vfsub_d, LSX, gen_vvv_ptr, gen_helper_vfsub_d) +TRANS(vfmul_s, LSX, gen_vvv_ptr, gen_helper_vfmul_s) +TRANS(vfmul_d, LSX, gen_vvv_ptr, gen_helper_vfmul_d) +TRANS(vfdiv_s, LSX, gen_vvv_ptr, gen_helper_vfdiv_s) +TRANS(vfdiv_d, LSX, gen_vvv_ptr, gen_helper_vfdiv_d) TRANS(vfmadd_s, LSX, gen_vvvv_ptr, gen_helper_vfmadd_s) TRANS(vfmadd_d, LSX, gen_vvvv_ptr, gen_helper_vfmadd_d) @@ -3666,15 +3684,15 @@ TRANS(vfnmadd_d, LSX, gen_vvvv_ptr, gen_helper_vfnmadd_d) TRANS(vfnmsub_s, LSX, gen_vvvv_ptr, gen_helper_vfnmsub_s) TRANS(vfnmsub_d, LSX, gen_vvvv_ptr, gen_helper_vfnmsub_d) -TRANS(vfmax_s, LSX, gen_vvv, gen_helper_vfmax_s) -TRANS(vfmax_d, LSX, gen_vvv, gen_helper_vfmax_d) -TRANS(vfmin_s, LSX, gen_vvv, gen_helper_vfmin_s) 
-TRANS(vfmin_d, LSX, gen_vvv, gen_helper_vfmin_d) +TRANS(vfmax_s, LSX, gen_vvv_ptr, gen_helper_vfmax_s) +TRANS(vfmax_d, LSX, gen_vvv_ptr, gen_helper_vfmax_d) +TRANS(vfmin_s, LSX, gen_vvv_ptr, gen_helper_vfmin_s) +TRANS(vfmin_d, LSX, gen_vvv_ptr, gen_helper_vfmin_d) -TRANS(vfmaxa_s, LSX, gen_vvv, gen_helper_vfmaxa_s) -TRANS(vfmaxa_d, LSX, gen_vvv, gen_helper_vfmaxa_d) -TRANS(vfmina_s, LSX, gen_vvv, gen_helper_vfmina_s) -TRANS(vfmina_d, LSX, gen_vvv, gen_helper_vfmina_d) +TRANS(vfmaxa_s, LSX, gen_vvv_ptr, gen_helper_vfmaxa_s) +TRANS(vfmaxa_d, LSX, gen_vvv_ptr, gen_helper_vfmaxa_d) +TRANS(vfmina_s, LSX, gen_vvv_ptr, gen_helper_vfmina_s) +TRANS(vfmina_d, LSX, gen_vvv_ptr, gen_helper_vfmina_d) TRANS(vflogb_s, LSX, gen_vv, gen_helper_vflogb_s) TRANS(vflogb_d, LSX, gen_vv, gen_helper_vflogb_d) @@ -3693,8 +3711,8 @@ TRANS(vfcvtl_s_h, LSX, gen_vv, gen_helper_vfcvtl_s_h) TRANS(vfcvth_s_h, LSX, gen_vv, gen_helper_vfcvth_s_h) TRANS(vfcvtl_d_s, LSX, gen_vv, gen_helper_vfcvtl_d_s) TRANS(vfcvth_d_s, LSX, gen_vv, gen_helper_vfcvth_d_s) -TRANS(vfcvt_h_s, LSX, gen_vvv, gen_helper_vfcvt_h_s) -TRANS(vfcvt_s_d, LSX, gen_vvv, gen_helper_vfcvt_s_d) +TRANS(vfcvt_h_s, LSX, gen_vvv_ptr, gen_helper_vfcvt_h_s) +TRANS(vfcvt_s_d, LSX, gen_vvv_ptr, gen_helper_vfcvt_s_d) TRANS(vfrintrne_s, LSX, gen_vv, gen_helper_vfrintrne_s) TRANS(vfrintrne_d, LSX, gen_vv, gen_helper_vfrintrne_d) @@ -3721,11 +3739,11 @@ TRANS(vftintrz_wu_s, LSX, gen_vv, gen_helper_vftintrz_wu_s) TRANS(vftintrz_lu_d, LSX, gen_vv, gen_helper_vftintrz_lu_d) TRANS(vftint_wu_s, LSX, gen_vv, gen_helper_vftint_wu_s) TRANS(vftint_lu_d, LSX, gen_vv, gen_helper_vftint_lu_d) -TRANS(vftintrne_w_d, LSX, gen_vvv, gen_helper_vftintrne_w_d) -TRANS(vftintrz_w_d, LSX, gen_vvv, gen_helper_vftintrz_w_d) -TRANS(vftintrp_w_d, LSX, gen_vvv, gen_helper_vftintrp_w_d) -TRANS(vftintrm_w_d, LSX, gen_vvv, gen_helper_vftintrm_w_d) -TRANS(vftint_w_d, LSX, gen_vvv, gen_helper_vftint_w_d) +TRANS(vftintrne_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrne_w_d) 
+TRANS(vftintrz_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrz_w_d) +TRANS(vftintrp_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrp_w_d) +TRANS(vftintrm_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrm_w_d) +TRANS(vftint_w_d, LSX, gen_vvv_ptr, gen_helper_vftint_w_d) TRANS(vftintrnel_l_s, LSX, gen_vv, gen_helper_vftintrnel_l_s) TRANS(vftintrneh_l_s, LSX, gen_vv, gen_helper_vftintrneh_l_s) TRANS(vftintrzl_l_s, LSX, gen_vv, gen_helper_vftintrzl_l_s) @@ -3743,7 +3761,7 @@ TRANS(vffint_s_wu, LSX, gen_vv, gen_helper_vffint_s_wu) TRANS(vffint_d_lu, LSX, gen_vv, gen_helper_vffint_d_lu) TRANS(vffintl_d_w, LSX, gen_vv, gen_helper_vffintl_d_w) TRANS(vffinth_d_w, LSX, gen_vv, gen_helper_vffinth_d_w) -TRANS(vffint_s_l, LSX, gen_vvv, gen_helper_vffint_s_l) +TRANS(vffint_s_l, LSX, gen_vvv_ptr, gen_helper_vffint_s_l) static bool do_cmp(DisasContext *ctx, arg_vvv *a, MemOp mop, TCGCond cond) {
From patchwork Thu Sep 7 08:08:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13376217 From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v5 06/57] target/loongarch: Use gen_helper_gvec_3 for 3OP vector instructions Date: Thu, 7 Sep 2023 16:08:25 +0800 Message-Id: <20230907080916.3974502-7-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230907080916.3974502-1-gaosong@loongson.cn> References: <20230907080916.3974502-1-gaosong@loongson.cn> MIME-Version: 1.0
Signed-off-by: Song Gao --- target/loongarch/helper.h | 214 +++++----- target/loongarch/vec_helper.c | 444 +++++++++----------- target/loongarch/insn_trans/trans_vec.c.inc | 19 +- 3 files changed, 326 insertions(+), 351 deletions(-) diff --git a/target/loongarch/helper.h
b/target/loongarch/helper.h index bcf82597aa..4b681e948f 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -133,22 +133,22 @@ DEF_HELPER_1(idle, void, env) #endif /* LoongArch LSX */ -DEF_HELPER_4(vhaddw_h_b, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_w_h, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_d_w, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_q_d, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_hu_bu, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_wu_hu, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_du_wu, void, env, i32, i32, i32) -DEF_HELPER_4(vhaddw_qu_du, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_h_b, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_w_h, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_d_w, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_q_d, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_hu_bu, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_wu_hu, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_du_wu, void, env, i32, i32, i32) -DEF_HELPER_4(vhsubw_qu_du, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vhaddw_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhaddw_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_hu_bu, 
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vhsubw_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vaddwev_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vaddwev_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) @@ -305,22 +305,22 @@ DEF_HELPER_FLAGS_4(vmaddwod_h_bu_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vmaddwod_w_hu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vmaddwod_d_wu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_4(vdiv_b, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_h, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_w, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_d, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_bu, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_hu, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_wu, void, env, i32, i32, i32) -DEF_HELPER_4(vdiv_du, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_b, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_h, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_w, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_d, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_bu, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_hu, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_wu, void, env, i32, i32, i32) -DEF_HELPER_4(vmod_du, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vdiv_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vdiv_du, 
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_bu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_hu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_wu, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vmod_du, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsat_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vsat_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) @@ -363,30 +363,30 @@ DEF_HELPER_4(vsllwil_wu_hu, void, env, i32, i32, i32) DEF_HELPER_4(vsllwil_du_wu, void, env, i32, i32, i32) DEF_HELPER_3(vextl_qu_du, void, env, i32, i32) -DEF_HELPER_4(vsrlr_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlr_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlr_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlr_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsrlr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrlr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrlr_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrlr_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_4(vsrlri_b, void, env, i32, i32, i32) DEF_HELPER_4(vsrlri_h, void, env, i32, i32, i32) DEF_HELPER_4(vsrlri_w, void, env, i32, i32, i32) DEF_HELPER_4(vsrlri_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrar_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsrar_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrar_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrar_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsrar_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrar_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrar_w, TCG_CALL_NO_RWG, 
void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_4(vsrari_b, void, env, i32, i32, i32) DEF_HELPER_4(vsrari_h, void, env, i32, i32, i32) DEF_HELPER_4(vsrari_w, void, env, i32, i32, i32) DEF_HELPER_4(vsrari_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrln_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrln_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrln_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsran_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsran_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsran_w_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsrln_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrln_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrln_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsran_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsran_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsran_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_4(vsrlni_b_h, void, env, i32, i32, i32) DEF_HELPER_4(vsrlni_h_w, void, env, i32, i32, i32) @@ -397,12 +397,12 @@ DEF_HELPER_4(vsrani_h_w, void, env, i32, i32, i32) DEF_HELPER_4(vsrani_w_d, void, env, i32, i32, i32) DEF_HELPER_4(vsrani_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlrn_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlrn_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlrn_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarn_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarn_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarn_w_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsrlrn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrlrn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrlrn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrarn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vsrarn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, 
ptr, i32) +DEF_HELPER_FLAGS_4(vsrarn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_4(vsrlrni_b_h, void, env, i32, i32, i32) DEF_HELPER_4(vsrlrni_h_w, void, env, i32, i32, i32) @@ -413,18 +413,18 @@ DEF_HELPER_4(vsrarni_h_w, void, env, i32, i32, i32) DEF_HELPER_4(vsrarni_w_d, void, env, i32, i32, i32) DEF_HELPER_4(vsrarni_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrln_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrln_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrln_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssran_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssran_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssran_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrln_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrln_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrln_wu_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssran_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssran_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssran_wu_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vssrln_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrln_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrln_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssran_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssran_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssran_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrln_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrln_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrln_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssran_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssran_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssran_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_4(vssrlni_b_h, void, env, i32, i32, i32) DEF_HELPER_4(vssrlni_h_w, void, env, i32, 
i32, i32) @@ -443,18 +443,18 @@ DEF_HELPER_4(vssrani_hu_w, void, env, i32, i32, i32) DEF_HELPER_4(vssrani_wu_d, void, env, i32, i32, i32) DEF_HELPER_4(vssrani_du_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrn_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrn_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrn_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarn_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarn_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarn_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrn_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrn_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrn_wu_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarn_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarn_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarn_wu_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vssrlrn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrlrn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrlrn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrarn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrarn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrarn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrlrn_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrlrn_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrlrn_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrarn_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrarn_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vssrarn_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_4(vssrlrni_b_h, void, env, i32, i32, i32) DEF_HELPER_4(vssrlrni_h_w, void, env, i32, i32, i32) @@ -514,8 +514,8 @@ DEF_HELPER_FLAGS_4(vbitrevi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vbitrevi_w, TCG_CALL_NO_RWG, 
void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vbitrevi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) -DEF_HELPER_4(vfrstp_b, void, env, i32, i32, i32) -DEF_HELPER_4(vfrstp_h, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vfrstp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vfrstp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_4(vfrstpi_b, void, env, i32, i32, i32) DEF_HELPER_4(vfrstpi_h, void, env, i32, i32, i32) @@ -655,37 +655,37 @@ DEF_HELPER_3(vsetallnez_h, void, env, i32, i32) DEF_HELPER_3(vsetallnez_w, void, env, i32, i32) DEF_HELPER_3(vsetallnez_d, void, env, i32, i32) -DEF_HELPER_4(vpackev_b, void, env, i32, i32, i32) -DEF_HELPER_4(vpackev_h, void, env, i32, i32, i32) -DEF_HELPER_4(vpackev_w, void, env, i32, i32, i32) -DEF_HELPER_4(vpackev_d, void, env, i32, i32, i32) -DEF_HELPER_4(vpackod_b, void, env, i32, i32, i32) -DEF_HELPER_4(vpackod_h, void, env, i32, i32, i32) -DEF_HELPER_4(vpackod_w, void, env, i32, i32, i32) -DEF_HELPER_4(vpackod_d, void, env, i32, i32, i32) - -DEF_HELPER_4(vpickev_b, void, env, i32, i32, i32) -DEF_HELPER_4(vpickev_h, void, env, i32, i32, i32) -DEF_HELPER_4(vpickev_w, void, env, i32, i32, i32) -DEF_HELPER_4(vpickev_d, void, env, i32, i32, i32) -DEF_HELPER_4(vpickod_b, void, env, i32, i32, i32) -DEF_HELPER_4(vpickod_h, void, env, i32, i32, i32) -DEF_HELPER_4(vpickod_w, void, env, i32, i32, i32) -DEF_HELPER_4(vpickod_d, void, env, i32, i32, i32) - -DEF_HELPER_4(vilvl_b, void, env, i32, i32, i32) -DEF_HELPER_4(vilvl_h, void, env, i32, i32, i32) -DEF_HELPER_4(vilvl_w, void, env, i32, i32, i32) -DEF_HELPER_4(vilvl_d, void, env, i32, i32, i32) -DEF_HELPER_4(vilvh_b, void, env, i32, i32, i32) -DEF_HELPER_4(vilvh_h, void, env, i32, i32, i32) -DEF_HELPER_4(vilvh_w, void, env, i32, i32, i32) -DEF_HELPER_4(vilvh_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vpackev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_4(vpackev_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackev_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackod_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackod_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackod_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpackod_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(vpickev_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickev_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickev_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickev_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickod_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickod_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickod_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vpickod_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(vilvl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvl_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvh_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vilvh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(vshuf_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_4(vshuf_h, void, env, i32, i32, i32) -DEF_HELPER_4(vshuf_w, void, env, i32, i32, i32) -DEF_HELPER_4(vshuf_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vshuf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(vshuf_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_4(vshuf_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_4(vshuf4i_b, void, env, i32, i32, i32) DEF_HELPER_4(vshuf4i_h, void, env, i32, i32, i32) DEF_HELPER_4(vshuf4i_w, void, env, i32, i32, i32) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index eab94a8b76..15b361c6b3 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -17,13 +17,12 @@ #define DO_SUB(a, b) (a - b) #define DO_ODD_EVEN(NAME, BIT, E1, E2, DO_OP) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ typedef __typeof(Vd->E1(0)) TD; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ @@ -35,12 +34,11 @@ DO_ODD_EVEN(vhaddw_h_b, 16, H, B, DO_ADD) DO_ODD_EVEN(vhaddw_w_h, 32, W, H, DO_ADD) DO_ODD_EVEN(vhaddw_d_w, 64, D, W, DO_ADD) -void HELPER(vhaddw_q_d)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vhaddw_q_d)(void *vd, void *vj, void *vk, uint32_t desc) { - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); - VReg *Vk = &(env->fpr[vk].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; + VReg *Vk = (VReg *)vk; Vd->Q(0) = int128_add(int128_makes64(Vj->D(1)), int128_makes64(Vk->D(0))); } @@ -49,12 +47,11 @@ DO_ODD_EVEN(vhsubw_h_b, 16, H, B, DO_SUB) DO_ODD_EVEN(vhsubw_w_h, 32, W, H, DO_SUB) DO_ODD_EVEN(vhsubw_d_w, 64, D, W, DO_SUB) -void HELPER(vhsubw_q_d)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vhsubw_q_d)(void *vd, void *vj, void *vk, uint32_t desc) { - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); - VReg *Vk = &(env->fpr[vk].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; + VReg *Vk = (VReg *)vk; Vd->Q(0) = 
int128_sub(int128_makes64(Vj->D(1)), int128_makes64(Vk->D(0))); } @@ -63,12 +60,11 @@ DO_ODD_EVEN(vhaddw_hu_bu, 16, UH, UB, DO_ADD) DO_ODD_EVEN(vhaddw_wu_hu, 32, UW, UH, DO_ADD) DO_ODD_EVEN(vhaddw_du_wu, 64, UD, UW, DO_ADD) -void HELPER(vhaddw_qu_du)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vhaddw_qu_du)(void *vd, void *vj, void *vk, uint32_t desc) { - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); - VReg *Vk = &(env->fpr[vk].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; + VReg *Vk = (VReg *)vk; Vd->Q(0) = int128_add(int128_make64((uint64_t)Vj->D(1)), int128_make64((uint64_t)Vk->D(0))); @@ -78,12 +74,11 @@ DO_ODD_EVEN(vhsubw_hu_bu, 16, UH, UB, DO_SUB) DO_ODD_EVEN(vhsubw_wu_hu, 32, UW, UH, DO_SUB) DO_ODD_EVEN(vhsubw_du_wu, 64, UD, UW, DO_SUB) -void HELPER(vhsubw_qu_du)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t vk) +void HELPER(vhsubw_qu_du)(void *vd, void *vj, void *vk, uint32_t desc) { - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); - VReg *Vk = &(env->fpr[vk].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; + VReg *Vk = (VReg *)vk; Vd->Q(0) = int128_sub(int128_make64((uint64_t)Vj->D(1)), int128_make64((uint64_t)Vk->D(0))); @@ -564,17 +559,16 @@ VMADDWOD_U_S(vmaddwod_d_wu_w, 64, D, UD, W, UW, DO_MUL) #define DO_REM(N, M) (unlikely(M == 0) ? 0 :\ unlikely((N == -N) && (M == (__typeof(N))(-1))) ? 
0 : N % M) -#define VDIV(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) = DO_OP(Vj->E(i), Vk->E(i)); \ - } \ +#define VDIV(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + Vd->E(i) = DO_OP(Vj->E(i), Vk->E(i)); \ + } \ } VDIV(vdiv_b, 8, B, DO_DIV) @@ -854,13 +848,12 @@ do_vsrlr(W, uint32_t) do_vsrlr(D, uint64_t) #define VSRLR(NAME, BIT, T, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ Vd->E(i) = do_vsrlr_ ## E(Vj->E(i), ((T)Vk->E(i))%BIT); \ @@ -906,13 +899,12 @@ do_vsrar(W, int32_t) do_vsrar(D, int64_t) #define VSRAR(NAME, BIT, T, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ Vd->E(i) = do_vsrar_ ## E(Vj->E(i), ((T)Vk->E(i))%BIT); \ @@ -945,13 +937,12 @@ VSRARI(vsrari_d, 64, D) #define R_SHIFT(a, b) (a >> b) #define VSRLN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void 
HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) = R_SHIFT((T)Vj->E2(i),((T)Vk->E2(i)) % BIT); \ @@ -963,19 +954,18 @@ VSRLN(vsrln_b_h, 16, uint16_t, B, H) VSRLN(vsrln_h_w, 32, uint32_t, H, W) VSRLN(vsrln_w_d, 64, uint64_t, W, D) -#define VSRAN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) = R_SHIFT(Vj->E2(i), ((T)Vk->E2(i)) % BIT); \ - } \ - Vd->D(1) = 0; \ +#define VSRAN(NAME, BIT, T, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + Vd->E1(i) = R_SHIFT(Vj->E2(i), ((T)Vk->E2(i)) % BIT); \ + } \ + Vd->D(1) = 0; \ } VSRAN(vsran_b_h, 16, uint16_t, B, H) @@ -1057,13 +1047,12 @@ VSRANI(vsrani_h_w, 32, H, W) VSRANI(vsrani_w_d, 64, W, D) #define VSRLRN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) = do_vsrlr_ ## E2(Vj->E2(i), ((T)Vk->E2(i))%BIT); \ @@ -1076,13 +1065,12 @@ VSRLRN(vsrlrn_h_w, 32, uint32_t, H, W) VSRLRN(vsrlrn_w_d, 64, uint64_t, W, D) #define VSRARN(NAME, BIT, T, E1, E2) \ -void 
HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) = do_vsrar_ ## E2(Vj->E2(i), ((T)Vk->E2(i))%BIT); \ @@ -1205,13 +1193,12 @@ SSRLNS(H, uint32_t, int32_t, uint16_t) SSRLNS(W, uint64_t, int64_t, uint32_t) #define VSSRLN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) = do_ssrlns_ ## E1(Vj->E2(i), (T)Vk->E2(i)% BIT, BIT/2 -1); \ @@ -1248,13 +1235,12 @@ SSRANS(H, int32_t, int16_t) SSRANS(W, int64_t, int32_t) #define VSSRAN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) = do_ssrans_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2 -1); \ @@ -1289,13 +1275,12 @@ SSRLNU(H, uint32_t, uint16_t, int32_t) SSRLNU(W, uint64_t, uint32_t, int64_t) #define VSSRLNU(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); 
\ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) = do_ssrlnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2); \ @@ -1333,13 +1318,12 @@ SSRANU(H, uint32_t, uint16_t, int32_t) SSRANU(W, uint64_t, uint32_t, int64_t) #define VSSRANU(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) = do_ssranu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2); \ @@ -1581,13 +1565,12 @@ SSRLRNS(H, W, uint32_t, int32_t, uint16_t) SSRLRNS(W, D, uint64_t, int64_t, uint32_t) #define VSSRLRN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) = do_ssrlrns_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2 -1); \ @@ -1621,13 +1604,12 @@ SSRARNS(H, W, int32_t, int16_t) SSRARNS(W, D, int64_t, int32_t) #define VSSRARN(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < 
LSX_LEN/BIT; i++) { \ Vd->E1(i) = do_ssrarns_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2 -1); \ @@ -1660,13 +1642,12 @@ SSRLRNU(H, W, uint32_t, uint16_t, int32_t) SSRLRNU(W, D, uint64_t, uint32_t, int64_t) #define VSSRLRNU(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) = do_ssrlrnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2); \ @@ -1702,13 +1683,12 @@ SSRARNU(H, W, uint32_t, uint16_t, int32_t) SSRARNU(W, D, uint64_t, uint32_t, int64_t) #define VSSRARNU(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ Vd->E1(i) = do_ssrarnu_ ## E1(Vj->E2(i), (T)Vk->E2(i)%BIT, BIT/2); \ @@ -2023,22 +2003,21 @@ DO_BITI(vbitrevi_h, 16, UH, DO_BITREV) DO_BITI(vbitrevi_w, 32, UW, DO_BITREV) DO_BITI(vbitrevi_d, 64, UD, DO_BITREV) -#define VFRSTP(NAME, BIT, MASK, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i, m; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - if (Vj->E(i) < 0) { \ - break; \ - } \ - } \ - m = Vk->E(0) & MASK; \ - Vd->E(m) = i; \ +#define VFRSTP(NAME, BIT, MASK, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, m; \ + VReg *Vd = 
(VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + if (Vj->E(i) < 0) { \ + break; \ + } \ + } \ + m = Vk->E(0) & MASK; \ + Vd->E(m) = i; \ } VFRSTP(vfrstp_b, 8, 0xf, B) @@ -2767,21 +2746,20 @@ SETALLNEZ(vsetallnez_h, MO_16) SETALLNEZ(vsetallnez_w, MO_32) SETALLNEZ(vsetallnez_d, MO_64) -#define VPACKEV(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) = Vj->E(2 * i); \ - temp.E(2 *i) = Vk->E(2 * i); \ - } \ - *Vd = temp; \ +#define VPACKEV(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(2 * i + 1) = Vj->E(2 * i); \ + temp.E(2 *i) = Vk->E(2 * i); \ + } \ + *Vd = temp; \ } VPACKEV(vpackev_b, 16, B) @@ -2789,21 +2767,20 @@ VPACKEV(vpackev_h, 32, H) VPACKEV(vpackev_w, 64, W) VPACKEV(vpackev_d, 128, D) -#define VPACKOD(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) = Vj->E(2 * i + 1); \ - temp.E(2 * i) = Vk->E(2 * i + 1); \ - } \ - *Vd = temp; \ +#define VPACKOD(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(2 * i + 1) = Vj->E(2 * i + 1); \ + temp.E(2 * i) = Vk->E(2 * i + 1); \ + } \ + *Vd = temp; \ } 
VPACKOD(vpackod_b, 16, B) @@ -2811,21 +2788,20 @@ VPACKOD(vpackod_h, 32, H) VPACKOD(vpackod_w, 64, W) VPACKOD(vpackod_d, 128, D) -#define VPICKEV(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i + LSX_LEN/BIT) = Vj->E(2 * i); \ - temp.E(i) = Vk->E(2 * i); \ - } \ - *Vd = temp; \ +#define VPICKEV(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(i + LSX_LEN/BIT) = Vj->E(2 * i); \ + temp.E(i) = Vk->E(2 * i); \ + } \ + *Vd = temp; \ } VPICKEV(vpickev_b, 16, B) @@ -2833,21 +2809,20 @@ VPICKEV(vpickev_h, 32, H) VPICKEV(vpickev_w, 64, W) VPICKEV(vpickev_d, 128, D) -#define VPICKOD(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i + LSX_LEN/BIT) = Vj->E(2 * i + 1); \ - temp.E(i) = Vk->E(2 * i + 1); \ - } \ - *Vd = temp; \ +#define VPICKOD(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(i + LSX_LEN/BIT) = Vj->E(2 * i + 1); \ + temp.E(i) = Vk->E(2 * i + 1); \ + } \ + *Vd = temp; \ } VPICKOD(vpickod_b, 16, B) @@ -2855,21 +2830,20 @@ VPICKOD(vpickod_h, 32, H) VPICKOD(vpickod_w, 64, W) VPICKOD(vpickod_d, 128, D) -#define VILVL(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - 
uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) = Vj->E(i); \ - temp.E(2 * i) = Vk->E(i); \ - } \ - *Vd = temp; \ +#define VILVL(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(2 * i + 1) = Vj->E(i); \ + temp.E(2 * i) = Vk->E(i); \ + } \ + *Vd = temp; \ } VILVL(vilvl_b, 16, B) @@ -2877,21 +2851,20 @@ VILVL(vilvl_h, 32, H) VILVL(vilvl_w, 64, W) VILVL(vilvl_d, 128, D) -#define VILVH(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(2 * i + 1) = Vj->E(i + LSX_LEN/BIT); \ - temp.E(2 * i) = Vk->E(i + LSX_LEN/BIT); \ - } \ - *Vd = temp; \ +#define VILVH(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(2 * i + 1) = Vj->E(i + LSX_LEN/BIT); \ + temp.E(2 * i) = Vk->E(i + LSX_LEN/BIT); \ + } \ + *Vd = temp; \ } VILVH(vilvh_b, 16, B) @@ -2916,22 +2889,21 @@ void HELPER(vshuf_b)(void *vd, void *vj, void *vk, void *va, uint32_t desc) *Vd = temp; } -#define VSHUF(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t vk) \ -{ \ - int i, m; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - VReg *Vk = &(env->fpr[vk].vreg); \ - \ - m = LSX_LEN/BIT; \ - for (i = 0; i < 
m; i++) { \ - uint64_t k = ((uint8_t) Vd->E(i)) % (2 * m); \ - temp.E(i) = k < m ? Vk->E(k) : Vj->E(k - m); \ - } \ - *Vd = temp; \ +#define VSHUF(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ +{ \ + int i, m; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + VReg *Vk = (VReg *)vk; \ + \ + m = LSX_LEN/BIT; \ + for (i = 0; i < m; i++) { \ + uint64_t k = ((uint8_t) Vd->E(i)) % (2 * m); \ + temp.E(i) = k < m ? Vk->E(k) : Vj->E(k - m); \ + } \ + *Vd = temp; \ } VSHUF(vshuf_h, 16, H) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index eae1929f44..6ead8fb4c5 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -70,17 +70,20 @@ static bool gen_vvv_ptr(DisasContext *ctx, arg_vvv *a, return gen_vvv_ptr_vl(ctx, a, 16, fn); } -static bool gen_vvv(DisasContext *ctx, arg_vvv *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) +static bool gen_vvv_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, + gen_helper_gvec_3 *fn) { - TCGv_i32 vd = tcg_constant_i32(a->vd); - TCGv_i32 vj = tcg_constant_i32(a->vj); - TCGv_i32 vk = tcg_constant_i32(a->vk); + tcg_gen_gvec_3_ool(vec_full_offset(a->vd), + vec_full_offset(a->vj), + vec_full_offset(a->vk), + oprsz, ctx->vl / 8, oprsz, fn); + return true; +} +static bool gen_vvv(DisasContext *ctx, arg_vvv *a, gen_helper_gvec_3 *fn) +{ CHECK_SXE; - - func(cpu_env, vd, vj, vk); - return true; + return gen_vvv_vl(ctx, a, 16, fn); } static bool gen_vv(DisasContext *ctx, arg_vv *a, From patchwork Thu Sep 7 08:08:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13376211 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher 
 ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2085FEE14D0 for ; Thu, 7 Sep 2023 08:12:41 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA5D-0003zW-2C; Thu, 07 Sep 2023 04:09:47 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeA4w-0003vH-Fs for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:31 -0400
Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA4p-0003LR-KG for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:30 -0400
Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8AxjusxhflkUDQhAA--.63744S3; Thu, 07 Sep 2023 16:09:21 +0800 (CST)
Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxviMthflkXE1wAA--.31585S9; Thu, 07 Sep 2023 16:09:20 +0800 (CST)
From: Song Gao
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, maobibo@loongson.cn
Subject: [PATCH v5 07/57] target/loongarch: Use gen_helper_gvec_2_ptr for 2OP + env vector instructions
Date: Thu, 7 Sep 2023 16:08:26 +0800
Message-Id: <20230907080916.3974502-8-gaosong@loongson.cn>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230907080916.3974502-1-gaosong@loongson.cn>
References: <20230907080916.3974502-1-gaosong@loongson.cn>
MIME-Version: 1.0
X-CM-TRANSID: AQAAf8DxviMthflkXE1wAA--.31585S9
X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/
X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7
 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx
 nUUI43ZEXa7xR_UUUUUUUUU==
Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Song Gao --- target/loongarch/helper.h | 118 +++++++------- target/loongarch/vec_helper.c | 161 +++++++++++--------- target/loongarch/insn_trans/trans_vec.c.inc | 129 +++++++++------- 3 files changed, 219 insertions(+), 189 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 4b681e948f..0752cc7212 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -547,73 +547,73 @@ DEF_HELPER_FLAGS_5(vfmaxa_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vfmina_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vfmina_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) -DEF_HELPER_3(vflogb_s, void, env, i32, i32) -DEF_HELPER_3(vflogb_d, void, env, i32, i32) - -DEF_HELPER_3(vfclass_s, void, env, i32, i32) -DEF_HELPER_3(vfclass_d, void, env, i32, i32) - -DEF_HELPER_3(vfsqrt_s, void, env, i32, i32) -DEF_HELPER_3(vfsqrt_d, void, env, i32, i32) -DEF_HELPER_3(vfrecip_s, void, env, i32, i32) -DEF_HELPER_3(vfrecip_d, void, env, i32, i32) -DEF_HELPER_3(vfrsqrt_s, void, env, i32, i32) -DEF_HELPER_3(vfrsqrt_d, void, env, i32, i32) - -DEF_HELPER_3(vfcvtl_s_h, void, env, i32, i32) -DEF_HELPER_3(vfcvth_s_h, void, env, i32, i32) -DEF_HELPER_3(vfcvtl_d_s, void, env, i32, i32) -DEF_HELPER_3(vfcvth_d_s, void, env, i32, i32) +DEF_HELPER_FLAGS_4(vflogb_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vflogb_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vfclass_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) 
+DEF_HELPER_FLAGS_4(vfclass_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vfsqrt_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfsqrt_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrecip_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrecip_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrsqrt_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrsqrt_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vfcvtl_s_h, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfcvth_s_h, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfcvtl_d_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfcvth_d_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vfcvt_h_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vfcvt_s_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) -DEF_HELPER_3(vfrintrne_s, void, env, i32, i32) -DEF_HELPER_3(vfrintrne_d, void, env, i32, i32) -DEF_HELPER_3(vfrintrz_s, void, env, i32, i32) -DEF_HELPER_3(vfrintrz_d, void, env, i32, i32) -DEF_HELPER_3(vfrintrp_s, void, env, i32, i32) -DEF_HELPER_3(vfrintrp_d, void, env, i32, i32) -DEF_HELPER_3(vfrintrm_s, void, env, i32, i32) -DEF_HELPER_3(vfrintrm_d, void, env, i32, i32) -DEF_HELPER_3(vfrint_s, void, env, i32, i32) -DEF_HELPER_3(vfrint_d, void, env, i32, i32) - -DEF_HELPER_3(vftintrne_w_s, void, env, i32, i32) -DEF_HELPER_3(vftintrne_l_d, void, env, i32, i32) -DEF_HELPER_3(vftintrz_w_s, void, env, i32, i32) -DEF_HELPER_3(vftintrz_l_d, void, env, i32, i32) -DEF_HELPER_3(vftintrp_w_s, void, env, i32, i32) -DEF_HELPER_3(vftintrp_l_d, void, env, i32, i32) -DEF_HELPER_3(vftintrm_w_s, void, env, i32, i32) -DEF_HELPER_3(vftintrm_l_d, void, env, i32, i32) -DEF_HELPER_3(vftint_w_s, void, env, i32, i32) -DEF_HELPER_3(vftint_l_d, void, env, i32, i32) -DEF_HELPER_3(vftintrz_wu_s, void, env, i32, i32) 
-DEF_HELPER_3(vftintrz_lu_d, void, env, i32, i32) -DEF_HELPER_3(vftint_wu_s, void, env, i32, i32) -DEF_HELPER_3(vftint_lu_d, void, env, i32, i32) +DEF_HELPER_FLAGS_4(vfrintrne_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrne_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrz_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrz_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrp_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrp_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrm_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrintrm_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrint_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vfrint_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + +DEF_HELPER_FLAGS_4(vftintrne_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrne_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrz_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrz_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrp_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrp_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrm_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrm_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftint_w_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftint_l_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrz_wu_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrz_lu_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftint_wu_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftint_lu_d, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) 
DEF_HELPER_FLAGS_5(vftintrne_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vftintrz_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vftintrp_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vftintrm_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vftint_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) -DEF_HELPER_3(vftintrnel_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrneh_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrzl_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrzh_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrpl_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrph_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrml_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintrmh_l_s, void, env, i32, i32) -DEF_HELPER_3(vftintl_l_s, void, env, i32, i32) -DEF_HELPER_3(vftinth_l_s, void, env, i32, i32) - -DEF_HELPER_3(vffint_s_w, void, env, i32, i32) -DEF_HELPER_3(vffint_d_l, void, env, i32, i32) -DEF_HELPER_3(vffint_s_wu, void, env, i32, i32) -DEF_HELPER_3(vffint_d_lu, void, env, i32, i32) -DEF_HELPER_3(vffintl_d_w, void, env, i32, i32) -DEF_HELPER_3(vffinth_d_w, void, env, i32, i32) +DEF_HELPER_FLAGS_4(vftintrnel_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrneh_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrzl_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrzh_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrpl_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrph_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrml_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintrmh_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftintl_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vftinth_l_s, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) + 
+DEF_HELPER_FLAGS_4(vffint_s_w, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffint_d_l, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffint_s_wu, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffint_d_lu, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffintl_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) +DEF_HELPER_FLAGS_4(vffinth_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vffint_s_l, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_4(vseqi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 15b361c6b3..2898ae06ce 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -2135,17 +2135,18 @@ DO_4OP_F(vfnmsub_s, 32, UW, float32_muladd, DO_4OP_F(vfnmsub_d, 64, UD, float64_muladd, float_muladd_negate_c | float_muladd_negate_result) -#define DO_2OP_F(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - vec_clear_cause(env); \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) = FN(env, Vj->E(i)); \ - } \ +#define DO_2OP_F(NAME, BIT, E, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + vec_clear_cause(env); \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + Vd->E(i) = FN(env, Vj->E(i)); \ + } \ } #define FLOGB(BIT, T) \ @@ -2166,16 +2167,17 @@ static T do_flogb_## BIT(CPULoongArchState *env, T fj) \ FLOGB(32, uint32_t) FLOGB(64, uint64_t) -#define FCLASS(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) = FN(env, Vj->E(i)); \ - } \ +#define 
FCLASS(NAME, BIT, E, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + Vd->E(i) = FN(env, Vj->E(i)); \ + } \ } FCLASS(vfclass_s, 32, UW, helper_fclass_s) @@ -2245,12 +2247,13 @@ static uint32_t float64_cvt_float32(uint64_t d, float_status *status) return float64_to_float32(d, status); } -void HELPER(vfcvtl_s_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfcvtl_s_h)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < LSX_LEN/32; i++) { @@ -2260,12 +2263,13 @@ void HELPER(vfcvtl_s_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) *Vd = temp; } -void HELPER(vfcvtl_d_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfcvtl_d_s)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < LSX_LEN/64; i++) { @@ -2275,12 +2279,13 @@ void HELPER(vfcvtl_d_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj) *Vd = temp; } -void HELPER(vfcvth_s_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfcvth_s_h)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < LSX_LEN/32; i++) { @@ -2290,12 +2295,13 @@ void HELPER(vfcvth_s_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) *Vd = temp; } -void HELPER(vfcvth_d_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfcvth_d_s)(void *vd, void *vj, + CPULoongArchState *env, 
uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < LSX_LEN/64; i++) { @@ -2341,11 +2347,12 @@ void HELPER(vfcvt_s_d)(void *vd, void *vj, void *vk, *Vd = temp; } -void HELPER(vfrint_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfrint_s)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < 4; i++) { @@ -2354,11 +2361,12 @@ void HELPER(vfrint_s)(CPULoongArchState *env, uint32_t vd, uint32_t vj) } } -void HELPER(vfrint_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vfrint_d)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < 2; i++) { @@ -2368,11 +2376,12 @@ void HELPER(vfrint_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) } #define FCVT_2OP(NAME, BIT, E, MODE) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ { \ int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ \ vec_clear_cause(env); \ for (i = 0; i < LSX_LEN/BIT; i++) { \ @@ -2493,19 +2502,20 @@ FTINT(rph_l_s, float32, int64, uint32_t, uint64_t, float_round_up) FTINT(rzh_l_s, float32, int64, uint32_t, uint64_t, float_round_to_zero) FTINT(rneh_l_s, float32, int64, uint32_t, uint64_t, float_round_nearest_even) -#define FTINTL_L_S(NAME, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = 
&(env->fpr[vj].vreg); \ - \ - vec_clear_cause(env); \ - for (i = 0; i < 2; i++) { \ - temp.D(i) = FN(env, Vj->UW(i)); \ - } \ - *Vd = temp; \ +#define FTINTL_L_S(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + vec_clear_cause(env); \ + for (i = 0; i < 2; i++) { \ + temp.D(i) = FN(env, Vj->UW(i)); \ + } \ + *Vd = temp; \ } FTINTL_L_S(vftintl_l_s, do_float32_to_int64) @@ -2514,19 +2524,20 @@ FTINTL_L_S(vftintrpl_l_s, do_ftintrpl_l_s) FTINTL_L_S(vftintrzl_l_s, do_ftintrzl_l_s) FTINTL_L_S(vftintrnel_l_s, do_ftintrnel_l_s) -#define FTINTH_L_S(NAME, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - vec_clear_cause(env); \ - for (i = 0; i < 2; i++) { \ - temp.D(i) = FN(env, Vj->UW(i + 2)); \ - } \ - *Vd = temp; \ +#define FTINTH_L_S(NAME, FN) \ +void HELPER(NAME)(void *vd, void *vj, \ + CPULoongArchState *env, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + vec_clear_cause(env); \ + for (i = 0; i < 2; i++) { \ + temp.D(i) = FN(env, Vj->UW(i + 2)); \ + } \ + *Vd = temp; \ } FTINTH_L_S(vftinth_l_s, do_float32_to_int64) @@ -2555,12 +2566,13 @@ DO_2OP_F(vffint_d_l, 64, D, do_ffint_d_l) DO_2OP_F(vffint_s_wu, 32, UW, do_ffint_s_wu) DO_2OP_F(vffint_d_lu, 64, UD, do_ffint_d_lu) -void HELPER(vffintl_d_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vffintl_d_w)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < 2; i++) { @@ -2570,12 +2582,13 @@ void HELPER(vffintl_d_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) *Vd = temp; } -void 
HELPER(vffinth_d_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vffinth_d_w)(void *vd, void *vj, + CPULoongArchState *env, uint32_t desc) { int i; VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; vec_clear_cause(env); for (i = 0; i < 2; i++) { diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index 6ead8fb4c5..11d7158809 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -86,6 +86,23 @@ static bool gen_vvv(DisasContext *ctx, arg_vvv *a, gen_helper_gvec_3 *fn) return gen_vvv_vl(ctx, a, 16, fn); } +static bool gen_vv_ptr_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz, + gen_helper_gvec_2_ptr *fn) +{ + tcg_gen_gvec_2_ptr(vec_full_offset(a->vd), + vec_full_offset(a->vj), + cpu_env, + oprsz, ctx->vl / 8, oprsz, fn); + return true; +} + +static bool gen_vv_ptr(DisasContext *ctx, arg_vv *a, + gen_helper_gvec_2_ptr *fn) +{ + CHECK_SXE; + return gen_vv_ptr_vl(ctx, a, 16, fn); +} + static bool gen_vv(DisasContext *ctx, arg_vv *a, void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32)) { @@ -3697,73 +3714,73 @@ TRANS(vfmaxa_d, LSX, gen_vvv_ptr, gen_helper_vfmaxa_d) TRANS(vfmina_s, LSX, gen_vvv_ptr, gen_helper_vfmina_s) TRANS(vfmina_d, LSX, gen_vvv_ptr, gen_helper_vfmina_d) -TRANS(vflogb_s, LSX, gen_vv, gen_helper_vflogb_s) -TRANS(vflogb_d, LSX, gen_vv, gen_helper_vflogb_d) +TRANS(vflogb_s, LSX, gen_vv_ptr, gen_helper_vflogb_s) +TRANS(vflogb_d, LSX, gen_vv_ptr, gen_helper_vflogb_d) -TRANS(vfclass_s, LSX, gen_vv, gen_helper_vfclass_s) -TRANS(vfclass_d, LSX, gen_vv, gen_helper_vfclass_d) +TRANS(vfclass_s, LSX, gen_vv_ptr, gen_helper_vfclass_s) +TRANS(vfclass_d, LSX, gen_vv_ptr, gen_helper_vfclass_d) -TRANS(vfsqrt_s, LSX, gen_vv, gen_helper_vfsqrt_s) -TRANS(vfsqrt_d, LSX, gen_vv, gen_helper_vfsqrt_d) -TRANS(vfrecip_s, LSX, gen_vv, gen_helper_vfrecip_s) -TRANS(vfrecip_d, LSX, 
gen_vv, gen_helper_vfrecip_d) -TRANS(vfrsqrt_s, LSX, gen_vv, gen_helper_vfrsqrt_s) -TRANS(vfrsqrt_d, LSX, gen_vv, gen_helper_vfrsqrt_d) +TRANS(vfsqrt_s, LSX, gen_vv_ptr, gen_helper_vfsqrt_s) +TRANS(vfsqrt_d, LSX, gen_vv_ptr, gen_helper_vfsqrt_d) +TRANS(vfrecip_s, LSX, gen_vv_ptr, gen_helper_vfrecip_s) +TRANS(vfrecip_d, LSX, gen_vv_ptr, gen_helper_vfrecip_d) +TRANS(vfrsqrt_s, LSX, gen_vv_ptr, gen_helper_vfrsqrt_s) +TRANS(vfrsqrt_d, LSX, gen_vv_ptr, gen_helper_vfrsqrt_d) -TRANS(vfcvtl_s_h, LSX, gen_vv, gen_helper_vfcvtl_s_h) -TRANS(vfcvth_s_h, LSX, gen_vv, gen_helper_vfcvth_s_h) -TRANS(vfcvtl_d_s, LSX, gen_vv, gen_helper_vfcvtl_d_s) -TRANS(vfcvth_d_s, LSX, gen_vv, gen_helper_vfcvth_d_s) +TRANS(vfcvtl_s_h, LSX, gen_vv_ptr, gen_helper_vfcvtl_s_h) +TRANS(vfcvth_s_h, LSX, gen_vv_ptr, gen_helper_vfcvth_s_h) +TRANS(vfcvtl_d_s, LSX, gen_vv_ptr, gen_helper_vfcvtl_d_s) +TRANS(vfcvth_d_s, LSX, gen_vv_ptr, gen_helper_vfcvth_d_s) TRANS(vfcvt_h_s, LSX, gen_vvv_ptr, gen_helper_vfcvt_h_s) TRANS(vfcvt_s_d, LSX, gen_vvv_ptr, gen_helper_vfcvt_s_d) -TRANS(vfrintrne_s, LSX, gen_vv, gen_helper_vfrintrne_s) -TRANS(vfrintrne_d, LSX, gen_vv, gen_helper_vfrintrne_d) -TRANS(vfrintrz_s, LSX, gen_vv, gen_helper_vfrintrz_s) -TRANS(vfrintrz_d, LSX, gen_vv, gen_helper_vfrintrz_d) -TRANS(vfrintrp_s, LSX, gen_vv, gen_helper_vfrintrp_s) -TRANS(vfrintrp_d, LSX, gen_vv, gen_helper_vfrintrp_d) -TRANS(vfrintrm_s, LSX, gen_vv, gen_helper_vfrintrm_s) -TRANS(vfrintrm_d, LSX, gen_vv, gen_helper_vfrintrm_d) -TRANS(vfrint_s, LSX, gen_vv, gen_helper_vfrint_s) -TRANS(vfrint_d, LSX, gen_vv, gen_helper_vfrint_d) - -TRANS(vftintrne_w_s, LSX, gen_vv, gen_helper_vftintrne_w_s) -TRANS(vftintrne_l_d, LSX, gen_vv, gen_helper_vftintrne_l_d) -TRANS(vftintrz_w_s, LSX, gen_vv, gen_helper_vftintrz_w_s) -TRANS(vftintrz_l_d, LSX, gen_vv, gen_helper_vftintrz_l_d) -TRANS(vftintrp_w_s, LSX, gen_vv, gen_helper_vftintrp_w_s) -TRANS(vftintrp_l_d, LSX, gen_vv, gen_helper_vftintrp_l_d) -TRANS(vftintrm_w_s, LSX, gen_vv, 
gen_helper_vftintrm_w_s) -TRANS(vftintrm_l_d, LSX, gen_vv, gen_helper_vftintrm_l_d) -TRANS(vftint_w_s, LSX, gen_vv, gen_helper_vftint_w_s) -TRANS(vftint_l_d, LSX, gen_vv, gen_helper_vftint_l_d) -TRANS(vftintrz_wu_s, LSX, gen_vv, gen_helper_vftintrz_wu_s) -TRANS(vftintrz_lu_d, LSX, gen_vv, gen_helper_vftintrz_lu_d) -TRANS(vftint_wu_s, LSX, gen_vv, gen_helper_vftint_wu_s) -TRANS(vftint_lu_d, LSX, gen_vv, gen_helper_vftint_lu_d) +TRANS(vfrintrne_s, LSX, gen_vv_ptr, gen_helper_vfrintrne_s) +TRANS(vfrintrne_d, LSX, gen_vv_ptr, gen_helper_vfrintrne_d) +TRANS(vfrintrz_s, LSX, gen_vv_ptr, gen_helper_vfrintrz_s) +TRANS(vfrintrz_d, LSX, gen_vv_ptr, gen_helper_vfrintrz_d) +TRANS(vfrintrp_s, LSX, gen_vv_ptr, gen_helper_vfrintrp_s) +TRANS(vfrintrp_d, LSX, gen_vv_ptr, gen_helper_vfrintrp_d) +TRANS(vfrintrm_s, LSX, gen_vv_ptr, gen_helper_vfrintrm_s) +TRANS(vfrintrm_d, LSX, gen_vv_ptr, gen_helper_vfrintrm_d) +TRANS(vfrint_s, LSX, gen_vv_ptr, gen_helper_vfrint_s) +TRANS(vfrint_d, LSX, gen_vv_ptr, gen_helper_vfrint_d) + +TRANS(vftintrne_w_s, LSX, gen_vv_ptr, gen_helper_vftintrne_w_s) +TRANS(vftintrne_l_d, LSX, gen_vv_ptr, gen_helper_vftintrne_l_d) +TRANS(vftintrz_w_s, LSX, gen_vv_ptr, gen_helper_vftintrz_w_s) +TRANS(vftintrz_l_d, LSX, gen_vv_ptr, gen_helper_vftintrz_l_d) +TRANS(vftintrp_w_s, LSX, gen_vv_ptr, gen_helper_vftintrp_w_s) +TRANS(vftintrp_l_d, LSX, gen_vv_ptr, gen_helper_vftintrp_l_d) +TRANS(vftintrm_w_s, LSX, gen_vv_ptr, gen_helper_vftintrm_w_s) +TRANS(vftintrm_l_d, LSX, gen_vv_ptr, gen_helper_vftintrm_l_d) +TRANS(vftint_w_s, LSX, gen_vv_ptr, gen_helper_vftint_w_s) +TRANS(vftint_l_d, LSX, gen_vv_ptr, gen_helper_vftint_l_d) +TRANS(vftintrz_wu_s, LSX, gen_vv_ptr, gen_helper_vftintrz_wu_s) +TRANS(vftintrz_lu_d, LSX, gen_vv_ptr, gen_helper_vftintrz_lu_d) +TRANS(vftint_wu_s, LSX, gen_vv_ptr, gen_helper_vftint_wu_s) +TRANS(vftint_lu_d, LSX, gen_vv_ptr, gen_helper_vftint_lu_d) TRANS(vftintrne_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrne_w_d) TRANS(vftintrz_w_d, LSX, gen_vvv_ptr, 
gen_helper_vftintrz_w_d) TRANS(vftintrp_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrp_w_d) TRANS(vftintrm_w_d, LSX, gen_vvv_ptr, gen_helper_vftintrm_w_d) TRANS(vftint_w_d, LSX, gen_vvv_ptr, gen_helper_vftint_w_d) -TRANS(vftintrnel_l_s, LSX, gen_vv, gen_helper_vftintrnel_l_s) -TRANS(vftintrneh_l_s, LSX, gen_vv, gen_helper_vftintrneh_l_s) -TRANS(vftintrzl_l_s, LSX, gen_vv, gen_helper_vftintrzl_l_s) -TRANS(vftintrzh_l_s, LSX, gen_vv, gen_helper_vftintrzh_l_s) -TRANS(vftintrpl_l_s, LSX, gen_vv, gen_helper_vftintrpl_l_s) -TRANS(vftintrph_l_s, LSX, gen_vv, gen_helper_vftintrph_l_s) -TRANS(vftintrml_l_s, LSX, gen_vv, gen_helper_vftintrml_l_s) -TRANS(vftintrmh_l_s, LSX, gen_vv, gen_helper_vftintrmh_l_s) -TRANS(vftintl_l_s, LSX, gen_vv, gen_helper_vftintl_l_s) -TRANS(vftinth_l_s, LSX, gen_vv, gen_helper_vftinth_l_s) - -TRANS(vffint_s_w, LSX, gen_vv, gen_helper_vffint_s_w) -TRANS(vffint_d_l, LSX, gen_vv, gen_helper_vffint_d_l) -TRANS(vffint_s_wu, LSX, gen_vv, gen_helper_vffint_s_wu) -TRANS(vffint_d_lu, LSX, gen_vv, gen_helper_vffint_d_lu) -TRANS(vffintl_d_w, LSX, gen_vv, gen_helper_vffintl_d_w) -TRANS(vffinth_d_w, LSX, gen_vv, gen_helper_vffinth_d_w) +TRANS(vftintrnel_l_s, LSX, gen_vv_ptr, gen_helper_vftintrnel_l_s) +TRANS(vftintrneh_l_s, LSX, gen_vv_ptr, gen_helper_vftintrneh_l_s) +TRANS(vftintrzl_l_s, LSX, gen_vv_ptr, gen_helper_vftintrzl_l_s) +TRANS(vftintrzh_l_s, LSX, gen_vv_ptr, gen_helper_vftintrzh_l_s) +TRANS(vftintrpl_l_s, LSX, gen_vv_ptr, gen_helper_vftintrpl_l_s) +TRANS(vftintrph_l_s, LSX, gen_vv_ptr, gen_helper_vftintrph_l_s) +TRANS(vftintrml_l_s, LSX, gen_vv_ptr, gen_helper_vftintrml_l_s) +TRANS(vftintrmh_l_s, LSX, gen_vv_ptr, gen_helper_vftintrmh_l_s) +TRANS(vftintl_l_s, LSX, gen_vv_ptr, gen_helper_vftintl_l_s) +TRANS(vftinth_l_s, LSX, gen_vv_ptr, gen_helper_vftinth_l_s) + +TRANS(vffint_s_w, LSX, gen_vv_ptr, gen_helper_vffint_s_w) +TRANS(vffint_d_l, LSX, gen_vv_ptr, gen_helper_vffint_d_l) +TRANS(vffint_s_wu, LSX, gen_vv_ptr, gen_helper_vffint_s_wu) 
+TRANS(vffint_d_lu, LSX, gen_vv_ptr, gen_helper_vffint_d_lu) +TRANS(vffintl_d_w, LSX, gen_vv_ptr, gen_helper_vffintl_d_w) +TRANS(vffinth_d_w, LSX, gen_vv_ptr, gen_helper_vffinth_d_w) TRANS(vffint_s_l, LSX, gen_vvv_ptr, gen_helper_vffint_s_l) static bool do_cmp(DisasContext *ctx, arg_vvv *a, MemOp mop, TCGCond cond) From patchwork Thu Sep 7 08:08:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13376214 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F1D7FEE14D9 for ; Thu, 7 Sep 2023 08:12:42 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA5H-000408-2F; Thu, 07 Sep 2023 04:09:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeA4z-0003vZ-Bb for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:33 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA4r-0003Le-1P for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:31 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxnusyhflkVjQhAA--.63842S3; Thu, 07 Sep 2023 16:09:22 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxviMthflkXE1wAA--.31585S10; Thu, 07 Sep 2023 16:09:21 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v5 08/57] target/loongarch: Use gen_helper_gvec_2 for 2OP vector 
instructions Date: Thu, 7 Sep 2023 16:08:27 +0800 Message-Id: <20230907080916.3974502-9-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230907080916.3974502-1-gaosong@loongson.cn> References: <20230907080916.3974502-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8DxviMthflkXE1wAA--.31585S10 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Song Gao --- target/loongarch/helper.h | 58 ++++----- target/loongarch/vec_helper.c | 124 ++++++++++---------- target/loongarch/insn_trans/trans_vec.c.inc | 16 ++- 3 files changed, 101 insertions(+), 97 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 0752cc7212..523591035d 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -331,37 +331,37 @@ DEF_HELPER_FLAGS_4(vsat_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vsat_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vsat_du, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) -DEF_HELPER_3(vexth_h_b, void, env, i32, i32) -DEF_HELPER_3(vexth_w_h, void, env, i32, i32) -DEF_HELPER_3(vexth_d_w, void, env, i32, i32) -DEF_HELPER_3(vexth_q_d, void, env, i32, i32) -DEF_HELPER_3(vexth_hu_bu, void, env, i32, 
i32) -DEF_HELPER_3(vexth_wu_hu, void, env, i32, i32) -DEF_HELPER_3(vexth_du_wu, void, env, i32, i32) -DEF_HELPER_3(vexth_qu_du, void, env, i32, i32) +DEF_HELPER_FLAGS_3(vexth_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vexth_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsigncov_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_3(vmskltz_b, void, env, i32, i32) -DEF_HELPER_3(vmskltz_h, void, env, i32, i32) -DEF_HELPER_3(vmskltz_w, void, env, i32, i32) -DEF_HELPER_3(vmskltz_d, void, env, i32, i32) -DEF_HELPER_3(vmskgez_b, void, env, i32, i32) -DEF_HELPER_3(vmsknz_b, void, env, i32,i32) +DEF_HELPER_FLAGS_3(vmskltz_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmskltz_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmskltz_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmskltz_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmskgez_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vmsknz_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vnori_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_4(vsllwil_h_b, void, env, i32, i32, i32) DEF_HELPER_4(vsllwil_w_h, void, env, i32, i32, i32) DEF_HELPER_4(vsllwil_d_w, void, env, i32, i32, i32) -DEF_HELPER_3(vextl_q_d, void, env, i32, i32) +DEF_HELPER_FLAGS_3(vextl_q_d, 
TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_4(vsllwil_hu_bu, void, env, i32, i32, i32) DEF_HELPER_4(vsllwil_wu_hu, void, env, i32, i32, i32) DEF_HELPER_4(vsllwil_du_wu, void, env, i32, i32, i32) -DEF_HELPER_3(vextl_qu_du, void, env, i32, i32) +DEF_HELPER_FLAGS_3(vextl_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrlr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrlr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) @@ -473,19 +473,19 @@ DEF_HELPER_4(vssrarni_hu_w, void, env, i32, i32, i32) DEF_HELPER_4(vssrarni_wu_d, void, env, i32, i32, i32) DEF_HELPER_4(vssrarni_du_q, void, env, i32, i32, i32) -DEF_HELPER_3(vclo_b, void, env, i32, i32) -DEF_HELPER_3(vclo_h, void, env, i32, i32) -DEF_HELPER_3(vclo_w, void, env, i32, i32) -DEF_HELPER_3(vclo_d, void, env, i32, i32) -DEF_HELPER_3(vclz_b, void, env, i32, i32) -DEF_HELPER_3(vclz_h, void, env, i32, i32) -DEF_HELPER_3(vclz_w, void, env, i32, i32) -DEF_HELPER_3(vclz_d, void, env, i32, i32) - -DEF_HELPER_3(vpcnt_b, void, env, i32, i32) -DEF_HELPER_3(vpcnt_h, void, env, i32, i32) -DEF_HELPER_3(vpcnt_w, void, env, i32, i32) -DEF_HELPER_3(vpcnt_d, void, env, i32, i32) +DEF_HELPER_FLAGS_3(vclo_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclo_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclo_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclo_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclz_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclz_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclz_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vclz_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(vpcnt_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vpcnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vpcnt_w, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(vpcnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vbitclr_b, 
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vbitclr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index 2898ae06ce..fd38b47c28 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -625,30 +625,30 @@ VSAT_U(vsat_hu, 16, UH) VSAT_U(vsat_wu, 32, UW) VSAT_U(vsat_du, 64, UD) -#define VEXTH(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - Vd->E1(i) = Vj->E2(i + LSX_LEN/BIT); \ - } \ +#define VEXTH(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + Vd->E1(i) = Vj->E2(i + LSX_LEN/BIT); \ + } \ } -void HELPER(vexth_q_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vexth_q_d)(void *vd, void *vj, uint32_t desc) { - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; Vd->Q(0) = int128_makes64(Vj->D(1)); } -void HELPER(vexth_qu_du)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vexth_qu_du)(void *vd, void *vj, uint32_t desc) { - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; Vd->Q(0) = int128_make64((uint64_t)Vj->D(1)); } @@ -677,11 +677,11 @@ static uint64_t do_vmskltz_b(int64_t val) return c >> 56; } -void HELPER(vmskltz_b)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskltz_b)(void *vd, void *vj, uint32_t desc) { uint16_t temp = 0; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; temp = do_vmskltz_b(Vj->D(0)); temp |= (do_vmskltz_b(Vj->D(1)) << 8); @@ -698,11 +698,11 @@ static 
uint64_t do_vmskltz_h(int64_t val) return c >> 60; } -void HELPER(vmskltz_h)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskltz_h)(void *vd, void *vj, uint32_t desc) { uint16_t temp = 0; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; temp = do_vmskltz_h(Vj->D(0)); temp |= (do_vmskltz_h(Vj->D(1)) << 4); @@ -718,11 +718,11 @@ static uint64_t do_vmskltz_w(int64_t val) return c >> 62; } -void HELPER(vmskltz_w)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskltz_w)(void *vd, void *vj, uint32_t desc) { uint16_t temp = 0; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; temp = do_vmskltz_w(Vj->D(0)); temp |= (do_vmskltz_w(Vj->D(1)) << 2); @@ -734,11 +734,11 @@ static uint64_t do_vmskltz_d(int64_t val) { return (uint64_t)val >> 63; } -void HELPER(vmskltz_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskltz_d)(void *vd, void *vj, uint32_t desc) { uint16_t temp = 0; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; temp = do_vmskltz_d(Vj->D(0)); temp |= (do_vmskltz_d(Vj->D(1)) << 1); @@ -746,11 +746,11 @@ void HELPER(vmskltz_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) Vd->D(1) = 0; } -void HELPER(vmskgez_b)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmskgez_b)(void *vd, void *vj, uint32_t desc) { uint16_t temp = 0; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; temp = do_vmskltz_b(Vj->D(0)); temp |= (do_vmskltz_b(Vj->D(1)) << 8); @@ -768,11 +768,11 @@ static uint64_t do_vmskez_b(uint64_t a) return c >> 56; } -void HELPER(vmsknz_b)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vmsknz_b)(void *vd, void *vj, uint32_t desc) { uint16_t temp = 0; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = 
&(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; temp = do_vmskez_b(Vj->D(0)); temp |= (do_vmskez_b(Vj->D(1)) << 8); @@ -809,18 +809,18 @@ void HELPER(NAME)(CPULoongArchState *env, \ *Vd = temp; \ } -void HELPER(vextl_q_d)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vextl_q_d)(void *vd, void *vj, uint32_t desc) { - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; Vd->Q(0) = int128_makes64(Vj->D(0)); } -void HELPER(vextl_qu_du)(CPULoongArchState *env, uint32_t vd, uint32_t vj) +void HELPER(vextl_qu_du)(void *vd, void *vj, uint32_t desc) { - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; Vd->Q(0) = int128_make64(Vj->D(0)); } @@ -1899,17 +1899,17 @@ VSSRARNUI(vssrarni_bu_h, 16, B, H) VSSRARNUI(vssrarni_hu_w, 32, H, W) VSSRARNUI(vssrarni_wu_d, 64, W, D) -#define DO_2OP(NAME, BIT, E, DO_OP) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) \ - { \ - Vd->E(i) = DO_OP(Vj->E(i)); \ - } \ +#define DO_2OP(NAME, BIT, E, DO_OP) \ +void HELPER(NAME)(void *vd, void *vj, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) \ + { \ + Vd->E(i) = DO_OP(Vj->E(i)); \ + } \ } #define DO_CLO_B(N) (clz32(~N & 0xff) - 24) @@ -1930,17 +1930,17 @@ DO_2OP(vclz_h, 16, UH, DO_CLZ_H) DO_2OP(vclz_w, 32, UW, DO_CLZ_W) DO_2OP(vclz_d, 64, UD, DO_CLZ_D) -#define VPCNT(NAME, BIT, E, FN) \ -void HELPER(NAME)(CPULoongArchState *env, uint32_t vd, uint32_t vj) \ -{ \ - int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) \ - { \ - Vd->E(i) = FN(Vj->E(i)); \ - } \ +#define VPCNT(NAME, BIT, E, FN) \ +void HELPER(NAME)(void *vd, void 
*vj, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) \ + { \ + Vd->E(i) = FN(Vj->E(i)); \ + } \ } VPCNT(vpcnt_b, 8, UB, ctpop8) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index 11d7158809..4c3d206df1 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -103,15 +103,19 @@ static bool gen_vv_ptr(DisasContext *ctx, arg_vv *a, return gen_vv_ptr_vl(ctx, a, 16, fn); } -static bool gen_vv(DisasContext *ctx, arg_vv *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32)) +static bool gen_vv_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz, + gen_helper_gvec_2 *fn) { - TCGv_i32 vd = tcg_constant_i32(a->vd); - TCGv_i32 vj = tcg_constant_i32(a->vj); + tcg_gen_gvec_2_ool(vec_full_offset(a->vd), + vec_full_offset(a->vj), + oprsz, ctx->vl / 8, oprsz, fn); + return true; +} +static bool gen_vv(DisasContext *ctx, arg_vv *a, gen_helper_gvec_2 *fn) +{ CHECK_SXE; - func(cpu_env, vd, vj); - return true; + return gen_vv_vl(ctx, a, 16, fn); } static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, From patchwork Thu Sep 7 08:08:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13376212 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4FA1EEE14D4 for ; Thu, 7 Sep 2023 08:12:41 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA5K-000411-Sz; Thu, 07 Sep 2023 04:09:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeA4z-0003va-Bp for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:33 -0400 Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA4r-0003Lb-0x for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:09:32 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8CxtPAyhflkVTQhAA--.1614S3; Thu, 07 Sep 2023 16:09:22 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxviMthflkXE1wAA--.31585S11; Thu, 07 Sep 2023 16:09:21 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v5 09/57] target/loongarch: Use gen_helper_gvec_2i for 2OP + imm vector instructions Date: Thu, 7 Sep 2023 16:08:28 +0800 Message-Id: <20230907080916.3974502-10-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230907080916.3974502-1-gaosong@loongson.cn> References: <20230907080916.3974502-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8DxviMthflkXE1wAA--.31585S11 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: 
qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Song Gao --- target/loongarch/helper.h | 146 +++---- target/loongarch/vec_helper.c | 445 +++++++++----------- target/loongarch/insn_trans/trans_vec.c.inc | 18 +- 3 files changed, 291 insertions(+), 318 deletions(-) diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h index 523591035d..1abd9e1410 100644 --- a/target/loongarch/helper.h +++ b/target/loongarch/helper.h @@ -354,32 +354,32 @@ DEF_HELPER_FLAGS_3(vmsknz_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vnori_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) -DEF_HELPER_4(vsllwil_h_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsllwil_w_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsllwil_d_w, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsllwil_h_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsllwil_w_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsllwil_d_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_3(vextl_q_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) -DEF_HELPER_4(vsllwil_hu_bu, void, env, i32, i32, i32) -DEF_HELPER_4(vsllwil_wu_hu, void, env, i32, i32, i32) -DEF_HELPER_4(vsllwil_du_wu, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsllwil_hu_bu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsllwil_wu_hu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsllwil_du_wu, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_3(vextl_qu_du, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrlr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrlr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrlr_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrlr_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_4(vsrlri_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlri_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlri_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlri_d, 
void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsrlri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlri_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vsrar_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrar_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrar_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_4(vsrari_b, void, env, i32, i32, i32) -DEF_HELPER_4(vsrari_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrari_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrari_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsrari_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrari_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrari_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrari_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vsrln_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrln_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) @@ -388,14 +388,14 @@ DEF_HELPER_FLAGS_4(vsran_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsran_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsran_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_4(vsrlni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlni_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlni_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vsrani_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrani_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrani_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrani_d_q, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsrlni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) 
+DEF_HELPER_FLAGS_4(vsrlni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrani_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrani_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrani_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrani_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vsrlrn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrlrn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) @@ -404,14 +404,14 @@ DEF_HELPER_FLAGS_4(vsrarn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrarn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vsrarn_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_4(vsrlrni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlrni_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlrni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrlrni_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarni_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vsrarni_d_q, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vsrlrni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlrni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlrni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrlrni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrarni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrarni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrarni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vsrarni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vssrln_b_h, TCG_CALL_NO_RWG, 
void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vssrln_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) @@ -426,22 +426,22 @@ DEF_HELPER_FLAGS_4(vssran_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vssran_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vssran_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_4(vssrlni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_wu_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlni_du_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_wu_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrani_du_q, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vssrlni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_wu_d, 
TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlni_du_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrani_du_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vssrlrn_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vssrlrn_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) @@ -456,22 +456,22 @@ DEF_HELPER_FLAGS_4(vssrarn_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vssrarn_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vssrarn_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_4(vssrlrni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_b_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_h_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_w_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_d_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_wu_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrlrni_du_q, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_bu_h, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_hu_w, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_wu_d, void, env, i32, i32, i32) -DEF_HELPER_4(vssrarni_du_q, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vssrlrni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlrni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlrni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlrni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, 
i64, i32) +DEF_HELPER_FLAGS_4(vssrarni_b_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrarni_h_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrarni_w_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrarni_d_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlrni_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlrni_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlrni_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrlrni_du_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrarni_bu_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrarni_hu_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrarni_wu_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vssrarni_du_q, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_3(vclo_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(vclo_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) @@ -516,8 +516,8 @@ DEF_HELPER_FLAGS_4(vbitrevi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(vfrstp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vfrstp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_4(vfrstpi_b, void, env, i32, i32, i32) -DEF_HELPER_4(vfrstpi_h, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vfrstpi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vfrstpi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_5(vfadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) DEF_HELPER_FLAGS_5(vfadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, env, i32) @@ -686,14 +686,14 @@ DEF_HELPER_FLAGS_5(vshuf_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vshuf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vshuf_w, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(vshuf_d, TCG_CALL_NO_RWG, 
void, ptr, ptr, ptr, i32) -DEF_HELPER_4(vshuf4i_b, void, env, i32, i32, i32) -DEF_HELPER_4(vshuf4i_h, void, env, i32, i32, i32) -DEF_HELPER_4(vshuf4i_w, void, env, i32, i32, i32) -DEF_HELPER_4(vshuf4i_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vshuf4i_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vshuf4i_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vshuf4i_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vshuf4i_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) -DEF_HELPER_4(vpermi_w, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vpermi_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) -DEF_HELPER_4(vextrins_b, void, env, i32, i32, i32) -DEF_HELPER_4(vextrins_h, void, env, i32, i32, i32) -DEF_HELPER_4(vextrins_w, void, env, i32, i32, i32) -DEF_HELPER_4(vextrins_d, void, env, i32, i32, i32) +DEF_HELPER_FLAGS_4(vextrins_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vextrins_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vextrins_w, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(vextrins_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/loongarch/vec_helper.c b/target/loongarch/vec_helper.c index fd38b47c28..4e10957b90 100644 --- a/target/loongarch/vec_helper.c +++ b/target/loongarch/vec_helper.c @@ -791,22 +791,21 @@ void HELPER(vnori_b)(void *vd, void *vj, uint64_t imm, uint32_t v) } } -#define VSLLWIL(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - typedef __typeof(temp.E1(0)) TD; \ - \ - temp.D(0) = 0; \ - temp.D(1) = 0; \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E1(i) = (TD)Vj->E2(i) << (imm % BIT); \ - } \ - *Vd = temp; \ +#define VSLLWIL(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg 
*Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + typedef __typeof(temp.E1(0)) TD; \ + \ + temp.D(0) = 0; \ + temp.D(1) = 0; \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E1(i) = (TD)Vj->E2(i) << (imm % BIT); \ + } \ + *Vd = temp; \ } void HELPER(vextl_q_d)(void *vd, void *vj, uint32_t desc) @@ -865,17 +864,16 @@ VSRLR(vsrlr_h, 16, uint16_t, H) VSRLR(vsrlr_w, 32, uint32_t, W) VSRLR(vsrlr_d, 64, uint64_t, D) -#define VSRLRI(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) = do_vsrlr_ ## E(Vj->E(i), imm); \ - } \ +#define VSRLRI(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + Vd->E(i) = do_vsrlr_ ## E(Vj->E(i), imm); \ + } \ } VSRLRI(vsrlri_b, 8, B) @@ -916,17 +914,16 @@ VSRAR(vsrar_h, 16, uint16_t, H) VSRAR(vsrar_w, 32, uint32_t, W) VSRAR(vsrar_d, 64, uint64_t, D) -#define VSRARI(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - Vd->E(i) = do_vsrar_ ## E(Vj->E(i), imm); \ - } \ +#define VSRARI(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + Vd->E(i) = do_vsrar_ ## E(Vj->E(i), imm); \ + } \ } VSRARI(vsrari_b, 8, B) @@ -972,31 +969,29 @@ VSRAN(vsran_b_h, 16, uint16_t, B, H) VSRAN(vsran_h_w, 32, uint32_t, H, W) VSRAN(vsran_w_d, 64, uint64_t, W, D) -#define VSRLNI(NAME, BIT, T, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - 
int i, max; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - temp.D(0) = 0; \ - temp.D(1) = 0; \ - max = LSX_LEN/BIT; \ - for (i = 0; i < max; i++) { \ - temp.E1(i) = R_SHIFT((T)Vj->E2(i), imm); \ - temp.E1(i + max) = R_SHIFT((T)Vd->E2(i), imm); \ - } \ - *Vd = temp; \ -} - -void HELPER(vsrlni_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +#define VSRLNI(NAME, BIT, T, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, max; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + temp.D(0) = 0; \ + temp.D(1) = 0; \ + max = LSX_LEN/BIT; \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = R_SHIFT((T)Vj->E2(i), imm); \ + temp.E1(i + max) = R_SHIFT((T)Vd->E2(i), imm); \ + } \ + *Vd = temp; \ +} + +void HELPER(vsrlni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; temp.D(0) = 0; temp.D(1) = 0; @@ -1009,31 +1004,29 @@ VSRLNI(vsrlni_b_h, 16, uint16_t, B, H) VSRLNI(vsrlni_h_w, 32, uint32_t, H, W) VSRLNI(vsrlni_w_d, 64, uint64_t, W, D) -#define VSRANI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - temp.D(0) = 0; \ - temp.D(1) = 0; \ - max = LSX_LEN/BIT; \ - for (i = 0; i < max; i++) { \ - temp.E1(i) = R_SHIFT(Vj->E2(i), imm); \ - temp.E1(i + max) = R_SHIFT(Vd->E2(i), imm); \ - } \ - *Vd = temp; \ -} - -void HELPER(vsrani_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +#define VSRANI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, max; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + temp.D(0) = 0; \ + temp.D(1) = 0; \ + max = 
LSX_LEN/BIT; \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = R_SHIFT(Vj->E2(i), imm); \ + temp.E1(i + max) = R_SHIFT(Vd->E2(i), imm); \ + } \ + *Vd = temp; \ +} + +void HELPER(vsrani_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; temp.D(0) = 0; temp.D(1) = 0; @@ -1082,31 +1075,29 @@ VSRARN(vsrarn_b_h, 16, uint8_t, B, H) VSRARN(vsrarn_h_w, 32, uint16_t, H, W) VSRARN(vsrarn_w_d, 64, uint32_t, W, D) -#define VSRLRNI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - temp.D(0) = 0; \ - temp.D(1) = 0; \ - max = LSX_LEN/BIT; \ - for (i = 0; i < max; i++) { \ - temp.E1(i) = do_vsrlr_ ## E2(Vj->E2(i), imm); \ - temp.E1(i + max) = do_vsrlr_ ## E2(Vd->E2(i), imm); \ - } \ - *Vd = temp; \ -} - -void HELPER(vsrlrni_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +#define VSRLRNI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, max; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + temp.D(0) = 0; \ + temp.D(1) = 0; \ + max = LSX_LEN/BIT; \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = do_vsrlr_ ## E2(Vj->E2(i), imm); \ + temp.E1(i + max) = do_vsrlr_ ## E2(Vd->E2(i), imm); \ + } \ + *Vd = temp; \ +} + +void HELPER(vsrlrni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; Int128 r1, r2; if (imm == 0) { @@ -1126,31 +1117,29 @@ VSRLRNI(vsrlrni_b_h, 16, B, H) VSRLRNI(vsrlrni_h_w, 32, H, W) VSRLRNI(vsrlrni_w_d, 64, W, D) -#define VSRARNI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, 
uint32_t imm) \ -{ \ - int i, max; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - temp.D(0) = 0; \ - temp.D(1) = 0; \ - max = LSX_LEN/BIT; \ - for (i = 0; i < max; i++) { \ - temp.E1(i) = do_vsrar_ ## E2(Vj->E2(i), imm); \ - temp.E1(i + max) = do_vsrar_ ## E2(Vd->E2(i), imm); \ - } \ - *Vd = temp; \ -} - -void HELPER(vsrarni_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +#define VSRARNI(NAME, BIT, E1, E2) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, max; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + temp.D(0) = 0; \ + temp.D(1) = 0; \ + max = LSX_LEN/BIT; \ + for (i = 0; i < max; i++) { \ + temp.E1(i) = do_vsrar_ ## E2(Vj->E2(i), imm); \ + temp.E1(i + max) = do_vsrar_ ## E2(Vd->E2(i), imm); \ + } \ + *Vd = temp; \ +} + +void HELPER(vsrarni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; Int128 r1, r2; if (imm == 0) { @@ -1336,13 +1325,12 @@ VSSRANU(vssran_hu_w, 32, uint32_t, H, W) VSSRANU(vssran_wu_d, 64, uint64_t, W, D) #define VSSRLNI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ { \ int i; \ VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ temp.E1(i) = do_ssrlns_ ## E1(Vj->E2(i), imm, BIT/2 -1); \ @@ -1351,12 +1339,11 @@ void HELPER(NAME)(CPULoongArchState *env, \ *Vd = temp; \ } -void HELPER(vssrlni_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vssrlni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { Int128 shft_res1, shft_res2, mask; - VReg *Vd = &(env->fpr[vd].vreg); - VReg 
*Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; if (imm == 0) { shft_res1 = Vj->Q(0); @@ -1385,13 +1372,12 @@ VSSRLNI(vssrlni_h_w, 32, H, W) VSSRLNI(vssrlni_w_d, 64, W, D) #define VSSRANI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ { \ int i; \ VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ temp.E1(i) = do_ssrans_ ## E1(Vj->E2(i), imm, BIT/2 -1); \ @@ -1400,12 +1386,11 @@ void HELPER(NAME)(CPULoongArchState *env, \ *Vd = temp; \ } -void HELPER(vssrani_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vssrani_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { Int128 shft_res1, shft_res2, mask, min; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; if (imm == 0) { shft_res1 = Vj->Q(0); @@ -1439,13 +1424,12 @@ VSSRANI(vssrani_h_w, 32, H, W) VSSRANI(vssrani_w_d, 64, W, D) #define VSSRLNUI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ { \ int i; \ VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ temp.E1(i) = do_ssrlnu_ ## E1(Vj->E2(i), imm, BIT/2); \ @@ -1454,12 +1438,11 @@ void HELPER(NAME)(CPULoongArchState *env, \ *Vd = temp; \ } -void HELPER(vssrlni_du_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vssrlni_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { Int128 shft_res1, shft_res2, mask; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + 
VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; if (imm == 0) { shft_res1 = Vj->Q(0); @@ -1488,13 +1471,12 @@ VSSRLNUI(vssrlni_hu_w, 32, H, W) VSSRLNUI(vssrlni_wu_d, 64, W, D) #define VSSRANUI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ { \ int i; \ VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ temp.E1(i) = do_ssranu_ ## E1(Vj->E2(i), imm, BIT/2); \ @@ -1503,12 +1485,11 @@ void HELPER(NAME)(CPULoongArchState *env, \ *Vd = temp; \ } -void HELPER(vssrani_du_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vssrani_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { Int128 shft_res1, shft_res2, mask; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; if (imm == 0) { shft_res1 = Vj->Q(0); @@ -1701,13 +1682,12 @@ VSSRARNU(vssrarn_hu_w, 32, uint32_t, H, W) VSSRARNU(vssrarn_wu_d, 64, uint64_t, W, D) #define VSSRLRNI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ { \ int i; \ VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ temp.E1(i) = do_ssrlrns_ ## E1(Vj->E2(i), imm, BIT/2 -1); \ @@ -1717,12 +1697,11 @@ void HELPER(NAME)(CPULoongArchState *env, \ } #define VSSRLRNI_Q(NAME, sh) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ { \ Int128 shft_res1, shft_res2, mask, r1, r2; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = 
&(env->fpr[vj].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ \ if (imm == 0) { \ shft_res1 = Vj->Q(0); \ @@ -1756,13 +1735,12 @@ VSSRLRNI(vssrlrni_w_d, 64, W, D) VSSRLRNI_Q(vssrlrni_d_q, 63) #define VSSRARNI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ { \ int i; \ VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ temp.E1(i) = do_ssrarns_ ## E1(Vj->E2(i), imm, BIT/2 -1); \ @@ -1771,12 +1749,11 @@ void HELPER(NAME)(CPULoongArchState *env, *Vd = temp; \ } -void HELPER(vssrarni_d_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vssrarni_d_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { Int128 shft_res1, shft_res2, mask1, mask2, r1, r2; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; if (imm == 0) { shft_res1 = Vj->Q(0); @@ -1814,13 +1791,12 @@ VSSRARNI(vssrarni_h_w, 32, H, W) VSSRARNI(vssrarni_w_d, 64, W, D) #define VSSRLRNUI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ { \ int i; \ VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ temp.E1(i) = do_ssrlrnu_ ## E1(Vj->E2(i), imm, BIT/2); \ @@ -1835,13 +1811,12 @@ VSSRLRNUI(vssrlrni_wu_d, 64, W, D) VSSRLRNI_Q(vssrlrni_du_q, 64) #define VSSRARNUI(NAME, BIT, E1, E2) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ { \ int i; \ VReg temp; \ - VReg *Vd = 
&(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ \ for (i = 0; i < LSX_LEN/BIT; i++) { \ temp.E1(i) = do_ssrarnu_ ## E1(Vj->E2(i), imm, BIT/2); \ @@ -1850,12 +1825,11 @@ void HELPER(NAME)(CPULoongArchState *env, \ *Vd = temp; \ } -void HELPER(vssrarni_du_q)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vssrarni_du_q)(void *vd, void *vj, uint64_t imm, uint32_t desc) { Int128 shft_res1, shft_res2, mask1, mask2, r1, r2; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; if (imm == 0) { shft_res1 = Vj->Q(0); @@ -2023,21 +1997,20 @@ void HELPER(NAME)(void *vd, void *vj, void *vk, uint32_t desc) \ VFRSTP(vfrstp_b, 8, 0xf, B) VFRSTP(vfrstp_h, 16, 0x7, H) -#define VFRSTPI(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i, m; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - if (Vj->E(i) < 0) { \ - break; \ - } \ - } \ - m = imm % (LSX_LEN/BIT); \ - Vd->E(m) = i; \ +#define VFRSTPI(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i, m; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + if (Vj->E(i) < 0) { \ + break; \ + } \ + } \ + m = imm % (LSX_LEN/BIT); \ + Vd->E(m) = i; \ } VFRSTPI(vfrstpi_b, 8, B) @@ -2923,31 +2896,29 @@ VSHUF(vshuf_h, 16, H) VSHUF(vshuf_w, 32, W) VSHUF(vshuf_d, 64, D) -#define VSHUF4I(NAME, BIT, E) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int i; \ - VReg temp; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - for (i = 0; i < LSX_LEN/BIT; i++) { \ - temp.E(i) = Vj->E(((i) & 0xfc) + (((imm) >> \ - (2 * ((i) & 0x03))) & 0x03)); \ - } \ - *Vd = temp; \ +#define 
VSHUF4I(NAME, BIT, E) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int i; \ + VReg temp; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + for (i = 0; i < LSX_LEN/BIT; i++) { \ + temp.E(i) = Vj->E(((i) & 0xfc) + (((imm) >> \ + (2 * ((i) & 0x03))) & 0x03)); \ + } \ + *Vd = temp; \ } VSHUF4I(vshuf4i_b, 8, B) VSHUF4I(vshuf4i_h, 16, H) VSHUF4I(vshuf4i_w, 32, W) -void HELPER(vshuf4i_d)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vshuf4i_d)(void *vd, void *vj, uint64_t imm, uint32_t desc) { - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; VReg temp; temp.D(0) = (imm & 2 ? Vj : Vd)->D(imm & 1); @@ -2955,12 +2926,11 @@ void HELPER(vshuf4i_d)(CPULoongArchState *env, *Vd = temp; } -void HELPER(vpermi_w)(CPULoongArchState *env, - uint32_t vd, uint32_t vj, uint32_t imm) +void HELPER(vpermi_w)(void *vd, void *vj, uint64_t imm, uint32_t desc) { VReg temp; - VReg *Vd = &(env->fpr[vd].vreg); - VReg *Vj = &(env->fpr[vj].vreg); + VReg *Vd = (VReg *)vd; + VReg *Vj = (VReg *)vj; temp.W(0) = Vj->W(imm & 0x3); temp.W(1) = Vj->W((imm >> 2) & 0x3); @@ -2969,17 +2939,16 @@ void HELPER(vpermi_w)(CPULoongArchState *env, *Vd = temp; } -#define VEXTRINS(NAME, BIT, E, MASK) \ -void HELPER(NAME)(CPULoongArchState *env, \ - uint32_t vd, uint32_t vj, uint32_t imm) \ -{ \ - int ins, extr; \ - VReg *Vd = &(env->fpr[vd].vreg); \ - VReg *Vj = &(env->fpr[vj].vreg); \ - \ - ins = (imm >> 4) & MASK; \ - extr = imm & MASK; \ - Vd->E(ins) = Vj->E(extr); \ +#define VEXTRINS(NAME, BIT, E, MASK) \ +void HELPER(NAME)(void *vd, void *vj, uint64_t imm, uint32_t desc) \ +{ \ + int ins, extr; \ + VReg *Vd = (VReg *)vd; \ + VReg *Vj = (VReg *)vj; \ + \ + ins = (imm >> 4) & MASK; \ + extr = imm & MASK; \ + Vd->E(ins) = Vj->E(extr); \ } VEXTRINS(vextrins_b, 8, B, 0xf) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc 
b/target/loongarch/insn_trans/trans_vec.c.inc index 4c3d206df1..41c2996e90 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -118,16 +118,20 @@ static bool gen_vv(DisasContext *ctx, arg_vv *a, gen_helper_gvec_2 *fn) return gen_vv_vl(ctx, a, 16, fn); } -static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, - void (*func)(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32)) +static bool gen_vv_i_vl(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz, + gen_helper_gvec_2i *fn) { - TCGv_i32 vd = tcg_constant_i32(a->vd); - TCGv_i32 vj = tcg_constant_i32(a->vj); - TCGv_i32 imm = tcg_constant_i32(a->imm); + tcg_gen_gvec_2i_ool(vec_full_offset(a->vd), + vec_full_offset(a->vj), + tcg_constant_i64(a->imm), + oprsz, ctx->vl / 8, oprsz, fn); + return true; +} +static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, gen_helper_gvec_2i *fn) +{ CHECK_SXE; - func(cpu_env, vd, vj, imm); - return true; + return gen_vv_i_vl(ctx, a, 16, fn); } static bool gen_cv(DisasContext *ctx, arg_cv *a, From patchwork Thu Sep 7 08:08:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Gao X-Patchwork-Id: 13376210 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EE7C9EE14D4 for ; Thu, 7 Sep 2023 08:11:56 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA6L-000568-KN; Thu, 07 Sep 2023 04:11:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeA69-0004nS-Ue for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:10:46 -0400 
Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeA66-0003lD-LJ for qemu-devel@nongnu.org; Thu, 07 Sep 2023 04:10:45 -0400 Received: from loongson.cn (unknown [10.2.5.185]) by gateway (Coremail) with SMTP id _____8BxpPAzhflkWDQhAA--.753S3; Thu, 07 Sep 2023 16:09:23 +0800 (CST) Received: from localhost.localdomain (unknown [10.2.5.185]) by localhost.localdomain (Coremail) with SMTP id AQAAf8DxviMthflkXE1wAA--.31585S12; Thu, 07 Sep 2023 16:09:22 +0800 (CST) From: Song Gao To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, maobibo@loongson.cn Subject: [PATCH v5 10/57] target/loongarch: Replace CHECK_SXE to check_vec(ctx, 16) Date: Thu, 7 Sep 2023 16:08:29 +0800 Message-Id: <20230907080916.3974502-11-gaosong@loongson.cn> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230907080916.3974502-1-gaosong@loongson.cn> References: <20230907080916.3974502-1-gaosong@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8DxviMthflkXE1wAA--.31585S12 X-CM-SenderInfo: 5jdr20tqj6z05rqj20fqof0/ X-Coremail-Antispam: 1Uk129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7 ZEXasCq-sGcSsGvfJ3UbIjqfuFe4nvWSU5nxnvy29KBjDU0xBIdaVrnUUvcSsGvfC2Kfnx nUUI43ZEXa7xR_UUUUUUUUU== Received-SPF: pass client-ip=114.242.206.163; envelope-from=gaosong@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Introduce a new function check_vec to replace CHECK_SXE. Signed-off-by: Song Gao --- target/loongarch/insn_trans/trans_vec.c.inc | 248 +++++++++++++++----- 1 file
changed, 192 insertions(+), 56 deletions(-) diff --git a/target/loongarch/insn_trans/trans_vec.c.inc b/target/loongarch/insn_trans/trans_vec.c.inc index 41c2996e90..0985191c70 100644 --- a/target/loongarch/insn_trans/trans_vec.c.inc +++ b/target/loongarch/insn_trans/trans_vec.c.inc @@ -5,14 +5,23 @@ */ #ifndef CONFIG_USER_ONLY -#define CHECK_SXE do { \ - if ((ctx->base.tb->flags & HW_FLAGS_EUEN_SXE) == 0) { \ - generate_exception(ctx, EXCCODE_SXD); \ - return true; \ - } \ -} while (0) + +static bool check_vec(DisasContext *ctx, uint32_t oprsz) +{ + if ((oprsz == 16) && ((ctx->base.tb->flags & HW_FLAGS_EUEN_SXE) == 0)) { + generate_exception(ctx, EXCCODE_SXD); + return false; + } + return true; +} + #else -#define CHECK_SXE + +static bool check_vec(DisasContext *ctx, uint32_t oprsz) +{ + return true; +} + #endif static bool gen_vvvv_ptr_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, @@ -30,7 +39,10 @@ static bool gen_vvvv_ptr_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, static bool gen_vvvv_ptr(DisasContext *ctx, arg_vvvv *a, gen_helper_gvec_4_ptr *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vvvv_ptr_vl(ctx, a, 16, fn); } @@ -48,7 +60,10 @@ static bool gen_vvvv_vl(DisasContext *ctx, arg_vvvv *a, uint32_t oprsz, static bool gen_vvvv(DisasContext *ctx, arg_vvvv *a, gen_helper_gvec_4 *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vvvv_vl(ctx, a, 16, fn); } @@ -66,7 +81,10 @@ static bool gen_vvv_ptr_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, static bool gen_vvv_ptr(DisasContext *ctx, arg_vvv *a, gen_helper_gvec_3_ptr *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vvv_ptr_vl(ctx, a, 16, fn); } @@ -82,7 +100,10 @@ static bool gen_vvv_vl(DisasContext *ctx, arg_vvv *a, uint32_t oprsz, static bool gen_vvv(DisasContext *ctx, arg_vvv *a, gen_helper_gvec_3 *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vvv_vl(ctx, a, 16, 
fn); } @@ -99,7 +120,10 @@ static bool gen_vv_ptr_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz, static bool gen_vv_ptr(DisasContext *ctx, arg_vv *a, gen_helper_gvec_2_ptr *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vv_ptr_vl(ctx, a, 16, fn); } @@ -114,7 +138,10 @@ static bool gen_vv_vl(DisasContext *ctx, arg_vv *a, uint32_t oprsz, static bool gen_vv(DisasContext *ctx, arg_vv *a, gen_helper_gvec_2 *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vv_vl(ctx, a, 16, fn); } @@ -130,7 +157,10 @@ static bool gen_vv_i_vl(DisasContext *ctx, arg_vv_i *a, uint32_t oprsz, static bool gen_vv_i(DisasContext *ctx, arg_vv_i *a, gen_helper_gvec_2i *fn) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gen_vv_i_vl(ctx, a, 16, fn); } @@ -140,7 +170,10 @@ static bool gen_cv(DisasContext *ctx, arg_cv *a, TCGv_i32 vj = tcg_constant_i32(a->vj); TCGv_i32 cd = tcg_constant_i32(a->cd); - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + func(cpu_env, cd, vj); return true; } @@ -162,7 +195,10 @@ static bool gvec_vvv(DisasContext *ctx, arg_vvv *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t)) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gvec_vvv_vl(ctx, a, 16, mop, func); } @@ -184,7 +220,10 @@ static bool gvec_vv(DisasContext *ctx, arg_vv *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, uint32_t, uint32_t)) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gvec_vv_vl(ctx, a, 16, mop, func); } @@ -204,7 +243,10 @@ static bool gvec_vv_i(DisasContext *ctx, arg_vv_i *a, MemOp mop, void (*func)(unsigned, uint32_t, uint32_t, int64_t, uint32_t, uint32_t)) { - CHECK_SXE; + if (!check_vec(ctx, 16)) { + return true; + } + return gvec_vv_i_vl(ctx, a, 16, mop, func); } @@ -220,7 +262,10 @@ static bool gvec_subi_vl(DisasContext *ctx, arg_vv_i *a, static bool gvec_subi(DisasContext *ctx, arg_vv_i *a, MemOp 
mop)
 {
-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     return gvec_subi_vl(ctx, a, 16, mop);
 }
@@ -238,7 +283,9 @@ static bool trans_v## NAME ##_q(DisasContext *ctx, arg_vvv *a) \
         return false; \
     } \
     \
-    CHECK_SXE; \
+    if (!check_vec(ctx, 16)) { \
+        return true; \
+    } \
     \
     rh = tcg_temp_new_i64(); \
     rl = tcg_temp_new_i64(); \
@@ -3138,7 +3185,9 @@ static bool trans_vldi(DisasContext *ctx, arg_vldi *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     sel = (a->imm >> 12) & 0x1;

@@ -3168,7 +3217,9 @@ static bool trans_vandn_v(DisasContext *ctx, arg_vvv *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     vd_ofs = vec_full_offset(a->vd);
     vj_ofs = vec_full_offset(a->vj);
@@ -3795,7 +3846,9 @@ static bool do_cmp(DisasContext *ctx, arg_vvv *a, MemOp mop, TCGCond cond)
 {
     uint32_t vd_ofs, vj_ofs, vk_ofs;

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     vd_ofs = vec_full_offset(a->vd);
     vj_ofs = vec_full_offset(a->vj);
@@ -3841,7 +3894,9 @@ static bool do_## NAME ##_s(DisasContext *ctx, arg_vv_i *a, MemOp mop) \
 { \
     uint32_t vd_ofs, vj_ofs; \
     \
-    CHECK_SXE; \
+    if (!check_vec(ctx, 16)) { \
+        return true; \
+    } \
     \
     static const TCGOpcode vecop_list[] = { \
         INDEX_op_cmp_vec, 0 \
@@ -3890,7 +3945,9 @@ static bool do_## NAME ##_u(DisasContext *ctx, arg_vv_i *a, MemOp mop) \
 { \
     uint32_t vd_ofs, vj_ofs; \
     \
-    CHECK_SXE; \
+    if (!check_vec(ctx, 16)) { \
+        return true; \
+    } \
     \
     static const TCGOpcode vecop_list[] = { \
         INDEX_op_cmp_vec, 0 \
@@ -3988,7 +4045,9 @@ static bool trans_vfcmp_cond_s(DisasContext *ctx, arg_vvv_fcond *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     fn = (a->fcond & 1 ? gen_helper_vfcmp_s_s : gen_helper_vfcmp_c_s);
     flags = get_fcmp_flags(a->fcond >> 1);
@@ -4009,7 +4068,9 @@ static bool trans_vfcmp_cond_d(DisasContext *ctx, arg_vvv_fcond *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     fn = (a->fcond & 1 ? gen_helper_vfcmp_s_d : gen_helper_vfcmp_c_d);
     flags = get_fcmp_flags(a->fcond >> 1);
@@ -4024,7 +4085,9 @@ static bool trans_vbitsel_v(DisasContext *ctx, arg_vvvv *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     tcg_gen_gvec_bitsel(MO_64, vec_full_offset(a->vd), vec_full_offset(a->va),
                         vec_full_offset(a->vk), vec_full_offset(a->vj),
@@ -4050,7 +4113,9 @@ static bool trans_vbitseli_b(DisasContext *ctx, arg_vv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     tcg_gen_gvec_2i(vec_full_offset(a->vd), vec_full_offset(a->vj),
                     16, ctx->vl/8, a->imm, &op);
@@ -4073,7 +4138,10 @@ static bool trans_## NAME (DisasContext *ctx, arg_cv *a) \
         return false; \
     } \
     \
-    CHECK_SXE; \
+    if (!check_vec(ctx, 16)) { \
+        return true; \
+    } \
+    \
     tcg_gen_or_i64(t1, al, ah); \
     tcg_gen_setcondi_i64(COND, t1, t1, 0); \
     tcg_gen_st8_tl(t1, cpu_env, offsetof(CPULoongArchState, cf[a->cd & 0x7])); \
@@ -4101,7 +4169,10 @@ static bool trans_vinsgr2vr_b(DisasContext *ctx, arg_vr_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_st8_i64(src, cpu_env,
                     offsetof(CPULoongArchState, fpr[a->vd].vreg.B(a->imm)));
     return true;
@@ -4115,7 +4186,10 @@ static bool trans_vinsgr2vr_h(DisasContext *ctx, arg_vr_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_st16_i64(src, cpu_env,
                      offsetof(CPULoongArchState, fpr[a->vd].vreg.H(a->imm)));
     return true;
@@ -4129,7 +4203,10 @@ static bool trans_vinsgr2vr_w(DisasContext *ctx, arg_vr_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_st32_i64(src, cpu_env,
                      offsetof(CPULoongArchState, fpr[a->vd].vreg.W(a->imm)));
     return true;
@@ -4143,7 +4220,10 @@ static bool trans_vinsgr2vr_d(DisasContext *ctx, arg_vr_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_st_i64(src, cpu_env,
                    offsetof(CPULoongArchState, fpr[a->vd].vreg.D(a->imm)));
     return true;
@@ -4157,7 +4237,10 @@ static bool trans_vpickve2gr_b(DisasContext *ctx, arg_rv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_ld8s_i64(dst, cpu_env,
                      offsetof(CPULoongArchState, fpr[a->vj].vreg.B(a->imm)));
     return true;
@@ -4171,7 +4254,10 @@ static bool trans_vpickve2gr_h(DisasContext *ctx, arg_rv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_ld16s_i64(dst, cpu_env,
                       offsetof(CPULoongArchState, fpr[a->vj].vreg.H(a->imm)));
     return true;
@@ -4185,7 +4271,10 @@ static bool trans_vpickve2gr_w(DisasContext *ctx, arg_rv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_ld32s_i64(dst, cpu_env,
                       offsetof(CPULoongArchState, fpr[a->vj].vreg.W(a->imm)));
     return true;
@@ -4199,7 +4288,10 @@ static bool trans_vpickve2gr_d(DisasContext *ctx, arg_rv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_ld_i64(dst, cpu_env,
                    offsetof(CPULoongArchState, fpr[a->vj].vreg.D(a->imm)));
     return true;
@@ -4213,7 +4305,10 @@ static bool trans_vpickve2gr_bu(DisasContext *ctx, arg_rv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_ld8u_i64(dst, cpu_env,
                      offsetof(CPULoongArchState, fpr[a->vj].vreg.B(a->imm)));
     return true;
@@ -4227,7 +4322,10 @@ static bool trans_vpickve2gr_hu(DisasContext *ctx, arg_rv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_ld16u_i64(dst, cpu_env,
                       offsetof(CPULoongArchState, fpr[a->vj].vreg.H(a->imm)));
     return true;
@@ -4241,7 +4339,10 @@ static bool trans_vpickve2gr_wu(DisasContext *ctx, arg_rv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_ld32u_i64(dst, cpu_env,
                       offsetof(CPULoongArchState, fpr[a->vj].vreg.W(a->imm)));
     return true;
@@ -4255,7 +4356,10 @@ static bool trans_vpickve2gr_du(DisasContext *ctx, arg_rv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_ld_i64(dst, cpu_env,
                    offsetof(CPULoongArchState, fpr[a->vj].vreg.D(a->imm)));
     return true;
@@ -4269,7 +4373,9 @@ static bool gvec_dup(DisasContext *ctx, arg_vr *a, MemOp mop)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     tcg_gen_gvec_dup_i64(mop, vec_full_offset(a->vd),
                          16, ctx->vl/8, src);
@@ -4287,7 +4393,10 @@ static bool trans_vreplvei_b(DisasContext *ctx, arg_vv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_gvec_dup_mem(MO_8,vec_full_offset(a->vd),
                          offsetof(CPULoongArchState,
                                   fpr[a->vj].vreg.B((a->imm))),
@@ -4301,7 +4410,10 @@ static bool trans_vreplvei_h(DisasContext *ctx, arg_vv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_gvec_dup_mem(MO_16, vec_full_offset(a->vd),
                          offsetof(CPULoongArchState,
                                   fpr[a->vj].vreg.H((a->imm))),
@@ -4314,7 +4426,10 @@ static bool trans_vreplvei_w(DisasContext *ctx, arg_vv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_gvec_dup_mem(MO_32, vec_full_offset(a->vd),
                          offsetof(CPULoongArchState,
                                   fpr[a->vj].vreg.W((a->imm))),
@@ -4327,7 +4442,10 @@ static bool trans_vreplvei_d(DisasContext *ctx, arg_vv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }
+
     tcg_gen_gvec_dup_mem(MO_64, vec_full_offset(a->vd),
                          offsetof(CPULoongArchState,
                                   fpr[a->vj].vreg.D((a->imm))),
@@ -4346,7 +4464,9 @@ static bool gen_vreplve(DisasContext *ctx, arg_vvr *a, int vece, int bit,
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     tcg_gen_andi_i64(t0, gpr_src(ctx, a->rk, EXT_NONE), (LSX_LEN/bit) -1);
     tcg_gen_shli_i64(t0, t0, vece);
@@ -4376,7 +4496,9 @@ static bool trans_vbsll_v(DisasContext *ctx, arg_vv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     desthigh = tcg_temp_new_i64();
     destlow = tcg_temp_new_i64();
@@ -4410,7 +4532,9 @@ static bool trans_vbsrl_v(DisasContext *ctx, arg_vv_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     desthigh = tcg_temp_new_i64();
     destlow = tcg_temp_new_i64();
@@ -4488,7 +4612,9 @@ static bool trans_vld(DisasContext *ctx, arg_vr_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     addr = gpr_src(ctx, a->rj, EXT_NONE);
     val = tcg_temp_new_i128();
@@ -4515,7 +4641,9 @@ static bool trans_vst(DisasContext *ctx, arg_vr_i *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     addr = gpr_src(ctx, a->rj, EXT_NONE);
     val = tcg_temp_new_i128();
@@ -4542,7 +4670,9 @@ static bool trans_vldx(DisasContext *ctx, arg_vrr *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     src1 = gpr_src(ctx, a->rj, EXT_NONE);
     src2 = gpr_src(ctx, a->rk, EXT_NONE);
@@ -4569,7 +4699,9 @@ static bool trans_vstx(DisasContext *ctx, arg_vrr *a)
         return false;
     }

-    CHECK_SXE;
+    if (!check_vec(ctx, 16)) {
+        return true;
+    }

     src1 = gpr_src(ctx, a->rj, EXT_NONE);
     src2 = gpr_src(ctx, a->rk, EXT_NONE);
@@ -4596,7 +4728,9 @@ static bool trans_## NAME (DisasContext *ctx, arg_vr_i *a) \
         return false; \
     } \
     \
-    CHECK_SXE; \
+    if (!check_vec(ctx, 16)) { \
+        return true; \
+    } \
     \
     addr = gpr_src(ctx, a->rj, EXT_NONE); \
     val = tcg_temp_new_i64(); \
@@ -4624,7 +4758,9 @@ static bool trans_## NAME (DisasContext *ctx, arg_vr_ii *a) \
         return false; \
     } \
     \
-    CHECK_SXE; \
+    if (!check_vec(ctx, 16)) { \
+        return true; \
+    } \
     \
     addr = gpr_src(ctx, a->rj, EXT_NONE); \
     val = tcg_temp_new_i64(); \