From patchwork Wed Sep 11 06:25:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140353 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C798514ED for ; Wed, 11 Sep 2019 06:36:27 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A82D5207FC for ; Wed, 11 Sep 2019 06:36:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A82D5207FC Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46844 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wEs-0000FL-FD for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:36:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38358) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wDU-0006xD-Cu for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7wDT-0007m7-GQ for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:00 -0400 Received: from smtp2200-217.mail.aliyun.com ([121.197.200.217]:37857) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDT-0007jC-4b; Wed, 11 Sep 2019 02:34:59 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.06282304|-1; CH=green; DM=CONTINUE|CONTINUE|true|0.20373-0.00815328-0.788117; FP=0|0|0|0|0|-1|-1|-1; HT=e02c03277; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; 
TI=SMTPD_---.FSRGAhQ_1568183690; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRGAhQ_1568183690) by smtp.aliyun-inc.com(10.147.42.198); Wed, 11 Sep 2019 14:34:50 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:25 +0800 Message-Id: <1568183141-67641-2-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 121.197.200.217 Subject: [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/cpu.h | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index 0adb307..c992b1d 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -93,9 +93,37 @@ typedef struct CPURISCVState CPURISCVState; #include "pmp.h" +#define VLEN 128 +#define VUNIT(x) (VLEN / x) + struct CPURISCVState { target_ulong gpr[32]; uint64_t fpr[32]; /* assume both F and D extensions */ + + /* vector coprocessor state. 
*/ + struct { + union VECTOR { + float64 f64[VUNIT(64)]; + float32 f32[VUNIT(32)]; + float16 f16[VUNIT(16)]; + uint64_t u64[VUNIT(64)]; + int64_t s64[VUNIT(64)]; + uint32_t u32[VUNIT(32)]; + int32_t s32[VUNIT(32)]; + uint16_t u16[VUNIT(16)]; + int16_t s16[VUNIT(16)]; + uint8_t u8[VUNIT(8)]; + int8_t s8[VUNIT(8)]; + } vreg[32]; + target_ulong vxrm; + target_ulong vxsat; + target_ulong vl; + target_ulong vstart; + target_ulong vtype; + float_status fp_status; + } vfp; + + bool foflag; target_ulong pc; target_ulong load_res; target_ulong load_val; From patchwork Wed Sep 11 06:25:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140357 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 344E01395 for ; Wed, 11 Sep 2019 06:36:45 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 15771207FC for ; Wed, 11 Sep 2019 06:36:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 15771207FC Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46848 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wFA-0000LD-8R for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:36:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38364) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wDU-0006xE-Ma for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 
4.71) (envelope-from ) id 1i7wDT-0007mE-Ix for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:00 -0400 Received: from smtp2200-217.mail.aliyun.com ([121.197.200.217]:35531) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDT-0007iz-6j; Wed, 11 Sep 2019 02:34:59 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.0409334|-1; CH=green; DM=CONTINUE|CONTINUE|true|0.275102-0.0298298-0.695068; FP=0|0|0|0|0|-1|-1|-1; HT=e02c03276; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.FSRC1sE_1568183690; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRC1sE_1568183690) by smtp.aliyun-inc.com(10.147.43.230); Wed, 11 Sep 2019 14:34:50 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:26 +0800 Message-Id: <1568183141-67641-3-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 121.197.200.217 Subject: [Qemu-devel] [PATCH v2 02/17] RISC-V: turn on vector extension from command line by cfg.ext_v Property X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/cpu.c | 6 +++++- target/riscv/cpu.h | 2 ++ 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index f8d07bd..9f93ce7 100644 --- 
a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -109,7 +109,7 @@ static void set_resetvec(CPURISCVState *env, int resetvec) static void riscv_any_cpu_init(Object *obj) { CPURISCVState *env = &RISCV_CPU(obj)->env; - set_misa(env, RVXLEN | RVI | RVM | RVA | RVF | RVD | RVC | RVU); + set_misa(env, RVXLEN | RVI | RVM | RVA | RVF | RVD | RVC | RVU | RVV); set_priv_version(env, PRIV_VERSION_1_11_0); set_resetvec(env, DEFAULT_RSTVEC); } @@ -406,6 +406,9 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp) if (cpu->cfg.ext_u) { target_misa |= RVU; } + if (cpu->cfg.ext_v) { + target_misa |= RVV; + } set_misa(env, RVXLEN | target_misa); } @@ -441,6 +444,7 @@ static Property riscv_cpu_properties[] = { DEFINE_PROP_BOOL("c", RISCVCPU, cfg.ext_c, true), DEFINE_PROP_BOOL("s", RISCVCPU, cfg.ext_s, true), DEFINE_PROP_BOOL("u", RISCVCPU, cfg.ext_u, true), + DEFINE_PROP_BOOL("v", RISCVCPU, cfg.ext_v, true), DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true), DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true), DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true), diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index c992b1d..2c7072a 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -67,6 +67,7 @@ #define RVC RV('C') #define RVS RV('S') #define RVU RV('U') +#define RVV RV('V') /* S extension denotes that Supervisor mode exists, however it is possible to have a core that support S mode but does not have an MMU and there @@ -250,6 +251,7 @@ typedef struct RISCVCPU { bool ext_c; bool ext_s; bool ext_u; + bool ext_v; bool ext_counters; bool ext_ifencei; bool ext_icsr; From patchwork Wed Sep 11 06:25:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140361 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6DFF71395 for ; 
Wed, 11 Sep 2019 06:39:30 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4DA01207FC for ; Wed, 11 Sep 2019 06:39:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4DA01207FC Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46874 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wHo-0003vs-Qj for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:39:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38349) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wDU-0006xB-Bd for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7wDS-0007lv-UA for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:00 -0400 Received: from smtp2200-217.mail.aliyun.com ([121.197.200.217]:46051) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDS-0007iy-IE; Wed, 11 Sep 2019 02:34:58 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.03712995|-1; CH=green; DM=CONTINUE|CONTINUE|true|0.6613-0.00789933-0.3308; FP=0|0|0|0|0|-1|-1|-1; HT=e02c03307; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.FSRJyb1_1568183691; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRJyb1_1568183691) by smtp.aliyun-inc.com(10.147.44.129); Wed, 11 Sep 2019 14:34:51 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 
2019 14:25:27 +0800 Message-Id: <1568183141-67641-4-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 121.197.200.217 Subject: [Qemu-devel] [PATCH v2 03/17] RISC-V: support vector extension csr X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei Reviewed-by: Chih-Min Chao --- target/riscv/cpu_bits.h | 15 ++++++++++++ target/riscv/csr.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 76 insertions(+), 4 deletions(-) diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h index 11f971a..9eb43ec 100644 --- a/target/riscv/cpu_bits.h +++ b/target/riscv/cpu_bits.h @@ -29,6 +29,14 @@ #define FSR_NXA (FPEXC_NX << FSR_AEXC_SHIFT) #define FSR_AEXC (FSR_NVA | FSR_OFA | FSR_UFA | FSR_DZA | FSR_NXA) +/* Vector Fixed-Point rounding mode */ +#define FSR_VXRM_SHIFT 9 +#define FSR_VXRM (0x3 << FSR_VXRM_SHIFT) + +/* Vector Fixed-Point saturation flag */ +#define FSR_VXSAT_SHIFT 8 +#define FSR_VXSAT (0x1 << FSR_VXSAT_SHIFT) + /* Control and Status Registers */ /* User Trap Setup */ @@ -48,6 +56,13 @@ #define CSR_FRM 0x002 #define CSR_FCSR 0x003 +/* User Vector CSRs */ +#define CSR_VSTART 0x008 +#define CSR_VXSAT 0x009 +#define CSR_VXRM 0x00a +#define CSR_VL 0xc20 +#define CSR_VTYPE 0xc21 + /* User Timers and Counters */ #define CSR_CYCLE 0xc00 #define CSR_TIME 0xc01 diff --git a/target/riscv/csr.c b/target/riscv/csr.c index e0d4586..a6131ff 100644 --- a/target/riscv/csr.c +++ 
b/target/riscv/csr.c @@ -87,12 +87,12 @@ static int ctr(CPURISCVState *env, int csrno) return 0; } -#if !defined(CONFIG_USER_ONLY) static int any(CPURISCVState *env, int csrno) { return 0; } +#if !defined(CONFIG_USER_ONLY) static int smode(CPURISCVState *env, int csrno) { return -!riscv_has_ext(env, RVS); @@ -158,8 +158,10 @@ static int read_fcsr(CPURISCVState *env, int csrno, target_ulong *val) return -1; } #endif - *val = (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT) - | (env->frm << FSR_RD_SHIFT); + *val = (env->vfp.vxrm << FSR_VXRM_SHIFT) + | (env->vfp.vxsat << FSR_VXSAT_SHIFT) + | (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT) + | (env->frm << FSR_RD_SHIFT); return 0; } @@ -172,10 +174,60 @@ static int write_fcsr(CPURISCVState *env, int csrno, target_ulong val) env->mstatus |= MSTATUS_FS; #endif env->frm = (val & FSR_RD) >> FSR_RD_SHIFT; + env->vfp.vxrm = (val & FSR_VXRM) >> FSR_VXRM_SHIFT; + env->vfp.vxsat = (val & FSR_VXSAT) >> FSR_VXSAT_SHIFT; riscv_cpu_set_fflags(env, (val & FSR_AEXC) >> FSR_AEXC_SHIFT); return 0; } +static int read_vtype(CPURISCVState *env, int csrno, target_ulong *val) +{ + *val = env->vfp.vtype; + return 0; +} + +static int read_vl(CPURISCVState *env, int csrno, target_ulong *val) +{ + *val = env->vfp.vl; + return 0; +} + +static int read_vxrm(CPURISCVState *env, int csrno, target_ulong *val) +{ + *val = env->vfp.vxrm; + return 0; +} + +static int read_vxsat(CPURISCVState *env, int csrno, target_ulong *val) +{ + *val = env->vfp.vxsat; + return 0; +} + +static int read_vstart(CPURISCVState *env, int csrno, target_ulong *val) +{ + *val = env->vfp.vstart; + return 0; +} + +static int write_vxrm(CPURISCVState *env, int csrno, target_ulong val) +{ + env->vfp.vxrm = val; + return 0; +} + +static int write_vxsat(CPURISCVState *env, int csrno, target_ulong val) +{ + env->vfp.vxsat = val; + return 0; +} + +static int write_vstart(CPURISCVState *env, int csrno, target_ulong val) +{ + env->vfp.vstart = val; + return 0; +} + /* User Timers and 
Counters */ static int read_instret(CPURISCVState *env, int csrno, target_ulong *val) { @@ -873,7 +925,12 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = { [CSR_FFLAGS] = { fs, read_fflags, write_fflags }, [CSR_FRM] = { fs, read_frm, write_frm }, [CSR_FCSR] = { fs, read_fcsr, write_fcsr }, - + /* Vector CSRs */ + [CSR_VSTART] = { any, read_vstart, write_vstart }, + [CSR_VXSAT] = { any, read_vxsat, write_vxsat }, + [CSR_VXRM] = { any, read_vxrm, write_vxrm }, + [CSR_VL] = { any, read_vl }, + [CSR_VTYPE] = { any, read_vtype }, /* User Timers and Counters */ [CSR_CYCLE] = { ctr, read_instret }, [CSR_INSTRET] = { ctr, read_instret }, From patchwork Wed Sep 11 06:25:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140365 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A88471395 for ; Wed, 11 Sep 2019 06:39:33 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 89631207FC for ; Wed, 11 Sep 2019 06:39:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 89631207FC Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46882 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wHs-0003zn-6X for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:39:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38385) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wDV-0006xG-Fn for qemu-devel@nongnu.org; 
Wed, 11 Sep 2019 02:35:04 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7wDT-0007mN-Mn for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:01 -0400 Received: from smtp2200-217.mail.aliyun.com ([121.197.200.217]:51600) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDT-0007j1-17; Wed, 11 Sep 2019 02:34:59 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.03713555|-1; CH=green; DM=CONTINUE|CONTINUE|true|0.373179-0.00432624-0.622495; FP=0|0|0|0|0|-1|-1|-1; HT=e01l07423; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.FSRDYI2_1568183691; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRDYI2_1568183691) by smtp.aliyun-inc.com(10.147.44.118); Wed, 11 Sep 2019 14:34:51 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:28 +0800 Message-Id: <1568183141-67641-5-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 121.197.200.217 Subject: [Qemu-devel] [PATCH v2 04/17] RISC-V: add vector extension configure instruction X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/Makefile.objs | 2 +- target/riscv/helper.h | 3 + target/riscv/insn32.decode | 5 ++ 
target/riscv/insn_trans/trans_rvv.inc.c | 46 ++++++++++++ target/riscv/translate.c | 1 + target/riscv/vector_helper.c | 126 ++++++++++++++++++++++++++++++++ 6 files changed, 182 insertions(+), 1 deletion(-) create mode 100644 target/riscv/insn_trans/trans_rvv.inc.c create mode 100644 target/riscv/vector_helper.c diff --git a/target/riscv/Makefile.objs b/target/riscv/Makefile.objs index b1c79bc..d577cef 100644 --- a/target/riscv/Makefile.objs +++ b/target/riscv/Makefile.objs @@ -1,4 +1,4 @@ -obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o gdbstub.o pmp.o +obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o vector_helper.o gdbstub.o pmp.o DECODETREE = $(SRC_PATH)/scripts/decodetree.py diff --git a/target/riscv/helper.h b/target/riscv/helper.h index debb22a..652f8c3 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -76,3 +76,6 @@ DEF_HELPER_2(mret, tl, env, tl) DEF_HELPER_1(wfi, void, env) DEF_HELPER_1(tlb_flush, void, env) #endif +/* Vector functions */ +DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 77f794e..5dc009c 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -62,6 +62,7 @@ @r_rm ....... ..... ..... ... ..... ....... %rs2 %rs1 %rm %rd @r2_rm ....... ..... ..... ... ..... ....... %rs1 %rm %rd @r2 ....... ..... ..... ... ..... ....... %rs1 %rd +@r2_zimm . zimm:11 ..... ... ..... ....... %rs1 %rd @sfence_vma ....... ..... ..... ... ..... ....... %rs2 %rs1 @sfence_vm ....... ..... ..... ... ..... ....... %rs1 @@ -203,3 +204,7 @@ fcvt_w_d 1100001 00000 ..... ... ..... 1010011 @r2_rm fcvt_wu_d 1100001 00001 ..... ... ..... 1010011 @r2_rm fcvt_d_w 1101001 00000 ..... ... ..... 1010011 @r2_rm fcvt_d_wu 1101001 00001 ..... ... ..... 1010011 @r2_rm + +# *** RV32V Extension *** +vsetvli 0 ........... ..... 111 ..... 
1010111 @r2_zimm +vsetvl 1000000 ..... ..... 111 ..... 1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c new file mode 100644 index 0000000..82e7ad6 --- /dev/null +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -0,0 +1,46 @@ +/* + * RISC-V translation routines for the RVV Standard Extension. + * + * Copyright (c) 2019 C-SKY Limited. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see . + */ + +#define GEN_VECTOR_R(INSN) \ +static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ +{ \ + TCGv_i32 s1 = tcg_const_i32(a->rs1); \ + TCGv_i32 s2 = tcg_const_i32(a->rs2); \ + TCGv_i32 d = tcg_const_i32(a->rd); \ + gen_helper_vector_##INSN(cpu_env, s1, s2, d); \ + tcg_temp_free_i32(s1); \ + tcg_temp_free_i32(s2); \ + tcg_temp_free_i32(d); \ + return true; \ +} + +#define GEN_VECTOR_R2_ZIMM(INSN) \ +static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ +{ \ + TCGv_i32 s1 = tcg_const_i32(a->rs1); \ + TCGv_i32 zimm = tcg_const_i32(a->zimm); \ + TCGv_i32 d = tcg_const_i32(a->rd); \ + gen_helper_vector_##INSN(cpu_env, s1, zimm, d); \ + tcg_temp_free_i32(s1); \ + tcg_temp_free_i32(zimm); \ + tcg_temp_free_i32(d); \ + return true; \ +} + +GEN_VECTOR_R2_ZIMM(vsetvli) +GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/translate.c b/target/riscv/translate.c index 8d6ab73..587c23e 100644 --- a/target/riscv/translate.c +++ b/target/riscv/translate.c @@ -706,6 +706,7 @@ static bool 
gen_shift(DisasContext *ctx, arg_r *a, #include "insn_trans/trans_rva.inc.c" #include "insn_trans/trans_rvf.inc.c" #include "insn_trans/trans_rvd.inc.c" +#include "insn_trans/trans_rvv.inc.c" #include "insn_trans/trans_privileged.inc.c" /* diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c new file mode 100644 index 0000000..b279e6f --- /dev/null +++ b/target/riscv/vector_helper.c @@ -0,0 +1,126 @@ +/* + * RISC-V Vector Extension Helpers for QEMU. + * + * Copyright (c) 2019 C-SKY Limited. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see . 
+ */ + +#include "qemu/osdep.h" +#include "cpu.h" +#include "exec/exec-all.h" +#include "exec/helper-proto.h" +#include <math.h> + +#define VECTOR_HELPER(name) HELPER(glue(vector_, name)) + +static inline void vector_vtype_set_ill(CPURISCVState *env) +{ + env->vfp.vtype = ((target_ulong)1) << (sizeof(target_ulong) * 8 - 1); + return; +} + +static inline int vector_vtype_get_sew(CPURISCVState *env) +{ + return (env->vfp.vtype >> 2) & 0x7; +} + +static inline int vector_get_width(CPURISCVState *env) +{ + return 8 * (1 << vector_vtype_get_sew(env)); +} + +static inline int vector_get_lmul(CPURISCVState *env) +{ + return 1 << (env->vfp.vtype & 0x3); +} + +static inline int vector_get_vlmax(CPURISCVState *env) +{ + return vector_get_lmul(env) * VLEN / vector_get_width(env); +} + +void VECTOR_HELPER(vsetvl)(CPURISCVState *env, uint32_t rs1, uint32_t rs2, + uint32_t rd) +{ + int sew, max_sew, vlmax, vl; + + if (rs2 == 0) { + vector_vtype_set_ill(env); + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + env->vfp.vtype = env->gpr[rs2]; + sew = vector_get_width(env) / 8; + max_sew = sizeof(target_ulong); + + if (env->misa & RVD) { + max_sew = max_sew > 8 ? max_sew : 8; + } else if (env->misa & RVF) { + max_sew = max_sew > 4 ? max_sew : 4; + } + if (sew > max_sew) { + vector_vtype_set_ill(env); + return; + } + + vlmax = vector_get_vlmax(env); + if (rs1 == 0) { + vl = vlmax; + } else if (env->gpr[rs1] <= vlmax) { + vl = env->gpr[rs1]; + } else if (env->gpr[rs1] < 2 * vlmax) { + vl = (env->gpr[rs1] + 1) / 2; /* ceil(AVL / 2) in integer arithmetic */ + } else { + vl = vlmax; + } + env->vfp.vl = vl; + env->gpr[rd] = vl; + env->vfp.vstart = 0; + return; +} + +void VECTOR_HELPER(vsetvli)(CPURISCVState *env, uint32_t rs1, uint32_t zimm, + uint32_t rd) +{ + int sew, max_sew, vlmax, vl; + + env->vfp.vtype = zimm; + sew = vector_get_width(env) / 8; + max_sew = sizeof(target_ulong); + + if (env->misa & RVD) { + max_sew = max_sew > 8 ? max_sew : 8; + } else if (env->misa & RVF) { + max_sew = max_sew > 4 ? 
max_sew : 4; + } + if (sew > max_sew) { + vector_vtype_set_ill(env); + return; + } + + vlmax = vector_get_vlmax(env); + if (rs1 == 0) { + vl = vlmax; + } else if (env->gpr[rs1] <= vlmax) { + vl = env->gpr[rs1]; + } else if (env->gpr[rs1] < 2 * vlmax) { + vl = (env->gpr[rs1] + 1) / 2; /* ceil(AVL / 2) in integer arithmetic */ + } else { + vl = vlmax; + } + env->vfp.vl = vl; + env->gpr[rd] = vl; + env->vfp.vstart = 0; + return; +} From patchwork Wed Sep 11 06:25:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140369 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 42C931395 for ; Wed, 11 Sep 2019 06:40:09 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E4C58207FC for ; Wed, 11 Sep 2019 06:40:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E4C58207FC Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46888 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wIR-0004ll-Hj for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:40:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38497) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wDf-00078t-1E for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7wDU-0007nI-Ng for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:10 -0400 Received: from smtp2200-217.mail.aliyun.com 
([121.197.200.217]:56934) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDT-0007k0-2z; Wed, 11 Sep 2019 02:35:00 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.03883426|-1; CH=green; DM=CONTINUE|CONTINUE|true|0.484801-0.00607782-0.509121; FP=0|0|0|0|0|-1|-1|-1; HT=e01a16368; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.FSRMC1k_1568183691; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRMC1k_1568183691) by smtp.aliyun-inc.com(10.147.41.199); Wed, 11 Sep 2019 14:34:52 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:29 +0800 Message-Id: <1568183141-67641-6-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 121.197.200.217 Subject: [Qemu-devel] [PATCH v2 05/17] RISC-V: add vector extension load and store instructions X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 37 + target/riscv/insn32.decode | 46 + target/riscv/insn_trans/trans_rvv.inc.c | 70 + target/riscv/vector_helper.c | 2638 +++++++++++++++++++++++++++++++ 4 files changed, 2791 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 652f8c3..f77c392 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h 
@@ -77,5 +77,42 @@ DEF_HELPER_1(wfi, void, env) DEF_HELPER_1(tlb_flush, void, env) #endif /* Vector functions */ +DEF_HELPER_5(vector_vlb_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vlh_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vlw_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vle_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vlbu_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vlhu_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vlwu_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsb_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsh_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsw_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vse_v, void, env, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlsb_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlsh_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlsw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlse_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlsbu_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlshu_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlswu_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vssb_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vssh_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vssw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vsse_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlxb_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlxh_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlxw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlxe_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlxbu_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlxhu_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vlxwu_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vsxb_v, void, env, i32, i32, i32, 
i32, i32) +DEF_HELPER_6(vector_vsxh_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vsxw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vsxe_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vsuxb_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vsuxh_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vsuxw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vsuxe_v, void, env, i32, i32, i32, i32, i32) DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 5dc009c..b8a3d8a 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -25,6 +25,7 @@ %sh10 20:10 %csr 20:12 %rm 12:3 +%nf 29:3 # immediates: %imm_i 20:s12 @@ -62,6 +63,8 @@ @r_rm ....... ..... ..... ... ..... ....... %rs2 %rs1 %rm %rd @r2_rm ....... ..... ..... ... ..... ....... %rs1 %rm %rd @r2 ....... ..... ..... ... ..... ....... %rs1 %rd +@r_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd +@r2_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs1 %rd @r2_zimm . zimm:11 ..... ... ..... ....... %rs1 %rd @sfence_vma ....... ..... ..... ... ..... ....... %rs2 %rs1 @@ -206,5 +209,48 @@ fcvt_d_w 1101001 00000 ..... ... ..... 1010011 @r2_rm fcvt_d_wu 1101001 00001 ..... ... ..... 1010011 @r2_rm # *** RV32V Extension *** + +# *** Vector loads and stores are encoded within LOADFP/STORE-FP *** +vlb_v ... 100 . 00000 ..... 000 ..... 0000111 @r2_nfvm +vlh_v ... 100 . 00000 ..... 101 ..... 0000111 @r2_nfvm +vlw_v ... 100 . 00000 ..... 110 ..... 0000111 @r2_nfvm +vle_v ... 000 . 00000 ..... 111 ..... 0000111 @r2_nfvm +vlbu_v ... 000 . 00000 ..... 000 ..... 0000111 @r2_nfvm +vlhu_v ... 000 . 00000 ..... 101 ..... 0000111 @r2_nfvm +vlwu_v ... 000 . 00000 ..... 110 ..... 0000111 @r2_nfvm +vsb_v ... 000 . 00000 ..... 000 ..... 0100111 @r2_nfvm +vsh_v ... 000 . 00000 ..... 101 ..... 
0100111 @r2_nfvm +vsw_v ... 000 . 00000 ..... 110 ..... 0100111 @r2_nfvm +vse_v ... 000 . 00000 ..... 111 ..... 0100111 @r2_nfvm + +vlsb_v ... 110 . ..... ..... 000 ..... 0000111 @r_nfvm +vlsh_v ... 110 . ..... ..... 101 ..... 0000111 @r_nfvm +vlsw_v ... 110 . ..... ..... 110 ..... 0000111 @r_nfvm +vlse_v ... 010 . ..... ..... 111 ..... 0000111 @r_nfvm +vlsbu_v ... 010 . ..... ..... 000 ..... 0000111 @r_nfvm +vlshu_v ... 010 . ..... ..... 101 ..... 0000111 @r_nfvm +vlswu_v ... 010 . ..... ..... 110 ..... 0000111 @r_nfvm +vssb_v ... 010 . ..... ..... 000 ..... 0100111 @r_nfvm +vssh_v ... 010 . ..... ..... 101 ..... 0100111 @r_nfvm +vssw_v ... 010 . ..... ..... 110 ..... 0100111 @r_nfvm +vsse_v ... 010 . ..... ..... 111 ..... 0100111 @r_nfvm + +vlxb_v ... 111 . ..... ..... 000 ..... 0000111 @r_nfvm +vlxh_v ... 111 . ..... ..... 101 ..... 0000111 @r_nfvm +vlxw_v ... 111 . ..... ..... 110 ..... 0000111 @r_nfvm +vlxe_v ... 011 . ..... ..... 111 ..... 0000111 @r_nfvm +vlxbu_v ... 011 . ..... ..... 000 ..... 0000111 @r_nfvm +vlxhu_v ... 011 . ..... ..... 101 ..... 0000111 @r_nfvm +vlxwu_v ... 011 . ..... ..... 110 ..... 0000111 @r_nfvm +vsxb_v ... 011 . ..... ..... 000 ..... 0100111 @r_nfvm +vsxh_v ... 011 . ..... ..... 101 ..... 0100111 @r_nfvm +vsxw_v ... 011 . ..... ..... 110 ..... 0100111 @r_nfvm +vsxe_v ... 011 . ..... ..... 111 ..... 0100111 @r_nfvm +vsuxb_v ... 111 . ..... ..... 000 ..... 0100111 @r_nfvm +vsuxh_v ... 111 . ..... ..... 101 ..... 0100111 @r_nfvm +vsuxw_v ... 111 . ..... ..... 110 ..... 0100111 @r_nfvm +vsuxe_v ... 111 . ..... ..... 111 ..... 0100111 @r_nfvm + +#*** new major opcode OP-V *** vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 111 ..... 1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index 82e7ad6..16b1f90 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -16,6 +16,37 @@ * this program. 
If not, see . */ +#define GEN_VECTOR_R2_NFVM(INSN) \ +static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ +{ \ + TCGv_i32 s1 = tcg_const_i32(a->rs1); \ + TCGv_i32 d = tcg_const_i32(a->rd); \ + TCGv_i32 nf = tcg_const_i32(a->nf); \ + TCGv_i32 vm = tcg_const_i32(a->vm); \ + gen_helper_vector_##INSN(cpu_env, nf, vm, s1, d); \ + tcg_temp_free_i32(s1); \ + tcg_temp_free_i32(d); \ + tcg_temp_free_i32(nf); \ + tcg_temp_free_i32(vm); \ + return true; \ +} +#define GEN_VECTOR_R_NFVM(INSN) \ +static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ +{ \ + TCGv_i32 s1 = tcg_const_i32(a->rs1); \ + TCGv_i32 s2 = tcg_const_i32(a->rs2); \ + TCGv_i32 d = tcg_const_i32(a->rd); \ + TCGv_i32 nf = tcg_const_i32(a->nf); \ + TCGv_i32 vm = tcg_const_i32(a->vm); \ + gen_helper_vector_##INSN(cpu_env, nf, vm, s1, s2, d);\ + tcg_temp_free_i32(s1); \ + tcg_temp_free_i32(s2); \ + tcg_temp_free_i32(d); \ + tcg_temp_free_i32(nf); \ + tcg_temp_free_i32(vm); \ + return true; \ +} + #define GEN_VECTOR_R(INSN) \ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ { \ @@ -42,5 +73,44 @@ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ return true; \ } +GEN_VECTOR_R2_NFVM(vlb_v) +GEN_VECTOR_R2_NFVM(vlh_v) +GEN_VECTOR_R2_NFVM(vlw_v) +GEN_VECTOR_R2_NFVM(vle_v) +GEN_VECTOR_R2_NFVM(vlbu_v) +GEN_VECTOR_R2_NFVM(vlhu_v) +GEN_VECTOR_R2_NFVM(vlwu_v) +GEN_VECTOR_R2_NFVM(vsb_v) +GEN_VECTOR_R2_NFVM(vsh_v) +GEN_VECTOR_R2_NFVM(vsw_v) +GEN_VECTOR_R2_NFVM(vse_v) + +GEN_VECTOR_R_NFVM(vlsb_v) +GEN_VECTOR_R_NFVM(vlsh_v) +GEN_VECTOR_R_NFVM(vlsw_v) +GEN_VECTOR_R_NFVM(vlse_v) +GEN_VECTOR_R_NFVM(vlsbu_v) +GEN_VECTOR_R_NFVM(vlshu_v) +GEN_VECTOR_R_NFVM(vlswu_v) +GEN_VECTOR_R_NFVM(vssb_v) +GEN_VECTOR_R_NFVM(vssh_v) +GEN_VECTOR_R_NFVM(vssw_v) +GEN_VECTOR_R_NFVM(vsse_v) +GEN_VECTOR_R_NFVM(vlxb_v) +GEN_VECTOR_R_NFVM(vlxh_v) +GEN_VECTOR_R_NFVM(vlxw_v) +GEN_VECTOR_R_NFVM(vlxe_v) +GEN_VECTOR_R_NFVM(vlxbu_v) +GEN_VECTOR_R_NFVM(vlxhu_v) +GEN_VECTOR_R_NFVM(vlxwu_v) +GEN_VECTOR_R_NFVM(vsxb_v) 
+GEN_VECTOR_R_NFVM(vsxh_v) +GEN_VECTOR_R_NFVM(vsxw_v) +GEN_VECTOR_R_NFVM(vsxe_v) +GEN_VECTOR_R_NFVM(vsuxb_v) +GEN_VECTOR_R_NFVM(vsuxh_v) +GEN_VECTOR_R_NFVM(vsuxw_v) +GEN_VECTOR_R_NFVM(vsuxe_v) + GEN_VECTOR_R2_ZIMM(vsetvli) GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index b279e6f..62e4d2e 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -20,10 +20,60 @@ #include "cpu.h" #include "exec/exec-all.h" #include "exec/helper-proto.h" +#include "exec/cpu_ldst.h" #include #define VECTOR_HELPER(name) HELPER(glue(vector_, name)) +static int64_t sign_extend(int64_t a, int8_t width) +{ + return a << (64 - width) >> (64 - width); +} + +static target_ulong vector_get_index(CPURISCVState *env, int rs1, int rs2, + int index, int mem, int width, int nf) +{ + target_ulong abs_off, base = env->gpr[rs1]; + target_long offset; + switch (width) { + case 8: + offset = sign_extend(env->vfp.vreg[rs2].s8[index], 8) + nf * mem; + break; + case 16: + offset = sign_extend(env->vfp.vreg[rs2].s16[index], 16) + nf * mem; + break; + case 32: + offset = sign_extend(env->vfp.vreg[rs2].s32[index], 32) + nf * mem; + break; + case 64: + offset = env->vfp.vreg[rs2].s64[index] + nf * mem; + break; + default: + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + return 0; + } + if (offset < 0) { + abs_off = ~offset + 1; + if (base >= abs_off) { + return base - abs_off; + } + } else { + if ((target_ulong)((target_ulong)offset + base) >= base) { + return (target_ulong)offset + base; + } + } + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + return 0; +} + +static inline bool vector_vtype_ill(CPURISCVState *env) +{ + if ((env->vfp.vtype >> (sizeof(target_ulong) - 1)) & 0x1) { + return true; + } + return false; +} + static inline void vector_vtype_set_ill(CPURISCVState *env) { env->vfp.vtype = ((target_ulong)1) << (sizeof(target_ulong) - 1); @@ -50,6 +100,76 @@ static inline int vector_get_vlmax(CPURISCVState *env) 
return vector_get_lmul(env) * VLEN / vector_get_width(env); } +static inline int vector_elem_mask(CPURISCVState *env, uint32_t vm, int width, + int lmul, int index) +{ + int mlen = width / lmul; + int idx = (index * mlen) / 8; + int pos = (index * mlen) % 8; + + return vm || ((env->vfp.vreg[0].u8[idx] >> pos) & 0x1); +} + +static inline bool vector_overlap_vm_common(int lmul, int vm, int rd) +{ + if (lmul > 1 && vm == 0 && rd == 0) { + return true; + } + return false; +} + +static bool vector_lmul_check_reg(CPURISCVState *env, uint32_t lmul, + uint32_t reg, bool widen) +{ + int legal = widen ? (lmul * 2) : lmul; + + if ((lmul != 1 && lmul != 2 && lmul != 4 && lmul != 8) || + (lmul == 8 && widen)) { + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + return false; + } + + if (reg % legal != 0) { + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + return false; + } + return true; +} + +static void vector_tail_segment(CPURISCVState *env, int vreg, int index, + int width, int nf, int lmul) +{ + switch (width) { + case 8: + while (nf >= 0) { + env->vfp.vreg[vreg + nf * lmul].u8[index] = 0; + nf--; + } + break; + case 16: + while (nf >= 0) { + env->vfp.vreg[vreg + nf * lmul].u16[index] = 0; + nf--; + } + break; + case 32: + while (nf >= 0) { + env->vfp.vreg[vreg + nf * lmul].u32[index] = 0; + nf--; + } + break; + case 64: + while (nf >= 0) { + env->vfp.vreg[vreg + nf * lmul].u64[index] = 0; + nf--; + } + break; + default: + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + return; + } +} + void VECTOR_HELPER(vsetvl)(CPURISCVState *env, uint32_t rs1, uint32_t rs2, uint32_t rd) { @@ -124,3 +244,2521 @@ void VECTOR_HELPER(vsetvli)(CPURISCVState *env, uint32_t rs1, uint32_t zimm, env->vfp.vstart = 0; return; } + +void VECTOR_HELPER(vlbu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + 
vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].u8[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].u16[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || 
vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].s8[j] = + cpu_ldsb_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].s16[j] = sign_extend( + cpu_ldsb_data(env, env->gpr[rs1] + read), 8); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend( + cpu_ldsb_data(env, env->gpr[rs1] + read), 8); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend( + cpu_ldsb_data(env, env->gpr[rs1] + read), 8); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlsbu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || 
vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k; + env->vfp.vreg[dest + k * lmul].u8[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k; + env->vfp.vreg[dest + k * lmul].u16[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k; + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k; + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlsb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + 
riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k; + env->vfp.vreg[dest + k * lmul].s8[j] = + cpu_ldsb_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k; + env->vfp.vreg[dest + k * lmul].s16[j] = sign_extend( + cpu_ldsb_data(env, env->gpr[rs1] + read), 8); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k; + env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend( + cpu_ldsb_data(env, env->gpr[rs1] + read), 8); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k; + env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend( + cpu_ldsb_data(env, env->gpr[rs1] + read), 8); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlxbu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, src2; + target_ulong addr; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || 
vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + env->vfp.vreg[dest + k * lmul].u8[j] = + cpu_ldub_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + env->vfp.vreg[dest + k * lmul].u16[j] = + cpu_ldub_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_ldub_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_ldub_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlxb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, src2; + target_ulong addr; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = 
vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + env->vfp.vreg[dest + k * lmul].s8[j] = + cpu_ldsb_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + env->vfp.vreg[dest + k * lmul].s16[j] = sign_extend( + cpu_ldsb_data(env, addr), 8); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend( + cpu_ldsb_data(env, addr), 8); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend( + cpu_ldsb_data(env, addr), 8); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlhu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + 
lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].u16[j] = + cpu_lduw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_lduw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_lduw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + 
riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].s16[j] = + cpu_ldsw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend( + cpu_ldsw_data(env, env->gpr[rs1] + read), 16); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend( + cpu_ldsw_data(env, env->gpr[rs1] + read), 16); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlshu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < 
env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 2; + env->vfp.vreg[dest + k * lmul].u16[j] = + cpu_lduw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 2; + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_lduw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 2; + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_lduw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlsh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 2; + env->vfp.vreg[dest + k * lmul].s16[j] = + cpu_ldsw_data(env, env->gpr[rs1] + read); 
+ k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 2; + env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend( + cpu_ldsw_data(env, env->gpr[rs1] + read), 16); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 2; + env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend( + cpu_ldsw_data(env, env->gpr[rs1] + read), 16); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlxhu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, src2; + target_ulong addr; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 2, width, k); + env->vfp.vreg[dest + k * lmul].u16[j] = + cpu_lduw_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 2, 
width, k); + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_lduw_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 2, width, k); + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_lduw_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlxh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, src2; + target_ulong addr; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 2, width, k); + env->vfp.vreg[dest + k * lmul].s16[j] = + cpu_ldsw_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 2, width, k); + env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend( + cpu_ldsw_data(env, addr), 16); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, 
lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 2, width, k); + env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend( + cpu_ldsw_data(env, addr), 16); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 4; + env->vfp.vreg[dest + k * lmul].s32[j] = + cpu_ldl_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 4; + env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend( + cpu_ldl_data(env, env->gpr[rs1] + read), 32); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlwu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, 
k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 4; + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_ldl_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 4; + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_ldl_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlswu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / 
width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 4; + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_ldl_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 4; + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_ldl_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlsw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 4; + env->vfp.vreg[dest + k * lmul].s32[j] = + cpu_ldl_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 4; + env->vfp.vreg[dest + k * 
lmul].s64[j] = sign_extend( + cpu_ldl_data(env, env->gpr[rs1] + read), 32); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlxwu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, src2; + target_ulong addr; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 4, width, k); + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_ldl_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 4, width, k); + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_ldl_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlxw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, 
vlmax, lmul, width, dest, src2; + target_ulong addr; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 4, width, k); + env->vfp.vreg[dest + k * lmul].s32[j] = + cpu_ldl_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 4, width, k); + env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend( + cpu_ldl_data(env, addr), 32); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vle_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, 
rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].u8[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].u16[j] = + cpu_lduw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 4; + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_ldl_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 8; + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_ldq_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlse_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + 
dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k; + env->vfp.vreg[dest + k * lmul].u8[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 2; + env->vfp.vreg[dest + k * lmul].u16[j] = + cpu_lduw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 4; + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_ldl_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * env->gpr[rs2] + k * 8; + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_ldq_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlxe_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, src2; + target_ulong addr; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / 
(VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + env->vfp.vreg[dest + k * lmul].u8[j] = + cpu_ldub_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 2, width, k); + env->vfp.vreg[dest + k * lmul].u16[j] = + cpu_lduw_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 4, width, k); + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_ldl_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 8, width, k); + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_ldq_data(env, addr); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, wrote; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + 
dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * (nf + 1) + k; + cpu_stb_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s8[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * (nf + 1) + k; + cpu_stb_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s16[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * (nf + 1) + k; + cpu_stb_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s32[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * (nf + 1) + k; + cpu_stb_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s64[j]); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vssb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, wrote; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < 
vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k; + cpu_stb_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s8[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k; + cpu_stb_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s16[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k; + cpu_stb_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s32[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k; + cpu_stb_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s64[j]); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsxb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, src2; + target_ulong addr; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if 
(vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + cpu_stb_data(env, addr, + env->vfp.vreg[dest + k * lmul].s8[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + cpu_stb_data(env, addr, + env->vfp.vreg[dest + k * lmul].s16[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + cpu_stb_data(env, addr, + env->vfp.vreg[dest + k * lmul].s32[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + cpu_stb_data(env, addr, + env->vfp.vreg[dest + k * lmul].s64[j]); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsuxb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + VECTOR_HELPER(vsxb_v)(env, nf, vm, rs1, rs2, rd); + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, wrote; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN
/ width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = (i * (nf + 1) + k) * 2; + cpu_stw_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s16[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = (i * (nf + 1) + k) * 2; + cpu_stw_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s32[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = (i * (nf + 1) + k) * 2; + cpu_stw_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s64[j]); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vssh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, wrote; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k * 2; + cpu_stw_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s16[j]); + k--; + } + env->vfp.vstart++; + } + 
break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k * 2; + cpu_stw_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s32[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k * 2; + cpu_stw_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s64[j]); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsxh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, src2; + target_ulong addr; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 2, width, k); + cpu_stw_data(env, addr, + env->vfp.vreg[dest + k * lmul].s16[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 2, width, k); + cpu_stw_data(env, addr, + env->vfp.vreg[dest + k * lmul].s32[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if 
(vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 2, width, k); + cpu_stw_data(env, addr, + env->vfp.vreg[dest + k * lmul].s64[j]); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsuxh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + VECTOR_HELPER(vsxh_v)(env, nf, vm, rs1, rs2, rd); + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, wrote; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = (i * (nf + 1) + k) * 4; + cpu_stl_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s32[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = (i * (nf + 1) + k) * 4; + cpu_stl_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s64[j]); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void
VECTOR_HELPER(vssw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, wrote; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k * 4; + cpu_stl_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s32[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k * 4; + cpu_stl_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s64[j]); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsxw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, src2; + target_ulong addr; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + 
vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 4, width, k); + cpu_stl_data(env, addr, + env->vfp.vreg[dest + k * lmul].s32[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 4, width, k); + cpu_stl_data(env, addr, + env->vfp.vreg[dest + k * lmul].s64[j]); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsuxw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + VECTOR_HELPER(vsxw_v)(env, nf, vm, rs1, rs2, rd); + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vse_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, wrote; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * (nf + 1) + k; +
cpu_stb_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s8[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = (i * (nf + 1) + k) * 2; + cpu_stw_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s16[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = (i * (nf + 1) + k) * 4; + cpu_stl_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s32[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = (i * (nf + 1) + k) * 8; + cpu_stq_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s64[j]); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsse_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, wrote; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k; + cpu_stb_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s8[j]); + k--; + } + 
env->vfp.vstart++; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k * 2; + cpu_stw_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s16[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k * 4; + cpu_stl_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s32[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + wrote = i * env->gpr[rs2] + k * 8; + cpu_stq_data(env, env->gpr[rs1] + wrote, + env->vfp.vreg[dest + k * lmul].s64[j]); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsxe_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, src2; + target_ulong addr; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 1, width, k); + cpu_stb_data(env, addr, + env->vfp.vreg[dest + k * lmul].s8[j]); + k--; + } + env->vfp.vstart++; + } + break; 
+ case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 2, width, k); + cpu_stw_data(env, addr, + env->vfp.vreg[dest + k * lmul].s16[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 4, width, k); + cpu_stl_data(env, addr, + env->vfp.vreg[dest + k * lmul].s32[j]); + k--; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + addr = vector_get_index(env, rs1, src2, j, 8, width, k); + cpu_stq_data(env, addr, + env->vfp.vreg[dest + k * lmul].s64[j]); + k--; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsuxe_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + return VECTOR_HELPER(vsxe_v)(env, nf, vm, rs1, rs2, rd); + env->vfp.vstart = 0; +} + From patchwork Wed Sep 11 06:25:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140389 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 144A414ED for ; Wed, 11 Sep 2019 06:43:17 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D868D21D7B for ; Wed, 11 Sep 2019 06:43:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D868D21D7B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass 
smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46918 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wLT-0007w4-H7 for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:43:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38407) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wDW-0006xh-VH for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:07 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7wDT-0007mZ-Rc for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:02 -0400 Received: from smtp2200-217.mail.aliyun.com ([121.197.200.217]:43584) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDT-0007k2-5P; Wed, 11 Sep 2019 02:34:59 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.03883426|-1; CH=green; DM=CONTINUE|CONTINUE|true|0.480035-0.00505187-0.514913; FP=0|0|0|0|0|-1|-1|-1; HT=e01a16368; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.FSRUTgl_1568183693; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRUTgl_1568183693) by smtp.aliyun-inc.com(10.147.42.22); Wed, 11 Sep 2019 14:34:53 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:30 +0800 Message-Id: <1568183141-67641-7-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 121.197.200.217 Subject: [Qemu-devel] [PATCH v2 06/17] RISC-V: add vector extension fault-only-first implementation X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- linux-user/riscv/cpu_loop.c | 7 + target/riscv/cpu_helper.c | 7 + target/riscv/helper.h | 7 + target/riscv/insn32.decode | 7 + target/riscv/insn_trans/trans_rvv.inc.c | 7 + target/riscv/vector_helper.c | 567 ++++++++++++++++++++++++++++++++ 6 files changed, 602 insertions(+) diff --git a/linux-user/riscv/cpu_loop.c b/linux-user/riscv/cpu_loop.c index 12aa3c0..d673fa5 100644 --- a/linux-user/riscv/cpu_loop.c +++ b/linux-user/riscv/cpu_loop.c @@ -41,6 +41,13 @@ void cpu_loop(CPURISCVState *env) sigcode = 0; sigaddr = 0; + if (env->foflag) { + if (env->vfp.vl != 0) { + env->foflag = false; + env->pc += 4; + continue; + } + } switch (trapnr) { case EXCP_INTERRUPT: /* just indicate that signals should be handled asap */ diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c index e32b612..405caf6 100644 --- a/target/riscv/cpu_helper.c +++ b/target/riscv/cpu_helper.c @@ -521,6 +521,13 @@ void riscv_cpu_do_interrupt(CPUState *cs) [PRV_H] = RISCV_EXCP_H_ECALL, [PRV_M] = RISCV_EXCP_M_ECALL }; + if (env->foflag) { + if (env->vfp.vl != 0) { + env->foflag = false; + env->pc += 4; + return; + } + } if (!async) { /* set tval to badaddr for traps with address information */ diff --git a/target/riscv/helper.h b/target/riscv/helper.h index f77c392..973342f 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -84,6 +84,13 @@ DEF_HELPER_5(vector_vle_v, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vlbu_v, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vlhu_v, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vlwu_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vlbff_v, void, env, i32, 
i32, i32, i32) +DEF_HELPER_5(vector_vlhff_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vlwff_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vleff_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vlbuff_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vlhuff_v, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vlwuff_v, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vsb_v, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vsh_v, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vsw_v, void, env, i32, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index b8a3d8a..b286997 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -218,6 +218,13 @@ vle_v ... 000 . 00000 ..... 111 ..... 0000111 @r2_nfvm vlbu_v ... 000 . 00000 ..... 000 ..... 0000111 @r2_nfvm vlhu_v ... 000 . 00000 ..... 101 ..... 0000111 @r2_nfvm vlwu_v ... 000 . 00000 ..... 110 ..... 0000111 @r2_nfvm +vlbff_v ... 100 . 10000 ..... 000 ..... 0000111 @r2_nfvm +vlhff_v ... 100 . 10000 ..... 101 ..... 0000111 @r2_nfvm +vlwff_v ... 100 . 10000 ..... 110 ..... 0000111 @r2_nfvm +vleff_v ... 000 . 10000 ..... 111 ..... 0000111 @r2_nfvm +vlbuff_v ... 000 . 10000 ..... 000 ..... 0000111 @r2_nfvm +vlhuff_v ... 000 . 10000 ..... 101 ..... 0000111 @r2_nfvm +vlwuff_v ... 000 . 10000 ..... 110 ..... 0000111 @r2_nfvm vsb_v ... 000 . 00000 ..... 000 ..... 0100111 @r2_nfvm vsh_v ... 000 . 00000 ..... 101 ..... 0100111 @r2_nfvm vsw_v ... 000 . 00000 ..... 110 ..... 
0100111 @r2_nfvm diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index 16b1f90..bd83885 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -80,6 +80,13 @@ GEN_VECTOR_R2_NFVM(vle_v) GEN_VECTOR_R2_NFVM(vlbu_v) GEN_VECTOR_R2_NFVM(vlhu_v) GEN_VECTOR_R2_NFVM(vlwu_v) +GEN_VECTOR_R2_NFVM(vlbff_v) +GEN_VECTOR_R2_NFVM(vlhff_v) +GEN_VECTOR_R2_NFVM(vlwff_v) +GEN_VECTOR_R2_NFVM(vleff_v) +GEN_VECTOR_R2_NFVM(vlbuff_v) +GEN_VECTOR_R2_NFVM(vlhuff_v) +GEN_VECTOR_R2_NFVM(vlwuff_v) GEN_VECTOR_R2_NFVM(vsb_v) GEN_VECTOR_R2_NFVM(vsh_v) GEN_VECTOR_R2_NFVM(vsw_v) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 62e4d2e..0ac8c74 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -2762,3 +2762,570 @@ void VECTOR_HELPER(vsuxe_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, env->vfp.vstart = 0; } +void VECTOR_HELPER(vlbuff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + + env->foflag = true; + env->vfp.vl = 0; + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].u8[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + 
env->vfp.vl++; + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].u16[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->foflag = false; + env->vfp.vl = vl; + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlbff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + env->foflag = true; + env->vfp.vl = 0; + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].s8[j] = + 
cpu_ldsb_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].s16[j] = sign_extend( + cpu_ldsb_data(env, env->gpr[rs1] + read), 8); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend( + cpu_ldsb_data(env, env->gpr[rs1] + read), 8); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend( + cpu_ldsb_data(env, env->gpr[rs1] + read), 8); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->foflag = false; + env->vfp.vl = vl; + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlhuff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rd, false); + env->foflag = true; + env->vfp.vl = 0; + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if 
(vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].u16[j] = + cpu_lduw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_lduw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_lduw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->foflag = false; + env->vfp.vl = vl; + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlhff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rd, false); + env->foflag = true; + env->vfp.vl = 0; + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].s16[j] = + cpu_ldsw_data(env, 
env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend( + cpu_ldsw_data(env, env->gpr[rs1] + read), 16); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend( + cpu_ldsw_data(env, env->gpr[rs1] + read), 16); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->vfp.vl = vl; + env->foflag = false; + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlwuff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rd, false); + env->foflag = true; + env->vfp.vl = 0; + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 4; + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_ldl_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 64: + if (vector_elem_mask(env, vm, 
width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 4; + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_ldl_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->foflag = false; + env->vfp.vl = vl; + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vlwff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rd, false); + env->foflag = true; + env->vfp.vl = 0; + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 4; + env->vfp.vreg[dest + k * lmul].s32[j] = + cpu_ldl_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 4; + env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend( + cpu_ldl_data(env, env->gpr[rs1] + read), 32); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->foflag = false; + 
env->vfp.vl = vl; + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vleff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, + uint32_t rs1, uint32_t rd) +{ + int i, j, k, vl, vlmax, lmul, width, dest, read; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (lmul * (nf + 1) > 32) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rd, false); + env->vfp.vl = 0; + env->foflag = true; + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = nf; + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = i * (nf + 1) + k; + env->vfp.vreg[dest + k * lmul].u8[j] = + cpu_ldub_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 2; + env->vfp.vreg[dest + k * lmul].u16[j] = + cpu_lduw_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 4; + env->vfp.vreg[dest + k * lmul].u32[j] = + cpu_ldl_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + while (k >= 0) { + read = (i * (nf + 1) + k) * 8; + env->vfp.vreg[dest + k * lmul].u64[j] = + cpu_ldq_data(env, env->gpr[rs1] + read); + k--; + } + env->vfp.vstart++; + } + env->vfp.vl++; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { 
+ vector_tail_segment(env, dest, j, width, k, lmul); + } + } + env->foflag = false; + env->vfp.vl = vl; + env->vfp.vstart = 0; +} From patchwork Wed Sep 11 06:25:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140359 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1AA8614ED for ; Wed, 11 Sep 2019 06:37:06 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CA85A207FC for ; Wed, 11 Sep 2019 06:37:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CA85A207FC Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46850 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wFU-0000rp-Eb for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:37:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38472) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wDb-000742-Hs for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7wDU-0007mr-9u for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:07 -0400 Received: from smtp2200-217.mail.aliyun.com ([121.197.200.217]:34645) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDT-0007jy-1H; Wed, 11 Sep 2019 02:35:00 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.03883426|-1; CH=green; 
DM=CONTINUE|CONTINUE|true|0.353627-0.00801203-0.638361; FP=0|0|0|0|0|-1|-1|-1; HT=e01l07391; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.FSRFPyr_1568183693; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRFPyr_1568183693) by smtp.aliyun-inc.com(10.147.40.233); Wed, 11 Sep 2019 14:34:54 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:31 +0800 Message-Id: <1568183141-67641-8-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 121.197.200.217 Subject: [Qemu-devel] [PATCH v2 07/17] RISC-V: add vector extension atomic instructions X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 18 + target/riscv/insn32.decode | 21 + target/riscv/insn_trans/trans_rvv.inc.c | 36 + target/riscv/vector_helper.c | 1467 +++++++++++++++++++++++++++++++ 4 files changed, 1542 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 973342f..c107925 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -121,5 +121,23 @@ DEF_HELPER_6(vector_vsuxb_v, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(vector_vsuxh_v, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(vector_vsuxw_v, void, env, i32, i32, i32, i32, i32) 
DEF_HELPER_6(vector_vsuxe_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamoswapw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamoswapd_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamoaddw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamoaddd_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamoxorw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamoxord_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamoandw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamoandd_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamoorw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamoord_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamominw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamomind_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamomaxw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamomaxd_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamominuw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamominud_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamomaxuw_v, void, env, i32, i32, i32, i32, i32) +DEF_HELPER_6(vector_vamomaxud_v, void, env, i32, i32, i32, i32, i32) DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index b286997..48e7661 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -63,6 +63,7 @@ @r_rm ....... ..... ..... ... ..... ....... %rs2 %rs1 %rm %rd @r2_rm ....... ..... ..... ... ..... ....... %rs1 %rm %rd @r2 ....... ..... ..... ... ..... ....... %rs1 %rd +@r_wdvm ..... wd:1 vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd @r_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd @r2_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs1 %rd @r2_zimm . zimm:11 ..... ... 
..... ....... %rs1 %rd @@ -258,6 +259,26 @@ vsuxh_v ... 111 . ..... ..... 101 ..... 0100111 @r_nfvm vsuxw_v ... 111 . ..... ..... 110 ..... 0100111 @r_nfvm vsuxe_v ... 111 . ..... ..... 111 ..... 0100111 @r_nfvm +#*** Vector AMO operations are encoded under the standard AMO major opcode.*** +vamoswapw_v 00001 . . ..... ..... 110 ..... 0101111 @r_wdvm +vamoswapd_v 00001 . . ..... ..... 111 ..... 0101111 @r_wdvm +vamoaddw_v 00000 . . ..... ..... 110 ..... 0101111 @r_wdvm +vamoaddd_v 00000 . . ..... ..... 111 ..... 0101111 @r_wdvm +vamoxorw_v 00100 . . ..... ..... 110 ..... 0101111 @r_wdvm +vamoxord_v 00100 . . ..... ..... 111 ..... 0101111 @r_wdvm +vamoandw_v 01100 . . ..... ..... 110 ..... 0101111 @r_wdvm +vamoandd_v 01100 . . ..... ..... 111 ..... 0101111 @r_wdvm +vamoorw_v 01000 . . ..... ..... 110 ..... 0101111 @r_wdvm +vamoord_v 01000 . . ..... ..... 111 ..... 0101111 @r_wdvm +vamominw_v 10000 . . ..... ..... 110 ..... 0101111 @r_wdvm +vamomind_v 10000 . . ..... ..... 111 ..... 0101111 @r_wdvm +vamomaxw_v 10100 . . ..... ..... 110 ..... 0101111 @r_wdvm +vamomaxd_v 10100 . . ..... ..... 111 ..... 0101111 @r_wdvm +vamominuw_v 11000 . . ..... ..... 110 ..... 0101111 @r_wdvm +vamominud_v 11000 . . ..... ..... 111 ..... 0101111 @r_wdvm +vamomaxuw_v 11100 . . ..... ..... 110 ..... 0101111 @r_wdvm +vamomaxud_v 11100 . . ..... ..... 111 ..... 0101111 @r_wdvm + #*** new major opcode OP-V *** vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 111 ..... 
1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index bd83885..7bda378 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -47,6 +47,23 @@ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ return true; \ } +#define GEN_VECTOR_R_WDVM(INSN) \ +static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ +{ \ + TCGv_i32 s1 = tcg_const_i32(a->rs1); \ + TCGv_i32 s2 = tcg_const_i32(a->rs2); \ + TCGv_i32 d = tcg_const_i32(a->rd); \ + TCGv_i32 wd = tcg_const_i32(a->wd); \ + TCGv_i32 vm = tcg_const_i32(a->vm); \ + gen_helper_vector_##INSN(cpu_env, wd, vm, s1, s2, d);\ + tcg_temp_free_i32(s1); \ + tcg_temp_free_i32(s2); \ + tcg_temp_free_i32(d); \ + tcg_temp_free_i32(wd); \ + tcg_temp_free_i32(vm); \ + return true; \ +} + #define GEN_VECTOR_R(INSN) \ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ { \ @@ -119,5 +136,24 @@ GEN_VECTOR_R_NFVM(vsuxh_v) GEN_VECTOR_R_NFVM(vsuxw_v) GEN_VECTOR_R_NFVM(vsuxe_v) +GEN_VECTOR_R_WDVM(vamoswapw_v) +GEN_VECTOR_R_WDVM(vamoswapd_v) +GEN_VECTOR_R_WDVM(vamoaddw_v) +GEN_VECTOR_R_WDVM(vamoaddd_v) +GEN_VECTOR_R_WDVM(vamoxorw_v) +GEN_VECTOR_R_WDVM(vamoxord_v) +GEN_VECTOR_R_WDVM(vamoandw_v) +GEN_VECTOR_R_WDVM(vamoandd_v) +GEN_VECTOR_R_WDVM(vamoorw_v) +GEN_VECTOR_R_WDVM(vamoord_v) +GEN_VECTOR_R_WDVM(vamominw_v) +GEN_VECTOR_R_WDVM(vamomind_v) +GEN_VECTOR_R_WDVM(vamomaxw_v) +GEN_VECTOR_R_WDVM(vamomaxd_v) +GEN_VECTOR_R_WDVM(vamominuw_v) +GEN_VECTOR_R_WDVM(vamominud_v) +GEN_VECTOR_R_WDVM(vamomaxuw_v) +GEN_VECTOR_R_WDVM(vamomaxud_v) + GEN_VECTOR_R2_ZIMM(vsetvli) GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 0ac8c74..9ebf70d 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -136,6 +136,21 @@ static bool vector_lmul_check_reg(CPURISCVState *env, uint32_t lmul, return true; } +static void vector_tail_amo(CPURISCVState *env, int vreg, int 
index, int width) +{ + switch (width) { + case 32: + env->vfp.vreg[vreg].u32[index] = 0; + break; + case 64: + env->vfp.vreg[vreg].u64[index] = 0; + break; + default: + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + return; + } +} + static void vector_tail_segment(CPURISCVState *env, int vreg, int index, int width, int nf, int lmul) { @@ -3329,3 +3344,1455 @@ void VECTOR_HELPER(vleff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm, env->vfp.vl = vl; env->vfp.vstart = 0; } + +void VECTOR_HELPER(vamoswapw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TESL; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 32 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd is set, rd is written with the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int32_t tmp; + idx = (target_long)env->vfp.vreg[src2].s32[j]; + addr = idx + env->gpr[rs1]; +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_xchgl_le(env, addr, + env->vfp.vreg[src3].s32[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_xchgl_le(env, addr, + 
env->vfp.vreg[src3].s32[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s32[j] = tmp; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = (int64_t)(int32_t)helper_atomic_xchgl_le(env, addr, + env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = (int64_t)(int32_t)helper_atomic_xchgl_le(env, addr, + env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamoswapd_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TEQ; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 64 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 64: + if (vector_elem_mask(env, vm, width, 
lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_xchgq_le(env, addr, + env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_xchgq_le(env, addr, + env->vfp.vreg[src3].s64[j]); +#endif + + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamoaddw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TESL; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 32 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int32_t tmp; + idx = (target_long)env->vfp.vreg[src2].s32[j]; + addr = idx + env->gpr[rs1]; +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_addl_le(env, addr, + env->vfp.vreg[src3].s32[j], + 
make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_addl_le(env, addr, + env->vfp.vreg[src3].s32[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s32[j] = tmp; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = (int64_t)(int32_t)helper_atomic_fetch_addl_le(env, + addr, env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = (int64_t)(int32_t)helper_atomic_fetch_addl_le(env, + addr, env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vamoaddd_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TEQ; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 64 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + 
continue; + } else if (i < vl) { + switch (width) { + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_addq_le(env, addr, + env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_addq_le(env, addr, + env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamoxorw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TESL; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 32 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int32_t tmp; + idx = (target_long)env->vfp.vreg[src2].s32[j]; + addr = idx + 
env->gpr[rs1]; +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_xorl_le(env, addr, + env->vfp.vreg[src3].s32[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_xorl_le(env, addr, + env->vfp.vreg[src3].s32[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s32[j] = tmp; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = (int64_t)(int32_t)helper_atomic_fetch_xorl_le(env, + addr, env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = (int64_t)(int32_t)helper_atomic_fetch_xorl_le(env, + addr, env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamoxord_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TEQ; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 64 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd is set, rd is written with the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = 
vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_xorq_le(env, addr, + env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_xorq_le(env, addr, + env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamoandw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TESL; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 32 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if 
(vector_elem_mask(env, vm, width, lmul, i)) { + int32_t tmp; + idx = (target_long)env->vfp.vreg[src2].s32[j]; + addr = idx + env->gpr[rs1]; +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_andl_le(env, addr, + env->vfp.vreg[src3].s32[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_andl_le(env, addr, + env->vfp.vreg[src3].s32[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s32[j] = tmp; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = (int64_t)(int32_t)helper_atomic_fetch_andl_le(env, + addr, env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = (int64_t)(int32_t)helper_atomic_fetch_andl_le(env, + addr, env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamoandd_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TEQ; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 64 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + 
vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_andq_le(env, addr, + env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_andq_le(env, addr, + env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamoorw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TESL; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 32 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + 
j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int32_t tmp; + idx = (target_long)env->vfp.vreg[src2].s32[j]; + addr = idx + env->gpr[rs1]; +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_orl_le(env, addr, + env->vfp.vreg[src3].s32[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_orl_le(env, addr, + env->vfp.vreg[src3].s32[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s32[j] = tmp; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = (int64_t)(int32_t)helper_atomic_fetch_orl_le(env, + addr, env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = (int64_t)(int32_t)helper_atomic_fetch_orl_le(env, + addr, env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamoord_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TEQ; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 64 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + 
(vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_orq_le(env, addr, + env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_orq_le(env, addr, + env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamominw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TESL; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 32 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + 
vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int32_t tmp; + idx = (target_long)env->vfp.vreg[src2].s32[j]; + addr = idx + env->gpr[rs1]; +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_sminl_le(env, addr, + env->vfp.vreg[src3].s32[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_sminl_le(env, addr, + env->vfp.vreg[src3].s32[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s32[j] = tmp; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = (int64_t)(int32_t)helper_atomic_fetch_sminl_le(env, + addr, env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = (int64_t)(int32_t)helper_atomic_fetch_sminl_le(env, + addr, env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamomind_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TEQ; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 64 || (width > sizeof(target_ulong) * 8)) { + 
riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_sminq_le(env, addr, + env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_sminq_le(env, addr, + env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamomaxw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TESL; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 32 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { 
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int32_t tmp; + idx = (target_long)env->vfp.vreg[src2].s32[j]; + addr = idx + env->gpr[rs1]; +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_smaxl_le(env, addr, + env->vfp.vreg[src3].s32[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_smaxl_le(env, addr, + env->vfp.vreg[src3].s32[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s32[j] = tmp; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = (int64_t)(int32_t)helper_atomic_fetch_smaxl_le(env, + addr, env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = (int64_t)(int32_t)helper_atomic_fetch_smaxl_le(env, + addr, env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamomaxd_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TEQ; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + 
vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 64 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + int64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_smaxq_le(env, addr, + env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_smaxq_le(env, addr, + env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamominuw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TESL; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 32 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* 
if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + uint32_t tmp; + idx = (target_long)env->vfp.vreg[src2].s32[j]; + addr = idx + env->gpr[rs1]; +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_uminl_le(env, addr, + env->vfp.vreg[src3].s32[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_uminl_le(env, addr, + env->vfp.vreg[src3].s32[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s32[j] = tmp; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + uint64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = (int64_t)(int32_t)helper_atomic_fetch_uminl_le( + env, addr, env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = (int64_t)(int32_t)helper_atomic_fetch_uminl_le( + env, addr, env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vamominud_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp 
memop = MO_ALIGN | MO_TEQ; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 64 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd is set, rd is written with the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + uint32_t tmp; + idx = (target_long)env->vfp.vreg[src2].s32[j]; + addr = idx + env->gpr[rs1]; +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_uminl_le(env, addr, + env->vfp.vreg[src3].s32[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_uminl_le(env, addr, + env->vfp.vreg[src3].s32[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s32[j] = tmp; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + uint64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_uminq_le( + env, addr, env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_uminq_le(env, addr, + env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + +void 
VECTOR_HELPER(vamomaxuw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TESL; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 32 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is writen the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + uint32_t tmp; + idx = (target_long)env->vfp.vreg[src2].s32[j]; + addr = idx + env->gpr[rs1]; +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_umaxl_le(env, addr, + env->vfp.vreg[src3].s32[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_umaxl_le(env, addr, + env->vfp.vreg[src3].s32[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s32[j] = tmp; + } + env->vfp.vstart++; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + uint64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = (int64_t)(int32_t)helper_atomic_fetch_umaxl_le( + env, addr, env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = 
(int64_t)(int32_t)helper_atomic_fetch_umaxl_le( + env, addr, env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vamomaxud_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, + uint32_t rs1, uint32_t vs2, uint32_t vs3) +{ + int i, j, vl; + target_long idx; + uint32_t lmul, width, src2, src3, vlmax; + target_ulong addr; +#ifdef CONFIG_SOFTMMU + int mem_idx = cpu_mmu_index(env, false); + TCGMemOp memop = MO_ALIGN | MO_TEQ; +#endif + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + /* MEM <= SEW <= XLEN */ + if (width < 64 || (width > sizeof(target_ulong) * 8)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* if wd, rd is written with the old value */ + if (vector_vtype_ill(env) || + (vector_overlap_vm_common(lmul, vm, vs3) && wd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, vs2, false); + vector_lmul_check_reg(env, lmul, vs3, false); + + for (i = 0; i < vlmax; i++) { + src2 = vs2 + (i / (VLEN / width)); + src3 = vs3 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + uint64_t tmp; + idx = (target_long)env->vfp.vreg[src2].s64[j]; + addr = idx + env->gpr[rs1]; + +#ifdef CONFIG_SOFTMMU + tmp = helper_atomic_fetch_umaxq_le( + env, addr, env->vfp.vreg[src3].s64[j], + make_memop_idx(memop & ~MO_SIGN, mem_idx)); +#else + tmp = helper_atomic_fetch_umaxq_le(env, addr, + env->vfp.vreg[src3].s64[j]); +#endif + if (wd) { + env->vfp.vreg[src3].s64[j] = tmp; + } + env->vfp.vstart++; + } + break; + default: +
riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_amo(env, src3, j, width); + } + } + env->vfp.vstart = 0; +} + From patchwork Wed Sep 11 06:25:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140393 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B10B914ED for ; Wed, 11 Sep 2019 06:43:59 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 719CE2089F for ; Wed, 11 Sep 2019 06:43:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 719CE2089F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46920 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wMA-0000NN-As for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:43:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38517) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wDg-0007AW-79 for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:20 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7wDW-0007pK-5M for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:12 -0400 Received: from smtp2200-217.mail.aliyun.com ([121.197.200.217]:56549) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDU-0007lQ-E2; Wed, 11 Sep 2019 02:35:02 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.03883426|-1; CH=green; 
DM=CONTINUE|CONTINUE|true|0.259867-0.00419391-0.735939; FP=0|0|0|0|0|-1|-1|-1; HT=e02c03276; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.FSRSzSU_1568183695; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRSzSU_1568183695) by smtp.aliyun-inc.com(10.147.41.178); Wed, 11 Sep 2019 14:34:55 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:32 +0800 Message-Id: <1568183141-67641-9-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 121.197.200.217 Subject: [Qemu-devel] [PATCH v2 08/17] RISC-V: add vector extension integer instructions part1, add/sub/adc/sbc X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 36 + target/riscv/insn32.decode | 35 + target/riscv/insn_trans/trans_rvv.inc.c | 49 + target/riscv/vector_helper.c | 2335 +++++++++++++++++++++++++++++++ 4 files changed, 2455 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index c107925..31e20dc 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -121,6 +121,7 @@ DEF_HELPER_6(vector_vsuxb_v, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(vector_vsuxh_v, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(vector_vsuxw_v, void, env, i32, i32, i32, i32, i32) 
DEF_HELPER_6(vector_vsuxe_v, void, env, i32, i32, i32, i32, i32) + DEF_HELPER_6(vector_vamoswapw_v, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(vector_vamoswapd_v, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(vector_vamoaddw_v, void, env, i32, i32, i32, i32, i32) @@ -139,5 +140,40 @@ DEF_HELPER_6(vector_vamominuw_v, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(vector_vamominud_v, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(vector_vamomaxuw_v, void, env, i32, i32, i32, i32, i32) DEF_HELPER_6(vector_vamomaxud_v, void, env, i32, i32, i32, i32, i32) + +DEF_HELPER_4(vector_vadc_vvm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vadc_vxm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vadc_vim, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmadc_vvm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmadc_vxm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmadc_vim, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vsbc_vvm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vsbc_vxm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmsbc_vvm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmsbc_vxm, void, env, i32, i32, i32) +DEF_HELPER_5(vector_vadd_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vadd_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vadd_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsub_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsub_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vrsub_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vrsub_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwaddu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwaddu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwadd_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwadd_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsubu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsubu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsub_vv, void, env, i32, 
i32, i32, i32) +DEF_HELPER_5(vector_vwsub_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwaddu_wv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwaddu_wx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwadd_wv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwadd_wx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsubu_wv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsubu_wx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsub_wv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsub_wx, void, env, i32, i32, i32, i32) + DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 48e7661..fc7e498 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -63,6 +63,7 @@ @r_rm ....... ..... ..... ... ..... ....... %rs2 %rs1 %rm %rd @r2_rm ....... ..... ..... ... ..... ....... %rs1 %rm %rd @r2 ....... ..... ..... ... ..... ....... %rs1 %rd +@r_vm ...... vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd @r_wdvm ..... wd:1 vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd @r_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd @r2_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs1 %rd @@ -280,5 +281,39 @@ vamomaxuw_v 11100 . . ..... ..... 110 ..... 0101111 @r_wdvm vamomaxud_v 11100 . . ..... ..... 111 ..... 0101111 @r_wdvm #*** new major opcode OP-V *** +vadd_vv 000000 . ..... ..... 000 ..... 1010111 @r_vm +vadd_vx 000000 . ..... ..... 100 ..... 1010111 @r_vm +vadd_vi 000000 . ..... ..... 011 ..... 1010111 @r_vm +vsub_vv 000010 . ..... ..... 000 ..... 1010111 @r_vm +vsub_vx 000010 . ..... ..... 100 ..... 1010111 @r_vm +vrsub_vx 000011 . ..... ..... 100 ..... 1010111 @r_vm +vrsub_vi 000011 . ..... ..... 011 ..... 1010111 @r_vm +vwaddu_vv 110000 . ..... ..... 010 ..... 1010111 @r_vm +vwaddu_vx 110000 . ..... ..... 110 ..... 1010111 @r_vm +vwadd_vv 110001 . ..... ..... 
010 ..... 1010111 @r_vm +vwadd_vx 110001 . ..... ..... 110 ..... 1010111 @r_vm +vwsubu_vv 110010 . ..... ..... 010 ..... 1010111 @r_vm +vwsubu_vx 110010 . ..... ..... 110 ..... 1010111 @r_vm +vwsub_vv 110011 . ..... ..... 010 ..... 1010111 @r_vm +vwsub_vx 110011 . ..... ..... 110 ..... 1010111 @r_vm +vwaddu_wv 110100 . ..... ..... 010 ..... 1010111 @r_vm +vwaddu_wx 110100 . ..... ..... 110 ..... 1010111 @r_vm +vwadd_wv 110101 . ..... ..... 010 ..... 1010111 @r_vm +vwadd_wx 110101 . ..... ..... 110 ..... 1010111 @r_vm +vwsubu_wv 110110 . ..... ..... 010 ..... 1010111 @r_vm +vwsubu_wx 110110 . ..... ..... 110 ..... 1010111 @r_vm +vwsub_wv 110111 . ..... ..... 010 ..... 1010111 @r_vm +vwsub_wx 110111 . ..... ..... 110 ..... 1010111 @r_vm +vadc_vvm 010000 1 ..... ..... 000 ..... 1010111 @r +vadc_vxm 010000 1 ..... ..... 100 ..... 1010111 @r +vadc_vim 010000 1 ..... ..... 011 ..... 1010111 @r +vmadc_vvm 010001 1 ..... ..... 000 ..... 1010111 @r +vmadc_vxm 010001 1 ..... ..... 100 ..... 1010111 @r +vmadc_vim 010001 1 ..... ..... 011 ..... 1010111 @r +vsbc_vvm 010010 1 ..... ..... 000 ..... 1010111 @r +vsbc_vxm 010010 1 ..... ..... 100 ..... 1010111 @r +vmsbc_vvm 010011 1 ..... ..... 000 ..... 1010111 @r +vmsbc_vxm 010011 1 ..... ..... 100 ..... 1010111 @r + vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 111 ..... 
1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index 7bda378..a1c1960 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -77,6 +77,21 @@ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ return true; \ } +#define GEN_VECTOR_R_VM(INSN) \ +static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ +{ \ + TCGv_i32 s1 = tcg_const_i32(a->rs1); \ + TCGv_i32 s2 = tcg_const_i32(a->rs2); \ + TCGv_i32 d = tcg_const_i32(a->rd); \ + TCGv_i32 vm = tcg_const_i32(a->vm); \ + gen_helper_vector_##INSN(cpu_env, vm, s1, s2, d); \ + tcg_temp_free_i32(s1); \ + tcg_temp_free_i32(s2); \ + tcg_temp_free_i32(d); \ + tcg_temp_free_i32(vm); \ + return true; \ +} + #define GEN_VECTOR_R2_ZIMM(INSN) \ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ { \ @@ -155,5 +170,39 @@ GEN_VECTOR_R_WDVM(vamominud_v) GEN_VECTOR_R_WDVM(vamomaxuw_v) GEN_VECTOR_R_WDVM(vamomaxud_v) +GEN_VECTOR_R(vadc_vvm) +GEN_VECTOR_R(vadc_vxm) +GEN_VECTOR_R(vadc_vim) +GEN_VECTOR_R(vmadc_vvm) +GEN_VECTOR_R(vmadc_vxm) +GEN_VECTOR_R(vmadc_vim) +GEN_VECTOR_R(vsbc_vvm) +GEN_VECTOR_R(vsbc_vxm) +GEN_VECTOR_R(vmsbc_vvm) +GEN_VECTOR_R(vmsbc_vxm) +GEN_VECTOR_R_VM(vadd_vv) +GEN_VECTOR_R_VM(vadd_vx) +GEN_VECTOR_R_VM(vadd_vi) +GEN_VECTOR_R_VM(vsub_vv) +GEN_VECTOR_R_VM(vsub_vx) +GEN_VECTOR_R_VM(vrsub_vx) +GEN_VECTOR_R_VM(vrsub_vi) +GEN_VECTOR_R_VM(vwaddu_vv) +GEN_VECTOR_R_VM(vwaddu_vx) +GEN_VECTOR_R_VM(vwadd_vv) +GEN_VECTOR_R_VM(vwadd_vx) +GEN_VECTOR_R_VM(vwsubu_vv) +GEN_VECTOR_R_VM(vwsubu_vx) +GEN_VECTOR_R_VM(vwsub_vv) +GEN_VECTOR_R_VM(vwsub_vx) +GEN_VECTOR_R_VM(vwaddu_wv) +GEN_VECTOR_R_VM(vwaddu_wx) +GEN_VECTOR_R_VM(vwadd_wv) +GEN_VECTOR_R_VM(vwadd_wx) +GEN_VECTOR_R_VM(vwsubu_wv) +GEN_VECTOR_R_VM(vwsubu_wx) +GEN_VECTOR_R_VM(vwsub_wv) +GEN_VECTOR_R_VM(vwsub_wx) + GEN_VECTOR_R2_ZIMM(vsetvli) GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 
9ebf70d..95336c9 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -24,12 +24,21 @@ #include #define VECTOR_HELPER(name) HELPER(glue(vector_, name)) +#define SIGNBIT8 (1 << 7) +#define SIGNBIT16 (1 << 15) +#define SIGNBIT32 (1 << 31) +#define SIGNBIT64 ((uint64_t)1 << 63) static int64_t sign_extend(int64_t a, int8_t width) { return a << (64 - width) >> (64 - width); } +static int64_t extend_gpr(target_ulong reg) +{ + return sign_extend(reg, sizeof(target_ulong) * 8); +} + static target_ulong vector_get_index(CPURISCVState *env, int rs1, int rs2, int index, int mem, int width, int nf) { @@ -118,6 +127,39 @@ static inline bool vector_overlap_vm_common(int lmul, int vm, int rd) return false; } +static inline bool vector_overlap_vm_force(int vm, int rd) +{ + if (vm == 0 && rd == 0) { + return true; + } + return false; +} + +static inline bool vector_overlap_carry(int lmul, int rd) +{ + if (lmul > 1 && rd == 0) { + return true; + } + return false; +} + +static inline bool vector_overlap_dstgp_srcgp(int rd, int dlen, int rs, + int slen) +{ + if ((rd >= rs && rd < rs + slen) || (rs >= rd && rs < rd + dlen)) { + return true; + } + return false; +} + +static inline void vector_get_layout(CPURISCVState *env, int width, int lmul, + int index, int *idx, int *pos) +{ + int mlen = width / lmul; + *idx = (index * mlen) / 8; + *pos = (index * mlen) % 8; +} + static bool vector_lmul_check_reg(CPURISCVState *env, uint32_t lmul, uint32_t reg, bool widen) { @@ -185,6 +227,173 @@ static void vector_tail_segment(CPURISCVState *env, int vreg, int index, } } +static void vector_tail_common(CPURISCVState *env, int vreg, int index, + int width) +{ + switch (width) { + case 8: + env->vfp.vreg[vreg].u8[index] = 0; + break; + case 16: + env->vfp.vreg[vreg].u16[index] = 0; + break; + case 32: + env->vfp.vreg[vreg].u32[index] = 0; + break; + case 64: + env->vfp.vreg[vreg].u64[index] = 0; + break; + default: + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + 
return; + } +} + +static void vector_tail_widen(CPURISCVState *env, int vreg, int index, + int width) +{ + switch (width) { + case 8: + env->vfp.vreg[vreg].u16[index] = 0; + break; + case 16: + env->vfp.vreg[vreg].u32[index] = 0; + break; + case 32: + env->vfp.vreg[vreg].u64[index] = 0; + break; + default: + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + return; + } +} + +static inline int vector_get_carry(CPURISCVState *env, int width, int lmul, + int index) +{ + int mlen = width / lmul; + int idx = (index * mlen) / 8; + int pos = (index * mlen) % 8; + + return (env->vfp.vreg[0].u8[idx] >> pos) & 0x1; +} + +static inline void vector_mask_result(CPURISCVState *env, uint32_t reg, + int width, int lmul, int index, uint32_t result) +{ + int mlen = width / lmul; + int idx = (index * mlen) / width; + int pos = (index * mlen) % width; + uint64_t mask = ~((((uint64_t)1 << mlen) - 1) << pos); + + switch (width) { + case 8: + env->vfp.vreg[reg].u8[idx] = (env->vfp.vreg[reg].u8[idx] & mask) + | (result << pos); + break; + case 16: + env->vfp.vreg[reg].u16[idx] = (env->vfp.vreg[reg].u16[idx] & mask) + | (result << pos); + break; + case 32: + env->vfp.vreg[reg].u32[idx] = (env->vfp.vreg[reg].u32[idx] & mask) + | (result << pos); + break; + case 64: + env->vfp.vreg[reg].u64[idx] = (env->vfp.vreg[reg].u64[idx] & mask) + | ((uint64_t)result << pos); + break; + default: + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + break; + } + + return; +} + +static inline uint64_t u64xu64_lh(uint64_t a, uint64_t b) +{ + uint64_t hi_64, carry; + + /* first get the whole product in {hi_64, lo_64} */ + uint64_t a_hi = a >> 32; + uint64_t a_lo = (uint32_t)a; + uint64_t b_hi = b >> 32; + uint64_t b_lo = (uint32_t)b; + + /* + * a * b = (a_hi << 32 + a_lo) * (b_hi << 32 + b_lo) + * = (a_hi * b_hi) << 64 + (a_hi * b_lo) << 32 + + * (a_lo * b_hi) << 32 + a_lo * b_lo + * = {hi_64, lo_64} + * hi_64 = ((a_hi * b_lo) << 32 + (a_lo * b_hi) << 32 + (a_lo * b_lo)) >> 64 + * = (a_hi * b_lo) 
>> 32 + (a_lo * b_hi) >> 32 + carry + * carry = ((uint64_t)(uint32_t)(a_hi * b_lo) + + * (uint64_t)(uint32_t)(a_lo * b_hi) + (a_lo * b_lo) >> 32) >> 32 + */ + + carry = ((uint64_t)(uint32_t)(a_hi * b_lo) + + (uint64_t)(uint32_t)(a_lo * b_hi) + + ((a_lo * b_lo) >> 32)) >> 32; + + hi_64 = a_hi * b_hi + + ((a_hi * b_lo) >> 32) + ((a_lo * b_hi) >> 32) + + carry; + + return hi_64; +} + +static inline int64_t s64xu64_lh(int64_t a, uint64_t b) +{ + uint64_t abs_a = a; + uint64_t lo_64, hi_64; + + if (a < 0) { + abs_a = ~a + 1; + } + lo_64 = abs_a * b; + hi_64 = u64xu64_lh(abs_a, b); + + if ((a ^ b) & SIGNBIT64) { + lo_64 = ~lo_64; + hi_64 = ~hi_64; + if (lo_64 == UINT64_MAX) { + lo_64 = 0; + hi_64 += 1; + } else { + lo_64 += 1; + } + } + return hi_64; +} + +static inline int64_t s64xs64_lh(int64_t a, int64_t b) +{ + uint64_t abs_a = a, abs_b = b; + uint64_t lo_64, hi_64; + + if (a < 0) { + abs_a = ~a + 1; + } + if (b < 0) { + abs_b = ~b + 1; + } + lo_64 = abs_a * abs_b; + hi_64 = u64xu64_lh(abs_a, abs_b); + + if ((a ^ b) & SIGNBIT64) { + lo_64 = ~lo_64; + hi_64 = ~hi_64; + if (lo_64 == UINT64_MAX) { + lo_64 = 0; + hi_64 += 1; + } else { + lo_64 += 1; + } + } + return hi_64; +} + void VECTOR_HELPER(vsetvl)(CPURISCVState *env, uint32_t rs1, uint32_t rs2, uint32_t rd) { @@ -4796,3 +5005,2129 @@ void VECTOR_HELPER(vamomaxud_v)(CPURISCVState *env, uint32_t wd, uint32_t vm, env->vfp.vstart = 0; } +void VECTOR_HELPER(vadc_vvm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax, carry; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_carry(lmul, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i 
< vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j] + + env->vfp.vreg[src2].u8[j] + carry; + break; + case 16: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j] + + env->vfp.vreg[src2].u16[j] + carry; + break; + case 32: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j] + + env->vfp.vreg[src2].u32[j] + carry; + break; + case 64: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j] + + env->vfp.vreg[src2].u64[j] + carry; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vadc_vxm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax, carry; + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_carry(lmul, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u8[j] = env->gpr[rs1] + + env->vfp.vreg[src2].u8[j] + carry; + break; + case 16: + carry = 
vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u16[j] = env->gpr[rs1] + + env->vfp.vreg[src2].u16[j] + carry; + break; + case 32: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u32[j] = env->gpr[rs1] + + env->vfp.vreg[src2].u32[j] + carry; + break; + case 64: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u64[j] = (uint64_t)extend_gpr(env->gpr[rs1]) + + env->vfp.vreg[src2].u64[j] + carry; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vadc_vim)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax, carry; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_carry(lmul, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u8[j] = sign_extend(rs1, 5) + + env->vfp.vreg[src2].u8[j] + carry; + break; + case 16: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u16[j] = sign_extend(rs1, 5) + + env->vfp.vreg[src2].u16[j] + carry; + break; + case 32: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u32[j] = sign_extend(rs1, 5) + + env->vfp.vreg[src2].u32[j] + carry; + break; + case 64: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u64[j] = sign_extend(rs1, 5) + + env->vfp.vreg[src2].u64[j] + 
carry; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmadc_vvm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, vlmax, carry; + uint64_t tmp; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_dstgp_srcgp(rd, 1, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 1, rs2, lmul) + || (rd == 0)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + carry = vector_get_carry(env, width, lmul, i); + tmp = env->vfp.vreg[src1].u8[j] + + env->vfp.vreg[src2].u8[j] + carry; + tmp = tmp >> width; + + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 16: + carry = vector_get_carry(env, width, lmul, i); + tmp = env->vfp.vreg[src1].u16[j] + + env->vfp.vreg[src2].u16[j] + carry; + tmp = tmp >> width; + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 32: + carry = vector_get_carry(env, width, lmul, i); + tmp = (uint64_t)env->vfp.vreg[src1].u32[j] + + (uint64_t)env->vfp.vreg[src2].u32[j] + carry; + tmp = tmp >> width; + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 64: + carry = vector_get_carry(env, width, lmul, i); + tmp = env->vfp.vreg[src1].u64[j] + + env->vfp.vreg[src2].u64[j] + carry; + + if ((tmp < env->vfp.vreg[src1].u64[j] || + tmp < env->vfp.vreg[src2].u64[j]) + || (env->vfp.vreg[src1].u64[j] == UINT64_MAX && + env->vfp.vreg[src2].u64[j] 
== UINT64_MAX)) { + tmp = 1; + } else { + tmp = 0; + } + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmadc_vxm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax, carry; + uint64_t tmp, extend_rs1; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_dstgp_srcgp(rd, 1, rs2, lmul) + || (rd == 0)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + carry = vector_get_carry(env, width, lmul, i); + tmp = (uint8_t)env->gpr[rs1] + + env->vfp.vreg[src2].u8[j] + carry; + tmp = tmp >> width; + + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 16: + carry = vector_get_carry(env, width, lmul, i); + tmp = (uint16_t)env->gpr[rs1] + + env->vfp.vreg[src2].u16[j] + carry; + tmp = tmp >> width; + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 32: + carry = vector_get_carry(env, width, lmul, i); + tmp = (uint64_t)((uint32_t)env->gpr[rs1]) + + (uint64_t)env->vfp.vreg[src2].u32[j] + carry; + tmp = tmp >> width; + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 64: + carry = vector_get_carry(env, width, lmul, i); + + extend_rs1 = (uint64_t)extend_gpr(env->gpr[rs1]); + tmp = extend_rs1 + env->vfp.vreg[src2].u64[j] + carry; + if ((tmp < extend_rs1) || + (carry && 
(env->vfp.vreg[src2].u64[j] == UINT64_MAX))) { + tmp = 1; + } else { + tmp = 0; + } + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmadc_vim)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax, carry; + uint64_t tmp; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_dstgp_srcgp(rd, 1, rs2, lmul) + || (rd == 0)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + carry = vector_get_carry(env, width, lmul, i); + tmp = (uint8_t)sign_extend(rs1, 5) + + env->vfp.vreg[src2].u8[j] + carry; + tmp = tmp >> width; + + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 16: + carry = vector_get_carry(env, width, lmul, i); + tmp = (uint16_t)sign_extend(rs1, 5) + + env->vfp.vreg[src2].u16[j] + carry; + tmp = tmp >> width; + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 32: + carry = vector_get_carry(env, width, lmul, i); + tmp = (uint64_t)((uint32_t)sign_extend(rs1, 5)) + + (uint64_t)env->vfp.vreg[src2].u32[j] + carry; + tmp = tmp >> width; + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 64: + carry = vector_get_carry(env, width, lmul, i); + tmp = (uint64_t)sign_extend(rs1, 5) + + env->vfp.vreg[src2].u64[j] + carry; + + if ((tmp < (uint64_t)sign_extend(rs1, 
5) || + tmp < env->vfp.vreg[src2].u64[j]) + || ((uint64_t)sign_extend(rs1, 5) == UINT64_MAX && + env->vfp.vreg[src2].u64[j] == UINT64_MAX)) { + tmp = 1; + } else { + tmp = 0; + } + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsbc_vvm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax, carry; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_carry(lmul, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] + - env->vfp.vreg[src1].u8[j] - carry; + break; + case 16: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + - env->vfp.vreg[src1].u16[j] - carry; + break; + case 32: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j] + - env->vfp.vreg[src1].u32[j] - carry; + break; + case 64: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + - 
env->vfp.vreg[src1].u64[j] - carry; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vsbc_vxm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax, carry; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_carry(lmul, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] + - env->gpr[rs1] - carry; + break; + case 16: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + - env->gpr[rs1] - carry; + break; + case 32: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j] + - env->gpr[rs1] - carry; + break; + case 64: + carry = vector_get_carry(env, width, lmul, i); + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + - (uint64_t)extend_gpr(env->gpr[rs1]) - carry; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmsbc_vvm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, vlmax, carry; + uint64_t tmp; + + vl = env->vfp.vl; + lmul = 
vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_dstgp_srcgp(rd, 1, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 1, rs2, lmul) + || (rd == 0)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + carry = vector_get_carry(env, width, lmul, i); + tmp = env->vfp.vreg[src2].u8[j] + - env->vfp.vreg[src1].u8[j] - carry; + tmp = (tmp >> width) & 0x1; + + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 16: + carry = vector_get_carry(env, width, lmul, i); + tmp = env->vfp.vreg[src2].u16[j] + - env->vfp.vreg[src1].u16[j] - carry; + tmp = (tmp >> width) & 0x1; + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 32: + carry = vector_get_carry(env, width, lmul, i); + tmp = (uint64_t)env->vfp.vreg[src2].u32[j] + - (uint64_t)env->vfp.vreg[src1].u32[j] - carry; + tmp = (tmp >> width) & 0x1; + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 64: + carry = vector_get_carry(env, width, lmul, i); + tmp = env->vfp.vreg[src2].u64[j] + - env->vfp.vreg[src1].u64[j] - carry; + + if (((env->vfp.vreg[src1].u64[j] == UINT64_MAX) && carry) || + env->vfp.vreg[src2].u64[j] < + (env->vfp.vreg[src1].u64[j] + carry)) { + tmp = 1; + } else { + tmp = 0; + } + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void 
VECTOR_HELPER(vmsbc_vxm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax, carry; + uint64_t tmp, extend_rs1; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_dstgp_srcgp(rd, 1, rs2, lmul) + || (rd == 0)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + carry = vector_get_carry(env, width, lmul, i); + tmp = env->vfp.vreg[src2].u8[j] + - (uint8_t)env->gpr[rs1] - carry; + tmp = (tmp >> width) & 0x1; + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 16: + carry = vector_get_carry(env, width, lmul, i); + tmp = env->vfp.vreg[src2].u16[j] + - (uint16_t)env->gpr[rs1] - carry; + tmp = (tmp >> width) & 0x1; + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 32: + carry = vector_get_carry(env, width, lmul, i); + tmp = (uint64_t)env->vfp.vreg[src2].u32[j] + - (uint64_t)((uint32_t)env->gpr[rs1]) - carry; + tmp = (tmp >> width) & 0x1; + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + case 64: + carry = vector_get_carry(env, width, lmul, i); + + extend_rs1 = (uint64_t)extend_gpr(env->gpr[rs1]); + tmp = env->vfp.vreg[src2].u64[j] - extend_rs1 - carry; + + if ((tmp > env->vfp.vreg[src2].u64[j]) || + ((extend_rs1 == UINT64_MAX) && carry)) { + tmp = 1; + } else { + tmp = 0; + } + vector_mask_result(env, rd, width, lmul, i, tmp); + break; + + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + 
return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j] + + env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j] + + env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j] + + env->vfp.vreg[src2].u32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j] + + env->vfp.vreg[src2].u64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vadd_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = 
vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->gpr[rs1] + + env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->gpr[rs1] + + env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->gpr[rs1] + + env->vfp.vreg[src2].u32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = + (uint64_t)extend_gpr(env->gpr[rs1]) + + env->vfp.vreg[src2].u64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vadd_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + 
continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = sign_extend(rs1, 5) + + env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = sign_extend(rs1, 5) + + env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = sign_extend(rs1, 5) + + env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = sign_extend(rs1, 5) + + env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] + - env->vfp.vreg[src1].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + - 
env->vfp.vreg[src1].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j] + - env->vfp.vreg[src1].u32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + - env->vfp.vreg[src1].u64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vsub_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] + - env->gpr[rs1]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + - env->gpr[rs1]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j] + - env->gpr[rs1]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + - (uint64_t)extend_gpr(env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, 
GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vrsub_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->gpr[rs1] + - env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->gpr[rs1] + - env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->gpr[rs1] + - env->vfp.vreg[src2].u32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = + (uint64_t)extend_gpr(env->gpr[rs1]) + - env->vfp.vreg[src2].u64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vrsub_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || 
vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = sign_extend(rs1, 5) + - env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = sign_extend(rs1, 5) + - env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = sign_extend(rs1, 5) + - env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = sign_extend(rs1, 5) + - env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwaddu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul) + ) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / 
(VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = + (uint16_t)env->vfp.vreg[src1].u8[j] + + (uint16_t)env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = + (uint32_t)env->vfp.vreg[src1].u16[j] + + (uint32_t)env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = + (uint64_t)env->vfp.vreg[src1].u32[j] + + (uint64_t)env->vfp.vreg[src2].u32[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwaddu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul) + ) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = + (uint16_t)env->vfp.vreg[src2].u8[j] + + (uint16_t)((uint8_t)env->gpr[rs1]); + } + 
break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = + (uint32_t)env->vfp.vreg[src2].u16[j] + + (uint32_t)((uint16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = + (uint64_t)env->vfp.vreg[src2].u32[j] + + (uint64_t)((uint32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = + (int16_t)env->vfp.vreg[src1].s8[j] + + (int16_t)env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = + (int32_t)env->vfp.vreg[src1].s16[j] + + (int32_t)env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + 
env->vfp.vreg[dest].s64[k] = + (int64_t)env->vfp.vreg[src1].s32[j] + + (int64_t)env->vfp.vreg[src2].s32[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwadd_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = + (int16_t)((int8_t)env->vfp.vreg[src2].s8[j]) + + (int16_t)((int8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = + (int32_t)((int16_t)env->vfp.vreg[src2].s16[j]) + + (int32_t)((int16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = + (int64_t)((int32_t)env->vfp.vreg[src2].s32[j]) + + (int64_t)((int32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwsubu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, 
uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul) + ) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = + (uint16_t)env->vfp.vreg[src2].u8[j] - + (uint16_t)env->vfp.vreg[src1].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = + (uint32_t)env->vfp.vreg[src2].u16[j] - + (uint32_t)env->vfp.vreg[src1].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = + (uint64_t)env->vfp.vreg[src2].u32[j] - + (uint64_t)env->vfp.vreg[src1].u32[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwsubu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + 
|| vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul) + ) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = + (uint16_t)env->vfp.vreg[src2].u8[j] - + (uint16_t)((uint8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = + (uint32_t)env->vfp.vreg[src2].u16[j] - + (uint32_t)((uint16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = + (uint64_t)env->vfp.vreg[src2].u32[j] - + (uint64_t)((uint32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul) + ) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + 
src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = + (int16_t)env->vfp.vreg[src2].s8[j] - + (int16_t)env->vfp.vreg[src1].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = + (int32_t)env->vfp.vreg[src2].s16[j] - + (int32_t)env->vfp.vreg[src1].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = + (int64_t)env->vfp.vreg[src2].s32[j] - + (int64_t)env->vfp.vreg[src1].s32[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vwsub_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul) + ) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = + (int16_t)((int8_t)env->vfp.vreg[src2].s8[j]) - + (int16_t)((int8_t)env->gpr[rs1]); + } + break; + case 16: + if 
(vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = + (int32_t)((int16_t)env->vfp.vreg[src2].s16[j]) - + (int32_t)((int16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = + (int64_t)((int32_t)env->vfp.vreg[src2].s32[j]) - + (int64_t)((int32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwaddu_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = + (uint16_t)env->vfp.vreg[src1].u8[j] + + (uint16_t)env->vfp.vreg[src2].u16[k]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = + (uint32_t)env->vfp.vreg[src1].u16[j] + + (uint32_t)env->vfp.vreg[src2].u32[k]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = + (uint64_t)env->vfp.vreg[src1].u32[j] + + 
(uint64_t)env->vfp.vreg[src2].u64[k]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwaddu_wx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / (2 * width))); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = + (uint16_t)env->vfp.vreg[src2].u16[k] + + (uint16_t)((uint8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = + (uint32_t)env->vfp.vreg[src2].u32[k] + + (uint32_t)((uint16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = + (uint64_t)env->vfp.vreg[src2].u64[k] + + (uint64_t)((uint32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwadd_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax 
= vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = + (int16_t)((int8_t)env->vfp.vreg[src1].s8[j]) + + (int16_t)env->vfp.vreg[src2].s16[k]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = + (int32_t)((int16_t)env->vfp.vreg[src1].s16[j]) + + (int32_t)env->vfp.vreg[src2].s32[k]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = + (int64_t)((int32_t)env->vfp.vreg[src1].s32[j]) + + (int64_t)env->vfp.vreg[src2].s64[k]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwadd_wx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + 
src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / (2 * width))); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = + (int16_t)env->vfp.vreg[src2].s16[k] + + (int16_t)((int8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = + (int32_t)env->vfp.vreg[src2].s32[k] + + (int32_t)((int16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = + (int64_t)env->vfp.vreg[src2].s64[k] + + (int64_t)((int32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwsubu_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = + (uint16_t)env->vfp.vreg[src2].u16[k] - + 
(uint16_t)env->vfp.vreg[src1].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = + (uint32_t)env->vfp.vreg[src2].u32[k] - + (uint32_t)env->vfp.vreg[src1].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = + (uint64_t)env->vfp.vreg[src2].u64[k] - + (uint64_t)env->vfp.vreg[src1].u32[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwsubu_wx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / (2 * width))); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = + (uint16_t)env->vfp.vreg[src2].u16[k] - + (uint16_t)((uint8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = + (uint32_t)env->vfp.vreg[src2].u32[k] - + (uint32_t)((uint16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = + (uint64_t)env->vfp.vreg[src2].u64[k] - + (uint64_t)((uint32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + 
break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwsub_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = + (int16_t)env->vfp.vreg[src2].s16[k] - + (int16_t)((int8_t)env->vfp.vreg[src1].s8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = + (int32_t)env->vfp.vreg[src2].s32[k] - + (int32_t)((int16_t)env->vfp.vreg[src1].s16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = + (int64_t)env->vfp.vreg[src2].s64[k] - + (int64_t)((int32_t)env->vfp.vreg[src1].s32[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwsub_wx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + 
lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / (2 * width))); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = + (int16_t)env->vfp.vreg[src2].s16[k] - + (int16_t)((int8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = + (int32_t)env->vfp.vreg[src2].s32[k] - + (int32_t)((int16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = + (int64_t)env->vfp.vreg[src2].s64[k] - + (int64_t)((int32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +}
From patchwork Wed Sep 11 06:25:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140387 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:33 +0800 Message-Id: <1568183141-67641-10-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> Subject: [Qemu-devel] [PATCH v2 09/17] RISC-V: add vector extension integer instructions part2, bit/shift Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 25 + target/riscv/insn32.decode | 25 + target/riscv/insn_trans/trans_rvv.inc.c | 25 + target/riscv/vector_helper.c | 1477 +++++++++++++++++++++++++++++++ 4 files changed, 1552 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 31e20dc..28863e2 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -175,5 +175,30 @@ DEF_HELPER_5(vector_vwsubu_wx, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vwsub_wv, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vwsub_wx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vand_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vand_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vand_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vor_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vor_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vor_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vxor_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vxor_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vxor_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsll_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsll_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsll_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsrl_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsrl_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsrl_vi, void, env, i32,
i32, i32, i32) +DEF_HELPER_5(vector_vsra_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsra_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsra_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnsrl_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnsrl_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnsrl_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnsra_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnsra_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnsra_vi, void, env, i32, i32, i32, i32) + DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index fc7e498..19710f5 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -315,5 +315,30 @@ vsbc_vxm 010010 1 ..... ..... 100 ..... 1010111 @r vmsbc_vvm 010011 1 ..... ..... 000 ..... 1010111 @r vmsbc_vxm 010011 1 ..... ..... 100 ..... 1010111 @r +vand_vv 001001 . ..... ..... 000 ..... 1010111 @r_vm +vand_vx 001001 . ..... ..... 100 ..... 1010111 @r_vm +vand_vi 001001 . ..... ..... 011 ..... 1010111 @r_vm +vor_vv 001010 . ..... ..... 000 ..... 1010111 @r_vm +vor_vx 001010 . ..... ..... 100 ..... 1010111 @r_vm +vor_vi 001010 . ..... ..... 011 ..... 1010111 @r_vm +vxor_vv 001011 . ..... ..... 000 ..... 1010111 @r_vm +vxor_vx 001011 . ..... ..... 100 ..... 1010111 @r_vm +vxor_vi 001011 . ..... ..... 011 ..... 1010111 @r_vm +vsll_vv 100101 . ..... ..... 000 ..... 1010111 @r_vm +vsll_vx 100101 . ..... ..... 100 ..... 1010111 @r_vm +vsll_vi 100101 . ..... ..... 011 ..... 1010111 @r_vm +vsrl_vv 101000 . ..... ..... 000 ..... 1010111 @r_vm +vsrl_vx 101000 . ..... ..... 100 ..... 1010111 @r_vm +vsrl_vi 101000 . ..... ..... 011 ..... 1010111 @r_vm +vsra_vv 101001 . ..... ..... 000 ..... 1010111 @r_vm +vsra_vx 101001 . ..... ..... 100 ..... 1010111 @r_vm +vsra_vi 101001 . ..... ..... 011 ..... 
1010111 @r_vm +vnsrl_vv 101100 . ..... ..... 000 ..... 1010111 @r_vm +vnsrl_vx 101100 . ..... ..... 100 ..... 1010111 @r_vm +vnsrl_vi 101100 . ..... ..... 011 ..... 1010111 @r_vm +vnsra_vv 101101 . ..... ..... 000 ..... 1010111 @r_vm +vnsra_vx 101101 . ..... ..... 100 ..... 1010111 @r_vm +vnsra_vi 101101 . ..... ..... 011 ..... 1010111 @r_vm + vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 111 ..... 1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index a1c1960..6af29d0 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -204,5 +204,30 @@ GEN_VECTOR_R_VM(vwsubu_wx) GEN_VECTOR_R_VM(vwsub_wv) GEN_VECTOR_R_VM(vwsub_wx) +GEN_VECTOR_R_VM(vand_vv) +GEN_VECTOR_R_VM(vand_vx) +GEN_VECTOR_R_VM(vand_vi) +GEN_VECTOR_R_VM(vor_vv) +GEN_VECTOR_R_VM(vor_vx) +GEN_VECTOR_R_VM(vor_vi) +GEN_VECTOR_R_VM(vxor_vv) +GEN_VECTOR_R_VM(vxor_vx) +GEN_VECTOR_R_VM(vxor_vi) +GEN_VECTOR_R_VM(vsll_vv) +GEN_VECTOR_R_VM(vsll_vx) +GEN_VECTOR_R_VM(vsll_vi) +GEN_VECTOR_R_VM(vsrl_vv) +GEN_VECTOR_R_VM(vsrl_vx) +GEN_VECTOR_R_VM(vsrl_vi) +GEN_VECTOR_R_VM(vsra_vv) +GEN_VECTOR_R_VM(vsra_vx) +GEN_VECTOR_R_VM(vsra_vi) +GEN_VECTOR_R_VM(vnsrl_vv) +GEN_VECTOR_R_VM(vnsrl_vx) +GEN_VECTOR_R_VM(vnsrl_vi) +GEN_VECTOR_R_VM(vnsra_vv) +GEN_VECTOR_R_VM(vnsra_vx) +GEN_VECTOR_R_VM(vnsra_vi) + GEN_VECTOR_R2_ZIMM(vsetvli) GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 95336c9..298a10a 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -268,6 +268,25 @@ static void vector_tail_widen(CPURISCVState *env, int vreg, int index, } } +static void vector_tail_narrow(CPURISCVState *env, int vreg, int index, + int width) +{ + switch (width) { + case 8: + env->vfp.vreg[vreg].u8[index] = 0; + break; + case 16: + env->vfp.vreg[vreg].u16[index] = 0; + break; + case 32: + env->vfp.vreg[vreg].u32[index] = 0; + break; + 
default: + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + return; + } +} + static inline int vector_get_carry(CPURISCVState *env, int width, int lmul, int index) { @@ -7131,3 +7150,1461 @@ void VECTOR_HELPER(vwsub_wx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, } env->vfp.vstart = 0; } + +void VECTOR_HELPER(vand_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j] + & env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j] + & env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j] + & env->vfp.vreg[src2].u32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j] + & env->vfp.vreg[src2].u64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + + env->vfp.vstart = 0; +} + +void 
VECTOR_HELPER(vand_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->gpr[rs1] + & env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->gpr[rs1] + & env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->gpr[rs1] + & env->vfp.vreg[src2].u32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = + (uint64_t)extend_gpr(env->gpr[rs1]) + & env->vfp.vreg[src2].u64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vand_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + 
vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = sign_extend(rs1, 5) + & env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = sign_extend(rs1, 5) + & env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = sign_extend(rs1, 5) + & env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = sign_extend(rs1, 5) + & env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vor_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, 
width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j] + | env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j] + | env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j] + | env->vfp.vreg[src2].u32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j] + | env->vfp.vreg[src2].u64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vor_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->gpr[rs1] + | env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->gpr[rs1] + | env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->gpr[rs1] + | env->vfp.vreg[src2].u32[j]; + } + break; + case 
64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = + (uint64_t)extend_gpr(env->gpr[rs1]) + | env->vfp.vreg[src2].u64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vor_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = sign_extend(rs1, 5) + | env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = sign_extend(rs1, 5) + | env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = sign_extend(rs1, 5) + | env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = sign_extend(rs1, 5) + | env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vxor_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, 
uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j] + ^ env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j] + ^ env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j] + ^ env->vfp.vreg[src2].u32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j] + ^ env->vfp.vreg[src2].u64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vxor_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, 
GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->gpr[rs1] + ^ env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->gpr[rs1] + ^ env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->gpr[rs1] + ^ env->vfp.vreg[src2].u32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = + (uint64_t)extend_gpr(env->gpr[rs1]) + ^ env->vfp.vreg[src2].u64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vxor_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = sign_extend(rs1, 
5) + ^ env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = sign_extend(rs1, 5) + ^ env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = sign_extend(rs1, 5) + ^ env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = sign_extend(rs1, 5) + ^ env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsll_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] + << (env->vfp.vreg[src1].u8[j] & 0x7); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + << (env->vfp.vreg[src1].u16[j] & 0xf); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = 
env->vfp.vreg[src2].u32[j] + << (env->vfp.vreg[src1].u32[j] & 0x1f); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + << (env->vfp.vreg[src1].u64[j] & 0x3f); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsll_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] + << (env->gpr[rs1] & 0x7); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + << (env->gpr[rs1] & 0xf); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j] + << (env->gpr[rs1] & 0x1f); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + << ((uint64_t)extend_gpr(env->gpr[rs1]) & 0x3f); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } 
+ env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsll_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] + << (rs1); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + << (rs1); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j] + << (rs1); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + << (rs1); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsrl_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + 
vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] + >> (env->vfp.vreg[src1].u8[j] & 0x7); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + >> (env->vfp.vreg[src1].u16[j] & 0xf); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j] + >> (env->vfp.vreg[src1].u32[j] & 0x1f); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + >> (env->vfp.vreg[src1].u64[j] & 0x3f); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vsrl_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + 
switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] + >> (env->gpr[rs1] & 0x7); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + >> (env->gpr[rs1] & 0xf); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j] + >> (env->gpr[rs1] & 0x1f); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + >> ((uint64_t)extend_gpr(env->gpr[rs1]) & 0x3f); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vsrl_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] + >> (rs1); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + >> (rs1); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = 
env->vfp.vreg[src2].u32[j] + >> (rs1); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + >> (rs1); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsra_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] + >> (env->vfp.vreg[src1].s8[j] & 0x7); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j] + >> (env->vfp.vreg[src1].s16[j] & 0xf); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j] + >> (env->vfp.vreg[src1].s32[j] & 0x1f); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j] + >> (env->vfp.vreg[src1].s64[j] & 0x3f); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + 
} else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsra_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] + >> (env->gpr[rs1] & 0x7); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j] + >> (env->gpr[rs1] & 0xf); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j] + >> (env->gpr[rs1] & 0x1f); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j] + >> ((uint64_t)extend_gpr(env->gpr[rs1]) & 0x3f); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vsra_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || 
vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] + >> (rs1); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j] + >> (rs1); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j] + >> (rs1); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j] + >> (rs1); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vnsrl_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || + vector_overlap_vm_common(lmul, vm, rd) || + vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / width)); + j = i % 
(VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u16[k] + >> (env->vfp.vreg[src1].u8[j] & 0xf); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u32[k] + >> (env->vfp.vreg[src1].u16[j] & 0x1f); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u64[k] + >> (env->vfp.vreg[src1].u32[j] & 0x3f); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_narrow(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vnsrl_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || + vector_overlap_vm_common(lmul, vm, rd) || + vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u16[k] + >> (env->gpr[rs1] & 0xf); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u32[k] + >> (env->gpr[rs1] & 0x1f); + } + break; 
+ case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u64[k] + >> (env->gpr[rs1] & 0x3f); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_narrow(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vnsrl_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || + vector_overlap_vm_common(lmul, vm, rd) || + vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u16[k] + >> (rs1); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u32[k] + >> (rs1); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u64[k] + >> (rs1); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_narrow(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vnsra_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + 
lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || + vector_overlap_vm_common(lmul, vm, rd) || + vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s16[k] + >> (env->vfp.vreg[src1].s8[j] & 0xf); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s32[k] + >> (env->vfp.vreg[src1].s16[j] & 0x1f); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s64[k] + >> (env->vfp.vreg[src1].s32[j] & 0x3f); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_narrow(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vnsra_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || + vector_overlap_vm_common(lmul, vm, rd) || + vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, true); + 
vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s16[k] + >> (env->gpr[rs1] & 0xf); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s32[k] + >> (env->gpr[rs1] & 0x1f); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s64[k] + >> (env->gpr[rs1] & 0x3f); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_narrow(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vnsra_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || + vector_overlap_vm_common(lmul, vm, rd) || + vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / (2 * width))); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s16[k] + >> (rs1); + } + break; + case 16: + if (vector_elem_mask(env, vm, 
width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s32[k] + >> (rs1); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s64[k] + >> (rs1); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_narrow(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + From patchwork Wed Sep 11 06:25:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140399 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 50A3D14DB for ; Wed, 11 Sep 2019 06:48:26 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 10CCD21928 for ; Wed, 11 Sep 2019 06:48:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 10CCD21928 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46968 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wQS-0004SM-BQ for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:48:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38514) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wDf-0007AC-Vj for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7wDW-0007pq-VE for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:11 -0400 Received: from 
smtp2200-217.mail.aliyun.com ([121.197.200.217]:54321) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDV-0007lg-3B; Wed, 11 Sep 2019 02:35:02 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.03883426|-1; CH=green; DM=CONTINUE|CONTINUE|true|0.186311-0.00318493-0.810504; FP=0|0|0|0|0|-1|-1|-1; HT=e01a16370; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.FSRIpfa_1568183696; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRIpfa_1568183696) by smtp.aliyun-inc.com(10.147.41.120); Wed, 11 Sep 2019 14:34:56 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:34 +0800 Message-Id: <1568183141-67641-11-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 121.197.200.217 Subject: [Qemu-devel] [PATCH v2 10/17] RISC-V: add vector extension integer instructions part3, cmp/min/max X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 29 + target/riscv/insn32.decode | 29 + target/riscv/insn_trans/trans_rvv.inc.c | 29 + target/riscv/vector_helper.c | 2280 +++++++++++++++++++++++++++++++ 4 files changed, 2367 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 28863e2..7354b12 100644 --- 
a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -200,5 +200,34 @@ DEF_HELPER_5(vector_vnsra_vv, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vnsra_vx, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vnsra_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vminu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vminu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmin_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmin_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmaxu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmaxu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmax_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmax_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmseq_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmseq_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmseq_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsne_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsne_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsne_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsltu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsltu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmslt_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmslt_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsleu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsleu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsleu_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsle_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsle_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsle_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsgtu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsgtu_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsgt_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmsgt_vi, void, env, i32, i32, i32, i32) + 
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 19710f5..1ff0b08 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -340,5 +340,34 @@ vnsra_vv 101101 . ..... ..... 000 ..... 1010111 @r_vm vnsra_vx 101101 . ..... ..... 100 ..... 1010111 @r_vm vnsra_vi 101101 . ..... ..... 011 ..... 1010111 @r_vm +vmseq_vv 011000 . ..... ..... 000 ..... 1010111 @r_vm +vmseq_vx 011000 . ..... ..... 100 ..... 1010111 @r_vm +vmseq_vi 011000 . ..... ..... 011 ..... 1010111 @r_vm +vmsne_vv 011001 . ..... ..... 000 ..... 1010111 @r_vm +vmsne_vx 011001 . ..... ..... 100 ..... 1010111 @r_vm +vmsne_vi 011001 . ..... ..... 011 ..... 1010111 @r_vm +vmsltu_vv 011010 . ..... ..... 000 ..... 1010111 @r_vm +vmsltu_vx 011010 . ..... ..... 100 ..... 1010111 @r_vm +vmslt_vv 011011 . ..... ..... 000 ..... 1010111 @r_vm +vmslt_vx 011011 . ..... ..... 100 ..... 1010111 @r_vm +vmsleu_vv 011100 . ..... ..... 000 ..... 1010111 @r_vm +vmsleu_vx 011100 . ..... ..... 100 ..... 1010111 @r_vm +vmsleu_vi 011100 . ..... ..... 011 ..... 1010111 @r_vm +vmsle_vv 011101 . ..... ..... 000 ..... 1010111 @r_vm +vmsle_vx 011101 . ..... ..... 100 ..... 1010111 @r_vm +vmsle_vi 011101 . ..... ..... 011 ..... 1010111 @r_vm +vmsgtu_vx 011110 . ..... ..... 100 ..... 1010111 @r_vm +vmsgtu_vi 011110 . ..... ..... 011 ..... 1010111 @r_vm +vmsgt_vx 011111 . ..... ..... 100 ..... 1010111 @r_vm +vmsgt_vi 011111 . ..... ..... 011 ..... 1010111 @r_vm +vminu_vv 000100 . ..... ..... 000 ..... 1010111 @r_vm +vminu_vx 000100 . ..... ..... 100 ..... 1010111 @r_vm +vmin_vv 000101 . ..... ..... 000 ..... 1010111 @r_vm +vmin_vx 000101 . ..... ..... 100 ..... 1010111 @r_vm +vmaxu_vv 000110 . ..... ..... 000 ..... 1010111 @r_vm +vmaxu_vx 000110 . ..... ..... 100 ..... 1010111 @r_vm +vmax_vv 000111 . ..... ..... 000 ..... 1010111 @r_vm +vmax_vx 000111 . ..... ..... 100 ..... 
1010111 @r_vm + vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 111 ..... 1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index 6af29d0..cd5ab07 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -229,5 +229,34 @@ GEN_VECTOR_R_VM(vnsra_vv) GEN_VECTOR_R_VM(vnsra_vx) GEN_VECTOR_R_VM(vnsra_vi) +GEN_VECTOR_R_VM(vmseq_vv) +GEN_VECTOR_R_VM(vmseq_vx) +GEN_VECTOR_R_VM(vmseq_vi) +GEN_VECTOR_R_VM(vmsne_vv) +GEN_VECTOR_R_VM(vmsne_vx) +GEN_VECTOR_R_VM(vmsne_vi) +GEN_VECTOR_R_VM(vmsltu_vv) +GEN_VECTOR_R_VM(vmsltu_vx) +GEN_VECTOR_R_VM(vmslt_vv) +GEN_VECTOR_R_VM(vmslt_vx) +GEN_VECTOR_R_VM(vmsleu_vv) +GEN_VECTOR_R_VM(vmsleu_vx) +GEN_VECTOR_R_VM(vmsleu_vi) +GEN_VECTOR_R_VM(vmsle_vv) +GEN_VECTOR_R_VM(vmsle_vx) +GEN_VECTOR_R_VM(vmsle_vi) +GEN_VECTOR_R_VM(vmsgtu_vx) +GEN_VECTOR_R_VM(vmsgtu_vi) +GEN_VECTOR_R_VM(vmsgt_vx) +GEN_VECTOR_R_VM(vmsgt_vi) +GEN_VECTOR_R_VM(vminu_vv) +GEN_VECTOR_R_VM(vminu_vx) +GEN_VECTOR_R_VM(vmin_vv) +GEN_VECTOR_R_VM(vmin_vx) +GEN_VECTOR_R_VM(vmaxu_vv) +GEN_VECTOR_R_VM(vmaxu_vx) +GEN_VECTOR_R_VM(vmax_vv) +GEN_VECTOR_R_VM(vmax_vx) + GEN_VECTOR_R2_ZIMM(vsetvli) GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 298a10a..fbf2145 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -8608,3 +8608,2283 @@ void VECTOR_HELPER(vnsra_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, env->vfp.vstart = 0; } +void VECTOR_HELPER(vmseq_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + 
vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u8[j] == + env->vfp.vreg[src2].u8[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u16[j] == + env->vfp.vreg[src2].u16[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u32[j] == + env->vfp.vreg[src2].u32[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u64[j] == + env->vfp.vreg[src2].u64[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmseq_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + 
riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint8_t)env->gpr[rs1] == env->vfp.vreg[src2].u8[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint16_t)env->gpr[rs1] == env->vfp.vreg[src2].u16[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint32_t)env->gpr[rs1] == env->vfp.vreg[src2].u32[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint64_t)extend_gpr(env->gpr[rs1]) == + env->vfp.vreg[src2].u64[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmseq_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + 
riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint8_t)sign_extend(rs1, 5) + == env->vfp.vreg[src2].u8[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint16_t)sign_extend(rs1, 5) + == env->vfp.vreg[src2].u16[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint32_t)sign_extend(rs1, 5) + == env->vfp.vreg[src2].u32[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint64_t)sign_extend(rs1, 5) == + env->vfp.vreg[src2].u64[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmsne_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, 
rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u8[j] != + env->vfp.vreg[src2].u8[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u16[j] != + env->vfp.vreg[src2].u16[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u32[j] != + env->vfp.vreg[src2].u32[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u64[j] != + env->vfp.vreg[src2].u64[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmsne_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = 
vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint8_t)env->gpr[rs1] != env->vfp.vreg[src2].u8[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint16_t)env->gpr[rs1] != env->vfp.vreg[src2].u16[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint32_t)env->gpr[rs1] != env->vfp.vreg[src2].u32[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint64_t)extend_gpr(env->gpr[rs1]) != + env->vfp.vreg[src2].u64[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmsne_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = 
vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint8_t)sign_extend(rs1, 5) + != env->vfp.vreg[src2].u8[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint16_t)sign_extend(rs1, 5) + != env->vfp.vreg[src2].u16[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint32_t)sign_extend(rs1, 5) + != env->vfp.vreg[src2].u32[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint64_t)sign_extend(rs1, 5) != + env->vfp.vreg[src2].u64[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmsltu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + 
vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u8[j] < + env->vfp.vreg[src1].u8[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u16[j] < + env->vfp.vreg[src1].u16[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u32[j] < + env->vfp.vreg[src1].u32[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u64[j] < + env->vfp.vreg[src1].u64[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmsltu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax; + + vl = 
env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u8[j] < (uint8_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u16[j] < (uint16_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u32[j] < (uint32_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u64[j] < + (uint64_t)extend_gpr(env->gpr[rs1])) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmslt_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, vlmax; + + vl = env->vfp.vl; + 
lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s8[j] < + env->vfp.vreg[src1].s8[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s16[j] < + env->vfp.vreg[src1].s16[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s32[j] < + env->vfp.vreg[src1].s32[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s64[j] < + env->vfp.vreg[src1].s64[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmslt_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, 
j, vl; + uint32_t lmul, width, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s8[j] < (int8_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s16[j] < (int16_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s32[j] < (int32_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s64[j] < + (int64_t)extend_gpr(env->gpr[rs1])) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmsleu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, 
width, src1, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u8[j] <= + env->vfp.vreg[src1].u8[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u16[j] <= + env->vfp.vreg[src1].u16[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u32[j] <= + env->vfp.vreg[src1].u32[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u64[j] <= + env->vfp.vreg[src1].u64[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmsleu_vx)(CPURISCVState *env, uint32_t vm, 
uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u8[j] <= (uint8_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u16[j] <= (uint16_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u32[j] <= (uint32_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u64[j] <= + (uint64_t)extend_gpr(env->gpr[rs1])) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmsleu_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + 
uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u8[j] <= (uint8_t)rs1) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u16[j] <= (uint16_t)rs1) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u32[j] <= (uint32_t)rs1) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u64[j] <= + (uint64_t)rs1) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmsle_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, 
src1, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s8[j] <= + env->vfp.vreg[src1].s8[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s16[j] <= + env->vfp.vreg[src1].s16[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s32[j] <= + env->vfp.vreg[src1].s32[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s64[j] <= + env->vfp.vreg[src1].s64[j]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmsle_vx)(CPURISCVState *env, uint32_t vm, uint32_t 
rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s8[j] <= (int8_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s16[j] <= (int16_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s32[j] <= (int32_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s64[j] <= + (int64_t)extend_gpr(env->gpr[rs1])) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmsle_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, 
uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s8[j] <= + (int8_t)sign_extend(rs1, 5)) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s16[j] <= + (int16_t)sign_extend(rs1, 5)) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s32[j] <= + (int32_t)sign_extend(rs1, 5)) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s64[j] <= + sign_extend(rs1, 5)) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmsgtu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t 
rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u8[j] > (uint8_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u16[j] > (uint16_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u32[j] > (uint32_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u64[j] > + (uint64_t)extend_gpr(env->gpr[rs1])) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmsgtu_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, 
vl; + uint32_t lmul, width, src2, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u8[j] > (uint8_t)rs1) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u16[j] > (uint16_t)rs1) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u32[j] > (uint32_t)rs1) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].u64[j] > + (uint64_t)rs1) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmsgt_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax; + + + vl = env->vfp.vl; + lmul = 
vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s8[j] > (int8_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s16[j] > (int16_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s32[j] > (int32_t)env->gpr[rs1]) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s64[j] > + (int64_t)extend_gpr(env->gpr[rs1])) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmsgt_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, vlmax; + + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + 
width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s8[j] > + (int8_t)sign_extend(rs1, 5)) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s16[j] > + (int16_t)sign_extend(rs1, 5)) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s32[j] > + (int32_t)sign_extend(rs1, 5)) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src2].s64[j] > + sign_extend(rs1, 5)) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + if (width <= 64) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vminu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + 
width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u8[j] <= + env->vfp.vreg[src2].u8[j]) { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src1].u8[j]; + } else { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src2].u8[j]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u16[j] <= + env->vfp.vreg[src2].u16[j]) { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src1].u16[j]; + } else { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src2].u16[j]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u32[j] <= + env->vfp.vreg[src2].u32[j]) { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src1].u32[j]; + } else { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src2].u32[j]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u64[j] <= + env->vfp.vreg[src2].u64[j]) { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src1].u64[j]; + } else { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src2].u64[j]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vminu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, 
uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint8_t)env->gpr[rs1] <= + env->vfp.vreg[src2].u8[j]) { + env->vfp.vreg[dest].u8[j] = + env->gpr[rs1]; + } else { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src2].u8[j]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint16_t)env->gpr[rs1] <= + env->vfp.vreg[src2].u16[j]) { + env->vfp.vreg[dest].u16[j] = + env->gpr[rs1]; + } else { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src2].u16[j]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint32_t)env->gpr[rs1] <= + env->vfp.vreg[src2].u32[j]) { + env->vfp.vreg[dest].u32[j] = + env->gpr[rs1]; + } else { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src2].u32[j]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint64_t)extend_gpr(env->gpr[rs1]) <= + env->vfp.vreg[src2].u64[j]) { + env->vfp.vreg[dest].u64[j] = + (uint64_t)extend_gpr(env->gpr[rs1]); + } else { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src2].u64[j]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmin_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + 
uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s8[j] <= + env->vfp.vreg[src2].s8[j]) { + env->vfp.vreg[dest].s8[j] = + env->vfp.vreg[src1].s8[j]; + } else { + env->vfp.vreg[dest].s8[j] = + env->vfp.vreg[src2].s8[j]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s16[j] <= + env->vfp.vreg[src2].s16[j]) { + env->vfp.vreg[dest].s16[j] = + env->vfp.vreg[src1].s16[j]; + } else { + env->vfp.vreg[dest].s16[j] = + env->vfp.vreg[src2].s16[j]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s32[j] <= + env->vfp.vreg[src2].s32[j]) { + env->vfp.vreg[dest].s32[j] = + env->vfp.vreg[src1].s32[j]; + } else { + env->vfp.vreg[dest].s32[j] = + env->vfp.vreg[src2].s32[j]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s64[j] <= + env->vfp.vreg[src2].s64[j]) { + env->vfp.vreg[dest].s64[j] = + env->vfp.vreg[src1].s64[j]; + } else { + env->vfp.vreg[dest].s64[j] = + env->vfp.vreg[src2].s64[j]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + 
vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmin_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int8_t)env->gpr[rs1] <= + env->vfp.vreg[src2].s8[j]) { + env->vfp.vreg[dest].s8[j] = + env->gpr[rs1]; + } else { + env->vfp.vreg[dest].s8[j] = + env->vfp.vreg[src2].s8[j]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int16_t)env->gpr[rs1] <= + env->vfp.vreg[src2].s16[j]) { + env->vfp.vreg[dest].s16[j] = + env->gpr[rs1]; + } else { + env->vfp.vreg[dest].s16[j] = + env->vfp.vreg[src2].s16[j]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int32_t)env->gpr[rs1] <= + env->vfp.vreg[src2].s32[j]) { + env->vfp.vreg[dest].s32[j] = + env->gpr[rs1]; + } else { + env->vfp.vreg[dest].s32[j] = + env->vfp.vreg[src2].s32[j]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int64_t)extend_gpr(env->gpr[rs1]) <= + env->vfp.vreg[src2].s64[j]) { + env->vfp.vreg[dest].s64[j] = + (int64_t)extend_gpr(env->gpr[rs1]); + } else { + env->vfp.vreg[dest].s64[j] = + env->vfp.vreg[src2].s64[j]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } 
else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmaxu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u8[j] >= + env->vfp.vreg[src2].u8[j]) { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src1].u8[j]; + } else { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src2].u8[j]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u16[j] >= + env->vfp.vreg[src2].u16[j]) { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src1].u16[j]; + } else { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src2].u16[j]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u32[j] >= + env->vfp.vreg[src2].u32[j]) { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src1].u32[j]; + } else { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src2].u32[j]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u64[j] >= + env->vfp.vreg[src2].u64[j]) { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src1].u64[j]; + } else { + env->vfp.vreg[dest].u64[j] = + 
env->vfp.vreg[src2].u64[j]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmaxu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint8_t)env->gpr[rs1] >= + env->vfp.vreg[src2].u8[j]) { + env->vfp.vreg[dest].u8[j] = + env->gpr[rs1]; + } else { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src2].u8[j]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint16_t)env->gpr[rs1] >= + env->vfp.vreg[src2].u16[j]) { + env->vfp.vreg[dest].u16[j] = + env->gpr[rs1]; + } else { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src2].u16[j]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint32_t)env->gpr[rs1] >= + env->vfp.vreg[src2].u32[j]) { + env->vfp.vreg[dest].u32[j] = + env->gpr[rs1]; + } else { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src2].u32[j]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint64_t)extend_gpr(env->gpr[rs1]) >= + env->vfp.vreg[src2].u64[j]) { + env->vfp.vreg[dest].u64[j] = + (uint64_t)extend_gpr(env->gpr[rs1]); + } else { + 
env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src2].u64[j]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmax_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s8[j] >= + env->vfp.vreg[src2].s8[j]) { + env->vfp.vreg[dest].s8[j] = + env->vfp.vreg[src1].s8[j]; + } else { + env->vfp.vreg[dest].s8[j] = + env->vfp.vreg[src2].s8[j]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s16[j] >= + env->vfp.vreg[src2].s16[j]) { + env->vfp.vreg[dest].s16[j] = + env->vfp.vreg[src1].s16[j]; + } else { + env->vfp.vreg[dest].s16[j] = + env->vfp.vreg[src2].s16[j]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s32[j] >= + env->vfp.vreg[src2].s32[j]) { + env->vfp.vreg[dest].s32[j] = + env->vfp.vreg[src1].s32[j]; + } else { + env->vfp.vreg[dest].s32[j] = + env->vfp.vreg[src2].s32[j]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if 
(env->vfp.vreg[src1].s64[j] >= + env->vfp.vreg[src2].s64[j]) { + env->vfp.vreg[dest].s64[j] = + env->vfp.vreg[src1].s64[j]; + } else { + env->vfp.vreg[dest].s64[j] = + env->vfp.vreg[src2].s64[j]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmax_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int8_t)env->gpr[rs1] >= + env->vfp.vreg[src2].s8[j]) { + env->vfp.vreg[dest].s8[j] = + env->gpr[rs1]; + } else { + env->vfp.vreg[dest].s8[j] = + env->vfp.vreg[src2].s8[j]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int16_t)env->gpr[rs1] >= + env->vfp.vreg[src2].s16[j]) { + env->vfp.vreg[dest].s16[j] = + env->gpr[rs1]; + } else { + env->vfp.vreg[dest].s16[j] = + env->vfp.vreg[src2].s16[j]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int32_t)env->gpr[rs1] >= + env->vfp.vreg[src2].s32[j]) { + env->vfp.vreg[dest].s32[j] = + env->gpr[rs1]; + } else { + env->vfp.vreg[dest].s32[j] = + env->vfp.vreg[src2].s32[j]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if 
((int64_t)extend_gpr(env->gpr[rs1]) >= + env->vfp.vreg[src2].s64[j]) { + env->vfp.vreg[dest].s64[j] = + (int64_t)extend_gpr(env->gpr[rs1]); + } else { + env->vfp.vreg[dest].s64[j] = + env->vfp.vreg[src2].s64[j]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +
From patchwork Wed Sep 11 06:25:35 2019 X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140405 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:35 +0800 Message-Id: <1568183141-67641-12-git-send-email-zhiwei_liu@c-sky.com> In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> Subject: [Qemu-devel] [PATCH v2 11/17] RISC-V: add vector extension integer instructions part4, mul/div/merge Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 41 + target/riscv/insn32.decode | 41 + target/riscv/insn_trans/trans_rvv.inc.c | 41 + target/riscv/vector_helper.c | 2838 +++++++++++++++++++++++++++++++ 4 files changed, 2961 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 7354b12..ab31ef7 100644 ---
a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -229,5 +229,46 @@ DEF_HELPER_5(vector_vmsgtu_vi, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vmsgt_vx, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vmsgt_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmul_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmul_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmulhsu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmulhsu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmulh_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmulh_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vdivu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vdivu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vdiv_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vdiv_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vremu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vremu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vrem_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vrem_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmulhu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmulhu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmadd_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmadd_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnmsub_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnmsub_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmacc_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmacc_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnmsac_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnmsac_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwmulu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwmulu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwmulsu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwmulsu_vx, void, env, i32, i32, i32, 
i32) +DEF_HELPER_5(vector_vwmul_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwmul_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwmaccu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwmaccu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwmacc_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwmacc_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwmaccsu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwmaccsu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwmaccus_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmerge_vvm, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmerge_vxm, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmerge_vim, void, env, i32, i32, i32, i32) + DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 1ff0b08..6db18c5 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -369,5 +369,46 @@ vmaxu_vx 000110 . ..... ..... 100 ..... 1010111 @r_vm vmax_vv 000111 . ..... ..... 000 ..... 1010111 @r_vm vmax_vx 000111 . ..... ..... 100 ..... 1010111 @r_vm +vmul_vv 100101 . ..... ..... 010 ..... 1010111 @r_vm +vmul_vx 100101 . ..... ..... 110 ..... 1010111 @r_vm +vmulhsu_vv 100110 . ..... ..... 010 ..... 1010111 @r_vm +vmulhsu_vx 100110 . ..... ..... 110 ..... 1010111 @r_vm +vmulh_vv 100111 . ..... ..... 010 ..... 1010111 @r_vm +vmulh_vx 100111 . ..... ..... 110 ..... 1010111 @r_vm +vmulhu_vv 100100 . ..... ..... 010 ..... 1010111 @r_vm +vmulhu_vx 100100 . ..... ..... 110 ..... 1010111 @r_vm +vdivu_vv 100000 . ..... ..... 010 ..... 1010111 @r_vm +vdivu_vx 100000 . ..... ..... 110 ..... 1010111 @r_vm +vdiv_vv 100001 . ..... ..... 010 ..... 1010111 @r_vm +vdiv_vx 100001 . ..... ..... 110 ..... 1010111 @r_vm +vremu_vv 100010 . ..... ..... 010 ..... 1010111 @r_vm +vremu_vx 100010 . ..... ..... 110 ..... 
1010111 @r_vm +vrem_vv 100011 . ..... ..... 010 ..... 1010111 @r_vm +vrem_vx 100011 . ..... ..... 110 ..... 1010111 @r_vm +vwmulu_vv 111000 . ..... ..... 010 ..... 1010111 @r_vm +vwmulu_vx 111000 . ..... ..... 110 ..... 1010111 @r_vm +vwmulsu_vv 111010 . ..... ..... 010 ..... 1010111 @r_vm +vwmulsu_vx 111010 . ..... ..... 110 ..... 1010111 @r_vm +vwmul_vv 111011 . ..... ..... 010 ..... 1010111 @r_vm +vwmul_vx 111011 . ..... ..... 110 ..... 1010111 @r_vm +vmacc_vv 101101 . ..... ..... 010 ..... 1010111 @r_vm +vmacc_vx 101101 . ..... ..... 110 ..... 1010111 @r_vm +vnmsac_vv 101111 . ..... ..... 010 ..... 1010111 @r_vm +vnmsac_vx 101111 . ..... ..... 110 ..... 1010111 @r_vm +vmadd_vv 101001 . ..... ..... 010 ..... 1010111 @r_vm +vmadd_vx 101001 . ..... ..... 110 ..... 1010111 @r_vm +vnmsub_vv 101011 . ..... ..... 010 ..... 1010111 @r_vm +vnmsub_vx 101011 . ..... ..... 110 ..... 1010111 @r_vm +vwmaccu_vv 111100 . ..... ..... 010 ..... 1010111 @r_vm +vwmaccu_vx 111100 . ..... ..... 110 ..... 1010111 @r_vm +vwmacc_vv 111101 . ..... ..... 010 ..... 1010111 @r_vm +vwmacc_vx 111101 . ..... ..... 110 ..... 1010111 @r_vm +vwmaccsu_vv 111110 . ..... ..... 010 ..... 1010111 @r_vm +vwmaccsu_vx 111110 . ..... ..... 110 ..... 1010111 @r_vm +vwmaccus_vx 111111 . ..... ..... 110 ..... 1010111 @r_vm +vmerge_vvm 010111 . ..... ..... 000 ..... 1010111 @r_vm +vmerge_vxm 010111 . ..... ..... 100 ..... 1010111 @r_vm +vmerge_vim 010111 . ..... ..... 011 ..... 1010111 @r_vm + vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 111 ..... 
1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index cd5ab07..1ba52e7 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -258,5 +258,46 @@ GEN_VECTOR_R_VM(vmaxu_vx) GEN_VECTOR_R_VM(vmax_vv) GEN_VECTOR_R_VM(vmax_vx) +GEN_VECTOR_R_VM(vmulhu_vv) +GEN_VECTOR_R_VM(vmulhu_vx) +GEN_VECTOR_R_VM(vmul_vv) +GEN_VECTOR_R_VM(vmul_vx) +GEN_VECTOR_R_VM(vmulhsu_vv) +GEN_VECTOR_R_VM(vmulhsu_vx) +GEN_VECTOR_R_VM(vmulh_vv) +GEN_VECTOR_R_VM(vmulh_vx) +GEN_VECTOR_R_VM(vdivu_vv) +GEN_VECTOR_R_VM(vdivu_vx) +GEN_VECTOR_R_VM(vdiv_vv) +GEN_VECTOR_R_VM(vdiv_vx) +GEN_VECTOR_R_VM(vremu_vv) +GEN_VECTOR_R_VM(vremu_vx) +GEN_VECTOR_R_VM(vrem_vv) +GEN_VECTOR_R_VM(vrem_vx) +GEN_VECTOR_R_VM(vmacc_vv) +GEN_VECTOR_R_VM(vmacc_vx) +GEN_VECTOR_R_VM(vnmsac_vv) +GEN_VECTOR_R_VM(vnmsac_vx) +GEN_VECTOR_R_VM(vmadd_vv) +GEN_VECTOR_R_VM(vmadd_vx) +GEN_VECTOR_R_VM(vnmsub_vv) +GEN_VECTOR_R_VM(vnmsub_vx) +GEN_VECTOR_R_VM(vwmulu_vv) +GEN_VECTOR_R_VM(vwmulu_vx) +GEN_VECTOR_R_VM(vwmulsu_vv) +GEN_VECTOR_R_VM(vwmulsu_vx) +GEN_VECTOR_R_VM(vwmul_vv) +GEN_VECTOR_R_VM(vwmul_vx) +GEN_VECTOR_R_VM(vwmaccu_vv) +GEN_VECTOR_R_VM(vwmaccu_vx) +GEN_VECTOR_R_VM(vwmacc_vv) +GEN_VECTOR_R_VM(vwmacc_vx) +GEN_VECTOR_R_VM(vwmaccsu_vv) +GEN_VECTOR_R_VM(vwmaccsu_vx) +GEN_VECTOR_R_VM(vwmaccus_vx) +GEN_VECTOR_R_VM(vmerge_vvm) +GEN_VECTOR_R_VM(vmerge_vxm) +GEN_VECTOR_R_VM(vmerge_vim) + GEN_VECTOR_R2_ZIMM(vsetvli) GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index fbf2145..49f1cb8 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -10888,3 +10888,2841 @@ void VECTOR_HELPER(vmax_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, env->vfp.vstart = 0; } +void VECTOR_HELPER(vmulhu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = 
vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = + ((uint16_t)env->vfp.vreg[src1].u8[j] + * (uint16_t)env->vfp.vreg[src2].u8[j]) >> width; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = + ((uint32_t)env->vfp.vreg[src1].u16[j] + * (uint32_t)env->vfp.vreg[src2].u16[j]) >> width; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = + ((uint64_t)env->vfp.vreg[src1].u32[j] + * (uint64_t)env->vfp.vreg[src2].u32[j]) >> width; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = u64xu64_lh( + env->vfp.vreg[src1].u64[j], env->vfp.vreg[src2].u64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmulhu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, 
GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = + ((uint16_t)(uint8_t)env->gpr[rs1] + * (uint16_t)env->vfp.vreg[src2].u8[j]) >> width; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = + ((uint32_t)(uint16_t)env->gpr[rs1] + * (uint32_t)env->vfp.vreg[src2].u16[j]) >> width; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = + ((uint64_t)(uint32_t)env->gpr[rs1] + * (uint64_t)env->vfp.vreg[src2].u32[j]) >> width; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = u64xu64_lh( + (uint64_t)extend_gpr(env->gpr[rs1]) + , env->vfp.vreg[src2].u64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmul_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / 
(VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src1].s8[j] + * env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src1].s16[j] + * env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src1].s32[j] + * env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src1].s64[j] + * env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmul_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = env->gpr[rs1] + * env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = env->gpr[rs1] + * env->vfp.vreg[src2].s16[j]; + } 
+ break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = env->gpr[rs1] + * env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = + (int64_t)extend_gpr(env->gpr[rs1]) + * env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmulhsu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = + ((uint16_t)env->vfp.vreg[src1].u8[j] + * (int16_t)env->vfp.vreg[src2].s8[j]) >> width; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = + ((uint32_t)env->vfp.vreg[src1].u16[j] + * (int32_t)env->vfp.vreg[src2].s16[j]) >> width; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = + ((uint64_t)env->vfp.vreg[src1].u32[j] + * (int64_t)env->vfp.vreg[src2].s32[j]) >> width; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, 
lmul, i)) { + env->vfp.vreg[dest].s64[j] = s64xu64_lh( + env->vfp.vreg[src2].s64[j], env->vfp.vreg[src1].u64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmulhsu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = + ((uint16_t)(uint8_t)env->gpr[rs1] + * (int16_t)env->vfp.vreg[src2].s8[j]) >> width; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = + ((uint32_t)(uint16_t)env->gpr[rs1] + * (int32_t)env->vfp.vreg[src2].s16[j]) >> width; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = + ((uint64_t)(uint32_t)env->gpr[rs1] + * (int64_t)env->vfp.vreg[src2].s32[j]) >> width; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = s64xu64_lh( + env->vfp.vreg[src2].s64[j], + (uint64_t)extend_gpr(env->gpr[rs1])); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + 
+void VECTOR_HELPER(vmulh_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = + ((int16_t)env->vfp.vreg[src1].s8[j] + * (int16_t)env->vfp.vreg[src2].s8[j]) >> width; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = + ((int32_t)env->vfp.vreg[src1].s16[j] + * (int32_t)env->vfp.vreg[src2].s16[j]) >> width; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = + ((int64_t)env->vfp.vreg[src1].s32[j] + * (int64_t)env->vfp.vreg[src2].s32[j]) >> width; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = s64xs64_lh( + env->vfp.vreg[src1].s64[j], env->vfp.vreg[src2].s64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmulh_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); 
+ width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = + ((int16_t)(int8_t)env->gpr[rs1] + * (int16_t)env->vfp.vreg[src2].s8[j]) >> width; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = + ((int32_t)(int16_t)env->gpr[rs1] + * (int32_t)env->vfp.vreg[src2].s16[j]) >> width; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = + ((int64_t)(int32_t)env->gpr[rs1] + * (int64_t)env->vfp.vreg[src2].s32[j]) >> width; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = s64xs64_lh( + (int64_t)extend_gpr(env->gpr[rs1]) + , env->vfp.vreg[src2].s64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vdivu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, 
rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u8[j] == 0) { + env->vfp.vreg[dest].u8[j] = UINT8_MAX; + } else { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] / + env->vfp.vreg[src1].u8[j]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u16[j] == 0) { + env->vfp.vreg[dest].u16[j] = UINT16_MAX; + } else { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + / env->vfp.vreg[src1].u16[j]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u32[j] == 0) { + env->vfp.vreg[dest].u32[j] = UINT32_MAX; + } else { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j] + / env->vfp.vreg[src1].u32[j]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u64[j] == 0) { + env->vfp.vreg[dest].u64[j] = UINT64_MAX; + } else { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + / env->vfp.vreg[src1].u64[j]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vdivu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + 
vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint8_t)env->gpr[rs1] == 0) { + env->vfp.vreg[dest].u8[j] = UINT8_MAX; + } else { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] / + (uint8_t)env->gpr[rs1]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint16_t)env->gpr[rs1] == 0) { + env->vfp.vreg[dest].u16[j] = UINT16_MAX; + } else { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + / (uint16_t)env->gpr[rs1]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint32_t)env->gpr[rs1] == 0) { + env->vfp.vreg[dest].u32[j] = UINT32_MAX; + } else { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j] + / (uint32_t)env->gpr[rs1]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint64_t)extend_gpr(env->gpr[rs1]) == 0) { + env->vfp.vreg[dest].u64[j] = UINT64_MAX; + } else { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + / (uint64_t)extend_gpr(env->gpr[rs1]); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vdiv_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + 
vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s8[j] == 0) { + env->vfp.vreg[dest].s8[j] = -1; + } else if ((env->vfp.vreg[src2].s8[j] == INT8_MIN) && + (env->vfp.vreg[src1].s8[j] == (int8_t)(-1))) { + env->vfp.vreg[dest].s8[j] = INT8_MIN; + } else { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] / + env->vfp.vreg[src1].s8[j]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s16[j] == 0) { + env->vfp.vreg[dest].s16[j] = -1; + } else if ((env->vfp.vreg[src2].s16[j] == INT16_MIN) && + (env->vfp.vreg[src1].s16[j] == (int16_t)(-1))) { + env->vfp.vreg[dest].s16[j] = INT16_MIN; + } else { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j] + / env->vfp.vreg[src1].s16[j]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s32[j] == 0) { + env->vfp.vreg[dest].s32[j] = -1; + } else if ((env->vfp.vreg[src2].s32[j] == INT32_MIN) && + (env->vfp.vreg[src1].s32[j] == (int32_t)(-1))) { + env->vfp.vreg[dest].s32[j] = INT32_MIN; + } else { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j] + / env->vfp.vreg[src1].s32[j]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s64[j] == 0) { + env->vfp.vreg[dest].s64[j] = -1; + } else if ((env->vfp.vreg[src2].s64[j] == INT64_MIN) && + (env->vfp.vreg[src1].s64[j] == (int64_t)(-1))) { + env->vfp.vreg[dest].s64[j] = INT64_MIN; + } else { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j] + / env->vfp.vreg[src1].s64[j]; + } + 
} + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vdiv_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int8_t)env->gpr[rs1] == 0) { + env->vfp.vreg[dest].s8[j] = -1; + } else if ((env->vfp.vreg[src2].s8[j] == INT8_MIN) && + ((int8_t)env->gpr[rs1] == (int8_t)(-1))) { + env->vfp.vreg[dest].s8[j] = INT8_MIN; + } else { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] / + (int8_t)env->gpr[rs1]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int16_t)env->gpr[rs1] == 0) { + env->vfp.vreg[dest].s16[j] = -1; + } else if ((env->vfp.vreg[src2].s16[j] == INT16_MIN) && + ((int16_t)env->gpr[rs1] == (int16_t)(-1))) { + env->vfp.vreg[dest].s16[j] = INT16_MIN; + } else { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j] + / (int16_t)env->gpr[rs1]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int32_t)env->gpr[rs1] == 0) { + env->vfp.vreg[dest].s32[j] = -1; + } else if ((env->vfp.vreg[src2].s32[j] == INT32_MIN) && + ((int32_t)env->gpr[rs1] == (int32_t)(-1))) { + env->vfp.vreg[dest].s32[j] = INT32_MIN; + } 
else { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j] + / (int32_t)env->gpr[rs1]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int64_t)extend_gpr(env->gpr[rs1]) == 0) { + env->vfp.vreg[dest].s64[j] = -1; + } else if ((env->vfp.vreg[src2].s64[j] == INT64_MIN) && + ((int64_t)extend_gpr(env->gpr[rs1]) == (int64_t)(-1))) { + env->vfp.vreg[dest].s64[j] = INT64_MIN; + } else { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j] + / (int64_t)extend_gpr(env->gpr[rs1]); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vremu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u8[j] == 0) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j]; + } else { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] % + env->vfp.vreg[src1].u8[j]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u16[j] == 0) { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]; + } 
else { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + % env->vfp.vreg[src1].u16[j]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u32[j] == 0) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]; + } else { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j] + % env->vfp.vreg[src1].u32[j]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].u64[j] == 0) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]; + } else { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + % env->vfp.vreg[src1].u64[j]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vremu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint8_t)env->gpr[rs1] == 0) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j]; + } else { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] % + (uint8_t)env->gpr[rs1]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint16_t)env->gpr[rs1] == 0) { + env->vfp.vreg[dest].u16[j] 
= env->vfp.vreg[src2].u16[j]; + } else { + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j] + % (uint16_t)env->gpr[rs1]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint32_t)env->gpr[rs1] == 0) { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]; + } else { + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j] + % (uint32_t)env->gpr[rs1]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((uint64_t)extend_gpr(env->gpr[rs1]) == 0) { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]; + } else { + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j] + % (uint64_t)extend_gpr(env->gpr[rs1]); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vrem_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s8[j] == 0) { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j]; + } else if ((env->vfp.vreg[src2].s8[j] == INT8_MIN) && + (env->vfp.vreg[src1].s8[j] == (int8_t)(-1))) { 
+ env->vfp.vreg[dest].s8[j] = 0; + } else { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] % + env->vfp.vreg[src1].s8[j]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s16[j] == 0) { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j]; + } else if ((env->vfp.vreg[src2].s16[j] == INT16_MIN) && + (env->vfp.vreg[src1].s16[j] == (int16_t)(-1))) { + env->vfp.vreg[dest].s16[j] = 0; + } else { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j] + % env->vfp.vreg[src1].s16[j]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s32[j] == 0) { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j]; + } else if ((env->vfp.vreg[src2].s32[j] == INT32_MIN) && + (env->vfp.vreg[src1].s32[j] == (int32_t)(-1))) { + env->vfp.vreg[dest].s32[j] = 0; + } else { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j] + % env->vfp.vreg[src1].s32[j]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (env->vfp.vreg[src1].s64[j] == 0) { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j]; + } else if ((env->vfp.vreg[src2].s64[j] == INT64_MIN) && + (env->vfp.vreg[src1].s64[j] == (int64_t)(-1))) { + env->vfp.vreg[dest].s64[j] = 0; + } else { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j] + % env->vfp.vreg[src1].s64[j]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vrem_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, 
RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int8_t)env->gpr[rs1] == 0) { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j]; + } else if ((env->vfp.vreg[src2].s8[j] == INT8_MIN) && + ((int8_t)env->gpr[rs1] == (int8_t)(-1))) { + env->vfp.vreg[dest].s8[j] = 0; + } else { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] % + (int8_t)env->gpr[rs1]; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int16_t)env->gpr[rs1] == 0) { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j]; + } else if ((env->vfp.vreg[src2].s16[j] == INT16_MIN) && + ((int16_t)env->gpr[rs1] == (int16_t)(-1))) { + env->vfp.vreg[dest].s16[j] = 0; + } else { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j] + % (int16_t)env->gpr[rs1]; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int32_t)env->gpr[rs1] == 0) { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j]; + } else if ((env->vfp.vreg[src2].s32[j] == INT32_MIN) && + ((int32_t)env->gpr[rs1] == (int32_t)(-1))) { + env->vfp.vreg[dest].s32[j] = 0; + } else { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j] + % (int32_t)env->gpr[rs1]; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if ((int64_t)extend_gpr(env->gpr[rs1]) == 0) { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j]; + } else if ((env->vfp.vreg[src2].s64[j] == INT64_MIN) && + ((int64_t)extend_gpr(env->gpr[rs1]) == (int64_t)(-1))) { + env->vfp.vreg[dest].s64[j] = 0; + } else { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j] + % 
(int64_t)extend_gpr(env->gpr[rs1]); + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmacc_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] += env->vfp.vreg[src1].s8[j] + * env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] += env->vfp.vreg[src1].s16[j] + * env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] += env->vfp.vreg[src1].s32[j] + * env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] += env->vfp.vreg[src1].s64[j] + * env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmacc_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t 
rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] += env->gpr[rs1] + * env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] += env->gpr[rs1] + * env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] += env->gpr[rs1] + * env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] += + (int64_t)extend_gpr(env->gpr[rs1]) + * env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vnmsac_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, 
lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] -= env->vfp.vreg[src1].s8[j] + * env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] -= env->vfp.vreg[src1].s16[j] + * env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] -= env->vfp.vreg[src1].s32[j] + * env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] -= env->vfp.vreg[src1].s64[j] + * env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vnmsac_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] -= 
env->gpr[rs1] + * env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] -= env->gpr[rs1] + * env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] -= env->gpr[rs1] + * env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] -= + (int64_t)extend_gpr(env->gpr[rs1]) + * env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src1].s8[j] + * env->vfp.vreg[dest].s8[j] + + env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src1].s16[j] + * env->vfp.vreg[dest].s16[j] + + env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, 
lmul, i)) { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src1].s32[j] + * env->vfp.vreg[dest].s32[j] + + env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src1].s64[j] + * env->vfp.vreg[dest].s64[j] + + env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmadd_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = env->gpr[rs1] + * env->vfp.vreg[dest].s8[j] + + env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = env->gpr[rs1] + * env->vfp.vreg[dest].s16[j] + + env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = env->gpr[rs1] + * env->vfp.vreg[dest].s32[j] + + env->vfp.vreg[src2].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = + (int64_t)extend_gpr(env->gpr[rs1]) + * env->vfp.vreg[dest].s64[j] + + 
env->vfp.vreg[src2].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vnmsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] + - env->vfp.vreg[src1].s8[j] + * env->vfp.vreg[dest].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j] + - env->vfp.vreg[src1].s16[j] + * env->vfp.vreg[dest].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j] + - env->vfp.vreg[src1].s32[j] + * env->vfp.vreg[dest].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j] + - env->vfp.vreg[src1].s64[j] + * env->vfp.vreg[dest].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + 
env->vfp.vstart = 0; +} +void VECTOR_HELPER(vnmsub_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] + - env->gpr[rs1] + * env->vfp.vreg[dest].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j] + - env->gpr[rs1] + * env->vfp.vreg[dest].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j] + - env->gpr[rs1] + * env->vfp.vreg[dest].s32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j] + - (int64_t)extend_gpr(env->gpr[rs1]) + * env->vfp.vreg[dest].s64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwmulu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = 
vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = + (uint16_t)env->vfp.vreg[src1].u8[j] * + (uint16_t)env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = + (uint32_t)env->vfp.vreg[src1].u16[j] * + (uint32_t)env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = + (uint64_t)env->vfp.vreg[src1].u32[j] * + (uint64_t)env->vfp.vreg[src2].u32[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vwmulu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + 
vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = + (uint16_t)env->vfp.vreg[src2].u8[j] * + (uint16_t)((uint8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = + (uint32_t)env->vfp.vreg[src2].u16[j] * + (uint32_t)((uint16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = + (uint64_t)env->vfp.vreg[src2].u32[j] * + (uint64_t)((uint32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwmulsu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < 
vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = + (int16_t)env->vfp.vreg[src2].s8[j] * + (uint16_t)env->vfp.vreg[src1].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = + (int32_t)env->vfp.vreg[src2].s16[j] * + (uint32_t)env->vfp.vreg[src1].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = + (int64_t)env->vfp.vreg[src2].s32[j] * + (uint64_t)env->vfp.vreg[src1].u32[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vwmulsu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = + (int16_t)((int8_t)env->vfp.vreg[src2].s8[j]) * + (uint16_t)((uint8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = + (int32_t)((int16_t)env->vfp.vreg[src2].s16[j]) * + (uint32_t)((uint16_t)env->gpr[rs1]); + } + break; + case 
32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = + (int64_t)((int32_t)env->vfp.vreg[src2].s32[j]) * + (uint64_t)((uint32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwmul_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = + (int16_t)env->vfp.vreg[src1].s8[j] * + (int16_t)env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = + (int32_t)env->vfp.vreg[src1].s16[j] * + (int32_t)env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = + (int64_t)env->vfp.vreg[src1].s32[j] * + (int64_t)env->vfp.vreg[src2].s32[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } 
+ } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vwmul_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = + (int16_t)((int8_t)env->vfp.vreg[src2].s8[j]) * + (int16_t)((int8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = + (int32_t)((int16_t)env->vfp.vreg[src2].s16[j]) * + (int32_t)((int16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = + (int64_t)((int32_t)env->vfp.vreg[src2].s32[j]) * + (int64_t)((int32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwmaccu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if 
(vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] += + (uint16_t)env->vfp.vreg[src1].u8[j] * + (uint16_t)env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] += + (uint32_t)env->vfp.vreg[src1].u16[j] * + (uint32_t)env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] += + (uint64_t)env->vfp.vreg[src1].u32[j] * + (uint64_t)env->vfp.vreg[src2].u32[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vwmaccu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, 
lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] += + (uint16_t)env->vfp.vreg[src2].u8[j] * + (uint16_t)((uint8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] += + (uint32_t)env->vfp.vreg[src2].u16[j] * + (uint32_t)((uint16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] += + (uint64_t)env->vfp.vreg[src2].u32[j] * + (uint64_t)((uint32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwmaccsu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) 
{ + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] += + (int16_t)env->vfp.vreg[src1].s8[j] + * (uint16_t)env->vfp.vreg[src2].u8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] += + (int32_t)env->vfp.vreg[src1].s16[j] * + (uint32_t)env->vfp.vreg[src2].u16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] += + (int64_t)env->vfp.vreg[src1].s32[j] * + (uint64_t)env->vfp.vreg[src2].u32[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vwmaccsu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] += + (uint16_t)((uint8_t)env->vfp.vreg[src2].u8[j]) * + (int16_t)((int8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] += + (uint32_t)((uint16_t)env->vfp.vreg[src2].u16[j]) * + (int32_t)((int16_t)env->gpr[rs1]); + } + break; + case 32: + if 
(vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] += + (uint64_t)((uint32_t)env->vfp.vreg[src2].u32[j]) * + (int64_t)((int32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vwmaccus_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] += + (int16_t)((int8_t)env->vfp.vreg[src2].s8[j]) * + (uint16_t)((uint8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] += + (int32_t)((int16_t)env->vfp.vreg[src2].s16[j]) * + (uint32_t)((uint16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] += + (int64_t)((int32_t)env->vfp.vreg[src2].s32[j]) * + (uint64_t)((uint32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void 
VECTOR_HELPER(vwmacc_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] += + (int16_t)env->vfp.vreg[src1].s8[j] + * (int16_t)env->vfp.vreg[src2].s8[j]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] += + (int32_t)env->vfp.vreg[src1].s16[j] * + (int32_t)env->vfp.vreg[src2].s16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] += + (int64_t)env->vfp.vreg[src1].s32[j] * + (int64_t)env->vfp.vreg[src2].s32[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vwmacc_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, k, vl; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = 
vector_get_vlmax(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / (2 * width))); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] += + (int16_t)((int8_t)env->vfp.vreg[src2].s8[j]) * + (int16_t)((int8_t)env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] += + (int32_t)((int16_t)env->vfp.vreg[src2].s16[j]) * + (int32_t)((int16_t)env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] += + (int64_t)((int32_t)env->vfp.vreg[src2].s32[j]) * + (int64_t)((int32_t)env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; +} + +void VECTOR_HELPER(vmerge_vvm)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl, idx, pos; + uint32_t lmul, width, src1, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src1 = rs1 + (i / (VLEN / 
width)); + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vm == 0) { + vector_get_layout(env, width, lmul, i, &idx, &pos); + if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src2].u8[j]; + } else { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src1].u8[j]; + } + } else { + if (rs2 != 0) { + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j]; + } + break; + case 16: + if (vm == 0) { + vector_get_layout(env, width, lmul, i, &idx, &pos); + if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src2].u16[j]; + } else { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src1].u16[j]; + } + } else { + if (rs2 != 0) { + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j]; + } + break; + case 32: + if (vm == 0) { + vector_get_layout(env, width, lmul, i, &idx, &pos); + if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src2].u32[j]; + } else { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src1].u32[j]; + } + } else { + if (rs2 != 0) { + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j]; + } + break; + case 64: + if (vm == 0) { + vector_get_layout(env, width, lmul, i, &idx, &pos); + if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src2].u64[j]; + } else { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src1].u64[j]; + } + } else { + if (rs2 != 0) { + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j]; + } + break; + default: + 
riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmerge_vxm)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl, idx, pos; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vm == 0) { + vector_get_layout(env, width, lmul, i, &idx, &pos); + if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src2].u8[j]; + } else { + env->vfp.vreg[dest].u8[j] = env->gpr[rs1]; + } + } else { + if (rs2 != 0) { + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + env->vfp.vreg[dest].u8[j] = env->gpr[rs1]; + } + break; + case 16: + if (vm == 0) { + vector_get_layout(env, width, lmul, i, &idx, &pos); + if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src2].u16[j]; + } else { + env->vfp.vreg[dest].u16[j] = env->gpr[rs1]; + } + } else { + if (rs2 != 0) { + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + env->vfp.vreg[dest].u16[j] = env->gpr[rs1]; + } + break; + case 32: + if (vm == 0) { + vector_get_layout(env, width, lmul, i, &idx, &pos); + if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src2].u32[j]; + } else { + env->vfp.vreg[dest].u32[j] = env->gpr[rs1]; + } + } else { + if (rs2 != 0) { + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + env->vfp.vreg[dest].u32[j] = 
env->gpr[rs1]; + } + break; + case 64: + if (vm == 0) { + vector_get_layout(env, width, lmul, i, &idx, &pos); + if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src2].u64[j]; + } else { + env->vfp.vreg[dest].u64[j] = + (uint64_t)extend_gpr(env->gpr[rs1]); + } + } else { + if (rs2 != 0) { + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + env->vfp.vreg[dest].u64[j] = + (uint64_t)extend_gpr(env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} +void VECTOR_HELPER(vmerge_vim)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int i, j, vl, idx, pos; + uint32_t lmul, width, src2, dest, vlmax; + + vl = env->vfp.vl; + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vm == 0) { + vector_get_layout(env, width, lmul, i, &idx, &pos); + if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src2].u8[j]; + } else { + env->vfp.vreg[dest].u8[j] = + (uint8_t)sign_extend(rs1, 5); + } + } else { + if (rs2 != 0) { + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + env->vfp.vreg[dest].u8[j] = (uint8_t)sign_extend(rs1, 5); + } + break; + case 16: + if (vm == 0) { + vector_get_layout(env, width, lmul, i, &idx, &pos); + if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) { + 
env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src2].u16[j]; + } else { + env->vfp.vreg[dest].u16[j] = + (uint16_t)sign_extend(rs1, 5); + } + } else { + if (rs2 != 0) { + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + env->vfp.vreg[dest].u16[j] = (uint16_t)sign_extend(rs1, 5); + } + break; + case 32: + if (vm == 0) { + vector_get_layout(env, width, lmul, i, &idx, &pos); + if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src2].u32[j]; + } else { + env->vfp.vreg[dest].u32[j] = + (uint32_t)sign_extend(rs1, 5); + } + } else { + if (rs2 != 0) { + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + env->vfp.vreg[dest].u32[j] = (uint32_t)sign_extend(rs1, 5); + } + break; + case 64: + if (vm == 0) { + vector_get_layout(env, width, lmul, i, &idx, &pos); + if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src2].u64[j]; + } else { + env->vfp.vreg[dest].u64[j] = + (uint64_t)sign_extend(rs1, 5); + } + } else { + if (rs2 != 0) { + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + env->vfp.vreg[dest].u64[j] = (uint64_t)sign_extend(rs1, 5); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + break; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; +} + From patchwork Wed Sep 11 06:25:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140417 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 08AAF76 for ; Wed, 11 Sep 2019 06:55:12 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) 
with ESMTPS id AC3F821928 for ; Wed, 11 Sep 2019 06:55:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AC3F821928 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:47058 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wX0-000491-3N for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:55:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38678) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wE5-0007lo-61 for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:48 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7wDu-00087t-Bw for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:37 -0400 Received: from smtp2200-217.mail.aliyun.com ([121.197.200.217]:47996) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDs-0007oG-Fk; Wed, 11 Sep 2019 02:35:26 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.03883426|-1; CH=green; DM=CONTINUE|CONTINUE|true|0.216564-0.00369968-0.779736; FP=0|0|0|0|0|-1|-1|-1; HT=e02c03306; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.FSRTjs0_1568183699; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRTjs0_1568183699) by smtp.aliyun-inc.com(10.147.42.253); Wed, 11 Sep 2019 14:34:59 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:36 +0800 Message-Id: <1568183141-67641-13-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: 
<1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 121.197.200.217 Subject: [Qemu-devel] [PATCH v2 12/17] RISC-V: add vector extension fixed point instructions X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 37 + target/riscv/insn32.decode | 37 + target/riscv/insn_trans/trans_rvv.inc.c | 37 + target/riscv/vector_helper.c | 3388 +++++++++++++++++++++++++++++++ 4 files changed, 3499 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index ab31ef7..ff6002e 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -270,5 +270,42 @@ DEF_HELPER_5(vector_vmerge_vvm, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vmerge_vxm, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vmerge_vim, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsaddu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsaddu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsaddu_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsadd_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsadd_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsadd_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vssubu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vssubu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vssub_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vssub_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vaadd_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vaadd_vx, void, env, i32, i32, i32, i32) 
+DEF_HELPER_5(vector_vaadd_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vasub_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vasub_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsmul_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vsmul_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsmaccu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsmaccu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsmacc_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsmacc_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsmaccsu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsmaccsu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwsmaccus_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vssrl_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vssrl_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vssrl_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vssra_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vssra_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vssra_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnclipu_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnclipu_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnclipu_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnclip_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnclip_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vnclip_vi, void, env, i32, i32, i32, i32) + DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 6db18c5..a82e53e 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -410,5 +410,42 @@ vmerge_vvm 010111 . ..... ..... 000 ..... 1010111 @r_vm vmerge_vxm 010111 . ..... ..... 100 ..... 1010111 @r_vm vmerge_vim 010111 . ..... ..... 011 ..... 
1010111 @r_vm +vsaddu_vv 100000 . ..... ..... 000 ..... 1010111 @r_vm +vsaddu_vx 100000 . ..... ..... 100 ..... 1010111 @r_vm +vsaddu_vi 100000 . ..... ..... 011 ..... 1010111 @r_vm +vsadd_vv 100001 . ..... ..... 000 ..... 1010111 @r_vm +vsadd_vx 100001 . ..... ..... 100 ..... 1010111 @r_vm +vsadd_vi 100001 . ..... ..... 011 ..... 1010111 @r_vm +vssubu_vv 100010 . ..... ..... 000 ..... 1010111 @r_vm +vssubu_vx 100010 . ..... ..... 100 ..... 1010111 @r_vm +vssub_vv 100011 . ..... ..... 000 ..... 1010111 @r_vm +vssub_vx 100011 . ..... ..... 100 ..... 1010111 @r_vm +vaadd_vv 100100 . ..... ..... 000 ..... 1010111 @r_vm +vaadd_vx 100100 . ..... ..... 100 ..... 1010111 @r_vm +vaadd_vi 100100 . ..... ..... 011 ..... 1010111 @r_vm +vasub_vv 100110 . ..... ..... 000 ..... 1010111 @r_vm +vasub_vx 100110 . ..... ..... 100 ..... 1010111 @r_vm +vsmul_vv 100111 . ..... ..... 000 ..... 1010111 @r_vm +vsmul_vx 100111 . ..... ..... 100 ..... 1010111 @r_vm +vwsmaccu_vv 111100 . ..... ..... 000 ..... 1010111 @r_vm +vwsmaccu_vx 111100 . ..... ..... 100 ..... 1010111 @r_vm +vwsmacc_vv 111101 . ..... ..... 000 ..... 1010111 @r_vm +vwsmacc_vx 111101 . ..... ..... 100 ..... 1010111 @r_vm +vwsmaccsu_vv 111110 . ..... ..... 000 ..... 1010111 @r_vm +vwsmaccsu_vx 111110 . ..... ..... 100 ..... 1010111 @r_vm +vwsmaccus_vx 111111 . ..... ..... 100 ..... 1010111 @r_vm +vssrl_vv 101010 . ..... ..... 000 ..... 1010111 @r_vm +vssrl_vx 101010 . ..... ..... 100 ..... 1010111 @r_vm +vssrl_vi 101010 . ..... ..... 011 ..... 1010111 @r_vm +vssra_vv 101011 . ..... ..... 000 ..... 1010111 @r_vm +vssra_vx 101011 . ..... ..... 100 ..... 1010111 @r_vm +vssra_vi 101011 . ..... ..... 011 ..... 1010111 @r_vm +vnclipu_vv 101110 . ..... ..... 000 ..... 1010111 @r_vm +vnclipu_vx 101110 . ..... ..... 100 ..... 1010111 @r_vm +vnclipu_vi 101110 . ..... ..... 011 ..... 1010111 @r_vm +vnclip_vv 101111 . ..... ..... 000 ..... 1010111 @r_vm +vnclip_vx 101111 . ..... ..... 100 ..... 1010111 @r_vm +vnclip_vi 101111 . ..... 
..... 011 ..... 1010111 @r_vm + vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 111 ..... 1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index 1ba52e7..d650e8c 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -299,5 +299,42 @@ GEN_VECTOR_R_VM(vmerge_vvm) GEN_VECTOR_R_VM(vmerge_vxm) GEN_VECTOR_R_VM(vmerge_vim) +GEN_VECTOR_R_VM(vsaddu_vv) +GEN_VECTOR_R_VM(vsaddu_vx) +GEN_VECTOR_R_VM(vsaddu_vi) +GEN_VECTOR_R_VM(vsadd_vv) +GEN_VECTOR_R_VM(vsadd_vx) +GEN_VECTOR_R_VM(vsadd_vi) +GEN_VECTOR_R_VM(vssubu_vv) +GEN_VECTOR_R_VM(vssubu_vx) +GEN_VECTOR_R_VM(vssub_vv) +GEN_VECTOR_R_VM(vssub_vx) +GEN_VECTOR_R_VM(vaadd_vv) +GEN_VECTOR_R_VM(vaadd_vx) +GEN_VECTOR_R_VM(vaadd_vi) +GEN_VECTOR_R_VM(vasub_vv) +GEN_VECTOR_R_VM(vasub_vx) +GEN_VECTOR_R_VM(vsmul_vv) +GEN_VECTOR_R_VM(vsmul_vx) +GEN_VECTOR_R_VM(vwsmaccu_vv) +GEN_VECTOR_R_VM(vwsmaccu_vx) +GEN_VECTOR_R_VM(vwsmacc_vv) +GEN_VECTOR_R_VM(vwsmacc_vx) +GEN_VECTOR_R_VM(vwsmaccsu_vv) +GEN_VECTOR_R_VM(vwsmaccsu_vx) +GEN_VECTOR_R_VM(vwsmaccus_vx) +GEN_VECTOR_R_VM(vssrl_vv) +GEN_VECTOR_R_VM(vssrl_vx) +GEN_VECTOR_R_VM(vssrl_vi) +GEN_VECTOR_R_VM(vssra_vv) +GEN_VECTOR_R_VM(vssra_vx) +GEN_VECTOR_R_VM(vssra_vi) +GEN_VECTOR_R_VM(vnclipu_vv) +GEN_VECTOR_R_VM(vnclipu_vx) +GEN_VECTOR_R_VM(vnclipu_vi) +GEN_VECTOR_R_VM(vnclip_vv) +GEN_VECTOR_R_VM(vnclip_vx) +GEN_VECTOR_R_VM(vnclip_vi) + GEN_VECTOR_R2_ZIMM(vsetvli) GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 49f1cb8..2292fa5 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -75,6 +75,844 @@ static target_ulong vector_get_index(CPURISCVState *env, int rs1, int rs2, return 0; } +/* ADD/SUB/COMPARE instructions. 
*/ +static inline uint8_t sat_add_u8(CPURISCVState *env, uint8_t a, uint8_t b) +{ + uint8_t res = a + b; + if (res < a) { + res = UINT8_MAX; + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint16_t sat_add_u16(CPURISCVState *env, uint16_t a, uint16_t b) +{ + uint16_t res = a + b; + if (res < a) { + res = UINT16_MAX; + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint32_t sat_add_u32(CPURISCVState *env, uint32_t a, uint32_t b) +{ + uint32_t res = a + b; + if (res < a) { + res = UINT32_MAX; + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint64_t sat_add_u64(CPURISCVState *env, uint64_t a, uint64_t b) +{ + uint64_t res = a + b; + if (res < a) { + res = UINT64_MAX; + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint8_t sat_add_s8(CPURISCVState *env, uint8_t a, uint8_t b) +{ + uint8_t res = a + b; + if (((res ^ a) & SIGNBIT8) && !((a ^ b) & SIGNBIT8)) { + res = ~(((int8_t)a >> 7) ^ SIGNBIT8); + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint16_t sat_add_s16(CPURISCVState *env, uint16_t a, uint16_t b) +{ + uint16_t res = a + b; + if (((res ^ a) & SIGNBIT16) && !((a ^ b) & SIGNBIT16)) { + res = ~(((int16_t)a >> 15) ^ SIGNBIT16); + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint32_t sat_add_s32(CPURISCVState *env, uint32_t a, uint32_t b) +{ + uint32_t res = a + b; + if (((res ^ a) & SIGNBIT32) && !((a ^ b) & SIGNBIT32)) { + res = ~(((int32_t)a >> 31) ^ SIGNBIT32); + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint64_t sat_add_s64(CPURISCVState *env, uint64_t a, uint64_t b) +{ + uint64_t res = a + b; + if (((res ^ a) & SIGNBIT64) && !((a ^ b) & SIGNBIT64)) { + res = ~(((int64_t)a >> 63) ^ SIGNBIT64); + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint8_t sat_sub_u8(CPURISCVState *env, uint8_t a, uint8_t b) +{ + uint8_t res = a - b; + if (res > a) { + res = 0; + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint16_t 
sat_sub_u16(CPURISCVState *env, uint16_t a, uint16_t b) +{ + uint16_t res = a - b; + if (res > a) { + res = 0; + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint32_t sat_sub_u32(CPURISCVState *env, uint32_t a, uint32_t b) +{ + uint32_t res = a - b; + if (res > a) { + res = 0; + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint64_t sat_sub_u64(CPURISCVState *env, uint64_t a, uint64_t b) +{ + uint64_t res = a - b; + if (res > a) { + res = 0; + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint8_t sat_sub_s8(CPURISCVState *env, uint8_t a, uint8_t b) +{ + uint8_t res = a - b; + if (((res ^ a) & SIGNBIT8) && ((a ^ b) & SIGNBIT8)) { + res = ~(((int8_t)a >> 7) ^ SIGNBIT8); + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint16_t sat_sub_s16(CPURISCVState *env, uint16_t a, uint16_t b) +{ + uint16_t res = a - b; + if (((res ^ a) & SIGNBIT16) && ((a ^ b) & SIGNBIT16)) { + res = ~(((int16_t)a >> 15) ^ SIGNBIT16); + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint32_t sat_sub_s32(CPURISCVState *env, uint32_t a, uint32_t b) +{ + uint32_t res = a - b; + if (((res ^ a) & SIGNBIT32) && ((a ^ b) & SIGNBIT32)) { + res = ~(((int32_t)a >> 31) ^ SIGNBIT32); + env->vfp.vxsat = 0x1; + } + return res; +} + +static inline uint64_t sat_sub_s64(CPURISCVState *env, uint64_t a, uint64_t b) +{ + uint64_t res = a - b; + if (((res ^ a) & SIGNBIT64) && ((a ^ b) & SIGNBIT64)) { + res = ~(((int64_t)a >> 63) ^ SIGNBIT64); + env->vfp.vxsat = 0x1; + } + return res; +} + +static uint64_t fix_data_round(CPURISCVState *env, uint64_t result, + uint8_t shift) +{ + uint64_t lsb_1 = (uint64_t)1 << shift; + int mod = env->vfp.vxrm; + int mask = ((uint64_t)1 << shift) - 1; + + if (mod == 0x0) { /* rnu */ + return lsb_1 >> 1; + } else if (mod == 0x1) { /* rne */ + if ((result & mask) > (lsb_1 >> 1) || + (((result & mask) == (lsb_1 >> 1)) && + (((result >> shift) & 0x1)) == 1)) { + return lsb_1 >> 1; + } + } else if (mod == 0x3) { /* rod 
*/ + if (((result & mask) >= 0x1) && (((result >> shift) & 0x1) == 0)) { + return lsb_1; + } + } + return 0; +} + +static int8_t saturate_s8(CPURISCVState *env, int16_t res) +{ + if (res > INT8_MAX) { + env->vfp.vxsat = 0x1; + return INT8_MAX; + } else if (res < INT8_MIN) { + env->vfp.vxsat = 0x1; + return INT8_MIN; + } else { + return res; + } +} + +static uint8_t saturate_u8(CPURISCVState *env, uint16_t res) +{ + if (res > UINT8_MAX) { + env->vfp.vxsat = 0x1; + return UINT8_MAX; + } else { + return res; + } +} + +static uint16_t saturate_u16(CPURISCVState *env, uint32_t res) +{ + if (res > UINT16_MAX) { + env->vfp.vxsat = 0x1; + return UINT16_MAX; + } else { + return res; + } +} + +static uint32_t saturate_u32(CPURISCVState *env, uint64_t res) +{ + if (res > UINT32_MAX) { + env->vfp.vxsat = 0x1; + return UINT32_MAX; + } else { + return res; + } +} + +static int16_t saturate_s16(CPURISCVState *env, int32_t res) +{ + if (res > INT16_MAX) { + env->vfp.vxsat = 0x1; + return INT16_MAX; + } else if (res < INT16_MIN) { + env->vfp.vxsat = 0x1; + return INT16_MIN; + } else { + return res; + } +} + +static int32_t saturate_s32(CPURISCVState *env, int64_t res) +{ + if (res > INT32_MAX) { + env->vfp.vxsat = 0x1; + return INT32_MAX; + } else if (res < INT32_MIN) { + env->vfp.vxsat = 0x1; + return INT32_MIN; + } else { + return res; + } +} +static uint16_t vwsmaccu_8(CPURISCVState *env, uint8_t a, uint8_t b, + uint16_t c) +{ + uint16_t round, res; + uint16_t product = (uint16_t)a * (uint16_t)b; + + round = (uint16_t)fix_data_round(env, (uint64_t)product, 4); + res = (round + product) >> 4; + return sat_add_u16(env, c, res); +} + +static uint32_t vwsmaccu_16(CPURISCVState *env, uint16_t a, uint16_t b, + uint32_t c) +{ + uint32_t round, res; + uint32_t product = (uint32_t)a * (uint32_t)b; + + round = (uint32_t)fix_data_round(env, (uint64_t)product, 8); + res = (round + product) >> 8; + return sat_add_u32(env, c, res); +} + +static uint64_t vwsmaccu_32(CPURISCVState *env, 
uint32_t a, uint32_t b, + uint64_t c) +{ + uint64_t round, res; + uint64_t product = (uint64_t)a * (uint64_t)b; + + round = (uint64_t)fix_data_round(env, (uint64_t)product, 16); + res = (round + product) >> 16; + return sat_add_u64(env, c, res); +} + +static int16_t vwsmacc_8(CPURISCVState *env, int8_t a, int8_t b, + int16_t c) +{ + int16_t round, res; + int16_t product = (int16_t)a * (int16_t)b; + + round = (int16_t)fix_data_round(env, (uint64_t)product, 4); + res = (int16_t)(round + product) >> 4; + return sat_add_s16(env, c, res); +} + +static int32_t vwsmacc_16(CPURISCVState *env, int16_t a, int16_t b, + int32_t c) +{ + int32_t round, res; + int32_t product = (int32_t)a * (int32_t)b; + + round = (int32_t)fix_data_round(env, (uint64_t)product, 8); + res = (int32_t)(round + product) >> 8; + return sat_add_s32(env, c, res); +} + +static int64_t vwsmacc_32(CPURISCVState *env, int32_t a, int32_t b, + int64_t c) +{ + int64_t round, res; + int64_t product = (int64_t)a * (int64_t)b; + + round = (int64_t)fix_data_round(env, (uint64_t)product, 16); + res = (int64_t)(round + product) >> 16; + return sat_add_s64(env, c, res); +} + +static int16_t vwsmaccsu_8(CPURISCVState *env, uint8_t a, int8_t b, + int16_t c) +{ + int16_t round, res; + int16_t product = (uint16_t)a * (int16_t)b; + + round = (int16_t)fix_data_round(env, (uint64_t)product, 4); + res = (round + product) >> 4; + return sat_sub_s16(env, c, res); +} + +static int32_t vwsmaccsu_16(CPURISCVState *env, uint16_t a, int16_t b, + uint32_t c) +{ + int32_t round, res; + int32_t product = (uint32_t)a * (int32_t)b; + + round = (int32_t)fix_data_round(env, (uint64_t)product, 8); + res = (round + product) >> 8; + return sat_sub_s32(env, c, res); +} + +static int64_t vwsmaccsu_32(CPURISCVState *env, uint32_t a, int32_t b, + int64_t c) +{ + int64_t round, res; + int64_t product = (uint64_t)a * (int64_t)b; + + round = (int64_t)fix_data_round(env, (uint64_t)product, 16); + res = (round + product) >> 16; + return 
sat_sub_s64(env, c, res); +} + +static int16_t vwsmaccus_8(CPURISCVState *env, int8_t a, uint8_t b, + int16_t c) +{ + int16_t round, res; + int16_t product = (int16_t)a * (uint16_t)b; + + round = (int16_t)fix_data_round(env, (uint64_t)product, 4); + res = (round + product) >> 4; + return sat_sub_s16(env, c, res); +} + +static int32_t vwsmaccus_16(CPURISCVState *env, int16_t a, uint16_t b, + int32_t c) +{ + int32_t round, res; + int32_t product = (int32_t)a * (uint32_t)b; + + round = (int32_t)fix_data_round(env, (uint64_t)product, 8); + res = (round + product) >> 8; + return sat_sub_s32(env, c, res); +} + +static uint64_t vwsmaccus_32(CPURISCVState *env, int32_t a, uint32_t b, + int64_t c) +{ + int64_t round, res; + int64_t product = (int64_t)a * (uint64_t)b; + + round = (int64_t)fix_data_round(env, (uint64_t)product, 16); + res = (round + product) >> 16; + return sat_sub_s64(env, c, res); +} + +static int8_t vssra_8(CPURISCVState *env, int8_t a, uint8_t b) +{ + int16_t round, res; + uint8_t shift = b & 0x7; + + round = (int16_t)fix_data_round(env, (uint64_t)a, shift); + res = (a + round) >> shift; + + return res; +} + +static int16_t vssra_16(CPURISCVState *env, int16_t a, uint16_t b) +{ + int32_t round, res; + uint8_t shift = b & 0xf; + + round = (int32_t)fix_data_round(env, (uint64_t)a, shift); + res = (a + round) >> shift; + return res; +} + +static int32_t vssra_32(CPURISCVState *env, int32_t a, uint32_t b) +{ + int64_t round, res; + uint8_t shift = b & 0x1f; + + round = (int64_t)fix_data_round(env, (uint64_t)a, shift); + res = (a + round) >> shift; + return res; +} + +static int64_t vssra_64(CPURISCVState *env, int64_t a, uint64_t b) +{ + int64_t round, res; + uint8_t shift = b & 0x3f; + + round = (int64_t)fix_data_round(env, (uint64_t)a, shift); + res = (a >> (shift - 1)) + (round >> (shift - 1)); + return res >> 1; +} + +static int8_t vssrai_8(CPURISCVState *env, int8_t a, uint8_t b) +{ + int16_t round, res; + + round = (int16_t)fix_data_round(env, 
(uint64_t)a, b); + res = (a + round) >> b; + return res; +} + +static int16_t vssrai_16(CPURISCVState *env, int16_t a, uint8_t b) +{ + int32_t round, res; + + round = (int32_t)fix_data_round(env, (uint64_t)a, b); + res = (a + round) >> b; + return res; +} + +static int32_t vssrai_32(CPURISCVState *env, int32_t a, uint8_t b) +{ + int64_t round, res; + + round = (int64_t)fix_data_round(env, (uint64_t)a, b); + res = (a + round) >> b; + return res; +} + +static int64_t vssrai_64(CPURISCVState *env, int64_t a, uint8_t b) +{ + int64_t round, res; + + round = (int64_t)fix_data_round(env, (uint64_t)a, b); + res = (a >> (b - 1)) + (round >> (b - 1)); + return res >> 1; +} + +static int8_t vnclip_16(CPURISCVState *env, int16_t a, uint8_t b) +{ + int16_t round, res; + uint8_t shift = b & 0xf; + + round = (int16_t)fix_data_round(env, (uint64_t)a, shift); + res = (a + round) >> shift; + + return saturate_s8(env, res); +} + +static int16_t vnclip_32(CPURISCVState *env, int32_t a, uint16_t b) +{ + int32_t round, res; + uint8_t shift = b & 0x1f; + + round = (int32_t)fix_data_round(env, (uint64_t)a, shift); + res = (a + round) >> shift; + return saturate_s16(env, res); +} + +static int32_t vnclip_64(CPURISCVState *env, int64_t a, uint32_t b) +{ + int64_t round, res; + uint8_t shift = b & 0x3f; + + round = (int64_t)fix_data_round(env, (uint64_t)a, shift); + res = (a + round) >> shift; + + return saturate_s32(env, res); +} + +static int8_t vnclipi_16(CPURISCVState *env, int16_t a, uint8_t b) +{ + int16_t round, res; + + round = (int16_t)fix_data_round(env, (uint64_t)a, b); + res = (a + round) >> b; + + return saturate_s8(env, res); +} + +static int16_t vnclipi_32(CPURISCVState *env, int32_t a, uint8_t b) +{ + int32_t round, res; + + round = (int32_t)fix_data_round(env, (uint64_t)a, b); + res = (a + round) >> b; + + return saturate_s16(env, res); +} + +static int32_t vnclipi_64(CPURISCVState *env, int64_t a, uint8_t b) +{ + int32_t round, res; + + round = (int64_t)fix_data_round(env, 
(uint64_t)a, b); + res = (a + round) >> b; + + return saturate_s32(env, res); +} + +static uint8_t vnclipu_16(CPURISCVState *env, uint16_t a, uint8_t b) +{ + uint16_t round, res; + uint8_t shift = b & 0xf; + + round = (uint16_t)fix_data_round(env, (uint64_t)a, shift); + res = (a + round) >> shift; + + return saturate_u8(env, res); +} + +static uint16_t vnclipu_32(CPURISCVState *env, uint32_t a, uint16_t b) +{ + uint32_t round, res; + uint8_t shift = b & 0x1f; + + round = (uint32_t)fix_data_round(env, (uint64_t)a, shift); + res = (a + round) >> shift; + + return saturate_u16(env, res); +} + +static uint32_t vnclipu_64(CPURISCVState *env, uint64_t a, uint32_t b) +{ + uint64_t round, res; + uint8_t shift = b & 0x3f; + + round = (uint64_t)fix_data_round(env, (uint64_t)a, shift); + res = (a + round) >> shift; + + return saturate_u32(env, res); +} + +static uint8_t vnclipui_16(CPURISCVState *env, uint16_t a, uint8_t b) +{ + uint16_t round, res; + + round = (uint16_t)fix_data_round(env, (uint64_t)a, b); + res = (a + round) >> b; + + return saturate_u8(env, res); +} + +static uint16_t vnclipui_32(CPURISCVState *env, uint32_t a, uint8_t b) +{ + uint32_t round, res; + + round = (uint32_t)fix_data_round(env, (uint64_t)a, b); + res = (a + round) >> b; + + return saturate_u16(env, res); +} + +static uint32_t vnclipui_64(CPURISCVState *env, uint64_t a, uint8_t b) +{ + uint64_t round, res; + + round = (uint64_t)fix_data_round(env, (uint64_t)a, b); + res = (a + round) >> b; + + return saturate_u32(env, res); +} + +static uint8_t vssrl_8(CPURISCVState *env, uint8_t a, uint8_t b) +{ + uint16_t round, res; + uint8_t shift = b & 0x7; + + round = (uint16_t)fix_data_round(env, (uint64_t)a, shift); + res = (a + round) >> shift; + return res; +} + +static uint16_t vssrl_16(CPURISCVState *env, uint16_t a, uint16_t b) +{ + uint32_t round, res; + uint8_t shift = b & 0xf; + + round = (uint32_t)fix_data_round(env, (uint64_t)a, shift); + res = (a + round) >> shift; + return res; +} + +static 
uint32_t vssrl_32(CPURISCVState *env, uint32_t a, uint32_t b) +{ + uint64_t round, res; + uint8_t shift = b & 0x1f; + + round = (uint64_t)fix_data_round(env, (uint64_t)a, shift); + res = (a + round) >> shift; + return res; +} + +static uint64_t vssrl_64(CPURISCVState *env, uint64_t a, uint64_t b) +{ + uint64_t round, res; + uint8_t shift = b & 0x3f; + + round = (uint64_t)fix_data_round(env, (uint64_t)a, shift); + res = (a >> (shift - 1)) + (round >> (shift - 1)); + return res >> 1; +} + +static uint8_t vssrli_8(CPURISCVState *env, uint8_t a, uint8_t b) +{ + uint16_t round, res; + + round = (uint16_t)fix_data_round(env, (uint64_t)a, b); + res = (a + round) >> b; + return res; +} + +static uint16_t vssrli_16(CPURISCVState *env, uint16_t a, uint8_t b) +{ + uint32_t round, res; + + round = (uint32_t)fix_data_round(env, (uint64_t)a, b); + res = (a + round) >> b; + return res; +} + +static uint32_t vssrli_32(CPURISCVState *env, uint32_t a, uint8_t b) +{ + uint64_t round, res; + + round = (uint64_t)fix_data_round(env, (uint64_t)a, b); + res = (a + round) >> b; + return res; +} + +static uint64_t vssrli_64(CPURISCVState *env, uint64_t a, uint8_t b) +{ + uint64_t round, res; + + round = (uint64_t)fix_data_round(env, (uint64_t)a, b); + res = (a >> (b - 1)) + (round >> (b - 1)); + return res >> 1; +} + +static int8_t vsmul_8(CPURISCVState *env, int8_t a, int8_t b) +{ + int16_t round; + int8_t res; + int16_t product = (int16_t)a * (int16_t)b; + + if (a == INT8_MIN && b == INT8_MIN) { + env->vfp.vxsat = 1; + + return INT8_MAX; + } + + round = (int16_t)fix_data_round(env, (uint64_t)product, 7); + res = sat_add_s16(env, product, round) >> 7; + return res; +} + +static int16_t vsmul_16(CPURISCVState *env, int16_t a, int16_t b) +{ + int32_t round; + int16_t res; + int32_t product = (int32_t)a * (int32_t)b; + + if (a == INT16_MIN && b == INT16_MIN) { + env->vfp.vxsat = 1; + + return INT16_MAX; + } + + round = (int32_t)fix_data_round(env, (uint64_t)product, 15); + res = 
sat_add_s32(env, product, round) >> 15; + return res; +} + +static int32_t vsmul_32(CPURISCVState *env, int32_t a, int32_t b) +{ + int64_t round; + int32_t res; + int64_t product = (int64_t)a * (int64_t)b; + + if (a == INT32_MIN && b == INT32_MIN) { + env->vfp.vxsat = 1; + + return INT32_MAX; + } + + round = (int64_t)fix_data_round(env, (uint64_t)product, 31); + res = sat_add_s64(env, product, round) >> 31; + return res; +} + +static int64_t vsmul_64(CPURISCVState *env, int64_t a, int64_t b) +{ + int64_t res; + uint64_t abs_a = a, abs_b = b; + uint64_t lo_64, hi_64, carry, round; + + if (a == INT64_MIN && b == INT64_MIN) { + env->vfp.vxsat = 1; + + return INT64_MAX; + } + + if (a < 0) { + abs_a = ~a + 1; + } + if (b < 0) { + abs_b = ~b + 1; + } + + /* first get the whole product in {hi_64, lo_64} */ + uint64_t a_hi = abs_a >> 32; + uint64_t a_lo = (uint32_t)abs_a; + uint64_t b_hi = abs_b >> 32; + uint64_t b_lo = (uint32_t)abs_b; + + /* + * abs_a * abs_b = (a_hi << 32 + a_lo) * (b_hi << 32 + b_lo) + * = (a_hi * b_hi) << 64 + (a_hi * b_lo) << 32 + + * (a_lo * b_hi) << 32 + a_lo * b_lo + * = {hi_64, lo_64} + * hi_64 = ((a_hi * b_lo) << 32 + (a_lo * b_hi) << 32 + (a_lo * b_lo)) >> 64 + * = (a_hi * b_lo) >> 32 + (a_lo * b_hi) >> 32 + carry + * carry = ((uint64_t)(uint32_t)(a_hi * b_lo) + + * (uint64_t)(uint32_t)(a_lo * b_hi) + (a_lo * b_lo) >> 32) >> 32 + */ + + lo_64 = abs_a * abs_b; + carry = ((uint64_t)(uint32_t)(a_hi * b_lo) + + (uint64_t)(uint32_t)(a_lo * b_hi) + + ((a_lo * b_lo) >> 32)) >> 32; + + hi_64 = a_hi * b_hi + + ((a_hi * b_lo) >> 32) + ((a_lo * b_hi) >> 32) + + carry; + + if ((a ^ b) & SIGNBIT64) { + lo_64 = ~lo_64; + hi_64 = ~hi_64; + if (lo_64 == UINT64_MAX) { + lo_64 = 0; + hi_64 += 1; + } else { + lo_64 += 1; + } + } + + /* set rem and res */ + round = fix_data_round(env, lo_64, 63); + if ((lo_64 + round) < lo_64) { + hi_64 += 1; + res = (hi_64 << 1); + } else { + res = (hi_64 << 1) | ((lo_64 + round) >> 63); + } + + return res; +} +static inline 
int8_t avg_round_s8(CPURISCVState *env, int8_t a, int8_t b) +{ + int16_t round; + int8_t res; + int16_t sum = a + b; + + round = (int16_t)fix_data_round(env, (uint64_t)sum, 1); + res = (sum + round) >> 1; + + return res; +} + +static inline int16_t avg_round_s16(CPURISCVState *env, int16_t a, int16_t b) +{ + int32_t round; + int16_t res; + int32_t sum = a + b; + + round = (int32_t)fix_data_round(env, (uint64_t)sum, 1); + res = (sum + round) >> 1; + + return res; +} + +static inline int32_t avg_round_s32(CPURISCVState *env, int32_t a, int32_t b) +{ + int64_t round; + int32_t res; + int64_t sum = a + b; + + round = (int64_t)fix_data_round(env, (uint64_t)sum, 1); + res = (sum + round) >> 1; + + return res; +} + +static inline int64_t avg_round_s64(CPURISCVState *env, int64_t a, int64_t b) +{ + int64_t rem = (a & 0x1) + (b & 0x1); + int64_t res = (a >> 1) + (b >> 1) + (rem >> 1); + int mod = env->vfp.vxrm; + + if (mod == 0x0) { /* rnu */ + if (rem == 0x1) { + return res + 1; + } + } else if (mod == 0x1) { /* rne */ + if ((rem & 0x1) == 1 && ((res & 0x1) == 1)) { + return res + 1; + } + } else if (mod == 0x3) { /* rod */ + if (((rem & 0x1) >= 0x1) && (res & 0x1) == 0) { + return res + 1; + } + } + return res; +} + static inline bool vector_vtype_ill(CPURISCVState *env) { if ((env->vfp.vtype >> (sizeof(target_ulong) - 1)) & 0x1) { @@ -13726,3 +14564,2553 @@ void VECTOR_HELPER(vmerge_vim)(CPURISCVState *env, uint32_t vm, uint32_t rs1, env->vfp.vstart = 0; } +/* vsaddu.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vsaddu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + 
vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = sat_add_u8(env, + env->vfp.vreg[src1].u8[j], env->vfp.vreg[src2].u8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = sat_add_u16(env, + env->vfp.vreg[src1].u16[j], env->vfp.vreg[src2].u16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = sat_add_u32(env, + env->vfp.vreg[src1].u32[j], env->vfp.vreg[src2].u32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = sat_add_u64(env, + env->vfp.vreg[src1].u64[j], env->vfp.vreg[src2].u64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vsaddu.vx vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vsaddu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < 
vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = sat_add_u8(env, + env->vfp.vreg[src2].u8[j], env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = sat_add_u16(env, + env->vfp.vreg[src2].u16[j], env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = sat_add_u32(env, + env->vfp.vreg[src2].u32[j], env->gpr[rs1]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = sat_add_u64(env, + env->vfp.vreg[src2].u64[j], env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vsaddu.vi vd, vs2, imm, vm # vector-immediate */ +void VECTOR_HELPER(vsaddu_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = 
sat_add_u8(env, + env->vfp.vreg[src2].u8[j], rs1); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = sat_add_u16(env, + env->vfp.vreg[src2].u16[j], rs1); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = sat_add_u32(env, + env->vfp.vreg[src2].u32[j], rs1); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = sat_add_u64(env, + env->vfp.vreg[src2].u64[j], rs1); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vsadd.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vsadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = sat_add_s8(env, + env->vfp.vreg[src1].s8[j], env->vfp.vreg[src2].s8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = sat_add_s16(env, + env->vfp.vreg[src1].s16[j], 
env->vfp.vreg[src2].s16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = sat_add_s32(env, + env->vfp.vreg[src1].s32[j], env->vfp.vreg[src2].s32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = sat_add_s64(env, + env->vfp.vreg[src1].s64[j], env->vfp.vreg[src2].s64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vsadd.vx vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vsadd_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = sat_add_s8(env, + env->vfp.vreg[src2].s8[j], env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = sat_add_s16(env, + env->vfp.vreg[src2].s16[j], env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = sat_add_s32(env, + env->vfp.vreg[src2].s32[j], env->gpr[rs1]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, 
i)) { + env->vfp.vreg[dest].s64[j] = sat_add_s64(env, + env->vfp.vreg[src2].s64[j], env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vsadd.vi vd, vs2, imm, vm # vector-immediate */ +void VECTOR_HELPER(vsadd_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = sat_add_s8(env, + env->vfp.vreg[src2].s8[j], sign_extend(rs1, 5)); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = sat_add_s16(env, + env->vfp.vreg[src2].s16[j], sign_extend(rs1, 5)); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = sat_add_s32(env, + env->vfp.vreg[src2].s32[j], sign_extend(rs1, 5)); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = sat_add_s64(env, + env->vfp.vreg[src2].s64[j], sign_extend(rs1, 5)); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + 
env->vfp.vstart = 0; + return; +} + +/* vssubu.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vssubu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = sat_sub_u8(env, + env->vfp.vreg[src2].u8[j], env->vfp.vreg[src1].u8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = sat_sub_u16(env, + env->vfp.vreg[src2].u16[j], env->vfp.vreg[src1].u16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = sat_sub_u32(env, + env->vfp.vreg[src2].u32[j], env->vfp.vreg[src1].u32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = sat_sub_u64(env, + env->vfp.vreg[src2].u64[j], env->vfp.vreg[src1].u64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vssubu.vx vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vssubu_vx)(CPURISCVState *env, uint32_t 
vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = sat_sub_u8(env, + env->vfp.vreg[src2].u8[j], env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = sat_sub_u16(env, + env->vfp.vreg[src2].u16[j], env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = sat_sub_u32(env, + env->vfp.vreg[src2].u32[j], env->gpr[rs1]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = sat_sub_u64(env, + env->vfp.vreg[src2].u64[j], env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vssub.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vssub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; 
+ } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = sat_sub_s8(env, + env->vfp.vreg[src2].s8[j], env->vfp.vreg[src1].s8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = sat_sub_s16(env, + env->vfp.vreg[src2].s16[j], env->vfp.vreg[src1].s16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = sat_sub_s32(env, + env->vfp.vreg[src2].s32[j], env->vfp.vreg[src1].s32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = sat_sub_s64(env, + env->vfp.vreg[src2].s64[j], env->vfp.vreg[src1].s64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vssub.vx vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vssub_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + 
return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = sat_sub_s8(env, + env->vfp.vreg[src2].s8[j], env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = sat_sub_s16(env, + env->vfp.vreg[src2].s16[j], env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = sat_sub_s32(env, + env->vfp.vreg[src2].s32[j], env->gpr[rs1]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = sat_sub_s64(env, + env->vfp.vreg[src2].s64[j], env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vaadd.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vaadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % 
(VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = avg_round_s8(env, + env->vfp.vreg[src1].s8[j], env->vfp.vreg[src2].s8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = avg_round_s16(env, + env->vfp.vreg[src1].s16[j], env->vfp.vreg[src2].s16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = avg_round_s32(env, + env->vfp.vreg[src1].s32[j], env->vfp.vreg[src2].s32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = avg_round_s64(env, + env->vfp.vreg[src1].s64[j], env->vfp.vreg[src2].s64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vaadd.vx vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vaadd_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = avg_round_s8(env, + env->gpr[rs1], 
env->vfp.vreg[src2].s8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = avg_round_s16(env, + env->gpr[rs1], env->vfp.vreg[src2].s16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = avg_round_s32(env, + env->gpr[rs1], env->vfp.vreg[src2].s32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = avg_round_s64(env, + env->gpr[rs1], env->vfp.vreg[src2].s64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vaadd.vi vd, vs2, imm, vm # vector-immediate */ +void VECTOR_HELPER(vaadd_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = avg_round_s8(env, + rs1, env->vfp.vreg[src2].s8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = avg_round_s16(env, + rs1, env->vfp.vreg[src2].s16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + 
env->vfp.vreg[dest].s32[j] = avg_round_s32(env, + rs1, env->vfp.vreg[src2].s32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = avg_round_s64(env, + rs1, env->vfp.vreg[src2].s64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vasub.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vasub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = avg_round_s8( + env, + ~env->vfp.vreg[src1].s8[j] + 1, + env->vfp.vreg[src2].s8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = avg_round_s16( + env, + ~env->vfp.vreg[src1].s16[j] + 1, + env->vfp.vreg[src2].s16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = avg_round_s32( + env, + ~env->vfp.vreg[src1].s32[j] + 1, + env->vfp.vreg[src2].s32[j]); + } + break; + case 64: + if (vector_elem_mask(env, 
vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = avg_round_s64( + env, + ~env->vfp.vreg[src1].s64[j] + 1, + env->vfp.vreg[src2].s64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + + env->vfp.vstart = 0; + return; +} + +/* vasub.vx vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vasub_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = avg_round_s8( + env, ~env->gpr[rs1] + 1, env->vfp.vreg[src2].s8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = avg_round_s16( + env, ~env->gpr[rs1] + 1, env->vfp.vreg[src2].s16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = avg_round_s32( + env, ~env->gpr[rs1] + 1, env->vfp.vreg[src2].s32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = avg_round_s64( + env, ~env->gpr[rs1] + 1, env->vfp.vreg[src2].s64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { +
vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vsmul.vv vd, vs2, vs1, vm # vd[i] = clip((vs2[i]*vs1[i]+round)>>(SEW-1)) */ +void VECTOR_HELPER(vsmul_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if ((!(vm)) && rd == 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = vsmul_8(env, + env->vfp.vreg[src1].s8[j], env->vfp.vreg[src2].s8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = vsmul_16(env, + env->vfp.vreg[src1].s16[j], env->vfp.vreg[src2].s16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = vsmul_32(env, + env->vfp.vreg[src1].s32[j], env->vfp.vreg[src2].s32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = vsmul_64(env, + env->vfp.vreg[src1].s64[j], env->vfp.vreg[src2].s64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vsmul.vx vd, vs2, rs1, vm # vd[i] = clip((vs2[i]*x[rs1]+round)>>(SEW-1)) */ +void VECTOR_HELPER(vsmul_vx)(CPURISCVState 
*env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if ((!(vm)) && rd == 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = vsmul_8(env, + env->vfp.vreg[src2].s8[j], env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = vsmul_16(env, + env->vfp.vreg[src2].s16[j], env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = vsmul_32(env, + env->vfp.vreg[src2].s32[j], env->gpr[rs1]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = vsmul_64(env, + env->vfp.vreg[src2].s64[j], env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* + * vwsmaccu.vv vd, vs1, vs2, vm # + * vd[i] = clipu((+(vs1[i]*vs2[i]+round)>>SEW/2)+vd[i]) + */ +void VECTOR_HELPER(vwsmaccu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, 
lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = vwsmaccu_8(env, + env->vfp.vreg[src2].u8[j], + env->vfp.vreg[src1].u8[j], + env->vfp.vreg[dest].u16[k]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = vwsmaccu_16(env, + env->vfp.vreg[src2].u16[j], + env->vfp.vreg[src1].u16[j], + env->vfp.vreg[dest].u32[k]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = vwsmaccu_32(env, + env->vfp.vreg[src2].u32[j], + env->vfp.vreg[src1].u32[j], + env->vfp.vreg[dest].u64[k]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* + * vwsmaccu.vx vd, rs1, vs2, vm # + * vd[i] = clipu((+(x[rs1]*vs2[i]+round)>>SEW/2)+vd[i]) + */ +void VECTOR_HELPER(vwsmaccu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, 
RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = vwsmaccu_8(env, + env->vfp.vreg[src2].u8[j], + env->gpr[rs1], + env->vfp.vreg[dest].u16[k]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = vwsmaccu_16(env, + env->vfp.vreg[src2].u16[j], + env->gpr[rs1], + env->vfp.vreg[dest].u32[k]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = vwsmaccu_32(env, + env->vfp.vreg[src2].u32[j], + env->gpr[rs1], + env->vfp.vreg[dest].u64[k]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* + * vwsmacc.vv vd, vs1, vs2, vm # + * vd[i] = clip((+(vs1[i]*vs2[i]+round)>>SEW/2)+vd[i]) + */ +void VECTOR_HELPER(vwsmacc_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + 
vector_lmul_check_reg(env, lmul, rd, true); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = vwsmacc_8(env, + env->vfp.vreg[src2].s8[j], + env->vfp.vreg[src1].s8[j], + env->vfp.vreg[dest].s16[k]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = vwsmacc_16(env, + env->vfp.vreg[src2].s16[j], + env->vfp.vreg[src1].s16[j], + env->vfp.vreg[dest].s32[k]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = vwsmacc_32(env, + env->vfp.vreg[src2].s32[j], + env->vfp.vreg[src1].s32[j], + env->vfp.vreg[dest].s64[k]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* + * vwsmacc.vx vd, rs1, vs2, vm # + * vd[i] = clip((+(x[rs1]*vs2[i]+round)>>SEW/2)+vd[i]) + */ +void VECTOR_HELPER(vwsmacc_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + 
vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = vwsmacc_8(env, + env->vfp.vreg[src2].s8[j], + env->gpr[rs1], + env->vfp.vreg[dest].s16[k]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = vwsmacc_16(env, + env->vfp.vreg[src2].s16[j], + env->gpr[rs1], + env->vfp.vreg[dest].s32[k]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = vwsmacc_32(env, + env->vfp.vreg[src2].s32[j], + env->gpr[rs1], + env->vfp.vreg[dest].s64[k]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* + * vwsmaccsu.vv vd, vs1, vs2, vm + * # vd[i] = clip(-((signed(vs1[i])*unsigned(vs2[i])+round)>>SEW/2)+vd[i]) + */ +void VECTOR_HELPER(vwsmaccsu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / 
(2 * width))); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = vwsmaccsu_8(env, + env->vfp.vreg[src2].u8[j], + env->vfp.vreg[src1].s8[j], + env->vfp.vreg[dest].s16[k]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = vwsmaccsu_16(env, + env->vfp.vreg[src2].u16[j], + env->vfp.vreg[src1].s16[j], + env->vfp.vreg[dest].s32[k]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = vwsmaccsu_32(env, + env->vfp.vreg[src2].u32[j], + env->vfp.vreg[src1].s32[j], + env->vfp.vreg[dest].s64[k]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* + * vwsmaccsu.vx vd, rs1, vs2, vm + * # vd[i] = clip(-((signed(x[rs1])*unsigned(vs2[i])+round)>>SEW/2)+vd[i]) + */ +void VECTOR_HELPER(vwsmaccsu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < 
env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = vwsmaccsu_8(env, + env->vfp.vreg[src2].u8[j], + env->gpr[rs1], + env->vfp.vreg[dest].s16[k]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = vwsmaccsu_16(env, + env->vfp.vreg[src2].u16[j], + env->gpr[rs1], + env->vfp.vreg[dest].s32[k]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = vwsmaccsu_32(env, + env->vfp.vreg[src2].u32[j], + env->gpr[rs1], + env->vfp.vreg[dest].s64[k]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* + * vwsmaccus.vx vd, rs1, vs2, vm + * # vd[i] = clip(-((unsigned(x[rs1])*signed(vs2[i])+round)>>SEW/2)+vd[i]) + */ +void VECTOR_HELPER(vwsmaccus_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = vwsmaccus_8(env, + 
env->vfp.vreg[src2].s8[j], + env->gpr[rs1], + env->vfp.vreg[dest].s16[k]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = vwsmaccus_16(env, + env->vfp.vreg[src2].s16[j], + env->gpr[rs1], + env->vfp.vreg[dest].s32[k]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = vwsmaccus_32(env, + env->vfp.vreg[src2].s32[j], + env->gpr[rs1], + env->vfp.vreg[dest].s64[k]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_widen(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vssrl.vv vd, vs2, vs1, vm # vd[i] = ((vs2[i] + round)>>vs1[i]) */ +void VECTOR_HELPER(vssrl_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = vssrl_8(env, + env->vfp.vreg[src2].u8[j], env->vfp.vreg[src1].u8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = vssrl_16(env, + env->vfp.vreg[src2].u16[j], env->vfp.vreg[src1].u16[j]); + } + break; + case
32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = vssrl_32(env, + env->vfp.vreg[src2].u32[j], env->vfp.vreg[src1].u32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = vssrl_64(env, + env->vfp.vreg[src2].u64[j], env->vfp.vreg[src1].u64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vssrl.vx vd, vs2, rs1, vm # vd[i] = ((vs2[i] + round)>>x[rs1]) */ +void VECTOR_HELPER(vssrl_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = vssrl_8(env, + env->vfp.vreg[src2].u8[j], env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = vssrl_16(env, + env->vfp.vreg[src2].u16[j], env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = vssrl_32(env, + env->vfp.vreg[src2].u32[j], env->gpr[rs1]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = 
vssrl_64(env, + env->vfp.vreg[src2].u64[j], env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vssrl.vi vd, vs2, imm, vm # vd[i] = ((vs2[i] + round)>>imm) */ +void VECTOR_HELPER(vssrl_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = vssrli_8(env, + env->vfp.vreg[src2].u8[j], rs1); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = vssrli_16(env, + env->vfp.vreg[src2].u16[j], rs1); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = vssrli_32(env, + env->vfp.vreg[src2].u32[j], rs1); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = vssrli_64(env, + env->vfp.vreg[src2].u64[j], rs1); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vssra.vv vd, vs2, vs1, vm # vd[i] = ((vs2[i] + 
round)>>vs1[i]) */ +void VECTOR_HELPER(vssra_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = vssra_8(env, + env->vfp.vreg[src2].s8[j], env->vfp.vreg[src1].u8[j]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = vssra_16(env, + env->vfp.vreg[src2].s16[j], env->vfp.vreg[src1].u16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = vssra_32(env, + env->vfp.vreg[src2].s32[j], env->vfp.vreg[src1].u32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = vssra_64(env, + env->vfp.vreg[src2].s64[j], env->vfp.vreg[src1].u64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vssra.vx vd, vs2, rs1, vm # vd[i] = ((vs2[i] + round)>>x[rs1]) */ +void VECTOR_HELPER(vssra_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, 
lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = vssra_8(env, + env->vfp.vreg[src2].s8[j], env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = vssra_16(env, + env->vfp.vreg[src2].s16[j], env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = vssra_32(env, + env->vfp.vreg[src2].s32[j], env->gpr[rs1]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = vssra_64(env, + env->vfp.vreg[src2].s64[j], env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vssra.vi vd, vs2, imm, vm # vd[i] = ((vs2[i] + round)>>imm) */ +void VECTOR_HELPER(vssra_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + 
vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[j] = vssrai_8(env, + env->vfp.vreg[src2].s8[j], rs1); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = vssrai_16(env, + env->vfp.vreg[src2].s16[j], rs1); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = vssrai_32(env, + env->vfp.vreg[src2].s32[j], rs1); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = vssrai_64(env, + env->vfp.vreg[src2].s64[j], rs1); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vnclipu.vv vd, vs2, vs1, vm # vector-vector */ +void VECTOR_HELPER(vnclipu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, k, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul) + || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest 
= rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / (2 * width)); + k = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[k] = vnclipu_16(env, + env->vfp.vreg[src2].u16[j], env->vfp.vreg[src1].u8[k]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = vnclipu_32(env, + env->vfp.vreg[src2].u32[j], env->vfp.vreg[src1].u16[k]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = vnclipu_64(env, + env->vfp.vreg[src2].u64[j], env->vfp.vreg[src1].u32[k]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_narrow(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vnclipu.vx vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vnclipu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul) + || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + j = i % (VLEN / (2 * width)); + k = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + 
env->vfp.vreg[dest].u8[k] = vnclipu_16(env, + env->vfp.vreg[src2].u16[j], env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = vnclipu_32(env, + env->vfp.vreg[src2].u32[j], env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = vnclipu_64(env, + env->vfp.vreg[src2].u64[j], env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_narrow(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vnclipu.vi vd, vs2, imm, vm # vector-immediate */ +void VECTOR_HELPER(vnclipu_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul) + || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + j = i % (VLEN / (2 * width)); + k = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[k] = vnclipui_16(env, + env->vfp.vreg[src2].u16[j], rs1); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = vnclipui_32(env, + env->vfp.vreg[src2].u32[j], rs1); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = 
vnclipui_64(env, + env->vfp.vreg[src2].u64[j], rs1); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_narrow(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vnclip.vv vd, vs2, vs1, vm # vector-vector */ +void VECTOR_HELPER(vnclip_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, k, src1, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul) + || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / (2 * width)); + k = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[k] = vnclip_16(env, + env->vfp.vreg[src2].s16[j], env->vfp.vreg[src1].u8[k]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = vnclip_32(env, + env->vfp.vreg[src2].s32[j], env->vfp.vreg[src1].u16[k]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = vnclip_64(env, + env->vfp.vreg[src2].s64[j], env->vfp.vreg[src1].u32[k]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_narrow(env, dest, k, width); + } + } + env->vfp.vstart 
= 0; + return; +} + +/* vnclip.vx vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vnclip_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, k, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul) + || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + j = i % (VLEN / (2 * width)); + k = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[k] = vnclip_16(env, + env->vfp.vreg[src2].s16[j], env->gpr[rs1]); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = vnclip_32(env, + env->vfp.vreg[src2].s32[j], env->gpr[rs1]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = vnclip_64(env, + env->vfp.vreg[src2].s64[j], env->gpr[rs1]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_narrow(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vnclip.vi vd, vs2, imm, vm # vector-immediate */ +void VECTOR_HELPER(vnclip_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, k, src2; + + lmul = vector_get_lmul(env); + + if (vector_vtype_ill(env) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul) + || 
vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + j = i % (VLEN / (2 * width)); + k = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s8[k] = vnclipi_16(env, + env->vfp.vreg[src2].s16[j], rs1); + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = vnclipi_32(env, + env->vfp.vreg[src2].s32[j], rs1); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = vnclipi_64(env, + env->vfp.vreg[src2].s64[j], rs1); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_narrow(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} From patchwork Wed Sep 11 06:25:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140397 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1F12814ED for ; Wed, 11 Sep 2019 06:46:54 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C13BC21928 for ; Wed, 11 Sep 2019 06:46:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 
C13BC21928 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46956 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wOy-0003HO-AF for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:46:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38696) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wE6-0007nD-8n for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:48 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7wDv-00088a-S5 for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:38 -0400 Received: from smtp2200-217.mail.aliyun.com ([121.197.200.217]:49053) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDu-0007q7-7S; Wed, 11 Sep 2019 02:35:27 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.03883426|-1; CH=green; DM=CONTINUE|CONTINUE|true|0.338074-0.00686125-0.655065; FP=0|0|0|0|0|-1|-1|-1; HT=e01a16384; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.FSRDwoS_1568183700; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRDwoS_1568183700) by smtp.aliyun-inc.com(10.147.40.200); Wed, 11 Sep 2019 14:35:01 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:37 +0800 Message-Id: <1568183141-67641-14-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 
121.197.200.217 Subject: [Qemu-devel] [PATCH v2 13/17] RISC-V: add vector extension float instruction part1, add/sub/mul/div X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 37 + target/riscv/insn32.decode | 37 + target/riscv/insn_trans/trans_rvv.inc.c | 37 + target/riscv/vector_helper.c | 2645 +++++++++++++++++++++++++++++++ 4 files changed, 2756 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index ff6002e..d2c8684 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -307,5 +307,42 @@ DEF_HELPER_5(vector_vnclip_vv, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vnclip_vx, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vnclip_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfadd_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfadd_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfsub_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfsub_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfrsub_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfwadd_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfwadd_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfwadd_wv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfwadd_wf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfwsub_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfwsub_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfwsub_wv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfwsub_wf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmul_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmul_vf, 
void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfdiv_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfdiv_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfrdiv_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfwmul_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfwmul_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmacc_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmacc_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfnmacc_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfnmacc_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmsac_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmsac_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfnmsac_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfnmsac_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmadd_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmadd_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfnmadd_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfnmadd_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmsub_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmsub_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfnmsub_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfnmsub_vf, void, env, i32, i32, i32, i32) + DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index a82e53e..31868ab 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -447,5 +447,42 @@ vnclip_vv 101111 . ..... ..... 000 ..... 1010111 @r_vm vnclip_vx 101111 . ..... ..... 100 ..... 1010111 @r_vm vnclip_vi 101111 . ..... ..... 011 ..... 1010111 @r_vm +vfadd_vv 000000 . ..... ..... 001 ..... 1010111 @r_vm +vfadd_vf 000000 . ..... ..... 101 ..... 1010111 @r_vm +vfsub_vv 000010 . ..... ..... 001 ..... 
1010111 @r_vm +vfsub_vf 000010 . ..... ..... 101 ..... 1010111 @r_vm +vfrsub_vf 100111 . ..... ..... 101 ..... 1010111 @r_vm +vfwadd_vv 110000 . ..... ..... 001 ..... 1010111 @r_vm +vfwadd_vf 110000 . ..... ..... 101 ..... 1010111 @r_vm +vfwadd_wv 110100 . ..... ..... 001 ..... 1010111 @r_vm +vfwadd_wf 110100 . ..... ..... 101 ..... 1010111 @r_vm +vfwsub_vv 110010 . ..... ..... 001 ..... 1010111 @r_vm +vfwsub_vf 110010 . ..... ..... 101 ..... 1010111 @r_vm +vfwsub_wv 110110 . ..... ..... 001 ..... 1010111 @r_vm +vfwsub_wf 110110 . ..... ..... 101 ..... 1010111 @r_vm +vfmul_vv 100100 . ..... ..... 001 ..... 1010111 @r_vm +vfmul_vf 100100 . ..... ..... 101 ..... 1010111 @r_vm +vfdiv_vv 100000 . ..... ..... 001 ..... 1010111 @r_vm +vfdiv_vf 100000 . ..... ..... 101 ..... 1010111 @r_vm +vfrdiv_vf 100001 . ..... ..... 101 ..... 1010111 @r_vm +vfwmul_vv 111000 . ..... ..... 001 ..... 1010111 @r_vm +vfwmul_vf 111000 . ..... ..... 101 ..... 1010111 @r_vm +vfmacc_vf 101100 . ..... ..... 101 ..... 1010111 @r_vm +vfmacc_vv 101100 . ..... ..... 001 ..... 1010111 @r_vm +vfnmacc_vv 101101 . ..... ..... 001 ..... 1010111 @r_vm +vfnmacc_vf 101101 . ..... ..... 101 ..... 1010111 @r_vm +vfmsac_vv 101110 . ..... ..... 001 ..... 1010111 @r_vm +vfmsac_vf 101110 . ..... ..... 101 ..... 1010111 @r_vm +vfnmsac_vv 101111 . ..... ..... 001 ..... 1010111 @r_vm +vfnmsac_vf 101111 . ..... ..... 101 ..... 1010111 @r_vm +vfmadd_vv 101000 . ..... ..... 001 ..... 1010111 @r_vm +vfmadd_vf 101000 . ..... ..... 101 ..... 1010111 @r_vm +vfnmadd_vv 101001 . ..... ..... 001 ..... 1010111 @r_vm +vfnmadd_vf 101001 . ..... ..... 101 ..... 1010111 @r_vm +vfmsub_vv 101010 . ..... ..... 001 ..... 1010111 @r_vm +vfmsub_vf 101010 . ..... ..... 101 ..... 1010111 @r_vm +vfnmsub_vv 101011 . ..... ..... 001 ..... 1010111 @r_vm +vfnmsub_vf 101011 . ..... ..... 101 ..... 1010111 @r_vm + vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 111 ..... 
1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index d650e8c..ff23bc2 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -336,5 +336,42 @@ GEN_VECTOR_R_VM(vnclip_vv) GEN_VECTOR_R_VM(vnclip_vx) GEN_VECTOR_R_VM(vnclip_vi) +GEN_VECTOR_R_VM(vfadd_vv) +GEN_VECTOR_R_VM(vfadd_vf) +GEN_VECTOR_R_VM(vfsub_vv) +GEN_VECTOR_R_VM(vfsub_vf) +GEN_VECTOR_R_VM(vfrsub_vf) +GEN_VECTOR_R_VM(vfwadd_vv) +GEN_VECTOR_R_VM(vfwadd_vf) +GEN_VECTOR_R_VM(vfwadd_wv) +GEN_VECTOR_R_VM(vfwadd_wf) +GEN_VECTOR_R_VM(vfwsub_wv) +GEN_VECTOR_R_VM(vfwsub_wf) +GEN_VECTOR_R_VM(vfwsub_vv) +GEN_VECTOR_R_VM(vfwsub_vf) +GEN_VECTOR_R_VM(vfmul_vv) +GEN_VECTOR_R_VM(vfmul_vf) +GEN_VECTOR_R_VM(vfdiv_vv) +GEN_VECTOR_R_VM(vfdiv_vf) +GEN_VECTOR_R_VM(vfrdiv_vf) +GEN_VECTOR_R_VM(vfwmul_vv) +GEN_VECTOR_R_VM(vfwmul_vf) +GEN_VECTOR_R_VM(vfmacc_vv) +GEN_VECTOR_R_VM(vfmacc_vf) +GEN_VECTOR_R_VM(vfnmacc_vv) +GEN_VECTOR_R_VM(vfnmacc_vf) +GEN_VECTOR_R_VM(vfmsac_vv) +GEN_VECTOR_R_VM(vfmsac_vf) +GEN_VECTOR_R_VM(vfnmsac_vv) +GEN_VECTOR_R_VM(vfnmsac_vf) +GEN_VECTOR_R_VM(vfmadd_vv) +GEN_VECTOR_R_VM(vfmadd_vf) +GEN_VECTOR_R_VM(vfnmadd_vv) +GEN_VECTOR_R_VM(vfnmadd_vf) +GEN_VECTOR_R_VM(vfmsub_vv) +GEN_VECTOR_R_VM(vfmsub_vf) +GEN_VECTOR_R_VM(vfnmsub_vv) +GEN_VECTOR_R_VM(vfnmsub_vf) + GEN_VECTOR_R2_ZIMM(vsetvli) GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 2292fa5..e16543b 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -21,6 +21,7 @@ #include "exec/exec-all.h" #include "exec/helper-proto.h" #include "exec/cpu_ldst.h" +#include "fpu/softfloat.h" #include #define VECTOR_HELPER(name) HELPER(glue(vector_, name)) @@ -1125,6 +1126,41 @@ static void vector_tail_narrow(CPURISCVState *env, int vreg, int index, } } +static void vector_tail_fcommon(CPURISCVState *env, int vreg, int index, + int width) +{ + switch (width) { + case 16: + 
env->vfp.vreg[vreg].u16[index] = 0; + break; + case 32: + env->vfp.vreg[vreg].u32[index] = 0; + break; + case 64: + env->vfp.vreg[vreg].u64[index] = 0; + break; + default: + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + return; + } +} + +static void vector_tail_fwiden(CPURISCVState *env, int vreg, int index, + int width) +{ + switch (width) { + case 16: + env->vfp.vreg[vreg].u32[index] = 0; + break; + case 32: + env->vfp.vreg[vreg].u64[index] = 0; + break; + default: + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + return; + } +} + static inline int vector_get_carry(CPURISCVState *env, int width, int lmul, int index) { @@ -17114,3 +17150,2612 @@ void VECTOR_HELPER(vnclip_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, env->vfp.vstart = 0; return; } + +/* vfadd.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vfadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_add( + env->vfp.vreg[src1].f16[j], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + 
env->vfp.vreg[dest].f32[j] = float32_add( + env->vfp.vreg[src1].f32[j], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_add( + env->vfp.vreg[src1].f64[j], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfadd.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfadd_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_add( + env->fpr[rs1], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_add( + env->fpr[rs1], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_add( + env->fpr[rs1], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, 
GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfsub.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vfsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_sub( + env->vfp.vreg[src2].f16[j], + env->vfp.vreg[src1].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_sub( + env->vfp.vreg[src2].f32[j], + env->vfp.vreg[src1].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_sub( + env->vfp.vreg[src2].f64[j], + env->vfp.vreg[src1].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfsub.vf vd, vs2, rs1, vm # Vector-scalar vd[i] = vs2[i] - f[rs1] */ +void VECTOR_HELPER(vfsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, 
+ uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_sub( + env->vfp.vreg[src2].f16[j], + env->fpr[rs1], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_sub( + env->vfp.vreg[src2].f32[j], + env->fpr[rs1], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_sub( + env->vfp.vreg[src2].f64[j], + env->fpr[rs1], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfrsub.vf vd, vs2, rs1, vm # Scalar-vector vd[i] = f[rs1] - vs2[i] */ +void VECTOR_HELPER(vfrsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, 
rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_sub( + env->fpr[rs1], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_sub( + env->fpr[rs1], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_sub( + env->fpr[rs1], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwadd.vv vd, vs2, vs1, vm # vector-vector */ +void VECTOR_HELPER(vfwadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / 
width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = float32_add( + float16_to_float32(env->vfp.vreg[src2].f16[j], true, + &env->fp_status), + float16_to_float32(env->vfp.vreg[src1].f16[j], true, + &env->fp_status), + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = float64_add( + float32_to_float64(env->vfp.vreg[src2].f32[j], + &env->fp_status), + float32_to_float64(env->vfp.vreg[src1].f32[j], + &env->fp_status), + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwadd.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfwadd_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 
16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = float32_add( + float16_to_float32(env->vfp.vreg[src2].f16[j], true, + &env->fp_status), + float16_to_float32(env->fpr[rs1], true, + &env->fp_status), + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = float64_add( + float32_to_float64(env->vfp.vreg[src2].f32[j], + &env->fp_status), + float32_to_float64(env->fpr[rs1], &env->fp_status), + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwadd.wv vd, vs2, vs1, vm # vector-vector */ +void VECTOR_HELPER(vfwadd_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, true); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / (2 * width))); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = float32_add( + env->vfp.vreg[src2].f32[k], + float16_to_float32(env->vfp.vreg[src1].f16[j], true, + &env->fp_status), + &env->fp_status); + } + break; 
+ case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = float64_add( + env->vfp.vreg[src2].f64[k], + float32_to_float64(env->vfp.vreg[src1].f32[j], + &env->fp_status), + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwadd.wf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfwadd_wf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, true); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / (2 * width))); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = float32_add( + env->vfp.vreg[src2].f32[k], + float16_to_float32(env->fpr[rs1], true, + &env->fp_status), + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = float64_add( + env->vfp.vreg[src2].f64[k], + float32_to_float64(env->fpr[rs1], &env->fp_status), + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwsub.vv vd, vs2, vs1, vm # vector-vector */ +void 
VECTOR_HELPER(vfwsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = float32_sub( + float16_to_float32(env->vfp.vreg[src2].f16[j], true, + &env->fp_status), + float16_to_float32(env->vfp.vreg[src1].f16[j], true, + &env->fp_status), + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = float64_sub( + float32_to_float64(env->vfp.vreg[src2].f32[j], + &env->fp_status), + float32_to_float64(env->vfp.vreg[src1].f32[j], + &env->fp_status), + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwsub.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfwsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, 
src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = float32_sub( + float16_to_float32(env->vfp.vreg[src2].f16[j], true, + &env->fp_status), + float16_to_float32(env->fpr[rs1], true, + &env->fp_status), + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = float64_sub( + float32_to_float64(env->vfp.vreg[src2].f32[j], + &env->fp_status), + float32_to_float64(env->fpr[rs1], &env->fp_status), + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwsub.wv vd, vs2, vs1, vm # vector-vector */ +void VECTOR_HELPER(vfwsub_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + 
vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, true); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / (2 * width))); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = float32_sub( + env->vfp.vreg[src2].f32[k], + float16_to_float32(env->vfp.vreg[src1].f16[j], true, + &env->fp_status), + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = float64_sub( + env->vfp.vreg[src2].f64[k], + float32_to_float64(env->vfp.vreg[src1].f32[j], + &env->fp_status), + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwsub.wf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfwsub_wf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, true); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / (2 * width))); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else 
if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = float32_sub( + env->vfp.vreg[src2].f32[k], + float16_to_float32(env->fpr[rs1], true, + &env->fp_status), + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = float64_sub( + env->vfp.vreg[src2].f64[k], + float32_to_float64(env->fpr[rs1], &env->fp_status), + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmul.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vfmul_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_mul( + env->vfp.vreg[src1].f16[j], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_mul( + env->vfp.vreg[src1].f32[j], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + 
} + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_mul( + env->vfp.vreg[src1].f64[j], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmul.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfmul_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_mul( + env->fpr[rs1], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_mul( + env->fpr[rs1], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_mul( + env->fpr[rs1], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* 
vfdiv.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vfdiv_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_div( + env->vfp.vreg[src2].f16[j], + env->vfp.vreg[src1].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_div( + env->vfp.vreg[src2].f32[j], + env->vfp.vreg[src1].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_div( + env->vfp.vreg[src2].f64[j], + env->vfp.vreg[src1].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfdiv.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfdiv_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if 
(vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_div( + env->vfp.vreg[src2].f16[j], + env->fpr[rs1], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_div( + env->vfp.vreg[src2].f32[j], + env->fpr[rs1], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_div( + env->vfp.vreg[src2].f64[j], + env->fpr[rs1], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfrdiv.vf vd, vs2, rs1, vm # scalar-vector, vd[i] = f[rs1]/vs2[i] */ +void VECTOR_HELPER(vfrdiv_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < 
vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_div( + env->fpr[rs1], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_div( + env->fpr[rs1], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_div( + env->fpr[rs1], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwmul.vv vd, vs2, vs1, vm # vector-vector */ +void VECTOR_HELPER(vfwmul_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else 
if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = float32_mul( + float16_to_float32(env->vfp.vreg[src2].f16[j], true, + &env->fp_status), + float16_to_float32(env->vfp.vreg[src1].f16[j], true, + &env->fp_status), + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = float64_mul( + float32_to_float64(env->vfp.vreg[src2].f32[j], + &env->fp_status), + float32_to_float64(env->vfp.vreg[src1].f32[j], + &env->fp_status), + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + + return; +} + +/* vfwmul.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfwmul_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = float32_mul( + float16_to_float32(env->vfp.vreg[src2].f16[j], true, + &env->fp_status), + float16_to_float32(env->fpr[rs1], true, + &env->fp_status), + &env->fp_status); + } + 
break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = float64_mul( + float32_to_float64(env->vfp.vreg[src2].f32[j], + &env->fp_status), + float32_to_float64(env->fpr[rs1], &env->fp_status), + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i] */ +void VECTOR_HELPER(vfmacc_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->vfp.vreg[src1].f16[j], + env->vfp.vreg[src2].f16[j], + env->vfp.vreg[dest].f16[j], + 0, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->vfp.vreg[src1].f32[j], + env->vfp.vreg[src2].f32[j], + env->vfp.vreg[dest].f32[j], + 0, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = 
float64_muladd( + env->vfp.vreg[src1].f64[j], + env->vfp.vreg[src2].f64[j], + env->vfp.vreg[dest].f64[j], + 0, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmacc.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) + vd[i] */ +void VECTOR_HELPER(vfmacc_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->fpr[rs1], + env->vfp.vreg[src2].f16[j], + env->vfp.vreg[dest].f16[j], + 0, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->fpr[rs1], + env->vfp.vreg[src2].f32[j], + env->vfp.vreg[dest].f32[j], + 0, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->fpr[rs1], + env->vfp.vreg[src2].f64[j], + env->vfp.vreg[dest].f64[j], + 0, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + 
vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfnmacc.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) - vd[i] */ +void VECTOR_HELPER(vfnmacc_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->vfp.vreg[src1].f16[j], + env->vfp.vreg[src2].f16[j], + env->vfp.vreg[dest].f16[j], + float_muladd_negate_c | + float_muladd_negate_product, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->vfp.vreg[src1].f32[j], + env->vfp.vreg[src2].f32[j], + env->vfp.vreg[dest].f32[j], + float_muladd_negate_c | + float_muladd_negate_product, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->vfp.vreg[src1].f64[j], + env->vfp.vreg[src2].f64[j], + env->vfp.vreg[dest].f64[j], + float_muladd_negate_c | + float_muladd_negate_product, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + 
return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfnmacc.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) - vd[i] */ +void VECTOR_HELPER(vfnmacc_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->fpr[rs1], + env->vfp.vreg[src2].f16[j], + env->vfp.vreg[dest].f16[j], + float_muladd_negate_c | + float_muladd_negate_product, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->fpr[rs1], + env->vfp.vreg[src2].f32[j], + env->vfp.vreg[dest].f32[j], + float_muladd_negate_c | + float_muladd_negate_product, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->fpr[rs1], + env->vfp.vreg[src2].f64[j], + env->vfp.vreg[dest].f64[j], + float_muladd_negate_c | + float_muladd_negate_product, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} 
+ +/* vfmsac.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) - vd[i] */ +void VECTOR_HELPER(vfmsac_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->vfp.vreg[src1].f16[j], + env->vfp.vreg[src2].f16[j], + env->vfp.vreg[dest].f16[j], + float_muladd_negate_c, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->vfp.vreg[src1].f32[j], + env->vfp.vreg[src2].f32[j], + env->vfp.vreg[dest].f32[j], + float_muladd_negate_c, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->vfp.vreg[src1].f64[j], + env->vfp.vreg[src2].f64[j], + env->vfp.vreg[dest].f64[j], + float_muladd_negate_c, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmsac.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) - vd[i] */ +void 
VECTOR_HELPER(vfmsac_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->fpr[rs1], + env->vfp.vreg[src2].f16[j], + env->vfp.vreg[dest].f16[j], + float_muladd_negate_c, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->fpr[rs1], + env->vfp.vreg[src2].f32[j], + env->vfp.vreg[dest].f32[j], + float_muladd_negate_c, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->fpr[rs1], + env->vfp.vreg[src2].f64[j], + env->vfp.vreg[dest].f64[j], + float_muladd_negate_c, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + + return; +} + +/* vfnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i] */ +void VECTOR_HELPER(vfnmsac_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = 
env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->vfp.vreg[src1].f16[j], + env->vfp.vreg[src2].f16[j], + env->vfp.vreg[dest].f16[j], + float_muladd_negate_product, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->vfp.vreg[src1].f32[j], + env->vfp.vreg[src2].f32[j], + env->vfp.vreg[dest].f32[j], + float_muladd_negate_product, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->vfp.vreg[src1].f64[j], + env->vfp.vreg[src2].f64[j], + env->vfp.vreg[dest].f64[j], + float_muladd_negate_product, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfnmsac.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) + vd[i] */ +void VECTOR_HELPER(vfnmsac_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || 
vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->fpr[rs1], + env->vfp.vreg[src2].f16[j], + env->vfp.vreg[dest].f16[j], + float_muladd_negate_product, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->fpr[rs1], + env->vfp.vreg[src2].f32[j], + env->vfp.vreg[dest].f32[j], + float_muladd_negate_product, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->fpr[rs1], + env->vfp.vreg[src2].f64[j], + env->vfp.vreg[dest].f64[j], + float_muladd_negate_product, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmadd.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vd[i]) + vs2[i] */ +void VECTOR_HELPER(vfmadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + 
vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->vfp.vreg[src1].f16[j], + env->vfp.vreg[dest].f16[j], + env->vfp.vreg[src2].f16[j], + 0, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->vfp.vreg[src1].f32[j], + env->vfp.vreg[dest].f32[j], + env->vfp.vreg[src2].f32[j], + 0, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->vfp.vreg[src1].f64[j], + env->vfp.vreg[dest].f64[j], + env->vfp.vreg[src2].f64[j], + 0, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmadd.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vd[i]) + vs2[i] */ +void VECTOR_HELPER(vfmadd_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = 
vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->fpr[rs1], + env->vfp.vreg[dest].f16[j], + env->vfp.vreg[src2].f16[j], + 0, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->fpr[rs1], + env->vfp.vreg[dest].f32[j], + env->vfp.vreg[src2].f32[j], + 0, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->fpr[rs1], + env->vfp.vreg[dest].f64[j], + env->vfp.vreg[src2].f64[j], + 0, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} +/* vfnmadd.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vd[i]) - vs2[i] */ +void VECTOR_HELPER(vfnmadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < 
env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->vfp.vreg[src1].f16[j], + env->vfp.vreg[dest].f16[j], + env->vfp.vreg[src2].f16[j], + float_muladd_negate_c | + float_muladd_negate_product, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->vfp.vreg[src1].f32[j], + env->vfp.vreg[dest].f32[j], + env->vfp.vreg[src2].f32[j], + float_muladd_negate_c | + float_muladd_negate_product, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->vfp.vreg[src1].f64[j], + env->vfp.vreg[dest].f64[j], + env->vfp.vreg[src2].f64[j], + float_muladd_negate_c | + float_muladd_negate_product, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfnmadd.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vd[i]) - vs2[i] */ +void VECTOR_HELPER(vfnmadd_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) 
{ + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->fpr[rs1], + env->vfp.vreg[dest].f16[j], + env->vfp.vreg[src2].f16[j], + float_muladd_negate_c | + float_muladd_negate_product, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->fpr[rs1], + env->vfp.vreg[dest].f32[j], + env->vfp.vreg[src2].f32[j], + float_muladd_negate_c | + float_muladd_negate_product, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->fpr[rs1], + env->vfp.vreg[dest].f64[j], + env->vfp.vreg[src2].f64[j], + float_muladd_negate_c | + float_muladd_negate_product, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmsub.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vd[i]) - vs2[i] */ +void VECTOR_HELPER(vfmsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if 
(vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->vfp.vreg[src1].f16[j], + env->vfp.vreg[dest].f16[j], + env->vfp.vreg[src2].f16[j], + float_muladd_negate_c, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->vfp.vreg[src1].f32[j], + env->vfp.vreg[dest].f32[j], + env->vfp.vreg[src2].f32[j], + float_muladd_negate_c, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->vfp.vreg[src1].f64[j], + env->vfp.vreg[dest].f64[j], + env->vfp.vreg[src2].f64[j], + float_muladd_negate_c, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmsub.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vd[i]) - vs2[i] */ +void VECTOR_HELPER(vfmsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->fpr[rs1], + env->vfp.vreg[dest].f16[j], + env->vfp.vreg[src2].f16[j], 
+ float_muladd_negate_c, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->fpr[rs1], + env->vfp.vreg[dest].f32[j], + env->vfp.vreg[src2].f32[j], + float_muladd_negate_c, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->fpr[rs1], + env->vfp.vreg[dest].f64[j], + env->vfp.vreg[src2].f64[j], + float_muladd_negate_c, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} +/* vfnmsub.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vd[i]) + vs2[i] */ +void VECTOR_HELPER(vfnmsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->vfp.vreg[src1].f16[j], + env->vfp.vreg[dest].f16[j], + env->vfp.vreg[src2].f16[j], + float_muladd_negate_product, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, 
vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->vfp.vreg[src1].f32[j], + env->vfp.vreg[dest].f32[j], + env->vfp.vreg[src2].f32[j], + float_muladd_negate_product, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->vfp.vreg[src1].f64[j], + env->vfp.vreg[dest].f64[j], + env->vfp.vreg[src2].f64[j], + float_muladd_negate_product, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + + env->vfp.vstart = 0; + return; +} + +/* vfnmsub.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vd[i]) + vs2[i] */ +void VECTOR_HELPER(vfnmsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_muladd( + env->fpr[rs1], + env->vfp.vreg[dest].f16[j], + env->vfp.vreg[src2].f16[j], + float_muladd_negate_product, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_muladd( + env->fpr[rs1], + env->vfp.vreg[dest].f32[j], + env->vfp.vreg[src2].f32[j], + 
float_muladd_negate_product, + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_muladd( + env->fpr[rs1], + env->vfp.vreg[dest].f64[j], + env->vfp.vreg[src2].f64[j], + float_muladd_negate_product, + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + + From patchwork Wed Sep 11 06:25:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140395 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6B17114DB for ; Wed, 11 Sep 2019 06:46:35 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1988F21928 for ; Wed, 11 Sep 2019 06:46:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1988F21928 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46954 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wOf-0002xl-7S for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:46:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38712) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wE8-0007q0-CL for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7wDy-00089O-2R for 
qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:40 -0400 Received: from smtp2200-217.mail.aliyun.com ([121.197.200.217]:48981) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDw-0007qw-5g; Wed, 11 Sep 2019 02:35:30 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.03883426|-1; CH=green; DM=CONTINUE|CONTINUE|true|0.275989-0.00415687-0.719854; FP=0|0|0|0|0|-1|-1|-1; HT=e02c03300; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.FSRQiS-_1568183703; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRQiS-_1568183703) by smtp.aliyun-inc.com(10.147.41.137); Wed, 11 Sep 2019 14:35:03 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:38 +0800 Message-Id: <1568183141-67641-15-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 121.197.200.217 Subject: [Qemu-devel] [PATCH v2 14/17] RISC-V: add vector extension float instructions part2, sqrt/cmp/cvt/others X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 40 + target/riscv/insn32.decode | 40 + target/riscv/insn_trans/trans_rvv.inc.c | 54 + target/riscv/vector_helper.c | 2962 +++++++++++++++++++++++++++++++ 4 files changed, 3096 insertions(+) diff --git 
a/target/riscv/helper.h b/target/riscv/helper.h index d2c8684..e2384eb 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -344,5 +344,45 @@ DEF_HELPER_5(vector_vfmsub_vf, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vfnmsub_vv, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vfnmsub_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_4(vector_vfsqrt_v, void, env, i32, i32, i32) +DEF_HELPER_5(vector_vfmin_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmin_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmax_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmax_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfsgnj_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfsgnj_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfsgnjn_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfsgnjn_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfsgnjx_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfsgnjx_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmfeq_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmfeq_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmfne_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmfne_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmfle_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmfle_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmflt_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmflt_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmfgt_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmfge_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmford_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vmford_vf, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfmerge_vfm, void, env, i32, i32, i32, i32) +DEF_HELPER_4(vector_vfclass_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfcvt_xu_f_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfcvt_x_f_v, void, 
env, i32, i32, i32) +DEF_HELPER_4(vector_vfcvt_f_xu_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfcvt_f_x_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfwcvt_xu_f_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfwcvt_x_f_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfwcvt_f_xu_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfwcvt_f_x_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfwcvt_f_f_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfncvt_xu_f_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfncvt_x_f_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfncvt_f_xu_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfncvt_f_x_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfncvt_f_f_v, void, env, i32, i32, i32) + DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 31868ab..256d8ea 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -67,6 +67,7 @@ @r_wdvm ..... wd:1 vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd @r_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd @r2_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs1 %rd +@r2_vm ...... vm:1 ..... ..... ... ..... ....... %rs2 %rd @r2_zimm . zimm:11 ..... ... ..... ....... %rs1 %rd @sfence_vma ....... ..... ..... ... ..... ....... %rs2 %rs1 @@ -483,6 +484,45 @@ vfmsub_vv 101010 . ..... ..... 001 ..... 1010111 @r_vm vfmsub_vf 101010 . ..... ..... 101 ..... 1010111 @r_vm vfnmsub_vv 101011 . ..... ..... 001 ..... 1010111 @r_vm vfnmsub_vf 101011 . ..... ..... 101 ..... 1010111 @r_vm +vfsqrt_v 100011 . ..... 00000 001 ..... 1010111 @r2_vm +vfmin_vv 000100 . ..... ..... 001 ..... 1010111 @r_vm +vfmin_vf 000100 . ..... ..... 101 ..... 1010111 @r_vm +vfmax_vv 000110 . ..... ..... 001 ..... 1010111 @r_vm +vfmax_vf 000110 . ..... ..... 101 ..... 1010111 @r_vm +vfsgnj_vv 001000 . ..... ..... 001 ..... 
1010111 @r_vm +vfsgnj_vf 001000 . ..... ..... 101 ..... 1010111 @r_vm +vfsgnjn_vv 001001 . ..... ..... 001 ..... 1010111 @r_vm +vfsgnjn_vf 001001 . ..... ..... 101 ..... 1010111 @r_vm +vfsgnjx_vv 001010 . ..... ..... 001 ..... 1010111 @r_vm +vfsgnjx_vf 001010 . ..... ..... 101 ..... 1010111 @r_vm +vmfeq_vv 011000 . ..... ..... 001 ..... 1010111 @r_vm +vmfeq_vf 011000 . ..... ..... 101 ..... 1010111 @r_vm +vmfne_vv 011100 . ..... ..... 001 ..... 1010111 @r_vm +vmfne_vf 011100 . ..... ..... 101 ..... 1010111 @r_vm +vmflt_vv 011011 . ..... ..... 001 ..... 1010111 @r_vm +vmflt_vf 011011 . ..... ..... 101 ..... 1010111 @r_vm +vmfle_vv 011001 . ..... ..... 001 ..... 1010111 @r_vm +vmfle_vf 011001 . ..... ..... 101 ..... 1010111 @r_vm +vmfgt_vf 011101 . ..... ..... 101 ..... 1010111 @r_vm +vmfge_vf 011111 . ..... ..... 101 ..... 1010111 @r_vm +vmford_vv 011010 . ..... ..... 001 ..... 1010111 @r_vm +vmford_vf 011010 . ..... ..... 101 ..... 1010111 @r_vm +vfclass_v 100011 . ..... 10000 001 ..... 1010111 @r2_vm +vfmerge_vfm 010111 . ..... ..... 101 ..... 1010111 @r_vm +vfcvt_xu_f_v 100010 . ..... 00000 001 ..... 1010111 @r2_vm +vfcvt_x_f_v 100010 . ..... 00001 001 ..... 1010111 @r2_vm +vfcvt_f_xu_v 100010 . ..... 00010 001 ..... 1010111 @r2_vm +vfcvt_f_x_v 100010 . ..... 00011 001 ..... 1010111 @r2_vm +vfwcvt_xu_f_v 100010 . ..... 01000 001 ..... 1010111 @r2_vm +vfwcvt_x_f_v 100010 . ..... 01001 001 ..... 1010111 @r2_vm +vfwcvt_f_xu_v 100010 . ..... 01010 001 ..... 1010111 @r2_vm +vfwcvt_f_x_v 100010 . ..... 01011 001 ..... 1010111 @r2_vm +vfwcvt_f_f_v 100010 . ..... 01100 001 ..... 1010111 @r2_vm +vfncvt_xu_f_v 100010 . ..... 10000 001 ..... 1010111 @r2_vm +vfncvt_x_f_v 100010 . ..... 10001 001 ..... 1010111 @r2_vm +vfncvt_f_xu_v 100010 . ..... 10010 001 ..... 1010111 @r2_vm +vfncvt_f_x_v 100010 . ..... 10011 001 ..... 1010111 @r2_vm +vfncvt_f_f_v 100010 . ..... 10100 001 ..... 1010111 @r2_vm vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 
111 ..... 1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index ff23bc2..e4d4576 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -92,6 +92,20 @@ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ return true; \ } +#define GEN_VECTOR_R2_VM(INSN) \ +static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ +{ \ + TCGv_i32 s2 = tcg_const_i32(a->rs2); \ + TCGv_i32 d = tcg_const_i32(a->rd); \ + TCGv_i32 vm = tcg_const_i32(a->vm); \ + gen_helper_vector_##INSN(cpu_env, vm, s2, d); \ + tcg_temp_free_i32(s2); \ + tcg_temp_free_i32(d); \ + tcg_temp_free_i32(vm); \ + return true; \ +} + + #define GEN_VECTOR_R2_ZIMM(INSN) \ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ { \ @@ -373,5 +387,45 @@ GEN_VECTOR_R_VM(vfmsub_vf) GEN_VECTOR_R_VM(vfnmsub_vv) GEN_VECTOR_R_VM(vfnmsub_vf) +GEN_VECTOR_R2_VM(vfsqrt_v) +GEN_VECTOR_R_VM(vfmin_vv) +GEN_VECTOR_R_VM(vfmin_vf) +GEN_VECTOR_R_VM(vfmax_vv) +GEN_VECTOR_R_VM(vfmax_vf) +GEN_VECTOR_R_VM(vfsgnj_vv) +GEN_VECTOR_R_VM(vfsgnj_vf) +GEN_VECTOR_R_VM(vfsgnjn_vv) +GEN_VECTOR_R_VM(vfsgnjn_vf) +GEN_VECTOR_R_VM(vfsgnjx_vv) +GEN_VECTOR_R_VM(vfsgnjx_vf) +GEN_VECTOR_R_VM(vmfeq_vv) +GEN_VECTOR_R_VM(vmfeq_vf) +GEN_VECTOR_R_VM(vmfne_vv) +GEN_VECTOR_R_VM(vmfne_vf) +GEN_VECTOR_R_VM(vmfle_vv) +GEN_VECTOR_R_VM(vmfle_vf) +GEN_VECTOR_R_VM(vmflt_vv) +GEN_VECTOR_R_VM(vmflt_vf) +GEN_VECTOR_R_VM(vmfgt_vf) +GEN_VECTOR_R_VM(vmfge_vf) +GEN_VECTOR_R_VM(vmford_vv) +GEN_VECTOR_R_VM(vmford_vf) +GEN_VECTOR_R2_VM(vfclass_v) +GEN_VECTOR_R_VM(vfmerge_vfm) +GEN_VECTOR_R2_VM(vfcvt_xu_f_v) +GEN_VECTOR_R2_VM(vfcvt_x_f_v) +GEN_VECTOR_R2_VM(vfcvt_f_xu_v) +GEN_VECTOR_R2_VM(vfcvt_f_x_v) +GEN_VECTOR_R2_VM(vfwcvt_xu_f_v) +GEN_VECTOR_R2_VM(vfwcvt_x_f_v) +GEN_VECTOR_R2_VM(vfwcvt_f_xu_v) +GEN_VECTOR_R2_VM(vfwcvt_f_x_v) +GEN_VECTOR_R2_VM(vfwcvt_f_f_v) +GEN_VECTOR_R2_VM(vfncvt_xu_f_v) +GEN_VECTOR_R2_VM(vfncvt_x_f_v) +GEN_VECTOR_R2_VM(vfncvt_f_xu_v) 
+GEN_VECTOR_R2_VM(vfncvt_f_x_v) +GEN_VECTOR_R2_VM(vfncvt_f_f_v) + GEN_VECTOR_R2_ZIMM(vsetvli) GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index e16543b..fd2ecb7 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -914,6 +914,25 @@ static inline int64_t avg_round_s64(CPURISCVState *env, int64_t a, int64_t b) return res; } +static target_ulong helper_fclass_h(uint64_t frs1) +{ + float16 f = frs1; + bool sign = float16_is_neg(f); + + if (float16_is_infinity(f)) { + return sign ? 1 << 0 : 1 << 7; + } else if (float16_is_zero(f)) { + return sign ? 1 << 3 : 1 << 4; + } else if (float16_is_zero_or_denormal(f)) { + return sign ? 1 << 2 : 1 << 5; + } else if (float16_is_any_nan(f)) { + float_status s = { }; /* for snan_bit_is_one */ + return float16_is_quiet_nan(f, &s) ? 1 << 9 : 1 << 8; + } else { + return sign ? 1 << 1 : 1 << 6; + } +} + static inline bool vector_vtype_ill(CPURISCVState *env) { if ((env->vfp.vtype >> (sizeof(target_ulong) - 1)) & 0x1) { @@ -1017,6 +1036,32 @@ static bool vector_lmul_check_reg(CPURISCVState *env, uint32_t lmul, return true; } +/** + * deposit16: + * @value: initial value to insert bit field into + * @start: the lowest bit in the bit field (numbered from 0) + * @length: the length of the bit field + * @fieldval: the value to insert into the bit field + * + * Deposit @fieldval into the 16 bit @value at the bit field specified + * by the @start and @length parameters, and return the modified + * @value. Bits of @value outside the bit field are not modified. + * Bits of @fieldval above the least significant @length bits are + * ignored. The bit field must lie entirely within the 16 bit word. + * It is valid to request that all 16 bits are modified (ie @length + * 16 and @start 0). + * + * Returns: the modified @value. 
+ */ +static inline uint16_t deposit16(uint16_t value, int start, int length, + uint16_t fieldval) +{ + uint16_t mask; + assert(start >= 0 && length > 0 && length <= 16 - start); + mask = (0xffffU >> (16 - length)) << start; + return (value & ~mask) | ((fieldval << start) & mask); +} + static void vector_tail_amo(CPURISCVState *env, int vreg, int index, int width) { switch (width) { @@ -1161,6 +1206,22 @@ static void vector_tail_fwiden(CPURISCVState *env, int vreg, int index, } } +static void vector_tail_fnarrow(CPURISCVState *env, int vreg, int index, + int width) +{ + switch (width) { + case 16: + env->vfp.vreg[vreg].u16[index] = 0; + break; + case 32: + env->vfp.vreg[vreg].u32[index] = 0; + break; + default: + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + return; + } +} + static inline int vector_get_carry(CPURISCVState *env, int width, int lmul, int index) { @@ -19758,4 +19819,2905 @@ void VECTOR_HELPER(vfnmsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, return; } +/* vfsqrt.v vd, vs2, vm # Vector-vector square root */ +void VECTOR_HELPER(vfsqrt_v)(CPURISCVState *env, uint32_t vm, uint32_t rs2, + uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_sqrt( + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + 
break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_sqrt( + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_sqrt( + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + env->vfp.vreg[dest].f16[j] = 0; + break; + case 32: + env->vfp.vreg[dest].f32[j] = 0; + break; + case 64: + env->vfp.vreg[dest].f64[j] = 0; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmin.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vfmin_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_minnum( + env->vfp.vreg[src1].f16[j], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = 
float32_minnum( + 
env->vfp.vreg[src1].f32[j], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_minnum( + env->vfp.vreg[src1].f64[j], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + env->vfp.vreg[dest].f16[j] = 0; + break; + case 32: + env->vfp.vreg[dest].f32[j] = 0; + break; + case 64: + env->vfp.vreg[dest].f64[j] = 0; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmin.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfmin_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_minnum( + env->fpr[rs1], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_minnum( + env->fpr[rs1], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + 
env->vfp.vreg[dest].f64[j] = 
float64_minnum( + env->fpr[rs1], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + env->vfp.vreg[dest].f16[j] = 0; + break; + case 32: + env->vfp.vreg[dest].f32[j] = 0; + break; + case 64: + env->vfp.vreg[dest].f64[j] = 0; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + + env->vfp.vstart = 0; + return; +} + +/* vfmax.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vfmax_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_maxnum( + env->vfp.vreg[src1].f16[j], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_maxnum( + env->vfp.vreg[src1].f32[j], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_maxnum( + env->vfp.vreg[src1].f64[j], + 
env->vfp.vreg[src2].f64[j], + 
&env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + env->vfp.vreg[dest].f16[j] = 0; + break; + case 32: + env->vfp.vreg[dest].f32[j] = 0; + break; + case 64: + env->vfp.vreg[dest].f64[j] = 0; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmax.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfmax_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = float16_maxnum( + env->fpr[rs1], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = float32_maxnum( + env->fpr[rs1], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = float64_maxnum( + env->fpr[rs1], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + 
env->vfp.vreg[dest].f16[j] = 0; + break; + case 
32: + env->vfp.vreg[dest].f32[j] = 0; + break; + case 64: + env->vfp.vreg[dest].f64[j] = 0; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + + env->vfp.vstart = 0; + return; +} + +/* vfsgnj.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vfsgnj_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = deposit16( + env->vfp.vreg[src1].f16[j], + 0, + 15, + env->vfp.vreg[src2].f16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = deposit32( + env->vfp.vreg[src1].f32[j], + 0, + 31, + env->vfp.vreg[src2].f32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = deposit64( + env->vfp.vreg[src1].f64[j], + 0, + 63, + env->vfp.vreg[src2].f64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + env->vfp.vreg[dest].f16[j] = 0; + break; + case 32: + env->vfp.vreg[dest].f32[j] = 0; + break; + case 64: + 
env->vfp.vreg[dest].f64[j] = 0; + break; + default: + 
riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfsgnj.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfsgnj_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = deposit16( + env->fpr[rs1], + 0, + 15, + env->vfp.vreg[src2].f16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = deposit32( + env->fpr[rs1], + 0, + 31, + env->vfp.vreg[src2].f32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = deposit64( + env->fpr[rs1], + 0, + 63, + env->vfp.vreg[src2].f64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + env->vfp.vreg[dest].f16[j] = 0; + break; + case 32: + env->vfp.vreg[dest].f32[j] = 0; + break; + case 64: + env->vfp.vreg[dest].f64[j] = 0; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfsgnjn.vv vd, vs2, vs1, vm # Vector-vector */ +void 
VECTOR_HELPER(vfsgnjn_vv)(CPURISCVState *env, uint32_t 
vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = deposit16( + ~env->vfp.vreg[src1].f16[j], + 0, + 15, + env->vfp.vreg[src2].f16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = deposit32( + ~env->vfp.vreg[src1].f32[j], + 0, + 31, + env->vfp.vreg[src2].f32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = deposit64( + ~env->vfp.vreg[src1].f64[j], + 0, + 63, + env->vfp.vreg[src2].f64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + env->vfp.vreg[dest].f16[j] = 0; + break; + case 32: + env->vfp.vreg[dest].f32[j] = 0; + break; + case 64: + env->vfp.vreg[dest].f64[j] = 0; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} +/* vfsgnjn.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfsgnjn_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + 
int i, j, dest, src2; + 
+ lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = deposit16( + ~env->fpr[rs1], + 0, + 15, + env->vfp.vreg[src2].f16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = deposit32( + ~env->fpr[rs1], + 0, + 31, + env->vfp.vreg[src2].f32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = deposit64( + ~env->fpr[rs1], + 0, + 63, + env->vfp.vreg[src2].f64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + env->vfp.vreg[dest].f16[j] = 0; + break; + case 32: + env->vfp.vreg[dest].f32[j] = 0; + break; + case 64: + env->vfp.vreg[dest].f64[j] = 0; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfsgnjx.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vfsgnjx_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src1, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + 
} + 
vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = deposit16( + env->vfp.vreg[src1].f16[j] ^ + env->vfp.vreg[src2].f16[j], + 0, + 15, + env->vfp.vreg[src2].f16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = deposit32( + env->vfp.vreg[src1].f32[j] ^ + env->vfp.vreg[src2].f32[j], + 0, + 31, + env->vfp.vreg[src2].f32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = deposit64( + env->vfp.vreg[src1].f64[j] ^ + env->vfp.vreg[src2].f64[j], + 0, + 63, + env->vfp.vreg[src2].f64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + env->vfp.vreg[dest].f16[j] = 0; + break; + case 32: + env->vfp.vreg[dest].f32[j] = 0; + break; + case 64: + env->vfp.vreg[dest].f64[j] = 0; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfsgnjx.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vfsgnjx_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } +
vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = deposit16( + env->fpr[rs1] ^ + env->vfp.vreg[src2].f16[j], + 0, + 15, + env->vfp.vreg[src2].f16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = deposit32( + env->fpr[rs1] ^ + env->vfp.vreg[src2].f32[j], + 0, + 31, + env->vfp.vreg[src2].f32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = deposit64( + env->fpr[rs1] ^ + env->vfp.vreg[src2].f64[j], + 0, + 63, + env->vfp.vreg[src2].f64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + env->vfp.vreg[dest].f16[j] = 0; + break; + case 32: + env->vfp.vreg[dest].f32[j] = 0; + break; + case 64: + env->vfp.vreg[dest].f64[j] = 0; + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfmerge.vfm vd, vs2, rs1, v0 # vd[i] = v0[i].LSB ? 
f[rs1] : vs2[i] */ +void VECTOR_HELPER(vfmerge_vfm)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + /* vfmv.v.f vd, rs1 # vd[i] = f[rs1]; */ + if (vm && (rs2 != 0)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = env->fpr[rs1]; + } else { + env->vfp.vreg[dest].f16[j] = env->vfp.vreg[src2].f16[j]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = env->fpr[rs1]; + } else { + env->vfp.vreg[dest].f32[j] = env->vfp.vreg[src2].f32[j]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = env->fpr[rs1]; + } else { + env->vfp.vreg[dest].f64[j] = env->vfp.vreg[src2].f64[j]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vmfeq.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vmfeq_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src1, src2, result, r; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); 
+ return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + r = float16_compare_quiet(env->vfp.vreg[src1].f16[j], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + if (r == float_relation_equal) { + result = 1; + } else { + result = 0; + } + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float32_eq_quiet(env->vfp.vreg[src1].f32[j], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float64_eq_quiet(env->vfp.vreg[src1].f64[j], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + case 32: + case 64: + vector_mask_result(env, rd, width, lmul, i, 0); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vmfeq.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vmfeq_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2, result, r; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + 
+ vector_lmul_check_reg(env, lmul, rs2, false); + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + r = float16_compare_quiet(env->fpr[rs1], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + if (r == float_relation_equal) { + result = 1; + } else { + result = 0; + } + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float32_eq_quiet(env->fpr[rs1], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float64_eq_quiet(env->fpr[rs1], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + case 32: + case 64: + vector_mask_result(env, rd, width, lmul, i, 0); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vmfne.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vmfne_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src1, src2, result, r; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (env->vfp.vstart >= vl) { + 
return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + r = float16_compare_quiet(env->vfp.vreg[src1].f16[j], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + if (r != float_relation_equal) { + result = 1; + } else { + result = 0; + } + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float32_eq_quiet(env->vfp.vreg[src1].f32[j], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, !result); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float64_eq_quiet(env->vfp.vreg[src1].f64[j], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, !result); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + case 32: + case 64: + vector_mask_result(env, rd, width, lmul, i, 0); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vmfne.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vmfne_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2, result, r; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = 
vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + r = float16_compare_quiet(env->fpr[rs1], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + if (r != float_relation_equal) { + result = 1; + } else { + result = 0; + } + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float32_eq_quiet(env->fpr[rs1], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, !result); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float64_eq_quiet(env->fpr[rs1], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, !result); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + case 32: + case 64: + vector_mask_result(env, rd, width, lmul, i, 0); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vmflt.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vmflt_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src1, src2, result, r; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / 
width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + r = float16_compare(env->vfp.vreg[src2].f16[j], + env->vfp.vreg[src1].f16[j], + &env->fp_status); + if (r == float_relation_less) { + result = 1; + } else { + result = 0; + } + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float32_lt(env->vfp.vreg[src2].f32[j], + env->vfp.vreg[src1].f32[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float64_lt(env->vfp.vreg[src2].f64[j], + env->vfp.vreg[src1].f64[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + case 32: + case 64: + vector_mask_result(env, rd, width, lmul, i, 0); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vmflt.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vmflt_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2, result, r; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } 
else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + r = float16_compare(env->vfp.vreg[src2].f16[j], + env->fpr[rs1], + &env->fp_status); + if (r == float_relation_less) { + result = 1; + } else { + result = 0; + } + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float32_lt(env->vfp.vreg[src2].f32[j], + env->fpr[rs1], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float64_lt(env->vfp.vreg[src2].f64[j], + env->fpr[rs1], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + case 32: + case 64: + vector_mask_result(env, rd, width, lmul, i, 0); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vmfle.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vmfle_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src1, src2, result, r; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if 
(vector_elem_mask(env, vm, width, lmul, i)) { + r = float16_compare(env->vfp.vreg[src2].f16[j], + env->vfp.vreg[src1].f16[j], + &env->fp_status); + if ((r == float_relation_less) || + (r == float_relation_equal)) { + result = 1; + } else { + result = 0; + } + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float32_le(env->vfp.vreg[src2].f32[j], + env->vfp.vreg[src1].f32[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float64_le(env->vfp.vreg[src2].f64[j], + env->vfp.vreg[src1].f64[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + case 32: + case 64: + vector_mask_result(env, rd, width, lmul, i, 0); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vmfle.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vmfle_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2, result, r; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + r = 
float16_compare(env->vfp.vreg[src2].f16[j], + env->fpr[rs1], + &env->fp_status); + if ((r == float_relation_less) || + (r == float_relation_equal)) { + result = 1; + } else { + result = 0; + } + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float32_le(env->vfp.vreg[src2].f32[j], + env->fpr[rs1], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float64_le(env->vfp.vreg[src2].f64[j], + env->fpr[rs1], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, result); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + case 32: + case 64: + vector_mask_result(env, rd, width, lmul, i, 0); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vmfgt.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vmfgt_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2, result, r; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + if (vector_elem_mask(env, vm, width, lmul, i)) { + switch (width) { + case 16: + r = float16_compare(env->vfp.vreg[src2].f16[j], + env->fpr[rs1], + &env->fp_status); + break; + case 32: + r = 
float32_compare(env->vfp.vreg[src2].f32[j], + env->fpr[rs1], + &env->fp_status); + break; + case 64: + r = float64_compare(env->vfp.vreg[src2].f64[j], + env->fpr[rs1], + &env->fp_status); + break; + default: + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (r == float_relation_greater) { + result = 1; + } else { + result = 0; + } + vector_mask_result(env, rd, width, lmul, i, result); + } + } else { + switch (width) { + case 16: + case 32: + case 64: + vector_mask_result(env, rd, width, lmul, i, 0); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vmfge.vf vd, vs2, rs1, vm # vector-scalar */ +void VECTOR_HELPER(vmfge_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2, result, r; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + if (vector_elem_mask(env, vm, width, lmul, i)) { + switch (width) { + case 16: + r = float16_compare(env->vfp.vreg[src2].f16[j], + env->fpr[rs1], + &env->fp_status); + break; + case 32: + r = float32_compare(env->vfp.vreg[src2].f32[j], + env->fpr[rs1], + &env->fp_status); + break; + case 64: + r = float64_compare(env->vfp.vreg[src2].f64[j], + env->fpr[rs1], + &env->fp_status); + break; + default: + riscv_raise_exception(env, + RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if ((r == float_relation_greater) || + (r == float_relation_equal)) { + result = 
1; + } else { + result = 0; + } + vector_mask_result(env, rd, width, lmul, i, result); + } + } else { + switch (width) { + case 16: + case 32: + case 64: + vector_mask_result(env, rd, width, lmul, i, 0); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vmford.vv vd, vs2, vs1, vm # Vector-vector */ +void VECTOR_HELPER(vmford_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src1, src2, result, r; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + r = float16_compare_quiet(env->vfp.vreg[src1].f16[j], + env->vfp.vreg[src2].f16[j], + &env->fp_status); + if (r == float_relation_unordered) { + result = 1; + } else { + result = 0; + } + vector_mask_result(env, rd, width, lmul, i, !result); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float32_unordered_quiet(env->vfp.vreg[src1].f32[j], + env->vfp.vreg[src2].f32[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, !result); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float64_unordered_quiet(env->vfp.vreg[src1].f64[j], + env->vfp.vreg[src2].f64[j], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, 
!result); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + switch (width) { + case 16: + case 32: + case 64: + vector_mask_result(env, rd, width, lmul, i, 0); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vmford.vf vd, vs2, rs1, vm # Vector-scalar */ +void VECTOR_HELPER(vmford_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2, result, r; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + r = float16_compare_quiet(env->vfp.vreg[src2].f16[j], + env->fpr[rs1], + &env->fp_status); + if (r == float_relation_unordered) { + result = 1; + } else { + result = 0; + } + vector_mask_result(env, rd, width, lmul, i, !result); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float32_unordered_quiet(env->vfp.vreg[src2].f32[j], + env->fpr[rs1], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, !result); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + result = float64_unordered_quiet(env->vfp.vreg[src2].f64[j], + env->fpr[rs1], + &env->fp_status); + vector_mask_result(env, rd, width, lmul, i, !result); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } 
else { + switch (width) { + case 16: + case 32: + case 64: + vector_mask_result(env, rd, width, lmul, i, 0); + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfclass.v vd, vs2, vm # Vector-vector */ +void VECTOR_HELPER(vfclass_v)(CPURISCVState *env, uint32_t vm, uint32_t rs2, + uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = helper_fclass_h( + env->vfp.vreg[src2].f16[j]); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = helper_fclass_s( + env->vfp.vreg[src2].f32[j]); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = helper_fclass_d( + env->vfp.vreg[src2].f64[j]); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfcvt.xu.f.v vd, vs2, vm # Convert float to unsigned integer. 
*/ +void VECTOR_HELPER(vfcvt_xu_f_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = float16_to_uint16( + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = float32_to_uint32( + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = float64_to_uint64( + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfcvt.x.f.v vd, vs2, vm # Convert float to signed integer. 
*/ +void VECTOR_HELPER(vfcvt_x_f_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[j] = float16_to_int16( + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[j] = float32_to_int32( + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[j] = float64_to_int64( + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfcvt.f.xu.v vd, vs2, vm # Convert unsigned integer to float. 
*/ +void VECTOR_HELPER(vfcvt_f_xu_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = uint16_to_float16( + env->vfp.vreg[src2].u16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = uint32_to_float32( + env->vfp.vreg[src2].u32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = uint64_to_float64( + env->vfp.vreg[src2].u64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfcvt.f.x.v vd, vs2, vm # Convert integer to float. 
*/ +void VECTOR_HELPER(vfcvt_f_x_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[j] = int16_to_float16( + env->vfp.vreg[src2].s16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[j] = int32_to_float32( + env->vfp.vreg[src2].s32[j], + &env->fp_status); + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[j] = int64_to_float64( + env->vfp.vreg[src2].s64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fcommon(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwcvt.xu.f.v vd, vs2, vm # Convert float to double-width unsigned integer.*/ +void VECTOR_HELPER(vfwcvt_xu_f_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + 
+ vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + if (lmul > 4) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = float16_to_uint32( + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[k] = float32_to_uint64( + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwcvt.x.f.v vd, vs2, vm # Convert float to double-width signed integer. 
*/ +void VECTOR_HELPER(vfwcvt_x_f_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + if (lmul > 4) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = float16_to_int32( + env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s64[k] = float32_to_int64( + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwcvt.f.xu.v vd, vs2, vm # Convert unsigned integer to double-width float */ +void VECTOR_HELPER(vfwcvt_f_xu_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + 
return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + if (lmul > 4) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = uint16_to_float32( + env->vfp.vreg[src2].u16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = uint32_to_float64( + env->vfp.vreg[src2].u32[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfwcvt.f.x.v vd, vs2, vm # Convert integer to double-width float. 
*/ +void VECTOR_HELPER(vfwcvt_f_x_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + if (lmul > 4) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = int16_to_float32( + env->vfp.vreg[src2].s16[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = int32_to_float64( + env->vfp.vreg[src2].s32[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* + * vfwcvt.f.f.v vd, vs2, vm # + * Convert single-width float to double-width float. 
+ */ +void VECTOR_HELPER(vfwcvt_f_f_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, true); + + if (lmul > 4) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / (2 * width))); + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + k = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = float16_to_float32( + env->vfp.vreg[src2].f16[j], + true, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f64[k] = float32_to_float64( + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fwiden(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfncvt.xu.f.v vd, vs2, vm # Convert double-width float to unsigned integer. 
*/ +void VECTOR_HELPER(vfncvt_xu_f_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) || + vector_overlap_vm_common(lmul, vm, rd) || + vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + if (lmul > 4) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + k = i % (VLEN / width); + j = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[k] = float32_to_uint16( + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[k] = float64_to_uint32( + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fnarrow(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfncvt.x.f.v vd, vs2, vm # Convert double-width float to signed integer. 
*/ +void VECTOR_HELPER(vfncvt_x_f_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) || + vector_overlap_vm_common(lmul, vm, rd) || + vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + if (lmul > 4) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + k = i % (VLEN / width); + j = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s16[k] = float32_to_int16( + env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].s32[k] = float64_to_int32( + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fnarrow(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfncvt.f.xu.v vd, vs2, vm # Convert double-width unsigned integer to float */ +void VECTOR_HELPER(vfncvt_f_xu_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || + vector_overlap_vm_common(lmul, vm, rd) || + vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) { + riscv_raise_exception(env, 
RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + if (lmul > 4) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + k = i % (VLEN / width); + j = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[k] = uint32_to_float16( + env->vfp.vreg[src2].u32[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = uint64_to_float32( + env->vfp.vreg[src2].u64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fnarrow(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfncvt.f.x.v vd, vs2, vm # Convert double-width integer to float. 
*/ +void VECTOR_HELPER(vfncvt_f_x_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) || + vector_overlap_vm_common(lmul, vm, rd) || + vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + if (lmul > 4) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + k = i % (VLEN / width); + j = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[k] = int32_to_float16( + env->vfp.vreg[src2].s32[j], + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = int64_to_float32( + env->vfp.vreg[src2].s64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fnarrow(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vfncvt.f.f.v vd, vs2, vm # Convert double-width float to single-width float. 
*/ +void VECTOR_HELPER(vfncvt_f_f_v)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, k, dest, src2; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) || + vector_overlap_vm_common(lmul, vm, rd) || + vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, true); + vector_lmul_check_reg(env, lmul, rd, false); + + if (lmul > 4) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src2 = rs2 + (i / (VLEN / (2 * width))); + k = i % (VLEN / width); + j = i % (VLEN / (2 * width)); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f16[k] = float32_to_float16( + env->vfp.vreg[src2].f32[j], + true, + &env->fp_status); + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].f32[k] = float64_to_float32( + env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_fnarrow(env, dest, k, width); + } + } + env->vfp.vstart = 0; + return; +} + 
From patchwork Wed Sep 11 06:25:39 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140401 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:39 +0800 Message-Id: <1568183141-67641-16-git-send-email-zhiwei_liu@c-sky.com> In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> Subject: [Qemu-devel] [PATCH v2 15/17] RISC-V: add vector extension reduction instructions Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei 
From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 17 + target/riscv/insn32.decode | 17 + target/riscv/insn_trans/trans_rvv.inc.c | 17 + target/riscv/vector_helper.c | 1275 +++++++++++++++++++++++++++++++ 4 files changed, 1326 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index e2384eb..d36bd00 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -384,5 +384,22 @@ DEF_HELPER_4(vector_vfncvt_f_xu_v, void, env, i32, i32, i32) DEF_HELPER_4(vector_vfncvt_f_x_v, void, env, i32, i32, i32) DEF_HELPER_4(vector_vfncvt_f_f_v, void, env, i32, i32, i32) +DEF_HELPER_5(vector_vredsum_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vredand_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfredsum_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vredor_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vredxor_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfredosum_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vredminu_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vredmin_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfredmin_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vredmaxu_vs, 
void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vredmax_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfredmax_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwredsumu_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vwredsum_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfwredsum_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vfwredosum_vs, void, env, i32, i32, i32, i32) + DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 256d8ea..3f63bc1 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -524,5 +524,22 @@ vfncvt_f_xu_v 100010 . ..... 10010 001 ..... 1010111 @r2_vm vfncvt_f_x_v 100010 . ..... 10011 001 ..... 1010111 @r2_vm vfncvt_f_f_v 100010 . ..... 10100 001 ..... 1010111 @r2_vm +vredsum_vs 000000 . ..... ..... 010 ..... 1010111 @r_vm +vredand_vs 000001 . ..... ..... 010 ..... 1010111 @r_vm +vredor_vs 000010 . ..... ..... 010 ..... 1010111 @r_vm +vredxor_vs 000011 . ..... ..... 010 ..... 1010111 @r_vm +vredminu_vs 000100 . ..... ..... 010 ..... 1010111 @r_vm +vredmin_vs 000101 . ..... ..... 010 ..... 1010111 @r_vm +vredmaxu_vs 000110 . ..... ..... 010 ..... 1010111 @r_vm +vredmax_vs 000111 . ..... ..... 010 ..... 1010111 @r_vm +vwredsumu_vs 110000 . ..... ..... 000 ..... 1010111 @r_vm +vwredsum_vs 110001 . ..... ..... 000 ..... 1010111 @r_vm +vfredsum_vs 000001 . ..... ..... 001 ..... 1010111 @r_vm +vfredosum_vs 000011 . ..... ..... 001 ..... 1010111 @r_vm +vfredmin_vs 000101 . ..... ..... 001 ..... 1010111 @r_vm +vfredmax_vs 000111 . ..... ..... 001 ..... 1010111 @r_vm +vfwredsum_vs 110001 . ..... ..... 001 ..... 1010111 @r_vm +vfwredosum_vs 110011 . ..... ..... 001 ..... 1010111 @r_vm + vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 111 ..... 
1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index e4d4576..9a3d31b 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -427,5 +427,22 @@ GEN_VECTOR_R2_VM(vfncvt_f_xu_v) GEN_VECTOR_R2_VM(vfncvt_f_x_v) GEN_VECTOR_R2_VM(vfncvt_f_f_v) +GEN_VECTOR_R_VM(vredsum_vs) +GEN_VECTOR_R_VM(vredand_vs) +GEN_VECTOR_R_VM(vredor_vs) +GEN_VECTOR_R_VM(vredxor_vs) +GEN_VECTOR_R_VM(vredminu_vs) +GEN_VECTOR_R_VM(vredmin_vs) +GEN_VECTOR_R_VM(vredmaxu_vs) +GEN_VECTOR_R_VM(vredmax_vs) +GEN_VECTOR_R_VM(vwredsumu_vs) +GEN_VECTOR_R_VM(vwredsum_vs) +GEN_VECTOR_R_VM(vfredsum_vs) +GEN_VECTOR_R_VM(vfredosum_vs) +GEN_VECTOR_R_VM(vfredmin_vs) +GEN_VECTOR_R_VM(vfredmax_vs) +GEN_VECTOR_R_VM(vfwredsum_vs) +GEN_VECTOR_R_VM(vfwredosum_vs) + GEN_VECTOR_R2_ZIMM(vsetvli) GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index fd2ecb7..4a9083b 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -22720,4 +22720,1279 @@ void VECTOR_HELPER(vfncvt_f_f_v)(CPURISCVState *env, uint32_t vm, return; } +/* vredsum.vs vd, vs2, vs1, vm # vd[0] = sum(vs1[0] , vs2[*]) */ +void VECTOR_HELPER(vredsum_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + uint64_t sum = 0; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / 
width)); + j = i % (VLEN / width); + + if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum += env->vfp.vreg[src2].u8[j]; + } + if (i == 0) { + sum += env->vfp.vreg[rs1].u8[0]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u8[0] = sum; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum += env->vfp.vreg[src2].u16[j]; + } + if (i == 0) { + sum += env->vfp.vreg[rs1].u16[0]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u16[0] = sum; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum += env->vfp.vreg[src2].u32[j]; + } + if (i == 0) { + sum += env->vfp.vreg[rs1].u32[0]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u32[0] = sum; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum += env->vfp.vreg[src2].u64[j]; + } + if (i == 0) { + sum += env->vfp.vreg[rs1].u64[0]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u64[0] = sum; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + + +/* vredand.vs vd, vs2, vs1, vm # vd[0] = and( vs1[0] , vs2[*] ) */ +void VECTOR_HELPER(vredand_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + uint64_t res = 0; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + + if 
(i < vl) { + switch (width) { + case 8: + if (i == 0) { + res = env->vfp.vreg[rs1].u8[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + res &= env->vfp.vreg[src2].u8[j]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u8[0] = res; + } + break; + case 16: + if (i == 0) { + res = env->vfp.vreg[rs1].u16[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + res &= env->vfp.vreg[src2].u16[j]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u16[0] = res; + } + break; + case 32: + if (i == 0) { + res = env->vfp.vreg[rs1].u32[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + res &= env->vfp.vreg[src2].u32[j]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u32[0] = res; + } + break; + case 64: + if (i == 0) { + res = env->vfp.vreg[rs1].u64[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + res &= env->vfp.vreg[src2].u64[j]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u64[0] = res; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfredsum.vs vd, vs2, vs1, vm # Unordered sum */ +void VECTOR_HELPER(vfredsum_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + float16 sum16 = 0.0f; + float32 sum32 = 0.0f; + float64 sum64 = 0.0f; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + + if (i < vl) { + switch (width) { + case 16: + if (i == 
0) { + sum16 = env->vfp.vreg[rs1].f16[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum16 = float16_add(sum16, env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + if (i == vl - 1) { + env->vfp.vreg[rd].f16[0] = sum16; + } + break; + case 32: + if (i == 0) { + sum32 = env->vfp.vreg[rs1].f32[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum32 = float32_add(sum32, env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + if (i == vl - 1) { + env->vfp.vreg[rd].f32[0] = sum32; + } + break; + case 64: + if (i == 0) { + sum64 = env->vfp.vreg[rs1].f64[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum64 = float64_add(sum64, env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + if (i == vl - 1) { + env->vfp.vreg[rd].f64[0] = sum64; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vredor.vs vd, vs2, vs1, vm # vd[0] = or( vs1[0] , vs2[*] ) */ +void VECTOR_HELPER(vredor_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + uint64_t res = 0; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + + if (i < vl) { + switch (width) { + case 8: + if (i == 0) { + res = env->vfp.vreg[rs1].u8[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + res |= env->vfp.vreg[src2].u8[j]; + } + if (i 
== vl - 1) { + env->vfp.vreg[rd].u8[0] = res; + } + break; + case 16: + if (i == 0) { + res = env->vfp.vreg[rs1].u16[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + res |= env->vfp.vreg[src2].u16[j]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u16[0] = res; + } + break; + case 32: + if (i == 0) { + res = env->vfp.vreg[rs1].u32[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + res |= env->vfp.vreg[src2].u32[j]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u32[0] = res; + } + break; + case 64: + if (i == 0) { + res = env->vfp.vreg[rs1].u64[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + res |= env->vfp.vreg[src2].u64[j]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u64[0] = res; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vredxor.vs vd, vs2, vs1, vm # vd[0] = xor( vs1[0] , vs2[*] ) */ +void VECTOR_HELPER(vredxor_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + uint64_t res = 0; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + + if (i < vl) { + switch (width) { + case 8: + if (i == 0) { + res = env->vfp.vreg[rs1].u8[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + res ^= env->vfp.vreg[src2].u8[j]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u8[0] = res; + 
} + break; + case 16: + if (i == 0) { + res = env->vfp.vreg[rs1].u16[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + res ^= env->vfp.vreg[src2].u16[j]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u16[0] = res; + } + break; + case 32: + if (i == 0) { + res = env->vfp.vreg[rs1].u32[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + res ^= env->vfp.vreg[src2].u32[j]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u32[0] = res; + } + break; + case 64: + if (i == 0) { + res = env->vfp.vreg[rs1].u64[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + res ^= env->vfp.vreg[src2].u64[j]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u64[0] = res; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfredosum.vs vd, vs2, vs1, vm # Ordered sum */ +void VECTOR_HELPER(vfredosum_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + helper_vector_vfredsum_vs(env, vm, rs1, rs2, rd); + env->vfp.vstart = 0; + return; +} + +/* vredminu.vs vd, vs2, vs1, vm # vd[0] = minu( vs1[0] , vs2[*] ) */ +void VECTOR_HELPER(vredminu_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + uint64_t minu = 0; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + + if (i < vl) { + switch (width) { + case 8: + if (i == 0) { + minu = 
env->vfp.vreg[rs1].u8[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (minu > env->vfp.vreg[src2].u8[j]) { + minu = env->vfp.vreg[src2].u8[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].u8[0] = minu; + } + break; + case 16: + if (i == 0) { + minu = env->vfp.vreg[rs1].u16[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (minu > env->vfp.vreg[src2].u16[j]) { + minu = env->vfp.vreg[src2].u16[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].u16[0] = minu; + } + break; + case 32: + if (i == 0) { + minu = env->vfp.vreg[rs1].u32[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (minu > env->vfp.vreg[src2].u32[j]) { + minu = env->vfp.vreg[src2].u32[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].u32[0] = minu; + } + break; + case 64: + if (i == 0) { + minu = env->vfp.vreg[rs1].u64[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (minu > env->vfp.vreg[src2].u64[j]) { + minu = env->vfp.vreg[src2].u64[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].u64[0] = minu; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vredmin.vs vd, vs2, vs1, vm # vd[0] = min( vs1[0] , vs2[*] ) */ +void VECTOR_HELPER(vredmin_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + int64_t min = 0; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / 
width)); + j = i % (VLEN / width); + + if (i < vl) { + switch (width) { + case 8: + if (i == 0) { + min = env->vfp.vreg[rs1].s8[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (min > env->vfp.vreg[src2].s8[j]) { + min = env->vfp.vreg[src2].s8[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].s8[0] = min; + } + break; + case 16: + if (i == 0) { + min = env->vfp.vreg[rs1].s16[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (min > env->vfp.vreg[src2].s16[j]) { + min = env->vfp.vreg[src2].s16[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].s16[0] = min; + } + break; + case 32: + if (i == 0) { + min = env->vfp.vreg[rs1].s32[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (min > env->vfp.vreg[src2].s32[j]) { + min = env->vfp.vreg[src2].s32[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].s32[0] = min; + } + break; + case 64: + if (i == 0) { + min = env->vfp.vreg[rs1].s64[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (min > env->vfp.vreg[src2].s64[j]) { + min = env->vfp.vreg[src2].s64[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].s64[0] = min; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfredmin.vs vd, vs2, vs1, vm # Minimum value */ +void VECTOR_HELPER(vfredmin_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + float16 min16 = 0.0f; + float32 min32 = 0.0f; + float64 min64 = 0.0f; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i 
= 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + + if (i < vl) { + switch (width) { + case 16: + if (i == 0) { + min16 = env->vfp.vreg[rs1].f16[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + min16 = float16_minnum(min16, env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + if (i == vl - 1) { + env->vfp.vreg[rd].f16[0] = min16; + } + break; + case 32: + if (i == 0) { + min32 = env->vfp.vreg[rs1].f32[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + min32 = float32_minnum(min32, env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + if (i == vl - 1) { + env->vfp.vreg[rd].f32[0] = min32; + } + break; + case 64: + if (i == 0) { + min64 = env->vfp.vreg[rs1].f64[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + min64 = float64_minnum(min64, env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + if (i == vl - 1) { + env->vfp.vreg[rd].f64[0] = min64; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vredmaxu.vs vd, vs2, vs1, vm # vd[0] = maxu( vs1[0] , vs2[*] ) */ +void VECTOR_HELPER(vredmaxu_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + uint64_t maxu = 0; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + + if 
(i < vl) { + switch (width) { + case 8: + if (i == 0) { + maxu = env->vfp.vreg[rs1].u8[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (maxu < env->vfp.vreg[src2].u8[j]) { + maxu = env->vfp.vreg[src2].u8[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].u8[0] = maxu; + } + break; + case 16: + if (i == 0) { + maxu = env->vfp.vreg[rs1].u16[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (maxu < env->vfp.vreg[src2].u16[j]) { + maxu = env->vfp.vreg[src2].u16[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].u16[0] = maxu; + } + break; + case 32: + if (i == 0) { + maxu = env->vfp.vreg[rs1].u32[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (maxu < env->vfp.vreg[src2].u32[j]) { + maxu = env->vfp.vreg[src2].u32[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].u32[0] = maxu; + } + break; + case 64: + if (i == 0) { + maxu = env->vfp.vreg[rs1].u64[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (maxu < env->vfp.vreg[src2].u64[j]) { + maxu = env->vfp.vreg[src2].u64[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].u64[0] = maxu; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} +/* vredmax.vs vd, vs2, vs1, vm # vd[0] = max( vs1[0] , vs2[*] ) */ +void VECTOR_HELPER(vredmax_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + int64_t max = 0; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + 
+ for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + + if (i < vl) { + switch (width) { + case 8: + if (i == 0) { + max = env->vfp.vreg[rs1].s8[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (max < env->vfp.vreg[src2].s8[j]) { + max = env->vfp.vreg[src2].s8[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].s8[0] = max; + } + break; + case 16: + if (i == 0) { + max = env->vfp.vreg[rs1].s16[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (max < env->vfp.vreg[src2].s16[j]) { + max = env->vfp.vreg[src2].s16[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].s16[0] = max; + } + break; + case 32: + if (i == 0) { + max = env->vfp.vreg[rs1].s32[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (max < env->vfp.vreg[src2].s32[j]) { + max = env->vfp.vreg[src2].s32[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].s32[0] = max; + } + break; + case 64: + if (i == 0) { + max = env->vfp.vreg[rs1].s64[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (max < env->vfp.vreg[src2].s64[j]) { + max = env->vfp.vreg[src2].s64[j]; + } + } + if (i == vl - 1) { + env->vfp.vreg[rd].s64[0] = max; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vfredmax.vs vd, vs2, vs1, vm # Maximum value */ +void VECTOR_HELPER(vfredmax_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + float16 max16 = 0.0f; + float32 max32 = 0.0f; + float64 max64 = 0.0f; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = 
vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + + if (i < vl) { + switch (width) { + case 16: + if (i == 0) { + max16 = env->vfp.vreg[rs1].f16[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + max16 = float16_maxnum(max16, env->vfp.vreg[src2].f16[j], + &env->fp_status); + } + if (i == vl - 1) { + env->vfp.vreg[rd].f16[0] = max16; + } + break; + case 32: + if (i == 0) { + max32 = env->vfp.vreg[rs1].f32[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + max32 = float32_maxnum(max32, env->vfp.vreg[src2].f32[j], + &env->fp_status); + } + if (i == vl - 1) { + env->vfp.vreg[rd].f32[0] = max32; + } + break; + case 64: + if (i == 0) { + max64 = env->vfp.vreg[rs1].f64[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + max64 = float64_maxnum(max64, env->vfp.vreg[src2].f64[j], + &env->fp_status); + } + if (i == vl - 1) { + env->vfp.vreg[rd].f64[0] = max64; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vwredsumu.vs vd, vs2, vs1, vm # 2*SEW = 2*SEW + sum(zero-extend(SEW)) */ +void VECTOR_HELPER(vwredsumu_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + uint64_t sum = 0; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; 
i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + + if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum += env->vfp.vreg[src2].u8[j]; + } + if (i == 0) { + sum += env->vfp.vreg[rs1].u16[0]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u16[0] = sum; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum += env->vfp.vreg[src2].u16[j]; + } + if (i == 0) { + sum += env->vfp.vreg[rs1].u32[0]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u32[0] = sum; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum += env->vfp.vreg[src2].u32[j]; + } + if (i == 0) { + sum += env->vfp.vreg[rs1].u64[0]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].u64[0] = sum; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vwredsum.vs vd, vs2, vs1, vm # 2*SEW = 2*SEW + sum(sign-extend(SEW)) */ +void VECTOR_HELPER(vwredsum_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + int64_t sum = 0; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + + if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum += (int16_t)env->vfp.vreg[src2].s8[j] << 8 >> 8; + } + if (i == 0) { + sum += env->vfp.vreg[rs1].s16[0]; + } + if (i == vl - 1) { 
+ env->vfp.vreg[rd].s16[0] = sum; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum += (int32_t)env->vfp.vreg[src2].s16[j] << 16 >> 16; + } + if (i == 0) { + sum += env->vfp.vreg[rs1].s32[0]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].s32[0] = sum; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum += (int64_t)env->vfp.vreg[src2].s32[j] << 32 >> 32; + } + if (i == 0) { + sum += env->vfp.vreg[rs1].s64[0]; + } + if (i == vl - 1) { + env->vfp.vreg[rd].s64[0] = sum; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* + * vfwredsum.vs vd, vs2, vs1, vm # + * Unordered reduce 2*SEW = 2*SEW + sum(promote(SEW)) + */ +void VECTOR_HELPER(vfwredsum_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, src2; + float32 sum32 = 0.0f; + float64 sum64 = 0.0f; + + lmul = vector_get_lmul(env); + vector_lmul_check_reg(env, lmul, rs2, false); + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vl = env->vfp.vl; + if (vl == 0) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < VLEN / 64; i++) { + env->vfp.vreg[rd].u64[i] = 0; + } + + for (i = 0; i < vlmax; i++) { + src2 = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + + if (i < vl) { + switch (width) { + case 16: + if (i == 0) { + sum32 = env->vfp.vreg[rs1].f32[0]; + } + if (vector_elem_mask(env, vm, width, lmul, i)) { + sum32 = float32_add(sum32, + float16_to_float32(env->vfp.vreg[src2].f16[j], + true, &env->fp_status), + &env->fp_status); + } + if (i == vl - 1) { + env->vfp.vreg[rd].f32[0] = sum32; + } + break; + case 32: + if (i == 0) { + sum64 = env->vfp.vreg[rs1].f64[0]; + } 
+ if (vector_elem_mask(env, vm, width, lmul, i)) { + sum64 = float64_add(sum64, + float32_to_float64(env->vfp.vreg[src2].f32[j], + &env->fp_status), + &env->fp_status); + } + if (i == vl - 1) { + env->vfp.vreg[rd].f64[0] = sum64; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* + * vfwredosum.vs vd, vs2, vs1, vm # + * Ordered reduce 2*SEW = 2*SEW + sum(promote(SEW)) + */ +void VECTOR_HELPER(vfwredosum_vs)(CPURISCVState *env, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + helper_vector_vfwredsum_vs(env, vm, rs1, rs2, rd); + env->vfp.vstart = 0; + return; +}
From patchwork Wed Sep 11 06:25:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140411 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:40 +0800 Message-Id: <1568183141-67641-17-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> Subject: [Qemu-devel] [PATCH v2 16/17] RISC-V: add vector extension mask instructions Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 16 + target/riscv/insn32.decode | 17 + 
target/riscv/insn_trans/trans_rvv.inc.c | 27 ++ target/riscv/vector_helper.c | 635 ++++++++++++++++++++++++++++++++ 4 files changed, 695 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index d36bd00..337ac2e 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -401,5 +401,21 @@ DEF_HELPER_5(vector_vwredsum_vs, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vfwredsum_vs, void, env, i32, i32, i32, i32) DEF_HELPER_5(vector_vfwredosum_vs, void, env, i32, i32, i32, i32) +DEF_HELPER_4(vector_vmandnot_mm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmand_mm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmor_mm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmxor_mm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmornot_mm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmnand_mm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmnor_mm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmxnor_mm, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmsbf_m, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmsof_m, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmsif_m, void, env, i32, i32, i32) +DEF_HELPER_4(vector_viota_m, void, env, i32, i32, i32) +DEF_HELPER_3(vector_vid_v, void, env, i32, i32) +DEF_HELPER_4(vector_vmpopc_m, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmfirst_m, void, env, i32, i32, i32) + DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 3f63bc1..1de776b 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -68,6 +68,7 @@ @r_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd @r2_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs1 %rd @r2_vm ...... vm:1 ..... ..... ... ..... ....... %rs2 %rd +@r1_vm ...... vm:1 ..... ..... ... ..... ....... %rd @r2_zimm . zimm:11 ..... ... ..... ....... %rs1 %rd @sfence_vma ....... ..... ..... ... ..... ....... 
%rs2 %rs1 @@ -541,5 +542,21 @@ vfredmax_vs 000111 . ..... ..... 001 ..... 1010111 @r_vm vfwredsum_vs 110001 . ..... ..... 001 ..... 1010111 @r_vm vfwredosum_vs 110011 . ..... ..... 001 ..... 1010111 @r_vm +vmand_mm 011001 - ..... ..... 010 ..... 1010111 @r +vmnand_mm 011101 - ..... ..... 010 ..... 1010111 @r +vmandnot_mm 011000 - ..... ..... 010 ..... 1010111 @r +vmor_mm 011010 - ..... ..... 010 ..... 1010111 @r +vmxor_mm 011011 - ..... ..... 010 ..... 1010111 @r +vmnor_mm 011110 - ..... ..... 010 ..... 1010111 @r +vmornot_mm 011100 - ..... ..... 010 ..... 1010111 @r +vmxnor_mm 011111 - ..... ..... 010 ..... 1010111 @r +vmpopc_m 010100 . ..... ----- 010 ..... 1010111 @r2_vm +vmfirst_m 010101 . ..... ----- 010 ..... 1010111 @r2_vm +vmsbf_m 010110 . ..... 00001 010 ..... 1010111 @r2_vm +vmsof_m 010110 . ..... 00010 010 ..... 1010111 @r2_vm +vmsif_m 010110 . ..... 00011 010 ..... 1010111 @r2_vm +viota_m 010110 . ..... 10000 010 ..... 1010111 @r2_vm +vid_v 010110 . 00000 10001 010 ..... 1010111 @r1_vm + vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 111 ..... 
1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index 9a3d31b..85e435a 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -77,6 +77,17 @@ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ return true; \ } +#define GEN_VECTOR_R1_VM(INSN) \ +static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ +{ \ + TCGv_i32 d = tcg_const_i32(a->rd); \ + TCGv_i32 vm = tcg_const_i32(a->vm); \ + gen_helper_vector_##INSN(cpu_env, vm, d); \ + tcg_temp_free_i32(d); \ + tcg_temp_free_i32(vm); \ + return true; \ +} + #define GEN_VECTOR_R_VM(INSN) \ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \ { \ @@ -444,5 +455,21 @@ GEN_VECTOR_R_VM(vfredmax_vs) GEN_VECTOR_R_VM(vfwredsum_vs) GEN_VECTOR_R_VM(vfwredosum_vs) +GEN_VECTOR_R(vmandnot_mm) +GEN_VECTOR_R(vmand_mm) +GEN_VECTOR_R(vmor_mm) +GEN_VECTOR_R(vmxor_mm) +GEN_VECTOR_R(vmornot_mm) +GEN_VECTOR_R(vmnand_mm) +GEN_VECTOR_R(vmnor_mm) +GEN_VECTOR_R(vmxnor_mm) +GEN_VECTOR_R2_VM(vmpopc_m) +GEN_VECTOR_R2_VM(vmfirst_m) +GEN_VECTOR_R2_VM(vmsbf_m) +GEN_VECTOR_R2_VM(vmsof_m) +GEN_VECTOR_R2_VM(vmsif_m) +GEN_VECTOR_R2_VM(viota_m) +GEN_VECTOR_R1_VM(vid_v) + GEN_VECTOR_R2_ZIMM(vsetvli) GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 4a9083b..9e15df9 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -1232,6 +1232,15 @@ static inline int vector_get_carry(CPURISCVState *env, int width, int lmul, return (env->vfp.vreg[0].u8[idx] >> pos) & 0x1; } +static inline int vector_mask_reg(CPURISCVState *env, uint32_t reg, int width, + int lmul, int index) +{ + int mlen = width / lmul; + int idx = (index * mlen) / 8; + int pos = (index * mlen) % 8; + return (env->vfp.vreg[reg].u8[idx] >> pos) & 0x1; +} + static inline void vector_mask_result(CPURISCVState *env, uint32_t reg, int width, int lmul, int index, uint32_t result) { @@ -23996,3 +24005,629 
@@ void VECTOR_HELPER(vfwredosum_vs)(CPURISCVState *env, uint32_t vm, env->vfp.vstart = 0; return; } + +/* vmandnot.mm vd, vs2, vs1 # vd = vs2 & ~vs1 */ +void VECTOR_HELPER(vmandnot_mm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, i, vlmax; + uint32_t tmp; + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + for (i = 0; i < vlmax; i++) { + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + tmp = ~vector_mask_reg(env, rs1, width, lmul, i) & + vector_mask_reg(env, rs2, width, lmul, i); + vector_mask_result(env, rd, width, lmul, i, tmp); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + + env->vfp.vstart = 0; + return; +} +/* vmand.mm vd, vs2, vs1 # vd = vs2 & vs1 */ +void VECTOR_HELPER(vmand_mm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, i, vlmax; + uint32_t tmp; + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + for (i = 0; i < vlmax; i++) { + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + tmp = vector_mask_reg(env, rs1, width, lmul, i) & + vector_mask_reg(env, rs2, width, lmul, i); + vector_mask_result(env, rd, width, lmul, i, tmp); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + + env->vfp.vstart = 0; + return; +} +/* vmor.mm vd, vs2, vs1 # vd = vs2 | vs1 */ +void VECTOR_HELPER(vmor_mm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, i, vlmax; + uint32_t tmp; + + if (vector_vtype_ill(env)) { + 
riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + for (i = 0; i < vlmax; i++) { + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + tmp = vector_mask_reg(env, rs1, width, lmul, i) | + vector_mask_reg(env, rs2, width, lmul, i); + vector_mask_result(env, rd, width, lmul, i, tmp & 0x1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + + env->vfp.vstart = 0; + return; +} +/* vmxor.mm vd, vs2, vs1 # vd = vs2 ^ vs1 */ +void VECTOR_HELPER(vmxor_mm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, i, vlmax; + uint32_t tmp; + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + for (i = 0; i < vlmax; i++) { + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + tmp = vector_mask_reg(env, rs1, width, lmul, i) ^ + vector_mask_reg(env, rs2, width, lmul, i); + vector_mask_result(env, rd, width, lmul, i, tmp & 0x1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + + env->vfp.vstart = 0; + return; +} +/* vmornot.mm vd, vs2, vs1 # vd = vs2 | ~vs1 */ +void VECTOR_HELPER(vmornot_mm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, i, vlmax; + uint32_t tmp; + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + for (i = 0; i < vlmax; i++) { + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) 
{ + tmp = ~vector_mask_reg(env, rs1, width, lmul, i) | + vector_mask_reg(env, rs2, width, lmul, i); + vector_mask_result(env, rd, width, lmul, i, tmp & 0x1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + + env->vfp.vstart = 0; + return; +} +/* vmnand.mm vd, vs2, vs1 # vd = ~(vs2 & vs1) */ +void VECTOR_HELPER(vmnand_mm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, i, vlmax; + uint32_t tmp; + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + for (i = 0; i < vlmax; i++) { + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + tmp = vector_mask_reg(env, rs1, width, lmul, i) & + vector_mask_reg(env, rs2, width, lmul, i); + vector_mask_result(env, rd, width, lmul, i, (~tmp & 0x1)); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + + env->vfp.vstart = 0; + return; +} +/* vmnor.mm vd, vs2, vs1 # vd = ~(vs2 | vs1) */ +void VECTOR_HELPER(vmnor_mm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, i, vlmax; + uint32_t tmp; + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + for (i = 0; i < vlmax; i++) { + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + tmp = vector_mask_reg(env, rs1, width, lmul, i) | + vector_mask_reg(env, rs2, width, lmul, i); + vector_mask_result(env, rd, width, lmul, i, ~tmp & 0x1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + + env->vfp.vstart = 0; + return; +} + +/* vmxnor.mm vd, vs2, vs1 # vd = ~(vs2 ^ vs1) */ +void 
VECTOR_HELPER(vmxnor_mm)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, i, vlmax; + uint32_t tmp; + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + vl = env->vfp.vl; + if (env->vfp.vstart >= vl) { + return; + } + + for (i = 0; i < vlmax; i++) { + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + tmp = vector_mask_reg(env, rs1, width, lmul, i) ^ + vector_mask_reg(env, rs2, width, lmul, i); + vector_mask_result(env, rd, width, lmul, i, ~tmp & 0x1); + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + + env->vfp.vstart = 0; + return; +} + +/* vmpopc.m rd, vs2, v0.t # x[rd] = sum_i ( vs2[i].LSB && v0[i].LSB ) */ +void VECTOR_HELPER(vmpopc_m)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i; + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + env->gpr[rd] = 0; + + for (i = 0; i < vlmax; i++) { + if (i < vl) { + if (vector_mask_reg(env, rs2, width, lmul, i) && + vector_elem_mask(env, vm, width, lmul, i)) { + env->gpr[rd]++; + } + } + } + env->vfp.vstart = 0; + return; +} + +/* vmfirst.m rd, vs2, vm */ +void VECTOR_HELPER(vmfirst_m)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i; + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } 
+ + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + if (i < vl) { + if (vector_mask_reg(env, rs2, width, lmul, i) && + vector_elem_mask(env, vm, width, lmul, i)) { + env->gpr[rd] = i; + break; + } + } else { + env->gpr[rd] = -1; + } + } + env->vfp.vstart = 0; + return; +} + +/* vmsbf.m vd, vs2, vm # set-before-first mask bit */ +void VECTOR_HELPER(vmsbf_m)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i; + bool first_mask_bit = false; + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + if (i < vl) { + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (first_mask_bit) { + vector_mask_result(env, rd, width, lmul, i, 0); + continue; + } + if (!vector_mask_reg(env, rs2, width, lmul, i)) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + first_mask_bit = true; + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + env->vfp.vstart = 0; + return; +} + +/* vmsif.m vd, vs2, vm # set-including-first mask bit */ +void VECTOR_HELPER(vmsif_m)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i; + bool first_mask_bit = false; + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + if (i < vl) { + if 
(vector_elem_mask(env, vm, width, lmul, i)) { + if (first_mask_bit) { + vector_mask_result(env, rd, width, lmul, i, 0); + continue; + } + if (!vector_mask_reg(env, rs2, width, lmul, i)) { + vector_mask_result(env, rd, width, lmul, i, 1); + } else { + first_mask_bit = true; + vector_mask_result(env, rd, width, lmul, i, 1); + } + } + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + env->vfp.vstart = 0; + return; +} + +/* vmsof.m vd, vs2, vm # set-only-first mask bit */ +void VECTOR_HELPER(vmsof_m)(CPURISCVState *env, uint32_t vm, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i; + bool first_mask_bit = false; + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + if (i < vl) { + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (first_mask_bit) { + vector_mask_result(env, rd, width, lmul, i, 0); + continue; + } + if (!vector_mask_reg(env, rs2, width, lmul, i)) { + vector_mask_result(env, rd, width, lmul, i, 0); + } else { + first_mask_bit = true; + vector_mask_result(env, rd, width, lmul, i, 1); + } + } + } else { + vector_mask_result(env, rd, width, lmul, i, 0); + } + } + env->vfp.vstart = 0; + return; +} + +/* viota.m v4, v2, v0.t */ +void VECTOR_HELPER(viota_m)(CPURISCVState *env, uint32_t vm, uint32_t rs2, + uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest; + uint32_t sum = 0; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 1)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, 
RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = sum; + if (vector_mask_reg(env, rs2, width, lmul, i)) { + sum++; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = sum; + if (vector_mask_reg(env, rs2, width, lmul, i)) { + sum++; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = sum; + if (vector_mask_reg(env, rs2, width, lmul, i)) { + sum++; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = sum; + if (vector_mask_reg(env, rs2, width, lmul, i)) { + sum++; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vid.v vd, vm # Write element ID to destination. 
*/ +void VECTOR_HELPER(vid_v)(CPURISCVState *env, uint32_t vm, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rd, false); + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = i; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = i; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = i; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = i; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} From patchwork Wed Sep 11 06:25:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 11140371 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B8D2C14ED for ; Wed, 11 Sep 2019 06:41:25 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 79ADB2089F for ; Wed, 11 Sep 2019 06:41:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 79ADB2089F 
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=c-sky.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46896 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wJf-00067T-V4 for patchwork-qemu-devel@patchwork.kernel.org; Wed, 11 Sep 2019 02:41:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38683) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i7wE5-0007mZ-PC for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:42 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i7wDz-0008AD-LQ for qemu-devel@nongnu.org; Wed, 11 Sep 2019 02:35:37 -0400 Received: from smtp2200-217.mail.aliyun.com ([121.197.200.217]:56528) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i7wDy-0007rn-IX; Wed, 11 Sep 2019 02:35:31 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.03883426|-1; CH=green; DM=CONTINUE|CONTINUE|true|0.442378-0.00542541-0.552197; FP=0|0|0|0|0|-1|-1|-1; HT=e02c03307; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.FSRMCTm_1568183706; Received: from localhost(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.FSRMCTm_1568183706) by smtp.aliyun-inc.com(10.147.41.199); Wed, 11 Sep 2019 14:35:06 +0800 From: liuzhiwei To: Alistair.Francis@wdc.com, palmer@sifive.com, sagark@eecs.berkeley.edu, kbastian@mail.uni-paderborn.de, riku.voipio@iki.fi, laurent@vivier.eu, wenmeng_zhang@c-sky.com Date: Wed, 11 Sep 2019 14:25:41 +0800 Message-Id: <1568183141-67641-18-git-send-email-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> References: <1568183141-67641-1-git-send-email-zhiwei_liu@c-sky.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 121.197.200.217 
Subject: [Qemu-devel] [PATCH v2 17/17] RISC-V: add vector extension permutation instructions X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, LIU Zhiwei Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: LIU Zhiwei Signed-off-by: LIU Zhiwei --- target/riscv/helper.h | 15 + target/riscv/insn32.decode | 16 + target/riscv/insn_trans/trans_rvv.inc.c | 15 + target/riscv/vector_helper.c | 1068 +++++++++++++++++++++++++++++++ 4 files changed, 1114 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 337ac2e..2d153ce 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -417,5 +417,20 @@ DEF_HELPER_3(vector_vid_v, void, env, i32, i32) DEF_HELPER_4(vector_vmpopc_m, void, env, i32, i32, i32) DEF_HELPER_4(vector_vmfirst_m, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vext_x_v, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vmv_s_x, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfmv_f_s, void, env, i32, i32, i32) +DEF_HELPER_4(vector_vfmv_s_f, void, env, i32, i32, i32) +DEF_HELPER_5(vector_vslideup_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vslideup_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vslide1up_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vslidedown_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vslidedown_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vslide1down_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vrgather_vv, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vrgather_vx, void, env, i32, i32, i32, i32) +DEF_HELPER_5(vector_vrgather_vi, void, env, i32, i32, i32, i32) +DEF_HELPER_4(vector_vcompress_vm, void, env, i32, i32, i32) + DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32) DEF_HELPER_4(vector_vsetvl, 
void, env, i32, i32, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 1de776b..c98915b 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -558,5 +558,21 @@ vmsif_m 010110 . ..... 00011 010 ..... 1010111 @r2_vm viota_m 010110 . ..... 10000 010 ..... 1010111 @r2_vm vid_v 010110 . 00000 10001 010 ..... 1010111 @r1_vm +vext_x_v 001100 1 ..... ..... 010 ..... 1010111 @r +vmv_s_x 001101 1 ..... ..... 110 ..... 1010111 @r +vfmv_f_s 001100 1 ..... ..... 001 ..... 1010111 @r +vfmv_s_f 001101 1 ..... ..... 101 ..... 1010111 @r +vslideup_vx 001110 . ..... ..... 100 ..... 1010111 @r_vm +vslideup_vi 001110 . ..... ..... 011 ..... 1010111 @r_vm +vslide1up_vx 001110 . ..... ..... 110 ..... 1010111 @r_vm +vslidedown_vx 001111 . ..... ..... 100 ..... 1010111 @r_vm +vslidedown_vi 001111 . ..... ..... 011 ..... 1010111 @r_vm +vslide1down_vx 001111 . ..... ..... 110 ..... 1010111 @r_vm +vrgather_vv 001100 . ..... ..... 000 ..... 1010111 @r_vm +vrgather_vx 001100 . ..... ..... 100 ..... 1010111 @r_vm +vrgather_vi 001100 . ..... ..... 011 ..... 1010111 @r_vm +vcompress_vm 010111 - ..... ..... 010 ..... 1010111 @r + + vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 111 ..... 
1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index 85e435a..1774d1f 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -471,5 +471,20 @@ GEN_VECTOR_R2_VM(vmsif_m) GEN_VECTOR_R2_VM(viota_m) GEN_VECTOR_R1_VM(vid_v) +GEN_VECTOR_R(vmv_s_x) +GEN_VECTOR_R(vfmv_f_s) +GEN_VECTOR_R(vfmv_s_f) +GEN_VECTOR_R(vext_x_v) +GEN_VECTOR_R_VM(vslideup_vx) +GEN_VECTOR_R_VM(vslideup_vi) +GEN_VECTOR_R_VM(vslide1up_vx) +GEN_VECTOR_R_VM(vslidedown_vx) +GEN_VECTOR_R_VM(vslidedown_vi) +GEN_VECTOR_R_VM(vslide1down_vx) +GEN_VECTOR_R_VM(vrgather_vv) +GEN_VECTOR_R_VM(vrgather_vx) +GEN_VECTOR_R_VM(vrgather_vi) +GEN_VECTOR_R(vcompress_vm) + GEN_VECTOR_R2_ZIMM(vsetvli) GEN_VECTOR_R(vsetvl) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 9e15df9..0a25996 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -1010,6 +1010,26 @@ static inline bool vector_overlap_dstgp_srcgp(int rd, int dlen, int rs, return false; } +/* fetch unsigned element by width */ +static inline uint64_t vector_get_iu_elem(CPURISCVState *env, uint32_t width, + uint32_t rs2, uint32_t index) +{ + uint64_t elem; + if (width == 8) { + elem = env->vfp.vreg[rs2].u8[index]; + } else if (width == 16) { + elem = env->vfp.vreg[rs2].u16[index]; + } else if (width == 32) { + elem = env->vfp.vreg[rs2].u32[index]; + } else if (width == 64) { + elem = env->vfp.vreg[rs2].u64[index]; + } else { /* the max of (XLEN, FLEN) is no bigger than 64 */ + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST); + return 0; + } + return elem; +} + static inline void vector_get_layout(CPURISCVState *env, int width, int lmul, int index, int *idx, int *pos) { @@ -24631,3 +24651,1051 @@ void VECTOR_HELPER(vid_v)(CPURISCVState *env, uint32_t vm, uint32_t rd) env->vfp.vstart = 0; return; } + +/* vfmv.f.s rd, vs2 # rd = vs2[0] (rs1=0) */ +void VECTOR_HELPER(vfmv_f_s)(CPURISCVState *env, uint32_t 
rs1, uint32_t rs2, + uint32_t rd) +{ + int width, flen; + uint64_t mask; + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + if (env->misa & RVD) { + flen = 8; + } else if (env->misa & RVF) { + flen = 4; + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + width = vector_get_width(env); + mask = (~((uint64_t)0)) << width; + + if (width == 8) { + env->fpr[rd] = (uint64_t)env->vfp.vreg[rs2].s8[0] | mask; + } else if (width == 16) { + env->fpr[rd] = (uint64_t)env->vfp.vreg[rs2].s16[0] | mask; + } else if (width == 32) { + env->fpr[rd] = (uint64_t)env->vfp.vreg[rs2].s32[0] | mask; + } else if (width == 64) { + if (flen == 4) { + env->fpr[rd] = env->vfp.vreg[rs2].s64[0] & 0xffffffff; + } else { + env->fpr[rd] = env->vfp.vreg[rs2].s64[0]; + } + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + env->vfp.vstart = 0; + return; +} + +/* vmv.s.x vd, rs1 # vd[0] = rs1 */ +void VECTOR_HELPER(vmv_s_x)(CPURISCVState *env, uint32_t rs1, uint32_t rs2, + uint32_t rd) +{ + int width; + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + if (env->vfp.vstart >= env->vfp.vl) { + return; + } + + memset(&env->vfp.vreg[rd].u8[0], 0, VLEN / 8); + width = vector_get_width(env); + + if (width == 8) { + env->vfp.vreg[rd].u8[0] = env->gpr[rs1]; + } else if (width == 16) { + env->vfp.vreg[rd].u16[0] = env->gpr[rs1]; + } else if (width == 32) { + env->vfp.vreg[rd].u32[0] = env->gpr[rs1]; + } else if (width == 64) { + env->vfp.vreg[rd].u64[0] = env->gpr[rs1]; + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + env->vfp.vstart = 0; + return; +} + +/* vfmv.s.f vd, rs1 # vd[0] = rs1 (vs2 = 0) */ +void VECTOR_HELPER(vfmv_s_f)(CPURISCVState *env, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, flen; + + if (vector_vtype_ill(env)) { + 
riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + if (env->vfp.vstart >= env->vfp.vl) { + return; + } + if (env->misa & RVD) { + flen = 8; + } else if (env->misa & RVF) { + flen = 4; + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + width = vector_get_width(env); + + if (width == 8) { + env->vfp.vreg[rd].u8[0] = env->fpr[rs1]; + } else if (width == 16) { + env->vfp.vreg[rd].u16[0] = env->fpr[rs1]; + } else if (width == 32) { + env->vfp.vreg[rd].u32[0] = env->fpr[rs1]; + } else if (width == 64) { + if (flen == 4) { /* 1-extended to FLEN bits */ + env->vfp.vreg[rd].u64[0] = (uint64_t)env->fpr[rs1] + | 0xffffffff00000000; + } else { + env->vfp.vreg[rd].u64[0] = env->fpr[rs1]; + } + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + env->vfp.vstart = 0; + return; +} + +/* vslideup.vx vd, vs2, rs1, vm # vd[i+rs1] = vs2[i] */ +void VECTOR_HELPER(vslideup_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax, offset; + int i, j, dest, src, k; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + offset = env->gpr[rs1]; + + if (offset < env->vfp.vstart) { + offset = env->vfp.vstart; + } + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src = rs2 + ((i - offset) / (VLEN / width)); + j = i % (VLEN / width); + k = (i - offset) % (VLEN / width); + if (i < offset) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + 
env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src].u8[k]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src].u16[k]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src].u32[k]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src].u64[k]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vslideup.vi vd, vs2, rs1, vm # vd[i+rs1] = vs2[i] */ +void VECTOR_HELPER(vslideup_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax, offset; + int i, j, dest, src, k; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + offset = rs1; + + if (offset < env->vfp.vstart) { + offset = env->vfp.vstart; + } + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src = rs2 + ((i - offset) / (VLEN / width)); + j = i % (VLEN / width); + k = (i - offset) % (VLEN / width); + if (i < offset) { + continue; + } else if (i < vl) { + if (width == 8) { + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src].u8[k]; + } + } else if (width == 16) { + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src].u16[k]; + } + } else if (width == 32) { + if 
(vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src].u32[k]; + } + } else if (width == 64) { + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src].u64[k]; + } + } else { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vslide1up.vx vd, vs2, rs1, vm # vd[0]=x[rs1], vd[i+1] = vs2[i] */ +void VECTOR_HELPER(vslide1up_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src, k; + uint64_t s1; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + s1 = env->gpr[rs1]; + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src = rs2 + ((i - 1) / (VLEN / width)); + j = i % (VLEN / width); + k = (i - 1) % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i == 0 && env->vfp.vstart == 0) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = s1; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = s1; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = s1; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = s1; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else if 
(i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src].u8[k]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src].u16[k]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src].u32[k]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src].u64[k]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vslidedown.vx vd, vs2, rs1, vm # vd[i] = vs2[i + rs1] */ +void VECTOR_HELPER(vslidedown_vx)(CPURISCVState *env, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax, offset; + int i, j, dest, src, k; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_force(vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + offset = env->gpr[rs1]; + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src = rs2 + ((i + offset) / (VLEN / width)); + j = i % (VLEN / width); + k = (i + offset) % (VLEN / width); + if (i < offset) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (i + offset < vlmax) { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src].u8[k]; + } else { + env->vfp.vreg[dest].u8[j] = 0; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (i + offset < vlmax) { + 
env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src].u16[k]; + } else { + env->vfp.vreg[dest].u16[j] = 0; + } + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (i + offset < vlmax) { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src].u32[k]; + } else { + env->vfp.vreg[dest].u32[j] = 0; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (i + offset < vlmax) { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src].u64[k]; + } else { + env->vfp.vreg[dest].u64[j] = 0; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +void VECTOR_HELPER(vslidedown_vi)(CPURISCVState *env, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax, offset; + int i, j, dest, src, k; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_force(vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + offset = rs1; + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src = rs2 + ((i + offset) / (VLEN / width)); + j = i % (VLEN / width); + k = (i + offset) % (VLEN / width); + if (i < offset) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (i + offset < vlmax) { + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src].u8[k]; + } else { + env->vfp.vreg[dest].u8[j] = 0; + } + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (i + offset < vlmax) { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src].u16[k]; + } else { + env->vfp.vreg[dest].u16[j] = 0; + } 
+ } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (i + offset < vlmax) { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src].u32[k]; + } else { + env->vfp.vreg[dest].u32[j] = 0; + } + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (i + offset < vlmax) { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src].u64[k]; + } else { + env->vfp.vreg[dest].u64[j] = 0; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vslide1down.vx vd, vs2, rs1, vm # vd[vl - 1]=x[rs1], vd[i] = vs2[i + 1] */ +void VECTOR_HELPER(vslide1down_vx)(CPURISCVState *env, uint32_t vm, + uint32_t rs1, uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src, k; + uint64_t s1; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) || vector_overlap_vm_force(vm, rd)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + s1 = env->gpr[rs1]; + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src = rs2 + ((i + 1) / (VLEN / width)); + j = i % (VLEN / width); + k = (i + 1) % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i == vl - 1 && i >= env->vfp.vstart) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = s1; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = s1; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = s1; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + 
env->vfp.vreg[dest].u64[j] = s1; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else if (i < vl - 1) { + switch (width) { + case 8: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src].u8[k]; + } + break; + case 16: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src].u16[k]; + } + break; + case 32: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src].u32[k]; + } + break; + case 64: + if (vector_elem_mask(env, vm, width, lmul, i)) { + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src].u64[k]; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* + * vcompress.vm vd, vs2, vs1 + * Compress into vd elements of vs2 where vs1 is enabled + */ +void VECTOR_HELPER(vcompress_vm)(CPURISCVState *env, uint32_t rs1, uint32_t rs2, + uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src; + uint32_t vd_idx, num = 0; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + if (vector_vtype_ill(env) + || vector_overlap_dstgp_srcgp(rd, lmul, rs1, 1) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + if (env->vfp.vstart != 0) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + /* zeroed all elements */ + for (i = 0; i < lmul; i++) { + memset(&env->vfp.vreg[rd + i].u64[0], 0, VLEN / 8); + } + + for (i = 0; i < vlmax; i++) { + dest = rd + (num / (VLEN / width)); + src = rs2 + (i / (VLEN / width)); + j 
= i % (VLEN / width); + vd_idx = num % (VLEN / width); + if (i < vl) { + switch (width) { + case 8: + if (vector_mask_reg(env, rs1, width, lmul, i)) { + env->vfp.vreg[dest].u8[vd_idx] = + env->vfp.vreg[src].u8[j]; + num++; + } + break; + case 16: + if (vector_mask_reg(env, rs1, width, lmul, i)) { + env->vfp.vreg[dest].u16[vd_idx] = + env->vfp.vreg[src].u16[j]; + num++; + } + break; + case 32: + if (vector_mask_reg(env, rs1, width, lmul, i)) { + env->vfp.vreg[dest].u32[vd_idx] = + env->vfp.vreg[src].u32[j]; + num++; + } + break; + case 64: + if (vector_mask_reg(env, rs1, width, lmul, i)) { + env->vfp.vreg[dest].u64[vd_idx] = + env->vfp.vreg[src].u64[j]; + num++; + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } + } + env->vfp.vstart = 0; + return; +} + +void VECTOR_HELPER(vext_x_v)(CPURISCVState *env, uint32_t rs1, uint32_t rs2, + uint32_t rd) +{ + int width; + uint64_t elem; + target_ulong index = env->gpr[rs1]; + + if (vector_vtype_ill(env)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + width = vector_get_width(env); + + elem = vector_get_iu_elem(env, width, rs2, index); + if (index >= VLEN / width) { /* index is too big */ + env->gpr[rd] = 0; + } else { + env->gpr[rd] = elem; + } + env->vfp.vstart = 0; + return; +} + +/* + * vrgather.vv vd, vs2, vs1, vm # + * vd[i] = (vs1[i] >= VLMAX) ? 
0 : vs2[vs1[i]]; + */ +void VECTOR_HELPER(vrgather_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src, src1; + uint32_t index; + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, lmul, rs1, lmul) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + + if (env->vfp.vstart >= vl) { + return; + } + + vector_lmul_check_reg(env, lmul, rs1, false); + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src1 = rs1 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + index = env->vfp.vreg[src1].u8[j]; + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (index >= vlmax) { + env->vfp.vreg[dest].u8[j] = 0; + } else { + src = rs2 + (index / (VLEN / width)); + index = index % (VLEN / width); + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src].u8[index]; + } + } + break; + case 16: + index = env->vfp.vreg[src1].u16[j]; + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (index >= vlmax) { + env->vfp.vreg[dest].u16[j] = 0; + } else { + src = rs2 + (index / (VLEN / width)); + index = index % (VLEN / width); + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src].u16[index]; + } + } + break; + case 32: + index = env->vfp.vreg[src1].u32[j]; + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (index >= vlmax) { + env->vfp.vreg[dest].u32[j] = 0; + } else { + src = rs2 + (index / (VLEN / width)); + index = index % (VLEN / width); + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src].u32[index]; + } + } + break; + case 64: + index = 
env->vfp.vreg[src1].u64[j]; + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (index >= vlmax) { + env->vfp.vreg[dest].u64[j] = 0; + } else { + src = rs2 + (index / (VLEN / width)); + index = index % (VLEN / width); + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src].u64[index]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vrgather.vx vd, vs2, rs1, vm # vd[i] = (x[rs1] >= VLMAX) ? 0 : vs2[rs1] */ +void VECTOR_HELPER(vrgather_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src; + uint32_t index; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + index = env->gpr[rs1]; + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (index >= vlmax) { + env->vfp.vreg[dest].u8[j] = 0; + } else { + src = rs2 + (index / (VLEN / width)); + index = index % (VLEN / width); + env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src].u8[index]; + } + } + break; + case 16: + index = env->gpr[rs1]; + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (index >= vlmax) { + env->vfp.vreg[dest].u16[j] = 0; + } else { + src = rs2 + (index / (VLEN / width)); + index = index % (VLEN / width); + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src].u16[index]; + 
} + } + break; + case 32: + index = env->gpr[rs1]; + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (index >= vlmax) { + env->vfp.vreg[dest].u32[j] = 0; + } else { + src = rs2 + (index / (VLEN / width)); + index = index % (VLEN / width); + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src].u32[index]; + } + } + break; + case 64: + index = env->gpr[rs1]; + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (index >= vlmax) { + env->vfp.vreg[dest].u64[j] = 0; + } else { + src = rs2 + (index / (VLEN / width)); + index = index % (VLEN / width); + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src].u64[index]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} + +/* vrgather.vi vd, vs2, imm, vm # vd[i] = (imm >= VLMAX) ? 0 : vs2[imm] */ +void VECTOR_HELPER(vrgather_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1, + uint32_t rs2, uint32_t rd) +{ + int width, lmul, vl, vlmax; + int i, j, dest, src; + uint32_t index; + + lmul = vector_get_lmul(env); + vl = env->vfp.vl; + + if (vector_vtype_ill(env) + || vector_overlap_vm_force(vm, rd) + || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) { + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + vector_lmul_check_reg(env, lmul, rs2, false); + vector_lmul_check_reg(env, lmul, rd, false); + + if (env->vfp.vstart >= vl) { + return; + } + + width = vector_get_width(env); + vlmax = vector_get_vlmax(env); + + for (i = 0; i < vlmax; i++) { + dest = rd + (i / (VLEN / width)); + src = rs2 + (i / (VLEN / width)); + j = i % (VLEN / width); + if (i < env->vfp.vstart) { + continue; + } else if (i < vl) { + switch (width) { + case 8: + index = rs1; + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (index >= vlmax) { + env->vfp.vreg[dest].u8[j] = 0; + } else { + src = rs2 + (index / (VLEN / width)); + index = index % (VLEN / width); + 
env->vfp.vreg[dest].u8[j] = + env->vfp.vreg[src].u8[index]; + } + } + break; + case 16: + index = rs1; + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (index >= vlmax) { + env->vfp.vreg[dest].u16[j] = 0; + } else { + src = rs2 + (index / (VLEN / width)); + index = index % (VLEN / width); + env->vfp.vreg[dest].u16[j] = + env->vfp.vreg[src].u16[index]; + } + } + break; + case 32: + index = rs1; + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (index >= vlmax) { + env->vfp.vreg[dest].u32[j] = 0; + } else { + src = rs2 + (index / (VLEN / width)); + index = index % (VLEN / width); + env->vfp.vreg[dest].u32[j] = + env->vfp.vreg[src].u32[index]; + } + } + break; + case 64: + index = rs1; + if (vector_elem_mask(env, vm, width, lmul, i)) { + if (index >= vlmax) { + env->vfp.vreg[dest].u64[j] = 0; + } else { + src = rs2 + (index / (VLEN / width)); + index = index % (VLEN / width); + env->vfp.vreg[dest].u64[j] = + env->vfp.vreg[src].u64[index]; + } + } + break; + default: + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()); + return; + } + } else { + vector_tail_common(env, dest, j, width); + } + } + env->vfp.vstart = 0; + return; +} +