From patchwork Sun Mar 10 11:53:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 13588061 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B9F42C54E66 for ; Sun, 10 Mar 2024 11:55:12 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rjHkK-0007j8-12; Sun, 10 Mar 2024 07:53:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rjHk9-0007ga-2u for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:31 -0400 Received: from mail-pg1-x535.google.com ([2607:f8b0:4864:20::535]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rjHk7-00041a-Nr for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:28 -0400 Received: by mail-pg1-x535.google.com with SMTP id 41be03b00d2f7-517ab9a4a13so3216468a12.1 for ; Sun, 10 Mar 2024 04:53:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; t=1710071605; x=1710676405; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=8PUZYN/jxC55WgIbgr/YDIPnxwzEfC4uN3+8U/WL+Ts=; b=DJuFY74Ib7pRxCJOJmYWV0G/GoToKoLv9rCnKlYvsnUrzFh8qG6RiGg4okZ1VF4CtE 4N3dTBxI02yD2JkTUbGKIajPhvjcsMHbKNOiClh39rqyUoejNf3+LDkD4XgdYpNR0+Ef rX2ZlIIjJgMzeHmCTxjFdkOsSedYPYEuZqRtfnfiic2oIaOnMQnH0IMipvcFQHYrBh5y yLW1fvYHRn08GTa3Rc2KZISyYQZ9YRVpuvk4lbPv74bj0iJJuWJ7aWMvtVMnUg5FBdS0 g9sUnkKGRjQhtMYq4gH2TChBOmC6t5Y8lIdw0Gp7JH6I6VVs+au/alXyUKBLV4qV1ERI CfOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710071605; x=1710676405; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8PUZYN/jxC55WgIbgr/YDIPnxwzEfC4uN3+8U/WL+Ts=; b=oBb1Vplp6Vjm5IB8TSxA/KaAchxNfm0uYIXUBaDkN4JAQUl8RjPsLWE0wNjt9l9uID /Y/7LsYFqiTRtKtdV287z2SjHEM0q/E4Dluk9ANRIKjDky9qEE/ffPFlJTwP+oE/tUHt 6ckrHMrB6z/vqax/5MoFPWBHs6jwF/aizRI1KjfWxpW3moc0iptpkc5sUWaIrgqJgAJ5 BgEJef0Ju9zXAcCz8qKUAjg2ivUDyH1DCQcv4XElTJH4rUyo14dubjHSm8pX/qrtXg0r QALIvXVgq9Dc19/vsVRjGVWqQwDgBGibnyncqCdz5yziy891pyKIAZITJJA4EHoxB5s1 /9wg== X-Gm-Message-State: AOJu0Yx0+GITEJIJQSU0sfbRJtCgmSpEcjFHqqkl05GQUajSaMr/40+f TeD1/0V1tbEDvOYCu22Yh+qKp/mL2L5/DtMvSBd8jDpzJa4jyXwUAlpE0TTZF+ao2C3rEv0XEcE W X-Google-Smtp-Source: AGHT+IG4KfrazGU/f//rbpQyy9ewTUhdUY8aWZmwDc/x1u4CMpXVcc+gFv/lSI1EDGDRHxQP913xOQ== X-Received: by 2002:a05:6a20:e608:b0:1a3:1349:8489 with SMTP id my8-20020a056a20e60800b001a313498489mr1092223pzb.30.1710071605583; Sun, 10 Mar 2024 04:53:25 -0700 (PDT) Received: from grind.. ([177.94.15.159]) by smtp.gmail.com with ESMTPSA id g22-20020aa78196000000b006e647059cccsm2449253pfi.33.2024.03.10.04.53.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 10 Mar 2024 04:53:25 -0700 (PDT) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, bmeng@tinylab.org, liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com, palmer@rivosinc.com, richard.henderson@linaro.org, philmd@linaro.org, Daniel Henrique Barboza Subject: [PATCH v10 01/10] target/riscv/vector_helper.c: set vstart = 0 in GEN_VEXT_VSLIDEUP_VX() Date: Sun, 10 Mar 2024 08:53:05 -0300 Message-ID: <20240310115315.187283-2-dbarboza@ventanamicro.com> X-Mailer: git-send-email 2.43.2 In-Reply-To: <20240310115315.187283-1-dbarboza@ventanamicro.com> References: <20240310115315.187283-1-dbarboza@ventanamicro.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::535; envelope-from=dbarboza@ventanamicro.com; helo=mail-pg1-x535.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org The helper isn't setting env->vstart = 0 after its execution, as it is expected from every vector instruction that completes successfully. Signed-off-by: Daniel Henrique Barboza Reviewed-by: Richard Henderson Reviewed-by: Alistair Francis --- target/riscv/vector_helper.c | 1 + 1 file changed, 1 insertion(+) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index fe56c007d5..ca79571ae2 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -4781,6 +4781,7 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ } \ *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i - offset)); \ } \ + env->vstart = 0; \ /* set tail elements to 1s */ \ vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ } From patchwork Sun Mar 10 11:53:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 13588059 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CBC98C54E58 for ; Sun, 10 Mar 2024 11:55:04 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rjHkO-0007kd-3A; Sun, 10 Mar 2024 07:53:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rjHkD-0007hg-BV for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:33 -0400 Received: from mail-oi1-x230.google.com ([2607:f8b0:4864:20::230]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rjHkB-00041w-Lx for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:33 -0400 Received: by mail-oi1-x230.google.com with SMTP id 5614622812f47-3c1a2f7e302so1543659b6e.0 for ; Sun, 10 Mar 2024 04:53:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; t=1710071609; x=1710676409; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=xXFJKr57FFL8Naa8L/xp3qZMbV9BR2rU2iKRNY7Q9BQ=; b=WpenGyZy+VhlC3jMQdOrqtBLNlCXU/2nwgghV/EYElOaEWwqVuU7O7LPgT6FGW4wTI smQnlHcCg+Bvy9jqYjU2KxvjglnvOQ1KWhaMuvCh3CLc2QquKmGahYJ/PMlR+ikhN+6U lfpWaBP9fVJ3Ie0hHGpW5KFot4qn6SLvybJSMvQgtMq07ziL2qZiVNZVtpkCUMHUlJ2p VdyWXvm3ntcTrwXEZDGT/oS5p00BwMrgg4XPD+YVvrEYdMT0iLi49PKgJkLC2dUTVj2I Es30HFTSdPCeOYDonUvloOjc6KjcguPYOcYaYM6L5uWIocUUecvoNFZ85Hwtu7nnsyo6 C6GA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710071609; x=1710676409; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=xXFJKr57FFL8Naa8L/xp3qZMbV9BR2rU2iKRNY7Q9BQ=; b=KDxzLBItJnHju/KK45xM4fUhOdv3RNsFf6Tijet1aqY2cJKiBQu+4tqHVXbDf8FjQy sZaxPCNV0kttWbGfZ8oJdlSZQCCbMruQTKprJSAAHxwmSAmwsGfQ1GpyFOzVvsgSsIJW pOEs1iMA8WkI6j7gPg1WDiFBguUCWWfcGY9sqLNk5GL6MGwqE5WP2ynYHlKo+it7xAOi jtZwvLwaP/2PeDjqcCHSRLmvIw/1lnKmCf7gHF6PuIHJyQlk5V+Z6sM7IDCsV6UoMi0U wegFBvzoULcIuHQCRt2v87ZzEFIPqL+6pE5CMgEuSTxlb0iFaqSgb8K/YI4agvHWjdkN hBYw== X-Gm-Message-State: AOJu0Yxy/5mlt5ZBx55kiIP2KLdfOUTZl9TO7sNT0b7luk6ICIwuY6/7 ctWK9RqZQ4atZV8RCJMEOiE9f9cviFaL8l8KY1aP4OlBPDm7X7KdGu75H/EP2xXNwM70U6RBsoF J X-Google-Smtp-Source: AGHT+IHI0zB2FpS2NSwpVxHB+RfqIX79SdIt09mCh4aOpPyEBgNWRJRKiMWHCVlUl5CKiBuwWdTufg== X-Received: by 2002:a05:6808:13c7:b0:3c2:3b86:b245 with SMTP id d7-20020a05680813c700b003c23b86b245mr4905122oiw.8.1710071608969; Sun, 10 Mar 2024 04:53:28 -0700 (PDT) Received: from grind.. ([177.94.15.159]) by smtp.gmail.com with ESMTPSA id g22-20020aa78196000000b006e647059cccsm2449253pfi.33.2024.03.10.04.53.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 10 Mar 2024 04:53:28 -0700 (PDT) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, bmeng@tinylab.org, liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com, palmer@rivosinc.com, richard.henderson@linaro.org, philmd@linaro.org, Daniel Henrique Barboza Subject: [PATCH v10 02/10] target/riscv: handle vstart >= vl in vext_set_tail_elems_1s() Date: Sun, 10 Mar 2024 08:53:06 -0300 Message-ID: <20240310115315.187283-3-dbarboza@ventanamicro.com> X-Mailer: git-send-email 2.43.2 In-Reply-To: <20240310115315.187283-1-dbarboza@ventanamicro.com> References: <20240310115315.187283-1-dbarboza@ventanamicro.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::230; envelope-from=dbarboza@ventanamicro.com; helo=mail-oi1-x230.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org We're going to make changes that will required each helper to be responsible for the 'vstart' management, i.e. we will relieve the 'vstart < vl' assumption that helpers have today. To do that we'll need to deal with how we're updating tail elements first. We can't update them if vstart >= vl, but at this moment we're not guarding for it. We have the vext_set_tail_elems_1s() helper to update tail elements. Change it to accept an 'env' pointer, where we can read both vstart and vl, and make it a no-op if vstart >= vl. Note that callers will need to set env->start = 0 *after* the helper from now on. The exception are three helpers: vext_ldst_stride(), vext_ldst_us() and vext_ldst_index(). They are are incrementing env->vstart during execution and will end up with env->vstart = vl when tail updating. For these cases only, do an early check and exit if vstart >= vl, and set env->vstart = 0 before updating the tail. For everyone else we'll do vext_set_tail_elems_1s() and then clear env->vstart. This is the case of vext_ldff() that is already using set_tail_elems_1s(), and will be the case for the rest after the next patches. Let's also simplify the API a little by removing the 'nf' argument since it can be derived from 'desc'. Signed-off-by: Daniel Henrique Barboza Reviewed-by: Richard Henderson --- target/riscv/vector_helper.c | 59 ++++++++++++++++++++++++++++++------ 1 file changed, 49 insertions(+), 10 deletions(-) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index ca79571ae2..a3b496b6e9 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -174,19 +174,32 @@ GEN_VEXT_ST_ELEM(ste_h, int16_t, H2, stw) GEN_VEXT_ST_ELEM(ste_w, int32_t, H4, stl) GEN_VEXT_ST_ELEM(ste_d, int64_t, H8, stq) -static void vext_set_tail_elems_1s(target_ulong vl, void *vd, - uint32_t desc, uint32_t nf, - uint32_t esz, uint32_t max_elems) +/* + * This function is sensitive to env->vstart changes since + * it'll be a no-op if vstart >= vl. Do not clear env->vstart + * before calling it unless you're certain that vstart < vl. + */ +static void vext_set_tail_elems_1s(CPURISCVState *env, void *vd, + uint32_t desc, uint32_t esz, + uint32_t max_elems) { uint32_t vta = vext_vta(desc); + uint32_t nf = vext_nf(desc); int k; - if (vta == 0) { + /* + * Section 5.4 of the RVV spec mentions: + * "When vstart ≥ vl, there are no body elements, and no + * elements are updated in any destination vector register + * group, including that no tail elements are updated + * with agnostic values." + */ + if (vta == 0 || env->vstart >= env->vl) { return; } for (k = 0; k < nf; ++k) { - vext_set_elems_1s(vd, vta, (k * max_elems + vl) * esz, + vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz, (k * max_elems + max_elems) * esz); } } @@ -207,6 +220,11 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base, uint32_t esz = 1 << log2_esz; uint32_t vma = vext_vma(desc); + if (env->vstart >= env->vl) { + env->vstart = 0; + return; + } + for (i = env->vstart; i < env->vl; i++, env->vstart++) { k = 0; while (k < nf) { @@ -222,9 +240,13 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base, k++; } } + /* + * Set vstart before tail update - vstart changed during + * execution and we already checked that vstart < vl. + */ env->vstart = 0; - vext_set_tail_elems_1s(env->vl, vd, desc, nf, esz, max_elems); + vext_set_tail_elems_1s(env, vd, desc, esz, max_elems); } #define GEN_VEXT_LD_STRIDE(NAME, ETYPE, LOAD_FN) \ @@ -272,6 +294,11 @@ vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc, uint32_t max_elems = vext_max_elems(desc, log2_esz); uint32_t esz = 1 << log2_esz; + if (env->vstart >= env->vl) { + env->vstart = 0; + return; + } + /* load bytes from guest memory */ for (i = env->vstart; i < evl; i++, env->vstart++) { k = 0; @@ -281,9 +308,13 @@ vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc, k++; } } + /* + * Set vstart before tail update - vstart changed during + * execution and we already checked that vstart < vl. + */ env->vstart = 0; - vext_set_tail_elems_1s(evl, vd, desc, nf, esz, max_elems); + vext_set_tail_elems_1s(env, vd, desc, esz, max_elems); } /* @@ -386,6 +417,11 @@ vext_ldst_index(void *vd, void *v0, target_ulong base, uint32_t esz = 1 << log2_esz; uint32_t vma = vext_vma(desc); + if (env->vstart >= env->vl) { + env->vstart = 0; + return; + } + /* load bytes from guest memory */ for (i = env->vstart; i < env->vl; i++, env->vstart++) { k = 0; @@ -402,9 +438,13 @@ vext_ldst_index(void *vd, void *v0, target_ulong base, k++; } } + /* + * Set vstart before tail update - vstart changed during + * execution and we already checked that vstart < vl. + */ env->vstart = 0; - vext_set_tail_elems_1s(env->vl, vd, desc, nf, esz, max_elems); + vext_set_tail_elems_1s(env, vd, desc, esz, max_elems); } #define GEN_VEXT_LD_INDEX(NAME, ETYPE, INDEX_FN, LOAD_FN) \ @@ -532,9 +572,8 @@ ProbeSuccess: k++; } } + vext_set_tail_elems_1s(env, vd, desc, esz, max_elems); env->vstart = 0; - - vext_set_tail_elems_1s(env->vl, vd, desc, nf, esz, max_elems); } #define GEN_VEXT_LDFF(NAME, ETYPE, LOAD_FN) \ From patchwork Sun Mar 10 11:53:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 13588054 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C1D56C54E58 for ; Sun, 10 Mar 2024 11:54:32 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rjHkX-0007no-Kp; Sun, 10 Mar 2024 07:53:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rjHkG-0007ij-3h for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:37 -0400 Received: from mail-pg1-x52d.google.com ([2607:f8b0:4864:20::52d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rjHkE-00042S-Iu for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:35 -0400 Received: by mail-pg1-x52d.google.com with SMTP id 41be03b00d2f7-5cfd95130c6so2514628a12.1 for ; Sun, 10 Mar 2024 04:53:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; t=1710071612; x=1710676412; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=dkHaxAaccqLTSQvWjOIYNF50FjoIR0eELlfVCBEjemM=; b=YnBAqt6chxfjRtD4WWayBw70WWQV7LQYcQcPHmNVEmgjc+Tt+KXfCqCH7Ndyu9NNis XCbWGtiIw4hDnEkqVsd3b8EL88SBBzJZ7WyDU3lJyGw5STdkRvGptXlpEetxMCl+y9IV XcVzKZfvAT+Du1Db9vq1JcVhmPO4gWBDu5ew+D6oVk+S8n5IS1qfBkJMcG12ntw0jQIl gbaTvIdck8Zvp92xrNcObmk132fjlh7IqAx/paYUIoviHXIJiJDnlhlgV1Hxv0JRAdpF mGitlqsglsp8XjABFyCag9bhM+WJIFEAaW9/mhvBYf4I8Ps2naeeF6HWMeNT+Xw/3Fj0 2giQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710071612; x=1710676412; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=dkHaxAaccqLTSQvWjOIYNF50FjoIR0eELlfVCBEjemM=; b=qcfaAVvhOA0i+cxuE5Kf0801HBXO7SqGvAgcV1tlH0uZFQZjrtUQIUR7wSEifHeCPy 1lu0hkCJs/JBNbH/M4WLsa8c7WO46/98HkVIagrOtn7VYCIgTyztrmRNoWy/0LeOMUH2 QVevzB8dXCjWqHOZ7mzbqpkwJUrY+dZMZayut5sRmzfScpt900g9uA39oeZw/KIVsZkD IRi2dehpqbgj8+MmoZEy5I87b35EijL7On2BvvADsz+7zm5f3hcU1c9nf4g5+IvuLhqc e8Ai0rwE8A4tSIRsNwS7Jn1XFTXxSP7rhv7fbO4e2QpbY4/8RVwrkRyeQMlHo5eNAiQ4 JRHQ== X-Gm-Message-State: AOJu0Yzy7qhSrkB3C4+VhrJjccOnIr6htGrC87CU/PR25Z4kZQaL6xLs 50tUTQWtdhr7aXcjsSZXNGCcLpQOREi5zEI9GMnvxm1jRpuI1sfeQqtsS+uFkcBQ6MBbvx8PMxv G X-Google-Smtp-Source: AGHT+IEK82fIquQPB4PdrHH1GwDnGYrjws3X/PLHcGNNfwwo2+xWZfz4wYMEPMoKoXP6d9of2QUYUg== X-Received: by 2002:a05:6a00:2291:b0:6e5:7480:e5b9 with SMTP id f17-20020a056a00229100b006e57480e5b9mr4116557pfe.9.1710071612270; Sun, 10 Mar 2024 04:53:32 -0700 (PDT) Received: from grind.. ([177.94.15.159]) by smtp.gmail.com with ESMTPSA id g22-20020aa78196000000b006e647059cccsm2449253pfi.33.2024.03.10.04.53.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 10 Mar 2024 04:53:31 -0700 (PDT) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, bmeng@tinylab.org, liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com, palmer@rivosinc.com, richard.henderson@linaro.org, philmd@linaro.org, Daniel Henrique Barboza Subject: [PATCH v10 03/10] target/riscv/vector_helper.c: do vstart=0 after updating tail Date: Sun, 10 Mar 2024 08:53:07 -0300 Message-ID: <20240310115315.187283-4-dbarboza@ventanamicro.com> X-Mailer: git-send-email 2.43.2 In-Reply-To: <20240310115315.187283-1-dbarboza@ventanamicro.com> References: <20240310115315.187283-1-dbarboza@ventanamicro.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52d; envelope-from=dbarboza@ventanamicro.com; helo=mail-pg1-x52d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org vext_vv_rm_1() and vext_vv_rm_2() are setting vstart = 0 before their respective callers (vext_vv_rm_2 and vext_vx_rm_2) update the tail elements. This is benign now, but we'll convert the tail updates to use vext_set_tail_elems_1s(), and this function is sensitive to vstart changes. Do vstart = 0 after vext_set_elems_1s() now to make the conversion easier. Signed-off-by: Daniel Henrique Barboza Reviewed-by: Richard Henderson --- target/riscv/vector_helper.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index a3b496b6e9..86b990ce03 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -1962,7 +1962,6 @@ vext_vv_rm_1(void *vd, void *v0, void *vs1, void *vs2, } fn(vd, vs1, vs2, i, env, vxrm); } - env->vstart = 0; } static inline void @@ -1997,6 +1996,7 @@ vext_vv_rm_2(void *vd, void *v0, void *vs1, void *vs2, } /* set tail elements to 1s */ vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); + env->vstart = 0; } /* generate helpers for fixed point instructions with OPIVV format */ @@ -2087,7 +2087,6 @@ vext_vx_rm_1(void *vd, void *v0, target_long s1, void *vs2, } fn(vd, s1, vs2, i, env, vxrm); } - env->vstart = 0; } static inline void @@ -2122,6 +2121,7 @@ vext_vx_rm_2(void *vd, void *v0, target_long s1, void *vs2, } /* set tail elements to 1s */ vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); + env->vstart = 0; } /* generate helpers for fixed point instructions with OPIVX format */ From patchwork Sun Mar 10 11:53:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 13588060 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 37B0DC54E58 for ; Sun, 10 Mar 2024 11:55:10 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rjHkb-0007qX-70; Sun, 10 Mar 2024 07:53:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rjHkM-0007kh-FL for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:43 -0400 Received: from mail-pf1-x435.google.com ([2607:f8b0:4864:20::435]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rjHkI-000431-1k for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:42 -0400 Received: by mail-pf1-x435.google.com with SMTP id d2e1a72fcca58-6e5a232fe80so2376164b3a.0 for ; Sun, 10 Mar 2024 04:53:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; t=1710071616; x=1710676416; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=5sKaHkB6QQZBCC3OefEpCLRm3FHr/IfzE1/AjvYolwU=; b=Nf/+B1dyolEGelbXvMBiLqrstyPRjilTkW1mrErWjq+OnVCwcEpZcbpLOQEiiTa9ee IeyT9TC6qq9PqcNeleV84kKd7yWiVJGtKOtNDDeu21rkPI7GpOaUOqmfy5uziQlFuT8j oeHBDuSVItywJeZVM0iI11i0C61GQDdhn+Coxs9QJwjZ1t9KcTYzgpVnCQ8u10IWjto/ s/DSUSBi1MM2164+f5h1tllk+6oXgRq/JZHVub0q5vOeBmyWAKXh2H5yxAo7DV+yDaaQ B61NsxCvWyrOvrmx6EI/uufiUjHbeIw4QNnNCCJc2e/NXXBUpbJckHx6pfO007tfSId4 FjEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710071616; x=1710676416; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=5sKaHkB6QQZBCC3OefEpCLRm3FHr/IfzE1/AjvYolwU=; b=hRf6hG6rB36eYeKLilZAZ7BSgHtEV5cHQLFA6pCoPszJa2T/LdwaaZcdxgqYdCQcMS DPPQr+CVVyQFtKInaYPgkepYLRtq84YaEv/7YG1dtNx8FJ5jeOQXt8ZBM5Yksky4GfZd oz/oF0cifRjjz4KAtKuYhxDxEBAOt+UW0n/zekP+acGSqPI0wZ44nP9aedvzK//DhGbl Wwvh9JA4qxZ0sY+DAh52umoo7laPnZ5nmeIhO03lI/gGiJ12IXWmSA9ixm3nR0yrHOuk 4eAONWBLKQsYIarjydpY8E1zHBKbADOZV3OCF7R42T09rjGfvKWqTAOZuPYVSLb3w2/R Huzg== X-Gm-Message-State: AOJu0Yy7rVf0CPeZoUrX1DtmyfrUtEzW3xGPxEfMwO6KaaVLmdSB80st O5oM2V4C1iRxmRgmD7A3a1ufLf2wowMQ2CBvtFMcrNsRegCcEbTpGPA+EIzEbj0SVPPRxkTBKzE v X-Google-Smtp-Source: AGHT+IGHmwCl6+CXsvrjvJPpoGL0LhIc5dPRvZnGIDnnMbi/Ya3QdTkmuNMvu/juhxILYwhOuLcNsA== X-Received: by 2002:aa7:88ce:0:b0:6e6:75d8:3d19 with SMTP id k14-20020aa788ce000000b006e675d83d19mr6143193pff.8.1710071615723; Sun, 10 Mar 2024 04:53:35 -0700 (PDT) Received: from grind.. ([177.94.15.159]) by smtp.gmail.com with ESMTPSA id g22-20020aa78196000000b006e647059cccsm2449253pfi.33.2024.03.10.04.53.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 10 Mar 2024 04:53:35 -0700 (PDT) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, bmeng@tinylab.org, liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com, palmer@rivosinc.com, richard.henderson@linaro.org, philmd@linaro.org, Daniel Henrique Barboza Subject: [PATCH v10 04/10] target/riscv/vector_helper.c: update tail with vext_set_tail_elems_1s() Date: Sun, 10 Mar 2024 08:53:08 -0300 Message-ID: <20240310115315.187283-5-dbarboza@ventanamicro.com> X-Mailer: git-send-email 2.43.2 In-Reply-To: <20240310115315.187283-1-dbarboza@ventanamicro.com> References: <20240310115315.187283-1-dbarboza@ventanamicro.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::435; envelope-from=dbarboza@ventanamicro.com; helo=mail-pf1-x435.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Change all code that updates tail elems to use vext_set_tail_elems_1s() instead of vext_set_elems_1s(). Setting 'env->vstart=0' needs to be the very last thing a helper does because env->vstart is being checked by vext_set_tail_elems_1s(). A side effect of this change is that a lot of 'vta' local variables got unused. The reason is that 'vta' was being fetched to be used with vext_set_elems_1s() but vext_set_tail_elems_1s() doesn't use it - 'vta' is retrieve inside the helper using 'desc'. Signed-off-by: Daniel Henrique Barboza Reviewed-by: Richard Henderson --- target/riscv/vector_helper.c | 130 ++++++++++++++--------------------- 1 file changed, 52 insertions(+), 78 deletions(-) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 86b990ce03..b174ddeae8 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -913,7 +913,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = \ vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t i; \ \ for (i = env->vstart; i < vl; i++) { \ @@ -923,9 +922,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ \ *((ETYPE *)vd + H(i)) = DO_OP(s2, s1, carry); \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VEXT_VADC_VVM(vadc_vvm_b, uint8_t, H1, DO_VADC) @@ -945,7 +944,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t i; \ \ for (i = env->vstart; i < vl; i++) { \ @@ -954,9 +952,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ \ *((ETYPE *)vd + H(i)) = DO_OP(s2, (ETYPE)(target_long)s1, carry);\ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VEXT_VADC_VXM(vadc_vxm_b, uint8_t, H1, DO_VADC) @@ -1113,7 +1111,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(TS1); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ uint32_t i; \ \ @@ -1127,9 +1124,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ TS2 s2 = *((TS2 *)vs2 + HS2(i)); \ *((TS1 *)vd + HS1(i)) = OP(s2, s1 & MASK); \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VEXT_SHIFT_VV(vsll_vv_b, uint8_t, uint8_t, H1, H1, DO_SLL, 0x7) @@ -1160,7 +1157,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ uint32_t esz = sizeof(TD); \ uint32_t total_elems = \ vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ uint32_t i; \ \ @@ -1174,9 +1170,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ TS2 s2 = *((TS2 *)vs2 + HS2(i)); \ *((TD *)vd + HD(i)) = OP(s2, s1 & MASK); \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz);\ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems);\ + env->vstart = 0; \ } GEN_VEXT_SHIFT_VX(vsll_vx_b, uint8_t, int8_t, H1, H1, DO_SLL, 0x7) @@ -1835,16 +1831,15 @@ void HELPER(NAME)(void *vd, void *vs1, CPURISCVState *env, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t i; \ \ for (i = env->vstart; i < vl; i++) { \ ETYPE s1 = *((ETYPE *)vs1 + H(i)); \ *((ETYPE *)vd + H(i)) = s1; \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VEXT_VMV_VV(vmv_v_v_b, int8_t, H1) @@ -1859,15 +1854,14 @@ void HELPER(NAME)(void *vd, uint64_t s1, CPURISCVState *env, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t i; \ \ for (i = env->vstart; i < vl; i++) { \ *((ETYPE *)vd + H(i)) = (ETYPE)s1; \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VEXT_VMV_VX(vmv_v_x_b, int8_t, H1) @@ -1882,16 +1876,15 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t i; \ \ for (i = env->vstart; i < vl; i++) { \ ETYPE *vt = (!vext_elem_mask(v0, i) ? vs2 : vs1); \ *((ETYPE *)vd + H(i)) = *(vt + H(i)); \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VEXT_VMERGE_VV(vmerge_vvm_b, int8_t, H1) @@ -1906,7 +1899,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t i; \ \ for (i = env->vstart; i < vl; i++) { \ @@ -1915,9 +1907,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ (ETYPE)(target_long)s1); \ *((ETYPE *)vd + H(i)) = d; \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VEXT_VMERGE_VX(vmerge_vxm_b, int8_t, H1) @@ -1973,7 +1965,6 @@ vext_vv_rm_2(void *vd, void *v0, void *vs1, void *vs2, uint32_t vm = vext_vm(desc); uint32_t vl = env->vl; uint32_t total_elems = vext_get_total_elems(env, desc, esz); - uint32_t vta = vext_vta(desc); uint32_t vma = vext_vma(desc); switch (env->vxrm) { @@ -1995,7 +1986,7 @@ vext_vv_rm_2(void *vd, void *v0, void *vs1, void *vs2, break; } /* set tail elements to 1s */ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); env->vstart = 0; } @@ -2098,7 +2089,6 @@ vext_vx_rm_2(void *vd, void *v0, target_long s1, void *vs2, uint32_t vm = vext_vm(desc); uint32_t vl = env->vl; uint32_t total_elems = vext_get_total_elems(env, desc, esz); - uint32_t vta = vext_vta(desc); uint32_t vma = vext_vma(desc); switch (env->vxrm) { @@ -2120,7 +2110,7 @@ vext_vx_rm_2(void *vd, void *v0, target_long s1, void *vs2, break; } /* set tail elements to 1s */ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); env->vstart = 0; } @@ -2872,7 +2862,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ uint32_t vl = env->vl; \ uint32_t total_elems = \ vext_get_total_elems(env, desc, ESZ); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ uint32_t i; \ \ @@ -2885,10 +2874,10 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ } \ do_##NAME(vd, vs1, vs2, i, env); \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * ESZ, \ - total_elems * ESZ); \ + vext_set_tail_elems_1s(env, vd, desc, ESZ, \ + total_elems); \ + env->vstart = 0; \ } RVVCALL(OPFVV2, vfadd_vv_h, OP_UUU_H, H2, H2, H2, float16_add) @@ -2915,7 +2904,6 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, \ uint32_t vl = env->vl; \ uint32_t total_elems = \ vext_get_total_elems(env, desc, ESZ); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ uint32_t i; \ \ @@ -2928,10 +2916,10 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, \ } \ do_##NAME(vd, s1, vs2, i, env); \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * ESZ, \ - total_elems * ESZ); \ + vext_set_tail_elems_1s(env, vd, desc, ESZ, \ + total_elems); \ + env->vstart = 0; \ } RVVCALL(OPFVF2, vfadd_vf_h, OP_UUU_H, H2, H2, float16_add) @@ -3501,7 +3489,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, \ uint32_t vl = env->vl; \ uint32_t total_elems = \ vext_get_total_elems(env, desc, ESZ); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ uint32_t i; \ \ @@ -3517,9 +3504,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, \ } \ do_##NAME(vd, vs2, i, env); \ } \ + vext_set_tail_elems_1s(env, vd, desc, ESZ, \ + total_elems); \ env->vstart = 0; \ - vext_set_elems_1s(vd, vta, vl * ESZ, \ - total_elems * ESZ); \ } RVVCALL(OPFVV1, vfsqrt_v_h, OP_UU_H, H2, H2, float16_sqrt) @@ -4256,7 +4243,6 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void *vs2, \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = \ vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t i; \ \ for (i = env->vstart; i < vl; i++) { \ @@ -4264,9 +4250,9 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void *vs2, \ *((ETYPE *)vd + H(i)) = \ (!vm && !vext_elem_mask(v0, i) ? s2 : s1); \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VFMERGE_VF(vfmerge_vfm_h, int16_t, H2) @@ -4421,7 +4407,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(TD); \ uint32_t vlenb = simd_maxsz(desc); \ - uint32_t vta = vext_vta(desc); \ uint32_t i; \ TD s1 = *((TD *)vs1 + HD(0)); \ \ @@ -4433,9 +4418,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ s1 = OP(s1, (TD)s2); \ } \ *((TD *)vd + HD(0)) = s1; \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, esz, vlenb); \ + vext_set_tail_elems_1s(env, vd, desc, esz, vlenb); \ + env->vstart = 0; \ } /* vd[0] = sum(vs1[0], vs2[*]) */ @@ -4507,7 +4492,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(TD); \ uint32_t vlenb = simd_maxsz(desc); \ - uint32_t vta = vext_vta(desc); \ uint32_t i; \ TD s1 = *((TD *)vs1 + HD(0)); \ \ @@ -4519,9 +4503,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, \ s1 = OP(s1, (TD)s2, &env->fp_status); \ } \ *((TD *)vd + HD(0)) = s1; \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, esz, vlenb); \ + vext_set_tail_elems_1s(env, vd, desc, esz, vlenb); \ + env->vstart = 0; \ } /* Unordered sum */ @@ -4738,7 +4722,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, CPURISCVState *env, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ uint32_t sum = 0; \ int i; \ @@ -4754,9 +4737,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, CPURISCVState *env, \ sum++; \ } \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VEXT_VIOTA_M(viota_m_b, uint8_t, H1) @@ -4772,7 +4755,6 @@ void HELPER(NAME)(void *vd, void *v0, CPURISCVState *env, uint32_t desc) \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ int i; \ \ @@ -4784,9 +4766,9 @@ void HELPER(NAME)(void *vd, void *v0, CPURISCVState *env, uint32_t desc) \ } \ *((ETYPE *)vd + H(i)) = i; \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VEXT_VID_V(vid_v_b, uint8_t, H1) @@ -4807,7 +4789,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ target_ulong offset = s1, i_min, i; \ \ @@ -4820,9 +4801,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ } \ *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i - offset)); \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } /* vslideup.vx vd, vs2, rs1, vm # vd[i+rs1] = vs2[i] */ @@ -4840,7 +4821,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ target_ulong i_max, i_min, i; \ \ @@ -4861,9 +4841,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ } \ } \ \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } /* vslidedown.vx vd, vs2, rs1, vm # vd[i] = vs2[i+rs1] */ @@ -4882,7 +4862,6 @@ static void vslide1up_##BITWIDTH(void *vd, void *v0, uint64_t s1, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ uint32_t i; \ \ @@ -4898,9 +4877,9 @@ static void vslide1up_##BITWIDTH(void *vd, void *v0, uint64_t s1, \ *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i - 1)); \ } \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VEXT_VSLIE1UP(8, H1) @@ -4931,7 +4910,6 @@ static void vslide1down_##BITWIDTH(void *vd, void *v0, uint64_t s1, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ uint32_t i; \ \ @@ -4947,9 +4925,9 @@ static void vslide1down_##BITWIDTH(void *vd, void *v0, uint64_t s1, \ *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(i + 1)); \ } \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VEXT_VSLIDE1DOWN(8, H1) @@ -5005,7 +4983,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(TS2); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ uint64_t index; \ uint32_t i; \ @@ -5023,9 +5000,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ *((TS2 *)vd + HS2(i)) = *((TS2 *)vs2 + HS2(index)); \ } \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } /* vd[i] = (vs1[i] >= VLMAX) ? 0 : vs2[vs1[i]]; */ @@ -5048,7 +5025,6 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ uint64_t index = s1; \ uint32_t i; \ @@ -5065,9 +5041,9 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ *((ETYPE *)vd + H(i)) = *((ETYPE *)vs2 + H(index)); \ } \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } /* vd[i] = (x[rs1] >= VLMAX) ? 0 : vs2[rs1] */ @@ -5084,7 +5060,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ uint32_t vl = env->vl; \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t num = 0, i; \ \ for (i = env->vstart; i < vl; i++) { \ @@ -5094,9 +5069,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ *((ETYPE *)vd + H(num)) = *((ETYPE *)vs2 + H(i)); \ num++; \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } /* Compress into vd elements of vs2 where vs1 is enabled */ @@ -5130,7 +5105,6 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, \ uint32_t vm = vext_vm(desc); \ uint32_t esz = sizeof(ETYPE); \ uint32_t total_elems = vext_get_total_elems(env, desc, esz); \ - uint32_t vta = vext_vta(desc); \ uint32_t vma = vext_vma(desc); \ uint32_t i; \ \ @@ -5142,9 +5116,9 @@ void HELPER(NAME)(void *vd, void *v0, void *vs2, \ } \ *((ETYPE *)vd + HD(i)) = *((DTYPE *)vs2 + HS1(i)); \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); \ + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); \ + env->vstart = 0; \ } GEN_VEXT_INT_EXT(vzext_vf2_h, uint16_t, uint8_t, H2, H1) From patchwork Sun Mar 10 11:53:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 13588062 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D71A8C54E58 for ; Sun, 10 Mar 2024 11:55:18 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rjHka-0007pT-DO; Sun, 10 Mar 2024 07:53:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rjHkO-0007lo-F6 for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:48 -0400 Received: from mail-oi1-x22f.google.com ([2607:f8b0:4864:20::22f]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rjHkM-00043O-4k for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:44 -0400 Received: by mail-oi1-x22f.google.com with SMTP id 5614622812f47-3c19aaedfdaso1565396b6e.2 for ; Sun, 10 Mar 2024 04:53:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; t=1710071619; x=1710676419; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=UkZttwzbGfcxGdm/nXSRZp+cW9ikJaFZ7Cn5052rX+I=; b=DcOpGhPZgVJgSfjnvfZgLL5QWi5P/efWcQQYTVYfPRjed9pfkxiqvXfDP4oCdWBQrO ZxZLlRFPR4+KXDb0ZCqiDFeZzgYX6tIG+36/reGWVfFtoDUdcUXD+8+xy9u9EjeeKZRO xkJtAwlfkfig5nBjJQQMRdH9R5tOf6WdOFBYTLCuTVxywYuPcLy6NhAB5/s9ipAd+WAk kYT//2HcE9TjWahgMUhC6fpm9XkpC5ZaNzoX0NIk9ibIr+RuQErzwvcMy/OZF5gZzwXg SAZ0G8ylcbKocezTh0CwwD94Y5giqz7ttxUkmxeVDUenGls4j92VUJAuJcOhwwR9j00I rvEg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710071619; x=1710676419; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=UkZttwzbGfcxGdm/nXSRZp+cW9ikJaFZ7Cn5052rX+I=; b=NyIu4WzvMIBscuJ80cycFJG5prV60zl7sCim6Lq4el8sihktI3Q/lWT0XbCaQ/wOD0 hX83nNsPCmG7o/jaP7doga5I97rdcxT2Ku7nS+1PZnhvLQaCzxffcmCANNzDqnij9INv 9ENH6PBhoX1mIwnPVVTi5N15/DdURIPIAPlL/ALu+ojrkUfB5EYwLJqdO45RBRqgQsWl HP6zHqQNsiZl1z44JWqtMXe8dC2QpzhIDbgJMcfICeNoiZPc3dQRSFW+WZAHuRYYJRvQ JUrQ5vCrDLrY/sd/chsKSOG1SeNVoP7FpviliA0YNskQNXeDvOprbKeSfC5U3g51oIfs YJaw== X-Gm-Message-State: AOJu0YwNt2IIpdQ99welByWlHc67JUNXS8q/QvanHqAVEuzcHA62Rg2v D7a7sh68i1GyllcOs8PSYCTBY+dXWzA+hxRAJb69Ucb8Il9r2A4/2DPIjFWdNESLmK75zkL0zHW m X-Google-Smtp-Source: AGHT+IGbs6X6DctJfeBcfmk2lKwdPd9GitOZkFmrkHDDSAuyPi7KdwQ+If5A1ILO98uj6oi/wOKi0g== X-Received: by 2002:a05:6808:2e89:b0:3c2:3b0e:b830 with SMTP id gt9-20020a0568082e8900b003c23b0eb830mr6492658oib.25.1710071619273; Sun, 10 Mar 2024 04:53:39 -0700 (PDT) Received: from grind.. ([177.94.15.159]) by smtp.gmail.com with ESMTPSA id g22-20020aa78196000000b006e647059cccsm2449253pfi.33.2024.03.10.04.53.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 10 Mar 2024 04:53:38 -0700 (PDT) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, bmeng@tinylab.org, liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com, palmer@rivosinc.com, richard.henderson@linaro.org, philmd@linaro.org, Daniel Henrique Barboza Subject: [PATCH v10 05/10] target/riscv: use vext_set_tail_elems_1s() in vcrypto insns Date: Sun, 10 Mar 2024 08:53:09 -0300 Message-ID: <20240310115315.187283-6-dbarboza@ventanamicro.com> X-Mailer: git-send-email 2.43.2 In-Reply-To: <20240310115315.187283-1-dbarboza@ventanamicro.com> References: <20240310115315.187283-1-dbarboza@ventanamicro.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::22f; envelope-from=dbarboza@ventanamicro.com; helo=mail-oi1-x22f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Vcrypto insns should also use the same helper the regular vector insns uses to update the tail elements. Move vext_set_tail_elems_1s() to vector_internals.c and make it public. Use it in vcrypto_helper.c to set tail elements instead of vext_set_elems_1s(). Helpers must set env->vstart = 0 after setting the tail. Signed-off-by: Daniel Henrique Barboza Reviewed-by: Richard Henderson --- target/riscv/vcrypto_helper.c | 63 ++++++++++++--------------------- target/riscv/vector_helper.c | 30 ---------------- target/riscv/vector_internals.c | 29 +++++++++++++++ target/riscv/vector_internals.h | 4 +++ 4 files changed, 56 insertions(+), 70 deletions(-) diff --git a/target/riscv/vcrypto_helper.c b/target/riscv/vcrypto_helper.c index e2d719b13b..66d449c274 100644 --- a/target/riscv/vcrypto_helper.c +++ b/target/riscv/vcrypto_helper.c @@ -218,9 +218,7 @@ static inline void xor_round_key(AESState *round_state, AESState *round_key) void HELPER(NAME)(void *vd, void *vs2, CPURISCVState *env, \ uint32_t desc) \ { \ - uint32_t vl = env->vl; \ uint32_t total_elems = vext_get_total_elems(env, desc, 4); \ - uint32_t vta = vext_vta(desc); \ \ for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { \ AESState round_key; \ @@ -233,18 +231,16 @@ static inline void xor_round_key(AESState *round_state, AESState *round_key) *((uint64_t *)vd + H8(i * 2 + 0)) = round_state.d[0]; \ *((uint64_t *)vd + H8(i * 2 + 1)) = round_state.d[1]; \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4); \ + vext_set_tail_elems_1s(env, vd, desc, 4, total_elems); \ + env->vstart = 0; \ } #define GEN_ZVKNED_HELPER_VS(NAME, ...) \ void HELPER(NAME)(void *vd, void *vs2, CPURISCVState *env, \ uint32_t desc) \ { \ - uint32_t vl = env->vl; \ uint32_t total_elems = vext_get_total_elems(env, desc, 4); \ - uint32_t vta = vext_vta(desc); \ \ for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { \ AESState round_key; \ @@ -257,9 +253,9 @@ static inline void xor_round_key(AESState *round_state, AESState *round_key) *((uint64_t *)vd + H8(i * 2 + 0)) = round_state.d[0]; \ *((uint64_t *)vd + H8(i * 2 + 1)) = round_state.d[1]; \ } \ - env->vstart = 0; \ /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4); \ + vext_set_tail_elems_1s(env, vd, desc, 4, total_elems); \ + env->vstart = 0; \ } GEN_ZVKNED_HELPER_VV(vaesef_vv, aesenc_SB_SR_AK(&round_state, @@ -301,9 +297,7 @@ void HELPER(vaeskf1_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm, { uint32_t *vd = vd_vptr; uint32_t *vs2 = vs2_vptr; - uint32_t vl = env->vl; uint32_t total_elems = vext_get_total_elems(env, desc, 4); - uint32_t vta = vext_vta(desc); uimm &= 0b1111; if (uimm > 10 || uimm == 0) { @@ -337,9 +331,9 @@ void HELPER(vaeskf1_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm, vd[i * 4 + H4(2)] = rk[6]; vd[i * 4 + H4(3)] = rk[7]; } - env->vstart = 0; /* set tail elements to 1s */ - vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4); + vext_set_tail_elems_1s(env, vd, desc, 4, total_elems); + env->vstart = 0; } void HELPER(vaeskf2_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm, @@ -347,9 +341,7 @@ void HELPER(vaeskf2_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm, { uint32_t *vd = vd_vptr; uint32_t *vs2 = vs2_vptr; - uint32_t vl = env->vl; uint32_t total_elems = vext_get_total_elems(env, desc, 4); - uint32_t vta = vext_vta(desc); uimm &= 0b1111; if (uimm > 14 || uimm < 2) { @@ -394,9 +386,9 @@ void HELPER(vaeskf2_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm, vd[i * 4 + H4(2)] = rk[10]; vd[i * 4 + H4(3)] = rk[11]; } - env->vstart = 0; /* set tail elements to 1s */ - vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4); + vext_set_tail_elems_1s(env, vd, desc, 4, total_elems); + env->vstart = 0; } static inline uint32_t sig0_sha256(uint32_t x) @@ -455,7 +447,6 @@ void HELPER(vsha2ms_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, uint32_t sew = FIELD_EX64(env->vtype, VTYPE, VSEW); uint32_t esz = sew == MO_32 ? 4 : 8; uint32_t total_elems; - uint32_t vta = vext_vta(desc); for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { if (sew == MO_32) { @@ -469,7 +460,7 @@ void HELPER(vsha2ms_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, } /* set tail elements to 1s */ total_elems = vext_get_total_elems(env, desc, esz); - vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz); + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); env->vstart = 0; } @@ -570,7 +561,6 @@ void HELPER(vsha2ch32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, { const uint32_t esz = 4; uint32_t total_elems; - uint32_t vta = vext_vta(desc); for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { vsha2c_32(((uint32_t *)vs2) + 4 * i, ((uint32_t *)vd) + 4 * i, @@ -579,7 +569,7 @@ void HELPER(vsha2ch32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, /* set tail elements to 1s */ total_elems = vext_get_total_elems(env, desc, esz); - vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz); + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); env->vstart = 0; } @@ -588,7 +578,6 @@ void HELPER(vsha2ch64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, { const uint32_t esz = 8; uint32_t total_elems; - uint32_t vta = vext_vta(desc); for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { vsha2c_64(((uint64_t *)vs2) + 4 * i, ((uint64_t *)vd) + 4 * i, @@ -597,7 +586,7 @@ void HELPER(vsha2ch64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, /* set tail elements to 1s */ total_elems = vext_get_total_elems(env, desc, esz); - vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz); + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); env->vstart = 0; } @@ -606,7 +595,6 @@ void HELPER(vsha2cl32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, { const uint32_t esz = 4; uint32_t total_elems; - uint32_t vta = vext_vta(desc); for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { vsha2c_32(((uint32_t *)vs2) + 4 * i, ((uint32_t *)vd) + 4 * i, @@ -615,7 +603,7 @@ void HELPER(vsha2cl32_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, /* set tail elements to 1s */ total_elems = vext_get_total_elems(env, desc, esz); - vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz); + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); env->vstart = 0; } @@ -624,7 +612,6 @@ void HELPER(vsha2cl64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, { uint32_t esz = 8; uint32_t total_elems; - uint32_t vta = vext_vta(desc); for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { vsha2c_64(((uint64_t *)vs2) + 4 * i, ((uint64_t *)vd) + 4 * i, @@ -633,7 +620,7 @@ void HELPER(vsha2cl64_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, /* set tail elements to 1s */ total_elems = vext_get_total_elems(env, desc, esz); - vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz); + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); env->vstart = 0; } @@ -653,7 +640,6 @@ void HELPER(vsm3me_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr, { uint32_t esz = memop_size(FIELD_EX64(env->vtype, VTYPE, VSEW)); uint32_t total_elems = vext_get_total_elems(env, desc, esz); - uint32_t vta = vext_vta(desc); uint32_t *vd = vd_vptr; uint32_t *vs1 = vs1_vptr; uint32_t *vs2 = vs2_vptr; @@ -672,7 +658,7 @@ void HELPER(vsm3me_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr, vd[(i * 8) + j] = bswap32(w[H4(j + 16)]); } } - vext_set_elems_1s(vd_vptr, vta, env->vl * esz, total_elems * esz); + vext_set_tail_elems_1s(env, vd_vptr, desc, esz, total_elems); env->vstart = 0; } @@ -752,7 +738,6 @@ void HELPER(vsm3c_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm, { uint32_t esz = memop_size(FIELD_EX64(env->vtype, VTYPE, VSEW)); uint32_t total_elems = vext_get_total_elems(env, desc, esz); - uint32_t vta = vext_vta(desc); uint32_t *vd = vd_vptr; uint32_t *vs2 = vs2_vptr; uint32_t v1[8], v2[8], v3[8]; @@ -767,7 +752,7 @@ void HELPER(vsm3c_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm, vd[i * 8 + k] = bswap32(v1[H4(k)]); } } - vext_set_elems_1s(vd_vptr, vta, env->vl * esz, total_elems * esz); + vext_set_tail_elems_1s(env, vd_vptr, desc, esz, total_elems); env->vstart = 0; } @@ -777,7 +762,6 @@ void HELPER(vghsh_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr, uint64_t *vd = vd_vptr; uint64_t *vs1 = vs1_vptr; uint64_t *vs2 = vs2_vptr; - uint32_t vta = vext_vta(desc); uint32_t total_elems = vext_get_total_elems(env, desc, 4); for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { @@ -805,7 +789,7 @@ void HELPER(vghsh_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr, vd[i * 2 + 1] = brev8(Z[1]); } /* set tail elements to 1s */ - vext_set_elems_1s(vd, vta, env->vl * 4, total_elems * 4); + vext_set_tail_elems_1s(env, vd, desc, 4, total_elems); env->vstart = 0; } @@ -814,7 +798,6 @@ void HELPER(vgmul_vv)(void *vd_vptr, void *vs2_vptr, CPURISCVState *env, { uint64_t *vd = vd_vptr; uint64_t *vs2 = vs2_vptr; - uint32_t vta = vext_vta(desc); uint32_t total_elems = vext_get_total_elems(env, desc, 4); for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { @@ -839,7 +822,7 @@ void HELPER(vgmul_vv)(void *vd_vptr, void *vs2_vptr, CPURISCVState *env, vd[i * 2 + 1] = brev8(Z[1]); } /* set tail elements to 1s */ - vext_set_elems_1s(vd, vta, env->vl * 4, total_elems * 4); + vext_set_tail_elems_1s(env, vd, desc, 4, total_elems); env->vstart = 0; } @@ -881,9 +864,9 @@ void HELPER(vsm4k_vi)(void *vd, void *vs2, uint32_t uimm5, CPURISCVState *env, } } - env->vstart = 0; /* set tail elements to 1s */ - vext_set_elems_1s(vd, vext_vta(desc), env->vl * esz, total_elems * esz); + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); + env->vstart = 0; } static void do_sm4_round(uint32_t *rk, uint32_t *buf) @@ -930,9 +913,9 @@ void HELPER(vsm4r_vv)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc) } } - env->vstart = 0; /* set tail elements to 1s */ - vext_set_elems_1s(vd, vext_vta(desc), env->vl * esz, total_elems * esz); + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); + env->vstart = 0; } void HELPER(vsm4r_vs)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc) @@ -964,7 +947,7 @@ void HELPER(vsm4r_vs)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc) } } - env->vstart = 0; /* set tail elements to 1s */ - vext_set_elems_1s(vd, vext_vta(desc), env->vl * esz, total_elems * esz); + vext_set_tail_elems_1s(env, vd, desc, esz, total_elems); + env->vstart = 0; } diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index b174ddeae8..4fe8752eea 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -174,36 +174,6 @@ GEN_VEXT_ST_ELEM(ste_h, int16_t, H2, stw) GEN_VEXT_ST_ELEM(ste_w, int32_t, H4, stl) GEN_VEXT_ST_ELEM(ste_d, int64_t, H8, stq) -/* - * This function is sensitive to env->vstart changes since - * it'll be a no-op if vstart >= vl. Do not clear env->vstart - * before calling it unless you're certain that vstart < vl. - */ -static void vext_set_tail_elems_1s(CPURISCVState *env, void *vd, - uint32_t desc, uint32_t esz, - uint32_t max_elems) -{ - uint32_t vta = vext_vta(desc); - uint32_t nf = vext_nf(desc); - int k; - - /* - * Section 5.4 of the RVV spec mentions: - * "When vstart ≥ vl, there are no body elements, and no - * elements are updated in any destination vector register - * group, including that no tail elements are updated - * with agnostic values." - */ - if (vta == 0 || env->vstart >= env->vl) { - return; - } - - for (k = 0; k < nf; ++k) { - vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz, - (k * max_elems + max_elems) * esz); - } -} - /* * stride: access vector element from strided memory */ diff --git a/target/riscv/vector_internals.c b/target/riscv/vector_internals.c index 12f5964fbb..bf3e9e2370 100644 --- a/target/riscv/vector_internals.c +++ b/target/riscv/vector_internals.c @@ -33,6 +33,35 @@ void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t cnt, memset(base + cnt, -1, tot - cnt); } +/* + * This function is sensitive to env->vstart changes since + * it'll be a no-op if vstart >= vl. Do not clear env->vstart + * before calling it unless you're certain that vstart < vl. + */ +void vext_set_tail_elems_1s(CPURISCVState *env, void *vd, uint32_t desc, + uint32_t esz, uint32_t max_elems) +{ + uint32_t vta = vext_vta(desc); + uint32_t nf = vext_nf(desc); + int k; + + /* + * Section 5.4 of the RVV spec mentions: + * "When vstart ≥ vl, there are no body elements, and no + * elements are updated in any destination vector register + * group, including that no tail elements are updated + * with agnostic values." + */ + if (vta == 0 || env->vstart >= env->vl) { + return; + } + + for (k = 0; k < nf; ++k) { + vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz, + (k * max_elems + max_elems) * esz); + } +} + void do_vext_vv(void *vd, void *v0, void *vs1, void *vs2, CPURISCVState *env, uint32_t desc, opivv2_fn *fn, uint32_t esz) diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h index 842765f6c1..c5a2bc4bf3 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -117,6 +117,10 @@ static inline uint32_t vext_get_total_elems(CPURISCVState *env, uint32_t desc, void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t cnt, uint32_t tot); +void vext_set_tail_elems_1s(CPURISCVState *env, void *vd, + uint32_t desc, uint32_t esz, + uint32_t max_elems); + /* expand macro args before macro */ #define RVVCALL(macro, ...) macro(__VA_ARGS__) From patchwork Sun Mar 10 11:53:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 13588057 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DF062C54791 for ; Sun, 10 Mar 2024 11:54:56 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rjHkb-0007qa-CZ; Sun, 10 Mar 2024 07:53:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rjHkQ-0007lz-Da for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:48 -0400 Received: from mail-oa1-x33.google.com ([2001:4860:4864:20::33]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rjHkO-00043j-SR for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:46 -0400 Received: by mail-oa1-x33.google.com with SMTP id 586e51a60fabf-22181888b88so2862479fac.1 for ; Sun, 10 Mar 2024 04:53:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; t=1710071622; x=1710676422; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=DgCmMDQSmZuQF2R71f2SMhy+sJRtIk+zLEeTXoFqnr0=; b=De+DA449//GwsW3EytqWJOW4AH2cHEmpjYrIfmT6r0aufq9rlVtFvUamWtsI/X4XPx eDfrbXhr/ZP1h6TmVScw3XHnilD86FiG493uFIvd8x2s9TLM+cXSdKrhmG0FlkJbISDc bGcZclLTtnFIZJbJmzc/RPxLCZfuAdY7WfDOjv06bYRnnF4VRw3Bz2k1teXzmT3ekq3p 3tY68jLyR1QnDOAr+s96qSQuvfoCsK5HDv0753hbl1uIqsle8zOEDYoIC7UrWfBtP6f6 DU0j3KgOUPOrU4w39nZmCQ8RzpIoYNoRGqlQ88vyI8UQwM1iYxt83LoP26ssx9xaNk7q /7PA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710071622; x=1710676422; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DgCmMDQSmZuQF2R71f2SMhy+sJRtIk+zLEeTXoFqnr0=; b=V6Ww/H/7bC8AVVpGurCxFC3Ml6CabtRFdtitXF0FcDx4Gjk6pNnXyVr1taS95n19Lh KP7SCb7A+E1gXX4UPvqgn4ec61G1Sh1QBVM1AVSiZZHbToVU/0dypeB/qW5EvL07CBIr rRRqShm4+v1HAQ1yGrZLIIDMD3UCx2tPqGmSHP/aM417N7X69+lhbMhIP2cDEVrqw5Vb IDt5eFm6cLI3Q7qO2qUGK0p0nGecj4u1b4rl5EJ4RuG7zqht3ws/cKqtM71rTZzdOihS 4erSW3xaBjZPAucr65tKat36uAJlYZltiGVymgCORGG5jXjslegazGkVYXpshYARX1hM D0GA== X-Gm-Message-State: AOJu0YzR9e0J5WmLg+/hbbMIvq2s8jhFK+2kxHAVAjG/Sq9hRa9Dl2h1 DSRAoRO76qy7++1w4LNt7oI30gt3XUuJaYVFSb2OYbutl4hSXH/zC+1BCG91aN+XUBVXvSu0eVj Q X-Google-Smtp-Source: AGHT+IFw2R4uPF110ohNpRph1e5neJLjsjQzpFM03QiXVqC5hOXgq+UZ/Z0PI1gx7fXCKdGEsgHwKg== X-Received: by 2002:a05:6870:970c:b0:221:bd93:293d with SMTP id n12-20020a056870970c00b00221bd93293dmr4048764oaq.15.1710071622674; Sun, 10 Mar 2024 04:53:42 -0700 (PDT) Received: from grind.. ([177.94.15.159]) by smtp.gmail.com with ESMTPSA id g22-20020aa78196000000b006e647059cccsm2449253pfi.33.2024.03.10.04.53.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 10 Mar 2024 04:53:42 -0700 (PDT) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, bmeng@tinylab.org, liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com, palmer@rivosinc.com, richard.henderson@linaro.org, philmd@linaro.org, Daniel Henrique Barboza Subject: [PATCH v10 06/10] trans_rvv.c.inc: set vstart = 0 in int scalar move insns Date: Sun, 10 Mar 2024 08:53:10 -0300 Message-ID: <20240310115315.187283-7-dbarboza@ventanamicro.com> X-Mailer: git-send-email 2.43.2 In-Reply-To: <20240310115315.187283-1-dbarboza@ventanamicro.com> References: <20240310115315.187283-1-dbarboza@ventanamicro.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2001:4860:4864:20::33; envelope-from=dbarboza@ventanamicro.com; helo=mail-oa1-x33.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org trans_vmv_x_s, trans_vmv_s_x, trans_vfmv_f_s and trans_vfmv_s_f aren't setting vstart = 0 after execution. This is usually done by a helper in vector_helper.c but these functions don't use helpers. We'll set vstart after any potential 'over' brconds, and that will also mandate a mark_vs_dirty() too. Fixes: dedc53cbc9 ("target/riscv: rvv-1.0: integer scalar move instructions") Signed-off-by: Daniel Henrique Barboza Reviewed-by: Richard Henderson --- target/riscv/insn_trans/trans_rvv.c.inc | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index e42728990e..8c16a9f5b3 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -3373,6 +3373,8 @@ static bool trans_vmv_x_s(DisasContext *s, arg_vmv_x_s *a) vec_element_loadi(s, t1, a->rs2, 0, true); tcg_gen_trunc_i64_tl(dest, t1); gen_set_gpr(s, a->rd, dest); + tcg_gen_movi_tl(cpu_vstart, 0); + mark_vs_dirty(s); return true; } return false; @@ -3399,8 +3401,9 @@ static bool trans_vmv_s_x(DisasContext *s, arg_vmv_s_x *a) s1 = get_gpr(s, a->rs1, EXT_NONE); tcg_gen_ext_tl_i64(t1, s1); vec_element_storei(s, a->rd, 0, t1); - mark_vs_dirty(s); gen_set_label(over); + tcg_gen_movi_tl(cpu_vstart, 0); + mark_vs_dirty(s); return true; } return false; @@ -3427,6 +3430,8 @@ static bool trans_vfmv_f_s(DisasContext *s, arg_vfmv_f_s *a) } mark_fs_dirty(s); + tcg_gen_movi_tl(cpu_vstart, 0); + mark_vs_dirty(s); return true; } return false; @@ -3452,8 +3457,9 @@ static bool trans_vfmv_s_f(DisasContext *s, arg_vfmv_s_f *a) do_nanbox(s, t1, cpu_fpr[a->rs1]); vec_element_storei(s, a->rd, 0, t1); - mark_vs_dirty(s); gen_set_label(over); + tcg_gen_movi_tl(cpu_vstart, 0); + mark_vs_dirty(s); return true; } return false; From patchwork Sun Mar 10 11:53:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 13588056 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BB6B0C54E66 for ; Sun, 10 Mar 2024 11:54:49 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rjHkc-0007rq-Rs; Sun, 10 Mar 2024 07:53:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rjHkW-0007nc-Ru for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:52 -0400 Received: from mail-oi1-x233.google.com ([2607:f8b0:4864:20::233]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rjHkT-00044C-B5 for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:52 -0400 Received: by mail-oi1-x233.google.com with SMTP id 5614622812f47-3c21ecc072cso1753342b6e.2 for ; Sun, 10 Mar 2024 04:53:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; t=1710071626; x=1710676426; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=d5g+OaReEmab0LUW8CbEsTbyJZCzarGRFXQkW37sONY=; b=eS8BotPWqY5D7T+bTtK9whRptZ9cMsGgIdxKGoRQTzzNEwDF7q6IwG37IfUe+x8JsS l3ziDHSMAG4+cKGcQapM1MU9QvsRmNHekXp3K6Na5jNnnQalk2YqvrFaRjlcq8bRhIdQ yEai1sE7VOWf2UYVdLjrMuq3hqnKLhqoL1YyvCaB5m3Rlz3b08Jphx4Js42mDkUc8x/p 1fT6plsgNkHhmJiIzNz6gwUqAy3NMOiOe1wCPo5GdX1vH0Y8ppAQC/QMCQnB4JuahvTU 8gDEXAjtfIRVEWH1urOIatg9JBwATJ0Ojgz+BZeTN7ir485No63Q8YJU9iHE1w98FR2X tcPA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710071626; x=1710676426; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=d5g+OaReEmab0LUW8CbEsTbyJZCzarGRFXQkW37sONY=; b=K8vGKNNQkZo6/J+f5u/QOPMBrFb2kvLytdbZIB2fiCrsUMKhiKM5TD4syeGXAYbdUh q25fAlVWhE7lKzUFnUjo7p7RpJzaCzxgcYBaL496TZ4pcC4+U+5fKeWQvZNq+9Y23vFI 7RqPsY4332nr1HoM2iTd1TrADvmKApYLy576LSPI7NFxyMaRZGQZEBkc9R5dcDjHN4KY p7WPCAqiY0r0t6zFGbDC0dVmc4SV9VJIePOod6ZY7cyzVcLeEYvMzTE4qnTqXOd6BN86 CNNyPyXEvyKTUXuKKce4vF4fnXFKSOmYeVLna2GsMO7WQ7SLh5xRhgII+IEXZaPISTKf Lfng== X-Gm-Message-State: AOJu0YyWRPmnZTs69VYuIuU3Ias8zso7Kzm4G8dbloXdealKqRNQ/J5E wiC1UX//p4dtMX71LRqwBD6RN09r8+mDRNNRzggqx0PKvYvYCVz6i3dAs0HRdOyx6PApeDWbrnn k X-Google-Smtp-Source: AGHT+IH4MHpLfR+IhyhzHmOj8gT4ltHQ+qpAN1tXJVox3WFg+jAA8agXV2Mfyhmv0ncrI1eJtaxoWQ== X-Received: by 2002:a05:6808:bd0:b0:3c2:3518:da82 with SMTP id o16-20020a0568080bd000b003c23518da82mr5534790oik.38.1710071626201; Sun, 10 Mar 2024 04:53:46 -0700 (PDT) Received: from grind.. ([177.94.15.159]) by smtp.gmail.com with ESMTPSA id g22-20020aa78196000000b006e647059cccsm2449253pfi.33.2024.03.10.04.53.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 10 Mar 2024 04:53:45 -0700 (PDT) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, bmeng@tinylab.org, liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com, palmer@rivosinc.com, richard.henderson@linaro.org, philmd@linaro.org, Daniel Henrique Barboza Subject: [PATCH v10 07/10] target/riscv: remove 'over' brconds from vector trans Date: Sun, 10 Mar 2024 08:53:11 -0300 Message-ID: <20240310115315.187283-8-dbarboza@ventanamicro.com> X-Mailer: git-send-email 2.43.2 In-Reply-To: <20240310115315.187283-1-dbarboza@ventanamicro.com> References: <20240310115315.187283-1-dbarboza@ventanamicro.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::233; envelope-from=dbarboza@ventanamicro.com; helo=mail-oi1-x233.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Most of the vector translations has this following pattern at the start: TCGLabel *over = gen_new_label(); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); And then right at the end: gen_set_label(over); return true; This means that if vstart >= vl we'll not set vstart = 0 at the end of the insns - this is done inside the helper that is being skipped. The reason why this pattern hasn't been a bigger problem is because the conditional vstart >= vl is very rare. Checking all the helpers in vector_helper.c we see all of them with a pattern like this: for (i = env->vstart; i < vl; i++) { (...) } env->vstart = 0; Thus they can handle vstart >= vl case gracefully, with the benefit of setting env->vstart = 0 during the process. Remove all 'over' conditionals and let the helper set env->vstart = 0 every time. Note that not all insns uses helpers, and for those cases the 'brcond' jump is the only way to filter vstart >= vl. This is the case of trans_vmv_s_x() and trans_vfmv_s_f(). We won't remove the 'brcond' conditionals from them. While we're at it, remove the (vl == 0) brconds from trans_rvbf16.c.inc too since they're unneeded. Suggested-by: Richard Henderson Signed-off-by: Daniel Henrique Barboza Reviewed-by: Richard Henderson Reviewed-by: Alistair Francis --- target/riscv/insn_trans/trans_rvbf16.c.inc | 12 --- target/riscv/insn_trans/trans_rvv.c.inc | 108 --------------------- target/riscv/insn_trans/trans_rvvk.c.inc | 18 ---- 3 files changed, 138 deletions(-) diff --git a/target/riscv/insn_trans/trans_rvbf16.c.inc b/target/riscv/insn_trans/trans_rvbf16.c.inc index 8ee99df3f3..a842e76a6b 100644 --- a/target/riscv/insn_trans/trans_rvbf16.c.inc +++ b/target/riscv/insn_trans/trans_rvbf16.c.inc @@ -71,11 +71,8 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, arg_vfncvtbf16_f_f_w *a) if (opfv_narrow_check(ctx, a) && (ctx->sew == MO_16)) { uint32_t data = 0; - TCGLabel *over = gen_new_label(); gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul); @@ -87,7 +84,6 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, arg_vfncvtbf16_f_f_w *a) ctx->cfg_ptr->vlenb, data, gen_helper_vfncvtbf16_f_f_w); mark_vs_dirty(ctx); - gen_set_label(over); return true; } return false; @@ -100,11 +96,8 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, arg_vfwcvtbf16_f_f_v *a) if (opfv_widen_check(ctx, a) && (ctx->sew == MO_16)) { uint32_t data = 0; - TCGLabel *over = gen_new_label(); gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul); @@ -116,7 +109,6 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, arg_vfwcvtbf16_f_f_v *a) ctx->cfg_ptr->vlenb, data, gen_helper_vfwcvtbf16_f_f_v); mark_vs_dirty(ctx); - gen_set_label(over); return true; } return false; @@ -130,11 +122,8 @@ static bool trans_vfwmaccbf16_vv(DisasContext *ctx, arg_vfwmaccbf16_vv *a) if (require_rvv(ctx) && vext_check_isa_ill(ctx) && (ctx->sew == MO_16) && vext_check_dss(ctx, a->rd, a->rs1, a->rs2, a->vm)) { uint32_t data = 0; - TCGLabel *over = gen_new_label(); gen_set_rm_chkfrm(ctx, RISCV_FRM_DYN); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, ctx->lmul); @@ -147,7 +136,6 @@ static bool trans_vfwmaccbf16_vv(DisasContext *ctx, arg_vfwmaccbf16_vv *a) ctx->cfg_ptr->vlenb, data, gen_helper_vfwmaccbf16_vv); mark_vs_dirty(ctx); - gen_set_label(over); return true; } return false; diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index 8c16a9f5b3..4c1a064cf6 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -616,9 +616,6 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, uint32_t data, TCGv base; TCGv_i32 desc; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); - dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); base = get_gpr(s, rs1, EXT_NONE); @@ -660,7 +657,6 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, uint32_t data, tcg_gen_mb(TCG_MO_ALL | TCG_BAR_LDAQ); } - gen_set_label(over); return true; } @@ -802,9 +798,6 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, uint32_t rs2, TCGv base, stride; TCGv_i32 desc; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); - dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); base = get_gpr(s, rs1, EXT_NONE); @@ -819,7 +812,6 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, uint32_t rs2, fn(dest, mask, base, stride, tcg_env, desc); - gen_set_label(over); return true; } @@ -906,9 +898,6 @@ static bool ldst_index_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, TCGv base; TCGv_i32 desc; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); - dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); index = tcg_temp_new_ptr(); @@ -924,7 +913,6 @@ static bool ldst_index_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, fn(dest, mask, base, index, tcg_env, desc); - gen_set_label(over); return true; } @@ -1044,9 +1032,6 @@ static bool ldff_trans(uint32_t vd, uint32_t rs1, uint32_t data, TCGv base; TCGv_i32 desc; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); - dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); base = get_gpr(s, rs1, EXT_NONE); @@ -1059,7 +1044,6 @@ static bool ldff_trans(uint32_t vd, uint32_t rs1, uint32_t data, fn(dest, mask, base, tcg_env, desc); mark_vs_dirty(s); - gen_set_label(over); return true; } @@ -1100,10 +1084,6 @@ static bool ldst_whole_trans(uint32_t vd, uint32_t rs1, uint32_t nf, uint32_t width, gen_helper_ldst_whole *fn, DisasContext *s) { - uint32_t evl = s->cfg_ptr->vlenb * nf / width; - TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_GEU, cpu_vstart, evl, over); - TCGv_ptr dest; TCGv base; TCGv_i32 desc; @@ -1120,8 +1100,6 @@ static bool ldst_whole_trans(uint32_t vd, uint32_t rs1, uint32_t nf, fn(dest, base, tcg_env, desc); - gen_set_label(over); - return true; } @@ -1195,10 +1173,6 @@ static inline bool do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn *gvec_fn, gen_helper_gvec_4_ptr *fn) { - TCGLabel *over = gen_new_label(); - - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); - if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) { gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), vreg_ofs(s, a->rs1), @@ -1216,7 +1190,6 @@ do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn *gvec_fn, s->cfg_ptr->vlenb, data, fn); } mark_vs_dirty(s); - gen_set_label(over); return true; } @@ -1248,9 +1221,6 @@ static bool opivx_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, uint32_t vm, TCGv_i32 desc; uint32_t data = 0; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); - dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); src2 = tcg_temp_new_ptr(); @@ -1271,7 +1241,6 @@ static bool opivx_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, uint32_t vm, fn(dest, mask, src1, src2, tcg_env, desc); mark_vs_dirty(s); - gen_set_label(over); return true; } @@ -1410,9 +1379,6 @@ static bool opivi_trans(uint32_t vd, uint32_t imm, uint32_t vs2, uint32_t vm, TCGv_i32 desc; uint32_t data = 0; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); - dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); src2 = tcg_temp_new_ptr(); @@ -1433,7 +1399,6 @@ static bool opivi_trans(uint32_t vd, uint32_t imm, uint32_t vs2, uint32_t vm, fn(dest, mask, src1, src2, tcg_env, desc); mark_vs_dirty(s); - gen_set_label(over); return true; } @@ -1495,8 +1460,6 @@ static bool do_opivv_widen(DisasContext *s, arg_rmrr *a, { if (checkfn(s, a)) { uint32_t data = 0; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, s->lmul); @@ -1509,7 +1472,6 @@ static bool do_opivv_widen(DisasContext *s, arg_rmrr *a, s->cfg_ptr->vlenb, data, fn); mark_vs_dirty(s); - gen_set_label(over); return true; } return false; @@ -1571,8 +1533,6 @@ static bool do_opiwv_widen(DisasContext *s, arg_rmrr *a, { if (opiwv_widen_check(s, a)) { uint32_t data = 0; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, s->lmul); @@ -1584,7 +1544,6 @@ static bool do_opiwv_widen(DisasContext *s, arg_rmrr *a, tcg_env, s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fn); mark_vs_dirty(s); - gen_set_label(over); return true; } return false; @@ -1643,8 +1602,6 @@ static bool opivv_trans(uint32_t vd, uint32_t vs1, uint32_t vs2, uint32_t vm, gen_helper_gvec_4_ptr *fn, DisasContext *s) { uint32_t data = 0; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, vm); data = FIELD_DP32(data, VDATA, LMUL, s->lmul); @@ -1655,7 +1612,6 @@ static bool opivv_trans(uint32_t vd, uint32_t vs1, uint32_t vs2, uint32_t vm, vreg_ofs(s, vs2), tcg_env, s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fn); mark_vs_dirty(s); - gen_set_label(over); return true; } @@ -1834,8 +1790,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ gen_helper_##NAME##_h, \ gen_helper_##NAME##_w, \ }; \ - TCGLabel *over = gen_new_label(); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -1848,7 +1802,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ s->cfg_ptr->vlenb, data, \ fns[s->sew]); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -2045,14 +1998,11 @@ static bool trans_vmv_v_v(DisasContext *s, arg_vmv_v_v *a) gen_helper_vmv_v_v_b, gen_helper_vmv_v_v_h, gen_helper_vmv_v_v_w, gen_helper_vmv_v_v_d, }; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); tcg_gen_gvec_2_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs1), tcg_env, s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fns[s->sew]); - gen_set_label(over); } mark_vs_dirty(s); return true; @@ -2068,8 +2018,6 @@ static bool trans_vmv_v_x(DisasContext *s, arg_vmv_v_x *a) /* vmv.v.x has rs2 = 0 and vm = 1 */ vext_check_ss(s, a->rd, 0, 1)) { TCGv s1; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); s1 = get_gpr(s, a->rs1, EXT_SIGN); @@ -2102,7 +2050,6 @@ static bool trans_vmv_v_x(DisasContext *s, arg_vmv_v_x *a) } mark_vs_dirty(s); - gen_set_label(over); return true; } return false; @@ -2129,8 +2076,6 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a) gen_helper_vmv_v_x_b, gen_helper_vmv_v_x_h, gen_helper_vmv_v_x_w, gen_helper_vmv_v_x_d, }; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); s1 = tcg_constant_i64(simm); dest = tcg_temp_new_ptr(); @@ -2140,7 +2085,6 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a) fns[s->sew](dest, s1, tcg_env, desc); mark_vs_dirty(s); - gen_set_label(over); } return true; } @@ -2275,9 +2219,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ gen_helper_##NAME##_w, \ gen_helper_##NAME##_d, \ }; \ - TCGLabel *over = gen_new_label(); \ gen_set_rm(s, RISCV_FRM_DYN); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -2292,7 +2234,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ s->cfg_ptr->vlenb, data, \ fns[s->sew - 1]); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -2310,9 +2251,6 @@ static bool opfvf_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, TCGv_i32 desc; TCGv_i64 t1; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); - dest = tcg_temp_new_ptr(); mask = tcg_temp_new_ptr(); src2 = tcg_temp_new_ptr(); @@ -2330,7 +2268,6 @@ static bool opfvf_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, fn(dest, mask, t1, src2, tcg_env, desc); mark_vs_dirty(s); - gen_set_label(over); return true; } @@ -2393,9 +2330,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ static gen_helper_gvec_4_ptr * const fns[2] = { \ gen_helper_##NAME##_h, gen_helper_##NAME##_w, \ }; \ - TCGLabel *over = gen_new_label(); \ gen_set_rm(s, RISCV_FRM_DYN); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);\ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -2408,7 +2343,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ s->cfg_ptr->vlenb, data, \ fns[s->sew - 1]); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -2467,9 +2401,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ static gen_helper_gvec_4_ptr * const fns[2] = { \ gen_helper_##NAME##_h, gen_helper_##NAME##_w, \ }; \ - TCGLabel *over = gen_new_label(); \ gen_set_rm(s, RISCV_FRM_DYN); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -2482,7 +2414,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ s->cfg_ptr->vlenb, data, \ fns[s->sew - 1]); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -2584,9 +2515,7 @@ static bool do_opfv(DisasContext *s, arg_rmr *a, { if (checkfn(s, a)) { uint32_t data = 0; - TCGLabel *over = gen_new_label(); gen_set_rm_chkfrm(s, rm); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, s->lmul); @@ -2597,7 +2526,6 @@ static bool do_opfv(DisasContext *s, arg_rmr *a, s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fn); mark_vs_dirty(s); - gen_set_label(over); return true; } return false; @@ -2696,8 +2624,6 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a) gen_helper_vmv_v_x_w, gen_helper_vmv_v_x_d, }; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); t1 = tcg_temp_new_i64(); /* NaN-box f[rs1] */ @@ -2711,7 +2637,6 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a) fns[s->sew - 1](dest, t1, tcg_env, desc); mark_vs_dirty(s); - gen_set_label(over); } return true; } @@ -2773,9 +2698,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ gen_helper_##HELPER##_h, \ gen_helper_##HELPER##_w, \ }; \ - TCGLabel *over = gen_new_label(); \ gen_set_rm_chkfrm(s, FRM); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -2787,7 +2710,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ s->cfg_ptr->vlenb, data, \ fns[s->sew - 1]); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -2824,9 +2746,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ gen_helper_##NAME##_h, \ gen_helper_##NAME##_w, \ }; \ - TCGLabel *over = gen_new_label(); \ gen_set_rm(s, RISCV_FRM_DYN); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -2838,7 +2758,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ s->cfg_ptr->vlenb, data, \ fns[s->sew]); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -2891,9 +2810,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ gen_helper_##HELPER##_h, \ gen_helper_##HELPER##_w, \ }; \ - TCGLabel *over = gen_new_label(); \ gen_set_rm_chkfrm(s, FRM); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -2905,7 +2822,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ s->cfg_ptr->vlenb, data, \ fns[s->sew - 1]); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -2940,9 +2856,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ gen_helper_##HELPER##_h, \ gen_helper_##HELPER##_w, \ }; \ - TCGLabel *over = gen_new_label(); \ gen_set_rm_chkfrm(s, FRM); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -2954,7 +2868,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ s->cfg_ptr->vlenb, data, \ fns[s->sew]); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -3031,8 +2944,6 @@ static bool trans_##NAME(DisasContext *s, arg_r *a) \ vext_check_isa_ill(s)) { \ uint32_t data = 0; \ gen_helper_gvec_4_ptr *fn = gen_helper_##NAME; \ - TCGLabel *over = gen_new_label(); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ data = \ @@ -3043,7 +2954,6 @@ static bool trans_##NAME(DisasContext *s, arg_r *a) \ s->cfg_ptr->vlenb, \ s->cfg_ptr->vlenb, data, fn); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -3131,8 +3041,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ s->vstart_eq_zero) { \ uint32_t data = 0; \ gen_helper_gvec_3_ptr *fn = gen_helper_##NAME; \ - TCGLabel *over = gen_new_label(); \ - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -3145,7 +3053,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ s->cfg_ptr->vlenb, \ data, fn); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -3171,8 +3078,6 @@ static bool trans_viota_m(DisasContext *s, arg_viota_m *a) require_align(a->rd, s->lmul) && s->vstart_eq_zero) { uint32_t data = 0; - TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, s->lmul); @@ -3187,7 +3092,6 @@ static bool trans_viota_m(DisasContext *s, arg_viota_m *a) s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fns[s->sew]); mark_vs_dirty(s); - gen_set_label(over); return true; } return false; @@ -3201,8 +3105,6 @@ static bool trans_vid_v(DisasContext *s, arg_vid_v *a) require_align(a->rd, s->lmul) && require_vm(a->vm, a->rd)) { uint32_t data = 0; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); data = FIELD_DP32(data, VDATA, LMUL, s->lmul); @@ -3217,7 +3119,6 @@ static bool trans_vid_v(DisasContext *s, arg_vid_v *a) s->cfg_ptr->vlenb, data, fns[s->sew]); mark_vs_dirty(s); - gen_set_label(over); return true; } return false; @@ -3630,8 +3531,6 @@ static bool trans_vcompress_vm(DisasContext *s, arg_r *a) gen_helper_vcompress_vm_b, gen_helper_vcompress_vm_h, gen_helper_vcompress_vm_w, gen_helper_vcompress_vm_d, }; - TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); data = FIELD_DP32(data, VDATA, LMUL, s->lmul); data = FIELD_DP32(data, VDATA, VTA, s->vta); @@ -3641,7 +3540,6 @@ static bool trans_vcompress_vm(DisasContext *s, arg_r *a) s->cfg_ptr->vlenb, data, fns[s->sew]); mark_vs_dirty(s); - gen_set_label(over); return true; } return false; @@ -3664,12 +3562,9 @@ static bool trans_##NAME(DisasContext *s, arg_##NAME * a) \ vreg_ofs(s, a->rs2), maxsz, maxsz); \ mark_vs_dirty(s); \ } else { \ - TCGLabel *over = gen_new_label(); \ - tcg_gen_brcondi_tl(TCG_COND_GEU, cpu_vstart, maxsz, over); \ tcg_gen_gvec_2_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), \ tcg_env, maxsz, maxsz, 0, gen_helper_vmvr_v); \ mark_vs_dirty(s); \ - gen_set_label(over); \ } \ return true; \ } \ @@ -3698,8 +3593,6 @@ static bool int_ext_op(DisasContext *s, arg_rmr *a, uint8_t seq) { uint32_t data = 0; gen_helper_gvec_3_ptr *fn; - TCGLabel *over = gen_new_label(); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); static gen_helper_gvec_3_ptr * const fns[6][4] = { { @@ -3744,7 +3637,6 @@ static bool int_ext_op(DisasContext *s, arg_rmr *a, uint8_t seq) s->cfg_ptr->vlenb, data, fn); mark_vs_dirty(s); - gen_set_label(over); return true; } diff --git a/target/riscv/insn_trans/trans_rvvk.c.inc b/target/riscv/insn_trans/trans_rvvk.c.inc index a5cdd1b67f..6d640e4596 100644 --- a/target/riscv/insn_trans/trans_rvvk.c.inc +++ b/target/riscv/insn_trans/trans_rvvk.c.inc @@ -164,8 +164,6 @@ GEN_OPIVX_GVEC_TRANS_CHECK(vandn_vx, andcs, zvkb_vx_check) gen_helper_##NAME##_w, \ gen_helper_##NAME##_d, \ }; \ - TCGLabel *over = gen_new_label(); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -177,7 +175,6 @@ GEN_OPIVX_GVEC_TRANS_CHECK(vandn_vx, andcs, zvkb_vx_check) s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, \ data, fns[s->sew]); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -249,14 +246,12 @@ GEN_OPIVI_WIDEN_TRANS(vwsll_vi, IMM_ZX, vwsll_vx, vwsll_vx_check) TCGv_ptr rd_v, rs2_v; \ TCGv_i32 desc, egs; \ uint32_t data = 0; \ - TCGLabel *over = gen_new_label(); \ \ if (!s->vstart_eq_zero || !s->vl_eq_vlmax) { \ /* save opcode for unwinding in case we throw an exception */ \ decode_save_opc(s); \ egs = tcg_constant_i32(EGS); \ gen_helper_egs_check(egs, tcg_env); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ } \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ @@ -272,7 +267,6 @@ GEN_OPIVI_WIDEN_TRANS(vwsll_vi, IMM_ZX, vwsll_vx, vwsll_vx_check) tcg_gen_addi_ptr(rs2_v, tcg_env, vreg_ofs(s, a->rs2)); \ gen_helper_##NAME(rd_v, rs2_v, tcg_env, desc); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -325,14 +319,12 @@ GEN_V_UNMASKED_TRANS(vaesem_vs, vaes_check_vs, ZVKNED_EGS) TCGv_ptr rd_v, rs2_v; \ TCGv_i32 uimm_v, desc, egs; \ uint32_t data = 0; \ - TCGLabel *over = gen_new_label(); \ \ if (!s->vstart_eq_zero || !s->vl_eq_vlmax) { \ /* save opcode for unwinding in case we throw an exception */ \ decode_save_opc(s); \ egs = tcg_constant_i32(EGS); \ gen_helper_egs_check(egs, tcg_env); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ } \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ @@ -350,7 +342,6 @@ GEN_V_UNMASKED_TRANS(vaesem_vs, vaes_check_vs, ZVKNED_EGS) tcg_gen_addi_ptr(rs2_v, tcg_env, vreg_ofs(s, a->rs2)); \ gen_helper_##NAME(rd_v, rs2_v, uimm_v, tcg_env, desc); \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -394,7 +385,6 @@ GEN_VI_UNMASKED_TRANS(vaeskf2_vi, vaeskf2_check, ZVKNED_EGS) { \ if (CHECK(s, a)) { \ uint32_t data = 0; \ - TCGLabel *over = gen_new_label(); \ TCGv_i32 egs; \ \ if (!s->vstart_eq_zero || !s->vl_eq_vlmax) { \ @@ -402,7 +392,6 @@ GEN_VI_UNMASKED_TRANS(vaeskf2_vi, vaeskf2_check, ZVKNED_EGS) decode_save_opc(s); \ egs = tcg_constant_i32(EGS); \ gen_helper_egs_check(egs, tcg_env); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ } \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ @@ -417,7 +406,6 @@ GEN_VI_UNMASKED_TRANS(vaeskf2_vi, vaeskf2_check, ZVKNED_EGS) data, gen_helper_##NAME); \ \ mark_vs_dirty(s); \ - gen_set_label(over); \ return true; \ } \ return false; \ @@ -448,7 +436,6 @@ static bool trans_vsha2cl_vv(DisasContext *s, arg_rmrr *a) { if (vsha_check(s, a)) { uint32_t data = 0; - TCGLabel *over = gen_new_label(); TCGv_i32 egs; if (!s->vstart_eq_zero || !s->vl_eq_vlmax) { @@ -456,7 +443,6 @@ static bool trans_vsha2cl_vv(DisasContext *s, arg_rmrr *a) decode_save_opc(s); egs = tcg_constant_i32(ZVKNH_EGS); gen_helper_egs_check(egs, tcg_env); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); } data = FIELD_DP32(data, VDATA, VM, a->vm); @@ -472,7 +458,6 @@ static bool trans_vsha2cl_vv(DisasContext *s, arg_rmrr *a) gen_helper_vsha2cl32_vv : gen_helper_vsha2cl64_vv); mark_vs_dirty(s); - gen_set_label(over); return true; } return false; @@ -482,7 +467,6 @@ static bool trans_vsha2ch_vv(DisasContext *s, arg_rmrr *a) { if (vsha_check(s, a)) { uint32_t data = 0; - TCGLabel *over = gen_new_label(); TCGv_i32 egs; if (!s->vstart_eq_zero || !s->vl_eq_vlmax) { @@ -490,7 +474,6 @@ static bool trans_vsha2ch_vv(DisasContext *s, arg_rmrr *a) decode_save_opc(s); egs = tcg_constant_i32(ZVKNH_EGS); gen_helper_egs_check(egs, tcg_env); - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); } data = FIELD_DP32(data, VDATA, VM, a->vm); @@ -506,7 +489,6 @@ static bool trans_vsha2ch_vv(DisasContext *s, arg_rmrr *a) gen_helper_vsha2ch32_vv : gen_helper_vsha2ch64_vv); mark_vs_dirty(s); - gen_set_label(over); return true; } return false; From patchwork Sun Mar 10 11:53:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 13588055 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A6797C54791 for ; Sun, 10 Mar 2024 11:54:49 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rjHke-0007tL-HV; Sun, 10 Mar 2024 07:54:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rjHkY-0007ok-36 for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:54 -0400 Received: from mail-oa1-x2b.google.com ([2001:4860:4864:20::2b]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rjHkV-00044R-Kl for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:53 -0400 Received: by mail-oa1-x2b.google.com with SMTP id 586e51a60fabf-2218b571776so1927985fac.0 for ; Sun, 10 Mar 2024 04:53:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; t=1710071629; x=1710676429; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=M1pgxbL4ext+eLfj8GE9V3VngjTMGW2LA/FcQ0QZgQg=; b=hEugmHwDQrC2dGGQzRvXTrp8Eh76YPtPonsAEOoacg0JH2BGdi4rZHKZF0UqMlCfgt dBlB5Onm8Ia4J+qRggkNylCUV2WwSgyeuD1Octb/OgbM7MStn/3dxUNwuUJ10T5M6SOu T1xYNyp4no4+CmX5WJix4hvjWoDHVI4pPxauHI/BSq1qD898fYGVGk+x10HdF8u14Lkz iYEx0FEMxT1Fm4NP5wIvfzRnJNXbLDvD9UoiN46uTmieTv2yS9EKTK0aaTun3RFRLHOA JGCdNKAzmH/EJoQdIeOQK8cgBBvSF+JMKVkfkTTAWJh1xPpgcKYABsXo7hMWoiUCSd1x Z+1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710071629; x=1710676429; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=M1pgxbL4ext+eLfj8GE9V3VngjTMGW2LA/FcQ0QZgQg=; b=cBuEd7aj7POeGifGamLpEOwRH1t7A1MLW1V9nkUSVlkne7aCm2SRrxBBn8qc+81ebH nHVNOlore3GVHOu7/yNY0tdM67sYYjUFnzInxctxOmhFG5VqtPo1xK+G4lOTIQraQzlJ UFojuWGvvzYeg46Q88rqurDwwZZrSESTag5mFfVflj0my9p+fJhnVhhgA2RTV8M6mYsn gcBuYvZ2g0RG/JcdlqDSTIzJjltHF3gd70x2YRNwQD1KIygHW2Lzd7ilzC+m9VQgXf6U HQN62nhxX/F89+08HopL/NY1qI+0bqyyEV3gK4nR640X0rTzv9lyw5iTtiFKSvF/Wxji cQqg== X-Gm-Message-State: AOJu0YwL4JcXIbpEqsg0KYHQCUecC0sr81Pd+T81ING/s1Gt/D2aMYeI LGGB1rPl1WMwvJb7pb0ikgBXrOO46S0a3I5hKNjx0dLr1FjYeHSjSJeK6CTOyzjm2NoXuD1nj7A 0 X-Google-Smtp-Source: AGHT+IHR7GjVqgDg49SViK8tuD1Yc7thAOvpEzkUqGNEV0pKP/CAWw/+7uawCZy3btpSQc+Q54+o0Q== X-Received: by 2002:a05:6871:5c8:b0:221:acb5:7746 with SMTP id v8-20020a05687105c800b00221acb57746mr4486832oan.41.1710071629614; Sun, 10 Mar 2024 04:53:49 -0700 (PDT) Received: from grind.. ([177.94.15.159]) by smtp.gmail.com with ESMTPSA id g22-20020aa78196000000b006e647059cccsm2449253pfi.33.2024.03.10.04.53.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 10 Mar 2024 04:53:49 -0700 (PDT) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, bmeng@tinylab.org, liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com, palmer@rivosinc.com, richard.henderson@linaro.org, philmd@linaro.org, Daniel Henrique Barboza Subject: [PATCH v10 08/10] trans_rvv.c.inc: remove redundant mark_vs_dirty() calls Date: Sun, 10 Mar 2024 08:53:12 -0300 Message-ID: <20240310115315.187283-9-dbarboza@ventanamicro.com> X-Mailer: git-send-email 2.43.2 In-Reply-To: <20240310115315.187283-1-dbarboza@ventanamicro.com> References: <20240310115315.187283-1-dbarboza@ventanamicro.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2001:4860:4864:20::2b; envelope-from=dbarboza@ventanamicro.com; helo=mail-oa1-x2b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org trans_vmv_v_i , trans_vfmv_v_f and the trans_##NAME macro from GEN_VMV_WHOLE_TRANS() are calling mark_vs_dirty() in both branches of their 'ifs'. conditionals. Call it just once in the end like other functions are doing. Signed-off-by: Daniel Henrique Barboza Reviewed-by: Richard Henderson Reviewed-by: Alistair Francis Reviewed-by: Philippe Mathieu-Daudé --- target/riscv/insn_trans/trans_rvv.c.inc | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index 4c1a064cf6..b0f19dcd85 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -2065,7 +2065,6 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a) if (s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) { tcg_gen_gvec_dup_imm(s->sew, vreg_ofs(s, a->rd), MAXSZ(s), MAXSZ(s), simm); - mark_vs_dirty(s); } else { TCGv_i32 desc; TCGv_i64 s1; @@ -2083,9 +2082,8 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a) s->cfg_ptr->vlenb, data)); tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, a->rd)); fns[s->sew](dest, s1, tcg_env, desc); - - mark_vs_dirty(s); } + mark_vs_dirty(s); return true; } return false; @@ -2612,7 +2610,6 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a) tcg_gen_gvec_dup_i64(s->sew, vreg_ofs(s, a->rd), MAXSZ(s), MAXSZ(s), t1); - mark_vs_dirty(s); } else { TCGv_ptr dest; TCGv_i32 desc; @@ -2635,9 +2632,8 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a) tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, a->rd)); fns[s->sew - 1](dest, t1, tcg_env, desc); - - mark_vs_dirty(s); } + mark_vs_dirty(s); return true; } return false; @@ -3560,12 +3556,11 @@ static bool trans_##NAME(DisasContext *s, arg_##NAME * a) \ if (s->vstart_eq_zero) { \ tcg_gen_gvec_mov(s->sew, vreg_ofs(s, a->rd), \ vreg_ofs(s, a->rs2), maxsz, maxsz); \ - mark_vs_dirty(s); \ } else { \ tcg_gen_gvec_2_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), \ tcg_env, maxsz, maxsz, 0, gen_helper_vmvr_v); \ - mark_vs_dirty(s); \ } \ + mark_vs_dirty(s); \ return true; \ } \ return false; \ From patchwork Sun Mar 10 11:53:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 13588063 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2CDB2C54E58 for ; Sun, 10 Mar 2024 11:55:35 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rjHkf-0007tm-5b; Sun, 10 Mar 2024 07:54:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rjHkc-0007rf-Fo for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:58 -0400 Received: from mail-pf1-x429.google.com ([2607:f8b0:4864:20::429]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rjHkZ-00045S-N9 for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:53:58 -0400 Received: by mail-pf1-x429.google.com with SMTP id d2e1a72fcca58-6e5dddd3b95so2519652b3a.1 for ; Sun, 10 Mar 2024 04:53:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; t=1710071633; x=1710676433; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=clxLarVfm96FQhFWe3+r/40YYNIvZDUOLCoWs9HtyxY=; b=aws87+E+q9iCAZ6Ij0AtYNWUaPWUPU2EsTZRPwyHeVAfs0IpV9R7yEjxCOUhEG8o0E 5dX3V4IOBmtbcSSr1yWv+cRQbLkfDZVnyD+q5lY+TMIFinM/1vp5Vql1c4O8sjZF6XIS QQNydc0P5CSKgljw33b709qi2CYmVdqYgphEOU4PmH0VgBFvQEhOsYJ0qpZcEZ9eqYqz CjicSK0vooXNMGv6lmOc3TQf0wnvQylUpUqWwg0MuVKOyHsL4Rav3PA7VMLyeIuPRMF4 QPptlvEX44owEpKat7ZIb1vHbHh63B8SyeQlmnibei/PsAq6/wRdSucCfJJeVHIE2OqJ Djwg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710071633; x=1710676433; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=clxLarVfm96FQhFWe3+r/40YYNIvZDUOLCoWs9HtyxY=; b=OnM4sBlt7ioWbtDC4hHFuiOIQk0n4kj5AwwQcCkBXMmagtbkkMj/bot/8ZCSSk3i9z Xuk4R1H7RhrbLcEx0wbHCj2zESVwHaOGG34/Z5pRwiApv3baQXev0sCBK/oy3yDBon5p gezpHabu5qrGWpp/gxFhR5S9N7qGvRvQ+UTSorrodndbD38xtkIMywSEdoyIYH5shyk4 m4czp3nIDu1urcX81S1EAl4RlvbwG+TeuM7ScFwj3wqJMaBoISoc+YJTeUgRwBLw7MhQ edSA5gMo/guTnajBXebLag79xh/FXo7U0lUKwQR9OqCWNP2P1rvYsWGjdWV99xIrFEyr UaYQ== X-Gm-Message-State: AOJu0YxXZFLTwKcJSNXmF+Y0FF/BTa+f6BP4mxOk/0iNTuTnaZOkw3F3 4lphSmZKjTAkB75QVE/zUkafca+FSBhnHhXZIq3ANBfLV9+jSdBRinxp6XtJQzBzj9/kJYNlf8f v X-Google-Smtp-Source: AGHT+IFScVmCozFgPJyT4vfSbawFNnfGwQ1oRPu/nIrqbPPwzXHcBXhjbv6iaA89tuerddQwT+wPvQ== X-Received: by 2002:a05:6a00:6c9c:b0:6e6:5291:1779 with SMTP id jc28-20020a056a006c9c00b006e652911779mr6991406pfb.6.1710071633262; Sun, 10 Mar 2024 04:53:53 -0700 (PDT) Received: from grind.. ([177.94.15.159]) by smtp.gmail.com with ESMTPSA id g22-20020aa78196000000b006e647059cccsm2449253pfi.33.2024.03.10.04.53.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 10 Mar 2024 04:53:52 -0700 (PDT) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, bmeng@tinylab.org, liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com, palmer@rivosinc.com, richard.henderson@linaro.org, philmd@linaro.org, Ivan Klokov , Daniel Henrique Barboza Subject: [PATCH v10 09/10] target/riscv: enable 'vstart_qe_zero' in the end of insns Date: Sun, 10 Mar 2024 08:53:13 -0300 Message-ID: <20240310115315.187283-10-dbarboza@ventanamicro.com> X-Mailer: git-send-email 2.43.2 In-Reply-To: <20240310115315.187283-1-dbarboza@ventanamicro.com> References: <20240310115315.187283-1-dbarboza@ventanamicro.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::429; envelope-from=dbarboza@ventanamicro.com; helo=mail-pf1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Ivan Klokov The vstart_qe_zero flag is updated at the beginning of the translation phase from the env->vstart variable. During the execution phase all functions will set env->vstart = 0 after a successful execution, but the vstart_eq_zero flag remains the same as at the start of the block. This will wrongly cause SIGILLs in translations that requires env->vstart = 0 and might be reading vstart_eq_zero = false. This patch adds a new finalize_rvv_inst() helper that is called at the end of each vector instruction that will both update vstart_eq_zero and do a mark_vs_dirty(). Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1976 Signed-off-by: Ivan Klokov Signed-off-by: Daniel Henrique Barboza Reviewed-by: Richard Henderson Reviewed-by: Alistair Francis --- target/riscv/insn_trans/trans_rvbf16.c.inc | 6 +- target/riscv/insn_trans/trans_rvv.c.inc | 83 ++++++++++++---------- target/riscv/insn_trans/trans_rvvk.c.inc | 12 ++-- target/riscv/translate.c | 6 ++ 4 files changed, 59 insertions(+), 48 deletions(-) diff --git a/target/riscv/insn_trans/trans_rvbf16.c.inc b/target/riscv/insn_trans/trans_rvbf16.c.inc index a842e76a6b..0a9cd1ec31 100644 --- a/target/riscv/insn_trans/trans_rvbf16.c.inc +++ b/target/riscv/insn_trans/trans_rvbf16.c.inc @@ -83,7 +83,7 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, arg_vfncvtbf16_f_f_w *a) ctx->cfg_ptr->vlenb, ctx->cfg_ptr->vlenb, data, gen_helper_vfncvtbf16_f_f_w); - mark_vs_dirty(ctx); + finalize_rvv_inst(ctx); return true; } return false; @@ -108,7 +108,7 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, arg_vfwcvtbf16_f_f_v *a) ctx->cfg_ptr->vlenb, ctx->cfg_ptr->vlenb, data, gen_helper_vfwcvtbf16_f_f_v); - mark_vs_dirty(ctx); + finalize_rvv_inst(ctx); return true; } return false; @@ -135,7 +135,7 @@ static bool trans_vfwmaccbf16_vv(DisasContext *ctx, arg_vfwmaccbf16_vv *a) ctx->cfg_ptr->vlenb, ctx->cfg_ptr->vlenb, data, gen_helper_vfwmaccbf16_vv); - mark_vs_dirty(ctx); + finalize_rvv_inst(ctx); return true; } return false; diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index b0f19dcd85..b3d467a874 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -167,7 +167,7 @@ static bool do_vsetvl(DisasContext *s, int rd, int rs1, TCGv s2) gen_helper_vsetvl(dst, tcg_env, s1, s2); gen_set_gpr(s, rd, dst); - mark_vs_dirty(s); + finalize_rvv_inst(s); gen_update_pc(s, s->cur_insn_len); lookup_and_goto_ptr(s); @@ -187,7 +187,7 @@ static bool do_vsetivli(DisasContext *s, int rd, TCGv s1, TCGv s2) gen_helper_vsetvl(dst, tcg_env, s1, s2); gen_set_gpr(s, rd, dst); - mark_vs_dirty(s); + finalize_rvv_inst(s); gen_update_pc(s, s->cur_insn_len); lookup_and_goto_ptr(s); s->base.is_jmp = DISAS_NORETURN; @@ -657,6 +657,7 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, uint32_t data, tcg_gen_mb(TCG_MO_ALL | TCG_BAR_LDAQ); } + finalize_rvv_inst(s); return true; } @@ -812,6 +813,7 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, uint32_t rs2, fn(dest, mask, base, stride, tcg_env, desc); + finalize_rvv_inst(s); return true; } @@ -913,6 +915,7 @@ static bool ldst_index_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, fn(dest, mask, base, index, tcg_env, desc); + finalize_rvv_inst(s); return true; } @@ -1043,7 +1046,7 @@ static bool ldff_trans(uint32_t vd, uint32_t rs1, uint32_t data, fn(dest, mask, base, tcg_env, desc); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } @@ -1100,6 +1103,7 @@ static bool ldst_whole_trans(uint32_t vd, uint32_t rs1, uint32_t nf, fn(dest, base, tcg_env, desc); + finalize_rvv_inst(s); return true; } @@ -1189,7 +1193,7 @@ do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn *gvec_fn, tcg_env, s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fn); } - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } @@ -1240,7 +1244,7 @@ static bool opivx_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, uint32_t vm, fn(dest, mask, src1, src2, tcg_env, desc); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } @@ -1265,7 +1269,7 @@ do_opivx_gvec(DisasContext *s, arg_rmrr *a, GVecGen2sFn *gvec_fn, gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), src1, MAXSZ(s), MAXSZ(s)); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return opivx_trans(a->rd, a->rs1, a->rs2, a->vm, fn, s); @@ -1398,7 +1402,7 @@ static bool opivi_trans(uint32_t vd, uint32_t imm, uint32_t vs2, uint32_t vm, fn(dest, mask, src1, src2, tcg_env, desc); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } @@ -1412,7 +1416,7 @@ do_opivi_gvec(DisasContext *s, arg_rmrr *a, GVecGen2iFn *gvec_fn, if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) { gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), extract_imm(s, a->rs1, imm_mode), MAXSZ(s), MAXSZ(s)); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return opivi_trans(a->rd, a->rs1, a->rs2, a->vm, fn, s, imm_mode); @@ -1471,7 +1475,7 @@ static bool do_opivv_widen(DisasContext *s, arg_rmrr *a, tcg_env, s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fn); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -1543,7 +1547,7 @@ static bool do_opiwv_widen(DisasContext *s, arg_rmrr *a, vreg_ofs(s, a->rs2), tcg_env, s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fn); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -1611,7 +1615,7 @@ static bool opivv_trans(uint32_t vd, uint32_t vs1, uint32_t vs2, uint32_t vm, tcg_gen_gvec_4_ptr(vreg_ofs(s, vd), vreg_ofs(s, 0), vreg_ofs(s, vs1), vreg_ofs(s, vs2), tcg_env, s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fn); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } @@ -1744,7 +1748,7 @@ do_opivx_gvec_shift(DisasContext *s, arg_rmrr *a, GVecGen2sFn32 *gvec_fn, gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), src1, MAXSZ(s), MAXSZ(s)); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return opivx_trans(a->rd, a->rs1, a->rs2, a->vm, fn, s); @@ -1801,7 +1805,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ s->cfg_ptr->vlenb, \ s->cfg_ptr->vlenb, data, \ fns[s->sew]); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -2004,7 +2008,7 @@ static bool trans_vmv_v_v(DisasContext *s, arg_vmv_v_v *a) s->cfg_ptr->vlenb, data, fns[s->sew]); } - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -2049,7 +2053,7 @@ static bool trans_vmv_v_x(DisasContext *s, arg_vmv_v_x *a) fns[s->sew](dest, s1_i64, tcg_env, desc); } - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -2083,7 +2087,7 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a) tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, a->rd)); fns[s->sew](dest, s1, tcg_env, desc); } - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -2231,7 +2235,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ s->cfg_ptr->vlenb, \ s->cfg_ptr->vlenb, data, \ fns[s->sew - 1]); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -2265,7 +2269,7 @@ static bool opfvf_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, fn(dest, mask, t1, src2, tcg_env, desc); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } @@ -2340,7 +2344,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ s->cfg_ptr->vlenb, \ s->cfg_ptr->vlenb, data, \ fns[s->sew - 1]); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -2411,7 +2415,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ s->cfg_ptr->vlenb, \ s->cfg_ptr->vlenb, data, \ fns[s->sew - 1]); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -2523,7 +2527,7 @@ static bool do_opfv(DisasContext *s, arg_rmr *a, vreg_ofs(s, a->rs2), tcg_env, s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fn); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -2633,7 +2637,7 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a) fns[s->sew - 1](dest, t1, tcg_env, desc); } - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -2705,7 +2709,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ s->cfg_ptr->vlenb, \ s->cfg_ptr->vlenb, data, \ fns[s->sew - 1]); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -2753,7 +2757,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ s->cfg_ptr->vlenb, \ s->cfg_ptr->vlenb, data, \ fns[s->sew]); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -2817,7 +2821,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ s->cfg_ptr->vlenb, \ s->cfg_ptr->vlenb, data, \ fns[s->sew - 1]); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -2863,7 +2867,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ s->cfg_ptr->vlenb, \ s->cfg_ptr->vlenb, data, \ fns[s->sew]); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -2949,7 +2953,7 @@ static bool trans_##NAME(DisasContext *s, arg_r *a) \ vreg_ofs(s, a->rs2), tcg_env, \ s->cfg_ptr->vlenb, \ s->cfg_ptr->vlenb, data, fn); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -3048,7 +3052,7 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ tcg_env, s->cfg_ptr->vlenb, \ s->cfg_ptr->vlenb, \ data, fn); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -3087,7 +3091,7 @@ static bool trans_viota_m(DisasContext *s, arg_viota_m *a) vreg_ofs(s, a->rs2), tcg_env, s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fns[s->sew]); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -3114,7 +3118,7 @@ static bool trans_vid_v(DisasContext *s, arg_vid_v *a) tcg_env, s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fns[s->sew]); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -3271,7 +3275,7 @@ static bool trans_vmv_x_s(DisasContext *s, arg_vmv_x_s *a) tcg_gen_trunc_i64_tl(dest, t1); gen_set_gpr(s, a->rd, dest); tcg_gen_movi_tl(cpu_vstart, 0); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -3300,7 +3304,7 @@ static bool trans_vmv_s_x(DisasContext *s, arg_vmv_s_x *a) vec_element_storei(s, a->rd, 0, t1); gen_set_label(over); tcg_gen_movi_tl(cpu_vstart, 0); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -3328,7 +3332,7 @@ static bool trans_vfmv_f_s(DisasContext *s, arg_vfmv_f_s *a) mark_fs_dirty(s); tcg_gen_movi_tl(cpu_vstart, 0); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -3354,9 +3358,10 @@ static bool trans_vfmv_s_f(DisasContext *s, arg_vfmv_s_f *a) do_nanbox(s, t1, cpu_fpr[a->rs1]); vec_element_storei(s, a->rd, 0, t1); + gen_set_label(over); tcg_gen_movi_tl(cpu_vstart, 0); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -3462,7 +3467,7 @@ static bool trans_vrgather_vx(DisasContext *s, arg_rmrr *a) tcg_gen_gvec_dup_i64(s->sew, vreg_ofs(s, a->rd), MAXSZ(s), MAXSZ(s), dest); - mark_vs_dirty(s); + finalize_rvv_inst(s); } else { static gen_helper_opivx * const fns[4] = { gen_helper_vrgather_vx_b, gen_helper_vrgather_vx_h, @@ -3490,7 +3495,7 @@ static bool trans_vrgather_vi(DisasContext *s, arg_rmrr *a) endian_ofs(s, a->rs2, a->rs1), MAXSZ(s), MAXSZ(s)); } - mark_vs_dirty(s); + finalize_rvv_inst(s); } else { static gen_helper_opivx * const fns[4] = { gen_helper_vrgather_vx_b, gen_helper_vrgather_vx_h, @@ -3535,7 +3540,7 @@ static bool trans_vcompress_vm(DisasContext *s, arg_r *a) tcg_env, s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fns[s->sew]); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -3560,7 +3565,7 @@ static bool trans_##NAME(DisasContext *s, arg_##NAME * a) \ tcg_gen_gvec_2_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), \ tcg_env, maxsz, maxsz, 0, gen_helper_vmvr_v); \ } \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -3631,7 +3636,7 @@ static bool int_ext_op(DisasContext *s, arg_rmr *a, uint8_t seq) s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data, fn); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } diff --git a/target/riscv/insn_trans/trans_rvvk.c.inc b/target/riscv/insn_trans/trans_rvvk.c.inc index 6d640e4596..ae1f40174a 100644 --- a/target/riscv/insn_trans/trans_rvvk.c.inc +++ b/target/riscv/insn_trans/trans_rvvk.c.inc @@ -174,7 +174,7 @@ GEN_OPIVX_GVEC_TRANS_CHECK(vandn_vx, andcs, zvkb_vx_check) vreg_ofs(s, a->rs2), tcg_env, \ s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, \ data, fns[s->sew]); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -266,7 +266,7 @@ GEN_OPIVI_WIDEN_TRANS(vwsll_vi, IMM_ZX, vwsll_vx, vwsll_vx_check) tcg_gen_addi_ptr(rd_v, tcg_env, vreg_ofs(s, a->rd)); \ tcg_gen_addi_ptr(rs2_v, tcg_env, vreg_ofs(s, a->rs2)); \ gen_helper_##NAME(rd_v, rs2_v, tcg_env, desc); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -341,7 +341,7 @@ GEN_V_UNMASKED_TRANS(vaesem_vs, vaes_check_vs, ZVKNED_EGS) tcg_gen_addi_ptr(rd_v, tcg_env, vreg_ofs(s, a->rd)); \ tcg_gen_addi_ptr(rs2_v, tcg_env, vreg_ofs(s, a->rs2)); \ gen_helper_##NAME(rd_v, rs2_v, uimm_v, tcg_env, desc); \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -405,7 +405,7 @@ GEN_VI_UNMASKED_TRANS(vaeskf2_vi, vaeskf2_check, ZVKNED_EGS) s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, \ data, gen_helper_##NAME); \ \ - mark_vs_dirty(s); \ + finalize_rvv_inst(s); \ return true; \ } \ return false; \ @@ -457,7 +457,7 @@ static bool trans_vsha2cl_vv(DisasContext *s, arg_rmrr *a) s->sew == MO_32 ? gen_helper_vsha2cl32_vv : gen_helper_vsha2cl64_vv); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; @@ -488,7 +488,7 @@ static bool trans_vsha2ch_vv(DisasContext *s, arg_rmrr *a) s->sew == MO_32 ? gen_helper_vsha2ch32_vv : gen_helper_vsha2ch64_vv); - mark_vs_dirty(s); + finalize_rvv_inst(s); return true; } return false; diff --git a/target/riscv/translate.c b/target/riscv/translate.c index ea5d52b2ef..9d57089fcc 100644 --- a/target/riscv/translate.c +++ b/target/riscv/translate.c @@ -676,6 +676,12 @@ static void mark_vs_dirty(DisasContext *ctx) static inline void mark_vs_dirty(DisasContext *ctx) { } #endif +static void finalize_rvv_inst(DisasContext *ctx) +{ + mark_vs_dirty(ctx); + ctx->vstart_eq_zero = true; +} + static void gen_set_rm(DisasContext *ctx, int rm) { if (ctx->frm == rm) { From patchwork Sun Mar 10 11:53:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 13588058 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D9477C54E58 for ; Sun, 10 Mar 2024 11:54:59 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rjHkg-0007ul-Pf; Sun, 10 Mar 2024 07:54:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rjHkf-0007uB-DA for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:54:01 -0400 Received: from mail-pf1-x42f.google.com ([2607:f8b0:4864:20::42f]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rjHkc-00045e-Ue for qemu-devel@nongnu.org; Sun, 10 Mar 2024 07:54:01 -0400 Received: by mail-pf1-x42f.google.com with SMTP id d2e1a72fcca58-6da9c834646so3324402b3a.3 for ; Sun, 10 Mar 2024 04:53:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; t=1710071636; x=1710676436; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=TMn63prcVteRus1yYdprTy/N0umMjC+ArdoC3MJRmKA=; b=Gp7kAbyVzbhshiGoyksLjdrzntU+i9/DMdM22TRoqLP/s+XCxgCaVCDTiwhcx8MQdr fQjO7nhyDVULbmCUsxXKekUpQiiYZYh+gWk2jiJBvvJF8lGQW3qinoLM2wYgIpGExOJt jLFu/iuaFWaH/Dl7Fz+ULwN3wDfcX41VOZ6AApPuDCkwLl96rk35BN1tZOJWq6MTqP7r LNDLT97oFFbpS87ymG+IC3NcygmR8CRRx46aRrtAigXO7q+g8fudNHV+YcY2KlMBqhRo Ei0jkE1PEgUAmPZkKr096P4jTxhN1MX1FHI8jZMQpC41U5R0UXvmJ+axFcJBP9psDhAY K/ZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710071636; x=1710676436; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=TMn63prcVteRus1yYdprTy/N0umMjC+ArdoC3MJRmKA=; b=p/qfOEaUHA8IkSrIHt9ASyKTnQAxagG8EgB9bC+Cx3kGQwhA4obk3zMc7ZXC/Ax1aQ p99Frygqk5svrLkzGA4Pz9smzqLJA7BfWOvcvdNnOjRAgSf1axGkb78hewhb+Bixlee/ N82ftuSp1yoGiZS7Dzd1aMKOFQ7mz8geb7of+d3+CgRCgvHO9SAhLYRyi61J0sEFfFdO Yh1sLzPGaicfw+8JRxKLBGgiIPerA8jsJV2Eujzu69Sk6y771LLmWt6W4ILC59GAFWyy FaB6Q0g4/9ldLrHKMs60TS3Wu42r8AKuCxntqUOBTuEYZ1VURcrhAapmOm8nQkoiv42S JDew== X-Gm-Message-State: AOJu0YyJ/XgHXGNMXfW1+okw5mr8VN9YjfGDcJfCUlJS64Klwp28Jw84 EBN6icA7IqwJHYqVv5ekuiqe3EDUFxQXYnUXQxmROM1bO/74p1XWcRTmPjJlv4oyXQZIxQ03hJu k X-Google-Smtp-Source: AGHT+IFUwsA/jTOcAvz9/pkbzo5evtQFxHkRNBEKWYoGKUA0bYWWOOsoiPz9hsXp8ZDCHR6hSlZQiQ== X-Received: by 2002:a05:6a00:2e94:b0:6e6:16b5:2ea3 with SMTP id fd20-20020a056a002e9400b006e616b52ea3mr6030031pfb.27.1710071636621; Sun, 10 Mar 2024 04:53:56 -0700 (PDT) Received: from grind.. ([177.94.15.159]) by smtp.gmail.com with ESMTPSA id g22-20020aa78196000000b006e647059cccsm2449253pfi.33.2024.03.10.04.53.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 10 Mar 2024 04:53:56 -0700 (PDT) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Cc: qemu-riscv@nongnu.org, alistair.francis@wdc.com, bmeng@tinylab.org, liwei1518@gmail.com, zhiwei_liu@linux.alibaba.com, palmer@rivosinc.com, richard.henderson@linaro.org, philmd@linaro.org, Daniel Henrique Barboza Subject: [PATCH v10 10/10] target/riscv/vector_helper.c: optimize loops in ldst helpers Date: Sun, 10 Mar 2024 08:53:14 -0300 Message-ID: <20240310115315.187283-11-dbarboza@ventanamicro.com> X-Mailer: git-send-email 2.43.2 In-Reply-To: <20240310115315.187283-1-dbarboza@ventanamicro.com> References: <20240310115315.187283-1-dbarboza@ventanamicro.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42f; envelope-from=dbarboza@ventanamicro.com; helo=mail-pf1-x42f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Change the for loops in ldst helpers to do a single increment in the counter, and assign it env->vstart, to avoid re-reading from vstart every time. Suggested-by: Richard Henderson Signed-off-by: Daniel Henrique Barboza Reviewed-by: Alistair Francis Reviewed-by: Richard Henderson --- target/riscv/vector_helper.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 4fe8752eea..ee57300dc0 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -195,7 +195,7 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base, return; } - for (i = env->vstart; i < env->vl; i++, env->vstart++) { + for (i = env->vstart; i < env->vl; env->vstart = ++i) { k = 0; while (k < nf) { if (!vm && !vext_elem_mask(v0, i)) { @@ -270,7 +270,7 @@ vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc, } /* load bytes from guest memory */ - for (i = env->vstart; i < evl; i++, env->vstart++) { + for (i = env->vstart; i < evl; env->vstart = ++i) { k = 0; while (k < nf) { target_ulong addr = base + ((i * nf + k) << log2_esz); @@ -393,7 +393,7 @@ vext_ldst_index(void *vd, void *v0, target_ulong base, } /* load bytes from guest memory */ - for (i = env->vstart; i < env->vl; i++, env->vstart++) { + for (i = env->vstart; i < env->vl; env->vstart = ++i) { k = 0; while (k < nf) { if (!vm && !vext_elem_mask(v0, i)) {