From patchwork Fri May 31 12:03:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 13681632 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B6943C27C4F for ; Fri, 31 May 2024 12:11:01 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sD0zq-0004RZ-D1; Fri, 31 May 2024 08:04:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sD0zY-0004JF-Me for qemu-devel@nongnu.org; Fri, 31 May 2024 08:04:16 -0400 Received: from mail-wm1-x329.google.com ([2a00:1450:4864:20::329]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sD0zU-0003PJ-EL for qemu-devel@nongnu.org; Fri, 31 May 2024 08:04:16 -0400 Received: by mail-wm1-x329.google.com with SMTP id 5b1f17b1804b1-4211249fdf4so20404595e9.3 for ; Fri, 31 May 2024 05:04:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1717157048; x=1717761848; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=m300P5qGdWq/KX3sCv0lwvsoRKZl9oqmNGtWoC/clvE=; b=jBstJ6VbbHxIfBHEH+8AjRQDQKouDXCKWb0jhxvZfBL2mRpkpH31jmhGt66OU6AyZa JLbEIeGy2h8k8jPUHHgE71G0BfVt5YRSv3Honzbi2POVcpEHSd7lmC33ywA4cVVxw9nZ 26nOaL5fzs3n5L9mGOcD2SUC9C7O16djqz+v/Ifu/e0LgSI8OZnOLl9hUTMfR6iWZQPM 6DNfQ/4mAUIQqLvuubS3C5lp3veGqVdND9yGtPFtDv+fVm+fnTQi9fK8Qz6Rk/4aSCh6 kOKlIaHyoVhCEIJaaWyJqZb7CHIDvPedIX7E7LkRYxsU7RC5k4TQdAYbpHdTu/4ua8jy R8/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717157048; x=1717761848; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=m300P5qGdWq/KX3sCv0lwvsoRKZl9oqmNGtWoC/clvE=; b=pg5Jy+uTb0iYBTfI23M5HB3ariRY6q5w8tF1V3GvGuZnCkg/nlZwj2YRl46h2kV+xe r5TsuCcNg+CUClqmBSGio140sEv18pAF0Uli7B0V0B+33yt73lGTOOBVJd/BE031AL9B yohw2eivHH8m7D2PGJd2Sqy5B68+x7B5POP76DTt2sc7CmmCO56amP7MfHXNmVVlFZqW QN7wWOPF9zl8ZQAPEvHmrwtV/wnBXq7pD1rPOM/FQtC2wXUinPGjOtT0+4qIhvn/dhSE YCuXaOy+zuDuXqSO86ASYhpoHJP4r4Tl9FuGKiHSVvhkVnGRcIp3APUvODnhRGmbRHN4 PIbw== X-Gm-Message-State: AOJu0YwwvSyqABDd+iOQG2uESxuflRJXxminzld07ZtNzIyrkI6wEkVg +vt5QhDC/B5ODqwxxJPCyJXAeLm37+yQJarbgCoHjUzZgArcsKTusOPrHISAZsRQbWdMRgVUE9D E X-Google-Smtp-Source: AGHT+IHxMPmOqDhWxZWZVOLj795tTbc393q3pRN1i2843XZb/oNJmVvXqaywuxp8/+Q7tPC2FA48iA== X-Received: by 2002:a05:600c:450b:b0:421:2990:7f8b with SMTP id 5b1f17b1804b1-4212e044c39mr14746845e9.3.1717157048182; Fri, 31 May 2024 05:04:08 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [2001:8b0:1d0::2]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-42129de0cf9sm37027685e9.13.2024.05.31.05.04.07 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 31 May 2024 05:04:07 -0700 (PDT) From: Peter Maydell To: qemu-devel@nongnu.org Subject: [PULL 07/43] target/arm: Improve vector UQADD, UQSUB, SQADD, SQSUB Date: Fri, 31 May 2024 13:03:25 +0100 Message-Id: <20240531120401.394550-8-peter.maydell@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240531120401.394550-1-peter.maydell@linaro.org> References: <20240531120401.394550-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::329; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x329.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Richard Henderson No need for a full comparison; xor produces non-zero bits for QC just fine. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson Message-id: 20240528203044.612851-3-richard.henderson@linaro.org Signed-off-by: Peter Maydell --- target/arm/tcg/gengvec.c | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c index 22c9d17dce4..bfe6885a018 100644 --- a/target/arm/tcg/gengvec.c +++ b/target/arm/tcg/gengvec.c @@ -1217,21 +1217,21 @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } -static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, +static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { TCGv_vec x = tcg_temp_new_vec_matching(t); tcg_gen_add_vec(vece, x, a, b); tcg_gen_usadd_vec(vece, t, a, b); - tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); - tcg_gen_or_vec(vece, sat, sat, x); + tcg_gen_xor_vec(vece, x, x, t); + tcg_gen_or_vec(vece, qc, qc, x); } void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { static const TCGOpcode vecop_list[] = { - INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 + INDEX_op_usadd_vec, INDEX_op_add_vec, 0 }; static const GVecGen4 ops[4] = { { .fniv = gen_uqadd_vec, @@ -1259,21 +1259,21 @@ void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } -static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, +static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { TCGv_vec x = tcg_temp_new_vec_matching(t); tcg_gen_add_vec(vece, x, a, b); tcg_gen_ssadd_vec(vece, t, a, b); - tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); - tcg_gen_or_vec(vece, sat, sat, x); + tcg_gen_xor_vec(vece, x, x, t); + tcg_gen_or_vec(vece, qc, qc, x); } void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { static const TCGOpcode vecop_list[] = { - INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 + INDEX_op_ssadd_vec, INDEX_op_add_vec, 0 }; static const GVecGen4 ops[4] = { { .fniv = gen_sqadd_vec, @@ -1301,21 +1301,21 @@ void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } -static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, +static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { TCGv_vec x = tcg_temp_new_vec_matching(t); tcg_gen_sub_vec(vece, x, a, b); tcg_gen_ussub_vec(vece, t, a, b); - tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); - tcg_gen_or_vec(vece, sat, sat, x); + tcg_gen_xor_vec(vece, x, x, t); + tcg_gen_or_vec(vece, qc, qc, x); } void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { static const TCGOpcode vecop_list[] = { - INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 + INDEX_op_ussub_vec, INDEX_op_sub_vec, 0 }; static const GVecGen4 ops[4] = { { .fniv = gen_uqsub_vec, @@ -1343,21 +1343,21 @@ void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } -static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, +static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { TCGv_vec x = tcg_temp_new_vec_matching(t); tcg_gen_sub_vec(vece, x, a, b); tcg_gen_sssub_vec(vece, t, a, b); - tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); - tcg_gen_or_vec(vece, sat, sat, x); + tcg_gen_xor_vec(vece, x, x, t); + tcg_gen_or_vec(vece, qc, qc, x); } void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { static const TCGOpcode vecop_list[] = { - INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 + INDEX_op_sssub_vec, INDEX_op_sub_vec, 0 }; static const GVecGen4 ops[4] = { { .fniv = gen_sqsub_vec,