From patchwork Tue May 25 01:02:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277417 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8043BC2B9F7 for ; Tue, 25 May 2021 01:06:16 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E5828613EC for ; Tue, 25 May 2021 01:06:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E5828613EC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39930 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLWQ-00047A-Ug for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:06:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53158) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUK-0000tl-SE for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:04 -0400 Received: from mail-pf1-x429.google.com ([2607:f8b0:4864:20::429]:33420) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUH-0001bm-PI for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:04 -0400 Received: by mail-pf1-x429.google.com with SMTP id f22so13898724pfn.0 for ; Mon, 24 May 2021 18:04:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=jk9xuQuOdWwLhsZNV9U+4ALfnilDyaP5t7dhpaqzQXE=; b=CgYyAhh2wNdocjFevoKi06EGWL9bbN5tgQ/2qTbbbzWqRFFZHp+fBu/i4y4vbhXfpS h5fNoo1ua2jJ0tPTPwbEGSU4eBlGtY/dG30kf6RHlXdpbCxxjvkLwEr7aWJ4iSXLUf7C 2V5c+KR1POAof6R5dtpnsbti6kCfGs2czvOFiGJAsGDFMcyYL6EI33U0bZ7tSfSwUU8o GBtXYfDQ82m5fU1mIiTGUqJ+faecFn5rBKrFsy+/CpDutacV0HQ4uAl0Qqqs20RvJKM8 DwUw2gEap6P6MS5VztDbY+6AyLq5KvZN/6p3OSa/jpmb4kmy9vwoNeVP01+ovyi6t623 bXxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=jk9xuQuOdWwLhsZNV9U+4ALfnilDyaP5t7dhpaqzQXE=; b=UrJc4lEEIAZClWDRFLtvJvjOxBnEAY3x1Q/JaOKUc31Re9vFAFOp/L4Q6RAzoD651D SLKkMmwER//YVh95MjA20KRuydzlVnRBoWS8X9xyVgIpeX4Qfd8H5TwM8nac3y3GO08H 5P8DTNd3Oc/wuXlJGMz3t3K1++6vUsV6i75eJyPSShBIKT//UTtwZ/Szbrjngjy2zFn4 etPE/1yTqScv29XqM5v3fSJnBkoiEZvjoDDLoW+hLJWuxz0j6KgUxGEq7UgLnNGV/zaE XVwUuPD2Eq0AxJbMco1U8t6pgmllKduSztCgYXxSQ33AmV1Ji6TlMhHWUVw2TZMwhd9l w8QA== X-Gm-Message-State: AOAM5310VoUfqcM4mqkC5kMj0foUR/cyzbaKZYKdMQ2KMKtRDRMuD56E IEpCRRPxn3R3J9RgpxndhFSa96njD+0XSw== X-Google-Smtp-Source: ABdhPJznsZE5x1YyAM/VVwAaM6zIaljOu0J0r6/dydjhjwA21TZLWD8O4xMZ76kWqgjjmavN0n3aeA== X-Received: by 2002:aa7:92da:0:b029:2d5:59bc:a7e4 with SMTP id k26-20020aa792da0000b02902d559bca7e4mr27283842pfa.46.1621904640361; Mon, 24 May 2021 18:04:00 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.03.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:00 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 01/92] target/arm: Add ID_AA64ZFR0 fields and isar_feature_aa64_sve2 Date: Mon, 24 May 2021 18:02:27 -0700 Message-Id: <20210525010358.152808-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::429; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org, =?utf-8?q?Alex_Benn=C3=A9e?= Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Will be used for SVE2 isa subset enablement. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- v2: Do not read zfr0 from kvm unless sve is available. v7: Move zfr0 read inside existing sve_enabled block. --- target/arm/cpu.h | 16 ++++++++++++++++ target/arm/helper.c | 3 +-- target/arm/kvm64.c | 21 +++++++++++++++------ 3 files changed, 32 insertions(+), 8 deletions(-) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 616b393253..a6e1fa6333 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -947,6 +947,7 @@ struct ARMCPU { uint64_t id_aa64mmfr2; uint64_t id_aa64dfr0; uint64_t id_aa64dfr1; + uint64_t id_aa64zfr0; } isar; uint64_t midr; uint32_t revidr; @@ -2034,6 +2035,16 @@ FIELD(ID_AA64DFR0, DOUBLELOCK, 36, 4) FIELD(ID_AA64DFR0, TRACEFILT, 40, 4) FIELD(ID_AA64DFR0, MTPMU, 48, 4) +FIELD(ID_AA64ZFR0, SVEVER, 0, 4) +FIELD(ID_AA64ZFR0, AES, 4, 4) +FIELD(ID_AA64ZFR0, BITPERM, 16, 4) +FIELD(ID_AA64ZFR0, BFLOAT16, 20, 4) +FIELD(ID_AA64ZFR0, SHA3, 32, 4) +FIELD(ID_AA64ZFR0, SM4, 40, 4) +FIELD(ID_AA64ZFR0, I8MM, 44, 4) +FIELD(ID_AA64ZFR0, F32MM, 52, 4) +FIELD(ID_AA64ZFR0, F64MM, 56, 4) + FIELD(ID_DFR0, COPDBG, 0, 4) FIELD(ID_DFR0, COPSDBG, 4, 4) FIELD(ID_DFR0, MMAPDBG, 8, 4) @@ -4215,6 +4226,11 @@ static inline bool isar_feature_aa64_ssbs(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64pfr1, ID_AA64PFR1, SSBS) != 0; } +static inline bool isar_feature_aa64_sve2(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SVEVER) != 0; +} + /* * Feature tests for "does this exist in either 32-bit or 64-bit?" */ diff --git a/target/arm/helper.c b/target/arm/helper.c index 3b365a78cb..696aea2250 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -7561,8 +7561,7 @@ void register_cp_regs_for_features(ARMCPU *cpu) .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 4, .access = PL1_R, .type = ARM_CP_CONST, .accessfn = access_aa64_tid3, - /* At present, only SVEver == 0 is defined anyway. */ - .resetvalue = 0 }, + .resetvalue = cpu->isar.id_aa64zfr0 }, { .name = "ID_AA64PFR5_EL1_RESERVED", .state = ARM_CP_STATE_AA64, .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 5, .access = PL1_R, .type = ARM_CP_CONST, diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c index dff85f6db9..37ceadd9a9 100644 --- a/target/arm/kvm64.c +++ b/target/arm/kvm64.c @@ -647,17 +647,26 @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf) sve_supported = ioctl(fdarray[0], KVM_CHECK_EXTENSION, KVM_CAP_ARM_SVE) > 0; - kvm_arm_destroy_scratch_host_vcpu(fdarray); - - if (err < 0) { - return false; - } - /* Add feature bits that can't appear until after VCPU init. */ if (sve_supported) { t = ahcf->isar.id_aa64pfr0; t = FIELD_DP64(t, ID_AA64PFR0, SVE, 1); ahcf->isar.id_aa64pfr0 = t; + + /* + * Before v5.1, KVM did not support SVE and did not expose + * ID_AA64ZFR0_EL1 even as RAZ. After v5.1, KVM still does + * not expose the register to "user" requests like this + * unless the host supports SVE. + */ + err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64zfr0, + ARM64_SYS_REG(3, 0, 0, 4, 4)); + } + + kvm_arm_destroy_scratch_host_vcpu(fdarray); + + if (err < 0) { + return false; } /* From patchwork Tue May 25 01:02:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277423 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEB7FC2B9F7 for ; Tue, 25 May 2021 01:06:34 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 28F6D61378 for ; Tue, 25 May 2021 01:06:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 28F6D61378 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:40874 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLWj-0004jK-8p for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:06:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53228) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUM-0000w3-MM for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:06 -0400 Received: from mail-pj1-x1031.google.com ([2607:f8b0:4864:20::1031]:37870) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUI-0001cF-8Y for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:06 -0400 Received: by mail-pj1-x1031.google.com with SMTP id gb21-20020a17090b0615b029015d1a863a91so12272810pjb.2 for ; Mon, 24 May 2021 18:04:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=2ryMgFEidN6BuvEO6ooGd+kn9kbE73JtCknZ+NXDv7g=; b=WtPigSd6+rd4dZTI5bl7jQa6ALvlr4RdnHAyKiWQKCl+3qEDKSr6lhibb21EZzCwYF iDYhbjkUJ49wYWk3tKv6I49WpG6XV3WbBwtNFOAR+rBZWD7RY+WSrggOLorRIlOA3ZbL jvxFIQbgJXV4r22TXFc3vJMUWXex/4kWpVcmXwrsaL7auEyUa4rD9lmXth4IO7aRzGrj WHh50nY2J3eiVe29JfFhrP50g6O/sWmq9GkocotAmeOhJDAXD78KSM0saMYhy67CGBKL jvH9bD0QFERCX+WjM4qysZPOXsYoK+4FM6rO7r999ma/ohJN547A3O25BVO3D91QrW1f PR5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=2ryMgFEidN6BuvEO6ooGd+kn9kbE73JtCknZ+NXDv7g=; b=F+QsOYOceUvK7m5SiE7njfQ2GdTZvgMGGlieMiKP8PdzBWbGiW3gwkXLJwcf5PmHiV pgzc0ry7eo/oOC3oyLztdcCedulWfk8GEtLdEKFBXkHyDiLsPlXVBVqqGkbwyRx2p7T5 rKu9b2TfEYqXqHgGGATjXI+pT0PslUfgL8C8oN6Mon/JFmhuDu1Tn0YIzfaJ5LAd0d+Z hL8n+5IHPoF1cVOye0UzJYzAgh5kOeQkX1tBEgpMd4NM0W1elSErtafB6VuDqPs/PJin RkRaL3H3KJrQg59hHdFLUwNWhvs7H/DbxaWEEx3R37z6GlwohRjUjzgZD+LXVpn1tg61 NjTA== X-Gm-Message-State: AOAM531+kXSNcq5mwkpvPzJFVY0/h0PuhiZuEBcI+4yBTmCDTRhKqbQe RZlqOvQdGIjIZ2hf7lfJUq77cBmDw6QZfw== X-Google-Smtp-Source: ABdhPJwnN2NsQElKMtEn942iMqEUlzIjg1k0RabiHMi3IpnP3Xo23J108KkOit3UDUJ3AOgV3OfSFw== X-Received: by 2002:a17:90a:8902:: with SMTP id u2mr27791387pjn.143.1621904640934; Mon, 24 May 2021 18:04:00 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:00 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 02/92] target/arm: Implement SVE2 Integer Multiply - Unpredicated Date: Mon, 24 May 2021 18:02:28 -0700 Message-Id: <20210525010358.152808-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1031; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1031.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" For MUL, we can rely on generic support. For SMULH and UMULH, create some trivial helpers. For PMUL, back in a21bb78e5817, we organized helper_gvec_pmul_b in preparation for this use. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 ++++ target/arm/sve.decode | 10 ++++ target/arm/translate-sve.c | 50 ++++++++++++++++++++ target/arm/vec_helper.c | 96 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 166 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index ff8148ddc6..2c412ffd3b 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -828,6 +828,16 @@ DEF_HELPER_FLAGS_3(gvec_cgt0_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_cge0_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_cge0_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_umulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(gvec_sshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_sshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_ushl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5c90603358..557706cacb 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1090,3 +1090,13 @@ ST1_zprz 1110010 .. 00 ..... 100 ... ..... ..... \ @rprr_scatter_store xs=0 esz=3 scale=0 ST1_zprz 1110010 .. 00 ..... 110 ... ..... ..... \ @rprr_scatter_store xs=1 esz=3 scale=0 + +#### SVE2 Support + +### SVE2 Integer Multiply - Unpredicated + +# SVE2 integer multiply vectors (unpredicated) +MUL_zzz 00000100 .. 1 ..... 0110 00 ..... ..... @rd_rn_rm +SMULH_zzz 00000100 .. 1 ..... 0110 10 ..... ..... @rd_rn_rm +UMULH_zzz 00000100 .. 1 ..... 0110 11 ..... ..... @rd_rn_rm +PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 864ed669c4..f82d7d96f6 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5795,3 +5795,53 @@ static bool trans_MOVPRFX_z(DisasContext *s, arg_rpr_esz *a) { return do_movz_zpz(s, a->rd, a->rn, a->pg, a->esz, false); } + +/* + * SVE2 Integer Multiply - Unpredicated + */ + +static bool trans_MUL_zzz(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_fn_zzz(s, tcg_gen_gvec_mul, a->esz, a->rd, a->rn, a->rm); + } + return true; +} + +static bool do_sve2_zzz_ool(DisasContext *s, arg_rrr_esz *a, + gen_helper_gvec_3 *fn) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzz(s, fn, a->rd, a->rn, a->rm, 0); + } + return true; +} + +static bool trans_SMULH_zzz(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_smulh_b, gen_helper_gvec_smulh_h, + gen_helper_gvec_smulh_s, gen_helper_gvec_smulh_d, + }; + return do_sve2_zzz_ool(s, a, fns[a->esz]); +} + +static bool trans_UMULH_zzz(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_umulh_b, gen_helper_gvec_umulh_h, + gen_helper_gvec_umulh_s, gen_helper_gvec_umulh_d, + }; + return do_sve2_zzz_ool(s, a, fns[a->esz]); +} + +static bool trans_PMUL_zzz(DisasContext *s, arg_rrr_esz *a) +{ + return do_sve2_zzz_ool(s, a, gen_helper_gvec_pmul_b); +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 3fbeae87cb..40b92100bf 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1985,3 +1985,99 @@ void HELPER(simd_tblx)(void *vd, void *vm, void *venv, uint32_t desc) clear_tail(vd, oprsz, simd_maxsz(desc)); } #endif + +/* + * NxN -> N highpart multiply + * + * TODO: expose this as a generic vector operation. + */ + +void HELPER(gvec_smulh_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ((int32_t)n[i] * m[i]) >> 8; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_smulh_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = ((int32_t)n[i] * m[i]) >> 16; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_smulh_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = ((int64_t)n[i] * m[i]) >> 32; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_smulh_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + uint64_t discard; + + for (i = 0; i < opr_sz / 8; ++i) { + muls64(&discard, &d[i], n[i], m[i]); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_umulh_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ((uint32_t)n[i] * m[i]) >> 8; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_umulh_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = ((uint32_t)n[i] * m[i]) >> 16; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_umulh_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = ((uint64_t)n[i] * m[i]) >> 32; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_umulh_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + uint64_t discard; + + for (i = 0; i < opr_sz / 8; ++i) { + mulu64(&discard, &d[i], n[i], m[i]); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} From patchwork Tue May 25 01:02:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277421 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BDFB7C2B9F7 for ; Tue, 25 May 2021 01:06:27 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 54087613D8 for ; Tue, 25 May 2021 01:06:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 54087613D8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:40210 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLWc-0004Ia-Bt for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:06:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53286) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUN-000100-V4 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:07 -0400 Received: from mail-pl1-x62a.google.com ([2607:f8b0:4864:20::62a]:33333) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUJ-0001dN-9j for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:07 -0400 Received: by mail-pl1-x62a.google.com with SMTP id b7so11343110plg.0 for ; Mon, 24 May 2021 18:04:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=FNZnUD/eyFlv1BY39McDi1pIrDDwWGi6zae0AybpjpQ=; b=rhl/7K8MTXoh219VrmXcAoVFnsPl6dNGvxgrZ98dDSu0/D7AyXOzUfb/+8pFK6UscX kBuXUg5JQ/0U2P/jHQeD8/qSKgRG6BDFhx4q7K1HYOVAJEw5i6fjw/fp6N2ZYizs71E+ SDS+d2C/d+LvqyGIVLDziZTloY2GKY7Ko4O0onalvyk6jV4/JO6JsyMQ6uDgiyLKULKI e+bmsqs/u1vcFqms6E7/Hsyj8YF2Togv5VqLavt/Y82+FGegcvRsWCMPw7MRAobzJ1z3 Y2zuSSNBV4G/7WQ/npaLIMRrbwz+j/d+VqsllWELYskaB8QkcI7lhzIF5PByj7czY4E/ zfKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FNZnUD/eyFlv1BY39McDi1pIrDDwWGi6zae0AybpjpQ=; b=GbNcBr7+vmpAY8sHZVJOtrgfUlXBid08WpCKrZjOOZbaXXevMI2xQ8aEdXBHgJGdjW Pot7N529rQ2DI+XUwZ+dCEVVz7sujmZ5psZhL3YqdSm4ig6T+glHcVIGfX2oGyu94Yl3 nZHwyRCeFVbOaKBfXsBfdqIUy1Pi2To+JozR1AohKBAfjUggBfbhxhuGEsnyhd5KeZp6 H2PtUxNa/G2tNDE3j1uII31Qg9TmCCxwTbRfxQET72jdUnMNM8l4Pz9TiPuoSMgl1192 ewXmqoez5r2m2YZOaRjt1slZ6dEZOKqeoOSpe/GMtgCSjCn6ta8nVx7JUVycFcpw0YS5 hPrQ== X-Gm-Message-State: AOAM530IVqidmmMYncJ1RyZLDOx1MvhpuvS1wLIf3+qaCzXDDv/yc0g+ TkOuUTDgryZKEslQKfZuXU5n0y/rUKZO9A== X-Google-Smtp-Source: ABdhPJxqHynvMmwBMTFAHrWIAAgtzBjgyLXeOMsoIgJ++i96W39Qr5zAhmGjOrhaEofpwrQLFLIi6A== X-Received: by 2002:a17:90a:3988:: with SMTP id z8mr28354500pjb.175.1621904641738; Mon, 24 May 2021 18:04:01 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:01 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 03/92] target/arm: Implement SVE2 integer pairwise add and accumulate long Date: Mon, 24 May 2021 18:02:29 -0700 Message-Id: <20210525010358.152808-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62a; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 ++++++++++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 44 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 39 +++++++++++++++++++++++++++++++++ 4 files changed, 102 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index e4cadd2a65..b2a274b40b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -158,6 +158,20 @@ DEF_HELPER_FLAGS_5(sve_umulh_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_umulh_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sadalp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sadalp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sadalp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 557706cacb..0524c01fcf 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1100,3 +1100,8 @@ MUL_zzz 00000100 .. 1 ..... 0110 00 ..... ..... @rd_rn_rm SMULH_zzz 00000100 .. 1 ..... 0110 10 ..... ..... @rd_rn_rm UMULH_zzz 00000100 .. 1 ..... 0110 11 ..... ..... @rd_rn_rm PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 + +### SVE2 Integer - Predicated + +SADALP_zpzz 01000100 .. 000 100 101 ... ..... ..... @rdm_pg_rn +UADALP_zpzz 01000100 .. 000 101 101 ... ..... ..... @rdm_pg_rn diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index c068dfa0d5..f44b4138cc 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -517,6 +517,50 @@ DO_ZPZZ_D(sve_asr_zpzz_d, int64_t, DO_ASR) DO_ZPZZ_D(sve_lsr_zpzz_d, uint64_t, DO_LSR) DO_ZPZZ_D(sve_lsl_zpzz_d, uint64_t, DO_LSL) +static inline uint16_t do_sadalp_h(int16_t n, int16_t m) +{ + int8_t n1 = n, n2 = n >> 8; + return m + n1 + n2; +} + +static inline uint32_t do_sadalp_s(int32_t n, int32_t m) +{ + int16_t n1 = n, n2 = n >> 16; + return m + n1 + n2; +} + +static inline uint64_t do_sadalp_d(int64_t n, int64_t m) +{ + int32_t n1 = n, n2 = n >> 32; + return m + n1 + n2; +} + +DO_ZPZZ(sve2_sadalp_zpzz_h, int16_t, H1_2, do_sadalp_h) +DO_ZPZZ(sve2_sadalp_zpzz_s, int32_t, H1_4, do_sadalp_s) +DO_ZPZZ_D(sve2_sadalp_zpzz_d, int64_t, do_sadalp_d) + +static inline uint16_t do_uadalp_h(uint16_t n, uint16_t m) +{ + uint8_t n1 = n, n2 = n >> 8; + return m + n1 + n2; +} + +static inline uint32_t do_uadalp_s(uint32_t n, uint32_t m) +{ + uint16_t n1 = n, n2 = n >> 16; + return m + n1 + n2; +} + +static inline uint64_t do_uadalp_d(uint64_t n, uint64_t m) +{ + uint32_t n1 = n, n2 = n >> 32; + return m + n1 + n2; +} + +DO_ZPZZ(sve2_uadalp_zpzz_h, uint16_t, H1_2, do_uadalp_h) +DO_ZPZZ(sve2_uadalp_zpzz_s, uint32_t, H1_4, do_uadalp_s) +DO_ZPZZ_D(sve2_uadalp_zpzz_d, uint64_t, do_uadalp_d) + #undef DO_ZPZZ #undef DO_ZPZZ_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index f82d7d96f6..208d9ea7e0 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5845,3 +5845,42 @@ static bool trans_PMUL_zzz(DisasContext *s, arg_rrr_esz *a) { return do_sve2_zzz_ool(s, a, gen_helper_gvec_pmul_b); } + +/* + * SVE2 Integer - Predicated + */ + +static bool do_sve2_zpzz_ool(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4 *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzz_ool(s, a, fn); +} + +static bool trans_SADALP_zpzz(DisasContext *s, arg_rprr_esz *a) +{ + static gen_helper_gvec_4 * const fns[3] = { + gen_helper_sve2_sadalp_zpzz_h, + gen_helper_sve2_sadalp_zpzz_s, + gen_helper_sve2_sadalp_zpzz_d, + }; + if (a->esz == 0) { + return false; + } + return do_sve2_zpzz_ool(s, a, fns[a->esz - 1]); +} + +static bool trans_UADALP_zpzz(DisasContext *s, arg_rprr_esz *a) +{ + static gen_helper_gvec_4 * const fns[3] = { + gen_helper_sve2_uadalp_zpzz_h, + gen_helper_sve2_uadalp_zpzz_s, + gen_helper_sve2_uadalp_zpzz_d, + }; + if (a->esz == 0) { + return false; + } + return do_sve2_zpzz_ool(s, a, fns[a->esz - 1]); +} From patchwork Tue May 25 01:02:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277425 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 633E7C2B9F7 for ; Tue, 25 May 2021 01:09:30 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E4842613F6 for ; Tue, 25 May 2021 01:09:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E4842613F6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:50348 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLZY-0002rl-Rf for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:09:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53328) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUP-00016U-Jh for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:09 -0400 Received: from mail-pj1-x1031.google.com ([2607:f8b0:4864:20::1031]:55840) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUJ-0001dp-MS for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:09 -0400 Received: by mail-pj1-x1031.google.com with SMTP id kr9so7794713pjb.5 for ; Mon, 24 May 2021 18:04:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=jVZUYPn904dvokrZxIul+Uwhblv3yiIVTy/DzrmsM1E=; b=EiBsQbqg5JUy2jVDqufMkcd5skIWxCm19ZAdzfov1wzbLE0VcCEqcq3mCcmccp2GP/ VEced8LEtUoGF9vh38/6VXmk7geP6anPdrcTmoIim4Qr6zoBG4i/cp6sD9aM+n8waAP1 xge6vqML/PI6DJ/rqLsBWo193cyzOxE3arWCwrpS6JoAgIf5ZWopl10KFk/ivhgqKngf IY86pkiuFSe72P9opzaUWiyjfpdQdIFkaz5nSSerPqnWGJxPt28RHR7rCzb7eugU1MQ+ QqHXBOO+nLIqwouTahJ/nLl2sWPNdQLJoniqVpCZn2RC4tINFySRYDZwxZoqPTRuzXJy GoaA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=jVZUYPn904dvokrZxIul+Uwhblv3yiIVTy/DzrmsM1E=; b=aibLMEpsthApQJD96UQfDWuHIKjsLcmYwXVMSHi+OnsvUU1HhSmn9ELuJBM2PvpfDf qZBbHek+nktIEnWTR8m9ixwxCUr2SNy4q+XQiu81fn7iq96OBo14+yHZUGpLQs13hjf1 5TQlPHizPmpozK7Sgit6Bae+5fqSwzC0SRmeOrEPHbTUBtnLi7oP68SsDWn/zZMMOjIK WuA7b5ueUro9L9hi7K0EjPr6ionBKHXYNJxcC5TXSL1YA/hvFr7LIKDK16UkOG9w4rHB pfHNuzPlO2B+sZ/jdHZz4pivAUAw+isJrAI33cfg2IUtfFmaqpqWhkj9w4Ev7T77cbNY riZg== X-Gm-Message-State: AOAM532ydtYyLp6LEzwyEm4PkULvDIN+JavsdhmY24KYGpFomcZO7mhd G4MtgXxGarGKHD7/HSJZX7aDq9Peh2g6pA== X-Google-Smtp-Source: ABdhPJxwYKm/uGZZufTRWLb/5Ta+wQnjhfCFfTcnwkM4T+TIsKa6ZyiiJEvkd73qvW6O7wd1cUF1cQ== X-Received: by 2002:a17:90a:7306:: with SMTP id m6mr1979743pjk.217.1621904642421; Mon, 24 May 2021 18:04:02 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:02 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 04/92] target/arm: Implement SVE2 integer unary operations (predicated) Date: Mon, 24 May 2021 18:02:30 -0700 Message-Id: <20210525010358.152808-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1031; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1031.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v2: Fix sqabs, sqneg (laurent desnogues) v7: Fix rebase error vs sadalp/uadalp. --- target/arm/helper-sve.h | 13 +++++++++++ target/arm/sve.decode | 7 ++++++ target/arm/sve_helper.c | 21 +++++++++++++++++ target/arm/translate-sve.c | 47 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 88 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b2a274b40b..9992e93e2b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -502,6 +502,19 @@ DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqneg_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqneg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqneg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqneg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_urecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ursqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_b, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0524c01fcf..5ba542969b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1105,3 +1105,10 @@ PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 SADALP_zpzz 01000100 .. 000 100 101 ... ..... ..... @rdm_pg_rn UADALP_zpzz 01000100 .. 000 101 101 ... ..... ..... @rdm_pg_rn + +### SVE2 integer unary operations (predicated) + +URECPE 01000100 .. 000 000 101 ... ..... ..... @rd_pg_rn +URSQRTE 01000100 .. 000 001 101 ... ..... ..... @rd_pg_rn +SQABS 01000100 .. 001 000 101 ... ..... ..... @rd_pg_rn +SQNEG 01000100 .. 001 001 101 ... ..... ..... @rd_pg_rn diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f44b4138cc..7a08c24f2d 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -728,6 +728,27 @@ DO_ZPZ(sve_rbit_h, uint16_t, H1_2, revbit16) DO_ZPZ(sve_rbit_s, uint32_t, H1_4, revbit32) DO_ZPZ_D(sve_rbit_d, uint64_t, revbit64) +#define DO_SQABS(X) \ + ({ __typeof(X) x_ = (X), min_ = 1ull << (sizeof(X) * 8 - 1); \ + x_ >= 0 ? x_ : x_ == min_ ? -min_ - 1 : -x_; }) + +DO_ZPZ(sve2_sqabs_b, int8_t, H1, DO_SQABS) +DO_ZPZ(sve2_sqabs_h, int16_t, H1_2, DO_SQABS) +DO_ZPZ(sve2_sqabs_s, int32_t, H1_4, DO_SQABS) +DO_ZPZ_D(sve2_sqabs_d, int64_t, DO_SQABS) + +#define DO_SQNEG(X) \ + ({ __typeof(X) x_ = (X), min_ = 1ull << (sizeof(X) * 8 - 1); \ + x_ == min_ ? -min_ - 1 : -x_; }) + +DO_ZPZ(sve2_sqneg_b, uint8_t, H1, DO_SQNEG) +DO_ZPZ(sve2_sqneg_h, uint16_t, H1_2, DO_SQNEG) +DO_ZPZ(sve2_sqneg_s, uint32_t, H1_4, DO_SQNEG) +DO_ZPZ_D(sve2_sqneg_d, uint64_t, DO_SQNEG) + +DO_ZPZ(sve2_urecpe_s, uint32_t, H1_4, helper_recpe_u32) +DO_ZPZ(sve2_ursqrte_s, uint32_t, H1_4, helper_rsqrte_u32) + /* Three-operand expander, unpredicated, in which the third operand is "wide". */ #define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 208d9ea7e0..c30b3c476e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5884,3 +5884,50 @@ static bool trans_UADALP_zpzz(DisasContext *s, arg_rprr_esz *a) } return do_sve2_zpzz_ool(s, a, fns[a->esz - 1]); } + +/* + * SVE2 integer unary operations (predicated) + */ + +static bool do_sve2_zpz_ool(DisasContext *s, arg_rpr_esz *a, + gen_helper_gvec_3 *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpz_ool(s, a, fn); +} + +static bool trans_URECPE(DisasContext *s, arg_rpr_esz *a) +{ + if (a->esz != 2) { + return false; + } + return do_sve2_zpz_ool(s, a, gen_helper_sve2_urecpe_s); +} + +static bool trans_URSQRTE(DisasContext *s, arg_rpr_esz *a) +{ + if (a->esz != 2) { + return false; + } + return do_sve2_zpz_ool(s, a, gen_helper_sve2_ursqrte_s); +} + +static bool trans_SQABS(DisasContext *s, arg_rpr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqabs_b, gen_helper_sve2_sqabs_h, + gen_helper_sve2_sqabs_s, gen_helper_sve2_sqabs_d, + }; + return do_sve2_zpz_ool(s, a, fns[a->esz]); +} + +static bool trans_SQNEG(DisasContext *s, arg_rpr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqneg_b, gen_helper_sve2_sqneg_h, + gen_helper_sve2_sqneg_s, gen_helper_sve2_sqneg_d, + }; + return do_sve2_zpz_ool(s, a, fns[a->esz]); +} From patchwork Tue May 25 01:02:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277427 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 65B9DC47084 for ; Tue, 25 May 2021 01:09:31 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CCEE8613F6 for ; Tue, 25 May 2021 01:09:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CCEE8613F6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:50324 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLZZ-0002qZ-M9 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:09:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53378) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUQ-0001C0-Rp for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:10 -0400 Received: from mail-pj1-x1033.google.com ([2607:f8b0:4864:20::1033]:53899) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUK-0001e8-Gf for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:10 -0400 Received: by mail-pj1-x1033.google.com with SMTP id ot16so13927312pjb.3 for ; Mon, 24 May 2021 18:04:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Xg5ynT6UlolcogkiaHpsvWNeRcUQFZb29YvDeNG9uc0=; b=XRX6r3kLP4PEK97rh6WOu/hZq0mmaXmezz+rt4W4yTeYoh3A3dovo7dJN55SQHgptD HMmCsBbb5+nPki8lffG4KHus8oVoh5swOr+2+y7m2CcA49SnE3xf9SOWdGhhDFS4ryFE hUQ+ywZx5tsCx+eK5poIObi5XMu2OobGFp8p1WcAeAxABjManArIYYt2rtXEwafWpYoK 8jcLiKY3ViZQ3ezYOdTMuW9gt340OpmnMcQB9SrXUsGq+63iJv+1w3yUle9Ps4f6r8sC 6Tl+d5ursL3JvMmmsMJ39R84dCoI4rA7PaZkg1RsxZZeMXh27tmlZW4cqeM16vQcrH5g +Pqg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Xg5ynT6UlolcogkiaHpsvWNeRcUQFZb29YvDeNG9uc0=; b=XXfR+ClGGZvuvAi5LZtLgRhRx6gSjK5EiiLI8YDmgfuD/1SQy6zaHx6wSJjwfMlFzh Xl8lYF7n6C2n7yVd3t+eln5sgZ9SU32s8eYePfmpWka9MNPg6om+KLYHhpwEMzxxGMNj NA/JvyHBP2uirZdVvdI4glfUuAgYWim/OeAOfCmyU9e/47aL8mqPZFf+yYfT3G6bgHsk lDsIuFiBk54fd/xqUC1ag4KmBv7Ql+ScpGpYcB137E2pMs+6zKT3YjpBcCO3AMp98CLx quC0m0eHz7RGbSWDLiD6yBFNxwHL3eqiO4kDnd+rNa5hQyJccxfHEvvKM4Qch7zgXvk/ 85EA== X-Gm-Message-State: AOAM530DXHurNAq5DIXPmWVf0pjvpRZwh14eW1Z0WOe96ZEV9sLotpv9 pQ+14baTO/bsMaveij/w2TPw9Q051zaEdw== X-Google-Smtp-Source: ABdhPJyM9c/4QtFS2cVp8RYJ/5xfxn+B7HKoFf6/nqrSDvnFKTRzGYeJmHwA++OACqNJv+ALsuCypw== X-Received: by 2002:a17:90a:4205:: with SMTP id o5mr20998014pjg.140.1621904643029; Mon, 24 May 2021 18:04:03 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:02 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 05/92] target/arm: Split out saturating/rounding shifts from neon Date: Mon, 24 May 2021 18:02:31 -0700 Message-Id: <20210525010358.152808-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1033; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1033.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Split these operations out into a header that can be shared between neon and sve. The "sat" pointer acts both as a boolean for control of saturating behavior and controls the difference in behavior between neon and sve -- QC bit or no QC bit. Widen the shift operand in the new helpers, as the SVE2 insns treat the whole input element as significant. For the neon uses, truncate the shift to int8_t while passing the parameter. Implement right-shift rounding as tmp = src >> (shift - 1); dst = (tmp >> 1) + (tmp & 1); This is the same number of instructions as the current tmp = 1 << (shift - 1); dst = (src + tmp) >> shift; without any possibility of intermediate overflow. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v2: Widen the shift operand (laurent desnouges) v7: Add null checks to suqrshl (pm215). --- target/arm/vec_internal.h | 138 +++++++++++ target/arm/neon_helper.c | 507 +++++++------------------------------- 2 files changed, 221 insertions(+), 424 deletions(-) diff --git a/target/arm/vec_internal.h b/target/arm/vec_internal.h index e3eb3e7a6b..5b78e79329 100644 --- a/target/arm/vec_internal.h +++ b/target/arm/vec_internal.h @@ -30,4 +30,142 @@ static inline void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz) } } +static inline int32_t do_sqrshl_bhs(int32_t src, int32_t shift, int bits, + bool round, uint32_t *sat) +{ + if (shift <= -bits) { + /* Rounding the sign bit always produces 0. */ + if (round) { + return 0; + } + return src >> 31; + } else if (shift < 0) { + if (round) { + src >>= -shift - 1; + return (src >> 1) + (src & 1); + } + return src >> -shift; + } else if (shift < bits) { + int32_t val = src << shift; + if (bits == 32) { + if (!sat || val >> shift == src) { + return val; + } + } else { + int32_t extval = sextract32(val, 0, bits); + if (!sat || val == extval) { + return extval; + } + } + } else if (!sat || src == 0) { + return 0; + } + + *sat = 1; + return (1u << (bits - 1)) - (src >= 0); +} + +static inline uint32_t do_uqrshl_bhs(uint32_t src, int32_t shift, int bits, + bool round, uint32_t *sat) +{ + if (shift <= -(bits + round)) { + return 0; + } else if (shift < 0) { + if (round) { + src >>= -shift - 1; + return (src >> 1) + (src & 1); + } + return src >> -shift; + } else if (shift < bits) { + uint32_t val = src << shift; + if (bits == 32) { + if (!sat || val >> shift == src) { + return val; + } + } else { + uint32_t extval = extract32(val, 0, bits); + if (!sat || val == extval) { + return extval; + } + } + } else if (!sat || src == 0) { + return 0; + } + + *sat = 1; + return MAKE_64BIT_MASK(0, bits); +} + +static inline int32_t do_suqrshl_bhs(int32_t src, int32_t shift, int bits, + bool round, uint32_t *sat) +{ + if (sat && src < 0) { + *sat = 1; + return 0; + } + return do_uqrshl_bhs(src, shift, bits, round, sat); +} + +static inline int64_t do_sqrshl_d(int64_t src, int64_t shift, + bool round, uint32_t *sat) +{ + if (shift <= -64) { + /* Rounding the sign bit always produces 0. */ + if (round) { + return 0; + } + return src >> 63; + } else if (shift < 0) { + if (round) { + src >>= -shift - 1; + return (src >> 1) + (src & 1); + } + return src >> -shift; + } else if (shift < 64) { + int64_t val = src << shift; + if (!sat || val >> shift == src) { + return val; + } + } else if (!sat || src == 0) { + return 0; + } + + *sat = 1; + return src < 0 ? INT64_MIN : INT64_MAX; +} + +static inline uint64_t do_uqrshl_d(uint64_t src, int64_t shift, + bool round, uint32_t *sat) +{ + if (shift <= -(64 + round)) { + return 0; + } else if (shift < 0) { + if (round) { + src >>= -shift - 1; + return (src >> 1) + (src & 1); + } + return src >> -shift; + } else if (shift < 64) { + uint64_t val = src << shift; + if (!sat || val >> shift == src) { + return val; + } + } else if (!sat || src == 0) { + return 0; + } + + *sat = 1; + return UINT64_MAX; +} + +static inline int64_t do_suqrshl_d(int64_t src, int64_t shift, + bool round, uint32_t *sat) +{ + if (sat && src < 0) { + *sat = 1; + return 0; + } + return do_uqrshl_d(src, shift, round, sat); +} + #endif /* TARGET_ARM_VEC_INTERNALS_H */ diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c index b637265691..338b9189d5 100644 --- a/target/arm/neon_helper.c +++ b/target/arm/neon_helper.c @@ -11,6 +11,7 @@ #include "cpu.h" #include "exec/helper-proto.h" #include "fpu/softfloat.h" +#include "vec_internal.h" #define SIGNBIT (uint32_t)0x80000000 #define SIGNBIT64 ((uint64_t)1 << 63) @@ -576,496 +577,154 @@ NEON_POP(pmax_s16, neon_s16, 2) NEON_POP(pmax_u16, neon_u16, 2) #undef NEON_FN -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8 || \ - tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, false, NULL)) NEON_VOP(shl_u16, neon_u16, 2) #undef NEON_FN -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = src1 >> (sizeof(src1) * 8 - 1); \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, false, NULL)) NEON_VOP(shl_s16, neon_s16, 2) #undef NEON_FN -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if ((tmp >= (ssize_t)sizeof(src1) * 8) \ - || (tmp <= -(ssize_t)sizeof(src1) * 8)) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 8, true, NULL)) NEON_VOP(rshl_s8, neon_s8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, true, NULL)) NEON_VOP(rshl_s16, neon_s16, 2) #undef NEON_FN -/* The addition of the rounding constant may overflow, so we use an - * intermediate 64 bit accumulator. */ -uint32_t HELPER(neon_rshl_s32)(uint32_t valop, uint32_t shiftop) +uint32_t HELPER(neon_rshl_s32)(uint32_t val, uint32_t shift) { - int32_t dest; - int32_t val = (int32_t)valop; - int8_t shift = (int8_t)shiftop; - if ((shift >= 32) || (shift <= -32)) { - dest = 0; - } else if (shift < 0) { - int64_t big_dest = ((int64_t)val + (1 << (-1 - shift))); - dest = big_dest >> -shift; - } else { - dest = val << shift; - } - return dest; + return do_sqrshl_bhs(val, (int8_t)shift, 32, true, NULL); } -/* Handling addition overflow with 64 bit input values is more - * tricky than with 32 bit values. */ -uint64_t HELPER(neon_rshl_s64)(uint64_t valop, uint64_t shiftop) +uint64_t HELPER(neon_rshl_s64)(uint64_t val, uint64_t shift) { - int8_t shift = (int8_t)shiftop; - int64_t val = valop; - if ((shift >= 64) || (shift <= -64)) { - val = 0; - } else if (shift < 0) { - val >>= (-shift - 1); - if (val == INT64_MAX) { - /* In this case, it means that the rounding constant is 1, - * and the addition would overflow. Return the actual - * result directly. */ - val = 0x4000000000000000LL; - } else { - val++; - val >>= 1; - } - } else { - val <<= shift; - } - return val; + return do_sqrshl_d(val, (int8_t)shift, true, NULL); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8 || \ - tmp < -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp == -(ssize_t)sizeof(src1) * 8) { \ - dest = src1 >> (-tmp - 1); \ - } else if (tmp < 0) { \ - dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, true, NULL)) NEON_VOP(rshl_u8, neon_u8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, true, NULL)) NEON_VOP(rshl_u16, neon_u16, 2) #undef NEON_FN -/* The addition of the rounding constant may overflow, so we use an - * intermediate 64 bit accumulator. */ -uint32_t HELPER(neon_rshl_u32)(uint32_t val, uint32_t shiftop) +uint32_t HELPER(neon_rshl_u32)(uint32_t val, uint32_t shift) { - uint32_t dest; - int8_t shift = (int8_t)shiftop; - if (shift >= 32 || shift < -32) { - dest = 0; - } else if (shift == -32) { - dest = val >> 31; - } else if (shift < 0) { - uint64_t big_dest = ((uint64_t)val + (1 << (-1 - shift))); - dest = big_dest >> -shift; - } else { - dest = val << shift; - } - return dest; + return do_uqrshl_bhs(val, (int8_t)shift, 32, true, NULL); } -/* Handling addition overflow with 64 bit input values is more - * tricky than with 32 bit values. */ -uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shiftop) +uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shift) { - int8_t shift = (uint8_t)shiftop; - if (shift >= 64 || shift < -64) { - val = 0; - } else if (shift == -64) { - /* Rounding a 1-bit result just preserves that bit. */ - val >>= 63; - } else if (shift < 0) { - val >>= (-shift - 1); - if (val == UINT64_MAX) { - /* In this case, it means that the rounding constant is 1, - * and the addition would overflow. Return the actual - * result directly. */ - val = 0x8000000000000000ULL; - } else { - val++; - val >>= 1; - } - } else { - val <<= shift; - } - return val; + return do_uqrshl_d(val, (int8_t)shift, true, NULL); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = ~0; \ - } else { \ - dest = 0; \ - } \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = ~0; \ - } \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc)) NEON_VOP_ENV(qshl_u8, neon_u8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc)) NEON_VOP_ENV(qshl_u16, neon_u16, 2) -NEON_VOP_ENV(qshl_u32, neon_u32, 1) #undef NEON_FN -uint64_t HELPER(neon_qshl_u64)(CPUARMState *env, uint64_t val, uint64_t shiftop) +uint32_t HELPER(neon_qshl_u32)(CPUARMState *env, uint32_t val, uint32_t shift) { - int8_t shift = (int8_t)shiftop; - if (shift >= 64) { - if (val) { - val = ~(uint64_t)0; - SET_QC(); - } - } else if (shift <= -64) { - val = 0; - } else if (shift < 0) { - val >>= -shift; - } else { - uint64_t tmp = val; - val <<= shift; - if ((val >> shift) != tmp) { - SET_QC(); - val = ~(uint64_t)0; - } - } - return val; + return do_uqrshl_bhs(val, (int8_t)shift, 32, false, env->vfp.qc); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = (uint32_t)(1 << (sizeof(src1) * 8 - 1)); \ - if (src1 > 0) { \ - dest--; \ - } \ - } else { \ - dest = src1; \ - } \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = src1 >> 31; \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = (uint32_t)(1 << (sizeof(src1) * 8 - 1)); \ - if (src1 > 0) { \ - dest--; \ - } \ - } \ - }} while (0) +uint64_t HELPER(neon_qshl_u64)(CPUARMState *env, uint64_t val, uint64_t shift) +{ + return do_uqrshl_d(val, (int8_t)shift, false, env->vfp.qc); +} + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc)) NEON_VOP_ENV(qshl_s8, neon_s8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc)) NEON_VOP_ENV(qshl_s16, neon_s16, 2) -NEON_VOP_ENV(qshl_s32, neon_s32, 1) #undef NEON_FN -uint64_t HELPER(neon_qshl_s64)(CPUARMState *env, uint64_t valop, uint64_t shiftop) +uint32_t HELPER(neon_qshl_s32)(CPUARMState *env, uint32_t val, uint32_t shift) { - int8_t shift = (uint8_t)shiftop; - int64_t val = valop; - if (shift >= 64) { - if (val) { - SET_QC(); - val = (val >> 63) ^ ~SIGNBIT64; - } - } else if (shift <= -64) { - val >>= 63; - } else if (shift < 0) { - val >>= -shift; - } else { - int64_t tmp = val; - val <<= shift; - if ((val >> shift) != tmp) { - SET_QC(); - val = (tmp >> 63) ^ ~SIGNBIT64; - } - } - return val; + return do_sqrshl_bhs(val, (int8_t)shift, 32, false, env->vfp.qc); } -#define NEON_FN(dest, src1, src2) do { \ - if (src1 & (1 << (sizeof(src1) * 8 - 1))) { \ - SET_QC(); \ - dest = 0; \ - } else { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = ~0; \ - } else { \ - dest = 0; \ - } \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = ~0; \ - } \ - } \ - }} while (0) -NEON_VOP_ENV(qshlu_s8, neon_u8, 4) -NEON_VOP_ENV(qshlu_s16, neon_u16, 2) +uint64_t HELPER(neon_qshl_s64)(CPUARMState *env, uint64_t val, uint64_t shift) +{ + return do_sqrshl_d(val, (int8_t)shift, false, env->vfp.qc); +} + +#define NEON_FN(dest, src1, src2) \ + (dest = do_suqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc)) +NEON_VOP_ENV(qshlu_s8, neon_s8, 4) #undef NEON_FN -uint32_t HELPER(neon_qshlu_s32)(CPUARMState *env, uint32_t valop, uint32_t shiftop) +#define NEON_FN(dest, src1, src2) \ + (dest = do_suqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc)) +NEON_VOP_ENV(qshlu_s16, neon_s16, 2) +#undef NEON_FN + +uint32_t HELPER(neon_qshlu_s32)(CPUARMState *env, uint32_t val, uint32_t shift) { - if ((int32_t)valop < 0) { - SET_QC(); - return 0; - } - return helper_neon_qshl_u32(env, valop, shiftop); + return do_suqrshl_bhs(val, (int8_t)shift, 32, false, env->vfp.qc); } -uint64_t HELPER(neon_qshlu_s64)(CPUARMState *env, uint64_t valop, uint64_t shiftop) +uint64_t HELPER(neon_qshlu_s64)(CPUARMState *env, uint64_t val, uint64_t shift) { - if ((int64_t)valop < 0) { - SET_QC(); - return 0; - } - return helper_neon_qshl_u64(env, valop, shiftop); + return do_suqrshl_d(val, (int8_t)shift, false, env->vfp.qc); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = ~0; \ - } else { \ - dest = 0; \ - } \ - } else if (tmp < -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp == -(ssize_t)sizeof(src1) * 8) { \ - dest = src1 >> (sizeof(src1) * 8 - 1); \ - } else if (tmp < 0) { \ - dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = ~0; \ - } \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_u8, neon_u8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_u16, neon_u16, 2) #undef NEON_FN -/* The addition of the rounding constant may overflow, so we use an - * intermediate 64 bit accumulator. */ -uint32_t HELPER(neon_qrshl_u32)(CPUARMState *env, uint32_t val, uint32_t shiftop) +uint32_t HELPER(neon_qrshl_u32)(CPUARMState *env, uint32_t val, uint32_t shift) { - uint32_t dest; - int8_t shift = (int8_t)shiftop; - if (shift >= 32) { - if (val) { - SET_QC(); - dest = ~0; - } else { - dest = 0; - } - } else if (shift < -32) { - dest = 0; - } else if (shift == -32) { - dest = val >> 31; - } else if (shift < 0) { - uint64_t big_dest = ((uint64_t)val + (1 << (-1 - shift))); - dest = big_dest >> -shift; - } else { - dest = val << shift; - if ((dest >> shift) != val) { - SET_QC(); - dest = ~0; - } - } - return dest; + return do_uqrshl_bhs(val, (int8_t)shift, 32, true, env->vfp.qc); } -/* Handling addition overflow with 64 bit input values is more - * tricky than with 32 bit values. */ -uint64_t HELPER(neon_qrshl_u64)(CPUARMState *env, uint64_t val, uint64_t shiftop) +uint64_t HELPER(neon_qrshl_u64)(CPUARMState *env, uint64_t val, uint64_t shift) { - int8_t shift = (int8_t)shiftop; - if (shift >= 64) { - if (val) { - SET_QC(); - val = ~0; - } - } else if (shift < -64) { - val = 0; - } else if (shift == -64) { - val >>= 63; - } else if (shift < 0) { - val >>= (-shift - 1); - if (val == UINT64_MAX) { - /* In this case, it means that the rounding constant is 1, - * and the addition would overflow. Return the actual - * result directly. */ - val = 0x8000000000000000ULL; - } else { - val++; - val >>= 1; - } - } else { \ - uint64_t tmp = val; - val <<= shift; - if ((val >> shift) != tmp) { - SET_QC(); - val = ~0; - } - } - return val; + return do_uqrshl_d(val, (int8_t)shift, true, env->vfp.qc); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = (typeof(dest))(1 << (sizeof(src1) * 8 - 1)); \ - if (src1 > 0) { \ - dest--; \ - } \ - } else { \ - dest = 0; \ - } \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = (uint32_t)(1 << (sizeof(src1) * 8 - 1)); \ - if (src1 > 0) { \ - dest--; \ - } \ - } \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 8, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_s8, neon_s8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_s16, neon_s16, 2) #undef NEON_FN -/* The addition of the rounding constant may overflow, so we use an - * intermediate 64 bit accumulator. */ -uint32_t HELPER(neon_qrshl_s32)(CPUARMState *env, uint32_t valop, uint32_t shiftop) +uint32_t HELPER(neon_qrshl_s32)(CPUARMState *env, uint32_t val, uint32_t shift) { - int32_t dest; - int32_t val = (int32_t)valop; - int8_t shift = (int8_t)shiftop; - if (shift >= 32) { - if (val) { - SET_QC(); - dest = (val >> 31) ^ ~SIGNBIT; - } else { - dest = 0; - } - } else if (shift <= -32) { - dest = 0; - } else if (shift < 0) { - int64_t big_dest = ((int64_t)val + (1 << (-1 - shift))); - dest = big_dest >> -shift; - } else { - dest = val << shift; - if ((dest >> shift) != val) { - SET_QC(); - dest = (val >> 31) ^ ~SIGNBIT; - } - } - return dest; + return do_sqrshl_bhs(val, (int8_t)shift, 32, true, env->vfp.qc); } -/* Handling addition overflow with 64 bit input values is more - * tricky than with 32 bit values. */ -uint64_t HELPER(neon_qrshl_s64)(CPUARMState *env, uint64_t valop, uint64_t shiftop) +uint64_t HELPER(neon_qrshl_s64)(CPUARMState *env, uint64_t val, uint64_t shift) { - int8_t shift = (uint8_t)shiftop; - int64_t val = valop; - - if (shift >= 64) { - if (val) { - SET_QC(); - val = (val >> 63) ^ ~SIGNBIT64; - } - } else if (shift <= -64) { - val = 0; - } else if (shift < 0) { - val >>= (-shift - 1); - if (val == INT64_MAX) { - /* In this case, it means that the rounding constant is 1, - * and the addition would overflow. Return the actual - * result directly. */ - val = 0x4000000000000000ULL; - } else { - val++; - val >>= 1; - } - } else { - int64_t tmp = val; - val <<= shift; - if ((val >> shift) != tmp) { - SET_QC(); - val = (tmp >> 63) ^ ~SIGNBIT64; - } - } - return val; + return do_sqrshl_d(val, (int8_t)shift, true, env->vfp.qc); } uint32_t HELPER(neon_add_u8)(uint32_t a, uint32_t b) From patchwork Tue May 25 01:02:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277429 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA2FAC2B9F8 for ; Tue, 25 May 2021 01:09:37 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 60FB7613F6 for ; Tue, 25 May 2021 01:09:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 60FB7613F6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:51060 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLZg-0003MC-Fs for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:09:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53424) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUS-0001Jf-C6 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:12 -0400 Received: from mail-pj1-x1034.google.com ([2607:f8b0:4864:20::1034]:34398) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUL-0001eZ-3r for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:12 -0400 Received: by mail-pj1-x1034.google.com with SMTP id g6-20020a17090adac6b029015d1a9a6f1aso866889pjx.1 for ; Mon, 24 May 2021 18:04:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=35k/34I17hEoDYmueROTz9nZX44IQkmrnncONOuNZHs=; b=d2Y8bJy+w1pbvVSDHX5gaDLTo3EMeDmxSNgi6koxk8gjphCDvG1zZVZCdoiBHP/U8g fW6sXaMyZODlS+giYSJPAmh67qH+HFR0j9ynCRJknqgq1mk1acS0UMPZLRowUEES+KpH HqHSW8ts6jRJjEgal0ih1ZyC//qT9EOeunWXTpjw4WYUI/3A5qF5BpPw1CzAHnRCtNMQ ZkS+0PM42C3C0jau8PtmMQxByCX6NmYI9IioNTrrzGUev7uNozazb5A/4SyLNlZNd+/B MXxZSHbZnMgcdUA1NkkwyYMVWnm4Mf1cgXcVhd8gJKUD+v/qYg0tIZ52y9esYd+0Q0yW Q8Fg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=35k/34I17hEoDYmueROTz9nZX44IQkmrnncONOuNZHs=; b=PZRN0XTExWmvQpHIh2cBpHMPD2E7wY7b+P+WJ2hYyqJdocc29u4cVetSd3+PTH0MBL wfzJCOAvtgpubeFNXPeytHkaFYZ3vCgGoS1Py5VhsvkuMGIqo+xH0IMnIP7x0ruPGCv6 /z8gQPagClPky96BKnFZ+XlvQgWE8hNIWtaxy0jDJiXfOx2dKZTfqLV3Ibtjo30ij8E8 0cCO25Xz1J8X/bgCoSUgAtLPHVts+Zqh+BOsyzV5RY61JO/xIsPQ4jjp/HKvjtSp+PYW dd4C6LErQDhkEEbZ3Z1Jw/QtMEBZIZxt1eSqfHWQMlArvmzgeZQQzBnomZVZm5xE9KR+ TxBw== X-Gm-Message-State: AOAM533Q1T9a+GP0q3qb/U90eBmAXes6ny2MV3L8pVl9Gvn8tN3dLdyi tmY1gDoviZ/FyZjFQ44R30YHpLNxCV4phA== X-Google-Smtp-Source: ABdhPJwQe3SJfluEfiBwDZj2+gtiRFUMVnr3d8KOmPqsC/eubmUIdIBkA6qnfTM8IcUHX+ZWiI9wVw== X-Received: by 2002:a17:90a:fc8b:: with SMTP id ci11mr24934681pjb.2.1621904643578; Mon, 24 May 2021 18:04:03 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:03 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 06/92] target/arm: Implement SVE2 saturating/rounding bitwise shift left (predicated) Date: Mon, 24 May 2021 18:02:32 -0700 Message-Id: <20210525010358.152808-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1034; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1034.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Shift values are always signed (laurent desnogues). --- target/arm/helper-sve.h | 54 +++++++++++++++++++++++ target/arm/sve.decode | 17 ++++++++ target/arm/sve_helper.c | 87 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 18 ++++++++ 4 files changed, 176 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 9992e93e2b..62106c74be 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -172,6 +172,60 @@ DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5ba542969b..93f2479693 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1112,3 +1112,20 @@ URECPE 01000100 .. 000 000 101 ... ..... ..... @rd_pg_rn URSQRTE 01000100 .. 000 001 101 ... ..... ..... @rd_pg_rn SQABS 01000100 .. 001 000 101 ... ..... ..... @rd_pg_rn SQNEG 01000100 .. 001 001 101 ... ..... ..... @rd_pg_rn + +### SVE2 saturating/rounding bitwise shift left (predicated) + +SRSHL 01000100 .. 000 010 100 ... ..... ..... @rdn_pg_rm +URSHL 01000100 .. 000 011 100 ... ..... ..... @rdn_pg_rm +SRSHL 01000100 .. 000 110 100 ... ..... ..... @rdm_pg_rn # SRSHLR +URSHL 01000100 .. 000 111 100 ... ..... ..... @rdm_pg_rn # URSHLR + +SQSHL 01000100 .. 001 000 100 ... ..... ..... @rdn_pg_rm +UQSHL 01000100 .. 001 001 100 ... ..... ..... @rdn_pg_rm +SQSHL 01000100 .. 001 100 100 ... ..... ..... @rdm_pg_rn # SQSHLR +UQSHL 01000100 .. 001 101 100 ... ..... ..... @rdm_pg_rn # UQSHLR + +SQRSHL 01000100 .. 001 010 100 ... ..... ..... @rdn_pg_rm +UQRSHL 01000100 .. 001 011 100 ... ..... ..... @rdn_pg_rm +SQRSHL 01000100 .. 001 110 100 ... ..... ..... @rdm_pg_rn # SQRSHLR +UQRSHL 01000100 .. 001 111 100 ... ..... ..... @rdm_pg_rn # UQRSHLR diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 7a08c24f2d..17c6440b06 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -26,6 +26,7 @@ #include "tcg/tcg-gvec-desc.h" #include "fpu/softfloat.h" #include "tcg/tcg.h" +#include "vec_internal.h" /* Note that vector data is stored in host-endian 64-bit chunks, @@ -561,6 +562,92 @@ DO_ZPZZ(sve2_uadalp_zpzz_h, uint16_t, H1_2, do_uadalp_h) DO_ZPZZ(sve2_uadalp_zpzz_s, uint32_t, H1_4, do_uadalp_s) DO_ZPZZ_D(sve2_uadalp_zpzz_d, uint64_t, do_uadalp_d) +#define do_srshl_b(n, m) do_sqrshl_bhs(n, m, 8, true, NULL) +#define do_srshl_h(n, m) do_sqrshl_bhs(n, m, 16, true, NULL) +#define do_srshl_s(n, m) do_sqrshl_bhs(n, m, 32, true, NULL) +#define do_srshl_d(n, m) do_sqrshl_d(n, m, true, NULL) + +DO_ZPZZ(sve2_srshl_zpzz_b, int8_t, H1, do_srshl_b) +DO_ZPZZ(sve2_srshl_zpzz_h, int16_t, H1_2, do_srshl_h) +DO_ZPZZ(sve2_srshl_zpzz_s, int32_t, H1_4, do_srshl_s) +DO_ZPZZ_D(sve2_srshl_zpzz_d, int64_t, do_srshl_d) + +#define do_urshl_b(n, m) do_uqrshl_bhs(n, (int8_t)m, 8, true, NULL) +#define do_urshl_h(n, m) do_uqrshl_bhs(n, (int16_t)m, 16, true, NULL) +#define do_urshl_s(n, m) do_uqrshl_bhs(n, m, 32, true, NULL) +#define do_urshl_d(n, m) do_uqrshl_d(n, m, true, NULL) + +DO_ZPZZ(sve2_urshl_zpzz_b, uint8_t, H1, do_urshl_b) +DO_ZPZZ(sve2_urshl_zpzz_h, uint16_t, H1_2, do_urshl_h) +DO_ZPZZ(sve2_urshl_zpzz_s, uint32_t, H1_4, do_urshl_s) +DO_ZPZZ_D(sve2_urshl_zpzz_d, uint64_t, do_urshl_d) + +/* + * Unlike the NEON and AdvSIMD versions, there is no QC bit to set. + * We pass in a pointer to a dummy saturation field to trigger + * the saturating arithmetic but discard the information about + * whether it has occurred. + */ +#define do_sqshl_b(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 8, false, &discard); }) +#define do_sqshl_h(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 16, false, &discard); }) +#define do_sqshl_s(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 32, false, &discard); }) +#define do_sqshl_d(n, m) \ + ({ uint32_t discard; do_sqrshl_d(n, m, false, &discard); }) + +DO_ZPZZ(sve2_sqshl_zpzz_b, int8_t, H1_2, do_sqshl_b) +DO_ZPZZ(sve2_sqshl_zpzz_h, int16_t, H1_2, do_sqshl_h) +DO_ZPZZ(sve2_sqshl_zpzz_s, int32_t, H1_4, do_sqshl_s) +DO_ZPZZ_D(sve2_sqshl_zpzz_d, int64_t, do_sqshl_d) + +#define do_uqshl_b(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, (int8_t)m, 8, false, &discard); }) +#define do_uqshl_h(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, (int16_t)m, 16, false, &discard); }) +#define do_uqshl_s(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, m, 32, false, &discard); }) +#define do_uqshl_d(n, m) \ + ({ uint32_t discard; do_uqrshl_d(n, m, false, &discard); }) + +DO_ZPZZ(sve2_uqshl_zpzz_b, uint8_t, H1_2, do_uqshl_b) +DO_ZPZZ(sve2_uqshl_zpzz_h, uint16_t, H1_2, do_uqshl_h) +DO_ZPZZ(sve2_uqshl_zpzz_s, uint32_t, H1_4, do_uqshl_s) +DO_ZPZZ_D(sve2_uqshl_zpzz_d, uint64_t, do_uqshl_d) + +#define do_sqrshl_b(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 8, true, &discard); }) +#define do_sqrshl_h(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 16, true, &discard); }) +#define do_sqrshl_s(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 32, true, &discard); }) +#define do_sqrshl_d(n, m) \ + ({ uint32_t discard; do_sqrshl_d(n, m, true, &discard); }) + +DO_ZPZZ(sve2_sqrshl_zpzz_b, int8_t, H1_2, do_sqrshl_b) +DO_ZPZZ(sve2_sqrshl_zpzz_h, int16_t, H1_2, do_sqrshl_h) +DO_ZPZZ(sve2_sqrshl_zpzz_s, int32_t, H1_4, do_sqrshl_s) +DO_ZPZZ_D(sve2_sqrshl_zpzz_d, int64_t, do_sqrshl_d) + +#undef do_sqrshl_d + +#define do_uqrshl_b(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, (int8_t)m, 8, true, &discard); }) +#define do_uqrshl_h(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, (int16_t)m, 16, true, &discard); }) +#define do_uqrshl_s(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, m, 32, true, &discard); }) +#define do_uqrshl_d(n, m) \ + ({ uint32_t discard; do_uqrshl_d(n, m, true, &discard); }) + +DO_ZPZZ(sve2_uqrshl_zpzz_b, uint8_t, H1_2, do_uqrshl_b) +DO_ZPZZ(sve2_uqrshl_zpzz_h, uint16_t, H1_2, do_uqrshl_h) +DO_ZPZZ(sve2_uqrshl_zpzz_s, uint32_t, H1_4, do_uqrshl_s) +DO_ZPZZ_D(sve2_uqrshl_zpzz_d, uint64_t, do_uqrshl_d) + +#undef do_uqrshl_d + #undef DO_ZPZZ #undef DO_ZPZZ_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c30b3c476e..6c1561d897 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5931,3 +5931,21 @@ static bool trans_SQNEG(DisasContext *s, arg_rpr_esz *a) }; return do_sve2_zpz_ool(s, a, fns[a->esz]); } + +#define DO_SVE2_ZPZZ(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a) \ +{ \ + static gen_helper_gvec_4 * const fns[4] = { \ + gen_helper_sve2_##name##_zpzz_b, gen_helper_sve2_##name##_zpzz_h, \ + gen_helper_sve2_##name##_zpzz_s, gen_helper_sve2_##name##_zpzz_d, \ + }; \ + return do_sve2_zpzz_ool(s, a, fns[a->esz]); \ +} + +DO_SVE2_ZPZZ(SQSHL, sqshl) +DO_SVE2_ZPZZ(SQRSHL, sqrshl) +DO_SVE2_ZPZZ(SRSHL, srshl) + +DO_SVE2_ZPZZ(UQSHL, uqshl) +DO_SVE2_ZPZZ(UQRSHL, uqrshl) +DO_SVE2_ZPZZ(URSHL, urshl) From patchwork Tue May 25 01:02:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277431 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 22EBDC2B9F7 for ; Tue, 25 May 2021 01:09:41 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B0A8D613F6 for ; Tue, 25 May 2021 01:09:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B0A8D613F6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:51418 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLZj-0003ay-RR for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:09:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53438) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUS-0001Lf-Ty for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:12 -0400 Received: from mail-pl1-x62b.google.com ([2607:f8b0:4864:20::62b]:33334) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUL-0001f3-BI for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:12 -0400 Received: by mail-pl1-x62b.google.com with SMTP id b7so11343145plg.0 for ; Mon, 24 May 2021 18:04:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=vwh3FG03GqoghiHmQp6LBozSXJjJCqJA1rP1ahS89tI=; b=wNdypyhQxSkeiPxS/vfu9ph76lgDLrkDN2sLXtpKIhkOvh00CYuUJlfloxGJZsgcem GmYDJHtzCJ0+NaBpNbddpHqCADWZopguJsCNoQG0D7dIj1heamGaLI1vlwTnKrKCrOiS nC4qaa3anpKCrAvTQqXb0vu/vWtBTPCaoHKJ0+sSGcpDRkut00eqbpOcOKwUllOQmbfJ tGJsAKHUte7uy116ocf5YPAXTx8uawdwsTz/qhT9nmP/qW/dr1+FCHmnN+AmhqLzkuWa Qdn0c02d2goyHPzhZo1I0gGiYjrYvT3GnE8yBZ/HFH+K5+bt0kduuF7Z48laQwLblb9k WPbw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=vwh3FG03GqoghiHmQp6LBozSXJjJCqJA1rP1ahS89tI=; b=dzOzYvBTEZuC/nqXfVe7f2ZATcAFb0qUpIoOnK+Y8zStLlpbW2P/e6YnaLoG6EYrmY 7PwfyGvTPwAWkpIr5M9NxnQTkKYjO8PomlUOpH8zzMh5gjjxH7QSZrvpc3t2swqZXpC/ lFE43tvPP8+o7Nka37QlZdNunVaHtxu0JEvyeLAS3SZbFAjflmOX41cdcPU0Y6K+tr/u Riy2D7Td+XaXpJ7TJX9EMH/fEz56R+MjuYgixWwqkX+uTCP6K4C2LLZIwJWd9RMPDEz1 VFX84zAWfI35f7wQqDQ4Rjlw2K3CXOtJ2Hi2tVN81Zz4Hv6KvHBkLB/NKh+q37Jc0xoU zv9A== X-Gm-Message-State: AOAM531oAEoJ/CY1/hNtFX8q8NnZgd0XiE9/W3Z+7JoQtA+Jn/AmWVX0 YmMmfH90XX+jTMDmC9dYMVU7fOp6cdO8iQ== X-Google-Smtp-Source: ABdhPJztOEjyZwgyJ7jIC3qvaydsH6VHOkNs1DwFrugpZIeiczSz8T64Wb37MVPNxTgICsFzJhJQeg== X-Received: by 2002:a17:90a:ca05:: with SMTP id x5mr27132612pjt.16.1621904644083; Mon, 24 May 2021 18:04:04 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:03 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 07/92] target/arm: Implement SVE2 integer halving add/subtract (predicated) Date: Mon, 24 May 2021 18:02:33 -0700 Message-Id: <20210525010358.152808-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62b; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 54 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 11 ++++++++ target/arm/sve_helper.c | 39 +++++++++++++++++++++++++++ target/arm/translate-sve.c | 8 ++++++ 4 files changed, 112 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 62106c74be..5fdc0d223a 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -226,6 +226,60 @@ DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 93f2479693..58c3f7ede4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1129,3 +1129,14 @@ SQRSHL 01000100 .. 001 010 100 ... ..... ..... @rdn_pg_rm UQRSHL 01000100 .. 001 011 100 ... ..... ..... @rdn_pg_rm SQRSHL 01000100 .. 001 110 100 ... ..... ..... @rdm_pg_rn # SQRSHLR UQRSHL 01000100 .. 001 111 100 ... ..... ..... @rdm_pg_rn # UQRSHLR + +### SVE2 integer halving add/subtract (predicated) + +SHADD 01000100 .. 010 000 100 ... ..... ..... @rdn_pg_rm +UHADD 01000100 .. 010 001 100 ... ..... ..... @rdn_pg_rm +SHSUB 01000100 .. 010 010 100 ... ..... ..... @rdn_pg_rm +UHSUB 01000100 .. 010 011 100 ... ..... ..... @rdn_pg_rm +SRHADD 01000100 .. 010 100 100 ... ..... ..... @rdn_pg_rm +URHADD 01000100 .. 010 101 100 ... ..... ..... @rdn_pg_rm +SHSUB 01000100 .. 010 110 100 ... ..... ..... @rdm_pg_rn # SHSUBR +UHSUB 01000100 .. 010 111 100 ... ..... ..... @rdm_pg_rn # UHSUBR diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 17c6440b06..f30af5596c 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -648,6 +648,45 @@ DO_ZPZZ_D(sve2_uqrshl_zpzz_d, uint64_t, do_uqrshl_d) #undef do_uqrshl_d +#define DO_HADD_BHS(n, m) (((int64_t)n + m) >> 1) +#define DO_HADD_D(n, m) ((n >> 1) + (m >> 1) + (n & m & 1)) + +DO_ZPZZ(sve2_shadd_zpzz_b, int8_t, H1, DO_HADD_BHS) +DO_ZPZZ(sve2_shadd_zpzz_h, int16_t, H1_2, DO_HADD_BHS) +DO_ZPZZ(sve2_shadd_zpzz_s, int32_t, H1_4, DO_HADD_BHS) +DO_ZPZZ_D(sve2_shadd_zpzz_d, int64_t, DO_HADD_D) + +DO_ZPZZ(sve2_uhadd_zpzz_b, uint8_t, H1, DO_HADD_BHS) +DO_ZPZZ(sve2_uhadd_zpzz_h, uint16_t, H1_2, DO_HADD_BHS) +DO_ZPZZ(sve2_uhadd_zpzz_s, uint32_t, H1_4, DO_HADD_BHS) +DO_ZPZZ_D(sve2_uhadd_zpzz_d, uint64_t, DO_HADD_D) + +#define DO_RHADD_BHS(n, m) (((int64_t)n + m + 1) >> 1) +#define DO_RHADD_D(n, m) ((n >> 1) + (m >> 1) + ((n | m) & 1)) + +DO_ZPZZ(sve2_srhadd_zpzz_b, int8_t, H1, DO_RHADD_BHS) +DO_ZPZZ(sve2_srhadd_zpzz_h, int16_t, H1_2, DO_RHADD_BHS) +DO_ZPZZ(sve2_srhadd_zpzz_s, int32_t, H1_4, DO_RHADD_BHS) +DO_ZPZZ_D(sve2_srhadd_zpzz_d, int64_t, DO_RHADD_D) + +DO_ZPZZ(sve2_urhadd_zpzz_b, uint8_t, H1, DO_RHADD_BHS) +DO_ZPZZ(sve2_urhadd_zpzz_h, uint16_t, H1_2, DO_RHADD_BHS) +DO_ZPZZ(sve2_urhadd_zpzz_s, uint32_t, H1_4, DO_RHADD_BHS) +DO_ZPZZ_D(sve2_urhadd_zpzz_d, uint64_t, DO_RHADD_D) + +#define DO_HSUB_BHS(n, m) (((int64_t)n - m) >> 1) +#define DO_HSUB_D(n, m) ((n >> 1) - (m >> 1) - (~n & m & 1)) + +DO_ZPZZ(sve2_shsub_zpzz_b, int8_t, H1, DO_HSUB_BHS) +DO_ZPZZ(sve2_shsub_zpzz_h, int16_t, H1_2, DO_HSUB_BHS) +DO_ZPZZ(sve2_shsub_zpzz_s, int32_t, H1_4, DO_HSUB_BHS) +DO_ZPZZ_D(sve2_shsub_zpzz_d, int64_t, DO_HSUB_D) + +DO_ZPZZ(sve2_uhsub_zpzz_b, uint8_t, H1, DO_HSUB_BHS) +DO_ZPZZ(sve2_uhsub_zpzz_h, uint16_t, H1_2, DO_HSUB_BHS) +DO_ZPZZ(sve2_uhsub_zpzz_s, uint32_t, H1_4, DO_HSUB_BHS) +DO_ZPZZ_D(sve2_uhsub_zpzz_d, uint64_t, DO_HSUB_D) + #undef DO_ZPZZ #undef DO_ZPZZ_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 6c1561d897..43690999ab 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5949,3 +5949,11 @@ DO_SVE2_ZPZZ(SRSHL, srshl) DO_SVE2_ZPZZ(UQSHL, uqshl) DO_SVE2_ZPZZ(UQRSHL, uqrshl) DO_SVE2_ZPZZ(URSHL, urshl) + +DO_SVE2_ZPZZ(SHADD, shadd) +DO_SVE2_ZPZZ(SRHADD, srhadd) +DO_SVE2_ZPZZ(SHSUB, shsub) + +DO_SVE2_ZPZZ(UHADD, uhadd) +DO_SVE2_ZPZZ(URHADD, urhadd) +DO_SVE2_ZPZZ(UHSUB, uhsub) From patchwork Tue May 25 01:02:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277433 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA620C2B9F7 for ; Tue, 25 May 2021 01:13:05 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 32E49613CC for ; Tue, 25 May 2021 01:13:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 32E49613CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:59132 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLd2-0000XF-99 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:13:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53498) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUU-0001Tp-ND for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:14 -0400 Received: from mail-pj1-x102a.google.com ([2607:f8b0:4864:20::102a]:37864) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUM-0001fS-0s for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:14 -0400 Received: by mail-pj1-x102a.google.com with SMTP id gb21-20020a17090b0615b029015d1a863a91so12272886pjb.2 for ; Mon, 24 May 2021 18:04:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=xslZ6QVhrkaObQgxBfVFTmsr04y8kNkqoxaeZPgVSlQ=; b=LxnPiXNTurmfdckOu1qmv60cdLQmP5kNsHJdgyyjK2b64QmUfuiOVFwb+Q4O9/rmNY 3FRGFGM5/Bvq3Y6QtX9OrTM3mdI+XabPaGJwAAQZlWOPHBMSXWm/1yTGpp6YPOzRp9Xs 1caURWtdNVjSiefVUPdNPI88OxSWDP7QMYcXuy+qaBKQQnEIn5EH7/zB5nsUZB/ePgvo W3UAe5K3XlRGiFhiePb87gsG9ZMR0vD2cknl6PPb9YoAytAOa2XYlWF6tBP1+qWECHf9 BP1B37PA53/tyBG8B18En1QXG0a4HEhbiNrWmz9m3u1bRSvEg2SFD21PJ7zbDjNdPM9I lf9A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=xslZ6QVhrkaObQgxBfVFTmsr04y8kNkqoxaeZPgVSlQ=; b=T9VnHT7I2La2o2RyNBZra+gZG6X5AC8pfAUxwPOV7zeX0C53NuU6/sUcszoipTuNdN 7wFqK/fiYsTC2gFV9Nr509aJtqSvDAANt6+dZhm6svLr3QJDidX3oALNrBWaE9HCZOro XZIRd+J/NARSG/s2J1tQA9dpiIn/I1p2tTpkQJgGIZpbjgVKXl3OYRllnP4Oi2Fc83QC rUGzN8j17rXZ8JG2l1g9stxIK6c44ZbMRqeC2QN6csFKlyBWsPmztKuokOVP6SIUCnKU /T1Z0wcMEgXuyNLwv4NnEhiWnjbeC3iA/fB2xUxVoz8mIqvfGMJRsBAJM8tcPdf139QF E0aw== X-Gm-Message-State: AOAM532pmM+RZhwReQgeWTMOL97KZ+rmFP3OUgE1pvO88CT27LTgsHNs 5dwSMAmJsyIoRM+ZfYZ32QiW/TCZe9ZUwA== X-Google-Smtp-Source: ABdhPJw7Cu+v/7anc3c+p0Tv0tAGNN7sOipknuVwhg2zlzpGfXPHoELddxwDsXGEYMPFcBspSOaEwA== X-Received: by 2002:a17:90a:df0a:: with SMTP id gp10mr27209830pjb.27.1621904644589; Mon, 24 May 2021 18:04:04 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:04 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 08/92] target/arm: Implement SVE2 integer pairwise arithmetic Date: Mon, 24 May 2021 18:02:34 -0700 Message-Id: <20210525010358.152808-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102a; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v2: Load all inputs before writing any output (laurent desnogues) --- target/arm/helper-sve.h | 45 ++++++++++++++++++++++ target/arm/sve.decode | 8 ++++ target/arm/sve_helper.c | 76 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 6 +++ 4 files changed, 135 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 5fdc0d223a..09bc067dd4 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -326,6 +326,51 @@ DEF_HELPER_FLAGS_5(sve_sel_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_sel_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_addp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_addp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_addp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_addp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 58c3f7ede4..61a3321325 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1140,3 +1140,11 @@ SRHADD 01000100 .. 010 100 100 ... ..... ..... @rdn_pg_rm URHADD 01000100 .. 010 101 100 ... ..... ..... @rdn_pg_rm SHSUB 01000100 .. 010 110 100 ... ..... ..... @rdm_pg_rn # SHSUBR UHSUB 01000100 .. 010 111 100 ... ..... ..... @rdm_pg_rn # UHSUBR + +### SVE2 integer pairwise arithmetic + +ADDP 01000100 .. 010 001 101 ... ..... ..... @rdn_pg_rm +SMAXP 01000100 .. 010 100 101 ... ..... ..... @rdn_pg_rm +UMAXP 01000100 .. 010 101 101 ... ..... ..... @rdn_pg_rm +SMINP 01000100 .. 010 110 101 ... ..... ..... @rdn_pg_rm +UMINP 01000100 .. 010 111 101 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f30af5596c..7406368095 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -690,6 +690,82 @@ DO_ZPZZ_D(sve2_uhsub_zpzz_d, uint64_t, DO_HSUB_D) #undef DO_ZPZZ #undef DO_ZPZZ_D +/* + * Three operand expander, operating on element pairs. + * If the slot I is even, the elements from from VN {I, I+1}. + * If the slot I is odd, the elements from from VM {I-1, I}. + * Load all of the input elements in each pair before overwriting output. + */ +#define DO_ZPZZ_PAIR(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPE n0 = *(TYPE *)(vn + H(i)); \ + TYPE m0 = *(TYPE *)(vm + H(i)); \ + TYPE n1 = *(TYPE *)(vn + H(i + sizeof(TYPE))); \ + TYPE m1 = *(TYPE *)(vm + H(i + sizeof(TYPE))); \ + if (pg & 1) { \ + *(TYPE *)(vd + H(i)) = OP(n0, n1); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + if (pg & 1) { \ + *(TYPE *)(vd + H(i)) = OP(m0, m1); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + } while (i & 15); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. */ +#define DO_ZPZZ_PAIR_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; \ + TYPE *d = vd, *n = vn, *m = vm; \ + uint8_t *pg = vg; \ + for (i = 0; i < opr_sz; i += 2) { \ + TYPE n0 = n[i], n1 = n[i + 1]; \ + TYPE m0 = m[i], m1 = m[i + 1]; \ + if (pg[H1(i)] & 1) { \ + d[i] = OP(n0, n1); \ + } \ + if (pg[H1(i + 1)] & 1) { \ + d[i + 1] = OP(m0, m1); \ + } \ + } \ +} + +DO_ZPZZ_PAIR(sve2_addp_zpzz_b, uint8_t, H1, DO_ADD) +DO_ZPZZ_PAIR(sve2_addp_zpzz_h, uint16_t, H1_2, DO_ADD) +DO_ZPZZ_PAIR(sve2_addp_zpzz_s, uint32_t, H1_4, DO_ADD) +DO_ZPZZ_PAIR_D(sve2_addp_zpzz_d, uint64_t, DO_ADD) + +DO_ZPZZ_PAIR(sve2_umaxp_zpzz_b, uint8_t, H1, DO_MAX) +DO_ZPZZ_PAIR(sve2_umaxp_zpzz_h, uint16_t, H1_2, DO_MAX) +DO_ZPZZ_PAIR(sve2_umaxp_zpzz_s, uint32_t, H1_4, DO_MAX) +DO_ZPZZ_PAIR_D(sve2_umaxp_zpzz_d, uint64_t, DO_MAX) + +DO_ZPZZ_PAIR(sve2_uminp_zpzz_b, uint8_t, H1, DO_MIN) +DO_ZPZZ_PAIR(sve2_uminp_zpzz_h, uint16_t, H1_2, DO_MIN) +DO_ZPZZ_PAIR(sve2_uminp_zpzz_s, uint32_t, H1_4, DO_MIN) +DO_ZPZZ_PAIR_D(sve2_uminp_zpzz_d, uint64_t, DO_MIN) + +DO_ZPZZ_PAIR(sve2_smaxp_zpzz_b, int8_t, H1, DO_MAX) +DO_ZPZZ_PAIR(sve2_smaxp_zpzz_h, int16_t, H1_2, DO_MAX) +DO_ZPZZ_PAIR(sve2_smaxp_zpzz_s, int32_t, H1_4, DO_MAX) +DO_ZPZZ_PAIR_D(sve2_smaxp_zpzz_d, int64_t, DO_MAX) + +DO_ZPZZ_PAIR(sve2_sminp_zpzz_b, int8_t, H1, DO_MIN) +DO_ZPZZ_PAIR(sve2_sminp_zpzz_h, int16_t, H1_2, DO_MIN) +DO_ZPZZ_PAIR(sve2_sminp_zpzz_s, int32_t, H1_4, DO_MIN) +DO_ZPZZ_PAIR_D(sve2_sminp_zpzz_d, int64_t, DO_MIN) + +#undef DO_ZPZZ_PAIR +#undef DO_ZPZZ_PAIR_D + /* Three-operand expander, controlled by a predicate, in which the * third operand is "wide". That is, for D = N op M, the same 64-bit * value of M is used with all of the narrower values of N. diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 43690999ab..2d449c9b57 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5957,3 +5957,9 @@ DO_SVE2_ZPZZ(SHSUB, shsub) DO_SVE2_ZPZZ(UHADD, uhadd) DO_SVE2_ZPZZ(URHADD, urhadd) DO_SVE2_ZPZZ(UHSUB, uhsub) + +DO_SVE2_ZPZZ(ADDP, addp) +DO_SVE2_ZPZZ(SMAXP, smaxp) +DO_SVE2_ZPZZ(UMAXP, umaxp) +DO_SVE2_ZPZZ(SMINP, sminp) +DO_SVE2_ZPZZ(UMINP, uminp) From patchwork Tue May 25 01:02:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277441 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B94CC2B9F7 for ; Tue, 25 May 2021 01:16:17 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 093C0613F9 for ; Tue, 25 May 2021 01:16:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 093C0613F9 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39376 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLg8-0006Fe-02 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:16:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53514) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUV-0001Vi-4y for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:15 -0400 Received: from mail-pj1-x102f.google.com ([2607:f8b0:4864:20::102f]:42878) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUM-0001gM-QZ for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:14 -0400 Received: by mail-pj1-x102f.google.com with SMTP id j6-20020a17090adc86b02900cbfe6f2c96so12184787pjv.1 for ; Mon, 24 May 2021 18:04:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=kTQFUYPN2TBKMrZb2fo1xu1LqT0RL2Q/8a0Tw7EkkbU=; b=QiSPhn9VDpajOP72zlOCP29CDjZaLMpAq2FejhvEFsA+RAD/iZ6PkUjTfUbl6d/dCK myZlBY5Oige9YxjBuJ0wNubYokYALQ7Bu2qqFEnonH8yXaLvjm/5bNkFL8RHPbtulBgC 7W2s0lZr90VNIxB3P4iOBnrNwJH0uVYHJIpMIMsymM4OAp5l1tS/G1kJblDFarBXRWvo QA2XmmUxm8vudJcgS5P5lnVOBLkOymke1U9+X49nnEKmtVTxyV5hp9sw2xh4vSOeG7Kz Wv6Rc8EEVN+PS55ohxZUS37iFgWRV4BpUMue51EYhkrWPX5y9pFVsOoiAiKFDGgGPflj H61w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=kTQFUYPN2TBKMrZb2fo1xu1LqT0RL2Q/8a0Tw7EkkbU=; b=E018IaotSJAx0ADnRXY12lgYxZdGb+WulXFkyWn1gNptPS5Znq7M4mCOI6FNtzVZUw ljk5VAsqOxqGm0XjDlK/IJt5m0txQi8uJniuZmxlqDCGku8IH1jnvGBY1zdGts813nM/ lhNyXek3VlVGj79SAKNrSvjzAe0hS8pAnb8DsudWTxX59UbYC+BBEE7zNZsbO/M/PfZP Di562cYcFoxjF0giK8YEnQMKeg/CFq3hyD7w6ZWTM788OF5uCQVcgjdNnI01M278PRU2 zyVVYzArvlchWMDEameeuKomufbLmrsj7yvC+k68oi1QW6yNz2c4dXzd4ODGrb50nQ/c LWEA== X-Gm-Message-State: AOAM533TuWgR0BB+oQ8ZXYUkSVEPm9SCo+LTAzQdpiZRg0ay6z3X/wR0 XYYczeM+nXtSEhoLuXS1OP9cRhlkdSrmXg== X-Google-Smtp-Source: ABdhPJzfZfWJr/UuJnTRd/Q8DCJIYLGEFVFP1u8uyMbuW1TDzZaW41udUjnIPacW0o+3PMBPA/vViw== X-Received: by 2002:a17:90a:3841:: with SMTP id l1mr2049444pjf.72.1621904645261; Mon, 24 May 2021 18:04:05 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:05 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 09/92] target/arm: Implement SVE2 saturating add/subtract (predicated) Date: Mon, 24 May 2021 18:02:35 -0700 Message-Id: <20210525010358.152808-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102f; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 54 +++++++++++ target/arm/sve.decode | 11 +++ target/arm/sve_helper.c | 194 ++++++++++++++++++++++++++----------- target/arm/translate-sve.c | 7 ++ 4 files changed, 210 insertions(+), 56 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 09bc067dd4..37461c9927 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -371,6 +371,60 @@ DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 61a3321325..cd4f73265f 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1148,3 +1148,14 @@ SMAXP 01000100 .. 010 100 101 ... ..... ..... @rdn_pg_rm UMAXP 01000100 .. 010 101 101 ... ..... ..... @rdn_pg_rm SMINP 01000100 .. 010 110 101 ... ..... ..... @rdn_pg_rm UMINP 01000100 .. 010 111 101 ... ..... ..... @rdn_pg_rm + +### SVE2 saturating add/subtract (predicated) + +SQADD_zpzz 01000100 .. 011 000 100 ... ..... ..... @rdn_pg_rm +UQADD_zpzz 01000100 .. 011 001 100 ... ..... ..... @rdn_pg_rm +SQSUB_zpzz 01000100 .. 011 010 100 ... ..... ..... @rdn_pg_rm +UQSUB_zpzz 01000100 .. 011 011 100 ... ..... ..... @rdn_pg_rm +SUQADD 01000100 .. 011 100 100 ... ..... ..... @rdn_pg_rm +USQADD 01000100 .. 011 101 100 ... ..... ..... @rdn_pg_rm +SQSUB_zpzz 01000100 .. 011 110 100 ... ..... ..... @rdm_pg_rn # SQSUBR +UQSUB_zpzz 01000100 .. 011 111 100 ... ..... ..... @rdm_pg_rn # UQSUBR diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 7406368095..1f1783b8f3 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -687,6 +687,135 @@ DO_ZPZZ(sve2_uhsub_zpzz_h, uint16_t, H1_2, DO_HSUB_BHS) DO_ZPZZ(sve2_uhsub_zpzz_s, uint32_t, H1_4, DO_HSUB_BHS) DO_ZPZZ_D(sve2_uhsub_zpzz_d, uint64_t, DO_HSUB_D) +static inline int32_t do_sat_bhs(int64_t val, int64_t min, int64_t max) +{ + return val >= max ? max : val <= min ? min : val; +} + +#define DO_SQADD_B(n, m) do_sat_bhs((int64_t)n + m, INT8_MIN, INT8_MAX) +#define DO_SQADD_H(n, m) do_sat_bhs((int64_t)n + m, INT16_MIN, INT16_MAX) +#define DO_SQADD_S(n, m) do_sat_bhs((int64_t)n + m, INT32_MIN, INT32_MAX) + +static inline int64_t do_sqadd_d(int64_t n, int64_t m) +{ + int64_t r = n + m; + if (((r ^ n) & ~(n ^ m)) < 0) { + /* Signed overflow. */ + return r < 0 ? INT64_MAX : INT64_MIN; + } + return r; +} + +DO_ZPZZ(sve2_sqadd_zpzz_b, int8_t, H1, DO_SQADD_B) +DO_ZPZZ(sve2_sqadd_zpzz_h, int16_t, H1_2, DO_SQADD_H) +DO_ZPZZ(sve2_sqadd_zpzz_s, int32_t, H1_4, DO_SQADD_S) +DO_ZPZZ_D(sve2_sqadd_zpzz_d, int64_t, do_sqadd_d) + +#define DO_UQADD_B(n, m) do_sat_bhs((int64_t)n + m, 0, UINT8_MAX) +#define DO_UQADD_H(n, m) do_sat_bhs((int64_t)n + m, 0, UINT16_MAX) +#define DO_UQADD_S(n, m) do_sat_bhs((int64_t)n + m, 0, UINT32_MAX) + +static inline uint64_t do_uqadd_d(uint64_t n, uint64_t m) +{ + uint64_t r = n + m; + return r < n ? UINT64_MAX : r; +} + +DO_ZPZZ(sve2_uqadd_zpzz_b, uint8_t, H1, DO_UQADD_B) +DO_ZPZZ(sve2_uqadd_zpzz_h, uint16_t, H1_2, DO_UQADD_H) +DO_ZPZZ(sve2_uqadd_zpzz_s, uint32_t, H1_4, DO_UQADD_S) +DO_ZPZZ_D(sve2_uqadd_zpzz_d, uint64_t, do_uqadd_d) + +#define DO_SQSUB_B(n, m) do_sat_bhs((int64_t)n - m, INT8_MIN, INT8_MAX) +#define DO_SQSUB_H(n, m) do_sat_bhs((int64_t)n - m, INT16_MIN, INT16_MAX) +#define DO_SQSUB_S(n, m) do_sat_bhs((int64_t)n - m, INT32_MIN, INT32_MAX) + +static inline int64_t do_sqsub_d(int64_t n, int64_t m) +{ + int64_t r = n - m; + if (((r ^ n) & (n ^ m)) < 0) { + /* Signed overflow. */ + return r < 0 ? INT64_MAX : INT64_MIN; + } + return r; +} + +DO_ZPZZ(sve2_sqsub_zpzz_b, int8_t, H1, DO_SQSUB_B) +DO_ZPZZ(sve2_sqsub_zpzz_h, int16_t, H1_2, DO_SQSUB_H) +DO_ZPZZ(sve2_sqsub_zpzz_s, int32_t, H1_4, DO_SQSUB_S) +DO_ZPZZ_D(sve2_sqsub_zpzz_d, int64_t, do_sqsub_d) + +#define DO_UQSUB_B(n, m) do_sat_bhs((int64_t)n - m, 0, UINT8_MAX) +#define DO_UQSUB_H(n, m) do_sat_bhs((int64_t)n - m, 0, UINT16_MAX) +#define DO_UQSUB_S(n, m) do_sat_bhs((int64_t)n - m, 0, UINT32_MAX) + +static inline uint64_t do_uqsub_d(uint64_t n, uint64_t m) +{ + return n > m ? n - m : 0; +} + +DO_ZPZZ(sve2_uqsub_zpzz_b, uint8_t, H1, DO_UQSUB_B) +DO_ZPZZ(sve2_uqsub_zpzz_h, uint16_t, H1_2, DO_UQSUB_H) +DO_ZPZZ(sve2_uqsub_zpzz_s, uint32_t, H1_4, DO_UQSUB_S) +DO_ZPZZ_D(sve2_uqsub_zpzz_d, uint64_t, do_uqsub_d) + +#define DO_SUQADD_B(n, m) \ + do_sat_bhs((int64_t)(int8_t)n + m, INT8_MIN, INT8_MAX) +#define DO_SUQADD_H(n, m) \ + do_sat_bhs((int64_t)(int16_t)n + m, INT16_MIN, INT16_MAX) +#define DO_SUQADD_S(n, m) \ + do_sat_bhs((int64_t)(int32_t)n + m, INT32_MIN, INT32_MAX) + +static inline int64_t do_suqadd_d(int64_t n, uint64_t m) +{ + uint64_t r = n + m; + + if (n < 0) { + /* Note that m - abs(n) cannot underflow. */ + if (r > INT64_MAX) { + /* Result is either very large positive or negative. */ + if (m > -n) { + /* m > abs(n), so r is a very large positive. */ + return INT64_MAX; + } + /* Result is negative. */ + } + } else { + /* Both inputs are positive: check for overflow. */ + if (r < m || r > INT64_MAX) { + return INT64_MAX; + } + } + return r; +} + +DO_ZPZZ(sve2_suqadd_zpzz_b, uint8_t, H1, DO_SUQADD_B) +DO_ZPZZ(sve2_suqadd_zpzz_h, uint16_t, H1_2, DO_SUQADD_H) +DO_ZPZZ(sve2_suqadd_zpzz_s, uint32_t, H1_4, DO_SUQADD_S) +DO_ZPZZ_D(sve2_suqadd_zpzz_d, uint64_t, do_suqadd_d) + +#define DO_USQADD_B(n, m) \ + do_sat_bhs((int64_t)n + (int8_t)m, 0, UINT8_MAX) +#define DO_USQADD_H(n, m) \ + do_sat_bhs((int64_t)n + (int16_t)m, 0, UINT16_MAX) +#define DO_USQADD_S(n, m) \ + do_sat_bhs((int64_t)n + (int32_t)m, 0, UINT32_MAX) + +static inline uint64_t do_usqadd_d(uint64_t n, int64_t m) +{ + uint64_t r = n + m; + + if (m < 0) { + return n < -m ? 0 : r; + } + return r < n ? UINT64_MAX : r; +} + +DO_ZPZZ(sve2_usqadd_zpzz_b, uint8_t, H1, DO_USQADD_B) +DO_ZPZZ(sve2_usqadd_zpzz_h, uint16_t, H1_2, DO_USQADD_H) +DO_ZPZZ(sve2_usqadd_zpzz_s, uint32_t, H1_4, DO_USQADD_S) +DO_ZPZZ_D(sve2_usqadd_zpzz_d, uint64_t, do_usqadd_d) + #undef DO_ZPZZ #undef DO_ZPZZ_D @@ -1623,13 +1752,7 @@ void HELPER(sve_sqaddi_b)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int8_t)) { - int r = *(int8_t *)(a + i) + b; - if (r > INT8_MAX) { - r = INT8_MAX; - } else if (r < INT8_MIN) { - r = INT8_MIN; - } - *(int8_t *)(d + i) = r; + *(int8_t *)(d + i) = DO_SQADD_B(b, *(int8_t *)(a + i)); } } @@ -1638,13 +1761,7 @@ void HELPER(sve_sqaddi_h)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int16_t)) { - int r = *(int16_t *)(a + i) + b; - if (r > INT16_MAX) { - r = INT16_MAX; - } else if (r < INT16_MIN) { - r = INT16_MIN; - } - *(int16_t *)(d + i) = r; + *(int16_t *)(d + i) = DO_SQADD_H(b, *(int16_t *)(a + i)); } } @@ -1653,13 +1770,7 @@ void HELPER(sve_sqaddi_s)(void *d, void *a, int64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int32_t)) { - int64_t r = *(int32_t *)(a + i) + b; - if (r > INT32_MAX) { - r = INT32_MAX; - } else if (r < INT32_MIN) { - r = INT32_MIN; - } - *(int32_t *)(d + i) = r; + *(int32_t *)(d + i) = DO_SQADD_S(b, *(int32_t *)(a + i)); } } @@ -1668,13 +1779,7 @@ void HELPER(sve_sqaddi_d)(void *d, void *a, int64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int64_t)) { - int64_t ai = *(int64_t *)(a + i); - int64_t r = ai + b; - if (((r ^ ai) & ~(ai ^ b)) < 0) { - /* Signed overflow. */ - r = (r < 0 ? INT64_MAX : INT64_MIN); - } - *(int64_t *)(d + i) = r; + *(int64_t *)(d + i) = do_sqadd_d(b, *(int64_t *)(a + i)); } } @@ -1687,13 +1792,7 @@ void HELPER(sve_uqaddi_b)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint8_t)) { - int r = *(uint8_t *)(a + i) + b; - if (r > UINT8_MAX) { - r = UINT8_MAX; - } else if (r < 0) { - r = 0; - } - *(uint8_t *)(d + i) = r; + *(uint8_t *)(d + i) = DO_UQADD_B(b, *(uint8_t *)(a + i)); } } @@ -1702,13 +1801,7 @@ void HELPER(sve_uqaddi_h)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint16_t)) { - int r = *(uint16_t *)(a + i) + b; - if (r > UINT16_MAX) { - r = UINT16_MAX; - } else if (r < 0) { - r = 0; - } - *(uint16_t *)(d + i) = r; + *(uint16_t *)(d + i) = DO_UQADD_H(b, *(uint16_t *)(a + i)); } } @@ -1717,13 +1810,7 @@ void HELPER(sve_uqaddi_s)(void *d, void *a, int64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint32_t)) { - int64_t r = *(uint32_t *)(a + i) + b; - if (r > UINT32_MAX) { - r = UINT32_MAX; - } else if (r < 0) { - r = 0; - } - *(uint32_t *)(d + i) = r; + *(uint32_t *)(d + i) = DO_UQADD_S(b, *(uint32_t *)(a + i)); } } @@ -1732,11 +1819,7 @@ void HELPER(sve_uqaddi_d)(void *d, void *a, uint64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint64_t)) { - uint64_t r = *(uint64_t *)(a + i) + b; - if (r < b) { - r = UINT64_MAX; - } - *(uint64_t *)(d + i) = r; + *(uint64_t *)(d + i) = do_uqadd_d(b, *(uint64_t *)(a + i)); } } @@ -1745,8 +1828,7 @@ void HELPER(sve_uqsubi_d)(void *d, void *a, uint64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint64_t)) { - uint64_t ai = *(uint64_t *)(a + i); - *(uint64_t *)(d + i) = (ai < b ? 0 : ai - b); + *(uint64_t *)(d + i) = do_uqsub_d(*(uint64_t *)(a + i), b); } } diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 2d449c9b57..609d5ae7b7 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5963,3 +5963,10 @@ DO_SVE2_ZPZZ(SMAXP, smaxp) DO_SVE2_ZPZZ(UMAXP, umaxp) DO_SVE2_ZPZZ(SMINP, sminp) DO_SVE2_ZPZZ(UMINP, uminp) + +DO_SVE2_ZPZZ(SQADD_zpzz, sqadd) +DO_SVE2_ZPZZ(UQADD_zpzz, uqadd) +DO_SVE2_ZPZZ(SQSUB_zpzz, sqsub) +DO_SVE2_ZPZZ(UQSUB_zpzz, uqsub) +DO_SVE2_ZPZZ(SUQADD, suqadd) +DO_SVE2_ZPZZ(USQADD, usqadd) From patchwork Tue May 25 01:02:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277437 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA2A3C2B9F7 for ; Tue, 25 May 2021 01:14:11 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6255B613F6 for ; Tue, 25 May 2021 01:14:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6255B613F6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:35328 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLe6-0003WI-FN for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:14:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53568) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUW-0001Z0-ND for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:16 -0400 Received: from mail-pg1-x531.google.com ([2607:f8b0:4864:20::531]:43821) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUN-0001gV-8c for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:16 -0400 Received: by mail-pg1-x531.google.com with SMTP id e22so5659766pgv.10 for ; Mon, 24 May 2021 18:04:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=yaQHYwY2JPCt8/qZ/BOmb2MjubbJLY+/yK5upNFIqSk=; b=yyX4371ZFOMRzd0xmjxkt3CePs0MSXH5zsdbuI9Ko+ijR1FNDDhZA6m2/U/GccUkz2 P0sAYNtIccZ5XHNyfJqvlikDl4p4znKzHcFyEIk5LcWksvI/VwysePE+NaD6iEbsTAjK KYIGSPMpm7eAa8CnlfsP55aiefrOUTzKcPLyF+V0RF4fPsR2TggYkvLeKKTh75y0cCbp PSEjb4nec2uRh7xIk4UfpTy1vOJkJ0NsCCndl5jfdC7k7bMYVPn5vJWCnHpAc/5DI0H8 yK7F2CHyZi1XMcd/ndp8CWLCINprGfoSq60sq4YWsfL5yF9eFRb5pHEgRZi6YDAAbXqJ JEWw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=yaQHYwY2JPCt8/qZ/BOmb2MjubbJLY+/yK5upNFIqSk=; b=IQ/yM4tH5Wdvx/2TuqWgbDV6Lrcp63/g2aqWovy95uMnKq+P3GnfhrfCo1yoQKdvi6 aTxJ+S0Iibcb8ngyDRE9nADej0Y26TDRHqiMZniXz+rhqWlmHDuqXLJz4XDgfrmOE8WA LaueDgsBAz3NqNpJaYdlTxF2XAxyvXEmz5pnZso+swDeOCPABnTtLdxsjyBxO/G8SFOW 9wfj2gGbB13Es3EHccb4AZX+zCLBKdoWOgaVe+7Te7F1F9GAkIaQlHtUXukAcxklCLJu 2GpwZWdKoxRdlioUbO2Fqny6WqanswXNwvUTbA87QF7YLDdwCW3xmlZfo349d3ps1R+6 rs/A== X-Gm-Message-State: AOAM531r+1XobFSF0IhezmpQbedyxiybbcAh0EaB8XaCyyG1fzXMX9Tr e5m1tWMMM09OKTAn6o/j94CmtG3uL5K2bg== X-Google-Smtp-Source: ABdhPJx7jx6VU1hXccHES/Cb6FaAWtUazu/VK91vGh0qq3ty02ItDgfAkiXHyLqEBmcoFr8DTBocQA== X-Received: by 2002:aa7:8896:0:b029:2de:a06d:a52f with SMTP id z22-20020aa788960000b02902dea06da52fmr27987203pfe.4.1621904645838; Mon, 24 May 2021 18:04:05 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:05 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 10/92] target/arm: Implement SVE2 integer add/subtract long Date: Mon, 24 May 2021 18:02:36 -0700 Message-Id: <20210525010358.152808-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::531; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x531.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v2: Fix select offsets (laurent desnogues). --- target/arm/helper-sve.h | 24 ++++++++++++++++++++ target/arm/sve.decode | 19 ++++++++++++++++ target/arm/sve_helper.c | 43 +++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 46 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 132 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 37461c9927..a81297b387 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1367,6 +1367,30 @@ DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_ssubl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sabdl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sabdl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sabdl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uaddl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_usubl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uabdl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uabdl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uabdl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index cd4f73265f..fbfd57b23a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1159,3 +1159,22 @@ SUQADD 01000100 .. 011 100 100 ... ..... ..... @rdn_pg_rm USQADD 01000100 .. 011 101 100 ... ..... ..... @rdn_pg_rm SQSUB_zpzz 01000100 .. 011 110 100 ... ..... ..... @rdm_pg_rn # SQSUBR UQSUB_zpzz 01000100 .. 011 111 100 ... ..... ..... @rdm_pg_rn # UQSUBR + +#### SVE2 Widening Integer Arithmetic + +## SVE2 integer add/subtract long + +SADDLB 01000101 .. 0 ..... 00 0000 ..... ..... @rd_rn_rm +SADDLT 01000101 .. 0 ..... 00 0001 ..... ..... @rd_rn_rm +UADDLB 01000101 .. 0 ..... 00 0010 ..... ..... @rd_rn_rm +UADDLT 01000101 .. 0 ..... 00 0011 ..... ..... @rd_rn_rm + +SSUBLB 01000101 .. 0 ..... 00 0100 ..... ..... @rd_rn_rm +SSUBLT 01000101 .. 0 ..... 00 0101 ..... ..... @rd_rn_rm +USUBLB 01000101 .. 0 ..... 00 0110 ..... ..... @rd_rn_rm +USUBLT 01000101 .. 0 ..... 00 0111 ..... ..... @rd_rn_rm + +SABDLB 01000101 .. 0 ..... 00 1100 ..... ..... @rd_rn_rm +SABDLT 01000101 .. 0 ..... 00 1101 ..... ..... @rd_rn_rm +UABDLB 01000101 .. 0 ..... 00 1110 ..... ..... @rd_rn_rm +UABDLT 01000101 .. 0 ..... 00 1111 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 1f1783b8f3..d88fab9865 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1122,6 +1122,49 @@ DO_ZZW(sve_lsl_zzw_s, uint32_t, uint64_t, H1_4, DO_LSL) #undef DO_ZPZ #undef DO_ZPZ_D +/* + * Three-operand expander, unpredicated, in which the two inputs are + * selected from the top or bottom half of the wide column. + */ +#define DO_ZZZ_TB(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sel1 = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPEN); \ + int sel2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(TYPEN); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEN *)(vn + HN(i + sel1)); \ + TYPEW mm = *(TYPEN *)(vm + HN(i + sel2)); \ + *(TYPEW *)(vd + HW(i)) = OP(nn, mm); \ + } \ +} + +DO_ZZZ_TB(sve2_saddl_h, int16_t, int8_t, H1_2, H1, DO_ADD) +DO_ZZZ_TB(sve2_saddl_s, int32_t, int16_t, H1_4, H1_2, DO_ADD) +DO_ZZZ_TB(sve2_saddl_d, int64_t, int32_t, , H1_4, DO_ADD) + +DO_ZZZ_TB(sve2_ssubl_h, int16_t, int8_t, H1_2, H1, DO_SUB) +DO_ZZZ_TB(sve2_ssubl_s, int32_t, int16_t, H1_4, H1_2, DO_SUB) +DO_ZZZ_TB(sve2_ssubl_d, int64_t, int32_t, , H1_4, DO_SUB) + +DO_ZZZ_TB(sve2_sabdl_h, int16_t, int8_t, H1_2, H1, DO_ABD) +DO_ZZZ_TB(sve2_sabdl_s, int32_t, int16_t, H1_4, H1_2, DO_ABD) +DO_ZZZ_TB(sve2_sabdl_d, int64_t, int32_t, , H1_4, DO_ABD) + +DO_ZZZ_TB(sve2_uaddl_h, uint16_t, uint8_t, H1_2, H1, DO_ADD) +DO_ZZZ_TB(sve2_uaddl_s, uint32_t, uint16_t, H1_4, H1_2, DO_ADD) +DO_ZZZ_TB(sve2_uaddl_d, uint64_t, uint32_t, , H1_4, DO_ADD) + +DO_ZZZ_TB(sve2_usubl_h, uint16_t, uint8_t, H1_2, H1, DO_SUB) +DO_ZZZ_TB(sve2_usubl_s, uint32_t, uint16_t, H1_4, H1_2, DO_SUB) +DO_ZZZ_TB(sve2_usubl_d, uint64_t, uint32_t, , H1_4, DO_SUB) + +DO_ZZZ_TB(sve2_uabdl_h, uint16_t, uint8_t, H1_2, H1, DO_ABD) +DO_ZZZ_TB(sve2_uabdl_s, uint32_t, uint16_t, H1_4, H1_2, DO_ABD) +DO_ZZZ_TB(sve2_uabdl_d, uint64_t, uint32_t, , H1_4, DO_ABD) + +#undef DO_ZZZ_TB + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 609d5ae7b7..22983b3b85 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5970,3 +5970,49 @@ DO_SVE2_ZPZZ(SQSUB_zpzz, sqsub) DO_SVE2_ZPZZ(UQSUB_zpzz, uqsub) DO_SVE2_ZPZZ(SUQADD, suqadd) DO_SVE2_ZPZZ(USQADD, usqadd) + +/* + * SVE2 Widening Integer Arithmetic + */ + +static bool do_sve2_zzw_ool(DisasContext *s, arg_rrr_esz *a, + gen_helper_gvec_3 *fn, int data) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, data, fn); + } + return true; +} + +#define DO_SVE2_ZZZ_TB(NAME, name, SEL1, SEL2) \ +static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ +{ \ + static gen_helper_gvec_3 * const fns[4] = { \ + NULL, gen_helper_sve2_##name##_h, \ + gen_helper_sve2_##name##_s, gen_helper_sve2_##name##_d, \ + }; \ + return do_sve2_zzw_ool(s, a, fns[a->esz], (SEL2 << 1) | SEL1); \ +} + +DO_SVE2_ZZZ_TB(SADDLB, saddl, false, false) +DO_SVE2_ZZZ_TB(SSUBLB, ssubl, false, false) +DO_SVE2_ZZZ_TB(SABDLB, sabdl, false, false) + +DO_SVE2_ZZZ_TB(UADDLB, uaddl, false, false) +DO_SVE2_ZZZ_TB(USUBLB, usubl, false, false) +DO_SVE2_ZZZ_TB(UABDLB, uabdl, false, false) + +DO_SVE2_ZZZ_TB(SADDLT, saddl, true, true) +DO_SVE2_ZZZ_TB(SSUBLT, ssubl, true, true) +DO_SVE2_ZZZ_TB(SABDLT, sabdl, true, true) + +DO_SVE2_ZZZ_TB(UADDLT, uaddl, true, true) +DO_SVE2_ZZZ_TB(USUBLT, usubl, true, true) +DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true) From patchwork Tue May 25 01:02:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277449 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40AA9C2B9F7 for ; Tue, 25 May 2021 01:19:55 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 027B9613CC for ; Tue, 25 May 2021 01:19:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 027B9613CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:47916 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLjd-0004BD-TY for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:19:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53576) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUX-0001aL-5M for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:17 -0400 Received: from mail-pf1-x433.google.com ([2607:f8b0:4864:20::433]:44631) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUN-0001hF-PH for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:16 -0400 Received: by mail-pf1-x433.google.com with SMTP id 22so21875700pfv.11 for ; Mon, 24 May 2021 18:04:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=KUNHHrwNpMX6FZ/Id7bckirr0h/yO5yVC29ynoS1sek=; b=jAsr3wLnq718wsfb96Q6c+5GsXNQ/4W0rVjJeESdtUZMfbsgqh1qL5Z6oTWwrCmeNO OYaUr8a2z6jifQ02/URsWolld9E/4OkGIujBNoEyI+8YdpHYlp4n3oSl8diowWuKHKyE g9OG1AQZ+pyfvispYdVFumrZAVxGOPDGlKHAW9YJ9C73yFUVKpUu8+XXP2kjPRjNp5Ne 4BGWmNdgDgSNUiMP4C1hWxeCdJS9Q4gteDYVTgv/rrAuqU8oT5sznFJn1YWlHnSJHnze nvTQvAHSAresrK16NBGq5wTGRzHTDLRvp9CvbUGMebsiwkIT+L31392/LQZMb5/7AOqQ NsRA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KUNHHrwNpMX6FZ/Id7bckirr0h/yO5yVC29ynoS1sek=; b=iZObGcx4dvCaRD4mxybIN70DVdLyeiNJyC7qrqKw6i5lLpBhBYNVQeZqum6As0GATT ga4718wP0kXioI95mxfrrZvzxGOwAJOu8aw9PRDzWxhvtUWAHlPPkuidlXrN4M+k1/9d Fb3TOOveV/iNy9h7viTcZVdBT1GR5E56ItPAg40Tgzzhn3CTgoRsuSF+f8JEDjejwS9n hWA4+KEnk3JaJGgUT/5K2zHyMozKuObLKdUVWukR/JbRWRzgVQ/7E2TV0UkMMYlSMkMt dwTfbOpjTIhsM3H2NmT0mAB5bxIrmHCzTznP8o8pRotj3qh2ztH7ycHSLVa13F3TlTZk WBxg== X-Gm-Message-State: AOAM5318eCpE9Ztl+6iDVS9OTb+Y/RkqujZ1NlPnqHjpjUin6hg22MdQ VOM0NEWHPQfLCYBjNV5apwkaPUfP3xx9rQ== X-Google-Smtp-Source: ABdhPJwSVQ9jp5A9dTr52yeg2fjQ8C+rRvQKmKO7zGkEOxtQSVpHnHh6NvhSH3Z+SDpkWyym7s8YhA== X-Received: by 2002:a63:af57:: with SMTP id s23mr16469666pgo.393.1621904646427; Mon, 24 May 2021 18:04:06 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:06 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 11/92] target/arm: Implement SVE2 integer add/subtract interleaved long Date: Mon, 24 May 2021 18:02:37 -0700 Message-Id: <20210525010358.152808-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::433; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x433.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/sve.decode | 6 ++++++ target/arm/translate-sve.c | 4 ++++ 2 files changed, 10 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index fbfd57b23a..12be0584a8 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1178,3 +1178,9 @@ SABDLB 01000101 .. 0 ..... 00 1100 ..... ..... @rd_rn_rm SABDLT 01000101 .. 0 ..... 00 1101 ..... ..... @rd_rn_rm UABDLB 01000101 .. 0 ..... 00 1110 ..... ..... @rd_rn_rm UABDLT 01000101 .. 0 ..... 00 1111 ..... ..... @rd_rn_rm + +## SVE2 integer add/subtract interleaved long + +SADDLBT 01000101 .. 0 ..... 1000 00 ..... ..... @rd_rn_rm +SSUBLBT 01000101 .. 0 ..... 1000 10 ..... ..... @rd_rn_rm +SSUBLTB 01000101 .. 0 ..... 1000 11 ..... ..... @rd_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 22983b3b85..ae8323adb7 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6016,3 +6016,7 @@ DO_SVE2_ZZZ_TB(SABDLT, sabdl, true, true) DO_SVE2_ZZZ_TB(UADDLT, uaddl, true, true) DO_SVE2_ZZZ_TB(USUBLT, usubl, true, true) DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true) + +DO_SVE2_ZZZ_TB(SADDLBT, saddl, false, true) +DO_SVE2_ZZZ_TB(SSUBLBT, ssubl, false, true) +DO_SVE2_ZZZ_TB(SSUBLTB, ssubl, true, false) From patchwork Tue May 25 01:02:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277445 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2824EC2B9F7 for ; Tue, 25 May 2021 01:19:18 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8EC20613CC for ; Tue, 25 May 2021 01:19:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8EC20613CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:45184 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLj2-0002KL-Ky for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:19:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53648) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUZ-0001fZ-TR for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:19 -0400 Received: from mail-pf1-x436.google.com ([2607:f8b0:4864:20::436]:37410) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUO-0001hs-GD for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:19 -0400 Received: by mail-pf1-x436.google.com with SMTP id q67so5221032pfb.4 for ; Mon, 24 May 2021 18:04:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=2FEqaGXGRGG/h3UifaZhFAnj1R1ao5X9aq13b4ZOWhE=; b=Sh70RDBYxjZcnB+E1Op8OmAgUh8KFzZH01bogekBB1a7yZPs2tOwJ3lQfe4lzYkD4Q MvH749Bxx1xkZ5AilJThIBWeKm2zlzffDPXIxBACBugbVndEHxglhCFhmerkD6Sg6BlB PxDl0BOdPTiPczkSFk1HfydaB2QPARs97cIp7r6VwUxT22TM8YMAqE5fXbB6Y4ihiHvg qznVG0GfMrFJq7a1qWUJDnXsVV51/lFqJGPVy4TrVaImXfGJtRO6pNbnLb0nliX9b+cx Y++epXzdvSDATE/G6C8BxIIrLq3UIVF+V73JKfkdKy3JWSj4giGV0DTsQoz5X/YF2Ukh uNCg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=2FEqaGXGRGG/h3UifaZhFAnj1R1ao5X9aq13b4ZOWhE=; b=MEinxN2g6z/R14mvv8tiYTA9OV1vgELFl99qMoS8AXpeR8HABG7CCQpd5nzBNP/mJc uAdo64/9AfAERHUXY9Pp68EncBfnwoOLnU+Vkuhfz/Cxy2bhRIeLkqNVENmi//Y7s+S9 OGPuKvCikKKW6nAVJwGip5eg3CM+FDv8aul4nIIa74KTi72t0XhPNfD1w1nXkLQ+oMdr UCV997vH8VH1w6hP5kWYYMvbVhiEwOcPuO74jxZ2al8TheF4xI3pvPgYg6zYYKfqoHIB u4Tyz8pW9VB2M3xIbdfeGFVSj7AOboO9oC/pr2U+NP8XVdnvpaKXhPFWFPaa6uWruEM5 h5Mg== X-Gm-Message-State: AOAM533BO0waqQpZ3ebAVyep4DEtVQFIDL5LZLO3usWm2fZ5IM4wqy90 7Hor1Pv7piomF1pQ8RNhligg+EDm3CMzjQ== X-Google-Smtp-Source: ABdhPJz4oSkga0f9ljpD81yzJReQ+v6sBNxTJF/893QQedm1Jl22ubXacDDy1dbIc9Pb820+AeAgUA== X-Received: by 2002:a63:7204:: with SMTP id n4mr16452938pgc.78.1621904647073; Mon, 24 May 2021 18:04:07 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:06 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 12/92] target/arm: Implement SVE2 integer add/subtract wide Date: Mon, 24 May 2021 18:02:38 -0700 Message-Id: <20210525010358.152808-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::436; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x436.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v2: Fix select offsets (laurent desnogues). --- target/arm/helper-sve.h | 16 ++++++++++++++++ target/arm/sve.decode | 12 ++++++++++++ target/arm/sve_helper.c | 30 ++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 20 ++++++++++++++++++++ 4 files changed, 78 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a81297b387..3286a9c205 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1391,6 +1391,22 @@ DEF_HELPER_FLAGS_4(sve2_uabdl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_uabdl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_uabdl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_ssubw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uaddw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_usubw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 12be0584a8..f6f21426ef 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1184,3 +1184,15 @@ UABDLT 01000101 .. 0 ..... 00 1111 ..... ..... @rd_rn_rm SADDLBT 01000101 .. 0 ..... 1000 00 ..... ..... @rd_rn_rm SSUBLBT 01000101 .. 0 ..... 1000 10 ..... ..... @rd_rn_rm SSUBLTB 01000101 .. 0 ..... 1000 11 ..... ..... @rd_rn_rm + +## SVE2 integer add/subtract wide + +SADDWB 01000101 .. 0 ..... 010 000 ..... ..... @rd_rn_rm +SADDWT 01000101 .. 0 ..... 010 001 ..... ..... @rd_rn_rm +UADDWB 01000101 .. 0 ..... 010 010 ..... ..... @rd_rn_rm +UADDWT 01000101 .. 0 ..... 010 011 ..... ..... @rd_rn_rm + +SSUBWB 01000101 .. 0 ..... 010 100 ..... ..... @rd_rn_rm +SSUBWT 01000101 .. 0 ..... 010 101 ..... ..... @rd_rn_rm +USUBWB 01000101 .. 0 ..... 010 110 ..... ..... @rd_rn_rm +USUBWT 01000101 .. 0 ..... 010 111 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d88fab9865..374e02dbf8 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1165,6 +1165,36 @@ DO_ZZZ_TB(sve2_uabdl_d, uint64_t, uint32_t, , H1_4, DO_ABD) #undef DO_ZZZ_TB +#define DO_ZZZ_WTB(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sel2 = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPEN); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEW *)(vn + HW(i)); \ + TYPEW mm = *(TYPEN *)(vm + HN(i + sel2)); \ + *(TYPEW *)(vd + HW(i)) = OP(nn, mm); \ + } \ +} + +DO_ZZZ_WTB(sve2_saddw_h, int16_t, int8_t, H1_2, H1, DO_ADD) +DO_ZZZ_WTB(sve2_saddw_s, int32_t, int16_t, H1_4, H1_2, DO_ADD) +DO_ZZZ_WTB(sve2_saddw_d, int64_t, int32_t, , H1_4, DO_ADD) + +DO_ZZZ_WTB(sve2_ssubw_h, int16_t, int8_t, H1_2, H1, DO_SUB) +DO_ZZZ_WTB(sve2_ssubw_s, int32_t, int16_t, H1_4, H1_2, DO_SUB) +DO_ZZZ_WTB(sve2_ssubw_d, int64_t, int32_t, , H1_4, DO_SUB) + +DO_ZZZ_WTB(sve2_uaddw_h, uint16_t, uint8_t, H1_2, H1, DO_ADD) +DO_ZZZ_WTB(sve2_uaddw_s, uint32_t, uint16_t, H1_4, H1_2, DO_ADD) +DO_ZZZ_WTB(sve2_uaddw_d, uint64_t, uint32_t, , H1_4, DO_ADD) + +DO_ZZZ_WTB(sve2_usubw_h, uint16_t, uint8_t, H1_2, H1, DO_SUB) +DO_ZZZ_WTB(sve2_usubw_s, uint32_t, uint16_t, H1_4, H1_2, DO_SUB) +DO_ZZZ_WTB(sve2_usubw_d, uint64_t, uint32_t, , H1_4, DO_SUB) + +#undef DO_ZZZ_WTB + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ae8323adb7..70900c122f 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6020,3 +6020,23 @@ DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true) DO_SVE2_ZZZ_TB(SADDLBT, saddl, false, true) DO_SVE2_ZZZ_TB(SSUBLBT, ssubl, false, true) DO_SVE2_ZZZ_TB(SSUBLTB, ssubl, true, false) + +#define DO_SVE2_ZZZ_WTB(NAME, name, SEL2) \ +static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ +{ \ + static gen_helper_gvec_3 * const fns[4] = { \ + NULL, gen_helper_sve2_##name##_h, \ + gen_helper_sve2_##name##_s, gen_helper_sve2_##name##_d, \ + }; \ + return do_sve2_zzw_ool(s, a, fns[a->esz], SEL2); \ +} + +DO_SVE2_ZZZ_WTB(SADDWB, saddw, false) +DO_SVE2_ZZZ_WTB(SADDWT, saddw, true) +DO_SVE2_ZZZ_WTB(SSUBWB, ssubw, false) +DO_SVE2_ZZZ_WTB(SSUBWT, ssubw, true) + +DO_SVE2_ZZZ_WTB(UADDWB, uaddw, false) +DO_SVE2_ZZZ_WTB(UADDWT, uaddw, true) +DO_SVE2_ZZZ_WTB(USUBWB, usubw, false) +DO_SVE2_ZZZ_WTB(USUBWT, usubw, true) From patchwork Tue May 25 01:02:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277439 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16143C2B9F7 for ; Tue, 25 May 2021 01:14:17 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9C045613CC for ; Tue, 25 May 2021 01:14:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9C045613CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:35824 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLeB-0003rm-OT for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:14:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53798) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUg-0001lA-0b for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:27 -0400 Received: from mail-pg1-x530.google.com ([2607:f8b0:4864:20::530]:41845) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUP-0001ib-A0 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:25 -0400 Received: by mail-pg1-x530.google.com with SMTP id r1so6546254pgk.8 for ; Mon, 24 May 2021 18:04:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=+zYjL/8UBwWwCIOfTLsN3pMU5qz0QJffUTrpndpZaQA=; b=s0nxnI6lO7rAnIR/EWmqyj++JArqaP1yw2IlMFz72Nu3xGP9G0EaAIAB2FVMwm/oii ltRS3tObi7Qp/3U701+mfGDrZsgDf3yqt6ssSaTakJGOcW33+DXzie4dpa9/S2ELCoNe AM+Ge19q64/t38cX7GUtdp312DCe6Du8kIWILH68tK8bRSHUp5+n+YFgtB8/PTgwcJid fz7HmW7Wz3cY77plL7+9oZ90O3RSILGEI3WW9B35T6BJ4/mTEkj9DuEKRutA1Kgjopcb QKLD07fkZ4qnAErslBPEGLbozoDGV1Qn9h50Hk4so9RslrLvOGfzLJ/WL7eGmgBfwmQn Ei/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+zYjL/8UBwWwCIOfTLsN3pMU5qz0QJffUTrpndpZaQA=; b=AsdhVnDcLQJrlPnhYmtiWYqrbBhd1EDe5fNGWLIEN/csDjgiRQSCRfwnBWRu0XTOn8 KnIOXyHtmugrCif/pyhr/YUUPVf5iLxE+dvqL9yha53iSiuoxSAdm8GOPKhdX0dE+bFi DjqQP1zP7J1BGuCqZyFG5DawnQ6xFOUiUORVHdHyfp2mvq85PZ/qcpC6kC8W1xSm3Ot0 5HFKsKb99OMoe4X5jA27Ihu5UnlRyiKX8lrir2ol2bdXKx/vNR0dE46YJEhSQEgR2kz9 VIzxotJ6PqSd6nS4iA1pG7k1udo9o57CCOdE40gKy9HaLb79mimWmpXGGo4biwuShB+b 20+w== X-Gm-Message-State: AOAM532TPZRybvjghs0F4IE/aGAn7nx5uDh6istQQhYBlTDNuG4YgloZ 9imqbIcXF+RNxrVEphUf9+O0W6qGkq4F9Q== X-Google-Smtp-Source: ABdhPJyzpDkBi+yBi1Z0C4WaLuZf1l3AzDeEgKljxUP+RO+Q8Ds/9fAzE8PvuuCbpRbb9gMKnPd7Kw== X-Received: by 2002:a05:6a00:248e:b029:28e:bca9:5985 with SMTP id c14-20020a056a00248eb029028ebca95985mr27892463pfv.10.1621904647672; Mon, 24 May 2021 18:04:07 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:07 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 13/92] target/arm: Implement SVE2 integer multiply long Date: Mon, 24 May 2021 18:02:39 -0700 Message-Id: <20210525010358.152808-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::530; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x530.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Exclude PMULL from this category for the moment. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 15 +++++++++++++++ target/arm/sve.decode | 9 +++++++++ target/arm/sve_helper.c | 31 +++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 9 +++++++++ 4 files changed, 64 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 3286a9c205..ad8121eec6 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2347,4 +2347,19 @@ DEF_HELPER_FLAGS_6(sve_stdd_le_zd_mte, TCG_CALL_NO_WG, DEF_HELPER_FLAGS_6(sve_stdd_be_zd_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmull_zzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmull_zzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmull_zzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_smull_zzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_smull_zzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_smull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_umull_zzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_umull_zzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_umull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f6f21426ef..d9a72b7661 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1196,3 +1196,12 @@ SSUBWB 01000101 .. 0 ..... 010 100 ..... ..... @rd_rn_rm SSUBWT 01000101 .. 0 ..... 010 101 ..... ..... @rd_rn_rm USUBWB 01000101 .. 0 ..... 010 110 ..... ..... @rd_rn_rm USUBWT 01000101 .. 0 ..... 010 111 ..... ..... @rd_rn_rm + +## SVE2 integer multiply long + +SQDMULLB_zzz 01000101 .. 0 ..... 011 000 ..... ..... @rd_rn_rm +SQDMULLT_zzz 01000101 .. 0 ..... 011 001 ..... ..... @rd_rn_rm +SMULLB_zzz 01000101 .. 0 ..... 011 100 ..... ..... @rd_rn_rm +SMULLT_zzz 01000101 .. 0 ..... 011 101 ..... ..... @rd_rn_rm +UMULLB_zzz 01000101 .. 0 ..... 011 110 ..... ..... @rd_rn_rm +UMULLT_zzz 01000101 .. 0 ..... 011 111 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 374e02dbf8..cfd1a7cb49 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1163,6 +1163,37 @@ DO_ZZZ_TB(sve2_uabdl_h, uint16_t, uint8_t, H1_2, H1, DO_ABD) DO_ZZZ_TB(sve2_uabdl_s, uint32_t, uint16_t, H1_4, H1_2, DO_ABD) DO_ZZZ_TB(sve2_uabdl_d, uint64_t, uint32_t, , H1_4, DO_ABD) +DO_ZZZ_TB(sve2_smull_zzz_h, int16_t, int8_t, H1_2, H1, DO_MUL) +DO_ZZZ_TB(sve2_smull_zzz_s, int32_t, int16_t, H1_4, H1_2, DO_MUL) +DO_ZZZ_TB(sve2_smull_zzz_d, int64_t, int32_t, , H1_4, DO_MUL) + +DO_ZZZ_TB(sve2_umull_zzz_h, uint16_t, uint8_t, H1_2, H1, DO_MUL) +DO_ZZZ_TB(sve2_umull_zzz_s, uint32_t, uint16_t, H1_4, H1_2, DO_MUL) +DO_ZZZ_TB(sve2_umull_zzz_d, uint64_t, uint32_t, , H1_4, DO_MUL) + +/* Note that the multiply cannot overflow, but the doubling can. */ +static inline int16_t do_sqdmull_h(int16_t n, int16_t m) +{ + int16_t val = n * m; + return DO_SQADD_H(val, val); +} + +static inline int32_t do_sqdmull_s(int32_t n, int32_t m) +{ + int32_t val = n * m; + return DO_SQADD_S(val, val); +} + +static inline int64_t do_sqdmull_d(int64_t n, int64_t m) +{ + int64_t val = n * m; + return do_sqadd_d(val, val); +} + +DO_ZZZ_TB(sve2_sqdmull_zzz_h, int16_t, int8_t, H1_2, H1, do_sqdmull_h) +DO_ZZZ_TB(sve2_sqdmull_zzz_s, int32_t, int16_t, H1_4, H1_2, do_sqdmull_s) +DO_ZZZ_TB(sve2_sqdmull_zzz_d, int64_t, int32_t, , H1_4, do_sqdmull_d) + #undef DO_ZZZ_TB #define DO_ZZZ_WTB(NAME, TYPEW, TYPEN, HW, HN, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 70900c122f..19a1f289d8 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6021,6 +6021,15 @@ DO_SVE2_ZZZ_TB(SADDLBT, saddl, false, true) DO_SVE2_ZZZ_TB(SSUBLBT, ssubl, false, true) DO_SVE2_ZZZ_TB(SSUBLTB, ssubl, true, false) +DO_SVE2_ZZZ_TB(SQDMULLB_zzz, sqdmull_zzz, false, false) +DO_SVE2_ZZZ_TB(SQDMULLT_zzz, sqdmull_zzz, true, true) + +DO_SVE2_ZZZ_TB(SMULLB_zzz, smull_zzz, false, false) +DO_SVE2_ZZZ_TB(SMULLT_zzz, smull_zzz, true, true) + +DO_SVE2_ZZZ_TB(UMULLB_zzz, umull_zzz, false, false) +DO_SVE2_ZZZ_TB(UMULLT_zzz, umull_zzz, true, true) + #define DO_SVE2_ZZZ_WTB(NAME, name, SEL2) \ static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ { \ From patchwork Tue May 25 01:02:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277457 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A6306C2B9F7 for ; Tue, 25 May 2021 01:23:01 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 33E36613CC for ; Tue, 25 May 2021 01:23:01 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 33E36613CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:56316 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLme-0001Zn-8b for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:23:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53684) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUb-0001hh-Bt for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:21 -0400 Received: from mail-pf1-x429.google.com ([2607:f8b0:4864:20::429]:45853) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUP-0001iw-Pg for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:21 -0400 Received: by mail-pf1-x429.google.com with SMTP id d16so22255512pfn.12 for ; Mon, 24 May 2021 18:04:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/IR73sJZl9/5Rj7aH7uFBFILgqJUCpPjkqDLQZz1N3o=; b=PJzQEKjPXXvyWTGkbK1J8g238xwAvEEAHlpYnDH8cTCoaGOme5tsmpAvFCoFTyK3D1 30VPozPq7fmemk/0PAhbnsCzlKmvZFiMvf4+C6B0qZ1tTl27wooFOGj3YyAFDEfkw/0z 0WSz5hZAdkZrUazznxTUZYf5qb2aByLyWotMaBwMEBYDqr2Yfm5j3PmpCN4EFoBMvXGp y1MU8xX9n+Mb6pfgzDi+5frkq6b2n71OOjv8431yGfJEzNiKmNbJTuydLbFDlYlWapKO 75iUABr0cUZbORfWosSJH360JPxPYnMcroB7PLJf+3uzT6jKJ465IpZY2O9UWBYR7IKv MSHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/IR73sJZl9/5Rj7aH7uFBFILgqJUCpPjkqDLQZz1N3o=; b=IbQpi6uiAzlduGDIj9wAwDRjonNBYdlqYBSm7A/g2eMoWN5pAieXfD99zzqOokvpN+ kIayVKOiG3ojoD2trCd6eh+JRX4rEhrl9yxevsPl23JP5O1HQkwg400/+Yt0UCvZ5agf qU3b64wpFJczUKxqhRPmiTsoKHGcAcr+LJZMetkuZuJ+tf+GsFeTg69pj+SfavTwXPFe hRbTl9qv35B12b7p21umCmCKVucsqzKmBlRwVE6X/xidEstUok5ydE1dM1RsuY86QpAb IT2Ale6CiREqmnQuf9GvvxfaNXd0wZSnB/Ej33+hAFninc7XHD7wzY6GWqv6att27CZQ v/iw== X-Gm-Message-State: AOAM5320D2Xy0zl2JOFaDD+PVekft/+AsCYARqiwUjzuqpcjYPfkbPta gG+UQFsK9Cjm5hWZmLdUIckOcaRQxjfwTQ== X-Google-Smtp-Source: ABdhPJwgxgpe8Hq2OrYYcf388yzwADl5ppbncJtO6xv7WJbnVLqMAqq0oqIpFxpf/t9TEHHFns4aYw== X-Received: by 2002:a62:1b97:0:b029:24e:44e9:a8c1 with SMTP id b145-20020a621b970000b029024e44e9a8c1mr27562420pfb.19.1621904648270; Mon, 24 May 2021 18:04:08 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:08 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 14/92] target/arm: Implement SVE2 PMULLB, PMULLT Date: Mon, 24 May 2021 18:02:40 -0700 Message-Id: <20210525010358.152808-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::429; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/cpu.h | 10 ++++++++++ target/arm/helper-sve.h | 1 + target/arm/sve.decode | 2 ++ target/arm/translate-sve.c | 22 ++++++++++++++++++++++ target/arm/vec_helper.c | 24 ++++++++++++++++++++++++ 5 files changed, 59 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index a6e1fa6333..902579d24b 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -4231,6 +4231,16 @@ static inline bool isar_feature_aa64_sve2(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SVEVER) != 0; } +static inline bool isar_feature_aa64_sve2_aes(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, AES) != 0; +} + +static inline bool isar_feature_aa64_sve2_pmull128(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, AES) >= 2; +} + /* * Feature tests for "does this exist in either 32-bit or 64-bit?" */ diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ad8121eec6..bf3e533eb4 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2363,3 +2363,4 @@ DEF_HELPER_FLAGS_4(sve2_umull_zzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_umull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_pmull_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index d9a72b7661..016c15ebb6 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1201,6 +1201,8 @@ USUBWT 01000101 .. 0 ..... 010 111 ..... ..... @rd_rn_rm SQDMULLB_zzz 01000101 .. 0 ..... 011 000 ..... ..... @rd_rn_rm SQDMULLT_zzz 01000101 .. 0 ..... 011 001 ..... ..... @rd_rn_rm +PMULLB 01000101 .. 0 ..... 011 010 ..... ..... @rd_rn_rm +PMULLT 01000101 .. 0 ..... 011 011 ..... ..... @rd_rn_rm SMULLB_zzz 01000101 .. 0 ..... 011 100 ..... ..... @rd_rn_rm SMULLT_zzz 01000101 .. 0 ..... 011 101 ..... ..... @rd_rn_rm UMULLB_zzz 01000101 .. 0 ..... 011 110 ..... ..... @rd_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 19a1f289d8..fbdccc1c68 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6030,6 +6030,28 @@ DO_SVE2_ZZZ_TB(SMULLT_zzz, smull_zzz, true, true) DO_SVE2_ZZZ_TB(UMULLB_zzz, umull_zzz, false, false) DO_SVE2_ZZZ_TB(UMULLT_zzz, umull_zzz, true, true) +static bool do_trans_pmull(DisasContext *s, arg_rrr_esz *a, bool sel) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_pmull_q, gen_helper_sve2_pmull_h, + NULL, gen_helper_sve2_pmull_d, + }; + if (a->esz == 0 && !dc_isar_feature(aa64_sve2_pmull128, s)) { + return false; + } + return do_sve2_zzw_ool(s, a, fns[a->esz], sel); +} + +static bool trans_PMULLB(DisasContext *s, arg_rrr_esz *a) +{ + return do_trans_pmull(s, a, false); +} + +static bool trans_PMULLT(DisasContext *s, arg_rrr_esz *a) +{ + return do_trans_pmull(s, a, true); +} + #define DO_SVE2_ZZZ_WTB(NAME, name, SEL2) \ static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ { \ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 40b92100bf..b0ce597060 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1750,6 +1750,30 @@ void HELPER(sve2_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc) d[i] = pmull_h(nn, mm); } } + +static uint64_t pmull_d(uint64_t op1, uint64_t op2) +{ + uint64_t result = 0; + int i; + + for (i = 0; i < 32; ++i) { + uint64_t mask = -((op1 >> i) & 1); + result ^= (op2 << i) & mask; + } + return result; +} + +void HELPER(sve2_pmull_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t sel = H4(simd_data(desc)); + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *n = vn, *m = vm; + uint64_t *d = vd; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = pmull_d(n[2 * i + sel], m[2 * i + sel]); + } +} #endif #define DO_CMP0(NAME, TYPE, OP) \ From patchwork Tue May 25 01:02:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277435 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D2A9AC2B9F7 for ; Tue, 25 May 2021 01:14:08 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 55DD2613F6 for ; Tue, 25 May 2021 01:14:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 55DD2613F6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:35048 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLe3-0003KY-F9 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:14:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53850) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUj-0001ni-OH for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:31 -0400 Received: from mail-pg1-x535.google.com ([2607:f8b0:4864:20::535]:43825) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUQ-0001jg-DG for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:28 -0400 Received: by mail-pg1-x535.google.com with SMTP id e22so5659837pgv.10 for ; Mon, 24 May 2021 18:04:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Y4c9Q1ltTS6oaJqAB0GG5NXaf6ZXrAYDm4l6aGlsq0Y=; b=WMhcMGMTR2wlZkUt8N4a2gpULTXInc3yWrIxyZAAWrc5heUYx+1voopJ6VZ1ItMZ3S wbSqMYAihqxItzv4pfTkAGz8XyZCHOg0oJJAZcPsbv57rpJCVrMz9cCeV6m3jwHlrnaZ b90HcdxVOqlgSMR1TtTIsjRlvTPPS4h2xzIdqFi+MRxxyJAq1gAb/lDN/0PMKPoj3XbL y3K2E3eVHCvk084sndIsONCQTQu72170i2JaGaiKCr/Jc8oKZqLQ5asqXemN6Hgnty5T nInuSJntUgFO3bModCT9/Ta4zQ5Rh/3fjM1bk8HjwnEIIP6J14spgkyjxnNhSOJQu4cp TtIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Y4c9Q1ltTS6oaJqAB0GG5NXaf6ZXrAYDm4l6aGlsq0Y=; b=eVyLU2vp/K9kjG8VZgfFQC2ih+XIEtCX3muGcaIfHYjExbTMny/FU5Ir+RFnTP4qNs Rs37ZskLaRHU0BfhJw9DAPb9vD4G/Qe4H968Br0hqiEJ2MIE7fFzPfaIOumeP0gBAT2W 3D7OKRTur1NX2iMuPnyYKXBiEfLLBWWtQdRUoTuAWSVBxhvU+EtPmT9Rk4TVzENzauRz 6S1ppCuAtrRr2YGwOffEQ9jcB8pPrMnIjxsNRMkfwMZ63/bhuyna48w6SeByCCxqAV/n R/PpcmrGbHq1ZjzmIIPHR32vtcb5JWjjQlexUqd+guPvp4fgB3+f8ojc70h5BuEXa9iT Azzg== X-Gm-Message-State: AOAM533C9jcTPasRl1aLUSf2hu2yYUxsF2S9xsP5mJqTjuCae4lpW8tg LOYCptsEYsWLmsD+o75OViG7MVtFIaw3yw== X-Google-Smtp-Source: ABdhPJzFaJfCzWe9ETY7zdpiPXAKVm46BibQfpIE2N6XIW6K/uZDQg+SvUiekyCMn9aNp0aI96wEhQ== X-Received: by 2002:a63:9c7:: with SMTP id 190mr16409885pgj.149.1621904648949; Mon, 24 May 2021 18:04:08 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:08 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 15/92] target/arm: Implement SVE2 bitwise shift left long Date: Mon, 24 May 2021 18:02:41 -0700 Message-Id: <20210525010358.152808-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::535; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x535.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 8 ++ target/arm/sve.decode | 8 ++ target/arm/sve_helper.c | 22 +++++ target/arm/translate-sve.c | 159 +++++++++++++++++++++++++++++++++++++ 4 files changed, 197 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index bf3e533eb4..740939e7a8 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2364,3 +2364,11 @@ DEF_HELPER_FLAGS_4(sve2_umull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_pmull_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sshll_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sshll_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sshll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_ushll_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_ushll_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_ushll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 016c15ebb6..a3191eba7b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1207,3 +1207,11 @@ SMULLB_zzz 01000101 .. 0 ..... 011 100 ..... ..... @rd_rn_rm SMULLT_zzz 01000101 .. 0 ..... 011 101 ..... ..... @rd_rn_rm UMULLB_zzz 01000101 .. 0 ..... 011 110 ..... ..... @rd_rn_rm UMULLT_zzz 01000101 .. 0 ..... 011 111 ..... ..... @rd_rn_rm + +## SVE2 bitwise shift left long + +# Note bit23 == 0 is handled by esz > 0 in do_sve2_shll_tb. +SSHLLB 01000101 .. 0 ..... 1010 00 ..... ..... @rd_rn_tszimm_shl +SSHLLT 01000101 .. 0 ..... 1010 01 ..... ..... @rd_rn_tszimm_shl +USHLLB 01000101 .. 0 ..... 1010 10 ..... ..... @rd_rn_tszimm_shl +USHLLT 01000101 .. 0 ..... 1010 11 ..... ..... @rd_rn_tszimm_shl diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index cfd1a7cb49..79b268cbba 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1226,6 +1226,28 @@ DO_ZZZ_WTB(sve2_usubw_d, uint64_t, uint32_t, , H1_4, DO_SUB) #undef DO_ZZZ_WTB +#define DO_ZZI_SHLL(NAME, TYPEW, TYPEN, HW, HN) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + intptr_t sel = (simd_data(desc) & 1) * sizeof(TYPEN); \ + int shift = simd_data(desc) >> 1; \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEN *)(vn + HN(i + sel)); \ + *(TYPEW *)(vd + HW(i)) = nn << shift; \ + } \ +} + +DO_ZZI_SHLL(sve2_sshll_h, int16_t, int8_t, H1_2, H1) +DO_ZZI_SHLL(sve2_sshll_s, int32_t, int16_t, H1_4, H1_2) +DO_ZZI_SHLL(sve2_sshll_d, int64_t, int32_t, , H1_4) + +DO_ZZI_SHLL(sve2_ushll_h, uint16_t, uint8_t, H1_2, H1) +DO_ZZI_SHLL(sve2_ushll_s, uint32_t, uint16_t, H1_4, H1_2) +DO_ZZI_SHLL(sve2_ushll_d, uint64_t, uint32_t, , H1_4) + +#undef DO_ZZI_SHLL + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fbdccc1c68..da7308d1af 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6071,3 +6071,162 @@ DO_SVE2_ZZZ_WTB(UADDWB, uaddw, false) DO_SVE2_ZZZ_WTB(UADDWT, uaddw, true) DO_SVE2_ZZZ_WTB(USUBWB, usubw, false) DO_SVE2_ZZZ_WTB(USUBWT, usubw, true) + +static void gen_sshll_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t imm) +{ + int top = imm & 1; + int shl = imm >> 1; + int halfbits = 4 << vece; + + if (top) { + if (shl == halfbits) { + TCGv_vec t = tcg_temp_new_vec_matching(d); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(halfbits, halfbits)); + tcg_gen_and_vec(vece, d, n, t); + tcg_temp_free_vec(t); + } else { + tcg_gen_sari_vec(vece, d, n, halfbits); + tcg_gen_shli_vec(vece, d, d, shl); + } + } else { + tcg_gen_shli_vec(vece, d, n, halfbits); + tcg_gen_sari_vec(vece, d, d, halfbits - shl); + } +} + +static void gen_ushll_i64(unsigned vece, TCGv_i64 d, TCGv_i64 n, int imm) +{ + int halfbits = 4 << vece; + int top = imm & 1; + int shl = (imm >> 1); + int shift; + uint64_t mask; + + mask = MAKE_64BIT_MASK(0, halfbits); + mask <<= shl; + mask = dup_const(vece, mask); + + shift = shl - top * halfbits; + if (shift < 0) { + tcg_gen_shri_i64(d, n, -shift); + } else { + tcg_gen_shli_i64(d, n, shift); + } + tcg_gen_andi_i64(d, d, mask); +} + +static void gen_ushll16_i64(TCGv_i64 d, TCGv_i64 n, int64_t imm) +{ + gen_ushll_i64(MO_16, d, n, imm); +} + +static void gen_ushll32_i64(TCGv_i64 d, TCGv_i64 n, int64_t imm) +{ + gen_ushll_i64(MO_32, d, n, imm); +} + +static void gen_ushll64_i64(TCGv_i64 d, TCGv_i64 n, int64_t imm) +{ + gen_ushll_i64(MO_64, d, n, imm); +} + +static void gen_ushll_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t imm) +{ + int halfbits = 4 << vece; + int top = imm & 1; + int shl = imm >> 1; + + if (top) { + if (shl == halfbits) { + TCGv_vec t = tcg_temp_new_vec_matching(d); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(halfbits, halfbits)); + tcg_gen_and_vec(vece, d, n, t); + tcg_temp_free_vec(t); + } else { + tcg_gen_shri_vec(vece, d, n, halfbits); + tcg_gen_shli_vec(vece, d, d, shl); + } + } else { + if (shl == 0) { + TCGv_vec t = tcg_temp_new_vec_matching(d); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_and_vec(vece, d, n, t); + tcg_temp_free_vec(t); + } else { + tcg_gen_shli_vec(vece, d, n, halfbits); + tcg_gen_shri_vec(vece, d, d, halfbits - shl); + } + } +} + +static bool do_sve2_shll_tb(DisasContext *s, arg_rri_esz *a, + bool sel, bool uns) +{ + static const TCGOpcode sshll_list[] = { + INDEX_op_shli_vec, INDEX_op_sari_vec, 0 + }; + static const TCGOpcode ushll_list[] = { + INDEX_op_shli_vec, INDEX_op_shri_vec, 0 + }; + static const GVecGen2i ops[2][3] = { + { { .fniv = gen_sshll_vec, + .opt_opc = sshll_list, + .fno = gen_helper_sve2_sshll_h, + .vece = MO_16 }, + { .fniv = gen_sshll_vec, + .opt_opc = sshll_list, + .fno = gen_helper_sve2_sshll_s, + .vece = MO_32 }, + { .fniv = gen_sshll_vec, + .opt_opc = sshll_list, + .fno = gen_helper_sve2_sshll_d, + .vece = MO_64 } }, + { { .fni8 = gen_ushll16_i64, + .fniv = gen_ushll_vec, + .opt_opc = ushll_list, + .fno = gen_helper_sve2_ushll_h, + .vece = MO_16 }, + { .fni8 = gen_ushll32_i64, + .fniv = gen_ushll_vec, + .opt_opc = ushll_list, + .fno = gen_helper_sve2_ushll_s, + .vece = MO_32 }, + { .fni8 = gen_ushll64_i64, + .fniv = gen_ushll_vec, + .opt_opc = ushll_list, + .fno = gen_helper_sve2_ushll_d, + .vece = MO_64 } }, + }; + + if (a->esz < 0 || a->esz > 2 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2i(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, (a->imm << 1) | sel, + &ops[uns][a->esz]); + } + return true; +} + +static bool trans_SSHLLB(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, false, false); +} + +static bool trans_SSHLLT(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, true, false); +} + +static bool trans_USHLLB(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, false, true); +} + +static bool trans_USHLLT(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, true, true); +} From patchwork Tue May 25 01:02:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277451 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C51BDC2B9F7 for ; Tue, 25 May 2021 01:22:32 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4DE56613F6 for ; Tue, 25 May 2021 01:22:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4DE56613F6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:53578 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLmB-00087U-5i for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:22:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53812) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUh-0001m7-Np for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:29 -0400 Received: from mail-pg1-x52d.google.com ([2607:f8b0:4864:20::52d]:39733) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUQ-0001k0-Vi for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:27 -0400 Received: by mail-pg1-x52d.google.com with SMTP id v14so18677719pgi.6 for ; Mon, 24 May 2021 18:04:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=YmkDxU3EI1YasUomfQhOObIJxE4RDefqjrp+VfccipQ=; b=cwu4i28GagnlRArerYF4uTyMSYH2rikYBgYT4tdVIUFvsUCAQDcjNYPZEKyN3Z7tn6 TenkMLdEEbp+6V+qOth24KW5r35hiqChezxaSvZ7qboXoP6lYjWj52WQziokzvTT162+ JCACWa7PRQWB+sGrojG01iyv4U52+Q4Xsmm1r+HGeXoMffdf2gVT/rW51e5YKjOUGQLm YlH3zsh/oY0rtU1wydVjXEaVYUCWig27TKRot4azJMRyrEMPIqEpVdHVOuTQwye5rJIx r5H8bitHpaWM2CbLbFYkBUv90OaFYCzD3jWanyrEAHtM8vEeuzmpmCTMy/QixTDzxGxe QTGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=YmkDxU3EI1YasUomfQhOObIJxE4RDefqjrp+VfccipQ=; b=F3iaVtrvr6+rSZQclLp7N/KqXEwPjp+qUDJmQxYN5tjVdY8O8ZCcQASEcIFa5rJUMx Hz6OAh/okNfHJkt1zLqvoeB48mX6p272267yEKVI5k/UnXd01af96JtYtfahMI+OAou7 QwDu306Lpm5whtvHSTY2LmopekF1jp1P5Xiw7Mhld5wLhsfEt86nGV7RnD+Z64LMmqAs z9MgpOTd4+RxRollwXM+QWx135DDPh8sV9C0PxmPAfzByLY8IYfzNPlgvXN4t58cIoy8 96Du8ot5y0+9m85hgMnhuCnuCcXdq7uUn3KbPNLdM7YxoH1dF1W9vehBSRbUxRU9jevT Zsog== X-Gm-Message-State: AOAM530AxSzSo/ZqRivfzEnELq6Ec4c+lJzEuQKt8BbXPwll4pV72D+l taHJorNUvGZi3y53j6nu4ruEh3QNP3sG6w== X-Google-Smtp-Source: ABdhPJxjVXkNvUsFKf0hivBq47hWbxtELHXEbBht0uaf08K+Lbo4olFTXk18UHS9ZW/2ebtcahGebA== X-Received: by 2002:a62:1cd4:0:b029:2d9:f0ef:2b42 with SMTP id c203-20020a621cd40000b02902d9f0ef2b42mr27038164pfc.36.1621904649530; Mon, 24 May 2021 18:04:09 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:09 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 16/92] target/arm: Implement SVE2 bitwise exclusive-or interleaved Date: Mon, 24 May 2021 18:02:42 -0700 Message-Id: <20210525010358.152808-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52d; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 20 ++++++++++++++++++++ target/arm/translate-sve.c | 19 +++++++++++++++++++ 4 files changed, 49 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 740939e7a8..f65818da05 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2372,3 +2372,8 @@ DEF_HELPER_FLAGS_3(sve2_sshll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_ushll_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_ushll_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_ushll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_eoril_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_eoril_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_eoril_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_eoril_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a3191eba7b..0922a44829 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1215,3 +1215,8 @@ SSHLLB 01000101 .. 0 ..... 1010 00 ..... ..... @rd_rn_tszimm_shl SSHLLT 01000101 .. 0 ..... 1010 01 ..... ..... @rd_rn_tszimm_shl USHLLB 01000101 .. 0 ..... 1010 10 ..... ..... @rd_rn_tszimm_shl USHLLT 01000101 .. 0 ..... 1010 11 ..... ..... @rd_rn_tszimm_shl + +## SVE2 bitwise exclusive-or interleaved + +EORBT 01000101 .. 0 ..... 10010 0 ..... ..... @rd_rn_rm +EORTB 01000101 .. 0 ..... 10010 1 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 79b268cbba..1af6dfde8e 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1226,6 +1226,26 @@ DO_ZZZ_WTB(sve2_usubw_d, uint64_t, uint32_t, , H1_4, DO_SUB) #undef DO_ZZZ_WTB +#define DO_ZZZ_NTB(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + intptr_t sel1 = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPE); \ + intptr_t sel2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(TYPE); \ + for (i = 0; i < opr_sz; i += 2 * sizeof(TYPE)) { \ + TYPE nn = *(TYPE *)(vn + H(i + sel1)); \ + TYPE mm = *(TYPE *)(vm + H(i + sel2)); \ + *(TYPE *)(vd + H(i + sel1)) = OP(nn, mm); \ + } \ +} + +DO_ZZZ_NTB(sve2_eoril_b, uint8_t, H1, DO_EOR) +DO_ZZZ_NTB(sve2_eoril_h, uint16_t, H1_2, DO_EOR) +DO_ZZZ_NTB(sve2_eoril_s, uint32_t, H1_4, DO_EOR) +DO_ZZZ_NTB(sve2_eoril_d, uint64_t, , DO_EOR) + +#undef DO_ZZZ_NTB + #define DO_ZZI_SHLL(NAME, TYPEW, TYPEN, HW, HN) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index da7308d1af..d2c1fafc5f 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6030,6 +6030,25 @@ DO_SVE2_ZZZ_TB(SMULLT_zzz, smull_zzz, true, true) DO_SVE2_ZZZ_TB(UMULLB_zzz, umull_zzz, false, false) DO_SVE2_ZZZ_TB(UMULLT_zzz, umull_zzz, true, true) +static bool do_eor_tb(DisasContext *s, arg_rrr_esz *a, bool sel1) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_eoril_b, gen_helper_sve2_eoril_h, + gen_helper_sve2_eoril_s, gen_helper_sve2_eoril_d, + }; + return do_sve2_zzw_ool(s, a, fns[a->esz], (!sel1 << 1) | sel1); +} + +static bool trans_EORBT(DisasContext *s, arg_rrr_esz *a) +{ + return do_eor_tb(s, a, false); +} + +static bool trans_EORTB(DisasContext *s, arg_rrr_esz *a) +{ + return do_eor_tb(s, a, true); +} + static bool do_trans_pmull(DisasContext *s, arg_rrr_esz *a, bool sel) { static gen_helper_gvec_3 * const fns[4] = { From patchwork Tue May 25 01:02:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277443 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0DEC0C2B9F7 for ; Tue, 25 May 2021 01:19:09 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4BD90613CC for ; Tue, 25 May 2021 01:19:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4BD90613CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:44534 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLit-0001qQ-91 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:19:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53854) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUj-0001ny-Qa for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:31 -0400 Received: from mail-pf1-x431.google.com ([2607:f8b0:4864:20::431]:39532) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUR-0001kO-Fv for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:29 -0400 Received: by mail-pf1-x431.google.com with SMTP id c17so22279487pfn.6 for ; Mon, 24 May 2021 18:04:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=YllsSn3S6mr37hNRRlPTvuttEfJ5beXx1HU7uDK21Ag=; b=Rp+hciFRaKWsSpihlN7vSuYBp+e2uaKMx9Wb4/5fi/4StlwzMOiFDrpol72FK/6BiJ E+5htJonrf8/S14vVnKcNj6lSD27CiNjY5g2bYi1kh6QHRjy+ZbJEHbpwVo7LuCAW0T3 VS7Dx8lBjhUBDam5G5SADAxIKfzsJHGEotRV4cx2yst8CAwIu8e5HpmrfS8FsQ9uG/2C xk2OYzRVEqOxgQCdvI2zBpQZGGQlaqLfdYipY+qfyyU9apUOHZkWqA05bZ2zixnqN15V zZRfDA6pJ4KRoSI4/hsDCLSc90QHZmRCj66y9EXgIle8HAxQBGi14NAdtYFt2GbJ+PPs bAgQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=YllsSn3S6mr37hNRRlPTvuttEfJ5beXx1HU7uDK21Ag=; b=dkUqf4j5mnW8he2BGkSl+KwNDIA5HZtJcgHQKSYWJZNn90hI2EiVgWEZzRmz5wx3Kt 0PzgeXqurP24JG5ZEYbz1kHzUVN8M8cfWuWY56RkEhh2ubauDUaw0umkkkaWkJpWuTHk s0TssiwvWo7ic3OvaaA41EJ/nlYtK3L+pott0ZgBHpo0gUWet/2aRmrFWUraO3mqkWjf bBuFZp8nVK0q8LHNggSz7/FZqIa7yL79lcqtBV+FdRI9PFVWvOb03+xyXQh3oYqAjW8x JAmG/DcBTk4//CMSwfQ7FUasEdOhMlNWvhL+RqVIF/jP3tLRTviqbnpn7ETkofm5ACrT /6tQ== X-Gm-Message-State: AOAM530NS6pLLNjIO+559tboWGffUCPq/OClmITVamCTqxsvcJ2zEd9e BgOevaxitRFzziT+gL709BUKTVrryw0F2w== X-Google-Smtp-Source: ABdhPJznAqukCIrb4DqvPijJXGc1BRl4NwsD7Mke2Ki/FJqttdx5w0BJr0EFr2pJP93fEYnBecJvFg== X-Received: by 2002:a63:585:: with SMTP id 127mr16172839pgf.322.1621904650039; Mon, 24 May 2021 18:04:10 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:09 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 17/92] target/arm: Implement SVE2 bitwise permute Date: Mon, 24 May 2021 18:02:43 -0700 Message-Id: <20210525010358.152808-18-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::431; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x431.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/cpu.h | 5 +++ target/arm/helper-sve.h | 15 ++++++++ target/arm/sve.decode | 6 ++++ target/arm/sve_helper.c | 73 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 36 +++++++++++++++++++ 5 files changed, 135 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 902579d24b..ae787fac8a 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -4241,6 +4241,11 @@ static inline bool isar_feature_aa64_sve2_pmull128(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, AES) >= 2; } +static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0; +} + /* * Feature tests for "does this exist in either 32-bit or 64-bit?" */ diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index f65818da05..4861481fe0 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2377,3 +2377,18 @@ DEF_HELPER_FLAGS_4(sve2_eoril_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_eoril_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_eoril_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_eoril_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_bext_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bext_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bext_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bext_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_bdep_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bdep_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bdep_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bdep_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_bgrp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bgrp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bgrp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bgrp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0922a44829..7cb89a0d47 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1220,3 +1220,9 @@ USHLLT 01000101 .. 0 ..... 1010 11 ..... ..... @rd_rn_tszimm_shl EORBT 01000101 .. 0 ..... 10010 0 ..... ..... @rd_rn_rm EORTB 01000101 .. 0 ..... 10010 1 ..... ..... @rd_rn_rm + +## SVE2 bitwise permute + +BEXT 01000101 .. 0 ..... 1011 00 ..... ..... @rd_rn_rm +BDEP 01000101 .. 0 ..... 1011 01 ..... ..... @rd_rn_rm +BGRP 01000101 .. 0 ..... 1011 10 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 1af6dfde8e..3cb256e4a5 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1246,6 +1246,79 @@ DO_ZZZ_NTB(sve2_eoril_d, uint64_t, , DO_EOR) #undef DO_ZZZ_NTB +#define DO_BITPERM(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = *(TYPE *)(vn + i); \ + TYPE mm = *(TYPE *)(vm + i); \ + *(TYPE *)(vd + i) = OP(nn, mm, sizeof(TYPE) * 8); \ + } \ +} + +static uint64_t bitextract(uint64_t data, uint64_t mask, int n) +{ + uint64_t res = 0; + int db, rb = 0; + + for (db = 0; db < n; ++db) { + if ((mask >> db) & 1) { + res |= ((data >> db) & 1) << rb; + ++rb; + } + } + return res; +} + +DO_BITPERM(sve2_bext_b, uint8_t, bitextract) +DO_BITPERM(sve2_bext_h, uint16_t, bitextract) +DO_BITPERM(sve2_bext_s, uint32_t, bitextract) +DO_BITPERM(sve2_bext_d, uint64_t, bitextract) + +static uint64_t bitdeposit(uint64_t data, uint64_t mask, int n) +{ + uint64_t res = 0; + int rb, db = 0; + + for (rb = 0; rb < n; ++rb) { + if ((mask >> rb) & 1) { + res |= ((data >> db) & 1) << rb; + ++db; + } + } + return res; +} + +DO_BITPERM(sve2_bdep_b, uint8_t, bitdeposit) +DO_BITPERM(sve2_bdep_h, uint16_t, bitdeposit) +DO_BITPERM(sve2_bdep_s, uint32_t, bitdeposit) +DO_BITPERM(sve2_bdep_d, uint64_t, bitdeposit) + +static uint64_t bitgroup(uint64_t data, uint64_t mask, int n) +{ + uint64_t resm = 0, resu = 0; + int db, rbm = 0, rbu = 0; + + for (db = 0; db < n; ++db) { + uint64_t val = (data >> db) & 1; + if ((mask >> db) & 1) { + resm |= val << rbm++; + } else { + resu |= val << rbu++; + } + } + + return resm | (resu << rbm); +} + +DO_BITPERM(sve2_bgrp_b, uint8_t, bitgroup) +DO_BITPERM(sve2_bgrp_h, uint16_t, bitgroup) +DO_BITPERM(sve2_bgrp_s, uint32_t, bitgroup) +DO_BITPERM(sve2_bgrp_d, uint64_t, bitgroup) + +#undef DO_BITPERM + #define DO_ZZI_SHLL(NAME, TYPEW, TYPEN, HW, HN) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index d2c1fafc5f..3ea42758fc 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6249,3 +6249,39 @@ static bool trans_USHLLT(DisasContext *s, arg_rri_esz *a) { return do_sve2_shll_tb(s, a, true, true); } + +static bool trans_BEXT(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_bext_b, gen_helper_sve2_bext_h, + gen_helper_sve2_bext_s, gen_helper_sve2_bext_d, + }; + if (!dc_isar_feature(aa64_sve2_bitperm, s)) { + return false; + } + return do_sve2_zzw_ool(s, a, fns[a->esz], 0); +} + +static bool trans_BDEP(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_bdep_b, gen_helper_sve2_bdep_h, + gen_helper_sve2_bdep_s, gen_helper_sve2_bdep_d, + }; + if (!dc_isar_feature(aa64_sve2_bitperm, s)) { + return false; + } + return do_sve2_zzw_ool(s, a, fns[a->esz], 0); +} + +static bool trans_BGRP(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_bgrp_b, gen_helper_sve2_bgrp_h, + gen_helper_sve2_bgrp_s, gen_helper_sve2_bgrp_d, + }; + if (!dc_isar_feature(aa64_sve2_bitperm, s)) { + return false; + } + return do_sve2_zzw_ool(s, a, fns[a->esz], 0); +} From patchwork Tue May 25 01:02:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277447 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2DD9C2B9F7 for ; Tue, 25 May 2021 01:19:24 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 58913613CC for ; Tue, 25 May 2021 01:19:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 58913613CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:45814 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLj9-0002kh-AK for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:19:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53892) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUm-0001rT-KW for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:33 -0400 Received: from mail-pj1-x1035.google.com ([2607:f8b0:4864:20::1035]:53049) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUS-0001mC-8U for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:32 -0400 Received: by mail-pj1-x1035.google.com with SMTP id q6so15860190pjj.2 for ; Mon, 24 May 2021 18:04:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=N7jVRHW3AUV0601YS+Dov38j2vXMjfTkQdrpVyl2dkA=; b=eMLKO3T9rKM4N6arzJQkV5pQaTPD/9QWfDw1Kf5w4ce//MukpRhAda+M7uMIcWZ939 eSPQ7aQlIPG/SxUl461xzwAPZJaMAi0yFSG88rX8Y2LidHD9U1pLUVQD0v1G9C8dsvhJ uhUFx8Rmdmdyr2YmSRUpm0szvgW002AwIkW1XDY5CAkplcA17USHSkEOyvy8zfHe4NY5 pyVMEAqu2ks+wa+c6/R1aBSZQQn6N6LfictYWcdxY+rIOtzaYYlO3wEaE73i9Psp4fQN EZLrWBeOi1+BXRR3u5x8Rj/Qyyj06kKQnqRwp5vbPbKxbup+vQ3C5Llp4mpPFP8A5s4w OyZg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=N7jVRHW3AUV0601YS+Dov38j2vXMjfTkQdrpVyl2dkA=; b=gtSPDJcFLpXPc9WlP2OW8LWL3u/iAS+SduUcj72yGw4UjYVJv7pB1iV9ne8vzQuYT/ D+ld5miIGPQLdEQrOENxbLV+rZFtEcSO1Kf607b14UM/IG+pgvZ4/l3/DKSrje72y0pk Jlk2XS873vzw0chh91Anazh+EjpS598tgNW+vzXqEuBG277svcN+kKrHJqZJmtK3UQwV 0yUPTcC19RFyg6YAq3oPoZNaos/XqRjO0DzfgLR4xUzgVyiHH4I2eGqyEX0s1vC+Xodx Ih/HgBRZ+OjNe3omsKXhuhs6c9nFpcF54vtVU0x/XOT0y7PmVrmFl2ocnOhugG5can8b AcEg== X-Gm-Message-State: AOAM531cjLVmSV3vzDL2Pxy/V+3J5i55bm5lBJ6xk1Pjv5TBc6fF46Rs 7nOdMIHnYo5ROtAwjENYZYVmlU0w+Bw6lA== X-Google-Smtp-Source: ABdhPJzTfrhv+fuAIYtkhlou7wlQEEmHTe1IDypP0CmDBhVI0KyjVlSvK6FFMce8Z5albLsKihlFYA== X-Received: by 2002:a17:902:b687:b029:eb:6491:b3f7 with SMTP id c7-20020a170902b687b02900eb6491b3f7mr28103186pls.38.1621904650772; Mon, 24 May 2021 18:04:10 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:10 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 18/92] target/arm: Implement SVE2 complex integer add Date: Mon, 24 May 2021 18:02:44 -0700 Message-Id: <20210525010358.152808-19-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1035; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1035.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v2: Fix subtraction ordering (laurent desnogues). --- target/arm/helper-sve.h | 10 +++++++++ target/arm/sve.decode | 9 ++++++++ target/arm/sve_helper.c | 42 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 31 ++++++++++++++++++++++++++++ 4 files changed, 92 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4861481fe0..c2155cc544 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2392,3 +2392,13 @@ DEF_HELPER_FLAGS_4(sve2_bgrp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_bgrp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_bgrp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_bgrp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_cadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_cadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_cadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_cadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqcadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqcadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqcadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqcadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7cb89a0d47..7508b901d0 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1226,3 +1226,12 @@ EORTB 01000101 .. 0 ..... 10010 1 ..... ..... @rd_rn_rm BEXT 01000101 .. 0 ..... 1011 00 ..... ..... @rd_rn_rm BDEP 01000101 .. 0 ..... 1011 01 ..... ..... @rd_rn_rm BGRP 01000101 .. 0 ..... 1011 10 ..... ..... @rd_rn_rm + +#### SVE2 Accumulate + +## SVE2 complex integer add + +CADD_rot90 01000101 .. 00000 0 11011 0 ..... ..... @rdn_rm +CADD_rot270 01000101 .. 00000 0 11011 1 ..... ..... @rdn_rm +SQCADD_rot90 01000101 .. 00000 1 11011 0 ..... ..... @rdn_rm +SQCADD_rot270 01000101 .. 00000 1 11011 1 ..... ..... @rdn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 3cb256e4a5..9015e68cb8 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1319,6 +1319,48 @@ DO_BITPERM(sve2_bgrp_d, uint64_t, bitgroup) #undef DO_BITPERM +#define DO_CADD(NAME, TYPE, H, ADD_OP, SUB_OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sub_r = simd_data(desc); \ + if (sub_r) { \ + for (i = 0; i < opr_sz; i += 2 * sizeof(TYPE)) { \ + TYPE acc_r = *(TYPE *)(vn + H(i)); \ + TYPE acc_i = *(TYPE *)(vn + H(i + sizeof(TYPE))); \ + TYPE el2_r = *(TYPE *)(vm + H(i)); \ + TYPE el2_i = *(TYPE *)(vm + H(i + sizeof(TYPE))); \ + acc_r = ADD_OP(acc_r, el2_i); \ + acc_i = SUB_OP(acc_i, el2_r); \ + *(TYPE *)(vd + H(i)) = acc_r; \ + *(TYPE *)(vd + H(i + sizeof(TYPE))) = acc_i; \ + } \ + } else { \ + for (i = 0; i < opr_sz; i += 2 * sizeof(TYPE)) { \ + TYPE acc_r = *(TYPE *)(vn + H(i)); \ + TYPE acc_i = *(TYPE *)(vn + H(i + sizeof(TYPE))); \ + TYPE el2_r = *(TYPE *)(vm + H(i)); \ + TYPE el2_i = *(TYPE *)(vm + H(i + sizeof(TYPE))); \ + acc_r = SUB_OP(acc_r, el2_i); \ + acc_i = ADD_OP(acc_i, el2_r); \ + *(TYPE *)(vd + H(i)) = acc_r; \ + *(TYPE *)(vd + H(i + sizeof(TYPE))) = acc_i; \ + } \ + } \ +} + +DO_CADD(sve2_cadd_b, int8_t, H1, DO_ADD, DO_SUB) +DO_CADD(sve2_cadd_h, int16_t, H1_2, DO_ADD, DO_SUB) +DO_CADD(sve2_cadd_s, int32_t, H1_4, DO_ADD, DO_SUB) +DO_CADD(sve2_cadd_d, int64_t, , DO_ADD, DO_SUB) + +DO_CADD(sve2_sqcadd_b, int8_t, H1, DO_SQADD_B, DO_SQSUB_B) +DO_CADD(sve2_sqcadd_h, int16_t, H1_2, DO_SQADD_H, DO_SQSUB_H) +DO_CADD(sve2_sqcadd_s, int32_t, H1_4, DO_SQADD_S, DO_SQSUB_S) +DO_CADD(sve2_sqcadd_d, int64_t, , do_sqadd_d, do_sqsub_d) + +#undef DO_CADD + #define DO_ZZI_SHLL(NAME, TYPEW, TYPEN, HW, HN) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3ea42758fc..27eb6f3233 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6285,3 +6285,34 @@ static bool trans_BGRP(DisasContext *s, arg_rrr_esz *a) } return do_sve2_zzw_ool(s, a, fns[a->esz], 0); } + +static bool do_cadd(DisasContext *s, arg_rrr_esz *a, bool sq, bool rot) +{ + static gen_helper_gvec_3 * const fns[2][4] = { + { gen_helper_sve2_cadd_b, gen_helper_sve2_cadd_h, + gen_helper_sve2_cadd_s, gen_helper_sve2_cadd_d }, + { gen_helper_sve2_sqcadd_b, gen_helper_sve2_sqcadd_h, + gen_helper_sve2_sqcadd_s, gen_helper_sve2_sqcadd_d }, + }; + return do_sve2_zzw_ool(s, a, fns[sq][a->esz], rot); +} + +static bool trans_CADD_rot90(DisasContext *s, arg_rrr_esz *a) +{ + return do_cadd(s, a, false, false); +} + +static bool trans_CADD_rot270(DisasContext *s, arg_rrr_esz *a) +{ + return do_cadd(s, a, false, true); +} + +static bool trans_SQCADD_rot90(DisasContext *s, arg_rrr_esz *a) +{ + return do_cadd(s, a, true, false); +} + +static bool trans_SQCADD_rot270(DisasContext *s, arg_rrr_esz *a) +{ + return do_cadd(s, a, true, true); +} From patchwork Tue May 25 01:02:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277461 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3105CC2B9F7 for ; Tue, 25 May 2021 01:25:53 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B7032613CC for ; Tue, 25 May 2021 01:25:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B7032613CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:36946 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLpP-0007Yc-R0 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:25:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53902) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUn-0001sc-RO for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:36 -0400 Received: from mail-pf1-x42c.google.com ([2607:f8b0:4864:20::42c]:36679) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUS-0001mL-RQ for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:33 -0400 Received: by mail-pf1-x42c.google.com with SMTP id c12so10680658pfl.3 for ; Mon, 24 May 2021 18:04:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=xQqdhPEFi//818PiFQlOdC031EC4wLZPJm67LCcXIzc=; b=lC/lh53Ad36WBX8wsv/kZ25lm3l+yJY2issjRnUDWBGixSV9ccPA+tLXZ3OkZsaP5m Fl1/vb2ytxl2SgRlP5Amcf33w0CfP4SBhyR//oANQ32BDuk6XWRaUz2FmEFHJrJCY7H8 nAUGxPkqNVHg7l0rEKrPMLIwoVTv9wR3RXOXNXIeNw90+HsQF1mpuTwcHWnXDl9y2gFf WwK0I8HuiJ2nXnD7qc9p03sJ9XDRgt2gxV16VheJhyONm6XRptf7VdWrQeNiuYY8/K5W 8am043idPOx2mHoySEgccG0RUQPYVQP3A0FCaBuSGQohLcu+QLRFG0GkIUY1mophQ+ZD efvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=xQqdhPEFi//818PiFQlOdC031EC4wLZPJm67LCcXIzc=; b=csziZXj9c1BuJzaqMjWQcHfp5POONjh0hnI7kYF2/3FpxzzlFzu7IxKMIAGVZFRfAg Uyc7JJvJBOXHz9CjjDY1cdv9nk1YP/b4SgweTKd5co/I9DchJrjzzWkltDQoh+XItRFf qJdAudSp/vQWA+wRT4PF8fgrKojRAeZN++pPmTyLdOazCexVSSPNm/b7yLGKIIP5ganI kuU2vo35OFU5TwLEyQCJyC8ky6q07t6uVbaEoEDuhAJdTomHjAR4nLOfYvIbuqJ7T4/l DGcainym8SvRpwTqrMH4vtbvqpdKjBRM3ERO2TsrsR2022vP72wCoAkH+af0VkIlRClV 3mlA== X-Gm-Message-State: AOAM531IdX3ZR6Bupz7cll25SNOKNRlEBkeL4lmo6OYdP7HVIOFpCKdl mXdUUOY2lg/9qTSKYAc+WjB22V+577uCiQ== X-Google-Smtp-Source: ABdhPJyqWxlPytOpYQafPgwbck+iVZ3eokbfWoQLGeCz2XBJOxlnUDwBkH/Ety+7qkhZlnV8c49xEA== X-Received: by 2002:a63:cf55:: with SMTP id b21mr16514965pgj.126.1621904651372; Mon, 24 May 2021 18:04:11 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:11 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 19/92] target/arm: Implement SVE2 integer absolute difference and accumulate long Date: Mon, 24 May 2021 18:02:45 -0700 Message-Id: <20210525010358.152808-20-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42c; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v2: Fix select offsetting and argument order (laurent desnogues). --- target/arm/helper-sve.h | 14 ++++++++++ target/arm/sve.decode | 12 +++++++++ target/arm/sve_helper.c | 23 ++++++++++++++++ target/arm/translate-sve.c | 55 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 104 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index c2155cc544..229fb396b2 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2402,3 +2402,17 @@ DEF_HELPER_FLAGS_4(sve2_sqcadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqcadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqcadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqcadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sabal_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sabal_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sabal_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uabal_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uabal_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uabal_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7508b901d0..56b7353bfa 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -70,6 +70,7 @@ &rpr_s rd pg rn s &rprr_s rd pg rn rm s &rprr_esz rd pg rn rm esz +&rrrr_esz rd ra rn rm esz &rprrr_esz rd pg rn rm ra esz &rpri_esz rd pg rn imm esz &ptrue rd esz pat s @@ -119,6 +120,10 @@ @rdn_i8s ........ esz:2 ...... ... imm:s8 rd:5 \ &rri_esz rn=%reg_movprfx +# Four operand, vector element size +@rda_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 \ + &rrrr_esz ra=%reg_movprfx + # Three operand with "memory" size, aka immediate left shift @rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri @@ -1235,3 +1240,10 @@ CADD_rot90 01000101 .. 00000 0 11011 0 ..... ..... @rdn_rm CADD_rot270 01000101 .. 00000 0 11011 1 ..... ..... @rdn_rm SQCADD_rot90 01000101 .. 00000 1 11011 0 ..... ..... @rdn_rm SQCADD_rot270 01000101 .. 00000 1 11011 1 ..... ..... @rdn_rm + +## SVE2 integer absolute difference and accumulate long + +SABALB 01000101 .. 0 ..... 1100 00 ..... ..... @rda_rn_rm +SABALT 01000101 .. 0 ..... 1100 01 ..... ..... @rda_rn_rm +UABALB 01000101 .. 0 ..... 1100 10 ..... ..... @rda_rn_rm +UABALT 01000101 .. 0 ..... 1100 11 ..... ..... @rda_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 9015e68cb8..5d084a1164 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1246,6 +1246,29 @@ DO_ZZZ_NTB(sve2_eoril_d, uint64_t, , DO_EOR) #undef DO_ZZZ_NTB +#define DO_ZZZW_ACC(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + intptr_t sel1 = simd_data(desc) * sizeof(TYPEN); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEN *)(vn + HN(i + sel1)); \ + TYPEW mm = *(TYPEN *)(vm + HN(i + sel1)); \ + TYPEW aa = *(TYPEW *)(va + HW(i)); \ + *(TYPEW *)(vd + HW(i)) = OP(nn, mm) + aa; \ + } \ +} + +DO_ZZZW_ACC(sve2_sabal_h, int16_t, int8_t, H1_2, H1, DO_ABD) +DO_ZZZW_ACC(sve2_sabal_s, int32_t, int16_t, H1_4, H1_2, DO_ABD) +DO_ZZZW_ACC(sve2_sabal_d, int64_t, int32_t, , H1_4, DO_ABD) + +DO_ZZZW_ACC(sve2_uabal_h, uint16_t, uint8_t, H1_2, H1, DO_ABD) +DO_ZZZW_ACC(sve2_uabal_s, uint32_t, uint16_t, H1_4, H1_2, DO_ABD) +DO_ZZZW_ACC(sve2_uabal_d, uint64_t, uint32_t, , H1_4, DO_ABD) + +#undef DO_ZZZW_ACC + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 27eb6f3233..c41464ba22 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -163,6 +163,18 @@ static void gen_gvec_ool_zzz(DisasContext *s, gen_helper_gvec_3 *fn, vsz, vsz, data, fn); } +/* Invoke an out-of-line helper on 4 Zregs. */ +static void gen_gvec_ool_zzzz(DisasContext *s, gen_helper_gvec_4 *fn, + int rd, int rn, int rm, int ra, int data) +{ + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + vec_full_reg_offset(s, ra), + vsz, vsz, data, fn); +} + /* Invoke an out-of-line helper on 2 Zregs and a predicate. */ static void gen_gvec_ool_zzp(DisasContext *s, gen_helper_gvec_3 *fn, int rd, int rn, int pg, int data) @@ -6316,3 +6328,46 @@ static bool trans_SQCADD_rot270(DisasContext *s, arg_rrr_esz *a) { return do_cadd(s, a, true, true); } + +static bool do_sve2_zzzz_ool(DisasContext *s, arg_rrrr_esz *a, + gen_helper_gvec_4 *fn, int data) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, data); + } + return true; +} + +static bool do_abal(DisasContext *s, arg_rrrr_esz *a, bool uns, bool sel) +{ + static gen_helper_gvec_4 * const fns[2][4] = { + { NULL, gen_helper_sve2_sabal_h, + gen_helper_sve2_sabal_s, gen_helper_sve2_sabal_d }, + { NULL, gen_helper_sve2_uabal_h, + gen_helper_sve2_uabal_s, gen_helper_sve2_uabal_d }, + }; + return do_sve2_zzzz_ool(s, a, fns[uns][a->esz], sel); +} + +static bool trans_SABALB(DisasContext *s, arg_rrrr_esz *a) +{ + return do_abal(s, a, false, false); +} + +static bool trans_SABALT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_abal(s, a, false, true); +} + +static bool trans_UABALB(DisasContext *s, arg_rrrr_esz *a) +{ + return do_abal(s, a, true, false); +} + +static bool trans_UABALT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_abal(s, a, true, true); +} From patchwork Tue May 25 01:02:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277453 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A036BC2B9F7 for ; Tue, 25 May 2021 01:22:54 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3C81D613F9 for ; Tue, 25 May 2021 01:22:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3C81D613F9 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55514 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLmX-00011u-4U for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:22:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53904) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUn-0001se-Ry for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:36 -0400 Received: from mail-pj1-x1030.google.com ([2607:f8b0:4864:20::1030]:55840) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUT-0001mp-AX for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:33 -0400 Received: by mail-pj1-x1030.google.com with SMTP id kr9so7794900pjb.5 for ; Mon, 24 May 2021 18:04:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=SWFfFR452q1fUmRFJ3PaFSu6DwY4BPwjHHMIRk/NZbA=; b=eoEgxpve4A8zngVkaQB3BqKjCLq/mqS6llMMuYGnp5guFpLPbERF0f4lODQ5gZxUDO yOXSImBamWY36CwkUDixipcX3FwxXR9wiq07Id40SRxv02Ey7NnB4MfoqgJRtaEHMO9u hJ5/gq2L0aFxQD2/Hh7q65Mqe3S4HiwFemILsvoxjOXlfOgf1cFhH9qH8PgGT1ap2dQM fDNX/df2OHWnaGJibDhjkhiCpLM+hikV8GudqTluRY1V2aGJNGzd0cAakhQ64goEw7pv yLnGJhT89VeGrMGKPIA27MoOUEWhYzStQK+5HxNYoqb4mJ4IigdEBsXt18C2EOFSwoqV thRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=SWFfFR452q1fUmRFJ3PaFSu6DwY4BPwjHHMIRk/NZbA=; b=LkQ8PiUQ/Bg+ZsM4uUJWx8lrXBiZYFXMdiatI5XBceQBLZqXA6BNuON16UqVwn0M+z F0YtYy1tp2nfE29VcawOHN9Dk+SVprM/OE5cjZf0H0pRT4ga7WXi05RDtO3K2yC4WlCS 88JSx4z6saTcwpES8NAN1IXV0iZgZvDb+HsGY86y2VLAOqxvXzlawGE78oDgS/Ry1K2Z MRwjNjzSxN128LI3Zs35P85Fwv6vNoweoqMRiRtFrHZ41xO66LN7MtB9uO1FU9CMtAEQ qreJhfWwS/gUqYC/NqQ6XXPfAsIEbHuHA5Y3o4rjyrEzq2rvXUN0YH5U9OT1q1PlwCTW tF3Q== X-Gm-Message-State: AOAM532jQ8UiQoUg5M9udQBuyM82cKzbMVdKRjPyhLX1JQ4UXFqXWcRP Gu3cbdUCfhtFd0hZOd0ezctNPQ4VtOrD9g== X-Google-Smtp-Source: ABdhPJylkR1LuUMWIQHRM0txSGNERrXvzjumpAWJwForxsTxEC4StC+c/OnKjnlupKI+XnR58QC+Og== X-Received: by 2002:a17:90b:4d82:: with SMTP id oj2mr27258073pjb.61.1621904651960; Mon, 24 May 2021 18:04:11 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:11 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 20/92] target/arm: Implement SVE2 integer add/subtract long with carry Date: Mon, 24 May 2021 18:02:46 -0700 Message-Id: <20210525010358.152808-21-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1030; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1030.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v2: Fix sel indexing and argument order (laurent desnogues). --- target/arm/helper-sve.h | 3 +++ target/arm/sve.decode | 6 ++++++ target/arm/sve_helper.c | 34 ++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 23 +++++++++++++++++++++++ 4 files changed, 66 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 229fb396b2..4a62012850 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2416,3 +2416,6 @@ DEF_HELPER_FLAGS_5(sve2_uabal_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_uabal_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_adcl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_adcl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 56b7353bfa..79046d81e3 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1247,3 +1247,9 @@ SABALB 01000101 .. 0 ..... 1100 00 ..... ..... @rda_rn_rm SABALT 01000101 .. 0 ..... 1100 01 ..... ..... @rda_rn_rm UABALB 01000101 .. 0 ..... 1100 10 ..... ..... @rda_rn_rm UABALT 01000101 .. 0 ..... 1100 11 ..... ..... @rda_rn_rm + +## SVE2 integer add/subtract long with carry + +# ADC and SBC decoded via size in helper dispatch. +ADCLB 01000101 .. 0 ..... 11010 0 ..... ..... @rda_rn_rm +ADCLT 01000101 .. 0 ..... 11010 1 ..... ..... @rda_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 5d084a1164..b63d84eef4 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1269,6 +1269,40 @@ DO_ZZZW_ACC(sve2_uabal_d, uint64_t, uint32_t, , H1_4, DO_ABD) #undef DO_ZZZW_ACC +void HELPER(sve2_adcl_s)(void *vd, void *vn, void *vm, void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int sel = H4(extract32(desc, SIMD_DATA_SHIFT, 1)); + uint32_t inv = -extract32(desc, SIMD_DATA_SHIFT + 1, 1); + uint32_t *a = va, *n = vn; + uint64_t *d = vd, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + uint32_t e1 = a[2 * i + H4(0)]; + uint32_t e2 = n[2 * i + sel] ^ inv; + uint64_t c = extract64(m[i], 32, 1); + /* Compute and store the entire 33-bit result at once. */ + d[i] = c + e1 + e2; + } +} + +void HELPER(sve2_adcl_d)(void *vd, void *vn, void *vm, void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int sel = extract32(desc, SIMD_DATA_SHIFT, 1); + uint64_t inv = -(uint64_t)extract32(desc, SIMD_DATA_SHIFT + 1, 1); + uint64_t *d = vd, *a = va, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; i += 2) { + Int128 e1 = int128_make64(a[i]); + Int128 e2 = int128_make64(n[i + sel] ^ inv); + Int128 c = int128_make64(m[i + 1] & 1); + Int128 r = int128_add(int128_add(e1, e2), c); + d[i + 0] = int128_getlo(r); + d[i + 1] = int128_gethi(r); + } +} + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c41464ba22..cf4fa50ad2 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6371,3 +6371,26 @@ static bool trans_UABALT(DisasContext *s, arg_rrrr_esz *a) { return do_abal(s, a, true, true); } + +static bool do_adcl(DisasContext *s, arg_rrrr_esz *a, bool sel) +{ + static gen_helper_gvec_4 * const fns[2] = { + gen_helper_sve2_adcl_s, + gen_helper_sve2_adcl_d, + }; + /* + * Note that in this case the ESZ field encodes both size and sign. + * Split out 'subtract' into bit 1 of the data field for the helper. + */ + return do_sve2_zzzz_ool(s, a, fns[a->esz & 1], (a->esz & 2) | sel); +} + +static bool trans_ADCLB(DisasContext *s, arg_rrrr_esz *a) +{ + return do_adcl(s, a, false); +} + +static bool trans_ADCLT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_adcl(s, a, true); +} From patchwork Tue May 25 01:02:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277459 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24C7EC2B9F7 for ; Tue, 25 May 2021 01:25:44 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B5BF8613CC for ; Tue, 25 May 2021 01:25:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B5BF8613CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:36034 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLpG-0006vC-Nm for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:25:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53978) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUs-0001un-Ty for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:41 -0400 Received: from mail-pl1-x631.google.com ([2607:f8b0:4864:20::631]:45847) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUT-0001nU-Vu for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:38 -0400 Received: by mail-pl1-x631.google.com with SMTP id s4so13948131plg.12 for ; Mon, 24 May 2021 18:04:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=GDzHPQFAAp6LPiIxIvoRlgxRhbX+S5BFcnVxxG6PSRM=; b=jZx2RhZKK3RCYO4IipBfvR1L3pP83zinh2Y287RKOvG5oInrda1EhpgCZ9i3ClIBHf 3mDpbCpvDIj4jBvKl4RNYU62OSzzZl6gXuk6TrLtRC8ulSykSJtvmQj8zWLoyLCH0Nhp o9KfsjAt5iCP7LBxz8+Rcf3XnOPFY2gMBLz5dTr934PM5ZMBHjY0ashXN6yAynPZvD1V Zn5lamWptrkIntQoqbuYTc7UcYC2wfWbmgiG10snHLoPP7clF70mnCC/gdyMYO0Y2tTv 2qM9/i1EkOkqjqzpTZ1SplBtQs8ZmMxQWw6EhD0JDcHzOpPiSe2c5SobKzhg5ZxBNlzc JzIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=GDzHPQFAAp6LPiIxIvoRlgxRhbX+S5BFcnVxxG6PSRM=; b=k+5vBWVc3c9nmsj2q5JWnDtN+RLF+mBYDCOirhCy7nkQ+xR7vHSQ/50g4zsrqe3x9d 1x7z7hKxRyNXhSQX/saFzswZ/0pjbP5psnJJ1tx9DvoGN41vNcyjYgl8B8iCcbutRelM nKruU52DuvL94CCqO/GQAZLPLwPcAo+lgUBn5yv+DvUfLkmfR4q0trbgCJfZCp2pk2R1 z5Yc//iMluoSIWOAEJNuR7t9rcNgQr5Zj/EH/Pk5d937aGHpsDVh+SUkIS2mg88pOeIJ HG8w1f7NH5sdepv33RPJG6C3Gl4ZdvAbSZZmWdzBnuHnr4TKFBX2gssZlBDa+Qj6cbA/ yRCw== X-Gm-Message-State: AOAM530Iz8qHQPV3QN5HtX+r54kUFpWk8vEx4ju+853p0MidIbpjZML6 tcZeeKV554mP8VM66Wh5BXqp1UZkdVbahw== X-Google-Smtp-Source: ABdhPJx67nZW5cckjOwkTl0iFfp5dEmPjrkf+QDXzWtlIQH53P7Og1XIHoaAjd1H+qzVs2Iotuxdrg== X-Received: by 2002:a17:902:db01:b029:f6:4a13:1764 with SMTP id m1-20020a170902db01b02900f64a131764mr24201024plx.25.1621904652580; Mon, 24 May 2021 18:04:12 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:12 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 21/92] target/arm: Implement SVE2 bitwise shift right and accumulate Date: Mon, 24 May 2021 18:02:47 -0700 Message-Id: <20210525010358.152808-22-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::631; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x631.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/sve.decode | 8 ++++++++ target/arm/translate-sve.c | 34 ++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 79046d81e3..d3c4ec6dd1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1253,3 +1253,11 @@ UABALT 01000101 .. 0 ..... 1100 11 ..... ..... @rda_rn_rm # ADC and SBC decoded via size in helper dispatch. ADCLB 01000101 .. 0 ..... 11010 0 ..... ..... @rda_rn_rm ADCLT 01000101 .. 0 ..... 11010 1 ..... ..... @rda_rn_rm + +## SVE2 bitwise shift right and accumulate + +# TODO: Use @rda and %reg_movprfx here. +SSRA 01000101 .. 0 ..... 1110 00 ..... ..... @rd_rn_tszimm_shr +USRA 01000101 .. 0 ..... 1110 01 ..... ..... @rd_rn_tszimm_shr +SRSRA 01000101 .. 0 ..... 1110 10 ..... ..... @rd_rn_tszimm_shr +URSRA 01000101 .. 0 ..... 1110 11 ..... ..... @rd_rn_tszimm_shr diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index cf4fa50ad2..1f93b1e3fe 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6394,3 +6394,37 @@ static bool trans_ADCLT(DisasContext *s, arg_rrrr_esz *a) { return do_adcl(s, a, true); } + +static bool do_sve2_fn2i(DisasContext *s, arg_rri_esz *a, GVecGen2iFn *fn) +{ + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + unsigned rd_ofs = vec_full_reg_offset(s, a->rd); + unsigned rn_ofs = vec_full_reg_offset(s, a->rn); + fn(a->esz, rd_ofs, rn_ofs, a->imm, vsz, vsz); + } + return true; +} + +static bool trans_SSRA(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, gen_gvec_ssra); +} + +static bool trans_USRA(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, gen_gvec_usra); +} + +static bool trans_SRSRA(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, gen_gvec_srsra); +} + +static bool trans_URSRA(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, gen_gvec_ursra); +} From patchwork Tue May 25 01:02:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277455 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5519C47084 for ; Tue, 25 May 2021 01:23:01 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9D6BD613F6 for ; Tue, 25 May 2021 01:23:01 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9D6BD613F6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:56304 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLme-0001ZM-M4 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:23:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54012) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUx-0001ve-41 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:43 -0400 Received: from mail-pj1-x102d.google.com ([2607:f8b0:4864:20::102d]:34392) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUU-0001nz-JA for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:42 -0400 Received: by mail-pj1-x102d.google.com with SMTP id g6-20020a17090adac6b029015d1a9a6f1aso867066pjx.1 for ; Mon, 24 May 2021 18:04:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=er/w61vzB6jmCET9oe1687J8UZHT2fYzXceEdfUCDQw=; b=hP0h3mhtTW6R8lC94cJAnYv1oYj1EIRRiiiGD042XD6wKIrf53KYvecEJB+WXY/wiz kNEU6DPtw8P0lsQFKc5LvdtrpeZAKhRjZbcDn6B6RfTWvnel2oeYrmjo2nBTQhbkUXeL oaBTjUdNZQe1hfMbHMFdeDO4Hk9XStw8nNNP0pP38/ishPtrS3d/KspJNv5ylvSHQVb7 nD7/CVpBlIebtAadHR9w9txypwaWxzqdcYfC3vyVAaeCa/02pELIeZjNqW5lrmARiOKX XhnT29oM3yLGysSOnd1Sv/RF0eZMvnOq+3/tvG0Ei/ZFS/3Q0M7uGnkY4b3cDCFR66yj H7vA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=er/w61vzB6jmCET9oe1687J8UZHT2fYzXceEdfUCDQw=; b=bWmYjMB5EFokkFDL5mjwEXdBLwQUEP7N5hdQX0zL2rnIHV3Ae4zBS2jg2ddDAacHyF iQnjWaAz8UItVeYjMoKfaUo/QlTbBQG/kooT3wZvSi9vDEzn0ZOx6bsKeOI4e9K4uxQw sBE5cQ2f+PiLgwweD1F8pSHf8UcsNvDot8jSogBsGV5W89WUyJbi5tnLvCihuN4xv+tB 4AkdPzN6IhxihU0mJUY3N5gVYeg6FzjoQe36zQY9U0qIesFde31Kqbnh4dyKL84Fuwjc J9DE2ngAJaoSMlWYbR+w6OokTHP2URqrqIEtmploU/eUGrVo10X0qdE4zjHWyhe/Ik16 t3Zw== X-Gm-Message-State: AOAM532/yZUm1OBOqTJVtZIZBQfE6dnMxAiUmgHqMjp0+Xhg9rAboxPL /2Luz0oGYp/5sP3Sij3Dq3z4Q8dLPC3LtA== X-Google-Smtp-Source: ABdhPJykQIPU5/BZBULixGfub8DZy1hwnyR7bjsZx8gB21HbpYsshNmdyfc3itBPhfCLwwAlxI8iKw== X-Received: by 2002:a17:90a:ab13:: with SMTP id m19mr2080492pjq.124.1621904653186; Mon, 24 May 2021 18:04:13 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:12 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 22/92] target/arm: Implement SVE2 bitwise shift and insert Date: Mon, 24 May 2021 18:02:48 -0700 Message-Id: <20210525010358.152808-23-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102d; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/sve.decode | 5 +++++ target/arm/translate-sve.c | 10 ++++++++++ 2 files changed, 15 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index d3c4ec6dd1..695a16551e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1261,3 +1261,8 @@ SSRA 01000101 .. 0 ..... 1110 00 ..... ..... @rd_rn_tszimm_shr USRA 01000101 .. 0 ..... 1110 01 ..... ..... @rd_rn_tszimm_shr SRSRA 01000101 .. 0 ..... 1110 10 ..... ..... @rd_rn_tszimm_shr URSRA 01000101 .. 0 ..... 1110 11 ..... ..... @rd_rn_tszimm_shr + +## SVE2 bitwise shift and insert + +SRI 01000101 .. 0 ..... 11110 0 ..... ..... @rd_rn_tszimm_shr +SLI 01000101 .. 0 ..... 11110 1 ..... ..... @rd_rn_tszimm_shl diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 1f93b1e3fe..5e42ba350e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6428,3 +6428,13 @@ static bool trans_URSRA(DisasContext *s, arg_rri_esz *a) { return do_sve2_fn2i(s, a, gen_gvec_ursra); } + +static bool trans_SRI(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, gen_gvec_sri); +} + +static bool trans_SLI(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, gen_gvec_sli); +} From patchwork Tue May 25 01:02:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277475 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 848DDC2B9F7 for ; Tue, 25 May 2021 01:32:59 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 23D26613F9 for ; Tue, 25 May 2021 01:32:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 23D26613F9 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55106 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLwG-0002xN-Ip for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:32:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54044) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUz-0001xL-DH for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:46 -0400 Received: from mail-pj1-x1029.google.com ([2607:f8b0:4864:20::1029]:53890) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUV-0001ol-5X for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:45 -0400 Received: by mail-pj1-x1029.google.com with SMTP id ot16so13927520pjb.3 for ; Mon, 24 May 2021 18:04:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=tXQchyLhKN98Y/+BfztzDunnLVKLZyMFnTQ84i+EsqU=; b=QjNMI/n7NoJqLHruEFDaxJvZssMfiXw+TA4rI8WK1/lHaPB/8oJxPsDoJDuEk5QMMx u22G2z/66oryHOtIzwgVQEyejezE9vCvfGI3jJ90lFwTsnmmtUvMwT8hzB6GjYqT//KB F1qWXqv/Lwc9fmpQufFbpGqpc3ZyTdrO8ZS5V+o3RNjC4UYgsYvwswzQiejT34L9XIUX XJRT57qmnHMC+uI2TIPS0RdxjoXXcO1Ln1hIplm2e0dmIK2FSBIIxu+nk3NpIhwymqIg aAJsieSniFr7Ojo2ZO8iet37riJC7CQUDqIl1eiSWy6yEfMvJGca6wjwZ6fxt+ARc1C4 1sGw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=tXQchyLhKN98Y/+BfztzDunnLVKLZyMFnTQ84i+EsqU=; b=GU77q1VTMIT20Bn1tJyX9r/ux5BrsD6sE+4S7SKjynjdm3eHKZTakgeCUVGiGVEdiO qhVy/XdlY7Tdeef+vEdgCZSVNWUO8iQfFU6sf/jvDF05F2SaLot79/OMIaV0fix7n/9K LMyRtY5496kYFjr07UIDz6ODjSSwxnfMYbbOoocIUa0BsLcDU8UNbh2McGw1GLBkjzUI G+uGRlDj5K1y17QSxaw9AeOSPldBdLcM7wL2jWNB7CObacbWmuNizq5+A6bcDwDq8U4h c4WjLwdQaeJrTmNBnuh8wUri65E/KCN+z5W0y2ECC3rnVuNsN1AE4RH+VKu7GD2Va6H2 z59Q== X-Gm-Message-State: AOAM532bjMVLj2olR9O8FuqsYWkNlxmuz7hCRpcL7RhgD8RwuO420vUl IBGvr9JQrbZ/dCikrGUklQfiQC+EHMDDFw== X-Google-Smtp-Source: ABdhPJy4wmxQtZvqrougNEA6M6ljeIw/+E4zSyW4UNRQdNJcsZk07ZL9ptval4y6b+fa70XYbaEpSg== X-Received: by 2002:a17:902:d101:b029:f4:b38a:a10a with SMTP id w1-20020a170902d101b02900f4b38aa10amr28065273plw.46.1621904653820; Mon, 24 May 2021 18:04:13 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:13 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 23/92] target/arm: Implement SVE2 integer absolute difference and accumulate Date: Mon, 24 May 2021 18:02:49 -0700 Message-Id: <20210525010358.152808-24-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1029; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1029.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/sve.decode | 6 ++++++ target/arm/translate-sve.c | 21 +++++++++++++++++++++ 2 files changed, 27 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 695a16551e..32b15e4192 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1266,3 +1266,9 @@ URSRA 01000101 .. 0 ..... 1110 11 ..... ..... @rd_rn_tszimm_shr SRI 01000101 .. 0 ..... 11110 0 ..... ..... @rd_rn_tszimm_shr SLI 01000101 .. 0 ..... 11110 1 ..... ..... @rd_rn_tszimm_shl + +## SVE2 integer absolute difference and accumulate + +# TODO: Use @rda and %reg_movprfx here. +SABA 01000101 .. 0 ..... 11111 0 ..... ..... @rd_rn_rm +UABA 01000101 .. 0 ..... 11111 1 ..... ..... @rd_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 5e42ba350e..202107de98 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6438,3 +6438,24 @@ static bool trans_SLI(DisasContext *s, arg_rri_esz *a) { return do_sve2_fn2i(s, a, gen_gvec_sli); } + +static bool do_sve2_fn_zzz(DisasContext *s, arg_rrr_esz *a, GVecGen3Fn *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_fn_zzz(s, fn, a->esz, a->rd, a->rn, a->rm); + } + return true; +} + +static bool trans_SABA(DisasContext *s, arg_rrr_esz *a) +{ + return do_sve2_fn_zzz(s, a, gen_gvec_saba); +} + +static bool trans_UABA(DisasContext *s, arg_rrr_esz *a) +{ + return do_sve2_fn_zzz(s, a, gen_gvec_uaba); +} From patchwork Tue May 25 01:02:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277465 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0E4B6C2B9F8 for ; Tue, 25 May 2021 01:26:01 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 762B5613CC for ; Tue, 25 May 2021 01:26:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 762B5613CC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:37680 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLpX-00081m-Cq for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:25:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54024) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUy-0001wO-AE for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:44 -0400 Received: from mail-pj1-x102b.google.com ([2607:f8b0:4864:20::102b]:54944) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUV-0001pO-Rg for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:43 -0400 Received: by mail-pj1-x102b.google.com with SMTP id g24so15843417pji.4 for ; Mon, 24 May 2021 18:04:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ls/0SSv7EuSBugiK6GJfPb9owUda/xBhvp5T7eNiSFU=; b=XylYW7K70osSXGRFJtGOq5Do16pL3kBKJR+NfuaSyTcvE9GHBIPLmsFeIG/CUlL7fI a03+Rf4C7Gp6JdpSQhTRf1MN82zmXA1VKRKl5CbduMAmy+yKD+i4Nc2F0HDvjA3Tcqpm Xu/QwmxmvitvYRa9WAra3sP7qUdFlqPt1l23jFCG6fUl9ah6Mbyz/kzpsYBK6SD0KN1I kls4h6NlMuAZ9pV2tylpEX3kbo+IFuTv3GUY1lSVoqP0N4myMMFmeWLmrHncd1GCM6b5 pQ6WKRAa31TnGXzh2IZRi5v3pK/wjybTkXXl67MPA1DtfBviorsH86It/rprPHdR87bu 91yg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ls/0SSv7EuSBugiK6GJfPb9owUda/xBhvp5T7eNiSFU=; b=ZzHdRKjb56DISZkHLAnc+qg+8yVSNlPR4pYasnKdAVMS1QArFL9m09txb/OlxjBFCN 4CgpuTr/L6OHW3VWiGf68apOKGn/v10xO0TPbQccpeYbxsFrGgZFxtw/fANNIJXBLhe1 333p+sPg/QuWTn/wGNQalbmoZWD3KRZWFMd8M7CXyxKXzugNWeSJaAzguLPGv+v69Qwa +qEacnwRz9LbOwIxAHi3BtBO3NYYXWf9E7gzTqop4v4bxwJeGOa9rOQhJtBtQD6GMDqh jV4wsgQO8hNQ4nU5je9ASZEo0DBvDi5epg1dIImRkeoiCyGPXnMqBXljo8X6jNtlmcwG tk0w== X-Gm-Message-State: AOAM531TQ266vZOFyNnqJQSV04sUX2zj1/cB3VqF8USMPhH+/pf3szIy pyozCeOCIxrjC/sIrKBjiA9wSRhU7HXuKg== X-Google-Smtp-Source: ABdhPJzMNJcNsRna2R0gCBucrRtX5k7ZnQR2KZiyRH793+Bo+B5TnE2N0DJEIhI/w2wUZrwds0mvoQ== X-Received: by 2002:a17:902:dcce:b029:ef:339:fa6a with SMTP id t14-20020a170902dcceb02900ef0339fa6amr28092522pll.24.1621904654471; Mon, 24 May 2021 18:04:14 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:14 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 24/92] target/arm: Implement SVE2 saturating extract narrow Date: Mon, 24 May 2021 18:02:50 -0700 Message-Id: <20210525010358.152808-25-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102b; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 24 ++++ target/arm/sve.decode | 12 ++ target/arm/sve_helper.c | 56 +++++++++ target/arm/translate-sve.c | 238 +++++++++++++++++++++++++++++++++++++ 4 files changed, 330 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4a62012850..b302203ce8 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2419,3 +2419,27 @@ DEF_HELPER_FLAGS_5(sve2_uabal_d, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_adcl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_adcl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqxtnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_uqxtnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqxtnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqxtnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqxtunb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtunb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtunb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqxtnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_uqxtnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqxtnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqxtnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqxtunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 32b15e4192..19866ec4c6 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1272,3 +1272,15 @@ SLI 01000101 .. 0 ..... 11110 1 ..... ..... @rd_rn_tszimm_shl # TODO: Use @rda and %reg_movprfx here. SABA 01000101 .. 0 ..... 11111 0 ..... ..... @rd_rn_rm UABA 01000101 .. 0 ..... 11111 1 ..... ..... @rd_rn_rm + +#### SVE2 Narrowing + +## SVE2 saturating extract narrow + +# Bits 23, 18-16 are zero, limited in the translator via esz < 3 & imm == 0. +SQXTNB 01000101 .. 1 ..... 010 000 ..... ..... @rd_rn_tszimm_shl +SQXTNT 01000101 .. 1 ..... 010 001 ..... ..... @rd_rn_tszimm_shl +UQXTNB 01000101 .. 1 ..... 010 010 ..... ..... @rd_rn_tszimm_shl +UQXTNT 01000101 .. 1 ..... 010 011 ..... ..... @rd_rn_tszimm_shl +SQXTUNB 01000101 .. 1 ..... 010 100 ..... ..... @rd_rn_tszimm_shl +SQXTUNT 01000101 .. 1 ..... 010 101 ..... ..... @rd_rn_tszimm_shl diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b63d84eef4..1ca71e367d 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1269,6 +1269,62 @@ DO_ZZZW_ACC(sve2_uabal_d, uint64_t, uint32_t, , H1_4, DO_ABD) #undef DO_ZZZW_ACC +#define DO_XTNB(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = *(TYPE *)(vn + i); \ + nn = OP(nn) & MAKE_64BIT_MASK(0, sizeof(TYPE) * 4); \ + *(TYPE *)(vd + i) = nn; \ + } \ +} + +#define DO_XTNT(NAME, TYPE, TYPEN, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc), odd = H(sizeof(TYPEN)); \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = *(TYPE *)(vn + i); \ + *(TYPEN *)(vd + i + odd) = OP(nn); \ + } \ +} + +#define DO_SQXTN_H(n) do_sat_bhs(n, INT8_MIN, INT8_MAX) +#define DO_SQXTN_S(n) do_sat_bhs(n, INT16_MIN, INT16_MAX) +#define DO_SQXTN_D(n) do_sat_bhs(n, INT32_MIN, INT32_MAX) + +DO_XTNB(sve2_sqxtnb_h, int16_t, DO_SQXTN_H) +DO_XTNB(sve2_sqxtnb_s, int32_t, DO_SQXTN_S) +DO_XTNB(sve2_sqxtnb_d, int64_t, DO_SQXTN_D) + +DO_XTNT(sve2_sqxtnt_h, int16_t, int8_t, H1, DO_SQXTN_H) +DO_XTNT(sve2_sqxtnt_s, int32_t, int16_t, H1_2, DO_SQXTN_S) +DO_XTNT(sve2_sqxtnt_d, int64_t, int32_t, H1_4, DO_SQXTN_D) + +#define DO_UQXTN_H(n) do_sat_bhs(n, 0, UINT8_MAX) +#define DO_UQXTN_S(n) do_sat_bhs(n, 0, UINT16_MAX) +#define DO_UQXTN_D(n) do_sat_bhs(n, 0, UINT32_MAX) + +DO_XTNB(sve2_uqxtnb_h, uint16_t, DO_UQXTN_H) +DO_XTNB(sve2_uqxtnb_s, uint32_t, DO_UQXTN_S) +DO_XTNB(sve2_uqxtnb_d, uint64_t, DO_UQXTN_D) + +DO_XTNT(sve2_uqxtnt_h, uint16_t, uint8_t, H1, DO_UQXTN_H) +DO_XTNT(sve2_uqxtnt_s, uint32_t, uint16_t, H1_2, DO_UQXTN_S) +DO_XTNT(sve2_uqxtnt_d, uint64_t, uint32_t, H1_4, DO_UQXTN_D) + +DO_XTNB(sve2_sqxtunb_h, int16_t, DO_UQXTN_H) +DO_XTNB(sve2_sqxtunb_s, int32_t, DO_UQXTN_S) +DO_XTNB(sve2_sqxtunb_d, int64_t, DO_UQXTN_D) + +DO_XTNT(sve2_sqxtunt_h, int16_t, int8_t, H1, DO_UQXTN_H) +DO_XTNT(sve2_sqxtunt_s, int32_t, int16_t, H1_2, DO_UQXTN_S) +DO_XTNT(sve2_sqxtunt_d, int64_t, int32_t, H1_4, DO_UQXTN_D) + +#undef DO_XTNB +#undef DO_XTNT + void HELPER(sve2_adcl_s)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 202107de98..c77df3dbeb 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6459,3 +6459,241 @@ static bool trans_UABA(DisasContext *s, arg_rrr_esz *a) { return do_sve2_fn_zzz(s, a, gen_gvec_uaba); } + +static bool do_sve2_narrow_extract(DisasContext *s, arg_rri_esz *a, + const GVecGen2 ops[3]) +{ + if (a->esz < 0 || a->esz > MO_32 || a->imm != 0 || + !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, &ops[a->esz]); + } + return true; +} + +static const TCGOpcode sqxtn_list[] = { + INDEX_op_shli_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0 +}; + +static void gen_sqxtnb_vec(unsigned vece, TCGv_vec d, TCGv_vec n) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t mask = (1ull << halfbits) - 1; + int64_t min = -1ull << (halfbits - 1); + int64_t max = -min - 1; + + tcg_gen_dupi_vec(vece, t, min); + tcg_gen_smax_vec(vece, d, n, t); + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_smin_vec(vece, d, d, t); + tcg_gen_dupi_vec(vece, t, mask); + tcg_gen_and_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +static bool trans_SQXTNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2 ops[3] = { + { .fniv = gen_sqxtnb_vec, + .opt_opc = sqxtn_list, + .fno = gen_helper_sve2_sqxtnb_h, + .vece = MO_16 }, + { .fniv = gen_sqxtnb_vec, + .opt_opc = sqxtn_list, + .fno = gen_helper_sve2_sqxtnb_s, + .vece = MO_32 }, + { .fniv = gen_sqxtnb_vec, + .opt_opc = sqxtn_list, + .fno = gen_helper_sve2_sqxtnb_d, + .vece = MO_64 }, + }; + return do_sve2_narrow_extract(s, a, ops); +} + +static void gen_sqxtnt_vec(unsigned vece, TCGv_vec d, TCGv_vec n) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t mask = (1ull << halfbits) - 1; + int64_t min = -1ull << (halfbits - 1); + int64_t max = -min - 1; + + tcg_gen_dupi_vec(vece, t, min); + tcg_gen_smax_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_smin_vec(vece, n, n, t); + tcg_gen_shli_vec(vece, n, n, halfbits); + tcg_gen_dupi_vec(vece, t, mask); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_SQXTNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2 ops[3] = { + { .fniv = gen_sqxtnt_vec, + .opt_opc = sqxtn_list, + .load_dest = true, + .fno = gen_helper_sve2_sqxtnt_h, + .vece = MO_16 }, + { .fniv = gen_sqxtnt_vec, + .opt_opc = sqxtn_list, + .load_dest = true, + .fno = gen_helper_sve2_sqxtnt_s, + .vece = MO_32 }, + { .fniv = gen_sqxtnt_vec, + .opt_opc = sqxtn_list, + .load_dest = true, + .fno = gen_helper_sve2_sqxtnt_d, + .vece = MO_64 }, + }; + return do_sve2_narrow_extract(s, a, ops); +} + +static const TCGOpcode uqxtn_list[] = { + INDEX_op_shli_vec, INDEX_op_umin_vec, 0 +}; + +static void gen_uqxtnb_vec(unsigned vece, TCGv_vec d, TCGv_vec n) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t max = (1ull << halfbits) - 1; + + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_umin_vec(vece, d, n, t); + tcg_temp_free_vec(t); +} + +static bool trans_UQXTNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2 ops[3] = { + { .fniv = gen_uqxtnb_vec, + .opt_opc = uqxtn_list, + .fno = gen_helper_sve2_uqxtnb_h, + .vece = MO_16 }, + { .fniv = gen_uqxtnb_vec, + .opt_opc = uqxtn_list, + .fno = gen_helper_sve2_uqxtnb_s, + .vece = MO_32 }, + { .fniv = gen_uqxtnb_vec, + .opt_opc = uqxtn_list, + .fno = gen_helper_sve2_uqxtnb_d, + .vece = MO_64 }, + }; + return do_sve2_narrow_extract(s, a, ops); +} + +static void gen_uqxtnt_vec(unsigned vece, TCGv_vec d, TCGv_vec n) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t max = (1ull << halfbits) - 1; + + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_umin_vec(vece, n, n, t); + tcg_gen_shli_vec(vece, n, n, halfbits); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_UQXTNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2 ops[3] = { + { .fniv = gen_uqxtnt_vec, + .opt_opc = uqxtn_list, + .load_dest = true, + .fno = gen_helper_sve2_uqxtnt_h, + .vece = MO_16 }, + { .fniv = gen_uqxtnt_vec, + .opt_opc = uqxtn_list, + .load_dest = true, + .fno = gen_helper_sve2_uqxtnt_s, + .vece = MO_32 }, + { .fniv = gen_uqxtnt_vec, + .opt_opc = uqxtn_list, + .load_dest = true, + .fno = gen_helper_sve2_uqxtnt_d, + .vece = MO_64 }, + }; + return do_sve2_narrow_extract(s, a, ops); +} + +static const TCGOpcode sqxtun_list[] = { + INDEX_op_shli_vec, INDEX_op_umin_vec, INDEX_op_smax_vec, 0 +}; + +static void gen_sqxtunb_vec(unsigned vece, TCGv_vec d, TCGv_vec n) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t max = (1ull << halfbits) - 1; + + tcg_gen_dupi_vec(vece, t, 0); + tcg_gen_smax_vec(vece, d, n, t); + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_umin_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +static bool trans_SQXTUNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2 ops[3] = { + { .fniv = gen_sqxtunb_vec, + .opt_opc = sqxtun_list, + .fno = gen_helper_sve2_sqxtunb_h, + .vece = MO_16 }, + { .fniv = gen_sqxtunb_vec, + .opt_opc = sqxtun_list, + .fno = gen_helper_sve2_sqxtunb_s, + .vece = MO_32 }, + { .fniv = gen_sqxtunb_vec, + .opt_opc = sqxtun_list, + .fno = gen_helper_sve2_sqxtunb_d, + .vece = MO_64 }, + }; + return do_sve2_narrow_extract(s, a, ops); +} + +static void gen_sqxtunt_vec(unsigned vece, TCGv_vec d, TCGv_vec n) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t max = (1ull << halfbits) - 1; + + tcg_gen_dupi_vec(vece, t, 0); + tcg_gen_smax_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_umin_vec(vece, n, n, t); + tcg_gen_shli_vec(vece, n, n, halfbits); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_SQXTUNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2 ops[3] = { + { .fniv = gen_sqxtunt_vec, + .opt_opc = sqxtun_list, + .load_dest = true, + .fno = gen_helper_sve2_sqxtunt_h, + .vece = MO_16 }, + { .fniv = gen_sqxtunt_vec, + .opt_opc = sqxtun_list, + .load_dest = true, + .fno = gen_helper_sve2_sqxtunt_s, + .vece = MO_32 }, + { .fniv = gen_sqxtunt_vec, + .opt_opc = sqxtun_list, + .load_dest = true, + .fno = gen_helper_sve2_sqxtunt_d, + .vece = MO_64 }, + }; + return do_sve2_narrow_extract(s, a, ops); +} From patchwork Tue May 25 01:02:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277469 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 048D4C2B9F8 for ; Tue, 25 May 2021 01:28:39 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7DF9F61403 for ; Tue, 25 May 2021 01:28:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7DF9F61403 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:46470 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLs5-0005RJ-L1 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:28:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54042) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLUz-0001xJ-Bn for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:46 -0400 Received: from mail-pj1-x1031.google.com ([2607:f8b0:4864:20::1031]:45982) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUW-0001pW-Bh for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:45 -0400 Received: by mail-pj1-x1031.google.com with SMTP id ne24-20020a17090b3758b029015f2dafecb0so11007146pjb.4 for ; Mon, 24 May 2021 18:04:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=9qiMCqX9xYtsQZuR9hfA+EDWfL1lJ5uuA+y3bAtpIkc=; b=a48WyMM6IqoeluhOQ2pjrzeok8lRmWavdHsOPE5Z1GisTiPIGdlXyT3ikr9oFcOhnY Qw+NsMBHY+LIXiLo+IF+Yk2JlSf5nLNwIQ2WOkS631cFM1Zc9c+SlOQCRP6ipeUJ7TPx kxndSJV1c/r9dnjp0H+g/mOckY62Pmb3PfqgNLmh9+i6PGfrn7UMOhpbUVwu660n6JmZ dOlnu5Q9ujbOdVj2ZeSlmrFBsm42m5YRcoPX6cpslclRP/kLiREXQa+gSUa0p18v2L3L XGExIjTqnl2z401c4ASOWXsLVQNCWNMuougprRp+WfLl7RfKrPxpXPjnsfoJSWOYo7Af uQVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=9qiMCqX9xYtsQZuR9hfA+EDWfL1lJ5uuA+y3bAtpIkc=; b=YC/fMKDeQ4KOz2n5Yn953vAVjEDfFb4UIxqF4sbWRb0JmegVY87jMuelIPBYOOfdop RVHxsBxKTujlDZV/fJHXMjBLodOgkVevvWFfL4/WXiKOPloWRfcsOrfDm+QganvfieVH b15dpEXYOqEpfJwNboaP8Pq8+ygU9VGhI+FZ1wZ85t6IucHzDU9HFPZgobzKYS6lJtef dQL2v7uGYR3W7AawXnoePq6S/Q6k+dRq45kRRfM57aK/QpTqkvgnJHTGZCrxNPBS0SdA oOR4KD7gVqQTmtZeH2SLg10tt5YvjYX3JaKzGQizZ7y1wCxd8uRnZzxGOl6p5IDUxvoc UUWQ== X-Gm-Message-State: AOAM5330VBRqa6ZLo/CvSNP0MoF+anfX08zkN0qah6+KlwsPWDMxHFZH n3S2jw94L4bS5y1dYDcr9iEAkfgHQtdlGQ== X-Google-Smtp-Source: ABdhPJymPrTUgDm7DXsMzwdrrC1LVKFIvLUFBM1EhH33vNV/9L1yny5PfMfAuMLmiIwxF2qKbkw0eQ== X-Received: by 2002:a17:90b:3b92:: with SMTP id pc18mr27580607pjb.218.1621904655066; Mon, 24 May 2021 18:04:15 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:14 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 25/92] target/arm: Implement SVE2 floating-point pairwise Date: Mon, 24 May 2021 18:02:51 -0700 Message-Id: <20210525010358.152808-26-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1031; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1031.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Reviewed-by: Richard Henderson Signed-off-by: Richard Henderson --- v2: Load all inputs before writing any output (laurent desnogues) --- target/arm/helper-sve.h | 35 +++++++++++++++++++++++++++++ target/arm/sve.decode | 8 +++++++ target/arm/sve_helper.c | 46 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 25 +++++++++++++++++++++ 4 files changed, 114 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b302203ce8..a033b5f6b2 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2443,3 +2443,38 @@ DEF_HELPER_FLAGS_3(sve2_uqxtnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqxtunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqxtunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqxtunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve2_fmaxnmp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fmaxnmp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fmaxnmp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve2_fminnmp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fminnmp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fminnmp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve2_fmaxp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fmaxp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fmaxp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 19866ec4c6..9c75ac94c0 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1284,3 +1284,11 @@ UQXTNB 01000101 .. 1 ..... 010 010 ..... ..... @rd_rn_tszimm_shl UQXTNT 01000101 .. 1 ..... 010 011 ..... ..... @rd_rn_tszimm_shl SQXTUNB 01000101 .. 1 ..... 010 100 ..... ..... @rd_rn_tszimm_shl SQXTUNT 01000101 .. 1 ..... 010 101 ..... ..... @rd_rn_tszimm_shl + +## SVE2 floating-point pairwise operations + +FADDP 01100100 .. 010 00 0 100 ... ..... ..... @rdn_pg_rm +FMAXNMP 01100100 .. 010 10 0 100 ... ..... ..... @rdn_pg_rm +FMINNMP 01100100 .. 010 10 1 100 ... ..... ..... @rdn_pg_rm +FMAXP 01100100 .. 010 11 0 100 ... ..... ..... @rdn_pg_rm +FMINP 01100100 .. 010 11 1 100 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 1ca71e367d..16604a424f 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -895,6 +895,52 @@ DO_ZPZZ_PAIR_D(sve2_sminp_zpzz_d, int64_t, DO_MIN) #undef DO_ZPZZ_PAIR #undef DO_ZPZZ_PAIR_D +#define DO_ZPZZ_PAIR_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPE n0 = *(TYPE *)(vn + H(i)); \ + TYPE m0 = *(TYPE *)(vm + H(i)); \ + TYPE n1 = *(TYPE *)(vn + H(i + sizeof(TYPE))); \ + TYPE m1 = *(TYPE *)(vm + H(i + sizeof(TYPE))); \ + if (pg & 1) { \ + *(TYPE *)(vd + H(i)) = OP(n0, n1, status); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + if (pg & 1) { \ + *(TYPE *)(vd + H(i)) = OP(m0, m1, status); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + } while (i & 15); \ + } \ +} + +DO_ZPZZ_PAIR_FP(sve2_faddp_zpzz_h, float16, H1_2, float16_add) +DO_ZPZZ_PAIR_FP(sve2_faddp_zpzz_s, float32, H1_4, float32_add) +DO_ZPZZ_PAIR_FP(sve2_faddp_zpzz_d, float64, , float64_add) + +DO_ZPZZ_PAIR_FP(sve2_fmaxnmp_zpzz_h, float16, H1_2, float16_maxnum) +DO_ZPZZ_PAIR_FP(sve2_fmaxnmp_zpzz_s, float32, H1_4, float32_maxnum) +DO_ZPZZ_PAIR_FP(sve2_fmaxnmp_zpzz_d, float64, , float64_maxnum) + +DO_ZPZZ_PAIR_FP(sve2_fminnmp_zpzz_h, float16, H1_2, float16_minnum) +DO_ZPZZ_PAIR_FP(sve2_fminnmp_zpzz_s, float32, H1_4, float32_minnum) +DO_ZPZZ_PAIR_FP(sve2_fminnmp_zpzz_d, float64, , float64_minnum) + +DO_ZPZZ_PAIR_FP(sve2_fmaxp_zpzz_h, float16, H1_2, float16_max) +DO_ZPZZ_PAIR_FP(sve2_fmaxp_zpzz_s, float32, H1_4, float32_max) +DO_ZPZZ_PAIR_FP(sve2_fmaxp_zpzz_d, float64, , float64_max) + +DO_ZPZZ_PAIR_FP(sve2_fminp_zpzz_h, float16, H1_2, float16_min) +DO_ZPZZ_PAIR_FP(sve2_fminp_zpzz_s, float32, H1_4, float32_min) +DO_ZPZZ_PAIR_FP(sve2_fminp_zpzz_d, float64, , float64_min) + +#undef DO_ZPZZ_PAIR_FP + /* Three-operand expander, controlled by a predicate, in which the * third operand is "wide". That is, for D = N op M, the same 64-bit * value of M is used with all of the narrower values of N. diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c77df3dbeb..faf94b304a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6697,3 +6697,28 @@ static bool trans_SQXTUNT(DisasContext *s, arg_rri_esz *a) }; return do_sve2_narrow_extract(s, a, ops); } + +static bool do_sve2_zpzz_fp(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4_ptr *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzz_fp(s, a, fn); +} + +#define DO_SVE2_ZPZZ_FP(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a) \ +{ \ + static gen_helper_gvec_4_ptr * const fns[4] = { \ + NULL, gen_helper_sve2_##name##_zpzz_h, \ + gen_helper_sve2_##name##_zpzz_s, gen_helper_sve2_##name##_zpzz_d \ + }; \ + return do_sve2_zpzz_fp(s, a, fns[a->esz]); \ +} + +DO_SVE2_ZPZZ_FP(FADDP, faddp) +DO_SVE2_ZPZZ_FP(FMAXNMP, fmaxnmp) +DO_SVE2_ZPZZ_FP(FMINNMP, fminnmp) +DO_SVE2_ZPZZ_FP(FMAXP, fmaxp) +DO_SVE2_ZPZZ_FP(FMINP, fminp) From patchwork Tue May 25 01:02:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277483 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 36C8EC2B9F7 for ; Tue, 25 May 2021 01:35:27 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C5F6361401 for ; Tue, 25 May 2021 01:35:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C5F6361401 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:35610 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLyf-0000F2-Sn for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:35:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54092) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLV2-00023o-QG for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:48 -0400 Received: from mail-pf1-x429.google.com ([2607:f8b0:4864:20::429]:42604) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUX-0001pm-5L for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:48 -0400 Received: by mail-pf1-x429.google.com with SMTP id x18so17968220pfi.9 for ; Mon, 24 May 2021 18:04:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=1ZgLGkKf891GG6QDr9mPY7zyf50wI69yX/zlHFqjxeo=; b=cmfXEopvQ3ub5yHLSb0yAkCvpSry4C1gy1K27cfyjNLwTaDkfm2fdTNoI485Pc0K4Y PVPnVNtKCw7bAXz4hrnjv1M41IM4NLiwqEw1KjcgQhg8vBsasvVwVn0zE87DtGufBmOB eQ1dv3U3g4wnCBAWoWH6YxFlDYpaDRJq/fJGibN0IAuR+TvqXA+SG/5LW+6bGiM+WZwz H4budhcB0S/TeGoPzM4ULkUqNnwy5HoEAvxMpRoystoMIwvZQhln0Y6bw6Dq5oKVXLo+ S2nHBiqOPTMt44LS9cRyIne8nDtKfG3eSegNMpO6i3hg0SKj6GwDNN2Hkd1TS8Jaj1z+ 3iLg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=1ZgLGkKf891GG6QDr9mPY7zyf50wI69yX/zlHFqjxeo=; b=iQwJxOkV4hox7IhFFz5w7hI+hfe0zNhsDkFZ0/kyfuLK31fKg4xN0deo0DhDrqBPjH t81XD45x40ZCW3YTy9BmlUnXrajsU/275yGHMiWYVRZ6EUliEAvj7AZtTa/qVlHJWk2l meBDRiNDBUMg4XKeh2Z3E9Y/R+Oih43jkRlNtIeAhY+T46UjQLRUrKi7sZ8N4awUWVqq hjvo4GbrdrP1WLQ0USB/t7qqa7kJ/ikuDVa7y9ke80R7NP/ZwKd7mfyX/Zr4MyAkXPYN LoWnbEOCqYIYPWybsm2oyaTa9pN7X2weNz4Te1d2y7dUATZJSzFujaa8lGJRJMw3uPHH fqIA== X-Gm-Message-State: AOAM533jxPlokflLj2nXOC1t6L5ycNllmNzz9O+6cmb3+p8qouwn+jsC p7Ygh05Dyh71GcFpwXT60ETqTjQodj/TFg== X-Google-Smtp-Source: ABdhPJwONbKCSHCmN++cxSoYYaFcSJaWIXCFoO3b/sMFC122HkBMbd3UcwGPmXv8jc62071EDDs+1g== X-Received: by 2002:aa7:982e:0:b029:2e4:eef5:e0c9 with SMTP id q14-20020aa7982e0000b02902e4eef5e0c9mr18991258pfl.3.1621904655603; Mon, 24 May 2021 18:04:15 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:15 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 26/92] target/arm: Implement SVE2 SHRN, RSHRN Date: Mon, 24 May 2021 18:02:52 -0700 Message-Id: <20210525010358.152808-27-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::429; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v2: Fix typo in gen_shrnb_vec (laurent desnogues) v3: Replace DO_RSHR with an inline function --- target/arm/helper-sve.h | 16 ++++ target/arm/sve.decode | 8 ++ target/arm/sve_helper.c | 54 ++++++++++++- target/arm/translate-sve.c | 160 +++++++++++++++++++++++++++++++++++++ 4 files changed, 236 insertions(+), 2 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a033b5f6b2..2b2ebea631 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2444,6 +2444,22 @@ DEF_HELPER_FLAGS_3(sve2_sqxtunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqxtunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqxtunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_shrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_shrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_shrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_shrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_shrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_shrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_rshrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_rshrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_rshrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_rshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_rshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_rshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 9c75ac94c0..169486ecb2 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1285,6 +1285,14 @@ UQXTNT 01000101 .. 1 ..... 010 011 ..... ..... @rd_rn_tszimm_shl SQXTUNB 01000101 .. 1 ..... 010 100 ..... ..... @rd_rn_tszimm_shl SQXTUNT 01000101 .. 1 ..... 010 101 ..... ..... @rd_rn_tszimm_shl +## SVE2 bitwise shift right narrow + +# Bit 23 == 0 is handled by esz > 0 in the translator. +SHRNB 01000101 .. 1 ..... 00 0100 ..... ..... @rd_rn_tszimm_shr +SHRNT 01000101 .. 1 ..... 00 0101 ..... ..... @rd_rn_tszimm_shr +RSHRNB 01000101 .. 1 ..... 00 0110 ..... ..... @rd_rn_tszimm_shr +RSHRNT 01000101 .. 1 ..... 00 0111 ..... ..... @rd_rn_tszimm_shr + ## SVE2 floating-point pairwise operations FADDP 01100100 .. 010 00 0 100 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 16604a424f..8fd61e37f9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1868,6 +1868,17 @@ void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \ when N is negative, add 2**M-1. */ #define DO_ASRD(N, M) ((N + (N < 0 ? ((__typeof(N))1 << M) - 1 : 0)) >> M) +static inline uint64_t do_urshr(uint64_t x, unsigned sh) +{ + if (likely(sh < 64)) { + return (x >> sh) + ((x >> (sh - 1)) & 1); + } else if (sh == 64) { + return x >> 63; + } else { + return 0; + } +} + DO_ZPZI(sve_asr_zpzi_b, int8_t, H1, DO_SHR) DO_ZPZI(sve_asr_zpzi_h, int16_t, H1_2, DO_SHR) DO_ZPZI(sve_asr_zpzi_s, int32_t, H1_4, DO_SHR) @@ -1888,12 +1899,51 @@ DO_ZPZI(sve_asrd_h, int16_t, H1_2, DO_ASRD) DO_ZPZI(sve_asrd_s, int32_t, H1_4, DO_ASRD) DO_ZPZI_D(sve_asrd_d, int64_t, DO_ASRD) -#undef DO_SHR -#undef DO_SHL #undef DO_ASRD #undef DO_ZPZI #undef DO_ZPZI_D +#define DO_SHRNB(NAME, TYPEW, TYPEN, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEW *)(vn + i); \ + *(TYPEW *)(vd + i) = (TYPEN)OP(nn, shift); \ + } \ +} + +#define DO_SHRNT(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEW *)(vn + HW(i)); \ + *(TYPEN *)(vd + HN(i + sizeof(TYPEN))) = OP(nn, shift); \ + } \ +} + +DO_SHRNB(sve2_shrnb_h, uint16_t, uint8_t, DO_SHR) +DO_SHRNB(sve2_shrnb_s, uint32_t, uint16_t, DO_SHR) +DO_SHRNB(sve2_shrnb_d, uint64_t, uint32_t, DO_SHR) + +DO_SHRNT(sve2_shrnt_h, uint16_t, uint8_t, H1_2, H1, DO_SHR) +DO_SHRNT(sve2_shrnt_s, uint32_t, uint16_t, H1_4, H1_2, DO_SHR) +DO_SHRNT(sve2_shrnt_d, uint64_t, uint32_t, , H1_4, DO_SHR) + +DO_SHRNB(sve2_rshrnb_h, uint16_t, uint8_t, do_urshr) +DO_SHRNB(sve2_rshrnb_s, uint32_t, uint16_t, do_urshr) +DO_SHRNB(sve2_rshrnb_d, uint64_t, uint32_t, do_urshr) + +DO_SHRNT(sve2_rshrnt_h, uint16_t, uint8_t, H1_2, H1, do_urshr) +DO_SHRNT(sve2_rshrnt_s, uint32_t, uint16_t, H1_4, H1_2, do_urshr) +DO_SHRNT(sve2_rshrnt_d, uint64_t, uint32_t, , H1_4, do_urshr) + +#undef DO_SHRNB +#undef DO_SHRNT + /* Fully general four-operand expander, controlled by a predicate. */ #define DO_ZPZZZ(NAME, TYPE, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index faf94b304a..e072f8a2cf 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6698,6 +6698,166 @@ static bool trans_SQXTUNT(DisasContext *s, arg_rri_esz *a) return do_sve2_narrow_extract(s, a, ops); } +static bool do_sve2_shr_narrow(DisasContext *s, arg_rri_esz *a, + const GVecGen2i ops[3]) +{ + if (a->esz < 0 || a->esz > MO_32 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + assert(a->imm > 0 && a->imm <= (8 << a->esz)); + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2i(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, a->imm, &ops[a->esz]); + } + return true; +} + +static void gen_shrnb_i64(unsigned vece, TCGv_i64 d, TCGv_i64 n, int shr) +{ + int halfbits = 4 << vece; + uint64_t mask = dup_const(vece, MAKE_64BIT_MASK(0, halfbits)); + + tcg_gen_shri_i64(d, n, shr); + tcg_gen_andi_i64(d, d, mask); +} + +static void gen_shrnb16_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr) +{ + gen_shrnb_i64(MO_16, d, n, shr); +} + +static void gen_shrnb32_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr) +{ + gen_shrnb_i64(MO_32, d, n, shr); +} + +static void gen_shrnb64_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr) +{ + gen_shrnb_i64(MO_64, d, n, shr); +} + +static void gen_shrnb_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + uint64_t mask = MAKE_64BIT_MASK(0, halfbits); + + tcg_gen_shri_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, mask); + tcg_gen_and_vec(vece, d, n, t); + tcg_temp_free_vec(t); +} + +static bool trans_SHRNB(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { INDEX_op_shri_vec, 0 }; + static const GVecGen2i ops[3] = { + { .fni8 = gen_shrnb16_i64, + .fniv = gen_shrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_shrnb_h, + .vece = MO_16 }, + { .fni8 = gen_shrnb32_i64, + .fniv = gen_shrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_shrnb_s, + .vece = MO_32 }, + { .fni8 = gen_shrnb64_i64, + .fniv = gen_shrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_shrnb_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static void gen_shrnt_i64(unsigned vece, TCGv_i64 d, TCGv_i64 n, int shr) +{ + int halfbits = 4 << vece; + uint64_t mask = dup_const(vece, MAKE_64BIT_MASK(0, halfbits)); + + tcg_gen_shli_i64(n, n, halfbits - shr); + tcg_gen_andi_i64(n, n, ~mask); + tcg_gen_andi_i64(d, d, mask); + tcg_gen_or_i64(d, d, n); +} + +static void gen_shrnt16_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr) +{ + gen_shrnt_i64(MO_16, d, n, shr); +} + +static void gen_shrnt32_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr) +{ + gen_shrnt_i64(MO_32, d, n, shr); +} + +static void gen_shrnt64_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr) +{ + tcg_gen_shri_i64(n, n, shr); + tcg_gen_deposit_i64(d, d, n, 32, 32); +} + +static void gen_shrnt_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + uint64_t mask = MAKE_64BIT_MASK(0, halfbits); + + tcg_gen_shli_vec(vece, n, n, halfbits - shr); + tcg_gen_dupi_vec(vece, t, mask); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_SHRNT(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { INDEX_op_shli_vec, 0 }; + static const GVecGen2i ops[3] = { + { .fni8 = gen_shrnt16_i64, + .fniv = gen_shrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_shrnt_h, + .vece = MO_16 }, + { .fni8 = gen_shrnt32_i64, + .fniv = gen_shrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_shrnt_s, + .vece = MO_32 }, + { .fni8 = gen_shrnt64_i64, + .fniv = gen_shrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_shrnt_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_RSHRNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_rshrnb_h }, + { .fno = gen_helper_sve2_rshrnb_s }, + { .fno = gen_helper_sve2_rshrnb_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_RSHRNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_rshrnt_h }, + { .fno = gen_helper_sve2_rshrnt_s }, + { .fno = gen_helper_sve2_rshrnt_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + static bool do_sve2_zpzz_fp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { From patchwork Tue May 25 01:02:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277463 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 490D8C2B9F7 for ; Tue, 25 May 2021 01:25:59 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B5399613F6 for ; Tue, 25 May 2021 01:25:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B5399613F6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:37578 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLpV-0007xj-Pl for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:25:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54074) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLV1-00020D-Ub for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:47 -0400 Received: from mail-pg1-x531.google.com ([2607:f8b0:4864:20::531]:34553) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUX-0001py-FH for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:47 -0400 Received: by mail-pg1-x531.google.com with SMTP id l70so21412748pga.1 for ; Mon, 24 May 2021 18:04:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/gvHZOzTj+xrZjcBXR64cEcKDu6e++bnkbmWseU6CSo=; b=CoJ7ofIyzW+xe7CmsIEOlEMaDl5ncxXEReQDcPZOLCcVoo6teXbZCEQ7iM2rAeprKc omZA+dMBVe2FNQmbsNkA2BHZBcywCGgTGXtOoiCx/sJnb5XEHA3juXkWMIfqV9Rhc2eT WJ5ouHlrzH7QuQ4uHksaeeT9bmz4NJH5ImfeB43nnU59t2ItrWd3g/wfOIRf+dKqrOPz ClRYtNHp25X2824HbVKPTgPobrSQ7/uf+d8KABdBGPPIZ8uf/3aE0eulxfvK2Jco5MMR 8wtq/tsqSExlTWitkRYbd5Iuf8jQDyvW5fE7puLKsEAlMhpcM3oTIqWkLCZ7xjW/8pZL jy4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/gvHZOzTj+xrZjcBXR64cEcKDu6e++bnkbmWseU6CSo=; b=H6XtSL+t/xITZYCXW/InzaaCj9WjfqLMGIy5KeswGmG4OB+xXhWkT9iH6DoKJYSGMp xDB3CFE0U/J3IAZaZKS1ko5HgGzRHkca4lwTkksRt2+GQ5vOHe+1KjisN+C4sE4Pov9H vwVN46yYY2HT+hmTxza0gvCMLFcu57xXiB53m6+J6u3xaJRe9S7w6CUe2bkz/MXQVl59 9tknXrAZU93NteUpTjjdiwaqEg3c6CsHEymtq815StReI0/rhKCTvN38k8AnuZN6DPgl LKsc0sN8Ex/u2NjM6ViJY2pTOQRjtgASJMbU+iGwLMB3tVeNVv8nDh84iWDK4Cndt6bN gdOg== X-Gm-Message-State: AOAM532EIvO9DMBqnS+3F9vG/Q2Rd+cRtjshzI+go16387290C9+z20K 0wxxm0ESAF5Yy/mjgwZBCdPH72GT5FpBlA== X-Google-Smtp-Source: ABdhPJxTNKQtLGXHWbXYVcGQW0KjCsW9fn0vX1bXqsFhtsvDkPQzH6OctNaJOgC9eePb9/5Za3tqgg== X-Received: by 2002:a63:7158:: with SMTP id b24mr16175334pgn.310.1621904656127; Mon, 24 May 2021 18:04:16 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:15 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 27/92] target/arm: Implement SVE2 SQSHRUN, SQRSHRUN Date: Mon, 24 May 2021 18:02:53 -0700 Message-Id: <20210525010358.152808-28-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::531; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x531.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 16 +++++++ target/arm/sve.decode | 4 ++ target/arm/sve_helper.c | 35 ++++++++++++++ target/arm/translate-sve.c | 98 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 153 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2b2ebea631..2e80d9d27b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2460,6 +2460,22 @@ DEF_HELPER_FLAGS_3(sve2_rshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_rshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_rshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrunb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrunb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrunb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqshrunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqrshrunb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrunb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrunb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqrshrunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 169486ecb2..18faa900ca 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1288,6 +1288,10 @@ SQXTUNT 01000101 .. 1 ..... 010 101 ..... ..... @rd_rn_tszimm_shl ## SVE2 bitwise shift right narrow # Bit 23 == 0 is handled by esz > 0 in the translator. +SQSHRUNB 01000101 .. 1 ..... 00 0000 ..... ..... @rd_rn_tszimm_shr +SQSHRUNT 01000101 .. 1 ..... 00 0001 ..... ..... @rd_rn_tszimm_shr +SQRSHRUNB 01000101 .. 1 ..... 00 0010 ..... ..... @rd_rn_tszimm_shr +SQRSHRUNT 01000101 .. 1 ..... 00 0011 ..... ..... @rd_rn_tszimm_shr SHRNB 01000101 .. 1 ..... 00 0100 ..... ..... @rd_rn_tszimm_shr SHRNT 01000101 .. 1 ..... 00 0101 ..... ..... @rd_rn_tszimm_shr RSHRNB 01000101 .. 1 ..... 00 0110 ..... ..... @rd_rn_tszimm_shr diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 8fd61e37f9..b304ca19e8 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1879,6 +1879,16 @@ static inline uint64_t do_urshr(uint64_t x, unsigned sh) } } +static inline int64_t do_srshr(int64_t x, unsigned sh) +{ + if (likely(sh < 64)) { + return (x >> sh) + ((x >> (sh - 1)) & 1); + } else { + /* Rounding the sign bit always produces 0. */ + return 0; + } +} + DO_ZPZI(sve_asr_zpzi_b, int8_t, H1, DO_SHR) DO_ZPZI(sve_asr_zpzi_h, int16_t, H1_2, DO_SHR) DO_ZPZI(sve_asr_zpzi_s, int32_t, H1_4, DO_SHR) @@ -1941,6 +1951,31 @@ DO_SHRNT(sve2_rshrnt_h, uint16_t, uint8_t, H1_2, H1, do_urshr) DO_SHRNT(sve2_rshrnt_s, uint32_t, uint16_t, H1_4, H1_2, do_urshr) DO_SHRNT(sve2_rshrnt_d, uint64_t, uint32_t, , H1_4, do_urshr) +#define DO_SQSHRUN_H(x, sh) do_sat_bhs((int64_t)(x) >> sh, 0, UINT8_MAX) +#define DO_SQSHRUN_S(x, sh) do_sat_bhs((int64_t)(x) >> sh, 0, UINT16_MAX) +#define DO_SQSHRUN_D(x, sh) \ + do_sat_bhs((int64_t)(x) >> (sh < 64 ? sh : 63), 0, UINT32_MAX) + +DO_SHRNB(sve2_sqshrunb_h, int16_t, uint8_t, DO_SQSHRUN_H) +DO_SHRNB(sve2_sqshrunb_s, int32_t, uint16_t, DO_SQSHRUN_S) +DO_SHRNB(sve2_sqshrunb_d, int64_t, uint32_t, DO_SQSHRUN_D) + +DO_SHRNT(sve2_sqshrunt_h, int16_t, uint8_t, H1_2, H1, DO_SQSHRUN_H) +DO_SHRNT(sve2_sqshrunt_s, int32_t, uint16_t, H1_4, H1_2, DO_SQSHRUN_S) +DO_SHRNT(sve2_sqshrunt_d, int64_t, uint32_t, , H1_4, DO_SQSHRUN_D) + +#define DO_SQRSHRUN_H(x, sh) do_sat_bhs(do_srshr(x, sh), 0, UINT8_MAX) +#define DO_SQRSHRUN_S(x, sh) do_sat_bhs(do_srshr(x, sh), 0, UINT16_MAX) +#define DO_SQRSHRUN_D(x, sh) do_sat_bhs(do_srshr(x, sh), 0, UINT32_MAX) + +DO_SHRNB(sve2_sqrshrunb_h, int16_t, uint8_t, DO_SQRSHRUN_H) +DO_SHRNB(sve2_sqrshrunb_s, int32_t, uint16_t, DO_SQRSHRUN_S) +DO_SHRNB(sve2_sqrshrunb_d, int64_t, uint32_t, DO_SQRSHRUN_D) + +DO_SHRNT(sve2_sqrshrunt_h, int16_t, uint8_t, H1_2, H1, DO_SQRSHRUN_H) +DO_SHRNT(sve2_sqrshrunt_s, int32_t, uint16_t, H1_4, H1_2, DO_SQRSHRUN_S) +DO_SHRNT(sve2_sqrshrunt_d, int64_t, uint32_t, , H1_4, DO_SQRSHRUN_D) + #undef DO_SHRNB #undef DO_SHRNT diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e072f8a2cf..36986b6e87 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6858,6 +6858,104 @@ static bool trans_RSHRNT(DisasContext *s, arg_rri_esz *a) return do_sve2_shr_narrow(s, a, ops); } +static void gen_sqshrunb_vec(unsigned vece, TCGv_vec d, + TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + + tcg_gen_sari_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, 0); + tcg_gen_smax_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_umin_vec(vece, d, n, t); + tcg_temp_free_vec(t); +} + +static bool trans_SQSHRUNB(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { + INDEX_op_sari_vec, INDEX_op_smax_vec, INDEX_op_umin_vec, 0 + }; + static const GVecGen2i ops[3] = { + { .fniv = gen_sqshrunb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_sqshrunb_h, + .vece = MO_16 }, + { .fniv = gen_sqshrunb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_sqshrunb_s, + .vece = MO_32 }, + { .fniv = gen_sqshrunb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_sqshrunb_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static void gen_sqshrunt_vec(unsigned vece, TCGv_vec d, + TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + + tcg_gen_sari_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, 0); + tcg_gen_smax_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_umin_vec(vece, n, n, t); + tcg_gen_shli_vec(vece, n, n, halfbits); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_SQSHRUNT(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { + INDEX_op_shli_vec, INDEX_op_sari_vec, + INDEX_op_smax_vec, INDEX_op_umin_vec, 0 + }; + static const GVecGen2i ops[3] = { + { .fniv = gen_sqshrunt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_sqshrunt_h, + .vece = MO_16 }, + { .fniv = gen_sqshrunt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_sqshrunt_s, + .vece = MO_32 }, + { .fniv = gen_sqshrunt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_sqshrunt_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_SQRSHRUNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_sqrshrunb_h }, + { .fno = gen_helper_sve2_sqrshrunb_s }, + { .fno = gen_helper_sve2_sqrshrunb_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_SQRSHRUNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_sqrshrunt_h }, + { .fno = gen_helper_sve2_sqrshrunt_s }, + { .fno = gen_helper_sve2_sqrshrunt_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + static bool do_sve2_zpzz_fp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { From patchwork Tue May 25 01:02:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277467 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18CDEC2B9F7 for ; Tue, 25 May 2021 01:28:37 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 721DE61403 for ; Tue, 25 May 2021 01:28:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 721DE61403 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:46312 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLs3-0005LF-HG for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:28:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54104) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLV3-00026c-Cq for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:49 -0400 Received: from mail-pf1-x434.google.com ([2607:f8b0:4864:20::434]:40469) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUZ-0001qY-FI for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:49 -0400 Received: by mail-pf1-x434.google.com with SMTP id x188so22267523pfd.7 for ; Mon, 24 May 2021 18:04:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=3edOJdwrmEVvr/U0p0jlNwDzNcaMU1hX79Y9+XdOszM=; b=mky7RJmJmi86QdpAqRp8T7J3gCqZ0ZW8lv0qzmIRqUece1XUIJpAYnYp9t4KyGNnVI K3E48k3pkAddHP9akzlQiT/QUvYoEDvZSYJuWOKV0uSK8CgpQv4yIdE2jqosCCGpTy+s 4Ua6Z98D2QI472oaPp15jNWTfHIaqrh7MXYFKUjoJ4Vb0A/vZGnQvCC0fpfr9ONpm4Vf 7Zpjy1jc3IpYIFSRwMaYS8JYLAOkymipMQzIJRgUaGU3rBgUpGPT/p/kUdBnv7W84Mng 1ibJPimOs5+I/kqXYrS63R2FdO/ZDg+qZJGSIVXrRdA976FssA31ir23AcZeFuuGvIG6 z+kQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=3edOJdwrmEVvr/U0p0jlNwDzNcaMU1hX79Y9+XdOszM=; b=DH/n9l8U33XV2MJ46ZdBr1iHvNSdOw2J93zfrRiCmAQXzzYzhfb59y2t3i4t84J4Ej wC2cCWmF1V512TsS2xL5RvF7q1yNP4QAhKAW4ZDpwdROQPDJNVUgr5FT7Uc1GhweYOQc d3EAyj275RNImobwxEutPkN3ZZZS2A3ZXvFC8gt3UD1DHHHZxW5qY6mv5+0x5mifCtUc Gtts7aYxTG0f88gl4hfS9ctNL2Z3OFKSGPreTRRLMp6+lFWzII4SLsf7uduxgQM5Oeev V7uiZfjbZJpSrmc2e+IgNC852J4I9vnZPhLrX/q/CrLy/vnqjOI72U0mAK3KIQ+h0qW7 FloQ== X-Gm-Message-State: AOAM533jpu+h+WqYyDMfSPFRGYH9HoymqY4844bPZcMrpXzhy2V/UotZ ZNFzXqOmCC4E51cDtcbCcGvAZMkHWrHO+A== X-Google-Smtp-Source: ABdhPJwQKns60z6iucbZ8M2LZb6FUDP6T22p5p54c1+c7YWTIu+27gV0GsoYO0iAkBuISDQcQvr0Cg== X-Received: by 2002:a63:5d18:: with SMTP id r24mr16542601pgb.94.1621904656698; Mon, 24 May 2021 18:04:16 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:16 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 28/92] target/arm: Implement SVE2 UQSHRN, UQRSHRN Date: Mon, 24 May 2021 18:02:54 -0700 Message-Id: <20210525010358.152808-29-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::434; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x434.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 16 +++++++ target/arm/sve.decode | 4 ++ target/arm/sve_helper.c | 24 ++++++++++ target/arm/translate-sve.c | 93 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 137 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2e80d9d27b..ba6a24fc8b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2476,6 +2476,22 @@ DEF_HELPER_FLAGS_3(sve2_sqrshrunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqrshrunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqrshrunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqshrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqshrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqshrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_uqshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_uqrshrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqrshrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqrshrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_uqrshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqrshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqrshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 18faa900ca..13b5da0856 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1296,6 +1296,10 @@ SHRNB 01000101 .. 1 ..... 00 0100 ..... ..... @rd_rn_tszimm_shr SHRNT 01000101 .. 1 ..... 00 0101 ..... ..... @rd_rn_tszimm_shr RSHRNB 01000101 .. 1 ..... 00 0110 ..... ..... @rd_rn_tszimm_shr RSHRNT 01000101 .. 1 ..... 00 0111 ..... ..... @rd_rn_tszimm_shr +UQSHRNB 01000101 .. 1 ..... 00 1100 ..... ..... @rd_rn_tszimm_shr +UQSHRNT 01000101 .. 1 ..... 00 1101 ..... ..... @rd_rn_tszimm_shr +UQRSHRNB 01000101 .. 1 ..... 00 1110 ..... ..... @rd_rn_tszimm_shr +UQRSHRNT 01000101 .. 1 ..... 00 1111 ..... ..... @rd_rn_tszimm_shr ## SVE2 floating-point pairwise operations diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b304ca19e8..89262149f9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1976,6 +1976,30 @@ DO_SHRNT(sve2_sqrshrunt_h, int16_t, uint8_t, H1_2, H1, DO_SQRSHRUN_H) DO_SHRNT(sve2_sqrshrunt_s, int32_t, uint16_t, H1_4, H1_2, DO_SQRSHRUN_S) DO_SHRNT(sve2_sqrshrunt_d, int64_t, uint32_t, , H1_4, DO_SQRSHRUN_D) +#define DO_UQSHRN_H(x, sh) MIN(x >> sh, UINT8_MAX) +#define DO_UQSHRN_S(x, sh) MIN(x >> sh, UINT16_MAX) +#define DO_UQSHRN_D(x, sh) MIN(x >> sh, UINT32_MAX) + +DO_SHRNB(sve2_uqshrnb_h, uint16_t, uint8_t, DO_UQSHRN_H) +DO_SHRNB(sve2_uqshrnb_s, uint32_t, uint16_t, DO_UQSHRN_S) +DO_SHRNB(sve2_uqshrnb_d, uint64_t, uint32_t, DO_UQSHRN_D) + +DO_SHRNT(sve2_uqshrnt_h, uint16_t, uint8_t, H1_2, H1, DO_UQSHRN_H) +DO_SHRNT(sve2_uqshrnt_s, uint32_t, uint16_t, H1_4, H1_2, DO_UQSHRN_S) +DO_SHRNT(sve2_uqshrnt_d, uint64_t, uint32_t, , H1_4, DO_UQSHRN_D) + +#define DO_UQRSHRN_H(x, sh) MIN(do_urshr(x, sh), UINT8_MAX) +#define DO_UQRSHRN_S(x, sh) MIN(do_urshr(x, sh), UINT16_MAX) +#define DO_UQRSHRN_D(x, sh) MIN(do_urshr(x, sh), UINT32_MAX) + +DO_SHRNB(sve2_uqrshrnb_h, uint16_t, uint8_t, DO_UQRSHRN_H) +DO_SHRNB(sve2_uqrshrnb_s, uint32_t, uint16_t, DO_UQRSHRN_S) +DO_SHRNB(sve2_uqrshrnb_d, uint64_t, uint32_t, DO_UQRSHRN_D) + +DO_SHRNT(sve2_uqrshrnt_h, uint16_t, uint8_t, H1_2, H1, DO_UQRSHRN_H) +DO_SHRNT(sve2_uqrshrnt_s, uint32_t, uint16_t, H1_4, H1_2, DO_UQRSHRN_S) +DO_SHRNT(sve2_uqrshrnt_d, uint64_t, uint32_t, , H1_4, DO_UQRSHRN_D) + #undef DO_SHRNB #undef DO_SHRNT diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 36986b6e87..e5c71005c8 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6956,6 +6956,99 @@ static bool trans_SQRSHRUNT(DisasContext *s, arg_rri_esz *a) return do_sve2_shr_narrow(s, a, ops); } +static void gen_uqshrnb_vec(unsigned vece, TCGv_vec d, + TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + + tcg_gen_shri_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_umin_vec(vece, d, n, t); + tcg_temp_free_vec(t); +} + +static bool trans_UQSHRNB(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { + INDEX_op_shri_vec, INDEX_op_umin_vec, 0 + }; + static const GVecGen2i ops[3] = { + { .fniv = gen_uqshrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_uqshrnb_h, + .vece = MO_16 }, + { .fniv = gen_uqshrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_uqshrnb_s, + .vece = MO_32 }, + { .fniv = gen_uqshrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_uqshrnb_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static void gen_uqshrnt_vec(unsigned vece, TCGv_vec d, + TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + + tcg_gen_shri_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_umin_vec(vece, n, n, t); + tcg_gen_shli_vec(vece, n, n, halfbits); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_UQSHRNT(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { + INDEX_op_shli_vec, INDEX_op_shri_vec, INDEX_op_umin_vec, 0 + }; + static const GVecGen2i ops[3] = { + { .fniv = gen_uqshrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_uqshrnt_h, + .vece = MO_16 }, + { .fniv = gen_uqshrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_uqshrnt_s, + .vece = MO_32 }, + { .fniv = gen_uqshrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_uqshrnt_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_UQRSHRNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_uqrshrnb_h }, + { .fno = gen_helper_sve2_uqrshrnb_s }, + { .fno = gen_helper_sve2_uqrshrnb_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_UQRSHRNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_uqrshrnt_h }, + { .fno = gen_helper_sve2_uqrshrnt_s }, + { .fno = gen_helper_sve2_uqrshrnt_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + static bool do_sve2_zpzz_fp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { From patchwork Tue May 25 01:02:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277491 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B878C2B9F7 for ; Tue, 25 May 2021 01:39:21 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A689761415 for ; Tue, 25 May 2021 01:39:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A689761415 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:45570 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM2R-0006tZ-L5 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:39:19 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54136) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLV4-0002BM-DY for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:50 -0400 Received: from mail-pf1-x435.google.com ([2607:f8b0:4864:20::435]:38899) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUZ-0001rZ-FN for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:50 -0400 Received: by mail-pf1-x435.google.com with SMTP id e17so11668793pfl.5 for ; Mon, 24 May 2021 18:04:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=AJkG8XDDClP3dpiPaSjuBLtopyEBlxV6aqwcFZt8nEI=; b=AQmvUa9mWoKt/vOvHkLa5Lc94chEyRDRuf3pD4STWYcd+XapEdy4v56FIXjKQHggS9 rQXspAgYuvSea6QHq0Yx4MDBkhotsaU9nC/+k0QJ3QlHjtSYvH/ce4Ew4s2iwD0mTSYK 7e/LCwK9cscmgd3JzUKBCr6d42YmZDtIcoY4QD6PSw95mvIqt8LIFrZUIsqEN0OGSDrJ W0ejOXWNGRB3JzUYay9/m8PGwMPk2cKrymb7URMBh7JgtIkc2rNCxHLviA/iWAFrSAPg 4I0Q48CjfegnFaAZ1gURWzg0k8ca+1KUFpPzAU8WURI5GcF+0Cb6HCRbLZQ3xygXyHGS Z1Jw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=AJkG8XDDClP3dpiPaSjuBLtopyEBlxV6aqwcFZt8nEI=; b=t9HDZnjLG56QtaCu72J6Kz+Y2/Y7WjgMpAHsvgmBLOerPPuxh0sDvVDUNU8z4IFGYU RUvVgLCSA+fHuNOLQzlsvFCIzbP6Cc8CpP9AArZlsh/ZggPRbnukEkyiq0ftJMEipReH SF/CKjZAAWj5yvEpIo6yvIUTIfjuUAzh1nMwYDVg5PYLhcO3siYpfYuxPTKerkxDmmNI MqoI4b0Rwh5q9Lbkdoa3AEPK3LkD6J9I79TFenLGA4U+y+HJz2wViMz4rMWf1R8r2TxL oMsWIXOw0oi1kDTyEy63HMe/LrJgxI5rSbX3FpKhBG3gQgUqR+G+BVDPH+30RFv3H8HL jj4w== X-Gm-Message-State: AOAM532KcOQVmnoeRabpgQWNZb2EPJI6h+vd+tKgP6LXdfl2bTOsF+j2 6QmRrPT7dNikf+hZUQfVSA0O/q4NyG0aiQ== X-Google-Smtp-Source: ABdhPJxTS6r7gnpW9hUXHftKuMtxjs+xGdlFKfvWk8I/FCe+TiVw9KY0ShAxzRyOXaFf9jVim1gBUg== X-Received: by 2002:aa7:8bd5:0:b029:2d6:ab78:7770 with SMTP id s21-20020aa78bd50000b02902d6ab787770mr27455583pfd.59.1621904657315; Mon, 24 May 2021 18:04:17 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:17 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 29/92] target/arm: Implement SVE2 SQSHRN, SQRSHRN Date: Mon, 24 May 2021 18:02:55 -0700 Message-Id: <20210525010358.152808-30-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::435; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x435.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" This completes the section "SVE2 bitwise shift right narrow". Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 16 ++++++ target/arm/sve.decode | 4 ++ target/arm/sve_helper.c | 24 +++++++++ target/arm/translate-sve.c | 105 +++++++++++++++++++++++++++++++++++++ 4 files changed, 149 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ba6a24fc8b..1c7fe8e417 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2476,6 +2476,22 @@ DEF_HELPER_FLAGS_3(sve2_sqrshrunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqrshrunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqrshrunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqrshrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqrshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve2_uqshrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_uqshrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_uqshrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 13b5da0856..0674464695 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1296,6 +1296,10 @@ SHRNB 01000101 .. 1 ..... 00 0100 ..... ..... @rd_rn_tszimm_shr SHRNT 01000101 .. 1 ..... 00 0101 ..... ..... @rd_rn_tszimm_shr RSHRNB 01000101 .. 1 ..... 00 0110 ..... ..... @rd_rn_tszimm_shr RSHRNT 01000101 .. 1 ..... 00 0111 ..... ..... @rd_rn_tszimm_shr +SQSHRNB 01000101 .. 1 ..... 00 1000 ..... ..... @rd_rn_tszimm_shr +SQSHRNT 01000101 .. 1 ..... 00 1001 ..... ..... @rd_rn_tszimm_shr +SQRSHRNB 01000101 .. 1 ..... 00 1010 ..... ..... @rd_rn_tszimm_shr +SQRSHRNT 01000101 .. 1 ..... 00 1011 ..... ..... @rd_rn_tszimm_shr UQSHRNB 01000101 .. 1 ..... 00 1100 ..... ..... @rd_rn_tszimm_shr UQSHRNT 01000101 .. 1 ..... 00 1101 ..... ..... @rd_rn_tszimm_shr UQRSHRNB 01000101 .. 1 ..... 00 1110 ..... ..... @rd_rn_tszimm_shr diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 89262149f9..0ea4ae28db 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1976,6 +1976,30 @@ DO_SHRNT(sve2_sqrshrunt_h, int16_t, uint8_t, H1_2, H1, DO_SQRSHRUN_H) DO_SHRNT(sve2_sqrshrunt_s, int32_t, uint16_t, H1_4, H1_2, DO_SQRSHRUN_S) DO_SHRNT(sve2_sqrshrunt_d, int64_t, uint32_t, , H1_4, DO_SQRSHRUN_D) +#define DO_SQSHRN_H(x, sh) do_sat_bhs(x >> sh, INT8_MIN, INT8_MAX) +#define DO_SQSHRN_S(x, sh) do_sat_bhs(x >> sh, INT16_MIN, INT16_MAX) +#define DO_SQSHRN_D(x, sh) do_sat_bhs(x >> sh, INT32_MIN, INT32_MAX) + +DO_SHRNB(sve2_sqshrnb_h, int16_t, uint8_t, DO_SQSHRN_H) +DO_SHRNB(sve2_sqshrnb_s, int32_t, uint16_t, DO_SQSHRN_S) +DO_SHRNB(sve2_sqshrnb_d, int64_t, uint32_t, DO_SQSHRN_D) + +DO_SHRNT(sve2_sqshrnt_h, int16_t, uint8_t, H1_2, H1, DO_SQSHRN_H) +DO_SHRNT(sve2_sqshrnt_s, int32_t, uint16_t, H1_4, H1_2, DO_SQSHRN_S) +DO_SHRNT(sve2_sqshrnt_d, int64_t, uint32_t, , H1_4, DO_SQSHRN_D) + +#define DO_SQRSHRN_H(x, sh) do_sat_bhs(do_srshr(x, sh), INT8_MIN, INT8_MAX) +#define DO_SQRSHRN_S(x, sh) do_sat_bhs(do_srshr(x, sh), INT16_MIN, INT16_MAX) +#define DO_SQRSHRN_D(x, sh) do_sat_bhs(do_srshr(x, sh), INT32_MIN, INT32_MAX) + +DO_SHRNB(sve2_sqrshrnb_h, int16_t, uint8_t, DO_SQRSHRN_H) +DO_SHRNB(sve2_sqrshrnb_s, int32_t, uint16_t, DO_SQRSHRN_S) +DO_SHRNB(sve2_sqrshrnb_d, int64_t, uint32_t, DO_SQRSHRN_D) + +DO_SHRNT(sve2_sqrshrnt_h, int16_t, uint8_t, H1_2, H1, DO_SQRSHRN_H) +DO_SHRNT(sve2_sqrshrnt_s, int32_t, uint16_t, H1_4, H1_2, DO_SQRSHRN_S) +DO_SHRNT(sve2_sqrshrnt_d, int64_t, uint32_t, , H1_4, DO_SQRSHRN_D) + #define DO_UQSHRN_H(x, sh) MIN(x >> sh, UINT8_MAX) #define DO_UQSHRN_S(x, sh) MIN(x >> sh, UINT16_MAX) #define DO_UQSHRN_D(x, sh) MIN(x >> sh, UINT32_MAX) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e5c71005c8..4141d76311 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6956,6 +6956,111 @@ static bool trans_SQRSHRUNT(DisasContext *s, arg_rri_esz *a) return do_sve2_shr_narrow(s, a, ops); } +static void gen_sqshrnb_vec(unsigned vece, TCGv_vec d, + TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t max = MAKE_64BIT_MASK(0, halfbits - 1); + int64_t min = -max - 1; + + tcg_gen_sari_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, min); + tcg_gen_smax_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_smin_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_and_vec(vece, d, n, t); + tcg_temp_free_vec(t); +} + +static bool trans_SQSHRNB(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { + INDEX_op_sari_vec, INDEX_op_smax_vec, INDEX_op_smin_vec, 0 + }; + static const GVecGen2i ops[3] = { + { .fniv = gen_sqshrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_sqshrnb_h, + .vece = MO_16 }, + { .fniv = gen_sqshrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_sqshrnb_s, + .vece = MO_32 }, + { .fniv = gen_sqshrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_sqshrnb_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static void gen_sqshrnt_vec(unsigned vece, TCGv_vec d, + TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t max = MAKE_64BIT_MASK(0, halfbits - 1); + int64_t min = -max - 1; + + tcg_gen_sari_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, min); + tcg_gen_smax_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_smin_vec(vece, n, n, t); + tcg_gen_shli_vec(vece, n, n, halfbits); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_SQSHRNT(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { + INDEX_op_shli_vec, INDEX_op_sari_vec, + INDEX_op_smax_vec, INDEX_op_smin_vec, 0 + }; + static const GVecGen2i ops[3] = { + { .fniv = gen_sqshrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_sqshrnt_h, + .vece = MO_16 }, + { .fniv = gen_sqshrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_sqshrnt_s, + .vece = MO_32 }, + { .fniv = gen_sqshrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_sqshrnt_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_SQRSHRNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_sqrshrnb_h }, + { .fno = gen_helper_sve2_sqrshrnb_s }, + { .fno = gen_helper_sve2_sqrshrnb_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_SQRSHRNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_sqrshrnt_h }, + { .fno = gen_helper_sve2_sqrshrnt_s }, + { .fno = gen_helper_sve2_sqrshrnt_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + static void gen_uqshrnb_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t shr) { From patchwork Tue May 25 01:02:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277471 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5C3BBC2B9F7 for ; Tue, 25 May 2021 01:28:44 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CC93C61403 for ; Tue, 25 May 2021 01:28:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CC93C61403 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:47002 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLsA-0005pM-UR for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:28:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54216) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLV9-0002Kl-Ie for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:55 -0400 Received: from mail-pg1-x533.google.com ([2607:f8b0:4864:20::533]:44797) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUZ-0001rm-J0 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:55 -0400 Received: by mail-pg1-x533.google.com with SMTP id 29so10300258pgu.11 for ; Mon, 24 May 2021 18:04:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=+pFuuNENrjc/D1s2ITTHICil4lfJFc3PqB+UO0v4xzE=; b=XTj4BzCjC1me3PdCJRIwr4W1n1/5Zafc/bYAlIbo9hGOhHEqfCQI0FX7u2c4l7+gRi 9vycHaCMEwkTU0MAEcHQTo+BadcdctiJkV3t7W7pigocCr4f1IE1DFZdhMXfYl69FgMw TQnJmqMcSoNBf9AzrWel/OempOMurH3DQfpD9pFtABozP+1mVu+uwAgDA0LggrW6nGQj /7r4HF+5rlrHZ7S0VVNx+nppHshrdbXFpi9FSQ9bHC0fWoXgQmus596q8CB7Det6tNDT lDMoFTxxQRTpXKqI0kwzfadv6eFyh4RVSYJL2OQR4xwJZ71fKxrumsyBlaCGaSzhjQqX Xalg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+pFuuNENrjc/D1s2ITTHICil4lfJFc3PqB+UO0v4xzE=; b=hAfapD/lOMhGMGg1lKUP0qNqujt4shcaH7gQ+ijvlk0Pz72hEUDg7ydyVB5Qc4Zes/ kW15aE1klIIkmEI5koVnNfkdwuBdF9xlYIw11Ia2AMEGgl8/zYPjRemmzX6cP5N1Ee72 M71nV9Au65Jqia26xaSaNM5ctoKtqsN3bKRgPVJmyVZV2Vym3jk3x3PuAlJ3OAKaVjS0 BFnwLmyp7GLWIECQR+6iOr8x255zcuf2Be43QPsb0WHqVGyWosEapOFBBmTldj6rK0Te 9tT0DR0uv36B3bDWojJUZ8EAp7UKHq5QypvU1Z208MHmZwtbuZ5HbtLRMwfzF/ukcGa8 5uNA== X-Gm-Message-State: AOAM530aiDPzS930UN9Z41bg7UkxmE0xrjX0Q97YXjSFAh/qK3lteIUg VheBXVmbw5fzCt0DKRlyUql7XnOX1gQF9g== X-Google-Smtp-Source: ABdhPJwnzn88wmsAa7LLWaBCa7FnOkJNkfYAT11rkAa/XMmK7g4X+0MrbrlD2eDaZwwY6sB+8q1qpg== X-Received: by 2002:a63:fa03:: with SMTP id y3mr16190805pgh.389.1621904657931; Mon, 24 May 2021 18:04:17 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:17 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 30/92] target/arm: Implement SVE2 WHILEGT, WHILEGE, WHILEHI, WHILEHS Date: Mon, 24 May 2021 18:02:56 -0700 Message-Id: <20210525010358.152808-31-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::533; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x533.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Rename the existing sve_while (less-than) helper to sve_whilel to make room for a new sve_whileg helper for greater-than. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v2: Use a new helper function to implement this. v4: Update for PREDDESC. --- target/arm/helper-sve.h | 3 +- target/arm/sve.decode | 2 +- target/arm/sve_helper.c | 38 +++++++++++++++++++++++++- target/arm/translate-sve.c | 56 ++++++++++++++++++++++++++++---------- 4 files changed, 82 insertions(+), 17 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 1c7fe8e417..5bf9fdc7a3 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -913,7 +913,8 @@ DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) -DEF_HELPER_FLAGS_3(sve_while, TCG_CALL_NO_RWG, i32, ptr, i32, i32) +DEF_HELPER_FLAGS_3(sve_whilel, TCG_CALL_NO_RWG, i32, ptr, i32, i32) +DEF_HELPER_FLAGS_3(sve_whileg, TCG_CALL_NO_RWG, i32, ptr, i32, i32) DEF_HELPER_FLAGS_4(sve_subri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_subri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0674464695..ae853d21f2 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -700,7 +700,7 @@ SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... ..... @incdec2_pred CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000 # SVE integer compare scalar count and limit -WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 1 rn:5 eq:1 rd:4 +WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 lt:1 rn:5 eq:1 rd:4 ### SVE Integer Wide Immediate - Unpredicated Group diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 0ea4ae28db..7450977630 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3750,7 +3750,7 @@ uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32_t pred_desc) return sum; } -uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) +uint32_t HELPER(sve_whilel)(void *vd, uint32_t count, uint32_t pred_desc) { intptr_t oprsz = FIELD_EX32(pred_desc, PREDDESC, OPRSZ); intptr_t esz = FIELD_EX32(pred_desc, PREDDESC, ESZ); @@ -3776,6 +3776,42 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +uint32_t HELPER(sve_whileg)(void *vd, uint32_t count, uint32_t pred_desc) +{ + intptr_t oprsz = FIELD_EX32(pred_desc, PREDDESC, OPRSZ); + intptr_t esz = FIELD_EX32(pred_desc, PREDDESC, ESZ); + uint64_t esz_mask = pred_esz_masks[esz]; + ARMPredicateReg *d = vd; + intptr_t i, invcount, oprbits; + uint64_t bits; + + if (count == 0) { + return do_zero(d, oprsz); + } + + oprbits = oprsz * 8; + tcg_debug_assert(count <= oprbits); + + bits = esz_mask; + if (oprbits & 63) { + bits &= MAKE_64BIT_MASK(0, oprbits & 63); + } + + invcount = oprbits - count; + for (i = (oprsz - 1) / 8; i > invcount / 64; --i) { + d->p[i] = bits; + bits = esz_mask; + } + + d->p[i] = bits & MAKE_64BIT_MASK(invcount & 63, 64); + + while (--i >= 0) { + d->p[i] = 0; + } + + return predtest_ones(d, oprsz, esz_mask); +} + /* Recursive reduction on a function; * C.f. the ARM ARM function ReducePredicated. * diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4141d76311..a55e747514 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3112,7 +3112,14 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a) unsigned vsz = vec_full_reg_size(s); unsigned desc = 0; TCGCond cond; + uint64_t maxval; + /* Note that GE/HS has a->eq == 0 and GT/HI has a->eq == 1. */ + bool eq = a->eq == a->lt; + /* The greater-than conditions are all SVE2. */ + if (!a->lt && !dc_isar_feature(aa64_sve2, s)) { + return false; + } if (!sve_access_check(s)) { return true; } @@ -3135,22 +3142,42 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a) */ t0 = tcg_temp_new_i64(); t1 = tcg_temp_new_i64(); - tcg_gen_sub_i64(t0, op1, op0); + + if (a->lt) { + tcg_gen_sub_i64(t0, op1, op0); + if (a->u) { + maxval = a->sf ? UINT64_MAX : UINT32_MAX; + cond = eq ? TCG_COND_LEU : TCG_COND_LTU; + } else { + maxval = a->sf ? INT64_MAX : INT32_MAX; + cond = eq ? TCG_COND_LE : TCG_COND_LT; + } + } else { + tcg_gen_sub_i64(t0, op0, op1); + if (a->u) { + maxval = 0; + cond = eq ? TCG_COND_GEU : TCG_COND_GTU; + } else { + maxval = a->sf ? INT64_MIN : INT32_MIN; + cond = eq ? TCG_COND_GE : TCG_COND_GT; + } + } tmax = tcg_const_i64(vsz >> a->esz); - if (a->eq) { + if (eq) { /* Equality means one more iteration. */ tcg_gen_addi_i64(t0, t0, 1); - /* If op1 is max (un)signed integer (and the only time the addition - * above could overflow), then we produce an all-true predicate by - * setting the count to the vector length. This is because the - * pseudocode is described as an increment + compare loop, and the - * max integer would always compare true. + /* + * For the less-than while, if op1 is maxval (and the only time + * the addition above could overflow), then we produce an all-true + * predicate by setting the count to the vector length. This is + * because the pseudocode is described as an increment + compare + * loop, and the maximum integer would always compare true. + * Similarly, the greater-than while has the same issue with the + * minimum integer due to the decrement + compare loop. */ - tcg_gen_movi_i64(t1, (a->sf - ? (a->u ? UINT64_MAX : INT64_MAX) - : (a->u ? UINT32_MAX : INT32_MAX))); + tcg_gen_movi_i64(t1, maxval); tcg_gen_movcond_i64(TCG_COND_EQ, t0, op1, t1, tmax, t0); } @@ -3159,9 +3186,6 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a) tcg_temp_free_i64(tmax); /* Set the count to zero if the condition is false. */ - cond = (a->u - ? (a->eq ? TCG_COND_LEU : TCG_COND_LTU) - : (a->eq ? TCG_COND_LE : TCG_COND_LT)); tcg_gen_movi_i64(t1, 0); tcg_gen_movcond_i64(cond, t0, op0, op1, t0, t1); tcg_temp_free_i64(t1); @@ -3181,7 +3205,11 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a) ptr = tcg_temp_new_ptr(); tcg_gen_addi_ptr(ptr, cpu_env, pred_full_reg_offset(s, a->rd)); - gen_helper_sve_while(t2, ptr, t2, t3); + if (a->lt) { + gen_helper_sve_whilel(t2, ptr, t2, t3); + } else { + gen_helper_sve_whileg(t2, ptr, t2, t3); + } do_pred_flags(t2); tcg_temp_free_ptr(ptr); From patchwork Tue May 25 01:02:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277477 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA483C2B9F7 for ; Tue, 25 May 2021 01:33:02 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7D193613F9 for ; Tue, 25 May 2021 01:33:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7D193613F9 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55218 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLwL-00031U-Kq for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:33:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54168) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLV7-0002Gi-9Y for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:53 -0400 Received: from mail-pg1-x534.google.com ([2607:f8b0:4864:20::534]:44798) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUZ-0001ry-Sq for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:52 -0400 Received: by mail-pg1-x534.google.com with SMTP id 29so10300276pgu.11 for ; Mon, 24 May 2021 18:04:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/gvD7a8f2BNOsA9JJLwFLYTrDJciHubXEtbSMNmmysw=; b=oXCiNl0/8NVueEswb4jgV0nkBTcU0a3QmZ4pibtx/uvEUD2rMtN7Z2o4AYQ1biBQlH 87rlWrQTww9+I0RD9J8nYxHpQwmtp9nz0fce1KXNZzptZ5nEKeXrb7WU+98AvjoEzWEf Q0d3Krx/lU/Az+IVbiYoL6KfsONJqRj6cRj1YkqtB3dDDWnxp8CrWf1d3tGGe7iFErav aNSQxRmIZQFg1nP5NYHHvQNY8K1QjzzlZXHaKrecxW5tI476qPF/0eJu0IFSsmU4sp8m XHEz0jM539gL91TSsR1Aq9WKkkYmD+nebESGppli7lKd/7FGoTZoVn97LiThZiPODlL+ lAEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/gvD7a8f2BNOsA9JJLwFLYTrDJciHubXEtbSMNmmysw=; b=cTjXKMid7jQjwj6Pdg0pPGHaEvg19FTDvYVcnNBbJVOqzkP6mVC4qWf6Ss9QI2NBBK P3t8gYGAp5HsQj3taoXjrKHPYV27qncUrucxBRTOgEE5luWMC6sh5YFwC+9dWfVfWQzT 8li0pplWSifq1S6BPcmzNMUactcGLOcEJLD3j2OT+a5nLaZEaDIiQ22W+6huyS8R1lQF vfN1+be79gUZazvsPq4skueaE0n1bqz2J0F3aqDIatpG7cmtLgiqvM3UCYiFBj3Atd2F wEFbp/Ypw5D20uuVfNbSDUuX4RC953mSBBZoUtASMpRCcC73xEZPRukrKXczygWmfrZs SHLQ== X-Gm-Message-State: AOAM530u4kCNrgguki96/YJ9Rh6091uzDB2BTcgq/NK8Sy7KuS+PYumH vVyQTWQFPR8t864YDH7EmFPZpaXRvFqhkg== X-Google-Smtp-Source: ABdhPJwJ+KWpiiVhAsjOfr3kBugMBuRJ297xXZaZq3SPfFWI971tpJ8oYrcBg9sF6M044lLqhpzitA== X-Received: by 2002:a63:104a:: with SMTP id 10mr16326557pgq.66.1621904658524; Mon, 24 May 2021 18:04:18 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:18 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 31/92] target/arm: Implement SVE2 WHILERW, WHILEWR Date: Mon, 24 May 2021 18:02:57 -0700 Message-Id: <20210525010358.152808-32-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::534; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x534.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v2: Fix decodetree typo v3: Fix iteration counts (zhiwei). v4: Update for PREDDESC. --- target/arm/sve.decode | 3 ++ target/arm/translate-sve.c | 67 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 70 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index ae853d21f2..f365907518 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -702,6 +702,9 @@ CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000 # SVE integer compare scalar count and limit WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 lt:1 rn:5 eq:1 rd:4 +# SVE2 pointer conflict compare +WHILE_ptr 00100101 esz:2 1 rm:5 001 100 rn:5 rw:1 rd:4 + ### SVE Integer Wide Immediate - Unpredicated Group # SVE broadcast floating-point immediate (unpredicated) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a55e747514..64aecc2db4 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3218,6 +3218,73 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a) return true; } +static bool trans_WHILE_ptr(DisasContext *s, arg_WHILE_ptr *a) +{ + TCGv_i64 op0, op1, diff, t1, tmax; + TCGv_i32 t2, t3; + TCGv_ptr ptr; + unsigned vsz = vec_full_reg_size(s); + unsigned desc = 0; + + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + op0 = read_cpu_reg(s, a->rn, 1); + op1 = read_cpu_reg(s, a->rm, 1); + + tmax = tcg_const_i64(vsz); + diff = tcg_temp_new_i64(); + + if (a->rw) { + /* WHILERW */ + /* diff = abs(op1 - op0), noting that op0/1 are unsigned. */ + t1 = tcg_temp_new_i64(); + tcg_gen_sub_i64(diff, op0, op1); + tcg_gen_sub_i64(t1, op1, op0); + tcg_gen_movcond_i64(TCG_COND_GEU, diff, op0, op1, diff, t1); + tcg_temp_free_i64(t1); + /* Round down to a multiple of ESIZE. */ + tcg_gen_andi_i64(diff, diff, -1 << a->esz); + /* If op1 == op0, diff == 0, and the condition is always true. */ + tcg_gen_movcond_i64(TCG_COND_EQ, diff, op0, op1, tmax, diff); + } else { + /* WHILEWR */ + tcg_gen_sub_i64(diff, op1, op0); + /* Round down to a multiple of ESIZE. */ + tcg_gen_andi_i64(diff, diff, -1 << a->esz); + /* If op0 >= op1, diff <= 0, the condition is always true. */ + tcg_gen_movcond_i64(TCG_COND_GEU, diff, op0, op1, tmax, diff); + } + + /* Bound to the maximum. */ + tcg_gen_umin_i64(diff, diff, tmax); + tcg_temp_free_i64(tmax); + + /* Since we're bounded, pass as a 32-bit type. */ + t2 = tcg_temp_new_i32(); + tcg_gen_extrl_i64_i32(t2, diff); + tcg_temp_free_i64(diff); + + desc = FIELD_DP32(desc, PREDDESC, OPRSZ, vsz / 8); + desc = FIELD_DP32(desc, PREDDESC, ESZ, a->esz); + t3 = tcg_const_i32(desc); + + ptr = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(ptr, cpu_env, pred_full_reg_offset(s, a->rd)); + + gen_helper_sve_whilel(t2, ptr, t2, t3); + do_pred_flags(t2); + + tcg_temp_free_ptr(ptr); + tcg_temp_free_i32(t2); + tcg_temp_free_i32(t3); + return true; +} + /* *** SVE Integer Wide Immediate - Unpredicated Group */ From patchwork Tue May 25 01:02:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277473 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7225C2B9F7 for ; Tue, 25 May 2021 01:31:03 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1861D61403 for ; Tue, 25 May 2021 01:31:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1861D61403 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:51614 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLuQ-0000T0-1t for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:31:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54194) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLV8-0002JF-66 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:54 -0400 Received: from mail-pf1-x42f.google.com ([2607:f8b0:4864:20::42f]:33427) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUb-0001s7-4J for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:53 -0400 Received: by mail-pf1-x42f.google.com with SMTP id f22so13899280pfn.0 for ; Mon, 24 May 2021 18:04:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=GpTPnrtzyEW6ttxv/vRf6/zSu+ma0vPd6fBn+T2nWPw=; b=q5y8d5xdEt+NzEf/bFyq/MH0Ub8+RMsZJElXZ+7hYe4x6IyTjOSyyI6DpLvrpAjwRi MnSSS2GKNjkXHVpKt18wEfPWUJLHNzRlEjoGY82q3Mj9Rt+pGNZG54OfDoY4k1sCa1wN OEQ4o3dd0INllB0qq5YM8zTXPN4v3UBph6Z6ubiYcUk3ZHi1DCBkNbPNconBMOI/oAIm fn9AI+1K2NSy6bk7e7eX/+y7jYxr+SnrWCgop/RbVap473rmBcOA72H5+8XZQZWz9x08 Dbqy21U+e62iA2U65Pnn3Ubt8hBRhw4J3h3KQPsX3Ffw9H5ywz+FEY85Y+9aHngmVh2o qlNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=GpTPnrtzyEW6ttxv/vRf6/zSu+ma0vPd6fBn+T2nWPw=; b=SVbyIp3xgQxVwV0f5RKa3KNb4F7f0TK2NB37DEhLoir5fpK0SooO2GcMAtw6rlodog HJRJNVNToe+sgMcE5+LIhB0oNodoSH7fO0nB5P5kMbdUoOsBmdn9bFV6yl0pyT005J9G 6exZ8WeW+mrPU0RIWX+Icq/eev4O4HjCbUARIFvYsBwGSgA1QAONckdoJJpzNrHuQTnM cRktfjtfQDTo2dfxUnDSnnToXIQYk+/T+2VRSgqlrDbGlcy/LBaSzga9wIY8y0woD3Ae AGmhvuXa9MRHlClNGB5Ff5xuHbxKl1suoX6rVtOYRvmLG9433+6nNG22gisgFOfSTpbd /FJQ== X-Gm-Message-State: AOAM532RICer3j9P36zn4IY33J5o9YGwJhBLnsS/SRfJUjeX9ZVQ7a7M PBDJHcG2X9Udc+d35FzX2fpTYtLX23A6zw== X-Google-Smtp-Source: ABdhPJy2FZMj2xJcqkZyd848JkTU0MnvIsMnf3Xmn0TOwdqDqmhfTjRSSq1rs/1RApqvODGioKuECw== X-Received: by 2002:a05:6a00:2ad:b029:2dc:900f:1c28 with SMTP id q13-20020a056a0002adb02902dc900f1c28mr26981671pfs.67.1621904659172; Mon, 24 May 2021 18:04:19 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:18 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 32/92] target/arm: Implement SVE2 bitwise ternary operations Date: Mon, 24 May 2021 18:02:58 -0700 Message-Id: <20210525010358.152808-33-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42f; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 6 ++ target/arm/sve.decode | 12 +++ target/arm/sve_helper.c | 50 +++++++++ target/arm/translate-sve.c | 213 +++++++++++++++++++++++++++++++++++++ 4 files changed, 281 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 5bf9fdc7a3..df617e3351 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2543,3 +2543,9 @@ DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_eor3, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_bcax, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_bsl1n, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_bsl2n, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_nbsl, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f365907518..bf673e2f16 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -124,6 +124,10 @@ @rda_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 \ &rrrr_esz ra=%reg_movprfx +# Four operand with unused vector element size +@rdn_ra_rm_e0 ........ ... rm:5 ... ... ra:5 rd:5 \ + &rrrr_esz esz=0 rn=%reg_movprfx + # Three operand with "memory" size, aka immediate left shift @rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri @@ -379,6 +383,14 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 +# SVE2 bitwise ternary operations +EOR3 00000100 00 1 ..... 001 110 ..... ..... @rdn_ra_rm_e0 +BSL 00000100 00 1 ..... 001 111 ..... ..... @rdn_ra_rm_e0 +BCAX 00000100 01 1 ..... 001 110 ..... ..... @rdn_ra_rm_e0 +BSL1N 00000100 01 1 ..... 001 111 ..... ..... @rdn_ra_rm_e0 +BSL2N 00000100 10 1 ..... 001 111 ..... ..... @rdn_ra_rm_e0 +NBSL 00000100 11 1 ..... 001 111 ..... ..... @rdn_ra_rm_e0 + ### SVE Index Generation Group # SVE index generation (immediate start, immediate increment) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 7450977630..17889bc316 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -6797,3 +6797,53 @@ DO_ST1_ZPZ_D(dd_be, zd, MO_64) #undef DO_ST1_ZPZ_S #undef DO_ST1_ZPZ_D + +void HELPER(sve2_eor3)(void *vd, void *vn, void *vm, void *vk, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm, *k = vk; + + for (i = 0; i < opr_sz; ++i) { + d[i] = n[i] ^ m[i] ^ k[i]; + } +} + +void HELPER(sve2_bcax)(void *vd, void *vn, void *vm, void *vk, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm, *k = vk; + + for (i = 0; i < opr_sz; ++i) { + d[i] = n[i] ^ (m[i] & ~k[i]); + } +} + +void HELPER(sve2_bsl1n)(void *vd, void *vn, void *vm, void *vk, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm, *k = vk; + + for (i = 0; i < opr_sz; ++i) { + d[i] = (~n[i] & k[i]) | (m[i] & ~k[i]); + } +} + +void HELPER(sve2_bsl2n)(void *vd, void *vn, void *vm, void *vk, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm, *k = vk; + + for (i = 0; i < opr_sz; ++i) { + d[i] = (n[i] & k[i]) | (~m[i] & ~k[i]); + } +} + +void HELPER(sve2_nbsl)(void *vd, void *vn, void *vm, void *vk, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm, *k = vk; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ~((n[i] & k[i]) | (m[i] & ~k[i])); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 64aecc2db4..093424fd27 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -217,6 +217,17 @@ static void gen_gvec_fn_zzz(DisasContext *s, GVecGen3Fn *gvec_fn, vec_full_reg_offset(s, rm), vsz, vsz); } +/* Invoke a vector expander on four Zregs. */ +static void gen_gvec_fn_zzzz(DisasContext *s, GVecGen4Fn *gvec_fn, + int esz, int rd, int rn, int rm, int ra) +{ + unsigned vsz = vec_full_reg_size(s); + gvec_fn(esz, vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + vec_full_reg_offset(s, ra), vsz, vsz); +} + /* Invoke a vector move on two Zregs. */ static bool do_mov_z(DisasContext *s, int rd, int rn) { @@ -329,6 +340,208 @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a) return do_zzz_fn(s, a, tcg_gen_gvec_andc); } +static bool do_sve2_zzzz_fn(DisasContext *s, arg_rrrr_esz *a, GVecGen4Fn *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_fn_zzzz(s, fn, a->esz, a->rd, a->rn, a->rm, a->ra); + } + return true; +} + +static void gen_eor3_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) +{ + tcg_gen_xor_i64(d, n, m); + tcg_gen_xor_i64(d, d, k); +} + +static void gen_eor3_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, TCGv_vec k) +{ + tcg_gen_xor_vec(vece, d, n, m); + tcg_gen_xor_vec(vece, d, d, k); +} + +static void gen_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen4 op = { + .fni8 = gen_eor3_i64, + .fniv = gen_eor3_vec, + .fno = gen_helper_sve2_eor3, + .vece = MO_64, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); +} + +static bool trans_EOR3(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sve2_zzzz_fn(s, a, gen_eor3); +} + +static void gen_bcax_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) +{ + tcg_gen_andc_i64(d, m, k); + tcg_gen_xor_i64(d, d, n); +} + +static void gen_bcax_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, TCGv_vec k) +{ + tcg_gen_andc_vec(vece, d, m, k); + tcg_gen_xor_vec(vece, d, d, n); +} + +static void gen_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen4 op = { + .fni8 = gen_bcax_i64, + .fniv = gen_bcax_vec, + .fno = gen_helper_sve2_bcax, + .vece = MO_64, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); +} + +static bool trans_BCAX(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sve2_zzzz_fn(s, a, gen_bcax); +} + +static void gen_bsl(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + /* BSL differs from the generic bitsel in argument ordering. */ + tcg_gen_gvec_bitsel(vece, d, a, n, m, oprsz, maxsz); +} + +static bool trans_BSL(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sve2_zzzz_fn(s, a, gen_bsl); +} + +static void gen_bsl1n_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) +{ + tcg_gen_andc_i64(n, k, n); + tcg_gen_andc_i64(m, m, k); + tcg_gen_or_i64(d, n, m); +} + +static void gen_bsl1n_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, TCGv_vec k) +{ + if (TCG_TARGET_HAS_bitsel_vec) { + tcg_gen_not_vec(vece, n, n); + tcg_gen_bitsel_vec(vece, d, k, n, m); + } else { + tcg_gen_andc_vec(vece, n, k, n); + tcg_gen_andc_vec(vece, m, m, k); + tcg_gen_or_vec(vece, d, n, m); + } +} + +static void gen_bsl1n(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen4 op = { + .fni8 = gen_bsl1n_i64, + .fniv = gen_bsl1n_vec, + .fno = gen_helper_sve2_bsl1n, + .vece = MO_64, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); +} + +static bool trans_BSL1N(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sve2_zzzz_fn(s, a, gen_bsl1n); +} + +static void gen_bsl2n_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) +{ + /* + * Z[dn] = (n & k) | (~m & ~k) + * = | ~(m | k) + */ + tcg_gen_and_i64(n, n, k); + if (TCG_TARGET_HAS_orc_i64) { + tcg_gen_or_i64(m, m, k); + tcg_gen_orc_i64(d, n, m); + } else { + tcg_gen_nor_i64(m, m, k); + tcg_gen_or_i64(d, n, m); + } +} + +static void gen_bsl2n_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, TCGv_vec k) +{ + if (TCG_TARGET_HAS_bitsel_vec) { + tcg_gen_not_vec(vece, m, m); + tcg_gen_bitsel_vec(vece, d, k, n, m); + } else { + tcg_gen_and_vec(vece, n, n, k); + tcg_gen_or_vec(vece, m, m, k); + tcg_gen_orc_vec(vece, d, n, m); + } +} + +static void gen_bsl2n(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen4 op = { + .fni8 = gen_bsl2n_i64, + .fniv = gen_bsl2n_vec, + .fno = gen_helper_sve2_bsl2n, + .vece = MO_64, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); +} + +static bool trans_BSL2N(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sve2_zzzz_fn(s, a, gen_bsl2n); +} + +static void gen_nbsl_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) +{ + tcg_gen_and_i64(n, n, k); + tcg_gen_andc_i64(m, m, k); + tcg_gen_nor_i64(d, n, m); +} + +static void gen_nbsl_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, TCGv_vec k) +{ + tcg_gen_bitsel_vec(vece, d, k, n, m); + tcg_gen_not_vec(vece, d, d); +} + +static void gen_nbsl(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen4 op = { + .fni8 = gen_nbsl_i64, + .fniv = gen_nbsl_vec, + .fno = gen_helper_sve2_nbsl, + .vece = MO_64, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); +} + +static bool trans_NBSL(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sve2_zzzz_fn(s, a, gen_nbsl); +} + /* *** SVE Integer Arithmetic - Unpredicated Group */ From patchwork Tue May 25 01:02:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277481 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2C8B9C2B9F7 for ; Tue, 25 May 2021 01:34:29 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9C3AA61401 for ; Tue, 25 May 2021 01:34:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9C3AA61401 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60280 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLxj-0006Mh-Jv for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:34:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54224) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLVA-0002NX-76 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:56 -0400 Received: from mail-pf1-x42a.google.com ([2607:f8b0:4864:20::42a]:40460) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUb-0001sV-Ph for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:55 -0400 Received: by mail-pf1-x42a.google.com with SMTP id x188so22267591pfd.7 for ; Mon, 24 May 2021 18:04:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Ogxl8WRvUU10mS7HNWeicfHXRTXx6p7GB6bwGt2V1I4=; b=PDPKh8BWfyH0eDow1yaPNXMZvPfVbUY6kxNaZ8LzPiZINhni6PZIM6jNvq9vtDaMk/ 5YveZboqrpAevJteVvtZkqHn9YQyPnYq/iiqKz32vwuYusrML1MjnrFh8RWZlRGmx/oG wCXl1zJC5LZFgO36og++NFbRMzXYj+s7gwPowHRCN68sW5+CUVXfxYKYrtJHKVNbGSjC Zjo9yf6cFx1itkUUdVNVYJ84lcOzexYoNGQLWz2aOKVWxEij/9Ws9xhUDMoYeBXmwVjx T9SBb72huSr9USpHoBEu9Fn0rp5/J1hdw/VunxeYBToStvR9KS8jFlpxHgWvHEv/tVXI oYXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Ogxl8WRvUU10mS7HNWeicfHXRTXx6p7GB6bwGt2V1I4=; b=fzjYgcLllq9q2eMeMYgKQ7dz5PTmR2oWABFPuM9O+7volb8jxfwqm3ik5UkU6DSGZe 7Lumwsc3gt6KKU5CCvbxOQsG1Ff+arZNSjiIhZLOweHXtN4RyIkEy9ryHsYyxv4gGyTk MvxJjC/L5iiHWDjAWTQ27HIafXWFnAIZQwRUyilvwcXOlr4CUTxC4z7GnzC3JYMvGlCw +tJUjPJru4KWrvSmvw/JjyXAmP/vpIZuoNhGZQdztSWAv1Y/hv3CAjmwefzHnf6JP/C6 D+vyWMBWuHutHVq6OCHpi39iJPKahb31NNSFKYDF5cDe5MgK+vPje0gVWThniF/UMPCA E9/w== X-Gm-Message-State: AOAM530VhWKf5ncoy2rv6ulKcjtfYzcno/PpvPxh5d5GJxPiNQ+NSf2/ BuSLRRcqcPwKztIU/5BFEcO8voh/JYTB8g== X-Google-Smtp-Source: ABdhPJy6PrIDMEmS+IM+Mw8hG1hX2wXfgJ8NVWK1bxNwmQIXVAtYofyEjQwd3b3fOTtcQj/jEsnV/A== X-Received: by 2002:a63:5b23:: with SMTP id p35mr16363763pgb.352.1621904659895; Mon, 24 May 2021 18:04:19 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:19 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 33/92] target/arm: Implement SVE2 MATCH, NMATCH Date: Mon, 24 May 2021 18:02:59 -0700 Message-Id: <20210525010358.152808-34-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42a; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Reviewed-by: Richard Henderson Signed-off-by: Stephen Long Message-Id: <20200415145915.2859-1-steplong@quicinc.com> [rth: Expanded comment for do_match2] Signed-off-by: Richard Henderson --- v2: Apply esz_mask to input pg to fix output flags. --- target/arm/helper-sve.h | 10 ++++++ target/arm/sve.decode | 5 +++ target/arm/sve_helper.c | 64 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 22 +++++++++++++ 4 files changed, 101 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index df617e3351..11dc6870de 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2509,6 +2509,16 @@ DEF_HELPER_FLAGS_3(sve2_uqrshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_uqrshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_uqrshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_match_ppzz_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_match_ppzz_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_nmatch_ppzz_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_nmatch_ppzz_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index bf673e2f16..47fca5e12d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1320,6 +1320,11 @@ UQSHRNT 01000101 .. 1 ..... 00 1101 ..... ..... @rd_rn_tszimm_shr UQRSHRNB 01000101 .. 1 ..... 00 1110 ..... ..... @rd_rn_tszimm_shr UQRSHRNT 01000101 .. 1 ..... 00 1111 ..... ..... @rd_rn_tszimm_shr +### SVE2 Character Match + +MATCH 01000101 .. 1 ..... 100 ... ..... 0 .... @pd_pg_rn_rm +NMATCH 01000101 .. 1 ..... 100 ... ..... 1 .... @pd_pg_rn_rm + ## SVE2 floating-point pairwise operations FADDP 01100100 .. 010 00 0 100 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 17889bc316..f3250165da 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -6847,3 +6847,67 @@ void HELPER(sve2_nbsl)(void *vd, void *vn, void *vm, void *vk, uint32_t desc) d[i] = ~((n[i] & k[i]) | (m[i] & ~k[i])); } } + +/* + * Returns true if m0 or m1 contains the low uint8_t/uint16_t in n. + * See hasless(v,1) from + * https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord + */ +static inline bool do_match2(uint64_t n, uint64_t m0, uint64_t m1, int esz) +{ + int bits = 8 << esz; + uint64_t ones = dup_const(esz, 1); + uint64_t signs = ones << (bits - 1); + uint64_t cmp0, cmp1; + + cmp1 = dup_const(esz, n); + cmp0 = cmp1 ^ m0; + cmp1 = cmp1 ^ m1; + cmp0 = (cmp0 - ones) & ~cmp0; + cmp1 = (cmp1 - ones) & ~cmp1; + return (cmp0 | cmp1) & signs; +} + +static inline uint32_t do_match(void *vd, void *vn, void *vm, void *vg, + uint32_t desc, int esz, bool nmatch) +{ + uint16_t esz_mask = pred_esz_masks[esz]; + intptr_t opr_sz = simd_oprsz(desc); + uint32_t flags = PREDTEST_INIT; + intptr_t i, j, k; + + for (i = 0; i < opr_sz; i += 16) { + uint64_t m0 = *(uint64_t *)(vm + i); + uint64_t m1 = *(uint64_t *)(vm + i + 8); + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)) & esz_mask; + uint16_t out = 0; + + for (j = 0; j < 16; j += 8) { + uint64_t n = *(uint64_t *)(vn + i + j); + + for (k = 0; k < 8; k += 1 << esz) { + if (pg & (1 << (j + k))) { + bool o = do_match2(n >> (k * 8), m0, m1, esz); + out |= (o ^ nmatch) << (j + k); + } + } + } + *(uint16_t *)(vd + H1_2(i >> 3)) = out; + flags = iter_predtest_fwd(out, pg, flags); + } + return flags; +} + +#define DO_PPZZ_MATCH(NAME, ESZ, INV) \ +uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + return do_match(vd, vn, vm, vg, desc, ESZ, INV); \ +} + +DO_PPZZ_MATCH(sve2_match_ppzz_b, MO_8, false) +DO_PPZZ_MATCH(sve2_match_ppzz_h, MO_16, false) + +DO_PPZZ_MATCH(sve2_nmatch_ppzz_b, MO_8, true) +DO_PPZZ_MATCH(sve2_nmatch_ppzz_h, MO_16, true) + +#undef DO_PPZZ_MATCH diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 093424fd27..0ac2aeef09 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7462,6 +7462,28 @@ static bool trans_UQRSHRNT(DisasContext *s, arg_rri_esz *a) return do_sve2_shr_narrow(s, a, ops); } +static bool do_sve2_ppzz_flags(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_flags_4 *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_ppzz_flags(s, a, fn); +} + +#define DO_SVE2_PPZZ_MATCH(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a) \ +{ \ + static gen_helper_gvec_flags_4 * const fns[4] = { \ + gen_helper_sve2_##name##_ppzz_b, gen_helper_sve2_##name##_ppzz_h, \ + NULL, NULL \ + }; \ + return do_sve2_ppzz_flags(s, a, fns[a->esz]); \ +} + +DO_SVE2_PPZZ_MATCH(MATCH, match) +DO_SVE2_PPZZ_MATCH(NMATCH, nmatch) + static bool do_sve2_zpzz_fp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { From patchwork Tue May 25 01:03:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277501 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 579FCC2B9F7 for ; Tue, 25 May 2021 01:42:51 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E15B06141B for ; Tue, 25 May 2021 01:42:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E15B06141B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:56686 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM5q-0005tJ-06 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:42:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54230) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLVA-0002P4-M5 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:56 -0400 Received: from mail-pg1-x52c.google.com ([2607:f8b0:4864:20::52c]:39733) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUb-0001sd-V8 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:56 -0400 Received: by mail-pg1-x52c.google.com with SMTP id v14so18677987pgi.6 for ; Mon, 24 May 2021 18:04:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=NEJph6bPv2fQDS20YrndVl2KdexL9NoqSHsy7DSwOvU=; b=b+JR2hPEtjy72FzsrUkgOFumx2VPy9X1k+SMv8mIuie/5UgLIjq9w/JDJasxBjF75l UMw28bkVcyMtRdZiurSmecblvSIOS1O24eyA1Mlrkm6ry2+LpMPNWHUqKLqN7EuOjv5/ oc/QAEraE1syfZ+39K6CZJt59BqxW2G6dYti34WEi1xleJ3+/60mU8gentRfLt1P0Lr2 N3DSrVaK0U7f3QrCYENBfQm7WkgED1MOWOHZ6+Dqj31pDZT2OTDwZdBqaxlUFaopisrE J2w2hP00x/aaP+aZZ8xKo57c2lCLSo8gMEAcQcDIgZLAYGo31z1XRpfA3aO3KK3vrk+b bKJA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=NEJph6bPv2fQDS20YrndVl2KdexL9NoqSHsy7DSwOvU=; b=WSC1oFeKwDNDiGHvsYOwRX23LrszakNTDrja8LeZrJxt+8SkLx8dyH3thUvWop0Gz/ BK/Bju8UH8tfLr0Jc4maK0EDBK0zIUf+9O2IOHC/xpg23ONCK9VaIpW5ufhrbYb1JJbF nTf8/DFg7SHLugIO3Qan3dnilGyf7AlThG0B9FZOVGSQndtuxgBq5Q44rHpvomPUv1Im LlyCwnDCwof00vXIIBTRPUOd0icd7p67Q4b8nmw2vbbUcCRvtbLBHYFkOJ8HeFREeMH2 bDwZvdPmqVyfnhKXI2z7p8EPW9cdrmS5We7+3w5qeTEytK6BhqJweFRhUcZnIGQa73Nc 6MEA== X-Gm-Message-State: AOAM5322tr9IderS+NB/R9fJ6+14Sd+yACXewzpi5IArs0g/3j/571by FaFqA2LgbpZdPdkK7IH4g/jGVkj7UCcUEw== X-Google-Smtp-Source: ABdhPJwhadzhcZERW6MNA+/mDZDR+nJP6RuXX8nNWjG9o6XRrAHlOZXhALaSwvsNemZtxIrzr7HKdw== X-Received: by 2002:a63:5c17:: with SMTP id q23mr16459424pgb.168.1621904660573; Mon, 24 May 2021 18:04:20 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:20 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 34/92] target/arm: Implement SVE2 saturating multiply-add long Date: Mon, 24 May 2021 18:03:00 -0700 Message-Id: <20210525010358.152808-35-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52c; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 ++++++++++ target/arm/sve.decode | 14 ++++++++++ target/arm/sve_helper.c | 30 +++++++++++++++++++++ target/arm/translate-sve.c | 54 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 112 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 11dc6870de..d8f390617c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2559,3 +2559,17 @@ DEF_HELPER_FLAGS_5(sve2_bcax, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_bsl1n, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_bsl2n, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_nbsl, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqdmlal_zzzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlal_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlal_zzzw_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqdmlsl_zzzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlsl_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlsl_zzzw_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 47fca5e12d..52f615b39e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1332,3 +1332,17 @@ FMAXNMP 01100100 .. 010 10 0 100 ... ..... ..... @rdn_pg_rm FMINNMP 01100100 .. 010 10 1 100 ... ..... ..... @rdn_pg_rm FMAXP 01100100 .. 010 11 0 100 ... ..... ..... @rdn_pg_rm FMINP 01100100 .. 010 11 1 100 ... ..... ..... @rdn_pg_rm + +#### SVE Integer Multiply-Add (unpredicated) + +## SVE2 saturating multiply-add long + +SQDMLALB_zzzw 01000100 .. 0 ..... 0110 00 ..... ..... @rda_rn_rm +SQDMLALT_zzzw 01000100 .. 0 ..... 0110 01 ..... ..... @rda_rn_rm +SQDMLSLB_zzzw 01000100 .. 0 ..... 0110 10 ..... ..... @rda_rn_rm +SQDMLSLT_zzzw 01000100 .. 0 ..... 0110 11 ..... ..... @rda_rn_rm + +## SVE2 saturating multiply-add interleaved long + +SQDMLALBT 01000100 .. 0 ..... 00001 0 ..... ..... @rda_rn_rm +SQDMLSLBT 01000100 .. 0 ..... 00001 1 ..... ..... @rda_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f3250165da..ad211249ca 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1405,6 +1405,36 @@ void HELPER(sve2_adcl_d)(void *vd, void *vn, void *vm, void *va, uint32_t desc) } } +#define DO_SQDMLAL(NAME, TYPEW, TYPEN, HW, HN, DMUL_OP, SUM_OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sel1 = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPEN); \ + int sel2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(TYPEN); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEN *)(vn + HN(i + sel1)); \ + TYPEW mm = *(TYPEN *)(vm + HN(i + sel2)); \ + TYPEW aa = *(TYPEW *)(va + HW(i)); \ + *(TYPEW *)(vd + HW(i)) = SUM_OP(aa, DMUL_OP(nn, mm)); \ + } \ +} + +DO_SQDMLAL(sve2_sqdmlal_zzzw_h, int16_t, int8_t, H1_2, H1, + do_sqdmull_h, DO_SQADD_H) +DO_SQDMLAL(sve2_sqdmlal_zzzw_s, int32_t, int16_t, H1_4, H1_2, + do_sqdmull_s, DO_SQADD_S) +DO_SQDMLAL(sve2_sqdmlal_zzzw_d, int64_t, int32_t, , H1_4, + do_sqdmull_d, do_sqadd_d) + +DO_SQDMLAL(sve2_sqdmlsl_zzzw_h, int16_t, int8_t, H1_2, H1, + do_sqdmull_h, DO_SQSUB_H) +DO_SQDMLAL(sve2_sqdmlsl_zzzw_s, int32_t, int16_t, H1_4, H1_2, + do_sqdmull_s, DO_SQSUB_S) +DO_SQDMLAL(sve2_sqdmlsl_zzzw_d, int64_t, int32_t, , H1_4, + do_sqdmull_d, do_sqsub_d) + +#undef DO_SQDMLAL + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0ac2aeef09..7e23d1cad3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7508,3 +7508,57 @@ DO_SVE2_ZPZZ_FP(FMAXNMP, fmaxnmp) DO_SVE2_ZPZZ_FP(FMINNMP, fminnmp) DO_SVE2_ZPZZ_FP(FMAXP, fmaxp) DO_SVE2_ZPZZ_FP(FMINP, fminp) + +/* + * SVE Integer Multiply-Add (unpredicated) + */ + +static bool do_sqdmlal_zzzw(DisasContext *s, arg_rrrr_esz *a, + bool sel1, bool sel2) +{ + static gen_helper_gvec_4 * const fns[] = { + NULL, gen_helper_sve2_sqdmlal_zzzw_h, + gen_helper_sve2_sqdmlal_zzzw_s, gen_helper_sve2_sqdmlal_zzzw_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], (sel2 << 1) | sel1); +} + +static bool do_sqdmlsl_zzzw(DisasContext *s, arg_rrrr_esz *a, + bool sel1, bool sel2) +{ + static gen_helper_gvec_4 * const fns[] = { + NULL, gen_helper_sve2_sqdmlsl_zzzw_h, + gen_helper_sve2_sqdmlsl_zzzw_s, gen_helper_sve2_sqdmlsl_zzzw_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], (sel2 << 1) | sel1); +} + +static bool trans_SQDMLALB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sqdmlal_zzzw(s, a, false, false); +} + +static bool trans_SQDMLALT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sqdmlal_zzzw(s, a, true, true); +} + +static bool trans_SQDMLALBT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sqdmlal_zzzw(s, a, false, true); +} + +static bool trans_SQDMLSLB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sqdmlsl_zzzw(s, a, false, false); +} + +static bool trans_SQDMLSLT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sqdmlsl_zzzw(s, a, true, true); +} + +static bool trans_SQDMLSLBT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sqdmlsl_zzzw(s, a, false, true); +} From patchwork Tue May 25 01:03:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277511 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93EC4C2B9F7 for ; Tue, 25 May 2021 01:46:33 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F028E61415 for ; Tue, 25 May 2021 01:46:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F028E61415 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:40632 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM9P-0005Yh-T9 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:46:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54296) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLVD-0002ZU-06 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:59 -0400 Received: from mail-pf1-x435.google.com ([2607:f8b0:4864:20::435]:36688) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUf-0001ss-Jg for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:58 -0400 Received: by mail-pf1-x435.google.com with SMTP id c12so10680914pfl.3 for ; Mon, 24 May 2021 18:04:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=hMJfd4FOTAIzDHyQYXykKd3nDXG5kXZfYldRWp5RaJ0=; b=lsWnoK8FHDNwE77ndduq9LvoRhreyjKdwFR+6YKtGHCg1hhyAeiq1X0QEpf9+v3xBq B+VkC8/R1j2UEmnkQp2MAC4CqRx7IQgKrK0T0oo1yBQjYNUJ065nkXB4s2AqEl5S2f57 FHyXlVtj/FwGcLAEn2XfGFX1P1hhOdeBULtpQVYp9b4jYhytWfpJXQTuKi7H6a1aLFqS r4+WtHprYDCbJ2bDRFQTXGtUaB8BS5g1Yvbzm/7g6NGoKrM02Pb8NEU3h8Rdp12RBWK+ T3fWrgfLBhha9vIQUYuqGmESm+2s4Ve585smmqoH+2it+HVemDtzP/5niPVFlGr7JD25 4GIQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hMJfd4FOTAIzDHyQYXykKd3nDXG5kXZfYldRWp5RaJ0=; b=ZaPRJEvXCWp3LNu2J5wouziot0jm3QASnQkCl/D1BfcWtFRiSfNyeAyRfaqEAZX0e1 y1BnMseA2HBJoFdNZCzR2wMB9HP5LgxTq5yHz1OEiP9nPzX+VFr+tOmSiDSMEusPNNed OsFUqAIbZWOPhKSD+mJSQUvS4CnujI32+N/4ZgD0LoeMHXLt0/0mZ6LJ1kK7OOCYl1C0 sV0Xzsbg7n0jmahfv34rfbZMX9rbBzhRSm2T6hsPkZWAT6f/ksfPbm+STyz4xY/pRiEV 6Px2sWIUy8BsnhowuVUWYkKVNv5HBTLY8AuHYWZJODyTSC3OXxJKFdl/4oCiY4jgXtkX 5hng== X-Gm-Message-State: AOAM531z9PWvV8z9tawfMlLuQk5Trh1zxO+iUH7yyvJPXor27jNj55au 6tyypmv7fGMagGONO5G3DLr8CKWyxLkObg== X-Google-Smtp-Source: ABdhPJweAp/IMrCrMD34DLX5n6dqpatqcZUI5DuTX/AWB5UCFr2P34Npu2IMznYssxqKVwDl5u9VRQ== X-Received: by 2002:aa7:8f37:0:b029:2db:551f:ed8e with SMTP id y23-20020aa78f370000b02902db551fed8emr27141654pfr.43.1621904661197; Mon, 24 May 2021 18:04:21 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:20 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 35/92] target/arm: Implement SVE2 saturating multiply-add high Date: Mon, 24 May 2021 18:03:01 -0700 Message-Id: <20210525010358.152808-36-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::435; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x435.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" SVE2 has two additional sizes of the operation and unlike NEON, there is no saturation flag. Create new entry points for SVE2 that do not set QC. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 17 ++++ target/arm/sve.decode | 5 ++ target/arm/translate-sve.c | 18 +++++ target/arm/vec_helper.c | 161 +++++++++++++++++++++++++++++++++++-- 4 files changed, 195 insertions(+), 6 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 2c412ffd3b..6bb0b0ddc0 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -591,6 +591,23 @@ DEF_HELPER_FLAGS_5(gvec_qrdmlah_s32, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 52f615b39e..8308c9238a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1346,3 +1346,8 @@ SQDMLSLT_zzzw 01000100 .. 0 ..... 0110 11 ..... ..... @rda_rn_rm SQDMLALBT 01000100 .. 0 ..... 00001 0 ..... ..... @rda_rn_rm SQDMLSLBT 01000100 .. 0 ..... 00001 1 ..... ..... @rda_rn_rm + +## SVE2 saturating multiply-add high + +SQRDMLAH_zzzz 01000100 .. 0 ..... 01110 0 ..... ..... @rda_rn_rm +SQRDMLSH_zzzz 01000100 .. 0 ..... 01110 1 ..... ..... @rda_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7e23d1cad3..a3597a4c38 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7562,3 +7562,21 @@ static bool trans_SQDMLSLBT(DisasContext *s, arg_rrrr_esz *a) { return do_sqdmlsl_zzzw(s, a, false, true); } + +static bool trans_SQRDMLAH_zzzz(DisasContext *s, arg_rrrr_esz *a) +{ + static gen_helper_gvec_4 * const fns[] = { + gen_helper_sve2_sqrdmlah_b, gen_helper_sve2_sqrdmlah_h, + gen_helper_sve2_sqrdmlah_s, gen_helper_sve2_sqrdmlah_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], 0); +} + +static bool trans_SQRDMLSH_zzzz(DisasContext *s, arg_rrrr_esz *a) +{ + static gen_helper_gvec_4 * const fns[] = { + gen_helper_sve2_sqrdmlsh_b, gen_helper_sve2_sqrdmlsh_h, + gen_helper_sve2_sqrdmlsh_s, gen_helper_sve2_sqrdmlsh_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], 0); +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index b0ce597060..c56337e724 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -22,6 +22,7 @@ #include "exec/helper-proto.h" #include "tcg/tcg-gvec-desc.h" #include "fpu/softfloat.h" +#include "qemu/int128.h" #include "vec_internal.h" /* Note that vector data is stored in host-endian 64-bit chunks, @@ -36,15 +37,55 @@ #define H4(x) (x) #endif +/* Signed saturating rounding doubling multiply-accumulate high half, 8-bit */ +static int8_t do_sqrdmlah_b(int8_t src1, int8_t src2, int8_t src3, + bool neg, bool round) +{ + /* + * Simplify: + * = ((a3 << 8) + ((e1 * e2) << 1) + (round << 7)) >> 8 + * = ((a3 << 7) + (e1 * e2) + (round << 6)) >> 7 + */ + int32_t ret = (int32_t)src1 * src2; + if (neg) { + ret = -ret; + } + ret += ((int32_t)src3 << 7) + (round << 6); + ret >>= 7; + + if (ret != (int8_t)ret) { + ret = (ret < 0 ? INT8_MIN : INT8_MAX); + } + return ret; +} + +void HELPER(sve2_sqrdmlah_b)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm, *a = va; + + for (i = 0; i < opr_sz; ++i) { + d[i] = do_sqrdmlah_b(n[i], m[i], a[i], false, true); + } +} + +void HELPER(sve2_sqrdmlsh_b)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm, *a = va; + + for (i = 0; i < opr_sz; ++i) { + d[i] = do_sqrdmlah_b(n[i], m[i], a[i], true, true); + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */ static int16_t do_sqrdmlah_h(int16_t src1, int16_t src2, int16_t src3, bool neg, bool round, uint32_t *sat) { - /* - * Simplify: - * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16 - * = ((a3 << 15) + (e1 * e2) + (1 << 14)) >> 15 - */ + /* Simplify similarly to do_sqrdmlah_b above. */ int32_t ret = (int32_t)src1 * src2; if (neg) { ret = -ret; @@ -133,11 +174,35 @@ void HELPER(neon_sqrdmulh_h)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(sve2_sqrdmlah_h)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm, *a = va; + uint32_t discard; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = do_sqrdmlah_h(n[i], m[i], a[i], false, true, &discard); + } +} + +void HELPER(sve2_sqrdmlsh_h)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm, *a = va; + uint32_t discard; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = do_sqrdmlah_h(n[i], m[i], a[i], true, true, &discard); + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */ static int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3, bool neg, bool round, uint32_t *sat) { - /* Simplify similarly to int_qrdmlah_s16 above. */ + /* Simplify similarly to do_sqrdmlah_b above. */ int64_t ret = (int64_t)src1 * src2; if (neg) { ret = -ret; @@ -220,6 +285,90 @@ void HELPER(neon_sqrdmulh_s)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(sve2_sqrdmlah_s)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm, *a = va; + uint32_t discard; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = do_sqrdmlah_s(n[i], m[i], a[i], false, true, &discard); + } +} + +void HELPER(sve2_sqrdmlsh_s)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm, *a = va; + uint32_t discard; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = do_sqrdmlah_s(n[i], m[i], a[i], true, true, &discard); + } +} + +/* Signed saturating rounding doubling multiply-accumulate high half, 64-bit */ +static int64_t do_sat128_d(Int128 r) +{ + int64_t ls = int128_getlo(r); + int64_t hs = int128_gethi(r); + + if (unlikely(hs != (ls >> 63))) { + return hs < 0 ? INT64_MIN : INT64_MAX; + } + return ls; +} + +static int64_t do_sqrdmlah_d(int64_t n, int64_t m, int64_t a, + bool neg, bool round) +{ + uint64_t l, h; + Int128 r, t; + + /* As in do_sqrdmlah_b, but with 128-bit arithmetic. */ + muls64(&l, &h, m, n); + r = int128_make128(l, h); + if (neg) { + r = int128_neg(r); + } + if (a) { + t = int128_exts64(a); + t = int128_lshift(t, 63); + r = int128_add(r, t); + } + if (round) { + t = int128_exts64(1ll << 62); + r = int128_add(r, t); + } + r = int128_rshift(r, 63); + + return do_sat128_d(r); +} + +void HELPER(sve2_sqrdmlah_d)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm, *a = va; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = do_sqrdmlah_d(n[i], m[i], a[i], false, true); + } +} + +void HELPER(sve2_sqrdmlsh_d)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm, *a = va; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = do_sqrdmlah_d(n[i], m[i], a[i], true, true); + } +} + /* Integer 8 and 16-bit dot-product. * * Note that for the loops herein, host endianness does not matter From patchwork Tue May 25 01:03:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277479 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 45FE1C2B9F7 for ; Tue, 25 May 2021 01:33:12 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C0474613F9 for ; Tue, 25 May 2021 01:33:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C0474613F9 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55562 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLwU-0003En-IM for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:33:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54276) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLVC-0002WP-8H for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:58 -0400 Received: from mail-pf1-x434.google.com ([2607:f8b0:4864:20::434]:39536) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUf-0001tF-JP for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:57 -0400 Received: by mail-pf1-x434.google.com with SMTP id c17so22279810pfn.6 for ; Mon, 24 May 2021 18:04:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XJu+GcaQ0uZ85sKkAw0BpV2CyuCatL0glz/LOPaXGPI=; b=ZIGjrlAZrHZ3P2uUfGggS/W0agmffmYL3yI+4ND688E/fjHLwEcnc1xhtQValaf0dW BY/wZXwp/S4tzr5ajl+A1RDNnjOApNT0lm3Hk6wS6yIm6u21EYopqpK9B7cNLHjToftZ hXyFZst8ddGmKUh/vYcTCzSi2v3Bqq/8+Vi1Ge9b1XcxQUjX3Vh0zKeCnFH74ZLQEnqv jJ17ArfJcvwN5Qth4l6qjQ8nDS00WNh4cLJ+mJaiOXO6FElVTItUp7/Vniqt6RjOGPes fJGz6a8zMl/SwmBxJ3PMQW2mHZ+AbVYklnX2+ZJd0rpXEqGjsMTRoiV0tvSItF0KtT7d 4lzA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XJu+GcaQ0uZ85sKkAw0BpV2CyuCatL0glz/LOPaXGPI=; b=JaT9LoFO0QtMtRUvjInZ9lT96fkpTlNs7f8wZWJAXC0yTftpzEUeG+49lzQ8jYM7so gT0yiNDW5rlNpkd5H2I/tDA94m92qdW6Y86wzYSJ10Kjeb9P2+H1MRqNLXxmriqMZrYb wlAWCo5/ivQiIVWjGIsH7SEog57SvFrqm5sD8QKjfLW4SqUW93pevA980ei8opbexMnA 4ZL0mbPScRitB9Py8Pl0yopLujo20/qSzKAS6dDUd4J926+WBbSKNksbQzNyb/GAbX7/ IFwDWk+O2lJKvpZyq05bAzl/l6jfScgZaE6rtlKj3S37SxTckKVRuD3ljduV96EihjGo 23Ag== X-Gm-Message-State: AOAM531VHafGgPaXMETrzuMo6yN6PZm9GG5qjzFjVn+2pslfDPSnqOkS O8A7Tx/0GwZ5dXi9aZA5Xbbs/JPT9Euw2g== X-Google-Smtp-Source: ABdhPJw+l1yCG1XZZRyLiUYxpBvw8OmjexmHZYrGX/IIDP025/4olE9O6mqhOh5RpvkI9O2pLXpOGw== X-Received: by 2002:a05:6a00:139a:b029:2e3:db98:9ae3 with SMTP id t26-20020a056a00139ab02902e3db989ae3mr24089842pfg.81.1621904661686; Mon, 24 May 2021 18:04:21 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:21 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 36/92] target/arm: Implement SVE2 integer multiply-add long Date: Mon, 24 May 2021 18:03:02 -0700 Message-Id: <20210525010358.152808-37-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::434; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x434.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 28 ++++++++++++++ target/arm/sve.decode | 11 ++++++ target/arm/sve_helper.c | 18 +++++++++ target/arm/translate-sve.c | 76 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 133 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index d8f390617c..457a421455 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2573,3 +2573,31 @@ DEF_HELPER_FLAGS_5(sve2_sqdmlsl_zzzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_sqdmlsl_zzzw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_smlal_zzzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlal_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlal_zzzw_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_umlal_zzzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlal_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlal_zzzw_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_smlsl_zzzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlsl_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlsl_zzzw_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_umlsl_zzzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlsl_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlsl_zzzw_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 8308c9238a..b28b50e05c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1351,3 +1351,14 @@ SQDMLSLBT 01000100 .. 0 ..... 00001 1 ..... ..... @rda_rn_rm SQRDMLAH_zzzz 01000100 .. 0 ..... 01110 0 ..... ..... @rda_rn_rm SQRDMLSH_zzzz 01000100 .. 0 ..... 01110 1 ..... ..... @rda_rn_rm + +## SVE2 integer multiply-add long + +SMLALB_zzzw 01000100 .. 0 ..... 010 000 ..... ..... @rda_rn_rm +SMLALT_zzzw 01000100 .. 0 ..... 010 001 ..... ..... @rda_rn_rm +UMLALB_zzzw 01000100 .. 0 ..... 010 010 ..... ..... @rda_rn_rm +UMLALT_zzzw 01000100 .. 0 ..... 010 011 ..... ..... @rda_rn_rm +SMLSLB_zzzw 01000100 .. 0 ..... 010 100 ..... ..... @rda_rn_rm +SMLSLT_zzzw 01000100 .. 0 ..... 010 101 ..... ..... @rda_rn_rm +UMLSLB_zzzw 01000100 .. 0 ..... 010 110 ..... ..... @rda_rn_rm +UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ad211249ca..c1a92a2ba5 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1313,6 +1313,24 @@ DO_ZZZW_ACC(sve2_uabal_h, uint16_t, uint8_t, H1_2, H1, DO_ABD) DO_ZZZW_ACC(sve2_uabal_s, uint32_t, uint16_t, H1_4, H1_2, DO_ABD) DO_ZZZW_ACC(sve2_uabal_d, uint64_t, uint32_t, , H1_4, DO_ABD) +DO_ZZZW_ACC(sve2_smlal_zzzw_h, int16_t, int8_t, H1_2, H1, DO_MUL) +DO_ZZZW_ACC(sve2_smlal_zzzw_s, int32_t, int16_t, H1_4, H1_2, DO_MUL) +DO_ZZZW_ACC(sve2_smlal_zzzw_d, int64_t, int32_t, , H1_4, DO_MUL) + +DO_ZZZW_ACC(sve2_umlal_zzzw_h, uint16_t, uint8_t, H1_2, H1, DO_MUL) +DO_ZZZW_ACC(sve2_umlal_zzzw_s, uint32_t, uint16_t, H1_4, H1_2, DO_MUL) +DO_ZZZW_ACC(sve2_umlal_zzzw_d, uint64_t, uint32_t, , H1_4, DO_MUL) + +#define DO_NMUL(N, M) -(N * M) + +DO_ZZZW_ACC(sve2_smlsl_zzzw_h, int16_t, int8_t, H1_2, H1, DO_NMUL) +DO_ZZZW_ACC(sve2_smlsl_zzzw_s, int32_t, int16_t, H1_4, H1_2, DO_NMUL) +DO_ZZZW_ACC(sve2_smlsl_zzzw_d, int64_t, int32_t, , H1_4, DO_NMUL) + +DO_ZZZW_ACC(sve2_umlsl_zzzw_h, uint16_t, uint8_t, H1_2, H1, DO_NMUL) +DO_ZZZW_ACC(sve2_umlsl_zzzw_s, uint32_t, uint16_t, H1_4, H1_2, DO_NMUL) +DO_ZZZW_ACC(sve2_umlsl_zzzw_d, uint64_t, uint32_t, , H1_4, DO_NMUL) + #undef DO_ZZZW_ACC #define DO_XTNB(NAME, TYPE, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a3597a4c38..f878b0d033 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7580,3 +7580,79 @@ static bool trans_SQRDMLSH_zzzz(DisasContext *s, arg_rrrr_esz *a) }; return do_sve2_zzzz_ool(s, a, fns[a->esz], 0); } + +static bool do_smlal_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel) +{ + static gen_helper_gvec_4 * const fns[] = { + NULL, gen_helper_sve2_smlal_zzzw_h, + gen_helper_sve2_smlal_zzzw_s, gen_helper_sve2_smlal_zzzw_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], sel); +} + +static bool trans_SMLALB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_smlal_zzzw(s, a, false); +} + +static bool trans_SMLALT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_smlal_zzzw(s, a, true); +} + +static bool do_umlal_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel) +{ + static gen_helper_gvec_4 * const fns[] = { + NULL, gen_helper_sve2_umlal_zzzw_h, + gen_helper_sve2_umlal_zzzw_s, gen_helper_sve2_umlal_zzzw_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], sel); +} + +static bool trans_UMLALB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_umlal_zzzw(s, a, false); +} + +static bool trans_UMLALT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_umlal_zzzw(s, a, true); +} + +static bool do_smlsl_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel) +{ + static gen_helper_gvec_4 * const fns[] = { + NULL, gen_helper_sve2_smlsl_zzzw_h, + gen_helper_sve2_smlsl_zzzw_s, gen_helper_sve2_smlsl_zzzw_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], sel); +} + +static bool trans_SMLSLB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_smlsl_zzzw(s, a, false); +} + +static bool trans_SMLSLT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_smlsl_zzzw(s, a, true); +} + +static bool do_umlsl_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel) +{ + static gen_helper_gvec_4 * const fns[] = { + NULL, gen_helper_sve2_umlsl_zzzw_h, + gen_helper_sve2_umlsl_zzzw_s, gen_helper_sve2_umlsl_zzzw_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], sel); +} + +static bool trans_UMLSLB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_umlsl_zzzw(s, a, false); +} + +static bool trans_UMLSLT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_umlsl_zzzw(s, a, true); +} From patchwork Tue May 25 01:03:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277489 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29095C2B9F7 for ; Tue, 25 May 2021 01:37:37 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AC1EA61415 for ; Tue, 25 May 2021 01:37:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AC1EA61415 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:42022 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM0l-0004a8-R6 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:37:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54312) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLVD-0002cz-QU for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:59 -0400 Received: from mail-pf1-x42a.google.com ([2607:f8b0:4864:20::42a]:37400) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUf-0001tN-JO for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:59 -0400 Received: by mail-pf1-x42a.google.com with SMTP id q67so5221440pfb.4 for ; Mon, 24 May 2021 18:04:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ODvyNp8Bx58owdUS56l4H0VEu+tNqT+JOV9/HtxbVbQ=; b=jb6RwPRxq38jjz7Y+rj2pR7Bx459kzdKr0kBBVTIBD/eWygOoRExlwqhKi/jiU85uz 8xDcKuJTW2On5Yd97b8JiIYdyYVovGLL9AuYLH2Elz7w2qHkP7JTVpKl94b/aGoTTB+H zXaB1SLKrNqkH21wSHqmPtV7eXg2ECyxadzg+U91U5AygcGD578/fXXDDeNyQLJ6Pw59 pcm+O5vHgjHJIrve2ts4JEj2WE5/UgweM/Ki5akwZHlpN8EONrb6xL9t0RRdKY0Hm9Yn S/ur1nvf6uRMmYJLpUsm71VTd4Pr6hppGv+wSg++8NI+bk+e3Fph8BGVrDJu4zlWSEAp PEdw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ODvyNp8Bx58owdUS56l4H0VEu+tNqT+JOV9/HtxbVbQ=; b=C7Mu2MHWcl2ozRD+vBWJay2iGIp9F22P46lUTB0ZLab+lKG3r6kFGXiQaaaQWLmtHv odGHm1k6Lt+UOVnfbpjM06rSfmqmzFJw9Vga2nY4/eJ96b4QXk7Yr1mGojCK7tF+n6BZ ct6+8lb1qxXdN3djx4XtnjEh/2seRcvKfC4yAVZpFF7yqJibFVlB5W3jwGOjTUWCM27i 7TBJ25w2nVW4davDmp0cktB8Jcmqi0euu/EnR7s6f6jGIxNvfWqsF1s5+lQkIDYiM0XB oWKb3xOHG65XG1VC9YzFgK0seALPrv0zO1+k8yqfIpnE4u6+mx6mpjoEwVHUeNVt16/z y7+Q== X-Gm-Message-State: AOAM5323kgMBQDx7UQqxIUonRVQzZWAuhc45Ti4u2GXd47YikikSfOFO 3paWBJrlINP9LcKmGRwJw5cmdo/rsk3U+Q== X-Google-Smtp-Source: ABdhPJxLTh4wnuA0gI3j/38M3hqoAGuOe3p0GApLSfVSQW5VWNGPn6/cpmopxhqRlT3WDuCw1a3isQ== X-Received: by 2002:a05:6a00:2282:b029:2db:7cfe:a43b with SMTP id f2-20020a056a002282b02902db7cfea43bmr27179120pfe.34.1621904662216; Mon, 24 May 2021 18:04:22 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:22 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 37/92] target/arm: Implement SVE2 complex integer multiply-add Date: Mon, 24 May 2021 18:03:03 -0700 Message-Id: <20210525010358.152808-38-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42a; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v2: Fix do_sqrdmlah_d (laurent desnogues) v7: Rename DO_CMLA/do_cmla (pm215) --- target/arm/helper-sve.h | 18 +++++++++++++++ target/arm/vec_internal.h | 5 +++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 46 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 32 ++++++++++++++++++++++++++ target/arm/vec_helper.c | 15 ++++++------- 6 files changed, 113 insertions(+), 8 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 457a421455..d154218452 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2601,3 +2601,21 @@ DEF_HELPER_FLAGS_5(sve2_umlsl_zzzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_umlsl_zzzw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_cmla_zzzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cmla_zzzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cmla_zzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cmla_zzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/vec_internal.h b/target/arm/vec_internal.h index 5b78e79329..ff694d870a 100644 --- a/target/arm/vec_internal.h +++ b/target/arm/vec_internal.h @@ -168,4 +168,9 @@ static inline int64_t do_suqrshl_d(int64_t src, int64_t shift, return do_uqrshl_d(src, shift, round, sat); } +int8_t do_sqrdmlah_b(int8_t, int8_t, int8_t, bool, bool); +int16_t do_sqrdmlah_h(int16_t, int16_t, int16_t, bool, bool, uint32_t *); +int32_t do_sqrdmlah_s(int32_t, int32_t, int32_t, bool, bool, uint32_t *); +int64_t do_sqrdmlah_d(int64_t, int64_t, int64_t, bool, bool); + #endif /* TARGET_ARM_VEC_INTERNALS_H */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b28b50e05c..936977eacb 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1362,3 +1362,8 @@ SMLSLB_zzzw 01000100 .. 0 ..... 010 100 ..... ..... @rda_rn_rm SMLSLT_zzzw 01000100 .. 0 ..... 010 101 ..... ..... @rda_rn_rm UMLSLB_zzzw 01000100 .. 0 ..... 010 110 ..... ..... @rda_rn_rm UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm + +## SVE2 complex integer multiply-add + +CMLA_zzzz 01000100 esz:2 0 rm:5 0010 rot:2 rn:5 rd:5 ra=%reg_movprfx +SQRDCMLAH_zzzz 01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5 ra=%reg_movprfx diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index c1a92a2ba5..263663cfc4 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1453,6 +1453,52 @@ DO_SQDMLAL(sve2_sqdmlsl_zzzw_d, int64_t, int32_t, , H1_4, #undef DO_SQDMLAL +#define DO_CMLA_FUNC(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(TYPE); \ + int rot = simd_data(desc); \ + int sel_a = rot & 1, sel_b = sel_a ^ 1; \ + bool sub_r = rot == 1 || rot == 2; \ + bool sub_i = rot >= 2; \ + TYPE *d = vd, *n = vn, *m = vm, *a = va; \ + for (i = 0; i < opr_sz; i += 2) { \ + TYPE elt1_a = n[H(i + sel_a)]; \ + TYPE elt2_a = m[H(i + sel_a)]; \ + TYPE elt2_b = m[H(i + sel_b)]; \ + d[H(i)] = OP(elt1_a, elt2_a, a[H(i)], sub_r); \ + d[H(i + 1)] = OP(elt1_a, elt2_b, a[H(i + 1)], sub_i); \ + } \ +} + +#define DO_CMLA(N, M, A, S) (A + (N * M) * (S ? -1 : 1)) + +DO_CMLA_FUNC(sve2_cmla_zzzz_b, uint8_t, H1, DO_CMLA) +DO_CMLA_FUNC(sve2_cmla_zzzz_h, uint16_t, H2, DO_CMLA) +DO_CMLA_FUNC(sve2_cmla_zzzz_s, uint32_t, H4, DO_CMLA) +DO_CMLA_FUNC(sve2_cmla_zzzz_d, uint64_t, , DO_CMLA) + +#define DO_SQRDMLAH_B(N, M, A, S) \ + do_sqrdmlah_b(N, M, A, S, true) +#define DO_SQRDMLAH_H(N, M, A, S) \ + ({ uint32_t discard; do_sqrdmlah_h(N, M, A, S, true, &discard); }) +#define DO_SQRDMLAH_S(N, M, A, S) \ + ({ uint32_t discard; do_sqrdmlah_s(N, M, A, S, true, &discard); }) +#define DO_SQRDMLAH_D(N, M, A, S) \ + do_sqrdmlah_d(N, M, A, S, true) + +DO_CMLA_FUNC(sve2_sqrdcmlah_zzzz_b, int8_t, H1, DO_SQRDMLAH_B) +DO_CMLA_FUNC(sve2_sqrdcmlah_zzzz_h, int16_t, H2, DO_SQRDMLAH_H) +DO_CMLA_FUNC(sve2_sqrdcmlah_zzzz_s, int32_t, H4, DO_SQRDMLAH_S) +DO_CMLA_FUNC(sve2_sqrdcmlah_zzzz_d, int64_t, , DO_SQRDMLAH_D) + +#undef DO_CMLA +#undef DO_CMLA_FUNC +#undef DO_SQRDMLAH_B +#undef DO_SQRDMLAH_H +#undef DO_SQRDMLAH_S +#undef DO_SQRDMLAH_D + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index f878b0d033..05d9edead4 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7656,3 +7656,35 @@ static bool trans_UMLSLT_zzzw(DisasContext *s, arg_rrrr_esz *a) { return do_umlsl_zzzw(s, a, true); } + +static bool trans_CMLA_zzzz(DisasContext *s, arg_CMLA_zzzz *a) +{ + static gen_helper_gvec_4 * const fns[] = { + gen_helper_sve2_cmla_zzzz_b, gen_helper_sve2_cmla_zzzz_h, + gen_helper_sve2_cmla_zzzz_s, gen_helper_sve2_cmla_zzzz_d, + }; + + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzz(s, fns[a->esz], a->rd, a->rn, a->rm, a->ra, a->rot); + } + return true; +} + +static bool trans_SQRDCMLAH_zzzz(DisasContext *s, arg_SQRDCMLAH_zzzz *a) +{ + static gen_helper_gvec_4 * const fns[] = { + gen_helper_sve2_sqrdcmlah_zzzz_b, gen_helper_sve2_sqrdcmlah_zzzz_h, + gen_helper_sve2_sqrdcmlah_zzzz_s, gen_helper_sve2_sqrdcmlah_zzzz_d, + }; + + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzz(s, fns[a->esz], a->rd, a->rn, a->rm, a->ra, a->rot); + } + return true; +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index c56337e724..19006f50f7 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -38,8 +38,8 @@ #endif /* Signed saturating rounding doubling multiply-accumulate high half, 8-bit */ -static int8_t do_sqrdmlah_b(int8_t src1, int8_t src2, int8_t src3, - bool neg, bool round) +int8_t do_sqrdmlah_b(int8_t src1, int8_t src2, int8_t src3, + bool neg, bool round) { /* * Simplify: @@ -82,8 +82,8 @@ void HELPER(sve2_sqrdmlsh_b)(void *vd, void *vn, void *vm, } /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */ -static int16_t do_sqrdmlah_h(int16_t src1, int16_t src2, int16_t src3, - bool neg, bool round, uint32_t *sat) +int16_t do_sqrdmlah_h(int16_t src1, int16_t src2, int16_t src3, + bool neg, bool round, uint32_t *sat) { /* Simplify similarly to do_sqrdmlah_b above. */ int32_t ret = (int32_t)src1 * src2; @@ -199,8 +199,8 @@ void HELPER(sve2_sqrdmlsh_h)(void *vd, void *vn, void *vm, } /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */ -static int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3, - bool neg, bool round, uint32_t *sat) +int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3, + bool neg, bool round, uint32_t *sat) { /* Simplify similarly to do_sqrdmlah_b above. */ int64_t ret = (int64_t)src1 * src2; @@ -321,8 +321,7 @@ static int64_t do_sat128_d(Int128 r) return ls; } -static int64_t do_sqrdmlah_d(int64_t n, int64_t m, int64_t a, - bool neg, bool round) +int64_t do_sqrdmlah_d(int64_t n, int64_t m, int64_t a, bool neg, bool round) { uint64_t l, h; Int128 r, t; From patchwork Tue May 25 01:03:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277497 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B588C2B9F7 for ; Tue, 25 May 2021 01:41:26 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DD43261415 for ; Tue, 25 May 2021 01:41:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DD43261415 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:52030 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM4T-0002oy-12 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:41:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54326) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLVE-0002fn-Ea for qemu-devel@nongnu.org; Mon, 24 May 2021 21:05:00 -0400 Received: from mail-pg1-x52d.google.com ([2607:f8b0:4864:20::52d]:44792) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUf-0001tZ-Lr for qemu-devel@nongnu.org; Mon, 24 May 2021 21:05:00 -0400 Received: by mail-pg1-x52d.google.com with SMTP id 29so10300388pgu.11 for ; Mon, 24 May 2021 18:04:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=lC4zcBHCZT2FFT1aqghsZIqg78m2BlLe2zTnfBW+FyI=; b=xrDA6ApMfb5CJmh0Y31wn+PqNX1gNIs/kSxzTps1zZrNH4UgztDwtLKIC/mI8I6cld AJOLwY6VDI3DazLWB7/TZ1rbmbApHG9UOf87ZmJtvYzwPmfD3G61ZL4nVzNioXSTSxyr JrAeJJ8cnIap4G9J8j+CdXRZwD/+BfsVwEfMvfWRhZ3MUq4/H7FXXy0vO1yYFTa0Gdfa dDIpCHsKuuv9LhG1zWiVXmvjV5lurJhF2l93AeteN5VuMZxukkZkVMEWxDMQBxiQYC8d ZLNoXjSNfJVdKLXtM9OL8YZ1e+CDtyubhW8OZ5/IyWgOaHeD4gaap8XeDWjMEgmTb8CV 8ssg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=lC4zcBHCZT2FFT1aqghsZIqg78m2BlLe2zTnfBW+FyI=; b=d0gqFeeZMl48aDCc61ruKtqDR6Y3fKq7+b/uFWi8AV3bL21MWNGnxounHrxoYSCScm I4p/KM5JUiOnWxSlpckUKAfVuDjl4D7s4z9mqScczd08dkQqvHLki1Fa0LTA1Qi7qoas EDDNAZPC4gH45JJrEj7XG2C718odeRqRYlsC63bDQZPNQ9LF535O062zp0Q+aofiJ8Ao 6/OtBePkyQ4qcwmxcK3KOkj8d7Ug1xrDi/j9fDqkPuq3rjosX7+y9BdTVw/qBY5rhjZw wut2abmHAkUiVgcd0mmGGl9EmSIuDXb8ydYuWmrTtIuQZei3QviRf3oEST1tbvx+02iB dTDg== X-Gm-Message-State: AOAM530QsKjj9i2iAiLsStN0t9AV1/5T5MkMKUFTqzpnC5N13wkIME5/ Xhzt7ptIMcjsgeRbbQ8YClEOg4S+DMnDfA== X-Google-Smtp-Source: ABdhPJw9twX1oWOmcLV98Jnaco/ZrjDiGrn1lXNTSHg0f5eb7qwJQGl1HXtBvqLVLAgLS+E2BrjczQ== X-Received: by 2002:a05:6a00:16c9:b029:2df:c620:8156 with SMTP id l9-20020a056a0016c9b02902dfc6208156mr27283162pfc.40.1621904662866; Mon, 24 May 2021 18:04:22 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:22 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 38/92] target/arm: Implement SVE2 ADDHNB, ADDHNT Date: Mon, 24 May 2021 18:03:04 -0700 Message-Id: <20210525010358.152808-39-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52d; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200417162231.10374-2-steplong@quicinc.com> Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 8 ++++++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 36 ++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 13 +++++++++++++ 4 files changed, 62 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index d154218452..a369fd2391 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2509,6 +2509,14 @@ DEF_HELPER_FLAGS_3(sve2_uqrshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_uqrshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_uqrshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_addhnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_addhnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_addhnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_addhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_addhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_addhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve2_match_ppzz_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_match_ppzz_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 936977eacb..72dd36a5c8 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1320,6 +1320,11 @@ UQSHRNT 01000101 .. 1 ..... 00 1101 ..... ..... @rd_rn_tszimm_shr UQRSHRNB 01000101 .. 1 ..... 00 1110 ..... ..... @rd_rn_tszimm_shr UQRSHRNT 01000101 .. 1 ..... 00 1111 ..... ..... @rd_rn_tszimm_shr +## SVE2 integer add/subtract narrow high part + +ADDHNB 01000101 .. 1 ..... 011 000 ..... ..... @rd_rn_rm +ADDHNT 01000101 .. 1 ..... 011 001 ..... ..... @rd_rn_rm + ### SVE2 Character Match MATCH 01000101 .. 1 ..... 100 ... ..... 0 .... @pd_pg_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 263663cfc4..df7413f9c9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2121,6 +2121,42 @@ DO_SHRNT(sve2_uqrshrnt_d, uint64_t, uint32_t, , H1_4, DO_UQRSHRN_D) #undef DO_SHRNB #undef DO_SHRNT +#define DO_BINOPNB(NAME, TYPEW, TYPEN, SHIFT, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEW *)(vn + i); \ + TYPEW mm = *(TYPEW *)(vm + i); \ + *(TYPEW *)(vd + i) = (TYPEN)OP(nn, mm, SHIFT); \ + } \ +} + +#define DO_BINOPNT(NAME, TYPEW, TYPEN, SHIFT, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEW *)(vn + HW(i)); \ + TYPEW mm = *(TYPEW *)(vm + HW(i)); \ + *(TYPEN *)(vd + HN(i + sizeof(TYPEN))) = OP(nn, mm, SHIFT); \ + } \ +} + +#define DO_ADDHN(N, M, SH) ((N + M) >> SH) + +DO_BINOPNB(sve2_addhnb_h, uint16_t, uint8_t, 8, DO_ADDHN) +DO_BINOPNB(sve2_addhnb_s, uint32_t, uint16_t, 16, DO_ADDHN) +DO_BINOPNB(sve2_addhnb_d, uint64_t, uint32_t, 32, DO_ADDHN) + +DO_BINOPNT(sve2_addhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_ADDHN) +DO_BINOPNT(sve2_addhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_ADDHN) +DO_BINOPNT(sve2_addhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_ADDHN) + +#undef DO_ADDHN + +#undef DO_BINOPNB + /* Fully general four-operand expander, controlled by a predicate. */ #define DO_ZPZZZ(NAME, TYPE, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 05d9edead4..442bf80b82 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7462,6 +7462,19 @@ static bool trans_UQRSHRNT(DisasContext *s, arg_rri_esz *a) return do_sve2_shr_narrow(s, a, ops); } +#define DO_SVE2_ZZZ_NARROW(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ +{ \ + static gen_helper_gvec_3 * const fns[4] = { \ + NULL, gen_helper_sve2_##name##_h, \ + gen_helper_sve2_##name##_s, gen_helper_sve2_##name##_d, \ + }; \ + return do_sve2_zzz_ool(s, a, fns[a->esz]); \ +} + +DO_SVE2_ZZZ_NARROW(ADDHNB, addhnb) +DO_SVE2_ZZZ_NARROW(ADDHNT, addhnt) + static bool do_sve2_ppzz_flags(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_flags_4 *fn) { From patchwork Tue May 25 01:03:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277487 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 45DD1C2B9F7 for ; Tue, 25 May 2021 01:37:06 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C04D6613BC for ; Tue, 25 May 2021 01:37:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C04D6613BC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39378 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM0G-0002pR-QZ for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:37:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54322) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLVE-0002f6-9e for qemu-devel@nongnu.org; Mon, 24 May 2021 21:05:00 -0400 Received: from mail-pf1-x42b.google.com ([2607:f8b0:4864:20::42b]:36679) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLUf-0001tg-Kp for qemu-devel@nongnu.org; Mon, 24 May 2021 21:04:59 -0400 Received: by mail-pf1-x42b.google.com with SMTP id c12so10680977pfl.3 for ; Mon, 24 May 2021 18:04:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=DwtbG3bSNWXAP9uBkfJM+X1tflgrIEMxL+cytHL7AWs=; b=XMkoJQU5q0nffPvB9OMB51OrEa8/q8PS77+tKLUTRI0Y9r75E3oob5ai6xVCM04QEF W43NPchohz706F/pftt7IOPwv9+reBqkKdW3Lw73i09T3f6WBhmpl15iYCgVatZMAZNE W443tZVLHPf7xfAroVbqTr3Lc4Uy9oX1+YEcV6rtRcNZzOkUWb/8E0hi/WxdCvxiyDbJ zwByEbVTdZPvJ3GK8u96/SPmnqo3ralYdQsrzbctklc86oEhy0r3Zod7cKioeeMGKXcj y4p29P+cpQ4LiltNvmlnFL3B+BvgBp3RIJ+Z5zS0lKV2GE4gQ3d4VX6mndFdIs4s5WUW UZJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=DwtbG3bSNWXAP9uBkfJM+X1tflgrIEMxL+cytHL7AWs=; b=pD8EWL2wVwy+CWf6QdeNWRnj2v3w54g7wmpp6bLW9Q7ey9yJPx/KkWzQl3ClZbZUxO yQevf2BOh8iytwV1F3nFJR5Si9vYEGEGHPBQn47D3sQ3HeozuP5kZUXuunkc9DHrjmZC 6J72onVMSGmdKB20lcDZrxUOLwuBVqL1zsANstS7qjNGExCf2kc/rcGqwsnHrvhnTlxH H6s43Y0vNIj0cg6F+6Io41keJvUxFgHnnhb2Cszvb/+QmzByMsphKZENWEYRlNDb30Ir VM3RFT5ym2herfBxi1Rd+Njvqnw5OcxM2Xv8epDxZwfUkEXME+N/oBw/IdrFqtwKn+a+ YUHA== X-Gm-Message-State: AOAM533jLVR/7XDoUqdmNHSAqrAnPrpG3JNmWa1tkshUlr3gr6sHCiDr lJIwfK/ciIQZcurmjYQgOIEumPOHDRL+jA== X-Google-Smtp-Source: ABdhPJz1iJ9M5hp2e7mjr4g90UB+gQIVkk0upMEbG1iK762SkrOy4UVlKf6U8M7M6u45+HADQp4KNg== X-Received: by 2002:a05:6a00:10c2:b029:2de:7333:1343 with SMTP id d2-20020a056a0010c2b02902de73331343mr27660202pfu.42.1621904663451; Mon, 24 May 2021 18:04:23 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b1sm13742645pgf.84.2021.05.24.18.04.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:04:23 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 39/92] target/arm: Implement SVE2 RADDHNB, RADDHNT Date: Mon, 24 May 2021 18:03:05 -0700 Message-Id: <20210525010358.152808-40-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42b; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200417162231.10374-3-steplong@quicinc.com> Signed-off-by: Richard Henderson --- v2: Fix round bit type (laurent desnogues) --- target/arm/helper-sve.h | 8 ++++++++ target/arm/sve.decode | 2 ++ target/arm/sve_helper.c | 10 ++++++++++ target/arm/translate-sve.c | 2 ++ 4 files changed, 22 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a369fd2391..8d95c87694 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2517,6 +2517,14 @@ DEF_HELPER_FLAGS_4(sve2_addhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_addhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_addhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_raddhnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_raddhnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_raddhnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_raddhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_raddhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_raddhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve2_match_ppzz_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_match_ppzz_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 72dd36a5c8..dfcfab4bc0 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1324,6 +1324,8 @@ UQRSHRNT 01000101 .. 1 ..... 00 1111 ..... ..... @rd_rn_tszimm_shr ADDHNB 01000101 .. 1 ..... 011 000 ..... ..... @rd_rn_rm ADDHNT 01000101 .. 1 ..... 011 001 ..... ..... @rd_rn_rm +RADDHNB 01000101 .. 1 ..... 011 010 ..... ..... @rd_rn_rm +RADDHNT 01000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm ### SVE2 Character Match diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index df7413f9c9..8b450418c5 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2144,6 +2144,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ } #define DO_ADDHN(N, M, SH) ((N + M) >> SH) +#define DO_RADDHN(N, M, SH) ((N + M + ((__typeof(N))1 << (SH - 1))) >> SH) DO_BINOPNB(sve2_addhnb_h, uint16_t, uint8_t, 8, DO_ADDHN) DO_BINOPNB(sve2_addhnb_s, uint32_t, uint16_t, 16, DO_ADDHN) @@ -2153,6 +2154,15 @@ DO_BINOPNT(sve2_addhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_ADDHN) DO_BINOPNT(sve2_addhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_ADDHN) DO_BINOPNT(sve2_addhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_ADDHN) +DO_BINOPNB(sve2_raddhnb_h, uint16_t, uint8_t, 8, DO_RADDHN) +DO_BINOPNB(sve2_raddhnb_s, uint32_t, uint16_t, 16, DO_RADDHN) +DO_BINOPNB(sve2_raddhnb_d, uint64_t, uint32_t, 32, DO_RADDHN) + +DO_BINOPNT(sve2_raddhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_RADDHN) +DO_BINOPNT(sve2_raddhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_RADDHN) +DO_BINOPNT(sve2_raddhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_RADDHN) + +#undef DO_RADDHN #undef DO_ADDHN #undef DO_BINOPNB diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 442bf80b82..e7bf8cd9cc 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7474,6 +7474,8 @@ static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ DO_SVE2_ZZZ_NARROW(ADDHNB, addhnb) DO_SVE2_ZZZ_NARROW(ADDHNT, addhnt) +DO_SVE2_ZZZ_NARROW(RADDHNB, raddhnb) +DO_SVE2_ZZZ_NARROW(RADDHNT, raddhnt) static bool do_sve2_ppzz_flags(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_flags_4 *fn) From patchwork Tue May 25 01:03:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277505 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15F83C2B9F7 for ; Tue, 25 May 2021 01:44:33 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8E42B61417 for ; Tue, 25 May 2021 01:44:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8E42B61417 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:33126 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM7T-0000aJ-Mm for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:44:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54816) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLXp-0007n5-Ei for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:43 -0400 Received: from mail-pl1-x633.google.com ([2607:f8b0:4864:20::633]:37528) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXb-0003z6-1A for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:37 -0400 Received: by mail-pl1-x633.google.com with SMTP id u7so6943061plq.4 for ; Mon, 24 May 2021 18:07:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=nCF/nrD5OThI63JZjNx8Fj9vXwhS+t24cYrOO6DKU1E=; b=vEG6iI83wA+gwH4XanBvOHLC4sNnIb1aDYdejc026lQCAcKgTHX8O86esJrrpAFTsK g+kLkZoPKMSBeOjueaTDGjmsLNnX+9JQutDp7LZxzJ6lrR6gfjyct0bY8WdYUf2m+FLO H4b87KV6lWzMgVNTbcT4adcbZb7IBA2Buz3h3A8DlK/HlfE3h/H5+56VTRtD21sm8Kyc rU6TvQD5AUuE/FCRzfVQwq90owjrCetPfjjqKKcUeSsjIoK9w5NKGeen28b6PeatXj8E 1ibNFjwDL4eiawXDcx3VoM+kkAAiwD5OcCM9Gfl81ZavITu+Uiyv8OjU/QoQC3DTkCN8 JHAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=nCF/nrD5OThI63JZjNx8Fj9vXwhS+t24cYrOO6DKU1E=; b=LwKrwCd7fDM022Qnh3w7yQ8qHtBwSsLoC0xGpIIy4jVca6ueJCeC2uUoIKV27C/FLr Ug/iHHVjdMpQxZ+31NIY19LS6Sq7HykxZ+PrMySgrbiNFhm2SGHPNzCIzTfYqzhnNKdW ngmiDdSo2HveiEcqa4BeXN3BLDcD6DBGlx2rm7HbSQG5xXHNiX8ZCG50dBN1QnIzbDD2 uJP6t5IPjhImqga6Ytx0M31QeFkIvBNEn+0kZX3ESeKSpEkNHfBNCrYiEsJndS/pwirk fekxH0SYpUs3wLwxBub+U89cAvXPkMKgBJNx7m7VpHcky7o57e8+8z5meR6oCFO7ifGb HO0w== X-Gm-Message-State: AOAM531/oocq6wXZ/9EEr442G8d0VykF6UmQtfYo4D3eJYr1bdCOPPgp HVy3m8J08xhRWoazWXOBYGPvXRV90gmm9Q== X-Google-Smtp-Source: ABdhPJxo05FghcaQ5ULijSu15gXQTluifos/CdpMrhml87E6/lCa+4ADLO02VeeEj7n5eqvFKYcVvA== X-Received: by 2002:a17:90a:55c5:: with SMTP id o5mr27482147pjm.169.1621904844541; Mon, 24 May 2021 18:07:24 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:24 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 40/92] target/arm: Implement SVE2 SUBHNB, SUBHNT Date: Mon, 24 May 2021 18:03:06 -0700 Message-Id: <20210525010358.152808-41-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::633; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x633.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200417162231.10374-4-steplong@quicinc.com> Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 8 ++++++++ target/arm/sve.decode | 2 ++ target/arm/sve_helper.c | 10 ++++++++++ target/arm/translate-sve.c | 3 +++ 4 files changed, 23 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 8d95c87694..3642e7c820 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2525,6 +2525,14 @@ DEF_HELPER_FLAGS_4(sve2_raddhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_raddhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_raddhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_subhnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_subhnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_subhnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_subhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_subhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_subhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve2_match_ppzz_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_match_ppzz_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index dfcfab4bc0..c68bfcf6ed 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1326,6 +1326,8 @@ ADDHNB 01000101 .. 1 ..... 011 000 ..... ..... @rd_rn_rm ADDHNT 01000101 .. 1 ..... 011 001 ..... ..... @rd_rn_rm RADDHNB 01000101 .. 1 ..... 011 010 ..... ..... @rd_rn_rm RADDHNT 01000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm +SUBHNB 01000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm +SUBHNT 01000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm ### SVE2 Character Match diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 8b450418c5..922df9575a 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2145,6 +2145,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ #define DO_ADDHN(N, M, SH) ((N + M) >> SH) #define DO_RADDHN(N, M, SH) ((N + M + ((__typeof(N))1 << (SH - 1))) >> SH) +#define DO_SUBHN(N, M, SH) ((N - M) >> SH) DO_BINOPNB(sve2_addhnb_h, uint16_t, uint8_t, 8, DO_ADDHN) DO_BINOPNB(sve2_addhnb_s, uint32_t, uint16_t, 16, DO_ADDHN) @@ -2162,6 +2163,15 @@ DO_BINOPNT(sve2_raddhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_RADDHN) DO_BINOPNT(sve2_raddhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_RADDHN) DO_BINOPNT(sve2_raddhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_RADDHN) +DO_BINOPNB(sve2_subhnb_h, uint16_t, uint8_t, 8, DO_SUBHN) +DO_BINOPNB(sve2_subhnb_s, uint32_t, uint16_t, 16, DO_SUBHN) +DO_BINOPNB(sve2_subhnb_d, uint64_t, uint32_t, 32, DO_SUBHN) + +DO_BINOPNT(sve2_subhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_SUBHN) +DO_BINOPNT(sve2_subhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_SUBHN) +DO_BINOPNT(sve2_subhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_SUBHN) + +#undef DO_SUBHN #undef DO_RADDHN #undef DO_ADDHN diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e7bf8cd9cc..334c57b44f 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7477,6 +7477,9 @@ DO_SVE2_ZZZ_NARROW(ADDHNT, addhnt) DO_SVE2_ZZZ_NARROW(RADDHNB, raddhnb) DO_SVE2_ZZZ_NARROW(RADDHNT, raddhnt) +DO_SVE2_ZZZ_NARROW(SUBHNB, subhnb) +DO_SVE2_ZZZ_NARROW(SUBHNT, subhnt) + static bool do_sve2_ppzz_flags(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_flags_4 *fn) { From patchwork Tue May 25 01:03:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277485 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC75FC2B9F7 for ; Tue, 25 May 2021 01:36:47 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2EAD260FE3 for ; Tue, 25 May 2021 01:36:47 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2EAD260FE3 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:38346 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llLzy-00023l-BD for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:36:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54852) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLXr-0007oD-W8 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:44 -0400 Received: from mail-pg1-x530.google.com ([2607:f8b0:4864:20::530]:44810) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXb-0003zV-5m for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:43 -0400 Received: by mail-pg1-x530.google.com with SMTP id 29so10305387pgu.11 for ; Mon, 24 May 2021 18:07:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=WNvi/8XipqAB06HT1oOFYrRXluBK7WBz7cq2x3atlxo=; b=k+x4ZPDo5A5XB76ECuUnIkSdTI/TQleEU4lv6KzdmIOp66rL88Fp48zCnM40ecCv1E tIDDmki/7REd8+xx9HpOZ6K1mvoQYHX7BqDkO8IpjPJ+3pfJopKhtUz/QmUCBNYK7UWt H10E9FuNX6riRvYBs/5sUu/lPjWgLZzBg9G0rLIDPKTtGIOHHR7WYdnsus/k6EBJbUVe +qOA5X6R3DNKhEDcNAzuRtV9oRtSegtApNIekmez3YBwcnHiF5qrTABbKSojz3uuNryf DFXzX7wng4WvuDKfrXpLt4r/7kj00OkC8efLdzfeFg8sa/8lQjO+Pqdmzyw9y4v9mEgq 8g6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=WNvi/8XipqAB06HT1oOFYrRXluBK7WBz7cq2x3atlxo=; b=mTLzY+tsRQWnH0zEOfg/vxBYMIvcitqDk5+PsxTbp8QDOwiP9DR4yuqx9We5frVgCf 7oWueoiU6+9boS5A1k/JdPAT4D4CB9zD+TbczpnDtulPhW7gxLB1SJpqNcyaZx5NBmls ZkXyejxqS/A2MZmrs2NSrT+nUxM9ZYmmn2p8tCOUxYxbaJQmVE6zS0Lh/chz6jG4TsKO lbtKF3Tcj1K1f3R2wkZwXPHxZQQmCkeFIbv7HB96zcOvn/S+yRDyPQJYV4LSVUUHzrkp I2Alt6VoYLbGB94jRWLL1oGASGp25l7HvCkmJNPk+YhdSpTKAvm1+KAMSQTjLHKl/aII NaPA== X-Gm-Message-State: AOAM532zVMK6YbaKatmQnJVHWz9jegACGX7zxQzhssYpJ9Smo0FP+wOO i+SMBsIeC+F44ddzamiBb3b3R8o26ujJhg== X-Google-Smtp-Source: ABdhPJw4ij4zVHJfdgHr/s70Ru8W7wkWG6gfB59x7cJp77kfFm10xHPiUieXLQnbSQrHPJvUYEbgQw== X-Received: by 2002:a63:7703:: with SMTP id s3mr16386215pgc.339.1621904845161; Mon, 24 May 2021 18:07:25 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:24 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 41/92] target/arm: Implement SVE2 RSUBHNB, RSUBHNT Date: Mon, 24 May 2021 18:03:07 -0700 Message-Id: <20210525010358.152808-42-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::530; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x530.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long This completes the section 'SVE2 integer add/subtract narrow high part' Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200417162231.10374-5-steplong@quicinc.com> Signed-off-by: Richard Henderson --- v2: Fix round bit type (laurent desnogues) --- target/arm/helper-sve.h | 8 ++++++++ target/arm/sve.decode | 2 ++ target/arm/sve_helper.c | 10 ++++++++++ target/arm/translate-sve.c | 2 ++ 4 files changed, 22 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 3642e7c820..98e6b57e38 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2533,6 +2533,14 @@ DEF_HELPER_FLAGS_4(sve2_subhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_subhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_subhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_rsubhnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_rsubhnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_rsubhnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_rsubhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_rsubhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_rsubhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve2_match_ppzz_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_match_ppzz_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index c68bfcf6ed..388bf92acf 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1328,6 +1328,8 @@ RADDHNB 01000101 .. 1 ..... 011 010 ..... ..... @rd_rn_rm RADDHNT 01000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm SUBHNB 01000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm SUBHNT 01000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm +RSUBHNB 01000101 .. 1 ..... 011 110 ..... ..... @rd_rn_rm +RSUBHNT 01000101 .. 1 ..... 011 111 ..... ..... @rd_rn_rm ### SVE2 Character Match diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 922df9575a..891f6ff453 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2146,6 +2146,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ #define DO_ADDHN(N, M, SH) ((N + M) >> SH) #define DO_RADDHN(N, M, SH) ((N + M + ((__typeof(N))1 << (SH - 1))) >> SH) #define DO_SUBHN(N, M, SH) ((N - M) >> SH) +#define DO_RSUBHN(N, M, SH) ((N - M + ((__typeof(N))1 << (SH - 1))) >> SH) DO_BINOPNB(sve2_addhnb_h, uint16_t, uint8_t, 8, DO_ADDHN) DO_BINOPNB(sve2_addhnb_s, uint32_t, uint16_t, 16, DO_ADDHN) @@ -2171,6 +2172,15 @@ DO_BINOPNT(sve2_subhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_SUBHN) DO_BINOPNT(sve2_subhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_SUBHN) DO_BINOPNT(sve2_subhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_SUBHN) +DO_BINOPNB(sve2_rsubhnb_h, uint16_t, uint8_t, 8, DO_RSUBHN) +DO_BINOPNB(sve2_rsubhnb_s, uint32_t, uint16_t, 16, DO_RSUBHN) +DO_BINOPNB(sve2_rsubhnb_d, uint64_t, uint32_t, 32, DO_RSUBHN) + +DO_BINOPNT(sve2_rsubhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_RSUBHN) +DO_BINOPNT(sve2_rsubhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_RSUBHN) +DO_BINOPNT(sve2_rsubhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_RSUBHN) + +#undef DO_RSUBHN #undef DO_SUBHN #undef DO_RADDHN #undef DO_ADDHN diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 334c57b44f..484d4218b5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7479,6 +7479,8 @@ DO_SVE2_ZZZ_NARROW(RADDHNT, raddhnt) DO_SVE2_ZZZ_NARROW(SUBHNB, subhnb) DO_SVE2_ZZZ_NARROW(SUBHNT, subhnt) +DO_SVE2_ZZZ_NARROW(RSUBHNB, rsubhnb) +DO_SVE2_ZZZ_NARROW(RSUBHNT, rsubhnt) static bool do_sve2_ppzz_flags(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_flags_4 *fn) From patchwork Tue May 25 01:03:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277519 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C1470C2B9F7 for ; Tue, 25 May 2021 01:49:36 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 596286141B for ; Tue, 25 May 2021 01:49:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 596286141B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:50752 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMCN-0003qP-F5 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:49:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54926) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLXs-0007oZ-8v for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:44 -0400 Received: from mail-pf1-x42c.google.com ([2607:f8b0:4864:20::42c]:40477) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXb-00040G-5c for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:43 -0400 Received: by mail-pf1-x42c.google.com with SMTP id x188so22273002pfd.7 for ; Mon, 24 May 2021 18:07:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=B/2ZbcN0Aw9g5MR4U6W9E3jK4XyfrMoXAdTmeD7ETMA=; b=bwJc4rgFgy2gC1QUcZFwpymtwPyNaDfm+XCFjyYuOylWA9DVjxgeKZL7NojWz4yOhT 90v4RCPDdAbnlsd3Vcxtz5rYpMCG3nL+HwCwKBupAxf1YXxcpKi3NMedyLu4ns+2HRqo +mLWwdvR0gl1WmskhW9qxbI7s6Lyik79mLG8yW37pGamsdgrFSQ3jDLF03mm3eRa4Uy3 2+SPr515F+VnfVVojNx/RuDmL/y0pLdQv9kBgvGtVlJYOI3UYe9W5BXFcRAbNakofT9u 9qRGdoalrMedZaYxS/5N94dyCWYED21QaLxKIzA5klIN2JE60FF+YgEcOHataWOVBFyl a9ug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=B/2ZbcN0Aw9g5MR4U6W9E3jK4XyfrMoXAdTmeD7ETMA=; b=RhMpdJHS0wlLbAhi0WEcCZFKHYxMdXqplMEcqa0WlTrNxGobrvr9WHcqwJmvRs4W6T hODMRiIbJT2w1611ig+qTKDCJE32MSDBqVyZ8vw7BskXcmCuBwULhpaaiYFD5ZwRhb4D pl+B3NDBdgrHaQUAzPLohanHXTZ1rX2F4XiiQnduHYHq5K7oHnJopOaMiV4Kec2/zKzb eH+ZIjiHpHBo21aIWOdyt5XWvLNoTcab/NDSOev8lV5sUhXTDVpJND60hxGfUv84QJcc E9Bct/18fKFlkSw5sZ02iIotAdUSRt0Mo068BLv8Gu1STB8HYP1lSEPEi4cHytJ6Fxf9 CH4Q== X-Gm-Message-State: AOAM532fel+mpDA+UTgAta7JssiAf0yaJvPrvMRVJPvhANA+lBMDOpko 9UtitKZZIKEkdo1Xv1xlqH6AD8/WgLoH8Q== X-Google-Smtp-Source: ABdhPJyuZNcFfAlzyf91bRx4ZGZ70i4YQfl9E4BjEyDuiwVXZuhvVsOKRdHUg1nGVDGegD2tmtk2mA== X-Received: by 2002:aa7:9a8e:0:b029:2e7:e3fd:4fa4 with SMTP id w14-20020aa79a8e0000b02902e7e3fd4fa4mr12358002pfi.63.1621904845779; Mon, 24 May 2021 18:07:25 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:25 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 42/92] target/arm: Implement SVE2 HISTCNT, HISTSEG Date: Mon, 24 May 2021 18:03:08 -0700 Message-Id: <20210525010358.152808-43-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42c; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200416173109.8856-1-steplong@quicinc.com> Signed-off-by: Richard Henderson --- v2: Fix overlap between output and input vectors. v4: Fix histseg counting (zhiwei). --- target/arm/helper-sve.h | 7 ++ target/arm/sve.decode | 6 ++ target/arm/sve_helper.c | 131 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 19 ++++++ 4 files changed, 163 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 98e6b57e38..507a2fea8e 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2551,6 +2551,13 @@ DEF_HELPER_FLAGS_5(sve2_nmatch_ppzz_b, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_nmatch_ppzz_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_histcnt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_histcnt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_histseg, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 388bf92acf..8f501a083c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -146,6 +146,7 @@ &rprrr_esz rn=%reg_movprfx @rdn_pg_rm_ra ........ esz:2 . ra:5 ... pg:3 rm:5 rd:5 \ &rprrr_esz rn=%reg_movprfx +@rd_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 &rprr_esz # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @@ -1336,6 +1337,11 @@ RSUBHNT 01000101 .. 1 ..... 011 111 ..... ..... @rd_rn_rm MATCH 01000101 .. 1 ..... 100 ... ..... 0 .... @pd_pg_rn_rm NMATCH 01000101 .. 1 ..... 100 ... ..... 1 .... @pd_pg_rn_rm +### SVE2 Histogram Computation + +HISTCNT 01000101 .. 1 ..... 110 ... ..... ..... @rd_pg_rn_rm +HISTSEG 01000101 .. 1 ..... 101 000 ..... ..... @rd_rn_rm + ## SVE2 floating-point pairwise operations FADDP 01100100 .. 010 00 0 100 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 891f6ff453..662ed80b1c 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -7071,3 +7071,134 @@ DO_PPZZ_MATCH(sve2_nmatch_ppzz_b, MO_8, true) DO_PPZZ_MATCH(sve2_nmatch_ppzz_h, MO_16, true) #undef DO_PPZZ_MATCH + +void HELPER(sve2_histcnt_s)(void *vd, void *vn, void *vm, void *vg, + uint32_t desc) +{ + ARMVectorReg scratch; + intptr_t i, j; + intptr_t opr_sz = simd_oprsz(desc); + uint32_t *d = vd, *n = vn, *m = vm; + uint8_t *pg = vg; + + if (d == n) { + n = memcpy(&scratch, n, opr_sz); + if (d == m) { + m = n; + } + } else if (d == m) { + m = memcpy(&scratch, m, opr_sz); + } + + for (i = 0; i < opr_sz; i += 4) { + uint64_t count = 0; + uint8_t pred; + + pred = pg[H1(i >> 3)] >> (i & 7); + if (pred & 1) { + uint32_t nn = n[H4(i >> 2)]; + + for (j = 0; j <= i; j += 4) { + pred = pg[H1(j >> 3)] >> (j & 7); + if ((pred & 1) && nn == m[H4(j >> 2)]) { + ++count; + } + } + } + d[H4(i >> 2)] = count; + } +} + +void HELPER(sve2_histcnt_d)(void *vd, void *vn, void *vm, void *vg, + uint32_t desc) +{ + ARMVectorReg scratch; + intptr_t i, j; + intptr_t opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + uint8_t *pg = vg; + + if (d == n) { + n = memcpy(&scratch, n, opr_sz); + if (d == m) { + m = n; + } + } else if (d == m) { + m = memcpy(&scratch, m, opr_sz); + } + + for (i = 0; i < opr_sz / 8; ++i) { + uint64_t count = 0; + if (pg[H1(i)] & 1) { + uint64_t nn = n[i]; + for (j = 0; j <= i; ++j) { + if ((pg[H1(j)] & 1) && nn == m[j]) { + ++count; + } + } + } + d[i] = count; + } +} + +/* + * Returns the number of bytes in m0 and m1 that match n. + * Unlike do_match2 we don't just need true/false, we need an exact count. + * This requires two extra logical operations. + */ +static inline uint64_t do_histseg_cnt(uint8_t n, uint64_t m0, uint64_t m1) +{ + const uint64_t mask = dup_const(MO_8, 0x7f); + uint64_t cmp0, cmp1; + + cmp1 = dup_const(MO_8, n); + cmp0 = cmp1 ^ m0; + cmp1 = cmp1 ^ m1; + + /* + * 1: clear msb of each byte to avoid carry to next byte (& mask) + * 2: carry in to msb if byte != 0 (+ mask) + * 3: set msb if cmp has msb set (| cmp) + * 4: set ~msb to ignore them (| mask) + * We now have 0xff for byte != 0 or 0x7f for byte == 0. + * 5: invert, resulting in 0x80 if and only if byte == 0. + */ + cmp0 = ~(((cmp0 & mask) + mask) | cmp0 | mask); + cmp1 = ~(((cmp1 & mask) + mask) | cmp1 | mask); + + /* + * Combine the two compares in a way that the bits do + * not overlap, and so preserves the count of set bits. + * If the host has an efficient instruction for ctpop, + * then ctpop(x) + ctpop(y) has the same number of + * operations as ctpop(x | (y >> 1)). If the host does + * not have an efficient ctpop, then we only want to + * use it once. + */ + return ctpop64(cmp0 | (cmp1 >> 1)); +} + +void HELPER(sve2_histseg)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j; + intptr_t opr_sz = simd_oprsz(desc); + + for (i = 0; i < opr_sz; i += 16) { + uint64_t n0 = *(uint64_t *)(vn + i); + uint64_t m0 = *(uint64_t *)(vm + i); + uint64_t n1 = *(uint64_t *)(vn + i + 8); + uint64_t m1 = *(uint64_t *)(vm + i + 8); + uint64_t out0 = 0; + uint64_t out1 = 0; + + for (j = 0; j < 64; j += 8) { + uint64_t cnt0 = do_histseg_cnt(n0 >> j, m0, m1); + uint64_t cnt1 = do_histseg_cnt(n1 >> j, m0, m1); + out0 |= cnt0 << j; + out1 |= cnt1 << j; + } + + *(uint64_t *)(vd + i) = out0; + *(uint64_t *)(vd + i + 8) = out1; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 484d4218b5..13f84d14d3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7504,6 +7504,25 @@ static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a) \ DO_SVE2_PPZZ_MATCH(MATCH, match) DO_SVE2_PPZZ_MATCH(NMATCH, nmatch) +static bool trans_HISTCNT(DisasContext *s, arg_rprr_esz *a) +{ + static gen_helper_gvec_4 * const fns[2] = { + gen_helper_sve2_histcnt_s, gen_helper_sve2_histcnt_d + }; + if (a->esz < 2) { + return false; + } + return do_sve2_zpzz_ool(s, a, fns[a->esz - 2]); +} + +static bool trans_HISTSEG(DisasContext *s, arg_rrr_esz *a) +{ + if (a->esz != 0) { + return false; + } + return do_sve2_zzz_ool(s, a, gen_helper_sve2_histseg); +} + static bool do_sve2_zpzz_fp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { From patchwork Tue May 25 01:03:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277493 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 873D4C2B9F8 for ; Tue, 25 May 2021 01:40:36 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0EFA061413 for ; Tue, 25 May 2021 01:40:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0EFA061413 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:48370 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM3f-0000Jv-4h for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:40:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55046) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLXu-0007sE-Af for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:46 -0400 Received: from mail-pf1-x42d.google.com ([2607:f8b0:4864:20::42d]:38908) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXc-00040N-4x for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:46 -0400 Received: by mail-pf1-x42d.google.com with SMTP id e17so11674229pfl.5 for ; Mon, 24 May 2021 18:07:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=I8/Dcr1RlD62cWLa9TUHbYmDfTs7MOj8zYzO3BzWCao=; b=Vl/DRBDtjyAvgDJzLlhjjpbrvCF1MqIviLVwNgvZllcHh/J4xBQMw4bxI+MvwNdBFp Tkg5pvTN0BRBEWpVrOsTCSkWs/C6RnjO8J1NZeOCjSVeX77p2y1CSzxkfoAY/0Xf3nfF 3TQc+RVpshj4iNsYXEZhqABiSUGyCFM3MLx41F9Q0h7sqlC5vhmN6qLGZmstG2dj9kav f2D0L49ei9qh8S9CdltxZIaK+DWrEpCHhYqQFZySupRY9PNZffAuzRFPbE62NASTf0Yk NLZ030vASyXupMMBPPIj/Jq7Vr3VpJJV1rAPJ6SrOPw01oryQQoujVNJ/8i/ZnIfFqJe Ky6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=I8/Dcr1RlD62cWLa9TUHbYmDfTs7MOj8zYzO3BzWCao=; b=SvYRM0qf+lAczhOJo0Lsld5XlNUeHDKfNu8hTy8b5+uqmDb6dGVbXfAveocErxroUB gbUQULEAcw2AmCu+n7c6XAaOP2hMxdvkoAOt8daqG36DhtqKhRV56gLHBVJNc+ecOABX LTndONssUoeZPHf5L+AbmOsWm4lvvgv55VV4ZrkC/O0BTIDHK5jgQgXFYxyloqPuxABX yqKtXJ9tVEvW5KMxGUrMVs/tGVqCcWLgh7LQpBX9M5BfHzwHgVPtFlMWvVbkJ9fijit3 glntrJROyli7J/E747XANvFixzSrWYskf5bCN3opQqi5QNgzmxfHSw1QfrMcxBl+VqmC ryhw== X-Gm-Message-State: AOAM530cy5+wTeZurk6kFld6aCDrt1cjEbXxxAjeXDYet804CSc2ZzSZ d6ZeaZI8bAq0xNYAPlBOC291eu2MzioEyA== X-Google-Smtp-Source: ABdhPJzWHefMK0k0QECdhnE5t30KzR1OjEYvsvEyMPiAliFCj/SdcZiQDzjn7fEN+ldiCGo8hSC2DA== X-Received: by 2002:a63:b507:: with SMTP id y7mr8934679pge.74.1621904846375; Mon, 24 May 2021 18:07:26 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:26 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 43/92] target/arm: Implement SVE2 XAR Date: Mon, 24 May 2021 18:03:09 -0700 Message-Id: <20210525010358.152808-44-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42d; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" In addition, use the same vector generator interface for AdvSIMD. This fixes a bug in which the AdvSIMD insn failed to clear the high bits of the SVE register. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 4 ++ target/arm/helper.h | 2 + target/arm/translate-a64.h | 3 ++ target/arm/sve.decode | 4 ++ target/arm/sve_helper.c | 39 ++++++++++++++ target/arm/translate-a64.c | 25 ++------- target/arm/translate-sve.c | 104 +++++++++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 12 +++++ 8 files changed, 172 insertions(+), 21 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 507a2fea8e..28b8f00201 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2558,6 +2558,10 @@ DEF_HELPER_FLAGS_5(sve2_histcnt_d, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_4(sve2_histseg, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_xar_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_xar_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_xar_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/helper.h b/target/arm/helper.h index 6bb0b0ddc0..23a7ec5638 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -953,6 +953,8 @@ DEF_HELPER_FLAGS_5(neon_sqrdmulh_h, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(neon_sqrdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_xar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h index 89437276e7..58f50abca4 100644 --- a/target/arm/translate-a64.h +++ b/target/arm/translate-a64.h @@ -120,5 +120,8 @@ bool disas_sve(DisasContext *, uint32_t); void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, int64_t shift, + uint32_t opr_sz, uint32_t max_sz); #endif /* TARGET_ARM_TRANSLATE_A64_H */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 8f501a083c..7645587469 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -65,6 +65,7 @@ &rr_dbm rd rn dbm &rrri rd rn rm imm &rri_esz rd rn imm esz +&rrri_esz rd rn rm imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz &rpr_s rd pg rn s @@ -384,6 +385,9 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 +XAR 00000100 .. 1 ..... 001 101 rm:5 rd:5 &rrri_esz \ + rn=%reg_movprfx esz=%tszimm16_esz imm=%tszimm16_shr + # SVE2 bitwise ternary operations EOR3 00000100 00 1 ..... 001 110 ..... ..... @rdn_ra_rm_e0 BSL 00000100 00 1 ..... 001 111 ..... ..... @rdn_ra_rm_e0 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 662ed80b1c..5b6292929e 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -7202,3 +7202,42 @@ void HELPER(sve2_histseg)(void *vd, void *vn, void *vm, uint32_t desc) *(uint64_t *)(vd + i + 8) = out1; } } + +void HELPER(sve2_xar_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + int shr = simd_data(desc); + int shl = 8 - shr; + uint64_t mask = dup_const(MO_8, 0xff >> shr); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + uint64_t t = n[i] ^ m[i]; + d[i] = ((t >> shr) & mask) | ((t << shl) & ~mask); + } +} + +void HELPER(sve2_xar_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + int shr = simd_data(desc); + int shl = 16 - shr; + uint64_t mask = dup_const(MO_16, 0xffff >> shr); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + uint64_t t = n[i] ^ m[i]; + d[i] = ((t >> shr) & mask) | ((t << shl) & ~mask); + } +} + +void HELPER(sve2_xar_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 4; + int shr = simd_data(desc); + uint32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ror32(n[i] ^ m[i], shr); + } +} diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 0c80d0b505..bdd47208b1 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -14349,8 +14349,6 @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn) int imm6 = extract32(insn, 10, 6); int rn = extract32(insn, 5, 5); int rd = extract32(insn, 0, 5); - TCGv_i64 tcg_op1, tcg_op2, tcg_res[2]; - int pass; if (!dc_isar_feature(aa64_sha3, s)) { unallocated_encoding(s); @@ -14361,25 +14359,10 @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn) return; } - tcg_op1 = tcg_temp_new_i64(); - tcg_op2 = tcg_temp_new_i64(); - tcg_res[0] = tcg_temp_new_i64(); - tcg_res[1] = tcg_temp_new_i64(); - - for (pass = 0; pass < 2; pass++) { - read_vec_element(s, tcg_op1, rn, pass, MO_64); - read_vec_element(s, tcg_op2, rm, pass, MO_64); - - tcg_gen_xor_i64(tcg_res[pass], tcg_op1, tcg_op2); - tcg_gen_rotri_i64(tcg_res[pass], tcg_res[pass], imm6); - } - write_vec_element(s, tcg_res[0], rd, 0, MO_64); - write_vec_element(s, tcg_res[1], rd, 1, MO_64); - - tcg_temp_free_i64(tcg_op1); - tcg_temp_free_i64(tcg_op2); - tcg_temp_free_i64(tcg_res[0]); - tcg_temp_free_i64(tcg_res[1]); + gen_gvec_xar(MO_64, vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), imm6, 16, + vec_full_reg_size(s)); } /* Crypto three-reg imm2 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 13f84d14d3..ba39ff84a5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -340,6 +340,110 @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a) return do_zzz_fn(s, a, tcg_gen_gvec_andc); } +static void gen_xar8_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + uint64_t mask = dup_const(MO_8, 0xff >> sh); + + tcg_gen_xor_i64(t, n, m); + tcg_gen_shri_i64(d, t, sh); + tcg_gen_shli_i64(t, t, 8 - sh); + tcg_gen_andi_i64(d, d, mask); + tcg_gen_andi_i64(t, t, ~mask); + tcg_gen_or_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_xar16_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + uint64_t mask = dup_const(MO_16, 0xffff >> sh); + + tcg_gen_xor_i64(t, n, m); + tcg_gen_shri_i64(d, t, sh); + tcg_gen_shli_i64(t, t, 16 - sh); + tcg_gen_andi_i64(d, d, mask); + tcg_gen_andi_i64(t, t, ~mask); + tcg_gen_or_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_xar_i32(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, int32_t sh) +{ + tcg_gen_xor_i32(d, n, m); + tcg_gen_rotri_i32(d, d, sh); +} + +static void gen_xar_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + tcg_gen_xor_i64(d, n, m); + tcg_gen_rotri_i64(d, d, sh); +} + +static void gen_xar_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, int64_t sh) +{ + tcg_gen_xor_vec(vece, d, n, m); + tcg_gen_rotri_vec(vece, d, d, sh); +} + +void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, int64_t shift, + uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop[] = { INDEX_op_rotli_vec, 0 }; + static const GVecGen3i ops[4] = { + { .fni8 = gen_xar8_i64, + .fniv = gen_xar_vec, + .fno = gen_helper_sve2_xar_b, + .opt_opc = vecop, + .vece = MO_8 }, + { .fni8 = gen_xar16_i64, + .fniv = gen_xar_vec, + .fno = gen_helper_sve2_xar_h, + .opt_opc = vecop, + .vece = MO_16 }, + { .fni4 = gen_xar_i32, + .fniv = gen_xar_vec, + .fno = gen_helper_sve2_xar_s, + .opt_opc = vecop, + .vece = MO_32 }, + { .fni8 = gen_xar_i64, + .fniv = gen_xar_vec, + .fno = gen_helper_gvec_xar_d, + .opt_opc = vecop, + .vece = MO_64 } + }; + int esize = 8 << vece; + + /* The SVE2 range is 1 .. esize; the AdvSIMD range is 0 .. esize-1. */ + tcg_debug_assert(shift >= 0); + tcg_debug_assert(shift <= esize); + shift &= esize - 1; + + if (shift == 0) { + /* xar with no rotate devolves to xor. */ + tcg_gen_gvec_xor(vece, rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_3i(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, + shift, &ops[vece]); + } +} + +static bool trans_XAR(DisasContext *s, arg_rrri_esz *a) +{ + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + gen_gvec_xar(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), a->imm, vsz, vsz); + } + return true; +} + static bool do_sve2_zzzz_fn(DisasContext *s, arg_rrrr_esz *a, GVecGen4Fn *fn) { if (!dc_isar_feature(aa64_sve2, s)) { diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 19006f50f7..a3d80ecad0 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -2253,3 +2253,15 @@ void HELPER(gvec_umulh_d)(void *vd, void *vn, void *vm, uint32_t desc) } clear_tail(d, opr_sz, simd_maxsz(desc)); } + +void HELPER(gvec_xar_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + int shr = simd_data(desc); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ror64(n[i] ^ m[i], shr); + } + clear_tail(d, opr_sz * 8, simd_maxsz(desc)); +} From patchwork Tue May 25 01:03:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277495 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E89DC2B9F7 for ; Tue, 25 May 2021 01:40:46 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D04316141B for ; Tue, 25 May 2021 01:40:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D04316141B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:49406 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM3o-00011B-WD for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:40:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55060) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLXu-0007t4-Ia for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:47 -0400 Received: from mail-pg1-x534.google.com ([2607:f8b0:4864:20::534]:40748) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXd-00040r-8r for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:46 -0400 Received: by mail-pg1-x534.google.com with SMTP id j12so21402660pgh.7 for ; Mon, 24 May 2021 18:07:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=TTEy8kE9jF34vkk9pbuNVU/WzIkyybKzFSfIHyfqAxE=; b=U9Pl1I9buOSt4dI/AIC1yQ9vQgLkUUIOLdXcPfGtIEKdAnCRi9iPDZWo8foGVjDHsm 6ZFVUL8yooc6sXx3U1wAds9usnazcNWKDiKeqmbPM7dVDr0h0Rv6Tio+8ljA/sr6bJ+r QTJx2dcBd8/2gft+ZEU9PsZwvHxlGg9lHEqJkB8OYYnc35eF1vPyHlZqd9wce1bMhnsA f+iDvW6wxkU2WiPeb4VssTY3UAOmgksXi8uFy9jbs5eFhzAn8keyRdRP+OO2qW9d+tgS BSchvWNbxvVxUFASO+XzrewwbW1HEgBjkfUpS6hBYpZY6iddLvKgJNJSB5/40Bntoz0K XNXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=TTEy8kE9jF34vkk9pbuNVU/WzIkyybKzFSfIHyfqAxE=; b=PF5FL4N1wx1VDs2pPi5LfPfdpCS1wUVRRWS2OuA8ozhFVmo0J1V650AcyUTcOCFr/u vO+xRc1i2DmhmxBX+z87BXiIAJ4dGsvUKRpRSlVVMlpwAeHMpx0OvwOnnV3ohDnZkYWx lmazJIxAFubJUj68WMZyepEBCDKM+L0v1VkII9v1rVLQg3Eud5CnD659TybwQtcsa25K iD73cbeH57slCjoFkyi0cPcuL4Q5wlRLtRhCM4seoMKYDPjmXlsxImKvnncGssfT/TLC D4LMNKv1t/1CY7+BoFgl2gCc+oEckT60hkuS//ddlZTRl7mS4k9G4rpNwkLrvVSBo0ZZ U6Rw== X-Gm-Message-State: AOAM533UWhLQ1n9w3Tu/ieXcm/vyLu/rNy6nekjjVFNw865ueidIZvHQ AFZoudbJl+2SJSonehw4IfUqIj8z6Bz5Uw== X-Google-Smtp-Source: ABdhPJyvfUKkbD0DnxAvtTqtx60+cuuxqro7ZAnKQFKjouS58ZUNUvvoCvyyyUyG3VKxCIHiKWIw2w== X-Received: by 2002:a65:420d:: with SMTP id c13mr16268095pgq.370.1621904847042; Mon, 24 May 2021 18:07:27 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:26 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 44/92] target/arm: Implement SVE2 scatter store insns Date: Mon, 24 May 2021 18:03:10 -0700 Message-Id: <20210525010358.152808-45-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::534; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x534.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Add decoding logic for SVE2 64-bit/32-bit scatter non-temporal store insns. 64-bit * STNT1B (vector plus scalar) * STNT1H (vector plus scalar) * STNT1W (vector plus scalar) * STNT1D (vector plus scalar) 32-bit * STNT1B (vector plus scalar) * STNT1H (vector plus scalar) * STNT1W (vector plus scalar) Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200422141553.8037-1-steplong@quicinc.com> Signed-off-by: Richard Henderson --- target/arm/sve.decode | 10 ++++++++++ target/arm/translate-sve.c | 8 ++++++++ 2 files changed, 18 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7645587469..5cfe6df0d2 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1388,3 +1388,13 @@ UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm CMLA_zzzz 01000100 esz:2 0 rm:5 0010 rot:2 rn:5 rd:5 ra=%reg_movprfx SQRDCMLAH_zzzz 01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5 ra=%reg_movprfx + +### SVE2 Memory Store Group + +# SVE2 64-bit scatter non-temporal store (vector plus scalar) +STNT1_zprz 1110010 .. 00 ..... 001 ... ..... ..... \ + @rprr_scatter_store xs=2 esz=3 scale=0 + +# SVE2 32-bit scatter non-temporal store (vector plus scalar) +STNT1_zprz 1110010 .. 10 ..... 001 ... ..... ..... \ + @rprr_scatter_store xs=0 esz=2 scale=0 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ba39ff84a5..ac43bb02be 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6167,6 +6167,14 @@ static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a) return true; } +static bool trans_STNT1_zprz(DisasContext *s, arg_ST1_zprz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return trans_ST1_zprz(s, a); +} + /* * Prefetches */ From patchwork Tue May 25 01:03:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277551 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 345FCC2B9F7 for ; Tue, 25 May 2021 02:00:47 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C628E61209 for ; Tue, 25 May 2021 02:00:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C628E61209 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:33840 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMNB-0004jr-TL for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:00:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55338) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYH-0008Sv-FR for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:10 -0400 Received: from mail-pf1-x430.google.com ([2607:f8b0:4864:20::430]:40481) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXr-00041I-UJ for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:09 -0400 Received: by mail-pf1-x430.google.com with SMTP id x188so22273057pfd.7 for ; Mon, 24 May 2021 18:07:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=xVLmkte3/X8WeOPRHb0OAi+AOLu5ah+uxiTRZR828b8=; b=dXsAbzQfuSoNAh60M8w9PqGejS9pb7zhvdt32oQ/WPbJWAuhv3VrtNHI+ds1EjpupQ sVJJen/zU1Nc3woOC/mgSFb7LLr331ce1IN7rzJUZg5m3uM9hG2ameTNoy/Nb4O3IhzP d/CbZQ42iQPwI5dK7anIOHGVdIT0BZvShOzfjnSUa1NNRg9jQ6CZT72XLOqbncW2ynHQ F0NTRGcUaDwT1KxLU9f892dqhS2ORHQ6c4OLqgM871RiKuLZQQmZKTR43osmQ7N5pzeO PJerB6ArtyO95t+2pow9mu19ncmMIF2ahKlRO5qn6D0S/03sDCbIN9TK4n+SOZnLkLE/ E9dw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=xVLmkte3/X8WeOPRHb0OAi+AOLu5ah+uxiTRZR828b8=; b=MGeazf0fqnOKATp4qJxgKZYPeUDCAg6nZHMb/AUcC88wLmyzfvqVodNGTBO7yNB9aj pMfhDY7esX2oPnNIJMeUsptqs6JZq4m8EN4ce9HlspauOWI/g0szaKbrjCl21cen+Tee Tc5ow4dWqkhrj06WQD8fQTYnnEBcq+a/+BFN3+fHxy5vsbdmnXfV/Q9U1ukQyI35AfDk kTzA9Rlp66OsSoLL752tB8uIyRnuN3jcewsU8BUMLcq//dONVOOT0nVwmdn7NZtQDWPy 95nNnVLVVVL8v6L94V7CV7MK+GCMMQy6BsDCJjgU+HwgugfsiGzWPdC07oW0b5FHB4TK UmjA== X-Gm-Message-State: AOAM533+jydKO/cSejoqj4pFivU5UgvoOn6ElWoVJIkKo5SQ9w5R6lhY AYXwjl8Agtyfq6/BynCkUZlLCw3Csri0zA== X-Google-Smtp-Source: ABdhPJzaLJxhBvuGJowEbajCVmqar8VTQ3F5lEgJvxfij5QV8rWgXHiTShLWJPP8VacqMkcIlL7n+A== X-Received: by 2002:a62:2743:0:b029:2d5:897d:7481 with SMTP id n64-20020a6227430000b02902d5897d7481mr27304397pfn.46.1621904847593; Mon, 24 May 2021 18:07:27 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:27 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 45/92] target/arm: Implement SVE2 gather load insns Date: Mon, 24 May 2021 18:03:11 -0700 Message-Id: <20210525010358.152808-46-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::430; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x430.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Add decoding logic for SVE2 64-bit/32-bit gather non-temporal load insns. 64-bit * LDNT1SB * LDNT1B (vector plus scalar) * LDNT1SH * LDNT1H (vector plus scalar) * LDNT1SW * LDNT1W (vector plus scalar) * LDNT1D (vector plus scalar) 32-bit * LDNT1SB * LDNT1B (vector plus scalar) * LDNT1SH * LDNT1H (vector plus scalar) * LDNT1W (vector plus scalar) Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200422152343.12493-1-steplong@quicinc.com> Signed-off-by: Richard Henderson --- target/arm/sve.decode | 11 +++++++++++ target/arm/translate-sve.c | 8 ++++++++ 2 files changed, 19 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5cfe6df0d2..c3958bed6a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1389,6 +1389,17 @@ UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm CMLA_zzzz 01000100 esz:2 0 rm:5 0010 rot:2 rn:5 rd:5 ra=%reg_movprfx SQRDCMLAH_zzzz 01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5 ra=%reg_movprfx +### SVE2 Memory Gather Load Group + +# SVE2 64-bit gather non-temporal load +# (scalar plus unpacked 32-bit unscaled offsets) +LDNT1_zprz 1100010 msz:2 00 rm:5 1 u:1 0 pg:3 rn:5 rd:5 \ + &rprr_gather_load xs=0 esz=3 scale=0 ff=0 + +# SVE2 32-bit gather non-temporal load (scalar plus 32-bit unscaled offsets) +LDNT1_zprz 1000010 msz:2 00 rm:5 10 u:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load xs=0 esz=2 scale=0 ff=0 + ### SVE2 Memory Store Group # SVE2 64-bit scatter non-temporal store (vector plus scalar) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ac43bb02be..a64ad04c50 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6015,6 +6015,14 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a) return true; } +static bool trans_LDNT1_zprz(DisasContext *s, arg_LD1_zprz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return trans_LD1_zprz(s, a); +} + /* Indexed by [mte][be][xs][msz]. */ static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][2][2][3] = { { /* MTE Inactive */ From patchwork Tue May 25 01:03:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277589 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB60EC2B9F7 for ; Tue, 25 May 2021 02:12:43 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4A94061360 for ; Tue, 25 May 2021 02:12:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4A94061360 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:49042 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMYk-0000Ec-Dx for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:12:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55558) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYR-0000i1-Kg for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:19 -0400 Received: from mail-pf1-x42d.google.com ([2607:f8b0:4864:20::42d]:35516) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXs-00041Z-7q for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:19 -0400 Received: by mail-pf1-x42d.google.com with SMTP id g18so20613623pfr.2 for ; Mon, 24 May 2021 18:07:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=s3J5xZAT9p1EG4Pf7PPGTAQp3zXTu/AnQThOp9WPnBo=; b=PeJV6hiYV5x+R3VSAKZDrUbZI8yeiLdgl7sG13XXfxjgjrtwY7OQ16RS6nBKQTCOvX vT2g7d9ajTp+7/jTEN90xnPbq36+kxZn75pqpevSVOts4p3iFBRIOYA0JHYMH6nCULz/ 63/9rB4+CxSXGi8J2ue+3SSoh6YF6U5w6z0/NrakVYt/XYMYq7RTeazXicKTFNrLpMD+ CiTQGb81x7H6zJQssslUska7AFnhtJtK0OKcVDshweWr+vi2PKRWeAYI8RGQC1+LYpOF 08pCO3NRw2EG9SD1NfxCGQYO6rVe2AC2Wr6cap3usxBEaKCTkrN5C+ms7NmKlTLNTWYl KqXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=s3J5xZAT9p1EG4Pf7PPGTAQp3zXTu/AnQThOp9WPnBo=; b=eph+oaKyrI6SrqjE460TeWRImBX6pWuyyREmbPbadlzD6cu9pVTZbop+YyeM8Ck/43 /2QpepJIrqU7mRwnOM0LpuqLbEhdfN2jJmr/XDH1N60J8V6Bam34Kd/re+TAbTLXdjJh Y/3xXLnRLfKf2y8/+JzaZ7LzrEMlk/dMJ3HReoeJWR1f2Jrv61pZCF6MotQTmNMT+OmM qD8ZUXg6jg/kkJh59nqVjEq5PQeu89AJXnI9DBGT/n/GxhSamFrAMacpLn8K6I4gyl9c On+uDkChDl9atf4FF43FiZTucdcCm1RE1vyvkK+Joap9C9rrl2vU02FkOmQ6S4Tg/PIh MU/A== X-Gm-Message-State: AOAM531KoL6JqasaxsiYaiTPQBXuSpbYyx6T+0dXUkNTyCLyd51+aNwR W4zPu3Tes0kWtQT0PmRbQDoTh8ETVKkmlQ== X-Google-Smtp-Source: ABdhPJylizg85+5XApbJOJhBWE6sRFKdppsh17MgF8OOzB9G9TuG2s31ftOhFu0Kcvv2dhE7HcN9Nw== X-Received: by 2002:a63:f502:: with SMTP id w2mr16366269pgh.197.1621904848143; Mon, 24 May 2021 18:07:28 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:27 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 46/92] target/arm: Implement SVE2 FMMLA Date: Mon, 24 May 2021 18:03:12 -0700 Message-Id: <20210525010358.152808-47-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42d; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200422165503.13511-1-steplong@quicinc.com> [rth: Fix indexing in helpers, expand macro to straight functions.] Signed-off-by: Richard Henderson --- target/arm/cpu.h | 10 ++++++ target/arm/helper-sve.h | 3 ++ target/arm/sve.decode | 4 +++ target/arm/sve_helper.c | 74 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 34 ++++++++++++++++++ 5 files changed, 125 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index ae787fac8a..595bc6349d 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -4246,6 +4246,16 @@ static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0; } +static inline bool isar_feature_aa64_sve_f32mm(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, F32MM) != 0; +} + +static inline bool isar_feature_aa64_sve_f64mm(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, F64MM) != 0; +} + /* * Feature tests for "does this exist in either 32-bit or 64-bit?" */ diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 28b8f00201..7e99dcd119 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2662,3 +2662,6 @@ DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(fmmla_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(fmmla_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index c3958bed6a..cb2ee86228 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1389,6 +1389,10 @@ UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm CMLA_zzzz 01000100 esz:2 0 rm:5 0010 rot:2 rn:5 rd:5 ra=%reg_movprfx SQRDCMLAH_zzzz 01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5 ra=%reg_movprfx +### SVE2 floating point matrix multiply accumulate + +FMMLA 01100100 .. 1 ..... 111001 ..... ..... @rda_rn_rm + ### SVE2 Memory Gather Load Group # SVE2 64-bit gather non-temporal load diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 5b6292929e..fa96e28639 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -7241,3 +7241,77 @@ void HELPER(sve2_xar_s)(void *vd, void *vn, void *vm, uint32_t desc) d[i] = ror32(n[i] ^ m[i], shr); } } + +void HELPER(fmmla_s)(void *vd, void *vn, void *vm, void *va, + void *status, uint32_t desc) +{ + intptr_t s, opr_sz = simd_oprsz(desc) / (sizeof(float32) * 4); + + for (s = 0; s < opr_sz; ++s) { + float32 *n = vn + s * sizeof(float32) * 4; + float32 *m = vm + s * sizeof(float32) * 4; + float32 *a = va + s * sizeof(float32) * 4; + float32 *d = vd + s * sizeof(float32) * 4; + float32 n00 = n[H4(0)], n01 = n[H4(1)]; + float32 n10 = n[H4(2)], n11 = n[H4(3)]; + float32 m00 = m[H4(0)], m01 = m[H4(1)]; + float32 m10 = m[H4(2)], m11 = m[H4(3)]; + float32 p0, p1; + + /* i = 0, j = 0 */ + p0 = float32_mul(n00, m00, status); + p1 = float32_mul(n01, m01, status); + d[H4(0)] = float32_add(a[H4(0)], float32_add(p0, p1, status), status); + + /* i = 0, j = 1 */ + p0 = float32_mul(n00, m10, status); + p1 = float32_mul(n01, m11, status); + d[H4(1)] = float32_add(a[H4(1)], float32_add(p0, p1, status), status); + + /* i = 1, j = 0 */ + p0 = float32_mul(n10, m00, status); + p1 = float32_mul(n11, m01, status); + d[H4(2)] = float32_add(a[H4(2)], float32_add(p0, p1, status), status); + + /* i = 1, j = 1 */ + p0 = float32_mul(n10, m10, status); + p1 = float32_mul(n11, m11, status); + d[H4(3)] = float32_add(a[H4(3)], float32_add(p0, p1, status), status); + } +} + +void HELPER(fmmla_d)(void *vd, void *vn, void *vm, void *va, + void *status, uint32_t desc) +{ + intptr_t s, opr_sz = simd_oprsz(desc) / (sizeof(float64) * 4); + + for (s = 0; s < opr_sz; ++s) { + float64 *n = vn + s * sizeof(float64) * 4; + float64 *m = vm + s * sizeof(float64) * 4; + float64 *a = va + s * sizeof(float64) * 4; + float64 *d = vd + s * sizeof(float64) * 4; + float64 n00 = n[0], n01 = n[1], n10 = n[2], n11 = n[3]; + float64 m00 = m[0], m01 = m[1], m10 = m[2], m11 = m[3]; + float64 p0, p1; + + /* i = 0, j = 0 */ + p0 = float64_mul(n00, m00, status); + p1 = float64_mul(n01, m01, status); + d[0] = float64_add(a[0], float64_add(p0, p1, status), status); + + /* i = 0, j = 1 */ + p0 = float64_mul(n00, m10, status); + p1 = float64_mul(n01, m11, status); + d[1] = float64_add(a[1], float64_add(p0, p1, status), status); + + /* i = 1, j = 0 */ + p0 = float64_mul(n10, m00, status); + p1 = float64_mul(n11, m01, status); + d[2] = float64_add(a[2], float64_add(p0, p1, status), status); + + /* i = 1, j = 1 */ + p0 = float64_mul(n10, m10, status); + p1 = float64_mul(n11, m11, status); + d[3] = float64_add(a[3], float64_add(p0, p1, status), status); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a64ad04c50..a94b399f67 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7672,6 +7672,40 @@ DO_SVE2_ZPZZ_FP(FMINP, fminp) * SVE Integer Multiply-Add (unpredicated) */ +static bool trans_FMMLA(DisasContext *s, arg_rrrr_esz *a) +{ + gen_helper_gvec_4_ptr *fn; + + switch (a->esz) { + case MO_32: + if (!dc_isar_feature(aa64_sve_f32mm, s)) { + return false; + } + fn = gen_helper_fmmla_s; + break; + case MO_64: + if (!dc_isar_feature(aa64_sve_f64mm, s)) { + return false; + } + fn = gen_helper_fmmla_d; + break; + default: + return false; + } + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = fpstatus_ptr(FPST_FPCR); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); + } + return true; +} + static bool do_sqdmlal_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel1, bool sel2) { From patchwork Tue May 25 01:03:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277521 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F70AC2B9F7 for ; Tue, 25 May 2021 01:49:56 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 330666140F for ; Tue, 25 May 2021 01:49:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 330666140F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:52856 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMCh-0005C9-Bi for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:49:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55222) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLY1-0008BK-Tu for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:55 -0400 Received: from mail-pg1-x534.google.com ([2607:f8b0:4864:20::534]:42730) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXm-00042E-3d for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:53 -0400 Received: by mail-pg1-x534.google.com with SMTP id f22so20467203pgb.9 for ; Mon, 24 May 2021 18:07:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=QwI4e6BjJpZYIkcfm9+Buq0GP0uyNec+3MloCW2amL0=; b=HrWxNvr8kSTkWr+DagaDgGB13jTW21MM0trZnCUg42qEYGyoZ3x9IUBywmaf+tCcoL eFjkUa6IX3FbhoQeD/zGxtJsPO8cHho0his+aDZdQxfEz87iRq/tS/L/ZUCdpuDJaUAl DR+4qPeBHOFbHDpz+nuC4WcIMpyJ6HLWysCc13OCqdoXR82gpCqsjGMDChjXGAJ4SJvh YQdb2jGtYD+3Dmsfd1w3UQBQDZATZfkm0lMGkNlLiJyuhFZ5WuF3p+DDA5RyN5ti5ExY JYqr0kHqKgNIq9AfREV318GEVg7WKEepS1/9L+RgBGcfJ1/c+4WXawzTH096fj7wfK7l GJwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=QwI4e6BjJpZYIkcfm9+Buq0GP0uyNec+3MloCW2amL0=; b=BEcul2bjDvR7vso5+X0jm7thXhGA3IZpCrZ/xNSCOVzzA/js0f2UnbdDbWJ+UADi2O DXzVgKyTr2U40agLkCRgl9GsliOJ7YDOxM3iLz7gtODE1xepeNWWJzPiJvkfNiIg/bGf Yv9D6zkCNLYQV+IbqbhNTmfCb4m3HKBe8D2jceEb0ox0uRgQsW2AFlmmoLrOg1jhf4l8 MTgGlQXakI4FPTU8fHs1ib+f8WW+DCPQ68sWVm1Ksm4NLzZGZp27JygdjSX0LGk1lAHW g2HgvN1C/MjljpheljDDAWKQwExe+srseBKNBPXnAd0zv8A4yWKA4t+ka2fQCT9wwJ9z YLMw== X-Gm-Message-State: AOAM5339I9RROOY8241Jgp9ipuhpi/C9iX9Fx/mMhdPzAIb8GoSDuDZQ xVcjBMAj8R0yRiUzW/s+MbeB997pUsvjPA== X-Google-Smtp-Source: ABdhPJy7Xo44zUO9AytLr1/WNiKGPZIFyJQB2DjBmu6IAoARCDs32vhSRDTIolif5j8LEAqMlqzIVA== X-Received: by 2002:a62:3501:0:b029:28d:50e1:99a8 with SMTP id c1-20020a6235010000b029028d50e199a8mr27201640pfa.14.1621904848639; Mon, 24 May 2021 18:07:28 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:28 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 47/92] target/arm: Implement SVE2 SPLICE, EXT Date: Mon, 24 May 2021 18:03:13 -0700 Message-Id: <20210525010358.152808-48-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::534; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x534.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200423180347.9403-1-steplong@quicinc.com> [rth: Rename the trans_* functions to *_sve2.] Signed-off-by: Richard Henderson --- target/arm/sve.decode | 11 +++++++++-- target/arm/translate-sve.c | 35 ++++++++++++++++++++++++++++++----- 2 files changed, 39 insertions(+), 7 deletions(-) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index cb2ee86228..67b6466a1e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -494,10 +494,14 @@ CPY_z_i 00000101 .. 01 .... 00 . ........ ..... @rdn_pg4 imm=%sh8_i8s ### SVE Permute - Extract Group -# SVE extract vector (immediate offset) +# SVE extract vector (destructive) EXT 00000101 001 ..... 000 ... rm:5 rd:5 \ &rrri rn=%reg_movprfx imm=%imm8_16_10 +# SVE2 extract vector (constructive) +EXT_sve2 00000101 011 ..... 000 ... rn:5 rd:5 \ + &rri imm=%imm8_16_10 + ### SVE Permute - Unpredicated Group # SVE broadcast general register @@ -588,9 +592,12 @@ REVH 00000101 .. 1001 01 100 ... ..... ..... @rd_pg_rn REVW 00000101 .. 1001 10 100 ... ..... ..... @rd_pg_rn RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn -# SVE vector splice (predicated) +# SVE vector splice (predicated, destructive) SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm +# SVE2 vector splice (predicated, constructive) +SPLICE_sve2 00000101 .. 101 101 100 ... ..... ..... @rd_pg_rn + ### SVE Select Vectors Group # SVE select vector elements (predicated) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a94b399f67..46f87ee259 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2266,18 +2266,18 @@ static bool trans_CPY_z_i(DisasContext *s, arg_CPY_z_i *a) *** SVE Permute Extract Group */ -static bool trans_EXT(DisasContext *s, arg_EXT *a) +static bool do_EXT(DisasContext *s, int rd, int rn, int rm, int imm) { if (!sve_access_check(s)) { return true; } unsigned vsz = vec_full_reg_size(s); - unsigned n_ofs = a->imm >= vsz ? 0 : a->imm; + unsigned n_ofs = imm >= vsz ? 0 : imm; unsigned n_siz = vsz - n_ofs; - unsigned d = vec_full_reg_offset(s, a->rd); - unsigned n = vec_full_reg_offset(s, a->rn); - unsigned m = vec_full_reg_offset(s, a->rm); + unsigned d = vec_full_reg_offset(s, rd); + unsigned n = vec_full_reg_offset(s, rn); + unsigned m = vec_full_reg_offset(s, rm); /* Use host vector move insns if we have appropriate sizes * and no unfortunate overlap. @@ -2296,6 +2296,19 @@ static bool trans_EXT(DisasContext *s, arg_EXT *a) return true; } +static bool trans_EXT(DisasContext *s, arg_EXT *a) +{ + return do_EXT(s, a->rd, a->rn, a->rm, a->imm); +} + +static bool trans_EXT_sve2(DisasContext *s, arg_rri *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_EXT(s, a->rd, a->rn, (a->rn + 1) % 32, a->imm); +} + /* *** SVE Permute - Unpredicated Group */ @@ -3013,6 +3026,18 @@ static bool trans_SPLICE(DisasContext *s, arg_rprr_esz *a) return true; } +static bool trans_SPLICE_sve2(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzp(s, gen_helper_sve_splice, + a->rd, a->rn, (a->rn + 1) % 32, a->pg, a->esz); + } + return true; +} + /* *** SVE Integer Compare - Vectors Group */ From patchwork Tue May 25 01:03:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277503 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F1226C2B9F7 for ; Tue, 25 May 2021 01:43:29 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AF08E61413 for ; Tue, 25 May 2021 01:43:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AF08E61413 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:58070 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM6S-0006md-Po for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:43:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55144) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLXx-0007wk-Bx for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:49 -0400 Received: from mail-pf1-x42c.google.com ([2607:f8b0:4864:20::42c]:41861) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXg-00042M-Pk for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:49 -0400 Received: by mail-pf1-x42c.google.com with SMTP id p39so5932961pfw.8 for ; Mon, 24 May 2021 18:07:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=NjRQE2G1APvNpsux7iWCMoF795+yU21m5CTY+GrRsi4=; b=eEo9dew3kfujtXVBJF/zu/C+5+FcjEjHkwdFQbYeOIccSuuEcwWLblMcGaQ9tzmUxY 1iVpzEmZW8jj0PSgxiA9l+3s8J5fYLkLRkE7T0PDX5u+r9hu0oYzDPOGBFUjQO8hkdh/ TeSJHLl9qwvDXurCKbrIQQvwhLSGzbaR/MwUnJ/hExfjVPc5j+gXw1WtI8fETaYJ0AhI Q5L+KGLXNMpPY8NUykoL2JKmFi2sfWgS2bf+tq8XP6DamRDLa85w6WyRngLznSgBpyhp DGq6K2DVy1ETzm6gC7xe1Zeg3/oJ2GUEvJ/QHX8fvEFg8w3rGWbIzvDfjMq1fsQ2NTWd ktAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=NjRQE2G1APvNpsux7iWCMoF795+yU21m5CTY+GrRsi4=; b=lCqke8bJaKz78I6mZaVUIvGfj2IbFNbEMK34g6mHJNVQ+IoVsh3xcmdwNBMNsazXDJ 7xH2fuItc7T1nzT4+LnkQJZdONkDk0OODdDsJHckUNXeSL/31cjVDsYVauvzySj/CUR0 DVEulPy3sePcn6BfhoFu28wNnWblPx7bgWIoFXxf9MGO4HsaYp2bljnb1KRHtlIgZGEW KVWbM5Rk2bJBi73J4dkk/NtY//1SJQxXH+e8bcjjoXz3jFjoaGyTnlR2vI/tzQ0kkfTo Po5fxgafH3XXjKpK96aLRtxWUm/5QkJ55aSPk5KBwkP40ow4r/PfEX4zZuPoReQ5m9k4 9w2A== X-Gm-Message-State: AOAM532NYfE5JbfAYoftUUpCBytSHmLunEaMhvj/5YVmm2Z7DVbJDgn6 v+8ybAPM3MnL+ZuAesofVaanulrSOfERFA== X-Google-Smtp-Source: ABdhPJxRAoruwQ/Pl8d+Az9YpTN5DZmFx4LC3cc0Q//HveDw1QGrRUIrews/at0ugIcYSFcxB9PHBA== X-Received: by 2002:a63:4a4e:: with SMTP id j14mr15955640pgl.221.1621904849212; Mon, 24 May 2021 18:07:29 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:28 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 48/92] target/arm: Use correct output type for gvec_sdot_*_b Date: Mon, 24 May 2021 18:03:14 -0700 Message-Id: <20210525010358.152808-49-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42c; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" The signed dot product routines produce a signed result. Since we use -fwrapv, there is no functional change. Signed-off-by: Richard Henderson --- v7: Split out of the next patch. --- target/arm/vec_helper.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index a3d80ecad0..48e3addd81 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -378,7 +378,7 @@ void HELPER(sve2_sqrdmlsh_d)(void *vd, void *vn, void *vm, void HELPER(gvec_sdot_b)(void *vd, void *vn, void *vm, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); - uint32_t *d = vd; + int32_t *d = vd; int8_t *n = vn, *m = vm; for (i = 0; i < opr_sz / 4; ++i) { @@ -408,7 +408,7 @@ void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, uint32_t desc) void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); - uint64_t *d = vd; + int64_t *d = vd; int16_t *n = vn, *m = vm; for (i = 0; i < opr_sz / 8; ++i) { @@ -439,7 +439,7 @@ void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) { intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; intptr_t index = simd_data(desc); - uint32_t *d = vd; + int32_t *d = vd; int8_t *n = vn; int8_t *m_indexed = (int8_t *)vm + H4(index) * 4; @@ -501,7 +501,7 @@ void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; intptr_t index = simd_data(desc); - uint64_t *d = vd; + int64_t *d = vd; int16_t *n = vn; int16_t *m_indexed = (int16_t *)vm + index * 4; From patchwork Tue May 25 01:03:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277513 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9592DC2B9F7 for ; Tue, 25 May 2021 01:48:02 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 073AE6140F for ; Tue, 25 May 2021 01:48:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 073AE6140F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:44086 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMAr-0007sH-0E for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:48:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55162) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLXy-0007yX-Ai for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:50 -0400 Received: from mail-pf1-x42f.google.com ([2607:f8b0:4864:20::42f]:39547) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXg-000437-RB for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:50 -0400 Received: by mail-pf1-x42f.google.com with SMTP id c17so22285194pfn.6 for ; Mon, 24 May 2021 18:07:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=MkFq+pFvVpoPKu1qsSV1qEp5dVkNmOZKlA/bTOocMNM=; b=Dt7plDrpMy7wTO+qehyi+Q+sCWMkX5HPnhZREeMo8YJbfoUByzhcvpp4+xrvzhV9EK rKHnA+BsX1deYY1gAlpPKYQTVf5OP+CEA9To8l3F/9TIHJOIE9idSYuJYaMzwVR1M33C S4+IeLOksjAiX+9HZkaBvRzCGi05mmL/SmekWGJLcTCy4rudeytBWMOIpi6vyYEfAmeD j/jh93QdRg+dTKlpjh8JUgST3957TXH2uLN6Z+tXFfYeSsjqRPjLW4rv8w1VFWmwXQkV ePoK0mm5hjtBbfy273ll8a+VyDBASzOqm8ohtdI7LcQj7LPrRKDyZWP7c/gCCIXZ9Cmo V7Fg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=MkFq+pFvVpoPKu1qsSV1qEp5dVkNmOZKlA/bTOocMNM=; b=SfSXNAk8T6ef6oOpree9yux4K4v99MJnNV7App9ixsmWYe50IPGVDzfharF8VvKim9 F2zLMvuD/sfewku2bFELh6yGvPlzStf9ZV9/JL13Fd328aplS0pFf7sakqUwDBYCPrxO 6lGN4tGXTdtrA72MYu82PQ99evyJUlzu7JUn9Y3SrEuyPgaPayXE0N0ZpoUXodyiP7i7 tiibhVMSPfc3KhbAHuIM/FI9/HFD2Uk4F43bojFCeR/hDDTCQkgSCu3hBarFWDiXYAF/ yhsbDcyKO1DCDyMEaVilQ2roCBi7gLzl0jibDq3OG2EiMf4Wvrk/gnnWsFIo7SbD35Ff +faw== X-Gm-Message-State: AOAM532uQNPEXlMPOa3iCK2ROcZpWRcuu77ZI7GEslBnaNcNu3SZl/Sl GgYoOUqBbJVdIrkS+KGtbXB9QIwNpUwKfA== X-Google-Smtp-Source: ABdhPJxBaI5XY5d9vJ8f22ABJN/IsCbvoPsACyIKJbBXxx5txukhCcLRDO1hRbLmf60HeH6wd/y8qA== X-Received: by 2002:a63:4b5b:: with SMTP id k27mr16656038pgl.368.1621904849874; Mon, 24 May 2021 18:07:29 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:29 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 49/92] target/arm: Pass separate addend to {U, S}DOT helpers Date: Mon, 24 May 2021 18:03:15 -0700 Message-Id: <20210525010358.152808-50-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42f; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" For SVE, we potentially have a 4th argument coming from the movprfx instruction. Currently we do not optimize movprfx, so the problem is not visible. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v4: Fix double addition (zhiwei). v7: Split out type changes. --- target/arm/helper.h | 20 +++--- target/arm/sve.decode | 7 ++- target/arm/translate-a64.c | 15 ++++- target/arm/translate-neon.c | 10 +-- target/arm/translate-sve.c | 13 ++-- target/arm/vec_helper.c | 120 ++++++++++++++++++++---------------- 6 files changed, 109 insertions(+), 76 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 23a7ec5638..f4b092ee1c 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -608,15 +608,19 @@ DEF_HELPER_FLAGS_5(sve2_sqrdmlah_d, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_sdot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_udot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_sdot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_udot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_sdot_idx_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_udot_idx_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_sdot_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_udot_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 67b6466a1e..04ef38f148 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -756,12 +756,13 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s # SVE integer dot product (unpredicated) -DOT_zzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 ra=%reg_movprfx +DOT_zzzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 \ + ra=%reg_movprfx # SVE integer dot product (indexed) -DOT_zzx 01000100 101 index:2 rm:3 00000 u:1 rn:5 rd:5 \ +DOT_zzxw 01000100 101 index:2 rm:3 00000 u:1 rn:5 rd:5 \ sz=0 ra=%reg_movprfx -DOT_zzx 01000100 111 index:1 rm:4 00000 u:1 rn:5 rd:5 \ +DOT_zzxw 01000100 111 index:1 rm:4 00000 u:1 rn:5 rd:5 \ sz=1 ra=%reg_movprfx # SVE floating-point complex add (predicated) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index bdd47208b1..61c5fa9656 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -683,6 +683,17 @@ static void gen_gvec_op3_qc(DisasContext *s, bool is_q, int rd, int rn, tcg_temp_free_ptr(qc_ptr); } +/* Expand a 4-operand operation using an out-of-line helper. */ +static void gen_gvec_op4_ool(DisasContext *s, bool is_q, int rd, int rn, + int rm, int ra, int data, gen_helper_gvec_4 *fn) +{ + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + vec_full_reg_offset(s, ra), + is_q ? 16 : 8, vec_full_reg_size(s), data, fn); +} + /* Set ZF and NF based on a 64 bit result. This is alas fiddlier * than the 32 bit equivalent. */ @@ -12183,7 +12194,7 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) return; case 0x2: /* SDOT / UDOT */ - gen_gvec_op3_ool(s, is_q, rd, rn, rm, 0, + gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, 0, u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b); return; @@ -13442,7 +13453,7 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) switch (16 * u + opcode) { case 0x0e: /* SDOT */ case 0x1e: /* UDOT */ - gen_gvec_op3_ool(s, is_q, rd, rn, rm, index, + gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, index, u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b); return; diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c index 658bd275da..fa67605fdc 100644 --- a/target/arm/translate-neon.c +++ b/target/arm/translate-neon.c @@ -230,7 +230,7 @@ static bool trans_VCADD(DisasContext *s, arg_VCADD *a) static bool trans_VDOT(DisasContext *s, arg_VDOT *a) { int opr_sz; - gen_helper_gvec_3 *fn_gvec; + gen_helper_gvec_4 *fn_gvec; if (!dc_isar_feature(aa32_dp, s)) { return false; @@ -252,9 +252,10 @@ static bool trans_VDOT(DisasContext *s, arg_VDOT *a) opr_sz = (1 + a->q) * 8; fn_gvec = a->u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b; - tcg_gen_gvec_3_ool(vfp_reg_offset(1, a->vd), + tcg_gen_gvec_4_ool(vfp_reg_offset(1, a->vd), vfp_reg_offset(1, a->vn), vfp_reg_offset(1, a->vm), + vfp_reg_offset(1, a->vd), opr_sz, opr_sz, 0, fn_gvec); return true; } @@ -332,7 +333,7 @@ static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a) static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a) { - gen_helper_gvec_3 *fn_gvec; + gen_helper_gvec_4 *fn_gvec; int opr_sz; TCGv_ptr fpst; @@ -357,9 +358,10 @@ static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a) fn_gvec = a->u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b; opr_sz = (1 + a->q) * 8; fpst = fpstatus_ptr(FPST_STD); - tcg_gen_gvec_3_ool(vfp_reg_offset(1, a->vd), + tcg_gen_gvec_4_ool(vfp_reg_offset(1, a->vd), vfp_reg_offset(1, a->vn), vfp_reg_offset(1, a->rm), + vfp_reg_offset(1, a->vd), opr_sz, opr_sz, a->index, fn_gvec); tcg_temp_free_ptr(fpst); return true; diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 46f87ee259..2864c3a3cf 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3800,28 +3800,29 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI -static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a) +static bool trans_DOT_zzzz(DisasContext *s, arg_DOT_zzzz *a) { - static gen_helper_gvec_3 * const fns[2][2] = { + static gen_helper_gvec_4 * const fns[2][2] = { { gen_helper_gvec_sdot_b, gen_helper_gvec_sdot_h }, { gen_helper_gvec_udot_b, gen_helper_gvec_udot_h } }; if (sve_access_check(s)) { - gen_gvec_ool_zzz(s, fns[a->u][a->sz], a->rd, a->rn, a->rm, 0); + gen_gvec_ool_zzzz(s, fns[a->u][a->sz], a->rd, a->rn, a->rm, a->ra, 0); } return true; } -static bool trans_DOT_zzx(DisasContext *s, arg_DOT_zzx *a) +static bool trans_DOT_zzxw(DisasContext *s, arg_DOT_zzxw *a) { - static gen_helper_gvec_3 * const fns[2][2] = { + static gen_helper_gvec_4 * const fns[2][2] = { { gen_helper_gvec_sdot_idx_b, gen_helper_gvec_sdot_idx_h }, { gen_helper_gvec_udot_idx_b, gen_helper_gvec_udot_idx_h } }; if (sve_access_check(s)) { - gen_gvec_ool_zzz(s, fns[a->u][a->sz], a->rd, a->rn, a->rm, a->index); + gen_gvec_ool_zzzz(s, fns[a->u][a->sz], a->rd, a->rn, a->rm, + a->ra, a->index); } return true; } diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 48e3addd81..f88e572132 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -375,71 +375,76 @@ void HELPER(sve2_sqrdmlsh_d)(void *vd, void *vn, void *vm, * All elements are treated equally, no matter where they are. */ -void HELPER(gvec_sdot_b)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_sdot_b)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); - int32_t *d = vd; + int32_t *d = vd, *a = va; int8_t *n = vn, *m = vm; for (i = 0; i < opr_sz / 4; ++i) { - d[i] += n[i * 4 + 0] * m[i * 4 + 0] - + n[i * 4 + 1] * m[i * 4 + 1] - + n[i * 4 + 2] * m[i * 4 + 2] - + n[i * 4 + 3] * m[i * 4 + 3]; + d[i] = (a[i] + + n[i * 4 + 0] * m[i * 4 + 0] + + n[i * 4 + 1] * m[i * 4 + 1] + + n[i * 4 + 2] * m[i * 4 + 2] + + n[i * 4 + 3] * m[i * 4 + 3]); } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); - uint32_t *d = vd; + uint32_t *d = vd, *a = va; uint8_t *n = vn, *m = vm; for (i = 0; i < opr_sz / 4; ++i) { - d[i] += n[i * 4 + 0] * m[i * 4 + 0] - + n[i * 4 + 1] * m[i * 4 + 1] - + n[i * 4 + 2] * m[i * 4 + 2] - + n[i * 4 + 3] * m[i * 4 + 3]; + d[i] = (a[i] + + n[i * 4 + 0] * m[i * 4 + 0] + + n[i * 4 + 1] * m[i * 4 + 1] + + n[i * 4 + 2] * m[i * 4 + 2] + + n[i * 4 + 3] * m[i * 4 + 3]); } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); - int64_t *d = vd; + int64_t *d = vd, *a = va; int16_t *n = vn, *m = vm; for (i = 0; i < opr_sz / 8; ++i) { - d[i] += (int64_t)n[i * 4 + 0] * m[i * 4 + 0] - + (int64_t)n[i * 4 + 1] * m[i * 4 + 1] - + (int64_t)n[i * 4 + 2] * m[i * 4 + 2] - + (int64_t)n[i * 4 + 3] * m[i * 4 + 3]; + d[i] = (a[i] + + (int64_t)n[i * 4 + 0] * m[i * 4 + 0] + + (int64_t)n[i * 4 + 1] * m[i * 4 + 1] + + (int64_t)n[i * 4 + 2] * m[i * 4 + 2] + + (int64_t)n[i * 4 + 3] * m[i * 4 + 3]); } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); - uint64_t *d = vd; + uint64_t *d = vd, *a = va; uint16_t *n = vn, *m = vm; for (i = 0; i < opr_sz / 8; ++i) { - d[i] += (uint64_t)n[i * 4 + 0] * m[i * 4 + 0] - + (uint64_t)n[i * 4 + 1] * m[i * 4 + 1] - + (uint64_t)n[i * 4 + 2] * m[i * 4 + 2] - + (uint64_t)n[i * 4 + 3] * m[i * 4 + 3]; + d[i] = (a[i] + + (uint64_t)n[i * 4 + 0] * m[i * 4 + 0] + + (uint64_t)n[i * 4 + 1] * m[i * 4 + 1] + + (uint64_t)n[i * 4 + 2] * m[i * 4 + 2] + + (uint64_t)n[i * 4 + 3] * m[i * 4 + 3]); } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) { intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; intptr_t index = simd_data(desc); - int32_t *d = vd; + int32_t *d = vd, *a = va; int8_t *n = vn; int8_t *m_indexed = (int8_t *)vm + H4(index) * 4; @@ -455,10 +460,11 @@ void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) int8_t m3 = m_indexed[i * 4 + 3]; do { - d[i] += n[i * 4 + 0] * m0 - + n[i * 4 + 1] * m1 - + n[i * 4 + 2] * m2 - + n[i * 4 + 3] * m3; + d[i] = (a[i] + + n[i * 4 + 0] * m0 + + n[i * 4 + 1] * m1 + + n[i * 4 + 2] * m2 + + n[i * 4 + 3] * m3); } while (++i < segend); segend = i + 4; } while (i < opr_sz_4); @@ -466,11 +472,12 @@ void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) { intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; intptr_t index = simd_data(desc); - uint32_t *d = vd; + uint32_t *d = vd, *a = va; uint8_t *n = vn; uint8_t *m_indexed = (uint8_t *)vm + H4(index) * 4; @@ -486,10 +493,11 @@ void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) uint8_t m3 = m_indexed[i * 4 + 3]; do { - d[i] += n[i * 4 + 0] * m0 - + n[i * 4 + 1] * m1 - + n[i * 4 + 2] * m2 - + n[i * 4 + 3] * m3; + d[i] = (a[i] + + n[i * 4 + 0] * m0 + + n[i * 4 + 1] * m1 + + n[i * 4 + 2] * m2 + + n[i * 4 + 3] * m3); } while (++i < segend); segend = i + 4; } while (i < opr_sz_4); @@ -497,11 +505,12 @@ void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; intptr_t index = simd_data(desc); - int64_t *d = vd; + int64_t *d = vd, *a = va; int16_t *n = vn; int16_t *m_indexed = (int16_t *)vm + index * 4; @@ -509,30 +518,33 @@ void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) * Process the entire segment all at once, writing back the results * only after we've consumed all of the inputs. */ - for (i = 0; i < opr_sz_8 ; i += 2) { - uint64_t d0, d1; + for (i = 0; i < opr_sz_8; i += 2) { + int64_t d0, d1; - d0 = n[i * 4 + 0] * (int64_t)m_indexed[i * 4 + 0]; + d0 = a[i + 0]; + d0 += n[i * 4 + 0] * (int64_t)m_indexed[i * 4 + 0]; d0 += n[i * 4 + 1] * (int64_t)m_indexed[i * 4 + 1]; d0 += n[i * 4 + 2] * (int64_t)m_indexed[i * 4 + 2]; d0 += n[i * 4 + 3] * (int64_t)m_indexed[i * 4 + 3]; - d1 = n[i * 4 + 4] * (int64_t)m_indexed[i * 4 + 0]; + + d1 = a[i + 1]; + d1 += n[i * 4 + 4] * (int64_t)m_indexed[i * 4 + 0]; d1 += n[i * 4 + 5] * (int64_t)m_indexed[i * 4 + 1]; d1 += n[i * 4 + 6] * (int64_t)m_indexed[i * 4 + 2]; d1 += n[i * 4 + 7] * (int64_t)m_indexed[i * 4 + 3]; - d[i + 0] += d0; - d[i + 1] += d1; + d[i + 0] = d0; + d[i + 1] = d1; } - clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; intptr_t index = simd_data(desc); - uint64_t *d = vd; + uint64_t *d = vd, *a = va; uint16_t *n = vn; uint16_t *m_indexed = (uint16_t *)vm + index * 4; @@ -540,22 +552,24 @@ void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) * Process the entire segment all at once, writing back the results * only after we've consumed all of the inputs. */ - for (i = 0; i < opr_sz_8 ; i += 2) { + for (i = 0; i < opr_sz_8; i += 2) { uint64_t d0, d1; - d0 = n[i * 4 + 0] * (uint64_t)m_indexed[i * 4 + 0]; + d0 = a[i + 0]; + d0 += n[i * 4 + 0] * (uint64_t)m_indexed[i * 4 + 0]; d0 += n[i * 4 + 1] * (uint64_t)m_indexed[i * 4 + 1]; d0 += n[i * 4 + 2] * (uint64_t)m_indexed[i * 4 + 2]; d0 += n[i * 4 + 3] * (uint64_t)m_indexed[i * 4 + 3]; - d1 = n[i * 4 + 4] * (uint64_t)m_indexed[i * 4 + 0]; + + d1 = a[i + 1]; + d1 += n[i * 4 + 4] * (uint64_t)m_indexed[i * 4 + 0]; d1 += n[i * 4 + 5] * (uint64_t)m_indexed[i * 4 + 1]; d1 += n[i * 4 + 6] * (uint64_t)m_indexed[i * 4 + 2]; d1 += n[i * 4 + 7] * (uint64_t)m_indexed[i * 4 + 3]; - d[i + 0] += d0; - d[i + 1] += d1; + d[i + 0] = d0; + d[i + 1] = d1; } - clear_tail(d, opr_sz, simd_maxsz(desc)); } From patchwork Tue May 25 01:03:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277547 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4BAD7C2B9F7 for ; Tue, 25 May 2021 01:58:24 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D750F61417 for ; Tue, 25 May 2021 01:58:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D750F61417 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:54568 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMKs-0008BR-Pm for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:58:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55518) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYQ-0000bZ-AP for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:18 -0400 Received: from mail-pj1-x102d.google.com ([2607:f8b0:4864:20::102d]:38808) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXs-00043I-0R for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:17 -0400 Received: by mail-pj1-x102d.google.com with SMTP id cu11-20020a17090afa8bb029015d5d5d2175so12260153pjb.3 for ; Mon, 24 May 2021 18:07:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=cBhzZYMHQj2yvOM4vvz9Jmebb2oW8LX0Xxyx5R1ATzY=; b=jJhMpmT8ATTzoX2rg6v+FAPY/zY8Hz8UgZJGCgn2se0YaeS78B2RPGeIEhKxjn4Enr UlkjZKICO5U/BY0GvtlSQg2Q+AvmPzaEYgFZ07evcbqqciX2rn4wS9yaCx4eW02nRoOG AKYbWbA9GBrcGhXtcqomb9hRfsadfseQXJfpiuqYBl69EwL1amQjZeoVOu/zRj6tOMsN cxPX8+Onn5zS/LEB9BDlsQ72vbMQLj5UpZ/BCCj8IuqMI7ri8tjnrRaSGcs7Xt8a2DJg HEHulct8/plaCATQrJS0D/dBe0+HrmrJmLKMjHJkLYD9a0ujOemzVlRtsJteKtTKBx4X ZkcA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cBhzZYMHQj2yvOM4vvz9Jmebb2oW8LX0Xxyx5R1ATzY=; b=FePNkkrTcMes1hUTqI7ooBEgPju57vYMj3y3+XSitNy5Nrn2VFN0eEqKNiNVYXuvRe Pu+LRYRWVYs6jKZ6WXZTyZqrNj93wra+ovTbYQXQKpjTk5n5fDBwrYzz3MUxjBmV1JSh phAkhFjVEyruYUSueL/5L8BCCAp8m4jNN4Fnvn2C59klJe+c703Io/B7B9vY98AbMQqu WxoChP95carI6AX3WyePMRfyIXmiFAa26qGWvMoiL2dbF87BlomKc6SGgeUANUzph8q2 J37+WlRfcbs680nM68JIzABxoGuvB4wvcL5++zhe761cayMh/KzXq/ZIrBDdtDVaO3j6 Ucnw== X-Gm-Message-State: AOAM531iI7HdDRBF3S4x2NJLcHTKoNGklMoHUMxa9U41H6DdCtfyyL+s kkXZjw/BqKjwwFVKiRNHdHI7jFZ72HCzCA== X-Google-Smtp-Source: ABdhPJzmJy4lNnRvCwvnM13sR5uVZOQWtzcgbkjBqbOhGs3CRIZwNRzUCGYcWdd3Qs/tWJIinD5mnw== X-Received: by 2002:a17:902:7b8e:b029:ec:982d:7a7e with SMTP id w14-20020a1709027b8eb02900ec982d7a7emr27932411pll.55.1621904850481; Mon, 24 May 2021 18:07:30 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:30 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 50/92] target/arm: Pass separate addend to FCMLA helpers Date: Mon, 24 May 2021 18:03:16 -0700 Message-Id: <20210525010358.152808-51-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102d; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" For SVE, we potentially have a 4th argument coming from the movprfx instruction. Currently we do not optimize movprfx, so the problem is not visible. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 20 +++++++-------- target/arm/translate-a64.c | 28 +++++++++++++++++---- target/arm/translate-neon.c | 10 +++++--- target/arm/translate-sve.c | 5 ++-- target/arm/vec_helper.c | 50 +++++++++++++++---------------------- 5 files changed, 62 insertions(+), 51 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index f4b092ee1c..72c5bf6aca 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -629,16 +629,16 @@ DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_fcaddd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(gvec_fcmlah, TCG_CALL_NO_RWG, - void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(gvec_fcmlah_idx, TCG_CALL_NO_RWG, - void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(gvec_fcmlas, TCG_CALL_NO_RWG, - void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG, - void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG, - void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fcmlah, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fcmlah_idx, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fcmlas, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fcmlas_idx, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fcmlad, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(neon_paddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(neon_pmaxh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 61c5fa9656..a8edd2d281 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -694,6 +694,23 @@ static void gen_gvec_op4_ool(DisasContext *s, bool is_q, int rd, int rn, is_q ? 16 : 8, vec_full_reg_size(s), data, fn); } +/* + * Expand a 4-operand + fpstatus pointer + simd data value operation using + * an out-of-line helper. + */ +static void gen_gvec_op4_fpst(DisasContext *s, bool is_q, int rd, int rn, + int rm, int ra, bool is_fp16, int data, + gen_helper_gvec_4_ptr *fn) +{ + TCGv_ptr fpst = fpstatus_ptr(is_fp16 ? FPST_FPCR_F16 : FPST_FPCR); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + vec_full_reg_offset(s, ra), fpst, + is_q ? 16 : 8, vec_full_reg_size(s), data, fn); + tcg_temp_free_ptr(fpst); +} + /* Set ZF and NF based on a 64 bit result. This is alas fiddlier * than the 32 bit equivalent. */ @@ -12205,15 +12222,15 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) rot = extract32(opcode, 0, 2); switch (size) { case 1: - gen_gvec_op3_fpst(s, is_q, rd, rn, rm, true, rot, + gen_gvec_op4_fpst(s, is_q, rd, rn, rm, rd, true, rot, gen_helper_gvec_fcmlah); break; case 2: - gen_gvec_op3_fpst(s, is_q, rd, rn, rm, false, rot, + gen_gvec_op4_fpst(s, is_q, rd, rn, rm, rd, false, rot, gen_helper_gvec_fcmlas); break; case 3: - gen_gvec_op3_fpst(s, is_q, rd, rn, rm, false, rot, + gen_gvec_op4_fpst(s, is_q, rd, rn, rm, rd, false, rot, gen_helper_gvec_fcmlad); break; default: @@ -13464,9 +13481,10 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) { int rot = extract32(insn, 13, 2); int data = (index << 2) | rot; - tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), fpst, + vec_full_reg_offset(s, rm), + vec_full_reg_offset(s, rd), fpst, is_q ? 16 : 8, vec_full_reg_size(s), data, size == MO_64 ? gen_helper_gvec_fcmlas_idx diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c index fa67605fdc..45fa5166f3 100644 --- a/target/arm/translate-neon.c +++ b/target/arm/translate-neon.c @@ -155,7 +155,7 @@ static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a) { int opr_sz; TCGv_ptr fpst; - gen_helper_gvec_3_ptr *fn_gvec_ptr; + gen_helper_gvec_4_ptr *fn_gvec_ptr; if (!dc_isar_feature(aa32_vcma, s) || (a->size == MO_16 && !dc_isar_feature(aa32_fp16_arith, s))) { @@ -180,9 +180,10 @@ static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a) fpst = fpstatus_ptr(a->size == MO_16 ? FPST_STD_F16 : FPST_STD); fn_gvec_ptr = (a->size == MO_16) ? gen_helper_gvec_fcmlah : gen_helper_gvec_fcmlas; - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd), + tcg_gen_gvec_4_ptr(vfp_reg_offset(1, a->vd), vfp_reg_offset(1, a->vn), vfp_reg_offset(1, a->vm), + vfp_reg_offset(1, a->vd), fpst, opr_sz, opr_sz, a->rot, fn_gvec_ptr); tcg_temp_free_ptr(fpst); @@ -293,7 +294,7 @@ static bool trans_VFML(DisasContext *s, arg_VFML *a) static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a) { - gen_helper_gvec_3_ptr *fn_gvec_ptr; + gen_helper_gvec_4_ptr *fn_gvec_ptr; int opr_sz; TCGv_ptr fpst; @@ -322,9 +323,10 @@ static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a) gen_helper_gvec_fcmlah_idx : gen_helper_gvec_fcmlas_idx; opr_sz = (1 + a->q) * 8; fpst = fpstatus_ptr(a->size == MO_16 ? FPST_STD_F16 : FPST_STD); - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd), + tcg_gen_gvec_4_ptr(vfp_reg_offset(1, a->vd), vfp_reg_offset(1, a->vn), vfp_reg_offset(1, a->vm), + vfp_reg_offset(1, a->vd), fpst, opr_sz, opr_sz, (a->index << 2) | a->rot, fn_gvec_ptr); tcg_temp_free_ptr(fpst); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 2864c3a3cf..4f4b383e52 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4383,7 +4383,7 @@ static bool trans_FCMLA_zpzzz(DisasContext *s, arg_FCMLA_zpzzz *a) static bool trans_FCMLA_zzxz(DisasContext *s, arg_FCMLA_zzxz *a) { - static gen_helper_gvec_3_ptr * const fns[2] = { + static gen_helper_gvec_4_ptr * const fns[2] = { gen_helper_gvec_fcmlah_idx, gen_helper_gvec_fcmlas_idx, }; @@ -4393,9 +4393,10 @@ static bool trans_FCMLA_zzxz(DisasContext *s, arg_FCMLA_zzxz *a) if (sve_access_check(s)) { unsigned vsz = vec_full_reg_size(s); TCGv_ptr status = fpstatus_ptr(a->esz == MO_16 ? FPST_FPCR_F16 : FPST_FPCR); - tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), vec_full_reg_offset(s, a->rn), vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), status, vsz, vsz, a->index * 4 + a->rot, fns[a->esz - 1]); diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index f88e572132..b19877e0d3 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -657,13 +657,11 @@ void HELPER(gvec_fcaddd)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_fcmlah)(void *vd, void *vn, void *vm, +void HELPER(gvec_fcmlah)(void *vd, void *vn, void *vm, void *va, void *vfpst, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); - float16 *d = vd; - float16 *n = vn; - float16 *m = vm; + float16 *d = vd, *n = vn, *m = vm, *a = va; float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); @@ -680,19 +678,17 @@ void HELPER(gvec_fcmlah)(void *vd, void *vn, void *vm, float16 e4 = e2; float16 e3 = m[H2(i + 1 - flip)] ^ neg_imag; - d[H2(i)] = float16_muladd(e2, e1, d[H2(i)], 0, fpst); - d[H2(i + 1)] = float16_muladd(e4, e3, d[H2(i + 1)], 0, fpst); + d[H2(i)] = float16_muladd(e2, e1, a[H2(i)], 0, fpst); + d[H2(i + 1)] = float16_muladd(e4, e3, a[H2(i + 1)], 0, fpst); } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, +void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, void *va, void *vfpst, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); - float16 *d = vd; - float16 *n = vn; - float16 *m = vm; + float16 *d = vd, *n = vn, *m = vm, *a = va; float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); @@ -716,20 +712,18 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, float16 e2 = n[H2(j + flip)]; float16 e4 = e2; - d[H2(j)] = float16_muladd(e2, e1, d[H2(j)], 0, fpst); - d[H2(j + 1)] = float16_muladd(e4, e3, d[H2(j + 1)], 0, fpst); + d[H2(j)] = float16_muladd(e2, e1, a[H2(j)], 0, fpst); + d[H2(j + 1)] = float16_muladd(e4, e3, a[H2(j + 1)], 0, fpst); } } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_fcmlas)(void *vd, void *vn, void *vm, +void HELPER(gvec_fcmlas)(void *vd, void *vn, void *vm, void *va, void *vfpst, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); - float32 *d = vd; - float32 *n = vn; - float32 *m = vm; + float32 *d = vd, *n = vn, *m = vm, *a = va; float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); @@ -746,19 +740,17 @@ void HELPER(gvec_fcmlas)(void *vd, void *vn, void *vm, float32 e4 = e2; float32 e3 = m[H4(i + 1 - flip)] ^ neg_imag; - d[H4(i)] = float32_muladd(e2, e1, d[H4(i)], 0, fpst); - d[H4(i + 1)] = float32_muladd(e4, e3, d[H4(i + 1)], 0, fpst); + d[H4(i)] = float32_muladd(e2, e1, a[H4(i)], 0, fpst); + d[H4(i + 1)] = float32_muladd(e4, e3, a[H4(i + 1)], 0, fpst); } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, +void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, void *va, void *vfpst, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); - float32 *d = vd; - float32 *n = vn; - float32 *m = vm; + float32 *d = vd, *n = vn, *m = vm, *a = va; float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); @@ -782,20 +774,18 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, float32 e2 = n[H4(j + flip)]; float32 e4 = e2; - d[H4(j)] = float32_muladd(e2, e1, d[H4(j)], 0, fpst); - d[H4(j + 1)] = float32_muladd(e4, e3, d[H4(j + 1)], 0, fpst); + d[H4(j)] = float32_muladd(e2, e1, a[H4(j)], 0, fpst); + d[H4(j + 1)] = float32_muladd(e4, e3, a[H4(j + 1)], 0, fpst); } } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm, +void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm, void *va, void *vfpst, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); - float64 *d = vd; - float64 *n = vn; - float64 *m = vm; + float64 *d = vd, *n = vn, *m = vm, *a = va; float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint64_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); @@ -812,8 +802,8 @@ void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm, float64 e4 = e2; float64 e3 = m[i + 1 - flip] ^ neg_imag; - d[i] = float64_muladd(e2, e1, d[i], 0, fpst); - d[i + 1] = float64_muladd(e4, e3, d[i + 1], 0, fpst); + d[i] = float64_muladd(e2, e1, a[i], 0, fpst); + d[i + 1] = float64_muladd(e4, e3, a[i + 1], 0, fpst); } clear_tail(d, opr_sz, simd_maxsz(desc)); } From patchwork Tue May 25 01:03:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277529 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E1A44C2B9F7 for ; Tue, 25 May 2021 01:52:34 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A13CF6141B for ; Tue, 25 May 2021 01:52:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A13CF6141B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:33758 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMFF-0002q4-Nx for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:52:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55560) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYR-0000i6-Ky for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:19 -0400 Received: from mail-pf1-x429.google.com ([2607:f8b0:4864:20::429]:44637) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXs-00043N-8F for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:19 -0400 Received: by mail-pf1-x429.google.com with SMTP id 22so21881705pfv.11 for ; Mon, 24 May 2021 18:07:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=P+Vp7PCJgIRKIEMBm3mLYg2gfZhdkwYbxMZPEhx6JvM=; b=Tcaxt00Zw7wS53mtbRQ3eKOB+GGnnwlxy9QohWinQ0I+u39nqEwxSH83jBKmc/VftX ioEntWdp8lPKl3+K4t7bBqs24cycqR6yJjHyAi0TU5y2QfBJTBlEBHaPvKhWsUIEMiyZ OxTVPlAVDUOgJZ+EIXdHP44dBLUh/1jOa3bMVx4DxzgQXd0zs+YFtAl432tg3jUXOcPC j6j8QLzWd8oVI/t18ja7ngJQWFmKMZnjo6/NKr6L6xGVmBIWzqm82eYp1JW6saObUnv7 7HhfUvGJcvsm0Sm3kzy3zhYGhFSOiJUHXVOIoPcSfbwbD2s/tHCl39cnutkPkAdkCJyf UG5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=P+Vp7PCJgIRKIEMBm3mLYg2gfZhdkwYbxMZPEhx6JvM=; b=Do16/4tsvw8z7Yw5zaFfOJTEtFwFVCRV0rTZ+sHgW1fz2VpLy8ewnBQD1fvJmNzzj9 +s1RT6ZpdqWjuyNhUke8cDA/+PUVa+LqOQDFyDQoWIi1aGTJZf7jGd2oK2peTAGPYvg8 Q7xG8ziiniHGmIRaP14AxBQPTCRsJdXt2QHJBgCAUJt47IpGXjRgeCMnkbwACSPxkoS8 4o3sr5nZH9tT3DBmFZg939MbNcp76mh9GvRimNVYV3MaQODxT1RITNEoubDO1kOlP8KJ WgR5s+/t87oXOIUqfqtzdR3QIBXC4pbSPYzOSBOPSPnZ259i+M61dhn0U+a5jW3Jd9fo 0qtA== X-Gm-Message-State: AOAM531Cuz5vzjVVdmpiB50QpU0+so0+W+yNylyjLkzx4I7+zNZFajOe AGD8F7L+1vRDp6tCHJ02HWwysqhJETFsAA== X-Google-Smtp-Source: ABdhPJxz119bKox5wOLULD1OR8Ha/aVigbi6JNasfeoeif4zbKwBfV472Mz9Ra2ML+bYC/OsKpSEhQ== X-Received: by 2002:a63:d14b:: with SMTP id c11mr15938821pgj.162.1621904851062; Mon, 24 May 2021 18:07:31 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:30 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 51/92] target/arm: Split out formats for 2 vectors + 1 index Date: Mon, 24 May 2021 18:03:17 -0700 Message-Id: <20210525010358.152808-52-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::429; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Currently only used by FMUL, but will shortly be used more. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/sve.decode | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 04ef38f148..a504b55dad 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -67,6 +67,7 @@ &rri_esz rd rn imm esz &rrri_esz rd rn rm imm esz &rrr_esz rd rn rm esz +&rrx_esz rd rn rm index esz &rpr_esz rd pg rn esz &rpr_s rd pg rn s &rprr_s rd pg rn rm s @@ -245,6 +246,12 @@ @rpri_scatter_store ....... msz:2 .. imm:5 ... pg:3 rn:5 rd:5 \ &rpri_scatter_store +# Two registers and a scalar by N-bit index +@rrx_3 ........ .. . .. rm:3 ...... rn:5 rd:5 \ + &rrx_esz index=%index3_22_19 +@rrx_2 ........ .. . index:2 rm:3 ...... rn:5 rd:5 &rrx_esz +@rrx_1 ........ .. . index:1 rm:4 ...... rn:5 rd:5 &rrx_esz + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -792,10 +799,9 @@ FMLA_zzxz 01100100 111 index:1 rm:4 00000 sub:1 rn:5 rd:5 \ ### SVE FP Multiply Indexed Group # SVE floating-point multiply (indexed) -FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \ - index=%index3_22_19 esz=1 -FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2 -FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3 +FMUL_zzx 01100100 0. 1 ..... 001000 ..... ..... @rrx_3 esz=1 +FMUL_zzx 01100100 10 1 ..... 001000 ..... ..... @rrx_2 esz=2 +FMUL_zzx 01100100 11 1 ..... 001000 ..... ..... @rrx_1 esz=3 ### SVE FP Fast Reduction Group From patchwork Tue May 25 01:03:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277537 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 46D59C2B9F7 for ; Tue, 25 May 2021 01:54:43 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AB8FF6141E for ; Tue, 25 May 2021 01:54:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AB8FF6141E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:42898 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMHJ-0000RJ-Pc for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:54:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55324) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYF-0008Q7-Hu for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:07 -0400 Received: from mail-pg1-x52c.google.com ([2607:f8b0:4864:20::52c]:35618) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXr-00043Y-JI for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:06 -0400 Received: by mail-pg1-x52c.google.com with SMTP id m190so21413146pga.2 for ; Mon, 24 May 2021 18:07:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=J0U7o9nxMC5lTXFPuzpsH42oZtV01Y61hMVi7cku2iM=; b=NM/c2nD/9H9lClDfTd3VxN949tjLzbs23VKEJbEheD9Nary0ovqdvqMXwru5HkDhzd NtroxpNPyjUZjcdmTdLRf5KRC4Je5JlJz4E9jHsAvl+Vj2y3qGNhufb/JB3igcuNdv1w mqsF1WT0IyRLA6Rt6YZPK2HsxgmKkhugUXjJzimA7eIHOUgrM7SRzqZJYvwqPhlwxBnA ZDAOiXT+iJ8WkA9uoAU7dcTiizmRwO/ksk5IESW1fE82WT/FVHeY/hDcgn+lo/KSmT7w A0jfW1tMViXt1phgBIomVzmlZQ8ANU6vkiTgtia3EokytYPPqRxSt9XOttVTnZQwBhqg qC6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=J0U7o9nxMC5lTXFPuzpsH42oZtV01Y61hMVi7cku2iM=; b=m9TYXjH5a8+5rmBEkoR8OttPedq4cfeLZO+NY7NWNkyjoZ3vr2B7nteJkQL5Llh5JC eujZv2Qzhg0FDS91DuGRjRzb2hije7R430/8ZwpUUFeBkiW0NmHixhN64BLcN6f4UOFQ pkxLxPrDN7I7rTXKhGBuvGcaYu017TBkO6MeiUREWqLF1JHPcCdvWaaYLil7+UUSNQ5O BAnoPBDZ1/xFOplljGkZ3B9CL00REsDUMf5GAWupA/O+BY09z5Ff2flvUr47k5+/Vdv2 slT85newzE8TJRS9CSvgJ1MkNJ9yfaYokcxJ97w7OlKG0bxvCOkVWctldvAf5NDZbZgx jsBw== X-Gm-Message-State: AOAM533qawApQrgXHg3uSvKhlLiLuZHNH23vv+VMbYsKMAY0nQG/VwWe SLZ8X0bdhkd58HsIeJZ4/p9qFbodxuwW4w== X-Google-Smtp-Source: ABdhPJxFbfyOVM4E9D/SQbmpHiFaBnzrCdh9iJfvdiXs+/SsnlhQVR9/4Vx2D+J1yJETdBBHyVkutA== X-Received: by 2002:a62:ea03:0:b029:2e7:8445:243c with SMTP id t3-20020a62ea030000b02902e78445243cmr13102091pfh.54.1621904851700; Mon, 24 May 2021 18:07:31 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:31 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 52/92] target/arm: Split out formats for 3 vectors + 1 index Date: Mon, 24 May 2021 18:03:18 -0700 Message-Id: <20210525010358.152808-53-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52c; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Used by FMLA and DOT, but will shortly be used more. Split FMLA from FMLS to avoid an extra sub field; similarly for SDOT from UDOT. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/sve.decode | 29 +++++++++++++++++++---------- target/arm/translate-sve.c | 38 ++++++++++++++++++++++++++++---------- 2 files changed, 47 insertions(+), 20 deletions(-) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a504b55dad..74ac72bdbd 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -73,6 +73,7 @@ &rprr_s rd pg rn rm s &rprr_esz rd pg rn rm esz &rrrr_esz rd ra rn rm esz +&rrxr_esz rd rn rm ra index esz &rprrr_esz rd pg rn rm ra esz &rpri_esz rd pg rn imm esz &ptrue rd esz pat s @@ -252,6 +253,14 @@ @rrx_2 ........ .. . index:2 rm:3 ...... rn:5 rd:5 &rrx_esz @rrx_1 ........ .. . index:1 rm:4 ...... rn:5 rd:5 &rrx_esz +# Three registers and a scalar by N-bit index +@rrxr_3 ........ .. . .. rm:3 ...... rn:5 rd:5 \ + &rrxr_esz ra=%reg_movprfx index=%index3_22_19 +@rrxr_2 ........ .. . index:2 rm:3 ...... rn:5 rd:5 \ + &rrxr_esz ra=%reg_movprfx +@rrxr_1 ........ .. . index:1 rm:4 ...... rn:5 rd:5 \ + &rrxr_esz ra=%reg_movprfx + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -767,10 +776,10 @@ DOT_zzzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 \ ra=%reg_movprfx # SVE integer dot product (indexed) -DOT_zzxw 01000100 101 index:2 rm:3 00000 u:1 rn:5 rd:5 \ - sz=0 ra=%reg_movprfx -DOT_zzxw 01000100 111 index:1 rm:4 00000 u:1 rn:5 rd:5 \ - sz=1 ra=%reg_movprfx +SDOT_zzxw_s 01000100 10 1 ..... 000000 ..... ..... @rrxr_2 esz=2 +SDOT_zzxw_d 01000100 11 1 ..... 000000 ..... ..... @rrxr_1 esz=3 +UDOT_zzxw_s 01000100 10 1 ..... 000001 ..... ..... @rrxr_2 esz=2 +UDOT_zzxw_d 01000100 11 1 ..... 000001 ..... ..... @rrxr_1 esz=3 # SVE floating-point complex add (predicated) FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ @@ -789,12 +798,12 @@ FCMLA_zzxz 01100100 11 1 index:1 rm:4 0001 rot:2 rn:5 rd:5 \ ### SVE FP Multiply-Add Indexed Group # SVE floating-point multiply-add (indexed) -FMLA_zzxz 01100100 0.1 .. rm:3 00000 sub:1 rn:5 rd:5 \ - ra=%reg_movprfx index=%index3_22_19 esz=1 -FMLA_zzxz 01100100 101 index:2 rm:3 00000 sub:1 rn:5 rd:5 \ - ra=%reg_movprfx esz=2 -FMLA_zzxz 01100100 111 index:1 rm:4 00000 sub:1 rn:5 rd:5 \ - ra=%reg_movprfx esz=3 +FMLA_zzxz 01100100 0. 1 ..... 000000 ..... ..... @rrxr_3 esz=1 +FMLA_zzxz 01100100 10 1 ..... 000000 ..... ..... @rrxr_2 esz=2 +FMLA_zzxz 01100100 11 1 ..... 000000 ..... ..... @rrxr_1 esz=3 +FMLS_zzxz 01100100 0. 1 ..... 000001 ..... ..... @rrxr_3 esz=1 +FMLS_zzxz 01100100 10 1 ..... 000001 ..... ..... @rrxr_2 esz=2 +FMLS_zzxz 01100100 11 1 ..... 000001 ..... ..... @rrxr_1 esz=3 ### SVE FP Multiply Indexed Group diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4f4b383e52..ae443f3b20 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3813,26 +3813,34 @@ static bool trans_DOT_zzzz(DisasContext *s, arg_DOT_zzzz *a) return true; } -static bool trans_DOT_zzxw(DisasContext *s, arg_DOT_zzxw *a) +static bool do_zzxz_ool(DisasContext *s, arg_rrxr_esz *a, + gen_helper_gvec_4 *fn) { - static gen_helper_gvec_4 * const fns[2][2] = { - { gen_helper_gvec_sdot_idx_b, gen_helper_gvec_sdot_idx_h }, - { gen_helper_gvec_udot_idx_b, gen_helper_gvec_udot_idx_h } - }; - + if (fn == NULL) { + return false; + } if (sve_access_check(s)) { - gen_gvec_ool_zzzz(s, fns[a->u][a->sz], a->rd, a->rn, a->rm, - a->ra, a->index); + gen_gvec_ool_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, a->index); } return true; } +#define DO_RRXR(NAME, FUNC) \ + static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ + { return do_zzxz_ool(s, a, FUNC); } + +DO_RRXR(trans_SDOT_zzxw_s, gen_helper_gvec_sdot_idx_b) +DO_RRXR(trans_SDOT_zzxw_d, gen_helper_gvec_sdot_idx_h) +DO_RRXR(trans_UDOT_zzxw_s, gen_helper_gvec_udot_idx_b) +DO_RRXR(trans_UDOT_zzxw_d, gen_helper_gvec_udot_idx_h) + +#undef DO_RRXR /* *** SVE Floating Point Multiply-Add Indexed Group */ -static bool trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a) +static bool do_FMLA_zzxz(DisasContext *s, arg_rrxr_esz *a, bool sub) { static gen_helper_gvec_4_ptr * const fns[3] = { gen_helper_gvec_fmla_idx_h, @@ -3847,13 +3855,23 @@ static bool trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a) vec_full_reg_offset(s, a->rn), vec_full_reg_offset(s, a->rm), vec_full_reg_offset(s, a->ra), - status, vsz, vsz, (a->index << 1) | a->sub, + status, vsz, vsz, (a->index << 1) | sub, fns[a->esz - 1]); tcg_temp_free_ptr(status); } return true; } +static bool trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a) +{ + return do_FMLA_zzxz(s, a, false); +} + +static bool trans_FMLS_zzxz(DisasContext *s, arg_FMLA_zzxz *a) +{ + return do_FMLA_zzxz(s, a, true); +} + /* *** SVE Floating Point Multiply Indexed Group */ From patchwork Tue May 25 01:03:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277559 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 67721C2B9F7 for ; Tue, 25 May 2021 02:04:19 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1C26761209 for ; Tue, 25 May 2021 02:04:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1C26761209 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:44914 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMQc-0003ir-5V for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:04:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55618) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYT-0000sE-Ry for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:21 -0400 Received: from mail-pg1-x52f.google.com ([2607:f8b0:4864:20::52f]:45855) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXs-00043e-9o for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:21 -0400 Received: by mail-pg1-x52f.google.com with SMTP id q15so21396654pgg.12 for ; Mon, 24 May 2021 18:07:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=gnLuFEuk0ImYugyY8pF7Rs4So+H1zk2c+Dqze+bYb5s=; b=CrAgXX4xzWcVRFk4YJmEb1OHTmsc9y0jh082yYcylghbLtBFkX3IDimgERadfEznwm suzDRDu9KszKvUq3UvW5lYp6Arv9BuPNJUQFFTxYxhcWdtSYp/3oUgfbqudVLgORnqyD cvGrdgQ6VGm940vE0o/QdNMguU3979b4cEEU5xTLgu3pLvTKLLXvZNGng4LuAQQRsgCx zCI/6tzV2IWO8xWHpcgyzolrDQQ/LRu7PFQl19l2RAxBwJj8fLgF40UnPKkN6HcujQJw R2WSilsT72ytH7vcICPwnLyeymtt9CB7p12y4+wiGpwP0B1BxuCqpaMEVN/CcHG4v+10 5AjA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gnLuFEuk0ImYugyY8pF7Rs4So+H1zk2c+Dqze+bYb5s=; b=gqNfJVngG64MphfbN+2eQ7ktTkmNY25UkmC8OpEksUw470JucAaJ9/1sNH3psNntkd RnYqe5UbVJRKzjB8xn5Hzu5rL1QcdZv17o6nzB3gZ/awaIAFpfE/3HL804Ki/LgJ4/cz mWbFDVGbAXACWrBE2drfnIptESk7eBuI/ciDcnyZVD1zf3SSL9T9BZm5ObVdKwD73Bt1 RIY+MJZjzugR7h+h4T2lwNlmPm6wkVaDSOHgE673e6IPg1LD52Z/JT5y53/mHW4Iw/V+ Mwnvy3z1UNVf5KllSE1rEqBxy4N9/BTqoPrylKObIdgKrf3aeV87Hymf3p0knNIwq6X+ 1IMA== X-Gm-Message-State: AOAM5321TQi3DB+nsVgslveMTNFDQy4Cw5IEFpaJxJrPcb3tQ7sXU6FD ckyljcKX0SQSdceOjFC3DvIpj0QY2G/rxw== X-Google-Smtp-Source: ABdhPJwPilobaQaeilg9VxMSF483L+c3RXQiR3iv1cbJgmefNpdiguCZ3Zf4U4Ay2jofWZCvEHU9zA== X-Received: by 2002:a05:6a00:b51:b029:2d5:874a:6bd7 with SMTP id p17-20020a056a000b51b02902d5874a6bd7mr27819381pfo.6.1621904852284; Mon, 24 May 2021 18:07:32 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:32 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 53/92] target/arm: Implement SVE2 integer multiply (indexed) Date: Mon, 24 May 2021 18:03:19 -0700 Message-Id: <20210525010358.152808-54-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52f; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v7: Split arguments to do_sve2_zzz_data. --- target/arm/sve.decode | 7 +++++++ target/arm/translate-sve.c | 30 ++++++++++++++++++++++++++++++ 2 files changed, 37 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 74ac72bdbd..65cb0a2206 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -775,12 +775,19 @@ MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s DOT_zzzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 \ ra=%reg_movprfx +#### SVE Multiply - Indexed + # SVE integer dot product (indexed) SDOT_zzxw_s 01000100 10 1 ..... 000000 ..... ..... @rrxr_2 esz=2 SDOT_zzxw_d 01000100 11 1 ..... 000000 ..... ..... @rrxr_1 esz=3 UDOT_zzxw_s 01000100 10 1 ..... 000001 ..... ..... @rrxr_2 esz=2 UDOT_zzxw_d 01000100 11 1 ..... 000001 ..... ..... @rrxr_1 esz=3 +# SVE2 integer multiply (indexed) +MUL_zzx_h 01000100 0. 1 ..... 111110 ..... ..... @rrx_3 esz=1 +MUL_zzx_s 01000100 10 1 ..... 111110 ..... ..... @rrx_2 esz=2 +MUL_zzx_d 01000100 11 1 ..... 111110 ..... ..... @rrx_1 esz=3 + # SVE floating-point complex add (predicated) FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ rn=%reg_movprfx diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ae443f3b20..dbab067a53 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3813,6 +3813,10 @@ static bool trans_DOT_zzzz(DisasContext *s, arg_DOT_zzzz *a) return true; } +/* + * SVE Multiply - Indexed + */ + static bool do_zzxz_ool(DisasContext *s, arg_rrxr_esz *a, gen_helper_gvec_4 *fn) { @@ -3836,6 +3840,32 @@ DO_RRXR(trans_UDOT_zzxw_d, gen_helper_gvec_udot_idx_h) #undef DO_RRXR +static bool do_sve2_zzz_data(DisasContext *s, int rd, int rn, int rm, int data, + gen_helper_gvec_3 *fn) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + vsz, vsz, data, fn); + } + return true; +} + +#define DO_SVE2_RRX(NAME, FUNC) \ + static bool NAME(DisasContext *s, arg_rrx_esz *a) \ + { return do_sve2_zzz_data(s, a->rd, a->rn, a->rm, a->index, FUNC); } + +DO_SVE2_RRX(trans_MUL_zzx_h, gen_helper_gvec_mul_idx_h) +DO_SVE2_RRX(trans_MUL_zzx_s, gen_helper_gvec_mul_idx_s) +DO_SVE2_RRX(trans_MUL_zzx_d, gen_helper_gvec_mul_idx_d) + +#undef DO_SVE2_RRX + /* *** SVE Floating Point Multiply-Add Indexed Group */ From patchwork Tue May 25 01:03:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277523 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 374F7C2B9F7 for ; Tue, 25 May 2021 01:51:07 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D92046141B for ; Tue, 25 May 2021 01:51:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D92046141B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:57404 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMDp-0008FU-SZ for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:51:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55554) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYR-0000hC-D2 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:19 -0400 Received: from mail-pf1-x42b.google.com ([2607:f8b0:4864:20::42b]:44639) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXs-00043p-47 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:19 -0400 Received: by mail-pf1-x42b.google.com with SMTP id 22so21881753pfv.11 for ; Mon, 24 May 2021 18:07:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=znRmUtPpPzNdkgxhCUoIISqXwMWpCGz1tEvasknwhl0=; b=j5YHUEF77esK3OhPf0DtpTYt1FX9GDFvdcZc7maAJDS4GFZXr0eAmaCmAvgE47qJ7D QteP1209u6dym2+wu/HoFeq4i7DIwOjG9GuzNw+SfF1cqMQDxO0iE+bRtlUyL0pmLyx6 RNpSYkj6NI6ASGsx11QylxiVbfvC21W95LYHRcYspEw9lbTEdBiaIErZtaJn+jUXPMNF XMiRZscE9L9z0WtN+S3kXRLH14eD92bXyzHNHDbLQmNTQMiMOwMZcamOA9ioXLunmhdh Ecu+b2wPbnB2oshTy6hcRTt7+p2j3jPLhX/Vo3hkjt3dxBJjE8vNeBJUQbJLS65vzj/K 4tug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=znRmUtPpPzNdkgxhCUoIISqXwMWpCGz1tEvasknwhl0=; b=KJSe1I6KdEYWwQrb/5h5Mx3pCwB8e2sOypyKemiQaHfHpy9lfGGb2C/M4bwMdWfyyL Mvez9FOGjbv8laFKOXcZRPM2Hwpt6kgfzfM06RhA7xYgv7cIK48i6p8Wg9ZEKP/8jDzd WX5WGrvyWFWHVRdcLBHlnhB+/V8YbAjaMXX8F1prQ6JN/65/jP/foBqtz+xBsOU3Bqrv sVkCyEMNxcphqxIZ+2U3ZHVB6NPu1DIaPK5K28CnrV1hB5J/qkODiFm2ULEOy9W7wrd8 fw8z1wEZuzH2SUe9hWVA40z8ZPWToYW7xBUdvfB8o2F8ezJyF5vZ1pmG0Fy4UbUsb+9M cv5Q== X-Gm-Message-State: AOAM530AVJD5kzNBnDK70VXM6u14hO941wWrzgCp9tv56WwTwL82aPlA G9WdBfB4BJe9hLgdi042QV9/UgbAP4ivyg== X-Google-Smtp-Source: ABdhPJzr1jAeRfDJk1A7VsqZMT5OLnaGSXyVPvocWYRomHNDH/WAPOFvOEMwusJwHorJk0GPvz9ioQ== X-Received: by 2002:a63:6f8e:: with SMTP id k136mr16602341pgc.326.1621904852971; Mon, 24 May 2021 18:07:32 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:32 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 54/92] target/arm: Implement SVE2 integer multiply-add (indexed) Date: Mon, 24 May 2021 18:03:20 -0700 Message-Id: <20210525010358.152808-55-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42b; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v7: Split arguments to do_sve2_zzzz_data. --- target/arm/sve.decode | 8 ++++++++ target/arm/translate-sve.c | 31 +++++++++++++++++++++++++++++++ 2 files changed, 39 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 65cb0a2206..9bfaf737b7 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -783,6 +783,14 @@ SDOT_zzxw_d 01000100 11 1 ..... 000000 ..... ..... @rrxr_1 esz=3 UDOT_zzxw_s 01000100 10 1 ..... 000001 ..... ..... @rrxr_2 esz=2 UDOT_zzxw_d 01000100 11 1 ..... 000001 ..... ..... @rrxr_1 esz=3 +# SVE2 integer multiply-add (indexed) +MLA_zzxz_h 01000100 0. 1 ..... 000010 ..... ..... @rrxr_3 esz=1 +MLA_zzxz_s 01000100 10 1 ..... 000010 ..... ..... @rrxr_2 esz=2 +MLA_zzxz_d 01000100 11 1 ..... 000010 ..... ..... @rrxr_1 esz=3 +MLS_zzxz_h 01000100 0. 1 ..... 000011 ..... ..... @rrxr_3 esz=1 +MLS_zzxz_s 01000100 10 1 ..... 000011 ..... ..... @rrxr_2 esz=2 +MLS_zzxz_d 01000100 11 1 ..... 000011 ..... ..... @rrxr_1 esz=3 + # SVE2 integer multiply (indexed) MUL_zzx_h 01000100 0. 1 ..... 111110 ..... ..... @rrx_3 esz=1 MUL_zzx_s 01000100 10 1 ..... 111110 ..... ..... @rrx_2 esz=2 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index dbab067a53..39a6839de4 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3866,6 +3866,37 @@ DO_SVE2_RRX(trans_MUL_zzx_d, gen_helper_gvec_mul_idx_d) #undef DO_SVE2_RRX +static bool do_sve2_zzzz_data(DisasContext *s, int rd, int rn, int rm, int ra, + int data, gen_helper_gvec_4 *fn) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + vec_full_reg_offset(s, ra), + vsz, vsz, data, fn); + } + return true; +} + +#define DO_SVE2_RRXR(NAME, FUNC) \ + static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ + { return do_sve2_zzzz_data(s, a->rd, a->rn, a->rm, a->ra, a->index, FUNC); } + +DO_SVE2_RRXR(trans_MLA_zzxz_h, gen_helper_gvec_mla_idx_h) +DO_SVE2_RRXR(trans_MLA_zzxz_s, gen_helper_gvec_mla_idx_s) +DO_SVE2_RRXR(trans_MLA_zzxz_d, gen_helper_gvec_mla_idx_d) + +DO_SVE2_RRXR(trans_MLS_zzxz_h, gen_helper_gvec_mls_idx_h) +DO_SVE2_RRXR(trans_MLS_zzxz_s, gen_helper_gvec_mls_idx_s) +DO_SVE2_RRXR(trans_MLS_zzxz_d, gen_helper_gvec_mls_idx_d) + +#undef DO_SVE2_RRXR + /* *** SVE Floating Point Multiply-Add Indexed Group */ From patchwork Tue May 25 01:03:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277507 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E06B5C2B9F7 for ; Tue, 25 May 2021 01:44:38 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 695D361413 for ; Tue, 25 May 2021 01:44:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 695D361413 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:33706 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM7Z-0000xo-Gp for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:44:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55186) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLXz-00084T-Ic for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:51 -0400 Received: from mail-pg1-x52f.google.com ([2607:f8b0:4864:20::52f]:35621) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXl-000440-Gs for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:51 -0400 Received: by mail-pg1-x52f.google.com with SMTP id m190so21413187pga.2 for ; Mon, 24 May 2021 18:07:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=GQHvgtMZf/PjykaQJIid8lZwo0SRrWyxRSDDb4lrulY=; b=g3xGsFy5opLIo5unDrg1KRXqo5k60tNFHYAN8WuyrL7OF529G3sfg2DrB680A8JfbF EGGAWlBq1lPcikm6gJXtBGLmGQEaB83FuTjgRRQzWA6RFERT4Aziggm7/Iae/y9kq/QD 8QVKqYN18yjo6gEYBUdGDKnoPuDg+vXLndhp2HxDPMEav3+Zh/6Q23zJITJFh1h9jsw+ VL68zXmfs3OHgFfAZ+gQvbJ9laU6kYUeZakZ4pDikrcPsxpG/9+8mKuFqr3XUhFK/HFG 5UDypovlMjj/3kHNDoikka5FtJjNVl+3Sb/pZ4o2s7N+DLaFAnC+1TeFxa9MRoD5Z35V ru3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=GQHvgtMZf/PjykaQJIid8lZwo0SRrWyxRSDDb4lrulY=; b=SyqoQ95H21a83TSVO1EML3/byiMXBeGTJlpuem/VfANRgDAwtp79jSwlw8+rmCW9RW YBfvEC/eab/7x6IK/0JzHZcStdGnn08RZ24UNA5iZrb9GLWKrQ1Ihwsa9RDQyb5DQniy xELlNS6XcsHX9a4KTV19CLrRb4JgqCOTDVfc/VatuyzqNlv+2e/Nc1f1cvKnQgspYHmI YdaIzM5sXVA3/ey05VdR8ERIfLeN8g6Qb5CtOcCfaN3Wdneq/Di23egtaa77w1rYjoFm rA0mL92CEWsp0rX1B/BURPJwmFBzJ7IQkiGuz4PZJBztk7qSvoVPhluP8V+ccopaoQ0K Lm6A== X-Gm-Message-State: AOAM530giq2f+NPEQeQU23EbLX41XCUrP20XGvj3raGxsKcF/0DEystn 8094xOj1RsTCxKNqwPlFBKZvG7j2xET26w== X-Google-Smtp-Source: ABdhPJzVvXpmn0YbxWpbRdluT0L9ELExIPF+AMlX59lhWo4+p66q5idOl9gGbbzuQfcbpB9PgwF5EA== X-Received: by 2002:aa7:8b44:0:b029:2dd:4cfc:7666 with SMTP id i4-20020aa78b440000b02902dd4cfc7666mr27830968pfd.73.1621904853518; Mon, 24 May 2021 18:07:33 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:33 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 55/92] target/arm: Implement SVE2 saturating multiply-add high (indexed) Date: Mon, 24 May 2021 18:03:21 -0700 Message-Id: <20210525010358.152808-56-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52f; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 ++++++++++++++ target/arm/sve.decode | 8 ++++++++ target/arm/sve_helper.c | 36 ++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 8 ++++++++ 4 files changed, 66 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 7e99dcd119..fe67574741 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2665,3 +2665,17 @@ DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_d, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(fmmla_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(fmmla_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 9bfaf737b7..1956d96ad5 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -791,6 +791,14 @@ MLS_zzxz_h 01000100 0. 1 ..... 000011 ..... ..... @rrxr_3 esz=1 MLS_zzxz_s 01000100 10 1 ..... 000011 ..... ..... @rrxr_2 esz=2 MLS_zzxz_d 01000100 11 1 ..... 000011 ..... ..... @rrxr_1 esz=3 +# SVE2 saturating multiply-add high (indexed) +SQRDMLAH_zzxz_h 01000100 0. 1 ..... 000100 ..... ..... @rrxr_3 esz=1 +SQRDMLAH_zzxz_s 01000100 10 1 ..... 000100 ..... ..... @rrxr_2 esz=2 +SQRDMLAH_zzxz_d 01000100 11 1 ..... 000100 ..... ..... @rrxr_1 esz=3 +SQRDMLSH_zzxz_h 01000100 0. 1 ..... 000101 ..... ..... @rrxr_3 esz=1 +SQRDMLSH_zzxz_s 01000100 10 1 ..... 000101 ..... ..... @rrxr_2 esz=2 +SQRDMLSH_zzxz_d 01000100 11 1 ..... 000101 ..... ..... @rrxr_1 esz=3 + # SVE2 integer multiply (indexed) MUL_zzx_h 01000100 0. 1 ..... 111110 ..... ..... @rrx_3 esz=1 MUL_zzx_s 01000100 10 1 ..... 111110 ..... ..... @rrx_2 esz=2 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index fa96e28639..11d4a2a722 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1499,6 +1499,42 @@ DO_CMLA_FUNC(sve2_sqrdcmlah_zzzz_d, int64_t, , DO_SQRDMLAH_D) #undef DO_SQRDMLAH_S #undef DO_SQRDMLAH_D +#define DO_ZZXZ(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \ + intptr_t i, j, idx = simd_data(desc); \ + TYPE *d = vd, *a = va, *n = vn, *m = (TYPE *)vm + H(idx); \ + for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \ + TYPE mm = m[i]; \ + for (j = 0; j < segment; j++) { \ + d[i + j] = OP(n[i + j], mm, a[i + j]); \ + } \ + } \ +} + +#define DO_SQRDMLAH_H(N, M, A) \ + ({ uint32_t discard; do_sqrdmlah_h(N, M, A, false, true, &discard); }) +#define DO_SQRDMLAH_S(N, M, A) \ + ({ uint32_t discard; do_sqrdmlah_s(N, M, A, false, true, &discard); }) +#define DO_SQRDMLAH_D(N, M, A) do_sqrdmlah_d(N, M, A, false, true) + +DO_ZZXZ(sve2_sqrdmlah_idx_h, int16_t, H2, DO_SQRDMLAH_H) +DO_ZZXZ(sve2_sqrdmlah_idx_s, int32_t, H4, DO_SQRDMLAH_S) +DO_ZZXZ(sve2_sqrdmlah_idx_d, int64_t, , DO_SQRDMLAH_D) + +#define DO_SQRDMLSH_H(N, M, A) \ + ({ uint32_t discard; do_sqrdmlah_h(N, M, A, true, true, &discard); }) +#define DO_SQRDMLSH_S(N, M, A) \ + ({ uint32_t discard; do_sqrdmlah_s(N, M, A, true, true, &discard); }) +#define DO_SQRDMLSH_D(N, M, A) do_sqrdmlah_d(N, M, A, true, true) + +DO_ZZXZ(sve2_sqrdmlsh_idx_h, int16_t, H2, DO_SQRDMLSH_H) +DO_ZZXZ(sve2_sqrdmlsh_idx_s, int32_t, H4, DO_SQRDMLSH_S) +DO_ZZXZ(sve2_sqrdmlsh_idx_d, int64_t, , DO_SQRDMLSH_D) + +#undef DO_ZZXZ + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 39a6839de4..b31a4d1fb2 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3895,6 +3895,14 @@ DO_SVE2_RRXR(trans_MLS_zzxz_h, gen_helper_gvec_mls_idx_h) DO_SVE2_RRXR(trans_MLS_zzxz_s, gen_helper_gvec_mls_idx_s) DO_SVE2_RRXR(trans_MLS_zzxz_d, gen_helper_gvec_mls_idx_d) +DO_SVE2_RRXR(trans_SQRDMLAH_zzxz_h, gen_helper_sve2_sqrdmlah_idx_h) +DO_SVE2_RRXR(trans_SQRDMLAH_zzxz_s, gen_helper_sve2_sqrdmlah_idx_s) +DO_SVE2_RRXR(trans_SQRDMLAH_zzxz_d, gen_helper_sve2_sqrdmlah_idx_d) + +DO_SVE2_RRXR(trans_SQRDMLSH_zzxz_h, gen_helper_sve2_sqrdmlsh_idx_h) +DO_SVE2_RRXR(trans_SQRDMLSH_zzxz_s, gen_helper_sve2_sqrdmlsh_idx_s) +DO_SVE2_RRXR(trans_SQRDMLSH_zzxz_d, gen_helper_sve2_sqrdmlsh_idx_d) + #undef DO_SVE2_RRXR /* From patchwork Tue May 25 01:03:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277509 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A77E4C2B9F8 for ; Tue, 25 May 2021 01:45:58 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 27BAE61415 for ; Tue, 25 May 2021 01:45:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 27BAE61415 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:38492 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llM8r-0004BB-2i for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:45:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55164) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLXy-00080N-Mb for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:50 -0400 Received: from mail-pg1-x529.google.com ([2607:f8b0:4864:20::529]:44804) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXj-00044A-Kh for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:50 -0400 Received: by mail-pg1-x529.google.com with SMTP id 29so10305575pgu.11 for ; Mon, 24 May 2021 18:07:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=OYiNt0wX2Mv0iXs0hRuMcwgPOgtkctEvtFu6Xu39RU0=; b=AwpMP1LYx9lPT3Ci7qJhSEn7PbJ5/P4H6tTLhwbqOKxS1e0YOOzxXagI34XaahUVrc EaKnrPPXSbxyKcHaopi6QVhbWBvRmewWKryCIuDltNH4zTnFdrBu4hJlK2x1JJ/f59n4 O4t+EIRCTpn/1QrVGKKi7P3lal2l8o0bjUTiUaNlnVGkqG0aGy3hdFkkdoo4KXghqXNx Db9mnFkGFXFv/bsnvRjFMOwflj7afpBgaYiUnZmOPSjk/0MNfbCphRBwuvci5j8Hp52s pUmLnxkoaE0q9mCcX4VpOTw1MEq0u2UYH2StgMnElX2BgWoeAjXWJaNWYXhTK3hXU2TR ZgiQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=OYiNt0wX2Mv0iXs0hRuMcwgPOgtkctEvtFu6Xu39RU0=; b=ZvS0QzPOiKBR6RXneAEeOQRYo8F4PeSgMA/xIGUKEXMRTVLwQy6GMFpzi2HzpANj6t UlldGI2IOHDCInLtL3LcF5U0oCx/REj3eCAJPjUgDcjJ9v9uGct4dgNRrL3ykupF6yih VSJpasc/KkhcDJIzlT8SMBOuVS1jx0kmIXTK2JEJijn2ixnD7XGsZfyaebJvYJJhYpbF qY1/4KgKIR7S/jal8qvrY3A9KvCvUunE50mZX/DXyEUyU+c934QEBwQjWZhhpdFaB7uZ bvcIMrNeMcKIw0qbaBhha00Irgz3T7trz+e5NOnHMl/oFV6LIXrP69R+U0GFLoIJ1NSS j9tw== X-Gm-Message-State: AOAM533FZ7H5DNWfs+m7hJebMuK7yNUOZ65xYQJ6XAPZMmJB94QJZMUM tLRCc5p03m6bqm3Hk5LpHEA5Zg93dl9DFg== X-Google-Smtp-Source: ABdhPJxpYFJEMxEVX9+4436PDK/5S2COlvstSMeyhVX4adapTygb0+pJbjU/IpJ+03QRuWqFfeJkDQ== X-Received: by 2002:a63:4e4f:: with SMTP id o15mr16448551pgl.208.1621904854053; Mon, 24 May 2021 18:07:34 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:33 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 56/92] target/arm: Implement SVE2 saturating multiply-add (indexed) Date: Mon, 24 May 2021 18:03:22 -0700 Message-Id: <20210525010358.152808-57-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::529; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x529.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 9 +++++++++ target/arm/sve.decode | 18 ++++++++++++++++++ target/arm/sve_helper.c | 30 ++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 19 +++++++++++++++++++ 4 files changed, 76 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index fe67574741..08398800bd 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2679,3 +2679,12 @@ DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqdmlal_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlal_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlsl_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlsl_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 1956d96ad5..8d2709d3cc 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -30,6 +30,8 @@ %size_23 23:2 %dtype_23_13 23:2 13:2 %index3_22_19 22:1 19:2 +%index3_19_11 19:2 11:1 +%index2_20_11 20:1 11:1 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=tszimm_esz @@ -261,6 +263,12 @@ @rrxr_1 ........ .. . index:1 rm:4 ...... rn:5 rd:5 \ &rrxr_esz ra=%reg_movprfx +# Three registers and a scalar by N-bit index, alternate +@rrxr_3a ........ .. ... rm:3 ...... rn:5 rd:5 \ + &rrxr_esz ra=%reg_movprfx index=%index3_19_11 +@rrxr_2a ........ .. .. rm:4 ...... rn:5 rd:5 \ + &rrxr_esz ra=%reg_movprfx index=%index2_20_11 + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -799,6 +807,16 @@ SQRDMLSH_zzxz_h 01000100 0. 1 ..... 000101 ..... ..... @rrxr_3 esz=1 SQRDMLSH_zzxz_s 01000100 10 1 ..... 000101 ..... ..... @rrxr_2 esz=2 SQRDMLSH_zzxz_d 01000100 11 1 ..... 000101 ..... ..... @rrxr_1 esz=3 +# SVE2 saturating multiply-add (indexed) +SQDMLALB_zzxw_s 01000100 10 1 ..... 0010.0 ..... ..... @rrxr_3a esz=2 +SQDMLALB_zzxw_d 01000100 11 1 ..... 0010.0 ..... ..... @rrxr_2a esz=3 +SQDMLALT_zzxw_s 01000100 10 1 ..... 0010.1 ..... ..... @rrxr_3a esz=2 +SQDMLALT_zzxw_d 01000100 11 1 ..... 0010.1 ..... ..... @rrxr_2a esz=3 +SQDMLSLB_zzxw_s 01000100 10 1 ..... 0011.0 ..... ..... @rrxr_3a esz=2 +SQDMLSLB_zzxw_d 01000100 11 1 ..... 0011.0 ..... ..... @rrxr_2a esz=3 +SQDMLSLT_zzxw_s 01000100 10 1 ..... 0011.1 ..... ..... @rrxr_3a esz=2 +SQDMLSLT_zzxw_d 01000100 11 1 ..... 0011.1 ..... ..... @rrxr_2a esz=3 + # SVE2 integer multiply (indexed) MUL_zzx_h 01000100 0. 1 ..... 111110 ..... ..... @rrx_3 esz=1 MUL_zzx_s 01000100 10 1 ..... 111110 ..... ..... @rrx_2 esz=2 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 11d4a2a722..b80bd15085 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1535,6 +1535,36 @@ DO_ZZXZ(sve2_sqrdmlsh_idx_d, int64_t, , DO_SQRDMLSH_D) #undef DO_ZZXZ +#define DO_ZZXW(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t i, j, oprsz = simd_oprsz(desc); \ + intptr_t sel = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPEN); \ + intptr_t idx = extract32(desc, SIMD_DATA_SHIFT + 1, 3) * sizeof(TYPEN); \ + for (i = 0; i < oprsz; i += 16) { \ + TYPEW mm = *(TYPEN *)(vm + HN(i + idx)); \ + for (j = 0; j < 16; j += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEN *)(vn + HN(i + j + sel)); \ + TYPEW aa = *(TYPEW *)(va + HW(i + j)); \ + *(TYPEW *)(vd + HW(i + j)) = OP(nn, mm, aa); \ + } \ + } \ +} + +#define DO_SQDMLAL_S(N, M, A) DO_SQADD_S(A, do_sqdmull_s(N, M)) +#define DO_SQDMLAL_D(N, M, A) do_sqadd_d(A, do_sqdmull_d(N, M)) + +DO_ZZXW(sve2_sqdmlal_idx_s, int32_t, int16_t, H1_4, H1_2, DO_SQDMLAL_S) +DO_ZZXW(sve2_sqdmlal_idx_d, int64_t, int32_t, , H1_4, DO_SQDMLAL_D) + +#define DO_SQDMLSL_S(N, M, A) DO_SQSUB_S(A, do_sqdmull_s(N, M)) +#define DO_SQDMLSL_D(N, M, A) do_sqsub_d(A, do_sqdmull_d(N, M)) + +DO_ZZXW(sve2_sqdmlsl_idx_s, int32_t, int16_t, H1_4, H1_2, DO_SQDMLSL_S) +DO_ZZXW(sve2_sqdmlsl_idx_d, int64_t, int32_t, , H1_4, DO_SQDMLSL_D) + +#undef DO_ZZXW + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index b31a4d1fb2..3e7f310d59 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3905,6 +3905,25 @@ DO_SVE2_RRXR(trans_SQRDMLSH_zzxz_d, gen_helper_sve2_sqrdmlsh_idx_d) #undef DO_SVE2_RRXR +#define DO_SVE2_RRXR_TB(NAME, FUNC, TOP) \ + static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ + { \ + return do_sve2_zzzz_data(s, a->rd, a->rn, a->rm, a->rd, \ + (a->index << 1) | TOP, FUNC); \ + } + +DO_SVE2_RRXR_TB(trans_SQDMLALB_zzxw_s, gen_helper_sve2_sqdmlal_idx_s, false) +DO_SVE2_RRXR_TB(trans_SQDMLALB_zzxw_d, gen_helper_sve2_sqdmlal_idx_d, false) +DO_SVE2_RRXR_TB(trans_SQDMLALT_zzxw_s, gen_helper_sve2_sqdmlal_idx_s, true) +DO_SVE2_RRXR_TB(trans_SQDMLALT_zzxw_d, gen_helper_sve2_sqdmlal_idx_d, true) + +DO_SVE2_RRXR_TB(trans_SQDMLSLB_zzxw_s, gen_helper_sve2_sqdmlsl_idx_s, false) +DO_SVE2_RRXR_TB(trans_SQDMLSLB_zzxw_d, gen_helper_sve2_sqdmlsl_idx_d, false) +DO_SVE2_RRXR_TB(trans_SQDMLSLT_zzxw_s, gen_helper_sve2_sqdmlsl_idx_s, true) +DO_SVE2_RRXR_TB(trans_SQDMLSLT_zzxw_d, gen_helper_sve2_sqdmlsl_idx_d, true) + +#undef DO_SVE2_RRXR_TB + /* *** SVE Floating Point Multiply-Add Indexed Group */ From patchwork Tue May 25 01:03:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277583 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9E59AC2B9F7 for ; Tue, 25 May 2021 02:10:36 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3655F6141D for ; Tue, 25 May 2021 02:10:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3655F6141D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:40600 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMWh-0002vx-AP for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:10:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55508) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYP-0000ZD-PZ for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:17 -0400 Received: from mail-pl1-x629.google.com ([2607:f8b0:4864:20::629]:33342) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXs-00044J-3c for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:17 -0400 Received: by mail-pl1-x629.google.com with SMTP id b7so11346523plg.0 for ; Mon, 24 May 2021 18:07:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=b3dj9ONkMwSM4D3dV9NQFfT4vbk+6eB3o1NU5yiO8yU=; b=tfNdaPR1hA9dvjf4feWwtzwtebdGB4nIXgU8LjykMIFyzS9c9cHtW+NR7GgmKzsIxL Uz3T5+tWiBZKnQcRa/6hzn5LEDeaeqyjeKjDRX22bhxy6oS5j5f08tISHEjjyQe2qTaD 77iXaHJMTIhm6NItUMmKEK3U/JaO32UByW9lQeLXXJZ790FiCdEKSOIWMb6ZZJsZoHsD /tvkJoQ3toZbRw7YyESY9pSDelEg3nIhlaCHRzOfMLKOMbqF6sptI1oQlk0mDaYUJpCc 11Sp+4ZnKLi2H9lbLpRl6WHx5hiH869vN5MgN06IvavBeT7MjGI2uk5aohpOH+fUGmkx VZmA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=b3dj9ONkMwSM4D3dV9NQFfT4vbk+6eB3o1NU5yiO8yU=; b=gbo4MEJCN97ePlg9//p941A/RfMc5LaXSbRK0JvuTKTRnETpPjfKN3UTpFya5UYTt/ mpFSwePrZX0wK/SxUuQs2VKYZBwlBgDdvkKsWVLdQoIxPRaE3vVJHEzrKFlqnP/5/zOF iRzNq+LsPkWXF5HYNlYtInIVfygOANBhwXuBtHpMuH/FWhILTS/Q4IggvbqlWU1T1Lwh z8SXb/MRguU4YadQnAyHza08zmA/oPR+jlBp9u/AexsZoFc2SxchsX2fnOEEnIvaqHr6 N75sNYagZVHYYv3bqptHlq+EjCmWFZKFLig2OWwHWDrEwtnCXgWyaggVvj29MBpTUWnX pnBg== X-Gm-Message-State: AOAM533aKdxfKp76se6QnirK/hNNm/yZx9zo2mue4nrQHtg1tTqyRCO4 KB6wM3FbVuFyrkr//Ok6tkUZMGVRbeL1sw== X-Google-Smtp-Source: ABdhPJwhdgnxlayGRXVdL4Zh536Q0lPtyM3Hv4u9HrStVEoQhi0G3tnQfEcJh6Tz44+NPrkz3Ca5ug== X-Received: by 2002:a17:902:bd42:b029:ee:2ada:b6fc with SMTP id b2-20020a170902bd42b02900ee2adab6fcmr28155792plx.59.1621904854762; Mon, 24 May 2021 18:07:34 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:34 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 57/92] target/arm: Implement SVE2 saturating multiply (indexed) Date: Mon, 24 May 2021 18:03:23 -0700 Message-Id: <20210525010358.152808-58-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::629; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x629.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve.decode | 12 ++++++++++++ target/arm/sve_helper.c | 20 ++++++++++++++++++++ target/arm/translate-sve.c | 14 ++++++++++++++ 4 files changed, 51 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 08398800bd..0be0d90bee 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2688,3 +2688,8 @@ DEF_HELPER_FLAGS_5(sve2_sqdmlsl_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_sqdmlsl_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqdmull_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmull_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 8d2709d3cc..a3b9fb95f9 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -255,6 +255,12 @@ @rrx_2 ........ .. . index:2 rm:3 ...... rn:5 rd:5 &rrx_esz @rrx_1 ........ .. . index:1 rm:4 ...... rn:5 rd:5 &rrx_esz +# Two registers and a scalar by N-bit index, alternate +@rrx_3a ........ .. . .. rm:3 ...... rn:5 rd:5 \ + &rrx_esz index=%index3_19_11 +@rrx_2a ........ .. . . rm:4 ...... rn:5 rd:5 \ + &rrx_esz index=%index2_20_11 + # Three registers and a scalar by N-bit index @rrxr_3 ........ .. . .. rm:3 ...... rn:5 rd:5 \ &rrxr_esz ra=%reg_movprfx index=%index3_22_19 @@ -817,6 +823,12 @@ SQDMLSLB_zzxw_d 01000100 11 1 ..... 0011.0 ..... ..... @rrxr_2a esz=3 SQDMLSLT_zzxw_s 01000100 10 1 ..... 0011.1 ..... ..... @rrxr_3a esz=2 SQDMLSLT_zzxw_d 01000100 11 1 ..... 0011.1 ..... ..... @rrxr_2a esz=3 +# SVE2 saturating multiply (indexed) +SQDMULLB_zzx_s 01000100 10 1 ..... 1110.0 ..... ..... @rrx_3a esz=2 +SQDMULLB_zzx_d 01000100 11 1 ..... 1110.0 ..... ..... @rrx_2a esz=3 +SQDMULLT_zzx_s 01000100 10 1 ..... 1110.1 ..... ..... @rrx_3a esz=2 +SQDMULLT_zzx_d 01000100 11 1 ..... 1110.1 ..... ..... @rrx_2a esz=3 + # SVE2 integer multiply (indexed) MUL_zzx_h 01000100 0. 1 ..... 111110 ..... ..... @rrx_3 esz=1 MUL_zzx_s 01000100 10 1 ..... 111110 ..... ..... @rrx_2 esz=2 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b80bd15085..3953e2f502 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1565,6 +1565,26 @@ DO_ZZXW(sve2_sqdmlsl_idx_d, int64_t, int32_t, , H1_4, DO_SQDMLSL_D) #undef DO_ZZXW +#define DO_ZZX(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, j, oprsz = simd_oprsz(desc); \ + intptr_t sel = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPEN); \ + intptr_t idx = extract32(desc, SIMD_DATA_SHIFT + 1, 3) * sizeof(TYPEN); \ + for (i = 0; i < oprsz; i += 16) { \ + TYPEW mm = *(TYPEN *)(vm + HN(i + idx)); \ + for (j = 0; j < 16; j += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEN *)(vn + HN(i + j + sel)); \ + *(TYPEW *)(vd + HW(i + j)) = OP(nn, mm); \ + } \ + } \ +} + +DO_ZZX(sve2_sqdmull_idx_s, int32_t, int16_t, H1_4, H1_2, do_sqdmull_s) +DO_ZZX(sve2_sqdmull_idx_d, int64_t, int32_t, , H1_4, do_sqdmull_d) + +#undef DO_ZZX + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3e7f310d59..c009ec54ff 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3866,6 +3866,20 @@ DO_SVE2_RRX(trans_MUL_zzx_d, gen_helper_gvec_mul_idx_d) #undef DO_SVE2_RRX +#define DO_SVE2_RRX_TB(NAME, FUNC, TOP) \ + static bool NAME(DisasContext *s, arg_rrx_esz *a) \ + { \ + return do_sve2_zzz_data(s, a->rd, a->rn, a->rm, \ + (a->index << 1) | TOP, FUNC); \ + } + +DO_SVE2_RRX_TB(trans_SQDMULLB_zzx_s, gen_helper_sve2_sqdmull_idx_s, false) +DO_SVE2_RRX_TB(trans_SQDMULLB_zzx_d, gen_helper_sve2_sqdmull_idx_d, false) +DO_SVE2_RRX_TB(trans_SQDMULLT_zzx_s, gen_helper_sve2_sqdmull_idx_s, true) +DO_SVE2_RRX_TB(trans_SQDMULLT_zzx_d, gen_helper_sve2_sqdmull_idx_d, true) + +#undef DO_SVE2_RRX_TB + static bool do_sve2_zzzz_data(DisasContext *s, int rd, int rn, int rm, int ra, int data, gen_helper_gvec_4 *fn) { From patchwork Tue May 25 01:03:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277563 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0F454C2B9F7 for ; Tue, 25 May 2021 02:05:09 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 88F0E6141B for ; Tue, 25 May 2021 02:05:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 88F0E6141B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:46666 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMRP-0004tp-MW for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:05:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55356) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYJ-00009F-Vo for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:12 -0400 Received: from mail-pf1-x42c.google.com ([2607:f8b0:4864:20::42c]:37419) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXr-00044V-UL for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:11 -0400 Received: by mail-pf1-x42c.google.com with SMTP id q67so5226909pfb.4 for ; Mon, 24 May 2021 18:07:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=qBXkWxquMOq2x86z6aJY3GTw4+QZqr7gtzGtz2kiP3U=; b=zCqAff79u6hmnWvJltgUKrgC2aCaKHb4OS3jVKTebFpIuJ0SkEx1x9XMUTUmXzX4rD H2V4y1qi2YA+iR1MTaAYU3cbMQDCeCIZv9D7oxQA5I+0K70JKcip0aN3K9fkDMjID2Lt 0CmnBlhq4nuczikOals0uw58MRAc+VdoAmFn0sgaXIjtyX6pjvFFpcXkQvzz9QxBTLSD Zmy2WFFL/6HVzqtbuippygq8ixIlQkdnPyMA0puWpj5hlZD1jhGtrZ7U8pJ4OpltjAi2 XPCTad+/Sfh/cjUifqnkrSFhC0J0AaTtyCkVKwNtEaZs4IHza//fxUDnT4ByU3rIP18L Fbvg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=qBXkWxquMOq2x86z6aJY3GTw4+QZqr7gtzGtz2kiP3U=; b=AqP94SpIjbBWZDqYRHPnZxQuVtQKJNfpu+OvFhHm5AvD/3vIZLTY6vfxXibBexe5lG AxQIk5hOkAA1Jrc/bRM2YipA7aBT9QEgQFt+BAFB3gEhkIop7Q49QlhWGgKB0agAw/c1 EvFgtaXhEjESxj/xRsAaRCzd+C4Ky3MK/O1EJrCzJ+O5B46FH4Y/TfyUcjaJeOlRtKoT nZPgkfEBOM29Jo2KqnbLKKTf412iyrCZglUEHW0OcmSyAjdsP8X8MCKbO/BLf2cz6pQn zdVA/+9YBjoSeLoQRHdUvN/3gos9lgpKwuAkH/YR84g/nUer/3X59tVaAutPOdDxhEkn d3WA== X-Gm-Message-State: AOAM533KYveuEjD69DwqZjNOBpSyBy1GKAHM15d89Mj/JFlzi0hwQxgf fSzQYq15QFBphXClw7nYlJ3S351zdXPFcw== X-Google-Smtp-Source: ABdhPJzjRLtoUAHSsPBxdcssSKWtKaYaKsO3JjfGPp8UKU+Rc7DOCbNVfd4kFhQJ6aZoh9P2h4Nmog== X-Received: by 2002:a63:1e4f:: with SMTP id p15mr16310158pgm.40.1621904855389; Mon, 24 May 2021 18:07:35 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:35 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 58/92] target/arm: Implement SVE2 signed saturating doubling multiply high Date: Mon, 24 May 2021 18:03:24 -0700 Message-Id: <20210525010358.152808-59-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42c; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 +++++ target/arm/sve.decode | 4 ++ target/arm/translate-sve.c | 18 ++++++++ target/arm/vec_helper.c | 84 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 116 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index 72c5bf6aca..eb94b6b1e6 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -957,6 +957,16 @@ DEF_HELPER_FLAGS_5(neon_sqrdmulh_h, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(neon_sqrdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(gvec_xar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) #ifdef TARGET_AARCH64 diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a3b9fb95f9..407d3019d1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1202,6 +1202,10 @@ SMULH_zzz 00000100 .. 1 ..... 0110 10 ..... ..... @rd_rn_rm UMULH_zzz 00000100 .. 1 ..... 0110 11 ..... ..... @rd_rn_rm PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 +# SVE2 signed saturating doubling multiply high (unpredicated) +SQDMULH_zzz 00000100 .. 1 ..... 0111 00 ..... ..... @rd_rn_rm +SQRDMULH_zzz 00000100 .. 1 ..... 0111 01 ..... ..... @rd_rn_rm + ### SVE2 Integer - Predicated SADALP_zpzz 01000100 .. 000 100 101 ... ..... ..... @rdm_pg_rn diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c009ec54ff..001432eccc 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6433,6 +6433,24 @@ static bool trans_PMUL_zzz(DisasContext *s, arg_rrr_esz *a) return do_sve2_zzz_ool(s, a, gen_helper_gvec_pmul_b); } +static bool trans_SQDMULH_zzz(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqdmulh_b, gen_helper_sve2_sqdmulh_h, + gen_helper_sve2_sqdmulh_s, gen_helper_sve2_sqdmulh_d, + }; + return do_sve2_zzz_ool(s, a, fns[a->esz]); +} + +static bool trans_SQRDMULH_zzz(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqrdmulh_b, gen_helper_sve2_sqrdmulh_h, + gen_helper_sve2_sqrdmulh_s, gen_helper_sve2_sqrdmulh_d, + }; + return do_sve2_zzz_ool(s, a, fns[a->esz]); +} + /* * SVE2 Integer - Predicated */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index b19877e0d3..25061c15e1 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -81,6 +81,26 @@ void HELPER(sve2_sqrdmlsh_b)(void *vd, void *vn, void *vm, } } +void HELPER(sve2_sqdmulh_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = do_sqrdmlah_b(n[i], m[i], 0, false, false); + } +} + +void HELPER(sve2_sqrdmulh_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = do_sqrdmlah_b(n[i], m[i], 0, false, true); + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */ int16_t do_sqrdmlah_h(int16_t src1, int16_t src2, int16_t src3, bool neg, bool round, uint32_t *sat) @@ -198,6 +218,28 @@ void HELPER(sve2_sqrdmlsh_h)(void *vd, void *vn, void *vm, } } +void HELPER(sve2_sqdmulh_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm; + uint32_t discard; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = do_sqrdmlah_h(n[i], m[i], 0, false, false, &discard); + } +} + +void HELPER(sve2_sqrdmulh_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm; + uint32_t discard; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = do_sqrdmlah_h(n[i], m[i], 0, false, true, &discard); + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */ int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3, bool neg, bool round, uint32_t *sat) @@ -309,6 +351,28 @@ void HELPER(sve2_sqrdmlsh_s)(void *vd, void *vn, void *vm, } } +void HELPER(sve2_sqdmulh_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm; + uint32_t discard; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = do_sqrdmlah_s(n[i], m[i], 0, false, false, &discard); + } +} + +void HELPER(sve2_sqrdmulh_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm; + uint32_t discard; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = do_sqrdmlah_s(n[i], m[i], 0, false, true, &discard); + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 64-bit */ static int64_t do_sat128_d(Int128 r) { @@ -368,6 +432,26 @@ void HELPER(sve2_sqrdmlsh_d)(void *vd, void *vn, void *vm, } } +void HELPER(sve2_sqdmulh_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = do_sqrdmlah_d(n[i], m[i], 0, false, false); + } +} + +void HELPER(sve2_sqrdmulh_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = do_sqrdmlah_d(n[i], m[i], 0, false, true); + } +} + /* Integer 8 and 16-bit dot-product. * * Note that for the loops herein, host endianness does not matter From patchwork Tue May 25 01:03:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277527 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 26548C2B9F7 for ; Tue, 25 May 2021 01:52:30 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 92D5A6113B for ; Tue, 25 May 2021 01:52:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 92D5A6113B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:33468 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMFA-0002eG-K4 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:52:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55220) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLY1-0008BI-Tf for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:55 -0400 Received: from mail-pg1-x529.google.com ([2607:f8b0:4864:20::529]:34562) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXo-00045C-Jn for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:53 -0400 Received: by mail-pg1-x529.google.com with SMTP id l70so21418163pga.1 for ; Mon, 24 May 2021 18:07:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=sKUb502aLLms8bMUPiVbrKDD4PCp/jOA6YF9RrGlUsg=; b=uN3j1aN6vmiCzUe+e0RpQRTUfw47xD8xZvGH7eVCNbnRSKbj35Q1tNO8m6p4mnoIMo 6E6nalTW+Jm5nUQ5zPHhoG0aPffpnmSZD/qsNTVRTYqXzki8iYuCzeJEjxZM81XkjGXB 2khavWsPoMYFVsZCdvhTiBmI3dYSf5pxpp0U5kQPgo/zj2Ct+TSU7GHIbdvk4hUuui7/ Am7EXeiNow/6cHfW9uob2X93ieYAbJpMlW8Es9NvNpOtOWVngv3g2nLYvDguH9STUKGp q+bKeXmhqUTTSGre3ZGduHt8deEqcPFuLUKQmdr4A/5p4Q8WLikkVSB7cihC0+uir0/b 6e0Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=sKUb502aLLms8bMUPiVbrKDD4PCp/jOA6YF9RrGlUsg=; b=ltuVwd2g8VTkPXqQd8uRYD9wP4ffMa3y7d6zKiLnerQFR7s+VXbjyf/XqJdOrxw21x iAe+gW1MrBr9Ui7Y0Vt3PEv33m9IQibbmeezGRkxdzE/0GFcRfZEekWzmh6tpyTjjEL3 Hp6SjjzCeG3hEazo8HHdye2TAe4GXNUBiL5Yj1jd5fHteRu2tcanpZmHtqTfBTWMUFuM GKEzQFTTyfbS8zGfj3UW9A1iy/gJ4FU9bfjTcQOJZVti14auT5zvv84lx3N2yFOpYE4z jpy/LIERGnIMEVkUWnyrIqpHtMvUNOZk8jjOs5mZx9mD12bEkaoV0vg+bMppmb8y1MSv o27A== X-Gm-Message-State: AOAM532zaNJPQUCcjYV5wJHMrJudr9gBUHsbvbKbuX6KqZukVScCO9Ss IjQDQv4J9ShRRMU5hRr0ApKDnT+TCWEIFg== X-Google-Smtp-Source: ABdhPJz30ouL901ifXb3MpJ0v2dW85LW/7UZjElOrkqPyft5NX1hlE0nd0nWClTJ+ajUIyRQCWvz7Q== X-Received: by 2002:a65:6256:: with SMTP id q22mr16428981pgv.391.1621904855991; Mon, 24 May 2021 18:07:35 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:35 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 59/92] target/arm: Implement SVE2 saturating multiply high (indexed) Date: Mon, 24 May 2021 18:03:25 -0700 Message-Id: <20210525010358.152808-60-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::529; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x529.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 14 ++++++ target/arm/sve.decode | 8 ++++ target/arm/translate-sve.c | 8 ++++ target/arm/vec_helper.c | 88 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 118 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index eb94b6b1e6..e7c463fff5 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -967,6 +967,20 @@ DEF_HELPER_FLAGS_4(sve2_sqrdmulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqrdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqrdmulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(gvec_xar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) #ifdef TARGET_AARCH64 diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 407d3019d1..35010d755f 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -829,6 +829,14 @@ SQDMULLB_zzx_d 01000100 11 1 ..... 1110.0 ..... ..... @rrx_2a esz=3 SQDMULLT_zzx_s 01000100 10 1 ..... 1110.1 ..... ..... @rrx_3a esz=2 SQDMULLT_zzx_d 01000100 11 1 ..... 1110.1 ..... ..... @rrx_2a esz=3 +# SVE2 saturating multiply high (indexed) +SQDMULH_zzx_h 01000100 0. 1 ..... 111100 ..... ..... @rrx_3 esz=1 +SQDMULH_zzx_s 01000100 10 1 ..... 111100 ..... ..... @rrx_2 esz=2 +SQDMULH_zzx_d 01000100 11 1 ..... 111100 ..... ..... @rrx_1 esz=3 +SQRDMULH_zzx_h 01000100 0. 1 ..... 111101 ..... ..... @rrx_3 esz=1 +SQRDMULH_zzx_s 01000100 10 1 ..... 111101 ..... ..... @rrx_2 esz=2 +SQRDMULH_zzx_d 01000100 11 1 ..... 111101 ..... ..... @rrx_1 esz=3 + # SVE2 integer multiply (indexed) MUL_zzx_h 01000100 0. 1 ..... 111110 ..... ..... @rrx_3 esz=1 MUL_zzx_s 01000100 10 1 ..... 111110 ..... ..... @rrx_2 esz=2 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 001432eccc..a03fce003e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3864,6 +3864,14 @@ DO_SVE2_RRX(trans_MUL_zzx_h, gen_helper_gvec_mul_idx_h) DO_SVE2_RRX(trans_MUL_zzx_s, gen_helper_gvec_mul_idx_s) DO_SVE2_RRX(trans_MUL_zzx_d, gen_helper_gvec_mul_idx_d) +DO_SVE2_RRX(trans_SQDMULH_zzx_h, gen_helper_sve2_sqdmulh_idx_h) +DO_SVE2_RRX(trans_SQDMULH_zzx_s, gen_helper_sve2_sqdmulh_idx_s) +DO_SVE2_RRX(trans_SQDMULH_zzx_d, gen_helper_sve2_sqdmulh_idx_d) + +DO_SVE2_RRX(trans_SQRDMULH_zzx_h, gen_helper_sve2_sqrdmulh_idx_h) +DO_SVE2_RRX(trans_SQRDMULH_zzx_s, gen_helper_sve2_sqrdmulh_idx_s) +DO_SVE2_RRX(trans_SQRDMULH_zzx_d, gen_helper_sve2_sqrdmulh_idx_d) + #undef DO_SVE2_RRX #define DO_SVE2_RRX_TB(NAME, FUNC, TOP) \ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 25061c15e1..8b7269d8e1 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -240,6 +240,36 @@ void HELPER(sve2_sqrdmulh_h)(void *vd, void *vn, void *vm, uint32_t desc) } } +void HELPER(sve2_sqdmulh_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx); + uint32_t discard; + + for (i = 0; i < opr_sz / 2; i += 16 / 2) { + int16_t mm = m[i]; + for (j = 0; j < 16 / 2; ++j) { + d[i + j] = do_sqrdmlah_h(n[i + j], mm, 0, false, false, &discard); + } + } +} + +void HELPER(sve2_sqrdmulh_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx); + uint32_t discard; + + for (i = 0; i < opr_sz / 2; i += 16 / 2) { + int16_t mm = m[i]; + for (j = 0; j < 16 / 2; ++j) { + d[i + j] = do_sqrdmlah_h(n[i + j], mm, 0, false, true, &discard); + } + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */ int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3, bool neg, bool round, uint32_t *sat) @@ -373,6 +403,36 @@ void HELPER(sve2_sqrdmulh_s)(void *vd, void *vn, void *vm, uint32_t desc) } } +void HELPER(sve2_sqdmulh_idx_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx); + uint32_t discard; + + for (i = 0; i < opr_sz / 4; i += 16 / 4) { + int32_t mm = m[i]; + for (j = 0; j < 16 / 4; ++j) { + d[i + j] = do_sqrdmlah_s(n[i + j], mm, 0, false, false, &discard); + } + } +} + +void HELPER(sve2_sqrdmulh_idx_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx); + uint32_t discard; + + for (i = 0; i < opr_sz / 4; i += 16 / 4) { + int32_t mm = m[i]; + for (j = 0; j < 16 / 4; ++j) { + d[i + j] = do_sqrdmlah_s(n[i + j], mm, 0, false, true, &discard); + } + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 64-bit */ static int64_t do_sat128_d(Int128 r) { @@ -452,6 +512,34 @@ void HELPER(sve2_sqrdmulh_d)(void *vd, void *vn, void *vm, uint32_t desc) } } +void HELPER(sve2_sqdmulh_idx_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int64_t *d = vd, *n = vn, *m = (int64_t *)vm + idx; + + for (i = 0; i < opr_sz / 8; i += 16 / 8) { + int64_t mm = m[i]; + for (j = 0; j < 16 / 8; ++j) { + d[i + j] = do_sqrdmlah_d(n[i + j], mm, 0, false, false); + } + } +} + +void HELPER(sve2_sqrdmulh_idx_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int64_t *d = vd, *n = vn, *m = (int64_t *)vm + idx; + + for (i = 0; i < opr_sz / 8; i += 16 / 8) { + int64_t mm = m[i]; + for (j = 0; j < 16 / 8; ++j) { + d[i + j] = do_sqrdmlah_d(n[i + j], mm, 0, false, true); + } + } +} + /* Integer 8 and 16-bit dot-product. * * Note that for the loops herein, host endianness does not matter From patchwork Tue May 25 01:03:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277533 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1F33C2B9F7 for ; Tue, 25 May 2021 01:54:37 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5B1AA6113B for ; Tue, 25 May 2021 01:54:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5B1AA6113B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:42284 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMHE-0008TY-7d for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:54:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55244) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLY3-0008CS-Si for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:55 -0400 Received: from mail-pf1-x431.google.com ([2607:f8b0:4864:20::431]:43990) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXp-00045K-8g for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:55 -0400 Received: by mail-pf1-x431.google.com with SMTP id d78so21402934pfd.10 for ; Mon, 24 May 2021 18:07:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XxyJoHFLZ6IEaEoSZY3UlLt3WrNUJc/LLzbZKvHo7/o=; b=m0I0Ix3PUGxTJ9A4r9zv0GnGDoZUeNOPHHblWSfq7wf+iHxAecA/5IlX03dSmN5+lQ xwAWPHfiAALtFbZqYvc4mDXmrhZhNoUypgWmJ2hfNTryX+bNWi7smVmriQnO5FIiRQe1 A2vjHAKKcDsHhVbGFP4L0bd2XEgC1qIs3OPMTkO4zKat+LpeyEG1FMO+tT84B3VjvhHT Stjoiub0xMn6lJp7Rjb+8/NNHHGKzn7CmfinFFUjUTXHnA9wCx6jgUTZgsTe8R/lKw/L n4rbRNpChq3yASczl4FV4faZdbSg9gI5JKdkf5cvDkWvNIG+uCSMpvE/hkTnicAyPhYL 3qJA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XxyJoHFLZ6IEaEoSZY3UlLt3WrNUJc/LLzbZKvHo7/o=; b=NDN7G+pLTmsg2rl1ZT9QyGCdJyjYaYncvQOGvib/+XdJruRi8AcC7YArw8SzwNeiyE 74XuClBMsp4x/wt+P1Dm0Whvc0E8mz4/GWqBohAQiOd7K/nXDwSrgztMXCAi8AL3q7XZ /pHMPXF0sxTtfN7OASVB1XiRkjpxjyTK4OwxukQ/kxzbfIqSTn8c3USMtELTX01T7K9s 3PskCt3brERIh33PrmsHAfktRey5A3ZH4CJXO2NFfDfVsYPkwWCqvlbORM6Tpfw/y2d5 g3TyPUPoI2MgyqWnU6vPz8AOX4nKm9GMG+4cSX7Bhl3xktGt5gzgTKtNozzVvpgpI1hS ubVQ== X-Gm-Message-State: AOAM5334sLzBp4H8mRykQjJHzUixaa5nm7umKNguszEKLl0+m3z/UWrD wiPca4KPQ7vPuEtUuJrzy+NmNHE9a5pMMQ== X-Google-Smtp-Source: ABdhPJwocGlYY+VczoefeFwXwzxdj28ZIbBHDDuKcFOIsCdkuc81967ch8GPHhARX7/3hvgCRcr/3A== X-Received: by 2002:a63:5158:: with SMTP id r24mr16324643pgl.41.1621904856574; Mon, 24 May 2021 18:07:36 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:36 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 60/92] target/arm: Implement SVE2 multiply-add long (indexed) Date: Mon, 24 May 2021 18:03:26 -0700 Message-Id: <20210525010358.152808-61-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::431; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x431.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v7: Rebasing dropped from v6. --- target/arm/helper-sve.h | 17 +++++++++++++++++ target/arm/sve.decode | 18 ++++++++++++++++++ target/arm/sve_helper.c | 16 ++++++++++++++++ target/arm/translate-sve.c | 20 ++++++++++++++++++++ 4 files changed, 71 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0be0d90bee..4a0e70ee91 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2693,3 +2693,20 @@ DEF_HELPER_FLAGS_4(sve2_sqdmull_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqdmull_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_smlal_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlal_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlsl_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlsl_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlal_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlal_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlsl_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlsl_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 35010d755f..dd50b9b5c0 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -823,6 +823,24 @@ SQDMLSLB_zzxw_d 01000100 11 1 ..... 0011.0 ..... ..... @rrxr_2a esz=3 SQDMLSLT_zzxw_s 01000100 10 1 ..... 0011.1 ..... ..... @rrxr_3a esz=2 SQDMLSLT_zzxw_d 01000100 11 1 ..... 0011.1 ..... ..... @rrxr_2a esz=3 +# SVE2 multiply-add long (indexed) +SMLALB_zzxw_s 01000100 10 1 ..... 1000.0 ..... ..... @rrxr_3a esz=2 +SMLALB_zzxw_d 01000100 11 1 ..... 1000.0 ..... ..... @rrxr_2a esz=3 +SMLALT_zzxw_s 01000100 10 1 ..... 1000.1 ..... ..... @rrxr_3a esz=2 +SMLALT_zzxw_d 01000100 11 1 ..... 1000.1 ..... ..... @rrxr_2a esz=3 +UMLALB_zzxw_s 01000100 10 1 ..... 1001.0 ..... ..... @rrxr_3a esz=2 +UMLALB_zzxw_d 01000100 11 1 ..... 1001.0 ..... ..... @rrxr_2a esz=3 +UMLALT_zzxw_s 01000100 10 1 ..... 1001.1 ..... ..... @rrxr_3a esz=2 +UMLALT_zzxw_d 01000100 11 1 ..... 1001.1 ..... ..... @rrxr_2a esz=3 +SMLSLB_zzxw_s 01000100 10 1 ..... 1010.0 ..... ..... @rrxr_3a esz=2 +SMLSLB_zzxw_d 01000100 11 1 ..... 1010.0 ..... ..... @rrxr_2a esz=3 +SMLSLT_zzxw_s 01000100 10 1 ..... 1010.1 ..... ..... @rrxr_3a esz=2 +SMLSLT_zzxw_d 01000100 11 1 ..... 1010.1 ..... ..... @rrxr_2a esz=3 +UMLSLB_zzxw_s 01000100 10 1 ..... 1011.0 ..... ..... @rrxr_3a esz=2 +UMLSLB_zzxw_d 01000100 11 1 ..... 1011.0 ..... ..... @rrxr_2a esz=3 +UMLSLT_zzxw_s 01000100 10 1 ..... 1011.1 ..... ..... @rrxr_3a esz=2 +UMLSLT_zzxw_d 01000100 11 1 ..... 1011.1 ..... ..... @rrxr_2a esz=3 + # SVE2 saturating multiply (indexed) SQDMULLB_zzx_s 01000100 10 1 ..... 1110.0 ..... ..... @rrx_3a esz=2 SQDMULLB_zzx_d 01000100 11 1 ..... 1110.0 ..... ..... @rrx_2a esz=3 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 3953e2f502..2ec936a8b1 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1551,6 +1551,20 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ } \ } +#define DO_MLA(N, M, A) (A + N * M) + +DO_ZZXW(sve2_smlal_idx_s, int32_t, int16_t, H1_4, H1_2, DO_MLA) +DO_ZZXW(sve2_smlal_idx_d, int64_t, int32_t, , H1_4, DO_MLA) +DO_ZZXW(sve2_umlal_idx_s, uint32_t, uint16_t, H1_4, H1_2, DO_MLA) +DO_ZZXW(sve2_umlal_idx_d, uint64_t, uint32_t, , H1_4, DO_MLA) + +#define DO_MLS(N, M, A) (A - N * M) + +DO_ZZXW(sve2_smlsl_idx_s, int32_t, int16_t, H1_4, H1_2, DO_MLS) +DO_ZZXW(sve2_smlsl_idx_d, int64_t, int32_t, , H1_4, DO_MLS) +DO_ZZXW(sve2_umlsl_idx_s, uint32_t, uint16_t, H1_4, H1_2, DO_MLS) +DO_ZZXW(sve2_umlsl_idx_d, uint64_t, uint32_t, , H1_4, DO_MLS) + #define DO_SQDMLAL_S(N, M, A) DO_SQADD_S(A, do_sqdmull_s(N, M)) #define DO_SQDMLAL_D(N, M, A) do_sqadd_d(A, do_sqdmull_d(N, M)) @@ -1563,6 +1577,8 @@ DO_ZZXW(sve2_sqdmlal_idx_d, int64_t, int32_t, , H1_4, DO_SQDMLAL_D) DO_ZZXW(sve2_sqdmlsl_idx_s, int32_t, int16_t, H1_4, H1_2, DO_SQDMLSL_S) DO_ZZXW(sve2_sqdmlsl_idx_d, int64_t, int32_t, , H1_4, DO_SQDMLSL_D) +#undef DO_MLA +#undef DO_MLS #undef DO_ZZXW #define DO_ZZX(NAME, TYPEW, TYPEN, HW, HN, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a03fce003e..1f6a61bf55 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3944,6 +3944,26 @@ DO_SVE2_RRXR_TB(trans_SQDMLSLB_zzxw_d, gen_helper_sve2_sqdmlsl_idx_d, false) DO_SVE2_RRXR_TB(trans_SQDMLSLT_zzxw_s, gen_helper_sve2_sqdmlsl_idx_s, true) DO_SVE2_RRXR_TB(trans_SQDMLSLT_zzxw_d, gen_helper_sve2_sqdmlsl_idx_d, true) +DO_SVE2_RRXR_TB(trans_SMLALB_zzxw_s, gen_helper_sve2_smlal_idx_s, false) +DO_SVE2_RRXR_TB(trans_SMLALB_zzxw_d, gen_helper_sve2_smlal_idx_d, false) +DO_SVE2_RRXR_TB(trans_SMLALT_zzxw_s, gen_helper_sve2_smlal_idx_s, true) +DO_SVE2_RRXR_TB(trans_SMLALT_zzxw_d, gen_helper_sve2_smlal_idx_d, true) + +DO_SVE2_RRXR_TB(trans_UMLALB_zzxw_s, gen_helper_sve2_umlal_idx_s, false) +DO_SVE2_RRXR_TB(trans_UMLALB_zzxw_d, gen_helper_sve2_umlal_idx_d, false) +DO_SVE2_RRXR_TB(trans_UMLALT_zzxw_s, gen_helper_sve2_umlal_idx_s, true) +DO_SVE2_RRXR_TB(trans_UMLALT_zzxw_d, gen_helper_sve2_umlal_idx_d, true) + +DO_SVE2_RRXR_TB(trans_SMLSLB_zzxw_s, gen_helper_sve2_smlsl_idx_s, false) +DO_SVE2_RRXR_TB(trans_SMLSLB_zzxw_d, gen_helper_sve2_smlsl_idx_d, false) +DO_SVE2_RRXR_TB(trans_SMLSLT_zzxw_s, gen_helper_sve2_smlsl_idx_s, true) +DO_SVE2_RRXR_TB(trans_SMLSLT_zzxw_d, gen_helper_sve2_smlsl_idx_d, true) + +DO_SVE2_RRXR_TB(trans_UMLSLB_zzxw_s, gen_helper_sve2_umlsl_idx_s, false) +DO_SVE2_RRXR_TB(trans_UMLSLB_zzxw_d, gen_helper_sve2_umlsl_idx_d, false) +DO_SVE2_RRXR_TB(trans_UMLSLT_zzxw_s, gen_helper_sve2_umlsl_idx_s, true) +DO_SVE2_RRXR_TB(trans_UMLSLT_zzxw_d, gen_helper_sve2_umlsl_idx_d, true) + #undef DO_SVE2_RRXR_TB /* From patchwork Tue May 25 01:03:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277515 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78321C2B9F7 for ; Tue, 25 May 2021 01:48:15 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0FD856140F for ; Tue, 25 May 2021 01:48:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0FD856140F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:44880 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMB4-0008OH-6X for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:48:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55348) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYJ-00006G-EM for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:11 -0400 Received: from mail-pf1-x42c.google.com ([2607:f8b0:4864:20::42c]:35516) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXr-00045R-P2 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:11 -0400 Received: by mail-pf1-x42c.google.com with SMTP id g18so20613852pfr.2 for ; Mon, 24 May 2021 18:07:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=bfCAhe9/uQHtOdN9hcGfCpP2k7jny4Jk9ySklgsWsZk=; b=G2UDNpnBC7aPm3QHyVoOFxnLx5sxvPNwPjdp8KkAvYPotbeMbI9Fitt1glO+/MBsrK 6q4yBFyI8LVOXVzYknWmfdLP+R3J/rotwLFSXuraHDeLc/XgWmE4LLFSYkHnvEX6uPUk Ck8pQTn50Fm6nHv6M4z0zxSTBy6KNsyvXozyNZGjEndyXZiMSrdHk5183xRv4tqGVzd1 N8DgPSgGsz9jZWyEmMqgFbdZthJdysXOD3uxXKhZi5rfSYBd+pYPrb6m+6GElFFQ0vWi 5AiD4O0UY7eJdsjWACgAbVbkEcQnqoXXzealsl+f6Csk6K1AxDTA49BrxvEDeN/EqohK 5Ssw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=bfCAhe9/uQHtOdN9hcGfCpP2k7jny4Jk9ySklgsWsZk=; b=OTTrYsjs6yCA49ZCwFlWGMNW0dYBoRqOma+41FakcXpA3pcoEEiUbXNReSHrveyyx4 bI7rzbM/E2ITOB/QAqiC0RAuoxD596+JSfWftKlGNNWPmnDwu2vsLEkCmenLpx3etbNn 5P36HgtNI6ipsM/CjWqnsiLwaBre/ef1MbaSfXv3gfJ7XLymzaM8r8rgFkqS+aCMib+A cOZlVi9qBdeeOvXpEuBv8CtWo84g2abCHmxv5QZ4oIMaEkPLY5RyZ6WduH5MNRABQAO7 n5tvKbzIjLFK7oNsFQ1b46TUumyOENKT1Bqb1rlzM30CAC3yU82Ao5CIdP7z50dFlFsQ YsUw== X-Gm-Message-State: AOAM531ZfyGwvoEQ5gEGLkaiIcIFC1w62hJdVFtguqPFpMhJTaPNe0Rv 7ddvZ+q0DHx+s31l9D21ypix9QZK5ISRwA== X-Google-Smtp-Source: ABdhPJyJfPKaN/Qljfafv3lHJ0BLP4ljmxRICuuLZmvZ+cRwfu7MTY27GVkGv/5YtCOIjnWt+NDsuQ== X-Received: by 2002:a63:dd12:: with SMTP id t18mr16221331pgg.361.1621904857212; Mon, 24 May 2021 18:07:37 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:36 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 61/92] target/arm: Implement SVE2 integer multiply long (indexed) Date: Mon, 24 May 2021 18:03:27 -0700 Message-Id: <20210525010358.152808-62-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42c; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v7: Rebasing dropped from v6. --- target/arm/helper-sve.h | 5 +++++ target/arm/sve.decode | 10 ++++++++++ target/arm/sve_helper.c | 6 ++++++ target/arm/translate-sve.c | 10 ++++++++++ 4 files changed, 31 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4a0e70ee91..3bec807e13 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2710,3 +2710,8 @@ DEF_HELPER_FLAGS_5(sve2_umlsl_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_umlsl_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_smull_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_smull_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_umull_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_umull_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index dd50b9b5c0..9c5761347a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -841,6 +841,16 @@ UMLSLB_zzxw_d 01000100 11 1 ..... 1011.0 ..... ..... @rrxr_2a esz=3 UMLSLT_zzxw_s 01000100 10 1 ..... 1011.1 ..... ..... @rrxr_3a esz=2 UMLSLT_zzxw_d 01000100 11 1 ..... 1011.1 ..... ..... @rrxr_2a esz=3 +# SVE2 integer multiply long (indexed) +SMULLB_zzx_s 01000100 10 1 ..... 1100.0 ..... ..... @rrx_3a esz=2 +SMULLB_zzx_d 01000100 11 1 ..... 1100.0 ..... ..... @rrx_2a esz=3 +SMULLT_zzx_s 01000100 10 1 ..... 1100.1 ..... ..... @rrx_3a esz=2 +SMULLT_zzx_d 01000100 11 1 ..... 1100.1 ..... ..... @rrx_2a esz=3 +UMULLB_zzx_s 01000100 10 1 ..... 1101.0 ..... ..... @rrx_3a esz=2 +UMULLB_zzx_d 01000100 11 1 ..... 1101.0 ..... ..... @rrx_2a esz=3 +UMULLT_zzx_s 01000100 10 1 ..... 1101.1 ..... ..... @rrx_3a esz=2 +UMULLT_zzx_d 01000100 11 1 ..... 1101.1 ..... ..... @rrx_2a esz=3 + # SVE2 saturating multiply (indexed) SQDMULLB_zzx_s 01000100 10 1 ..... 1110.0 ..... ..... @rrx_3a esz=2 SQDMULLB_zzx_d 01000100 11 1 ..... 1110.0 ..... ..... @rrx_2a esz=3 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 2ec936a8b1..20ed2f34bc 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1599,6 +1599,12 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ DO_ZZX(sve2_sqdmull_idx_s, int32_t, int16_t, H1_4, H1_2, do_sqdmull_s) DO_ZZX(sve2_sqdmull_idx_d, int64_t, int32_t, , H1_4, do_sqdmull_d) +DO_ZZX(sve2_smull_idx_s, int32_t, int16_t, H1_4, H1_2, DO_MUL) +DO_ZZX(sve2_smull_idx_d, int64_t, int32_t, , H1_4, DO_MUL) + +DO_ZZX(sve2_umull_idx_s, uint32_t, uint16_t, H1_4, H1_2, DO_MUL) +DO_ZZX(sve2_umull_idx_d, uint64_t, uint32_t, , H1_4, DO_MUL) + #undef DO_ZZX #define DO_BITPERM(NAME, TYPE, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 1f6a61bf55..e8e2a4e948 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3886,6 +3886,16 @@ DO_SVE2_RRX_TB(trans_SQDMULLB_zzx_d, gen_helper_sve2_sqdmull_idx_d, false) DO_SVE2_RRX_TB(trans_SQDMULLT_zzx_s, gen_helper_sve2_sqdmull_idx_s, true) DO_SVE2_RRX_TB(trans_SQDMULLT_zzx_d, gen_helper_sve2_sqdmull_idx_d, true) +DO_SVE2_RRX_TB(trans_SMULLB_zzx_s, gen_helper_sve2_smull_idx_s, false) +DO_SVE2_RRX_TB(trans_SMULLB_zzx_d, gen_helper_sve2_smull_idx_d, false) +DO_SVE2_RRX_TB(trans_SMULLT_zzx_s, gen_helper_sve2_smull_idx_s, true) +DO_SVE2_RRX_TB(trans_SMULLT_zzx_d, gen_helper_sve2_smull_idx_d, true) + +DO_SVE2_RRX_TB(trans_UMULLB_zzx_s, gen_helper_sve2_umull_idx_s, false) +DO_SVE2_RRX_TB(trans_UMULLB_zzx_d, gen_helper_sve2_umull_idx_d, false) +DO_SVE2_RRX_TB(trans_UMULLT_zzx_s, gen_helper_sve2_umull_idx_s, true) +DO_SVE2_RRX_TB(trans_UMULLT_zzx_d, gen_helper_sve2_umull_idx_d, true) + #undef DO_SVE2_RRX_TB static bool do_sve2_zzzz_data(DisasContext *s, int rd, int rn, int rm, int ra, From patchwork Tue May 25 01:03:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277535 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 84463C2B9F8 for ; Tue, 25 May 2021 01:54:39 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1F4F16113B for ; Tue, 25 May 2021 01:54:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1F4F16113B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:42508 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMHG-0000BI-8h for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:54:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55650) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYV-0000xZ-7G for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:23 -0400 Received: from mail-pf1-x436.google.com ([2607:f8b0:4864:20::436]:36704) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXs-00045b-AB for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:22 -0400 Received: by mail-pf1-x436.google.com with SMTP id c12so10686577pfl.3 for ; Mon, 24 May 2021 18:07:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=uEjH7UlSwvzXygqFaMQUC9bmeZGdG0PhYsWn+uO15Hk=; b=B5uuerafa9TAHc2q8Kx816aHoCBZjfRmFmXVdXlcHqdwaUgLKVvDr9tT9msCUgqkcT Qe2OBPfz1WTYh+uLQuoMCxKe5Ai+NrHCKmby7qUybCE7Tzkr7EV3rEv7J4HDbCS2uj83 FUxEPGyrnh+vKMd8eaeia+PcodeyY5IkmwcwixREDees0WNMfk+uR6c655Ywbmv103ct 5Ne1geF3cDEWDnOkFE/YIexDG/xuQcbwaD4Wles7+a8obb9m5bKK+ApUii5/lkQHAkBV LIZnu9XdN9mD5Z/3H2+KnNzxwbOKsMxaPs0zNloptPjP8bfBlUfbxUyyfp6Igb3j74Eu PVQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=uEjH7UlSwvzXygqFaMQUC9bmeZGdG0PhYsWn+uO15Hk=; b=D1dMAvbiVH3jhacq1nbZ6PXuU5akzu3gAyMMzMuICEKb9QeN5jP4DuCiCGL7MSjJqF qOEn/jqy3RQWqmNeHwVdYD9dcUgVcxKCtlxKfUeJ4k/elg6kstrsVWl7DNzoeudh0PIV xx6/vQWcZza/T18YlBNXZeR2Z9az5w8zagd1m63CoNKB3E19GUZIk5IYRxL/0D/AHNyt 4tMg0zZMzMyt66m4A4SuYO/dDUqH6Ab9OY7sPV7BDerFW5bhc6nBGueR1vBO9u/Uqgh6 bREu+MDr24ymUe7PNJgmhxm1kTq96a8fF6zlE0pdxoGW+19PbcRurntVp385wDJaEtd5 kQAg== X-Gm-Message-State: AOAM533HQwOColp6F8U1SfABIP5E5uNbJjTT2sZmNXWUZfOKA8m5zTKR goqf5XhO7eZWYikgzV+yoLGbHjBWNM3GEw== X-Google-Smtp-Source: ABdhPJz4HEqeSn0Xvnlx4apwr3om3si0u2PStWy6/DR4lNTk8maRFDSIeDAZTqKP3wGgI2DfHEs7KA== X-Received: by 2002:a63:f755:: with SMTP id f21mr16228688pgk.129.1621904857809; Mon, 24 May 2021 18:07:37 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:37 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 62/92] target/arm: Implement SVE2 complex integer multiply-add (indexed) Date: Mon, 24 May 2021 18:03:28 -0700 Message-Id: <20210525010358.152808-63-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::436; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x436.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v7: Rebasing dropped from v6. --- target/arm/helper-sve.h | 9 +++++++++ target/arm/sve.decode | 12 ++++++++++++ target/arm/sve_helper.c | 28 ++++++++++++++++++++++++++++ target/arm/translate-sve.c | 15 +++++++++++++++ 4 files changed, 64 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 3bec807e13..d6399a6d6e 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2715,3 +2715,12 @@ DEF_HELPER_FLAGS_4(sve2_smull_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_smull_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_umull_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_umull_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_cmla_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cmla_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 9c5761347a..42cf344ad6 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -823,6 +823,18 @@ SQDMLSLB_zzxw_d 01000100 11 1 ..... 0011.0 ..... ..... @rrxr_2a esz=3 SQDMLSLT_zzxw_s 01000100 10 1 ..... 0011.1 ..... ..... @rrxr_3a esz=2 SQDMLSLT_zzxw_d 01000100 11 1 ..... 0011.1 ..... ..... @rrxr_2a esz=3 +# SVE2 complex integer multiply-add (indexed) +CMLA_zzxz_h 01000100 10 1 index:2 rm:3 0110 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx +CMLA_zzxz_s 01000100 11 1 index:1 rm:4 0110 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx + +# SVE2 complex saturating integer multiply-add (indexed) +SQRDCMLAH_zzxz_h 01000100 10 1 index:2 rm:3 0111 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx +SQRDCMLAH_zzxz_s 01000100 11 1 index:1 rm:4 0111 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx + # SVE2 multiply-add long (indexed) SMLALB_zzxw_s 01000100 10 1 ..... 1000.0 ..... ..... @rrxr_3a esz=2 SMLALB_zzxw_d 01000100 11 1 ..... 1000.0 ..... ..... @rrxr_2a esz=3 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 20ed2f34bc..eb083e4061 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1492,8 +1492,36 @@ DO_CMLA_FUNC(sve2_sqrdcmlah_zzzz_h, int16_t, H2, DO_SQRDMLAH_H) DO_CMLA_FUNC(sve2_sqrdcmlah_zzzz_s, int32_t, H4, DO_SQRDMLAH_S) DO_CMLA_FUNC(sve2_sqrdcmlah_zzzz_d, int64_t, , DO_SQRDMLAH_D) +#define DO_CMLA_IDX_FUNC(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t i, j, oprsz = simd_oprsz(desc); \ + int rot = extract32(desc, SIMD_DATA_SHIFT, 2); \ + int idx = extract32(desc, SIMD_DATA_SHIFT + 2, 2) * 2; \ + int sel_a = rot & 1, sel_b = sel_a ^ 1; \ + bool sub_r = rot == 1 || rot == 2; \ + bool sub_i = rot >= 2; \ + TYPE *d = vd, *n = vn, *m = vm, *a = va; \ + for (i = 0; i < oprsz / sizeof(TYPE); i += 16 / sizeof(TYPE)) { \ + TYPE elt2_a = m[H(i + idx + sel_a)]; \ + TYPE elt2_b = m[H(i + idx + sel_b)]; \ + for (j = 0; j < 16 / sizeof(TYPE); j += 2) { \ + TYPE elt1_a = n[H(i + j + sel_a)]; \ + d[H2(i + j)] = OP(elt1_a, elt2_a, a[H(i + j)], sub_r); \ + d[H2(i + j + 1)] = OP(elt1_a, elt2_b, a[H(i + j + 1)], sub_i); \ + } \ + } \ +} + +DO_CMLA_IDX_FUNC(sve2_cmla_idx_h, int16_t, H2, DO_CMLA) +DO_CMLA_IDX_FUNC(sve2_cmla_idx_s, int32_t, H4, DO_CMLA) + +DO_CMLA_IDX_FUNC(sve2_sqrdcmlah_idx_h, int16_t, H2, DO_SQRDMLAH_H) +DO_CMLA_IDX_FUNC(sve2_sqrdcmlah_idx_s, int32_t, H4, DO_SQRDMLAH_S) + #undef DO_CMLA #undef DO_CMLA_FUNC +#undef DO_CMLA_IDX_FUNC #undef DO_SQRDMLAH_B #undef DO_SQRDMLAH_H #undef DO_SQRDMLAH_S diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e8e2a4e948..91aa2506de 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3976,6 +3976,21 @@ DO_SVE2_RRXR_TB(trans_UMLSLT_zzxw_d, gen_helper_sve2_umlsl_idx_d, true) #undef DO_SVE2_RRXR_TB +#define DO_SVE2_RRXR_ROT(NAME, FUNC) \ + static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ + { \ + return do_sve2_zzzz_data(s, a->rd, a->rn, a->rm, a->ra, \ + (a->index << 2) | a->rot, FUNC); \ + } + +DO_SVE2_RRXR_ROT(CMLA_zzxz_h, gen_helper_sve2_cmla_idx_h) +DO_SVE2_RRXR_ROT(CMLA_zzxz_s, gen_helper_sve2_cmla_idx_s) + +DO_SVE2_RRXR_ROT(SQRDCMLAH_zzxz_h, gen_helper_sve2_sqrdcmlah_idx_h) +DO_SVE2_RRXR_ROT(SQRDCMLAH_zzxz_s, gen_helper_sve2_sqrdcmlah_idx_s) + +#undef DO_SVE2_RRXR_ROT + /* *** SVE Floating Point Multiply-Add Indexed Group */ From patchwork Tue May 25 01:03:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277567 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 27531C2B9F7 for ; Tue, 25 May 2021 02:06:44 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9C80261209 for ; Tue, 25 May 2021 02:06:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9C80261209 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:53566 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMSw-00011N-Lf for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:06:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55660) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYV-00010D-PP for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:23 -0400 Received: from mail-pj1-x1033.google.com ([2607:f8b0:4864:20::1033]:43957) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXs-00045k-Ad for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:23 -0400 Received: by mail-pj1-x1033.google.com with SMTP id ep16-20020a17090ae650b029015d00f578a8so12188643pjb.2 for ; Mon, 24 May 2021 18:07:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XMctm2RKiEZe0qpIRD8jYJj9Bl6nZP7KSASrSY1klRM=; b=zZt9YrEd8T4rWPShByx6VXRLfigihteLdMGCyXCGiD9H9o01gsdqK5hkzSWZ/QXguz QaDgo76mdgrxiIJbIexNyaGkGmEkEpPxSO+7HiTttPJUtSC+NJiQqoU5cqbgvoYDZeMF zWqd30dQt7FUIOwxqd6K062lVimOF2i78jee+LSxgFfhS01K4HPJAzS0mXfgAe5IIRQy Jlk6RCHaBeDonWzSQ0ZWRjnTQ94k4W6jQTCZciwEXZshgu9IIH2VedsQZpXJIxsRkfsN kglzHvldCHyLX6ayOWx/Lj7OW0Np9r1xUFBZcqE6mHe0Atwk05Ajqf/nWLu5k46Cg+37 LWpQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XMctm2RKiEZe0qpIRD8jYJj9Bl6nZP7KSASrSY1klRM=; b=YmuGRTkqbGYp0K886lTkfBSINDNegkouJzLiJIyXcycJX4eAWg0RdkXteITmLFfj9G Y3YtkCoYaq6moMgX5B1QUO95qtMeQeF9Ap3/4XbnS0szWsakUHn+Z6orqF0pq4CJ3InX 8ePrV0PIpuHvYJUrjQwYpyER48Mcvzv2k7MBgHjA9e+c+eliVx/BMvtt0j3cr36/79+w gwSZo96KG64wj8ehvGFp6aVe3TyTaPUjQuHxTrtb/nDiblPvOI2LEKlCbc23UNuSBwMq 0qEepj4zJthNDtozrYRVcRVBSUh41DPan1mUgQhQX7UfKvHe6+lqpN/KwUJ9sgXPdL5s vgjw== X-Gm-Message-State: AOAM530PzHQve9DEbq06KYC7s6FlzwA0kcAiJxTSwXD8qbjae4f6Faoz ItrTI2+9XGbMo12UNqmqWBa9qh745wV7+Q== X-Google-Smtp-Source: ABdhPJzGftP7KCxW+UPtiZC/xYCh2r1oSyLS9ZGw2LDjrttTFPLJyRzfwoBJczrJ8qbOENPhSsDKoA== X-Received: by 2002:a17:90a:5288:: with SMTP id w8mr2015202pjh.170.1621904858429; Mon, 24 May 2021 18:07:38 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:38 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 63/92] target/arm: Implement SVE2 complex integer dot product Date: Mon, 24 May 2021 18:03:29 -0700 Message-Id: <20210525010358.152808-64-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1033; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1033.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v7: Rebasing dropped from v6. --- target/arm/helper-sve.h | 10 ++++ target/arm/sve.decode | 9 ++++ target/arm/sve_helper.c | 99 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 17 +++++++ 4 files changed, 135 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index d6399a6d6e..efc9a7ccf1 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2724,3 +2724,13 @@ DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_cdot_zzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cdot_zzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_cdot_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cdot_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 42cf344ad6..0339410cf7 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -789,6 +789,9 @@ MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s DOT_zzzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 \ ra=%reg_movprfx +# SVE2 complex dot product (vectors) +CDOT_zzzz 01000100 esz:2 0 rm:5 0001 rot:2 rn:5 rd:5 ra=%reg_movprfx + #### SVE Multiply - Indexed # SVE integer dot product (indexed) @@ -823,6 +826,12 @@ SQDMLSLB_zzxw_d 01000100 11 1 ..... 0011.0 ..... ..... @rrxr_2a esz=3 SQDMLSLT_zzxw_s 01000100 10 1 ..... 0011.1 ..... ..... @rrxr_3a esz=2 SQDMLSLT_zzxw_d 01000100 11 1 ..... 0011.1 ..... ..... @rrxr_2a esz=3 +# SVE2 complex integer dot product (indexed) +CDOT_zzxw_s 01000100 10 1 index:2 rm:3 0100 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx +CDOT_zzxw_d 01000100 11 1 index:1 rm:4 0100 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx + # SVE2 complex integer multiply-add (indexed) CMLA_zzxz_h 01000100 10 1 index:2 rm:3 0110 rot:2 rn:5 rd:5 \ ra=%reg_movprfx diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index eb083e4061..f9c2061260 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1527,6 +1527,105 @@ DO_CMLA_IDX_FUNC(sve2_sqrdcmlah_idx_s, int32_t, H4, DO_SQRDMLAH_S) #undef DO_SQRDMLAH_S #undef DO_SQRDMLAH_D +/* Note N and M are 4 elements bundled into one unit. */ +static int32_t do_cdot_s(uint32_t n, uint32_t m, int32_t a, + int sel_a, int sel_b, int sub_i) +{ + for (int i = 0; i <= 1; i++) { + int32_t elt1_r = (int8_t)(n >> (16 * i)); + int32_t elt1_i = (int8_t)(n >> (16 * i + 8)); + int32_t elt2_a = (int8_t)(m >> (16 * i + 8 * sel_a)); + int32_t elt2_b = (int8_t)(m >> (16 * i + 8 * sel_b)); + + a += elt1_r * elt2_a + elt1_i * elt2_b * sub_i; + } + return a; +} + +static int64_t do_cdot_d(uint64_t n, uint64_t m, int64_t a, + int sel_a, int sel_b, int sub_i) +{ + for (int i = 0; i <= 1; i++) { + int64_t elt1_r = (int16_t)(n >> (32 * i + 0)); + int64_t elt1_i = (int16_t)(n >> (32 * i + 16)); + int64_t elt2_a = (int16_t)(m >> (32 * i + 16 * sel_a)); + int64_t elt2_b = (int16_t)(m >> (32 * i + 16 * sel_b)); + + a += elt1_r * elt2_a + elt1_i * elt2_b * sub_i; + } + return a; +} + +void HELPER(sve2_cdot_zzzz_s)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + int opr_sz = simd_oprsz(desc); + int rot = simd_data(desc); + int sel_a = rot & 1; + int sel_b = sel_a ^ 1; + int sub_i = (rot == 0 || rot == 3 ? -1 : 1); + uint32_t *d = vd, *n = vn, *m = vm, *a = va; + + for (int e = 0; e < opr_sz / 4; e++) { + d[e] = do_cdot_s(n[e], m[e], a[e], sel_a, sel_b, sub_i); + } +} + +void HELPER(sve2_cdot_zzzz_d)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + int opr_sz = simd_oprsz(desc); + int rot = simd_data(desc); + int sel_a = rot & 1; + int sel_b = sel_a ^ 1; + int sub_i = (rot == 0 || rot == 3 ? -1 : 1); + uint64_t *d = vd, *n = vn, *m = vm, *a = va; + + for (int e = 0; e < opr_sz / 8; e++) { + d[e] = do_cdot_d(n[e], m[e], a[e], sel_a, sel_b, sub_i); + } +} + +void HELPER(sve2_cdot_idx_s)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + int opr_sz = simd_oprsz(desc); + int rot = extract32(desc, SIMD_DATA_SHIFT, 2); + int idx = H4(extract32(desc, SIMD_DATA_SHIFT + 2, 2)); + int sel_a = rot & 1; + int sel_b = sel_a ^ 1; + int sub_i = (rot == 0 || rot == 3 ? -1 : 1); + uint32_t *d = vd, *n = vn, *m = vm, *a = va; + + for (int seg = 0; seg < opr_sz / 4; seg += 4) { + uint32_t seg_m = m[seg + idx]; + for (int e = 0; e < 4; e++) { + d[seg + e] = do_cdot_s(n[seg + e], seg_m, a[seg + e], + sel_a, sel_b, sub_i); + } + } +} + +void HELPER(sve2_cdot_idx_d)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + int seg, opr_sz = simd_oprsz(desc); + int rot = extract32(desc, SIMD_DATA_SHIFT, 2); + int idx = extract32(desc, SIMD_DATA_SHIFT + 2, 2); + int sel_a = rot & 1; + int sel_b = sel_a ^ 1; + int sub_i = (rot == 0 || rot == 3 ? -1 : 1); + uint64_t *d = vd, *n = vn, *m = vm, *a = va; + + for (seg = 0; seg < opr_sz / 8; seg += 2) { + uint64_t seg_m = m[seg + idx]; + for (int e = 0; e < 2; e++) { + d[seg + e] = do_cdot_d(n[seg + e], seg_m, a[seg + e], + sel_a, sel_b, sub_i); + } + } +} + #define DO_ZZXZ(NAME, TYPE, H, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 91aa2506de..b454f50a6b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3989,6 +3989,9 @@ DO_SVE2_RRXR_ROT(CMLA_zzxz_s, gen_helper_sve2_cmla_idx_s) DO_SVE2_RRXR_ROT(SQRDCMLAH_zzxz_h, gen_helper_sve2_sqrdcmlah_idx_h) DO_SVE2_RRXR_ROT(SQRDCMLAH_zzxz_s, gen_helper_sve2_sqrdcmlah_idx_s) +DO_SVE2_RRXR_ROT(CDOT_zzxw_s, gen_helper_sve2_cdot_idx_s) +DO_SVE2_RRXR_ROT(CDOT_zzxw_d, gen_helper_sve2_cdot_idx_d) + #undef DO_SVE2_RRXR_ROT /* @@ -8084,6 +8087,20 @@ static bool trans_CMLA_zzzz(DisasContext *s, arg_CMLA_zzzz *a) return true; } +static bool trans_CDOT_zzzz(DisasContext *s, arg_CMLA_zzzz *a) +{ + if (!dc_isar_feature(aa64_sve2, s) || a->esz < MO_32) { + return false; + } + if (sve_access_check(s)) { + gen_helper_gvec_4 *fn = (a->esz == MO_32 + ? gen_helper_sve2_cdot_zzzz_s + : gen_helper_sve2_cdot_zzzz_d); + gen_gvec_ool_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, a->rot); + } + return true; +} + static bool trans_SQRDCMLAH_zzzz(DisasContext *s, arg_SQRDCMLAH_zzzz *a) { static gen_helper_gvec_4 * const fns[] = { From patchwork Tue May 25 01:03:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277525 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54050C2B9F7 for ; Tue, 25 May 2021 01:52:03 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CC6D16141B for ; Tue, 25 May 2021 01:52:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CC6D16141B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60942 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMEj-00029O-Te for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:52:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55210) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLY1-00088o-8S for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:53 -0400 Received: from mail-pg1-x52d.google.com ([2607:f8b0:4864:20::52d]:34566) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXp-00045q-8P for qemu-devel@nongnu.org; Mon, 24 May 2021 21:07:52 -0400 Received: by mail-pg1-x52d.google.com with SMTP id l70so21418261pga.1 for ; Mon, 24 May 2021 18:07:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=m90pIoUuj3x9BHKQq8AirzUHZnpMlycwnUIoFbbQwCI=; b=UPjcDohjJOJ+qTjT6e2kQCTqrDw0vBzi5VU7M1yF1/fyhGRPdxAVpT7rJy7IM+K+fd vKtQz35UFQG2+IseiVfUxJnreY4+xAErut7NzvrxqKGyg16p43VIF1dBdT0+21e0ZBQN 3RxWGRxJHNS3J6Q9rbJESLkJ5Vot5uxW1/hGHkAgRHymUm0G6uVlXXmksbLiSjzNjgSa DyvrGxtW+3MQF1YNe81bLwQPvJAiNfRI7SmT7w/0kv2X2HFcHY46qAjizlaS2jsN9485 k//c8pZH6SyIT0CYBs5K0NNJk1dh8pPCZB35oBJHmhS5DwbHlypVJKK+SMn/yXSyVvYb y7DQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=m90pIoUuj3x9BHKQq8AirzUHZnpMlycwnUIoFbbQwCI=; b=kVx3C75Ap41fXL98Lu2Ml39+CugsRmawatiPSep1oYD1xmK/vuqvfdPXvIf8oXJKRU CnJoR+Fh/RQo9I+Q4reierE0p+qo6Sz3qANqmVe7q/xcmlhKrDI9zZ9Q40HlkewfRh70 5fWODqmG3ARnFDL6CNlpDFGsNk6SR+7sviQYePXVeNvNNtLlv4HbhPp7EHp9OfvRDJne 4ISOXdYehmE5hYcCr49Tmm54ynRnE7Es8NSW+SrtKIXgGz/NHzn+rsTCnmglKSXeMLKr NAWRHonfogOtx7hxAX0ovgwue6ONFrlmGHwPOVIyOhD9PpDsiLPsEx3HFjA+bEjaHWxY DhJw== X-Gm-Message-State: AOAM533Q7L661lou8pDtuGH9wMqBTu+GtN1qw86EbPDYnH5ff8t5ebVC Bo5VntS1biDBnatqRtqBzAcvaVT0qDgN7Q== X-Google-Smtp-Source: ABdhPJwnRPfu0ZDd7lV6wQHZGp1HhGSeLY9lXPjxNcW/zy5z+b/Ylsn6tJfwGv37uVe3OijSze5geQ== X-Received: by 2002:a63:5218:: with SMTP id g24mr16111748pgb.309.1621904859006; Mon, 24 May 2021 18:07:39 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:38 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 64/92] target/arm: Macroize helper_gvec_{s,u}dot_{b,h} Date: Mon, 24 May 2021 18:03:30 -0700 Message-Id: <20210525010358.152808-65-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52d; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" We're about to add more variations on this theme. Signed-off-by: Richard Henderson --- target/arm/vec_helper.c | 82 ++++++++++------------------------------- 1 file changed, 20 insertions(+), 62 deletions(-) diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 8b7269d8e1..cddf095c74 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -543,73 +543,31 @@ void HELPER(sve2_sqrdmulh_idx_d)(void *vd, void *vn, void *vm, uint32_t desc) /* Integer 8 and 16-bit dot-product. * * Note that for the loops herein, host endianness does not matter - * with respect to the ordering of data within the 64-bit lanes. + * with respect to the ordering of data within the quad-width lanes. * All elements are treated equally, no matter where they are. */ -void HELPER(gvec_sdot_b)(void *vd, void *vn, void *vm, void *va, uint32_t desc) -{ - intptr_t i, opr_sz = simd_oprsz(desc); - int32_t *d = vd, *a = va; - int8_t *n = vn, *m = vm; - - for (i = 0; i < opr_sz / 4; ++i) { - d[i] = (a[i] + - n[i * 4 + 0] * m[i * 4 + 0] + - n[i * 4 + 1] * m[i * 4 + 1] + - n[i * 4 + 2] * m[i * 4 + 2] + - n[i * 4 + 3] * m[i * 4 + 3]); - } - clear_tail(d, opr_sz, simd_maxsz(desc)); +#define DO_DOT(NAME, TYPED, TYPEN, TYPEM) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + TYPED *d = vd, *a = va; \ + TYPEN *n = vn; \ + TYPEM *m = vm; \ + for (i = 0; i < opr_sz / sizeof(TYPED); ++i) { \ + d[i] = (a[i] + \ + (TYPED)n[i * 4 + 0] * m[i * 4 + 0] + \ + (TYPED)n[i * 4 + 1] * m[i * 4 + 1] + \ + (TYPED)n[i * 4 + 2] * m[i * 4 + 2] + \ + (TYPED)n[i * 4 + 3] * m[i * 4 + 3]); \ + } \ + clear_tail(d, opr_sz, simd_maxsz(desc)); \ } -void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, void *va, uint32_t desc) -{ - intptr_t i, opr_sz = simd_oprsz(desc); - uint32_t *d = vd, *a = va; - uint8_t *n = vn, *m = vm; - - for (i = 0; i < opr_sz / 4; ++i) { - d[i] = (a[i] + - n[i * 4 + 0] * m[i * 4 + 0] + - n[i * 4 + 1] * m[i * 4 + 1] + - n[i * 4 + 2] * m[i * 4 + 2] + - n[i * 4 + 3] * m[i * 4 + 3]); - } - clear_tail(d, opr_sz, simd_maxsz(desc)); -} - -void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, void *va, uint32_t desc) -{ - intptr_t i, opr_sz = simd_oprsz(desc); - int64_t *d = vd, *a = va; - int16_t *n = vn, *m = vm; - - for (i = 0; i < opr_sz / 8; ++i) { - d[i] = (a[i] + - (int64_t)n[i * 4 + 0] * m[i * 4 + 0] + - (int64_t)n[i * 4 + 1] * m[i * 4 + 1] + - (int64_t)n[i * 4 + 2] * m[i * 4 + 2] + - (int64_t)n[i * 4 + 3] * m[i * 4 + 3]); - } - clear_tail(d, opr_sz, simd_maxsz(desc)); -} - -void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, void *va, uint32_t desc) -{ - intptr_t i, opr_sz = simd_oprsz(desc); - uint64_t *d = vd, *a = va; - uint16_t *n = vn, *m = vm; - - for (i = 0; i < opr_sz / 8; ++i) { - d[i] = (a[i] + - (uint64_t)n[i * 4 + 0] * m[i * 4 + 0] + - (uint64_t)n[i * 4 + 1] * m[i * 4 + 1] + - (uint64_t)n[i * 4 + 2] * m[i * 4 + 2] + - (uint64_t)n[i * 4 + 3] * m[i * 4 + 3]); - } - clear_tail(d, opr_sz, simd_maxsz(desc)); -} +DO_DOT(gvec_sdot_b, int32_t, int8_t, int8_t) +DO_DOT(gvec_udot_b, uint32_t, uint8_t, uint8_t) +DO_DOT(gvec_sdot_h, int64_t, int16_t, int16_t) +DO_DOT(gvec_udot_h, uint64_t, uint16_t, uint16_t) void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, void *va, uint32_t desc) From patchwork Tue May 25 01:03:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277593 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 86F10C2B9F7 for ; Tue, 25 May 2021 02:14:57 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D41BB61360 for ; Tue, 25 May 2021 02:14:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D41BB61360 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:57616 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMat-0005zH-Nf for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:14:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55600) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYT-0000pO-7E for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:21 -0400 Received: from mail-pg1-x52c.google.com ([2607:f8b0:4864:20::52c]:33448) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXs-000460-8d for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:20 -0400 Received: by mail-pg1-x52c.google.com with SMTP id i5so21439354pgm.0 for ; Mon, 24 May 2021 18:07:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=TSJhIg/oEj78OAJlzsUKkaERruGiMdx8U7Tjdyc3i70=; b=UpeH7HGx3DEoqpbxxdl1btOB4yS91c6cBOZqmH9T67vPVoRYtEG2lk0stVvr4eHB4R A3RUnM8VcuCwo0Y4M+C0aBSr7cvIWWjPBiPsyCYHjexbcFmoNJvKC0tJAkIVAxAxGKk7 Mh6jTMflxELLGyMjqr4GqWso3tJTIpABwhvbVRRmXXD9FkiQY/GPJP7ZVpt3l+BWKo2l 3/Prs35f38DYZlFWHKliSxB+FKtXNmOWDb6zKTR9aJGIiH4N3L5UHPtGSFYau8IZDher jUSkLQa4GbIopGRbhPx63BgAIDrRiqd39J8O9ntisURDsnwhmVYbH7SYJixgOdd2gTcU /dwg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=TSJhIg/oEj78OAJlzsUKkaERruGiMdx8U7Tjdyc3i70=; b=WpCc/GJNq9MFaqRZHy3vZuCHrclC+CjZL6JYwsQk30YwoHZMX5O7O1R00HrxfT/I9r 2IpfKe/tvdf7f5WWMFTdLaXj7bn3LPcuQnceMZWfa58THsiYwQtrkT+4DLlZCIdw7I9t aDmuR0CNuhPmVRV2Aj8QKe6CjGhd3VY36uCJB2fXFlqhwDywR0rFcrCnuc6ahVhEdrw/ J1tewg6AEURLUl82p+oiJWLwOmfL2sJ+fLU0ux2zL8/i1Ubd2bCl2N0MlhwUe+kJolWL LKIp61ZWReTAvFOBWsdWbmtoKqMtH9DUVO/74MLZcgBODrzP875X4wsqMwh7LY0xVb/M M5/Q== X-Gm-Message-State: AOAM532gZAaU49/4Ky6EukhJSvnf8J3yBi5VQLKFcgHm0V3Tb9xuaHNn mIA2wQMzk/XNDanX2vT46Vkj9Coj54juQg== X-Google-Smtp-Source: ABdhPJwz8Q1tz2CWh8KBmfj/csEz1MEWcsiN2bzrXQh3ice4lRZKDKX0YbzbrJUXHQDmhy+CPp0c2Q== X-Received: by 2002:a63:5d04:: with SMTP id r4mr16266249pgb.178.1621904859504; Mon, 24 May 2021 18:07:39 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:39 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 65/92] target/arm: Macroize helper_gvec_{s,u}dot_idx_{b,h} Date: Mon, 24 May 2021 18:03:31 -0700 Message-Id: <20210525010358.152808-66-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52c; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" We're about to add more variations on this theme. Accept the inner loop for the _h variants, rather than keep it unrolled. Signed-off-by: Richard Henderson --- target/arm/vec_helper.c | 160 ++++++++-------------------------------- 1 file changed, 29 insertions(+), 131 deletions(-) diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index cddf095c74..1c4266a9c0 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -569,139 +569,37 @@ DO_DOT(gvec_udot_b, uint32_t, uint8_t, uint8_t) DO_DOT(gvec_sdot_h, int64_t, int16_t, int16_t) DO_DOT(gvec_udot_h, uint64_t, uint16_t, uint16_t) -void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, - void *va, uint32_t desc) -{ - intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; - intptr_t index = simd_data(desc); - int32_t *d = vd, *a = va; - int8_t *n = vn; - int8_t *m_indexed = (int8_t *)vm + H4(index) * 4; - - /* Notice the special case of opr_sz == 8, from aa64/aa32 advsimd. - * Otherwise opr_sz is a multiple of 16. - */ - segend = MIN(4, opr_sz_4); - i = 0; - do { - int8_t m0 = m_indexed[i * 4 + 0]; - int8_t m1 = m_indexed[i * 4 + 1]; - int8_t m2 = m_indexed[i * 4 + 2]; - int8_t m3 = m_indexed[i * 4 + 3]; - - do { - d[i] = (a[i] + - n[i * 4 + 0] * m0 + - n[i * 4 + 1] * m1 + - n[i * 4 + 2] * m2 + - n[i * 4 + 3] * m3); - } while (++i < segend); - segend = i + 4; - } while (i < opr_sz_4); - - clear_tail(d, opr_sz, simd_maxsz(desc)); +#define DO_DOT_IDX(NAME, TYPED, TYPEN, TYPEM, HD) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t i = 0, opr_sz = simd_oprsz(desc); \ + intptr_t opr_sz_n = opr_sz / sizeof(TYPED); \ + intptr_t segend = MIN(16 / sizeof(TYPED), opr_sz_n); \ + intptr_t index = simd_data(desc); \ + TYPED *d = vd, *a = va; \ + TYPEN *n = vn; \ + TYPEM *m_indexed = (TYPEM *)vm + HD(index) * 4; \ + do { \ + TYPED m0 = m_indexed[i * 4 + 0]; \ + TYPED m1 = m_indexed[i * 4 + 1]; \ + TYPED m2 = m_indexed[i * 4 + 2]; \ + TYPED m3 = m_indexed[i * 4 + 3]; \ + do { \ + d[i] = (a[i] + \ + n[i * 4 + 0] * m0 + \ + n[i * 4 + 1] * m1 + \ + n[i * 4 + 2] * m2 + \ + n[i * 4 + 3] * m3); \ + } while (++i < segend); \ + segend = i + 4; \ + } while (i < opr_sz_n); \ + clear_tail(d, opr_sz, simd_maxsz(desc)); \ } -void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, - void *va, uint32_t desc) -{ - intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; - intptr_t index = simd_data(desc); - uint32_t *d = vd, *a = va; - uint8_t *n = vn; - uint8_t *m_indexed = (uint8_t *)vm + H4(index) * 4; - - /* Notice the special case of opr_sz == 8, from aa64/aa32 advsimd. - * Otherwise opr_sz is a multiple of 16. - */ - segend = MIN(4, opr_sz_4); - i = 0; - do { - uint8_t m0 = m_indexed[i * 4 + 0]; - uint8_t m1 = m_indexed[i * 4 + 1]; - uint8_t m2 = m_indexed[i * 4 + 2]; - uint8_t m3 = m_indexed[i * 4 + 3]; - - do { - d[i] = (a[i] + - n[i * 4 + 0] * m0 + - n[i * 4 + 1] * m1 + - n[i * 4 + 2] * m2 + - n[i * 4 + 3] * m3); - } while (++i < segend); - segend = i + 4; - } while (i < opr_sz_4); - - clear_tail(d, opr_sz, simd_maxsz(desc)); -} - -void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, - void *va, uint32_t desc) -{ - intptr_t i, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; - intptr_t index = simd_data(desc); - int64_t *d = vd, *a = va; - int16_t *n = vn; - int16_t *m_indexed = (int16_t *)vm + index * 4; - - /* This is supported by SVE only, so opr_sz is always a multiple of 16. - * Process the entire segment all at once, writing back the results - * only after we've consumed all of the inputs. - */ - for (i = 0; i < opr_sz_8; i += 2) { - int64_t d0, d1; - - d0 = a[i + 0]; - d0 += n[i * 4 + 0] * (int64_t)m_indexed[i * 4 + 0]; - d0 += n[i * 4 + 1] * (int64_t)m_indexed[i * 4 + 1]; - d0 += n[i * 4 + 2] * (int64_t)m_indexed[i * 4 + 2]; - d0 += n[i * 4 + 3] * (int64_t)m_indexed[i * 4 + 3]; - - d1 = a[i + 1]; - d1 += n[i * 4 + 4] * (int64_t)m_indexed[i * 4 + 0]; - d1 += n[i * 4 + 5] * (int64_t)m_indexed[i * 4 + 1]; - d1 += n[i * 4 + 6] * (int64_t)m_indexed[i * 4 + 2]; - d1 += n[i * 4 + 7] * (int64_t)m_indexed[i * 4 + 3]; - - d[i + 0] = d0; - d[i + 1] = d1; - } - clear_tail(d, opr_sz, simd_maxsz(desc)); -} - -void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, - void *va, uint32_t desc) -{ - intptr_t i, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; - intptr_t index = simd_data(desc); - uint64_t *d = vd, *a = va; - uint16_t *n = vn; - uint16_t *m_indexed = (uint16_t *)vm + index * 4; - - /* This is supported by SVE only, so opr_sz is always a multiple of 16. - * Process the entire segment all at once, writing back the results - * only after we've consumed all of the inputs. - */ - for (i = 0; i < opr_sz_8; i += 2) { - uint64_t d0, d1; - - d0 = a[i + 0]; - d0 += n[i * 4 + 0] * (uint64_t)m_indexed[i * 4 + 0]; - d0 += n[i * 4 + 1] * (uint64_t)m_indexed[i * 4 + 1]; - d0 += n[i * 4 + 2] * (uint64_t)m_indexed[i * 4 + 2]; - d0 += n[i * 4 + 3] * (uint64_t)m_indexed[i * 4 + 3]; - - d1 = a[i + 1]; - d1 += n[i * 4 + 4] * (uint64_t)m_indexed[i * 4 + 0]; - d1 += n[i * 4 + 5] * (uint64_t)m_indexed[i * 4 + 1]; - d1 += n[i * 4 + 6] * (uint64_t)m_indexed[i * 4 + 2]; - d1 += n[i * 4 + 7] * (uint64_t)m_indexed[i * 4 + 3]; - - d[i + 0] = d0; - d[i + 1] = d1; - } - clear_tail(d, opr_sz, simd_maxsz(desc)); -} +DO_DOT_IDX(gvec_sdot_idx_b, int32_t, int8_t, int8_t, H4) +DO_DOT_IDX(gvec_udot_idx_b, uint32_t, uint8_t, uint8_t, H4) +DO_DOT_IDX(gvec_sdot_idx_h, int64_t, int16_t, int16_t, ) +DO_DOT_IDX(gvec_udot_idx_h, uint64_t, uint16_t, uint16_t, ) void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm, void *vfpst, uint32_t desc) From patchwork Tue May 25 01:03:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277517 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D15EFC2B9F7 for ; Tue, 25 May 2021 01:49:23 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 60A636140F for ; Tue, 25 May 2021 01:49:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 60A636140F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:49430 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMCA-0002xL-Ef for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:49:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55384) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYK-0000D3-RF for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:12 -0400 Received: from mail-pj1-x1029.google.com ([2607:f8b0:4864:20::1029]:36450) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXr-000466-VP for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:12 -0400 Received: by mail-pj1-x1029.google.com with SMTP id n6-20020a17090ac686b029015d2f7aeea8so12277297pjt.1 for ; Mon, 24 May 2021 18:07:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=8LkYL6oLmF6Ik5JttTbaNom5I8sHRj4wI+YARoykw1E=; b=D+eWJZWG+Hmy7/cZ8YxYljkmWrnneUWQ9a/bgdcevayXkmn+9IVf7SQCTcXiTnh/3Z mulvQwv/a2wKvr6MLlDLvskdpB1KioIxoiYrdFFkSZteOlB2d/EAB5p6DLvdpjqFkF1N 7zKpyJv5viYaxHRx6kRWqcsi75yw+h1BkHXd3l4cQHDoKdxI32yp6CXpr9peCsYkzb7D iXoGt9y8MWZqjzn0kBVRCJqGNBPUDXPckQ4s6QmHBtFTVpwefacMTnZvAGelhTNd+L+9 Mac6oZjtomioiUQ3lL/yJlKisOMJbWMTDdQ76QI5kyXDMmZrTrIXa84ygzN9TXyZ597i uSmA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8LkYL6oLmF6Ik5JttTbaNom5I8sHRj4wI+YARoykw1E=; b=Jgj0HwSwJurXIvvFd52laKrUG79SR7EEkKmaeewVgsmHeJTl+gq2RyhrRYReY1Tetg 1HwY19oKlo58JBEb7YUuIuvgPAfN6PhYf9kAgP3NdwN7RnqAV4NUdjVPYBfe66BfnfHU bLiVw9lObAQ0viaMphmnNaMC85dYTfbrlBgKQRhNqPOjjEleAoyMZYpCdbKjsTFmHc13 vwojGnmZuNenNJOhG+4Jp06himOUhvAiukTtlsH06w3Hiv/A57iwpdrvBF2mZsypdUaE 6mjKTYRfGNM7kytzkligDrsE6vhml/KRCUvbmuSrg4pwY8A4kd37aBKTo/KlQkbtwZ8L gEGQ== X-Gm-Message-State: AOAM533d7uwYupsF1ShEGHXYfkSfKIhcwPEcjV2ZT7I/NczMFSXh3PbI k241enJwtyBITO7IsUhMQ3EmvJToXKrvvg== X-Google-Smtp-Source: ABdhPJyknisWuaKB1eVgbZgH3P9CDsAH15BHfgBeYsfjHdwrThAY6kr0Jf8QaM7lbE12AHrtWbFtKQ== X-Received: by 2002:a17:90a:af85:: with SMTP id w5mr27163936pjq.115.1621904860009; Mon, 24 May 2021 18:07:40 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:39 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 66/92] target/arm: Implement SVE mixed sign dot product (indexed) Date: Mon, 24 May 2021 18:03:32 -0700 Message-Id: <20210525010358.152808-67-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1029; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1029.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v7: Macroize the helpers. --- target/arm/cpu.h | 5 +++++ target/arm/helper.h | 4 ++++ target/arm/sve.decode | 4 ++++ target/arm/translate-sve.c | 16 ++++++++++++++++ target/arm/vec_helper.c | 2 ++ 5 files changed, 31 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 595bc6349d..0a41142d35 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -4246,6 +4246,11 @@ static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0; } +static inline bool isar_feature_aa64_sve_i8mm(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, I8MM) != 0; +} + static inline bool isar_feature_aa64_sve_f32mm(const ARMISARegisters *id) { return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, F32MM) != 0; diff --git a/target/arm/helper.h b/target/arm/helper.h index e7c463fff5..e4c6458f98 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -621,6 +621,10 @@ DEF_HELPER_FLAGS_5(gvec_sdot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_udot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_sudot_idx_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_usdot_idx_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0339410cf7..c6b32a3f69 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -816,6 +816,10 @@ SQRDMLSH_zzxz_h 01000100 0. 1 ..... 000101 ..... ..... @rrxr_3 esz=1 SQRDMLSH_zzxz_s 01000100 10 1 ..... 000101 ..... ..... @rrxr_2 esz=2 SQRDMLSH_zzxz_d 01000100 11 1 ..... 000101 ..... ..... @rrxr_1 esz=3 +# SVE mixed sign dot product (indexed) +USDOT_zzxw_s 01000100 10 1 ..... 000110 ..... ..... @rrxr_2 esz=2 +SUDOT_zzxw_s 01000100 10 1 ..... 000111 ..... ..... @rrxr_2 esz=2 + # SVE2 saturating multiply-add (indexed) SQDMLALB_zzxw_s 01000100 10 1 ..... 0010.0 ..... ..... @rrxr_3a esz=2 SQDMLALB_zzxw_d 01000100 11 1 ..... 0010.0 ..... ..... @rrxr_2a esz=3 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index b454f50a6b..30894a4143 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3838,6 +3838,22 @@ DO_RRXR(trans_SDOT_zzxw_d, gen_helper_gvec_sdot_idx_h) DO_RRXR(trans_UDOT_zzxw_s, gen_helper_gvec_udot_idx_b) DO_RRXR(trans_UDOT_zzxw_d, gen_helper_gvec_udot_idx_h) +static bool trans_SUDOT_zzxw_s(DisasContext *s, arg_rrxr_esz *a) +{ + if (!dc_isar_feature(aa64_sve_i8mm, s)) { + return false; + } + return do_zzxz_ool(s, a, gen_helper_gvec_sudot_idx_b); +} + +static bool trans_USDOT_zzxw_s(DisasContext *s, arg_rrxr_esz *a) +{ + if (!dc_isar_feature(aa64_sve_i8mm, s)) { + return false; + } + return do_zzxz_ool(s, a, gen_helper_gvec_usdot_idx_b); +} + #undef DO_RRXR static bool do_sve2_zzz_data(DisasContext *s, int rd, int rn, int rm, int data, diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 1c4266a9c0..f128b41eac 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -598,6 +598,8 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ DO_DOT_IDX(gvec_sdot_idx_b, int32_t, int8_t, int8_t, H4) DO_DOT_IDX(gvec_udot_idx_b, uint32_t, uint8_t, uint8_t, H4) +DO_DOT_IDX(gvec_sudot_idx_b, int32_t, int8_t, uint8_t, H4) +DO_DOT_IDX(gvec_usdot_idx_b, int32_t, uint8_t, int8_t, H4) DO_DOT_IDX(gvec_sdot_idx_h, int64_t, int16_t, int16_t, ) DO_DOT_IDX(gvec_udot_idx_h, uint64_t, uint16_t, uint16_t, ) From patchwork Tue May 25 01:03:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277575 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 56CB1C47084 for ; Tue, 25 May 2021 02:07:43 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0F59D6141D for ; Tue, 25 May 2021 02:07:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0F59D6141D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:57624 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMTu-0003pB-6Q for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:07:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55454) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYN-0000Rd-Uz for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:15 -0400 Received: from mail-pl1-x62f.google.com ([2607:f8b0:4864:20::62f]:45855) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXs-00046H-13 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:15 -0400 Received: by mail-pl1-x62f.google.com with SMTP id s4so13951527plg.12 for ; Mon, 24 May 2021 18:07:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=nAnCZLz+MRigDJDZpVvfWKaVfWlaQMDaEQ15z1KDlqw=; b=X2Boind6meauKW67aXkIfxA9DnmqGVI+R8Rs1fAzr+5FbaoTWPbL30HMmneSGcrxjd qizAh16aUCgDQ7wc9qLe+5Z0e0SbB7jGz9OiwW75o68db/jOgplcEXOojFEbSjzxtsyO CDTy6nev7t6luKXmXDnXy4FUK4umTdMVZCqBOx7B1dFXlmUVQ2FMpr2S/tIVPjQMSa6J G3EHjSkvy7BC+pzH/u2Qkn19ZuShWfhcFqjIlUxI8JGp/623/b+nYl594E1BKWdExRrD vva8Q48PN+6cJCd82AC6H352CJgz6NKCA03w7jdHKk74/HfN1nFc9RnMFSJu6X5khpRf dJpw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=nAnCZLz+MRigDJDZpVvfWKaVfWlaQMDaEQ15z1KDlqw=; b=SFRv4urLvCDfPZidUnZAf7deh098/tgvuNSZp2eYFsU81dDioEgz8G0UqmOYbYPDQ3 51X/3DCC2hRNMtqujVYEl4XpngUlbT/xRbXavomoOTVlWyV3JPZmWpu0vVyVP2FD6NdW 8UKarU/UxNMdDUxse6WLI0AZst+ccsHhRcMGlovkwesivq3vNAJWCIYO3gEp5vxC57ts ZEWdIWKBNeK7IzubNi6Jsx6CDe/elgCe1MaC2p9IIxnQklvykmQft01OzWI/cWU08qGa rwAaafZWihu3SX500vyr+oN51SC6cZKIj+H+nmOcwnpEEHsgrOnnddOEPMs+g+vHrtlt XihQ== X-Gm-Message-State: AOAM533tMTvRCkNjntLYsTdsKreU39kf+EyA83TXXa0YKr/Eb62Qx/PH nZBaMKqBeXkdwHDtQDyqxuV2bLztWERLYg== X-Google-Smtp-Source: ABdhPJyb+EH9b8PnageFqUTTzzJEYhdFjKO3u1We+DFhGR2k5NgF3JxtMr1vt15zPP+20zqMeWNWMQ== X-Received: by 2002:a17:90a:b796:: with SMTP id m22mr2058749pjr.146.1621904860670; Mon, 24 May 2021 18:07:40 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:40 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 67/92] target/arm: Implement SVE mixed sign dot product Date: Mon, 24 May 2021 18:03:33 -0700 Message-Id: <20210525010358.152808-68-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62f; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 1 + target/arm/sve.decode | 4 ++++ target/arm/translate-sve.c | 16 ++++++++++++++++ target/arm/vec_helper.c | 1 + 4 files changed, 22 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index e4c6458f98..2e212ae96b 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -612,6 +612,7 @@ DEF_HELPER_FLAGS_5(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_usdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_sdot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index c6b32a3f69..9f037fe5a7 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1530,6 +1530,10 @@ UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm CMLA_zzzz 01000100 esz:2 0 rm:5 0010 rot:2 rn:5 rd:5 ra=%reg_movprfx SQRDCMLAH_zzzz 01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5 ra=%reg_movprfx +## SVE mixed sign dot product + +USDOT_zzzz 01000100 .. 0 ..... 011 110 ..... ..... @rda_rn_rm + ### SVE2 floating point matrix multiply accumulate FMMLA 01100100 .. 1 ..... 111001 ..... ..... @rda_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 30894a4143..ae078b095a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8132,3 +8132,19 @@ static bool trans_SQRDCMLAH_zzzz(DisasContext *s, arg_SQRDCMLAH_zzzz *a) } return true; } + +static bool trans_USDOT_zzzz(DisasContext *s, arg_USDOT_zzzz *a) +{ + if (a->esz != 2 || !dc_isar_feature(aa64_sve_i8mm, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + vsz, vsz, 0, gen_helper_gvec_usdot_b); + } + return true; +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index f128b41eac..21ae1258f2 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -566,6 +566,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ DO_DOT(gvec_sdot_b, int32_t, int8_t, int8_t) DO_DOT(gvec_udot_b, uint32_t, uint8_t, uint8_t) +DO_DOT(gvec_usdot_b, uint32_t, uint8_t, int8_t) DO_DOT(gvec_sdot_h, int64_t, int16_t, int16_t) DO_DOT(gvec_udot_h, uint64_t, uint16_t, uint16_t) From patchwork Tue May 25 01:03:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277543 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B59A0C2B9F7 for ; Tue, 25 May 2021 01:58:07 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5D9836113B for ; Tue, 25 May 2021 01:58:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5D9836113B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:53440 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMKc-0007SO-DG for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:58:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55306) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYD-0008O0-J1 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:05 -0400 Received: from mail-pj1-x102a.google.com ([2607:f8b0:4864:20::102a]:45050) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXr-000475-Jf for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:03 -0400 Received: by mail-pj1-x102a.google.com with SMTP id h20-20020a17090aa894b029015db8f3969eso11565480pjq.3 for ; Mon, 24 May 2021 18:07:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=oldk+knzMpq2uW/3oLaoYDRDODzDy8i55wbuRwYP8h8=; b=OV32m4kkfxfWHwJyCUuQnFLS1+Gge+qVsVE5HvoxlW5uZ9IDcw0FZ7WIfRoI87do+q alYcmedogu2gxsRf5iY91HwKfHj8I4ABV3lT2se14iYwQtDh/vDb29r1JfHlMSuPBYwK 4KwNJivLk1uVy6zXtkfhM8Rr8p/Txdc0ul/85Frpf9DnjwpusEUwoTqIlvq4NMWdCM1q XU2qds4NSK+LcoOJso91MNlNuXg9d3is6nCL2y34Wrr7bzmumsJ0n6QkD3Jx33v99e0H iBTWRVYi0mqyM0CpAcBIkYONy/Pxhs3PMgv0tfQs9FzvrduX8T+/FvG+511Sa5IEJI2q jeVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=oldk+knzMpq2uW/3oLaoYDRDODzDy8i55wbuRwYP8h8=; b=ZrqKS66SKgaUpyYhzF3DvgqhIMFGxrnqLOVfnWb02S/XnyZYY5STvx6noe/sfb4Ex0 87PCtePebT2v4c2rtULQbh07t4XzZgkZeZAbgapl2qSJVVsVXs5jii/FNS0fNdh7Q1nS Hx1p3Q4qxx8JXtLyejcPCd/ZWRExp18xU/cW4GOupQ0DdIUrxZtX1gkDYHnvjTWgXO0B YtHgITO8n8LC+9cIqLWjd302CvCogHBGtYCKmhUnzvKW0MGlGT8krajdgfR1mriGbkxQ 33EHwyCE8E6NXqTx0/tb+rvbCkLafyKqMO9DjvOEhk4wIe9dYktNyso5RPpK2JaCanBQ UlQQ== X-Gm-Message-State: AOAM531a8YOrDXgrgSr0ofT27QS63assGEb1qkenIgEk4B7n6BhufCk2 2h/Te2+Roym13YOw1p2VO6KZIS/kDxTquw== X-Google-Smtp-Source: ABdhPJy2yCXxtgZO96scUCj7DdoHUiQ4JTnpHzEcoYTQI3ErTE1Y6kD17ZSyqdjdd6WXqcx2XHQd6w== X-Received: by 2002:a17:902:bb8e:b029:f4:58d1:5170 with SMTP id m14-20020a170902bb8eb02900f458d15170mr28067026pls.84.1621904861562; Mon, 24 May 2021 18:07:41 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:41 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 68/92] target/arm: Implement SVE2 crypto unary operations Date: Mon, 24 May 2021 18:03:34 -0700 Message-Id: <20210525010358.152808-69-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102a; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/sve.decode | 6 ++++++ target/arm/translate-sve.c | 11 +++++++++++ 2 files changed, 17 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 9f037fe5a7..a9cf3bea3e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1558,3 +1558,9 @@ STNT1_zprz 1110010 .. 00 ..... 001 ... ..... ..... \ # SVE2 32-bit scatter non-temporal store (vector plus scalar) STNT1_zprz 1110010 .. 10 ..... 001 ... ..... ..... \ @rprr_scatter_store xs=0 esz=2 scale=0 + +### SVE2 Crypto Extensions + +# SVE2 crypto unary operations +# AESMC and AESIMC +AESMC 01000101 00 10000011100 decrypt:1 00000 rd:5 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ae078b095a..79b4991549 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8148,3 +8148,14 @@ static bool trans_USDOT_zzzz(DisasContext *s, arg_USDOT_zzzz *a) } return true; } + +static bool trans_AESMC(DisasContext *s, arg_AESMC *a) +{ + if (!dc_isar_feature(aa64_sve2_aes, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zz(s, gen_helper_crypto_aesmc, a->rd, a->rd, a->decrypt); + } + return true; +} From patchwork Tue May 25 01:03:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277531 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C8E5C2B9F7 for ; Tue, 25 May 2021 01:54:05 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C379F6141B for ; Tue, 25 May 2021 01:54:04 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C379F6141B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39068 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMGh-0006KI-PL for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:54:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55590) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYS-0000nt-T5 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:20 -0400 Received: from mail-pl1-x630.google.com ([2607:f8b0:4864:20::630]:37526) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXs-00047I-89 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:20 -0400 Received: by mail-pl1-x630.google.com with SMTP id u7so6943375plq.4 for ; Mon, 24 May 2021 18:07:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5gZ7BW3wfCAgbeuXwj4YLYqajE5+QDFompI+UN5LUPE=; b=o5kzVQpyX93MnRF5Pie9HlNwTQma9uOJfxGL1eUxgfWOg5B2puIj4rfV7QfK+yfvb0 UFGZN1X8yb5tck9TwN4hX4+x63HnVghYvdrnBMj0Uxf+jUUTUgjfoewki9lFL0va0qRz DVdmL/p6Niq83h3Y86cu600IyuFebd4WyZJi4nodpdXlMMnNOxOGzNnIBVaAzuFMZ+Qx cUACnPLXh35TcI0QSW9/8bBcEAzl5fBsuewHssKBzTMM2zMjdcoRRPtlbTmSM7i2rdMT aFzpxJ7vxs5NpqvJjyHlnPonKHvMGUXdyfMz+kdECmqxTDkQj1ZBZNbovEdvVby68tDL 6L8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5gZ7BW3wfCAgbeuXwj4YLYqajE5+QDFompI+UN5LUPE=; b=kPEHh0lHHZhluIaaQldUXhuRyeghYXxyqbgP7SmArqgQZUw27/JY1F4lrEGKej5ofz T3brInhMv0kzY4dDSAe3JPVvLWGBguZtXmewLv8w0RUhISF4nTvj+LlUyKR0Bc7FQhTD hyjs07YPVKjoVwwrxnvofpApGghQAtJZ2HQdjEAAcUhUqbfRbszmo9Szu3HT5j87Im+D v1cfcTy6dKZc7m8YQ4d2WGhyWaF2hNob5ktE70uQ8BiTPv3Xy7YaDp5hG8SVe1z+xRix HIO61HZUMXpRbfuGDLpszEfbVHLxW8PVFzr9vAr9l58gDAJ0164QToUFraJMwwfioV9f yi5g== X-Gm-Message-State: AOAM531eLvee4/MWrvJyi0qACG30PQbyQ5TQWHUr0N6Kl0D893tR6g0u s1qvsF75JKruc7u8bzjAOwplb9KZ5fOAiA== X-Google-Smtp-Source: ABdhPJz48QbHSw6rn+GEa5bgxY17OJ7OijlKvMPSORCSaSVXjPWB2as4MMBmbws9cn5s9FrtyIz8ZQ== X-Received: by 2002:a17:90a:fb53:: with SMTP id iq19mr27908725pjb.11.1621904862114; Mon, 24 May 2021 18:07:42 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:41 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 69/92] target/arm: Implement SVE2 crypto destructive binary operations Date: Mon, 24 May 2021 18:03:35 -0700 Message-Id: <20210525010358.152808-70-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::630; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x630.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/cpu.h | 5 +++++ target/arm/sve.decode | 7 +++++++ target/arm/translate-sve.c | 38 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 50 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 0a41142d35..384c92eebb 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -4246,6 +4246,11 @@ static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0; } +static inline bool isar_feature_aa64_sve2_sm4(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SM4) != 0; +} + static inline bool isar_feature_aa64_sve_i8mm(const ARMISARegisters *id) { return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, I8MM) != 0; diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a9cf3bea3e..46ebb5e2f8 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -118,6 +118,8 @@ @pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz @rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \ &rrr_esz rn=%reg_movprfx +@rdn_rm_e0 ........ .. ...... ...... rm:5 rd:5 \ + &rrr_esz rn=%reg_movprfx esz=0 @rdn_sh_i8u ........ esz:2 ...... ...... ..... rd:5 \ &rri_esz rn=%reg_movprfx imm=%sh8_i8u @rdn_i8u ........ esz:2 ...... ... imm:8 rd:5 \ @@ -1564,3 +1566,8 @@ STNT1_zprz 1110010 .. 10 ..... 001 ... ..... ..... \ # SVE2 crypto unary operations # AESMC and AESIMC AESMC 01000101 00 10000011100 decrypt:1 00000 rd:5 + +# SVE2 crypto destructive binary operations +AESE 01000101 00 10001 0 11100 0 ..... ..... @rdn_rm_e0 +AESD 01000101 00 10001 0 11100 1 ..... ..... @rdn_rm_e0 +SM4E 01000101 00 10001 1 11100 0 ..... ..... @rdn_rm_e0 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 79b4991549..3b977b2462 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8159,3 +8159,41 @@ static bool trans_AESMC(DisasContext *s, arg_AESMC *a) } return true; } + +static bool do_aese(DisasContext *s, arg_rrr_esz *a, bool decrypt) +{ + if (!dc_isar_feature(aa64_sve2_aes, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzz(s, gen_helper_crypto_aese, + a->rd, a->rn, a->rm, decrypt); + } + return true; +} + +static bool trans_AESE(DisasContext *s, arg_rrr_esz *a) +{ + return do_aese(s, a, false); +} + +static bool trans_AESD(DisasContext *s, arg_rrr_esz *a) +{ + return do_aese(s, a, true); +} + +static bool do_sm4(DisasContext *s, arg_rrr_esz *a, gen_helper_gvec_3 *fn) +{ + if (!dc_isar_feature(aa64_sve2_sm4, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzz(s, fn, a->rd, a->rn, a->rm, 0); + } + return true; +} + +static bool trans_SM4E(DisasContext *s, arg_rrr_esz *a) +{ + return do_sm4(s, a, gen_helper_crypto_sm4e); +} From patchwork Tue May 25 01:03:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277553 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76E20C47084 for ; Tue, 25 May 2021 02:01:34 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3011061417 for ; Tue, 25 May 2021 02:01:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3011061417 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:35174 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMNx-0005bj-3q for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:01:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55538) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYQ-0000el-Ti for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:18 -0400 Received: from mail-pj1-x102b.google.com ([2607:f8b0:4864:20::102b]:35590) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXs-00047Q-7c for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:18 -0400 Received: by mail-pj1-x102b.google.com with SMTP id lx17-20020a17090b4b11b029015f3b32b8dbso10496655pjb.0 for ; Mon, 24 May 2021 18:07:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=al5zygwWXlZnBczeihpPgLL//CCdS4mSvx1LWn5XSaU=; b=UsNI/+8DVAFvY9wJZM7qJshQpJXVup3a39IYJALxHwmrFsEJ1FK1/biEduebSZGRjd R40HvDxyIxPaG82zMtu3RT1eRbanJsQ0vpE0VFHuJzozq97cm+Qvhd/rQdes9Xly5yKF UORJPny/Fs/vDk6FMxtl/iFBmgpDpxaLNfeIJ1QXsKQ7udn45ddMBB+pKhMNPTDTy2W1 FEmprUX9yg6m9JbKtyKxOWQ7r/DckyBUgONtIRDLYZWa4hq07k0gEIFWuK0aLG+URwn4 xzAJwNJ0x1yRL5ukFK0ky5ovjGi2zwxl/dvvrli1Uz1VhgMhpNgNpidkld3Z18HkQNoR y3yA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=al5zygwWXlZnBczeihpPgLL//CCdS4mSvx1LWn5XSaU=; b=hq1H6Slx/sVwLwVDTk+QNZcfWTHBl0aucbeBIL+MVYcd+RrmY4puCUkEcBmZ22Y/mF RH5/SVvmQXf4LqXcKVfTt7mBNaluAGEm8AGRZPyVb5f3+hUuFCk9c3+a+sKhXYHUjGnG BXk0We6lzOJzoynKN5jr0SAiurEptDI+rOgRcOIBrGsS8+Iu2Y9/z0+tor6zWkl0fDfz OIIfxO2w5Pp3cCb9lOZwGGdw/1/DccFPnwfcAzTPEB7IpZFx1+yT/CjRhSqHbwTOmim0 DG2tJk5lEaG2oXFZPK5sQ3/msAvExp0oWWfCUhbpjga3rxRt2JmWcrEp48kMWouEAsMM Jozg== X-Gm-Message-State: AOAM533TqxhRswHMGsBtx27dH1hPGEi6wuqWSkklbPfgGMG9KG3e/Tv+ JXfSG8+qB42RHjgDZPeagcPKfL91D3lsag== X-Google-Smtp-Source: ABdhPJw5OfkS1BLG0wDeLNhVh86dER6ctLdzcFPkh858/Z3WU4XXpMp5zUNahYN+Hcf4Sw4ywzpL/w== X-Received: by 2002:a17:90a:ab8c:: with SMTP id n12mr1929359pjq.201.1621904862713; Mon, 24 May 2021 18:07:42 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:42 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 70/92] target/arm: Implement SVE2 crypto constructive binary operations Date: Mon, 24 May 2021 18:03:36 -0700 Message-Id: <20210525010358.152808-71-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102b; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/cpu.h | 5 +++++ target/arm/sve.decode | 4 ++++ target/arm/translate-sve.c | 16 ++++++++++++++++ 3 files changed, 25 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 384c92eebb..c75601b221 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -4246,6 +4246,11 @@ static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0; } +static inline bool isar_feature_aa64_sve2_sha3(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SHA3) != 0; +} + static inline bool isar_feature_aa64_sve2_sm4(const ARMISARegisters *id) { return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SM4) != 0; diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 46ebb5e2f8..051a6399ac 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1571,3 +1571,7 @@ AESMC 01000101 00 10000011100 decrypt:1 00000 rd:5 AESE 01000101 00 10001 0 11100 0 ..... ..... @rdn_rm_e0 AESD 01000101 00 10001 0 11100 1 ..... ..... @rdn_rm_e0 SM4E 01000101 00 10001 1 11100 0 ..... ..... @rdn_rm_e0 + +# SVE2 crypto constructive binary operations +SM4EKEY 01000101 00 1 ..... 11110 0 ..... ..... @rd_rn_rm_e0 +RAX1 01000101 00 1 ..... 11110 1 ..... ..... @rd_rn_rm_e0 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3b977b2462..2136a41094 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8197,3 +8197,19 @@ static bool trans_SM4E(DisasContext *s, arg_rrr_esz *a) { return do_sm4(s, a, gen_helper_crypto_sm4e); } + +static bool trans_SM4EKEY(DisasContext *s, arg_rrr_esz *a) +{ + return do_sm4(s, a, gen_helper_crypto_sm4ekey); +} + +static bool trans_RAX1(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2_sha3, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_fn_zzz(s, gen_gvec_rax1, MO_64, a->rd, a->rn, a->rm); + } + return true; +} From patchwork Tue May 25 01:03:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277599 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C542FC2B9F7 for ; Tue, 25 May 2021 02:20:52 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3FD49613EC for ; Tue, 25 May 2021 02:20:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3FD49613EC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:45070 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMgd-0008Ul-6O for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:20:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55686) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYW-000136-CW for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:24 -0400 Received: from mail-pj1-x102a.google.com ([2607:f8b0:4864:20::102a]:50772) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXt-00047d-DX for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:24 -0400 Received: by mail-pj1-x102a.google.com with SMTP id t11so15883017pjm.0 for ; Mon, 24 May 2021 18:07:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=wEecqwdmdBE90cV2vELccP25jMzWfd6VgncPN1qIkME=; b=f16mTpWHHRlXwzpwKhZDcQlt9ZDz5GfzeaeLMBt8Ts1FPiOrRcYJvT1Gl6kRFpPTQg QxIcGmND0NqOvldNR9oUgKN07RsRJltwVrhBAtk/MTuudua7bNmM3Cy+NyqcL9T9XRpD VCKAf44hFmXFSJpGlWtBM/qefY8N2hsIlmHLyNk5QGWgbC9E+XdmgpVQqmKcS4dZUmKo UoHLH5mJrXa+bb809poezA5i/fDCh0/lbEVZST+aW7XOPBG1zpG6XSv0TD2biNP6SzR3 yq9QvSLBdJ6NeGN+4N1DcvY2MtOE7Ji0aTag8ns37w1JtTPJ4jt8RbZXQWqjlI0nVXiJ GV2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=wEecqwdmdBE90cV2vELccP25jMzWfd6VgncPN1qIkME=; b=e3SoaJ1a2SbyS1u/kzhFe2QUEN9XuSESKtEeKkoDU3F3IYrG6yory6qR+KD/T1S/UX TxRkRxw7NKD24Sgtv+7ptb1/9t1K84zpWT1hd5V/cKevmW6aq4xCW3vAxGoQTzf5PG3p g667RakBUTOmEt7GbQ42OCAV/UIK/X8v2TOBGmwA5gHpIglP51ZyNy9O9k5v48YjzxRG hb8eJj2zl2UEB5CWalWhNCbT8QOw3Ab72uOr9//zs98boIotHRy3f4V2Tyjkang6/pZf NBzf4Qv80LZvNvK2NWLQwSDZ0nIC4CWahHeao6B8UuKlb/XHHxaKLB3pv5/XVJDaAFC/ kPeQ== X-Gm-Message-State: AOAM532EO2xn9qfHEo6TKwIN7aEoq3jzbSkViZgYzdk56ZUu2Awnr7kR yaIGfBFqczNJ2+zn3yKl0eBnwQt1/RJ4MQ== X-Google-Smtp-Source: ABdhPJz66sc6d74k/xsP06Piqft4+HPktOaFSLX/Yy1GzY0zS4Nqkyy2Al3b4cnFYBv5e7eAB9hjZA== X-Received: by 2002:a17:902:ecc3:b029:f0:a7c2:7b3f with SMTP id a3-20020a170902ecc3b02900f0a7c27b3fmr28229823plh.4.1621904863367; Mon, 24 May 2021 18:07:43 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:43 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 71/92] target/arm: Implement SVE2 TBL, TBX Date: Mon, 24 May 2021 18:03:37 -0700 Message-Id: <20210525010358.152808-72-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102a; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200428144352.9275-1-steplong@quicinc.com> [rth: rearrange the macros a little and rebase] Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 10 +++++ target/arm/sve.decode | 5 +++ target/arm/sve_helper.c | 90 ++++++++++++++++++++++++++++++-------- target/arm/translate-sve.c | 33 ++++++++++++++ 4 files changed, 119 insertions(+), 19 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index efc9a7ccf1..cdff155ead 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -661,6 +661,16 @@ DEF_HELPER_FLAGS_4(sve_tbl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_tbl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_tbl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_tbl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_tbl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_tbl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_tbl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_tbx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_tbx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_tbx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_tbx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve_sunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_sunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_sunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 051a6399ac..fdeb7b106b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -558,6 +558,11 @@ TBL 00000101 .. 1 ..... 001100 ..... ..... @rd_rn_rm # SVE unpack vector elements UNPK 00000101 esz:2 1100 u:1 h:1 001110 rn:5 rd:5 +# SVE2 Table Lookup (three sources) + +TBL_sve2 00000101 .. 1 ..... 001010 ..... ..... @rd_rn_rm +TBX 00000101 .. 1 ..... 001011 ..... ..... @rd_rn_rm + ### SVE Permute - Predicates Group # SVE permute predicate elements diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f9c2061260..4b05e2e427 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3069,28 +3069,80 @@ void HELPER(sve_rev_d)(void *vd, void *vn, uint32_t desc) } } -#define DO_TBL(NAME, TYPE, H) \ -void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ -{ \ - intptr_t i, opr_sz = simd_oprsz(desc); \ - uintptr_t elem = opr_sz / sizeof(TYPE); \ - TYPE *d = vd, *n = vn, *m = vm; \ - ARMVectorReg tmp; \ - if (unlikely(vd == vn)) { \ - n = memcpy(&tmp, vn, opr_sz); \ - } \ - for (i = 0; i < elem; i++) { \ - TYPE j = m[H(i)]; \ - d[H(i)] = j < elem ? n[H(j)] : 0; \ - } \ +typedef void tb_impl_fn(void *, void *, void *, void *, uintptr_t, bool); + +static inline void do_tbl1(void *vd, void *vn, void *vm, uint32_t desc, + bool is_tbx, tb_impl_fn *fn) +{ + ARMVectorReg scratch; + uintptr_t oprsz = simd_oprsz(desc); + + if (unlikely(vd == vn)) { + vn = memcpy(&scratch, vn, oprsz); + } + + fn(vd, vn, NULL, vm, oprsz, is_tbx); } -DO_TBL(sve_tbl_b, uint8_t, H1) -DO_TBL(sve_tbl_h, uint16_t, H2) -DO_TBL(sve_tbl_s, uint32_t, H4) -DO_TBL(sve_tbl_d, uint64_t, ) +static inline void do_tbl2(void *vd, void *vn0, void *vn1, void *vm, + uint32_t desc, bool is_tbx, tb_impl_fn *fn) +{ + ARMVectorReg scratch; + uintptr_t oprsz = simd_oprsz(desc); -#undef TBL + if (unlikely(vd == vn0)) { + vn0 = memcpy(&scratch, vn0, oprsz); + if (vd == vn1) { + vn1 = vn0; + } + } else if (unlikely(vd == vn1)) { + vn1 = memcpy(&scratch, vn1, oprsz); + } + + fn(vd, vn0, vn1, vm, oprsz, is_tbx); +} + +#define DO_TB(SUFF, TYPE, H) \ +static inline void do_tb_##SUFF(void *vd, void *vt0, void *vt1, \ + void *vm, uintptr_t oprsz, bool is_tbx) \ +{ \ + TYPE *d = vd, *tbl0 = vt0, *tbl1 = vt1, *indexes = vm; \ + uintptr_t i, nelem = oprsz / sizeof(TYPE); \ + for (i = 0; i < nelem; ++i) { \ + TYPE index = indexes[H1(i)], val = 0; \ + if (index < nelem) { \ + val = tbl0[H(index)]; \ + } else { \ + index -= nelem; \ + if (tbl1 && index < nelem) { \ + val = tbl1[H(index)]; \ + } else if (is_tbx) { \ + continue; \ + } \ + } \ + d[H(i)] = val; \ + } \ +} \ +void HELPER(sve_tbl_##SUFF)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + do_tbl1(vd, vn, vm, desc, false, do_tb_##SUFF); \ +} \ +void HELPER(sve2_tbl_##SUFF)(void *vd, void *vn0, void *vn1, \ + void *vm, uint32_t desc) \ +{ \ + do_tbl2(vd, vn0, vn1, vm, desc, false, do_tb_##SUFF); \ +} \ +void HELPER(sve2_tbx_##SUFF)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + do_tbl1(vd, vn, vm, desc, true, do_tb_##SUFF); \ +} + +DO_TB(b, uint8_t, H1) +DO_TB(h, uint16_t, H2) +DO_TB(s, uint32_t, H4) +DO_TB(d, uint64_t, ) + +#undef DO_TB #define DO_UNPK(NAME, TYPED, TYPES, HD, HS) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 2136a41094..9a6f7c87c1 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2417,6 +2417,39 @@ static bool trans_TBL(DisasContext *s, arg_rrr_esz *a) return true; } +static bool trans_TBL_sve2(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_4 * const fns[4] = { + gen_helper_sve2_tbl_b, gen_helper_sve2_tbl_h, + gen_helper_sve2_tbl_s, gen_helper_sve2_tbl_d + }; + + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzz(s, fns[a->esz], a->rd, a->rn, + (a->rn + 1) % 32, a->rm, 0); + } + return true; +} + +static bool trans_TBX(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_tbx_b, gen_helper_sve2_tbx_h, + gen_helper_sve2_tbx_s, gen_helper_sve2_tbx_d + }; + + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzz(s, fns[a->esz], a->rd, a->rn, a->rm, 0); + } + return true; +} + static bool trans_UNPK(DisasContext *s, arg_UNPK *a) { static gen_helper_gvec_2 * const fns[4][2] = { From patchwork Tue May 25 01:03:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277597 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F1B14C2B9F7 for ; Tue, 25 May 2021 02:18:37 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 721C86141D for ; Tue, 25 May 2021 02:18:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 721C86141D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:38808 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMeS-00045z-Ea for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:18:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55652) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYV-0000yx-JL for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:23 -0400 Received: from mail-pl1-x62a.google.com ([2607:f8b0:4864:20::62a]:39918) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXt-00047w-Dg for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:23 -0400 Received: by mail-pl1-x62a.google.com with SMTP id t9so7067022ply.6 for ; Mon, 24 May 2021 18:07:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=lgZhxU07vdbFbL3Rra8/VVNowuJ8Xs/CdPm7YJWJNXI=; b=M8kpUrSjJ5a0hwd/HrgMt2/H2Nz8ZqvBEadKqHxhvqXxdvvCoxgQGEEMUBPX26Ej/8 EyjW8JoYob5en2CYK5NxHrcq8R00QsfRnua5EE0NuFuI3oZ3TjwFHY05ueHsNJgbURld oiTLnNc41kaOXnsiFebR/mOB+pAiJMOzsvu1RtL6z5PBcY7ZAYnLe56m4WI6F/AhmtMM xQSK6UXOyoddW4XjuRvpyYrKiaxgdogO3HpenGh4/7xEFZbuZuOYGS7H9/3EENVRAFoD mvuYksnJwf6TELtDXxWITBeZYNjrHUrYHga9fVhcwdVJrKJ6Cqb/V5tLAHDP//Ft+xTz NtQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=lgZhxU07vdbFbL3Rra8/VVNowuJ8Xs/CdPm7YJWJNXI=; b=Qjt8Rt1SoDeeu83Dz5Mrp41r3vgYH7jrCEKEpodUICQvy7n8uOjKNUAROT8sXqc45U dj3OiRVpQHATs20LoUC0LRUgwqw31v+zjJDbOA/haeR10oD6HKlIcA0VEPUxq5VfMvhy e7bVG3IBYqcuKAyZNIUqWV6Tgej9Rb6v1KtObLf0blG8Xb645kPDzECzv+1USkf9u8mk wcSuq+XKSnJjwkeZl/jQv76aWU2DkPXsZJ9yzmY0XeTXcq+4JY2dLI+8QB58YE4JUyv+ JE7UF0Eb/JJSUdCLJb3RN3zP+MIvSOQjRSIPmgtY4i3Uz8ZQxMZAETsFcc9VfyJ7VcG8 LocA== X-Gm-Message-State: AOAM533Bkx4qlCxwMlPiqXlt7l/PODKuoNQZsuToUm2qqRF9GLCl2XUf PFOQkTrcC0mE5dzNU3OhtrbjL/gdzkaRyw== X-Google-Smtp-Source: ABdhPJyYedMgoSoVE7TRRHCjF/oWUEWZt7uTBJJFrUBGOKu2KERCr5K2faBhuYgyQ5cA/y0CwaQGuA== X-Received: by 2002:a17:90b:1c0b:: with SMTP id oc11mr27749617pjb.156.1621904863964; Mon, 24 May 2021 18:07:43 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:43 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 72/92] target/arm: Implement SVE2 FCVTNT Date: Mon, 24 May 2021 18:03:38 -0700 Message-Id: <20210525010358.152808-73-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62a; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200428174332.17162-2-steplong@quicinc.com> Signed-off-by: Richard Henderson --- v7: Fix big-endian indexing. --- target/arm/helper-sve.h | 5 +++++ target/arm/sve.decode | 4 ++++ target/arm/sve_helper.c | 20 ++++++++++++++++++++ target/arm/translate-sve.c | 16 ++++++++++++++++ 4 files changed, 45 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index cdff155ead..7aa365d565 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2744,3 +2744,8 @@ DEF_HELPER_FLAGS_5(sve2_cdot_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_cdot_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_fcvtnt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_fcvtnt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index fdeb7b106b..94cdc6ff15 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1580,3 +1580,7 @@ SM4E 01000101 00 10001 1 11100 0 ..... ..... @rdn_rm_e0 # SVE2 crypto constructive binary operations SM4EKEY 01000101 00 1 ..... 11110 0 ..... ..... @rd_rn_rm_e0 RAX1 01000101 00 1 ..... 11110 1 ..... ..... @rd_rn_rm_e0 + +### SVE2 floating-point convert precision odd elements +FCVTNT_sh 01100100 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0 +FCVTNT_ds 01100100 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4b05e2e427..d44bcfa44a 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -7602,3 +7602,23 @@ void HELPER(fmmla_d)(void *vd, void *vn, void *vm, void *va, d[3] = float64_add(a[3], float64_add(p0, p1, status), status); } } + +#define DO_FCVTNT(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc); \ + uint64_t *g = vg; \ + do { \ + uint64_t pg = g[(i - 1) >> 6]; \ + do { \ + i -= sizeof(TYPEW); \ + if (likely((pg >> (i & 63)) & 1)) { \ + TYPEW nn = *(TYPEW *)(vn + HW(i)); \ + *(TYPEN *)(vd + HN(i + sizeof(TYPEN))) = OP(nn, status); \ + } \ + } while (i & 63); \ + } while (i != 0); \ +} + +DO_FCVTNT(sve2_fcvtnt_sh, uint32_t, uint16_t, H1_4, H1_2, sve_f32_to_f16) +DO_FCVTNT(sve2_fcvtnt_ds, uint64_t, uint32_t, , H1_4, float64_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 9a6f7c87c1..700b02814c 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8246,3 +8246,19 @@ static bool trans_RAX1(DisasContext *s, arg_rrr_esz *a) } return true; } + +static bool trans_FCVTNT_sh(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtnt_sh); +} + +static bool trans_FCVTNT_ds(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtnt_ds); +} From patchwork Tue May 25 01:03:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277545 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF664C2B9F7 for ; Tue, 25 May 2021 01:58:10 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5AD3E61417 for ; Tue, 25 May 2021 01:58:10 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5AD3E61417 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:53672 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMKf-0007c0-Db for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:58:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55720) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYY-000191-4E for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:26 -0400 Received: from mail-pj1-x102e.google.com ([2607:f8b0:4864:20::102e]:45054) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXt-00049C-Sy for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:25 -0400 Received: by mail-pj1-x102e.google.com with SMTP id h20-20020a17090aa894b029015db8f3969eso11565530pjq.3 for ; Mon, 24 May 2021 18:07:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=4DkLdzGeRLcffpsdPeSt6njaGGwt+9skrRZyw6tOeco=; b=SqiWIIZKCKmPoER09CeOoQshexGWYnBV3zir+cjSV6/VYsBZTrTQe8GcoSd5Vs1HxO ZgoMZfJvKLzPXkac+f4hfmIrwdrsOrZqjTBL2ZRSc3cn1lO5iXrcPDEUjNsQFrXtseuS f6qFF5sHYPThzf6URNDnoqi6aDfMyx5P7ZiH5j7skgKEpQXJAwADV/DZQxRdXsdGOGBr BCeVEEimbniJZsTyzCR7TseQ9En6XvZCb1Yd7TfH8gzdZt5mofMbgJS1rp1fwBHvHTJu DNgPnnyGgV8QEaA7cr49E1VKW0VhbQFnLdgKIpfMo1C8+wU8DmDHGLoHtEbonUgCa9kw 306A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=4DkLdzGeRLcffpsdPeSt6njaGGwt+9skrRZyw6tOeco=; b=ADRqzfEFonxqXEkIpT7B3l0AHrJumfUPKQqHoeea2GuYw0oSiuNeFUZE5MFG40aQoM b/yyFsMKiD9XH/UTPtM+vOXgcrLGGWmMuf8WxOqLyeyg/S659fN7wK0Ry0A0pkvTh65i kMqINBpCCT9MYB5IhlWzYIYoVnBLIcXqC2wOBr8qgUIBdE2cA6yspd/11Qx8bj0QWimx z4q58eah6keT+YsUomjldS67o6ZH2JnW9nYeybJY83/rp4ZcxgXl2Zz88MSB2wfhPtG0 RwzhHM68YNl0QkFFAZ4O6JeVXVD3/20ghnrO6y2fBCMpC5YRt4c3Pkyop9ea2qoh0fbY q++A== X-Gm-Message-State: AOAM533leLqYC4iXj+6avrvin8ENDETXAOAwW/uGHqiMxB4XP7iSAVwG MqZJV/pvG3gDNF7adyDc8mD+ml2+JS/QXw== X-Google-Smtp-Source: ABdhPJwbC3zqkj6BLc22Gf/fmS4uE9Ry9qzOHp9FFFEif+XKps5WZNAtolpZlJCkvOZ/kGgrHa6TrA== X-Received: by 2002:a17:90a:ba07:: with SMTP id s7mr27786875pjr.129.1621904864593; Mon, 24 May 2021 18:07:44 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:44 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 73/92] target/arm: Implement SVE2 FCVTLT Date: Mon, 24 May 2021 18:03:39 -0700 Message-Id: <20210525010358.152808-74-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102e; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200428174332.17162-3-steplong@quicinc.com> Signed-off-by: Richard Henderson --- v7: Fix big-endian indexing. --- target/arm/helper-sve.h | 5 +++++ target/arm/sve.decode | 2 ++ target/arm/sve_helper.c | 23 +++++++++++++++++++++++ target/arm/translate-sve.c | 16 ++++++++++++++++ 4 files changed, 46 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 7aa365d565..be4b17f1c2 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2749,3 +2749,8 @@ DEF_HELPER_FLAGS_5(sve2_fcvtnt_sh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_fcvtnt_ds, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_fcvtlt_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_fcvtlt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 94cdc6ff15..1be3515470 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1583,4 +1583,6 @@ RAX1 01000101 00 1 ..... 11110 1 ..... ..... @rd_rn_rm_e0 ### SVE2 floating-point convert precision odd elements FCVTNT_sh 01100100 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0 +FCVTLT_hs 01100100 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0 FCVTNT_ds 01100100 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 +FCVTLT_sd 01100100 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d44bcfa44a..8882393515 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -7622,3 +7622,26 @@ void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ DO_FCVTNT(sve2_fcvtnt_sh, uint32_t, uint16_t, H1_4, H1_2, sve_f32_to_f16) DO_FCVTNT(sve2_fcvtnt_ds, uint64_t, uint32_t, , H1_4, float64_to_float32) + +#define DO_FCVTLT(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc); \ + uint64_t *g = vg; \ + do { \ + uint64_t pg = g[(i - 1) >> 6]; \ + do { \ + i -= sizeof(TYPEW); \ + if (likely((pg >> (i & 63)) & 1)) { \ + TYPEN nn = *(TYPEN *)(vn + HN(i + sizeof(TYPEN))); \ + *(TYPEW *)(vd + HW(i)) = OP(nn, status); \ + } \ + } while (i & 63); \ + } while (i != 0); \ +} + +DO_FCVTLT(sve2_fcvtlt_hs, uint32_t, uint16_t, H1_4, H1_2, sve_f16_to_f32) +DO_FCVTLT(sve2_fcvtlt_sd, uint64_t, uint32_t, , H1_4, float32_to_float64) + +#undef DO_FCVTLT +#undef DO_FCVTNT diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 700b02814c..7490094d17 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8262,3 +8262,19 @@ static bool trans_FCVTNT_ds(DisasContext *s, arg_rpr_esz *a) } return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtnt_ds); } + +static bool trans_FCVTLT_hs(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtlt_hs); +} + +static bool trans_FCVTLT_sd(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtlt_sd); +} From patchwork Tue May 25 01:03:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277579 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40845C2B9F8 for ; Tue, 25 May 2021 02:09:23 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B32006141D for ; Tue, 25 May 2021 02:09:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B32006141D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:34276 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMVV-00079T-RV for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:09:21 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55722) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYY-00019M-7g for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:26 -0400 Received: from mail-pl1-x62c.google.com ([2607:f8b0:4864:20::62c]:39920) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXu-00049M-ET for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:25 -0400 Received: by mail-pl1-x62c.google.com with SMTP id t9so7067046ply.6 for ; Mon, 24 May 2021 18:07:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=lbC8FoGg2H/cVuag5f63FQxhwvwYVaxuD9dMnuei1/0=; b=k2mqsEXQ4PmnHDU/7oM2qAYvChPVeTS4uCbN1slWLX1P71wuzNUp5loiw8Xr7FiXfP mBDQJDFA3jarPdFc6wur+KD75Yki95C2/3iMG8HyBcDzJfNkLwP/paKIBtUKYKasKait 3ySVK/biOXTvRJqDoqyS+ZtZFH94L/nxrrzcY6DjqeaEppTVbozIv47HaA3pyRO1IkPS nFrN2L02Vmw2iheftEjgs9X71fkNQax5DPq2Kur6TQ6NMrqfwNYJzJY2o+4d462MIsTZ M/d6PoRR1OrZ+5ZibuiZ7QQp33WnZ2izXV/e+0fFEJR+5wSq6YisAjWRoxhizsUUIiUL Fjgg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=lbC8FoGg2H/cVuag5f63FQxhwvwYVaxuD9dMnuei1/0=; b=uTOfhGx/XnRtdPck4OyM+Aq5+3NcWXPIfgQFuV+1ZMy6k80RvAl8YrzP+dxXQl3IRh U/qttEfnTQR1bkuBJqISuaSEnz6MfY31pdllXIdrQW3FiXLcdgsl9jc6N61u1ozddks5 hUO2EGounkq/ElkggqXKlgnheWkV37olX5bje0ZsiGNkId21KrmLYZTH2Tyb07J/ke0s 4frKfRarqy9d8AfxaRk4KeAH8zjWyCvge3AHrl1iiQkrej1trL3bdr+MxnzlbRjeVJ8D 5b7dBO1pRgHg2HQO3dfNY4UGuKQvcmkUVyLKslIhe+jBfvTxpDoXDI43ChNOUslB2AX1 7+Vg== X-Gm-Message-State: AOAM533Iyvhl99dyx8rAIdATWIyPplSdPQlgo2hNq8LOz8stcQeIoLKL u2TVNK25LzXcgNZ8r2A+6dVFJS0PcRjojQ== X-Google-Smtp-Source: ABdhPJzmauIywsByjEKcQpaLIJe5u33AEVVHvADvTOiD7jMwveZLBkaA0TfP2oSsWMrJgk9Lh/gtzA== X-Received: by 2002:a17:902:c9c3:b029:f5:4af8:259e with SMTP id q3-20020a170902c9c3b02900f54af8259emr28182938pld.41.1621904865126; Mon, 24 May 2021 18:07:45 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:44 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 74/92] target/arm: Implement SVE2 FCVTXNT, FCVTX Date: Mon, 24 May 2021 18:03:40 -0700 Message-Id: <20210525010358.152808-75-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62c; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200428174332.17162-4-steplong@quicinc.com> [rth: Use do_frint_mode, which avoids a specific runtime helper.] Signed-off-by: Richard Henderson --- target/arm/sve.decode | 2 ++ target/arm/translate-sve.c | 49 ++++++++++++++++++++++++++++++-------- 2 files changed, 41 insertions(+), 10 deletions(-) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 1be3515470..5dcc79759e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1582,6 +1582,8 @@ SM4EKEY 01000101 00 1 ..... 11110 0 ..... ..... @rd_rn_rm_e0 RAX1 01000101 00 1 ..... 11110 1 ..... ..... @rd_rn_rm_e0 ### SVE2 floating-point convert precision odd elements +FCVTXNT_ds 01100100 00 0010 10 101 ... ..... ..... @rd_pg_rn_e0 +FCVTX_ds 01100101 00 0010 10 101 ... ..... ..... @rd_pg_rn_e0 FCVTNT_sh 01100100 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0 FCVTLT_hs 01100100 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0 FCVTNT_ds 01100100 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7490094d17..0a2718c481 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4777,11 +4777,9 @@ static bool trans_FRINTX(DisasContext *s, arg_rpr_esz *a) return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]); } -static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode) +static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, + int mode, gen_helper_gvec_3_ptr *fn) { - if (a->esz == 0) { - return false; - } if (sve_access_check(s)) { unsigned vsz = vec_full_reg_size(s); TCGv_i32 tmode = tcg_const_i32(mode); @@ -4792,7 +4790,7 @@ static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode) tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), vec_full_reg_offset(s, a->rn), pred_full_reg_offset(s, a->pg), - status, vsz, vsz, 0, frint_fns[a->esz - 1]); + status, vsz, vsz, 0, fn); gen_helper_set_rmode(tmode, tmode, status); tcg_temp_free_i32(tmode); @@ -4803,27 +4801,42 @@ static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode) static bool trans_FRINTN(DisasContext *s, arg_rpr_esz *a) { - return do_frint_mode(s, a, float_round_nearest_even); + if (a->esz == 0) { + return false; + } + return do_frint_mode(s, a, float_round_nearest_even, frint_fns[a->esz - 1]); } static bool trans_FRINTP(DisasContext *s, arg_rpr_esz *a) { - return do_frint_mode(s, a, float_round_up); + if (a->esz == 0) { + return false; + } + return do_frint_mode(s, a, float_round_up, frint_fns[a->esz - 1]); } static bool trans_FRINTM(DisasContext *s, arg_rpr_esz *a) { - return do_frint_mode(s, a, float_round_down); + if (a->esz == 0) { + return false; + } + return do_frint_mode(s, a, float_round_down, frint_fns[a->esz - 1]); } static bool trans_FRINTZ(DisasContext *s, arg_rpr_esz *a) { - return do_frint_mode(s, a, float_round_to_zero); + if (a->esz == 0) { + return false; + } + return do_frint_mode(s, a, float_round_to_zero, frint_fns[a->esz - 1]); } static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a) { - return do_frint_mode(s, a, float_round_ties_away); + if (a->esz == 0) { + return false; + } + return do_frint_mode(s, a, float_round_ties_away, frint_fns[a->esz - 1]); } static bool trans_FRECPX(DisasContext *s, arg_rpr_esz *a) @@ -8278,3 +8291,19 @@ static bool trans_FCVTLT_sd(DisasContext *s, arg_rpr_esz *a) } return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtlt_sd); } + +static bool trans_FCVTX_ds(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_frint_mode(s, a, float_round_to_odd, gen_helper_sve_fcvt_ds); +} + +static bool trans_FCVTXNT_ds(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_frint_mode(s, a, float_round_to_odd, gen_helper_sve2_fcvtnt_ds); +} From patchwork Tue May 25 01:03:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277585 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.9 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, UNWANTED_LANGUAGE_BODY,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A3CD1C2B9F7 for ; Tue, 25 May 2021 02:11:22 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 217496141D for ; Tue, 25 May 2021 02:11:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 217496141D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:42872 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMXR-0004Qf-5X for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:11:21 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55758) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYZ-0001D8-Em for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:27 -0400 Received: from mail-pj1-x102c.google.com ([2607:f8b0:4864:20::102c]:36453) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXv-00049d-JO for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:27 -0400 Received: by mail-pj1-x102c.google.com with SMTP id n6-20020a17090ac686b029015d2f7aeea8so12277398pjt.1 for ; Mon, 24 May 2021 18:07:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=GxVaQvqVt0mxh8hFVoab4E5awx6N3yMBWMsb4PbVN5Y=; b=vwAz0CY+LPOxSmcK1NwAdG3VEm4QQZ+j0rrTlw0KGmaLimFQd51hHHciOJy0QOCs89 xroLLmrcNtM2cWvEg/baWwIdtyEc8Xp2L9L4DQDxIwvYqHYuTWTtBqnSqtApX3fzHyXr XinS7ZHpBQLqLDChkVMx7jgNJSi/HhLivLTL0AoJDF1oGTIobhidqnjVCaP8MembnfeO QfiBceVGkWhmVGSFChxVEz5wzUQq8L2z+bhUTd/CWiGgLOW51bCq8UvE0QvvHCWxh1if t9bIPnTrlzn15lL/L7cVblUwVBQrQyEeFboo/kyv7OFef3BaKq7SQfYbGNnmvzzpFjHr OYWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=GxVaQvqVt0mxh8hFVoab4E5awx6N3yMBWMsb4PbVN5Y=; b=TnMSjfVShuHIGxEDhm+63SburQ/F9xpNhR7xi3eQIZcMGP1FkMB90yn5G68RMhxTm8 n0JjKMT9XupqUkr/F1HEE3cZrrvNV8cVcAhyOVId4+CZtfj3g4Je1Yu+S+Wvsj/XS3At H2CO16b/D9r6nKipoE4/foaZ7ChikFKfBITbUwvCu+RBSejLFX9Z/8MAegOuES/ClO7Y BHGZl/0EUTVGHjXVgTnn/fDlyf2vC7YjlZix8NZFSWzJS6iMfOBuX4kbkDRSnWinPFZj qVL591CT78f1hpIlDxNvq6E/YEbzzfz9EDYutC/k2r2qqRyzoJtHfQuzGf9qETbEQies uteA== X-Gm-Message-State: AOAM5309inxetcH51TgA59iyY5SXY8/fkieAloZoiFz/NDTpu6kR7d2h AHhJFVvze+k8ef4+vxh7QgPpUq3qoeZExg== X-Google-Smtp-Source: ABdhPJxu+s2+n1iXeaPp64vqGp4XWutUJGWHCkXDtAgg7ltm4I2l61mtI5GNGCgx5RBCZUxuHP4icA== X-Received: by 2002:a17:90a:a2b:: with SMTP id o40mr2026091pjo.214.1621904865622; Mon, 24 May 2021 18:07:45 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:45 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 75/92] target/arm: Implement SVE2 FLOGB Date: Mon, 24 May 2021 18:03:41 -0700 Message-Id: <20210525010358.152808-76-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102c; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Message-Id: <20200430191405.21641-1-steplong@quicinc.com> Signed-off-by: Richard Henderson --- v2: Fixed esz index and c++ comments v3: Fixed denormal arithmetic and raise invalid. v7: Rewrite; handle denormal exceptions and flush to zero. --- target/arm/helper-sve.h | 4 ++ target/arm/sve.decode | 3 ++ target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 24 +++++++++++ 4 files changed, 119 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index be4b17f1c2..342bb83721 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2754,3 +2754,7 @@ DEF_HELPER_FLAGS_5(sve2_fcvtlt_hs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_fcvtlt_sd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(flogb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(flogb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(flogb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5dcc79759e..5a1cceccb6 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1588,3 +1588,6 @@ FCVTNT_sh 01100100 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0 FCVTLT_hs 01100100 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0 FCVTNT_ds 01100100 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 FCVTLT_sd 01100100 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 + +### SVE2 floating-point convert to integer +FLOGB 01100101 00 011 esz:2 0101 pg:3 rn:5 rd:5 &rpr_esz diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 8882393515..a051854984 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -4729,6 +4729,94 @@ DO_ZPZ_FP(sve_ucvt_dh, uint64_t, , uint64_to_float16) DO_ZPZ_FP(sve_ucvt_ds, uint64_t, , uint64_to_float32) DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64) +static int16_t do_float16_logb_as_int(float16 a, float_status *s) +{ + /* Extract frac to the top of the uint32_t. */ + uint32_t frac = (uint32_t)a << (16 + 6); + int16_t exp = extract32(a, 10, 5); + + if (unlikely(exp == 0)) { + if (frac != 0) { + if (!get_flush_inputs_to_zero(s)) { + /* denormal: bias - fractional_zeros */ + return -15 - clz32(frac); + } + /* flush to zero */ + float_raise(float_flag_input_denormal, s); + } + } else if (unlikely(exp == 0x1f)) { + if (frac == 0) { + return INT16_MAX; /* infinity */ + } + } else { + /* normal: exp - bias */ + return exp - 15; + } + /* nan or zero */ + float_raise(float_flag_invalid, s); + return INT16_MIN; +} + +static int32_t do_float32_logb_as_int(float32 a, float_status *s) +{ + /* Extract frac to the top of the uint32_t. */ + uint32_t frac = a << 9; + int32_t exp = extract32(a, 23, 8); + + if (unlikely(exp == 0)) { + if (frac != 0) { + if (!get_flush_inputs_to_zero(s)) { + /* denormal: bias - fractional_zeros */ + return -127 - clz32(frac); + } + /* flush to zero */ + float_raise(float_flag_input_denormal, s); + } + } else if (unlikely(exp == 0xff)) { + if (frac == 0) { + return INT32_MAX; /* infinity */ + } + } else { + /* normal: exp - bias */ + return exp - 127; + } + /* nan or zero */ + float_raise(float_flag_invalid, s); + return INT32_MIN; +} + +static int64_t do_float64_logb_as_int(float64 a, float_status *s) +{ + /* Extract frac to the top of the uint64_t. */ + uint64_t frac = a << 12; + int64_t exp = extract64(a, 52, 11); + + if (unlikely(exp == 0)) { + if (frac != 0) { + if (!get_flush_inputs_to_zero(s)) { + /* denormal: bias - fractional_zeros */ + return -1023 - clz64(frac); + } + /* flush to zero */ + float_raise(float_flag_input_denormal, s); + } + } else if (unlikely(exp == 0x7ff)) { + if (frac == 0) { + return INT64_MAX; /* infinity */ + } + } else { + /* normal: exp - bias */ + return exp - 1023; + } + /* nan or zero */ + float_raise(float_flag_invalid, s); + return INT64_MIN; +} + +DO_ZPZ_FP(flogb_h, float16, H1_2, do_float16_logb_as_int) +DO_ZPZ_FP(flogb_s, float32, H1_4, do_float32_logb_as_int) +DO_ZPZ_FP(flogb_d, float64, , do_float64_logb_as_int) + #undef DO_ZPZ_FP static void do_fmla_zpzzz_h(void *vd, void *vn, void *vm, void *va, void *vg, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0a2718c481..3ea51a73d3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8307,3 +8307,27 @@ static bool trans_FCVTXNT_ds(DisasContext *s, arg_rpr_esz *a) } return do_frint_mode(s, a, float_round_to_odd, gen_helper_sve2_fcvtnt_ds); } + +static bool trans_FLOGB(DisasContext *s, arg_rpr_esz *a) +{ + static gen_helper_gvec_3_ptr * const fns[] = { + NULL, gen_helper_flogb_h, + gen_helper_flogb_s, gen_helper_flogb_d + }; + + if (!dc_isar_feature(aa64_sve2, s) || fns[a->esz] == NULL) { + return false; + } + if (sve_access_check(s)) { + TCGv_ptr status = + fpstatus_ptr(a->esz == MO_16 ? FPST_FPCR_F16 : FPST_FPCR); + unsigned vsz = vec_full_reg_size(s); + + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fns[a->esz]); + tcg_temp_free_ptr(status); + } + return true; +} From patchwork Tue May 25 01:03:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277539 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1C8E5C2B9F7 for ; Tue, 25 May 2021 01:56:38 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 70427610CE for ; Tue, 25 May 2021 01:56:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 70427610CE Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:49344 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMJA-0004k2-Hb for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 21:56:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55794) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYd-0001H9-95 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:32 -0400 Received: from mail-pl1-x635.google.com ([2607:f8b0:4864:20::635]:45861) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXv-0004AS-Pu for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:30 -0400 Received: by mail-pl1-x635.google.com with SMTP id s4so13951620plg.12 for ; Mon, 24 May 2021 18:07:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5FmN3ljyVtwKizj7nbiiY/9sTfclXpJZqQxvrlA2g1E=; b=nZKK77L2IkMofhLcj9FoT/KCWJW6GW83HrwvjTBFlRXGVWJ3fwT4JV3uMmxaQCrkPl giSOvMnplUYleJJXGxVHoBx2jfhwUw1om6SyFeGW69EDYuIoVzwGp2Hifw/8mGSkvGkE r9986kpzSFaY7FO4DOPipxhf4lKdwi54cZXMQGCGkJMdTT/jFTCPXUJBrogajdr7hiEw ujbUCSPo/xF/Va53VwHsX9IL4nJ3qn299p2fumVHsE5pVeBpMZB99NK5os956Ced4U2i REzTpal0fVVWrZDFWfBxxC/A/8MXzQ4KUAlxPwPzPuMqiQQ+XGj1b/kUB/G0ZpPtv+GN Bq/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5FmN3ljyVtwKizj7nbiiY/9sTfclXpJZqQxvrlA2g1E=; b=FMu+1agGk5+plovE92iBOzy9C+Jr30thy09cGb7/YlkTnWPJT3I0NCTZveivA5HfRl I5CiRIgx6PSRitg6YTnO5ZBEwow8xaNQlATAFMoZXAi1BnUkIjHQr2HECverUQ15Pa5y 7p4GJdKvEE79p+lau9PrEJzy71Aq9DHjkekikZBU+sfK9/rvgyfCF1RA0K/IDm7rMxGb lVA/y3f7RWAgHDO4plBxIxcUSIuiwRbt8MTwBWQJbcchzqpThgeTMCVRerp1K8Vi7NXM 9717bildiX81Jp2UbaWaVobRR/Wfpl1eafwEdaLGWztgR2lHeOw//A53f/vR6y9Y5jEk q39A== X-Gm-Message-State: AOAM530nA9254OiIxbT/nWfP2yVPpc3pcdWBNpM3BoAYcfKfI/0KlOa+ qK+dUjvwvjED1sOglypmr4dSylv7/FsQRQ== X-Google-Smtp-Source: ABdhPJwb4jCqh4IrjuX1sdynqSR+knpQA3CUYNYtypi89PGVTWHEhxcJAWeiRPF/jCKvKleE7IVzOw== X-Received: by 2002:a17:902:c407:b029:f8:c1e2:e0bf with SMTP id k7-20020a170902c407b02900f8c1e2e0bfmr12541074plk.69.1621904866245; Mon, 24 May 2021 18:07:46 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:46 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 76/92] target/arm: Share table of sve load functions Date: Mon, 24 May 2021 18:03:42 -0700 Message-Id: <20210525010358.152808-77-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::635; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x635.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" The table used by do_ldrq is a subset of the table used by do_ld_zpa; we can share them by passing dtype instead of msz to do_ldrq. The lack of MTE handling in do_ldrq was a bug, fixed by this change. Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 254 ++++++++++++++++++------------------- 1 file changed, 126 insertions(+), 128 deletions(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3ea51a73d3..54c50349ab 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5215,128 +5215,130 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, tcg_temp_free_i32(t_desc); } +/* Indexed by [mte][be][dtype][nreg] */ +static gen_helper_gvec_mem * const ldr_fns[2][2][16][4] = { + { /* mte inactive, little-endian */ + { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r, + gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r }, + { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1sds_le_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hh_le_r, gen_helper_sve_ld2hh_le_r, + gen_helper_sve_ld3hh_le_r, gen_helper_sve_ld4hh_le_r }, + { gen_helper_sve_ld1hsu_le_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hdu_le_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1hds_le_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hss_le_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1ss_le_r, gen_helper_sve_ld2ss_le_r, + gen_helper_sve_ld3ss_le_r, gen_helper_sve_ld4ss_le_r }, + { gen_helper_sve_ld1sdu_le_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1bds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bss_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1dd_le_r, gen_helper_sve_ld2dd_le_r, + gen_helper_sve_ld3dd_le_r, gen_helper_sve_ld4dd_le_r } }, + + /* mte inactive, big-endian */ + { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r, + gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r }, + { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1sds_be_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hh_be_r, gen_helper_sve_ld2hh_be_r, + gen_helper_sve_ld3hh_be_r, gen_helper_sve_ld4hh_be_r }, + { gen_helper_sve_ld1hsu_be_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hdu_be_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1hds_be_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hss_be_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1ss_be_r, gen_helper_sve_ld2ss_be_r, + gen_helper_sve_ld3ss_be_r, gen_helper_sve_ld4ss_be_r }, + { gen_helper_sve_ld1sdu_be_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1bds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bss_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1dd_be_r, gen_helper_sve_ld2dd_be_r, + gen_helper_sve_ld3dd_be_r, gen_helper_sve_ld4dd_be_r } } }, + + { /* mte active, little-endian */ + { { gen_helper_sve_ld1bb_r_mte, + gen_helper_sve_ld2bb_r_mte, + gen_helper_sve_ld3bb_r_mte, + gen_helper_sve_ld4bb_r_mte }, + { gen_helper_sve_ld1bhu_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bsu_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bdu_r_mte, NULL, NULL, NULL }, + + { gen_helper_sve_ld1sds_le_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1hh_le_r_mte, + gen_helper_sve_ld2hh_le_r_mte, + gen_helper_sve_ld3hh_le_r_mte, + gen_helper_sve_ld4hh_le_r_mte }, + { gen_helper_sve_ld1hsu_le_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1hdu_le_r_mte, NULL, NULL, NULL }, + + { gen_helper_sve_ld1hds_le_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1hss_le_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1ss_le_r_mte, + gen_helper_sve_ld2ss_le_r_mte, + gen_helper_sve_ld3ss_le_r_mte, + gen_helper_sve_ld4ss_le_r_mte }, + { gen_helper_sve_ld1sdu_le_r_mte, NULL, NULL, NULL }, + + { gen_helper_sve_ld1bds_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bss_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bhs_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1dd_le_r_mte, + gen_helper_sve_ld2dd_le_r_mte, + gen_helper_sve_ld3dd_le_r_mte, + gen_helper_sve_ld4dd_le_r_mte } }, + + /* mte active, big-endian */ + { { gen_helper_sve_ld1bb_r_mte, + gen_helper_sve_ld2bb_r_mte, + gen_helper_sve_ld3bb_r_mte, + gen_helper_sve_ld4bb_r_mte }, + { gen_helper_sve_ld1bhu_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bsu_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bdu_r_mte, NULL, NULL, NULL }, + + { gen_helper_sve_ld1sds_be_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1hh_be_r_mte, + gen_helper_sve_ld2hh_be_r_mte, + gen_helper_sve_ld3hh_be_r_mte, + gen_helper_sve_ld4hh_be_r_mte }, + { gen_helper_sve_ld1hsu_be_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1hdu_be_r_mte, NULL, NULL, NULL }, + + { gen_helper_sve_ld1hds_be_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1hss_be_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1ss_be_r_mte, + gen_helper_sve_ld2ss_be_r_mte, + gen_helper_sve_ld3ss_be_r_mte, + gen_helper_sve_ld4ss_be_r_mte }, + { gen_helper_sve_ld1sdu_be_r_mte, NULL, NULL, NULL }, + + { gen_helper_sve_ld1bds_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bss_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bhs_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1dd_be_r_mte, + gen_helper_sve_ld2dd_be_r_mte, + gen_helper_sve_ld3dd_be_r_mte, + gen_helper_sve_ld4dd_be_r_mte } } }, +}; + static void do_ld_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype, int nreg) { - static gen_helper_gvec_mem * const fns[2][2][16][4] = { - { /* mte inactive, little-endian */ - { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r, - gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r }, - { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL }, - - { gen_helper_sve_ld1sds_le_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1hh_le_r, gen_helper_sve_ld2hh_le_r, - gen_helper_sve_ld3hh_le_r, gen_helper_sve_ld4hh_le_r }, - { gen_helper_sve_ld1hsu_le_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1hdu_le_r, NULL, NULL, NULL }, - - { gen_helper_sve_ld1hds_le_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1hss_le_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1ss_le_r, gen_helper_sve_ld2ss_le_r, - gen_helper_sve_ld3ss_le_r, gen_helper_sve_ld4ss_le_r }, - { gen_helper_sve_ld1sdu_le_r, NULL, NULL, NULL }, - - { gen_helper_sve_ld1bds_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bss_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1dd_le_r, gen_helper_sve_ld2dd_le_r, - gen_helper_sve_ld3dd_le_r, gen_helper_sve_ld4dd_le_r } }, - - /* mte inactive, big-endian */ - { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r, - gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r }, - { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL }, - - { gen_helper_sve_ld1sds_be_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1hh_be_r, gen_helper_sve_ld2hh_be_r, - gen_helper_sve_ld3hh_be_r, gen_helper_sve_ld4hh_be_r }, - { gen_helper_sve_ld1hsu_be_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1hdu_be_r, NULL, NULL, NULL }, - - { gen_helper_sve_ld1hds_be_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1hss_be_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1ss_be_r, gen_helper_sve_ld2ss_be_r, - gen_helper_sve_ld3ss_be_r, gen_helper_sve_ld4ss_be_r }, - { gen_helper_sve_ld1sdu_be_r, NULL, NULL, NULL }, - - { gen_helper_sve_ld1bds_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bss_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1dd_be_r, gen_helper_sve_ld2dd_be_r, - gen_helper_sve_ld3dd_be_r, gen_helper_sve_ld4dd_be_r } } }, - - { /* mte active, little-endian */ - { { gen_helper_sve_ld1bb_r_mte, - gen_helper_sve_ld2bb_r_mte, - gen_helper_sve_ld3bb_r_mte, - gen_helper_sve_ld4bb_r_mte }, - { gen_helper_sve_ld1bhu_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bsu_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bdu_r_mte, NULL, NULL, NULL }, - - { gen_helper_sve_ld1sds_le_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1hh_le_r_mte, - gen_helper_sve_ld2hh_le_r_mte, - gen_helper_sve_ld3hh_le_r_mte, - gen_helper_sve_ld4hh_le_r_mte }, - { gen_helper_sve_ld1hsu_le_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1hdu_le_r_mte, NULL, NULL, NULL }, - - { gen_helper_sve_ld1hds_le_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1hss_le_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1ss_le_r_mte, - gen_helper_sve_ld2ss_le_r_mte, - gen_helper_sve_ld3ss_le_r_mte, - gen_helper_sve_ld4ss_le_r_mte }, - { gen_helper_sve_ld1sdu_le_r_mte, NULL, NULL, NULL }, - - { gen_helper_sve_ld1bds_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bss_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bhs_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1dd_le_r_mte, - gen_helper_sve_ld2dd_le_r_mte, - gen_helper_sve_ld3dd_le_r_mte, - gen_helper_sve_ld4dd_le_r_mte } }, - - /* mte active, big-endian */ - { { gen_helper_sve_ld1bb_r_mte, - gen_helper_sve_ld2bb_r_mte, - gen_helper_sve_ld3bb_r_mte, - gen_helper_sve_ld4bb_r_mte }, - { gen_helper_sve_ld1bhu_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bsu_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bdu_r_mte, NULL, NULL, NULL }, - - { gen_helper_sve_ld1sds_be_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1hh_be_r_mte, - gen_helper_sve_ld2hh_be_r_mte, - gen_helper_sve_ld3hh_be_r_mte, - gen_helper_sve_ld4hh_be_r_mte }, - { gen_helper_sve_ld1hsu_be_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1hdu_be_r_mte, NULL, NULL, NULL }, - - { gen_helper_sve_ld1hds_be_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1hss_be_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1ss_be_r_mte, - gen_helper_sve_ld2ss_be_r_mte, - gen_helper_sve_ld3ss_be_r_mte, - gen_helper_sve_ld4ss_be_r_mte }, - { gen_helper_sve_ld1sdu_be_r_mte, NULL, NULL, NULL }, - - { gen_helper_sve_ld1bds_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bss_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bhs_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1dd_be_r_mte, - gen_helper_sve_ld2dd_be_r_mte, - gen_helper_sve_ld3dd_be_r_mte, - gen_helper_sve_ld4dd_be_r_mte } } }, - }; gen_helper_gvec_mem *fn - = fns[s->mte_active[0]][s->be_data == MO_BE][dtype][nreg]; + = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][nreg]; /* * While there are holes in the table, they are not @@ -5574,14 +5576,8 @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a) return true; } -static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz) +static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype) { - static gen_helper_gvec_mem * const fns[2][4] = { - { gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_le_r, - gen_helper_sve_ld1ss_le_r, gen_helper_sve_ld1dd_le_r }, - { gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_be_r, - gen_helper_sve_ld1ss_be_r, gen_helper_sve_ld1dd_be_r }, - }; unsigned vsz = vec_full_reg_size(s); TCGv_ptr t_pg; TCGv_i32 t_desc; @@ -5613,7 +5609,9 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz) t_pg = tcg_temp_new_ptr(); tcg_gen_addi_ptr(t_pg, cpu_env, poff); - fns[s->be_data == MO_BE][msz](cpu_env, t_pg, addr, t_desc); + gen_helper_gvec_mem *fn + = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0]; + fn(cpu_env, t_pg, addr, t_desc); tcg_temp_free_ptr(t_pg); tcg_temp_free_i32(t_desc); @@ -5635,7 +5633,7 @@ static bool trans_LD1RQ_zprr(DisasContext *s, arg_rprr_load *a) TCGv_i64 addr = new_tmp_a64(s); tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), msz); tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); - do_ldrq(s, a->rd, a->pg, addr, msz); + do_ldrq(s, a->rd, a->pg, addr, a->dtype); } return true; } @@ -5645,7 +5643,7 @@ static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a) if (sve_access_check(s)) { TCGv_i64 addr = new_tmp_a64(s); tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 16); - do_ldrq(s, a->rd, a->pg, addr, dtype_msz(a->dtype)); + do_ldrq(s, a->rd, a->pg, addr, a->dtype); } return true; } From patchwork Tue May 25 01:03:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277549 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A8A76C2B9F7 for ; Tue, 25 May 2021 02:00:31 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3996A6113B for ; Tue, 25 May 2021 02:00:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3996A6113B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60814 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMMw-0003te-6U for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:00:30 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55838) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYh-0001JM-8Y for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:35 -0400 Received: from mail-pj1-x102c.google.com ([2607:f8b0:4864:20::102c]:51800) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXw-0004AY-25 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:34 -0400 Received: by mail-pj1-x102c.google.com with SMTP id k5so15869469pjj.1 for ; Mon, 24 May 2021 18:07:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Z9DRAdUOD4ldpl5ZERKtO3bYlQxPDPcXj1lLMIABjys=; b=iL79QFz2aZPbiiuLRXrYcr0su/MhbNuUx0gGpQsOfSUdTqpk6DiaEUniSgwDjKFK37 JHsMYrROAGjJW12c+HTIkbuECOICID5UL1ljvQCpcEhHYzWDvutkmwbOKt7Yk6Eel0ei nIqgCKi9VlEgYriRFeuVWkYEVRog6VWaEU+1s5ePUAMsxxGtlGPx8wIqBO8E7flAGAOo DObi5mpCMiA6P5uKWMys+RkRm5NAqZsZ4Q22zRmrqsZNsPG6mzpEtCcMZzKdNkTS7Aab 4JOv0hzjM8dp6r1TUwDlkfPZJ3X0CjdGfC8uEcxrKdgLe13Sa7Pu0Sl4zeYn48HnBJJY 9UhQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Z9DRAdUOD4ldpl5ZERKtO3bYlQxPDPcXj1lLMIABjys=; b=dZQvsLoWtrsVLDYPdM/KwTEI2gnWmKv7ygbFxalFFjYNiE4JaO+o5BrxTfYVuuLQxK HB+TulrICc/S6QsoAKrywBelsfou3WaJ7o9bc0nB3fw71y1dS+782LrV2gRf2mEPMobz xrabd2bhJcrig8TRYOHvxXo4gE1qWZ2IAaleIjSBBOkVYXf/46anSwsDk4F4RDUYupl5 TXLgE5O51HdEbKNroLTDd+OVac2lWvnb2yIZ40viR1wCuCJq2/LydKkx3LYJbfzpUSRT UfvFdnw0Zq0BFyPOn2n987G+Nti2KGox9IK7GHKLW12obkdZg7RhadKa4uYzEeT8x0Ct /DIA== X-Gm-Message-State: AOAM5327LYQJ8FstW7nvmPQIlSEpu2seiCBswNYZJ5bCVHhI+dIg8VIF 3df/hMYINcz5v/l8dbITpDnoHj22Kt8JPA== X-Google-Smtp-Source: ABdhPJwl1IJDqUfI5tJoBuJSOAIJv8olH6K43bH1otok226K9rqIeAT3jRM0B2qp1HQ1RM5Gn2tWRg== X-Received: by 2002:a17:90a:e7c6:: with SMTP id kb6mr27407785pjb.81.1621904866788; Mon, 24 May 2021 18:07:46 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:46 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 77/92] target/arm: Tidy do_ldrq Date: Mon, 24 May 2021 18:03:43 -0700 Message-Id: <20210525010358.152808-78-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102c; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Use tcg_constant_i32 for passing the simd descriptor, as this hashed value does not need to be freed. Rename dofs to doff to match poff. Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 54c50349ab..a213450583 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5580,13 +5580,9 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype) { unsigned vsz = vec_full_reg_size(s); TCGv_ptr t_pg; - TCGv_i32 t_desc; - int desc, poff; + int poff; /* Load the first quadword using the normal predicated load helpers. */ - desc = simd_desc(16, 16, zt); - t_desc = tcg_const_i32(desc); - poff = pred_full_reg_offset(s, pg); if (vsz > 16) { /* @@ -5611,15 +5607,14 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype) gen_helper_gvec_mem *fn = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0]; - fn(cpu_env, t_pg, addr, t_desc); + fn(cpu_env, t_pg, addr, tcg_constant_i32(simd_desc(16, 16, zt))); tcg_temp_free_ptr(t_pg); - tcg_temp_free_i32(t_desc); /* Replicate that first quadword. */ if (vsz > 16) { - unsigned dofs = vec_full_reg_offset(s, zt); - tcg_gen_gvec_dup_mem(4, dofs + 16, dofs, vsz - 16, vsz - 16); + int doff = vec_full_reg_offset(s, zt); + tcg_gen_gvec_dup_mem(4, doff + 16, doff, vsz - 16, vsz - 16); } } From patchwork Tue May 25 01:03:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277555 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C5ABDC2B9F8 for ; Tue, 25 May 2021 02:01:34 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2F9CA6113B for ; Tue, 25 May 2021 02:01:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2F9CA6113B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:35160 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMNx-0005bE-8f for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:01:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55858) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYi-0001PI-JJ for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:36 -0400 Received: from mail-pl1-x62f.google.com ([2607:f8b0:4864:20::62f]:39923) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXx-0004Aw-2q for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:36 -0400 Received: by mail-pl1-x62f.google.com with SMTP id t9so7067083ply.6 for ; Mon, 24 May 2021 18:07:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=MRoEXb+hFWdGI7cAvPOu5pferN66dwnHaCJBRFHxmfQ=; b=ZV/mlyKxFNjR8eNHKEzeYJ/cJAALbFsRDP3IS8GLYWjyUXZTVfh6Ltd9k5tMZ/nbrL pBb9wZ2M2B7hvLP6pTtCYuEv/pl57D+V+Yxt2kqoQYbuJcrFl+bwd/QP0GrcqdZpVQ+/ XjRvu1zk9h/o8omAh8Tr94DZBXg8NCK2u3qHPh+aBrGd8dg5ekwLn2Z4q1JFbccP8PeH zgRQ59LpDP5Ng22tu7NUgP3+tzjhXZQ+WOhiyJ2Vv0sJ3+nlnLNUCu69YoWoIlSRQxT0 FSc8hDVLz+e+1wfxuYSDFPtK7lXKmnjXNuUZIRDCdwRSEmn9M7ZzZICQjSOel6kuLdya tLjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=MRoEXb+hFWdGI7cAvPOu5pferN66dwnHaCJBRFHxmfQ=; b=dAfWi3jL1v2eDqL8yTZRskDS9fPLSVKz+JcR90b9fUj1cihF4PbHiec8Iq77EgNc5R EHZm6cpo5yqBlWw9Oxp2cbyi8KZyK5j736ybCZNfR13wOLSBUNA4aO98uLfhzEStf2FM lfNcrOnN2Mi99BuwG/qM52ABZwl2i6q2PYiw1LAA5OaWcGvfaR32voDa0ml9KQ36Q9Cc 51A8dOCqYV5ikoOztfdvnhc82l/GiJXHGc1EJ6l9qXe6s/BKmweMD5mnCQ2hbIMADw+F JZtkGogIcX7RfAIE8eKLnG1yR5GWew7Inkk4KLl1zYhI3qGsSrg3tqLFOrodoNPsj7ci +zcQ== X-Gm-Message-State: AOAM530J7Uou5Wz0IrPJIJ/KvJkgRqpM5O/K9jYPtxC2cQrsKl9oKr/A 1qmUu+iyzRuLX3uzw7blzEv19fk9Erce/Q== X-Google-Smtp-Source: ABdhPJwQDPwKT5S0a8lZQh8JjW5HGAMjiqm7+qJZ1k114pgluSG+CkMt8GNf/RZ8v5Ho8mBVrJBTmA== X-Received: by 2002:a17:90a:a013:: with SMTP id q19mr1988242pjp.29.1621904867453; Mon, 24 May 2021 18:07:47 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:47 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 78/92] target/arm: Implement SVE2 LD1RO Date: Mon, 24 May 2021 18:03:44 -0700 Message-Id: <20210525010358.152808-79-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62f; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- v7: Fix replication and tail clearing vs e2e7168a214. --- target/arm/sve.decode | 4 ++ target/arm/translate-sve.c | 93 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 97 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5a1cceccb6..884c5358eb 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1126,11 +1126,15 @@ LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz # SVE load and broadcast quadword (scalar plus scalar) LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \ @rprr_load_msz nreg=0 +LD1RO_zprr 1010010 .. 01 ..... 000 ... ..... ..... \ + @rprr_load_msz nreg=0 # SVE load and broadcast quadword (scalar plus immediate) # LD1RQB, LD1RQH, LD1RQS, LD1RQD LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \ @rpri_load_msz nreg=0 +LD1RO_zpri 1010010 .. 01 0.... 001 ... ..... ..... \ + @rpri_load_msz nreg=0 # SVE 32-bit gather prefetch (scalar plus 32-bit scaled offsets) PRF 1000010 00 -1 ----- 0-- --- ----- 0 ---- diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a213450583..1dcdbac0af 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5643,6 +5643,99 @@ static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a) return true; } +static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype) +{ + unsigned vsz = vec_full_reg_size(s); + unsigned vsz_r32; + TCGv_ptr t_pg; + int poff, doff; + + if (vsz < 32) { + /* + * Note that this UNDEFINED check comes after CheckSVEEnabled() + * in the ARM pseudocode, which is the sve_access_check() done + * in our caller. We should not now return false from the caller. + */ + unallocated_encoding(s); + return; + } + + /* Load the first octaword using the normal predicated load helpers. */ + + poff = pred_full_reg_offset(s, pg); + if (vsz > 32) { + /* + * Zero-extend the first 32 bits of the predicate into a temporary. + * This avoids triggering an assert making sure we don't have bits + * set within a predicate beyond VQ, but we have lowered VQ to 2 + * for this load operation. + */ + TCGv_i64 tmp = tcg_temp_new_i64(); +#ifdef HOST_WORDS_BIGENDIAN + poff += 4; +#endif + tcg_gen_ld32u_i64(tmp, cpu_env, poff); + + poff = offsetof(CPUARMState, vfp.preg_tmp); + tcg_gen_st_i64(tmp, cpu_env, poff); + tcg_temp_free_i64(tmp); + } + + t_pg = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(t_pg, cpu_env, poff); + + gen_helper_gvec_mem *fn + = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0]; + fn(cpu_env, t_pg, addr, tcg_constant_i32(simd_desc(32, 32, zt))); + + tcg_temp_free_ptr(t_pg); + + /* + * Replicate that first octaword. + * The replication happens in units of 32; if the full vector size + * is not a multiple of 32, the final bits are zeroed. + */ + doff = vec_full_reg_offset(s, zt); + vsz_r32 = QEMU_ALIGN_DOWN(vsz, 32); + if (vsz >= 64) { + tcg_gen_gvec_dup_mem(5, doff + 32, doff, vsz_r32 - 32, vsz_r32 - 32); + } + vsz -= vsz_r32; + if (vsz) { + tcg_gen_gvec_dup_imm(MO_64, doff + vsz_r32, vsz, vsz, 0); + } +} + +static bool trans_LD1RO_zprr(DisasContext *s, arg_rprr_load *a) +{ + if (!dc_isar_feature(aa64_sve_f64mm, s)) { + return false; + } + if (a->rm == 31) { + return false; + } + if (sve_access_check(s)) { + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype)); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_ldro(s, a->rd, a->pg, addr, a->dtype); + } + return true; +} + +static bool trans_LD1RO_zpri(DisasContext *s, arg_rpri_load *a) +{ + if (!dc_isar_feature(aa64_sve_f64mm, s)) { + return false; + } + if (sve_access_check(s)) { + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 32); + do_ldro(s, a->rd, a->pg, addr, a->dtype); + } + return true; +} + /* Load and broadcast element. */ static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a) { From patchwork Tue May 25 01:03:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277557 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6A884C2B9F7 for ; Tue, 25 May 2021 02:03:26 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DEF1961209 for ; Tue, 25 May 2021 02:03:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DEF1961209 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:41230 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMPl-0001JH-0O for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:03:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:55846) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLYh-0001L9-N2 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:35 -0400 Received: from mail-pl1-x634.google.com ([2607:f8b0:4864:20::634]:34552) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLXx-0004Bo-Dy for qemu-devel@nongnu.org; Mon, 24 May 2021 21:08:35 -0400 Received: by mail-pl1-x634.google.com with SMTP id e15so8884489plh.1 for ; Mon, 24 May 2021 18:07:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=KozNqDqPcU6QFCditQjneab5f5CD8ouOg2dylxPkSf4=; b=mt3y12RUhpEig/B/zMvKMbIRDawgnWZZLbNhkXCHdo1cOg6qT/xjp0KC+rwkWki3P5 Bla6EO+Hv/5GV0X4jj6tlHhsCRFucWYDqpUBgTpSUUAl+Luc8fsz+r+886Cl+lliUwPp PIzWNFEZVSPlUpgOt/wwVMNptOkjcYB8wfofAUyozL9zFe6nMQPqBQdZtFqAYaxMHBdG nFB/ewYqR0LKQ68e4i4ppNwYfuitVW4dUG7pHyimY6o5Zs0yPkB0wbZ7eA3mQQORo2iS QgoFz6WqPIkYIXcVmyrvDTYnIaKO9UPf+lyo2xphGSAdqOjaRuGYdcxfHDzKmZ+apbVL BExg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KozNqDqPcU6QFCditQjneab5f5CD8ouOg2dylxPkSf4=; b=dDwTWDL1HMG+zKUVhkaJ/BUECf/7G9CwPY40XlgJiSiN1P5XVZFazzQ57dkC8a71bg MmFAigaappoaZsTdRKPK9Wdh8f5nL+c3g/5hOGNzi+sX/63OFPb2khV4uP5/aRM6n3Lm VdDsFlO3T7XUPhljdOLDU4jPhd3n818dthORZJx6a+izOpXLUb0if+9vDpWrAWc0ATJN iuW4OqXKfGVbUOBAefoVtdBSgoyG8JXyTqNK/KfEukn4gWDOgEOCP+6RvR45hHKV0knb nrIDobxQTrtThPy7kuKvgneTRyyEXxExsO0TOVlOtde9cvX2bH1UoE3QOZSg/v4tPPHV JiyQ== X-Gm-Message-State: AOAM533E31Mo+R0rCI19a8D2QKNgE3E+ff6IbaLn3gRWwmdq4L/TdZ5g gDJWphNQTHqNVW4pdy3GyIpaduV2oHi77w== X-Google-Smtp-Source: ABdhPJzDZqIKAr+AkDTZTgrURGR2JnpKOGGwC97VwKuU6oxax36WD5qvv++uDKI3yYVUZSXBQsgDvA== X-Received: by 2002:a17:90a:ab0c:: with SMTP id m12mr27668536pjq.179.1621904868062; Mon, 24 May 2021 18:07:48 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id b16sm11748176pju.35.2021.05.24.18.07.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:07:47 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 79/92] target/arm: Implement 128-bit ZIP, UZP, TRN Date: Mon, 24 May 2021 18:03:45 -0700 Message-Id: <20210525010358.152808-80-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::634; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x634.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 3 ++ target/arm/sve.decode | 8 ++++++ target/arm/sve_helper.c | 29 +++++++++++++------ target/arm/translate-sve.c | 58 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 90 insertions(+), 8 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 342bb83721..b43ffce23a 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -689,16 +689,19 @@ DEF_HELPER_FLAGS_4(sve_zip_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_zip_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_zip_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_zip_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_zip_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_uzp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_uzp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_uzp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_uzp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uzp_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_trn_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_trn_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_compact_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 884c5358eb..5469ce0414 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -590,6 +590,14 @@ UZP2_z 00000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm TRN1_z 00000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm +# SVE2 permute vector segments +ZIP1_q 00000101 10 1 ..... 000 000 ..... ..... @rd_rn_rm_e0 +ZIP2_q 00000101 10 1 ..... 000 001 ..... ..... @rd_rn_rm_e0 +UZP1_q 00000101 10 1 ..... 000 010 ..... ..... @rd_rn_rm_e0 +UZP2_q 00000101 10 1 ..... 000 011 ..... ..... @rd_rn_rm_e0 +TRN1_q 00000101 10 1 ..... 000 110 ..... ..... @rd_rn_rm_e0 +TRN2_q 00000101 10 1 ..... 000 111 ..... ..... @rd_rn_rm_e0 + ### SVE Permute - Predicated Group # SVE compress active elements diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a051854984..d088b1f74c 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3492,36 +3492,45 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ *(TYPE *)(vd + H(2 * i + 0)) = *(TYPE *)(vn + H(i)); \ *(TYPE *)(vd + H(2 * i + sizeof(TYPE))) = *(TYPE *)(vm + H(i)); \ } \ + if (sizeof(TYPE) == 16 && unlikely(oprsz & 16)) { \ + memset(vd + oprsz - 16, 0, 16); \ + } \ } DO_ZIP(sve_zip_b, uint8_t, H1) DO_ZIP(sve_zip_h, uint16_t, H1_2) DO_ZIP(sve_zip_s, uint32_t, H1_4) DO_ZIP(sve_zip_d, uint64_t, ) +DO_ZIP(sve2_zip_q, Int128, ) #define DO_UZP(NAME, TYPE, H) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ intptr_t oprsz = simd_oprsz(desc); \ - intptr_t oprsz_2 = oprsz / 2; \ intptr_t odd_ofs = simd_data(desc); \ - intptr_t i; \ + intptr_t i, p; \ ARMVectorReg tmp_m; \ if (unlikely((vm - vd) < (uintptr_t)oprsz)) { \ vm = memcpy(&tmp_m, vm, oprsz); \ } \ - for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \ - *(TYPE *)(vd + H(i)) = *(TYPE *)(vn + H(2 * i + odd_ofs)); \ - } \ - for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \ - *(TYPE *)(vd + H(oprsz_2 + i)) = *(TYPE *)(vm + H(2 * i + odd_ofs)); \ - } \ + i = 0, p = odd_ofs; \ + do { \ + *(TYPE *)(vd + H(i)) = *(TYPE *)(vn + H(p)); \ + i += sizeof(TYPE), p += 2 * sizeof(TYPE); \ + } while (p < oprsz); \ + p -= oprsz; \ + do { \ + *(TYPE *)(vd + H(i)) = *(TYPE *)(vm + H(p)); \ + i += sizeof(TYPE), p += 2 * sizeof(TYPE); \ + } while (p < oprsz); \ + tcg_debug_assert(i == oprsz); \ } DO_UZP(sve_uzp_b, uint8_t, H1) DO_UZP(sve_uzp_h, uint16_t, H1_2) DO_UZP(sve_uzp_s, uint32_t, H1_4) DO_UZP(sve_uzp_d, uint64_t, ) +DO_UZP(sve2_uzp_q, Int128, ) #define DO_TRN(NAME, TYPE, H) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ @@ -3535,12 +3544,16 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ *(TYPE *)(vd + H(i + 0)) = ae; \ *(TYPE *)(vd + H(i + sizeof(TYPE))) = be; \ } \ + if (sizeof(TYPE) == 16 && unlikely(oprsz & 16)) { \ + memset(vd + oprsz - 16, 0, 16); \ + } \ } DO_TRN(sve_trn_b, uint8_t, H1) DO_TRN(sve_trn_h, uint16_t, H1_2) DO_TRN(sve_trn_s, uint32_t, H1_4) DO_TRN(sve_trn_d, uint64_t, ) +DO_TRN(sve2_trn_q, Int128, ) #undef DO_ZIP #undef DO_UZP diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 1dcdbac0af..b2aa9130b6 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2624,6 +2624,32 @@ static bool trans_ZIP2_z(DisasContext *s, arg_rrr_esz *a) return do_zip(s, a, true); } +static bool do_zip_q(DisasContext *s, arg_rrr_esz *a, bool high) +{ + if (!dc_isar_feature(aa64_sve_f64mm, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + unsigned high_ofs = high ? QEMU_ALIGN_DOWN(vsz, 32) / 2 : 0; + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn) + high_ofs, + vec_full_reg_offset(s, a->rm) + high_ofs, + vsz, vsz, 0, gen_helper_sve2_zip_q); + } + return true; +} + +static bool trans_ZIP1_q(DisasContext *s, arg_rrr_esz *a) +{ + return do_zip_q(s, a, false); +} + +static bool trans_ZIP2_q(DisasContext *s, arg_rrr_esz *a) +{ + return do_zip_q(s, a, true); +} + static gen_helper_gvec_3 * const uzp_fns[4] = { gen_helper_sve_uzp_b, gen_helper_sve_uzp_h, gen_helper_sve_uzp_s, gen_helper_sve_uzp_d, @@ -2639,6 +2665,22 @@ static bool trans_UZP2_z(DisasContext *s, arg_rrr_esz *a) return do_zzz_data_ool(s, a, 1 << a->esz, uzp_fns[a->esz]); } +static bool trans_UZP1_q(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve_f64mm, s)) { + return false; + } + return do_zzz_data_ool(s, a, 0, gen_helper_sve2_uzp_q); +} + +static bool trans_UZP2_q(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve_f64mm, s)) { + return false; + } + return do_zzz_data_ool(s, a, 16, gen_helper_sve2_uzp_q); +} + static gen_helper_gvec_3 * const trn_fns[4] = { gen_helper_sve_trn_b, gen_helper_sve_trn_h, gen_helper_sve_trn_s, gen_helper_sve_trn_d, @@ -2654,6 +2696,22 @@ static bool trans_TRN2_z(DisasContext *s, arg_rrr_esz *a) return do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]); } +static bool trans_TRN1_q(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve_f64mm, s)) { + return false; + } + return do_zzz_data_ool(s, a, 0, gen_helper_sve2_trn_q); +} + +static bool trans_TRN2_q(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve_f64mm, s)) { + return false; + } + return do_zzz_data_ool(s, a, 16, gen_helper_sve2_trn_q); +} + /* *** SVE Permute Vector - Predicated Group */ From patchwork Tue May 25 01:03:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277561 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B306C47084 for ; Tue, 25 May 2021 02:04:19 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 268C06113B for ; Tue, 25 May 2021 02:04:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 268C06113B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:44874 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMQc-0003hK-7F for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:04:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56320) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLaz-00065u-Nl for qemu-devel@nongnu.org; Mon, 24 May 2021 21:10:59 -0400 Received: from mail-pg1-x529.google.com ([2607:f8b0:4864:20::529]:37685) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLat-0005qV-JJ for qemu-devel@nongnu.org; Mon, 24 May 2021 21:10:57 -0400 Received: by mail-pg1-x529.google.com with SMTP id t193so21416513pgb.4 for ; Mon, 24 May 2021 18:10:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=rAh+mrtYdjW28/Wa1av2YPet2GepPpehSKaS9Nt7Pu8=; b=R8HtKb18aRFL8VyQZCw6nFmEI3g01SqMuyJxDXs2KS5YitcfDmILsWZjCjtjCb5RL9 98rl4Ddt6H4HDjg5kSxyn4+7cNoVHF1Bjo7UetE36Rfylk0ud/1IsGx13kGrNdi9RF6u xbquI/GwZRAY1+P/wnCFqRBrpDBDkA7BYK21Ia1ZRBY604SWNWTjPUFxk/MM1AJmXqg2 CoHQg7UTh0V2/0ch/JGyVgCk+zOQc5mc3hXb1WfV9jak6Jk+IBfRnFqlyfz3LpVkDpxU njaBuUyv2gX/lxwxneWTLqIu0BBSauFPS6aT7U3PjfX7Rkv7++pVSZ/GCF8uc7/AFs+S yniw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=rAh+mrtYdjW28/Wa1av2YPet2GepPpehSKaS9Nt7Pu8=; b=FsZH5NikR73BdtFIY+oRZ2wDdQp4s0F10CBX5g+HcN35rlflzorIRo/49H8UxSqc8d XKATiRqll1AFv9SiRq1PvfUGQsy6aJxN2FW2wMebu741LiPVwXlBrDQ9znoHJLE4GAEt CPVGb0/6au5juFMeM/ucqXSWiBWlsgi1Yc+ebHh0ari5l+N02IZcHIoatSFDNI9EUUKN HHyE+KdVPwzuCECulPFiY9MT+8fS80wln13yoA5Ocp+eprJCMLrNn4xCgovrYuCfY6Ki o+PT9x44vUA8M7DDzaFEKyxJgSfuLlIYQuU+JCZWnLdr1krIxagcyfXJ5cp5JXwvx50w pHaA== X-Gm-Message-State: AOAM532O9V8oubGxwtd4o4a1VGruG6+sCoRbcgWLUKJUXAtWAt6xIb75 JTIpCtkdm7TDihmyFKl29hADFrtUtu79IQ== X-Google-Smtp-Source: ABdhPJzq/jpPUa0EQ/jCH3QgSMXL3VSfCSPlz3BOdtgj6PIK39XLU+UQ3iaDUK0zfRmkKYail7dSPw== X-Received: by 2002:a63:6cd:: with SMTP id 196mr16089384pgg.417.1621905049153; Mon, 24 May 2021 18:10:49 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:48 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 80/92] target/arm: Implement SVE2 bitwise shift immediate Date: Mon, 24 May 2021 18:03:46 -0700 Message-Id: <20210525010358.152808-81-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::529; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x529.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Implements SQSHL/UQSHL, SRSHR/URSHR, and SQSHLU Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200430194159.24064-1-steplong@quicinc.com> Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 33 +++++++++++++++++++++ target/arm/sve.decode | 5 ++++ target/arm/sve_helper.c | 35 ++++++++++++++++++++++ target/arm/translate-sve.c | 60 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 133 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b43ffce23a..29a14a21f5 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2761,3 +2761,36 @@ DEF_HELPER_FLAGS_5(sve2_fcvtlt_sd, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(flogb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(flogb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(flogb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqshl_zpzi_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshl_zpzi_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshl_zpzi_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshl_zpzi_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uqshl_zpzi_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uqshl_zpzi_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uqshl_zpzi_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uqshl_zpzi_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqshlu_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshlu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshlu_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshlu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5469ce0414..ea98508cdd 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -340,6 +340,11 @@ ASR_zpzi 00000100 .. 000 000 100 ... .. ... ..... @rdn_pg_tszimm_shr LSR_zpzi 00000100 .. 000 001 100 ... .. ... ..... @rdn_pg_tszimm_shr LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... @rdn_pg_tszimm_shl ASRD 00000100 .. 000 100 100 ... .. ... ..... @rdn_pg_tszimm_shr +SQSHL_zpzi 00000100 .. 000 110 100 ... .. ... ..... @rdn_pg_tszimm_shl +UQSHL_zpzi 00000100 .. 000 111 100 ... .. ... ..... @rdn_pg_tszimm_shl +SRSHR 00000100 .. 001 100 100 ... .. ... ..... @rdn_pg_tszimm_shr +URSHR 00000100 .. 001 101 100 ... .. ... ..... @rdn_pg_tszimm_shr +SQSHLU 00000100 .. 001 111 100 ... .. ... ..... @rdn_pg_tszimm_shl # SVE bitwise shift by vector (predicated) ASR_zpzz 00000100 .. 010 000 100 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d088b1f74c..4afb06fb2a 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2238,6 +2238,41 @@ DO_ZPZI(sve_asrd_h, int16_t, H1_2, DO_ASRD) DO_ZPZI(sve_asrd_s, int32_t, H1_4, DO_ASRD) DO_ZPZI_D(sve_asrd_d, int64_t, DO_ASRD) +/* SVE2 bitwise shift by immediate */ +DO_ZPZI(sve2_sqshl_zpzi_b, int8_t, H1, do_sqshl_b) +DO_ZPZI(sve2_sqshl_zpzi_h, int16_t, H1_2, do_sqshl_h) +DO_ZPZI(sve2_sqshl_zpzi_s, int32_t, H1_4, do_sqshl_s) +DO_ZPZI_D(sve2_sqshl_zpzi_d, int64_t, do_sqshl_d) + +DO_ZPZI(sve2_uqshl_zpzi_b, uint8_t, H1, do_uqshl_b) +DO_ZPZI(sve2_uqshl_zpzi_h, uint16_t, H1_2, do_uqshl_h) +DO_ZPZI(sve2_uqshl_zpzi_s, uint32_t, H1_4, do_uqshl_s) +DO_ZPZI_D(sve2_uqshl_zpzi_d, uint64_t, do_uqshl_d) + +DO_ZPZI(sve2_srshr_b, int8_t, H1, do_srshr) +DO_ZPZI(sve2_srshr_h, int16_t, H1_2, do_srshr) +DO_ZPZI(sve2_srshr_s, int32_t, H1_4, do_srshr) +DO_ZPZI_D(sve2_srshr_d, int64_t, do_srshr) + +DO_ZPZI(sve2_urshr_b, uint8_t, H1, do_urshr) +DO_ZPZI(sve2_urshr_h, uint16_t, H1_2, do_urshr) +DO_ZPZI(sve2_urshr_s, uint32_t, H1_4, do_urshr) +DO_ZPZI_D(sve2_urshr_d, uint64_t, do_urshr) + +#define do_suqrshl_b(n, m) \ + ({ uint32_t discard; do_suqrshl_bhs(n, (int8_t)m, 8, false, &discard); }) +#define do_suqrshl_h(n, m) \ + ({ uint32_t discard; do_suqrshl_bhs(n, (int16_t)m, 16, false, &discard); }) +#define do_suqrshl_s(n, m) \ + ({ uint32_t discard; do_suqrshl_bhs(n, m, 32, false, &discard); }) +#define do_suqrshl_d(n, m) \ + ({ uint32_t discard; do_suqrshl_d(n, m, false, &discard); }) + +DO_ZPZI(sve2_sqshlu_b, int8_t, H1, do_suqrshl_b) +DO_ZPZI(sve2_sqshlu_h, int16_t, H1_2, do_suqrshl_h) +DO_ZPZI(sve2_sqshlu_s, int32_t, H1_4, do_suqrshl_s) +DO_ZPZI_D(sve2_sqshlu_d, int64_t, do_suqrshl_d) + #undef DO_ASRD #undef DO_ZPZI #undef DO_ZPZI_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index b2aa9130b6..92c0620bc8 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -1044,6 +1044,66 @@ static bool trans_ASRD(DisasContext *s, arg_rpri_esz *a) } } +static bool trans_SQSHL_zpzi(DisasContext *s, arg_rpri_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqshl_zpzi_b, gen_helper_sve2_sqshl_zpzi_h, + gen_helper_sve2_sqshl_zpzi_s, gen_helper_sve2_sqshl_zpzi_d, + }; + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzi_ool(s, a, fns[a->esz]); +} + +static bool trans_UQSHL_zpzi(DisasContext *s, arg_rpri_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_uqshl_zpzi_b, gen_helper_sve2_uqshl_zpzi_h, + gen_helper_sve2_uqshl_zpzi_s, gen_helper_sve2_uqshl_zpzi_d, + }; + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzi_ool(s, a, fns[a->esz]); +} + +static bool trans_SRSHR(DisasContext *s, arg_rpri_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_srshr_b, gen_helper_sve2_srshr_h, + gen_helper_sve2_srshr_s, gen_helper_sve2_srshr_d, + }; + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzi_ool(s, a, fns[a->esz]); +} + +static bool trans_URSHR(DisasContext *s, arg_rpri_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_urshr_b, gen_helper_sve2_urshr_h, + gen_helper_sve2_urshr_s, gen_helper_sve2_urshr_d, + }; + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzi_ool(s, a, fns[a->esz]); +} + +static bool trans_SQSHLU(DisasContext *s, arg_rpri_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqshlu_b, gen_helper_sve2_sqshlu_h, + gen_helper_sve2_sqshlu_s, gen_helper_sve2_sqshlu_d, + }; + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzi_ool(s, a, fns[a->esz]); +} + /* *** SVE Bitwise Shift - Predicated Group */ From patchwork Tue May 25 01:03:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277565 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B1E9EC2B9F7 for ; Tue, 25 May 2021 02:06:09 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 548076113B for ; Tue, 25 May 2021 02:06:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 548076113B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:51020 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMSO-0007lp-09 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:06:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56376) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLb3-00069R-LX for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:02 -0400 Received: from mail-pj1-x1036.google.com ([2607:f8b0:4864:20::1036]:34419) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLat-0005rf-JO for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:01 -0400 Received: by mail-pj1-x1036.google.com with SMTP id g6-20020a17090adac6b029015d1a9a6f1aso874802pjx.1 for ; Mon, 24 May 2021 18:10:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=SH4+oKBkgsx+CH0p73raOV00OXwUsTnolQbkKecgOjI=; b=cDVmUV1VHw/NAk9I3IxKUTMAuAg33gh6ZbN4ljcgs5SuuXAEOTb0j8VThrSsutSWFS DF30gPA1Qu7/haAWedrADSDihpYOoSyCnzIY2F/azNzdgflLw4Ew1JE4gmfxsnCzzSQo 5rYGf/Lfwl8STenGDajcAK0754myqpsF8svK193cThBnK5pYwPYRAVf0MZRH63pSOzIn QMeMZPJcHbkZl49gTu7FnDqNZc43w++/4llEaEdBj8ds4L8pqb/ELWRm4KEoL4SNyDRE gESMwP+QYxK7sdc93NCm5uFdPDwUzku7ylAJQytsR3S9aQNMKYAf5GFKBIAosYlUUTTn sRvA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=SH4+oKBkgsx+CH0p73raOV00OXwUsTnolQbkKecgOjI=; b=COx3szWKAHxyZLdDK9Y6na+tmSFosZuRWQ+sLMyqfLkqGAemZGbuxwlxaPVT18gnLG YJJlatOE8TQ49GO5xXGoSrr6v4/s/2U3qKTo4RtulM7vpkUosNdsqZAdNW1wC0l/5rYw N1wbS1kueI7aNqzkrejaeUcQrEZnY31NhQUKrOOLGE0gOZR7zbeRbz51kYo326Q+C+Up Lscet5Z3RcFykSjBNik7bQNk3I+nobjKHdZzY0mAOYaST+6xtyjF3k8gFMYJT2lRxMMS CdLIbYizPb4GmBRi3fobXgmuhpq9+olhA8KytYTatXbqi5RIsSo35m/R7YdnPfcCPDUU en3g== X-Gm-Message-State: AOAM530tpX5iF+mnidjZADfKHyh2lPLASAKgWiwFoaS9bP/YTG9p8Um5 80rR6U+H/50tZaxykLE6H41ofoambo9i7w== X-Google-Smtp-Source: ABdhPJyDDaQ9Y1v9o9GoV7IxwbbINwG+uNjaC45ctV7jCXKcogqXrUrB0t2Aflekkk+3UeIo2ckx+g== X-Received: by 2002:a17:903:1c3:b029:f4:9624:2c69 with SMTP id e3-20020a17090301c3b02900f496242c69mr28258002plh.51.1621905049956; Mon, 24 May 2021 18:10:49 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:49 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 81/92] target/arm: Move endian adjustment macros to vec_internal.h Date: Mon, 24 May 2021 18:03:47 -0700 Message-Id: <20210525010358.152808-82-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1036; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1036.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" We have two copies of these, one set of which is not complete. Move them to a common header. Suggested-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/vec_internal.h | 24 ++++++++++++++++++++++++ target/arm/sve_helper.c | 16 ---------------- target/arm/vec_helper.c | 12 ------------ 3 files changed, 24 insertions(+), 28 deletions(-) diff --git a/target/arm/vec_internal.h b/target/arm/vec_internal.h index ff694d870a..dba481e001 100644 --- a/target/arm/vec_internal.h +++ b/target/arm/vec_internal.h @@ -20,6 +20,30 @@ #ifndef TARGET_ARM_VEC_INTERNALS_H #define TARGET_ARM_VEC_INTERNALS_H +/* + * Note that vector data is stored in host-endian 64-bit chunks, + * so addressing units smaller than that needs a host-endian fixup. + * + * The H macros are used when indexing an array of elements of size N. + * + * The H1_ macros are used when performing byte arithmetic and then + * casting the final pointer to a type of size N. + */ +#ifdef HOST_WORDS_BIGENDIAN +#define H1(x) ((x) ^ 7) +#define H1_2(x) ((x) ^ 6) +#define H1_4(x) ((x) ^ 4) +#define H2(x) ((x) ^ 3) +#define H4(x) ((x) ^ 1) +#else +#define H1(x) (x) +#define H1_2(x) (x) +#define H1_4(x) (x) +#define H2(x) (x) +#define H4(x) (x) +#endif + + static inline void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz) { uint64_t *d = vd + opr_sz; diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4afb06fb2a..40af3024df 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -29,22 +29,6 @@ #include "vec_internal.h" -/* Note that vector data is stored in host-endian 64-bit chunks, - so addressing units smaller than that needs a host-endian fixup. */ -#ifdef HOST_WORDS_BIGENDIAN -#define H1(x) ((x) ^ 7) -#define H1_2(x) ((x) ^ 6) -#define H1_4(x) ((x) ^ 4) -#define H2(x) ((x) ^ 3) -#define H4(x) ((x) ^ 1) -#else -#define H1(x) (x) -#define H1_2(x) (x) -#define H1_4(x) (x) -#define H2(x) (x) -#define H4(x) (x) -#endif - /* Return a value for NZCV as per the ARM PredTest pseudofunction. * * The return value has bit 31 set if N is set, bit 1 set if Z is clear, diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 21ae1258f2..f5af45375d 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -25,18 +25,6 @@ #include "qemu/int128.h" #include "vec_internal.h" -/* Note that vector data is stored in host-endian 64-bit chunks, - so addressing units smaller than that needs a host-endian fixup. */ -#ifdef HOST_WORDS_BIGENDIAN -#define H1(x) ((x) ^ 7) -#define H2(x) ((x) ^ 3) -#define H4(x) ((x) ^ 1) -#else -#define H1(x) (x) -#define H2(x) (x) -#define H4(x) (x) -#endif - /* Signed saturating rounding doubling multiply-accumulate high half, 8-bit */ int8_t do_sqrdmlah_b(int8_t src1, int8_t src2, int8_t src3, bool neg, bool round) From patchwork Tue May 25 01:03:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277601 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C61DAC2B9F7 for ; Tue, 25 May 2021 02:23:47 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 366486141D for ; Tue, 25 May 2021 02:23:47 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 366486141D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:51362 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMjS-0004L1-Aq for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:23:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56372) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLb1-00067a-Hk for qemu-devel@nongnu.org; Mon, 24 May 2021 21:10:59 -0400 Received: from mail-pl1-x631.google.com ([2607:f8b0:4864:20::631]:45864) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLat-0005rp-Re for qemu-devel@nongnu.org; Mon, 24 May 2021 21:10:58 -0400 Received: by mail-pl1-x631.google.com with SMTP id s4so13954797plg.12 for ; Mon, 24 May 2021 18:10:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=gBzNAdSJ3IF4Sl+NJ02Wycw36cBTOxrQMQhhGPhDwm4=; b=K0nQrOs//wf4bCnYirphrVf87mTU7MVlsZ425B1jTCe5Jzeb395rHcb4n8FJOizJ8U CIm/S5PK+MJfHyHta9GuYogZa0b9PdhW/7ebRZv23qRbt3PfUTZjsNvRw7FuD02QxBHh WiPose/v2fRbtueboiEX20ppviMKBOWf27Ym4ipPKSzb0CTUQ8AIhsBF7ZlrA4SRT5kk 1IV8xiE1pzF2Iiun4b5P1GmSzXr8804vnMF4eMsE5w1/h+6EObdtOiRPWphle78Hd1be p5Rjs1lrYbZTpf1AdMksa0m/DI8C2i9QnKzdo6xPi+pBbFGfSkbPDOhiffZjezmvlJXC 1ztw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gBzNAdSJ3IF4Sl+NJ02Wycw36cBTOxrQMQhhGPhDwm4=; b=AKhSDhQZvg3N/ozD5IdoTKY19c9dXtWjSnB+INY8UQOQ7xw+iarxSqgrMIFzjCaNs6 kUepJnVuQ02MAbF+1H1L3s6qIkGgduf1qLgxYQ572vVIFncO9G2A4T5a2xmNncgP70gv 6OiL59VSaeC2TneP8RxAqU45tVRx7XqPyA0Z1SAWPOoLlDfBk8/IXk+r1+9g9cikJXHL vLxFmRLa1YE5aCA7zdanWzHrMh8QYzC1yoHtDltIpuc4xsixK/IJy2POTlK36c0dg6ef xo0O1I6dQkhnTcpKFPW1zhGajNGoPiBYt3h7LvqWmKIkRpI6CxTPYlBR+VAudYIx6/G7 d3OQ== X-Gm-Message-State: AOAM531xSfWUr+UgDMtora6VTP3JzHeCtiFqHs+WDghUd8wmF7V7DZN8 r4NlFuC9KE1N2RhZP7JxC2wch0lUGQW5ng== X-Google-Smtp-Source: ABdhPJzd8tr0w6DPiwxEnPJ7/CuwSwv4jV9yyH/oat7ejxY4xsu6Uj0Sh9iVCSKP3E0reY36SDp1gg== X-Received: by 2002:a17:90b:209:: with SMTP id fy9mr2068249pjb.92.1621905050614; Mon, 24 May 2021 18:10:50 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:50 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 82/92] target/arm: Implement SVE2 fp multiply-add long Date: Mon, 24 May 2021 18:03:48 -0700 Message-Id: <20210525010358.152808-83-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::631; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x631.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Implements both vectored and indexed FMLALB, FMLALT, FMLSLB, FMLSLT Reviewed-by: Peter Maydell Signed-off-by: Stephen Long Message-Id: <20200504171240.11220-1-steplong@quicinc.com> [rth: Rearrange to use float16_to_float32_by_bits.] Signed-off-by: Richard Henderson --- target/arm/helper.h | 5 +++ target/arm/sve.decode | 14 +++++++ target/arm/translate-sve.c | 75 ++++++++++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 47 ++++++++++++++++++++++++ 4 files changed, 141 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index 2e212ae96b..92b81bbabe 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -986,6 +986,11 @@ DEF_HELPER_FLAGS_4(sve2_sqrdmulh_idx_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_4(sve2_sqrdmulh_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fmlal_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fmlal_zzxw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(gvec_xar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) #ifdef TARGET_AARCH64 diff --git a/target/arm/sve.decode b/target/arm/sve.decode index ea98508cdd..78a2a31ab1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -132,6 +132,8 @@ &rrrr_esz ra=%reg_movprfx # Four operand with unused vector element size +@rda_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 \ + &rrrr_esz esz=0 ra=%reg_movprfx @rdn_ra_rm_e0 ........ ... rm:5 ... ... ra:5 rd:5 \ &rrrr_esz esz=0 rn=%reg_movprfx @@ -1608,3 +1610,15 @@ FCVTLT_sd 01100100 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 ### SVE2 floating-point convert to integer FLOGB 01100101 00 011 esz:2 0101 pg:3 rn:5 rd:5 &rpr_esz + +### SVE2 floating-point multiply-add long (vectors) +FMLALB_zzzw 01100100 10 1 ..... 10 0 00 0 ..... ..... @rda_rn_rm_e0 +FMLALT_zzzw 01100100 10 1 ..... 10 0 00 1 ..... ..... @rda_rn_rm_e0 +FMLSLB_zzzw 01100100 10 1 ..... 10 1 00 0 ..... ..... @rda_rn_rm_e0 +FMLSLT_zzzw 01100100 10 1 ..... 10 1 00 1 ..... ..... @rda_rn_rm_e0 + +### SVE2 floating-point multiply-add long (indexed) +FMLALB_zzxw 01100100 10 1 ..... 0100.0 ..... ..... @rrxr_3a esz=2 +FMLALT_zzxw 01100100 10 1 ..... 0100.1 ..... ..... @rrxr_3a esz=2 +FMLSLB_zzxw 01100100 10 1 ..... 0110.0 ..... ..... @rrxr_3a esz=2 +FMLSLT_zzxw 01100100 10 1 ..... 0110.1 ..... ..... @rrxr_3a esz=2 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 92c0620bc8..428ae018a3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8535,3 +8535,78 @@ static bool trans_FLOGB(DisasContext *s, arg_rpr_esz *a) } return true; } + +static bool do_FMLAL_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sub, bool sel) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + cpu_env, vsz, vsz, (sel << 1) | sub, + gen_helper_sve2_fmlal_zzzw_s); + } + return true; +} + +static bool trans_FMLALB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_FMLAL_zzzw(s, a, false, false); +} + +static bool trans_FMLALT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_FMLAL_zzzw(s, a, false, true); +} + +static bool trans_FMLSLB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_FMLAL_zzzw(s, a, true, false); +} + +static bool trans_FMLSLT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_FMLAL_zzzw(s, a, true, true); +} + +static bool do_FMLAL_zzxw(DisasContext *s, arg_rrxr_esz *a, bool sub, bool sel) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + cpu_env, vsz, vsz, + (a->index << 2) | (sel << 1) | sub, + gen_helper_sve2_fmlal_zzxw_s); + } + return true; +} + +static bool trans_FMLALB_zzxw(DisasContext *s, arg_rrxr_esz *a) +{ + return do_FMLAL_zzxw(s, a, false, false); +} + +static bool trans_FMLALT_zzxw(DisasContext *s, arg_rrxr_esz *a) +{ + return do_FMLAL_zzxw(s, a, false, true); +} + +static bool trans_FMLSLB_zzxw(DisasContext *s, arg_rrxr_esz *a) +{ + return do_FMLAL_zzxw(s, a, true, false); +} + +static bool trans_FMLSLT_zzxw(DisasContext *s, arg_rrxr_esz *a) +{ + return do_FMLAL_zzxw(s, a, true, true); +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index f5af45375d..19c4ba1bdf 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1668,6 +1668,27 @@ void HELPER(gvec_fmlal_a64)(void *vd, void *vn, void *vm, get_flush_inputs_to_zero(&env->vfp.fp_status_f16)); } +void HELPER(sve2_fmlal_zzzw_s)(void *vd, void *vn, void *vm, void *va, + void *venv, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + uint16_t negn = extract32(desc, SIMD_DATA_SHIFT, 1) << 15; + intptr_t sel = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(float16); + CPUARMState *env = venv; + float_status *status = &env->vfp.fp_status; + bool fz16 = get_flush_inputs_to_zero(&env->vfp.fp_status_f16); + + for (i = 0; i < oprsz; i += sizeof(float32)) { + float16 nn_16 = *(float16 *)(vn + H1_2(i + sel)) ^ negn; + float16 mm_16 = *(float16 *)(vm + H1_2(i + sel)); + float32 nn = float16_to_float32_by_bits(nn_16, fz16); + float32 mm = float16_to_float32_by_bits(mm_16, fz16); + float32 aa = *(float32 *)(va + H1_4(i)); + + *(float32 *)(vd + H1_4(i)) = float32_muladd(nn, mm, aa, 0, status); + } +} + static void do_fmlal_idx(float32 *d, void *vn, void *vm, float_status *fpst, uint32_t desc, bool fz16) { @@ -1712,6 +1733,32 @@ void HELPER(gvec_fmlal_idx_a64)(void *vd, void *vn, void *vm, get_flush_inputs_to_zero(&env->vfp.fp_status_f16)); } +void HELPER(sve2_fmlal_zzxw_s)(void *vd, void *vn, void *vm, void *va, + void *venv, uint32_t desc) +{ + intptr_t i, j, oprsz = simd_oprsz(desc); + uint16_t negn = extract32(desc, SIMD_DATA_SHIFT, 1) << 15; + intptr_t sel = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(float16); + intptr_t idx = extract32(desc, SIMD_DATA_SHIFT + 2, 3) * sizeof(float16); + CPUARMState *env = venv; + float_status *status = &env->vfp.fp_status; + bool fz16 = get_flush_inputs_to_zero(&env->vfp.fp_status_f16); + + for (i = 0; i < oprsz; i += 16) { + float16 mm_16 = *(float16 *)(vm + i + idx); + float32 mm = float16_to_float32_by_bits(mm_16, fz16); + + for (j = 0; j < 16; j += sizeof(float32)) { + float16 nn_16 = *(float16 *)(vn + H1_2(i + j + sel)) ^ negn; + float32 nn = float16_to_float32_by_bits(nn_16, fz16); + float32 aa = *(float32 *)(va + H1_4(i + j)); + + *(float32 *)(vd + H1_4(i + j)) = + float32_muladd(nn, mm, aa, 0, status); + } + } +} + void HELPER(gvec_sshl_b)(void *vd, void *vn, void *vm, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); From patchwork Tue May 25 01:03:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277577 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D724EC2B9F7 for ; Tue, 25 May 2021 02:08:22 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 899E36141D for ; Tue, 25 May 2021 02:08:22 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 899E36141D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:59634 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMUX-00059x-Lp for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:08:21 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56420) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLb7-0006HG-9t for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:05 -0400 Received: from mail-pj1-x1035.google.com ([2607:f8b0:4864:20::1035]:46923) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLav-0005ru-EF for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:04 -0400 Received: by mail-pj1-x1035.google.com with SMTP id pi6-20020a17090b1e46b029015cec51d7cdso12174414pjb.5 for ; Mon, 24 May 2021 18:10:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=7dZqpG1NzFwardXCWOeWwh7nv+Kax2dkw5cg803fy2E=; b=HA0d8y2Rk0ku9qXs3WUgDfZjLkAwN4SXYnQzk24FRjgwVoEZrmZCs2k7zmUkeL1hIX hTUmweP1anMCPj91tSyvMzB2QS4xDZgz7aOE/odjwDxhDRe3pBsmLNh9u7XYCNmFzvqu J9FoLsiXDYFqFMrUgJcn7NeZb9e3d5gbpwKD9fg8u2laj/3pvpyYTGyBFAkdkQ3+aaaf kUjyXBDbU758GrXCN+ICwey3NEDyBHiYOEfGT5zqvVw7ISsJgrICrH6vtzCB93SCl4QO UKW0rV2HvEy2XKzafsF8W4rej9VN+fEVwh/sbNCxdoSKS3zmvAMBA2JFVtfHclblVBNW KjZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=7dZqpG1NzFwardXCWOeWwh7nv+Kax2dkw5cg803fy2E=; b=JyCi/dcEY50I3qN7z7pRWeRTlkTMj9AQWMpP0CVU8iR3CxmL1MSs4Zn+vJH8e4jQVr PRva1bJeqITQyIKomvJp+0ancAaEMmvPDkKWWWsrPLPTs7fq+w/5YgP/1KabEEVcKe6N gowe4Vq6dXmD78OC5PXhQFjidHah05k3BSYeLWKLlgp4DJF2Bi1ROtrFCFofF+rOXI04 DLwW1NOymST/XvOD/i8bgGaLykm9wzF2C1e5L9IC9QyGjzEEh875VFIQsPIlBJyKNNd/ jZyCBa4ufptizNqTmfYTHh8BR9e7R5iG5CQTtkxhqTyojXjLtqh3dT27jeQYzJEfV5d3 YVMQ== X-Gm-Message-State: AOAM531uu6rqKB15ZpyLaPMionGhO1rM6Lpkob/w6CV3u4+Y76Gd+e86 +3F1GpNQGbNZFZF1fYOrPK2BPvuMrI1INg== X-Google-Smtp-Source: ABdhPJzuCw3CfLWGhGkvhKxGhVrgBaSWhXQRbhaWYgisF0UED3CuTVQzBBCMmeaRnvfNRq0uMPKj7w== X-Received: by 2002:a17:90a:8581:: with SMTP id m1mr2847606pjn.47.1621905051167; Mon, 24 May 2021 18:10:51 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:50 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 83/92] target/arm: Implement aarch64 SUDOT, USDOT Date: Mon, 24 May 2021 18:03:49 -0700 Message-Id: <20210525010358.152808-84-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1035; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1035.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/cpu.h | 5 +++++ target/arm/translate-a64.c | 25 +++++++++++++++++++++++++ 2 files changed, 30 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index c75601b221..b2b684df55 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -4206,6 +4206,11 @@ static inline bool isar_feature_aa64_rcpc_8_4(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, LRCPC) >= 2; } +static inline bool isar_feature_aa64_i8mm(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, I8MM) != 0; +} + static inline bool isar_feature_aa64_ccidx(const ARMISARegisters *id) { return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, CCIDX) != 0; diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index a8edd2d281..c875481784 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -12175,6 +12175,13 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) } feature = dc_isar_feature(aa64_dp, s); break; + case 0x03: /* USDOT */ + if (size != MO_32) { + unallocated_encoding(s); + return; + } + feature = dc_isar_feature(aa64_i8mm, s); + break; case 0x18: /* FCMLA, #0 */ case 0x19: /* FCMLA, #90 */ case 0x1a: /* FCMLA, #180 */ @@ -12215,6 +12222,10 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b); return; + case 0x3: /* USDOT */ + gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, 0, gen_helper_gvec_usdot_b); + return; + case 0x8: /* FCMLA, #0 */ case 0x9: /* FCMLA, #90 */ case 0xa: /* FCMLA, #180 */ @@ -13360,6 +13371,13 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) return; } break; + case 0x0f: /* SUDOT, USDOT */ + if (is_scalar || (size & 1) || !dc_isar_feature(aa64_i8mm, s)) { + unallocated_encoding(s); + return; + } + size = MO_32; + break; case 0x11: /* FCMLA #0 */ case 0x13: /* FCMLA #90 */ case 0x15: /* FCMLA #180 */ @@ -13474,6 +13492,13 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b); return; + case 0x0f: /* SUDOT, USDOT */ + gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, index, + extract32(insn, 23, 1) + ? gen_helper_gvec_usdot_idx_b + : gen_helper_gvec_sudot_idx_b); + return; + case 0x11: /* FCMLA #0 */ case 0x13: /* FCMLA #90 */ case 0x15: /* FCMLA #180 */ From patchwork Tue May 25 01:03:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277573 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0BB59C2B9F7 for ; Tue, 25 May 2021 02:07:02 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 83E0761402 for ; Tue, 25 May 2021 02:07:01 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 83E0761402 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55030 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMTE-00023B-Kq for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:07:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56430) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLb7-0006Jr-TZ for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:05 -0400 Received: from mail-pf1-x42f.google.com ([2607:f8b0:4864:20::42f]:44005) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLax-0005sP-LJ for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:05 -0400 Received: by mail-pf1-x42f.google.com with SMTP id d78so21408313pfd.10 for ; Mon, 24 May 2021 18:10:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Rjy5u/+r3N9Im1B+TWU6WpNfVMB2c2jYme3uNKhY+Qg=; b=MAcDLZ2eyHv/aBb5RLp0f3Tn/QXEh+S/Ct6PqSZKpenOJ9SteNs/Pm1PnLta19SNgE fiPpvjdEMZjEL51Ye7ISmt1CDpQH0WNQrhE5HenvCheHWf1wft89KiSu2SAENhi2Z2yJ DpRSUdQYi/wVe0qS+xLeCXrW6YpatHfN+2ym0s+5NSil2XlXdHFij0nrC5jzp/6QKSLg xH4dVA2Uvm4NPVV580X+RbdxvEIM07e/WPm7Xo94V5HkcaqpxtlhbO1hdLrwPk1qAhS/ kpAsVt5MzI9MXC4WDI6gKf7gkEv+iOUmENYeJYtwcjhrOmqB+FAwug6Hk8LdGMFdrxHI Arpg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Rjy5u/+r3N9Im1B+TWU6WpNfVMB2c2jYme3uNKhY+Qg=; b=uGCKnNVkclOHhdEii7Bfhb9Ova2zvo2AttNXb0wnTHiQMDaShFrWENKjsdo7IqcXMa WKNgJfnd5V6XWnT3S4bqB3yq7BiG27mbBLbPLvdzsjzgFSjPS3DTUjQxzmOmHSOFpZWE vU3q8Eq9Nz6+NspGLDaEDIRa4DaW9Tsg+wrThL+QYbULMTasdxzL+FZM55ZzCrCKsQED GtgtK420jDwsL+7vtnA2MITpl6T4t+FSlZQ3efzoXBSfM2CYNqLGi7ingjYJm+JV7BxY H9+oB3i4YYiHVf4Kigcnc703y3IPS1GZpaj06ndswLKOCbDN2B66X9W+8e67rmWQszLt 0sAg== X-Gm-Message-State: AOAM532SxWFuIpU80Eu5BQYMsuI9CjCj1O+lcuWmMuUO6oF7dr2cFT+o jaQeP2hCqmUWsht361ls9NzamBG85dYtKQ== X-Google-Smtp-Source: ABdhPJzc+3pv1ZWtz/kUygwnSo1bsiIwmPgiDDXA6RKQu1FSCFa/Nj4RAZvwYHTfxXhtxKC734deow== X-Received: by 2002:a63:a549:: with SMTP id r9mr16319196pgu.148.1621905051709; Mon, 24 May 2021 18:10:51 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:51 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 84/92] target/arm: Split out do_neon_ddda_fpst Date: Mon, 24 May 2021 18:03:50 -0700 Message-Id: <20210525010358.152808-85-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42f; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Split out a helper that can handle the 4-register format for helpers shared with SVE. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate-neon.c | 98 ++++++++++++++++--------------------- 1 file changed, 43 insertions(+), 55 deletions(-) diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c index 45fa5166f3..1a8fc7fb39 100644 --- a/target/arm/translate-neon.c +++ b/target/arm/translate-neon.c @@ -151,24 +151,21 @@ static void neon_store_element64(int reg, int ele, MemOp size, TCGv_i64 var) } } -static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a) +static bool do_neon_ddda_fpst(DisasContext *s, int q, int vd, int vn, int vm, + int data, ARMFPStatusFlavour fp_flavour, + gen_helper_gvec_4_ptr *fn_gvec_ptr) { - int opr_sz; - TCGv_ptr fpst; - gen_helper_gvec_4_ptr *fn_gvec_ptr; - - if (!dc_isar_feature(aa32_vcma, s) - || (a->size == MO_16 && !dc_isar_feature(aa32_fp16_arith, s))) { - return false; - } - /* UNDEF accesses to D16-D31 if they don't exist. */ - if (!dc_isar_feature(aa32_simd_r32, s) && - ((a->vd | a->vn | a->vm) & 0x10)) { + if (((vd | vn | vm) & 0x10) && !dc_isar_feature(aa32_simd_r32, s)) { return false; } - if ((a->vn | a->vm | a->vd) & a->q) { + /* + * UNDEF accesses to odd registers for each bit of Q. + * Q will be 0b111 for all Q-reg instructions, otherwise + * when we have mixed Q- and D-reg inputs. + */ + if (((vd & 1) * 4 | (vn & 1) * 2 | (vm & 1)) & q) { return false; } @@ -176,20 +173,34 @@ static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a) return true; } - opr_sz = (1 + a->q) * 8; - fpst = fpstatus_ptr(a->size == MO_16 ? FPST_STD_F16 : FPST_STD); - fn_gvec_ptr = (a->size == MO_16) ? - gen_helper_gvec_fcmlah : gen_helper_gvec_fcmlas; - tcg_gen_gvec_4_ptr(vfp_reg_offset(1, a->vd), - vfp_reg_offset(1, a->vn), - vfp_reg_offset(1, a->vm), - vfp_reg_offset(1, a->vd), - fpst, opr_sz, opr_sz, a->rot, - fn_gvec_ptr); + int opr_sz = q ? 16 : 8; + TCGv_ptr fpst = fpstatus_ptr(fp_flavour); + + tcg_gen_gvec_4_ptr(vfp_reg_offset(1, vd), + vfp_reg_offset(1, vn), + vfp_reg_offset(1, vm), + vfp_reg_offset(1, vd), + fpst, opr_sz, opr_sz, data, fn_gvec_ptr); tcg_temp_free_ptr(fpst); return true; } +static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a) +{ + if (!dc_isar_feature(aa32_vcma, s)) { + return false; + } + if (a->size == MO_16) { + if (!dc_isar_feature(aa32_fp16_arith, s)) { + return false; + } + return do_neon_ddda_fpst(s, a->q * 7, a->vd, a->vn, a->vm, a->rot, + FPST_STD_F16, gen_helper_gvec_fcmlah); + } + return do_neon_ddda_fpst(s, a->q * 7, a->vd, a->vn, a->vm, a->rot, + FPST_STD, gen_helper_gvec_fcmlas); +} + static bool trans_VCADD(DisasContext *s, arg_VCADD *a) { int opr_sz; @@ -294,43 +305,20 @@ static bool trans_VFML(DisasContext *s, arg_VFML *a) static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a) { - gen_helper_gvec_4_ptr *fn_gvec_ptr; - int opr_sz; - TCGv_ptr fpst; + int data = (a->index << 2) | a->rot; if (!dc_isar_feature(aa32_vcma, s)) { return false; } - if (a->size == MO_16 && !dc_isar_feature(aa32_fp16_arith, s)) { - return false; + if (a->size == MO_16) { + if (!dc_isar_feature(aa32_fp16_arith, s)) { + return false; + } + return do_neon_ddda_fpst(s, a->q * 6, a->vd, a->vn, a->vm, data, + FPST_STD_F16, gen_helper_gvec_fcmlah_idx); } - - /* UNDEF accesses to D16-D31 if they don't exist. */ - if (!dc_isar_feature(aa32_simd_r32, s) && - ((a->vd | a->vn | a->vm) & 0x10)) { - return false; - } - - if ((a->vd | a->vn) & a->q) { - return false; - } - - if (!vfp_access_check(s)) { - return true; - } - - fn_gvec_ptr = (a->size == MO_16) ? - gen_helper_gvec_fcmlah_idx : gen_helper_gvec_fcmlas_idx; - opr_sz = (1 + a->q) * 8; - fpst = fpstatus_ptr(a->size == MO_16 ? FPST_STD_F16 : FPST_STD); - tcg_gen_gvec_4_ptr(vfp_reg_offset(1, a->vd), - vfp_reg_offset(1, a->vn), - vfp_reg_offset(1, a->vm), - vfp_reg_offset(1, a->vd), - fpst, opr_sz, opr_sz, - (a->index << 2) | a->rot, fn_gvec_ptr); - tcg_temp_free_ptr(fpst); - return true; + return do_neon_ddda_fpst(s, a->q * 6, a->vd, a->vn, a->vm, data, + FPST_STD, gen_helper_gvec_fcmlas_idx); } static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a) From patchwork Tue May 25 01:03:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277603 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EECC6C47084 for ; Tue, 25 May 2021 02:26:05 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A60BA6141F for ; Tue, 25 May 2021 02:26:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A60BA6141F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:54604 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMlg-0006XD-Mv for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:26:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56410) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLb6-0006GE-T8 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:04 -0400 Received: from mail-pf1-x435.google.com ([2607:f8b0:4864:20::435]:35539) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLax-0005so-F7 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:04 -0400 Received: by mail-pf1-x435.google.com with SMTP id g18so20619250pfr.2 for ; Mon, 24 May 2021 18:10:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=MRl4v4OmAU1TJRXrTZ44ui9rB0aroiaTDZFn8mfRvy0=; b=a2VwodcBkHIglPE/87e3TmUOggKbbbcuVGbyP9Lc1Jpxry2YbJB1napeJugQx1R8q8 9s2xmzuESPoeXoDhCnhZEg3SztWBNZGpg5gtvy0jmTtKZtxZfiDchG5mIUnC+BoQZQBX 8t8gJTJ7M/xR1l8MkFgy52WmVwS0swCBskrhbZPOLFHPOF0T4v4+VYgU/6GyhHVbtxQB MvVEvJMuTIUTTIpY53YwA9QfFeGpYmtkktoLhBMM+MJ4fgI5S6wNGUBK0qoTptdcI8Rv +pq3CqES3CYl6yGBhG7HSEHu5JqtocgYrHquQKehj9h3R6TFwKCUZb89uy7GbJmTY+Pc ZFJw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=MRl4v4OmAU1TJRXrTZ44ui9rB0aroiaTDZFn8mfRvy0=; b=JpUzwi0E8YWtez5JO6yYPpIhrCiKLuAG67vRj4pBX+sfsunb/y9XLFqRmxEMGPwPha +Tnnb8IZqwAsEsbb3FbFSBd5GpnbOBIu+RZY18dKQz8IBTHhA58f5bf0k27s+Ehn//j2 KGwOKr8NSFfAfACmaU042nHzqTsq8R53aSBzyPwo3EQYoZNeOKH+Za/Hl38Vmi2cABoz wY7rdINUcrAhNsaYvT8ZzUiapTjWXWAHrdtQiF9YxSupEHcERyf0wGya6d1e8I1e8PnM VmDQQQP4oPsm3j6Yy1aWtYqNWRaNmwh2RjDiFRU/mL/7T8C0Ur1B+CJmARwu7jBhg2+l SoQw== X-Gm-Message-State: AOAM533yxtAw1RQCp1kCBBPECe5qmIeY4nc0F9NzPKB26tWolFBboqq2 hDaHL/Gwjyvh6xRgir/5XUovr23uiAw32w== X-Google-Smtp-Source: ABdhPJzX7JDhcwcyP3Ai69cT10KeDuLI34LFf7yDVpPVfVs1sBFuWo6P9OU1aGyzQZOyH2lTDAyPsQ== X-Received: by 2002:a63:fc20:: with SMTP id j32mr16427172pgi.8.1621905052502; Mon, 24 May 2021 18:10:52 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:52 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 85/92] target/arm: Remove unused fpst from VDOT_scalar Date: Mon, 24 May 2021 18:03:51 -0700 Message-Id: <20210525010358.152808-86-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::435; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x435.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Cut and paste error from another pattern. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate-neon.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c index 1a8fc7fb39..14a9d0d4d3 100644 --- a/target/arm/translate-neon.c +++ b/target/arm/translate-neon.c @@ -325,7 +325,6 @@ static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a) { gen_helper_gvec_4 *fn_gvec; int opr_sz; - TCGv_ptr fpst; if (!dc_isar_feature(aa32_dp, s)) { return false; @@ -347,13 +346,11 @@ static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a) fn_gvec = a->u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b; opr_sz = (1 + a->q) * 8; - fpst = fpstatus_ptr(FPST_STD); tcg_gen_gvec_4_ool(vfp_reg_offset(1, a->vd), vfp_reg_offset(1, a->vn), vfp_reg_offset(1, a->rm), vfp_reg_offset(1, a->vd), opr_sz, opr_sz, a->index, fn_gvec); - tcg_temp_free_ptr(fpst); return true; } From patchwork Tue May 25 01:03:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277655 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A182C2B9F7 for ; Tue, 25 May 2021 02:29:49 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CC3046141D for ; Tue, 25 May 2021 02:29:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CC3046141D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:59346 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMpI-0001R5-0D for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:29:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56516) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLbB-0006YH-NZ for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:09 -0400 Received: from mail-pg1-x536.google.com ([2607:f8b0:4864:20::536]:34591) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLaz-0005t0-At for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:09 -0400 Received: by mail-pg1-x536.google.com with SMTP id l70so21423522pga.1 for ; Mon, 24 May 2021 18:10:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XfuBVEtxztdNyxE+FEFKor3ofjnQkROtvEMf4bpqF2k=; b=CepxoKyW+bc5yvYR605Ifa14nsZh0kF7JUKziiPrUzh2mNI3ZOzDXLx6UVXu46SU5j ZfQwLO0ntxd9QIprRG0V1KZThCpOOJySDnUHnjd9gOji/6d1ew3aJkaJKhJjEV6X9ATl B4Z5HKuVYigGfUNafIuCfwzanTYS07RD2tGIRZ9bvH7JVzz3aeSj7gjKP9q8P5rsD0CN 3LXCLsVfRcVU5PrO4EUiiQFtcaodRQPfA34JjMLHYdX0E8HrO6/2FzEXW6ElzH820Emn PgXNzMZvuWJF40eUPrN/TA/V/yQKMjz6XFUdmSZbDgGgxkWwz3WVzobAKyMhYWlQuebM au+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XfuBVEtxztdNyxE+FEFKor3ofjnQkROtvEMf4bpqF2k=; b=t+f82crm5UY35Bptex4MG8HGQZo4Px3uwBcL79FftUGX6dgV1ljWGMpkwYmek8aPOc zKh2Os1iDl/CltWBXOvryuBfsNmjppG7GqHTJfmUds3L3c/WjhQ5m89pIEatSWpBzs+f zewRoisHMHRaoBTi+Zd7GSKEqSMQcDEHmG3KM+6i/j1h91DrkT0SAbTnJygSXy3HEC5f yjsNbkFLT293C4f4b5fHLCKPvgnPHCogVA8XnLSCTM52msT5vbVq1tA3/AxFtcZcc2h1 3Cg14wCIQ6KJcGlrAu9v6RP7lniJvyYs0EEnRMn8+k8qppiTcsQrDoWMDxAtQJS8oB3g jUuA== X-Gm-Message-State: AOAM533yIgYeeOhMBL9jFRXiaSltJZcHnzOYlfUrsGzu2AbqG6zS2Pbz itSBTIq1RtEpdRBO56/NwmX2TF21J7dy5g== X-Google-Smtp-Source: ABdhPJzY6RMAZGcSZ4upQxFq5weWusDRg+s9HaiXmNtTWjiBQox+7/oJQAeswFVnj03a89SGt8+Omg== X-Received: by 2002:a65:67d5:: with SMTP id b21mr16149681pgs.117.1621905053154; Mon, 24 May 2021 18:10:53 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:52 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 86/92] target/arm: Fix decode for VDOT (indexed) Date: Mon, 24 May 2021 18:03:52 -0700 Message-Id: <20210525010358.152808-87-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::536; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x536.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" We were extracting the M register twice, once incorrectly as M:vm and once correctly as rm. Remove the incorrect name and remove the incorrect decode. Signed-off-by: Richard Henderson --- target/arm/neon-shared.decode | 4 ++-- target/arm/translate-neon.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode index ca0c699072..facb621450 100644 --- a/target/arm/neon-shared.decode +++ b/target/arm/neon-shared.decode @@ -61,8 +61,8 @@ VCMLA_scalar 1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \ VCMLA_scalar 1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \ vm=%vm_dp vn=%vn_dp vd=%vd_dp size=2 index=0 -VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 rm:4 \ - vm=%vm_dp vn=%vn_dp vd=%vd_dp +VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 vm:4 \ + vn=%vn_dp vd=%vd_dp %vfml_scalar_q0_rm 0:3 5:1 %vfml_scalar_q1_index 5:1 3:1 diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c index 14a9d0d4d3..9f7a88aab1 100644 --- a/target/arm/translate-neon.c +++ b/target/arm/translate-neon.c @@ -348,7 +348,7 @@ static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a) opr_sz = (1 + a->q) * 8; tcg_gen_gvec_4_ool(vfp_reg_offset(1, a->vd), vfp_reg_offset(1, a->vn), - vfp_reg_offset(1, a->rm), + vfp_reg_offset(1, a->vm), vfp_reg_offset(1, a->vd), opr_sz, opr_sz, a->index, fn_gvec); return true; From patchwork Tue May 25 01:03:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277581 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 45F52C2B9F7 for ; Tue, 25 May 2021 02:09:33 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CE42861402 for ; Tue, 25 May 2021 02:09:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CE42861402 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:35390 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMVf-0007tu-VF for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:09:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56442) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLb8-0006MN-N3 for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:06 -0400 Received: from mail-pg1-x52a.google.com ([2607:f8b0:4864:20::52a]:35632) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLax-0005tt-Km for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:06 -0400 Received: by mail-pg1-x52a.google.com with SMTP id m190so21418710pga.2 for ; Mon, 24 May 2021 18:10:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=YyP6MM/F+rXrYTyaSgrJ+8NstjNzUT49MS/gVveZWao=; b=VNf+F/nkWyGQkza1s0tTQAq57YbPDC9kbKdYnjXQOHn4CqHLTjBTValDZ7CQu/7Gn8 D1f5yhLUWbGsPR+DkKVxe0DbsLj00hd7IZKutqRa3yCXAbca01bApTrweZkZIYHWYoJa lB6SPWpjrSDWrY5E3cPkC+hXGkitQaq9Xp86YNBTeSgKMQ5PgiyOVFNaNjfpCK9FvzEI q0TIpibQJeZXOy1eRzvae/g/MXIIGwYYEuDB7+R4IeJtEzm5cjACyteaVAGlvFEWQPv0 8dXzU2dwI3atiT6BlPA/fO8j7v1Frcn8V+Zgpej9p3lE+n8UoXwFIv1pg115i26rPG1I Qi4w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=YyP6MM/F+rXrYTyaSgrJ+8NstjNzUT49MS/gVveZWao=; b=m8UIOnAA3qOR5lUudOD6s9x1JexPwsd+K617HKETh2mjDARRA0z+G+dq6+MAHDc2+2 m+NMLIDxw8Pm+agpt8YocQZVto/5AhM6Ppn22RSD2RSIznNezmntaVARhOlevJxRMVPc auochLtdifayhNaE2s0ZXV1tbSrrrT50BG8ucpHdW3GlW9KafbJovB9OPIACVl54X9kf P3uvB+yVAyuUw+wqvHL/Ea5Vgd05YWY1ug3k86cQ9FrKw3xL0EIOMDgEesOP+Qg0fAo4 AtWD+sabH7VV70HAtg0d8poRqtVwRn4Q7gVlDD+VFTkjXAxt6crw2AIK6tafZgYgYGBT oAxQ== X-Gm-Message-State: AOAM533DeufE+UwGoxZWglkpRuTBBiIfR6QpR9VS2nYC95GyVBUqgMdp FsNmkuzPy5q39ZyOoqSSntWQyvEPfDhbig== X-Google-Smtp-Source: ABdhPJybn7GjygMhSlsZQSaRncL6ytxLp7/hd3fOzCEOvtJnwYH5oCYMr7cZ2UNWqd3rn16tHePmqg== X-Received: by 2002:a63:5126:: with SMTP id f38mr3181875pgb.402.1621905053791; Mon, 24 May 2021 18:10:53 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:53 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 87/92] target/arm: Split out do_neon_ddda Date: Mon, 24 May 2021 18:03:53 -0700 Message-Id: <20210525010358.152808-88-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52a; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Split out a helper that can handle the 4-register format for helpers shared with SVE. Signed-off-by: Richard Henderson --- target/arm/translate-neon.c | 90 ++++++++++++++++--------------------- 1 file changed, 38 insertions(+), 52 deletions(-) diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c index 9f7a88aab1..dfa33912ab 100644 --- a/target/arm/translate-neon.c +++ b/target/arm/translate-neon.c @@ -151,6 +151,36 @@ static void neon_store_element64(int reg, int ele, MemOp size, TCGv_i64 var) } } +static bool do_neon_ddda(DisasContext *s, int q, int vd, int vn, int vm, + int data, gen_helper_gvec_4 *fn_gvec) +{ + /* UNDEF accesses to D16-D31 if they don't exist. */ + if (((vd | vn | vm) & 0x10) && !dc_isar_feature(aa32_simd_r32, s)) { + return false; + } + + /* + * UNDEF accesses to odd registers for each bit of Q. + * Q will be 0b111 for all Q-reg instructions, otherwise + * when we have mixed Q- and D-reg inputs. + */ + if (((vd & 1) * 4 | (vn & 1) * 2 | (vm & 1)) & q) { + return false; + } + + if (!vfp_access_check(s)) { + return true; + } + + int opr_sz = q ? 16 : 8; + tcg_gen_gvec_4_ool(vfp_reg_offset(1, vd), + vfp_reg_offset(1, vn), + vfp_reg_offset(1, vm), + vfp_reg_offset(1, vd), + opr_sz, opr_sz, data, fn_gvec); + return true; +} + static bool do_neon_ddda_fpst(DisasContext *s, int q, int vd, int vn, int vm, int data, ARMFPStatusFlavour fp_flavour, gen_helper_gvec_4_ptr *fn_gvec_ptr) @@ -241,35 +271,13 @@ static bool trans_VCADD(DisasContext *s, arg_VCADD *a) static bool trans_VDOT(DisasContext *s, arg_VDOT *a) { - int opr_sz; - gen_helper_gvec_4 *fn_gvec; - if (!dc_isar_feature(aa32_dp, s)) { return false; } - - /* UNDEF accesses to D16-D31 if they don't exist. */ - if (!dc_isar_feature(aa32_simd_r32, s) && - ((a->vd | a->vn | a->vm) & 0x10)) { - return false; - } - - if ((a->vn | a->vm | a->vd) & a->q) { - return false; - } - - if (!vfp_access_check(s)) { - return true; - } - - opr_sz = (1 + a->q) * 8; - fn_gvec = a->u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b; - tcg_gen_gvec_4_ool(vfp_reg_offset(1, a->vd), - vfp_reg_offset(1, a->vn), - vfp_reg_offset(1, a->vm), - vfp_reg_offset(1, a->vd), - opr_sz, opr_sz, 0, fn_gvec); - return true; + return do_neon_ddda(s, a->q * 7, a->vd, a->vn, a->vm, 0, + a->u + ? gen_helper_gvec_udot_b + : gen_helper_gvec_sdot_b); } static bool trans_VFML(DisasContext *s, arg_VFML *a) @@ -323,35 +331,13 @@ static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a) static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a) { - gen_helper_gvec_4 *fn_gvec; - int opr_sz; - if (!dc_isar_feature(aa32_dp, s)) { return false; } - - /* UNDEF accesses to D16-D31 if they don't exist. */ - if (!dc_isar_feature(aa32_simd_r32, s) && - ((a->vd | a->vn) & 0x10)) { - return false; - } - - if ((a->vd | a->vn) & a->q) { - return false; - } - - if (!vfp_access_check(s)) { - return true; - } - - fn_gvec = a->u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b; - opr_sz = (1 + a->q) * 8; - tcg_gen_gvec_4_ool(vfp_reg_offset(1, a->vd), - vfp_reg_offset(1, a->vn), - vfp_reg_offset(1, a->vm), - vfp_reg_offset(1, a->vd), - opr_sz, opr_sz, a->index, fn_gvec); - return true; + return do_neon_ddda(s, a->q * 6, a->vd, a->vn, a->vm, a->index, + a->u + ? gen_helper_gvec_udot_idx_b + : gen_helper_gvec_sdot_idx_b); } static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a) From patchwork Tue May 25 01:03:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277653 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C7B0C2B9F7 for ; Tue, 25 May 2021 02:28:10 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E47FA613EC for ; Tue, 25 May 2021 02:28:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E47FA613EC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:56800 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMnh-00087z-34 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:28:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56474) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLbA-0006S5-4L for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:08 -0400 Received: from mail-pj1-x1035.google.com ([2607:f8b0:4864:20::1035]:55863) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLax-0005uN-Kq for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:07 -0400 Received: by mail-pj1-x1035.google.com with SMTP id kr9so7802045pjb.5 for ; Mon, 24 May 2021 18:10:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=rqEfrgTAxZRSYWJVsj3UhKlDudjDrwfnyFzuC7wZThU=; b=gkaMUMjjIQbaRhIuF+waM6nyriWD1Y1lKClG5hb5Gxre+jAZvSxQ7vuj8BDVGlrjo6 PAD9oDpnbLBIVQfaYYl3a/H0GQ4q5QfI8pDYxOA/VNCR2m0XZwd0+JleVX4MlhmRyM/r NxyLQenJHtEC4G99ekv/IV1cc3cx8lRNr8L38RFMh+Ly6WAv/B133M0qrCm3ZJXY+xAi yfqBl4lTvtcrl0xujbe88fcWgMaCLoZuHkXDvc14JSxO/1NO/UBO4cxIJpKOwRUg3RP7 mevJGaw1HV+mg5oiqzhfk47m4gYg/DE9AyfH+mDbqg3IDBFhSopd5kpqiUTbAL4u9Bpy wpjg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=rqEfrgTAxZRSYWJVsj3UhKlDudjDrwfnyFzuC7wZThU=; b=qR2FSAEjCEjBPXVoOrJOboHgVEK7fQTe66T6ZKIsaz/QYkirvqZLBT5IdzKjXHXIXf UcIAc/nVDrfhC+TjpuUAKrhmv2C9RxuaSp3Sc/O9HeT58Gd7AWOIGB7JdFBlj00R5m06 gPdcsz9p5owOPVZGpL/Mj31F3P9nykrnS4KRn/3wpsreOU7USA0SS5HXAyZVB5Bak+1t YnFDC/Z3Kd+qoosBM48nIn24RJ6q6zuO5dZhzAhoVAvzGUv81pTbGCUDdWb8MtuV6NaN 8ca5EfHX7iJGWgm3IHwFe6ch2aMAQHBuiQvM2xbWrK475aA2ZBmYqgcMv2oeb7vbnMhf G9ug== X-Gm-Message-State: AOAM530V+mRWPgRmb4V7bfzy4deXejz0kcg8zIdz5CYsraDAGZ0DF05T a7LsjstRKhZ472JG8DJzZ3/6P4OXoShqfg== X-Google-Smtp-Source: ABdhPJzM28QoD7CEcC4/yisFI75eCDz55hhzLqp4BlSp1dvOpAbIwpIDTyu/ryrlNPKMOr6Hfbjsew== X-Received: by 2002:a17:90b:108e:: with SMTP id gj14mr2041542pjb.109.1621905054416; Mon, 24 May 2021 18:10:54 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:54 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 88/92] target/arm: Split decode of VSDOT and VUDOT Date: Mon, 24 May 2021 18:03:54 -0700 Message-Id: <20210525010358.152808-89-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1035; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1035.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Now that we have a common helper, sharing decode does not save much. Also, this will solve an upcoming naming problem. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/neon-shared.decode | 9 ++++++--- target/arm/translate-neon.c | 30 ++++++++++++++++++++++-------- 2 files changed, 28 insertions(+), 11 deletions(-) diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode index facb621450..2d94369750 100644 --- a/target/arm/neon-shared.decode +++ b/target/arm/neon-shared.decode @@ -46,8 +46,9 @@ VCMLA 1111 110 rot:2 . 1 . .... .... 1000 . q:1 . 0 .... \ VCADD 1111 110 rot:1 1 . 0 . .... .... 1000 . q:1 . 0 .... \ vm=%vm_dp vn=%vn_dp vd=%vd_dp size=%vcadd_size -# VUDOT and VSDOT -VDOT 1111 110 00 . 10 .... .... 1101 . q:1 . u:1 .... \ +VSDOT 1111 110 00 . 10 .... .... 1101 . q:1 . 0 .... \ + vm=%vm_dp vn=%vn_dp vd=%vd_dp +VUDOT 1111 110 00 . 10 .... .... 1101 . q:1 . 1 .... \ vm=%vm_dp vn=%vn_dp vd=%vd_dp # VFM[AS]L @@ -61,7 +62,9 @@ VCMLA_scalar 1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \ VCMLA_scalar 1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \ vm=%vm_dp vn=%vn_dp vd=%vd_dp size=2 index=0 -VDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 u:1 vm:4 \ +VSDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 0 vm:4 \ + vn=%vn_dp vd=%vd_dp +VUDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 1 vm:4 \ vn=%vn_dp vd=%vd_dp %vfml_scalar_q0_rm 0:3 5:1 diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c index dfa33912ab..386b42fe4b 100644 --- a/target/arm/translate-neon.c +++ b/target/arm/translate-neon.c @@ -269,15 +269,22 @@ static bool trans_VCADD(DisasContext *s, arg_VCADD *a) return true; } -static bool trans_VDOT(DisasContext *s, arg_VDOT *a) +static bool trans_VSDOT(DisasContext *s, arg_VSDOT *a) { if (!dc_isar_feature(aa32_dp, s)) { return false; } return do_neon_ddda(s, a->q * 7, a->vd, a->vn, a->vm, 0, - a->u - ? gen_helper_gvec_udot_b - : gen_helper_gvec_sdot_b); + gen_helper_gvec_sdot_b); +} + +static bool trans_VUDOT(DisasContext *s, arg_VUDOT *a) +{ + if (!dc_isar_feature(aa32_dp, s)) { + return false; + } + return do_neon_ddda(s, a->q * 7, a->vd, a->vn, a->vm, 0, + gen_helper_gvec_udot_b); } static bool trans_VFML(DisasContext *s, arg_VFML *a) @@ -329,15 +336,22 @@ static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a) FPST_STD, gen_helper_gvec_fcmlas_idx); } -static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a) +static bool trans_VSDOT_scalar(DisasContext *s, arg_VSDOT_scalar *a) { if (!dc_isar_feature(aa32_dp, s)) { return false; } return do_neon_ddda(s, a->q * 6, a->vd, a->vn, a->vm, a->index, - a->u - ? gen_helper_gvec_udot_idx_b - : gen_helper_gvec_sdot_idx_b); + gen_helper_gvec_sdot_idx_b); +} + +static bool trans_VUDOT_scalar(DisasContext *s, arg_VUDOT_scalar *a) +{ + if (!dc_isar_feature(aa32_dp, s)) { + return false; + } + return do_neon_ddda(s, a->q * 6, a->vd, a->vn, a->vm, a->index, + gen_helper_gvec_udot_idx_b); } static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a) From patchwork Tue May 25 01:03:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277591 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E5A2C2B9F7 for ; Tue, 25 May 2021 02:14:15 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DB5EA6141D for ; Tue, 25 May 2021 02:14:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DB5EA6141D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:53654 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMaD-0003L5-VC for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:14:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56512) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLbB-0006XM-Fz for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:09 -0400 Received: from mail-pj1-x1029.google.com ([2607:f8b0:4864:20::1029]:43956) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLaz-0005uV-Ax for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:09 -0400 Received: by mail-pj1-x1029.google.com with SMTP id ep16-20020a17090ae650b029015d00f578a8so12192255pjb.2 for ; Mon, 24 May 2021 18:10:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Uee7s+qdhdMQ6Jii7QCfRZaZt20ac2r4W/zjmra8qs4=; b=c/pTUl6Mq2nxwiOXhtIVXpgFn6bEz5yNnvG/KGxRWOqF1M1+iiBhmFX6xCy3GeY6Gi CKW+GGm3/3cWozkHL5XqyteehWGU0KK6jIppgetGul4vRpchvkqnSUTRrqWhZnhk4nmN R3H5FAcCcLdkSC/SX1ZBlYTzo3K8KJPkqbqe+ocRt/r94RCNRVWoPO/rMqd5RfDWd42i IoWOKcRaFgKTMiQ/naCXtcGWSqJSPjGfWSKXvlikzSWIzNX1vJKYnStjO0/ij6pGhlxs Xdo1dyGs/qOkifgzUFQWW4dB1ubianm5M/nI59QVittxYvAiTaj46bGYFNUVSRBLe4FX yP5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Uee7s+qdhdMQ6Jii7QCfRZaZt20ac2r4W/zjmra8qs4=; b=G10aIugL554ibJ66RddjKhj78bxcTgw6MNyyEdz9QMRAFHj7UIZdytz0V7oVwUcPkQ p3U1ZNFmNsmqb7g2FYQQ0c8F/QScOmPFEscKlxwNmtkuOU4a11CeJgzGkiwXo0hm+n0Z 9EpJVhY0IGYsWrhRyKqGfHeLD+3nWqaE2/AX6qdZwtdCMAgAAok+13TPJOGvupSXl2Zp xP4j5ZxI3ipN5Qu5fxkBl5GbAPVY+Pl4k3EWkSYPKufovmXvz/5wp7Yv97vvMQydgbPU n3U6fR9rhiKJfZzOZprPtsV5ca7jrurYHmDE3SP1HH2VLeAkkubQ58OjDuaTkc5REUit ACpA== X-Gm-Message-State: AOAM532kEYMdToF8nPVoxibkrdxQXT0QgyeP5xrxWdyZQmJwofkZuYLU +fZ+gX9E2wpeM4sUAnhEcLU7ACIgWjoF7g== X-Google-Smtp-Source: ABdhPJyge3jUbWSa/GGPsmTf0xupLvMEMQ6WqY3c6RrWFKjMLrUXnY5atm1RUJhuLQKOK+h42LRMow== X-Received: by 2002:a17:90b:687:: with SMTP id m7mr26993536pjz.133.1621905055098; Mon, 24 May 2021 18:10:55 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:54 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 89/92] target/arm: Implement aarch32 VSUDOT, VUSDOT Date: Mon, 24 May 2021 18:03:55 -0700 Message-Id: <20210525010358.152808-90-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1029; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1029.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/cpu.h | 5 +++++ target/arm/neon-shared.decode | 6 ++++++ target/arm/translate-neon.c | 27 +++++++++++++++++++++++++++ 3 files changed, 38 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index b2b684df55..5850db90f3 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -3783,6 +3783,11 @@ static inline bool isar_feature_aa32_predinv(const ARMISARegisters *id) return FIELD_EX32(id->id_isar6, ID_ISAR6, SPECRES) != 0; } +static inline bool isar_feature_aa32_i8mm(const ARMISARegisters *id) +{ + return FIELD_EX32(id->id_isar6, ID_ISAR6, I8MM) != 0; +} + static inline bool isar_feature_aa32_ras(const ARMISARegisters *id) { return FIELD_EX32(id->id_pfr0, ID_PFR0, RAS) != 0; diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode index 2d94369750..5befaec87b 100644 --- a/target/arm/neon-shared.decode +++ b/target/arm/neon-shared.decode @@ -50,6 +50,8 @@ VSDOT 1111 110 00 . 10 .... .... 1101 . q:1 . 0 .... \ vm=%vm_dp vn=%vn_dp vd=%vd_dp VUDOT 1111 110 00 . 10 .... .... 1101 . q:1 . 1 .... \ vm=%vm_dp vn=%vn_dp vd=%vd_dp +VUSDOT 1111 110 01 . 10 .... .... 1101 . q:1 . 0 .... \ + vm=%vm_dp vn=%vn_dp vd=%vd_dp # VFM[AS]L VFML 1111 110 0 s:1 . 10 .... .... 1000 . 0 . 1 .... \ @@ -66,6 +68,10 @@ VSDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 0 vm:4 \ vn=%vn_dp vd=%vd_dp VUDOT_scalar 1111 1110 0 . 10 .... .... 1101 . q:1 index:1 1 vm:4 \ vn=%vn_dp vd=%vd_dp +VUSDOT_scalar 1111 1110 1 . 00 .... .... 1101 . q:1 index:1 0 vm:4 \ + vn=%vn_dp vd=%vd_dp +VSUDOT_scalar 1111 1110 1 . 00 .... .... 1101 . q:1 index:1 1 vm:4 \ + vn=%vn_dp vd=%vd_dp %vfml_scalar_q0_rm 0:3 5:1 %vfml_scalar_q1_index 5:1 3:1 diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c index 386b42fe4b..b6ca29c25c 100644 --- a/target/arm/translate-neon.c +++ b/target/arm/translate-neon.c @@ -287,6 +287,15 @@ static bool trans_VUDOT(DisasContext *s, arg_VUDOT *a) gen_helper_gvec_udot_b); } +static bool trans_VUSDOT(DisasContext *s, arg_VUSDOT *a) +{ + if (!dc_isar_feature(aa32_i8mm, s)) { + return false; + } + return do_neon_ddda(s, a->q * 7, a->vd, a->vn, a->vm, 0, + gen_helper_gvec_usdot_b); +} + static bool trans_VFML(DisasContext *s, arg_VFML *a) { int opr_sz; @@ -354,6 +363,24 @@ static bool trans_VUDOT_scalar(DisasContext *s, arg_VUDOT_scalar *a) gen_helper_gvec_udot_idx_b); } +static bool trans_VUSDOT_scalar(DisasContext *s, arg_VUSDOT_scalar *a) +{ + if (!dc_isar_feature(aa32_i8mm, s)) { + return false; + } + return do_neon_ddda(s, a->q * 6, a->vd, a->vn, a->vm, a->index, + gen_helper_gvec_usdot_idx_b); +} + +static bool trans_VSUDOT_scalar(DisasContext *s, arg_VSUDOT_scalar *a) +{ + if (!dc_isar_feature(aa32_i8mm, s)) { + return false; + } + return do_neon_ddda(s, a->q * 6, a->vd, a->vn, a->vm, a->index, + gen_helper_gvec_sudot_idx_b); +} + static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a) { int opr_sz; From patchwork Tue May 25 01:03:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277587 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0B87C2B9F7 for ; Tue, 25 May 2021 02:11:49 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3258D600D3 for ; Tue, 25 May 2021 02:11:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3258D600D3 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:45160 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMXs-0005yV-8L for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:11:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56484) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLbA-0006T4-Bb for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:08 -0400 Received: from mail-pl1-x635.google.com ([2607:f8b0:4864:20::635]:37538) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLaz-0005uf-AB for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:08 -0400 Received: by mail-pl1-x635.google.com with SMTP id u7so6946738plq.4 for ; Mon, 24 May 2021 18:10:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=DX7CTp5EweNzeq/zxrwefxTe7CRateXrXF2rNVC4RlA=; b=YAfnwtSaJ/7rQazrG+hmfQueP3oficUrWcPZ/+BQY1ny6Kran0+rnyy/reeHYMqeny XSr/ABO1AS13dGdzrGJ+HfLmPPga9aqfJunxE5cSWQPlkCcNdNAMhJMASlxqVSvXT3+z HhmaiZ7HLGwsB0AQb9dvdYcaWDH9pBLktKIllC0F0oejsWQMXW1NaCMZ3UvTQugen8jD HIktsg93hHwydPoysZsta6TIB7dJy2SwzDbU2aL6pYR1XdQJcClZoiYl9VMntpUaJh1m 3zUsqVfLUzwa6lRirnxMib8gLDESKirKltoxEf0zhNAM16Z8BaNqgbPvF4z5AyFf7uit qGSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=DX7CTp5EweNzeq/zxrwefxTe7CRateXrXF2rNVC4RlA=; b=CaCjvXiJTa/i5pXkn6kg6BpCa0VgArkJSLw++Fpas02X5S27S4+I2XiEK4O32tUYCt ejPLdoRQqKcTYaOElgFrC039z1f8YurYd3ZSUXMWr9VB4gtqUy8dWjNKx5Eax0RJxXIx hcYnzFLonAuT80wuHcxN764+5WeB2gVm2CEjRUu9EdxU0zCIctvu9qtFjhueFepU0jPc JtDN9/+BvLNO6E2omqFio9NnCUcebxOHxwjVMcWTHf3S4T6tD3xCyiZs4dCUQ8XLrzEW 5q4iN1XmkkQ/ssDhw+oSawMyaWjin/SrxqjpOHlyoZJeOCYTZEx+UnSTbRAMjsFHXccS WyRw== X-Gm-Message-State: AOAM532pR1N/ByIB4DP55KhU6yrro9Nbaq+hns93ZjeIbv8lTeAvTdYX iT01NoPk6H9qRNljSwjGy4n6f564ur2R0Q== X-Google-Smtp-Source: ABdhPJxKoROolzZtPkJH6oAr2ZfMjQH9tJ9HwCVnXSNKOhRZ+rq/hcONLBxJRWTxNYN2IbNAm9E02g== X-Received: by 2002:a17:902:720c:b029:ee:f427:91e9 with SMTP id ba12-20020a170902720cb02900eef42791e9mr28153520plb.72.1621905055865; Mon, 24 May 2021 18:10:55 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:55 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 90/92] target/arm: Implement integer matrix multiply accumulate Date: Mon, 24 May 2021 18:03:56 -0700 Message-Id: <20210525010358.152808-91-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::635; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x635.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" This is {S,U,US}MMLA for both AArch64 AdvSIMD and SVE, and V{S,U,US}MMLA.S8 for AArch32 NEON. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 7 ++++ target/arm/neon-shared.decode | 7 ++++ target/arm/sve.decode | 6 +++ target/arm/translate-a64.c | 18 ++++++++ target/arm/translate-neon.c | 27 ++++++++++++ target/arm/translate-sve.c | 27 ++++++++++++ target/arm/vec_helper.c | 77 +++++++++++++++++++++++++++++++++++ 7 files changed, 169 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index 92b81bbabe..23ccb0f72f 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -993,6 +993,13 @@ DEF_HELPER_FLAGS_6(sve2_fmlal_zzxw_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_4(gvec_xar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_smmla_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_ummla_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_usmmla_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode index 5befaec87b..cc9f4cdd85 100644 --- a/target/arm/neon-shared.decode +++ b/target/arm/neon-shared.decode @@ -59,6 +59,13 @@ VFML 1111 110 0 s:1 . 10 .... .... 1000 . 0 . 1 .... \ VFML 1111 110 0 s:1 . 10 .... .... 1000 . 1 . 1 .... \ vm=%vm_dp vn=%vn_dp vd=%vd_dp q=1 +VSMMLA 1111 1100 0.10 .... .... 1100 .1.0 .... \ + vm=%vm_dp vn=%vn_dp vd=%vd_dp +VUMMLA 1111 1100 0.10 .... .... 1100 .1.1 .... \ + vm=%vm_dp vn=%vn_dp vd=%vd_dp +VUSMMLA 1111 1100 1.10 .... .... 1100 .1.0 .... \ + vm=%vm_dp vn=%vn_dp vd=%vd_dp + VCMLA_scalar 1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \ vn=%vn_dp vd=%vd_dp size=1 VCMLA_scalar 1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 78a2a31ab1..cb077bfde9 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1413,6 +1413,12 @@ USHLLT 01000101 .. 0 ..... 1010 11 ..... ..... @rd_rn_tszimm_shl EORBT 01000101 .. 0 ..... 10010 0 ..... ..... @rd_rn_rm EORTB 01000101 .. 0 ..... 10010 1 ..... ..... @rd_rn_rm +## SVE integer matrix multiply accumulate + +SMMLA 01000101 00 0 ..... 10011 0 ..... ..... @rda_rn_rm_e0 +USMMLA 01000101 10 0 ..... 10011 0 ..... ..... @rda_rn_rm_e0 +UMMLA 01000101 11 0 ..... 10011 0 ..... ..... @rda_rn_rm_e0 + ## SVE2 bitwise permute BEXT 01000101 .. 0 ..... 1011 00 ..... ..... @rd_rn_rm diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index c875481784..ceac0ee2bd 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -12182,6 +12182,15 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) } feature = dc_isar_feature(aa64_i8mm, s); break; + case 0x04: /* SMMLA */ + case 0x14: /* UMMLA */ + case 0x05: /* USMMLA */ + if (!is_q || size != MO_32) { + unallocated_encoding(s); + return; + } + feature = dc_isar_feature(aa64_i8mm, s); + break; case 0x18: /* FCMLA, #0 */ case 0x19: /* FCMLA, #90 */ case 0x1a: /* FCMLA, #180 */ @@ -12226,6 +12235,15 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, 0, gen_helper_gvec_usdot_b); return; + case 0x04: /* SMMLA, UMMLA */ + gen_gvec_op4_ool(s, 1, rd, rn, rm, rd, 0, + u ? gen_helper_gvec_ummla_b + : gen_helper_gvec_smmla_b); + return; + case 0x05: /* USMMLA */ + gen_gvec_op4_ool(s, 1, rd, rn, rm, rd, 0, gen_helper_gvec_usmmla_b); + return; + case 0x8: /* FCMLA, #0 */ case 0x9: /* FCMLA, #90 */ case 0xa: /* FCMLA, #180 */ diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c index b6ca29c25c..9e990b41ed 100644 --- a/target/arm/translate-neon.c +++ b/target/arm/translate-neon.c @@ -4036,3 +4036,30 @@ static bool trans_VTRN(DisasContext *s, arg_2misc *a) tcg_temp_free_i32(tmp2); return true; } + +static bool trans_VSMMLA(DisasContext *s, arg_VSMMLA *a) +{ + if (!dc_isar_feature(aa32_i8mm, s)) { + return false; + } + return do_neon_ddda(s, 7, a->vd, a->vn, a->vm, 0, + gen_helper_gvec_smmla_b); +} + +static bool trans_VUMMLA(DisasContext *s, arg_VUMMLA *a) +{ + if (!dc_isar_feature(aa32_i8mm, s)) { + return false; + } + return do_neon_ddda(s, 7, a->vd, a->vn, a->vm, 0, + gen_helper_gvec_ummla_b); +} + +static bool trans_VUSMMLA(DisasContext *s, arg_VUSMMLA *a) +{ + if (!dc_isar_feature(aa32_i8mm, s)) { + return false; + } + return do_neon_ddda(s, 7, a->vd, a->vn, a->vm, 0, + gen_helper_gvec_usmmla_b); +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 428ae018a3..9574efe957 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8610,3 +8610,30 @@ static bool trans_FMLSLT_zzxw(DisasContext *s, arg_rrxr_esz *a) { return do_FMLAL_zzxw(s, a, true, true); } + +static bool do_i8mm_zzzz_ool(DisasContext *s, arg_rrrr_esz *a, + gen_helper_gvec_4 *fn, int data) +{ + if (!dc_isar_feature(aa64_sve_i8mm, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, data); + } + return true; +} + +static bool trans_SMMLA(DisasContext *s, arg_rrrr_esz *a) +{ + return do_i8mm_zzzz_ool(s, a, gen_helper_gvec_smmla_b, 0); +} + +static bool trans_USMMLA(DisasContext *s, arg_rrrr_esz *a) +{ + return do_i8mm_zzzz_ool(s, a, gen_helper_gvec_usmmla_b, 0); +} + +static bool trans_UMMLA(DisasContext *s, arg_rrrr_esz *a) +{ + return do_i8mm_zzzz_ool(s, a, gen_helper_gvec_ummla_b, 0); +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 19c4ba1bdf..e84b438340 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -2335,3 +2335,80 @@ void HELPER(gvec_xar_d)(void *vd, void *vn, void *vm, uint32_t desc) } clear_tail(d, opr_sz * 8, simd_maxsz(desc)); } + +/* + * Integer matrix-multiply accumulate + */ + +static uint32_t do_smmla_b(uint32_t sum, void *vn, void *vm) +{ + int8_t *n = vn, *m = vm; + + for (intptr_t k = 0; k < 8; ++k) { + sum += n[H1(k)] * m[H1(k)]; + } + return sum; +} + +static uint32_t do_ummla_b(uint32_t sum, void *vn, void *vm) +{ + uint8_t *n = vn, *m = vm; + + for (intptr_t k = 0; k < 8; ++k) { + sum += n[H1(k)] * m[H1(k)]; + } + return sum; +} + +static uint32_t do_usmmla_b(uint32_t sum, void *vn, void *vm) +{ + uint8_t *n = vn; + int8_t *m = vm; + + for (intptr_t k = 0; k < 8; ++k) { + sum += n[H1(k)] * m[H1(k)]; + } + return sum; +} + +static void do_mmla_b(void *vd, void *vn, void *vm, void *va, uint32_t desc, + uint32_t (*inner_loop)(uint32_t, void *, void *)) +{ + intptr_t seg, opr_sz = simd_oprsz(desc); + + for (seg = 0; seg < opr_sz; seg += 16) { + uint32_t *d = vd + seg; + uint32_t *a = va + seg; + uint32_t sum0, sum1, sum2, sum3; + + /* + * Process the entire segment at once, writing back the + * results only after we've consumed all of the inputs. + * + * Key to indicies by column: + * i j i j + */ + sum0 = a[H4(0 + 0)]; + sum0 = inner_loop(sum0, vn + seg + 0, vm + seg + 0); + sum1 = a[H4(0 + 1)]; + sum1 = inner_loop(sum1, vn + seg + 0, vm + seg + 8); + sum2 = a[H4(2 + 0)]; + sum2 = inner_loop(sum2, vn + seg + 8, vm + seg + 0); + sum3 = a[H4(2 + 1)]; + sum3 = inner_loop(sum3, vn + seg + 8, vm + seg + 8); + + d[H4(0)] = sum0; + d[H4(1)] = sum1; + d[H4(2)] = sum2; + d[H4(3)] = sum3; + } + clear_tail(vd, opr_sz, simd_maxsz(desc)); +} + +#define DO_MMLA_B(NAME, INNER) \ + void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ + { do_mmla_b(vd, vn, vm, va, desc, INNER); } + +DO_MMLA_B(gvec_smmla_b, do_smmla_b) +DO_MMLA_B(gvec_ummla_b, do_ummla_b) +DO_MMLA_B(gvec_usmmla_b, do_usmmla_b) From patchwork Tue May 25 01:03:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277657 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 20C5BC2B9F7 for ; Tue, 25 May 2021 02:31:02 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B280561420 for ; Tue, 25 May 2021 02:31:01 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B280561420 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:33272 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMqS-0002xS-T6 for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:31:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56568) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLbD-0006eH-5c for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:11 -0400 Received: from mail-pf1-x42b.google.com ([2607:f8b0:4864:20::42b]:35530) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLb0-0005up-Cc for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:10 -0400 Received: by mail-pf1-x42b.google.com with SMTP id g18so20619353pfr.2 for ; Mon, 24 May 2021 18:10:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=KPcdnZuAqCJG4HBYcNFFV8xlK+AmV9NrhFUm0MsLHPQ=; b=QPr1qhjCoLCnXWo91jcXi/EuFp5jx+GwHMkBv6dYC4476in/lfjglA2QVcoZRIDxsk 5qHOeXTg+INbnFyqdrJw/ci6A3k0QQNzw+N8O9YQ2faQAPu+4VXMJak3SYgHinIH3awu xFWhTIEae3wVeD2HYtMFU/vjEyXdLtN9q6AbogBHjDLMz55j9OpZaBm2/bFIloMfU7fq 6iGZOSwPK/sRmLVZHupNSwQXNRTo4jDmVqwL1oMvllCTm0104lwJ4YMPzcrjBg/SqRrz Rp8KylJ8sCzkZ94q7eSmIapmRWrLV3KSFv5vAYbGhybklMrsQwgQDkJkWCCKYnM1y3yw sTJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KPcdnZuAqCJG4HBYcNFFV8xlK+AmV9NrhFUm0MsLHPQ=; b=ZERCwLstlGw4Lq1qJJ24xUwxnvc8R9PsepXAmmGzU+tTXPW9w8kfeokwItEduMidz7 dmr4T2JAitkpzM+WzXgvYak2TtZm3Wm6FxoHAlAb5LwR+/dYKemnihMekIXXSxT331XS Ka5eJPzU9tO/yoqc8TyoOuaYNMPu5m1dHMnVrQVPwvMs1PNTbjGX6n3QEuqIANgHU5MC MM/gAuNSBr8YqDbb1xJc8tiFTH63gvEcTtyDt/VxUmsR99SLz9JGsDasP3XvEX3Su3uz cU8Hd+ixjOoMS2UMXZMDmWR9mHsQG9a0dqNihQg/w+owDiwmGAp0p4M0fEvpmU6DK0l/ VtWw== X-Gm-Message-State: AOAM530QSks43EtiM+FEIgTq+0A5gNqu5Qynr4ROfBOVfmLRfJyiqUxn jP27anOBdmM6yRLStqoFq3pihbc7WTP1Ww== X-Google-Smtp-Source: ABdhPJydy9damQcqDSPJ2FMCof/IyWj6B5hhWv3z6BleQ/5iZCT6FfPUu6cxPnyJoUdcP6EdEFVz+Q== X-Received: by 2002:a62:1a10:0:b029:28d:1590:d204 with SMTP id a16-20020a621a100000b029028d1590d204mr27415066pfa.78.1621905056463; Mon, 24 May 2021 18:10:56 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:56 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 91/92] linux-user/aarch64: Enable hwcap bits for sve2 and related extensions Date: Mon, 24 May 2021 18:03:57 -0700 Message-Id: <20210525010358.152808-92-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42b; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- linux-user/elfload.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/linux-user/elfload.c b/linux-user/elfload.c index 0e832b2649..1ab97e38e0 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -648,8 +648,18 @@ static uint32_t get_elf_hwcap2(void) uint32_t hwcaps = 0; GET_FEATURE_ID(aa64_dcpodp, ARM_HWCAP2_A64_DCPODP); + GET_FEATURE_ID(aa64_sve2, ARM_HWCAP2_A64_SVE2); + GET_FEATURE_ID(aa64_sve2_aes, ARM_HWCAP2_A64_SVEAES); + GET_FEATURE_ID(aa64_sve2_pmull128, ARM_HWCAP2_A64_SVEPMULL); + GET_FEATURE_ID(aa64_sve2_bitperm, ARM_HWCAP2_A64_SVEBITPERM); + GET_FEATURE_ID(aa64_sve2_sha3, ARM_HWCAP2_A64_SVESHA3); + GET_FEATURE_ID(aa64_sve2_sm4, ARM_HWCAP2_A64_SVESM4); GET_FEATURE_ID(aa64_condm_5, ARM_HWCAP2_A64_FLAGM2); GET_FEATURE_ID(aa64_frint, ARM_HWCAP2_A64_FRINT); + GET_FEATURE_ID(aa64_sve_i8mm, ARM_HWCAP2_A64_SVEI8MM); + GET_FEATURE_ID(aa64_sve_f32mm, ARM_HWCAP2_A64_SVEF32MM); + GET_FEATURE_ID(aa64_sve_f64mm, ARM_HWCAP2_A64_SVEF64MM); + GET_FEATURE_ID(aa64_i8mm, ARM_HWCAP2_A64_I8MM); GET_FEATURE_ID(aa64_rndr, ARM_HWCAP2_A64_RNG); GET_FEATURE_ID(aa64_bti, ARM_HWCAP2_A64_BTI); GET_FEATURE_ID(aa64_mte, ARM_HWCAP2_A64_MTE); From patchwork Tue May 25 01:03:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12277595 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1870EC2B9F7 for ; Tue, 25 May 2021 02:16:36 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C64BF61360 for ; Tue, 25 May 2021 02:16:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C64BF61360 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:33820 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llMcU-0000gp-PC for qemu-devel@archiver.kernel.org; Mon, 24 May 2021 22:16:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56572) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llLbD-0006fe-CK for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:11 -0400 Received: from mail-pf1-x42d.google.com ([2607:f8b0:4864:20::42d]:44657) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llLb0-0005vE-Cw for qemu-devel@nongnu.org; Mon, 24 May 2021 21:11:11 -0400 Received: by mail-pf1-x42d.google.com with SMTP id 22so21887414pfv.11 for ; Mon, 24 May 2021 18:10:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=VPedUl43P4vqxrl3NSrXGTDi5BxZimL9v1f9QzCFBpQ=; b=kv2hmvyqSmdFYk30U8N88nW4kg8v6IYVRG9JGXd4z9wBBjuoGNWH1tthMaMC7FM0IP gONJwLNhKL90h5H7vEFZ6tNpcOOjlDw+UWmPAk9/ePJtZJTAzLhgl2GqVqc9b3k8JOrH SzvfPHC8d30KcJObv8t1XtU3wUWEREtwVBjzmy1GvUppriKuYpimpIBCuemjwfMYmSNJ JVARX0WeW8dYc4V5Rw5hSLfi7udQgrhuNZGdysWkEReVZKRZll2FhffPMe0bVdDn3FNe LMlZ9pCBAKJqs55ZW1yx/UWdYT2zPfFFiAj/wkqDPIdRAaAWAsF5qagmVysw+p8tkW4p FFZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=VPedUl43P4vqxrl3NSrXGTDi5BxZimL9v1f9QzCFBpQ=; b=Rw2cI4ZksdabrC05kiXPH2BAp/cEo81bEADJhU2yze+KZMm5NVrzustECPyYgnmvch QpE52rZr8CDC5u383dq5zuESiex/vdvFdo0Kp4TtUnFquUE6Fei0mJWXJLagvJ1U3+sF 0UASwTyT5hOquP36h691hn2Y1z20hEZCa23meXdSoO/vVYSejrVzQNY7jGWKcwJx7bnm c1+wAdqC03RSnJR+bk5WvJAxDCINVnDoYLgS6Om4oWTlJ6yqlqKWBRs32sJI65g3PBwp XDhX91BzlUPB0LLj81u7BemTEzouJerTp9+OWlmsBWn7q7J6zEjJMw5mNYC2Wo3P1oDt tFdA== X-Gm-Message-State: AOAM531hOagxiHeWxabGRNuX/RaQBJzMiCVhMOFycVqUuAQAJ+/CXYcA OYoqJbHyF/tE0u/90pmAj583IkJZzy6rBQ== X-Google-Smtp-Source: ABdhPJx99xOS/CSN3cVl0dt4jTxgh/QFq1Q1TtDrv2zsFF7sd0IcpHzJhA9rHy1IkfwpCFVCVpic7Q== X-Received: by 2002:a05:6a00:1992:b029:2df:b93b:49a with SMTP id d18-20020a056a001992b02902dfb93b049amr27424816pfl.11.1621905057084; Mon, 24 May 2021 18:10:57 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id i8sm10614839pjs.54.2021.05.24.18.10.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 24 May 2021 18:10:56 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v7 92/92] target/arm: Enable SVE2 and related extensions Date: Mon, 24 May 2021 18:03:58 -0700 Message-Id: <20210525010358.152808-93-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525010358.152808-1-richard.henderson@linaro.org> References: <20210525010358.152808-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42d; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Disable I8MM again for !have_neon during realize. Signed-off-by: Richard Henderson --- target/arm/cpu.c | 2 ++ target/arm/cpu64.c | 13 +++++++++++++ target/arm/cpu_tcg.c | 1 + 3 files changed, 16 insertions(+) diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 4eb0d2f85c..7aeb4b1381 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -1503,6 +1503,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp) t = cpu->isar.id_aa64isar1; t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 0); + t = FIELD_DP64(t, ID_AA64ISAR1, I8MM, 0); cpu->isar.id_aa64isar1 = t; t = cpu->isar.id_aa64pfr0; @@ -1517,6 +1518,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp) u = cpu->isar.id_isar6; u = FIELD_DP32(u, ID_ISAR6, DP, 0); u = FIELD_DP32(u, ID_ISAR6, FHM, 0); + u = FIELD_DP32(u, ID_ISAR6, I8MM, 0); cpu->isar.id_isar6 = u; if (!arm_feature(env, ARM_FEATURE_M)) { diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c index f0a9e968c9..379f90fab8 100644 --- a/target/arm/cpu64.c +++ b/target/arm/cpu64.c @@ -662,6 +662,7 @@ static void aarch64_max_initfn(Object *obj) t = FIELD_DP64(t, ID_AA64ISAR1, SPECRES, 1); t = FIELD_DP64(t, ID_AA64ISAR1, FRINTTS, 1); t = FIELD_DP64(t, ID_AA64ISAR1, LRCPC, 2); /* ARMv8.4-RCPC */ + t = FIELD_DP64(t, ID_AA64ISAR1, I8MM, 1); cpu->isar.id_aa64isar1 = t; t = cpu->isar.id_aa64pfr0; @@ -702,6 +703,17 @@ static void aarch64_max_initfn(Object *obj) t = FIELD_DP64(t, ID_AA64MMFR2, ST, 1); /* TTST */ cpu->isar.id_aa64mmfr2 = t; + t = cpu->isar.id_aa64zfr0; + t = FIELD_DP64(t, ID_AA64ZFR0, SVEVER, 1); + t = FIELD_DP64(t, ID_AA64ZFR0, AES, 2); /* PMULL */ + t = FIELD_DP64(t, ID_AA64ZFR0, BITPERM, 1); + t = FIELD_DP64(t, ID_AA64ZFR0, SHA3, 1); + t = FIELD_DP64(t, ID_AA64ZFR0, SM4, 1); + t = FIELD_DP64(t, ID_AA64ZFR0, I8MM, 1); + t = FIELD_DP64(t, ID_AA64ZFR0, F32MM, 1); + t = FIELD_DP64(t, ID_AA64ZFR0, F64MM, 1); + cpu->isar.id_aa64zfr0 = t; + /* Replicate the same data to the 32-bit id registers. */ u = cpu->isar.id_isar5; u = FIELD_DP32(u, ID_ISAR5, AES, 2); /* AES + PMULL */ @@ -718,6 +730,7 @@ static void aarch64_max_initfn(Object *obj) u = FIELD_DP32(u, ID_ISAR6, FHM, 1); u = FIELD_DP32(u, ID_ISAR6, SB, 1); u = FIELD_DP32(u, ID_ISAR6, SPECRES, 1); + u = FIELD_DP32(u, ID_ISAR6, I8MM, 1); cpu->isar.id_isar6 = u; u = cpu->isar.id_pfr0; diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c index 046e476f65..d3458335ed 100644 --- a/target/arm/cpu_tcg.c +++ b/target/arm/cpu_tcg.c @@ -968,6 +968,7 @@ static void arm_max_initfn(Object *obj) t = FIELD_DP32(t, ID_ISAR6, FHM, 1); t = FIELD_DP32(t, ID_ISAR6, SB, 1); t = FIELD_DP32(t, ID_ISAR6, SPECRES, 1); + t = FIELD_DP32(t, ID_ISAR6, I8MM, 1); cpu->isar.id_isar6 = t; t = cpu->isar.mvfr1;