From patchwork Fri May 24 23:20:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673788 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 08F2FC25B7D for ; Fri, 24 May 2024 23:22:14 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeE5-0006MT-CY; Fri, 24 May 2024 19:21:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeE3-0006Ln-PT for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:27 -0400 Received: from mail-pg1-x52e.google.com ([2607:f8b0:4864:20::52e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeE1-0005eH-66 for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:27 -0400 Received: by mail-pg1-x52e.google.com with SMTP id 41be03b00d2f7-6818e31e5baso1189941a12.1 for ; Fri, 24 May 2024 16:21:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716592884; x=1717197684; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=YrL1JsoHB9ONacMQPzWVRwN8hE48sjAIXUCgFRSOVZo=; b=L8m4dH8Yk+n2M/JR2gY2fb7wD9BWHwJGNn3atDGlLpsz14yj+iLvl0IcVClMKbKi5j KL41odVxkoDDxErDYRf4DYG+FcAgYvSGE3G8IfcHLkITnzAVY7G8AvaoEcJ1rymS6oek xpTxiP+HUE9TZ/e4e+/WwV3OPD8+4X9nclRnAXSiMT6+66MjtqnvFErjtE5ScJyZH33X 1nUKnI1uKS7r7SSncJXR314V9kyz7adIWq2fTgQ+tFJCJW957HdCRxXKl5VSONorPFAP 
LM9w8IhH7IiF2wvpj2pH8StcOOTA4eaKv5bLYm+MCJ0X9rUpyCBQm4kUtz348AEbpRFN f1gg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716592884; x=1717197684; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YrL1JsoHB9ONacMQPzWVRwN8hE48sjAIXUCgFRSOVZo=; b=CsPP84ZGwCt29OJgJL2OhieeRV0aA6reClsf5Nmljf9BY0XGY/2gYcdnTnytBqm6da flY0gufovRsQkgN2v+F4IluKUl50AAoWhYWc6zqZnItafav6PCSZMK7KTSpPEO4ieLWm uw0UfC4cbO/3ilvfSa5LUCnHwPvhOThllaVcvMULr+xLB3m+UvXDLSnACK2eN+ZH+KmV uYoFdnmdeJLjU2dB0ak6w+Tni3dnuaqDUB8CxudhSZWSkXGhaKlM7ESTDlwSiC2Q+QyD YANMO0fxbtnDaKM84ekg/dzduMHsGqMRg8ei7A1NKSvvkQxIOG9MOJ034ZwN+aCIT9dn TDjA== X-Gm-Message-State: AOJu0YwgWb6Lbg4lgoR3y6YxwGOuMoHlom0UC1OeJUnjXV2dG7DMqqZl BmP6ZOo/Zjd9Cbx4Mo57opDhlbGWjYgLhecld2GLKS4MzKDfsSxn7fThQDCMh82eGc1tZu2E8Wu t X-Google-Smtp-Source: AGHT+IGWwp9aMCQNgDzMcO8U2sycq6R87MVgOF8kpteMfiv/i46vDT7l1kt2uo5g8Wl7APQHlUufdg== X-Received: by 2002:a17:902:e5cb:b0:1ed:7cdf:f331 with SMTP id d9443c01a7336-1f449903c29mr44738935ad.68.1716592883758; Fri, 24 May 2024 16:21:23 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
 [174.21.72.5]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f44c759ceesm19178305ad.10.2024.05.24.16.21.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:21:23 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 01/67] target/arm: Add neoverse-n1 to qemu-arm (DO NOT MERGE)
Date: Fri, 24 May 2024 16:20:15 -0700
Message-Id: <20240524232121.284515-2-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Received-SPF: pass client-ip=2607:f8b0:4864:20::52e; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52e.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

Hack, because there should be a better way to do this without
duplicating code between cpu32.c and cpu64.c.

Hack, because qemu-arm crashes without ARM_FEATURE_AARCH64 disabled.

Needed in order to compare RISU results with aarch64.ci.qemu.org.

Signed-off-by: Richard Henderson
---
 target/arm/tcg/cpu32.c | 73 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/target/arm/tcg/cpu32.c b/target/arm/tcg/cpu32.c
index bdd82d912a..6ee055c78b 100644
--- a/target/arm/tcg/cpu32.c
+++ b/target/arm/tcg/cpu32.c
@@ -978,6 +978,78 @@ static void arm_max_initfn(Object *obj)
 }
 #endif /* !TARGET_AARCH64 */
 
+#ifdef CONFIG_USER_ONLY
+static void aarch64_neoverse_n1_initfn(Object *obj)
+{
+    ARMCPU *cpu = ARM_CPU(obj);
+
+    cpu->dtb_compatible = "arm,neoverse-n1";
+    set_feature(&cpu->env, ARM_FEATURE_V8);
+    set_feature(&cpu->env, ARM_FEATURE_NEON);
+    set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
+    set_feature(&cpu->env, ARM_FEATURE_BACKCOMPAT_CNTFRQ);
+    // set_feature(&cpu->env, ARM_FEATURE_AARCH64);
+    set_feature(&cpu->env, ARM_FEATURE_CBAR_RO);
+    set_feature(&cpu->env, ARM_FEATURE_EL2);
+    set_feature(&cpu->env, ARM_FEATURE_EL3);
+    set_feature(&cpu->env, ARM_FEATURE_PMU);
+
+    /* Ordered by B2.4 AArch64 registers by functional group */
+    cpu->clidr = 0x82000023;
+    cpu->ctr = 0x8444c004;
+    cpu->dcz_blocksize = 4;
+    cpu->isar.id_aa64dfr0 = 0x0000000110305408ull;
+    cpu->isar.id_aa64isar0 = 0x0000100010211120ull;
+    cpu->isar.id_aa64isar1 = 0x0000000000100001ull;
+    cpu->isar.id_aa64mmfr0 = 0x0000000000101125ull;
+    cpu->isar.id_aa64mmfr1 = 0x0000000010212122ull;
+    cpu->isar.id_aa64mmfr2 = 0x0000000000001011ull;
+    cpu->isar.id_aa64pfr0 = 0x1100000010111112ull; /* GIC filled in later */
+    cpu->isar.id_aa64pfr1 = 0x0000000000000020ull;
+    cpu->id_afr0 = 0x00000000;
+    cpu->isar.id_dfr0 = 0x04010088;
+    cpu->isar.id_isar0 = 0x02101110;
+    cpu->isar.id_isar1 = 0x13112111;
+    cpu->isar.id_isar2 = 0x21232042;
+    cpu->isar.id_isar3 = 0x01112131;
+    cpu->isar.id_isar4 = 0x00010142;
+    cpu->isar.id_isar5 = 0x01011121;
+    cpu->isar.id_isar6 = 0x00000010;
+    cpu->isar.id_mmfr0 = 0x10201105;
+    cpu->isar.id_mmfr1 = 0x40000000;
+    cpu->isar.id_mmfr2 = 0x01260000;
+    cpu->isar.id_mmfr3 = 0x02122211;
+    cpu->isar.id_mmfr4 = 0x00021110;
+    cpu->isar.id_pfr0 = 0x10010131;
+    cpu->isar.id_pfr1 = 0x00010000; /* GIC filled in later */
+    cpu->isar.id_pfr2 = 0x00000011;
+    cpu->midr = 0x414fd0c1; /* r4p1 */
+    cpu->revidr = 0;
+
+    /* From B2.23 CCSIDR_EL1 */
+    cpu->ccsidr[0] = 0x701fe01a; /* 64KB L1 dcache */
+    cpu->ccsidr[1] = 0x201fe01a; /* 64KB L1 icache */
+    cpu->ccsidr[2] = 0x70ffe03a; /* 1MB L2 cache */
+
+    /* From B2.98 SCTLR_EL3 */
+    cpu->reset_sctlr = 0x30c50838;
+
+    /* From B4.23 ICH_VTR_EL2 */
+    cpu->gic_num_lrs = 4;
+    cpu->gic_vpribits = 5;
+    cpu->gic_vprebits = 5;
+    cpu->gic_pribits = 5;
+
+    /* From B5.1 AdvSIMD AArch64 register summary */
+    cpu->isar.mvfr0 = 0x10110222;
+    cpu->isar.mvfr1 = 0x13211111;
+    cpu->isar.mvfr2 = 0x00000043;
+
+    /* From D5.1 AArch64 PMU register summary */
+    cpu->isar.reset_pmcr_el0 = 0x410c3000;
+}
+#endif /* CONFIG_USER_ONLY */
+
 static const ARMCPUInfo arm_tcg_cpus[] = {
     { .name = "arm926", .initfn = arm926_initfn },
     { .name = "arm946", .initfn = arm946_initfn },
@@ -1018,6 +1090,7 @@ static const ARMCPUInfo arm_tcg_cpus[] = {
     { .name = "max", .initfn = arm_max_initfn },
 #endif
 #ifdef CONFIG_USER_ONLY
+    { .name = "neoverse-n1", .initfn = aarch64_neoverse_n1_initfn },
     { .name = "any", .initfn = arm_max_initfn },
 #endif
 };

From patchwork Fri May 24 23:20:16 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673802
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 02/67] target/arm: Use PLD, PLDW, PLI not NOP for t32
Date: Fri, 24 May 2024 16:20:16 -0700
Message-Id: <20240524232121.284515-3-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

This fixes a bug: neither PLI nor PLDW is present in ARMv6T2; they are
introduced with ARMv7 and ARMv7MP respectively.
For clarity, do not use NOP for PLD.
Note that there is no PLDW (literal) -- bit 5 of the first word is not
decoded, and is PLD (literal). Confirmed on neoverse-n1 host which does
*not* trap on the (0) bit in the decode.

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/tcg/t32.decode  | 25 ++++++++++++-------------
 target/arm/tcg/translate.c |  4 ++--
 2 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/target/arm/tcg/t32.decode b/target/arm/tcg/t32.decode
index f21ad0167a..d327178829 100644
--- a/target/arm/tcg/t32.decode
+++ b/target/arm/tcg/t32.decode
@@ -458,41 +458,41 @@ STR_ri 1111 1000 1100 .... .... ............ @ldst_ri_pos
 # Note that Load, unsigned (literal) overlaps all other load encodings.
 {
   {
-    NOP 1111 1000 -001 1111 1111 ------------ # PLD
+    PLD 1111 1000 -001 1111 1111 ------------ # (literal)
     LDRB_ri 1111 1000 .001 1111 .... ............ @ldst_ri_lit
   }
   {
-    NOP 1111 1000 1001 ---- 1111 ------------ # PLD
+    PLD 1111 1000 1001 ---- 1111 ------------ # (immediate T1)
     LDRB_ri 1111 1000 1001 .... .... ............ @ldst_ri_pos
   }
   LDRB_ri 1111 1000 0001 .... .... 1..1 ........ @ldst_ri_idx
   {
-    NOP 1111 1000 0001 ---- 1111 1100 -------- # PLD
+    PLD 1111 1000 0001 ---- 1111 1100 -------- # (immediate T2)
     LDRB_ri 1111 1000 0001 .... .... 1100 ........ @ldst_ri_neg
   }
   LDRBT_ri 1111 1000 0001 .... .... 1110 ........ @ldst_ri_unp
   {
-    NOP 1111 1000 0001 ---- 1111 000000 -- ---- # PLD
+    PLD 1111 1000 0001 ---- 1111 000000 -- ---- # (register)
     LDRB_rr 1111 1000 0001 .... .... 000000 .. .... @ldst_rr
   }
 }
 {
   {
-    NOP 1111 1000 -011 1111 1111 ------------ # PLD
+    PLD 1111 1000 -011 1111 1111 ------------ # (literal)
     LDRH_ri 1111 1000 .011 1111 .... ............ @ldst_ri_lit
   }
   {
-    NOP 1111 1000 1011 ---- 1111 ------------ # PLDW
+    PLDW 1111 1000 1011 ---- 1111 ------------ # (immediate T1)
     LDRH_ri 1111 1000 1011 .... .... ............ @ldst_ri_pos
   }
   LDRH_ri 1111 1000 0011 .... .... 1..1 ........ @ldst_ri_idx
   {
-    NOP 1111 1000 0011 ---- 1111 1100 -------- # PLDW
+    PLDW 1111 1000 0011 ---- 1111 1100 -------- # (immediate T2)
     LDRH_ri 1111 1000 0011 .... .... 1100 ........ @ldst_ri_neg
   }
   LDRHT_ri 1111 1000 0011 .... .... 1110 ........ @ldst_ri_unp
   {
-    NOP 1111 1000 0011 ---- 1111 000000 -- ---- # PLDW
+    PLDW 1111 1000 0011 ---- 1111 000000 -- ---- # (register)
     LDRH_rr 1111 1000 0011 .... .... 000000 .. .... @ldst_rr
   }
 }
@@ -504,24 +504,23 @@ STR_ri 1111 1000 1100 .... .... ............ @ldst_ri_pos
   LDRT_ri 1111 1000 0101 .... .... 1110 ........ @ldst_ri_unp
   LDR_rr 1111 1000 0101 .... .... 000000 .. .... @ldst_rr
 }
-# NOPs here are PLI.
 {
   {
-    NOP 1111 1001 -001 1111 1111 ------------
+    PLI 1111 1001 -001 1111 1111 ------------ # (literal T3)
     LDRSB_ri 1111 1001 .001 1111 .... ............ @ldst_ri_lit
   }
   {
-    NOP 1111 1001 1001 ---- 1111 ------------
+    PLI 1111 1001 1001 ---- 1111 ------------ # (immediate T1)
     LDRSB_ri 1111 1001 1001 .... .... ............ @ldst_ri_pos
   }
   LDRSB_ri 1111 1001 0001 .... .... 1..1 ........ @ldst_ri_idx
   {
-    NOP 1111 1001 0001 ---- 1111 1100 --------
+    PLI 1111 1001 0001 ---- 1111 1100 -------- # (immediate T2)
     LDRSB_ri 1111 1001 0001 .... .... 1100 ........ @ldst_ri_neg
   }
   LDRSBT_ri 1111 1001 0001 .... .... 1110 ........ @ldst_ri_unp
   {
-    NOP 1111 1001 0001 ---- 1111 000000 -- ----
+    PLI 1111 1001 0001 ---- 1111 000000 -- ---- # (register)
     LDRSB_rr 1111 1001 0001 .... .... 000000 .. .... @ldst_rr
   }
 }
diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
index d605e10f11..187eacffd9 100644
--- a/target/arm/tcg/translate.c
+++ b/target/arm/tcg/translate.c
@@ -8765,12 +8765,12 @@ static bool trans_PLD(DisasContext *s, arg_PLD *a)
     return ENABLE_ARCH_5TE;
 }
 
-static bool trans_PLDW(DisasContext *s, arg_PLD *a)
+static bool trans_PLDW(DisasContext *s, arg_PLDW *a)
 {
     return arm_dc_feature(s, ARM_FEATURE_V7MP);
 }
 
-static bool trans_PLI(DisasContext *s, arg_PLD *a)
+static bool trans_PLI(DisasContext *s, arg_PLI *a)
 {
     return ENABLE_ARCH_7;
 }

From patchwork Fri May 24 23:20:17 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673803
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 03/67] target/arm: Reject incorrect operands to PLD, PLDW, PLI
Date: Fri, 24 May 2024 16:20:17 -0700
Message-Id: <20240524232121.284515-4-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

For all, rm == 15 is invalid.
Prior to v8, thumb with rm == 13 is invalid.
For PLDW, rn == 15 is invalid.

Fixes a RISU mismatch for the HINTSPACE pattern in t32.risu
compared to a neoverse-n1 host.
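
The three operand rules above can be modelled outside of QEMU as a
small standalone C sketch. The names prefetch_rm_ok and
pldw_operands_ok and the is_thumb / is_v8 parameters are made up for
this illustration and are not QEMU APIs. The sketch only answers "are
these operands acceptable"; it does not model how the translator
chooses between raising UNDEF and falling through to another decode
pattern.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the rm rule for register-form PLD/PLDW/PLI:
 *   - rm == 15 (PC) is never valid;
 *   - rm == 13 (SP) is invalid only for Thumb before v8.
 */
bool prefetch_rm_ok(int rm, bool is_thumb, bool is_v8)
{
    assert(rm >= 0 && rm <= 15);    /* 4-bit register field */
    if (rm == 15) {
        return false;
    }
    if (rm == 13) {
        return is_v8 || !is_thumb;
    }
    return true;
}

/* Hypothetical model of the extra PLDW rule: rn == 15 is invalid too. */
bool pldw_operands_ok(int rn, int rm, bool is_thumb, bool is_v8)
{
    assert(rn >= 0 && rn <= 15);
    return rn != 15 && prefetch_rm_ok(rm, is_thumb, is_v8);
}
```

For example, prefetch_rm_ok(13, true, false) models a pre-v8 Thumb
encoding with rm == SP and is rejected, while the same operand with
is_v8 set, or in A32, is accepted.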

Signed-off-by: Richard Henderson
---
 target/arm/tcg/a32-uncond.decode |  8 +++--
 target/arm/tcg/t32.decode        |  7 ++--
 target/arm/tcg/translate.c       | 57 ++++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+), 6 deletions(-)

diff --git a/target/arm/tcg/a32-uncond.decode b/target/arm/tcg/a32-uncond.decode
index 2339de2e94..e1b1780d37 100644
--- a/target/arm/tcg/a32-uncond.decode
+++ b/target/arm/tcg/a32-uncond.decode
@@ -24,7 +24,9 @@
 
 &empty !extern
 &i !extern imm
+&r !extern rm
 &setend E
+&nm rn rm
 
 # Branch with Link and Exchange
 
@@ -61,9 +63,9 @@ PLD 1111 0101 -101 ---- 1111 ---- ---- ---- # (imm, lit) 5te
 PLDW 1111 0101 -001 ---- 1111 ---- ---- ---- # (imm, lit) 7mp
 PLI 1111 0100 -101 ---- 1111 ---- ---- ---- # (imm, lit) 7
 
-PLD 1111 0111 -101 ---- 1111 ----- -- 0 ---- # (register) 5te
-PLDW 1111 0111 -001 ---- 1111 ----- -- 0 ---- # (register) 7mp
-PLI 1111 0110 -101 ---- 1111 ----- -- 0 ---- # (register) 7
+PLD_rr 1111 0111 -101 ---- 1111 ----- -- 0 rm:4 &r
+PLDW_rr 1111 0111 -001 rn:4 1111 ----- -- 0 rm:4 &nm
+PLI_rr 1111 0110 -101 ---- 1111 ----- -- 0 rm:4 &r
 
 # Unallocated memory hints
 #
diff --git a/target/arm/tcg/t32.decode b/target/arm/tcg/t32.decode
index d327178829..1ec12442a4 100644
--- a/target/arm/tcg/t32.decode
+++ b/target/arm/tcg/t32.decode
@@ -28,6 +28,7 @@
 &rrr_rot !extern rd rn rm rot
 &rrr !extern rd rn rm
 &rr !extern rd rm
+&nm !extern rn rm
 &ri !extern rd imm
 &r !extern rm
 &i !extern imm
@@ -472,7 +473,7 @@ STR_ri 1111 1000 1100 .... .... ............ @ldst_ri_pos
   }
   LDRBT_ri 1111 1000 0001 .... .... 1110 ........ @ldst_ri_unp
   {
-    PLD 1111 1000 0001 ---- 1111 000000 -- ---- # (register)
+    PLD_rr 1111 1000 0001 ---- 1111 000000 -- rm:4 &r
     LDRB_rr 1111 1000 0001 .... .... 000000 .. .... @ldst_rr
   }
 }
@@ -492,7 +493,7 @@ STR_ri 1111 1000 1100 .... .... ............ @ldst_ri_pos
   }
   LDRHT_ri 1111 1000 0011 .... .... 1110 ........ @ldst_ri_unp
   {
-    PLDW 1111 1000 0011 ---- 1111 000000 -- ---- # (register)
+    PLDW_rr 1111 1000 0011 rn:4 1111 000000 -- rm:4 &nm
     LDRH_rr 1111 1000 0011 .... .... 000000 .. .... @ldst_rr
   }
 }
@@ -520,7 +521,7 @@ STR_ri 1111 1000 1100 .... .... ............ @ldst_ri_pos
   }
   LDRSBT_ri 1111 1001 0001 .... .... 1110 ........ @ldst_ri_unp
   {
-    PLI 1111 1001 0001 ---- 1111 000000 -- ---- # (register)
+    PLI_rr 1111 1001 0001 ---- 1111 000000 -- rm:4 &r
     LDRSB_rr 1111 1001 0001 .... .... 000000 .. .... @ldst_rr
   }
 }
diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
index 187eacffd9..7c09153b6e 100644
--- a/target/arm/tcg/translate.c
+++ b/target/arm/tcg/translate.c
@@ -8775,6 +8775,63 @@ static bool trans_PLI(DisasContext *s, arg_PLI *a)
     return ENABLE_ARCH_7;
 }
 
+static bool prefetch_check_m(DisasContext *s, int rm)
+{
+    switch (rm) {
+    case 13:
+        /* SP allowed in v8 or with A1 encoding; rejected with T1. */
+        return ENABLE_ARCH_8 || !s->thumb;
+    case 15:
+        /* PC always rejected. */
+        return false;
+    default:
+        return true;
+    }
+}
+
+static bool trans_PLD_rr(DisasContext *s, arg_PLD_rr *a)
+{
+    if (!ENABLE_ARCH_5TE) {
+        return false;
+    }
+    /* We cannot return false, because that leads to LDRB for thumb. */
+    if (!prefetch_check_m(s, a->rm)) {
+        unallocated_encoding(s);
+    }
+    return true;
+}
+
+static bool trans_PLDW_rr(DisasContext *s, arg_PLDW_rr *a)
+{
+    if (!arm_dc_feature(s, ARM_FEATURE_V7MP)) {
+        return false;
+    }
+    /*
+     * For A1, rn == 15 is UNPREDICTABLE.
+     * For T1, rn == 15 is PLD (literal).
+     */
+    if (a->rn == 15) {
+        return false;
+    }
+    /* We cannot return false, because that leads to LDRH for thumb. */
+    if (!prefetch_check_m(s, a->rm)) {
+        unallocated_encoding(s);
+    }
+    return true;
+}
+
+static bool trans_PLI_rr(DisasContext *s, arg_PLI_rr *a)
+{
+    if (!ENABLE_ARCH_7) {
+        return false;
+    }
+    /* We cannot return false, because that leads to LDRSB for thumb. */
+    if (!prefetch_check_m(s, a->rm)) {
+        unallocated_encoding(s);
+    }
+    return true;
+}
+
 /*
  * If-then
  */

From patchwork Fri May 24 23:20:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673810
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 04/67] target/arm: Zero-extend writeback for fp16 FCVTZS (scalar, integer)
Date: Fri, 24 May 2024 16:20:18 -0700
Message-Id: <20240524232121.284515-5-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

Fixes RISU mismatch for "fcvtzs h31, h0, #14".
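
To show why the zero-extension matters, here is a minimal standalone
sketch in plain C, with no TCG involved. It assumes that the signed
16-bit conversion result reaches the writeback step as a sign-extended
32-bit value, which is what the need for an explicit 16-bit unsigned
extension suggests. The function name h_writeback_bits is hypothetical
and only stands in for "the low 32 bits that end up in the destination
register".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of writing a signed 16-bit result to an H
 * register through a 32-bit path: bits [31:16] must be cleared,
 * otherwise a negative result leaves 0xffff in the upper half.
 */
uint32_t h_writeback_bits(int32_t converted)
{
    /* Assumed precondition: a sign-extended 16-bit result. */
    assert(converted >= INT16_MIN && converted <= INT16_MAX);
    return (uint32_t)converted & 0xffffu;
}
```

With this model, h_writeback_bits(-1) yields 0x0000ffff rather than
0xffffffff, so the bits above the 16-bit element read as zero.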
Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/translate-a64.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 4126aaa27e..d97acdbaf9 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -8707,6 +8707,9 @@ static void handle_simd_shift_fpint_conv(DisasContext *s, bool is_scalar, read_vec_element_i32(s, tcg_op, rn, pass, size); fn(tcg_op, tcg_op, tcg_shift, tcg_fpstatus); if (is_scalar) { + if (size == MO_16 && !is_u) { + tcg_gen_ext16u_i32(tcg_op, tcg_op); + } write_fp_sreg(s, rd, tcg_op); } else { write_vec_element_i32(s, tcg_op, rd, pass, size); From patchwork Fri May 24 23:20:19 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673832 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A16F2C25B7F for ; Fri, 24 May 2024 23:30:00 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeEH-0006U9-H3; Fri, 24 May 2024 19:21:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeE7-0006Oe-4M for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:31 -0400 Received: from mail-pl1-x629.google.com ([2607:f8b0:4864:20::629]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeE4-0005hE-F8 for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:30 -0400 Received: by mail-pl1-x629.google.com with SMTP id 
d9443c01a7336-1f44b5ba445so12727895ad.3 for ; Fri, 24 May 2024 16:21:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716592887; x=1717197687; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=gL2CcQSQHBX9UC9WbyVew+g6mF+g5maBBhC3HZ00Z54=; b=OrdE2hP8YeZw2eePamhfDzUN4Qb/vrRug7++O99H/+uJvmmkzl7zo2bSo06lEYCc29 dUc6qLQuhP1X7KWFVDcKag3AcvcFtCkQhVshbn3WLMzt1we/tYHM7oVYtcoJVXwjjZuT lj/1gOPFOwzfROS4bnIkD7gD9IvPRM7jye+RgWym9pFR6fpWjNYn2tv53nrQ7Li5Gf0s BwPNz9vGjdJWXsD2qe3A8FdRy+fYthuviuObyNC2FaH0uYWNLzgVs3wkLFz7zHv/Ki+9 1txzibA1xQVCvbqKR0tisFAnBfztAIwaSps7MrCO5REvvTj9nRSe7yAGE8QlrJlzhXEC wGSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716592887; x=1717197687; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=gL2CcQSQHBX9UC9WbyVew+g6mF+g5maBBhC3HZ00Z54=; b=JKWLfJV6tnCdPj3/xdr1VuwPTi6M2sXIy/aLU5vBNoF80w3PraRpXWp9lkNXJ/JvJ9 Z9ZG1Trg6JW0gTpOYCl+VD7tgXVUeW87yVoTNuHn5Z1LSENolJVRggzS5eEVcKRBJNxS OwfVKvdHg1XZRt5U9TQz5qn2YEaarITnIdpFoVmMIHC0rVsgO/Tu7p8D2ngtymDH76rL 0mWp/iTaT1KiqIs/S+dy0Nr+PRvLCttoWhpithpNG5p1cNXQZTByVVk0dGjrUXSLXQg8 vm6hj4p/OD/JtvcqjigmhlP0z2g7CGT7eiyMmNZcS/ljERu1pJSepGmRMVQY9xrliXCH SVTg== X-Gm-Message-State: AOJu0Yz8HledhSt9TntSgRsug3ucKWlMhrfSoAe8gjMYJIiED9z34i/s kOusYMhgssWY3OUrtyfTnMZa5HO8yHAiiWnhxanuvqiwjTfe38nJ+U7nXE0tk7i9H72SPVdngoN E X-Google-Smtp-Source: AGHT+IF+MPw0RRYzvW9j8ypRbyZTQsEqg17q1XeNYF6teWALGxAibHfPFahQqwJk62PfznJmlfsRIg== X-Received: by 2002:a17:903:181:b0:1f2:e14b:3d91 with SMTP id d9443c01a7336-1f4498f09bbmr47181005ad.59.1716592886979; Fri, 24 May 2024 16:21:26 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f44c759ceesm19178305ad.10.2024.05.24.16.21.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:21:26 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 05/67] target/arm: Fix decode of FMOV (hp) vs MOVI Date: Fri, 24 May 2024 16:20:19 -0700 Message-Id: <20240524232121.284515-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::629; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x629.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org The decode of FMOV (vector, immediate, half-precision) vs invalid cases of MOVI is incorrect. Fixes RISU mismatch for invalid insn 0x2f01fd31. 
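A sketch of the validity checks as reordered by this patch, written as a standalone predicate. The helper names are hypothetical, and the field positions for o2 (bit 11), cmode (bits 15:12) and op (bit 29) are assumed from the A64 modified-immediate encoding; only the Q extraction at bit 30 is visible in the diff context:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: extract 'len' bits of 'insn' starting at 'start'. */
static uint32_t insn_bits(uint32_t insn, int start, int len)
{
    return (insn >> start) & ((1u << len) - 1);
}

/*
 * Hypothetical model of the checks after this patch.
 * Returns true if the encoding is allocated, false if it must be
 * treated as an unallocated encoding.
 */
static bool simd_mod_imm_allocated(uint32_t insn, bool have_fp16)
{
    uint32_t o2 = insn_bits(insn, 11, 1);
    uint32_t cmode = insn_bits(insn, 12, 4);
    bool is_neg = insn_bits(insn, 29, 1);
    bool is_q = insn_bits(insn, 30, 1);

    if (o2) {
        /* o2 set is only valid for the half-precision FMOV form. */
        if (cmode != 0xf || is_neg) {
            return false;
        }
        return have_fp16;
    }
    /* cmode == 0xf with op == 1 and Q == 0 is rejected. */
    if (cmode == 0xf && is_neg && !is_q) {
        return false;
    }
    return true;
}
```

Under these assumptions, 0x2f01fd31 has o2 = 1, cmode = 0xf and op = 1, so the predicate rejects it even when FP16 is implemented, whereas the old condition let it fall through to the MOVI path.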
Fixes: 70b4e6a4457 ("arm/translate-a64: add FP16 FMOV to simd_mod_imm") Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/translate-a64.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index d97acdbaf9..5455ae3685 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -7904,27 +7904,31 @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn) bool is_q = extract32(insn, 30, 1); uint64_t imm = 0; - if (o2 != 0 || ((cmode == 0xf) && is_neg && !is_q)) { - /* Check for FMOV (vector, immediate) - half-precision */ - if (!(dc_isar_feature(aa64_fp16, s) && o2 && cmode == 0xf)) { + if (o2) { + if (cmode != 0xf || is_neg) { unallocated_encoding(s); return; } - } - - if (!fp_access_check(s)) { - return; - } - - if (cmode == 15 && o2 && !is_neg) { /* FMOV (vector, immediate) - half-precision */ + if (!dc_isar_feature(aa64_fp16, s)) { + unallocated_encoding(s); + return; + } imm = vfp_expand_imm(MO_16, abcdefgh); /* now duplicate across the lanes */ imm = dup_const(MO_16, imm); } else { + if (cmode == 0xf && is_neg && !is_q) { + unallocated_encoding(s); + return; + } imm = asimd_imm_const(abcdefgh, cmode, is_neg); } + if (!fp_access_check(s)) { + return; + } + if (!((cmode & 0x9) == 0x1 || (cmode & 0xd) == 0x9)) { /* MOVI or MVNI, with MVNI negation handled above. */ tcg_gen_gvec_dup_imm(MO_64, vec_full_reg_offset(s, rd), is_q ? 
16 : 8, From patchwork Fri May 24 23:20:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673833 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6A9FEC25B7A for ; Fri, 24 May 2024 23:30:00 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeEA-0006RD-5f; Fri, 24 May 2024 19:21:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeE7-0006Of-5B for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:32 -0400 Received: from mail-pg1-x532.google.com ([2607:f8b0:4864:20::532]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeE5-0005hu-63 for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:30 -0400 Received: by mail-pg1-x532.google.com with SMTP id 41be03b00d2f7-681907aebebso1037775a12.1 for ; Fri, 24 May 2024 16:21:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716592888; x=1717197688; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Qjg0N9d1UiR78WeGkTTMqX2tHTL3eVpJWXsbVAhKPqc=; b=vEdndpfNXdSLolIRiwW8rs8U2X0QWDjmrZADLvwF9Enr2yqjJPjZAvsQQdFsHqTzKE TwuT9k8LlXgM3J5CZwmow6muZmRGUObcqpOIjDb6oChFk0jI+KNnAD+vLTwyclxLLlbZ xehqMDjHZu25gabmslehW+Pn2kMtUt2zs3N9QRleCFQHYaO5DnT9Wx1j8mI6cSeHVvjt jKwIWRambVj0kgs/cHzZkunMdArxDV3Oyr/7ZzCSqF642ijIAUjILHPUcYYWbLr1cblK 
d2713FnamlChCuJDM2L22RZNGVMnaqrB3h29tEB21p2k7M1q5Y+rQowZnwqpMiIf+mW2 klIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716592888; x=1717197688; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Qjg0N9d1UiR78WeGkTTMqX2tHTL3eVpJWXsbVAhKPqc=; b=X89hC8Xxy3/5gZ14hgwjEzDHwOuKSzu28/+RDnZz7HboInX0TZ9IxY536SZ2F2h3o4 lwHJ1crq4f4ydZoU6Q0N7r7bapUpxye8KgA8PVcxzh14oZVC1t/5UsdF/tws4o6ZwP1a U+Kt9oEnL+oAUSR5FnBIH5PaLy49Q5a2U1ddbS7vs/Vs6I5bHqNtCQrJgC0PQxNWbSXA JuotHc0elOt7Wws3FaYOG7UB1jz4/Ank7mdLNvy2eKNTJkm+7XD/Y+XLQKoaTtEvGch9 IxpVVG0VGVGDEO8NPxSMchgpwRiWvoujsSdDnbKfM4W04wLKRGyGDDUWfHSNU+orQvWZ OYyQ== X-Gm-Message-State: AOJu0YwmlAS5uiyrzQiB+X5Fg8vO1aHnVNaWJMRXi2RLW+WGogRBh5Xr k5piA4ZfvpdXMP/Hl0B8Cjrkd0NMXiEugJ//jaXlo2lyMTKiQh0Oui97Eylvbd5LVCH9NuFTA+w n X-Google-Smtp-Source: AGHT+IEESw3lq3PynrbdWw9/HheQTe2VanrU7kf6L3edjSnT36ekIyMLUiRxMMy2TnhsRr3pArdObg== X-Received: by 2002:a17:902:db11:b0:1f4:5c4b:dc6b with SMTP id d9443c01a7336-1f45c4be9d5mr24810025ad.47.1716592887707; Fri, 24 May 2024 16:21:27 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f44c759ceesm19178305ad.10.2024.05.24.16.21.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:21:27 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 06/67] target/arm: Verify sz=0 for Advanced SIMD scalar pairwise (fp16) Date: Fri, 24 May 2024 16:20:20 -0700 Message-Id: <20240524232121.284515-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::532; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x532.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org All of these insns have "if sz == '1' then UNDEFINED" in their pseudocode. Fixes a RISU miscompare for invalid insn 0x5ef0c87a. 
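A sketch of the added check for the fp16 (U == 0) forms only. The function name is hypothetical, and the position of the size field (bits 23:22, with sz as its low bit) is assumed from the A64 scalar pairwise encoding rather than shown in the diff:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the fp16 branch after this patch: the
 * encoding is allocated only if sz (size<0>) is zero and FP16 is
 * implemented.
 */
static bool fp16_pairwise_allocated(uint32_t insn, bool have_fp16)
{
    uint32_t size = (insn >> 22) & 3;

    if ((size & 1) || !have_fp16) {
        return false;
    }
    return true;
}
```

Under these assumptions, 0x5ef0c87a has size = 3, so sz = 1 and the encoding is rejected regardless of FP16 support.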
Fixes: 5c36d89567c ("arm/translate-a64: add all FP16 ops in simd_scalar_pairwise") Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/translate-a64.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 5455ae3685..0bdddb8517 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -8006,7 +8006,7 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) case 0x2f: /* FMINP */ /* FP op, size[0] is 32 or 64 bit*/ if (!u) { - if (!dc_isar_feature(aa64_fp16, s)) { + if ((size & 1) || !dc_isar_feature(aa64_fp16, s)) { unallocated_encoding(s); return; } else { From patchwork Fri May 24 23:20:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673800 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 73D96C25B7D for ; Fri, 24 May 2024 23:24:46 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeEJ-0006W2-GY; Fri, 24 May 2024 19:21:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeED-0006Sq-G3 for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:37 -0400 Received: from mail-pl1-x632.google.com ([2607:f8b0:4864:20::632]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeE6-0005jj-UY for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:37 -0400 Received: by mail-pl1-x632.google.com with SMTP id 
d9443c01a7336-1f333e7a669so26693145ad.3 for ; Fri, 24 May 2024 16:21:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716592889; x=1717197689; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=MjEwuXz1xHyiWkfLDFaYCMc2pxEMlYMp8DVPWneWf7A=; b=ZXRXHwj8mk72ExpFCScATc83SfXDS8JA6nKAnikoDq46jRpdg9jgQud6QEPUqKMZFf s5cknJKlQLeQ8zgtfPcCcViT5KMZYCl03YuceJZnTqpkzKncIWEXiV1E7PPl1/jk7woa gABEyGipR2hADyFy48t3Ff49fvprSc0uSHUJQ8OWGu3Rt23T+SuT0DPDM5/UDTugWE0A 2Q//MG/j2T/zP0aEKGgcNbEnWV085r6ndQ6W801AUiQw5zaobR69/zFoo2B/ud7m4HZK pViuwUOj5r8FUOJBmWqJ6Zp1OBBFF2BhdAIGviJ3ksYrD4asLqoYrVI9ZRYyaZGC2GAz oXvw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716592889; x=1717197689; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=MjEwuXz1xHyiWkfLDFaYCMc2pxEMlYMp8DVPWneWf7A=; b=oBYvqij1NgYD0Km3uODtPOiB8WZf3BQFbKOVDMyBnE2VxNOMIaj3JtTouc4vhxFA1T 5aIF10psm89w+Nb7+49QMJXFdkLycbClMntQQL6Iq/59bLpYZeTmyVCB8+eHePGqbSYH 90Kj08NhAzBBnAxZCSaQSUZSICzZQmbMfD6aEAnORJtyO+tfVdLAtp8RFUtjXdbgFS29 QQDLDfoahE2ut6+kf6HaVXkumDUWqHWAYUquT/Artr9exouRKsPpnofYcQCwOdApZi3g 6ybtVcjrY7bYuZPHCbwydixB8h3WhI/uDInALcH/vgsV+t9GSZ5LGkSBa476083yD+Bk 6c0A== X-Gm-Message-State: AOJu0Yy575t78meE/oJqngoesQmNxSPqQbjzmRV1X/SO786tX8uyDcZQ 2K7h/k1FW/oJEN1RlD7vRJs0iTT08M3KntGAhyg1qj/8CfZ6k6RdRLTpRH2RREDgGp8He2V/kV+ A X-Google-Smtp-Source: AGHT+IH8msPm9V6N0jE/4t/79D85ejHU9zIyKa6Cge55mO9ykoAh08h1Wgubrzd55X+BDnoCu1nCYg== X-Received: by 2002:a17:902:ea0a:b0:1f2:fd9a:dbf3 with SMTP id d9443c01a7336-1f4486d1fa4mr43336655ad.11.1716592888821; Fri, 24 May 2024 16:21:28 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f44c759ceesm19178305ad.10.2024.05.24.16.21.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:21:28 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= Subject: [PATCH v2 07/67] target/arm: Split out gengvec.c Date: Fri, 24 May 2024 16:20:21 -0700 Message-Id: <20240524232121.284515-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::632; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x632.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Reviewed-by: Peter Maydell Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- target/arm/tcg/translate.h | 5 + target/arm/tcg/gengvec.c | 1612 ++++++++++++++++++++++++++++++++++++ target/arm/tcg/translate.c | 1588 ----------------------------------- target/arm/tcg/meson.build | 1 + 4 files changed, 1618 insertions(+), 1588 deletions(-) create mode 100644 target/arm/tcg/gengvec.c diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h index dc66ff2190..80e85096a8 100644 --- a/target/arm/tcg/translate.h +++ 
b/target/arm/tcg/translate.h @@ -445,6 +445,11 @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh); +void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh); +void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh); +void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh); + void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c new file mode 100644 index 0000000000..7a1856253f --- /dev/null +++ b/target/arm/tcg/gengvec.c @@ -0,0 +1,1612 @@ +/* + * ARM generic vector expansion + * + * Copyright (c) 2003 Fabrice Bellard + * Copyright (c) 2005-2007 CodeSourcery + * Copyright (c) 2007 OpenedHand, Ltd. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . 
+ */ + +#include "qemu/osdep.h" +#include "translate.h" + + +static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz, + gen_helper_gvec_3_ptr *fn) +{ + TCGv_ptr qc_ptr = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(qc_ptr, tcg_env, offsetof(CPUARMState, vfp.qc)); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr, + opr_sz, max_sz, 0, fn); +} + +void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3_ptr * const fns[2] = { + gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32 + }; + tcg_debug_assert(vece >= 1 && vece <= 2); + gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]); +} + +void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3_ptr * const fns[2] = { + gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32 + }; + tcg_debug_assert(vece >= 1 && vece <= 2); + gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]); +} + +#define GEN_CMP0(NAME, COND) \ + void NAME(unsigned vece, uint32_t d, uint32_t m, \ + uint32_t opr_sz, uint32_t max_sz) \ + { tcg_gen_gvec_cmpi(COND, vece, d, m, 0, opr_sz, max_sz); } + +GEN_CMP0(gen_gvec_ceq0, TCG_COND_EQ) +GEN_CMP0(gen_gvec_cle0, TCG_COND_LE) +GEN_CMP0(gen_gvec_cge0, TCG_COND_GE) +GEN_CMP0(gen_gvec_clt0, TCG_COND_LT) +GEN_CMP0(gen_gvec_cgt0, TCG_COND_GT) + +#undef GEN_CMP0 + +static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + tcg_gen_vec_sar8i_i64(a, a, shift); + tcg_gen_vec_add8_i64(d, d, a); +} + +static void gen_ssra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + tcg_gen_vec_sar16i_i64(a, a, shift); + tcg_gen_vec_add16_i64(d, d, a); +} + +static void gen_ssra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift) +{ + tcg_gen_sari_i32(a, a, shift); + tcg_gen_add_i32(d, d, a); +} + +static void 
gen_ssra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + tcg_gen_sari_i64(a, a, shift); + tcg_gen_add_i64(d, d, a); +} + +static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + tcg_gen_sari_vec(vece, a, a, sh); + tcg_gen_add_vec(vece, d, d, a); +} + +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_ssra8_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_ssra16_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_ssra32_i32, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_ssra64_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. 
+ */ + shift = MIN(shift, (8 << vece) - 1); + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} + +static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + tcg_gen_vec_shr8i_i64(a, a, shift); + tcg_gen_vec_add8_i64(d, d, a); +} + +static void gen_usra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + tcg_gen_vec_shr16i_i64(a, a, shift); + tcg_gen_vec_add16_i64(d, d, a); +} + +static void gen_usra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift) +{ + tcg_gen_shri_i32(a, a, shift); + tcg_gen_add_i32(d, d, a); +} + +static void gen_usra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + tcg_gen_shri_i64(a, a, shift); + tcg_gen_add_i64(d, d, a); +} + +static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + tcg_gen_shri_vec(vece, a, a, sh); + tcg_gen_add_vec(vece, d, d, a); +} + +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_usra8_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8, }, + { .fni8 = gen_usra16_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16, }, + { .fni4 = gen_usra32_i32, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32, }, + { .fni8 = gen_usra64_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64, }, + }; + + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. 
+ * Unsigned results in all zeros as input to accumulate: nop. + */ + if (shift < (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} + +/* + * Shift one less than the requested amount, and the low bit is + * the rounding bit. For the 8 and 16-bit operations, because we + * mask the low bit, we can perform a normal integer shift instead + * of a vector shift. + */ +static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); + tcg_gen_vec_sar8i_i64(d, a, sh); + tcg_gen_vec_add8_i64(d, d, t); +} + +static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); + tcg_gen_vec_sar16i_i64(d, a, sh); + tcg_gen_vec_add16_i64(d, d, t); +} + +void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t; + + /* Handle shift by the input size for the benefit of trans_SRSHR_ri */ + if (sh == 32) { + tcg_gen_movi_i32(d, 0); + return; + } + t = tcg_temp_new_i32(); + tcg_gen_extract_i32(t, a, sh - 1, 1); + tcg_gen_sari_i32(d, a, sh); + tcg_gen_add_i32(d, d, t); +} + + void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_extract_i64(t, a, sh - 1, 1); + tcg_gen_sari_i64(d, a, sh); + tcg_gen_add_i64(d, d, t); +} + +static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec ones = tcg_temp_new_vec_matching(d); + + tcg_gen_shri_vec(vece, t, a, sh - 1); + tcg_gen_dupi_vec(vece, ones, 1); + tcg_gen_and_vec(vece, t, t, ones); + tcg_gen_sari_vec(vece, d, a, sh); + tcg_gen_add_vec(vece, d, d, t); +} + +void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, 
uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_srshr8_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_srshr16_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_srshr32_i32, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_srshr64_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + if (shift == (8 << vece)) { + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. With rounding, this produces + * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0. + * I.e. always zero. 
+ */ + tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + gen_srshr8_i64(t, a, sh); + tcg_gen_vec_add8_i64(d, d, t); +} + +static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + gen_srshr16_i64(t, a, sh); + tcg_gen_vec_add16_i64(d, d, t); +} + +static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + gen_srshr32_i32(t, a, sh); + tcg_gen_add_i32(d, d, t); +} + +static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + gen_srshr64_i64(t, a, sh); + tcg_gen_add_i64(d, d, t); +} + +static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + gen_srshr_vec(vece, t, a, sh); + tcg_gen_add_vec(vece, d, d, t); +} + +void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_srsra8_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fni8 = gen_srsra16_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_srsra32_i32, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_srsra64_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in 
the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. With rounding, this produces + * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0. + * I.e. always zero. With accumulation, this leaves D unchanged. + */ + if (shift == (8 << vece)) { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); + tcg_gen_vec_shr8i_i64(d, a, sh); + tcg_gen_vec_add8_i64(d, d, t); +} + +static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); + tcg_gen_vec_shr16i_i64(d, a, sh); + tcg_gen_vec_add16_i64(d, d, t); +} + +void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t; + + /* Handle shift by the input size for the benefit of trans_URSHR_ri */ + if (sh == 32) { + tcg_gen_extract_i32(d, a, sh - 1, 1); + return; + } + t = tcg_temp_new_i32(); + tcg_gen_extract_i32(t, a, sh - 1, 1); + tcg_gen_shri_i32(d, a, sh); + tcg_gen_add_i32(d, d, t); +} + +void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_extract_i64(t, a, sh - 1, 1); + tcg_gen_shri_i64(d, a, sh); + tcg_gen_add_i64(d, d, t); +} + +static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec ones = tcg_temp_new_vec_matching(d); + + tcg_gen_shri_vec(vece, t, a, shift - 1); + tcg_gen_dupi_vec(vece, ones, 1); + tcg_gen_and_vec(vece, t, t, ones); + tcg_gen_shri_vec(vece, d, a, shift); + 
tcg_gen_add_vec(vece, d, d, t); +} + +void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_urshr8_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_urshr16_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_urshr32_i32, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_urshr64_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + if (shift == (8 << vece)) { + /* + * Shifts larger than the element size are architecturally valid. + * Unsigned results in zero. With rounding, this produces a + * copy of the most significant bit. 
+ */ + tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (sh == 8) { + tcg_gen_vec_shr8i_i64(t, a, 7); + } else { + gen_urshr8_i64(t, a, sh); + } + tcg_gen_vec_add8_i64(d, d, t); +} + +static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (sh == 16) { + tcg_gen_vec_shr16i_i64(t, a, 15); + } else { + gen_urshr16_i64(t, a, sh); + } + tcg_gen_vec_add16_i64(d, d, t); +} + +static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + if (sh == 32) { + tcg_gen_shri_i32(t, a, 31); + } else { + gen_urshr32_i32(t, a, sh); + } + tcg_gen_add_i32(d, d, t); +} + +static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (sh == 64) { + tcg_gen_shri_i64(t, a, 63); + } else { + gen_urshr64_i64(t, a, sh); + } + tcg_gen_add_i64(d, d, t); +} + +static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + if (sh == (8 << vece)) { + tcg_gen_shri_vec(vece, t, a, sh - 1); + } else { + gen_urshr_vec(vece, t, a, sh); + } + tcg_gen_add_vec(vece, d, d, t); +} + +void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_ursra8_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fni8 = gen_ursra16_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_ursra32_i32, + .fniv = gen_ursra_vec, + .fno = 
gen_helper_gvec_ursra_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_ursra64_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} + +static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + uint64_t mask = dup_const(MO_8, 0xff >> shift); + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, shift); + tcg_gen_andi_i64(t, t, mask); + tcg_gen_andi_i64(d, d, ~mask); + tcg_gen_or_i64(d, d, t); +} + +static void gen_shr16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + uint64_t mask = dup_const(MO_16, 0xffff >> shift); + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, shift); + tcg_gen_andi_i64(t, t, mask); + tcg_gen_andi_i64(d, d, ~mask); + tcg_gen_or_i64(d, d, t); +} + +static void gen_shr32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift) +{ + tcg_gen_shri_i32(a, a, shift); + tcg_gen_deposit_i32(d, d, a, 0, 32 - shift); +} + +static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + tcg_gen_shri_i64(a, a, shift); + tcg_gen_deposit_i64(d, d, a, 0, 64 - shift); +} + +static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec m = tcg_temp_new_vec_matching(d); + + tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh)); + tcg_gen_shri_vec(vece, t, a, sh); + tcg_gen_and_vec(vece, d, d, m); + tcg_gen_or_vec(vece, d, d, t); +} + +void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 }; + const GVecGen2i ops[4] = { + { .fni8 = 
gen_shr8_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_shr16_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_shr32_ins_i32, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_shr64_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* Shift of esize leaves destination unchanged. */ + if (shift < (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. 
*/ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} + +static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + uint64_t mask = dup_const(MO_8, 0xff << shift); + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shli_i64(t, a, shift); + tcg_gen_andi_i64(t, t, mask); + tcg_gen_andi_i64(d, d, ~mask); + tcg_gen_or_i64(d, d, t); +} + +static void gen_shl16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + uint64_t mask = dup_const(MO_16, 0xffff << shift); + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shli_i64(t, a, shift); + tcg_gen_andi_i64(t, t, mask); + tcg_gen_andi_i64(d, d, ~mask); + tcg_gen_or_i64(d, d, t); +} + +static void gen_shl32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift) +{ + tcg_gen_deposit_i32(d, d, a, shift, 32 - shift); +} + +static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + tcg_gen_deposit_i64(d, d, a, shift, 64 - shift); +} + +static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec m = tcg_temp_new_vec_matching(d); + + tcg_gen_shli_vec(vece, t, a, sh); + tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh)); + tcg_gen_and_vec(vece, d, d, m); + tcg_gen_or_vec(vece, d, d, t); +} + +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 }; + const GVecGen2i ops[4] = { + { .fni8 = gen_shl8_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_shl16_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_shl32_ins_i32, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_shl64_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = 
gen_helper_gvec_sli_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [0..esize-1]. */ + tcg_debug_assert(shift >= 0); + tcg_debug_assert(shift < (8 << vece)); + + if (shift == 0) { + tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + gen_helper_neon_mul_u8(a, a, b); + gen_helper_neon_add_u8(d, d, a); +} + +static void gen_mls8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + gen_helper_neon_mul_u8(a, a, b); + gen_helper_neon_sub_u8(d, d, a); +} + +static void gen_mla16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + gen_helper_neon_mul_u16(a, a, b); + gen_helper_neon_add_u16(d, d, a); +} + +static void gen_mls16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + gen_helper_neon_mul_u16(a, a, b); + gen_helper_neon_sub_u16(d, d, a); +} + +static void gen_mla32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + tcg_gen_mul_i32(a, a, b); + tcg_gen_add_i32(d, d, a); +} + +static void gen_mls32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + tcg_gen_mul_i32(a, a, b); + tcg_gen_sub_i32(d, d, a); +} + +static void gen_mla64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + tcg_gen_mul_i64(a, a, b); + tcg_gen_add_i64(d, d, a); +} + +static void gen_mls64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + tcg_gen_mul_i64(a, a, b); + tcg_gen_sub_i64(d, d, a); +} + +static void gen_mla_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + tcg_gen_mul_vec(vece, a, a, b); + tcg_gen_add_vec(vece, d, d, a); +} + +static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + tcg_gen_mul_vec(vece, a, a, b); + tcg_gen_sub_vec(vece, d, d, a); +} + +/* Note that while NEON does not support VMLA and VMLS as 64-bit ops, + * these tables are shared with AArch64 which does support them. 
+ */ +void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_mul_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fni4 = gen_mla8_i32, + .fniv = gen_mla_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni4 = gen_mla16_i32, + .fniv = gen_mla_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_mla32_i32, + .fniv = gen_mla_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_mla64_i64, + .fniv = gen_mla_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_mul_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fni4 = gen_mls8_i32, + .fniv = gen_mls_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni4 = gen_mls16_i32, + .fniv = gen_mls_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_mls32_i32, + .fniv = gen_mls_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_mls64_i64, + .fniv = gen_mls_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +/* CMTST : test is "if (X & Y != 0)". 
*/ +static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + tcg_gen_and_i32(d, a, b); + tcg_gen_negsetcond_i32(TCG_COND_NE, d, d, tcg_constant_i32(0)); +} + +void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + tcg_gen_and_i64(d, a, b); + tcg_gen_negsetcond_i64(TCG_COND_NE, d, d, tcg_constant_i64(0)); +} + +static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + tcg_gen_and_vec(vece, d, a, b); + tcg_gen_dupi_vec(vece, a, 0); + tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a); +} + +void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 }; + static const GVecGen3 ops[4] = { + { .fni4 = gen_helper_neon_tst_u8, + .fniv = gen_cmtst_vec, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni4 = gen_helper_neon_tst_u16, + .fniv = gen_cmtst_vec, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_cmtst_i32, + .fniv = gen_cmtst_vec, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_cmtst_i64, + .fniv = gen_cmtst_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift) +{ + TCGv_i32 lval = tcg_temp_new_i32(); + TCGv_i32 rval = tcg_temp_new_i32(); + TCGv_i32 lsh = tcg_temp_new_i32(); + TCGv_i32 rsh = tcg_temp_new_i32(); + TCGv_i32 zero = tcg_constant_i32(0); + TCGv_i32 max = tcg_constant_i32(32); + + /* + * Rely on the TCG guarantee that out of range shifts produce + * unspecified results, not undefined behaviour (i.e. no trap). + * Discard out-of-range results after the fact. 
+ */ + tcg_gen_ext8s_i32(lsh, shift); + tcg_gen_neg_i32(rsh, lsh); + tcg_gen_shl_i32(lval, src, lsh); + tcg_gen_shr_i32(rval, src, rsh); + tcg_gen_movcond_i32(TCG_COND_LTU, dst, lsh, max, lval, zero); + tcg_gen_movcond_i32(TCG_COND_LTU, dst, rsh, max, rval, dst); +} + +void gen_ushl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift) +{ + TCGv_i64 lval = tcg_temp_new_i64(); + TCGv_i64 rval = tcg_temp_new_i64(); + TCGv_i64 lsh = tcg_temp_new_i64(); + TCGv_i64 rsh = tcg_temp_new_i64(); + TCGv_i64 zero = tcg_constant_i64(0); + TCGv_i64 max = tcg_constant_i64(64); + + /* + * Rely on the TCG guarantee that out of range shifts produce + * unspecified results, not undefined behaviour (i.e. no trap). + * Discard out-of-range results after the fact. + */ + tcg_gen_ext8s_i64(lsh, shift); + tcg_gen_neg_i64(rsh, lsh); + tcg_gen_shl_i64(lval, src, lsh); + tcg_gen_shr_i64(rval, src, rsh); + tcg_gen_movcond_i64(TCG_COND_LTU, dst, lsh, max, lval, zero); + tcg_gen_movcond_i64(TCG_COND_LTU, dst, rsh, max, rval, dst); +} + +static void gen_ushl_vec(unsigned vece, TCGv_vec dst, + TCGv_vec src, TCGv_vec shift) +{ + TCGv_vec lval = tcg_temp_new_vec_matching(dst); + TCGv_vec rval = tcg_temp_new_vec_matching(dst); + TCGv_vec lsh = tcg_temp_new_vec_matching(dst); + TCGv_vec rsh = tcg_temp_new_vec_matching(dst); + TCGv_vec msk, max; + + tcg_gen_neg_vec(vece, rsh, shift); + if (vece == MO_8) { + tcg_gen_mov_vec(lsh, shift); + } else { + msk = tcg_temp_new_vec_matching(dst); + tcg_gen_dupi_vec(vece, msk, 0xff); + tcg_gen_and_vec(vece, lsh, shift, msk); + tcg_gen_and_vec(vece, rsh, rsh, msk); + } + + /* + * Rely on the TCG guarantee that out of range shifts produce + * unspecified results, not undefined behaviour (i.e. no trap). + * Discard out-of-range results after the fact. 
+ */ + tcg_gen_shlv_vec(vece, lval, src, lsh); + tcg_gen_shrv_vec(vece, rval, src, rsh); + + max = tcg_temp_new_vec_matching(dst); + tcg_gen_dupi_vec(vece, max, 8 << vece); + + /* + * The choice of LT (signed) and GEU (unsigned) are biased toward + * the instructions of the x86_64 host. For MO_8, the whole byte + * is significant so we must use an unsigned compare; otherwise we + * have already masked to a byte and so a signed compare works. + * Other tcg hosts have a full set of comparisons and do not care. + */ + if (vece == MO_8) { + tcg_gen_cmp_vec(TCG_COND_GEU, vece, lsh, lsh, max); + tcg_gen_cmp_vec(TCG_COND_GEU, vece, rsh, rsh, max); + tcg_gen_andc_vec(vece, lval, lval, lsh); + tcg_gen_andc_vec(vece, rval, rval, rsh); + } else { + tcg_gen_cmp_vec(TCG_COND_LT, vece, lsh, lsh, max); + tcg_gen_cmp_vec(TCG_COND_LT, vece, rsh, rsh, max); + tcg_gen_and_vec(vece, lval, lval, lsh); + tcg_gen_and_vec(vece, rval, rval, rsh); + } + tcg_gen_or_vec(vece, dst, lval, rval); +} + +void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_neg_vec, INDEX_op_shlv_vec, + INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_ushl_vec, + .fno = gen_helper_gvec_ushl_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_ushl_vec, + .fno = gen_helper_gvec_ushl_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_ushl_i32, + .fniv = gen_ushl_vec, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_ushl_i64, + .fniv = gen_ushl_vec, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift) +{ + TCGv_i32 lval = tcg_temp_new_i32(); + TCGv_i32 rval = tcg_temp_new_i32(); + TCGv_i32 lsh = tcg_temp_new_i32(); + TCGv_i32 rsh = tcg_temp_new_i32(); + TCGv_i32 zero = 
tcg_constant_i32(0); + TCGv_i32 max = tcg_constant_i32(31); + + /* + * Rely on the TCG guarantee that out of range shifts produce + * unspecified results, not undefined behaviour (i.e. no trap). + * Discard out-of-range results after the fact. + */ + tcg_gen_ext8s_i32(lsh, shift); + tcg_gen_neg_i32(rsh, lsh); + tcg_gen_shl_i32(lval, src, lsh); + tcg_gen_umin_i32(rsh, rsh, max); + tcg_gen_sar_i32(rval, src, rsh); + tcg_gen_movcond_i32(TCG_COND_LEU, lval, lsh, max, lval, zero); + tcg_gen_movcond_i32(TCG_COND_LT, dst, lsh, zero, rval, lval); +} + +void gen_sshl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift) +{ + TCGv_i64 lval = tcg_temp_new_i64(); + TCGv_i64 rval = tcg_temp_new_i64(); + TCGv_i64 lsh = tcg_temp_new_i64(); + TCGv_i64 rsh = tcg_temp_new_i64(); + TCGv_i64 zero = tcg_constant_i64(0); + TCGv_i64 max = tcg_constant_i64(63); + + /* + * Rely on the TCG guarantee that out of range shifts produce + * unspecified results, not undefined behaviour (i.e. no trap). + * Discard out-of-range results after the fact. + */ + tcg_gen_ext8s_i64(lsh, shift); + tcg_gen_neg_i64(rsh, lsh); + tcg_gen_shl_i64(lval, src, lsh); + tcg_gen_umin_i64(rsh, rsh, max); + tcg_gen_sar_i64(rval, src, rsh); + tcg_gen_movcond_i64(TCG_COND_LEU, lval, lsh, max, lval, zero); + tcg_gen_movcond_i64(TCG_COND_LT, dst, lsh, zero, rval, lval); +} + +static void gen_sshl_vec(unsigned vece, TCGv_vec dst, + TCGv_vec src, TCGv_vec shift) +{ + TCGv_vec lval = tcg_temp_new_vec_matching(dst); + TCGv_vec rval = tcg_temp_new_vec_matching(dst); + TCGv_vec lsh = tcg_temp_new_vec_matching(dst); + TCGv_vec rsh = tcg_temp_new_vec_matching(dst); + TCGv_vec tmp = tcg_temp_new_vec_matching(dst); + + /* + * Rely on the TCG guarantee that out of range shifts produce + * unspecified results, not undefined behaviour (i.e. no trap). + * Discard out-of-range results after the fact. 
+ */ + tcg_gen_neg_vec(vece, rsh, shift); + if (vece == MO_8) { + tcg_gen_mov_vec(lsh, shift); + } else { + tcg_gen_dupi_vec(vece, tmp, 0xff); + tcg_gen_and_vec(vece, lsh, shift, tmp); + tcg_gen_and_vec(vece, rsh, rsh, tmp); + } + + /* Bound rsh so out of bound right shift gets -1. */ + tcg_gen_dupi_vec(vece, tmp, (8 << vece) - 1); + tcg_gen_umin_vec(vece, rsh, rsh, tmp); + tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, tmp); + + tcg_gen_shlv_vec(vece, lval, src, lsh); + tcg_gen_sarv_vec(vece, rval, src, rsh); + + /* Select in-bound left shift. */ + tcg_gen_andc_vec(vece, lval, lval, tmp); + + /* Select between left and right shift. */ + if (vece == MO_8) { + tcg_gen_dupi_vec(vece, tmp, 0); + tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, rval, lval); + } else { + tcg_gen_dupi_vec(vece, tmp, 0x80); + tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, lval, rval); + } +} + +void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec, + INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_sshl_vec, + .fno = gen_helper_gvec_sshl_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_sshl_vec, + .fno = gen_helper_gvec_sshl_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_sshl_i32, + .fniv = gen_sshl_vec, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_sshl_i64, + .fniv = gen_sshl_vec, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, + TCGv_vec a, TCGv_vec b) +{ + TCGv_vec x = tcg_temp_new_vec_matching(t); + tcg_gen_add_vec(vece, x, a, b); + tcg_gen_usadd_vec(vece, t, a, b); + tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); + tcg_gen_or_vec(vece, sat, sat, x); +} + +void 
gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_uqadd_vec, + .fno = gen_helper_gvec_uqadd_b, + .write_aofs = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_uqadd_vec, + .fno = gen_helper_gvec_uqadd_h, + .write_aofs = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fniv = gen_uqadd_vec, + .fno = gen_helper_gvec_uqadd_s, + .write_aofs = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fniv = gen_uqadd_vec, + .fno = gen_helper_gvec_uqadd_d, + .write_aofs = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, + TCGv_vec a, TCGv_vec b) +{ + TCGv_vec x = tcg_temp_new_vec_matching(t); + tcg_gen_add_vec(vece, x, a, b); + tcg_gen_ssadd_vec(vece, t, a, b); + tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); + tcg_gen_or_vec(vece, sat, sat, x); +} + +void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_sqadd_vec, + .fno = gen_helper_gvec_sqadd_b, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_8 }, + { .fniv = gen_sqadd_vec, + .fno = gen_helper_gvec_sqadd_h, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_16 }, + { .fniv = gen_sqadd_vec, + .fno = gen_helper_gvec_sqadd_s, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_32 }, + { .fniv = gen_sqadd_vec, + .fno = gen_helper_gvec_sqadd_d, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, 
offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, + TCGv_vec a, TCGv_vec b) +{ + TCGv_vec x = tcg_temp_new_vec_matching(t); + tcg_gen_sub_vec(vece, x, a, b); + tcg_gen_ussub_vec(vece, t, a, b); + tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); + tcg_gen_or_vec(vece, sat, sat, x); +} + +void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_uqsub_vec, + .fno = gen_helper_gvec_uqsub_b, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_8 }, + { .fniv = gen_uqsub_vec, + .fno = gen_helper_gvec_uqsub_h, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_16 }, + { .fniv = gen_uqsub_vec, + .fno = gen_helper_gvec_uqsub_s, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_32 }, + { .fniv = gen_uqsub_vec, + .fno = gen_helper_gvec_uqsub_d, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, + TCGv_vec a, TCGv_vec b) +{ + TCGv_vec x = tcg_temp_new_vec_matching(t); + tcg_gen_sub_vec(vece, x, a, b); + tcg_gen_sssub_vec(vece, t, a, b); + tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); + tcg_gen_or_vec(vece, sat, sat, x); +} + +void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_sqsub_vec, + .fno = gen_helper_gvec_sqsub_b, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_8 }, + { .fniv = gen_sqsub_vec, 
+ .fno = gen_helper_gvec_sqsub_h, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_16 }, + { .fniv = gen_sqsub_vec, + .fno = gen_helper_gvec_sqsub_s, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_32 }, + { .fniv = gen_sqsub_vec, + .fno = gen_helper_gvec_sqsub_d, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_sub_i32(t, a, b); + tcg_gen_sub_i32(d, b, a); + tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t); +} + +static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_sub_i64(t, a, b); + tcg_gen_sub_i64(d, b, a); + tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t); +} + +static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + tcg_gen_smin_vec(vece, t, a, b); + tcg_gen_smax_vec(vece, d, a, b); + tcg_gen_sub_vec(vece, d, d, t); +} + +void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_sabd_i32, + .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_sabd_i64, + .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void 
gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_sub_i32(t, a, b); + tcg_gen_sub_i32(d, b, a); + tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t); +} + +static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_sub_i64(t, a, b); + tcg_gen_sub_i64(d, b, a); + tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t); +} + +static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + tcg_gen_umin_vec(vece, t, a, b); + tcg_gen_umax_vec(vece, d, a, b); + tcg_gen_sub_vec(vece, d, d, t); +} + +void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_uabd_i32, + .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_uabd_i64, + .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + gen_sabd_i32(t, a, b); + tcg_gen_add_i32(d, d, t); +} + +static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_sabd_i64(t, a, b); + tcg_gen_add_i64(d, d, t); +} + +static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + gen_sabd_vec(vece, t, a, b); + tcg_gen_add_vec(vece, 
d, d, t); +} + +void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_add_vec, + INDEX_op_smin_vec, INDEX_op_smax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_saba_i32, + .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_saba_i64, + .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + gen_uabd_i32(t, a, b); + tcg_gen_add_i32(d, d, t); +} + +static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_uabd_i64(t, a, b); + tcg_gen_add_i64(d, d, t); +} + +static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + gen_uabd_vec(vece, t, a, b); + tcg_gen_add_vec(vece, d, d, t); +} + +void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_add_vec, + INDEX_op_umin_vec, INDEX_op_umax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_h, + .opt_opc = vecop_list, + .load_dest = true, + 
.vece = MO_16 }, + { .fni4 = gen_uaba_i32, + .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_uaba_i64, + .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c index 7c09153b6e..52cd5daf0f 100644 --- a/target/arm/tcg/translate.c +++ b/target/arm/tcg/translate.c @@ -2912,1594 +2912,6 @@ static void gen_exception_return(DisasContext *s, TCGv_i32 pc) gen_rfe(s, pc, load_cpu_field(spsr)); } -static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, - uint32_t opr_sz, uint32_t max_sz, - gen_helper_gvec_3_ptr *fn) -{ - TCGv_ptr qc_ptr = tcg_temp_new_ptr(); - - tcg_gen_addi_ptr(qc_ptr, tcg_env, offsetof(CPUARMState, vfp.qc)); - tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr, - opr_sz, max_sz, 0, fn); -} - -void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static gen_helper_gvec_3_ptr * const fns[2] = { - gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32 - }; - tcg_debug_assert(vece >= 1 && vece <= 2); - gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]); -} - -void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static gen_helper_gvec_3_ptr * const fns[2] = { - gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32 - }; - tcg_debug_assert(vece >= 1 && vece <= 2); - gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]); -} - -#define GEN_CMP0(NAME, COND) \ - void NAME(unsigned vece, uint32_t d, uint32_t m, \ - uint32_t opr_sz, uint32_t max_sz) \ - { tcg_gen_gvec_cmpi(COND, vece, d, m, 0, opr_sz, max_sz); } - 
-GEN_CMP0(gen_gvec_ceq0, TCG_COND_EQ) -GEN_CMP0(gen_gvec_cle0, TCG_COND_LE) -GEN_CMP0(gen_gvec_cge0, TCG_COND_GE) -GEN_CMP0(gen_gvec_clt0, TCG_COND_LT) -GEN_CMP0(gen_gvec_cgt0, TCG_COND_GT) - -#undef GEN_CMP0 - -static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) -{ - tcg_gen_vec_sar8i_i64(a, a, shift); - tcg_gen_vec_add8_i64(d, d, a); -} - -static void gen_ssra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) -{ - tcg_gen_vec_sar16i_i64(a, a, shift); - tcg_gen_vec_add16_i64(d, d, a); -} - -static void gen_ssra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift) -{ - tcg_gen_sari_i32(a, a, shift); - tcg_gen_add_i32(d, d, a); -} - -static void gen_ssra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) -{ - tcg_gen_sari_i64(a, a, shift); - tcg_gen_add_i64(d, d, a); -} - -static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) -{ - tcg_gen_sari_vec(vece, a, a, sh); - tcg_gen_add_vec(vece, d, d, a); -} - -void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, - int64_t shift, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_sari_vec, INDEX_op_add_vec, 0 - }; - static const GVecGen2i ops[4] = { - { .fni8 = gen_ssra8_i64, - .fniv = gen_ssra_vec, - .fno = gen_helper_gvec_ssra_b, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fni8 = gen_ssra16_i64, - .fniv = gen_ssra_vec, - .fno = gen_helper_gvec_ssra_h, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fni4 = gen_ssra32_i32, - .fniv = gen_ssra_vec, - .fno = gen_helper_gvec_ssra_s, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fni8 = gen_ssra64_i64, - .fniv = gen_ssra_vec, - .fno = gen_helper_gvec_ssra_d, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_64 }, - }; - - /* tszimm encoding produces immediates in the range [1..esize]. 
*/ - tcg_debug_assert(shift > 0); - tcg_debug_assert(shift <= (8 << vece)); - - /* - * Shifts larger than the element size are architecturally valid. - * Signed results in all sign bits. - */ - shift = MIN(shift, (8 << vece) - 1); - tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); -} - -static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) -{ - tcg_gen_vec_shr8i_i64(a, a, shift); - tcg_gen_vec_add8_i64(d, d, a); -} - -static void gen_usra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) -{ - tcg_gen_vec_shr16i_i64(a, a, shift); - tcg_gen_vec_add16_i64(d, d, a); -} - -static void gen_usra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift) -{ - tcg_gen_shri_i32(a, a, shift); - tcg_gen_add_i32(d, d, a); -} - -static void gen_usra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) -{ - tcg_gen_shri_i64(a, a, shift); - tcg_gen_add_i64(d, d, a); -} - -static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) -{ - tcg_gen_shri_vec(vece, a, a, sh); - tcg_gen_add_vec(vece, d, d, a); -} - -void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, - int64_t shift, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_shri_vec, INDEX_op_add_vec, 0 - }; - static const GVecGen2i ops[4] = { - { .fni8 = gen_usra8_i64, - .fniv = gen_usra_vec, - .fno = gen_helper_gvec_usra_b, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_8, }, - { .fni8 = gen_usra16_i64, - .fniv = gen_usra_vec, - .fno = gen_helper_gvec_usra_h, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_16, }, - { .fni4 = gen_usra32_i32, - .fniv = gen_usra_vec, - .fno = gen_helper_gvec_usra_s, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_32, }, - { .fni8 = gen_usra64_i64, - .fniv = gen_usra_vec, - .fno = gen_helper_gvec_usra_d, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_64, }, - }; - - /* tszimm encoding produces immediates in the range [1..esize]. 
*/ - tcg_debug_assert(shift > 0); - tcg_debug_assert(shift <= (8 << vece)); - - /* - * Shifts larger than the element size are architecturally valid. - * Unsigned results in all zeros as input to accumulate: nop. - */ - if (shift < (8 << vece)) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); - } else { - /* Nop, but we do need to clear the tail. */ - tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); - } -} - -/* - * Shift one less than the requested amount, and the low bit is - * the rounding bit. For the 8 and 16-bit operations, because we - * mask the low bit, we can perform a normal integer shift instead - * of a vector shift. - */ -static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_shri_i64(t, a, sh - 1); - tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); - tcg_gen_vec_sar8i_i64(d, a, sh); - tcg_gen_vec_add8_i64(d, d, t); -} - -static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_shri_i64(t, a, sh - 1); - tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); - tcg_gen_vec_sar16i_i64(d, a, sh); - tcg_gen_vec_add16_i64(d, d, t); -} - -static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) -{ - TCGv_i32 t; - - /* Handle shift by the input size for the benefit of trans_SRSHR_ri */ - if (sh == 32) { - tcg_gen_movi_i32(d, 0); - return; - } - t = tcg_temp_new_i32(); - tcg_gen_extract_i32(t, a, sh - 1, 1); - tcg_gen_sari_i32(d, a, sh); - tcg_gen_add_i32(d, d, t); -} - -static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_extract_i64(t, a, sh - 1, 1); - tcg_gen_sari_i64(d, a, sh); - tcg_gen_add_i64(d, d, t); -} - -static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) -{ - TCGv_vec t = tcg_temp_new_vec_matching(d); - TCGv_vec ones = tcg_temp_new_vec_matching(d); - - tcg_gen_shri_vec(vece, t, a, sh - 1); - tcg_gen_dupi_vec(vece, ones, 1); - 
tcg_gen_and_vec(vece, t, t, ones); - tcg_gen_sari_vec(vece, d, a, sh); - tcg_gen_add_vec(vece, d, d, t); -} - -void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, - int64_t shift, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 - }; - static const GVecGen2i ops[4] = { - { .fni8 = gen_srshr8_i64, - .fniv = gen_srshr_vec, - .fno = gen_helper_gvec_srshr_b, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fni8 = gen_srshr16_i64, - .fniv = gen_srshr_vec, - .fno = gen_helper_gvec_srshr_h, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fni4 = gen_srshr32_i32, - .fniv = gen_srshr_vec, - .fno = gen_helper_gvec_srshr_s, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fni8 = gen_srshr64_i64, - .fniv = gen_srshr_vec, - .fno = gen_helper_gvec_srshr_d, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list, - .vece = MO_64 }, - }; - - /* tszimm encoding produces immediates in the range [1..esize] */ - tcg_debug_assert(shift > 0); - tcg_debug_assert(shift <= (8 << vece)); - - if (shift == (8 << vece)) { - /* - * Shifts larger than the element size are architecturally valid. - * Signed results in all sign bits. With rounding, this produces - * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0. - * I.e. always zero. 
- */ - tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0); - } else { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); - } -} - -static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - gen_srshr8_i64(t, a, sh); - tcg_gen_vec_add8_i64(d, d, t); -} - -static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - gen_srshr16_i64(t, a, sh); - tcg_gen_vec_add16_i64(d, d, t); -} - -static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) -{ - TCGv_i32 t = tcg_temp_new_i32(); - - gen_srshr32_i32(t, a, sh); - tcg_gen_add_i32(d, d, t); -} - -static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - gen_srshr64_i64(t, a, sh); - tcg_gen_add_i64(d, d, t); -} - -static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) -{ - TCGv_vec t = tcg_temp_new_vec_matching(d); - - gen_srshr_vec(vece, t, a, sh); - tcg_gen_add_vec(vece, d, d, t); -} - -void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, - int64_t shift, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 - }; - static const GVecGen2i ops[4] = { - { .fni8 = gen_srsra8_i64, - .fniv = gen_srsra_vec, - .fno = gen_helper_gvec_srsra_b, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_8 }, - { .fni8 = gen_srsra16_i64, - .fniv = gen_srsra_vec, - .fno = gen_helper_gvec_srsra_h, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_16 }, - { .fni4 = gen_srsra32_i32, - .fniv = gen_srsra_vec, - .fno = gen_helper_gvec_srsra_s, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_32 }, - { .fni8 = gen_srsra64_i64, - .fniv = gen_srsra_vec, - .fno = gen_helper_gvec_srsra_d, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_64 }, - }; - - /* tszimm encoding produces immediates in 
the range [1..esize] */ - tcg_debug_assert(shift > 0); - tcg_debug_assert(shift <= (8 << vece)); - - /* - * Shifts larger than the element size are architecturally valid. - * Signed results in all sign bits. With rounding, this produces - * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0. - * I.e. always zero. With accumulation, this leaves D unchanged. - */ - if (shift == (8 << vece)) { - /* Nop, but we do need to clear the tail. */ - tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); - } else { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); - } -} - -static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_shri_i64(t, a, sh - 1); - tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); - tcg_gen_vec_shr8i_i64(d, a, sh); - tcg_gen_vec_add8_i64(d, d, t); -} - -static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_shri_i64(t, a, sh - 1); - tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); - tcg_gen_vec_shr16i_i64(d, a, sh); - tcg_gen_vec_add16_i64(d, d, t); -} - -static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) -{ - TCGv_i32 t; - - /* Handle shift by the input size for the benefit of trans_URSHR_ri */ - if (sh == 32) { - tcg_gen_extract_i32(d, a, sh - 1, 1); - return; - } - t = tcg_temp_new_i32(); - tcg_gen_extract_i32(t, a, sh - 1, 1); - tcg_gen_shri_i32(d, a, sh); - tcg_gen_add_i32(d, d, t); -} - -static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_extract_i64(t, a, sh - 1, 1); - tcg_gen_shri_i64(d, a, sh); - tcg_gen_add_i64(d, d, t); -} - -static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift) -{ - TCGv_vec t = tcg_temp_new_vec_matching(d); - TCGv_vec ones = tcg_temp_new_vec_matching(d); - - tcg_gen_shri_vec(vece, t, a, shift - 1); - tcg_gen_dupi_vec(vece, ones, 1); - tcg_gen_and_vec(vece, t, t, ones); - tcg_gen_shri_vec(vece, d, a, shift); - 
tcg_gen_add_vec(vece, d, d, t); -} - -void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, - int64_t shift, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_shri_vec, INDEX_op_add_vec, 0 - }; - static const GVecGen2i ops[4] = { - { .fni8 = gen_urshr8_i64, - .fniv = gen_urshr_vec, - .fno = gen_helper_gvec_urshr_b, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fni8 = gen_urshr16_i64, - .fniv = gen_urshr_vec, - .fno = gen_helper_gvec_urshr_h, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fni4 = gen_urshr32_i32, - .fniv = gen_urshr_vec, - .fno = gen_helper_gvec_urshr_s, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fni8 = gen_urshr64_i64, - .fniv = gen_urshr_vec, - .fno = gen_helper_gvec_urshr_d, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list, - .vece = MO_64 }, - }; - - /* tszimm encoding produces immediates in the range [1..esize] */ - tcg_debug_assert(shift > 0); - tcg_debug_assert(shift <= (8 << vece)); - - if (shift == (8 << vece)) { - /* - * Shifts larger than the element size are architecturally valid. - * Unsigned results in zero. With rounding, this produces a - * copy of the most significant bit. 
- */ - tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz); - } else { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); - } -} - -static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - if (sh == 8) { - tcg_gen_vec_shr8i_i64(t, a, 7); - } else { - gen_urshr8_i64(t, a, sh); - } - tcg_gen_vec_add8_i64(d, d, t); -} - -static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - if (sh == 16) { - tcg_gen_vec_shr16i_i64(t, a, 15); - } else { - gen_urshr16_i64(t, a, sh); - } - tcg_gen_vec_add16_i64(d, d, t); -} - -static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) -{ - TCGv_i32 t = tcg_temp_new_i32(); - - if (sh == 32) { - tcg_gen_shri_i32(t, a, 31); - } else { - gen_urshr32_i32(t, a, sh); - } - tcg_gen_add_i32(d, d, t); -} - -static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - if (sh == 64) { - tcg_gen_shri_i64(t, a, 63); - } else { - gen_urshr64_i64(t, a, sh); - } - tcg_gen_add_i64(d, d, t); -} - -static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) -{ - TCGv_vec t = tcg_temp_new_vec_matching(d); - - if (sh == (8 << vece)) { - tcg_gen_shri_vec(vece, t, a, sh - 1); - } else { - gen_urshr_vec(vece, t, a, sh); - } - tcg_gen_add_vec(vece, d, d, t); -} - -void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, - int64_t shift, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_shri_vec, INDEX_op_add_vec, 0 - }; - static const GVecGen2i ops[4] = { - { .fni8 = gen_ursra8_i64, - .fniv = gen_ursra_vec, - .fno = gen_helper_gvec_ursra_b, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_8 }, - { .fni8 = gen_ursra16_i64, - .fniv = gen_ursra_vec, - .fno = gen_helper_gvec_ursra_h, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_16 }, - { .fni4 = gen_ursra32_i32, - .fniv = gen_ursra_vec, - .fno = 
gen_helper_gvec_ursra_s, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_32 }, - { .fni8 = gen_ursra64_i64, - .fniv = gen_ursra_vec, - .fno = gen_helper_gvec_ursra_d, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_64 }, - }; - - /* tszimm encoding produces immediates in the range [1..esize] */ - tcg_debug_assert(shift > 0); - tcg_debug_assert(shift <= (8 << vece)); - - tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); -} - -static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) -{ - uint64_t mask = dup_const(MO_8, 0xff >> shift); - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_shri_i64(t, a, shift); - tcg_gen_andi_i64(t, t, mask); - tcg_gen_andi_i64(d, d, ~mask); - tcg_gen_or_i64(d, d, t); -} - -static void gen_shr16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) -{ - uint64_t mask = dup_const(MO_16, 0xffff >> shift); - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_shri_i64(t, a, shift); - tcg_gen_andi_i64(t, t, mask); - tcg_gen_andi_i64(d, d, ~mask); - tcg_gen_or_i64(d, d, t); -} - -static void gen_shr32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift) -{ - tcg_gen_shri_i32(a, a, shift); - tcg_gen_deposit_i32(d, d, a, 0, 32 - shift); -} - -static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) -{ - tcg_gen_shri_i64(a, a, shift); - tcg_gen_deposit_i64(d, d, a, 0, 64 - shift); -} - -static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) -{ - TCGv_vec t = tcg_temp_new_vec_matching(d); - TCGv_vec m = tcg_temp_new_vec_matching(d); - - tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh)); - tcg_gen_shri_vec(vece, t, a, sh); - tcg_gen_and_vec(vece, d, d, m); - tcg_gen_or_vec(vece, d, d, t); -} - -void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, - int64_t shift, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 }; - const GVecGen2i ops[4] = { - { .fni8 = 
gen_shr8_ins_i64, - .fniv = gen_shr_ins_vec, - .fno = gen_helper_gvec_sri_b, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fni8 = gen_shr16_ins_i64, - .fniv = gen_shr_ins_vec, - .fno = gen_helper_gvec_sri_h, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fni4 = gen_shr32_ins_i32, - .fniv = gen_shr_ins_vec, - .fno = gen_helper_gvec_sri_s, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fni8 = gen_shr64_ins_i64, - .fniv = gen_shr_ins_vec, - .fno = gen_helper_gvec_sri_d, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_64 }, - }; - - /* tszimm encoding produces immediates in the range [1..esize]. */ - tcg_debug_assert(shift > 0); - tcg_debug_assert(shift <= (8 << vece)); - - /* Shift of esize leaves destination unchanged. */ - if (shift < (8 << vece)) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); - } else { - /* Nop, but we do need to clear the tail. 
*/ - tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); - } -} - -static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) -{ - uint64_t mask = dup_const(MO_8, 0xff << shift); - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_shli_i64(t, a, shift); - tcg_gen_andi_i64(t, t, mask); - tcg_gen_andi_i64(d, d, ~mask); - tcg_gen_or_i64(d, d, t); -} - -static void gen_shl16_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) -{ - uint64_t mask = dup_const(MO_16, 0xffff << shift); - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_shli_i64(t, a, shift); - tcg_gen_andi_i64(t, t, mask); - tcg_gen_andi_i64(d, d, ~mask); - tcg_gen_or_i64(d, d, t); -} - -static void gen_shl32_ins_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift) -{ - tcg_gen_deposit_i32(d, d, a, shift, 32 - shift); -} - -static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) -{ - tcg_gen_deposit_i64(d, d, a, shift, 64 - shift); -} - -static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) -{ - TCGv_vec t = tcg_temp_new_vec_matching(d); - TCGv_vec m = tcg_temp_new_vec_matching(d); - - tcg_gen_shli_vec(vece, t, a, sh); - tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh)); - tcg_gen_and_vec(vece, d, d, m); - tcg_gen_or_vec(vece, d, d, t); -} - -void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, - int64_t shift, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 }; - const GVecGen2i ops[4] = { - { .fni8 = gen_shl8_ins_i64, - .fniv = gen_shl_ins_vec, - .fno = gen_helper_gvec_sli_b, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fni8 = gen_shl16_ins_i64, - .fniv = gen_shl_ins_vec, - .fno = gen_helper_gvec_sli_h, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fni4 = gen_shl32_ins_i32, - .fniv = gen_shl_ins_vec, - .fno = gen_helper_gvec_sli_s, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fni8 = gen_shl64_ins_i64, - .fniv = gen_shl_ins_vec, - .fno = 
gen_helper_gvec_sli_d, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_64 }, - }; - - /* tszimm encoding produces immediates in the range [0..esize-1]. */ - tcg_debug_assert(shift >= 0); - tcg_debug_assert(shift < (8 << vece)); - - if (shift == 0) { - tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz); - } else { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); - } -} - -static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) -{ - gen_helper_neon_mul_u8(a, a, b); - gen_helper_neon_add_u8(d, d, a); -} - -static void gen_mls8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) -{ - gen_helper_neon_mul_u8(a, a, b); - gen_helper_neon_sub_u8(d, d, a); -} - -static void gen_mla16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) -{ - gen_helper_neon_mul_u16(a, a, b); - gen_helper_neon_add_u16(d, d, a); -} - -static void gen_mls16_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) -{ - gen_helper_neon_mul_u16(a, a, b); - gen_helper_neon_sub_u16(d, d, a); -} - -static void gen_mla32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) -{ - tcg_gen_mul_i32(a, a, b); - tcg_gen_add_i32(d, d, a); -} - -static void gen_mls32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) -{ - tcg_gen_mul_i32(a, a, b); - tcg_gen_sub_i32(d, d, a); -} - -static void gen_mla64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) -{ - tcg_gen_mul_i64(a, a, b); - tcg_gen_add_i64(d, d, a); -} - -static void gen_mls64_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) -{ - tcg_gen_mul_i64(a, a, b); - tcg_gen_sub_i64(d, d, a); -} - -static void gen_mla_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) -{ - tcg_gen_mul_vec(vece, a, a, b); - tcg_gen_add_vec(vece, d, d, a); -} - -static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) -{ - tcg_gen_mul_vec(vece, a, a, b); - tcg_gen_sub_vec(vece, d, d, a); -} - -/* Note that while NEON does not support VMLA and VMLS as 64-bit ops, - * these tables are shared with AArch64 which does support them. 
- */ -void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_mul_vec, INDEX_op_add_vec, 0 - }; - static const GVecGen3 ops[4] = { - { .fni4 = gen_mla8_i32, - .fniv = gen_mla_vec, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fni4 = gen_mla16_i32, - .fniv = gen_mla_vec, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fni4 = gen_mla32_i32, - .fniv = gen_mla_vec, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fni8 = gen_mla64_i64, - .fniv = gen_mla_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_64 }, - }; - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - -void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_mul_vec, INDEX_op_sub_vec, 0 - }; - static const GVecGen3 ops[4] = { - { .fni4 = gen_mls8_i32, - .fniv = gen_mls_vec, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fni4 = gen_mls16_i32, - .fniv = gen_mls_vec, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fni4 = gen_mls32_i32, - .fniv = gen_mls_vec, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fni8 = gen_mls64_i64, - .fniv = gen_mls_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list, - .vece = MO_64 }, - }; - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - -/* CMTST : test is "if (X & Y != 0)". 
*/ -static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) -{ - tcg_gen_and_i32(d, a, b); - tcg_gen_negsetcond_i32(TCG_COND_NE, d, d, tcg_constant_i32(0)); -} - -void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) -{ - tcg_gen_and_i64(d, a, b); - tcg_gen_negsetcond_i64(TCG_COND_NE, d, d, tcg_constant_i64(0)); -} - -static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) -{ - tcg_gen_and_vec(vece, d, a, b); - tcg_gen_dupi_vec(vece, a, 0); - tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a); -} - -void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 }; - static const GVecGen3 ops[4] = { - { .fni4 = gen_helper_neon_tst_u8, - .fniv = gen_cmtst_vec, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fni4 = gen_helper_neon_tst_u16, - .fniv = gen_cmtst_vec, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fni4 = gen_cmtst_i32, - .fniv = gen_cmtst_vec, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fni8 = gen_cmtst_i64, - .fniv = gen_cmtst_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list, - .vece = MO_64 }, - }; - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - -void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift) -{ - TCGv_i32 lval = tcg_temp_new_i32(); - TCGv_i32 rval = tcg_temp_new_i32(); - TCGv_i32 lsh = tcg_temp_new_i32(); - TCGv_i32 rsh = tcg_temp_new_i32(); - TCGv_i32 zero = tcg_constant_i32(0); - TCGv_i32 max = tcg_constant_i32(32); - - /* - * Rely on the TCG guarantee that out of range shifts produce - * unspecified results, not undefined behaviour (i.e. no trap). - * Discard out-of-range results after the fact. 
- */ - tcg_gen_ext8s_i32(lsh, shift); - tcg_gen_neg_i32(rsh, lsh); - tcg_gen_shl_i32(lval, src, lsh); - tcg_gen_shr_i32(rval, src, rsh); - tcg_gen_movcond_i32(TCG_COND_LTU, dst, lsh, max, lval, zero); - tcg_gen_movcond_i32(TCG_COND_LTU, dst, rsh, max, rval, dst); -} - -void gen_ushl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift) -{ - TCGv_i64 lval = tcg_temp_new_i64(); - TCGv_i64 rval = tcg_temp_new_i64(); - TCGv_i64 lsh = tcg_temp_new_i64(); - TCGv_i64 rsh = tcg_temp_new_i64(); - TCGv_i64 zero = tcg_constant_i64(0); - TCGv_i64 max = tcg_constant_i64(64); - - /* - * Rely on the TCG guarantee that out of range shifts produce - * unspecified results, not undefined behaviour (i.e. no trap). - * Discard out-of-range results after the fact. - */ - tcg_gen_ext8s_i64(lsh, shift); - tcg_gen_neg_i64(rsh, lsh); - tcg_gen_shl_i64(lval, src, lsh); - tcg_gen_shr_i64(rval, src, rsh); - tcg_gen_movcond_i64(TCG_COND_LTU, dst, lsh, max, lval, zero); - tcg_gen_movcond_i64(TCG_COND_LTU, dst, rsh, max, rval, dst); -} - -static void gen_ushl_vec(unsigned vece, TCGv_vec dst, - TCGv_vec src, TCGv_vec shift) -{ - TCGv_vec lval = tcg_temp_new_vec_matching(dst); - TCGv_vec rval = tcg_temp_new_vec_matching(dst); - TCGv_vec lsh = tcg_temp_new_vec_matching(dst); - TCGv_vec rsh = tcg_temp_new_vec_matching(dst); - TCGv_vec msk, max; - - tcg_gen_neg_vec(vece, rsh, shift); - if (vece == MO_8) { - tcg_gen_mov_vec(lsh, shift); - } else { - msk = tcg_temp_new_vec_matching(dst); - tcg_gen_dupi_vec(vece, msk, 0xff); - tcg_gen_and_vec(vece, lsh, shift, msk); - tcg_gen_and_vec(vece, rsh, rsh, msk); - } - - /* - * Rely on the TCG guarantee that out of range shifts produce - * unspecified results, not undefined behaviour (i.e. no trap). - * Discard out-of-range results after the fact. 
- */ - tcg_gen_shlv_vec(vece, lval, src, lsh); - tcg_gen_shrv_vec(vece, rval, src, rsh); - - max = tcg_temp_new_vec_matching(dst); - tcg_gen_dupi_vec(vece, max, 8 << vece); - - /* - * The choice of LT (signed) and GEU (unsigned) are biased toward - * the instructions of the x86_64 host. For MO_8, the whole byte - * is significant so we must use an unsigned compare; otherwise we - * have already masked to a byte and so a signed compare works. - * Other tcg hosts have a full set of comparisons and do not care. - */ - if (vece == MO_8) { - tcg_gen_cmp_vec(TCG_COND_GEU, vece, lsh, lsh, max); - tcg_gen_cmp_vec(TCG_COND_GEU, vece, rsh, rsh, max); - tcg_gen_andc_vec(vece, lval, lval, lsh); - tcg_gen_andc_vec(vece, rval, rval, rsh); - } else { - tcg_gen_cmp_vec(TCG_COND_LT, vece, lsh, lsh, max); - tcg_gen_cmp_vec(TCG_COND_LT, vece, rsh, rsh, max); - tcg_gen_and_vec(vece, lval, lval, lsh); - tcg_gen_and_vec(vece, rval, rval, rsh); - } - tcg_gen_or_vec(vece, dst, lval, rval); -} - -void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_neg_vec, INDEX_op_shlv_vec, - INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0 - }; - static const GVecGen3 ops[4] = { - { .fniv = gen_ushl_vec, - .fno = gen_helper_gvec_ushl_b, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fniv = gen_ushl_vec, - .fno = gen_helper_gvec_ushl_h, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fni4 = gen_ushl_i32, - .fniv = gen_ushl_vec, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fni8 = gen_ushl_i64, - .fniv = gen_ushl_vec, - .opt_opc = vecop_list, - .vece = MO_64 }, - }; - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - -void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift) -{ - TCGv_i32 lval = tcg_temp_new_i32(); - TCGv_i32 rval = tcg_temp_new_i32(); - TCGv_i32 lsh = tcg_temp_new_i32(); - TCGv_i32 rsh = tcg_temp_new_i32(); - TCGv_i32 zero = 
tcg_constant_i32(0); - TCGv_i32 max = tcg_constant_i32(31); - - /* - * Rely on the TCG guarantee that out of range shifts produce - * unspecified results, not undefined behaviour (i.e. no trap). - * Discard out-of-range results after the fact. - */ - tcg_gen_ext8s_i32(lsh, shift); - tcg_gen_neg_i32(rsh, lsh); - tcg_gen_shl_i32(lval, src, lsh); - tcg_gen_umin_i32(rsh, rsh, max); - tcg_gen_sar_i32(rval, src, rsh); - tcg_gen_movcond_i32(TCG_COND_LEU, lval, lsh, max, lval, zero); - tcg_gen_movcond_i32(TCG_COND_LT, dst, lsh, zero, rval, lval); -} - -void gen_sshl_i64(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 shift) -{ - TCGv_i64 lval = tcg_temp_new_i64(); - TCGv_i64 rval = tcg_temp_new_i64(); - TCGv_i64 lsh = tcg_temp_new_i64(); - TCGv_i64 rsh = tcg_temp_new_i64(); - TCGv_i64 zero = tcg_constant_i64(0); - TCGv_i64 max = tcg_constant_i64(63); - - /* - * Rely on the TCG guarantee that out of range shifts produce - * unspecified results, not undefined behaviour (i.e. no trap). - * Discard out-of-range results after the fact. - */ - tcg_gen_ext8s_i64(lsh, shift); - tcg_gen_neg_i64(rsh, lsh); - tcg_gen_shl_i64(lval, src, lsh); - tcg_gen_umin_i64(rsh, rsh, max); - tcg_gen_sar_i64(rval, src, rsh); - tcg_gen_movcond_i64(TCG_COND_LEU, lval, lsh, max, lval, zero); - tcg_gen_movcond_i64(TCG_COND_LT, dst, lsh, zero, rval, lval); -} - -static void gen_sshl_vec(unsigned vece, TCGv_vec dst, - TCGv_vec src, TCGv_vec shift) -{ - TCGv_vec lval = tcg_temp_new_vec_matching(dst); - TCGv_vec rval = tcg_temp_new_vec_matching(dst); - TCGv_vec lsh = tcg_temp_new_vec_matching(dst); - TCGv_vec rsh = tcg_temp_new_vec_matching(dst); - TCGv_vec tmp = tcg_temp_new_vec_matching(dst); - - /* - * Rely on the TCG guarantee that out of range shifts produce - * unspecified results, not undefined behaviour (i.e. no trap). - * Discard out-of-range results after the fact. 
- */ - tcg_gen_neg_vec(vece, rsh, shift); - if (vece == MO_8) { - tcg_gen_mov_vec(lsh, shift); - } else { - tcg_gen_dupi_vec(vece, tmp, 0xff); - tcg_gen_and_vec(vece, lsh, shift, tmp); - tcg_gen_and_vec(vece, rsh, rsh, tmp); - } - - /* Bound rsh so out of bound right shift gets -1. */ - tcg_gen_dupi_vec(vece, tmp, (8 << vece) - 1); - tcg_gen_umin_vec(vece, rsh, rsh, tmp); - tcg_gen_cmp_vec(TCG_COND_GT, vece, tmp, lsh, tmp); - - tcg_gen_shlv_vec(vece, lval, src, lsh); - tcg_gen_sarv_vec(vece, rval, src, rsh); - - /* Select in-bound left shift. */ - tcg_gen_andc_vec(vece, lval, lval, tmp); - - /* Select between left and right shift. */ - if (vece == MO_8) { - tcg_gen_dupi_vec(vece, tmp, 0); - tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, rval, lval); - } else { - tcg_gen_dupi_vec(vece, tmp, 0x80); - tcg_gen_cmpsel_vec(TCG_COND_LT, vece, dst, lsh, tmp, lval, rval); - } -} - -void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec, - INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0 - }; - static const GVecGen3 ops[4] = { - { .fniv = gen_sshl_vec, - .fno = gen_helper_gvec_sshl_b, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fniv = gen_sshl_vec, - .fno = gen_helper_gvec_sshl_h, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fni4 = gen_sshl_i32, - .fniv = gen_sshl_vec, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fni8 = gen_sshl_i64, - .fniv = gen_sshl_vec, - .opt_opc = vecop_list, - .vece = MO_64 }, - }; - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - -static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, - TCGv_vec a, TCGv_vec b) -{ - TCGv_vec x = tcg_temp_new_vec_matching(t); - tcg_gen_add_vec(vece, x, a, b); - tcg_gen_usadd_vec(vece, t, a, b); - tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); - tcg_gen_or_vec(vece, sat, sat, x); -} - -void 
gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 - }; - static const GVecGen4 ops[4] = { - { .fniv = gen_uqadd_vec, - .fno = gen_helper_gvec_uqadd_b, - .write_aofs = true, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fniv = gen_uqadd_vec, - .fno = gen_helper_gvec_uqadd_h, - .write_aofs = true, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fniv = gen_uqadd_vec, - .fno = gen_helper_gvec_uqadd_s, - .write_aofs = true, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fniv = gen_uqadd_vec, - .fno = gen_helper_gvec_uqadd_d, - .write_aofs = true, - .opt_opc = vecop_list, - .vece = MO_64 }, - }; - tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), - rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - -static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, - TCGv_vec a, TCGv_vec b) -{ - TCGv_vec x = tcg_temp_new_vec_matching(t); - tcg_gen_add_vec(vece, x, a, b); - tcg_gen_ssadd_vec(vece, t, a, b); - tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); - tcg_gen_or_vec(vece, sat, sat, x); -} - -void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 - }; - static const GVecGen4 ops[4] = { - { .fniv = gen_sqadd_vec, - .fno = gen_helper_gvec_sqadd_b, - .opt_opc = vecop_list, - .write_aofs = true, - .vece = MO_8 }, - { .fniv = gen_sqadd_vec, - .fno = gen_helper_gvec_sqadd_h, - .opt_opc = vecop_list, - .write_aofs = true, - .vece = MO_16 }, - { .fniv = gen_sqadd_vec, - .fno = gen_helper_gvec_sqadd_s, - .opt_opc = vecop_list, - .write_aofs = true, - .vece = MO_32 }, - { .fniv = gen_sqadd_vec, - .fno = gen_helper_gvec_sqadd_d, - .opt_opc = vecop_list, - .write_aofs = true, - .vece = MO_64 }, - }; - tcg_gen_gvec_4(rd_ofs, 
offsetof(CPUARMState, vfp.qc), - rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - -static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, - TCGv_vec a, TCGv_vec b) -{ - TCGv_vec x = tcg_temp_new_vec_matching(t); - tcg_gen_sub_vec(vece, x, a, b); - tcg_gen_ussub_vec(vece, t, a, b); - tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); - tcg_gen_or_vec(vece, sat, sat, x); -} - -void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 - }; - static const GVecGen4 ops[4] = { - { .fniv = gen_uqsub_vec, - .fno = gen_helper_gvec_uqsub_b, - .opt_opc = vecop_list, - .write_aofs = true, - .vece = MO_8 }, - { .fniv = gen_uqsub_vec, - .fno = gen_helper_gvec_uqsub_h, - .opt_opc = vecop_list, - .write_aofs = true, - .vece = MO_16 }, - { .fniv = gen_uqsub_vec, - .fno = gen_helper_gvec_uqsub_s, - .opt_opc = vecop_list, - .write_aofs = true, - .vece = MO_32 }, - { .fniv = gen_uqsub_vec, - .fno = gen_helper_gvec_uqsub_d, - .opt_opc = vecop_list, - .write_aofs = true, - .vece = MO_64 }, - }; - tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), - rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - -static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, - TCGv_vec a, TCGv_vec b) -{ - TCGv_vec x = tcg_temp_new_vec_matching(t); - tcg_gen_sub_vec(vece, x, a, b); - tcg_gen_sssub_vec(vece, t, a, b); - tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); - tcg_gen_or_vec(vece, sat, sat, x); -} - -void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 - }; - static const GVecGen4 ops[4] = { - { .fniv = gen_sqsub_vec, - .fno = gen_helper_gvec_sqsub_b, - .opt_opc = vecop_list, - .write_aofs = true, - .vece = MO_8 }, - { .fniv = gen_sqsub_vec, 
- .fno = gen_helper_gvec_sqsub_h, - .opt_opc = vecop_list, - .write_aofs = true, - .vece = MO_16 }, - { .fniv = gen_sqsub_vec, - .fno = gen_helper_gvec_sqsub_s, - .opt_opc = vecop_list, - .write_aofs = true, - .vece = MO_32 }, - { .fniv = gen_sqsub_vec, - .fno = gen_helper_gvec_sqsub_d, - .opt_opc = vecop_list, - .write_aofs = true, - .vece = MO_64 }, - }; - tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), - rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - -static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) -{ - TCGv_i32 t = tcg_temp_new_i32(); - - tcg_gen_sub_i32(t, a, b); - tcg_gen_sub_i32(d, b, a); - tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t); -} - -static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_sub_i64(t, a, b); - tcg_gen_sub_i64(d, b, a); - tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t); -} - -static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) -{ - TCGv_vec t = tcg_temp_new_vec_matching(d); - - tcg_gen_smin_vec(vece, t, a, b); - tcg_gen_smax_vec(vece, d, a, b); - tcg_gen_sub_vec(vece, d, d, t); -} - -void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0 - }; - static const GVecGen3 ops[4] = { - { .fniv = gen_sabd_vec, - .fno = gen_helper_gvec_sabd_b, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fniv = gen_sabd_vec, - .fno = gen_helper_gvec_sabd_h, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fni4 = gen_sabd_i32, - .fniv = gen_sabd_vec, - .fno = gen_helper_gvec_sabd_s, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fni8 = gen_sabd_i64, - .fniv = gen_sabd_vec, - .fno = gen_helper_gvec_sabd_d, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list, - .vece = MO_64 }, - }; - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - -static void 
gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) -{ - TCGv_i32 t = tcg_temp_new_i32(); - - tcg_gen_sub_i32(t, a, b); - tcg_gen_sub_i32(d, b, a); - tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t); -} - -static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) -{ - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_sub_i64(t, a, b); - tcg_gen_sub_i64(d, b, a); - tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t); -} - -static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) -{ - TCGv_vec t = tcg_temp_new_vec_matching(d); - - tcg_gen_umin_vec(vece, t, a, b); - tcg_gen_umax_vec(vece, d, a, b); - tcg_gen_sub_vec(vece, d, d, t); -} - -void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0 - }; - static const GVecGen3 ops[4] = { - { .fniv = gen_uabd_vec, - .fno = gen_helper_gvec_uabd_b, - .opt_opc = vecop_list, - .vece = MO_8 }, - { .fniv = gen_uabd_vec, - .fno = gen_helper_gvec_uabd_h, - .opt_opc = vecop_list, - .vece = MO_16 }, - { .fni4 = gen_uabd_i32, - .fniv = gen_uabd_vec, - .fno = gen_helper_gvec_uabd_s, - .opt_opc = vecop_list, - .vece = MO_32 }, - { .fni8 = gen_uabd_i64, - .fniv = gen_uabd_vec, - .fno = gen_helper_gvec_uabd_d, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list, - .vece = MO_64 }, - }; - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - -static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) -{ - TCGv_i32 t = tcg_temp_new_i32(); - gen_sabd_i32(t, a, b); - tcg_gen_add_i32(d, d, t); -} - -static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) -{ - TCGv_i64 t = tcg_temp_new_i64(); - gen_sabd_i64(t, a, b); - tcg_gen_add_i64(d, d, t); -} - -static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) -{ - TCGv_vec t = tcg_temp_new_vec_matching(d); - gen_sabd_vec(vece, t, a, b); - tcg_gen_add_vec(vece, 
d, d, t); -} - -void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_sub_vec, INDEX_op_add_vec, - INDEX_op_smin_vec, INDEX_op_smax_vec, 0 - }; - static const GVecGen3 ops[4] = { - { .fniv = gen_saba_vec, - .fno = gen_helper_gvec_saba_b, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_8 }, - { .fniv = gen_saba_vec, - .fno = gen_helper_gvec_saba_h, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_16 }, - { .fni4 = gen_saba_i32, - .fniv = gen_saba_vec, - .fno = gen_helper_gvec_saba_s, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_32 }, - { .fni8 = gen_saba_i64, - .fniv = gen_saba_vec, - .fno = gen_helper_gvec_saba_d, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_64 }, - }; - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - -static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) -{ - TCGv_i32 t = tcg_temp_new_i32(); - gen_uabd_i32(t, a, b); - tcg_gen_add_i32(d, d, t); -} - -static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) -{ - TCGv_i64 t = tcg_temp_new_i64(); - gen_uabd_i64(t, a, b); - tcg_gen_add_i64(d, d, t); -} - -static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) -{ - TCGv_vec t = tcg_temp_new_vec_matching(d); - gen_uabd_vec(vece, t, a, b); - tcg_gen_add_vec(vece, d, d, t); -} - -void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { - INDEX_op_sub_vec, INDEX_op_add_vec, - INDEX_op_umin_vec, INDEX_op_umax_vec, 0 - }; - static const GVecGen3 ops[4] = { - { .fniv = gen_uaba_vec, - .fno = gen_helper_gvec_uaba_b, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_8 }, - { .fniv = gen_uaba_vec, - .fno = gen_helper_gvec_uaba_h, - .opt_opc = vecop_list, - .load_dest = true, - 
.vece = MO_16 }, - { .fni4 = gen_uaba_i32, - .fniv = gen_uaba_vec, - .fno = gen_helper_gvec_uaba_s, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_32 }, - { .fni8 = gen_uaba_i64, - .fniv = gen_uaba_vec, - .fno = gen_helper_gvec_uaba_d, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list, - .load_dest = true, - .vece = MO_64 }, - }; - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); -} - static bool aa32_cpreg_encoding_in_impdef_space(uint8_t crn, uint8_t crm) { static const uint16_t mask[3] = { diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build index 3b1a9f0fc5..bdb5c7352f 100644 --- a/target/arm/tcg/meson.build +++ b/target/arm/tcg/meson.build @@ -24,6 +24,7 @@ arm_ss.add(when: 'TARGET_AARCH64', if_true: gen_a64) arm_ss.add(files( 'cpu32.c', + 'gengvec.c', 'translate.c', 'translate-m-nocp.c', 'translate-mve.c',
From patchwork Fri May 24 23:20:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673824
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org, Peter Maydell, Philippe Mathieu-Daudé
Subject: [PATCH v2 08/67] target/arm: Split out gengvec64.c
Date: Fri, 24 May 2024 16:20:22 -0700
Message-Id: <20240524232121.284515-9-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

Split some routines out of translate-a64.c and translate-sve.c that are used by both.
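For anyone checking the move, the byte-lane XAR expansion relocated by this patch can be compared against a plain C model. The sketch below is illustrative only and is not part of the patch: `dup8`, `xar8_packed` and `xar8_ref` are hypothetical names, `dup8` stands in for QEMU's `dup_const(MO_8, ...)`, and the packed form assumes a rotate amount in 1..7 (the `shift == 0` case is handled separately in `gen_gvec_xar`).

```c
#include <stdint.h>

/* Replicate an 8-bit value into every byte of a 64-bit word.
 * Hypothetical stand-in for dup_const(MO_8, x). */
static uint64_t dup8(uint8_t x)
{
    return 0x0101010101010101ull * x;
}

/* Packed form, following the shape of gen_xar8_i64:
 * xor, shift the whole word both ways, then mask off the bits
 * that leaked across byte boundaries. Assumes 1 <= sh <= 7. */
static uint64_t xar8_packed(uint64_t n, uint64_t m, unsigned sh)
{
    uint64_t mask = dup8(0xff >> sh);
    uint64_t t = n ^ m;
    uint64_t d = (t >> sh) & mask;   /* low (8 - sh) bits of each byte */

    t = (t << (8 - sh)) & ~mask;     /* high sh bits of each byte */
    return d | t;
}

/* Reference model: rotate each byte right by sh independently. */
static uint64_t xar8_ref(uint64_t n, uint64_t m, unsigned sh)
{
    uint64_t x = n ^ m;
    uint64_t r = 0;

    for (int i = 0; i < 8; i++) {
        uint8_t b = (uint8_t)(x >> (i * 8));
        uint8_t rot = (uint8_t)((b >> sh) | (b << (8 - sh)));
        r |= (uint64_t)rot << (i * 8);
    }
    return r;
}
```

The two functions should agree for every rotate amount in 1..7, which is the property the masked shifts rely on when no per-byte rotate is available on a 64-bit integer.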
Reviewed-by: Peter Maydell Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- target/arm/tcg/translate-a64.h | 4 + target/arm/tcg/gengvec64.c | 190 +++++++++++++++++++++++++++++++++ target/arm/tcg/translate-a64.c | 26 ----- target/arm/tcg/translate-sve.c | 145 +------------------------ target/arm/tcg/meson.build | 1 + 5 files changed, 197 insertions(+), 169 deletions(-) create mode 100644 target/arm/tcg/gengvec64.c diff --git a/target/arm/tcg/translate-a64.h b/target/arm/tcg/translate-a64.h index 7b811b8ac5..91750f0ca9 100644 --- a/target/arm/tcg/translate-a64.h +++ b/target/arm/tcg/translate-a64.h @@ -193,6 +193,10 @@ void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz); +void gen_gvec_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz); void gen_sve_ldr(DisasContext *s, TCGv_ptr, int vofs, int len, int rn, int imm); void gen_sve_str(DisasContext *s, TCGv_ptr, int vofs, int len, int rn, int imm); diff --git a/target/arm/tcg/gengvec64.c b/target/arm/tcg/gengvec64.c new file mode 100644 index 0000000000..093b498b13 --- /dev/null +++ b/target/arm/tcg/gengvec64.c @@ -0,0 +1,190 @@ +/* + * AArch64 generic vector expansion + * + * Copyright (c) 2013 Alexander Graf + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . + */ + +#include "qemu/osdep.h" +#include "translate.h" +#include "translate-a64.h" + + +static void gen_rax1_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m) +{ + tcg_gen_rotli_i64(d, m, 1); + tcg_gen_xor_i64(d, d, n); +} + +static void gen_rax1_vec(unsigned vece, TCGv_vec d, TCGv_vec n, TCGv_vec m) +{ + tcg_gen_rotli_vec(vece, d, m, 1); + tcg_gen_xor_vec(vece, d, d, n); +} + +void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_rotli_vec, 0 }; + static const GVecGen3 op = { + .fni8 = gen_rax1_i64, + .fniv = gen_rax1_vec, + .opt_opc = vecop_list, + .fno = gen_helper_crypto_rax1, + .vece = MO_64, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &op); +} + +static void gen_xar8_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + uint64_t mask = dup_const(MO_8, 0xff >> sh); + + tcg_gen_xor_i64(t, n, m); + tcg_gen_shri_i64(d, t, sh); + tcg_gen_shli_i64(t, t, 8 - sh); + tcg_gen_andi_i64(d, d, mask); + tcg_gen_andi_i64(t, t, ~mask); + tcg_gen_or_i64(d, d, t); +} + +static void gen_xar16_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + uint64_t mask = dup_const(MO_16, 0xffff >> sh); + + tcg_gen_xor_i64(t, n, m); + tcg_gen_shri_i64(d, t, sh); + tcg_gen_shli_i64(t, t, 16 - sh); + tcg_gen_andi_i64(d, d, mask); + tcg_gen_andi_i64(t, t, ~mask); + tcg_gen_or_i64(d, d, t); +} + +static void gen_xar_i32(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, int32_t sh) +{ + tcg_gen_xor_i32(d, n, m); + tcg_gen_rotri_i32(d, d, sh); +} + +static void gen_xar_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + tcg_gen_xor_i64(d, n, m); + tcg_gen_rotri_i64(d, d, sh); +} + +static void gen_xar_vec(unsigned 
vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, int64_t sh) +{ + tcg_gen_xor_vec(vece, d, n, m); + tcg_gen_rotri_vec(vece, d, d, sh); +} + +void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, int64_t shift, + uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop[] = { INDEX_op_rotli_vec, 0 }; + static const GVecGen3i ops[4] = { + { .fni8 = gen_xar8_i64, + .fniv = gen_xar_vec, + .fno = gen_helper_sve2_xar_b, + .opt_opc = vecop, + .vece = MO_8 }, + { .fni8 = gen_xar16_i64, + .fniv = gen_xar_vec, + .fno = gen_helper_sve2_xar_h, + .opt_opc = vecop, + .vece = MO_16 }, + { .fni4 = gen_xar_i32, + .fniv = gen_xar_vec, + .fno = gen_helper_sve2_xar_s, + .opt_opc = vecop, + .vece = MO_32 }, + { .fni8 = gen_xar_i64, + .fniv = gen_xar_vec, + .fno = gen_helper_gvec_xar_d, + .opt_opc = vecop, + .vece = MO_64 } + }; + int esize = 8 << vece; + + /* The SVE2 range is 1 .. esize; the AdvSIMD range is 0 .. esize-1. */ + tcg_debug_assert(shift >= 0); + tcg_debug_assert(shift <= esize); + shift &= esize - 1; + + if (shift == 0) { + /* xar with no rotate devolves to xor. 
*/ + tcg_gen_gvec_xor(vece, rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_3i(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, + shift, &ops[vece]); + } +} + +static void gen_eor3_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) +{ + tcg_gen_xor_i64(d, n, m); + tcg_gen_xor_i64(d, d, k); +} + +static void gen_eor3_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, TCGv_vec k) +{ + tcg_gen_xor_vec(vece, d, n, m); + tcg_gen_xor_vec(vece, d, d, k); +} + +void gen_gvec_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen4 op = { + .fni8 = gen_eor3_i64, + .fniv = gen_eor3_vec, + .fno = gen_helper_sve2_eor3, + .vece = MO_64, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); +} + +static void gen_bcax_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) +{ + tcg_gen_andc_i64(d, m, k); + tcg_gen_xor_i64(d, d, n); +} + +static void gen_bcax_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, TCGv_vec k) +{ + tcg_gen_andc_vec(vece, d, m, k); + tcg_gen_xor_vec(vece, d, d, n); +} + +void gen_gvec_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen4 op = { + .fni8 = gen_bcax_i64, + .fniv = gen_bcax_vec, + .fno = gen_helper_sve2_bcax, + .vece = MO_64, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); +} + diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 0bdddb8517..8842ff634d 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -13623,32 +13623,6 @@ static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn) gen_gvec_op2_ool(s, true, rd, rn, 0, genfn); } -static void gen_rax1_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m) -{ - tcg_gen_rotli_i64(d, m, 1); - tcg_gen_xor_i64(d, d, n); -} - -static void gen_rax1_vec(unsigned vece, TCGv_vec d, TCGv_vec n, 
TCGv_vec m) -{ - tcg_gen_rotli_vec(vece, d, m, 1); - tcg_gen_xor_vec(vece, d, d, n); -} - -void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop_list[] = { INDEX_op_rotli_vec, 0 }; - static const GVecGen3 op = { - .fni8 = gen_rax1_i64, - .fniv = gen_rax1_vec, - .opt_opc = vecop_list, - .fno = gen_helper_crypto_rax1, - .vece = MO_64, - }; - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &op); -} - /* Crypto three-reg SHA512 * 31 21 20 16 15 14 13 12 11 10 9 5 4 0 * +-----------------------+------+---+---+-----+--------+------+------+ diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c index ada05aa530..798ab2bfb1 100644 --- a/target/arm/tcg/translate-sve.c +++ b/target/arm/tcg/translate-sve.c @@ -527,94 +527,6 @@ TRANS_FEAT(ORR_zzz, aa64_sve, gen_gvec_fn_arg_zzz, tcg_gen_gvec_or, a) TRANS_FEAT(EOR_zzz, aa64_sve, gen_gvec_fn_arg_zzz, tcg_gen_gvec_xor, a) TRANS_FEAT(BIC_zzz, aa64_sve, gen_gvec_fn_arg_zzz, tcg_gen_gvec_andc, a) -static void gen_xar8_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - uint64_t mask = dup_const(MO_8, 0xff >> sh); - - tcg_gen_xor_i64(t, n, m); - tcg_gen_shri_i64(d, t, sh); - tcg_gen_shli_i64(t, t, 8 - sh); - tcg_gen_andi_i64(d, d, mask); - tcg_gen_andi_i64(t, t, ~mask); - tcg_gen_or_i64(d, d, t); -} - -static void gen_xar16_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) -{ - TCGv_i64 t = tcg_temp_new_i64(); - uint64_t mask = dup_const(MO_16, 0xffff >> sh); - - tcg_gen_xor_i64(t, n, m); - tcg_gen_shri_i64(d, t, sh); - tcg_gen_shli_i64(t, t, 16 - sh); - tcg_gen_andi_i64(d, d, mask); - tcg_gen_andi_i64(t, t, ~mask); - tcg_gen_or_i64(d, d, t); -} - -static void gen_xar_i32(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, int32_t sh) -{ - tcg_gen_xor_i32(d, n, m); - tcg_gen_rotri_i32(d, d, sh); -} - -static void gen_xar_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) -{ - 
tcg_gen_xor_i64(d, n, m); - tcg_gen_rotri_i64(d, d, sh); -} - -static void gen_xar_vec(unsigned vece, TCGv_vec d, TCGv_vec n, - TCGv_vec m, int64_t sh) -{ - tcg_gen_xor_vec(vece, d, n, m); - tcg_gen_rotri_vec(vece, d, d, sh); -} - -void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, int64_t shift, - uint32_t opr_sz, uint32_t max_sz) -{ - static const TCGOpcode vecop[] = { INDEX_op_rotli_vec, 0 }; - static const GVecGen3i ops[4] = { - { .fni8 = gen_xar8_i64, - .fniv = gen_xar_vec, - .fno = gen_helper_sve2_xar_b, - .opt_opc = vecop, - .vece = MO_8 }, - { .fni8 = gen_xar16_i64, - .fniv = gen_xar_vec, - .fno = gen_helper_sve2_xar_h, - .opt_opc = vecop, - .vece = MO_16 }, - { .fni4 = gen_xar_i32, - .fniv = gen_xar_vec, - .fno = gen_helper_sve2_xar_s, - .opt_opc = vecop, - .vece = MO_32 }, - { .fni8 = gen_xar_i64, - .fniv = gen_xar_vec, - .fno = gen_helper_gvec_xar_d, - .opt_opc = vecop, - .vece = MO_64 } - }; - int esize = 8 << vece; - - /* The SVE2 range is 1 .. esize; the AdvSIMD range is 0 .. esize-1. */ - tcg_debug_assert(shift >= 0); - tcg_debug_assert(shift <= esize); - shift &= esize - 1; - - if (shift == 0) { - /* xar with no rotate devolves to xor. 
*/ - tcg_gen_gvec_xor(vece, rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz); - } else { - tcg_gen_gvec_3i(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, - shift, &ops[vece]); - } -} - static bool trans_XAR(DisasContext *s, arg_rrri_esz *a) { if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { @@ -629,61 +541,8 @@ static bool trans_XAR(DisasContext *s, arg_rrri_esz *a) return true; } -static void gen_eor3_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) -{ - tcg_gen_xor_i64(d, n, m); - tcg_gen_xor_i64(d, d, k); -} - -static void gen_eor3_vec(unsigned vece, TCGv_vec d, TCGv_vec n, - TCGv_vec m, TCGv_vec k) -{ - tcg_gen_xor_vec(vece, d, n, m); - tcg_gen_xor_vec(vece, d, d, k); -} - -static void gen_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m, - uint32_t a, uint32_t oprsz, uint32_t maxsz) -{ - static const GVecGen4 op = { - .fni8 = gen_eor3_i64, - .fniv = gen_eor3_vec, - .fno = gen_helper_sve2_eor3, - .vece = MO_64, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - }; - tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); -} - -TRANS_FEAT(EOR3, aa64_sve2, gen_gvec_fn_arg_zzzz, gen_eor3, a) - -static void gen_bcax_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) -{ - tcg_gen_andc_i64(d, m, k); - tcg_gen_xor_i64(d, d, n); -} - -static void gen_bcax_vec(unsigned vece, TCGv_vec d, TCGv_vec n, - TCGv_vec m, TCGv_vec k) -{ - tcg_gen_andc_vec(vece, d, m, k); - tcg_gen_xor_vec(vece, d, d, n); -} - -static void gen_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m, - uint32_t a, uint32_t oprsz, uint32_t maxsz) -{ - static const GVecGen4 op = { - .fni8 = gen_bcax_i64, - .fniv = gen_bcax_vec, - .fno = gen_helper_sve2_bcax, - .vece = MO_64, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - }; - tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); -} - -TRANS_FEAT(BCAX, aa64_sve2, gen_gvec_fn_arg_zzzz, gen_bcax, a) +TRANS_FEAT(EOR3, aa64_sve2, gen_gvec_fn_arg_zzzz, gen_gvec_eor3, a) +TRANS_FEAT(BCAX, aa64_sve2, gen_gvec_fn_arg_zzzz, gen_gvec_bcax, a) static void gen_bsl(unsigned vece, uint32_t d, 
uint32_t n, uint32_t m, uint32_t a, uint32_t oprsz, uint32_t maxsz) diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build index bdb5c7352f..508932a249 100644 --- a/target/arm/tcg/meson.build +++ b/target/arm/tcg/meson.build @@ -43,6 +43,7 @@ arm_ss.add(files( arm_ss.add(when: 'TARGET_AARCH64', if_true: files( 'cpu64.c', + 'gengvec64.c', 'translate-a64.c', 'translate-sve.c', 'translate-sme.c',
From patchwork Fri May 24 23:20:23 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673815
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org, Peter Maydell
Subject: [PATCH v2 09/67] target/arm: Convert Cryptographic AES to decodetree
Date: Fri, 24 May 2024 16:20:23 -0700
Message-Id: <20240524232121.284515-10-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/tcg/a64.decode | 21 +++++++-- target/arm/tcg/translate-a64.c | 86 +++++++++++++++------------------- 2 files changed, 54 insertions(+), 53 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 0e7656fd15..1de09903dc 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -19,11 +19,17 @@ # This file is processed by scripts/decodetree.py # -&r rn -&ri rd imm -&rri_sf rd rn imm sf -&i imm +%rd 0:5 +&r rn +&ri rd imm +&rri_sf rd rn imm sf +&i imm +&qrr_e q rd rn esz +&qrrr_e
q rd rn rm esz + +@rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0 +@r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0 ### Data Processing - Immediate @@ -590,3 +596,10 @@ CPYFE 00 011 0 01100 ..... .... 01 ..... ..... @cpy CPYP 00 011 1 01000 ..... .... 01 ..... ..... @cpy CPYM 00 011 1 01010 ..... .... 01 ..... ..... @cpy CPYE 00 011 1 01100 ..... .... 01 ..... ..... @cpy + +### Cryptographic AES + +AESE 01001110 00 10100 00100 10 ..... ..... @r2r_q1e0 +AESD 01001110 00 10100 00101 10 ..... ..... @r2r_q1e0 +AESMC 01001110 00 10100 00110 10 ..... ..... @rr_q1e0 +AESIMC 01001110 00 10100 00111 10 ..... ..... @rr_q1e0 diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 8842ff634d..3894db4bee 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -1313,6 +1313,34 @@ bool sme_enabled_check_with_svcr(DisasContext *s, unsigned req) return true; } +/* + * Expanders for AdvSIMD translation functions. + */ + +static bool do_gvec_op2_ool(DisasContext *s, arg_qrr_e *a, int data, + gen_helper_gvec_2 *fn) +{ + if (!a->q && a->esz == MO_64) { + return false; + } + if (fp_access_check(s)) { + gen_gvec_op2_ool(s, a->q, a->rd, a->rn, data, fn); + } + return true; +} + +static bool do_gvec_op3_ool(DisasContext *s, arg_qrrr_e *a, int data, + gen_helper_gvec_3 *fn) +{ + if (!a->q && a->esz == MO_64) { + return false; + } + if (fp_access_check(s)) { + gen_gvec_op3_ool(s, a->q, a->rd, a->rn, a->rm, data, fn); + } + return true; +} + /* * This utility function is for doing register extension with an * optional shift. 
You will likely want to pass a temporary for the @@ -4560,6 +4588,15 @@ static bool trans_EXTR(DisasContext *s, arg_extract *a) return true; } +/* + * Cryptographic AES + */ + +TRANS_FEAT(AESE, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aese) +TRANS_FEAT(AESD, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aesd) +TRANS_FEAT(AESMC, aa64_aes, do_gvec_op2_ool, a, 0, gen_helper_crypto_aesmc) +TRANS_FEAT(AESIMC, aa64_aes, do_gvec_op2_ool, a, 0, gen_helper_crypto_aesimc) + /* Shift a TCGv src by TCGv shift_amount, put result in dst. * Note that it is the caller's responsibility to ensure that the * shift amount is in range (ie 0..31 or 0..63) and provide the ARM @@ -13460,54 +13497,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } } -/* Crypto AES - * 31 24 23 22 21 17 16 12 11 10 9 5 4 0 - * +-----------------+------+-----------+--------+-----+------+------+ - * | 0 1 0 0 1 1 1 0 | size | 1 0 1 0 0 | opcode | 1 0 | Rn | Rd | - * +-----------------+------+-----------+--------+-----+------+------+ - */ -static void disas_crypto_aes(DisasContext *s, uint32_t insn) -{ - int size = extract32(insn, 22, 2); - int opcode = extract32(insn, 12, 5); - int rn = extract32(insn, 5, 5); - int rd = extract32(insn, 0, 5); - gen_helper_gvec_2 *genfn2 = NULL; - gen_helper_gvec_3 *genfn3 = NULL; - - if (!dc_isar_feature(aa64_aes, s) || size != 0) { - unallocated_encoding(s); - return; - } - - switch (opcode) { - case 0x4: /* AESE */ - genfn3 = gen_helper_crypto_aese; - break; - case 0x6: /* AESMC */ - genfn2 = gen_helper_crypto_aesmc; - break; - case 0x5: /* AESD */ - genfn3 = gen_helper_crypto_aesd; - break; - case 0x7: /* AESIMC */ - genfn2 = gen_helper_crypto_aesimc; - break; - default: - unallocated_encoding(s); - return; - } - - if (!fp_access_check(s)) { - return; - } - if (genfn2) { - gen_gvec_op2_ool(s, true, rd, rn, 0, genfn2); - } else { - gen_gvec_op3_ool(s, true, rd, rd, rn, 0, genfn3); - } -} - /* Crypto three-reg SHA * 31 24 23 22 21 20 16 15 
14 12 11 10 9 5 4 0 * +-----------------+------+---+------+---+--------+-----+------+------+ @@ -13917,7 +13906,6 @@ static const AArch64DecodeTable data_proc_simd[] = { { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy }, { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */ { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm }, - { 0x4e280800, 0xff3e0c00, disas_crypto_aes }, { 0x5e000000, 0xff208c00, disas_crypto_three_reg_sha }, { 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha }, { 0xce608000, 0xffe0b000, disas_crypto_three_reg_sha512 },
From patchwork Fri May 24 23:20:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673795
From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 10/67] target/arm: Convert Cryptographic 3-register SHA to decodetree Date: Fri, 24 May 2024 16:20:24 -0700 Message-Id: <20240524232121.284515-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org>
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/tcg/a64.decode | 11 +++++ target/arm/tcg/translate-a64.c | 78 +++++----------------------------- 2 files changed, 21 insertions(+), 68 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 1de09903dc..7590659ee6 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -30,6 +30,7 @@ @rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0 @r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0 +@rrr_q1e0 ........ ... rm:5 ......
rn:5 rd:5 &qrrr_e q=1 esz=0 ### Data Processing - Immediate @@ -603,3 +604,13 @@ AESE 01001110 00 10100 00100 10 ..... ..... @r2r_q1e0 AESD 01001110 00 10100 00101 10 ..... ..... @r2r_q1e0 AESMC 01001110 00 10100 00110 10 ..... ..... @rr_q1e0 AESIMC 01001110 00 10100 00111 10 ..... ..... @rr_q1e0 + +### Cryptographic three-register SHA + +SHA1C 0101 1110 000 ..... 000000 ..... ..... @rrr_q1e0 +SHA1P 0101 1110 000 ..... 000100 ..... ..... @rrr_q1e0 +SHA1M 0101 1110 000 ..... 001000 ..... ..... @rrr_q1e0 +SHA1SU0 0101 1110 000 ..... 001100 ..... ..... @rrr_q1e0 +SHA256H 0101 1110 000 ..... 010000 ..... ..... @rrr_q1e0 +SHA256H2 0101 1110 000 ..... 010100 ..... ..... @rrr_q1e0 +SHA256SU1 0101 1110 000 ..... 011000 ..... ..... @rrr_q1e0 diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 3894db4bee..5bef39d4e7 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -4589,7 +4589,7 @@ static bool trans_EXTR(DisasContext *s, arg_extract *a) } /* - * Cryptographic AES + * Cryptographic AES, SHA */ TRANS_FEAT(AESE, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aese) @@ -4597,6 +4597,15 @@ TRANS_FEAT(AESD, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aesd) TRANS_FEAT(AESMC, aa64_aes, do_gvec_op2_ool, a, 0, gen_helper_crypto_aesmc) TRANS_FEAT(AESIMC, aa64_aes, do_gvec_op2_ool, a, 0, gen_helper_crypto_aesimc) +TRANS_FEAT(SHA1C, aa64_sha1, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha1c) +TRANS_FEAT(SHA1P, aa64_sha1, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha1p) +TRANS_FEAT(SHA1M, aa64_sha1, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha1m) +TRANS_FEAT(SHA1SU0, aa64_sha1, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha1su0) + +TRANS_FEAT(SHA256H, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256h) +TRANS_FEAT(SHA256H2, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256h2) +TRANS_FEAT(SHA256SU1, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256su1) + /* Shift a TCGv src by TCGv 
shift_amount, put result in dst. * Note that it is the caller's responsibility to ensure that the * shift amount is in range (ie 0..31 or 0..63) and provide the ARM @@ -13497,72 +13506,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } } -/* Crypto three-reg SHA - * 31 24 23 22 21 20 16 15 14 12 11 10 9 5 4 0 - * +-----------------+------+---+------+---+--------+-----+------+------+ - * | 0 1 0 1 1 1 1 0 | size | 0 | Rm | 0 | opcode | 0 0 | Rn | Rd | - * +-----------------+------+---+------+---+--------+-----+------+------+ - */ -static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn) -{ - int size = extract32(insn, 22, 2); - int opcode = extract32(insn, 12, 3); - int rm = extract32(insn, 16, 5); - int rn = extract32(insn, 5, 5); - int rd = extract32(insn, 0, 5); - gen_helper_gvec_3 *genfn; - bool feature; - - if (size != 0) { - unallocated_encoding(s); - return; - } - - switch (opcode) { - case 0: /* SHA1C */ - genfn = gen_helper_crypto_sha1c; - feature = dc_isar_feature(aa64_sha1, s); - break; - case 1: /* SHA1P */ - genfn = gen_helper_crypto_sha1p; - feature = dc_isar_feature(aa64_sha1, s); - break; - case 2: /* SHA1M */ - genfn = gen_helper_crypto_sha1m; - feature = dc_isar_feature(aa64_sha1, s); - break; - case 3: /* SHA1SU0 */ - genfn = gen_helper_crypto_sha1su0; - feature = dc_isar_feature(aa64_sha1, s); - break; - case 4: /* SHA256H */ - genfn = gen_helper_crypto_sha256h; - feature = dc_isar_feature(aa64_sha256, s); - break; - case 5: /* SHA256H2 */ - genfn = gen_helper_crypto_sha256h2; - feature = dc_isar_feature(aa64_sha256, s); - break; - case 6: /* SHA256SU1 */ - genfn = gen_helper_crypto_sha256su1; - feature = dc_isar_feature(aa64_sha256, s); - break; - default: - unallocated_encoding(s); - return; - } - - if (!feature) { - unallocated_encoding(s); - return; - } - - if (!fp_access_check(s)) { - return; - } - gen_gvec_op3_ool(s, true, rd, rn, rm, 0, genfn); -} - /* Crypto two-reg SHA * 31 24 23 22 21 17 16 12 11 10 9 5 
4 0 * +-----------------+------+-----------+--------+-----+------+------+ @@ -13906,7 +13849,6 @@ static const AArch64DecodeTable data_proc_simd[] = { { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy }, { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */ { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm }, - { 0x5e000000, 0xff208c00, disas_crypto_three_reg_sha }, { 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha }, { 0xce608000, 0xffe0b000, disas_crypto_three_reg_sha512 }, { 0xcec08000, 0xfffff000, disas_crypto_two_reg_sha512 },
From patchwork Fri May 24 23:20:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673794
From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 11/67] target/arm: Convert Cryptographic 2-register SHA to decodetree Date: Fri, 24 May 2024 16:20:25 -0700 Message-Id: <20240524232121.284515-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org>
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/tcg/a64.decode | 6 ++++ target/arm/tcg/translate-a64.c | 54 +++------------------------------- 2 files changed, 10 insertions(+), 50 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 7590659ee6..350afabc77 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -614,3 +614,9 @@ SHA1SU0 0101 1110 000 ..... 001100 ..... ..... @rrr_q1e0 SHA256H 0101 1110 000 ..... 010000 ..... ..... @rrr_q1e0 SHA256H2 0101 1110 000 ..... 010100 ..... .....
@rrr_q1e0 SHA256SU1 0101 1110 000 ..... 011000 ..... ..... @rrr_q1e0 + +### Cryptographic two-register SHA + +SHA1H 0101 1110 0010 1000 0000 10 ..... ..... @rr_q1e0 +SHA1SU1 0101 1110 0010 1000 0001 10 ..... ..... @rr_q1e0 +SHA256SU0 0101 1110 0010 1000 0010 10 ..... ..... @rr_q1e0 diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 5bef39d4e7..1d20bf0c35 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -4606,6 +4606,10 @@ TRANS_FEAT(SHA256H, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256 TRANS_FEAT(SHA256H2, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256h2) TRANS_FEAT(SHA256SU1, aa64_sha256, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha256su1) +TRANS_FEAT(SHA1H, aa64_sha1, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha1h) +TRANS_FEAT(SHA1SU1, aa64_sha1, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha1su1) +TRANS_FEAT(SHA256SU0, aa64_sha256, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha256su0) + /* Shift a TCGv src by TCGv shift_amount, put result in dst. 
* Note that it is the caller's responsibility to ensure that the * shift amount is in range (ie 0..31 or 0..63) and provide the ARM @@ -13506,55 +13510,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } } -/* Crypto two-reg SHA - * 31 24 23 22 21 17 16 12 11 10 9 5 4 0 - * +-----------------+------+-----------+--------+-----+------+------+ - * | 0 1 0 1 1 1 1 0 | size | 1 0 1 0 0 | opcode | 1 0 | Rn | Rd | - * +-----------------+------+-----------+--------+-----+------+------+ - */ -static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn) -{ - int size = extract32(insn, 22, 2); - int opcode = extract32(insn, 12, 5); - int rn = extract32(insn, 5, 5); - int rd = extract32(insn, 0, 5); - gen_helper_gvec_2 *genfn; - bool feature; - - if (size != 0) { - unallocated_encoding(s); - return; - } - - switch (opcode) { - case 0: /* SHA1H */ - feature = dc_isar_feature(aa64_sha1, s); - genfn = gen_helper_crypto_sha1h; - break; - case 1: /* SHA1SU1 */ - feature = dc_isar_feature(aa64_sha1, s); - genfn = gen_helper_crypto_sha1su1; - break; - case 2: /* SHA256SU0 */ - feature = dc_isar_feature(aa64_sha256, s); - genfn = gen_helper_crypto_sha256su0; - break; - default: - unallocated_encoding(s); - return; - } - - if (!feature) { - unallocated_encoding(s); - return; - } - - if (!fp_access_check(s)) { - return; - } - gen_gvec_op2_ool(s, true, rd, rn, 0, genfn); -} - /* Crypto three-reg SHA512 * 31 21 20 16 15 14 13 12 11 10 9 5 4 0 * +-----------------------+------+---+---+-----+--------+------+------+ @@ -13849,7 +13804,6 @@ static const AArch64DecodeTable data_proc_simd[] = { { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy }, { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */ { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm }, - { 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha }, { 0xce608000, 0xffe0b000, disas_crypto_three_reg_sha512 }, { 0xcec08000, 0xfffff000, disas_crypto_two_reg_sha512 }, { 0xce000000, 0xff808000, 
disas_crypto_four_reg },
From patchwork Fri May 24 23:20:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673801
From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 12/67] target/arm: Convert Cryptographic 3-register SHA512 to decodetree Date: Fri, 24 May 2024 16:20:26 -0700 Message-Id: <20240524232121.284515-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org>
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/tcg/a64.decode | 11 ++++ target/arm/tcg/translate-a64.c | 97 ++++++++-------------------------- 2 files changed, 32 insertions(+), 76 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 350afabc77..c342c27608 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -31,6 +31,7 @@ @rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0 @r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0 @rrr_q1e0 ........ ... rm:5 ......
rn:5 rd:5 &qrrr_e q=1 esz=0 +@rrr_q1e3 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=3 ### Data Processing - Immediate @@ -620,3 +621,13 @@ SHA256SU1 0101 1110 000 ..... 011000 ..... ..... @rrr_q1e0 SHA1H 0101 1110 0010 1000 0000 10 ..... ..... @rr_q1e0 SHA1SU1 0101 1110 0010 1000 0001 10 ..... ..... @rr_q1e0 SHA256SU0 0101 1110 0010 1000 0010 10 ..... ..... @rr_q1e0 + +### Cryptographic three-register SHA512 + +SHA512H 1100 1110 011 ..... 100000 ..... ..... @rrr_q1e0 +SHA512H2 1100 1110 011 ..... 100001 ..... ..... @rrr_q1e0 +SHA512SU1 1100 1110 011 ..... 100010 ..... ..... @rrr_q1e0 +RAX1 1100 1110 011 ..... 100011 ..... ..... @rrr_q1e3 +SM3PARTW1 1100 1110 011 ..... 110000 ..... ..... @rrr_q1e0 +SM3PARTW2 1100 1110 011 ..... 110001 ..... ..... @rrr_q1e0 +SM4EKEY 1100 1110 011 ..... 110010 ..... ..... @rrr_q1e0 diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 1d20bf0c35..77b24cd52e 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -1341,6 +1341,17 @@ static bool do_gvec_op3_ool(DisasContext *s, arg_qrrr_e *a, int data, return true; } +static bool do_gvec_fn3(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn) +{ + if (!a->q && a->esz == MO_64) { + return false; + } + if (fp_access_check(s)) { + gen_gvec_fn3(s, a->q, a->rd, a->rn, a->rm, fn, a->esz); + } + return true; +} + /* * This utility function is for doing register extension with an * optional shift. 
You will likely want to pass a temporary for the @@ -4589,7 +4600,7 @@ static bool trans_EXTR(DisasContext *s, arg_extract *a) } /* - * Cryptographic AES, SHA + * Cryptographic AES, SHA, SHA512 */ TRANS_FEAT(AESE, aa64_aes, do_gvec_op3_ool, a, 0, gen_helper_crypto_aese) @@ -4610,6 +4621,15 @@ TRANS_FEAT(SHA1H, aa64_sha1, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha1h) TRANS_FEAT(SHA1SU1, aa64_sha1, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha1su1) TRANS_FEAT(SHA256SU0, aa64_sha256, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha256su0) +TRANS_FEAT(SHA512H, aa64_sha512, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha512h) +TRANS_FEAT(SHA512H2, aa64_sha512, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha512h2) +TRANS_FEAT(SHA512SU1, aa64_sha512, do_gvec_op3_ool, a, 0, gen_helper_crypto_sha512su1) +TRANS_FEAT(RAX1, aa64_sha3, do_gvec_fn3, a, gen_gvec_rax1) +TRANS_FEAT(SM3PARTW1, aa64_sm3, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm3partw1) +TRANS_FEAT(SM3PARTW2, aa64_sm3, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm3partw2) +TRANS_FEAT(SM4EKEY, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4ekey) + + /* Shift a TCGv src by TCGv shift_amount, put result in dst. 
* Note that it is the caller's responsibility to ensure that the * shift amount is in range (ie 0..31 or 0..63) and provide the ARM @@ -13510,80 +13530,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } } -/* Crypto three-reg SHA512 - * 31 21 20 16 15 14 13 12 11 10 9 5 4 0 - * +-----------------------+------+---+---+-----+--------+------+------+ - * | 1 1 0 0 1 1 1 0 0 1 1 | Rm | 1 | O | 0 0 | opcode | Rn | Rd | - * +-----------------------+------+---+---+-----+--------+------+------+ - */ -static void disas_crypto_three_reg_sha512(DisasContext *s, uint32_t insn) -{ - int opcode = extract32(insn, 10, 2); - int o = extract32(insn, 14, 1); - int rm = extract32(insn, 16, 5); - int rn = extract32(insn, 5, 5); - int rd = extract32(insn, 0, 5); - bool feature; - gen_helper_gvec_3 *oolfn = NULL; - GVecGen3Fn *gvecfn = NULL; - - if (o == 0) { - switch (opcode) { - case 0: /* SHA512H */ - feature = dc_isar_feature(aa64_sha512, s); - oolfn = gen_helper_crypto_sha512h; - break; - case 1: /* SHA512H2 */ - feature = dc_isar_feature(aa64_sha512, s); - oolfn = gen_helper_crypto_sha512h2; - break; - case 2: /* SHA512SU1 */ - feature = dc_isar_feature(aa64_sha512, s); - oolfn = gen_helper_crypto_sha512su1; - break; - case 3: /* RAX1 */ - feature = dc_isar_feature(aa64_sha3, s); - gvecfn = gen_gvec_rax1; - break; - default: - g_assert_not_reached(); - } - } else { - switch (opcode) { - case 0: /* SM3PARTW1 */ - feature = dc_isar_feature(aa64_sm3, s); - oolfn = gen_helper_crypto_sm3partw1; - break; - case 1: /* SM3PARTW2 */ - feature = dc_isar_feature(aa64_sm3, s); - oolfn = gen_helper_crypto_sm3partw2; - break; - case 2: /* SM4EKEY */ - feature = dc_isar_feature(aa64_sm4, s); - oolfn = gen_helper_crypto_sm4ekey; - break; - default: - unallocated_encoding(s); - return; - } - } - - if (!feature) { - unallocated_encoding(s); - return; - } - - if (!fp_access_check(s)) { - return; - } - - if (oolfn) { - gen_gvec_op3_ool(s, true, rd, rn, rm, 0, oolfn); - } else { - 
gen_gvec_fn3(s, true, rd, rn, rm, gvecfn, MO_64); - } -} - /* Crypto two-reg SHA512 * 31 12 11 10 9 5 4 0 * +-----------------------------------------+--------+------+------+ @@ -13804,7 +13750,6 @@ static const AArch64DecodeTable data_proc_simd[] = { { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy }, { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */ { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm }, - { 0xce608000, 0xffe0b000, disas_crypto_three_reg_sha512 }, { 0xcec08000, 0xfffff000, disas_crypto_two_reg_sha512 }, { 0xce000000, 0xff808000, disas_crypto_four_reg }, { 0xce800000, 0xffe00000, disas_crypto_xar },
From patchwork Fri May 24 23:20:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673792
From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 13/67] target/arm: Convert Cryptographic 2-register SHA512 to decodetree Date: Fri, 24 May 2024 16:20:27 -0700 Message-Id: <20240524232121.284515-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org>
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/tcg/a64.decode | 5 ++++ target/arm/tcg/translate-a64.c | 50 ++-------------------------------- 2 files changed, 8 insertions(+), 47 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index c342c27608..5a46205751 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -631,3 +631,8 @@ RAX1 1100 1110 011 ..... 100011 ..... ..... @rrr_q1e3 SM3PARTW1 1100 1110 011 ..... 110000 ..... ..... @rrr_q1e0 SM3PARTW2 1100 1110 011 ..... 110001 ..... .....
@rrr_q1e0 SM4EKEY 1100 1110 011 ..... 110010 ..... ..... @rrr_q1e0 + +### Cryptographic two-register SHA512 + +SHA512SU0 1100 1110 110 00000 100000 ..... ..... @rr_q1e0 +SM4E 1100 1110 110 00000 100001 ..... ..... @r2r_q1e0 diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 77b24cd52e..eed0abe912 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -4629,6 +4629,9 @@ TRANS_FEAT(SM3PARTW1, aa64_sm3, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm3part TRANS_FEAT(SM3PARTW2, aa64_sm3, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm3partw2) TRANS_FEAT(SM4EKEY, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4ekey) +TRANS_FEAT(SHA512SU0, aa64_sha512, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha512su0) +TRANS_FEAT(SM4E, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4e) + /* Shift a TCGv src by TCGv shift_amount, put result in dst. * Note that it is the caller's responsibility to ensure that the @@ -13530,52 +13533,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } } -/* Crypto two-reg SHA512 - * 31 12 11 10 9 5 4 0 - * +-----------------------------------------+--------+------+------+ - * | 1 1 0 0 1 1 1 0 1 1 0 0 0 0 0 0 1 0 0 0 | opcode | Rn | Rd | - * +-----------------------------------------+--------+------+------+ - */ -static void disas_crypto_two_reg_sha512(DisasContext *s, uint32_t insn) -{ - int opcode = extract32(insn, 10, 2); - int rn = extract32(insn, 5, 5); - int rd = extract32(insn, 0, 5); - bool feature; - - switch (opcode) { - case 0: /* SHA512SU0 */ - feature = dc_isar_feature(aa64_sha512, s); - break; - case 1: /* SM4E */ - feature = dc_isar_feature(aa64_sm4, s); - break; - default: - unallocated_encoding(s); - return; - } - - if (!feature) { - unallocated_encoding(s); - return; - } - - if (!fp_access_check(s)) { - return; - } - - switch (opcode) { - case 0: /* SHA512SU0 */ - gen_gvec_op2_ool(s, true, rd, rn, 0, gen_helper_crypto_sha512su0); - break; - case 1: /* SM4E 
*/ - gen_gvec_op3_ool(s, true, rd, rd, rn, 0, gen_helper_crypto_sm4e); - break; - default: - g_assert_not_reached(); - } -} - /* Crypto four-register * 31 23 22 21 20 16 15 14 10 9 5 4 0 * +-------------------+-----+------+---+------+------+------+ @@ -13750,7 +13707,6 @@ static const AArch64DecodeTable data_proc_simd[] = { { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy }, { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */ { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm }, - { 0xcec08000, 0xfffff000, disas_crypto_two_reg_sha512 }, { 0xce000000, 0xff808000, disas_crypto_four_reg }, { 0xce800000, 0xffe00000, disas_crypto_xar }, { 0xce408000, 0xffe0c000, disas_crypto_three_reg_imm2 }, From patchwork Fri May 24 23:20:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673845 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 326FDC25B7A for ; Fri, 24 May 2024 23:32:05 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeEV-0006ge-Ib; Fri, 24 May 2024 19:21:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeEH-0006Un-Ke for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:41 -0400 Received: from mail-pl1-x62d.google.com ([2607:f8b0:4864:20::62d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeEC-0005lo-KY for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:41 -0400 Received: by mail-pl1-x62d.google.com with SMTP id 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f44c759ceesm19178305ad.10.2024.05.24.16.21.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:21:34 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 14/67] target/arm: Convert Cryptographic 4-register to decodetree Date: Fri, 24 May 2024 16:20:28 -0700 Message-Id: <20240524232121.284515-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62d; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/tcg/a64.decode | 8 ++ target/arm/tcg/translate-a64.c | 132 +++++++++++---------------------- 2 files changed, 51 insertions(+), 89 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 5a46205751..ef6902e86a 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -27,11 +27,13 @@ &i imm &qrr_e q rd rn esz &qrrr_e q rd rn rm esz +&qrrrr_e q rd rn rm ra esz @rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0 @r2r_q1e0 ........ ........ ...... 
rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0 @rrr_q1e0 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=0 @rrr_q1e3 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=3 +@rrrr_q1e3 ........ ... rm:5 . ra:5 rn:5 rd:5 &qrrrr_e q=1 esz=3 ### Data Processing - Immediate @@ -636,3 +638,9 @@ SM4EKEY 1100 1110 011 ..... 110010 ..... ..... @rrr_q1e0 SHA512SU0 1100 1110 110 00000 100000 ..... ..... @rr_q1e0 SM4E 1100 1110 110 00000 100001 ..... ..... @r2r_q1e0 + +### Cryptographic four-register + +EOR3 1100 1110 000 ..... 0 ..... ..... ..... @rrrr_q1e3 +BCAX 1100 1110 001 ..... 0 ..... ..... ..... @rrrr_q1e3 +SM3SS1 1100 1110 010 ..... 0 ..... ..... ..... @rrrr_q1e3 diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index eed0abe912..2951e7eb59 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -1352,6 +1352,17 @@ static bool do_gvec_fn3(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn) return true; } +static bool do_gvec_fn4(DisasContext *s, arg_qrrrr_e *a, GVecGen4Fn *fn) +{ + if (!a->q && a->esz == MO_64) { + return false; + } + if (fp_access_check(s)) { + gen_gvec_fn4(s, a->q, a->rd, a->rn, a->rm, a->ra, fn, a->esz); + } + return true; +} + /* * This utility function is for doing register extension with an * optional shift. 
You will likely want to pass a temporary for the @@ -4632,6 +4643,38 @@ TRANS_FEAT(SM4EKEY, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4ekey) TRANS_FEAT(SHA512SU0, aa64_sha512, do_gvec_op2_ool, a, 0, gen_helper_crypto_sha512su0) TRANS_FEAT(SM4E, aa64_sm4, do_gvec_op3_ool, a, 0, gen_helper_crypto_sm4e) +TRANS_FEAT(EOR3, aa64_sha3, do_gvec_fn4, a, gen_gvec_eor3) +TRANS_FEAT(BCAX, aa64_sha3, do_gvec_fn4, a, gen_gvec_bcax) + +static bool trans_SM3SS1(DisasContext *s, arg_SM3SS1 *a) +{ + if (!dc_isar_feature(aa64_sm3, s)) { + return false; + } + if (fp_access_check(s)) { + TCGv_i32 tcg_op1 = tcg_temp_new_i32(); + TCGv_i32 tcg_op2 = tcg_temp_new_i32(); + TCGv_i32 tcg_op3 = tcg_temp_new_i32(); + TCGv_i32 tcg_res = tcg_temp_new_i32(); + unsigned vsz, dofs; + + read_vec_element_i32(s, tcg_op1, a->rn, 3, MO_32); + read_vec_element_i32(s, tcg_op2, a->rm, 3, MO_32); + read_vec_element_i32(s, tcg_op3, a->ra, 3, MO_32); + + tcg_gen_rotri_i32(tcg_res, tcg_op1, 20); + tcg_gen_add_i32(tcg_res, tcg_res, tcg_op2); + tcg_gen_add_i32(tcg_res, tcg_res, tcg_op3); + tcg_gen_rotri_i32(tcg_res, tcg_res, 25); + + /* Clear the whole register first, then store bits [127:96]. */ + vsz = vec_full_reg_size(s); + dofs = vec_full_reg_offset(s, a->rd); + tcg_gen_gvec_dup_imm(MO_64, dofs, vsz, vsz, 0); + write_vec_element_i32(s, tcg_res, a->rd, 3, MO_32); + } + return true; +} /* Shift a TCGv src by TCGv shift_amount, put result in dst. 
* Note that it is the caller's responsibility to ensure that the @@ -13533,94 +13576,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } } -/* Crypto four-register - * 31 23 22 21 20 16 15 14 10 9 5 4 0 - * +-------------------+-----+------+---+------+------+------+ - * | 1 1 0 0 1 1 1 0 0 | Op0 | Rm | 0 | Ra | Rn | Rd | - * +-------------------+-----+------+---+------+------+------+ - */ -static void disas_crypto_four_reg(DisasContext *s, uint32_t insn) -{ - int op0 = extract32(insn, 21, 2); - int rm = extract32(insn, 16, 5); - int ra = extract32(insn, 10, 5); - int rn = extract32(insn, 5, 5); - int rd = extract32(insn, 0, 5); - bool feature; - - switch (op0) { - case 0: /* EOR3 */ - case 1: /* BCAX */ - feature = dc_isar_feature(aa64_sha3, s); - break; - case 2: /* SM3SS1 */ - feature = dc_isar_feature(aa64_sm3, s); - break; - default: - unallocated_encoding(s); - return; - } - - if (!feature) { - unallocated_encoding(s); - return; - } - - if (!fp_access_check(s)) { - return; - } - - if (op0 < 2) { - TCGv_i64 tcg_op1, tcg_op2, tcg_op3, tcg_res[2]; - int pass; - - tcg_op1 = tcg_temp_new_i64(); - tcg_op2 = tcg_temp_new_i64(); - tcg_op3 = tcg_temp_new_i64(); - tcg_res[0] = tcg_temp_new_i64(); - tcg_res[1] = tcg_temp_new_i64(); - - for (pass = 0; pass < 2; pass++) { - read_vec_element(s, tcg_op1, rn, pass, MO_64); - read_vec_element(s, tcg_op2, rm, pass, MO_64); - read_vec_element(s, tcg_op3, ra, pass, MO_64); - - if (op0 == 0) { - /* EOR3 */ - tcg_gen_xor_i64(tcg_res[pass], tcg_op2, tcg_op3); - } else { - /* BCAX */ - tcg_gen_andc_i64(tcg_res[pass], tcg_op2, tcg_op3); - } - tcg_gen_xor_i64(tcg_res[pass], tcg_res[pass], tcg_op1); - } - write_vec_element(s, tcg_res[0], rd, 0, MO_64); - write_vec_element(s, tcg_res[1], rd, 1, MO_64); - } else { - TCGv_i32 tcg_op1, tcg_op2, tcg_op3, tcg_res, tcg_zero; - - tcg_op1 = tcg_temp_new_i32(); - tcg_op2 = tcg_temp_new_i32(); - tcg_op3 = tcg_temp_new_i32(); - tcg_res = tcg_temp_new_i32(); - tcg_zero = 
tcg_constant_i32(0); - - read_vec_element_i32(s, tcg_op1, rn, 3, MO_32); - read_vec_element_i32(s, tcg_op2, rm, 3, MO_32); - read_vec_element_i32(s, tcg_op3, ra, 3, MO_32); - - tcg_gen_rotri_i32(tcg_res, tcg_op1, 20); - tcg_gen_add_i32(tcg_res, tcg_res, tcg_op2); - tcg_gen_add_i32(tcg_res, tcg_res, tcg_op3); - tcg_gen_rotri_i32(tcg_res, tcg_res, 25); - - write_vec_element_i32(s, tcg_zero, rd, 0, MO_32); - write_vec_element_i32(s, tcg_zero, rd, 1, MO_32); - write_vec_element_i32(s, tcg_zero, rd, 2, MO_32); - write_vec_element_i32(s, tcg_res, rd, 3, MO_32); - } -} - /* Crypto XAR * 31 21 20 16 15 10 9 5 4 0 * +-----------------------+------+--------+------+------+ @@ -13707,7 +13662,6 @@ static const AArch64DecodeTable data_proc_simd[] = { { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy }, { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */ { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm }, - { 0xce000000, 0xff808000, disas_crypto_four_reg }, { 0xce800000, 0xffe00000, disas_crypto_xar }, { 0xce408000, 0xffe0c000, disas_crypto_three_reg_imm2 }, { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 }, From patchwork Fri May 24 23:20:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673798 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0499EC25B74 for ; Fri, 24 May 2024 23:24:34 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeEU-0006e5-NI; Fri, 24 May 2024 19:21:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps 
0YgauaUVYx09TM1NYMl7CLjqE6KgPNr+c+W8fWmKWR+qtGs7AGU85f4ANuuH1nZvoK9fwCpcNyN 0 X-Google-Smtp-Source: AGHT+IHxJLX09YDB2XAZ/mPwLgN23+U4YMx5d9NATfHk4mpOht4vIpcwJpmAUhBZyqzjBEnoJtM2fQ== X-Received: by 2002:a17:903:41ce:b0:1e2:4c85:82ea with SMTP id d9443c01a7336-1f4486f23c2mr46433985ad.24.1716592895744; Fri, 24 May 2024 16:21:35 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. [174.21.72.5]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f44c759ceesm19178305ad.10.2024.05.24.16.21.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:21:35 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 15/67] target/arm: Convert Cryptographic 3-register, imm2 to decodetree Date: Fri, 24 May 2024 16:20:29 -0700 Message-Id: <20240524232121.284515-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::632; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x632.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/tcg/a64.decode | 10 ++++++++ target/arm/tcg/translate-a64.c | 43 ++++++++++------------------------ 2 files changed, 22 
insertions(+), 31 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index ef6902e86a..1292312a7f 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -644,3 +644,13 @@ SM4E 1100 1110 110 00000 100001 ..... ..... @r2r_q1e0 EOR3 1100 1110 000 ..... 0 ..... ..... ..... @rrrr_q1e3 BCAX 1100 1110 001 ..... 0 ..... ..... ..... @rrrr_q1e3 SM3SS1 1100 1110 010 ..... 0 ..... ..... ..... @rrrr_q1e3 + +### Cryptographic three-register, imm2 + +&crypto3i rd rn rm imm +@crypto3i ........ ... rm:5 .. imm:2 .. rn:5 rd:5 &crypto3i + +SM3TT1A 11001110 010 ..... 10 .. 00 ..... ..... @crypto3i +SM3TT1B 11001110 010 ..... 10 .. 01 ..... ..... @crypto3i +SM3TT2A 11001110 010 ..... 10 .. 10 ..... ..... @crypto3i +SM3TT2B 11001110 010 ..... 10 .. 11 ..... ..... @crypto3i diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 2951e7eb59..cf3a7dfa99 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -4676,6 +4676,18 @@ static bool trans_SM3SS1(DisasContext *s, arg_SM3SS1 *a) return true; } +static bool do_crypto3i(DisasContext *s, arg_crypto3i *a, gen_helper_gvec_3 *fn) +{ + if (fp_access_check(s)) { + gen_gvec_op3_ool(s, true, a->rd, a->rn, a->rm, a->imm, fn); + } + return true; +} +TRANS_FEAT(SM3TT1A, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt1a) +TRANS_FEAT(SM3TT1B, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt1b) +TRANS_FEAT(SM3TT2A, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt2a) +TRANS_FEAT(SM3TT2B, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt2b) + /* Shift a TCGv src by TCGv shift_amount, put result in dst. 
* Note that it is the caller's responsibility to ensure that the * shift amount is in range (ie 0..31 or 0..63) and provide the ARM @@ -13604,36 +13616,6 @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn) vec_full_reg_size(s)); } -/* Crypto three-reg imm2 - * 31 21 20 16 15 14 13 12 11 10 9 5 4 0 - * +-----------------------+------+-----+------+--------+------+------+ - * | 1 1 0 0 1 1 1 0 0 1 0 | Rm | 1 0 | imm2 | opcode | Rn | Rd | - * +-----------------------+------+-----+------+--------+------+------+ - */ -static void disas_crypto_three_reg_imm2(DisasContext *s, uint32_t insn) -{ - static gen_helper_gvec_3 * const fns[4] = { - gen_helper_crypto_sm3tt1a, gen_helper_crypto_sm3tt1b, - gen_helper_crypto_sm3tt2a, gen_helper_crypto_sm3tt2b, - }; - int opcode = extract32(insn, 10, 2); - int imm2 = extract32(insn, 12, 2); - int rm = extract32(insn, 16, 5); - int rn = extract32(insn, 5, 5); - int rd = extract32(insn, 0, 5); - - if (!dc_isar_feature(aa64_sm3, s)) { - unallocated_encoding(s); - return; - } - - if (!fp_access_check(s)) { - return; - } - - gen_gvec_op3_ool(s, true, rd, rn, rm, imm2, fns[opcode]); -} - /* C3.6 Data processing - SIMD, inc Crypto * * As the decode gets a little complex we are using a table based @@ -13663,7 +13645,6 @@ static const AArch64DecodeTable data_proc_simd[] = { { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */ { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm }, { 0xce800000, 0xffe00000, disas_crypto_xar }, - { 0xce408000, 0xffe0c000, disas_crypto_three_reg_imm2 }, { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 }, { 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 }, { 0x5e400400, 0xdf60c400, disas_simd_scalar_three_reg_same_fp16 }, From patchwork Fri May 24 23:20:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673823 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 
From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 16/67] target/arm: Convert XAR to decodetree Date: Fri, 24 May 2024 16:20:30 -0700 Message-Id: <20240524232121.284515-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org>
autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/tcg/a64.decode | 4 ++++ target/arm/tcg/translate-a64.c | 43 +++++++++++----------------------- 2 files changed, 18 insertions(+), 29 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 1292312a7f..7f354af25d 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -654,3 +654,7 @@ SM3TT1A 11001110 010 ..... 10 .. 00 ..... ..... @crypto3i SM3TT1B 11001110 010 ..... 10 .. 01 ..... ..... @crypto3i SM3TT2A 11001110 010 ..... 10 .. 10 ..... ..... @crypto3i SM3TT2B 11001110 010 ..... 10 .. 11 ..... ..... @crypto3i + +### Cryptographic XAR + +XAR 1100 1110 100 rm:5 imm:6 rn:5 rd:5 diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index cf3a7dfa99..75f1e6a7b9 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -4688,6 +4688,20 @@ TRANS_FEAT(SM3TT1B, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt1b) TRANS_FEAT(SM3TT2A, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt2a) TRANS_FEAT(SM3TT2B, aa64_sm3, do_crypto3i, a, gen_helper_crypto_sm3tt2b) +static bool trans_XAR(DisasContext *s, arg_XAR *a) +{ + if (!dc_isar_feature(aa64_sha3, s)) { + return false; + } + if (fp_access_check(s)) { + gen_gvec_xar(MO_64, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), a->imm, 16, + vec_full_reg_size(s)); + } + return true; +} + /* Shift a TCGv src by TCGv shift_amount, put result in dst. 
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -13588,34 +13602,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Crypto XAR
- *  31                   21 20  16 15    10 9    5 4    0
- * +-----------------------+------+--------+------+------+
- * | 1 1 0 0 1 1 1 0 1 0 0 |  Rm  |  imm6  |  Rn  |  Rd  |
- * +-----------------------+------+--------+------+------+
- */
-static void disas_crypto_xar(DisasContext *s, uint32_t insn)
-{
-    int rm = extract32(insn, 16, 5);
-    int imm6 = extract32(insn, 10, 6);
-    int rn = extract32(insn, 5, 5);
-    int rd = extract32(insn, 0, 5);
-
-    if (!dc_isar_feature(aa64_sha3, s)) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    gen_gvec_xar(MO_64, vec_full_reg_offset(s, rd),
-                 vec_full_reg_offset(s, rn),
-                 vec_full_reg_offset(s, rm), imm6, 16,
-                 vec_full_reg_size(s));
-}
-
 /* C3.6 Data processing - SIMD, inc Crypto
  *
  * As the decode gets a little complex we are using a table based
@@ -13644,7 +13630,6 @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
     { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
-    { 0xce800000, 0xffe00000, disas_crypto_xar },
     { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },
     { 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
     { 0x5e400400, 0xdf60c400, disas_simd_scalar_three_reg_same_fp16 },

From patchwork Fri May 24 23:20:31 2024
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673808
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org, Peter Maydell
Subject: [PATCH v2 17/67] target/arm: Convert Advanced SIMD copy to decodetree
Date: Fri, 24 May 2024 16:20:31 -0700
Message-Id: <20240524232121.284515-18-richard.henderson@linaro.org>
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/tcg/a64.decode      |  13 +
 target/arm/tcg/translate-a64.c | 426 +++++++++++----------------------
 2 files changed, 152 insertions(+), 287 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 7f354af25d..d5bfeae7a8 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -658,3 +658,16 @@ SM3TT2B 11001110 010 ..... 10 .. 11 ..... ..... @crypto3i
 ### Cryptographic XAR
 
 XAR 1100 1110 100 rm:5 imm:6 rn:5 rd:5
+
+### Advanced SIMD scalar copy
+
+DUP_element_s 0101 1110 000 imm:5 0 0000 1 rn:5 rd:5
+
+### Advanced SIMD copy
+
+DUP_element_v 0 q:1 00 1110 000 imm:5 0 0000 1 rn:5 rd:5
+DUP_general 0 q:1 00 1110 000 imm:5 0 0001 1 rn:5 rd:5
+INS_general 0 1 00 1110 000 imm:5 0 0011 1 rn:5 rd:5
+SMOV 0 q:1 00 1110 000 imm:5 0 0101 1 rn:5 rd:5
+UMOV 0 q:1 00 1110 000 imm:5 0 0111 1 rn:5 rd:5
+INS_element 0 1 10 1110 000 di:5 0 si:4 1 rn:5 rd:5
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 75f1e6a7b9..1a12bf22fd 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -4702,6 +4702,145 @@ static bool trans_XAR(DisasContext *s, arg_XAR *a)
     return true;
 }
 
+/*
+ * Advanced SIMD copy
+ */
+
+static bool decode_esz_idx(int imm, MemOp *pesz, unsigned *pidx)
+{
+    unsigned esz = ctz32(imm);
+    if (esz <= MO_64) {
+        *pesz = esz;
+        *pidx = imm >> (esz + 1);
+        return true;
+    }
+    return false;
+}
+
+static bool trans_DUP_element_s(DisasContext *s, arg_DUP_element_s *a)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        /*
+         * This instruction just extracts the specified element and
+         * zero-extends it into the bottom of the destination register.
+         */
+        TCGv_i64 tmp = tcg_temp_new_i64();
+        read_vec_element(s, tmp, a->rn, idx, esz);
+        write_fp_dreg(s, a->rd, tmp);
+    }
+    return true;
+}
+
+static bool trans_DUP_element_v(DisasContext *s, arg_DUP_element_v *a)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (esz == MO_64 && !a->q) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        tcg_gen_gvec_dup_mem(esz, vec_full_reg_offset(s, a->rd),
+                             vec_reg_offset(s, a->rn, idx, esz),
+                             a->q ? 16 : 8, vec_full_reg_size(s));
+    }
+    return true;
+}
+
+static bool trans_DUP_general(DisasContext *s, arg_DUP_general *a)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (esz == MO_64 && !a->q) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd),
+                             a->q ? 16 : 8, vec_full_reg_size(s),
+                             cpu_reg(s, a->rn));
+    }
+    return true;
+}
+
+static bool do_smov_umov(DisasContext *s, arg_SMOV *a, MemOp is_signed)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (is_signed) {
+        if (esz == MO_64 || (esz == MO_32 && !a->q)) {
+            return false;
+        }
+    } else {
+        if (esz == MO_64 ? !a->q : a->q) {
+            return false;
+        }
+    }
+    if (fp_access_check(s)) {
+        TCGv_i64 tcg_rd = cpu_reg(s, a->rd);
+        read_vec_element(s, tcg_rd, a->rn, idx, esz | is_signed);
+        if (is_signed && !a->q) {
+            tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
+        }
+    }
+    return true;
+}
+
+TRANS(SMOV, do_smov_umov, a, MO_SIGN)
+TRANS(UMOV, do_smov_umov, a, 0)
+
+static bool trans_INS_general(DisasContext *s, arg_INS_general *a)
+{
+    MemOp esz;
+    unsigned idx;
+
+    if (!decode_esz_idx(a->imm, &esz, &idx)) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        write_vec_element(s, cpu_reg(s, a->rn), a->rd, idx, esz);
+        clear_vec_high(s, true, a->rd);
+    }
+    return true;
+}
+
+static bool trans_INS_element(DisasContext *s, arg_INS_element *a)
+{
+    MemOp esz;
+    unsigned didx, sidx;
+
+    if (!decode_esz_idx(a->di, &esz, &didx)) {
+        return false;
+    }
+    sidx = a->si >> esz;
+    if (fp_access_check(s)) {
+        TCGv_i64 tmp = tcg_temp_new_i64();
+
+        read_vec_element(s, tmp, a->rn, sidx, esz);
+        write_vec_element(s, tmp, a->rd, didx, esz);
+
+        /* INS is considered a 128-bit write for SVE. */
+        clear_vec_high(s, true, a->rd);
+    }
+    return true;
+}
+
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -7760,268 +7899,6 @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
     write_fp_dreg(s, rd, tcg_res);
 }
 
-/* DUP (Element, Vector)
- *
- *  31  30  29               21 20    16 15         10 9    5 4    0
- * +---+---+-------------------+--------+-------------+------+------+
- * | 0 | Q | 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 0 1 |  Rn  |  Rd  |
- * +---+---+-------------------+--------+-------------+------+------+
- *
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
- */
-static void handle_simd_dupe(DisasContext *s, int is_q, int rd, int rn,
-                             int imm5)
-{
-    int size = ctz32(imm5);
-    int index;
-
-    if (size > 3 || (size == 3 && !is_q)) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    index = imm5 >> (size + 1);
-    tcg_gen_gvec_dup_mem(size, vec_full_reg_offset(s, rd),
-                         vec_reg_offset(s, rn, index, size),
-                         is_q ? 16 : 8, vec_full_reg_size(s));
-}
-
-/* DUP (element, scalar)
- *  31                   21 20    16 15         10 9    5 4    0
- * +-----------------------+--------+-------------+------+------+
- * | 0 1 0 1 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 0 1 |  Rn  |  Rd  |
- * +-----------------------+--------+-------------+------+------+
- */
-static void handle_simd_dupes(DisasContext *s, int rd, int rn,
-                              int imm5)
-{
-    int size = ctz32(imm5);
-    int index;
-    TCGv_i64 tmp;
-
-    if (size > 3) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    index = imm5 >> (size + 1);
-
-    /* This instruction just extracts the specified element and
-     * zero-extends it into the bottom of the destination register.
-     */
-    tmp = tcg_temp_new_i64();
-    read_vec_element(s, tmp, rn, index, size);
-    write_fp_dreg(s, rd, tmp);
-}
-
-/* DUP (General)
- *
- *  31  30  29               21 20    16 15         10 9    5 4    0
- * +---+---+-------------------+--------+-------------+------+------+
- * | 0 | Q | 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 1 1 |  Rn  |  Rd  |
- * +---+---+-------------------+--------+-------------+------+------+
- *
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
- */
-static void handle_simd_dupg(DisasContext *s, int is_q, int rd, int rn,
-                             int imm5)
-{
-    int size = ctz32(imm5);
-    uint32_t dofs, oprsz, maxsz;
-
-    if (size > 3 || ((size == 3) && !is_q)) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    dofs = vec_full_reg_offset(s, rd);
-    oprsz = is_q ? 16 : 8;
-    maxsz = vec_full_reg_size(s);
-
-    tcg_gen_gvec_dup_i64(size, dofs, oprsz, maxsz, cpu_reg(s, rn));
-}
-
-/* INS (Element)
- *
- *  31                   21 20    16 15  14    11 10  9    5 4    0
- * +-----------------------+--------+------------+---+------+------+
- * | 0 1 1 0 1 1 1 0 0 0 0 |  imm5  | 0 |  imm4  | 1 |  Rn  |  Rd  |
- * +-----------------------+--------+------------+---+------+------+
- *
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
- * index: encoded in imm5<4:size+1>
- */
-static void handle_simd_inse(DisasContext *s, int rd, int rn,
-                             int imm4, int imm5)
-{
-    int size = ctz32(imm5);
-    int src_index, dst_index;
-    TCGv_i64 tmp;
-
-    if (size > 3) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    dst_index = extract32(imm5, 1+size, 5);
-    src_index = extract32(imm4, size, 4);
-
-    tmp = tcg_temp_new_i64();
-
-    read_vec_element(s, tmp, rn, src_index, size);
-    write_vec_element(s, tmp, rd, dst_index, size);
-
-    /* INS is considered a 128-bit write for SVE. */
-    clear_vec_high(s, true, rd);
-}
-
-
-/* INS (General)
- *
- *  31                   21 20    16 15         10 9    5 4    0
- * +-----------------------+--------+-------------+------+------+
- * | 0 1 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 1 1 1 |  Rn  |  Rd  |
- * +-----------------------+--------+-------------+------+------+
- *
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
- * index: encoded in imm5<4:size+1>
- */
-static void handle_simd_insg(DisasContext *s, int rd, int rn, int imm5)
-{
-    int size = ctz32(imm5);
-    int idx;
-
-    if (size > 3) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    idx = extract32(imm5, 1 + size, 4 - size);
-    write_vec_element(s, cpu_reg(s, rn), rd, idx, size);
-
-    /* INS is considered a 128-bit write for SVE. */
-    clear_vec_high(s, true, rd);
-}
-
-/*
- * UMOV (General)
- * SMOV (General)
- *
- *  31  30  29               21 20    16 15    12   10 9    5 4    0
- * +---+---+-------------------+--------+-------------+------+------+
- * | 0 | Q | 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 1 U 1 1 |  Rn  |  Rd  |
- * +---+---+-------------------+--------+-------------+------+------+
- *
- * U: unsigned when set
- * size: encoded in imm5 (see ARM ARM LowestSetBit())
- */
-static void handle_simd_umov_smov(DisasContext *s, int is_q, int is_signed,
-                                  int rn, int rd, int imm5)
-{
-    int size = ctz32(imm5);
-    int element;
-    TCGv_i64 tcg_rd;
-
-    /* Check for UnallocatedEncodings */
-    if (is_signed) {
-        if (size > 2 || (size == 2 && !is_q)) {
-            unallocated_encoding(s);
-            return;
-        }
-    } else {
-        if (size > 3
-            || (size < 3 && is_q)
-            || (size == 3 && !is_q)) {
-            unallocated_encoding(s);
-            return;
-        }
-    }
-
-    if (!fp_access_check(s)) {
-        return;
-    }
-
-    element = extract32(imm5, 1+size, 4);
-
-    tcg_rd = cpu_reg(s, rd);
-    read_vec_element(s, tcg_rd, rn, element, size | (is_signed ? MO_SIGN : 0));
-    if (is_signed && !is_q) {
-        tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
-    }
-}
-
-/* AdvSIMD copy
- *  31  30  29   28             21 20  16 15  14  11 10  9    5 4    0
- * +---+---+----+-----------------+------+---+------+---+------+------+
- * | 0 | Q | op | 0 1 1 1 0 0 0 0 | imm5 | 0 | imm4 | 1 |  Rn  |  Rd  |
- * +---+---+----+-----------------+------+---+------+---+------+------+
- */
-static void disas_simd_copy(DisasContext *s, uint32_t insn)
-{
-    int rd = extract32(insn, 0, 5);
-    int rn = extract32(insn, 5, 5);
-    int imm4 = extract32(insn, 11, 4);
-    int op = extract32(insn, 29, 1);
-    int is_q = extract32(insn, 30, 1);
-    int imm5 = extract32(insn, 16, 5);
-
-    if (op) {
-        if (is_q) {
-            /* INS (element) */
-            handle_simd_inse(s, rd, rn, imm4, imm5);
-        } else {
-            unallocated_encoding(s);
-        }
-    } else {
-        switch (imm4) {
-        case 0:
-            /* DUP (element - vector) */
-            handle_simd_dupe(s, is_q, rd, rn, imm5);
-            break;
-        case 1:
-            /* DUP (general) */
-            handle_simd_dupg(s, is_q, rd, rn, imm5);
-            break;
-        case 3:
-            if (is_q) {
-                /* INS (general) */
-                handle_simd_insg(s, rd, rn, imm5);
-            } else {
-                unallocated_encoding(s);
-            }
-            break;
-        case 5:
-        case 7:
-            /* UMOV/SMOV (is_q indicates 32/64; imm4 indicates signedness) */
-            handle_simd_umov_smov(s, is_q, (imm4 == 5), rn, rd, imm5);
-            break;
-        default:
-            unallocated_encoding(s);
-            break;
-        }
-    }
-}
-
 /* AdvSIMD modified immediate
  *  31  30  29   28                 19 18 16 15   12 11   10  9     5 4    0
  * +---+---+----+---------------------+-----+-------+----+---+-------+------+
@@ -8085,29 +7962,6 @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
     }
 }
 
-/* AdvSIMD scalar copy
- *  31 30 29   28             21 20  16 15  14  11 10  9    5 4    0
- * +-----+----+-----------------+------+---+------+---+------+------+
- * | 0 1 | op | 1 1 1 1 0 0 0 0 | imm5 | 0 | imm4 | 1 |  Rn  |  Rd  |
- * +-----+----+-----------------+------+---+------+---+------+------+
- */
-static void disas_simd_scalar_copy(DisasContext *s, uint32_t insn)
-{
-    int rd = extract32(insn, 0, 5);
-    int rn = extract32(insn, 5, 5);
-    int imm4 = extract32(insn, 11, 4);
-    int imm5 = extract32(insn, 16, 5);
-    int op = extract32(insn, 29, 1);
-
-    if (op != 0 || imm4 != 0) {
-        unallocated_encoding(s);
-        return;
-    }
-
-    /* DUP (element, scalar) */
-    handle_simd_dupes(s, rd, rn, imm5);
-}
-
 /* AdvSIMD scalar pairwise
  *  31 30 29  28       24 23  22 21       17 16    12 11 10 9    5 4    0
  * +-----+---+-----------+------+-----------+--------+-----+------+------+
@@ -13614,7 +13468,6 @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x0e200000, 0x9f200c00, disas_simd_three_reg_diff },
     { 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc },
     { 0x0e300800, 0x9f3e0c00, disas_simd_across_lanes },
-    { 0x0e000400, 0x9fe08400, disas_simd_copy },
     { 0x0f000000, 0x9f000400, disas_simd_indexed }, /* vector indexed */
     /* simd_mod_imm decode is a subset of simd_shift_imm, so must precede it */
     { 0x0f000400, 0x9ff80400, disas_simd_mod_imm },
@@ -13627,7 +13480,6 @@ static const AArch64DecodeTable data_proc_simd[] = {
     { 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff },
     { 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
     { 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise },
-    { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
     { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */
     { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
     { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 },

From patchwork Fri May 24 23:20:32 2024
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673804
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org, Peter Maydell
Subject: [PATCH v2 18/67] target/arm: Convert FMULX to decodetree
Date: Fri, 24 May 2024 16:20:32 -0700
Message-Id: <20240524232121.284515-19-richard.henderson@linaro.org>
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

Convert all forms (scalar, vector, scalar indexed, vector indexed),
which allows us to remove switch table entries elsewhere.

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/tcg/helper-a64.h    |   8 ++
 target/arm/tcg/a64.decode      |  45 +++++++
 target/arm/tcg/translate-a64.c | 221 +++++++++++++++++++++++++++------
 target/arm/tcg/vec_helper.c    |  39 +++---
 4 files changed, 259 insertions(+), 54 deletions(-)

diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
index 0518165399..b79751a717 100644
--- a/target/arm/tcg/helper-a64.h
+++ b/target/arm/tcg/helper-a64.h
@@ -132,3 +132,11 @@ DEF_HELPER_4(cpye, void, env, i32, i32, i32)
 DEF_HELPER_4(cpyfp, void, env, i32, i32, i32)
 DEF_HELPER_4(cpyfm, void, env, i32, i32, i32)
 DEF_HELPER_4(cpyfe, void, env, i32, i32, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fmulx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmulx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index d5bfeae7a8..2e0e01be01 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -20,21 +20,44 @@
 #
 
 %rd 0:5
+%esz_sd 22:1 !function=plus_2
+%hl 11:1 21:1
+%hlm 11:1 20:2
 
 &r rn
 &ri rd imm
 &rri_sf rd rn imm sf
 &i imm
+&rrr_e rd rn rm esz
+&rrx_e rd rn rm idx esz
 &qrr_e q rd rn esz
 &qrrr_e q rd rn rm esz
+&qrrx_e q rd rn rm idx esz
 &qrrrr_e q rd rn rm ra esz
 
+@rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1
+@rrr_sd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_sd
+
+@rrx_h ........ .. .. rm:4 .... . . rn:5 rd:5 &rrx_e esz=1 idx=%hlm
+@rrx_s ........ .. . rm:5 .... . . rn:5 rd:5 &rrx_e esz=2 idx=%hl
+@rrx_d ........ .. . rm:5 .... idx:1 . rn:5 rd:5 &rrx_e esz=3
+
 @rr_q1e0 ........ ........ ...... rn:5 rd:5 &qrr_e q=1 esz=0
 @r2r_q1e0 ........ ........ ...... rm:5 rd:5 &qrrr_e rn=%rd q=1 esz=0
 @rrr_q1e0 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=0
 @rrr_q1e3 ........ ... rm:5 ...... rn:5 rd:5 &qrrr_e q=1 esz=3
 @rrrr_q1e3 ........ ... rm:5 . ra:5 rn:5 rd:5 &qrrrr_e q=1 esz=3
 
+@qrrr_h . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=1
+@qrrr_sd . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=%esz_sd
+
+@qrrx_h . q:1 .. .... .. .. rm:4 .... . . rn:5 rd:5 \
+                &qrrx_e esz=1 idx=%hlm
+@qrrx_s . q:1 .. .... .. . rm:5 .... . . rn:5 rd:5 \
+                &qrrx_e esz=2 idx=%hl
+@qrrx_d . q:1 .. .... .. . rm:5 .... idx:1 . rn:5 rd:5 \
+                &qrrx_e esz=3
+
 ### Data Processing - Immediate
 
 # PC-rel addressing
@@ -671,3 +694,25 @@ INS_general 0 1 00 1110 000 imm:5 0 0011 1 rn:5 rd:5
 SMOV 0 q:1 00 1110 000 imm:5 0 0101 1 rn:5 rd:5
 UMOV 0 q:1 00 1110 000 imm:5 0 0111 1 rn:5 rd:5
 INS_element 0 1 10 1110 000 di:5 0 si:4 1 rn:5 rd:5
+
+### Advanced SIMD scalar three same
+
+FMULX_s 0101 1110 010 ..... 00011 1 ..... ..... @rrr_h
+FMULX_s 0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd
+
+### Advanced SIMD three same
+
+FMULX_v 0.00 1110 010 ..... 00011 1 ..... ..... @qrrr_h
+FMULX_v 0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd
+
+### Advanced SIMD scalar x indexed element
+
+FMULX_si 0111 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
+FMULX_si 0111 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s
+FMULX_si 0111 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d
+
+### Advanced SIMD vector x indexed element
+
+FMULX_vi 0.10 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h
+FMULX_vi 0.10 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s
+FMULX_vi 0.10 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 1a12bf22fd..8cbe6cd70f 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -4841,6 +4841,178 @@ static bool trans_INS_element(DisasContext *s, arg_INS_element *a)
     return true;
 }
 
+/*
+ * Advanced SIMD three same
+ */
+
+typedef struct FPScalar {
+    void (*gen_h)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+    void (*gen_s)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+    void (*gen_d)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
+} FPScalar;
+
+static bool do_fp3_scalar(DisasContext *s, arg_rrr_e *a, const FPScalar *f)
+{
+    switch (a->esz) {
+    case MO_64:
+        if (fp_access_check(s)) {
+            TCGv_i64 t0 = read_fp_dreg(s, a->rn);
+            TCGv_i64 t1 = read_fp_dreg(s, a->rm);
+            f->gen_d(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
+            write_fp_dreg(s, a->rd, t0);
+        }
+        break;
+    case MO_32:
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = read_fp_sreg(s, a->rn);
+            TCGv_i32 t1 = read_fp_sreg(s, a->rm);
+            f->gen_s(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = read_fp_hreg(s, a->rn);
+            TCGv_i32 t1 = read_fp_hreg(s, a->rm);
+            f->gen_h(t0, t0, t1, fpstatus_ptr(FPST_FPCR_F16));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    default:
+        return false;
+    }
+    return true;
+}
+
+static const FPScalar f_scalar_fmulx = {
+    gen_helper_advsimd_mulxh,
+    gen_helper_vfp_mulxs,
+    gen_helper_vfp_mulxd,
+};
+TRANS(FMULX_s, do_fp3_scalar, a, &f_scalar_fmulx)
+
+static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
+                          gen_helper_gvec_3_ptr * const fns[3])
+{
+    MemOp esz = a->esz;
+
+    switch (esz) {
+    case MO_64:
+        if (!a->q) {
+            return false;
+        }
+        break;
+    case MO_32:
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        break;
+    default:
+        return false;
+    }
+    if (fp_access_check(s)) {
+        gen_gvec_op3_fpst(s, a->q, a->rd, a->rn, a->rm,
+                          esz == MO_16, 0, fns[esz - 1]);
+    }
+    return true;
+}
+
+static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = {
+    gen_helper_gvec_fmulx_h,
+    gen_helper_gvec_fmulx_s,
+    gen_helper_gvec_fmulx_d,
+};
+TRANS(FMULX_v, do_fp3_vector, a, f_vector_fmulx)
+
+/*
+ * Advanced SIMD scalar/vector x indexed element
+ */
+
+static bool do_fp3_scalar_idx(DisasContext *s, arg_rrx_e *a, const FPScalar *f)
+{
+    switch (a->esz) {
+    case MO_64:
+        if (fp_access_check(s)) {
+            TCGv_i64 t0 = read_fp_dreg(s, a->rn);
+            TCGv_i64 t1 = tcg_temp_new_i64();
+
+            read_vec_element(s, t1, a->rm, a->idx, MO_64);
+            f->gen_d(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
+            write_fp_dreg(s, a->rd, t0);
+        }
+        break;
+    case MO_32:
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = read_fp_sreg(s, a->rn);
+            TCGv_i32 t1 = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, t1, a->rm, a->idx, MO_32);
+            f->gen_s(t0, t0, t1, fpstatus_ptr(FPST_FPCR));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        if (fp_access_check(s)) {
+            TCGv_i32 t0 = read_fp_hreg(s, a->rn);
+            TCGv_i32 t1 = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, t1, a->rm, a->idx, MO_16);
+            f->gen_h(t0, t0, t1, fpstatus_ptr(FPST_FPCR_F16));
+            write_fp_sreg(s, a->rd, t0);
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    return true;
+}
+
+TRANS(FMULX_si, do_fp3_scalar_idx, a, &f_scalar_fmulx)
+
+static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a,
+                              gen_helper_gvec_3_ptr * const fns[3])
+{
+    MemOp esz = a->esz;
+
+    switch (esz) {
+    case MO_64:
+        if (!a->q) {
+            return false;
+        }
+        break;
+    case MO_32:
+        break;
+    case MO_16:
+        if (!dc_isar_feature(aa64_fp16, s)) {
+            return false;
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    if (fp_access_check(s)) {
+        gen_gvec_op3_fpst(s, a->q, a->rd, a->rn, a->rm,
+                          esz == MO_16, a->idx, fns[esz - 1]);
+    }
+    return true;
+}
+
+static gen_helper_gvec_3_ptr * const f_vector_idx_fmulx[3] = {
+    gen_helper_gvec_fmulx_idx_h,
+    gen_helper_gvec_fmulx_idx_s,
+    gen_helper_gvec_fmulx_idx_d,
+};
+TRANS(FMULX_vi, do_fp3_vector_idx, a, f_vector_idx_fmulx)
+
+
 /* Shift a TCGv src by TCGv shift_amount, put result in dst.
  * Note that it is the caller's responsibility to ensure that the
  * shift amount is in range (ie 0..31 or 0..63) and provide the ARM
@@ -9011,9 +9183,6 @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             case 0x1a: /* FADD */
                 gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
-            case 0x1b: /* FMULX */
-                gen_helper_vfp_mulxd(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
             case 0x1c: /* FCMEQ */
                 gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
@@ -9058,6 +9227,7 @@ static void handle_3same_float(DisasContext *s, int size, int elements,
                 gen_helper_neon_acgt_f64(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
             default:
+            case 0x1b: /* FMULX */
                 g_assert_not_reached();
             }
 
@@ -9084,9 +9254,6 @@ static void handle_3same_float(DisasContext *s, int size, int elements,
             case 0x1a: /* FADD */
                 gen_helper_vfp_adds(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
-            case 0x1b: /* FMULX */
-                gen_helper_vfp_mulxs(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
             case 0x1c: /* FCMEQ */
                 gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
@@ -9134,6 +9301,7 @@ static void handle_3same_float(DisasContext *s, int size, int elements,
                 gen_helper_neon_acgt_f32(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
             default:
+            case 0x1b: /* FMULX */
                 g_assert_not_reached();
             }
 
@@ -9172,7 +9340,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
         /* Floating point: U, size[1] and opcode indicate operation */
         int fpopcode = opcode | (extract32(size, 1, 1) << 5) | (u << 6);
         switch (fpopcode) {
-        case 0x1b: /* FMULX */
         case 0x1f: /* FRECPS */
         case 0x3f: /* FRSQRTS */
         case 0x5d: /* FACGE */
@@ -9183,6 +9350,7 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
         case 0x7a: /* FABD */
             break;
         default:
+        case 0x1b: /* FMULX */
             unallocated_encoding(s);
             return;
         }
@@ -9335,7 +9503,6 @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
     TCGv_i32 tcg_res;
 
     switch (fpopcode) {
-    case 0x03: /* FMULX */
     case 0x04: /* FCMEQ (reg) */
     case 0x07: /* FRECPS */
     case 0x0f: /* FRSQRTS */
@@ -9346,6 +9513,7 @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
     case 0x1d: /* FACGT */
         break;
     default:
+    case 0x03: /* FMULX */
         unallocated_encoding(s);
         return;
     }
@@ -9365,9 +9533,6 @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
     tcg_res = tcg_temp_new_i32();
 
     switch (fpopcode) {
-    case 0x03: /* FMULX */
-        gen_helper_advsimd_mulxh(tcg_res, tcg_op1, tcg_op2, fpst);
-        break;
     case 0x04: /* FCMEQ (reg) */
         gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
@@ -9394,6 +9559,7 @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
         gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
     default:
+    case 0x03: /* FMULX */
         g_assert_not_reached();
     }
 
@@ -11051,7 +11217,6 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
         handle_simd_3same_pair(s, is_q, 0, fpopcode, size ? MO_64 : MO_32,
                                rn, rm, rd);
         return;
-    case 0x1b: /* FMULX */
     case 0x1f: /* FRECPS */
     case 0x3f: /* FRSQRTS */
     case 0x5d: /* FACGE */
@@ -11097,6 +11262,7 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
         return;
 
     default:
+    case 0x1b: /* FMULX */
         unallocated_encoding(s);
         return;
     }
@@ -11441,7 +11607,6 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     case 0x0: /* FMAXNM */
     case 0x1: /* FMLA */
     case 0x2: /* FADD */
-    case 0x3: /* FMULX */
     case 0x4: /* FCMEQ */
     case 0x6: /* FMAX */
     case 0x7: /* FRECPS */
@@ -11467,6 +11632,7 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
         pairwise = true;
         break;
     default:
+    case 0x3: /* FMULX */
         unallocated_encoding(s);
         return;
     }
@@ -11543,9 +11709,6 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
             case 0x2: /* FADD */
                 gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
-            case 0x3: /* FMULX */
-                gen_helper_advsimd_mulxh(tcg_res, tcg_op1, tcg_op2, fpst);
-                break;
             case 0x4: /* FCMEQ */
                 gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
@@ -11597,6 +11760,7 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
                 gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
             default:
+            case 0x3: /* FMULX */
                 g_assert_not_reached();
             }
 
@@ -12816,7 +12980,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
     case 0x01: /* FMLA */
     case 0x05: /* FMLS */
     case 0x09: /* FMUL */
-    case 0x19: /* FMULX */
         is_fp = 1;
         break;
     case 0x1d: /* SQRDMLAH */
@@ -12885,6 +13048,7 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
         /* is_fp, but we pass tcg_env not fp_status. */
         break;
     default:
+    case 0x19: /* FMULX */
         unallocated_encoding(s);
         return;
     }
@@ -13108,10 +13272,8 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
             case 0x09: /* FMUL */
                 gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst);
                 break;
-            case 0x19: /* FMULX */
-                gen_helper_vfp_mulxd(tcg_res, tcg_op, tcg_idx, fpst);
-                break;
             default:
+            case 0x19: /* FMULX */
                 g_assert_not_reached();
             }
 
@@ -13224,24 +13386,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
                     g_assert_not_reached();
                 }
                 break;
-            case 0x19: /* FMULX */
-                switch (size) {
-                case 1:
-                    if (is_scalar) {
-                        gen_helper_advsimd_mulxh(tcg_res, tcg_op,
-                                                 tcg_idx, fpst);
-                    } else {
-                        gen_helper_advsimd_mulx2h(tcg_res, tcg_op,
-                                                  tcg_idx, fpst);
-                    }
-                    break;
-                case 2:
-                    gen_helper_vfp_mulxs(tcg_res, tcg_op, tcg_idx, fpst);
-                    break;
-                default:
-                    g_assert_not_reached();
-                }
-                break;
             case 0x0c: /* SQDMULH */
                 if (size == 1) {
                     gen_helper_neon_qdmulh_s16(tcg_res, tcg_env,
@@ -13283,6 +13427,7 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
                 }
                 break;
             default:
+            case 0x19: /* FMULX */
                 g_assert_not_reached();
             }
 
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 1f93510b85..8684581923 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -1248,6 +1248,9 @@ DO_3OP(gvec_rsqrts_nf_h, float16_rsqrts_nf, float16)
 DO_3OP(gvec_rsqrts_nf_s, float32_rsqrts_nf, float32)
 
 #ifdef TARGET_AARCH64
+DO_3OP(gvec_fmulx_h, helper_advsimd_mulxh, float16)
+DO_3OP(gvec_fmulx_s, helper_vfp_mulxs, float32)
+DO_3OP(gvec_fmulx_d, helper_vfp_mulxd, float64)
 
 DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
 DO_3OP(gvec_recps_s, helper_recpsf_f32, float32)
@@ -1385,7 +1388,7 @@ DO_MLA_IDX(gvec_mls_idx_d, uint64_t, -, H8)
 
 #undef DO_MLA_IDX
 
-#define DO_FMUL_IDX(NAME, ADD, TYPE, H) \
+#define DO_FMUL_IDX(NAME, ADD, MUL, TYPE, H) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
 { \
     intptr_t i, j, oprsz = simd_oprsz(desc); \
@@ -1395,33
+1398,37 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \ for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \ TYPE mm = m[H(i + idx)]; \ for (j = 0; j < segment; j++) { \ - d[i + j] = TYPE##_##ADD(d[i + j], \ - TYPE##_mul(n[i + j], mm, stat), stat); \ + d[i + j] = ADD(d[i + j], MUL(n[i + j], mm, stat), stat); \ } \ } \ clear_tail(d, oprsz, simd_maxsz(desc)); \ } -#define float16_nop(N, M, S) (M) -#define float32_nop(N, M, S) (M) -#define float64_nop(N, M, S) (M) +#define nop(N, M, S) (M) -DO_FMUL_IDX(gvec_fmul_idx_h, nop, float16, H2) -DO_FMUL_IDX(gvec_fmul_idx_s, nop, float32, H4) -DO_FMUL_IDX(gvec_fmul_idx_d, nop, float64, H8) +DO_FMUL_IDX(gvec_fmul_idx_h, nop, float16_mul, float16, H2) +DO_FMUL_IDX(gvec_fmul_idx_s, nop, float32_mul, float32, H4) +DO_FMUL_IDX(gvec_fmul_idx_d, nop, float64_mul, float64, H8) + +#ifdef TARGET_AARCH64 + +DO_FMUL_IDX(gvec_fmulx_idx_h, nop, helper_advsimd_mulxh, float16, H2) +DO_FMUL_IDX(gvec_fmulx_idx_s, nop, helper_vfp_mulxs, float32, H4) +DO_FMUL_IDX(gvec_fmulx_idx_d, nop, helper_vfp_mulxd, float64, H8) + +#endif + +#undef nop /* * Non-fused multiply-accumulate operations, for Neon. NB that unlike * the fused ops below they assume accumulate both from and into Vd. 
 */ -DO_FMUL_IDX(gvec_fmla_nf_idx_h, add, float16, H2) -DO_FMUL_IDX(gvec_fmla_nf_idx_s, add, float32, H4) -DO_FMUL_IDX(gvec_fmls_nf_idx_h, sub, float16, H2) -DO_FMUL_IDX(gvec_fmls_nf_idx_s, sub, float32, H4) +DO_FMUL_IDX(gvec_fmla_nf_idx_h, float16_add, float16_mul, float16, H2) +DO_FMUL_IDX(gvec_fmla_nf_idx_s, float32_add, float32_mul, float32, H4) +DO_FMUL_IDX(gvec_fmls_nf_idx_h, float16_sub, float16_mul, float16, H2) +DO_FMUL_IDX(gvec_fmls_nf_idx_s, float32_sub, float32_mul, float32, H4) -#undef float16_nop -#undef float32_nop -#undef float64_nop #undef DO_FMUL_IDX #define DO_FMLA_IDX(NAME, TYPE, H) \
From patchwork Fri May 24 23:20:33 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673844
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org, Peter Maydell
Subject: [PATCH v2 19/67] target/arm: Convert FADD, FSUB, FDIV, FMUL to decodetree
Date: Fri, 24 May 2024 16:20:33 -0700
Message-Id: <20240524232121.284515-20-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/tcg/helper-a64.h    |   4 +
 target/arm/tcg/translate.h     |   5 +
 target/arm/tcg/a64.decode      |  27 +++++
 target/arm/tcg/translate-a64.c | 205 +++++++++++++++++----------------
 target/arm/tcg/vec_helper.c    |   4 +
 5 files changed, 143 insertions(+), 102 deletions(-)

diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h index b79751a717..371388f61b 100644 --- a/target/arm/tcg/helper-a64.h +++ b/target/arm/tcg/helper-a64.h @@ -133,6 +133,10 @@ DEF_HELPER_4(cpyfp, void, env, i32, i32, i32) DEF_HELPER_4(cpyfm, void, env, i32, i32, i32) DEF_HELPER_4(cpyfe, 
void, env, i32, i32, i32) +DEF_HELPER_FLAGS_5(gvec_fdiv_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fdiv_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fdiv_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(gvec_fmulx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fmulx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h index 80e85096a8..ecfa242eef 100644 --- a/target/arm/tcg/translate.h +++ b/target/arm/tcg/translate.h @@ -252,6 +252,11 @@ static inline int shl_12(DisasContext *s, int x) return x << 12; } +static inline int xor_2(DisasContext *s, int x) +{ + return x ^ 2; +} + static inline int neon_3same_fp_size(DisasContext *s, int x) { /* Convert 0==fp32, 1==fp16 into a MO_* value */ diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 2e0e01be01..82daafbef5 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -21,6 +21,7 @@ %rd 0:5 %esz_sd 22:1 !function=plus_2 +%esz_hsd 22:2 !function=xor_2 %hl 11:1 21:1 %hlm 11:1 20:2 @@ -37,6 +38,7 @@ @rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1 @rrr_sd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_sd +@rrr_hsd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_hsd @rrx_h ........ .. .. rm:4 .... . . rn:5 rd:5 &rrx_e esz=1 idx=%hlm @rrx_s ........ .. . rm:5 .... . . rn:5 rd:5 &rrx_e esz=2 idx=%hl @@ -697,22 +699,47 @@ INS_element 0 1 10 1110 000 di:5 0 si:4 1 rn:5 rd:5 ### Advanced SIMD scalar three same +FADD_s 0001 1110 ..1 ..... 0010 10 ..... ..... @rrr_hsd +FSUB_s 0001 1110 ..1 ..... 0011 10 ..... ..... @rrr_hsd +FDIV_s 0001 1110 ..1 ..... 0001 10 ..... ..... @rrr_hsd +FMUL_s 0001 1110 ..1 ..... 0000 10 ..... ..... @rrr_hsd + FMULX_s 0101 1110 010 ..... 00011 1 ..... ..... @rrr_h FMULX_s 0101 1110 0.1 ..... 
11011 1 ..... ..... @rrr_sd ### Advanced SIMD three same +FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h +FADD_v 0.00 1110 0.1 ..... 11010 1 ..... ..... @qrrr_sd + +FSUB_v 0.00 1110 110 ..... 00010 1 ..... ..... @qrrr_h +FSUB_v 0.00 1110 1.1 ..... 11010 1 ..... ..... @qrrr_sd + +FDIV_v 0.10 1110 010 ..... 00111 1 ..... ..... @qrrr_h +FDIV_v 0.10 1110 0.1 ..... 11111 1 ..... ..... @qrrr_sd + +FMUL_v 0.10 1110 010 ..... 00011 1 ..... ..... @qrrr_h +FMUL_v 0.10 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd + FMULX_v 0.00 1110 010 ..... 00011 1 ..... ..... @qrrr_h FMULX_v 0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd ### Advanced SIMD scalar x indexed element +FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h +FMUL_si 0101 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s +FMUL_si 0101 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d + FMULX_si 0111 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h FMULX_si 0111 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s FMULX_si 0111 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d ### Advanced SIMD vector x indexed element +FMUL_vi 0.00 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h +FMUL_vi 0.00 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s +FMUL_vi 0.00 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d + FMULX_vi 0.10 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h FMULX_vi 0.10 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s FMULX_vi 0.10 1111 11 0 ..... 1001 . 0 ..... ..... 
@qrrx_d diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 8cbe6cd70f..97c3d758d6 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -4887,6 +4887,34 @@ static bool do_fp3_scalar(DisasContext *s, arg_rrr_e *a, const FPScalar *f) return true; } +static const FPScalar f_scalar_fadd = { + gen_helper_vfp_addh, + gen_helper_vfp_adds, + gen_helper_vfp_addd, +}; +TRANS(FADD_s, do_fp3_scalar, a, &f_scalar_fadd) + +static const FPScalar f_scalar_fsub = { + gen_helper_vfp_subh, + gen_helper_vfp_subs, + gen_helper_vfp_subd, +}; +TRANS(FSUB_s, do_fp3_scalar, a, &f_scalar_fsub) + +static const FPScalar f_scalar_fdiv = { + gen_helper_vfp_divh, + gen_helper_vfp_divs, + gen_helper_vfp_divd, +}; +TRANS(FDIV_s, do_fp3_scalar, a, &f_scalar_fdiv) + +static const FPScalar f_scalar_fmul = { + gen_helper_vfp_mulh, + gen_helper_vfp_muls, + gen_helper_vfp_muld, +}; +TRANS(FMUL_s, do_fp3_scalar, a, &f_scalar_fmul) + static const FPScalar f_scalar_fmulx = { gen_helper_advsimd_mulxh, gen_helper_vfp_mulxs, @@ -4922,6 +4950,34 @@ static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, return true; } +static gen_helper_gvec_3_ptr * const f_vector_fadd[3] = { + gen_helper_gvec_fadd_h, + gen_helper_gvec_fadd_s, + gen_helper_gvec_fadd_d, +}; +TRANS(FADD_v, do_fp3_vector, a, f_vector_fadd) + +static gen_helper_gvec_3_ptr * const f_vector_fsub[3] = { + gen_helper_gvec_fsub_h, + gen_helper_gvec_fsub_s, + gen_helper_gvec_fsub_d, +}; +TRANS(FSUB_v, do_fp3_vector, a, f_vector_fsub) + +static gen_helper_gvec_3_ptr * const f_vector_fdiv[3] = { + gen_helper_gvec_fdiv_h, + gen_helper_gvec_fdiv_s, + gen_helper_gvec_fdiv_d, +}; +TRANS(FDIV_v, do_fp3_vector, a, f_vector_fdiv) + +static gen_helper_gvec_3_ptr * const f_vector_fmul[3] = { + gen_helper_gvec_fmul_h, + gen_helper_gvec_fmul_s, + gen_helper_gvec_fmul_d, +}; +TRANS(FMUL_v, do_fp3_vector, a, f_vector_fmul) + static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = { 
gen_helper_gvec_fmulx_h, gen_helper_gvec_fmulx_s, @@ -4975,6 +5031,7 @@ static bool do_fp3_scalar_idx(DisasContext *s, arg_rrx_e *a, const FPScalar *f) return true; } +TRANS(FMUL_si, do_fp3_scalar_idx, a, &f_scalar_fmul) TRANS(FMULX_si, do_fp3_scalar_idx, a, &f_scalar_fmulx) static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a, @@ -5005,6 +5062,13 @@ static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a, return true; } +static gen_helper_gvec_3_ptr * const f_vector_idx_fmul[3] = { + gen_helper_gvec_fmul_idx_h, + gen_helper_gvec_fmul_idx_s, + gen_helper_gvec_fmul_idx_d, +}; +TRANS(FMUL_vi, do_fp3_vector_idx, a, f_vector_idx_fmul) + static gen_helper_gvec_3_ptr * const f_vector_idx_fmulx[3] = { gen_helper_gvec_fmulx_idx_h, gen_helper_gvec_fmulx_idx_s, @@ -6827,18 +6891,6 @@ static void handle_fp_2src_single(DisasContext *s, int opcode, tcg_op2 = read_fp_sreg(s, rm); switch (opcode) { - case 0x0: /* FMUL */ - gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x1: /* FDIV */ - gen_helper_vfp_divs(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x2: /* FADD */ - gen_helper_vfp_adds(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x3: /* FSUB */ - gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x4: /* FMAX */ gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -6855,6 +6907,12 @@ static void handle_fp_2src_single(DisasContext *s, int opcode, gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst); gen_helper_vfp_negs(tcg_res, tcg_res); break; + default: + case 0x0: /* FMUL */ + case 0x1: /* FDIV */ + case 0x2: /* FADD */ + case 0x3: /* FSUB */ + g_assert_not_reached(); } write_fp_sreg(s, rd, tcg_res); @@ -6875,18 +6933,6 @@ static void handle_fp_2src_double(DisasContext *s, int opcode, tcg_op2 = read_fp_dreg(s, rm); switch (opcode) { - case 0x0: /* FMUL */ - gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x1: /* FDIV */ - gen_helper_vfp_divd(tcg_res, tcg_op1, tcg_op2, fpst); - 
break; - case 0x2: /* FADD */ - gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x3: /* FSUB */ - gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x4: /* FMAX */ gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -6903,6 +6949,12 @@ static void handle_fp_2src_double(DisasContext *s, int opcode, gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst); gen_helper_vfp_negd(tcg_res, tcg_res); break; + default: + case 0x0: /* FMUL */ + case 0x1: /* FDIV */ + case 0x2: /* FADD */ + case 0x3: /* FSUB */ + g_assert_not_reached(); } write_fp_dreg(s, rd, tcg_res); @@ -6923,18 +6975,6 @@ static void handle_fp_2src_half(DisasContext *s, int opcode, tcg_op2 = read_fp_hreg(s, rm); switch (opcode) { - case 0x0: /* FMUL */ - gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x1: /* FDIV */ - gen_helper_advsimd_divh(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x2: /* FADD */ - gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x3: /* FSUB */ - gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x4: /* FMAX */ gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -6952,6 +6992,10 @@ static void handle_fp_2src_half(DisasContext *s, int opcode, tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000); break; default: + case 0x0: /* FMUL */ + case 0x1: /* FDIV */ + case 0x2: /* FADD */ + case 0x3: /* FSUB */ g_assert_not_reached(); } @@ -9180,9 +9224,6 @@ static void handle_3same_float(DisasContext *s, int size, int elements, case 0x18: /* FMAXNM */ gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x1a: /* FADD */ - gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x1c: /* FCMEQ */ gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -9195,27 +9236,18 @@ static void handle_3same_float(DisasContext *s, int size, int elements, case 0x38: /* FMINNM */ gen_helper_vfp_minnumd(tcg_res, tcg_op1, tcg_op2, fpst); 
break; - case 0x3a: /* FSUB */ - gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x3e: /* FMIN */ gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst); break; case 0x3f: /* FRSQRTS */ gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x5b: /* FMUL */ - gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x5c: /* FCMGE */ gen_helper_neon_cge_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; case 0x5d: /* FACGE */ gen_helper_neon_acge_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x5f: /* FDIV */ - gen_helper_vfp_divd(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x7a: /* FABD */ gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst); gen_helper_vfp_absd(tcg_res, tcg_res); @@ -9227,7 +9259,11 @@ static void handle_3same_float(DisasContext *s, int size, int elements, gen_helper_neon_acgt_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; default: + case 0x1a: /* FADD */ case 0x1b: /* FMULX */ + case 0x3a: /* FSUB */ + case 0x5b: /* FMUL */ + case 0x5f: /* FDIV */ g_assert_not_reached(); } @@ -9251,9 +9287,6 @@ static void handle_3same_float(DisasContext *s, int size, int elements, gen_helper_vfp_muladds(tcg_res, tcg_op1, tcg_op2, tcg_res, fpst); break; - case 0x1a: /* FADD */ - gen_helper_vfp_adds(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x1c: /* FCMEQ */ gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -9269,27 +9302,18 @@ static void handle_3same_float(DisasContext *s, int size, int elements, case 0x38: /* FMINNM */ gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x3a: /* FSUB */ - gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x3e: /* FMIN */ gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst); break; case 0x3f: /* FRSQRTS */ gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x5b: /* FMUL */ - gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x5c: /* FCMGE */ gen_helper_neon_cge_f32(tcg_res, tcg_op1, 
tcg_op2, fpst); break; case 0x5d: /* FACGE */ gen_helper_neon_acge_f32(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x5f: /* FDIV */ - gen_helper_vfp_divs(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x7a: /* FABD */ gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst); gen_helper_vfp_abss(tcg_res, tcg_res); @@ -9301,7 +9325,11 @@ static void handle_3same_float(DisasContext *s, int size, int elements, gen_helper_neon_acgt_f32(tcg_res, tcg_op1, tcg_op2, fpst); break; default: + case 0x1a: /* FADD */ case 0x1b: /* FMULX */ + case 0x3a: /* FSUB */ + case 0x5b: /* FMUL */ + case 0x5f: /* FDIV */ g_assert_not_reached(); } @@ -11224,15 +11252,11 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) case 0x19: /* FMLA */ case 0x39: /* FMLS */ case 0x18: /* FMAXNM */ - case 0x1a: /* FADD */ case 0x1c: /* FCMEQ */ case 0x1e: /* FMAX */ case 0x38: /* FMINNM */ - case 0x3a: /* FSUB */ case 0x3e: /* FMIN */ - case 0x5b: /* FMUL */ case 0x5c: /* FCMGE */ - case 0x5f: /* FDIV */ case 0x7a: /* FABD */ case 0x7c: /* FCMGT */ if (!fp_access_check(s)) { @@ -11262,7 +11286,11 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) return; default: + case 0x1a: /* FADD */ case 0x1b: /* FMULX */ + case 0x3a: /* FSUB */ + case 0x5b: /* FMUL */ + case 0x5f: /* FDIV */ unallocated_encoding(s); return; } @@ -11606,19 +11634,15 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) switch (fpopcode) { case 0x0: /* FMAXNM */ case 0x1: /* FMLA */ - case 0x2: /* FADD */ case 0x4: /* FCMEQ */ case 0x6: /* FMAX */ case 0x7: /* FRECPS */ case 0x8: /* FMINNM */ case 0x9: /* FMLS */ - case 0xa: /* FSUB */ case 0xe: /* FMIN */ case 0xf: /* FRSQRTS */ - case 0x13: /* FMUL */ case 0x14: /* FCMGE */ case 0x15: /* FACGE */ - case 0x17: /* FDIV */ case 0x1a: /* FABD */ case 0x1c: /* FCMGT */ case 0x1d: /* FACGT */ @@ -11632,7 +11656,11 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) pairwise = true; break; default: + 
case 0x2: /* FADD */ case 0x3: /* FMULX */ + case 0xa: /* FSUB */ + case 0x13: /* FMUL */ + case 0x17: /* FDIV */ unallocated_encoding(s); return; } @@ -11706,9 +11734,6 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res, fpst); break; - case 0x2: /* FADD */ - gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x4: /* FCMEQ */ gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -11728,27 +11753,18 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res, fpst); break; - case 0xa: /* FSUB */ - gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0xe: /* FMIN */ gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst); break; case 0xf: /* FRSQRTS */ gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x13: /* FMUL */ - gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x14: /* FCMGE */ gen_helper_advsimd_cge_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; case 0x15: /* FACGE */ gen_helper_advsimd_acge_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x17: /* FDIV */ - gen_helper_advsimd_divh(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x1a: /* FABD */ gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst); tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff); @@ -11760,7 +11776,11 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; default: + case 0x2: /* FADD */ case 0x3: /* FMULX */ + case 0xa: /* FSUB */ + case 0x13: /* FMUL */ + case 0x17: /* FDIV */ g_assert_not_reached(); } @@ -12979,7 +12999,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) break; case 0x01: /* FMLA */ case 0x05: /* FMLS */ - case 0x09: /* FMUL */ is_fp = 1; break; case 0x1d: /* SQRDMLAH */ @@ -13048,6 +13067,7 @@ 
static void disas_simd_indexed(DisasContext *s, uint32_t insn) /* is_fp, but we pass tcg_env not fp_status. */ break; default: + case 0x09: /* FMUL */ case 0x19: /* FMULX */ unallocated_encoding(s); return; @@ -13269,10 +13289,8 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) read_vec_element(s, tcg_res, rd, pass, MO_64); gen_helper_vfp_muladdd(tcg_res, tcg_op, tcg_idx, tcg_res, fpst); break; - case 0x09: /* FMUL */ - gen_helper_vfp_muld(tcg_res, tcg_op, tcg_idx, fpst); - break; default: + case 0x09: /* FMUL */ case 0x19: /* FMULX */ g_assert_not_reached(); } @@ -13368,24 +13386,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) g_assert_not_reached(); } break; - case 0x09: /* FMUL */ - switch (size) { - case 1: - if (is_scalar) { - gen_helper_advsimd_mulh(tcg_res, tcg_op, - tcg_idx, fpst); - } else { - gen_helper_advsimd_mul2h(tcg_res, tcg_op, - tcg_idx, fpst); - } - break; - case 2: - gen_helper_vfp_muls(tcg_res, tcg_op, tcg_idx, fpst); - break; - default: - g_assert_not_reached(); - } - break; case 0x0c: /* SQDMULH */ if (size == 1) { gen_helper_neon_qdmulh_s16(tcg_res, tcg_env, @@ -13427,6 +13427,7 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } break; default: + case 0x09: /* FMUL */ case 0x19: /* FMULX */ g_assert_not_reached(); } diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index 8684581923..4106536371 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -1248,6 +1248,10 @@ DO_3OP(gvec_rsqrts_nf_h, float16_rsqrts_nf, float16) DO_3OP(gvec_rsqrts_nf_s, float32_rsqrts_nf, float32) #ifdef TARGET_AARCH64 +DO_3OP(gvec_fdiv_h, float16_div, float16) +DO_3OP(gvec_fdiv_s, float32_div, float32) +DO_3OP(gvec_fdiv_d, float64_div, float64) + DO_3OP(gvec_fmulx_h, helper_advsimd_mulxh, float16) DO_3OP(gvec_fmulx_s, helper_vfp_mulxs, float32) DO_3OP(gvec_fmulx_d, helper_vfp_mulxd, float64) From patchwork Fri May 24 23:20:34 2024 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673831
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org, Peter Maydell
Subject: [PATCH v2 20/67] target/arm: Convert FMAX, FMIN, FMAXNM, FMINNM to decodetree
Date: Fri, 24 May 2024 16:20:34 -0700
Message-Id: <20240524232121.284515-21-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper.h            |   4 +
 target/arm/tcg/a64.decode      |  17 ++++
 target/arm/tcg/translate-a64.c | 168 +++++++++++++++++----------------
 target/arm/tcg/vec_helper.c    |   4 +
 4 files changed, 113 insertions(+), 80 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h index 2b02733305..7ee15b9651 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -748,15 +748,19 @@ DEF_HELPER_FLAGS_5(gvec_facgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fmax_h, TCG_CALL_NO_RWG, void, ptr, ptr, 
ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmax_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fmin_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fmin_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmin_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fmaxnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fmaxnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmaxnum_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fminnum_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fminnum_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fminnum_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_recps_nf_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_recps_nf_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 82daafbef5..e2678d919e 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -704,6 +704,11 @@ FSUB_s 0001 1110 ..1 ..... 0011 10 ..... ..... @rrr_hsd FDIV_s 0001 1110 ..1 ..... 0001 10 ..... ..... @rrr_hsd FMUL_s 0001 1110 ..1 ..... 0000 10 ..... ..... @rrr_hsd +FMAX_s 0001 1110 ..1 ..... 0100 10 ..... ..... @rrr_hsd +FMIN_s 0001 1110 ..1 ..... 0101 10 ..... ..... @rrr_hsd +FMAXNM_s 0001 1110 ..1 ..... 0110 10 ..... ..... @rrr_hsd +FMINNM_s 0001 1110 ..1 ..... 0111 10 ..... ..... @rrr_hsd + FMULX_s 0101 1110 010 ..... 00011 1 ..... ..... @rrr_h FMULX_s 0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd @@ -721,6 +726,18 @@ FDIV_v 0.10 1110 0.1 ..... 11111 1 ..... ..... @qrrr_sd FMUL_v 0.10 1110 010 ..... 00011 1 ..... ..... @qrrr_h FMUL_v 0.10 1110 0.1 ..... 11011 1 ..... ..... 
@qrrr_sd +FMAX_v 0.00 1110 010 ..... 00110 1 ..... ..... @qrrr_h +FMAX_v 0.00 1110 0.1 ..... 11110 1 ..... ..... @qrrr_sd + +FMIN_v 0.00 1110 110 ..... 00110 1 ..... ..... @qrrr_h +FMIN_v 0.00 1110 1.1 ..... 11110 1 ..... ..... @qrrr_sd + +FMAXNM_v 0.00 1110 010 ..... 00000 1 ..... ..... @qrrr_h +FMAXNM_v 0.00 1110 0.1 ..... 11000 1 ..... ..... @qrrr_sd + +FMINNM_v 0.00 1110 110 ..... 00000 1 ..... ..... @qrrr_h +FMINNM_v 0.00 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd + FMULX_v 0.00 1110 010 ..... 00011 1 ..... ..... @qrrr_h FMULX_v 0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 97c3d758d6..6f8207d842 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -4915,6 +4915,34 @@ static const FPScalar f_scalar_fmul = { }; TRANS(FMUL_s, do_fp3_scalar, a, &f_scalar_fmul) +static const FPScalar f_scalar_fmax = { + gen_helper_advsimd_maxh, + gen_helper_vfp_maxs, + gen_helper_vfp_maxd, +}; +TRANS(FMAX_s, do_fp3_scalar, a, &f_scalar_fmax) + +static const FPScalar f_scalar_fmin = { + gen_helper_advsimd_minh, + gen_helper_vfp_mins, + gen_helper_vfp_mind, +}; +TRANS(FMIN_s, do_fp3_scalar, a, &f_scalar_fmin) + +static const FPScalar f_scalar_fmaxnm = { + gen_helper_advsimd_maxnumh, + gen_helper_vfp_maxnums, + gen_helper_vfp_maxnumd, +}; +TRANS(FMAXNM_s, do_fp3_scalar, a, &f_scalar_fmaxnm) + +static const FPScalar f_scalar_fminnm = { + gen_helper_advsimd_minnumh, + gen_helper_vfp_minnums, + gen_helper_vfp_minnumd, +}; +TRANS(FMINNM_s, do_fp3_scalar, a, &f_scalar_fminnm) + static const FPScalar f_scalar_fmulx = { gen_helper_advsimd_mulxh, gen_helper_vfp_mulxs, @@ -4978,6 +5006,34 @@ static gen_helper_gvec_3_ptr * const f_vector_fmul[3] = { }; TRANS(FMUL_v, do_fp3_vector, a, f_vector_fmul) +static gen_helper_gvec_3_ptr * const f_vector_fmax[3] = { + gen_helper_gvec_fmax_h, + gen_helper_gvec_fmax_s, + gen_helper_gvec_fmax_d, +}; +TRANS(FMAX_v, do_fp3_vector, a, 
f_vector_fmax) + +static gen_helper_gvec_3_ptr * const f_vector_fmin[3] = { + gen_helper_gvec_fmin_h, + gen_helper_gvec_fmin_s, + gen_helper_gvec_fmin_d, +}; +TRANS(FMIN_v, do_fp3_vector, a, f_vector_fmin) + +static gen_helper_gvec_3_ptr * const f_vector_fmaxnm[3] = { + gen_helper_gvec_fmaxnum_h, + gen_helper_gvec_fmaxnum_s, + gen_helper_gvec_fmaxnum_d, +}; +TRANS(FMAXNM_v, do_fp3_vector, a, f_vector_fmaxnm) + +static gen_helper_gvec_3_ptr * const f_vector_fminnm[3] = { + gen_helper_gvec_fminnum_h, + gen_helper_gvec_fminnum_s, + gen_helper_gvec_fminnum_d, +}; +TRANS(FMINNM_v, do_fp3_vector, a, f_vector_fminnm) + static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = { gen_helper_gvec_fmulx_h, gen_helper_gvec_fmulx_s, @@ -6891,18 +6947,6 @@ static void handle_fp_2src_single(DisasContext *s, int opcode, tcg_op2 = read_fp_sreg(s, rm); switch (opcode) { - case 0x4: /* FMAX */ - gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x5: /* FMIN */ - gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x6: /* FMAXNM */ - gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x7: /* FMINNM */ - gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x8: /* FNMUL */ gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst); gen_helper_vfp_negs(tcg_res, tcg_res); @@ -6912,6 +6956,10 @@ static void handle_fp_2src_single(DisasContext *s, int opcode, case 0x1: /* FDIV */ case 0x2: /* FADD */ case 0x3: /* FSUB */ + case 0x4: /* FMAX */ + case 0x5: /* FMIN */ + case 0x6: /* FMAXNM */ + case 0x7: /* FMINNM */ g_assert_not_reached(); } @@ -6933,18 +6981,6 @@ static void handle_fp_2src_double(DisasContext *s, int opcode, tcg_op2 = read_fp_dreg(s, rm); switch (opcode) { - case 0x4: /* FMAX */ - gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x5: /* FMIN */ - gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x6: /* FMAXNM */ - gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, 
fpst); - break; - case 0x7: /* FMINNM */ - gen_helper_vfp_minnumd(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x8: /* FNMUL */ gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst); gen_helper_vfp_negd(tcg_res, tcg_res); @@ -6954,6 +6990,10 @@ static void handle_fp_2src_double(DisasContext *s, int opcode, case 0x1: /* FDIV */ case 0x2: /* FADD */ case 0x3: /* FSUB */ + case 0x4: /* FMAX */ + case 0x5: /* FMIN */ + case 0x6: /* FMAXNM */ + case 0x7: /* FMINNM */ g_assert_not_reached(); } @@ -6975,18 +7015,6 @@ static void handle_fp_2src_half(DisasContext *s, int opcode, tcg_op2 = read_fp_hreg(s, rm); switch (opcode) { - case 0x4: /* FMAX */ - gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x5: /* FMIN */ - gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x6: /* FMAXNM */ - gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x7: /* FMINNM */ - gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x8: /* FNMUL */ gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst); tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000); @@ -6996,6 +7024,10 @@ static void handle_fp_2src_half(DisasContext *s, int opcode, case 0x1: /* FDIV */ case 0x2: /* FADD */ case 0x3: /* FSUB */ + case 0x4: /* FMAX */ + case 0x5: /* FMIN */ + case 0x6: /* FMAXNM */ + case 0x7: /* FMINNM */ g_assert_not_reached(); } @@ -9221,24 +9253,12 @@ static void handle_3same_float(DisasContext *s, int size, int elements, gen_helper_vfp_muladdd(tcg_res, tcg_op1, tcg_op2, tcg_res, fpst); break; - case 0x18: /* FMAXNM */ - gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x1c: /* FCMEQ */ gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x1e: /* FMAX */ - gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x1f: /* FRECPS */ gen_helper_recpsf_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x38: /* FMINNM */ - gen_helper_vfp_minnumd(tcg_res, tcg_op1, 
tcg_op2, fpst); - break; - case 0x3e: /* FMIN */ - gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x3f: /* FRSQRTS */ gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -9259,9 +9279,13 @@ static void handle_3same_float(DisasContext *s, int size, int elements, gen_helper_neon_acgt_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; default: + case 0x18: /* FMAXNM */ case 0x1a: /* FADD */ case 0x1b: /* FMULX */ + case 0x1e: /* FMAX */ + case 0x38: /* FMINNM */ case 0x3a: /* FSUB */ + case 0x3e: /* FMIN */ case 0x5b: /* FMUL */ case 0x5f: /* FDIV */ g_assert_not_reached(); @@ -9290,21 +9314,9 @@ static void handle_3same_float(DisasContext *s, int size, int elements, case 0x1c: /* FCMEQ */ gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x1e: /* FMAX */ - gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x1f: /* FRECPS */ gen_helper_recpsf_f32(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x18: /* FMAXNM */ - gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x38: /* FMINNM */ - gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x3e: /* FMIN */ - gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x3f: /* FRSQRTS */ gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -9325,9 +9337,13 @@ static void handle_3same_float(DisasContext *s, int size, int elements, gen_helper_neon_acgt_f32(tcg_res, tcg_op1, tcg_op2, fpst); break; default: + case 0x18: /* FMAXNM */ case 0x1a: /* FADD */ case 0x1b: /* FMULX */ + case 0x1e: /* FMAX */ + case 0x38: /* FMINNM */ case 0x3a: /* FSUB */ + case 0x3e: /* FMIN */ case 0x5b: /* FMUL */ case 0x5f: /* FDIV */ g_assert_not_reached(); @@ -11251,11 +11267,7 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) case 0x7d: /* FACGT */ case 0x19: /* FMLA */ case 0x39: /* FMLS */ - case 0x18: /* FMAXNM */ case 0x1c: /* FCMEQ */ - case 0x1e: /* FMAX */ - case 0x38: /* FMINNM */ - 
case 0x3e: /* FMIN */ case 0x5c: /* FCMGE */ case 0x7a: /* FABD */ case 0x7c: /* FCMGT */ @@ -11286,9 +11298,13 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) return; default: + case 0x18: /* FMAXNM */ case 0x1a: /* FADD */ case 0x1b: /* FMULX */ + case 0x1e: /* FMAX */ + case 0x38: /* FMINNM */ case 0x3a: /* FSUB */ + case 0x3e: /* FMIN */ case 0x5b: /* FMUL */ case 0x5f: /* FDIV */ unallocated_encoding(s); @@ -11632,14 +11648,10 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) int pass; switch (fpopcode) { - case 0x0: /* FMAXNM */ case 0x1: /* FMLA */ case 0x4: /* FCMEQ */ - case 0x6: /* FMAX */ case 0x7: /* FRECPS */ - case 0x8: /* FMINNM */ case 0x9: /* FMLS */ - case 0xe: /* FMIN */ case 0xf: /* FRSQRTS */ case 0x14: /* FCMGE */ case 0x15: /* FACGE */ @@ -11656,9 +11668,13 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) pairwise = true; break; default: + case 0x0: /* FMAXNM */ case 0x2: /* FADD */ case 0x3: /* FMULX */ + case 0x6: /* FMAX */ + case 0x8: /* FMINNM */ case 0xa: /* FSUB */ + case 0xe: /* FMIN */ case 0x13: /* FMUL */ case 0x17: /* FDIV */ unallocated_encoding(s); @@ -11726,9 +11742,6 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) read_vec_element_i32(s, tcg_op2, rm, pass, MO_16); switch (fpopcode) { - case 0x0: /* FMAXNM */ - gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x1: /* FMLA */ read_vec_element_i32(s, tcg_res, rd, pass, MO_16); gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res, @@ -11737,15 +11750,9 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) case 0x4: /* FCMEQ */ gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x6: /* FMAX */ - gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x7: /* FRECPS */ gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x8: /* FMINNM */ - 
gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x9: /* FMLS */ /* As usual for ARM, separate negation for fused multiply-add */ tcg_gen_xori_i32(tcg_op1, tcg_op1, 0x8000); @@ -11753,9 +11760,6 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res, fpst); break; - case 0xe: /* FMIN */ - gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0xf: /* FRSQRTS */ gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -11776,9 +11780,13 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; default: + case 0x0: /* FMAXNM */ case 0x2: /* FADD */ case 0x3: /* FMULX */ + case 0x6: /* FMAX */ + case 0x8: /* FMINNM */ case 0xa: /* FSUB */ + case 0xe: /* FMIN */ case 0x13: /* FMUL */ case 0x17: /* FDIV */ g_assert_not_reached(); diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index 4106536371..99ef676071 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -1231,15 +1231,19 @@ DO_3OP(gvec_facgt_s, float32_acgt, float32) DO_3OP(gvec_fmax_h, float16_max, float16) DO_3OP(gvec_fmax_s, float32_max, float32) +DO_3OP(gvec_fmax_d, float64_max, float64) DO_3OP(gvec_fmin_h, float16_min, float16) DO_3OP(gvec_fmin_s, float32_min, float32) +DO_3OP(gvec_fmin_d, float64_min, float64) DO_3OP(gvec_fmaxnum_h, float16_maxnum, float16) DO_3OP(gvec_fmaxnum_s, float32_maxnum, float32) +DO_3OP(gvec_fmaxnum_d, float64_maxnum, float64) DO_3OP(gvec_fminnum_h, float16_minnum, float16) DO_3OP(gvec_fminnum_s, float32_minnum, float32) +DO_3OP(gvec_fminnum_d, float64_minnum, float64) DO_3OP(gvec_recps_nf_h, float16_recps_nf, float16) DO_3OP(gvec_recps_nf_s, float32_recps_nf, float32) From patchwork Fri May 24 23:20:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673835 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F3E0CC25B7E for ; Fri, 24 May 2024 23:30:00 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeEW-0006hZ-6v; Fri, 24 May 2024 19:21:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeEO-0006ZV-Ot for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:48 -0400 Received: from mail-pl1-x635.google.com ([2607:f8b0:4864:20::635]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeEK-0005oI-07 for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:48 -0400 Received: by mail-pl1-x635.google.com with SMTP id d9443c01a7336-1f32864bcc7so24957845ad.3 for ; Fri, 24 May 2024 16:21:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716592901; x=1717197701; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=H2xFCppnIKFVialQo7UwJzgtJAAgE8nOt0WWcTAkceY=; b=gypTIJB68epyoYNrlrB18NUvsUl+V/DfYWda8p+nzYMjg1Rlm1dXKZsUm6r5riPG4s 5UHyKzMrCBh2UNYdWwuGRjVL9zT4E74lBW/9LwzLp9UntgFG+GVQ09nt9f2GWPA9i6qr 32PIuJpvhAbxhypmtWJLIi7OhaCXRzRntAUb0hNi7Hq9Is8fhzO3ygqIXzArthRZc1Ah ZD07f00gS+aknSg21gbvqc/So5pQ0VVM61EGNiLAQ0jVw1wRbwvtMGgbY4l9ThDVjae1 O3Fl5vF9PPU0KQZle5RpT6kDbW8IGlOG/lqJml8/rEeFi/KPw3dumDSg6YhQphcazrKD qmXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; 
t=1716592901; x=1717197701; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=H2xFCppnIKFVialQo7UwJzgtJAAgE8nOt0WWcTAkceY=; b=BGtVFOIgMXoApLrfgB6Jg6KsS5UMejK1PuoQTftMWU/WmlAmUkxLRRr5xl3enrkZ0O KXgg6L2+1ni3CT08+D5QO12LX5WlMPDNcaiXSKhI+asGBV7ncimgzjiTO1iOiS0hRQUF U3qlHAuB1lbvKV7g8xtNI4ZQacEV+EI1JKlei+C/NK62PmcFHK8O5n9Z8hqfbS7qyP03 YNdG2ylK0qO4vWsNzUoW2qzLbvW7POl/pGlzvgm648fvBcfniMU8FIo49F5fFTFGKNeT Jckvj1XiaiKtW4vc7VOUKDljMTekYZgzAoDFhHuKVRuno/BKPF7/bC44K8cr2kHxaVn5 qRmw== X-Gm-Message-State: AOJu0Yz8YI3M9luWRJtEAvpgTVxhmvD/ySFaxg+Ae960kN6c0XvROw62 +8yTXLiF1kPQXNWQAe8YunCvtQNWSz2sU3eMoWB5jZVMLXqyRQbJ4+7PqGUm9q91ncr6VAnFV5i H X-Google-Smtp-Source: AGHT+IEUVQbtE+LbmuqaYkwk0mABelVPtxg6b32Rz8NKZ0x+edBJl4L8XrOYNrtT0uF0znPpRBAnmw== X-Received: by 2002:a17:902:eccd:b0:1e3:e39a:2e49 with SMTP id d9443c01a7336-1f4486da10bmr43862815ad.18.1716592901494; Fri, 24 May 2024 16:21:41 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f44c759ceesm19178305ad.10.2024.05.24.16.21.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:21:41 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 21/67] target/arm: Introduce vfp_load_reg16 Date: Fri, 24 May 2024 16:20:35 -0700 Message-Id: <20240524232121.284515-22-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::635; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x635.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Load and zero-extend float16 into a TCGv_i32 before all scalar operations. 
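As an illustration only (not part of the patch), the effect of the 16-bit load can be sketched in plain C. The names `host_big_endian` and `load_low16` are hypothetical stand-ins, with `host_big_endian()` playing the role of QEMU's HOST_BIG_ENDIAN. The sketch assumes the float16 value occupies the low half of a 32-bit register slot, as the `+ HOST_BIG_ENDIAN * 2` offset in vfp_load_reg16 suggests.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for QEMU's HOST_BIG_ENDIAN:
 * returns 1 on a big-endian host, 0 on a little-endian host.
 */
static int host_big_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0;
}

/*
 * Read the low 16 bits of a 32-bit slot with a single 16-bit load and
 * zero-extend the result. On a big-endian host the low half sits two
 * bytes into the slot, hence the endian-dependent byte offset.
 */
static uint32_t load_low16(const uint32_t *slot)
{
    uint16_t half;

    memcpy(&half, (const uint8_t *)slot + host_big_endian() * 2, sizeof(half));
    return half; /* implicit zero-extension to 32 bits */
}
```

On either byte order, `load_low16(&slot)` yields the same value as `slot & 0xffff`, which is presumably why callers that previously masked after a 32-bit load no longer need the separate mask.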
Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/translate-vfp.c | 39 +++++++++++++++++++--------------- 1 file changed, 22 insertions(+), 17 deletions(-) diff --git a/target/arm/tcg/translate-vfp.c b/target/arm/tcg/translate-vfp.c index b9af03b7c3..8e755fcde8 100644 --- a/target/arm/tcg/translate-vfp.c +++ b/target/arm/tcg/translate-vfp.c @@ -48,6 +48,12 @@ static inline void vfp_store_reg32(TCGv_i32 var, int reg) tcg_gen_st_i32(var, tcg_env, vfp_reg_offset(false, reg)); } +static inline void vfp_load_reg16(TCGv_i32 var, int reg) +{ + tcg_gen_ld16u_i32(var, tcg_env, + vfp_reg_offset(false, reg) + HOST_BIG_ENDIAN * 2); +} + /* * The imm8 encodes the sign bit, enough bits to represent an exponent in * the range 01....1xx to 10....0xx, and the most significant 4 bits of @@ -902,8 +908,7 @@ static bool trans_VMOV_half(DisasContext *s, arg_VMOV_single *a) if (a->l) { /* VFP to general purpose register */ tmp = tcg_temp_new_i32(); - vfp_load_reg32(tmp, a->vn); - tcg_gen_andi_i32(tmp, tmp, 0xffff); + vfp_load_reg16(tmp, a->vn); store_reg(s, a->rt, tmp); } else { /* general purpose register to VFP */ @@ -1453,11 +1458,11 @@ static bool do_vfp_3op_hp(DisasContext *s, VFPGen3OpSPFn *fn, fd = tcg_temp_new_i32(); fpst = fpstatus_ptr(FPST_FPCR_F16); - vfp_load_reg32(f0, vn); - vfp_load_reg32(f1, vm); + vfp_load_reg16(f0, vn); + vfp_load_reg16(f1, vm); if (reads_vd) { - vfp_load_reg32(fd, vd); + vfp_load_reg16(fd, vd); } fn(fd, f0, f1, fpst); vfp_store_reg32(fd, vd); @@ -1633,7 +1638,7 @@ static bool do_vfp_2op_hp(DisasContext *s, VFPGen2OpSPFn *fn, int vd, int vm) } f0 = tcg_temp_new_i32(); - vfp_load_reg32(f0, vm); + vfp_load_reg16(f0, vm); fn(f0, f0); vfp_store_reg32(f0, vd); @@ -2106,13 +2111,13 @@ static bool do_vfm_hp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d) vm = tcg_temp_new_i32(); vd = tcg_temp_new_i32(); - vfp_load_reg32(vn, a->vn); - vfp_load_reg32(vm, a->vm); + vfp_load_reg16(vn, a->vn); + vfp_load_reg16(vm, a->vm); if 
(neg_n) { /* VFNMS, VFMS */ gen_helper_vfp_negh(vn, vn); } - vfp_load_reg32(vd, a->vd); + vfp_load_reg16(vd, a->vd); if (neg_d) { /* VFNMA, VFNMS */ gen_helper_vfp_negh(vd, vd); @@ -2456,11 +2461,11 @@ static bool trans_VCMP_hp(DisasContext *s, arg_VCMP_sp *a) vd = tcg_temp_new_i32(); vm = tcg_temp_new_i32(); - vfp_load_reg32(vd, a->vd); + vfp_load_reg16(vd, a->vd); if (a->z) { tcg_gen_movi_i32(vm, 0); } else { - vfp_load_reg32(vm, a->vm); + vfp_load_reg16(vm, a->vm); } if (a->e) { @@ -2700,7 +2705,7 @@ static bool trans_VRINTR_hp(DisasContext *s, arg_VRINTR_sp *a) } tmp = tcg_temp_new_i32(); - vfp_load_reg32(tmp, a->vm); + vfp_load_reg16(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR_F16); gen_helper_rinth(tmp, tmp, fpst); vfp_store_reg32(tmp, a->vd); @@ -2773,7 +2778,7 @@ static bool trans_VRINTZ_hp(DisasContext *s, arg_VRINTZ_sp *a) } tmp = tcg_temp_new_i32(); - vfp_load_reg32(tmp, a->vm); + vfp_load_reg16(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR_F16); tcg_rmode = gen_set_rmode(FPROUNDING_ZERO, fpst); gen_helper_rinth(tmp, tmp, fpst); @@ -2853,7 +2858,7 @@ static bool trans_VRINTX_hp(DisasContext *s, arg_VRINTX_sp *a) } tmp = tcg_temp_new_i32(); - vfp_load_reg32(tmp, a->vm); + vfp_load_reg16(tmp, a->vm); fpst = fpstatus_ptr(FPST_FPCR_F16); gen_helper_rinth_exact(tmp, tmp, fpst); vfp_store_reg32(tmp, a->vd); @@ -3270,7 +3275,7 @@ static bool trans_VCVT_hp_int(DisasContext *s, arg_VCVT_sp_int *a) fpst = fpstatus_ptr(FPST_FPCR_F16); vm = tcg_temp_new_i32(); - vfp_load_reg32(vm, a->vm); + vfp_load_reg16(vm, a->vm); if (a->s) { if (a->rz) { @@ -3383,8 +3388,8 @@ static bool trans_VINS(DisasContext *s, arg_VINS *a) /* Insert low half of Vm into high half of Vd */ rm = tcg_temp_new_i32(); rd = tcg_temp_new_i32(); - vfp_load_reg32(rm, a->vm); - vfp_load_reg32(rd, a->vd); + vfp_load_reg16(rm, a->vm); + vfp_load_reg16(rd, a->vd); tcg_gen_deposit_i32(rd, rd, rm, 16, 16); vfp_store_reg32(rd, a->vd); return true; From patchwork Fri May 24 23:20:36 2024 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673806 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7B981C25B74 for ; Fri, 24 May 2024 23:26:10 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeEZ-0006lt-9p; Fri, 24 May 2024 19:21:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeEW-0006i8-TU for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:57 -0400 Received: from mail-pl1-x631.google.com ([2607:f8b0:4864:20::631]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeEK-0005oZ-2Z for qemu-devel@nongnu.org; Fri, 24 May 2024 19:21:56 -0400 Received: by mail-pl1-x631.google.com with SMTP id d9443c01a7336-1f44b4285dbso10943085ad.0 for ; Fri, 24 May 2024 16:21:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716592902; x=1717197702; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=QVcU1WmJb/ZSvJaWJpId13nCAvYGF24jdDkDh7RokdY=; b=kuO8Zoyjsc/4SqjG6WiQlfrur58NnDnCMjQZRCdWgaiizNKlqBTAK/jNyty3xqxMVd zTXN5FZXpsVu21bYLXiE7MNfYPbJNETi2262vuyePJm87RjySwiYODTDtCSsVm29lbtc xgZm0G3Y19GEIi2sbs5TqbN8CCiQ7ujV8l0Polc/6o77yiuyjPuJBEV/k9/1bz6O1Kwq 3GZfGzzSO0u+QrSMVWan68joIg1qJubQ7xo/th53/kQdXeEMB46FIMz7v37WTVdr/i9y t6MEHT1WN/eB3r2AVKkPWWXBXXjn/I2mJETJfb16xOe8IUTlBg27HnW6p8YVHZH80ddk 0qEw== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716592902; x=1717197702; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QVcU1WmJb/ZSvJaWJpId13nCAvYGF24jdDkDh7RokdY=; b=YsLv4ZYUWR/4uhtGbQIIMn2Tryz3mZpA1kjqr1bW0Eadb52K/QkEMzRDoYS/N9vEZR DyazxM7sa3xe4lNajMbQGo89n5WAcTXTPGRlW6IdEJ4sqTA1mVke6SnnFxqGrvRsgjDf XNt4cAVgkV9+weH9WHADfEYNE/g6kQ5mE8xUo2ehSHy0iXOHKNzUq3Zye3V/Mq6x5obM 8SFDOJtdEasgsEU484U3WiBFhJrOX53VEkGjMqzf1MTrb41aBoMQ40U/zhGi0NAVs1p5 2JlMlba2v3oi+XgdaddK2MsvfQyica2EXGs8BATAuvdCqqOMYQR47LlmLCjRemPdbRcH 5J0w== X-Gm-Message-State: AOJu0YzwUH/xisk7LMOVO4COPoQMROA6IccmQR0t2SDpss6cFSlIUcGq 4QWgTJFnEVloBCXZEbtD4BdO/IzeqNrt256BR/r9hV9qkk59Mm2nNoE8lFGNvErs+DaHMKdTkLY e X-Google-Smtp-Source: AGHT+IEljQnHUfJ3cJKfwd8czCz4R+4syl+noNncZXw/DzK7GGV9IzDOGpT6W1pGlbvBR3tK7L/EhQ== X-Received: by 2002:a17:903:249:b0:1f3:c758:bdc5 with SMTP id d9443c01a7336-1f449026957mr36947785ad.54.1716592902374; Fri, 24 May 2024 16:21:42 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f44c759ceesm19178305ad.10.2024.05.24.16.21.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:21:42 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 22/67] target/arm: Expand vfp neg and abs inline Date: Fri, 24 May 2024 16:20:36 -0700 Message-Id: <20240524232121.284515-23-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::631; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x631.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 6 ---- target/arm/tcg/translate.h | 30 +++++++++++++++++++ target/arm/tcg/translate-a64.c | 44 +++++++++++++-------------- target/arm/tcg/translate-vfp.c | 54 +++++++++++++++++----------------- target/arm/vfp_helper.c | 30 ------------------- 5 files changed, 79 insertions(+), 85 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 7ee15b9651..0fd01c9c52 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -132,12 +132,6 @@ DEF_HELPER_3(vfp_maxnumd, f64, f64, f64, ptr) 
DEF_HELPER_3(vfp_minnumh, f16, f16, f16, ptr) DEF_HELPER_3(vfp_minnums, f32, f32, f32, ptr) DEF_HELPER_3(vfp_minnumd, f64, f64, f64, ptr) -DEF_HELPER_1(vfp_negh, f16, f16) -DEF_HELPER_1(vfp_negs, f32, f32) -DEF_HELPER_1(vfp_negd, f64, f64) -DEF_HELPER_1(vfp_absh, f16, f16) -DEF_HELPER_1(vfp_abss, f32, f32) -DEF_HELPER_1(vfp_absd, f64, f64) DEF_HELPER_2(vfp_sqrth, f16, f16, env) DEF_HELPER_2(vfp_sqrts, f32, f32, env) DEF_HELPER_2(vfp_sqrtd, f64, f64, env) diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h index ecfa242eef..b05a9eb668 100644 --- a/target/arm/tcg/translate.h +++ b/target/arm/tcg/translate.h @@ -406,6 +406,36 @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex) */ uint64_t vfp_expand_imm(int size, uint8_t imm8); +static inline void gen_vfp_absh(TCGv_i32 d, TCGv_i32 s) +{ + tcg_gen_andi_i32(d, s, INT16_MAX); +} + +static inline void gen_vfp_abss(TCGv_i32 d, TCGv_i32 s) +{ + tcg_gen_andi_i32(d, s, INT32_MAX); +} + +static inline void gen_vfp_absd(TCGv_i64 d, TCGv_i64 s) +{ + tcg_gen_andi_i64(d, s, INT64_MAX); +} + +static inline void gen_vfp_negh(TCGv_i32 d, TCGv_i32 s) +{ + tcg_gen_xori_i32(d, s, 1u << 15); +} + +static inline void gen_vfp_negs(TCGv_i32 d, TCGv_i32 s) +{ + tcg_gen_xori_i32(d, s, 1u << 31); +} + +static inline void gen_vfp_negd(TCGv_i64 d, TCGv_i64 s) +{ + tcg_gen_xori_i64(d, s, 1ull << 63); +} + /* Vector operations shared between ARM and AArch64. 
*/ void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 6f8207d842..878f83298f 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -6591,10 +6591,10 @@ static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn) tcg_gen_mov_i32(tcg_res, tcg_op); break; case 0x1: /* FABS */ - tcg_gen_andi_i32(tcg_res, tcg_op, 0x7fff); + gen_vfp_absh(tcg_res, tcg_op); break; case 0x2: /* FNEG */ - tcg_gen_xori_i32(tcg_res, tcg_op, 0x8000); + gen_vfp_negh(tcg_res, tcg_op); break; case 0x3: /* FSQRT */ fpst = fpstatus_ptr(FPST_FPCR_F16); @@ -6645,10 +6645,10 @@ static void handle_fp_1src_single(DisasContext *s, int opcode, int rd, int rn) tcg_gen_mov_i32(tcg_res, tcg_op); goto done; case 0x1: /* FABS */ - gen_helper_vfp_abss(tcg_res, tcg_op); + gen_vfp_abss(tcg_res, tcg_op); goto done; case 0x2: /* FNEG */ - gen_helper_vfp_negs(tcg_res, tcg_op); + gen_vfp_negs(tcg_res, tcg_op); goto done; case 0x3: /* FSQRT */ gen_helper_vfp_sqrts(tcg_res, tcg_op, tcg_env); @@ -6720,10 +6720,10 @@ static void handle_fp_1src_double(DisasContext *s, int opcode, int rd, int rn) switch (opcode) { case 0x1: /* FABS */ - gen_helper_vfp_absd(tcg_res, tcg_op); + gen_vfp_absd(tcg_res, tcg_op); goto done; case 0x2: /* FNEG */ - gen_helper_vfp_negd(tcg_res, tcg_op); + gen_vfp_negd(tcg_res, tcg_op); goto done; case 0x3: /* FSQRT */ gen_helper_vfp_sqrtd(tcg_res, tcg_op, tcg_env); @@ -6949,7 +6949,7 @@ static void handle_fp_2src_single(DisasContext *s, int opcode, switch (opcode) { case 0x8: /* FNMUL */ gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst); - gen_helper_vfp_negs(tcg_res, tcg_res); + gen_vfp_negs(tcg_res, tcg_res); break; default: case 0x0: /* FMUL */ @@ -6983,7 +6983,7 @@ static void handle_fp_2src_double(DisasContext *s, int opcode, switch (opcode) { case 0x8: /* FNMUL */ gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, 
fpst); - gen_helper_vfp_negd(tcg_res, tcg_res); + gen_vfp_negd(tcg_res, tcg_res); break; default: case 0x0: /* FMUL */ @@ -7017,7 +7017,7 @@ static void handle_fp_2src_half(DisasContext *s, int opcode, switch (opcode) { case 0x8: /* FNMUL */ gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst); - tcg_gen_xori_i32(tcg_res, tcg_res, 0x8000); + gen_vfp_negh(tcg_res, tcg_res); break; default: case 0x0: /* FMUL */ @@ -7102,11 +7102,11 @@ static void handle_fp_3src_single(DisasContext *s, bool o0, bool o1, * flipped if it is a negated-input. */ if (o1 == true) { - gen_helper_vfp_negs(tcg_op3, tcg_op3); + gen_vfp_negs(tcg_op3, tcg_op3); } if (o0 != o1) { - gen_helper_vfp_negs(tcg_op1, tcg_op1); + gen_vfp_negs(tcg_op1, tcg_op1); } gen_helper_vfp_muladds(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst); @@ -7134,11 +7134,11 @@ static void handle_fp_3src_double(DisasContext *s, bool o0, bool o1, * flipped if it is a negated-input. */ if (o1 == true) { - gen_helper_vfp_negd(tcg_op3, tcg_op3); + gen_vfp_negd(tcg_op3, tcg_op3); } if (o0 != o1) { - gen_helper_vfp_negd(tcg_op1, tcg_op1); + gen_vfp_negd(tcg_op1, tcg_op1); } gen_helper_vfp_muladdd(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst); @@ -9246,7 +9246,7 @@ static void handle_3same_float(DisasContext *s, int size, int elements, switch (fpopcode) { case 0x39: /* FMLS */ /* As usual for ARM, separate negation for fused multiply-add */ - gen_helper_vfp_negd(tcg_op1, tcg_op1); + gen_vfp_negd(tcg_op1, tcg_op1); /* fall through */ case 0x19: /* FMLA */ read_vec_element(s, tcg_res, rd, pass, MO_64); @@ -9270,7 +9270,7 @@ static void handle_3same_float(DisasContext *s, int size, int elements, break; case 0x7a: /* FABD */ gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst); - gen_helper_vfp_absd(tcg_res, tcg_res); + gen_vfp_absd(tcg_res, tcg_res); break; case 0x7c: /* FCMGT */ gen_helper_neon_cgt_f64(tcg_res, tcg_op1, tcg_op2, fpst); @@ -9304,7 +9304,7 @@ static void handle_3same_float(DisasContext *s, int size, int elements, switch 
(fpopcode) { case 0x39: /* FMLS */ /* As usual for ARM, separate negation for fused multiply-add */ - gen_helper_vfp_negs(tcg_op1, tcg_op1); + gen_vfp_negs(tcg_op1, tcg_op1); /* fall through */ case 0x19: /* FMLA */ read_vec_element_i32(s, tcg_res, rd, pass, MO_32); @@ -9328,7 +9328,7 @@ static void handle_3same_float(DisasContext *s, int size, int elements, break; case 0x7a: /* FABD */ gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst); - gen_helper_vfp_abss(tcg_res, tcg_res); + gen_vfp_abss(tcg_res, tcg_res); break; case 0x7c: /* FCMGT */ gen_helper_neon_cgt_f32(tcg_res, tcg_op1, tcg_op2, fpst); @@ -9741,10 +9741,10 @@ static void handle_2misc_64(DisasContext *s, int opcode, bool u, } break; case 0x2f: /* FABS */ - gen_helper_vfp_absd(tcg_rd, tcg_rn); + gen_vfp_absd(tcg_rd, tcg_rn); break; case 0x6f: /* FNEG */ - gen_helper_vfp_negd(tcg_rd, tcg_rn); + gen_vfp_negd(tcg_rd, tcg_rn); break; case 0x7f: /* FSQRT */ gen_helper_vfp_sqrtd(tcg_rd, tcg_rn, tcg_env); @@ -12567,10 +12567,10 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) } break; case 0x2f: /* FABS */ - gen_helper_vfp_abss(tcg_res, tcg_op); + gen_vfp_abss(tcg_res, tcg_op); break; case 0x6f: /* FNEG */ - gen_helper_vfp_negs(tcg_res, tcg_op); + gen_vfp_negs(tcg_res, tcg_op); break; case 0x7f: /* FSQRT */ gen_helper_vfp_sqrts(tcg_res, tcg_op, tcg_env); @@ -13291,7 +13291,7 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) switch (16 * u + opcode) { case 0x05: /* FMLS */ /* As usual for ARM, separate negation for fused multiply-add */ - gen_helper_vfp_negd(tcg_op, tcg_op); + gen_vfp_negd(tcg_op, tcg_op); /* fall through */ case 0x01: /* FMLA */ read_vec_element(s, tcg_res, rd, pass, MO_64); diff --git a/target/arm/tcg/translate-vfp.c b/target/arm/tcg/translate-vfp.c index 8e755fcde8..39ec971ff7 100644 --- a/target/arm/tcg/translate-vfp.c +++ b/target/arm/tcg/translate-vfp.c @@ -1768,7 +1768,7 @@ static void gen_VMLS_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst) 
TCGv_i32 tmp = tcg_temp_new_i32(); gen_helper_vfp_mulh(tmp, vn, vm, fpst); - gen_helper_vfp_negh(tmp, tmp); + gen_vfp_negh(tmp, tmp); gen_helper_vfp_addh(vd, vd, tmp, fpst); } @@ -1786,7 +1786,7 @@ static void gen_VMLS_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst) TCGv_i32 tmp = tcg_temp_new_i32(); gen_helper_vfp_muls(tmp, vn, vm, fpst); - gen_helper_vfp_negs(tmp, tmp); + gen_vfp_negs(tmp, tmp); gen_helper_vfp_adds(vd, vd, tmp, fpst); } @@ -1804,7 +1804,7 @@ static void gen_VMLS_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst) TCGv_i64 tmp = tcg_temp_new_i64(); gen_helper_vfp_muld(tmp, vn, vm, fpst); - gen_helper_vfp_negd(tmp, tmp); + gen_vfp_negd(tmp, tmp); gen_helper_vfp_addd(vd, vd, tmp, fpst); } @@ -1824,7 +1824,7 @@ static void gen_VNMLS_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst) TCGv_i32 tmp = tcg_temp_new_i32(); gen_helper_vfp_mulh(tmp, vn, vm, fpst); - gen_helper_vfp_negh(vd, vd); + gen_vfp_negh(vd, vd); gen_helper_vfp_addh(vd, vd, tmp, fpst); } @@ -1844,7 +1844,7 @@ static void gen_VNMLS_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst) TCGv_i32 tmp = tcg_temp_new_i32(); gen_helper_vfp_muls(tmp, vn, vm, fpst); - gen_helper_vfp_negs(vd, vd); + gen_vfp_negs(vd, vd); gen_helper_vfp_adds(vd, vd, tmp, fpst); } @@ -1864,7 +1864,7 @@ static void gen_VNMLS_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst) TCGv_i64 tmp = tcg_temp_new_i64(); gen_helper_vfp_muld(tmp, vn, vm, fpst); - gen_helper_vfp_negd(vd, vd); + gen_vfp_negd(vd, vd); gen_helper_vfp_addd(vd, vd, tmp, fpst); } @@ -1879,8 +1879,8 @@ static void gen_VNMLA_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst) TCGv_i32 tmp = tcg_temp_new_i32(); gen_helper_vfp_mulh(tmp, vn, vm, fpst); - gen_helper_vfp_negh(tmp, tmp); - gen_helper_vfp_negh(vd, vd); + gen_vfp_negh(tmp, tmp); + gen_vfp_negh(vd, vd); gen_helper_vfp_addh(vd, vd, tmp, fpst); } @@ -1895,8 +1895,8 @@ static void gen_VNMLA_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst) TCGv_i32 tmp = 
tcg_temp_new_i32(); gen_helper_vfp_muls(tmp, vn, vm, fpst); - gen_helper_vfp_negs(tmp, tmp); - gen_helper_vfp_negs(vd, vd); + gen_vfp_negs(tmp, tmp); + gen_vfp_negs(vd, vd); gen_helper_vfp_adds(vd, vd, tmp, fpst); } @@ -1911,8 +1911,8 @@ static void gen_VNMLA_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst) TCGv_i64 tmp = tcg_temp_new_i64(); gen_helper_vfp_muld(tmp, vn, vm, fpst); - gen_helper_vfp_negd(tmp, tmp); - gen_helper_vfp_negd(vd, vd); + gen_vfp_negd(tmp, tmp); + gen_vfp_negd(vd, vd); gen_helper_vfp_addd(vd, vd, tmp, fpst); } @@ -1940,7 +1940,7 @@ static void gen_VNMUL_hp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst) { /* VNMUL: -(fn * fm) */ gen_helper_vfp_mulh(vd, vn, vm, fpst); - gen_helper_vfp_negh(vd, vd); + gen_vfp_negh(vd, vd); } static bool trans_VNMUL_hp(DisasContext *s, arg_VNMUL_sp *a) @@ -1952,7 +1952,7 @@ static void gen_VNMUL_sp(TCGv_i32 vd, TCGv_i32 vn, TCGv_i32 vm, TCGv_ptr fpst) { /* VNMUL: -(fn * fm) */ gen_helper_vfp_muls(vd, vn, vm, fpst); - gen_helper_vfp_negs(vd, vd); + gen_vfp_negs(vd, vd); } static bool trans_VNMUL_sp(DisasContext *s, arg_VNMUL_sp *a) @@ -1964,7 +1964,7 @@ static void gen_VNMUL_dp(TCGv_i64 vd, TCGv_i64 vn, TCGv_i64 vm, TCGv_ptr fpst) { /* VNMUL: -(fn * fm) */ gen_helper_vfp_muld(vd, vn, vm, fpst); - gen_helper_vfp_negd(vd, vd); + gen_vfp_negd(vd, vd); } static bool trans_VNMUL_dp(DisasContext *s, arg_VNMUL_dp *a) @@ -2115,12 +2115,12 @@ static bool do_vfm_hp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d) vfp_load_reg16(vm, a->vm); if (neg_n) { /* VFNMS, VFMS */ - gen_helper_vfp_negh(vn, vn); + gen_vfp_negh(vn, vn); } vfp_load_reg16(vd, a->vd); if (neg_d) { /* VFNMA, VFNMS */ - gen_helper_vfp_negh(vd, vd); + gen_vfp_negh(vd, vd); } fpst = fpstatus_ptr(FPST_FPCR_F16); gen_helper_vfp_muladdh(vd, vn, vm, vd, fpst); @@ -2174,12 +2174,12 @@ static bool do_vfm_sp(DisasContext *s, arg_VFMA_sp *a, bool neg_n, bool neg_d) vfp_load_reg32(vm, a->vm); if (neg_n) { /* VFNMS, VFMS */ - 
gen_helper_vfp_negs(vn, vn); + gen_vfp_negs(vn, vn); } vfp_load_reg32(vd, a->vd); if (neg_d) { /* VFNMA, VFNMS */ - gen_helper_vfp_negs(vd, vd); + gen_vfp_negs(vd, vd); } fpst = fpstatus_ptr(FPST_FPCR); gen_helper_vfp_muladds(vd, vn, vm, vd, fpst); @@ -2239,12 +2239,12 @@ static bool do_vfm_dp(DisasContext *s, arg_VFMA_dp *a, bool neg_n, bool neg_d) vfp_load_reg64(vm, a->vm); if (neg_n) { /* VFNMS, VFMS */ - gen_helper_vfp_negd(vn, vn); + gen_vfp_negd(vn, vn); } vfp_load_reg64(vd, a->vd); if (neg_d) { /* VFNMA, VFNMS */ - gen_helper_vfp_negd(vd, vd); + gen_vfp_negd(vd, vd); } fpst = fpstatus_ptr(FPST_FPCR); gen_helper_vfp_muladdd(vd, vn, vm, vd, fpst); @@ -2414,13 +2414,13 @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a) DO_VFP_VMOV(VMOV_reg, sp, tcg_gen_mov_i32) DO_VFP_VMOV(VMOV_reg, dp, tcg_gen_mov_i64) -DO_VFP_2OP(VABS, hp, gen_helper_vfp_absh, aa32_fp16_arith) -DO_VFP_2OP(VABS, sp, gen_helper_vfp_abss, aa32_fpsp_v2) -DO_VFP_2OP(VABS, dp, gen_helper_vfp_absd, aa32_fpdp_v2) +DO_VFP_2OP(VABS, hp, gen_vfp_absh, aa32_fp16_arith) +DO_VFP_2OP(VABS, sp, gen_vfp_abss, aa32_fpsp_v2) +DO_VFP_2OP(VABS, dp, gen_vfp_absd, aa32_fpdp_v2) -DO_VFP_2OP(VNEG, hp, gen_helper_vfp_negh, aa32_fp16_arith) -DO_VFP_2OP(VNEG, sp, gen_helper_vfp_negs, aa32_fpsp_v2) -DO_VFP_2OP(VNEG, dp, gen_helper_vfp_negd, aa32_fpdp_v2) +DO_VFP_2OP(VNEG, hp, gen_vfp_negh, aa32_fp16_arith) +DO_VFP_2OP(VNEG, sp, gen_vfp_negs, aa32_fpsp_v2) +DO_VFP_2OP(VNEG, dp, gen_vfp_negd, aa32_fpdp_v2) static void gen_VSQRT_hp(TCGv_i32 vd, TCGv_i32 vm) { diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c index 3e5e37abbe..ce26b8a71a 100644 --- a/target/arm/vfp_helper.c +++ b/target/arm/vfp_helper.c @@ -281,36 +281,6 @@ VFP_BINOP(minnum) VFP_BINOP(maxnum) #undef VFP_BINOP -dh_ctype_f16 VFP_HELPER(neg, h)(dh_ctype_f16 a) -{ - return float16_chs(a); -} - -float32 VFP_HELPER(neg, s)(float32 a) -{ - return float32_chs(a); -} - -float64 VFP_HELPER(neg, d)(float64 a) -{ - return float64_chs(a); 
-} - -dh_ctype_f16 VFP_HELPER(abs, h)(dh_ctype_f16 a) -{ - return float16_abs(a); -} - -float32 VFP_HELPER(abs, s)(float32 a) -{ - return float32_abs(a); -} - -float64 VFP_HELPER(abs, d)(float64 a) -{ - return float64_abs(a); -} - dh_ctype_f16 VFP_HELPER(sqrt, h)(dh_ctype_f16 a, CPUARMState *env) { return float16_sqrt(a, &env->vfp.fp_status_f16); From patchwork Fri May 24 23:20:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673790
From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 23/67] target/arm: Convert FNMUL to decodetree Date: Fri, 24 May 2024 16:20:37 -0700 Message-Id: <20240524232121.284515-24-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0
This is the last instruction within disas_fp_2src, so remove that and its subroutines. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/tcg/a64.decode | 1 + target/arm/tcg/translate-a64.c | 177 +++++---------------------------- 2 files changed, 27 insertions(+), 151 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index e2678d919e..cde4b86303 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -703,6 +703,7 @@ FADD_s 0001 1110 ..1 ..... 0010 10 ..... ..... @rrr_hsd FSUB_s 0001 1110 ..1 ..... 0011 10 ..... ..... 
@rrr_hsd FDIV_s 0001 1110 ..1 ..... 0001 10 ..... ..... @rrr_hsd FMUL_s 0001 1110 ..1 ..... 0000 10 ..... ..... @rrr_hsd +FNMUL_s 0001 1110 ..1 ..... 1000 10 ..... ..... @rrr_hsd FMAX_s 0001 1110 ..1 ..... 0100 10 ..... ..... @rrr_hsd FMIN_s 0001 1110 ..1 ..... 0101 10 ..... ..... @rrr_hsd diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 878f83298f..5ba30ba7c8 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -4950,6 +4950,31 @@ static const FPScalar f_scalar_fmulx = { }; TRANS(FMULX_s, do_fp3_scalar, a, &f_scalar_fmulx) +static void gen_fnmul_h(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s) +{ + gen_helper_vfp_mulh(d, n, m, s); + gen_vfp_negh(d, d); +} + +static void gen_fnmul_s(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s) +{ + gen_helper_vfp_muls(d, n, m, s); + gen_vfp_negs(d, d); +} + +static void gen_fnmul_d(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_ptr s) +{ + gen_helper_vfp_muld(d, n, m, s); + gen_vfp_negd(d, d); +} + +static const FPScalar f_scalar_fnmul = { + gen_fnmul_h, + gen_fnmul_s, + gen_fnmul_d, +}; +TRANS(FNMUL_s, do_fp3_scalar, a, &f_scalar_fnmul) + static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, gen_helper_gvec_3_ptr * const fns[3]) { @@ -6932,156 +6957,6 @@ static void disas_fp_1src(DisasContext *s, uint32_t insn) } } -/* Floating-point data-processing (2 source) - single precision */ -static void handle_fp_2src_single(DisasContext *s, int opcode, - int rd, int rn, int rm) -{ - TCGv_i32 tcg_op1; - TCGv_i32 tcg_op2; - TCGv_i32 tcg_res; - TCGv_ptr fpst; - - tcg_res = tcg_temp_new_i32(); - fpst = fpstatus_ptr(FPST_FPCR); - tcg_op1 = read_fp_sreg(s, rn); - tcg_op2 = read_fp_sreg(s, rm); - - switch (opcode) { - case 0x8: /* FNMUL */ - gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst); - gen_vfp_negs(tcg_res, tcg_res); - break; - default: - case 0x0: /* FMUL */ - case 0x1: /* FDIV */ - case 0x2: /* FADD */ - case 0x3: /* FSUB */ - case 0x4: /* FMAX */ - case 0x5: /* FMIN 
*/ - case 0x6: /* FMAXNM */ - case 0x7: /* FMINNM */ - g_assert_not_reached(); - } - - write_fp_sreg(s, rd, tcg_res); -} - -/* Floating-point data-processing (2 source) - double precision */ -static void handle_fp_2src_double(DisasContext *s, int opcode, - int rd, int rn, int rm) -{ - TCGv_i64 tcg_op1; - TCGv_i64 tcg_op2; - TCGv_i64 tcg_res; - TCGv_ptr fpst; - - tcg_res = tcg_temp_new_i64(); - fpst = fpstatus_ptr(FPST_FPCR); - tcg_op1 = read_fp_dreg(s, rn); - tcg_op2 = read_fp_dreg(s, rm); - - switch (opcode) { - case 0x8: /* FNMUL */ - gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst); - gen_vfp_negd(tcg_res, tcg_res); - break; - default: - case 0x0: /* FMUL */ - case 0x1: /* FDIV */ - case 0x2: /* FADD */ - case 0x3: /* FSUB */ - case 0x4: /* FMAX */ - case 0x5: /* FMIN */ - case 0x6: /* FMAXNM */ - case 0x7: /* FMINNM */ - g_assert_not_reached(); - } - - write_fp_dreg(s, rd, tcg_res); -} - -/* Floating-point data-processing (2 source) - half precision */ -static void handle_fp_2src_half(DisasContext *s, int opcode, - int rd, int rn, int rm) -{ - TCGv_i32 tcg_op1; - TCGv_i32 tcg_op2; - TCGv_i32 tcg_res; - TCGv_ptr fpst; - - tcg_res = tcg_temp_new_i32(); - fpst = fpstatus_ptr(FPST_FPCR_F16); - tcg_op1 = read_fp_hreg(s, rn); - tcg_op2 = read_fp_hreg(s, rm); - - switch (opcode) { - case 0x8: /* FNMUL */ - gen_helper_advsimd_mulh(tcg_res, tcg_op1, tcg_op2, fpst); - gen_vfp_negh(tcg_res, tcg_res); - break; - default: - case 0x0: /* FMUL */ - case 0x1: /* FDIV */ - case 0x2: /* FADD */ - case 0x3: /* FSUB */ - case 0x4: /* FMAX */ - case 0x5: /* FMIN */ - case 0x6: /* FMAXNM */ - case 0x7: /* FMINNM */ - g_assert_not_reached(); - } - - write_fp_sreg(s, rd, tcg_res); -} - -/* Floating point data-processing (2 source) - * 31 30 29 28 24 23 22 21 20 16 15 12 11 10 9 5 4 0 - * +---+---+---+-----------+------+---+------+--------+-----+------+------+ - * | M | 0 | S | 1 1 1 1 0 | type | 1 | Rm | opcode | 1 0 | Rn | Rd | - * 
+---+---+---+-----------+------+---+------+--------+-----+------+------+ - */ -static void disas_fp_2src(DisasContext *s, uint32_t insn) -{ - int mos = extract32(insn, 29, 3); - int type = extract32(insn, 22, 2); - int rd = extract32(insn, 0, 5); - int rn = extract32(insn, 5, 5); - int rm = extract32(insn, 16, 5); - int opcode = extract32(insn, 12, 4); - - if (opcode > 8 || mos) { - unallocated_encoding(s); - return; - } - - switch (type) { - case 0: - if (!fp_access_check(s)) { - return; - } - handle_fp_2src_single(s, opcode, rd, rn, rm); - break; - case 1: - if (!fp_access_check(s)) { - return; - } - handle_fp_2src_double(s, opcode, rd, rn, rm); - break; - case 3: - if (!dc_isar_feature(aa64_fp16, s)) { - unallocated_encoding(s); - return; - } - if (!fp_access_check(s)) { - return; - } - handle_fp_2src_half(s, opcode, rd, rn, rm); - break; - default: - unallocated_encoding(s); - } -} - /* Floating-point data-processing (3 source) - single precision */ static void handle_fp_3src_single(DisasContext *s, bool o0, bool o1, int rd, int rn, int rm, int ra) @@ -7685,7 +7560,7 @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn) break; case 2: /* Floating point data-processing (2 source) */ - disas_fp_2src(s, insn); + unallocated_encoding(s); /* in decodetree */ break; case 3: /* Floating point conditional select */ From patchwork Fri May 24 23:20:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673796
From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 24/67] target/arm: Convert FMLA, FMLS to decodetree Date: Fri, 24 May 2024 16:20:38 -0700 Message-Id: <20240524232121.284515-25-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- 
target/arm/helper.h | 2 + target/arm/tcg/a64.decode | 22 +++ target/arm/tcg/translate-a64.c | 241 +++++++++++++++++---------------- target/arm/tcg/vec_helper.c | 14 ++ 4 files changed, 163 insertions(+), 116 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 0fd01c9c52..e021c18517 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -770,9 +770,11 @@ DEF_HELPER_FLAGS_5(gvec_fmls_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_vfma_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_vfma_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_vfma_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_vfms_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_vfms_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_vfms_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index cde4b86303..11527bb5e5 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -742,12 +742,26 @@ FMINNM_v 0.00 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd FMULX_v 0.00 1110 010 ..... 00011 1 ..... ..... @qrrr_h FMULX_v 0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd +FMLA_v 0.00 1110 010 ..... 00001 1 ..... ..... @qrrr_h +FMLA_v 0.00 1110 0.1 ..... 11001 1 ..... ..... @qrrr_sd + +FMLS_v 0.00 1110 110 ..... 00001 1 ..... ..... @qrrr_h +FMLS_v 0.00 1110 1.1 ..... 11001 1 ..... ..... @qrrr_sd + ### Advanced SIMD scalar x indexed element FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h FMUL_si 0101 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s FMUL_si 0101 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d +FMLA_si 0101 1111 00 .. .... 0001 . 0 ..... ..... @rrx_h +FMLA_si 0101 1111 10 .. .... 0001 . 0 ..... ..... @rrx_s +FMLA_si 0101 1111 11 0. .... 0001 . 0 ..... ..... 
@rrx_d + +FMLS_si 0101 1111 00 .. .... 0101 . 0 ..... ..... @rrx_h +FMLS_si 0101 1111 10 .. .... 0101 . 0 ..... ..... @rrx_s +FMLS_si 0101 1111 11 0. .... 0101 . 0 ..... ..... @rrx_d + FMULX_si 0111 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h FMULX_si 0111 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s FMULX_si 0111 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d @@ -758,6 +772,14 @@ FMUL_vi 0.00 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h FMUL_vi 0.00 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s FMUL_vi 0.00 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d +FMLA_vi 0.00 1111 00 .. .... 0001 . 0 ..... ..... @qrrx_h +FMLA_vi 0.00 1111 10 . ..... 0001 . 0 ..... ..... @qrrx_s +FMLA_vi 0.00 1111 11 0 ..... 0001 . 0 ..... ..... @qrrx_d + +FMLS_vi 0.00 1111 00 .. .... 0101 . 0 ..... ..... @qrrx_h +FMLS_vi 0.00 1111 10 . ..... 0101 . 0 ..... ..... @qrrx_s +FMLS_vi 0.00 1111 11 0 ..... 0101 . 0 ..... ..... @qrrx_d + FMULX_vi 0.10 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h FMULX_vi 0.10 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s FMULX_vi 0.10 1111 11 0 ..... 1001 . 0 ..... ..... 
@qrrx_d diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 5ba30ba7c8..f84c12378d 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5066,6 +5066,20 @@ static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = { }; TRANS(FMULX_v, do_fp3_vector, a, f_vector_fmulx) +static gen_helper_gvec_3_ptr * const f_vector_fmla[3] = { + gen_helper_gvec_vfma_h, + gen_helper_gvec_vfma_s, + gen_helper_gvec_vfma_d, +}; +TRANS(FMLA_v, do_fp3_vector, a, f_vector_fmla) + +static gen_helper_gvec_3_ptr * const f_vector_fmls[3] = { + gen_helper_gvec_vfms_h, + gen_helper_gvec_vfms_s, + gen_helper_gvec_vfms_d, +}; +TRANS(FMLS_v, do_fp3_vector, a, f_vector_fmls) + /* * Advanced SIMD scalar/vector x indexed element */ @@ -5115,6 +5129,64 @@ static bool do_fp3_scalar_idx(DisasContext *s, arg_rrx_e *a, const FPScalar *f) TRANS(FMUL_si, do_fp3_scalar_idx, a, &f_scalar_fmul) TRANS(FMULX_si, do_fp3_scalar_idx, a, &f_scalar_fmulx) +static bool do_fmla_scalar_idx(DisasContext *s, arg_rrx_e *a, bool neg) +{ + switch (a->esz) { + case MO_64: + if (fp_access_check(s)) { + TCGv_i64 t0 = read_fp_dreg(s, a->rd); + TCGv_i64 t1 = read_fp_dreg(s, a->rn); + TCGv_i64 t2 = tcg_temp_new_i64(); + + read_vec_element(s, t2, a->rm, a->idx, MO_64); + if (neg) { + gen_vfp_negd(t1, t1); + } + gen_helper_vfp_muladdd(t0, t1, t2, t0, fpstatus_ptr(FPST_FPCR)); + write_fp_dreg(s, a->rd, t0); + } + break; + case MO_32: + if (fp_access_check(s)) { + TCGv_i32 t0 = read_fp_sreg(s, a->rd); + TCGv_i32 t1 = read_fp_sreg(s, a->rn); + TCGv_i32 t2 = tcg_temp_new_i32(); + + read_vec_element_i32(s, t2, a->rm, a->idx, MO_32); + if (neg) { + gen_vfp_negs(t1, t1); + } + gen_helper_vfp_muladds(t0, t1, t2, t0, fpstatus_ptr(FPST_FPCR)); + write_fp_sreg(s, a->rd, t0); + } + break; + case MO_16: + if (!dc_isar_feature(aa64_fp16, s)) { + return false; + } + if (fp_access_check(s)) { + TCGv_i32 t0 = read_fp_hreg(s, a->rd); + TCGv_i32 t1 = read_fp_hreg(s, a->rn); + TCGv_i32 t2 = 
tcg_temp_new_i32(); + + read_vec_element_i32(s, t2, a->rm, a->idx, MO_16); + if (neg) { + gen_vfp_negh(t1, t1); + } + gen_helper_advsimd_muladdh(t0, t1, t2, t0, + fpstatus_ptr(FPST_FPCR_F16)); + write_fp_sreg(s, a->rd, t0); + } + break; + default: + g_assert_not_reached(); + } + return true; +} + +TRANS(FMLA_si, do_fmla_scalar_idx, a, false) +TRANS(FMLS_si, do_fmla_scalar_idx, a, true) + static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a, gen_helper_gvec_3_ptr * const fns[3]) { @@ -5157,6 +5229,42 @@ static gen_helper_gvec_3_ptr * const f_vector_idx_fmulx[3] = { }; TRANS(FMULX_vi, do_fp3_vector_idx, a, f_vector_idx_fmulx) +static bool do_fmla_vector_idx(DisasContext *s, arg_qrrx_e *a, bool neg) +{ + static gen_helper_gvec_4_ptr * const fns[3] = { + gen_helper_gvec_fmla_idx_h, + gen_helper_gvec_fmla_idx_s, + gen_helper_gvec_fmla_idx_d, + }; + MemOp esz = a->esz; + + switch (esz) { + case MO_64: + if (!a->q) { + return false; + } + break; + case MO_32: + break; + case MO_16: + if (!dc_isar_feature(aa64_fp16, s)) { + return false; + } + break; + default: + g_assert_not_reached(); + } + if (fp_access_check(s)) { + gen_gvec_op4_fpst(s, a->q, a->rd, a->rn, a->rm, a->rd, + esz == MO_16, (a->idx << 1) | neg, + fns[esz - 1]); + } + return true; +} + +TRANS(FMLA_vi, do_fmla_vector_idx, a, false) +TRANS(FMLS_vi, do_fmla_vector_idx, a, true) + /* Shift a TCGv src by TCGv shift_amount, put result in dst. 
* Note that it is the caller's responsibility to ensure that the @@ -9119,15 +9227,6 @@ static void handle_3same_float(DisasContext *s, int size, int elements, read_vec_element(s, tcg_op2, rm, pass, MO_64); switch (fpopcode) { - case 0x39: /* FMLS */ - /* As usual for ARM, separate negation for fused multiply-add */ - gen_vfp_negd(tcg_op1, tcg_op1); - /* fall through */ - case 0x19: /* FMLA */ - read_vec_element(s, tcg_res, rd, pass, MO_64); - gen_helper_vfp_muladdd(tcg_res, tcg_op1, tcg_op2, - tcg_res, fpst); - break; case 0x1c: /* FCMEQ */ gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -9155,10 +9254,12 @@ static void handle_3same_float(DisasContext *s, int size, int elements, break; default: case 0x18: /* FMAXNM */ + case 0x19: /* FMLA */ case 0x1a: /* FADD */ case 0x1b: /* FMULX */ case 0x1e: /* FMAX */ case 0x38: /* FMINNM */ + case 0x39: /* FMLS */ case 0x3a: /* FSUB */ case 0x3e: /* FMIN */ case 0x5b: /* FMUL */ @@ -9177,15 +9278,6 @@ static void handle_3same_float(DisasContext *s, int size, int elements, read_vec_element_i32(s, tcg_op2, rm, pass, MO_32); switch (fpopcode) { - case 0x39: /* FMLS */ - /* As usual for ARM, separate negation for fused multiply-add */ - gen_vfp_negs(tcg_op1, tcg_op1); - /* fall through */ - case 0x19: /* FMLA */ - read_vec_element_i32(s, tcg_res, rd, pass, MO_32); - gen_helper_vfp_muladds(tcg_res, tcg_op1, tcg_op2, - tcg_res, fpst); - break; case 0x1c: /* FCMEQ */ gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -9213,10 +9305,12 @@ static void handle_3same_float(DisasContext *s, int size, int elements, break; default: case 0x18: /* FMAXNM */ + case 0x19: /* FMLA */ case 0x1a: /* FADD */ case 0x1b: /* FMULX */ case 0x1e: /* FMAX */ case 0x38: /* FMINNM */ + case 0x39: /* FMLS */ case 0x3a: /* FSUB */ case 0x3e: /* FMIN */ case 0x5b: /* FMUL */ @@ -11140,8 +11234,6 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) case 0x3f: /* FRSQRTS */ case 0x5d: /* FACGE */ case 
0x7d: /* FACGT */ - case 0x19: /* FMLA */ - case 0x39: /* FMLS */ case 0x1c: /* FCMEQ */ case 0x5c: /* FCMGE */ case 0x7a: /* FABD */ @@ -11174,10 +11266,12 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) default: case 0x18: /* FMAXNM */ + case 0x19: /* FMLA */ case 0x1a: /* FADD */ case 0x1b: /* FMULX */ case 0x1e: /* FMAX */ case 0x38: /* FMINNM */ + case 0x39: /* FMLS */ case 0x3a: /* FSUB */ case 0x3e: /* FMIN */ case 0x5b: /* FMUL */ @@ -11523,10 +11617,8 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) int pass; switch (fpopcode) { - case 0x1: /* FMLA */ case 0x4: /* FCMEQ */ case 0x7: /* FRECPS */ - case 0x9: /* FMLS */ case 0xf: /* FRSQRTS */ case 0x14: /* FCMGE */ case 0x15: /* FACGE */ @@ -11544,10 +11636,12 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) break; default: case 0x0: /* FMAXNM */ + case 0x1: /* FMLA */ case 0x2: /* FADD */ case 0x3: /* FMULX */ case 0x6: /* FMAX */ case 0x8: /* FMINNM */ + case 0x9: /* FMLS */ case 0xa: /* FSUB */ case 0xe: /* FMIN */ case 0x13: /* FMUL */ @@ -11617,24 +11711,12 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) read_vec_element_i32(s, tcg_op2, rm, pass, MO_16); switch (fpopcode) { - case 0x1: /* FMLA */ - read_vec_element_i32(s, tcg_res, rd, pass, MO_16); - gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res, - fpst); - break; case 0x4: /* FCMEQ */ gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; case 0x7: /* FRECPS */ gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x9: /* FMLS */ - /* As usual for ARM, separate negation for fused multiply-add */ - tcg_gen_xori_i32(tcg_op1, tcg_op1, 0x8000); - read_vec_element_i32(s, tcg_res, rd, pass, MO_16); - gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_res, - fpst); - break; case 0xf: /* FRSQRTS */ gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -11656,10 +11738,12 @@ static void 
disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) break; default: case 0x0: /* FMAXNM */ + case 0x1: /* FMLA */ case 0x2: /* FADD */ case 0x3: /* FMULX */ case 0x6: /* FMAX */ case 0x8: /* FMINNM */ + case 0x9: /* FMLS */ case 0xa: /* FSUB */ case 0xe: /* FMIN */ case 0x13: /* FMUL */ @@ -12880,10 +12964,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) case 0x0c: /* SQDMULH */ case 0x0d: /* SQRDMULH */ break; - case 0x01: /* FMLA */ - case 0x05: /* FMLS */ - is_fp = 1; - break; case 0x1d: /* SQRDMLAH */ case 0x1f: /* SQRDMLSH */ if (!dc_isar_feature(aa64_rdm, s)) { @@ -12950,6 +13030,8 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) /* is_fp, but we pass tcg_env not fp_status. */ break; default: + case 0x01: /* FMLA */ + case 0x05: /* FMLS */ case 0x09: /* FMUL */ case 0x19: /* FMULX */ unallocated_encoding(s); @@ -12958,20 +13040,8 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) switch (is_fp) { case 1: /* normal fp */ - /* convert insn encoded size to MemOp size */ - switch (size) { - case 0: /* half-precision */ - size = MO_16; - is_fp16 = true; - break; - case MO_32: /* single precision */ - case MO_64: /* double precision */ - break; - default: - unallocated_encoding(s); - return; - } - break; + unallocated_encoding(s); /* in decodetree */ + return; case 2: /* complex fp */ /* Each indexable element is a complex pair. */ @@ -13150,38 +13220,7 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } if (size == 3) { - TCGv_i64 tcg_idx = tcg_temp_new_i64(); - int pass; - - assert(is_fp && is_q && !is_long); - - read_vec_element(s, tcg_idx, rm, index, MO_64); - - for (pass = 0; pass < (is_scalar ? 
1 : 2); pass++) { - TCGv_i64 tcg_op = tcg_temp_new_i64(); - TCGv_i64 tcg_res = tcg_temp_new_i64(); - - read_vec_element(s, tcg_op, rn, pass, MO_64); - - switch (16 * u + opcode) { - case 0x05: /* FMLS */ - /* As usual for ARM, separate negation for fused multiply-add */ - gen_vfp_negd(tcg_op, tcg_op); - /* fall through */ - case 0x01: /* FMLA */ - read_vec_element(s, tcg_res, rd, pass, MO_64); - gen_helper_vfp_muladdd(tcg_res, tcg_op, tcg_idx, tcg_res, fpst); - break; - default: - case 0x09: /* FMUL */ - case 0x19: /* FMULX */ - g_assert_not_reached(); - } - - write_vec_element(s, tcg_res, rd, pass, MO_64); - } - - clear_vec_high(s, !is_scalar, rd); + g_assert_not_reached(); } else if (!is_long) { /* 32 bit floating point, or 16 or 32 bit integer. * For the 16 bit scalar case we use the usual Neon helpers and @@ -13237,38 +13276,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) genfn(tcg_res, tcg_op, tcg_res); break; } - case 0x05: /* FMLS */ - case 0x01: /* FMLA */ - read_vec_element_i32(s, tcg_res, rd, pass, - is_scalar ? 
size : MO_32); - switch (size) { - case 1: - if (opcode == 0x5) { - /* As usual for ARM, separate negation for fused - * multiply-add */ - tcg_gen_xori_i32(tcg_op, tcg_op, 0x80008000); - } - if (is_scalar) { - gen_helper_advsimd_muladdh(tcg_res, tcg_op, tcg_idx, - tcg_res, fpst); - } else { - gen_helper_advsimd_muladd2h(tcg_res, tcg_op, tcg_idx, - tcg_res, fpst); - } - break; - case 2: - if (opcode == 0x5) { - /* As usual for ARM, separate negation for - * fused multiply-add */ - tcg_gen_xori_i32(tcg_op, tcg_op, 0x80000000); - } - gen_helper_vfp_muladds(tcg_res, tcg_op, tcg_idx, - tcg_res, fpst); - break; - default: - g_assert_not_reached(); - } - break; case 0x0c: /* SQDMULH */ if (size == 1) { gen_helper_neon_qdmulh_s16(tcg_res, tcg_env, @@ -13310,6 +13317,8 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } break; default: + case 0x01: /* FMLA */ + case 0x05: /* FMLS */ case 0x09: /* FMUL */ case 0x19: /* FMULX */ g_assert_not_reached(); diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index 99ef676071..b925b9f21b 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -1309,6 +1309,12 @@ static float32 float32_muladd_f(float32 dest, float32 op1, float32 op2, return float32_muladd(op1, op2, dest, 0, stat); } +static float64 float64_muladd_f(float64 dest, float64 op1, float64 op2, + float_status *stat) +{ + return float64_muladd(op1, op2, dest, 0, stat); +} + static float16 float16_mulsub_f(float16 dest, float16 op1, float16 op2, float_status *stat) { @@ -1321,6 +1327,12 @@ static float32 float32_mulsub_f(float32 dest, float32 op1, float32 op2, return float32_muladd(float32_chs(op1), op2, dest, 0, stat); } +static float64 float64_mulsub_f(float64 dest, float64 op1, float64 op2, + float_status *stat) +{ + return float64_muladd(float64_chs(op1), op2, dest, 0, stat); +} + #define DO_MULADD(NAME, FUNC, TYPE) \ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \ { \ @@ -1340,9 
+1352,11 @@ DO_MULADD(gvec_fmls_s, float32_mulsub_nf, float32) DO_MULADD(gvec_vfma_h, float16_muladd_f, float16) DO_MULADD(gvec_vfma_s, float32_muladd_f, float32) +DO_MULADD(gvec_vfma_d, float64_muladd_f, float64) DO_MULADD(gvec_vfms_h, float16_mulsub_f, float16) DO_MULADD(gvec_vfms_s, float32_mulsub_f, float32) +DO_MULADD(gvec_vfms_d, float64_mulsub_f, float64) /* For the indexed ops, SVE applies the index per 128-bit vector segment. * For AdvSIMD, there is of course only one such vector segment.
From patchwork Fri May 24 23:20:39 2024 X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673852 From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 25/67] target/arm: Convert FCMEQ, FCMGE, FCMGT, FACGE, FACGT to decodetree Date: Fri, 24 May 2024 16:20:39 -0700 Message-Id: <20240524232121.284515-26-richard.henderson@linaro.org> In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org>
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 5 + target/arm/tcg/a64.decode | 30 ++++++ target/arm/tcg/translate-a64.c | 188 +++++++++++++++++++-------------- target/arm/tcg/vec_helper.c | 30 ++++++ 4 files changed, 174 insertions(+), 79 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index e021c18517..8d076011c1 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -727,18 +727,23 @@ DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fceq_h, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fceq_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fceq_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcge_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fcge_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fcgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_facge_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_facge_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_facge_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_facgt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_facgt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_facgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fmax_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fmax_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 11527bb5e5..7fc3277be6 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -713,6 +713,21 @@ FMINNM_s 0001 1110 ..1 ..... 0111 10 ..... ..... @rrr_hsd FMULX_s 0101 1110 010 ..... 00011 1 ..... ..... @rrr_h FMULX_s 0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd +FCMEQ_s 0101 1110 010 ..... 00100 1 ..... ..... @rrr_h +FCMEQ_s 0101 1110 0.1 ..... 11100 1 ..... ..... @rrr_sd + +FCMGE_s 0111 1110 010 ..... 00100 1 ..... ..... @rrr_h +FCMGE_s 0111 1110 0.1 ..... 11100 1 ..... ..... @rrr_sd + +FCMGT_s 0111 1110 110 ..... 00100 1 ..... ..... @rrr_h +FCMGT_s 0111 1110 1.1 ..... 
11100 1 ..... ..... @rrr_sd + +FACGE_s 0111 1110 010 ..... 00101 1 ..... ..... @rrr_h +FACGE_s 0111 1110 0.1 ..... 11101 1 ..... ..... @rrr_sd + +FACGT_s 0111 1110 110 ..... 00101 1 ..... ..... @rrr_h +FACGT_s 0111 1110 1.1 ..... 11101 1 ..... ..... @rrr_sd + ### Advanced SIMD three same FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h @@ -748,6 +763,21 @@ FMLA_v 0.00 1110 0.1 ..... 11001 1 ..... ..... @qrrr_sd FMLS_v 0.00 1110 110 ..... 00001 1 ..... ..... @qrrr_h FMLS_v 0.00 1110 1.1 ..... 11001 1 ..... ..... @qrrr_sd +FCMEQ_v 0.00 1110 010 ..... 00100 1 ..... ..... @qrrr_h +FCMEQ_v 0.00 1110 0.1 ..... 11100 1 ..... ..... @qrrr_sd + +FCMGE_v 0.10 1110 010 ..... 00100 1 ..... ..... @qrrr_h +FCMGE_v 0.10 1110 0.1 ..... 11100 1 ..... ..... @qrrr_sd + +FCMGT_v 0.10 1110 110 ..... 00100 1 ..... ..... @qrrr_h +FCMGT_v 0.10 1110 1.1 ..... 11100 1 ..... ..... @qrrr_sd + +FACGE_v 0.10 1110 010 ..... 00101 1 ..... ..... @qrrr_h +FACGE_v 0.10 1110 0.1 ..... 11101 1 ..... ..... @qrrr_sd + +FACGT_v 0.10 1110 110 ..... 00101 1 ..... ..... @qrrr_h +FACGT_v 0.10 1110 1.1 ..... 11101 1 ..... ..... @qrrr_sd + ### Advanced SIMD scalar x indexed element FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... 
@rrx_h diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index f84c12378d..75b0c1a005 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -4975,6 +4975,41 @@ static const FPScalar f_scalar_fnmul = { }; TRANS(FNMUL_s, do_fp3_scalar, a, &f_scalar_fnmul) +static const FPScalar f_scalar_fcmeq = { + gen_helper_advsimd_ceq_f16, + gen_helper_neon_ceq_f32, + gen_helper_neon_ceq_f64, +}; +TRANS(FCMEQ_s, do_fp3_scalar, a, &f_scalar_fcmeq) + +static const FPScalar f_scalar_fcmge = { + gen_helper_advsimd_cge_f16, + gen_helper_neon_cge_f32, + gen_helper_neon_cge_f64, +}; +TRANS(FCMGE_s, do_fp3_scalar, a, &f_scalar_fcmge) + +static const FPScalar f_scalar_fcmgt = { + gen_helper_advsimd_cgt_f16, + gen_helper_neon_cgt_f32, + gen_helper_neon_cgt_f64, +}; +TRANS(FCMGT_s, do_fp3_scalar, a, &f_scalar_fcmgt) + +static const FPScalar f_scalar_facge = { + gen_helper_advsimd_acge_f16, + gen_helper_neon_acge_f32, + gen_helper_neon_acge_f64, +}; +TRANS(FACGE_s, do_fp3_scalar, a, &f_scalar_facge) + +static const FPScalar f_scalar_facgt = { + gen_helper_advsimd_acgt_f16, + gen_helper_neon_acgt_f32, + gen_helper_neon_acgt_f64, +}; +TRANS(FACGT_s, do_fp3_scalar, a, &f_scalar_facgt) + static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, gen_helper_gvec_3_ptr * const fns[3]) { @@ -5080,6 +5115,41 @@ static gen_helper_gvec_3_ptr * const f_vector_fmls[3] = { }; TRANS(FMLS_v, do_fp3_vector, a, f_vector_fmls) +static gen_helper_gvec_3_ptr * const f_vector_fcmeq[3] = { + gen_helper_gvec_fceq_h, + gen_helper_gvec_fceq_s, + gen_helper_gvec_fceq_d, +}; +TRANS(FCMEQ_v, do_fp3_vector, a, f_vector_fcmeq) + +static gen_helper_gvec_3_ptr * const f_vector_fcmge[3] = { + gen_helper_gvec_fcge_h, + gen_helper_gvec_fcge_s, + gen_helper_gvec_fcge_d, +}; +TRANS(FCMGE_v, do_fp3_vector, a, f_vector_fcmge) + +static gen_helper_gvec_3_ptr * const f_vector_fcmgt[3] = { + gen_helper_gvec_fcgt_h, + gen_helper_gvec_fcgt_s, + gen_helper_gvec_fcgt_d, +}; 
+TRANS(FCMGT_v, do_fp3_vector, a, f_vector_fcmgt) + +static gen_helper_gvec_3_ptr * const f_vector_facge[3] = { + gen_helper_gvec_facge_h, + gen_helper_gvec_facge_s, + gen_helper_gvec_facge_d, +}; +TRANS(FACGE_v, do_fp3_vector, a, f_vector_facge) + +static gen_helper_gvec_3_ptr * const f_vector_facgt[3] = { + gen_helper_gvec_facgt_h, + gen_helper_gvec_facgt_s, + gen_helper_gvec_facgt_d, +}; +TRANS(FACGT_v, do_fp3_vector, a, f_vector_facgt) + /* * Advanced SIMD scalar/vector x indexed element */ @@ -9227,43 +9297,33 @@ static void handle_3same_float(DisasContext *s, int size, int elements, read_vec_element(s, tcg_op2, rm, pass, MO_64); switch (fpopcode) { - case 0x1c: /* FCMEQ */ - gen_helper_neon_ceq_f64(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x1f: /* FRECPS */ gen_helper_recpsf_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; case 0x3f: /* FRSQRTS */ gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x5c: /* FCMGE */ - gen_helper_neon_cge_f64(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x5d: /* FACGE */ - gen_helper_neon_acge_f64(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x7a: /* FABD */ gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst); gen_vfp_absd(tcg_res, tcg_res); break; - case 0x7c: /* FCMGT */ - gen_helper_neon_cgt_f64(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x7d: /* FACGT */ - gen_helper_neon_acgt_f64(tcg_res, tcg_op1, tcg_op2, fpst); - break; default: case 0x18: /* FMAXNM */ case 0x19: /* FMLA */ case 0x1a: /* FADD */ case 0x1b: /* FMULX */ + case 0x1c: /* FCMEQ */ case 0x1e: /* FMAX */ case 0x38: /* FMINNM */ case 0x39: /* FMLS */ case 0x3a: /* FSUB */ case 0x3e: /* FMIN */ case 0x5b: /* FMUL */ + case 0x5c: /* FCMGE */ + case 0x5d: /* FACGE */ case 0x5f: /* FDIV */ + case 0x7c: /* FCMGT */ + case 0x7d: /* FACGT */ g_assert_not_reached(); } @@ -9278,43 +9338,33 @@ static void handle_3same_float(DisasContext *s, int size, int elements, read_vec_element_i32(s, tcg_op2, rm, pass, MO_32); switch (fpopcode) { 
- case 0x1c: /* FCMEQ */ - gen_helper_neon_ceq_f32(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x1f: /* FRECPS */ gen_helper_recpsf_f32(tcg_res, tcg_op1, tcg_op2, fpst); break; case 0x3f: /* FRSQRTS */ gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x5c: /* FCMGE */ - gen_helper_neon_cge_f32(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x5d: /* FACGE */ - gen_helper_neon_acge_f32(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x7a: /* FABD */ gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst); gen_vfp_abss(tcg_res, tcg_res); break; - case 0x7c: /* FCMGT */ - gen_helper_neon_cgt_f32(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x7d: /* FACGT */ - gen_helper_neon_acgt_f32(tcg_res, tcg_op1, tcg_op2, fpst); - break; default: case 0x18: /* FMAXNM */ case 0x19: /* FMLA */ case 0x1a: /* FADD */ case 0x1b: /* FMULX */ + case 0x1c: /* FCMEQ */ case 0x1e: /* FMAX */ case 0x38: /* FMINNM */ case 0x39: /* FMLS */ case 0x3a: /* FSUB */ case 0x3e: /* FMIN */ case 0x5b: /* FMUL */ + case 0x5c: /* FCMGE */ + case 0x5d: /* FACGE */ case 0x5f: /* FDIV */ + case 0x7c: /* FCMGT */ + case 0x7d: /* FACGT */ g_assert_not_reached(); } @@ -9355,15 +9405,15 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) switch (fpopcode) { case 0x1f: /* FRECPS */ case 0x3f: /* FRSQRTS */ + case 0x7a: /* FABD */ + break; + default: + case 0x1b: /* FMULX */ case 0x5d: /* FACGE */ case 0x7d: /* FACGT */ case 0x1c: /* FCMEQ */ case 0x5c: /* FCMGE */ case 0x7c: /* FCMGT */ - case 0x7a: /* FABD */ - break; - default: - case 0x1b: /* FMULX */ unallocated_encoding(s); return; } @@ -9516,17 +9566,17 @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s, TCGv_i32 tcg_res; switch (fpopcode) { - case 0x04: /* FCMEQ (reg) */ case 0x07: /* FRECPS */ case 0x0f: /* FRSQRTS */ - case 0x14: /* FCMGE (reg) */ - case 0x15: /* FACGE */ case 0x1a: /* FABD */ - case 0x1c: /* FCMGT (reg) */ - case 0x1d: /* FACGT */ break; default: case 0x03: 
/* FMULX */ + case 0x04: /* FCMEQ (reg) */ + case 0x14: /* FCMGE (reg) */ + case 0x15: /* FACGE */ + case 0x1c: /* FCMGT (reg) */ + case 0x1d: /* FACGT */ unallocated_encoding(s); return; } @@ -9546,33 +9596,23 @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s, tcg_res = tcg_temp_new_i32(); switch (fpopcode) { - case 0x04: /* FCMEQ (reg) */ - gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x07: /* FRECPS */ gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; case 0x0f: /* FRSQRTS */ gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x14: /* FCMGE (reg) */ - gen_helper_advsimd_cge_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x15: /* FACGE */ - gen_helper_advsimd_acge_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x1a: /* FABD */ gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst); tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff); break; - case 0x1c: /* FCMGT (reg) */ - gen_helper_advsimd_cgt_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x1d: /* FACGT */ - gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; default: case 0x03: /* FMULX */ + case 0x04: /* FCMEQ (reg) */ + case 0x14: /* FCMGE (reg) */ + case 0x15: /* FACGE */ + case 0x1c: /* FCMGT (reg) */ + case 0x1d: /* FACGT */ g_assert_not_reached(); } @@ -11232,12 +11272,7 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) return; case 0x1f: /* FRECPS */ case 0x3f: /* FRSQRTS */ - case 0x5d: /* FACGE */ - case 0x7d: /* FACGT */ - case 0x1c: /* FCMEQ */ - case 0x5c: /* FCMGE */ case 0x7a: /* FABD */ - case 0x7c: /* FCMGT */ if (!fp_access_check(s)) { return; } @@ -11269,13 +11304,18 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) case 0x19: /* FMLA */ case 0x1a: /* FADD */ case 0x1b: /* FMULX */ + case 0x1c: /* FCMEQ */ case 0x1e: /* FMAX */ case 0x38: /* FMINNM */ case 0x39: /* FMLS */ case 0x3a: /* FSUB */ case 0x3e: /* FMIN */ case 0x5b: /* FMUL */ + 
case 0x5c: /* FCMGE */ + case 0x5d: /* FACGE */ case 0x5f: /* FDIV */ + case 0x7d: /* FACGT */ + case 0x7c: /* FCMGT */ unallocated_encoding(s); return; } @@ -11617,14 +11657,9 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) int pass; switch (fpopcode) { - case 0x4: /* FCMEQ */ case 0x7: /* FRECPS */ case 0xf: /* FRSQRTS */ - case 0x14: /* FCMGE */ - case 0x15: /* FACGE */ case 0x1a: /* FABD */ - case 0x1c: /* FCMGT */ - case 0x1d: /* FACGT */ pairwise = false; break; case 0x10: /* FMAXNMP */ @@ -11639,13 +11674,18 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) case 0x1: /* FMLA */ case 0x2: /* FADD */ case 0x3: /* FMULX */ + case 0x4: /* FCMEQ */ case 0x6: /* FMAX */ case 0x8: /* FMINNM */ case 0x9: /* FMLS */ case 0xa: /* FSUB */ case 0xe: /* FMIN */ case 0x13: /* FMUL */ + case 0x14: /* FCMGE */ + case 0x15: /* FACGE */ case 0x17: /* FDIV */ + case 0x1c: /* FCMGT */ + case 0x1d: /* FACGT */ unallocated_encoding(s); return; } @@ -11711,43 +11751,33 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) read_vec_element_i32(s, tcg_op2, rm, pass, MO_16); switch (fpopcode) { - case 0x4: /* FCMEQ */ - gen_helper_advsimd_ceq_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x7: /* FRECPS */ gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; case 0xf: /* FRSQRTS */ gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x14: /* FCMGE */ - gen_helper_advsimd_cge_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x15: /* FACGE */ - gen_helper_advsimd_acge_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0x1a: /* FABD */ gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst); tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff); break; - case 0x1c: /* FCMGT */ - gen_helper_advsimd_cgt_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x1d: /* FACGT */ - gen_helper_advsimd_acgt_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; default: case 0x0: /* FMAXNM */ 
case 0x1: /* FMLA */ case 0x2: /* FADD */ case 0x3: /* FMULX */ + case 0x4: /* FCMEQ */ case 0x6: /* FMAX */ case 0x8: /* FMINNM */ case 0x9: /* FMLS */ case 0xa: /* FSUB */ case 0xe: /* FMIN */ case 0x13: /* FMUL */ + case 0x14: /* FCMGE */ + case 0x15: /* FACGE */ case 0x17: /* FDIV */ + case 0x1c: /* FCMGT */ + case 0x1d: /* FACGT */ g_assert_not_reached(); } diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index b925b9f21b..dabefa3526 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -971,6 +971,11 @@ static uint32_t float32_ceq(float32 op1, float32 op2, float_status *stat) return -float32_eq_quiet(op1, op2, stat); } +static uint64_t float64_ceq(float64 op1, float64 op2, float_status *stat) +{ + return -float64_eq_quiet(op1, op2, stat); +} + static uint16_t float16_cge(float16 op1, float16 op2, float_status *stat) { return -float16_le(op2, op1, stat); @@ -981,6 +986,11 @@ static uint32_t float32_cge(float32 op1, float32 op2, float_status *stat) return -float32_le(op2, op1, stat); } +static uint64_t float64_cge(float64 op1, float64 op2, float_status *stat) +{ + return -float64_le(op2, op1, stat); +} + static uint16_t float16_cgt(float16 op1, float16 op2, float_status *stat) { return -float16_lt(op2, op1, stat); @@ -991,6 +1001,11 @@ static uint32_t float32_cgt(float32 op1, float32 op2, float_status *stat) return -float32_lt(op2, op1, stat); } +static uint64_t float64_cgt(float64 op1, float64 op2, float_status *stat) +{ + return -float64_lt(op2, op1, stat); +} + static uint16_t float16_acge(float16 op1, float16 op2, float_status *stat) { return -float16_le(float16_abs(op2), float16_abs(op1), stat); @@ -1001,6 +1016,11 @@ static uint32_t float32_acge(float32 op1, float32 op2, float_status *stat) return -float32_le(float32_abs(op2), float32_abs(op1), stat); } +static uint64_t float64_acge(float64 op1, float64 op2, float_status *stat) +{ + return -float64_le(float64_abs(op2), float64_abs(op1), stat); +} + static 
uint16_t float16_acgt(float16 op1, float16 op2, float_status *stat) { return -float16_lt(float16_abs(op2), float16_abs(op1), stat); @@ -1011,6 +1031,11 @@ static uint32_t float32_acgt(float32 op1, float32 op2, float_status *stat) return -float32_lt(float32_abs(op2), float32_abs(op1), stat); } +static uint64_t float64_acgt(float64 op1, float64 op2, float_status *stat) +{ + return -float64_lt(float64_abs(op2), float64_abs(op1), stat); +} + static int16_t vfp_tosszh(float16 x, void *fpstp) { float_status *fpst = fpstp; @@ -1216,18 +1241,23 @@ DO_3OP(gvec_fabd_s, float32_abd, float32) DO_3OP(gvec_fceq_h, float16_ceq, float16) DO_3OP(gvec_fceq_s, float32_ceq, float32) +DO_3OP(gvec_fceq_d, float64_ceq, float64) DO_3OP(gvec_fcge_h, float16_cge, float16) DO_3OP(gvec_fcge_s, float32_cge, float32) +DO_3OP(gvec_fcge_d, float64_cge, float64) DO_3OP(gvec_fcgt_h, float16_cgt, float16) DO_3OP(gvec_fcgt_s, float32_cgt, float32) +DO_3OP(gvec_fcgt_d, float64_cgt, float64) DO_3OP(gvec_facge_h, float16_acge, float16) DO_3OP(gvec_facge_s, float32_acge, float32) +DO_3OP(gvec_facge_d, float64_acge, float64) DO_3OP(gvec_facgt_h, float16_acgt, float16) DO_3OP(gvec_facgt_s, float32_acgt, float32) +DO_3OP(gvec_facgt_d, float64_acgt, float64) DO_3OP(gvec_fmax_h, float16_max, float16) DO_3OP(gvec_fmax_s, float32_max, float32)
From patchwork Fri May 24 23:20:40 2024 X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673791 From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 26/67] target/arm: Convert FABD to decodetree Date: Fri, 24 May 2024 16:20:40 -0700 Message-Id: <20240524232121.284515-27-richard.henderson@linaro.org> In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org>
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson ---
target/arm/helper.h | 1 + target/arm/tcg/a64.decode | 6 ++++ target/arm/tcg/translate-a64.c | 60 ++++++++++++++++++++++------------ target/arm/tcg/vec_helper.c | 6 ++++ 4 files changed, 53 insertions(+), 20 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 8d076011c1..ff6e3094f4 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -724,6 +724,7 @@ DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fceq_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fceq_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 7fc3277be6..a852b5f06f 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -728,6 +728,9 @@ FACGE_s 0111 1110 0.1 ..... 11101 1 ..... ..... @rrr_sd FACGT_s 0111 1110 110 ..... 00101 1 ..... ..... @rrr_h FACGT_s 0111 1110 1.1 ..... 11101 1 ..... ..... @rrr_sd +FABD_s 0111 1110 110 ..... 00010 1 ..... ..... @rrr_h +FABD_s 0111 1110 1.1 ..... 11010 1 ..... ..... @rrr_sd + ### Advanced SIMD three same FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h @@ -778,6 +781,9 @@ FACGE_v 0.10 1110 0.1 ..... 11101 1 ..... ..... @qrrr_sd FACGT_v 0.10 1110 110 ..... 00101 1 ..... ..... @qrrr_h FACGT_v 0.10 1110 1.1 ..... 11101 1 ..... ..... @qrrr_sd +FABD_v 0.10 1110 110 ..... 00010 1 ..... ..... @qrrr_h +FABD_v 0.10 1110 1.1 ..... 11010 1 ..... ..... @qrrr_sd + ### Advanced SIMD scalar x indexed element FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... 
@rrx_h diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 75b0c1a005..633384d2a5 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5010,6 +5010,31 @@ static const FPScalar f_scalar_facgt = { }; TRANS(FACGT_s, do_fp3_scalar, a, &f_scalar_facgt) +static void gen_fabd_h(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s) +{ + gen_helper_vfp_subh(d, n, m, s); + gen_vfp_absh(d, d); +} + +static void gen_fabd_s(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s) +{ + gen_helper_vfp_subs(d, n, m, s); + gen_vfp_abss(d, d); +} + +static void gen_fabd_d(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_ptr s) +{ + gen_helper_vfp_subd(d, n, m, s); + gen_vfp_absd(d, d); +} + +static const FPScalar f_scalar_fabd = { + gen_fabd_h, + gen_fabd_s, + gen_fabd_d, +}; +TRANS(FABD_s, do_fp3_scalar, a, &f_scalar_fabd) + static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, gen_helper_gvec_3_ptr * const fns[3]) { @@ -5150,6 +5175,13 @@ static gen_helper_gvec_3_ptr * const f_vector_facgt[3] = { }; TRANS(FACGT_v, do_fp3_vector, a, f_vector_facgt) +static gen_helper_gvec_3_ptr * const f_vector_fabd[3] = { + gen_helper_gvec_fabd_h, + gen_helper_gvec_fabd_s, + gen_helper_gvec_fabd_d, +}; +TRANS(FABD_v, do_fp3_vector, a, f_vector_fabd) + /* * Advanced SIMD scalar/vector x indexed element */ @@ -9303,10 +9335,6 @@ static void handle_3same_float(DisasContext *s, int size, int elements, case 0x3f: /* FRSQRTS */ gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x7a: /* FABD */ - gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst); - gen_vfp_absd(tcg_res, tcg_res); - break; default: case 0x18: /* FMAXNM */ case 0x19: /* FMLA */ @@ -9322,6 +9350,7 @@ static void handle_3same_float(DisasContext *s, int size, int elements, case 0x5c: /* FCMGE */ case 0x5d: /* FACGE */ case 0x5f: /* FDIV */ + case 0x7a: /* FABD */ case 0x7c: /* FCMGT */ case 0x7d: /* FACGT */ g_assert_not_reached(); @@ -9344,10 +9373,6 @@ static void 
handle_3same_float(DisasContext *s, int size, int elements, case 0x3f: /* FRSQRTS */ gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x7a: /* FABD */ - gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst); - gen_vfp_abss(tcg_res, tcg_res); - break; default: case 0x18: /* FMAXNM */ case 0x19: /* FMLA */ @@ -9363,6 +9388,7 @@ static void handle_3same_float(DisasContext *s, int size, int elements, case 0x5c: /* FCMGE */ case 0x5d: /* FACGE */ case 0x5f: /* FDIV */ + case 0x7a: /* FABD */ case 0x7c: /* FCMGT */ case 0x7d: /* FACGT */ g_assert_not_reached(); @@ -9405,7 +9431,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) switch (fpopcode) { case 0x1f: /* FRECPS */ case 0x3f: /* FRSQRTS */ - case 0x7a: /* FABD */ break; default: case 0x1b: /* FMULX */ @@ -9413,6 +9438,7 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) case 0x7d: /* FACGT */ case 0x1c: /* FCMEQ */ case 0x5c: /* FCMGE */ + case 0x7a: /* FABD */ case 0x7c: /* FCMGT */ unallocated_encoding(s); return; @@ -9568,13 +9594,13 @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s, switch (fpopcode) { case 0x07: /* FRECPS */ case 0x0f: /* FRSQRTS */ - case 0x1a: /* FABD */ break; default: case 0x03: /* FMULX */ case 0x04: /* FCMEQ (reg) */ case 0x14: /* FCMGE (reg) */ case 0x15: /* FACGE */ + case 0x1a: /* FABD */ case 0x1c: /* FCMGT (reg) */ case 0x1d: /* FACGT */ unallocated_encoding(s); @@ -9602,15 +9628,12 @@ static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s, case 0x0f: /* FRSQRTS */ gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x1a: /* FABD */ - gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst); - tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff); - break; default: case 0x03: /* FMULX */ case 0x04: /* FCMEQ (reg) */ case 0x14: /* FCMGE (reg) */ case 0x15: /* FACGE */ + case 0x1a: /* FABD */ case 0x1c: /* FCMGT (reg) */ case 0x1d: /* FACGT */ g_assert_not_reached(); 
@@ -11272,7 +11295,6 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) return; case 0x1f: /* FRECPS */ case 0x3f: /* FRSQRTS */ - case 0x7a: /* FABD */ if (!fp_access_check(s)) { return; } @@ -11314,6 +11336,7 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) case 0x5c: /* FCMGE */ case 0x5d: /* FACGE */ case 0x5f: /* FDIV */ + case 0x7a: /* FABD */ case 0x7d: /* FACGT */ case 0x7c: /* FCMGT */ unallocated_encoding(s); @@ -11659,7 +11682,6 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) switch (fpopcode) { case 0x7: /* FRECPS */ case 0xf: /* FRSQRTS */ - case 0x1a: /* FABD */ pairwise = false; break; case 0x10: /* FMAXNMP */ @@ -11684,6 +11706,7 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) case 0x14: /* FCMGE */ case 0x15: /* FACGE */ case 0x17: /* FDIV */ + case 0x1a: /* FABD */ case 0x1c: /* FCMGT */ case 0x1d: /* FACGT */ unallocated_encoding(s); @@ -11757,10 +11780,6 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) case 0xf: /* FRSQRTS */ gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0x1a: /* FABD */ - gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst); - tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff); - break; default: case 0x0: /* FMAXNM */ case 0x1: /* FMLA */ @@ -11776,6 +11795,7 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) case 0x14: /* FCMGE */ case 0x15: /* FACGE */ case 0x17: /* FDIV */ + case 0x1a: /* FABD */ case 0x1c: /* FCMGT */ case 0x1d: /* FACGT */ g_assert_not_reached(); diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index dabefa3526..e9d7922f30 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -1154,6 +1154,11 @@ static float32 float32_abd(float32 op1, float32 op2, float_status *stat) return float32_abs(float32_sub(op1, op2, stat)); } +static float64 float64_abd(float64 op1, float64 op2, float_status 
*stat)
+{
+    return float64_abs(float64_sub(op1, op2, stat));
+}
+
 /*
  * Reciprocal step. These are the AArch32 version which uses a
  * non-fused multiply-and-subtract.
@@ -1238,6 +1243,7 @@ DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)

 DO_3OP(gvec_fabd_h, float16_abd, float16)
 DO_3OP(gvec_fabd_s, float32_abd, float32)
+DO_3OP(gvec_fabd_d, float64_abd, float64)

 DO_3OP(gvec_fceq_h, float16_ceq, float16)
 DO_3OP(gvec_fceq_s, float32_ceq, float32)

From patchwork Fri May 24 23:20:41 2024
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673854
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org, Peter Maydell
Subject: [PATCH v2 27/67] target/arm: Convert FRECPS, FRSQRTS to decodetree
Date: Fri, 24 May 2024 16:20:41 -0700
Message-Id: <20240524232121.284515-28-richard.henderson@linaro.org>
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

These are the last instructions within handle_3same_float
and disas_simd_scalar_three_reg_same_fp16 so remove them.

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/tcg/a64.decode | 12 ++
 target/arm/tcg/translate-a64.c | 293 ++++-----------------------------
 2 files changed, 46 insertions(+), 259 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index a852b5f06f..84cb38f1dd 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -731,6 +731,12 @@ FACGT_s 0111 1110 1.1 ..... 11101 1 ..... .....
@rrr_sd FABD_s 0111 1110 110 ..... 00010 1 ..... ..... @rrr_h FABD_s 0111 1110 1.1 ..... 11010 1 ..... ..... @rrr_sd +FRECPS_s 0101 1110 010 ..... 00111 1 ..... ..... @rrr_h +FRECPS_s 0101 1110 0.1 ..... 11111 1 ..... ..... @rrr_sd + +FRSQRTS_s 0101 1110 110 ..... 00111 1 ..... ..... @rrr_h +FRSQRTS_s 0101 1110 1.1 ..... 11111 1 ..... ..... @rrr_sd + ### Advanced SIMD three same FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h @@ -784,6 +790,12 @@ FACGT_v 0.10 1110 1.1 ..... 11101 1 ..... ..... @qrrr_sd FABD_v 0.10 1110 110 ..... 00010 1 ..... ..... @qrrr_h FABD_v 0.10 1110 1.1 ..... 11010 1 ..... ..... @qrrr_sd +FRECPS_v 0.00 1110 010 ..... 00111 1 ..... ..... @qrrr_h +FRECPS_v 0.00 1110 0.1 ..... 11111 1 ..... ..... @qrrr_sd + +FRSQRTS_v 0.00 1110 110 ..... 00111 1 ..... ..... @qrrr_h +FRSQRTS_v 0.00 1110 1.1 ..... 11111 1 ..... ..... @qrrr_sd + ### Advanced SIMD scalar x indexed element FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 633384d2a5..a7537a5104 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5035,6 +5035,20 @@ static const FPScalar f_scalar_fabd = { }; TRANS(FABD_s, do_fp3_scalar, a, &f_scalar_fabd) +static const FPScalar f_scalar_frecps = { + gen_helper_recpsf_f16, + gen_helper_recpsf_f32, + gen_helper_recpsf_f64, +}; +TRANS(FRECPS_s, do_fp3_scalar, a, &f_scalar_frecps) + +static const FPScalar f_scalar_frsqrts = { + gen_helper_rsqrtsf_f16, + gen_helper_rsqrtsf_f32, + gen_helper_rsqrtsf_f64, +}; +TRANS(FRSQRTS_s, do_fp3_scalar, a, &f_scalar_frsqrts) + static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, gen_helper_gvec_3_ptr * const fns[3]) { @@ -5182,6 +5196,20 @@ static gen_helper_gvec_3_ptr * const f_vector_fabd[3] = { }; TRANS(FABD_v, do_fp3_vector, a, f_vector_fabd) +static gen_helper_gvec_3_ptr * const f_vector_frecps[3] = { + gen_helper_gvec_recps_h, + gen_helper_gvec_recps_s, + gen_helper_gvec_recps_d, 
+}; +TRANS(FRECPS_v, do_fp3_vector, a, f_vector_frecps) + +static gen_helper_gvec_3_ptr * const f_vector_frsqrts[3] = { + gen_helper_gvec_rsqrts_h, + gen_helper_gvec_rsqrts_s, + gen_helper_gvec_rsqrts_d, +}; +TRANS(FRSQRTS_v, do_fp3_vector, a, f_vector_frsqrts) + /* * Advanced SIMD scalar/vector x indexed element */ @@ -9308,107 +9336,6 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u, } } -/* Handle the 3-same-operands float operations; shared by the scalar - * and vector encodings. The caller must filter out any encodings - * not allocated for the encoding it is dealing with. - */ -static void handle_3same_float(DisasContext *s, int size, int elements, - int fpopcode, int rd, int rn, int rm) -{ - int pass; - TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR); - - for (pass = 0; pass < elements; pass++) { - if (size) { - /* Double */ - TCGv_i64 tcg_op1 = tcg_temp_new_i64(); - TCGv_i64 tcg_op2 = tcg_temp_new_i64(); - TCGv_i64 tcg_res = tcg_temp_new_i64(); - - read_vec_element(s, tcg_op1, rn, pass, MO_64); - read_vec_element(s, tcg_op2, rm, pass, MO_64); - - switch (fpopcode) { - case 0x1f: /* FRECPS */ - gen_helper_recpsf_f64(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x3f: /* FRSQRTS */ - gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst); - break; - default: - case 0x18: /* FMAXNM */ - case 0x19: /* FMLA */ - case 0x1a: /* FADD */ - case 0x1b: /* FMULX */ - case 0x1c: /* FCMEQ */ - case 0x1e: /* FMAX */ - case 0x38: /* FMINNM */ - case 0x39: /* FMLS */ - case 0x3a: /* FSUB */ - case 0x3e: /* FMIN */ - case 0x5b: /* FMUL */ - case 0x5c: /* FCMGE */ - case 0x5d: /* FACGE */ - case 0x5f: /* FDIV */ - case 0x7a: /* FABD */ - case 0x7c: /* FCMGT */ - case 0x7d: /* FACGT */ - g_assert_not_reached(); - } - - write_vec_element(s, tcg_res, rd, pass, MO_64); - } else { - /* Single */ - TCGv_i32 tcg_op1 = tcg_temp_new_i32(); - TCGv_i32 tcg_op2 = tcg_temp_new_i32(); - TCGv_i32 tcg_res = tcg_temp_new_i32(); - - read_vec_element_i32(s, tcg_op1, rn, pass, 
MO_32); - read_vec_element_i32(s, tcg_op2, rm, pass, MO_32); - - switch (fpopcode) { - case 0x1f: /* FRECPS */ - gen_helper_recpsf_f32(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x3f: /* FRSQRTS */ - gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst); - break; - default: - case 0x18: /* FMAXNM */ - case 0x19: /* FMLA */ - case 0x1a: /* FADD */ - case 0x1b: /* FMULX */ - case 0x1c: /* FCMEQ */ - case 0x1e: /* FMAX */ - case 0x38: /* FMINNM */ - case 0x39: /* FMLS */ - case 0x3a: /* FSUB */ - case 0x3e: /* FMIN */ - case 0x5b: /* FMUL */ - case 0x5c: /* FCMGE */ - case 0x5d: /* FACGE */ - case 0x5f: /* FDIV */ - case 0x7a: /* FABD */ - case 0x7c: /* FCMGT */ - case 0x7d: /* FACGT */ - g_assert_not_reached(); - } - - if (elements == 1) { - /* scalar single so clear high part */ - TCGv_i64 tcg_tmp = tcg_temp_new_i64(); - - tcg_gen_extu_i32_i64(tcg_tmp, tcg_res); - write_vec_element(s, tcg_tmp, rd, pass, MO_64); - } else { - write_vec_element_i32(s, tcg_res, rd, pass, MO_32); - } - } - } - - clear_vec_high(s, elements * (size ? 
8 : 4) > 8, rd); -} - /* AdvSIMD scalar three same * 31 30 29 28 24 23 22 21 20 16 15 11 10 9 5 4 0 * +-----+---+-----------+------+---+------+--------+---+------+------+ @@ -9425,33 +9352,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) bool u = extract32(insn, 29, 1); TCGv_i64 tcg_rd; - if (opcode >= 0x18) { - /* Floating point: U, size[1] and opcode indicate operation */ - int fpopcode = opcode | (extract32(size, 1, 1) << 5) | (u << 6); - switch (fpopcode) { - case 0x1f: /* FRECPS */ - case 0x3f: /* FRSQRTS */ - break; - default: - case 0x1b: /* FMULX */ - case 0x5d: /* FACGE */ - case 0x7d: /* FACGT */ - case 0x1c: /* FCMEQ */ - case 0x5c: /* FCMGE */ - case 0x7a: /* FABD */ - case 0x7c: /* FCMGT */ - unallocated_encoding(s); - return; - } - - if (!fp_access_check(s)) { - return; - } - - handle_3same_float(s, extract32(size, 0, 1), 1, fpopcode, rd, rn, rm); - return; - } - switch (opcode) { case 0x1: /* SQADD, UQADD */ case 0x5: /* SQSUB, UQSUB */ @@ -9568,80 +9468,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) write_fp_dreg(s, rd, tcg_rd); } -/* AdvSIMD scalar three same FP16 - * 31 30 29 28 24 23 22 21 20 16 15 14 13 11 10 9 5 4 0 - * +-----+---+-----------+---+-----+------+-----+--------+---+----+----+ - * | 0 1 | U | 1 1 1 1 0 | a | 1 0 | Rm | 0 0 | opcode | 1 | Rn | Rd | - * +-----+---+-----------+---+-----+------+-----+--------+---+----+----+ - * v: 0101 1110 0100 0000 0000 0100 0000 0000 => 5e400400 - * m: 1101 1111 0110 0000 1100 0100 0000 0000 => df60c400 - */ -static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s, - uint32_t insn) -{ - int rd = extract32(insn, 0, 5); - int rn = extract32(insn, 5, 5); - int opcode = extract32(insn, 11, 3); - int rm = extract32(insn, 16, 5); - bool u = extract32(insn, 29, 1); - bool a = extract32(insn, 23, 1); - int fpopcode = opcode | (a << 3) | (u << 4); - TCGv_ptr fpst; - TCGv_i32 tcg_op1; - TCGv_i32 tcg_op2; - TCGv_i32 tcg_res; - - 
switch (fpopcode) { - case 0x07: /* FRECPS */ - case 0x0f: /* FRSQRTS */ - break; - default: - case 0x03: /* FMULX */ - case 0x04: /* FCMEQ (reg) */ - case 0x14: /* FCMGE (reg) */ - case 0x15: /* FACGE */ - case 0x1a: /* FABD */ - case 0x1c: /* FCMGT (reg) */ - case 0x1d: /* FACGT */ - unallocated_encoding(s); - return; - } - - if (!dc_isar_feature(aa64_fp16, s)) { - unallocated_encoding(s); - } - - if (!fp_access_check(s)) { - return; - } - - fpst = fpstatus_ptr(FPST_FPCR_F16); - - tcg_op1 = read_fp_hreg(s, rn); - tcg_op2 = read_fp_hreg(s, rm); - tcg_res = tcg_temp_new_i32(); - - switch (fpopcode) { - case 0x07: /* FRECPS */ - gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x0f: /* FRSQRTS */ - gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; - default: - case 0x03: /* FMULX */ - case 0x04: /* FCMEQ (reg) */ - case 0x14: /* FCMGE (reg) */ - case 0x15: /* FACGE */ - case 0x1a: /* FABD */ - case 0x1c: /* FCMGT (reg) */ - case 0x1d: /* FACGT */ - g_assert_not_reached(); - } - - write_fp_sreg(s, rd, tcg_res); -} - /* AdvSIMD scalar three same extra * 31 30 29 28 24 23 22 21 20 16 15 14 11 10 9 5 4 0 * +-----+---+-----------+------+---+------+---+--------+---+----+----+ @@ -11114,7 +10940,7 @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn) /* Pairwise op subgroup of C3.6.16. * - * This is called directly or via the handle_3same_float for float pairwise + * This is called directly for float pairwise * operations where the opcode and size are calculated differently. */ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode, @@ -11271,10 +11097,6 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) int rn = extract32(insn, 5, 5); int rd = extract32(insn, 0, 5); - int datasize = is_q ? 
128 : 64; - int esize = 32 << size; - int elements = datasize / esize; - if (size == 1 && !is_q) { unallocated_encoding(s); return; @@ -11293,13 +11115,6 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) handle_simd_3same_pair(s, is_q, 0, fpopcode, size ? MO_64 : MO_32, rn, rm, rd); return; - case 0x1f: /* FRECPS */ - case 0x3f: /* FRSQRTS */ - if (!fp_access_check(s)) { - return; - } - handle_3same_float(s, size, elements, fpopcode, rd, rn, rm); - return; case 0x1d: /* FMLAL */ case 0x3d: /* FMLSL */ @@ -11328,10 +11143,12 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) case 0x1b: /* FMULX */ case 0x1c: /* FCMEQ */ case 0x1e: /* FMAX */ + case 0x1f: /* FRECPS */ case 0x38: /* FMINNM */ case 0x39: /* FMLS */ case 0x3a: /* FSUB */ case 0x3e: /* FMIN */ + case 0x3f: /* FRSQRTS */ case 0x5b: /* FMUL */ case 0x5c: /* FCMGE */ case 0x5d: /* FACGE */ @@ -11673,17 +11490,11 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) * together indicate the operation. */ int fpopcode = opcode | (a << 3) | (u << 4); - int datasize = is_q ? 
128 : 64; - int elements = datasize / 16; bool pairwise; TCGv_ptr fpst; int pass; switch (fpopcode) { - case 0x7: /* FRECPS */ - case 0xf: /* FRSQRTS */ - pairwise = false; - break; case 0x10: /* FMAXNMP */ case 0x12: /* FADDP */ case 0x16: /* FMAXP */ @@ -11698,10 +11509,12 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) case 0x3: /* FMULX */ case 0x4: /* FCMEQ */ case 0x6: /* FMAX */ + case 0x7: /* FRECPS */ case 0x8: /* FMINNM */ case 0x9: /* FMLS */ case 0xa: /* FSUB */ case 0xe: /* FMIN */ + case 0xf: /* FRSQRTS */ case 0x13: /* FMUL */ case 0x14: /* FCMGE */ case 0x15: /* FACGE */ @@ -11765,44 +11578,7 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) write_vec_element_i32(s, tcg_res[pass], rd, pass, MO_16); } } else { - for (pass = 0; pass < elements; pass++) { - TCGv_i32 tcg_op1 = tcg_temp_new_i32(); - TCGv_i32 tcg_op2 = tcg_temp_new_i32(); - TCGv_i32 tcg_res = tcg_temp_new_i32(); - - read_vec_element_i32(s, tcg_op1, rn, pass, MO_16); - read_vec_element_i32(s, tcg_op2, rm, pass, MO_16); - - switch (fpopcode) { - case 0x7: /* FRECPS */ - gen_helper_recpsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0xf: /* FRSQRTS */ - gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst); - break; - default: - case 0x0: /* FMAXNM */ - case 0x1: /* FMLA */ - case 0x2: /* FADD */ - case 0x3: /* FMULX */ - case 0x4: /* FCMEQ */ - case 0x6: /* FMAX */ - case 0x8: /* FMINNM */ - case 0x9: /* FMLS */ - case 0xa: /* FSUB */ - case 0xe: /* FMIN */ - case 0x13: /* FMUL */ - case 0x14: /* FCMGE */ - case 0x15: /* FACGE */ - case 0x17: /* FDIV */ - case 0x1a: /* FABD */ - case 0x1c: /* FCMGT */ - case 0x1d: /* FACGT */ - g_assert_not_reached(); - } - - write_vec_element_i32(s, tcg_res, rd, pass, MO_16); - } + g_assert_not_reached(); } clear_vec_high(s, is_q, rd); @@ -13572,7 +13348,6 @@ static const AArch64DecodeTable data_proc_simd[] = { { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm }, { 0x0e400400, 
0x9f60c400, disas_simd_three_reg_same_fp16 },
     { 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 },
-    { 0x5e400400, 0xdf60c400, disas_simd_scalar_three_reg_same_fp16 },
     { 0x00000000, 0x00000000, NULL }
 };

From patchwork Fri May 24 23:20:42 2024
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673799
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org, Peter Maydell
Subject: [PATCH v2 28/67] target/arm: Convert FADDP to decodetree
Date: Fri, 24 May 2024 16:20:42 -0700
Message-Id: <20240524232121.284515-29-richard.henderson@linaro.org>
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper.h | 4 ++
 target/arm/tcg/a64.decode | 12 +++++
 target/arm/tcg/translate-a64.c | 87 ++++++++++++++++++++++++++--------
 target/arm/tcg/vec_helper.c | 23 +++++++++
 4 files changed, 105 insertions(+), 21 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index ff6e3094f4..8441b49d1f 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -1048,6 +1048,10 @@ DEF_HELPER_FLAGS_5(gvec_uclamp_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_uclamp_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_5(gvec_faddp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_faddp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_faddp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "tcg/helper-a64.h" #include "tcg/helper-sve.h" diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 84cb38f1dd..d2a02365e1 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -29,6 +29,7 @@ &ri rd imm &rri_sf rd rn imm sf &i imm +&rr_e rd rn esz &rrr_e rd rn rm esz &rrx_e rd rn rm idx esz &qrr_e q rd rn esz @@ -36,6 +37,9 @@ &qrrx_e q rd rn rm idx esz &qrrrr_e q rd rn rm ra esz +@rr_h ........ ... ..... ...... rn:5 rd:5 &rr_e esz=1 +@rr_sd ........ ... ..... ...... rn:5 rd:5 &rr_e esz=%esz_sd + @rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1 @rrr_sd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_sd @rrr_hsd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_hsd @@ -737,6 +741,11 @@ FRECPS_s 0101 1110 0.1 ..... 11111 1 ..... ..... @rrr_sd FRSQRTS_s 0101 1110 110 ..... 00111 1 ..... ..... @rrr_h FRSQRTS_s 0101 1110 1.1 ..... 11111 1 ..... ..... @rrr_sd +### Advanced SIMD scalar pairwise + +FADDP_s 0101 1110 0011 0000 1101 10 ..... ..... @rr_h +FADDP_s 0111 1110 0.11 0000 1101 10 ..... ..... @rr_sd + ### Advanced SIMD three same FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h @@ -796,6 +805,9 @@ FRECPS_v 0.00 1110 0.1 ..... 11111 1 ..... ..... @qrrr_sd FRSQRTS_v 0.00 1110 110 ..... 00111 1 ..... ..... @qrrr_h FRSQRTS_v 0.00 1110 1.1 ..... 11111 1 ..... ..... @qrrr_sd +FADDP_v 0.10 1110 010 ..... 00010 1 ..... ..... @qrrr_h +FADDP_v 0.10 1110 0.1 ..... 11010 1 ..... ..... @qrrr_sd + ### Advanced SIMD scalar x indexed element FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... 
@rrx_h diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index a7537a5104..78949ab34f 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5210,6 +5210,13 @@ static gen_helper_gvec_3_ptr * const f_vector_frsqrts[3] = { }; TRANS(FRSQRTS_v, do_fp3_vector, a, f_vector_frsqrts) +static gen_helper_gvec_3_ptr * const f_vector_faddp[3] = { + gen_helper_gvec_faddp_h, + gen_helper_gvec_faddp_s, + gen_helper_gvec_faddp_d, +}; +TRANS(FADDP_v, do_fp3_vector, a, f_vector_faddp) + /* * Advanced SIMD scalar/vector x indexed element */ @@ -5395,6 +5402,56 @@ static bool do_fmla_vector_idx(DisasContext *s, arg_qrrx_e *a, bool neg) TRANS(FMLA_vi, do_fmla_vector_idx, a, false) TRANS(FMLS_vi, do_fmla_vector_idx, a, true) +/* + * Advanced SIMD scalar pairwise + */ + +static bool do_fp3_scalar_pair(DisasContext *s, arg_rr_e *a, const FPScalar *f) +{ + switch (a->esz) { + case MO_64: + if (fp_access_check(s)) { + TCGv_i64 t0 = tcg_temp_new_i64(); + TCGv_i64 t1 = tcg_temp_new_i64(); + + read_vec_element(s, t0, a->rn, 0, MO_64); + read_vec_element(s, t1, a->rn, 1, MO_64); + f->gen_d(t0, t0, t1, fpstatus_ptr(FPST_FPCR)); + write_fp_dreg(s, a->rd, t0); + } + break; + case MO_32: + if (fp_access_check(s)) { + TCGv_i32 t0 = tcg_temp_new_i32(); + TCGv_i32 t1 = tcg_temp_new_i32(); + + read_vec_element_i32(s, t0, a->rn, 0, MO_32); + read_vec_element_i32(s, t1, a->rn, 1, MO_32); + f->gen_s(t0, t0, t1, fpstatus_ptr(FPST_FPCR)); + write_fp_sreg(s, a->rd, t0); + } + break; + case MO_16: + if (!dc_isar_feature(aa64_fp16, s)) { + return false; + } + if (fp_access_check(s)) { + TCGv_i32 t0 = tcg_temp_new_i32(); + TCGv_i32 t1 = tcg_temp_new_i32(); + + read_vec_element_i32(s, t0, a->rn, 0, MO_16); + read_vec_element_i32(s, t1, a->rn, 1, MO_16); + f->gen_h(t0, t0, t1, fpstatus_ptr(FPST_FPCR_F16)); + write_fp_sreg(s, a->rd, t0); + } + break; + default: + g_assert_not_reached(); + } + return true; +} + +TRANS(FADDP_s, do_fp3_scalar_pair, a, 
&f_scalar_fadd) /* Shift a TCGv src by TCGv shift_amount, put result in dst. * Note that it is the caller's responsibility to ensure that the @@ -8357,7 +8414,6 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) fpst = NULL; break; case 0xc: /* FMAXNMP */ - case 0xd: /* FADDP */ case 0xf: /* FMAXP */ case 0x2c: /* FMINNMP */ case 0x2f: /* FMINP */ @@ -8380,6 +8436,7 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) fpst = fpstatus_ptr(size == MO_16 ? FPST_FPCR_F16 : FPST_FPCR); break; default: + case 0xd: /* FADDP */ unallocated_encoding(s); return; } @@ -8399,9 +8456,6 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) case 0xc: /* FMAXNMP */ gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0xd: /* FADDP */ - gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0xf: /* FMAXP */ gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -8412,6 +8466,7 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst); break; default: + case 0xd: /* FADDP */ g_assert_not_reached(); } @@ -8429,9 +8484,6 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) case 0xc: /* FMAXNMP */ gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0xd: /* FADDP */ - gen_helper_advsimd_addh(tcg_res, tcg_op1, tcg_op2, fpst); - break; case 0xf: /* FMAXP */ gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -8442,6 +8494,7 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst); break; default: + case 0xd: /* FADDP */ g_assert_not_reached(); } } else { @@ -8449,9 +8502,6 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) case 0xc: /* FMAXNMP */ gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst); break; - case 0xd: /* FADDP */ - gen_helper_vfp_adds(tcg_res, 
tcg_op1, tcg_op2, fpst); - break; case 0xf: /* FMAXP */ gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst); break; @@ -8462,6 +8512,7 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst); break; default: + case 0xd: /* FADDP */ g_assert_not_reached(); } } @@ -10982,9 +11033,6 @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode, case 0x58: /* FMAXNMP */ gen_helper_vfp_maxnumd(tcg_res[pass], tcg_op1, tcg_op2, fpst); break; - case 0x5a: /* FADDP */ - gen_helper_vfp_addd(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; case 0x5e: /* FMAXP */ gen_helper_vfp_maxd(tcg_res[pass], tcg_op1, tcg_op2, fpst); break; @@ -10995,6 +11043,7 @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode, gen_helper_vfp_mind(tcg_res[pass], tcg_op1, tcg_op2, fpst); break; default: + case 0x5a: /* FADDP */ g_assert_not_reached(); } } @@ -11052,9 +11101,6 @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode, case 0x58: /* FMAXNMP */ gen_helper_vfp_maxnums(tcg_res[pass], tcg_op1, tcg_op2, fpst); break; - case 0x5a: /* FADDP */ - gen_helper_vfp_adds(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; case 0x5e: /* FMAXP */ gen_helper_vfp_maxs(tcg_res[pass], tcg_op1, tcg_op2, fpst); break; @@ -11065,6 +11111,7 @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode, gen_helper_vfp_mins(tcg_res[pass], tcg_op1, tcg_op2, fpst); break; default: + case 0x5a: /* FADDP */ g_assert_not_reached(); } @@ -11104,7 +11151,6 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) switch (fpopcode) { case 0x58: /* FMAXNMP */ - case 0x5a: /* FADDP */ case 0x5e: /* FMAXP */ case 0x78: /* FMINNMP */ case 0x7e: /* FMINP */ @@ -11149,6 +11195,7 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) case 0x3a: /* FSUB */ case 0x3e: /* FMIN */ case 0x3f: /* FRSQRTS */ + case 0x5a: /* FADDP */ case 0x5b: 
/* FMUL */ case 0x5c: /* FCMGE */ case 0x5d: /* FACGE */ @@ -11496,7 +11543,6 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) switch (fpopcode) { case 0x10: /* FMAXNMP */ - case 0x12: /* FADDP */ case 0x16: /* FMAXP */ case 0x18: /* FMINNMP */ case 0x1e: /* FMINP */ @@ -11515,6 +11561,7 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) case 0xa: /* FSUB */ case 0xe: /* FMIN */ case 0xf: /* FRSQRTS */ + case 0x12: /* FADDP */ case 0x13: /* FMUL */ case 0x14: /* FCMGE */ case 0x15: /* FACGE */ @@ -11556,9 +11603,6 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) gen_helper_advsimd_maxnumh(tcg_res[pass], tcg_op1, tcg_op2, fpst); break; - case 0x12: /* FADDP */ - gen_helper_advsimd_addh(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; case 0x16: /* FMAXP */ gen_helper_advsimd_maxh(tcg_res[pass], tcg_op1, tcg_op2, fpst); break; @@ -11570,6 +11614,7 @@ static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) gen_helper_advsimd_minh(tcg_res[pass], tcg_op1, tcg_op2, fpst); break; default: + case 0x12: /* FADDP */ g_assert_not_reached(); } } diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index e9d7922f30..28989c7d7a 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -2237,6 +2237,29 @@ DO_NEON_PAIRWISE(neon_pmin, min) #undef DO_NEON_PAIRWISE +#define DO_3OP_PAIR(NAME, FUNC, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \ +{ \ + ARMVectorReg scratch; \ + intptr_t oprsz = simd_oprsz(desc); \ + intptr_t half = oprsz / sizeof(TYPE) / 2; \ + TYPE *d = vd, *n = vn, *m = vm; \ + if (unlikely(d == m)) { \ + m = memcpy(&scratch, m, oprsz); \ + } \ + for (intptr_t i = 0; i < half; ++i) { \ + d[H(i)] = FUNC(n[H(i * 2)], n[H(i * 2 + 1)], stat); \ + } \ + for (intptr_t i = 0; i < half; ++i) { \ + d[H(i + half)] = FUNC(m[H(i * 2)], m[H(i * 2 + 1)], stat); \ + } \ + clear_tail(d, oprsz, 
simd_maxsz(desc)); \ +} + +DO_3OP_PAIR(gvec_faddp_h, float16_add, float16, H2) +DO_3OP_PAIR(gvec_faddp_s, float32_add, float32, H4) +DO_3OP_PAIR(gvec_faddp_d, float64_add, float64, ) + #define DO_VCVT_FIXED(NAME, FUNC, TYPE) \ void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \ { \ From patchwork Fri May 24 23:20:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673809
From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 29/67] target/arm: Convert FMAXP, FMINP, FMAXNMP, FMINNMP to decodetree Date: Fri, 24 May 2024 16:20:43 -0700 Message-Id: <20240524232121.284515-30-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org>
These are the last instructions within disas_simd_three_reg_same_fp16, so remove it.
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 16 ++ target/arm/tcg/a64.decode | 24 +++ target/arm/tcg/translate-a64.c | 296 ++++++--------------------------- target/arm/tcg/vec_helper.c | 16 ++ 4 files changed, 107 insertions(+), 245 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 8441b49d1f..3268477329 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -1052,6 +1052,22 @@ DEF_HELPER_FLAGS_5(gvec_faddp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_faddp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_faddp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmaxp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmaxp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmaxp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(gvec_fminp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fminp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fminp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(gvec_fmaxnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmaxnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmaxnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(gvec_fminnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fminnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fminnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "tcg/helper-a64.h" #include "tcg/helper-sve.h" diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index d2a02365e1..43557fdccc 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -746,6 +746,18 @@ FRSQRTS_s 0101 1110 1.1 
..... 11111 1 ..... ..... @rrr_sd FADDP_s 0101 1110 0011 0000 1101 10 ..... ..... @rr_h FADDP_s 0111 1110 0.11 0000 1101 10 ..... ..... @rr_sd +FMAXP_s 0101 1110 0011 0000 1111 10 ..... ..... @rr_h +FMAXP_s 0111 1110 0.11 0000 1111 10 ..... ..... @rr_sd + +FMINP_s 0101 1110 1011 0000 1111 10 ..... ..... @rr_h +FMINP_s 0111 1110 1.11 0000 1111 10 ..... ..... @rr_sd + +FMAXNMP_s 0101 1110 0011 0000 1100 10 ..... ..... @rr_h +FMAXNMP_s 0111 1110 0.11 0000 1100 10 ..... ..... @rr_sd + +FMINNMP_s 0101 1110 1011 0000 1100 10 ..... ..... @rr_h +FMINNMP_s 0111 1110 1.11 0000 1100 10 ..... ..... @rr_sd + ### Advanced SIMD three same FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h @@ -808,6 +820,18 @@ FRSQRTS_v 0.00 1110 1.1 ..... 11111 1 ..... ..... @qrrr_sd FADDP_v 0.10 1110 010 ..... 00010 1 ..... ..... @qrrr_h FADDP_v 0.10 1110 0.1 ..... 11010 1 ..... ..... @qrrr_sd +FMAXP_v 0.10 1110 010 ..... 00110 1 ..... ..... @qrrr_h +FMAXP_v 0.10 1110 0.1 ..... 11110 1 ..... ..... @qrrr_sd + +FMINP_v 0.10 1110 110 ..... 00110 1 ..... ..... @qrrr_h +FMINP_v 0.10 1110 1.1 ..... 11110 1 ..... ..... @qrrr_sd + +FMAXNMP_v 0.10 1110 010 ..... 00000 1 ..... ..... @qrrr_h +FMAXNMP_v 0.10 1110 0.1 ..... 11000 1 ..... ..... @qrrr_sd + +FMINNMP_v 0.10 1110 110 ..... 00000 1 ..... ..... @qrrr_h +FMINNMP_v 0.10 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd + ### Advanced SIMD scalar x indexed element FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... 
@rrx_h diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 78949ab34f..07415bd285 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5217,6 +5217,34 @@ static gen_helper_gvec_3_ptr * const f_vector_faddp[3] = { }; TRANS(FADDP_v, do_fp3_vector, a, f_vector_faddp) +static gen_helper_gvec_3_ptr * const f_vector_fmaxp[3] = { + gen_helper_gvec_fmaxp_h, + gen_helper_gvec_fmaxp_s, + gen_helper_gvec_fmaxp_d, +}; +TRANS(FMAXP_v, do_fp3_vector, a, f_vector_fmaxp) + +static gen_helper_gvec_3_ptr * const f_vector_fminp[3] = { + gen_helper_gvec_fminp_h, + gen_helper_gvec_fminp_s, + gen_helper_gvec_fminp_d, +}; +TRANS(FMINP_v, do_fp3_vector, a, f_vector_fminp) + +static gen_helper_gvec_3_ptr * const f_vector_fmaxnmp[3] = { + gen_helper_gvec_fmaxnump_h, + gen_helper_gvec_fmaxnump_s, + gen_helper_gvec_fmaxnump_d, +}; +TRANS(FMAXNMP_v, do_fp3_vector, a, f_vector_fmaxnmp) + +static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = { + gen_helper_gvec_fminnump_h, + gen_helper_gvec_fminnump_s, + gen_helper_gvec_fminnump_d, +}; +TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp) + /* * Advanced SIMD scalar/vector x indexed element */ @@ -5452,6 +5480,10 @@ static bool do_fp3_scalar_pair(DisasContext *s, arg_rr_e *a, const FPScalar *f) } TRANS(FADDP_s, do_fp3_scalar_pair, a, &f_scalar_fadd) +TRANS(FMAXP_s, do_fp3_scalar_pair, a, &f_scalar_fmax) +TRANS(FMINP_s, do_fp3_scalar_pair, a, &f_scalar_fmin) +TRANS(FMAXNMP_s, do_fp3_scalar_pair, a, &f_scalar_fmaxnm) +TRANS(FMINNMP_s, do_fp3_scalar_pair, a, &f_scalar_fminnm) /* Shift a TCGv src by TCGv shift_amount, put result in dst. * Note that it is the caller's responsibility to ensure that the @@ -8393,7 +8425,6 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) int opcode = extract32(insn, 12, 5); int rn = extract32(insn, 5, 5); int rd = extract32(insn, 0, 5); - TCGv_ptr fpst; /* For some ops (the FP ones), size[1] is part of the encoding. 
* For ADDP strictly it is not but size[1] is always 1 for valid @@ -8410,33 +8441,13 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) if (!fp_access_check(s)) { return; } - - fpst = NULL; break; + default: case 0xc: /* FMAXNMP */ + case 0xd: /* FADDP */ case 0xf: /* FMAXP */ case 0x2c: /* FMINNMP */ case 0x2f: /* FMINP */ - /* FP op, size[0] is 32 or 64 bit*/ - if (!u) { - if ((size & 1) || !dc_isar_feature(aa64_fp16, s)) { - unallocated_encoding(s); - return; - } else { - size = MO_16; - } - } else { - size = extract32(size, 0, 1) ? MO_64 : MO_32; - } - - if (!fp_access_check(s)) { - return; - } - - fpst = fpstatus_ptr(size == MO_16 ? FPST_FPCR_F16 : FPST_FPCR); - break; - default: - case 0xd: /* FADDP */ unallocated_encoding(s); return; } @@ -8453,71 +8464,18 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) case 0x3b: /* ADDP */ tcg_gen_add_i64(tcg_res, tcg_op1, tcg_op2); break; - case 0xc: /* FMAXNMP */ - gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0xf: /* FMAXP */ - gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x2c: /* FMINNMP */ - gen_helper_vfp_minnumd(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x2f: /* FMINP */ - gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst); - break; default: + case 0xc: /* FMAXNMP */ case 0xd: /* FADDP */ + case 0xf: /* FMAXP */ + case 0x2c: /* FMINNMP */ + case 0x2f: /* FMINP */ g_assert_not_reached(); } write_fp_dreg(s, rd, tcg_res); } else { - TCGv_i32 tcg_op1 = tcg_temp_new_i32(); - TCGv_i32 tcg_op2 = tcg_temp_new_i32(); - TCGv_i32 tcg_res = tcg_temp_new_i32(); - - read_vec_element_i32(s, tcg_op1, rn, 0, size); - read_vec_element_i32(s, tcg_op2, rn, 1, size); - - if (size == MO_16) { - switch (opcode) { - case 0xc: /* FMAXNMP */ - gen_helper_advsimd_maxnumh(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0xf: /* FMAXP */ - gen_helper_advsimd_maxh(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x2c: /* FMINNMP */ - 
gen_helper_advsimd_minnumh(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x2f: /* FMINP */ - gen_helper_advsimd_minh(tcg_res, tcg_op1, tcg_op2, fpst); - break; - default: - case 0xd: /* FADDP */ - g_assert_not_reached(); - } - } else { - switch (opcode) { - case 0xc: /* FMAXNMP */ - gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0xf: /* FMAXP */ - gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x2c: /* FMINNMP */ - gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst); - break; - case 0x2f: /* FMINP */ - gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst); - break; - default: - case 0xd: /* FADDP */ - g_assert_not_reached(); - } - } - - write_fp_sreg(s, rd, tcg_res); + g_assert_not_reached(); } } @@ -10997,16 +10955,8 @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn) static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode, int size, int rn, int rm, int rd) { - TCGv_ptr fpst; int pass; - /* Floating point operations need fpst */ - if (opcode >= 0x58) { - fpst = fpstatus_ptr(FPST_FPCR); - } else { - fpst = NULL; - } - if (!fp_access_check(s)) { return; } @@ -11030,20 +10980,12 @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode, case 0x17: /* ADDP */ tcg_gen_add_i64(tcg_res[pass], tcg_op1, tcg_op2); break; - case 0x58: /* FMAXNMP */ - gen_helper_vfp_maxnumd(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; - case 0x5e: /* FMAXP */ - gen_helper_vfp_maxd(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; - case 0x78: /* FMINNMP */ - gen_helper_vfp_minnumd(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; - case 0x7e: /* FMINP */ - gen_helper_vfp_mind(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; default: + case 0x58: /* FMAXNMP */ case 0x5a: /* FADDP */ + case 0x5e: /* FMAXP */ + case 0x78: /* FMINNMP */ + case 0x7e: /* FMINP */ g_assert_not_reached(); } } @@ -11097,21 +11039,12 @@ static void handle_simd_3same_pair(DisasContext *s, int 
is_q, int u, int opcode, genfn = fns[size][u]; break; } - /* The FP operations are all on single floats (32 bit) */ - case 0x58: /* FMAXNMP */ - gen_helper_vfp_maxnums(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; - case 0x5e: /* FMAXP */ - gen_helper_vfp_maxs(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; - case 0x78: /* FMINNMP */ - gen_helper_vfp_minnums(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; - case 0x7e: /* FMINP */ - gen_helper_vfp_mins(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; default: + case 0x58: /* FMAXNMP */ case 0x5a: /* FADDP */ + case 0x5e: /* FMAXP */ + case 0x78: /* FMINNMP */ + case 0x7e: /* FMINP */ g_assert_not_reached(); } @@ -11150,18 +11083,6 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) } switch (fpopcode) { - case 0x58: /* FMAXNMP */ - case 0x5e: /* FMAXP */ - case 0x78: /* FMINNMP */ - case 0x7e: /* FMINP */ - if (size && !is_q) { - unallocated_encoding(s); - return; - } - handle_simd_3same_pair(s, is_q, 0, fpopcode, size ? 
MO_64 : MO_32, - rn, rm, rd); - return; - case 0x1d: /* FMLAL */ case 0x3d: /* FMLSL */ case 0x59: /* FMLAL2 */ @@ -11195,14 +11116,18 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) case 0x3a: /* FSUB */ case 0x3e: /* FMIN */ case 0x3f: /* FRSQRTS */ + case 0x58: /* FMAXNMP */ case 0x5a: /* FADDP */ case 0x5b: /* FMUL */ case 0x5c: /* FCMGE */ case 0x5d: /* FACGE */ + case 0x5e: /* FMAXP */ case 0x5f: /* FDIV */ + case 0x78: /* FMINNMP */ case 0x7a: /* FABD */ case 0x7d: /* FACGT */ case 0x7c: /* FCMGT */ + case 0x7e: /* FMINP */ unallocated_encoding(s); return; } @@ -11511,124 +11436,6 @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn) } } -/* - * Advanced SIMD three same (ARMv8.2 FP16 variants) - * - * 31 30 29 28 24 23 22 21 20 16 15 14 13 11 10 9 5 4 0 - * +---+---+---+-----------+---------+------+-----+--------+---+------+------+ - * | 0 | Q | U | 0 1 1 1 0 | a | 1 0 | Rm | 0 0 | opcode | 1 | Rn | Rd | - * +---+---+---+-----------+---------+------+-----+--------+---+------+------+ - * - * This includes FMULX, FCMEQ (register), FRECPS, FRSQRTS, FCMGE - * (register), FACGE, FABD, FCMGT (register) and FACGT. - * - */ -static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn) -{ - int opcode = extract32(insn, 11, 3); - int u = extract32(insn, 29, 1); - int a = extract32(insn, 23, 1); - int is_q = extract32(insn, 30, 1); - int rm = extract32(insn, 16, 5); - int rn = extract32(insn, 5, 5); - int rd = extract32(insn, 0, 5); - /* - * For these floating point ops, the U, a and opcode bits - * together indicate the operation. 
- */ - int fpopcode = opcode | (a << 3) | (u << 4); - bool pairwise; - TCGv_ptr fpst; - int pass; - - switch (fpopcode) { - case 0x10: /* FMAXNMP */ - case 0x16: /* FMAXP */ - case 0x18: /* FMINNMP */ - case 0x1e: /* FMINP */ - pairwise = true; - break; - default: - case 0x0: /* FMAXNM */ - case 0x1: /* FMLA */ - case 0x2: /* FADD */ - case 0x3: /* FMULX */ - case 0x4: /* FCMEQ */ - case 0x6: /* FMAX */ - case 0x7: /* FRECPS */ - case 0x8: /* FMINNM */ - case 0x9: /* FMLS */ - case 0xa: /* FSUB */ - case 0xe: /* FMIN */ - case 0xf: /* FRSQRTS */ - case 0x12: /* FADDP */ - case 0x13: /* FMUL */ - case 0x14: /* FCMGE */ - case 0x15: /* FACGE */ - case 0x17: /* FDIV */ - case 0x1a: /* FABD */ - case 0x1c: /* FCMGT */ - case 0x1d: /* FACGT */ - unallocated_encoding(s); - return; - } - - if (!dc_isar_feature(aa64_fp16, s)) { - unallocated_encoding(s); - return; - } - - if (!fp_access_check(s)) { - return; - } - - fpst = fpstatus_ptr(FPST_FPCR_F16); - - if (pairwise) { - int maxpass = is_q ? 8 : 4; - TCGv_i32 tcg_op1 = tcg_temp_new_i32(); - TCGv_i32 tcg_op2 = tcg_temp_new_i32(); - TCGv_i32 tcg_res[8]; - - for (pass = 0; pass < maxpass; pass++) { - int passreg = pass < (maxpass / 2) ? 
rn : rm; - int passelt = (pass << 1) & (maxpass - 1); - - read_vec_element_i32(s, tcg_op1, passreg, passelt, MO_16); - read_vec_element_i32(s, tcg_op2, passreg, passelt + 1, MO_16); - tcg_res[pass] = tcg_temp_new_i32(); - - switch (fpopcode) { - case 0x10: /* FMAXNMP */ - gen_helper_advsimd_maxnumh(tcg_res[pass], tcg_op1, tcg_op2, - fpst); - break; - case 0x16: /* FMAXP */ - gen_helper_advsimd_maxh(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; - case 0x18: /* FMINNMP */ - gen_helper_advsimd_minnumh(tcg_res[pass], tcg_op1, tcg_op2, - fpst); - break; - case 0x1e: /* FMINP */ - gen_helper_advsimd_minh(tcg_res[pass], tcg_op1, tcg_op2, fpst); - break; - default: - case 0x12: /* FADDP */ - g_assert_not_reached(); - } - } - - for (pass = 0; pass < maxpass; pass++) { - write_vec_element_i32(s, tcg_res[pass], rd, pass, MO_16); - } - } else { - g_assert_not_reached(); - } - - clear_vec_high(s, is_q, rd); -} - /* AdvSIMD three same extra * 31 30 29 28 24 23 22 21 20 16 15 14 11 10 9 5 4 0 * +---+---+---+-----------+------+---+------+---+--------+---+----+----+ @@ -13391,7 +13198,6 @@ static const AArch64DecodeTable data_proc_simd[] = { { 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise }, { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */ { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm }, - { 0x0e400400, 0x9f60c400, disas_simd_three_reg_same_fp16 }, { 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 }, { 0x00000000, 0x00000000, NULL } }; diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index 28989c7d7a..79e1fdcaa9 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -2260,6 +2260,22 @@ DO_3OP_PAIR(gvec_faddp_h, float16_add, float16, H2) DO_3OP_PAIR(gvec_faddp_s, float32_add, float32, H4) DO_3OP_PAIR(gvec_faddp_d, float64_add, float64, ) +DO_3OP_PAIR(gvec_fmaxp_h, float16_max, float16, H2) +DO_3OP_PAIR(gvec_fmaxp_s, float32_max, float32, H4) +DO_3OP_PAIR(gvec_fmaxp_d, float64_max, float64, ) + 
+DO_3OP_PAIR(gvec_fminp_h, float16_min, float16, H2) +DO_3OP_PAIR(gvec_fminp_s, float32_min, float32, H4) +DO_3OP_PAIR(gvec_fminp_d, float64_min, float64, ) + +DO_3OP_PAIR(gvec_fmaxnump_h, float16_maxnum, float16, H2) +DO_3OP_PAIR(gvec_fmaxnump_s, float32_maxnum, float32, H4) +DO_3OP_PAIR(gvec_fmaxnump_d, float64_maxnum, float64, ) + +DO_3OP_PAIR(gvec_fminnump_h, float16_minnum, float16, H2) +DO_3OP_PAIR(gvec_fminnump_s, float32_minnum, float32, H4) +DO_3OP_PAIR(gvec_fminnump_d, float64_minnum, float64, ) + #define DO_VCVT_FIXED(NAME, FUNC, TYPE) \ void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \ { \ From patchwork Fri May 24 23:20:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673797
From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 30/67] target/arm: Use gvec for neon faddp, fmaxp, fminp Date: Fri, 24 May 2024 16:20:44 -0700 Message-Id: <20240524232121.284515-31-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org>
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 7 ----- target/arm/tcg/translate-neon.c | 55 ++------------------------------- target/arm/tcg/vec_helper.c | 45 --------------------------- 3 files changed, 3 insertions(+), 104 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 3268477329..065460ea80 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -650,13 +650,6 @@ DEF_HELPER_FLAGS_6(gvec_fcmlas_idx, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(gvec_fcmlad, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32)
-DEF_HELPER_FLAGS_5(neon_paddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(neon_pmaxh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(neon_pminh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(neon_padds, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(neon_pmaxs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(neon_pmins, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) - DEF_HELPER_FLAGS_4(gvec_sstoh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_sitos, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_ustoh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c index 144f18ba22..2326a05a0a 100644 --- a/target/arm/tcg/translate-neon.c +++ b/target/arm/tcg/translate-neon.c @@ -1144,6 +1144,9 @@ DO_3S_FP_GVEC(VFMA, gen_helper_gvec_vfma_s, gen_helper_gvec_vfma_h) DO_3S_FP_GVEC(VFMS, gen_helper_gvec_vfms_s, gen_helper_gvec_vfms_h) DO_3S_FP_GVEC(VRECPS, gen_helper_gvec_recps_nf_s, gen_helper_gvec_recps_nf_h) DO_3S_FP_GVEC(VRSQRTS, gen_helper_gvec_rsqrts_nf_s, gen_helper_gvec_rsqrts_nf_h) +DO_3S_FP_GVEC(VPADD, gen_helper_gvec_faddp_s, gen_helper_gvec_faddp_h) +DO_3S_FP_GVEC(VPMAX, gen_helper_gvec_fmaxp_s, gen_helper_gvec_fmaxp_h) +DO_3S_FP_GVEC(VPMIN, gen_helper_gvec_fminp_s, gen_helper_gvec_fminp_h) WRAP_FP_GVEC(gen_VMAXNM_fp32_3s, FPST_STD, gen_helper_gvec_fmaxnum_s) WRAP_FP_GVEC(gen_VMAXNM_fp16_3s, FPST_STD_F16, gen_helper_gvec_fmaxnum_h) @@ -1180,58 +1183,6 @@ static bool trans_VMINNM_fp_3s(DisasContext *s, arg_3same *a) return do_3same(s, a, gen_VMINNM_fp32_3s); } -static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, - gen_helper_gvec_3_ptr *fn) -{ - /* FP pairwise operations */ - TCGv_ptr fpstatus; - - if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { - return false; - } - - /* UNDEF accesses to D16-D31 if they don't exist. 
*/ - if (!dc_isar_feature(aa32_simd_r32, s) && - ((a->vd | a->vn | a->vm) & 0x10)) { - return false; - } - - if (!vfp_access_check(s)) { - return true; - } - - assert(a->q == 0); /* enforced by decode patterns */ - - - fpstatus = fpstatus_ptr(a->size == MO_16 ? FPST_STD_F16 : FPST_STD); - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd), - vfp_reg_offset(1, a->vn), - vfp_reg_offset(1, a->vm), - fpstatus, 8, 8, 0, fn); - - return true; -} - -/* - * For all the functions using this macro, size == 1 means fp16, - * which is an architecture extension we don't implement yet. - */ -#define DO_3S_FP_PAIR(INSN,FUNC) \ - static bool trans_##INSN##_fp_3s(DisasContext *s, arg_3same *a) \ - { \ - if (a->size == MO_16) { \ - if (!dc_isar_feature(aa32_fp16_arith, s)) { \ - return false; \ - } \ - return do_3same_fp_pair(s, a, FUNC##h); \ - } \ - return do_3same_fp_pair(s, a, FUNC##s); \ - } - -DO_3S_FP_PAIR(VPADD, gen_helper_neon_padd) -DO_3S_FP_PAIR(VPMAX, gen_helper_neon_pmax) -DO_3S_FP_PAIR(VPMIN, gen_helper_neon_pmin) - static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn) { /* Handle a 2-reg-shift insn which can be vectorized. 
*/ diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index 79e1fdcaa9..26a9ca9c14 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -2192,51 +2192,6 @@ DO_ABA(gvec_uaba_d, uint64_t) #undef DO_ABA -#define DO_NEON_PAIRWISE(NAME, OP) \ - void HELPER(NAME##s)(void *vd, void *vn, void *vm, \ - void *stat, uint32_t oprsz) \ - { \ - float_status *fpst = stat; \ - float32 *d = vd; \ - float32 *n = vn; \ - float32 *m = vm; \ - float32 r0, r1; \ - \ - /* Read all inputs before writing outputs in case vm == vd */ \ - r0 = float32_##OP(n[H4(0)], n[H4(1)], fpst); \ - r1 = float32_##OP(m[H4(0)], m[H4(1)], fpst); \ - \ - d[H4(0)] = r0; \ - d[H4(1)] = r1; \ - } \ - \ - void HELPER(NAME##h)(void *vd, void *vn, void *vm, \ - void *stat, uint32_t oprsz) \ - { \ - float_status *fpst = stat; \ - float16 *d = vd; \ - float16 *n = vn; \ - float16 *m = vm; \ - float16 r0, r1, r2, r3; \ - \ - /* Read all inputs before writing outputs in case vm == vd */ \ - r0 = float16_##OP(n[H2(0)], n[H2(1)], fpst); \ - r1 = float16_##OP(n[H2(2)], n[H2(3)], fpst); \ - r2 = float16_##OP(m[H2(0)], m[H2(1)], fpst); \ - r3 = float16_##OP(m[H2(2)], m[H2(3)], fpst); \ - \ - d[H2(0)] = r0; \ - d[H2(1)] = r1; \ - d[H2(2)] = r2; \ - d[H2(3)] = r3; \ - } - -DO_NEON_PAIRWISE(neon_padd, add) -DO_NEON_PAIRWISE(neon_pmax, max) -DO_NEON_PAIRWISE(neon_pmin, min) - -#undef DO_NEON_PAIRWISE - #define DO_3OP_PAIR(NAME, FUNC, TYPE, H) \ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \ { \ From patchwork Fri May 24 23:20:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673814 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client 
certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E10EAC25B74 for ; Fri, 24 May 2024 23:27:26 +0000 (UTC) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 31/67] target/arm: Convert ADDP to decodetree Date: Fri, 24 May 2024 16:20:45 -0700 Message-Id: <20240524232121.284515-32-richard.henderson@linaro.org> In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> 
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 5 ++ target/arm/tcg/translate.h | 3 + target/arm/tcg/a64.decode | 6 ++ target/arm/tcg/gengvec.c | 12 ++++ target/arm/tcg/translate-a64.c | 128 ++++++--------------------------- target/arm/tcg/vec_helper.c | 30 ++++++++ 6 files changed, 77 insertions(+), 107 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 065460ea80..d3579a101f 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -1061,6 +1061,11 @@ DEF_HELPER_FLAGS_5(gvec_fminnump_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i DEF_HELPER_FLAGS_5(gvec_fminnump_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fminnump_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_addp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_addp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_addp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_addp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "tcg/helper-a64.h" #include "tcg/helper-sve.h" diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h index b05a9eb668..04771f483b 100644 --- a/target/arm/tcg/translate.h +++ b/target/arm/tcg/translate.h @@ -514,6 +514,9 @@ void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_addp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. 
*/ diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 43557fdccc..84f5bcc0e0 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -38,6 +38,7 @@ &qrrrr_e q rd rn rm ra esz @rr_h ........ ... ..... ...... rn:5 rd:5 &rr_e esz=1 +@rr_d ........ ... ..... ...... rn:5 rd:5 &rr_e esz=3 @rr_sd ........ ... ..... ...... rn:5 rd:5 &rr_e esz=%esz_sd @rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1 @@ -56,6 +57,7 @@ @qrrr_h . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=1 @qrrr_sd . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=%esz_sd +@qrrr_e . q:1 ...... esz:2 . rm:5 ...... rn:5 rd:5 &qrrr_e @qrrx_h . q:1 .. .... .. .. rm:4 .... . . rn:5 rd:5 \ &qrrx_e esz=1 idx=%hlm @@ -758,6 +760,8 @@ FMAXNMP_s 0111 1110 0.11 0000 1100 10 ..... ..... @rr_sd FMINNMP_s 0101 1110 1011 0000 1100 10 ..... ..... @rr_h FMINNMP_s 0111 1110 1.11 0000 1100 10 ..... ..... @rr_sd +ADDP_s 0101 1110 1111 0001 1011 10 ..... ..... @rr_d + ### Advanced SIMD three same FADD_v 0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h @@ -832,6 +836,8 @@ FMAXNMP_v 0.10 1110 0.1 ..... 11000 1 ..... ..... @qrrr_sd FMINNMP_v 0.10 1110 110 ..... 00000 1 ..... ..... @qrrr_h FMINNMP_v 0.10 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd +ADDP_v 0.00 1110 ..1 ..... 10111 1 ..... ..... @qrrr_e + ### Advanced SIMD scalar x indexed element FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... 
@rrx_h diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c index 7a1856253f..f010dd5a0e 100644 --- a/target/arm/tcg/gengvec.c +++ b/target/arm/tcg/gengvec.c @@ -1610,3 +1610,15 @@ void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, }; tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } + +void gen_gvec_addp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_addp_b, + gen_helper_gvec_addp_h, + gen_helper_gvec_addp_s, + gen_helper_gvec_addp_d, + }; + tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]); +} diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 07415bd285..b8add91112 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5245,6 +5245,8 @@ static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = { }; TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp) +TRANS(ADDP_v, do_gvec_fn3, a, gen_gvec_addp) + /* * Advanced SIMD scalar/vector x indexed element */ @@ -5485,6 +5487,20 @@ TRANS(FMINP_s, do_fp3_scalar_pair, a, &f_scalar_fmin) TRANS(FMAXNMP_s, do_fp3_scalar_pair, a, &f_scalar_fmaxnm) TRANS(FMINNMP_s, do_fp3_scalar_pair, a, &f_scalar_fminnm) +static bool trans_ADDP_s(DisasContext *s, arg_rr_e *a) +{ + if (fp_access_check(s)) { + TCGv_i64 t0 = tcg_temp_new_i64(); + TCGv_i64 t1 = tcg_temp_new_i64(); + + read_vec_element(s, t0, a->rn, 0, MO_64); + read_vec_element(s, t1, a->rn, 1, MO_64); + tcg_gen_add_i64(t0, t0, t1); + write_fp_dreg(s, a->rd, t0); + } + return true; +} + /* Shift a TCGv src by TCGv shift_amount, put result in dst. 
* Note that it is the caller's responsibility to ensure that the * shift amount is in range (ie 0..31 or 0..63) and provide the ARM @@ -8412,73 +8428,6 @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn) } } -/* AdvSIMD scalar pairwise - * 31 30 29 28 24 23 22 21 17 16 12 11 10 9 5 4 0 - * +-----+---+-----------+------+-----------+--------+-----+------+------+ - * | 0 1 | U | 1 1 1 1 0 | size | 1 1 0 0 0 | opcode | 1 0 | Rn | Rd | - * +-----+---+-----------+------+-----------+--------+-----+------+------+ - */ -static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn) -{ - int u = extract32(insn, 29, 1); - int size = extract32(insn, 22, 2); - int opcode = extract32(insn, 12, 5); - int rn = extract32(insn, 5, 5); - int rd = extract32(insn, 0, 5); - - /* For some ops (the FP ones), size[1] is part of the encoding. - * For ADDP strictly it is not but size[1] is always 1 for valid - * encodings. - */ - opcode |= (extract32(size, 1, 1) << 5); - - switch (opcode) { - case 0x3b: /* ADDP */ - if (u || size != 3) { - unallocated_encoding(s); - return; - } - if (!fp_access_check(s)) { - return; - } - break; - default: - case 0xc: /* FMAXNMP */ - case 0xd: /* FADDP */ - case 0xf: /* FMAXP */ - case 0x2c: /* FMINNMP */ - case 0x2f: /* FMINP */ - unallocated_encoding(s); - return; - } - - if (size == MO_64) { - TCGv_i64 tcg_op1 = tcg_temp_new_i64(); - TCGv_i64 tcg_op2 = tcg_temp_new_i64(); - TCGv_i64 tcg_res = tcg_temp_new_i64(); - - read_vec_element(s, tcg_op1, rn, 0, MO_64); - read_vec_element(s, tcg_op2, rn, 1, MO_64); - - switch (opcode) { - case 0x3b: /* ADDP */ - tcg_gen_add_i64(tcg_res, tcg_op1, tcg_op2); - break; - default: - case 0xc: /* FMAXNMP */ - case 0xd: /* FADDP */ - case 0xf: /* FMAXP */ - case 0x2c: /* FMINNMP */ - case 0x2f: /* FMINP */ - g_assert_not_reached(); - } - - write_fp_dreg(s, rd, tcg_res); - } else { - g_assert_not_reached(); - } -} - /* * Common SSHR[RA]/USHR[RA] - Shift right (optional rounding/accumulate) * @@ 
-10965,34 +10914,7 @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode, * adjacent elements being operated on to produce an element in the result. */ if (size == 3) { - TCGv_i64 tcg_res[2]; - - for (pass = 0; pass < 2; pass++) { - TCGv_i64 tcg_op1 = tcg_temp_new_i64(); - TCGv_i64 tcg_op2 = tcg_temp_new_i64(); - int passreg = (pass == 0) ? rn : rm; - - read_vec_element(s, tcg_op1, passreg, 0, MO_64); - read_vec_element(s, tcg_op2, passreg, 1, MO_64); - tcg_res[pass] = tcg_temp_new_i64(); - - switch (opcode) { - case 0x17: /* ADDP */ - tcg_gen_add_i64(tcg_res[pass], tcg_op1, tcg_op2); - break; - default: - case 0x58: /* FMAXNMP */ - case 0x5a: /* FADDP */ - case 0x5e: /* FMAXP */ - case 0x78: /* FMINNMP */ - case 0x7e: /* FMINP */ - g_assert_not_reached(); - } - } - - for (pass = 0; pass < 2; pass++) { - write_vec_element(s, tcg_res[pass], rd, pass, MO_64); - } + g_assert_not_reached(); } else { int maxpass = is_q ? 4 : 2; TCGv_i32 tcg_res[4]; @@ -11009,16 +10931,6 @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode, tcg_res[pass] = tcg_temp_new_i32(); switch (opcode) { - case 0x17: /* ADDP */ - { - static NeonGenTwoOpFn * const fns[3] = { - gen_helper_neon_padd_u8, - gen_helper_neon_padd_u16, - tcg_gen_add_i32, - }; - genfn = fns[size]; - break; - } case 0x14: /* SMAXP, UMAXP */ { static NeonGenTwoOpFn * const fns[3][2] = { @@ -11040,6 +10952,7 @@ static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode, break; } default: + case 0x17: /* ADDP */ case 0x58: /* FMAXNMP */ case 0x5a: /* FADDP */ case 0x5e: /* FMAXP */ @@ -11401,7 +11314,6 @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn) case 0x3: /* logic ops */ disas_simd_3same_logic(s, insn); break; - case 0x17: /* ADDP */ case 0x14: /* SMAXP, UMAXP */ case 0x15: /* SMINP, UMINP */ { @@ -11433,6 +11345,9 @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn) default: disas_simd_3same_int(s, 
insn); break; + case 0x17: /* ADDP */ + unallocated_encoding(s); + break; } } @@ -13195,7 +13110,6 @@ static const AArch64DecodeTable data_proc_simd[] = { { 0x5e008400, 0xdf208400, disas_simd_scalar_three_reg_same_extra }, { 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff }, { 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc }, - { 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise }, { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */ { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm }, { 0x0e780800, 0x8f7e0c00, disas_simd_two_reg_misc_fp16 }, diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index 26a9ca9c14..5069899415 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -2231,6 +2231,36 @@ DO_3OP_PAIR(gvec_fminnump_h, float16_minnum, float16, H2) DO_3OP_PAIR(gvec_fminnump_s, float32_minnum, float32, H4) DO_3OP_PAIR(gvec_fminnump_d, float64_minnum, float64, ) +#undef DO_3OP_PAIR + +#define DO_3OP_PAIR(NAME, FUNC, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + ARMVectorReg scratch; \ + intptr_t oprsz = simd_oprsz(desc); \ + intptr_t half = oprsz / sizeof(TYPE) / 2; \ + TYPE *d = vd, *n = vn, *m = vm; \ + if (unlikely(d == m)) { \ + m = memcpy(&scratch, m, oprsz); \ + } \ + for (intptr_t i = 0; i < half; ++i) { \ + d[H(i)] = FUNC(n[H(i * 2)], n[H(i * 2 + 1)]); \ + } \ + for (intptr_t i = 0; i < half; ++i) { \ + d[H(i + half)] = FUNC(m[H(i * 2)], m[H(i * 2 + 1)]); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +#define ADD(A, B) (A + B) +DO_3OP_PAIR(gvec_addp_b, ADD, uint8_t, H1) +DO_3OP_PAIR(gvec_addp_h, ADD, uint16_t, H2) +DO_3OP_PAIR(gvec_addp_s, ADD, uint32_t, H4) +DO_3OP_PAIR(gvec_addp_d, ADD, uint64_t, ) +#undef ADD + +#undef DO_3OP_PAIR + #define DO_VCVT_FIXED(NAME, FUNC, TYPE) \ void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \ { \ From patchwork Fri May 24 23:20:46 2024 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673813 Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f44c759ceesm19178305ad.10.2024.05.24.16.21.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:21:50 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 32/67] target/arm: Use gvec for neon padd Date: Fri, 24 May 2024 16:20:46 -0700 Message-Id: <20240524232121.284515-33-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62b; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 2 -- target/arm/tcg/neon_helper.c | 5 ----- target/arm/tcg/translate-neon.c | 3 +-- 3 files changed, 1 insertion(+), 9 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index d3579a101f..51ed49aa50 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -354,8 +354,6 @@ DEF_HELPER_3(neon_qrshl_s64, i64, env, i64, i64) DEF_HELPER_2(neon_add_u8, i32, i32, i32) DEF_HELPER_2(neon_add_u16, i32, i32, i32) -DEF_HELPER_2(neon_padd_u8, i32, i32, i32) -DEF_HELPER_2(neon_padd_u16, i32, i32, i32) 
DEF_HELPER_2(neon_sub_u8, i32, i32, i32) DEF_HELPER_2(neon_sub_u16, i32, i32, i32) DEF_HELPER_2(neon_mul_u8, i32, i32, i32) diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c index bc6c4a54e9..a0b51c8809 100644 --- a/target/arm/tcg/neon_helper.c +++ b/target/arm/tcg/neon_helper.c @@ -745,11 +745,6 @@ uint32_t HELPER(neon_add_u16)(uint32_t a, uint32_t b) return (a + b) ^ mask; } -#define NEON_FN(dest, src1, src2) dest = src1 + src2 -NEON_POP(padd_u8, neon_u8, 4) -NEON_POP(padd_u16, neon_u16, 2) -#undef NEON_FN - #define NEON_FN(dest, src1, src2) dest = src1 - src2 NEON_VOP(sub_u8, neon_u8, 4) NEON_VOP(sub_u16, neon_u16, 2) diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c index 2326a05a0a..6c5a7a98e1 100644 --- a/target/arm/tcg/translate-neon.c +++ b/target/arm/tcg/translate-neon.c @@ -830,6 +830,7 @@ DO_3SAME_NO_SZ_3(VABD_S, gen_gvec_sabd) DO_3SAME_NO_SZ_3(VABA_S, gen_gvec_saba) DO_3SAME_NO_SZ_3(VABD_U, gen_gvec_uabd) DO_3SAME_NO_SZ_3(VABA_U, gen_gvec_uaba) +DO_3SAME_NO_SZ_3(VPADD, gen_gvec_addp) #define DO_3SAME_CMP(INSN, COND) \ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ @@ -1070,13 +1071,11 @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn) #define gen_helper_neon_pmax_u32 tcg_gen_umax_i32 #define gen_helper_neon_pmin_s32 tcg_gen_smin_i32 #define gen_helper_neon_pmin_u32 tcg_gen_umin_i32 -#define gen_helper_neon_padd_u32 tcg_gen_add_i32 DO_3SAME_PAIR(VPMAX_S, pmax_s) DO_3SAME_PAIR(VPMIN_S, pmin_s) DO_3SAME_PAIR(VPMAX_U, pmax_u) DO_3SAME_PAIR(VPMIN_U, pmin_u) -DO_3SAME_PAIR(VPADD, padd_u) #define DO_3SAME_VQDMULH(INSN, FUNC) \ WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16); \ From patchwork Fri May 24 23:20:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673789 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 33/67] target/arm: Convert SMAXP, SMINP, UMAXP, UMINP to decodetree Date: Fri, 24 May 2024 16:20:47 -0700 Message-Id: <20240524232121.284515-34-richard.henderson@linaro.org> In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, 
SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org These are the last instructions within handle_simd_3same_pair so remove it. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 16 +++++ target/arm/tcg/translate.h | 8 +++ target/arm/tcg/a64.decode | 4 ++ target/arm/tcg/gengvec.c | 48 +++++++++++++ target/arm/tcg/translate-a64.c | 119 +++++---------------------------- target/arm/tcg/vec_helper.c | 16 +++++ 6 files changed, 109 insertions(+), 102 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 51ed49aa50..f830531dd3 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -1064,6 +1064,22 @@ DEF_HELPER_FLAGS_4(gvec_addp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_addp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_addp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smaxp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smaxp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smaxp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_sminp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sminp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sminp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_umaxp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umaxp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umaxp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_uminp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_4(gvec_uminp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uminp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "tcg/helper-a64.h" #include "tcg/helper-sve.h" diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h index 04771f483b..3abdbedfe5 100644 --- a/target/arm/tcg/translate.h +++ b/target/arm/tcg/translate.h @@ -516,6 +516,14 @@ void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void gen_gvec_addp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_smaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_umaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_uminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); /* * Forward to the isar_feature_* tests given a DisasContext pointer. diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 84f5bcc0e0..22dfe8568d 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -837,6 +837,10 @@ FMINNMP_v 0.10 1110 110 ..... 00000 1 ..... ..... @qrrr_h FMINNMP_v 0.10 1110 1.1 ..... 11000 1 ..... ..... @qrrr_sd ADDP_v 0.00 1110 ..1 ..... 10111 1 ..... ..... @qrrr_e +SMAXP_v 0.00 1110 ..1 ..... 10100 1 ..... ..... @qrrr_e +SMINP_v 0.00 1110 ..1 ..... 10101 1 ..... ..... @qrrr_e +UMAXP_v 0.10 1110 ..1 ..... 10100 1 ..... ..... @qrrr_e +UMINP_v 0.10 1110 ..1 ..... 10101 1 ..... ..... 
@qrrr_e ### Advanced SIMD scalar x indexed element diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c index f010dd5a0e..22c9d17dce 100644 --- a/target/arm/tcg/gengvec.c +++ b/target/arm/tcg/gengvec.c @@ -1622,3 +1622,51 @@ void gen_gvec_addp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, }; tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]); } + +void gen_gvec_smaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_smaxp_b, + gen_helper_gvec_smaxp_h, + gen_helper_gvec_smaxp_s, + }; + tcg_debug_assert(vece <= MO_32); + tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]); +} + +void gen_gvec_sminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_sminp_b, + gen_helper_gvec_sminp_h, + gen_helper_gvec_sminp_s, + }; + tcg_debug_assert(vece <= MO_32); + tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]); +} + +void gen_gvec_umaxp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_umaxp_b, + gen_helper_gvec_umaxp_h, + gen_helper_gvec_umaxp_s, + }; + tcg_debug_assert(vece <= MO_32); + tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]); +} + +void gen_gvec_uminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_uminp_b, + gen_helper_gvec_uminp_h, + gen_helper_gvec_uminp_s, + }; + tcg_debug_assert(vece <= MO_32); + tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]); +} diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index b8add91112..9fe70a939b 100644 --- 
a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -1352,6 +1352,17 @@ static bool do_gvec_fn3(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn) return true; } +static bool do_gvec_fn3_no64(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn) +{ + if (a->esz == MO_64) { + return false; + } + if (fp_access_check(s)) { + gen_gvec_fn3(s, a->q, a->rd, a->rn, a->rm, fn, a->esz); + } + return true; +} + static bool do_gvec_fn4(DisasContext *s, arg_qrrrr_e *a, GVecGen4Fn *fn) { if (!a->q && a->esz == MO_64) { @@ -5246,6 +5257,10 @@ static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = { TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp) TRANS(ADDP_v, do_gvec_fn3, a, gen_gvec_addp) +TRANS(SMAXP_v, do_gvec_fn3_no64, a, gen_gvec_smaxp) +TRANS(SMINP_v, do_gvec_fn3_no64, a, gen_gvec_sminp) +TRANS(UMAXP_v, do_gvec_fn3_no64, a, gen_gvec_umaxp) +TRANS(UMINP_v, do_gvec_fn3_no64, a, gen_gvec_uminp) /* * Advanced SIMD scalar/vector x indexed element @@ -10896,84 +10911,6 @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn) } } -/* Pairwise op subgroup of C3.6.16. - * - * This is called directly for float pairwise - * operations where the opcode and size are calculated differently. - */ -static void handle_simd_3same_pair(DisasContext *s, int is_q, int u, int opcode, - int size, int rn, int rm, int rd) -{ - int pass; - - if (!fp_access_check(s)) { - return; - } - - /* These operations work on the concatenated rm:rn, with each pair of - * adjacent elements being operated on to produce an element in the result. - */ - if (size == 3) { - g_assert_not_reached(); - } else { - int maxpass = is_q ? 4 : 2; - TCGv_i32 tcg_res[4]; - - for (pass = 0; pass < maxpass; pass++) { - TCGv_i32 tcg_op1 = tcg_temp_new_i32(); - TCGv_i32 tcg_op2 = tcg_temp_new_i32(); - NeonGenTwoOpFn *genfn = NULL; - int passreg = pass < (maxpass / 2) ? rn : rm; - int passelt = (is_q && (pass & 1)) ? 
2 : 0; - - read_vec_element_i32(s, tcg_op1, passreg, passelt, MO_32); - read_vec_element_i32(s, tcg_op2, passreg, passelt + 1, MO_32); - tcg_res[pass] = tcg_temp_new_i32(); - - switch (opcode) { - case 0x14: /* SMAXP, UMAXP */ - { - static NeonGenTwoOpFn * const fns[3][2] = { - { gen_helper_neon_pmax_s8, gen_helper_neon_pmax_u8 }, - { gen_helper_neon_pmax_s16, gen_helper_neon_pmax_u16 }, - { tcg_gen_smax_i32, tcg_gen_umax_i32 }, - }; - genfn = fns[size][u]; - break; - } - case 0x15: /* SMINP, UMINP */ - { - static NeonGenTwoOpFn * const fns[3][2] = { - { gen_helper_neon_pmin_s8, gen_helper_neon_pmin_u8 }, - { gen_helper_neon_pmin_s16, gen_helper_neon_pmin_u16 }, - { tcg_gen_smin_i32, tcg_gen_umin_i32 }, - }; - genfn = fns[size][u]; - break; - } - default: - case 0x17: /* ADDP */ - case 0x58: /* FMAXNMP */ - case 0x5a: /* FADDP */ - case 0x5e: /* FMAXP */ - case 0x78: /* FMINNMP */ - case 0x7e: /* FMINP */ - g_assert_not_reached(); - } - - /* FP ops called directly, otherwise call now */ - if (genfn) { - genfn(tcg_res[pass], tcg_op1, tcg_op2); - } - } - - for (pass = 0; pass < maxpass; pass++) { - write_vec_element_i32(s, tcg_res[pass], rd, pass, MO_32); - } - clear_vec_high(s, is_q, rd); - } -} - /* Floating point op subgroup of C3.6.16. 
*/ static void disas_simd_3same_float(DisasContext *s, uint32_t insn) { @@ -11314,30 +11251,6 @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn) case 0x3: /* logic ops */ disas_simd_3same_logic(s, insn); break; - case 0x14: /* SMAXP, UMAXP */ - case 0x15: /* SMINP, UMINP */ - { - /* Pairwise operations */ - int is_q = extract32(insn, 30, 1); - int u = extract32(insn, 29, 1); - int size = extract32(insn, 22, 2); - int rm = extract32(insn, 16, 5); - int rn = extract32(insn, 5, 5); - int rd = extract32(insn, 0, 5); - if (opcode == 0x17) { - if (u || (size == 3 && !is_q)) { - unallocated_encoding(s); - return; - } - } else { - if (size == 3) { - unallocated_encoding(s); - return; - } - } - handle_simd_3same_pair(s, is_q, u, opcode, size, rn, rm, rd); - break; - } case 0x18 ... 0x31: /* floating point ops, sz[1] and U are part of opcode */ disas_simd_3same_float(s, insn); @@ -11345,6 +11258,8 @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn) default: disas_simd_3same_int(s, insn); break; + case 0x14: /* SMAXP, UMAXP */ + case 0x15: /* SMINP, UMINP */ case 0x17: /* ADDP */ unallocated_encoding(s); break; diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index 5069899415..56fea14edb 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -2259,6 +2259,22 @@ DO_3OP_PAIR(gvec_addp_s, ADD, uint32_t, H4) DO_3OP_PAIR(gvec_addp_d, ADD, uint64_t, ) #undef ADD +DO_3OP_PAIR(gvec_smaxp_b, MAX, int8_t, H1) +DO_3OP_PAIR(gvec_smaxp_h, MAX, int16_t, H2) +DO_3OP_PAIR(gvec_smaxp_s, MAX, int32_t, H4) + +DO_3OP_PAIR(gvec_umaxp_b, MAX, uint8_t, H1) +DO_3OP_PAIR(gvec_umaxp_h, MAX, uint16_t, H2) +DO_3OP_PAIR(gvec_umaxp_s, MAX, uint32_t, H4) + +DO_3OP_PAIR(gvec_sminp_b, MIN, int8_t, H1) +DO_3OP_PAIR(gvec_sminp_h, MIN, int16_t, H2) +DO_3OP_PAIR(gvec_sminp_s, MIN, int32_t, H4) + +DO_3OP_PAIR(gvec_uminp_b, MIN, uint8_t, H1) +DO_3OP_PAIR(gvec_uminp_h, MIN, uint16_t, H2) +DO_3OP_PAIR(gvec_uminp_s, MIN, uint32_t, 
H4) + #undef DO_3OP_PAIR #define DO_VCVT_FIXED(NAME, FUNC, TYPE) \
From patchwork Fri May 24 23:20:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673841 From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 34/67] target/arm: Use gvec for neon pmax, pmin Date: Fri, 24 May 2024 16:20:48 -0700 Message-Id: <20240524232121.284515-35-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/tcg/translate-neon.c | 78 ++------------------------------- 1 file changed, 4 insertions(+), 74 deletions(-) diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c index 6c5a7a98e1..18b048611b 100644 --- a/target/arm/tcg/translate-neon.c +++ b/target/arm/tcg/translate-neon.c @@ -831,6 +831,10 @@ DO_3SAME_NO_SZ_3(VABA_S, gen_gvec_saba) DO_3SAME_NO_SZ_3(VABD_U, gen_gvec_uabd) DO_3SAME_NO_SZ_3(VABA_U, gen_gvec_uaba) DO_3SAME_NO_SZ_3(VPADD, gen_gvec_addp) +DO_3SAME_NO_SZ_3(VPMAX_S, gen_gvec_smaxp)
+DO_3SAME_NO_SZ_3(VPMIN_S, gen_gvec_sminp) +DO_3SAME_NO_SZ_3(VPMAX_U, gen_gvec_umaxp) +DO_3SAME_NO_SZ_3(VPMIN_U, gen_gvec_uminp) #define DO_3SAME_CMP(INSN, COND) \ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ @@ -1003,80 +1007,6 @@ DO_3SAME_32_ENV(VQSHL_U, qshl_u) DO_3SAME_32_ENV(VQRSHL_S, qrshl_s) DO_3SAME_32_ENV(VQRSHL_U, qrshl_u) -static bool do_3same_pair(DisasContext *s, arg_3same *a, NeonGenTwoOpFn *fn) -{ - /* Operations handled pairwise 32 bits at a time */ - TCGv_i32 tmp, tmp2, tmp3; - - if (!arm_dc_feature(s, ARM_FEATURE_NEON)) { - return false; - } - - /* UNDEF accesses to D16-D31 if they don't exist. */ - if (!dc_isar_feature(aa32_simd_r32, s) && - ((a->vd | a->vn | a->vm) & 0x10)) { - return false; - } - - if (a->size == 3) { - return false; - } - - if (!vfp_access_check(s)) { - return true; - } - - assert(a->q == 0); /* enforced by decode patterns */ - - /* - * Note that we have to be careful not to clobber the source operands - * in the "vm == vd" case by storing the result of the first pass too - * early. Since Q is 0 there are always just two passes, so instead - * of a complicated loop over each pass we just unroll. 
- */ - tmp = tcg_temp_new_i32(); - tmp2 = tcg_temp_new_i32(); - tmp3 = tcg_temp_new_i32(); - - read_neon_element32(tmp, a->vn, 0, MO_32); - read_neon_element32(tmp2, a->vn, 1, MO_32); - fn(tmp, tmp, tmp2); - - read_neon_element32(tmp3, a->vm, 0, MO_32); - read_neon_element32(tmp2, a->vm, 1, MO_32); - fn(tmp3, tmp3, tmp2); - - write_neon_element32(tmp, a->vd, 0, MO_32); - write_neon_element32(tmp3, a->vd, 1, MO_32); - - return true; -} - -#define DO_3SAME_PAIR(INSN, func) \ - static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \ - { \ - static NeonGenTwoOpFn * const fns[] = { \ - gen_helper_neon_##func##8, \ - gen_helper_neon_##func##16, \ - gen_helper_neon_##func##32, \ - }; \ - if (a->size > 2) { \ - return false; \ - } \ - return do_3same_pair(s, a, fns[a->size]); \ - } - -/* 32-bit pairwise ops end up the same as the elementwise versions. */ -#define gen_helper_neon_pmax_s32 tcg_gen_smax_i32 -#define gen_helper_neon_pmax_u32 tcg_gen_umax_i32 -#define gen_helper_neon_pmin_s32 tcg_gen_smin_i32 -#define gen_helper_neon_pmin_u32 tcg_gen_umin_i32 - -DO_3SAME_PAIR(VPMAX_S, pmax_s) -DO_3SAME_PAIR(VPMIN_S, pmin_s) -DO_3SAME_PAIR(VPMAX_U, pmax_u) -DO_3SAME_PAIR(VPMIN_U, pmin_u) - #define DO_3SAME_VQDMULH(INSN, FUNC) \ WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16); \ WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32); \
From patchwork Fri May 24 23:20:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673851 From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 35/67] target/arm: Convert FMLAL, FMLSL to decodetree Date: Fri, 24 May 2024 16:20:49 -0700 Message-Id: <20240524232121.284515-36-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/tcg/a64.decode | 10 +++ target/arm/tcg/translate-a64.c | 144 ++++++++++----------------------- 2 files changed, 51 insertions(+), 103 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 22dfe8568d..7e993ed345 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -797,6 +797,11 @@ FMLA_v 0.00 1110 0.1 ..... 11001 1 ..... ..... @qrrr_sd FMLS_v 0.00 1110 110 ..... 00001 1 ..... ..... @qrrr_h FMLS_v 0.00 1110 1.1 ..... 11001 1 ..... ..... @qrrr_sd +FMLAL_v 0.00 1110 001 ..... 11101 1 ..... ..... @qrrr_h +FMLSL_v 0.00 1110 101 ..... 11101 1 ..... ..... @qrrr_h +FMLAL2_v 0.10 1110 001 ..... 11001 1 ..... ..... @qrrr_h +FMLSL2_v 0.10 1110 101 ..... 11001 1 ..... ..... @qrrr_h + FCMEQ_v 0.00 1110 010 ..... 00100 1 ..... ..... @qrrr_h FCMEQ_v 0.00 1110 0.1 ..... 11100 1 ..... ..... @qrrr_sd @@ -877,3 +882,8 @@ FMLS_vi 0.00 1111 11 0 ..... 0101 . 0 ..... ..... @qrrx_d FMULX_vi 0.10 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h FMULX_vi 0.10 1111 10 . ..... 1001 . 0 ..... ..... @qrrx_s FMULX_vi 0.10 1111 11 0 ..... 1001 . 0 ..... ..... @qrrx_d + +FMLAL_vi 0.00 1111 10 .. .... 0000 . 0 ..... ..... @qrrx_h +FMLSL_vi 0.00 1111 10 .. .... 0100 . 0 ..... ..... @qrrx_h +FMLAL2_vi 0.10 1111 10 .. .... 1000 . 0 ..... ..... @qrrx_h +FMLSL2_vi 0.10 1111 10 .. .... 1100 . 0 ..... .....
@qrrx_h diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 9fe70a939b..a4ff1fd202 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5256,6 +5256,24 @@ static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = { }; TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp) +static bool do_fmlal(DisasContext *s, arg_qrrr_e *a, bool is_s, bool is_2) +{ + if (fp_access_check(s)) { + int data = (is_2 << 1) | is_s; + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), tcg_env, + a->q ? 16 : 8, vec_full_reg_size(s), + data, gen_helper_gvec_fmlal_a64); + } + return true; +} + +TRANS_FEAT(FMLAL_v, aa64_fhm, do_fmlal, a, false, false) +TRANS_FEAT(FMLSL_v, aa64_fhm, do_fmlal, a, true, false) +TRANS_FEAT(FMLAL2_v, aa64_fhm, do_fmlal, a, false, true) +TRANS_FEAT(FMLSL2_v, aa64_fhm, do_fmlal, a, true, true) + TRANS(ADDP_v, do_gvec_fn3, a, gen_gvec_addp) TRANS(SMAXP_v, do_gvec_fn3_no64, a, gen_gvec_smaxp) TRANS(SMINP_v, do_gvec_fn3_no64, a, gen_gvec_sminp) @@ -5447,6 +5465,24 @@ static bool do_fmla_vector_idx(DisasContext *s, arg_qrrx_e *a, bool neg) TRANS(FMLA_vi, do_fmla_vector_idx, a, false) TRANS(FMLS_vi, do_fmla_vector_idx, a, true) +static bool do_fmlal_idx(DisasContext *s, arg_qrrx_e *a, bool is_s, bool is_2) +{ + if (fp_access_check(s)) { + int data = (a->idx << 2) | (is_2 << 1) | is_s; + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), tcg_env, + a->q ? 
16 : 8, vec_full_reg_size(s), + data, gen_helper_gvec_fmlal_idx_a64); + } + return true; +} + +TRANS_FEAT(FMLAL_vi, aa64_fhm, do_fmlal_idx, a, false, false) +TRANS_FEAT(FMLSL_vi, aa64_fhm, do_fmlal_idx, a, true, false) +TRANS_FEAT(FMLAL2_vi, aa64_fhm, do_fmlal_idx, a, false, true) +TRANS_FEAT(FMLSL2_vi, aa64_fhm, do_fmlal_idx, a, true, true) + /* * Advanced SIMD scalar pairwise */ @@ -10911,78 +10947,6 @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn) } } -/* Floating point op subgroup of C3.6.16. */ -static void disas_simd_3same_float(DisasContext *s, uint32_t insn) -{ - /* For floating point ops, the U, size[1] and opcode bits - * together indicate the operation. size[0] indicates single - * or double. - */ - int fpopcode = extract32(insn, 11, 5) - | (extract32(insn, 23, 1) << 5) - | (extract32(insn, 29, 1) << 6); - int is_q = extract32(insn, 30, 1); - int size = extract32(insn, 22, 1); - int rm = extract32(insn, 16, 5); - int rn = extract32(insn, 5, 5); - int rd = extract32(insn, 0, 5); - - if (size == 1 && !is_q) { - unallocated_encoding(s); - return; - } - - switch (fpopcode) { - case 0x1d: /* FMLAL */ - case 0x3d: /* FMLSL */ - case 0x59: /* FMLAL2 */ - case 0x79: /* FMLSL2 */ - if (size & 1 || !dc_isar_feature(aa64_fhm, s)) { - unallocated_encoding(s); - return; - } - if (fp_access_check(s)) { - int is_s = extract32(insn, 23, 1); - int is_2 = extract32(insn, 29, 1); - int data = (is_2 << 1) | is_s; - tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), tcg_env, - is_q ? 
16 : 8, vec_full_reg_size(s), - data, gen_helper_gvec_fmlal_a64); - } - return; - - default: - case 0x18: /* FMAXNM */ - case 0x19: /* FMLA */ - case 0x1a: /* FADD */ - case 0x1b: /* FMULX */ - case 0x1c: /* FCMEQ */ - case 0x1e: /* FMAX */ - case 0x1f: /* FRECPS */ - case 0x38: /* FMINNM */ - case 0x39: /* FMLS */ - case 0x3a: /* FSUB */ - case 0x3e: /* FMIN */ - case 0x3f: /* FRSQRTS */ - case 0x58: /* FMAXNMP */ - case 0x5a: /* FADDP */ - case 0x5b: /* FMUL */ - case 0x5c: /* FCMGE */ - case 0x5d: /* FACGE */ - case 0x5e: /* FMAXP */ - case 0x5f: /* FDIV */ - case 0x78: /* FMINNMP */ - case 0x7a: /* FABD */ - case 0x7d: /* FACGT */ - case 0x7c: /* FCMGT */ - case 0x7e: /* FMINP */ - unallocated_encoding(s); - return; - } -} - /* Integer op subgroup of C3.6.16. */ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) { @@ -11251,16 +11215,13 @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn) case 0x3: /* logic ops */ disas_simd_3same_logic(s, insn); break; - case 0x18 ... 0x31: - /* floating point ops, sz[1] and U are part of opcode */ - disas_simd_3same_float(s, insn); - break; default: disas_simd_3same_int(s, insn); break; case 0x14: /* SMAXP, UMAXP */ case 0x15: /* SMINP, UMINP */ case 0x17: /* ADDP */ + case 0x18 ... 0x31: /* floating point ops */ unallocated_encoding(s); break; } @@ -12526,22 +12487,15 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } is_fp = 2; break; - case 0x00: /* FMLAL */ - case 0x04: /* FMLSL */ - case 0x18: /* FMLAL2 */ - case 0x1c: /* FMLSL2 */ - if (is_scalar || size != MO_32 || !dc_isar_feature(aa64_fhm, s)) { - unallocated_encoding(s); - return; - } - size = MO_16; - /* is_fp, but we pass tcg_env not fp_status. 
*/ - break; default: + case 0x00: /* FMLAL */ case 0x01: /* FMLA */ + case 0x04: /* FMLSL */ case 0x05: /* FMLS */ case 0x09: /* FMUL */ + case 0x18: /* FMLAL2 */ case 0x19: /* FMULX */ + case 0x1c: /* FMLSL2 */ unallocated_encoding(s); return; } @@ -12660,22 +12614,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } return; - case 0x00: /* FMLAL */ - case 0x04: /* FMLSL */ - case 0x18: /* FMLAL2 */ - case 0x1c: /* FMLSL2 */ - { - int is_s = extract32(opcode, 2, 1); - int is_2 = u; - int data = (index << 2) | (is_2 << 1) | is_s; - tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), tcg_env, - is_q ? 16 : 8, vec_full_reg_size(s), - data, gen_helper_gvec_fmlal_idx_a64); - } - return; - case 0x08: /* MUL */ if (!is_long && !is_scalar) { static gen_helper_gvec_3 * const fns[3] = {
From patchwork Fri May 24 23:20:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673820
From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell Subject: [PATCH v2 36/67] target/arm: Convert disas_simd_3same_logic to decodetree Date: Fri, 24 May 2024 16:20:50 -0700 Message-Id: <20240524232121.284515-37-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0
This includes AND, ORR, EOR, BIC, ORN, BSL, BIT, BIF. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/tcg/a64.decode | 10 +++++ target/arm/tcg/translate-a64.c | 68 ++++++++++------------------------ 2 files changed, 29 insertions(+), 49 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 7e993ed345..f48adef5bb 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -55,6 +55,7 @@ @rrr_q1e3 ........ ... rm:5 ......
rn:5 rd:5 &qrrr_e q=1 esz=3 @rrrr_q1e3 ........ ... rm:5 . ra:5 rn:5 rd:5 &qrrrr_e q=1 esz=3 +@qrrr_b . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=0 @qrrr_h . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=1 @qrrr_sd . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=%esz_sd @qrrr_e . q:1 ...... esz:2 . rm:5 ...... rn:5 rd:5 &qrrr_e @@ -847,6 +848,15 @@ SMINP_v 0.00 1110 ..1 ..... 10101 1 ..... ..... @qrrr_e UMAXP_v 0.10 1110 ..1 ..... 10100 1 ..... ..... @qrrr_e UMINP_v 0.10 1110 ..1 ..... 10101 1 ..... ..... @qrrr_e +AND_v 0.00 1110 001 ..... 00011 1 ..... ..... @qrrr_b +BIC_v 0.00 1110 011 ..... 00011 1 ..... ..... @qrrr_b +ORR_v 0.00 1110 101 ..... 00011 1 ..... ..... @qrrr_b +ORN_v 0.00 1110 111 ..... 00011 1 ..... ..... @qrrr_b +EOR_v 0.10 1110 001 ..... 00011 1 ..... ..... @qrrr_b +BSL_v 0.10 1110 011 ..... 00011 1 ..... ..... @qrrr_b +BIT_v 0.10 1110 101 ..... 00011 1 ..... ..... @qrrr_b +BIF_v 0.10 1110 111 ..... 00011 1 ..... ..... @qrrr_b + ### Advanced SIMD scalar x indexed element FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... 
@rrx_h diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index a4ff1fd202..9167e4d0bd 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5280,6 +5280,24 @@ TRANS(SMINP_v, do_gvec_fn3_no64, a, gen_gvec_sminp) TRANS(UMAXP_v, do_gvec_fn3_no64, a, gen_gvec_umaxp) TRANS(UMINP_v, do_gvec_fn3_no64, a, gen_gvec_uminp) +TRANS(AND_v, do_gvec_fn3, a, tcg_gen_gvec_and) +TRANS(BIC_v, do_gvec_fn3, a, tcg_gen_gvec_andc) +TRANS(ORR_v, do_gvec_fn3, a, tcg_gen_gvec_or) +TRANS(ORN_v, do_gvec_fn3, a, tcg_gen_gvec_orc) +TRANS(EOR_v, do_gvec_fn3, a, tcg_gen_gvec_xor) + +static bool do_bitsel(DisasContext *s, bool is_q, int d, int a, int b, int c) +{ + if (fp_access_check(s)) { + gen_gvec_fn4(s, is_q, d, a, b, c, tcg_gen_gvec_bitsel, 0); + } + return true; +} + +TRANS(BSL_v, do_bitsel, a->q, a->rd, a->rd, a->rn, a->rm) +TRANS(BIT_v, do_bitsel, a->q, a->rd, a->rm, a->rn, a->rd) +TRANS(BIF_v, do_bitsel, a->q, a->rd, a->rm, a->rd, a->rn) + /* * Advanced SIMD scalar/vector x indexed element */ @@ -10901,52 +10919,6 @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn) } } -/* Logic op (opcode == 3) subgroup of C3.6.16. 
*/ -static void disas_simd_3same_logic(DisasContext *s, uint32_t insn) -{ - int rd = extract32(insn, 0, 5); - int rn = extract32(insn, 5, 5); - int rm = extract32(insn, 16, 5); - int size = extract32(insn, 22, 2); - bool is_u = extract32(insn, 29, 1); - bool is_q = extract32(insn, 30, 1); - - if (!fp_access_check(s)) { - return; - } - - switch (size + 4 * is_u) { - case 0: /* AND */ - gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_and, 0); - return; - case 1: /* BIC */ - gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_andc, 0); - return; - case 2: /* ORR */ - gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_or, 0); - return; - case 3: /* ORN */ - gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_orc, 0); - return; - case 4: /* EOR */ - gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_xor, 0); - return; - - case 5: /* BSL bitwise select */ - gen_gvec_fn4(s, is_q, rd, rd, rn, rm, tcg_gen_gvec_bitsel, 0); - return; - case 6: /* BIT, bitwise insert if true */ - gen_gvec_fn4(s, is_q, rd, rm, rn, rd, tcg_gen_gvec_bitsel, 0); - return; - case 7: /* BIF, bitwise insert if false */ - gen_gvec_fn4(s, is_q, rd, rm, rd, rn, tcg_gen_gvec_bitsel, 0); - return; - - default: - g_assert_not_reached(); - } -} - /* Integer op subgroup of C3.6.16. 
*/ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) { @@ -11212,12 +11184,10 @@ static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn) int opcode = extract32(insn, 11, 5); switch (opcode) { - case 0x3: /* logic ops */ - disas_simd_3same_logic(s, insn); - break; default: disas_simd_3same_int(s, insn); break; + case 0x3: /* logic ops */ case 0x14: /* SMAXP, UMAXP */ case 0x15: /* SMINP, UMINP */ case 0x17: /* ADDP */ From patchwork Fri May 24 23:20:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673822 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D37D0C25B74 for ; Fri, 24 May 2024 23:28:06 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeEi-0006wQ-Os; Fri, 24 May 2024 19:22:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeEe-0006re-1T for qemu-devel@nongnu.org; Fri, 24 May 2024 19:22:04 -0400 Received: from mail-pl1-x633.google.com ([2607:f8b0:4864:20::633]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeEW-0005uq-MQ for qemu-devel@nongnu.org; Fri, 24 May 2024 19:22:03 -0400 Received: by mail-pl1-x633.google.com with SMTP id d9443c01a7336-1f44b441b08so10590955ad.0 for ; Fri, 24 May 2024 16:21:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716592915; x=1717197715; darn=nongnu.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=XbPodXVfNYDOBeFPuI23bmSPTaq8W6PthVSB/YmIGHc=; b=LIio8tpkFHmCFh/CZYa0aKC//FJdoSM70UeRA6MTM8m4lGmamNiNtKIwiM4yXZQj5a aCrtOkJkU3SdxoRpwKHBCaQbLnAsNQc+DwsvCG1oIdYELmnWGe/elH7lR2/aKGpkaq8t RhfVJPWPsqCj9lMgRdtvlXbdTOim9wl2eSk2d0hEURdmRg6RTz5l/vA3D5M2eJLNOiMA Gp02IdYjg0bPEZq3B9B//WBIDvn5awAjtL1DkA5J0zU1hww88TBh765XmINrOl2wXJ8n zpZJZYfz2fiQ0CMAgQceVMdDEQLkS0lZYGftOz7k4f4+bigcc6goLoX5SBtawUL9eXxd Ik+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716592915; x=1717197715; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=XbPodXVfNYDOBeFPuI23bmSPTaq8W6PthVSB/YmIGHc=; b=A3ANFHpKv3OT7uVUT479vzxXR9E8nQhoXEL8uzZ2Luw7rDrFcfdgF5rbmIcQ97WwKJ ds0vAWlRIMCN8AdX4rmnwcAU8yO373+PwIPkLT6uX8iwJtViKYAR4LvCxVqJGeSkxmUl N/7Z2uMOx7YOmvLP7e2JF69BPceC/eTR2ghzY+O6d465mhYP6UWc2Pw/xjQJf2MZcmcd euLyiNqNYJC3Lu4d5oAXr+s1CWKKPITVqRTorXuVs0DRWLr0uqRsSWWKR5c6WScZRMe4 /UWos622oPmytYMvVYbHNn2e93SsHtu60fpoUDFoCozAnBK13sNeMSM/BAB4bXQMY0Oo GN8g== X-Gm-Message-State: AOJu0YxV/xGh2lnx++pn4gWAme/nF+1ncoXyLgcNLLJoIJ+dRLc2LvlS Btgfan+GPfQR7rkjF4J2BoSS9a7GI36FuU93aTZIEcUH9770RTLPSk8qOXQrRBn2euJF8nBE9rw n X-Google-Smtp-Source: AGHT+IF3SrQGjMcMJFjXNJOM5qCJNRZzuiW3xiuw55iZjSWRlYjpl04TrXMJLNoAZl0BgQWQ4xMCyw== X-Received: by 2002:a17:902:f688:b0:1f3:704:8304 with SMTP id d9443c01a7336-1f4486e5ef5mr41421055ad.9.1716592915162; Fri, 24 May 2024 16:21:55 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f44c759ceesm19178305ad.10.2024.05.24.16.21.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:21:54 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 37/67] target/arm: Improve vector UQADD, UQSUB, SQADD, SQSUB Date: Fri, 24 May 2024 16:20:51 -0700 Message-Id: <20240524232121.284515-38-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::633; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x633.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org No need for a full comparison; xor produces non-zero bits for QC just fine. 
Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/gengvec.c | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c index 22c9d17dce..bfe6885a01 100644 --- a/target/arm/tcg/gengvec.c +++ b/target/arm/tcg/gengvec.c @@ -1217,21 +1217,21 @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } -static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, +static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { TCGv_vec x = tcg_temp_new_vec_matching(t); tcg_gen_add_vec(vece, x, a, b); tcg_gen_usadd_vec(vece, t, a, b); - tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); - tcg_gen_or_vec(vece, sat, sat, x); + tcg_gen_xor_vec(vece, x, x, t); + tcg_gen_or_vec(vece, qc, qc, x); } void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { static const TCGOpcode vecop_list[] = { - INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 + INDEX_op_usadd_vec, INDEX_op_add_vec, 0 }; static const GVecGen4 ops[4] = { { .fniv = gen_uqadd_vec, @@ -1259,21 +1259,21 @@ void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } -static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, +static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { TCGv_vec x = tcg_temp_new_vec_matching(t); tcg_gen_add_vec(vece, x, a, b); tcg_gen_ssadd_vec(vece, t, a, b); - tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); - tcg_gen_or_vec(vece, sat, sat, x); + tcg_gen_xor_vec(vece, x, x, t); + tcg_gen_or_vec(vece, qc, qc, x); } void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { static const TCGOpcode vecop_list[] = { - 
INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 + INDEX_op_ssadd_vec, INDEX_op_add_vec, 0 }; static const GVecGen4 ops[4] = { { .fniv = gen_sqadd_vec, @@ -1301,21 +1301,21 @@ void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } -static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, +static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { TCGv_vec x = tcg_temp_new_vec_matching(t); tcg_gen_sub_vec(vece, x, a, b); tcg_gen_ussub_vec(vece, t, a, b); - tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); - tcg_gen_or_vec(vece, sat, sat, x); + tcg_gen_xor_vec(vece, x, x, t); + tcg_gen_or_vec(vece, qc, qc, x); } void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { static const TCGOpcode vecop_list[] = { - INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 + INDEX_op_ussub_vec, INDEX_op_sub_vec, 0 }; static const GVecGen4 ops[4] = { { .fniv = gen_uqsub_vec, @@ -1343,21 +1343,21 @@ void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } -static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, +static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { TCGv_vec x = tcg_temp_new_vec_matching(t); tcg_gen_sub_vec(vece, x, a, b); tcg_gen_sssub_vec(vece, t, a, b); - tcg_gen_cmp_vec(TCG_COND_NE, vece, x, x, t); - tcg_gen_or_vec(vece, sat, sat, x); + tcg_gen_xor_vec(vece, x, x, t); + tcg_gen_or_vec(vece, qc, qc, x); } void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { static const TCGOpcode vecop_list[] = { - INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 + INDEX_op_sssub_vec, INDEX_op_sub_vec, 0 }; static const GVecGen4 ops[4] = { { .fniv = gen_sqsub_vec, From patchwork Fri May 24 23:20:52 2024 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673847 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BC170C25B7A for ; Fri, 24 May 2024 23:32:11 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeEi-0006wM-N4; Fri, 24 May 2024 19:22:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeEf-0006tc-4V for qemu-devel@nongnu.org; Fri, 24 May 2024 19:22:05 -0400 Received: from mail-pl1-x633.google.com ([2607:f8b0:4864:20::633]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeEX-0005vA-G8 for qemu-devel@nongnu.org; Fri, 24 May 2024 19:22:04 -0400 Received: by mail-pl1-x633.google.com with SMTP id d9443c01a7336-1f3406f225bso17675955ad.3 for ; Fri, 24 May 2024 16:21:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716592916; x=1717197716; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=JOhGoqzVdGvBCRVF4dtmpoM16Yq2GbKoejyLkAf9y/A=; b=E3Y6sPG+OUTZSmgT2hA/9p/y088FW42aIFz+0PDqoK+8ndLFNEZ7bULtqkC2gyOBUL HyMXbEnRkh4mrKwuQytDpBRsJxq+HouL6V18sLWzpHEEErtX+u5hg+UTDmvs/5+aZqr+ PuezDalUcpzJmn4k4GUvixlpgToVEnQQwLxeXwS6+K2COSiUmil9RWwTNrnea4Vpc6Jp gBza3JmtxvhTCo5PllQbfxNf8c0lwzhndM3y3xzVQF06vCGNt1W6ECnyN4fZOuP0IypF lXEW8Nmbo/NL7SsHX3H5Fs+/nis3ZdC89IrkIDRvl9KaqH8rVsusECkn2J5KnAfmDtMN cl0g== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716592916; x=1717197716; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=JOhGoqzVdGvBCRVF4dtmpoM16Yq2GbKoejyLkAf9y/A=; b=JnJTdfsg21bhtVrHrW4crTxychx4VZtKeOHLhnMFkq7rUKejtebXs3Vdi4UXquAlA4 0PEXLinTdfGF7gOWT8ouWdW2pzByi2UsrFCxHqt2lPn26Zz+F0Rwp1Z6oYw0QSCKoLo/ wzHHZU3xs94jId4i7DiCiAu7OLvSf/2j0Wd6joyxElOEjRGyB6Tbb807m1PwtXcJTJDT m9obCjO1g8xFolWChcDsf2urFKxqtwoKeVl7y/x+GvxhbqUtVs6qy8Vut4KvVULH0Hh+ AhY7jNKpgsUI1k+mJ5rSAawEQJIEsJkie2Oaa8jLeqnXBaEU8A5RDivXJciEA9Nd+s0K 3bUw== X-Gm-Message-State: AOJu0YxJ7b6KK/vPkp7H8bnG3nRaLy+z9Oq6DA+J7f1e7ESJO6j1uHdS g/GYRgnR9vhAoc5ROsDMLLSx21RfdUvaCF3ad5sfCOU4YHwYkIbd5JJHvm9QVxuLb1iiQbAECtx b X-Google-Smtp-Source: AGHT+IHSjq2sWVlWfDC+/lyjN3j9Gh7JFdOZJQVHGdGZbSP7AFQR7zEdC+7n6jvnxVnCqXl1ARh9vQ== X-Received: by 2002:a17:902:ec8f:b0:1ea:f7d4:b613 with SMTP id d9443c01a7336-1f449901da0mr40213275ad.62.1716592916083; Fri, 24 May 2024 16:21:56 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f44c759ceesm19178305ad.10.2024.05.24.16.21.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:21:55 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 38/67] target/arm: Convert SUQADD and USQADD to gvec Date: Fri, 24 May 2024 16:20:52 -0700 Message-Id: <20240524232121.284515-39-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::633; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x633.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson --- target/arm/helper.h | 16 +++++ target/arm/tcg/translate-a64.h | 6 ++ target/arm/tcg/gengvec64.c | 106 +++++++++++++++++++++++++++++++ target/arm/tcg/translate-a64.c | 113 ++++++++++++++------------------- target/arm/tcg/vec_helper.c | 64 +++++++++++++++++++ 5 files changed, 241 insertions(+), 64 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index f830531dd3..de2c5c9aef 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -836,6 +836,22 @@ DEF_HELPER_FLAGS_5(gvec_sqsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) 
DEF_HELPER_FLAGS_5(gvec_sqsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_usqadd_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_usqadd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_usqadd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_usqadd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_suqadd_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_suqadd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_suqadd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_suqadd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fmlal_a32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/tcg/translate-a64.h b/target/arm/tcg/translate-a64.h index 91750f0ca9..b5cb26f8a2 100644 --- a/target/arm/tcg/translate-a64.h +++ b/target/arm/tcg/translate-a64.h @@ -197,6 +197,12 @@ void gen_gvec_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m, uint32_t a, uint32_t oprsz, uint32_t maxsz); void gen_gvec_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m, uint32_t a, uint32_t oprsz, uint32_t maxsz); +void gen_gvec_suqadd_qc(unsigned vece, uint32_t rd_ofs, + uint32_t rn_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_usqadd_qc(unsigned vece, uint32_t rd_ofs, + uint32_t rn_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); void gen_sve_ldr(DisasContext *s, TCGv_ptr, int vofs, int len, int rn, int imm); void gen_sve_str(DisasContext *s, TCGv_ptr, int vofs, int len, int rn, int imm); diff --git a/target/arm/tcg/gengvec64.c b/target/arm/tcg/gengvec64.c index 093b498b13..4b76e476a0 100644 --- a/target/arm/tcg/gengvec64.c +++ b/target/arm/tcg/gengvec64.c @@ -188,3 +188,109 @@ void gen_gvec_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m, tcg_gen_gvec_4(d, n, m, a, 
oprsz, maxsz, &op); } +static void gen_suqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, + TCGv_vec a, TCGv_vec b) +{ + TCGv_vec max = + tcg_constant_vec_matching(t, vece, (1ull << ((8 << vece) - 1)) - 1); + TCGv_vec u = tcg_temp_new_vec_matching(t); + + /* Maximum value that can be added to @a without overflow. */ + tcg_gen_sub_vec(vece, u, max, a); + + /* Constrain addend so that the next addition never overflows. */ + tcg_gen_umin_vec(vece, u, u, b); + tcg_gen_add_vec(vece, t, u, a); + + /* Compute QC by comparing the adjusted @b. */ + tcg_gen_xor_vec(vece, u, u, b); + tcg_gen_or_vec(vece, qc, qc, u); +} + +void gen_gvec_suqadd_qc(unsigned vece, uint32_t rd_ofs, + uint32_t rn_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_add_vec, INDEX_op_sub_vec, INDEX_op_umin_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_suqadd_vec, + .fno = gen_helper_gvec_suqadd_b, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_8 }, + { .fniv = gen_suqadd_vec, + .fno = gen_helper_gvec_suqadd_h, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_16 }, + { .fniv = gen_suqadd_vec, + .fno = gen_helper_gvec_suqadd_s, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_32 }, + { .fniv = gen_suqadd_vec, + .fno = gen_helper_gvec_suqadd_d, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_usqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, + TCGv_vec a, TCGv_vec b) +{ + TCGv_vec u = tcg_temp_new_vec_matching(t); + TCGv_vec z = tcg_constant_vec_matching(t, vece, 0); + + /* Compute unsigned saturation of add for +b and sub for -b. */ + tcg_gen_neg_vec(vece, t, b); + tcg_gen_usadd_vec(vece, u, a, b); + tcg_gen_ussub_vec(vece, t, a, t); + + /* Select the correct result depending on the sign of b. 
*/ + tcg_gen_cmpsel_vec(TCG_COND_LT, vece, t, b, z, t, u); + + /* Compute QC by comparing against the non-saturated result. */ + tcg_gen_add_vec(vece, u, a, b); + tcg_gen_xor_vec(vece, u, u, t); + tcg_gen_or_vec(vece, qc, qc, u); +} + +void gen_gvec_usqadd_qc(unsigned vece, uint32_t rd_ofs, + uint32_t rn_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_neg_vec, INDEX_op_add_vec, + INDEX_op_usadd_vec, INDEX_op_ussub_vec, + INDEX_op_cmpsel_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_usqadd_vec, + .fno = gen_helper_gvec_usqadd_b, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_8 }, + { .fniv = gen_usqadd_vec, + .fno = gen_helper_gvec_usqadd_h, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_16 }, + { .fniv = gen_usqadd_vec, + .fno = gen_helper_gvec_usqadd_s, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_32 }, + { .fniv = gen_usqadd_vec, + .fno = gen_helper_gvec_usqadd_d, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 9167e4d0bd..9f948e033e 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -9983,83 +9983,68 @@ static void handle_2misc_narrow(DisasContext *s, bool scalar, /* Remaining saturating accumulating ops */ static void handle_2misc_satacc(DisasContext *s, bool is_scalar, bool is_u, - bool is_q, int size, int rn, int rd) + bool is_q, unsigned size, int rn, int rd) { - bool is_double = (size == 3); + if (!is_scalar) { + gen_gvec_fn3(s, is_q, rd, rd, rn, + is_u ? gen_gvec_usqadd_qc : gen_gvec_suqadd_qc, size); + return; + } - if (is_double) { + if (size == 3) { TCGv_i64 tcg_rn = tcg_temp_new_i64(); TCGv_i64 tcg_rd = tcg_temp_new_i64(); - int pass; - for (pass = 0; pass < (is_scalar ? 
1 : 2); pass++) { - read_vec_element(s, tcg_rn, rn, pass, MO_64); - read_vec_element(s, tcg_rd, rd, pass, MO_64); + read_vec_element(s, tcg_rn, rn, 0, MO_64); + read_vec_element(s, tcg_rd, rd, 0, MO_64); - if (is_u) { /* USQADD */ - gen_helper_neon_uqadd_s64(tcg_rd, tcg_env, tcg_rn, tcg_rd); - } else { /* SUQADD */ - gen_helper_neon_sqadd_u64(tcg_rd, tcg_env, tcg_rn, tcg_rd); - } - write_vec_element(s, tcg_rd, rd, pass, MO_64); + if (is_u) { /* USQADD */ + gen_helper_neon_uqadd_s64(tcg_rd, tcg_env, tcg_rn, tcg_rd); + } else { /* SUQADD */ + gen_helper_neon_sqadd_u64(tcg_rd, tcg_env, tcg_rn, tcg_rd); } - clear_vec_high(s, !is_scalar, rd); + write_vec_element(s, tcg_rd, rd, 0, MO_64); + clear_vec_high(s, false, rd); } else { TCGv_i32 tcg_rn = tcg_temp_new_i32(); TCGv_i32 tcg_rd = tcg_temp_new_i32(); - int pass, maxpasses; - if (is_scalar) { - maxpasses = 1; - } else { - maxpasses = is_q ? 4 : 2; + read_vec_element_i32(s, tcg_rn, rn, 0, size); + read_vec_element_i32(s, tcg_rd, rd, 0, size); + + if (is_u) { /* USQADD */ + switch (size) { + case 0: + gen_helper_neon_uqadd_s8(tcg_rd, tcg_env, tcg_rn, tcg_rd); + break; + case 1: + gen_helper_neon_uqadd_s16(tcg_rd, tcg_env, tcg_rn, tcg_rd); + break; + case 2: + gen_helper_neon_uqadd_s32(tcg_rd, tcg_env, tcg_rn, tcg_rd); + break; + default: + g_assert_not_reached(); + } + } else { /* SUQADD */ + switch (size) { + case 0: + gen_helper_neon_sqadd_u8(tcg_rd, tcg_env, tcg_rn, tcg_rd); + break; + case 1: + gen_helper_neon_sqadd_u16(tcg_rd, tcg_env, tcg_rn, tcg_rd); + break; + case 2: + gen_helper_neon_sqadd_u32(tcg_rd, tcg_env, tcg_rn, tcg_rd); + break; + default: + g_assert_not_reached(); + } } - for (pass = 0; pass < maxpasses; pass++) { - if (is_scalar) { - read_vec_element_i32(s, tcg_rn, rn, pass, size); - read_vec_element_i32(s, tcg_rd, rd, pass, size); - } else { - read_vec_element_i32(s, tcg_rn, rn, pass, MO_32); - read_vec_element_i32(s, tcg_rd, rd, pass, MO_32); - } - - if (is_u) { /* USQADD */ - switch (size) { - case 
0: - gen_helper_neon_uqadd_s8(tcg_rd, tcg_env, tcg_rn, tcg_rd); - break; - case 1: - gen_helper_neon_uqadd_s16(tcg_rd, tcg_env, tcg_rn, tcg_rd); - break; - case 2: - gen_helper_neon_uqadd_s32(tcg_rd, tcg_env, tcg_rn, tcg_rd); - break; - default: - g_assert_not_reached(); - } - } else { /* SUQADD */ - switch (size) { - case 0: - gen_helper_neon_sqadd_u8(tcg_rd, tcg_env, tcg_rn, tcg_rd); - break; - case 1: - gen_helper_neon_sqadd_u16(tcg_rd, tcg_env, tcg_rn, tcg_rd); - break; - case 2: - gen_helper_neon_sqadd_u32(tcg_rd, tcg_env, tcg_rn, tcg_rd); - break; - default: - g_assert_not_reached(); - } - } - - if (is_scalar) { - write_vec_element(s, tcg_constant_i64(0), rd, 0, MO_64); - } - write_vec_element_i32(s, tcg_rd, rd, pass, MO_32); - } - clear_vec_high(s, is_q, rd); + write_vec_element(s, tcg_constant_i64(0), rd, 0, MO_64); + write_vec_element_i32(s, tcg_rd, rd, 0, MO_32); + clear_vec_high(s, false, rd); } } diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index 56fea14edb..d8e96386be 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -1555,6 +1555,14 @@ DO_SAT(gvec_sqsub_b, int, int8_t, int8_t, -, INT8_MIN, INT8_MAX) DO_SAT(gvec_sqsub_h, int, int16_t, int16_t, -, INT16_MIN, INT16_MAX) DO_SAT(gvec_sqsub_s, int64_t, int32_t, int32_t, -, INT32_MIN, INT32_MAX) +DO_SAT(gvec_usqadd_b, int, uint8_t, int8_t, +, 0, UINT8_MAX) +DO_SAT(gvec_usqadd_h, int, uint16_t, int16_t, +, 0, UINT16_MAX) +DO_SAT(gvec_usqadd_s, int64_t, uint32_t, int32_t, +, 0, UINT32_MAX) + +DO_SAT(gvec_suqadd_b, int, int8_t, uint8_t, +, INT8_MIN, INT8_MAX) +DO_SAT(gvec_suqadd_h, int, int16_t, uint16_t, +, INT16_MIN, INT16_MAX) +DO_SAT(gvec_suqadd_s, int64_t, int32_t, uint32_t, +, INT32_MIN, INT32_MAX) + #undef DO_SAT void HELPER(gvec_uqadd_d)(void *vd, void *vq, void *vn, @@ -1645,6 +1653,62 @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn, clear_tail(d, oprsz, simd_maxsz(desc)); } +void HELPER(gvec_usqadd_d)(void *vd, void *vq, void *vn, + void 
*vm, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + bool q = false; + + for (i = 0; i < oprsz / 8; i++) { + uint64_t nn = n[i]; + int64_t mm = m[i]; + uint64_t dd = nn + mm; + + if (mm < 0) { + if (nn < (uint64_t)-mm) { + dd = 0; + q = true; + } + } else { + if (dd < nn) { + dd = UINT64_MAX; + q = true; + } + } + d[i] = dd; + } + if (q) { + uint32_t *qc = vq; + qc[0] = 1; + } + clear_tail(d, oprsz, simd_maxsz(desc)); +} + +void HELPER(gvec_suqadd_d)(void *vd, void *vq, void *vn, + void *vm, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + bool q = false; + + for (i = 0; i < oprsz / 8; i++) { + int64_t nn = n[i]; + uint64_t mm = m[i]; + int64_t dd = nn + mm; + + if (mm > (uint64_t)(INT64_MAX - nn)) { + dd = INT64_MAX; + q = true; + } + d[i] = dd; + } + if (q) { + uint32_t *qc = vq; + qc[0] = 1; + } + clear_tail(d, oprsz, simd_maxsz(desc)); +} #define DO_SRA(NAME, TYPE) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ From patchwork Fri May 24 23:20:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673793 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0FD97C25B74 for ; Fri, 24 May 2024 23:23:25 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeEk-0006xn-Fg; Fri, 24 May 2024 19:22:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeEf-0006uB-Od for qemu-devel@nongnu.org; Fri, 24 
May 2024 19:22:05 -0400 Received: from mail-pl1-x634.google.com ([2607:f8b0:4864:20::634]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeEY-0005vZ-8s for qemu-devel@nongnu.org; Fri, 24 May 2024 19:22:05 -0400 Received: by mail-pl1-x634.google.com with SMTP id d9443c01a7336-1f44b441b08so10591025ad.0 for ; Fri, 24 May 2024 16:21:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716592917; x=1717197717; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=KbPe3D9gZvhev3amGI7qmDpwdeAReYAACBbqUccoki4=; b=fXGjUFHiLTaDXgV03MSXbFgxJfxLoQMr9i+51xcMnRhM93InFw9zrx5CUtS044YzGA M14EIAJGO4oApSqm9vhTEDRhWbJfIcSwnItsuwFFrcG6RHKLoysNmUXUC+OyWdrkWoZP cFBcKp5ihUs3B0khwK0eI5mjlnEYi2kTsjjXDBZdLsMQg3UPcnEoL1ca7sLL4US9cP49 qkGlHCV7sf5VG/FftugkXeqSQbrLDOP5MhzxgVIjeNvUVA2HP0kO0K/IDIbXJitbOQN8 WB4QqZiebd8ZCFPRdSKN16eOgMOggzttbm6JQ1WeRVJ15ikfEfjk3y1Ym2CKCoCmWvnh cX9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716592917; x=1717197717; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KbPe3D9gZvhev3amGI7qmDpwdeAReYAACBbqUccoki4=; b=rNKxL58vEuvE+XS0GAIGtb/bmVDY0PqdAl76FqKXrV9jzXNUVBHLxTREuU+YVVgiza jgd71RcA+AbE14o3SxqJnRUltCpRIbjXUOSqApgO18MiNhU+fdpLXsEd98XQ3khxaOlL jpo0FRTpWf9gKH4EeJOabSEUVelY5rdL/nKB/ctyENLZgHZAt7+XrYnlD40ai0b7xRgd l47iK2Ow/nVMHDbpdPFYCt9Qa8Wmi13R3y5VqRkTav9rWR+GcMpgG7iFTncqgZ80GALl 60bkJdhCK2s2eJD3S72YbsG9OdsY96yFoIUbrDXGqywTk7Vzb1oFpWBFkCRGxOlsKbA5 OZuw== X-Gm-Message-State: AOJu0YzZmcX/TBtlXb3LF5TlMbi3vbGhKL1+KARZ+qkYteiM2ru6Q8Ud iebN+o01MVDHu7P5WMeHrw5PZn5oJ9HkJQMcLVmqHmTyid6BNZJ9StGJYE991VMyyd5OoXEPg5O H X-Google-Smtp-Source: 
AGHT+IGa7IpRsbRD4bSu+MKZbqNtPzR0bALkpZw2beW4r7sor3bYEXttuFjtSiho0fPIytd7ROrTzg== X-Received: by 2002:a17:902:c407:b0:1f3:33b:ff18 with SMTP id d9443c01a7336-1f4486e5e72mr45788745ad.11.1716592916897; Fri, 24 May 2024 16:21:56 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. [174.21.72.5]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f44c759ceesm19178305ad.10.2024.05.24.16.21.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:21:56 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 39/67] target/arm: Inline scalar SUQADD and USQADD Date: Fri, 24 May 2024 16:20:53 -0700 Message-Id: <20240524232121.284515-40-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::634; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x634.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This eliminates the last uses of these neon helpers. Incorporate the MO_64 expanders as an option to the vector expander. 
Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper.h | 8 -- target/arm/tcg/translate-a64.h | 8 ++ target/arm/tcg/gengvec64.c | 71 ++++++++++++++ target/arm/tcg/neon_helper.c | 165 --------------------------------- target/arm/tcg/translate-a64.c | 73 +++++---------- 5 files changed, 103 insertions(+), 222 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index de2c5c9aef..c76158d6d3 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -274,14 +274,6 @@ DEF_HELPER_FLAGS_3(neon_qadd_u16, TCG_CALL_NO_RWG, i32, env, i32, i32) DEF_HELPER_FLAGS_3(neon_qadd_s16, TCG_CALL_NO_RWG, i32, env, i32, i32) DEF_HELPER_FLAGS_3(neon_qadd_u32, TCG_CALL_NO_RWG, i32, env, i32, i32) DEF_HELPER_FLAGS_3(neon_qadd_s32, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_FLAGS_3(neon_uqadd_s8, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_FLAGS_3(neon_uqadd_s16, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_FLAGS_3(neon_uqadd_s32, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_FLAGS_3(neon_uqadd_s64, TCG_CALL_NO_RWG, i64, env, i64, i64) -DEF_HELPER_FLAGS_3(neon_sqadd_u8, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_FLAGS_3(neon_sqadd_u16, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_FLAGS_3(neon_sqadd_u32, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_FLAGS_3(neon_sqadd_u64, TCG_CALL_NO_RWG, i64, env, i64, i64) DEF_HELPER_3(neon_qsub_u8, i32, env, i32, i32) DEF_HELPER_3(neon_qsub_s8, i32, env, i32, i32) DEF_HELPER_3(neon_qsub_u16, i32, env, i32, i32) diff --git a/target/arm/tcg/translate-a64.h b/target/arm/tcg/translate-a64.h index b5cb26f8a2..0fcf7cb63a 100644 --- a/target/arm/tcg/translate-a64.h +++ b/target/arm/tcg/translate-a64.h @@ -197,9 +197,17 @@ void gen_gvec_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m, uint32_t a, uint32_t oprsz, uint32_t maxsz); void gen_gvec_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m, uint32_t a, uint32_t oprsz, uint32_t maxsz); + +void gen_suqadd_bhs(TCGv_i64 res, 
TCGv_i64 qc, + TCGv_i64 a, TCGv_i64 b, MemOp esz); +void gen_suqadd_d(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b); void gen_gvec_suqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + +void gen_usqadd_bhs(TCGv_i64 res, TCGv_i64 qc, + TCGv_i64 a, TCGv_i64 b, MemOp esz); +void gen_usqadd_d(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b); void gen_gvec_usqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); diff --git a/target/arm/tcg/gengvec64.c b/target/arm/tcg/gengvec64.c index 4b76e476a0..dad4c1853b 100644 --- a/target/arm/tcg/gengvec64.c +++ b/target/arm/tcg/gengvec64.c @@ -188,6 +188,38 @@ void gen_gvec_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m, tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); } +/* + * Set @res to the correctly saturated result. + * Set @qc non-zero if saturation occurred. + */ +void gen_suqadd_bhs(TCGv_i64 res, TCGv_i64 qc, + TCGv_i64 a, TCGv_i64 b, MemOp esz) +{ + TCGv_i64 max = tcg_constant_i64((1ull << ((8 << esz) - 1)) - 1); + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_add_i64(t, a, b); + tcg_gen_smin_i64(res, t, max); + tcg_gen_xor_i64(t, t, res); + tcg_gen_or_i64(qc, qc, t); +} + +void gen_suqadd_d(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 max = tcg_constant_i64(INT64_MAX); + TCGv_i64 t = tcg_temp_new_i64(); + + /* Maximum value that can be added to @a without overflow. */ + tcg_gen_sub_i64(t, max, a); + + /* Constrain addend so that the next addition never overflows.
*/ + tcg_gen_umin_i64(t, t, b); + tcg_gen_add_i64(res, a, t); + + tcg_gen_xor_i64(t, t, b); + tcg_gen_or_i64(qc, qc, t); +} + static void gen_suqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { @@ -231,6 +263,7 @@ void gen_gvec_suqadd_qc(unsigned vece, uint32_t rd_ofs, .write_aofs = true, .vece = MO_32 }, { .fniv = gen_suqadd_vec, + .fni8 = gen_suqadd_d, .fno = gen_helper_gvec_suqadd_d, .opt_opc = vecop_list, .write_aofs = true, @@ -240,6 +273,43 @@ void gen_gvec_suqadd_qc(unsigned vece, uint32_t rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } +void gen_usqadd_bhs(TCGv_i64 res, TCGv_i64 qc, + TCGv_i64 a, TCGv_i64 b, MemOp esz) +{ + TCGv_i64 max = tcg_constant_i64(MAKE_64BIT_MASK(0, 8 << esz)); + TCGv_i64 zero = tcg_constant_i64(0); + TCGv_i64 tmp = tcg_temp_new_i64(); + + tcg_gen_add_i64(tmp, a, b); + tcg_gen_smin_i64(res, tmp, max); + tcg_gen_smax_i64(res, res, zero); + tcg_gen_xor_i64(tmp, tmp, res); + tcg_gen_or_i64(qc, qc, tmp); +} + +void gen_usqadd_d(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 tmp = tcg_temp_new_i64(); + TCGv_i64 tneg = tcg_temp_new_i64(); + TCGv_i64 tpos = tcg_temp_new_i64(); + TCGv_i64 max = tcg_constant_i64(UINT64_MAX); + TCGv_i64 zero = tcg_constant_i64(0); + + tcg_gen_add_i64(tmp, a, b); + + /* If @b is positive, saturate if (a + b) < a, aka unsigned overflow. */ + tcg_gen_movcond_i64(TCG_COND_LTU, tpos, tmp, a, max, tmp); + + /* If @b is negative, saturate if a < -b, ie subtraction is negative. */ + tcg_gen_neg_i64(tneg, b); + tcg_gen_movcond_i64(TCG_COND_LTU, tneg, a, tneg, zero, tmp); + + /* Select correct result from sign of @b. 
*/ + tcg_gen_movcond_i64(TCG_COND_LT, res, b, zero, tneg, tpos); + tcg_gen_xor_i64(tmp, tmp, res); + tcg_gen_or_i64(qc, qc, tmp); +} + static void gen_usqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { @@ -286,6 +356,7 @@ void gen_gvec_usqadd_qc(unsigned vece, uint32_t rd_ofs, .write_aofs = true, .vece = MO_32 }, { .fniv = gen_usqadd_vec, + .fni8 = gen_usqadd_d, .fno = gen_helper_gvec_usqadd_d, .opt_opc = vecop_list, .write_aofs = true, diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c index a0b51c8809..9505a5fd18 100644 --- a/target/arm/tcg/neon_helper.c +++ b/target/arm/tcg/neon_helper.c @@ -236,171 +236,6 @@ uint64_t HELPER(neon_qadd_s64)(CPUARMState *env, uint64_t src1, uint64_t src2) return res; } -/* Unsigned saturating accumulate of signed value - * - * Op1/Rn is treated as signed - * Op2/Rd is treated as unsigned - * - * Explicit casting is used to ensure the correct sign extension of - * inputs. The result is treated as a unsigned value and saturated as such. - * - * We use a macro for the 8/16 bit cases which expects signed integers of va, - * vb, and vr for interim calculation and an unsigned 32 bit result value r. 
- */ - -#define USATACC(bits, shift) \ - do { \ - va = sextract32(a, shift, bits); \ - vb = extract32(b, shift, bits); \ - vr = va + vb; \ - if (vr > UINT##bits##_MAX) { \ - SET_QC(); \ - vr = UINT##bits##_MAX; \ - } else if (vr < 0) { \ - SET_QC(); \ - vr = 0; \ - } \ - r = deposit32(r, shift, bits, vr); \ - } while (0) - -uint32_t HELPER(neon_uqadd_s8)(CPUARMState *env, uint32_t a, uint32_t b) -{ - int16_t va, vb, vr; - uint32_t r = 0; - - USATACC(8, 0); - USATACC(8, 8); - USATACC(8, 16); - USATACC(8, 24); - return r; -} - -uint32_t HELPER(neon_uqadd_s16)(CPUARMState *env, uint32_t a, uint32_t b) -{ - int32_t va, vb, vr; - uint64_t r = 0; - - USATACC(16, 0); - USATACC(16, 16); - return r; -} - -#undef USATACC - -uint32_t HELPER(neon_uqadd_s32)(CPUARMState *env, uint32_t a, uint32_t b) -{ - int64_t va = (int32_t)a; - int64_t vb = (uint32_t)b; - int64_t vr = va + vb; - if (vr > UINT32_MAX) { - SET_QC(); - vr = UINT32_MAX; - } else if (vr < 0) { - SET_QC(); - vr = 0; - } - return vr; -} - -uint64_t HELPER(neon_uqadd_s64)(CPUARMState *env, uint64_t a, uint64_t b) -{ - uint64_t res; - res = a + b; - /* We only need to look at the pattern of SIGN bits to detect - * +ve/-ve saturation - */ - if (~a & b & ~res & SIGNBIT64) { - SET_QC(); - res = UINT64_MAX; - } else if (a & ~b & res & SIGNBIT64) { - SET_QC(); - res = 0; - } - return res; -} - -/* Signed saturating accumulate of unsigned value - * - * Op1/Rn is treated as unsigned - * Op2/Rd is treated as signed - * - * The result is treated as a signed value and saturated as such - * - * We use a macro for the 8/16 bit cases which expects signed integers of va, - * vb, and vr for interim calculation and an unsigned 32 bit result value r. 
- */ - -#define SSATACC(bits, shift) \ - do { \ - va = extract32(a, shift, bits); \ - vb = sextract32(b, shift, bits); \ - vr = va + vb; \ - if (vr > INT##bits##_MAX) { \ - SET_QC(); \ - vr = INT##bits##_MAX; \ - } else if (vr < INT##bits##_MIN) { \ - SET_QC(); \ - vr = INT##bits##_MIN; \ - } \ - r = deposit32(r, shift, bits, vr); \ - } while (0) - -uint32_t HELPER(neon_sqadd_u8)(CPUARMState *env, uint32_t a, uint32_t b) -{ - int16_t va, vb, vr; - uint32_t r = 0; - - SSATACC(8, 0); - SSATACC(8, 8); - SSATACC(8, 16); - SSATACC(8, 24); - return r; -} - -uint32_t HELPER(neon_sqadd_u16)(CPUARMState *env, uint32_t a, uint32_t b) -{ - int32_t va, vb, vr; - uint32_t r = 0; - - SSATACC(16, 0); - SSATACC(16, 16); - - return r; -} - -#undef SSATACC - -uint32_t HELPER(neon_sqadd_u32)(CPUARMState *env, uint32_t a, uint32_t b) -{ - int64_t res; - int64_t op1 = (uint32_t)a; - int64_t op2 = (int32_t)b; - res = op1 + op2; - if (res > INT32_MAX) { - SET_QC(); - res = INT32_MAX; - } else if (res < INT32_MIN) { - SET_QC(); - res = INT32_MIN; - } - return res; -} - -uint64_t HELPER(neon_sqadd_u64)(CPUARMState *env, uint64_t a, uint64_t b) -{ - uint64_t res; - res = a + b; - /* We only need to look at the pattern of SIGN bits to detect an overflow */ - if (((a & res) - | (~b & res) - | (a & ~b)) & SIGNBIT64) { - SET_QC(); - res = INT64_MAX; - } - return res; -} - - #define NEON_USAT(dest, src1, src2, type) do { \ uint32_t tmp = (uint32_t)src1 - (uint32_t)src2; \ if (tmp != (type)tmp) { \ diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 9f948e033e..781b224972 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -9985,67 +9985,42 @@ static void handle_2misc_narrow(DisasContext *s, bool scalar, static void handle_2misc_satacc(DisasContext *s, bool is_scalar, bool is_u, bool is_q, unsigned size, int rn, int rd) { + TCGv_i64 res, qc, a, b; + if (!is_scalar) { gen_gvec_fn3(s, is_q, rd, rd, rn, is_u ? 
gen_gvec_usqadd_qc : gen_gvec_suqadd_qc, size); return; } - if (size == 3) { - TCGv_i64 tcg_rn = tcg_temp_new_i64(); - TCGv_i64 tcg_rd = tcg_temp_new_i64(); + res = tcg_temp_new_i64(); + qc = tcg_temp_new_i64(); + a = tcg_temp_new_i64(); + b = tcg_temp_new_i64(); - read_vec_element(s, tcg_rn, rn, 0, MO_64); - read_vec_element(s, tcg_rd, rd, 0, MO_64); + /* Read and extend scalar inputs to 64-bits. */ + read_vec_element(s, a, rd, 0, size | (is_u ? 0 : MO_SIGN)); + read_vec_element(s, b, rn, 0, size | (is_u ? MO_SIGN : 0)); + tcg_gen_ld_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); - if (is_u) { /* USQADD */ - gen_helper_neon_uqadd_s64(tcg_rd, tcg_env, tcg_rn, tcg_rd); - } else { /* SUQADD */ - gen_helper_neon_sqadd_u64(tcg_rd, tcg_env, tcg_rn, tcg_rd); + if (size == MO_64) { + if (is_u) { + gen_usqadd_d(res, qc, a, b); + } else { + gen_suqadd_d(res, qc, a, b); } - write_vec_element(s, tcg_rd, rd, 0, MO_64); - clear_vec_high(s, false, rd); } else { - TCGv_i32 tcg_rn = tcg_temp_new_i32(); - TCGv_i32 tcg_rd = tcg_temp_new_i32(); - - read_vec_element_i32(s, tcg_rn, rn, 0, size); - read_vec_element_i32(s, tcg_rd, rd, 0, size); - - if (is_u) { /* USQADD */ - switch (size) { - case 0: - gen_helper_neon_uqadd_s8(tcg_rd, tcg_env, tcg_rn, tcg_rd); - break; - case 1: - gen_helper_neon_uqadd_s16(tcg_rd, tcg_env, tcg_rn, tcg_rd); - break; - case 2: - gen_helper_neon_uqadd_s32(tcg_rd, tcg_env, tcg_rn, tcg_rd); - break; - default: - g_assert_not_reached(); - } - } else { /* SUQADD */ - switch (size) { - case 0: - gen_helper_neon_sqadd_u8(tcg_rd, tcg_env, tcg_rn, tcg_rd); - break; - case 1: - gen_helper_neon_sqadd_u16(tcg_rd, tcg_env, tcg_rn, tcg_rd); - break; - case 2: - gen_helper_neon_sqadd_u32(tcg_rd, tcg_env, tcg_rn, tcg_rd); - break; - default: - g_assert_not_reached(); - } + if (is_u) { + gen_usqadd_bhs(res, qc, a, b, size); + } else { + gen_suqadd_bhs(res, qc, a, b, size); + /* Truncate signed 64-bit result for writeback. 
*/ + tcg_gen_ext_i64(res, res, size); } - - write_vec_element(s, tcg_constant_i64(0), rd, 0, MO_64); - write_vec_element_i32(s, tcg_rd, rd, 0, MO_32); - clear_vec_high(s, false, rd); } + + write_fp_dreg(s, rd, res); + tcg_gen_st_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); } /* AdvSIMD scalar two reg misc From patchwork Fri May 24 23:20:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673825 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F2654C25B7D for ; Fri, 24 May 2024 23:28:21 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeHv-0007KO-J7; Fri, 24 May 2024 19:25:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeHY-00078N-Tj for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:05 -0400 Received: from mail-pf1-x42e.google.com ([2607:f8b0:4864:20::42e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHT-0006fE-Sb for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:04 -0400 Received: by mail-pf1-x42e.google.com with SMTP id d2e1a72fcca58-6f8ea5e3812so1320370b3a.2 for ; Fri, 24 May 2024 16:24:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716593098; x=1717197898; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=hlxaiIX5tmmnaW7q/wPUtiHWBx7RTV8cImR19BF5TM8=; 
b=cUYSqfD8ayxmP/hbScBtJmK9ptpjBKcbd7Zw9kR1UXyiLFq9IJTT449m/QouiG7rrK t+f6JmTJaFdPFkxMrI6p3OAmIwSN3PJEMtduVlhqWO9iofdC9E6m7jrHiKdtIwkfqTsx pKBvUo735B5f9ScBd9WUxn2YPxMBigbOJR8GPMF1UUCIixooZw4haJ8HEBHhCgvyOK+/ ej/687BLPnKgQ1vZRZZ2fMCfrK+FfpRF+EJEELvPCsqbhlA30fgqJPF1EHI0NTvFGElN YSWM6NphsQPbCS6IAjlHbcjfTyFkHiqFZQqeGUSAREUrtXUZ/qtIEPUsa0qqhMWTT4KJ D+Rg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716593098; x=1717197898; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=hlxaiIX5tmmnaW7q/wPUtiHWBx7RTV8cImR19BF5TM8=; b=ulX51Zt93iWg9i5hQpDGLhiyygpZ/aCuWD3W1qUAqq1hjb40iEruyqV0D8Eu3et+mp mrpd3G8RYjR9jMzroMMrkXLtZkVBxqTrSM8f6XaRpudYeDB7qKCVbGR9VE0MMFviYcbX ofGSu6xN8p4bTHRpjH5pKO6GOxhn5AUZL5jy4QWYUra0HcCOd0d9piebEAKz/h4jhCvF GeeduHpXbfKpRDFGL+oY18FrdUJHsN0BDxzWoOc4tqhMCDKvLTP3ffFpq7P8OPKdqlXf PUhYDpjwmrkhw7GGJw/k1OKRmbigiuAsh/YjvXsrqR7zkVs1g8dtaNEfWZok2NdSLW6B SrIQ== X-Gm-Message-State: AOJu0YzIrqlJ0isEN3xkINqUGeDeT0CWGCg35zf4iGl2a33soxcT1fYw WdXv9ZeOmHSct0x5eF+6eH8ellbYIZd4HHuNgARXxBeTHHbt7RtwD29lQX3WN1N/sKkMMWUdFXb / X-Google-Smtp-Source: AGHT+IHLSIBYpLGa8c5F06GjDdMwWe8jpjq2/7HwvMFMlHnnRYe8q+OtCP3/CC0vHGrNsqwTqG4Nyw== X-Received: by 2002:a05:6a00:418a:b0:6e6:970f:a809 with SMTP id d2e1a72fcca58-6f8f392b6a8mr4067526b3a.20.1716593098254; Fri, 24 May 2024 16:24:58 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.24.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:24:57 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 40/67] target/arm: Inline scalar SQADD, UQADD, SQSUB, UQSUB Date: Fri, 24 May 2024 16:20:54 -0700 Message-Id: <20240524232121.284515-41-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42e; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This eliminates the last uses of these neon helpers. Incorporate the MO_64 expanders as an option to the vector expander. 
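As a reading aid, the saturation logic of the new 64-bit expanders can be written out in plain C. This is a sketch under stated assumptions rather than code from the patch: the model_* names are hypothetical, values are passed as raw 64-bit patterns, and *qc accumulates a non-zero value on saturation in the same way the TCG sequences OR (unsaturated ^ saturated) into the QC temporary. The signed variants detect overflow purely from sign bits, and the saturated value is derived from the sign of the first operand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model functions; none of these names exist in QEMU. */

/* UQADD, 64-bit: unsigned wrap-around shows up as sum < a. */
uint64_t model_uqadd_d(uint64_t a, uint64_t b, uint64_t *qc)
{
    uint64_t sum = a + b;
    uint64_t res = sum < a ? UINT64_MAX : sum;
    *qc |= sum ^ res;
    return res;
}

/* UQSUB, 64-bit: the difference is negative exactly when a < b. */
uint64_t model_uqsub_d(uint64_t a, uint64_t b, uint64_t *qc)
{
    uint64_t diff = a - b;
    uint64_t res = a < b ? 0 : diff;
    *qc |= diff ^ res;
    return res;
}

/* Saturation value: INT64_MAX for a >= 0, INT64_MIN for a < 0. */
static uint64_t model_sat_from_sign(uint64_t a)
{
    uint64_t sign = 0 - (a >> 63);       /* all-ones if a is negative */
    return sign ^ (uint64_t)INT64_MAX;
}

/* SQADD, 64-bit: overflow iff a and b agree in sign and the sum does not. */
uint64_t model_sqadd_d(uint64_t a, uint64_t b, uint64_t *qc)
{
    uint64_t sum = a + b;
    uint64_t ovf = (sum ^ a) & ~(a ^ b); /* sign bit set on overflow */
    uint64_t res = (ovf >> 63) ? model_sat_from_sign(a) : sum;
    *qc |= sum ^ res;
    return res;
}

/* SQSUB, 64-bit: overflow iff a and b differ in sign and the result's
 * sign differs from a. */
uint64_t model_sqsub_d(uint64_t a, uint64_t b, uint64_t *qc)
{
    uint64_t diff = a - b;
    uint64_t ovf = (a ^ b) & (diff ^ a); /* sign bit set on overflow */
    uint64_t res = (ovf >> 63) ? model_sat_from_sign(a) : diff;
    *qc |= diff ^ res;
    return res;
}

/* SQADD, 8/16/32-bit: inputs are sign-extended, so clamp the exact sum. */
int64_t model_sqadd_bhs(int64_t a, int64_t b, unsigned bits, uint64_t *qc)
{
    assert(bits == 8 || bits == 16 || bits == 32);
    int64_t max = (int64_t)((1ull << (bits - 1)) - 1);
    int64_t min = -1 - max;
    int64_t t = a + b;
    int64_t res = t < max ? t : max;     /* smin */
    res = res > min ? res : min;         /* smax */
    *qc |= (uint64_t)(t ^ res);
    return res;
}
```

The narrow (bhs) variants need no sign-bit tricks because the exact sum or difference of two extended 8/16/32-bit values always fits in 64 bits, so a min/max clamp suffices.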
Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper.h | 17 ---- target/arm/tcg/translate.h | 15 +++ target/arm/tcg/gengvec.c | 116 +++++++++++++++++++++++ target/arm/tcg/neon_helper.c | 162 --------------------------------- target/arm/tcg/translate-a64.c | 67 ++++++++------ 5 files changed, 169 insertions(+), 208 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index c76158d6d3..a14c040451 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -268,23 +268,6 @@ DEF_HELPER_FLAGS_2(fjcvtzs, TCG_CALL_NO_RWG, i64, f64, ptr) DEF_HELPER_FLAGS_3(check_hcr_el2_trap, TCG_CALL_NO_WG, void, env, i32, i32) /* neon_helper.c */ -DEF_HELPER_FLAGS_3(neon_qadd_u8, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_FLAGS_3(neon_qadd_s8, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_FLAGS_3(neon_qadd_u16, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_FLAGS_3(neon_qadd_s16, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_FLAGS_3(neon_qadd_u32, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_FLAGS_3(neon_qadd_s32, TCG_CALL_NO_RWG, i32, env, i32, i32) -DEF_HELPER_3(neon_qsub_u8, i32, env, i32, i32) -DEF_HELPER_3(neon_qsub_s8, i32, env, i32, i32) -DEF_HELPER_3(neon_qsub_u16, i32, env, i32, i32) -DEF_HELPER_3(neon_qsub_s16, i32, env, i32, i32) -DEF_HELPER_3(neon_qsub_u32, i32, env, i32, i32) -DEF_HELPER_3(neon_qsub_s32, i32, env, i32, i32) -DEF_HELPER_3(neon_qadd_u64, i64, env, i64, i64) -DEF_HELPER_3(neon_qadd_s64, i64, env, i64, i64) -DEF_HELPER_3(neon_qsub_u64, i64, env, i64, i64) -DEF_HELPER_3(neon_qsub_s64, i64, env, i64, i64) - DEF_HELPER_2(neon_hadd_s8, i32, i32, i32) DEF_HELPER_2(neon_hadd_u8, i32, i32, i32) DEF_HELPER_2(neon_hadd_s16, i32, i32, i32) diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h index 3abdbedfe5..87439dcc61 100644 --- a/target/arm/tcg/translate.h +++ b/target/arm/tcg/translate.h @@ -466,12 +466,27 @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void gen_ushl_i64(TCGv_i64 
d, TCGv_i64 a, TCGv_i64 b); void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void gen_uqadd_bhs(TCGv_i64 res, TCGv_i64 qc, + TCGv_i64 a, TCGv_i64 b, MemOp esz); +void gen_uqadd_d(TCGv_i64 d, TCGv_i64 q, TCGv_i64 a, TCGv_i64 b); void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + +void gen_sqadd_bhs(TCGv_i64 res, TCGv_i64 qc, + TCGv_i64 a, TCGv_i64 b, MemOp esz); +void gen_sqadd_d(TCGv_i64 d, TCGv_i64 q, TCGv_i64 a, TCGv_i64 b); void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + +void gen_uqsub_bhs(TCGv_i64 res, TCGv_i64 qc, + TCGv_i64 a, TCGv_i64 b, MemOp esz); +void gen_uqsub_d(TCGv_i64 d, TCGv_i64 q, TCGv_i64 a, TCGv_i64 b); void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + +void gen_sqsub_bhs(TCGv_i64 res, TCGv_i64 qc, + TCGv_i64 a, TCGv_i64 b, MemOp esz); +void gen_sqsub_d(TCGv_i64 d, TCGv_i64 q, TCGv_i64 a, TCGv_i64 b); void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c index bfe6885a01..66a514ba86 100644 --- a/target/arm/tcg/gengvec.c +++ b/target/arm/tcg/gengvec.c @@ -1217,6 +1217,28 @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } +void gen_uqadd_bhs(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b, MemOp esz) +{ + uint64_t max = MAKE_64BIT_MASK(0, 8 << esz); + TCGv_i64 tmp = tcg_temp_new_i64(); + + tcg_gen_add_i64(tmp, a, b); + tcg_gen_umin_i64(res, tmp, tcg_constant_i64(max)); + tcg_gen_xor_i64(tmp, tmp, res); + tcg_gen_or_i64(qc, qc, tmp); +} + +void gen_uqadd_d(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_add_i64(t, a, b); + 
tcg_gen_movcond_i64(TCG_COND_LTU, res, t, a, + tcg_constant_i64(UINT64_MAX), t); + tcg_gen_xor_i64(t, t, res); + tcg_gen_or_i64(qc, qc, t); +} + static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { @@ -1250,6 +1272,7 @@ void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, .opt_opc = vecop_list, .vece = MO_32 }, { .fniv = gen_uqadd_vec, + .fni8 = gen_uqadd_d, .fno = gen_helper_gvec_uqadd_d, .write_aofs = true, .opt_opc = vecop_list, @@ -1259,6 +1282,41 @@ void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } +void gen_sqadd_bhs(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b, MemOp esz) +{ + int64_t max = MAKE_64BIT_MASK(0, (8 << esz) - 1); + int64_t min = -1ll - max; + TCGv_i64 tmp = tcg_temp_new_i64(); + + tcg_gen_add_i64(tmp, a, b); + tcg_gen_smin_i64(res, tmp, tcg_constant_i64(max)); + tcg_gen_smax_i64(res, res, tcg_constant_i64(min)); + tcg_gen_xor_i64(tmp, tmp, res); + tcg_gen_or_i64(qc, qc, tmp); +} + +void gen_sqadd_d(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t0 = tcg_temp_new_i64(); + TCGv_i64 t1 = tcg_temp_new_i64(); + TCGv_i64 t2 = tcg_temp_new_i64(); + + tcg_gen_add_i64(t0, a, b); + + /* Compute signed overflow indication into T1 */ + tcg_gen_xor_i64(t1, a, b); + tcg_gen_xor_i64(t2, t0, a); + tcg_gen_andc_i64(t1, t2, t1); + + /* Compute saturated value into T2 */ + tcg_gen_sari_i64(t2, a, 63); + tcg_gen_xori_i64(t2, t2, INT64_MAX); + + tcg_gen_movcond_i64(TCG_COND_LT, res, t1, tcg_constant_i64(0), t2, t0); + tcg_gen_xor_i64(t0, t0, res); + tcg_gen_or_i64(qc, qc, t0); +} + static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { @@ -1292,6 +1350,7 @@ void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, .write_aofs = true, .vece = MO_32 }, { .fniv = gen_sqadd_vec, + .fni8 = gen_sqadd_d, .fno = gen_helper_gvec_sqadd_d, .opt_opc = vecop_list, .write_aofs = 
true, @@ -1301,6 +1360,26 @@ void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } +void gen_uqsub_bhs(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b, MemOp esz) +{ + TCGv_i64 tmp = tcg_temp_new_i64(); + + tcg_gen_sub_i64(tmp, a, b); + tcg_gen_smax_i64(res, tmp, tcg_constant_i64(0)); + tcg_gen_xor_i64(tmp, tmp, res); + tcg_gen_or_i64(qc, qc, tmp); +} + +void gen_uqsub_d(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_sub_i64(t, a, b); + tcg_gen_movcond_i64(TCG_COND_LTU, res, a, b, tcg_constant_i64(0), t); + tcg_gen_xor_i64(t, t, res); + tcg_gen_or_i64(qc, qc, t); +} + static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { @@ -1334,6 +1413,7 @@ void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, .write_aofs = true, .vece = MO_32 }, { .fniv = gen_uqsub_vec, + .fni8 = gen_uqsub_d, .fno = gen_helper_gvec_uqsub_d, .opt_opc = vecop_list, .write_aofs = true, @@ -1343,6 +1423,41 @@ void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } +void gen_sqsub_bhs(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b, MemOp esz) +{ + int64_t max = MAKE_64BIT_MASK(0, (8 << esz) - 1); + int64_t min = -1ll - max; + TCGv_i64 tmp = tcg_temp_new_i64(); + + tcg_gen_sub_i64(tmp, a, b); + tcg_gen_smin_i64(res, tmp, tcg_constant_i64(max)); + tcg_gen_smax_i64(res, res, tcg_constant_i64(min)); + tcg_gen_xor_i64(tmp, tmp, res); + tcg_gen_or_i64(qc, qc, tmp); +} + +void gen_sqsub_d(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t0 = tcg_temp_new_i64(); + TCGv_i64 t1 = tcg_temp_new_i64(); + TCGv_i64 t2 = tcg_temp_new_i64(); + + tcg_gen_sub_i64(t0, a, b); + + /* Compute signed overflow indication into T1 */ + tcg_gen_xor_i64(t1, a, b); + tcg_gen_xor_i64(t2, t0, a); + tcg_gen_and_i64(t1, t1, t2); + + /* Compute saturated value into T2 */ + 
tcg_gen_sari_i64(t2, a, 63); + tcg_gen_xori_i64(t2, t2, INT64_MAX); + + tcg_gen_movcond_i64(TCG_COND_LT, res, t1, tcg_constant_i64(0), t2, t0); + tcg_gen_xor_i64(t0, t0, res); + tcg_gen_or_i64(qc, qc, t0); +} + static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec qc, TCGv_vec a, TCGv_vec b) { @@ -1376,6 +1491,7 @@ void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, .write_aofs = true, .vece = MO_32 }, { .fniv = gen_sqsub_vec, + .fni8 = gen_sqsub_d, .fno = gen_helper_gvec_sqsub_d, .opt_opc = vecop_list, .write_aofs = true, diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c index 9505a5fd18..0af15e9f6e 100644 --- a/target/arm/tcg/neon_helper.c +++ b/target/arm/tcg/neon_helper.c @@ -155,168 +155,6 @@ uint32_t HELPER(glue(neon_,name))(uint32_t arg) \ return arg; \ } - -#define NEON_USAT(dest, src1, src2, type) do { \ - uint32_t tmp = (uint32_t)src1 + (uint32_t)src2; \ - if (tmp != (type)tmp) { \ - SET_QC(); \ - dest = ~0; \ - } else { \ - dest = tmp; \ - }} while(0) -#define NEON_FN(dest, src1, src2) NEON_USAT(dest, src1, src2, uint8_t) -NEON_VOP_ENV(qadd_u8, neon_u8, 4) -#undef NEON_FN -#define NEON_FN(dest, src1, src2) NEON_USAT(dest, src1, src2, uint16_t) -NEON_VOP_ENV(qadd_u16, neon_u16, 2) -#undef NEON_FN -#undef NEON_USAT - -uint32_t HELPER(neon_qadd_u32)(CPUARMState *env, uint32_t a, uint32_t b) -{ - uint32_t res = a + b; - if (res < a) { - SET_QC(); - res = ~0; - } - return res; -} - -uint64_t HELPER(neon_qadd_u64)(CPUARMState *env, uint64_t src1, uint64_t src2) -{ - uint64_t res; - - res = src1 + src2; - if (res < src1) { - SET_QC(); - res = ~(uint64_t)0; - } - return res; -} - -#define NEON_SSAT(dest, src1, src2, type) do { \ - int32_t tmp = (uint32_t)src1 + (uint32_t)src2; \ - if (tmp != (type)tmp) { \ - SET_QC(); \ - if (src2 > 0) { \ - tmp = (1 << (sizeof(type) * 8 - 1)) - 1; \ - } else { \ - tmp = 1 << (sizeof(type) * 8 - 1); \ - } \ - } \ - dest = tmp; \ - } while(0) -#define NEON_FN(dest, src1, src2) 
NEON_SSAT(dest, src1, src2, int8_t) -NEON_VOP_ENV(qadd_s8, neon_s8, 4) -#undef NEON_FN -#define NEON_FN(dest, src1, src2) NEON_SSAT(dest, src1, src2, int16_t) -NEON_VOP_ENV(qadd_s16, neon_s16, 2) -#undef NEON_FN -#undef NEON_SSAT - -uint32_t HELPER(neon_qadd_s32)(CPUARMState *env, uint32_t a, uint32_t b) -{ - uint32_t res = a + b; - if (((res ^ a) & SIGNBIT) && !((a ^ b) & SIGNBIT)) { - SET_QC(); - res = ~(((int32_t)a >> 31) ^ SIGNBIT); - } - return res; -} - -uint64_t HELPER(neon_qadd_s64)(CPUARMState *env, uint64_t src1, uint64_t src2) -{ - uint64_t res; - - res = src1 + src2; - if (((res ^ src1) & SIGNBIT64) && !((src1 ^ src2) & SIGNBIT64)) { - SET_QC(); - res = ((int64_t)src1 >> 63) ^ ~SIGNBIT64; - } - return res; -} - -#define NEON_USAT(dest, src1, src2, type) do { \ - uint32_t tmp = (uint32_t)src1 - (uint32_t)src2; \ - if (tmp != (type)tmp) { \ - SET_QC(); \ - dest = 0; \ - } else { \ - dest = tmp; \ - }} while(0) -#define NEON_FN(dest, src1, src2) NEON_USAT(dest, src1, src2, uint8_t) -NEON_VOP_ENV(qsub_u8, neon_u8, 4) -#undef NEON_FN -#define NEON_FN(dest, src1, src2) NEON_USAT(dest, src1, src2, uint16_t) -NEON_VOP_ENV(qsub_u16, neon_u16, 2) -#undef NEON_FN -#undef NEON_USAT - -uint32_t HELPER(neon_qsub_u32)(CPUARMState *env, uint32_t a, uint32_t b) -{ - uint32_t res = a - b; - if (res > a) { - SET_QC(); - res = 0; - } - return res; -} - -uint64_t HELPER(neon_qsub_u64)(CPUARMState *env, uint64_t src1, uint64_t src2) -{ - uint64_t res; - - if (src1 < src2) { - SET_QC(); - res = 0; - } else { - res = src1 - src2; - } - return res; -} - -#define NEON_SSAT(dest, src1, src2, type) do { \ - int32_t tmp = (uint32_t)src1 - (uint32_t)src2; \ - if (tmp != (type)tmp) { \ - SET_QC(); \ - if (src2 < 0) { \ - tmp = (1 << (sizeof(type) * 8 - 1)) - 1; \ - } else { \ - tmp = 1 << (sizeof(type) * 8 - 1); \ - } \ - } \ - dest = tmp; \ - } while(0) -#define NEON_FN(dest, src1, src2) NEON_SSAT(dest, src1, src2, int8_t) -NEON_VOP_ENV(qsub_s8, neon_s8, 4) -#undef NEON_FN -#define 
NEON_FN(dest, src1, src2) NEON_SSAT(dest, src1, src2, int16_t) -NEON_VOP_ENV(qsub_s16, neon_s16, 2) -#undef NEON_FN -#undef NEON_SSAT - -uint32_t HELPER(neon_qsub_s32)(CPUARMState *env, uint32_t a, uint32_t b) -{ - uint32_t res = a - b; - if (((res ^ a) & SIGNBIT) && ((a ^ b) & SIGNBIT)) { - SET_QC(); - res = ~(((int32_t)a >> 31) ^ SIGNBIT); - } - return res; -} - -uint64_t HELPER(neon_qsub_s64)(CPUARMState *env, uint64_t src1, uint64_t src2) -{ - uint64_t res; - - res = src1 - src2; - if (((res ^ src1) & SIGNBIT64) && ((src1 ^ src2) & SIGNBIT64)) { - SET_QC(); - res = ((int64_t)src1 >> 63) ^ ~SIGNBIT64; - } - return res; -} - #define NEON_FN(dest, src1, src2) dest = (src1 + src2) >> 1 NEON_VOP(hadd_s8, neon_s8, 4) NEON_VOP(hadd_u8, neon_u8, 4) diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 781b224972..ca7ba6b1e8 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -9291,21 +9291,28 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u, * or scalar-three-reg-same groups. */ TCGCond cond; + TCGv_i64 qc; switch (opcode) { case 0x1: /* SQADD */ + qc = tcg_temp_new_i64(); + tcg_gen_ld_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); if (u) { - gen_helper_neon_qadd_u64(tcg_rd, tcg_env, tcg_rn, tcg_rm); + gen_uqadd_d(tcg_rd, qc, tcg_rn, tcg_rm); } else { - gen_helper_neon_qadd_s64(tcg_rd, tcg_env, tcg_rn, tcg_rm); + gen_sqadd_d(tcg_rd, qc, tcg_rn, tcg_rm); } + tcg_gen_st_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); break; case 0x5: /* SQSUB */ + qc = tcg_temp_new_i64(); + tcg_gen_ld_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); if (u) { - gen_helper_neon_qsub_u64(tcg_rd, tcg_env, tcg_rn, tcg_rm); + gen_uqsub_d(tcg_rd, qc, tcg_rn, tcg_rm); } else { - gen_helper_neon_qsub_s64(tcg_rd, tcg_env, tcg_rn, tcg_rm); + gen_sqsub_d(tcg_rd, qc, tcg_rn, tcg_rm); } + tcg_gen_st_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); break; case 0x6: /* CMGT, CMHI */ cond = u ? 
TCG_COND_GTU : TCG_COND_GT; @@ -9425,35 +9432,16 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) * OPTME: special-purpose helpers would avoid doing some * unnecessary work in the helper for the 8 and 16 bit cases. */ - NeonGenTwoOpEnvFn *genenvfn; - TCGv_i32 tcg_rn = tcg_temp_new_i32(); - TCGv_i32 tcg_rm = tcg_temp_new_i32(); - TCGv_i32 tcg_rd32 = tcg_temp_new_i32(); - - read_vec_element_i32(s, tcg_rn, rn, 0, size); - read_vec_element_i32(s, tcg_rm, rm, 0, size); + NeonGenTwoOpEnvFn *genenvfn = NULL; + void (*genfn)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64, MemOp) = NULL; switch (opcode) { case 0x1: /* SQADD, UQADD */ - { - static NeonGenTwoOpEnvFn * const fns[3][2] = { - { gen_helper_neon_qadd_s8, gen_helper_neon_qadd_u8 }, - { gen_helper_neon_qadd_s16, gen_helper_neon_qadd_u16 }, - { gen_helper_neon_qadd_s32, gen_helper_neon_qadd_u32 }, - }; - genenvfn = fns[size][u]; + genfn = u ? gen_uqadd_bhs : gen_sqadd_bhs; break; - } case 0x5: /* SQSUB, UQSUB */ - { - static NeonGenTwoOpEnvFn * const fns[3][2] = { - { gen_helper_neon_qsub_s8, gen_helper_neon_qsub_u8 }, - { gen_helper_neon_qsub_s16, gen_helper_neon_qsub_u16 }, - { gen_helper_neon_qsub_s32, gen_helper_neon_qsub_u32 }, - }; - genenvfn = fns[size][u]; + genfn = u ? 
gen_uqsub_bhs : gen_sqsub_bhs; break; - } case 0x9: /* SQSHL, UQSHL */ { static NeonGenTwoOpEnvFn * const fns[3][2] = { @@ -9488,8 +9476,29 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) g_assert_not_reached(); } - genenvfn(tcg_rd32, tcg_env, tcg_rn, tcg_rm); - tcg_gen_extu_i32_i64(tcg_rd, tcg_rd32); + if (genenvfn) { + TCGv_i32 tcg_rn = tcg_temp_new_i32(); + TCGv_i32 tcg_rm = tcg_temp_new_i32(); + + read_vec_element_i32(s, tcg_rn, rn, 0, size); + read_vec_element_i32(s, tcg_rm, rm, 0, size); + genenvfn(tcg_rn, tcg_env, tcg_rn, tcg_rm); + tcg_gen_extu_i32_i64(tcg_rd, tcg_rn); + } else { + TCGv_i64 tcg_rn = tcg_temp_new_i64(); + TCGv_i64 tcg_rm = tcg_temp_new_i64(); + TCGv_i64 qc = tcg_temp_new_i64(); + + read_vec_element(s, tcg_rn, rn, 0, size | (u ? 0 : MO_SIGN)); + read_vec_element(s, tcg_rm, rm, 0, size | (u ? 0 : MO_SIGN)); + tcg_gen_ld_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); + genfn(tcg_rd, qc, tcg_rn, tcg_rm, size); + tcg_gen_st_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); + if (!u) { + /* Truncate signed 64-bit result for writeback. 
*/ + tcg_gen_ext_i64(tcg_rd, tcg_rd, size); + } + } } write_fp_dreg(s, rd, tcg_rd); From patchwork Fri May 24 23:20:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673855 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F275FC25B74 for ; Fri, 24 May 2024 23:33:16 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeIX-0000rm-OJ; Fri, 24 May 2024 19:26:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeHe-0007G0-OR for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:19 -0400 Received: from mail-pf1-x42e.google.com ([2607:f8b0:4864:20::42e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHU-0006fh-WD for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:07 -0400 Received: by mail-pf1-x42e.google.com with SMTP id d2e1a72fcca58-6f8e9878514so1236316b3a.1 for ; Fri, 24 May 2024 16:25:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716593100; x=1717197900; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=iMbUPJ6N6KonzyEnUgiB4FTCa95bEc5bnf6ZFlswq0s=; b=LCdx84KyL6SmzYqzVuo6ync8mYX/+Qp4MrwZW2kR1Pz5/fgxOyZrB4xeahSRVWaJdd sflG45CHz2eJvR3VnqzdSRnDgbnxn4LDdF7uBXdz4FCEhDcNR2ZzK7GCvLLGUMgA3503 /RrorrBxL4EN5Nu5g90rVj6VEW76hv3kd8izaqC2DLufAGnsUBO3sI0bgKkdJkxd8Coi 
Hyxrtb6adqo+eUd22IPJiGTExtmmyYdteP/31mBilwBR5H7ZURIqoioJyKrsgiG6yW+h gayc8O/wTfGIq/+9/9wi9xF381zS+6ti1DdKwrJxv5qnRFI0DzoLTweZ+Mr+TGKRRe44 byQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716593100; x=1717197900; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=iMbUPJ6N6KonzyEnUgiB4FTCa95bEc5bnf6ZFlswq0s=; b=k1IjZFkEu67gMuQE5clF55CLYPDG9Cyu0eBHt9Zz7aiuiP8JoqqagB+M0yDCUqZqfI vpfyXhD2vaGfDCUw5OGcR4+OFobJMtA4cG8mOlP+ogaS6SUdl0SIbDWWS86ssqEExQ1O LTgimdluzCqmB1KPYg5PMKvPEWJ2EKJbUtPGWS05E7b8vYMmMzQ9o0qAijjMYtgm9Lt0 RbLUsomF9hp/ZrVvNIWOgclNe/ZKnBrYzKX2Tf9tt1FHyUA+qdgyLFs/szf2mF3fIMRJ GbsY4YbPZhbNW/X6H0DECtDEiiJfSV1xSzFPE7a8UAL2oZIlOyNXNP6nU7fEVosoh2Yq z4UA== X-Gm-Message-State: AOJu0YyOmtrD+adOJEN3VsLaEbwTREojcEwOAOaPSNAwO5CupKn6rAfh 6mY78DMVhu/PtMyxqF5qE917I3XfdDxjPaIDJR+QFtsvONh2+k/dyIQrYOOyhoj/kGSiRKyxley z X-Google-Smtp-Source: AGHT+IG+jAK8+PEkVWtZ161ZHAiiFQMDhewmtPVC4rN7YcZQ2kegiSlp8xDZd3IwDpRaFGISxiCZZg== X-Received: by 2002:a05:6a20:3242:b0:1b1:f321:47ff with SMTP id adf61e73a8af0-1b212d0a7d4mr3518315637.17.1716593099148; Fri, 24 May 2024 16:24:59 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.24.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:24:58 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 41/67] target/arm: Convert SQADD, SQSUB, UQADD, UQSUB to decodetree Date: Fri, 24 May 2024 16:20:55 -0700 Message-Id: <20240524232121.284515-42-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42e; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, T_SPF_HELO_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/a64.decode | 11 ++++ target/arm/tcg/translate-a64.c | 100 +++++++++++++++++++-------------- 2 files changed, 68 insertions(+), 43 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index f48adef5bb..19010af03b 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -44,6 +44,7 @@ @rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1 @rrr_sd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_sd @rrr_hsd ........ ... rm:5 ...... 
rn:5 rd:5 &rrr_e esz=%esz_hsd +@rrr_e ........ esz:2 . rm:5 ...... rn:5 rd:5 &rrr_e @rrx_h ........ .. .. rm:4 .... . . rn:5 rd:5 &rrx_e esz=1 idx=%hlm @rrx_s ........ .. . rm:5 .... . . rn:5 rd:5 &rrx_e esz=2 idx=%hl @@ -744,6 +745,11 @@ FRECPS_s 0101 1110 0.1 ..... 11111 1 ..... ..... @rrr_sd FRSQRTS_s 0101 1110 110 ..... 00111 1 ..... ..... @rrr_h FRSQRTS_s 0101 1110 1.1 ..... 11111 1 ..... ..... @rrr_sd +SQADD_s 0101 1110 ..1 ..... 00001 1 ..... ..... @rrr_e +UQADD_s 0111 1110 ..1 ..... 00001 1 ..... ..... @rrr_e +SQSUB_s 0101 1110 ..1 ..... 00101 1 ..... ..... @rrr_e +UQSUB_s 0111 1110 ..1 ..... 00101 1 ..... ..... @rrr_e + ### Advanced SIMD scalar pairwise FADDP_s 0101 1110 0011 0000 1101 10 ..... ..... @rr_h @@ -857,6 +863,11 @@ BSL_v 0.10 1110 011 ..... 00011 1 ..... ..... @qrrr_b BIT_v 0.10 1110 101 ..... 00011 1 ..... ..... @qrrr_b BIF_v 0.10 1110 111 ..... 00011 1 ..... ..... @qrrr_b +SQADD_v 0.00 1110 ..1 ..... 00001 1 ..... ..... @qrrr_e +UQADD_v 0.10 1110 ..1 ..... 00001 1 ..... ..... @qrrr_e +SQSUB_v 0.00 1110 ..1 ..... 00101 1 ..... ..... @qrrr_e +UQSUB_v 0.10 1110 ..1 ..... 00101 1 ..... ..... @qrrr_e + ### Advanced SIMD scalar x indexed element FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... 
@rrx_h diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index ca7ba6b1e8..2f7298811d 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5060,6 +5060,43 @@ static const FPScalar f_scalar_frsqrts = { }; TRANS(FRSQRTS_s, do_fp3_scalar, a, &f_scalar_frsqrts) +static bool do_satacc_s(DisasContext *s, arg_rrr_e *a, + MemOp sgn_n, MemOp sgn_m, + void (*gen_bhs)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64, MemOp), + void (*gen_d)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64)) +{ + TCGv_i64 t0, t1, t2, qc; + MemOp esz = a->esz; + + if (!fp_access_check(s)) { + return true; + } + + t0 = tcg_temp_new_i64(); + t1 = tcg_temp_new_i64(); + t2 = tcg_temp_new_i64(); + qc = tcg_temp_new_i64(); + read_vec_element(s, t1, a->rn, 0, esz | sgn_n); + read_vec_element(s, t2, a->rm, 0, esz | sgn_m); + tcg_gen_ld_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); + + if (esz == MO_64) { + gen_d(t0, qc, t1, t2); + } else { + gen_bhs(t0, qc, t1, t2, esz); + tcg_gen_ext_i64(t0, t0, esz); + } + + write_fp_dreg(s, a->rd, t0); + tcg_gen_st_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); + return true; +} + +TRANS(SQADD_s, do_satacc_s, a, MO_SIGN, MO_SIGN, gen_sqadd_bhs, gen_sqadd_d) +TRANS(SQSUB_s, do_satacc_s, a, MO_SIGN, MO_SIGN, gen_sqsub_bhs, gen_sqsub_d) +TRANS(UQADD_s, do_satacc_s, a, 0, 0, gen_uqadd_bhs, gen_uqadd_d) +TRANS(UQSUB_s, do_satacc_s, a, 0, 0, gen_uqsub_bhs, gen_uqsub_d) + static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, gen_helper_gvec_3_ptr * const fns[3]) { @@ -5298,6 +5335,11 @@ TRANS(BSL_v, do_bitsel, a->q, a->rd, a->rd, a->rn, a->rm) TRANS(BIT_v, do_bitsel, a->q, a->rd, a->rm, a->rn, a->rd) TRANS(BIF_v, do_bitsel, a->q, a->rd, a->rm, a->rd, a->rn) +TRANS(SQADD_v, do_gvec_fn3, a, gen_gvec_sqadd_qc) +TRANS(UQADD_v, do_gvec_fn3, a, gen_gvec_uqadd_qc) +TRANS(SQSUB_v, do_gvec_fn3, a, gen_gvec_sqsub_qc) +TRANS(UQSUB_v, do_gvec_fn3, a, gen_gvec_uqsub_qc) + /* * Advanced SIMD scalar/vector x indexed element */ @@ -9291,29 
+9333,8 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u, * or scalar-three-reg-same groups. */ TCGCond cond; - TCGv_i64 qc; switch (opcode) { - case 0x1: /* SQADD */ - qc = tcg_temp_new_i64(); - tcg_gen_ld_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); - if (u) { - gen_uqadd_d(tcg_rd, qc, tcg_rn, tcg_rm); - } else { - gen_sqadd_d(tcg_rd, qc, tcg_rn, tcg_rm); - } - tcg_gen_st_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); - break; - case 0x5: /* SQSUB */ - qc = tcg_temp_new_i64(); - tcg_gen_ld_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); - if (u) { - gen_uqsub_d(tcg_rd, qc, tcg_rn, tcg_rm); - } else { - gen_sqsub_d(tcg_rd, qc, tcg_rn, tcg_rm); - } - tcg_gen_st_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); - break; case 0x6: /* CMGT, CMHI */ cond = u ? TCG_COND_GTU : TCG_COND_GT; do_cmop: @@ -9366,6 +9387,8 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u, } break; default: + case 0x1: /* SQADD / UQADD */ + case 0x5: /* SQSUB / UQSUB */ g_assert_not_reached(); } } @@ -9387,8 +9410,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) TCGv_i64 tcg_rd; switch (opcode) { - case 0x1: /* SQADD, UQADD */ - case 0x5: /* SQSUB, UQSUB */ case 0x9: /* SQSHL, UQSHL */ case 0xb: /* SQRSHL, UQRSHL */ break; @@ -9410,6 +9431,8 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) } break; default: + case 0x1: /* SQADD, UQADD */ + case 0x5: /* SQSUB, UQSUB */ unallocated_encoding(s); return; } @@ -9436,12 +9459,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) void (*genfn)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64, MemOp) = NULL; switch (opcode) { - case 0x1: /* SQADD, UQADD */ - genfn = u ? gen_uqadd_bhs : gen_sqadd_bhs; - break; - case 0x5: /* SQSUB, UQSUB */ - genfn = u ? 
gen_uqsub_bhs : gen_sqsub_bhs; - break; case 0x9: /* SQSHL, UQSHL */ { static NeonGenTwoOpEnvFn * const fns[3][2] = { @@ -9473,6 +9490,8 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) break; } default: + case 0x1: /* SQADD, UQADD */ + case 0x5: /* SQSUB, UQSUB */ g_assert_not_reached(); } @@ -10933,6 +10952,11 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) return; } break; + + case 0x01: /* SQADD, UQADD */ + case 0x05: /* SQSUB, UQSUB */ + unallocated_encoding(s); + return; } if (!fp_access_check(s)) { @@ -10940,20 +10964,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { - case 0x01: /* SQADD, UQADD */ - if (u) { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqadd_qc, size); - } else { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqadd_qc, size); - } - return; - case 0x05: /* SQSUB, UQSUB */ - if (u) { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqsub_qc, size); - } else { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqsub_qc, size); - } - return; case 0x08: /* SSHL, USHL */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size); @@ -11038,6 +11048,10 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) vec_full_reg_offset(s, rm), is_q ? 
16 : 8, vec_full_reg_size(s)); return; + + case 0x01: /* SQADD, UQADD */ + case 0x05: /* SQSUB, UQSUB */ + g_assert_not_reached(); } if (size == 3) { From patchwork Fri May 24 23:20:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673821 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8E346C25B7A for ; Fri, 24 May 2024 23:28:06 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeI7-00081j-AR; Fri, 24 May 2024 19:25:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeHj-0007Jd-UC for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:19 -0400 Received: from mail-pf1-x435.google.com ([2607:f8b0:4864:20::435]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHY-0006g0-3Z for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:13 -0400 Received: by mail-pf1-x435.google.com with SMTP id d2e1a72fcca58-6f4603237e0so2616720b3a.0 for ; Fri, 24 May 2024 16:25:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716593101; x=1717197901; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=7DNXHK5xqHy7EXmKEeWHyhfM7wr361NDLjffbThLrQE=; b=Nn/qN8AFMuiqv0jOct0ln2BCI6RV2vISDczcfTCmRPUhPXG8I4ZOo2cyte6JI9ndZP Jmllxm8e+7uWHgDFPseowIXXmSsTJnVqpkDy0rbBUyZwpU1bbwQIsMwvga24/ZRVINsG 
H4ZUkKBKqceIbK/9md/PtvhMyJYFbGpIhp4TX197rilRPwxkRKMGqUy3ZbD9Mp8iCAId DWwBLLUhJQRt3HkuW4aJJucvvvC5xZGgjTQD0j3YBKI3gS99/Tj8LRKDNGIqupLSLxFZ OF9Umdp/pXJR+LNiAy+nmyS1foXLZuvrIU78kqjO6lgPq2eCMkau2gwnqRRkBo+OA11v vZ8A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716593101; x=1717197901; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7DNXHK5xqHy7EXmKEeWHyhfM7wr361NDLjffbThLrQE=; b=FDY7a+9C4kqOX5DWHffJZqFgMhl3qtSV9HrPTkUN7UsowOn7ZgA5EaDNXPmFj30l3T eQRiZbH1VP56Ltxe9d7NHvzVARXdjCdZg9OXRY59TCuLvIEADfvVXoJPxc8ysBiz84IH 6+7orHVZt8n2m4170VO/xsAQbCnJMFryO5GnIVkCCLY3tCIkPYu/3VW+tmsbSk2iFeqy 9LZ+lSq3pi6M9rzB7lmJ5ToJ7yaGpD2D+k31Reloys+Y+HjtiS5N5L0kaTK8dBPiWksa QnAKILeVFttUKd+MueOqcwQNZEhT2vIzPv5qCKxSLWNupsCDSlf94/i0mEPy3HYKorqC 5LLg== X-Gm-Message-State: AOJu0YwiYsZhdd+MPR0GHoUz2cUUgda0qFZ5d9PNqawmEV4MT5nYpSj8 Xo8D6oV7gayU1FiBBIfE5dHLZvTQVQ7b9zdisJPTH7TRdNgb/96puFLJENAJ/kiCJx2JJPql9kG e X-Google-Smtp-Source: AGHT+IF7Ea+l3S5Z94+W2hzYuZ46cZwpdU0sk+fAdA1oeBPkne92CjK9MCJvglufOjOdWvQeqB09AA== X-Received: by 2002:a05:6a00:7c8:b0:6f4:74b8:3d57 with SMTP id d2e1a72fcca58-6f7726eb231mr6708525b3a.7.1716593100890; Fri, 24 May 2024 16:25:00 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.24.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:00 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 42/67] target/arm: Convert SUQADD, USQADD to decodetree Date: Fri, 24 May 2024 16:20:56 -0700 Message-Id: <20240524232121.284515-43-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::435; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x435.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org These are faux 2-operand instructions, reading from rd. Sort them next to the other three-operand same insns for clarity. Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/a64.decode | 8 +++++ target/arm/tcg/translate-a64.c | 64 ++++------------------------------ 2 files changed, 14 insertions(+), 58 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 19010af03b..7c350ba833 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -45,6 +45,7 @@ @rrr_sd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_sd @rrr_hsd ........ ... 
rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_hsd @rrr_e ........ esz:2 . rm:5 ...... rn:5 rd:5 &rrr_e +@r2r_e ........ esz:2 . ..... ...... rm:5 rd:5 &rrr_e rn=%rd @rrx_h ........ .. .. rm:4 .... . . rn:5 rd:5 &rrx_e esz=1 idx=%hlm @rrx_s ........ .. . rm:5 .... . . rn:5 rd:5 &rrx_e esz=2 idx=%hl @@ -60,6 +61,7 @@ @qrrr_h . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=1 @qrrr_sd . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=%esz_sd @qrrr_e . q:1 ...... esz:2 . rm:5 ...... rn:5 rd:5 &qrrr_e +@qr2r_e . q:1 ...... esz:2 . ..... ...... rm:5 rd:5 &qrrr_e rn=%rd @qrrx_h . q:1 .. .... .. .. rm:4 .... . . rn:5 rd:5 \ &qrrx_e esz=1 idx=%hlm @@ -750,6 +752,9 @@ UQADD_s 0111 1110 ..1 ..... 00001 1 ..... ..... @rrr_e SQSUB_s 0101 1110 ..1 ..... 00101 1 ..... ..... @rrr_e UQSUB_s 0111 1110 ..1 ..... 00101 1 ..... ..... @rrr_e +SUQADD_s 0101 1110 ..1 00000 00111 0 ..... ..... @r2r_e +USQADD_s 0111 1110 ..1 00000 00111 0 ..... ..... @r2r_e + ### Advanced SIMD scalar pairwise FADDP_s 0101 1110 0011 0000 1101 10 ..... ..... @rr_h @@ -868,6 +873,9 @@ UQADD_v 0.10 1110 ..1 ..... 00001 1 ..... ..... @qrrr_e SQSUB_v 0.00 1110 ..1 ..... 00101 1 ..... ..... @qrrr_e UQSUB_v 0.10 1110 ..1 ..... 00101 1 ..... ..... @qrrr_e +SUQADD_v 0.00 1110 ..1 00000 00111 0 ..... ..... @qr2r_e +USQADD_v 0.10 1110 ..1 00000 00111 0 ..... ..... @qr2r_e + ### Advanced SIMD scalar x indexed element FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... 
@rrx_h diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 2f7298811d..fbcf18f92a 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5096,6 +5096,8 @@ TRANS(SQADD_s, do_satacc_s, a, MO_SIGN, MO_SIGN, gen_sqadd_bhs, gen_sqadd_d) TRANS(SQSUB_s, do_satacc_s, a, MO_SIGN, MO_SIGN, gen_sqsub_bhs, gen_sqsub_d) TRANS(UQADD_s, do_satacc_s, a, 0, 0, gen_uqadd_bhs, gen_uqadd_d) TRANS(UQSUB_s, do_satacc_s, a, 0, 0, gen_uqsub_bhs, gen_uqsub_d) +TRANS(SUQADD_s, do_satacc_s, a, MO_SIGN, 0, gen_suqadd_bhs, gen_suqadd_d) +TRANS(USQADD_s, do_satacc_s, a, 0, MO_SIGN, gen_usqadd_bhs, gen_usqadd_d) static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, gen_helper_gvec_3_ptr * const fns[3]) @@ -5339,6 +5341,8 @@ TRANS(SQADD_v, do_gvec_fn3, a, gen_gvec_sqadd_qc) TRANS(UQADD_v, do_gvec_fn3, a, gen_gvec_uqadd_qc) TRANS(SQSUB_v, do_gvec_fn3, a, gen_gvec_sqsub_qc) TRANS(UQSUB_v, do_gvec_fn3, a, gen_gvec_uqsub_qc) +TRANS(SUQADD_v, do_gvec_fn3, a, gen_gvec_suqadd_qc) +TRANS(USQADD_v, do_gvec_fn3, a, gen_gvec_usqadd_qc) /* * Advanced SIMD scalar/vector x indexed element @@ -10009,48 +10013,6 @@ static void handle_2misc_narrow(DisasContext *s, bool scalar, clear_vec_high(s, is_q, rd); } -/* Remaining saturating accumulating ops */ -static void handle_2misc_satacc(DisasContext *s, bool is_scalar, bool is_u, - bool is_q, unsigned size, int rn, int rd) -{ - TCGv_i64 res, qc, a, b; - - if (!is_scalar) { - gen_gvec_fn3(s, is_q, rd, rd, rn, - is_u ? gen_gvec_usqadd_qc : gen_gvec_suqadd_qc, size); - return; - } - - res = tcg_temp_new_i64(); - qc = tcg_temp_new_i64(); - a = tcg_temp_new_i64(); - b = tcg_temp_new_i64(); - - /* Read and extend scalar inputs to 64-bits. */ - read_vec_element(s, a, rd, 0, size | (is_u ? 0 : MO_SIGN)); - read_vec_element(s, b, rn, 0, size | (is_u ? 
MO_SIGN : 0)); - tcg_gen_ld_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); - - if (size == MO_64) { - if (is_u) { - gen_usqadd_d(res, qc, a, b); - } else { - gen_suqadd_d(res, qc, a, b); - } - } else { - if (is_u) { - gen_usqadd_bhs(res, qc, a, b, size); - } else { - gen_suqadd_bhs(res, qc, a, b, size); - /* Truncate signed 64-bit result for writeback. */ - tcg_gen_ext_i64(res, res, size); - } - } - - write_fp_dreg(s, rd, res); - tcg_gen_st_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); -} - /* AdvSIMD scalar two reg misc * 31 30 29 28 24 23 22 21 17 16 12 11 10 9 5 4 0 * +-----+---+-----------+------+-----------+--------+-----+------+------+ @@ -10070,12 +10032,6 @@ static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn) TCGv_ptr tcg_fpstatus; switch (opcode) { - case 0x3: /* USQADD / SUQADD*/ - if (!fp_access_check(s)) { - return; - } - handle_2misc_satacc(s, true, u, false, size, rn, rd); - return; case 0x7: /* SQABS / SQNEG */ break; case 0xa: /* CMLT */ @@ -10175,6 +10131,7 @@ static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn) } break; default: + case 0x3: /* USQADD / SUQADD */ unallocated_encoding(s); return; } @@ -11666,16 +11623,6 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) return; } break; - case 0x3: /* SUQADD, USQADD */ - if (size == 3 && !is_q) { - unallocated_encoding(s); - return; - } - if (!fp_access_check(s)) { - return; - } - handle_2misc_satacc(s, false, u, is_q, size, rn, rd); - return; case 0x7: /* SQABS, SQNEG */ if (size == 3 && !is_q) { unallocated_encoding(s); @@ -11850,6 +11797,7 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) break; } default: + case 0x3: /* SUQADD, USQADD */ unallocated_encoding(s); return; } From patchwork Fri May 24 23:20:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673829 Return-Path: X-Spam-Checker-Version: 
SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5867BC25B7D for ; Fri, 24 May 2024 23:29:43 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeI1-0007aI-7W; Fri, 24 May 2024 19:25:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeHh-0007JK-M3 for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:19 -0400 Received: from mail-pg1-x536.google.com ([2607:f8b0:4864:20::536]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHX-0006gJ-Mb for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:12 -0400 Received: by mail-pg1-x536.google.com with SMTP id 41be03b00d2f7-68197edc2d3so1122292a12.2 for ; Fri, 24 May 2024 16:25:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716593102; x=1717197902; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=zpw8kDFCAlCVbW/nJ/zuK4e2XGEhGwmFrTcaIxs0eIY=; b=goO1vQoldrYIxVZ3GwUsot/AQX1gb5UNQqYRFkZSzcAILXU+ymXemmeXnXBsKOCGFb E7axhtYUtnCKDSy7FQ3N20Vjlroln2PJxW9Fdhi5lFEkW9FEz/q9xJhaXJtfWZ21thjT 0l81npRK5BkS5ZrkFWO0gDKyJAEZBzONNvVFg15ewFadYd7D9d7Gm86L0IIkANTkfV5V 4JFjrsTMLA6xYxEL7NzyuUFh3OpbjXtyXtNpaV9ndP59UYNWqdeqG2Zp09C8Yr1cy65u qcnV0Ik+nhpHxu1xy0pfVa+Twzk58+p+nWXbCgCed/ofCg4u4W/G3zhhBUh3RlNMdNaC ZVVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716593102; x=1717197902; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=zpw8kDFCAlCVbW/nJ/zuK4e2XGEhGwmFrTcaIxs0eIY=; b=ON+qYDoj4XhVbhEPKdPK+ITiPIf2joYB2a7yKVQ+RPBrO9R2y2jrnhlrMSlVGo1sAL AfvIgywPgb7rCJS5HbkUymyQniN/n2oNZbONjmJnUgEia3GAsydIVlc/wAPZreh1v0w0 DOhJVygX+PkB0GfPm8HWBFdtoB2jERqznOOxEstmkY6tRCelNPeEYSfctnr6EoomEfBa xXygLBsEHrWMMipQoNG4s+2UrbcoDW7DXuSeRWz5bWOWHVXQHOXZ3z1xr5cxX3wdNFNh uFwe1pAh+5CCa3HPU+U01HhNBrzh8EZTiogXx62/M9hni8xayNceEVmc71LmpAMezoyb lYGg== X-Gm-Message-State: AOJu0YwesYxCb6baospYLVZmr7KdAPPHk2v957/bWG5ak/wKb7AjWb9R nXzwKYxw3GkVZ/Cgoj/YvqBfoRIg6XGOeKtL9xSI7dgcSOaOOxwu4I+JrGLy6UxZ8qRJckKX/0c e X-Google-Smtp-Source: AGHT+IHefDozUoaLc+XIn70+nA5+nkPto5wJuA2ngHAtTwHxcNvosOR0iq7A13S9O+ITvQs5mdHuOw== X-Received: by 2002:a05:6a20:5601:b0:1af:a5af:f945 with SMTP id adf61e73a8af0-1b212e338e9mr3888963637.34.1716593101655; Fri, 24 May 2024 16:25:01 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. [174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:01 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 43/67] target/arm: Convert SSHL, USHL to decodetree Date: Fri, 24 May 2024 16:20:57 -0700 Message-Id: <20240524232121.284515-44-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::536; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x536.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 
autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/a64.decode | 7 ++++++ target/arm/tcg/translate-a64.c | 40 +++++++++++++++++++++------------- 2 files changed, 32 insertions(+), 15 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 7c350ba833..ea897d6732 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -42,6 +42,7 @@ @rr_sd ........ ... ..... ...... rn:5 rd:5 &rr_e esz=%esz_sd @rrr_h ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=1 +@rrr_d ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=3 @rrr_sd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_sd @rrr_hsd ........ ... rm:5 ...... rn:5 rd:5 &rrr_e esz=%esz_hsd @rrr_e ........ esz:2 . rm:5 ...... rn:5 rd:5 &rrr_e @@ -755,6 +756,9 @@ UQSUB_s 0111 1110 ..1 ..... 00101 1 ..... ..... @rrr_e SUQADD_s 0101 1110 ..1 00000 00111 0 ..... ..... @r2r_e USQADD_s 0111 1110 ..1 00000 00111 0 ..... ..... @r2r_e +SSHL_s 0101 1110 111 ..... 01000 1 ..... ..... @rrr_d +USHL_s 0111 1110 111 ..... 01000 1 ..... ..... @rrr_d + ### Advanced SIMD scalar pairwise FADDP_s 0101 1110 0011 0000 1101 10 ..... ..... @rr_h @@ -876,6 +880,9 @@ UQSUB_v 0.10 1110 ..1 ..... 00101 1 ..... ..... @qrrr_e SUQADD_v 0.00 1110 ..1 00000 00111 0 ..... ..... @qr2r_e USQADD_v 0.10 1110 ..1 00000 00111 0 ..... ..... @qr2r_e +SSHL_v 0.00 1110 ..1 ..... 01000 1 ..... ..... @qrrr_e +USHL_v 0.10 1110 ..1 ..... 01000 1 ..... ..... @qrrr_e + ### Advanced SIMD scalar x indexed element FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... 
@rrx_h diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index fbcf18f92a..8d39a9663e 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5099,6 +5099,24 @@ TRANS(UQSUB_s, do_satacc_s, a, 0, 0, gen_uqsub_bhs, gen_uqsub_d) TRANS(SUQADD_s, do_satacc_s, a, MO_SIGN, 0, gen_suqadd_bhs, gen_suqadd_d) TRANS(USQADD_s, do_satacc_s, a, 0, MO_SIGN, gen_usqadd_bhs, gen_usqadd_d) +static bool do_int3_scalar_d(DisasContext *s, arg_rrr_e *a, + void (*fn)(TCGv_i64, TCGv_i64, TCGv_i64)) +{ + if (fp_access_check(s)) { + TCGv_i64 t0 = tcg_temp_new_i64(); + TCGv_i64 t1 = tcg_temp_new_i64(); + + read_vec_element(s, t0, a->rn, 0, MO_64); + read_vec_element(s, t1, a->rm, 0, MO_64); + fn(t0, t0, t1); + write_fp_dreg(s, a->rd, t0); + } + return true; +} + +TRANS(SSHL_s, do_int3_scalar_d, a, gen_sshl_i64) +TRANS(USHL_s, do_int3_scalar_d, a, gen_ushl_i64) + static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, gen_helper_gvec_3_ptr * const fns[3]) { @@ -5344,6 +5362,10 @@ TRANS(UQSUB_v, do_gvec_fn3, a, gen_gvec_uqsub_qc) TRANS(SUQADD_v, do_gvec_fn3, a, gen_gvec_suqadd_qc) TRANS(USQADD_v, do_gvec_fn3, a, gen_gvec_usqadd_qc) +TRANS(SSHL_v, do_gvec_fn3, a, gen_gvec_sshl) +TRANS(USHL_v, do_gvec_fn3, a, gen_gvec_ushl) + + /* * Advanced SIMD scalar/vector x indexed element */ @@ -9355,13 +9377,6 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u, } gen_cmtst_i64(tcg_rd, tcg_rn, tcg_rm); break; - case 0x8: /* SSHL, USHL */ - if (u) { - gen_ushl_i64(tcg_rd, tcg_rn, tcg_rm); - } else { - gen_sshl_i64(tcg_rd, tcg_rn, tcg_rm); - } - break; case 0x9: /* SQSHL, UQSHL */ if (u) { gen_helper_neon_qshl_u64(tcg_rd, tcg_env, tcg_rn, tcg_rm); @@ -9393,6 +9408,7 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u, default: case 0x1: /* SQADD / UQADD */ case 0x5: /* SQSUB / UQSUB */ + case 0x8: /* SSHL, USHL */ g_assert_not_reached(); } } @@ -9417,7 +9433,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext 
*s, uint32_t insn) case 0x9: /* SQSHL, UQSHL */ case 0xb: /* SQRSHL, UQRSHL */ break; - case 0x8: /* SSHL, USHL */ case 0xa: /* SRSHL, URSHL */ case 0x6: /* CMGT, CMHI */ case 0x7: /* CMGE, CMHS */ @@ -9437,6 +9452,7 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) default: case 0x1: /* SQADD, UQADD */ case 0x5: /* SQSUB, UQSUB */ + case 0x8: /* SSHL, USHL */ unallocated_encoding(s); return; } @@ -10921,13 +10937,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { - case 0x08: /* SSHL, USHL */ - if (u) { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size); - } else { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sshl, size); - } - return; case 0x0c: /* SMAX, UMAX */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size); @@ -11008,6 +11017,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) case 0x01: /* SQADD, UQADD */ case 0x05: /* SQSUB, UQSUB */ + case 0x08: /* SSHL, USHL */ g_assert_not_reached(); }
From patchwork Fri May 24 23:20:58 2024 X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673837 From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 44/67] target/arm: Convert SRSHL and URSHL (register) to gvec Date: Fri, 24 May 2024 16:20:58 -0700 Message-Id: <20240524232121.284515-45-richard.henderson@linaro.org> In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org>
Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 +++++++++ target/arm/tcg/translate.h | 4 ++++ target/arm/tcg/neon-dp.decode | 10 ++------- target/arm/tcg/gengvec.c | 22 +++++++++++++++++++ target/arm/tcg/neon_helper.c | 38 ++++++++++++++++++++++++++++++++- target/arm/tcg/translate-a64.c | 17 ++++++--------- target/arm/tcg/translate-neon.c | 6 ++---- 7 files changed, 84 insertions(+), 23
deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index a14c040451..25eb7bf5df 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -327,6 +327,16 @@ DEF_HELPER_3(neon_qrshl_s32, i32, env, i32, i32) DEF_HELPER_3(neon_qrshl_u64, i64, env, i64, i64) DEF_HELPER_3(neon_qrshl_s64, i64, env, i64, i64) +DEF_HELPER_FLAGS_4(gvec_srshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_srshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_srshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_srshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_urshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_urshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_urshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_urshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_2(neon_add_u8, i32, i32, i32) DEF_HELPER_2(neon_add_u16, i32, i32, i32) DEF_HELPER_2(neon_sub_u8, i32, i32, i32) diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h index 87439dcc61..ea63ffc47b 100644 --- a/target/arm/tcg/translate.h +++ b/target/arm/tcg/translate.h @@ -459,6 +459,10 @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_srshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_urshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); diff --git a/target/arm/tcg/neon-dp.decode b/target/arm/tcg/neon-dp.decode index fd3a01bfa0..8525c65c0d 100644 --- 
a/target/arm/tcg/neon-dp.decode +++ b/target/arm/tcg/neon-dp.decode @@ -117,14 +117,8 @@ VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev VQSHL_U64_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev VQSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_rev } -{ - VRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev - VRSHL_S_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_rev -} -{ - VRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_64_rev - VRSHL_U_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_rev -} +VRSHL_S_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_rev +VRSHL_U_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_rev { VQRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev VQRSHL_S_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_rev diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c index 66a514ba86..d9a9132722 100644 --- a/target/arm/tcg/gengvec.c +++ b/target/arm/tcg/gengvec.c @@ -1217,6 +1217,28 @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } +void gen_gvec_srshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3 * const fns[] = { + gen_helper_gvec_srshl_b, gen_helper_gvec_srshl_h, + gen_helper_gvec_srshl_s, gen_helper_gvec_srshl_d, + }; + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]); +} + +void gen_gvec_urshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3 * const fns[] = { + gen_helper_gvec_urshl_b, gen_helper_gvec_urshl_h, + gen_helper_gvec_urshl_s, gen_helper_gvec_urshl_d, + }; + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]); +} + 
void gen_uqadd_bhs(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b, MemOp esz) { uint64_t max = MAKE_64BIT_MASK(0, 8 << esz); diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c index 0af15e9f6e..516ecc1dcb 100644 --- a/target/arm/tcg/neon_helper.c +++ b/target/arm/tcg/neon_helper.c @@ -6,10 +6,11 @@ * * This code is licensed under the GNU GPL v2. */ -#include "qemu/osdep.h" +#include "qemu/osdep.h" #include "cpu.h" #include "exec/helper-proto.h" +#include "tcg/tcg-gvec-desc.h" #include "fpu/softfloat.h" #include "vec_internal.h" @@ -117,6 +118,17 @@ NEON_VOP_BODY(vtype, n) uint32_t HELPER(glue(neon_,name))(CPUARMState *env, uint32_t arg1, uint32_t arg2) \ NEON_VOP_BODY(vtype, n) +#define NEON_GVEC_VOP2(name, vtype) \ +void HELPER(name)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + vtype *d = vd, *n = vn, *m = vm; \ + for (i = 0; i < opr_sz / sizeof(vtype); i++) { \ + NEON_FN(d[i], n[i], m[i]); \ + } \ + clear_tail(d, opr_sz, simd_maxsz(desc)); \ +} + /* Pairwise operations. */ /* For 32-bit elements each segment only contains a single element, so the elementwise and pairwise operations are the same. 
*/ @@ -263,11 +275,23 @@ NEON_VOP(shl_s16, neon_s16, 2) #define NEON_FN(dest, src1, src2) \ (dest = do_sqrshl_bhs(src1, (int8_t)src2, 8, true, NULL)) NEON_VOP(rshl_s8, neon_s8, 4) +NEON_GVEC_VOP2(gvec_srshl_b, int8_t) #undef NEON_FN #define NEON_FN(dest, src1, src2) \ (dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, true, NULL)) NEON_VOP(rshl_s16, neon_s16, 2) +NEON_GVEC_VOP2(gvec_srshl_h, int16_t) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 32, true, NULL)) +NEON_GVEC_VOP2(gvec_srshl_s, int32_t) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_d(src1, (int8_t)src2, true, NULL)) +NEON_GVEC_VOP2(gvec_srshl_d, int64_t) #undef NEON_FN uint32_t HELPER(neon_rshl_s32)(uint32_t val, uint32_t shift) @@ -283,11 +307,23 @@ uint64_t HELPER(neon_rshl_s64)(uint64_t val, uint64_t shift) #define NEON_FN(dest, src1, src2) \ (dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, true, NULL)) NEON_VOP(rshl_u8, neon_u8, 4) +NEON_GVEC_VOP2(gvec_urshl_b, uint8_t) #undef NEON_FN #define NEON_FN(dest, src1, src2) \ (dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, true, NULL)) NEON_VOP(rshl_u16, neon_u16, 2) +NEON_GVEC_VOP2(gvec_urshl_h, uint16_t) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 32, true, NULL)) +NEON_GVEC_VOP2(gvec_urshl_s, int32_t) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_d(src1, (int8_t)src2, true, NULL)) +NEON_GVEC_VOP2(gvec_urshl_d, int64_t) #undef NEON_FN uint32_t HELPER(neon_rshl_u32)(uint32_t val, uint32_t shift) diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 8d39a9663e..2dffda36a8 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -10937,6 +10937,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { + case 0x0a: /* SRSHL, URSHL */ + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_urshl, size); + } else { + 
gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_srshl, size); + } + return; case 0x0c: /* SMAX, UMAX */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size); @@ -11087,16 +11094,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) genenvfn = fns[size][u]; break; } - case 0xa: /* SRSHL, URSHL */ - { - static NeonGenTwoOpFn * const fns[3][2] = { - { gen_helper_neon_rshl_s8, gen_helper_neon_rshl_u8 }, - { gen_helper_neon_rshl_s16, gen_helper_neon_rshl_u16 }, - { gen_helper_neon_rshl_s32, gen_helper_neon_rshl_u32 }, - }; - genfn = fns[size][u]; - break; - } case 0xb: /* SQRSHL, UQRSHL */ { static NeonGenTwoOpEnvFn * const fns[3][2] = { diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c index 18b048611b..337488bbf1 100644 --- a/target/arm/tcg/translate-neon.c +++ b/target/arm/tcg/translate-neon.c @@ -794,6 +794,8 @@ DO_3SAME(VQADD_S, gen_gvec_sqadd_qc) DO_3SAME(VQADD_U, gen_gvec_uqadd_qc) DO_3SAME(VQSUB_S, gen_gvec_sqsub_qc) DO_3SAME(VQSUB_U, gen_gvec_uqsub_qc) +DO_3SAME(VRSHL_S, gen_gvec_srshl) +DO_3SAME(VRSHL_U, gen_gvec_urshl) /* These insns are all gvec_bitsel but with the inputs in various orders. 
*/ #define DO_3SAME_BITSEL(INSN, O1, O2, O3) \ @@ -929,8 +931,6 @@ DO_SHA2(SHA256SU1, gen_helper_crypto_sha256su1) } \ DO_3SAME_64(INSN, gen_##INSN##_elt) -DO_3SAME_64(VRSHL_S64, gen_helper_neon_rshl_s64) -DO_3SAME_64(VRSHL_U64, gen_helper_neon_rshl_u64) DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64) DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64) DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64) @@ -999,8 +999,6 @@ DO_3SAME_32(VHSUB_S, hsub_s) DO_3SAME_32(VHSUB_U, hsub_u) DO_3SAME_32(VRHADD_S, rhadd_s) DO_3SAME_32(VRHADD_U, rhadd_u) -DO_3SAME_32(VRSHL_S, rshl_s) -DO_3SAME_32(VRSHL_U, rshl_u) DO_3SAME_32_ENV(VQSHL_S, qshl_s) DO_3SAME_32_ENV(VQSHL_U, qshl_u)
From patchwork Fri May 24 23:20:59 2024 X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673836 From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 45/67] target/arm: Convert SRSHL, URSHL to decodetree Date: Fri, 24 May 2024 16:20:59 -0700 Message-Id: <20240524232121.284515-46-richard.henderson@linaro.org> In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org>
Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/a64.decode | 4 ++++ target/arm/tcg/translate-a64.c | 22 +++++++--------------- 2 files changed, 11 insertions(+), 15 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index ea897d6732..9e02776036 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -758,6 +758,8 @@ USQADD_s 0111 1110 ..1 00000 00111 0 ..... ..... @r2r_e SSHL_s 0101 1110 111 ..... 01000 1 ..... ..... @rrr_d USHL_s 0111 1110 111 ..... 01000 1 ..... ..... @rrr_d +SRSHL_s 0101 1110 111 ..... 01010 1 ..... .....
@rrr_d +URSHL_s 0111 1110 111 ..... 01010 1 ..... ..... @rrr_d ### Advanced SIMD scalar pairwise @@ -882,6 +884,8 @@ USQADD_v 0.10 1110 ..1 00000 00111 0 ..... ..... @qr2r_e SSHL_v 0.00 1110 ..1 ..... 01000 1 ..... ..... @qrrr_e USHL_v 0.10 1110 ..1 ..... 01000 1 ..... ..... @qrrr_e +SRSHL_v 0.00 1110 ..1 ..... 01010 1 ..... ..... @qrrr_e +URSHL_v 0.10 1110 ..1 ..... 01010 1 ..... ..... @qrrr_e ### Advanced SIMD scalar x indexed element diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 2dffda36a8..24f2025997 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5116,6 +5116,8 @@ static bool do_int3_scalar_d(DisasContext *s, arg_rrr_e *a, TRANS(SSHL_s, do_int3_scalar_d, a, gen_sshl_i64) TRANS(USHL_s, do_int3_scalar_d, a, gen_ushl_i64) +TRANS(SRSHL_s, do_int3_scalar_d, a, gen_helper_neon_rshl_s64) +TRANS(URSHL_s, do_int3_scalar_d, a, gen_helper_neon_rshl_u64) static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, gen_helper_gvec_3_ptr * const fns[3]) @@ -5364,6 +5366,8 @@ TRANS(USQADD_v, do_gvec_fn3, a, gen_gvec_usqadd_qc) TRANS(SSHL_v, do_gvec_fn3, a, gen_gvec_sshl) TRANS(USHL_v, do_gvec_fn3, a, gen_gvec_ushl) +TRANS(SRSHL_v, do_gvec_fn3, a, gen_gvec_srshl) +TRANS(URSHL_v, do_gvec_fn3, a, gen_gvec_urshl) /* @@ -9384,13 +9388,6 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u, gen_helper_neon_qshl_s64(tcg_rd, tcg_env, tcg_rn, tcg_rm); } break; - case 0xa: /* SRSHL, URSHL */ - if (u) { - gen_helper_neon_rshl_u64(tcg_rd, tcg_rn, tcg_rm); - } else { - gen_helper_neon_rshl_s64(tcg_rd, tcg_rn, tcg_rm); - } - break; case 0xb: /* SQRSHL, UQRSHL */ if (u) { gen_helper_neon_qrshl_u64(tcg_rd, tcg_env, tcg_rn, tcg_rm); @@ -9409,6 +9406,7 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u, case 0x1: /* SQADD / UQADD */ case 0x5: /* SQSUB / UQSUB */ case 0x8: /* SSHL, USHL */ + case 0xa: /* SRSHL, URSHL */ g_assert_not_reached(); } } @@ -9433,7 +9431,6 @@ static void 
disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) case 0x9: /* SQSHL, UQSHL */ case 0xb: /* SQRSHL, UQRSHL */ break; - case 0xa: /* SRSHL, URSHL */ case 0x6: /* CMGT, CMHI */ case 0x7: /* CMGE, CMHS */ case 0x11: /* CMTST, CMEQ */ @@ -9453,6 +9450,7 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) case 0x1: /* SQADD, UQADD */ case 0x5: /* SQSUB, UQSUB */ case 0x8: /* SSHL, USHL */ + case 0xa: /* SRSHL, URSHL */ unallocated_encoding(s); return; } @@ -10937,13 +10935,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { - case 0x0a: /* SRSHL, URSHL */ - if (u) { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_urshl, size); - } else { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_srshl, size); - } - return; case 0x0c: /* SMAX, UMAX */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size); @@ -11025,6 +11016,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) case 0x01: /* SQADD, UQADD */ case 0x05: /* SQSUB, UQSUB */ case 0x08: /* SSHL, USHL */ + case 0x0a: /* SRSHL, URSHL */ g_assert_not_reached(); }
From patchwork Fri May 24 23:21:00 2024 X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673843 From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 46/67] target/arm: Convert SQSHL and UQSHL (register) to gvec Date: Fri, 24 May 2024 16:21:00 -0700 Message-Id: <20240524232121.284515-47-richard.henderson@linaro.org> In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org>
Signed-off-by: Richard Henderson --- target/arm/helper.h | 8 ++++++++ target/arm/tcg/translate.h | 4 ++++ target/arm/tcg/neon-dp.decode | 10 ++------- target/arm/tcg/gengvec.c | 24 ++++++++++++++++++++++ target/arm/tcg/neon_helper.c | 36 +++++++++++++++++++++++++++++++++ target/arm/tcg/translate-a64.c
| 17 +++++++--------- target/arm/tcg/translate-neon.c | 6 ++---- 7 files changed, 83 insertions(+), 22 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 25eb7bf5df..f345087ddb 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -326,6 +326,14 @@ DEF_HELPER_3(neon_qrshl_u32, i32, env, i32, i32) DEF_HELPER_3(neon_qrshl_s32, i32, env, i32, i32) DEF_HELPER_3(neon_qrshl_u64, i64, env, i64, i64) DEF_HELPER_3(neon_qrshl_s64, i64, env, i64, i64) +DEF_HELPER_FLAGS_5(neon_sqshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_uqshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_uqshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_uqshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_uqshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_srshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_srshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h index ea63ffc47b..6c6d4d49e7 100644 --- a/target/arm/tcg/translate.h +++ b/target/arm/tcg/translate.h @@ -463,6 +463,10 @@ void gen_gvec_srshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); void gen_gvec_urshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_neon_sqshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_neon_uqshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); void gen_cmtst_i64(TCGv_i64 d, 
TCGv_i64 a, TCGv_i64 b); void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); diff --git a/target/arm/tcg/neon-dp.decode b/target/arm/tcg/neon-dp.decode index 8525c65c0d..6d4996b8d8 100644 --- a/target/arm/tcg/neon-dp.decode +++ b/target/arm/tcg/neon-dp.decode @@ -109,14 +109,8 @@ VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev @3same_64_rev .... ... . . . 11 .... .... .... . q:1 . . .... \ &3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3 -{ - VQSHL_S64_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev - VQSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_rev -} -{ - VQSHL_U64_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_64_rev - VQSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_rev -} +VQSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_rev +VQSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_rev VRSHL_S_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_rev VRSHL_U_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... 
@3same_rev { diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c index d9a9132722..773dbf41d3 100644 --- a/target/arm/tcg/gengvec.c +++ b/target/arm/tcg/gengvec.c @@ -1239,6 +1239,30 @@ void gen_gvec_urshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]); } +void gen_neon_sqshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3_ptr * const fns[] = { + gen_helper_neon_sqshl_b, gen_helper_neon_sqshl_h, + gen_helper_neon_sqshl_s, gen_helper_neon_sqshl_d, + }; + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, tcg_env, + opr_sz, max_sz, 0, fns[vece]); +} + +void gen_neon_uqshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3_ptr * const fns[] = { + gen_helper_neon_uqshl_b, gen_helper_neon_uqshl_h, + gen_helper_neon_uqshl_s, gen_helper_neon_uqshl_d, + }; + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, tcg_env, + opr_sz, max_sz, 0, fns[vece]); +} + void gen_uqadd_bhs(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b, MemOp esz) { uint64_t max = MAKE_64BIT_MASK(0, 8 << esz); diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c index 516ecc1dcb..88301f0dcb 100644 --- a/target/arm/tcg/neon_helper.c +++ b/target/arm/tcg/neon_helper.c @@ -129,6 +129,18 @@ void HELPER(name)(void *vd, void *vn, void *vm, uint32_t desc) \ clear_tail(d, opr_sz, simd_maxsz(desc)); \ } +#define NEON_GVEC_VOP2_ENV(name, vtype) \ +void HELPER(name)(void *vd, void *vn, void *vm, void *venv, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + vtype *d = vd, *n = vn, *m = vm; \ + CPUARMState *env = venv; \ + for (i = 0; i < opr_sz / sizeof(vtype); i++) { \ + NEON_FN(d[i], n[i], m[i]); \ + } \ + clear_tail(d, opr_sz, simd_maxsz(desc)); \ +} + /* Pairwise 
operations. */ /* For 32-bit elements each segment only contains a single element, so the elementwise and pairwise operations are the same. */ @@ -339,11 +351,23 @@ uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shift) #define NEON_FN(dest, src1, src2) \ (dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc)) NEON_VOP_ENV(qshl_u8, neon_u8, 4) +NEON_GVEC_VOP2_ENV(neon_uqshl_b, uint8_t) #undef NEON_FN #define NEON_FN(dest, src1, src2) \ (dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc)) NEON_VOP_ENV(qshl_u16, neon_u16, 2) +NEON_GVEC_VOP2_ENV(neon_uqshl_h, uint16_t) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 32, false, env->vfp.qc)) +NEON_GVEC_VOP2_ENV(neon_uqshl_s, uint32_t) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_d(src1, (int8_t)src2, false, env->vfp.qc)) +NEON_GVEC_VOP2_ENV(neon_uqshl_d, uint64_t) #undef NEON_FN uint32_t HELPER(neon_qshl_u32)(CPUARMState *env, uint32_t val, uint32_t shift) @@ -359,11 +383,23 @@ uint64_t HELPER(neon_qshl_u64)(CPUARMState *env, uint64_t val, uint64_t shift) #define NEON_FN(dest, src1, src2) \ (dest = do_sqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc)) NEON_VOP_ENV(qshl_s8, neon_s8, 4) +NEON_GVEC_VOP2_ENV(neon_sqshl_b, int8_t) #undef NEON_FN #define NEON_FN(dest, src1, src2) \ (dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc)) NEON_VOP_ENV(qshl_s16, neon_s16, 2) +NEON_GVEC_VOP2_ENV(neon_sqshl_h, int16_t) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 32, false, env->vfp.qc)) +NEON_GVEC_VOP2_ENV(neon_sqshl_s, int32_t) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_d(src1, (int8_t)src2, false, env->vfp.qc)) +NEON_GVEC_VOP2_ENV(neon_sqshl_d, int64_t) #undef NEON_FN uint32_t HELPER(neon_qshl_s32)(CPUARMState *env, uint32_t val, uint32_t shift) diff --git a/target/arm/tcg/translate-a64.c 
b/target/arm/tcg/translate-a64.c index 24f2025997..50b653bb4d 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -10935,6 +10935,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { + case 0x09: /* SQSHL, UQSHL */ + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_neon_uqshl, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_neon_sqshl, size); + } + return; case 0x0c: /* SMAX, UMAX */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size); @@ -11076,16 +11083,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) genfn = fns[size][u]; break; } - case 0x9: /* SQSHL, UQSHL */ - { - static NeonGenTwoOpEnvFn * const fns[3][2] = { - { gen_helper_neon_qshl_s8, gen_helper_neon_qshl_u8 }, - { gen_helper_neon_qshl_s16, gen_helper_neon_qshl_u16 }, - { gen_helper_neon_qshl_s32, gen_helper_neon_qshl_u32 }, - }; - genenvfn = fns[size][u]; - break; - } case 0xb: /* SQRSHL, UQRSHL */ { static NeonGenTwoOpEnvFn * const fns[3][2] = { diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c index 337488bbf1..a3eec47092 100644 --- a/target/arm/tcg/translate-neon.c +++ b/target/arm/tcg/translate-neon.c @@ -796,6 +796,8 @@ DO_3SAME(VQSUB_S, gen_gvec_sqsub_qc) DO_3SAME(VQSUB_U, gen_gvec_uqsub_qc) DO_3SAME(VRSHL_S, gen_gvec_srshl) DO_3SAME(VRSHL_U, gen_gvec_urshl) +DO_3SAME(VQSHL_S, gen_neon_sqshl) +DO_3SAME(VQSHL_U, gen_neon_uqshl) /* These insns are all gvec_bitsel but with the inputs in various orders. 
*/ #define DO_3SAME_BITSEL(INSN, O1, O2, O3) \ @@ -931,8 +933,6 @@ DO_SHA2(SHA256SU1, gen_helper_crypto_sha256su1) } \ DO_3SAME_64(INSN, gen_##INSN##_elt) -DO_3SAME_64_ENV(VQSHL_S64, gen_helper_neon_qshl_s64) -DO_3SAME_64_ENV(VQSHL_U64, gen_helper_neon_qshl_u64) DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64) DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64) @@ -1000,8 +1000,6 @@ DO_3SAME_32(VHSUB_U, hsub_u) DO_3SAME_32(VRHADD_S, rhadd_s) DO_3SAME_32(VRHADD_U, rhadd_u) -DO_3SAME_32_ENV(VQSHL_S, qshl_s) -DO_3SAME_32_ENV(VQSHL_U, qshl_u) DO_3SAME_32_ENV(VQRSHL_S, qrshl_s) DO_3SAME_32_ENV(VQRSHL_U, qrshl_u) From patchwork Fri May 24 23:21:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673848 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 255AFC25B74 for ; Fri, 24 May 2024 23:32:12 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeIc-0001VM-GT; Fri, 24 May 2024 19:26:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeHt-0007Od-UY for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:27 -0400 Received: from mail-pf1-x429.google.com ([2607:f8b0:4864:20::429]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHb-0006rz-HS for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:22 -0400 Received: by mail-pf1-x429.google.com with SMTP id d2e1a72fcca58-6f8e98760fcso1242486b3a.1 for ; Fri, 24 May 2024 16:25:06 -0700 (PDT) 
Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:04 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 47/67] target/arm: Convert SQSHL, UQSHL to decodetree Date: Fri, 24 May 2024 16:21:01 -0700 Message-Id: <20240524232121.284515-48-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::429; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/a64.decode | 4 ++ target/arm/tcg/translate-a64.c | 74 ++++++++++++++++++++++------------ 2 files changed, 53 insertions(+), 25 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 9e02776036..85caf37948 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -760,6 +760,8 @@ SSHL_s 0101 1110 111 ..... 01000 1 ..... ..... @rrr_d USHL_s 0111 1110 111 ..... 01000 1 ..... ..... @rrr_d SRSHL_s 0101 1110 111 ..... 01010 1 ..... ..... @rrr_d URSHL_s 0111 1110 111 ..... 01010 1 ..... 
..... @rrr_d +SQSHL_s 0101 1110 ..1 ..... 01001 1 ..... ..... @rrr_e +UQSHL_s 0111 1110 ..1 ..... 01001 1 ..... ..... @rrr_e ### Advanced SIMD scalar pairwise @@ -886,6 +888,8 @@ SSHL_v 0.00 1110 ..1 ..... 01000 1 ..... ..... @qrrr_e USHL_v 0.10 1110 ..1 ..... 01000 1 ..... ..... @qrrr_e SRSHL_v 0.00 1110 ..1 ..... 01010 1 ..... ..... @qrrr_e URSHL_v 0.10 1110 ..1 ..... 01010 1 ..... ..... @qrrr_e +SQSHL_v 0.00 1110 ..1 ..... 01001 1 ..... ..... @qrrr_e +UQSHL_v 0.10 1110 ..1 ..... 01001 1 ..... ..... @qrrr_e ### Advanced SIMD scalar x indexed element diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 50b653bb4d..f8d2760bea 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5119,6 +5119,49 @@ TRANS(USHL_s, do_int3_scalar_d, a, gen_ushl_i64) TRANS(SRSHL_s, do_int3_scalar_d, a, gen_helper_neon_rshl_s64) TRANS(URSHL_s, do_int3_scalar_d, a, gen_helper_neon_rshl_u64) +typedef struct ENVScalar2 { + NeonGenTwoOpEnvFn *gen_bhs[3]; + NeonGenTwo64OpEnvFn *gen_d; +} ENVScalar2; + +static bool do_env_scalar2(DisasContext *s, arg_rrr_e *a, const ENVScalar2 *f) +{ + if (!fp_access_check(s)) { + return true; + } + if (a->esz == MO_64) { + TCGv_i64 t0 = read_fp_dreg(s, a->rn); + TCGv_i64 t1 = read_fp_dreg(s, a->rm); + f->gen_d(t0, tcg_env, t0, t1); + write_fp_dreg(s, a->rd, t0); + } else { + TCGv_i32 t0 = tcg_temp_new_i32(); + TCGv_i32 t1 = tcg_temp_new_i32(); + + read_vec_element_i32(s, t0, a->rn, 0, a->esz); + read_vec_element_i32(s, t1, a->rm, 0, a->esz); + f->gen_bhs[a->esz](t0, tcg_env, t0, t1); + write_fp_sreg(s, a->rd, t0); + } + return true; +} + +static const ENVScalar2 f_scalar_sqshl = { + { gen_helper_neon_qshl_s8, + gen_helper_neon_qshl_s16, + gen_helper_neon_qshl_s32 }, + gen_helper_neon_qshl_s64, +}; +TRANS(SQSHL_s, do_env_scalar2, a, &f_scalar_sqshl) + +static const ENVScalar2 f_scalar_uqshl = { + { gen_helper_neon_qshl_u8, + gen_helper_neon_qshl_u16, + gen_helper_neon_qshl_u32 }, + 
gen_helper_neon_qshl_u64, +}; +TRANS(UQSHL_s, do_env_scalar2, a, &f_scalar_uqshl) + static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, gen_helper_gvec_3_ptr * const fns[3]) { @@ -5368,6 +5411,8 @@ TRANS(SSHL_v, do_gvec_fn3, a, gen_gvec_sshl) TRANS(USHL_v, do_gvec_fn3, a, gen_gvec_ushl) TRANS(SRSHL_v, do_gvec_fn3, a, gen_gvec_srshl) TRANS(URSHL_v, do_gvec_fn3, a, gen_gvec_urshl) +TRANS(SQSHL_v, do_gvec_fn3, a, gen_neon_sqshl) +TRANS(UQSHL_v, do_gvec_fn3, a, gen_neon_uqshl) /* @@ -9381,13 +9426,6 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u, } gen_cmtst_i64(tcg_rd, tcg_rn, tcg_rm); break; - case 0x9: /* SQSHL, UQSHL */ - if (u) { - gen_helper_neon_qshl_u64(tcg_rd, tcg_env, tcg_rn, tcg_rm); - } else { - gen_helper_neon_qshl_s64(tcg_rd, tcg_env, tcg_rn, tcg_rm); - } - break; case 0xb: /* SQRSHL, UQRSHL */ if (u) { gen_helper_neon_qrshl_u64(tcg_rd, tcg_env, tcg_rn, tcg_rm); @@ -9406,6 +9444,7 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u, case 0x1: /* SQADD / UQADD */ case 0x5: /* SQSUB / UQSUB */ case 0x8: /* SSHL, USHL */ + case 0x9: /* SQSHL, UQSHL */ case 0xa: /* SRSHL, URSHL */ g_assert_not_reached(); } @@ -9428,7 +9467,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) TCGv_i64 tcg_rd; switch (opcode) { - case 0x9: /* SQSHL, UQSHL */ case 0xb: /* SQRSHL, UQRSHL */ break; case 0x6: /* CMGT, CMHI */ @@ -9450,6 +9488,7 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) case 0x1: /* SQADD, UQADD */ case 0x5: /* SQSUB, UQSUB */ case 0x8: /* SSHL, USHL */ + case 0x9: /* SQSHL, UQSHL */ case 0xa: /* SRSHL, URSHL */ unallocated_encoding(s); return; @@ -9477,16 +9516,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) void (*genfn)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64, MemOp) = NULL; switch (opcode) { - case 0x9: /* SQSHL, UQSHL */ - { - static NeonGenTwoOpEnvFn * const fns[3][2] = { - { gen_helper_neon_qshl_s8, 
gen_helper_neon_qshl_u8 }, - { gen_helper_neon_qshl_s16, gen_helper_neon_qshl_u16 }, - { gen_helper_neon_qshl_s32, gen_helper_neon_qshl_u32 }, - }; - genenvfn = fns[size][u]; - break; - } case 0xb: /* SQRSHL, UQRSHL */ { static NeonGenTwoOpEnvFn * const fns[3][2] = { @@ -9510,6 +9539,7 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) default: case 0x1: /* SQADD, UQADD */ case 0x5: /* SQSUB, UQSUB */ + case 0x9: /* SQSHL, UQSHL */ g_assert_not_reached(); } @@ -10935,13 +10965,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { - case 0x09: /* SQSHL, UQSHL */ - if (u) { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_neon_uqshl, size); - } else { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_neon_sqshl, size); - } - return; case 0x0c: /* SMAX, UMAX */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size); @@ -11023,6 +11046,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) case 0x01: /* SQADD, UQADD */ case 0x05: /* SQSUB, UQSUB */ case 0x08: /* SSHL, USHL */ + case 0x09: /* SQSHL, UQSHL */ case 0x0a: /* SRSHL, URSHL */ g_assert_not_reached(); } From patchwork Fri May 24 23:21:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673812 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 24DBCC25B74 for ; Fri, 24 May 2024 23:27:23 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeIf-0001r8-0w; Fri, 24 May 2024 19:26:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeHt-0007OT-S7 for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:28 -0400 Received: from mail-pf1-x42e.google.com ([2607:f8b0:4864:20::42e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHd-0006s8-2v for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:21 -0400 Received: by mail-pf1-x42e.google.com with SMTP id d2e1a72fcca58-6f6a045d476so3611093b3a.1 for ; Fri, 24 May 2024 16:25:06 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. [174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:05 -0700 (PDT)
From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 48/67] target/arm: Convert SQRSHL and UQRSHL (register) to gvec Date: Fri, 24 May 2024 16:21:02 -0700 Message-Id: <20240524232121.284515-49-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42e; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper.h | 8 ++++++ target/arm/tcg/translate.h | 4 +++ target/arm/tcg/neon-dp.decode | 17 ++---------- target/arm/tcg/gengvec.c | 24 
++++++++++++++++ target/arm/tcg/neon_helper.c | 24 ++++++++++++++++ target/arm/tcg/translate-a64.c | 17 +++++------- target/arm/tcg/translate-neon.c | 49 ++------------------------------- 7 files changed, 71 insertions(+), 72 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index f345087ddb..9a89c9cea7 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -334,6 +334,14 @@ DEF_HELPER_FLAGS_5(neon_uqshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(neon_uqshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(neon_uqshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(neon_uqshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqrshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqrshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqrshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqrshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_uqrshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_uqrshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_uqrshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_uqrshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_srshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_srshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h index 6c6d4d49e7..048cb45ebe 100644 --- a/target/arm/tcg/translate.h +++ b/target/arm/tcg/translate.h @@ -467,6 +467,10 @@ void gen_neon_sqshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); void gen_neon_uqshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void 
gen_neon_sqrshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_neon_uqrshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); diff --git a/target/arm/tcg/neon-dp.decode b/target/arm/tcg/neon-dp.decode index 6d4996b8d8..788578c8fa 100644 --- a/target/arm/tcg/neon-dp.decode +++ b/target/arm/tcg/neon-dp.decode @@ -102,25 +102,12 @@ VCGE_U_3s 1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same VSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev - -# Insns operating on 64-bit elements (size!=0b11 handled elsewhere) -# The _rev suffix indicates that Vn and Vm are reversed (as explained -# by the comment for the @3same_rev format). -@3same_64_rev .... ... . . . 11 .... .... .... . q:1 . . .... \ - &3same vm=%vn_dp vn=%vm_dp vd=%vd_dp size=3 - VQSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_rev VQSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 1 .... @3same_rev VRSHL_S_3s 1111 001 0 0 . .. .... .... 0101 . . . 0 .... @3same_rev VRSHL_U_3s 1111 001 1 0 . .. .... .... 0101 . . . 0 .... @3same_rev -{ - VQRSHL_S64_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev - VQRSHL_S_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_rev -} -{ - VQRSHL_U64_3s 1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_64_rev - VQRSHL_U_3s 1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_rev -} +VQRSHL_S_3s 1111 001 0 0 . .. .... .... 0101 . . . 1 .... @3same_rev +VQRSHL_U_3s 1111 001 1 0 . .. .... .... 0101 . . . 1 .... @3same_rev VMAX_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same VMAX_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 0 .... 
@3same diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c index 773dbf41d3..51e66ccf5f 100644 --- a/target/arm/tcg/gengvec.c +++ b/target/arm/tcg/gengvec.c @@ -1263,6 +1263,30 @@ void gen_neon_uqshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, opr_sz, max_sz, 0, fns[vece]); } +void gen_neon_sqrshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3_ptr * const fns[] = { + gen_helper_neon_sqrshl_b, gen_helper_neon_sqrshl_h, + gen_helper_neon_sqrshl_s, gen_helper_neon_sqrshl_d, + }; + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, tcg_env, + opr_sz, max_sz, 0, fns[vece]); +} + +void gen_neon_uqrshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3_ptr * const fns[] = { + gen_helper_neon_uqrshl_b, gen_helper_neon_uqrshl_h, + gen_helper_neon_uqrshl_s, gen_helper_neon_uqrshl_d, + }; + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, tcg_env, + opr_sz, max_sz, 0, fns[vece]); +} + void gen_uqadd_bhs(TCGv_i64 res, TCGv_i64 qc, TCGv_i64 a, TCGv_i64 b, MemOp esz) { uint64_t max = MAKE_64BIT_MASK(0, 8 << esz); diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c index 88301f0dcb..b29a7c725f 100644 --- a/target/arm/tcg/neon_helper.c +++ b/target/arm/tcg/neon_helper.c @@ -435,11 +435,23 @@ uint64_t HELPER(neon_qshlu_s64)(CPUARMState *env, uint64_t val, uint64_t shift) #define NEON_FN(dest, src1, src2) \ (dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_u8, neon_u8, 4) +NEON_GVEC_VOP2_ENV(neon_uqrshl_b, uint8_t) #undef NEON_FN #define NEON_FN(dest, src1, src2) \ (dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_u16, neon_u16, 2) +NEON_GVEC_VOP2_ENV(neon_uqrshl_h, uint16_t) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, 
(int8_t)src2, 32, true, env->vfp.qc)) +NEON_GVEC_VOP2_ENV(neon_uqrshl_s, uint32_t) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_d(src1, (int8_t)src2, true, env->vfp.qc)) +NEON_GVEC_VOP2_ENV(neon_uqrshl_d, uint64_t) #undef NEON_FN uint32_t HELPER(neon_qrshl_u32)(CPUARMState *env, uint32_t val, uint32_t shift) @@ -455,11 +467,23 @@ uint64_t HELPER(neon_qrshl_u64)(CPUARMState *env, uint64_t val, uint64_t shift) #define NEON_FN(dest, src1, src2) \ (dest = do_sqrshl_bhs(src1, (int8_t)src2, 8, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_s8, neon_s8, 4) +NEON_GVEC_VOP2_ENV(neon_sqrshl_b, int8_t) #undef NEON_FN #define NEON_FN(dest, src1, src2) \ (dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_s16, neon_s16, 2) +NEON_GVEC_VOP2_ENV(neon_sqrshl_h, int16_t) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 32, true, env->vfp.qc)) +NEON_GVEC_VOP2_ENV(neon_sqrshl_s, int32_t) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_d(src1, (int8_t)src2, true, env->vfp.qc)) +NEON_GVEC_VOP2_ENV(neon_sqrshl_d, int64_t) #undef NEON_FN uint32_t HELPER(neon_qrshl_s32)(CPUARMState *env, uint32_t val, uint32_t shift) diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index f8d2760bea..b0004e2c6f 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -10965,6 +10965,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { + case 0x0b: /* SQRSHL, UQRSHL */ + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_neon_uqrshl, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_neon_sqrshl, size); + } + return; case 0x0c: /* SMAX, UMAX */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size); @@ -11107,16 +11114,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) genfn = fns[size][u]; break; } - case 0xb: /* SQRSHL, UQRSHL */ - { - static 
NeonGenTwoOpEnvFn * const fns[3][2] = { - { gen_helper_neon_qrshl_s8, gen_helper_neon_qrshl_u8 }, - { gen_helper_neon_qrshl_s16, gen_helper_neon_qrshl_u16 }, - { gen_helper_neon_qrshl_s32, gen_helper_neon_qrshl_u32 }, - }; - genenvfn = fns[size][u]; - break; - } default: g_assert_not_reached(); } diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c index a3eec47092..5f1576393e 100644 --- a/target/arm/tcg/translate-neon.c +++ b/target/arm/tcg/translate-neon.c @@ -798,6 +798,8 @@ DO_3SAME(VRSHL_S, gen_gvec_srshl) DO_3SAME(VRSHL_U, gen_gvec_urshl) DO_3SAME(VQSHL_S, gen_neon_sqshl) DO_3SAME(VQSHL_U, gen_neon_uqshl) +DO_3SAME(VQRSHL_S, gen_neon_sqrshl) +DO_3SAME(VQRSHL_U, gen_neon_uqrshl) /* These insns are all gvec_bitsel but with the inputs in various orders. */ #define DO_3SAME_BITSEL(INSN, O1, O2, O3) \ @@ -916,26 +918,6 @@ DO_SHA2(SHA256H, gen_helper_crypto_sha256h) DO_SHA2(SHA256H2, gen_helper_crypto_sha256h2) DO_SHA2(SHA256SU1, gen_helper_crypto_sha256su1) -#define DO_3SAME_64(INSN, FUNC) \ - static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ - uint32_t rn_ofs, uint32_t rm_ofs, \ - uint32_t oprsz, uint32_t maxsz) \ - { \ - static const GVecGen3 op = { .fni8 = FUNC }; \ - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &op); \ - } \ - DO_3SAME(INSN, gen_##INSN##_3s) - -#define DO_3SAME_64_ENV(INSN, FUNC) \ - static void gen_##INSN##_elt(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m) \ - { \ - FUNC(d, tcg_env, n, m); \ - } \ - DO_3SAME_64(INSN, gen_##INSN##_elt) - -DO_3SAME_64_ENV(VQRSHL_S64, gen_helper_neon_qrshl_s64) -DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64) - #define DO_3SAME_32(INSN, FUNC) \ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ uint32_t rn_ofs, uint32_t rm_ofs, \ @@ -969,30 +951,6 @@ DO_3SAME_64_ENV(VQRSHL_U64, gen_helper_neon_qrshl_u64) FUNC(d, tcg_env, n, m); \ } -#define DO_3SAME_32_ENV(INSN, FUNC) \ - WRAP_ENV_FN(gen_##INSN##_tramp8, gen_helper_neon_##FUNC##8); \ - 
WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##16); \ - WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##32); \ - static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ - uint32_t rn_ofs, uint32_t rm_ofs, \ - uint32_t oprsz, uint32_t maxsz) \ - { \ - static const GVecGen3 ops[4] = { \ - { .fni4 = gen_##INSN##_tramp8 }, \ - { .fni4 = gen_##INSN##_tramp16 }, \ - { .fni4 = gen_##INSN##_tramp32 }, \ - { 0 }, \ - }; \ - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \ - } \ - static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \ - { \ - if (a->size > 2) { \ - return false; \ - } \ - return do_3same(s, a, gen_##INSN##_3s); \ - } - DO_3SAME_32(VHADD_S, hadd_s) DO_3SAME_32(VHADD_U, hadd_u) DO_3SAME_32(VHSUB_S, hsub_s) @@ -1000,9 +958,6 @@ DO_3SAME_32(VHSUB_U, hsub_u) DO_3SAME_32(VRHADD_S, rhadd_s) DO_3SAME_32(VRHADD_U, rhadd_u) -DO_3SAME_32_ENV(VQRSHL_S, qrshl_s) -DO_3SAME_32_ENV(VQRSHL_U, qrshl_u) - #define DO_3SAME_VQDMULH(INSN, FUNC) \ WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16); \ WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32); \ From patchwork Fri May 24 23:21:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673853 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AE317C25B7D for ; Fri, 24 May 2024 23:32:38 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeJA-00036R-5d; Fri, 24 May 2024 19:26:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeI3-0007ks-AN for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:35 -0400 Received: from mail-pg1-x532.google.com ([2607:f8b0:4864:20::532]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHt-0006sP-CY for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:33 -0400 Received: by mail-pg1-x532.google.com with SMTP id 41be03b00d2f7-681919f89f2so1127064a12.1 for ; Fri, 24 May 2024 16:25:08 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. [174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:05 -0700 (PDT)
From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 49/67] target/arm: Convert SQRSHL, UQRSHL to decodetree Date: Fri, 24 May 2024 16:21:03 -0700 Message-Id: <20240524232121.284515-50-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::532; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x532.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/a64.decode | 4 +++ target/arm/tcg/translate-a64.c | 48 ++++++++++++++++------------------ 2 files changed, 26 insertions(+), 26 deletions(-) 
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 85caf37948..96ce35ad40 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -762,6 +762,8 @@ SRSHL_s         0101 1110 111 ..... 01010 1 ..... ..... @rrr_d
 URSHL_s         0111 1110 111 ..... 01010 1 ..... ..... @rrr_d
 SQSHL_s         0101 1110 ..1 ..... 01001 1 ..... ..... @rrr_e
 UQSHL_s         0111 1110 ..1 ..... 01001 1 ..... ..... @rrr_e
+SQRSHL_s        0101 1110 ..1 ..... 01011 1 ..... ..... @rrr_e
+UQRSHL_s        0111 1110 ..1 ..... 01011 1 ..... ..... @rrr_e
 
 ### Advanced SIMD scalar pairwise
 
@@ -890,6 +892,8 @@ SRSHL_v         0.00 1110 ..1 ..... 01010 1 ..... ..... @qrrr_e
 URSHL_v         0.10 1110 ..1 ..... 01010 1 ..... ..... @qrrr_e
 SQSHL_v         0.00 1110 ..1 ..... 01001 1 ..... ..... @qrrr_e
 UQSHL_v         0.10 1110 ..1 ..... 01001 1 ..... ..... @qrrr_e
+SQRSHL_v        0.00 1110 ..1 ..... 01011 1 ..... ..... @qrrr_e
+UQRSHL_v        0.10 1110 ..1 ..... 01011 1 ..... ..... @qrrr_e
 
 ### Advanced SIMD scalar x indexed element
 
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index b0004e2c6f..b76682cabf 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -5162,6 +5162,22 @@ static const ENVScalar2 f_scalar_uqshl = {
 };
 TRANS(UQSHL_s, do_env_scalar2, a, &f_scalar_uqshl)
 
+static const ENVScalar2 f_scalar_sqrshl = {
+    { gen_helper_neon_qrshl_s8,
+      gen_helper_neon_qrshl_s16,
+      gen_helper_neon_qrshl_s32 },
+    gen_helper_neon_qrshl_s64,
+};
+TRANS(SQRSHL_s, do_env_scalar2, a, &f_scalar_sqrshl)
+
+static const ENVScalar2 f_scalar_uqrshl = {
+    { gen_helper_neon_qrshl_u8,
+      gen_helper_neon_qrshl_u16,
+      gen_helper_neon_qrshl_u32 },
+    gen_helper_neon_qrshl_u64,
+};
+TRANS(UQRSHL_s, do_env_scalar2, a, &f_scalar_uqrshl)
+
 static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
                           gen_helper_gvec_3_ptr * const fns[3])
 {
@@ -5413,6 +5429,8 @@ TRANS(SRSHL_v, do_gvec_fn3, a, gen_gvec_srshl)
 TRANS(URSHL_v, do_gvec_fn3, a, gen_gvec_urshl)
 TRANS(SQSHL_v, do_gvec_fn3, a, gen_neon_sqshl)
 TRANS(UQSHL_v, do_gvec_fn3, a, gen_neon_uqshl)
+TRANS(SQRSHL_v, do_gvec_fn3, a, gen_neon_sqrshl)
+TRANS(UQRSHL_v, do_gvec_fn3, a, gen_neon_uqrshl)
 
 
 /*
@@ -9426,13 +9444,6 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u,
         }
         gen_cmtst_i64(tcg_rd, tcg_rn, tcg_rm);
         break;
-    case 0xb: /* SQRSHL, UQRSHL */
-        if (u) {
-            gen_helper_neon_qrshl_u64(tcg_rd, tcg_env, tcg_rn, tcg_rm);
-        } else {
-            gen_helper_neon_qrshl_s64(tcg_rd, tcg_env, tcg_rn, tcg_rm);
-        }
-        break;
     case 0x10: /* ADD, SUB */
         if (u) {
             tcg_gen_sub_i64(tcg_rd, tcg_rn, tcg_rm);
@@ -9446,6 +9457,7 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u,
     case 0x8: /* SSHL, USHL */
     case 0x9: /* SQSHL, UQSHL */
     case 0xa: /* SRSHL, URSHL */
+    case 0xb: /* SQRSHL, UQRSHL */
         g_assert_not_reached();
     }
 }
@@ -9467,8 +9479,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
     TCGv_i64 tcg_rd;
 
     switch (opcode) {
-    case 0xb: /* SQRSHL, UQRSHL */
-        break;
     case 0x6: /* CMGT, CMHI */
     case 0x7: /* CMGE, CMHS */
     case 0x11: /* CMTST, CMEQ */
@@ -9490,6 +9500,7 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
     case 0x8: /* SSHL, USHL */
     case 0x9: /* SQSHL, UQSHL */
     case 0xa: /* SRSHL, URSHL */
+    case 0xb: /* SQRSHL, UQRSHL */
         unallocated_encoding(s);
         return;
     }
@@ -9516,16 +9527,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
         void (*genfn)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64, MemOp) = NULL;
 
         switch (opcode) {
-        case 0xb: /* SQRSHL, UQRSHL */
-        {
-            static NeonGenTwoOpEnvFn * const fns[3][2] = {
-                { gen_helper_neon_qrshl_s8, gen_helper_neon_qrshl_u8 },
-                { gen_helper_neon_qrshl_s16, gen_helper_neon_qrshl_u16 },
-                { gen_helper_neon_qrshl_s32, gen_helper_neon_qrshl_u32 },
-            };
-            genenvfn = fns[size][u];
-            break;
-        }
         case 0x16: /* SQDMULH, SQRDMULH */
         {
             static NeonGenTwoOpEnvFn * const fns[2][2] = {
@@ -9540,6 +9541,7 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
         case 0x1: /* SQADD, UQADD */
         case 0x5: /* SQSUB, UQSUB */
         case 0x9: /* SQSHL, UQSHL */
+        case 0xb: /* SQRSHL, UQRSHL */
             g_assert_not_reached();
         }
 
@@ -10965,13 +10967,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
     }
 
     switch (opcode) {
-    case 0x0b: /* SQRSHL, UQRSHL */
-        if (u) {
-            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_neon_uqrshl, size);
-        } else {
-            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_neon_sqrshl, size);
-        }
-        return;
     case 0x0c: /* SMAX, UMAX */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size);
@@ -11055,6 +11050,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
     case 0x08: /* SSHL, USHL */
     case 0x09: /* SQSHL, UQSHL */
     case 0x0a: /* SRSHL, URSHL */
+    case 0x0b: /* SQRSHL, UQRSHL */
         g_assert_not_reached();
     }
 

From patchwork Fri May 24 23:21:04 2024
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673828
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 50/67] target/arm: Convert ADD, SUB (vector) to decodetree
Date: Fri, 24 May 2024 16:21:04 -0700
Message-Id: <20240524232121.284515-51-richard.henderson@linaro.org>
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/tcg/a64.decode      |  6 ++++++
 target/arm/tcg/translate-a64.c | 34 +++++++++++-----------------------
 2 files changed, 17 insertions(+), 23 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 96ce35ad40..44383b4fc7 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -765,6 +765,9 @@ UQSHL_s         0111 1110 ..1 ..... 01001 1 ..... ..... @rrr_e
 SQRSHL_s        0101 1110 ..1 ..... 01011 1 ..... ..... @rrr_e
 UQRSHL_s        0111 1110 ..1 ..... 01011 1 ..... ..... @rrr_e
 
+ADD_s           0101 1110 111 ..... 10000 1 ..... ..... @rrr_d
+SUB_s           0111 1110 111 ..... 10000 1 ..... ..... @rrr_d
+
 ### Advanced SIMD scalar pairwise
 
 FADDP_s         0101 1110 0011 0000 1101 10 ..... ..... @rr_h
@@ -895,6 +898,9 @@ UQSHL_v         0.10 1110 ..1 ..... 01001 1 ..... ..... @qrrr_e
 SQRSHL_v        0.00 1110 ..1 ..... 01011 1 ..... ..... @qrrr_e
 UQRSHL_v        0.10 1110 ..1 ..... 01011 1 ..... ..... @qrrr_e
 
+ADD_v           0.00 1110 ..1 ..... 10000 1 ..... ..... @qrrr_e
+SUB_v           0.10 1110 ..1 ..... 10000 1 ..... ..... @qrrr_e
+
 ### Advanced SIMD scalar x indexed element
 
 FMUL_si         0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index b76682cabf..77a64923e7 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -5118,6 +5118,8 @@ TRANS(SSHL_s, do_int3_scalar_d, a, gen_sshl_i64)
 TRANS(USHL_s, do_int3_scalar_d, a, gen_ushl_i64)
 TRANS(SRSHL_s, do_int3_scalar_d, a, gen_helper_neon_rshl_s64)
 TRANS(URSHL_s, do_int3_scalar_d, a, gen_helper_neon_rshl_u64)
+TRANS(ADD_s, do_int3_scalar_d, a, tcg_gen_add_i64)
+TRANS(SUB_s, do_int3_scalar_d, a, tcg_gen_sub_i64)
 
 typedef struct ENVScalar2 {
     NeonGenTwoOpEnvFn *gen_bhs[3];
@@ -5432,6 +5434,8 @@ TRANS(UQSHL_v, do_gvec_fn3, a, gen_neon_uqshl)
 TRANS(SQRSHL_v, do_gvec_fn3, a, gen_neon_sqrshl)
 TRANS(UQRSHL_v, do_gvec_fn3, a, gen_neon_uqrshl)
 
+TRANS(ADD_v, do_gvec_fn3, a, tcg_gen_gvec_add)
+TRANS(SUB_v, do_gvec_fn3, a, tcg_gen_gvec_sub)
 
 /*
  * Advanced SIMD scalar/vector x indexed element
@@ -9444,13 +9448,6 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u,
         }
         gen_cmtst_i64(tcg_rd, tcg_rn, tcg_rm);
         break;
-    case 0x10: /* ADD, SUB */
-        if (u) {
-            tcg_gen_sub_i64(tcg_rd, tcg_rn, tcg_rm);
-        } else {
-            tcg_gen_add_i64(tcg_rd, tcg_rn, tcg_rm);
-        }
-        break;
     default:
     case 0x1: /* SQADD / UQADD */
     case 0x5: /* SQSUB / UQSUB */
@@ -9458,6 +9455,7 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u,
     case 0x9: /* SQSHL, UQSHL */
     case 0xa: /* SRSHL, URSHL */
     case 0xb: /* SQRSHL, UQRSHL */
+    case 0x10: /* ADD, SUB */
         g_assert_not_reached();
     }
 }
@@ -9482,7 +9480,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
     case 0x6: /* CMGT, CMHI */
     case 0x7: /* CMGE, CMHS */
     case 0x11: /* CMTST, CMEQ */
-    case 0x10: /* ADD, SUB (vector) */
         if (size != 3) {
             unallocated_encoding(s);
             return;
@@ -9501,6 +9498,7 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
     case 0x9: /* SQSHL, UQSHL */
     case 0xa: /* SRSHL, URSHL */
     case 0xb: /* SQRSHL, UQRSHL */
+    case 0x10: /* ADD, SUB (vector) */
         unallocated_encoding(s);
         return;
     }
@@ -10958,6 +10956,11 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
 
     case 0x01: /* SQADD, UQADD */
     case 0x05: /* SQSUB, UQSUB */
+    case 0x08: /* SSHL, USHL */
+    case 0x09: /* SQSHL, UQSHL */
+    case 0x0a: /* SRSHL, URSHL */
+    case 0x0b: /* SQRSHL, UQRSHL */
+    case 0x10: /* ADD, SUB */
         unallocated_encoding(s);
         return;
     }
@@ -10995,13 +10998,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size);
         }
         return;
-    case 0x10: /* ADD, SUB */
-        if (u) {
-            gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
-        } else {
-            gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_add, size);
-        }
-        return;
     case 0x13: /* MUL, PMUL */
         if (!u) { /* MUL */
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_mul, size);
@@ -11044,14 +11040,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                          vec_full_reg_offset(s, rm),
                          is_q ? 16 : 8, vec_full_reg_size(s));
         return;
-
-    case 0x01: /* SQADD, UQADD */
-    case 0x05: /* SQSUB, UQSUB */
-    case 0x08: /* SSHL, USHL */
-    case 0x09: /* SQSHL, UQSHL */
-    case 0x0a: /* SRSHL, URSHL */
-    case 0x0b: /* SQRSHL, UQRSHL */
-        g_assert_not_reached();
     }
 
     if (size == 3) {

From patchwork Fri May 24 23:21:05 2024
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673805
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 51/67] target/arm: Convert CMGT, CMHI, CMGE, CMHS, CMTST, CMEQ to decodetree
Date: Fri, 24 May 2024 16:21:05 -0700
Message-Id: <20240524232121.284515-52-richard.henderson@linaro.org>
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/tcg/a64.decode      |  12 +++
 target/arm/tcg/translate-a64.c | 132 ++++++++++++---------------------
 2 files changed, 60 insertions(+), 84 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 44383b4fc7..3061e26242 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -767,6 +767,12 @@ UQRSHL_s        0111 1110 ..1 ..... 01011 1 ..... ..... @rrr_e
 
 ADD_s           0101 1110 111 ..... 10000 1 ..... ..... @rrr_d
 SUB_s           0111 1110 111 ..... 10000 1 ..... ..... @rrr_d
+CMGT_s          0101 1110 111 ..... 00110 1 ..... ..... @rrr_d
+CMHI_s          0111 1110 111 ..... 00110 1 ..... ..... @rrr_d
+CMGE_s          0101 1110 111 ..... 00111 1 ..... ..... @rrr_d
+CMHS_s          0111 1110 111 ..... 00111 1 ..... ..... @rrr_d
+CMTST_s         0101 1110 111 ..... 10001 1 ..... ..... @rrr_d
+CMEQ_s          0111 1110 111 ..... 10001 1 ..... ..... @rrr_d
 
 ### Advanced SIMD scalar pairwise
 
@@ -900,6 +906,12 @@ UQRSHL_v        0.10 1110 ..1 ..... 01011 1 ..... ..... @qrrr_e
 
 ADD_v           0.00 1110 ..1 ..... 10000 1 ..... ..... @qrrr_e
 SUB_v           0.10 1110 ..1 ..... 10000 1 ..... ..... @qrrr_e
+CMGT_v          0.00 1110 ..1 ..... 00110 1 ..... ..... @qrrr_e
+CMHI_v          0.10 1110 ..1 ..... 00110 1 ..... ..... @qrrr_e
+CMGE_v          0.00 1110 ..1 ..... 00111 1 ..... ..... @qrrr_e
+CMHS_v          0.10 1110 ..1 ..... 00111 1 ..... ..... @qrrr_e
+CMTST_v         0.00 1110 ..1 ..... 10001 1 ..... ..... @qrrr_e
+CMEQ_v          0.10 1110 ..1 ..... 10001 1 ..... ..... @qrrr_e
 
 ### Advanced SIMD scalar x indexed element
 
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 77a64923e7..3c6cfc2952 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -5180,6 +5180,24 @@ static const ENVScalar2 f_scalar_uqrshl = {
 };
 TRANS(UQRSHL_s, do_env_scalar2, a, &f_scalar_uqrshl)
 
+static bool do_cmop_d(DisasContext *s, arg_rrr_e *a, TCGCond cond)
+{
+    if (fp_access_check(s)) {
+        TCGv_i64 t0 = read_fp_dreg(s, a->rn);
+        TCGv_i64 t1 = read_fp_dreg(s, a->rm);
+        tcg_gen_negsetcond_i64(cond, t0, t0, t1);
+        write_fp_dreg(s, a->rd, t0);
+    }
+    return true;
+}
+
+TRANS(CMGT_s, do_cmop_d, a, TCG_COND_GT)
+TRANS(CMHI_s, do_cmop_d, a, TCG_COND_GTU)
+TRANS(CMGE_s, do_cmop_d, a, TCG_COND_GE)
+TRANS(CMHS_s, do_cmop_d, a, TCG_COND_GEU)
+TRANS(CMEQ_s, do_cmop_d, a, TCG_COND_EQ)
+TRANS(CMTST_s, do_cmop_d, a, TCG_COND_TSTNE)
+
 static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
                           gen_helper_gvec_3_ptr * const fns[3])
 {
@@ -5437,6 +5455,28 @@ TRANS(UQRSHL_v, do_gvec_fn3, a, gen_neon_uqrshl)
 TRANS(ADD_v, do_gvec_fn3, a, tcg_gen_gvec_add)
 TRANS(SUB_v, do_gvec_fn3, a, tcg_gen_gvec_sub)
 
+static bool do_cmop_v(DisasContext *s, arg_qrrr_e *a, TCGCond cond)
+{
+    if (a->esz == MO_64 && !a->q) {
+        return false;
+    }
+    if (fp_access_check(s)) {
+        tcg_gen_gvec_cmp(cond, a->esz,
+                         vec_full_reg_offset(s, a->rd),
+                         vec_full_reg_offset(s, a->rn),
+                         vec_full_reg_offset(s, a->rm),
+                         a->q ? 16 : 8, vec_full_reg_size(s));
+    }
+    return true;
+}
+
+TRANS(CMGT_v, do_cmop_v, a, TCG_COND_GT)
+TRANS(CMHI_v, do_cmop_v, a, TCG_COND_GTU)
+TRANS(CMGE_v, do_cmop_v, a, TCG_COND_GE)
+TRANS(CMHS_v, do_cmop_v, a, TCG_COND_GEU)
+TRANS(CMEQ_v, do_cmop_v, a, TCG_COND_EQ)
+TRANS(CMTST_v, do_gvec_fn3, a, gen_gvec_cmtst)
+
 /*
  * Advanced SIMD scalar/vector x indexed element
  */
@@ -9421,45 +9461,6 @@ static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn)
     }
 }
 
-static void handle_3same_64(DisasContext *s, int opcode, bool u,
-                            TCGv_i64 tcg_rd, TCGv_i64 tcg_rn, TCGv_i64 tcg_rm)
-{
-    /* Handle 64x64->64 opcodes which are shared between the scalar
-     * and vector 3-same groups. We cover every opcode where size == 3
-     * is valid in either the three-reg-same (integer, not pairwise)
-     * or scalar-three-reg-same groups.
-     */
-    TCGCond cond;
-
-    switch (opcode) {
-    case 0x6: /* CMGT, CMHI */
-        cond = u ? TCG_COND_GTU : TCG_COND_GT;
-    do_cmop:
-        /* 64 bit integer comparison, result = test ? -1 : 0. */
-        tcg_gen_negsetcond_i64(cond, tcg_rd, tcg_rn, tcg_rm);
-        break;
-    case 0x7: /* CMGE, CMHS */
-        cond = u ? TCG_COND_GEU : TCG_COND_GE;
-        goto do_cmop;
-    case 0x11: /* CMTST, CMEQ */
-        if (u) {
-            cond = TCG_COND_EQ;
-            goto do_cmop;
-        }
-        gen_cmtst_i64(tcg_rd, tcg_rn, tcg_rm);
-        break;
-    default:
-    case 0x1: /* SQADD / UQADD */
-    case 0x5: /* SQSUB / UQSUB */
-    case 0x8: /* SSHL, USHL */
-    case 0x9: /* SQSHL, UQSHL */
-    case 0xa: /* SRSHL, URSHL */
-    case 0xb: /* SQRSHL, UQRSHL */
-    case 0x10: /* ADD, SUB */
-        g_assert_not_reached();
-    }
-}
-
 /* AdvSIMD scalar three same
  *  31 30  29 28       24 23  22  21 20  16 15    11  10 9    5 4    0
  * +-----+---+-----------+------+---+------+--------+---+------+------+
@@ -9477,14 +9478,6 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
     TCGv_i64 tcg_rd;
 
     switch (opcode) {
-    case 0x6: /* CMGT, CMHI */
-    case 0x7: /* CMGE, CMHS */
-    case 0x11: /* CMTST, CMEQ */
-        if (size != 3) {
-            unallocated_encoding(s);
-            return;
-        }
-        break;
     case 0x16: /* SQDMULH, SQRDMULH (vector) */
         if (size != 1 && size != 2) {
             unallocated_encoding(s);
@@ -9494,11 +9487,14 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
     default:
     case 0x1: /* SQADD, UQADD */
     case 0x5: /* SQSUB, UQSUB */
+    case 0x6: /* CMGT, CMHI */
+    case 0x7: /* CMGE, CMHS */
     case 0x8: /* SSHL, USHL */
     case 0x9: /* SQSHL, UQSHL */
     case 0xa: /* SRSHL, URSHL */
     case 0xb: /* SQRSHL, UQRSHL */
     case 0x10: /* ADD, SUB (vector) */
+    case 0x11: /* CMTST, CMEQ */
         unallocated_encoding(s);
         return;
     }
@@ -9510,10 +9506,7 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
     tcg_rd = tcg_temp_new_i64();
 
     if (size == 3) {
-        TCGv_i64 tcg_rn = read_fp_dreg(s, rn);
-        TCGv_i64 tcg_rm = read_fp_dreg(s, rm);
-
-        handle_3same_64(s, opcode, u, tcg_rd, tcg_rn, tcg_rm);
+        g_assert_not_reached();
     } else {
         /* Do a single operation on the lowest element in the vector.
          * We use the standard Neon helpers and rely on 0 OP 0 == 0 with
@@ -10919,7 +10912,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
     int rn = extract32(insn, 5, 5);
     int rd = extract32(insn, 0, 5);
     int pass;
-    TCGCond cond;
 
     switch (opcode) {
     case 0x13: /* MUL, PMUL */
@@ -10956,11 +10948,14 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
 
     case 0x01: /* SQADD, UQADD */
     case 0x05: /* SQSUB, UQSUB */
+    case 0x06: /* CMGT, CMHI */
+    case 0x07: /* CMGE, CMHS */
     case 0x08: /* SSHL, USHL */
     case 0x09: /* SQSHL, UQSHL */
     case 0x0a: /* SRSHL, URSHL */
     case 0x0b: /* SQRSHL, UQRSHL */
     case 0x10: /* ADD, SUB */
+    case 0x11: /* CMTST, CMEQ */
         unallocated_encoding(s);
         return;
     }
@@ -11021,41 +11016,10 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             gen_gvec_op3_qc(s, is_q, rd, rn, rm, fns[size - 1][u]);
         }
         return;
-    case 0x11:
-        if (!u) { /* CMTST */
-            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size);
-            return;
-        }
-        /* else CMEQ */
-        cond = TCG_COND_EQ;
-        goto do_gvec_cmp;
-    case 0x06: /* CMGT, CMHI */
-        cond = u ? TCG_COND_GTU : TCG_COND_GT;
-        goto do_gvec_cmp;
-    case 0x07: /* CMGE, CMHS */
-        cond = u ? TCG_COND_GEU : TCG_COND_GE;
-    do_gvec_cmp:
-        tcg_gen_gvec_cmp(cond, size, vec_full_reg_offset(s, rd),
-                         vec_full_reg_offset(s, rn),
-                         vec_full_reg_offset(s, rm),
-                         is_q ? 16 : 8, vec_full_reg_size(s));
-        return;
     }
 
     if (size == 3) {
-        assert(is_q);
-        for (pass = 0; pass < 2; pass++) {
-            TCGv_i64 tcg_op1 = tcg_temp_new_i64();
-            TCGv_i64 tcg_op2 = tcg_temp_new_i64();
-            TCGv_i64 tcg_res = tcg_temp_new_i64();
-
-            read_vec_element(s, tcg_op1, rn, pass, MO_64);
-            read_vec_element(s, tcg_op2, rm, pass, MO_64);
-
-            handle_3same_64(s, opcode, u, tcg_res, tcg_op1, tcg_op2);
-
-            write_vec_element(s, tcg_res, rd, pass, MO_64);
-        }
+        g_assert_not_reached();
     } else {
         for (pass = 0; pass < (is_q ? 4 : 2); pass++) {
             TCGv_i32 tcg_op1 = tcg_temp_new_i32();

From patchwork Fri May 24 23:21:06 2024
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673846
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 52/67] target/arm: Use TCG_COND_TSTNE in gen_cmtst_{i32, i64}
Date: Fri, 24 May 2024 16:21:06 -0700
Message-Id: <20240524232121.284515-53-richard.henderson@linaro.org>
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/tcg/gengvec.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
index 51e66ccf5f..1d6bc6021d 100644
--- a/target/arm/tcg/gengvec.c
+++ b/target/arm/tcg/gengvec.c
@@ -933,14 +933,12 @@ void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
 /* CMTST : test is "if (X & Y != 0)". */
 static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 {
-    tcg_gen_and_i32(d, a, b);
-    tcg_gen_negsetcond_i32(TCG_COND_NE, d, d, tcg_constant_i32(0));
+    tcg_gen_negsetcond_i32(TCG_COND_TSTNE, d, a, b);
 }
 
 void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 {
-    tcg_gen_and_i64(d, a, b);
-    tcg_gen_negsetcond_i64(TCG_COND_NE, d, d, tcg_constant_i64(0));
+    tcg_gen_negsetcond_i64(TCG_COND_TSTNE, d, a, b);
 }
 
 static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)

From patchwork Fri May 24 23:21:07 2024
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673830
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=KkrM6eUfNQC4/ZxOfw3RoQ+l/U6+/yIajWSBPEUqefU=; b=emsgS+iFjgBL5HsXZJJwTe2WGH+23g0HnCDQCoIzaD7bsYBi0+IJlQVA6NvGAIyAN0 xoN4llMYhwVILwTY0fuxSPTr6kTJJDF+J3KB2Hl9f1UjJ+DnNEPqseSvhfXksCOIL1oC jef15ld2Wq8ZgqkdbGZPGXVnSbkBQ/rhfmDNhJdExwQvzd043qmUFfQA7m/1lTCJayr8 YOnrrZx/uO1/w2rUBeuieR9ETNDEP3fb7RuelbVSixhAvesLOYambWDRYnHhg4D4ovdK xGNEZ2m95G7T6pG/GvAHqr/qjT0HLau4Y6J4d7OFG2J+pYuiw1muuaCT3ve5Z/XOO5vm 9mUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716593110; x=1717197910; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KkrM6eUfNQC4/ZxOfw3RoQ+l/U6+/yIajWSBPEUqefU=; b=Xi39AZjzz1tM/UsIbzgfTx6XiE6z+5UjoUlheJCBBBlVJdG3oxp8Rl5ZMfqaa4qCgy zxO43wcy6Xx58yK5GZ3DNtlvJ8R74+xaiVgu6otEHPCbBIkJmBuzsn+PPJR/mwha8roO 7YCvwDKWps+Mh4E4LswhkLu5ITH3yUKs+eHRwNRT4qccmTzfv+mDglAtpudU6PaJbrnI nOvWl0G68U34WYpUEEw9Ll0jdo2FE5WRZIzp2FxvxSaQUeVxPsKYc6wgp5LJ7gabJsKg 2V/JKtw53mTHnIikBmoPEboIGfNv/se9ct8eJuEorr+HPFBjbRAilpxYVSXWg9bSgHle woBw== X-Gm-Message-State: AOJu0YyZqRgfnc+LnrvhC4OFcIWgfLeiMCSQ+oz/xAO9Eq1xd7dM+kSp OYhNRxVRC8anj/Jh+Tf2y7/YVZh+8McLShSbi9uffLrF5RM8pJz0HqDBRXJB7/gBoIoskAUwLGe Q X-Google-Smtp-Source: AGHT+IH7FurCjsBAAZDits9Raqncawm0dy4V/L+dqXoZ6Kp8bOSTE9aa17L9zcQH98f2nVEdVugyHA== X-Received: by 2002:a05:6a00:4405:b0:6f3:f062:c09b with SMTP id d2e1a72fcca58-6f8f2c6c7b4mr4166051b3a.6.1716593109917; Fri, 24 May 2024 16:25:09 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:09 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 53/67] target/arm: Use TCG_COND_TSTNE in gen_cmtst_vec Date: Fri, 24 May 2024 16:21:07 -0700 Message-Id: <20240524232121.284515-54-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::432; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x432.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/gengvec.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c index 1d6bc6021d..1895c3b19f 100644 --- a/target/arm/tcg/gengvec.c +++ b/target/arm/tcg/gengvec.c @@ -943,9 +943,7 @@ void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) { - tcg_gen_and_vec(vece, d, a, b); - tcg_gen_dupi_vec(vece, a, 0); - tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a); + tcg_gen_cmp_vec(TCG_COND_TSTNE, vece, d, a, 
b); } void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, From patchwork Fri May 24 23:21:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673839 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D7FE1C25B74 for ; Fri, 24 May 2024 23:30:30 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeJH-0003cd-Q3; Fri, 24 May 2024 19:26:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeI9-0008IA-EQ for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:42 -0400 Received: from mail-pf1-x42a.google.com ([2607:f8b0:4864:20::42a]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHv-0006tr-92 for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:41 -0400 Received: by mail-pf1-x42a.google.com with SMTP id d2e1a72fcca58-6f4ed9dc7beso2815222b3a.1 for ; Fri, 24 May 2024 16:25:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716593111; x=1717197911; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=iiPNnnZVNjS8p5m66evevlQPVHteoJ1u5PCLuHV/bYg=; b=i4Zk7F4AR79Y8X36qqvPBMGA1Tvy19P4kPTNDZGpgmDap8c3vxmXOx63tXgFNw0Iuj nvls3LU2azVhW7pgFCKi+p6Y0tgKUafKI935uhpKrCI1EP4HsGRHuANS+IJFDUMzU24b eBup8gyv9ihZQhLwE2nGu1qxhnjApqj/X6WBMhHnnpOfiL8ckriI2b4bAXe/+RLwffFt 
rOO/QdjFBVgYu7YSR4iNkJ4ynHPDCVndSWH1F5wYx5MF5cb05dRzQF2WsNAbZYU8nchg nWlZBmxw2zJOPXyOYvdPc0BPX4HliC/DpqLazxV+g2Cc6pmFgAFGUX93qHlAEe8imhRw oQJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716593111; x=1717197911; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=iiPNnnZVNjS8p5m66evevlQPVHteoJ1u5PCLuHV/bYg=; b=KApl5G5SFpnTzijlAA3s8IG9grQcuyGa71Orid2bVHmyQ6REr6Q7L/H96Yk+sEguTy vk7YzJn6MmLDrgsjlT4KWoA+AhOHWCvApd3dVcFgkRerzx47cjzMFNrafwp24rG+c9R3 UO7QWwOQ8fVp7IL7PvSFRDinCibIBfgtWiJzrdaJRPu937lKa19j1SYf80RNldX/fYCo Qs8O1pB17W+Eg2shwbps8vMkWlMCEy7YMm7J9TeLheLhOlYnvXyT+I8w/K/ILsPauXI3 DRi9jZRQe88qGKF0dk6B10B+cdb+5K+FF6ytEebAQ+UyTpWxH+jTMmduZLUVHemWt98J +m/g== X-Gm-Message-State: AOJu0Yz1GxQS2h5edy3PMyHCc3e4P10+hUEZfm2esKm+KN9a3skEApGW XvooCM+iAp55lHiQQvuS1dn8r7Of3TUp6sPK756aWqDPoE8aDA3s8hjIPXi336GSo8mnQcf6smt A X-Google-Smtp-Source: AGHT+IEg0+BlN/A/2GMmmRCIpWBJj74a9s3G+JbDtB7xlCWeaOXAIp/r+p2QgXU17RQ021eRH3mnEg== X-Received: by 2002:a05:6a00:730f:b0:6f8:b773:ca3e with SMTP id d2e1a72fcca58-6f8b773cd29mr5513611b3a.12.1716593110636; Fri, 24 May 2024 16:25:10 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:10 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 54/67] target/arm: Convert SHADD, UHADD to gvec Date: Fri, 24 May 2024 16:21:08 -0700 Message-Id: <20240524232121.284515-55-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42a; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper.h | 6 -- target/arm/tcg/translate.h | 5 ++ target/arm/tcg/gengvec.c | 144 ++++++++++++++++++++++++++++++++ target/arm/tcg/neon_helper.c | 27 ------ target/arm/tcg/translate-a64.c | 17 ++-- target/arm/tcg/translate-neon.c | 4 +- 6 files changed, 158 insertions(+), 45 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 9a89c9cea7..b26bfcb079 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -268,12 +268,6 @@ DEF_HELPER_FLAGS_2(fjcvtzs, TCG_CALL_NO_RWG, i64, f64, ptr) 
DEF_HELPER_FLAGS_3(check_hcr_el2_trap, TCG_CALL_NO_WG, void, env, i32, i32) /* neon_helper.c */ -DEF_HELPER_2(neon_hadd_s8, i32, i32, i32) -DEF_HELPER_2(neon_hadd_u8, i32, i32, i32) -DEF_HELPER_2(neon_hadd_s16, i32, i32, i32) -DEF_HELPER_2(neon_hadd_u16, i32, i32, i32) -DEF_HELPER_2(neon_hadd_s32, s32, s32, s32) -DEF_HELPER_2(neon_hadd_u32, i32, i32, i32) DEF_HELPER_2(neon_rhadd_s8, i32, i32, i32) DEF_HELPER_2(neon_rhadd_u8, i32, i32, i32) DEF_HELPER_2(neon_rhadd_s16, i32, i32, i32) diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h index 048cb45ebe..dd99d76bf2 100644 --- a/target/arm/tcg/translate.h +++ b/target/arm/tcg/translate.h @@ -472,6 +472,11 @@ void gen_neon_sqrshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void gen_neon_uqrshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_shadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_uhadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c index 1895c3b19f..0627cec6b2 100644 --- a/target/arm/tcg/gengvec.c +++ b/target/arm/tcg/gengvec.c @@ -1852,3 +1852,147 @@ void gen_gvec_uminp(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, tcg_debug_assert(vece <= MO_32); tcg_gen_gvec_3_ool(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, 0, fns[vece]); } + +static void gen_shadd8_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_and_i64(t, a, b); + tcg_gen_vec_sar8i_i64(a, a, 1); + tcg_gen_vec_sar8i_i64(b, b, 1); + tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); + tcg_gen_vec_add8_i64(d, a, b); + tcg_gen_vec_add8_i64(d, d, t); +} + +static void 
gen_shadd16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_and_i64(t, a, b); + tcg_gen_vec_sar16i_i64(a, a, 1); + tcg_gen_vec_sar16i_i64(b, b, 1); + tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); + tcg_gen_vec_add16_i64(d, a, b); + tcg_gen_vec_add16_i64(d, d, t); +} + +static void gen_shadd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_and_i32(t, a, b); + tcg_gen_sari_i32(a, a, 1); + tcg_gen_sari_i32(b, b, 1); + tcg_gen_andi_i32(t, t, 1); + tcg_gen_add_i32(d, a, b); + tcg_gen_add_i32(d, d, t); +} + +static void gen_shadd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + tcg_gen_and_vec(vece, t, a, b); + tcg_gen_sari_vec(vece, a, a, 1); + tcg_gen_sari_vec(vece, b, b, 1); + tcg_gen_and_vec(vece, t, t, tcg_constant_vec_matching(d, vece, 1)); + tcg_gen_add_vec(vece, d, a, b); + tcg_gen_add_vec(vece, d, d, t); +} + +void gen_gvec_shadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 g[] = { + { .fni8 = gen_shadd8_i64, + .fniv = gen_shadd_vec, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_shadd16_i64, + .fniv = gen_shadd_vec, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_shadd_i32, + .fniv = gen_shadd_vec, + .opt_opc = vecop_list, + .vece = MO_32 }, + }; + tcg_debug_assert(vece <= MO_32); + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &g[vece]); +} + +static void gen_uhadd8_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_and_i64(t, a, b); + tcg_gen_vec_shr8i_i64(a, a, 1); + tcg_gen_vec_shr8i_i64(b, b, 1); + tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); + tcg_gen_vec_add8_i64(d, a, b); + tcg_gen_vec_add8_i64(d, d, t); +} + +static void gen_uhadd16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + 
TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_and_i64(t, a, b); + tcg_gen_vec_shr16i_i64(a, a, 1); + tcg_gen_vec_shr16i_i64(b, b, 1); + tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); + tcg_gen_vec_add16_i64(d, a, b); + tcg_gen_vec_add16_i64(d, d, t); +} + +static void gen_uhadd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_and_i32(t, a, b); + tcg_gen_shri_i32(a, a, 1); + tcg_gen_shri_i32(b, b, 1); + tcg_gen_andi_i32(t, t, 1); + tcg_gen_add_i32(d, a, b); + tcg_gen_add_i32(d, d, t); +} + +static void gen_uhadd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + tcg_gen_and_vec(vece, t, a, b); + tcg_gen_shri_vec(vece, a, a, 1); + tcg_gen_shri_vec(vece, b, b, 1); + tcg_gen_and_vec(vece, t, t, tcg_constant_vec_matching(d, vece, 1)); + tcg_gen_add_vec(vece, d, a, b); + tcg_gen_add_vec(vece, d, d, t); +} + +void gen_gvec_uhadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 g[] = { + { .fni8 = gen_uhadd8_i64, + .fniv = gen_uhadd_vec, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_uhadd16_i64, + .fniv = gen_uhadd_vec, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_uhadd_i32, + .fniv = gen_uhadd_vec, + .opt_opc = vecop_list, + .vece = MO_32 }, + }; + tcg_debug_assert(vece <= MO_32); + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &g[vece]); +} diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c index b29a7c725f..defd28a6f7 100644 --- a/target/arm/tcg/neon_helper.c +++ b/target/arm/tcg/neon_helper.c @@ -179,33 +179,6 @@ uint32_t HELPER(glue(neon_,name))(uint32_t arg) \ return arg; \ } -#define NEON_FN(dest, src1, src2) dest = (src1 + src2) >> 1 -NEON_VOP(hadd_s8, neon_s8, 4) -NEON_VOP(hadd_u8, neon_u8, 4) -NEON_VOP(hadd_s16, neon_s16, 2) -NEON_VOP(hadd_u16, 
neon_u16, 2) -#undef NEON_FN - -int32_t HELPER(neon_hadd_s32)(int32_t src1, int32_t src2) -{ - int32_t dest; - - dest = (src1 >> 1) + (src2 >> 1); - if (src1 & src2 & 1) - dest++; - return dest; -} - -uint32_t HELPER(neon_hadd_u32)(uint32_t src1, uint32_t src2) -{ - uint32_t dest; - - dest = (src1 >> 1) + (src2 >> 1); - if (src1 & src2 & 1) - dest++; - return dest; -} - #define NEON_FN(dest, src1, src2) dest = (src1 + src2 + 1) >> 1 NEON_VOP(rhadd_s8, neon_s8, 4) NEON_VOP(rhadd_u8, neon_u8, 4) diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 3c6cfc2952..5f3423513d 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -10965,6 +10965,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { + case 0x00: /* SHADD, UHADD */ + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uhadd, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_shadd, size); + } + return; case 0x0c: /* SMAX, UMAX */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size); @@ -11032,16 +11039,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) read_vec_element_i32(s, tcg_op2, rm, pass, MO_32); switch (opcode) { - case 0x0: /* SHADD, UHADD */ - { - static NeonGenTwoOpFn * const fns[3][2] = { - { gen_helper_neon_hadd_s8, gen_helper_neon_hadd_u8 }, - { gen_helper_neon_hadd_s16, gen_helper_neon_hadd_u16 }, - { gen_helper_neon_hadd_s32, gen_helper_neon_hadd_u32 }, - }; - genfn = fns[size][u]; - break; - } case 0x2: /* SRHADD, URHADD */ { static NeonGenTwoOpFn * const fns[3][2] = { diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c index 5f1576393e..29e5c4a0a3 100644 --- a/target/arm/tcg/translate-neon.c +++ b/target/arm/tcg/translate-neon.c @@ -841,6 +841,8 @@ DO_3SAME_NO_SZ_3(VPMAX_S, gen_gvec_smaxp) DO_3SAME_NO_SZ_3(VPMIN_S, gen_gvec_sminp) DO_3SAME_NO_SZ_3(VPMAX_U, gen_gvec_umaxp) DO_3SAME_NO_SZ_3(VPMIN_U, gen_gvec_uminp) 
+DO_3SAME_NO_SZ_3(VHADD_S, gen_gvec_shadd) +DO_3SAME_NO_SZ_3(VHADD_U, gen_gvec_uhadd) #define DO_3SAME_CMP(INSN, COND) \ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ @@ -951,8 +953,6 @@ DO_SHA2(SHA256SU1, gen_helper_crypto_sha256su1) FUNC(d, tcg_env, n, m); \ } -DO_3SAME_32(VHADD_S, hadd_s) -DO_3SAME_32(VHADD_U, hadd_u) DO_3SAME_32(VHSUB_S, hsub_s) DO_3SAME_32(VHSUB_U, hsub_u) DO_3SAME_32(VRHADD_S, rhadd_s) From patchwork Fri May 24 23:21:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673819 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 24529C25B74 for ; Fri, 24 May 2024 23:27:58 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeIV-0000Ou-C8; Fri, 24 May 2024 19:26:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeHu-0007P9-P4 for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:28 -0400 Received: from mail-pf1-x42c.google.com ([2607:f8b0:4864:20::42c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHj-0006u1-CX for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:25 -0400 Received: by mail-pf1-x42c.google.com with SMTP id d2e1a72fcca58-6f8e98760fcso1242544b3a.1 for ; Fri, 24 May 2024 16:25:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716593111; x=1717197911; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=AtfQDDvhCTAQzr/cA9Eff/uGFAgvLsYs3JLC/5fgqLQ=; b=wZil00SEDRebx9oyUjm0ciRTCUZ6A1cmUQ8o8pVWRiGH2XTDNfCUgxFSrurqdarTa1 uw4/SqP8Iy5pofz1Z6iJ98OwEwPUl0L1dIZ+xOCNBtuHkqUAoLO/tODinOHNEp+m2oZX r9mK4OKrZYjLJ6Vm0uMiA5HskLWHqnl5C/UVYVPddBWqVtC5hIGjaA0wrBrOYYeN/Cap +tmHehs+He1qLf7SNRTqcJOsL5dP9T/4uYPmprcag2nsQTSOvzVg0AVHRcofa7KXWfNw 9swO1cQVGwXB9wZglG6nNCHM7P+1Axq3wHIh3MbpJhhMxQ5dy4k5L6UEWzp5VSvjmakL yRTQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716593111; x=1717197911; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=AtfQDDvhCTAQzr/cA9Eff/uGFAgvLsYs3JLC/5fgqLQ=; b=AIpCgPmia0s7bO2gvcsIt6yDoiYxk1d4jKiknEwMd1J6EmCLeEoN7g5XQtEKWjJTbJ XMteaBnJAWu/EzmDILqEM5j6uiVHZ4aFM8hgswJQTlr9hC4So9+bHGbOEcVvAmB9Jc2s G/GZw/6EYQLjV745QBIBRJHNtfVjRsGSFwT9TTudEQrRWI0v+OupQ4raqeFD9nE1+6QF l45hjzxJwYwoJu1TLaWq8dMh68dbhuF8rWAmcVsObfwzhqZ2VEmap4Fyw6mU1DOfFYVH J1KUNlLDOHVasJl4Q0xKIb5eyawYOYi7BvGJMrng8LLLSKyM7uQVi9iIp1p2rq0VBUIX dOeg== X-Gm-Message-State: AOJu0Yyz+0xnrCrMllNikpMUmQVLnSewuBdiujnNyY/zPS6fXu08lgAM OEHrXeML2Y1+KGV22KUr+8vFwnJAmOxdWP3Yb4ZX88pdyZwb7gCBexeYGhdSX3NFVx45CDKfKIG 5 X-Google-Smtp-Source: AGHT+IH+j5nCBT4Ft8G8jKCSdD83MbqwEQNJUmxG43HbAdFhjNV6nRd3mkEv5tASjUVgEAR8Fu/M6g== X-Received: by 2002:a05:6a21:19c:b0:1a7:2e17:efd3 with SMTP id adf61e73a8af0-1b212cbcefamr6073493637.5.1716593111450; Fri, 24 May 2024 16:25:11 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:11 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 55/67] target/arm: Convert SHADD, UHADD to decodetree Date: Fri, 24 May 2024 16:21:09 -0700 Message-Id: <20240524232121.284515-56-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42c; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/a64.decode | 2 ++ target/arm/tcg/translate-a64.c | 11 +++-------- 2 files changed, 5 insertions(+), 8 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 3061e26242..e33d91fd0a 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -912,6 +912,8 @@ CMGE_v 0.00 1110 ..1 ..... 00111 1 ..... ..... @qrrr_e CMHS_v 0.10 1110 ..1 ..... 00111 1 ..... ..... @qrrr_e CMTST_v 0.00 1110 ..1 ..... 10001 1 ..... ..... @qrrr_e CMEQ_v 0.10 1110 ..1 ..... 10001 1 ..... ..... 
@qrrr_e +SHADD_v 0.00 1110 ..1 ..... 00000 1 ..... ..... @qrrr_e +UHADD_v 0.10 1110 ..1 ..... 00000 1 ..... ..... @qrrr_e ### Advanced SIMD scalar x indexed element diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 5f3423513d..00c04425c1 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5454,6 +5454,8 @@ TRANS(UQRSHL_v, do_gvec_fn3, a, gen_neon_uqrshl) TRANS(ADD_v, do_gvec_fn3, a, tcg_gen_gvec_add) TRANS(SUB_v, do_gvec_fn3, a, tcg_gen_gvec_sub) +TRANS(SHADD_v, do_gvec_fn3_no64, a, gen_gvec_shadd) +TRANS(UHADD_v, do_gvec_fn3_no64, a, gen_gvec_uhadd) static bool do_cmop_v(DisasContext *s, arg_qrrr_e *a, TCGCond cond) { @@ -10920,7 +10922,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) return; } /* fall through */ - case 0x0: /* SHADD, UHADD */ case 0x2: /* SRHADD, URHADD */ case 0x4: /* SHSUB, UHSUB */ case 0xc: /* SMAX, UMAX */ @@ -10946,6 +10947,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } break; + case 0x0: /* SHADD, UHADD */ case 0x01: /* SQADD, UQADD */ case 0x05: /* SQSUB, UQSUB */ case 0x06: /* CMGT, CMHI */ @@ -10965,13 +10967,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { - case 0x00: /* SHADD, UHADD */ - if (u) { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uhadd, size); - } else { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_shadd, size); - } - return; case 0x0c: /* SMAX, UMAX */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size); From patchwork Fri May 24 23:21:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673827 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client 
certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 425C9C25B74 for ; Fri, 24 May 2024 23:29:03 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeIf-0001pS-0A; Fri, 24 May 2024 19:26:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeI0-0007dB-7O for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:32 -0400 Received: from mail-pf1-x436.google.com ([2607:f8b0:4864:20::436]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHt-0006uJ-6Z for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:29 -0400 Received: by mail-pf1-x436.google.com with SMTP id d2e1a72fcca58-6f4603237e0so2616780b3a.0 for ; Fri, 24 May 2024 16:25:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716593112; x=1717197912; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Vy+becK3ZE58nm2Idqea46QX6RZ3fPSYpS72Z5p5Sd8=; b=Z3PcaCb/OWv4cqwp8oINXe5X3sC4vpG2/N16BnTbnaMU+uAYdeSkOCq1CAcoQD+SYI 6ig5181ZhnEOI+inuCCBqzB/b4/qhS5gIruXhCfk/xY8SeBJxJody3o3bizG8QstyWZt 392GcETKsj5CP0Zo6zmRDeKlPzXurXGWN4qTG7ArfUAXxCe8uzvMEUivovo4hWtKBCOT LMi3c1Vv+YZo/40oI9HY556RK+d0OjkH818/qegehlyo/xMFLWEGyoghpgpfiLyJ4A3b rj2WHg/IJZ9vrwoQ4YmuUAXMh+K6u405g1zNu0T6cXmQjwvQ+y8tWZ/ulKy2qv6N2Omq ABhA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716593112; x=1717197912; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Vy+becK3ZE58nm2Idqea46QX6RZ3fPSYpS72Z5p5Sd8=; b=BuVw+MNh5zGahYot0UkHZwNlkZr87GwCFt4nulMA6nZgs7LZ3joe/pDp4Ynrx/gBil 
rhOvF1/cZT5AlOJvFoiSGUCwV6CqzEPGTQe6DwWsAKda5jFSoZjuYLtg2N+FSHDltq3z olbkhq+gPtzLNUbwLFoZghe/CAnKW9s93ra4PXt5MfRbL6IuwCnO/pw//p/r/0kXfmJJ fn1ePsR2lj1tZlOR00imHvsu1d2fwv0e7NyDDaGpKidbEMx7akdJuzIXqF0jadh+L7cv EF1bINbU4cSAkoiaxkwjMk3rv0Ag7znWEqcX/2DfB0+45KVjeqMDToZhjlftRWMXjbt/ yFGA== X-Gm-Message-State: AOJu0YxO9mjdqNpz/h/+abHjBeWk3r1krvKAQqevuJnh1m1MKaa98IdD jaHh3H2nqdRdyKoJgcKeupw2XyKujRa2GYsRP0aPBRMKKxGH969vqwy2z2fyP6N7/jOlO3xc4Zs a X-Google-Smtp-Source: AGHT+IGuTJqwLfS9Jj9E9paxx80w63zSbl0AJ402OcjI19R2zlwRxnLlNkK5iiRCYQKbgUd1Ragp9w== X-Received: by 2002:a05:6a00:3198:b0:6f4:9fc7:d239 with SMTP id d2e1a72fcca58-6f7727be1f4mr7160793b3a.14.1716593112182; Fri, 24 May 2024 16:25:12 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. [174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:11 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 56/67] target/arm: Convert SHSUB, UHSUB to gvec Date: Fri, 24 May 2024 16:21:10 -0700 Message-Id: <20240524232121.284515-57-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::436; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x436.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper.h | 6 -- target/arm/tcg/translate.h | 4 + target/arm/tcg/gengvec.c | 144 ++++++++++++++++++++++++++++++++ target/arm/tcg/neon_helper.c | 27 ------ target/arm/tcg/translate-a64.c | 17 ++-- target/arm/tcg/translate-neon.c | 4 +- 6 files changed, 157 insertions(+), 45 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index b26bfcb079..b95f24ed0a 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -274,12 +274,6 @@ DEF_HELPER_2(neon_rhadd_s16, i32, i32, i32) DEF_HELPER_2(neon_rhadd_u16, i32, i32, i32) DEF_HELPER_2(neon_rhadd_s32, s32, s32, s32) DEF_HELPER_2(neon_rhadd_u32, i32, i32, i32) -DEF_HELPER_2(neon_hsub_s8, i32, i32, i32) -DEF_HELPER_2(neon_hsub_u8, i32, i32, i32) -DEF_HELPER_2(neon_hsub_s16, i32, i32, i32) -DEF_HELPER_2(neon_hsub_u16, i32, i32, i32) -DEF_HELPER_2(neon_hsub_s32, s32, s32, s32) -DEF_HELPER_2(neon_hsub_u32, i32, i32, i32) DEF_HELPER_2(neon_pmin_u8, i32, i32, i32) DEF_HELPER_2(neon_pmin_s8, i32, i32, i32) diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h index dd99d76bf2..315e0afd04 100644 --- a/target/arm/tcg/translate.h +++ b/target/arm/tcg/translate.h @@ -476,6 +476,10 @@ void gen_gvec_shadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); void gen_gvec_uhadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_shsub(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_uhsub(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); diff --git 
a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
index 0627cec6b2..6a54ad2d21 100644
--- a/target/arm/tcg/gengvec.c
+++ b/target/arm/tcg/gengvec.c
@@ -1996,3 +1996,147 @@ void gen_gvec_uhadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
     tcg_debug_assert(vece <= MO_32);
     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &g[vece]);
 }
+
+static void gen_shsub8_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_andc_i64(t, b, a);
+    tcg_gen_vec_sar8i_i64(a, a, 1);
+    tcg_gen_vec_sar8i_i64(b, b, 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+    tcg_gen_vec_sub8_i64(d, a, b);
+    tcg_gen_vec_sub8_i64(d, d, t);
+}
+
+static void gen_shsub16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_andc_i64(t, b, a);
+    tcg_gen_vec_sar16i_i64(a, a, 1);
+    tcg_gen_vec_sar16i_i64(b, b, 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
+    tcg_gen_vec_sub16_i64(d, a, b);
+    tcg_gen_vec_sub16_i64(d, d, t);
+}
+
+static void gen_shsub_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_andc_i32(t, b, a);
+    tcg_gen_sari_i32(a, a, 1);
+    tcg_gen_sari_i32(b, b, 1);
+    tcg_gen_andi_i32(t, t, 1);
+    tcg_gen_sub_i32(d, a, b);
+    tcg_gen_sub_i32(d, d, t);
+}
+
+static void gen_shsub_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_andc_vec(vece, t, b, a);
+    tcg_gen_sari_vec(vece, a, a, 1);
+    tcg_gen_sari_vec(vece, b, b, 1);
+    tcg_gen_and_vec(vece, t, t, tcg_constant_vec_matching(d, vece, 1));
+    tcg_gen_sub_vec(vece, d, a, b);
+    tcg_gen_sub_vec(vece, d, d, t);
+}
+
+void gen_gvec_shsub(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sari_vec, INDEX_op_sub_vec, 0
+    };
+    static const GVecGen3 g[4] = {
+        { .fni8 = gen_shsub8_i64,
+          .fniv = gen_shsub_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_shsub16_i64,
+          .fniv = gen_shsub_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_shsub_i32,
+          .fniv = gen_shsub_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+    };
+    assert(vece <= MO_32);
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &g[vece]);
+}
+
+static void gen_uhsub8_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_andc_i64(t, b, a);
+    tcg_gen_vec_shr8i_i64(a, a, 1);
+    tcg_gen_vec_shr8i_i64(b, b, 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+    tcg_gen_vec_sub8_i64(d, a, b);
+    tcg_gen_vec_sub8_i64(d, d, t);
+}
+
+static void gen_uhsub16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_andc_i64(t, b, a);
+    tcg_gen_vec_shr16i_i64(a, a, 1);
+    tcg_gen_vec_shr16i_i64(b, b, 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
+    tcg_gen_vec_sub16_i64(d, a, b);
+    tcg_gen_vec_sub16_i64(d, d, t);
+}
+
+static void gen_uhsub_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_andc_i32(t, b, a);
+    tcg_gen_shri_i32(a, a, 1);
+    tcg_gen_shri_i32(b, b, 1);
+    tcg_gen_andi_i32(t, t, 1);
+    tcg_gen_sub_i32(d, a, b);
+    tcg_gen_sub_i32(d, d, t);
+}
+
+static void gen_uhsub_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_andc_vec(vece, t, b, a);
+    tcg_gen_shri_vec(vece, a, a, 1);
+    tcg_gen_shri_vec(vece, b, b, 1);
+    tcg_gen_and_vec(vece, t, t, tcg_constant_vec_matching(d, vece, 1));
+    tcg_gen_sub_vec(vece, d, a, b);
+    tcg_gen_sub_vec(vece, d, d, t);
+}
+
+void gen_gvec_uhsub(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                    uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_sub_vec, 0
+    };
+    static const GVecGen3 g[4] = {
+        { .fni8 = gen_uhsub8_i64,
+          .fniv = gen_uhsub_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_uhsub16_i64,
+          .fniv = gen_uhsub_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_uhsub_i32,
+          .fniv = gen_uhsub_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+    };
+    assert(vece <= MO_32);
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &g[vece]);
+}
diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c
index defd28a6f7..d1641a5252 100644
--- a/target/arm/tcg/neon_helper.c
+++ b/target/arm/tcg/neon_helper.c
@@ -206,33 +206,6 @@ uint32_t HELPER(neon_rhadd_u32)(uint32_t src1, uint32_t src2)
     return dest;
 }
 
-#define NEON_FN(dest, src1, src2) dest = (src1 - src2) >> 1
-NEON_VOP(hsub_s8, neon_s8, 4)
-NEON_VOP(hsub_u8, neon_u8, 4)
-NEON_VOP(hsub_s16, neon_s16, 2)
-NEON_VOP(hsub_u16, neon_u16, 2)
-#undef NEON_FN
-
-int32_t HELPER(neon_hsub_s32)(int32_t src1, int32_t src2)
-{
-    int32_t dest;
-
-    dest = (src1 >> 1) - (src2 >> 1);
-    if ((~src1) & src2 & 1)
-        dest--;
-    return dest;
-}
-
-uint32_t HELPER(neon_hsub_u32)(uint32_t src1, uint32_t src2)
-{
-    uint32_t dest;
-
-    dest = (src1 >> 1) - (src2 >> 1);
-    if ((~src1) & src2 & 1)
-        dest--;
-    return dest;
-}
-
 #define NEON_FN(dest, src1, src2) dest = (src1 < src2) ? src1 : src2
 NEON_POP(pmin_s8, neon_s8, 4)
 NEON_POP(pmin_u8, neon_u8, 4)
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 00c04425c1..63f7a59f94 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -10967,6 +10967,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
     }
 
     switch (opcode) {
+    case 0x04: /* SHSUB, UHSUB */
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uhsub, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_shsub, size);
+        }
+        return;
     case 0x0c: /* SMAX, UMAX */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size);
@@ -11044,16 +11051,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                 genfn = fns[size][u];
                 break;
             }
-            case 0x4: /* SHSUB, UHSUB */
-            {
-                static NeonGenTwoOpFn * const fns[3][2] = {
-                    { gen_helper_neon_hsub_s8, gen_helper_neon_hsub_u8 },
-                    { gen_helper_neon_hsub_s16, gen_helper_neon_hsub_u16 },
-                    { gen_helper_neon_hsub_s32, gen_helper_neon_hsub_u32 },
-                };
-                genfn = fns[size][u];
-                break;
-            }
             default:
                 g_assert_not_reached();
             }
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index 29e5c4a0a3..d59d5804c5 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -843,6 +843,8 @@ DO_3SAME_NO_SZ_3(VPMAX_U, gen_gvec_umaxp)
 DO_3SAME_NO_SZ_3(VPMIN_U, gen_gvec_uminp)
 DO_3SAME_NO_SZ_3(VHADD_S, gen_gvec_shadd)
 DO_3SAME_NO_SZ_3(VHADD_U, gen_gvec_uhadd)
+DO_3SAME_NO_SZ_3(VHSUB_S, gen_gvec_shsub)
+DO_3SAME_NO_SZ_3(VHSUB_U, gen_gvec_uhsub)
 
 #define DO_3SAME_CMP(INSN, COND) \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
@@ -953,8 +955,6 @@ DO_SHA2(SHA256SU1, gen_helper_crypto_sha256su1)
         FUNC(d, tcg_env, n, m); \
     }
 
-DO_3SAME_32(VHSUB_S, hsub_s)
-DO_3SAME_32(VHSUB_U, hsub_u)
 DO_3SAME_32(VRHADD_S, rhadd_s)
 DO_3SAME_32(VRHADD_U, rhadd_u)
 

From patchwork Fri May 24 23:21:11 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673816
Received: from stoup.. (174-21-72-5.tukw.qwest.net.
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:12 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 57/67] target/arm: Convert SHSUB, UHSUB to decodetree
Date: Fri, 24 May 2024 16:21:11 -0700
Message-Id: <20240524232121.284515-58-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>
MIME-Version: 1.0

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/tcg/a64.decode      |  2 ++
 target/arm/tcg/translate-a64.c | 11 +++--------
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index e33d91fd0a..b1bbcb144e 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -914,6 +914,8 @@ CMTST_v 0.00 1110 ..1 ..... 10001 1 ..... ..... @qrrr_e
 CMEQ_v 0.10 1110 ..1 ..... 10001 1 ..... ..... @qrrr_e
 SHADD_v 0.00 1110 ..1 ..... 00000 1 ..... ..... @qrrr_e
 UHADD_v 0.10 1110 ..1 ..... 00000 1 ..... ..... @qrrr_e
+SHSUB_v 0.00 1110 ..1 ..... 00100 1 ..... ..... @qrrr_e
+UHSUB_v 0.10 1110 ..1 ..... 00100 1 ..... ..... @qrrr_e
 
 ### Advanced SIMD scalar x indexed element
 
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 63f7a59f94..6571b999f4 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -5456,6 +5456,8 @@ TRANS(ADD_v, do_gvec_fn3, a, tcg_gen_gvec_add)
 TRANS(SUB_v, do_gvec_fn3, a, tcg_gen_gvec_sub)
 TRANS(SHADD_v, do_gvec_fn3_no64, a, gen_gvec_shadd)
 TRANS(UHADD_v, do_gvec_fn3_no64, a, gen_gvec_uhadd)
+TRANS(SHSUB_v, do_gvec_fn3_no64, a, gen_gvec_shsub)
+TRANS(UHSUB_v, do_gvec_fn3_no64, a, gen_gvec_uhsub)
 
 static bool do_cmop_v(DisasContext *s, arg_qrrr_e *a, TCGCond cond)
 {
@@ -10923,7 +10925,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
         }
         /* fall through */
     case 0x2: /* SRHADD, URHADD */
-    case 0x4: /* SHSUB, UHSUB */
     case 0xc: /* SMAX, UMAX */
     case 0xd: /* SMIN, UMIN */
     case 0xe: /* SABD, UABD */
@@ -10949,6 +10950,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
 
     case 0x0: /* SHADD, UHADD */
     case 0x01: /* SQADD, UQADD */
+    case 0x04: /* SHSUB, UHSUB */
     case 0x05: /* SQSUB, UQSUB */
     case 0x06: /* CMGT, CMHI */
     case 0x07: /* CMGE, CMHS */
@@ -10967,13 +10969,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
     }
 
     switch (opcode) {
-    case 0x04: /* SHSUB, UHSUB */
-        if (u) {
-            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uhsub, size);
-        } else {
-            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_shsub, size);
-        }
-        return;
     case 0x0c: /* SMAX, UMAX */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size);

From patchwork Fri May 24 23:21:12 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673817
Received: from
lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6156BC25B7A for ; Fri, 24 May 2024 23:27:36 +0000 (UTC)
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 58/67] target/arm: Convert SRHADD, URHADD to gvec
Date: Fri, 24 May 2024 16:21:12 -0700
Message-Id: <20240524232121.284515-59-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>
MIME-Version: 1.0
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version:
2.1.29
Precedence: list

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/helper.h             |   7 --
 target/arm/tcg/translate.h      |   4 +
 target/arm/tcg/gengvec.c        | 144 ++++++++++++++++++++++++++++++++
 target/arm/tcg/neon_helper.c    |  27 ------
 target/arm/tcg/translate-a64.c  |  48 ++---------
 target/arm/tcg/translate-neon.c |  26 +-----
 6 files changed, 158 insertions(+), 98 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index b95f24ed0a..85f9302563 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -268,13 +268,6 @@ DEF_HELPER_FLAGS_2(fjcvtzs, TCG_CALL_NO_RWG, i64, f64, ptr)
 DEF_HELPER_FLAGS_3(check_hcr_el2_trap, TCG_CALL_NO_WG, void, env, i32, i32)
 
 /* neon_helper.c */
-DEF_HELPER_2(neon_rhadd_s8, i32, i32, i32)
-DEF_HELPER_2(neon_rhadd_u8, i32, i32, i32)
-DEF_HELPER_2(neon_rhadd_s16, i32, i32, i32)
-DEF_HELPER_2(neon_rhadd_u16, i32, i32, i32)
-DEF_HELPER_2(neon_rhadd_s32, s32, s32, s32)
-DEF_HELPER_2(neon_rhadd_u32, i32, i32, i32)
-
 DEF_HELPER_2(neon_pmin_u8, i32, i32, i32)
 DEF_HELPER_2(neon_pmin_s8, i32, i32, i32)
 DEF_HELPER_2(neon_pmin_u16, i32, i32, i32)
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index 315e0afd04..3b1e68b779 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -480,6 +480,10 @@ void gen_gvec_shsub(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                     uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 void gen_gvec_uhsub(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
                     uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_srhadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                     uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void gen_gvec_urhadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                     uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
 
 void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b);
 void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b);
diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c
index 6a54ad2d21..32caabd126 100644
--- a/target/arm/tcg/gengvec.c
+++ b/target/arm/tcg/gengvec.c
@@ -2140,3 +2140,147 @@ void gen_gvec_uhsub(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
     assert(vece <= MO_32);
     tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &g[vece]);
 }
+
+static void gen_srhadd8_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_or_i64(t, a, b);
+    tcg_gen_vec_sar8i_i64(a, a, 1);
+    tcg_gen_vec_sar8i_i64(b, b, 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+    tcg_gen_vec_add8_i64(d, a, b);
+    tcg_gen_vec_add8_i64(d, d, t);
+}
+
+static void gen_srhadd16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_or_i64(t, a, b);
+    tcg_gen_vec_sar16i_i64(a, a, 1);
+    tcg_gen_vec_sar16i_i64(b, b, 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
+    tcg_gen_vec_add16_i64(d, a, b);
+    tcg_gen_vec_add16_i64(d, d, t);
+}
+
+static void gen_srhadd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_or_i32(t, a, b);
+    tcg_gen_sari_i32(a, a, 1);
+    tcg_gen_sari_i32(b, b, 1);
+    tcg_gen_andi_i32(t, t, 1);
+    tcg_gen_add_i32(d, a, b);
+    tcg_gen_add_i32(d, d, t);
+}
+
+static void gen_srhadd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_or_vec(vece, t, a, b);
+    tcg_gen_sari_vec(vece, a, a, 1);
+    tcg_gen_sari_vec(vece, b, b, 1);
+    tcg_gen_and_vec(vece, t, t, tcg_constant_vec_matching(d, vece, 1));
+    tcg_gen_add_vec(vece, d, a, b);
+    tcg_gen_add_vec(vece, d, d, t);
+}
+
+void gen_gvec_srhadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                     uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sari_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen3 g[] = {
+        { .fni8 = gen_srhadd8_i64,
+          .fniv = gen_srhadd_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_srhadd16_i64,
+          .fniv = gen_srhadd_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_srhadd_i32,
+          .fniv = gen_srhadd_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+    };
+    assert(vece <= MO_32);
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &g[vece]);
+}
+
+static void gen_urhadd8_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_or_i64(t, a, b);
+    tcg_gen_vec_shr8i_i64(a, a, 1);
+    tcg_gen_vec_shr8i_i64(b, b, 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_8, 1));
+    tcg_gen_vec_add8_i64(d, a, b);
+    tcg_gen_vec_add8_i64(d, d, t);
+}
+
+static void gen_urhadd16_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_or_i64(t, a, b);
+    tcg_gen_vec_shr16i_i64(a, a, 1);
+    tcg_gen_vec_shr16i_i64(b, b, 1);
+    tcg_gen_andi_i64(t, t, dup_const(MO_16, 1));
+    tcg_gen_vec_add16_i64(d, a, b);
+    tcg_gen_vec_add16_i64(d, d, t);
+}
+
+static void gen_urhadd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_or_i32(t, a, b);
+    tcg_gen_shri_i32(a, a, 1);
+    tcg_gen_shri_i32(b, b, 1);
+    tcg_gen_andi_i32(t, t, 1);
+    tcg_gen_add_i32(d, a, b);
+    tcg_gen_add_i32(d, d, t);
+}
+
+static void gen_urhadd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_or_vec(vece, t, a, b);
+    tcg_gen_shri_vec(vece, a, a, 1);
+    tcg_gen_shri_vec(vece, b, b, 1);
+    tcg_gen_and_vec(vece, t, t, tcg_constant_vec_matching(d, vece, 1));
+    tcg_gen_add_vec(vece, d, a, b);
+    tcg_gen_add_vec(vece, d, d, t);
+}
+
+void gen_gvec_urhadd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                     uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen3 g[] = {
+        { .fni8 = gen_urhadd8_i64,
+          .fniv = gen_urhadd_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_urhadd16_i64,
+          .fniv = gen_urhadd_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_urhadd_i32,
+          .fniv = gen_urhadd_vec,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+    };
+    assert(vece <= MO_32);
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &g[vece]);
+}
diff --git a/target/arm/tcg/neon_helper.c b/target/arm/tcg/neon_helper.c
index d1641a5252..082bfd88ad 100644
--- a/target/arm/tcg/neon_helper.c
+++ b/target/arm/tcg/neon_helper.c
@@ -179,33 +179,6 @@ uint32_t HELPER(glue(neon_,name))(uint32_t arg) \
     return arg; \
 }
 
-#define NEON_FN(dest, src1, src2) dest = (src1 + src2 + 1) >> 1
-NEON_VOP(rhadd_s8, neon_s8, 4)
-NEON_VOP(rhadd_u8, neon_u8, 4)
-NEON_VOP(rhadd_s16, neon_s16, 2)
-NEON_VOP(rhadd_u16, neon_u16, 2)
-#undef NEON_FN
-
-int32_t HELPER(neon_rhadd_s32)(int32_t src1, int32_t src2)
-{
-    int32_t dest;
-
-    dest = (src1 >> 1) + (src2 >> 1);
-    if ((src1 | src2) & 1)
-        dest++;
-    return dest;
-}
-
-uint32_t HELPER(neon_rhadd_u32)(uint32_t src1, uint32_t src2)
-{
-    uint32_t dest;
-
-    dest = (src1 >> 1) + (src2 >> 1);
-    if ((src1 | src2) & 1)
-        dest++;
-    return dest;
-}
-
 #define NEON_FN(dest, src1, src2) dest = (src1 < src2) ? src1 : src2
 NEON_POP(pmin_s8, neon_s8, 4)
 NEON_POP(pmin_u8, neon_u8, 4)
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 6571b999f4..40aa7a9d57 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -10915,7 +10915,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
     int rm = extract32(insn, 16, 5);
     int rn = extract32(insn, 5, 5);
     int rd = extract32(insn, 0, 5);
-    int pass;
 
     switch (opcode) {
     case 0x13: /* MUL, PMUL */
@@ -10969,6 +10968,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
     }
 
     switch (opcode) {
+    case 0x02: /* SRHADD, URHADD */
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_urhadd, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_srhadd, size);
+        }
+        return;
     case 0x0c: /* SMAX, UMAX */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size);
@@ -11021,45 +11027,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
         }
         return;
     }
-
-    if (size == 3) {
-        g_assert_not_reached();
-    } else {
-        for (pass = 0; pass < (is_q ? 4 : 2); pass++) {
-            TCGv_i32 tcg_op1 = tcg_temp_new_i32();
-            TCGv_i32 tcg_op2 = tcg_temp_new_i32();
-            TCGv_i32 tcg_res = tcg_temp_new_i32();
-            NeonGenTwoOpFn *genfn = NULL;
-            NeonGenTwoOpEnvFn *genenvfn = NULL;
-
-            read_vec_element_i32(s, tcg_op1, rn, pass, MO_32);
-            read_vec_element_i32(s, tcg_op2, rm, pass, MO_32);
-
-            switch (opcode) {
-            case 0x2: /* SRHADD, URHADD */
-            {
-                static NeonGenTwoOpFn * const fns[3][2] = {
-                    { gen_helper_neon_rhadd_s8, gen_helper_neon_rhadd_u8 },
-                    { gen_helper_neon_rhadd_s16, gen_helper_neon_rhadd_u16 },
-                    { gen_helper_neon_rhadd_s32, gen_helper_neon_rhadd_u32 },
-                };
-                genfn = fns[size][u];
-                break;
-            }
-            default:
-                g_assert_not_reached();
-            }
-
-            if (genenvfn) {
-                genenvfn(tcg_res, tcg_env, tcg_op1, tcg_op2);
-            } else {
-                genfn(tcg_res, tcg_op1, tcg_op2);
-            }
-
-            write_vec_element_i32(s, tcg_res, rd, pass, MO_32);
-        }
-    }
-    clear_vec_high(s, is_q, rd);
+    g_assert_not_reached();
 }
 
 /* AdvSIMD three same
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index d59d5804c5..f9a8753906 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -845,6 +845,8 @@ DO_3SAME_NO_SZ_3(VHADD_S, gen_gvec_shadd)
 DO_3SAME_NO_SZ_3(VHADD_U, gen_gvec_uhadd)
 DO_3SAME_NO_SZ_3(VHSUB_S, gen_gvec_shsub)
 DO_3SAME_NO_SZ_3(VHSUB_U, gen_gvec_uhsub)
+DO_3SAME_NO_SZ_3(VRHADD_S, gen_gvec_srhadd)
+DO_3SAME_NO_SZ_3(VRHADD_U, gen_gvec_urhadd)
 
 #define DO_3SAME_CMP(INSN, COND) \
     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
@@ -922,27 +924,6 @@ DO_SHA2(SHA256H, gen_helper_crypto_sha256h)
 DO_SHA2(SHA256H2, gen_helper_crypto_sha256h2)
 DO_SHA2(SHA256SU1, gen_helper_crypto_sha256su1)
 
-#define DO_3SAME_32(INSN, FUNC) \
-    static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
-                                uint32_t rn_ofs, uint32_t rm_ofs, \
-                                uint32_t oprsz, uint32_t maxsz) \
-    { \
-        static const GVecGen3 ops[4] = { \
-            { .fni4 = gen_helper_neon_##FUNC##8 }, \
-            { .fni4 = gen_helper_neon_##FUNC##16 }, \
-            { .fni4 = gen_helper_neon_##FUNC##32 }, \
-            { 0 }, \
-        }; \
-        tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); \
-    } \
-    static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
-    { \
-        if (a->size > 2) { \
-            return false; \
-        } \
-        return do_3same(s, a, gen_##INSN##_3s); \
-    }
-
 /*
  * Some helper functions need to be passed the tcg_env. In order
  * to use those with the gvec APIs like tcg_gen_gvec_3() we need
@@ -955,9 +936,6 @@ DO_SHA2(SHA256SU1, gen_helper_crypto_sha256su1)
         FUNC(d, tcg_env, n, m); \
     }
 
-DO_3SAME_32(VRHADD_S, rhadd_s)
-DO_3SAME_32(VRHADD_U, rhadd_u)
-
 #define DO_3SAME_VQDMULH(INSN, FUNC) \
     WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16); \
     WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32); \

From patchwork Fri May 24 23:21:13 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673842
Received: from stoup.. (174-21-72-5.tukw.qwest.net.
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:14 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 59/67] target/arm: Convert SRHADD, URHADD to decodetree
Date: Fri, 24 May 2024 16:21:13 -0700
Message-Id: <20240524232121.284515-60-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>
MIME-Version: 1.0

Signed-off-by: Richard Henderson
Reviewed-by: Peter Maydell
---
 target/arm/tcg/a64.decode      |  2 ++
 target/arm/tcg/translate-a64.c | 11 +++--------
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index b1bbcb144e..1c448b4f7c 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -916,6 +916,8 @@ SHADD_v 0.00 1110 ..1 ..... 00000 1 ..... ..... @qrrr_e
 UHADD_v 0.10 1110 ..1 ..... 00000 1 ..... ..... @qrrr_e
 SHSUB_v 0.00 1110 ..1 ..... 00100 1 ..... ..... @qrrr_e
 UHSUB_v 0.10 1110 ..1 ..... 00100 1 ..... ..... @qrrr_e
+SRHADD_v 0.00 1110 ..1 ..... 00010 1 ..... ..... @qrrr_e
+URHADD_v 0.10 1110 ..1 ..... 00010 1 ..... ..... @qrrr_e
 
 ### Advanced SIMD scalar x indexed element
 
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 40aa7a9d57..9ef5de6755 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -5458,6 +5458,8 @@ TRANS(SHADD_v, do_gvec_fn3_no64, a, gen_gvec_shadd)
 TRANS(UHADD_v, do_gvec_fn3_no64, a, gen_gvec_uhadd)
 TRANS(SHSUB_v, do_gvec_fn3_no64, a, gen_gvec_shsub)
 TRANS(UHSUB_v, do_gvec_fn3_no64, a, gen_gvec_uhsub)
+TRANS(SRHADD_v, do_gvec_fn3_no64, a, gen_gvec_srhadd)
+TRANS(URHADD_v, do_gvec_fn3_no64, a, gen_gvec_urhadd)
 
 static bool do_cmop_v(DisasContext *s, arg_qrrr_e *a, TCGCond cond)
 {
@@ -10923,7 +10925,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             return;
         }
         /* fall through */
-    case 0x2: /* SRHADD, URHADD */
     case 0xc: /* SMAX, UMAX */
     case 0xd: /* SMIN, UMIN */
     case 0xe: /* SABD, UABD */
@@ -10949,6 +10950,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
 
     case 0x0: /* SHADD, UHADD */
     case 0x01: /* SQADD, UQADD */
+    case 0x02: /* SRHADD, URHADD */
     case 0x04: /* SHSUB, UHSUB */
     case 0x05: /* SQSUB, UQSUB */
     case 0x06: /* CMGT, CMHI */
@@ -10968,13 +10970,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
     }
 
     switch (opcode) {
-    case 0x02: /* SRHADD, URHADD */
-        if (u) {
-            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_urhadd, size);
-        } else {
-            gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_srhadd, size);
-        }
-        return;
     case 0x0c: /* SMAX, UMAX */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size);

From patchwork Fri May 24 23:21:14 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673826
Received: from
lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4E114C25B74 for ; Fri, 24 May 2024 23:28:27 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeJD-0003LI-Lw; Fri, 24 May 2024 19:26:47 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeI6-0007zn-09 for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:38 -0400 Received: from mail-pf1-x435.google.com ([2607:f8b0:4864:20::435]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHt-0006wE-CM for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:37 -0400 Received: by mail-pf1-x435.google.com with SMTP id d2e1a72fcca58-6fcbd812b33so242631b3a.3 for ; Fri, 24 May 2024 16:25:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716593115; x=1717197915; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=uJ0q4kYSciCCJ5iNnH0cT/11J4ZEDfTPyyL262CCOCQ=; b=m/PfeRW9mPt6DQ7OXihTEGWyVk3C/nVlqDJT78WF/JSffTqiW9g6W7ohSlHRzrU5X3 rDvrDu1doTxB5muO9Vr/e8TwmAmhAhe4PQLByQUrNlBz1C1nKeboUz/pISa2zTNOZJd1 9qfSMYi666m4txlCQm3omIDb587UKkANCLit1N6X9Qvtp1xBr9GytG2q3/0KXrimkGct DP10Wo0f2jjUwy3kNmTcIljSnusRDXchaGGkjsF5VCJh3G61Wro5c/qlELwIpBAomNnF TzxeLx/mCwr5OD4r2Qe2Nn+syj0DgBqtt4qLoZTY83aaDYklx20N245CDOMdKqAKUemF ECAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716593115; x=1717197915; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=uJ0q4kYSciCCJ5iNnH0cT/11J4ZEDfTPyyL262CCOCQ=; b=ZgpqcCi+ffI+7zKqsz5Kvk6667r4ZSDQ4dDNA0k2cnJXYQw6smW6SZmVrRKdLNOP/2 R7neBXTIzFW2vmZpc1Pge4gXRFmcWywvUMM3TjYxRzq8SH3QppM8EAG2/KWjJ3GvDdYZ 1Y9wwSH/1KOZXH8qUZb2eJhYUxyy0rc6BFOgQzJGrI8VWPv6BPJw/2c//rwUrBVkbkIe HSMi6lmlkL367WTIJMRZ8tWz+GzHa5Nff5zS+3bOUKVI3JIqcNtT+J6nIEJTMF+JQ5zh WA/XpNyy1DGHVeLuIdpb7cLfUQqxT8JOswOQpQNgqiiPzTxkeFAFqD4h9haCpVsntrNp PKdg== X-Gm-Message-State: AOJu0YyeTVwGtQvzExLknOBIp+Ru3qXpmdVI352DTZFlQRs/RJnWHNVa pi6p/Qo+/SAXvMmH3z0UHFCR9hpRA4g4el9SKnXRjyv0njJk7GfZFwbMZC7BWh5oVTOTAUtGbD8 Z X-Google-Smtp-Source: AGHT+IHLFahYjw3fYU81v2cdjJZyQlZTyUz7GRQhTeoEVUyx6hl63AGokg/huEw2Jar7pjQDDN6cog== X-Received: by 2002:a05:6a00:418a:b0:6e6:970f:a809 with SMTP id d2e1a72fcca58-6f8f392b6a8mr4068086b3a.20.1716593115251; Fri, 24 May 2024 16:25:15 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. [174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:14 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 60/67] target/arm: Convert SMAX, SMIN, UMAX, UMIN to decodetree Date: Fri, 24 May 2024 16:21:14 -0700 Message-Id: <20240524232121.284515-61-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::435; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x435.google.com X-Spam_score_int: -16 X-Spam_score: -1.7 X-Spam_bar: - X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, DKIM_INVALID=0.1, DKIM_SIGNED=0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/a64.decode | 4 ++++ target/arm/tcg/translate-a64.c | 22 ++++++---------------- 2 files changed, 10 insertions(+), 16 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 1c448b4f7c..bc98963bc5 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -918,6 +918,10 @@ SHSUB_v 0.00 1110 ..1 ..... 00100 1 ..... ..... @qrrr_e UHSUB_v 0.10 1110 ..1 ..... 00100 1 ..... ..... @qrrr_e SRHADD_v 0.00 1110 ..1 ..... 00010 1 ..... ..... @qrrr_e URHADD_v 0.10 1110 ..1 ..... 00010 1 ..... ..... @qrrr_e +SMAX_v 0.00 1110 ..1 ..... 01100 1 ..... ..... @qrrr_e +UMAX_v 0.10 1110 ..1 ..... 01100 1 ..... ..... @qrrr_e +SMIN_v 0.00 1110 ..1 ..... 01101 1 ..... ..... @qrrr_e +UMIN_v 0.10 1110 ..1 ..... 01101 1 ..... ..... 
@qrrr_e ### Advanced SIMD scalar x indexed element diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 9ef5de6755..db6f59df17 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5460,6 +5460,10 @@ TRANS(SHSUB_v, do_gvec_fn3_no64, a, gen_gvec_shsub) TRANS(UHSUB_v, do_gvec_fn3_no64, a, gen_gvec_uhsub) TRANS(SRHADD_v, do_gvec_fn3_no64, a, gen_gvec_srhadd) TRANS(URHADD_v, do_gvec_fn3_no64, a, gen_gvec_urhadd) +TRANS(SMAX_v, do_gvec_fn3_no64, a, tcg_gen_gvec_smax) +TRANS(UMAX_v, do_gvec_fn3_no64, a, tcg_gen_gvec_umax) +TRANS(SMIN_v, do_gvec_fn3_no64, a, tcg_gen_gvec_smin) +TRANS(UMIN_v, do_gvec_fn3_no64, a, tcg_gen_gvec_umin) static bool do_cmop_v(DisasContext *s, arg_qrrr_e *a, TCGCond cond) { @@ -10925,8 +10929,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) return; } /* fall through */ - case 0xc: /* SMAX, UMAX */ - case 0xd: /* SMIN, UMIN */ case 0xe: /* SABD, UABD */ case 0xf: /* SABA, UABA */ case 0x12: /* MLA, MLS */ @@ -10959,6 +10961,8 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) case 0x09: /* SQSHL, UQSHL */ case 0x0a: /* SRSHL, URSHL */ case 0x0b: /* SQRSHL, UQRSHL */ + case 0x0c: /* SMAX, UMAX */ + case 0x0d: /* SMIN, UMIN */ case 0x10: /* ADD, SUB */ case 0x11: /* CMTST, CMEQ */ unallocated_encoding(s); @@ -10970,20 +10974,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { - case 0x0c: /* SMAX, UMAX */ - if (u) { - gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size); - } else { - gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smax, size); - } - return; - case 0x0d: /* SMIN, UMIN */ - if (u) { - gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umin, size); - } else { - gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smin, size); - } - return; case 0xe: /* SABD, UABD */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uabd, size); From patchwork Fri May 24 23:21:15 2024 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673840 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F01F8C3DA40 for ; Fri, 24 May 2024 23:30:30 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeIs-0002Ue-B6; Fri, 24 May 2024 19:26:30 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeI0-0007dJ-8m for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:33 -0400 Received: from mail-pf1-x429.google.com ([2607:f8b0:4864:20::429]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHt-0006wM-8H for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:30 -0400 Received: by mail-pf1-x429.google.com with SMTP id d2e1a72fcca58-6f4603237e0so2616792b3a.0 for ; Fri, 24 May 2024 16:25:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716593116; x=1717197916; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=4y0c3xhRmUQuoyoD+KpuLG+Oeo/1XM2y/5M67hpS/L0=; b=v8vh9YLROjhygyae/RA0yn/lYz6wwilr3jA9/YCbl7PJVVK+8eva4De7CviKnIaFL3 lVhcO0zXeSa/HxoMdRDYSAhfm7ck/OTYtyKKPrnpMuC9wiVFmBsNwFnjslOpz/vltkbv zYs2q85/vY6F6buSUHx+2rE9Oq/RIgq0DeRSIoYj14TLt0E10pYNdpIqpvudqUqv6/HX 0U5kRXEsp6e1NSy62N2IBp28SPKynpY2Peh4JJDcFRs++qru/5M5oghpUK9ydEPlLcuj JDX1IHaBBzzdw2D3vs+BSZGyBQLAVs0FA5wbz85c7vQAcPteW+vUhY3DDpAmdtwkoWP7 7A+g== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716593116; x=1717197916; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4y0c3xhRmUQuoyoD+KpuLG+Oeo/1XM2y/5M67hpS/L0=; b=RjRILyp14in6/UNDcq0V2QhhxdPLePWRu66ATkQCAvFUoCOapU4wGCxpDD13HbbeV8 aRPlC3mfgchO3eAvaPm+ErMr8TKhbS1Y4Qmdsj2BSCVsMzmCn+71FsbhgDVZOyMBNm7c x1ndzpHGJgLprWyLkZGBO3vcKLqLAbVFqXfanai/RC0WnQR3U7JbgBOG87RRAlG0Cubv AKPqD5Ei6TfiSgVM1Jxad1zlMQw3uU6l1ty0mV8779aUMEYzBqpf9O0vgCGNv9Mg6Km3 an3Lve3PHhK2fE0rcMCVgbVkXLtTc1XaWfOYPGnVaTb9cALTsSevNoXRb2Vc8HCYcMWF THAg== X-Gm-Message-State: AOJu0Yz45Ql+8mfL4e6BKDnbPaBO2zWxP6BVceZeB4KzyEhTSt1At4ZW NTxRmIZfoITFYc1EZttwghsaSWWLFraox8p17RFJ5aexZn4CCrkVcVxYdF/lZVY49Axw7xBEgS4 H X-Google-Smtp-Source: AGHT+IHSPZzU7xkjecvGo6woj8E6OyS5zgX4deLqAjFdIG2OocSw8QjC3ZvYmD94JcZVxL+0Na2Cmw== X-Received: by 2002:a05:6a00:a0a:b0:6f4:7113:5d0a with SMTP id d2e1a72fcca58-6f7727bdd7bmr11700037b3a.11.1716593116063; Fri, 24 May 2024 16:25:16 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:15 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 61/67] target/arm: Convert SABA, SABD, UABA, UABD to decodetree Date: Fri, 24 May 2024 16:21:15 -0700 Message-Id: <20240524232121.284515-62-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::429; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/a64.decode | 4 ++++ target/arm/tcg/translate-a64.c | 22 ++++++---------------- 2 files changed, 10 insertions(+), 16 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index bc98963bc5..07b604ec30 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -922,6 +922,10 @@ SMAX_v 0.00 1110 ..1 ..... 01100 1 ..... ..... @qrrr_e UMAX_v 0.10 1110 ..1 ..... 01100 1 ..... ..... @qrrr_e SMIN_v 0.00 1110 ..1 ..... 01101 1 ..... ..... @qrrr_e UMIN_v 0.10 1110 ..1 ..... 01101 1 ..... ..... 
@qrrr_e +SABD_v 0.00 1110 ..1 ..... 01110 1 ..... ..... @qrrr_e +UABD_v 0.10 1110 ..1 ..... 01110 1 ..... ..... @qrrr_e +SABA_v 0.00 1110 ..1 ..... 01111 1 ..... ..... @qrrr_e +UABA_v 0.10 1110 ..1 ..... 01111 1 ..... ..... @qrrr_e ### Advanced SIMD scalar x indexed element diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index db6f59df17..61afbc434f 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5464,6 +5464,10 @@ TRANS(SMAX_v, do_gvec_fn3_no64, a, tcg_gen_gvec_smax) TRANS(UMAX_v, do_gvec_fn3_no64, a, tcg_gen_gvec_umax) TRANS(SMIN_v, do_gvec_fn3_no64, a, tcg_gen_gvec_smin) TRANS(UMIN_v, do_gvec_fn3_no64, a, tcg_gen_gvec_umin) +TRANS(SABA_v, do_gvec_fn3_no64, a, gen_gvec_saba) +TRANS(UABA_v, do_gvec_fn3_no64, a, gen_gvec_uaba) +TRANS(SABD_v, do_gvec_fn3_no64, a, gen_gvec_sabd) +TRANS(UABD_v, do_gvec_fn3_no64, a, gen_gvec_uabd) static bool do_cmop_v(DisasContext *s, arg_qrrr_e *a, TCGCond cond) { @@ -10929,8 +10933,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) return; } /* fall through */ - case 0xe: /* SABD, UABD */ - case 0xf: /* SABA, UABA */ case 0x12: /* MLA, MLS */ if (size == 3) { unallocated_encoding(s); @@ -10963,6 +10965,8 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) case 0x0b: /* SQRSHL, UQRSHL */ case 0x0c: /* SMAX, UMAX */ case 0x0d: /* SMIN, UMIN */ + case 0x0e: /* SABD, UABD */ + case 0x0f: /* SABA, UABA */ case 0x10: /* ADD, SUB */ case 0x11: /* CMTST, CMEQ */ unallocated_encoding(s); @@ -10974,20 +10978,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { - case 0xe: /* SABD, UABD */ - if (u) { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uabd, size); - } else { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size); - } - return; - case 0xf: /* SABA, UABA */ - if (u) { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uaba, size); - } else { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size); - } - 
return; case 0x13: /* MUL, PMUL */ if (!u) { /* MUL */ gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_mul, size); From patchwork Fri May 24 23:21:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673834 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 78A93C41513 for ; Fri, 24 May 2024 23:30:00 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeIU-0000Ge-27; Fri, 24 May 2024 19:26:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeI0-0007cf-63 for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:32 -0400 Received: from mail-pf1-x434.google.com ([2607:f8b0:4864:20::434]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHt-0006x9-7A for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:29 -0400 Received: by mail-pf1-x434.google.com with SMTP id d2e1a72fcca58-6f8e859eb20so1321850b3a.0 for ; Fri, 24 May 2024 16:25:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716593117; x=1717197917; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=vUxpSrKxVoAKgNl8OSw2nhjum1zUmIGG6Eg1mHx3xuw=; b=J0pRYMg0Df6Ojz3qgMYR91Gri869EU8fVT45RIasQstiBGvsjr7oKvPdHTUV6fHklM XngKjSanZdSvpBqcLouG2QOSgSWGkaNCnQ53y7Mnm/J9sIZzxHWt7t0CxWXIRSRrTwu9 
rKFPPIdFI2piWEATf1puZBUxBVGk2M0dScW+IJKljhYkVVaTzzIdZMbNGK4A/SiJ2qMU LvOeAsXpxoLoBLOM8ZD9IPElsLlSOYuB7c5fIlj4wiwUe3dRNJI/4BhHbgRLZf7ILGT4 94QPba+ADkjnjGkTyDiEEl0J1glLPdiRssiFwTy2w3vgz84SqHJVUdbct0DdEXbsdUNS 7EYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716593117; x=1717197917; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=vUxpSrKxVoAKgNl8OSw2nhjum1zUmIGG6Eg1mHx3xuw=; b=i51d2YtzvVwAMIVlLB0ayqjZyAJGrE0Lw83T3KdSpGvAJv7GG/M/cpC+oOtwI8RScl Djrss9jZDYv1+f8Aoab4QoNIJRU5o6sfaGKdn/tQ5qvTGtzPg4P3KGeIHbmdzYfGiSS2 nQF3RAMgWZ/3YQrlwUYY+Pz0UG2RHcz69gcCcayiZIXQUe51h2WdO/NamL1huCmzOE/K DMeIZIZC4LW0GY91DsR6u+rQrPUtGP3CLfvlYWtZsvUfsfpHkuWnxXVCbbzRFE5UefGB fL0ZTo3GVyoMEjRXYCnXe+ZmfXBPtaAJWZk5SjZyLI0ai2pa3OZqpwaWZAT4Zhi0wnEm 38Xg== X-Gm-Message-State: AOJu0YwZAIHe1AzmMMi8PvhxEvq8/XdRGn9uLFsdEuJbdG48JGX90QX3 3QFmqeqbzxH9B8OpZZLVAnDNEG6B2tatEtoMgnxrEkPd/30aIcOTuMnLbbtcZgt+bvwHbkK0h4V A X-Google-Smtp-Source: AGHT+IEwJE6ILieJXZk90qqb4H/P3OLT83qFf+cwxwtjhWJFDwEbQSlcgjfd8rRHw5s6SLWyPKCh4g== X-Received: by 2002:aa7:9302:0:b0:6f3:edac:d9e4 with SMTP id d2e1a72fcca58-6f8f3700476mr3510278b3a.20.1716593116886; Fri, 24 May 2024 16:25:16 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:16 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 62/67] target/arm: Convert MUL, PMUL to decodetree Date: Fri, 24 May 2024 16:21:16 -0700 Message-Id: <20240524232121.284515-63-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::434; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x434.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/a64.decode | 5 ++++ target/arm/tcg/translate-a64.c | 51 +++++++++++++--------------------- 2 files changed, 25 insertions(+), 31 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 07b604ec30..3ea0643370 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -926,6 +926,8 @@ SABD_v 0.00 1110 ..1 ..... 01110 1 ..... ..... @qrrr_e UABD_v 0.10 1110 ..1 ..... 01110 1 ..... ..... @qrrr_e SABA_v 0.00 1110 ..1 ..... 01111 1 ..... ..... @qrrr_e UABA_v 0.10 1110 ..1 ..... 01111 1 ..... ..... 
@qrrr_e +MUL_v 0.00 1110 ..1 ..... 10011 1 ..... ..... @qrrr_e +PMUL_v 0.10 1110 001 ..... 10011 1 ..... ..... @qrrr_b ### Advanced SIMD scalar x indexed element @@ -967,3 +969,6 @@ FMLAL_vi 0.00 1111 10 .. .... 0000 . 0 ..... ..... @qrrx_h FMLSL_vi 0.00 1111 10 .. .... 0100 . 0 ..... ..... @qrrx_h FMLAL2_vi 0.10 1111 10 .. .... 1000 . 0 ..... ..... @qrrx_h FMLSL2_vi 0.10 1111 10 .. .... 1100 . 0 ..... ..... @qrrx_h + +MUL_vi 0.00 1111 01 .. .... 1000 . 0 ..... ..... @qrrx_h +MUL_vi 0.00 1111 10 . ..... 1000 . 0 ..... ..... @qrrx_s diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 61afbc434f..1909d1426c 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5468,6 +5468,8 @@ TRANS(SABA_v, do_gvec_fn3_no64, a, gen_gvec_saba) TRANS(UABA_v, do_gvec_fn3_no64, a, gen_gvec_uaba) TRANS(SABD_v, do_gvec_fn3_no64, a, gen_gvec_sabd) TRANS(UABD_v, do_gvec_fn3_no64, a, gen_gvec_uabd) +TRANS(MUL_v, do_gvec_fn3_no64, a, tcg_gen_gvec_mul) +TRANS(PMUL_v, do_gvec_op3_ool, a, 0, gen_helper_gvec_pmul_b) static bool do_cmop_v(DisasContext *s, arg_qrrr_e *a, TCGCond cond) { @@ -5694,6 +5696,22 @@ TRANS_FEAT(FMLSL_vi, aa64_fhm, do_fmlal_idx, a, true, false) TRANS_FEAT(FMLAL2_vi, aa64_fhm, do_fmlal_idx, a, false, true) TRANS_FEAT(FMLSL2_vi, aa64_fhm, do_fmlal_idx, a, true, true) +static bool do_int3_vector_idx(DisasContext *s, arg_qrrx_e *a, + gen_helper_gvec_3 * const fns[2]) +{ + assert(a->esz == MO_16 || a->esz == MO_32); + if (fp_access_check(s)) { + gen_gvec_op3_ool(s, a->q, a->rd, a->rn, a->rm, a->idx, fns[a->esz - 1]); + } + return true; +} + +static gen_helper_gvec_3 * const f_vector_idx_mul[2] = { + gen_helper_gvec_mul_idx_h, + gen_helper_gvec_mul_idx_s, +}; +TRANS(MUL_vi, do_int3_vector_idx, a, f_vector_idx_mul) + /* * Advanced SIMD scalar pairwise */ @@ -10927,12 +10945,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) int rd = extract32(insn, 0, 5); switch (opcode) { - case 0x13: /* MUL, PMUL */ 
- if (u && size != 0) { - unallocated_encoding(s); - return; - } - /* fall through */ case 0x12: /* MLA, MLS */ if (size == 3) { unallocated_encoding(s); @@ -10969,6 +10981,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) case 0x0f: /* SABA, UABA */ case 0x10: /* ADD, SUB */ case 0x11: /* CMTST, CMEQ */ + case 0x13: /* MUL, PMUL */ unallocated_encoding(s); return; } @@ -10978,13 +10991,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { - case 0x13: /* MUL, PMUL */ - if (!u) { /* MUL */ - gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_mul, size); - } else { /* PMUL */ - gen_gvec_op3_ool(s, is_q, rd, rn, rm, 0, gen_helper_gvec_pmul_b); - } - return; case 0x12: /* MLA, MLS */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mls, size); @@ -12198,7 +12204,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) TCGv_ptr fpst; switch (16 * u + opcode) { - case 0x08: /* MUL */ case 0x10: /* MLA */ case 0x14: /* MLS */ if (is_scalar) { @@ -12285,6 +12290,7 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) case 0x01: /* FMLA */ case 0x04: /* FMLSL */ case 0x05: /* FMLS */ + case 0x08: /* MUL */ case 0x09: /* FMUL */ case 0x18: /* FMLAL2 */ case 0x19: /* FMULX */ @@ -12407,22 +12413,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } return; - case 0x08: /* MUL */ - if (!is_long && !is_scalar) { - static gen_helper_gvec_3 * const fns[3] = { - gen_helper_gvec_mul_idx_h, - gen_helper_gvec_mul_idx_s, - gen_helper_gvec_mul_idx_d, - }; - tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), - is_q ? 16 : 8, vec_full_reg_size(s), - index, fns[size - 1]); - return; - } - break; - case 0x10: /* MLA */ if (!is_long && !is_scalar) { static gen_helper_gvec_4 * const fns[3] = { @@ -12491,7 +12481,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) read_vec_element_i32(s, tcg_op, rn, pass, is_scalar ? 
size : MO_32); switch (16 * u + opcode) { - case 0x08: /* MUL */ case 0x10: /* MLA */ case 0x14: /* MLS */ { From patchwork Fri May 24 23:21:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 13673849 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7B3F3C25B7D for ; Fri, 24 May 2024 23:32:13 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sAeJH-0003aW-Az; Fri, 24 May 2024 19:26:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sAeI7-00086Z-9t for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:39 -0400 Received: from mail-pf1-x42b.google.com ([2607:f8b0:4864:20::42b]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sAeHu-0006xa-HA for qemu-devel@nongnu.org; Fri, 24 May 2024 19:25:38 -0400 Received: by mail-pf1-x42b.google.com with SMTP id d2e1a72fcca58-6f8e9870e72so1354081b3a.1 for ; Fri, 24 May 2024 16:25:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1716593117; x=1717197917; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=fHFkoGc3LcCzc/uNih0n52xsiOuWG1pYAHjpL/yQXc0=; b=VHPhZYw+5laLgZ3R1XHDDLhYQ7wu9/zr4JOOtIe0LvkdWd/atLJxfW+o8nx4FWKS0M EmdYe9c4Qc/3AX2/jB54cxiilooClDku3Ru0MBsqB8kVKuQy1uVrL3hBM4BX3XwDDOrU rTb0Z2Eu5iaaVuyfR9WqKFz9X35ryjWnluPSnldxdCpF2PRMLldV8C4UVjLpYuCaSFcm 
D2ejEpCOfLUia3S/UQtICelovsRA4GxhhBGLHveg9mGWyyhZ39HEGZagmlFmBrE5iltg +Bo3JuXvWq4BXadxYlTNPbcLyNDBNt+8JuPGnJGDWnLaEv+YGGwcnBC+HFKoC3R8ci/g VcDQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1716593117; x=1717197917; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=fHFkoGc3LcCzc/uNih0n52xsiOuWG1pYAHjpL/yQXc0=; b=C1QcUxy7P44FTONUTS8mr6gKJPo0ubTEmYD6zxoZ5qw1jIjPycyQrsQF0A+b4uAdvz lDMtySushakdUrqYpErZ0IqeQ0kQV+rFqpeOm1vf7z40dJeG4lQ6Sggc3tE8yHldt+qH 1ni8CpoE3VW0OL5opKrG4FnqgVJoeghypF4UYDNOSqZt6Rr7iDHUQBBb6UDM9oUusnv+ SEhfT1iGcPtlEBpo2J9EOC9jaHrouLgWbSUr4rAox/tSYVxlOMixaebEn8nmWuPzCWVN Z4ZdI0EjO/p7j5+BrJnpurGYNUzgZwOM8SKLLTx5TiwGnur65g5ekU0aFD9MRWBpoVAp 9ldg== X-Gm-Message-State: AOJu0YyogyJFd8D6jDrqHkR8Jf2gXnSBloNrpqGvOlRuNxlPEYPWVfKu m31R/X2YKojqBDWbKR1Vy3oqEh6DY+uLwO9FxAePiK99GFBwGC0z97qD1Ytl0vq0k4qdOwwp95w 0 X-Google-Smtp-Source: AGHT+IEJEfcauuBtypGVJXnkgOy8ieihbzL/VQ4/C+qF+3i9o0j7rUiu9noLu5306WGVuvoxaAd2Ng== X-Received: by 2002:a05:6a00:4516:b0:6ed:435f:ec9b with SMTP id d2e1a72fcca58-6f8f3d589d6mr4140551b3a.20.1716593117586; Fri, 24 May 2024 16:25:17 -0700 (PDT) Received: from stoup.. (174-21-72-5.tukw.qwest.net. 
[174.21.72.5]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-6f8fbf2cfd1sm1591695b3a.3.2024.05.24.16.25.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 May 2024 16:25:17 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org Subject: [PATCH v2 63/67] target/arm: Convert MLA, MLS to decodetree Date: Fri, 24 May 2024 16:21:17 -0700 Message-Id: <20240524232121.284515-64-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42b; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/a64.decode | 8 ++++ target/arm/tcg/translate-a64.c | 77 ++++++++++------------------------ 2 files changed, 31 insertions(+), 54 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 3ea0643370..2dea68a0a9 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -928,6 +928,8 @@ SABA_v 0.00 1110 ..1 ..... 01111 1 ..... ..... @qrrr_e UABA_v 0.10 1110 ..1 ..... 01111 1 ..... ..... @qrrr_e MUL_v 0.00 1110 ..1 ..... 10011 1 ..... ..... @qrrr_e PMUL_v 0.10 1110 001 ..... 10011 1 ..... ..... 
@qrrr_b +MLA_v 0.00 1110 ..1 ..... 10010 1 ..... ..... @qrrr_e +MLS_v 0.10 1110 ..1 ..... 10010 1 ..... ..... @qrrr_e ### Advanced SIMD scalar x indexed element @@ -972,3 +974,9 @@ FMLSL2_vi 0.10 1111 10 .. .... 1100 . 0 ..... ..... @qrrx_h MUL_vi 0.00 1111 01 .. .... 1000 . 0 ..... ..... @qrrx_h MUL_vi 0.00 1111 10 . ..... 1000 . 0 ..... ..... @qrrx_s + +MLA_vi 0.10 1111 01 .. .... 0000 . 0 ..... ..... @qrrx_h +MLA_vi 0.10 1111 10 . ..... 0000 . 0 ..... ..... @qrrx_s + +MLS_vi 0.10 1111 01 .. .... 0100 . 0 ..... ..... @qrrx_h +MLS_vi 0.10 1111 10 . ..... 0100 . 0 ..... ..... @qrrx_s diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 1909d1426c..c4601cde2f 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5470,6 +5470,8 @@ TRANS(SABD_v, do_gvec_fn3_no64, a, gen_gvec_sabd) TRANS(UABD_v, do_gvec_fn3_no64, a, gen_gvec_uabd) TRANS(MUL_v, do_gvec_fn3_no64, a, tcg_gen_gvec_mul) TRANS(PMUL_v, do_gvec_op3_ool, a, 0, gen_helper_gvec_pmul_b) +TRANS(MLA_v, do_gvec_fn3_no64, a, gen_gvec_mla) +TRANS(MLS_v, do_gvec_fn3_no64, a, gen_gvec_mls) static bool do_cmop_v(DisasContext *s, arg_qrrr_e *a, TCGCond cond) { @@ -5712,6 +5714,24 @@ static gen_helper_gvec_3 * const f_vector_idx_mul[2] = { }; TRANS(MUL_vi, do_int3_vector_idx, a, f_vector_idx_mul) +static bool do_mla_vector_idx(DisasContext *s, arg_qrrx_e *a, bool sub) +{ + static gen_helper_gvec_4 * const fns[2][2] = { + { gen_helper_gvec_mla_idx_h, gen_helper_gvec_mls_idx_h }, + { gen_helper_gvec_mla_idx_s, gen_helper_gvec_mls_idx_s }, + }; + + assert(a->esz == MO_16 || a->esz == MO_32); + if (fp_access_check(s)) { + gen_gvec_op4_ool(s, a->q, a->rd, a->rn, a->rm, a->rd, + a->idx, fns[a->esz - 1][sub]); + } + return true; +} + +TRANS(MLA_vi, do_mla_vector_idx, a, false) +TRANS(MLS_vi, do_mla_vector_idx, a, true) + /* * Advanced SIMD scalar pairwise */ @@ -10945,12 +10965,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) int rd = extract32(insn, 
0, 5); switch (opcode) { - case 0x12: /* MLA, MLS */ - if (size == 3) { - unallocated_encoding(s); - return; - } - break; case 0x16: /* SQDMULH, SQRDMULH */ if (size == 0 || size == 3) { unallocated_encoding(s); @@ -10981,6 +10995,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) case 0x0f: /* SABA, UABA */ case 0x10: /* ADD, SUB */ case 0x11: /* CMTST, CMEQ */ + case 0x12: /* MLA, MLS */ case 0x13: /* MUL, PMUL */ unallocated_encoding(s); return; @@ -10991,13 +11006,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) } switch (opcode) { - case 0x12: /* MLA, MLS */ - if (u) { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mls, size); - } else { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mla, size); - } - return; case 0x16: /* SQDMULH, SQRDMULH */ { static gen_helper_gvec_3_ptr * const fns[2][2] = { @@ -12204,13 +12212,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) TCGv_ptr fpst; switch (16 * u + opcode) { - case 0x10: /* MLA */ - case 0x14: /* MLS */ - if (is_scalar) { - unallocated_encoding(s); - return; - } - break; case 0x02: /* SMLAL, SMLAL2 */ case 0x12: /* UMLAL, UMLAL2 */ case 0x06: /* SMLSL, SMLSL2 */ @@ -12292,6 +12293,8 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) case 0x05: /* FMLS */ case 0x08: /* MUL */ case 0x09: /* FMUL */ + case 0x10: /* MLA */ + case 0x14: /* MLS */ case 0x18: /* FMLAL2 */ case 0x19: /* FMULX */ case 0x1c: /* FMLSL2 */ @@ -12412,40 +12415,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) : gen_helper_gvec_fcmlah_idx); } return; - - case 0x10: /* MLA */ - if (!is_long && !is_scalar) { - static gen_helper_gvec_4 * const fns[3] = { - gen_helper_gvec_mla_idx_h, - gen_helper_gvec_mla_idx_s, - gen_helper_gvec_mla_idx_d, - }; - tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), - vec_full_reg_offset(s, rd), - is_q ? 
16 : 8, vec_full_reg_size(s), - index, fns[size - 1]); - return; - } - break; - - case 0x14: /* MLS */ - if (!is_long && !is_scalar) { - static gen_helper_gvec_4 * const fns[3] = { - gen_helper_gvec_mls_idx_h, - gen_helper_gvec_mls_idx_s, - gen_helper_gvec_mls_idx_d, - }; - tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), - vec_full_reg_offset(s, rd), - is_q ? 16 : 8, vec_full_reg_size(s), - index, fns[size - 1]); - return; - } - break; } if (size == 3) {
From patchwork Fri May 24 23:21:18 2024
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673838
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 64/67] target/arm: Tidy SQDMULH, SQRDMULH (vector)
Date: Fri, 24 May 2024 16:21:18 -0700
Message-Id: <20240524232121.284515-65-richard.henderson@linaro.org>
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

We already have a gvec helper for the operations, but we aren't using it on the aa32 neon side. Create a unified expander for use by both aa32 and aa64 translators.
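For readers unfamiliar with the two operations, the per-element arithmetic can be sketched in standalone C. This is an illustration only and not the QEMU helper: the names sqdmulh16 and qc_flag are hypothetical, only the 16-bit case is shown, and the code assumes that the compiler implements >> on a negative int32_t as an arithmetic shift.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the cumulative saturation (QC) flag. */
static bool qc_flag;

/*
 * Signed saturating doubling multiply returning the high half, for one
 * 16-bit element.  round == false models SQDMULH, round == true models
 * SQRDMULH.  Illustration only; assumes arithmetic right shift.
 */
static int16_t sqdmulh16(int16_t n, int16_t m, bool round)
{
    int32_t p = (int32_t)n * m;     /* |p| <= 2^30, so no overflow */

    if (round) {
        p += 1 << 14;               /* half of the discarded low bits */
    }
    p >>= 15;                       /* equivalent to (2 * n * m) >> 16 */

    if (p > INT16_MAX) {            /* only when n == m == INT16_MIN */
        qc_flag = true;
        return INT16_MAX;
    }
    return (int16_t)p;
}
```

Under these assumptions, sqdmulh16(0x4000, 0x4000, false) yields 0x2000 (0.5 * 0.5 = 0.25 in Q15), and the only saturating input pair is INT16_MIN times INT16_MIN, which returns INT16_MAX and sets qc_flag.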
Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/translate.h | 4 ++++ target/arm/tcg/gengvec.c | 20 ++++++++++++++++++++ target/arm/tcg/translate-a64.c | 23 ++++------------------- target/arm/tcg/translate-neon.c | 23 +++-------------------- 4 files changed, 31 insertions(+), 39 deletions(-) diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h index 3b1e68b779..aba21f730f 100644 --- a/target/arm/tcg/translate.h +++ b/target/arm/tcg/translate.h @@ -539,6 +539,10 @@ void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sqdmulh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sqrdmulh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, diff --git a/target/arm/tcg/gengvec.c b/target/arm/tcg/gengvec.c index 32caabd126..462c185f9a 100644 --- a/target/arm/tcg/gengvec.c +++ b/target/arm/tcg/gengvec.c @@ -34,6 +34,26 @@ static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, opr_sz, max_sz, 0, fn); } +void gen_gvec_sqdmulh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3_ptr * const fns[2] = { + gen_helper_neon_sqdmulh_h, gen_helper_neon_sqdmulh_s + }; + tcg_debug_assert(vece >= 1 && vece <= 2); + gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]); +} + +void gen_gvec_sqrdmulh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3_ptr * const fns[2] = { + 
gen_helper_neon_sqrdmulh_h, gen_helper_neon_sqrdmulh_s + }; + tcg_debug_assert(vece >= 1 && vece <= 2); + gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]); +} + void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index c4601cde2f..c673b95ec7 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -724,19 +724,6 @@ static void gen_gvec_op3_fpst(DisasContext *s, bool is_q, int rd, int rn, is_q ? 16 : 8, vec_full_reg_size(s), data, fn); } -/* Expand a 3-operand + qc + operation using an out-of-line helper. */ -static void gen_gvec_op3_qc(DisasContext *s, bool is_q, int rd, int rn, - int rm, gen_helper_gvec_3_ptr *fn) -{ - TCGv_ptr qc_ptr = tcg_temp_new_ptr(); - - tcg_gen_addi_ptr(qc_ptr, tcg_env, offsetof(CPUARMState, vfp.qc)); - tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), qc_ptr, - is_q ? 16 : 8, vec_full_reg_size(s), 0, fn); -} - /* Expand a 4-operand operation using an out-of-line helper. 
*/ static void gen_gvec_op4_ool(DisasContext *s, bool is_q, int rd, int rn, int rm, int ra, int data, gen_helper_gvec_4 *fn) @@ -11007,12 +10994,10 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) switch (opcode) { case 0x16: /* SQDMULH, SQRDMULH */ - { - static gen_helper_gvec_3_ptr * const fns[2][2] = { - { gen_helper_neon_sqdmulh_h, gen_helper_neon_sqrdmulh_h }, - { gen_helper_neon_sqdmulh_s, gen_helper_neon_sqrdmulh_s }, - }; - gen_gvec_op3_qc(s, is_q, rd, rn, rm, fns[size - 1][u]); + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmulh_qc, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqdmulh_qc, size); } return; } diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c index f9a8753906..915c9e56db 100644 --- a/target/arm/tcg/translate-neon.c +++ b/target/arm/tcg/translate-neon.c @@ -937,28 +937,11 @@ DO_SHA2(SHA256SU1, gen_helper_crypto_sha256su1) } #define DO_3SAME_VQDMULH(INSN, FUNC) \ - WRAP_ENV_FN(gen_##INSN##_tramp16, gen_helper_neon_##FUNC##_s16); \ - WRAP_ENV_FN(gen_##INSN##_tramp32, gen_helper_neon_##FUNC##_s32); \ - static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ - uint32_t rn_ofs, uint32_t rm_ofs, \ - uint32_t oprsz, uint32_t maxsz) \ - { \ - static const GVecGen3 ops[2] = { \ - { .fni4 = gen_##INSN##_tramp16 }, \ - { .fni4 = gen_##INSN##_tramp32 }, \ - }; \ - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece - 1]); \ - } \ static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \ - { \ - if (a->size != 1 && a->size != 2) { \ - return false; \ - } \ - return do_3same(s, a, gen_##INSN##_3s); \ - } + { return a->size >= 1 && a->size <= 2 && do_3same(s, a, FUNC); } -DO_3SAME_VQDMULH(VQDMULH, qdmulh) -DO_3SAME_VQDMULH(VQRDMULH, qrdmulh) +DO_3SAME_VQDMULH(VQDMULH, gen_gvec_sqdmulh_qc) +DO_3SAME_VQDMULH(VQRDMULH, gen_gvec_sqrdmulh_qc) #define WRAP_FP_GVEC(WRAPNAME, FPST, FUNC) \ static void WRAPNAME(unsigned vece, uint32_t rd_ofs, \
From patchwork Fri May 24 23:21:19 2024
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673811
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 65/67] target/arm: Convert SQDMULH, SQRDMULH to decodetree
Date: Fri, 24 May 2024 16:21:19 -0700
Message-Id: <20240524232121.284515-66-richard.henderson@linaro.org>
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

These are the last instructions within disas_simd_three_reg_same and disas_simd_scalar_three_reg_same, so remove them.
Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 ++ target/arm/tcg/a64.decode | 18 +++ target/arm/tcg/translate-a64.c | 276 ++++++++++----------------------- target/arm/tcg/vec_helper.c | 64 ++++++++ 4 files changed, 172 insertions(+), 196 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 85f9302563..24feecee9b 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -968,6 +968,16 @@ DEF_HELPER_FLAGS_5(neon_sqrdmulh_h, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(neon_sqrdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqdmulh_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqdmulh_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(neon_sqrdmulh_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqrdmulh_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve2_sqdmulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqdmulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 2dea68a0a9..f7f897f9fc 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -774,6 +774,9 @@ CMHS_s 0111 1110 111 ..... 00111 1 ..... ..... @rrr_d CMTST_s 0101 1110 111 ..... 10001 1 ..... ..... @rrr_d CMEQ_s 0111 1110 111 ..... 10001 1 ..... ..... @rrr_d +SQDMULH_s 0101 1110 ..1 ..... 10110 1 ..... ..... @rrr_e +SQRDMULH_s 0111 1110 ..1 ..... 10110 1 ..... ..... @rrr_e + ### Advanced SIMD scalar pairwise FADDP_s 0101 1110 0011 0000 1101 10 ..... ..... @rr_h @@ -931,6 +934,9 @@ PMUL_v 0.10 1110 001 ..... 10011 1 ..... ..... @qrrr_b MLA_v 0.00 1110 ..1 ..... 10010 1 ..... ..... @qrrr_e MLS_v 0.10 1110 ..1 ..... 10010 1 ..... ..... @qrrr_e +SQDMULH_v 0.00 1110 ..1 ..... 10110 1 ..... ..... @qrrr_e +SQRDMULH_v 0.10 1110 ..1 ..... 
10110 1 ..... ..... @qrrr_e + ### Advanced SIMD scalar x indexed element FMUL_si 0101 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h @@ -949,6 +955,12 @@ FMULX_si 0111 1111 00 .. .... 1001 . 0 ..... ..... @rrx_h FMULX_si 0111 1111 10 . ..... 1001 . 0 ..... ..... @rrx_s FMULX_si 0111 1111 11 0 ..... 1001 . 0 ..... ..... @rrx_d +SQDMULH_si 0101 1111 01 .. .... 1100 . 0 ..... ..... @rrx_h +SQDMULH_si 0101 1111 10 .. .... 1100 . 0 ..... ..... @rrx_s + +SQRDMULH_si 0101 1111 01 .. .... 1101 . 0 ..... ..... @rrx_h +SQRDMULH_si 0101 1111 10 . ..... 1101 . 0 ..... ..... @rrx_s + ### Advanced SIMD vector x indexed element FMUL_vi 0.00 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h @@ -980,3 +992,9 @@ MLA_vi 0.10 1111 10 . ..... 0000 . 0 ..... ..... @qrrx_s MLS_vi 0.10 1111 01 .. .... 0100 . 0 ..... ..... @qrrx_h MLS_vi 0.10 1111 10 . ..... 0100 . 0 ..... ..... @qrrx_s + +SQDMULH_vi 0.00 1111 01 .. .... 1100 . 0 ..... ..... @qrrx_h +SQDMULH_vi 0.00 1111 10 . ..... 1100 . 0 ..... ..... @qrrx_s + +SQRDMULH_vi 0.00 1111 01 .. .... 1101 . 0 ..... ..... @qrrx_h +SQRDMULH_vi 0.00 1111 10 . ..... 1101 . 0 ..... ..... 
@qrrx_s diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index c673b95ec7..14226c56cf 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -1350,6 +1350,14 @@ static bool do_gvec_fn3_no64(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn) return true; } +static bool do_gvec_fn3_no8_no64(DisasContext *s, arg_qrrr_e *a, GVecGen3Fn *fn) +{ + if (a->esz == MO_8) { + return false; + } + return do_gvec_fn3_no64(s, a, fn); +} + static bool do_gvec_fn4(DisasContext *s, arg_qrrrr_e *a, GVecGen4Fn *fn) { if (!a->q && a->esz == MO_64) { @@ -5167,6 +5175,25 @@ static const ENVScalar2 f_scalar_uqrshl = { }; TRANS(UQRSHL_s, do_env_scalar2, a, &f_scalar_uqrshl) +static bool do_env_scalar2_hs(DisasContext *s, arg_rrr_e *a, + const ENVScalar2 *f) +{ + if (a->esz == MO_16 || a->esz == MO_32) { + return do_env_scalar2(s, a, f); + } + return false; +} + +static const ENVScalar2 f_scalar_sqdmulh = { + { NULL, gen_helper_neon_qdmulh_s16, gen_helper_neon_qdmulh_s32 } +}; +TRANS(SQDMULH_s, do_env_scalar2_hs, a, &f_scalar_sqdmulh) + +static const ENVScalar2 f_scalar_sqrdmulh = { + { NULL, gen_helper_neon_qrdmulh_s16, gen_helper_neon_qrdmulh_s32 } +}; +TRANS(SQRDMULH_s, do_env_scalar2_hs, a, &f_scalar_sqrdmulh) + static bool do_cmop_d(DisasContext *s, arg_rrr_e *a, TCGCond cond) { if (fp_access_check(s)) { @@ -5482,6 +5509,9 @@ TRANS(CMHS_v, do_cmop_v, a, TCG_COND_GEU) TRANS(CMEQ_v, do_cmop_v, a, TCG_COND_EQ) TRANS(CMTST_v, do_gvec_fn3, a, gen_gvec_cmtst) +TRANS(SQDMULH_v, do_gvec_fn3_no8_no64, a, gen_gvec_sqdmulh_qc) +TRANS(SQRDMULH_v, do_gvec_fn3_no8_no64, a, gen_gvec_sqrdmulh_qc) + /* * Advanced SIMD scalar/vector x indexed element */ @@ -5589,6 +5619,27 @@ static bool do_fmla_scalar_idx(DisasContext *s, arg_rrx_e *a, bool neg) TRANS(FMLA_si, do_fmla_scalar_idx, a, false) TRANS(FMLS_si, do_fmla_scalar_idx, a, true) +static bool do_env_scalar2_idx_hs(DisasContext *s, arg_rrx_e *a, + const ENVScalar2 *f) +{ + if (a->esz < MO_16 || 
a->esz > MO_32) { + return false; + } + if (fp_access_check(s)) { + TCGv_i32 t0 = tcg_temp_new_i32(); + TCGv_i32 t1 = tcg_temp_new_i32(); + + read_vec_element_i32(s, t0, a->rn, 0, a->esz); + read_vec_element_i32(s, t1, a->rm, a->idx, a->esz); + f->gen_bhs[a->esz](t0, tcg_env, t0, t1); + write_fp_sreg(s, a->rd, t0); + } + return true; +} + +TRANS(SQDMULH_si, do_env_scalar2_idx_hs, a, &f_scalar_sqdmulh) +TRANS(SQRDMULH_si, do_env_scalar2_idx_hs, a, &f_scalar_sqrdmulh) + static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a, gen_helper_gvec_3_ptr * const fns[3]) { @@ -5719,6 +5770,33 @@ static bool do_mla_vector_idx(DisasContext *s, arg_qrrx_e *a, bool sub) TRANS(MLA_vi, do_mla_vector_idx, a, false) TRANS(MLS_vi, do_mla_vector_idx, a, true) +static bool do_int3_qc_vector_idx(DisasContext *s, arg_qrrx_e *a, + gen_helper_gvec_4 * const fns[2]) +{ + assert(a->esz == MO_16 || a->esz == MO_32); + if (fp_access_check(s)) { + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + offsetof(CPUARMState, vfp.qc), + a->q ? 
16 : 8, vec_full_reg_size(s), + a->idx, fns[a->esz - 1]); + } + return true; +} + +static gen_helper_gvec_4 * const f_vector_idx_sqdmulh[2] = { + gen_helper_neon_sqdmulh_idx_h, + gen_helper_neon_sqdmulh_idx_s, +}; +TRANS(SQDMULH_vi, do_int3_qc_vector_idx, a, f_vector_idx_sqdmulh) + +static gen_helper_gvec_4 * const f_vector_idx_sqrdmulh[2] = { + gen_helper_neon_sqrdmulh_idx_h, + gen_helper_neon_sqrdmulh_idx_s, +}; +TRANS(SQRDMULH_vi, do_int3_qc_vector_idx, a, f_vector_idx_sqrdmulh) + /* * Advanced SIMD scalar pairwise */ @@ -9500,109 +9578,6 @@ static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn) } } -/* AdvSIMD scalar three same - * 31 30 29 28 24 23 22 21 20 16 15 11 10 9 5 4 0 - * +-----+---+-----------+------+---+------+--------+---+------+------+ - * | 0 1 | U | 1 1 1 1 0 | size | 1 | Rm | opcode | 1 | Rn | Rd | - * +-----+---+-----------+------+---+------+--------+---+------+------+ - */ -static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn) -{ - int rd = extract32(insn, 0, 5); - int rn = extract32(insn, 5, 5); - int opcode = extract32(insn, 11, 5); - int rm = extract32(insn, 16, 5); - int size = extract32(insn, 22, 2); - bool u = extract32(insn, 29, 1); - TCGv_i64 tcg_rd; - - switch (opcode) { - case 0x16: /* SQDMULH, SQRDMULH (vector) */ - if (size != 1 && size != 2) { - unallocated_encoding(s); - return; - } - break; - default: - case 0x1: /* SQADD, UQADD */ - case 0x5: /* SQSUB, UQSUB */ - case 0x6: /* CMGT, CMHI */ - case 0x7: /* CMGE, CMHS */ - case 0x8: /* SSHL, USHL */ - case 0x9: /* SQSHL, UQSHL */ - case 0xa: /* SRSHL, URSHL */ - case 0xb: /* SQRSHL, UQRSHL */ - case 0x10: /* ADD, SUB (vector) */ - case 0x11: /* CMTST, CMEQ */ - unallocated_encoding(s); - return; - } - - if (!fp_access_check(s)) { - return; - } - - tcg_rd = tcg_temp_new_i64(); - - if (size == 3) { - g_assert_not_reached(); - } else { - /* Do a single operation on the lowest element in the vector. 
- * We use the standard Neon helpers and rely on 0 OP 0 == 0 with - * no side effects for all these operations. - * OPTME: special-purpose helpers would avoid doing some - * unnecessary work in the helper for the 8 and 16 bit cases. - */ - NeonGenTwoOpEnvFn *genenvfn = NULL; - void (*genfn)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64, MemOp) = NULL; - - switch (opcode) { - case 0x16: /* SQDMULH, SQRDMULH */ - { - static NeonGenTwoOpEnvFn * const fns[2][2] = { - { gen_helper_neon_qdmulh_s16, gen_helper_neon_qrdmulh_s16 }, - { gen_helper_neon_qdmulh_s32, gen_helper_neon_qrdmulh_s32 }, - }; - assert(size == 1 || size == 2); - genenvfn = fns[size - 1][u]; - break; - } - default: - case 0x1: /* SQADD, UQADD */ - case 0x5: /* SQSUB, UQSUB */ - case 0x9: /* SQSHL, UQSHL */ - case 0xb: /* SQRSHL, UQRSHL */ - g_assert_not_reached(); - } - - if (genenvfn) { - TCGv_i32 tcg_rn = tcg_temp_new_i32(); - TCGv_i32 tcg_rm = tcg_temp_new_i32(); - - read_vec_element_i32(s, tcg_rn, rn, 0, size); - read_vec_element_i32(s, tcg_rm, rm, 0, size); - genenvfn(tcg_rn, tcg_env, tcg_rn, tcg_rm); - tcg_gen_extu_i32_i64(tcg_rd, tcg_rn); - } else { - TCGv_i64 tcg_rn = tcg_temp_new_i64(); - TCGv_i64 tcg_rm = tcg_temp_new_i64(); - TCGv_i64 qc = tcg_temp_new_i64(); - - read_vec_element(s, tcg_rn, rn, 0, size | (u ? 0 : MO_SIGN)); - read_vec_element(s, tcg_rm, rm, 0, size | (u ? 0 : MO_SIGN)); - tcg_gen_ld_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); - genfn(tcg_rd, qc, tcg_rn, tcg_rm, size); - tcg_gen_st_i64(qc, tcg_env, offsetof(CPUARMState, vfp.qc)); - if (!u) { - /* Truncate signed 64-bit result for writeback. */ - tcg_gen_ext_i64(tcg_rd, tcg_rd, size); - } - } - } - - write_fp_dreg(s, rd, tcg_rd); -} - /* AdvSIMD scalar three same extra * 31 30 29 28 24 23 22 21 20 16 15 14 11 10 9 5 4 0 * +-----+---+-----------+------+---+------+---+--------+---+----+----+ @@ -10940,94 +10915,6 @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn) } } -/* Integer op subgroup of C3.6.16. 
*/ -static void disas_simd_3same_int(DisasContext *s, uint32_t insn) -{ - int is_q = extract32(insn, 30, 1); - int u = extract32(insn, 29, 1); - int size = extract32(insn, 22, 2); - int opcode = extract32(insn, 11, 5); - int rm = extract32(insn, 16, 5); - int rn = extract32(insn, 5, 5); - int rd = extract32(insn, 0, 5); - - switch (opcode) { - case 0x16: /* SQDMULH, SQRDMULH */ - if (size == 0 || size == 3) { - unallocated_encoding(s); - return; - } - break; - default: - if (size == 3 && !is_q) { - unallocated_encoding(s); - return; - } - break; - - case 0x0: /* SHADD, UHADD */ - case 0x01: /* SQADD, UQADD */ - case 0x02: /* SRHADD, URHADD */ - case 0x04: /* SHSUB, UHSUB */ - case 0x05: /* SQSUB, UQSUB */ - case 0x06: /* CMGT, CMHI */ - case 0x07: /* CMGE, CMHS */ - case 0x08: /* SSHL, USHL */ - case 0x09: /* SQSHL, UQSHL */ - case 0x0a: /* SRSHL, URSHL */ - case 0x0b: /* SQRSHL, UQRSHL */ - case 0x0c: /* SMAX, UMAX */ - case 0x0d: /* SMIN, UMIN */ - case 0x0e: /* SABD, UABD */ - case 0x0f: /* SABA, UABA */ - case 0x10: /* ADD, SUB */ - case 0x11: /* CMTST, CMEQ */ - case 0x12: /* MLA, MLS */ - case 0x13: /* MUL, PMUL */ - unallocated_encoding(s); - return; - } - - if (!fp_access_check(s)) { - return; - } - - switch (opcode) { - case 0x16: /* SQDMULH, SQRDMULH */ - if (u) { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmulh_qc, size); - } else { - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqdmulh_qc, size); - } - return; - } - g_assert_not_reached(); -} - -/* AdvSIMD three same - * 31 30 29 28 24 23 22 21 20 16 15 11 10 9 5 4 0 - * +---+---+---+-----------+------+---+------+--------+---+------+------+ - * | 0 | Q | U | 0 1 1 1 0 | size | 1 | Rm | opcode | 1 | Rn | Rd | - * +---+---+---+-----------+------+---+------+--------+---+------+------+ - */ -static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn) -{ - int opcode = extract32(insn, 11, 5); - - switch (opcode) { - default: - disas_simd_3same_int(s, insn); - break; - case 0x3: /* logic ops */ - 
case 0x14: /* SMAXP, UMAXP */ - case 0x15: /* SMINP, UMINP */ - case 0x17: /* ADDP */ - case 0x18 ... 0x31: /* floating point ops */ - unallocated_encoding(s); - break; - } -} - /* AdvSIMD three same extra * 31 30 29 28 24 23 22 21 20 16 15 14 11 10 9 5 4 0 * +---+---+---+-----------+------+---+------+---+--------+---+----+----+ @@ -12214,9 +12101,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) case 0x0b: /* SQDMULL, SQDMULL2 */ is_long = true; break; - case 0x0c: /* SQDMULH */ - case 0x0d: /* SQRDMULH */ - break; case 0x1d: /* SQRDMLAH */ case 0x1f: /* SQRDMLSH */ if (!dc_isar_feature(aa64_rdm, s)) { @@ -12278,6 +12162,8 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) case 0x05: /* FMLS */ case 0x08: /* MUL */ case 0x09: /* FMUL */ + case 0x0c: /* SQDMULH */ + case 0x0d: /* SQRDMULH */ case 0x10: /* MLA */ case 0x14: /* MLS */ case 0x18: /* FMLAL2 */ @@ -12683,7 +12569,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) */ static const AArch64DecodeTable data_proc_simd[] = { /* pattern , mask , fn */ - { 0x0e200400, 0x9f200400, disas_simd_three_reg_same }, { 0x0e008400, 0x9f208400, disas_simd_three_reg_same_extra }, { 0x0e200000, 0x9f200c00, disas_simd_three_reg_diff }, { 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc }, @@ -12695,7 +12580,6 @@ static const AArch64DecodeTable data_proc_simd[] = { { 0x0e000000, 0xbf208c00, disas_simd_tb }, { 0x0e000800, 0xbf208c00, disas_simd_zip_trn }, { 0x2e000000, 0xbf208400, disas_simd_ext }, - { 0x5e200400, 0xdf200400, disas_simd_scalar_three_reg_same }, { 0x5e008400, 0xdf208400, disas_simd_scalar_three_reg_same_extra }, { 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff }, { 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc }, diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index d8e96386be..b05922b425 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -311,6 +311,38 @@ void HELPER(neon_sqrdmulh_h)(void *vd, 
void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(neon_sqdmulh_idx_h)(void *vd, void *vn, void *vm, + void *vq, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx); + + for (i = 0; i < opr_sz / 2; i += 16 / 2) { + int16_t mm = m[i]; + for (j = 0; j < 16 / 2; ++j) { + d[i + j] = do_sqrdmlah_h(n[i + j], mm, 0, false, false, vq); + } + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(neon_sqrdmulh_idx_h)(void *vd, void *vn, void *vm, + void *vq, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx); + + for (i = 0; i < opr_sz / 2; i += 16 / 2) { + int16_t mm = m[i]; + for (j = 0; j < 16 / 2; ++j) { + d[i + j] = do_sqrdmlah_h(n[i + j], mm, 0, false, true, vq); + } + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + void HELPER(sve2_sqrdmlah_h)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { @@ -474,6 +506,38 @@ void HELPER(neon_sqrdmulh_s)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(neon_sqdmulh_idx_s)(void *vd, void *vn, void *vm, + void *vq, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx); + + for (i = 0; i < opr_sz / 4; i += 16 / 4) { + int32_t mm = m[i]; + for (j = 0; j < 16 / 4; ++j) { + d[i + j] = do_sqrdmlah_s(n[i + j], mm, 0, false, false, vq); + } + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(neon_sqrdmulh_idx_s)(void *vd, void *vn, void *vm, + void *vq, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx); + + for (i = 0; i < opr_sz / 4; i += 16 / 4) { + int32_t mm = m[i]; + for (j = 0; j < 16 / 4; ++j) { + d[i + j] = do_sqrdmlah_s(n[i + j], mm, 0, false, true, vq); + } + } 
+ clear_tail(d, opr_sz, simd_maxsz(desc)); +} + void HELPER(sve2_sqrdmlah_s)(void *vd, void *vn, void *vm, void *va, uint32_t desc) {
From patchwork Fri May 24 23:21:20 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673818
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 66/67] target/arm: Convert FMADD, FMSUB, FNMADD, FNMSUB to decodetree
Date: Fri, 24 May 2024 16:21:20 -0700
Message-Id: <20240524232121.284515-67-richard.henderson@linaro.org>
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

These are the only instructions in the 3 source scalar class.
Signed-off-by: Richard Henderson --- target/arm/tcg/a64.decode | 10 ++ target/arm/tcg/translate-a64.c | 233 ++++++++++++--------------------- 2 files changed, 93 insertions(+), 150 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index f7f897f9fc..6f6cd805b7 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -32,6 +32,7 @@ &rr_e rd rn esz &rrr_e rd rn rm esz &rrx_e rd rn rm idx esz +&rrrr_e rd rn rm ra esz &qrr_e q rd rn esz &qrrr_e q rd rn rm esz &qrrx_e q rd rn rm idx esz @@ -998,3 +999,12 @@ SQDMULH_vi 0.00 1111 10 . ..... 1100 . 0 ..... ..... @qrrx_s SQRDMULH_vi 0.00 1111 01 .. .... 1101 . 0 ..... ..... @qrrx_h SQRDMULH_vi 0.00 1111 10 . ..... 1101 . 0 ..... ..... @qrrx_s + +# Floating-point data-processing (3 source) + +@rrrr_hsd .... .... .. . rm:5 . ra:5 rn:5 rd:5 &rrrr_e esz=%esz_hsd + +FMADD 0001 1111 .. 0 ..... 0 ..... ..... ..... @rrrr_hsd +FMSUB 0001 1111 .. 0 ..... 1 ..... ..... ..... @rrrr_hsd +FNMADD 0001 1111 .. 1 ..... 0 ..... ..... ..... @rrrr_hsd +FNMSUB 0001 1111 .. 1 ..... 1 ..... ..... ..... @rrrr_hsd diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 14226c56cf..3c2963ebaa 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5866,6 +5866,88 @@ static bool trans_ADDP_s(DisasContext *s, arg_rr_e *a) return true; } +/* + * Floating-point data-processing (3 source) + */ + +static bool do_fmadd(DisasContext *s, arg_rrrr_e *a, bool neg_a, bool neg_n) +{ + TCGv_ptr fpst; + + /* + * These are fused multiply-add. Note that doing the negations here + * as separate steps is correct: an input NaN should come out with + * its sign bit flipped if it is a negated-input. 
+ */ + switch (a->esz) { + case MO_64: + if (fp_access_check(s)) { + TCGv_i64 tn = read_fp_dreg(s, a->rn); + TCGv_i64 tm = read_fp_dreg(s, a->rm); + TCGv_i64 ta = read_fp_dreg(s, a->ra); + + if (neg_a) { + gen_vfp_negd(ta, ta); + } + if (neg_n) { + gen_vfp_negd(tn, tn); + } + fpst = fpstatus_ptr(FPST_FPCR); + gen_helper_vfp_muladdd(ta, tn, tm, ta, fpst); + write_fp_dreg(s, a->rd, ta); + } + break; + + case MO_32: + if (fp_access_check(s)) { + TCGv_i32 tn = read_fp_sreg(s, a->rn); + TCGv_i32 tm = read_fp_sreg(s, a->rm); + TCGv_i32 ta = read_fp_sreg(s, a->ra); + + if (neg_a) { + gen_vfp_negs(ta, ta); + } + if (neg_n) { + gen_vfp_negs(tn, tn); + } + fpst = fpstatus_ptr(FPST_FPCR); + gen_helper_vfp_muladds(ta, tn, tm, ta, fpst); + write_fp_sreg(s, a->rd, ta); + } + break; + + case MO_16: + if (!dc_isar_feature(aa64_fp16, s)) { + return false; + } + if (fp_access_check(s)) { + TCGv_i32 tn = read_fp_hreg(s, a->rn); + TCGv_i32 tm = read_fp_hreg(s, a->rm); + TCGv_i32 ta = read_fp_hreg(s, a->ra); + + if (neg_a) { + gen_vfp_negh(ta, ta); + } + if (neg_n) { + gen_vfp_negh(tn, tn); + } + fpst = fpstatus_ptr(FPST_FPCR_F16); + gen_helper_advsimd_muladdh(ta, tn, tm, ta, fpst); + write_fp_sreg(s, a->rd, ta); + } + break; + + default: + return false; + } + return true; +} + +TRANS(FMADD, do_fmadd, a, false, false) +TRANS(FNMADD, do_fmadd, a, true, true) +TRANS(FMSUB, do_fmadd, a, false, true) +TRANS(FNMSUB, do_fmadd, a, true, false) + /* Shift a TCGv src by TCGv shift_amount, put result in dst. 
* Note that it is the caller's responsibility to ensure that the * shift amount is in range (ie 0..31 or 0..63) and provide the ARM @@ -7665,152 +7747,6 @@ static void disas_fp_1src(DisasContext *s, uint32_t insn) } } -/* Floating-point data-processing (3 source) - single precision */ -static void handle_fp_3src_single(DisasContext *s, bool o0, bool o1, - int rd, int rn, int rm, int ra) -{ - TCGv_i32 tcg_op1, tcg_op2, tcg_op3; - TCGv_i32 tcg_res = tcg_temp_new_i32(); - TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR); - - tcg_op1 = read_fp_sreg(s, rn); - tcg_op2 = read_fp_sreg(s, rm); - tcg_op3 = read_fp_sreg(s, ra); - - /* These are fused multiply-add, and must be done as one - * floating point operation with no rounding between the - * multiplication and addition steps. - * NB that doing the negations here as separate steps is - * correct : an input NaN should come out with its sign bit - * flipped if it is a negated-input. - */ - if (o1 == true) { - gen_vfp_negs(tcg_op3, tcg_op3); - } - - if (o0 != o1) { - gen_vfp_negs(tcg_op1, tcg_op1); - } - - gen_helper_vfp_muladds(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst); - - write_fp_sreg(s, rd, tcg_res); -} - -/* Floating-point data-processing (3 source) - double precision */ -static void handle_fp_3src_double(DisasContext *s, bool o0, bool o1, - int rd, int rn, int rm, int ra) -{ - TCGv_i64 tcg_op1, tcg_op2, tcg_op3; - TCGv_i64 tcg_res = tcg_temp_new_i64(); - TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR); - - tcg_op1 = read_fp_dreg(s, rn); - tcg_op2 = read_fp_dreg(s, rm); - tcg_op3 = read_fp_dreg(s, ra); - - /* These are fused multiply-add, and must be done as one - * floating point operation with no rounding between the - * multiplication and addition steps. - * NB that doing the negations here as separate steps is - * correct : an input NaN should come out with its sign bit - * flipped if it is a negated-input. 
- */ - if (o1 == true) { - gen_vfp_negd(tcg_op3, tcg_op3); - } - - if (o0 != o1) { - gen_vfp_negd(tcg_op1, tcg_op1); - } - - gen_helper_vfp_muladdd(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst); - - write_fp_dreg(s, rd, tcg_res); -} - -/* Floating-point data-processing (3 source) - half precision */ -static void handle_fp_3src_half(DisasContext *s, bool o0, bool o1, - int rd, int rn, int rm, int ra) -{ - TCGv_i32 tcg_op1, tcg_op2, tcg_op3; - TCGv_i32 tcg_res = tcg_temp_new_i32(); - TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR_F16); - - tcg_op1 = read_fp_hreg(s, rn); - tcg_op2 = read_fp_hreg(s, rm); - tcg_op3 = read_fp_hreg(s, ra); - - /* These are fused multiply-add, and must be done as one - * floating point operation with no rounding between the - * multiplication and addition steps. - * NB that doing the negations here as separate steps is - * correct : an input NaN should come out with its sign bit - * flipped if it is a negated-input. - */ - if (o1 == true) { - tcg_gen_xori_i32(tcg_op3, tcg_op3, 0x8000); - } - - if (o0 != o1) { - tcg_gen_xori_i32(tcg_op1, tcg_op1, 0x8000); - } - - gen_helper_advsimd_muladdh(tcg_res, tcg_op1, tcg_op2, tcg_op3, fpst); - - write_fp_sreg(s, rd, tcg_res); -} - -/* Floating point data-processing (3 source) - * 31 30 29 28 24 23 22 21 20 16 15 14 10 9 5 4 0 - * +---+---+---+-----------+------+----+------+----+------+------+------+ - * | M | 0 | S | 1 1 1 1 1 | type | o1 | Rm | o0 | Ra | Rn | Rd | - * +---+---+---+-----------+------+----+------+----+------+------+------+ - */ -static void disas_fp_3src(DisasContext *s, uint32_t insn) -{ - int mos = extract32(insn, 29, 3); - int type = extract32(insn, 22, 2); - int rd = extract32(insn, 0, 5); - int rn = extract32(insn, 5, 5); - int ra = extract32(insn, 10, 5); - int rm = extract32(insn, 16, 5); - bool o0 = extract32(insn, 15, 1); - bool o1 = extract32(insn, 21, 1); - - if (mos) { - unallocated_encoding(s); - return; - } - - switch (type) { - case 0: - if (!fp_access_check(s)) { - return; - } - 
handle_fp_3src_single(s, o0, o1, rd, rn, rm, ra); - break; - case 1: - if (!fp_access_check(s)) { - return; - } - handle_fp_3src_double(s, o0, o1, rd, rn, rm, ra); - break; - case 3: - if (!dc_isar_feature(aa64_fp16, s)) { - unallocated_encoding(s); - return; - } - if (!fp_access_check(s)) { - return; - } - handle_fp_3src_half(s, o0, o1, rd, rn, rm, ra); - break; - default: - unallocated_encoding(s); - } -} - /* Floating point immediate * 31 30 29 28 24 23 22 21 20 13 12 10 9 5 4 0 * +---+---+---+-----------+------+---+------------+-------+------+------+ @@ -8254,10 +8190,7 @@ static void disas_fp_int_conv(DisasContext *s, uint32_t insn) */ static void disas_data_proc_fp(DisasContext *s, uint32_t insn) { - if (extract32(insn, 24, 1)) { - /* Floating point data-processing (3 source) */ - disas_fp_3src(s, insn); - } else if (extract32(insn, 21, 1) == 0) { + if (extract32(insn, 21, 1) == 0) { /* Floating point to fixed point conversions */ disas_fp_fixed_conv(s, insn); } else {
From patchwork Fri May 24 23:21:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 13673850
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 67/67] target/arm: Convert FCSEL to decodetree
Date: Fri, 24 May 2024 16:21:21 -0700
Message-Id: <20240524232121.284515-68-richard.henderson@linaro.org>
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>

Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/tcg/a64.decode | 4 ++ target/arm/tcg/translate-a64.c | 108 ++++++++++++++------------------- 2 files changed, 49 insertions(+), 63 deletions(-) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 6f6cd805b7..5dadbc74d7 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -1000,6 +1000,10 @@ SQDMULH_vi 0.00
1111 10 . ..... 1100 . 0 ..... ..... @qrrx_s SQRDMULH_vi 0.00 1111 01 .. .... 1101 . 0 ..... ..... @qrrx_h SQRDMULH_vi 0.00 1111 10 . ..... 1101 . 0 ..... ..... @qrrx_s +# Floating-point conditional select + +FCSEL 0001 1110 .. 1 rm:5 cond:4 11 rn:5 rd:5 esz=%esz_hsd + # Floating-point data-processing (3 source) @rrrr_hsd .... .... .. . rm:5 . ra:5 rn:5 rd:5 &rrrr_e esz=%esz_hsd diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 3c2963ebaa..845aaa2bfb 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5866,6 +5866,50 @@ static bool trans_ADDP_s(DisasContext *s, arg_rr_e *a) return true; } +/* + * Floating-point conditional select + */ + +static bool trans_FCSEL(DisasContext *s, arg_FCSEL *a) +{ + TCGv_i64 t_true, t_false; + DisasCompare64 c; + + switch (a->esz) { + case MO_32: + case MO_64: + break; + case MO_16: + if (!dc_isar_feature(aa64_fp16, s)) { + return false; + } + break; + default: + return false; + } + + if (!fp_access_check(s)) { + return true; + } + + /* Zero extend sreg & hreg inputs to 64 bits now. */ + t_true = tcg_temp_new_i64(); + t_false = tcg_temp_new_i64(); + read_vec_element(s, t_true, a->rn, 0, a->esz); + read_vec_element(s, t_false, a->rm, 0, a->esz); + + a64_test_cc(&c, a->cond); + tcg_gen_movcond_i64(c.cond, t_true, c.value, tcg_constant_i64(0), + t_true, t_false); + + /* + * Note that sregs & hregs write back zeros to the high bits, + * and we've already done the zero-extension. 
+ */ + write_fp_dreg(s, a->rd, t_true); + return true; +} + /* * Floating-point data-processing (3 source) */ @@ -7332,68 +7376,6 @@ static void disas_fp_ccomp(DisasContext *s, uint32_t insn) } } -/* Floating point conditional select - * 31 30 29 28 24 23 22 21 20 16 15 12 11 10 9 5 4 0 - * +---+---+---+-----------+------+---+------+------+-----+------+------+ - * | M | 0 | S | 1 1 1 1 0 | type | 1 | Rm | cond | 1 1 | Rn | Rd | - * +---+---+---+-----------+------+---+------+------+-----+------+------+ - */ -static void disas_fp_csel(DisasContext *s, uint32_t insn) -{ - unsigned int mos, type, rm, cond, rn, rd; - TCGv_i64 t_true, t_false; - DisasCompare64 c; - MemOp sz; - - mos = extract32(insn, 29, 3); - type = extract32(insn, 22, 2); - rm = extract32(insn, 16, 5); - cond = extract32(insn, 12, 4); - rn = extract32(insn, 5, 5); - rd = extract32(insn, 0, 5); - - if (mos) { - unallocated_encoding(s); - return; - } - - switch (type) { - case 0: - sz = MO_32; - break; - case 1: - sz = MO_64; - break; - case 3: - sz = MO_16; - if (dc_isar_feature(aa64_fp16, s)) { - break; - } - /* fallthru */ - default: - unallocated_encoding(s); - return; - } - - if (!fp_access_check(s)) { - return; - } - - /* Zero extend sreg & hreg inputs to 64 bits now. */ - t_true = tcg_temp_new_i64(); - t_false = tcg_temp_new_i64(); - read_vec_element(s, t_true, rn, 0, sz); - read_vec_element(s, t_false, rm, 0, sz); - - a64_test_cc(&c, cond); - tcg_gen_movcond_i64(c.cond, t_true, c.value, tcg_constant_i64(0), - t_true, t_false); - - /* Note that sregs & hregs write back zeros to the high bits, - and we've already done the zero-extension. 
*/ - write_fp_dreg(s, rd, t_true); -} - /* Floating-point data-processing (1 source) - half precision */ static void handle_fp_1src_half(DisasContext *s, int opcode, int rd, int rn) { @@ -8205,7 +8187,7 @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn) break; case 3: /* Floating point conditional select */ - disas_fp_csel(s, insn); + unallocated_encoding(s); /* in decodetree */ break; case 0: switch (ctz32(extract32(insn, 12, 4))) {