From patchwork Tue May 25 22:58:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12280311 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 08271C47088 for ; Tue, 25 May 2021 23:02:30 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C405761413 for ; Tue, 25 May 2021 23:02:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C405761413 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39548 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llg4C-0002mL-Ul for qemu-devel@archiver.kernel.org; Tue, 25 May 2021 19:02:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:52692) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llg0Q-0002Xy-QL for qemu-devel@nongnu.org; Tue, 25 May 2021 18:58:34 -0400 Received: from mail-pf1-x431.google.com ([2607:f8b0:4864:20::431]:33734) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llg0D-0004mi-AQ for qemu-devel@nongnu.org; Tue, 25 May 2021 
18:58:34 -0400 Received: by mail-pf1-x431.google.com with SMTP id f22so16404575pfn.0 for ; Tue, 25 May 2021 15:58:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=8bLZEVg63JaCjb9TWm4+laNRS99dgK9U+MbZ27NccEw=; b=TRb5zPZWkDydE5beewV/dSXXZDQXwFZ3e2LtR0n9k7trovaBOMyP8QaSywK/JPXU9j T2uXJZOLu8XJKwheNXmL7+Aj/IN+H85BnLsA2Zl9S7hR9J6rxbzKJ3cOD43KED5opHpn 8a+4UDrr18tPwySL79pAsePR4MAjg0INEFwR0SPen43MNjxxEHNluhcZTzLVPbEXgbx9 xKIinkP7px5aSMen3UyN+FnG2bQOHg3yOqdkG7akHsZGLKfH03XoNuRw6CRiVX0OE7BV T/+wdKHo1PODD0vX/CQBT7m2/6xh5xBiTLKrIOopvoJ7KeDVnhvuo2Xzkw1wRe6Ya/NX hcIA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8bLZEVg63JaCjb9TWm4+laNRS99dgK9U+MbZ27NccEw=; b=jTF+UrL5YQVDXh7V3HHgB5W3Bi+GUL9MoCsLIPOAgxpfHlfC7B2XAk8eJdWXUXEoGv IHQmtBgkUarysQs0G7p/0dvTC+4vTIrydJ3EdD9DM0+yP4iihHTQVSdIFjvAXJmDoJXL zeDeQ0vq7GIC3uWnEeNzaPd1Co2KQj01X5m0CUw5SLee+ng4IwWLeBclWlAMoph4HOX+ 30E5WOFAkzUupDwsr93pOmke4LzbyS0QSKvsX9G1wGIVO1AxmHHgXuuX8i3/aM8jd2D5 sLIs+rbAxcmrpn+4uoO18Lm7rV6mXSa+lb+mOIzUCZusQVXvArosz7apWDK7ad13iBKf LPVw== X-Gm-Message-State: AOAM531C0iF5OOjSePjRbKPH5gV/jssR6UdmUbzOhhhDVxmEW/ZkX5a1 CILOiNxGuUZIiOBmB0JYuTTriEOZTyhUFQ== X-Google-Smtp-Source: ABdhPJxfPFKVOXLPbVEBZiw5q3cTTI02TpZhuctSEdpU94QJbvGOn/Tvlbsf6GkhVh4zUKr6IFS1+A== X-Received: by 2002:a62:60c2:0:b029:2cb:70a7:a8ce with SMTP id u185-20020a6260c20000b02902cb70a7a8cemr32545879pfb.77.1621983499252; Tue, 25 May 2021 15:58:19 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. 
[174.21.70.228]) by smtp.gmail.com with ESMTPSA id l6sm1669928pjf.28.2021.05.25.15.58.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 May 2021 15:58:19 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 01/12] target/arm: Add isar_feature_{aa32, aa64, aa64_sve}_bf16
Date: Tue, 25 May 2021 15:58:06 -0700
Message-Id: <20210525225817.400336-2-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210525225817.400336-1-richard.henderson@linaro.org>
References: <20210525225817.400336-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Received-SPF: pass client-ip=2607:f8b0:4864:20::431; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x431.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Peter Maydell , qemu-arm@nongnu.org
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"

Note that the SVE BFLOAT16 support does not require SVE2, it is an independent extension.
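The independence of the two fields can be illustrated with a small standalone sketch. It is not QEMU code: field_ex64() is a hypothetical stand-in for the FIELD_EX64 macro, the has_* names are made up for illustration, and the bit positions (ID_AA64ZFR0_EL1.SVEver at [3:0], ID_AA64ZFR0_EL1.BF16 at [23:20], ID_AA64ISAR1_EL1.BF16 at [47:44]) are assumed from the Arm architecture reference and should be checked against it.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for QEMU's FIELD_EX64(): extract `length` bits
 * of `reg` starting at bit `start`.
 */
static inline uint64_t field_ex64(uint64_t reg, unsigned start, unsigned length)
{
    return (reg >> start) & ((UINT64_C(1) << length) - 1);
}

/* Assumed field positions; verify against the architecture manual. */
enum {
    ZFR0_SVEVER_SHIFT = 0,   /* ID_AA64ZFR0_EL1.SVEver, nonzero means SVE2 */
    ZFR0_BF16_SHIFT   = 20,  /* ID_AA64ZFR0_EL1.BF16 */
    ISAR1_BF16_SHIFT  = 44,  /* ID_AA64ISAR1_EL1.BF16 */
    ID_FIELD_WIDTH    = 4,
};

/* AdvSIMD/FP BFloat16 instructions are present. */
static inline bool has_aa64_bf16(uint64_t id_aa64isar1)
{
    return field_ex64(id_aa64isar1, ISAR1_BF16_SHIFT, ID_FIELD_WIDTH) != 0;
}

/* SVE BFloat16 instructions are present. */
static inline bool has_sve_bf16(uint64_t id_aa64zfr0)
{
    return field_ex64(id_aa64zfr0, ZFR0_BF16_SHIFT, ID_FIELD_WIDTH) != 0;
}

/* SVE2 is present. */
static inline bool has_sve2(uint64_t id_aa64zfr0)
{
    return field_ex64(id_aa64zfr0, ZFR0_SVEVER_SHIFT, ID_FIELD_WIDTH) != 0;
}
```

With id_aa64zfr0 set to 1 << 20, has_sve_bf16() is true while has_sve2() is false, which is the combination the note above allows.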
Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/cpu.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 04f8be35bf..d68275b15e 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3783,6 +3783,11 @@ static inline bool isar_feature_aa32_predinv(const ARMISARegisters *id)
     return FIELD_EX32(id->id_isar6, ID_ISAR6, SPECRES) != 0;
 }
 
+static inline bool isar_feature_aa32_bf16(const ARMISARegisters *id)
+{
+    return FIELD_EX32(id->id_isar6, ID_ISAR6, BF16) != 0;
+}
+
 static inline bool isar_feature_aa32_i8mm(const ARMISARegisters *id)
 {
     return FIELD_EX32(id->id_isar6, ID_ISAR6, I8MM) != 0;
@@ -4122,6 +4127,11 @@ static inline bool isar_feature_aa64_dcpodp(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) >= 2;
 }
 
+static inline bool isar_feature_aa64_bf16(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, BF16) != 0;
+}
+
 static inline bool isar_feature_aa64_fp_simd(const ARMISARegisters *id)
 {
     /* We always set the AdvSIMD and FP fields identically. */
@@ -4266,6 +4276,11 @@ static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0;
 }
 
+static inline bool isar_feature_aa64_sve_bf16(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BFLOAT16) != 0;
+}
+
 static inline bool isar_feature_aa64_sve2_sha3(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SHA3) != 0;

From patchwork Tue May 25 22:58:07 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 12280305
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 017A8C4707F for ; Tue, 25 May 2021 23:00:30 +0000 (UTC)
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A5F6C6140E for ; Tue, 25 May 2021 23:00:29 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A5F6C6140E
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Received: from localhost ([::1]:32826 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llg2G-0006hx-P1 for qemu-devel@archiver.kernel.org; Tue, 25 May 2021 19:00:28 -0400
Received: from
eggs.gnu.org ([2001:470:142:3::10]:52698) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llg0R-0002Yd-4U for qemu-devel@nongnu.org; Tue, 25 May 2021 18:58:35 -0400 Received: from mail-pf1-x434.google.com ([2607:f8b0:4864:20::434]:43808) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llg0D-0004nP-BZ for qemu-devel@nongnu.org; Tue, 25 May 2021 18:58:34 -0400 Received: by mail-pf1-x434.google.com with SMTP id d78so23904560pfd.10 for ; Tue, 25 May 2021 15:58:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=3DHjpex0S8JDOy5JRC2IyqoQUwifwwct+C1bXOYlysc=; b=YRqj7oQr976igPnGuc3Oq96xQqHSLzZa2/oZHYTXCS0anSFH0z7AN/LOjxwPK19ax6 N/xaTsSv1S2oHFUrgOn+t3lrMSk5B0UvJScywI0Vhm3g8ON2aXl3/RIgcZ+rkNSZce5Q zQ2fJyw0nZXTmLjpZX0FyL8vuuXFyRb4yZ4rt54OGnt2VTKoWpk8qFDNOcfpUQMd0pev WUi2aiERXYHOg9z0wZKV8afF3mgVrHuLdqgrCLo89Gt+apXXOxG9wONfbyAHqdS/+jTb d92j+tZ2spNs9uXofUmcmAYh1Jmnh4UFeuo/uTbDk91LUFSoZZIVu/LNActg4cxBtweU Qtdg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=3DHjpex0S8JDOy5JRC2IyqoQUwifwwct+C1bXOYlysc=; b=ehbOpSyYMiL6bg4CiEkh5EUIKmVYnzknE5Ooe8aj9vYx+qTDGor6Hx8/ynsiurslYG T1mgHa+0LPFXi5Ot8bENPkw488brNtJ5h9piHdJskWen2L1if4iInmtMpUBZZgYkjQH9 GCU0U7FHJX27Ya0OnRR2k5vL/A9OANOnvn++iIpXxi1J3BqRDg+A9Gh5JLfcGAVotoV9 bYCdkatj2HiOmVaGDNtyoeBU9RITITVt7mxO9YNwt+5nSIbJ6uN0z7ZGn6gEHZRFvAWN fPOGhfx/JDUNUJ690a899bezWWDmnHGtoO7aCkfd4W2yKX9TUJRY1JrvoK1E29LQlAFp iBCw== X-Gm-Message-State: AOAM531rFahjdyISVF6XwxEFfwreJ5xUTT1ipnQdB1fdlJxlz11y2Ts6 LeV99rsDI12npKfmsyGOBqL2GxjHXIrh2g== X-Google-Smtp-Source: ABdhPJwLCjQgTBt5cDnXJYZr4qOlEdNIJOJBRzt9S4G23w5klV8OQ/mdN0FCba1Jjq5E0XL9V7jeXw== 
X-Received: by 2002:a63:1161:: with SMTP id 33mr21376706pgr.270.1621983499843; Tue, 25 May 2021 15:58:19 -0700 (PDT)
Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id l6sm1669928pjf.28.2021.05.25.15.58.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 May 2021 15:58:19 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 02/12] target/arm: Unify unallocated path in disas_fp_1src
Date: Tue, 25 May 2021 15:58:07 -0700
Message-Id: <20210525225817.400336-3-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210525225817.400336-1-richard.henderson@linaro.org>
References: <20210525225817.400336-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Received-SPF: pass client-ip=2607:f8b0:4864:20::434; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x434.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Peter Maydell , qemu-arm@nongnu.org
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/translate-a64.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index ceac0ee2bd..510cb6ca5e 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -6494,8 +6494,7 @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
     int rd = extract32(insn, 0, 5);
 
     if (mos) {
-        unallocated_encoding(s);
-        return;
+        goto do_unallocated;
     }
 
     switch (opcode) {
@@ -6504,8 +6503,7 @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
         /* FCVT between half, single and double precision */
         int dtype = extract32(opcode, 0, 2);
         if (type == 2 || dtype == type) {
-            unallocated_encoding(s);
-            return;
+            goto do_unallocated;
         }
         if (!fp_access_check(s)) {
             return;
@@ -6517,8 +6515,7 @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
 
     case 0x10 ... 0x13: /* FRINT{32,64}{X,Z} */
         if (type > 1 || !dc_isar_feature(aa64_frint, s)) {
-            unallocated_encoding(s);
-            return;
+            goto do_unallocated;
         }
         /* fall through */
     case 0x0 ... 0x3:
@@ -6540,8 +6537,7 @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
             break;
         case 3:
             if (!dc_isar_feature(aa64_fp16, s)) {
-                unallocated_encoding(s);
-                return;
+                goto do_unallocated;
             }
 
             if (!fp_access_check(s)) {
@@ -6550,11 +6546,12 @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
             handle_fp_1src_half(s, opcode, rd, rn);
             break;
         default:
-            unallocated_encoding(s);
+            goto do_unallocated;
         }
         break;
 
     default:
+    do_unallocated:
         unallocated_encoding(s);
         break;
     }

From patchwork Tue May 25 22:58:08 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 12280301
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8EA93C47086 for ; Tue, 25 May 2021 23:00:01 +0000 (UTC)
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EC1C9613D6 for ; Tue, 25 May 2021 23:00:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EC1C9613D6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60120 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llg1o-00066M-0P for qemu-devel@archiver.kernel.org; Tue, 25 May 2021 19:00:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:52694) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llg0Q-0002Y2-RA for qemu-devel@nongnu.org; Tue, 25 May 2021 18:58:34 -0400 Received: from mail-pf1-x42c.google.com ([2607:f8b0:4864:20::42c]:47087) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llg0D-0004nY-PU for qemu-devel@nongnu.org; Tue, 25 May 2021 18:58:34 -0400 Received: by mail-pf1-x42c.google.com with SMTP id y15so13376876pfn.13 for ; Tue, 25 May 2021 15:58:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=0La4bELq+QRQvGpWtveOOb2coHQOmlKGzXDvXUy3i7s=; b=W4PIKBxrD+per+6GVtdQnBXuRfHCgnIuWI6kHj7YFnIe0tD18zKt4IvjDU+SvfZLVP bhWkYzTDJL6bxVPBgwswCURElfyxWRF4u3ErNHgd4lnf9n8xKdDhyIP7nCzwVsUTYkx/ T5IIdckW5gu2OO6ohiTMjorcVmwrzzqZaE1/Xgr6wb+5jQz++umw5iNRq9mC1pRRMYAs kvEThWs0Vu4Vu8r2CqNd8PzCwZffiWXZj6PWaJ7G2Ybao44BPQ4wddgQ3sx1WQtRv5tE E1P3N73QNMJYcc4bOVfcDF4dIMHdIrQtWb3hE+0byekk35mn8u5PvdSSTM4Q+FBVtSzL 6Y3w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to 
:references:mime-version:content-transfer-encoding; bh=0La4bELq+QRQvGpWtveOOb2coHQOmlKGzXDvXUy3i7s=; b=CqBKFNFO8z2gP60lKKby/p5DfvJh9LouiXTHcPxBowBNVfOytTg5BabzT7CO1jnxxg xDIwlxN6rmn6nZhNZd3NbhbYTfsdARWxf0uy9U7Ojsz9DtUYWt8kFoHjteT3/Nx/67B/ 1J3OqShRxFPN64ZIQllTC46OOlBtrXKJluPMbnxSCPxs3ki0e0D/EqHF5Mr/N7MIfMA3 8Gd2aLtQk3bts4LWrqSHUERPvSvqE4rQZVlCqV4r1x6k+KjU6d67m6r56ZOXhQcqKedg IvH7EOfK6ahX3zExscOG7zefv48xJ1MmH0wgrO+ckL/kHQgWg51G38vim5y3lvprl4OB ANfQ== X-Gm-Message-State: AOAM530PRXwNpUUPLOepdSTmbO6uHQ1y3DC0PNuLgySrMAU8zDLKO6MG MRjUhM5Y1GzdgqVQpMN3jdzKggk7XHFTLQ== X-Google-Smtp-Source: ABdhPJzin7ccYFpFtLqDVyAwmueNbKQoit/6UR1UuAQBQwJ19qmo74GY1I586OfEaxJZa3mf/SC+Ug== X-Received: by 2002:a05:6a00:1630:b029:2c0:a1eb:d77 with SMTP id e16-20020a056a001630b02902c0a1eb0d77mr32345282pfc.81.1621983500454; Tue, 25 May 2021 15:58:20 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id l6sm1669928pjf.28.2021.05.25.15.58.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 May 2021 15:58:20 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 03/12] target/arm: Implement scalar float32 to bfloat16 conversion Date: Tue, 25 May 2021 15:58:08 -0700 Message-Id: <20210525225817.400336-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525225817.400336-1-richard.henderson@linaro.org> References: <20210525225817.400336-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42c; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Peter Maydell , qemu-arm@nongnu.org
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"

This is the 64-bit BFCVT and the 32-bit VCVT{B,T}.BF16.F32.

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  1 +
 target/arm/vfp.decode      |  2 ++
 target/arm/translate-a64.c | 19 +++++++++++++++++++
 target/arm/translate-vfp.c | 24 ++++++++++++++++++++++++
 target/arm/vfp_helper.c    |  5 +++++
 5 files changed, 51 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 23ccb0f72f..9977a827e9 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -143,6 +143,7 @@ DEF_HELPER_3(vfp_cmped, void, f64, f64, env)
 
 DEF_HELPER_2(vfp_fcvtds, f64, f32, env)
 DEF_HELPER_2(vfp_fcvtsd, f32, f64, env)
+DEF_HELPER_FLAGS_2(bfcvt, TCG_CALL_NO_RWG, i32, f32, ptr)
 
 DEF_HELPER_2(vfp_uitoh, f16, i32, ptr)
 DEF_HELPER_2(vfp_uitos, f32, i32, ptr)
diff --git a/target/arm/vfp.decode b/target/arm/vfp.decode
index 6f7f28f9a4..52535d9b0b 100644
--- a/target/arm/vfp.decode
+++ b/target/arm/vfp.decode
@@ -205,6 +205,8 @@ VCVT_f64_f16 ---- 1110 1.11 0010 .... 1011 t:1 1.0 .... \
 
 # VCVTB and VCVTT to f16: Vd format is always vd_sp;
 # Vm format depends on size bit
+VCVT_b16_f32 ---- 1110 1.11 0011 .... 1001 t:1 1.0 .... \
+             vd=%vd_sp vm=%vm_sp
 VCVT_f16_f32 ---- 1110 1.11 0011 .... 1010 t:1 1.0 .... \
              vd=%vd_sp vm=%vm_sp
 VCVT_f16_f64 ---- 1110 1.11 0011 .... 1011 t:1 1.0 .... \
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 510cb6ca5e..90605d7dce 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -6273,6 +6273,9 @@ static void handle_fp_1src_single(DisasContext *s, int opcode, int rd, int rn)
     case 0x3: /* FSQRT */
         gen_helper_vfp_sqrts(tcg_res, tcg_op, cpu_env);
         goto done;
+    case 0x6: /* BFCVT */
+        gen_fpst = gen_helper_bfcvt;
+        break;
     case 0x8: /* FRINTN */
     case 0x9: /* FRINTP */
     case 0xa: /* FRINTM */
@@ -6550,6 +6553,22 @@ static void disas_fp_1src(DisasContext *s, uint32_t insn)
         }
         break;
 
+    case 0x6:
+        switch (type) {
+        case 1: /* BFCVT */
+            if (!dc_isar_feature(aa64_bf16, s)) {
+                goto do_unallocated;
+            }
+            if (!fp_access_check(s)) {
+                return;
+            }
+            handle_fp_1src_single(s, opcode, rd, rn);
+            break;
+        default:
+            goto do_unallocated;
+        }
+        break;
+
     default:
     do_unallocated:
         unallocated_encoding(s);
diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
index 3da84f30a0..d8271dbaac 100644
--- a/target/arm/translate-vfp.c
+++ b/target/arm/translate-vfp.c
@@ -3025,6 +3025,30 @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a)
     return true;
 }
 
+static bool trans_VCVT_b16_f32(DisasContext *s, arg_VCVT_b16_f32 *a)
+{
+    TCGv_ptr fpst;
+    TCGv_i32 tmp;
+
+    if (!dc_isar_feature(aa32_bf16, s)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fpst = fpstatus_ptr(FPST_FPCR);
+    tmp = tcg_temp_new_i32();
+
+    vfp_load_reg32(tmp, a->vm);
+    gen_helper_bfcvt(tmp, tmp, fpst);
+    tcg_gen_st16_i32(tmp, cpu_env, vfp_f16_offset(a->vd, a->t));
+    tcg_temp_free_ptr(fpst);
+    tcg_temp_free_i32(tmp);
+    return true;
+}
+
 static bool trans_VCVT_f16_f32(DisasContext *s, arg_VCVT_f16_f32 *a)
 {
     TCGv_ptr fpst;
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index 01b9d8557f..fe7a2a5daa 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -408,6 +408,11 @@ float32 VFP_HELPER(fcvts, d)(float64 x, CPUARMState *env)
     return float64_to_float32(x, &env->vfp.fp_status);
 }
 
+uint32_t HELPER(bfcvt)(float32 x, void *status)
+{
+    return float32_to_bfloat16(x, status);
+}
+
 /*
  * VFP3 fixed point conversion. The AArch32 versions of fix-to-float
  * must always round-to-nearest; the AArch64 ones honour the FPSCR

From patchwork Tue May 25 22:58:09 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 12280303
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 94C43C47086 for ; Tue, 25 May 2021 23:00:06 +0000 (UTC)
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 06E6F61413 for ; Tue, 25 May 2021 23:00:06 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 06E6F61413
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Received: from localhost ([::1]:60364 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llg1s-0006FT-SF for qemu-devel@archiver.kernel.org; Tue, 25 May 2021 19:00:05 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:52732) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llg0S-0002d9-Jr for
qemu-devel@nongnu.org; Tue, 25 May 2021 18:58:36 -0400 Received: from mail-pf1-x42c.google.com ([2607:f8b0:4864:20::42c]:39870) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llg0F-0004nl-3x for qemu-devel@nongnu.org; Tue, 25 May 2021 18:58:36 -0400 Received: by mail-pf1-x42c.google.com with SMTP id y202so1426387pfc.6 for ; Tue, 25 May 2021 15:58:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=gyJ317Rrl3ir/rnX415PevDk/VoO/H8S6IDuXBd4THk=; b=SShEv6loYwwyJE+gEKsZhx1wyoySfHdxC38zLB7w0qaaO63QHS6BfXVgrIz7cT24Xl 2+WPcc06oUOEyoSNEdqs44GdliExf6k8IAnSRL2P6YDyF1MLAOIUJzBnJVXfXtswStLN +QqKHykP83I8QWTPxA2xXyw0q/cCJqhXt2LhM2++8ddyuuoXsTvFGFT9HEUarPcV3Tyd WPOg7YO/mEYdZgp8GDez5m4gIWcdkn4B91uv/ZzhkCsn2Ts4tJ8KOH/yZ7+HUIyuOyC8 tBlv6zaPlcAMoc4v+lrl46t2b7Bvsvc3B4etgSJHnqggiy7ufHn0HhzpLC5kmMArASan EcYg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gyJ317Rrl3ir/rnX415PevDk/VoO/H8S6IDuXBd4THk=; b=PsPLkLtzg9awBqm4SHcDRTX4wNAfP45wCiv9+dQewO2z4rzQqPGEGXQ/IFpu46hCca oQxJRbUew/1c8ML+y1/5EWuDUovZLsuamVKXN56ioDJrMTl5kxT1K0x6aBM1Xfqbf5AS hu1O+irZcAP3scecMZTmrQxRyaXOmt4mvhu9mPOTq4Pr8FXHqxzFl5plgYTXyhscpa4t HHlDeNiY3nrLWiB2xSqcedGZ3+oUD7Onvm+c9snRLTW1ABXJF0wEEN266mPLZvOK23/x Ndjjqu/g0Sk8QIxNCuWnYiunqq45uv0e8QRl9/STQx/hqz3Lt5xpk5rzgeTNSqggxdOC eMig== X-Gm-Message-State: AOAM532GxIBK+2e6eLNpmshITrwnWreMX4K9054pGzN4Xat/ugLOBlMw WqurU1cCBDHZJu7iAr99wMDVC6NPkY/3+g== X-Google-Smtp-Source: ABdhPJwNLxXUOR49oYrWnGtCUodPa46mvmPeOtGnApqZE8C+pcnL1QnlUFH4G284XwbNzKO/Q5KArA== X-Received: by 2002:a62:7c46:0:b029:2dc:cb24:b5b1 with SMTP id x67-20020a627c460000b02902dccb24b5b1mr32215624pfc.77.1621983501094; Tue, 25 May 2021 15:58:21 -0700 (PDT) 
Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. [174.21.70.228]) by smtp.gmail.com with ESMTPSA id l6sm1669928pjf.28.2021.05.25.15.58.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 May 2021 15:58:20 -0700 (PDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 04/12] target/arm: Implement vector float32 to bfloat16 conversion
Date: Tue, 25 May 2021 15:58:09 -0700
Message-Id: <20210525225817.400336-5-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210525225817.400336-1-richard.henderson@linaro.org>
References: <20210525225817.400336-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Received-SPF: pass client-ip=2607:f8b0:4864:20::42c; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42c.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Peter Maydell , qemu-arm@nongnu.org
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"

This is BFCVT{N,T} for both AArch64 AdvSIMD and SVE, and VCVT.BF16.F32 for AArch32 NEON.
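For readers unfamiliar with the narrowing itself, here is a minimal standalone sketch of a float32 to bfloat16 conversion and of packing two results into one 32-bit word. It is not the softfloat float32_to_bfloat16() used by the helpers: it assumes round-to-nearest-even only and ignores the rounding mode, exception flags, flush-to-zero and default-NaN controls carried in the float_status. The function names and the packing order (element 0 in the low half) are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Illustrative float32 -> bfloat16 conversion on raw bit patterns,
 * assuming round-to-nearest-even. A bfloat16 is the top 16 bits of a
 * float32: 1 sign bit, 8 exponent bits, 7 fraction bits.
 */
static inline uint16_t f32_bits_to_bf16_rne(uint32_t bits)
{
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        /* NaN: truncate and force the quiet bit so it stays a NaN. */
        return (uint16_t)((bits >> 16) | 0x0040u);
    }
    /*
     * Add 0x7fff plus the lowest kept bit. Anything above the halfway
     * point carries into the kept bits, and an exact tie carries only
     * when the kept value is odd, which gives ties-to-even. A carry out
     * of the largest finite exponent yields infinity.
     */
    uint32_t lsb = (bits >> 16) & 1u;
    return (uint16_t)((bits + 0x7fffu + lsb) >> 16);
}

/*
 * Hypothetical counterpart of a "pair" helper: convert the two float32
 * values held in a 64-bit word and pack the results, element 0 in the
 * low 16 bits and element 1 in the high 16 bits.
 */
static inline uint32_t bf16_pair_from_f32_pair(uint64_t pair)
{
    uint32_t lo = f32_bits_to_bf16_rne((uint32_t)pair);
    uint32_t hi = f32_bits_to_bf16_rne((uint32_t)(pair >> 32));
    return lo | (hi << 16);
}
```

For example, the pair holding 1.0f (0x3f800000) in the low half and 2.0f (0x40000000) in the high half packs to 0x40003f80.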
Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h     |  4 ++++
 target/arm/helper.h         |  1 +
 target/arm/neon-dp.decode   |  1 +
 target/arm/sve.decode       |  2 ++
 target/arm/sve_helper.c     |  2 ++
 target/arm/translate-a64.c  | 17 ++++++++++++++
 target/arm/translate-neon.c | 45 +++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c  | 16 +++++++++++++
 target/arm/vfp_helper.c     |  7 ++++++
 9 files changed, 95 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 29a14a21f5..dc629f851a 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1197,6 +1197,8 @@ DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_bfcvt, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(sve_fcvtzs_hh, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
@@ -2752,6 +2754,8 @@ DEF_HELPER_FLAGS_5(sve2_fcvtnt_sh, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve2_fcvtnt_ds, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_bfcvtnt, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(sve2_fcvtlt_hs, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index 9977a827e9..8b4b7d92f3 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -144,6 +144,7 @@ DEF_HELPER_3(vfp_cmped, void, f64, f64, env)
 DEF_HELPER_2(vfp_fcvtds, f64, f32, env)
 DEF_HELPER_2(vfp_fcvtsd, f32, f64, env)
 DEF_HELPER_FLAGS_2(bfcvt, TCG_CALL_NO_RWG, i32, f32, ptr)
+DEF_HELPER_FLAGS_2(bfcvt_pair, TCG_CALL_NO_RWG, i32, i64, ptr)
 
 DEF_HELPER_2(vfp_uitoh, f16, i32, ptr)
 DEF_HELPER_2(vfp_uitos, f32, i32, ptr)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index ec83f10ab3..fd3a01bfa0 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -521,6 +521,7 @@ Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
     VRINTZ 1111 001 11 . 11 .. 10 .... 0 1011 . . 0 .... @2misc
 
     VCVT_F16_F32 1111 001 11 . 11 .. 10 .... 0 1100 0 . 0 .... @2misc_q0
+    VCVT_B16_F32 1111 001 11 . 11 .. 10 .... 0 1100 1 . 0 .... @2misc_q0
 
     VRINTM 1111 001 11 . 11 .. 10 .... 0 1101 . . 0 .... @2misc
 
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index cb077bfde9..18d1a0eecc 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1036,6 +1036,7 @@ FNMLS_zpzzz 01100101 .. 1 ..... 111 ... ..... ..... @rdn_pg_rm_ra
 # SVE floating-point convert precision
 FCVT_sh 01100101 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0
 FCVT_hs 01100101 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0
+BFCVT 01100101 10 0010 10 101 ... ..... ..... @rd_pg_rn_e0
 FCVT_dh 01100101 11 0010 00 101 ... ..... ..... @rd_pg_rn_e0
 FCVT_hd 01100101 11 0010 01 101 ... ..... ..... @rd_pg_rn_e0
 FCVT_ds 01100101 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0
@@ -1610,6 +1611,7 @@ RAX1 01000101 00 1 ..... 11110 1 ..... ..... @rd_rn_rm_e0
 FCVTXNT_ds 01100100 00 0010 10 101 ... ..... ..... @rd_pg_rn_e0
 FCVTX_ds 01100101 00 0010 10 101 ... ..... ..... @rd_pg_rn_e0
 FCVTNT_sh 01100100 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0
+BFCVTNT 01100100 10 0010 10 101 ... ..... ..... @rd_pg_rn_e0
 FCVTLT_hs 01100100 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0
 FCVTNT_ds 01100100 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0
 FCVTLT_sd 01100100 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 40af3024df..46a957b6fb 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -4708,6 +4708,7 @@ static inline uint64_t vfp_float64_to_uint64_rtz(float64 f, float_status *s)
 
 DO_ZPZ_FP(sve_fcvt_sh, uint32_t, H1_4, sve_f32_to_f16)
 DO_ZPZ_FP(sve_fcvt_hs, uint32_t, H1_4, sve_f16_to_f32)
+DO_ZPZ_FP(sve_bfcvt, uint32_t, H1_4, float32_to_bfloat16)
 DO_ZPZ_FP(sve_fcvt_dh, uint64_t, , sve_f64_to_f16)
 DO_ZPZ_FP(sve_fcvt_hd, uint64_t, , sve_f16_to_f64)
 DO_ZPZ_FP(sve_fcvt_ds, uint64_t, , float64_to_float32)
@@ -7740,6 +7741,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \
     } while (i != 0); \
 }
 
+DO_FCVTNT(sve_bfcvtnt, uint32_t, uint16_t, H1_4, H1_2, float32_to_bfloat16)
 DO_FCVTNT(sve2_fcvtnt_sh, uint32_t, uint16_t, H1_4, H1_2, sve_f32_to_f16)
 DO_FCVTNT(sve2_fcvtnt_ds, uint64_t, uint32_t, , H1_4, float64_to_float32)
 
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 90605d7dce..5a96523b9f 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -10346,6 +10346,13 @@ static void handle_2misc_narrow(DisasContext *s, bool scalar,
                 tcg_temp_free_i32(ahp);
             }
             break;
+        case 0x36: /* BFCVTN, BFCVTN2 */
+            {
+                TCGv_ptr fpst = fpstatus_ptr(FPST_FPCR);
+                gen_helper_bfcvt_pair(tcg_res[pass], tcg_op, fpst);
+                tcg_temp_free_ptr(fpst);
+            }
+            break;
         case 0x56: /* FCVTXN, FCVTXN2 */
             /* 64 bit to 32 bit float conversion
              * with von Neumann rounding (round to odd)
@@ -12746,6 +12753,16 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
             }
             handle_2misc_narrow(s, false, opcode, 0, is_q, size - 1, rn, rd);
             return;
+        case 0x36: /* BFCVTN, BFCVTN2 */
+            if (!dc_isar_feature(aa64_bf16, s) || size != 2) {
+                unallocated_encoding(s);
+                return;
+            }
+            if (!fp_access_check(s)) {
+                return;
+            }
+            handle_2misc_narrow(s, false, opcode, 0, is_q, size - 1, rn, rd);
+            return;
         case 0x17: /* FCVTL, FCVTL2 */
             if (!fp_access_check(s)) {
                 return;
diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c
index 9e990b41ed..6d94229c69 100644
--- a/target/arm/translate-neon.c
+++ b/target/arm/translate-neon.c
@@ -3422,6 +3422,51 @@ static bool trans_VSHLL(DisasContext *s, arg_2misc *a)
     return true;
 }
 
+static bool trans_VCVT_B16_F32(DisasContext *s, arg_2misc *a)
+{
+    TCGv_ptr fpst;
+    TCGv_i64 tmp;
+    TCGv_i32 dst0, dst1;
+
+    if (!dc_isar_feature(aa32_bf16, s)) {
+        return false;
+    }
+
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vm) & 0x10)) {
+        return false;
+    }
+
+    if ((a->vm & 1) || (a->size != 1)) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    fpst = fpstatus_ptr(FPST_STD);
+    tmp = tcg_temp_new_i64();
+    dst0 = tcg_temp_new_i32();
+    dst1 = tcg_temp_new_i32();
+
+    read_neon_element64(tmp, a->vm, 0, MO_64);
+    gen_helper_bfcvt_pair(dst0, tmp, fpst);
+
+    read_neon_element64(tmp, a->vm, 1, MO_64);
+    gen_helper_bfcvt_pair(dst1, tmp, fpst);
+
+    write_neon_element32(dst0, a->vd, 0, MO_32);
+    write_neon_element32(dst1, a->vd, 1, MO_32);
+
+    tcg_temp_free_i64(tmp);
+    tcg_temp_free_i32(dst0);
+    tcg_temp_free_i32(dst1);
+    tcg_temp_free_ptr(fpst);
+    return true;
+}
+
 static bool trans_VCVT_F16_F32(DisasContext *s, arg_2misc *a)
 {
     TCGv_ptr fpst;
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 9574efe957..fb692a1835 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4777,6 +4777,14 @@ static bool trans_FCVT_hs(DisasContext *s, arg_rpr_esz *a)
     return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hs);
 }
 
+static bool trans_BFCVT(DisasContext *s, arg_rpr_esz *a)
+{
+    if (!dc_isar_feature(aa64_sve_bf16, s)) {
+        return false;
+    }
+    return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_bfcvt);
+}
+
 static bool trans_FCVT_dh(DisasContext *s, arg_rpr_esz *a)
 {
     return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_dh);
@@
-8472,6 +8480,14 @@ static bool trans_FCVTNT_sh(DisasContext *s, arg_rpr_esz *a) return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtnt_sh); } +static bool trans_BFCVTNT(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve_bf16, s)) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_bfcvtnt); +} + static bool trans_FCVTNT_ds(DisasContext *s, arg_rpr_esz *a) { if (!dc_isar_feature(aa64_sve2, s)) { diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c index fe7a2a5daa..3328423cec 100644 --- a/target/arm/vfp_helper.c +++ b/target/arm/vfp_helper.c @@ -413,6 +413,13 @@ uint32_t HELPER(bfcvt)(float32 x, void *status) return float32_to_bfloat16(x, status); } +uint32_t HELPER(bfcvt_pair)(uint64_t pair, void *status) +{ + bfloat16 lo = float32_to_bfloat16(extract64(pair, 0, 32), status); + bfloat16 hi = float32_to_bfloat16(extract64(pair, 32, 32), status); + return deposit32(lo, 16, 16, hi); +} + /* * VFP3 fixed point conversion. 
The AArch32 versions of fix-to-float * must always round-to-nearest; the AArch64 ones honour the FPSCR From patchwork Tue May 25 22:58:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 12280313 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B26B8C4707F for ; Tue, 25 May 2021 23:02:50 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 77E4361284 for ; Tue, 25 May 2021 23:02:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 77E4361284 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:41446 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1llg4X-00041x-MJ for qemu-devel@archiver.kernel.org; Tue, 25 May 2021 19:02:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:52756) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1llg0T-0002gD-Dd for qemu-devel@nongnu.org; Tue, 25 May 2021 18:58:37 -0400 Received: from mail-pf1-x429.google.com ([2607:f8b0:4864:20::429]:43798) by eggs.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1llg0F-0004oR-4M for qemu-devel@nongnu.org; Tue, 25 May 2021 18:58:37 -0400 Received: by mail-pf1-x429.google.com with SMTP id d78so23904593pfd.10 for ; Tue, 25 May 2021 15:58:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=inUd/h721Ne/1vYYmFZwK9UzqSXV6+upNZJZePrYay4=; b=kL1vkKxmskvj6qxDLY8hIJ4NVkoE5oaGfjbkYvLOufs7Thtk8meRcnifD9J0Iwcc8w nrhL8OPEnLrHYW4JXj12TaKF3LQlCpDFjpk19HbleG6cyx3WwZTCc/0BgsDaK8zpW9rW FqWenm6jLPDerA8czIVjTk94szmN8pi6ZjDNPrrT5UlGtn1Yw0Jqvjkh7Hz2C9sewiSL 1l1osvFEQ9+FS4vx2fYxJzw7QZ0ItEp7E6xj+SH3LmdfyuWd5+AFzikss1q1Vff30438 pcGS9UNtU0uxv+EKt26dHLojff+8z78fVybOp0pm3YJETUxAIRiRcc/a5SMgc9ww1RxA Tw3Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=inUd/h721Ne/1vYYmFZwK9UzqSXV6+upNZJZePrYay4=; b=OnE8aipQCwk4PR8Pnrb8Viy8bt28a1Dr0apGb5gir2gX7HwAo89xrfYRpozgmCsLqi tl7SJn3CN+5RJkx3BjbqtqX+iNTyitIfplMU2jDeDG6Z6l4WC0nSGsOGnCXk40p1TTh4 0oWpGNFbud5OLTGurshjAxMr+rhTv9k7BWiY1TmNgWOixnXFBIrxX1ZplEP6oy0iIqX8 rtxYgXv6pNaaiVjyvyXOyorFzH9w5wIQSYOSN6BL43CDH+ZvDcqBvfbUEwA0sqlY/o1r TObrLcsWPFKbUMpojSwEBDTYj01WCT451/+Awkfx5di0bvdPx7jN4Sy2R2XeXUJElUT3 c+bw== X-Gm-Message-State: AOAM532obiWHACSTeUGAGN0t0/fvn25rCrxzLVDbEBQ4hSu89LFaFcCw 9yLBHSJkENo/9cO19Hj1wPIBII71PXZE4w== X-Google-Smtp-Source: ABdhPJwFfIykPwZf/JlIChQFe3tqGtkIRQecHD5MsrcZQ18ISCwtP+bvyARRM60I8gm0phQvyJNfQg== X-Received: by 2002:a05:6a00:cd4:b029:2e1:b937:77e8 with SMTP id b20-20020a056a000cd4b02902e1b93777e8mr31316925pfv.43.1621983501679; Tue, 25 May 2021 15:58:21 -0700 (PDT) Received: from localhost.localdomain (174-21-70-228.tukw.qwest.net. 
[174.21.70.228]) by smtp.gmail.com with ESMTPSA id l6sm1669928pjf.28.2021.05.25.15.58.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 May 2021 15:58:21 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 05/12] softfpu: Add float_round_to_odd_inf Date: Tue, 25 May 2021 15:58:10 -0700 Message-Id: <20210525225817.400336-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210525225817.400336-1-richard.henderson@linaro.org> References: <20210525225817.400336-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::429; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org, =?utf-8?q?Alex_Benn=C3=A9e?= Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" For Arm BFDOT and BFMMLA, we need a version of round-to-odd that overflows to infinity, instead of the max normal number. 
Cc: Alex Bennée
Signed-off-by: Richard Henderson
---
 include/fpu/softfloat-types.h | 4 +++-
 fpu/softfloat-parts.c.inc     | 6 ++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
index 8a3f20fae9..3b757c3d6a 100644
--- a/include/fpu/softfloat-types.h
+++ b/include/fpu/softfloat-types.h
@@ -134,8 +134,10 @@ typedef enum __attribute__((__packed__)) {
     float_round_up = 2,
     float_round_to_zero = 3,
     float_round_ties_away = 4,
-    /* Not an IEEE rounding mode: round to the closest odd mantissa value */
+    /* Not an IEEE rounding mode: round to closest odd, overflow to max */
     float_round_to_odd = 5,
+    /* Not an IEEE rounding mode: round to closest odd, overflow to inf */
+    float_round_to_odd_inf = 6,
 } FloatRoundMode;

 /*
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index a897a5a743..7f69da1d8f 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -176,13 +176,12 @@ static void partsN(uncanon)(FloatPartsN *p, float_status *s,
         g_assert_not_reached();
     }

+    overflow_norm = false;
     switch (s->float_rounding_mode) {
     case float_round_nearest_even:
-        overflow_norm = false;
         inc = ((p->frac_lo & roundeven_mask) != frac_lsbm1 ? frac_lsbm1 : 0);
         break;
     case float_round_ties_away:
-        overflow_norm = false;
         inc = frac_lsbm1;
         break;
     case float_round_to_zero:
@@ -199,6 +198,8 @@ static void partsN(uncanon)(FloatPartsN *p, float_status *s,
         break;
     case float_round_to_odd:
         overflow_norm = true;
+        /* fall through */
+    case float_round_to_odd_inf:
         inc = p->frac_lo & frac_lsb ? 0 : round_mask;
         break;
     default:
@@ -259,6 +260,7 @@ static void partsN(uncanon)(FloatPartsN *p, float_status *s,
                        ? frac_lsbm1 : 0);
                 break;
             case float_round_to_odd:
+            case float_round_to_odd_inf:
                 inc = p->frac_lo & frac_lsb ? 0 : round_mask;
                 break;
             default:

From patchwork Tue May 25 22:58:11 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 12280339
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 06/12] target/arm: Implement bfloat16 dot product (vector)
Date: Tue, 25 May 2021 15:58:11 -0700
Message-Id: <20210525225817.400336-7-richard.henderson@linaro.org>
In-Reply-To: <20210525225817.400336-1-richard.henderson@linaro.org>
References: <20210525225817.400336-1-richard.henderson@linaro.org>

This is BFDOT for both AArch64 AdvSIMD and SVE,
and VDOT.BF16 for AArch32 NEON.
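
For readers unfamiliar with the operation, here is a minimal sketch of what one 32-bit lane of a bfloat16 dot product computes (editorial, not part of the patch). Each lane of the two sources holds a pair of bfloat16 values; a bfloat16 is the top 16 bits of a float32, so widening is a 16-bit left shift. The names bf16_bits_to_float and bfdot_lane are hypothetical, and the sketch uses host float arithmetic with its default rounding, which is an assumption: the helper in this patch uses softfloat with round-to-odd, flush-to-zero and default-NaN settings instead, so results can differ for inexact or special cases.

```c
#include <stdint.h>
#include <string.h>

/* Widen a raw bfloat16 bit pattern to float32 by shifting it into
 * the upper half of a 32-bit word. */
static float bf16_bits_to_float(uint16_t h)
{
    uint32_t w = (uint32_t)h << 16;
    float f;

    memcpy(&f, &w, sizeof(f));
    return f;
}

/* Hypothetical per-lane model: multiply the low pair and the high
 * pair, add the two products, then accumulate into sum. */
static float bfdot_lane(float sum, uint32_t n, uint32_t m)
{
    float lo = bf16_bits_to_float((uint16_t)(n & 0xffff)) *
               bf16_bits_to_float((uint16_t)(m & 0xffff));
    float hi = bf16_bits_to_float((uint16_t)(n >> 16)) *
               bf16_bits_to_float((uint16_t)(m >> 16));

    return sum + (lo + hi);
}
```

For example, with n = 0x40003f80 (2.0 in the high half, 1.0 in the low half) and m = 0x3f004040 (0.5 high, 3.0 low), bfdot_lane(10.0f, n, m) accumulates 1.0 * 3.0 + 2.0 * 0.5 onto 10.0.
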
Signed-off-by: Richard Henderson
---
 target/arm/helper.h           |  3 +++
 target/arm/neon-shared.decode |  2 ++
 target/arm/sve.decode         |  3 +++
 target/arm/translate-a64.c    | 20 ++++++++++++++++++
 target/arm/translate-neon.c   |  9 ++++++++
 target/arm/translate-sve.c    | 12 +++++++++++
 target/arm/vec_helper.c       | 40 +++++++++++++++++++++++++++++++++++
 7 files changed, 89 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 8b4b7d92f3..de2f5331dc 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -1002,6 +1002,9 @@ DEF_HELPER_FLAGS_5(gvec_ummla_b, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_usmmla_b, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_5(gvec_bfdot, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
index cc9f4cdd85..31a0839bbb 100644
--- a/target/arm/neon-shared.decode
+++ b/target/arm/neon-shared.decode
@@ -52,6 +52,8 @@ VUDOT 1111 110 00 . 10 .... .... 1101 . q:1 . 1 .... \
                vm=%vm_dp vn=%vn_dp vd=%vd_dp
 VUSDOT 1111 110 01 . 10 .... .... 1101 . q:1 . 0 .... \
                vm=%vm_dp vn=%vn_dp vd=%vd_dp
+VDOT_b16 1111 110 00 . 00 .... .... 1101 . q:1 . 0 .... \
+               vm=%vm_dp vn=%vn_dp vd=%vd_dp

 # VFM[AS]L
 VFML 1111 110 0 s:1 . 10 .... .... 1000 . 0 . 1 .... \
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 18d1a0eecc..a7429b293f 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1625,6 +1625,9 @@ FMLALT_zzzw 01100100 10 1 ..... 10 0 00 1 ..... ..... @rda_rn_rm_e0
 FMLSLB_zzzw 01100100 10 1 ..... 10 1 00 0 ..... ..... @rda_rn_rm_e0
 FMLSLT_zzzw 01100100 10 1 ..... 10 1 00 1 ..... ..... @rda_rn_rm_e0

+### SVE2 floating-point bfloat16 dot-product
+BFDOT_zzzz 01100100 01 1 ..... 10 0 00 0 ..... ..... @rda_rn_rm_e0
+
 ### SVE2 floating-point multiply-add long (indexed)
 FMLALB_zzxw 01100100 10 1 ..... 0100.0 ..... ..... @rrxr_3a esz=2
 FMLALT_zzxw 01100100 10 1 ..... 0100.1 ..... ..... @rrxr_3a esz=2
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 5a96523b9f..127c8d8e9d 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -12228,6 +12228,16 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
         }
         feature = dc_isar_feature(aa64_fcma, s);
         break;
+    case 0x1f: /* BFDOT */
+        switch (size) {
+        case 1:
+            feature = dc_isar_feature(aa64_bf16, s);
+            break;
+        default:
+            unallocated_encoding(s);
+            return;
+        }
+        break;
     default:
         unallocated_encoding(s);
         return;
@@ -12311,6 +12321,16 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
         }
         return;

+    case 0xf: /* BFDOT */
+        switch (size) {
+        case 1:
+            gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, 0, gen_helper_gvec_bfdot);
+            break;
+        default:
+            g_assert_not_reached();
+        }
+        return;
+
     default:
         g_assert_not_reached();
     }
diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c
index 6d94229c69..9460857b2a 100644
--- a/target/arm/translate-neon.c
+++ b/target/arm/translate-neon.c
@@ -296,6 +296,15 @@ static bool trans_VUSDOT(DisasContext *s, arg_VUSDOT *a)
                         gen_helper_gvec_usdot_b);
 }

+static bool trans_VDOT_b16(DisasContext *s, arg_VDOT_b16 *a)
+{
+    if (!dc_isar_feature(aa32_bf16, s)) {
+        return false;
+    }
+    return do_neon_ddda(s, a->q * 7, a->vd, a->vn, a->vm, 0,
+                        gen_helper_gvec_bfdot);
+}
+
 static bool trans_VFML(DisasContext *s, arg_VFML *a)
 {
     int opr_sz;
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index fb692a1835..ed290827ad 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -8653,3 +8653,15 @@ static bool trans_UMMLA(DisasContext *s, arg_rrrr_esz *a)
 {
     return do_i8mm_zzzz_ool(s, a, gen_helper_gvec_ummla_b, 0);
 }
+
+static bool trans_BFDOT_zzzz(DisasContext *s, arg_rrrr_esz *a)
+{
+    if (!dc_isar_feature(aa64_sve_bf16, s)) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        gen_gvec_ool_zzzz(s, gen_helper_gvec_bfdot,
+                          a->rd, a->rn, a->rm, a->ra, 0);
+    }
+    return true;
+}
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index e84b438340..7eefcd06ea 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -2412,3 +2412,43 @@ static void do_mmla_b(void *vd, void *vn, void *vm, void *va, uint32_t desc,
 DO_MMLA_B(gvec_smmla_b, do_smmla_b)
 DO_MMLA_B(gvec_ummla_b, do_ummla_b)
 DO_MMLA_B(gvec_usmmla_b, do_usmmla_b)
+
+/*
+ * BFloat16 Dot Product
+ */
+
+static float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2)
+{
+    /* FPCR is ignored for BFDOT and BFMMLA. */
+    float_status bf_status = {
+        .tininess_before_rounding = float_tininess_before_rounding,
+        .float_rounding_mode = float_round_to_odd_inf,
+        .flush_to_zero = true,
+        .flush_inputs_to_zero = true,
+        .default_nan_mode = true,
+    };
+    float32 t1, t2;
+
+    /*
+     * Extract each BFloat16 from the element pair, and shift
+     * them such that they become float32.
+     */
+    t1 = float32_mul(e1 << 16, e2 << 16, &bf_status);
+    t2 = float32_mul(e1 & 0xffff0000u, e2 & 0xffff0000u, &bf_status);
+    t1 = float32_add(t1, t2, &bf_status);
+    t1 = float32_add(sum, t1, &bf_status);
+
+    return t1;
+}
+
+void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    float32 *d = vd, *a = va;
+    uint32_t *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 4; ++i) {
+        d[i] = bfdotadd(a[i], n[i], m[i]);
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}

From patchwork Tue May 25 22:58:12 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 12280319
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Subject: [PATCH v2 07/12] target/arm: Implement bfloat16 dot product (indexed)
Date: Tue, 25 May 2021 15:58:12 -0700
Message-Id: <20210525225817.400336-8-richard.henderson@linaro.org>
In-Reply-To: <20210525225817.400336-1-richard.henderson@linaro.org>
References: <20210525225817.400336-1-richard.henderson@linaro.org>

This is BFDOT for both AArch64 AdvSIMD and SVE,
and VDOT.BF16 for AArch32 NEON.
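
The indexed form differs from the vector form only in how the second operand is chosen: within each 128-bit segment (four 32-bit lanes), one lane of the second source is selected by the index and reused for every lane of that segment. A minimal editorial sketch of that selection follows; it is not part of the patch. The names bfdot_lane and bfdot_indexed are hypothetical, host float arithmetic stands in for softfloat, and a little-endian lane layout is assumed so that no host-endian index adjustment is needed. The caller is assumed to pass an index below the segment size and a second source that is valid at that position.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Widen a raw bfloat16 bit pattern to float32. */
static float bf16_bits_to_float(uint16_t h)
{
    uint32_t w = (uint32_t)h << 16;
    float f;

    memcpy(&f, &w, sizeof(f));
    return f;
}

/* One lane: two bfloat16 products added together, then accumulated. */
static float bfdot_lane(float sum, uint32_t n, uint32_t m)
{
    float lo = bf16_bits_to_float((uint16_t)(n & 0xffff)) *
               bf16_bits_to_float((uint16_t)(m & 0xffff));
    float hi = bf16_bits_to_float((uint16_t)(n >> 16)) *
               bf16_bits_to_float((uint16_t)(m >> 16));

    return sum + (lo + hi);
}

/* Hypothetical model of the indexed form over `lanes` 32-bit lanes. */
static void bfdot_indexed(float *d, const float *a, const uint32_t *n,
                          const uint32_t *m, size_t lanes, size_t index)
{
    /* A segment is 128 bits, or the whole vector if it is shorter. */
    size_t per_seg = lanes < 4 ? lanes : 4;

    for (size_t i = 0; i < lanes; i += per_seg) {
        /* One element pair of m per segment, shared by all its lanes. */
        uint32_t m_idx = m[i + index];

        for (size_t j = i; j < i + per_seg; j++) {
            d[j] = bfdot_lane(a[j], n[j], m_idx);
        }
    }
}
```

With eight lanes and index 1, lanes 0 to 3 all multiply against m[1] and lanes 4 to 7 against m[5].
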
Signed-off-by: Richard Henderson --- target/arm/helper.h | 2 ++ target/arm/neon-shared.decode | 2 ++ target/arm/sve.decode | 3 +++ target/arm/translate-a64.c | 41 +++++++++++++++++++++++++++-------- target/arm/translate-neon.c | 9 ++++++++ target/arm/translate-sve.c | 12 ++++++++++ target/arm/vec_helper.c | 20 +++++++++++++++++ 7 files changed, 80 insertions(+), 9 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index de2f5331dc..376c1cef0f 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -1004,6 +1004,8 @@ DEF_HELPER_FLAGS_5(gvec_usmmla_b, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_bfdot, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_bfdot_idx, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) #ifdef TARGET_AARCH64 #include "helper-a64.h" diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode index 31a0839bbb..fa3cf14e3a 100644 --- a/target/arm/neon-shared.decode +++ b/target/arm/neon-shared.decode @@ -81,6 +81,8 @@ VUSDOT_scalar 1111 1110 1 . 00 .... .... 1101 . q:1 index:1 0 vm:4 \ vn=%vn_dp vd=%vd_dp VSUDOT_scalar 1111 1110 1 . 00 .... .... 1101 . q:1 index:1 1 vm:4 \ vn=%vn_dp vd=%vd_dp +VDOT_b16_scal 1111 1110 0 . 00 .... .... 1101 . q:1 index:1 0 vm:4 \ + vn=%vn_dp vd=%vd_dp %vfml_scalar_q0_rm 0:3 5:1 %vfml_scalar_q1_index 5:1 3:1 diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a7429b293f..51f87e8937 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1633,3 +1633,6 @@ FMLALB_zzxw 01100100 10 1 ..... 0100.0 ..... ..... @rrxr_3a esz=2 FMLALT_zzxw 01100100 10 1 ..... 0100.1 ..... ..... @rrxr_3a esz=2 FMLSLB_zzxw 01100100 10 1 ..... 0110.0 ..... ..... @rrxr_3a esz=2 FMLSLT_zzxw 01100100 10 1 ..... 0110.1 ..... ..... @rrxr_3a esz=2 + +### SVE2 floating-point bfloat16 dot-product (indexed) +BFDOT_zzxz 01100100 01 1 ..... 010000 ..... ..... 
@rrxr_2 esz=2 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 127c8d8e9d..1df931cef5 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -13442,8 +13442,22 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) return; } break; - case 0x0f: /* SUDOT, USDOT */ - if (is_scalar || (size & 1) || !dc_isar_feature(aa64_i8mm, s)) { + case 0x0f: + switch (size) { + case 0: /* SUDOT */ + case 2: /* USDOT */ + if (is_scalar || !dc_isar_feature(aa64_i8mm, s)) { + unallocated_encoding(s); + return; + } + break; + case 1: /* BFDOT */ + if (is_scalar || !dc_isar_feature(aa64_bf16, s)) { + unallocated_encoding(s); + return; + } + break; + default: unallocated_encoding(s); return; } @@ -13563,13 +13577,22 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b); return; - case 0x0f: /* SUDOT, USDOT */ - gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, index, - extract32(insn, 23, 1) - ? 
gen_helper_gvec_usdot_idx_b
-                         : gen_helper_gvec_sudot_idx_b);
-        return;
-
+    case 0x0f:
+        switch (extract32(insn, 22, 2)) {
+        case 0: /* SUDOT */
+            gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, index,
+                             gen_helper_gvec_sudot_idx_b);
+            return;
+        case 1: /* BFDOT */
+            gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, index,
+                             gen_helper_gvec_bfdot_idx);
+            return;
+        case 2: /* USDOT */
+            gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, index,
+                             gen_helper_gvec_usdot_idx_b);
+            return;
+        }
+        g_assert_not_reached();
     case 0x11: /* FCMLA #0 */
     case 0x13: /* FCMLA #90 */
     case 0x15: /* FCMLA #180 */
diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c
index 9460857b2a..8099767792 100644
--- a/target/arm/translate-neon.c
+++ b/target/arm/translate-neon.c
@@ -390,6 +390,15 @@ static bool trans_VSUDOT_scalar(DisasContext *s, arg_VSUDOT_scalar *a)
                         gen_helper_gvec_sudot_idx_b);
 }
 
+static bool trans_VDOT_b16_scal(DisasContext *s, arg_VDOT_b16_scal *a)
+{
+    if (!dc_isar_feature(aa32_bf16, s)) {
+        return false;
+    }
+    return do_neon_ddda(s, a->q * 6, a->vd, a->vn, a->vm, a->index,
+                        gen_helper_gvec_bfdot_idx);
+}
+
 static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a)
 {
     int opr_sz;
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index ed290827ad..6f02030635 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -8665,3 +8665,15 @@ static bool trans_BFDOT_zzzz(DisasContext *s, arg_rrrr_esz *a)
     }
     return true;
 }
+
+static bool trans_BFDOT_zzxz(DisasContext *s, arg_rrxr_esz *a)
+{
+    if (!dc_isar_feature(aa64_sve_bf16, s)) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        gen_gvec_ool_zzzz(s, gen_helper_gvec_bfdot_idx,
+                          a->rd, a->rn, a->rm, a->ra, a->index);
+    }
+    return true;
+}
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 7eefcd06ea..74a497f38c 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -2452,3 +2452,23 @@ void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va, uint32_t desc)
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
+
+void HELPER(gvec_bfdot_idx)(void *vd, void *vn, void *vm,
+                            void *va, uint32_t desc)
+{
+    intptr_t i, j, opr_sz = simd_oprsz(desc);
+    intptr_t index = simd_data(desc);
+    intptr_t elements = opr_sz / 4;
+    intptr_t eltspersegment = MIN(16 / 4, elements);
+    float32 *d = vd, *a = va;
+    uint32_t *n = vn, *m = vm;
+
+    for (i = 0; i < elements; i += eltspersegment) {
+        uint32_t m_idx = m[i + H4(index)];
+
+        for (j = i; j < i + eltspersegment; j++) {
+            d[j] = bfdotadd(a[j], n[j], m_idx);
+        }
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}

From patchwork Tue May 25 22:58:13 2021
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 12280315
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 08/12] target/arm: Implement bfloat16 matrix multiply accumulate
Date: Tue, 25 May 2021 15:58:13 -0700
Message-Id: <20210525225817.400336-9-richard.henderson@linaro.org>
In-Reply-To: <20210525225817.400336-1-richard.henderson@linaro.org>
References: <20210525225817.400336-1-richard.henderson@linaro.org>
Cc: Peter Maydell, qemu-arm@nongnu.org

This is BFMMLA for both AArch64 AdvSIMD and SVE,
and VMMLA.BF16 for AArch32 NEON.
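
For readers following the helper added below, the per-segment arithmetic can be sketched in plain C. This is only an illustration under stated assumptions, not QEMU code: the names bf16_to_f32, bfdotadd_sketch and bfmmla_segment_sketch are hypothetical, host float arithmetic stands in for the softfloat calls, the H4() host-endian index adjustment is omitted, and the special rounding and denormal handling of the real bfdotadd() is not modelled. Each 128-bit segment treats n and m as 2x4 matrices of bfloat16 and accumulates n * transpose(m) into a 2x2 matrix of float32.

```c
#include <stdint.h>
#include <string.h>

/* Widen raw bfloat16 bits to float: bf16 is the top half of a binary32. */
static float bf16_to_f32(uint16_t b)
{
    uint32_t bits = (uint32_t)b << 16;
    float f;

    memcpy(&f, &bits, sizeof(f));
    return f;
}

/*
 * Hypothetical, simplified stand-in for bfdotadd(): e1 and e2 each hold
 * a pair of bfloat16 values (low half first). Plain host arithmetic is
 * an assumption of this sketch; architectural rounding is not modelled.
 */
static float bfdotadd_sketch(float sum, uint32_t e1, uint32_t e2)
{
    float lo = bf16_to_f32((uint16_t)(e1 & 0xffff)) *
               bf16_to_f32((uint16_t)(e2 & 0xffff));
    float hi = bf16_to_f32((uint16_t)(e1 >> 16)) *
               bf16_to_f32((uint16_t)(e2 >> 16));

    return sum + (lo + hi);
}

/*
 * One segment: d[i][j] = a[i][j] + sum over k of n[i][k] * m[j][k],
 * with d and a as row-major 2x2 float matrices and n and m as row-major
 * 2x4 bfloat16 matrices stored as two 32-bit words per row.
 */
static void bfmmla_segment_sketch(float d[4], const float a[4],
                                  const uint32_t n[4], const uint32_t m[4])
{
    float r[4];

    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 2; j++) {
            float sum = a[2 * i + j];

            for (int k = 0; k < 2; k++) {
                sum = bfdotadd_sketch(sum, n[2 * i + k], m[2 * j + k]);
            }
            r[2 * i + j] = sum;
        }
    }
    /* Write back only after all inputs are consumed, so d may alias a. */
    memcpy(d, r, sizeof(r));
}
```

As a worked case: with a = {10, 20, 30, 40}, n rows (1, 2, 3, 4) and (5, 6, 7, 8), and m rows (1, 0, 0, 0) and (0, 1, 0, 0), the sketch yields d = {11, 22, 35, 46}.
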
Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper.h           |  3 +++
 target/arm/neon-shared.decode |  2 ++
 target/arm/sve.decode         |  6 +++--
 target/arm/translate-a64.c    | 10 +++++++++
 target/arm/translate-neon.c   |  9 ++++++++
 target/arm/translate-sve.c    | 12 ++++++++++
 target/arm/vec_helper.c       | 42 ++++++++++++++++++++++++++++++++++-
 7 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 376c1cef0f..af75d7f25f 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -1007,6 +1007,9 @@ DEF_HELPER_FLAGS_5(gvec_bfdot, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_bfdot_idx, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(gvec_bfmmla, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
index fa3cf14e3a..4e0a25d27c 100644
--- a/target/arm/neon-shared.decode
+++ b/target/arm/neon-shared.decode
@@ -67,6 +67,8 @@ VUMMLA 1111 1100 0.10 .... .... 1100 .1.1 .... \
                vm=%vm_dp vn=%vn_dp vd=%vd_dp
 VUSMMLA 1111 1100 1.10 .... .... 1100 .1.0 .... \
                vm=%vm_dp vn=%vn_dp vd=%vd_dp
+VMMLA_b16 1111 1100 0.00 .... .... 1100 .1.0 .... \
+               vm=%vm_dp vn=%vn_dp vd=%vd_dp
 
 VCMLA_scalar 1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \
                vn=%vn_dp vd=%vd_dp size=1
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 51f87e8937..6c17898dee 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1568,8 +1568,10 @@ SQRDCMLAH_zzzz 01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5 ra=%reg_movprfx
 USDOT_zzzz 01000100 .. 0 ..... 011 110 ..... ..... @rda_rn_rm
 
 ### SVE2 floating point matrix multiply accumulate
-
-FMMLA 01100100 .. 1 ..... 111001 ..... ..... @rda_rn_rm
+{
+  BFMMLA 01100100 01 1 ..... 111 001 ..... ..... @rda_rn_rm_e0
+  FMMLA 01100100 .. 1 ..... 111 001 ..... ..... @rda_rn_rm
+}
 
 ### SVE2 Memory Gather Load Group
 
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 1df931cef5..9f8dae90ba 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -12228,6 +12228,13 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
         }
         feature = dc_isar_feature(aa64_fcma, s);
         break;
+    case 0x1d: /* BFMMLA */
+        if (size != MO_16 || !is_q) {
+            unallocated_encoding(s);
+            return;
+        }
+        feature = dc_isar_feature(aa64_bf16, s);
+        break;
     case 0x1f: /* BFDOT */
         switch (size) {
         case 1:
@@ -12321,6 +12328,9 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
         }
         return;
 
+    case 0xd: /* BFMMLA */
+        gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, 0, gen_helper_gvec_bfmmla);
+        return;
     case 0xf: /* BFDOT */
         switch (size) {
         case 1:
diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c
index 8099767792..9d227a1e13 100644
--- a/target/arm/translate-neon.c
+++ b/target/arm/translate-neon.c
@@ -4126,3 +4126,12 @@ static bool trans_VUSMMLA(DisasContext *s, arg_VUSMMLA *a)
     return do_neon_ddda(s, 7, a->vd, a->vn, a->vm, 0,
                         gen_helper_gvec_usmmla_b);
 }
+
+static bool trans_VMMLA_b16(DisasContext *s, arg_VMMLA_b16 *a)
+{
+    if (!dc_isar_feature(aa32_bf16, s)) {
+        return false;
+    }
+    return do_neon_ddda(s, 7, a->vd, a->vn, a->vm, 0,
+                        gen_helper_gvec_bfmmla);
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 6f02030635..4f575dc334 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -8677,3 +8677,15 @@ static bool trans_BFDOT_zzxz(DisasContext *s, arg_rrxr_esz *a)
     }
     return true;
 }
+
+static bool trans_BFMMLA(DisasContext *s, arg_rrrr_esz *a)
+{
+    if (!dc_isar_feature(aa64_sve_bf16, s)) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        gen_gvec_ool_zzzz(s, gen_helper_gvec_bfmmla,
+                          a->rd, a->rn, a->rm, a->ra, 0);
+    }
+    return true;
+}
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 74a497f38c..27e9bdd329 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -2385,7 +2385,7 @@ static void do_mmla_b(void *vd, void *vn, void *vm, void *va, uint32_t desc,
          * Process the entire segment at once, writing back the
          * results only after we've consumed all of the inputs.
          *
-         * Key to indicies by column:
+         * Key to indices by column:
          * i j i j
          */
         sum0 = a[H4(0 + 0)];
@@ -2472,3 +2472,43 @@ void HELPER(gvec_bfdot_idx)(void *vd, void *vn, void *vm,
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
+
+void HELPER(gvec_bfmmla)(void *vd, void *vn, void *vm, void *va, uint32_t desc)
+{
+    intptr_t s, opr_sz = simd_oprsz(desc);
+    float32 *d = vd, *a = va;
+    uint32_t *n = vn, *m = vm;
+
+    for (s = 0; s < opr_sz / 4; s += 4) {
+        float32 sum00, sum01, sum10, sum11;
+
+        /*
+         * Process the entire segment at once, writing back the
+         * results only after we've consumed all of the inputs.
+         *
+         * Key to indices by column:
+         * i j i k j k
+         */
+        sum00 = a[s + H4(0 + 0)];
+        sum00 = bfdotadd(sum00, n[s + H4(0 + 0)], m[s + H4(0 + 0)]);
+        sum00 = bfdotadd(sum00, n[s + H4(0 + 1)], m[s + H4(0 + 1)]);
+
+        sum01 = a[s + H4(0 + 1)];
+        sum01 = bfdotadd(sum01, n[s + H4(0 + 0)], m[s + H4(2 + 0)]);
+        sum01 = bfdotadd(sum01, n[s + H4(0 + 1)], m[s + H4(2 + 1)]);
+
+        sum10 = a[s + H4(2 + 0)];
+        sum10 = bfdotadd(sum10, n[s + H4(2 + 0)], m[s + H4(0 + 0)]);
+        sum10 = bfdotadd(sum10, n[s + H4(2 + 1)], m[s + H4(0 + 1)]);
+
+        sum11 = a[s + H4(2 + 1)];
+        sum11 = bfdotadd(sum11, n[s + H4(2 + 0)], m[s + H4(2 + 0)]);
+        sum11 = bfdotadd(sum11, n[s + H4(2 + 1)], m[s + H4(2 + 1)]);
+
+        d[s + H4(0 + 0)] = sum00;
+        d[s + H4(0 + 1)] = sum01;
+        d[s + H4(2 + 0)] = sum10;
+        d[s + H4(2 + 1)] = sum11;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}

From patchwork Tue May 25 22:58:14 2021
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 12280317
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 09/12] target/arm: Implement bfloat widening fma (vector)
Date: Tue, 25 May 2021 15:58:14 -0700
Message-Id: <20210525225817.400336-10-richard.henderson@linaro.org>
In-Reply-To: <20210525225817.400336-1-richard.henderson@linaro.org>
References: <20210525225817.400336-1-richard.henderson@linaro.org>
Cc: Peter Maydell, qemu-arm@nongnu.org

This is the vector form of BFMLAL{B,T} for both AArch64 AdvSIMD and SVE,
and of VFMA{B,T}.BF16 for AArch32 NEON.
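
The widening step in the helper below can be sketched in plain C. The sketch assumes a little-endian host, so the H2()/H4() index adjustments disappear, and it uses an ordinary host multiply and add rather than float32_muladd() with an fp status pointer, so rounding and exception flags are not modelled. The names bf16_to_f32 and bfmlal_sketch are hypothetical. Each 32-bit lane reads either the even ("bottom", sel = 0) or odd ("top", sel = 1) bfloat16 of the corresponding pair in n and m, widens both to float32 by shifting left 16 bits, and accumulates the product.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Widen raw bfloat16 bits to float: bf16 is the top half of a binary32. */
static float bf16_to_f32(uint16_t b)
{
    uint32_t bits = (uint32_t)b << 16;
    float f;

    memcpy(&f, &bits, sizeof(f));
    return f;
}

/*
 * Hypothetical sketch of the vector form:
 *   d[i] = a[i] + n[2 * i + sel] * m[2 * i + sel]
 * for each of the `lanes` float32 lanes. n and m hold 2 * lanes raw
 * bfloat16 values. The multiply-add here is not guaranteed to be fused,
 * unlike the softfloat call in the real helper.
 */
static void bfmlal_sketch(float *d, const float *a,
                          const uint16_t *n, const uint16_t *m,
                          size_t lanes, int sel)
{
    for (size_t i = 0; i < lanes; i++) {
        float nn = bf16_to_f32(n[2 * i + sel]);
        float mm = bf16_to_f32(m[2 * i + sel]);

        d[i] = a[i] + nn * mm;
    }
}
```

For example, with n holding (1, 2, 3, 4), m holding (2, 2, 2, 2) and a = {10, 20}, the bottom form gives {12, 26} and the top form gives {14, 28}.
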
Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper.h           |  3 +++
 target/arm/neon-shared.decode |  3 +++
 target/arm/sve.decode         |  3 +++
 target/arm/translate-a64.c    | 13 +++++++++----
 target/arm/translate-neon.c   |  9 +++++++++
 target/arm/translate-sve.c    | 30 ++++++++++++++++++++++++++++++
 target/arm/vec_helper.c       | 16 ++++++++++++++++
 7 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index af75d7f25f..36b3c9dd2d 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -1010,6 +1010,9 @@ DEF_HELPER_FLAGS_5(gvec_bfdot_idx, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_bfmmla, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_6(gvec_bfmlal, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode
index 4e0a25d27c..b61addd98b 100644
--- a/target/arm/neon-shared.decode
+++ b/target/arm/neon-shared.decode
@@ -70,6 +70,9 @@ VUSMMLA 1111 1100 1.10 .... .... 1100 .1.0 .... \
 VMMLA_b16 1111 1100 0.00 .... .... 1100 .1.0 .... \
                vm=%vm_dp vn=%vn_dp vd=%vd_dp
 
+VFMA_b16 1111 110 0 0.11 .... .... 1000 . q:1 . 1 .... \
+               vm=%vm_dp vn=%vn_dp vd=%vd_dp
+
 VCMLA_scalar 1111 1110 0 . rot:2 .... .... 1000 . q:1 index:1 0 vm:4 \
                vn=%vn_dp vd=%vd_dp size=1
 VCMLA_scalar 1111 1110 1 . rot:2 .... .... 1000 . q:1 . 0 .... \
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 6c17898dee..5281164eea 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1627,6 +1627,9 @@ FMLALT_zzzw 01100100 10 1 ..... 10 0 00 1 ..... ..... @rda_rn_rm_e0
 FMLSLB_zzzw 01100100 10 1 ..... 10 1 00 0 ..... ..... @rda_rn_rm_e0
 FMLSLT_zzzw 01100100 10 1 ..... 10 1 00 1 ..... ..... @rda_rn_rm_e0
 
+BFMLALB_zzzw 01100100 11 1 ..... 10 0 00 0 ..... ..... @rda_rn_rm_e0
+BFMLALT_zzzw 01100100 11 1 ..... 10 0 00 1 ..... ..... @rda_rn_rm_e0
+
 ### SVE2 floating-point bfloat16 dot-product
 BFDOT_zzzz 01100100 01 1 ..... 10 0 00 0 ..... ..... @rda_rn_rm_e0
 
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 9f8dae90ba..2a99e015ca 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -12235,9 +12235,10 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
         }
         feature = dc_isar_feature(aa64_bf16, s);
         break;
-    case 0x1f: /* BFDOT */
+    case 0x1f:
         switch (size) {
-        case 1:
+        case 1: /* BFDOT */
+        case 3: /* BFMLAL{B,T} */
             feature = dc_isar_feature(aa64_bf16, s);
             break;
         default:
@@ -12331,11 +12332,15 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
     case 0xd: /* BFMMLA */
         gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, 0, gen_helper_gvec_bfmmla);
         return;
-    case 0xf: /* BFDOT */
+    case 0xf:
         switch (size) {
-        case 1:
+        case 1: /* BFDOT */
             gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, 0, gen_helper_gvec_bfdot);
             break;
+        case 3: /* BFMLAL{B,T} */
+            gen_gvec_op4_fpst(s, 1, rd, rn, rm, rd, false, is_q,
+                              gen_helper_gvec_bfmlal);
+            break;
         default:
             g_assert_not_reached();
         }
diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c
index 9d227a1e13..4d0c2494dc 100644
--- a/target/arm/translate-neon.c
+++ b/target/arm/translate-neon.c
@@ -4135,3 +4135,12 @@ static bool trans_VMMLA_b16(DisasContext *s, arg_VMMLA_b16 *a)
     return do_neon_ddda(s, 7, a->vd, a->vn, a->vm, 0,
                         gen_helper_gvec_bfmmla);
 }
+
+static bool trans_VFMA_b16(DisasContext *s, arg_VFMA_b16 *a)
+{
+    if (!dc_isar_feature(aa32_bf16, s)) {
+        return false;
+    }
+    return do_neon_ddda_fpst(s, 7, a->vd, a->vn, a->vm, a->q, FPST_STD,
+                             gen_helper_gvec_bfmlal);
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 4f575dc334..ba8f5d7b7d 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -8689,3 +8689,33 @@ static bool trans_BFMMLA(DisasContext *s, arg_rrrr_esz *a)
     }
     return true;
 }
+
+static bool do_BFMLAL_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel)
+{
+    if (!dc_isar_feature(aa64_sve_bf16, s)) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        TCGv_ptr status = fpstatus_ptr(FPST_FPCR);
+        unsigned vsz = vec_full_reg_size(s);
+
+        tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn),
+                           vec_full_reg_offset(s, a->rm),
+                           vec_full_reg_offset(s, a->ra),
+                           status, vsz, vsz, sel,
+                           gen_helper_gvec_bfmlal);
+        tcg_temp_free_ptr(status);
+    }
+    return true;
+}
+
+static bool trans_BFMLALB_zzzw(DisasContext *s, arg_rrrr_esz *a)
+{
+    return do_BFMLAL_zzzw(s, a, false);
+}
+
+static bool trans_BFMLALT_zzzw(DisasContext *s, arg_rrrr_esz *a)
+{
+    return do_BFMLAL_zzzw(s, a, true);
+}
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 27e9bdd329..d82736b5e6 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -2512,3 +2512,19 @@ void HELPER(gvec_bfmmla)(void *vd, void *vn, void *vm, void *va, uint32_t desc)
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
+
+void HELPER(gvec_bfmlal)(void *vd, void *vn, void *vm, void *va,
+                         void *stat, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    intptr_t sel = simd_data(desc);
+    float32 *d = vd, *a = va;
+    bfloat16 *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 4; ++i) {
+        float32 nn = n[H2(i * 2 + sel)] << 16;
+        float32 mm = m[H2(i * 2 + sel)] << 16;
+        d[H4(i)] = float32_muladd(nn, mm, a[H4(i)], 0, stat);
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}

From patchwork Tue May 25 22:58:15 2021
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 12280343
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 10/12] target/arm: Implement bfloat widening fma (indexed)
Date: Tue, 25 May 2021 15:58:15 -0700
Message-Id: <20210525225817.400336-11-richard.henderson@linaro.org>
In-Reply-To: <20210525225817.400336-1-richard.henderson@linaro.org>
References: <20210525225817.400336-1-richard.henderson@linaro.org>
Cc: Peter Maydell, qemu-arm@nongnu.org

This is the indexed (by-element) form of BFMLAL{B,T} for both AArch64
AdvSIMD and SVE, and of VFMA{B,T}.BF16 for AArch32 NEON.
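
The indexed form differs from the vector form only in how the m operand is chosen: one bfloat16 element is selected per 128-bit segment and reused for every lane of that segment. A minimal sketch in plain C follows, with the same caveats as any host-side illustration: little-endian host assumed (no H2()/H4()), host multiply and add instead of float32_muladd(), and hypothetical names bf16_to_f32 and bfmlal_idx_sketch. The sketch also assumes the lane count is either below 4 or a multiple of 4, which mirrors the 8-byte and 16-byte-multiple operand sizes the helper is called with.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Widen raw bfloat16 bits to float: bf16 is the top half of a binary32. */
static float bf16_to_f32(uint16_t b)
{
    uint32_t bits = (uint32_t)b << 16;
    float f;

    memcpy(&f, &bits, sizeof(f));
    return f;
}

/*
 * Hypothetical sketch of the indexed form. A segment is up to four
 * float32 lanes (eight bfloat16 inputs). Within each segment, the
 * bfloat16 at position `index` (0..7) of m is broadcast to all lanes;
 * sel picks the even (0) or odd (1) bfloat16 of each pair in n.
 */
static void bfmlal_idx_sketch(float *d, const float *a,
                              const uint16_t *n, const uint16_t *m,
                              size_t lanes, int sel, int index)
{
    size_t per_seg = lanes < 4 ? lanes : 4;

    for (size_t i = 0; i < lanes; i += per_seg) {
        /* Read the broadcast element before writing this segment. */
        float m_idx = bf16_to_f32(m[2 * i + index]);

        for (size_t j = i; j < i + per_seg; j++) {
            float n_j = bf16_to_f32(n[2 * j + sel]);

            d[j] = a[j] + n_j * m_idx;
        }
    }
}
```

Packing sel and index into one immediate, as the translators do with (index << 1) | sel, keeps the helper signature identical to the vector form; the sketch passes them separately only for readability.
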
Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 2 ++ target/arm/neon-shared.decode | 2 ++ target/arm/sve.decode | 2 ++ target/arm/translate-a64.c | 15 ++++++++++++++- target/arm/translate-neon.c | 10 ++++++++++ target/arm/translate-sve.c | 30 ++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 22 ++++++++++++++++++++++ 7 files changed, 82 insertions(+), 1 deletion(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 36b3c9dd2d..dc6eb96d43 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -1012,6 +1012,8 @@ DEF_HELPER_FLAGS_5(gvec_bfmmla, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(gvec_bfmlal, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_bfmlal_idx, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) #ifdef TARGET_AARCH64 #include "helper-a64.h" diff --git a/target/arm/neon-shared.decode b/target/arm/neon-shared.decode index b61addd98b..df80e6ebf6 100644 --- a/target/arm/neon-shared.decode +++ b/target/arm/neon-shared.decode @@ -95,3 +95,5 @@ VFML_scalar 1111 1110 0 . 0 s:1 .... .... 1000 . 0 . 1 index:1 ... \ rm=%vfml_scalar_q0_rm vn=%vn_sp vd=%vd_dp q=0 VFML_scalar 1111 1110 0 . 0 s:1 .... .... 1000 . 1 . 1 . rm:3 \ index=%vfml_scalar_q1_index vn=%vn_dp vd=%vd_dp q=1 +VFMA_b16_scal 1111 1110 0.11 .... .... 1000 . q:1 . 1 . vm:3 \ + index=%vfml_scalar_q1_index vn=%vn_dp vd=%vd_dp diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5281164eea..a62c169f1a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1638,6 +1638,8 @@ FMLALB_zzxw 01100100 10 1 ..... 0100.0 ..... ..... @rrxr_3a esz=2 FMLALT_zzxw 01100100 10 1 ..... 0100.1 ..... ..... @rrxr_3a esz=2 FMLSLB_zzxw 01100100 10 1 ..... 0110.0 ..... ..... @rrxr_3a esz=2 FMLSLT_zzxw 01100100 10 1 ..... 0110.1 ..... ..... @rrxr_3a esz=2 +BFMLALB_zzxw 01100100 11 1 ..... 0100.0 ..... ..... @rrxr_3a esz=2 +BFMLALT_zzxw 01100100 11 1 ..... 0100.1 ..... ..... 
@rrxr_3a esz=2 ### SVE2 floating-point bfloat16 dot-product (indexed) BFDOT_zzxz 01100100 01 1 ..... 010000 ..... ..... @rrxr_2 esz=2 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 2a99e015ca..d1dc9401d5 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -13465,18 +13465,27 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) unallocated_encoding(s); return; } + size = MO_32; break; case 1: /* BFDOT */ if (is_scalar || !dc_isar_feature(aa64_bf16, s)) { unallocated_encoding(s); return; } + size = MO_32; + break; + case 3: /* BFMLAL{B,T} */ + if (is_scalar || !dc_isar_feature(aa64_bf16, s)) { + unallocated_encoding(s); + return; + } + /* can't set is_fp without other incorrect size checks */ + size = MO_16; break; default: unallocated_encoding(s); return; } - size = MO_32; break; case 0x11: /* FCMLA #0 */ case 0x13: /* FCMLA #90 */ @@ -13606,6 +13615,10 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, index, gen_helper_gvec_usdot_idx_b); return; + case 3: /* BFMLAL{B,T} */ + gen_gvec_op4_fpst(s, 1, rd, rn, rm, rd, 0, (index << 1) | is_q, + gen_helper_gvec_bfmlal_idx); + return; } g_assert_not_reached(); case 0x11: /* FCMLA #0 */ diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c index 4d0c2494dc..633fef3bf7 100644 --- a/target/arm/translate-neon.c +++ b/target/arm/translate-neon.c @@ -4144,3 +4144,13 @@ static bool trans_VFMA_b16(DisasContext *s, arg_VFMA_b16 *a) return do_neon_ddda_fpst(s, 7, a->vd, a->vn, a->vm, a->q, FPST_STD, gen_helper_gvec_bfmlal); } + +static bool trans_VFMA_b16_scal(DisasContext *s, arg_VFMA_b16_scal *a) +{ + if (!dc_isar_feature(aa32_bf16, s)) { + return false; + } + return do_neon_ddda_fpst(s, 6, a->vd, a->vn, a->vm, + (a->index << 1) | a->q, FPST_STD, + gen_helper_gvec_bfmlal_idx); +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ba8f5d7b7d..46210eb696 100644 --- 
a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8719,3 +8719,33 @@ static bool trans_BFMLALT_zzzw(DisasContext *s, arg_rrrr_esz *a) { return do_BFMLAL_zzzw(s, a, true); } + +static bool do_BFMLAL_zzxw(DisasContext *s, arg_rrxr_esz *a, bool sel) +{ + if (!dc_isar_feature(aa64_sve_bf16, s)) { + return false; + } + if (sve_access_check(s)) { + TCGv_ptr status = fpstatus_ptr(FPST_FPCR); + unsigned vsz = vec_full_reg_size(s); + + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + status, vsz, vsz, (a->index << 1) | sel, + gen_helper_gvec_bfmlal_idx); + tcg_temp_free_ptr(status); + } + return true; +} + +static bool trans_BFMLALB_zzxw(DisasContext *s, arg_rrxr_esz *a) +{ + return do_BFMLAL_zzxw(s, a, false); +} + +static bool trans_BFMLALT_zzxw(DisasContext *s, arg_rrxr_esz *a) +{ + return do_BFMLAL_zzxw(s, a, true); +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index d82736b5e6..5862f187cd 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -2528,3 +2528,25 @@ void HELPER(gvec_bfmlal)(void *vd, void *vn, void *vm, void *va, } clear_tail(d, opr_sz, simd_maxsz(desc)); } + +void HELPER(gvec_bfmlal_idx)(void *vd, void *vn, void *vm, + void *va, void *stat, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + intptr_t sel = extract32(desc, SIMD_DATA_SHIFT, 1); + intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 1, 3); + intptr_t elements = opr_sz / 4; + intptr_t eltspersegment = MIN(16 / 4, elements); + float32 *d = vd, *a = va; + bfloat16 *n = vn, *m = vm; + + for (i = 0; i < elements; i += eltspersegment) { + float32 m_idx = m[H2(2 * i + index)] << 16; + + for (j = i; j < i + eltspersegment; j++) { + float32 n_j = n[H2(2 * j + sel)] << 16; + d[H4(j)] = float32_muladd(n_j, m_idx, a[H4(j)], 0, stat); + } + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} From patchwork Tue May 25 22:58:16 2021 Content-Type: 
text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 12280345
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 11/12] linux-user/aarch64: Enable hwcap bits for bfloat16
Date: Tue, 25 May 2021 15:58:16 -0700
Message-Id: <20210525225817.400336-12-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210525225817.400336-1-richard.henderson@linaro.org>
References: <20210525225817.400336-1-richard.henderson@linaro.org>
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 linux-user/elfload.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 1ab97e38e0..17ab06f612 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -659,7 +659,9 @@ static uint32_t get_elf_hwcap2(void)
     GET_FEATURE_ID(aa64_sve_i8mm, ARM_HWCAP2_A64_SVEI8MM);
     GET_FEATURE_ID(aa64_sve_f32mm, ARM_HWCAP2_A64_SVEF32MM);
     GET_FEATURE_ID(aa64_sve_f64mm, ARM_HWCAP2_A64_SVEF64MM);
+    GET_FEATURE_ID(aa64_sve_bf16, ARM_HWCAP2_A64_SVEBF16);
     GET_FEATURE_ID(aa64_i8mm, ARM_HWCAP2_A64_I8MM);
+    GET_FEATURE_ID(aa64_bf16, ARM_HWCAP2_A64_BF16);
     GET_FEATURE_ID(aa64_rndr, ARM_HWCAP2_A64_RNG);
     GET_FEATURE_ID(aa64_bti, ARM_HWCAP2_A64_BTI);
     GET_FEATURE_ID(aa64_mte, ARM_HWCAP2_A64_MTE);

From patchwork Tue May 25 22:58:17 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 12280341
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 12/12] target/arm: Enable BFloat16 extensions
Date: Tue, 25 May 2021 15:58:17 -0700
Message-Id: <20210525225817.400336-13-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210525225817.400336-1-richard.henderson@linaro.org>
References: <20210525225817.400336-1-richard.henderson@linaro.org>
Cc: qemu-arm@nongnu.org

Disable BF16 again for !have_neon and !have_vfp during realize.

Signed-off-by: Richard Henderson
---
 target/arm/cpu.c     | 3 +++
 target/arm/cpu64.c   | 3 +++
 target/arm/cpu_tcg.c | 1 +
 3 files changed, 7 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 7aeb4b1381..cfc03c550b 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1463,6 +1463,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
 
         u = cpu->isar.id_isar6;
         u = FIELD_DP32(u, ID_ISAR6, JSCVT, 0);
+        u = FIELD_DP32(u, ID_ISAR6, BF16, 0);
         cpu->isar.id_isar6 = u;
 
         u = cpu->isar.mvfr0;
@@ -1503,6 +1504,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
 
         t = cpu->isar.id_aa64isar1;
         t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 0);
+        t = FIELD_DP64(t, ID_AA64ISAR1, BF16, 0);
         t = FIELD_DP64(t, ID_AA64ISAR1, I8MM, 0);
         cpu->isar.id_aa64isar1 = t;
 
@@ -1518,6 +1520,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         u = cpu->isar.id_isar6;
         u = FIELD_DP32(u, ID_ISAR6, DP, 0);
         u = FIELD_DP32(u, ID_ISAR6, FHM, 0);
+        u = FIELD_DP32(u, ID_ISAR6, BF16, 0);
         u = FIELD_DP32(u, ID_ISAR6, I8MM, 0);
         cpu->isar.id_isar6 = u;
 
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index d561dc7acc..1c23187d1a 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -661,6 +661,7 @@ static void aarch64_max_initfn(Object *obj)
         t = FIELD_DP64(t, ID_AA64ISAR1, FCMA, 1);
         t = FIELD_DP64(t, ID_AA64ISAR1, SB, 1);
         t = FIELD_DP64(t, ID_AA64ISAR1, SPECRES, 1);
+        t = FIELD_DP64(t, ID_AA64ISAR1, BF16, 1);
         t = FIELD_DP64(t, ID_AA64ISAR1, FRINTTS, 1);
         t = FIELD_DP64(t, ID_AA64ISAR1, LRCPC, 2); /* ARMv8.4-RCPC */
         t = FIELD_DP64(t, ID_AA64ISAR1, I8MM, 1);
@@ -708,6 +709,7 @@ static void aarch64_max_initfn(Object *obj)
         t = FIELD_DP64(t, ID_AA64ZFR0, SVEVER, 1);
         t = FIELD_DP64(t, ID_AA64ZFR0, AES, 2);  /* PMULL */
         t = FIELD_DP64(t, ID_AA64ZFR0, BITPERM, 1);
+        t = FIELD_DP64(t, ID_AA64ZFR0, BFLOAT16, 1);
         t = FIELD_DP64(t, ID_AA64ZFR0, SHA3, 1);
         t = FIELD_DP64(t, ID_AA64ZFR0, SM4, 1);
         t = FIELD_DP64(t, ID_AA64ZFR0, I8MM, 1);
@@ -731,6 +733,7 @@ static void aarch64_max_initfn(Object *obj)
         u = FIELD_DP32(u, ID_ISAR6, FHM, 1);
         u = FIELD_DP32(u, ID_ISAR6, SB, 1);
         u = FIELD_DP32(u, ID_ISAR6, SPECRES, 1);
+        u = FIELD_DP32(u, ID_ISAR6, BF16, 1);
         u = FIELD_DP32(u, ID_ISAR6, I8MM, 1);
         cpu->isar.id_isar6 = u;
 
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index d3458335ed..c8a12bc2d6 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -968,6 +968,7 @@ static void arm_max_initfn(Object *obj)
         t = FIELD_DP32(t, ID_ISAR6, FHM, 1);
         t = FIELD_DP32(t, ID_ISAR6, SB, 1);
         t = FIELD_DP32(t, ID_ISAR6, SPECRES, 1);
+        t = FIELD_DP32(t, ID_ISAR6, BF16, 1);
         t = FIELD_DP32(t, ID_ISAR6, I8MM, 1);
         cpu->isar.id_isar6 = t;
 