From patchwork Thu Jan 11 15:17:39 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13517518
Subject: [PATCH 1/8] x86/CPUID: enable AVX10 leaf
From: Jan Beulich <jbeulich@suse.com>
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
Date: Thu, 11 Jan 2024 16:17:39 +0100
Message-ID: <253a35dd-d6e3-46fd-b629-999c88a4b88f@suse.com>
References: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>
In-Reply-To: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>

This requires bumping the number of basic leaves we support. Apart from
this the logic is modeled as closely as possible after that of leaf 7
handling.

Signed-off-by: Jan Beulich
---
The gen-cpuid.py adjustment is merely the minimum needed. It's not
really clear to me whether someone turning off e.g. AVX512BW might then
also validly expect AVX10 to be turned off.

Spec version 2 leaves unclear what the xstate components are which would
need enabling for AVX10/256. recalculate_{xstate,misc}() are therefore
conservative for now.

Do we want to synthesize AVX10 in the policy when all necessary AVX512*
features are available, thus allowing migration from an AVX10 host to a
suitable non-AVX10 one?

--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -230,7 +230,7 @@ static const char *const str_7d1[32] =
     [14] = "prefetchi", [15] = "user-msr",
 
-    [18] = "cet-sss",
+    [18] = "cet-sss", [19] = "avx10",
 };
 
 static const char *const str_7d2[32] =
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -221,7 +221,7 @@ static void recalculate_xstate(struct cp
                       xstate_sizes[X86_XCR0_BNDCSR_POS]);
     }
 
-    if ( p->feat.avx512f )
+    if ( p->feat.avx512f || (p->feat.avx10 && p->avx10.vsz512) )
     {
         xstates |= X86_XCR0_OPMASK | X86_XCR0_ZMM | X86_XCR0_HI_ZMM;
         xstate_size = max(xstate_size,
@@ -283,6 +283,16 @@ static void recalculate_misc(struct cpu_
 
     p->basic.raw[0xc] = EMPTY_LEAF;
 
+    zero_leaves(p->basic.raw, 0xe, 0x23);
+
+    p->avx10.raw[0].b &= 0x000700ff;
+    p->avx10.raw[0].c = p->avx10.raw[0].d = 0;
+    if ( !p->feat.avx10 || !p->avx10.version || !p->avx10.vsz512 )
+    {
+        p->feat.avx10 = false;
+        memset(p->avx10.raw, 0, sizeof(p->avx10.raw));
+    }
+
     p->extd.e1d &= ~CPUID_COMMON_1D_FEATURES;
 
     /* Most of Power/RAS hidden from guests. */
@@ -800,6 +810,7 @@ void recalculate_cpuid_policy(struct dom
 
     p->basic.max_leaf = min(p->basic.max_leaf, max->basic.max_leaf);
     p->feat.max_subleaf = min(p->feat.max_subleaf, max->feat.max_subleaf);
+    p->avx10.max_subleaf = min(p->avx10.max_subleaf, max->avx10.max_subleaf);
     p->extd.max_leaf = 0x80000000U | min(p->extd.max_leaf & 0xffff,
                                          ((p->x86_vendor & (X86_VENDOR_AMD |
                                                             X86_VENDOR_HYGON))
@@ -854,6 +865,8 @@ void recalculate_cpuid_policy(struct dom
 
     if ( p->basic.max_leaf < XSTATE_CPUID )
         __clear_bit(X86_FEATURE_XSAVE, fs);
+    if ( p->basic.max_leaf < 0x24 )
+        __clear_bit(X86_FEATURE_AVX10, fs);
 
     sanitise_featureset(fs);
@@ -967,6 +980,8 @@ static void __init __maybe_unused build_
                  sizeof(raw_cpu_policy.feat.raw));
     BUILD_BUG_ON(sizeof(raw_cpu_policy.xstate) !=
                  sizeof(raw_cpu_policy.xstate.raw));
+    BUILD_BUG_ON(sizeof(raw_cpu_policy.avx10) !=
+                 sizeof(raw_cpu_policy.avx10.raw));
     BUILD_BUG_ON(sizeof(raw_cpu_policy.extd) !=
                  sizeof(raw_cpu_policy.extd.raw));
 }
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -87,6 +87,15 @@ void guest_cpuid(const struct vcpu *v, u
         *res = array_access_nospec(p->xstate.raw, subleaf);
         break;
 
+    case 0x24:
+        ASSERT(p->avx10.max_subleaf < ARRAY_SIZE(p->avx10.raw));
+        if ( subleaf > min_t(uint32_t, p->avx10.max_subleaf,
+                             ARRAY_SIZE(p->avx10.raw) - 1) )
+            return;
+
+        *res = array_access_nospec(p->avx10.raw, subleaf);
+        break;
+
     default:
         *res = array_access_nospec(p->basic.raw, leaf);
         break;
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -321,6 +321,7 @@ XEN_CPUFEATURE(AVX_VNNI_INT16, 15*32
 XEN_CPUFEATURE(PREFETCHI, 15*32+14) /*A PREFETCHIT{0,1} Instructions */
 XEN_CPUFEATURE(USER_MSR, 15*32+15) /*s U{RD,WR}MSR Instructions */
 XEN_CPUFEATURE(CET_SSS, 15*32+18) /* CET Supervisor Shadow Stacks safe to use */
+XEN_CPUFEATURE(AVX10, 15*32+19) /* AVX10 Converged Vector ISA */
 
 /* Intel-defined CPU features, MSR_ARCH_CAPS 0x10a.eax, word 16 */
 XEN_CPUFEATURE(RDCL_NO, 16*32+ 0) /*A No Rogue Data Cache Load (Meltdown) */
--- a/xen/include/xen/lib/x86/cpu-policy.h
+++ b/xen/include/xen/lib/x86/cpu-policy.h
@@ -85,11 +85,12 @@ unsigned int x86_cpuid_lookup_vendor(uin
  */
 const char *x86_cpuid_vendor_to_str(unsigned int vendor);
 
-#define CPUID_GUEST_NR_BASIC      (0xdu + 1)
+#define CPUID_GUEST_NR_BASIC      (0x24u + 1)
 #define CPUID_GUEST_NR_CACHE      (5u + 1)
 #define CPUID_GUEST_NR_FEAT       (2u + 1)
 #define CPUID_GUEST_NR_TOPO       (1u + 1)
 #define CPUID_GUEST_NR_XSTATE     (62u + 1)
+#define CPUID_GUEST_NR_AVX10      (0u + 1)
 #define CPUID_GUEST_NR_EXTD_INTEL (0x8u + 1)
 #define CPUID_GUEST_NR_EXTD_AMD   (0x21u + 1)
 #define CPUID_GUEST_NR_EXTD       MAX(CPUID_GUEST_NR_EXTD_INTEL, \
@@ -255,6 +256,19 @@ struct cpu_policy
         } comp[CPUID_GUEST_NR_XSTATE];
     } xstate;
 
+    /* Structured AVX10 information leaf: 0x000000024[xx] */
+    union {
+        struct cpuid_leaf raw[CPUID_GUEST_NR_AVX10];
+        struct {
+            /* Subleaf 0. */
+            uint32_t max_subleaf;
+            uint32_t version:8, :8;
+            bool vsz128:1, vsz256:1, vsz512:1;
+            uint32_t :13;
+            uint32_t /* c */:32, /* d */:32;
+        };
+    } avx10;
+
     /* Extended leaves: 0x800000xx */
     union {
         struct cpuid_leaf raw[CPUID_GUEST_NR_EXTD];
--- a/xen/lib/x86/cpuid.c
+++ b/xen/lib/x86/cpuid.c
@@ -123,6 +123,7 @@ void x86_cpu_policy_fill_native(struct c
         switch ( i )
         {
         case 0x4: case 0x7: case 0xb: case 0xd:
+        case 0x24:
             /* Multi-invocation leaves. Deferred. */
             continue;
         }
@@ -216,6 +217,15 @@ void x86_cpu_policy_fill_native(struct c
         }
     }
 
+    if ( p->basic.max_leaf >= 0x24 )
+    {
+        cpuid_count_leaf(0x24, 0, &p->avx10.raw[0]);
+
+        for ( i = 1; i <= MIN(p->avx10.max_subleaf,
+                              ARRAY_SIZE(p->avx10.raw) - 1); ++i )
+            cpuid_count_leaf(0x24, i, &p->avx10.raw[i]);
+    }
+
     /* Extended leaves. */
     cpuid_leaf(0x80000000U, &p->extd.raw[0]);
     for ( i = 1; i <= MIN(p->extd.max_leaf & 0xffffU,
@@ -285,6 +295,9 @@ void x86_cpu_policy_clear_out_of_range_l
                     ARRAY_SIZE(p->xstate.raw) - 1);
     }
 
+    if ( p->basic.max_leaf < 0x24 )
+        memset(p->avx10.raw, 0, sizeof(p->avx10.raw));
+
     zero_leaves(p->extd.raw,
                 ((p->extd.max_leaf >> 16) == 0x8000
                  ? (p->extd.max_leaf & 0xffff) + 1 : 0),
@@ -297,6 +310,8 @@ void __init x86_cpu_policy_bound_max_lea
         min_t(uint32_t, p->basic.max_leaf, ARRAY_SIZE(p->basic.raw) - 1);
     p->feat.max_subleaf =
         min_t(uint32_t, p->feat.max_subleaf, ARRAY_SIZE(p->feat.raw) - 1);
+    p->avx10.max_subleaf =
+        min_t(uint32_t, p->avx10.max_subleaf, ARRAY_SIZE(p->avx10.raw) - 1);
     p->extd.max_leaf = 0x80000000U | min_t(uint32_t, p->extd.max_leaf & 0xffff,
                                            ARRAY_SIZE(p->extd.raw) - 1);
 }
@@ -324,6 +339,8 @@ void x86_cpu_policy_shrink_max_leaves(st
      */
     p->basic.raw[0xd] = p->xstate.raw[0];
 
+    p->basic.raw[0x24] = p->avx10.raw[0];
+
     for ( i = p->basic.max_leaf; i; --i )
         if ( p->basic.raw[i].a | p->basic.raw[i].b |
              p->basic.raw[i].c | p->basic.raw[i].d )
@@ -457,6 +474,13 @@ int x86_cpuid_copy_to_buffer(const struc
             break;
         }
 
+        case 0x24:
+            for ( subleaf = 0;
+                  subleaf <= MIN(p->avx10.max_subleaf,
+                                 ARRAY_SIZE(p->avx10.raw) - 1); ++subleaf )
+                COPY_LEAF(leaf, subleaf, &p->avx10.raw[subleaf]);
+            break;
+
         default:
             COPY_LEAF(leaf, XEN_CPUID_NO_SUBLEAF, &p->basic.raw[leaf]);
             break;
@@ -549,6 +573,13 @@ int x86_cpuid_copy_from_buffer(struct cp
             array_access_nospec(p->xstate.raw, data.subleaf) = l;
             break;
 
+        case 0x24:
+            if ( data.subleaf >= ARRAY_SIZE(p->avx10.raw) )
+                goto out_of_range;
+
+            array_access_nospec(p->avx10.raw, data.subleaf) = l;
+            break;
+
         default:
             if ( data.subleaf != XEN_CPUID_NO_SUBLEAF )
                 goto out_of_range;
--- a/xen/lib/x86/policy.c
+++ b/xen/lib/x86/policy.c
@@ -21,6 +21,14 @@ int x86_cpu_policies_are_compatible(cons
     if ( guest->feat.max_subleaf > host->feat.max_subleaf )
         FAIL_CPUID(7, 0);
 
+    if ( guest->avx10.version > host->avx10.version ||
+         (guest->avx10.vsz512
+          ? !host->avx10.vsz512
+          : guest->avx10.vsz256
+            ? !host->avx10.vsz256
+            : guest->avx10.vsz128 && !host->avx10.vsz128 ) )
+        FAIL_CPUID(0x24, 0);
+
     if ( guest->extd.max_leaf > host->extd.max_leaf )
         FAIL_CPUID(0x80000000U, NA);
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -286,7 +286,7 @@ def crunch_numbers(state):
         # enabled.  Certain later extensions, acting on 256-bit vectors of
         # integers, better depend on AVX2 than AVX.
         AVX2: [AVX512F, VAES, VPCLMULQDQ, AVX_VNNI, AVX_IFMA, AVX_VNNI_INT8,
-               AVX_VNNI_INT16, SHA512, SM4],
+               AVX_VNNI_INT16, SHA512, SM4, AVX10],
 
         # AVX512F is taken to mean hardware support for 512bit registers
         # (which in practice depends on the EVEX prefix to encode) as well
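
As an aside (not part of the patch): the avx10 union added to struct
cpu_policy above maps subleaf 0 of leaf 0x24 as EAX = maximum subleaf,
EBX[7:0] = AVX10 version, and EBX[16]/EBX[17]/EBX[18] = 128/256/512-bit
vector support, which is also what the 0x000700ff mask in
recalculate_misc() keeps. A minimal user-space sketch decoding that
subleaf with GCC's <cpuid.h> helper (illustrative only; output format
and wording are ad hoc, and a full check would additionally look at the
AVX10 feature flag, leaf 7 subleaf 1 EDX bit 19, matching the
cpufeatureset.h addition):

#include <stdio.h>
#include <cpuid.h>   /* GCC/clang helper for the CPUID instruction */

int main(void)
{
    unsigned int a, b, c, d;

    /* Leaf 0x24, subleaf 0; __get_cpuid_count() returns 0 when the
     * requested leaf is above the CPU's maximum basic leaf. */
    if ( !__get_cpuid_count(0x24, 0, &a, &b, &c, &d) )
    {
        puts("CPUID leaf 0x24 not available");
        return 1;
    }

    printf("max subleaf:    %u\n", a);            /* EAX */
    printf("AVX10 version:  %u\n", b & 0xff);     /* EBX[7:0] */
    printf("128/256/512bit: %u/%u/%u\n",          /* EBX[16..18] */
           (b >> 16) & 1, (b >> 17) & 1, (b >> 18) & 1);
    return 0;
}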
From patchwork Thu Jan 11 15:18:21 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13517526
Subject: [PATCH 2/8] x86emul/test: rename "cp"
From: Jan Beulich <jbeulich@suse.com>
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
Date: Thu, 11 Jan 2024 16:18:21 +0100
Message-ID: <9e3a4d3c-38f3-4e73-8775-3252ca531f06@suse.com>
References: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>
In-Reply-To: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>

In preparation of introducing a const struct cpu_policy * local in
x86_emulate(), rename that global variable to something more suitable:
"cp" is our commonly used name for function parameters or local
variables of type struct cpu_policy *, and the present name of the
global could hence have interfered already.

Signed-off-by: Jan Beulich

--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -899,7 +899,7 @@ int LLVMFuzzerTestOneInput(const uint8_t
     int rc;
 
     /* Not part of the initializer, for old gcc to cope. */
-    ctxt.cpu_policy = &cp;
+    ctxt.cpu_policy = &cpu_policy;
 
     /* Reset all global state variables */
     memset(&input, 0, sizeof(input));
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -888,7 +888,8 @@ static void zap_fpsel(unsigned int *env,
             env[3] &= ~0xffff;
     }
 
-    if ( cp.x86_vendor != X86_VENDOR_AMD && cp.x86_vendor != X86_VENDOR_HYGON )
+    if ( cpu_policy.x86_vendor != X86_VENDOR_AMD &&
+         cpu_policy.x86_vendor != X86_VENDOR_HYGON )
         return;
 
     if ( is_32bit )
@@ -1022,7 +1023,7 @@ int main(int argc, char **argv)
     ctxt.regs = &regs;
     ctxt.force_writeback = 0;
-    ctxt.cpu_policy = &cp;
+    ctxt.cpu_policy = &cpu_policy;
     ctxt.lma = sizeof(void *) == 8;
     ctxt.addr_size = 8 * sizeof(void *);
     ctxt.sp_size = 8 * sizeof(void *);
@@ -1540,7 +1541,7 @@ int main(int argc, char **argv)
     regs.edx = (unsigned long)res;
     res[0] = 0x00004444;
     res[1] = 0x8888cccc;
-    i = cp.extd.nscb; cp.extd.nscb = true; /* for AMD */
+    i = cpu_policy.extd.nscb; cpu_policy.extd.nscb = true; /* for AMD */
     rc = x86_emulate(&ctxt, &emulops);
     if ( (rc != X86EMUL_OKAY) ||
          (regs.eip != (unsigned long)&instr[5]) ||
@@ -1549,7 +1550,7 @@
         goto fail;
     printf("okay\n");
-    cp.extd.nscb = i;
+    cpu_policy.extd.nscb = i;
     emulops.write_segment = NULL;
 
     printf("%-40s", "Testing rdmsrlist...");
@@ -1803,11 +1804,11 @@ int main(int argc, char **argv)
         goto fail;
     printf("okay\n");
 
-    vendor_native = cp.x86_vendor;
-    for ( cp.x86_vendor = X86_VENDOR_AMD; ; )
+    vendor_native = cpu_policy.x86_vendor;
+    for ( cpu_policy.x86_vendor = X86_VENDOR_AMD; ; )
     {
-        unsigned int v = cp.x86_vendor == X86_VENDOR_INTEL;
-        const char *vendor = cp.x86_vendor == X86_VENDOR_INTEL ? "Intel" : "AMD";
+        unsigned int v = cpu_policy.x86_vendor == X86_VENDOR_INTEL;
+        const char *vendor = cpu_policy.x86_vendor == X86_VENDOR_INTEL ?
"Intel" : "AMD"; uint64_t *stk = (void *)res + MMAP_SZ - 16; regs.rcx = 2; @@ -1843,11 +1844,11 @@ int main(int argc, char **argv) printf("okay\n"); } - if ( cp.x86_vendor == X86_VENDOR_INTEL ) + if ( cpu_policy.x86_vendor == X86_VENDOR_INTEL ) break; - cp.x86_vendor = X86_VENDOR_INTEL; + cpu_policy.x86_vendor = X86_VENDOR_INTEL; } - cp.x86_vendor = vendor_native; + cpu_policy.x86_vendor = vendor_native; #endif /* x86-64 */ printf("%-40s", "Testing shld $1,%ecx,(%edx)..."); --- a/tools/tests/x86_emulator/x86-emulate.c +++ b/tools/tests/x86_emulator/x86-emulate.c @@ -25,7 +25,7 @@ #endif uint32_t mxcsr_mask = 0x0000ffbf; -struct cpu_policy cp; +struct cpu_policy cpu_policy; static char fpu_save_area[0x4000] __attribute__((__aligned__((64)))); static bool use_xsave; @@ -75,24 +75,24 @@ bool emul_test_init(void) unsigned long sp; - x86_cpu_policy_fill_native(&cp); - x86_cpu_policy_bound_max_leaves(&cp); + x86_cpu_policy_fill_native(&cpu_policy); + x86_cpu_policy_bound_max_leaves(&cpu_policy); /* * The emulator doesn't use these instructions, so can always emulate * them. */ - cp.basic.movbe = true; - cp.feat.invpcid = true; - cp.feat.adx = true; - cp.feat.avx512pf = cp.feat.avx512f; - cp.feat.rdpid = true; - cp.feat.lkgs = true; - cp.feat.wrmsrns = true; - cp.feat.msrlist = true; - cp.extd.clzero = true; + cpu_policy.basic.movbe = true; + cpu_policy.feat.invpcid = true; + cpu_policy.feat.adx = true; + cpu_policy.feat.avx512pf = cpu_policy.feat.avx512f; + cpu_policy.feat.rdpid = true; + cpu_policy.feat.lkgs = true; + cpu_policy.feat.wrmsrns = true; + cpu_policy.feat.msrlist = true; + cpu_policy.extd.clzero = true; - x86_cpu_policy_shrink_max_leaves(&cp); + x86_cpu_policy_shrink_max_leaves(&cpu_policy); if ( cpu_has_xsave ) { --- a/tools/tests/x86_emulator/x86-emulate.h +++ b/tools/tests/x86_emulator/x86-emulate.h @@ -69,7 +69,7 @@ #define is_canonical_address(x) (((int64_t)(x) >> 47) == ((int64_t)(x) >> 63)) extern uint32_t mxcsr_mask; -extern struct cpu_policy cp; +extern struct cpu_policy cpu_policy; #define MMAP_SZ 16384 bool emul_test_init(void); @@ -123,7 +123,7 @@ static inline uint64_t xgetbv(uint32_t x } /* Intentionally checking OSXSAVE here. 
*/ -#define cpu_has_xsave (cp.basic.raw[1].c & (1u << 27)) +#define cpu_has_xsave (cpu_policy.basic.raw[1].c & (1u << 27)) static inline bool xcr0_mask(uint64_t mask) { @@ -133,67 +133,67 @@ static inline bool xcr0_mask(uint64_t ma unsigned int rdpkru(void); void wrpkru(unsigned int val); -#define cache_line_size() (cp.basic.clflush_size * 8) -#define cpu_has_fpu cp.basic.fpu -#define cpu_has_mmx cp.basic.mmx -#define cpu_has_fxsr cp.basic.fxsr -#define cpu_has_sse cp.basic.sse -#define cpu_has_sse2 cp.basic.sse2 -#define cpu_has_sse3 cp.basic.sse3 -#define cpu_has_pclmulqdq cp.basic.pclmulqdq -#define cpu_has_ssse3 cp.basic.ssse3 -#define cpu_has_fma (cp.basic.fma && xcr0_mask(6)) -#define cpu_has_sse4_1 cp.basic.sse4_1 -#define cpu_has_sse4_2 cp.basic.sse4_2 -#define cpu_has_popcnt cp.basic.popcnt -#define cpu_has_aesni cp.basic.aesni -#define cpu_has_avx (cp.basic.avx && xcr0_mask(6)) -#define cpu_has_f16c (cp.basic.f16c && xcr0_mask(6)) - -#define cpu_has_avx2 (cp.feat.avx2 && xcr0_mask(6)) -#define cpu_has_bmi1 cp.feat.bmi1 -#define cpu_has_bmi2 cp.feat.bmi2 -#define cpu_has_avx512f (cp.feat.avx512f && xcr0_mask(0xe6)) -#define cpu_has_avx512dq (cp.feat.avx512dq && xcr0_mask(0xe6)) -#define cpu_has_avx512_ifma (cp.feat.avx512_ifma && xcr0_mask(0xe6)) -#define cpu_has_avx512er (cp.feat.avx512er && xcr0_mask(0xe6)) -#define cpu_has_avx512cd (cp.feat.avx512cd && xcr0_mask(0xe6)) -#define cpu_has_sha cp.feat.sha -#define cpu_has_avx512bw (cp.feat.avx512bw && xcr0_mask(0xe6)) -#define cpu_has_avx512vl (cp.feat.avx512vl && xcr0_mask(0xe6)) -#define cpu_has_avx512_vbmi (cp.feat.avx512_vbmi && xcr0_mask(0xe6)) -#define cpu_has_avx512_vbmi2 (cp.feat.avx512_vbmi2 && xcr0_mask(0xe6)) -#define cpu_has_gfni cp.feat.gfni -#define cpu_has_vaes (cp.feat.vaes && xcr0_mask(6)) -#define cpu_has_vpclmulqdq (cp.feat.vpclmulqdq && xcr0_mask(6)) -#define cpu_has_avx512_vnni (cp.feat.avx512_vnni && xcr0_mask(0xe6)) -#define cpu_has_avx512_bitalg (cp.feat.avx512_bitalg && xcr0_mask(0xe6)) -#define cpu_has_avx512_vpopcntdq (cp.feat.avx512_vpopcntdq && xcr0_mask(0xe6)) -#define cpu_has_movdiri cp.feat.movdiri -#define cpu_has_movdir64b cp.feat.movdir64b -#define cpu_has_avx512_4vnniw (cp.feat.avx512_4vnniw && xcr0_mask(0xe6)) -#define cpu_has_avx512_4fmaps (cp.feat.avx512_4fmaps && xcr0_mask(0xe6)) -#define cpu_has_avx512_vp2intersect (cp.feat.avx512_vp2intersect && xcr0_mask(0xe6)) -#define cpu_has_serialize cp.feat.serialize -#define cpu_has_avx512_fp16 (cp.feat.avx512_fp16 && xcr0_mask(0xe6)) -#define cpu_has_sha512 (cp.feat.sha512 && xcr0_mask(6)) -#define cpu_has_sm3 (cp.feat.sm3 && xcr0_mask(6)) -#define cpu_has_sm4 (cp.feat.sm4 && xcr0_mask(6)) -#define cpu_has_avx_vnni (cp.feat.avx_vnni && xcr0_mask(6)) -#define cpu_has_avx512_bf16 (cp.feat.avx512_bf16 && xcr0_mask(0xe6)) -#define cpu_has_cmpccxadd cp.feat.cmpccxadd -#define cpu_has_avx_ifma (cp.feat.avx_ifma && xcr0_mask(6)) -#define cpu_has_avx_vnni_int8 (cp.feat.avx_vnni_int8 && xcr0_mask(6)) -#define cpu_has_avx_ne_convert (cp.feat.avx_ne_convert && xcr0_mask(6)) -#define cpu_has_avx_vnni_int16 (cp.feat.avx_vnni_int16 && xcr0_mask(6)) - -#define cpu_has_xgetbv1 (cpu_has_xsave && cp.xstate.xgetbv1) - -#define cpu_has_3dnow_ext cp.extd._3dnowext -#define cpu_has_sse4a cp.extd.sse4a -#define cpu_has_xop (cp.extd.xop && xcr0_mask(6)) -#define cpu_has_fma4 (cp.extd.fma4 && xcr0_mask(6)) -#define cpu_has_tbm cp.extd.tbm +#define cache_line_size() (cpu_policy.basic.clflush_size * 8) +#define cpu_has_fpu cpu_policy.basic.fpu +#define cpu_has_mmx 
cpu_policy.basic.mmx +#define cpu_has_fxsr cpu_policy.basic.fxsr +#define cpu_has_sse cpu_policy.basic.sse +#define cpu_has_sse2 cpu_policy.basic.sse2 +#define cpu_has_sse3 cpu_policy.basic.sse3 +#define cpu_has_pclmulqdq cpu_policy.basic.pclmulqdq +#define cpu_has_ssse3 cpu_policy.basic.ssse3 +#define cpu_has_fma (cpu_policy.basic.fma && xcr0_mask(6)) +#define cpu_has_sse4_1 cpu_policy.basic.sse4_1 +#define cpu_has_sse4_2 cpu_policy.basic.sse4_2 +#define cpu_has_popcnt cpu_policy.basic.popcnt +#define cpu_has_aesni cpu_policy.basic.aesni +#define cpu_has_avx (cpu_policy.basic.avx && xcr0_mask(6)) +#define cpu_has_f16c (cpu_policy.basic.f16c && xcr0_mask(6)) + +#define cpu_has_avx2 (cpu_policy.feat.avx2 && xcr0_mask(6)) +#define cpu_has_bmi1 cpu_policy.feat.bmi1 +#define cpu_has_bmi2 cpu_policy.feat.bmi2 +#define cpu_has_avx512f (cpu_policy.feat.avx512f && xcr0_mask(0xe6)) +#define cpu_has_avx512dq (cpu_policy.feat.avx512dq && xcr0_mask(0xe6)) +#define cpu_has_avx512_ifma (cpu_policy.feat.avx512_ifma && xcr0_mask(0xe6)) +#define cpu_has_avx512er (cpu_policy.feat.avx512er && xcr0_mask(0xe6)) +#define cpu_has_avx512cd (cpu_policy.feat.avx512cd && xcr0_mask(0xe6)) +#define cpu_has_sha cpu_policy.feat.sha +#define cpu_has_avx512bw (cpu_policy.feat.avx512bw && xcr0_mask(0xe6)) +#define cpu_has_avx512vl (cpu_policy.feat.avx512vl && xcr0_mask(0xe6)) +#define cpu_has_avx512_vbmi (cpu_policy.feat.avx512_vbmi && xcr0_mask(0xe6)) +#define cpu_has_avx512_vbmi2 (cpu_policy.feat.avx512_vbmi2 && xcr0_mask(0xe6)) +#define cpu_has_gfni cpu_policy.feat.gfni +#define cpu_has_vaes (cpu_policy.feat.vaes && xcr0_mask(6)) +#define cpu_has_vpclmulqdq (cpu_policy.feat.vpclmulqdq && xcr0_mask(6)) +#define cpu_has_avx512_vnni (cpu_policy.feat.avx512_vnni && xcr0_mask(0xe6)) +#define cpu_has_avx512_bitalg (cpu_policy.feat.avx512_bitalg && xcr0_mask(0xe6)) +#define cpu_has_avx512_vpopcntdq (cpu_policy.feat.avx512_vpopcntdq && xcr0_mask(0xe6)) +#define cpu_has_movdiri cpu_policy.feat.movdiri +#define cpu_has_movdir64b cpu_policy.feat.movdir64b +#define cpu_has_avx512_4vnniw (cpu_policy.feat.avx512_4vnniw && xcr0_mask(0xe6)) +#define cpu_has_avx512_4fmaps (cpu_policy.feat.avx512_4fmaps && xcr0_mask(0xe6)) +#define cpu_has_avx512_vp2intersect (cpu_policy.feat.avx512_vp2intersect && xcr0_mask(0xe6)) +#define cpu_has_serialize cpu_policy.feat.serialize +#define cpu_has_avx512_fp16 (cpu_policy.feat.avx512_fp16 && xcr0_mask(0xe6)) +#define cpu_has_sha512 (cpu_policy.feat.sha512 && xcr0_mask(6)) +#define cpu_has_sm3 (cpu_policy.feat.sm3 && xcr0_mask(6)) +#define cpu_has_sm4 (cpu_policy.feat.sm4 && xcr0_mask(6)) +#define cpu_has_avx_vnni (cpu_policy.feat.avx_vnni && xcr0_mask(6)) +#define cpu_has_avx512_bf16 (cpu_policy.feat.avx512_bf16 && xcr0_mask(0xe6)) +#define cpu_has_cmpccxadd cpu_policy.feat.cmpccxadd +#define cpu_has_avx_ifma (cpu_policy.feat.avx_ifma && xcr0_mask(6)) +#define cpu_has_avx_vnni_int8 (cpu_policy.feat.avx_vnni_int8 && xcr0_mask(6)) +#define cpu_has_avx_ne_convert (cpu_policy.feat.avx_ne_convert && xcr0_mask(6)) +#define cpu_has_avx_vnni_int16 (cpu_policy.feat.avx_vnni_int16 && xcr0_mask(6)) + +#define cpu_has_xgetbv1 (cpu_has_xsave && cpu_policy.xstate.xgetbv1) + +#define cpu_has_3dnow_ext cpu_policy.extd._3dnowext +#define cpu_has_sse4a cpu_policy.extd.sse4a +#define cpu_has_xop (cpu_policy.extd.xop && xcr0_mask(6)) +#define cpu_has_fma4 (cpu_policy.extd.fma4 && xcr0_mask(6)) +#define cpu_has_tbm cpu_policy.extd.tbm int emul_test_cpuid( uint32_t leaf, From patchwork Thu Jan 11 15:18:39 2024 Content-Type: 
text/plain; charset="utf-8"
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13517527
Subject: [PATCH 3/8] x86emul: introduce a struct cpu_policy * local in x86_emulate()
From: Jan Beulich <jbeulich@suse.com>
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
Date: Thu, 11 Jan 2024 16:18:39 +0100
Message-ID: <9a491a5e-1ba4-4e72-a341-c05ed1f1e9f0@suse.com>
References: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>
In-Reply-To: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>

While of little effect right here, future patches (AVX10, AMX,
KeyLocker) will benefit more significantly.

Signed-off-by: Jan Beulich

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1232,6 +1232,7 @@ x86_emulate(
 {
     /* Shadow copy of register state. Committed on successful emulation. */
     struct cpu_user_regs _regs = *ctxt->regs;
+    const struct cpu_policy *cp = ctxt->cpu_policy;
     struct x86_emulate_state state;
     int rc;
     uint8_t b, d, *opc = NULL;
@@ -3101,7 +3102,7 @@ x86_emulate(
      * in fact risking to make guest OSes vulnerable to the equivalent of
      * XSA-7 (CVE-2012-0217).
      */
-    generate_exception_if(ctxt->cpuid->x86_vendor == X86_VENDOR_INTEL &&
+    generate_exception_if(cp->x86_vendor == X86_VENDOR_INTEL &&
                           op_bytes == 8 && !is_canonical_address(_regs.rcx),
                           X86_EXC_GP, 0);
 #endif

From patchwork Thu Jan 11 15:19:31 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13517528
Subject: [PATCH 4/8] x86emul: support AVX10.1
From: Jan Beulich <jbeulich@suse.com>
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
Date: Thu, 11 Jan 2024 16:19:31 +0100
References: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>
In-Reply-To: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>

This requires relaxing various pre-existing AVX512* checks, as AVX10.1
covers all AVX512* except PF, ER, 4FMAPS, 4VNNIW, and VP2INTERSECT. Yet
potentially with only less than 512-bit vector width, while otoh
guaranteeing more narrow widths being available when wider ones are
(i.e. unlike AVX512VL being an add-on feature on top of AVX512F).

Note that visa_check(), replacing host_and_vcpu_must_have() uses,
checks only the guest capability: We wouldn't expose AVX512* (nor
AVX10) without the hardware supporting it. Similarly in vlen_check()
the original host_and_vcpu_must_have() is reduced to the equivalent of
just vcpu_must_have(). This also simplifies (resulting) code in the
test and fuzzing harnesses, as there the XCR0 checks that are part of
cpu_has_avx512* are only needed in local code, not in the emulator
itself (where respective checking occurs elsewhere anyway, utilizing
emul_test_read_xcr()).

While in most cases the changes to x86_emulate() are entirely
mechanical, for opmask insns earlier unconditional AVX512F checks are
converted into "else" clauses to existing if/else-if ones.

To be certain that no uses remain, also drop respective cpu_has_avx512*
(except in the test harness) and vcpu_has_avx512*().

Signed-off-by: Jan Beulich
---
Probably avx512_vlen_check() should have the avx512_ prefix dropped,
now that it also covers AVX10. But if so that wants to be either a
prereq or a follow-on patch.

visa_check() won't cover AVX10.2 and higher, but probably we will want
independent checking logic for that anyway.

Spec version 2 still leaves unclear what the xstate components are
which would need enabling for AVX10/256. x86emul_get_fpu() is therefore
untouched for now.

Since it'll be reducing code size, we may want to further convert
host_and_vcpu_must_have() to just vcpu_must_have() where appropriate
(should be [almost?] everywhere).
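
Restated outside the emulator (an illustrative sketch only: the struct
and function names below are ad-hoc stand-ins, not the patch's actual
cpu_policy / x86_emulate_state types), the combined effect of the new
visa_check() and _vlen_check() in the hunks that follow is roughly:

#include <stdbool.h>

/* Ad-hoc stand-ins for the relevant policy bits. */
struct pol {
    bool avx512vl, avx10;
    bool vsz128, vsz256, vsz512;   /* AVX10 vector-length bits (EBX[16..18]) */
};

/* ISA presence: an AVX512* sub-feature check is satisfied either by the
 * AVX512* bit itself or by AVX10, which subsumes it. */
static bool visa_ok(const struct pol *p, bool avx512_subfeat)
{
    return avx512_subfeat || p->avx10;
}

/* Vector length: lr is EVEX.L'L (0/1/2 meaning 128/256/512 bit), lig is
 * set for length-ignored encodings. With AVX10, a narrower width is
 * acceptable whenever an equal or wider one is advertised. Otherwise
 * 512-bit encodings need no extra bit here (the AVX512F requirement is
 * covered by visa_ok()), while narrower widths also need AVX512VL. */
static bool vlen_ok(const struct pol *p, unsigned int lr, bool lig)
{
    if ( lr > 2 )
        return false;

    if ( lig )
        return true;

    if ( p->avx10 &&
         ((lr == 0 && p->vsz128) ||
          (lr <= 1 && p->vsz256) ||
          p->vsz512) )
        return true;

    return lr == 2 || p->avx512vl;
}

int main(void)
{
    /* E.g. an AVX10 policy advertising 512-bit vectors: the narrower
     * widths are implied and hence acceptable as well. */
    struct pol p = { .avx10 = true, .vsz512 = true };

    return !(visa_ok(&p, false) &&
             vlen_ok(&p, 0, false) &&
             vlen_ok(&p, 1, false) &&
             vlen_ok(&p, 2, false));
}

The fall-through over the size bits in the real _vlen_check() is what
encodes the guarantee mentioned above: a wider advertised vector width
implies that the narrower ones are usable too.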
--- a/xen/arch/x86/include/asm/cpufeature.h +++ b/xen/arch/x86/include/asm/cpufeature.h @@ -132,30 +132,19 @@ static inline bool boot_cpu_has(unsigned #define cpu_has_pqe boot_cpu_has(X86_FEATURE_PQE) #define cpu_has_fpu_sel (!boot_cpu_has(X86_FEATURE_NO_FPU_SEL)) #define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX) -#define cpu_has_avx512f boot_cpu_has(X86_FEATURE_AVX512F) -#define cpu_has_avx512dq boot_cpu_has(X86_FEATURE_AVX512DQ) #define cpu_has_rdseed boot_cpu_has(X86_FEATURE_RDSEED) #define cpu_has_smap boot_cpu_has(X86_FEATURE_SMAP) -#define cpu_has_avx512_ifma boot_cpu_has(X86_FEATURE_AVX512_IFMA) #define cpu_has_clflushopt boot_cpu_has(X86_FEATURE_CLFLUSHOPT) #define cpu_has_clwb boot_cpu_has(X86_FEATURE_CLWB) #define cpu_has_avx512er boot_cpu_has(X86_FEATURE_AVX512ER) -#define cpu_has_avx512cd boot_cpu_has(X86_FEATURE_AVX512CD) #define cpu_has_proc_trace boot_cpu_has(X86_FEATURE_PROC_TRACE) #define cpu_has_sha boot_cpu_has(X86_FEATURE_SHA) -#define cpu_has_avx512bw boot_cpu_has(X86_FEATURE_AVX512BW) -#define cpu_has_avx512vl boot_cpu_has(X86_FEATURE_AVX512VL) /* CPUID level 0x00000007:0.ecx */ -#define cpu_has_avx512_vbmi boot_cpu_has(X86_FEATURE_AVX512_VBMI) #define cpu_has_pku boot_cpu_has(X86_FEATURE_PKU) -#define cpu_has_avx512_vbmi2 boot_cpu_has(X86_FEATURE_AVX512_VBMI2) #define cpu_has_gfni boot_cpu_has(X86_FEATURE_GFNI) #define cpu_has_vaes boot_cpu_has(X86_FEATURE_VAES) #define cpu_has_vpclmulqdq boot_cpu_has(X86_FEATURE_VPCLMULQDQ) -#define cpu_has_avx512_vnni boot_cpu_has(X86_FEATURE_AVX512_VNNI) -#define cpu_has_avx512_bitalg boot_cpu_has(X86_FEATURE_AVX512_BITALG) -#define cpu_has_avx512_vpopcntdq boot_cpu_has(X86_FEATURE_AVX512_VPOPCNTDQ) #define cpu_has_rdpid boot_cpu_has(X86_FEATURE_RDPID) #define cpu_has_movdiri boot_cpu_has(X86_FEATURE_MOVDIRI) #define cpu_has_movdir64b boot_cpu_has(X86_FEATURE_MOVDIR64B) @@ -180,7 +169,6 @@ static inline bool boot_cpu_has(unsigned #define cpu_has_rtm_always_abort boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) #define cpu_has_tsx_force_abort boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT) #define cpu_has_serialize boot_cpu_has(X86_FEATURE_SERIALIZE) -#define cpu_has_avx512_fp16 boot_cpu_has(X86_FEATURE_AVX512_FP16) #define cpu_has_arch_caps boot_cpu_has(X86_FEATURE_ARCH_CAPS) /* CPUID level 0x00000007:1.eax */ @@ -188,7 +176,6 @@ static inline bool boot_cpu_has(unsigned #define cpu_has_sm3 boot_cpu_has(X86_FEATURE_SM3) #define cpu_has_sm4 boot_cpu_has(X86_FEATURE_SM4) #define cpu_has_avx_vnni boot_cpu_has(X86_FEATURE_AVX_VNNI) -#define cpu_has_avx512_bf16 boot_cpu_has(X86_FEATURE_AVX512_BF16) #define cpu_has_cmpccxadd boot_cpu_has(X86_FEATURE_CMPCCXADD) #define cpu_has_avx_ifma boot_cpu_has(X86_FEATURE_AVX_IFMA) --- a/xen/arch/x86/x86_emulate/private.h +++ b/xen/arch/x86/x86_emulate/private.h @@ -558,28 +558,17 @@ amd_like(const struct x86_emulate_ctxt * #define vcpu_has_invpcid() (ctxt->cpuid->feat.invpcid) #define vcpu_has_rtm() (ctxt->cpuid->feat.rtm) #define vcpu_has_mpx() (ctxt->cpuid->feat.mpx) -#define vcpu_has_avx512f() (ctxt->cpuid->feat.avx512f) -#define vcpu_has_avx512dq() (ctxt->cpuid->feat.avx512dq) #define vcpu_has_rdseed() (ctxt->cpuid->feat.rdseed) #define vcpu_has_adx() (ctxt->cpuid->feat.adx) #define vcpu_has_smap() (ctxt->cpuid->feat.smap) -#define vcpu_has_avx512_ifma() (ctxt->cpuid->feat.avx512_ifma) #define vcpu_has_clflushopt() (ctxt->cpuid->feat.clflushopt) #define vcpu_has_clwb() (ctxt->cpuid->feat.clwb) #define vcpu_has_avx512pf() (ctxt->cpuid->feat.avx512pf) #define vcpu_has_avx512er() (ctxt->cpuid->feat.avx512er) 
-#define vcpu_has_avx512cd() (ctxt->cpuid->feat.avx512cd) #define vcpu_has_sha() (ctxt->cpuid->feat.sha) -#define vcpu_has_avx512bw() (ctxt->cpuid->feat.avx512bw) -#define vcpu_has_avx512vl() (ctxt->cpuid->feat.avx512vl) -#define vcpu_has_avx512_vbmi() (ctxt->cpuid->feat.avx512_vbmi) -#define vcpu_has_avx512_vbmi2() (ctxt->cpuid->feat.avx512_vbmi2) #define vcpu_has_gfni() (ctxt->cpuid->feat.gfni) #define vcpu_has_vaes() (ctxt->cpuid->feat.vaes) #define vcpu_has_vpclmulqdq() (ctxt->cpuid->feat.vpclmulqdq) -#define vcpu_has_avx512_vnni() (ctxt->cpuid->feat.avx512_vnni) -#define vcpu_has_avx512_bitalg() (ctxt->cpuid->feat.avx512_bitalg) -#define vcpu_has_avx512_vpopcntdq() (ctxt->cpuid->feat.avx512_vpopcntdq) #define vcpu_has_rdpid() (ctxt->cpuid->feat.rdpid) #define vcpu_has_movdiri() (ctxt->cpuid->feat.movdiri) #define vcpu_has_movdir64b() (ctxt->cpuid->feat.movdir64b) @@ -589,12 +578,10 @@ amd_like(const struct x86_emulate_ctxt * #define vcpu_has_avx512_vp2intersect() (ctxt->cpuid->feat.avx512_vp2intersect) #define vcpu_has_serialize() (ctxt->cpuid->feat.serialize) #define vcpu_has_tsxldtrk() (ctxt->cpuid->feat.tsxldtrk) -#define vcpu_has_avx512_fp16() (ctxt->cpuid->feat.avx512_fp16) #define vcpu_has_sha512() (ctxt->cpuid->feat.sha512) #define vcpu_has_sm3() (ctxt->cpuid->feat.sm3) #define vcpu_has_sm4() (ctxt->cpuid->feat.sm4) #define vcpu_has_avx_vnni() (ctxt->cpuid->feat.avx_vnni) -#define vcpu_has_avx512_bf16() (ctxt->cpuid->feat.avx512_bf16) #define vcpu_has_cmpccxadd() (ctxt->cpuid->feat.cmpccxadd) #define vcpu_has_lkgs() (ctxt->cpuid->feat.lkgs) #define vcpu_has_wrmsrns() (ctxt->cpuid->feat.wrmsrns) --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -1125,19 +1125,43 @@ static unsigned long *decode_vex_gpr( return decode_gpr(regs, ~vex_reg & (mode_64bit() ? 0xf : 7)); } -#define avx512_vlen_check(lig) do { \ - switch ( evex.lr ) \ - { \ - default: \ - generate_exception(X86_EXC_UD); \ - case 2: \ - break; \ - case 0: case 1: \ - if ( !(lig) ) \ - host_and_vcpu_must_have(avx512vl); \ - break; \ - } \ -} while ( false ) +#define visa_check(subfeat) \ + generate_exception_if(!cp->feat.avx512 ## subfeat && !cp->feat.avx10, \ + X86_EXC_UD) + +static bool _vlen_check( + const struct x86_emulate_state *s, + const struct cpu_policy *cp, + bool lig) +{ + if ( s->evex.lr > 2 ) + return false; + + if ( lig ) + return true; + + if ( cp->feat.avx10 ) + switch ( s->evex.lr ) + { + case 0: + if ( cp->avx10.vsz128 ) + return true; + /* fall through */ + case 1: + if ( cp->avx10.vsz256 ) + return true; + /* fall through */ + case 2: + if ( cp->avx10.vsz512 ) + return true; + break; + } + + return s->evex.lr == 2 || cp->feat.avx512vl; +} + +#define avx512_vlen_check(lig) \ + generate_exception_if(!_vlen_check(state, cp, lig), X86_EXC_UD) static bool is_branch_step(struct x86_emulate_ctxt *ctxt, const struct x86_emulate_ops *ops) @@ -1369,7 +1393,9 @@ x86_emulate( /* KMOV{W,Q} %k, (%rax) */ stb[0] = 0xc4; stb[1] = 0xe1; - stb[2] = cpu_has_avx512bw ? 0xf8 : 0x78; + stb[2] = cp->feat.avx512bw || cp->feat.avx10 + ? 
0xf8 /* L0.NP.W1 - kmovq */ + : 0x78 /* L0.NP.W0 - kmovw */; stb[3] = 0x91; stb[4] = evex.opmsk << 3; insn_bytes = 5; @@ -3392,7 +3418,7 @@ x86_emulate( (ea.type != OP_REG && evex.brs && (evex.pfx & VEX_PREFIX_SCALAR_MASK))), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); if ( ea.type != OP_REG || !evex.brs ) avx512_vlen_check(evex.pfx & VEX_PREFIX_SCALAR_MASK); simd_zmm: @@ -3448,7 +3474,7 @@ x86_emulate( generate_exception_if((evex.lr || evex.opmsk || evex.brs || evex.w != (evex.pfx & VEX_PREFIX_DOUBLE_MASK)), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); if ( (d & DstMask) != DstMem ) d &= ~TwoOp; op_bytes = 8; @@ -3475,7 +3501,7 @@ x86_emulate( generate_exception_if((evex.brs || evex.w != (evex.pfx & VEX_PREFIX_DOUBLE_MASK)), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); avx512_vlen_check(false); d |= TwoOp; op_bytes = !(evex.pfx & VEX_PREFIX_DOUBLE_MASK) || evex.lr @@ -3512,7 +3538,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0x64): /* vpblendm{d,q} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x65): /* vblendmp{s,d} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ avx512f_no_sae: - host_and_vcpu_must_have(avx512f); + visa_check(f); generate_exception_if(ea.type != OP_MEM && evex.brs, X86_EXC_UD); avx512_vlen_check(false); goto simd_zmm; @@ -3592,13 +3618,13 @@ x86_emulate( case X86EMUL_OPC_EVEX_F3(5, 0x2a): /* vcvtsi2sh r/m,xmm,xmm */ case X86EMUL_OPC_EVEX_F3(5, 0x7b): /* vcvtusi2sh r/m,xmm,xmm */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); /* fall through */ CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x2a): /* vcvtsi2s{s,d} r/m,xmm,xmm */ CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x7b): /* vcvtusi2s{s,d} r/m,xmm,xmm */ generate_exception_if(evex.opmsk || (ea.type != OP_REG && evex.brs), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); if ( !evex.brs ) avx512_vlen_check(true); get_fpu(X86EMUL_FPU_zmm); @@ -3708,7 +3734,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_F3(5, 0x2d): /* vcvtsh2si xmm/mem,reg */ case X86EMUL_OPC_EVEX_F3(5, 0x78): /* vcvttsh2usi xmm/mem,reg */ case X86EMUL_OPC_EVEX_F3(5, 0x79): /* vcvtsh2usi xmm/mem,reg */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); /* fall through */ CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x2c): /* vcvtts{s,d}2si xmm/mem,reg */ CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x2d): /* vcvts{s,d}2si xmm/mem,reg */ @@ -3717,7 +3743,7 @@ x86_emulate( generate_exception_if((evex.reg != 0xf || !evex.RX || evex.opmsk || (ea.type != OP_REG && evex.brs)), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); if ( !evex.brs ) avx512_vlen_check(true); get_fpu(X86EMUL_FPU_zmm); @@ -3783,7 +3809,7 @@ x86_emulate( case X86EMUL_OPC_EVEX(5, 0x2e): /* vucomish xmm/m16,xmm */ case X86EMUL_OPC_EVEX(5, 0x2f): /* vcomish xmm/m16,xmm */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w, X86_EXC_UD); /* fall through */ CASE_SIMD_PACKED_FP(_EVEX, 0x0f, 0x2e): /* vucomis{s,d} xmm/mem,xmm */ @@ -3792,7 +3818,7 @@ x86_emulate( (ea.type != OP_REG && evex.brs) || evex.w != evex.pfx), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); if ( !evex.brs ) avx512_vlen_check(true); get_fpu(X86EMUL_FPU_zmm); @@ -3936,7 +3962,7 @@ x86_emulate( case X86EMUL_OPC_VEX(0x0f, 0x4a): /* kadd{w,q} k,k,k */ if ( !vex.w ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); /* fall through */ case X86EMUL_OPC_VEX(0x0f, 0x41): /* kand{w,q} k,k,k */ case X86EMUL_OPC_VEX_66(0x0f, 0x41): /* kand{b,d} k,k,k */ @@ -3952,11 +3978,12 @@ x86_emulate( 
generate_exception_if(!vex.l, X86_EXC_UD); opmask_basic: if ( vex.w ) - host_and_vcpu_must_have(avx512bw); + visa_check(bw); else if ( vex.pfx ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); + else + visa_check(f); opmask_common: - host_and_vcpu_must_have(avx512f); generate_exception_if(!vex.r || (mode_64bit() && !(vex.reg & 8)) || ea.type != OP_REG, X86_EXC_UD); @@ -3979,13 +4006,14 @@ x86_emulate( generate_exception_if(vex.l || vex.reg != 0xf, X86_EXC_UD); goto opmask_basic; - case X86EMUL_OPC_VEX(0x0f, 0x4b): /* kunpck{w,d}{d,q} k,k,k */ + case X86EMUL_OPC_VEX(0x0f, 0x4b): /* kunpck{wd,dq} k,k,k */ generate_exception_if(!vex.l, X86_EXC_UD); - host_and_vcpu_must_have(avx512bw); + visa_check(bw); goto opmask_common; case X86EMUL_OPC_VEX_66(0x0f, 0x4b): /* kunpckbw k,k,k */ generate_exception_if(!vex.l || vex.w, X86_EXC_UD); + visa_check(f); goto opmask_common; #endif /* X86EMUL_NO_SIMD */ @@ -4053,7 +4081,7 @@ x86_emulate( generate_exception_if((evex.w != (evex.pfx & VEX_PREFIX_DOUBLE_MASK) || (ea.type != OP_MEM && evex.brs)), X86_EXC_UD); - host_and_vcpu_must_have(avx512dq); + visa_check(dq); avx512_vlen_check(false); goto simd_zmm; @@ -4092,12 +4120,12 @@ x86_emulate( case X86EMUL_OPC_EVEX_F2(0x0f, 0x7a): /* vcvtudq2ps [xyz]mm/mem,[xyz]mm{k} */ /* vcvtuqq2ps [xyz]mm/mem,{x,y}mm{k} */ if ( evex.w ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); else { case X86EMUL_OPC_EVEX(0x0f, 0x78): /* vcvttp{s,d}2udq [xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX(0x0f, 0x79): /* vcvtp{s,d}2udq [xyz]mm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512f); + visa_check(f); } if ( ea.type != OP_REG || !evex.brs ) avx512_vlen_check(false); @@ -4314,7 +4342,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0x0b): /* vpmulhrsw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x1c): /* vpabsb [xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x1d): /* vpabsw [xyz]mm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512bw); + visa_check(bw); generate_exception_if(evex.brs, X86_EXC_UD); elem_bytes = 1 << (b & 1); goto avx512f_no_sae; @@ -4346,7 +4374,7 @@ x86_emulate( generate_exception_if(b != 0x27 && evex.w != (b & 1), X86_EXC_UD); goto avx512f_no_sae; } - host_and_vcpu_must_have(avx512bw); + visa_check(bw); generate_exception_if(evex.brs, X86_EXC_UD); elem_bytes = 1 << (ext == ext_0f ? 
b & 1 : evex.w); avx512_vlen_check(false); @@ -4419,7 +4447,7 @@ x86_emulate( dst.bytes = 2; /* fall through */ case X86EMUL_OPC_EVEX_66(5, 0x6e): /* vmovw r/m16,xmm */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w, X86_EXC_UD); /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f, 0x6e): /* vmov{d,q} r/m,xmm */ @@ -4427,7 +4455,7 @@ x86_emulate( generate_exception_if((evex.lr || evex.opmsk || evex.brs || evex.reg != 0xf || !evex.RX), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); get_fpu(X86EMUL_FPU_zmm); opc = init_evex(stub); @@ -4485,7 +4513,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_F2(0x0f, 0x6f): /* vmovdqu{8,16} [xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_F2(0x0f, 0x7f): /* vmovdqu{8,16} [xyz]mm,[xyz]mm/mem{k} */ - host_and_vcpu_must_have(avx512bw); + visa_check(bw); elem_bytes = 1 << evex.w; goto vmovdqa; @@ -4578,7 +4606,7 @@ x86_emulate( generate_exception_if(evex.w, X86_EXC_UD); else { - host_and_vcpu_must_have(avx512bw); + visa_check(bw); generate_exception_if(evex.brs, X86_EXC_UD); } d = (d & ~SrcMask) | SrcMem | TwoOp; @@ -4826,7 +4854,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_F3(0x0f, 0xe6): /* vcvtdq2pd {x,y}mm/mem,[xyz]mm{k} */ /* vcvtqq2pd [xyz]mm/mem,[xyz]mm{k} */ if ( evex.pfx != vex_f3 ) - host_and_vcpu_must_have(avx512f); + visa_check(f); else if ( evex.w ) { case X86EMUL_OPC_EVEX_66(0x0f, 0x78): /* vcvttps2uqq {x,y}mm/mem,[xyz]mm{k} */ @@ -4837,11 +4865,11 @@ x86_emulate( /* vcvttpd2qq [xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f, 0x7b): /* vcvtps2qq {x,y}mm/mem,[xyz]mm{k} */ /* vcvtpd2qq [xyz]mm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512dq); + visa_check(dq); } else { - host_and_vcpu_must_have(avx512f); + visa_check(f); generate_exception_if(ea.type != OP_MEM && evex.brs, X86_EXC_UD); } if ( ea.type != OP_REG || !evex.brs ) @@ -4879,7 +4907,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f, 0xd6): /* vmovq xmm,xmm/m64 */ generate_exception_if(evex.lr || !evex.w || evex.opmsk || evex.brs, X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); d |= TwoOp; op_bytes = 8; goto simd_zmm; @@ -4905,19 +4933,21 @@ x86_emulate( case X86EMUL_OPC_VEX(0x0f, 0x90): /* kmov{w,q} k/mem,k */ case X86EMUL_OPC_VEX_66(0x0f, 0x90): /* kmov{b,d} k/mem,k */ generate_exception_if(vex.l || !vex.r, X86_EXC_UD); - host_and_vcpu_must_have(avx512f); if ( vex.w ) { - host_and_vcpu_must_have(avx512bw); + visa_check(bw); op_bytes = 4 << !vex.pfx; } else if ( vex.pfx ) { - host_and_vcpu_must_have(avx512dq); + visa_check(dq); op_bytes = 1; } else + { + visa_check(f); op_bytes = 2; + } get_fpu(X86EMUL_FPU_opmask); @@ -4939,14 +4969,15 @@ x86_emulate( generate_exception_if(vex.l || !vex.r || vex.reg != 0xf || ea.type != OP_REG, X86_EXC_UD); - host_and_vcpu_must_have(avx512f); if ( vex.pfx == vex_f2 ) - host_and_vcpu_must_have(avx512bw); + visa_check(bw); else { generate_exception_if(vex.w, X86_EXC_UD); if ( vex.pfx ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); + else + visa_check(f); } get_fpu(X86EMUL_FPU_opmask); @@ -4978,10 +5009,9 @@ x86_emulate( dst = ea; dst.reg = decode_gpr(&_regs, modrm_reg); - host_and_vcpu_must_have(avx512f); if ( vex.pfx == vex_f2 ) { - host_and_vcpu_must_have(avx512bw); + visa_check(bw); dst.bytes = 4 << (mode_64bit() && vex.w); } else @@ -4989,7 +5019,9 @@ x86_emulate( generate_exception_if(vex.w, X86_EXC_UD); dst.bytes = 4; if ( vex.pfx ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); + else + visa_check(f); } get_fpu(X86EMUL_FPU_opmask); @@ -5011,20 +5043,18 @@ 
x86_emulate( ASSERT(!state->simd_size); break; - case X86EMUL_OPC_VEX(0x0f, 0x99): /* ktest{w,q} k,k */ - if ( !vex.w ) - host_and_vcpu_must_have(avx512dq); - /* fall through */ case X86EMUL_OPC_VEX(0x0f, 0x98): /* kortest{w,q} k,k */ case X86EMUL_OPC_VEX_66(0x0f, 0x98): /* kortest{b,d} k,k */ + case X86EMUL_OPC_VEX(0x0f, 0x99): /* ktest{w,q} k,k */ case X86EMUL_OPC_VEX_66(0x0f, 0x99): /* ktest{b,d} k,k */ generate_exception_if(vex.l || !vex.r || vex.reg != 0xf || ea.type != OP_REG, X86_EXC_UD); - host_and_vcpu_must_have(avx512f); if ( vex.w ) - host_and_vcpu_must_have(avx512bw); - else if ( vex.pfx ) - host_and_vcpu_must_have(avx512dq); + visa_check(bw); + else if ( vex.pfx || (b & 1) ) + visa_check(dq); + else + visa_check(f); get_fpu(X86EMUL_FPU_opmask); @@ -5362,7 +5392,7 @@ x86_emulate( (evex.pfx & VEX_PREFIX_SCALAR_MASK)) || !evex.r || !evex.R || evex.z), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); if ( ea.type != OP_REG || !evex.brs ) avx512_vlen_check(evex.pfx & VEX_PREFIX_SCALAR_MASK); simd_imm8_zmm: @@ -5406,9 +5436,9 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f3a, 0x22): /* vpinsr{d,q} $imm8,r/m,xmm,xmm */ generate_exception_if(evex.lr || evex.opmsk || evex.brs, X86_EXC_UD); if ( b & 2 ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); else - host_and_vcpu_must_have(avx512bw); + visa_check(bw); if ( !mode_64bit() ) evex.w = 0; memcpy(mmvalp, &src.val, src.bytes); @@ -5445,7 +5475,7 @@ x86_emulate( /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x25): /* vpternlog{d,q} $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ avx512f_imm8_no_sae: - host_and_vcpu_must_have(avx512f); + visa_check(f); generate_exception_if(ea.type != OP_MEM && evex.brs, X86_EXC_UD); avx512_vlen_check(false); goto simd_imm8_zmm; @@ -5544,7 +5574,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f, 0xe4): /* vpmulhuw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f, 0xea): /* vpminsw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f, 0xee): /* vpmaxsw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512bw); + visa_check(bw); generate_exception_if(evex.brs, X86_EXC_UD); elem_bytes = b & 0x10 ? 
1 : 2; goto avx512f_no_sae; @@ -5769,7 +5799,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0x10): /* vpsrlvw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x11): /* vpsravw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x12): /* vpsllvw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512bw); + visa_check(bw); generate_exception_if(!evex.w || evex.brs, X86_EXC_UD); elem_bytes = 2; goto avx512f_no_sae; @@ -5779,7 +5809,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_F3(0x0f38, 0x20): /* vpmovswb [xyz]mm,{x,y}mm/mem{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x30): /* vpmovzxbw {x,y}mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_F3(0x0f38, 0x30): /* vpmovwb [xyz]mm,{x,y}mm/mem{k} */ - host_and_vcpu_must_have(avx512bw); + visa_check(bw); if ( evex.pfx != vex_f3 ) { case X86EMUL_OPC_EVEX_66(0x0f38, 0x21): /* vpmovsxbd xmm/mem,[xyz]mm{k} */ @@ -5827,7 +5857,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0x13): /* vcvtph2ps {x,y}mm/mem,[xyz]mm{k} */ generate_exception_if(evex.w || (ea.type != OP_REG && evex.brs), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); if ( !evex.brs ) avx512_vlen_check(false); op_bytes = 8 << evex.lr; @@ -5881,7 +5911,7 @@ x86_emulate( op_bytes = 8; generate_exception_if(evex.brs, X86_EXC_UD); if ( !evex.w ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); goto avx512_broadcast; case X86EMUL_OPC_EVEX_66(0x0f38, 0x1a): /* vbroadcastf32x4 m128,{y,z}mm{k} */ @@ -5891,7 +5921,7 @@ x86_emulate( generate_exception_if(ea.type != OP_MEM || !evex.lr || evex.brs, X86_EXC_UD); if ( evex.w ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); goto avx512_broadcast; case X86EMUL_OPC_VEX_66(0x0f38, 0x20): /* vpmovsxbw xmm/mem,{x,y}mm */ @@ -5916,9 +5946,9 @@ x86_emulate( case X86EMUL_OPC_EVEX_F3(0x0f38, 0x28): /* vpmovm2{b,w} k,[xyz]mm */ case X86EMUL_OPC_EVEX_F3(0x0f38, 0x38): /* vpmovm2{d,q} k,[xyz]mm */ if ( b & 0x10 ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); else - host_and_vcpu_must_have(avx512bw); + visa_check(bw); generate_exception_if(evex.opmsk || ea.type != OP_REG, X86_EXC_UD); d |= TwoOp; op_bytes = 16 << evex.lr; @@ -5960,7 +5990,7 @@ x86_emulate( fault_suppression = false; /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x44): /* vplzcnt{d,q} [xyz]mm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512cd); + visa_check(cd); goto avx512f_no_sae; case X86EMUL_OPC_VEX_66(0x0f38, 0x2c): /* vmaskmovps mem,{x,y}mm,{x,y}mm */ @@ -6036,7 +6066,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0xba): /* vfmsub231p{s,d} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0xbc): /* vfnmadd231p{s,d} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0xbe): /* vfnmsub231p{s,d} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512f); + visa_check(f); if ( ea.type != OP_REG || !evex.brs ) avx512_vlen_check(false); goto simd_zmm; @@ -6055,7 +6085,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0xbb): /* vfmsub231s{s,d} xmm/mem,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0xbd): /* vfnmadd231s{s,d} xmm/mem,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0xbf): /* vfnmsub231s{s,d} xmm/mem,xmm,xmm{k} */ - host_and_vcpu_must_have(avx512f); + visa_check(f); simd_zmm_scalar_sae: generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD); if ( !evex.brs ) @@ -6070,14 +6100,14 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0x3a): /* vpminuw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x3c): /* vpmaxsb 
[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x3e): /* vpmaxuw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512bw); + visa_check(bw); generate_exception_if(evex.brs, X86_EXC_UD); elem_bytes = b & 2 ?: 1; goto avx512f_no_sae; case X86EMUL_OPC_EVEX_66(0x0f38, 0x40): /* vpmull{d,q} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ if ( evex.w ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); goto avx512f_no_sae; case X86EMUL_OPC_66(0x0f38, 0xdb): /* aesimc xmm/m128,xmm */ @@ -6116,7 +6146,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0x51): /* vpdpbusds [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x52): /* vpdpwssd [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x53): /* vpdpwssds [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_vnni); + visa_check(_vnni); generate_exception_if(evex.w, X86_EXC_UD); goto avx512f_no_sae; @@ -6128,7 +6158,7 @@ x86_emulate( d |= TwoOp; /* fall through */ case X86EMUL_OPC_EVEX_F3(0x0f38, 0x52): /* vdpbf16ps [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_bf16); + visa_check(_bf16); generate_exception_if(evex.w, X86_EXC_UD); op_bytes = 16 << evex.lr; goto avx512f_no_sae; @@ -6145,7 +6175,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0x4d): /* vrcp14s{s,d} xmm/mem,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x4f): /* vrsqrt14s{s,d} xmm/mem,xmm,xmm{k} */ - host_and_vcpu_must_have(avx512f); + visa_check(f); generate_exception_if(evex.brs, X86_EXC_UD); avx512_vlen_check(true); goto simd_zmm; @@ -6163,16 +6193,16 @@ x86_emulate( generate_exception_if(evex.w || !evex.r || !evex.R || evex.z, X86_EXC_UD); /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x54): /* vpopcnt{b,w} [xyz]mm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_bitalg); + visa_check(_bitalg); /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x66): /* vpblendm{b,w} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512bw); + visa_check(bw); generate_exception_if(evex.brs, X86_EXC_UD); elem_bytes = 1 << evex.w; goto avx512f_no_sae; case X86EMUL_OPC_EVEX_66(0x0f38, 0x55): /* vpopcnt{d,q} [xyz]mm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_vpopcntdq); + visa_check(_vpopcntdq); goto avx512f_no_sae; case X86EMUL_OPC_VEX_66(0x0f38, 0x5a): /* vbroadcasti128 m128,ymm */ @@ -6181,14 +6211,14 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0x62): /* vpexpand{b,w} [xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x63): /* vpcompress{b,w} [xyz]mm,[xyz]mm/mem{k} */ - host_and_vcpu_must_have(avx512_vbmi2); + visa_check(_vbmi2); elem_bytes = 1 << evex.w; /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x88): /* vexpandp{s,d} [xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x89): /* vpexpand{d,q} [xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x8a): /* vcompressp{s,d} [xyz]mm,[xyz]mm/mem{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x8b): /* vpcompress{d,q} [xyz]mm,[xyz]mm/mem{k} */ - host_and_vcpu_must_have(avx512f); + visa_check(f); generate_exception_if(evex.brs, X86_EXC_UD); avx512_vlen_check(false); /* @@ -6222,7 +6252,7 @@ x86_emulate( /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x71): /* vpshldv{d,q} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x73): /* vpshrdv{d,q} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_vbmi2); + visa_check(_vbmi2); goto avx512f_no_sae; case X86EMUL_OPC_VEX (0x0f38, 0xb0): /* vcvtneoph2ps mem,[xy]mm */ @@ -6242,16 +6272,16 @@ x86_emulate( case 
X86EMUL_OPC_EVEX_66(0x0f38, 0x7d): /* vpermt2{b,w} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x8d): /* vperm{b,w} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ if ( !evex.w ) - host_and_vcpu_must_have(avx512_vbmi); + visa_check(_vbmi); else - host_and_vcpu_must_have(avx512bw); + visa_check(bw); generate_exception_if(evex.brs, X86_EXC_UD); fault_suppression = false; goto avx512f_no_sae; case X86EMUL_OPC_EVEX_66(0x0f38, 0x78): /* vpbroadcastb xmm/m8,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x79): /* vpbroadcastw xmm/m16,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512bw); + visa_check(bw); generate_exception_if(evex.w || evex.brs, X86_EXC_UD); op_bytes = elem_bytes = 1 << (b & 1); /* See the comment at the avx512_broadcast label. */ @@ -6260,14 +6290,14 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0x7a): /* vpbroadcastb r32,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x7b): /* vpbroadcastw r32,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512bw); + visa_check(bw); generate_exception_if(evex.w, X86_EXC_UD); /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x7c): /* vpbroadcast{d,q} reg,[xyz]mm{k} */ generate_exception_if((ea.type != OP_REG || evex.brs || evex.reg != 0xf || !evex.RX), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); avx512_vlen_check(false); get_fpu(X86EMUL_FPU_zmm); @@ -6336,7 +6366,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0x83): /* vpmultishiftqb [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ generate_exception_if(!evex.w, X86_EXC_UD); - host_and_vcpu_must_have(avx512_vbmi); + visa_check(_vbmi); fault_suppression = false; goto avx512f_no_sae; @@ -6484,8 +6514,8 @@ x86_emulate( evex.reg != 0xf || modrm_reg == state->sib_index), X86_EXC_UD); + visa_check(f); avx512_vlen_check(false); - host_and_vcpu_must_have(avx512f); get_fpu(X86EMUL_FPU_zmm); /* Read destination and index registers. */ @@ -6664,8 +6694,8 @@ x86_emulate( evex.reg != 0xf || modrm_reg == state->sib_index), X86_EXC_UD); + visa_check(f); avx512_vlen_check(false); - host_and_vcpu_must_have(avx512f); get_fpu(X86EMUL_FPU_zmm); /* Read source and index registers. 
*/ @@ -6782,7 +6812,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0xb4): /* vpmadd52luq [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0xb5): /* vpmadd52huq [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_ifma); + visa_check(_ifma); generate_exception_if(!evex.w, X86_EXC_UD); goto avx512f_no_sae; @@ -6798,8 +6828,8 @@ x86_emulate( #endif ASSERT(ea.type == OP_MEM); - generate_exception_if((!cpu_has_avx512f || !evex.opmsk || evex.brs || - evex.z || evex.reg != 0xf || evex.lr != 2), + generate_exception_if((!evex.opmsk || evex.brs || evex.z || + evex.reg != 0xf || evex.lr != 2), X86_EXC_UD); switch ( modrm_reg & 7 ) @@ -7315,7 +7345,7 @@ x86_emulate( /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x08): /* vrndscaleps $imm8,[xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x09): /* vrndscalepd $imm8,[xyz]mm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512f); + visa_check(f); generate_exception_if(evex.w != (b & 1), X86_EXC_UD); avx512_vlen_check(b & 2); goto simd_imm8_zmm; @@ -7324,7 +7354,7 @@ x86_emulate( generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD); /* fall through */ case X86EMUL_OPC_EVEX(0x0f3a, 0x08): /* vrndscaleph $imm8,[xyz]mm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w, X86_EXC_UD); avx512_vlen_check(b & 2); goto simd_imm8_zmm; @@ -7437,11 +7467,11 @@ x86_emulate( evex.opmsk || evex.brs), X86_EXC_UD); if ( !(b & 2) ) - host_and_vcpu_must_have(avx512bw); + visa_check(bw); else if ( !(b & 1) ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); else - host_and_vcpu_must_have(avx512f); + visa_check(f); get_fpu(X86EMUL_FPU_zmm); opc = init_evex(stub); goto pextr; @@ -7455,7 +7485,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f3a, 0x39): /* vextracti32x4 $imm8,{y,z}mm,xmm/m128{k} */ /* vextracti64x2 $imm8,{y,z}mm,xmm/m128{k} */ if ( evex.w ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); generate_exception_if(evex.brs, X86_EXC_UD); /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x23): /* vshuff32x4 $imm8,{y,z}mm/mem,{y,z}mm,{y,z}mm{k} */ @@ -7475,7 +7505,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f3a, 0x3b): /* vextracti32x8 $imm8,zmm,ymm/m256{k} */ /* vextracti64x4 $imm8,zmm,ymm/m256{k} */ if ( !evex.w ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); generate_exception_if(evex.lr != 2 || evex.brs, X86_EXC_UD); fault_suppression = false; goto avx512f_imm8_no_sae; @@ -7491,7 +7521,7 @@ x86_emulate( generate_exception_if((evex.w || evex.reg != 0xf || !evex.RX || (ea.type != OP_REG && (evex.z || evex.brs))), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); avx512_vlen_check(false); opc = init_evex(stub); } @@ -7583,7 +7613,7 @@ x86_emulate( if ( !(b & 0x20) ) goto avx512f_imm8_no_sae; avx512bw_imm: - host_and_vcpu_must_have(avx512bw); + visa_check(bw); generate_exception_if(evex.brs, X86_EXC_UD); elem_bytes = 1 << evex.w; avx512_vlen_check(false); @@ -7622,7 +7652,7 @@ x86_emulate( goto simd_0f_imm8_avx; case X86EMUL_OPC_EVEX_66(0x0f3a, 0x21): /* vinsertps $imm8,xmm/m32,xmm,xmm */ - host_and_vcpu_must_have(avx512f); + visa_check(f); generate_exception_if(evex.lr || evex.w || evex.opmsk || evex.brs, X86_EXC_UD); op_bytes = 4; @@ -7630,18 +7660,18 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f3a, 0x50): /* vrangep{s,d} $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x56): /* vreducep{s,d} $imm8,[xyz]mm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512dq); + visa_check(dq); /* fall 
through */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x26): /* vgetmantp{s,d} $imm8,[xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x54): /* vfixupimmp{s,d} $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512f); + visa_check(f); if ( ea.type != OP_REG || !evex.brs ) avx512_vlen_check(false); goto simd_imm8_zmm; case X86EMUL_OPC_EVEX(0x0f3a, 0x26): /* vgetmantph $imm8,[xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX(0x0f3a, 0x56): /* vreduceph $imm8,[xyz]mm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w, X86_EXC_UD); if ( ea.type != OP_REG || !evex.brs ) avx512_vlen_check(false); @@ -7649,11 +7679,11 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f3a, 0x51): /* vranges{s,d} $imm8,xmm/mem,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x57): /* vreduces{s,d} $imm8,xmm/mem,xmm,xmm{k} */ - host_and_vcpu_must_have(avx512dq); + visa_check(dq); /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x27): /* vgetmants{s,d} $imm8,xmm/mem,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x55): /* vfixupimms{s,d} $imm8,xmm/mem,xmm,xmm{k} */ - host_and_vcpu_must_have(avx512f); + visa_check(f); generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD); if ( !evex.brs ) avx512_vlen_check(true); @@ -7661,7 +7691,7 @@ x86_emulate( case X86EMUL_OPC_EVEX(0x0f3a, 0x27): /* vgetmantsh $imm8,xmm/mem,xmm,xmm{k} */ case X86EMUL_OPC_EVEX(0x0f3a, 0x57): /* vreducesh $imm8,xmm/mem,xmm,xmm{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w, X86_EXC_UD); if ( !evex.brs ) avx512_vlen_check(true); @@ -7672,18 +7702,19 @@ x86_emulate( case X86EMUL_OPC_VEX_66(0x0f3a, 0x30): /* kshiftr{b,w} $imm8,k,k */ case X86EMUL_OPC_VEX_66(0x0f3a, 0x32): /* kshiftl{b,w} $imm8,k,k */ if ( !vex.w ) - host_and_vcpu_must_have(avx512dq); + visa_check(dq); + else + visa_check(f); opmask_shift_imm: generate_exception_if(vex.l || !vex.r || vex.reg != 0xf || ea.type != OP_REG, X86_EXC_UD); - host_and_vcpu_must_have(avx512f); get_fpu(X86EMUL_FPU_opmask); op_bytes = 1; /* Any non-zero value will do. 
*/ goto simd_0f_imm8; case X86EMUL_OPC_VEX_66(0x0f3a, 0x31): /* kshiftr{d,q} $imm8,k,k */ case X86EMUL_OPC_VEX_66(0x0f3a, 0x33): /* kshiftl{d,q} $imm8,k,k */ - host_and_vcpu_must_have(avx512bw); + visa_check(bw); goto opmask_shift_imm; case X86EMUL_OPC_66(0x0f3a, 0x44): /* pclmulqdq $imm8,xmm/m128,xmm */ @@ -7824,7 +7855,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f3a, 0x66): /* vfpclassp{s,d} $imm8,[xyz]mm/mem,k{k} */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x67): /* vfpclasss{s,d} $imm8,xmm/mem,k{k} */ - host_and_vcpu_must_have(avx512dq); + visa_check(dq); generate_exception_if(!evex.r || !evex.R || evex.z, X86_EXC_UD); if ( !(b & 1) ) goto avx512f_imm8_no_sae; @@ -7834,7 +7865,7 @@ x86_emulate( case X86EMUL_OPC_EVEX(0x0f3a, 0x66): /* vfpclassph $imm8,[xyz]mm/mem,k{k} */ case X86EMUL_OPC_EVEX(0x0f3a, 0x67): /* vfpclasssh $imm8,xmm/mem,k{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w || !evex.r || !evex.R || evex.z, X86_EXC_UD); if ( !(b & 1) ) goto avx512f_imm8_no_sae; @@ -7849,14 +7880,14 @@ x86_emulate( /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x71): /* vpshld{d,q} $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x73): /* vpshrd{d,q} $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_vbmi2); + visa_check(_vbmi2); goto avx512f_imm8_no_sae; case X86EMUL_OPC_EVEX_F3(0x0f3a, 0xc2): /* vcmpsh $imm8,xmm/mem,xmm,k{k} */ generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD); /* fall through */ case X86EMUL_OPC_EVEX(0x0f3a, 0xc2): /* vcmpph $imm8,[xyz]mm/mem,[xyz]mm,k{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w || !evex.r || !evex.R || evex.z, X86_EXC_UD); if ( ea.type != OP_REG || !evex.brs ) avx512_vlen_check(evex.pfx & VEX_PREFIX_SCALAR_MASK); @@ -7937,13 +7968,13 @@ x86_emulate( CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x5d): /* vmin{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x5e): /* vdiv{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x5f): /* vmax{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w, X86_EXC_UD); goto avx512f_all_fp; CASE_SIMD_ALL_FP(_EVEX, 5, 0x5a): /* vcvtp{h,d}2p{h,d} [xyz]mm/mem,[xyz]mm{k} */ /* vcvts{h,d}2s{h,d} xmm/mem,xmm,xmm{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); if ( vex.pfx & VEX_PREFIX_SCALAR_MASK ) d &= ~TwoOp; op_bytes = 2 << (((evex.pfx & VEX_PREFIX_SCALAR_MASK) ? 
0 : 1 + evex.lr) + @@ -7954,7 +7985,7 @@ x86_emulate( /* vcvtqq2ph [xyz]mm/mem,xmm{k} */ case X86EMUL_OPC_EVEX_F2(5, 0x7a): /* vcvtudq2ph [xyz]mm/mem,[xy]mm{k} */ /* vcvtuqq2ph [xyz]mm/mem,xmm{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); if ( ea.type != OP_REG || !evex.brs ) avx512_vlen_check(false); op_bytes = 16 << evex.lr; @@ -7964,7 +7995,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_F3(5, 0x5b): /* vcvttph2dq [xy]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX (5, 0x78): /* vcvttph2udq [xy]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX (5, 0x79): /* vcvtph2udq [xy]mm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w, X86_EXC_UD); if ( ea.type != OP_REG || !evex.brs ) avx512_vlen_check(false); @@ -7975,7 +8006,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(5, 0x79): /* vcvtph2uqq xmm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(5, 0x7a): /* vcvttph2qq xmm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(5, 0x7b): /* vcvtph2qq xmm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w, X86_EXC_UD); if ( ea.type != OP_REG || !evex.brs ) avx512_vlen_check(false); @@ -8012,7 +8043,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(6, 0xba): /* vfmsub231ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(6, 0xbc): /* vfnmadd231ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(6, 0xbe): /* vfnmsub231ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w, X86_EXC_UD); if ( ea.type != OP_REG || !evex.brs ) avx512_vlen_check(false); @@ -8034,7 +8065,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(6, 0xbb): /* vfmsub231sh xmm/m16,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(6, 0xbd): /* vfnmadd231sh xmm/m16,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(6, 0xbf): /* vfnmsub231sh xmm/m16,xmm,xmm{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w || (ea.type != OP_REG && evex.brs), X86_EXC_UD); if ( !evex.brs ) @@ -8043,13 +8074,13 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(6, 0x4c): /* vrcpph [xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(6, 0x4e): /* vrsqrtph [xyz]mm/mem,[xyz]mm{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w, X86_EXC_UD); goto avx512f_no_sae; case X86EMUL_OPC_EVEX_66(6, 0x4d): /* vrcpsh xmm/m16,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(6, 0x4f): /* vrsqrtsh xmm/m16,xmm,xmm{k} */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w || evex.brs, X86_EXC_UD); avx512_vlen_check(true); goto simd_zmm; @@ -8067,7 +8098,7 @@ x86_emulate( { unsigned int src1 = ~evex.reg; - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); generate_exception_if(evex.w || ((b & 1) && ea.type != OP_REG && evex.brs), X86_EXC_UD); if ( mode_64bit() ) --- a/xen/include/public/arch-x86/cpufeatureset.h +++ b/xen/include/public/arch-x86/cpufeatureset.h @@ -321,7 +321,7 @@ XEN_CPUFEATURE(AVX_VNNI_INT16, 15*32 XEN_CPUFEATURE(PREFETCHI, 15*32+14) /*A PREFETCHIT{0,1} Instructions */ XEN_CPUFEATURE(USER_MSR, 15*32+15) /*s U{RD,WR}MSR Instructions */ XEN_CPUFEATURE(CET_SSS, 15*32+18) /* CET Supervisor Shadow Stacks safe to use */ -XEN_CPUFEATURE(AVX10, 15*32+19) /* AVX10 Converged Vector ISA */ +XEN_CPUFEATURE(AVX10, 15*32+19) /*a AVX10 Converged Vector ISA */ /* Intel-defined CPU features, MSR_ARCH_CAPS 0x10a.eax, word 16 */ XEN_CPUFEATURE(RDCL_NO, 16*32+ 0) /*A No Rogue Data Cache Load (Meltdown) */ From 
patchwork Thu Jan 11 15:20:36 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13517529
Date: Thu, 11 Jan 2024 16:20:36 +0100
Subject: [PATCH 5/8] x86emul/test: use simd_check_avx512*() in main()
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper , Wei Liu , Roger Pau Monné
References: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>
Autocrypt: addr=jbeulich@suse.com; keydata=
hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com> In preparation for having these also cover AVX10, use the helper functions in preference of open-coded cpu_has_avx512* for those features that AVX10 includes. Introduce a couple further helper functions where they weren't previously needed. Note that this way simd_check_avx512f_sha_vl() gains an AVX512F check (which is likely benign) and simd_check_avx512bw_gf_vl() gains an AVX512BW check (which was clearly missing). Signed-off-by: Jan Beulich --- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -173,6 +173,11 @@ static bool simd_check_avx512vbmi_vl(voi return cpu_has_avx512_vbmi && cpu_has_avx512vl; } +static bool simd_check_avx512vbmi2(void) +{ + return cpu_has_avx512_vbmi2; +} + static bool simd_check_sse4_sha(void) { return cpu_has_sha && cpu_has_sse4_2; @@ -185,7 +190,7 @@ static bool simd_check_avx_sha(void) static bool simd_check_avx512f_sha_vl(void) { - return cpu_has_sha && cpu_has_avx512vl; + return cpu_has_sha && simd_check_avx512f_vl(); } static bool simd_check_avx2_vaes(void) @@ -195,13 +200,13 @@ static bool simd_check_avx2_vaes(void) static bool simd_check_avx512bw_vaes(void) { - return cpu_has_aesni && cpu_has_vaes && cpu_has_avx512bw; + return cpu_has_aesni && cpu_has_vaes && simd_check_avx512bw(); } static bool simd_check_avx512bw_vaes_vl(void) { return cpu_has_aesni && cpu_has_vaes && - cpu_has_avx512bw && cpu_has_avx512vl; + simd_check_avx512bw_vl(); } static bool simd_check_avx2_vpclmulqdq(void) @@ -211,22 +216,22 @@ static bool simd_check_avx2_vpclmulqdq(v static bool simd_check_avx512bw_vpclmulqdq(void) { - return cpu_has_vpclmulqdq && cpu_has_avx512bw; + return cpu_has_vpclmulqdq && simd_check_avx512bw(); } static bool simd_check_avx512bw_vpclmulqdq_vl(void) { - return cpu_has_vpclmulqdq && cpu_has_avx512bw && cpu_has_avx512vl; + return cpu_has_vpclmulqdq && simd_check_avx512bw_vl(); } static bool simd_check_avx512vbmi2_vpclmulqdq(void) { - return cpu_has_avx512_vbmi2 && simd_check_avx512bw_vpclmulqdq(); + return simd_check_avx512vbmi2() && simd_check_avx512bw_vpclmulqdq(); } static bool simd_check_avx512vbmi2_vpclmulqdq_vl(void) { - return cpu_has_avx512_vbmi2 && simd_check_avx512bw_vpclmulqdq_vl(); + return simd_check_avx512vbmi2() && simd_check_avx512bw_vpclmulqdq_vl(); } static bool simd_check_sse2_gf(void) @@ -241,12 +246,17 @@ static bool simd_check_avx2_gf(void) static bool simd_check_avx512bw_gf(void) { - return cpu_has_gfni && cpu_has_avx512bw; + return cpu_has_gfni && simd_check_avx512bw(); } static bool simd_check_avx512bw_gf_vl(void) { - return cpu_has_gfni && cpu_has_avx512vl; + return cpu_has_gfni && simd_check_avx512bw_vl(); +} + +static bool simd_check_avx512vnni(void) +{ + return 
cpu_has_avx512_vnni; } static bool simd_check_avx512fp16(void) @@ -3116,7 +3126,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovq %xmm1,32(%edx)..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovq_to_mem); @@ -3140,7 +3150,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovq 32(%edx),%xmm0..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovq_from_mem); @@ -3263,7 +3273,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovdqu32 %zmm2,(%ecx){%k1}..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(vmovdqu32_to_mem); @@ -3293,7 +3303,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovdqu32 64(%edx),%zmm2{%k2}..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(vmovdqu32_from_mem); @@ -3318,7 +3328,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovdqu16 %zmm3,(%ecx){%k1}..."); - if ( stack_exec && cpu_has_avx512bw ) + if ( stack_exec && simd_check_avx512bw() ) { decl_insn(vmovdqu16_to_mem); @@ -3350,7 +3360,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovdqu16 64(%edx),%zmm3{%k2}..."); - if ( stack_exec && cpu_has_avx512bw ) + if ( stack_exec && simd_check_avx512bw() ) { decl_insn(vmovdqu16_from_mem); @@ -3478,7 +3488,7 @@ int main(int argc, char **argv) printf("%-40s", "Testing vmovsd %xmm5,16(%ecx){%k3}..."); memset(res, 0x88, 128); memset(res + 20, 0x77, 8); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(vmovsd_masked_to_mem); @@ -3513,7 +3523,7 @@ int main(int argc, char **argv) } printf("%-40s", "Testing vmovaps (%edx),%zmm7{%k3}{z}..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(vmovaps_masked_from_mem); @@ -3696,7 +3706,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd %xmm3,32(%ecx)..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovd_to_mem); @@ -3721,7 +3731,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd 32(%ecx),%xmm4..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovd_from_mem); @@ -3911,7 +3921,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd %xmm2,%ebx..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovd_to_reg); @@ -3937,7 +3947,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd %ebx,%xmm1..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovd_from_reg); @@ -4039,7 +4049,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovq %xmm11,32(%ecx)..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovq_to_mem2); @@ -4129,7 +4139,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovq %xmm22,%rbx..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { 
decl_insn(evex_vmovq_to_reg); @@ -4322,7 +4332,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovntdqa 64(%ecx),%zmm4..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovntdqa); @@ -4918,7 +4928,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vcvtph2ps 32(%ecx),%zmm7{%k4}..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vcvtph2ps); decl_insn(evex_vcvtps2ph); @@ -4961,7 +4971,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vfixupimmpd $0,8(%edx){1to8},%zmm3,%zmm4..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(vfixupimmpd); static const struct { @@ -5000,7 +5010,7 @@ int main(int argc, char **argv) printf("%-40s", "Testing vfpclasspsz $0x46,64(%edx),%k2..."); - if ( stack_exec && cpu_has_avx512dq ) + if ( stack_exec && simd_check_avx512dq() ) { decl_insn(vfpclassps); @@ -5032,7 +5042,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vfpclassphz $0x46,128(%ecx),%k3..."); - if ( stack_exec && cpu_has_avx512_fp16 ) + if ( stack_exec && simd_check_avx512fp16() ) { decl_insn(vfpclassph); @@ -5075,7 +5085,7 @@ int main(int argc, char **argv) * on the mapping boundaries) that elements controlled by clear mask * bits don't get accessed. */ - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(vpcompressd); decl_insn(vpcompressq); @@ -5177,7 +5187,7 @@ int main(int argc, char **argv) } #if __GNUC__ > 7 /* can't check for __AVX512VBMI2__ here */ - if ( stack_exec && cpu_has_avx512_vbmi2 ) + if ( stack_exec && simd_check_avx512vbmi2() ) { decl_insn(vpcompressb); decl_insn(vpcompressw); @@ -5440,7 +5450,7 @@ int main(int argc, char **argv) } printf("%-40s", "Testing vpdpwssd (%ecx),%{y,z}mmA,%{y,z}mmB..."); - if ( stack_exec && cpu_has_avx512_vnni && cpu_has_avx_vnni ) + if ( stack_exec && simd_check_avx512vnni() && cpu_has_avx_vnni ) { /* Do the same operation two ways and compare the results. 
*/ decl_insn(vpdpwssd_vex1); @@ -5495,7 +5505,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovsh 8(%ecx),%xmm5..."); - if ( stack_exec && cpu_has_avx512_fp16 ) + if ( stack_exec && simd_check_avx512fp16() ) { decl_insn(vmovsh_from_mem); decl_insn(vmovw_to_gpr); From patchwork Thu Jan 11 15:21:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 13517530 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A8846C47077 for ; Thu, 11 Jan 2024 15:21:29 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.666370.1036962 (Exim 4.92) (envelope-from ) id 1rNwrs-00047p-C8; Thu, 11 Jan 2024 15:21:16 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 666370.1036962; Thu, 11 Jan 2024 15:21:16 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1rNwrs-00047i-9B; Thu, 11 Jan 2024 15:21:16 +0000 Received: by outflank-mailman (input) for mailman id 666370; Thu, 11 Jan 2024 15:21:15 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1rNwrr-00045k-Gf for xen-devel@lists.xenproject.org; Thu, 11 Jan 2024 15:21:15 +0000 Received: from mail-wm1-x32a.google.com (mail-wm1-x32a.google.com [2a00:1450:4864:20::32a]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 0e502da4-b095-11ee-9b0f-b553b5be7939; Thu, 11 Jan 2024 16:21:13 +0100 (CET) Received: by mail-wm1-x32a.google.com with SMTP id 5b1f17b1804b1-40e6275e9beso1903275e9.1 for ; Thu, 11 Jan 2024 07:21:13 -0800 (PST) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. 
Date: Thu, 11 Jan 2024 16:21:12 +0100
Subject: [PATCH 6/8] x86emul/test: drop cpu_has_avx512vl
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper , Wei Liu , Roger Pau Monné
References: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>
Autocrypt: addr=jbeulich@suse.com; keydata=
hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com> AVX512VL not being a standalone feature anyway, but always needing to be combined with some other AVX512*, replace uses of cpu_has_avx512vl by just the feature bit check. Signed-off-by: Jan Beulich --- a/tools/tests/x86_emulator/evex-disp8.c +++ b/tools/tests/x86_emulator/evex-disp8.c @@ -1031,7 +1031,8 @@ static void test_group(const struct test { for ( j = 0; j < nr_vl; ++j ) { - if ( vl[0] == VL_512 && vl[j] != VL_512 && !cpu_has_avx512vl ) + if ( vl[0] == VL_512 && vl[j] != VL_512 && + !cpu_policy.feat.avx512vl ) continue; switch ( tests[i].esz ) --- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -132,7 +132,7 @@ static bool simd_check_avx512f(void) static bool simd_check_avx512f_vl(void) { - return cpu_has_avx512f && cpu_has_avx512vl; + return cpu_has_avx512f && cpu_policy.feat.avx512vl; } #define simd_check_avx512vl_sg simd_check_avx512f_vl @@ -144,7 +144,7 @@ static bool simd_check_avx512dq(void) static bool simd_check_avx512dq_vl(void) { - return cpu_has_avx512dq && cpu_has_avx512vl; + return cpu_has_avx512dq && cpu_policy.feat.avx512vl; } static bool simd_check_avx512er(void) @@ -160,7 +160,7 @@ static bool simd_check_avx512bw(void) static bool simd_check_avx512bw_vl(void) { - return cpu_has_avx512bw && cpu_has_avx512vl; + return cpu_has_avx512bw && cpu_policy.feat.avx512vl; } static bool simd_check_avx512vbmi(void) @@ -170,7 +170,7 @@ static bool simd_check_avx512vbmi(void) static bool simd_check_avx512vbmi_vl(void) { - return cpu_has_avx512_vbmi && cpu_has_avx512vl; + return cpu_has_avx512_vbmi && cpu_policy.feat.avx512vl; } static bool simd_check_avx512vbmi2(void) @@ -266,7 +266,7 @@ static bool simd_check_avx512fp16(void) static bool simd_check_avx512fp16_vl(void) { - return cpu_has_avx512_fp16 && cpu_has_avx512vl; + return cpu_has_avx512_fp16 && cpu_policy.feat.avx512vl; } static void simd_set_regs(struct cpu_user_regs *regs) --- a/tools/tests/x86_emulator/x86-emulate.h +++ b/tools/tests/x86_emulator/x86-emulate.h @@ -160,7 +160,6 @@ void wrpkru(unsigned int val); #define cpu_has_avx512cd (cpu_policy.feat.avx512cd && xcr0_mask(0xe6)) #define cpu_has_sha cpu_policy.feat.sha #define cpu_has_avx512bw (cpu_policy.feat.avx512bw && xcr0_mask(0xe6)) -#define cpu_has_avx512vl (cpu_policy.feat.avx512vl && xcr0_mask(0xe6)) #define cpu_has_avx512_vbmi (cpu_policy.feat.avx512_vbmi && xcr0_mask(0xe6)) #define cpu_has_avx512_vbmi2 (cpu_policy.feat.avx512_vbmi2 && xcr0_mask(0xe6)) #define cpu_has_gfni cpu_policy.feat.gfni From patchwork Thu Jan 11 15:21:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 13517531 Return-Path: 
Date: Thu, 11 Jan 2024 16:21:33 +0100
Subject: [PATCH 7/8] x86emul: AVX10.1 testing
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper , Wei Liu , Roger Pau Monné
References: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>
In-Reply-To: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>

Re-use the respective AVX512 tests, by suitably adjusting the predicate
functions. This leaves test names ("Testing ... NN-bit code sequence")
somewhat misleading, but I think we can live with that.

Note that the AVX512{BW,DQ} opmask tests cannot be run as-is for the
AVX10/256 case, as they include 512-bit vector <-> opmask insn tests.
Sadly, until a newer SDE version (matching ISE 050 or newer) is
available, a workaround is necessary to be able to run the test harness
on SDE 9.27.0.

Signed-off-by: Jan Beulich
---
SDE: -gnr / -gnr256
---
TBD: For AVX10.1/256 we need to somehow guarantee that the generated
blobs really don't use 512-bit insns (it's uncertain whether passing
-mprefer-vector-width= is enough). Right now, according to my testing
on SDE, this is all fine. We may need to probe for support of the new
-mno-evex512 compiler option.

The AVX512{BW,DQ} opmask tests could of course be cloned (i.e. rebuilt
another time with -mavx512vl passed) accordingly, but the coverage gain
would be pretty marginal (plus there would again be issues with SDE
9.27.0).
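E.g. such a probe might take roughly the following shape in
testcase.mk, appended after the CFLAGS-VSZ selection (a rough sketch
only, assuming the cc-options-add helper used there adds an option only
when the compiler actually accepts it):

ifneq ($(strip $(CFLAGS-VSZ)),)
# Only 128-/256-bit vectors wanted: additionally forbid EVEX512 code
# generation, where the compiler knows the (new) option; no-op otherwise.
$(call cc-options-add,CFLAGS-VSZ,CC,-mno-evex512)
endif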
--- a/tools/tests/x86_emulator/evex-disp8.c +++ b/tools/tests/x86_emulator/evex-disp8.c @@ -1032,7 +1032,11 @@ static void test_group(const struct test for ( j = 0; j < nr_vl; ++j ) { if ( vl[0] == VL_512 && vl[j] != VL_512 && - !cpu_policy.feat.avx512vl ) + !cpu_policy.feat.avx512vl && !cpu_policy.feat.avx10 ) + continue; + + if ( vl[j] == VL_512 && !cpu_policy.feat.avx512f && + !cpu_policy.avx10.vsz512 ) continue; switch ( tests[i].esz ) @@ -1083,6 +1087,27 @@ static void test_group(const struct test } } +/* AVX512 (sub)features implied by AVX10. */ +#define avx10_has_avx512f true +#define avx10_has_avx512bw true +#define avx10_has_avx512cd true +#define avx10_has_avx512dq true +#define avx10_has_avx512_bf16 true +#define avx10_has_avx512_bitalg true +#define avx10_has_avx512_fp16 true +#define avx10_has_avx512_ifma true +#define avx10_has_avx512_vbmi true +#define avx10_has_avx512_vbmi2 true +#define avx10_has_avx512_vnni true +#define avx10_has_avx512_vpopcntdq true + +/* AVX512 sub-features /not/ implied by AVX10. */ +#define avx10_has_avx512er false +#define avx10_has_avx512pf false +#define avx10_has_avx512_4fmaps false +#define avx10_has_avx512_4vnniw false +#define avx10_has_avx512_vp2intersect false + void evex_disp8_test(void *instr, struct x86_emulate_ctxt *ctxt, const struct x86_emulate_ops *ops) { @@ -1090,8 +1115,8 @@ void evex_disp8_test(void *instr, struct emulops.read = read; emulops.write = write; -#define RUN(feat, vl) do { \ - if ( cpu_has_##feat ) \ +#define run(cond, feat, vl) do { \ + if ( cond ) \ { \ printf("%-40s", "Testing " #feat "/" #vl " disp8 handling..."); \ test_group(feat ## _ ## vl, ARRAY_SIZE(feat ## _ ## vl), \ @@ -1100,6 +1125,12 @@ void evex_disp8_test(void *instr, struct } \ } while ( false ) +#define RUN(feat, vl) \ + run(cpu_has_ ## feat || \ + (cpu_has_avx10_1 && cpu_policy.avx10.vsz256 && avx10_has_ ## feat && \ + (ARRAY_SIZE(vl_ ## vl) > 1 || &vl_ ## vl[0] != &vl_512[0])), \ + feat, vl) + RUN(avx512f, all); RUN(avx512f, 128); RUN(avx512f, no128); @@ -1127,10 +1158,15 @@ void evex_disp8_test(void *instr, struct RUN(avx512_fp16, all); RUN(avx512_fp16, 128); - if ( cpu_has_avx512f ) +#undef RUN + + if ( cpu_has_avx512f || cpu_has_avx10_1 ) { +#define RUN(feat, vl) run(cpu_has_ ## feat, feat, vl) RUN(gfni, all); RUN(vaes, all); RUN(vpclmulqdq, all); +#undef RUN } +#undef run } --- a/tools/tests/x86_emulator/testcase.mk +++ b/tools/tests/x86_emulator/testcase.mk @@ -4,7 +4,27 @@ include $(XEN_ROOT)/tools/Rules.mk $(call cc-options-add,CFLAGS,CC,$(EMBEDDED_EXTRA_CFLAGS)) -CFLAGS += -fno-builtin -g0 $($(TESTCASE)-cflags) +ifneq ($(filter -mavx512%,$($(TESTCASE)-cflags)),) + +cflags-vsz64 := +cflags-vsz32 := -mprefer-vector-width=256 +cflags-vsz16 := -mprefer-vector-width=128 +# Scalar tests don't set VEC_SIZE (and VEC_MAX is used by S/G ones only) +cflags-vsz := -mprefer-vector-width=128 + +ifneq ($(filter -DVEC_SIZE=%,$($(TESTCASE)-cflags)),) +CFLAGS-VSZ := $(cflags-vsz$(patsubst -DVEC_SIZE=%,%,$(filter -DVEC_SIZE=%,$($(TESTCASE)-cflags)))) +else +CFLAGS-VSZ := $(cflags-vsz$(patsubst -DVEC_MAX=%,%,$(filter -DVEC_MAX=%,$($(TESTCASE)-cflags)))) +endif + +else + +CFLAGS-VSZ := + +endif + +CFLAGS += -fno-builtin -g0 $($(TESTCASE)-cflags) $(CFLAGS-VSZ) LDFLAGS_DIRECT += $(shell { $(LD) -v --warn-rwx-segments; } >/dev/null 2>&1 && echo --no-warn-rwx-segments) --- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -125,26 +125,33 @@ static bool simd_check_avx_pclmul(void) static bool simd_check_avx512f(void) 
{ - return cpu_has_avx512f; + return cpu_has_avx512f || cpu_has_avx10_1_512; } -#define simd_check_avx512f_opmask simd_check_avx512f #define simd_check_avx512f_sg simd_check_avx512f +static bool simd_check_avx512f_sc(void) +{ + return cpu_has_avx512f || cpu_has_avx10_1; +} +#define simd_check_avx512f_opmask simd_check_avx512f_sc + static bool simd_check_avx512f_vl(void) { - return cpu_has_avx512f && cpu_policy.feat.avx512vl; + return (cpu_has_avx512f && cpu_policy.feat.avx512vl) || + cpu_has_avx10_1_256; } #define simd_check_avx512vl_sg simd_check_avx512f_vl static bool simd_check_avx512dq(void) { - return cpu_has_avx512dq; + return cpu_has_avx512dq || cpu_has_avx10_1_512; } #define simd_check_avx512dq_opmask simd_check_avx512dq static bool simd_check_avx512dq_vl(void) { - return cpu_has_avx512dq && cpu_policy.feat.avx512vl; + return (cpu_has_avx512dq && cpu_policy.feat.avx512vl) || + cpu_has_avx10_1_256; } static bool simd_check_avx512er(void) @@ -154,28 +161,30 @@ static bool simd_check_avx512er(void) static bool simd_check_avx512bw(void) { - return cpu_has_avx512bw; + return cpu_has_avx512bw || cpu_has_avx10_1_512; } #define simd_check_avx512bw_opmask simd_check_avx512bw static bool simd_check_avx512bw_vl(void) { - return cpu_has_avx512bw && cpu_policy.feat.avx512vl; + return (cpu_has_avx512bw && cpu_policy.feat.avx512vl) || + cpu_has_avx10_1_256; } static bool simd_check_avx512vbmi(void) { - return cpu_has_avx512_vbmi; + return cpu_has_avx512_vbmi || cpu_has_avx10_1_512; } static bool simd_check_avx512vbmi_vl(void) { - return cpu_has_avx512_vbmi && cpu_policy.feat.avx512vl; + return (cpu_has_avx512_vbmi && cpu_policy.feat.avx512vl) || + cpu_has_avx10_1_256; } static bool simd_check_avx512vbmi2(void) { - return cpu_has_avx512_vbmi2; + return cpu_has_avx512_vbmi2 || cpu_has_avx10_1_512; } static bool simd_check_sse4_sha(void) @@ -256,17 +265,23 @@ static bool simd_check_avx512bw_gf_vl(vo static bool simd_check_avx512vnni(void) { - return cpu_has_avx512_vnni; + return cpu_has_avx512_vnni || cpu_has_avx10_1_512; } static bool simd_check_avx512fp16(void) { - return cpu_has_avx512_fp16; + return cpu_has_avx512_fp16 || cpu_has_avx10_1_512; +} + +static bool simd_check_avx512fp16_sc(void) +{ + return cpu_has_avx512_fp16 || cpu_has_avx10_1; } static bool simd_check_avx512fp16_vl(void) { - return cpu_has_avx512_fp16 && cpu_policy.feat.avx512vl; + return (cpu_has_avx512_fp16 && cpu_policy.feat.avx512vl) || + cpu_has_avx10_1_256; } static void simd_set_regs(struct cpu_user_regs *regs) @@ -439,9 +454,13 @@ static const struct { SIMD(OPMASK+DQ/w, avx512dq_opmask, 2), SIMD(OPMASK+BW/d, avx512bw_opmask, 4), SIMD(OPMASK+BW/q, avx512bw_opmask, 8), - SIMD(AVX512F f32 scalar, avx512f, f4), +#define avx512f_sc_x86_32_D_f4 avx512f_x86_32_D_f4 +#define avx512f_sc_x86_64_D_f4 avx512f_x86_64_D_f4 + SIMD(AVX512F f32 scalar, avx512f_sc, f4), SIMD(AVX512F f32x16, avx512f, 64f4), - SIMD(AVX512F f64 scalar, avx512f, f8), +#define avx512f_sc_x86_32_D_f8 avx512f_x86_32_D_f8 +#define avx512f_sc_x86_64_D_f8 avx512f_x86_64_D_f8 + SIMD(AVX512F f64 scalar, avx512f_sc, f8), SIMD(AVX512F f64x8, avx512f, 64f8), SIMD(AVX512F s32x16, avx512f, 64i4), SIMD(AVX512F u32x16, avx512f, 64u4), @@ -533,7 +552,9 @@ static const struct { AVX512VL(_VBMI+VL u16x8, avx512vbmi, 16u2), AVX512VL(_VBMI+VL s16x16, avx512vbmi, 32i2), AVX512VL(_VBMI+VL u16x16, avx512vbmi, 32u2), - SIMD(AVX512_FP16 f16 scal,avx512fp16, f2), +#define avx512fp16_sc_x86_32_D_f2 avx512fp16_x86_32_D_f2 +#define avx512fp16_sc_x86_64_D_f2 avx512fp16_x86_64_D_f2 + 
SIMD(AVX512_FP16 f16 scal,avx512fp16_sc, f2), SIMD(AVX512_FP16 f16x32, avx512fp16, 64f2), AVX512VL(_FP16+VL f16x8, avx512fp16, 16f2), AVX512VL(_FP16+VL f16x16,avx512fp16, 32f2), @@ -3126,7 +3147,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovq %xmm1,32(%edx)..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovq_to_mem); @@ -3150,7 +3171,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovq 32(%edx),%xmm0..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovq_from_mem); @@ -3162,11 +3183,22 @@ int main(int argc, char **argv) rc = x86_emulate(&ctxt, &emulops); if ( rc != X86EMUL_OKAY || !check_eip(evex_vmovq_from_mem) ) goto fail; - asm ( "vmovq %1, %%xmm1\n\t" - "vpcmpeqq %%zmm0, %%zmm1, %%k0\n" - "kmovw %%k0, %0" : "=r" (rc) : "m" (res[8]) ); - if ( rc != 0xff ) - goto fail; + if ( simd_check_avx512f() ) + { + asm ( "vmovq %1, %%xmm1\n\t" + "vpcmpeqq %%zmm0, %%zmm1, %%k0\n" + "kmovw %%k0, %0" : "=r" (rc) : "m" (res[8]) ); + if ( rc != 0x00ff ) + goto fail; + } + else + { + asm ( "vmovq %1, %%xmm1\n\t" + "vpcmpeqq %%xmm0, %%xmm1, %%k0\n" + "kmovb %%k0, %0" : "=r" (rc) : "m" (res[8]) ); + if ( rc != 0x03 ) + goto fail; + } printf("okay\n"); } else @@ -3488,7 +3520,7 @@ int main(int argc, char **argv) printf("%-40s", "Testing vmovsd %xmm5,16(%ecx){%k3}..."); memset(res, 0x88, 128); memset(res + 20, 0x77, 8); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(vmovsd_masked_to_mem); @@ -3706,7 +3738,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd %xmm3,32(%ecx)..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovd_to_mem); @@ -3731,7 +3763,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd 32(%ecx),%xmm4..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovd_from_mem); @@ -3744,11 +3776,22 @@ int main(int argc, char **argv) rc = x86_emulate(&ctxt, &emulops); if ( rc != X86EMUL_OKAY || !check_eip(evex_vmovd_from_mem) ) goto fail; - asm ( "vmovd %1, %%xmm0\n\t" - "vpcmpeqd %%zmm4, %%zmm0, %%k0\n\t" - "kmovw %%k0, %0" : "=r" (rc) : "m" (res[8]) ); - if ( rc != 0xffff ) - goto fail; + if ( simd_check_avx512f() ) + { + asm ( "vmovd %1, %%xmm0\n\t" + "vpcmpeqd %%zmm4, %%zmm0, %%k0\n\t" + "kmovw %%k0, %0" : "=r" (rc) : "m" (res[8]) ); + if ( rc != 0xffff ) + goto fail; + } + else + { + asm ( "vmovd %1, %%xmm0\n\t" + "vpcmpeqd %%xmm4, %%xmm0, %%k0\n\t" + "kmovb %%k0, %0" : "=r" (rc) : "m" (res[8]) ); + if ( rc != 0x0f ) + goto fail; + } printf("okay\n"); } else @@ -3921,7 +3964,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd %xmm2,%ebx..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovd_to_reg); @@ -3947,7 +3990,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd %ebx,%xmm1..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovd_from_reg); @@ -3961,11 +4004,22 @@ int main(int argc, char **argv) rc = x86_emulate(&ctxt, &emulops); if ( (rc != X86EMUL_OKAY) || !check_eip(evex_vmovd_from_reg) ) goto 
fail; - asm ( "vmovd %1, %%xmm0\n\t" - "vpcmpeqd %%zmm1, %%zmm0, %%k0\n\t" - "kmovw %%k0, %0" : "=r" (rc) : "m" (res[8]) ); - if ( rc != 0xffff ) - goto fail; + if ( simd_check_avx512f() ) + { + asm ( "vmovd %1, %%xmm0\n\t" + "vpcmpeqd %%zmm1, %%zmm0, %%k0\n\t" + "kmovw %%k0, %0" : "=r" (rc) : "m" (res[8]) ); + if ( rc != 0xffff ) + goto fail; + } + else + { + asm ( "vmovd %1, %%xmm0\n\t" + "vpcmpeqd %%xmm1, %%xmm0, %%k0\n\t" + "kmovb %%k0, %0" : "=r" (rc) : "m" (res[8]) ); + if ( rc != 0x0f ) + goto fail; + } printf("okay\n"); } else @@ -4049,7 +4103,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovq %xmm11,32(%ecx)..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovq_to_mem2); @@ -4139,7 +4193,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovq %xmm22,%rbx..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovq_to_reg); @@ -5505,7 +5559,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovsh 8(%ecx),%xmm5..."); - if ( stack_exec && simd_check_avx512fp16() ) + if ( stack_exec && simd_check_avx512fp16_sc() ) { decl_insn(vmovsh_from_mem); decl_insn(vmovw_to_gpr); @@ -5523,14 +5577,28 @@ int main(int argc, char **argv) rc = x86_emulate(&ctxt, &emulops); if ( (rc != X86EMUL_OKAY) || !check_eip(vmovsh_from_mem) ) goto fail; - asm volatile ( "kmovw %2, %%k1\n\t" - "vmovdqu16 %1, %%zmm4%{%%k1%}%{z%}\n\t" - "vpcmpeqw %%zmm4, %%zmm5, %%k0\n\t" - "kmovw %%k0, %0" - : "=g" (rc) - : "m" (res[2]), "r" (1) ); - if ( rc != 0xffff ) - goto fail; + if ( simd_check_avx512fp16() ) + { + asm volatile ( "kmovw %2, %%k1\n\t" + "vmovdqu16 %1, %%zmm4%{%%k1%}%{z%}\n\t" + "vpcmpeqw %%zmm4, %%zmm5, %%k0\n\t" + "kmovw %%k0, %0" + : "=g" (rc) + : "m" (res[2]), "r" (1) ); + if ( rc != 0xffff ) + goto fail; + } + else + { + asm volatile ( "kmovb %2, %%k1\n\t" + "vmovdqu16 %1, %%xmm4%{%%k1%}%{z%}\n\t" + "vpcmpeqw %%xmm4, %%xmm5, %%k0\n\t" + "kmovb %%k0, %0" + : "=g" (rc) + : "m" (res[2]), "r" (1) ); + if ( rc != 0xff ) + goto fail; + } printf("okay\n"); printf("%-40s", "Testing vmovsh %xmm4,2(%eax){%k3}..."); --- a/tools/tests/x86_emulator/x86-emulate.c +++ b/tools/tests/x86_emulator/x86-emulate.c @@ -243,7 +243,7 @@ int emul_test_get_fpu( break; case X86EMUL_FPU_opmask: case X86EMUL_FPU_zmm: - if ( cpu_has_avx512f ) + if ( cpu_has_avx512f || cpu_has_avx10_1 ) break; default: return X86EMUL_UNHANDLEABLE; --- a/tools/tests/x86_emulator/x86-emulate.h +++ b/tools/tests/x86_emulator/x86-emulate.h @@ -185,6 +185,12 @@ void wrpkru(unsigned int val); #define cpu_has_avx_vnni_int8 (cpu_policy.feat.avx_vnni_int8 && xcr0_mask(6)) #define cpu_has_avx_ne_convert (cpu_policy.feat.avx_ne_convert && xcr0_mask(6)) #define cpu_has_avx_vnni_int16 (cpu_policy.feat.avx_vnni_int16 && xcr0_mask(6)) + /* TBD: Is bit 6 (ZMM_Hi256) really needed here? */ +#define cpu_has_avx10_1 (cpu_policy.feat.avx10 && xcr0_mask(0xe6)) +#define cpu_has_avx10_1_256 (cpu_has_avx10_1 && \ + (cpu_policy.avx10.vsz256 || \ + cpu_policy.avx10.vsz512)) +#define cpu_has_avx10_1_512 (cpu_has_avx10_1 && cpu_policy.avx10.vsz512) #define cpu_has_xgetbv1 (cpu_has_xsave && cpu_policy.xstate.xgetbv1) --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -1396,6 +1396,14 @@ x86_emulate( stb[2] = cp->feat.avx512bw || cp->feat.avx10 ? 
0xf8 /* L0.NP.W1 - kmovq */ : 0x78 /* L0.NP.W0 - kmovw */; +#ifndef __XEN__ + /* + * SDE 9.27.0 is following ISE 049, where 64-bit opmask insns were + * valid only with vsz512. + */ + if ( cp->feat.avx10 && !cp->avx10.vsz512 ) + stb[2] = 0xf9 /* L0.66.W1 - kmovd */; +#endif stb[3] = 0x91; stb[4] = evex.opmsk << 3; insn_bytes = 5;
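To make the reworked RUN()/run() wrappers in evex-disp8.c easier to follow:
the avx10_has_<feature> #defines are consumed purely by token pasting, so one
macro can decide both "is the feature there natively" and "does AVX10 imply
it". A stand-alone toy version of that pattern (invented names and values;
the real macro additionally excludes 512-bit-only test groups when only
256-bit vectors are available):

#include <stdbool.h>
#include <stdio.h>

/* Invented stand-ins for the harness' feature predicates. */
static const bool cpu_has_avx512bw = false;   /* not advertised natively */
static const bool cpu_has_avx512er = false;
static const bool cpu_has_avx10_256 = true;   /* AVX10.1, 256-bit vectors */

/* Which AVX512 (sub)features AVX10 implies -- mirrors the patch's table. */
#define avx10_has_avx512bw true
#define avx10_has_avx512er false              /* ER is not part of AVX10 */

/* Run a test group when the feature is available natively or via AVX10. */
#define RUN(feat) do { \
    if ( cpu_has_##feat || (cpu_has_avx10_256 && avx10_has_##feat) ) \
        printf("running  %s tests\n", #feat); \
    else \
        printf("skipping %s tests\n", #feat); \
} while ( false )

int main(void)
{
    RUN(avx512bw);   /* runs: implied by AVX10 */
    RUN(avx512er);   /* skipped: not implied */
    return 0;
}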
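The expected opmask values in the new non-AVX512F verification paths (0x03,
0x0f, 0xff) follow directly from the lane count of the compare: each vpcmpeq
sets one mask bit per element, so an all-equal result is
(1 << (vector bytes / element bytes)) - 1. A small illustrative helper
confirming the constants used above (not part of the harness):

#include <stdio.h>

static unsigned int all_equal_mask(unsigned int vec_bytes,
                                   unsigned int elem_bytes)
{
    return (1u << (vec_bytes / elem_bytes)) - 1;
}

int main(void)
{
    printf("vpcmpeqq zmm: %#x\n", all_equal_mask(64, 8));   /* 0xff   */
    printf("vpcmpeqq xmm: %#x\n", all_equal_mask(16, 8));   /* 0x3    */
    printf("vpcmpeqd zmm: %#x\n", all_equal_mask(64, 4));   /* 0xffff */
    printf("vpcmpeqd xmm: %#x\n", all_equal_mask(16, 4));   /* 0xf    */
    printf("vpcmpeqw xmm: %#x\n", all_equal_mask(16, 2));   /* 0xff   */
    return 0;
}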
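The 0xe6 in the new cpu_has_avx10_1 check in x86-emulate.h is simply the set
of XSAVE state components needed for EVEX code: SSE, AVX, the opmask
registers, and both upper-ZMM components (the in-patch TBD questions whether
ZMM_Hi256 is strictly needed for a 256-bit-only implementation). A quick
compile-time check of that arithmetic, with locally defined bit names chosen
for illustration:

/* XCR0 state-component bits (architectural positions; named locally). */
#define XSTATE_SSE      (1u << 1)
#define XSTATE_YMM      (1u << 2)
#define XSTATE_OPMASK   (1u << 5)
#define XSTATE_ZMM_HI   (1u << 6)   /* ZMM_Hi256 */
#define XSTATE_HI_ZMM   (1u << 7)   /* Hi16_ZMM */

/* 0xe6 == SSE | YMM | OPMASK | ZMM_Hi256 | Hi16_ZMM */
_Static_assert((XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK |
                XSTATE_ZMM_HI | XSTATE_HI_ZMM) == 0xe6,
               "xcr0_mask(0xe6) covers the expected components");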
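Finally, the x86_emulate.c hunk: the test stub normally spills the opmask
register with kmovq (or kmovw when neither AVX512BW nor AVX10 is available),
but SDE 9.27.0 implements ISE 049 semantics, where 64-bit opmask insns are
valid only with vsz512, so the harness-only (#ifndef __XEN__) path falls back
to kmovd for AVX10 without 512-bit vector support. The combined effect,
written out as a plain function (simplified; not the emulator's code):

#include <stdbool.h>

/* Opcode byte selected for the opmask spill in the test stub. */
static unsigned char kmov_opcode(bool avx512bw, bool avx10, bool vsz512)
{
    if ( !avx512bw && !avx10 )
        return 0x78;   /* L0.NP.W0 - kmovw */
    if ( avx10 && !vsz512 )
        return 0xf9;   /* L0.66.W1 - kmovd (SDE 9.27.0 / ISE 049 workaround) */
    return 0xf8;       /* L0.NP.W1 - kmovq */
}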
From patchwork Thu Jan 11 15:22:10 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13517532
Message-ID: <837da45e-c5e2-4327-996a-13abf962adc8@suse.com>
Date: Thu, 11 Jan 2024 16:22:10 +0100
Subject: [PATCH 8/8] x86emul/test: engage AVX512VL via command line option
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
References: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>
In-Reply-To: <298db76f-d0ee-4d47-931f-1baa1a7546cf@suse.com>

Now that we have machinery in testcase.mk to set vector-length-dependent
flags for AVX512 tests, avoid using a pragma to enable AVX512VL insns for
the compiler. This way, the correct settings are in place from the very
beginning of compilation.

No change to the generated test blobs, and hence no functional change.

Signed-off-by: Jan Beulich

--- a/tools/tests/x86_emulator/simd.h
+++ b/tools/tests/x86_emulator/simd.h
@@ -215,10 +215,6 @@ DECL_OCTET(half);
 # define __builtin_ia32_shuf_i32x4_512_mask __builtin_ia32_shuf_i32x4_mask
 # define __builtin_ia32_shuf_i64x2_512_mask __builtin_ia32_shuf_i64x2_mask
 
-# if VEC_SIZE > ELEM_SIZE && (defined(VEC_MAX) ? VEC_MAX : VEC_SIZE) < 64
-# pragma GCC target ( "avx512vl" )
-# endif
-
 # define REN(insn, old, new) \
     asm ( ".macro v" #insn #old " o:vararg \n\t" \
           "v" #insn #new " \\o \n\t" \
--- a/tools/tests/x86_emulator/testcase.mk
+++ b/tools/tests/x86_emulator/testcase.mk
@@ -7,8 +7,8 @@ $(call cc-options-add,CFLAGS,CC,$(EMBEDD
 ifneq ($(filter -mavx512%,$($(TESTCASE)-cflags)),)
 
 cflags-vsz64 :=
-cflags-vsz32 := -mprefer-vector-width=256
-cflags-vsz16 := -mprefer-vector-width=128
+cflags-vsz32 := -mavx512vl -mprefer-vector-width=256
+cflags-vsz16 := -mavx512vl -mprefer-vector-width=128
 # Scalar tests don't set VEC_SIZE (and VEC_MAX is used by S/G ones only)
 cflags-vsz := -mprefer-vector-width=128
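For context on what the removed pragma did: #pragma GCC target augments the
ISA available to the code following it, which is what previously enabled the
128/256-bit EVEX (VL) encodings inside the AVX512 test blobs. This patch
instead passes -mavx512vl on the command line per test case, so the whole
translation unit is built with consistent settings. A tiny stand-alone
illustration of the pragma mechanism (hypothetical file, unrelated to the
harness sources; function and parameter names are invented):

#include <immintrin.h>

#pragma GCC push_options
#pragma GCC target ("avx512f,avx512vl")

/* With AVX512VL in effect, masked 256-bit operations like this one can be
 * encoded; without VL only the 512-bit EVEX forms would be available. */
__m256i masked_add(__m256i x, __m256i y, __mmask8 m)
{
    return _mm256_maskz_add_epi32(m, x, y);
}

#pragma GCC pop_options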