From patchwork Wed Dec 11 10:11:16 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13903284
Message-ID: <0b543263-2c66-4e35-a822-09c0b6ca016a@suse.com>
Date: Wed, 11 Dec 2024 11:11:16 +0100
Subject: [PATCH v3 01/16] x86/CPUID: enable AVX10 leaf
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>
In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>

This requires bumping the number of basic leaves we support. Apart from this the logic is modeled as closely as possible after that of leaf 7 handling.

Signed-off-by: Jan Beulich
---
The gen-cpuid.py adjustment is merely the minimum needed. It's not really clear to me whether someone turning off e.g. AVX512BW might then also validly expect AVX10 to be turned off.

Spec version 2 leaves unclear what the xstate components are which would need enabling for AVX10/256. recalculate_{xstate,misc}() are therefore conservative for now.

Do we want to synthesize AVX10 in the policy when all necessary AVX512* features are available, thus allowing migration from an AVX10 host to a suitable non-AVX10 one?

The prior vsz128 bit is now defined as reserved-at-1: No idea yet how to represent this.

What a toolstack side equivalent (if any) of the init_dom0_cpuid_policy() change would look like is entirely unclear to me. How much should we take from the max policy, and how much should we require the user to specify (and what would the latter look like)?
---
v3: Re-base.
v2: Add logic to init_dom0_cpuid_policy(). Drop vsz128 field. Re-base.
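Illustration only, not part of the patch: a minimal standalone sketch of how the subleaf 0 EBX layout used by the new avx10 union (version in bits 7:0, vsz256 in bit 17, vsz512 in bit 18) can be decoded, and of the guest/host rule added to x86_cpu_policies_are_compatible(). The names avx10_info, avx10_decode_ebx(), avx10_guest_fits_host() and avx10_example() are hypothetical helpers invented here, and the EBX values are made up; the bit positions are read off the bitfield in the diff and assume the usual LSB-first bitfield allocation on x86.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type (not a Xen structure): decoded CPUID.0x24.0:EBX. */
struct avx10_info {
    uint8_t version;   /* EBX[7:0]: AVX10 version number */
    bool vsz256;       /* EBX[17]: 256-bit vectors supported */
    bool vsz512;       /* EBX[18]: 512-bit vectors supported */
};

/* Extract the fields which the patch's bitfield names. */
struct avx10_info avx10_decode_ebx(uint32_t ebx)
{
    struct avx10_info info = {
        .version = ebx & 0xff,
        .vsz256  = (ebx >> 17) & 1,
        .vsz512  = (ebx >> 18) & 1,
    };

    return info;
}

/*
 * Same condition as the one added to x86_cpu_policies_are_compatible(),
 * inverted to return true when the guest policy fits the host:
 *  - the guest version must not exceed the host's,
 *  - a guest with vsz512 needs a host with vsz512,
 *  - a guest with only vsz256 needs a host with vsz256 or vsz512.
 */
bool avx10_guest_fits_host(struct avx10_info guest, struct avx10_info host)
{
    if ( guest.version > host.version )
        return false;

    if ( guest.vsz512 )
        return host.vsz512;

    if ( guest.vsz256 )
        return host.vsz256 || host.vsz512;

    return true;
}

/* Usage demonstration with made-up EBX values. */
void avx10_example(void)
{
    struct avx10_info wide   = avx10_decode_ebx(0x00060001); /* v1, 256+512 */
    struct avx10_info narrow = avx10_decode_ebx(0x00020001); /* v1, 256 only */

    assert(avx10_guest_fits_host(narrow, wide));
    assert(!avx10_guest_fits_host(wide, narrow));
}
```

The sketch only mirrors the comparison itself; it deliberately says nothing about which xstate components AVX10/256 would need, which the notes above leave open.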
--- a/xen/arch/x86/cpu-policy.c +++ b/xen/arch/x86/cpu-policy.c @@ -210,7 +210,7 @@ static void recalculate_xstate(struct cp if ( p->feat.mpx ) xstates |= X86_XCR0_BNDREGS | X86_XCR0_BNDCSR; - if ( p->feat.avx512f ) + if ( p->feat.avx512f || (p->feat.avx10 && p->avx10.vsz512) ) xstates |= X86_XCR0_OPMASK | X86_XCR0_ZMM | X86_XCR0_HI_ZMM; if ( p->feat.pku ) @@ -271,6 +271,16 @@ static void recalculate_misc(struct cpu_ p->basic.raw[0xc] = EMPTY_LEAF; + zero_leaves(p->basic.raw, 0xe, 0x23); + + p->avx10.raw[0].b &= 0x000700ff; + p->avx10.raw[0].c = p->avx10.raw[0].d = 0; + if ( !p->feat.avx10 || !p->avx10.version || !p->avx10.vsz512 ) + { + p->feat.avx10 = false; + memset(p->avx10.raw, 0, sizeof(p->avx10.raw)); + } + p->extd.e1d &= ~CPUID_COMMON_1D_FEATURES; /* Most of Power/RAS hidden from guests. */ @@ -394,6 +404,7 @@ static void __init guest_common_max_leav { p->basic.max_leaf = ARRAY_SIZE(p->basic.raw) - 1; p->feat.max_subleaf = ARRAY_SIZE(p->feat.raw) - 1; + p->avx10.max_subleaf = ARRAY_SIZE(p->avx10.raw) - 1; p->extd.max_leaf = 0x80000000U + ARRAY_SIZE(p->extd.raw) - 1; } @@ -402,6 +413,7 @@ static void __init guest_common_default_ { p->basic.max_leaf = host_cpu_policy.basic.max_leaf; p->feat.max_subleaf = host_cpu_policy.feat.max_subleaf; + p->avx10.max_subleaf = host_cpu_policy.avx10.max_subleaf; p->extd.max_leaf = host_cpu_policy.extd.max_leaf; } @@ -905,6 +917,7 @@ void recalculate_cpuid_policy(struct dom p->basic.max_leaf = min(p->basic.max_leaf, max->basic.max_leaf); p->feat.max_subleaf = min(p->feat.max_subleaf, max->feat.max_subleaf); + p->avx10.max_subleaf = min(p->avx10.max_subleaf, max->avx10.max_subleaf); p->extd.max_leaf = 0x80000000U | min(p->extd.max_leaf & 0xffff, ((p->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) @@ -951,6 +964,8 @@ void recalculate_cpuid_policy(struct dom if ( p->basic.max_leaf < XSTATE_CPUID ) __clear_bit(X86_FEATURE_XSAVE, fs); + if ( p->basic.max_leaf < 0x24 ) + __clear_bit(X86_FEATURE_AVX10, fs); 
sanitise_featureset(fs); @@ -1020,9 +1035,18 @@ void __init init_dom0_cpuid_policy(struc /* Apply dom0-cpuid= command line settings, if provided. */ if ( dom0_cpuid_cmdline ) { + const struct cpu_policy *max = is_pv_domain(d) + ? (IS_ENABLED(CONFIG_PV) ? &pv_max_cpu_policy : NULL) + : (IS_ENABLED(CONFIG_HVM) ? &hvm_max_cpu_policy : NULL); uint32_t fs[FSCAPINTS]; unsigned int i; + if ( !max ) + { + ASSERT_UNREACHABLE(); + return; + } + x86_cpu_policy_to_featureset(p, fs); for ( i = 0; i < ARRAY_SIZE(fs); ++i ) @@ -1032,6 +1056,13 @@ void __init init_dom0_cpuid_policy(struc } x86_cpu_featureset_to_policy(fs, p); + + /* + * Default-off features with their own leaves need those leaves + * re-populated from the max policy. + */ + if ( p->feat.avx10 ) + p->avx10 = max->avx10; } /* @@ -1064,6 +1095,8 @@ static void __init __maybe_unused build_ sizeof(raw_cpu_policy.feat.raw)); BUILD_BUG_ON(sizeof(raw_cpu_policy.xstate) != sizeof(raw_cpu_policy.xstate.raw)); + BUILD_BUG_ON(sizeof(raw_cpu_policy.avx10) != + sizeof(raw_cpu_policy.avx10.raw)); BUILD_BUG_ON(sizeof(raw_cpu_policy.extd) != sizeof(raw_cpu_policy.extd.raw)); } --- a/xen/arch/x86/cpuid.c +++ b/xen/arch/x86/cpuid.c @@ -87,6 +87,15 @@ void guest_cpuid(const struct vcpu *v, u *res = array_access_nospec(p->xstate.raw, subleaf); break; + case 0x24: + ASSERT(p->avx10.max_subleaf < ARRAY_SIZE(p->avx10.raw)); + if ( subleaf > min_t(uint32_t, p->avx10.max_subleaf, + ARRAY_SIZE(p->avx10.raw) - 1) ) + return; + + *res = array_access_nospec(p->avx10.raw, subleaf); + break; + default: *res = array_access_nospec(p->basic.raw, leaf); break; --- a/xen/include/public/arch-x86/cpufeatureset.h +++ b/xen/include/public/arch-x86/cpufeatureset.h @@ -355,6 +355,7 @@ XEN_CPUFEATURE(UTMR, 15*32 XEN_CPUFEATURE(PREFETCHI, 15*32+14) /*A PREFETCHIT{0,1} Instructions */ XEN_CPUFEATURE(USER_MSR, 15*32+15) /*s U{RD,WR}MSR Instructions */ XEN_CPUFEATURE(CET_SSS, 15*32+18) /* CET Supervisor Shadow Stacks safe to use */ +XEN_CPUFEATURE(AVX10, 
15*32+19) /* AVX10 Converged Vector ISA */ /* Intel-defined CPU features, MSR_ARCH_CAPS 0x10a.eax, word 16 */ XEN_CPUFEATURE(RDCL_NO, 16*32+ 0) /*A No Rogue Data Cache Load (Meltdown) */ --- a/xen/include/xen/lib/x86/cpu-policy.h +++ b/xen/include/xen/lib/x86/cpu-policy.h @@ -85,11 +85,12 @@ unsigned int x86_cpuid_lookup_vendor(uin */ const char *x86_cpuid_vendor_to_str(unsigned int vendor); -#define CPUID_GUEST_NR_BASIC (0xdu + 1) +#define CPUID_GUEST_NR_BASIC (0x24u + 1) #define CPUID_GUEST_NR_CACHE (5u + 1) #define CPUID_GUEST_NR_FEAT (2u + 1) #define CPUID_GUEST_NR_TOPO (1u + 1) #define CPUID_GUEST_NR_XSTATE (62u + 1) +#define CPUID_GUEST_NR_AVX10 (0u + 1) #define CPUID_GUEST_NR_EXTD_INTEL (0x8u + 1) #define CPUID_GUEST_NR_EXTD_AMD (0x21u + 1) #define CPUID_GUEST_NR_EXTD MAX(CPUID_GUEST_NR_EXTD_INTEL, \ @@ -255,6 +256,19 @@ struct cpu_policy } comp[CPUID_GUEST_NR_XSTATE]; } xstate; + /* Structured AVX10 information leaf: 0x00000024[xx] */ + union { + struct cpuid_leaf raw[CPUID_GUEST_NR_AVX10]; + struct { + /* Subleaf 0. */ + uint32_t max_subleaf; + uint32_t version:8, :9; + bool vsz256:1, vsz512:1; + uint32_t :13; + uint32_t /* c */:32, /* d */:32; + }; + } avx10; + /* Extended leaves: 0x800000xx */ union { struct cpuid_leaf raw[CPUID_GUEST_NR_EXTD]; --- a/xen/lib/x86/cpuid.c +++ b/xen/lib/x86/cpuid.c @@ -123,6 +123,7 @@ void x86_cpu_policy_fill_native(struct c switch ( i ) { case 0x4: case 0x7: case 0xb: case 0xd: + case 0x24: /* Multi-invocation leaves. Deferred. */ continue; } @@ -216,6 +217,15 @@ void x86_cpu_policy_fill_native(struct c } } + if ( p->basic.max_leaf >= 0x24 ) + { + cpuid_count_leaf(0x24, 0, &p->avx10.raw[0]); + + for ( i = 1; i <= MIN(p->avx10.max_subleaf, + ARRAY_SIZE(p->avx10.raw) - 1); ++i ) + cpuid_count_leaf(0x24, i, &p->avx10.raw[i]); + } + /* Extended leaves. 
*/ cpuid_leaf(0x80000000U, &p->extd.raw[0]); for ( i = 1; i <= MIN(p->extd.max_leaf & 0xffffU, @@ -285,6 +295,9 @@ void x86_cpu_policy_clear_out_of_range_l ARRAY_SIZE(p->xstate.raw) - 1); } + if ( p->basic.max_leaf < 0x24 ) + memset(p->avx10.raw, 0, sizeof(p->avx10.raw)); + zero_leaves(p->extd.raw, ((p->extd.max_leaf >> 16) == 0x8000 ? (p->extd.max_leaf & 0xffff) + 1 : 0), @@ -297,6 +310,8 @@ void __init x86_cpu_policy_bound_max_lea min_t(uint32_t, p->basic.max_leaf, ARRAY_SIZE(p->basic.raw) - 1); p->feat.max_subleaf = min_t(uint32_t, p->feat.max_subleaf, ARRAY_SIZE(p->feat.raw) - 1); + p->avx10.max_subleaf = + min_t(uint32_t, p->avx10.max_subleaf, ARRAY_SIZE(p->avx10.raw) - 1); p->extd.max_leaf = 0x80000000U | min_t(uint32_t, p->extd.max_leaf & 0xffff, ARRAY_SIZE(p->extd.raw) - 1); } @@ -324,6 +339,8 @@ void x86_cpu_policy_shrink_max_leaves(st */ p->basic.raw[0xd] = p->xstate.raw[0]; + p->basic.raw[0x24] = p->avx10.raw[0]; + for ( i = p->basic.max_leaf; i; --i ) if ( p->basic.raw[i].a | p->basic.raw[i].b | p->basic.raw[i].c | p->basic.raw[i].d ) @@ -457,6 +474,13 @@ int x86_cpuid_copy_to_buffer(const struc break; } + case 0x24: + for ( subleaf = 0; + subleaf <= MIN(p->avx10.max_subleaf, + ARRAY_SIZE(p->avx10.raw) - 1); ++subleaf ) + COPY_LEAF(leaf, subleaf, &p->avx10.raw[subleaf]); + break; + default: COPY_LEAF(leaf, XEN_CPUID_NO_SUBLEAF, &p->basic.raw[leaf]); break; @@ -549,6 +573,13 @@ int x86_cpuid_copy_from_buffer(struct cp array_access_nospec(p->xstate.raw, data.subleaf) = l; break; + case 0x24: + if ( data.subleaf >= ARRAY_SIZE(p->avx10.raw) ) + goto out_of_range; + + array_access_nospec(p->avx10.raw, data.subleaf) = l; + break; + default: if ( data.subleaf != XEN_CPUID_NO_SUBLEAF ) goto out_of_range; --- a/xen/lib/x86/policy.c +++ b/xen/lib/x86/policy.c @@ -21,6 +21,12 @@ int x86_cpu_policies_are_compatible(cons if ( guest->feat.max_subleaf > host->feat.max_subleaf ) FAIL_CPUID(7, 0); + if ( guest->avx10.version > host->avx10.version || + 
(guest->avx10.vsz512 + ? !host->avx10.vsz512 + : guest->avx10.vsz256 && !host->avx10.vsz256 && !host->avx10.vsz512) ) + FAIL_CPUID(0x24, 0); + if ( guest->extd.max_leaf > host->extd.max_leaf ) FAIL_CPUID(0x80000000U, NA); --- a/xen/tools/gen-cpuid.py +++ b/xen/tools/gen-cpuid.py @@ -286,7 +286,7 @@ def crunch_numbers(state): # enabled. Certain later extensions, acting on 256-bit vectors of # integers, better depend on AVX2 than AVX. AVX2: [AVX512F, VAES, VPCLMULQDQ, AVX_VNNI, AVX_IFMA, AVX_VNNI_INT8, - AVX_VNNI_INT16, SHA512, SM4], + AVX_VNNI_INT16, SHA512, SM4, AVX10], # AVX512F is taken to mean hardware support for 512bit registers # (which in practice depends on the EVEX prefix to encode) as well

From patchwork Wed Dec 11 10:11:43 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13903285
Date: Wed, 11 Dec 2024 11:11:43 +0100
Subject: [PATCH v3 02/16] x86emul: support AVX10.1
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné, Oleksii Kurochko
References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>
In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>

This requires relaxing various pre-existing AVX512* checks, as AVX10.1 covers all AVX512* except PF, ER, 4FMAPS, 4VNNIW (support for all of which was removed meanwhile anyway), and VP2INTERSECT. Yet it may come with less than 512-bit vector width, while on the other hand guaranteeing that narrower widths are available when wider ones are (i.e. unlike AVX512VL being an add-on feature on top of AVX512F).

Note that visa_check(), replacing host_and_vcpu_must_have() uses, checks only the guest capability: We wouldn't expose AVX512* (nor AVX10) without the hardware supporting it. Similarly in vlen_check() the original host_and_vcpu_must_have() is reduced to the equivalent of just vcpu_must_have().

This also simplifies (resulting) code in the test and fuzzing harnesses, as there the XCR0 checks that are part of cpu_has_avx512* are only needed in local code, not in the emulator itself (where respective checking occurs elsewhere anyway, utilizing emul_test_read_xcr()).

While in most cases the changes to x86_emulate() are entirely mechanical, for opmask insns earlier unconditional AVX512F checks are converted into "else" clauses to existing if/else-if ones.

To be certain that no uses remain, also drop respective cpu_has_avx512* (except in the test harness) and vcpu_has_avx512*().

Signed-off-by: Jan Beulich
---
Probably avx512_vlen_check() should have the avx512_ prefix dropped, now that it also covers AVX10. But if so that wants to be either a prereq or a follow-on patch.

visa_check() won't cover AVX10.2 and higher, but probably we will want independent checking logic for that anyway.

Spec version 2 still leaves unclear what the xstate components are which would need enabling for AVX10/256. x86emul_get_fpu() is therefore untouched for now.

Since it'll be reducing code size, we may want to further convert host_and_vcpu_must_have() to just vcpu_must_have() where appropriate (should be [almost?] everywhere).
---
v3: Add ChangeLog entry.
v2: Drop use of vsz128 field. Re-base, in particular over dropping of Xeon Phi support.
--- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -18,6 +18,7 @@ The format is based on [Keep a Changelog - Experimental support for Armv8-R. - On x86: - xl suspend/resume subcommands. + - Add support for AVX10.1. 
(Experimental) ### Removed - On x86: --- a/xen/arch/x86/include/asm/cpufeature.h +++ b/xen/arch/x86/include/asm/cpufeature.h @@ -133,29 +133,18 @@ static inline bool boot_cpu_has(unsigned #define cpu_has_pqe boot_cpu_has(X86_FEATURE_PQE) #define cpu_has_fpu_sel (!boot_cpu_has(X86_FEATURE_NO_FPU_SEL)) #define cpu_has_mpx boot_cpu_has(X86_FEATURE_MPX) -#define cpu_has_avx512f boot_cpu_has(X86_FEATURE_AVX512F) -#define cpu_has_avx512dq boot_cpu_has(X86_FEATURE_AVX512DQ) #define cpu_has_rdseed boot_cpu_has(X86_FEATURE_RDSEED) #define cpu_has_smap boot_cpu_has(X86_FEATURE_SMAP) -#define cpu_has_avx512_ifma boot_cpu_has(X86_FEATURE_AVX512_IFMA) #define cpu_has_clflushopt boot_cpu_has(X86_FEATURE_CLFLUSHOPT) #define cpu_has_clwb boot_cpu_has(X86_FEATURE_CLWB) -#define cpu_has_avx512cd boot_cpu_has(X86_FEATURE_AVX512CD) #define cpu_has_proc_trace boot_cpu_has(X86_FEATURE_PROC_TRACE) #define cpu_has_sha boot_cpu_has(X86_FEATURE_SHA) -#define cpu_has_avx512bw boot_cpu_has(X86_FEATURE_AVX512BW) -#define cpu_has_avx512vl boot_cpu_has(X86_FEATURE_AVX512VL) /* CPUID level 0x00000007:0.ecx */ -#define cpu_has_avx512_vbmi boot_cpu_has(X86_FEATURE_AVX512_VBMI) #define cpu_has_pku boot_cpu_has(X86_FEATURE_PKU) -#define cpu_has_avx512_vbmi2 boot_cpu_has(X86_FEATURE_AVX512_VBMI2) #define cpu_has_gfni boot_cpu_has(X86_FEATURE_GFNI) #define cpu_has_vaes boot_cpu_has(X86_FEATURE_VAES) #define cpu_has_vpclmulqdq boot_cpu_has(X86_FEATURE_VPCLMULQDQ) -#define cpu_has_avx512_vnni boot_cpu_has(X86_FEATURE_AVX512_VNNI) -#define cpu_has_avx512_bitalg boot_cpu_has(X86_FEATURE_AVX512_BITALG) -#define cpu_has_avx512_vpopcntdq boot_cpu_has(X86_FEATURE_AVX512_VPOPCNTDQ) #define cpu_has_rdpid boot_cpu_has(X86_FEATURE_RDPID) #define cpu_has_movdiri boot_cpu_has(X86_FEATURE_MOVDIRI) #define cpu_has_movdir64b boot_cpu_has(X86_FEATURE_MOVDIR64B) @@ -180,7 +169,6 @@ static inline bool boot_cpu_has(unsigned #define cpu_has_tsx_force_abort boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT) #define cpu_has_serialize 
boot_cpu_has(X86_FEATURE_SERIALIZE) #define cpu_has_hybrid boot_cpu_has(X86_FEATURE_HYBRID) -#define cpu_has_avx512_fp16 boot_cpu_has(X86_FEATURE_AVX512_FP16) #define cpu_has_arch_caps boot_cpu_has(X86_FEATURE_ARCH_CAPS) /* CPUID level 0x00000007:1.eax */ @@ -188,7 +176,6 @@ static inline bool boot_cpu_has(unsigned #define cpu_has_sm3 boot_cpu_has(X86_FEATURE_SM3) #define cpu_has_sm4 boot_cpu_has(X86_FEATURE_SM4) #define cpu_has_avx_vnni boot_cpu_has(X86_FEATURE_AVX_VNNI) -#define cpu_has_avx512_bf16 boot_cpu_has(X86_FEATURE_AVX512_BF16) #define cpu_has_cmpccxadd boot_cpu_has(X86_FEATURE_CMPCCXADD) #define cpu_has_avx_ifma boot_cpu_has(X86_FEATURE_AVX_IFMA) --- a/xen/arch/x86/x86_emulate/private.h +++ b/xen/arch/x86/x86_emulate/private.h @@ -562,26 +562,15 @@ amd_like(const struct x86_emulate_ctxt * #define vcpu_has_invpcid() (ctxt->cpuid->feat.invpcid) #define vcpu_has_rtm() (ctxt->cpuid->feat.rtm) #define vcpu_has_mpx() (ctxt->cpuid->feat.mpx) -#define vcpu_has_avx512f() (ctxt->cpuid->feat.avx512f) -#define vcpu_has_avx512dq() (ctxt->cpuid->feat.avx512dq) #define vcpu_has_rdseed() (ctxt->cpuid->feat.rdseed) #define vcpu_has_adx() (ctxt->cpuid->feat.adx) #define vcpu_has_smap() (ctxt->cpuid->feat.smap) -#define vcpu_has_avx512_ifma() (ctxt->cpuid->feat.avx512_ifma) #define vcpu_has_clflushopt() (ctxt->cpuid->feat.clflushopt) #define vcpu_has_clwb() (ctxt->cpuid->feat.clwb) -#define vcpu_has_avx512cd() (ctxt->cpuid->feat.avx512cd) #define vcpu_has_sha() (ctxt->cpuid->feat.sha) -#define vcpu_has_avx512bw() (ctxt->cpuid->feat.avx512bw) -#define vcpu_has_avx512vl() (ctxt->cpuid->feat.avx512vl) -#define vcpu_has_avx512_vbmi() (ctxt->cpuid->feat.avx512_vbmi) -#define vcpu_has_avx512_vbmi2() (ctxt->cpuid->feat.avx512_vbmi2) #define vcpu_has_gfni() (ctxt->cpuid->feat.gfni) #define vcpu_has_vaes() (ctxt->cpuid->feat.vaes) #define vcpu_has_vpclmulqdq() (ctxt->cpuid->feat.vpclmulqdq) -#define vcpu_has_avx512_vnni() (ctxt->cpuid->feat.avx512_vnni) -#define 
vcpu_has_avx512_bitalg() (ctxt->cpuid->feat.avx512_bitalg) -#define vcpu_has_avx512_vpopcntdq() (ctxt->cpuid->feat.avx512_vpopcntdq) #define vcpu_has_rdpid() (ctxt->cpuid->feat.rdpid) #define vcpu_has_movdiri() (ctxt->cpuid->feat.movdiri) #define vcpu_has_movdir64b() (ctxt->cpuid->feat.movdir64b) @@ -589,12 +578,10 @@ amd_like(const struct x86_emulate_ctxt * #define vcpu_has_avx512_vp2intersect() (ctxt->cpuid->feat.avx512_vp2intersect) #define vcpu_has_serialize() (ctxt->cpuid->feat.serialize) #define vcpu_has_tsxldtrk() (ctxt->cpuid->feat.tsxldtrk) -#define vcpu_has_avx512_fp16() (ctxt->cpuid->feat.avx512_fp16) #define vcpu_has_sha512() (ctxt->cpuid->feat.sha512) #define vcpu_has_sm3() (ctxt->cpuid->feat.sm3) #define vcpu_has_sm4() (ctxt->cpuid->feat.sm4) #define vcpu_has_avx_vnni() (ctxt->cpuid->feat.avx_vnni) -#define vcpu_has_avx512_bf16() (ctxt->cpuid->feat.avx512_bf16) #define vcpu_has_cmpccxadd() (ctxt->cpuid->feat.cmpccxadd) #define vcpu_has_lkgs() (ctxt->cpuid->feat.lkgs) #define vcpu_has_wrmsrns() (ctxt->cpuid->feat.wrmsrns) --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -1126,19 +1126,40 @@ static unsigned long *decode_vex_gpr( return decode_gpr(regs, ~vex_reg & (mode_64bit() ? 
0xf : 7)); } -#define avx512_vlen_check(lig) do { \ - switch ( evex.lr ) \ - { \ - default: \ - generate_exception(X86_EXC_UD); \ - case 2: \ - break; \ - case 0: case 1: \ - if ( !(lig) ) \ - host_and_vcpu_must_have(avx512vl); \ - break; \ - } \ -} while ( false ) +#define visa_check(subfeat) \ + generate_exception_if(!cp->feat.avx512 ## subfeat && !cp->feat.avx10, \ + X86_EXC_UD) + +static bool _vlen_check( + const struct x86_emulate_state *s, + const struct cpu_policy *cp, + bool lig) +{ + if ( s->evex.lr > 2 ) + return false; + + if ( lig ) + return true; + + if ( cp->feat.avx10 ) + switch ( s->evex.lr ) + { + case 0: + case 1: + if ( cp->avx10.vsz256 ) + return true; + /* fall through */ + case 2: + if ( cp->avx10.vsz512 ) + return true; + break; + } + + return s->evex.lr == 2 || cp->feat.avx512vl; +} + +#define avx512_vlen_check(lig) \ + generate_exception_if(!_vlen_check(state, cp, lig), X86_EXC_UD) static bool is_branch_step(struct x86_emulate_ctxt *ctxt, const struct x86_emulate_ops *ops) @@ -1370,7 +1391,9 @@ x86_emulate( /* KMOV{W,Q} %k, (%rax) */ stb[0] = 0xc4; stb[1] = 0xe1; - stb[2] = cpu_has_avx512bw ? 0xf8 : 0x78; + stb[2] = cp->feat.avx512bw || cp->feat.avx10 + ? 
0xf8 /* L0.NP.W1 - kmovq */ + : 0x78 /* L0.NP.W0 - kmovw */; stb[3] = 0x91; stb[4] = evex.opmsk << 3; insn_bytes = 5; @@ -3395,7 +3418,7 @@ x86_emulate( (ea.type != OP_REG && evex.brs && (evex.pfx & VEX_PREFIX_SCALAR_MASK))), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); if ( ea.type != OP_REG || !evex.brs ) avx512_vlen_check(evex.pfx & VEX_PREFIX_SCALAR_MASK); simd_zmm: @@ -3451,7 +3474,7 @@ x86_emulate( generate_exception_if((evex.lr || evex.opmsk || evex.brs || evex.w != (evex.pfx & VEX_PREFIX_DOUBLE_MASK)), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); if ( (d & DstMask) != DstMem ) d &= ~TwoOp; op_bytes = 8; @@ -3478,7 +3501,7 @@ x86_emulate( generate_exception_if((evex.brs || evex.w != (evex.pfx & VEX_PREFIX_DOUBLE_MASK)), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); avx512_vlen_check(false); d |= TwoOp; op_bytes = !(evex.pfx & VEX_PREFIX_DOUBLE_MASK) || evex.lr @@ -3515,7 +3538,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0x64): /* vpblendm{d,q} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0x65): /* vblendmp{s,d} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ avx512f_no_sae: - host_and_vcpu_must_have(avx512f); + visa_check(f); generate_exception_if(ea.type != OP_MEM && evex.brs, X86_EXC_UD); avx512_vlen_check(false); goto simd_zmm; @@ -3595,13 +3618,13 @@ x86_emulate( case X86EMUL_OPC_EVEX_F3(5, 0x2a): /* vcvtsi2sh r/m,xmm,xmm */ case X86EMUL_OPC_EVEX_F3(5, 0x7b): /* vcvtusi2sh r/m,xmm,xmm */ - host_and_vcpu_must_have(avx512_fp16); + visa_check(_fp16); /* fall through */ CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x2a): /* vcvtsi2s{s,d} r/m,xmm,xmm */ CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x7b): /* vcvtusi2s{s,d} r/m,xmm,xmm */ generate_exception_if(evex.opmsk || (ea.type != OP_REG && evex.brs), X86_EXC_UD); - host_and_vcpu_must_have(avx512f); + visa_check(f); if ( !evex.brs ) avx512_vlen_check(true); get_fpu(X86EMUL_FPU_zmm); @@ -3711,7 +3734,7 @@ x86_emulate( case X86EMUL_OPC_EVEX_F3(5, 0x2d): /* 
vcvtsh2si xmm/mem,reg */
     case X86EMUL_OPC_EVEX_F3(5, 0x78): /* vcvttsh2usi xmm/mem,reg */
     case X86EMUL_OPC_EVEX_F3(5, 0x79): /* vcvtsh2usi xmm/mem,reg */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         /* fall through */
     CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x2c): /* vcvtts{s,d}2si xmm/mem,reg */
     CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x2d): /* vcvts{s,d}2si xmm/mem,reg */
@@ -3721,7 +3744,7 @@ x86_emulate(
                                evex.opmsk ||
                                (ea.type != OP_REG && evex.brs)),
                               X86_EXC_UD);
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         if ( !evex.brs )
             avx512_vlen_check(true);
         get_fpu(X86EMUL_FPU_zmm);
@@ -3787,7 +3810,7 @@ x86_emulate(

     case X86EMUL_OPC_EVEX(5, 0x2e): /* vucomish xmm/m16,xmm */
     case X86EMUL_OPC_EVEX(5, 0x2f): /* vcomish xmm/m16,xmm */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w, X86_EXC_UD);
         /* fall through */
     CASE_SIMD_PACKED_FP(_EVEX, 0x0f, 0x2e): /* vucomis{s,d} xmm/mem,xmm */
@@ -3796,7 +3819,7 @@ x86_emulate(
                                (ea.type != OP_REG && evex.brs) ||
                                evex.w != evex.pfx),
                               X86_EXC_UD);
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         if ( !evex.brs )
             avx512_vlen_check(true);
         get_fpu(X86EMUL_FPU_zmm);
@@ -3940,7 +3963,7 @@ x86_emulate(

     case X86EMUL_OPC_VEX(0x0f, 0x4a): /* kadd{w,q} k,k,k */
         if ( !vex.w )
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
         /* fall through */
     case X86EMUL_OPC_VEX(0x0f, 0x41): /* kand{w,q} k,k,k */
     case X86EMUL_OPC_VEX_66(0x0f, 0x41): /* kand{b,d} k,k,k */
@@ -3956,11 +3979,12 @@ x86_emulate(
         generate_exception_if(!vex.l, X86_EXC_UD);
     opmask_basic:
         if ( vex.w )
-            host_and_vcpu_must_have(avx512bw);
+            visa_check(bw);
         else if ( vex.pfx )
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
+        else
+            visa_check(f);
     opmask_common:
-        host_and_vcpu_must_have(avx512f);
         generate_exception_if(!vex.r || (mode_64bit() && !(vex.reg & 8)) ||
                               ea.type != OP_REG, X86_EXC_UD);

@@ -3983,13 +4007,14 @@ x86_emulate(
         generate_exception_if(vex.l || vex.reg != 0xf, X86_EXC_UD);
         goto opmask_basic;

-    case
X86EMUL_OPC_VEX(0x0f, 0x4b): /* kunpck{w,d}{d,q} k,k,k */
+    case X86EMUL_OPC_VEX(0x0f, 0x4b): /* kunpck{wd,dq} k,k,k */
         generate_exception_if(!vex.l, X86_EXC_UD);
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         goto opmask_common;

     case X86EMUL_OPC_VEX_66(0x0f, 0x4b): /* kunpckbw k,k,k */
         generate_exception_if(!vex.l || vex.w, X86_EXC_UD);
+        visa_check(f);
         goto opmask_common;

 #endif /* X86EMUL_NO_SIMD */
@@ -4057,7 +4082,7 @@ x86_emulate(
         generate_exception_if((evex.w != (evex.pfx & VEX_PREFIX_DOUBLE_MASK) ||
                                (ea.type != OP_MEM && evex.brs)),
                               X86_EXC_UD);
-        host_and_vcpu_must_have(avx512dq);
+        visa_check(dq);
         avx512_vlen_check(false);
         goto simd_zmm;

@@ -4096,12 +4121,12 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_F2(0x0f, 0x7a): /* vcvtudq2ps [xyz]mm/mem,[xyz]mm{k} */
                                           /* vcvtuqq2ps [xyz]mm/mem,{x,y}mm{k} */
         if ( evex.w )
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
         else
         {
     case X86EMUL_OPC_EVEX(0x0f, 0x78): /* vcvttp{s,d}2udq [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX(0x0f, 0x79): /* vcvtp{s,d}2udq [xyz]mm/mem,[xyz]mm{k} */
-            host_and_vcpu_must_have(avx512f);
+            visa_check(f);
         }
         if ( ea.type != OP_REG || !evex.brs )
             avx512_vlen_check(false);
@@ -4318,7 +4343,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x0b): /* vpmulhrsw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x1c): /* vpabsb [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x1d): /* vpabsw [xyz]mm/mem,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         generate_exception_if(evex.brs, X86_EXC_UD);
         elem_bytes = 1 << (b & 1);
         goto avx512f_no_sae;
@@ -4350,7 +4375,7 @@ x86_emulate(
             generate_exception_if(b != 0x27 && evex.w != (b & 1), X86_EXC_UD);
             goto avx512f_no_sae;
         }
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         generate_exception_if(evex.brs, X86_EXC_UD);
         elem_bytes = 1 << (ext == ext_0f ?
b & 1 : evex.w);
         avx512_vlen_check(false);
@@ -4423,7 +4448,7 @@ x86_emulate(
         dst.bytes = 2;
         /* fall through */
     case X86EMUL_OPC_EVEX_66(5, 0x6e): /* vmovw r/m16,xmm */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w, X86_EXC_UD);
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f, 0x6e): /* vmov{d,q} r/m,xmm */
@@ -4431,7 +4456,7 @@ x86_emulate(
         generate_exception_if((evex.lr || evex.opmsk || evex.brs ||
                                evex.reg != 0xf || !evex.RX),
                               X86_EXC_UD);
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         get_fpu(X86EMUL_FPU_zmm);

         opc = init_evex(stub);
@@ -4489,7 +4514,7 @@ x86_emulate(

     case X86EMUL_OPC_EVEX_F2(0x0f, 0x6f): /* vmovdqu{8,16} [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_F2(0x0f, 0x7f): /* vmovdqu{8,16} [xyz]mm,[xyz]mm/mem{k} */
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         elem_bytes = 1 << evex.w;
         goto vmovdqa;

@@ -4582,7 +4607,7 @@ x86_emulate(
             generate_exception_if(evex.w, X86_EXC_UD);
         else
         {
-            host_and_vcpu_must_have(avx512bw);
+            visa_check(bw);
             generate_exception_if(evex.brs, X86_EXC_UD);
         }
         d = (d & ~SrcMask) | SrcMem | TwoOp;
@@ -4830,7 +4855,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_F3(0x0f, 0xe6): /* vcvtdq2pd {x,y}mm/mem,[xyz]mm{k} */
                                           /* vcvtqq2pd [xyz]mm/mem,[xyz]mm{k} */
         if ( evex.pfx != vex_f3 )
-            host_and_vcpu_must_have(avx512f);
+            visa_check(f);
         else if ( evex.w )
         {
     case X86EMUL_OPC_EVEX_66(0x0f, 0x78): /* vcvttps2uqq {x,y}mm/mem,[xyz]mm{k} */
@@ -4841,11 +4866,11 @@ x86_emulate(
                                           /* vcvttpd2qq [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f, 0x7b): /* vcvtps2qq {x,y}mm/mem,[xyz]mm{k} */
                                           /* vcvtpd2qq [xyz]mm/mem,[xyz]mm{k} */
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
         }
         else
         {
-            host_and_vcpu_must_have(avx512f);
+            visa_check(f);
             generate_exception_if(ea.type != OP_MEM && evex.brs, X86_EXC_UD);
         }
         if ( ea.type != OP_REG || !evex.brs )
@@ -4883,7 +4908,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f, 0xd6): /* vmovq xmm,xmm/m64 */
         generate_exception_if(evex.lr || !evex.w ||
evex.opmsk || evex.brs,
                               X86_EXC_UD);
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         d |= TwoOp;
         op_bytes = 8;
         goto simd_zmm;
@@ -4909,19 +4934,21 @@ x86_emulate(
     case X86EMUL_OPC_VEX(0x0f, 0x90): /* kmov{w,q} k/mem,k */
     case X86EMUL_OPC_VEX_66(0x0f, 0x90): /* kmov{b,d} k/mem,k */
         generate_exception_if(vex.l || !vex.r, X86_EXC_UD);
-        host_and_vcpu_must_have(avx512f);
         if ( vex.w )
         {
-            host_and_vcpu_must_have(avx512bw);
+            visa_check(bw);
             op_bytes = 4 << !vex.pfx;
         }
         else if ( vex.pfx )
         {
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
             op_bytes = 1;
         }
         else
+        {
+            visa_check(f);
             op_bytes = 2;
+        }

         get_fpu(X86EMUL_FPU_opmask);

@@ -4943,14 +4970,15 @@ x86_emulate(
         generate_exception_if(vex.l || !vex.r || vex.reg != 0xf ||
                               ea.type != OP_REG, X86_EXC_UD);

-        host_and_vcpu_must_have(avx512f);
         if ( vex.pfx == vex_f2 )
-            host_and_vcpu_must_have(avx512bw);
+            visa_check(bw);
         else
         {
             generate_exception_if(vex.w, X86_EXC_UD);
             if ( vex.pfx )
-                host_and_vcpu_must_have(avx512dq);
+                visa_check(dq);
+            else
+                visa_check(f);
         }

         get_fpu(X86EMUL_FPU_opmask);
@@ -4982,10 +5010,9 @@ x86_emulate(
         dst = ea;
         dst.reg = decode_gpr(&_regs, modrm_reg);

-        host_and_vcpu_must_have(avx512f);
         if ( vex.pfx == vex_f2 )
         {
-            host_and_vcpu_must_have(avx512bw);
+            visa_check(bw);
             dst.bytes = 4 << (mode_64bit() && vex.w);
         }
         else
@@ -4993,7 +5020,9 @@ x86_emulate(
             generate_exception_if(vex.w, X86_EXC_UD);
             dst.bytes = 4;
             if ( vex.pfx )
-                host_and_vcpu_must_have(avx512dq);
+                visa_check(dq);
+            else
+                visa_check(f);
         }

         get_fpu(X86EMUL_FPU_opmask);
@@ -5015,20 +5044,18 @@ x86_emulate(
         ASSERT(!state->simd_size);
         break;

-    case X86EMUL_OPC_VEX(0x0f, 0x99): /* ktest{w,q} k,k */
-        if ( !vex.w )
-            host_and_vcpu_must_have(avx512dq);
-        /* fall through */
     case X86EMUL_OPC_VEX(0x0f, 0x98): /* kortest{w,q} k,k */
     case X86EMUL_OPC_VEX_66(0x0f, 0x98): /* kortest{b,d} k,k */
+    case X86EMUL_OPC_VEX(0x0f, 0x99): /* ktest{w,q} k,k */
     case X86EMUL_OPC_VEX_66(0x0f, 0x99): /* ktest{b,d} k,k */
         generate_exception_if(vex.l || !vex.r
|| vex.reg != 0xf ||
                               ea.type != OP_REG, X86_EXC_UD);
-        host_and_vcpu_must_have(avx512f);
         if ( vex.w )
-            host_and_vcpu_must_have(avx512bw);
-        else if ( vex.pfx )
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(bw);
+        else if ( vex.pfx || (b & 1) )
+            visa_check(dq);
+        else
+            visa_check(f);

         get_fpu(X86EMUL_FPU_opmask);

@@ -5366,7 +5393,7 @@ x86_emulate(
                                 (evex.pfx & VEX_PREFIX_SCALAR_MASK)) ||
                                !evex.r || !evex.R || evex.z),
                               X86_EXC_UD);
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         if ( ea.type != OP_REG || !evex.brs )
             avx512_vlen_check(evex.pfx & VEX_PREFIX_SCALAR_MASK);
     simd_imm8_zmm:
@@ -5410,9 +5437,9 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x22): /* vpinsr{d,q} $imm8,r/m,xmm,xmm */
         generate_exception_if(evex.lr || evex.opmsk || evex.brs, X86_EXC_UD);
         if ( b & 2 )
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
         else
-            host_and_vcpu_must_have(avx512bw);
+            visa_check(bw);
         if ( !mode_64bit() )
             evex.w = 0;
         memcpy(mmvalp, &src.val, src.bytes);
@@ -5449,7 +5476,7 @@ x86_emulate(
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x25): /* vpternlog{d,q} $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     avx512f_imm8_no_sae:
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         generate_exception_if(ea.type != OP_MEM && evex.brs, X86_EXC_UD);
         avx512_vlen_check(false);
         goto simd_imm8_zmm;
@@ -5548,7 +5575,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f, 0xe4): /* vpmulhuw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f, 0xea): /* vpminsw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f, 0xee): /* vpmaxsw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         generate_exception_if(evex.brs, X86_EXC_UD);
         elem_bytes = b & 0x10 ?
1 : 2;
         goto avx512f_no_sae;
@@ -5773,7 +5800,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x10): /* vpsrlvw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x11): /* vpsravw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x12): /* vpsllvw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         generate_exception_if(!evex.w || evex.brs, X86_EXC_UD);
         elem_bytes = 2;
         goto avx512f_no_sae;
@@ -5783,7 +5810,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_F3(0x0f38, 0x20): /* vpmovswb [xyz]mm,{x,y}mm/mem{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x30): /* vpmovzxbw {x,y}mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_F3(0x0f38, 0x30): /* vpmovwb [xyz]mm,{x,y}mm/mem{k} */
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         if ( evex.pfx != vex_f3 )
         {
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x21): /* vpmovsxbd xmm/mem,[xyz]mm{k} */
@@ -5831,7 +5858,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x13): /* vcvtph2ps {x,y}mm/mem,[xyz]mm{k} */
         generate_exception_if(evex.w || (ea.type != OP_REG && evex.brs),
                               X86_EXC_UD);
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         if ( !evex.brs )
             avx512_vlen_check(false);
         op_bytes = 8 << evex.lr;
@@ -5885,7 +5912,7 @@ x86_emulate(
         op_bytes = 8;
         generate_exception_if(evex.brs, X86_EXC_UD);
         if ( !evex.w )
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
         goto avx512_broadcast;

     case X86EMUL_OPC_EVEX_66(0x0f38, 0x1a): /* vbroadcastf32x4 m128,{y,z}mm{k} */
@@ -5895,7 +5922,7 @@ x86_emulate(
         generate_exception_if(ea.type != OP_MEM || !evex.lr || evex.brs,
                               X86_EXC_UD);
         if ( evex.w )
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
         goto avx512_broadcast;

     case X86EMUL_OPC_VEX_66(0x0f38, 0x20): /* vpmovsxbw xmm/mem,{x,y}mm */
@@ -5920,9 +5947,9 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_F3(0x0f38, 0x28): /* vpmovm2{b,w} k,[xyz]mm */
     case X86EMUL_OPC_EVEX_F3(0x0f38, 0x38): /* vpmovm2{d,q} k,[xyz]mm */
         if ( b & 0x10 )
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
         else
-
host_and_vcpu_must_have(avx512bw);
+            visa_check(bw);
         generate_exception_if(evex.opmsk || ea.type != OP_REG, X86_EXC_UD);
         d |= TwoOp;
         op_bytes = 16 << evex.lr;
@@ -5965,7 +5992,7 @@ x86_emulate(
         fault_suppression = false;
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x44): /* vplzcnt{d,q} [xyz]mm/mem,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512cd);
+        visa_check(cd);
         goto avx512f_no_sae;

     case X86EMUL_OPC_VEX_66(0x0f38, 0x2c): /* vmaskmovps mem,{x,y}mm,{x,y}mm */
@@ -6041,7 +6068,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f38, 0xba): /* vfmsub231p{s,d} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0xbc): /* vfnmadd231p{s,d} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0xbe): /* vfnmsub231p{s,d} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         if ( ea.type != OP_REG || !evex.brs )
             avx512_vlen_check(false);
         goto simd_zmm;
@@ -6060,7 +6087,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f38, 0xbb): /* vfmsub231s{s,d} xmm/mem,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0xbd): /* vfnmadd231s{s,d} xmm/mem,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0xbf): /* vfnmsub231s{s,d} xmm/mem,xmm,xmm{k} */
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD);
         if ( !evex.brs )
             avx512_vlen_check(true);
@@ -6074,14 +6101,14 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x3a): /* vpminuw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x3c): /* vpmaxsb [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x3e): /* vpmaxuw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         generate_exception_if(evex.brs, X86_EXC_UD);
         elem_bytes = b & 2 ?: 1;
         goto avx512f_no_sae;

     case X86EMUL_OPC_EVEX_66(0x0f38, 0x40): /* vpmull{d,q} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
         if ( evex.w )
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
         goto avx512f_no_sae;

     case
X86EMUL_OPC_66(0x0f38, 0xdb): /* aesimc xmm/m128,xmm */
@@ -6121,7 +6148,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x51): /* vpdpbusds [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x52): /* vpdpwssd [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x53): /* vpdpwssds [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_vnni);
+        visa_check(_vnni);
         generate_exception_if(evex.w, X86_EXC_UD);
         goto avx512f_no_sae;

@@ -6133,7 +6160,7 @@ x86_emulate(
         d |= TwoOp;
         /* fall through */
     case X86EMUL_OPC_EVEX_F3(0x0f38, 0x52): /* vdpbf16ps [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_bf16);
+        visa_check(_bf16);
         generate_exception_if(evex.w, X86_EXC_UD);
         op_bytes = 16 << evex.lr;
         goto avx512f_no_sae;
@@ -6150,7 +6177,7 @@ x86_emulate(

     case X86EMUL_OPC_EVEX_66(0x0f38, 0x4d): /* vrcp14s{s,d} xmm/mem,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x4f): /* vrsqrt14s{s,d} xmm/mem,xmm,xmm{k} */
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         generate_exception_if(evex.brs, X86_EXC_UD);
         avx512_vlen_check(true);
         goto simd_zmm;
@@ -6159,16 +6186,16 @@ x86_emulate(
         generate_exception_if(evex.w || !evex.r || !evex.R || evex.z, X86_EXC_UD);
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x54): /* vpopcnt{b,w} [xyz]mm/mem,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_bitalg);
+        visa_check(_bitalg);
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x66): /* vpblendm{b,w} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         generate_exception_if(evex.brs, X86_EXC_UD);
         elem_bytes = 1 << evex.w;
         goto avx512f_no_sae;

     case X86EMUL_OPC_EVEX_66(0x0f38, 0x55): /* vpopcnt{d,q} [xyz]mm/mem,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_vpopcntdq);
+        visa_check(_vpopcntdq);
         goto avx512f_no_sae;

     case X86EMUL_OPC_VEX_66(0x0f38, 0x5a): /* vbroadcasti128 m128,ymm */
@@ -6177,14 +6204,14 @@ x86_emulate(

     case X86EMUL_OPC_EVEX_66(0x0f38, 0x62): /* vpexpand{b,w}
[xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x63): /* vpcompress{b,w} [xyz]mm,[xyz]mm/mem{k} */
-        host_and_vcpu_must_have(avx512_vbmi2);
+        visa_check(_vbmi2);
         elem_bytes = 1 << evex.w;
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x88): /* vexpandp{s,d} [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x89): /* vpexpand{d,q} [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x8a): /* vcompressp{s,d} [xyz]mm,[xyz]mm/mem{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x8b): /* vpcompress{d,q} [xyz]mm,[xyz]mm/mem{k} */
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         generate_exception_if(evex.brs, X86_EXC_UD);
         avx512_vlen_check(false);
         /*
@@ -6218,7 +6245,7 @@ x86_emulate(
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x71): /* vpshldv{d,q} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x73): /* vpshrdv{d,q} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_vbmi2);
+        visa_check(_vbmi2);
         goto avx512f_no_sae;

     case X86EMUL_OPC_VEX (0x0f38, 0xb0): /* vcvtneoph2ps mem,[xy]mm */
@@ -6238,16 +6265,16 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x7d): /* vpermt2{b,w} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x8d): /* vperm{b,w} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
         if ( !evex.w )
-            host_and_vcpu_must_have(avx512_vbmi);
+            visa_check(_vbmi);
         else
-            host_and_vcpu_must_have(avx512bw);
+            visa_check(bw);
         generate_exception_if(evex.brs, X86_EXC_UD);
         fault_suppression = false;
         goto avx512f_no_sae;

     case X86EMUL_OPC_EVEX_66(0x0f38, 0x78): /* vpbroadcastb xmm/m8,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x79): /* vpbroadcastw xmm/m16,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         generate_exception_if(evex.w || evex.brs, X86_EXC_UD);
         op_bytes = elem_bytes = 1 << (b & 1);
         /* See the comment at the avx512_broadcast label.
*/
@@ -6256,14 +6283,14 @@ x86_emulate(

     case X86EMUL_OPC_EVEX_66(0x0f38, 0x7a): /* vpbroadcastb r32,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x7b): /* vpbroadcastw r32,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         generate_exception_if(evex.w, X86_EXC_UD);
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x7c): /* vpbroadcast{d,q} reg,[xyz]mm{k} */
         generate_exception_if((ea.type != OP_REG || evex.brs ||
                                evex.reg != 0xf || !evex.RX),
                               X86_EXC_UD);
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         avx512_vlen_check(false);
         get_fpu(X86EMUL_FPU_zmm);

@@ -6332,7 +6359,7 @@ x86_emulate(

     case X86EMUL_OPC_EVEX_66(0x0f38, 0x83): /* vpmultishiftqb [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
         generate_exception_if(!evex.w, X86_EXC_UD);
-        host_and_vcpu_must_have(avx512_vbmi);
+        visa_check(_vbmi);
         fault_suppression = false;
         goto avx512f_no_sae;

@@ -6490,8 +6517,8 @@ x86_emulate(
                                evex.reg != 0xf ||
                                modrm_reg == state->sib_index),
                               X86_EXC_UD);
+        visa_check(f);
         avx512_vlen_check(false);
-        host_and_vcpu_must_have(avx512f);
         get_fpu(X86EMUL_FPU_zmm);

         /* Read destination and index registers. */
@@ -6652,8 +6679,8 @@ x86_emulate(
                                evex.reg != 0xf ||
                                modrm_reg == state->sib_index),
                               X86_EXC_UD);
+        visa_check(f);
         avx512_vlen_check(false);
-        host_and_vcpu_must_have(avx512f);
         get_fpu(X86EMUL_FPU_zmm);

         /* Read source and index registers.
*/
@@ -6769,7 +6796,7 @@ x86_emulate(

     case X86EMUL_OPC_EVEX_66(0x0f38, 0xb4): /* vpmadd52luq [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0xb5): /* vpmadd52huq [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_ifma);
+        visa_check(_ifma);
         generate_exception_if(!evex.w, X86_EXC_UD);
         goto avx512f_no_sae;

@@ -7239,7 +7266,7 @@ x86_emulate(
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x08): /* vrndscaleps $imm8,[xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x09): /* vrndscalepd $imm8,[xyz]mm/mem,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         generate_exception_if(evex.w != (b & 1), X86_EXC_UD);
         avx512_vlen_check(b & 2);
         goto simd_imm8_zmm;
@@ -7248,7 +7275,7 @@ x86_emulate(
         generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD);
         /* fall through */
     case X86EMUL_OPC_EVEX(0x0f3a, 0x08): /* vrndscaleph $imm8,[xyz]mm/mem,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w, X86_EXC_UD);
         avx512_vlen_check(b & 2);
         goto simd_imm8_zmm;
@@ -7361,11 +7388,11 @@ x86_emulate(
                                evex.opmsk || evex.brs),
                               X86_EXC_UD);
         if ( !(b & 2) )
-            host_and_vcpu_must_have(avx512bw);
+            visa_check(bw);
         else if ( !(b & 1) )
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
         else
-            host_and_vcpu_must_have(avx512f);
+            visa_check(f);
         get_fpu(X86EMUL_FPU_zmm);
         opc = init_evex(stub);
         goto pextr;
@@ -7379,7 +7406,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x39): /* vextracti32x4 $imm8,{y,z}mm,xmm/m128{k} */
                                             /* vextracti64x2 $imm8,{y,z}mm,xmm/m128{k} */
         if ( evex.w )
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
         generate_exception_if(evex.brs, X86_EXC_UD);
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x23): /* vshuff32x4 $imm8,{y,z}mm/mem,{y,z}mm,{y,z}mm{k} */
@@ -7399,7 +7426,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x3b): /* vextracti32x8 $imm8,zmm,ymm/m256{k} */
                                             /* vextracti64x4 $imm8,zmm,ymm/m256{k} */
         if ( !evex.w )
-
host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
         generate_exception_if(evex.lr != 2 || evex.brs, X86_EXC_UD);
         fault_suppression = false;
         goto avx512f_imm8_no_sae;
@@ -7415,7 +7442,7 @@ x86_emulate(
             generate_exception_if((evex.w || evex.reg != 0xf || !evex.RX ||
                                    (ea.type != OP_REG && (evex.z || evex.brs))),
                                   X86_EXC_UD);
-            host_and_vcpu_must_have(avx512f);
+            visa_check(f);
             avx512_vlen_check(false);
             opc = init_evex(stub);
         }
@@ -7507,7 +7534,7 @@ x86_emulate(
         if ( !(b & 0x20) )
             goto avx512f_imm8_no_sae;
     avx512bw_imm:
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         generate_exception_if(evex.brs, X86_EXC_UD);
         elem_bytes = 1 << evex.w;
         avx512_vlen_check(false);
@@ -7546,7 +7573,7 @@ x86_emulate(
         goto simd_0f_imm8_avx;

     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x21): /* vinsertps $imm8,xmm/m32,xmm,xmm */
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         generate_exception_if(evex.lr || evex.w || evex.opmsk || evex.brs,
                               X86_EXC_UD);
         op_bytes = 4;
@@ -7554,18 +7581,18 @@ x86_emulate(

     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x50): /* vrangep{s,d} $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x56): /* vreducep{s,d} $imm8,[xyz]mm/mem,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512dq);
+        visa_check(dq);
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x26): /* vgetmantp{s,d} $imm8,[xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x54): /* vfixupimmp{s,d} $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         if ( ea.type != OP_REG || !evex.brs )
             avx512_vlen_check(false);
         goto simd_imm8_zmm;

     case X86EMUL_OPC_EVEX(0x0f3a, 0x26): /* vgetmantph $imm8,[xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX(0x0f3a, 0x56): /* vreduceph $imm8,[xyz]mm/mem,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w, X86_EXC_UD);
         if ( ea.type != OP_REG || !evex.brs )
             avx512_vlen_check(false);
@@ -7573,11 +7600,11 @@ x86_emulate(

     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x51): /*
vranges{s,d} $imm8,xmm/mem,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x57): /* vreduces{s,d} $imm8,xmm/mem,xmm,xmm{k} */
-        host_and_vcpu_must_have(avx512dq);
+        visa_check(dq);
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x27): /* vgetmants{s,d} $imm8,xmm/mem,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x55): /* vfixupimms{s,d} $imm8,xmm/mem,xmm,xmm{k} */
-        host_and_vcpu_must_have(avx512f);
+        visa_check(f);
         generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD);
         if ( !evex.brs )
             avx512_vlen_check(true);
@@ -7585,7 +7612,7 @@ x86_emulate(

     case X86EMUL_OPC_EVEX(0x0f3a, 0x27): /* vgetmantsh $imm8,xmm/mem,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX(0x0f3a, 0x57): /* vreducesh $imm8,xmm/mem,xmm,xmm{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w, X86_EXC_UD);
         if ( !evex.brs )
             avx512_vlen_check(true);
@@ -7596,18 +7623,19 @@ x86_emulate(
     case X86EMUL_OPC_VEX_66(0x0f3a, 0x30): /* kshiftr{b,w} $imm8,k,k */
     case X86EMUL_OPC_VEX_66(0x0f3a, 0x32): /* kshiftl{b,w} $imm8,k,k */
         if ( !vex.w )
-            host_and_vcpu_must_have(avx512dq);
+            visa_check(dq);
+        else
+            visa_check(f);
     opmask_shift_imm:
         generate_exception_if(vex.l || !vex.r || vex.reg != 0xf ||
                               ea.type != OP_REG, X86_EXC_UD);
-        host_and_vcpu_must_have(avx512f);
         get_fpu(X86EMUL_FPU_opmask);
         op_bytes = 1; /* Any non-zero value will do.
*/
         goto simd_0f_imm8;

     case X86EMUL_OPC_VEX_66(0x0f3a, 0x31): /* kshiftr{d,q} $imm8,k,k */
     case X86EMUL_OPC_VEX_66(0x0f3a, 0x33): /* kshiftl{d,q} $imm8,k,k */
-        host_and_vcpu_must_have(avx512bw);
+        visa_check(bw);
         goto opmask_shift_imm;

     case X86EMUL_OPC_66(0x0f3a, 0x44): /* pclmulqdq $imm8,xmm/m128,xmm */
@@ -7748,7 +7776,7 @@ x86_emulate(

     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x66): /* vfpclassp{s,d} $imm8,[xyz]mm/mem,k{k} */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x67): /* vfpclasss{s,d} $imm8,xmm/mem,k{k} */
-        host_and_vcpu_must_have(avx512dq);
+        visa_check(dq);
         generate_exception_if(!evex.r || !evex.R || evex.z, X86_EXC_UD);
         if ( !(b & 1) )
             goto avx512f_imm8_no_sae;
@@ -7758,7 +7786,7 @@ x86_emulate(

     case X86EMUL_OPC_EVEX(0x0f3a, 0x66): /* vfpclassph $imm8,[xyz]mm/mem,k{k} */
     case X86EMUL_OPC_EVEX(0x0f3a, 0x67): /* vfpclasssh $imm8,xmm/mem,k{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w || !evex.r || !evex.R || evex.z, X86_EXC_UD);
         if ( !(b & 1) )
             goto avx512f_imm8_no_sae;
@@ -7773,14 +7801,14 @@ x86_emulate(
         /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x71): /* vpshld{d,q} $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f3a, 0x73): /* vpshrd{d,q} $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_vbmi2);
+        visa_check(_vbmi2);
         goto avx512f_imm8_no_sae;

     case X86EMUL_OPC_EVEX_F3(0x0f3a, 0xc2): /* vcmpsh $imm8,xmm/mem,xmm,k{k} */
         generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD);
         /* fall through */
     case X86EMUL_OPC_EVEX(0x0f3a, 0xc2): /* vcmpph $imm8,[xyz]mm/mem,[xyz]mm,k{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w || !evex.r || !evex.R || evex.z, X86_EXC_UD);
         if ( ea.type != OP_REG || !evex.brs )
             avx512_vlen_check(evex.pfx & VEX_PREFIX_SCALAR_MASK);
@@ -7856,13 +7884,13 @@ x86_emulate(
     CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x5d): /* vmin{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x5e): /*
vdiv{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x5f): /* vmax{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w, X86_EXC_UD);
         goto avx512f_all_fp;

     CASE_SIMD_ALL_FP(_EVEX, 5, 0x5a): /* vcvtp{h,d}2p{h,d} [xyz]mm/mem,[xyz]mm{k} */
                                       /* vcvts{h,d}2s{h,d} xmm/mem,xmm,xmm{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         if ( vex.pfx & VEX_PREFIX_SCALAR_MASK )
             d &= ~TwoOp;
         op_bytes = 2 << (((evex.pfx & VEX_PREFIX_SCALAR_MASK) ? 0 : 1 + evex.lr) +
@@ -7873,7 +7901,7 @@ x86_emulate(
                                        /* vcvtqq2ph [xyz]mm/mem,xmm{k} */
     case X86EMUL_OPC_EVEX_F2(5, 0x7a): /* vcvtudq2ph [xyz]mm/mem,[xy]mm{k} */
                                        /* vcvtuqq2ph [xyz]mm/mem,xmm{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         if ( ea.type != OP_REG || !evex.brs )
             avx512_vlen_check(false);
         op_bytes = 16 << evex.lr;
@@ -7883,7 +7911,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_F3(5, 0x5b): /* vcvttph2dq [xy]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX (5, 0x78): /* vcvttph2udq [xy]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX (5, 0x79): /* vcvtph2udq [xy]mm/mem,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w, X86_EXC_UD);
         if ( ea.type != OP_REG || !evex.brs )
             avx512_vlen_check(false);
@@ -7894,7 +7922,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(5, 0x79): /* vcvtph2uqq xmm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(5, 0x7a): /* vcvttph2qq xmm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(5, 0x7b): /* vcvtph2qq xmm/mem,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w, X86_EXC_UD);
         if ( ea.type != OP_REG || !evex.brs )
             avx512_vlen_check(false);
@@ -7931,7 +7959,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(6, 0xba): /* vfmsub231ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0xbc): /* vfnmadd231ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0xbe): /* vfnmsub231ph
[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w, X86_EXC_UD);
         if ( ea.type != OP_REG || !evex.brs )
             avx512_vlen_check(false);
@@ -7953,7 +7981,7 @@ x86_emulate(
     case X86EMUL_OPC_EVEX_66(6, 0xbb): /* vfmsub231sh xmm/m16,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0xbd): /* vfnmadd231sh xmm/m16,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0xbf): /* vfnmsub231sh xmm/m16,xmm,xmm{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w || (ea.type != OP_REG && evex.brs),
                               X86_EXC_UD);
         if ( !evex.brs )
@@ -7962,13 +7990,13 @@ x86_emulate(

     case X86EMUL_OPC_EVEX_66(6, 0x4c): /* vrcpph [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x4e): /* vrsqrtph [xyz]mm/mem,[xyz]mm{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w, X86_EXC_UD);
         goto avx512f_no_sae;

     case X86EMUL_OPC_EVEX_66(6, 0x4d): /* vrcpsh xmm/m16,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x4f): /* vrsqrtsh xmm/m16,xmm,xmm{k} */
-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w || evex.brs, X86_EXC_UD);
         avx512_vlen_check(true);
         goto simd_zmm;
@@ -7986,7 +8014,7 @@ x86_emulate(
     {
         unsigned int src1 = ~evex.reg;

-        host_and_vcpu_must_have(avx512_fp16);
+        visa_check(_fp16);
         generate_exception_if(evex.w || ((b & 1) && ea.type != OP_REG && evex.brs),
                               X86_EXC_UD);
         if ( mode_64bit() )
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -795,7 +795,7 @@ static void __init noinline xstate_check
     if ( cpu_has_mpx )
         check_new_xstate(&s, X86_XCR0_BNDCSR | X86_XCR0_BNDREGS);

-    if ( cpu_has_avx512f )
+    if ( boot_cpu_has(X86_FEATURE_AVX512F) || boot_cpu_has(X86_FEATURE_AVX10) )
         check_new_xstate(&s, X86_XCR0_HI_ZMM | X86_XCR0_ZMM | X86_XCR0_OPMASK);

     /*
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -355,7 +355,7 @@ XEN_CPUFEATURE(UTMR, 15*32
 XEN_CPUFEATURE(PREFETCHI,
15*32+14) /*A PREFETCHIT{0,1} Instructions */
 XEN_CPUFEATURE(USER_MSR, 15*32+15) /*s U{RD,WR}MSR Instructions */
 XEN_CPUFEATURE(CET_SSS, 15*32+18) /* CET Supervisor Shadow Stacks safe to use */
-XEN_CPUFEATURE(AVX10, 15*32+19) /* AVX10 Converged Vector ISA */
+XEN_CPUFEATURE(AVX10, 15*32+19) /*a AVX10 Converged Vector ISA */

 /* Intel-defined CPU features, MSR_ARCH_CAPS 0x10a.eax, word 16 */
 XEN_CPUFEATURE(RDCL_NO, 16*32+ 0) /*A No Rogue Data Cache Load (Meltdown) */

From patchwork Wed Dec 11 10:12:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13903286
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id AFD6BE7717D
 for ; Wed, 11 Dec 2024 10:12:39 +0000 (UTC)
Received: from list by lists.xenproject.org with outflank-mailman.854283.1267525
 (Exim 4.92) (envelope-from ) id 1tLJhn-00050c-Gf;
 Wed, 11 Dec 2024 10:12:31 +0000
X-Outflank-Mailman: Message body and most headers restored to incoming version
Received: by outflank-mailman (output) from mailman id 854283.1267525;
 Wed, 11 Dec 2024 10:12:31 +0000
Received: from localhost ([127.0.0.1] helo=lists.xenproject.org)
 by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from )
 id 1tLJhn-00050V-E8; Wed, 11 Dec 2024 10:12:31 +0000
Received: by outflank-mailman (input) for mailman id 854283;
 Wed, 11 Dec 2024 10:12:29 +0000
Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50]
 helo=se1-gles-flk1.inumbo.com)
 by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from )
 id 1tLJhl-0003z3-OZ for xen-devel@lists.xenproject.org;
 Wed, 11 Dec 2024 10:12:29 +0000
Received: from mail-wr1-x435.google.com (mail-wr1-x435.google.com
[2a00:1450:4864:20::435]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 69d981eb-b7a8-11ef-99a3-01e77a169b0f; Wed, 11 Dec 2024 11:12:23 +0100 (CET) Received: by mail-wr1-x435.google.com with SMTP id ffacd0b85a97d-3862f32a33eso2182391f8f.3 for ; Wed, 11 Dec 2024 02:12:27 -0800 (PST) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. [37.24.206.209]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2ef68c07bfcsm9657745a91.12.2024.12.11.02.12.24 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 11 Dec 2024 02:12:26 -0800 (PST) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 69d981eb-b7a8-11ef-99a3-01e77a169b0f DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1733911947; x=1734516747; darn=lists.xenproject.org; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :references:cc:to:from:subject:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=ZqFYw/nizVXDm7kHbUCaTgxJOMEM5PPDM0bi8pFmXx4=; b=IOtFSqR1Eul+3lfMhZFJ+muVvqs3eRz6aBpzDN+G4y3jpP1jWyKMR0uDVi8nQoi5/b jcUY3E/pVl1jiV/fmjlT3EOQ1X2o8HJZkSnOLQbQk+nmRk5AJ553v5osqzlgnfAJKBS4 /oVFFjHwb/vHke9NgJI39fcajUyzzTRrBDf7pfHMve45dayLKTJrV6wrXs+vOVJYGURR 2J9mi8FKcxpDpSoZFZGS4ldvOOJP7rawPDEsKuFrIMXkzQmr2qBloJBfmf+9O8PsLA+m esQ87LmNUl/vK9NiPaD9n3qi14oxrTRBfwQHiZ2BoMEOjup+vrMEAFNjqPqvVxPmKmS/ 6cdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733911947; x=1734516747; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :references:cc:to:from:subject:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=ZqFYw/nizVXDm7kHbUCaTgxJOMEM5PPDM0bi8pFmXx4=; 
b=er6acMPSlMqs/0uaQ+k81njNF3yIIIcy7FZm6nfVnUbV7z3VJlRfHM42japvkmrZN0 g3Ibs0gj1Tp0iM5OHo93LX1Oa38U3coDXZhiBy89hEqLPDcIg9WK9LuCjzEyInVSIjHd CCZJDZMBeQQ/+BjwemChi3sDPSDpKvPeBxW7Gai0tDjQb6C7z06rIcfuTH0iG6Fbw7wI LmJRah+mhtc5Wqkg8IXgZobOwuogjjgNy2Rz/aUe2qjmj+zjgfe8sUV+4n/h3eaxIRvy +102ziDO9KkDGxn3LXKuvJ5/vP5/wR+ssxZKhMbkRid1jkxzrwWW4wOSeRCIFn27W2WA 86cg== X-Gm-Message-State: AOJu0YxaU14yEP+zuh1gh9E3sC66eky/diCwthEB09540i/E/eWmPiOW oK85ap0/M2LnBhaO0fjs3sIm7LrauGT1kJ7u95tSMUsDpko4mo4Z66dWEDllIJqCnTJumA2efCI = X-Gm-Gg: ASbGncv4bdnW5TjxCV7oqZmf8JlH8SBSSpMhVzmz12KV1KOQ0HqC0KrztbG9/Rp2s+G 3yfOILtSEyfkDL16pVogLamm69JhZk45vdKzmjlkpp/Ku+hpwLRZBRDr20jcpLZ25y5fFLRpmJx Lbtxdi5yXNHmbtKFDXvivWeU+PbruMgwLTWMnY80bhk/TC5kFQ3a/XqLsGfK170ElxXNtuKemXE Lkbtrxe5V/rtPHiFvIeAQrFjwS4tv/abCVElmbH0EHgBGia5cZwfX/xjtNGOM9cU357yNGbJ6Ze eBOiFbf9xJz7cRd37xR8mMUNh31Qpz28rKXyJDA= X-Google-Smtp-Source: AGHT+IEjw1VR6RvKQ56Pj7s0pHYojoghsUb91BvgzPodD1nm29nl21v8UUvBBmPiK4/ZlBLg7jD27w== X-Received: by 2002:a5d:47ab:0:b0:385:fb34:d59f with SMTP id ffacd0b85a97d-3864ce86a74mr2122464f8f.11.1733911947260; Wed, 11 Dec 2024 02:12:27 -0800 (PST) Message-ID: Date: Wed, 11 Dec 2024 11:12:21 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: [PATCH v3 03/16] x86emul/test: use simd_check_avx512*() in main() From: Jan Beulich To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , =?utf-8?q?Roger_Pau_Monn?= =?utf-8?q?=C3=A9?= References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com> Content-Language: en-US Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP 
nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com> In preparation for having these also cover AVX10, use the helper functions in preference of open-coded cpu_has_avx512* for those features that AVX10 includes. Introduce a couple further helper functions where they weren't previously needed. Note that this way simd_check_avx512f_sha_vl() gains an AVX512F check (which is likely benign) and simd_check_avx512bw_gf_vl() gains an AVX512BW check (which was clearly missing). Signed-off-by: Jan Beulich --- v2: Re-base over dropping of Xeon Phi support. 
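For illustration only (not part of the patch): a minimal standalone sketch of how building on the helpers makes simd_check_avx512bw_gf_vl() pick up the AVX512BW requirement. The struct and function names below are hypothetical stand-ins for the harness's cpu_has_* macros, and bw_vl() mirrors simd_check_avx512bw_vl() as it stands before any AVX10 handling is added.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the feature bits the test harness queries. */
struct feats {
    bool gfni, avx512bw, avx512vl;
};

/* Old open-coded form: AVX512BW is never consulted. */
static bool gf_vl_old(const struct feats *f)
{
    return f->gfni && f->avx512vl;
}

/* Stand-in for simd_check_avx512bw_vl(). */
static bool bw_vl(const struct feats *f)
{
    return f->avx512bw && f->avx512vl;
}

/* New form: built on the helper, so the AVX512BW requirement comes along. */
static bool gf_vl_new(const struct feats *f)
{
    return f->gfni && bw_vl(f);
}
```

With GFNI and AVX512VL set but AVX512BW clear, gf_vl_old() accepts while gf_vl_new() declines, which is the gap the commit message refers to.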
--- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -167,6 +167,11 @@ static bool simd_check_avx512vbmi_vl(voi return cpu_has_avx512_vbmi && cpu_has_avx512vl; } +static bool simd_check_avx512vbmi2(void) +{ + return cpu_has_avx512_vbmi2; +} + static bool simd_check_sse4_sha(void) { return cpu_has_sha && cpu_has_sse4_2; @@ -179,7 +184,7 @@ static bool simd_check_avx_sha(void) static bool simd_check_avx512f_sha_vl(void) { - return cpu_has_sha && cpu_has_avx512vl; + return cpu_has_sha && simd_check_avx512f_vl(); } static bool simd_check_avx2_vaes(void) @@ -189,13 +194,13 @@ static bool simd_check_avx2_vaes(void) static bool simd_check_avx512bw_vaes(void) { - return cpu_has_aesni && cpu_has_vaes && cpu_has_avx512bw; + return cpu_has_aesni && cpu_has_vaes && simd_check_avx512bw(); } static bool simd_check_avx512bw_vaes_vl(void) { return cpu_has_aesni && cpu_has_vaes && - cpu_has_avx512bw && cpu_has_avx512vl; + simd_check_avx512bw_vl(); } static bool simd_check_avx2_vpclmulqdq(void) @@ -205,22 +210,22 @@ static bool simd_check_avx2_vpclmulqdq(v static bool simd_check_avx512bw_vpclmulqdq(void) { - return cpu_has_vpclmulqdq && cpu_has_avx512bw; + return cpu_has_vpclmulqdq && simd_check_avx512bw(); } static bool simd_check_avx512bw_vpclmulqdq_vl(void) { - return cpu_has_vpclmulqdq && cpu_has_avx512bw && cpu_has_avx512vl; + return cpu_has_vpclmulqdq && simd_check_avx512bw_vl(); } static bool simd_check_avx512vbmi2_vpclmulqdq(void) { - return cpu_has_avx512_vbmi2 && simd_check_avx512bw_vpclmulqdq(); + return simd_check_avx512vbmi2() && simd_check_avx512bw_vpclmulqdq(); } static bool simd_check_avx512vbmi2_vpclmulqdq_vl(void) { - return cpu_has_avx512_vbmi2 && simd_check_avx512bw_vpclmulqdq_vl(); + return simd_check_avx512vbmi2() && simd_check_avx512bw_vpclmulqdq_vl(); } static bool simd_check_sse2_gf(void) @@ -235,12 +240,17 @@ static bool simd_check_avx2_gf(void) static bool simd_check_avx512bw_gf(void) { - return 
cpu_has_gfni && cpu_has_avx512bw; + return cpu_has_gfni && simd_check_avx512bw(); } static bool simd_check_avx512bw_gf_vl(void) { - return cpu_has_gfni && cpu_has_avx512vl; + return cpu_has_gfni && simd_check_avx512bw_vl(); +} + +static bool simd_check_avx512vnni(void) +{ + return cpu_has_avx512_vnni; } static bool simd_check_avx512fp16(void) @@ -3195,7 +3205,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovq %xmm1,32(%edx)..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovq_to_mem); @@ -3219,7 +3229,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovq 32(%edx),%xmm0..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovq_from_mem); @@ -3342,7 +3352,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovdqu32 %zmm2,(%ecx){%k1}..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(vmovdqu32_to_mem); @@ -3372,7 +3382,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovdqu32 64(%edx),%zmm2{%k2}..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(vmovdqu32_from_mem); @@ -3397,7 +3407,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovdqu16 %zmm3,(%ecx){%k1}..."); - if ( stack_exec && cpu_has_avx512bw ) + if ( stack_exec && simd_check_avx512bw() ) { decl_insn(vmovdqu16_to_mem); @@ -3429,7 +3439,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovdqu16 64(%edx),%zmm3{%k2}..."); - if ( stack_exec && cpu_has_avx512bw ) + if ( stack_exec && simd_check_avx512bw() ) { decl_insn(vmovdqu16_from_mem); @@ -3557,7 +3567,7 @@ int main(int argc, char **argv) printf("%-40s", "Testing vmovsd %xmm5,16(%ecx){%k3}..."); memset(res, 0x88, 128); memset(res + 
20, 0x77, 8); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(vmovsd_masked_to_mem); @@ -3592,7 +3602,7 @@ int main(int argc, char **argv) } printf("%-40s", "Testing vmovaps (%edx),%zmm7{%k3}{z}..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(vmovaps_masked_from_mem); @@ -3775,7 +3785,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd %xmm3,32(%ecx)..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovd_to_mem); @@ -3800,7 +3810,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd 32(%ecx),%xmm4..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovd_from_mem); @@ -3990,7 +4000,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd %xmm2,%ebx..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovd_to_reg); @@ -4016,7 +4026,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd %ebx,%xmm1..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovd_from_reg); @@ -4118,7 +4128,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovq %xmm11,32(%ecx)..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovq_to_mem2); @@ -4208,7 +4218,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovq %xmm22,%rbx..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovq_to_reg); @@ -4401,7 +4411,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovntdqa 64(%ecx),%zmm4..."); - if ( stack_exec 
&& cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vmovntdqa); @@ -4997,7 +5007,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vcvtph2ps 32(%ecx),%zmm7{%k4}..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(evex_vcvtph2ps); decl_insn(evex_vcvtps2ph); @@ -5040,7 +5050,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vfixupimmpd $0,8(%edx){1to8},%zmm3,%zmm4..."); - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(vfixupimmpd); static const struct { @@ -5079,7 +5089,7 @@ int main(int argc, char **argv) printf("%-40s", "Testing vfpclasspsz $0x46,64(%edx),%k2..."); - if ( stack_exec && cpu_has_avx512dq ) + if ( stack_exec && simd_check_avx512dq() ) { decl_insn(vfpclassps); @@ -5111,7 +5121,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vfpclassphz $0x46,128(%ecx),%k3..."); - if ( stack_exec && cpu_has_avx512_fp16 ) + if ( stack_exec && simd_check_avx512fp16() ) { decl_insn(vfpclassph); @@ -5154,7 +5164,7 @@ int main(int argc, char **argv) * on the mapping boundaries) that elements controlled by clear mask * bits don't get accessed. */ - if ( stack_exec && cpu_has_avx512f ) + if ( stack_exec && simd_check_avx512f() ) { decl_insn(vpcompressd); decl_insn(vpcompressq); @@ -5256,7 +5266,7 @@ int main(int argc, char **argv) } #if __GNUC__ > 7 /* can't check for __AVX512VBMI2__ here */ - if ( stack_exec && cpu_has_avx512_vbmi2 ) + if ( stack_exec && simd_check_avx512vbmi2() ) { decl_insn(vpcompressb); decl_insn(vpcompressw); @@ -5444,7 +5454,7 @@ int main(int argc, char **argv) } printf("%-40s", "Testing vpdpwssd (%ecx),%{y,z}mmA,%{y,z}mmB..."); - if ( stack_exec && cpu_has_avx512_vnni && cpu_has_avx_vnni ) + if ( stack_exec && simd_check_avx512vnni() && cpu_has_avx_vnni ) { /* Do the same operation two ways and compare the results. 
*/ decl_insn(vpdpwssd_vex1); @@ -5499,7 +5509,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovsh 8(%ecx),%xmm5..."); - if ( stack_exec && cpu_has_avx512_fp16 ) + if ( stack_exec && simd_check_avx512fp16() ) { decl_insn(vmovsh_from_mem); decl_insn(vmovw_to_gpr);

From patchwork Wed Dec 11 10:12:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13903308
List-Id: Xen developer discussion
Message-ID: <35109b64-87bb-435a-b3e9-2c5b378532e9@suse.com>
Date: Wed, 11 Dec 2024 11:12:45 +0100
Subject: [PATCH v3 04/16] x86emul/test: drop cpu_has_avx512vl
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>
In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>

AVX512VL not being a standalone feature anyway, but always needing to be
combined with some other AVX512*, replace uses of cpu_has_avx512vl by
just the feature bit check.

Signed-off-by: Jan Beulich
---
v2: Re-base over dropping of Xeon Phi support.
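As a sanity check on the reasoning (illustrative only, not part of the patch): in the simd_check_*_vl() helpers the VL bit is combined with a cpu_has_avx512* macro that already applies the XCR0 check, so repeating that check for VL should add nothing. The sketch below models this with hypothetical names; xcr0_ok stands in for xcr0_mask(0xe6), and has_avx512bw() mirrors the cpu_has_avx512bw definition visible in x86-emulate.h.

```c
#include <stdbool.h>

/* Hypothetical model of the inputs; not the harness's real data structures. */
struct model {
    bool avx512bw;  /* policy bit for AVX512BW */
    bool avx512vl;  /* policy bit for AVX512VL */
    bool xcr0_ok;   /* stands in for xcr0_mask(0xe6) */
};

/* Mirrors cpu_has_avx512bw: feature bit plus XCR0 state check. */
static bool has_avx512bw(const struct model *m)
{
    return m->avx512bw && m->xcr0_ok;
}

/* Before: the VL side repeated the XCR0 check via cpu_has_avx512vl. */
static bool bw_vl_old(const struct model *m)
{
    return has_avx512bw(m) && (m->avx512vl && m->xcr0_ok);
}

/* After: only the plain VL feature bit is consulted. */
static bool bw_vl_new(const struct model *m)
{
    return has_avx512bw(m) && m->avx512vl;
}

/* Exhaustively compare both forms over all eight input combinations. */
static bool forms_agree(void)
{
    for ( unsigned int i = 0; i < 8; ++i )
    {
        struct model m = { i & 1, (i >> 1) & 1, (i >> 2) & 1 };

        if ( bw_vl_old(&m) != bw_vl_new(&m) )
            return false;
    }

    return true;
}
```

forms_agree() walks all input combinations, so in this model the two forms cannot diverge: whenever the XCR0 check fails, has_avx512bw() has already rejected the combination.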
--- a/tools/tests/x86_emulator/evex-disp8.c +++ b/tools/tests/x86_emulator/evex-disp8.c @@ -998,7 +998,8 @@ static void test_group(const struct test { for ( j = 0; j < nr_vl; ++j ) { - if ( vl[0] == VL_512 && vl[j] != VL_512 && !cpu_has_avx512vl ) + if ( vl[0] == VL_512 && vl[j] != VL_512 && + !cpu_policy.feat.avx512vl ) continue; switch ( tests[i].esz ) --- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -131,7 +131,7 @@ static bool simd_check_avx512f(void) static bool simd_check_avx512f_vl(void) { - return cpu_has_avx512f && cpu_has_avx512vl; + return cpu_has_avx512f && cpu_policy.feat.avx512vl; } #define simd_check_avx512vl_sg simd_check_avx512f_vl @@ -143,7 +143,7 @@ static bool simd_check_avx512dq(void) static bool simd_check_avx512dq_vl(void) { - return cpu_has_avx512dq && cpu_has_avx512vl; + return cpu_has_avx512dq && cpu_policy.feat.avx512vl; } static bool simd_check_avx512bw(void) @@ -154,7 +154,7 @@ static bool simd_check_avx512bw(void) static bool simd_check_avx512bw_vl(void) { - return cpu_has_avx512bw && cpu_has_avx512vl; + return cpu_has_avx512bw && cpu_policy.feat.avx512vl; } static bool simd_check_avx512vbmi(void) @@ -164,7 +164,7 @@ static bool simd_check_avx512vbmi(void) static bool simd_check_avx512vbmi_vl(void) { - return cpu_has_avx512_vbmi && cpu_has_avx512vl; + return cpu_has_avx512_vbmi && cpu_policy.feat.avx512vl; } static bool simd_check_avx512vbmi2(void) @@ -260,7 +260,7 @@ static bool simd_check_avx512fp16(void) static bool simd_check_avx512fp16_vl(void) { - return cpu_has_avx512_fp16 && cpu_has_avx512vl; + return cpu_has_avx512_fp16 && cpu_policy.feat.avx512vl; } static void simd_set_regs(struct cpu_user_regs *regs) --- a/tools/tests/x86_emulator/x86-emulate.h +++ b/tools/tests/x86_emulator/x86-emulate.h @@ -173,8 +173,6 @@ void wrpkru(unsigned int val); #define cpu_has_sha cpu_policy.feat.sha #define cpu_has_avx512bw (cpu_policy.feat.avx512bw && \ xcr0_mask(0xe6)) -#define 
cpu_has_avx512vl (cpu_policy.feat.avx512vl && \ - xcr0_mask(0xe6)) #define cpu_has_avx512_vbmi (cpu_policy.feat.avx512_vbmi && \ xcr0_mask(0xe6)) #define cpu_has_avx512_vbmi2 (cpu_policy.feat.avx512_vbmi2 && \

From patchwork Wed Dec 11 10:13:13 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13903310
List-Id: Xen developer discussion
Date: Wed, 11 Dec 2024 11:13:13 +0100
Subject: [PATCH v3 05/16] x86emul: AVX10.1 testing
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>
In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>

Re-use respective AVX512 tests, by suitably adjusting the predicate
functions. This leaves test names ("Testing ... NN-bit code sequence")
somewhat misleading, but I think we can live with that.

Note that the AVX512{BW,DQ} opmask tests cannot be run as-is for the
AVX10/256 case, as they include 512-bit vector <-> opmask insn tests.

Signed-off-by: Jan Beulich
---
SDE: -gnr / -gnr256
---
TBD: For AVX10.1/256 need to somehow guarantee that the generated blobs
     really don't use 512-bit insns (it's uncertain whether passing
     -mprefer-vector-width= is enough). Right now according to my
     testing on SDE this is all fine. May need to probe for support of
     the new -mno-evex512 compiler option.

The AVX512{BW,DQ} opmask tests could of course be cloned (i.e. rebuilt
another time with -mavx512vl passed) accordingly, but the coverage gain
would be pretty marginal.
---
v2: Drop SDE 9.27.0 workaround. Re-base over dropping of Xeon Phi support.
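To illustrate the three predicate flavours the patch ends up with (a sketch only, not part of the patch): 512-bit tests want AVX512 or AVX10.1 with 512-bit vectors, VL tests want AVX512VL or AVX10.1 with 256-bit vectors, and scalar tests accept any AVX10.1. The struct below is hypothetical. It assumes that cpu_has_avx10_1_256 and cpu_has_avx10_1_512 mean "AVX10.1 present and that vector size enumerated", which this patch does not spell out, and it leaves the XCR0 state checks aside for brevity.

```c
#include <stdbool.h>

/* Hypothetical model of the relevant policy bits. */
struct model {
    bool avx512f, avx512vl;       /* legacy AVX512 feature bits */
    bool avx10_1, vsz256, vsz512; /* AVX10.1 and its enumerated vector sizes */
};

/* Assumed meaning of cpu_has_avx10_1_256 / cpu_has_avx10_1_512. */
static bool avx10_1_256(const struct model *m)
{
    return m->avx10_1 && m->vsz256;
}

static bool avx10_1_512(const struct model *m)
{
    return m->avx10_1 && m->vsz512;
}

/* 512-bit vector tests, cf. simd_check_avx512f(). */
static bool check_f(const struct model *m)
{
    return m->avx512f || avx10_1_512(m);
}

/* 128-/256-bit vector tests, cf. simd_check_avx512f_vl(). */
static bool check_f_vl(const struct model *m)
{
    return (m->avx512f && m->avx512vl) || avx10_1_256(m);
}

/* Scalar tests, cf. simd_check_avx512f_sc(). */
static bool check_f_sc(const struct model *m)
{
    return m->avx512f || m->avx10_1;
}
```

On an AVX10.1/256-only model (avx10_1 and vsz256 set, everything else clear) check_f() declines while check_f_vl() and check_f_sc() accept. The AVX512{BW,DQ} opmask tests stay keyed to the 512-bit style predicates, so they would be skipped on such a model.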
--- a/tools/tests/x86_emulator/evex-disp8.c +++ b/tools/tests/x86_emulator/evex-disp8.c @@ -999,7 +999,11 @@ static void test_group(const struct test for ( j = 0; j < nr_vl; ++j ) { if ( vl[0] == VL_512 && vl[j] != VL_512 && - !cpu_policy.feat.avx512vl ) + !cpu_policy.feat.avx512vl && !cpu_policy.feat.avx10 ) + continue; + + if ( vl[j] == VL_512 && !cpu_policy.feat.avx512f && + !cpu_policy.avx10.vsz512 ) continue; switch ( tests[i].esz ) @@ -1050,6 +1054,27 @@ static void test_group(const struct test } } +/* AVX512 (sub)features implied by AVX10. */ +#define avx10_has_avx512f true +#define avx10_has_avx512bw true +#define avx10_has_avx512cd true +#define avx10_has_avx512dq true +#define avx10_has_avx512_bf16 true +#define avx10_has_avx512_bitalg true +#define avx10_has_avx512_fp16 true +#define avx10_has_avx512_ifma true +#define avx10_has_avx512_vbmi true +#define avx10_has_avx512_vbmi2 true +#define avx10_has_avx512_vnni true +#define avx10_has_avx512_vpopcntdq true + +/* AVX512 sub-features /not/ implied by AVX10. 
*/ +#define avx10_has_avx512er false +#define avx10_has_avx512pf false +#define avx10_has_avx512_4fmaps false +#define avx10_has_avx512_4vnniw false +#define avx10_has_avx512_vp2intersect false + void evex_disp8_test(void *instr, struct x86_emulate_ctxt *ctxt, const struct x86_emulate_ops *ops) { @@ -1057,8 +1082,8 @@ void evex_disp8_test(void *instr, struct emulops.read = read; emulops.write = write; -#define RUN(feat, vl) do { \ - if ( cpu_has_##feat ) \ +#define run(cond, feat, vl) do { \ + if ( cond ) \ { \ printf("%-40s", "Testing " #feat "/" #vl " disp8 handling..."); \ test_group(feat ## _ ## vl, ARRAY_SIZE(feat ## _ ## vl), \ @@ -1067,6 +1092,12 @@ void evex_disp8_test(void *instr, struct } \ } while ( false ) +#define RUN(feat, vl) \ + run(cpu_has_ ## feat || \ + (cpu_has_avx10_1 && cpu_policy.avx10.vsz256 && avx10_has_ ## feat && \ + (ARRAY_SIZE(vl_ ## vl) > 1 || &vl_ ## vl[0] != &vl_512[0])), \ + feat, vl) + RUN(avx512f, all); RUN(avx512f, 128); RUN(avx512f, no128); @@ -1089,10 +1120,15 @@ void evex_disp8_test(void *instr, struct RUN(avx512_fp16, all); RUN(avx512_fp16, 128); - if ( cpu_has_avx512f ) +#undef RUN + + if ( cpu_has_avx512f || cpu_has_avx10_1 ) { +#define RUN(feat, vl) run(cpu_has_ ## feat, feat, vl) RUN(gfni, all); RUN(vaes, all); RUN(vpclmulqdq, all); +#undef RUN } +#undef run } --- a/tools/tests/x86_emulator/testcase.mk +++ b/tools/tests/x86_emulator/testcase.mk @@ -4,7 +4,27 @@ include $(XEN_ROOT)/tools/Rules.mk $(call cc-options-add,CFLAGS,CC,$(EMBEDDED_EXTRA_CFLAGS)) -CFLAGS += -fno-builtin -g0 $($(TESTCASE)-cflags) +ifneq ($(filter -mavx512%,$($(TESTCASE)-cflags)),) + +cflags-vsz64 := +cflags-vsz32 := -mprefer-vector-width=256 +cflags-vsz16 := -mprefer-vector-width=128 +# Scalar tests don't set VEC_SIZE (and VEC_MAX is used by S/G ones only) +cflags-vsz := -mprefer-vector-width=128 + +ifneq ($(filter -DVEC_SIZE=%,$($(TESTCASE)-cflags)),) +CFLAGS-VSZ := $(cflags-vsz$(patsubst -DVEC_SIZE=%,%,$(filter -DVEC_SIZE=%,$($(TESTCASE)-cflags)))) 
+else +CFLAGS-VSZ := $(cflags-vsz$(patsubst -DVEC_MAX=%,%,$(filter -DVEC_MAX=%,$($(TESTCASE)-cflags)))) +endif + +else + +CFLAGS-VSZ := + +endif + +CFLAGS += -fno-builtin -g0 $($(TESTCASE)-cflags) $(CFLAGS-VSZ) LDFLAGS_DIRECT += $(shell { $(LD) -v --warn-rwx-segments; } >/dev/null 2>&1 && echo --no-warn-rwx-segments) --- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -124,52 +124,61 @@ static bool simd_check_avx_pclmul(void) static bool simd_check_avx512f(void) { - return cpu_has_avx512f; + return cpu_has_avx512f || cpu_has_avx10_1_512; } -#define simd_check_avx512f_opmask simd_check_avx512f #define simd_check_avx512f_sg simd_check_avx512f +static bool simd_check_avx512f_sc(void) +{ + return cpu_has_avx512f || cpu_has_avx10_1; +} +#define simd_check_avx512f_opmask simd_check_avx512f_sc + static bool simd_check_avx512f_vl(void) { - return cpu_has_avx512f && cpu_policy.feat.avx512vl; + return (cpu_has_avx512f && cpu_policy.feat.avx512vl) || + cpu_has_avx10_1_256; } #define simd_check_avx512vl_sg simd_check_avx512f_vl static bool simd_check_avx512dq(void) { - return cpu_has_avx512dq; + return cpu_has_avx512dq || cpu_has_avx10_1_512; } #define simd_check_avx512dq_opmask simd_check_avx512dq static bool simd_check_avx512dq_vl(void) { - return cpu_has_avx512dq && cpu_policy.feat.avx512vl; + return (cpu_has_avx512dq && cpu_policy.feat.avx512vl) || + cpu_has_avx10_1_256; } static bool simd_check_avx512bw(void) { - return cpu_has_avx512bw; + return cpu_has_avx512bw || cpu_has_avx10_1_512; } #define simd_check_avx512bw_opmask simd_check_avx512bw static bool simd_check_avx512bw_vl(void) { - return cpu_has_avx512bw && cpu_policy.feat.avx512vl; + return (cpu_has_avx512bw && cpu_policy.feat.avx512vl) || + cpu_has_avx10_1_256; } static bool simd_check_avx512vbmi(void) { - return cpu_has_avx512_vbmi; + return cpu_has_avx512_vbmi || cpu_has_avx10_1_512; } static bool simd_check_avx512vbmi_vl(void) { - return cpu_has_avx512_vbmi 
&& cpu_policy.feat.avx512vl; + return (cpu_has_avx512_vbmi && cpu_policy.feat.avx512vl) || + cpu_has_avx10_1_256; } static bool simd_check_avx512vbmi2(void) { - return cpu_has_avx512_vbmi2; + return cpu_has_avx512_vbmi2 || cpu_has_avx10_1_512; } static bool simd_check_sse4_sha(void) @@ -250,17 +259,23 @@ static bool simd_check_avx512bw_gf_vl(vo static bool simd_check_avx512vnni(void) { - return cpu_has_avx512_vnni; + return cpu_has_avx512_vnni || cpu_has_avx10_1_512; } static bool simd_check_avx512fp16(void) { - return cpu_has_avx512_fp16; + return cpu_has_avx512_fp16 || cpu_has_avx10_1_512; +} + +static bool simd_check_avx512fp16_sc(void) +{ + return cpu_has_avx512_fp16 || cpu_has_avx10_1; } static bool simd_check_avx512fp16_vl(void) { - return cpu_has_avx512_fp16 && cpu_policy.feat.avx512vl; + return (cpu_has_avx512_fp16 && cpu_policy.feat.avx512vl) || + cpu_has_avx10_1_256; } static void simd_set_regs(struct cpu_user_regs *regs) @@ -433,9 +448,13 @@ static const struct { SIMD(OPMASK+DQ/w, avx512dq_opmask, 2), SIMD(OPMASK+BW/d, avx512bw_opmask, 4), SIMD(OPMASK+BW/q, avx512bw_opmask, 8), - SIMD(AVX512F f32 scalar, avx512f, f4), +#define avx512f_sc_x86_32_D_f4 avx512f_x86_32_D_f4 +#define avx512f_sc_x86_64_D_f4 avx512f_x86_64_D_f4 + SIMD(AVX512F f32 scalar, avx512f_sc, f4), SIMD(AVX512F f32x16, avx512f, 64f4), - SIMD(AVX512F f64 scalar, avx512f, f8), +#define avx512f_sc_x86_32_D_f8 avx512f_x86_32_D_f8 +#define avx512f_sc_x86_64_D_f8 avx512f_x86_64_D_f8 + SIMD(AVX512F f64 scalar, avx512f_sc, f8), SIMD(AVX512F f64x8, avx512f, 64f8), SIMD(AVX512F s32x16, avx512f, 64i4), SIMD(AVX512F u32x16, avx512f, 64u4), @@ -523,7 +542,9 @@ static const struct { AVX512VL(_VBMI+VL u16x8, avx512vbmi, 16u2), AVX512VL(_VBMI+VL s16x16, avx512vbmi, 32i2), AVX512VL(_VBMI+VL u16x16, avx512vbmi, 32u2), - SIMD(AVX512_FP16 f16 scal,avx512fp16, f2), +#define avx512fp16_sc_x86_32_D_f2 avx512fp16_x86_32_D_f2 +#define avx512fp16_sc_x86_64_D_f2 avx512fp16_x86_64_D_f2 + SIMD(AVX512_FP16 f16 
scal,avx512fp16_sc, f2), SIMD(AVX512_FP16 f16x32, avx512fp16, 64f2), AVX512VL(_FP16+VL f16x8, avx512fp16, 16f2), AVX512VL(_FP16+VL f16x16,avx512fp16, 32f2), @@ -3205,7 +3226,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovq %xmm1,32(%edx)..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovq_to_mem); @@ -3229,7 +3250,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovq 32(%edx),%xmm0..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovq_from_mem); @@ -3241,11 +3262,22 @@ int main(int argc, char **argv) rc = x86_emulate(&ctxt, &emulops); if ( rc != X86EMUL_OKAY || !check_eip(evex_vmovq_from_mem) ) goto fail; - asm ( "vmovq %1, %%xmm1\n\t" - "vpcmpeqq %%zmm0, %%zmm1, %%k0\n" - "kmovw %%k0, %0" : "=r" (rc) : "m" (res[8]) ); - if ( rc != 0xff ) - goto fail; + if ( simd_check_avx512f() ) + { + asm ( "vmovq %1, %%xmm1\n\t" + "vpcmpeqq %%zmm0, %%zmm1, %%k0\n" + "kmovw %%k0, %0" : "=r" (rc) : "m" (res[8]) ); + if ( rc != 0x00ff ) + goto fail; + } + else + { + asm ( "vmovq %1, %%xmm1\n\t" + "vpcmpeqq %%xmm0, %%xmm1, %%k0\n" + "kmovb %%k0, %0" : "=r" (rc) : "m" (res[8]) ); + if ( rc != 0x03 ) + goto fail; + } printf("okay\n"); } else @@ -3567,7 +3599,7 @@ int main(int argc, char **argv) printf("%-40s", "Testing vmovsd %xmm5,16(%ecx){%k3}..."); memset(res, 0x88, 128); memset(res + 20, 0x77, 8); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(vmovsd_masked_to_mem); @@ -3785,7 +3817,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd %xmm3,32(%ecx)..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovd_to_mem); @@ -3810,7 +3842,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", 
"Testing {evex} vmovd 32(%ecx),%xmm4..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovd_from_mem); @@ -3823,11 +3855,22 @@ int main(int argc, char **argv) rc = x86_emulate(&ctxt, &emulops); if ( rc != X86EMUL_OKAY || !check_eip(evex_vmovd_from_mem) ) goto fail; - asm ( "vmovd %1, %%xmm0\n\t" - "vpcmpeqd %%zmm4, %%zmm0, %%k0\n\t" - "kmovw %%k0, %0" : "=r" (rc) : "m" (res[8]) ); - if ( rc != 0xffff ) - goto fail; + if ( simd_check_avx512f() ) + { + asm ( "vmovd %1, %%xmm0\n\t" + "vpcmpeqd %%zmm4, %%zmm0, %%k0\n\t" + "kmovw %%k0, %0" : "=r" (rc) : "m" (res[8]) ); + if ( rc != 0xffff ) + goto fail; + } + else + { + asm ( "vmovd %1, %%xmm0\n\t" + "vpcmpeqd %%xmm4, %%xmm0, %%k0\n\t" + "kmovb %%k0, %0" : "=r" (rc) : "m" (res[8]) ); + if ( rc != 0x0f ) + goto fail; + } printf("okay\n"); } else @@ -4000,7 +4043,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd %xmm2,%ebx..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovd_to_reg); @@ -4026,7 +4069,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovd %ebx,%xmm1..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovd_from_reg); @@ -4040,11 +4083,22 @@ int main(int argc, char **argv) rc = x86_emulate(&ctxt, &emulops); if ( (rc != X86EMUL_OKAY) || !check_eip(evex_vmovd_from_reg) ) goto fail; - asm ( "vmovd %1, %%xmm0\n\t" - "vpcmpeqd %%zmm1, %%zmm0, %%k0\n\t" - "kmovw %%k0, %0" : "=r" (rc) : "m" (res[8]) ); - if ( rc != 0xffff ) - goto fail; + if ( simd_check_avx512f() ) + { + asm ( "vmovd %1, %%xmm0\n\t" + "vpcmpeqd %%zmm1, %%zmm0, %%k0\n\t" + "kmovw %%k0, %0" : "=r" (rc) : "m" (res[8]) ); + if ( rc != 0xffff ) + goto fail; + } + else + { + asm ( "vmovd %1, %%xmm0\n\t" + "vpcmpeqd %%xmm1, %%xmm0, %%k0\n\t" + "kmovb %%k0, %0" : "=r" (rc) : "m" 
(res[8]) ); + if ( rc != 0x0f ) + goto fail; + } printf("okay\n"); } else @@ -4128,7 +4182,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing {evex} vmovq %xmm11,32(%ecx)..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovq_to_mem2); @@ -4218,7 +4272,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovq %xmm22,%rbx..."); - if ( stack_exec && simd_check_avx512f() ) + if ( stack_exec && simd_check_avx512f_sc() ) { decl_insn(evex_vmovq_to_reg); @@ -5509,7 +5563,7 @@ int main(int argc, char **argv) printf("skipped\n"); printf("%-40s", "Testing vmovsh 8(%ecx),%xmm5..."); - if ( stack_exec && simd_check_avx512fp16() ) + if ( stack_exec && simd_check_avx512fp16_sc() ) { decl_insn(vmovsh_from_mem); decl_insn(vmovw_to_gpr); @@ -5527,14 +5581,28 @@ int main(int argc, char **argv) rc = x86_emulate(&ctxt, &emulops); if ( (rc != X86EMUL_OKAY) || !check_eip(vmovsh_from_mem) ) goto fail; - asm volatile ( "kmovw %2, %%k1\n\t" - "vmovdqu16 %1, %%zmm4%{%%k1%}%{z%}\n\t" - "vpcmpeqw %%zmm4, %%zmm5, %%k0\n\t" - "kmovw %%k0, %0" - : "=g" (rc) - : "m" (res[2]), "r" (1) ); - if ( rc != 0xffff ) - goto fail; + if ( simd_check_avx512fp16() ) + { + asm volatile ( "kmovw %2, %%k1\n\t" + "vmovdqu16 %1, %%zmm4%{%%k1%}%{z%}\n\t" + "vpcmpeqw %%zmm4, %%zmm5, %%k0\n\t" + "kmovw %%k0, %0" + : "=g" (rc) + : "m" (res[2]), "r" (1) ); + if ( rc != 0xffff ) + goto fail; + } + else + { + asm volatile ( "kmovb %2, %%k1\n\t" + "vmovdqu16 %1, %%xmm4%{%%k1%}%{z%}\n\t" + "vpcmpeqw %%xmm4, %%xmm5, %%k0\n\t" + "kmovb %%k0, %0" + : "=g" (rc) + : "m" (res[2]), "r" (1) ); + if ( rc != 0xff ) + goto fail; + } printf("okay\n"); printf("%-40s", "Testing vmovsh %xmm4,2(%eax){%k3}..."); --- a/tools/tests/x86_emulator/x86-emulate.c +++ b/tools/tests/x86_emulator/x86-emulate.c @@ -244,7 +244,7 @@ int emul_test_get_fpu( break; case X86EMUL_FPU_opmask: case X86EMUL_FPU_zmm: - if ( 
cpu_has_avx512f ) + if ( cpu_has_avx512f || cpu_has_avx10_1 ) break; default: return X86EMUL_UNHANDLEABLE; --- a/tools/tests/x86_emulator/x86-emulate.h +++ b/tools/tests/x86_emulator/x86-emulate.h @@ -207,6 +207,12 @@ void wrpkru(unsigned int val); xcr0_mask(6)) #define cpu_has_avx_vnni_int16 (cpu_policy.feat.avx_vnni_int16 && \ xcr0_mask(6)) + /* TBD: Is bit 6 (ZMM_Hi256) really needed here? */ +#define cpu_has_avx10_1 (cpu_policy.feat.avx10 && xcr0_mask(0xe6)) +#define cpu_has_avx10_1_256 (cpu_has_avx10_1 && \ + (cpu_policy.avx10.vsz256 || \ + cpu_policy.avx10.vsz512)) +#define cpu_has_avx10_1_512 (cpu_has_avx10_1 && cpu_policy.avx10.vsz512) #define cpu_has_xgetbv1 (cpu_has_xsave && cpu_policy.xstate.xgetbv1) From patchwork Wed Dec 11 10:13:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 13903289 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C536FE77180 for ; Wed, 11 Dec 2024 10:14:11 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.854293.1267536 (Exim 4.92) (envelope-from ) id 1tLJjG-0005cD-QZ; Wed, 11 Dec 2024 10:14:02 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 854293.1267536; Wed, 11 Dec 2024 10:14:02 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tLJjG-0005c6-NI; Wed, 11 Dec 2024 10:14:02 +0000 Received: by outflank-mailman (input) for mailman id 854293; Wed, 11 Dec 2024 10:14:00 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] 
helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tLJjE-0005bt-Rp for xen-devel@lists.xenproject.org; Wed, 11 Dec 2024 10:14:00 +0000 Received: from mail-wr1-x429.google.com (mail-wr1-x429.google.com [2a00:1450:4864:20::429]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id a016e5b8-b7a8-11ef-99a3-01e77a169b0f; Wed, 11 Dec 2024 11:13:54 +0100 (CET) Received: by mail-wr1-x429.google.com with SMTP id ffacd0b85a97d-385f06d0c8eso3045246f8f.0 for ; Wed, 11 Dec 2024 02:13:58 -0800 (PST) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. [37.24.206.209]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-725d3dd4cbasm7469337b3a.142.2024.12.11.02.13.55 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 11 Dec 2024 02:13:57 -0800 (PST) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: a016e5b8-b7a8-11ef-99a3-01e77a169b0f DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1733912038; x=1734516838; darn=lists.xenproject.org; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :references:cc:to:from:subject:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=VGSyTsew86/TrCcru/Rk7R/2beYcoXHkI1pXLpAyXsw=; b=WUB+Wcy01N562dCQj3yuiN1GKPumclfXSSO3WNIMZwNJcXIpGhmiB2snCwO0LbIpK8 Mc+A+m6DwKqNNJdjH7wSEMozx6ULv4/Dl9OXms0u/koTvJt002Pjp4XKIIBBxrrZKfxM E8qaG8hHR7U8MSopH9lhqMct1MlHruOu3FiNWsV0WovoWEoYIcW14L1aQL9rkcHWS36n hb1gvKiO1MoW5OXsAJt/v/mrDBZ9lxDhG4/3QUkLmceUdbcOe77JcejavlPVdjW6gcag jpTwy4vRtkEAw6mz/nIX4HYvWH/PCwuLto6EHyypHCVUw0PMMXXFhPPd3L+GlJWu5FeM vK5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733912038; x=1734516838; 
h=content-transfer-encoding:in-reply-to:autocrypt:content-language :references:cc:to:from:subject:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=VGSyTsew86/TrCcru/Rk7R/2beYcoXHkI1pXLpAyXsw=; b=I0j0qeAUxsY0ng9lLDTWv6PQLiOZgAU3unYahakOVxIxRTQ5DBpED7WextOF/Ubr4q N/Frm94zLN2pzs6r6brqd1ZfsNAj78iKhuH2ZLg+MP9reKbjVLnXXXeZdODD0VBSVetS vigcvK5FZQ1nmkYrGP9OHQpyzsu+5K3fntUBz/v90jDKLXHLo1RD3wqbSo/4Yv587jSn DFavvwaQx/Xty1uab6rKQJDAPay4thd1XfnqwrtvaS5xJDEgNRjm4lUM2enQaNJl5WBN RYx3m9RaZWk+CfFsYx1U5K/ZRvzRLmxR/JJzMD9oSefjwZtMYP/mi2WbZ4au20y4NO9g asVg== X-Gm-Message-State: AOJu0Yz5L3Ce1H7/nW2+BFiZ51b0o/v030FSZUH+GJPyFXTaJUgLFHgH +PRnbqmLmmdLIh7vZtQ5Myv9cRTBfdYmn+RN+p2xAbTSrqxAVeRt51Hhkt7U32/xLVYeu2iSXa8 = X-Gm-Gg: ASbGncuRqgpUEyTMmBukUKYGnz989KGgsr+aThy4Ek2t+t6sUd3Uoj5KWNoIp25gpNQ BwiuPB+DuXv7Az+k+XmOdHPzBUORAcEJJkH5Jiyf7AKdsa7wfixxoAIwEoNgK8IR/fk86wNh2OF 6pNM87WvzFp8qwnwxpajrEc9WCLKoa8tlWHsKvQfsXaiApDTfpVuEaQvTLGcLJURENB02Ari82V Pbhqr35aXxELuG2LPXTg+xsL6N/tQ5AjTUzh2TWMt9/crnL3gg2TZiKOlj8VgRmOdwCWBm28p8j lwtf4mjkrimeaBg3bjYDjAAf28CiAf2zmQl5ZsE= X-Google-Smtp-Source: AGHT+IFAUVdTKd07IUIk75oNNDJmg2YpV/i6O60WN6iY0a4ljixpKTvnt8a9OaU/jl5nY+M/q+dXzw== X-Received: by 2002:a5d:6c64:0:b0:385:fab3:c56d with SMTP id ffacd0b85a97d-3864cdc1132mr1836657f8f.0.1733912038370; Wed, 11 Dec 2024 02:13:58 -0800 (PST) Message-ID: <59222198-afca-4ff4-8a47-8c3e6d3e922c@suse.com> Date: Wed, 11 Dec 2024 11:13:53 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: [PATCH v3 06/16] x86emul/test: engage AVX512VL via command line option From: Jan Beulich To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , =?utf-8?q?Roger_Pau_Monn?= =?utf-8?q?=C3=A9?= References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com> Content-Language: en-US Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 
7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com> Now that we have machinery in testcase.mk to set vector length dependent flags for AVX512 tests, let's avoid using a pragma to enable AVX512VL insns for the compiler. This way, correct settings are in place from the very beginning of compilation. No change to the generated test blobs, and hence no functional change. 
Signed-off-by: Jan Beulich --- a/tools/tests/x86_emulator/simd.h +++ b/tools/tests/x86_emulator/simd.h @@ -215,10 +215,6 @@ DECL_OCTET(half); # define __builtin_ia32_shuf_i32x4_512_mask __builtin_ia32_shuf_i32x4_mask # define __builtin_ia32_shuf_i64x2_512_mask __builtin_ia32_shuf_i64x2_mask -# if VEC_SIZE > ELEM_SIZE && (defined(VEC_MAX) ? VEC_MAX : VEC_SIZE) < 64 -# pragma GCC target ( "avx512vl" ) -# endif - # define REN(insn, old, new) \ asm ( ".macro v" #insn #old " o:vararg \n\t" \ "v" #insn #new " \\o \n\t" \ --- a/tools/tests/x86_emulator/testcase.mk +++ b/tools/tests/x86_emulator/testcase.mk @@ -7,8 +7,8 @@ $(call cc-options-add,CFLAGS,CC,$(EMBEDD ifneq ($(filter -mavx512%,$($(TESTCASE)-cflags)),) cflags-vsz64 := -cflags-vsz32 := -mprefer-vector-width=256 -cflags-vsz16 := -mprefer-vector-width=128 +cflags-vsz32 := -mavx512vl -mprefer-vector-width=256 +cflags-vsz16 := -mavx512vl -mprefer-vector-width=128 # Scalar tests don't set VEC_SIZE (and VEC_MAX is used by S/G ones only) cflags-vsz := -mprefer-vector-width=128 From patchwork Wed Dec 11 10:14:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 13903290 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 43B76E7717D for ; Wed, 11 Dec 2024 10:14:33 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.854299.1267547 (Exim 4.92) (envelope-from ) id 1tLJjd-00065U-3B; Wed, 11 Dec 2024 10:14:25 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 854299.1267547; Wed, 11 Dec 2024 10:14:25 +0000 Received: from 
localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tLJjc-00065N-UE; Wed, 11 Dec 2024 10:14:24 +0000 Received: by outflank-mailman (input) for mailman id 854299; Wed, 11 Dec 2024 10:14:24 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tLJjc-00060E-G7 for xen-devel@lists.xenproject.org; Wed, 11 Dec 2024 10:14:24 +0000 Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [2a00:1450:4864:20::42e]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id b1b1f941-b7a8-11ef-a0d5-8be0dac302b0; Wed, 11 Dec 2024 11:14:23 +0100 (CET) Received: by mail-wr1-x42e.google.com with SMTP id ffacd0b85a97d-38632b8ae71so3132864f8f.0 for ; Wed, 11 Dec 2024 02:14:23 -0800 (PST) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. [37.24.206.209]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-725e438f169sm6062727b3a.168.2024.12.11.02.14.20 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 11 Dec 2024 02:14:22 -0800 (PST) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: b1b1f941-b7a8-11ef-a0d5-8be0dac302b0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1733912063; x=1734516863; darn=lists.xenproject.org; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :references:cc:to:from:subject:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=LdaGVXA10IBcCRzlLDHPVJ5Jo7qKqPrDjPHTX/kabAE=; b=aqHMLQpQjpM4sfwpdhSJ+e7aRqxDVSVEBc78CSADlAUOIkPNivpcqHfw0fxHe0vrK8 7prCNT4CEVMVOQRrXp2d9ND9u+C0I8sQIASMfQVy1hUEUkfn5ul7blXlv496kk8J3XR0 8paY8fgnr/s3wyFi1275c3OO+MCg2rAbesoUEVaa80EztLe91BGGNyh2chk3hoWwqnhq 
dcx7SWYKJpsuaKwRkOwBhCfbk/80JdNF2knfpa6pXJFKP6/E3jhn18vuCs0HEu5p6ySo Zzr21EnLOJ9ejNgyDquXtWxEjLrOHXq/YL4Mw2zMNZ6lrnUlbtuJ1u2b3Q2lioTezri1 5hrg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733912063; x=1734516863; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :references:cc:to:from:subject:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=LdaGVXA10IBcCRzlLDHPVJ5Jo7qKqPrDjPHTX/kabAE=; b=aEbkHQgiAWjDPyGzp++1or2YwdZfTCdsH26AzkrKisjkCkD0KWh9ELE1E7D+xYwWhE yZVuHvEiRTsD6inO9EBr0xEerZ7AW3Fbiheg9ngA/BQidYw8Ro5KofakIY7hpfpm2ofq FsEW97Q1mPOjHcTcCXh2D0pNt4nSXL2S/A+kBoOnW87/+7dvl120sz0Ljc1vwPV+dZYU 6xZ3/qvZ1Vt5KwaLJIDRsYpBrQ0CEiOXzC2jJRY+tzftA37eLl/QB+NxO55AXGtTAYe4 OMeEJ/FOHjh901mCCPeBIM/RzjCkEFuFk8xk84nG0KKiNm6jYfQ9hb1kEURzFmbvPjKh XVgg== X-Gm-Message-State: AOJu0YygmxPc3BOTARsC4h8f3+tBt0dhCYHFzJ1GPytKwNIdfpEcl7IF MQnM1zLVOvQkFrS8rQ7LUD61FbsWUQ5pZmfttwruKStw2ASVp0Oi2JIQ9EXJd4Ay0G2BuLVRmJQ = X-Gm-Gg: ASbGnctIgVsKHHBmzG7Kbq7o5dAjt5ogDtzntXNjncVOgoE8+HfNyDHRLo7Zxx31vnF RT2I607qqyWmuh6vTu+nlktNHzngiI3ybAaLb6ReH7xkf5yyNgdO6BqG/BBfu9V9UwXbLcb5TKs LYlz8Icg1z/tPEscWhyxTdDdJuida639R4N6dybEEFDDE4RfQzzVEIgnSFnqwBOkqsIz16gbiZL ///tzEieaFkM7+pghv+yt82pkvfCGB6zu/K6M9Ryw2TRs4UIRgoEICIIPFmnnOcum9M9aJfcsr2 QynPa2W0IPa98opUTzIulIWdCjCJoScubHz3tnc= X-Google-Smtp-Source: AGHT+IHbmK8rQyhYlnEZ/56i71MHJERRZ3R+gFaRtnYt48oWR9ePFEk/7foCXiqMM7Lketo+OfwWbA== X-Received: by 2002:a05:6000:18ae:b0:385:fd07:8616 with SMTP id ffacd0b85a97d-3864cdeb11fmr1770022f8f.0.1733912063058; Wed, 11 Dec 2024 02:14:23 -0800 (PST) Message-ID: <995a7961-2a28-426c-85b8-2ee3dd505f4b@suse.com> Date: Wed, 11 Dec 2024 11:14:17 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: [PATCH v3 07/16] x86emul: support AVX10.2 256-bit embedded rounding / SAE From: Jan Beulich To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , =?utf-8?q?Roger_Pau_Monn?= =?utf-8?q?=C3=A9?= References: 
<516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com> Content-Language: en-US Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com> AVX10.2 (along with APX) assigns new meaning to the bit that previously distinguished EVEX from the Phi co-processor's MVEX.
Therefore evex_encoded() now needs to key off of something else: Use the opcode mapping field for this, leveraging that map 0 has no assigned opcodes (and appears unlikely to gain any). Place the check of EVEX.U such that it'll cover most insns. EVEX.b is being checked for individual insns as applicable - whenever that's valid for (register-only) 512-bit forms, it becomes valid for 256-bit forms as well when AVX10.2 is permitted for a guest. Scalar insns permitting embedded rounding / SAE, otoh, have individual EVEX.U checks added (where applicable with minor adjustments to the logic to avoid - where easily possible - testing the same bit multiple times). Signed-off-by: Jan Beulich --- To raise the question early: It is entirely unclear to me how we want to allow control over the AVX10 minor version number from guest configs, as that's not a boolean field and hence not suitable for simple bit-wise masking of feature sets. --- v3: Take care of scalar insns individually. v2: New. --- a/xen/arch/x86/x86_emulate/decode.c +++ b/xen/arch/x86/x86_emulate/decode.c @@ -16,7 +16,7 @@ # define ERR_PTR(val) NULL #endif -#define evex_encoded() (s->evex.mbs) +#define evex_encoded() (s->evex.opcx) struct x86_emulate_state * x86_decode_insn( @@ -1198,8 +1198,15 @@ int x86emul_decode(struct x86_emulate_st s->evex.raw[1] = s->vex.raw[1]; s->evex.raw[2] = insn_fetch_type(uint8_t); - generate_exception_if(!s->evex.mbs || s->evex.mbz, X86_EXC_UD); - generate_exception_if(!s->evex.opmsk && s->evex.z, X86_EXC_UD); + /* + * .opcx is being checked here just to be on the safe + * side, especially as long as evex_encoded() uses + * this field. 
+ */ + generate_exception_if(s->evex.mbz || !s->evex.opcx, + X86_EXC_UD); + generate_exception_if(!s->evex.opmsk && s->evex.z, + X86_EXC_UD); if ( !mode_64bit() ) s->evex.R = 1; @@ -1777,6 +1784,16 @@ int x86emul_decode(struct x86_emulate_st if ( override_seg != x86_seg_none ) s->ea.mem.seg = override_seg; + /* + * While this generic check takes care of most insns, scalar ones (with + * EVEX.b set) need checking individually (elsewhere). + */ + generate_exception_if((evex_encoded() && + !s->evex.u && + (s->modrm_mod != 3 || + !vcpu_has_avx10(2) || !s->evex.brs)), + X86_EXC_UD); + /* Fetch the immediate operand, if present. */ switch ( d & SrcMask ) { --- a/xen/arch/x86/x86_emulate/private.h +++ b/xen/arch/x86/x86_emulate/private.h @@ -225,7 +225,7 @@ union evex { uint8_t x:1; /* X */ uint8_t r:1; /* R */ uint8_t pfx:2; /* pp */ - uint8_t mbs:1; + uint8_t u:1; /* U */ uint8_t reg:4; /* vvvv */ uint8_t w:1; /* W */ uint8_t opmsk:3; /* aaa */ @@ -594,6 +594,8 @@ amd_like(const struct x86_emulate_ctxt * #define vcpu_has_avx_vnni_int16() (ctxt->cpuid->feat.avx_vnni_int16) #define vcpu_has_user_msr() (ctxt->cpuid->feat.user_msr) +#define vcpu_has_avx10(minor) (ctxt->cpuid->avx10.version >= (minor)) + #define vcpu_must_have(feat) \ generate_exception_if(!vcpu_has_##feat(), X86_EXC_UD) --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -1241,7 +1241,7 @@ int cf_check x86emul_unhandleable_rw( #define lock_prefix (state->lock_prefix) #define vex (state->vex) #define evex (state->evex) -#define evex_encoded() (evex.mbs) +#define evex_encoded() (evex.opcx) #define ea (state->ea) /* Undo DEBUG wrapper. 
*/ @@ -3415,8 +3415,8 @@ x86_emulate( CASE_SIMD_ALL_FP(_EVEX, 0x0f, 0x5f): /* vmax{p,s}{s,d} [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ avx512f_all_fp: generate_exception_if((evex.w != (evex.pfx & VEX_PREFIX_DOUBLE_MASK) || - (ea.type != OP_REG && evex.brs && - (evex.pfx & VEX_PREFIX_SCALAR_MASK))), + ((evex.pfx & VEX_PREFIX_SCALAR_MASK) && + (ea.type != OP_REG ? evex.brs : !evex.u))), X86_EXC_UD); visa_check(f); if ( ea.type != OP_REG || !evex.brs ) @@ -3622,11 +3622,12 @@ x86_emulate( /* fall through */ CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x2a): /* vcvtsi2s{s,d} r/m,xmm,xmm */ CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x7b): /* vcvtusi2s{s,d} r/m,xmm,xmm */ - generate_exception_if(evex.opmsk || (ea.type != OP_REG && evex.brs), - X86_EXC_UD); + generate_exception_if(evex.opmsk, X86_EXC_UD); visa_check(f); if ( !evex.brs ) avx512_vlen_check(true); + else + generate_exception_if(ea.type != OP_REG || !evex.u, X86_EXC_UD); get_fpu(X86EMUL_FPU_zmm); if ( ea.type == OP_MEM ) @@ -3741,12 +3742,13 @@ x86_emulate( CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x78): /* vcvtts{s,d}2usi xmm/mem,reg */ CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x79): /* vcvts{s,d}2usi xmm/mem,reg */ generate_exception_if((evex.reg != 0xf || !evex.RX || !evex.R || - evex.opmsk || - (ea.type != OP_REG && evex.brs)), + evex.opmsk), X86_EXC_UD); visa_check(f); if ( !evex.brs ) avx512_vlen_check(true); + else + generate_exception_if(ea.type != OP_REG || !evex.u, X86_EXC_UD); get_fpu(X86EMUL_FPU_zmm); opc = init_evex(stub); goto cvts_2si; @@ -3816,12 +3818,13 @@ x86_emulate( CASE_SIMD_PACKED_FP(_EVEX, 0x0f, 0x2e): /* vucomis{s,d} xmm/mem,xmm */ CASE_SIMD_PACKED_FP(_EVEX, 0x0f, 0x2f): /* vcomis{s,d} xmm/mem,xmm */ generate_exception_if((evex.reg != 0xf || !evex.RX || evex.opmsk || - (ea.type != OP_REG && evex.brs) || evex.w != evex.pfx), X86_EXC_UD); visa_check(f); if ( !evex.brs ) avx512_vlen_check(true); + else + generate_exception_if(ea.type != OP_REG || !evex.u, X86_EXC_UD); get_fpu(X86EMUL_FPU_zmm); opc = init_evex(stub); @@ -5389,8 
+5392,8 @@ x86_emulate( CASE_SIMD_ALL_FP(_EVEX, 0x0f, 0xc2): /* vcmp{p,s}{s,d} $imm8,[xyz]mm/mem,[xyz]mm,k{k} */ generate_exception_if((evex.w != (evex.pfx & VEX_PREFIX_DOUBLE_MASK) || - (ea.type != OP_REG && evex.brs && - (evex.pfx & VEX_PREFIX_SCALAR_MASK)) || + ((evex.pfx & VEX_PREFIX_SCALAR_MASK) && + (ea.type != OP_REG ? evex.brs : !evex.u)) || !evex.r || !evex.R || evex.z), X86_EXC_UD); visa_check(f); @@ -6088,9 +6091,10 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f38, 0xbd): /* vfnmadd231s{s,d} xmm/mem,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(0x0f38, 0xbf): /* vfnmsub231s{s,d} xmm/mem,xmm,xmm{k} */ visa_check(f); - generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD); if ( !evex.brs ) avx512_vlen_check(true); + else + generate_exception_if(ea.type != OP_REG || !evex.u, X86_EXC_UD); goto simd_zmm; case X86EMUL_OPC_66(0x0f38, 0x37): /* pcmpgtq xmm/m128,xmm */ @@ -7262,7 +7266,8 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f3a, 0x0a): /* vrndscaless $imm8,xmm/mem,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x0b): /* vrndscalesd $imm8,xmm/mem,xmm,xmm{k} */ - generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD); + generate_exception_if(ea.type != OP_REG ? evex.brs : !evex.u, + X86_EXC_UD); /* fall through */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x08): /* vrndscaleps $imm8,[xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x09): /* vrndscalepd $imm8,[xyz]mm/mem,[xyz]mm{k} */ @@ -7272,7 +7277,8 @@ x86_emulate( goto simd_imm8_zmm; case X86EMUL_OPC_EVEX(0x0f3a, 0x0a): /* vrndscalesh $imm8,xmm/mem,xmm,xmm{k} */ - generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD); + generate_exception_if(ea.type != OP_REG ? 
evex.brs : !evex.u, + X86_EXC_UD); /* fall through */ case X86EMUL_OPC_EVEX(0x0f3a, 0x08): /* vrndscaleph $imm8,[xyz]mm/mem,[xyz]mm{k} */ visa_check(_fp16); @@ -7605,9 +7611,10 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(0x0f3a, 0x27): /* vgetmants{s,d} $imm8,xmm/mem,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x55): /* vfixupimms{s,d} $imm8,xmm/mem,xmm,xmm{k} */ visa_check(f); - generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD); if ( !evex.brs ) avx512_vlen_check(true); + else + generate_exception_if(ea.type != OP_REG || !evex.u, X86_EXC_UD); goto simd_imm8_zmm; case X86EMUL_OPC_EVEX(0x0f3a, 0x27): /* vgetmantsh $imm8,xmm/mem,xmm,xmm{k} */ @@ -7617,7 +7624,7 @@ x86_emulate( if ( !evex.brs ) avx512_vlen_check(true); else - generate_exception_if(ea.type != OP_REG, X86_EXC_UD); + generate_exception_if(ea.type != OP_REG || !evex.u, X86_EXC_UD); goto simd_imm8_zmm; case X86EMUL_OPC_VEX_66(0x0f3a, 0x30): /* kshiftr{b,w} $imm8,k,k */ @@ -7805,7 +7812,7 @@ x86_emulate( goto avx512f_imm8_no_sae; case X86EMUL_OPC_EVEX_F3(0x0f3a, 0xc2): /* vcmpsh $imm8,xmm/mem,xmm,k{k} */ - generate_exception_if(ea.type != OP_REG && evex.brs, X86_EXC_UD); + generate_exception_if(ea.type != OP_REG ? 
evex.brs : !evex.u, X86_EXC_UD); /* fall through */ case X86EMUL_OPC_EVEX(0x0f3a, 0xc2): /* vcmpph $imm8,[xyz]mm/mem,[xyz]mm,k{k} */ visa_check(_fp16); @@ -7982,10 +7989,11 @@ x86_emulate( case X86EMUL_OPC_EVEX_66(6, 0xbd): /* vfnmadd231sh xmm/m16,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(6, 0xbf): /* vfnmsub231sh xmm/m16,xmm,xmm{k} */ visa_check(_fp16); - generate_exception_if(evex.w || (ea.type != OP_REG && evex.brs), - X86_EXC_UD); + generate_exception_if(evex.w, X86_EXC_UD); if ( !evex.brs ) avx512_vlen_check(true); + else + generate_exception_if(ea.type != OP_REG || !evex.u, X86_EXC_UD); goto simd_zmm; case X86EMUL_OPC_EVEX_66(6, 0x4c): /* vrcpph [xyz]mm/mem,[xyz]mm{k} */ @@ -8015,7 +8023,9 @@ x86_emulate( unsigned int src1 = ~evex.reg; visa_check(_fp16); - generate_exception_if(evex.w || ((b & 1) && ea.type != OP_REG && evex.brs), + generate_exception_if((evex.w || + ((b & 1) && + (ea.type != OP_REG ? evex.brs : !evex.u))), X86_EXC_UD); if ( mode_64bit() ) src1 = (src1 & 0xf) | (!evex.RX << 4);

From patchwork Wed Dec 11 10:17:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13903291
Message-ID: <6e2423e1-1dc0-44c2-b5ad-8ebae0a91566@suse.com>
Date: Wed, 11 Dec 2024 11:17:53 +0100
Subject: [PATCH v3 08/16] x86emul: support AVX10.2 scalar compare insns
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>
In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>

Simply clone code from their V{,U}COMIS{S,D,H} counterparts. While there,
drop a redundant EVEX.W check from V{,U}COMISH handling.

Signed-off-by: Jan Beulich
---
This still follows what spec version 001 says wrt embedded prefixes.
They were swapped to match other insns, yet so far no SDE is available to run the test harness there with the flipped encoding. --- SDE: ??? --- v3: New. --- a/tools/tests/x86_emulator/evex-disp8.c +++ b/tools/tests/x86_emulator/evex-disp8.c @@ -81,6 +81,7 @@ enum esz { ESZ_w, ESZ_bw, ESZ_fp16, +#define ESZ_bf16 ESZ_fp16 }; #ifndef __i386__ @@ -711,6 +712,16 @@ static const struct test vpclmulqdq_all[ INSN(pclmulqdq, 66, 0f3a, 44, vl, q_nb, vl) }; +static const struct test avx10_2_all[] = { + INSN(comsbf16, 66, map5, 2f, el, bf16, el), + INSN(comxsd, f3, 0f, 2f, el, q, el), + INSN(comxsh, f2, map5, 2f, el, fp16, el), + INSN(comxss, f2, 0f, 2f, el, d, el), + INSN(ucomxsd, f3, 0f, 2e, el, q, el), + INSN(ucomxsh, f2, map5, 2e, el, fp16, el), + INSN(ucomxss, f2, 0f, 2e, el, d, el), +}; + static const unsigned char vl_all[] = { VL_512, VL_128, VL_256 }; static const unsigned char vl_128[] = { VL_128 }; static const unsigned char vl_no128[] = { VL_512, VL_256 }; @@ -1130,5 +1141,8 @@ void evex_disp8_test(void *instr, struct RUN(vpclmulqdq, all); #undef RUN } + + run(cpu_has_avx10_2, avx10_2, all); + #undef run } --- a/tools/tests/x86_emulator/predicates.c +++ b/tools/tests/x86_emulator/predicates.c @@ -1682,8 +1682,12 @@ static const struct evex { { { 0x2d }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvtsd2si */ { { 0x2e }, 2, T, R, pfx_no, W0, LIG }, /* vucomiss */ { { 0x2e }, 2, T, R, pfx_66, W1, LIG }, /* vucomisd */ + { { 0x2e }, 2, T, R, pfx_f3, W1, LIG }, /* vucomxsd */ + { { 0x2e }, 2, T, R, pfx_f2, W0, LIG }, /* vucomxss */ { { 0x2f }, 2, T, R, pfx_no, W0, LIG }, /* vcomiss */ { { 0x2f }, 2, T, R, pfx_66, W1, LIG }, /* vcomisd */ + { { 0x2f }, 2, T, R, pfx_f3, W1, LIG }, /* vcomxsd */ + { { 0x2f }, 2, T, R, pfx_f2, W0, LIG }, /* vcomxss */ { { 0x51 }, 2, T, R, pfx_no, W0, Ln }, /* vsqrtps */ { { 0x51 }, 2, T, R, pfx_66, W1, Ln }, /* vsqrtpd */ { { 0x51 }, 2, T, R, pfx_f3, W0, LIG }, /* vsqrtss */ @@ -2100,7 +2104,10 @@ static const struct evex { { { 0x2c }, 2, T, R, 
pfx_f3, Wn, LIG }, /* vcvttsh2si */ { { 0x2d }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvtsh2si */ { { 0x2e }, 2, T, R, pfx_no, W0, LIG }, /* vucomish */ + { { 0x2e }, 2, T, R, pfx_f2, W0, LIG }, /* vucomxsh */ { { 0x2f }, 2, T, R, pfx_no, W0, LIG }, /* vcomish */ + { { 0x2f }, 2, T, R, pfx_66, W0, LIG }, /* vcomsbf16 */ + { { 0x2f }, 2, T, R, pfx_f2, W0, LIG }, /* vcomxsh */ { { 0x51 }, 2, T, R, pfx_no, W0, Ln }, /* vsqrtph */ { { 0x51 }, 2, T, R, pfx_f3, W0, LIG }, /* vsqrtsh */ { { 0x58 }, 2, T, R, pfx_no, W0, Ln }, /* vaddph */ --- a/tools/tests/x86_emulator/x86-emulate.h +++ b/tools/tests/x86_emulator/x86-emulate.h @@ -213,6 +213,8 @@ void wrpkru(unsigned int val); (cpu_policy.avx10.vsz256 || \ cpu_policy.avx10.vsz512)) #define cpu_has_avx10_1_512 (cpu_has_avx10_1 && cpu_policy.avx10.vsz512) +#define cpu_has_avx10_2 (cpu_policy.avx10.version >= 2 && \ + xcr0_mask(0xe6)) #define cpu_has_xgetbv1 (cpu_has_xsave && cpu_policy.xstate.xgetbv1) --- a/xen/arch/x86/x86_emulate/decode.c +++ b/xen/arch/x86/x86_emulate/decode.c @@ -1521,9 +1521,8 @@ int x86emul_decode(struct x86_emulate_st s->fp16 = true; break; - case 0x2e: case 0x2f: /* v{,u}comish */ - if ( !s->evex.pfx ) - s->fp16 = true; + case 0x2e: case 0x2f: /* v{,u}com{i,x}sh, vcomsbf16 */ + s->fp16 = true; s->simd_size = simd_none; break; --- a/xen/arch/x86/x86_emulate/private.h +++ b/xen/arch/x86/x86_emulate/private.h @@ -304,7 +304,7 @@ struct x86_emulate_state { bool lock_prefix; bool not_64bit; /* Instruction not available in 64bit. */ bool fpu_ctrl; /* Instruction is an FPU control one. */ - bool fp16; /* Instruction has half-precision FP source operand. */ + bool fp16; /* Instruction has half-precision FP or BF16 source. 
*/ opcode_desc_t desc; union vex vex; union evex evex; @@ -596,8 +596,8 @@ amd_like(const struct x86_emulate_ctxt * #define vcpu_has_avx10(minor) (ctxt->cpuid->avx10.version >= (minor)) -#define vcpu_must_have(feat) \ - generate_exception_if(!vcpu_has_##feat(), X86_EXC_UD) +#define vcpu_must_have(feat, ...) \ + generate_exception_if(!vcpu_has_##feat(__VA_ARGS__), X86_EXC_UD) #ifdef __XEN__ /* --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -3813,7 +3813,6 @@ x86_emulate( case X86EMUL_OPC_EVEX(5, 0x2e): /* vucomish xmm/m16,xmm */ case X86EMUL_OPC_EVEX(5, 0x2f): /* vcomish xmm/m16,xmm */ visa_check(_fp16); - generate_exception_if(evex.w, X86_EXC_UD); /* fall through */ CASE_SIMD_PACKED_FP(_EVEX, 0x0f, 0x2e): /* vucomis{s,d} xmm/mem,xmm */ CASE_SIMD_PACKED_FP(_EVEX, 0x0f, 0x2f): /* vcomis{s,d} xmm/mem,xmm */ @@ -3821,6 +3820,7 @@ x86_emulate( evex.w != evex.pfx), X86_EXC_UD); visa_check(f); + vcomi_evex: if ( !evex.brs ) avx512_vlen_check(true); else @@ -3831,6 +3831,17 @@ x86_emulate( op_bytes = 2 << (!state->fp16 + evex.w); goto vcomi; + CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x2e): /* vucomxs{s,d} xmm/mem,xmm */ + CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x2f): /* vcomxs{s,d} xmm/mem,xmm */ + case X86EMUL_OPC_EVEX_F2(5, 0x2e): /* vucomxsh xmm/m16,xmm */ + case X86EMUL_OPC_EVEX_66(5, 0x2f): /* vcomsbf16 xmm/m16,xmm */ + case X86EMUL_OPC_EVEX_F2(5, 0x2f): /* vcomxsh xmm/m16,xmm */ + generate_exception_if((evex.reg != 0xf || !evex.RX || evex.opmsk || + evex.w != !(evex.pfx & 1)), + X86_EXC_UD); + vcpu_must_have(avx10, 2); + goto vcomi_evex; + #endif case X86EMUL_OPC(0x0f, 0x30): /* wrmsr */

From patchwork Wed Dec 11 10:18:16 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13903292
Message-ID: <4bed4638-98f6-4a1d-b41f-f0c25d9af526@suse.com>
Date: Wed, 11 Dec 2024 11:18:16 +0100
Subject: [PATCH v3 09/16] x86emul: support AVX10.2 partial copy insns
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>
In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>

Extend existing VMOV{Q,W} logic accordingly.

Signed-off-by: Jan Beulich
---
SDE: ???
---
v3: New.

--- a/tools/tests/x86_emulator/evex-disp8.c +++ b/tools/tests/x86_emulator/evex-disp8.c @@ -722,6 +722,13 @@ static const struct test avx10_2_all[] = INSN(ucomxss, f2, 0f, 2e, el, d, el), }; +static const struct test avx10_2_128[] = { + INSN(movd, f3, 0f, 7e, el, d, el), + INSN(movd, 66, 0f, d6, el, d, el), + INSN(movw, f3, map5, 6e, el, fp16, el), + INSN(movw, f3, map5, 7e, el, fp16, el), +}; + static const unsigned char vl_all[] = { VL_512, VL_128, VL_256 }; static const unsigned char vl_128[] = { VL_128 }; static const unsigned char vl_no128[] = { VL_512, VL_256 }; @@ -1143,6 +1150,7 @@ void evex_disp8_test(void *instr, struct } run(cpu_has_avx10_2, avx10_2, all); + run(cpu_has_avx10_2, avx10_2, 128); #undef run } --- a/tools/tests/x86_emulator/predicates.c +++ b/tools/tests/x86_emulator/predicates.c @@ -1782,7 +1782,7 @@ static const struct evex { { { 0x7b }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvtusi2ss */ { { 0x7b }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvtusi2sd */ { { 0x7e }, 2, T, W, pfx_66,
Wn, L0 }, /* vmov{d,q} */ - { { 0x7e }, 2, T, R, pfx_f3, W1, L0 }, /* vmovq */ + { { 0x7e }, 2, T, R, pfx_f3, Wn, L0 }, /* vmov{d,q} */ { { 0x7f }, 2, T, W, pfx_66, Wn, Ln }, /* vmovdqa{32,64} */ { { 0x7f }, 2, T, W, pfx_f3, Wn, Ln }, /* vmovdqu{32,64} */ { { 0x7f }, 2, T, W, pfx_f2, Wn, Ln }, /* vmovdqu{8,16} */ @@ -1799,7 +1799,7 @@ static const struct evex { { { 0xd3 }, 2, T, R, pfx_66, W1, Ln }, /* vpsrlq */ { { 0xd4 }, 2, T, R, pfx_66, W1, Ln }, /* vpaddq */ { { 0xd5 }, 2, T, R, pfx_66, WIG, Ln }, /* vpmullw */ - { { 0xd6 }, 2, T, W, pfx_66, W1, L0 }, /* vmovq */ + { { 0xd6 }, 2, T, W, pfx_66, Wn, L0 }, /* vmov{d,q} */ { { 0xd8 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubusb */ { { 0xd9 }, 2, T, R, pfx_66, WIG, Ln }, /* vpsubusw */ { { 0xda }, 2, T, R, pfx_66, WIG, Ln }, /* vpminub */ @@ -2131,6 +2131,7 @@ static const struct evex { { { 0x5f }, 2, T, R, pfx_no, W0, Ln }, /* vmaxph */ { { 0x5f }, 2, T, R, pfx_f3, W0, LIG }, /* vmaxsh */ { { 0x6e }, 2, T, R, pfx_66, WIG, L0 }, /* vmovw */ + { { 0x6e }, 2, T, R, pfx_f3, W0, L0 }, /* vmovw */ { { 0x78 }, 2, T, R, pfx_no, W0, Ln }, /* vcvttph2udq */ { { 0x78 }, 2, T, R, pfx_66, W0, Ln }, /* vcvttph2uqq */ { { 0x78 }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvttsh2usi */ @@ -2149,6 +2150,7 @@ static const struct evex { { { 0x7d }, 2, T, R, pfx_f3, W0, Ln }, /* vcvtw2ph */ { { 0x7d }, 2, T, R, pfx_f2, W0, Ln }, /* vcvtuwph */ { { 0x7e }, 2, T, W, pfx_66, WIG, L0 }, /* vmovw */ + { { 0x7e }, 2, T, W, pfx_f3, W0, L0 }, /* vmovw */ }, evex_map6[] = { { { 0x13 }, 2, T, R, pfx_66, W0, Ln }, /* vcvtph2psx */ { { 0x13 }, 2, T, R, pfx_no, W0, LIG }, /* vcvtsh2ss */ --- a/xen/arch/x86/x86_emulate/decode.c +++ b/xen/arch/x86/x86_emulate/decode.c @@ -293,7 +293,7 @@ static const struct twobyte_table { [0xd0] = { DstImplicit|SrcMem|ModRM, simd_other }, [0xd1 ... 0xd3] = { DstImplicit|SrcMem|ModRM, simd_128, 4 }, [0xd4 ... 
0xd5] = { DstImplicit|SrcMem|ModRM, simd_packed_int, d8s_vl }, - [0xd6] = { DstMem|SrcImplicit|ModRM|Mov, simd_other, 3 }, + [0xd6] = { DstMem|SrcImplicit|ModRM|Mov, simd_other, d8s_dq }, [0xd7] = { DstReg|SrcImplicit|ModRM|Mov }, [0xd8 ... 0xdf] = { DstImplicit|SrcMem|ModRM, simd_packed_int, d8s_vl }, [0xe0] = { DstImplicit|SrcMem|ModRM, simd_packed_int, d8s_vl }, @@ -802,7 +802,7 @@ decode_twobyte(struct x86_emulate_state if ( s->vex.pfx == vex_f3 ) /* movq xmm/m64,xmm */ { case X86EMUL_OPC_VEX_F3(0, 0x7e): /* vmovq xmm/m64,xmm */ - case X86EMUL_OPC_EVEX_F3(0, 0x7e): /* vmovq xmm/m64,xmm */ + case X86EMUL_OPC_EVEX_F3(0, 0x7e): /* vmov{d,q} xmm/mem,xmm */ s->desc = DstImplicit | SrcMem | TwoOp; s->simd_size = simd_other; /* Avoid the s->desc clobbering of TwoOp below. */ @@ -1422,7 +1422,7 @@ int x86emul_decode(struct x86_emulate_st break; case 0x7e: /* vmovq xmm/m64,xmm needs special casing */ - if ( disp8scale == 2 && s->evex.pfx == vex_f3 ) + if ( disp8scale == 2 && s->evex.pfx == vex_f3 && s->evex.w ) disp8scale = 3; break; @@ -1531,13 +1531,13 @@ int x86emul_decode(struct x86_emulate_st s->fp16 = true; break; - case 0x6e: /* vmovw r/m16, xmm */ + case 0x6e: /* vmovw r/x/m16, xmm */ d = (d & ~SrcMask) | SrcMem16; /* fall through */ - case 0x7e: /* vmovw xmm, r/m16 */ + case 0x7e: /* vmovw xmm, r/x/m16 */ + s->fp16 = true; if ( s->evex.pfx == vex_66 ) - s->fp16 = true; - s->simd_size = simd_none; + s->simd_size = simd_none; break; case 0x78: case 0x79: /* vcvt{,t}ph2u{d,q}q, vcvt{,t}sh2usi */ --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -4918,13 +4918,17 @@ x86_emulate( op_bytes = 8; goto simd_0f_int; - case X86EMUL_OPC_EVEX_F3(0x0f, 0x7e): /* vmovq xmm/m64,xmm */ - case X86EMUL_OPC_EVEX_66(0x0f, 0xd6): /* vmovq xmm,xmm/m64 */ - generate_exception_if(evex.lr || !evex.w || evex.opmsk || evex.brs, - X86_EXC_UD); - visa_check(f); + case X86EMUL_OPC_EVEX_F3(0x0f, 0x7e): /* vmov{d,q} xmm/mem,xmm */ + case 
X86EMUL_OPC_EVEX_66(0x0f, 0xd6): /* vmov{d,q} xmm,xmm/mem */ + case X86EMUL_OPC_EVEX_F3(5, 0x6e): /* vmovw xmm/m16,xmm */ + case X86EMUL_OPC_EVEX_F3(5, 0x7e): /* vmovw xmm,xmm/m16 */ + generate_exception_if(evex.lr || evex.opmsk || evex.brs, X86_EXC_UD); + if ( evex.w ) + visa_check(f); + else + vcpu_must_have(avx10, 2); d |= TwoOp; - op_bytes = 8; + op_bytes = 2 << (!state->fp16 + evex.w); goto simd_zmm; #endif /* !X86EMUL_NO_SIMD */ From patchwork Wed Dec 11 10:18:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 13903293 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 27D52E7717D for ; Wed, 11 Dec 2024 10:18:54 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.854344.1267576 (Exim 4.92) (envelope-from ) id 1tLJnr-0007lT-1P; Wed, 11 Dec 2024 10:18:47 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 854344.1267576; Wed, 11 Dec 2024 10:18:47 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tLJnq-0007lM-Ug; Wed, 11 Dec 2024 10:18:46 +0000 Received: by outflank-mailman (input) for mailman id 854344; Wed, 11 Dec 2024 10:18:45 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tLJnp-0006nP-UQ for xen-devel@lists.xenproject.org; Wed, 11 Dec 2024 10:18:45 +0000 Received: from mail-wr1-x436.google.com (mail-wr1-x436.google.com [2a00:1450:4864:20::436]) by 
se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 4d895a98-b7a9-11ef-a0d5-8be0dac302b0; Wed, 11 Dec 2024 11:18:45 +0100 (CET)
Message-ID: <953301b9-4cd6-4742-9486-bf31121cb3ef@suse.com>
Date: Wed, 11 Dec 2024 11:18:39 +0100
Subject: [PATCH v3 10/16] x86emul: support AVX10.2 media insns
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>
In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>

These are all very similar to various existing insns.

Signed-off-by: Jan Beulich
---
SDE: ???
---
v3: New.
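Illustration only, not part of the submitted patch: the new cases in the x86_emulate.c hunks of this patch set the operand size with "op_bytes = 16 << evex.lr". A minimal sketch of that arithmetic, assuming the usual EVEX.L'L encoding (0 = 128-bit, 1 = 256-bit, 2 = 512-bit); evex_vector_bytes() is a hypothetical name, not an emulator function.

```c
#include <assert.h>

/*
 * Hypothetical helper (not an emulator function): map the 2-bit EVEX.L'L
 * field to a vector length in bytes, mirroring "op_bytes = 16 << evex.lr".
 */
static unsigned int evex_vector_bytes(unsigned int lr)
{
    assert(lr <= 2); /* 0 = xmm, 1 = ymm, 2 = zmm; 3 is not a vector length */
    return 16u << lr;
}
```

Under these assumptions a zmm operand (lr == 2) is 64 bytes wide, which is the size the memory access of the packed dot-product insns would cover.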
--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -717,6 +717,20 @@ static const struct test avx10_2_all[] =
     INSN(comxsd, f3, 0f, 2f, el, q, el),
     INSN(comxsh, f2, map5, 2f, el, fp16, el),
     INSN(comxss, f2, 0f, 2f, el, d, el),
+    INSN(dpphps, , 0f38, 52, vl, d, vl),
+    INSN(mpsadbw, f3, 0f3a, 42, vl, d_nb, vl),
+    INSN(pdpbssd, f2, 0f38, 50, vl, d, vl),
+    INSN(pdpbssds, f2, 0f38, 51, vl, d, vl),
+    INSN(pdpbsud, f3, 0f38, 50, vl, d, vl),
+    INSN(pdpbsuds, f3, 0f38, 51, vl, d, vl),
+    INSN(pdpbuud, , 0f38, 50, vl, d, vl),
+    INSN(pdpbuuds, , 0f38, 51, vl, d, vl),
+    INSN(pdpwsud, f3, 0f38, d2, vl, d, vl),
+    INSN(pdpwsuds, f3, 0f38, d3, vl, d, vl),
+    INSN(pdpwusd, 66, 0f38, d2, vl, d, vl),
+    INSN(pdpwusds, 66, 0f38, d3, vl, d, vl),
+    INSN(pdpwuud, , 0f38, d2, vl, d, vl),
+    INSN(pdpwuuds, , 0f38, d3, vl, d, vl),
     INSN(ucomxsd, f3, 0f, 2e, el, q, el),
     INSN(ucomxsh, f2, map5, 2e, el, fp16, el),
     INSN(ucomxss, f2, 0f, 2e, el, d, el),
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -1927,8 +1927,15 @@ static const struct evex {
     { { 0x4d }, 2, T, R, pfx_66, Wn, LIG }, /* vrcp14s{s,d} */
     { { 0x4e }, 2, T, R, pfx_66, Wn, Ln }, /* vrsqrt14p{s,d} */
     { { 0x4f }, 2, T, R, pfx_66, Wn, LIG }, /* vrsqrt14s{s,d} */
+    { { 0x50 }, 2, T, R, pfx_no, W0, Ln }, /* vpdpbuud */
     { { 0x50 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpbusd */
+    { { 0x50 }, 2, T, R, pfx_f3, W0, Ln }, /* vpdpbsud */
+    { { 0x50 }, 2, T, R, pfx_f2, W0, Ln }, /* vpdpbssd */
+    { { 0x51 }, 2, T, R, pfx_no, W0, Ln }, /* vpdpbuuds */
     { { 0x51 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpbusds */
+    { { 0x51 }, 2, T, R, pfx_f3, W0, Ln }, /* vpdpbsuds */
+    { { 0x51 }, 2, T, R, pfx_f2, W0, Ln }, /* vpdpbssds */
+    { { 0x52 }, 2, T, R, pfx_no, W0, Ln }, /* vdpphps */
     { { 0x52 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpwssd */
     { { 0x52 }, 2, T, R, pfx_f3, W0, Ln }, /* vdpbf16ps */
     { { 0x52 }, 2, T, R, pfx_f2, W0, L2 }, /* vp4dpwssd */
@@ -2029,6 +2036,12 @@ static const struct evex {
     { { 0xcc }, 2, T, R, pfx_66, Wn, L2 }, /* vrsqrt28p{s,d} */
     { { 0xcd }, 2, T, R, pfx_66, Wn, LIG }, /* vrsqrt28s{s,d} */
     { { 0xcf }, 2, T, R, pfx_66, W0, Ln }, /* vgf2p8mulb */
+    { { 0xd2 }, 2, T, R, pfx_no, W0, Ln }, /* vpdpwuud */
+    { { 0xd2 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpwusd */
+    { { 0xd2 }, 2, T, R, pfx_f3, W0, Ln }, /* vpdpwsud */
+    { { 0xd3 }, 2, T, R, pfx_no, W0, Ln }, /* vpdpwuuds */
+    { { 0xd3 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpwusds */
+    { { 0xd3 }, 2, T, R, pfx_f3, W0, Ln }, /* vpdpwsuds */
     { { 0xdc }, 2, T, R, pfx_66, WIG, Ln }, /* vaesenc */
     { { 0xdd }, 2, T, R, pfx_66, WIG, Ln }, /* vaesenclast */
     { { 0xde }, 2, T, R, pfx_66, WIG, Ln }, /* vaesdec */
@@ -2073,6 +2086,7 @@ static const struct evex {
     { { 0x3e }, 3, T, R, pfx_66, Wn, Ln }, /* vpcmpu{b,w} */
     { { 0x3f }, 3, T, R, pfx_66, Wn, Ln }, /* vpcmp{b,w} */
     { { 0x42 }, 3, T, R, pfx_66, W0, Ln }, /* vdbpsadbw */
+    { { 0x42 }, 3, T, R, pfx_f3, W0, Ln }, /* vmpsadbw */
     { { 0x43 }, 3, T, R, pfx_66, Wn, L1|L2 }, /* vshufi{32x4,64x2} */
     { { 0x44 }, 3, T, R, pfx_66, WIG, Ln }, /* vpclmulqdq */
     { { 0x50 }, 3, T, R, pfx_66, Wn, Ln }, /* vrangep{s,d} */
--- a/xen/arch/x86/x86_emulate/decode.c
+++ b/xen/arch/x86/x86_emulate/decode.c
@@ -433,8 +433,8 @@ static const struct ext0f38_table {
     [0xcb] = { .simd_size = simd_other, .d8s = d8s_vl },
     [0xcc ... 0xcd] = { .simd_size = simd_other, .two_op = 1, .d8s = d8s_vl },
     [0xcf] = { .simd_size = simd_packed_int, .d8s = d8s_vl },
-    [0xd2] = { .simd_size = simd_other },
-    [0xd3] = { .simd_size = simd_other },
+    [0xd2] = { .simd_size = simd_other, .d8s = d8s_vl },
+    [0xd3] = { .simd_size = simd_other, .d8s = d8s_vl },
     [0xd6] = { .simd_size = simd_other, .d8s = d8s_vl },
     [0xd7] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
     [0xda] = { .simd_size = simd_other },
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -6201,6 +6201,24 @@ x86_emulate(
         avx512_vlen_check(true);
         goto simd_zmm;
 
+    case X86EMUL_OPC_EVEX   (0x0f38, 0x50): /* vpdpbuud [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F3(0x0f38, 0x50): /* vpdpbsud [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F2(0x0f38, 0x50): /* vpdpbssd [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX   (0x0f38, 0x51): /* vpdpbuuds [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F3(0x0f38, 0x51): /* vpdpbsuds [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F2(0x0f38, 0x51): /* vpdpbssds [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX   (0x0f38, 0xd2): /* vpdpwuud [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(0x0f38, 0xd2): /* vpdpwusd [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F3(0x0f38, 0xd2): /* vpdpwsud [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX   (0x0f38, 0xd3): /* vpdpwuuds [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(0x0f38, 0xd3): /* vpdpwusds [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F3(0x0f38, 0xd3): /* vpdpwsuds [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX   (0x0f38, 0x52): /* vdpphps [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+        generate_exception_if(evex.w, X86_EXC_UD);
+        vcpu_must_have(avx10, 2);
+        op_bytes = 16 << evex.lr;
+        goto avx512f_no_sae;
+
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x8f): /* vpshufbitqmb [xyz]mm/mem,[xyz]mm,k{k} */
         generate_exception_if(evex.w || !evex.r || !evex.R || evex.z, X86_EXC_UD);
         /* fall through */
@@ -7660,6 +7678,14 @@ x86_emulate(
         visa_check(bw);
         goto opmask_shift_imm;
 
+    case X86EMUL_OPC_EVEX_F3(0x0f3a, 0x42): /* vmpsadbw $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+        generate_exception_if(evex.w || evex.brs, X86_EXC_UD);
+        vcpu_must_have(avx10, 2);
+        avx512_vlen_check(false);
+        op_bytes = 16 << evex.lr;
+        fault_suppression = false;
+        goto simd_imm8_zmm;
+
     case X86EMUL_OPC_66(0x0f3a, 0x44): /* pclmulqdq $imm8,xmm/m128,xmm */
     case X86EMUL_OPC_VEX_66(0x0f3a, 0x44): /* vpclmulqdq $imm8,{x,y}mm/mem,{x,y}mm,{x,y}mm */
         host_and_vcpu_must_have(pclmulqdq);

From patchwork Wed Dec 11 10:19:03 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13903311
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BC2A4E77180 for ; Wed, 11 Dec 2024 10:23:07 +0000 (UTC)
Received: from list by lists.xenproject.org with outflank-mailman.854417.1267636 (Exim 4.92) (envelope-from ) id 1tLJrv-0004Yp-Ud; Wed, 11 Dec 2024 10:22:59 +0000
X-Outflank-Mailman: Message body and most headers restored to incoming version
Received: by outflank-mailman (output) from mailman id 854417.1267636; Wed, 11 Dec 2024 10:22:59 +0000
Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tLJrv-0004Yi-Qz; Wed, 11 Dec 2024 10:22:59 +0000
Received: by outflank-mailman (input) for mailman id 854417; Wed, 11 Dec 2024 10:22:58 +0000
Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp
(Exim 4.92) (envelope-from ) id 1tLJoF-00076S-9d for xen-devel@lists.xenproject.org; Wed, 11 Dec 2024 10:19:11 +0000
Date: Wed, 11 Dec 2024 11:19:03 +0100
Subject: [PATCH v3 11/16] x86emul: support AVX10.2 minmax insns
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>
In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>

While they use new major opcodes, they are still pretty similar to various
existing insns.

Signed-off-by: Jan Beulich
---
Spec rev 002 says VMINMAXNEPBF16, yet that's going to change to
VMINMAXPBF16.
---
SDE: ???
---
v3: New.
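Illustration only, not part of the submitted patch: the predicates.c entries added by this patch select the insn from the opcode (0x52 packed, 0x53 scalar) plus the SIMD prefix, and the emulator case passes "b & 1" to avx512_vlen_check() to tell the scalar forms apart. A minimal sketch of that selection; the enum names and both helpers are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Hypothetical prefix and result encodings, for this sketch only. */
enum pfx { PFX_NONE, PFX_66, PFX_F3, PFX_F2 };
enum minmax_kind {
    MINMAX_INVALID,
    MINMAX_PH,      /* vminmaxph */
    MINMAX_PS_PD,   /* vminmaxp{s,d}, selected further by EVEX.W */
    MINMAX_PBF16,   /* vminmaxpbf16 */
    MINMAX_SH,      /* vminmaxsh */
    MINMAX_SS_SD,   /* vminmaxs{s,d}, selected further by EVEX.W */
};

/* Classify an EVEX 0f3a opcode/prefix pair per the table entries added here. */
static enum minmax_kind minmax_classify(unsigned int opc, enum pfx pfx)
{
    switch ( opc )
    {
    case 0x52: /* packed forms */
        switch ( pfx )
        {
        case PFX_NONE: return MINMAX_PH;
        case PFX_66:   return MINMAX_PS_PD;
        case PFX_F2:   return MINMAX_PBF16;
        default:       return MINMAX_INVALID;
        }

    case 0x53: /* scalar forms */
        switch ( pfx )
        {
        case PFX_NONE: return MINMAX_SH;
        case PFX_66:   return MINMAX_SS_SD;
        default:       return MINMAX_INVALID;
        }
    }

    return MINMAX_INVALID;
}

/* The scalar opcode is the odd one, which is what "b & 1" tests. */
static int minmax_is_scalar(unsigned int opc)
{
    assert(opc == 0x52 || opc == 0x53);
    return opc & 1;
}
```

The same none-or-F2 prefix test is what the decode.c hunk uses to set s->fp16 for opcode 0x52, since both the FP16 and the BF16 form use 16-bit elements.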
--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -718,6 +718,11 @@ static const struct test avx10_2_all[] =
     INSN(comxsh, f2, map5, 2f, el, fp16, el),
     INSN(comxss, f2, 0f, 2f, el, d, el),
     INSN(dpphps, , 0f38, 52, vl, d, vl),
+    INSN(minmax, 66, 0f3a, 52, vl, sd, vl),
+    INSN(minmax, 66, 0f3a, 53, el, sd, el),
+    INSN(minmaxpbf16, f2, 0f3a, 52, vl, bf16, vl),
+    INSN(minmaxph, , 0f3a, 52, vl, fp16, vl),
+    INSN(minmaxsh, , 0f3a, 53, el, fp16, el),
     INSN(mpsadbw, f3, 0f3a, 42, vl, d_nb, vl),
     INSN(pdpbssd, f2, 0f38, 50, vl, d, vl),
     INSN(pdpbssds, f2, 0f38, 51, vl, d, vl),
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -2091,6 +2091,11 @@ static const struct evex {
     { { 0x44 }, 3, T, R, pfx_66, WIG, Ln }, /* vpclmulqdq */
     { { 0x50 }, 3, T, R, pfx_66, Wn, Ln }, /* vrangep{s,d} */
     { { 0x51 }, 3, T, R, pfx_66, Wn, LIG }, /* vranges{s,d} */
+    { { 0x52 }, 3, T, R, pfx_no, W0, Ln }, /* vminmaxph */
+    { { 0x52 }, 3, T, R, pfx_66, Wn, Ln }, /* vminmaxp{s,d} */
+    { { 0x52 }, 3, T, R, pfx_f2, W0, Ln }, /* vminmaxpbf16 */
+    { { 0x53 }, 3, T, R, pfx_no, W0, LIG }, /* vminmaxsh */
+    { { 0x53 }, 3, T, R, pfx_66, Wn, LIG }, /* vminmaxs{s,d} */
     { { 0x54 }, 3, T, R, pfx_66, Wn, Ln }, /* vfixupimmp{s,d} */
     { { 0x55 }, 3, T, R, pfx_66, Wn, LIG }, /* vfixumpimms{s,d} */
     { { 0x56 }, 3, T, R, pfx_no, W0, Ln }, /* vreduceph */
--- a/xen/arch/x86/x86_emulate/decode.c
+++ b/xen/arch/x86/x86_emulate/decode.c
@@ -499,6 +499,8 @@ static const struct ext0f3a_table {
     [0x4c] = { .simd_size = simd_packed_int, .four_op = 1 },
     [0x50] = { .simd_size = simd_packed_fp, .d8s = d8s_vl },
     [0x51] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
+    [0x52] = { .simd_size = simd_packed_fp, .d8s = d8s_vl },
+    [0x53] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
     [0x54] = { .simd_size = simd_packed_fp, .d8s = d8s_vl },
     [0x55] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
     [0x56] = { .simd_size = simd_packed_fp, .two_op = 1, .d8s = d8s_vl },
@@ -1474,6 +1476,7 @@ int x86emul_decode(struct x86_emulate_st
                 case 0x0a: /* vrndscalesh */
                 case 0x26: /* vfpclassph */
                 case 0x27: /* vfpclasssh */
+                case 0x53: /* vminmaxsh */
                 case 0x56: /* vgetmantph */
                 case 0x57: /* vgetmantsh */
                 case 0x66: /* vreduceph */
@@ -1482,6 +1485,11 @@ int x86emul_decode(struct x86_emulate_st
                         s->fp16 = true;
                     break;
 
+                case 0x52: /* vminmaxp{h,bf16} */
+                    if ( !s->evex.pfx || s->evex.pfx == vex_f2 )
+                        s->fp16 = true;
+                    break;
+
                 case 0xc2: /* vpcmp{p,s}h */
                     if ( !(s->evex.pfx & VEX_PREFIX_DOUBLE_MASK) )
                         s->fp16 = true;
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -7716,6 +7716,21 @@ x86_emulate(
         generate_exception_if(vex.w, X86_EXC_UD);
         goto simd_0f_int_imm8;
 
+    case X86EMUL_OPC_EVEX_F2(0x0f3a, 0x52): /* vminmaxpbf16 $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+        generate_exception_if(ea.type != OP_MEM && evex.brs, X86_EXC_UD);
+        op_bytes = 16 << evex.lr;
+        /* fall through */
+    case X86EMUL_OPC_EVEX(0x0f3a, 0x52): /* vminmaxph $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX(0x0f3a, 0x53): /* vminmaxsh $imm,xmm/m16,xmm,xmm,xmm{k} */
+        generate_exception_if(vex.w, X86_EXC_UD);
+        /* fall through */
+    case X86EMUL_OPC_EVEX_66(0x0f3a, 0x52): /* vminmaxp{s,d} $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(0x0f3a, 0x53): /* vminmaxs{s,d} $imm8,xmm/mem,xmm,xmm{k} */
+        vcpu_must_have(avx10, 2);
+        if ( ea.type != OP_REG || !evex.brs )
+            avx512_vlen_check(b & 1);
+        goto simd_imm8_zmm;
+
     case X86EMUL_OPC_VEX_66(0x0f3a, 0x5c): /* vfmaddsubps {x,y}mm,{x,y}mm/mem,{x,y}mm,{x,y}mm */
                                            /* vfmaddsubps {x,y}mm/mem,{x,y}mm,{x,y}mm,{x,y}mm */
     case X86EMUL_OPC_VEX_66(0x0f3a, 0x5d): /* vfmaddsubpd {x,y}mm,{x,y}mm/mem,{x,y}mm,{x,y}mm */

From patchwork Wed Dec 11 10:19:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13903307
Return-Path:
X-Spam-Checker-Version:
SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Date: Wed, 11 Dec 2024 11:19:58 +0100
Subject: [PATCH v3 12/16] x86emul: support AVX10.2 BFloat16 insns
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>
In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>

These are all very similar to various existing insns. VGETEXPPBF16, not
living in the expected place, benefits from the respective
twobyte_table[] entry already having Mov (aka TwoOp).

Signed-off-by: Jan Beulich
---
This still follows what spec version 001 says for VGETEXPPBF16. It moved
to map 6 (and be NP), yet so far no SDE is available to run the test
harness there with the changed encoding.

Spec rev 002 says VSCALEFPBF16, yet that's going to change to
VSCALEFNEPBF16.
---
SDE: ???
---
v3: New.
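Illustration only, not part of the submitted patch: the decode.c hunks of this patch special-case the Disp8 scaling of packed BFloat16 operands ("disp8scale = s->evex.brs ? 1 : 4 + s->evex.lr" for VGETEXPPBF16). A minimal sketch of that rule, assuming disp8scale is a log2 shift count, as the values 1 (one 2-byte element) and 4 + lr (a full 16 << lr byte vector) suggest; both helper names are hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical helper: log2 of the Disp8 scaling factor for a packed
 * BFloat16 memory operand.
 *   embedded broadcast -> a single 2-byte element     (shift 1)
 *   otherwise          -> full vector, 16 << lr bytes (shift 4 + lr)
 */
static unsigned int bf16_disp8_shift(unsigned int lr, int brs)
{
    assert(lr <= 2); /* 0 = xmm, 1 = ymm, 2 = zmm */
    return brs ? 1 : 4 + lr;
}

/* Effective byte displacement for a compressed 8-bit displacement. */
static long scaled_disp(signed char disp8, unsigned int shift)
{
    return (long)disp8 * (1L << shift);
}
```

Under these assumptions a disp8 of 1 on a zmm operand without broadcast addresses the next 64-byte vector, while with broadcast it addresses the next 2-byte element.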
--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -713,16 +713,37 @@ static const struct test vpclmulqdq_all[
 };
 
 static const struct test avx10_2_all[] = {
+    INSN(addnepbf16, 66, map5, 58, vl, bf16, vl),
+    INSN(cmppbf16, f2, 0f3a, c2, vl, bf16, vl),
     INSN(comsbf16, 66, map5, 2f, el, bf16, el),
     INSN(comxsd, f3, 0f, 2f, el, q, el),
     INSN(comxsh, f2, map5, 2f, el, fp16, el),
     INSN(comxss, f2, 0f, 2f, el, d, el),
+    INSN(divnepbf16, 66, map5, 5e, vl, bf16, vl),
     INSN(dpphps, , 0f38, 52, vl, d, vl),
+    INSN(fmadd132nepbf16, , map6, 98, vl, bf16, vl),
+    INSN(fmadd213nepbf16, , map6, a8, vl, bf16, vl),
+    INSN(fmadd231nepbf16, , map6, b8, vl, bf16, vl),
+    INSN(fmsub132nepbf16, , map6, 9a, vl, bf16, vl),
+    INSN(fmsub213nepbf16, , map6, aa, vl, bf16, vl),
+    INSN(fmsub231nepbf16, , map6, ba, vl, bf16, vl),
+    INSN(fnmadd132nepbf16, , map6, 9c, vl, bf16, vl),
+    INSN(fnmadd213nepbf16, , map6, ac, vl, bf16, vl),
+    INSN(fnmadd231nepbf16, , map6, bc, vl, bf16, vl),
+    INSN(fnmsub132nepbf16, , map6, 9e, vl, bf16, vl),
+    INSN(fnmsub213nepbf16, , map6, ae, vl, bf16, vl),
+    INSN(fnmsub231nepbf16, , map6, be, vl, bf16, vl),
+    INSN(fpclasspbf16, f2, 0f3a, 66, vl, bf16, vl),
+    INSN(getexppbf16, 66, map5, 42, vl, bf16, vl),
+    INSN(getmantpbf16, f2, 0f3a, 26, vl, bf16, vl),
+    INSN(maxpbf16, 66, map5, 5f, vl, bf16, vl),
+    INSN(minpbf16, 66, map5, 5d, vl, bf16, vl),
     INSN(minmax, 66, 0f3a, 52, vl, sd, vl),
     INSN(minmax, 66, 0f3a, 53, el, sd, el),
     INSN(minmaxpbf16, f2, 0f3a, 52, vl, bf16, vl),
     INSN(minmaxph, , 0f3a, 52, vl, fp16, vl),
     INSN(minmaxsh, , 0f3a, 53, el, fp16, el),
+    INSN(mulnepbf16, 66, map5, 59, vl, bf16, vl),
     INSN(mpsadbw, f3, 0f3a, 42, vl, d_nb, vl),
     INSN(pdpbssd, f2, 0f38, 50, vl, d, vl),
     INSN(pdpbssds, f2, 0f38, 51, vl, d, vl),
@@ -736,6 +757,13 @@ static const struct test avx10_2_all[] =
     INSN(pdpwusds, 66, 0f38, d3, vl, d, vl),
     INSN(pdpwuud, , 0f38, d2, vl, d, vl),
     INSN(pdpwuuds, , 0f38, d3, vl, d, vl),
+    INSN(rcpph, , map6, 4c, vl, bf16, vl),
+    INSN(reducenepbf16, f2, 0f3a, 56, vl, bf16, vl),
+    INSN(rndscalenepbf16, f2, 0f3a, 08, vl, bf16, vl),
+    INSN(rsqrtph, , map6, 4e, vl, bf16, vl),
+    INSN(scalefnepbf16, , map6, 2c, vl, bf16, vl),
+    INSN(sqrtnepbf16, 66, map5, 51, vl, bf16, vl),
+    INSN(subnepbf16, 66, map5, 5c, vl, bf16, vl),
     INSN(ucomxsd, f3, 0f, 2e, el, q, el),
     INSN(ucomxsh, f2, map5, 2e, el, fp16, el),
     INSN(ucomxss, f2, 0f, 2e, el, d, el),
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -2054,6 +2054,7 @@ static const struct evex {
     { { 0x05 }, 3, T, R, pfx_66, W1, Ln }, /* vpermilpd */
     { { 0x08 }, 3, T, R, pfx_no, W0, Ln }, /* vrndscaleph */
     { { 0x08 }, 3, T, R, pfx_66, W0, Ln }, /* vrndscaleps */
+    { { 0x08 }, 3, T, R, pfx_f2, W0, Ln }, /* vrndscalenepbf16 */
     { { 0x09 }, 3, T, R, pfx_66, W1, Ln }, /* vrndscalepd */
     { { 0x0a }, 3, T, R, pfx_no, W0, LIG }, /* vrndscalesh */
     { { 0x0a }, 3, T, R, pfx_66, W0, LIG }, /* vrndscaless */
@@ -2077,6 +2078,7 @@ static const struct evex {
     { { 0x25 }, 3, T, R, pfx_66, Wn, Ln }, /* vpternlog{d,q} */
     { { 0x26 }, 3, T, R, pfx_no, W0, Ln }, /* vgetmantph */
     { { 0x26 }, 3, T, R, pfx_66, Wn, Ln }, /* vgetmantp{s,d} */
+    { { 0x26 }, 3, T, R, pfx_f2, W0, Ln }, /* vgetmantpbf16 */
     { { 0x27 }, 3, T, R, pfx_no, W0, LIG }, /* vgetmantsh */
     { { 0x27 }, 3, T, R, pfx_66, Wn, LIG }, /* vgetmants{s,d} */
     { { 0x38 }, 3, T, R, pfx_66, Wn, L1|L2 }, /* vinserti{32x4,64x2} */
@@ -2100,10 +2102,12 @@ static const struct evex {
     { { 0x55 }, 3, T, R, pfx_66, Wn, LIG }, /* vfixumpimms{s,d} */
     { { 0x56 }, 3, T, R, pfx_no, W0, Ln }, /* vreduceph */
     { { 0x56 }, 3, T, R, pfx_66, Wn, Ln }, /* vreducep{s,d} */
+    { { 0x56 }, 3, T, R, pfx_f2, W0, Ln }, /* vreducenepbf16 */
     { { 0x57 }, 3, T, R, pfx_no, W0, LIG }, /* vreducesh */
     { { 0x57 }, 3, T, R, pfx_66, Wn, LIG }, /* vreduces{s,d} */
     { { 0x66 }, 3, T, R, pfx_no, W0, Ln }, /* vfpclassph */
     { { 0x66 }, 3, T, R, pfx_66, Wn, Ln }, /* vfpclassp{s,d} */
+    { { 0x66 }, 3, T, R, pfx_f2, W0, Ln }, /* vfpclasspbf16 */
     { { 0x67 }, 3, T, R, pfx_no, W0, LIG }, /* vfpclasssh */
     { { 0x67 }, 3, T, R, pfx_66, Wn, LIG }, /* vfpclasss{s,d} */
     { { 0x70 }, 3, T, R, pfx_66, W1, Ln }, /* vshldw */
@@ -2112,6 +2116,7 @@ static const struct evex {
     { { 0x73 }, 3, T, R, pfx_66, Wn, Ln }, /* vshrd{d,q} */
     { { 0xc2 }, 3, T, R, pfx_no, W0, Ln }, /* vcmpph */
     { { 0xc2 }, 3, T, R, pfx_f3, W0, LIG }, /* vcmpsh */
+    { { 0xc2 }, 3, T, R, pfx_f2, W0, Ln }, /* vcmppbf16 */
     { { 0xce }, 3, T, R, pfx_66, W1, Ln }, /* vgf2p8affineqb */
     { { 0xcf }, 3, T, R, pfx_66, W1, Ln }, /* vgf2p8affineinvqb */
 }, evex_map5[] = {
@@ -2127,11 +2132,15 @@ static const struct evex {
     { { 0x2f }, 2, T, R, pfx_no, W0, LIG }, /* vcomish */
     { { 0x2f }, 2, T, R, pfx_66, W0, LIG }, /* vcomsbf16 */
     { { 0x2f }, 2, T, R, pfx_f2, W0, LIG }, /* vcomxsh */
+    { { 0x42 }, 2, T, R, pfx_66, W0, Ln }, /* vgetexppbf16 */
     { { 0x51 }, 2, T, R, pfx_no, W0, Ln }, /* vsqrtph */
+    { { 0x51 }, 2, T, R, pfx_66, W0, Ln }, /* vsqrtnepbf16 */
     { { 0x51 }, 2, T, R, pfx_f3, W0, LIG }, /* vsqrtsh */
     { { 0x58 }, 2, T, R, pfx_no, W0, Ln }, /* vaddph */
+    { { 0x58 }, 2, T, R, pfx_66, W0, Ln }, /* vaddnepbf16 */
     { { 0x58 }, 2, T, R, pfx_f3, W0, LIG }, /* vaddsh */
     { { 0x59 }, 2, T, R, pfx_no, W0, Ln }, /* vmulph */
+    { { 0x59 }, 2, T, R, pfx_66, W0, Ln }, /* vmulnepbf16 */
     { { 0x59 }, 2, T, R, pfx_f3, W0, LIG }, /* vmulsh */
     { { 0x5a }, 2, T, R, pfx_no, W0, Ln }, /* vcvtph2pd */
     { { 0x5a }, 2, T, R, pfx_66, W1, Ln }, /* vcvtpd2ph */
@@ -2142,12 +2151,16 @@ static const struct evex {
     { { 0x5b }, 2, T, R, pfx_66, W0, Ln }, /* vcvtph2dq */
     { { 0x5b }, 2, T, R, pfx_f3, W0, Ln }, /* vcvttph2dq */
     { { 0x5c }, 2, T, R, pfx_no, W0, Ln }, /* vsubph */
+    { { 0x5c }, 2, T, R, pfx_66, W0, Ln }, /* vsubnepbf16 */
     { { 0x5c }, 2, T, R, pfx_f3, W0, LIG }, /* vsubsh */
     { { 0x5d }, 2, T, R, pfx_no, W0, Ln }, /* vminph */
+    { { 0x5d }, 2, T, R, pfx_66, W0, Ln }, /* vminpbf16 */
     { { 0x5d }, 2, T, R, pfx_f3, W0, LIG }, /* vminsh */
     { { 0x5e }, 2, T, R, pfx_no, W0, Ln }, /* vdivph */
+    { { 0x5e }, 2, T, R, pfx_66, W0, Ln }, /* vdivnepbf16 */
     { { 0x5e }, 2, T, R, pfx_f3, W0, LIG }, /* vdivsh */
     { { 0x5f }, 2, T, R, pfx_no, W0, Ln }, /* vmaxph */
+    { { 0x5f }, 2, T, R, pfx_66, W0, Ln }, /* vmaxpbf16 */
     { { 0x5f }, 2, T, R, pfx_f3, W0, LIG }, /* vmaxsh */
     { { 0x6e }, 2, T, R, pfx_66, WIG, L0 }, /* vmovw */
     { { 0x6e }, 2, T, R, pfx_f3, W0, L0 }, /* vmovw */
@@ -2173,12 +2186,15 @@ static const struct evex {
 }, evex_map6[] = {
     { { 0x13 }, 2, T, R, pfx_66, W0, Ln }, /* vcvtph2psx */
     { { 0x13 }, 2, T, R, pfx_no, W0, LIG }, /* vcvtsh2ss */
+    { { 0x2c }, 2, T, R, pfx_no, W0, Ln }, /* vscalefnepbf16 */
     { { 0x2c }, 2, T, R, pfx_66, W0, Ln }, /* vscalefph */
     { { 0x2d }, 2, T, R, pfx_66, W0, LIG }, /* vscalefsh */
     { { 0x42 }, 2, T, R, pfx_66, W0, Ln }, /* vgetexpph */
     { { 0x43 }, 2, T, R, pfx_66, W0, LIG }, /* vgetexpsh */
+    { { 0x4c }, 2, T, R, pfx_no, W0, Ln }, /* vrcppbf16 */
     { { 0x4c }, 2, T, R, pfx_66, W0, Ln }, /* vrcpph */
     { { 0x4d }, 2, T, R, pfx_66, W0, LIG }, /* vrcpsh */
+    { { 0x4e }, 2, T, R, pfx_no, W0, Ln }, /* vrsqrtpbf16 */
     { { 0x4e }, 2, T, R, pfx_66, W0, Ln }, /* vrsqrtph */
     { { 0x4f }, 2, T, R, pfx_66, W0, LIG }, /* vrsqrtsh */
     { { 0x56 }, 2, T, R, pfx_f3, W0, Ln }, /* vfmaddcph */
--- a/xen/arch/x86/x86_emulate/decode.c
+++ b/xen/arch/x86/x86_emulate/decode.c
@@ -1472,31 +1472,34 @@ int x86emul_decode(struct x86_emulate_st
             {
                 switch ( b )
                 {
-                case 0x08: /* vrndscaleph */
+                case 0x08: /* vrndscale{ph,nepbf16} */
+                case 0x26: /* vfpclassp{h,bf16} */
+                case 0x52: /* vminmaxp{h,bf16} */
+                case 0x56: /* vgetmantp{h,bf16} */
+                case 0x66: /* vreduce{ph,nepbf16} */
+                    if ( !s->evex.pfx || s->evex.pfx == vex_f2 )
+                        s->fp16 = true;
+                    break;
+
                 case 0x0a: /* vrndscalesh */
-                case 0x26: /* vfpclassph */
                 case 0x27: /* vfpclasssh */
                 case 0x53: /* vminmaxsh */
-                case 0x56: /* vgetmantph */
                 case 0x57: /* vgetmantsh */
-                case 0x66: /* vreduceph */
                 case 0x67: /* vreducesh */
                     if ( !s->evex.pfx )
                         s->fp16 = true;
                     break;
 
-                case 0x52: /* vminmaxp{h,bf16} */
-                    if ( !s->evex.pfx || s->evex.pfx == vex_f2 )
-                        s->fp16 = true;
-                    break;
-
-                case 0xc2: /* vpcmp{p,s}h */
-                    if ( !(s->evex.pfx & VEX_PREFIX_DOUBLE_MASK) )
+                case 0xc2: /* vpcmp{p,s}h, vcmppbf16 */
+                    if ( s->evex.pfx != vex_66 )
                         s->fp16 = true;
                     break;
                 }
 
-                disp8scale = decode_disp8scale(ext0f3a_table[b].d8s, s);
+                if ( s->fp16 && s->evex.pfx == vex_f2 && !s->evex.brs )
+                    disp8scale = 4 + s->evex.lr;
+                else
+                    disp8scale = decode_disp8scale(ext0f3a_table[b].d8s, s);
             }
             break;
 
@@ -1504,7 +1507,7 @@ int x86emul_decode(struct x86_emulate_st
             switch ( b )
             {
             default:
-                if ( !(s->evex.pfx & VEX_PREFIX_DOUBLE_MASK) )
+                if ( s->evex.pfx != vex_f2 )
                     s->fp16 = true;
                 break;
 
@@ -1534,6 +1537,11 @@ int x86emul_decode(struct x86_emulate_st
                 s->simd_size = simd_none;
                 break;
 
+            case 0x5a: /* vcvt{p,s}d2{p,s}h, vcvt{p,s}h2{p,s}d */
+                if ( !(s->evex.pfx & VEX_PREFIX_DOUBLE_MASK) )
+                    s->fp16 = true;
+                break;
+
             case 0x5b: /* vcvt{d,q}q2ph, vcvt{,t}ph2dq */
                 if ( s->evex.pfx && s->evex.pfx != vex_f2 )
                     s->fp16 = true;
@@ -1586,6 +1594,14 @@ int x86emul_decode(struct x86_emulate_st
                 disp8scale = 1;
                 break;
 
+            case 0x42: /* vgetexppbf16 needs special casing */
+                if ( s->evex.pfx == vex_66 )
+                {
+                    s->simd_size = simd_packed_fp;
+                    disp8scale = s->evex.brs ? 1 : 4 + s->evex.lr;
+                }
+                break;
+
             case 0x5a: /* vcvtph2pd needs special casing */
                 if ( !s->evex.pfx && !s->evex.brs )
                     disp8scale -= 2;
@@ -1618,7 +1634,7 @@ int x86emul_decode(struct x86_emulate_st
             switch ( b )
             {
             default:
-                if ( s->evex.pfx == vex_66 )
+                if ( !(s->evex.pfx & VEX_PREFIX_SCALAR_MASK) )
                     s->fp16 = true;
                 break;
 
@@ -1950,6 +1966,13 @@ int x86emul_decode(struct x86_emulate_st
             s->op_bytes = 4 >> s->fp16;
             break;
         case vex_f2:
+            if ( s->fp16 )
+            {
+                ASSERT(evex_encoded());
+                generate_exception_if(s->evex.w, X86_EXC_UD);
+                s->op_bytes = 0;
+                break;
+            }
             generate_exception_if(evex_encoded() && !s->evex.w, X86_EXC_UD);
             s->op_bytes = 8;
             break;
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -7319,6 +7319,20 @@ x86_emulate(
         avx512_vlen_check(b & 2);
         goto simd_imm8_zmm;
 
+    case X86EMUL_OPC_EVEX_F2(0x0f3a, 0x66): /* vfpclasspbf16 $imm8,[xyz]mm/mem,k{k} */
+    case X86EMUL_OPC_EVEX_F2(0x0f3a, 0xc2): /* vcmppbf16 $imm8,[xyz]mm/mem,[xyz]mm,k{k} */
+        generate_exception_if(!evex.r || !evex.R || evex.z, X86_EXC_UD);
+        /* fall through */
+    case X86EMUL_OPC_EVEX_F2(0x0f3a, 0x08): /* vrndscalenepbf16 $imm8,[xyz]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F2(0x0f3a, 0x26): /* vgetmantpbf16 $imm8,[xyz]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F2(0x0f3a, 0x56): /* vreducenepbf16 $imm8,[xyz]mm/mem,[xyz]mm{k} */
+        generate_exception_if(evex.w || (ea.type != OP_MEM && evex.brs),
+                              X86_EXC_UD);
+        vcpu_must_have(avx10, 2);
+        avx512_vlen_check(false);
+        op_bytes = 16 << evex.lr;
+        goto simd_imm8_zmm;
+
 #endif /* X86EMUL_NO_SIMD */
 
     CASE_SIMD_PACKED_INT(0x0f3a, 0x0f): /* palignr $imm8,{,x}mm/mem,{,x}mm */
@@ -7951,6 +7965,36 @@ x86_emulate(
         generate_exception_if(evex.w, X86_EXC_UD);
         goto avx512f_all_fp;
 
+    case X86EMUL_OPC_EVEX_66(5, 0x42): /* vgetexppbf16 [xyz]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(5, 0x51): /* vsqrtnepbf16 [xyz]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(5, 0x58): /* vaddnepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k}
*/ + case X86EMUL_OPC_EVEX_66(5, 0x59): /* vmulnepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX_66(5, 0x5c): /* vsubnepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX_66(5, 0x5d): /* vminpbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX_66(5, 0x5e): /* vdivnepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX_66(5, 0x5f): /* vmaxpbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0x2c): /* vscalefnepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0x4c): /* vrcppbf16 [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0x4e): /* vrsqrtpbf16 [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0x98): /* vfmadd132nepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0x9a): /* vfmsub132nepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0x9c): /* vfnmadd132nepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0x9e): /* vfnmsub132nepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0xa8): /* vfmadd213nepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0xaa): /* vfmsub213nepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0xac): /* vfnmadd213nepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0xae): /* vfnmsub213nepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0xb8): /* vfmadd231nepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0xba): /* vfmsub231nepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0xbc): /* vfnmadd231nepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(6, 0xbe): /* vfnmsub231nepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + generate_exception_if(evex.w || (ea.type != OP_MEM && evex.brs), + X86_EXC_UD); + vcpu_must_have(avx10, 2); + avx512_vlen_check(false); + op_bytes = 16 << evex.lr; + goto simd_zmm; + CASE_SIMD_ALL_FP(_EVEX, 5, 0x5a): /* vcvtp{h,d}2p{h,d} [xyz]mm/mem,[xyz]mm{k} */ /* 
vcvts{h,d}2s{h,d} xmm/mem,xmm,xmm{k} */ visa_check(_fp16); From patchwork Wed Dec 11 10:20:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 13903317 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E9374E7717D for ; Wed, 11 Dec 2024 10:33:29 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.854497.1267686 (Exim 4.92) (envelope-from ) id 1tLK1w-00017x-RM; Wed, 11 Dec 2024 10:33:20 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 854497.1267686; Wed, 11 Dec 2024 10:33:20 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tLK1w-00017q-Nd; Wed, 11 Dec 2024 10:33:20 +0000 Received: by outflank-mailman (input) for mailman id 854497; Wed, 11 Dec 2024 10:33:19 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tLJpV-00076S-TC for xen-devel@lists.xenproject.org; Wed, 11 Dec 2024 10:20:30 +0000 Received: from mail-wr1-x435.google.com (mail-wr1-x435.google.com [2a00:1450:4864:20::435]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 881658d2-b7a9-11ef-99a3-01e77a169b0f; Wed, 11 Dec 2024 11:20:23 +0100 (CET) Received: by mail-wr1-x435.google.com with SMTP id ffacd0b85a97d-385deda28b3so4861984f8f.0 for ; Wed, 11 Dec 2024 02:20:28 -0800 (PST) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. 
[37.24.206.209]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-21654207719sm46391835ad.48.2024.12.11.02.20.24 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 11 Dec 2024 02:20:27 -0800 (PST) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 881658d2-b7a9-11ef-99a3-01e77a169b0f DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1733912427; x=1734517227; darn=lists.xenproject.org; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :references:cc:to:from:subject:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=Qs9XBOYAp6IPbLweUtrtvMK28FTozTB4amhvJ+Ev7XE=; b=EFy3BIdPkxdw2hPS9QoXek5Gke1/idor5PwcfdtIpmtbpwAmcHlmwxbkKNTH3XJNrH q3wkLOJe4fWL4daEyFAaeQUTz+T4xQcMjtMm/etm/0T2xCSDLiuzg2CLJnoCrojeQmOr kBqMdvF6uNGwJAL5R/4ysnPkKgg1NnhnK0hF6SVpznM0sRVcUGnvDUtWiESZkz8B2ANn BivG7mEmb/vyUNLXOZrST84BpW8HaF2can0herkk4JLwrrs+tC6DY38VLYM2R3fg+rbZ JOZUckDyaI5DTDB1c+8nYccS3yD5u+iswYiGOpW3C24aDIbRT/BNdni8dTJd1MsCuSSz ODSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733912427; x=1734517227; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :references:cc:to:from:subject:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Qs9XBOYAp6IPbLweUtrtvMK28FTozTB4amhvJ+Ev7XE=; b=p5hef0F3Aoe1TXczAnn0XohVkLVVSpUpNioWpfc3oJsFXcUGPVYpjAwkupcMspQ9B4 5U/rNSVUXwBLQQoXpO3iPWY/ojSoCJ3DWHQGMXijREbSJ5aWCYjowCRCb0Kqx9g2AoXR mtspx+xOiMQrLvPz8vpLyP3BYi+PIfxs1fYXIzbdGnwJ6TkDelZueu0Ib1SofVUz6ljw 3eZvRPJlfx6H+xFHsPPy3jPe1NcetgLldGnJuSd7JxRKjgpNL9QgyuRJyZcRYoRMGiC/ yreA3mXLvxzAuv6CJhpHQ+XIvZ/h/dwB2+ChjM0yLKpIgql826WHKN5i1kCyYCMqfUnY hpGQ== X-Gm-Message-State: AOJu0YwFipWu+uEYs3JAJ76zmkG0ueIJZROZWsWH0YcxmxvhxOI4reqj 
0/JQTOi0pomkbwH0mCxTsmUuecuyXHBULHa7sv4dXenYkgTibpoa4OOWXipE859ub2++y1QUipM = X-Gm-Gg: ASbGncu5lD8caThvxmdV7uQjhFQ3GntpqI9LFXUhKuHisS3efZB4qXSSOi0y/AX0wdu JCAxMgewNUVZJwAIgEBtTzgnCvqhgcvMSwuiNifs3EQdhx0lo5+NUWaUNHTzVE84C3yEUrM+ro7 i9D4SUypvTL5FPbOVy28jM7SiT975++a/XOwEG7yeoJJCCa+C5hGp025MBokhqP5fpQv6eFj0PU J4CJr2isKkmN7E08vVMaV/N6NA0iiVvPHOiUnlWD3ehB/MAc2/O111+F/W1rDpxHIJfkkP4BGba GxnhhIF4Gj53M0yFT7roj2mDK21U2fs2GkCXNvw= X-Google-Smtp-Source: AGHT+IHAgKRVLhZdPu84qjCA+WUGRppKHkhPCiergvw2aQApDglRU+3P8PqJnD8kte/0JD8RAwKvCg== X-Received: by 2002:a05:6000:144f:b0:385:f60b:f5c4 with SMTP id ffacd0b85a97d-3864cea3ca5mr1737545f8f.29.1733912427525; Wed, 11 Dec 2024 02:20:27 -0800 (PST) Message-ID: <9ea8d2d5-632a-465c-852f-47b0feeda69b@suse.com> Date: Wed, 11 Dec 2024 11:20:22 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: [PATCH v3 13/16] x86emul: support AVX10.2 saturating convert insns From: Jan Beulich To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , =?utf-8?q?Roger_Pau_Monn?= =?utf-8?q?=C3=A9?= References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com> Content-Language: en-US Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF 
hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com> While the to-byte ones are somewhat different from what has been there (yet then nicely regular from an operands perspective), the others are pretty similar to various existing insns. Signed-off-by: Jan Beulich --- Spec rev 002 says VCVTTNEBF162I{,U}BS, yet that's going to change to VCVTTBF162I{,U}BS. --- SDE: ??? --- v3: New. 
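Illustration only, not part of the patch: the Disp8 scaling special case that the decode.c hunk below adds for map5 opcodes 0x6c/0x6d can be modelled standalone roughly as follows. This is a sketch under stated assumptions. The enum, the helper name disp8_shift_6c_6d() and the baseline (element size under broadcast, otherwise full vector width) are hypothetical stand-ins for the emulator's internal state, not Xen code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the emulator's SIMD prefix encoding. */
enum pfx { PFX_NONE, PFX_66, PFX_F3, PFX_F2 };

/*
 * Returns log2 of the Disp8 scaling factor for EVEX map5 opcodes 0x6c/0x6d.
 * lr is the vector length field (0 = 128-bit, 1 = 256-bit, 2 = 512-bit).
 */
static unsigned int disp8_shift_6c_6d(enum pfx pfx, bool w, bool brs,
                                      unsigned int lr)
{
    /* Assumed baseline: element size with broadcast, else full vector. */
    unsigned int shift = brs ? 2u + w : 4u + lr;

    if ( pfx == PFX_66 && !w && !brs )
        --shift;                        /* vcvttps2{,u}qqs: half-width source */
    else if ( pfx == PFX_F3 || pfx == PFX_F2 )
        shift = pfx == PFX_F2 ? 3 : 2;  /* scalar: 8-byte vs 4-byte operand */

    return shift;
}
```

This matches the vl_2 and el entries added to evex-disp8.c for the same insns: the 66-prefixed W0 forms read half a vector, and the scalar forms read a single element.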
--- a/tools/tests/x86_emulator/evex-disp8.c +++ b/tools/tests/x86_emulator/evex-disp8.c @@ -719,6 +719,30 @@ static const struct test avx10_2_all[] = INSN(comxsd, f3, 0f, 2f, el, q, el), INSN(comxsh, f2, map5, 2f, el, fp16, el), INSN(comxss, f2, 0f, 2f, el, d, el), + INSN(cvtnebf162ibs, f2, map5, 69, vl, bf16, vl), + INSN(cvtnebf162iubs, f2, map5, 6b, vl, bf16, vl), + INSN(cvtph2ibs, , map5, 69, vl, fp16, vl), + INSN(cvtph2iubs, , map5, 6b, vl, fp16, vl), + INSN(cvtps2ibs, 66, map5, 69, vl, d, vl), + INSN(cvtps2iubs, 66, map5, 6b, vl, d, vl), + INSN(cvttbf162ibs, f2, map5, 68, vl, bf16, vl), + INSN(cvttbf162iubs, f2, map5, 6a, vl, bf16, vl), + INSN(cvttpd2dqs, , map5, 6d, vl, q, vl), + INSN(cvttpd2qqs, 66, map5, 6d, vl, q, vl), + INSN(cvttpd2udqs, , map5, 6c, vl, q, vl), + INSN(cvttpd2uqqs, 66, map5, 6c, vl, q, vl), + INSN(cvttph2ibs, , map5, 68, vl, fp16, vl), + INSN(cvttph2iubs, , map5, 6a, vl, fp16, vl), + INSN(cvttps2dqs, , map5, 6d, vl, d, vl), + INSN(cvttps2ibs, 66, map5, 68, vl, d, vl), + INSN(cvttps2iubs, 66, map5, 6a, vl, d, vl), + INSN(cvttps2qqs, 66, map5, 6d, vl_2, d, vl), + INSN(cvttps2udqs, , map5, 6c, vl, d, vl), + INSN(cvttps2uqqs, 66, map5, 6c, vl_2, d, vl), + INSN(cvttsd2sis, f2, map5, 6d, el, q, el), + INSN(cvttsd2usis, f2, map5, 6c, el, q, el), + INSN(cvttss2sis, f3, map5, 6d, el, d, el), + INSN(cvttss2usis, f3, map5, 6c, el, d, el), INSN(divnepbf16, 66, map5, 5e, vl, bf16, vl), INSN(dpphps, , 0f38, 52, vl, d, vl), INSN(fmadd132nepbf16, , map6, 98, vl, bf16, vl), --- a/tools/tests/x86_emulator/predicates.c +++ b/tools/tests/x86_emulator/predicates.c @@ -2162,6 +2162,26 @@ static const struct evex { { { 0x5f }, 2, T, R, pfx_no, W0, Ln }, /* vmaxph */ { { 0x5f }, 2, T, R, pfx_66, W0, Ln }, /* vmaxpbf16 */ { { 0x5f }, 2, T, R, pfx_f3, W0, LIG }, /* vmaxsh */ + { { 0x68 }, 2, T, R, pfx_no, W0, Ln }, /* vcvttph2ibs */ + { { 0x68 }, 2, T, R, pfx_66, W0, Ln }, /* vcvttps2ibs */ + { { 0x68 }, 2, T, R, pfx_f2, W0, Ln }, /* vcvttbf162ibs */ + { { 0x69 }, 
2, T, R, pfx_no, W0, Ln }, /* vcvtph2ibs */ + { { 0x69 }, 2, T, R, pfx_66, W0, Ln }, /* vcvtps2ibs */ + { { 0x69 }, 2, T, R, pfx_f2, W0, Ln }, /* vcvtnebf162ibs */ + { { 0x6a }, 2, T, R, pfx_no, W0, Ln }, /* vcvttph2iubs */ + { { 0x6a }, 2, T, R, pfx_66, W0, Ln }, /* vcvttps2iubs */ + { { 0x6a }, 2, T, R, pfx_f2, W0, Ln }, /* vcvttbf162iubs */ + { { 0x6b }, 2, T, R, pfx_no, W0, Ln }, /* vcvtph2iubs */ + { { 0x6b }, 2, T, R, pfx_66, W0, Ln }, /* vcvtps2iubs */ + { { 0x6b }, 2, T, R, pfx_f2, W0, Ln }, /* vcvtnebf162iubs */ + { { 0x6c }, 2, T, R, pfx_no, Wn, Ln }, /* vcvttp{s,d}2udqs */ + { { 0x6c }, 2, T, R, pfx_66, Wn, Ln }, /* vcvttp{s,d}2uqqs */ + { { 0x6c }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvttss2usis */ + { { 0x6c }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvttsd2usis */ + { { 0x6d }, 2, T, R, pfx_no, Wn, Ln }, /* vcvttp{s,d}2dqs */ + { { 0x6d }, 2, T, R, pfx_66, Wn, Ln }, /* vcvttp{s,d}2qqs */ + { { 0x6d }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvttss2sis */ + { { 0x6d }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvttsd2sis */ { { 0x6e }, 2, T, R, pfx_66, WIG, L0 }, /* vmovw */ { { 0x6e }, 2, T, R, pfx_f3, W0, L0 }, /* vmovw */ { { 0x78 }, 2, T, R, pfx_no, W0, Ln }, /* vcvttph2udq */ --- a/xen/arch/x86/x86_emulate/decode.c +++ b/xen/arch/x86/x86_emulate/decode.c @@ -1547,6 +1547,19 @@ int x86emul_decode(struct x86_emulate_st s->fp16 = true; break; + case 0x68: /* vcvtt{ph,ps,bf16}2ibs */ + case 0x69: /* vcvt{ph,ps,nebf16}2ibs */ + case 0x6a: /* vcvtt{ph,ps,bf16}2iubs */ + case 0x6b: /* vcvt{ph,ps,nebf16}2iubs */ + if ( !s->evex.pfx || s->evex.pfx == vex_f2 ) + s->fp16 = true; + /* fall through */ + case 0x6c: /* vcvttp{s,d}2u{d,q}qs, vcvtts{s,d}2usis */ + case 0x6d: /* vcvttp{s,d}2{d,q}qs, vcvtts{s,d}2sis */ + d |= TwoOp; + s->simd_size = simd_other; + break; + case 0x6e: /* vmovw r/x/m16, xmm */ d = (d & ~SrcMask) | SrcMem16; /* fall through */ @@ -1612,6 +1625,14 @@ int x86emul_decode(struct x86_emulate_st --disp8scale; break; + case 0x6c: /* vcvttps2uqqs and vcvts{s,d}2usi need 
special casing */ + case 0x6d: /* vcvttps2qqs and vcvts{s,d}2si need special casing */ + if ( s->evex.pfx == vex_66 && !s->evex.w && !s->evex.brs ) + --disp8scale; + else if ( s->evex.pfx & VEX_PREFIX_SCALAR_MASK ) + disp8scale = s->evex.pfx & VEX_PREFIX_DOUBLE_MASK ? 3 : 2; + break; + case 0x7a: case 0x7b: /* vcvt{,t}ph2qq need special casing */ if ( s->evex.pfx == vex_66 && !s->evex.brs ) disp8scale = s->evex.brs ? 1 : 2 + s->evex.lr; --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -8025,6 +8025,55 @@ x86_emulate( op_bytes = 8 << evex.lr; goto simd_zmm; + case X86EMUL_OPC_EVEX_F2(5, 0x68): /* vcvttbf162ibs [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX_F2(5, 0x69): /* vcvtnebf162ibs [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX_F2(5, 0x6a): /* vcvttbf162iubs [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX_F2(5, 0x6b): /* vcvtnebf162iubs [xyz]mm/mem,[xyz]mm{k} */ + generate_exception_if(ea.type != OP_MEM && evex.brs, X86_EXC_UD); + /* fall through */ + case X86EMUL_OPC_EVEX (5, 0x68): /* vcvttph2ibs [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX_66(5, 0x68): /* vcvttps2ibs [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX (5, 0x69): /* vcvtph2ibs [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX_66(5, 0x69): /* vcvtps2ibs [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX (5, 0x6a): /* vcvttph2iubs [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX_66(5, 0x6a): /* vcvttps2iubs [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX (5, 0x6b): /* vcvtph2iubs [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX_66(5, 0x6b): /* vcvtps2iubs [xyz]mm/mem,[xyz]mm{k} */ + generate_exception_if(evex.w, X86_EXC_UD); + vcpu_must_have(avx10, 2); + if ( ea.type != OP_REG || !evex.brs ) + avx512_vlen_check(false); + op_bytes = 16 << evex.lr; + goto simd_zmm; + + case X86EMUL_OPC_EVEX (5, 0x6c): /* vcvttps2udqs [xyz]mm/mem,[xyz]mm{k} */ + /* vcvttpd2udqs [xyz]mm/mem,{x,y}mm{k} */ + case X86EMUL_OPC_EVEX_66(5, 0x6c): /* vcvttps2uqqs 
{x,y}mm/mem,[xyz]mm{k} */ + /* vcvttpd2uqqs [xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX (5, 0x6d): /* vcvttps2dqs [xyz]mm/mem,[xyz]mm{k} */ + /* vcvttpd2dqs [xyz]mm/mem,{x,y}mm{k} */ + case X86EMUL_OPC_EVEX_66(5, 0x6d): /* vcvttps2qqs {x,y}mm/mem,[xyz]mm{k} */ + /* vcvttpd2qqs [xyz]mm/mem,[xyz]mm{k} */ + vcpu_must_have(avx10, 2); + if ( ea.type != OP_REG || !evex.brs ) + avx512_vlen_check(false); + op_bytes = 8 << ((evex.w || !evex.pfx) + evex.lr); + goto simd_zmm; + + CASE_SIMD_SCALAR_FP(_EVEX, 5, 0x6c): /* vcvtts{s,d}2usis xmm/mem,reg */ + CASE_SIMD_SCALAR_FP(_EVEX, 5, 0x6d): /* vcvtts{s,d}2sis xmm/mem,reg */ + generate_exception_if((evex.reg != 0xf || !evex.RX || !evex.R || + evex.opmsk), + X86_EXC_UD); + vcpu_must_have(avx10, 2); + if ( !evex.brs ) + avx512_vlen_check(true); + else + generate_exception_if(ea.type != OP_REG || !evex.u, X86_EXC_UD); + get_fpu(X86EMUL_FPU_zmm); + opc = init_evex(stub); + goto cvts_2si; + case X86EMUL_OPC_EVEX_66(5, 0x78): /* vcvttph2uqq xmm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(5, 0x79): /* vcvtph2uqq xmm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(5, 0x7a): /* vcvttph2qq xmm/mem,[xyz]mm{k} */ From patchwork Wed Dec 11 10:20:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 13903316 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2B7A8E7717D for ; Wed, 11 Dec 2024 10:32:24 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.854478.1267675 (Exim 4.92) (envelope-from ) id 1tLK0n-0008Vu-GV; Wed, 11 Dec 2024 10:32:09 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version 
Received: by outflank-mailman (output) from mailman id 854478.1267675; Wed, 11 Dec 2024 10:32:09 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tLK0n-0008Vn-Da; Wed, 11 Dec 2024 10:32:09 +0000 Received: by outflank-mailman (input) for mailman id 854478; Wed, 11 Dec 2024 10:32:08 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tLJpt-00076S-2v for xen-devel@lists.xenproject.org; Wed, 11 Dec 2024 10:20:53 +0000 Received: from mail-wr1-x42a.google.com (mail-wr1-x42a.google.com [2a00:1450:4864:20::42a]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 95e169e3-b7a9-11ef-99a3-01e77a169b0f; Wed, 11 Dec 2024 11:20:46 +0100 (CET) Received: by mail-wr1-x42a.google.com with SMTP id ffacd0b85a97d-385dece873cso3145205f8f.0 for ; Wed, 11 Dec 2024 02:20:51 -0800 (PST) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. 
[37.24.206.209]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2162a479fc7sm72313335ad.47.2024.12.11.02.20.48 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 11 Dec 2024 02:20:50 -0800 (PST) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 95e169e3-b7a9-11ef-99a3-01e77a169b0f DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1733912451; x=1734517251; darn=lists.xenproject.org; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :references:cc:to:from:subject:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=EYrG6X8uwp+bs8ww0EgrMry1siCkDx4rXslRN+kZ534=; b=I9tDmfXpNDdMnWdAx45uwAji31lrAQty28jPDs38DEhMhsLai5N4X+IluaX17W45nf EEQ5AZj37KXEYOlVeVIKblB+GGs9Pacd7jZbAFxzaCqcztr98F6GFrwOscVvLdAPtXd0 1IH+AMV6e0xHcOhBc5SvTCckm9phkPoAbR1Sc1cWDctTZC24kehKW0unkxLxcUE8PXa0 1hJq1YOL1iaV13OWT0/Ytk9jffzxiB1nfOOkT01+NnKSazBt3/k1t3dO5YVAI8nx3FDV QcplgEllvycZNXIEXenFUArfbtsxGKPCOxmMzPDk2s1GV5UGu+CYohvRani0htFEup7V Q0dA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733912451; x=1734517251; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :references:cc:to:from:subject:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=EYrG6X8uwp+bs8ww0EgrMry1siCkDx4rXslRN+kZ534=; b=uVs8xEqp0kr2ogsYOrcZPtzCu7XeRdDfXm3hTslbEy1sKLYVbECeIt19dj4KMK6dWi E/FwVVXFb23LYaGwW3sxdXLX+yXdORba1YqKen2SVeHhFYFB1CE0sZ3LF4ozy+L5reeH FNi4ztrBwj8jcpHL+sjAdM66CB16X2iQL35iLphD3fZutCOwZrtW7NZhFuDex4apxfWl vlLel8NMozctrfN6d1Twmq/xAp0nruqj1SfLvGLpIQCYRTgRMn5lZ6G4fFYz8lsR1cV4 grvUBOnNM6zN+Qv4QOrKW9aJ+UZheY6ydKVLaQy2Imgjf2ipySvUOEES5xloRGS4uZmK zjng== X-Gm-Message-State: AOJu0YyQRk3mnhhxl3t3oQPuCGJH3xT/St5pibKqdWTqixA4elG9bhT7 
YaNLtiyN151yk0fehsbp/M+Wppz25oZJxhRUKLw3Nd3VKWDz0Sey+Bf5zGCCE4nf7tUGbbEgzx8 = X-Gm-Gg: ASbGncsK66NpLmo4B1YzLIqgAzqpu9juXTGvF7nhYPQ5APpOoVOkea/+j71M+q/ZSu2 fW57O3VUycBIYIpCE68437kET1yQvLx+D4lsSb8GNwXFZiAx5eLpOlIg9kltynUmHPq+t4NrAXF hKsc+xIDkw5Ec2v/HeqntflGXQqL8SL429RoUDLhwee2n1lvnpjfdlEq7eGsfJsjELbcBKPv27D xhWdJruQWSdnuFzhs2Lxov9UbUgBt17ATSHYG68448+V+a+GSOx8G9yYdVH/98B7T7dF2HIyVGF YQfIGp9LmmsgBxHDUbmlDTQGtcp8l1aIoOqX2hI= X-Google-Smtp-Source: AGHT+IHkPNXTc+Yha08gY4NqV2a4G7lJgQZ5Wt4aKB4fhNh+fd/5wUpTc08FUCVRmN9GgaLUv4O4pw== X-Received: by 2002:a05:6000:18a3:b0:385:f5b6:9c9d with SMTP id ffacd0b85a97d-3864cea5696mr1748859f8f.33.1733912450680; Wed, 11 Dec 2024 02:20:50 -0800 (PST) Message-ID: Date: Wed, 11 Dec 2024 11:20:45 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: [PATCH v3 14/16] x86emul: support other AVX10.2 convert insns From: Jan Beulich To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , =?utf-8?q?Roger_Pau_Monn?= =?utf-8?q?=C3=A9?= References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com> Content-Language: en-US Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l 
IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com> Despite most of them being about conversion to BF8/HF8, they are still somewhat similar to various existing convert insns. Signed-off-by: Jan Beulich --- SDE: ??? --- v3: New. --- a/tools/tests/x86_emulator/evex-disp8.c +++ b/tools/tests/x86_emulator/evex-disp8.c @@ -719,8 +719,22 @@ static const struct test avx10_2_all[] = INSN(comxsd, f3, 0f, 2f, el, q, el), INSN(comxsh, f2, map5, 2f, el, fp16, el), INSN(comxss, f2, 0f, 2f, el, d, el), + INSN(cvt2ps2phx, 66, 0f38, 67, vl, d, vl), + INSN(cvtbiasph2bf8, , 0f38, 74, vl, fp16, vl), + INSN(cvtbiasph2bf8s, , map5, 74, vl, fp16, vl), + INSN(cvtbiasph2hf8, , map5, 18, vl, fp16, vl), + INSN(cvtbiasph2hf8s, , map5, 1b, vl, fp16, vl), + INSN(cvthf82ph, f2, map5, 1e, vl_2, b, vl), + INSN(cvtne2ph2bf8, f2, 0f38, 74, vl, fp16, vl), + INSN(cvtne2ph2bf8s, f2, map5, 74, vl, fp16, vl), + INSN(cvtne2ph2hf8, f2, map5, 18, vl, fp16, vl), + INSN(cvtne2ph2hf8s, f2, map5, 1b, vl, fp16, vl), INSN(cvtnebf162ibs, f2, map5, 69, vl, bf16, vl), INSN(cvtnebf162iubs, f2, map5, 6b, vl, bf16, vl), + INSN(cvtneph2bf8, f3, 0f38, 74, vl, fp16, vl), + INSN(cvtneph2bf8s, f3, map5, 74, vl, fp16, vl), + INSN(cvtneph2hf8, f3, map5, 18, vl, fp16, vl), + INSN(cvtneph2hf8s, f3, map5, 1b, vl, fp16, vl), INSN(cvtph2ibs, , 
map5, 69, vl, fp16, vl), INSN(cvtph2iubs, , map5, 6b, vl, fp16, vl), INSN(cvtps2ibs, 66, map5, 69, vl, d, vl), --- a/tools/tests/x86_emulator/predicates.c +++ b/tools/tests/x86_emulator/predicates.c @@ -1952,6 +1952,7 @@ static const struct evex { { { 0x64 }, 2, T, R, pfx_66, Wn, Ln }, /* vpblendm{d,q} */ { { 0x65 }, 2, T, R, pfx_66, Wn, Ln }, /* vblendmp{s,d} */ { { 0x66 }, 2, T, R, pfx_66, Wn, Ln }, /* vpblendm{b,w} */ + { { 0x67 }, 2, T, R, pfx_66, W0, Ln }, /* vcvt2ps2phx */ { { 0x68 }, 2, T, R, pfx_f2, Wn, Ln }, /* vp2intersect{d,q} */ { { 0x70 }, 2, T, R, pfx_66, W1, Ln }, /* vpshldvw */ { { 0x71 }, 2, T, R, pfx_66, Wn, Ln }, /* vpshldv{d,q} */ @@ -1959,6 +1960,9 @@ static const struct evex { { { 0x72 }, 2, T, R, pfx_f3, W1, Ln }, /* vcvtneps2bf16 */ { { 0x72 }, 2, T, R, pfx_f2, W1, Ln }, /* vcvtne2ps2bf16 */ { { 0x73 }, 2, T, R, pfx_66, Wn, Ln }, /* vpshrdv{d,q} */ + { { 0x74 }, 2, T, R, pfx_no, W0, Ln }, /* vcvtbiasph2bf8 */ + { { 0x74 }, 2, T, R, pfx_f3, W0, Ln }, /* vcvtneph2bf8 */ + { { 0x74 }, 2, T, R, pfx_f2, W0, Ln }, /* vcvtne2ph2bf8 */ { { 0x75 }, 2, T, R, pfx_66, Wn, Ln }, /* vpermi2{b,w} */ { { 0x76 }, 2, T, R, pfx_66, Wn, Ln }, /* vpermi2{d,q} */ { { 0x77 }, 2, T, R, pfx_66, Wn, Ln }, /* vpermi2p{s,d} */ @@ -2122,8 +2126,15 @@ static const struct evex { }, evex_map5[] = { { { 0x10 }, 2, T, R, pfx_f3, W0, LIG }, /* vmovsh */ { { 0x11 }, 2, T, W, pfx_f3, W0, LIG }, /* vmovsh */ + { { 0x18 }, 2, T, R, pfx_no, W0, Ln }, /* vcvtbiasph2hf8 */ + { { 0x18 }, 2, T, R, pfx_f3, W0, Ln }, /* vcvtneph2hf8 */ + { { 0x18 }, 2, T, R, pfx_f2, W0, Ln }, /* vcvtne2ph2hf8 */ + { { 0x1b }, 2, T, R, pfx_no, W0, Ln }, /* vcvtbiasph2hf8s */ + { { 0x1b }, 2, T, R, pfx_f3, W0, Ln }, /* vcvtneph2hf8s */ + { { 0x1b }, 2, T, R, pfx_f2, W0, Ln }, /* vcvtne2ph2hf8s */ { { 0x1d }, 2, T, R, pfx_66, W0, Ln }, /* vcvtps2phx */ { { 0x1d }, 2, T, R, pfx_no, W0, LIG }, /* vcvtss2sh */ + { { 0x1e }, 2, T, R, pfx_f2, W0, Ln }, /* vcvthf82ph */ { { 0x2a }, 2, T, R, pfx_f3, Wn, LIG }, /*
vcvtsi2sh */ { { 0x2c }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvttsh2si */ { { 0x2d }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvtsh2si */ @@ -2184,6 +2195,9 @@ static const struct evex { { { 0x6d }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvttsd2sis */ { { 0x6e }, 2, T, R, pfx_66, WIG, L0 }, /* vmovw */ { { 0x6e }, 2, T, R, pfx_f3, W0, L0 }, /* vmovw */ + { { 0x74 }, 2, T, R, pfx_no, W0, Ln }, /* vcvtbiasph2bf8s */ + { { 0x74 }, 2, T, R, pfx_f3, W0, Ln }, /* vcvtneph2bf8s */ + { { 0x74 }, 2, T, R, pfx_f2, W0, Ln }, /* vcvtne2ph2bf8s */ { { 0x78 }, 2, T, R, pfx_no, W0, Ln }, /* vcvttph2udq */ { { 0x78 }, 2, T, R, pfx_66, W0, Ln }, /* vcvttph2uqq */ { { 0x78 }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvttsh2usi */ --- a/xen/arch/x86/x86_emulate/decode.c +++ b/xen/arch/x86/x86_emulate/decode.c @@ -378,8 +378,10 @@ static const struct ext0f38_table { [0x62] = { .simd_size = simd_packed_int, .two_op = 1, .d8s = d8s_bw }, [0x63] = { .simd_size = simd_packed_int, .to_mem = 1, .two_op = 1, .d8s = d8s_bw }, [0x64 ... 0x66] = { .simd_size = simd_packed_int, .d8s = d8s_vl }, + [0x67] = { .simd_size = simd_other, .d8s = d8s_vl }, [0x68] = { .simd_size = simd_packed_int, .d8s = d8s_vl }, [0x70 ... 0x73] = { .simd_size = simd_packed_int, .d8s = d8s_vl }, + [0x74] = { .simd_size = simd_other, .d8s = d8s_vl }, [0x75 ... 
0x76] = { .simd_size = simd_packed_int, .d8s = d8s_vl }, [0x77] = { .simd_size = simd_packed_fp, .d8s = d8s_vl }, [0x78] = { .simd_size = simd_other, .two_op = 1 }, @@ -1445,6 +1447,15 @@ int x86emul_decode(struct x86_emulate_st s->simd_size = ext0f38_table[b].simd_size; if ( evex_encoded() ) { + switch ( b ) + { + case 0x74: /* vcvt{bias,ne,ne2}ph2bf8 */ + s->fp16 = true; + if ( s->evex.pfx != vex_f2 ) + d |= TwoOp; + break; + } + /* * VPMOVUS* are identical to VPMOVS* Disp8-scaling-wise, but * their attributes don't match those of the vex_66 encoded @@ -1592,6 +1603,23 @@ int x86emul_decode(struct x86_emulate_st switch ( b ) { + case 0x18: /* vcvt{bias,ne,ne2}ph2hf8 */ + case 0x1b: /* vcvt{bias,ne,ne2}ph2hf8s */ + case 0x74: /* vcvt{bias,ne,ne2}ph2bf8s */ + s->fp16 = true; + d = DstReg | SrcMem; + if ( s->evex.pfx != vex_f2 ) + d |= TwoOp; + s->simd_size = simd_other; + disp8scale = s->evex.brs ? 1 : 4 + s->evex.lr; + break; + + case 0x1e: /* vcvthf82ph */ + d = DstReg | SrcMem | TwoOp; + s->simd_size = simd_other; + disp8scale = 3 + s->evex.lr; + break; + case 0x78: case 0x79: /* vcvt{,t}ph2u{d,q}q need special casing */ --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -6269,6 +6269,29 @@ x86_emulate( } goto simd_zmm; + case X86EMUL_OPC_EVEX (0x0f38, 0x74): /* vcvtbiasph2bf8 [xyz]mm,[xyz]mm/mem{k} */ + case X86EMUL_OPC_EVEX_F3(0x0f38, 0x74): /* vcvtneph2bf8 [xyz]mm,[xyz]mm/mem{k} */ + case X86EMUL_OPC_EVEX_F2(0x0f38, 0x74): /* vcvtne2ph2bf8 [xyz]mm,[xyz]mm,[xyz]mm/mem{k} */ + case X86EMUL_OPC_EVEX ( 5, 0x18): /* vcvtbiasph2hf8 [xyz]mm,[xyz]mm/mem{k} */ + case X86EMUL_OPC_EVEX_F3( 5, 0x18): /* vcvtneph2hf8 [xyz]mm,[xyz]mm/mem{k} */ + case X86EMUL_OPC_EVEX_F2( 5, 0x18): /* vcvtne2ph2hf8 [xyz]mm,[xyz]mm,[xyz]mm/mem{k} */ + case X86EMUL_OPC_EVEX ( 5, 0x1b): /* vcvtbiasph2hf8s [xyz]mm,[xyz]mm/mem{k} */ + case X86EMUL_OPC_EVEX_F3( 5, 0x1b): /* vcvtneph2hf8s [xyz]mm,[xyz]mm/mem{k} */ + case X86EMUL_OPC_EVEX_F2( 5, 0x1b): /*
vcvtne2ph2hf8s [xyz]mm,[xyz]mm,[xyz]mm/mem{k} */ + case X86EMUL_OPC_EVEX ( 5, 0x74): /* vcvtbiasph2bf8s [xyz]mm,[xyz]mm/mem{k} */ + case X86EMUL_OPC_EVEX_F3( 5, 0x74): /* vcvtneph2bf8s [xyz]mm,[xyz]mm/mem{k} */ + case X86EMUL_OPC_EVEX_F2( 5, 0x74): /* vcvtne2ph2bf8s [xyz]mm,[xyz]mm,[xyz]mm/mem{k} */ + generate_exception_if(ea.type != OP_MEM && evex.brs, X86_EXC_UD); + /* fall through */ + case X86EMUL_OPC_EVEX_66(0x0f38, 0x67): /* vcvt2ps2phx [xyz]mm,[xyz]mm,[xyz]mm/mem{k} */ + generate_exception_if(evex.w, X86_EXC_UD); + vcpu_must_have(avx10, 2); + if ( ea.type != OP_REG || !evex.brs ) + avx512_vlen_check(false); + op_bytes = 16 << evex.lr; + fault_suppression = false; + goto simd_zmm; + case X86EMUL_OPC_EVEX_F2(0x0f38, 0x68): /* vp2intersect{d,q} [xyz]mm/mem,[xyz]mm,k+1 */ host_and_vcpu_must_have(avx512_vp2intersect); generate_exception_if(evex.opmsk || !evex.r || !evex.R, X86_EXC_UD); @@ -7965,6 +7988,14 @@ x86_emulate( generate_exception_if(evex.w, X86_EXC_UD); goto avx512f_all_fp; + case X86EMUL_OPC_EVEX_F2(5, 0x1e): /* vcvthf82ph [xyz]mm,[xyz]mm/mem{k} */ + generate_exception_if(evex.w || evex.brs, X86_EXC_UD); + vcpu_must_have(avx10, 2); + if ( ea.type != OP_REG || !evex.brs ) + avx512_vlen_check(false); + op_bytes = 8 << evex.lr; + goto simd_zmm; + case X86EMUL_OPC_EVEX_66(5, 0x42): /* vgetexppbf16 [xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(5, 0x51): /* vsqrtnepbf16 [xyz]mm/mem,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(5, 0x58): /* vaddnepbf16 [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ From patchwork Wed Dec 11 10:21:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 13903309 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate 
 requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 893BAE7717D for ; Wed, 11 Dec 2024 10:21:37 +0000 (UTC)
Message-ID: <6252b702-d204-4e10-acb9-9e8b3e06e5b4@suse.com>
Date: Wed, 11 Dec 2024 11:21:22 +0100
Subject: [PATCH v3 15/16] x86emul: support SIMD MOVRS
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>
In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>

As we ignore cacheability aspects of insns, they're treated like simple
VMOVs.

Signed-off-by: Jan Beulich
---
SDE: -???
---
v3: New.

--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -814,6 +814,13 @@ static const struct test avx10_2_128[] =
     INSN(movw, f3, map5, 7e, el, fp16, el),
 };

+static const struct test movrs_all[] = {
+    INSN(movrsb, f2, map5, 6f, vl, b, vl),
+    INSN(movrsd, f3, map5, 6f, vl, d_nb, vl),
+    INSN(movrsq, f3, map5, 6f, vl, q_nb, vl),
+    INSN(movrsw, f2, map5, 6f, vl, w, vl),
+};
+
 static const unsigned char vl_all[] = { VL_512, VL_128, VL_256 };
 static const unsigned char vl_128[] = { VL_128 };
 static const unsigned char vl_no128[] = { VL_512, VL_256 };
@@ -1236,6 +1243,10 @@ void evex_disp8_test(void *instr, struct

     run(cpu_has_avx10_2, avx10_2, all);
     run(cpu_has_avx10_2, avx10_2, 128);
+    if ( cpu_has_avx10_2 )
+    {
+        run(ctxt->addr_size == 64 && cpu_has_movrs, movrs, all);
+    }
 #undef run
 }
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -2195,6 +2195,8 @@ static const struct evex {
     { { 0x6d }, 2, T, R, pfx_f2, Wn, LIG }, /* vcvttsd2sis */
     { { 0x6e }, 2, T, R, pfx_66, WIG, L0 }, /* vmovw */
     { { 0x6e }, 2, T, R, pfx_f3, W0, L0 }, /* vmovw */
+    { { 0x6f }, 2, T, R, pfx_f3, Wn, Ln }, /* vmovrs{d,q} */
+    { { 0x6f }, 2, T, R, pfx_f2, Wn, Ln }, /* vmovrs{b,w} */
     { { 0x74 }, 2, T, R, pfx_no, W0, Ln }, /* vcvtbiasph2bf8s */
     { { 0x74 }, 2, T, R, pfx_f3, W0, Ln }, /* vcvtneph2bf8s */
     { { 0x74 }, 2, T, R, pfx_f2, W0, Ln }, /* vcvtne2ph2bf8s */
--- a/tools/tests/x86_emulator/x86-emulate.h
+++ b/tools/tests/x86_emulator/x86-emulate.h
@@ -201,6 +201,7 @@ void wrpkru(unsigned int val);
                              xcr0_mask(0xe6))
 #define cpu_has_cmpccxadd  cpu_policy.feat.cmpccxadd
 #define cpu_has_avx_ifma   (cpu_policy.feat.avx_ifma && xcr0_mask(6))
+#define cpu_has_movrs      cpu_policy.feat.movrs
 #define cpu_has_avx_vnni_int8 (cpu_policy.feat.avx_vnni_int8 && \
                                xcr0_mask(6))
 #define cpu_has_avx_ne_convert (cpu_policy.feat.avx_ne_convert && \
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -6298,6 +6298,17 @@ x86_emulate(
         op_bytes = 16 << evex.lr;
         goto avx512f_no_sae;

+    case X86EMUL_OPC_EVEX_F2(0x0f38, 0x6f): /* vmovrs{b,w} mem,[xyz]mm{k} */
+        elem_bytes = 1 << evex.w;
+        /* fall through */
+    case X86EMUL_OPC_EVEX_F3(0x0f38, 0x6f): /* vmovrs{d,q} mem,[xyz]mm{k} */
+        generate_exception_if(ea.type != OP_MEM || evex.brs, X86_EXC_UD);
+        vcpu_must_have(avx10, 2);
+        vcpu_must_have(movrs);
+        avx512_vlen_check(false);
+        op_bytes = 16 << evex.lr;
+        goto simd_zmm;
+
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x70): /* vpshldvw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(0x0f38, 0x72): /* vpshrdvw [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
         generate_exception_if(!evex.w, X86_EXC_UD);

From patchwork Wed Dec 11 10:22:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13903330
Received: from lists.xenproject.org
 (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 34DD9E7717D for ; Wed, 11 Dec 2024 10:34:00 +0000 (UTC)
Message-ID: <9da1258a-86dd-46fb-9d38-95a2c2f3d902@suse.com>
Date: Wed, 11 Dec 2024 11:22:01 +0100
Subject: [PATCH v3 16/16] x86emul: support AVX10.2 forms of SM4 insns
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>
In-Reply-To: <516b7f9a-048e-409d-8a4e-89aeb8ffacc4@suse.com>

Simply clone the VEX-encoded handling to cover the EVEX forms.

Signed-off-by: Jan Beulich
---
There's a TODO left due to lack of SDE support. Invoking the test would
fail at present, because SDE 9.44.0 advertises both AVX10.2 and SM4 while
not yet supporting the new EVEX encodings.
---
SDE: -???
---
v3: New.

--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -821,6 +821,11 @@ static const struct test movrs_all[] = {
     INSN(movrsw, f2, map5, 6f, vl, w, vl),
 };

+static const struct test sm4_all[] = {
+    INSN(sm4key4, f3, 0f38, da, vl, d_nb, vl),
+    INSN(sm4rnds4, f2, 0f38, da, vl, d_nb, vl),
+};
+
 static const unsigned char vl_all[] = { VL_512, VL_128, VL_256 };
 static const unsigned char vl_128[] = { VL_128 };
 static const unsigned char vl_no128[] = { VL_512, VL_256 };
@@ -1246,6 +1251,7 @@ void evex_disp8_test(void *instr, struct
     if ( cpu_has_avx10_2 )
     {
         run(ctxt->addr_size == 64 && cpu_has_movrs, movrs, all);
+        (void)sm4_all;//todo run(cpu_has_sm4, sm4, all);
     }
 #undef run
 }
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -2046,6 +2046,8 @@ static const struct evex {
     { { 0xd3 }, 2, T, R, pfx_no, W0, Ln }, /* vpdpwuuds */
     { { 0xd3 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpwusds */
     { { 0xd3 }, 2, T, R, pfx_f3, W0, Ln }, /* vpdpwsuds */
+    { { 0xda }, 2, T, R, pfx_f3, W0, Ln }, /* vsm4key4 */
+    { { 0xda }, 2, T, R, pfx_f2, W0, Ln }, /* vsm4rnds4 */
     { { 0xdc }, 2, T, R, pfx_66, WIG, Ln }, /* vaesenc */
     { { 0xdd }, 2, T, R, pfx_66, WIG, Ln }, /* vaesenclast */
     { { 0xde }, 2, T, R, pfx_66, WIG, Ln }, /* vaesdec */
--- a/xen/arch/x86/x86_emulate/decode.c
+++ b/xen/arch/x86/x86_emulate/decode.c
@@ -439,7 +439,7 @@ static const struct ext0f38_table {
     [0xd3] = { .simd_size = simd_other, .d8s = d8s_vl },
     [0xd6] = { .simd_size = simd_other, .d8s = d8s_vl },
     [0xd7] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
-    [0xda] = { .simd_size = simd_other },
+    [0xda] = { .simd_size = simd_other, .d8s = d8s_vl },
     [0xdb] = { .simd_size = simd_packed_int, .two_op = 1 },
     [0xdc ... 0xdf] = { .simd_size = simd_packed_int, .d8s = d8s_vl },
     [0xe0 ... 0xef] = { .to_mem = 1 },
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -6928,6 +6928,14 @@ x86_emulate(
         op_bytes = 16 << vex.l;
         goto simd_0f_ymm;

+    case X86EMUL_OPC_EVEX_F3(0x0f38, 0xda): /* vsm4key4 [xyz]mm/mem,[xyz]mm,[xyz]mm */
+    case X86EMUL_OPC_EVEX_F2(0x0f38, 0xda): /* vsm4rnds4 [xyz]mm/mem,[xyz]mm,[xyz]mm */
+        vcpu_must_have(avx10, 2);
+        host_and_vcpu_must_have(sm4);
+        generate_exception_if(evex.w || evex.brs || evex.opmsk, X86_EXC_UD);
+        op_bytes = 16 << evex.lr;
+        goto simd_zmm;
+
     case X86EMUL_OPC_VEX_66(0x0f38, 0xdc): /* vaesenc {x,y}mm/mem,{x,y}mm,{x,y}mm */
     case X86EMUL_OPC_VEX_66(0x0f38, 0xdd): /* vaesenclast {x,y}mm/mem,{x,y}mm,{x,y}mm */
     case X86EMUL_OPC_VEX_66(0x0f38, 0xde): /* vaesdec {x,y}mm/mem,{x,y}mm,{x,y}mm */