From patchwork Wed Jun 15 10:27:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 12882070 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 08ED6C433EF for ; Wed, 15 Jun 2022 10:27:49 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.349909.576098 (Exim 4.92) (envelope-from ) id 1o1QFO-0002wl-42; Wed, 15 Jun 2022 10:27:38 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 349909.576098; Wed, 15 Jun 2022 10:27:38 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1o1QFO-0002we-0c; Wed, 15 Jun 2022 10:27:38 +0000 Received: by outflank-mailman (input) for mailman id 349909; Wed, 15 Jun 2022 10:27:36 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1o1QFM-0002mz-Ed for xen-devel@lists.xenproject.org; Wed, 15 Jun 2022 10:27:36 +0000 Received: from EUR03-VE1-obe.outbound.protection.outlook.com (mail-ve1eur03on060b.outbound.protection.outlook.com [2a01:111:f400:fe09::60b]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id c5df4b41-ec95-11ec-bd2c-47488cf2e6aa; Wed, 15 Jun 2022 12:27:35 +0200 (CEST) Received: from VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) by DB7PR04MB5113.eurprd04.prod.outlook.com (2603:10a6:10:14::27) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5332.22; Wed, 15 Jun 2022 
10:27:34 +0000 Received: from VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::dfa:a64a:432f:e26b]) by VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::dfa:a64a:432f:e26b%7]) with mapi id 15.20.5332.020; Wed, 15 Jun 2022 10:27:34 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: c5df4b41-ec95-11ec-bd2c-47488cf2e6aa ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=IlA/jvSaQ8VNS/FZXyxJMY6+SLIRHz2cZ+YJdpzHH86yvIK5MoOhEMFAxjcUydKIWUx7aFXzRCSbawIWW2VuGHdMdSMxzCGy8IA9k1M+GiQSKuqmGH8TYNN++ffX9c2pZwZDa6K8wFqWitqMWtr7DBHUhFqfrZIPFVBZnug2HQryTDIGVScfeqT4apMuoxm0UoXIRb0EyeHDYXvgr6UfcgdWhbtLXkc+/QGBUfjBPuv+sPsRdLvAUv2jxJU8lfZeMaGn/068etCAUby35OwVFm2Db3zZ+CyQgr4rtvUPvyFb1qFDeZNj5PJELrfxFQUKcuDYlC79+3hHDnmn2Rc0lg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=AhN1mnRm3C+eBZ3ZOnL1xkngzkdjxzdw7LWEKU7f9gk=; b=iR5+piM3XauZ4tYzNJtB9+SdnfLP8JItA70U+hn7bVySDA3F0Mgso7Fz/RlQ/8392a5lZntCENjbDbs9X9ZGPLZD5Y/9MI4tLqhDPOo2OQ5K59bNOrol+VtB3yrRc0bErAjUPKkH77BC4hxSORLo7hQQHdnPy4csMASvfd73XiXmAc0D1n0jpT/CiEMfJpcOuLzvil3RAaujJHFYj80zf6IR7ECitGmwbvw4x+SG4y7KbwBnaglbnB1IbJa1S8JVdp7fdl2Y9UJ1GRwCpatDxz/hNKSO2xbpKOhL5g5sWWTxDmBoWFP+RBOnsAUkzVaCvKb/KhhVT2nO3z2loBelgQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=AhN1mnRm3C+eBZ3ZOnL1xkngzkdjxzdw7LWEKU7f9gk=; 
b=OVM2p9s4vA/KHto0bZpl2oZteOOiJwRge3OaRkpXS1RfLFc4MYjSkZSxKfhhHKab2bUd+WhfgOCOkARD/phGdoqcVHWbgqFmzfwi77YwGCyM2Cm5vxUOG2BXqfr7f/exFZMfbWhicqd/ECzZQ0w0NYrvKLqeZGj7Wf6fMkB7G3roQM512zxdMYLnSU7wZS/gT6/9QQiGTDodmG9CdSvhDTYwRTAJUAHitilylaYV6hqDXrQVw3fnSCOPMt7jAqDh6LpE02SBIOY2VBUt4rJ0gcHJ4Sgb+KJYavH9nV6B2FWyx4rEF4SCYwUVGpzgxde6HeARaDZ/i8mU6Tgkjzth4Q== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com; Message-ID: <5dc84e7b-e3d6-92cf-8ffb-c4bc0a3e6c74@suse.com> Date: Wed, 15 Jun 2022 12:27:32 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Subject: [PATCH 01/11] x86/CPUID: AVX512-FP16 definitions Content-Language: en-US From: Jan Beulich To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , Wei Liu , =?utf-8?q?Roger_Pau_Monn=C3=A9?= References: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com> In-Reply-To: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com> X-ClientProxiedBy: AS8PR04CA0047.eurprd04.prod.outlook.com (2603:10a6:20b:312::22) To VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 36073096-f08b-4e27-b831-08da4eb9a8e8 X-MS-TrafficTypeDiagnostic: DB7PR04MB5113:EE_ X-Microsoft-Antispam-PRVS: X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
X-OriginatorOrg: suse.com X-MS-Exchange-CrossTenant-Network-Message-Id: 36073096-f08b-4e27-b831-08da4eb9a8e8 X-MS-Exchange-CrossTenant-AuthSource: VE1PR04MB6560.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Jun 2022 10:27:33.9055 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 
i6QlyTTOekx/5tOquLxD+l1jfCdlm4NpYA4e6nRZRGEW/fEz3KUfciYc2OBYAR9rBw1Y4tNDV0c3M4WYtl+JoA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB7PR04MB5113 Signed-off-by: Jan Beulich Reviewed-by: Andrew Cooper --- a/tools/libs/light/libxl_cpuid.c +++ b/tools/libs/light/libxl_cpuid.c @@ -221,6 +221,7 @@ int libxl_cpuid_parse_config(libxl_cpuid {"serialize", 0x00000007, 0, CPUID_REG_EDX, 14, 1}, {"tsxldtrk", 0x00000007, 0, CPUID_REG_EDX, 16, 1}, {"cet-ibt", 0x00000007, 0, CPUID_REG_EDX, 20, 1}, + {"avx512-fp16", 0x00000007, 0, CPUID_REG_EDX, 23, 1}, {"ibrsb", 0x00000007, 0, CPUID_REG_EDX, 26, 1}, {"stibp", 0x00000007, 0, CPUID_REG_EDX, 27, 1}, {"l1d-flush", 0x00000007, 0, CPUID_REG_EDX, 28, 1}, --- a/tools/misc/xen-cpuid.c +++ b/tools/misc/xen-cpuid.c @@ -175,6 +175,7 @@ static const char *const str_7d0[32] = [16] = "tsxldtrk", [18] = "pconfig", [20] = "cet-ibt", + /* 22 */ [23] = "avx512-fp16", [26] = "ibrsb", [27] = "stibp", [28] = "l1d-flush", [29] = "arch-caps", --- a/xen/arch/x86/include/asm/cpufeature.h +++ b/xen/arch/x86/include/asm/cpufeature.h @@ -138,6 +138,7 @@ #define cpu_has_rtm_always_abort boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) #define cpu_has_tsx_force_abort boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT) #define cpu_has_serialize boot_cpu_has(X86_FEATURE_SERIALIZE) +#define cpu_has_avx512_fp16 boot_cpu_has(X86_FEATURE_AVX512_FP16) #define cpu_has_arch_caps boot_cpu_has(X86_FEATURE_ARCH_CAPS) /* CPUID level 0x00000007:1.eax */ --- a/xen/include/public/arch-x86/cpufeatureset.h +++ b/xen/include/public/arch-x86/cpufeatureset.h @@ -281,6 +281,7 @@ XEN_CPUFEATURE(TSX_FORCE_ABORT, 9*32+13) XEN_CPUFEATURE(SERIALIZE, 9*32+14) /*A SERIALIZE insn */ XEN_CPUFEATURE(TSXLDTRK, 9*32+16) /*a TSX load tracking suspend/resume insns */ XEN_CPUFEATURE(CET_IBT, 9*32+20) /* CET - Indirect Branch Tracking */ +XEN_CPUFEATURE(AVX512_FP16, 9*32+23) /* AVX512 FP16 instructions */ XEN_CPUFEATURE(IBRSB, 9*32+26) /*A IBRS and IBPB support (used by Intel) */ XEN_CPUFEATURE(STIBP, 
9*32+27) /*A STIBP */ XEN_CPUFEATURE(L1D_FLUSH, 9*32+28) /*S MSR_FLUSH_CMD and L1D flush. */ --- a/xen/tools/gen-cpuid.py +++ b/xen/tools/gen-cpuid.py @@ -267,7 +267,8 @@ def crunch_numbers(state): # AVX512 extensions acting on vectors of bytes/words are made # dependents of AVX512BW (as to requiring wider than 16-bit mask # registers), despite the SDM not formally making this connection. - AVX512BW: [AVX512_VBMI, AVX512_VBMI2, AVX512_BITALG, AVX512_BF16], + AVX512BW: [AVX512_VBMI, AVX512_VBMI2, AVX512_BITALG, AVX512_BF16, + AVX512_FP16], # Extensions with VEX/EVEX encodings keyed to a separate feature # flag are made dependents of their respective legacy feature. From patchwork Wed Jun 15 10:27:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 12882071 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 38B55C433EF for ; Wed, 15 Jun 2022 10:28:11 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.349913.576109 (Exim 4.92) (envelope-from ) id 1o1QFj-0003PN-F9; Wed, 15 Jun 2022 10:27:59 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 349913.576109; Wed, 15 Jun 2022 10:27:59 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1o1QFj-0003PG-C3; Wed, 15 Jun 2022 10:27:59 +0000 Received: by outflank-mailman (input) for mailman id 349913; Wed, 15 Jun 2022 10:27:58 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by 
lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1o1QFi-0002mz-4d for xen-devel@lists.xenproject.org; Wed, 15 Jun 2022 10:27:58 +0000 Received: from EUR03-VE1-obe.outbound.protection.outlook.com (mail-ve1eur03on062f.outbound.protection.outlook.com [2a01:111:f400:fe09::62f]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id d2523f9f-ec95-11ec-bd2c-47488cf2e6aa; Wed, 15 Jun 2022 12:27:56 +0200 (CEST) Received: from VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) by DB7PR04MB5113.eurprd04.prod.outlook.com (2603:10a6:10:14::27) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5332.22; Wed, 15 Jun 2022 10:27:55 +0000 Received: from VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::dfa:a64a:432f:e26b]) by VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::dfa:a64a:432f:e26b%7]) with mapi id 15.20.5332.020; Wed, 15 Jun 2022 10:27:55 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: d2523f9f-ec95-11ec-bd2c-47488cf2e6aa ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=XPvaag71N3Mp7UF8BeQYhXVEohUcleGNtabJ/DBkkn9nOa3+zlTGIF1qyCCdSZiSaZWxshYlRJRA030QKx5aSWZem5dnzWG0eHo2CWt14lzC3I2XzavoYEtIpcAwxSyPRowBApi79BtquNd5LK+SclIBNW40wdm0A2C2XrMRhmrBfQYIuPev91jLZLdpBqm/cE44FSbPl5alTLPidqc5afSqLMC07dUiqxsXur1Wepnhtk/lM+k6X+cLnBb1JxcMmdni1MdU493MRBYmmMR185/zOK/ifLHMPMWqYHMBZKPXrBEdmlwu8amcAwT1jxDuKMFEpxtIWWrR2cdEnE5jpQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=CbWxrA9PT3dW6Hbw/pnjIBabWsyLSFj6ASuTNtk5o8o=; 
b=BRR4xEJwaNA7tIk6qkOl1ronydqJWZyqvgwYQv9gK4j3H3Iy7os8g8LlUU2bo+nVDI1w6yJg6ciiD2gZPECT17BlfHBFnMGkdzGh20srAGVclC2/Gc4QU9qo1MQZl3g9HAqMB5LxlurFFnxMbo5RnyOba1/4uGWn9f18I+liWkqC2RLnGXtggfNmEeJoPpy3xQWiNegi1tS98fHpVdoddfs4nUl0g0sDMSsLmt02cUaBHgGdlCsb8VXH6Cvj7VU8MEzdm2pV9YVp6FZVg+4FKszzAr+Jrp1gkCo4irBh3PMDpPhj6T2RiJUAkcedF3MGPR+a8FKzOeqdFktzZI5i3w== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=CbWxrA9PT3dW6Hbw/pnjIBabWsyLSFj6ASuTNtk5o8o=; b=O1RA3NZLNLLT6+G8hs2X7eVy4+9jO6x06dZ1LU4fZmb0Ux5cOyirz3CQUnBLFZE9urH+dJmqqsnAKJA3DzhkSNI5otetVSemp0H1yeD9KqgJRnGd5R5Pk9npWcGq2XJOff52EAiQf5bc3qcIMFmBwtbMCLznewwsU3xJB7etVfS0ikec1IqeyAkCIeybh68LlBRsd9Hva8MVI9NchTOlQLmIFMugyxe2UHUxz+5ql675yxd5mUlCFodZIbady1odouQY27aktWaMhXpQBElxN+hZmZUiAnemurKtW9i41dk8qq15ELnlnsYwiuvs2XWvPiuj54/7GbflGcMwDc78iQ== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com; Message-ID: Date: Wed, 15 Jun 2022 12:27:53 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Subject: [PATCH 02/11] x86emul: handle AVX512-FP16 insns encoded in 0f3a opcode map Content-Language: en-US From: Jan Beulich To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , Wei Liu , =?utf-8?q?Roger_Pau_Monn=C3=A9?= References: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com> In-Reply-To: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com> X-ClientProxiedBy: AS8PR04CA0039.eurprd04.prod.outlook.com (2603:10a6:20b:312::14) To VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: f4e97241-d705-4c4f-9d4a-08da4eb9b5af X-MS-TrafficTypeDiagnostic: 
DB7PR04MB5113:EE_ X-MS-Exchange-SenderADCheck: 1 X-OriginatorOrg: suse.com X-MS-Exchange-CrossTenant-Network-Message-Id: f4e97241-d705-4c4f-9d4a-08da4eb9b5af X-MS-Exchange-CrossTenant-AuthSource: VE1PR04MB6560.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Jun 2022 10:27:55.1854 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted 
X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: yQBjhzZvWC4yAmUQ2LCi67pFP2oRJZ7Q6QQdBESGrVmvYjmSJKW3F2OuMoB6eN/ZJhgJiL0IPAhAqxwl8AdGRg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB7PR04MB5113 In order to re-use (also in subsequent patches) existing code and tables as much as possible, simply introduce a new boolean field in emulator state indicating whether an insn is one with a half-precision source. Everything else then follows "naturally". Signed-off-by: Jan Beulich --- SDE: -spr or -future --- a/tools/tests/x86_emulator/evex-disp8.c +++ b/tools/tests/x86_emulator/evex-disp8.c @@ -76,6 +76,7 @@ enum esz { ESZ_b, ESZ_w, ESZ_bw, + ESZ_fp16, }; #ifndef __i386__ @@ -601,6 +602,19 @@ static const struct test avx512_vpopcntd INSN(popcnt, 66, 0f38, 55, vl, dq, vl) }; +static const struct test avx512_fp16_all[] = { + INSN(cmpph, , 0f3a, c2, vl, fp16, vl), + INSN(cmpsh, f3, 0f3a, c2, el, fp16, el), + INSN(fpclassph, , 0f3a, 66, vl, fp16, vl), + INSN(fpclasssh, , 0f3a, 67, el, fp16, el), + INSN(getmantph, , 0f3a, 26, vl, fp16, vl), + INSN(getmantsh, , 0f3a, 27, el, fp16, el), + INSN(reduceph, , 0f3a, 56, vl, fp16, vl), + INSN(reducesh, , 0f3a, 57, el, fp16, el), + INSN(rndscaleph, , 0f3a, 08, vl, fp16, vl), + INSN(rndscalesh, , 0f3a, 0a, el, fp16, el), +}; + static const struct test gfni_all[] = { INSN(gf2p8affineinvqb, 66, 0f3a, cf, vl, q, vl), INSN(gf2p8affineqb, 66, 0f3a, ce, vl, q, vl), @@ -728,8 +742,10 @@ static void test_one(const struct test * break; case ESZ_w: - esz = 2; evex.w = 1; + /* fall through */ + case ESZ_fp16: + esz = 2; break; #ifdef __i386__ @@ -845,7 +861,7 @@ static void test_one(const struct test * case ESZ_b: case ESZ_w: case ESZ_bw: return; - case ESZ_d: case ESZ_q: + case ESZ_d: case ESZ_q: case ESZ_fp16: break; default: @@ -1002,6 +1018,7 @@ void evex_disp8_test(void *instr, struct RUN(avx512_vnni, all); 
RUN(avx512_vp2intersect, all); RUN(avx512_vpopcntdq, all); + RUN(avx512_fp16, all); if ( cpu_has_avx512f ) { --- a/tools/tests/x86_emulator/predicates.c +++ b/tools/tests/x86_emulator/predicates.c @@ -1972,8 +1972,10 @@ static const struct evex { { { 0x03 }, 3, T, R, pfx_66, Wn, Ln }, /* valign{d,q} */ { { 0x04 }, 3, T, R, pfx_66, W0, Ln }, /* vpermilps */ { { 0x05 }, 3, T, R, pfx_66, W1, Ln }, /* vpermilpd */ + { { 0x08 }, 3, T, R, pfx_no, W0, Ln }, /* vrndscaleph */ { { 0x08 }, 3, T, R, pfx_66, W0, Ln }, /* vrndscaleps */ { { 0x09 }, 3, T, R, pfx_66, W1, Ln }, /* vrndscalepd */ + { { 0x0a }, 3, T, R, pfx_no, W0, LIG }, /* vrndscalesh */ { { 0x0a }, 3, T, R, pfx_66, W0, LIG }, /* vrndscaless */ { { 0x0b }, 3, T, R, pfx_66, W1, LIG }, /* vrndscalesd */ { { 0x0f }, 3, T, R, pfx_66, WIG, Ln }, /* vpalignr */ @@ -1993,7 +1995,9 @@ static const struct evex { { { 0x22 }, 3, T, R, pfx_66, Wn, L0 }, /* vpinsr{d,q} */ { { 0x23 }, 3, T, R, pfx_66, Wn, L1|L2 }, /* vshuff{32x4,64x2} */ { { 0x25 }, 3, T, R, pfx_66, Wn, Ln }, /* vpternlog{d,q} */ + { { 0x26 }, 3, T, R, pfx_no, W0, Ln }, /* vgetmantph */ { { 0x26 }, 3, T, R, pfx_66, Wn, Ln }, /* vgetmantp{s,d} */ + { { 0x27 }, 3, T, R, pfx_no, W0, LIG }, /* vgetmantsh */ { { 0x27 }, 3, T, R, pfx_66, Wn, LIG }, /* vgetmants{s,d} */ { { 0x38 }, 3, T, R, pfx_66, Wn, L1|L2 }, /* vinserti{32x4,64x2} */ { { 0x39 }, 3, T, W, pfx_66, Wn, L1|L2 }, /* vextracti{32x4,64x2} */ @@ -2008,14 +2012,20 @@ static const struct evex { { { 0x51 }, 3, T, R, pfx_66, Wn, LIG }, /* vranges{s,d} */ { { 0x54 }, 3, T, R, pfx_66, Wn, Ln }, /* vfixupimmp{s,d} */ { { 0x55 }, 3, T, R, pfx_66, Wn, LIG }, /* vfixumpimms{s,d} */ + { { 0x56 }, 3, T, R, pfx_no, W0, Ln }, /* vreduceph */ { { 0x56 }, 3, T, R, pfx_66, Wn, Ln }, /* vreducep{s,d} */ + { { 0x57 }, 3, T, R, pfx_no, W0, LIG }, /* vreducesh */ { { 0x57 }, 3, T, R, pfx_66, Wn, LIG }, /* vreduces{s,d} */ + { { 0x66 }, 3, T, R, pfx_no, W0, Ln }, /* vfpclassph */ { { 0x66 }, 3, T, R, pfx_66, Wn, Ln }, /* 
vfpclassp{s,d} */ + { { 0x67 }, 3, T, R, pfx_no, W0, LIG }, /* vfpclasssh */ { { 0x67 }, 3, T, R, pfx_66, Wn, LIG }, /* vfpclasss{s,d} */ { { 0x70 }, 3, T, R, pfx_66, W1, Ln }, /* vshldw */ { { 0x71 }, 3, T, R, pfx_66, Wn, Ln }, /* vshld{d,q} */ { { 0x72 }, 3, T, R, pfx_66, W1, Ln }, /* vshrdw */ { { 0x73 }, 3, T, R, pfx_66, Wn, Ln }, /* vshrd{d,q} */ + { { 0xc2 }, 3, T, R, pfx_no, W0, Ln }, /* vcmpph */ + { { 0xc2 }, 3, T, R, pfx_f3, W0, LIG }, /* vcmpsh */ { { 0xce }, 3, T, R, pfx_66, W1, Ln }, /* vgf2p8affineqb */ { { 0xcf }, 3, T, R, pfx_66, W1, Ln }, /* vgf2p8affineinvqb */ }; --- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -4674,6 +4674,44 @@ int main(int argc, char **argv) else printf("skipped\n"); + printf("%-40s", "Testing vfpclassphz $0x46,128(%ecx),%k3..."); + if ( stack_exec && cpu_has_avx512_fp16 ) + { + decl_insn(vfpclassph); + + asm volatile ( put_insn(vfpclassph, + /* 0x46: check for +/- 0 and neg. */ + /* vfpclassphz $0x46, 128(%0), %%k3 */ + ".byte 0x62, 0xf3, 0x7c, 0x48\n\t" + ".byte 0x66, 0x59, 0x02, 0x46") + :: "c" (NULL) ); + + set_insn(vfpclassph); + for ( i = 0; i < 3; ++i ) + { + res[16 + i * 5 + 0] = 0x7fff0000; /* +0 / +NaN */ + res[16 + i * 5 + 1] = 0xffff8000; /* -0 / -NaN */ + res[16 + i * 5 + 2] = 0x80010001; /* +DEN / -DEN */ + res[16 + i * 5 + 3] = 0xfc00f800; /* -FIN / -INF */ + res[16 + i * 5 + 4] = 0x7c007800; /* +FIN / +INF */ + } + res[31] = 0; + regs.ecx = (unsigned long)res - 64; + rc = x86_emulate(&ctxt, &emulops); + if ( rc != X86EMUL_OKAY || !check_eip(vfpclassph) ) + goto fail; + asm volatile ( "kmovd %%k3, %0" : "=g" (rc) ); + /* + * 0b11(0001100101)*3 + * 0b1100_0110_0101_0001_1001_0100_0110_0101 + */ + if ( rc != 0xc6519465 ) + goto fail; + printf("okay\n"); + } + else + printf("skipped\n"); + /* * The following compress/expand tests are not only making sure the * accessed data is correct, but they also verify (by placing operands --- 
a/tools/tests/x86_emulator/x86-emulate.h +++ b/tools/tests/x86_emulator/x86-emulate.h @@ -182,6 +182,7 @@ void wrpkru(unsigned int val); #define cpu_has_avx512_4fmaps (cp.feat.avx512_4fmaps && xcr0_mask(0xe6)) #define cpu_has_avx512_vp2intersect (cp.feat.avx512_vp2intersect && xcr0_mask(0xe6)) #define cpu_has_serialize cp.feat.serialize +#define cpu_has_avx512_fp16 (cp.feat.avx512_fp16 && xcr0_mask(0xe6)) #define cpu_has_avx_vnni (cp.feat.avx_vnni && xcr0_mask(6)) #define cpu_has_avx512_bf16 (cp.feat.avx512_bf16 && xcr0_mask(0xe6)) --- a/xen/arch/x86/x86_emulate/decode.c +++ b/xen/arch/x86/x86_emulate/decode.c @@ -518,6 +518,7 @@ static const struct ext0f3a_table { [0x7a ... 0x7b] = { .simd_size = simd_scalar_opc, .four_op = 1 }, [0x7c ... 0x7d] = { .simd_size = simd_packed_fp, .four_op = 1 }, [0x7e ... 0x7f] = { .simd_size = simd_scalar_opc, .four_op = 1 }, + [0xc2] = { .simd_size = simd_any_fp, .d8s = d8s_vl }, [0xcc] = { .simd_size = simd_other }, [0xce ... 0xcf] = { .simd_size = simd_packed_int, .d8s = d8s_vl }, [0xdf] = { .simd_size = simd_packed_int, .two_op = 1 }, @@ -579,7 +580,7 @@ static unsigned int decode_disp8scale(en if ( s->evex.brs ) { case d8s_dq: - return 2 + s->evex.w; + return 1 + !s->fp16 + s->evex.w; } break; @@ -596,7 +597,7 @@ static unsigned int decode_disp8scale(en /* fall through */ case simd_scalar_opc: case simd_scalar_vexw: - return 2 + s->evex.w; + return 1 + !s->fp16 + s->evex.w; case simd_128: /* These should have an explicit size specified. 
*/ @@ -1417,7 +1418,29 @@ int x86emul_decode(struct x86_emulate_st */ s->simd_size = ext0f3a_table[b].simd_size; if ( evex_encoded() ) + { + switch ( b ) + { + case 0x08: /* vrndscaleph */ + case 0x0a: /* vrndscalesh */ + case 0x26: /* vgetmantph */ + case 0x27: /* vgetmantsh */ + case 0x56: /* vreduceph */ + case 0x57: /* vreducesh */ + case 0x66: /* vfpclassph */ + case 0x67: /* vfpclasssh */ + if ( !s->evex.pfx ) + s->fp16 = true; + break; + + case 0xc2: /* vcmp{p,s}h */ + if ( !(s->evex.pfx & VEX_PREFIX_DOUBLE_MASK) ) + s->fp16 = true; + break; + } + disp8scale = decode_disp8scale(ext0f3a_table[b].d8s, s); + } break; case ext_8f09: @@ -1712,7 +1735,7 @@ int x86emul_decode(struct x86_emulate_st break; case vex_f3: generate_exception_if(evex_encoded() && s->evex.w, X86_EXC_UD); - s->op_bytes = 4; + s->op_bytes = 4 >> s->fp16; break; case vex_f2: generate_exception_if(evex_encoded() && !s->evex.w, X86_EXC_UD); @@ -1722,11 +1745,11 @@ int x86emul_decode(struct x86_emulate_st break; case simd_scalar_opc: - s->op_bytes = 4 << (ctxt->opcode & 1); + s->op_bytes = 2 << (!s->fp16 + (ctxt->opcode & 1)); break; case simd_scalar_vexw: - s->op_bytes = 4 << s->vex.w; + s->op_bytes = 2 << (!s->fp16 + s->vex.w); break; case simd_128: --- a/xen/arch/x86/x86_emulate/private.h +++ b/xen/arch/x86/x86_emulate/private.h @@ -304,6 +304,7 @@ struct x86_emulate_state { bool lock_prefix; bool not_64bit; /* Instruction not available in 64bit. */ bool fpu_ctrl; /* Instruction is an FPU control one. */ + bool fp16; /* Instruction has half-precision FP source operand. 
*/ opcode_desc_t desc; union vex vex; union evex evex; @@ -590,6 +591,7 @@ amd_like(const struct x86_emulate_ctxt * #define vcpu_has_avx512_vp2intersect() (ctxt->cpuid->feat.avx512_vp2intersect) #define vcpu_has_serialize() (ctxt->cpuid->feat.serialize) #define vcpu_has_tsxldtrk() (ctxt->cpuid->feat.tsxldtrk) +#define vcpu_has_avx512_fp16() (ctxt->cpuid->feat.avx512_fp16) #define vcpu_has_avx_vnni() (ctxt->cpuid->feat.avx_vnni) #define vcpu_has_avx512_bf16() (ctxt->cpuid->feat.avx512_bf16) --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -1306,7 +1306,7 @@ x86_emulate( b = ctxt->opcode; d = state.desc; #define state (&state) - elem_bytes = 4 << evex.w; + elem_bytes = 2 << (!state->fp16 + evex.w); generate_exception_if(state->not_64bit && mode_64bit(), EXC_UD); @@ -7147,6 +7147,15 @@ x86_emulate( avx512_vlen_check(b & 2); goto simd_imm8_zmm; + case X86EMUL_OPC_EVEX(0x0f3a, 0x0a): /* vrndscalesh $imm8,xmm/mem,xmm,xmm{k} */ + generate_exception_if(ea.type != OP_REG && evex.brs, EXC_UD); + /* fall through */ + case X86EMUL_OPC_EVEX(0x0f3a, 0x08): /* vrndscaleph $imm8,[xyz]mm/mem,[xyz]mm{k} */ + host_and_vcpu_must_have(avx512_fp16); + generate_exception_if(evex.w, EXC_UD); + avx512_vlen_check(b & 2); + goto simd_imm8_zmm; + #endif /* X86EMUL_NO_SIMD */ CASE_SIMD_PACKED_INT(0x0f3a, 0x0f): /* palignr $imm8,{,x}mm/mem,{,x}mm */ @@ -7457,6 +7466,14 @@ x86_emulate( avx512_vlen_check(false); goto simd_imm8_zmm; + case X86EMUL_OPC_EVEX(0x0f3a, 0x26): /* vgetmantph $imm8,[xyz]mm/mem,[xyz]mm{k} */ + case X86EMUL_OPC_EVEX(0x0f3a, 0x56): /* vreduceph $imm8,[xyz]mm/mem,[xyz]mm{k} */ + host_and_vcpu_must_have(avx512_fp16); + generate_exception_if(evex.w, EXC_UD); + if ( ea.type != OP_REG || !evex.brs ) + avx512_vlen_check(false); + goto simd_imm8_zmm; + case X86EMUL_OPC_EVEX_66(0x0f3a, 0x51): /* vranges{s,d} $imm8,xmm/mem,xmm,xmm{k} */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x57): /* vreduces{s,d} $imm8,xmm/mem,xmm,xmm{k} */ 
host_and_vcpu_must_have(avx512dq); @@ -7469,6 +7486,16 @@ x86_emulate( avx512_vlen_check(true); goto simd_imm8_zmm; + case X86EMUL_OPC_EVEX(0x0f3a, 0x27): /* vgetmantsh $imm8,xmm/mem,xmm,xmm{k} */ + case X86EMUL_OPC_EVEX(0x0f3a, 0x57): /* vreducesh $imm8,xmm/mem,xmm,xmm{k} */ + host_and_vcpu_must_have(avx512_fp16); + generate_exception_if(evex.w, EXC_UD); + if ( !evex.brs ) + avx512_vlen_check(true); + else + generate_exception_if(ea.type != OP_REG, EXC_UD); + goto simd_imm8_zmm; + case X86EMUL_OPC_VEX_66(0x0f3a, 0x30): /* kshiftr{b,w} $imm8,k,k */ case X86EMUL_OPC_VEX_66(0x0f3a, 0x32): /* kshiftl{b,w} $imm8,k,k */ if ( !vex.w ) @@ -7632,6 +7659,16 @@ x86_emulate( avx512_vlen_check(true); goto simd_imm8_zmm; + case X86EMUL_OPC_EVEX(0x0f3a, 0x66): /* vfpclassph $imm8,[xyz]mm/mem,k{k} */ + case X86EMUL_OPC_EVEX(0x0f3a, 0x67): /* vfpclasssh $imm8,xmm/mem,k{k} */ + host_and_vcpu_must_have(avx512_fp16); + generate_exception_if(evex.w || !evex.r || !evex.R || evex.z, EXC_UD); + if ( !(b & 1) ) + goto avx512f_imm8_no_sae; + generate_exception_if(evex.brs, EXC_UD); + avx512_vlen_check(true); + goto simd_imm8_zmm; + case X86EMUL_OPC_EVEX_66(0x0f3a, 0x70): /* vpshldw $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ case X86EMUL_OPC_EVEX_66(0x0f3a, 0x72): /* vpshrdw $imm8,[xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ generate_exception_if(!evex.w, EXC_UD); @@ -7642,6 +7679,16 @@ x86_emulate( host_and_vcpu_must_have(avx512_vbmi2); goto avx512f_imm8_no_sae; + case X86EMUL_OPC_EVEX_F3(0x0f3a, 0xc2): /* vcmpsh $imm8,xmm/mem,xmm,k{k} */ + generate_exception_if(ea.type != OP_REG && evex.brs, EXC_UD); + /* fall through */ + case X86EMUL_OPC_EVEX(0x0f3a, 0xc2): /* vcmpph $imm8,[xyz]mm/mem,[xyz]mm,k{k} */ + host_and_vcpu_must_have(avx512_fp16); + generate_exception_if(evex.w || !evex.r || !evex.R || evex.z, EXC_UD); + if ( ea.type != OP_REG || !evex.brs ) + avx512_vlen_check(evex.pfx & VEX_PREFIX_SCALAR_MASK); + goto simd_imm8_zmm; + case X86EMUL_OPC(0x0f3a, 0xcc): /* sha1rnds4 $imm8,xmm/m128,xmm */ 
host_and_vcpu_must_have(sha); op_bytes = 16; From patchwork Wed Jun 15 10:28:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 12882085 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D5692C433EF for ; Wed, 15 Jun 2022 10:28:34 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.349918.576121 (Exim 4.92) (envelope-from ) id 1o1QG5-000457-QZ; Wed, 15 Jun 2022 10:28:21 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 349918.576121; Wed, 15 Jun 2022 10:28:21 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1o1QG5-00044f-L3; Wed, 15 Jun 2022 10:28:21 +0000 Received: by outflank-mailman (input) for mailman id 349918; Wed, 15 Jun 2022 10:28:20 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1o1QG4-0002mz-DP for xen-devel@lists.xenproject.org; Wed, 15 Jun 2022 10:28:20 +0000 Received: from EUR05-VI1-obe.outbound.protection.outlook.com (mail-vi1eur05on2061d.outbound.protection.outlook.com [2a01:111:f400:7d00::61d]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id df9db296-ec95-11ec-bd2c-47488cf2e6aa; Wed, 15 Jun 2022 12:28:19 +0200 (CEST) Received: from VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) by DB7PR04MB5113.eurprd04.prod.outlook.com (2603:10a6:10:14::27) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5332.22; Wed, 15 Jun 2022 10:28:17 +0000 Message-ID: <6721f404-e2f9-b686-009c-4c465a5a1e3f@suse.com> Date: Wed, 15 Jun 2022 12:28:15 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Subject: [PATCH 03/11] x86emul: handle AVX512-FP16 Map5 arithmetic insns Content-Language: en-US From: Jan Beulich To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , Wei Liu , =?utf-8?q?Roger_Pau_Monn=C3=A9?= References: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com> In-Reply-To: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com> MIME-Version: 1.0 X-MS-Exchange-CrossTenant-UserPrincipalName:
iRVv80wt9FUnn9Z/cajvPbDVi+62fx+oIC063qL019WbUgTdMSmzAn0kA0bbG0M/ubFEY+t2QQFuv9Y6reW0yw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB7PR04MB5113 This encoding space is a very sparse clone of the "twobyte" one. Re-use that table, as the entries corresponding to invalid opcodes in Map5 are simply benign with simd_size forced to other than simd_none (preventing undue memory reads in SrcMem handling early in x86_emulate()). Signed-off-by: Jan Beulich --- a/tools/tests/x86_emulator/evex-disp8.c +++ b/tools/tests/x86_emulator/evex-disp8.c @@ -6,7 +6,7 @@ struct test { const char *mnemonic; unsigned int opc:8; - unsigned int spc:2; + unsigned int spc:3; unsigned int pfx:2; unsigned int vsz:3; unsigned int esz:4; @@ -19,6 +19,10 @@ enum spc { SPC_0f, SPC_0f38, SPC_0f3a, + SPC_unused4, + SPC_map5, + SPC_map6, + SPC_unused7, }; enum pfx { @@ -603,16 +607,32 @@ static const struct test avx512_vpopcntd }; static const struct test avx512_fp16_all[] = { + INSN(addph, , map5, 58, vl, fp16, vl), + INSN(addsh, f3, map5, 58, el, fp16, el), INSN(cmpph, , 0f3a, c2, vl, fp16, vl), INSN(cmpsh, f3, 0f3a, c2, el, fp16, el), + INSN(comish, , map5, 2f, el, fp16, el), + INSN(divph, , map5, 5e, vl, fp16, vl), + INSN(divsh, f3, map5, 5e, el, fp16, el), INSN(fpclassph, , 0f3a, 66, vl, fp16, vl), INSN(fpclasssh, , 0f3a, 67, el, fp16, el), INSN(getmantph, , 0f3a, 26, vl, fp16, vl), INSN(getmantsh, , 0f3a, 27, el, fp16, el), + INSN(maxph, , map5, 5f, vl, fp16, vl), + INSN(maxsh, f3, map5, 5f, el, fp16, el), + INSN(minph, , map5, 5d, vl, fp16, vl), + INSN(minsh, f3, map5, 5d, el, fp16, el), + INSN(mulph, , map5, 59, vl, fp16, vl), + INSN(mulsh, f3, map5, 59, el, fp16, el), INSN(reduceph, , 0f3a, 56, vl, fp16, vl), INSN(reducesh, , 0f3a, 57, el, fp16, el), INSN(rndscaleph, , 0f3a, 08, vl, fp16, vl), INSN(rndscalesh, , 0f3a, 0a, el, fp16, el), + INSN(sqrtph, , map5, 51, vl, fp16, vl), + INSN(sqrtsh, f3, map5, 51, el, fp16, el), + INSN(subph, , map5, 5c, vl, fp16, vl), + INSN(subsh, f3, map5, 
5c, el, fp16, el), + INSN(ucomish, , map5, 2e, el, fp16, el), }; static const struct test gfni_all[] = { @@ -713,8 +733,8 @@ static void test_one(const struct test * union evex { uint8_t raw[3]; struct { - uint8_t opcx:2; - uint8_t mbz:2; + uint8_t opcx:3; + uint8_t mbz:1; uint8_t R:1; uint8_t b:1; uint8_t x:1; --- a/tools/tests/x86_emulator/predicates.c +++ b/tools/tests/x86_emulator/predicates.c @@ -2028,6 +2028,23 @@ static const struct evex { { { 0xc2 }, 3, T, R, pfx_f3, W0, LIG }, /* vcmpsh */ { { 0xce }, 3, T, R, pfx_66, W1, Ln }, /* vgf2p8affineqb */ { { 0xcf }, 3, T, R, pfx_66, W1, Ln }, /* vgf2p8affineinvqb */ +}, evex_map5[] = { + { { 0x2e }, 2, T, R, pfx_no, W0, LIG }, /* vucomish */ + { { 0x2f }, 2, T, R, pfx_no, W0, LIG }, /* vcomish */ + { { 0x51 }, 2, T, R, pfx_no, W0, Ln }, /* vsqrtph */ + { { 0x51 }, 2, T, R, pfx_f3, W0, LIG }, /* vsqrtsh */ + { { 0x58 }, 2, T, R, pfx_no, W0, Ln }, /* vaddph */ + { { 0x58 }, 2, T, R, pfx_f3, W0, LIG }, /* vaddsh */ + { { 0x59 }, 2, T, R, pfx_no, W0, Ln }, /* vmulph */ + { { 0x59 }, 2, T, R, pfx_f3, W0, LIG }, /* vmulsh */ + { { 0x5c }, 2, T, R, pfx_no, W0, Ln }, /* vsubph */ + { { 0x5c }, 2, T, R, pfx_f3, W0, LIG }, /* vsubsh */ + { { 0x5d }, 2, T, R, pfx_no, W0, Ln }, /* vminph */ + { { 0x5d }, 2, T, R, pfx_f3, W0, LIG }, /* vminsh */ + { { 0x5e }, 2, T, R, pfx_no, W0, Ln }, /* vdivph */ + { { 0x5e }, 2, T, R, pfx_f3, W0, LIG }, /* vdivsh */ + { { 0x5f }, 2, T, R, pfx_no, W0, Ln }, /* vmaxph */ + { { 0x5f }, 2, T, R, pfx_f3, W0, LIG }, /* vmaxsh */ }; static const struct { @@ -2037,6 +2054,8 @@ static const struct { { evex_0f, ARRAY_SIZE(evex_0f) }, { evex_0f38, ARRAY_SIZE(evex_0f38) }, { evex_0f3a, ARRAY_SIZE(evex_0f3a) }, + { NULL, 0 }, + { evex_map5, ARRAY_SIZE(evex_map5) }, }; #undef Wn --- a/xen/arch/x86/x86_emulate/decode.c +++ b/xen/arch/x86/x86_emulate/decode.c @@ -1219,9 +1219,18 @@ int x86emul_decode(struct x86_emulate_st opcode |= MASK_INSR(0x0f3a, X86EMUL_OPC_EXT_MASK); d = twobyte_table[0x3a].desc; 
break; + + case evex_map5: + if ( !evex_encoded() ) + { default: - rc = X86EMUL_UNRECOGNIZED; - goto done; + rc = X86EMUL_UNRECOGNIZED; + goto done; + } + opcode |= MASK_INSR(5, X86EMUL_OPC_EXT_MASK); + d = twobyte_table[b].desc; + s->simd_size = twobyte_table[b].size ?: simd_other; + break; } } else if ( s->ext < ext_8f08 + ARRAY_SIZE(xop_table) ) @@ -1443,6 +1452,24 @@ int x86emul_decode(struct x86_emulate_st } break; + case ext_map5: + switch ( b ) + { + default: + if ( !(s->evex.pfx & VEX_PREFIX_DOUBLE_MASK) ) + s->fp16 = true; + break; + + case 0x2e: case 0x2f: /* v{,u}comish */ + if ( !s->evex.pfx ) + s->fp16 = true; + s->simd_size = simd_none; + break; + } + + disp8scale = decode_disp8scale(twobyte_table[b].d8s, s); + break; + case ext_8f09: if ( ext8f09_table[b].two_op ) d |= TwoOp; @@ -1661,6 +1688,7 @@ int x86emul_decode(struct x86_emulate_st s->simd_size = ext8f08_table[b].simd_size; break; + case ext_map5: case ext_8f09: case ext_8f0a: break; --- a/xen/arch/x86/x86_emulate/private.h +++ b/xen/arch/x86/x86_emulate/private.h @@ -194,6 +194,7 @@ enum vex_opcx { vex_0f = vex_none + 1, vex_0f38, vex_0f3a, + evex_map5 = 5, }; enum vex_pfx { @@ -222,8 +223,8 @@ union vex { union evex { uint8_t raw[3]; struct { /* SDM names */ - uint8_t opcx:2; /* mm */ - uint8_t mbz:2; + uint8_t opcx:3; /* mmm */ + uint8_t mbz:1; uint8_t R:1; /* R' */ uint8_t b:1; /* B */ uint8_t x:1; /* X */ @@ -248,6 +249,7 @@ struct x86_emulate_state { ext_0f = vex_0f, ext_0f38 = vex_0f38, ext_0f3a = vex_0f3a, + ext_map5 = evex_map5, /* * For XOP use values such that the respective instruction field * can be used without adjustment. 
--- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -3760,6 +3760,13 @@ x86_emulate( ASSERT(!state->simd_size); break; +#ifndef X86EMUL_NO_SIMD + + case X86EMUL_OPC_EVEX(5, 0x2e): /* vucomish xmm/m16,xmm */ + case X86EMUL_OPC_EVEX(5, 0x2f): /* vcomish xmm/m16,xmm */ + host_and_vcpu_must_have(avx512_fp16); + generate_exception_if(evex.w, EXC_UD); + /* fall through */ CASE_SIMD_PACKED_FP(_EVEX, 0x0f, 0x2e): /* vucomis{s,d} xmm/mem,xmm */ CASE_SIMD_PACKED_FP(_EVEX, 0x0f, 0x2f): /* vcomis{s,d} xmm/mem,xmm */ generate_exception_if((evex.reg != 0xf || !evex.RX || evex.opmsk || @@ -3772,9 +3779,11 @@ x86_emulate( get_fpu(X86EMUL_FPU_zmm); opc = init_evex(stub); - op_bytes = 4 << evex.w; + op_bytes = 2 << (!state->fp16 + evex.w); goto vcomi; +#endif + case X86EMUL_OPC(0x0f, 0x30): /* wrmsr */ generate_exception_if(!mode_ring0(), EXC_GP, 0); fail_if(ops->write_msr == NULL); @@ -7738,6 +7747,20 @@ x86_emulate( #ifndef X86EMUL_NO_SIMD + case X86EMUL_OPC_EVEX_F3(5, 0x51): /* vsqrtsh xmm/m16,xmm,xmm{k} */ + d &= ~TwoOp; + /* fall through */ + case X86EMUL_OPC_EVEX(5, 0x51): /* vsqrtph [xyz]mm/mem,[xyz]mm{k} */ + CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x58): /* vadd{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x59): /* vmul{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x5c): /* vsub{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x5d): /* vmin{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x5e): /* vdiv{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x5f): /* vmax{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ + host_and_vcpu_must_have(avx512_fp16); + generate_exception_if(evex.w, EXC_UD); + goto avx512f_all_fp; + case X86EMUL_OPC_XOP(08, 0x85): /* vpmacssww xmm,xmm/m128,xmm,xmm */ case X86EMUL_OPC_XOP(08, 0x86): /* vpmacsswd xmm,xmm/m128,xmm,xmm */ case X86EMUL_OPC_XOP(08, 0x87): /* vpmacssdql xmm,xmm/m128,xmm,xmm */ --- 
a/xen/arch/x86/x86_emulate/x86_emulate.h +++ b/xen/arch/x86/x86_emulate/x86_emulate.h @@ -619,6 +619,7 @@ struct x86_emulate_ctxt * 0x0fxxxx for 0f-prefixed opcodes (or their VEX/EVEX equivalents) * 0x0f38xxxx for 0f38-prefixed opcodes (or their VEX/EVEX equivalents) * 0x0f3axxxx for 0f3a-prefixed opcodes (or their VEX/EVEX equivalents) + * 0x5xxxx for Map5 opcodes (EVEX only) * 0x8f08xxxx for 8f/8-prefixed XOP opcodes * 0x8f09xxxx for 8f/9-prefixed XOP opcodes * 0x8f0axxxx for 8f/a-prefixed XOP opcodes From patchwork Wed Jun 15 10:28:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 12882086 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5EDE3C43334 for ; Wed, 15 Jun 2022 10:28:55 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.349925.576131 (Exim 4.92) (envelope-from ) id 1o1QGQ-0004pL-6U; Wed, 15 Jun 2022 10:28:42 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 349925.576131; Wed, 15 Jun 2022 10:28:42 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1o1QGQ-0004pE-2z; Wed, 15 Jun 2022 10:28:42 +0000 Received: by outflank-mailman (input) for mailman id 349925; Wed, 15 Jun 2022 10:28:41 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1o1QGP-0004ln-Ho for xen-devel@lists.xenproject.org; Wed, 15 Jun 2022 10:28:41 +0000 Received: from 
EUR03-DBA-obe.outbound.protection.outlook.com (mail-dbaeur03on20613.outbound.protection.outlook.com [2a01:111:f400:fe1a::613]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id ea4b0d04-ec95-11ec-ab14-113154c10af9; Wed, 15 Jun 2022 12:28:37 +0200 (CEST) Message-ID: <3e7f95f0-fded-74e3-d4b5-da185a7ab8d8@suse.com> Date: Wed, 15 Jun 2022 12:28:33 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Subject: [PATCH 04/11] x86emul: handle AVX512-FP16 move insns Content-Language: en-US From: Jan Beulich To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , Wei Liu , =?utf-8?q?Roger_Pau_Monn=C3=A9?= References: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com> In-Reply-To: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com> MIME-Version: 1.0
Signed-off-by: Jan Beulich

--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -622,6 +622,8 @@ static const struct test avx512_fp16_all
     INSN(maxsh, f3, map5, 5f, el, fp16, el),
     INSN(minph, , map5, 5d, vl, fp16, vl),
     INSN(minsh, f3, map5, 5d, el, fp16, el),
+    INSN(movsh, f3, map5, 10, el, fp16, el),
+    INSN(movsh, f3, map5, 11, el, fp16, el),
     INSN(mulph, , map5, 59, vl, fp16, vl),
     INSN(mulsh, f3, map5, 59, el, fp16, el),
     INSN(reduceph, , 0f3a, 56, vl, fp16, vl),
@@ -635,6 +637,11 @@ static const struct test avx512_fp16_all
     INSN(ucomish, , map5, 2e, el, fp16, el),
 };
 
+static const struct test avx512_fp16_128[] = {
+    INSN(movw, 66, map5, 6e, el, fp16, el),
+    INSN(movw, 66, map5, 7e, el, fp16, el),
+};
+
 static const struct test gfni_all[] = {
     INSN(gf2p8affineinvqb, 66, 0f3a, cf, vl, q, vl),
     INSN(gf2p8affineqb, 66, 0f3a, ce, vl, q, vl),
@@ -1039,6 +1046,7 @@ void evex_disp8_test(void *instr, struct
     RUN(avx512_vp2intersect, all);
     RUN(avx512_vpopcntdq, all);
     RUN(avx512_fp16, all);
+    RUN(avx512_fp16, 128);
 
     if ( cpu_has_avx512f )
     {
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -2029,6 +2029,8 @@ static const struct evex {
     { { 0xce }, 3, T, R, pfx_66, W1, Ln }, /* vgf2p8affineqb */
     { { 0xcf }, 3, T, R, pfx_66, W1, Ln }, /* vgf2p8affineinvqb */
 }, evex_map5[] = {
+    { { 0x10 }, 2, T, R, pfx_f3, W0, LIG }, /* vmovsh */
+    { { 0x11 }, 2, T, W, pfx_f3, W0, LIG }, /* vmovsh */
     { { 0x2e }, 2, T, R, pfx_no, W0, LIG }, /* vucomish */
     { { 0x2f }, 2, T, R, pfx_no, W0, LIG }, /* vcomish */
     { { 0x51 }, 2, T, R, pfx_no, W0, Ln }, /* vsqrtph */
@@ -2045,6 +2047,8 @@ static const struct evex {
     { { 0x5e }, 2, T, R, pfx_f3, W0, LIG }, /* vdivsh */
     { { 0x5f }, 2, T, R, pfx_no, W0, Ln }, /* vmaxph */
     { { 0x5f }, 2, T, R, pfx_f3, W0, LIG }, /* vmaxsh */
+    { { 0x6e }, 2, T, R, pfx_66, WIG, L0 }, /* vmovw */
+    { { 0x7e }, 2, T, W, pfx_66, WIG, L0 }, /* vmovw */
 };
 
 static const struct {
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -5137,6 +5137,76 @@ int main(int argc, char **argv)
     else
         printf("skipped\n");
 
+    printf("%-40s", "Testing vmovsh 8(%ecx),%xmm5...");
+    if ( stack_exec && cpu_has_avx512_fp16 )
+    {
+        decl_insn(vmovsh_from_mem);
+        decl_insn(vmovw_to_gpr);
+
+        asm volatile ( "vpcmpeqw %%ymm5, %%ymm5, %%ymm5\n\t"
+                       put_insn(vmovsh_from_mem,
+                                /* vmovsh 8(%0), %%xmm5 */
+                                ".byte 0x62, 0xf5, 0x7e, 0x08\n\t"
+                                ".byte 0x10, 0x69, 0x04")
+                       :: "c" (NULL) );
+
+        set_insn(vmovsh_from_mem);
+        res[2] = 0x3c00bc00;
+        regs.ecx = (unsigned long)res;
+        rc = x86_emulate(&ctxt, &emulops);
+        if ( (rc != X86EMUL_OKAY) || !check_eip(vmovsh_from_mem) )
+            goto fail;
+        asm volatile ( "kmovw %2, %%k1\n\t"
+                       "vmovdqu16 %1, %%zmm4%{%%k1%}%{z%}\n\t"
+                       "vpcmpeqw %%zmm4, %%zmm5, %%k0\n\t"
+                       "kmovw %%k0, %0"
+                       : "=g" (rc)
+                       : "m" (res[2]), "r" (1) );
+        if ( rc != 0xffff )
+            goto fail;
+        printf("okay\n");
+
+        printf("%-40s", "Testing vmovsh %xmm4,2(%eax){%k3}...");
+        memset(res, ~0, 8);
+        res[2] = 0xbc00ffff;
+        memset(res + 3, ~0, 8);
+        regs.eax = (unsigned long)res;
+        regs.ecx = ~0;
+        for ( i = 0; i < 2; ++i )
+        {
+            decl_insn(vmovsh_to_mem);
+
+            asm volatile ( "kmovw %1, %%k3\n\t"
+                           put_insn(vmovsh_to_mem,
+                                    /* vmovsh %%xmm4, 2(%0)%{%%k3%} */
+                                    ".byte 0x62, 0xf5, 0x7e, 0x0b\n\t"
+                                    ".byte 0x11, 0x60, 0x01")
+                           :: "a" (NULL), "r" (i) );
+
+            set_insn(vmovsh_to_mem);
+            rc = x86_emulate(&ctxt, &emulops);
+            if ( (rc != X86EMUL_OKAY) || !check_eip(vmovsh_to_mem) ||
+                 memcmp(res, res + 3 - i, 8) )
+                goto fail;
+        }
+        printf("okay\n");
+
+        printf("%-40s", "Testing vmovw %xmm5,%ecx...");
+        asm volatile ( put_insn(vmovw_to_gpr,
+                                /* vmovw %%xmm5, %0 */
+                                ".byte 0x62, 0xf5, 0x7d, 0x08\n\t"
+                                ".byte 0x7e, 0xe9")
+                       :: "c" (NULL) );
+        set_insn(vmovw_to_gpr);
+        rc = x86_emulate(&ctxt, &emulops);
+        if ( (rc != X86EMUL_OKAY) || !check_eip(vmovw_to_gpr) ||
+             regs.ecx != 0xbc00 )
+            goto fail;
+        printf("okay\n");
+    }
+    else
+        printf("skipped\n");
+
     printf("%-40s", "Testing invpcid 16(%ecx),%%edx...");
     if ( stack_exec )
     {
--- a/xen/arch/x86/x86_emulate/decode.c
+++ b/xen/arch/x86/x86_emulate/decode.c
@@ -585,7 +585,7 @@ static unsigned int decode_disp8scale(en
         break;
 
     case d8s_dq64:
-        return 2 + (s->op_bytes == 8);
+        return 1 + !s->fp16 + (s->op_bytes == 8);
     }
 
     switch ( s->simd_size )
@@ -1465,6 +1465,15 @@ int x86emul_decode(struct x86_emulate_st
                 s->fp16 = true;
                 s->simd_size = simd_none;
                 break;
+
+            case 0x6e: /* vmovw r/m16, xmm */
+                d = (d & ~SrcMask) | SrcMem16;
+                /* fall through */
+            case 0x7e: /* vmovw xmm, r/m16 */
+                if ( s->evex.pfx == vex_66 )
+                    s->fp16 = true;
+                s->simd_size = simd_none;
+                break;
             }
 
             disp8scale = decode_disp8scale(twobyte_table[b].d8s, s);
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -4394,6 +4394,15 @@ x86_emulate(
 
 #ifndef X86EMUL_NO_SIMD
 
+    case X86EMUL_OPC_EVEX_66(5, 0x7e): /* vmovw xmm,r/m16 */
+        ASSERT(dst.bytes >= 4);
+        if ( dst.type == OP_MEM )
+            dst.bytes = 2;
+        /* fall through */
+    case X86EMUL_OPC_EVEX_66(5, 0x6e): /* vmovw r/m16,xmm */
+        host_and_vcpu_must_have(avx512_fp16);
+        generate_exception_if(evex.w, EXC_UD);
+        /* fall through */
     case X86EMUL_OPC_EVEX_66(0x0f, 0x6e): /* vmov{d,q} r/m,xmm */
     case X86EMUL_OPC_EVEX_66(0x0f, 0x7e): /* vmov{d,q} xmm,r/m */
         generate_exception_if((evex.lr || evex.opmsk || evex.brs ||
@@ -7747,8 +7756,18 @@ x86_emulate(
 
 #ifndef X86EMUL_NO_SIMD
 
+    case X86EMUL_OPC_EVEX_F3(5, 0x10): /* vmovsh m16,xmm{k} */
+                                       /* vmovsh xmm,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_F3(5, 0x11): /* vmovsh xmm,m16{k} */
+                                       /* vmovsh xmm,xmm,xmm{k} */
+        generate_exception_if(evex.brs, EXC_UD);
+        if ( ea.type == OP_MEM )
+            d |= TwoOp;
+        else
+        {
     case X86EMUL_OPC_EVEX_F3(5, 0x51): /* vsqrtsh xmm/m16,xmm,xmm{k} */
-        d &= ~TwoOp;
+            d &= ~TwoOp;
+        }
         /* fall through */
     case X86EMUL_OPC_EVEX(5, 0x51): /* vsqrtph [xyz]mm/mem,[xyz]mm{k} */
     CASE_SIMD_SINGLE_FP(_EVEX, 5, 0x58): /* vadd{p,s}h [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */

From patchwork Wed Jun 15 10:28:56 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 12882087
Message-ID: <36fadb47-32a2-b06e-4cd3-218635ef8aeb@suse.com>
Date: Wed, 15 Jun 2022 12:28:56 +0200
Subject: [PATCH 05/11] x86emul: handle AVX512-FP16 fma-like insns
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
References: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com>
In-Reply-To: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com>

The Map6 encoding space is a very sparse clone of the "0f38" one. Once
again re-use that table, as the entries corresponding to invalid opcodes
in Map6 are simply benign with simd_size forced to other than simd_none
(preventing undue memory reads in SrcMem handling early in
x86_emulate()).

Signed-off-by: Jan Beulich

--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -614,6 +614,36 @@ static const struct test avx512_fp16_all
     INSN(comish, , map5, 2f, el, fp16, el),
     INSN(divph, , map5, 5e, vl, fp16, vl),
     INSN(divsh, f3, map5, 5e, el, fp16, el),
+    INSN(fmadd132ph, 66, map6, 98, vl, fp16, vl),
+    INSN(fmadd132sh, 66, map6, 99, el, fp16, el),
+    INSN(fmadd213ph, 66, map6, a8, vl, fp16, vl),
+    INSN(fmadd213sh, 66, map6, a9, el, fp16, el),
+    INSN(fmadd231ph, 66, map6, b8, vl, fp16, vl),
+    INSN(fmadd231sh, 66, map6, b9, el, fp16, el),
+    INSN(fmaddsub132ph, 66, map6, 96, vl, fp16, vl),
+    INSN(fmaddsub213ph, 66, map6, a6, vl, fp16, vl),
+    INSN(fmaddsub231ph, 66, map6, b6, vl, fp16, vl),
+    INSN(fmsub132ph, 66, map6, 9a, vl, fp16, vl),
+    INSN(fmsub132sh, 66, map6, 9b, el, fp16, el),
+    INSN(fmsub213ph, 66, map6, aa, vl, fp16, vl),
+    INSN(fmsub213sh, 66, map6, ab, el, fp16, el),
+    INSN(fmsub231ph, 66, map6, ba, vl, fp16, vl),
+    INSN(fmsub231sh, 66, map6, bb, el, fp16, el),
+    INSN(fmsubadd132ph, 66, map6, 97, vl, fp16, vl),
+    INSN(fmsubadd213ph, 66, map6, a7, vl, fp16, vl),
+    INSN(fmsubadd231ph, 66, map6, b7, vl, fp16, vl),
+    INSN(fnmadd132ph, 66, map6, 9c, vl, fp16, vl),
+    INSN(fnmadd132sh, 66, map6, 9d, el, fp16, el),
+    INSN(fnmadd213ph, 66, map6, ac, vl, fp16, vl),
+    INSN(fnmadd213sh, 66, map6, ad, el, fp16, el),
+    INSN(fnmadd231ph, 66, map6, bc, vl, fp16, vl),
+    INSN(fnmadd231sh, 66, map6, bd, el, fp16, el),
+    INSN(fnmsub132ph, 66, map6, 9e, vl, fp16, vl),
+    INSN(fnmsub132sh, 66, map6, 9f, el, fp16, el),
+    INSN(fnmsub213ph, 66, map6, ae, vl, fp16, vl),
+    INSN(fnmsub213sh, 66, map6, af, el, fp16, el),
+    INSN(fnmsub231ph, 66, map6, be, vl, fp16, vl),
+    INSN(fnmsub231sh, 66, map6, bf, el, fp16, el),
     INSN(fpclassph, , 0f3a, 66, vl, fp16, vl),
     INSN(fpclasssh, , 0f3a, 67, el, fp16, el),
     INSN(getmantph, , 0f3a, 26, vl, fp16, vl),
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -2049,6 +2049,37 @@ static const struct evex {
     { { 0x5f }, 2, T, R, pfx_f3, W0, LIG }, /* vmaxsh */
     { { 0x6e }, 2, T, R, pfx_66, WIG, L0 }, /* vmovw */
     { { 0x7e }, 2, T, W, pfx_66, WIG, L0 }, /* vmovw */
+}, evex_map6[] = {
+    { { 0x96 }, 2, T, R, pfx_66, W0, Ln }, /* vfmaddsub132ph */
+    { { 0x97 }, 2, T, R, pfx_66, W0, Ln }, /* vfmsubadd132ph */
+    { { 0x98 }, 2, T, R, pfx_66, W0, Ln }, /* vfmadd132ph */
+    { { 0x99 }, 2, T, R, pfx_66, W0, LIG }, /* vfmadd132sh */
+    { { 0x9a }, 2, T, R, pfx_66, W0, Ln }, /* vfmsub132ph */
+    { { 0x9b }, 2, T, R, pfx_66, W0, LIG }, /* vfmsub132sh */
+    { { 0x9c }, 2, T, R, pfx_66, W0, Ln }, /* vfnmadd132ph */
+    { { 0x9d }, 2, T, R, pfx_66, W0, LIG }, /* vfnmadd132sh */
+    { { 0x9e }, 2, T, R, pfx_66, W0, Ln }, /* vfnmsub132ph */
+    { { 0x9f }, 2, T, R, pfx_66, W0, LIG }, /* vfnmsub132sh */
+    { { 0xa6 }, 2, T, R, pfx_66, W0, Ln }, /* vfmaddsub213ph */
+    { { 0xa7 }, 2, T, R, pfx_66, W0, Ln }, /* vfmsubadd213ph */
+    { { 0xa8 }, 2, T, R, pfx_66, W0, Ln }, /* vfmadd213ph */
+    { { 0xa9 }, 2, T, R, pfx_66, W0, LIG }, /* vfmadd213sh */
+    { { 0xaa }, 2, T, R, pfx_66, W0, Ln }, /* vfmsub213ph */
+    { { 0xab }, 2, T, R, pfx_66, W0, LIG }, /* vfmsub213sh */
+    { { 0xac }, 2, T, R, pfx_66, W0, Ln }, /* vfnmadd213ph */
+    { { 0xad }, 2, T, R, pfx_66, W0, LIG }, /* vfnmadd213sh */
+    { { 0xae }, 2, T, R, pfx_66, W0, Ln }, /* vfnmsub213ph */
+    { { 0xaf }, 2, T, R, pfx_66, W0, LIG }, /* vfnmsub213sh */
+    { { 0xb6 }, 2, T, R, pfx_66, W0, Ln }, /* vfmaddsub231ph */
+    { { 0xb7 }, 2, T, R, pfx_66, W0, Ln }, /* vfmsubadd231ph */
+    { { 0xb8 }, 2, T, R, pfx_66, W0, Ln }, /* vfmadd231ph */
+    { { 0xb9 }, 2, T, R, pfx_66, W0, LIG }, /* vfmadd231sh */
+    { { 0xba }, 2, T, R, pfx_66, W0, Ln }, /* vfmsub231ph */
+    { { 0xbb }, 2, T, R, pfx_66, W0, LIG }, /* vfmsub231sh */
+    { { 0xbc }, 2, T, R, pfx_66, W0, Ln }, /* vfnmadd231ph */
+    { { 0xbd }, 2, T, R, pfx_66, W0, LIG }, /* vfnmadd231sh */
+    { { 0xbe }, 2, T, R, pfx_66, W0, Ln }, /* vfnmsub231ph */
+    { { 0xbf }, 2, T, R, pfx_66, W0, LIG }, /* vfnmsub231sh */
 };
 
 static const struct {
@@ -2060,6 +2091,7 @@ static const struct {
     { evex_0f3a, ARRAY_SIZE(evex_0f3a) },
     { NULL, 0 },
     { evex_map5, ARRAY_SIZE(evex_map5) },
+    { evex_map6, ARRAY_SIZE(evex_map6) },
 };
 
 #undef Wn
--- a/xen/arch/x86/x86_emulate/decode.c
+++ b/xen/arch/x86/x86_emulate/decode.c
@@ -1231,6 +1231,16 @@ int x86emul_decode(struct x86_emulate_st
                         d = twobyte_table[b].desc;
                         s->simd_size = twobyte_table[b].size ?: simd_other;
                         break;
+
+                    case evex_map6:
+                        if ( !evex_encoded() )
+                        {
+                            rc = X86EMUL_UNRECOGNIZED;
+                            goto done;
+                        }
+                        opcode |= MASK_INSR(6, X86EMUL_OPC_EXT_MASK);
+                        d = twobyte_table[0x38].desc;
+                        break;
                     }
                 }
                 else if ( s->ext < ext_8f08 + ARRAY_SIZE(xop_table) )
@@ -1479,6 +1489,24 @@ int x86emul_decode(struct x86_emulate_st
             disp8scale = decode_disp8scale(twobyte_table[b].d8s, s);
             break;
 
+        case ext_map6:
+            d = ext0f38_table[b].to_mem ? DstMem | SrcReg
+                                        : DstReg | SrcMem;
+            if ( ext0f38_table[b].two_op )
+                d |= TwoOp;
+            s->simd_size = ext0f38_table[b].simd_size ?: simd_other;
+
+            switch ( b )
+            {
+            default:
+                if ( s->evex.pfx == vex_66 )
+                    s->fp16 = true;
+                break;
+            }
+
+            disp8scale = decode_disp8scale(ext0f38_table[b].d8s, s);
+            break;
+
         case ext_8f09:
             if ( ext8f09_table[b].two_op )
                 d |= TwoOp;
@@ -1698,6 +1726,7 @@ int x86emul_decode(struct x86_emulate_st
         break;
 
     case ext_map5:
+    case ext_map6:
     case ext_8f09:
     case ext_8f0a:
         break;
--- a/xen/arch/x86/x86_emulate/private.h
+++ b/xen/arch/x86/x86_emulate/private.h
@@ -195,6 +195,7 @@ enum vex_opcx {
     vex_0f38,
     vex_0f3a,
     evex_map5 = 5,
+    evex_map6,
 };
 
 enum vex_pfx {
@@ -250,6 +251,7 @@ struct x86_emulate_state {
         ext_0f38 = vex_0f38,
         ext_0f3a = vex_0f3a,
         ext_map5 = evex_map5,
+        ext_map6 = evex_map6,
         /*
          * For XOP use values such that the respective instruction field
          * can be used without adjustment.
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -7780,6 +7780,49 @@ x86_emulate(
         generate_exception_if(evex.w, EXC_UD);
         goto avx512f_all_fp;
 
+    case X86EMUL_OPC_EVEX_66(6, 0x96): /* vfmaddsub132ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0x97): /* vfmsubadd132ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0x98): /* vfmadd132ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0x9a): /* vfmsub132ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0x9c): /* vfnmadd132ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0x9e): /* vfnmsub132ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xa6): /* vfmaddsub213ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xa7): /* vfmsubadd213ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xa8): /* vfmadd213ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xaa): /* vfmsub213ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xac): /* vfnmadd213ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xae): /* vfnmsub213ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xb6): /* vfmaddsub231ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xb7): /* vfmsubadd231ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xb8): /* vfmadd231ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xba): /* vfmsub231ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xbc): /* vfnmadd231ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xbe): /* vfnmsub231ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+        host_and_vcpu_must_have(avx512_fp16);
+        generate_exception_if(evex.w, EXC_UD);
+        if ( ea.type != OP_REG || !evex.brs )
+            avx512_vlen_check(false);
+        goto simd_zmm;
+
+    case X86EMUL_OPC_EVEX_66(6, 0x99): /* vfmadd132sh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0x9b): /* vfmsub132sh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0x9d): /* vfnmadd132sh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0x9f): /* vfnmsub132sh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xa9): /* vfmadd213sh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xab): /* vfmsub213sh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xad): /* vfnmadd213sh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xaf): /* vfnmsub213sh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xb9): /* vfmadd231sh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xbb): /* vfmsub231sh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xbd): /* vfnmadd231sh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0xbf): /* vfnmsub231sh xmm/m16,xmm,xmm{k} */
+        host_and_vcpu_must_have(avx512_fp16);
+        generate_exception_if(evex.w || (ea.type != OP_REG && evex.brs),
+                              EXC_UD);
+        if ( !evex.brs )
+            avx512_vlen_check(true);
+        goto simd_zmm;
+
     case X86EMUL_OPC_XOP(08, 0x85): /* vpmacssww xmm,xmm/m128,xmm,xmm */
     case X86EMUL_OPC_XOP(08, 0x86): /* vpmacsswd xmm,xmm/m128,xmm,xmm */
     case X86EMUL_OPC_XOP(08, 0x87): /* vpmacssdql xmm,xmm/m128,xmm,xmm */
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -620,6 +620,7 @@ struct x86_emulate_ctxt
      * 0x0f38xxxx for 0f38-prefixed opcodes (or their VEX/EVEX equivalents)
      * 0x0f3axxxx for 0f3a-prefixed opcodes (or their VEX/EVEX equivalents)
      * 0x5xxxx for Map5 opcodes (EVEX only)
+     * 0x6xxxx for Map6 opcodes (EVEX only)
      * 0x8f08xxxx for 8f/8-prefixed XOP opcodes
      * 0x8f09xxxx for 8f/9-prefixed XOP opcodes
      * 0x8f0axxxx for 8f/a-prefixed XOP opcodes

From patchwork Wed Jun 15 10:29:20 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 12882098
Date: Wed, 15 Jun 2022 12:29:20 +0200
Subject: [PATCH 06/11] x86emul: handle AVX512-FP16 Map6 misc insns
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
References: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com>
In-Reply-To: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com>

While, as before, this leverages that the Map6 encoding space is a very
sparse clone of the "0f38" one, switch around the simd_size overriding
for opcode 2D. This way fewer separate overrides are needed.

Signed-off-by: Jan Beulich
Acked-by: Andrew Cooper

--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -646,6 +646,8 @@ static const struct test avx512_fp16_all
     INSN(fnmsub231sh, 66, map6, bf, el, fp16, el),
     INSN(fpclassph, , 0f3a, 66, vl, fp16, vl),
     INSN(fpclasssh, , 0f3a, 67, el, fp16, el),
+    INSN(getexpph, 66, map6, 42, vl, fp16, vl),
+    INSN(getexpsh, 66, map6, 43, el, fp16, el),
     INSN(getmantph, , 0f3a, 26, vl, fp16, vl),
     INSN(getmantsh, , 0f3a, 27, el, fp16, el),
     INSN(maxph, , map5, 5f, vl, fp16, vl),
@@ -656,10 +658,16 @@ static const struct test avx512_fp16_all
     INSN(movsh, f3, map5, 11, el, fp16, el),
     INSN(mulph, , map5, 59, vl, fp16, vl),
     INSN(mulsh, f3, map5, 59, el, fp16, el),
+    INSN(rcpph, 66, map6, 4c, vl, fp16, vl),
+    INSN(rcpsh, 66, map6, 4d, el, fp16, el),
     INSN(reduceph, , 0f3a, 56, vl, fp16, vl),
     INSN(reducesh, , 0f3a, 57, el, fp16, el),
     INSN(rndscaleph, , 0f3a, 08, vl, fp16, vl),
     INSN(rndscalesh, , 0f3a, 0a, el, fp16, el),
+    INSN(rsqrtph, 66, map6, 4e, vl, fp16, vl),
+    INSN(rsqrtsh, 66, map6, 4f, el, fp16, el),
+    INSN(scalefph, 66, map6, 2c, vl, fp16, vl),
+    INSN(scalefsh, 66, map6, 2d, el, fp16, el),
     INSN(sqrtph, , map5, 51, vl, fp16, vl),
     INSN(sqrtsh, f3, map5, 51, el, fp16, el),
     INSN(subph, , map5, 5c, vl, fp16, vl),
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -2050,6 +2050,14 @@ static const struct evex {
     { { 0x6e }, 2, T, R, pfx_66, WIG, L0 }, /* vmovw */
     { { 0x7e }, 2, T, W, pfx_66, WIG, L0 }, /* vmovw */
 }, evex_map6[] = {
+    { { 0x2c }, 2, T, R, pfx_66, W0, Ln }, /* vscalefph */
+    { { 0x2d }, 2, T, R, pfx_66, W0, LIG }, /* vscalefsh */
+    { { 0x42 }, 2, T, R, pfx_66, W0, Ln }, /* vgetexpph */
+    { { 0x43 }, 2, T, R, pfx_66, W0, LIG }, /* vgetexpsh */
+    { { 0x4c }, 2, T, R, pfx_66, W0, Ln }, /* vrcpph */
+    { { 0x4d }, 2, T, R, pfx_66, W0, LIG }, /* vrcpsh */
+    { { 0x4e }, 2, T, R, pfx_66, W0, Ln }, /* vrsqrtph */
+    { { 0x4f }, 2, T, R, pfx_66, W0, LIG }, /* vrsqrtsh */
     { { 0x96 }, 2, T, R, pfx_66, W0, Ln }, /* vfmaddsub132ph */
     { { 0x97 }, 2, T, R, pfx_66, W0, Ln }, /* vfmsubadd132ph */
     { { 0x98 }, 2, T, R, pfx_66, W0, Ln }, /* vfmadd132ph */
--- a/xen/arch/x86/x86_emulate/decode.c
+++ b/xen/arch/x86/x86_emulate/decode.c
@@ -358,7 +358,7 @@ static const struct ext0f38_table {
     [0x2a] = { .simd_size = simd_packed_int, .two_op = 1, .d8s = d8s_vl },
     [0x2b] = { .simd_size = simd_packed_int, .d8s = d8s_vl },
     [0x2c] = { .simd_size = simd_packed_fp, .d8s = d8s_vl },
-    [0x2d] = { .simd_size = simd_packed_fp, .d8s = d8s_dq },
+    [0x2d] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
     [0x2e ... 0x2f] = { .simd_size = simd_packed_fp, .to_mem = 1 },
     [0x30] = { .simd_size = simd_other, .two_op = 1, .d8s = d8s_vl_by_2 },
     [0x31] = { .simd_size = simd_other, .two_op = 1, .d8s = d8s_vl_by_4 },
@@ -909,8 +909,8 @@ decode_0f38(struct x86_emulate_state *s,
         ctxt->opcode |= MASK_INSR(s->vex.pfx, X86EMUL_OPC_PFX_MASK);
         break;
 
-    case X86EMUL_OPC_EVEX_66(0, 0x2d): /* vscalefs{s,d} */
-        s->simd_size = simd_scalar_vexw;
+    case X86EMUL_OPC_VEX_66(0, 0x2d): /* vmaskmovpd */
+        s->simd_size = simd_packed_fp;
         break;
 
     case X86EMUL_OPC_EVEX_66(0, 0x7a): /* vpbroadcastb */
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -7780,6 +7780,8 @@ x86_emulate(
         generate_exception_if(evex.w, EXC_UD);
         goto avx512f_all_fp;
 
+    case X86EMUL_OPC_EVEX_66(6, 0x2c): /* vscalefph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0x42): /* vgetexpph [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x96): /* vfmaddsub132ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x97): /* vfmsubadd132ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x98): /* vfmadd132ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
@@ -7804,6 +7806,8 @@ x86_emulate(
             avx512_vlen_check(false);
         goto simd_zmm;
 
+    case X86EMUL_OPC_EVEX_66(6, 0x2d): /* vscalefsh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0x43): /* vgetexpsh xmm/m16,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x99): /* vfmadd132sh xmm/m16,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x9b): /* vfmsub132sh xmm/m16,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x9d): /* vfnmadd132sh xmm/m16,xmm,xmm{k} */
@@ -7823,6 +7827,19 @@ x86_emulate(
             avx512_vlen_check(true);
         goto simd_zmm;
 
+    case X86EMUL_OPC_EVEX_66(6, 0x4c): /* vrcpph [xyz]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(6, 0x4e): /* vrsqrtph [xyz]mm/mem,[xyz]mm{k} */
+        host_and_vcpu_must_have(avx512_fp16);
+        generate_exception_if(evex.w, EXC_UD);
+        goto avx512f_no_sae;
+
+    case X86EMUL_OPC_EVEX_66(6, 0x4d): /* vrcpsh xmm/m16,xmm,xmm{k}
*/ + case X86EMUL_OPC_EVEX_66(6, 0x4f): /* vrsqrtsh xmm/m16,xmm,xmm{k} */ + host_and_vcpu_must_have(avx512_fp16); + generate_exception_if(evex.w || evex.brs, EXC_UD); + avx512_vlen_check(true); + goto simd_zmm; + case X86EMUL_OPC_XOP(08, 0x85): /* vpmacssww xmm,xmm/m128,xmm,xmm */ case X86EMUL_OPC_XOP(08, 0x86): /* vpmacsswd xmm,xmm/m128,xmm,xmm */ case X86EMUL_OPC_XOP(08, 0x87): /* vpmacssdql xmm,xmm/m128,xmm,xmm */ From patchwork Wed Jun 15 10:29:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 12882099 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0B679C43334 for ; Wed, 15 Jun 2022 10:38:27 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.349990.576219 (Exim 4.92) (envelope-from ) id 1o1QPb-0002R0-N8; Wed, 15 Jun 2022 10:38:11 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 349990.576219; Wed, 15 Jun 2022 10:38:11 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1o1QPb-0002Ql-Gd; Wed, 15 Jun 2022 10:38:11 +0000 Received: by outflank-mailman (input) for mailman id 349990; Wed, 15 Jun 2022 10:38:10 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1o1QHf-0004ln-Jt for xen-devel@lists.xenproject.org; Wed, 15 Jun 2022 10:29:59 +0000 Received: from EUR04-DB3-obe.outbound.protection.outlook.com (mail-db3eur04on061e.outbound.protection.outlook.com 
[2a01:111:f400:fe0c::61e]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 1ad2e8e1-ec96-11ec-ab14-113154c10af9; Wed, 15 Jun 2022 12:29:58 +0200 (CEST) Received: from VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) by DB8PR04MB6876.eurprd04.prod.outlook.com (2603:10a6:10:116::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5332.20; Wed, 15 Jun 2022 10:29:55 +0000 Received: from VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::dfa:a64a:432f:e26b]) by VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::dfa:a64a:432f:e26b%7]) with mapi id 15.20.5332.020; Wed, 15 Jun 2022 10:29:55 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 1ad2e8e1-ec96-11ec-ab14-113154c10af9 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=ZHem/YFsl4OXr3QZw1vdeNx/CBpoB6f7cuOEX3X031tcWhdCs9Y6uAjknYbzO+8BRJ6mFclSZSNnxoxxZh7HH7Gh/m4DKWvkaO8Sh6hKJzicEFy2Wj9E38ZWj7UvkTdpnxjlS+fNeTzFbhiOH3wzA7R24mkJ4JCzmswaZ8nEXTVgP+KYFRPl4xZz8cMvMfriNdRrKAPGVDi6EzQLlrhN1vDNwuomHPEMzn7fr6DpnLLbysAcmGjLMUgbcYCd97/P/ZKDlcJFrVmHB7H/rc9o48ZLjEzXy9gemIs4D0vu6sioZxOJab0gY7RRdlkSJibObfhNLrZzQV+ksRzugXi9wg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=eQbKjA8/usEOvga5i8HP0FnNQfY6bpT7pO+ZFMrlli4=; 
b=DXJZT59aV7Nmdnb7jE0oPiodRePh4q/pZnnF2UQAJ65SbeKHGdiYdshIAWZj/teghme053sOeAfDQ5W1iR6x5sd7jjyErk1OVhhX2cT5I3iZzezLlS0BpvUjMDbnSFrz2Pdq3UVDHOEQ5nWMJHwcTUi+fzxGTU2wUrzfcnUQXIVW7C+CA9zRPBerIvGkhJydd4S+1dZZpZipRyFG8nzSzz7gaTCT5/BTo4J3MNEVjilDugkFx0eEihyZoqHCx6TLtLMupc1LsWZbVTck9d3Pb7BpuqYleHgm1+mdhKrf/3SKgl5CSZOjbPKv3/g72yucE8D+NSUPEQweBdwkFT+jkg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=eQbKjA8/usEOvga5i8HP0FnNQfY6bpT7pO+ZFMrlli4=; b=Fa/6DLSWxc8V16DI+a4Aa8735JXTvh2lizXXwvVV+hUzUfzk9rEy5JQZa4lRLZghic8kugNDqcwdvQMLpdu1XjUJD+RYXunNqOmELIw8bYjrC3qADY4rAzgRLy13euLHLMcjR4OxCxcZu4FzB0dQP0/+TQRXAmE+wpI29gyX+XBLbKPl96WtybX5llpGNajPVnN3uGTjq8UqVsID4h4rGLd0jSAN2OyQySmMIH0NXw3j0O2t5PPzPZRaqVfOIKfSfv0a87fO1XfVvMZcIuEIyZgKh9VeYReKtOgFSntB80xuDDn5/h/U12OLrHJtEVzfnfMzlFOOmcKy33BDYZn58Q== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com; Message-ID: Date: Wed, 15 Jun 2022 12:29:53 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Subject: [PATCH 07/11] x86emul: handle AVX512-FP16 complex multiplication insns Content-Language: en-US From: Jan Beulich To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , Wei Liu , =?utf-8?q?Roger_Pau_Monn=C3=A9?= References: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com> In-Reply-To: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com> X-ClientProxiedBy: AM6PR08CA0042.eurprd08.prod.outlook.com (2603:10a6:20b:c0::30) To VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 7821c3d1-e5d0-4d70-2235-08da4eb9fd7a X-MS-TrafficTypeDiagnostic: 
X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: nXdKV5xn5ShlcB9wbtNQJATumr9ACa5l2RyFaLAfb82bSmXpmInSstpxcRIE26XE6alOifnbSKOKept+x5OKOw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB8PR04MB6876

Aspects to consider are that these have 32-bit element size (pairs of
FP16) and that there are restrictions on the registers valid to use.

Signed-off-by: Jan Beulich
Acked-by: Andrew Cooper

--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -614,12 +614,18 @@ static const struct test avx512_fp16_all
     INSN(comish, , map5, 2f, el, fp16, el),
     INSN(divph, , map5, 5e, vl, fp16, vl),
     INSN(divsh, f3, map5, 5e, el, fp16, el),
+    INSNX(fcmaddcph, f2, map6, 56, 1, vl, d, vl),
+    INSNX(fcmaddcsh, f2, map6, 57, 1, el, d, el),
+    INSNX(fcmulcph, f2, map6, d6, 1, vl, d, vl),
+    INSNX(fcmulcsh, f2, map6, d7, 1, el, d, el),
     INSN(fmadd132ph, 66, map6, 98, vl, fp16, vl),
     INSN(fmadd132sh, 66, map6, 99, el, fp16, el),
     INSN(fmadd213ph, 66, map6, a8, vl, fp16, vl),
     INSN(fmadd213sh, 66, map6, a9, el, fp16, el),
     INSN(fmadd231ph, 66, map6, b8, vl, fp16, vl),
     INSN(fmadd231sh, 66, map6, b9, el, fp16, el),
+    INSNX(fmaddcph, f3, map6, 56, 1, vl, d, vl),
+    INSNX(fmaddcsh, f3, map6, 57, 1, el, d, el),
     INSN(fmaddsub132ph, 66, map6, 96, vl, fp16, vl),
     INSN(fmaddsub213ph, 66, map6, a6, vl, fp16, vl),
     INSN(fmaddsub231ph, 66, map6, b6, vl, fp16, vl),
@@ -632,6 +638,8 @@ static const struct test avx512_fp16_all
     INSN(fmsubadd132ph, 66, map6, 97, vl, fp16, vl),
     INSN(fmsubadd213ph, 66, map6, a7, vl, fp16, vl),
     INSN(fmsubadd231ph, 66, map6, b7, vl, fp16, vl),
+    INSNX(fmulcph, f3, map6, d6, 1, vl, d, vl),
+    INSNX(fmulcsh, f3, map6, d7, 1, el, d, el),
     INSN(fnmadd132ph, 66, map6, 9c, vl, fp16, vl),
     INSN(fnmadd132sh, 66, map6, 9d, el, fp16, el),
     INSN(fnmadd213ph, 66, map6, ac, vl, fp16, vl),
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -2058,6 +2058,10 @@ static const struct evex {
     { { 0x4d }, 2, T, R, pfx_66, W0, LIG }, /* vrcpsh */
     { { 0x4e }, 2, T, R, pfx_66, W0, Ln }, /* vrsqrtph */
     { { 0x4f }, 2, T, R, pfx_66, W0, LIG }, /* vrsqrtsh */
+    { { 0x56 }, 2, T, R, pfx_f3, W0, Ln }, /* vfmaddcph */
+    { { 0x56 }, 2, T, R, pfx_f2, W0, Ln }, /* vfcmaddcph */
+    { { 0x57 }, 2, T, R, pfx_f3, W0, LIG }, /* vfmaddcsh */
+    { { 0x57 }, 2, T, R, pfx_f2, W0, LIG }, /* vfcmaddcsh */
     { { 0x96 }, 2, T, R, pfx_66, W0, Ln }, /* vfmaddsub132ph */
     { { 0x97 }, 2, T, R, pfx_66, W0, Ln }, /* vfmsubadd132ph */
     { { 0x98 }, 2, T, R, pfx_66, W0, Ln }, /* vfmadd132ph */
@@ -2088,6 +2092,10 @@ static const struct evex {
     { { 0xbd }, 2, T, R, pfx_66, W0, LIG }, /* vfnmadd231sh */
     { { 0xbe }, 2, T, R, pfx_66, W0, Ln }, /* vfnmsub231ph */
     { { 0xbf }, 2, T, R, pfx_66, W0, LIG }, /* vfnmsub231sh */
+    { { 0xd6 }, 2, T, R, pfx_f3, W0, Ln }, /* vfmulcph */
+    { { 0xd6 }, 2, T, R, pfx_f2, W0, Ln }, /* vfcmulcph */
+    { { 0xd7 }, 2, T, R, pfx_f3, W0, LIG }, /* vfmulcsh */
+    { { 0xd7 }, 2, T, R, pfx_f2, W0, LIG }, /* vfcmulcsh */
 };

 static const struct {
--- a/xen/arch/x86/x86_emulate/decode.c
+++ b/xen/arch/x86/x86_emulate/decode.c
@@ -379,6 +379,8 @@ static const struct ext0f38_table {
     [0x4f] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
     [0x50 ... 0x53] = { .simd_size = simd_packed_int, .d8s = d8s_vl },
     [0x54 ... 0x55] = { .simd_size = simd_packed_int, .two_op = 1, .d8s = d8s_vl },
+    [0x56] = { .simd_size = simd_other, .d8s = d8s_vl },
+    [0x57] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
     [0x58] = { .simd_size = simd_other, .two_op = 1, .d8s = 2 },
     [0x59] = { .simd_size = simd_other, .two_op = 1, .d8s = 3 },
     [0x5a] = { .simd_size = simd_128, .two_op = 1, .d8s = 4 },
@@ -441,6 +443,8 @@ static const struct ext0f38_table {
     [0xcc] = { .simd_size = simd_packed_fp, .two_op = 1, .d8s = d8s_vl },
     [0xcd] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
     [0xcf] = { .simd_size = simd_packed_int, .d8s = d8s_vl },
+    [0xd6] = { .simd_size = simd_other, .d8s = d8s_vl },
+    [0xd7] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
     [0xdb] = { .simd_size = simd_packed_int, .two_op = 1 },
     [0xdc ... 0xdf] = { .simd_size = simd_packed_int, .d8s = d8s_vl },
     [0xf0] = { .two_op = 1 },
@@ -1502,6 +1506,10 @@ int x86emul_decode(struct x86_emulate_st
                     if ( s->evex.pfx == vex_66 )
                         s->fp16 = true;
                     break;
+
+                case 0x56: case 0x57: /* vf{,c}maddc{p,s}h */
+                case 0xd6: case 0xd7: /* vf{,c}mulc{p,s}h */
+                    break;
                 }

                 disp8scale = decode_disp8scale(ext0f38_table[b].d8s, s);
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -7840,6 +7840,34 @@ x86_emulate(
         avx512_vlen_check(true);
         goto simd_zmm;

+    case X86EMUL_OPC_EVEX_F3(6, 0x56): /* vfmaddcph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F2(6, 0x56): /* vfcmaddcph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F3(6, 0xd6): /* vfmulcph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F2(6, 0xd6): /* vfcmulcph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
+        op_bytes = 16 << evex.lr;
+        /* fall through */
+    case X86EMUL_OPC_EVEX_F3(6, 0x57): /* vfmaddcsh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_F2(6, 0x57): /* vfcmaddcsh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_F3(6, 0xd7): /* vfmulcsh xmm/m16,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX_F2(6, 0xd7): /* vfcmulcsh xmm/m16,xmm,xmm{k} */
+    {
+        unsigned int src1 = ~evex.reg;
+
+        host_and_vcpu_must_have(avx512_fp16);
+        generate_exception_if(evex.w || ((b & 1) && ea.type != OP_REG && evex.brs),
+                              EXC_UD);
+        if ( mode_64bit() )
+            src1 = (src1 & 0xf) | (!evex.RX << 4);
+        else
+            src1 &= 7;
+        generate_exception_if(modrm_reg == src1 ||
+                              (ea.type != OP_MEM && modrm_reg == modrm_rm),
+                              EXC_UD);
+        if ( ea.type != OP_REG || (b & 1) || !evex.brs )
+            avx512_vlen_check(!(b & 1));
+        goto simd_zmm;
+    }
+
     case X86EMUL_OPC_XOP(08, 0x85): /* vpmacssww xmm,xmm/m128,xmm,xmm */
     case X86EMUL_OPC_XOP(08, 0x86): /* vpmacsswd xmm,xmm/m128,xmm,xmm */
     case X86EMUL_OPC_XOP(08, 0x87): /* vpmacssdql xmm,xmm/m128,xmm,xmm */

From patchwork Wed Jun 15 10:30:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 12882088 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A0CA4C43334 for ; Wed, 15 Jun 2022 10:30:48 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.349942.576153 (Exim 4.92) (envelope-from ) id 1o1QIF-00075j-1c; Wed, 15 Jun 2022 10:30:35 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 349942.576153; Wed, 15 Jun 2022 10:30:34 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1o1QIE-00075c-Tk; Wed, 15 Jun 2022 10:30:34 +0000 Received: by outflank-mailman (input) for mailman id 349942; Wed, 15 Jun 2022 10:30:33 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with
esmtp (Exim 4.92) (envelope-from ) id 1o1QID-00075T-HJ for xen-devel@lists.xenproject.org; Wed, 15 Jun 2022 10:30:33 +0000 Received: from EUR04-DB3-obe.outbound.protection.outlook.com (mail-db3eur04on0619.outbound.protection.outlook.com [2a01:111:f400:fe0c::619]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 2ef00ec7-ec96-11ec-bd2c-47488cf2e6aa; Wed, 15 Jun 2022 12:30:32 +0200 (CEST) Received: from VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) by DB8PR04MB6876.eurprd04.prod.outlook.com (2603:10a6:10:116::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5332.20; Wed, 15 Jun 2022 10:30:30 +0000 Received: from VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::dfa:a64a:432f:e26b]) by VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::dfa:a64a:432f:e26b%7]) with mapi id 15.20.5332.020; Wed, 15 Jun 2022 10:30:30 +0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 2ef00ec7-ec96-11ec-bd2c-47488cf2e6aa ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=CEFE/xmpqlrJK+fT76C9M+gtQo5+W8+8r3Be5Yk9tJ4Yo76VQocBr1ikp+pbN4oCFn+6tPHazjl46QictiIsH2GkS8SJ7EZCP0n4QtXNsq2/xwPrdmGN/rmtTVtcaMhGklZPOrVGcErIQj4RjVh7krUHfKAJeFmZebRy5qyRNKsE0YI3KEaPRrDbjC8LwmIJdmutlJfmCa7+OXDfJbAQX8ODqxWTQ9w+6e3u42TC5m3+DZe1RA9nLTpeNUacSpe6yChV6D8ghNlNJmJ2NoIBr4NzDUmF6gs7XtMjhrs6CkkIXDkvF9Oov2RqKhIw6zR1rdE+TXA4JSxfJ98siyMJ+g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Pf+9jy8WYs5YFSSKVO0mOzJRe1qi6wwM0nQAc2XnSdc=; 
b=Y+fPwhomFfs23C+nstPksrmqXpStoIO72iUKaGMtiHdztJeRVLnMYcm6uYDeYvQ4Z1xwPBB3uXYkc2OoRxwjxM8ivAS7d7JT3KV119VzCPX80cldW1wuD2Om8m+8Yu4BjeKaIZMqYJ0JlI14V1sjR2iA+jaiNbaSQFdc5tzJQjLF6jHKHQ3rt616y796jMTgKaMRHhGwMhf57o+nA5FgKGaP44v7xy7qqqNNnaOKrrL+gZ/kwCwz0lR0TMJ5wfLZ4VoE/PCNJUubDABqDUG5PAU9/KU5wuEWqPpVmJaUsO5IvX9rYPhH7TCfjZMmWxOoVQxZLATXgkAeljZQ4Op4wA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Pf+9jy8WYs5YFSSKVO0mOzJRe1qi6wwM0nQAc2XnSdc=; b=KonsezMEaA0kBHvSz0isaYWLV1R+DcXASQK6RhfbJPl7qJ4N+dgFbH7mEkShDWTo4+itrrrovK4aiN7wmW4d3s4AsNbP/NLiQJiW9p5VwVrFMk5iY9NZnS86IoJA2l1k3yPhQU4mcf+DlNQkQ1mXtGbPNxMv262xFXc0osI68CUzpt1FXf/dAuaFGuLPIUpONiWl/e1E/DdaVDcF/dAlOe6Gj5Oaqd49uQIOtkFGoIPwvj6zPx8AEGq8hORc9DJApvtn2pTgmpydZQq8fGxgP4hVsiM/h60WfY8gMwmLt56gIifjvppuGUJDL9ZSel1vUo+MKSRcFz9GDKZcNWi4zw== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com; Message-ID: <5c77cdba-fac9-d82b-9d68-40f8b4f82d66@suse.com> Date: Wed, 15 Jun 2022 12:30:29 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Subject: [PATCH 08/11] x86emul: handle AVX512-FP16 conversion to/from (packed) int16 insns Content-Language: en-US From: Jan Beulich To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , Wei Liu , =?utf-8?q?Roger_Pau_Monn=C3=A9?= References: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com> In-Reply-To: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com> X-ClientProxiedBy: AS9PR06CA0015.eurprd06.prod.outlook.com (2603:10a6:20b:462::6) To VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: XkF9fVRpUNudE1NJvPPyL8oSiga4aDpF3JWWduid5HQLuVdkTOzYU/HQoVEsPRcxesKv7mp0VPSaSOssTIqMOQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB8PR04MB6876

These are easiest in that they have same-size source and destination
vectors, yet they're different from other conversion insns in that they
use opcodes which have different meaning in the 0F encoding space
({,V}H{ADD,SUB}P{S,D}), hence requiring a little bit of overriding.

Signed-off-by: Jan Beulich

--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -612,6 +612,12 @@ static const struct test avx512_fp16_all
     INSN(cmpph, , 0f3a, c2, vl, fp16, vl),
     INSN(cmpsh, f3, 0f3a, c2, el, fp16, el),
     INSN(comish, , map5, 2f, el, fp16, el),
+    INSN(cvtph2uw, , map5, 7d, vl, fp16, vl),
+    INSN(cvtph2w, 66, map5, 7d, vl, fp16, vl),
+    INSN(cvttph2uw, , map5, 7c, vl, fp16, vl),
+    INSN(cvttph2w, 66, map5, 7c, vl, fp16, vl),
+    INSN(cvtuw2ph, f2, map5, 7d, vl, fp16, vl),
+    INSN(cvtw2ph, f3, map5, 7d, vl, fp16, vl),
     INSN(divph, , map5, 5e, vl, fp16, vl),
     INSN(divsh, f3, map5, 5e, el, fp16, el),
     INSNX(fcmaddcph, f2, map6, 56, 1, vl, d, vl),
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -2048,6 +2048,12 @@ static const struct evex {
     { { 0x5f }, 2, T, R, pfx_no, W0, Ln }, /* vmaxph */
     { { 0x5f }, 2, T, R, pfx_f3, W0, LIG }, /* vmaxsh */
     { { 0x6e }, 2, T, R, pfx_66, WIG, L0 }, /* vmovw */
+    { { 0x7c }, 2, T, R, pfx_no, W0, Ln }, /* vcvttph2uw */
+    { { 0x7c }, 2, T, R, pfx_66, W0, Ln }, /* vcvttph2w */
+    { { 0x7d }, 2, T, R, pfx_no, W0, Ln }, /* vcvtph2uw */
+    { { 0x7d }, 2, T, R, pfx_66, W0, Ln }, /* vcvtph2w */
+    { { 0x7d }, 2, T, R, pfx_f3, W0, Ln }, /* vcvtw2ph */
+    { { 0x7d }, 2, T, R, pfx_f2, W0, Ln }, /* vcvtuw2ph */
     { { 0x7e }, 2, T, W, pfx_66, WIG, L0 }, /* vmovw */
 }, evex_map6[] = {
     { { 0x2c }, 2, T, R, pfx_66, W0, Ln }, /* vscalefph */
--- a/xen/arch/x86/x86_emulate/decode.c
+++ b/xen/arch/x86/x86_emulate/decode.c
@@ -259,7 +259,7 @@ static const struct twobyte_table {
     [0x78 ... 0x79] = { DstImplicit|SrcMem|ModRM|Mov, simd_other, d8s_vl },
     [0x7a] = { DstImplicit|SrcMem|ModRM|Mov, simd_packed_fp, d8s_vl },
     [0x7b] = { DstImplicit|SrcMem|ModRM|Mov, simd_other, d8s_dq64 },
-    [0x7c ... 0x7d] = { DstImplicit|SrcMem|ModRM, simd_other },
+    [0x7c ... 0x7d] = { DstImplicit|SrcMem|ModRM, simd_other, d8s_vl },
     [0x7e] = { DstMem|SrcImplicit|ModRM|Mov, simd_none, d8s_dq64 },
     [0x7f] = { DstMem|SrcImplicit|ModRM|Mov, simd_packed_int, d8s_vl },
     [0x80 ... 0x8f] = { DstImplicit|SrcImm },
@@ -1488,6 +1488,12 @@ int x86emul_decode(struct x86_emulate_st
                     s->fp16 = true;
                     s->simd_size = simd_none;
                     break;
+
+                case 0x7c: /* vcvttph2{,u}w */
+                case 0x7d: /* vcvtph2{,u}w / vcvt{,u}w2ph */
+                    d = DstReg | SrcMem | TwoOp;
+                    s->fp16 = true;
+                    break;
                 }

                 disp8scale = decode_disp8scale(twobyte_table[b].d8s, s);
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -7780,6 +7780,14 @@ x86_emulate(
         generate_exception_if(evex.w, EXC_UD);
         goto avx512f_all_fp;

+    case X86EMUL_OPC_EVEX (5, 0x7c): /* vcvttph2uw [xyz]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(5, 0x7c): /* vcvttph2w [xyz]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX (5, 0x7d): /* vcvtph2uw [xyz]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(5, 0x7d): /* vcvtph2w [xyz]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F3(5, 0x7d): /* vcvtw2ph [xyz]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F2(5, 0x7d): /* vcvtuw2ph [xyz]mm/mem,[xyz]mm{k} */
+        op_bytes = 16 << evex.lr;
+        /* fall through */
     case X86EMUL_OPC_EVEX_66(6, 0x2c): /* vscalefph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x42): /* vgetexpph [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x96): /* vfmaddsub132ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */

From patchwork
Wed Jun 15 10:30:50 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 12882089
Message-ID: <4d9c76d3-763c-c1e6-f38b-9282f023a995@suse.com>
Date: Wed, 15 Jun 2022 12:30:50 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0
Subject: [PATCH 09/11] x86emul: handle AVX512-FP16 floating point conversion insns
Content-Language: en-US
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
References: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com>
In-Reply-To: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com>

Signed-off-by: Jan Beulich
Acked-by: Andrew Cooper

--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -612,8 +612,16 @@ static const struct test avx512_fp16_all
     INSN(cmpph, , 0f3a, c2, vl, fp16, vl),
     INSN(cmpsh, f3, 0f3a, c2, el, fp16, el),
     INSN(comish, , map5, 2f, el, fp16, el),
+    INSN(cvtpd2ph, 66, map5, 5a, vl, q, vl),
+    INSN(cvtph2pd, , map5, 5a, vl_4, fp16, vl),
+    INSN(cvtph2psx, 66, map6, 13, vl_2, fp16, vl),
     INSN(cvtph2uw, , map5, 7d, vl, fp16, vl),
     INSN(cvtph2w, 66, map5, 7d, vl, fp16, vl),
+    INSN(cvtps2phx, 66, map5, 1d, vl, d, vl),
+    INSN(cvtsd2sh, f2, map5, 5a, el, q, el),
+    INSN(cvtsh2sd, f3, map5, 5a, el, fp16, el),
+    INSN(cvtsh2ss, , map6, 13, el, fp16, el),
+    INSN(cvtss2sh, , map5, 1d, el, d, el),
     INSN(cvttph2uw, , map5, 7c, vl, fp16, vl),
     INSN(cvttph2w, 66, map5, 7c, vl, fp16, vl),
     INSN(cvtuw2ph, f2, map5, 7d, vl, fp16, vl),
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -2031,6 +2031,8 @@ static const struct evex {
 }, evex_map5[] = {
     { { 0x10 }, 2, T, R, pfx_f3, W0, LIG }, /* vmovsh */
     { { 0x11 }, 2, T, W, pfx_f3, W0, LIG }, /* vmovsh */
+    { { 0x1d }, 2, T, R, pfx_66, W0, Ln }, /* vcvtps2phx */
+    { { 0x1d }, 2, T, R, pfx_no, W0, LIG }, /* vcvtss2sh */
     { { 0x2e }, 2, T, R, pfx_no, W0, LIG }, /* vucomish */
     { { 0x2f }, 2, T, R, pfx_no, W0, LIG }, /* vcomish */
     { { 0x51 }, 2, T, R, pfx_no, W0, Ln }, /* vsqrtph */
@@ -2039,6 +2041,10 @@ static const struct evex {
     { { 0x58 }, 2, T, R, pfx_f3, W0, LIG }, /* vaddsh */
     { { 0x59 }, 2, T, R, pfx_no, W0, Ln }, /* vmulph */
     { { 0x59 }, 2, T, R, pfx_f3, W0, LIG }, /* vmulsh */
+    { { 0x5a }, 2, T, R, pfx_no, W0, Ln }, /* vcvtph2pd */
+    { { 0x5a }, 2, T, R, pfx_66, W1, Ln }, /* vcvtpd2ph */
+    { { 0x5a }, 2, T, R, pfx_f3, W0, LIG }, /* vcvtsh2sd */
+    { { 0x5a }, 2, T, R, pfx_f2, W1, LIG }, /* vcvtsd2sh */
     { { 0x5c }, 2, T, R, pfx_no, W0, Ln }, /* vsubph */
     { { 0x5c }, 2, T, R, pfx_f3, W0, LIG }, /* vsubsh */
     { { 0x5d }, 2, T, R, pfx_no, W0, Ln }, /* vminph */
@@ -2056,6 +2062,8 @@ static const struct evex {
     { { 0x7d }, 2, T, R, pfx_f2, W0, Ln }, /* vcvtuwph */
     { { 0x7e }, 2, T, W, pfx_66, WIG, L0 }, /* vmovw */
 }, evex_map6[] = {
+    { { 0x13 }, 2, T, R, pfx_66, W0, Ln }, /* vcvtph2psx */
+    { { 0x13 }, 2, T, R, pfx_no, W0, LIG }, /* vcvtsh2ss */
     { { 0x2c }, 2, T, R, pfx_66, W0, Ln }, /* vscalefph */
     { { 0x2d }, 2, T, R, pfx_66, W0, LIG }, /* vscalefsh */
     { { 0x42 }, 2, T, R, pfx_66, W0, Ln }, /* vgetexpph */
--- a/xen/arch/x86/x86_emulate/decode.c
+++ b/xen/arch/x86/x86_emulate/decode.c
@@ -224,7 +224,9 @@ static const struct twobyte_table {
     [0x14 ... 0x15] = { DstImplicit|SrcMem|ModRM, simd_packed_fp, d8s_vl },
     [0x16] = { DstImplicit|SrcMem|ModRM|Mov, simd_other, 3 },
     [0x17] = { DstMem|SrcImplicit|ModRM|Mov, simd_other, 3 },
-    [0x18 ... 0x1f] = { ImplicitOps|ModRM },
+    [0x18 ... 0x1c] = { ImplicitOps|ModRM },
+    [0x1d] = { ImplicitOps|ModRM, simd_none, d8s_vl },
+    [0x1e ... 0x1f] = { ImplicitOps|ModRM },
     [0x20 ... 0x21] = { DstMem|SrcImplicit|ModRM },
     [0x22 ... 0x23] = { DstImplicit|SrcMem|ModRM },
     [0x28] = { DstImplicit|SrcMem|ModRM|Mov, simd_packed_fp, d8s_vl },
@@ -1474,6 +1476,19 @@ int x86emul_decode(struct x86_emulate_st
                         s->fp16 = true;
                     break;

+                case 0x1d: /* vcvtps2phx / vcvtss2sh */
+                    if ( s->evex.pfx & VEX_PREFIX_SCALAR_MASK )
+                        break;
+                    d = DstReg | SrcMem;
+                    if ( s->evex.pfx & VEX_PREFIX_DOUBLE_MASK )
+                    {
+                        s->simd_size = simd_packed_fp;
+                        d |= TwoOp;
+                    }
+                    else
+                        s->simd_size = simd_scalar_vexw;
+                    break;
+
                 case 0x2e: case 0x2f: /* v{,u}comish */
                     if ( !s->evex.pfx )
                         s->fp16 = true;
@@ -1497,6 +1512,15 @@ int x86emul_decode(struct x86_emulate_st
                 }

                 disp8scale = decode_disp8scale(twobyte_table[b].d8s, s);
+
+                switch ( b )
+                {
+                case 0x5a: /* vcvtph2pd needs special casing */
+                    if ( !s->evex.pfx && !s->evex.brs )
+                        disp8scale -= 2;
+                    break;
+                }
+
                 break;

             case ext_map6:
@@ -1513,6 +1537,17 @@ int x86emul_decode(struct x86_emulate_st
                         s->fp16 = true;
                     break;

+                case 0x13: /* vcvtph2psx / vcvtsh2ss */
+                    if ( s->evex.pfx & VEX_PREFIX_SCALAR_MASK )
+                        break;
+                    s->fp16 = true;
+                    if ( !(s->evex.pfx & VEX_PREFIX_DOUBLE_MASK) )
+                    {
+                        s->simd_size = simd_scalar_vexw;
+                        d &= ~TwoOp;
+                    }
+                    break;
+
                 case 0x56: case 0x57: /* vf{,c}maddc{p,s}h */
                 case 0xd6: case 0xd7: /* vf{,c}mulc{p,s}h */
                     break;
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -7780,14 +7780,25 @@ x86_emulate(
         generate_exception_if(evex.w, EXC_UD);
         goto avx512f_all_fp;

+    CASE_SIMD_ALL_FP(_EVEX, 5, 0x5a): /* vcvtp{h,d}2p{h,d} [xyz]mm/mem,[xyz]mm{k} */
+                                      /* vcvts{h,d}2s{h,d} xmm/mem,xmm,xmm{k} */
+        host_and_vcpu_must_have(avx512_fp16);
+        if ( vex.pfx & VEX_PREFIX_SCALAR_MASK )
+            d &= ~TwoOp;
+        op_bytes = 2 << (((evex.pfx & VEX_PREFIX_SCALAR_MASK) ? 0 : 1 + evex.lr) +
+                         2 * evex.w);
+        goto avx512f_all_fp;
+
     case X86EMUL_OPC_EVEX   (5, 0x7c): /* vcvttph2uw [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(5, 0x7c): /* vcvttph2w [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX   (5, 0x7d): /* vcvtph2uw [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(5, 0x7d): /* vcvtph2w [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_F3(5, 0x7d): /* vcvtw2ph [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_F2(5, 0x7d): /* vcvtuw2ph [xyz]mm/mem,[xyz]mm{k} */
-        op_bytes = 16 << evex.lr;
+    case X86EMUL_OPC_EVEX_66(6, 0x13): /* vcvtph2psx [xy]mm/mem,[xyz]mm{k} */
+        op_bytes = 8 << ((ext == ext_map5) + evex.lr);
         /* fall through */
+    case X86EMUL_OPC_EVEX_66(5, 0x1d): /* vcvtps2phx [xyz]mm/mem,[xy]mm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x2c): /* vscalefph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x42): /* vgetexpph [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x96): /* vfmaddsub132ph [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */
@@ -7814,6 +7825,8 @@ x86_emulate(
         avx512_vlen_check(false);
         goto simd_zmm;

+    case X86EMUL_OPC_EVEX(5, 0x1d): /* vcvtss2sh xmm/mem,xmm,xmm{k} */
+    case X86EMUL_OPC_EVEX(6, 0x13): /* vcvtsh2ss xmm/mem,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x2d): /* vscalefsh xmm/m16,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x43): /* vgetexpsh xmm/m16,xmm,xmm{k} */
     case X86EMUL_OPC_EVEX_66(6, 0x99): /* vfmadd132sh xmm/m16,xmm,xmm{k} */

From patchwork Wed Jun 15 10:31:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 12882090
Message-ID: <2f99e91d-6a91-f860-45d0-9c8b67c9b2b8@suse.com>
Date: Wed, 15 Jun 2022 12:31:09 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0
Subject: [PATCH 10/11] x86emul: handle AVX512-FP16 conversion to/from (packed) int{32,64} insns
Content-Language: en-US
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
References: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com>
In-Reply-To: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com>

Signed-off-by: Jan Beulich

--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -612,18 +612,36 @@ static const struct test avx512_fp16_all
     INSN(cmpph, , 0f3a, c2, vl, fp16, vl),
     INSN(cmpsh, f3, 0f3a, c2, el, fp16, el),
     INSN(comish, , map5, 2f, el, fp16, el),
+    INSN(cvtdq2ph, , map5, 5b, vl, d, vl),
     INSN(cvtpd2ph, 66, map5, 5a, vl, q, vl),
+    INSN(cvtph2dq, 66, map5, 5b, vl_2, fp16, vl),
     INSN(cvtph2pd, , map5, 5a, vl_4, fp16, vl),
     INSN(cvtph2psx, 66, map6, 13, vl_2, fp16, vl),
+    INSN(cvtph2qq, 66, map5, 7b, vl_4, fp16, vl),
+    INSN(cvtph2udq, , map5, 79, vl_2, fp16, vl),
+    INSN(cvtph2uqq, 66, map5, 79, vl_4, fp16, vl),
     INSN(cvtph2uw, , map5, 7d, vl, fp16, vl),
     INSN(cvtph2w, 66, map5, 7d, vl, fp16, vl),
     INSN(cvtps2phx, 66, map5, 1d, vl, d, vl),
+    INSN(cvtqq2ph, , map5, 5b, vl, q, vl),
     INSN(cvtsd2sh, f2, map5, 5a, el, q, el),
     INSN(cvtsh2sd, f3, map5, 5a, el, fp16, el),
+    INSN(cvtsh2si, f3, map5, 2d, el, fp16, el),
     INSN(cvtsh2ss, , map6, 13, el, fp16, el),
+    INSN(cvtsh2usi, f3, map5, 79, el, fp16, el),
+    INSN(cvtsi2sh, f3, map5, 2a, el, dq64, el),
     INSN(cvtss2sh, , map5, 1d, el, d, el),
+    INSN(cvttph2dq, f3, map5, 5b, vl_2, fp16, vl),
+    INSN(cvttph2qq, 66, map5, 7a, vl_4, fp16, vl),
+    INSN(cvttph2udq, , map5, 78, vl_2, fp16, vl),
+    INSN(cvttph2uqq, 66, map5, 78, vl_4, fp16, vl),
     INSN(cvttph2uw, , map5, 7c, vl, fp16, vl),
     INSN(cvttph2w, 66, map5, 7c, vl, fp16, vl),
+    INSN(cvttsh2si, f3, map5, 2c, el, fp16, el),
+    INSN(cvttsh2usi, f3, map5, 78, el, fp16, el),
+    INSN(cvtudq2ph, f2, map5, 7a, vl, d, vl),
+    INSN(cvtuqq2ph, f2, map5, 7a, vl, q, vl),
+    INSN(cvtusi2sh, f3, map5, 7b, el, dq64, el),
     INSN(cvtuw2ph, f2, map5, 7d, vl, fp16, vl),
     INSN(cvtw2ph, f3, map5, 7d, vl, fp16, vl),
     INSN(divph, , map5, 5e, vl, fp16, vl),
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -2033,6 +2033,9 @@ static const struct evex {
     { { 0x11 }, 2, T, W, pfx_f3, W0, LIG }, /* vmovsh */
     { { 0x1d }, 2, T, R, pfx_66, W0, Ln }, /* vcvtps2phx */
     { { 0x1d }, 2, T, R, pfx_no, W0, LIG }, /* vcvtss2sh */
+    { { 0x2a }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvtsi2sh */
+    { { 0x2c }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvttsh2si */
+    { { 0x2d }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvtsh2si */
     { { 0x2e }, 2, T, R, pfx_no, W0, LIG }, /* vucomish */
     { { 0x2f }, 2, T, R, pfx_no, W0, LIG }, /* vcomish */
     { { 0x51 }, 2, T, R, pfx_no, W0, Ln }, /* vsqrtph */
@@ -2045,6 +2048,10 @@ static const struct evex {
     { { 0x5a }, 2, T, R, pfx_66, W1, Ln }, /* vcvtpd2ph */
     { { 0x5a }, 2, T, R, pfx_f3, W0, LIG }, /* vcvtsh2sd */
     { { 0x5a }, 2, T, R, pfx_f2, W1, LIG }, /* vcvtsd2sh */
+    { { 0x5b }, 2, T, R, pfx_no, W0, Ln }, /* vcvtdq2ph */
+    { { 0x5b }, 2, T, R, pfx_no, W1, Ln }, /* vcvtqq2ph */
+    { { 0x5b }, 2, T, R, pfx_66, W0, Ln }, /* vcvtph2dq */
+    { { 0x5b }, 2, T, R, pfx_f3, W0, Ln }, /* vcvttph2dq */
     { { 0x5c }, 2, T, R, pfx_no, W0, Ln }, /* vsubph */
     { { 0x5c }, 2, T, R, pfx_f3, W0, LIG }, /* vsubsh */
     { { 0x5d }, 2, T, R, pfx_no, W0, Ln }, /* vminph */
@@ -2054,6 +2061,17 @@ static const struct evex {
     { { 0x5f }, 2, T, R, pfx_no, W0, Ln }, /* vmaxph */
     { { 0x5f }, 2, T, R, pfx_f3, W0, LIG }, /* vmaxsh */
     { { 0x6e }, 2, T, R, pfx_66, WIG, L0 }, /* vmovw */
+    { { 0x78 }, 2, T, R, pfx_no, W0, Ln }, /* vcvttph2udq */
+    { { 0x78 }, 2, T, R, pfx_66, W0, Ln }, /* vcvttph2uqq */
+    { { 0x78 }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvttsh2usi */
+    { { 0x79 }, 2, T, R, pfx_no, W0, Ln }, /* vcvtph2udq */
+    { { 0x79 }, 2, T, R, pfx_66, W0, Ln }, /* vcvtph2uqq */
+    { { 0x79 }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvtsh2usi */
+    { { 0x7a }, 2, T, R, pfx_66, W0, Ln }, /* vcvttph2qq */
+    { { 0x7a }, 2, T, R, pfx_f2, W0, Ln }, /* vcvtudq2ph */
+    { { 0x7a }, 2, T, R, pfx_f2, W1, Ln }, /* vcvtuqq2ph */
+    { { 0x7b }, 2, T, R, pfx_66, W0, Ln }, /* vcvtph2qq */
+    { { 0x7b }, 2, T, R, pfx_f3, Wn, LIG }, /* vcvtusi2sh */
     { { 0x7c }, 2, T, R, pfx_no, W0, Ln }, /* vcvttph2uw */
     { { 0x7c }, 2, T, R, pfx_66, W0, Ln }, /* vcvttph2w */
     { { 0x7d }, 2, T, R, pfx_no, W0, Ln }, /* vcvtph2uw */
--- a/xen/arch/x86/x86_emulate/decode.c
+++ b/xen/arch/x86/x86_emulate/decode.c
@@ -1489,12 +1489,25 @@ int x86emul_decode(struct x86_emulate_st
                         s->simd_size = simd_scalar_vexw;
                     break;

+                case 0x2a: /* vcvtsi2sh */
+                    break;
+
+                case 0x2c: case 0x2d: /* vcvt{,t}sh2si */
+                    if ( s->evex.pfx == vex_f3 )
+                        s->fp16 = true;
+                    break;
+
                 case 0x2e: case 0x2f: /* v{,u}comish */
                     if ( !s->evex.pfx )
                         s->fp16 = true;
                     s->simd_size = simd_none;
                     break;

+                case 0x5b: /* vcvt{d,q}q2ph, vcvt{,t}ph2dq */
+                    if ( s->evex.pfx && s->evex.pfx != vex_f2 )
+                        s->fp16 = true;
+                    break;
+
                 case 0x6e: /* vmovw r/m16, xmm */
                     d = (d & ~SrcMask) | SrcMem16;
                     /* fall through */
@@ -1504,6 +1517,17 @@ int x86emul_decode(struct x86_emulate_st
                     s->simd_size = simd_none;
                     break;

+                case 0x78: case 0x79: /* vcvt{,t}ph2u{d,q}q, vcvt{,t}sh2usi */
+                    if ( s->evex.pfx != vex_f2 )
+                        s->fp16 = true;
+                    break;
+
+                case 0x7a: /* vcvttph2qq, vcvtu{d,q}q2ph */
+                case 0x7b: /* vcvtph2qq, vcvtusi2sh */
+                    if ( s->evex.pfx == vex_66 )
+                        s->fp16 = true;
+                    break;
+
                 case 0x7c: /* vcvttph2{,u}w */
                 case 0x7d: /* vcvtph2{,u}w / vcvt{,u}w2ph */
                     d = DstReg | SrcMem | TwoOp;
@@ -1515,10 +1539,34 @@ int x86emul_decode(struct x86_emulate_st

                 switch ( b )
                 {
+                case 0x78:
+                case 0x79:
+                    /* vcvt{,t}ph2u{d,q}q need special casing */
+                    if ( s->evex.pfx <= vex_66 )
+                    {
+                        if ( !s->evex.brs )
+                            disp8scale -= 1 + (s->evex.pfx == vex_66);
+                        break;
+                    }
+                    /* vcvt{,t}sh2usi needs special casing: fall through */
+                case 0x2c: case 0x2d: /* vcvt{,t}sh2si need special casing */
+                    disp8scale = 1;
+                    break;
+
                 case 0x5a: /* vcvtph2pd needs special casing */
                     if ( !s->evex.pfx && !s->evex.brs )
                         disp8scale -= 2;
                     break;
+
+                case 0x5b: /* vcvt{,t}ph2dq need special casing */
+                    if ( s->evex.pfx && !s->evex.brs )
+                        --disp8scale;
+                    break;
+
+                case 0x7a: case 0x7b: /* vcvt{,t}ph2qq need special casing */
+                    if ( s->evex.pfx == vex_66 && !s->evex.brs )
+                        disp8scale = s->evex.brs ? 1 : 2 + s->evex.lr;
+                    break;
                 }

                 break;
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -3581,6 +3581,12 @@ x86_emulate(
         state->simd_size = simd_none;
         goto simd_0f_rm;

+#ifndef X86EMUL_NO_SIMD
+
+    case X86EMUL_OPC_EVEX_F3(5, 0x2a): /* vcvtsi2sh r/m,xmm,xmm */
+    case X86EMUL_OPC_EVEX_F3(5, 0x7b): /* vcvtusi2sh r/m,xmm,xmm */
+        host_and_vcpu_must_have(avx512_fp16);
+        /* fall through */
     CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x2a): /* vcvtsi2s{s,d} r/m,xmm,xmm */
     CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x7b): /* vcvtusi2s{s,d} r/m,xmm,xmm */
         generate_exception_if(evex.opmsk || (ea.type != OP_REG && evex.brs),
@@ -3659,7 +3665,9 @@ x86_emulate(
             opc[1] = 0x01;

             rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp,
-                           vex.pfx & VEX_PREFIX_DOUBLE_MASK ? 8 : 4, ctxt);
+                           vex.pfx & VEX_PREFIX_DOUBLE_MASK
+                           ? 8 : 2 << !state->fp16,
+                           ctxt);
             if ( rc != X86EMUL_OKAY )
                 goto done;
         }
@@ -3689,6 +3697,12 @@ x86_emulate(
         state->simd_size = simd_none;
         break;

+    case X86EMUL_OPC_EVEX_F3(5, 0x2c): /* vcvttsh2si xmm/mem,reg */
+    case X86EMUL_OPC_EVEX_F3(5, 0x2d): /* vcvtsh2si xmm/mem,reg */
+    case X86EMUL_OPC_EVEX_F3(5, 0x78): /* vcvttsh2usi xmm/mem,reg */
+    case X86EMUL_OPC_EVEX_F3(5, 0x79): /* vcvtsh2usi xmm/mem,reg */
+        host_and_vcpu_must_have(avx512_fp16);
+        /* fall through */
     CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x2c): /* vcvtts{s,d}2si xmm/mem,reg */
     CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x2d): /* vcvts{s,d}2si xmm/mem,reg */
     CASE_SIMD_SCALAR_FP(_EVEX, 0x0f, 0x78): /* vcvtts{s,d}2usi xmm/mem,reg */
@@ -3760,8 +3774,6 @@ x86_emulate(
         ASSERT(!state->simd_size);
         break;

-#ifndef X86EMUL_NO_SIMD
-
     case X86EMUL_OPC_EVEX(5, 0x2e): /* vucomish xmm/m16,xmm */
     case X86EMUL_OPC_EVEX(5, 0x2f): /* vcomish xmm/m16,xmm */
         host_and_vcpu_must_have(avx512_fp16);
@@ -7789,6 +7801,38 @@ x86_emulate(
                          2 * evex.w);
         goto avx512f_all_fp;

+    case X86EMUL_OPC_EVEX   (5, 0x5b): /* vcvtdq2ph [xyz]mm/mem,[xy]mm{k} */
+                                       /* vcvtqq2ph [xyz]mm/mem,xmm{k} */
+    case X86EMUL_OPC_EVEX_F2(5, 0x7a): /* vcvtudq2ph [xyz]mm/mem,[xy]mm{k} */
+                                       /* vcvtuqq2ph [xyz]mm/mem,xmm{k} */
+        host_and_vcpu_must_have(avx512_fp16);
+        if ( ea.type != OP_REG || !evex.brs )
+            avx512_vlen_check(false);
+        op_bytes = 16 << evex.lr;
+        goto simd_zmm;
+
+    case X86EMUL_OPC_EVEX_66(5, 0x5b): /* vcvtph2dq [xy]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_F3(5, 0x5b): /* vcvttph2dq [xy]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX   (5, 0x78): /* vcvttph2udq [xy]mm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX   (5, 0x79): /* vcvtph2udq [xy]mm/mem,[xyz]mm{k} */
+        host_and_vcpu_must_have(avx512_fp16);
+        generate_exception_if(evex.w, EXC_UD);
+        if ( ea.type != OP_REG || !evex.brs )
+            avx512_vlen_check(false);
+        op_bytes = 8 << evex.lr;
+        goto simd_zmm;
+
+    case X86EMUL_OPC_EVEX_66(5, 0x78): /* vcvttph2uqq xmm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(5, 0x79): /* vcvtph2uqq xmm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(5, 0x7a): /* vcvttph2qq xmm/mem,[xyz]mm{k} */
+    case X86EMUL_OPC_EVEX_66(5, 0x7b): /* vcvtph2qq xmm/mem,[xyz]mm{k} */
+        host_and_vcpu_must_have(avx512_fp16);
+        generate_exception_if(evex.w, EXC_UD);
+        if ( ea.type != OP_REG || !evex.brs )
+            avx512_vlen_check(false);
+        op_bytes = 4 << (evex.w + evex.lr);
+        goto simd_zmm;
+
     case X86EMUL_OPC_EVEX   (5, 0x7c): /* vcvttph2uw [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX_66(5, 0x7c): /* vcvttph2w [xyz]mm/mem,[xyz]mm{k} */
     case X86EMUL_OPC_EVEX   (5, 0x7d): /* vcvtph2uw [xyz]mm/mem,[xyz]mm{k} */

From patchwork Wed Jun 15 10:32:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 12882091
Date: Wed, 15 Jun 2022 12:32:09 +0200
Subject: [PATCH 11/11] x86emul: AVX512-FP16 testing
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
References: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com>
In-Reply-To: <9ba3aaa7-56ce-b051-f1c4-874276e493e1@suse.com>

Naming of some of the builtins isn't fully consistent with that of
pre-existing ones, so there's a need for a new BR2() wrapper macro.

With the tests providing some proof of proper functioning of the
emulator code also enable use of the feature by guests, as there's no
other infrastructure involved in enabling this ISA extension.

Signed-off-by: Jan Beulich
---
SDE: -spr or -future
---
In the course of putting together the FMA part of the test I've noticed
that we no longer test scalar FMA insns (FMA, FMA4, AVX512F), due to
gcc no longer recognizing the pattern in version 9 or later.
See gcc bug 105965, which apparently has already gained a fix for
version 13. (Using intrinsics for scalar operations is prohibitive, as
they have full-vector parameters.) I'm taking this as one of several
reasons why here I'm not even trying to make the compiler spot the
complex FMA patterns, using a mixture of intrinsics and inline assembly
instead.

--- a/tools/tests/x86_emulator/Makefile
+++ b/tools/tests/x86_emulator/Makefile
@@ -16,7 +16,7 @@ vpath %.c $(XEN_ROOT)/xen/lib/x86
 
 CFLAGS += $(CFLAGS_xeninclude)
 
-SIMD := 3dnow sse sse2 sse4 avx avx2 xop avx512f avx512bw avx512dq avx512er avx512vbmi
+SIMD := 3dnow sse sse2 sse4 avx avx2 xop avx512f avx512bw avx512dq avx512er avx512vbmi avx512fp16
 FMA := fma4 fma
 SG := avx2-sg avx512f-sg avx512vl-sg
 AES := ssse3-aes avx-aes avx2-vaes avx512bw-vaes
@@ -91,6 +91,9 @@ avx512vbmi-vecs := $(avx512bw-vecs)
 avx512vbmi-ints := $(avx512bw-ints)
 avx512vbmi-flts := $(avx512bw-flts)
 avx512vbmi2-vecs := $(avx512bw-vecs)
+avx512fp16-vecs := $(avx512bw-vecs)
+avx512fp16-ints :=
+avx512fp16-flts := 2
 
 avx512f-opmask-vecs := 2
 avx512dq-opmask-vecs := 1 2
@@ -246,7 +249,7 @@ $(addsuffix .c,$(GF)):
 
 $(addsuffix .h,$(SIMD) $(FMA) $(SG) $(AES) $(CLMUL) $(SHA) $(GF)): simd.h
 
-xop.h avx512f.h: simd-fma.c
+xop.h avx512f.h avx512fp16.h: simd-fma.c
 
 endif # 32-bit override
 
--- a/tools/tests/x86_emulator/simd.c
+++ b/tools/tests/x86_emulator/simd.c
@@ -20,6 +20,14 @@ ENTRY(simd_test);
     asm ( "vcmpsd $0, %1, %2, %0" : "=k" (r_) : "m" (x_), "v" (y_) ); \
     r_ == 1; \
 })
+# elif VEC_SIZE == 2
+# define eq(x, y) ({ \
+    _Float16 x_ = (x)[0]; \
+    _Float16 __attribute__((vector_size(16))) y_ = { (y)[0] }; \
+    unsigned int r_; \
+    asm ( "vcmpsh $0, %1, %2, %0" : "=k" (r_) : "m" (x_), "v" (y_) ); \
+    r_ == 1; \
+})
 # elif FLOAT_SIZE == 4
 /*
  * gcc's (up to at least 8.2) __builtin_ia32_cmpps256_mask() has an anomaly in
@@ -31,6 +39,8 @@ ENTRY(simd_test);
 # define eq(x, y) ((BR(cmpps, _mask, x, y, 0, -1) & ALL_TRUE) == ALL_TRUE)
 # elif FLOAT_SIZE == 8
 # define eq(x, y) (BR(cmppd, _mask, x, y, 0, -1) == ALL_TRUE)
+# elif FLOAT_SIZE == 2
+# define eq(x, y) (B(cmpph, _mask, x, y, 0, -1) == ALL_TRUE)
 # elif (INT_SIZE == 1 || UINT_SIZE == 1) && defined(__AVX512BW__)
 # define eq(x, y) (B(pcmpeqb, _mask, (vqi_t)(x), (vqi_t)(y), -1) == ALL_TRUE)
 # elif (INT_SIZE == 2 || UINT_SIZE == 2) && defined(__AVX512BW__)
@@ -116,6 +126,14 @@ static inline bool _to_bool(byte_vec_t b
     asm ( "vcvtusi2sd%z1 %1, %0, %0" : "=v" (t_) : "m" (u_) ); \
     (vec_t){ t_[0] }; \
 })
+# elif FLOAT_SIZE == 2
+# define to_u_int(type, x) ({ \
+    unsigned type u_; \
+    _Float16 __attribute__((vector_size(16))) t_; \
+    asm ( "vcvtsh2usi %1, %0" : "=r" (u_) : "m" ((x)[0]) ); \
+    asm ( "vcvtusi2sh%z1 %1, %0, %0" : "=v" (t_) : "m" (u_) ); \
+    (vec_t){ t_[0] }; \
+})
 # endif
 # define to_uint(x) to_u_int(int, x)
 # ifdef __x86_64__
@@ -153,6 +171,43 @@ static inline bool _to_bool(byte_vec_t b
 # define to_wint(x) BR(cvtqq2pd, _mask, BR(cvtpd2qq, _mask, x, (vdi_t)undef(), ~0), undef(), ~0)
 # define to_uwint(x) BR(cvtuqq2pd, _mask, BR(cvtpd2uqq, _mask, x, (vdi_t)undef(), ~0), undef(), ~0)
 # endif
+# elif FLOAT_SIZE == 2
+# define to_int(x) BR2(vcvtw2ph, _mask, BR2(vcvtph2w, _mask, x, (vhi_t)undef(), ~0), undef(), ~0)
+# define to_uint(x) BR2(vcvtuw2ph, _mask, BR2(vcvtph2uw, _mask, x, (vhi_t)undef(), ~0), undef(), ~0)
+# if VEC_SIZE == 16
+# define low_half(x) (x)
+# define high_half(x) ((vec_t)B_(movhlps, , (vsf_t)undef(), (vsf_t)(x)))
+# define insert_half(x, y, p) ((vec_t)((p) ? B_(movlhps, , (vsf_t)(x), (vsf_t)(y)) \
+                                         : B_(shufps, , (vsf_t)(y), (vsf_t)(x), 0b11100100)))
+# elif VEC_SIZE == 32
+# define _half(x, lh) ((vhf_half_t)B(extracti32x4_, _mask, (vsi_t)(x), lh, (vsi_half_t){}, ~0))
+# define low_half(x) _half(x, 0)
+# define high_half(x) _half(x, 1)
+# define insert_half(x, y, p) \
+    ((vec_t)B(inserti32x4_, _mask, (vsi_t)(x), (vsi_half_t)(y), p, (vsi_t)undef(), ~0))
+# elif VEC_SIZE == 64
+# define _half(x, lh) \
+    ((vhf_half_t)__builtin_ia32_extracti64x4_mask((vdi_t)(x), lh, (vdi_half_t){}, ~0))
+# define low_half(x) _half(x, 0)
+# define high_half(x) _half(x, 1)
+# define insert_half(x, y, p) \
+    ((vec_t)__builtin_ia32_inserti64x4_mask((vdi_t)(x), (vdi_half_t)(y), p, (vdi_t)undef(), ~0))
+# endif
+# define to_w_int(x, s) ({ \
+    vhf_half_t t_ = low_half(x); \
+    vsi_t lo_, hi_; \
+    touch(t_); \
+    lo_ = BR2(vcvtph2 ## s ## dq, _mask, t_, (vsi_t)undef(), ~0); \
+    t_ = high_half(x); \
+    touch(t_); \
+    hi_ = BR2(vcvtph2 ## s ## dq, _mask, t_, (vsi_t)undef(), ~0); \
+    touch(lo_); touch(hi_); \
+    insert_half(insert_half(undef(), \
+                            BR2(vcvt ## s ## dq2ph, _mask, lo_, (vhf_half_t){}, ~0), 0), \
+                BR2(vcvt ## s ## dq2ph, _mask, hi_, (vhf_half_t){}, ~0), 1); \
+})
+# define to_wint(x) to_w_int(x, )
+# define to_uwint(x) to_w_int(x, u)
 # endif
 #elif VEC_SIZE == 16 && defined(__SSE2__)
 # if FLOAT_SIZE == 4
@@ -240,10 +295,18 @@ static inline vec_t movlhps(vec_t x, vec
 # define scale(x, y) scalar_2op(x, y, "vscalefsd %[in2], %[in1], %[out]")
 # define sqrt(x) scalar_1op(x, "vsqrtsd %[in], %[out], %[out]")
 # define trunc(x) scalar_1op(x, "vrndscalesd $0b1011, %[in], %[out], %[out]")
+# elif FLOAT_SIZE == 2
+# define getexp(x) scalar_1op(x, "vgetexpsh %[in], %[out], %[out]")
+# define getmant(x) scalar_1op(x, "vgetmantsh $0, %[in], %[out], %[out]")
+# define recip(x) scalar_1op(x, "vrcpsh %[in], %[out], %[out]")
+# define rsqrt(x) scalar_1op(x, "vrsqrtsh %[in], %[out], %[out]")
+# define scale(x, y) scalar_2op(x, y, "vscalefsh %[in2], %[in1], %[out]")
+# define sqrt(x) scalar_1op(x, "vsqrtsh %[in], %[out], %[out]")
+# define trunc(x) scalar_1op(x, "vrndscalesh $0b1011, %[in], %[out], %[out]")
 # endif
 #elif defined(FLOAT_SIZE) && defined(__AVX512F__) && \
       (VEC_SIZE == 64 || defined(__AVX512VL__))
-# if ELEM_COUNT == 8 /* vextractf{32,64}x4 */ || \
+# if (ELEM_COUNT == 8 && ELEM_SIZE >= 4) /* vextractf{32,64}x4 */ || \
      (ELEM_COUNT == 16 && ELEM_SIZE == 4 && defined(__AVX512DQ__)) /* vextractf32x8 */ || \
      (ELEM_COUNT == 4 && ELEM_SIZE == 8 && defined(__AVX512DQ__)) /* vextractf64x2 */
 # define _half(x, lh) ({ \
@@ -398,6 +461,21 @@ static inline vec_t movlhps(vec_t x, vec
     VEC_SIZE == 32 ? 0b01 : 0b00011011, undef(), ~0), \
     0b01010101, undef(), ~0)
 # endif
+# elif FLOAT_SIZE == 2
+# define frac(x) BR2(reduceph, _mask, x, 0b00001011, undef(), ~0)
+# define getexp(x) BR(getexpph, _mask, x, undef(), ~0)
+# define getmant(x) BR(getmantph, _mask, x, 0, undef(), ~0)
+# define max(x, y) BR2(maxph, _mask, x, y, undef(), ~0)
+# define min(x, y) BR2(minph, _mask, x, y, undef(), ~0)
+# define scale(x, y) BR2(scalefph, _mask, x, y, undef(), ~0)
+# define recip(x) B(rcpph, _mask, x, undef(), ~0)
+# define rsqrt(x) B(rsqrtph, _mask, x, undef(), ~0)
+# define shrink1(x) BR2(vcvtps2phx, _mask, (vsf_t)(x), (vhf_half_t){}, ~0)
+# define shrink2(x) BR2(vcvtpd2ph, _mask, (vdf_t)(x), (vhf_quarter_t){}, ~0)
+# define sqrt(x) BR2(sqrtph, _mask, x, undef(), ~0)
+# define trunc(x) BR2(rndscaleph, _mask, x, 0b1011, undef(), ~0)
+# define widen1(x) ((vec_t)BR2(vcvtph2psx, _mask, x, (vsf_t)undef(), ~0))
+# define widen2(x) ((vec_t)BR2(vcvtph2pd, _mask, x, (vdf_t)undef(), ~0))
 # endif
 #elif FLOAT_SIZE == 4 && defined(__SSE__)
 # if VEC_SIZE == 32 && defined(__AVX__)
@@ -920,6 +998,16 @@ static inline vec_t movlhps(vec_t x, vec
 # define dup_lo(x) B(movddup, _mask, x, undef(), ~0)
 # endif
 #endif
+#if FLOAT_SIZE == 2 && ELEM_COUNT > 1
+# define dup_hi(x) ((vec_t)B(pshufhw, _mask, \
+                            B(pshuflw, _mask, (vhi_t)(x), 0b11110101, \
+                              (vhi_t)undef(), ~0), \
+                            0b11110101, (vhi_t)undef(), ~0))
+# define dup_lo(x) ((vec_t)B(pshufhw, _mask, \
+                            B(pshuflw, _mask, (vhi_t)(x), 0b10100000, \
+                              (vhi_t)undef(), ~0), \
+                            0b10100000, (vhi_t)undef(), ~0))
+#endif
 #if VEC_SIZE == 16 && defined(__SSSE3__) && !defined(__AVX512VL__)
 # if INT_SIZE == 1
 # define abs(x) ((vec_t)__builtin_ia32_pabsb128((vqi_t)(x)))
--- a/tools/tests/x86_emulator/simd.h
+++ b/tools/tests/x86_emulator/simd.h
@@ -53,6 +53,9 @@ float
 # elif FLOAT_SIZE == 8
 # define MODE DF
 # define ELEM_SFX "d"
+# elif FLOAT_SIZE == 2
+# define MODE HF
+# define ELEM_SFX "h"
 # endif
 #endif
 #ifndef VEC_SIZE
@@ -67,7 +70,10 @@ typedef unsigned int __attribute__((mode
 /* Various builtins want plain char / int / long long vector types ... */
 typedef char __attribute__((vector_size(VEC_SIZE))) vqi_t;
 typedef short __attribute__((vector_size(VEC_SIZE))) vhi_t;
+#if VEC_SIZE >= 4
 typedef int __attribute__((vector_size(VEC_SIZE))) vsi_t;
+typedef float __attribute__((vector_size(VEC_SIZE))) vsf_t;
+#endif
 #if VEC_SIZE >= 8
 typedef long long __attribute__((vector_size(VEC_SIZE))) vdi_t;
 typedef double __attribute__((vector_size(VEC_SIZE))) vdf_t;
@@ -96,6 +102,9 @@ typedef char __attribute__((vector_size(
 typedef short __attribute__((vector_size(HALF_SIZE))) vhi_half_t;
 typedef int __attribute__((vector_size(HALF_SIZE))) vsi_half_t;
 typedef long long __attribute__((vector_size(HALF_SIZE))) vdi_half_t;
+#ifdef __AVX512FP16__
+typedef _Float16 __attribute__((vector_size(HALF_SIZE))) vhf_half_t;
+#endif
 typedef float __attribute__((vector_size(HALF_SIZE))) vsf_half_t;
 # endif
 
@@ -110,6 +119,9 @@ typedef char __attribute__((vector_size(
 typedef short __attribute__((vector_size(QUARTER_SIZE))) vhi_quarter_t;
 typedef int __attribute__((vector_size(QUARTER_SIZE))) vsi_quarter_t;
 typedef long long __attribute__((vector_size(QUARTER_SIZE))) vdi_quarter_t;
+#ifdef __AVX512FP16__
+typedef _Float16 __attribute__((vector_size(QUARTER_SIZE))) vhf_quarter_t;
+#endif
 # endif
 
 # if ELEM_COUNT >= 8
@@ -163,6 +175,7 @@ DECL_OCTET(half);
 #elif VEC_SIZE == 64
 # define B(n, s, a...) __builtin_ia32_ ## n ## 512 ## s(a)
 # define BR(n, s, a...) __builtin_ia32_ ## n ## 512 ## s(a, 4)
+# define BR2(n, s, a...) __builtin_ia32_ ## n ## 512 ## s ## _round(a, 4)
 #endif
 #ifndef B_
 # define B_ B
@@ -171,6 +184,9 @@ DECL_OCTET(half);
 # define BR B
 # define BR_ B_
 #endif
+#ifndef BR2
+# define BR2 BR
+#endif
 #ifndef BR_
 # define BR_ BR
 #endif
--- a/tools/tests/x86_emulator/simd-fma.c
+++ b/tools/tests/x86_emulator/simd-fma.c
@@ -28,6 +28,8 @@ ENTRY(fma_test);
 # define fmaddsub(x, y, z) BR(vfmaddsubps, _mask, x, y, z, ~0)
 # elif FLOAT_SIZE == 8
 # define fmaddsub(x, y, z) BR(vfmaddsubpd, _mask, x, y, z, ~0)
+# elif FLOAT_SIZE == 2
+# define fmaddsub(x, y, z) BR(vfmaddsubph, _mask, x, y, z, ~0)
 # endif
 #elif VEC_SIZE == 16
 # if FLOAT_SIZE == 4
@@ -70,6 +72,75 @@ ENTRY(fma_test);
 # endif
 #endif
 
+#ifdef __AVX512FP16__
+# define I (1.if16)
+# if VEC_SIZE > FLOAT_SIZE
+# define CELEM_COUNT (ELEM_COUNT / 2)
+static const unsigned int conj_mask = 0x80000000;
+# define conj(z) ({ \
+    vec_t r_; \
+    asm ( "vpxord %2%{1to%c3%}, %1, %0" \
+          : "=v" (r_) \
+          : "v" (z), "m" (conj_mask), "i" (CELEM_COUNT) ); \
+    r_; \
+})
+# define _cmul_vv(a, b, c) BR2(vf##c##mulcph, , a, b)
+# define _cmul_vs(a, b, c) ({ \
+    vec_t r_; \
+    _Complex _Float16 b_ = (b); \
+    asm ( "vf"#c"mulcph %2%{1to%c3%}, %1, %0" \
+          : "=v" (r_) \
+          : "v" (a), "m" (b_), "i" (CELEM_COUNT) ); \
+    r_; \
+})
+# define cmadd_vv(a, b, c) BR2(vfmaddcph, , a, b, c)
+# define cmadd_vs(a, b, c) ({ \
+    _Complex _Float16 b_ = (b); \
+    vec_t r_; \
+    asm ( "vfmaddcph %2%{1to%c3%}, %1, %0" \
+          : "=v" (r_) \
+          : "v" (a), "m" (b_), "i" (CELEM_COUNT), "0" (c) ); \
+    r_; \
+})
+# else
+# define CELEM_COUNT 1
+typedef _Float16 __attribute__((vector_size(4))) cvec_t;
+# define conj(z) ({ \
+    cvec_t r_; \
+    asm ( "xor $0x80000000, %0" : "=rm" (r_) : "0" (z) ); \
+    r_; \
+})
+# define _cmul_vv(a, b, c) ({ \
+    cvec_t r_; \
+    /* "=&x" to force destination to be different from both sources */ \
+    asm ( "vf"#c"mulcsh %2, %1, %0" : "=&x" (r_) : "x" (a), "m" (b) ); \
+    r_; \
+})
+# define _cmul_vs(a, b, c) ({ \
+    _Complex _Float16 b_ = (b); \
+    cvec_t r_; \
+    /* "=&x" to force destination to be different from both sources */ \
+    asm ( "vf"#c"mulcsh %2, %1, %0" : "=&x" (r_) : "x" (a), "m" (b_) ); \
+    r_; \
+})
+# define cmadd_vv(a, b, c) ({ \
+    cvec_t r_ = (c); \
+    asm ( "vfmaddcsh %2, %1, %0" : "+x" (r_) : "x" (a), "m" (b) ); \
+    r_; \
+})
+# define cmadd_vs(a, b, c) ({ \
+    _Complex _Float16 b_ = (b); \
+    cvec_t r_ = (c); \
+    asm ( "vfmaddcsh %2, %1, %0" : "+x" (r_) : "x" (a), "m" (b_) ); \
+    r_; \
+})
+# endif
+# define cmul_vv(a, b) _cmul_vv(a, b, )
+# define cmulc_vv(a, b) _cmul_vv(a, b, c)
+# define cmul_vs(a, b) _cmul_vs(a, b, )
+# define cmulc_vs(a, b) _cmul_vs(a, b, c)
+#endif
+
 int fma_test(void)
 {
     unsigned int i;
@@ -156,5 +227,99 @@ int fma_test(void)
     touch(inv);
 #endif
 
+#ifdef CELEM_COUNT
+
+# if VEC_SIZE > FLOAT_SIZE
+# define cvec_t vec_t
+# define ceq eq
+# else
+    {
+        /* Cannot re-use the function-scope variables (for being too small). */
+        cvec_t x, y, z, src = { 1, 2 }, inv = { 2, 1 }, one = { 1, 1 };
+# define ceq(x, y) ({ \
+    unsigned int r_; \
+    asm ( "vcmpph $0, %1, %2, %0" : "=k" (r_) : "x" (x), "x" (y) ); \
+    (r_ & 3) == 3; \
+})
+# endif
+
+    /* (a * i)² == -a² */
+    x = cmul_vs(src, I);
+    y = cmul_vv(x, x);
+    x = -src;
+    touch(src);
+    z = cmul_vv(x, src);
+    if ( !ceq(y, z) ) return __LINE__;
+
+    /* conj(a * b) == conj(a) * conj(b) */
+    touch(src);
+    x = conj(src);
+    touch(inv);
+    y = cmulc_vv(x, inv);
+    touch(src);
+    touch(inv);
+    z = conj(cmul_vv(src, inv));
+    if ( !ceq(y, z) ) return __LINE__;
+
+    /* a * conj(a) == |a|² */
+    touch(src);
+    y = src;
+    touch(src);
+    x = cmulc_vv(y, src);
+    y *= y;
+    for ( i = 0; i < ELEM_COUNT; i += 2 )
+    {
+        if ( x[i] != y[i] + y[i + 1] ) return __LINE__;
+        if ( x[i + 1] ) return __LINE__;
+    }
+
+    /* a * b == b * a + 0 */
+    touch(src);
+    touch(inv);
+    x = cmul_vv(src, inv);
+    touch(src);
+    touch(inv);
+    y = cmadd_vv(inv, src, (cvec_t){});
+    if ( !ceq(x, y) ) return __LINE__;
+
+    /* a * 1 + b == b * 1 + a */
+    touch(src);
+    touch(inv);
+    x = cmadd_vs(src, 1, inv);
+    for ( i = 0; i < ELEM_COUNT; i += 2 )
+    {
+        z[i] = 1;
+        z[i + 1] = 0;
+    }
+    touch(z);
+    y = cmadd_vv(inv, z, src);
+    if ( !ceq(x, y) ) return __LINE__;
+
+    /* (a + b) * c == a * c + b * c */
+    touch(one);
+    touch(inv);
+    x = cmul_vv(src + one, inv);
+    touch(inv);
+    y = cmul_vv(one, inv);
+    touch(inv);
+    z = cmadd_vv(src, inv, y);
+    if ( !ceq(x, z) ) return __LINE__;
+
+    /* a * i + conj(a) == (Re(a) - Im(a)) * (1 + i) */
+    x = cmadd_vs(src, I, conj(src));
+    for ( i = 0; i < ELEM_COUNT; i += 2 )
+    {
+        typeof(x[0]) val = src[i] - src[i + 1];
+
+        if ( x[i] != val ) return __LINE__;
+        if ( x[i + 1] != val ) return __LINE__;
+    }
+
+# if VEC_SIZE == FLOAT_SIZE
+    }
+# endif
+
+#endif /* CELEM_COUNT */
+
     return 0;
 }
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -43,6 +43,7 @@ asm ( ".pushsection .test, \"ax\", @prog
 #include "avx512er.h"
 #include "avx512vbmi.h"
 #include "avx512vbmi2-vpclmulqdq.h"
+#include "avx512fp16.h"
 
 #define verbose false /* Switch to true for far more logging. */
 
@@ -249,6 +250,16 @@ static bool simd_check_avx512bw_gf_vl(vo
     return cpu_has_gfni && cpu_has_avx512vl;
 }
 
+static bool simd_check_avx512fp16(void)
+{
+    return cpu_has_avx512_fp16;
+}
+
+static bool simd_check_avx512fp16_vl(void)
+{
+    return cpu_has_avx512_fp16 && cpu_has_avx512vl;
+}
+
 static void simd_set_regs(struct cpu_user_regs *regs)
 {
     if ( cpu_has_mmx )
@@ -510,6 +521,10 @@ static const struct {
     AVX512VL(_VBMI+VL u16x8, avx512vbmi, 16u2),
     AVX512VL(_VBMI+VL s16x16, avx512vbmi, 32i2),
     AVX512VL(_VBMI+VL u16x16, avx512vbmi, 32u2),
+    SIMD(AVX512_FP16 f16 scal,avx512fp16, f2),
+    SIMD(AVX512_FP16 f16x32, avx512fp16, 64f2),
+    AVX512VL(_FP16+VL f16x8, avx512fp16, 16f2),
+    AVX512VL(_FP16+VL f16x16,avx512fp16, 32f2),
     SIMD(SHA, sse4_sha, 16),
     SIMD(AVX+SHA, avx_sha, 16),
     AVX512VL(VL+SHA, avx512f_sha, 16),
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -281,7 +281,7 @@ XEN_CPUFEATURE(TSX_FORCE_ABORT, 9*32+13)
 XEN_CPUFEATURE(SERIALIZE, 9*32+14) /*A SERIALIZE insn */
 XEN_CPUFEATURE(TSXLDTRK, 9*32+16) /*a TSX load tracking suspend/resume insns */
 XEN_CPUFEATURE(CET_IBT, 9*32+20) /* CET - Indirect Branch Tracking */
-XEN_CPUFEATURE(AVX512_FP16, 9*32+23) /* AVX512 FP16 instructions */
+XEN_CPUFEATURE(AVX512_FP16, 9*32+23) /*A AVX512 FP16 instructions */
 XEN_CPUFEATURE(IBRSB, 9*32+26) /*A IBRS and IBPB support (used by Intel) */
 XEN_CPUFEATURE(STIBP, 9*32+27) /*A STIBP */
 XEN_CPUFEATURE(L1D_FLUSH, 9*32+28) /*S MSR_FLUSH_CMD and L1D flush. */
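
The complex-arithmetic identities which the simd-fma.c additions check
can be sanity-checked without AVX512-FP16 hardware or SDE. What follows
is only a minimal sketch under stated assumptions: it uses pairs of
plain float in place of _Float16, and ordinary C arithmetic in place of
vf[c]mulc{p,s}h / vfmaddc{p,s}h. struct cplx, the c_*() helpers,
check_identities() and run_checks() are hypothetical names that are not
part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one (real, imaginary) element pair. */
struct cplx { float re, im; };

static struct cplx c_conj(struct cplx z)
{
    return (struct cplx){ z.re, -z.im };
}

/* a * b */
static struct cplx c_mul(struct cplx a, struct cplx b)
{
    return (struct cplx){ a.re * b.re - a.im * b.im,
                          a.re * b.im + a.im * b.re };
}

/* a * conj(b), the role cmulc_vv() plays in the test */
static struct cplx c_mulc(struct cplx a, struct cplx b)
{
    return c_mul(a, c_conj(b));
}

/* a * b + c, the role cmadd_vv() plays in the test */
static struct cplx c_madd(struct cplx a, struct cplx b, struct cplx c)
{
    struct cplx p = c_mul(a, b);

    return (struct cplx){ p.re + c.re, p.im + c.im };
}

static bool c_eq(struct cplx a, struct cplx b)
{
    return a.re == b.re && a.im == b.im;
}

/* Returns 0 if all identities hold, else the number of the first failure. */
static int check_identities(struct cplx a, struct cplx b)
{
    const struct cplx i = { 0, 1 }, one = { 1, 1 };
    const struct cplx unit = { 1, 0 }, zero = { 0, 0 };
    struct cplx x;

    /* (a * i)^2 == -a^2 */
    x = c_mul(a, i);
    if ( !c_eq(c_mul(x, x), c_mul((struct cplx){ -a.re, -a.im }, a)) )
        return 1;

    /* conj(a * b) == conj(a) * conj(b) */
    if ( !c_eq(c_mulc(c_conj(a), b), c_conj(c_mul(a, b))) )
        return 2;

    /* a * conj(a) == |a|^2 */
    x = c_mulc(a, a);
    if ( x.re != a.re * a.re + a.im * a.im || x.im != 0 )
        return 3;

    /* a * b == b * a + 0 */
    if ( !c_eq(c_mul(a, b), c_madd(b, a, zero)) )
        return 4;

    /* a * 1 + b == b * 1 + a */
    if ( !c_eq(c_madd(a, unit, b), c_madd(b, unit, a)) )
        return 5;

    /* (a + (1 + i)) * b == a * b + (1 + i) * b */
    x = c_mul((struct cplx){ a.re + one.re, a.im + one.im }, b);
    if ( !c_eq(x, c_madd(a, b, c_mul(one, b))) )
        return 6;

    /* a * i + conj(a) == (Re(a) - Im(a)) * (1 + i) */
    x = c_madd(a, i, c_conj(a));
    if ( x.re != a.re - a.im || x.im != a.re - a.im )
        return 7;

    return 0;
}

/* Same element values as src = { 1, 2 } and inv = { 2, 1 } in the test. */
static void run_checks(void)
{
    struct cplx src = { 1, 2 }, inv = { 2, 1 };

    assert(check_identities(src, inv) == 0);
}
```

The comparisons are exact, which is safe here only because the inputs
are small integers; arbitrary inputs could differ by rounding between
the two sides of an identity.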