From patchwork Fri Nov 18 05:48:45 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: He Chen
X-Patchwork-Id: 9435785
From: He Chen
To: xen-devel@lists.xen.org
Date: Fri, 18 Nov 2016 13:48:45 +0800
Message-Id: <1479448125-22718-1-git-send-email-he.chen@linux.intel.com>
Cc: Wei Liu, Ian Jackson, Luwei Kang, Jan Beulich, Andrew Cooper
Subject: [Xen-devel] [PATCH] x86/cpuid: Add AVX512_4VNNIW and AVX512_4FMAPS support
List-Id: Xen developer discussion

Add support for two new AVX512 subfeatures for guests.

AVX512_4VNNIW: Vector instructions for deep learning enhanced word
variable precision.
AVX512_4FMAPS: Vector instructions for deep learning floating-point
single precision.

Signed-off-by: Luwei Kang
Signed-off-by: He Chen
Reviewed-by: Jan Beulich
---
 tools/libxc/xc_cpuid_x86.c                  | 8 ++++++--
 xen/arch/x86/cpu/common.c                   | 2 +-
 xen/arch/x86/cpuid.c                        | 2 +-
 xen/arch/x86/hvm/hvm.c                      | 1 +
 xen/arch/x86/traps.c                        | 5 +++--
 xen/include/asm-x86/cpuid.h                 | 1 +
 xen/include/public/arch-x86/cpufeatureset.h | 4 ++++
 xen/tools/gen-cpuid.py                      | 2 +-
 8 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 2ad9aeb..e9e3691 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -547,13 +547,15 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
         {
             regs[1] = info->featureset[featureword_of(X86_FEATURE_FSGSBASE)];
             regs[2] = info->featureset[featureword_of(X86_FEATURE_PREFETCHWT1)];
+            regs[3] = info->featureset[featureword_of(X86_FEATURE_AVX512_4VNNIW)];
         }
         else
         {
             regs[1] = 0;
             regs[2] = 0;
+            regs[3] = 0;
         }
-        regs[0] = regs[3] = 0;
+        regs[0] = 0;
         break;
 
     case 0x0000000d:
@@ -638,13 +640,15 @@ static void xc_cpuid_pv_policy(xc_interface *xch,
         {
             regs[1] = info->featureset[featureword_of(X86_FEATURE_FSGSBASE)];
             regs[2] = info->featureset[featureword_of(X86_FEATURE_PREFETCHWT1)];
+            regs[3] = info->featureset[featureword_of(X86_FEATURE_AVX512_4VNNIW)];
         }
         else
         {
             regs[1] = 0;
             regs[2] = 0;
+            regs[3] = 0;
         }
-        regs[0] = regs[3] = 0;
+        regs[0] = 0;
         break;
 
     case 0x0000000d:
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 3475198..aaaa873 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -325,7 +325,7 @@ static void generic_identify(struct cpuinfo_x86 *c)
 		cpuid_count(0x00000007, 0, &tmp,
 			    &c->x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)],
 			    &c->x86_capability[cpufeat_word(X86_FEATURE_PKU)],
-			    &tmp);
+			    &c->x86_capability[cpufeat_word(X86_FEATURE_AVX512_4VNNIW)]);
 	}
 
 	/*
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 63b2db9..3e85a63 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -78,7 +78,7 @@ static void __init calculate_raw_featureset(void)
         cpuid_count(0x7, 0, &tmp,
                     &raw_featureset[FEATURESET_7b0],
                     &raw_featureset[FEATURESET_7c0],
-                    &tmp);
+                    &raw_featureset[FEATURESET_7d0]);
     if ( max >= 0xd )
         cpuid_count(0xd, 1,
                     &raw_featureset[FEATURESET_Da1],
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 704fd64..752e5fb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3503,6 +3503,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                      special_features[FEATURESET_7b0]);
 
             *ecx &= hvm_featureset[FEATURESET_7c0];
+            *edx &= hvm_featureset[FEATURESET_7d0];
 
             /* Don't expose HAP-only features to non-hap guests. */
             if ( !hap_enabled(d) )
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 14abb62..2469e49 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1128,6 +1128,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
                   special_features[FEATURESET_7b0]);
 
             c &= pv_featureset[FEATURESET_7c0];
+            d &= pv_featureset[FEATURESET_7d0];
 
             if ( !is_pvh_domain(currd) )
             {
@@ -1142,8 +1143,8 @@ void pv_cpuid(struct cpu_user_regs *regs)
             }
         }
         else
-            b = c = 0;
-        a = d = 0;
+            b = c = d = 0;
+        a = 0;
         break;
 
     case XSTATE_CPUID:
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 2372474..ec8bbb5 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -17,6 +17,7 @@
 #define FEATURESET_7c0    6 /* 0x00000007:0.ecx      */
 #define FEATURESET_e7d    7 /* 0x80000007.edx        */
 #define FEATURESET_e8b    8 /* 0x80000008.ebx        */
+#define FEATURESET_7d0    9 /* 0x00000007:0.edx      */
 
 #ifndef __ASSEMBLY__
 #include
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 9320c9e..565ccd5 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -234,6 +234,10 @@ XEN_CPUFEATURE(EFRO,          7*32+10) /*   APERF/MPERF Read Only interface */
 /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */
 XEN_CPUFEATURE(CLZERO,        8*32+ 0) /*A  CLZERO instruction */
 
+/* Intel-defined CPU features, CPUID level 0x00000007:0.edx, word 9 */
+XEN_CPUFEATURE(AVX512_4VNNIW, 9*32+ 2) /*A  AVX512 Neural Network Instructions */
+XEN_CPUFEATURE(AVX512_4FMAPS, 9*32+ 3) /*A  AVX512 Multiply Accumulation Single Precision */
+
 #endif /* XEN_CPUFEATURE */
 
 /* Clean up from a default include.  Close the enum (for C). */
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index 005cad9..c29f1d3 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -253,7 +253,7 @@ def crunch_numbers(state):
         # 512bit registers, and the instructions themselves. All further AVX512 features
         # are built on top of AVX512F
         AVX512F: [AVX512DQ, AVX512IFMA, AVX512PF, AVX512ER, AVX512CD,
-                  AVX512BW, AVX512VL, AVX512VBMI],
+                  AVX512BW, AVX512VL, AVX512VBMI, AVX512_4VNNIW, AVX512_4FMAPS],
     }
 
     deep_features = tuple(sorted(deps.keys()))
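
Illustrative note, not part of the patch: the sketch below shows two ideas
the diff relies on. The first is the "word*32 + bit" feature numbering used
in cpufeatureset.h, where 9*32+2 is bit 2 of featureset word 9. The second
is the dependency entry in gen-cpuid.py, where the new features hang off
AVX512F. The helper names (word_and_bit, has_feature, features_to_clear)
are hypothetical and are not Xen functions. The traversal is a plain
reachability walk chosen for illustration, and is not claimed to match how
crunch_numbers() computes its result.

```python
# Illustrative sketch only: every helper name here is hypothetical, not Xen API.

# Feature numbers follow the "word*32 + bit" encoding from cpufeatureset.h.
AVX512_4VNNIW = 9 * 32 + 2
AVX512_4FMAPS = 9 * 32 + 3


def word_and_bit(feature):
    """Split a feature number into (featureset word index, bit position)."""
    return feature // 32, feature % 32


def has_feature(featureset, feature):
    """Return True if the feature's bit is set in a list of 32-bit words."""
    word, bit = word_and_bit(feature)
    return word < len(featureset) and bool((featureset[word] >> bit) & 1)


def features_to_clear(deps, removed):
    """Return `removed` plus everything that transitively depends on it.

    `deps` maps a feature to the features built directly on top of it,
    the same shape as the AVX512F entry touched by the patch.
    """
    seen = set()
    stack = [removed]
    while stack:
        feat = stack.pop()
        if feat in seen:
            continue
        seen.add(feat)
        stack.extend(deps.get(feat, []))
    return seen
```

For example, with a ten-word featureset whose word 9 holds only bit 2,
has_feature() reports AVX512_4VNNIW as present and AVX512_4FMAPS as absent.
With deps = {"AVX512F": ["AVX512_4VNNIW", "AVX512_4FMAPS"]}, removing
"AVX512F" yields all three names, which is the effect the gen-cpuid.py hunk
is after: hiding AVX512F from a guest should hide both new subfeatures too.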