From patchwork Mon Nov 21 06:01:14 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: He Chen
X-Patchwork-Id: 9438791
From: He Chen
To: xen-devel@lists.xen.org
Date: Mon, 21 Nov 2016 14:01:14 +0800
Message-Id: <1479708074-17958-1-git-send-email-he.chen@linux.intel.com>
X-Mailer: git-send-email 2.7.4
Cc: Wei Liu, Ian Jackson, Luwei Kang, Jan Beulich, Andrew Cooper
Subject: [Xen-devel] [PATCH v2] x86/cpuid: Add AVX512_4VNNIW and AVX512_4FMAPS support

Add support for two new AVX512 subfeatures for guests.

AVX512_4VNNIW:
Vector instructions for deep learning enhanced word variable precision.

AVX512_4FMAPS:
Vector instructions for deep learning floating-point single precision.

Signed-off-by: Luwei Kang
Signed-off-by: He Chen
Reviewed-by: Jan Beulich
Acked-by: Wei Liu
---
Changes from v1:
 Add new leaf in xen-cpuid.c
---
 tools/libxc/xc_cpuid_x86.c                  |  8 ++++++--
 tools/misc/xen-cpuid.c                      | 10 ++++++++++
 xen/arch/x86/cpu/common.c                   |  2 +-
 xen/arch/x86/cpuid.c                        |  2 +-
 xen/arch/x86/hvm/hvm.c                      |  1 +
 xen/arch/x86/traps.c                        |  5 +++--
 xen/include/asm-x86/cpuid.h                 |  1 +
 xen/include/public/arch-x86/cpufeatureset.h |  4 ++++
 xen/tools/gen-cpuid.py                      |  2 +-
 9 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 2ad9aeb..e9e3691 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -547,13 +547,15 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
         {
             regs[1] = info->featureset[featureword_of(X86_FEATURE_FSGSBASE)];
             regs[2] = info->featureset[featureword_of(X86_FEATURE_PREFETCHWT1)];
+            regs[3] = info->featureset[featureword_of(X86_FEATURE_AVX512_4VNNIW)];
         }
         else
         {
             regs[1] = 0;
             regs[2] = 0;
+            regs[3] = 0;
         }
-        regs[0] = regs[3] = 0;
+        regs[0] = 0;
         break;

     case 0x0000000d:
@@ -638,13 +640,15 @@ static void xc_cpuid_pv_policy(xc_interface *xch,
         {
             regs[1] = info->featureset[featureword_of(X86_FEATURE_FSGSBASE)];
             regs[2] = info->featureset[featureword_of(X86_FEATURE_PREFETCHWT1)];
+            regs[3] = info->featureset[featureword_of(X86_FEATURE_AVX512_4VNNIW)];
         }
         else
         {
             regs[1] = 0;
             regs[2] = 0;
+            regs[3] = 0;
         }
-        regs[0] = regs[3] = 0;
+        regs[0] = 0;
         break;

     case 0x0000000d:
diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index 44991f6..5d66e94 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -143,6 +143,15 @@ static const char *str_e8b[32] =
     [1 ... 31] = "REZ",
 };

+static const char *str_7d0[32] =
+{
+    [0 ... 1] = "REZ",
+
+    [ 2] = "avx512_4vnniw", [ 3] = "avx512_4fmaps",
+
+    [4 ... 31] = "REZ",
+};
+
 static struct {
     const char *name;
     const char *abbr;
@@ -158,6 +167,7 @@ static struct {
     { "0x00000007:0.ecx", "7c0", str_7c0 },
     { "0x80000007.edx", "e7d", str_e7d },
     { "0x80000008.ebx", "e8b", str_e8b },
+    { "0x00000007:0.edx", "7d0", str_7d0 },
 };

 #define COL_ALIGN "18"
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 3475198..aaaa873 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -325,7 +325,7 @@ static void generic_identify(struct cpuinfo_x86 *c)
         cpuid_count(0x00000007, 0, &tmp,
                     &c->x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)],
                     &c->x86_capability[cpufeat_word(X86_FEATURE_PKU)],
-                    &tmp);
+                    &c->x86_capability[cpufeat_word(X86_FEATURE_AVX512_4VNNIW)]);
     }

     /*
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 63b2db9..3e85a63 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -78,7 +78,7 @@ static void __init calculate_raw_featureset(void)
         cpuid_count(0x7, 0, &tmp,
                     &raw_featureset[FEATURESET_7b0],
                     &raw_featureset[FEATURESET_7c0],
-                    &tmp);
+                    &raw_featureset[FEATURESET_7d0]);
     if ( max >= 0xd )
         cpuid_count(0xd, 1,
                     &raw_featureset[FEATURESET_Da1],
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 704fd64..752e5fb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3503,6 +3503,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                      special_features[FEATURESET_7b0]);

             *ecx &= hvm_featureset[FEATURESET_7c0];
+            *edx &= hvm_featureset[FEATURESET_7d0];

             /* Don't expose HAP-only features to non-hap guests. */
             if ( !hap_enabled(d) )
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index d56d76e..01ac1b1 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1133,6 +1133,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
                   special_features[FEATURESET_7b0]);

             c &= pv_featureset[FEATURESET_7c0];
+            d &= pv_featureset[FEATURESET_7d0];

             if ( !is_pvh_domain(currd) )
             {
@@ -1147,8 +1148,8 @@ void pv_cpuid(struct cpu_user_regs *regs)
             }
         }
         else
-            b = c = 0;
-        a = d = 0;
+            b = c = d = 0;
+        a = 0;
         break;

     case XSTATE_CPUID:
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 2372474..ec8bbb5 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -17,6 +17,7 @@
 #define FEATURESET_7c0 6 /* 0x00000007:0.ecx */
 #define FEATURESET_e7d 7 /* 0x80000007.edx */
 #define FEATURESET_e8b 8 /* 0x80000008.ebx */
+#define FEATURESET_7d0 9 /* 0x00000007:0.edx */

 #ifndef __ASSEMBLY__
 #include
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 9320c9e..565ccd5 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -234,6 +234,10 @@ XEN_CPUFEATURE(EFRO, 7*32+10) /* APERF/MPERF Read Only interface */
 /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */
 XEN_CPUFEATURE(CLZERO, 8*32+ 0) /*A CLZERO instruction */

+/* Intel-defined CPU features, CPUID level 0x00000007:0.edx, word 9 */
+XEN_CPUFEATURE(AVX512_4VNNIW, 9*32+ 2) /*A AVX512 Neural Network Instructions */
+XEN_CPUFEATURE(AVX512_4FMAPS, 9*32+ 3) /*A AVX512 Multiply Accumulation Single Precision */
+
 #endif /* XEN_CPUFEATURE */

 /* Clean up from a default include. Close the enum (for C). */
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index 005cad9..c29f1d3 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -253,7 +253,7 @@ def crunch_numbers(state):
         # 512bit registers, and the instructions themselves. All further AVX512 features
         # are built on top of AVX512F
         AVX512F: [AVX512DQ, AVX512IFMA, AVX512PF, AVX512ER, AVX512CD,
-                  AVX512BW, AVX512VL, AVX512VBMI],
+                  AVX512BW, AVX512VL, AVX512VBMI, AVX512_4VNNIW, AVX512_4FMAPS],
     }

     deep_features = tuple(sorted(deps.keys()))
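
Reviewer note (not part of the patch): the feature numbering and the AVX512F dependency touched by the hunks above can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the helper names word_and_bit and apply_deps are hypothetical, features are plain strings rather than the generated constants, the table is cut down to the single AVX512F entry, and it is read as "each listed feature requires AVX512F", which is what the comment in gen-cpuid.py suggests. It is not how gen-cpuid.py itself is implemented.

```python
# Illustrative sketch only; not code from the patch or from gen-cpuid.py.

# Feature numbers follow the XEN_CPUFEATURE(name, word*32 + bit) convention
# used in cpufeatureset.h.
FEATURES = {
    "AVX512_4VNNIW": 9 * 32 + 2,
    "AVX512_4FMAPS": 9 * 32 + 3,
}


def word_and_bit(value):
    """Split a feature number into (featureset word, bit within that word)."""
    return value >> 5, value & 31


# Hypothetical, reduced stand-in for the dependency table: every feature in
# the list is assumed to require the key feature.
DEPS = {
    "AVX512F": ["AVX512DQ", "AVX512IFMA", "AVX512PF", "AVX512ER", "AVX512CD",
                "AVX512BW", "AVX512VL", "AVX512VBMI",
                "AVX512_4VNNIW", "AVX512_4FMAPS"],
}


def apply_deps(enabled, deps=DEPS):
    """Return a copy of `enabled` without features whose prerequisite is absent."""
    result = set(enabled)
    changed = True
    while changed:  # repeat so that chained dependencies are also cleared
        changed = False
        for parent, children in deps.items():
            if parent not in result:
                dropped = result.intersection(children)
                if dropped:
                    result -= dropped
                    changed = True
    return result


if __name__ == "__main__":
    for name, value in sorted(FEATURES.items()):
        word, bit = word_and_bit(value)
        print(f"{name}: featureset word {word}, bit {bit}")
    print(sorted(apply_deps({"AVX512_4VNNIW", "AVX512BW"})))
```

Under these assumptions both new features land in word 9 (FEATURESET_7d0, CPUID 0x00000007:0.edx) at bits 2 and 3, matching the str_7d0 table in xen-cpuid.c, and a feature set that lacks AVX512F loses both of them.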