From patchwork Tue Mar  8 07:19:24 2016
X-Patchwork-Submitter: Shuai Ruan <shuai.ruan@linux.intel.com>
X-Patchwork-Id: 8530111
From: Shuai Ruan <shuai.ruan@linux.intel.com>
To: xen-devel@lists.xen.org
Cc: andrew.cooper3@citrix.com, keir@xen.org, jbeulich@suse.com
Date: Tue, 8 Mar 2016 15:19:24 +0800
Message-Id: <1457421564-17288-1-git-send-email-shuai.ruan@linux.intel.com>
Subject: [Xen-devel] [V3] x86/xsaves: fix overwriting between non-lazy/lazy xsaves

The offsets at which components are saved by xsave[sc] are not fixed.
So when a save with v->fpu_dirtied set is followed by one with
v->fpu_dirtied clear, the non-lazy xsave[sc] may overwrite data written
by the lazy one.
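
To illustrate the problem (an editorial sketch, not part of the patch):
in the compacted format used by xsave[sc], a component's offset in the
extended region depends on which lower-numbered components are included
in the save mask. The component sizes and masks below are made up for
the example; real sizes come from CPUID leaf 0xD.

/* Sketch: compacted-format offsets shift with the save mask. */
#include <stdint.h>
#include <stdio.h>

#define NCOMP 5
/* comp_size[0] stands in for the fixed legacy region (512 bytes) plus
 * the xsave header (64 bytes); the rest are hypothetical sizes. */
static const unsigned int comp_size[NCOMP] = { 512 + 64, 256, 256, 64, 64 };

/* Offset of component 'idx' for save mask 'mask': compacted components
 * are packed in index order, skipping components absent from the mask. */
static unsigned int compacted_offset(uint64_t mask, unsigned int idx)
{
    unsigned int off = comp_size[0];
    unsigned int i;

    for ( i = 1; i < idx; i++ )
        if ( mask & (1ull << i) )
            off += comp_size[i];
    return off;
}

int main(void)
{
    uint64_t full = 0x1f;    /* lazy + non-lazy components (made up) */
    uint64_t nonlazy = 0x18; /* non-lazy components only (made up) */

    /* Component 3 lands at a lower offset when the lazy components are
     * omitted, so a non-lazy-only save overwrites lazy data that an
     * earlier full save placed at that offset. */
    printf("comp 3 offset, full mask:     %u\n", compacted_offset(full, 3));
    printf("comp 3 offset, non-lazy mask: %u\n", compacted_offset(nonlazy, 3));
    return 0;
}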
The solution is that, when xsave[sc] is enabled, xcr0_accum is taken
into consideration: if the guest has ever used XSTATE_LAZY &
~XSTATE_FP_SSE, vcpu_xsave_mask will return XSTATE_ALL; otherwise it
will return XSTATE_NONLAZY. (XSTATE_FP_SSE is excluded because xsave
writes the XSTATE_FP_SSE part into the legacy region of the xsave area,
whose layout is fixed, so saving XSTATE_FP_SSE cannot cause the
overwriting problem.) This may cause redundant saving of lazy states,
which has a performance impact.

As the xsavec support code will be cleaned up (for the reason given
below), this patch only takes xsaves into consideration.

After doing some performance tests on xsavec and xsaveopt (suggested by
Jan), the results show that xsaveopt performs better than xsavec. This
patch therefore cleans up the xsavec support code in Xen. Also, xsaves
will be disabled for now (it will be used once supervisor states are
introduced). This patch does not change xc_cpuid_config_xsave in
tools/libxc/xc_cpuid_x86.c, but adds a check in hvm_cpuid (see the
reference sketch after the diff). The next time xsaves is needed,
adding some code in xstate_init will be enough.

Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
Reported-by: Jan Beulich <jbeulich@suse.com>
---
v3:
1. Add xsavec clean-up code and disable xsaves.
2. Add a comment on why a certain mask should be returned in
   vcpu_xsave_mask.

v2: Add the performance impact and next steps to the description.

 xen/arch/x86/domctl.c  |  2 +-
 xen/arch/x86/hvm/hvm.c |  3 +++
 xen/arch/x86/i387.c    | 14 +++++++++++++-
 xen/arch/x86/xstate.c  | 16 ++++++----------
 4 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 55aecdc..d71541c 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -922,7 +922,7 @@ long arch_do_domctl(
                 ret = -EFAULT;
 
             offset += sizeof(v->arch.xcr0_accum);
-            if ( !ret && (cpu_has_xsaves || cpu_has_xsavec) )
+            if ( !ret && cpu_has_xsaves )
             {
                 void *xsave_area;
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 5bc2812..8264ff8 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4762,7 +4762,10 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                     *ebx += xstate_sizes[sub_leaf];
             }
             else
+            {
+                *eax &= ~(cpufeat_mask(X86_FEATURE_XSAVES) | cpufeat_mask(X86_FEATURE_XSAVEC));
                 *ebx = *ecx = *edx = 0;
+            }
         }
         break;
 
diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index c29d0fa..651b6b8 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -118,7 +118,19 @@ static inline uint64_t vcpu_xsave_mask(const struct vcpu *v)
     if ( v->fpu_dirtied )
         return v->arch.nonlazy_xstate_used ? XSTATE_ALL : XSTATE_LAZY;
 
-    return v->arch.nonlazy_xstate_used ? XSTATE_NONLAZY : 0;
+    /*
+     * The offsets of components in the extended region of the xsave area
+     * saved by xsaves are not fixed. This may cause the xsave area to be
+     * overwritten when a save with v->fpu_dirtied set is followed by one
+     * with v->fpu_dirtied clear.
+     * The way to solve this problem is to take xcr0_accum into
+     * consideration: if the guest has ever used lazy states (excluding
+     * XSTATE_FP_SSE), vcpu_xsave_mask will return XSTATE_ALL; otherwise
+     * it returns XSTATE_NONLAZY.
+     * XSTATE_FP_SSE is excluded because its offsets (in the legacy region
+     * of the xsave area) are fixed, so saving XSTATE_FP_SSE with xsaves
+     * will not cause the overwriting problem.
+     */
+    return cpu_has_xsaves && (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE) ?
+           XSTATE_ALL : XSTATE_NONLAZY;
 }
 
 /* Save x87 extended state */
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 8316bd9..c57f3a4 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -165,7 +165,7 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
     u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
     u64 valid;
 
-    if ( !cpu_has_xsaves && !cpu_has_xsavec )
+    if ( !cpu_has_xsaves )
     {
         memcpy(dest, xsave, size);
         return;
@@ -206,7 +206,7 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
     u64 xstate_bv = ((const struct xsave_struct *)src)->xsave_hdr.xstate_bv;
     u64 valid;
 
-    if ( !cpu_has_xsaves && !cpu_has_xsavec )
+    if ( !cpu_has_xsaves )
    {
        memcpy(xsave, src, size);
        return;
@@ -251,11 +251,9 @@ void xsave(struct vcpu *v, uint64_t mask)
     uint32_t lmask = mask;
     unsigned int fip_width = v->domain->arch.x87_fip_width;
 #define XSAVE(pfx) \
-    alternative_io_3(".byte " pfx "0x0f,0xae,0x27\n", /* xsave */ \
+    alternative_io_2(".byte " pfx "0x0f,0xae,0x27\n", /* xsave */ \
                      ".byte " pfx "0x0f,0xae,0x37\n", /* xsaveopt */ \
                      X86_FEATURE_XSAVEOPT, \
-                     ".byte " pfx "0x0f,0xc7,0x27\n", /* xsavec */ \
-                     X86_FEATURE_XSAVEC, \
                      ".byte " pfx "0x0f,0xc7,0x2f\n", /* xsaves */ \
                      X86_FEATURE_XSAVES, \
                      "=m" (*ptr), \
@@ -409,7 +407,7 @@ void xrstor(struct vcpu *v, uint64_t mask)
            ((mask & XSTATE_YMM) &&
             !(ptr->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED))) )
         ptr->fpu_sse.mxcsr &= mxcsr_mask;
 
-    if ( cpu_has_xsaves || cpu_has_xsavec )
+    if ( cpu_has_xsaves )
     {
         ptr->xsave_hdr.xcomp_bv &= this_cpu(xcr0) | this_cpu(xss);
         ptr->xsave_hdr.xstate_bv &= ptr->xsave_hdr.xcomp_bv;
@@ -565,9 +563,7 @@ void xstate_init(struct cpuinfo_x86 *c)
 
     /* Mask out features not currently understood by Xen. */
     eax &= (cpufeat_mask(X86_FEATURE_XSAVEOPT) |
-            cpufeat_mask(X86_FEATURE_XSAVEC) |
-            cpufeat_mask(X86_FEATURE_XGETBV1) |
-            cpufeat_mask(X86_FEATURE_XSAVES));
+            cpufeat_mask(X86_FEATURE_XGETBV1));
 
     c->x86_capability[cpufeat_word(X86_FEATURE_XSAVEOPT)] = eax;
 
@@ -575,7 +571,7 @@ void xstate_init(struct cpuinfo_x86 *c)
     if ( setup_xstate_features(bsp) && bsp )
         BUG();
 
-    if ( bsp && (cpu_has_xsaves || cpu_has_xsavec) )
+    if ( bsp && cpu_has_xsaves )
         setup_xstate_comp();
 }
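
For reference (again an editorial sketch, not part of the patch; text
after the diff is ignored when applying): the CPUID leaf 0xD, sub-leaf 1
EAX bits masked in hvm_cpuid() and xstate_init() above are
architecturally defined as bit 0 = XSAVEOPT, bit 1 = XSAVEC,
bit 2 = XGETBV1, bit 3 = XSAVES. A minimal user-space probe of these
bits, using GCC's <cpuid.h>:

/* Probe CPUID.(EAX=0xD,ECX=1):EAX for the xsave-related features. */
#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid_count returns 0 if the leaf is unsupported. */
    if ( !__get_cpuid_count(0xd, 1, &eax, &ebx, &ecx, &edx) )
        return 1;

    printf("XSAVEOPT: %u\n", eax & 1);
    printf("XSAVEC:   %u\n", (eax >> 1) & 1);
    printf("XGETBV1:  %u\n", (eax >> 2) & 1);
    printf("XSAVES:   %u\n", (eax >> 3) & 1);
    return 0;
}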