diff mbox

[V2] x86/xsaves: fix overwriting between non-lazy/lazy xsaves

Message ID 1457090525-22468-1-git-send-email-shuai.ruan@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Shuai Ruan March 4, 2016, 11:22 a.m. UTC
The offset at which components xsaved by xsave[sc] are not fixed.
So when when a save with v->fpu_dirtied set is followed by one
with v->fpu_dirtied clear, non-lazy xsave[sc] may overwriting data
written by the lazy one.

The solution is when xsave[sc] is enabled and taking xcr0_accum into
consideration, if guest has ever used lazy states, vcpu_xsave_mask will
return XSTATE_ALL. Otherwise vcpu_xsave_mask will return XSTATE_NONLAZY.
This may cause overhead save on lazy states which will cause performance
impact. As xsavec support code will be cleaned up(reason is list below),
so the patch only take xsaves into consideration.

After doing some performance test on xsavec and xsaveopt(suggested by jan),
the result show xsaveopt performs better than xsavec. So next step clean up
xsavec suppot code in xen and disable xsaves (xsaves should be used when
supervised state is introduced).

Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
Reported-by: Jan Beulich <jbeulich@suse.com>
---

 V2: Address comments from Jan:
 Add performance impact and future steps in description.

 xen/arch/x86/i387.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Jan Beulich March 4, 2016, 2:06 p.m. UTC | #1
>>> On 04.03.16 at 12:22, <shuai.ruan@linux.intel.com> wrote:
> The offset at which components xsaved by xsave[sc] are not fixed.
> So when when a save with v->fpu_dirtied set is followed by one
> with v->fpu_dirtied clear, non-lazy xsave[sc] may overwriting data
> written by the lazy one.
> 
> The solution is when xsave[sc] is enabled and taking xcr0_accum into
> consideration, if guest has ever used lazy states, vcpu_xsave_mask will
> return XSTATE_ALL. Otherwise vcpu_xsave_mask will return XSTATE_NONLAZY.
> This may cause overhead save on lazy states which will cause performance
> impact. As xsavec support code will be cleaned up(reason is list below),
> so the patch only take xsaves into consideration.
> 
> After doing some performance test on xsavec and xsaveopt(suggested by jan),
> the result show xsaveopt performs better than xsavec. So next step clean up
> xsavec suppot code in xen and disable xsaves (xsaves should be used when
> supervised state is introduced).

I'd rather see this being right away except of just getting mentioned
here; I don't expect this to be too difficult a change.

> --- a/xen/arch/x86/i387.c
> +++ b/xen/arch/x86/i387.c
> @@ -118,7 +118,8 @@ static inline uint64_t vcpu_xsave_mask(const struct vcpu *v)
>      if ( v->fpu_dirtied )
>          return v->arch.nonlazy_xstate_used ? XSTATE_ALL : XSTATE_LAZY;
>  
> -    return v->arch.nonlazy_xstate_used ? XSTATE_NONLAZY : 0;
> +    return cpu_has_xsaves && (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE) ?

If (here or in the suggested future patch) such an expression is
needed, I'd highly recommend accompanying it by a comment
explaining what this is about (i.e. specifically why the low two bits
can be ignored for the purpose here). In any event you will need
to make the commit description (which currently just says "if guest
has ever used lazy states") match this.

Jan
diff mbox

Patch

diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index c29d0fa..0084578 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -118,7 +118,8 @@  static inline uint64_t vcpu_xsave_mask(const struct vcpu *v)
     if ( v->fpu_dirtied )
         return v->arch.nonlazy_xstate_used ? XSTATE_ALL : XSTATE_LAZY;
 
-    return v->arch.nonlazy_xstate_used ? XSTATE_NONLAZY : 0;
+    return cpu_has_xsaves && (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE) ?
+           XSTATE_ALL : XSTATE_NONLAZY;
 }
 
 /* Save x87 extended state */