diff mbox

[5/6] x86/domctl: Fix migration of guests which are not using xsave

Message ID 1473673900-8585-6-git-send-email-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

Andrew Cooper Sept. 12, 2016, 9:51 a.m. UTC
c/s da62246e "x86/xsaves: enable xsaves/xrstors/xsavec in xen" broke migration
of PV guests which were not using xsave.

In such a case, compress_xsave_states() gets passed a zero length buffer.  The
first thing it tries to do is ASSERT() on user-provided data, if it hadn't
already wandered off the end of the buffer to do so.

Perform more verification of the input buffer before passing it to
compress_xsave_states().  This involves making xsave_area_compressed() public.

Similar problems exist on the HVM side, so make equivielent adjustments there.
This doesn't manifest in general, as hvm_save_cpu_xsave_states() elides the
entire record if xsave isn't used, but is a problem if a caller were to
construct an xsave record manually.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/domctl.c        | 12 +++++++++---
 xen/arch/x86/hvm/hvm.c       | 20 ++++++++++++++++++--
 xen/arch/x86/xstate.c        |  6 ------
 xen/include/asm-x86/xstate.h |  6 ++++++
 4 files changed, 33 insertions(+), 11 deletions(-)

Comments

Jan Beulich Sept. 12, 2016, 12:14 p.m. UTC | #1
>>> On 12.09.16 at 11:51, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -1158,7 +1158,15 @@ long arch_do_domctl(
>                  goto vcpuextstate_out;
>              }
>  
> -            if ( evc->size <= PV_XSAVE_SIZE(_xcr0_accum) )
> +            if ( evc->size == PV_XSAVE_HDR_SIZE )
> +                ; /* Nothing to restore. */
> +            else if ( evc->size < PV_XSAVE_HDR_SIZE + XSTATE_AREA_MIN_SIZE )
> +                ret = -EINVAL; /* Can't be legitimate data. */
> +            else if ( xsave_area_compressed(_xsave_area) )
> +                ret = -EOPNOTSUPP; /* Don't support compressed data. */
> +            else if ( evc->size != PV_XSAVE_SIZE(_xcr0_accum) )
> +                ret = -EINVAL; /* Not legitimate data. */

Can't this be moved ahead of the xsave_area_compressed() check,
eliminating (as redundant) the check that's there right now?

In any event the tightening to != you do here supports my desire to
not do any relaxation of the size check in patch 2. Or alternatively
you would want to consider restoring the behavior from prior to
the xsaves change, where the <= which you now remove was still
okay.

> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1345,6 +1345,23 @@ static int hvm_load_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
>          printk(XENLOG_G_WARNING
>                 "HVM%d.%u restore mismatch: xsave length %#x > %#x\n",
>                 d->domain_id, vcpuid, desc->length, size);
> +        /* Rewind desc->length to ignore the extraneous zeros. */
> +        desc->length = size;

This is slightly ugly, as it will prevent eventually constifying dest (which
it really should have been from the beginning).

> --- a/xen/include/asm-x86/xstate.h
> +++ b/xen/include/asm-x86/xstate.h
> @@ -131,4 +131,10 @@ static inline bool_t xstate_all(const struct vcpu *v)
>             (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
>  }
>  
> +static inline bool __nonnull(1)
> +    xsave_area_compressed(const struct xsave_struct *xsave_area)

This is certainly odd indentation.

Jan
Andrew Cooper Sept. 12, 2016, 12:46 p.m. UTC | #2
On 12/09/16 13:14, Jan Beulich wrote:
>>>> On 12.09.16 at 11:51, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -1158,7 +1158,15 @@ long arch_do_domctl(
>>                  goto vcpuextstate_out;
>>              }
>>  
>> -            if ( evc->size <= PV_XSAVE_SIZE(_xcr0_accum) )
>> +            if ( evc->size == PV_XSAVE_HDR_SIZE )
>> +                ; /* Nothing to restore. */
>> +            else if ( evc->size < PV_XSAVE_HDR_SIZE + XSTATE_AREA_MIN_SIZE )
>> +                ret = -EINVAL; /* Can't be legitimate data. */
>> +            else if ( xsave_area_compressed(_xsave_area) )
>> +                ret = -EOPNOTSUPP; /* Don't support compressed data. */
>> +            else if ( evc->size != PV_XSAVE_SIZE(_xcr0_accum) )
>> +                ret = -EINVAL; /* Not legitimate data. */
> Can't this be moved ahead of the xsave_area_compressed() check,
> eliminating (as redundant) the check that's there right now?

No.  That doesn't catch the case where _xcr0_accum is 0, which will
cause the xsave_area_compressed(_xsave_area) check to wander off the end
of the buffer.

>
> In any event the tightening to != you do here supports my desire to
> not do any relaxation of the size check in patch 2.

The two cases are different, and should not be conflated.

>  Or alternatively
> you would want to consider restoring the behavior from prior to
> the xsaves change, where the <= which you now remove was still
> okay.

I looked back through the history and couldn't find any justification
for the check being <= as opposed to being more strict.  We absolutely
don't want to be in the case where we only restore part of an xstate
component.

>
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -1345,6 +1345,23 @@ static int hvm_load_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
>>          printk(XENLOG_G_WARNING
>>                 "HVM%d.%u restore mismatch: xsave length %#x > %#x\n",
>>                 d->domain_id, vcpuid, desc->length, size);
>> +        /* Rewind desc->length to ignore the extraneous zeros. */
>> +        desc->length = size;
> This is slightly ugly, as it will prevent eventually constifying dest (which
> it really should have been from the beginning).

Hmm yes.  I hadn't considered that it was actually mutating the hvm
domain context buffer.  I will switch to a shadow variable instead.

>
>> --- a/xen/include/asm-x86/xstate.h
>> +++ b/xen/include/asm-x86/xstate.h
>> @@ -131,4 +131,10 @@ static inline bool_t xstate_all(const struct vcpu *v)
>>             (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
>>  }
>>  
>> +static inline bool __nonnull(1)
>> +    xsave_area_compressed(const struct xsave_struct *xsave_area)
> This is certainly odd indentation.

It is where my editor naturally indents to.  It would appear that emacs
is mistaking the __nonnull(1) as the function name, and presuming that
the following line is K&R style parameters.

It is certainly awkward that the attributes need to be that way around.

~Andrew
Jan Beulich Sept. 12, 2016, 1:41 p.m. UTC | #3
>>> On 12.09.16 at 14:46, <andrew.cooper3@citrix.com> wrote:
> On 12/09/16 13:14, Jan Beulich wrote:
>>>>> On 12.09.16 at 11:51, <andrew.cooper3@citrix.com> wrote:
>>> --- a/xen/arch/x86/domctl.c
>>> +++ b/xen/arch/x86/domctl.c
>>> @@ -1158,7 +1158,15 @@ long arch_do_domctl(
>>>                  goto vcpuextstate_out;
>>>              }
>>>  
>>> -            if ( evc->size <= PV_XSAVE_SIZE(_xcr0_accum) )
>>> +            if ( evc->size == PV_XSAVE_HDR_SIZE )
>>> +                ; /* Nothing to restore. */
>>> +            else if ( evc->size < PV_XSAVE_HDR_SIZE + XSTATE_AREA_MIN_SIZE )
>>> +                ret = -EINVAL; /* Can't be legitimate data. */
>>> +            else if ( xsave_area_compressed(_xsave_area) )
>>> +                ret = -EOPNOTSUPP; /* Don't support compressed data. */
>>> +            else if ( evc->size != PV_XSAVE_SIZE(_xcr0_accum) )
>>> +                ret = -EINVAL; /* Not legitimate data. */
>> Can't this be moved ahead of the xsave_area_compressed() check,
>> eliminating (as redundant) the check that's there right now?
> 
> No.  That doesn't catch the case where _xcr0_accum is 0, which will
> cause the xsave_area_compressed(_xsave_area) check to wander off the end
> of the buffer.

Oh, indeed.

>> In any event the tightening to != you do here supports my desire to
>> not do any relaxation of the size check in patch 2.
> 
> The two cases are different, and should not be conflated.
> 
>>  Or alternatively
>> you would want to consider restoring the behavior from prior to
>> the xsaves change, where the <= which you now remove was still
>> okay.
> 
> I looked back through the history and couldn't find any justification
> for the check being <= as opposed to being more strict.  We absolutely
> don't want to be in the case where we only restore part of an xstate
> component.

Indeed I don't think there was ever any justification provided, it
just happened to be that way (perhaps having in mind that the
FXSAVE layout had unused space at the end).

>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -1345,6 +1345,23 @@ static int hvm_load_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
>>>          printk(XENLOG_G_WARNING
>>>                 "HVM%d.%u restore mismatch: xsave length %#x > %#x\n",
>>>                 d->domain_id, vcpuid, desc->length, size);
>>> +        /* Rewind desc->length to ignore the extraneous zeros. */
>>> +        desc->length = size;
>> This is slightly ugly, as it will prevent eventually constifying dest (which
>> it really should have been from the beginning).
> 
> Hmm yes.  I hadn't considered that it was actually mutating the hvm
> domain context buffer.  I will switch to a shadow variable instead.
> 
>>
>>> --- a/xen/include/asm-x86/xstate.h
>>> +++ b/xen/include/asm-x86/xstate.h
>>> @@ -131,4 +131,10 @@ static inline bool_t xstate_all(const struct vcpu *v)
>>>             (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
>>>  }
>>>  
>>> +static inline bool __nonnull(1)
>>> +    xsave_area_compressed(const struct xsave_struct *xsave_area)
>> This is certainly odd indentation.
> 
> It is where my editor naturally indents to.  It would appear that emacs
> is mistaking the __nonnull(1) as the function name, and presuming that
> the following line is K&R style parameters.

With these two small things taken care of,
Reviewed-by: Jan Beulich <jbeulich@suse.com>
diff mbox

Patch

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 5aa9f3a..2a2fe04 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1158,7 +1158,15 @@  long arch_do_domctl(
                 goto vcpuextstate_out;
             }
 
-            if ( evc->size <= PV_XSAVE_SIZE(_xcr0_accum) )
+            if ( evc->size == PV_XSAVE_HDR_SIZE )
+                ; /* Nothing to restore. */
+            else if ( evc->size < PV_XSAVE_HDR_SIZE + XSTATE_AREA_MIN_SIZE )
+                ret = -EINVAL; /* Can't be legitimate data. */
+            else if ( xsave_area_compressed(_xsave_area) )
+                ret = -EOPNOTSUPP; /* Don't support compressed data. */
+            else if ( evc->size != PV_XSAVE_SIZE(_xcr0_accum) )
+                ret = -EINVAL; /* Not legitimate data. */
+            else
             {
                 vcpu_pause(v);
                 v->arch.xcr0 = _xcr0;
@@ -1169,8 +1177,6 @@  long arch_do_domctl(
                                       evc->size - PV_XSAVE_HDR_SIZE);
                 vcpu_unpause(v);
             }
-            else
-                ret = -EINVAL;
 
             xfree(receive_buf);
         }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ca96643..a6c6621 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1345,6 +1345,23 @@  static int hvm_load_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
         printk(XENLOG_G_WARNING
                "HVM%d.%u restore mismatch: xsave length %#x > %#x\n",
                d->domain_id, vcpuid, desc->length, size);
+        /* Rewind desc->length to ignore the extraneous zeros. */
+        desc->length = size;
+    }
+
+    if ( xsave_area_compressed((const void *)&ctxt->save_area) )
+    {
+        printk(XENLOG_G_WARNING
+               "HVM%d.%u restore: compressed xsave state not supported\n",
+               d->domain_id, vcpuid);
+        return -EOPNOTSUPP;
+    }
+    else if ( desc->length != size )
+    {
+        printk(XENLOG_G_WARNING
+               "HVM%d.%u restore mismatch: xsave length %#x != %#x\n",
+               d->domain_id, vcpuid, desc->length, size);
+        return -EINVAL;
     }
     /* Checking finished */
 
@@ -1353,8 +1370,7 @@  static int hvm_load_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
     if ( ctxt->xcr0_accum & XSTATE_NONLAZY )
         v->arch.nonlazy_xstate_used = 1;
     compress_xsave_states(v, &ctxt->save_area,
-                          min(desc->length, size) -
-                          offsetof(struct hvm_hw_cpu_xsave,save_area));
+                          size - offsetof(struct hvm_hw_cpu_xsave, save_area));
 
     return 0;
 }
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 1973ba0..f6157f5 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -86,12 +86,6 @@  uint64_t get_msr_xss(void)
     return this_cpu(xss);
 }
 
-static bool_t xsave_area_compressed(const struct xsave_struct *xsave_area)
-{
-     return xsave_area && (xsave_area->xsave_hdr.xcomp_bv
-                           & XSTATE_COMPACTION_ENABLED);
-}
-
 static int setup_xstate_features(bool_t bsp)
 {
     unsigned int leaf, eax, ebx, ecx, edx;
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index 51a9ed4..a8b8751 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -131,4 +131,10 @@  static inline bool_t xstate_all(const struct vcpu *v)
            (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
 }
 
+static inline bool __nonnull(1)
+    xsave_area_compressed(const struct xsave_struct *xsave_area)
+{
+    return xsave_area->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED;
+}
+
 #endif /* __ASM_XSTATE_H */