[4/6] x86/xstate: Fix latent bugs in expand_xsave_states()

Message ID 1473673900-8585-5-git-send-email-andrew.cooper3@citrix.com (mailing list archive)
State New, archived

Commit Message

Andrew Cooper Sept. 12, 2016, 9:51 a.m. UTC
Without checking the size input, the memcpy() for the uncompressed path might
read off the end of the vcpu's xsave_area.  Both callers pass the appropriate
size, so hold them to it with a BUG_ON().

The compressed path is currently dead code, but its attempt to avoid leaking
uninitialised data was incomplete.  The current xstate_bv will be less than
xcr0_accum if some bits of xsave state are in their default values.  Work
around this by zeroing the whole rest of the buffer before decompression.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/xstate.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

Comments

Jan Beulich Sept. 12, 2016, 11:41 a.m. UTC | #1
>>> On 12.09.16 at 11:51, <andrew.cooper3@citrix.com> wrote:
> @@ -176,6 +187,11 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
>      u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
>      u64 valid;
>  
> +    /* Check there is state to serialise (i.e. at least an XSAVE_HDR) */
> +    BUG_ON(!v->arch.xcr0_accum);
> +    /* Check there is the correct room to decompress into. */
> +    BUG_ON(size != xstate_ctxt_size(v->arch.xcr0_accum));

Further down I see you convert an ASSERT() to BUG_ON(), but I
wonder why you do that and why the two above can't be ASSERT()
too. xstate_ctxt_size() is not always cheap.

> @@ -189,6 +205,7 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
>       * Copy legacy XSAVE area and XSAVE hdr area.
>       */
>      memcpy(dest, xsave, XSTATE_AREA_MIN_SIZE);
> +    memset(dest + XSTATE_AREA_MIN_SIZE, 0, size - XSTATE_AREA_MIN_SIZE);
>  
>      ((struct xsave_struct *)dest)->xsave_hdr.xcomp_bv =  0;
>  
> @@ -205,11 +222,9 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
>  
>          if ( src )
>          {
> -            ASSERT((xstate_offsets[index] + xstate_sizes[index]) <= size);
> +            BUG_ON((xstate_offsets[index] + xstate_sizes[index]) <= size);

Surely converting an ASSERT() to BUG_ON() means inverting the
relational operator used?

>              memcpy(dest + xstate_offsets[index], src, xstate_sizes[index]);
>          }
> -        else
> -            memset(dest + xstate_offsets[index], 0, xstate_sizes[index]);

So I have difficulty seeing why this memset() wasn't sufficient: It
precisely covers for the respective component being in default
state. Or wait - this was fine if intermediate bits were clear in
xstate_bv, but not if clear-but-valid ones weren't followed by
another set one. Nor would gaps between components have been
taken care of. I think the commit message could be made more
explicit in this regard (of course unless I'm overlooking yet another
aspect).

Jan
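
The point about the relational operator can be shown with a small userspace
sketch.  SKETCH_ASSERT(), SKETCH_BUG_ON() and fail() are hypothetical
stand-ins (they count failures instead of crashing), and the values in the
usage note are only illustrative.  ASSERT() takes the condition that must
hold, BUG_ON() takes the condition that must not, so a mechanical conversion
has to negate the predicate.

```c
/* Count failures instead of crashing, so the behaviour can be observed. */
static unsigned int nr_failures;

static void fail(void)
{
    nr_failures++;
}

/* Fires when the predicate is false. */
#define SKETCH_ASSERT(p) do { if ( !(p) ) fail(); } while ( 0 )
/* Fires when the predicate is true. */
#define SKETCH_BUG_ON(p) do { if ( p ) fail(); } while ( 0 )

/* Original form: the component must fit within the buffer. */
static void check_with_assert(unsigned int offset, unsigned int len,
                              unsigned int size)
{
    SKETCH_ASSERT(offset + len <= size);
}

/* Correct conversion: the predicate is negated. */
static void check_with_bug_on(unsigned int offset, unsigned int len,
                              unsigned int size)
{
    SKETCH_BUG_ON(offset + len > size);
}

/* Conversion as posted: fires precisely when the component does fit. */
static void check_with_bug_on_as_posted(unsigned int offset, unsigned int len,
                                        unsigned int size)
{
    SKETCH_BUG_ON(offset + len <= size);
}
```

For a component at offset 576 of size 256 in an 832-byte buffer, the first two
checks pass and the third one fires.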
Andrew Cooper Sept. 12, 2016, 12:29 p.m. UTC | #2
On 12/09/16 12:41, Jan Beulich wrote:
>>>> On 12.09.16 at 11:51, <andrew.cooper3@citrix.com> wrote:
>> @@ -176,6 +187,11 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
>>      u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
>>      u64 valid;
>>  
>> +    /* Check there is state to serialise (i.e. at least an XSAVE_HDR) */
>> +    BUG_ON(!v->arch.xcr0_accum);
>> +    /* Check there is the correct room to decompress into. */
>> +    BUG_ON(size != xstate_ctxt_size(v->arch.xcr0_accum));
> Further down I see you convert an ASSERT() to BUG_ON(), but I
> wonder why you do that and why the two above can't be ASSERT()
> too. xstate_ctxt_size() is not always cheap.

This isn't a fastpath, and the cpuid work will make xstate_ctxt_size()
into an O(1) operation.

Furthermore, following the investigation of XSA-186, I will not use
assertions for bounds checking.  The potential damage of omitting the
check far outweighs the overhead of the unconditional check.

>
>> @@ -189,6 +205,7 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
>>       * Copy legacy XSAVE area and XSAVE hdr area.
>>       */
>>      memcpy(dest, xsave, XSTATE_AREA_MIN_SIZE);
>> +    memset(dest + XSTATE_AREA_MIN_SIZE, 0, size - XSTATE_AREA_MIN_SIZE);
>>  
>>      ((struct xsave_struct *)dest)->xsave_hdr.xcomp_bv =  0;
>>  
>> @@ -205,11 +222,9 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
>>  
>>          if ( src )
>>          {
>> -            ASSERT((xstate_offsets[index] + xstate_sizes[index]) <= size);
>> +            BUG_ON((xstate_offsets[index] + xstate_sizes[index]) <= size);
> Surely converting an ASSERT() to BUG_ON() means inverting the
> relational operator used?

Very true.  It is unfortunate that all of this is dead code, and
impossible to test.  I also had half a mind to explicitly #if 0 it out
to leave people in no illusion that it ever might have been tested.

>
>>              memcpy(dest + xstate_offsets[index], src, xstate_sizes[index]);
>>          }
>> -        else
>> -            memset(dest + xstate_offsets[index], 0, xstate_sizes[index]);
> So I have difficulty seeing why this memset() wasn't sufficient: It
> precisely covers for the respective component being in default
> state.

No it doesn't.  The loop skips over all bits which are not set in xstate_bv.

I had (erroneously) come to the conclusion that the "if ( src )" check
only caught the case where we had bad comp_offsets[] information, but
rereading the logic, that case would actually corrupt the legacy SSE header.

Overall, it turns out that the "if ( src )" is unconditionally taken.

~Andrew
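
The argument above for unconditional bounds checks can be sketched in
userspace as follows.  This is an assumption-laden analogue, not Xen code:
checked_copy() is a hypothetical helper, and it reports failure to the caller
where BUG_ON() would crash the hypervisor.  What it has in common with the
patch is that the check is present in every build, unlike an assertion that
release builds compile out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy 'len' bytes to 'dest + offset', refusing to write outside a
 * destination of 'dest_size' bytes.  The comparison is arranged so that
 * 'offset + len' is never computed and therefore cannot wrap around.
 */
static bool checked_copy(void *dest, size_t dest_size, size_t offset,
                         const void *src, size_t len)
{
    if ( offset > dest_size || len > dest_size - offset )
        return false; /* would overrun: refuse in every build */

    memcpy((char *)dest + offset, src, len);
    return true;
}
```

A copy of 4 bytes at offset 4 into an 8-byte buffer succeeds, while the same
copy at offset 5 is rejected.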
Jan Beulich Sept. 12, 2016, 12:41 p.m. UTC | #3
>>> On 12.09.16 at 14:29, <andrew.cooper3@citrix.com> wrote:
> On 12/09/16 12:41, Jan Beulich wrote:
>>>>> On 12.09.16 at 11:51, <andrew.cooper3@citrix.com> wrote:
>>> @@ -205,11 +222,9 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
>>>  
>>>          if ( src )
>>>          {
>>> -            ASSERT((xstate_offsets[index] + xstate_sizes[index]) <= size);
>>> +            BUG_ON((xstate_offsets[index] + xstate_sizes[index]) <= size);
>>>              memcpy(dest + xstate_offsets[index], src, xstate_sizes[index]);
>>>          }
>>> -        else
>>> -            memset(dest + xstate_offsets[index], 0, xstate_sizes[index]);
>> So I have difficulty seeing why this memset() wasn't sufficient: It
>> precisely covers for the respective component being in default
>> state.
> 
> No it doesn't.  The loop skips over all bits which are not set in xstate_bv.

Well, yes, I had corrected myself in the following sentence, resulting
in me just asking for the commit message to get clarified.

> I had (erroneously) come to the conclusion that the "if ( src )" check
> only caught the case where we had bad comp_offsets[] information, but
> rereading the logic, that case would actually corrupt the legacy SSE header.
> 
> Overall, it turns out that the "if ( src )" is unconditionally taken.

Oh, I see (same applies to my then wrong comment on patch 6):
We iterate over xstate_bv here, and components with their flag
set in xstate_bv won't see NULL coming back from get_xsave_addr().
I'm sorry for the noise then.

Jan
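
The observation about iterating over xstate_bv can be sketched as below.  The
exact form of the loop in xstate.c is not shown in this thread, so the
lowest-set-bit idiom and the name visit_set_bits() are assumptions made for
illustration.  The body runs only for bits that are set, so a per-component
"else" branch inside such a loop is never reached for components in their
default state.

```c
#include <stdint.h>

/*
 * Record the index of every set bit in 'bv', lowest first, and return
 * how many were found.  'indices' must have room for 64 entries.
 */
static unsigned int visit_set_bits(uint64_t bv, unsigned int *indices)
{
    unsigned int n = 0;

    while ( bv )
    {
        uint64_t feature = bv & -bv; /* isolate the lowest set bit */
        unsigned int index = 0;

        while ( !((feature >> index) & 1) )
            index++;

        indices[n++] = index;
        bv &= ~feature; /* clear it and move on; clear bits are skipped */
    }

    return n;
}
```

For bv = 0x205 the loop body runs for indices 0, 2 and 9 only; indices 1 and
3 to 8 are never visited.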
Jan Beulich Sept. 12, 2016, 12:43 p.m. UTC | #4
>>> On 12.09.16 at 14:29, <andrew.cooper3@citrix.com> wrote:
> On 12/09/16 12:41, Jan Beulich wrote:
>>>>> On 12.09.16 at 11:51, <andrew.cooper3@citrix.com> wrote:
>>> @@ -205,11 +222,9 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
>>>  
>>>          if ( src )
>>>          {
>>> -            ASSERT((xstate_offsets[index] + xstate_sizes[index]) <= size);
>>> +            BUG_ON((xstate_offsets[index] + xstate_sizes[index]) <= size);
>> Surely converting an ASSERT() to BUG_ON() means inverting the
>> relational operator used?
> 
> Very true.  It is unfortunate that all of this is dead code, and
> impossible to test.  I also had half a mind to explicitly #if 0 it out
> to leave people in no illusion that it ever might have been tested.

So with this correct, the patch is then
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper Sept. 12, 2016, 1:57 p.m. UTC | #5
On 12/09/16 13:43, Jan Beulich wrote:
>>>> On 12.09.16 at 14:29, <andrew.cooper3@citrix.com> wrote:
>> On 12/09/16 12:41, Jan Beulich wrote:
>>>>>> On 12.09.16 at 11:51, <andrew.cooper3@citrix.com> wrote:
>>>> @@ -205,11 +222,9 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
>>>>  
>>>>          if ( src )
>>>>          {
>>>> -            ASSERT((xstate_offsets[index] + xstate_sizes[index]) <= size);
>>>> +            BUG_ON((xstate_offsets[index] + xstate_sizes[index]) <= size);
>>> Surely converting an ASSERT() to BUG_ON() means inverting the
>>> relational operator used?
>> Very true.  It is unfortunate that all of this is dead code, and
>> impossible to test.  I also had half a mind to explicitly #if 0 it out
>> to leave people in no illusion that it ever might have been tested.
> So with this correct, the patch is then
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

The discussion on this patch has shown that the "if ( src )" is
unconditionally true, and as such, I would like to remove it.  Would
your R-b stand with this hunk altered similarly to the final hunk in
patch 6 (with the BUG_ON() logic adjustment, and updated wording) ?

~Andrew
Jan Beulich Sept. 12, 2016, 2:13 p.m. UTC | #6
>>> On 12.09.16 at 15:57, <andrew.cooper3@citrix.com> wrote:
> On 12/09/16 13:43, Jan Beulich wrote:
>>>>> On 12.09.16 at 14:29, <andrew.cooper3@citrix.com> wrote:
>>> On 12/09/16 12:41, Jan Beulich wrote:
>>>>>>> On 12.09.16 at 11:51, <andrew.cooper3@citrix.com> wrote:
>>>>> @@ -205,11 +222,9 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
>>>>>  
>>>>>          if ( src )
>>>>>          {
>>>>> -            ASSERT((xstate_offsets[index] + xstate_sizes[index]) <= size);
>>>>> +            BUG_ON((xstate_offsets[index] + xstate_sizes[index]) <= size);
>>>> Surely converting an ASSERT() to BUG_ON() means inverting the
>>>> relational operator used?
>>> Very true.  It is unfortunate that all of this is dead code, and
>>> impossible to test.  I also had half a mind to explicitly #if 0 it out
>>> to leave people in no illusion that it ever might have been tested.
>> So with this correct, the patch is then
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> The discussion on this patch has shown that the "if ( src )" is
> unconditionally true, and as such, I would like to remove it.  Would
> your R-b stand with this hunk altered similarly to the final hunk in
> patch 6 (with the BUG_ON() logic adjustment, and updated wording) ?

Yes.

Jan

Patch

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 6e4a0d3..1973ba0 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -169,6 +169,17 @@  static void *get_xsave_addr(struct xsave_struct *xsave,
            (void *)xsave + comp_offsets[xfeature_idx] : NULL;
 }
 
+/*
+ * Serialise a vcpu's xsave state into a representation suitable for the
+ * toolstack.
+ *
+ * Internally a vcpu's xsave state may be compressed or uncompressed, depending
+ * on the features in use, but the ABI with the toolstack is strictly
+ * uncompressed.
+ *
+ * It is the caller's responsibility to ensure that there is xsave state to
+ * serialise, and that the provided buffer is exactly the right size.
+ */
 void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
 {
     struct xsave_struct *xsave = v->arch.xsave_area;
@@ -176,6 +187,11 @@  void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
     u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
     u64 valid;
 
+    /* Check there is state to serialise (i.e. at least an XSAVE_HDR) */
+    BUG_ON(!v->arch.xcr0_accum);
+    /* Check there is the correct room to decompress into. */
+    BUG_ON(size != xstate_ctxt_size(v->arch.xcr0_accum));
+
     if ( !(xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) )
     {
         memcpy(dest, xsave, size);
@@ -189,6 +205,7 @@  void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
      * Copy legacy XSAVE area and XSAVE hdr area.
      */
     memcpy(dest, xsave, XSTATE_AREA_MIN_SIZE);
+    memset(dest + XSTATE_AREA_MIN_SIZE, 0, size - XSTATE_AREA_MIN_SIZE);
 
     ((struct xsave_struct *)dest)->xsave_hdr.xcomp_bv =  0;
 
@@ -205,11 +222,9 @@  void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
 
         if ( src )
         {
-            ASSERT((xstate_offsets[index] + xstate_sizes[index]) <= size);
+            BUG_ON((xstate_offsets[index] + xstate_sizes[index]) <= size);
             memcpy(dest + xstate_offsets[index], src, xstate_sizes[index]);
         }
-        else
-            memset(dest + xstate_offsets[index], 0, xstate_sizes[index]);
 
         valid &= ~feature;
     }