
[v4,26/26] tools/libxc: Calculate xstate cpuid leaf from guest information

Message ID 1458750989-28967-27-git-send-email-andrew.cooper3@citrix.com (mailing list archive)
State New, archived

Commit Message

Andrew Cooper March 23, 2016, 4:36 p.m. UTC
It is unsafe to generate the guest's xstate leaves from host information, as
doing so prevents the differences between hosts from being hidden.

In addition, some further improvements and corrections:
 - don't discard the known flags in sub-leaves 2..63 ECX
 - zap sub-leaves beyond 62
 - zap all bits in leaf 1, EBX/ECX.  No XSS features are currently supported.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
CC: Wei Liu <wei.liu2@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>

v3:
 * Reintroduce MPX adjustment (this series has been in development since
   before the introduction of MPX upstream, and it got lost in a rebase)
v4:
 * Fold further improvements from Jan
---
 tools/libxc/xc_cpuid_x86.c | 71 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 57 insertions(+), 14 deletions(-)
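As a rough illustration of the sanitisation rules listed in the commit message, a standalone sketch (a hypothetical helper, not the actual libxc code):

```c
#include <stdint.h>

/* Known per-component subleaf flags in ECX (same values as the patch). */
#define XSTATE_XSS      (1u << 0)
#define XSTATE_ALIGN64  (1u << 1)

/* Hypothetical sketch: apply the commit message's rules to one CPUID
 * leaf 0xD sub-leaf's register block (regs = EAX/EBX/ECX/EDX). */
static void sanitise_xstate_subleaf(unsigned int subleaf, uint32_t regs[4])
{
    if ( subleaf > 62 )
    {
        /* Zap sub-leaves beyond 62. */
        regs[0] = regs[1] = regs[2] = regs[3] = 0;
    }
    else if ( subleaf >= 2 )
    {
        /* Keep only the known flags in ECX; clear EDX. */
        regs[2] &= XSTATE_XSS | XSTATE_ALIGN64;
        regs[3] = 0;
    }
}
```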

Comments

Wei Liu March 24, 2016, 5:20 p.m. UTC | #1
On Wed, Mar 23, 2016 at 04:36:29PM +0000, Andrew Cooper wrote:
> It is unsafe to generate the guests xstate leaves from host information, as it
> prevents the differences between hosts from being hidden.
> 
> In addition, some further improvements and corrections:
>  - don't discard the known flags in sub-leaves 2..63 ECX
>  - zap sub-leaves beyond 62
>  - zap all bits in leaf 1, EBX/ECX.  No XSS features are currently supported.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich March 31, 2016, 7:48 a.m. UTC | #2
>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
> --- a/tools/libxc/xc_cpuid_x86.c
> +++ b/tools/libxc/xc_cpuid_x86.c
> @@ -398,54 +398,97 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
>      }
>  }
>  
> +/* XSTATE bits in XCR0. */
> +#define X86_XCR0_X87    (1ULL <<  0)
> +#define X86_XCR0_SSE    (1ULL <<  1)
> +#define X86_XCR0_AVX    (1ULL <<  2)
> +#define X86_XCR0_BNDREG (1ULL <<  3)
> +#define X86_XCR0_BNDCSR (1ULL <<  4)
> +#define X86_XCR0_LWP    (1ULL << 62)

Why an incomplete set? At least PKRU should be needed right
away. And I see no reason why the three AVX-512 pieces can't
be put here right away too.
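For reference, the bits Jan refers to are architecturally fixed; a hypothetical extension of the list (bit positions per the Intel SDM):

```c
/* Additional XCR0 bits (architecturally fixed in CPUID leaf 0xD). */
#define X86_XCR0_OPMASK    (1ULL << 5)  /* AVX-512 opmask registers k0-k7 */
#define X86_XCR0_ZMM_HI256 (1ULL << 6)  /* AVX-512 upper 256 bits of ZMM0-15 */
#define X86_XCR0_HI16_ZMM  (1ULL << 7)  /* AVX-512 ZMM16-31 */
#define X86_XCR0_PKRU      (1ULL << 9)  /* Protection Key Rights register */
```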

> +#define X86_XSS_MASK    (0) /* No XSS states supported yet. */
> +
> +/* Per-component subleaf flags. */
> +#define XSTATE_XSS      (1ULL <<  0)
> +#define XSTATE_ALIGN64  (1ULL <<  1)
> +
>  /* Configure extended state enumeration leaves (0x0000000D for xsave) */
>  static void xc_cpuid_config_xsave(xc_interface *xch,
>                                    const struct cpuid_domain_info *info,
>                                    const unsigned int *input, unsigned int *regs)
>  {
> -    if ( info->xfeature_mask == 0 )
> +    uint64_t guest_xfeature_mask;
> +
> +    if ( info->xfeature_mask == 0 ||
> +         !test_bit(X86_FEATURE_XSAVE, info->featureset) )
>      {
>          regs[0] = regs[1] = regs[2] = regs[3] = 0;
>          return;
>      }
>  
> +    guest_xfeature_mask = X86_XCR0_SSE | X86_XCR0_X87;
> +
> +    if ( test_bit(X86_FEATURE_AVX, info->featureset) )
> +        guest_xfeature_mask |= X86_XCR0_AVX;
> +
> +    if ( test_bit(X86_FEATURE_MPX, info->featureset) )
> +        guest_xfeature_mask |= X86_XCR0_BNDREG | X86_XCR0_BNDCSR;
> +
> +    if ( test_bit(X86_FEATURE_LWP, info->featureset) )
> +        guest_xfeature_mask |= X86_XCR0_LWP;
> +
> +    /*
> +     * Clamp to host mask.  Should be no-op, as guest_xfeature_mask should not
> +     * be able to be calculated as larger than info->xfeature_mask.
> +     *
> +     * TODO - see about making this a harder error.
> +     */
> +    guest_xfeature_mask &= info->xfeature_mask;

This is ugly. For one, your dependency mechanism should be able to
express the dependencies you "manually" enforce above. And beyond
that masking with info->xfeature_mask should be all that's needed,
together with enforcing the XCR0 / XSS split ...

>      switch ( input[1] )
>      {
> -    case 0: 
> +    case 0:
>          /* EAX: low 32bits of xfeature_enabled_mask */
> -        regs[0] = info->xfeature_mask & 0xFFFFFFFF;
> +        regs[0] = guest_xfeature_mask;
>          /* EDX: high 32bits of xfeature_enabled_mask */
> -        regs[3] = (info->xfeature_mask >> 32) & 0xFFFFFFFF;
> +        regs[3] = guest_xfeature_mask >> 32;

... here and ...

>      case 1: /* leaf 1 */
>          regs[0] = info->featureset[featureword_of(X86_FEATURE_XSAVEOPT)];
> -        regs[2] &= info->xfeature_mask;
> -        regs[3] = 0;
> +        regs[2] = guest_xfeature_mask & X86_XSS_MASK;
> +        regs[3] = (guest_xfeature_mask >> 32) & X86_XSS_MASK;

... here. Yet not with a compile-time defined mask, but by using
(host) CPUID output: it is clear that once a bit has been assigned to XCR0
vs XSS, it won't ever change. Hence it doesn't matter whether you
use the guest or host view of that split. And this will then also make it
unnecessary - contrary to what you said before would be unavoidable - to
update this code every time new states get added.
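Jan's suggestion can be sketched roughly as follows: CPUID leaf 0xD, sub-leaf n, ECX bit 0 reports whether component n is XSS-managed, so the XCR0/XSS split could be derived from (host) CPUID output instead of a compile-time mask. A hypothetical sketch over pre-read sub-leaf ECX values (not actual libxc code):

```c
#include <stdint.h>

/* Hypothetical sketch: derive the XSS-managed component mask from the
 * ECX values of CPUID leaf 0xD sub-leaves 2..62 (ECX bit 0 set means
 * the component is enabled via IA32_XSS rather than XCR0). */
static uint64_t xss_mask_from_cpuid(const uint32_t subleaf_ecx[63])
{
    uint64_t mask = 0;
    unsigned int i;

    for ( i = 2; i <= 62; ++i )
        if ( subleaf_ecx[i] & 1u )
            mask |= 1ULL << i;

    return mask;
}
```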

> -    case 2 ... 63: /* sub-leaves */
> -        if ( !(info->xfeature_mask & (1ULL << input[1])) )
> +
> +    case 2 ... 62: /* per-component sub-leaves */
> +        if ( !(guest_xfeature_mask & (1ULL << input[1])) )
>          {
>              regs[0] = regs[1] = regs[2] = regs[3] = 0;
>              break;
>          }
>          /* Don't touch EAX, EBX. Also cleanup ECX and EDX */
> -        regs[2] = regs[3] = 0;
> +        regs[2] &= XSTATE_XSS | XSTATE_ALIGN64;

Wouldn't this better also use the "known features" approach, by
adding yet another word in cpufeatureset.h?

Btw., looking at that header again I now wonder whether it
wouldn't have been neater to make XEN_CPUFEATURE() a
3-parameter macro, with word and bit specified separately
and a default definition of

#define XEN_CPUFEATURE(name, word, bit) XEN_X86_FEATURE_##name = (word) * 32 + (bit),

avoiding the ugly repeated "*32" in all macro invocations. Of
course we'd need to adjust this before we release with this new
interface.
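A quick sketch of the macro shape Jan suggests (the word/bit values here are illustrative, not Xen's actual assignments):

```c
/* 3-parameter form, with the repeated "*32" folded into the macro. */
#define XEN_CPUFEATURE(name, word, bit) \
    XEN_X86_FEATURE_##name = (word) * 32 + (bit),

enum {
    XEN_CPUFEATURE(XSAVE,    4, 26)  /* rather than XEN_CPUFEATURE(XSAVE, 4*32 + 26) */
    XEN_CPUFEATURE(XSAVEOPT, 5,  0)
};
```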

Jan
Andrew Cooper April 5, 2016, 5:48 p.m. UTC | #3
On 31/03/16 08:48, Jan Beulich wrote:
>>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
>> --- a/tools/libxc/xc_cpuid_x86.c
>> +++ b/tools/libxc/xc_cpuid_x86.c
>> @@ -398,54 +398,97 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
>>      }
>>  }
>>  
>> +/* XSTATE bits in XCR0. */
>> +#define X86_XCR0_X87    (1ULL <<  0)
>> +#define X86_XCR0_SSE    (1ULL <<  1)
>> +#define X86_XCR0_AVX    (1ULL <<  2)
>> +#define X86_XCR0_BNDREG (1ULL <<  3)
>> +#define X86_XCR0_BNDCSR (1ULL <<  4)
>> +#define X86_XCR0_LWP    (1ULL << 62)
> Why an incomplete set? At least PKRU should be needed right
> away. And I see no reason why the three AVX-512 pieces can't
> be put here right away too.

PKRU is another victim of this series being rebased over the
introduction of new functionality.  I will re-add it.

AVX-512 would require adding the AVX-512 feature flags and deciphering the
dependency tree for all of them.  I have no ability to test any such
additions (no available hardware), and don't want to introduce
possibly-buggy code ahead of full support being added.

>
>> +#define X86_XSS_MASK    (0) /* No XSS states supported yet. */
>> +
>> +/* Per-component subleaf flags. */
>> +#define XSTATE_XSS      (1ULL <<  0)
>> +#define XSTATE_ALIGN64  (1ULL <<  1)
>> +
>>  /* Configure extended state enumeration leaves (0x0000000D for xsave) */
>>  static void xc_cpuid_config_xsave(xc_interface *xch,
>>                                    const struct cpuid_domain_info *info,
>>                                    const unsigned int *input, unsigned int *regs)
>>  {
>> -    if ( info->xfeature_mask == 0 )
>> +    uint64_t guest_xfeature_mask;
>> +
>> +    if ( info->xfeature_mask == 0 ||
>> +         !test_bit(X86_FEATURE_XSAVE, info->featureset) )
>>      {
>>          regs[0] = regs[1] = regs[2] = regs[3] = 0;
>>          return;
>>      }
>>  
>> +    guest_xfeature_mask = X86_XCR0_SSE | X86_XCR0_X87;
>> +
>> +    if ( test_bit(X86_FEATURE_AVX, info->featureset) )
>> +        guest_xfeature_mask |= X86_XCR0_AVX;
>> +
>> +    if ( test_bit(X86_FEATURE_MPX, info->featureset) )
>> +        guest_xfeature_mask |= X86_XCR0_BNDREG | X86_XCR0_BNDCSR;
>> +
>> +    if ( test_bit(X86_FEATURE_LWP, info->featureset) )
>> +        guest_xfeature_mask |= X86_XCR0_LWP;
>> +
>> +    /*
>> +     * Clamp to host mask.  Should be no-op, as guest_xfeature_mask should not
>> +     * be able to be calculated as larger than info->xfeature_mask.
>> +     *
>> +     * TODO - see about making this a harder error.
>> +     */
>> +    guest_xfeature_mask &= info->xfeature_mask;
> This is ugly.

And now I think about it, wrong.  Dom0's cpuid view is that of a PV
guest, which comes with no XSAVES (which will impact the future support
of Processor Trace), and no PKRU.

>  For one, your dependency mechanism should be able to
> express the dependencies you "manually"enforce above. And beyond
> that masking with info->xfeature_mask should be all that's needed,
> together with enforcing the XCR0 / XSS split ...
>
>>      switch ( input[1] )
>>      {
>> -    case 0: 
>> +    case 0:
>>          /* EAX: low 32bits of xfeature_enabled_mask */
>> -        regs[0] = info->xfeature_mask & 0xFFFFFFFF;
>> +        regs[0] = guest_xfeature_mask;
>>          /* EDX: high 32bits of xfeature_enabled_mask */
>> -        regs[3] = (info->xfeature_mask >> 32) & 0xFFFFFFFF;
>> +        regs[3] = guest_xfeature_mask >> 32;
> ... here and ...
>
>>      case 1: /* leaf 1 */
>>          regs[0] = info->featureset[featureword_of(X86_FEATURE_XSAVEOPT)];
>> -        regs[2] &= info->xfeature_mask;
>> -        regs[3] = 0;
>> +        regs[2] = guest_xfeature_mask & X86_XSS_MASK;
>> +        regs[3] = (guest_xfeature_mask >> 32) & X86_XSS_MASK;
> ... here. Yet not by a compile time defined mask, but by using
> (host) CPUID output: It is clear that once a bit got assigned to XCR0
> vs XSS, it won't ever change. Hence it doesn't matter whether you
> use the guest or host view of that split. And this will then also - other
> than you've said before would be unavoidable - make unnecessary to
> always update this code when new states get added.

There is no possible way of avoiding having a whitelist somewhere, which
limits what Xen will tolerate supporting for the guest.

All of this code should have been implemented in Xen in the first
place.  I am afraid that this can't be fixed properly ahead of my further
plans for full policy handling in Xen.

I will see if I can find a minimal way of fixing this for 4.7, but it is
yet another example of xstate handling simply being broken in tree.

>
>> -    case 2 ... 63: /* sub-leaves */
>> -        if ( !(info->xfeature_mask & (1ULL << input[1])) )
>> +
>> +    case 2 ... 62: /* per-component sub-leaves */
>> +        if ( !(guest_xfeature_mask & (1ULL << input[1])) )
>>          {
>>              regs[0] = regs[1] = regs[2] = regs[3] = 0;
>>              break;
>>          }
>>          /* Don't touch EAX, EBX. Also cleanup ECX and EDX */
>> -        regs[2] = regs[3] = 0;
>> +        regs[2] &= XSTATE_XSS | XSTATE_ALIGN64;
> Wouldn't this better also use the "known features" approach, by
> adding yet another word in cpufeatureset.h?

No - I thought I had already explained why.

There is a mapping between features and available xstate to use those
features (with some features mapping to multiple xstates).  Having the
valid xstates derived from the configured features prevents the two
getting out of sync, and advertising a feature without its applicable
xstate, or advertising an xstate without the appropriate feature bit.
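The mapping Andrew describes can be sketched as follows (condensed from the patch, with the featureset tests reduced to plain flags for illustration):

```c
#include <stdint.h>

#define X86_XCR0_X87    (1ULL <<  0)
#define X86_XCR0_SSE    (1ULL <<  1)
#define X86_XCR0_AVX    (1ULL <<  2)
#define X86_XCR0_BNDREG (1ULL <<  3)
#define X86_XCR0_BNDCSR (1ULL <<  4)
#define X86_XCR0_LWP    (1ULL << 62)

/* Sketch: derive the guest's xstate mask purely from its configured
 * features, so feature bits and xstates can never get out of sync. */
static uint64_t guest_xfeature_mask(int has_avx, int has_mpx, int has_lwp)
{
    uint64_t mask = X86_XCR0_X87 | X86_XCR0_SSE;

    if ( has_avx )
        mask |= X86_XCR0_AVX;
    if ( has_mpx )
        mask |= X86_XCR0_BNDREG | X86_XCR0_BNDCSR; /* one feature, two states */
    if ( has_lwp )
        mask |= X86_XCR0_LWP;

    return mask;
}
```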

>
> Btw., looking at that header again I now wonder whether it
> wouldn't have been neater to make XEN_CPUFEATURE() a
> 3-parameter macro, with word and bit specified separately
> and a default definition of
>
> #define XEN_CPUFEATURE(name, word, bit) XEN_X86_FEATURE_##name = (word) * 32 + (bit),
>
> avoiding the ugly repeated "*32" in all macro invocations. Of
> course we'd need to adjust this before we release with this new
> interface.

I'd prefer not to.  The "*32" is the expected way of reading the
constants, and providing the word and bit separately would allow someone
to do something silly by not multiplying by 32 themselves.

~Andrew
Jan Beulich April 7, 2016, 12:16 a.m. UTC | #4
>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/05/16 7:49 PM >>>
>On 31/03/16 08:48, Jan Beulich wrote:
>>>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
>>>      switch ( input[1] )
>>>      {
>>> -    case 0: 
>>> +    case 0:
>>>          /* EAX: low 32bits of xfeature_enabled_mask */
>>> -        regs[0] = info->xfeature_mask & 0xFFFFFFFF;
>>> +        regs[0] = guest_xfeature_mask;
>>>          /* EDX: high 32bits of xfeature_enabled_mask */
>>> -        regs[3] = (info->xfeature_mask >> 32) & 0xFFFFFFFF;
>>> +        regs[3] = guest_xfeature_mask >> 32;
>> ... here and ...
>>
>>>      case 1: /* leaf 1 */
>>>          regs[0] = info->featureset[featureword_of(X86_FEATURE_XSAVEOPT)];
>>> -        regs[2] &= info->xfeature_mask;
>>> -        regs[3] = 0;
>>> +        regs[2] = guest_xfeature_mask & X86_XSS_MASK;
>>> +        regs[3] = (guest_xfeature_mask >> 32) & X86_XSS_MASK;
>> ... here. Yet not by a compile time defined mask, but by using
>> (host) CPUID output: It is clear that once a bit got assigned to XCR0
>> vs XSS, it won't ever change. Hence it doesn't matter whether you
>> use the guest or host view of that split. And this will then also - other
>> than you've said before would be unavoidable - make unnecessary to
>> always update this code when new states get added.
>
>There is no possible way of avoiding having a whitelist somewhere, which
>limits what Xen will tolerate supporting for the guest.

Right, but preferably in exactly one place. And imo that ought to be
info->xfeature_mask.

Jan
Andrew Cooper April 7, 2016, 12:40 a.m. UTC | #5
On 07/04/2016 01:16, Jan Beulich wrote:
>>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/05/16 7:49 PM >>>
>> On 31/03/16 08:48, Jan Beulich wrote:
>>>>>> On 23.03.16 at 17:36, <andrew.cooper3@citrix.com> wrote:
>>>>      switch ( input[1] )
>>>>      {
>>>> -    case 0: 
>>>> +    case 0:
>>>>          /* EAX: low 32bits of xfeature_enabled_mask */
>>>> -        regs[0] = info->xfeature_mask & 0xFFFFFFFF;
>>>> +        regs[0] = guest_xfeature_mask;
>>>>          /* EDX: high 32bits of xfeature_enabled_mask */
>>>> -        regs[3] = (info->xfeature_mask >> 32) & 0xFFFFFFFF;
>>>> +        regs[3] = guest_xfeature_mask >> 32;
>>> ... here and ...
>>>
>>>>      case 1: /* leaf 1 */
>>>>          regs[0] = info->featureset[featureword_of(X86_FEATURE_XSAVEOPT)];
>>>> -        regs[2] &= info->xfeature_mask;
>>>> -        regs[3] = 0;
>>>> +        regs[2] = guest_xfeature_mask & X86_XSS_MASK;
>>>> +        regs[3] = (guest_xfeature_mask >> 32) & X86_XSS_MASK;
>>> ... here. Yet not by a compile time defined mask, but by using
>>> (host) CPUID output: It is clear that once a bit got assigned to XCR0
>>> vs XSS, it won't ever change. Hence it doesn't matter whether you
>>> use the guest or host view of that split. And this will then also - other
>>> than you've said before would be unavoidable - make unnecessary to
>>> always update this code when new states get added.
>> There is no possible way of avoiding having a whitelist somewhere, which
>> limits what Xen will tolerate supporting for the guest.
> Right, but preferably in exactly one place. And imo that ought to be
> info->xfeature_mask.

info->xfeature_mask is actually Xen's limit, as obtained from
XEN_DOMCTL_getvcpuextstate, so is an authoritative source of "the
maximum Xen will support".

However, the guest_xfeature_mask must be generated and used as this patch
does.  Without it, a domU will break if it migrates from a more capable
xstate host to a less capable one, as using info->xfeature_mask alone
leaks in state which should be levelled out.

Currently upstream, heterogeneous migration of domains using xsave is
broken if the domain first boots on the more-capable host.
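The failure mode Andrew describes can be sketched numerically (hypothetical masks, assuming host A supports LWP and host B does not):

```c
#include <stdint.h>

#define X86_XCR0_X87 (1ULL << 0)
#define X86_XCR0_SSE (1ULL << 1)
#define X86_XCR0_AVX (1ULL << 2)
#define X86_XCR0_LWP (1ULL << 62)

/* Sketch: advertising the host mask directly leaks host capabilities;
 * a guest-derived mask clamped to the host stays stable across
 * migration, provided the guest was configured for the lesser host. */
static uint64_t advertised_mask(uint64_t guest_cfg, uint64_t host)
{
    return guest_cfg & host;
}
```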

~Andrew
Jan Beulich April 7, 2016, 12:56 a.m. UTC | #6
>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/07/16 2:40 AM >>>
>On 07/04/2016 01:16, Jan Beulich wrote:
>>>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/05/16 7:49 PM >>>
>>> There is no possible way of avoiding having a whitelist somewhere, which
>>> limits what Xen will tolerate supporting for the guest.
>> Right, but preferably in exactly one place. And imo that ought to be
>> info->xfeature_mask.
>
>info->xfeature_mask is actually Xen's limit, as obtained from
>XEN_DOMCTL_getvcpuextstate, so is an authoritative source of "the
>maximum Xen will support".
>
>However, the guest_xfeature_mask must be generated and used as this
>patch.  Without it, a domU will break if it migrates from a more capable
>xstate host to a less capable host, as using info->xfeature_mask alone
>leaks in state which should be levelled out.
>
>Currently upstream, heterogeneous migration of domains using xsave is
>broken if the domain first boots on the more-capable host.

I don't follow, I'm afraid: To me this looks like two separate things. One is to
suitably level the guest (via its config file), and the other is to not allow it to
use things the host doesn't support. If you want the guest to be migratable
to a less capable host, you need to configure the guest accordingly instead
of relying on a second instance of white listing.

Jan
Andrew Cooper April 7, 2016, 11:34 a.m. UTC | #7
On 07/04/16 01:56, Jan Beulich wrote:
>>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/07/16 2:40 AM >>>
>> On 07/04/2016 01:16, Jan Beulich wrote:
>>>>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/05/16 7:49 PM >>>
>>>> There is no possible way of avoiding having a whitelist somewhere, which
>>>> limits what Xen will tolerate supporting for the guest.
>>> Right, but preferably in exactly one place. And imo that ought to be
>>> info->xfeature_mask.
>> info->xfeature_mask is actually Xen's limit, as obtained from
>> XEN_DOMCTL_getvcpuextstate, so is an authoritative source of "the
>> maximum Xen will support".
>>
>> However, the guest_xfeature_mask must be generated and used as this
>> patch.  Without it, a domU will break if it migrates from a more capable
>> xstate host to a less capable host, as using info->xfeature_mask alone
>> leaks in state which should be levelled out.
>>
>> Currently upstream, heterogeneous migration of domains using xsave is
>> broken if the domain first boots on the more-capable host.
> I don't follow, I'm afraid: To me this looks like two separate things. One is to
> suitably level the guest (via its config file), and the other is to not allow it to
> use things the host doesn't support. If you want the guest to be migratable
> to a less capable host, you need to configure the guest accordingly instead
> of relying on a second instance of white listing.

Agreed, on all points.

But I assert that my change moves the code from being broken to working,
per the above description.

I have reworded several bits for v5 - perhaps that will make the patch
more clear.

~Andrew

Patch

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index fc7e20a..cf1f6b7 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -398,54 +398,97 @@  static void intel_xc_cpuid_policy(xc_interface *xch,
     }
 }
 
+/* XSTATE bits in XCR0. */
+#define X86_XCR0_X87    (1ULL <<  0)
+#define X86_XCR0_SSE    (1ULL <<  1)
+#define X86_XCR0_AVX    (1ULL <<  2)
+#define X86_XCR0_BNDREG (1ULL <<  3)
+#define X86_XCR0_BNDCSR (1ULL <<  4)
+#define X86_XCR0_LWP    (1ULL << 62)
+
+#define X86_XSS_MASK    (0) /* No XSS states supported yet. */
+
+/* Per-component subleaf flags. */
+#define XSTATE_XSS      (1ULL <<  0)
+#define XSTATE_ALIGN64  (1ULL <<  1)
+
 /* Configure extended state enumeration leaves (0x0000000D for xsave) */
 static void xc_cpuid_config_xsave(xc_interface *xch,
                                   const struct cpuid_domain_info *info,
                                   const unsigned int *input, unsigned int *regs)
 {
-    if ( info->xfeature_mask == 0 )
+    uint64_t guest_xfeature_mask;
+
+    if ( info->xfeature_mask == 0 ||
+         !test_bit(X86_FEATURE_XSAVE, info->featureset) )
     {
         regs[0] = regs[1] = regs[2] = regs[3] = 0;
         return;
     }
 
+    guest_xfeature_mask = X86_XCR0_SSE | X86_XCR0_X87;
+
+    if ( test_bit(X86_FEATURE_AVX, info->featureset) )
+        guest_xfeature_mask |= X86_XCR0_AVX;
+
+    if ( test_bit(X86_FEATURE_MPX, info->featureset) )
+        guest_xfeature_mask |= X86_XCR0_BNDREG | X86_XCR0_BNDCSR;
+
+    if ( test_bit(X86_FEATURE_LWP, info->featureset) )
+        guest_xfeature_mask |= X86_XCR0_LWP;
+
+    /*
+     * Clamp to host mask.  Should be no-op, as guest_xfeature_mask should not
+     * be able to be calculated as larger than info->xfeature_mask.
+     *
+     * TODO - see about making this a harder error.
+     */
+    guest_xfeature_mask &= info->xfeature_mask;
+
     switch ( input[1] )
     {
-    case 0: 
+    case 0:
         /* EAX: low 32bits of xfeature_enabled_mask */
-        regs[0] = info->xfeature_mask & 0xFFFFFFFF;
+        regs[0] = guest_xfeature_mask;
         /* EDX: high 32bits of xfeature_enabled_mask */
-        regs[3] = (info->xfeature_mask >> 32) & 0xFFFFFFFF;
+        regs[3] = guest_xfeature_mask >> 32;
         /* ECX: max size required by all HW features */
         {
             unsigned int _input[2] = {0xd, 0x0}, _regs[4];
             regs[2] = 0;
-            for ( _input[1] = 2; _input[1] < 64; _input[1]++ )
+            for ( _input[1] = 2; _input[1] <= 62; _input[1]++ )
             {
                 cpuid(_input, _regs);
                 if ( (_regs[0] + _regs[1]) > regs[2] )
                     regs[2] = _regs[0] + _regs[1];
             }
         }
-        /* EBX: max size required by enabled features. 
-         * This register contains a dynamic value, which varies when a guest 
-         * enables or disables XSTATE features (via xsetbv). The default size 
-         * after reset is 576. */ 
+        /* EBX: max size required by enabled features.
+         * This register contains a dynamic value, which varies when a guest
+         * enables or disables XSTATE features (via xsetbv). The default size
+         * after reset is 576. */
         regs[1] = 512 + 64; /* FP/SSE + XSAVE.HEADER */
         break;
+
     case 1: /* leaf 1 */
         regs[0] = info->featureset[featureword_of(X86_FEATURE_XSAVEOPT)];
-        regs[2] &= info->xfeature_mask;
-        regs[3] = 0;
+        regs[2] = guest_xfeature_mask & X86_XSS_MASK;
+        regs[3] = (guest_xfeature_mask >> 32) & X86_XSS_MASK;
         break;
-    case 2 ... 63: /* sub-leaves */
-        if ( !(info->xfeature_mask & (1ULL << input[1])) )
+
+    case 2 ... 62: /* per-component sub-leaves */
+        if ( !(guest_xfeature_mask & (1ULL << input[1])) )
         {
             regs[0] = regs[1] = regs[2] = regs[3] = 0;
             break;
         }
         /* Don't touch EAX, EBX. Also cleanup ECX and EDX */
-        regs[2] = regs[3] = 0;
+        regs[2] &= XSTATE_XSS | XSTATE_ALIGN64;
+        regs[3] = 0;
+        break;
+
+    default:
+        regs[0] = regs[1] = regs[2] = regs[3] = 0;
         break;
     }
 }