diff mbox

x86/cpuid: fix dom0 crash on skylake machine

Message ID 1464757092-13177-1-git-send-email-luwei.kang@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Luwei Kang June 1, 2016, 4:58 a.m. UTC
CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will xsetbv
with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but handle_xsetbv has
ingored XSTATE_PKRU with hardware protection fault emulation, so dom0 kernel
will crash on skylake machine with PKRU support.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 xen/arch/x86/traps.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Huaitong Han June 1, 2016, 5:54 a.m. UTC | #1
On Wed, 2016-06-01 at 12:58 +0800, Luwei Kang wrote:
> CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will xsetbv

> with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but handle_xsetbv has

> ingored XSTATE_PKRU with hardware protection fault emulation, so dom0 kernel

> will crash on skylake machine with PKRU support.

> 

> Signed-off-by: Luwei Kang <luwei.kang@intel.com>

Signed-off-by: Huaitong Han <huaitong.han@intel.com>

> ---

>  xen/arch/x86/traps.c | 3 +++

>  1 file changed, 3 insertions(+)

> 

> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c

> index 1ef8401..5e72e44 100644

> --- a/xen/arch/x86/traps.c

> +++ b/xen/arch/x86/traps.c

> @@ -1100,6 +1100,9 @@ void pv_cpuid(struct cpu_user_regs *regs)

>               */

>              if ( !is_control_domain(currd) && !is_hardware_domain(currd) )

>                  cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);

> +

> +            /* PV is not supported by XSTATE_PKRU. */

> +            a &= ~XSTATE_PKRU;

>              break;

>          }

>
Jan Beulich June 1, 2016, 8:49 a.m. UTC | #2
>>> On 01.06.16 at 07:54, <huaitong.han@intel.com> wrote:
> On Wed, 2016-06-01 at 12:58 +0800, Luwei Kang wrote:
>> CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will xsetbv
>> with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but handle_xsetbv has
>> ingored XSTATE_PKRU with hardware protection fault emulation, so dom0 kernel
>> will crash on skylake machine with PKRU support.
>> 
>> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>

I don't understand this: Did you mean Reviewed-by? Or else did
Luwei forget to mention your co-authorship (albeit co-authoring
on this small a patch seems unlikely), or the patch having flown
from you to him or the other way around?

Jan
Huaitong Han June 1, 2016, 9 a.m. UTC | #3
On Wed, 2016-06-01 at 02:49 -0600, Jan Beulich wrote:
> >>> On 01.06.16 at 07:54, <huaitong.han@intel.com> wrote:

> > On Wed, 2016-06-01 at 12:58 +0800, Luwei Kang wrote:

> >> CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will xsetbv

> >> with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but handle_xsetbv has

> >> ingored XSTATE_PKRU with hardware protection fault emulation, so dom0 kernel

> >> will crash on skylake machine with PKRU support.

> >> 

> >> Signed-off-by: Luwei Kang <luwei.kang@intel.com>

> > Signed-off-by: Huaitong Han <huaitong.han@intel.com>

> 

> I don't understand this: Did you mean Reviewed-by? Or else did

> Luwei forget to mention your co-authorship (albeit co-authoring

> on this small a patch seems unlikely), or the patch having flown

> from you to him or the other way around?

Reviewed-by: Huaitong Han <huaitong.han@intel.com>

> 

> Jan

>
Jan Beulich June 1, 2016, 9:04 a.m. UTC | #4
>>> On 01.06.16 at 06:58, <luwei.kang@intel.com> wrote:
> CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will xsetbv
> with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but handle_xsetbv has
> ingored XSTATE_PKRU with hardware protection fault emulation, so dom0 kernel
> will crash on skylake machine with PKRU support.

I have some difficulty following this description (albeit I think I see
what is going wrong): handle_xsetbv() doesn't _ignore_
XSTATE_PKRU for PV guests, it _fails_ in that case. And along those
lines the patch title should also be a little more specific.

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1100,6 +1100,9 @@ void pv_cpuid(struct cpu_user_regs *regs)
>               */
>              if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
>                  cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
> +
> +            /* PV is not supported by XSTATE_PKRU. */
> +            a &= ~XSTATE_PKRU;

Okay, so this is the trivial immediate fix to deal with the problem. Did
you, however, think about it in some more generic terms? For
example, MPX isn't supported for PV guests either, so
handle_xsetbv() should refuse requests to set the respective two
bits too. Which in turn would call for abstracting things via a new
#define in xstate.h.

Yet taking one more step, HVM guests may have PKRU and MPX
(and in fact any other of the features connected to the various
XSTATE_* bits) disabled too, in which case requests to enable
those in XCR0 should be refused too. Which in turn gets me to
ask how Dom0 actually learned of (in your case) XCR0.PKRU
being modifiable: Andrew's new CPUID handling should, I would
hope, make the XSTATE leaf not report any unavailable features.
And remember that PV Dom0 is _required_ to use the PV CPUID
mechanism to obtain available features, so if I am ti trust the
initial part of your description, the bug really is in Dom0 (and no
hypervisor change is necessary at all). Please clarify.

Jan
diff mbox

Patch

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 1ef8401..5e72e44 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1100,6 +1100,9 @@  void pv_cpuid(struct cpu_user_regs *regs)
              */
             if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
                 cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
+
+            /* PV is not supported by XSTATE_PKRU. */
+            a &= ~XSTATE_PKRU;
             break;
         }