[v2] x86/amd: Avoid cpu_has_hypervisor evaluating true on native hardware
diff mbox series

Message ID 20200211155155.17396-1-andrew.cooper3@citrix.com
State New
Headers show
Series
  • [v2] x86/amd: Avoid cpu_has_hypervisor evaluating true on native hardware
Related show

Commit Message

Andrew Cooper Feb. 11, 2020, 3:51 p.m. UTC
Currently when booting native on AMD hardware, cpuidmask_defaults._1cd gets
configured with the HYPERVISOR bit before native CPUID is scanned for feature
bits.

This results in cpu_has_hypervisor becoming set as part of identify_cpu(), and
ends up appearing in the raw and host CPU policies.

A combination of this bug, and c/s bb502a8ca59 "x86: check feature flags after
resume" which checks that feature bits don't go missing, results in broken S3
on AMD hardware.

Alter amd_init_levelling() to exclude the HYPERVISOR bit from
cpumask_defaults, and update domain_cpu_policy_changed() to allow it to be
explicitly forwarded.

This also fixes a bug on kexec, where the hypervisor bit is left enabled for
the new kernel to find.

These changes highlight a further but - dom0 construction is asymetric with
domU construction, by not having any calls to domain_cpu_policy_changed().
Extend arch_domain_create() to always call domain_cpu_policy_changed().

Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wl@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Igor Druzhinin <igor.druzhinin@citrix.com>
CC: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
CC: Claudia <claudia1@disroot.org>

v2:
 * Rewrite the commit message.  No change to the patch content.

Marek/Claudia: Do either of you want a Reported-by tag seeing as you found a
brand new way that this was broken?
---
 xen/arch/x86/cpu/amd.c       | 3 ---
 xen/arch/x86/domain.c        | 2 ++
 xen/arch/x86/domctl.c        | 9 ++++++++-
 xen/include/asm-x86/domain.h | 2 ++
 4 files changed, 12 insertions(+), 4 deletions(-)

Comments

Roger Pau Monné Feb. 11, 2020, 4:31 p.m. UTC | #1
On Tue, Feb 11, 2020 at 03:51:54PM +0000, Andrew Cooper wrote:
> Currently when booting native on AMD hardware, cpuidmask_defaults._1cd gets
> configured with the HYPERVISOR bit before native CPUID is scanned for feature
> bits.
> 
> This results in cpu_has_hypervisor becoming set as part of identify_cpu(), and
> ends up appearing in the raw and host CPU policies.
> 
> A combination of this bug, and c/s bb502a8ca59 "x86: check feature flags after
> resume" which checks that feature bits don't go missing, results in broken S3
> on AMD hardware.
> 
> Alter amd_init_levelling() to exclude the HYPERVISOR bit from
> cpumask_defaults, and update domain_cpu_policy_changed() to allow it to be
> explicitly forwarded.
> 
> This also fixes a bug on kexec, where the hypervisor bit is left enabled for
> the new kernel to find.
> 
> These changes highlight a further but - dom0 construction is asymetric with
> domU construction, by not having any calls to domain_cpu_policy_changed().
> Extend arch_domain_create() to always call domain_cpu_policy_changed().
> 
> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Wei Liu <wl@xen.org>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Igor Druzhinin <igor.druzhinin@citrix.com>
> CC: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> CC: Claudia <claudia1@disroot.org>
> 
> v2:
>  * Rewrite the commit message.  No change to the patch content.
> 
> Marek/Claudia: Do either of you want a Reported-by tag seeing as you found a
> brand new way that this was broken?
> ---
>  xen/arch/x86/cpu/amd.c       | 3 ---
>  xen/arch/x86/domain.c        | 2 ++
>  xen/arch/x86/domctl.c        | 9 ++++++++-
>  xen/include/asm-x86/domain.h | 2 ++
>  4 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
> index e351dd227f..f95a8e0fd3 100644
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -298,9 +298,6 @@ static void __init noinline amd_init_levelling(void)
>  			ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
>  		edx |= cpufeat_mask(X86_FEATURE_APIC);
>  
> -		/* Allow the HYPERVISOR bit to be set via guest policy. */
> -		ecx |= cpufeat_mask(X86_FEATURE_HYPERVISOR);
> -
>  		cpuidmask_defaults._1cd = ((uint64_t)ecx << 32) | edx;
>  	}
>  
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index f53ae5ff86..12bd554391 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -656,6 +656,8 @@ int arch_domain_create(struct domain *d,
>       */
>      d->arch.x87_fip_width = cpu_has_fpu_sel ? 0 : 8;
>  
> +    domain_cpu_policy_changed(d);
> +
>      return 0;
>  
>   fail:
> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> index 4fa9c91140..ce76d6d776 100644
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -48,7 +48,7 @@ static int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
>  }
>  #endif
>  
> -static void domain_cpu_policy_changed(struct domain *d)
> +void domain_cpu_policy_changed(struct domain *d)
>  {
>      const struct cpuid_policy *p = d->arch.cpuid;
>      struct vcpu *v;
> @@ -106,6 +106,13 @@ static void domain_cpu_policy_changed(struct domain *d)
>                      ecx = 0;
>                  edx = cpufeat_mask(X86_FEATURE_APIC);
>  
> +                /*
> +                 * If the Hypervisor bit is set in the policy, we can also
> +                 * forward it into real CPUID.
> +                 */
> +                if ( p->basic.hypervisor )
> +                    ecx |= cpufeat_mask(X86_FEATURE_HYPERVISOR);

AFAICT dom0 will also get the hypervisor bit set by default, as that's
part of both the HVM and the PV max policy?

If so:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.
Andrew Cooper Feb. 11, 2020, 4:41 p.m. UTC | #2
On 11/02/2020 16:31, Roger Pau Monné wrote:
> On Tue, Feb 11, 2020 at 03:51:54PM +0000, Andrew Cooper wrote:
>> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
>> index 4fa9c91140..ce76d6d776 100644
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -48,7 +48,7 @@ static int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
>>  }
>>  #endif
>>  
>> -static void domain_cpu_policy_changed(struct domain *d)
>> +void domain_cpu_policy_changed(struct domain *d)
>>  {
>>      const struct cpuid_policy *p = d->arch.cpuid;
>>      struct vcpu *v;
>> @@ -106,6 +106,13 @@ static void domain_cpu_policy_changed(struct domain *d)
>>                      ecx = 0;
>>                  edx = cpufeat_mask(X86_FEATURE_APIC);
>>  
>> +                /*
>> +                 * If the Hypervisor bit is set in the policy, we can also
>> +                 * forward it into real CPUID.
>> +                 */
>> +                if ( p->basic.hypervisor )
>> +                    ecx |= cpufeat_mask(X86_FEATURE_HYPERVISOR);
> AFAICT dom0 will also get the hypervisor bit set by default, as that's
> part of both the HVM and the PV max policy?

Correct.

>
> If so:
>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

~Andrew
Jan Beulich Feb. 11, 2020, 4:59 p.m. UTC | #3
On 11.02.2020 17:31, Roger Pau Monné wrote:
> On Tue, Feb 11, 2020 at 03:51:54PM +0000, Andrew Cooper wrote:
>> Currently when booting native on AMD hardware, cpuidmask_defaults._1cd gets
>> configured with the HYPERVISOR bit before native CPUID is scanned for feature
>> bits.
>>
>> This results in cpu_has_hypervisor becoming set as part of identify_cpu(), and
>> ends up appearing in the raw and host CPU policies.
>>
>> A combination of this bug, and c/s bb502a8ca59 "x86: check feature flags after
>> resume" which checks that feature bits don't go missing, results in broken S3
>> on AMD hardware.
>>
>> Alter amd_init_levelling() to exclude the HYPERVISOR bit from
>> cpumask_defaults, and update domain_cpu_policy_changed() to allow it to be
>> explicitly forwarded.
>>
>> This also fixes a bug on kexec, where the hypervisor bit is left enabled for
>> the new kernel to find.
>>
>> These changes highlight a further but - dom0 construction is asymetric with
>> domU construction, by not having any calls to domain_cpu_policy_changed().
>> Extend arch_domain_create() to always call domain_cpu_policy_changed().
>>
>> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Wei Liu <wl@xen.org>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> CC: Igor Druzhinin <igor.druzhinin@citrix.com>
>> CC: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
>> CC: Claudia <claudia1@disroot.org>
>>
>> v2:
>>  * Rewrite the commit message.  No change to the patch content.
>>
>> Marek/Claudia: Do either of you want a Reported-by tag seeing as you found a
>> brand new way that this was broken?

I understand this is addressing only one half of their issue. Since
you said you don't find it surprising, do you have any idea why the
OSXSAVE bit is behaving differently on AMD and on Intel?

>> @@ -106,6 +106,13 @@ static void domain_cpu_policy_changed(struct domain *d)
>>                      ecx = 0;
>>                  edx = cpufeat_mask(X86_FEATURE_APIC);
>>  
>> +                /*
>> +                 * If the Hypervisor bit is set in the policy, we can also
>> +                 * forward it into real CPUID.
>> +                 */
>> +                if ( p->basic.hypervisor )
>> +                    ecx |= cpufeat_mask(X86_FEATURE_HYPERVISOR);
> 
> AFAICT dom0 will also get the hypervisor bit set by default, as that's
> part of both the HVM and the PV max policy?
> 
> If so:
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper Feb. 11, 2020, 5:16 p.m. UTC | #4
On 11/02/2020 16:59, Jan Beulich wrote:
> On 11.02.2020 17:31, Roger Pau Monné wrote:
>> On Tue, Feb 11, 2020 at 03:51:54PM +0000, Andrew Cooper wrote:
>>> Currently when booting native on AMD hardware, cpuidmask_defaults._1cd gets
>>> configured with the HYPERVISOR bit before native CPUID is scanned for feature
>>> bits.
>>>
>>> This results in cpu_has_hypervisor becoming set as part of identify_cpu(), and
>>> ends up appearing in the raw and host CPU policies.
>>>
>>> A combination of this bug, and c/s bb502a8ca59 "x86: check feature flags after
>>> resume" which checks that feature bits don't go missing, results in broken S3
>>> on AMD hardware.
>>>
>>> Alter amd_init_levelling() to exclude the HYPERVISOR bit from
>>> cpumask_defaults, and update domain_cpu_policy_changed() to allow it to be
>>> explicitly forwarded.
>>>
>>> This also fixes a bug on kexec, where the hypervisor bit is left enabled for
>>> the new kernel to find.
>>>
>>> These changes highlight a further but - dom0 construction is asymetric with
>>> domU construction, by not having any calls to domain_cpu_policy_changed().
>>> Extend arch_domain_create() to always call domain_cpu_policy_changed().
>>>
>>> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> ---
>>> CC: Jan Beulich <JBeulich@suse.com>
>>> CC: Wei Liu <wl@xen.org>
>>> CC: Roger Pau Monné <roger.pau@citrix.com>
>>> CC: Igor Druzhinin <igor.druzhinin@citrix.com>
>>> CC: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
>>> CC: Claudia <claudia1@disroot.org>
>>>
>>> v2:
>>>  * Rewrite the commit message.  No change to the patch content.
>>>
>>> Marek/Claudia: Do either of you want a Reported-by tag seeing as you found a
>>> brand new way that this was broken?
> I understand this is addressing only one half of their issue. Since
> you said you don't find it surprising, do you have any idea why the
> OSXSAVE bit is behaving differently on AMD and on Intel?

It isn't behaving differently between Intel and AMD, I don't think.

The diagnostics are asymmetric - they ever printed when a feature
disappears, not for a feature appearing.

OSXSAVE is clear until fairly late on boot (therefore misses being
cached), but is restored as part of mmu_cr4_features (before the feature
check).

The only reason anything gets printed in the first place is because the
HYPERVISOR bit disappeared.

Overall, OSXSAVE (and in principle OSPKE if we start supporting it in PV
guests) are benign.  We could filter them out of the diagnostics but
don't currently have a suitable featuremask, and I'm not sure adding one
is worth it.

>
>>> @@ -106,6 +106,13 @@ static void domain_cpu_policy_changed(struct domain *d)
>>>                      ecx = 0;
>>>                  edx = cpufeat_mask(X86_FEATURE_APIC);
>>>  
>>> +                /*
>>> +                 * If the Hypervisor bit is set in the policy, we can also
>>> +                 * forward it into real CPUID.
>>> +                 */
>>> +                if ( p->basic.hypervisor )
>>> +                    ecx |= cpufeat_mask(X86_FEATURE_HYPERVISOR);
>> AFAICT dom0 will also get the hypervisor bit set by default, as that's
>> part of both the HVM and the PV max policy?
>>
>> If so:
>>
>> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>

Thanks.

~Andrew
Jan Beulich Feb. 12, 2020, 8:06 a.m. UTC | #5
On 11.02.2020 18:16, Andrew Cooper wrote:
> On 11/02/2020 16:59, Jan Beulich wrote:
>> On 11.02.2020 17:31, Roger Pau Monné wrote:
>>> On Tue, Feb 11, 2020 at 03:51:54PM +0000, Andrew Cooper wrote:
>>>> Currently when booting native on AMD hardware, cpuidmask_defaults._1cd gets
>>>> configured with the HYPERVISOR bit before native CPUID is scanned for feature
>>>> bits.
>>>>
>>>> This results in cpu_has_hypervisor becoming set as part of identify_cpu(), and
>>>> ends up appearing in the raw and host CPU policies.
>>>>
>>>> A combination of this bug, and c/s bb502a8ca59 "x86: check feature flags after
>>>> resume" which checks that feature bits don't go missing, results in broken S3
>>>> on AMD hardware.
>>>>
>>>> Alter amd_init_levelling() to exclude the HYPERVISOR bit from
>>>> cpumask_defaults, and update domain_cpu_policy_changed() to allow it to be
>>>> explicitly forwarded.
>>>>
>>>> This also fixes a bug on kexec, where the hypervisor bit is left enabled for
>>>> the new kernel to find.
>>>>
>>>> These changes highlight a further but - dom0 construction is asymetric with
>>>> domU construction, by not having any calls to domain_cpu_policy_changed().
>>>> Extend arch_domain_create() to always call domain_cpu_policy_changed().
>>>>
>>>> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> ---
>>>> CC: Jan Beulich <JBeulich@suse.com>
>>>> CC: Wei Liu <wl@xen.org>
>>>> CC: Roger Pau Monné <roger.pau@citrix.com>
>>>> CC: Igor Druzhinin <igor.druzhinin@citrix.com>
>>>> CC: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
>>>> CC: Claudia <claudia1@disroot.org>
>>>>
>>>> v2:
>>>>  * Rewrite the commit message.  No change to the patch content.
>>>>
>>>> Marek/Claudia: Do either of you want a Reported-by tag seeing as you found a
>>>> brand new way that this was broken?
>> I understand this is addressing only one half of their issue. Since
>> you said you don't find it surprising, do you have any idea why the
>> OSXSAVE bit is behaving differently on AMD and on Intel?
> 
> It isn't behaving differently between Intel and AMD, I don't think.
> 
> The diagnostics are asymmetric - they ever printed when a feature
> disappears, not for a feature appearing.

Oh, I see - this is the part I was missing.

Jan

Patch
diff mbox series

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index e351dd227f..f95a8e0fd3 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -298,9 +298,6 @@  static void __init noinline amd_init_levelling(void)
 			ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
 		edx |= cpufeat_mask(X86_FEATURE_APIC);
 
-		/* Allow the HYPERVISOR bit to be set via guest policy. */
-		ecx |= cpufeat_mask(X86_FEATURE_HYPERVISOR);
-
 		cpuidmask_defaults._1cd = ((uint64_t)ecx << 32) | edx;
 	}
 
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index f53ae5ff86..12bd554391 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -656,6 +656,8 @@  int arch_domain_create(struct domain *d,
      */
     d->arch.x87_fip_width = cpu_has_fpu_sel ? 0 : 8;
 
+    domain_cpu_policy_changed(d);
+
     return 0;
 
  fail:
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 4fa9c91140..ce76d6d776 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -48,7 +48,7 @@  static int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
 }
 #endif
 
-static void domain_cpu_policy_changed(struct domain *d)
+void domain_cpu_policy_changed(struct domain *d)
 {
     const struct cpuid_policy *p = d->arch.cpuid;
     struct vcpu *v;
@@ -106,6 +106,13 @@  static void domain_cpu_policy_changed(struct domain *d)
                     ecx = 0;
                 edx = cpufeat_mask(X86_FEATURE_APIC);
 
+                /*
+                 * If the Hypervisor bit is set in the policy, we can also
+                 * forward it into real CPUID.
+                 */
+                if ( p->basic.hypervisor )
+                    ecx |= cpufeat_mask(X86_FEATURE_HYPERVISOR);
+
                 mask |= ((uint64_t)ecx << 32) | edx;
                 break;
             }
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index f0c25ffec0..1843c76d1a 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -624,6 +624,8 @@  struct guest_memory_policy
 void update_guest_memory_policy(struct vcpu *v,
                                 struct guest_memory_policy *policy);
 
+void domain_cpu_policy_changed(struct domain *d);
+
 bool update_runstate_area(struct vcpu *);
 bool update_secondary_system_time(struct vcpu *,
                                   struct vcpu_time_info *);