diff mbox series

x86/cpu-policy: Fix visibility of HTT/CMP_LEGACY in max policies

Message ID 20240301112829.2657865-1-andrew.cooper3@citrix.com (mailing list archive)
State New
Headers show
Series x86/cpu-policy: Fix visibility of HTT/CMP_LEGACY in max policies | expand

Commit Message

Andrew Cooper March 1, 2024, 11:28 a.m. UTC
The block in recalculate_cpuid_policy() predates the proper split between
default and max policies, and was a "slightly max for a toolstack which knows
about it" capability.  It didn't get transformed properly in Xen 4.14.

Because Xen will accept a VM with HTT/CMP_LEGACY seen, they should be visible
in the max polices.  Keep the default policy matching host settings.

This manifested as an incorrectly-rejected migration across XenServer's Xen
4.13 -> 4.17 upgrade, as Xapi is slowly growing the logic to check a VM
against the target max policy.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/cpu-policy.c | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

Comments

Roger Pau Monne March 1, 2024, 12:30 p.m. UTC | #1
On Fri, Mar 01, 2024 at 11:28:29AM +0000, Andrew Cooper wrote:
> The block in recalculate_cpuid_policy() predates the proper split between
> default and max policies, and was a "slightly max for a toolstack which knows
> about it" capability.  It didn't get transformed properly in Xen 4.14.
> 
> Because Xen will accept a VM with HTT/CMP_LEGACY seen, they should be visible
> in the max polices.  Keep the default policy matching host settings.
> 
> This manifested as an incorrectly-rejected migration across XenServer's Xen
> 4.13 -> 4.17 upgrade, as Xapi is slowly growing the logic to check a VM
> against the target max policy.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

I have one question below.

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> ---
>  xen/arch/x86/cpu-policy.c | 29 ++++++++++++++++++++++-------
>  1 file changed, 22 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
> index c9b32bc17849..4f558e502e01 100644
> --- a/xen/arch/x86/cpu-policy.c
> +++ b/xen/arch/x86/cpu-policy.c
> @@ -464,6 +464,16 @@ static void __init guest_common_max_feature_adjustments(uint32_t *fs)
>               raw_cpu_policy.feat.clwb )
>              __set_bit(X86_FEATURE_CLWB, fs);
>      }
> +
> +    /*
> +     * Topology information inside the guest is entirely at the toolstack's
> +     * disgression, and bears no relationship to the host we're running on.
> +     *
> +     * HTT identifies p->basic.lppp as valid
> +     * CMP_LEGACY identifies p->extd.nc as valid
> +     */
> +    __set_bit(X86_FEATURE_HTT, fs);
> +    __set_bit(X86_FEATURE_CMP_LEGACY, fs);
>  }
>  
>  static void __init guest_common_default_feature_adjustments(uint32_t *fs)
> @@ -514,6 +524,18 @@ static void __init guest_common_default_feature_adjustments(uint32_t *fs)
>              __clear_bit(X86_FEATURE_CLWB, fs);
>      }
>  
> +    /*
> +     * Topology information is at the toolstack's discretion so these are
> +     * unconditionally set in max, but pick a default which matches the host.
> +     */
> +    __clear_bit(X86_FEATURE_HTT, fs);
> +    if ( cpu_has_htt )
> +        __set_bit(X86_FEATURE_HTT, fs);
> +
> +    __clear_bit(X86_FEATURE_CMP_LEGACY, fs);
> +    if ( cpu_has_cmp_legacy )
> +        __set_bit(X86_FEATURE_CMP_LEGACY, fs);

Not that I oppose to it, but does it make sense to expose this options
to PV guests by default?  Those guests don't really have an APIC ID,
and I think we don't expect any of the topology information to be
usable by them in the first place.

Thanks, Roger.
Andrew Cooper March 1, 2024, 3:24 p.m. UTC | #2
On 01/03/2024 12:30 pm, Roger Pau Monné wrote:
> On Fri, Mar 01, 2024 at 11:28:29AM +0000, Andrew Cooper wrote:
>> The block in recalculate_cpuid_policy() predates the proper split between
>> default and max policies, and was a "slightly max for a toolstack which knows
>> about it" capability.  It didn't get transformed properly in Xen 4.14.
>>
>> Because Xen will accept a VM with HTT/CMP_LEGACY seen, they should be visible
>> in the max polices.  Keep the default policy matching host settings.
>>
>> This manifested as an incorrectly-rejected migration across XenServer's Xen
>> 4.13 -> 4.17 upgrade, as Xapi is slowly growing the logic to check a VM
>> against the target max policy.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>
> I have one question below.
>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>> CC: Wei Liu <wl@xen.org>
>> ---
>>  xen/arch/x86/cpu-policy.c | 29 ++++++++++++++++++++++-------
>>  1 file changed, 22 insertions(+), 7 deletions(-)
>>
>> diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
>> index c9b32bc17849..4f558e502e01 100644
>> --- a/xen/arch/x86/cpu-policy.c
>> +++ b/xen/arch/x86/cpu-policy.c
>> @@ -464,6 +464,16 @@ static void __init guest_common_max_feature_adjustments(uint32_t *fs)
>>               raw_cpu_policy.feat.clwb )
>>              __set_bit(X86_FEATURE_CLWB, fs);
>>      }
>> +
>> +    /*
>> +     * Topology information inside the guest is entirely at the toolstack's
>> +     * disgression, and bears no relationship to the host we're running on.
>> +     *
>> +     * HTT identifies p->basic.lppp as valid
>> +     * CMP_LEGACY identifies p->extd.nc as valid
>> +     */
>> +    __set_bit(X86_FEATURE_HTT, fs);
>> +    __set_bit(X86_FEATURE_CMP_LEGACY, fs);
>>  }
>>  
>>  static void __init guest_common_default_feature_adjustments(uint32_t *fs)
>> @@ -514,6 +524,18 @@ static void __init guest_common_default_feature_adjustments(uint32_t *fs)
>>              __clear_bit(X86_FEATURE_CLWB, fs);
>>      }
>>  
>> +    /*
>> +     * Topology information is at the toolstack's discretion so these are
>> +     * unconditionally set in max, but pick a default which matches the host.
>> +     */
>> +    __clear_bit(X86_FEATURE_HTT, fs);
>> +    if ( cpu_has_htt )
>> +        __set_bit(X86_FEATURE_HTT, fs);
>> +
>> +    __clear_bit(X86_FEATURE_CMP_LEGACY, fs);
>> +    if ( cpu_has_cmp_legacy )
>> +        __set_bit(X86_FEATURE_CMP_LEGACY, fs);
> Not that I oppose to it, but does it make sense to expose this options
> to PV guests by default?  Those guests don't really have an APIC ID,
> and I think we don't expect any of the topology information to be
> usable by them in the first place.

This patch doesn't alter what PV or HVM guests see by default.  This
hunk only counteracts the prior hunk forcing visibility of the two bits
in the max policy.

Whether it is sensible for PV to see this is a different matter, and
it's actually very complicated.

On systems without CPUID Faulting, we have to contend with PV guests
seeing mostly host data, whatever Xen would prefer.

With CPUID Masking, we can hide (Intel) or hide/set (AMD) these bits in
the feature leaves, but we can never stop the host value leaking into
lppp/nc/etc.

For better or worse, when the toolstack is divining a policy for PV
guests, it also chooses the host HTT/CMP_LEGACY values.

We should reconsider all of this when we've got topology working
sensibly for HVM guests.

~Andrew
Jan Beulich March 4, 2024, 8:42 a.m. UTC | #3
On 01.03.2024 12:28, Andrew Cooper wrote:
> --- a/xen/arch/x86/cpu-policy.c
> +++ b/xen/arch/x86/cpu-policy.c
> @@ -464,6 +464,16 @@ static void __init guest_common_max_feature_adjustments(uint32_t *fs)
>               raw_cpu_policy.feat.clwb )
>              __set_bit(X86_FEATURE_CLWB, fs);
>      }
> +
> +    /*
> +     * Topology information inside the guest is entirely at the toolstack's
> +     * disgression, and bears no relationship to the host we're running on.
> +     *
> +     * HTT identifies p->basic.lppp as valid
> +     * CMP_LEGACY identifies p->extd.nc as valid
> +     */
> +    __set_bit(X86_FEATURE_HTT, fs);
> +    __set_bit(X86_FEATURE_CMP_LEGACY, fs);
>  }
>  
>  static void __init guest_common_default_feature_adjustments(uint32_t *fs)
> @@ -514,6 +524,18 @@ static void __init guest_common_default_feature_adjustments(uint32_t *fs)
>              __clear_bit(X86_FEATURE_CLWB, fs);
>      }
>  
> +    /*
> +     * Topology information is at the toolstack's discretion so these are
> +     * unconditionally set in max, but pick a default which matches the host.
> +     */
> +    __clear_bit(X86_FEATURE_HTT, fs);
> +    if ( cpu_has_htt )
> +        __set_bit(X86_FEATURE_HTT, fs);
> +
> +    __clear_bit(X86_FEATURE_CMP_LEGACY, fs);
> +    if ( cpu_has_cmp_legacy )
> +        __set_bit(X86_FEATURE_CMP_LEGACY, fs);

When running on a host with the respective bit clear, won't this break
(at least older) Linux'es logic? Iirc the unconditional setting of at
least HTT was tied to the unconditional multiplying by 2 of the vCPU ID
to derive the (fake) APIC ID.

Jan
Andrew Cooper March 4, 2024, 1:30 p.m. UTC | #4
On 04/03/2024 8:42 am, Jan Beulich wrote:
> On 01.03.2024 12:28, Andrew Cooper wrote:
>> --- a/xen/arch/x86/cpu-policy.c
>> +++ b/xen/arch/x86/cpu-policy.c
>> @@ -464,6 +464,16 @@ static void __init guest_common_max_feature_adjustments(uint32_t *fs)
>>               raw_cpu_policy.feat.clwb )
>>              __set_bit(X86_FEATURE_CLWB, fs);
>>      }
>> +
>> +    /*
>> +     * Topology information inside the guest is entirely at the toolstack's
>> +     * disgression, and bears no relationship to the host we're running on.
>> +     *
>> +     * HTT identifies p->basic.lppp as valid
>> +     * CMP_LEGACY identifies p->extd.nc as valid
>> +     */
>> +    __set_bit(X86_FEATURE_HTT, fs);
>> +    __set_bit(X86_FEATURE_CMP_LEGACY, fs);
>>  }
>>  
>>  static void __init guest_common_default_feature_adjustments(uint32_t *fs)
>> @@ -514,6 +524,18 @@ static void __init guest_common_default_feature_adjustments(uint32_t *fs)
>>              __clear_bit(X86_FEATURE_CLWB, fs);
>>      }
>>  
>> +    /*
>> +     * Topology information is at the toolstack's discretion so these are
>> +     * unconditionally set in max, but pick a default which matches the host.
>> +     */
>> +    __clear_bit(X86_FEATURE_HTT, fs);
>> +    if ( cpu_has_htt )
>> +        __set_bit(X86_FEATURE_HTT, fs);
>> +
>> +    __clear_bit(X86_FEATURE_CMP_LEGACY, fs);
>> +    if ( cpu_has_cmp_legacy )
>> +        __set_bit(X86_FEATURE_CMP_LEGACY, fs);
> When running on a host with the respective bit clear, won't this break
> (at least older) Linux'es logic? Iirc the unconditional setting of at
> least HTT was tied to the unconditional multiplying by 2 of the vCPU ID
> to derive the (fake) APIC ID.

This patch doesn't change the default at all.

All it does is expose (properly) in the max policy what Xen would
tolerate the toolstack doing blindly.

If there are problems in certain configurations, then they continue to
be problems.

Although I'll note that the unconditional multiplying by 2 was never
about hyper-threading - Alejandro did some archaeology and found out
that it was an LAPIC vs IOAPIC error in vmxloader.  The connection to HT
has just been bad guesswork since.

~Andrew
diff mbox series

Patch

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index c9b32bc17849..4f558e502e01 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -464,6 +464,16 @@  static void __init guest_common_max_feature_adjustments(uint32_t *fs)
              raw_cpu_policy.feat.clwb )
             __set_bit(X86_FEATURE_CLWB, fs);
     }
+
+    /*
+     * Topology information inside the guest is entirely at the toolstack's
+     * disgression, and bears no relationship to the host we're running on.
+     *
+     * HTT identifies p->basic.lppp as valid
+     * CMP_LEGACY identifies p->extd.nc as valid
+     */
+    __set_bit(X86_FEATURE_HTT, fs);
+    __set_bit(X86_FEATURE_CMP_LEGACY, fs);
 }
 
 static void __init guest_common_default_feature_adjustments(uint32_t *fs)
@@ -514,6 +524,18 @@  static void __init guest_common_default_feature_adjustments(uint32_t *fs)
             __clear_bit(X86_FEATURE_CLWB, fs);
     }
 
+    /*
+     * Topology information is at the toolstack's discretion so these are
+     * unconditionally set in max, but pick a default which matches the host.
+     */
+    __clear_bit(X86_FEATURE_HTT, fs);
+    if ( cpu_has_htt )
+        __set_bit(X86_FEATURE_HTT, fs);
+
+    __clear_bit(X86_FEATURE_CMP_LEGACY, fs);
+    if ( cpu_has_cmp_legacy )
+        __set_bit(X86_FEATURE_CMP_LEGACY, fs);
+
     /*
      * On certain hardware, speculative or errata workarounds can result in
      * TSX being placed in "force-abort" mode, where it doesn't actually
@@ -861,13 +883,6 @@  void recalculate_cpuid_policy(struct domain *d)
         }
     }
 
-    /*
-     * Allow the toolstack to set HTT and CMP_LEGACY.  These bits
-     * affect how to interpret topology information in other cpuid leaves.
-     */
-    __set_bit(X86_FEATURE_HTT, max_fs);
-    __set_bit(X86_FEATURE_CMP_LEGACY, max_fs);
-
     /*
      * 32bit PV domains can't use any Long Mode features, and cannot use
      * SYSCALL on non-AMD hardware.