[2/3] KVM: x86: hyper-v: set NoNonArchitecturalCoreSharing CPUID bit when SMT is impossible
diff mbox series

Message ID 20190916162258.6528-3-vkuznets@redhat.com
State New
Headers show
Series
  • KVM: x86: hyper-v: make L2 Hyper-V 2019 on KVM guests see MD_CLEAR
Related show

Commit Message

Vitaly Kuznetsov Sept. 16, 2019, 4:22 p.m. UTC
Hyper-V 2019 doesn't expose MD_CLEAR CPUID bit to guests when it cannot
guarantee that two virtual processors won't end up running on sibling SMT
threads without knowing about it. This is done as an optimization as in
this case there is nothing the guest can do to protect itself against MDS
and issuing additional flush requests is just pointless. On bare metal the
topology is known, however, when Hyper-V is running nested (e.g. on top of
KVM) it needs an additional piece of information: a confirmation that the
exposed topology (wrt vCPU placement on different SMT threads) is
trustworthy.

NoNonArchitecturalCoreSharing (CPUID 0x40000004 EAX bit 18) is described in
TLFS as follows: "Indicates that a virtual processor will never share a
physical core with another virtual processor, except for virtual processors
that are reported as sibling SMT threads." From KVM we can give such
guarantee in two cases:
- SMT is unsupported or forcefully disabled (just 'disabled' doesn't work
 as it can become re-enabled during the lifetime of the guest).
- vCPUs are properly pinned so the scheduler won't put them on sibling
SMT threads (when they're not reported as such).

This patch reports NoNonArchitecturalCoreSharing bit in to userspace in the
first case. The second case is outside of KVM's domain of responsibility
(as vCPU pinning is actually done by someone who manages KVM's userspace -
e.g. libvirt pinning QEMU threads).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/hyperv-tlfs.h | 7 +++++++
 arch/x86/kvm/hyperv.c              | 4 +++-
 2 files changed, 10 insertions(+), 1 deletion(-)

Comments

Jim Mattson Sept. 16, 2019, 4:34 p.m. UTC | #1
On Mon, Sep 16, 2019 at 9:23 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Hyper-V 2019 doesn't expose MD_CLEAR CPUID bit to guests when it cannot
> guarantee that two virtual processors won't end up running on sibling SMT
> threads without knowing about it. This is done as an optimization as in
> this case there is nothing the guest can do to protect itself against MDS
> and issuing additional flush requests is just pointless. On bare metal the
> topology is known, however, when Hyper-V is running nested (e.g. on top of
> KVM) it needs an additional piece of information: a confirmation that the
> exposed topology (wrt vCPU placement on different SMT threads) is
> trustworthy.
>
> NoNonArchitecturalCoreSharing (CPUID 0x40000004 EAX bit 18) is described in
> TLFS as follows: "Indicates that a virtual processor will never share a
> physical core with another virtual processor, except for virtual processors
> that are reported as sibling SMT threads." From KVM we can give such
> guarantee in two cases:
> - SMT is unsupported or forcefully disabled (just 'disabled' doesn't work
>  as it can become re-enabled during the lifetime of the guest).
> - vCPUs are properly pinned so the scheduler won't put them on sibling
> SMT threads (when they're not reported as such).

That's a nice bit of information. Have you considered a mechanism for
communicating this information to kvm guests in a way that doesn't
require Hyper-V enlightenments?

> This patch reports NoNonArchitecturalCoreSharing bit in to userspace in the
> first case. The second case is outside of KVM's domain of responsibility
> (as vCPU pinning is actually done by someone who manages KVM's userspace -
> e.g. libvirt pinning QEMU threads).
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/include/asm/hyperv-tlfs.h | 7 +++++++
>  arch/x86/kvm/hyperv.c              | 4 +++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
> index af78cd72b8f3..989a1efe7f5e 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -170,6 +170,13 @@
>  /* Recommend using enlightened VMCS */
>  #define HV_X64_ENLIGHTENED_VMCS_RECOMMENDED            BIT(14)
>
> +/*
> + * Virtual processor will never share a physical core with another virtual
> + * processor, except for virtual processors that are reported as sibling SMT
> + * threads.
> + */
> +#define HV_X64_NO_NONARCH_CORESHARING                  BIT(18)
> +
>  /* Nested features. These are HYPERV_CPUID_NESTED_FEATURES.EAX bits. */
>  #define HV_X64_NESTED_GUEST_MAPPING_FLUSH              BIT(18)
>  #define HV_X64_NESTED_MSR_BITMAP                       BIT(19)
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index fff790a3f4ee..9c187d16a9cd 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -23,6 +23,7 @@
>  #include "ioapic.h"
>  #include "hyperv.h"
>
> +#include <linux/cpu.h>
>  #include <linux/kvm_host.h>
>  #include <linux/highmem.h>
>  #include <linux/sched/cputime.h>
> @@ -1864,7 +1865,8 @@ int kvm_vcpu_ioctl_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>                         ent->eax |= HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED;
>                         if (evmcs_ver)
>                                 ent->eax |= HV_X64_ENLIGHTENED_VMCS_RECOMMENDED;
> -
> +                       if (!cpu_smt_possible())
> +                               ent->eax |= HV_X64_NO_NONARCH_CORESHARING;
>                         /*
>                          * Default number of spinlock retry attempts, matches
>                          * HyperV 2016.
> --
> 2.20.1
>
Vitaly Kuznetsov Sept. 17, 2019, 9:33 a.m. UTC | #2
Jim Mattson <jmattson@google.com> writes:

> On Mon, Sep 16, 2019 at 9:23 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>>
>> Hyper-V 2019 doesn't expose MD_CLEAR CPUID bit to guests when it cannot
>> guarantee that two virtual processors won't end up running on sibling SMT
>> threads without knowing about it. This is done as an optimization as in
>> this case there is nothing the guest can do to protect itself against MDS
>> and issuing additional flush requests is just pointless. On bare metal the
>> topology is known, however, when Hyper-V is running nested (e.g. on top of
>> KVM) it needs an additional piece of information: a confirmation that the
>> exposed topology (wrt vCPU placement on different SMT threads) is
>> trustworthy.
>>
>> NoNonArchitecturalCoreSharing (CPUID 0x40000004 EAX bit 18) is described in
>> TLFS as follows: "Indicates that a virtual processor will never share a
>> physical core with another virtual processor, except for virtual processors
>> that are reported as sibling SMT threads." From KVM we can give such
>> guarantee in two cases:
>> - SMT is unsupported or forcefully disabled (just 'disabled' doesn't work
>>  as it can become re-enabled during the lifetime of the guest).
>> - vCPUs are properly pinned so the scheduler won't put them on sibling
>> SMT threads (when they're not reported as such).
>
> That's a nice bit of information. Have you considered a mechanism for
> communicating this information to kvm guests in a way that doesn't
> require Hyper-V enlightenments?
>

(I haven't put much thought in this) but can we re-use MD_CLEAR CPUID
bit for that? Like if the hypervisor can't guarantee usefulness
(e.g. when two random vCPUs can be put on sibling SMT threads) of
flushing, is there any reason to still make the guest think the feature
is there?
Paolo Bonzini Sept. 17, 2019, 2:08 p.m. UTC | #3
On 17/09/19 11:33, Vitaly Kuznetsov wrote:
> Jim Mattson <jmattson@google.com> writes:
> 
>> On Mon, Sep 16, 2019 at 9:23 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>>>
>>> Hyper-V 2019 doesn't expose MD_CLEAR CPUID bit to guests when it cannot
>>> guarantee that two virtual processors won't end up running on sibling SMT
>>> threads without knowing about it. This is done as an optimization as in
>>> this case there is nothing the guest can do to protect itself against MDS
>>> and issuing additional flush requests is just pointless. On bare metal the
>>> topology is known, however, when Hyper-V is running nested (e.g. on top of
>>> KVM) it needs an additional piece of information: a confirmation that the
>>> exposed topology (wrt vCPU placement on different SMT threads) is
>>> trustworthy.
>>>
>>> NoNonArchitecturalCoreSharing (CPUID 0x40000004 EAX bit 18) is described in
>>> TLFS as follows: "Indicates that a virtual processor will never share a
>>> physical core with another virtual processor, except for virtual processors
>>> that are reported as sibling SMT threads." From KVM we can give such
>>> guarantee in two cases:
>>> - SMT is unsupported or forcefully disabled (just 'disabled' doesn't work
>>>  as it can become re-enabled during the lifetime of the guest).
>>> - vCPUs are properly pinned so the scheduler won't put them on sibling
>>> SMT threads (when they're not reported as such).
>>
>> That's a nice bit of information. Have you considered a mechanism for
>> communicating this information to kvm guests in a way that doesn't
>> require Hyper-V enlightenments?
>>
> 
> (I haven't put much thought in this) but can we re-use MD_CLEAR CPUID
> bit for that? Like if the hypervisor can't guarantee usefulness
> (e.g. when two random vCPUs can be put on sibling SMT threads) of
> flushing, is there any reason to still make the guest think the feature
> is there?

Yes, that's a good idea.

Paolo
Peter Zijlstra Sept. 23, 2019, 3:37 p.m. UTC | #4
On Mon, Sep 16, 2019 at 06:22:57PM +0200, Vitaly Kuznetsov wrote:
> Hyper-V 2019 doesn't expose MD_CLEAR CPUID bit to guests when it cannot
> guarantee that two virtual processors won't end up running on sibling SMT
> threads without knowing about it. This is done as an optimization as in
> this case there is nothing the guest can do to protect itself against MDS
> and issuing additional flush requests is just pointless. On bare metal the
> topology is known, however, when Hyper-V is running nested (e.g. on top of
> KVM) it needs an additional piece of information: a confirmation that the
> exposed topology (wrt vCPU placement on different SMT threads) is
> trustworthy.
> 
> NoNonArchitecturalCoreSharing (CPUID 0x40000004 EAX bit 18) is described in
> TLFS as follows: "Indicates that a virtual processor will never share a
> physical core with another virtual processor, except for virtual processors
> that are reported as sibling SMT threads." From KVM we can give such
> guarantee in two cases:
> - SMT is unsupported or forcefully disabled (just 'disabled' doesn't work
>  as it can become re-enabled during the lifetime of the guest).
> - vCPUs are properly pinned so the scheduler won't put them on sibling
> SMT threads (when they're not reported as such).
> 
> This patch reports NoNonArchitecturalCoreSharing bit in to userspace in the
> first case. The second case is outside of KVM's domain of responsibility
> (as vCPU pinning is actually done by someone who manages KVM's userspace -
> e.g. libvirt pinning QEMU threads).

This is purely about guest<->guest MDS, right? Ie. not worse than actual
hardware.
Paolo Bonzini Sept. 23, 2019, 4:48 p.m. UTC | #5
On 23/09/19 17:37, Peter Zijlstra wrote:
>> This patch reports NoNonArchitecturalCoreSharing bit in to userspace in the
>> first case. The second case is outside of KVM's domain of responsibility
>> (as vCPU pinning is actually done by someone who manages KVM's userspace -
>> e.g. libvirt pinning QEMU threads).
> This is purely about guest<->guest MDS, right? Ie. not worse than actual
> hardware.

Even within the same guest.  If vCPU 1 is on virtual core 1 and vCPU 2
is on virtual core 2, but they can share the same physical core, core
scheduling in the guest can do nothing about it.

Paolo

Patch
diff mbox series

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index af78cd72b8f3..989a1efe7f5e 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -170,6 +170,13 @@ 
 /* Recommend using enlightened VMCS */
 #define HV_X64_ENLIGHTENED_VMCS_RECOMMENDED		BIT(14)
 
+/*
+ * Virtual processor will never share a physical core with another virtual
+ * processor, except for virtual processors that are reported as sibling SMT
+ * threads.
+ */
+#define HV_X64_NO_NONARCH_CORESHARING                  BIT(18)
+
 /* Nested features. These are HYPERV_CPUID_NESTED_FEATURES.EAX bits. */
 #define HV_X64_NESTED_GUEST_MAPPING_FLUSH		BIT(18)
 #define HV_X64_NESTED_MSR_BITMAP			BIT(19)
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index fff790a3f4ee..9c187d16a9cd 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -23,6 +23,7 @@ 
 #include "ioapic.h"
 #include "hyperv.h"
 
+#include <linux/cpu.h>
 #include <linux/kvm_host.h>
 #include <linux/highmem.h>
 #include <linux/sched/cputime.h>
@@ -1864,7 +1865,8 @@  int kvm_vcpu_ioctl_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 			ent->eax |= HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED;
 			if (evmcs_ver)
 				ent->eax |= HV_X64_ENLIGHTENED_VMCS_RECOMMENDED;
-
+			if (!cpu_smt_possible())
+				ent->eax |= HV_X64_NO_NONARCH_CORESHARING;
 			/*
 			 * Default number of spinlock retry attempts, matches
 			 * HyperV 2016.