
[RFC,14/33] KVM: x86: Add VTL to the MMU role

Message ID 20231108111806.92604-15-nsaenz@amazon.com (mailing list archive)
State New, archived
Series KVM: x86: hyperv: Introduce VSM support

Commit Message

Nicolas Saenz Julienne Nov. 8, 2023, 11:17 a.m. UTC
With the upcoming introduction of per-VTL memory protections, make MMU
roles VTL aware. This will avoid sharing PTEs between vCPUs that belong
to different VTLs, and that have distinct memory access restrictions.

Four bits are allocated to store the VTL number in the MMU role, since
the TLFS states there is a maximum of 16 levels.

Signed-off-by: Nicolas Saenz Julienne <nsaenz@amazon.com>
---
 arch/x86/include/asm/kvm_host.h | 3 ++-
 arch/x86/kvm/hyperv.h           | 6 ++++++
 arch/x86/kvm/mmu.h              | 1 +
 arch/x86/kvm/mmu/mmu.c          | 3 +++
 4 files changed, 12 insertions(+), 1 deletion(-)
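
For context on why adding role bits prevents sharing: KVM's shadow MMU
caches pages keyed by gfn and only reuses a cached page when the full
role word matches, so vCPUs whose roles differ in the new vtl bits can
never resolve to the same shadow page. A simplified, illustrative
sketch of that lookup (not the actual mmu.c code, which also handles
unsync and direct pages):

/*
 * Illustrative only -- not the real mmu.c lookup. Cached shadow pages
 * are hashed by gfn and matched on the full role word, so a VTL
 * mismatch in the role means the page is never shared.
 */
static struct kvm_mmu_page *find_sp(struct hlist_head *bucket, gfn_t gfn,
				    union kvm_mmu_page_role role)
{
	struct kvm_mmu_page *sp;

	hlist_for_each_entry(sp, bucket, hash_link) {
		if (sp->gfn == gfn && sp->role.word == role.word)
			return sp;	/* same gfn and role (incl. vtl) */
	}
	return NULL;	/* caller allocates a fresh page for this role */
}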

Comments

Sean Christopherson Nov. 8, 2023, 5:26 p.m. UTC | #1
On Wed, Nov 08, 2023, Nicolas Saenz Julienne wrote:
> With the upcoming introduction of per-VTL memory protections, make MMU
> roles VTL aware. This will avoid sharing PTEs between vCPUs that belong
> to different VTLs, and that have distinct memory access restrictions.
> 
> Four bits are allocated to store the VTL number in the MMU role, since
> the TLFS states there is a maximum of 16 levels.

How many does KVM actually allow/support?  Multiplying the number of possible
roots by 16x is a *major* change.
Nicolas Saenz Julienne Nov. 10, 2023, 6:52 p.m. UTC | #2
On Wed Nov 8, 2023 at 5:26 PM UTC, Sean Christopherson wrote:
> On Wed, Nov 08, 2023, Nicolas Saenz Julienne wrote:
> > With the upcoming introduction of per-VTL memory protections, make MMU
> > roles VTL aware. This will avoid sharing PTEs between vCPUs that belong
> > to different VTLs, and that have distinct memory access restrictions.
> >
> > Four bits are allocated to store the VTL number in the MMU role, since
> > the TLFS states there is a maximum of 16 levels.
>
> How many does KVM actually allow/support?  Multiplying the number of possible
> roots by 16x is a *major* change.

AFAIK in practice only VTL0/1 are used. Don't know if Microsoft will
come up with more in the future. We could introduce a CAP that exposes
the number of supported VTLs to user-space, and leave it as a compile
option.
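
A minimal sketch of that idea, with hypothetical names (neither
KVM_CAP_HYPERV_VTL_COUNT nor CONFIG_KVM_HYPERV_MAX_VTLS exists in KVM
today):

/*
 * Hypothetical sketch -- the CAP and the Kconfig symbol are made up
 * for illustration. The compile option caps the number of VTLs a
 * given kernel build supports, and the CAP reports it so a VMM can
 * size its VTL handling accordingly.
 */
#ifndef CONFIG_KVM_HYPERV_MAX_VTLS
#define CONFIG_KVM_HYPERV_MAX_VTLS 2	/* compile-time switch */
#endif

static int kvm_hv_vtl_count_cap(void)
{
	/* Would be returned from kvm_vm_ioctl_check_extension(). */
	return CONFIG_KVM_HYPERV_MAX_VTLS;
}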
Maxim Levitsky Nov. 28, 2023, 7:34 a.m. UTC | #3
On Fri, 2023-11-10 at 18:52 +0000, Nicolas Saenz Julienne wrote:
> On Wed Nov 8, 2023 at 5:26 PM UTC, Sean Christopherson wrote:
> > On Wed, Nov 08, 2023, Nicolas Saenz Julienne wrote:
> > > With the upcoming introduction of per-VTL memory protections, make MMU
> > > roles VTL aware. This will avoid sharing PTEs between vCPUs that belong
> > > to different VTLs, and that have distinct memory access restrictions.
> > > 
> > > Four bits are allocated to store the VTL number in the MMU role, since
> > > the TLFS states there is a maximum of 16 levels.
> > 
> > How many does KVM actually allow/support?  Multiplying the number of possible
> > roots by 16x is a *major* change.
> 
> AFAIK in practice only VTL0/1 are used. Don't know if Microsoft will
> come up with more in the future. We could introduce a CAP that exposes
> the number of supported VTLs to user-space, and leave it as a compile
> option.
> 

Actually, the Hyper-V spec says that currently only two VTLs are implemented in Hyper-V:

https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/vsm

"Architecturally, up to 16 levels of VTLs are supported; however a hypervisor may choose to implement fewer than 16 VTL’s. Currently, only two VTLs are implemented."

We shouldn't completely hardcode two VTLs, but I think it is safe to make
optimizations that assume two VTLs, and also to have a compile-time switch
for the number of supported VTLs.

In terms of adding the VTL to the MMU role, as long as it's only two VTLs, I don't think this is a terrible idea.

This does raise a question: what are we going to do about SMM? Windows will need it for secure boot,
so we can't just say that VSM is only supported without SMM.


However, if we take the approach of having a VM per VTL, then all of this comes for free, except that
every time userspace changes memslots, it will have to do so for both VMs at the same time (which might
introduce races; see the sketch below).

TLB flushes might also be tricky to synchronize between the two VMs, and so on.

Best regards,
	Maxim Levitsky
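
To illustrate the memslot point above: with one VM file descriptor per
VTL, a VMM has to replay every memslot change on both VMs. A minimal
userspace sketch (vtl0_fd/vtl1_fd are hypothetical names; the window
between the two ioctls is where the races mentioned above can bite
unless the VMM serializes against running vCPUs):

#include <sys/ioctl.h>
#include <linux/kvm.h>

static int set_memslot_both_vtls(int vtl0_fd, int vtl1_fd,
				 struct kvm_userspace_memory_region *mr)
{
	/* Apply the change to the VTL0 VM first... */
	if (ioctl(vtl0_fd, KVM_SET_USER_MEMORY_REGION, mr) < 0)
		return -1;
	/* ...the VTL1 VM sees the old memslots until this second call. */
	if (ioctl(vtl1_fd, KVM_SET_USER_MEMORY_REGION, mr) < 0)
		return -1;
	return 0;
}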

Patch

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7712e31b7537..1f5a85d461ce 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -338,7 +338,8 @@  union kvm_mmu_page_role {
 		unsigned ad_disabled:1;
 		unsigned guest_mode:1;
 		unsigned passthrough:1;
-		unsigned :5;
+		unsigned vtl:4;
+		unsigned :1;
 
 		/*
 		 * This is left at the top of the word so that
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index b3d1113efe82..605e80b9e5eb 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -263,4 +263,10 @@  static inline bool kvm_hv_vsm_enabled(struct kvm *kvm)
 
 int kvm_vm_ioctl_get_hv_vsm_state(struct kvm *kvm, struct kvm_hv_vsm_state *state);
 
+static inline void kvm_mmu_role_set_hv_bits(struct kvm_vcpu *vcpu,
+					    union kvm_mmu_page_role *role)
+{
+	role->vtl = kvm_hv_get_active_vtl(vcpu);
+}
+
 #endif
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 253fb2093d5d..e170388c6da1 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -304,4 +304,5 @@  static inline gpa_t kvm_translate_gpa(struct kvm_vcpu *vcpu,
 		return gpa;
 	return translate_nested_gpa(vcpu, gpa, access, exception);
 }
+
 #endif
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index baeba8fc1c38..2afef86863fb 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -28,6 +28,7 @@ 
 #include "page_track.h"
 #include "cpuid.h"
 #include "spte.h"
+#include "hyperv.h"
 
 #include <linux/kvm_host.h>
 #include <linux/types.h>
@@ -5197,6 +5198,7 @@  static union kvm_cpu_role kvm_calc_cpu_role(struct kvm_vcpu *vcpu,
 	role.base.smm = is_smm(vcpu);
 	role.base.guest_mode = is_guest_mode(vcpu);
 	role.ext.valid = 1;
+	kvm_mmu_role_set_hv_bits(vcpu, &role.base);
 
 	if (!____is_cr0_pg(regs)) {
 		role.base.direct = 1;
@@ -5271,6 +5273,7 @@  kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
 	role.level = kvm_mmu_get_tdp_level(vcpu);
 	role.direct = true;
 	role.has_4_byte_gpte = false;
+	kvm_mmu_role_set_hv_bits(vcpu, &role);
 
 	return role;
 }