diff mbox series

[v2,4/7] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs

Message ID 20220204204705.3538240-5-oupton@google.com (mailing list archive)
State New, archived
Headers show
Series VMX: nVMX: VMX control MSR fixes | expand

Commit Message

Oliver Upton Feb. 4, 2022, 8:47 p.m. UTC
KVM really has no business messing with the vCPU state. Nonetheless, it
has become ABI for KVM to adjust certain bits of the VMX entry/exit
control MSRs depending on the guest CPUID. Namely, the bits associated
with the IA32_PERF_GLOBAL_CTRL and IA32_BNDCFGS MSRs were conditionally
enabled if the guest CPUID allows for it.

Allow userspace to opt-out of changes to VMX control MSRs by adding a
new KVM quirk.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/include/uapi/asm/kvm.h | 11 ++++++-----
 arch/x86/kvm/vmx/vmx.c          |  3 +++
 2 files changed, 9 insertions(+), 5 deletions(-)

Comments

Sean Christopherson Feb. 7, 2022, 6:06 p.m. UTC | #1
On Fri, Feb 04, 2022, Oliver Upton wrote:
> KVM really has no business messing with the vCPU state. Nonetheless, it
> has become ABI for KVM to adjust certain bits of the VMX entry/exit
> control MSRs depending on the guest CPUID. Namely, the bits associated
> with the IA32_PERF_GLOBAL_CTRL and IA32_BNDCFGS MSRs were conditionally
> enabled if the guest CPUID allows for it.
> 
> Allow userspace to opt-out of changes to VMX control MSRs by adding a
> new KVM quirk.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/x86/include/uapi/asm/kvm.h | 11 ++++++-----
>  arch/x86/kvm/vmx/vmx.c          |  3 +++
>  2 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index bf6e96011dfe..acbab6a97fae 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -428,11 +428,12 @@ struct kvm_sync_regs {
>  	struct kvm_vcpu_events events;
>  };
>  
> -#define KVM_X86_QUIRK_LINT0_REENABLED	   (1 << 0)
> -#define KVM_X86_QUIRK_CD_NW_CLEARED	   (1 << 1)
> -#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE	   (1 << 2)
> -#define KVM_X86_QUIRK_OUT_7E_INC_RIP	   (1 << 3)
> -#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT (1 << 4)
> +#define KVM_X86_QUIRK_LINT0_REENABLED		(1 << 0)
> +#define KVM_X86_QUIRK_CD_NW_CLEARED		(1 << 1)
> +#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE		(1 << 2)
> +#define KVM_X86_QUIRK_OUT_7E_INC_RIP		(1 << 3)
> +#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT	(1 << 4)
> +#define KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS	(1 << 5)

I'd prefer we include msr_ia32_feature_control_valid_bits in this quirk, it should
be relatively easy to do since most of the modifications stem from
vmx_vcpu_after_set_cpuid().  vmx_setup_mce() is a bit odd, but IMO it's worth
excising as much crud as we can.

>  #define KVM_STATE_NESTED_FORMAT_VMX	0
>  #define KVM_STATE_NESTED_FORMAT_SVM	1
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 395787b7e7ac..60b1b76782e1 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7231,6 +7231,9 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  
> +	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS))
> +		return;


Probably worth calling out that nested_vmx_cr_fixed1_bits_update() is intentionally
exempt from this "rule":

	case MSR_IA32_VMX_CR0_FIXED1:
	case MSR_IA32_VMX_CR4_FIXED1:
		/*
		 * These MSRs are generated based on the vCPU's CPUID, so we
		 * do not support restoring them directly.
		 */
		return -EINVAL;

> +
>  	if (kvm_mpx_supported()) {
>  		bool mpx_enabled = guest_cpuid_has(vcpu, X86_FEATURE_MPX);
>  
> -- 
> 2.35.0.263.gb82422642f-goog
>
Oliver Upton Feb. 9, 2022, 1:50 a.m. UTC | #2
On Mon, Feb 7, 2022 at 10:06 AM Sean Christopherson <seanjc@google.com> wrote:

[...]

> > +#define KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS    (1 << 5)
>
> I'd prefer we include msr_ia32_feature_control_valid_bits in this quirk, it should
> be relatively easy to do since most of the modifications stem from
> vmx_vcpu_after_set_cpuid().  vmx_setup_mce() is a bit odd, but IMO it's worth
> excising as much crud as we can.
>

Sure, this is a good opportunity to rip out the crud.
msr_ia32_feature_control_valid_bits is a bit messy, since the default
value does not contain all the bits we support. At least with
IA32_VM_TRUE_{ENTRY,EXIT}_CTLS we slim down the hardware values to get
the default value.

Not at all objecting, but it looks like we will need to populate some
bits in the default value of the IA32_FEAT_CTL mask, otherwise with
the quirk enabled guests could never set any of the bits in the MSR.

> >  #define KVM_STATE_NESTED_FORMAT_VMX  0
> >  #define KVM_STATE_NESTED_FORMAT_SVM  1
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 395787b7e7ac..60b1b76782e1 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -7231,6 +7231,9 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
> >  {
> >       struct vcpu_vmx *vmx = to_vmx(vcpu);
> >
> > +     if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS))
> > +             return;
>
>
> Probably worth calling out that nested_vmx_cr_fixed1_bits_update() is intentionally
> exempt from this "rule":

Agreed.

--
Thanks,
Oliver
Sean Christopherson Feb. 9, 2022, 8:23 p.m. UTC | #3
On Tue, Feb 08, 2022, Oliver Upton wrote:
> On Mon, Feb 7, 2022 at 10:06 AM Sean Christopherson <seanjc@google.com> wrote:
> 
> [...]
> 
> > > +#define KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS    (1 << 5)
> >
> > I'd prefer we include msr_ia32_feature_control_valid_bits in this quirk, it should
> > be relatively easy to do since most of the modifications stem from
> > vmx_vcpu_after_set_cpuid().  vmx_setup_mce() is a bit odd, but IMO it's worth
> > excising as much crud as we can.
> >
> 
> Sure, this is a good opportunity to rip out the crud.
> msr_ia32_feature_control_valid_bits is a bit messy, since the default
> value does not contain all the bits we support. At least with
> IA32_VM_TRUE_{ENTRY,EXIT}_CTLS we slim down the hardware values to get
> the default value.
> 
> Not at all objecting, but it looks like we will need to populate some
> bits in the default value of the IA32_FEAT_CTL mask, otherwise with
> the quirk enabled guests could never set any of the bits in the MSR.

I assume you mean "quirk disabled"?  Because quirks are on by default, i.e. KVM's
default behavior will be to populate msr_ia32_feature_control_valid_bits based on
CPUID updates.

That said, after typing up what I had in mind, I don't think we need a quirk at all.
The only weird part is that KVM doesn't allow host userspace to set the MSR without
first setting CPUID.  That's trivial to fix and we can do so without impacting KVM's
modeling of WRMSR from the guest.  Modeling WRMSR is no different than KVM enforcing
CR4 bits based on CPUID.  The VMX MSRs are weird because they are technically
independent of the non-virtualization support reported in CPUID, i.e. KVM is overstepping
by manipulating the MSRs based on CPUID.

I'll send this is a formal patch, obviously with KVM_SUPPORTED_FEATURE_CONTROL
defined...

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d05b4955d14f..d50ae2de8b51 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1749,11 +1749,16 @@ bool nested_vmx_allowed(struct kvm_vcpu *vcpu)
 }

 static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
-                                                uint64_t val)
+                                                struct msr_data *msr)
 {
-       uint64_t valid_bits = to_vmx(vcpu)->msr_ia32_feature_control_valid_bits;
+       uint64_t valid_bits;

-       return !(val & ~valid_bits);
+       if (msr->host_initiated)
+               valid_bits = KVM_SUPPORTED_FEATURE_CONTROL;
+       else
+               to_vmx(vcpu)->msr_ia32_feature_control_valid_bits;
+
+       return !(msr->data & ~valid_bits);
 }

 static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
@@ -2146,7 +2151,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
                vcpu->arch.mcg_ext_ctl = data;
                break;
        case MSR_IA32_FEAT_CTL:
-               if (!vmx_feature_control_msr_valid(vcpu, data) ||
+               if (!vmx_feature_control_msr_valid(vcpu, msr_info) ||
                    (to_vmx(vcpu)->msr_ia32_feature_control &
                     FEAT_CTL_LOCKED && !msr_info->host_initiated))
                        return 1;
diff mbox series

Patch

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index bf6e96011dfe..acbab6a97fae 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -428,11 +428,12 @@  struct kvm_sync_regs {
 	struct kvm_vcpu_events events;
 };
 
-#define KVM_X86_QUIRK_LINT0_REENABLED	   (1 << 0)
-#define KVM_X86_QUIRK_CD_NW_CLEARED	   (1 << 1)
-#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE	   (1 << 2)
-#define KVM_X86_QUIRK_OUT_7E_INC_RIP	   (1 << 3)
-#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT (1 << 4)
+#define KVM_X86_QUIRK_LINT0_REENABLED		(1 << 0)
+#define KVM_X86_QUIRK_CD_NW_CLEARED		(1 << 1)
+#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE		(1 << 2)
+#define KVM_X86_QUIRK_OUT_7E_INC_RIP		(1 << 3)
+#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT	(1 << 4)
+#define KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS	(1 << 5)
 
 #define KVM_STATE_NESTED_FORMAT_VMX	0
 #define KVM_STATE_NESTED_FORMAT_SVM	1
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 395787b7e7ac..60b1b76782e1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7231,6 +7231,9 @@  void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS))
+		return;
+
 	if (kvm_mpx_supported()) {
 		bool mpx_enabled = guest_cpuid_has(vcpu, X86_FEATURE_MPX);