diff mbox series

KVM: nVMX: Dynamically compute max VMCS index for vmcs12

Message ID 20210618214658.2700765-1-seanjc@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: nVMX: Dynamically compute max VMCS index for vmcs12 | expand

Commit Message

Sean Christopherson June 18, 2021, 9:46 p.m. UTC
Calculate the max VMCS index for vmcs12 by walking the array to find the
actual max index.  Hardcoding the index is prone to bitrot, and the
calculation is only done on KVM bringup (albeit on every CPU, but there
aren't _that_ many null entries in the array).

Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in VMCS12")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---

Note, the vmx test in kvm-unit-tests will still fail using stock QEMU,
as QEMU also hardcodes and overwrites the MSR.  The test passes if I
hack KVM to ignore userspace (it was easier than rebuilding QEMU).

 arch/x86/kvm/vmx/nested.c | 37 +++++++++++++++++++++++++++++++++++--
 arch/x86/kvm/vmx/vmcs.h   |  8 ++++++++
 arch/x86/kvm/vmx/vmcs12.h |  6 ------
 3 files changed, 43 insertions(+), 8 deletions(-)

Comments

Paolo Bonzini June 21, 2021, 4:39 p.m. UTC | #1
On 18/06/21 23:46, Sean Christopherson wrote:
> Calculate the max VMCS index for vmcs12 by walking the array to find the
> actual max index.  Hardcoding the index is prone to bitrot, and the
> calculation is only done on KVM bringup (albeit on every CPU, but there
> aren't _that_ many null entries in the array).
> 
> Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in VMCS12")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> 
> Note, the vmx test in kvm-unit-tests will still fail using stock QEMU,
> as QEMU also hardcodes and overwrites the MSR.  The test passes if I
> hack KVM to ignore userspace (it was easier than rebuilding QEMU).

Queued, thanks.  Without having checked the kvm-unit-tests sources very 
thoroughly, this might be a configuration issue in kvm-unit-tests; in 
theory "-cpu host" (unlike "-cpu host,migratable=no") should not enable 
TSC scaling.

Paolo
Sean Christopherson June 21, 2021, 5:08 p.m. UTC | #2
On Mon, Jun 21, 2021, Paolo Bonzini wrote:
> On 18/06/21 23:46, Sean Christopherson wrote:
> > Calculate the max VMCS index for vmcs12 by walking the array to find the
> > actual max index.  Hardcoding the index is prone to bitrot, and the
> > calculation is only done on KVM bringup (albeit on every CPU, but there
> > aren't _that_ many null entries in the array).
> > 
> > Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in VMCS12")
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> > 
> > Note, the vmx test in kvm-unit-tests will still fail using stock QEMU,
> > as QEMU also hardcodes and overwrites the MSR.  The test passes if I
> > hack KVM to ignore userspace (it was easier than rebuilding QEMU).
> 
> Queued, thanks.  Without having checked the kvm-unit-tests sources very
> thoroughly, this might be a configuration issue in kvm-unit-tests; in theory
> "-cpu host" (unlike "-cpu host,migratable=no") should not enable TSC
> scaling.

As noted in the code comments, KVM allows VMREAD/VMWRITE to all defined fields,
whether or not the field should actually exist for the vCPU model doesn't enter
into the equation.  That's technically wrong as there are a number of fields
that the SDM explicitly states exist iff a certain feature is supported.  To fix
that we'd need to add a "feature flag" to vmcs_field_to_offset_table that is
checked against the vCPU model, though updating the MSR would probably fall onto
userspace's shoulders?

And FWIW, this is the QEMU code:

  #define VMCS12_MAX_FIELD_INDEX (0x17)

  static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray f)
  {
      ...

      /*
       * Just to be safe, write these with constant values.  The CRn_FIXED1
       * MSRs are generated by KVM based on the vCPU's CPUID.
       */
      kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0,
                        CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK);
      kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0,
                        CR4_VMXE_MASK);
      kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM,
                        VMCS12_MAX_FIELD_INDEX << 1);
  }
Hu, Robert July 6, 2021, 3:05 a.m. UTC | #3
> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Tuesday, June 22, 2021 01:08
> To: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>; Wanpeng Li
> <wanpengli@tencent.com>; Jim Mattson <jmattson@google.com>; Joerg
> Roedel <joro@8bytes.org>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for
> vmcs12
> 
> On Mon, Jun 21, 2021, Paolo Bonzini wrote:
> > On 18/06/21 23:46, Sean Christopherson wrote:
> > > Calculate the max VMCS index for vmcs12 by walking the array to find
> > > the actual max index.  Hardcoding the index is prone to bitrot, and
> > > the calculation is only done on KVM bringup (albeit on every CPU,
> > > but there aren't _that_ many null entries in the array).
> > >
> > > Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in
> > > VMCS12")
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > >
> > > Note, the vmx test in kvm-unit-tests will still fail using stock
> > > QEMU, as QEMU also hardcodes and overwrites the MSR.  The test
> > > passes if I hack KVM to ignore userspace (it was easier than rebuilding
> QEMU).
> >
> > Queued, thanks.  Without having checked the kvm-unit-tests sources
> > very thoroughly, this might be a configuration issue in
> > kvm-unit-tests; in theory "-cpu host" (unlike "-cpu
> > host,migratable=no") should not enable TSC scaling.
> 
> As noted in the code comments, KVM allows VMREAD/VMWRITE to all defined
> fields, whether or not the field should actually exist for the vCPU model doesn't
> enter into the equation.  That's technically wrong as there are a number of
> fields that the SDM explicitly states exist iff a certain feature is supported.  To
> fix that we'd need to add a "feature flag" to vmcs_field_to_offset_table that is
> checked against the vCPU model, though updating the MSR would probably fall
> onto userspace's shoulders?
[Hu, Robert] 
Perhaps more easier and proper to do this in KVM side.
QEMU sets actual feature set down to KVM, and KVM updates IA32_VMX_VMCS_ENUM
MSR accordingly. We don't see a channel that QEMU constructs a VMCS and sets a whole
to KVM.

> 
> And FWIW, this is the QEMU code:
> 
>   #define VMCS12_MAX_FIELD_INDEX (0x17)
> 
>   static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray f)
>   {
>       ...
> 
>       /*
>        * Just to be safe, write these with constant values.  The CRn_FIXED1
>        * MSRs are generated by KVM based on the vCPU's CPUID.
>        */
>       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0,
>                         CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK);
>       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0,
>                         CR4_VMXE_MASK);
>       kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM,
>                         VMCS12_MAX_FIELD_INDEX << 1);
>   }
Paolo Bonzini July 6, 2021, 5:42 a.m. UTC | #4
On 06/07/21 05:05, Hu, Robert wrote:
>> As noted in the code comments, KVM allows VMREAD/VMWRITE to all defined
>> fields, whether or not the field should actually exist for the vCPU model doesn't
>> enter into the equation.  That's technically wrong as there are a number of
>> fields that the SDM explicitly states exist iff a certain feature is supported.  To
>> fix that we'd need to add a "feature flag" to vmcs_field_to_offset_table that is
>> checked against the vCPU model, though updating the MSR would probably fall
>> onto userspace's shoulders?
> 
> [Hu, Robert]
> Perhaps more easier and proper to do this in KVM side.
> QEMU sets actual feature set down to KVM, and KVM updates IA32_VMX_VMCS_ENUM
> MSR accordingly. We don't see a channel that QEMU constructs a VMCS and sets a whole
> to KVM.

Yes, it's possible to do that too.  If that is included in Linux 5.14, 
we can remove it from QEMU.

Paolo
Hu, Robert July 21, 2021, 10:02 a.m. UTC | #5
> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Tuesday, June 22, 2021 01:08
> To: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>; Wanpeng Li
> <wanpengli@tencent.com>; Jim Mattson <jmattson@google.com>; Joerg
> Roedel <joro@8bytes.org>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for
> vmcs12
> 
> On Mon, Jun 21, 2021, Paolo Bonzini wrote:
> > On 18/06/21 23:46, Sean Christopherson wrote:
> > > Calculate the max VMCS index for vmcs12 by walking the array to find
> > > the actual max index.  Hardcoding the index is prone to bitrot, and
> > > the calculation is only done on KVM bringup (albeit on every CPU,
> > > but there aren't _that_ many null entries in the array).
> > >
> > > Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in
> > > VMCS12")
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > >
> > > Note, the vmx test in kvm-unit-tests will still fail using stock
> > > QEMU, as QEMU also hardcodes and overwrites the MSR.  The test
> > > passes if I hack KVM to ignore userspace (it was easier than rebuilding
> QEMU).
> >
> > Queued, thanks.  Without having checked the kvm-unit-tests sources
> > very thoroughly, this might be a configuration issue in
> > kvm-unit-tests; in theory "-cpu host" (unlike "-cpu
> > host,migratable=no") should not enable TSC scaling.
> 
> As noted in the code comments, KVM allows VMREAD/VMWRITE to all defined
> fields, whether or not the field should actually exist for the vCPU model doesn't
> enter into the equation.  That's technically wrong as there are a number of
> fields that the SDM explicitly states exist iff a certain feature is supported.  
[Hu, Robert] 
It's right that a number of fields' existence depends on some certain feature.
Meanwhile, "IA32_VMX_VMCS_ENUM indicates to software the highest index
value used in the encoding of any field *supported* by the processor", rather than
*existed*.
So my understanding is no matter what VMCS exec control field's value is set, value
of IA32_VMX_VMCS_ENUM shall not be affected, as it reports the physical CPU's capability
rather than runtime VMCS configuration.
Back to nested case, L1's VMCS configuration lays the "physical" capability for L2, right?
nested_vmx_msrs or at least nested_vmx_msrs.vmcs_enum shall be put to vcpu
scope, rather than current kvm global.
Current nested_vmx_calc_vmcs_enum_msr() is invoked at early stage, before vcpu features
are settled. I think should be moved to later stage as well.
 
> To fix that we'd need to add a "feature flag" to vmcs_field_to_offset_table that is
> checked against the vCPU model, though updating the MSR would probably fall
> onto userspace's shoulders?
> 
> And FWIW, this is the QEMU code:
> 
>   #define VMCS12_MAX_FIELD_INDEX (0x17)
> 
>   static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray f)
>   {
>       ...
> 
>       /*
>        * Just to be safe, write these with constant values.  The CRn_FIXED1
>        * MSRs are generated by KVM based on the vCPU's CPUID.
>        */
>       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0,
>                         CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK);
>       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0,
>                         CR4_VMXE_MASK);
>       kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM,
>                         VMCS12_MAX_FIELD_INDEX << 1);
>   }
Sean Christopherson July 21, 2021, 4:18 p.m. UTC | #6
On Wed, Jul 21, 2021, Hu, Robert wrote:
> > > Queued, thanks.  Without having checked the kvm-unit-tests sources
> > > very thoroughly, this might be a configuration issue in
> > > kvm-unit-tests; in theory "-cpu host" (unlike "-cpu
> > > host,migratable=no") should not enable TSC scaling.
> > 
> > As noted in the code comments, KVM allows VMREAD/VMWRITE to all defined
> > fields, whether or not the field should actually exist for the vCPU model doesn't
> > enter into the equation.  That's technically wrong as there are a number of
> > fields that the SDM explicitly states exist iff a certain feature is supported.  
>
> It's right that a number of fields' existence depends on some certain feature.
> Meanwhile, "IA32_VMX_VMCS_ENUM indicates to software the highest index
> value used in the encoding of any field *supported* by the processor", rather than
> *existed*.

Yes.

> So my understanding is no matter what VMCS exec control field's value is set,
> value of IA32_VMX_VMCS_ENUM shall not be affected, as it reports the physical
> CPU's capability rather than runtime VMCS configuration.

Yes.

> Back to nested case, L1's VMCS configuration lays the "physical" capability
> for L2, right?

Yes. 

> nested_vmx_msrs or at least nested_vmx_msrs.vmcs_enum shall be put to vcpu
> scope, rather than current kvm global.
>
> Current nested_vmx_calc_vmcs_enum_msr() is invoked at early stage, before
> vcpu features are settled. I think should be moved to later stage as well.

Just moving the helper (or adding another call) wouldn't fix the underlying
problem that KVM doesn't correctly model the interplay between VMX features and
VMCS fields.  It's arguably less wrong than letting userspace stuff an incorrect
value, but it's not 100% correct and ignoring/overriding userspace is sketchy at
best.  As suggested below, the full fix is to fail VMREAD/VMWRITE to fields that
shouldn't exist given the vCPU model.

> > To fix that we'd need to add a "feature flag" to vmcs_field_to_offset_table
> > that is checked against the vCPU model, though updating the MSR would
> > probably fall onto userspace's shoulders?
> > 
> > And FWIW, this is the QEMU code:
> > 
> >   #define VMCS12_MAX_FIELD_INDEX (0x17)
> > 
> >   static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray f)
> >   {
> >       ...
> > 
> >       /*
> >        * Just to be safe, write these with constant values.  The CRn_FIXED1
> >        * MSRs are generated by KVM based on the vCPU's CPUID.
> >        */
> >       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0,
> >                         CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK);
> >       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0,
> >                         CR4_VMXE_MASK);
> >       kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM,
> >                         VMCS12_MAX_FIELD_INDEX << 1);
> >   }
>
Robert Hoo July 22, 2021, 2:43 a.m. UTC | #7
On Wed, 2021-07-21 at 16:18 +0000, Sean Christopherson wrote:
> On Wed, Jul 21, 2021, Hu, Robert wrote:
> > > > Queued, thanks.  Without having checked the kvm-unit-tests
> > > > sources
> > > > very thoroughly, this might be a configuration issue in
> > > > kvm-unit-tests; in theory "-cpu host" (unlike "-cpu
> > > > host,migratable=no") should not enable TSC scaling.
> > > 
> > > As noted in the code comments, KVM allows VMREAD/VMWRITE to all
> > > defined
> > > fields, whether or not the field should actually exist for the
> > > vCPU model doesn't
> > > enter into the equation.  That's technically wrong as there are a
> > > number of
> > > fields that the SDM explicitly states exist iff a certain feature
> > > is supported.  
> > 
> > It's right that a number of fields' existence depends on some
> > certain feature.
> > Meanwhile, "IA32_VMX_VMCS_ENUM indicates to software the highest
> > index
> > value used in the encoding of any field *supported* by the
> > processor", rather than
> > *existed*.
> 
> Yes.
> 
> > So my understanding is no matter what VMCS exec control field's
> > value is set,
> > value of IA32_VMX_VMCS_ENUM shall not be affected, as it reports
> > the physical
> > CPU's capability rather than runtime VMCS configuration.
> 
> Yes.
> 
> > Back to nested case, L1's VMCS configuration lays the "physical"
> > capability
> > for L2, right?
> 
> Yes. 
> 
> > nested_vmx_msrs or at least nested_vmx_msrs.vmcs_enum shall be put
> > to vcpu
> > scope, rather than current kvm global.
> > 
> > Current nested_vmx_calc_vmcs_enum_msr() is invoked at early stage,
> > before
> > vcpu features are settled. I think should be moved to later stage
> > as well.
> 
> Just moving the helper (or adding another call) wouldn't fix the
> underlying
> problem that KVM doesn't correctly model the interplay between VMX
> features and
> VMCS fields.  It's arguably less wrong than letting userspace stuff
> an incorrect
> value, but it's not 100% correct and ignoring/overriding userspace is
> sketchy at
> best.  
I think IA32_VMX_VMCS_ENUM MSR shall not be set by QEMU, it is actually
already indirectly set by user space CPU feature set.

> As suggested below, the full fix is to fail VMREAD/VMWRITE to fields
> that
> shouldn't exist given the vCPU model.
> 
> > > To fix that we'd need to add a "feature flag" to
> > > vmcs_field_to_offset_table
> > > that is checked against the vCPU model, though updating the MSR
> > > would
> > > probably fall onto userspace's shoulders?
> > > 
> > > And FWIW, this is the QEMU code:
> > > 
> > >   #define VMCS12_MAX_FIELD_INDEX (0x17)
> > > 
> > >   static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray
> > > f)
> > >   {
> > >       ...
> > > 
> > >       /*
> > >        * Just to be safe, write these with constant values.  The
> > > CRn_FIXED1
> > >        * MSRs are generated by KVM based on the vCPU's CPUID.
> > >        */
> > >       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0,
> > >                         CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK);
> > >       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0,
> > >                         CR4_VMXE_MASK);
> > >       kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM,
> > >                         VMCS12_MAX_FIELD_INDEX << 1);
> > >   }
Maxim Levitsky July 22, 2021, 7:59 a.m. UTC | #8
On Fri, 2021-06-18 at 14:46 -0700, Sean Christopherson wrote:
> Calculate the max VMCS index for vmcs12 by walking the array to find the
> actual max index.  Hardcoding the index is prone to bitrot, and the
> calculation is only done on KVM bringup (albeit on every CPU, but there
> aren't _that_ many null entries in the array).
> 
> Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in VMCS12")
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Could you give me an example on how this fails in the KVM unit tests?
I have a bug report here and it might be related so I want to save some
time triaging it.

Best regards,
	Maxim Levitsky

> ---
> 
> Note, the vmx test in kvm-unit-tests will still fail using stock QEMU,
> as QEMU also hardcodes and overwrites the MSR.  The test passes if I
> hack KVM to ignore userspace (it was easier than rebuilding QEMU).
> 
>  arch/x86/kvm/vmx/nested.c | 37 +++++++++++++++++++++++++++++++++++--
>  arch/x86/kvm/vmx/vmcs.h   |  8 ++++++++
>  arch/x86/kvm/vmx/vmcs12.h |  6 ------
>  3 files changed, 43 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index b531e08a095b..183fd9d62fc5 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -6374,6 +6374,40 @@ void nested_vmx_set_vmcs_shadowing_bitmap(void)
>  	}
>  }
>  
> +/*
> + * Indexing into the vmcs12 uses the VMCS encoding rotated left by 6.  Undo
> + * that madness to get the encoding for comparison.
> + */
> +#define VMCS12_IDX_TO_ENC(idx) ((u16)(((u16)(idx) >> 6) | ((u16)(idx) << 10)))
> +
> +static u64 nested_vmx_calc_vmcs_enum_msr(void)
> +{
> +	/*
> +	 * Note these are the so called "index" of the VMCS field encoding, not
> +	 * the index into vmcs12.
> +	 */
> +	unsigned int max_idx, idx;
> +	int i;
> +
> +	/*
> +	 * For better or worse, KVM allows VMREAD/VMWRITE to all fields in
> +	 * vmcs12, regardless of whether or not the associated feature is
> +	 * exposed to L1.  Simply find the field with the highest index.
> +	 */
> +	max_idx = 0;
> +	for (i = 0; i < nr_vmcs12_fields; i++) {
> +		/* The vmcs12 table is very, very sparsely populated. */
> +		if (!vmcs_field_to_offset_table[i])
> +			continue;
> +
> +		idx = vmcs_field_index(VMCS12_IDX_TO_ENC(i));
> +		if (idx > max_idx)
> +			max_idx = idx;
> +	}
> +
> +	return (u64)max_idx << VMCS_FIELD_INDEX_SHIFT;
> +}
> +
>  /*
>   * nested_vmx_setup_ctls_msrs() sets up variables containing the values to be
>   * returned for the various VMX controls MSRs when nested VMX is enabled.
> @@ -6619,8 +6653,7 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>  	rdmsrl(MSR_IA32_VMX_CR0_FIXED1, msrs->cr0_fixed1);
>  	rdmsrl(MSR_IA32_VMX_CR4_FIXED1, msrs->cr4_fixed1);
>  
> -	/* highest index: VMX_PREEMPTION_TIMER_VALUE */
> -	msrs->vmcs_enum = VMCS12_MAX_FIELD_INDEX << 1;
> +	msrs->vmcs_enum = nested_vmx_calc_vmcs_enum_msr();
>  }
>  
>  void nested_vmx_hardware_unsetup(void)
> diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h
> index 1472c6c376f7..de3b04d4b587 100644
> --- a/arch/x86/kvm/vmx/vmcs.h
> +++ b/arch/x86/kvm/vmx/vmcs.h
> @@ -164,4 +164,12 @@ static inline int vmcs_field_readonly(unsigned long field)
>  	return (((field >> 10) & 0x3) == 1);
>  }
>  
> +#define VMCS_FIELD_INDEX_SHIFT		(1)
> +#define VMCS_FIELD_INDEX_MASK		GENMASK(9, 1)
> +
> +static inline unsigned int vmcs_field_index(unsigned long field)
> +{
> +	return (field & VMCS_FIELD_INDEX_MASK) >> VMCS_FIELD_INDEX_SHIFT;
> +}
> +
>  #endif /* __KVM_X86_VMX_VMCS_H */
> diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
> index bb81a23afe89..5e0e1b39f495 100644
> --- a/arch/x86/kvm/vmx/vmcs12.h
> +++ b/arch/x86/kvm/vmx/vmcs12.h
> @@ -205,12 +205,6 @@ struct __packed vmcs12 {
>   */
>  #define VMCS12_SIZE		KVM_STATE_NESTED_VMX_VMCS_SIZE
>  
> -/*
> - * VMCS12_MAX_FIELD_INDEX is the highest index value used in any
> - * supported VMCS12 field encoding.
> - */
> -#define VMCS12_MAX_FIELD_INDEX 0x17
> -
>  /*
>   * For save/restore compatibility, the vmcs12 field offsets must not change.
>   */
Sean Christopherson July 22, 2021, 3:04 p.m. UTC | #9
On Thu, Jul 22, 2021, Maxim Levitsky wrote:
> On Fri, 2021-06-18 at 14:46 -0700, Sean Christopherson wrote:
> > Calculate the max VMCS index for vmcs12 by walking the array to find the
> > actual max index.  Hardcoding the index is prone to bitrot, and the
> > calculation is only done on KVM bringup (albeit on every CPU, but there
> > aren't _that_ many null entries in the array).
> > 
> > Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in VMCS12")
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> 
> Could you give me an example on how this fails in the KVM unit tests?
> I have a bug report here and it might be related so I want to save some
> time triaging it.

FAIL: VMX_VMCS_ENUM.MAX_INDEX expected: 19, actual: 17

FWIW, unless a kernel/hypervisor is sanity checking VMREAD/VMWRITE or doing something
clever with the MAX_INDEX, I wouldn't expect this to cause any real world failures.
diff mbox series

Patch

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b531e08a095b..183fd9d62fc5 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6374,6 +6374,40 @@  void nested_vmx_set_vmcs_shadowing_bitmap(void)
 	}
 }
 
+/*
+ * Indexing into the vmcs12 uses the VMCS encoding rotated left by 6.  Undo
+ * that madness to get the encoding for comparison.
+ */
+#define VMCS12_IDX_TO_ENC(idx) ((u16)(((u16)(idx) >> 6) | ((u16)(idx) << 10)))
+
+static u64 nested_vmx_calc_vmcs_enum_msr(void)
+{
+	/*
+	 * Note these are the so called "index" of the VMCS field encoding, not
+	 * the index into vmcs12.
+	 */
+	unsigned int max_idx, idx;
+	int i;
+
+	/*
+	 * For better or worse, KVM allows VMREAD/VMWRITE to all fields in
+	 * vmcs12, regardless of whether or not the associated feature is
+	 * exposed to L1.  Simply find the field with the highest index.
+	 */
+	max_idx = 0;
+	for (i = 0; i < nr_vmcs12_fields; i++) {
+		/* The vmcs12 table is very, very sparsely populated. */
+		if (!vmcs_field_to_offset_table[i])
+			continue;
+
+		idx = vmcs_field_index(VMCS12_IDX_TO_ENC(i));
+		if (idx > max_idx)
+			max_idx = idx;
+	}
+
+	return (u64)max_idx << VMCS_FIELD_INDEX_SHIFT;
+}
+
 /*
  * nested_vmx_setup_ctls_msrs() sets up variables containing the values to be
  * returned for the various VMX controls MSRs when nested VMX is enabled.
@@ -6619,8 +6653,7 @@  void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	rdmsrl(MSR_IA32_VMX_CR0_FIXED1, msrs->cr0_fixed1);
 	rdmsrl(MSR_IA32_VMX_CR4_FIXED1, msrs->cr4_fixed1);
 
-	/* highest index: VMX_PREEMPTION_TIMER_VALUE */
-	msrs->vmcs_enum = VMCS12_MAX_FIELD_INDEX << 1;
+	msrs->vmcs_enum = nested_vmx_calc_vmcs_enum_msr();
 }
 
 void nested_vmx_hardware_unsetup(void)
diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h
index 1472c6c376f7..de3b04d4b587 100644
--- a/arch/x86/kvm/vmx/vmcs.h
+++ b/arch/x86/kvm/vmx/vmcs.h
@@ -164,4 +164,12 @@  static inline int vmcs_field_readonly(unsigned long field)
 	return (((field >> 10) & 0x3) == 1);
 }
 
+#define VMCS_FIELD_INDEX_SHIFT		(1)
+#define VMCS_FIELD_INDEX_MASK		GENMASK(9, 1)
+
+static inline unsigned int vmcs_field_index(unsigned long field)
+{
+	return (field & VMCS_FIELD_INDEX_MASK) >> VMCS_FIELD_INDEX_SHIFT;
+}
+
 #endif /* __KVM_X86_VMX_VMCS_H */
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index bb81a23afe89..5e0e1b39f495 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -205,12 +205,6 @@  struct __packed vmcs12 {
  */
 #define VMCS12_SIZE		KVM_STATE_NESTED_VMX_VMCS_SIZE
 
-/*
- * VMCS12_MAX_FIELD_INDEX is the highest index value used in any
- * supported VMCS12 field encoding.
- */
-#define VMCS12_MAX_FIELD_INDEX 0x17
-
 /*
  * For save/restore compatibility, the vmcs12 field offsets must not change.
  */