[v2,03/18] KVM: nVMX: use vm_exit_controls_init() to write exit controls for vmcs02
diff mbox series

Message ID 20180828160459.14093-4-sean.j.christopherson@intel.com
State New
Headers show
Series
  • KVM: nVMX: add option to perform early consistency checks via H/W
Related show

Commit Message

Sean Christopherson Aug. 28, 2018, 4:04 p.m. UTC
Write VM_EXIT_CONTROLS using vm_exit_controls_init() when configuring
vmcs02, otherwise vm_exit_controls_shadow will be stale.  EFER in
particular can be corrupted if VM_EXIT_LOAD_IA32_EFER is not updated
due to an incorrect shadow optimization, which can crash L0 due to
EFER not being loaded on exit.  This does not occur with the current
code base simply because update_transition_efer() unconditionally
clears VM_EXIT_LOAD_IA32_EFER before conditionally setting it, and
because a nested guest always starts with VM_EXIT_LOAD_IA32_EFER
clear, i.e. we'll only ever unnecessarily clear the bit.  That is,
until someone optimizes update_transition_efer()...

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jim Mattson Sept. 19, 2018, 9 p.m. UTC | #1
On Tue, Aug 28, 2018 at 9:04 AM, Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
> Write VM_EXIT_CONTROLS using vm_exit_controls_init() when configuring
> vmcs02, otherwise vm_exit_controls_shadow will be stale.  EFER in
> particular can be corrupted if VM_EXIT_LOAD_IA32_EFER is not updated
> due to an incorrect shadow optimization, which can crash L0 due to
> EFER not being loaded on exit.  This does not occur with the current
> code base simply because update_transition_efer() unconditionally
> clears VM_EXIT_LOAD_IA32_EFER before conditionally setting it, and
> because a nested guest always starts with VM_EXIT_LOAD_IA32_EFER
> clear, i.e. we'll only ever unnecessarily clear the bit.  That is,
> until someone optimizes update_transition_efer()...
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>

Doesn't this also imply that we need a vm_exit_controls_init() when
returning to vmcs01, or vm_exit_controls_shadow will be stale after an
emulated VM-exit?
Jim Mattson Sept. 19, 2018, 9:18 p.m. UTC | #2
On Wed, Sep 19, 2018 at 2:00 PM, Jim Mattson <jmattson@google.com> wrote:
> On Tue, Aug 28, 2018 at 9:04 AM, Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
>> Write VM_EXIT_CONTROLS using vm_exit_controls_init() when configuring
>> vmcs02, otherwise vm_exit_controls_shadow will be stale.  EFER in
>> particular can be corrupted if VM_EXIT_LOAD_IA32_EFER is not updated
>> due to an incorrect shadow optimization, which can crash L0 due to
>> EFER not being loaded on exit.  This does not occur with the current
>> code base simply because update_transition_efer() unconditionally
>> clears VM_EXIT_LOAD_IA32_EFER before conditionally setting it, and
>> because a nested guest always starts with VM_EXIT_LOAD_IA32_EFER
>> clear, i.e. we'll only ever unnecessarily clear the bit.  That is,
>> until someone optimizes update_transition_efer()...
>>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>
> Doesn't this also imply that we need a vm_exit_controls_init() when
> returning to vmcs01, or vm_exit_controls_shadow will be stale after an
> emulated VM-exit?

Ah, vm_exit_controls_reset_shadow() does it. Okay.

Reviewed-by: Jim Mattson <jmattson@google.com>
Sean Christopherson Sept. 19, 2018, 9:20 p.m. UTC | #3
On Wed, Sep 19, 2018 at 02:00:04PM -0700, Jim Mattson wrote:
> On Tue, Aug 28, 2018 at 9:04 AM, Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> > Write VM_EXIT_CONTROLS using vm_exit_controls_init() when configuring
> > vmcs02, otherwise vm_exit_controls_shadow will be stale.  EFER in
> > particular can be corrupted if VM_EXIT_LOAD_IA32_EFER is not updated
> > due to an incorrect shadow optimization, which can crash L0 due to
> > EFER not being loaded on exit.  This does not occur with the current
> > code base simply because update_transition_efer() unconditionally
> > clears VM_EXIT_LOAD_IA32_EFER before conditionally setting it, and
> > because a nested guest always starts with VM_EXIT_LOAD_IA32_EFER
> > clear, i.e. we'll only ever unnecessarily clear the bit.  That is,
> > until someone optimizes update_transition_efer()...
> >
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> Doesn't this also imply that we need a vm_exit_controls_init() when
> returning to vmcs01, or vm_exit_controls_shadow will be stale after an
> emulated VM-exit?

Yep.  It's in the next patch, 04/18.  IIRC I put it in a separate patch
because there were multiple shadow/cache resets missing and resetting
the shadows seemed to be distinctly different from initializing the
exit controls shadows.
Krish Sadhukhan Sept. 20, 2018, 9:28 p.m. UTC | #4
On 08/28/2018 09:04 AM, Sean Christopherson wrote:
> Write VM_EXIT_CONTROLS using vm_exit_controls_init() when configuring
> vmcs02, otherwise vm_exit_controls_shadow will be stale.  EFER in
> particular can be corrupted if VM_EXIT_LOAD_IA32_EFER is not updated
> due to an incorrect shadow optimization, which can crash L0 due to
> EFER not being loaded on exit.  This does not occur with the current
> code base simply because update_transition_efer() unconditionally
> clears VM_EXIT_LOAD_IA32_EFER before conditionally setting it, and
> because a nested guest always starts with VM_EXIT_LOAD_IA32_EFER
> clear, i.e. we'll only ever unnecessarily clear the bit.  That is,
> until someone optimizes update_transition_efer()...
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>   arch/x86/kvm/vmx.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 6451e63847d9..b7aca0edeb59 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -12186,7 +12186,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
>   	 * we should use its exit controls. Note that VM_EXIT_LOAD_IA32_EFER
>   	 * bits are further modified by vmx_set_efer() below.
>   	 */
> -	vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
> +	vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);
>   
>   	/* vmcs12's VM_ENTRY_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE are
>   	 * emulated by vmx_set_efer(), below.

We call  vm_exit_controls_init()  in vmx_vcpu_setup(). Just wondering 
why that initialization doesn't persist through prepare_vmcs02().
Sean Christopherson Sept. 20, 2018, 10:07 p.m. UTC | #5
On Thu, Sep 20, 2018 at 02:28:28PM -0700, Krish Sadhukhan wrote:
> 
> 
> On 08/28/2018 09:04 AM, Sean Christopherson wrote:
> >Write VM_EXIT_CONTROLS using vm_exit_controls_init() when configuring
> >vmcs02, otherwise vm_exit_controls_shadow will be stale.  EFER in
> >particular can be corrupted if VM_EXIT_LOAD_IA32_EFER is not updated
> >due to an incorrect shadow optimization, which can crash L0 due to
> >EFER not being loaded on exit.  This does not occur with the current
> >code base simply because update_transition_efer() unconditionally
> >clears VM_EXIT_LOAD_IA32_EFER before conditionally setting it, and
> >because a nested guest always starts with VM_EXIT_LOAD_IA32_EFER
> >clear, i.e. we'll only ever unnecessarily clear the bit.  That is,
> >until someone optimizes update_transition_efer()...
> >
> >Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> >---
> >  arch/x86/kvm/vmx.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> >index 6451e63847d9..b7aca0edeb59 100644
> >--- a/arch/x86/kvm/vmx.c
> >+++ b/arch/x86/kvm/vmx.c
> >@@ -12186,7 +12186,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> >  	 * we should use its exit controls. Note that VM_EXIT_LOAD_IA32_EFER
> >  	 * bits are further modified by vmx_set_efer() below.
> >  	 */
> >-	vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
> >+	vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);
> >  	/* vmcs12's VM_ENTRY_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE are
> >  	 * emulated by vmx_set_efer(), below.
> 
> We call  vm_exit_controls_init()  in vmx_vcpu_setup(). Just wondering why
> that initialization doesn't persist through prepare_vmcs02().

vm_exit_controls_shadow, which is written by vm_exit_controls_init(),
is a shadow of the VM_EXIT_CONTROLS VMCS field.  The shadow is used to
elide VMREADs and VMWRITEs to VM_EXIT_CONTROLS, i.e. we don't need to
VMREAD the field when toggling a bit and we can skip VMWRITE if the
desired value matches the shadow (and therefore matches the VMCS).
vm_exit_controls_shadow needs to be kept in sync with the VMCS field
at all times, otherwise we might incorrectly skip a VMWRITE, e.g. if
VM_EXIT_LOAD_IA32_EFER bit is set in the shadow but not the actual
VMCS field.  When we switch to vmcs02 we need to re-initialize the
shadow because the shadow is per-VCPU, not per-VMCS.
Jim Mattson Sept. 20, 2018, 10:12 p.m. UTC | #6
On Thu, Sep 20, 2018 at 3:07 PM, Sean Christopherson
<sean.j.christopherson@intel.com> wrote:

> vm_exit_controls_shadow, which is written by vm_exit_controls_init(),
> is a shadow of the VM_EXIT_CONTROLS VMCS field.  The shadow is used to
> elide VMREADs and VMWRITEs to VM_EXIT_CONTROLS, i.e. we don't need to
> VMREAD the field when toggling a bit and we can skip VMWRITE if the
> desired value matches the shadow (and therefore matches the VMCS).
> vm_exit_controls_shadow needs to be kept in sync with the VMCS field
> at all times, otherwise we might incorrectly skip a VMWRITE, e.g. if
> VM_EXIT_LOAD_IA32_EFER bit is set in the shadow but not the actual
> VMCS field.  When we switch to vmcs02 we need to re-initialize the
> shadow because the shadow is per-VCPU, not per-VMCS.

Perhaps the shadow should be per-VMCS?
Sean Christopherson Sept. 20, 2018, 10:59 p.m. UTC | #7
On Thu, Sep 20, 2018 at 03:12:00PM -0700, Jim Mattson wrote:
> On Thu, Sep 20, 2018 at 3:07 PM, Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> 
> > vm_exit_controls_shadow, which is written by vm_exit_controls_init(),
> > is a shadow of the VM_EXIT_CONTROLS VMCS field.  The shadow is used to
> > elide VMREADs and VMWRITEs to VM_EXIT_CONTROLS, i.e. we don't need to
> > VMREAD the field when toggling a bit and we can skip VMWRITE if the
> > desired value matches the shadow (and therefore matches the VMCS).
> > vm_exit_controls_shadow needs to be kept in sync with the VMCS field
> > at all times, otherwise we might incorrectly skip a VMWRITE, e.g. if
> > VM_EXIT_LOAD_IA32_EFER bit is set in the shadow but not the actual
> > VMCS field.  When we switch to vmcs02 we need to re-initialize the
> > shadow because the shadow is per-VCPU, not per-VMCS.
> 
> Perhaps the shadow should be per-VMCS?

I don't think a per-VMCS shadow is justifiable with the current code
base.  There's effectively zero overhead when switching to vmcs02
since we unconditionally write the VMCS field during prepare_vmcs02().
When switching back to vmcs01 the overhead is a single VMREAD to
reset the shadow, which peanuts compared to the overall cost of nested
VMEnter/VMExit.

That being said, at some point I do want to experiment with per-VMCS
shadows.  The pie in the sky goal would be to get to the point where
a basic nested transition doesn't VMWRITE any vmcs02 control fields
so that we can take advantage of hardware's dirty field optimizations.

Patch
diff mbox series

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6451e63847d9..b7aca0edeb59 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -12186,7 +12186,7 @@  static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 	 * we should use its exit controls. Note that VM_EXIT_LOAD_IA32_EFER
 	 * bits are further modified by vmx_set_efer() below.
 	 */
-	vmcs_write32(VM_EXIT_CONTROLS, vmcs_config.vmexit_ctrl);
+	vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);
 
 	/* vmcs12's VM_ENTRY_LOAD_IA32_EFER and VM_ENTRY_IA32E_MODE are
 	 * emulated by vmx_set_efer(), below.