[v3,00/20] KVM: nVMX: add option to perform early consistency checks via H/W

Message ID: 20180926162358.10741-1-sean.j.christopherson@intel.com

Message

Sean Christopherson Sept. 26, 2018, 4:23 p.m. UTC
KVM currently defers many VMX consistency checks to the CPU, including
checks that result in VMFail (as opposed to VMExit).  This behavior
may be undesirable for some users since this means KVM detects certain
classes of VMFail only after it has processed guest state.  Because
there is a strict ordering between checks that cause VMFail and those
that cause VMExit, i.e. all VMFail checks are performed before any
checks that cause VMExit, we can detect all VMFail conditions via a
dry run of sorts.
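
To illustrate the ordering argument, here is a minimal userspace sketch,
not KVM code.  Every name in it (toy_vmcs, toy_vmenter, toy_early_check,
toy_nested_run) is hypothetical, and the hardware is modeled as a plain
function.  The assumption is that the dry run deliberately poisons guest
state, so an entry attempt that gets as far as the guest-state checks
proves that every VMFail check has already passed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in KVM. */
enum entry_result {
	ENTRY_OK,             /* all checks passed, guest would run */
	ENTRY_VMFAIL,         /* a control or host-state check failed */
	ENTRY_VMEXIT_INVALID  /* a guest-state check failed */
};

struct toy_vmcs {
	bool controls_valid;
	bool host_state_valid;
	bool guest_state_valid;
};

/* Models the strict ordering: every VMFail check precedes any VMExit check. */
static enum entry_result toy_vmenter(const struct toy_vmcs *v)
{
	if (!v->controls_valid || !v->host_state_valid)
		return ENTRY_VMFAIL;
	if (!v->guest_state_valid)
		return ENTRY_VMEXIT_INVALID;
	return ENTRY_OK;
}

/*
 * Dry run on a copy with guest state forced invalid, so the entry can
 * never succeed.  Returns true if all VMFail checks passed.
 */
static bool toy_early_check(const struct toy_vmcs *v)
{
	struct toy_vmcs probe = *v;

	probe.guest_state_valid = false;
	return toy_vmenter(&probe) == ENTRY_VMEXIT_INVALID;
}

/* Optional early check followed by the real entry attempt. */
static enum entry_result toy_nested_run(const struct toy_vmcs *v,
					bool early_check)
{
	enum entry_result r;

	if (early_check && !toy_early_check(v))
		return ENTRY_VMFAIL;  /* reported before the real entry */

	r = toy_vmenter(v);
	/* With the early check enabled, a late VMFail would be a bug. */
	assert(!early_check || r != ENTRY_VMFAIL);
	return r;
}
```

In this model, calling toy_nested_run() with early_check set on a
toy_vmcs whose controls are invalid returns ENTRY_VMFAIL without ever
attempting the real entry.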

The end goal of this series is to add an optional (param-controlled)
pre-run VMEnter into the nested_vmx_run() flow in order to perform
all VMFail consistency checks prior to actually running vmcs02.  By
itself, this is not a complex process, but getting KVM to a point
where the approach is viable requires a fair amount of refactoring,
e.g. to split prepare_vmcs02() so that there is a point where vmcs02
can pass the VMFail checks without first consuming guest state.
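
As a rough illustration of why the split matters, the sketch below
contrasts a single-step preparation with a two-step one.  It is a toy
model under stated assumptions: toy_vcpu, toy_vmcs12 and the toy_*
helpers are hypothetical names, and "consuming guest state" is reduced
to setting a flag.

```c
#include <stdbool.h>

/* Hypothetical toy types; these are not KVM structures. */
struct toy_vcpu {
	bool controls_written;
	bool guest_state_consumed;
};

struct toy_vmcs12 {
	bool controls_ok;  /* would the VMFail checks pass? */
};

/* Early step: controls and host state only, no guest state touched. */
static void toy_prepare_early(struct toy_vcpu *vcpu)
{
	vcpu->controls_written = true;
}

/* Late step: loads guest state, which is awkward to undo afterwards. */
static void toy_prepare_late(struct toy_vcpu *vcpu)
{
	vcpu->guest_state_consumed = true;
}

/* Unsplit flow: VMFail is only seen after guest state was consumed. */
static bool toy_run_unsplit(struct toy_vcpu *vcpu,
			    const struct toy_vmcs12 *v12)
{
	toy_prepare_early(vcpu);
	toy_prepare_late(vcpu);
	return v12->controls_ok;
}

/* Split flow: VMFail is detected between the early and late steps. */
static bool toy_run_split(struct toy_vcpu *vcpu,
			  const struct toy_vmcs12 *v12)
{
	toy_prepare_early(vcpu);
	if (!v12->controls_ok)
		return false;  /* guest state still untouched */
	toy_prepare_late(vcpu);
	return true;
}
```

With a toy_vmcs12 whose controls_ok is false, both flows report
failure, but only the split flow leaves guest_state_consumed clear.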

And while the goal (and subject) of this series is to enable early
consistency checks, the vast majority of the series deals with bug
fixes and cleanups in the nested VMX code.  During the refactoring
and testing, a number of pre-existing bugs, opportunities for code
cleanup and easy optimization points (which uncovered more bugs)
were encountered.

Ideally, these patches would be split into 3-4 separate series,
especially the bug fix patches.  I smushed everything into a single
series because the early VMEnter code breaks without the bug fixes
and the refactoring shuffles the same code, and some of the cleanup
and fixes are inter-dependent.

Patch Synopsis:
  1-4:   bug fixes
  5-6:   optimizations
  7:     function rename
  8:     bug fix and refactoring
  9-12:  refactoring
  13:    optimization
  14:    refactoring
  15-16: bug fix
  17:    refactoring
  18:    optimization and prereq for early consistency checks
  19-20: early consistency checks

v1: https://www.spinics.net/lists/kvm/msg172795.html

v2:
  - rebased on tag kvm-4.19-2
  - added patch to skip instr in nested_vmx_{fail,succeed}

v3:
  - rebased on tag v4.19-rc1
  - use GUEST_RFLAGS to trigger failure during h/w checks [0-Day]
  - reset control shadows and seg cache in vmx_switch_vmcs() [Jim Mattson]
  - remove pml_pg ASSERT in dedicated patch [Jim Mattson]
  - add dedicated flag to track vmcs02 initialization [Jim Mattson]
  - use vmentry_fail_vmexit terminology [Jim Mattson]
  - fix a bug when moving check_vmentry_postreqs() [Jim Mattson]
  - introduce speculation of VM_EXIT_LOAD_IA32_EFER in correct patch [Jim Mattson]
  - remove unnecessary braces [Jim Mattson]
  
Sean Christopherson (20):
  KVM: nVMX: move host EFER consistency checks to VMFail path
  KVM: nVMX: move vmcs12 EPTP consistency check to
    check_vmentry_prereqs()
  KVM: nVMX: use vm_exit_controls_init() to write exit controls for
    vmcs02
  KVM: nVMX: reset cache/shadows when switching loaded VMCS
  KVM: vmx: do not unconditionally clear EFER switching
  KVM: nVMX: try to set EFER bits correctly when initializing controls
  KVM: nVMX: rename enter_vmx_non_root_mode to
    nested_vmx_enter_non_root_mode
  KVM: nVMX: move check_vmentry_postreqs() call to
    nested_vmx_enter_non_root_mode()
  KVM: nVMX: assimilate nested_vmx_entry_failure() into
    nested_vmx_enter_non_root_mode()
  KVM: vVMX: rename label for post-enter_guest_mode consistency check
  KVM: VMX: remove ASSERT() on vmx->pml_pg validity
  KVM: nVMX: split pieces of prepare_vmcs02() to prepare_vmcs02_early()
  KVM: nVMX: initialize vmcs02 constant exactly once (per VMCS)
  KVM: nVMX: do early preparation of vmcs02 before
    check_vmentry_postreqs()
  KVM: nVMX: do not skip VMEnter instruction that succeeds
  KVM: nVMX: do not call nested_vmx_succeed() for consistency check
    VMExit
  KVM: nVMX: call kvm_skip_emulated_instruction in
    nested_vmx_{fail,succeed}
  KVM: vmx: write HOST_IA32_EFER in vmx_set_constant_host_state()
  KVM: nVMX: add option to perform early consistency checks via H/W
  KVM: nVMX: WARN if nested run hits VMFail with early consistency
    checks enabled

 arch/x86/kvm/vmx.c | 1032 +++++++++++++++++++++++++-------------------
 1 file changed, 581 insertions(+), 451 deletions(-)

Comments

Paolo Bonzini Oct. 3, 2018, 4:38 p.m. UTC | #1
I've now finished rebasing it, but haven't tested it yet.  I made some
small changes to patch 19:

1) remove auto mode and default to off for now (we can always add back
auto mode if the defaults are changed)

2) rename the parameter to nested_early_check
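
For anyone who wants to confirm which value is in effect, a small
userspace sketch follows.  It assumes the parameter ends up as a
world-readable bool on the kvm_intel module, i.e. at
/sys/module/kvm_intel/parameters/nested_early_check; that path is an
assumption and is not stated in this thread.  The helper names are
hypothetical.  Bool module parameters read back from sysfs as "Y" or
"N".

```c
#include <stdio.h>

/*
 * Parse the text of a sysfs bool module parameter ("Y\n" or "N\n").
 * Returns 1 for enabled, 0 for disabled, -1 if unrecognized.
 */
static int parse_bool_param(const char *text)
{
	if (!text)
		return -1;

	switch (text[0]) {
	case 'Y': case 'y': case '1':
		return 1;
	case 'N': case 'n': case '0':
		return 0;
	default:
		return -1;
	}
}

/*
 * Read a bool parameter from the given sysfs path.  Returns -1 if the
 * file cannot be opened or read, e.g. when the module is not loaded.
 */
static int read_bool_param(const char *path)
{
	char buf[8] = "";
	FILE *f = fopen(path, "r");
	int val;

	if (!f)
		return -1;

	val = fgets(buf, sizeof(buf), f) ? parse_bool_param(buf) : -1;
	fclose(f);
	return val;
}
```

Under the assumption above, read_bool_param() on that sysfs path would
return 0 with the default of off.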

I'll push it tomorrow hopefully.

Paolo

Sean Christopherson Oct. 5, 2018, 8:14 p.m. UTC | #2
On Wed, Oct 03, 2018 at 06:38:21PM +0200, Paolo Bonzini wrote:
> I've now finished rebasing it, but haven't tested it yet.  I made some
> small changes to patch 19:
> 
> 1) remove auto mode and default to off for now (we can always add back
> auto mode if the defaults are changed)
> 
> 2) rename the parameter to nested_early_check

nested_early_check is a much better name :)

> I'll push it tomorrow hopefully.

The changelog still refers to the original name and auto behavior.
The last paragraph of the changelog can be stripped down to a single
sentence or removed altogether.

The addition of "#include <asm/hypervisor.h>" can also be removed,
it was added to support the auto behavior.  Let me know if you want
a patch.

Thanks!