[v2,00/18] KVM: nVMX: add option to perform early consistency checks via H/W

Message ID 20180828160459.14093-1-sean.j.christopherson@intel.com (mailing list archive)

Sean Christopherson Aug. 28, 2018, 4:04 p.m. UTC
KVM currently defers many VMX consistency checks to the CPU, including
checks that result in VMFail (as opposed to VMExit).  This behavior
may be undesirable for some users since this means KVM detects certain
classes of VMFail only after it has processed guest state.  Because
there is a strict ordering between checks that cause VMFail and those
that cause VMExit, i.e. all VMFail checks are performed before any
checks that cause VMExit, we can detect all VMFail conditions via a
dry run of sorts.

The end goal of this series is to add an optional (param-controlled)
pre-run VMEnter into the nested_vmx_run() flow in order to perform
all VMFail consistency checks prior to actually running vmcs02.  By
itself, this is not a complex process, but getting KVM to a point
where the approach is viable requires a fair amount of refactoring,
e.g. to split prepare_vmcs02() so that there is a point where vmcs02
can pass the VMFail checks without first consuming guest state.
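As a rough illustration of the ordering argument above, the following is a toy user-space model, not kernel code. The names toy_vmcs, toy_vmenter() and toy_early_check_passes() are hypothetical, and the sketch assumes the dry run is made to stop at the VMExit-class checks by deliberately supplying invalid guest state, so that it can never actually enter the guest.

```c
#include <stdbool.h>

/* Possible outcomes of a (modelled) VMEnter attempt. */
enum entry_result { ENTRY_VMFAIL, ENTRY_VMEXIT, ENTRY_OK };

/* Hypothetical stand-in for a VMCS: one flag per class of check. */
struct toy_vmcs {
	bool controls_valid;    /* checks whose failure causes VMFail */
	bool guest_state_valid; /* checks whose failure causes VMExit */
};

/* Model of the strict ordering: all VMFail checks run before any VMExit check. */
static enum entry_result toy_vmenter(const struct toy_vmcs *v)
{
	if (!v->controls_valid)
		return ENTRY_VMFAIL;
	if (!v->guest_state_valid)
		return ENTRY_VMEXIT;
	return ENTRY_OK;
}

/*
 * Dry run: probe a copy whose guest state is forced invalid (an assumption
 * of this sketch).  Reaching the VMExit stage proves every VMFail check
 * passed, without consuming the real guest state.
 */
static bool toy_early_check_passes(const struct toy_vmcs *v)
{
	struct toy_vmcs probe = *v;

	probe.guest_state_valid = false;
	return toy_vmenter(&probe) == ENTRY_VMEXIT;
}
```

In this model the dry run reports failure only when a VMFail-class check fails; bad guest state alone does not affect its result.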

And while the goal (and subject) of this series is to enable early
consistency checks, the vast majority of the series deals with bug
fixes and cleanups in the nested VMX code.  During the refactoring
and testing, a number of pre-existing bugs, opportunities for code
cleanup and easy optimization points (which uncovered more bugs)
were encountered.

Ideally, these patches would be split into 3-4 separate series,
especially the bug fix patches.  I smushed everything into a single
series because the early VMEnter code breaks without the bug fixes
and the refactoring shuffles the same code, and some of the cleanup
and fixes are inter-dependent.

Patch Synopsis:
  1-4:   bug fixes
  5-6:   optimizations
  7:     function rename
  8:     bug fix and refactoring
  9-12:  refactoring
  13-14: bug fix
  15:    refactoring
  16:    optimization and prereq for early consistency checks
  17-18: early consistency checks

v1: https://www.spinics.net/lists/kvm/msg172795.html

v2:
  - rebased on tag kvm-4.19-2
  - added patch to skip instr in nested_vmx_{fail,succeed}

Sean Christopherson (18):
  KVM: nVMX: move host EFER consistency checks to VMFail path
  KVM: nVMX: move vmcs12 EPTP consistency check to
    check_vmentry_prereqs()
  KVM: nVMX: use vm_exit_controls_init() to write exit controls for
    vmcs02
  KVM: nVMX: reset cache/shadows on nested consistency check VMExit
  KVM: vmx: do not unconditionally clear EFER switching
  KVM: nVMX: try to set EFER bits correctly when init'ing entry controls
  KVM: nVMX: rename enter_vmx_non_root_mode to
    nested_vmx_enter_non_root_mode
  KVM: nVMX: move check_vmentry_postreqs() call to
    nested_vmx_enter_non_root_mode()
  KVM: nVMX: assimilate nested_vmx_entry_failure() into
    nested_vmx_enter_non_root_mode()
  KVM: nVMX: split pieces of prepare_vmcs02() to prepare_vmcs02_early()
  KVM: nVMX: do early preparation of vmcs02 before
    check_vmentry_postreqs()
  KVM: vVMX: rename label for post-enter_guest_mode consistency check
  KVM: nVMX: do not skip VMEnter instruction that succeeds
  KVM: nVMX: do not call nested_vmx_succeed() for consistency check
    VMExit
  KVM: nVMX: call kvm_skip_emulated_instruction in
    nested_vmx_{fail,succeed}
  KVM: vmx: write HOST_IA32_EFER in vmx_set_constant_host_state()
  KVM: nVMX: add option to perform early consistency checks via H/W
  KVM: nVMX: WARN if nested run hits VMFail with early consistency
    checks enabled

 arch/x86/kvm/vmx.c | 972 ++++++++++++++++++++++++++-------------------
 1 file changed, 556 insertions(+), 416 deletions(-)

Comments

Sean Christopherson Sept. 6, 2018, 3:23 p.m. UTC | #1
On Tue, Aug 28, 2018 at 09:04:41AM -0700, Sean Christopherson wrote:
> KVM currently defers many VMX consistency checks to the CPU, including
> checks that result in VMFail (as opposed to VMExit).  This behavior
> may be undesirable for some users since this means KVM detects certain
> classes of VMFail only after it has processed guest state.  Because
> there is a strict ordering between checks that cause VMFail and those
> that cause VMExit, i.e. all VMFail checks are performed before any
> checks that cause VMExit, we can detect all VMFail conditions via a
> dry run of sorts.
> 
> The end goal of this series is to add an optional (param-controlled)
> pre-run VMEnter into the nested_vmx_run() flow in order to perform
> all VMFail consistency checks prior to actually running vmcs02.  By
> itself, this is not a complex process, but getting KVM to a point
> where the approach is viable requires a fair amount of refactoring,
> e.g. to split prepare_vmcs02() so that there is a point where vmcs02
> can pass the VMFail checks without first consuming guest state.
> 
> And while the goal (and subject) of this series is to enable early
> consistency checks, the vast majority of the series deals with bug
> fixes and cleanups in the nested VMX code.  During the refactoring
> and testing, a number of pre-existing bugs, opportunities for code
> cleanup and easy optimization points (which uncovered more bugs)
> were encountered.
> 
> Ideally, these patches would be split into 3-4 separate series,
> especially the bug fix patches.  I smushed everything into a single
> series because the early VMEnter code breaks without the bug fixes
> and the refactoring shuffles the same code, and some of the cleanup
> and fixes are inter-dependent.
> 
> Patch Synopsis:
>   1-4:   bug fixes
>   5-6:   optimizations
>   7:     function rename
>   8:     bug fix and refactoring
>   9-12:  refactoring
>   13-14: bug fix
>   15:    refactoring
>   16:    optimization and prereq for early consistency checks
>   17-18: early consistency checks

Has anyone had a chance to look at any of these patches, especially
patches 1-16?  I'm fairly indifferent with regard to getting the early
consistency check stuff in mainline, but I'd like to get the bug fixes
and refactoring upstreamed sooner rather than later.  I can attempt to
break this into multiple series if bundling everything together is
unpalatable.
 
> v1: https://www.spinics.net/lists/kvm/msg172795.html
> 
> v2:
>   - rebased on tag kvm-4.19-2
>   - added patch to skip instr in nested_vmx_{fail,succeed}
> 
> Sean Christopherson (18):
>   KVM: nVMX: move host EFER consistency checks to VMFail path
>   KVM: nVMX: move vmcs12 EPTP consistency check to
>     check_vmentry_prereqs()
>   KVM: nVMX: use vm_exit_controls_init() to write exit controls for
>     vmcs02
>   KVM: nVMX: reset cache/shadows on nested consistency check VMExit
>   KVM: vmx: do not unconditionally clear EFER switching
>   KVM: nVMX: try to set EFER bits correctly when init'ing entry controls
>   KVM: nVMX: rename enter_vmx_non_root_mode to
>     nested_vmx_enter_non_root_mode
>   KVM: nVMX: move check_vmentry_postreqs() call to
>     nested_vmx_enter_non_root_mode()
>   KVM: nVMX: assimilate nested_vmx_entry_failure() into
>     nested_vmx_enter_non_root_mode()
>   KVM: nVMX: split pieces of prepare_vmcs02() to prepare_vmcs02_early()
>   KVM: nVMX: do early preparation of vmcs02 before
>     check_vmentry_postreqs()
>   KVM: vVMX: rename label for post-enter_guest_mode consistency check
>   KVM: nVMX: do not skip VMEnter instruction that succeeds
>   KVM: nVMX: do not call nested_vmx_succeed() for consistency check
>     VMExit
>   KVM: nVMX: call kvm_skip_emulated_instruction in
>     nested_vmx_{fail,succeed}
>   KVM: vmx: write HOST_IA32_EFER in vmx_set_constant_host_state()
>   KVM: nVMX: add option to perform early consistency checks via H/W
>   KVM: nVMX: WARN if nested run hits VMFail with early consistency
>     checks enabled
> 
>  arch/x86/kvm/vmx.c | 972 ++++++++++++++++++++++++++-------------------
>  1 file changed, 556 insertions(+), 416 deletions(-)
> 
> -- 
> 2.18.0
>

Paolo Bonzini Sept. 10, 2018, 11:44 a.m. UTC | #2
On 06/09/2018 17:23, Sean Christopherson wrote:
> Has anyone had a chance to look at any of these patches, especially
> patches 1-16?  I'm fairly indifferent with regard to getting the early
> consistency check stuff in mainline, but I'd like to get the bug fixes
> and refactoring upstreamed sooner rather than later.  I can attempt to
> break this into multiple series if bundling everything together is
> unpalatable.
>  

Back from vacation, I'll look at this series this week.  I find it a
very useful tool, and actually I'm thinking of adding a Kconfig symbol
to enable early_consistency_checks by default.  Distros may want to
enable it in their debug kernels.

Paolo

Sean Christopherson Sept. 10, 2018, 2:12 p.m. UTC | #3
On Mon, 2018-09-10 at 13:44 +0200, Paolo Bonzini wrote:
> On 06/09/2018 17:23, Sean Christopherson wrote:
> > 
> > Has anyone had a chance to look at any of these patches, especially
> > patches 1-16?  I'm fairly indifferent with regard to getting the early
> > consistency check stuff in mainline, but I'd like to get the bug fixes
> > and refactoring upstreamed sooner rather than later.  I can attempt to
> > break this into multiple series if bundling everything together is
> > unpalatable.
> >  
> Back from vacation, I'll look at this series this week.  I find it a
> very useful tool, and actually I'm thinking of adding a Kconfig symbol
> to enable early_consistency_checks by default.  Distros may want to
> enable it in their debug kernels.

Thanks!  I need to tweak patch 17/18[1], I'll wait to do that until
you've had a chance to review the rest of the series.

[1] Using GUEST_RIP to force the consistency check VMExit doesn't work
    on 32-bit since we can't write bits 63:32 (thankfully 0-Day caught
    this issue).  GUEST_RFLAGS is a good (and maybe better) alternative.
    I'll also enhance the changelog to explain the reasoning behind
    using GUEST_RFLAGS.
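
To make the 32-bit problem in [1] concrete, here is a small user-space sketch, not the patch itself. It assumes a natural-width field write on a 32-bit host keeps only bits 31:0, and it assumes the GUEST_RFLAGS approach relies on RFLAGS bit 1, which is architecturally fixed to 1. write_natural_width(), rflags_valid() and BAD_RIP_MODEL are hypothetical names for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "bad" RIP whose invalidity lives entirely in bits 63:32. */
#define BAD_RIP_MODEL     0x8000000000000000ull

/* RFLAGS bit 1 is reserved and must be 1. */
#define RFLAGS_FIXED_BIT1 (1ull << 1)

/* Model a natural-width field write: a 32-bit host keeps only bits 31:0. */
static uint64_t write_natural_width(uint64_t value, unsigned int host_bits)
{
	return host_bits == 32 ? (value & 0xffffffffull) : value;
}

/* Model of the reserved-bit rule for RFLAGS. */
static bool rflags_valid(uint64_t rflags)
{
	return (rflags & RFLAGS_FIXED_BIT1) != 0;
}
```

In this model the bad RIP value collapses to 0 on a 32-bit host and so loses the property that made it invalid, whereas an RFLAGS value of 0 fails the reserved-bit rule at either width.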