[00/43] KVM: x86: vCPU RESET/INIT fixes and consolidation

Message ID 20210424004645.3950558-1-seanjc@google.com (mailing list archive)

Message

Sean Christopherson April 24, 2021, 12:46 a.m. UTC
For the record, I went into this thinking it was going to be a simple code
shuffle between {svm,vmx}_vcpu_reset() and kvm_vcpu_reset().  The actual
goal is to consolidate the RESET/INIT code, both to deduplicate code and
to try to avoid divergent behavior/bugs, e.g. SVM only recently started
updating vcpu->arch.cr4 on INIT.

The TL;DR of why it takes 40+ patches to get there is that the RESET/INIT
flows have multiple latent bugs and hidden dependencies, but "work"
because they're rarely touched, are mostly fixed flows in both KVM and the
guest, and because guests don't sanity check state after INIT.

While several of the patches have Fixes tags, I am absolutely terrified of
backporting most of them due to the likelihood of breaking a different
version of KVM.  And, for the most part, the bugs are benign in the sense
that no guest has actually encountered any of them.  For that reason, I
intentionally omitted stable@ entirely.  The only patches I would consider
even remotely safe for backporting are the first two patches in the series.


Sean Christopherson (43):
  KVM: nVMX: Set LDTR to its architecturally defined value on nested
    VM-Exit
  KVM: VMX: Set EDX at INIT with CPUID.0x1, Family-Model-Stepping
  KVM: SVM: Require exact CPUID.0x1 match when stuffing EDX at INIT
  KVM: SVM: Fall back to KVM's hardcoded value for EDX at RESET/INIT
  KVM: x86: Split out CR0/CR4 MMU role change detectors to separate
    helpers
  KVM: x86: Properly reset MMU context at vCPU RESET/INIT
  KVM: VMX: Remove explicit MMU reset in enter_rmode()
  KVM: SVM: Drop explicit MMU reset at RESET/INIT
  KVM: SVM: Drop a redundant init_vmcb() from svm_create_vcpu()
  KVM: VMX: Move init_vmcs() invocation to vmx_vcpu_reset()
  KVM: x86: WARN if the APIC map is dirty without an in-kernel local
    APIC
  KVM: x86: Remove defunct BSP "update" in local APIC reset
  KVM: x86: Migrate the PIT only if vcpu0 is migrated, not any BSP
  KVM: x86: Don't force set BSP bit when local APIC is managed by
    userspace
  KVM: x86: Set BSP bit in reset BSP vCPU's APIC base by default
  KVM: VMX: Stuff vcpu->arch.apic_base directly at vCPU RESET
  KVM: x86: Open code necessary bits of kvm_lapic_set_base() at vCPU
    RESET
  KVM: x86: Consolidate APIC base RESET initialization code
  KVM: x86: Move EDX initialization at vCPU RESET to common code
  KVM: SVM: Don't bother writing vmcb->save.rip at vCPU RESET/INIT
  KVM: VMX: Invert handling of CR0.WP for EPT without unrestricted guest
  KVM: VMX: Remove direct write to vcpu->arch.cr0 during vCPU RESET/INIT
  KVM: VMX: Fold ept_update_paging_mode_cr0() back into vmx_set_cr0()
  KVM: nVMX: Do not clear CR3 load/store exiting bits if L1 wants 'em
  KVM: VMX: Pull GUEST_CR3 from the VMCS iff CR3 load exiting is
    disabled
  KVM: VMX: Process CR0.PG side effects after setting CR0 assets
  KVM: VMX: Skip emulation required checks during pmode/rmode
    transitions
  KVM: nVMX: Don't evaluate "emulation required" on VM-Exit
  KVM: SVM: Tweak order of cr0/cr4/efer writes at RESET/INIT
  KVM: SVM: Drop redundant writes to vmcb->save.cr4 at RESET/INIT
  KVM: SVM: Stuff save->dr6 at during VMSA sync, not at RESET/INIT
  KVM: VMX: Skip pointless MSR bitmap update when setting EFER
  KVM: VMX: Refresh list of user return MSRs after setting guest CPUID
  KVM: VMX: Don't _explicitly_ reconfigure user return MSRs on vCPU INIT
  KVM: x86: Move setting of sregs during vCPU RESET/INIT to common x86
  KVM: VMX: Remove obsolete MSR bitmap refresh at vCPU RESET/INIT
  KVM: nVMX: Remove obsolete MSR bitmap refresh at nested transitions
  KVM: VMX: Don't redo x2APIC MSR bitmaps when userspace filter is
    changed
  KVM: VMX: Remove unnecessary initialization of msr_bitmap_mode
  KVM: VMX: Smush x2APIC MSR bitmap adjustments into single function
  KVM: VMX: Remove redundant write to set vCPU as active at RESET/INIT
  KVM: VMX: Drop VMWRITEs to zero fields at vCPU RESET
  KVM: x86: Drop pointless @reset_roots from kvm_init_mmu()

 arch/x86/include/asm/kvm_host.h |   5 -
 arch/x86/kvm/i8254.c            |   3 +-
 arch/x86/kvm/lapic.c            |  26 +--
 arch/x86/kvm/mmu.h              |   2 +-
 arch/x86/kvm/mmu/mmu.c          |  13 +-
 arch/x86/kvm/svm/nested.c       |   2 +-
 arch/x86/kvm/svm/sev.c          |   1 +
 arch/x86/kvm/svm/svm.c          |  33 +---
 arch/x86/kvm/vmx/nested.c       |  26 ++-
 arch/x86/kvm/vmx/vmx.c          | 271 +++++++++++++-------------------
 arch/x86/kvm/vmx/vmx.h          |   5 +-
 arch/x86/kvm/x86.c              |  51 +++++-
 12 files changed, 189 insertions(+), 249 deletions(-)

Comments

Paolo Bonzini June 10, 2021, 4:54 p.m. UTC | #1
On 24/04/21 02:46, Sean Christopherson wrote:
> [...]

I'm waiting for a v2 of this; it applies with relatively few conflicts, 
but there were some comments so it's better if you take care of updating it.

Paolo

Sean Christopherson June 10, 2021, 7:22 p.m. UTC | #2
On Thu, Jun 10, 2021, Paolo Bonzini wrote:
> I'm waiting for a v2 of this; it applies with relatively few conflicts, but
> there were some comments so it's better if you take care of updating it.

Ya, slowly getting there...  Something in this series (I can't even remember what)
sent me into the morass that is unsync shadow pages and I've been thrashing around
in there for a while.