mbox series

[v2,00/46] KVM: x86: vCPU RESET/INIT fixes and consolidation

Message ID 20210713163324.627647-1-seanjc@google.com (mailing list archive)
Headers show
Series KVM: x86: vCPU RESET/INIT fixes and consolidation | expand

Message

Sean Christopherson July 13, 2021, 4:32 p.m. UTC
The end goal of this series is to consolidate the RESET/INIT code, both to
deduplicate code and to try to avoid divergent behavior/bugs, e.g. SVM only
recently started updating vcpu->arch.cr4 on INIT.

The TL;DR of why it takes 40+ patches to get there is that the RESET/INIT
flows have multiple latent bugs and hidden dependencies, but "work"
because they're rarely touched, are mostly fixed flows in both KVM and the
guest, and because guests don't sanity check state after INIT.

While several of the patches have Fixes tags, I am absolutely terrified of
backporting most of them due to the likelihood of breaking a different
version of KVM.  And, for the most part the bugs are benign in the sense
no guest has actually encountered any of these bugs.  For that reason, I
intentionally omitted stable@ entirely.  The only patches I would consider
even remotely safe for backporting are the first four patches in the series.

v2:
  - Collect Reviews. [Reiji]
  - Fix an apic->base_address initialization goof. [Reiji]
  - Add patch to flush TLB on INIT. [Reiji]
  - Add patch to preserved CR0.CD/NW on INIT. [Reiji]
  - Add patch to emulate #INIT after shutdown on SVM. [Reiji]
  - Add patch to consolidate arch.hflags code. [Reiji]
  - Drop patch to omit VMWRITE zeroing. [Paolo, Jim]
  - Drop several MMU patches (moved to other series).

v1: https://lkml.kernel.org/r/20210424004645.3950558-1-seanjc@google.com

Sean Christopherson (46):
  KVM: x86: Flush the guest's TLB on INIT
  KVM: nVMX: Set LDTR to its architecturally defined value on nested
    VM-Exit
  KVM: SVM: Zero out GDTR.base and IDTR.base on INIT
  KVM: VMX: Set EDX at INIT with CPUID.0x1, Family-Model-Stepping
  KVM: SVM: Require exact CPUID.0x1 match when stuffing EDX at INIT
  KVM: SVM: Fall back to KVM's hardcoded value for EDX at RESET/INIT
  KVM: VMX: Remove explicit MMU reset in enter_rmode()
  KVM: SVM: Drop explicit MMU reset at RESET/INIT
  KVM: SVM: Drop a redundant init_vmcb() from svm_create_vcpu()
  KVM: VMX: Move init_vmcs() invocation to vmx_vcpu_reset()
  KVM: x86: WARN if the APIC map is dirty without an in-kernel local
    APIC
  KVM: x86: Remove defunct BSP "update" in local APIC reset
  KVM: x86: Migrate the PIT only if vcpu0 is migrated, not any BSP
  KVM: x86: Don't force set BSP bit when local APIC is managed by
    userspace
  KVM: x86: Set BSP bit in reset BSP vCPU's APIC base by default
  KVM: VMX: Stuff vcpu->arch.apic_base directly at vCPU RESET
  KVM: x86: Open code necessary bits of kvm_lapic_set_base() at vCPU
    RESET
  KVM: x86: Consolidate APIC base RESET initialization code
  KVM: x86: Move EDX initialization at vCPU RESET to common code
  KVM: SVM: Don't bother writing vmcb->save.rip at vCPU RESET/INIT
  KVM: VMX: Invert handling of CR0.WP for EPT without unrestricted guest
  KVM: VMX: Remove direct write to vcpu->arch.cr0 during vCPU RESET/INIT
  KVM: VMX: Fold ept_update_paging_mode_cr0() back into vmx_set_cr0()
  KVM: nVMX: Do not clear CR3 load/store exiting bits if L1 wants 'em
  KVM: VMX: Pull GUEST_CR3 from the VMCS iff CR3 load exiting is
    disabled
  KVM: x86/mmu: Skip the permission_fault() check on MMIO if CR0.PG=0
  KVM: VMX: Process CR0.PG side effects after setting CR0 assets
  KVM: VMX: Skip emulation required checks during pmode/rmode
    transitions
  KVM: nVMX: Don't evaluate "emulation required" on nested VM-Exit
  KVM: SVM: Tweak order of cr0/cr4/efer writes at RESET/INIT
  KVM: SVM: Drop redundant writes to vmcb->save.cr4 at RESET/INIT
  KVM: SVM: Stuff save->dr6 at during VMSA sync, not at RESET/INIT
  KVM: VMX: Skip pointless MSR bitmap update when setting EFER
  KVM: VMX: Refresh list of user return MSRs after setting guest CPUID
  KVM: VMX: Don't _explicitly_ reconfigure user return MSRs on vCPU INIT
  KVM: x86: Move setting of sregs during vCPU RESET/INIT to common x86
  KVM: VMX: Remove obsolete MSR bitmap refresh at vCPU RESET/INIT
  KVM: nVMX: Remove obsolete MSR bitmap refresh at nested transitions
  KVM: VMX: Don't redo x2APIC MSR bitmaps when userspace filter is
    changed
  KVM: VMX: Remove unnecessary initialization of msr_bitmap_mode
  KVM: VMX: Smush x2APIC MSR bitmap adjustments into single function
  KVM: VMX: Remove redundant write to set vCPU as active at RESET/INIT
  KVM: VMX: Move RESET-only VMWRITE sequences to init_vmcs()
  KVM: SVM: Emulate #INIT in response to triple fault shutdown
  KVM: SVM: Drop redundant clearing of vcpu->arch.hflags at INIT/RESET
  KVM: x86: Preserve guest's CR0.CD/NW on INIT

 arch/x86/include/asm/kvm_host.h |   5 -
 arch/x86/kvm/i8254.c            |   3 +-
 arch/x86/kvm/lapic.c            |  26 +--
 arch/x86/kvm/svm/sev.c          |   1 +
 arch/x86/kvm/svm/svm.c          |  48 ++----
 arch/x86/kvm/vmx/nested.c       |  24 ++-
 arch/x86/kvm/vmx/vmx.c          | 270 +++++++++++++++-----------------
 arch/x86/kvm/vmx/vmx.h          |   5 +-
 arch/x86/kvm/x86.c              |  52 +++++-
 9 files changed, 211 insertions(+), 223 deletions(-)

Comments

Paolo Bonzini July 26, 2021, 9:12 p.m. UTC | #1
On 13/07/21 18:32, Sean Christopherson wrote:
> The end goal of this series is to consolidate the RESET/INIT code, both to
> deduplicate code and to try to avoid divergent behavior/bugs, e.g. SVM only
> recently started updating vcpu->arch.cr4 on INIT.
> 
> The TL;DR of why it takes 40+ patches to get there is that the RESET/INIT
> flows have multiple latent bugs and hidden dependencies, but "work"
> because they're rarely touched, are mostly fixed flows in both KVM and the
> guest, and because guests don't sanity check state after INIT.
> 
> While several of the patches have Fixes tags, I am absolutely terrified of
> backporting most of them due to the likelihood of breaking a different
> version of KVM.  And, for the most part the bugs are benign in the sense
> no guest has actually encountered any of these bugs.  For that reason, I
> intentionally omitted stable@ entirely.  The only patches I would consider
> even remotely safe for backporting are the first four patches in the series.
> 
> v2:
>    - Collect Reviews. [Reiji]
>    - Fix an apic->base_address initialization goof. [Reiji]
>    - Add patch to flush TLB on INIT. [Reiji]
>    - Add patch to preserved CR0.CD/NW on INIT. [Reiji]
>    - Add patch to emulate #INIT after shutdown on SVM. [Reiji]
>    - Add patch to consolidate arch.hflags code. [Reiji]
>    - Drop patch to omit VMWRITE zeroing. [Paolo, Jim]
>    - Drop several MMU patches (moved to other series).
> 
> v1: https://lkml.kernel.org/r/20210424004645.3950558-1-seanjc@google.com
> 
> Sean Christopherson (46):
>    KVM: x86: Flush the guest's TLB on INIT
>    KVM: nVMX: Set LDTR to its architecturally defined value on nested
>      VM-Exit
>    KVM: SVM: Zero out GDTR.base and IDTR.base on INIT
>    KVM: VMX: Set EDX at INIT with CPUID.0x1, Family-Model-Stepping
>    KVM: SVM: Require exact CPUID.0x1 match when stuffing EDX at INIT
>    KVM: SVM: Fall back to KVM's hardcoded value for EDX at RESET/INIT
>    KVM: VMX: Remove explicit MMU reset in enter_rmode()
>    KVM: SVM: Drop explicit MMU reset at RESET/INIT
>    KVM: SVM: Drop a redundant init_vmcb() from svm_create_vcpu()
>    KVM: VMX: Move init_vmcs() invocation to vmx_vcpu_reset()
>    KVM: x86: WARN if the APIC map is dirty without an in-kernel local
>      APIC
>    KVM: x86: Remove defunct BSP "update" in local APIC reset
>    KVM: x86: Migrate the PIT only if vcpu0 is migrated, not any BSP
>    KVM: x86: Don't force set BSP bit when local APIC is managed by
>      userspace
>    KVM: x86: Set BSP bit in reset BSP vCPU's APIC base by default
>    KVM: VMX: Stuff vcpu->arch.apic_base directly at vCPU RESET
>    KVM: x86: Open code necessary bits of kvm_lapic_set_base() at vCPU
>      RESET
>    KVM: x86: Consolidate APIC base RESET initialization code
>    KVM: x86: Move EDX initialization at vCPU RESET to common code
>    KVM: SVM: Don't bother writing vmcb->save.rip at vCPU RESET/INIT
>    KVM: VMX: Invert handling of CR0.WP for EPT without unrestricted guest
>    KVM: VMX: Remove direct write to vcpu->arch.cr0 during vCPU RESET/INIT
>    KVM: VMX: Fold ept_update_paging_mode_cr0() back into vmx_set_cr0()
>    KVM: nVMX: Do not clear CR3 load/store exiting bits if L1 wants 'em
>    KVM: VMX: Pull GUEST_CR3 from the VMCS iff CR3 load exiting is
>      disabled
>    KVM: x86/mmu: Skip the permission_fault() check on MMIO if CR0.PG=0
>    KVM: VMX: Process CR0.PG side effects after setting CR0 assets
>    KVM: VMX: Skip emulation required checks during pmode/rmode
>      transitions
>    KVM: nVMX: Don't evaluate "emulation required" on nested VM-Exit
>    KVM: SVM: Tweak order of cr0/cr4/efer writes at RESET/INIT
>    KVM: SVM: Drop redundant writes to vmcb->save.cr4 at RESET/INIT
>    KVM: SVM: Stuff save->dr6 at during VMSA sync, not at RESET/INIT
>    KVM: VMX: Skip pointless MSR bitmap update when setting EFER
>    KVM: VMX: Refresh list of user return MSRs after setting guest CPUID
>    KVM: VMX: Don't _explicitly_ reconfigure user return MSRs on vCPU INIT
>    KVM: x86: Move setting of sregs during vCPU RESET/INIT to common x86
>    KVM: VMX: Remove obsolete MSR bitmap refresh at vCPU RESET/INIT
>    KVM: nVMX: Remove obsolete MSR bitmap refresh at nested transitions
>    KVM: VMX: Don't redo x2APIC MSR bitmaps when userspace filter is
>      changed
>    KVM: VMX: Remove unnecessary initialization of msr_bitmap_mode
>    KVM: VMX: Smush x2APIC MSR bitmap adjustments into single function
>    KVM: VMX: Remove redundant write to set vCPU as active at RESET/INIT
>    KVM: VMX: Move RESET-only VMWRITE sequences to init_vmcs()
>    KVM: SVM: Emulate #INIT in response to triple fault shutdown
>    KVM: SVM: Drop redundant clearing of vcpu->arch.hflags at INIT/RESET
>    KVM: x86: Preserve guest's CR0.CD/NW on INIT
> 
>   arch/x86/include/asm/kvm_host.h |   5 -
>   arch/x86/kvm/i8254.c            |   3 +-
>   arch/x86/kvm/lapic.c            |  26 +--
>   arch/x86/kvm/svm/sev.c          |   1 +
>   arch/x86/kvm/svm/svm.c          |  48 ++----
>   arch/x86/kvm/vmx/nested.c       |  24 ++-
>   arch/x86/kvm/vmx/vmx.c          | 270 +++++++++++++++-----------------
>   arch/x86/kvm/vmx/vmx.h          |   5 +-
>   arch/x86/kvm/x86.c              |  52 +++++-
>   9 files changed, 211 insertions(+), 223 deletions(-)
> 

Queued, except for patches 9-10.

I'd rather have init_vmcb/svm_vcpu_reset look more like the VMX code, 
with the INIT code moved to svm_vcpu_reset and the rest remaining in 
init_vmcb.

Paolo