[0/7] KVM: random nested fixes

Message ID 20210217145718.1217358-1-mlevitsk@redhat.com (mailing list archive)
Maxim Levitsky Feb. 17, 2021, 2:57 p.m. UTC
This is a set of mostly random fixes I have in my patch queue.

- Patches 1,2 are minor tracing fixes from a patch series I sent
  some time ago which I don't want to get lost in the noise.

- Patches 3,4 fix a theoretical bug in VMX with ept=0, and also allow the
  nested_vmx_load_cr3 call to be moved a bit, to make sure that the update to
  .inject_page_fault is not lost while entering a nested guest.

- Patch 5 fixes running nested guests with npt=0 on the host, which is
  sometimes useful for debugging (especially when debugging nesting).

- Patch 6 fixes the (mostly theoretical) issue with PDPTR loading on VMX after
  nested migration.

- Patch 7 is hopefully the correct fix to eliminate an L0 crash in some rare
  cases when a HyperV guest is migrated.

This was tested with kvm_unit_tests on both VMX and SVM,
both native and in a VM.
Some tests fail on VMX, but I haven't observed new tests failing
due to the changes.

This patch series was also tested by running my nested migration test with:
    1. npt/ept disabled on the host
    2. npt/ept enabled on the host and disabled in the L1
    3. npt/ept enabled on both.

In the case of npt/ept=0 on the host (on both Intel and AMD),
the L2 eventually crashed, but I strongly suspect a bug in the shadow mmu,
which I am tracking separately
(see the PS below for the full explanation).

This patch series is based on kvm/queue branch.

Best regards,
	Maxim Levitsky

PS: The shadow mmu bug which I spent most of this week on:

In my testing I am not able to boot win10 (without nesting, HyperV or
anything special) on either Intel or AMD without two dimensional paging
(ept/npt) enabled.
It always crashes in various ways during boot.

I found out (accidentally) that if I make KVM's shadow mmu not unsync last
level shadow pages, it starts working.
In addition, as I mentioned above, this bug can happen on Linux as well,
while stressing the shadow mmu with repeated migrations
(and again, with the same shadow unsync hack it just works).

While running without two dimensional paging is very obsolete by now, a
bug in the shadow mmu is still relevant to nesting, since nesting uses the
shadow mmu as well.

Maxim Levitsky (7):
  KVM: VMX: read idt_vectoring_info a bit earlier
  KVM: nSVM: move nested vmrun tracepoint to enter_svm_guest_mode
  KVM: x86: add .complete_mmu_init arch callback
  KVM: nVMX: move inject_page_fault tweak to .complete_mmu_init
  KVM: nSVM: fix running nested guests when npt=0
  KVM: nVMX: don't load PDPTRS right after nested state set
  KVM: nSVM: call nested_svm_load_cr3 on nested state load

 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  2 +
 arch/x86/kvm/mmu/mmu.c             |  2 +
 arch/x86/kvm/svm/nested.c          | 84 +++++++++++++++++++-----------
 arch/x86/kvm/svm/svm.c             |  9 ++++
 arch/x86/kvm/svm/svm.h             |  1 +
 arch/x86/kvm/vmx/nested.c          | 22 ++++----
 arch/x86/kvm/vmx/nested.h          |  1 +
 arch/x86/kvm/vmx/vmx.c             | 13 ++++-
 9 files changed, 92 insertions(+), 43 deletions(-)