
[0/4] Process some MMIO-related errors without KVM exit

Message ID 20240923141810.76331-1-iorlov@amazon.com (mailing list archive)

Message

Ivan Orlov Sept. 23, 2024, 2:18 p.m. UTC
Currently, KVM may return a variety of internal errors to VMM when
accessing MMIO, and some of them could be gracefully handled on the KVM
level instead. Moreover, some of the MMIO-related errors are handled
differently in VMX in comparison with SVM, which produces certain
inconsistency and should be fixed. This patch series introduces
KVM-level handling for the following situations:

1) Guest is accessing MMIO during event delivery: triple fault instead
of internal error on VMX and infinite loop on SVM

2) Guest fetches an instruction from MMIO: inject #UD and resume guest
execution without internal error

Additionally, this patch series includes a KVM selftest which covers
different cases of MMIO misuse.

Also, update the set_memory_region_test to expect a triple fault when
starting a VM with no RAM.

Ivan Orlov (4):
  KVM: vmx, svm, mmu: Fix MMIO during event delivery handling
  KVM: x86: Inject UD when fetching from MMIO
  selftests: KVM: Change expected exit code in test_zero_memory_regions
  selftests: KVM: Add new test for faulty mmio usage

 arch/x86/include/asm/kvm_host.h               |   6 +
 arch/x86/kvm/emulate.c                        |   3 +
 arch/x86/kvm/kvm_emulate.h                    |   1 +
 arch/x86/kvm/mmu/mmu.c                        |  13 +-
 arch/x86/kvm/svm/svm.c                        |   4 +
 arch/x86/kvm/vmx/vmx.c                        |  21 +-
 arch/x86/kvm/x86.c                            |   7 +-
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/set_memory_region_test.c    |   3 +-
 .../selftests/kvm/x86_64/faulty_mmio.c        | 199 ++++++++++++++++++
 10 files changed, 242 insertions(+), 16 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/faulty_mmio.c

Comments

Sean Christopherson Sept. 23, 2024, 5:04 p.m. UTC | #1
On Mon, Sep 23, 2024, Ivan Orlov wrote:
> Currently, KVM may return a variety of internal errors to VMM when
> accessing MMIO, and some of them could be gracefully handled on the KVM
> level instead. Moreover, some of the MMIO-related errors are handled
> differently in VMX in comparison with SVM, which produces certain
> inconsistency and should be fixed. This patch series introduces
> KVM-level handling for the following situations:
> 
> 1) Guest is accessing MMIO during event delivery: triple fault instead
> of internal error on VMX and infinite loop on SVM
> 
> 2) Guest fetches an instruction from MMIO: inject #UD and resume guest
> execution without internal error

No.  This is not architectural behavior.  It's not even remotely close to
architectural behavior.  KVM's behavior isn't great, but making up _guest visible_
behavior is not going to happen.
Allister, Jack Sept. 23, 2024, 7:38 p.m. UTC | #2
On Mon, 2024-09-23 at 10:04 -0700, Sean Christopherson wrote:
> 
> On Mon, Sep 23, 2024, Ivan Orlov wrote:
> > Currently, KVM may return a variety of internal errors to VMM when
> > accessing MMIO, and some of them could be gracefully handled on the KVM
> > level instead. Moreover, some of the MMIO-related errors are handled
> > differently in VMX in comparison with SVM, which produces certain
> > inconsistency and should be fixed. This patch series introduces
> > KVM-level handling for the following situations:
> > 
> > 1) Guest is accessing MMIO during event delivery: triple fault instead
> > of internal error on VMX and infinite loop on SVM
> > 
> > 2) Guest fetches an instruction from MMIO: inject #UD and resume guest
> > execution without internal error
> 
> No.  This is not architectural behavior.  It's not even remotely close to
> architectural behavior.  KVM's behavior isn't great, but making up
> _guest visible_ behavior is not going to happen.

Is this a no to the whole series, or just to the cover letter?

For patch 1 we have observed that if a guest has incorrectly set its
IDT base to point inside an MMIO region, it will result in a triple
fault on bare metal (Intel Cascade Lake). Yes, a sane operating system
is not really going to set its IDT or GDT base to point into an MMIO
region, but we've seen occurrences, normally when other external things
have gone horribly wrong.

Ivan can clarify what's been seen on AMD platforms regarding the
infinite loop for patch 1. This was also tested on bare-metal
hardware. Injection of the #UD in patch 2 may be debatable, but I
believe Ivan has more data from experiments backing this up.

Best regards,
Jack