KVM: x86: enable TDP MMU by default

Message ID 20210726163106.1433600-1-pbonzini@redhat.com (mailing list archive)
State New, archived
Series KVM: x86: enable TDP MMU by default

Commit Message

Paolo Bonzini July 26, 2021, 4:31 p.m. UTC
With the addition of fast page fault support, the TDP-specific MMU has reached
feature parity with the original MMU.  All my testing in the last few months
has been done with the TDP MMU; switch the default on 64-bit machines.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
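
For readers who want to confirm which default a given host actually ends up with, the existing kvm.tdp_mmu module parameter can be read back from sysfs. A minimal sketch, assuming tdp_mmu.c is built into kvm.ko as in mainline:

```
# Should typically report Y on a 64-bit host running a kernel with this change
cat /sys/module/kvm/parameters/tdp_mmu
```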

Comments

Ben Gardon July 26, 2021, 4:53 p.m. UTC | #1
On Mon, Jul 26, 2021 at 9:31 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> With the addition of fast page fault support, the TDP-specific MMU has reached
> feature parity with the original MMU.  All my testing in the last few months
> has been done with the TDP MMU; switch the default on 64-bit machines.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Yay! Thank you for your support in getting this merged and enabled,
Paolo, and thanks to David and Sean for helping upstream the code and
for so many reviews!
I'm sure this will provoke some bug reports, and I'll keep a close eye
on the mailing list to help address the issues quickly.
In the meantime,

Reviewed-by: Ben Gardon <bgardon@google.com>

> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index f86158d41af0..43f12f5d12c0 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -10,7 +10,7 @@
>  #include <asm/cmpxchg.h>
>  #include <trace/events/kvm.h>
>
> -static bool __read_mostly tdp_mmu_enabled = false;
> +static bool __read_mostly tdp_mmu_enabled = true;
>  module_param_named(tdp_mmu, tdp_mmu_enabled, bool, 0644);
>
>  /* Initializes the TDP MMU for the VM, if enabled. */
> --
> 2.27.0
>
Stoiko Ivanov July 26, 2022, 2:57 p.m. UTC | #2
Hi,

Proxmox[0] recently switched to the 5.15 kernel series (based on the one
for Ubuntu 22.04), which includes this commit. 
While it's working well on most installations, we have a few users who
reported that some of their guests shut down with
`KVM: entry failed, hardware error 0x80000021` being logged under certain
conditions and environments[1]:
* The issue is not deterministically reproducible, and only happens
  eventually with certain loads (e.g. we have only one system in our
  office which exhibits the issue - and this only by repeatedly installing
  Windows 2k22 ~ one out of 10 installs will cause the guest-crash)
* While most reports are referring to (newer) Windows guests, some users
  run into the issue with Linux VMs as well
* The affected systems are from a quite wide range - our affected machine
  is an old IvyBridge Xeon with outdated BIOS (an equivalent system with
  the latest available BIOS is not affected), but we have
  reports of all kinds of Intel CPUs (up to an i5-12400). It seems AMD CPUs
  are not affected.

Disabling tdp_mmu seems to mitigate the issue, but I still thought you
might want to know that in some cases tdp_mmu causes problems, or that you
even might have an idea of how to fix the issue without explicitly
disabling tdp_mmu?
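
For reference, a minimal sketch of how such a mitigation is typically applied, assuming the kvm.tdp_mmu parameter from the patch at the top of this thread (the exact mechanism may differ per distribution):

```
# Disable the TDP MMU at module load time via modprobe options ...
echo "options kvm tdp_mmu=N" > /etc/modprobe.d/kvm-tdp-mmu.conf

# ... or on the kernel command line
# kvm.tdp_mmu=N

# Verify the setting after reloading kvm / rebooting (N = disabled)
cat /sys/module/kvm/parameters/tdp_mmu
```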

While trying to find the cause, we also tested a 5.18 kernel (still affected).


The logs of the hypervisor after a guest crash:
```
Jun 24 17:25:51 testhost kernel: VMCS 000000006afb1754, last attempted VM-entry on CPU 12
Jun 24 17:25:51 testhost kernel: *** Guest State ***
Jun 24 17:25:51 testhost kernel: CR0: actual=0x0000000000050032, shadow=0x0000000000050032, gh_mask=fffffffffffffff7
Jun 24 17:25:51 testhost kernel: CR4: actual=0x0000000000002040, shadow=0x0000000000000000, gh_mask=fffffffffffef871
Jun 24 17:25:51 testhost kernel: CR3 = 0x000000013cbf4002
Jun 24 17:25:51 testhost kernel: PDPTR0 = 0x0000003300050011  PDPTR1 = 0x0000000000000000
Jun 24 17:25:51 testhost kernel: PDPTR2 = 0x0000000000000000  PDPTR3 = 0x0000010000000000
Jun 24 17:25:51 testhost kernel: RSP = 0xffff898cacda2c90  RIP = 0x0000000000008000
Jun 24 17:25:51 testhost kernel: RFLAGS=0x00000002         DR7 = 0x0000000000000400
Jun 24 17:25:51 testhost kernel: Sysenter RSP=0000000000000000 CS:RIP=0000:0000000000000000
Jun 24 17:25:51 testhost kernel: CS:   sel=0xc200, attr=0x08093, limit=0xffffffff, base=0x000000007ffc2000
Jun 24 17:25:51 testhost kernel: DS:   sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: SS:   sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: ES:   sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: FS:   sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: GS:   sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: GDTR:                           limit=0x00000057, base=0xfffff8024e652fb0
Jun 24 17:25:51 testhost kernel: LDTR: sel=0x0000, attr=0x10000, limit=0x000fffff, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: IDTR:                           limit=0x00000000, base=0x0000000000000000
Jun 24 17:25:51 testhost kernel: TR:   sel=0x0040, attr=0x0008b, limit=0x00000067, base=0xfffff8024e651000
Jun 24 17:25:51 testhost kernel: EFER= 0x0000000000000000
Jun 24 17:25:51 testhost kernel: PAT = 0x0007010600070106
Jun 24 17:25:51 testhost kernel: DebugCtl = 0x0000000000000000  DebugExceptions = 0x0000000000000000
Jun 24 17:25:51 testhost kernel: Interruptibility = 00000009  ActivityState = 00000000
Jun 24 17:25:51 testhost kernel: InterruptStatus = 002f
Jun 24 17:25:51 testhost kernel: *** Host State ***
Jun 24 17:25:51 testhost kernel: RIP = 0xffffffffc119a0a0  RSP = 0xffffa6a24a52bc20
Jun 24 17:25:51 testhost kernel: CS=0010 SS=0018 DS=0000 ES=0000 FS=0000 GS=0000 TR=0040
Jun 24 17:25:51 testhost kernel: FSBase=00007f1bf7fff700 GSBase=ffff97df5ed80000 TRBase=fffffe00002c7000
Jun 24 17:25:51 testhost kernel: GDTBase=fffffe00002c5000 IDTBase=fffffe0000000000
Jun 24 17:25:51 testhost kernel: CR0=0000000080050033 CR3=00000001226c8004 CR4=00000000001726e0
Jun 24 17:25:51 testhost kernel: Sysenter RSP=fffffe00002c7000 CS:RIP=0010:ffffffffbd201d90
Jun 24 17:25:51 testhost kernel: EFER= 0x0000000000000d01
Jun 24 17:25:51 testhost kernel: PAT = 0x0407050600070106
Jun 24 17:25:51 testhost kernel: *** Control State ***
Jun 24 17:25:51 testhost kernel: PinBased=000000ff CPUBased=b5a06dfa SecondaryExec=000007eb
Jun 24 17:25:51 testhost kernel: EntryControls=0000d1ff ExitControls=002befff
Jun 24 17:25:51 testhost kernel: ExceptionBitmap=00060042 PFECmask=00000000 PFECmatch=00000000
Jun 24 17:25:51 testhost kernel: VMEntry: intr_info=00000000 errcode=00000004 ilen=00000000
Jun 24 17:25:51 testhost kernel: VMExit: intr_info=00000000 errcode=00000000 ilen=00000001
Jun 24 17:25:51 testhost kernel:         reason=80000021 qualification=0000000000000000
Jun 24 17:25:51 testhost kernel: IDTVectoring: info=00000000 errcode=00000000
Jun 24 17:25:51 testhost kernel: TSC Offset = 0xff96fad07396b5f8
Jun 24 17:25:51 testhost kernel: SVI|RVI = 00|2f TPR Threshold = 0x00
Jun 24 17:25:51 testhost kernel: APIC-access addr = 0x000000014516c000 virt-APIC addr = 0x000000014afe7000
Jun 24 17:25:51 testhost kernel: PostedIntrVec = 0xf2
Jun 24 17:25:51 testhost kernel: EPT pointer = 0x000000011aa2d01e
Jun 24 17:25:51 testhost kernel: PLE Gap=00000080 Window=00020000
Jun 24 17:25:51 testhost kernel: Virtual processor ID = 0x0003
Jun 24 17:25:51 testhost QEMU[2997]: KVM: entry failed, hardware error 0x80000021
Jun 24 17:25:51 testhost QEMU[2997]: If you're running a guest on an Intel machine without unrestricted mode
Jun 24 17:25:51 testhost QEMU[2997]: support, the failure can be most likely due to the guest entering an invalid
Jun 24 17:25:51 testhost QEMU[2997]: state for Intel VT. For example, the guest maybe running in big real mode
Jun 24 17:25:51 testhost QEMU[2997]: which is not supported on less recent Intel processors.
Jun 24 17:25:51 testhost QEMU[2997]: EAX=00001e30 EBX=4e364180 ECX=00000001 EDX=00000000
Jun 24 17:25:51 testhost QEMU[2997]: ESI=df291040 EDI=e0d82080 EBP=acda2ea0 ESP=acda2c90
Jun 24 17:25:51 testhost QEMU[2997]: EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
Jun 24 17:25:51 testhost QEMU[2997]: ES =0000 00000000 ffffffff 00809300
Jun 24 17:25:51 testhost QEMU[2997]: CS =c200 7ffc2000 ffffffff 00809300
Jun 24 17:25:51 testhost QEMU[2997]: SS =0000 00000000 ffffffff 00809300
Jun 24 17:25:51 testhost QEMU[2997]: DS =0000 00000000 ffffffff 00809300
Jun 24 17:25:51 testhost QEMU[2997]: FS =0000 00000000 ffffffff 00809300
Jun 24 17:25:51 testhost QEMU[2997]: GS =0000 00000000 ffffffff 00809300
Jun 24 17:25:51 testhost QEMU[2997]: LDT=0000 00000000 000fffff 00000000
Jun 24 17:25:51 testhost QEMU[2997]: TR =0040 4e651000 00000067 00008b00
Jun 24 17:25:51 testhost QEMU[2997]: GDT=     4e652fb0 00000057
Jun 24 17:25:51 testhost QEMU[2997]: IDT=     00000000 00000000
Jun 24 17:25:51 testhost QEMU[2997]: CR0=00050032 CR2=826c6000 CR3=3cbf4002 CR4=00000000
Jun 24 17:25:51 testhost QEMU[2997]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
Jun 24 17:25:51 testhost QEMU[2997]: DR6=00000000ffff0ff0 DR7=0000000000000400
Jun 24 17:25:51 testhost QEMU[2997]: EFER=0000000000000000
Jun 24 17:25:51 testhost QEMU[2997]: Code=kvm: ../hw/core/cpu-sysemu.c:77: cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret >= 0' failed.

```

Should you need any further information from my side or want me to test
some potential fix - please don't hesitate to ask!

Kind Regards,
stoiko


[0] https://www.proxmox.com/
[1] https://forum.proxmox.com/threads/.109410

On Mon, 26 Jul 2021 12:31:06 -0400
Paolo Bonzini <pbonzini@redhat.com> wrote:

> With the addition of fast page fault support, the TDP-specific MMU has reached
> feature parity with the original MMU.  All my testing in the last few months
> has been done with the TDP MMU; switch the default on 64-bit machines.


> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ..snip..
Paolo Bonzini July 26, 2022, 3:43 p.m. UTC | #3
On 7/26/22 16:57, Stoiko Ivanov wrote:
> Hi,
> 
> Proxmox[0] recently switched to the 5.15 kernel series (based on the one
> for Ubuntu 22.04), which includes this commit.
> While it's working well on most installations, we have a few users who
> reported that some of their guests shutdown with
> `KVM: entry failed, hardware error 0x80000021` being logged under certain
> conditions and environments[1]:
> * The issue is not deterministically reproducible, and only happens
>    eventually with certain loads (e.g. we have only one system in our
>    office which exhibits the issue - and this only by repeatedly installing
>    Windows 2k22 ~ one out of 10 installs will cause the guest-crash)
> * While most reports are referring to (newer) Windows guests, some users
>    run into the issue with Linux VMs as well
> * The affected systems are from a quite wide range - our affected machine
>    is an old IvyBridge Xeon with outdated BIOS (an equivalent system with
>    the latest available BIOS is not affected), but we have
>    reports of all kind of Intel CPUs (up to an i5-12400). It seems AMD CPUs
>    are not affected.
> 
> Disabling tdp_mmu seems to mitigate the issue, but I still thought you
> might want to know that in some cases tdp_mmu causes problems, or that you
> even might have an idea of how to fix the issue without explicitly
> disabling tdp_mmu?

If you don't need secure boot, you can try disabling SMM.  It should not 
be related to TDP MMU, but the logs (thanks!) point at an SMM entry (RIP 
= 0x8000, CS base=0x7ffc2000).
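
A minimal sketch of what disabling SMM can look like at the VM level, assuming a QEMU q35 machine type (the exact Proxmox configuration knob may differ, and OVMF Secure Boot builds do require SMM):

```
# QEMU command line: start the guest with SMM turned off
qemu-system-x86_64 -machine q35,smm=off ...

# libvirt equivalent, under <features> in the domain XML:
#   <smm state='off'/>
```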

This is likely to be fixed by 
https://lore.kernel.org/kvm/20220621150902.46126-1-mlevitsk@redhat.com/.

Paolo
Maxim Levitsky July 27, 2022, 10:22 a.m. UTC | #4
On Tue, 2022-07-26 at 17:43 +0200, Paolo Bonzini wrote:
> On 7/26/22 16:57, Stoiko Ivanov wrote:
> > Hi,
> > 
> > Proxmox[0] recently switched to the 5.15 kernel series (based on the one
> > for Ubuntu 22.04), which includes this commit.
> > While it's working well on most installations, we have a few users who
> > reported that some of their guests shutdown with
> > `KVM: entry failed, hardware error 0x80000021` being logged under certain
> > conditions and environments[1]:
> > * The issue is not deterministically reproducible, and only happens
> >    eventually with certain loads (e.g. we have only one system in our
> >    office which exhibits the issue - and this only by repeatedly installing
> >    Windows 2k22 ~ one out of 10 installs will cause the guest-crash)
> > * While most reports are referring to (newer) Windows guests, some users
> >    run into the issue with Linux VMs as well
> > * The affected systems are from a quite wide range - our affected machine
> >    is an old IvyBridge Xeon with outdated BIOS (an equivalent system with
> >    the latest available BIOS is not affected), but we have
> >    reports of all kind of Intel CPUs (up to an i5-12400). It seems AMD CPUs
> >    are not affected.
> > 
> > Disabling tdp_mmu seems to mitigate the issue, but I still thought you
> > might want to know that in some cases tdp_mmu causes problems, or that you
> > even might have an idea of how to fix the issue without explicitly
> > disabling tdp_mmu?
> 
> If you don't need secure boot, you can try disabling SMM.  It should not 
> be related to TDP MMU, but the logs (thanks!) point at an SMM entry (RIP 
> = 0x8000, CS base=0x7ffc2000).

No doubt about it. It is the issue.

> 
> This is likely to be fixed by 
> https://lore.kernel.org/kvm/20220621150902.46126-1-mlevitsk@redhat.com/.


Speaking of my patch series, is there anything I should do to move it forward?

My approach of preserving the interrupt shadow in SMRAM doesn't seem to be accepted,
so what do you think I should do?

Best regards,
	Maxim Levitsky

> 
> Paolo
>
Stoiko Ivanov July 27, 2022, 1:31 p.m. UTC | #5
On Wed, 27 Jul 2022 13:22:48 +0300
Maxim Levitsky <mlevitsk@redhat.com> wrote:

> On Tue, 2022-07-26 at 17:43 +0200, Paolo Bonzini wrote:
> > On 7/26/22 16:57, Stoiko Ivanov wrote:  
> > > Hi,
> > > 
> > > Proxmox[0] recently switched to the 5.15 kernel series (based on the one
> > > for Ubuntu 22.04), which includes this commit.
> > > While it's working well on most installations, we have a few users who
> > > reported that some of their guests shutdown with
> > > `KVM: entry failed, hardware error 0x80000021` being logged under certain
> > > conditions and environments[1]:
> > > * The issue is not deterministically reproducible, and only happens
> > >    eventually with certain loads (e.g. we have only one system in our
> > >    office which exhibits the issue - and this only by repeatedly installing
> > >    Windows 2k22 ~ one out of 10 installs will cause the guest-crash)
> > > * While most reports are referring to (newer) Windows guests, some users
> > >    run into the issue with Linux VMs as well
> > > * The affected systems are from a quite wide range - our affected machine
> > >    is an old IvyBridge Xeon with outdated BIOS (an equivalent system with
> > >    the latest available BIOS is not affected), but we have
> > >    reports of all kind of Intel CPUs (up to an i5-12400). It seems AMD CPUs
> > >    are not affected.
> > > 
> > > Disabling tdp_mmu seems to mitigate the issue, but I still thought you
> > > might want to know that in some cases tdp_mmu causes problems, or that you
> > > even might have an idea of how to fix the issue without explicitly
> > > disabling tdp_mmu?  
> > 
> > If you don't need secure boot, you can try disabling SMM.  It should not 
> > be related to TDP MMU, but the logs (thanks!) point at an SMM entry (RIP 
> > = 0x8000, CS base=0x7ffc2000).  
> 
> No doubt about it. It is the issue.
> 
> > 
> > This is likely to be fixed by 
> > https://lore.kernel.org/kvm/20220621150902.46126-1-mlevitsk@redhat.com/.
Thanks to both of you for the quick feedback and the patches!

We ran our reproducer with the patch series above applied on top of
5.19-rc8 from
git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/kinetic:
* without the patches, the issue occurred within 20 minutes
* with the patches applied, the issue did not occur for 3 hours (it usually
  shows up within 1-2 hours at most)

So FWIW, it seems to fix the issue on our setup.
We'll do some more internal tests and then make this available
(backported to our 5.15 kernel) to the users who are affected by this.
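
A rough sketch of how such a test build can be put together, assuming the b4 tool is used to fetch Maxim's series by its message-id (branch/tag selection omitted):

```
# Fetch the Ubuntu kinetic kernel tree and apply the SMM series on top
git clone git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/kinetic
cd kinetic
b4 am 20220621150902.46126-1-mlevitsk@redhat.com  # downloads the series as an mbox
git am ./*.mbx                                    # apply it to the checked-out branch
```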

Kind regards,
stoiko

Patch

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index f86158d41af0..43f12f5d12c0 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -10,7 +10,7 @@ 
 #include <asm/cmpxchg.h>
 #include <trace/events/kvm.h>
 
-static bool __read_mostly tdp_mmu_enabled = false;
+static bool __read_mostly tdp_mmu_enabled = true;
 module_param_named(tdp_mmu, tdp_mmu_enabled, bool, 0644);
 
 /* Initializes the TDP MMU for the VM, if enabled. */