Message ID | 20220715230016.3762909-2-seanjc@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: x86/mmu: Memtype related cleanups |
On Fri, Jul 15, 2022 at 4:02 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Reject KVM if entry '0' in the host's IA32_PAT MSR is not programmed to
> writeback (WB) memtype. KVM subtly relies on IA32_PAT entry '0' to be
> programmed to WB by leaving the PAT bits in shadow paging and NPT SPTEs
> as '0'. If something other than WB is in PAT[0], at _best_ guests will
> suffer very poor performance, and at worst KVM will crash the system by
> breaking cache-coherency expectations (e.g. using WC for guest memory).
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---

What if someone changes the host's PAT to violate this rule *after*
kvm is loaded?
On Fri, Jul 15, 2022, Jim Mattson wrote:
> On Fri, Jul 15, 2022 at 4:02 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > Reject KVM if entry '0' in the host's IA32_PAT MSR is not programmed to
> > writeback (WB) memtype. KVM subtly relies on IA32_PAT entry '0' to be
> > programmed to WB by leaving the PAT bits in shadow paging and NPT SPTEs
> > as '0'. If something other than WB is in PAT[0], at _best_ guests will
> > suffer very poor performance, and at worst KVM will crash the system by
> > breaking cache-coherency expectations (e.g. using WC for guest memory).
> >
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> What if someone changes the host's PAT to violate this rule *after*
> kvm is loaded?

Then KVM (and probably many other things in the kernel) is hosed. The same argument
(that KVM isn't paranoid enough) can likely be made for a number of MSRs and critical
registers.
On Fri, 2022-07-15 at 23:18 +0000, Sean Christopherson wrote:
> On Fri, Jul 15, 2022, Jim Mattson wrote:
> > On Fri, Jul 15, 2022 at 4:02 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > Reject KVM if entry '0' in the host's IA32_PAT MSR is not programmed to
> > > writeback (WB) memtype. KVM subtly relies on IA32_PAT entry '0' to be
> > > programmed to WB by leaving the PAT bits in shadow paging and NPT SPTEs
> > > as '0'. If something other than WB is in PAT[0], at _best_ guests will
> > > suffer very poor performance, and at worst KVM will crash the system by
> > > breaking cache-coherency expectations (e.g. using WC for guest memory).
> > >
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > What if someone changes the host's PAT to violate this rule *after*
> > kvm is loaded?
>
> Then KVM (and probably many other things in the kernel) is hosed. The same argument
> (that KVM isn't paranoid enough) can likely be made for a number of MSRs and critical
> registers.
>

I was thinking about the same thing and I also 100% agree with the above.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f389691d8c04..12199c40f2bc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9141,6 +9141,7 @@ static struct notifier_block pvclock_gtod_notifier = {
 int kvm_arch_init(void *opaque)
 {
 	struct kvm_x86_init_ops *ops = opaque;
+	u64 host_pat;
 	int r;
 
 	if (kvm_x86_ops.hardware_enable) {
@@ -9179,6 +9180,20 @@ int kvm_arch_init(void *opaque)
 		goto out;
 	}
 
+	/*
+	 * KVM assumes that PAT entry '0' encodes WB memtype and simply zeroes
+	 * the PAT bits in SPTEs. Bail if PAT[0] is programmed to something
+	 * other than WB. Note, EPT doesn't utilize the PAT, but don't bother
+	 * with an exception. PAT[0] is set to WB on RESET and also by the
+	 * kernel, i.e. failure indicates a kernel bug or broken firmware.
+	 */
+	if (rdmsrl_safe(MSR_IA32_CR_PAT, &host_pat) ||
+	    (host_pat & GENMASK(2, 0)) != 6) {
+		pr_err("kvm: host PAT[0] is not WB\n");
+		r = -EIO;
+		goto out;
+	}
+
 	r = -ENOMEM;
 
 	x86_emulator_cache = kvm_alloc_emulator_cache();
Reject KVM if entry '0' in the host's IA32_PAT MSR is not programmed to
writeback (WB) memtype. KVM subtly relies on IA32_PAT entry '0' to be
programmed to WB by leaving the PAT bits in shadow paging and NPT SPTEs
as '0'. If something other than WB is in PAT[0], at _best_ guests will
suffer very poor performance, and at worst KVM will crash the system by
breaking cache-coherency expectations (e.g. using WC for guest memory).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)