Message ID: 20200520172145.23284-22-pbonzini@redhat.com (mailing list archive)
State: New, archived
Series: [01/24] KVM: nSVM: fix condition for filtering async PF
On Wed, May 20, 2020 at 01:21:42PM -0400, Paolo Bonzini wrote:
> vmx_load_mmu_pgd is delaying the write of GUEST_CR3 to prepare_vmcs02 as
> an optimization, but this is only correct before the nested vmentry.
> If userspace is modifying CR3 with KVM_SET_SREGS after the VM has
> already been put in guest mode, the value of CR3 will not be updated.
> Remove the optimization, which almost never triggers anyway.
>
> This also applies to SVM, where the code was added in commit 689f3bf21628
> ("KVM: x86: unify callbacks to load paging root", 2020-03-16) just to keep
> the two vendor-specific modules closer.
>
> Fixes: 04f11ef45810 ("KVM: nVMX: Always write vmcs02.GUEST_CR3 during nested VM-Enter")
> Fixes: 689f3bf21628 ("KVM: x86: unify callbacks to load paging root")
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---

...

> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 55712dd86baf..7daf6a50e774 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -3085,10 +3085,7 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd)
> 		spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
> 	}
>
> -	/* Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter. */
> -	if (is_guest_mode(vcpu))
> -		update_guest_cr3 = false;
> -	else if (!enable_unrestricted_guest && !is_paging(vcpu))
> +	if (!enable_unrestricted_guest && !is_paging(vcpu))
> 		guest_cr3 = to_kvm_vmx(kvm)->ept_identity_map_addr;
> 	else if (test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))

As an alternative fix, what about marking VCPU_EXREG_CR3 dirty in
__set_sregs()?  E.g.

	/*
	 * Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter, but
	 * it can be explicitly dirtied by KVM_SET_SREGS.
	 */
	if (is_guest_mode(vcpu) &&
	    !test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_dirty))

There's already a dependency on __set_sregs() doing
kvm_register_mark_available() before kvm_mmu_reset_context(), i.e. the
code is already a bit kludgy.  The dirty check would make the kludge
less subtle and provide explicit documentation.

> 		guest_cr3 = vcpu->arch.cr3;

The comment that's just below the context is now stale, e.g. replace
vmcs01.GUEST_CR3 with vmcs.GUEST_CR3.

> --
> 2.18.2
Oh, and it'd be nice to do s/VMCB/VMCB\/VMCS/ in the subject, I almost
glossed over this patch because it explicitly said VMCB :-)
On 20/05/20 20:22, Sean Christopherson wrote:
> As an alternative fix, what about marking VCPU_EXREG_CR3 dirty in
> __set_sregs()?  E.g.
>
> 	/*
> 	 * Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter, but
> 	 * it can be explicitly dirtied by KVM_SET_SREGS.
> 	 */
> 	if (is_guest_mode(vcpu) &&
> 	    !test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_dirty))
>
> There's already a dependency on __set_sregs() doing
> kvm_register_mark_available() before kvm_mmu_reset_context(), i.e. the
> code is already a bit kludgy.  The dirty check would make the kludge
> less subtle and provide explicit documentation.

A comment in __set_sregs is certainly a good idea.  But checking for
dirty seems worse since the caching of CR3 is a bit special in this
respect (it's never marked dirty).

This patch should probably be split too, so that the Fixes tags are
separate for Intel and AMD.

Paolo

>> 		guest_cr3 = vcpu->arch.cr3;
>
> The comment that's just below the context is now stale, e.g. replace
> vmcs01.GUEST_CR3 with vmcs.GUEST_CR3.
>
>> --
>> 2.18.2
On Wed, May 20, 2020 at 10:14:47PM +0200, Paolo Bonzini wrote:
> On 20/05/20 20:22, Sean Christopherson wrote:
>> As an alternative fix, what about marking VCPU_EXREG_CR3 dirty in
>> __set_sregs()?  E.g.
>>
>> 	/*
>> 	 * Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter, but
>> 	 * it can be explicitly dirtied by KVM_SET_SREGS.
>> 	 */
>> 	if (is_guest_mode(vcpu) &&
>> 	    !test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_dirty))
>>
>> There's already a dependency on __set_sregs() doing
>> kvm_register_mark_available() before kvm_mmu_reset_context(), i.e. the
>> code is already a bit kludgy.  The dirty check would make the kludge
>> less subtle and provide explicit documentation.
>
> A comment in __set_sregs is certainly a good idea.  But checking for
> dirty seems worse since the caching of CR3 is a bit special in this
> respect (it's never marked dirty).

That's why I thought it was so clever :-)

> This patch should probably be split too, so that the Fixes tags are
> separate for Intel and AMD.

That would be nice.
On 23/05/20 00:47, Sean Christopherson wrote:
> On Wed, May 20, 2020 at 10:14:47PM +0200, Paolo Bonzini wrote:
>> This patch should probably be split too, so that the Fixes tags are
>> separate for Intel and AMD.
>
> That would be nice.

Will do.  Anyway this series will be quite different in v2, and there
will be a couple more changes to common code to avoid repeated calls to
kvm_cpu_has_injectable_intr (because on AMD I'd like to avoid
unnecessary calls to enable_irq_window, it is already complicated
enough without those).

Paolo
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 19b6a7c954e8..087a04ae74e4 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -260,11 +260,7 @@ static void nested_prepare_vmcb_save(struct vcpu_svm *svm, struct vmcb *nested_v
 	svm_set_efer(&svm->vcpu, nested_vmcb->save.efer);
 	svm_set_cr0(&svm->vcpu, nested_vmcb->save.cr0);
 	svm_set_cr4(&svm->vcpu, nested_vmcb->save.cr4);
-	if (npt_enabled) {
-		svm->vmcb->save.cr3 = nested_vmcb->save.cr3;
-		svm->vcpu.arch.cr3 = nested_vmcb->save.cr3;
-	} else
-		(void)kvm_set_cr3(&svm->vcpu, nested_vmcb->save.cr3);
+	(void)kvm_set_cr3(&svm->vcpu, nested_vmcb->save.cr3);
 
 	svm->vmcb->save.cr2 = svm->vcpu.arch.cr2 = nested_vmcb->save.cr2;
 	kvm_rax_write(&svm->vcpu, nested_vmcb->save.rax);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d8187d25fe04..56be704ffe95 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3465,7 +3465,6 @@ static fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long root)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
-	bool update_guest_cr3 = true;
 	unsigned long cr3;
 
 	cr3 = __sme_set(root);
@@ -3474,18 +3473,13 @@ static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long root)
 		mark_dirty(svm->vmcb, VMCB_NPT);
 
 		/* Loading L2's CR3 is handled by enter_svm_guest_mode. */
-		if (is_guest_mode(vcpu))
-			update_guest_cr3 = false;
-		else if (test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
-			cr3 = vcpu->arch.cr3;
-		else /* CR3 is already up-to-date. */
-			update_guest_cr3 = false;
+		if (!test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
+			return;
+		cr3 = vcpu->arch.cr3;
 	}
 
-	if (update_guest_cr3) {
-		svm->vmcb->save.cr3 = cr3;
-		mark_dirty(svm->vmcb, VMCB_CR);
-	}
+	svm->vmcb->save.cr3 = cr3;
+	mark_dirty(svm->vmcb, VMCB_CR);
 }
 
 static int is_disabled(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 55712dd86baf..7daf6a50e774 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3085,10 +3085,7 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd)
 		spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
 	}
 
-	/* Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter. */
-	if (is_guest_mode(vcpu))
-		update_guest_cr3 = false;
-	else if (!enable_unrestricted_guest && !is_paging(vcpu))
+	if (!enable_unrestricted_guest && !is_paging(vcpu))
 		guest_cr3 = to_kvm_vmx(kvm)->ept_identity_map_addr;
 	else if (test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
 		guest_cr3 = vcpu->arch.cr3;
vmx_load_mmu_pgd is delaying the write of GUEST_CR3 to prepare_vmcs02 as
an optimization, but this is only correct before the nested vmentry.
If userspace is modifying CR3 with KVM_SET_SREGS after the VM has
already been put in guest mode, the value of CR3 will not be updated.
Remove the optimization, which almost never triggers anyway.

This also applies to SVM, where the code was added in commit 689f3bf21628
("KVM: x86: unify callbacks to load paging root", 2020-03-16) just to keep
the two vendor-specific modules closer.

Fixes: 04f11ef45810 ("KVM: nVMX: Always write vmcs02.GUEST_CR3 during nested VM-Enter")
Fixes: 689f3bf21628 ("KVM: x86: unify callbacks to load paging root")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/nested.c |  6 +-----
 arch/x86/kvm/svm/svm.c    | 16 +++++-----------
 arch/x86/kvm/vmx/vmx.c    |  5 +----
 3 files changed, 7 insertions(+), 20 deletions(-)