[21/24] KVM: x86: always update CR3 in VMCB

Message ID: 20200520172145.23284-22-pbonzini@redhat.com
State: New, archived
Series: [01/24] KVM: nSVM: fix condition for filtering async PF

Commit Message

Paolo Bonzini May 20, 2020, 5:21 p.m. UTC
vmx_load_mmu_pgd delays the write of GUEST_CR3 until prepare_vmcs02 as
an optimization, but this is only correct before the nested vmentry.
If userspace modifies CR3 with KVM_SET_SREGS after the VM has already
been put in guest mode, the value of CR3 will not be updated.  Remove
the optimization, which almost never triggers anyway.

This also applies to SVM, where the code was added in commit 689f3bf21628
("KVM: x86: unify callbacks to load paging root", 2020-03-16) just to keep the
two vendor-specific modules closer.

Fixes: 04f11ef45810 ("KVM: nVMX: Always write vmcs02.GUEST_CR3 during nested VM-Enter")
Fixes: 689f3bf21628 ("KVM: x86: unify callbacks to load paging root")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/nested.c |  6 +-----
 arch/x86/kvm/svm/svm.c    | 16 +++++-----------
 arch/x86/kvm/vmx/vmx.c    |  5 +----
 3 files changed, 7 insertions(+), 20 deletions(-)
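
For illustration, here is a minimal, hypothetical userspace sketch of the
failure mode described above; poke_cr3() and the vcpu_fd setup are
assumptions made for the example, while KVM_GET_SREGS/KVM_SET_SREGS are
the actual ioctls involved:

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Assumes vcpu_fd is an initialized KVM vCPU fd already in guest mode (L2). */
static int poke_cr3(int vcpu_fd, unsigned long new_cr3)
{
	struct kvm_sregs sregs;

	if (ioctl(vcpu_fd, KVM_GET_SREGS, &sregs) < 0)
		return -1;

	/* e.g. a post-migration fixup performed while the vCPU is in L2 */
	sregs.cr3 = new_cr3;

	/*
	 * __set_sregs() stores new_cr3 in vcpu->arch.cr3 and marks
	 * VCPU_EXREG_CR3 available; before this patch, load_mmu_pgd
	 * skipped the hardware update for a vCPU in guest mode, so
	 * vmcs02.GUEST_CR3 / vmcb->save.cr3 kept the stale value.
	 */
	return ioctl(vcpu_fd, KVM_SET_SREGS, &sregs);
}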

Comments

Sean Christopherson May 20, 2020, 6:22 p.m. UTC | #1
On Wed, May 20, 2020 at 01:21:42PM -0400, Paolo Bonzini wrote:
> vmx_load_mmu_pgd delays the write of GUEST_CR3 until prepare_vmcs02 as
> an optimization, but this is only correct before the nested vmentry.
> If userspace modifies CR3 with KVM_SET_SREGS after the VM has already
> been put in guest mode, the value of CR3 will not be updated.  Remove
> the optimization, which almost never triggers anyway.
> 
> This also applies to SVM, where the code was added in commit 689f3bf21628
> ("KVM: x86: unify callbacks to load paging root", 2020-03-16) just to keep the
> two vendor-specific modules closer.
> 
> Fixes: 04f11ef45810 ("KVM: nVMX: Always write vmcs02.GUEST_CR3 during nested VM-Enter")
> Fixes: 689f3bf21628 ("KVM: x86: unify callbacks to load paging root")
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---

...

> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 55712dd86baf..7daf6a50e774 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -3085,10 +3085,7 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd)
>  			spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
>  		}
>  
> -		/* Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter. */
> -		if (is_guest_mode(vcpu))
> -			update_guest_cr3 = false;
> -		else if (!enable_unrestricted_guest && !is_paging(vcpu))
> +		if (!enable_unrestricted_guest && !is_paging(vcpu))
>  			guest_cr3 = to_kvm_vmx(kvm)->ept_identity_map_addr;
>  		else if (test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))

As an alternative fix, what about marking VCPU_EXREG_CR3 dirty in
__set_sregs()?  E.g.

		/*
		 * Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter, but
		 * it can be explicitly dirtied by KVM_SET_SREGS.
		 */
		if (is_guest_mode(vcpu) &&
		    !test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_dirty))

There's already a dependency on __set_sregs() doing
kvm_register_mark_available() before kvm_mmu_reset_context(), i.e. the
code is already a bit kludgy.  The dirty check would make the kludge less
subtle and provide explicit documentation.
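
For concreteness, the __set_sregs() half of that alternative could be a
one-line change, assuming the kvm_register_mark_dirty() helper from
kvm_cache_regs.h (a sketch of the suggestion, not what this patch does):

	vcpu->arch.cr3 = sregs->cr3;
	/* was kvm_register_mark_available(vcpu, VCPU_EXREG_CR3); */
	kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);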

>  			guest_cr3 = vcpu->arch.cr3;

The comment that's just below the context is now stale, e.g. replace
vmcs01.GUEST_CR3 with vmcs.GUEST_CR3.

> -- 
> 2.18.2
> 
>
Sean Christopherson May 20, 2020, 6:24 p.m. UTC | #2
Oh, and it'd be nice to do s/VMCB/VMCB\/VMCS/ in the subject; I almost
glossed over this patch because it explicitly said VMCB :-)

On Wed, May 20, 2020 at 01:21:42PM -0400, Paolo Bonzini wrote:
> vmx_load_mmu_pgd delays the write of GUEST_CR3 until prepare_vmcs02 as
> an optimization, but this is only correct before the nested vmentry.
> If userspace modifies CR3 with KVM_SET_SREGS after the VM has already
> been put in guest mode, the value of CR3 will not be updated.  Remove
> the optimization, which almost never triggers anyway.
> 
> This also applies to SVM, where the code was added in commit 689f3bf21628
> ("KVM: x86: unify callbacks to load paging root", 2020-03-16) just to keep the
> two vendor-specific modules closer.
> 
> Fixes: 04f11ef45810 ("KVM: nVMX: Always write vmcs02.GUEST_CR3 during nested VM-Enter")
> Fixes: 689f3bf21628 ("KVM: x86: unify callbacks to load paging root")
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
Paolo Bonzini May 20, 2020, 8:14 p.m. UTC | #3
On 20/05/20 20:22, Sean Christopherson wrote:
> As an alternative fix, what about marking VCPU_EXREG_CR3 dirty in
> __set_sregs()?  E.g.
> 
> 		/*
> 		 * Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter, but
> 		 * it can be explicitly dirtied by KVM_SET_SREGS.
> 		 */
> 		if (is_guest_mode(vcpu) &&
> 		    !test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_dirty))
> 
> There's already a dependency on __set_sregs() doing
> kvm_register_mark_available() before kvm_mmu_reset_context(), i.e. the
> code is already a bit kludgy.  The dirty check would make the kludge less
> subtle and provide explicit documentation.

A comment in __set_sregs is certainly a good idea.  But checking for
dirty seems worse since the caching of CR3 is a bit special in this
respect (it's never marked dirty).
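
For reference, the distinction being drawn is between the two helpers in
kvm_cache_regs.h, whose shape is roughly the following (from memory, so
check the tree): "available" means vcpu->arch.cr3 mirrors the hardware
state, while "dirty" would mean hardware must be refreshed from the
cache, a state CR3 never enters today.

static inline void kvm_register_mark_available(struct kvm_vcpu *vcpu,
					       enum kvm_reg reg)
{
	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
}

static inline void kvm_register_mark_dirty(struct kvm_vcpu *vcpu,
					   enum kvm_reg reg)
{
	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_dirty);
}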

This patch should probably be split too, so that the Fixes tags are
separate for Intel and AMD.

Paolo

>>  			guest_cr3 = vcpu->arch.cr3;
> 
> The comment that's just below the context is now stale, e.g. replace
> vmcs01.GUEST_CR3 with vmcs.GUEST_CR3.
> 
>> -- 
>> 2.18.2
>>
>>
>
Sean Christopherson May 22, 2020, 10:47 p.m. UTC | #4
On Wed, May 20, 2020 at 10:14:47PM +0200, Paolo Bonzini wrote:
> On 20/05/20 20:22, Sean Christopherson wrote:
> > As an alternative fix, what about marking VCPU_EXREG_CR3 dirty in
> > __set_sregs()?  E.g.
> > 
> > 		/*
> > 		 * Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter, but
> > 		 * it can be explicitly dirtied by KVM_SET_SREGS.
> > 		 */
> > 		if (is_guest_mode(vcpu) &&
> > 		    !test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_dirty))
> > 
> > There's already a dependency on __set_sregs() doing
> > kvm_register_mark_available() before kvm_mmu_reset_context(), i.e. the
> > code is already a bit kludgy.  The dirty check would make the kludge less
> > subtle and provide explicit documentation.
> 
> A comment in __set_sregs is certainly a good idea.  But checking for
> dirty seems worse since the caching of CR3 is a bit special in this
> respect (it's never marked dirty).

That's why I thought it was so clever :-)

> This patch should probably be split too, so that the Fixes tags are
> separate for Intel and AMD.

That would be nice.
Paolo Bonzini May 23, 2020, 7:07 a.m. UTC | #5
On 23/05/20 00:47, Sean Christopherson wrote:
> On Wed, May 20, 2020 at 10:14:47PM +0200, Paolo Bonzini wrote:
>> This patch should probably be split too, so that the Fixes tags are
>> separate for Intel and AMD.
> 
> That would be nice.

Will do.  Anyway this series will be quite different in v2, and there
will be a couple more changes to common code to avoid repeated calls to
kvm_cpu_has_injectable_intr (because on AMD I'd like to avoid
unnecessary calls to enable_irq_window; it is already complicated enough
without those).

Paolo
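
To illustrate the kind of common-code change alluded to above (names and
structure are hypothetical, this is not the actual v2 patch), the idea
is to evaluate kvm_cpu_has_injectable_intr() once and reuse the result
when deciding whether to request an interrupt window:

static void inject_pending_intr_sketch(struct kvm_vcpu *vcpu)
{
	/* Query once instead of repeating the non-trivial check. */
	bool intr_pending = kvm_cpu_has_injectable_intr(vcpu);

	if (intr_pending && kvm_x86_ops.interrupt_allowed(vcpu)) {
		kvm_queue_interrupt(vcpu, kvm_cpu_get_interrupt(vcpu), false);
		kvm_x86_ops.set_irq(vcpu);
	} else if (intr_pending) {
		/* Open the window only when there is something to inject. */
		kvm_x86_ops.enable_irq_window(vcpu);
	}
}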

Patch

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 19b6a7c954e8..087a04ae74e4 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -260,11 +260,7 @@ static void nested_prepare_vmcb_save(struct vcpu_svm *svm, struct vmcb *nested_v
 	svm_set_efer(&svm->vcpu, nested_vmcb->save.efer);
 	svm_set_cr0(&svm->vcpu, nested_vmcb->save.cr0);
 	svm_set_cr4(&svm->vcpu, nested_vmcb->save.cr4);
-	if (npt_enabled) {
-		svm->vmcb->save.cr3 = nested_vmcb->save.cr3;
-		svm->vcpu.arch.cr3 = nested_vmcb->save.cr3;
-	} else
-		(void)kvm_set_cr3(&svm->vcpu, nested_vmcb->save.cr3);
+	(void)kvm_set_cr3(&svm->vcpu, nested_vmcb->save.cr3);
 
 	svm->vmcb->save.cr2 = svm->vcpu.arch.cr2 = nested_vmcb->save.cr2;
 	kvm_rax_write(&svm->vcpu, nested_vmcb->save.rax);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d8187d25fe04..56be704ffe95 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3465,7 +3465,6 @@ static fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long root)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
-	bool update_guest_cr3 = true;
 	unsigned long cr3;
 
 	cr3 = __sme_set(root);
@@ -3474,18 +3473,13 @@ static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long root)
 		mark_dirty(svm->vmcb, VMCB_NPT);
 
 		/* Loading L2's CR3 is handled by enter_svm_guest_mode.  */
-		if (is_guest_mode(vcpu))
-			update_guest_cr3 = false;
-		else if (test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
-			cr3 = vcpu->arch.cr3;
-		else /* CR3 is already up-to-date.  */
-			update_guest_cr3 = false;
+		if (!test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
+			return;
+		cr3 = vcpu->arch.cr3;
 	}
 
-	if (update_guest_cr3) {
-		svm->vmcb->save.cr3 = cr3;
-		mark_dirty(svm->vmcb, VMCB_CR);
-	}
+	svm->vmcb->save.cr3 = cr3;
+	mark_dirty(svm->vmcb, VMCB_CR);
 }
 
 static int is_disabled(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 55712dd86baf..7daf6a50e774 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3085,10 +3085,7 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd)
 			spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
 		}
 
-		/* Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter. */
-		if (is_guest_mode(vcpu))
-			update_guest_cr3 = false;
-		else if (!enable_unrestricted_guest && !is_paging(vcpu))
+		if (!enable_unrestricted_guest && !is_paging(vcpu))
 			guest_cr3 = to_kvm_vmx(kvm)->ept_identity_map_addr;
 		else if (test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
 			guest_cr3 = vcpu->arch.cr3;