diff mbox series

KVM: nSVM: call nested_svm_load_cr3 on nested state load

Message ID 20210210155937.141569-1-mlevitsk@redhat.com (mailing list archive)
State New, archived
Series KVM: nSVM: call nested_svm_load_cr3 on nested state load

Commit Message

Maxim Levitsky Feb. 10, 2021, 3:59 p.m. UTC
While KVM's MMU should be fully reset by loading of nested CR0/CR3/CR4
by KVM_SET_SREGS, we are not in nested mode yet when we do it and therefore
only root_mmu is reset.

On regular nested entries we call nested_svm_load_cr3, which both updates
the guest's CR3 in the MMU when needed and initializes the MMU again.
That second initialization also sets up walk_mmu when nested paging is
enabled in both host and guest.

Since we don't call nested_svm_load_cr3 on nested state load, walk_mmu
can be left uninitialized. If we get a nested page fault right after
entering the nested guest for the first time after migration and decide
to emulate it, the emulator tries to call walk_mmu->gva_to_gpa, which is
NULL, and we hit a NULL pointer dereference.

Therefore we should call this function on nested state load as well.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 8 ++++++++
 1 file changed, 8 insertions(+)
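
For readers who want the failure mode in isolation, here is a minimal
userspace sketch of the sequence described in the commit message. It is a
toy model under stated assumptions, not kernel code: every toy_* name is
hypothetical, and the step that points walk_mmu at a separate nested MMU
on state load is a simplification made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for illustration only; these are NOT the kernel's types. */
typedef uint64_t toy_gva_t;
typedef uint64_t toy_gpa_t;

struct toy_mmu {
	/* Translation callback; NULL until this MMU has been initialized. */
	toy_gpa_t (*gva_to_gpa)(toy_gva_t gva);
};

struct toy_vcpu {
	struct toy_mmu root_mmu;
	struct toy_mmu nested_mmu;
	struct toy_mmu *walk_mmu;	/* MMU the emulator translates with */
	int guest_mode;
	int npt_enabled;
};

static toy_gpa_t toy_identity_walk(toy_gva_t gva)
{
	return (toy_gpa_t)gva;
}

/* Models KVM_SET_SREGS outside nested mode: only root_mmu is reset. */
static void toy_set_sregs(struct toy_vcpu *v)
{
	v->root_mmu.gva_to_gpa = toy_identity_walk;
	if (!v->guest_mode)
		v->walk_mmu = &v->root_mmu;
}

/* Models the MMU re-initialization done as part of loading nested CR3. */
static void toy_load_cr3(struct toy_vcpu *v)
{
	if (v->guest_mode && v->npt_enabled)
		v->nested_mmu.gva_to_gpa = toy_identity_walk;
}

/* Models nested state load, with or without the extra CR3 load. */
static void toy_set_nested_state(struct toy_vcpu *v, int load_cr3)
{
	v->guest_mode = 1;
	if (v->npt_enabled)
		v->walk_mmu = &v->nested_mmu;	/* assumption of this sketch */
	if (load_cr3)
		toy_load_cr3(v);
}

/* Returns 1 if emulation could translate safely, 0 if it would call NULL. */
static int toy_emulator_can_translate(const struct toy_vcpu *v)
{
	assert(v->walk_mmu != NULL);	/* always set in this sketch */
	return v->walk_mmu->gva_to_gpa != NULL;
}
```

In this model, calling toy_set_sregs() and then
toy_set_nested_state(v, 0) leaves the callback NULL, while passing 1
initializes it, which mirrors the effect the patch is after.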

Comments

Paolo Bonzini Feb. 10, 2021, 5:38 p.m. UTC | #1
On 10/02/21 16:59, Maxim Levitsky wrote:
> While KVM's MMU should be fully reset by loading of nested CR0/CR3/CR4
> by KVM_SET_SREGS, we are not in nested mode yet when we do it and therefore
> only root_mmu is reset.
> 
> On regular nested entries we call nested_svm_load_cr3 which both updates the
> guest's CR3 in the MMU when it is needed, and it also initializes
> the mmu again which makes it initialize the walk_mmu as well when nested
> paging is enabled in both host and guest.
> 
> Since we don't call nested_svm_load_cr3 on nested state load,
> the walk_mmu can be left uninitialized, which can lead to a NULL pointer
> dereference while accessing it if we happen to get a nested page fault
> right after entering the nested guest first time after the migration and
> we decide to emulate it, which leads to emulator trying to access
> walk_mmu->gva_to_gpa which is NULL.
> 
> Therefore we should call this function on nested state load as well.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>   arch/x86/kvm/svm/nested.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index 519fe84f2100..c209f1232928 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1282,6 +1282,14 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
>   
>   	nested_vmcb02_prepare_control(svm);
>   
> +	ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3,
> +				  nested_npt_enabled(svm));
> +
> +	if (ret) {
> +		svm_leave_nested(svm);
> +		goto out_free;
> +	}
> +
>   	kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
>   	ret = 0;
>   out_free:
> 

I think you have to delay this to KVM_REQ_GET_NESTED_STATE_PAGES, 
because the !nested_npt case can be accessing memory before the VM is 
started (PDPTRs!).
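
The deferral suggested here can be sketched in userspace as follows. This
is only an illustration under assumptions: the toy_* names, the request
bit value, and the guest_memory_ready flag are hypothetical and do not
correspond to kernel definitions. The point is that state load merely
records a pending request, and the memory-touching CR3 load runs later,
just before the vCPU is entered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request bit, named after KVM_REQ_GET_NESTED_STATE_PAGES. */
#define TOY_REQ_GET_NESTED_STATE_PAGES (1u << 0)

struct toy_req_vcpu {
	unsigned int requests;		/* pending request bitmask */
	bool guest_memory_ready;	/* false while state is being restored */
	bool cr3_loaded;
};

static void toy_make_request(unsigned int req, struct toy_req_vcpu *v)
{
	v->requests |= req;
}

/* Test-and-clear: returns true once per pending request. */
static bool toy_check_request(unsigned int req, struct toy_req_vcpu *v)
{
	if (!(v->requests & req))
		return false;
	v->requests &= ~req;
	return true;
}

/* Stand-in for a CR3 load that must read guest memory (e.g. PDPTRs). */
static int toy_req_load_cr3(struct toy_req_vcpu *v)
{
	if (!v->guest_memory_ready)
		return -1;
	v->cr3_loaded = true;
	return 0;
}

/* State load only records pending work; it touches no guest memory. */
static int toy_req_set_nested_state(struct toy_req_vcpu *v)
{
	assert(v != NULL);
	toy_make_request(TOY_REQ_GET_NESTED_STATE_PAGES, v);
	return 0;
}

/* Runs right before entering the vCPU, when guest memory is usable. */
static int toy_req_vcpu_enter(struct toy_req_vcpu *v)
{
	if (toy_check_request(TOY_REQ_GET_NESTED_STATE_PAGES, v))
		return toy_req_load_cr3(v);
	return 0;
}
```

With this split, toy_req_set_nested_state() succeeds even while
guest_memory_ready is false, and the load happens on the first
toy_req_vcpu_enter() call after memory becomes available.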

In fact the same is true for VMX: this code

         /* Shadow page tables on either EPT or shadow page tables. */
         if (nested_vmx_load_cr3(vcpu, vmcs12->guest_cr3,
                                 nested_cpu_has_ept(vmcs12),
                                 entry_failure_code))
                 return -EINVAL;

must be moved from prepare_vmcs02 to both nested_vmx_enter_non_root_mode 
and nested_get_vmcs12_pages.

Thanks,

Paolo
Patch

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 519fe84f2100..c209f1232928 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1282,6 +1282,14 @@  static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 
 	nested_vmcb02_prepare_control(svm);
 
+	ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3,
+				  nested_npt_enabled(svm));
+
+	if (ret) {
+		svm_leave_nested(svm);
+		goto out_free;
+	}
+
 	kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
 	ret = 0;
 out_free: