[4/4] KVM: nVMX: Map enlightened VMCS upon restore when possible

Message ID 20210503150854.1144255-5-vkuznets@redhat.com (mailing list archive)
State New, archived
Series KVM: nVMX: Fix migration of nested guests when eVMCS is in use

Commit Message

Vitaly Kuznetsov May 3, 2021, 3:08 p.m. UTC
It now looks like a bad idea to not restore eVMCS mapping directly from
vmx_set_nested_state(). The restoration path now depends on whether KVM
will continue executing L2 (vmx_get_nested_state_pages()) or will have to
exit to L1 (nested_vmx_vmexit()), this complicates error propagation and
diverges too much from the 'native' path when 'nested.current_vmptr' is
set directly from vmx_get_nested_state_pages().

The existing solution postponing eVMCS mapping also seems to be fragile.
In multiple places the code checks whether 'vmx->nested.hv_evmcs' is not
NULL to distinguish between eVMCS and non-eVMCS cases. All these checks
are 'incomplete' as we have a weird 'eVMCS is in use but not yet mapped'
state.

Also, in case vmx_get_nested_state() is called right after
vmx_set_nested_state() without executing the guest first, the resulting
state is going to be incorrect as 'KVM_STATE_NESTED_EVMCS' flag will be
missing.

Fix all these issues by making eVMCS restoration path closer to its
'native' sibling by putting eVMCS GPA to 'struct kvm_vmx_nested_state_hdr'.
To avoid ABI incompatibility, do not introduce a new flag and keep the
original eVMCS mapping path through KVM_REQ_GET_NESTED_STATE_PAGES in
place. To distinguish between 'new' and 'old' formats consider eVMCS
GPA == 0 as an unset GPA (thus forcing KVM_REQ_GET_NESTED_STATE_PAGES
path). While technically possible, it seems to be an extremely unlikely
case.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/uapi/asm/kvm.h |  2 ++
 arch/x86/kvm/vmx/nested.c       | 27 +++++++++++++++++++++------
 2 files changed, 23 insertions(+), 6 deletions(-)

Comments

Paolo Bonzini May 3, 2021, 3:53 p.m. UTC | #1
On 03/05/21 17:08, Vitaly Kuznetsov wrote:
> It now looks like a bad idea to not restore eVMCS mapping directly from
> vmx_set_nested_state(). The restoration path now depends on whether KVM
> will continue executing L2 (vmx_get_nested_state_pages()) or will have to
> exit to L1 (nested_vmx_vmexit()), this complicates error propagation and
> diverges too much from the 'native' path when 'nested.current_vmptr' is
> set directly from vmx_get_nested_state_pages().
> 
> The existing solution postponing eVMCS mapping also seems to be fragile.
> In multiple places the code checks whether 'vmx->nested.hv_evmcs' is not
> NULL to distinguish between eVMCS and non-eVMCS cases. All these checks
> are 'incomplete' as we have a weird 'eVMCS is in use but not yet mapped'
> state.
> 
> Also, in case vmx_get_nested_state() is called right after
> vmx_set_nested_state() without executing the guest first, the resulting
> state is going to be incorrect as 'KVM_STATE_NESTED_EVMCS' flag will be
> missing.
> 
> Fix all these issues by making eVMCS restoration path closer to its
> 'native' sibling by putting eVMCS GPA to 'struct kvm_vmx_nested_state_hdr'.
> To avoid ABI incompatibility, do not introduce a new flag and keep the

I'm not sure what is the disadvantage of not having a new flag.

Having two different paths with subtly different side effects however 
seems really worse for maintenance.  We are already discussing in 
another thread how to get rid of the check_nested_events side effects; 
that might possibly even remove the need for patch 1, so it's at least 
worth pursuing more than adding this second path.

I have queued patch 1, but I'd rather have a kvm selftest for it.  It 
doesn't seem impossible to have one...

Paolo

> original eVMCS mapping path through KVM_REQ_GET_NESTED_STATE_PAGES in
> place. To distinguish between 'new' and 'old' formats consider eVMCS
> GPA == 0 as an unset GPA (thus forcing KVM_REQ_GET_NESTED_STATE_PAGES
> path). While technically possible, it seems to be an extremely unlikely
> case.


> Signed-off-by: Vitaly Kuznetsov<vkuznets@redhat.com>
Vitaly Kuznetsov May 4, 2021, 8:02 a.m. UTC | #2
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 03/05/21 17:08, Vitaly Kuznetsov wrote:
>> It now looks like a bad idea to not restore eVMCS mapping directly from
>> vmx_set_nested_state(). The restoration path now depends on whether KVM
>> will continue executing L2 (vmx_get_nested_state_pages()) or will have to
>> exit to L1 (nested_vmx_vmexit()), this complicates error propagation and
>> diverges too much from the 'native' path when 'nested.current_vmptr' is
>> set directly from vmx_get_nested_state_pages().
>> 
>> The existing solution postponing eVMCS mapping also seems to be fragile.
>> In multiple places the code checks whether 'vmx->nested.hv_evmcs' is not
>> NULL to distinguish between eVMCS and non-eVMCS cases. All these checks
>> are 'incomplete' as we have a weird 'eVMCS is in use but not yet mapped'
>> state.
>> 
>> Also, in case vmx_get_nested_state() is called right after
>> vmx_set_nested_state() without executing the guest first, the resulting
>> state is going to be incorrect as 'KVM_STATE_NESTED_EVMCS' flag will be
>> missing.
>> 
>> Fix all these issues by making eVMCS restoration path closer to its
>> 'native' sibling by putting eVMCS GPA to 'struct kvm_vmx_nested_state_hdr'.
>> To avoid ABI incompatibility, do not introduce a new flag and keep the
>
> I'm not sure what is the disadvantage of not having a new flag.
>

Adding a new flag would make us backwards-incompatible both ways:

1) Migrating 'new' state to an older KVM will fail the

	if (kvm_state->hdr.vmx.flags & ~KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE)
	        return -EINVAL;

check.

2) When migrating 'old' state to a 'new' KVM we would still have to support
the old path ('KVM_REQ_GET_NESTED_STATE_PAGES'), so the flag would be
'optional' anyway.
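
To illustrate, here is a purely hypothetical destination-side sketch of what
a VMM would do; the only new bit is the 'evmcs_pa' header field added by this
patch, everything else is existing uAPI, and receive_nested_state() is a
made-up helper standing in for whatever transport the VMM uses:

	/* 'state' is the opaque blob of kvm_state->size bytes from the source */
	struct kvm_nested_state *state = receive_nested_state();

	if (state->flags & KVM_STATE_NESTED_EVMCS) {
		/*
		 * A new source KVM fills hdr.vmx.evmcs_pa; an old one leaves
		 * it at 0. No new flag bit is involved, so an old destination
		 * KVM still passes its hdr.vmx.flags check unchanged.
		 */
		printf("evmcs_pa = 0x%llx\n",
		       (unsigned long long)state->hdr.vmx.evmcs_pa);
	}

	if (ioctl(vcpu_fd, KVM_SET_NESTED_STATE, state))
		err(1, "KVM_SET_NESTED_STATE");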

> Having two different paths with subtly different side effects however 
> seems really worse for maintenance.  We are already discussing in 
> another thread how to get rid of the check_nested_events side effects; 
> that might possibly even remove the need for patch 1, so it's at least 
> worth pursuing more than adding this second path.

I have to admit I don't fully like this solution either :-( If we make
sure KVM_REQ_GET_NESTED_STATE_PAGES always gets handled, the fix can
indeed be omitted; however, I still dislike the divergence and the fact
that the 'if (vmx->nested.hv_evmcs)' checks scattered across the code are
not fully valid. E.g. how do we fix the problem of an immediate
KVM_GET_NESTED_STATE after KVM_SET_NESTED_STATE without executing the vCPU?

>
> I have queued patch 1, but I'd rather have a kvm selftest for it.  It 
> doesn't seem impossible to have one...

Thank you, the band-aid solves a real problem. Let me try to come up
with a selftest for it.

>
> Paolo
>
>> original eVMCS mapping path through KVM_REQ_GET_NESTED_STATE_PAGES in
>> place. To distinguish between 'new' and 'old' formats consider eVMCS
>> GPA == 0 as an unset GPA (thus forcing KVM_REQ_GET_NESTED_STATE_PAGES
>> path). While technically possible, it seems to be an extremely unlikely
>> case.
>
>
>> Signed-off-by: Vitaly Kuznetsov<vkuznets@redhat.com>
>
Paolo Bonzini May 4, 2021, 8:06 a.m. UTC | #3
On 04/05/21 10:02, Vitaly Kuznetsov wrote:
> I still dislike the divergence and the fact
> that the 'if (vmx->nested.hv_evmcs)' checks scattered across the code are
> not fully valid. E.g. how do we fix the problem of an immediate
> KVM_GET_NESTED_STATE after KVM_SET_NESTED_STATE without executing the vCPU?

You obviously have thought about this more than I did, but if you can 
write a testcase for that as well, I can take a look.

Thanks,

Paolo
Maxim Levitsky May 5, 2021, 8:33 a.m. UTC | #4
On Mon, 2021-05-03 at 17:08 +0200, Vitaly Kuznetsov wrote:
> It now looks like a bad idea to not restore eVMCS mapping directly from
> vmx_set_nested_state(). The restoration path now depends on whether KVM
> will continue executing L2 (vmx_get_nested_state_pages()) or will have to
> exit to L1 (nested_vmx_vmexit()), this complicates error propagation and
> diverges too much from the 'native' path when 'nested.current_vmptr' is
> set directly from vmx_get_nested_state_pages().
> 
> The existing solution postponing eVMCS mapping also seems to be fragile.
> In multiple places the code checks whether 'vmx->nested.hv_evmcs' is not
> NULL to distinguish between eVMCS and non-eVMCS cases. All these checks
> are 'incomplete' as we have a weird 'eVMCS is in use but not yet mapped'
> state.
> 
> Also, in case vmx_get_nested_state() is called right after
> vmx_set_nested_state() without executing the guest first, the resulting
> state is going to be incorrect as 'KVM_STATE_NESTED_EVMCS' flag will be
> missing.
> 
> Fix all these issues by making eVMCS restoration path closer to its
> 'native' sibling by putting eVMCS GPA to 'struct kvm_vmx_nested_state_hdr'.
> To avoid ABI incompatibility, do not introduce a new flag and keep the
> original eVMCS mapping path through KVM_REQ_GET_NESTED_STATE_PAGES in
> place. To distinguish between 'new' and 'old' formats consider eVMCS
> GPA == 0 as an unset GPA (thus forcing KVM_REQ_GET_NESTED_STATE_PAGES
> path). While technically possible, it seems to be an extremely unlikely
> case.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/include/uapi/asm/kvm.h |  2 ++
>  arch/x86/kvm/vmx/nested.c       | 27 +++++++++++++++++++++------
>  2 files changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 0662f644aad9..3845977b739e 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -441,6 +441,8 @@ struct kvm_vmx_nested_state_hdr {
>  
>  	__u32 flags;
>  	__u64 preemption_timer_deadline;
> +
> +	__u64 evmcs_pa;
>  };
>  
>  struct kvm_svm_nested_state_data {
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 37fdc34f7afc..4261cf4755c8 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -6019,6 +6019,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
>  		.hdr.vmx.vmxon_pa = -1ull,
>  		.hdr.vmx.vmcs12_pa = -1ull,
>  		.hdr.vmx.preemption_timer_deadline = 0,
> +		.hdr.vmx.evmcs_pa = -1ull,
>  	};
>  	struct kvm_vmx_nested_state_data __user *user_vmx_nested_state =
>  		&user_kvm_nested_state->data.vmx[0];
> @@ -6037,8 +6038,10 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
>  		if (vmx_has_valid_vmcs12(vcpu)) {
>  			kvm_state.size += sizeof(user_vmx_nested_state->vmcs12);
>  
> -			if (vmx->nested.hv_evmcs)
> +			if (vmx->nested.hv_evmcs) {
>  				kvm_state.flags |= KVM_STATE_NESTED_EVMCS;
> +				kvm_state.hdr.vmx.evmcs_pa = vmx->nested.hv_evmcs_vmptr;
> +			}
>  
>  			if (is_guest_mode(vcpu) &&
>  			    nested_cpu_has_shadow_vmcs(vmcs12) &&
> @@ -6230,13 +6233,25 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
>  
>  		set_current_vmptr(vmx, kvm_state->hdr.vmx.vmcs12_pa);
>  	} else if (kvm_state->flags & KVM_STATE_NESTED_EVMCS) {
> +		u64 evmcs_gpa = kvm_state->hdr.vmx.evmcs_pa;
> +
>  		/*
> -		 * nested_vmx_handle_enlightened_vmptrld() cannot be called
> -		 * directly from here as HV_X64_MSR_VP_ASSIST_PAGE may not be
> -		 * restored yet. EVMCS will be mapped from
> -		 * nested_get_vmcs12_pages().
> +		 * EVMCS GPA == 0 most likely indicates that the migration data is
> +		 * coming from an older KVM which doesn't support 'evmcs_pa' in
> +		 * 'struct kvm_vmx_nested_state_hdr'.
>  		 */
> -		kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
> +		if (evmcs_gpa && (evmcs_gpa != -1ull) &&
> +		    (__nested_vmx_handle_enlightened_vmptrld(vcpu, evmcs_gpa, false) !=
> +		     EVMPTRLD_SUCCEEDED)) {
> +			return -EINVAL;
> +		} else if (!evmcs_gpa) {
> +			/*
> +			 * EVMCS GPA can't be acquired from VP assist page here because
> +			 * HV_X64_MSR_VP_ASSIST_PAGE may not be restored yet.
> +			 * EVMCS will be mapped from nested_get_evmcs_page().
> +			 */
> +			kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
> +		}
>  	} else {
>  		return -EINVAL;
>  	}

Hi everyone!

Let me explain my concern about this patch and also ask whether I understand this correctly.

In a nutshell if I understand this correctly, we are not allowed to access any guest
memory while setting the nested state. 

Now, if I also understand correctly, the reason for the above is that
userspace is allowed to set the nested state first, then fiddle with the
KVM memslots, maybe even update the guest memory, and only later do the KVM_RUN ioctl.

And so this is the major reason why the KVM_REQ_GET_NESTED_STATE_PAGES
request exists in the first place.
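
For reference, the request is only serviced on the next KVM_RUN, roughly like
this (simplified from vcpu_enter_guest() in arch/x86/kvm/x86.c, details
elided), i.e. only after userspace had a chance to finish restoring the
memslots, guest memory and MSRs:

	if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
		/* on VMX this ends up in vmx_get_nested_state_pages() / nested_get_evmcs_page() */
		if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) {
			r = 0;
			goto out;
		}
	}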

If that is correct I assume that we either have to keep loading the EVMCS page on
KVM_REQ_GET_NESTED_STATE_PAGES request, or we want to include the EVMCS itself
in the migration state in addition to its physical address, similar to how we treat
the VMCS12 and the VMCB12.

I personally tinkered with qemu to try to reproduce this situation,
and in my tests I wasn't able to make it update the memory
map after the load of the nested state but prior to KVM_RUN,
but neither was I able to prove that this can't happen.

In addition to that I don't know how qemu behaves when it does 
guest ram post-copy because so far I haven't tried to tinker with it.

Finally, other userspace hypervisors exist, and they might rely on this
assumption as well.

Looking forward to any comments,
Best regards,
	Maxim Levitsky
Vitaly Kuznetsov May 5, 2021, 9:17 a.m. UTC | #5
Maxim Levitsky <mlevitsk@redhat.com> writes:

> On Mon, 2021-05-03 at 17:08 +0200, Vitaly Kuznetsov wrote:
>> It now looks like a bad idea to not restore eVMCS mapping directly from
>> vmx_set_nested_state(). The restoration path now depends on whether KVM
>> will continue executing L2 (vmx_get_nested_state_pages()) or will have to
>> exit to L1 (nested_vmx_vmexit()), this complicates error propagation and
>> diverges too much from the 'native' path when 'nested.current_vmptr' is
>> set directly from vmx_get_nested_state_pages().
>> 
>> The existing solution postponing eVMCS mapping also seems to be fragile.
>> In multiple places the code checks whether 'vmx->nested.hv_evmcs' is not
>> NULL to distinguish between eVMCS and non-eVMCS cases. All these checks
>> are 'incomplete' as we have a weird 'eVMCS is in use but not yet mapped'
>> state.
>> 
>> Also, in case vmx_get_nested_state() is called right after
>> vmx_set_nested_state() without executing the guest first, the resulting
>> state is going to be incorrect as 'KVM_STATE_NESTED_EVMCS' flag will be
>> missing.
>> 
>> Fix all these issues by making eVMCS restoration path closer to its
>> 'native' sibling by putting eVMCS GPA to 'struct kvm_vmx_nested_state_hdr'.
>> To avoid ABI incompatibility, do not introduce a new flag and keep the
>> original eVMCS mapping path through KVM_REQ_GET_NESTED_STATE_PAGES in
>> place. To distinguish between 'new' and 'old' formats consider eVMCS
>> GPA == 0 as an unset GPA (thus forcing KVM_REQ_GET_NESTED_STATE_PAGES
>> path). While technically possible, it seems to be an extremely unlikely
>> case.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/include/uapi/asm/kvm.h |  2 ++
>>  arch/x86/kvm/vmx/nested.c       | 27 +++++++++++++++++++++------
>>  2 files changed, 23 insertions(+), 6 deletions(-)
>> 
>> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
>> index 0662f644aad9..3845977b739e 100644
>> --- a/arch/x86/include/uapi/asm/kvm.h
>> +++ b/arch/x86/include/uapi/asm/kvm.h
>> @@ -441,6 +441,8 @@ struct kvm_vmx_nested_state_hdr {
>>  
>>  	__u32 flags;
>>  	__u64 preemption_timer_deadline;
>> +
>> +	__u64 evmcs_pa;
>>  };
>>  
>>  struct kvm_svm_nested_state_data {
>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>> index 37fdc34f7afc..4261cf4755c8 100644
>> --- a/arch/x86/kvm/vmx/nested.c
>> +++ b/arch/x86/kvm/vmx/nested.c
>> @@ -6019,6 +6019,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
>>  		.hdr.vmx.vmxon_pa = -1ull,
>>  		.hdr.vmx.vmcs12_pa = -1ull,
>>  		.hdr.vmx.preemption_timer_deadline = 0,
>> +		.hdr.vmx.evmcs_pa = -1ull,
>>  	};
>>  	struct kvm_vmx_nested_state_data __user *user_vmx_nested_state =
>>  		&user_kvm_nested_state->data.vmx[0];
>> @@ -6037,8 +6038,10 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
>>  		if (vmx_has_valid_vmcs12(vcpu)) {
>>  			kvm_state.size += sizeof(user_vmx_nested_state->vmcs12);
>>  
>> -			if (vmx->nested.hv_evmcs)
>> +			if (vmx->nested.hv_evmcs) {
>>  				kvm_state.flags |= KVM_STATE_NESTED_EVMCS;
>> +				kvm_state.hdr.vmx.evmcs_pa = vmx->nested.hv_evmcs_vmptr;
>> +			}
>>  
>>  			if (is_guest_mode(vcpu) &&
>>  			    nested_cpu_has_shadow_vmcs(vmcs12) &&
>> @@ -6230,13 +6233,25 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
>>  
>>  		set_current_vmptr(vmx, kvm_state->hdr.vmx.vmcs12_pa);
>>  	} else if (kvm_state->flags & KVM_STATE_NESTED_EVMCS) {
>> +		u64 evmcs_gpa = kvm_state->hdr.vmx.evmcs_pa;
>> +
>>  		/*
>> -		 * nested_vmx_handle_enlightened_vmptrld() cannot be called
>> -		 * directly from here as HV_X64_MSR_VP_ASSIST_PAGE may not be
>> -		 * restored yet. EVMCS will be mapped from
>> -		 * nested_get_vmcs12_pages().
>> +		 * EVMCS GPA == 0 most likely indicates that the migration data is
>> +		 * coming from an older KVM which doesn't support 'evmcs_pa' in
>> +		 * 'struct kvm_vmx_nested_state_hdr'.
>>  		 */
>> -		kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
>> +		if (evmcs_gpa && (evmcs_gpa != -1ull) &&
>> +		    (__nested_vmx_handle_enlightened_vmptrld(vcpu, evmcs_gpa, false) !=
>> +		     EVMPTRLD_SUCCEEDED)) {
>> +			return -EINVAL;
>> +		} else if (!evmcs_gpa) {
>> +			/*
>> +			 * EVMCS GPA can't be acquired from VP assist page here because
>> +			 * HV_X64_MSR_VP_ASSIST_PAGE may not be restored yet.
>> +			 * EVMCS will be mapped from nested_get_evmcs_page().
>> +			 */
>> +			kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
>> +		}
>>  	} else {
>>  		return -EINVAL;
>>  	}
>
> Hi everyone!
>
> Let me explain my concern about this patch and also ask whether I understand this correctly.
>
> In a nutshell if I understand this correctly, we are not allowed to access any guest
> memory while setting the nested state. 
>
> Now, if I also understand correctly, the reason for the above is that
> userspace is allowed to set the nested state first, then fiddle with the
> KVM memslots, maybe even update the guest memory, and only later do the KVM_RUN ioctl.

Indeed, userspace is currently free to restore the guest in any order. I
had probably overlooked post-copy, but even the fact that guest MSRs can
be restored after restoring the nested state doesn't make our life easier.

>
> And so this is the major reason why the KVM_REQ_GET_NESTED_STATE_PAGES
> request exists in the first place.
>
> If that is correct I assume that we either have to keep loading the EVMCS page on
> KVM_REQ_GET_NESTED_STATE_PAGES request, or we want to include the EVMCS itself
> in the migration state in addition to its physical address, similar to how we treat
> the VMCS12 and the VMCB12.

Keeping the eVMCS load in KVM_REQ_GET_NESTED_STATE_PAGES is OK, I believe
(or at least I still don't see a reason for us to carry a copy in the
migration data). What I still don't like is the transient state after
vmx_set_nested_state(): 
- vmx->nested.current_vmptr is -1ull because no 'real' vmptrld was done
(we skip set_current_vmptr() when KVM_STATE_NESTED_EVMCS)
- vmx->nested.hv_evmcs/vmx->nested.hv_evmcs_vmptr are also NULL because
we haven't performed nested_vmx_handle_enlightened_vmptrld() yet.

I know of at least one real problem with this state: in case
vmx_get_nested_state() happens before KVM_RUN, the resulting state won't
have the KVM_STATE_NESTED_EVMCS flag, and this is incorrect. Take a look at
the check in nested_vmx_fail() for example:

        if (vmx->nested.current_vmptr == -1ull && !vmx->nested.hv_evmcs)
                return nested_vmx_failInvalid(vcpu);

this also seems off (I'm not sure it matters in any context but still).
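
To make the first problem concrete, a test for it could be as simple as the
sketch below (raw ioctls rather than the real selftest helpers; the eVMCS
setup on the source vCPU, the src_vcpu_fd/dst_vcpu_fd descriptors, the
buffer size and most error handling are assumed/elided):

	/* needs <sys/ioctl.h>, <linux/kvm.h>, <err.h>, <stdlib.h>, <string.h> */
	struct kvm_nested_state *state = calloc(1, 0x4000);

	state->size = 0x4000;
	if (ioctl(src_vcpu_fd, KVM_GET_NESTED_STATE, state) ||
	    !(state->flags & KVM_STATE_NESTED_EVMCS))
		errx(1, "source state should have KVM_STATE_NESTED_EVMCS");

	if (ioctl(dst_vcpu_fd, KVM_SET_NESTED_STATE, state))
		err(1, "KVM_SET_NESTED_STATE");

	/* read the state back immediately, no KVM_RUN in between */
	memset(state, 0, 0x4000);
	state->size = 0x4000;
	if (ioctl(dst_vcpu_fd, KVM_GET_NESTED_STATE, state))
		err(1, "KVM_GET_NESTED_STATE");

	/* without a fix, the flag is lost at this point */
	if (!(state->flags & KVM_STATE_NESTED_EVMCS))
		errx(1, "KVM_STATE_NESTED_EVMCS lost after restore");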

>
> I personally tinkered with qemu to try to reproduce this situation,
> and in my tests I wasn't able to make it update the memory
> map after the load of the nested state but prior to KVM_RUN,
> but neither was I able to prove that this can't happen.

Userspace has multiple ways to mess with the state, of course; in KVM we
only need to make sure we don't crash :-) On migration, though,
well-behaved userspace is supposed to restore exactly what it got. The
restoration sequence may vary.

>
> In addition to that I don't know how qemu behaves when it does 
> guest ram post-copy because so far I haven't tried to tinker with it.
>
> Finally, other userspace hypervisors exist, and they might rely on this
> assumption as well.
>
> Looking forward to any comments,
> Best regards,
> 	Maxim Levitsky
>
>
>

Patch

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 0662f644aad9..3845977b739e 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -441,6 +441,8 @@  struct kvm_vmx_nested_state_hdr {
 
 	__u32 flags;
 	__u64 preemption_timer_deadline;
+
+	__u64 evmcs_pa;
 };
 
 struct kvm_svm_nested_state_data {
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 37fdc34f7afc..4261cf4755c8 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6019,6 +6019,7 @@  static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
 		.hdr.vmx.vmxon_pa = -1ull,
 		.hdr.vmx.vmcs12_pa = -1ull,
 		.hdr.vmx.preemption_timer_deadline = 0,
+		.hdr.vmx.evmcs_pa = -1ull,
 	};
 	struct kvm_vmx_nested_state_data __user *user_vmx_nested_state =
 		&user_kvm_nested_state->data.vmx[0];
@@ -6037,8 +6038,10 @@  static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
 		if (vmx_has_valid_vmcs12(vcpu)) {
 			kvm_state.size += sizeof(user_vmx_nested_state->vmcs12);
 
-			if (vmx->nested.hv_evmcs)
+			if (vmx->nested.hv_evmcs) {
 				kvm_state.flags |= KVM_STATE_NESTED_EVMCS;
+				kvm_state.hdr.vmx.evmcs_pa = vmx->nested.hv_evmcs_vmptr;
+			}
 
 			if (is_guest_mode(vcpu) &&
 			    nested_cpu_has_shadow_vmcs(vmcs12) &&
@@ -6230,13 +6233,25 @@  static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 
 		set_current_vmptr(vmx, kvm_state->hdr.vmx.vmcs12_pa);
 	} else if (kvm_state->flags & KVM_STATE_NESTED_EVMCS) {
+		u64 evmcs_gpa = kvm_state->hdr.vmx.evmcs_pa;
+
 		/*
-		 * nested_vmx_handle_enlightened_vmptrld() cannot be called
-		 * directly from here as HV_X64_MSR_VP_ASSIST_PAGE may not be
-		 * restored yet. EVMCS will be mapped from
-		 * nested_get_vmcs12_pages().
+		 * EVMCS GPA == 0 most likely indicates that the migration data is
+		 * coming from an older KVM which doesn't support 'evmcs_pa' in
+		 * 'struct kvm_vmx_nested_state_hdr'.
 		 */
-		kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
+		if (evmcs_gpa && (evmcs_gpa != -1ull) &&
+		    (__nested_vmx_handle_enlightened_vmptrld(vcpu, evmcs_gpa, false) !=
+		     EVMPTRLD_SUCCEEDED)) {
+			return -EINVAL;
+		} else if (!evmcs_gpa) {
+			/*
+			 * EVMCS GPA can't be acquired from VP assist page here because
+			 * HV_X64_MSR_VP_ASSIST_PAGE may not be restored yet.
+			 * EVMCS will be mapped from nested_get_evmcs_page().
+			 */
+			kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
+		}
 	} else {
 		return -EINVAL;
 	}