diff mbox series

[2/3] KVM: nVMX: restore L1's EFER prior to setting the nested state

Message ID 20211110100018.367426-3-mlevitsk@redhat.com (mailing list archive)
State New, archived
Headers show
Series VMX: nested migration fixes for 32 bit nested guests | expand

Commit Message

Maxim Levitsky Nov. 10, 2021, 10 a.m. UTC
It is known that some kvm's users (e.g. qemu) load part of L2's register
state prior to setting the nested state after a migration.

If a 32 bit L2 guest is running in a 64 bit L1 guest, and nested migration
happens, Qemu will restore L2's EFER, and then the nested state load
function will use it as if it was L1's EFER.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Paolo Bonzini Nov. 10, 2021, 3:01 p.m. UTC | #1
On 11/10/21 11:00, Maxim Levitsky wrote:
> +	/*
> +	 * The vcpu might currently contain L2's IA32_EFER, due to the way
> +	 * some userspace kvm users (e.g qemu) restore nested state.
> +	 *
> +	 * To fix this, restore its IA32_EFER to the value it would have
> +	 * after VM exit from the nested guest.
> +	 *
> +	 */
> +
> +	vcpu->arch.efer = nested_vmx_get_vmcs12_host_efer(vcpu, vmcs12);
> +

In principle the value of LOAD_HOST_EFER on exit need not be the same as 
on entry.  But you don't need all of EFER, only EFER.LME/EFER.LMA, and 
those two bits must match ("the values of the LMA and LME bits in the 
field must each be that of the “host address-space size” VM-exit 
control" from the "Checks on Host Control Registers, MSRs, and SSP"; 
plus the "Checks Related to Address-Space Size").

At least it's worth adjusting the comment to explain that.  But the root 
cause of the issue is just nested_vmx_check_* accessing vcpu->arch.  So 
you can instead:

- split out of nested_vmx_check_host_state a new function 
nested_vmx_check_address_state_size that does

#ifdef CONFIG_X86_64
	if (CC(!!(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE) !=
	       !!(vcpu->arch.efer & EFER_LMA)))
		return -EINVAL;
#endif
	return 0;

- call it from vmentry but not from migration

- in nested_vmx_check_host_state, assign ia32e from 
vmcs12->vm_exit_controls instead of vcpu->arch.efer

Paolo
Maxim Levitsky Nov. 10, 2021, 3:08 p.m. UTC | #2
On Wed, 2021-11-10 at 16:01 +0100, Paolo Bonzini wrote:
> On 11/10/21 11:00, Maxim Levitsky wrote:
> > +	/*
> > +	 * The vcpu might currently contain L2's IA32_EFER, due to the way
> > +	 * some userspace kvm users (e.g qemu) restore nested state.
> > +	 *
> > +	 * To fix this, restore its IA32_EFER to the value it would have
> > +	 * after VM exit from the nested guest.
> > +	 *
> > +	 */
> > +
> > +	vcpu->arch.efer = nested_vmx_get_vmcs12_host_efer(vcpu, vmcs12);
> > +
> 
> In principle the value of LOAD_HOST_EFER on exit need not be the same as 
> on entry.  But you don't need all of EFER, only EFER.LME/EFER.LMA, and 
> those two bits must match ("the values of the LMA and LME bits in the 
> field must each be that of the “host address-space size” VM-exit 
> control" from the "Checks on Host Control Registers, MSRs, and SSP"; 
> plus the "Checks Related to Address-Space Size").
> 
> At least it's worth adjusting the comment to explain that.  But the root 
> cause of the issue is just nested_vmx_check_* accessing vcpu->arch.  So 
> you can instead:
> 
> - split out of nested_vmx_check_host_state a new function 
> nested_vmx_check_address_state_size that does
> 
> #ifdef CONFIG_X86_64
> 	if (CC(!!(vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE) !=
> 	       !!(vcpu->arch.efer & EFER_LMA)))
> 		return -EINVAL;
> #endif
> 	return 0;
> 
> - call it from vmentry but not from migration
> 
> - in nested_vmx_check_host_state, assign ia32e from 
> vmcs12->vm_exit_controls instead of vcpu->arch.efer

I agree with you. I was thinking do something like that but wasn't sure at all.

Best regards,
	Maxim Levitsky

> 
> Paolo
>
diff mbox series

Patch

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 49ae96c0cc4d1..28e270824e5b1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6404,6 +6404,17 @@  static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
 			kvm_state->hdr.vmx.preemption_timer_deadline;
 	}
 
+	/*
+	 * The vcpu might currently contain L2's IA32_EFER, due to the way
+	 * some userspace kvm users (e.g qemu) restore nested state.
+	 *
+	 * To fix this, restore its IA32_EFER to the value it would have
+	 * after VM exit from the nested guest.
+	 *
+	 */
+
+	vcpu->arch.efer = nested_vmx_get_vmcs12_host_efer(vcpu, vmcs12);
+
 	if (nested_vmx_check_controls(vcpu, vmcs12) ||
 	    nested_vmx_check_host_state(vcpu, vmcs12) ||
 	    nested_vmx_check_guest_state(vcpu, vmcs12, &ignored))