diff mbox

[RFC] KVM: nVMX: Leave VMX mode on apparent CPU reset

Message ID 52AEC8B2.7010602@siemens.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jan Kiszka Dec. 16, 2013, 9:32 a.m. UTC
As long as we do not expose all the VMX related states to user space,
there is no way to properly reset a VCPU when VMX is enabled. Emulate
this for now by catching host-side clearings of the feature control MSR.
This allows to reboot a VM while it is running some hypervisor code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---

Better ideas? Or continue to leave it as it is?

 arch/x86/kvm/vmx.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

Comments

Paolo Bonzini Dec. 17, 2013, 1:25 p.m. UTC | #1
Il 16/12/2013 10:32, Jan Kiszka ha scritto:
> As long as we do not expose all the VMX related states to user space,
> there is no way to properly reset a VCPU when VMX is enabled. Emulate
> this for now by catching host-side clearings of the feature control MSR.
> This allows to reboot a VM while it is running some hypervisor code.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
> 
> Better ideas? Or continue to leave it as it is?

The final vmx_vcpu_reset is the only really ugly part, but it is
_really_ ugly...  Can you modify QEMU to restore MSRs first, and reduce
vmx_reset_nested to just

	if (is_guest_mode(vcpu))
		nested_vmx_vmexit(vcpu);

	free_nested(vmx);

?

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka Dec. 17, 2013, 2:40 p.m. UTC | #2
On 2013-12-17 14:25, Paolo Bonzini wrote:
> Il 16/12/2013 10:32, Jan Kiszka ha scritto:
>> As long as we do not expose all the VMX related states to user space,
>> there is no way to properly reset a VCPU when VMX is enabled. Emulate
>> this for now by catching host-side clearings of the feature control MSR.
>> This allows to reboot a VM while it is running some hypervisor code.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> ---
>>
>> Better ideas? Or continue to leave it as it is?
> 
> The final vmx_vcpu_reset is the only really ugly part, but it is
> _really_ ugly...  Can you modify QEMU to restore MSRs first, and reduce
> vmx_reset_nested to just
> 
> 	if (is_guest_mode(vcpu))
> 		nested_vmx_vmexit(vcpu);
> 
> 	free_nested(vmx);
> 
> ?

Well, I could make setting of MSR_IA32_FEATURE_CONTROL to 0 an official
"clear VMX" interface. Then QEMU would have to issue this MSR set
request before doing any other CPU state manipulation. Is that what you
have in mind?

Jan
Paolo Bonzini Dec. 17, 2013, 2:43 p.m. UTC | #3
Il 17/12/2013 15:40, Jan Kiszka ha scritto:
>> > The final vmx_vcpu_reset is the only really ugly part, but it is
>> > _really_ ugly...  Can you modify QEMU to restore MSRs first, and reduce
>> > vmx_reset_nested to just
>> > 
>> > 	if (is_guest_mode(vcpu))
>> > 		nested_vmx_vmexit(vcpu);
>> > 
>> > 	free_nested(vmx);
>> > 
>> > ?
> Well, I could make setting of MSR_IA32_FEATURE_CONTROL to 0 an official
> "clear VMX" interface. Then QEMU would have to issue this MSR set
> request before doing any other CPU state manipulation. Is that what you
> have in mind?

Yes, that was the idea.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Marcelo Tosatti Dec. 30, 2013, 8:02 p.m. UTC | #4
On Mon, Dec 16, 2013 at 10:32:34AM +0100, Jan Kiszka wrote:
> As long as we do not expose all the VMX related states to user space,
> there is no way to properly reset a VCPU when VMX is enabled. Emulate
> this for now by catching host-side clearings of the feature control MSR.
> This allows to reboot a VM while it is running some hypervisor code.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
> 
> Better ideas? Or continue to leave it as it is?

The hidden backport might be problematic... How about an explicit 
notification to leave guest mode?
Such as a new flag to KVM_SET_REGS (or even a separate ioctl?).

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Marcelo Tosatti Dec. 30, 2013, 8:02 p.m. UTC | #5
On Mon, Dec 16, 2013 at 10:32:34AM +0100, Jan Kiszka wrote:
> As long as we do not expose all the VMX related states to user space,
> there is no way to properly reset a VCPU when VMX is enabled. Emulate
> this for now by catching host-side clearings of the feature control MSR.
> This allows to reboot a VM while it is running some hypervisor code.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
> 
> Better ideas? Or continue to leave it as it is?

So basically reset is broken, with nested virt?

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Marcelo Tosatti Dec. 30, 2013, 8:05 p.m. UTC | #6
On Mon, Dec 30, 2013 at 06:02:17PM -0200, Marcelo Tosatti wrote:
> On Mon, Dec 16, 2013 at 10:32:34AM +0100, Jan Kiszka wrote:
> > As long as we do not expose all the VMX related states to user space,
> > there is no way to properly reset a VCPU when VMX is enabled. Emulate
> > this for now by catching host-side clearings of the feature control MSR.
> > This allows to reboot a VM while it is running some hypervisor code.
> > 
> > Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> > ---
> > 
> > Better ideas? Or continue to leave it as it is?
> 
> The hidden backport might be problematic... How about an explicit 
> notification to leave guest mode?
> Such as a new flag to KVM_SET_REGS (or even a separate ioctl?).
> 

Nevermind.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f90320b..da04247 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2455,6 +2455,8 @@  static int vmx_get_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 	return 1;
 }
 
+static void vmx_reset_nested(struct kvm_vcpu *vcpu);
+
 static int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	u32 msr_index = msr_info->index;
@@ -2470,6 +2472,12 @@  static int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 				& FEATURE_CONTROL_LOCKED)
 			return 0;
 		to_vmx(vcpu)->nested.msr_ia32_feature_control = data;
+		/*
+		 * Detect reset and allow to leave VMX mode this way until we
+		 * expose all related states to user space.
+		 */
+		if (host_initialized && data == 0)
+			vmx_reset_nested(vcpu);
 		return 1;
 	}
 
@@ -8487,6 +8495,33 @@  static void nested_vmx_vmexit(struct kvm_vcpu *vcpu)
 		vmx->nested.sync_shadow_vmcs = true;
 }
 
+static void vmx_reset_nested(struct kvm_vcpu *vcpu)
+{
+	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (!vmx->nested.vmxon)
+		return;
+
+	if (is_guest_mode(vcpu)) {
+		vmcs12->host_cr0 = X86_CR0_NW | X86_CR0_CD | X86_CR0_ET;
+		vmcs12->host_cr3 = 0;
+		vmcs12->host_cr4 = 0;
+		vmcs12->host_rsp = 0;
+		vmcs12->vm_exit_controls = 0;
+		nested_vmx_vmexit(vcpu);
+	}
+
+	free_nested(vmx);
+
+	/*
+	 * If we were in guest mode, the reset state user space wrote so far
+	 * is now inconsistent. If we were in host mode, some state update may
+	 * have been rejected. So simply repeat the reset her.
+	 */
+	vmx_vcpu_reset(vcpu);
+}
+
 /*
  * L1's failure to enter L2 is a subset of a normal exit, as explained in
  * 23.7 "VM-entry failures during or after loading guest state" (this also