
[v4] kvm: nVMX: Remove superfluous VMX instruction fault checks

Message ID 20170426155346.116970-1-jmattson@google.com (mailing list archive)
State New, archived

Commit Message

Jim Mattson April 26, 2017, 3:53 p.m. UTC
According to the Intel SDM, "Certain exceptions have priority over VM
exits. These include invalid-opcode exceptions, faults based on
privilege level*, and general-protection exceptions that are based on
checking I/O permission bits in the task-state segment (TSS)."

There is no need to check for faulting conditions that the hardware
has already checked.

* These include faults generated by attempts to execute, in
  virtual-8086 mode, privileged instructions that are not recognized
  in that mode.

Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/vmx.c | 55 ++++++++----------------------------------------------
 1 file changed, 8 insertions(+), 47 deletions(-)

Comments

David Hildenbrand April 27, 2017, 8:29 a.m. UTC | #1
On 26.04.2017 17:53, Jim Mattson wrote:
> According to the Intel SDM, "Certain exceptions have priority over VM
> exits. These include invalid-opcode exceptions, faults based on
> privilege level*, and general-protection exceptions that are based on
> checking I/O permission bits in the task-state segment (TSS)."
> 
> There is no need to check for faulting conditions that the hardware
> has already checked.
> 
> * These include faults generated by attempts to execute, in
>   virtual-8086 mode, privileged instructions that are not recognized
>   in that mode.
> 
> Signed-off-by: Jim Mattson <jmattson@google.com>
> ---
>  arch/x86/kvm/vmx.c | 55 ++++++++----------------------------------------------
>  1 file changed, 8 insertions(+), 47 deletions(-)

Nice! So we really only have to check vmxon / pointer / features for
vmxon and for the others only vmxon.

Reviewed-by: David Hildenbrand <david@redhat.com>
[...]
>  	if (vmx->nested.vmxon) {
>  		nested_vmx_failValid(vcpu, VMXERR_VMXON_IN_VMX_ROOT_OPERATION);
>  		return kvm_skip_emulated_instruction(vcpu);
> @@ -7161,29 +7141,15 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
>   * Intel's VMX Instruction Reference specifies a common set of prerequisites
>   * for running VMX instructions (except VMXON, whose prerequisites are
>   * slightly different). It also specifies what exception to inject otherwise.
> + * Note that many of these exceptions have priority over VM exits, so they
> + * don't have to be checked again here.
>   */

I think we could rename that one to nested_vmx_check_vmxon() now and
drop the complete comment. Maybe as a cleanup patch. Do we have some
nVMX documentation where we could put such information (so we could also
remove the comment from handle_vmon())?

>  static int nested_vmx_check_permission(struct kvm_vcpu *vcpu)
>  {
Paolo Bonzini April 27, 2017, 11:25 a.m. UTC | #2
On 27/04/2017 10:29, David Hildenbrand wrote:
>>  arch/x86/kvm/vmx.c | 55 ++++++++----------------------------------------------
>>  1 file changed, 8 insertions(+), 47 deletions(-)
> Nice! So we really only have to check vmxon / pointer / features for
> vmxon and for the others only vmxon.

Still not good, CR4.VMXE has to be checked because we always run the
guest with CR4.VMXE set (see section 23.8 in the SDM).

Paolo

> Reviewed-by: David Hildenbrand <david@redhat.com>
David Hildenbrand April 27, 2017, 12:09 p.m. UTC | #3
On 27.04.2017 13:25, Paolo Bonzini wrote:
> 
> 
> On 27/04/2017 10:29, David Hildenbrand wrote:
>>>  arch/x86/kvm/vmx.c | 55 ++++++++----------------------------------------------
>>>  1 file changed, 8 insertions(+), 47 deletions(-)
>> Nice! So we really only have to check vmxon / pointer / features for
>> vmxon and for the others only vmxon.
> 
> Still not good, CR4.VMXE has to be checked because we always run the
> guest with CR4.VMXE set (see section 23.8 in the SDM).
> 
> Paolo
> 
>> Reviewed-by: David Hildenbrand <david@redhat.com>

Looking at the pseudocode of VMXON (30-27) (and friends), it looks like
the hardware performs the following checks before testing for non-root
operation.

(register operand) -> #UD
(cr4.vmxe = 0) -> #UD
(cr0.pe = 0) -> #UD
(rflags.vm = 1) -> #UD
(lma = 1 and cs.l = 0) -> #UD

I was assuming cr4.vmxe might come from the read shadow, but as I've
learned, the shadow is only consulted for MOV/SMSW executed in the
guest. So you're of course right, VMXE would always be active for us.

The check for being in VMX non-root operation, which triggers the
VM exit, is done before the CPL > 0 check. So maybe the CPL check also
has to stay.

I would really like to see a test case for that. And a brain dump of
Paolo :)
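The ordering in the VMXON pseudocode can be sketched as a small
standalone C model (a toy illustration only; the struct, enum, and
function names are made up here and are neither SDM nor KVM code):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the VMXON fault/exit ordering from the SDM pseudocode. */
enum vmxon_outcome { VMXON_OK, VMXON_UD, VMXON_GP, VMXON_VMEXIT };

struct guest_state {
	bool register_operand; /* VMXON given a register operand */
	bool cr4_vmxe;         /* CR4.VMXE */
	bool cr0_pe;           /* CR0.PE */
	bool rflags_vm;        /* RFLAGS.VM (virtual-8086 mode) */
	bool lma;              /* IA32_EFER.LMA */
	bool cs_l;             /* CS.L (64-bit code segment) */
	bool in_non_root;      /* in VMX non-root operation */
	int  cpl;              /* current privilege level */
};

static enum vmxon_outcome vmxon_check(const struct guest_state *g)
{
	/* 1. The #UD conditions have priority over everything else. */
	if (g->register_operand || !g->cr4_vmxe || !g->cr0_pe ||
	    g->rflags_vm || (g->lma && !g->cs_l))
		return VMXON_UD;
	/* 2. In non-root operation, the VM exit comes next ... */
	if (g->in_non_root)
		return VMXON_VMEXIT;
	/*
	 * 3. ... so the CPL > 0 #GP is only raised in root operation,
	 *    which is why the L0 hypervisor must still check CPL itself
	 *    after the VM exit.
	 */
	if (g->cpl > 0)
		return VMXON_GP;
	return VMXON_OK;
}
```

A nested guest at CPL 3 with all #UD conditions clear takes the VM exit,
not the #GP, which is exactly the case the emulator has to handle.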
Jim Mattson April 27, 2017, 3 p.m. UTC | #4
On Thu, Apr 27, 2017 at 4:25 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 27/04/2017 10:29, David Hildenbrand wrote:
>>>  arch/x86/kvm/vmx.c | 55 ++++++++----------------------------------------------
>>>  1 file changed, 8 insertions(+), 47 deletions(-)
>> Nice! So we really only have to check vmxon / pointer / features for
>> vmxon and for the others only vmxon.
>
> Still not good, CR4.VMXE has to be checked because we always run the
> guest with CR4.VMXE set (see section 23.8 in the SDM).

Ah, yes, of course.

>
> Paolo
>
>> Reviewed-by: David Hildenbrand <david@redhat.com>
Paolo Bonzini April 27, 2017, 3:31 p.m. UTC | #5
On 27/04/2017 17:00, Jim Mattson wrote:
> On Thu, Apr 27, 2017 at 4:25 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>>
>> On 27/04/2017 10:29, David Hildenbrand wrote:
>>>>  arch/x86/kvm/vmx.c | 55 ++++++++----------------------------------------------
>>>>  1 file changed, 8 insertions(+), 47 deletions(-)
>>> Nice! So we really only have to check vmxon / pointer / features for
>>> vmxon and for the others only vmxon.
>>
>> Still not good, CR4.VMXE has to be checked because we always run the
>> guest with CR4.VMXE set (see section 23.8 in the SDM).
> 
> Ah, yes, of course.

I fixed this up and will push the patch briefly.

Paolo

>>
>> Paolo
>>
>>> Reviewed-by: David Hildenbrand <david@redhat.com>
>
Jim Mattson April 27, 2017, 3:32 p.m. UTC | #6
Also, the allowed/required CR0/CR4 bits may not be the same for the
vCPU as for the physical hardware, so more than just CR4.VMXE may have
to be checked.

On Thu, Apr 27, 2017 at 8:00 AM, Jim Mattson <jmattson@google.com> wrote:
> On Thu, Apr 27, 2017 at 4:25 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>>
>> On 27/04/2017 10:29, David Hildenbrand wrote:
>>>>  arch/x86/kvm/vmx.c | 55 ++++++++----------------------------------------------
>>>>  1 file changed, 8 insertions(+), 47 deletions(-)
>>> Nice! So we really only have to check vmxon / pointer / features for
>>> vmxon and for the others only vmxon.
>>
>> Still not good, CR4.VMXE has to be checked because we always run the
>> guest with CR4.VMXE set (see section 23.8 in the SDM).
>
> Ah, yes, of course.
>
>>
>> Paolo
>>
>>> Reviewed-by: David Hildenbrand <david@redhat.com>
Paolo Bonzini April 27, 2017, 3:49 p.m. UTC | #7
On 27/04/2017 17:32, Jim Mattson wrote:
> Also, the allowed/required CR0/CR4 bits may not be the same for the
> vCPU as for the physical hardware, so more than just CR4.VMXE may have
> to be checked.

I think we can pass on that.  For CR0 we know the two cases are
fundamentally unrestricted guest and !unrestricted guest, and they both
are covered (via CR0.PE and EFLAGS.VM respectively).

For CR4, we also pretty much know the only FIXED1 bit is VMXE, and
FIXED0 bits match the values that are checked by MOV to CR4.

Paolo
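The FIXED0/FIXED1 rule above can be sketched in a few lines of
standalone C. Per the SDM, bits set in IA32_VMX_CR4_FIXED0 must be 1 in
CR4, and bits clear in IA32_VMX_CR4_FIXED1 must be 0; the mask values
below are hypothetical examples, not real MSR contents:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a CR4 value satisfies the VMX fixed-bit constraints. */
static bool cr4_valid_for_vmx(uint64_t cr4, uint64_t fixed0,
			      uint64_t fixed1)
{
	return (cr4 & fixed0) == fixed0 && /* all must-be-one bits set */
	       (cr4 & ~fixed1) == 0;       /* no must-be-zero bits set */
}
```

With fixed0 containing only CR4.VMXE (bit 13), the check reduces to the
MOV-to-CR4 reserved-bit checks plus VMXE, which is Paolo's point.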
Jim Mattson April 27, 2017, 5:29 p.m. UTC | #8
What about CR0.NE, which, like CR4.VMXE, will always be set while
running the guest, but which may not be set in the vCPU?

On Thu, Apr 27, 2017 at 8:49 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 27/04/2017 17:32, Jim Mattson wrote:
>> Also, the allowed/required CR0/CR4 bits may not be the same for the
>> vCPU as for the physical hardware, so more than just CR4.VMXE may have
>> to be checked.
>
> I think we can pass on that.  For CR0 we know the two cases are
> fundamentally unrestricted guest and !unrestricted guest, and they both
> are covered (via CR0.PE and EFLAGS.VM respectively).
>
> For CR4, we also pretty much know the only FIXED1 bit is VMXE, and
> FIXED0 bits match the values that are checked by MOV to CR4.
>
> Paolo

Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 259e9b28ccf8..a6c4af687006 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7107,34 +7107,14 @@  static int enter_vmx_operation(struct kvm_vcpu *vcpu)
 static int handle_vmon(struct kvm_vcpu *vcpu)
 {
 	int ret;
-	struct kvm_segment cs;
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	const u64 VMXON_NEEDED_FEATURES = FEATURE_CONTROL_LOCKED
 		| FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;
 
-	/* The Intel VMX Instruction Reference lists a bunch of bits that
-	 * are prerequisite to running VMXON, most notably cr4.VMXE must be
-	 * set to 1 (see vmx_set_cr4() for when we allow the guest to set this).
-	 * Otherwise, we should fail with #UD. We test these now:
+	/*
+	 * Most of the faulting conditions have already been checked by
+	 * hardware, prior to the VM-exit for VMXON.
 	 */
-	if (!kvm_read_cr4_bits(vcpu, X86_CR4_VMXE) ||
-	    !kvm_read_cr0_bits(vcpu, X86_CR0_PE) ||
-	    (vmx_get_rflags(vcpu) & X86_EFLAGS_VM)) {
-		kvm_queue_exception(vcpu, UD_VECTOR);
-		return 1;
-	}
-
-	vmx_get_segment(vcpu, &cs, VCPU_SREG_CS);
-	if (is_long_mode(vcpu) && !cs.l) {
-		kvm_queue_exception(vcpu, UD_VECTOR);
-		return 1;
-	}
-
-	if (vmx_get_cpl(vcpu)) {
-		kvm_inject_gp(vcpu, 0);
-		return 1;
-	}
-
 	if (vmx->nested.vmxon) {
 		nested_vmx_failValid(vcpu, VMXERR_VMXON_IN_VMX_ROOT_OPERATION);
 		return kvm_skip_emulated_instruction(vcpu);
@@ -7161,29 +7141,15 @@  static int handle_vmon(struct kvm_vcpu *vcpu)
  * Intel's VMX Instruction Reference specifies a common set of prerequisites
  * for running VMX instructions (except VMXON, whose prerequisites are
  * slightly different). It also specifies what exception to inject otherwise.
+ * Note that many of these exceptions have priority over VM exits, so they
+ * don't have to be checked again here.
  */
 static int nested_vmx_check_permission(struct kvm_vcpu *vcpu)
 {
-	struct kvm_segment cs;
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
-
-	if (!vmx->nested.vmxon) {
-		kvm_queue_exception(vcpu, UD_VECTOR);
-		return 0;
-	}
-
-	vmx_get_segment(vcpu, &cs, VCPU_SREG_CS);
-	if ((vmx_get_rflags(vcpu) & X86_EFLAGS_VM) ||
-	    (is_long_mode(vcpu) && !cs.l)) {
+	if (!to_vmx(vcpu)->nested.vmxon) {
 		kvm_queue_exception(vcpu, UD_VECTOR);
 		return 0;
 	}
-
-	if (vmx_get_cpl(vcpu)) {
-		kvm_inject_gp(vcpu, 0);
-		return 0;
-	}
-
 	return 1;
 }
 
@@ -7527,7 +7493,7 @@  static int handle_vmread(struct kvm_vcpu *vcpu)
 		if (get_vmx_mem_address(vcpu, exit_qualification,
 				vmx_instruction_info, true, &gva))
 			return 1;
-		/* _system ok, as nested_vmx_check_permission verified cpl=0 */
+		/* _system ok, as hardware has verified cpl=0 */
 		kvm_write_guest_virt_system(&vcpu->arch.emulate_ctxt, gva,
 			     &field_value, (is_long_mode(vcpu) ? 8 : 4), NULL);
 	}
@@ -7660,7 +7626,7 @@  static int handle_vmptrst(struct kvm_vcpu *vcpu)
 	if (get_vmx_mem_address(vcpu, exit_qualification,
 			vmx_instruction_info, true, &vmcs_gva))
 		return 1;
-	/* ok to use *_system, as nested_vmx_check_permission verified cpl=0 */
+	/* ok to use *_system, as hardware has verified cpl=0 */
 	if (kvm_write_guest_virt_system(&vcpu->arch.emulate_ctxt, vmcs_gva,
 				 (void *)&to_vmx(vcpu)->nested.current_vmptr,
 				 sizeof(u64), &e)) {
@@ -7693,11 +7659,6 @@  static int handle_invept(struct kvm_vcpu *vcpu)
 	if (!nested_vmx_check_permission(vcpu))
 		return 1;
 
-	if (!kvm_read_cr0_bits(vcpu, X86_CR0_PE)) {
-		kvm_queue_exception(vcpu, UD_VECTOR);
-		return 1;
-	}
-
 	vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
 	type = kvm_register_readl(vcpu, (vmx_instruction_info >> 28) & 0xf);