diff mbox

KVM: nVMX: fix guest CR4 loading when emulating L2 to L1 exit

Message ID 20171010070122.29972-1-haozhong.zhang@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Haozhong Zhang Oct. 10, 2017, 7:01 a.m. UTC
When KVM emulates an exit from L2 to L1, it loads L1 CR4 into the
guest CR4. Before this CR4 loading, the guest CR4 refers to L2
CR4. Because these two CR4's are in different levels of guest, we
should vmx_set_cr4() rather than kvm_set_cr4() here. The latter, which
is used to handle guest writes to its CR4, checks the guest change to
CR4 and may fail if the change is invalid.

The failure may cause trouble. Consider we start
  a L1 guest with non-zero L1 PCID in use,
     (i.e. L1 CR4.PCIDE == 1 && L1 CR3.PCID != 0)
and
  a L2 guest with L2 PCID disabled,
     (i.e. L2 CR4.PCIDE == 0)
and following events may happen:

1. If kvm_set_cr4() is used in load_vmcs12_host_state() to load L1 CR4
   into guest CR4 (in VMCS01) for L2 to L1 exit, it will fail because
   of PCID check. As a result, the guest CR4 recorded in L0 KVM (i.e.
   vcpu->arch.cr4) is left to the value of L2 CR4.

2. Later, if L1 attempts to change its CR4, e.g., clearing VMXE bit,
   kvm_set_cr4() in L0 KVM will think L1 also wants to enable PCID,
   because the wrong L2 CR4 is used by L0 KVM as L1 CR4. As L1
   CR3.PCID != 0, L0 KVM will inject GP to L1 guest.

Fixes: 4704d0befb072 ("KVM: nVMX: Exiting from L2 to L1")
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 arch/x86/kvm/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Paolo Bonzini Oct. 10, 2017, 8:19 a.m. UTC | #1
On 10/10/2017 09:01, Haozhong Zhang wrote:
> When KVM emulates an exit from L2 to L1, it loads L1 CR4 into the
> guest CR4. Before this CR4 loading, the guest CR4 refers to L2
> CR4. Because these two CR4's are in different levels of guest, we
> should vmx_set_cr4() rather than kvm_set_cr4() here. The latter, which
> is used to handle guest writes to its CR4, checks the guest change to
> CR4 and may fail if the change is invalid.
> 
> The failure may cause trouble. Consider we start
>   a L1 guest with non-zero L1 PCID in use,
>      (i.e. L1 CR4.PCIDE == 1 && L1 CR3.PCID != 0)
> and
>   a L2 guest with L2 PCID disabled,
>      (i.e. L2 CR4.PCIDE == 0)
> and following events may happen:
> 
> 1. If kvm_set_cr4() is used in load_vmcs12_host_state() to load L1 CR4
>    into guest CR4 (in VMCS01) for L2 to L1 exit, it will fail because
>    of PCID check. As a result, the guest CR4 recorded in L0 KVM (i.e.
>    vcpu->arch.cr4) is left to the value of L2 CR4.
> 
> 2. Later, if L1 attempts to change its CR4, e.g., clearing VMXE bit,
>    kvm_set_cr4() in L0 KVM will think L1 also wants to enable PCID,
>    because the wrong L2 CR4 is used by L0 KVM as L1 CR4. As L1
>    CR3.PCID != 0, L0 KVM will inject GP to L1 guest.
> 
> Fixes: 4704d0befb072 ("KVM: nVMX: Exiting from L2 to L1")
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Makes sense, and none of the other effects of kvm_set_cr4 matter:

	// MMU is already reset on L2->L1 exit
        if (((cr4 ^ old_cr4) & pdptr_bits) ||
            (!(cr4 & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE)))
                kvm_mmu_reset_context(vcpu);

	// L2 CPUID always exits to L1
        if ((cr4 ^ old_cr4) & (X86_CR4_OSXSAVE | X86_CR4_PKE))
                kvm_update_cpuid(vcpu);

Would you please write a testcase too?

Paolo

> ---
>  arch/x86/kvm/vmx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index b9d2140eb212..3fe639e0e1f6 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -11297,7 +11297,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
>  
>  	/* Same as above - no reason to call set_cr4_guest_host_mask().  */
>  	vcpu->arch.cr4_guest_owned_bits = ~vmcs_readl(CR4_GUEST_HOST_MASK);
> -	kvm_set_cr4(vcpu, vmcs12->host_cr4);
> +	vmx_set_cr4(vcpu, vmcs12->host_cr4);
>  
>  	nested_ept_uninit_mmu_context(vcpu);
>  
>
Haozhong Zhang Oct. 10, 2017, 9:12 a.m. UTC | #2
On 10/10/17 10:19 +0200, Paolo Bonzini wrote:
> On 10/10/2017 09:01, Haozhong Zhang wrote:
> > When KVM emulates an exit from L2 to L1, it loads L1 CR4 into the
> > guest CR4. Before this CR4 loading, the guest CR4 refers to L2
> > CR4. Because these two CR4's are in different levels of guest, we
> > should vmx_set_cr4() rather than kvm_set_cr4() here. The latter, which
> > is used to handle guest writes to its CR4, checks the guest change to
> > CR4 and may fail if the change is invalid.
> > 
> > The failure may cause trouble. Consider we start
> >   a L1 guest with non-zero L1 PCID in use,
> >      (i.e. L1 CR4.PCIDE == 1 && L1 CR3.PCID != 0)
> > and
> >   a L2 guest with L2 PCID disabled,
> >      (i.e. L2 CR4.PCIDE == 0)
> > and following events may happen:
> > 
> > 1. If kvm_set_cr4() is used in load_vmcs12_host_state() to load L1 CR4
> >    into guest CR4 (in VMCS01) for L2 to L1 exit, it will fail because
> >    of PCID check. As a result, the guest CR4 recorded in L0 KVM (i.e.
> >    vcpu->arch.cr4) is left to the value of L2 CR4.
> > 
> > 2. Later, if L1 attempts to change its CR4, e.g., clearing VMXE bit,
> >    kvm_set_cr4() in L0 KVM will think L1 also wants to enable PCID,
> >    because the wrong L2 CR4 is used by L0 KVM as L1 CR4. As L1
> >    CR3.PCID != 0, L0 KVM will inject GP to L1 guest.
> > 
> > Fixes: 4704d0befb072 ("KVM: nVMX: Exiting from L2 to L1")
> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> 
> Makes sense, and none of the other effects of kvm_set_cr4 matter:
> 
> 	// MMU is already reset on L2->L1 exit
>         if (((cr4 ^ old_cr4) & pdptr_bits) ||
>             (!(cr4 & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE)))
>                 kvm_mmu_reset_context(vcpu);
> 
> 	// L2 CPUID always exits to L1
>         if ((cr4 ^ old_cr4) & (X86_CR4_OSXSAVE | X86_CR4_PKE))
>                 kvm_update_cpuid(vcpu);
> 
> Would you please write a testcase too?
> 

Sure, I'll send it later.

Haozhong
diff mbox

Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b9d2140eb212..3fe639e0e1f6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11297,7 +11297,7 @@  static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
 
 	/* Same as above - no reason to call set_cr4_guest_host_mask().  */
 	vcpu->arch.cr4_guest_owned_bits = ~vmcs_readl(CR4_GUEST_HOST_MASK);
-	kvm_set_cr4(vcpu, vmcs12->host_cr4);
+	vmx_set_cr4(vcpu, vmcs12->host_cr4);
 
 	nested_ept_uninit_mmu_context(vcpu);