4.16 kernel: vmwrite error: reg 401e value 2021 (err 12)
diff mbox

Message ID 37306EFA9975BE469F115FDE982C075BCE93CC3B@ORSMSX114.amr.corp.intel.com
State New
Headers show

Commit Message

Sean Christopherson April 17, 2018, 4:26 p.m. UTC
On Tue, 2018-04-17, Paolo Bonzini wrote:
> On 17/04/2018 17:46, Christopherson, Sean J wrote:

> > On Tue, 2018-04-17, Zdenek Kaspar wrote:

> >> Hello, I did quick test with latest stable kernel (4.16.2) and got tons

> >> of vmwrite errors immediately when starting VM:

> > 

> > Code related to UMIP emulation is effectively doing an unconditional

> > RMW on SECONDARY_VM_EXEC_CONTROL, which isn't guaranteed to exist on

> > older processors.  KVM already ensures it only advertises UMIP (via

> > emulation) when SECONDARY_EXEC_DESC can be set, i.e. KVM is already

> > implicitly checking for SECONDARY_VM_EXEC_CONTROL, so fixing the bug

> > is just a matter of omitting the unneeded VMREAD/VMWRITE sequence.

> 

> Thanks for the report!

> 

> This should be a fix:

> 

> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c

> index aa66ccd6ed6c..c5dd185825c7 100644

> --- a/arch/x86/kvm/vmx.c

> +++ b/arch/x86/kvm/vmx.c

> @@ -4767,14 +4767,16 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)

>  	else

>  		hw_cr4 |= KVM_PMODE_VM_CR4_ALWAYS_ON;

>  

> -	if ((cr4 & X86_CR4_UMIP) && !boot_cpu_has(X86_FEATURE_UMIP)) {

> -		vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,

> -			      SECONDARY_EXEC_DESC);

> -		hw_cr4 &= ~X86_CR4_UMIP;

> -	} else if (!is_guest_mode(vcpu) ||

> -	           !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))

> -		vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,

> -				SECONDARY_EXEC_DESC);

> +	if (!boot_cpu_has(X86_FEATURE_UMIP) && vmx_umip_emulated()) {

> +		if (cr4 & X86_CR4_UMIP) {

> +			vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,

> +				      SECONDARY_EXEC_DESC);

> +			hw_cr4 &= ~X86_CR4_UMIP;

> +		} else if (!is_guest_mode(vcpu) ||

> +			   !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))

> +			vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,

> +					SECONDARY_EXEC_DESC);

> +	}

>  

>  	if (cr4 & X86_CR4_VMXE) {

>  		/*

> 

> I'll test it and send the patch more formally.


Below is what I was thinking for a patch.  We should avoid the
VMREAD/VMWRITE when possible even when we're emulating UMIP.


From: Sean Christopherson <sean.j.christopherson@intel.com>

Subject: [PATCH] KVM: vmx: update SECONDARY_EXEC_DESC only if CR4.UMIP changes

Update SECONDARY_EXEC_DESC in SECONDARY_VM_EXEC_CONTROL for UMIP
emulation if and only if CR4.UMIP is being modified and UMIP is
not supported by hardware, i.e. we're emulating UMIP.  If CR4.UMIP
is not being changed then it's safe to assume that the previous
invocation of vmx_set_cr4() correctly set SECONDARY_EXEC_DESC,
i.e. the desired value is already the current value.  This avoids
unnecessary VMREAD/VMWRITE to SECONDARY_VM_EXEC_CONTROL, which
is critical as not all processors support SECONDARY_VM_EXEC_CONTROL.

WARN once and signal a fault if CR4.UMIP is changing and UMIP can't
be emulated, i.e. SECONDARY_EXEC_DESC can't be set.  Prior checks
should prevent setting UMIP if it can't be emulated, i.e. UMIP
shouldn't have been advertised to the guest if it can't be emulated,
regardless of whether or not UMIP is supported in bare metal.

Fixes: 0367f205a3b7 ("KVM: vmx: add support for emulating UMIP")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>

---
 arch/x86/kvm/vmx.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

--


> 

> Thanks,

> 

> Paolo

Comments

Paolo Bonzini April 17, 2018, 4:29 p.m. UTC | #1
On 17/04/2018 18:26, Christopherson, Sean J wrote:
> On Tue, 2018-04-17, Paolo Bonzini wrote:
>> On 17/04/2018 17:46, Christopherson, Sean J wrote:
>>> On Tue, 2018-04-17, Zdenek Kaspar wrote:
>>>> Hello, I did quick test with latest stable kernel (4.16.2) and got tons
>>>> of vmwrite errors immediately when starting VM:
>>>
>>> Code related to UMIP emulation is effectively doing an unconditional
>>> RMW on SECONDARY_VM_EXEC_CONTROL, which isn't guaranteed to exist on
>>> older processors.  KVM already ensures it only advertises UMIP (via
>>> emulation) when SECONDARY_EXEC_DESC can be set, i.e. KVM is already
>>> implicitly checking for SECONDARY_VM_EXEC_CONTROL, so fixing the bug
>>> is just a matter of omitting the unneeded VMREAD/VMWRITE sequence.
>>
>> Thanks for the report!
>>
>> This should be a fix:
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index aa66ccd6ed6c..c5dd185825c7 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -4767,14 +4767,16 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>>  	else
>>  		hw_cr4 |= KVM_PMODE_VM_CR4_ALWAYS_ON;
>>  
>> -	if ((cr4 & X86_CR4_UMIP) && !boot_cpu_has(X86_FEATURE_UMIP)) {
>> -		vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
>> -			      SECONDARY_EXEC_DESC);
>> -		hw_cr4 &= ~X86_CR4_UMIP;
>> -	} else if (!is_guest_mode(vcpu) ||
>> -	           !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
>> -		vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
>> -				SECONDARY_EXEC_DESC);
>> +	if (!boot_cpu_has(X86_FEATURE_UMIP) && vmx_umip_emulated()) {
>> +		if (cr4 & X86_CR4_UMIP) {
>> +			vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
>> +				      SECONDARY_EXEC_DESC);
>> +			hw_cr4 &= ~X86_CR4_UMIP;
>> +		} else if (!is_guest_mode(vcpu) ||
>> +			   !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
>> +			vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
>> +					SECONDARY_EXEC_DESC);
>> +	}
>>  
>>  	if (cr4 & X86_CR4_VMXE) {
>>  		/*
>>
>> I'll test it and send the patch more formally.
> 
> Below is what I was thinking for a patch.  We should avoid the
> VMREAD/VMWRITE when possible even when we're emulating UMIP.
> 
> 
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> Subject: [PATCH] KVM: vmx: update SECONDARY_EXEC_DESC only if CR4.UMIP changes
> 
> Update SECONDARY_EXEC_DESC in SECONDARY_VM_EXEC_CONTROL for UMIP
> emulation if and only if CR4.UMIP is being modified and UMIP is
> not supported by hardware, i.e. we're emulating UMIP.  If CR4.UMIP
> is not being changed then it's safe to assume that the previous
> invocation of vmx_set_cr4() correctly set SECONDARY_EXEC_DESC,
> i.e. the desired value is already the current value.  This avoids
> unnecessary VMREAD/VMWRITE to SECONDARY_VM_EXEC_CONTROL, which
> is critical as not all processors support SECONDARY_VM_EXEC_CONTROL.
> 
> WARN once and signal a fault if CR4.UMIP is changing and UMIP can't
> be emulated, i.e. SECONDARY_EXEC_DESC can't be set.  Prior checks
> should prevent setting UMIP if it can't be emulated, i.e. UMIP
> shouldn't have been advertised to the guest if it can't be emulated,
> regardless of whether or not UMIP is supported in bare metal.
> 
> Fixes: 0367f205a3b7 ("KVM: vmx: add support for emulating UMIP")
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/kvm/vmx.c | 34 ++++++++++++++++++++--------------
>  1 file changed, 20 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index aafcc9881e88..31b36b9801bb 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -1494,6 +1494,12 @@ static inline bool cpu_has_vmx_vmfunc(void)
>                 SECONDARY_EXEC_ENABLE_VMFUNC;
>  }
> 
> +static bool vmx_umip_emulated(void)
> +{
> +       return vmcs_config.cpu_based_2nd_exec_ctrl &
> +               SECONDARY_EXEC_DESC;
> +}
> +
>  static inline bool report_flexpriority(void)
>  {
>         return flexpriority_enabled;
> @@ -4776,14 +4782,20 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>         else
>                 hw_cr4 |= KVM_PMODE_VM_CR4_ALWAYS_ON;
> 
> -       if ((cr4 & X86_CR4_UMIP) && !boot_cpu_has(X86_FEATURE_UMIP)) {
> -               vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
> -                             SECONDARY_EXEC_DESC);
> -               hw_cr4 &= ~X86_CR4_UMIP;
> -       } else if (!is_guest_mode(vcpu) ||
> -                  !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
> -               vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
> -                               SECONDARY_EXEC_DESC);
> +       if (((cr4 ^ kvm_read_cr4(vcpu)) & X86_CR4_UMIP) &&
> +           !boot_cpu_has(X86_FEATURE_UMIP)) {
> +               if (WARN_ON_ONCE(!vmx_umip_emulated()))
> +                       return 1;
> +
> +               if (cr4 & X86_CR4_UMIP) {
> +                       vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
> +                                       SECONDARY_EXEC_DESC);
> +                       hw_cr4 &= ~X86_CR4_UMIP;
> +               } else if (!is_guest_mode(vcpu) ||
> +                       !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
> +                       vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
> +                                       SECONDARY_EXEC_DESC);
> +       }

Yes, that's nice too!

Paolo
Zdenek Kaspar April 17, 2018, 6:10 p.m. UTC | #2
On 04/17/2018 06:26 PM, Christopherson, Sean J wrote:
> On Tue, 2018-04-17, Paolo Bonzini wrote:
>> On 17/04/2018 17:46, Christopherson, Sean J wrote:
>>> On Tue, 2018-04-17, Zdenek Kaspar wrote:
>>>> Hello, I did quick test with latest stable kernel (4.16.2) and got tons
>>>> of vmwrite errors immediately when starting VM:
>>>
>>> Code related to UMIP emulation is effectively doing an unconditional
>>> RMW on SECONDARY_VM_EXEC_CONTROL, which isn't guaranteed to exist on
>>> older processors.  KVM already ensures it only advertises UMIP (via
>>> emulation) when SECONDARY_EXEC_DESC can be set, i.e. KVM is already
>>> implicitly checking for SECONDARY_VM_EXEC_CONTROL, so fixing the bug
>>> is just a matter of omitting the unneeded VMREAD/VMWRITE sequence.
>>
>> Thanks for the report!
>>
>> This should be a fix:
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index aa66ccd6ed6c..c5dd185825c7 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -4767,14 +4767,16 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>>   	else
>>   		hw_cr4 |= KVM_PMODE_VM_CR4_ALWAYS_ON;
>>   
>> -	if ((cr4 & X86_CR4_UMIP) && !boot_cpu_has(X86_FEATURE_UMIP)) {
>> -		vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
>> -			      SECONDARY_EXEC_DESC);
>> -		hw_cr4 &= ~X86_CR4_UMIP;
>> -	} else if (!is_guest_mode(vcpu) ||
>> -	           !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
>> -		vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
>> -				SECONDARY_EXEC_DESC);
>> +	if (!boot_cpu_has(X86_FEATURE_UMIP) && vmx_umip_emulated()) {
>> +		if (cr4 & X86_CR4_UMIP) {
>> +			vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
>> +				      SECONDARY_EXEC_DESC);
>> +			hw_cr4 &= ~X86_CR4_UMIP;
>> +		} else if (!is_guest_mode(vcpu) ||
>> +			   !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
>> +			vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
>> +					SECONDARY_EXEC_DESC);
>> +	}
>>   
>>   	if (cr4 & X86_CR4_VMXE) {
>>   		/*
>>
>> I'll test it and send the patch more formally.
> 
> Below is what I was thinking for a patch.  We should avoid the
> VMREAD/VMWRITE when possible even when we're emulating UMIP.
> 
> 
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> Subject: [PATCH] KVM: vmx: update SECONDARY_EXEC_DESC only if CR4.UMIP changes
> 
> Update SECONDARY_EXEC_DESC in SECONDARY_VM_EXEC_CONTROL for UMIP
> emulation if and only if CR4.UMIP is being modified and UMIP is
> not supported by hardware, i.e. we're emulating UMIP.  If CR4.UMIP
> is not being changed then it's safe to assume that the previous
> invocation of vmx_set_cr4() correctly set SECONDARY_EXEC_DESC,
> i.e. the desired value is already the current value.  This avoids
> unnecessary VMREAD/VMWRITE to SECONDARY_VM_EXEC_CONTROL, which
> is critical as not all processors support SECONDARY_VM_EXEC_CONTROL.
> 
> WARN once and signal a fault if CR4.UMIP is changing and UMIP can't
> be emulated, i.e. SECONDARY_EXEC_DESC can't be set.  Prior checks
> should prevent setting UMIP if it can't be emulated, i.e. UMIP
> shouldn't have been advertised to the guest if it can't be emulated,
> regardless of whether or not UMIP is supported in bare metal.
> 
> Fixes: 0367f205a3b7 ("KVM: vmx: add support for emulating UMIP")
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>   arch/x86/kvm/vmx.c | 34 ++++++++++++++++++++--------------
>   1 file changed, 20 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index aafcc9881e88..31b36b9801bb 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -1494,6 +1494,12 @@ static inline bool cpu_has_vmx_vmfunc(void)
>                  SECONDARY_EXEC_ENABLE_VMFUNC;
>   }
> 
> +static bool vmx_umip_emulated(void)
> +{
> +       return vmcs_config.cpu_based_2nd_exec_ctrl &
> +               SECONDARY_EXEC_DESC;
> +}
> +
>   static inline bool report_flexpriority(void)
>   {
>          return flexpriority_enabled;
> @@ -4776,14 +4782,20 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>          else
>                  hw_cr4 |= KVM_PMODE_VM_CR4_ALWAYS_ON;
> 
> -       if ((cr4 & X86_CR4_UMIP) && !boot_cpu_has(X86_FEATURE_UMIP)) {
> -               vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
> -                             SECONDARY_EXEC_DESC);
> -               hw_cr4 &= ~X86_CR4_UMIP;
> -       } else if (!is_guest_mode(vcpu) ||
> -                  !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
> -               vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
> -                               SECONDARY_EXEC_DESC);
> +       if (((cr4 ^ kvm_read_cr4(vcpu)) & X86_CR4_UMIP) &&
> +           !boot_cpu_has(X86_FEATURE_UMIP)) {
> +               if (WARN_ON_ONCE(!vmx_umip_emulated()))
> +                       return 1;
> +
> +               if (cr4 & X86_CR4_UMIP) {
> +                       vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
> +                                       SECONDARY_EXEC_DESC);
> +                       hw_cr4 &= ~X86_CR4_UMIP;
> +               } else if (!is_guest_mode(vcpu) ||
> +                       !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
> +                       vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
> +                                       SECONDARY_EXEC_DESC);
> +       }
> 
>          if (cr4 & X86_CR4_VMXE) {
>                  /*
> @@ -9512,12 +9524,6 @@ static bool vmx_xsaves_supported(void)
>                  SECONDARY_EXEC_XSAVES;
>   }
> 
> -static bool vmx_umip_emulated(void)
> -{
> -       return vmcs_config.cpu_based_2nd_exec_ctrl &
> -               SECONDARY_EXEC_DESC;
> -}
> -
>   static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx)
>   {
>          u32 exit_intr_info;
> --
> 
> 
>>
>> Thanks,
>>
>> Paolo

It fixes the issue, thanks! (tested on 4.16.2)

Z.
Wanpeng Li April 18, 2018, 4:11 a.m. UTC | #3
2018-04-18 0:26 GMT+08:00 Christopherson, Sean J
<sean.j.christopherson@intel.com>:
> On Tue, 2018-04-17, Paolo Bonzini wrote:
>> On 17/04/2018 17:46, Christopherson, Sean J wrote:
>> > On Tue, 2018-04-17, Zdenek Kaspar wrote:
>> >> Hello, I did quick test with latest stable kernel (4.16.2) and got tons
>> >> of vmwrite errors immediately when starting VM:
>> >
>> > Code related to UMIP emulation is effectively doing an unconditional
>> > RMW on SECONDARY_VM_EXEC_CONTROL, which isn't guaranteed to exist on
>> > older processors.  KVM already ensures it only advertises UMIP (via
>> > emulation) when SECONDARY_EXEC_DESC can be set, i.e. KVM is already
>> > implicitly checking for SECONDARY_VM_EXEC_CONTROL, so fixing the bug
>> > is just a matter of omitting the unneeded VMREAD/VMWRITE sequence.
>>
>> Thanks for the report!
>>
>> This should be a fix:
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index aa66ccd6ed6c..c5dd185825c7 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -4767,14 +4767,16 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>>       else
>>               hw_cr4 |= KVM_PMODE_VM_CR4_ALWAYS_ON;
>>
>> -     if ((cr4 & X86_CR4_UMIP) && !boot_cpu_has(X86_FEATURE_UMIP)) {
>> -             vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
>> -                           SECONDARY_EXEC_DESC);
>> -             hw_cr4 &= ~X86_CR4_UMIP;
>> -     } else if (!is_guest_mode(vcpu) ||
>> -                !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
>> -             vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
>> -                             SECONDARY_EXEC_DESC);
>> +     if (!boot_cpu_has(X86_FEATURE_UMIP) && vmx_umip_emulated()) {
>> +             if (cr4 & X86_CR4_UMIP) {
>> +                     vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
>> +                                   SECONDARY_EXEC_DESC);
>> +                     hw_cr4 &= ~X86_CR4_UMIP;
>> +             } else if (!is_guest_mode(vcpu) ||
>> +                        !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
>> +                     vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
>> +                                     SECONDARY_EXEC_DESC);
>> +     }
>>
>>       if (cr4 & X86_CR4_VMXE) {
>>               /*
>>
>> I'll test it and send the patch more formally.
>
> Below is what I was thinking for a patch.  We should avoid the
> VMREAD/VMWRITE when possible even when we're emulating UMIP.
>
>
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> Subject: [PATCH] KVM: vmx: update SECONDARY_EXEC_DESC only if CR4.UMIP changes
>
> Update SECONDARY_EXEC_DESC in SECONDARY_VM_EXEC_CONTROL for UMIP
> emulation if and only if CR4.UMIP is being modified and UMIP is
> not supported by hardware, i.e. we're emulating UMIP.  If CR4.UMIP
> is not being changed then it's safe to assume that the previous
> invocation of vmx_set_cr4() correctly set SECONDARY_EXEC_DESC,
> i.e. the desired value is already the current value.  This avoids
> unnecessary VMREAD/VMWRITE to SECONDARY_VM_EXEC_CONTROL, which
> is critical as not all processors support SECONDARY_VM_EXEC_CONTROL.
>
> WARN once and signal a fault if CR4.UMIP is changing and UMIP can't
> be emulated, i.e. SECONDARY_EXEC_DESC can't be set.  Prior checks
> should prevent setting UMIP if it can't be emulated, i.e. UMIP
> shouldn't have been advertised to the guest if it can't be emulated,
> regardless of whether or not UMIP is supported in bare metal.
>
> Fixes: 0367f205a3b7 ("KVM: vmx: add support for emulating UMIP")
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/kvm/vmx.c | 34 ++++++++++++++++++++--------------
>  1 file changed, 20 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index aafcc9881e88..31b36b9801bb 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -1494,6 +1494,12 @@ static inline bool cpu_has_vmx_vmfunc(void)
>                 SECONDARY_EXEC_ENABLE_VMFUNC;
>  }
>
> +static bool vmx_umip_emulated(void)
> +{
> +       return vmcs_config.cpu_based_2nd_exec_ctrl &
> +               SECONDARY_EXEC_DESC;
> +}
> +
>  static inline bool report_flexpriority(void)
>  {
>         return flexpriority_enabled;
> @@ -4776,14 +4782,20 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>         else
>                 hw_cr4 |= KVM_PMODE_VM_CR4_ALWAYS_ON;
>
> -       if ((cr4 & X86_CR4_UMIP) && !boot_cpu_has(X86_FEATURE_UMIP)) {
> -               vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
> -                             SECONDARY_EXEC_DESC);
> -               hw_cr4 &= ~X86_CR4_UMIP;
> -       } else if (!is_guest_mode(vcpu) ||
> -                  !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
> -               vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
> -                               SECONDARY_EXEC_DESC);
> +       if (((cr4 ^ kvm_read_cr4(vcpu)) & X86_CR4_UMIP) &&
> +           !boot_cpu_has(X86_FEATURE_UMIP)) {
> +               if (WARN_ON_ONCE(!vmx_umip_emulated()))
> +                       return 1;

Two questions here:
1) cr4 is set to 0 during reset, however, the calltrace occurs during
reset, why cr4 & X86_CR4_UMIP can be true?
2) UMIP cpuid flag is not exposed to the guest since
vmx_umip_emulated() fails, why there is a place set cr4 & X86_CR4_UMIP
to true?

Regards,
Wanpeng Li

> +
> +               if (cr4 & X86_CR4_UMIP) {
> +                       vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
> +                                       SECONDARY_EXEC_DESC);
> +                       hw_cr4 &= ~X86_CR4_UMIP;
> +               } else if (!is_guest_mode(vcpu) ||
> +                       !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
> +                       vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
> +                                       SECONDARY_EXEC_DESC);
> +       }
>
>         if (cr4 & X86_CR4_VMXE) {
>                 /*
> @@ -9512,12 +9524,6 @@ static bool vmx_xsaves_supported(void)
>                 SECONDARY_EXEC_XSAVES;
>  }
>
> -static bool vmx_umip_emulated(void)
> -{
> -       return vmcs_config.cpu_based_2nd_exec_ctrl &
> -               SECONDARY_EXEC_DESC;
> -}
> -
>  static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx)
>  {
>         u32 exit_intr_info;
> --
>
>
>>
>> Thanks,
>>
>> Paolo
Paolo Bonzini April 18, 2018, 9:22 a.m. UTC | #4
On 18/04/2018 06:11, Wanpeng Li wrote:
> 2018-04-18 0:26 GMT+08:00 Christopherson, Sean J
> <sean.j.christopherson@intel.com>:
>> On Tue, 2018-04-17, Paolo Bonzini wrote:
>>> On 17/04/2018 17:46, Christopherson, Sean J wrote:
>>>> On Tue, 2018-04-17, Zdenek Kaspar wrote:
>>>>> Hello, I did quick test with latest stable kernel (4.16.2) and got tons
>>>>> of vmwrite errors immediately when starting VM:
>>>>
>>>> Code related to UMIP emulation is effectively doing an unconditional
>>>> RMW on SECONDARY_VM_EXEC_CONTROL, which isn't guaranteed to exist on
>>>> older processors.  KVM already ensures it only advertises UMIP (via
>>>> emulation) when SECONDARY_EXEC_DESC can be set, i.e. KVM is already
>>>> implicitly checking for SECONDARY_VM_EXEC_CONTROL, so fixing the bug
>>>> is just a matter of omitting the unneeded VMREAD/VMWRITE sequence.
>>>
>>> Thanks for the report!
>>>
>>> This should be a fix:
>>>
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index aa66ccd6ed6c..c5dd185825c7 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -4767,14 +4767,16 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>>>       else
>>>               hw_cr4 |= KVM_PMODE_VM_CR4_ALWAYS_ON;
>>>
>>> -     if ((cr4 & X86_CR4_UMIP) && !boot_cpu_has(X86_FEATURE_UMIP)) {
>>> -             vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
>>> -                           SECONDARY_EXEC_DESC);
>>> -             hw_cr4 &= ~X86_CR4_UMIP;
>>> -     } else if (!is_guest_mode(vcpu) ||
>>> -                !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
>>> -             vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
>>> -                             SECONDARY_EXEC_DESC);
>>> +     if (!boot_cpu_has(X86_FEATURE_UMIP) && vmx_umip_emulated()) {
>>> +             if (cr4 & X86_CR4_UMIP) {
>>> +                     vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
>>> +                                   SECONDARY_EXEC_DESC);
>>> +                     hw_cr4 &= ~X86_CR4_UMIP;
>>> +             } else if (!is_guest_mode(vcpu) ||
>>> +                        !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
>>> +                     vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
>>> +                                     SECONDARY_EXEC_DESC);
>>> +     }
>>>
>>>       if (cr4 & X86_CR4_VMXE) {
>>>               /*
>>>
>>> I'll test it and send the patch more formally.
>>
>> Below is what I was thinking for a patch.  We should avoid the
>> VMREAD/VMWRITE when possible even when we're emulating UMIP.
>>
>>
>> From: Sean Christopherson <sean.j.christopherson@intel.com>
>> Subject: [PATCH] KVM: vmx: update SECONDARY_EXEC_DESC only if CR4.UMIP changes
>>
>> Update SECONDARY_EXEC_DESC in SECONDARY_VM_EXEC_CONTROL for UMIP
>> emulation if and only if CR4.UMIP is being modified and UMIP is
>> not supported by hardware, i.e. we're emulating UMIP.  If CR4.UMIP
>> is not being changed then it's safe to assume that the previous
>> invocation of vmx_set_cr4() correctly set SECONDARY_EXEC_DESC,
>> i.e. the desired value is already the current value.  This avoids
>> unnecessary VMREAD/VMWRITE to SECONDARY_VM_EXEC_CONTROL, which
>> is critical as not all processors support SECONDARY_VM_EXEC_CONTROL.
>>
>> WARN once and signal a fault if CR4.UMIP is changing and UMIP can't
>> be emulated, i.e. SECONDARY_EXEC_DESC can't be set.  Prior checks
>> should prevent setting UMIP if it can't be emulated, i.e. UMIP
>> shouldn't have been advertised to the guest if it can't be emulated,
>> regardless of whether or not UMIP is supported in bare metal.
>>
>> Fixes: 0367f205a3b7 ("KVM: vmx: add support for emulating UMIP")
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>> ---
>>  arch/x86/kvm/vmx.c | 34 ++++++++++++++++++++--------------
>>  1 file changed, 20 insertions(+), 14 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index aafcc9881e88..31b36b9801bb 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -1494,6 +1494,12 @@ static inline bool cpu_has_vmx_vmfunc(void)
>>                 SECONDARY_EXEC_ENABLE_VMFUNC;
>>  }
>>
>> +static bool vmx_umip_emulated(void)
>> +{
>> +       return vmcs_config.cpu_based_2nd_exec_ctrl &
>> +               SECONDARY_EXEC_DESC;
>> +}
>> +
>>  static inline bool report_flexpriority(void)
>>  {
>>         return flexpriority_enabled;
>> @@ -4776,14 +4782,20 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>>         else
>>                 hw_cr4 |= KVM_PMODE_VM_CR4_ALWAYS_ON;
>>
>> -       if ((cr4 & X86_CR4_UMIP) && !boot_cpu_has(X86_FEATURE_UMIP)) {
>> -               vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
>> -                             SECONDARY_EXEC_DESC);
>> -               hw_cr4 &= ~X86_CR4_UMIP;
>> -       } else if (!is_guest_mode(vcpu) ||
>> -                  !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
>> -               vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
>> -                               SECONDARY_EXEC_DESC);
>> +       if (((cr4 ^ kvm_read_cr4(vcpu)) & X86_CR4_UMIP) &&
>> +           !boot_cpu_has(X86_FEATURE_UMIP)) {
>> +               if (WARN_ON_ONCE(!vmx_umip_emulated()))
>> +                       return 1;
> 
> Two questions here:
> 1) cr4 is set to 0 during reset, however, the calltrace occurs during
> reset, why cr4 & X86_CR4_UMIP can be true?
> 2) UMIP cpuid flag is not exposed to the guest since
> vmx_umip_emulated() fails, why there is a place set cr4 & X86_CR4_UMIP
> to true?

Because we hit the case where vmcs_clear_bits.

Paolo

> Regards,
> Wanpeng Li
> 
>> +
>> +               if (cr4 & X86_CR4_UMIP) {
>> +                       vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
>> +                                       SECONDARY_EXEC_DESC);
>> +                       hw_cr4 &= ~X86_CR4_UMIP;
>> +               } else if (!is_guest_mode(vcpu) ||
>> +                       !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
>> +                       vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
>> +                                       SECONDARY_EXEC_DESC);
>> +       }
>>
>>         if (cr4 & X86_CR4_VMXE) {
>>                 /*
>> @@ -9512,12 +9524,6 @@ static bool vmx_xsaves_supported(void)
>>                 SECONDARY_EXEC_XSAVES;
>>  }
>>
>> -static bool vmx_umip_emulated(void)
>> -{
>> -       return vmcs_config.cpu_based_2nd_exec_ctrl &
>> -               SECONDARY_EXEC_DESC;
>> -}
>> -
>>  static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx)
>>  {
>>         u32 exit_intr_info;
>> --
>>
>>
>>>
>>> Thanks,
>>>
>>> Paolo

Patch
diff mbox

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index aafcc9881e88..31b36b9801bb 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1494,6 +1494,12 @@  static inline bool cpu_has_vmx_vmfunc(void)
                SECONDARY_EXEC_ENABLE_VMFUNC;
 }

+static bool vmx_umip_emulated(void)
+{
+       return vmcs_config.cpu_based_2nd_exec_ctrl &
+               SECONDARY_EXEC_DESC;
+}
+
 static inline bool report_flexpriority(void)
 {
        return flexpriority_enabled;
@@ -4776,14 +4782,20 @@  static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
        else
                hw_cr4 |= KVM_PMODE_VM_CR4_ALWAYS_ON;

-       if ((cr4 & X86_CR4_UMIP) && !boot_cpu_has(X86_FEATURE_UMIP)) {
-               vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
-                             SECONDARY_EXEC_DESC);
-               hw_cr4 &= ~X86_CR4_UMIP;
-       } else if (!is_guest_mode(vcpu) ||
-                  !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
-               vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
-                               SECONDARY_EXEC_DESC);
+       if (((cr4 ^ kvm_read_cr4(vcpu)) & X86_CR4_UMIP) &&
+           !boot_cpu_has(X86_FEATURE_UMIP)) {
+               if (WARN_ON_ONCE(!vmx_umip_emulated()))
+                       return 1;
+
+               if (cr4 & X86_CR4_UMIP) {
+                       vmcs_set_bits(SECONDARY_VM_EXEC_CONTROL,
+                                       SECONDARY_EXEC_DESC);
+                       hw_cr4 &= ~X86_CR4_UMIP;
+               } else if (!is_guest_mode(vcpu) ||
+                       !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC))
+                       vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL,
+                                       SECONDARY_EXEC_DESC);
+       }

        if (cr4 & X86_CR4_VMXE) {
                /*
@@ -9512,12 +9524,6 @@  static bool vmx_xsaves_supported(void)
                SECONDARY_EXEC_XSAVES;
 }

-static bool vmx_umip_emulated(void)
-{
-       return vmcs_config.cpu_based_2nd_exec_ctrl &
-               SECONDARY_EXEC_DESC;
-}
-
 static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx)
 {
        u32 exit_intr_info;