
kvm: svm: reset mmu on VCPU reset

Message ID 1442583545-266967-1-git-send-email-imammedo@redhat.com (mailing list archive)
State New, archived

Commit Message

Igor Mammedov Sept. 18, 2015, 1:39 p.m. UTC
When an INIT/SIPI sequence is sent to a VCPU that was previously
in use by the OS, VMRUN might fail with:

 KVM: entry failed, hardware error 0xffffffff
 EAX=00000000 EBX=00000000 ECX=00000000 EDX=000006d3
 ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
 EIP=00000000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
 ES =0000 00000000 0000ffff 00009300
 CS =9a00 0009a000 0000ffff 00009a00
 [...]
 CR0=60000010 CR2=b6f3e000 CR3=01942000 CR4=000007e0
 [...]
 EFER=0000000000000000

with corresponding SVM error:
 KVM: FAILED VMRUN WITH VMCB:
 [...]
 cpl:            0                efer:         0000000000001000
 cr0:            0000000080010010 cr2:          00007fd7fe85bf90
 cr3:            0000000187d0c000 cr4:          0000000000000020
 [...]

What happens is that the VCPU state right after offlining is:
CR0: 0x80050033  EFER: 0xd01  CR4: 0x7e0
  -> long mode with CR3 pointing to longmode page tables

and when the VCPU gets INIT/SIPI, the following transition happens:
CR0: 0 -> 0x60000010 EFER: 0x0  CR4: 0x7e0
  -> paging disabled with stale CR3
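
The register values above can be decoded bit by bit. The following is a
minimal standalone sketch, not kernel code: the macro and function names
are illustrative, and only the architectural bit positions of CR0.PG
(bit 31) and EFER.LMA (bit 10) are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions; the names are illustrative, not kernel macros. */
#define CR0_PG   (1ull << 31)  /* paging enable */
#define EFER_LMA (1ull << 10)  /* long mode active */

/* True when the guest-visible CR0 has paging enabled. */
static inline bool paging_enabled(uint64_t cr0)
{
	return (cr0 & CR0_PG) != 0;
}

/* True when the CR0/EFER pair describes a CPU running in long mode. */
static inline bool in_long_mode(uint64_t cr0, uint64_t efer)
{
	return paging_enabled(cr0) && (efer & EFER_LMA) != 0;
}
```

With the values quoted above, in_long_mode(0x80050033, 0xd01) holds for
the offlined VCPU, while paging_enabled(0x60000010) does not hold after
INIT/SIPI. CR3 is not touched by either check, which is why it is stale.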

However, SVM under the hood puts the VCPU in Paged Real Mode*,
which effectively translates CR0 0x60000010 -> 0x80010010 after

   svm_vcpu_reset()
       -> init_vmcb()
           -> kvm_set_cr0()
               -> svm_set_cr0()

but from the kvm_set_cr0() perspective (CR0: 0 -> 0x60000010)
only the caching bits are changed, and
commit d81135a57aa6
 ("KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed")
regressed svm_vcpu_reset(), which relied on the MMU being reset.
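
The mismatch between the two views of CR0 can be sketched in a few
lines. This is a standalone model, not the kernel source: the function
names are hypothetical, the reset condition is an assumption modeled on
the behaviour described for d81135a57aa6 (only PG/WP changes trigger a
reset), and the VMCB adjustment is inferred from the
0x60000010 -> 0x80010010 translation quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR0 bit positions; the names are illustrative. */
#define CR0_WP (1ull << 16)  /* write protect */
#define CR0_NW (1ull << 29)  /* not write-through */
#define CR0_CD (1ull << 30)  /* cache disable */
#define CR0_PG (1ull << 31)  /* paging enable */

/*
 * Assumption: after d81135a57aa6 only a change in PG or WP is treated
 * as requiring an MMU context reset; CD and NW changes are ignored.
 */
static inline bool cr0_change_resets_mmu(uint64_t old_cr0, uint64_t new_cr0)
{
	const uint64_t update_bits = CR0_PG | CR0_WP;

	return ((old_cr0 ^ new_cr0) & update_bits) != 0;
}

/*
 * Assumption: the CR0 value that reaches the VMCB has PG and WP forced
 * on and CD and NW cleared, i.e. the guest runs in Paged Real Mode.
 */
static inline uint64_t vmcb_cr0_from_guest_cr0(uint64_t guest_cr0)
{
	return (guest_cr0 | CR0_PG | CR0_WP) & ~(CR0_CD | CR0_NW);
}
```

For the transition above, cr0_change_resets_mmu(0, 0x60000010) is false
even though vmcb_cr0_from_guest_cr0(0x60000010) yields 0x80010010, a
value with paging enabled.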

As a result, VMRUN after svm_vcpu_reset() tries to run the
VCPU in Paged Real Mode with a stale MMU context (longmode page tables),
which causes some AMD CPUs** to bail out with VMEXIT_INVALID.

Fix the issue by unconditionally resetting the MMU context
at init_vmcb() time.

--
* AMD64 Architecture Programmer’s Manual,
    Volume 2: System Programming, rev: 3.25
      15.19 Paged Real Mode
** Opteron 1216

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 arch/x86/kvm/svm.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Paolo Bonzini Sept. 18, 2015, 2:50 p.m. UTC | #1
Thanks.  Unfortunately I have just sent a pull request to Linus, but
I'll add

Fixes: d81135a57aa6
Cc: stable@vger.kernel.org

and send it out next week.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Xiao Guangrong Sept. 21, 2015, 2:14 a.m. UTC | #2
Good catch and nice analysis. Thanks for your fix, Igor!

Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>

Patch

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index fdb8cb6..89173af 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1264,6 +1264,7 @@  static void init_vmcb(struct vcpu_svm *svm, bool init_event)
 	 * It also updates the guest-visible cr0 value.
 	 */
 	(void)kvm_set_cr0(&svm->vcpu, X86_CR0_NW | X86_CR0_CD | X86_CR0_ET);
+	kvm_mmu_reset_context(&svm->vcpu);
 
 	save->cr4 = X86_CR4_PAE;
 	/* rdx = ?? */
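
The effect of the added line can be illustrated with a toy model. This
is a hedged sketch under stated assumptions, not the KVM implementation:
struct toy_vcpu and every toy_* function are hypothetical, and the MMU
context is reduced to a single "stale" flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_WP (1ull << 16)  /* write protect */
#define CR0_PG (1ull << 31)  /* paging enable */

/* Hypothetical toy VCPU: guest-visible CR0 plus an MMU staleness flag. */
struct toy_vcpu {
	uint64_t cr0;
	bool mmu_context_stale;  /* true while the context matches the old mode */
};

/* Stand-in for rebuilding the MMU context for the current mode. */
static inline void toy_mmu_reset_context(struct toy_vcpu *v)
{
	v->mmu_context_stale = false;
}

/* Assumption: a reset happens only when PG or WP changes. */
static inline void toy_set_cr0(struct toy_vcpu *v, uint64_t cr0)
{
	if ((v->cr0 ^ cr0) & (CR0_PG | CR0_WP))
		toy_mmu_reset_context(v);
	v->cr0 = cr0;
}

/* Models the reset path with and without the one-line fix. */
static inline void toy_init_vmcb(struct toy_vcpu *v, bool with_fix)
{
	v->cr0 = 0;  /* mirrors "CR0: 0 -> 0x60000010" from the commit message */
	toy_set_cr0(v, 0x60000010);
	if (with_fix)
		toy_mmu_reset_context(v);  /* the unconditional reset */
}
```

Starting from a VCPU whose context was built for long mode
(mmu_context_stale set), toy_init_vmcb(&v, false) leaves the flag set,
while toy_init_vmcb(&v, true) clears it.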