From patchwork Sat Apr 27 06:42:43 2013
X-Patchwork-Submitter: "Xu, Xinhao"
X-Patchwork-Id: 2496711
From: "Xu, Xinhao"
To: "Nakajima, Jun" , "kvm@vger.kernel.org"
Subject: RE: [PATCH 11/11] nEPT: Provide the correct exit qualification upon EPT
Date: Sat, 27 Apr 2013 06:42:43 +0000
Message-ID:
References: <1366958611-6935-1-git-send-email-jun.nakajima@intel.com>
 <1366958611-6935-2-git-send-email-jun.nakajima@intel.com>
 <1366958611-6935-3-git-send-email-jun.nakajima@intel.com>
 <1366958611-6935-4-git-send-email-jun.nakajima@intel.com>
 <1366958611-6935-5-git-send-email-jun.nakajima@intel.com>
 <1366958611-6935-6-git-send-email-jun.nakajima@intel.com>
 <1366958611-6935-7-git-send-email-jun.nakajima@intel.com>
 <1366958611-6935-8-git-send-email-jun.nakajima@intel.com>
 <1366958611-6935-9-git-send-email-jun.nakajima@intel.com>
 <1366958611-6935-10-git-send-email-jun.nakajima@intel.com>
 <1366958611-6935-11-git-send-email-jun.nakajima@intel.com>
In-Reply-To: <1366958611-6935-11-git-send-email-jun.nakajima@intel.com>
Sender: kvm-owner@vger.kernel.org
X-Mailing-List: kvm@vger.kernel.org

From 2df72c1e8e3b167a6008ab11e2a68d734c68e425 Mon Sep 17 00:00:00 2001
From: Xinhao Xu
Date: Sat, 27 Apr 2013 05:45:49 +0800
Subject: [PATCH] nEPT: Check EPT misconfiguration while walking addr & move
 pte check code to vmx.c

I add code to detect EPT misconfiguration and inject it into L1. Now L1
can correctly go to the ept_misconfig handler (instead of wrongly going
to fast_page_fault): it will first try to handle an MMIO page fault and,
if that fails, the fault is a real EPT misconfiguration. For scalability,
Xiantao suggested moving vendor-specific code out of the common code. To
do this, I add a new op to the kvm_mmu struct, check_tdp_pte, to provide
an interface for checking faults while walking an address. Ideally all
fault detection would be done there, but so far I have only checked for
EPT misconfiguration. More patches will be added in the future.
Signed-off-by: Jun Nakajima
Signed-off-by: Xinhao Xu
---
 arch/x86/include/asm/kvm_emulate.h |  2 +
 arch/x86/include/asm/kvm_host.h    |  2 +
 arch/x86/kvm/cpuid.c               |  1 +
 arch/x86/kvm/mmu.c                 |  5 --
 arch/x86/kvm/mmu.h                 |  5 ++
 arch/x86/kvm/paging_tmpl.h         | 25 ++++++++---
 arch/x86/kvm/vmx.c                 | 79 +++++++++++++++++++++++++++++++++++-
 7 files changed, 106 insertions(+), 13 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 15f960c..c74f7c6 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -21,6 +21,8 @@ struct x86_exception {
 	u8 vector;
 	bool error_code_valid;
 	u16 error_code;
+	u64 exit_qualification;
+	bool fault_checked_by_vendor; /* is this fault checked by vendor */
 	bool nested_page_fault;
 	u64 address; /* cr2 or nested page fault gpa */
 };
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e029bba..0fedfcd 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -262,6 +262,8 @@ struct kvm_mmu {
 	void (*invlpg)(struct kvm_vcpu *vcpu, gva_t gva);
 	void (*update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 			   u64 *spte, const void *pte);
+	bool (*check_tdp_pte)(struct kvm_vcpu *vcpu, u64 pte, int level,
+			      struct x86_exception *fault);
 	hpa_t root_hpa;
 	int root_level;
 	int shadow_root_level;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index a20ecb5..51dd1f9 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -621,6 +621,7 @@ int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
 not_found:
 	return 36;
 }
+EXPORT_SYMBOL_GPL(cpuid_maxphyaddr);
 
 /*
  * If no match is found, check whether we exceed the vCPU's limit
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 99bfc5e..e1f5ea1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -230,11 +230,6 @@ static bool set_mmio_spte(u64 *sptep, gfn_t gfn, pfn_t pfn, unsigned access)
 	return false;
 }
 
-static inline u64 rsvd_bits(int s, int e)
-{
-	return ((1ULL << (e - s + 1)) - 1) << s;
-}
-
 void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
 		u64 dirty_mask, u64 nx_mask, u64 x_mask)
 {
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 19dd5ab..8aebd5a 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -91,6 +91,11 @@ static inline bool is_write_protection(struct kvm_vcpu *vcpu)
 	return kvm_read_cr0_bits(vcpu, X86_CR0_WP);
 }
 
+static inline u64 rsvd_bits(int s, int e)
+{
+	return ((1ULL << (e - s + 1)) - 1) << s;
+}
+
 /*
  * Will a fault with a given page-fault error code (pfec) cause a permission
  * fault with the given access (in ACC_* format)?
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index bd370e7..09e7154 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -240,6 +240,7 @@ retry_walk:
 	}
 #endif
 	walker->max_level = walker->level;
+	walker->fault.fault_checked_by_vendor = false;
 
 	ASSERT((!is_long_mode(vcpu) && is_pae(vcpu)) ||
 	       (mmu->get_cr3(vcpu) & CR3_NONPAE_RESERVED_BITS) == 0);
@@ -282,14 +283,22 @@ retry_walk:
 		if (unlikely(!FNAME(is_present_gpte)(pte)))
 			goto error;
 
+#if PTTYPE != PTTYPE_EPT
 		if (unlikely(is_rsvd_bits_set(&vcpu->arch.mmu, pte,
 					      walker->level))) {
 			errcode |= PFERR_PRESENT_MASK;
-#if PTTYPE != PTTYPE_EPT
 			errcode |= PFERR_RSVD_MASK;
-#endif
 			goto error;
 		}
+#else
+		if (vcpu->arch.mmu.check_tdp_pte)
+		{
+			if (vcpu->arch.mmu.check_tdp_pte(vcpu, pte,
+					walker->level, &walker->fault))
+				goto error;
+		}
+
+#endif
 
 		accessed_dirty &= pte;
 		pte_access = pt_access & FNAME(gpte_access)(vcpu, pte);
@@ -349,12 +358,14 @@ error:
 	walker->fault.vector = PF_VECTOR;
 	walker->fault.error_code_valid = true;
-#if PTTYPE != PTTYPE_EPT
 	walker->fault.error_code = errcode;
-#else
-	/* Reuse bits [2:0] of EPT violation */
-	walker->fault.error_code = vcpu->arch.exit_qualification & 0x7;
-#endif
+	/*
+	 * If the fault was not checked in check_tdp_pte, for now it is an
+	 * EPT violation; set the correct exit qualification according to
+	 * the Intel SDM. This part will be changed in the future.
+	 */
+	if (!walker->fault.fault_checked_by_vendor)
+		walker->fault.exit_qualification = ((walker->pte_access &
+						     0x7) << 3) | access;
 	walker->fault.address = addr;
 	walker->fault.nested_page_fault = mmu != vcpu->arch.walk_mmu;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 61e2853..a450c84 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7006,10 +7006,86 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
 	 * Note no need to set vmcs12->vm_exit_reason as it is already copied
 	 * from vmcs02 in nested_vmx_vmexit() above, i.e., EPT_VIOLATION.
 	 */
-	vmcs12->exit_qualification = fault->error_code;
+	if (fault->fault_checked_by_vendor && fault->exit_qualification == 0)
+		vmcs12->vm_exit_reason = EXIT_REASON_EPT_MISCONFIG;
+	else
+		vmcs12->vm_exit_reason = EXIT_REASON_EPT_VIOLATION;
+	vmcs12->exit_qualification = fault->exit_qualification;
 	vmcs12->guest_physical_address = fault->address;
 }
 
+static bool nested_ept_rsvd_bits_check(struct kvm_vcpu *vcpu, u64 pte, int level)
+{
+	int maxphyaddr = cpuid_maxphyaddr(vcpu);
+	u64 rsvd_mask = rsvd_bits(maxphyaddr, 51);
+
+	switch (level)
+	{
+	case 4:
+		rsvd_mask |= rsvd_bits(3, 7);
+		break;
+	case 3:
+	case 2:
+		if (pte & (1 << 7))
+			rsvd_mask |= rsvd_bits(PAGE_SHIFT,
+					PAGE_SHIFT + 9 * (level - 1) - 1);
+		else
+			rsvd_mask |= rsvd_bits(3, 6);
+		break;
+	case 1:
+		break;
+	default:
+		/* impossible to get here */
+		BUG();
+	}
+	return pte & rsvd_mask;
+}
+
+static bool nested_ept_rwx_bits_check(u64 pte)
+{
+	/* write only or write/execute only */
+	uint8_t rwx_bits = pte & 7;
+
+	switch (rwx_bits)
+	{
+	case 0x2:
+	case 0x6:
+		return 1;
+	case 0x4:
+		if (!(nested_vmx_ept_caps & 0x1))
+			return 1;
+	default:
+		return 0;
+	}
+}
+
+static bool nested_ept_memtype_check(u64 pte, int level)
+{
+	if (level == 1 || (level == 2 && (pte & (1ULL << 7)))) {
+		/* 0x38, namely bits 5:3, stands for EPT memory type */
+		u64 ept_mem_type = (pte & 0x38) >> 3;
+
+		if (ept_mem_type == 0x2 || ept_mem_type == 0x3 ||
+				ept_mem_type == 0x7)
+			return 1;
+	}
+	return 0;
+}
+
+bool nested_check_ept_pte(struct kvm_vcpu *vcpu, u64 pte, int level,
+				struct x86_exception *fault)
+{
+	bool r;
+	r = nested_ept_rsvd_bits_check(vcpu, pte, level) ||
+		nested_ept_rwx_bits_check(pte) ||
+		nested_ept_memtype_check(pte, level);
+	if (r)
+	{
+		fault->exit_qualification = 0;
+		fault->fault_checked_by_vendor = true;
+	}
+	return r;
+}
+
 static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
 {
 	int r = kvm_init_shadow_EPT_mmu(vcpu, &vcpu->arch.mmu);
@@ -7017,6 +7093,7 @@ static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
 	vcpu->arch.mmu.set_cr3 = vmx_set_cr3;
 	vcpu->arch.mmu.get_cr3 = nested_ept_get_cr3;
 	vcpu->arch.mmu.inject_page_fault = nested_ept_inject_page_fault;
+	vcpu->arch.mmu.check_tdp_pte = nested_check_ept_pte;
 
 	vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu;
-- 
1.7.1

-----Original Message-----
From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On Behalf Of Jun Nakajima
Sent: Friday, April 26, 2013 2:44 PM
To: kvm@vger.kernel.org
Subject: [PATCH 11/11] nEPT: Provide the correct exit qualification upon EPT

Save bits [2:0] of the exit qualification at EPT violation, and use the
information when injecting the EPT violation.
Signed-off-by: Jun Nakajima
Signed-off-by: Xinhao Xu
---
 arch/x86/include/asm/kvm_host.h | 2 ++
 arch/x86/kvm/paging_tmpl.h      | 5 +++++
 arch/x86/kvm/vmx.c              | 3 +++
 3 files changed, 10 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4979778..e029bba 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -504,6 +504,8 @@ struct kvm_vcpu_arch {
 	 * instruction.
 	 */
 	bool write_fault_to_shadow_pgtable;
+
+	unsigned long exit_qualification; /* set at EPT violation at this point */
 };
 
 struct kvm_lpage_info {
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index e13b6c5..bd370e7 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -349,7 +349,12 @@ error:
 	walker->fault.vector = PF_VECTOR;
 	walker->fault.error_code_valid = true;
+#if PTTYPE != PTTYPE_EPT
 	walker->fault.error_code = errcode;
+#else
+	/* Reuse bits [2:0] of EPT violation */
+	walker->fault.error_code = vcpu->arch.exit_qualification & 0x7;
+#endif
 	walker->fault.address = addr;
 	walker->fault.nested_page_fault = mmu != vcpu->arch.walk_mmu;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 95304cc..61e2853 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -425,6 +425,7 @@ struct vcpu_vmx {
 	ktime_t entry_time;
 	s64 vnmi_blocked_time;
 	u32 exit_reason;
+	unsigned long exit_qualification;
 
 	bool rdtscp_enabled;
 
@@ -5074,6 +5075,8 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
 	/* ept page table is present? */
 	error_code |= (exit_qualification >> 3) & 0x1;
 
+	vcpu->arch.exit_qualification = exit_qualification;
+
 	return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
 }
 
-- 
1.8.2.1.610.g562af5b