From patchwork Fri Dec 18 13:50:33 2015
From: David Vrabel
Date: Fri, 18 Dec 2015 13:50:33 +0000
Message-ID: <1450446634-8762-2-git-send-email-david.vrabel@citrix.com>
In-Reply-To: <1450446634-8762-1-git-send-email-david.vrabel@citrix.com>
References: <1450446634-8762-1-git-send-email-david.vrabel@citrix.com>
Cc: Kevin Tian, Jun Nakajima, George Dunlap, Andrew Cooper, Tim Deegan,
 David Vrabel, Jan Beulich
Subject: [Xen-devel] [PATCHv6 1/2] x86/ept: invalidate guest physical
 mappings on VMENTER
List-Id: Xen developer discussion

If a guest allocates a page and the tlbflush_timestamp on the page
indicates that a TLB flush of the previous owner is required, only the
linear and combined mappings are invalidated.  The guest-physical
mappings are not invalidated.  This is currently safe because the EPT
code ensures that the guest-physical and combined mappings are
invalidated /before/ the page is freed.
However, this prevents us from deferring the EPT invalidate until after
the page is freed (e.g., to defer the invalidate until the p2m locks
are released).

The TLB flush that may be done after allocating a page already causes
the original guest to VMEXIT, thus on VMENTER we can do an INVEPT if
one is pending.

This means __ept_sync_domain() need not do anything, and thus the
on_selected_cpus() call does not need to wait for as long.

ept_sync_domain() now marks all PCPUs as needing to be invalidated,
including PCPUs that the domain has not run on.  We still only IPI
those PCPUs that are active, so this does not result in any more
INVEPT calls.

We do not attempt to track when PCPUs may have cached translations
because the only safe way to clear this per-CPU state is if immediately
after an invalidate the PCPU is not active (i.e., the PCPU is not in
d->domain_dirty_cpumask).  Since we only invalidate on VMENTER or by
IPIing active PCPUs, this can never happen.

Signed-off-by: David Vrabel
Reviewed-by: George Dunlap
Acked-by: Kevin Tian
---
v6:
- Add some comments.
- Use test then clear instead of test_and_clear in vmx_vmenter_helper().

v4:
- __ept_sync_domain() is a no-op -- invalidates are done before VMENTER.
- Initialize ept->invalidate to all ones so the initial invalidate is
  always done.
---
 xen/arch/x86/hvm/vmx/vmx.c         | 24 ++++++++++++------------
 xen/arch/x86/mm/p2m-ept.c          | 31 ++++++++++++++++++-------------
 xen/include/asm-x86/hvm/vmx/vmcs.h |  4 ++--
 3 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 2922673..23ab737 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -744,24 +744,12 @@ static void vmx_ctxt_switch_from(struct vcpu *v)
 
 static void vmx_ctxt_switch_to(struct vcpu *v)
 {
-    struct domain *d = v->domain;
     unsigned long old_cr4 = read_cr4(), new_cr4 = mmu_cr4_features;
-    struct ept_data *ept_data = &p2m_get_hostp2m(d)->ept;
 
     /* HOST_CR4 in VMCS is always mmu_cr4_features. Sync CR4 now. */
     if ( old_cr4 != new_cr4 )
         write_cr4(new_cr4);
 
-    if ( paging_mode_hap(d) )
-    {
-        unsigned int cpu = smp_processor_id();
-        /* Test-and-test-and-set this CPU in the EPT-is-synced mask. */
-        if ( !cpumask_test_cpu(cpu, ept_get_synced_mask(ept_data)) &&
-             !cpumask_test_and_set_cpu(cpu,
-                                       ept_get_synced_mask(ept_data)) )
-            __invept(INVEPT_SINGLE_CONTEXT, ept_get_eptp(ept_data), 0);
-    }
-
     vmx_restore_guest_msrs(v);
     vmx_restore_dr(v);
 }
@@ -3586,6 +3574,18 @@ void vmx_vmenter_helper(const struct cpu_user_regs *regs)
     if ( unlikely(need_flush) )
         vpid_sync_all();
 
+    if ( paging_mode_hap(curr->domain) )
+    {
+        struct ept_data *ept = &p2m_get_hostp2m(curr->domain)->ept;
+        unsigned int cpu = smp_processor_id();
+
+        if ( cpumask_test_cpu(cpu, ept->invalidate) )
+        {
+            cpumask_clear_cpu(cpu, ept->invalidate);
+            __invept(INVEPT_SINGLE_CONTEXT, ept_get_eptp(ept), 0);
+        }
+    }
+
  out:
     HVMTRACE_ND(VMENTRY, 0, 1/*cycles*/, 0, 0, 0, 0, 0, 0, 0);
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index eef0372..c094320 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -1089,9 +1089,10 @@ static void ept_memory_type_changed(struct p2m_domain *p2m)
 
 static void __ept_sync_domain(void *info)
 {
-    struct ept_data *ept = &((struct p2m_domain *)info)->ept;
-
-    __invept(INVEPT_SINGLE_CONTEXT, ept_get_eptp(ept), 0);
+    /*
+     * The invalidation will be done before VMENTER (see
+     * vmx_vmenter_helper()).
+     */
 }
 
 void ept_sync_domain(struct p2m_domain *p2m)
@@ -1108,15 +1109,15 @@ void ept_sync_domain(struct p2m_domain *p2m)
         p2m_flush_nestedp2m(d);
 
     /*
-     * Flush active cpus synchronously. Flush others the next time this domain
-     * is scheduled onto them. We accept the race of other CPUs adding to
-     * the ept_synced mask before on_selected_cpus() reads it, resulting in
-     * unnecessary extra flushes, to avoid allocating a cpumask_t on the stack.
+     * Need to invalidate on all PCPUs because either:
+     *
+     * a) A VCPU has run and some translations may be cached.
+     * b) A VCPU has not run and the initial invalidation in case
+     *    of an EP4TA reuse is still needed.
      */
-    cpumask_and(ept_get_synced_mask(ept),
-                d->domain_dirty_cpumask, &cpu_online_map);
+    cpumask_setall(ept->invalidate);
 
-    on_selected_cpus(ept_get_synced_mask(ept),
+    on_selected_cpus(d->domain_dirty_cpumask,
                      __ept_sync_domain, p2m, 1);
 }
 
@@ -1182,10 +1183,14 @@ int ept_p2m_init(struct p2m_domain *p2m)
         p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
     }
 
-    if ( !zalloc_cpumask_var(&ept->synced_mask) )
+    if ( !zalloc_cpumask_var(&ept->invalidate) )
         return -ENOMEM;
 
-    on_each_cpu(__ept_sync_domain, p2m, 1);
+    /*
+     * Assume an initial invalidation is required, in case an EP4TA is
+     * reused.
+     */
+    cpumask_setall(ept->invalidate);
 
     return 0;
 }
@@ -1193,7 +1198,7 @@ int ept_p2m_init(struct p2m_domain *p2m)
 void ept_p2m_uninit(struct p2m_domain *p2m)
 {
     struct ept_data *ept = &p2m->ept;
-    free_cpumask_var(ept->synced_mask);
+    free_cpumask_var(ept->invalidate);
 }
 
 static void ept_dump_p2m_table(unsigned char key)
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index a8d4d5b..d1496b8 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -67,7 +67,8 @@ struct ept_data {
         };
         u64 eptp;
     };
-    cpumask_var_t synced_mask;
+    /* Set of PCPUs needing an INVEPT before a VMENTER. */
+    cpumask_var_t invalidate;
 };
 
 #define _VMX_DOMAIN_PML_ENABLED 0
@@ -97,7 +98,6 @@ struct pi_desc {
 #define ept_get_wl(ept)   ((ept)->ept_wl)
 #define ept_get_asr(ept)  ((ept)->asr)
 #define ept_get_eptp(ept) ((ept)->eptp)
-#define ept_get_synced_mask(ept) ((ept)->synced_mask)
 
 #define NR_PML_ENTRIES   512
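
For readers unfamiliar with the pattern above, here is a minimal,
self-contained model of the deferred-invalidate scheme, written as an
illustration rather than as Xen code: pending_invalidate, enter_guest()
and sync_domain() are hypothetical stand-ins for ept->invalidate,
vmx_vmenter_helper() and ept_sync_domain(), and a printf() stands in
for the INVEPT instruction.

#include <stdio.h>

#define NR_CPUS 4

/* One bit per PCPU: set means "INVEPT needed before the next VMENTER". */
static unsigned int pending_invalidate;

/* Stand-in for __invept(INVEPT_SINGLE_CONTEXT, eptp, 0). */
static void invept(unsigned int cpu)
{
    printf("cpu%u: INVEPT\n", cpu);
}

/*
 * Entry path (cf. vmx_vmenter_helper()): test, then clear, then
 * invalidate.  Clearing before the INVEPT is safe because this PCPU
 * cannot cache new guest-physical translations until it actually
 * enters the guest, which happens after the INVEPT.
 */
static void enter_guest(unsigned int cpu)
{
    if ( pending_invalidate & (1u << cpu) )
    {
        pending_invalidate &= ~(1u << cpu);
        invept(cpu);
    }
    printf("cpu%u: VMENTER\n", cpu);
}

/*
 * Sync path (cf. ept_sync_domain()): mark every PCPU as needing an
 * invalidate.  In Xen, active PCPUs are additionally IPI'd so they
 * VMEXIT and re-enter through the entry path above.
 */
static void sync_domain(void)
{
    pending_invalidate = (1u << NR_CPUS) - 1;
}

int main(void)
{
    sync_domain();   /* e.g. after a p2m change */
    enter_guest(0);  /* pending bit set: INVEPT runs */
    enter_guest(0);  /* bit already clear: no INVEPT */
    return 0;
}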