From patchwork Tue Jul 7 16:41:04 2009
X-Patchwork-Submitter: Joerg Roedel
X-Patchwork-Id: 34490
From: Joerg Roedel
To: Avi Kivity, Marcelo Tosatti
CC: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Joerg Roedel
Subject: [PATCH 3/6] kvm/mmu: make direct mapping paths aware of mapping levels
Date: Tue, 7 Jul 2009 18:41:04 +0200
Message-ID: <1246984867-15952-4-git-send-email-joerg.roedel@amd.com>
In-Reply-To: <1246984867-15952-1-git-send-email-joerg.roedel@amd.com>
References: <1246984867-15952-1-git-send-email-joerg.roedel@amd.com>
X-Mailer: git-send-email 1.6.3.3
X-Mailing-List: kvm@vger.kernel.org

Signed-off-by: Joerg Roedel
---
 arch/x86/include/asm/kvm_host.h |    2 +-
 arch/x86/kvm/mmu.c              |   74 ++++++++++++++++++++++----------------
 arch/x86/kvm/paging_tmpl.h      |    6 ++--
 3 files changed, 47 insertions(+), 35 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 30b625d..1fa1ff0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -315,7 +315,7 @@ struct kvm_vcpu_arch {
 	struct {
 		gfn_t gfn;	/* presumed gfn during guest pte update */
 		pfn_t pfn;	/* pfn corresponding to that gfn */
-		int largepage;
+		int level;
 		unsigned long mmu_seq;
 	} update_pte;

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7de9f41..d42185a 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -257,7 +257,7 @@ static int is_last_spte(u64 pte, int level)
 {
 	if (level == PT_PAGE_TABLE_LEVEL)
 		return 1;
-	if (level == PT_DIRECTORY_LEVEL && is_large_pte(pte))
+	if (is_large_pte(pte))
 		return 1;
 	return 0;
 }
@@ -746,7 +746,7 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp)
 static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
 			  int (*handler)(struct kvm *kvm, unsigned long *rmapp))
 {
-	int i;
+	int i, j;
 	int retval = 0;

 	/*
@@ -765,11 +765,15 @@ static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
 		end = start + (memslot->npages << PAGE_SHIFT);
 		if (hva >= start && hva < end) {
 			gfn_t gfn_offset = (hva - start) >> PAGE_SHIFT;
-			int idx = gfn_offset /
-				  KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL);
+
 			retval |= handler(kvm, &memslot->rmap[gfn_offset]);
-			retval |= handler(kvm,
-				&memslot->lpage_info[0][idx].rmap_pde);
+
+			for (j = 0; j < KVM_NR_PAGE_SIZES - 1; ++j) {
+				int idx = gfn_offset;
+				idx /= KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL + j);
+				retval |= handler(kvm,
+					&memslot->lpage_info[j][idx].rmap_pde);
+			}
 		}
 	}

@@ -1713,7 +1717,7 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,

 static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		    unsigned pte_access, int user_fault,
-		    int write_fault, int dirty, int largepage,
+		    int write_fault, int dirty, int level,
 		    gfn_t gfn, pfn_t pfn, bool speculative,
 		    bool can_unsync)
 {
@@ -1736,7 +1740,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		spte |= shadow_nx_mask;
 	if (pte_access & ACC_USER_MASK)
 		spte |= shadow_user_mask;
-	if (largepage)
+	if (level > PT_PAGE_TABLE_LEVEL)
 		spte |= PT_PAGE_SIZE_MASK;
 	if (tdp_enabled)
 		spte |= kvm_x86_ops->get_mt_mask(vcpu, gfn,
@@ -1747,7 +1751,8 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	if ((pte_access & ACC_WRITE_MASK)
 	    || (write_fault && !is_write_protection(vcpu) && !user_fault)) {

-		if (largepage && has_wrprotected_page(vcpu->kvm, gfn, 1)) {
+		if (level > PT_PAGE_TABLE_LEVEL &&
+		    has_wrprotected_page(vcpu->kvm, gfn, level)) {
 			ret = 1;
 			spte = shadow_trap_nonpresent_pte;
 			goto set_pte;
@@ -1785,7 +1790,7 @@ set_pte:
 static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 			 unsigned pt_access, unsigned pte_access,
 			 int user_fault, int write_fault, int dirty,
-			 int *ptwrite, int largepage, gfn_t gfn,
+			 int *ptwrite, int level, gfn_t gfn,
 			 pfn_t pfn, bool speculative)
 {
 	int was_rmapped = 0;
@@ -1801,7 +1806,8 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		 * If we overwrite a PTE page pointer with a 2MB PMD, unlink
 		 * the parent of the now unreachable PTE.
 		 */
-		if (largepage && !is_large_pte(*sptep)) {
+		if (level > PT_PAGE_TABLE_LEVEL &&
+		    !is_large_pte(*sptep)) {
 			struct kvm_mmu_page *child;
 			u64 pte = *sptep;
@@ -1814,8 +1820,9 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		} else
 			was_rmapped = 1;
 	}
+
 	if (set_spte(vcpu, sptep, pte_access, user_fault, write_fault,
-		      dirty, largepage, gfn, pfn, speculative, true)) {
+		      dirty, level, gfn, pfn, speculative, true)) {
 		if (write_fault)
 			*ptwrite = 1;
 		kvm_x86_ops->tlb_flush(vcpu);
@@ -1851,7 +1858,7 @@ static void nonpaging_new_cr3(struct kvm_vcpu *vcpu)
 }

 static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
-			int largepage, gfn_t gfn, pfn_t pfn)
+			int level, gfn_t gfn, pfn_t pfn)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	struct kvm_mmu_page *sp;
@@ -1859,11 +1866,10 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
 	gfn_t pseudo_gfn;

 	for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
-		if (iterator.level == PT_PAGE_TABLE_LEVEL
-		    || (largepage && iterator.level == PT_DIRECTORY_LEVEL)) {
+		if (iterator.level == level) {
 			mmu_set_spte(vcpu, iterator.sptep, ACC_ALL, ACC_ALL,
 				     0, write, 1, &pt_write,
-				     largepage, gfn, pfn, false);
+				     level, gfn, pfn, false);
 			++vcpu->stat.pf_fixed;
 			break;
 		}
@@ -1891,14 +1897,20 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
 static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
 {
 	int r;
-	int largepage = 0;
+	int level;
 	pfn_t pfn;
 	unsigned long mmu_seq;

-	if (mapping_level(vcpu, gfn) == PT_DIRECTORY_LEVEL) {
-		gfn &= ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
-		largepage = 1;
-	}
+	level = mapping_level(vcpu, gfn);
+
+	/*
+	 * This path builds a PAE pagetable - so we can map 2mb pages at
+	 * maximum. Therefore check if the level is larger than that.
+	 */
+	if (level > PT_DIRECTORY_LEVEL)
+		level = PT_DIRECTORY_LEVEL;
+
+	gfn &= ~(KVM_PAGES_PER_HPAGE(level) - 1);

 	mmu_seq = vcpu->kvm->mmu_notifier_seq;
 	smp_rmb();
@@ -1914,7 +1926,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
 	if (mmu_notifier_retry(vcpu, mmu_seq))
 		goto out_unlock;
 	kvm_mmu_free_some_pages(vcpu);
-	r = __direct_map(vcpu, v, write, largepage, gfn, pfn);
+	r = __direct_map(vcpu, v, write, level, gfn, pfn);
 	spin_unlock(&vcpu->kvm->mmu_lock);

@@ -2090,7 +2102,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa,
 {
 	pfn_t pfn;
 	int r;
-	int largepage = 0;
+	int level;
 	gfn_t gfn = gpa >> PAGE_SHIFT;
 	unsigned long mmu_seq;

@@ -2101,10 +2113,10 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa,
 	if (r)
 		return r;

-	if (mapping_level(vcpu, gfn) == PT_DIRECTORY_LEVEL) {
-		gfn &= ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
-		largepage = 1;
-	}
+	level = mapping_level(vcpu, gfn);
+
+	gfn &= ~(KVM_PAGES_PER_HPAGE(level) - 1);
+
 	mmu_seq = vcpu->kvm->mmu_notifier_seq;
 	smp_rmb();
 	pfn = gfn_to_pfn(vcpu->kvm, gfn);
@@ -2117,7 +2129,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa,
 		goto out_unlock;
 	kvm_mmu_free_some_pages(vcpu);
 	r = __direct_map(vcpu, gpa, error_code & PFERR_WRITE_MASK,
-			 largepage, gfn, pfn);
+			 level, gfn, pfn);
 	spin_unlock(&vcpu->kvm->mmu_lock);

 	return r;
@@ -2424,7 +2436,7 @@ static void mmu_pte_write_new_pte(struct kvm_vcpu *vcpu,
 				  const void *new)
 {
 	if (sp->role.level != PT_PAGE_TABLE_LEVEL) {
-		if (!vcpu->arch.update_pte.largepage ||
+		if (vcpu->arch.update_pte.level == PT_PAGE_TABLE_LEVEL ||
 		    sp->role.glevels == PT32_ROOT_LEVEL) {
 			++vcpu->kvm->stat.mmu_pde_zapped;
 			return;
@@ -2474,7 +2486,7 @@ static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	u64 gpte = 0;
 	pfn_t pfn;

-	vcpu->arch.update_pte.largepage = 0;
+	vcpu->arch.update_pte.level = PT_PAGE_TABLE_LEVEL;

 	if (bytes != 4 && bytes != 8)
 		return;
@@ -2506,7 +2518,7 @@ static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	if (is_large_pte(gpte) &&
 	    (mapping_level(vcpu, gfn) == PT_DIRECTORY_LEVEL)) {
 		gfn &= ~(KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL) - 1);
-		vcpu->arch.update_pte.largepage = 1;
+		vcpu->arch.update_pte.level = PT_DIRECTORY_LEVEL;
 	}
 	vcpu->arch.update_pte.mmu_seq = vcpu->kvm->mmu_notifier_seq;
 	smp_rmb();
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 44f0346..b167f0d 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -253,7 +253,7 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
 	pt_element_t gpte;
 	unsigned pte_access;
 	pfn_t pfn;
-	int largepage = vcpu->arch.update_pte.largepage;
+	int level = vcpu->arch.update_pte.level;

 	gpte = *(const pt_element_t *)pte;
 	if (~gpte & (PT_PRESENT_MASK | PT_ACCESSED_MASK)) {
@@ -272,7 +272,7 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
 		return;
 	kvm_get_pfn(pfn);
 	mmu_set_spte(vcpu, spte, page->role.access, pte_access, 0, 0,
-		     gpte & PT_DIRTY_MASK, NULL, largepage,
+		     gpte & PT_DIRTY_MASK, NULL, level,
 		     gpte_to_gfn(gpte), pfn, true);
 }

@@ -306,7 +306,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 				     gw->pte_access & access,
 				     user_fault, write_fault,
 				     gw->ptes[gw->level-1] & PT_DIRTY_MASK,
-				     ptwrite, largepage,
+				     ptwrite, level,
 				     gw->gfn, pfn, false);
 			break;
 		}
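
For readers following the rmap changes above: the loop added to kvm_handle_hva() now runs the handler once on the 4k rmap and once per large-page rmap, with the index at each large-page level derived by dividing the gfn offset by the number of 4k pages a huge page covers at that level. A minimal standalone sketch of that arithmetic; the constants and KVM_PAGES_PER_HPAGE() below are local stand-ins that mirror the kernel's definitions, not code imported from the patch, and the sample offset is arbitrary.

/*
 * Editor's sketch, not kernel code: per-level rmap indexing as done by
 * the new kvm_handle_hva() loop.
 */
#include <stdio.h>

#define PT_DIRECTORY_LEVEL	2	/* 2MB pages */
#define KVM_NR_PAGE_SIZES	3	/* 4k, 2MB, 1GB */

/* each level covers 9 more bits of the guest frame number */
#define KVM_PAGES_PER_HPAGE(level) (1UL << (((level) - 1) * 9))

int main(void)
{
	unsigned long gfn_offset = 0x12345;	/* example offset into the slot */
	int j;

	/* the 4k rmap is indexed directly by the gfn offset */
	printf("4k rmap index:      %lu\n", gfn_offset);

	/* each large-page rmap has one entry per 2MB/1GB region */
	for (j = 0; j < KVM_NR_PAGE_SIZES - 1; ++j) {
		unsigned long idx =
			gfn_offset / KVM_PAGES_PER_HPAGE(PT_DIRECTORY_LEVEL + j);
		printf("level %d rmap index: %lu\n", PT_DIRECTORY_LEVEL + j, idx);
	}
	return 0;
}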
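
The fault paths now share one pattern: ask mapping_level() for a level, then mask the gfn down to the base frame of that level's region; the nonpaging path additionally clamps the level to PT_DIRECTORY_LEVEL because it builds a PAE page table, as the comment in that hunk says. A sketch of that computation under the same local-constant assumptions as above, with made-up sample values.

/*
 * Editor's sketch, not kernel code: base-gfn alignment and the PAE
 * clamp used by nonpaging_map()/tdp_page_fault().
 */
#include <stdio.h>

#define PT_DIRECTORY_LEVEL	2	/* 2MB pages */
#define PT_PDPE_LEVEL		3	/* 1GB pages */

#define KVM_PAGES_PER_HPAGE(level) (1UL << (((level) - 1) * 9))

/* round the gfn down to the first frame of its large-page region */
static unsigned long align_gfn(unsigned long gfn, int level)
{
	return gfn & ~(KVM_PAGES_PER_HPAGE(level) - 1);
}

int main(void)
{
	unsigned long gfn = 0x12345;	/* faulting guest frame number */
	int level = PT_PDPE_LEVEL;	/* pretend mapping_level() said 1GB */

	/* the nonpaging path builds a PAE page table, so 2MB is the maximum */
	if (level > PT_DIRECTORY_LEVEL)
		level = PT_DIRECTORY_LEVEL;

	printf("clamped level %d, base gfn 0x%lx\n", level, align_gfn(gfn, level));
	printf("level %d base gfn would be 0x%lx\n",
	       PT_PDPE_LEVEL, align_gfn(gfn, PT_PDPE_LEVEL));
	return 0;
}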
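
Lastly, the __direct_map() and FNAME(fetch) hunks replace the largepage tests against two hard-coded levels with a single comparison against the target level, so the shadow walk installs the leaf spte as soon as the iterator reaches it. The toy loop below only imitates that control flow, it is not the kernel's shadow-walk iterator; presumably this is what lets further page sizes be supported by simply having mapping_level() return a higher level.

/*
 * Editor's sketch, not kernel code: walk down the shadow levels and
 * stop at the requested mapping level.
 */
#include <stdio.h>

#define PT64_ROOT_LEVEL		4
#define PT_DIRECTORY_LEVEL	2
#define PT_PAGE_TABLE_LEVEL	1

int main(void)
{
	int target_level = PT_DIRECTORY_LEVEL;	/* e.g. a 2MB mapping */
	int level;

	for (level = PT64_ROOT_LEVEL; level >= PT_PAGE_TABLE_LEVEL; --level) {
		if (level == target_level) {
			/* the real code calls mmu_set_spte(..., level, ...) here */
			printf("install leaf spte at level %d\n", level);
			break;
		}
		/* the real code links in the next shadow page table here */
		printf("link intermediate page table at level %d\n", level);
	}
	return 0;
}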