From patchwork Tue Sep 23 00:54:48 2014
X-Patchwork-Submitter: Mario Smarduch <m.smarduch@samsung.com>
X-Patchwork-Id: 4951631
From: Mario Smarduch <m.smarduch@samsung.com>
To: kvmarm@lists.cs.columbia.edu, christoffer.dall@linaro.org,
	marc.zyngier@arm.com, pbonzini@redhat.com, gleb@kernel.org,
	agraf@suse.de, borntraeger@de.ibm.com, cornelia.huck@de.ibm.com,
	xiaoguangrong@linux.vnet.ibm.com, ralf@linux-mips.org,
	catalin.marinas@arm.com
Cc: peter.maydell@linaro.org, Mario Smarduch <m.smarduch@samsung.com>,
	linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
	steve.capper@arm.com
Subject: [PATCH v11 4/6] arm: KVM: Add initial dirty page locking infrastructure
Date: Mon, 22 Sep 2014 17:54:48 -0700
Message-id: <1411433690-8104-5-git-send-email-m.smarduch@samsung.com>
In-reply-to: <1411433690-8104-1-git-send-email-m.smarduch@samsung.com>
References: <1411433690-8104-1-git-send-email-m.smarduch@samsung.com>

Patch adds support for initial write protection of VM memslots. This patch
series assumes that huge PUDs will not be used in 2nd stage tables, which is
always valid on ARMv7.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_host.h       |    2 +
 arch/arm/include/asm/kvm_mmu.h        |   20 +++++
 arch/arm/include/asm/pgtable-3level.h |    1 +
 arch/arm/kvm/arm.c                    |    9 +++
 arch/arm/kvm/mmu.c                    |  129 +++++++++++++++++++++++++++++++++
 5 files changed, 161 insertions(+)
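For reference, the userspace side of this feature is the standard KVM
dirty-log UAPI: setting KVM_MEM_LOG_DIRTY_PAGES on a memslot (which, with
this patch, reaches kvm_mmu_wp_memory_region() via
kvm_arch_commit_memory_region()) and later reading the per-page bitmap with
KVM_GET_DIRTY_LOG. The sketch below is not part of the patch; the VM fd,
slot number, guest physical address and backing buffer are made-up examples.

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Re-register an existing memslot with dirty-page logging enabled. */
static int enable_dirty_log(int vm_fd, void *host_mem, uint64_t size)
{
	struct kvm_userspace_memory_region region = {
		.slot            = 0,                       /* example slot */
		.flags           = KVM_MEM_LOG_DIRTY_PAGES, /* start logging */
		.guest_phys_addr = 0x80000000,              /* example GPA */
		.memory_size     = size,
		.userspace_addr  = (uintptr_t)host_mem,
	};

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}

/* Fetch the dirty bitmap, one bit per page; the caller allocates
 * size / page_size / 8 bytes, rounded up. */
static int read_dirty_log(int vm_fd, void *bitmap)
{
	struct kvm_dirty_log log = {
		.slot         = 0,
		.dirty_bitmap = bitmap,
	};

	return ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);
}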
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 311486f..12311a5 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -243,4 +243,6 @@ int kvm_perf_teardown(void);
 u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
 int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
 
+void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
+
 #endif /* __ARM_KVM_HOST_H__ */
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 5cc0b0f..08ab5e8 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -114,6 +114,26 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
 	pmd_val(*pmd) |= L_PMD_S2_RDWR;
 }
 
+static inline void kvm_set_s2pte_readonly(pte_t *pte)
+{
+	pte_val(*pte) = (pte_val(*pte) & ~L_PTE_S2_RDWR) | L_PTE_S2_RDONLY;
+}
+
+static inline bool kvm_s2pte_readonly(pte_t *pte)
+{
+	return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
+}
+
+static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
+{
+	pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
+}
+
+static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
+{
+	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
+}
+
 /* Open coded p*d_addr_end that can deal with 64bit addresses */
 #define kvm_pgd_addr_end(addr, end)					\
 ({	u64 __boundary = ((addr) + PGDIR_SIZE) & PGDIR_MASK;		\
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 85c60ad..8a3266e 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -129,6 +129,7 @@
 #define L_PTE_S2_RDONLY		(_AT(pteval_t, 1) << 6)   /* HAP[1]   */
 #define L_PTE_S2_RDWR		(_AT(pteval_t, 3) << 6)   /* HAP[2:1] */
 
+#define L_PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[1]   */
 #define L_PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 
 /*
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index c52b2bd..e1be6c7 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -242,6 +242,15 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 				   const struct kvm_memory_slot *old,
 				   enum kvm_mr_change change)
 {
+#ifdef CONFIG_ARM
+	/*
+	 * At this point the memslot has been committed and there is an
+	 * allocated dirty_bitmap[]; dirty pages will be tracked while the
+	 * memory slot is write protected.
+	 */
+	if ((change != KVM_MR_DELETE) && (mem->flags & KVM_MEM_LOG_DIRTY_PAGES))
+		kvm_mmu_wp_memory_region(kvm, mem->slot);
+#endif
 }
 
 void kvm_arch_flush_shadow_all(struct kvm *kvm)
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 2336061..ba00899 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -45,6 +45,7 @@ static phys_addr_t hyp_idmap_vector;
 #define pgd_order get_order(PTRS_PER_PGD * sizeof(pgd_t))
 
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
+#define kvm_pud_huge(_x)	pud_huge(_x)
 
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
@@ -746,6 +747,134 @@ static bool transparent_hugepage_adjust(pfn_t *pfnp, phys_addr_t *ipap)
 	return false;
 }
 
+#ifdef CONFIG_ARM
+/**
+ * stage2_wp_ptes - write protect PMD range
+ * @pmd:	pointer to pmd entry
+ * @addr:	range start address
+ * @end:	range end address
+ */
+static void stage2_wp_ptes(pmd_t *pmd, phys_addr_t addr, phys_addr_t end)
+{
+	pte_t *pte;
+
+	pte = pte_offset_kernel(pmd, addr);
+	do {
+		if (!pte_none(*pte)) {
+			if (!kvm_s2pte_readonly(pte))
+				kvm_set_s2pte_readonly(pte);
+		}
+	} while (pte++, addr += PAGE_SIZE, addr != end);
+}
+
+/**
+ * stage2_wp_pmds - write protect PUD range
+ * @pud:	pointer to pud entry
+ * @addr:	range start address
+ * @end:	range end address
+ */
+static void stage2_wp_pmds(pud_t *pud, phys_addr_t addr, phys_addr_t end)
+{
+	pmd_t *pmd;
+	phys_addr_t next;
+
+	pmd = pmd_offset(pud, addr);
+
+	do {
+		next = kvm_pmd_addr_end(addr, end);
+		if (!pmd_none(*pmd)) {
+			if (kvm_pmd_huge(*pmd)) {
+				if (!kvm_s2pmd_readonly(pmd))
+					kvm_set_s2pmd_readonly(pmd);
+			} else {
+				stage2_wp_ptes(pmd, addr, next);
+			}
+		}
+	} while (pmd++, addr = next, addr != end);
+}
+
+/**
+ * stage2_wp_puds - write protect PGD range
+ * @kvm:	pointer to kvm structure
+ * @pgd:	pointer to pgd entry
+ * @addr:	range start address
+ * @end:	range end address
+ *
+ * While walking the PUD range huge PUD pages are ignored.
+ */
+static void stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
+			   phys_addr_t addr, phys_addr_t end)
+{
+	pud_t *pud;
+	phys_addr_t next;
+
+	pud = pud_offset(pgd, addr);
+	do {
+		next = kvm_pud_addr_end(addr, end);
+		if (!pud_none(*pud)) {
+			/* TODO: PUD not supported, revisit later if supported */
+			BUG_ON(kvm_pud_huge(*pud));
+			stage2_wp_pmds(pud, addr, next);
+		}
+	} while (pud++, addr = next, addr != end);
+}
+
+/**
+ * stage2_wp_range() - write protect stage2 memory region range
+ * @kvm:	The KVM pointer
+ * @addr:	Start address of range
+ * @end:	End address of range
+ */
+static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
+{
+	pgd_t *pgd;
+	phys_addr_t next;
+
+	pgd = kvm->arch.pgd + pgd_index(addr);
+	do {
+		/*
+		 * Release kvm_mmu_lock periodically if the memory region is
+		 * large. Otherwise, we may see kernel panics with
+		 * CONFIG_DETECT_HUNG_TASK, CONFIG_LOCKUP_DETECTOR,
+		 * CONFIG_LOCKDEP. Additionally, holding the lock too long
+		 * will also starve other vCPUs.
+		 */
+		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
+			cond_resched_lock(&kvm->mmu_lock);
+
+		next = kvm_pgd_addr_end(addr, end);
+		if (pgd_present(*pgd))
+			stage2_wp_puds(kvm, pgd, addr, next);
+	} while (pgd++, addr = next, addr != end);
+}
+
+/**
+ * kvm_mmu_wp_memory_region() - write protect stage 2 entries for memory slot
+ * @kvm:	The KVM pointer
+ * @slot:	The memory slot to write protect
+ *
+ * Called to start logging dirty pages after memory region
+ * KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
+ * all present PMDs and PTEs are write protected in the memory region.
+ * Afterwards the dirty page log can be read.
+ *
+ * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
+ * serializing operations for VM memory regions.
+ */
+
+void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
+{
+	struct kvm_memory_slot *memslot = id_to_memslot(kvm->memslots, slot);
+	phys_addr_t start = memslot->base_gfn << PAGE_SHIFT;
+	phys_addr_t end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;
+
+	spin_lock(&kvm->mmu_lock);
+	stage2_wp_range(kvm, start, end);
+	spin_unlock(&kvm->mmu_lock);
+	kvm_flush_remote_tlbs(kvm);
+}
+#endif
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot,
 			  unsigned long fault_status)
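The stage2_wp_*() walkers above all share one idiom: each level advances
addr to the end of the region covered by the current table entry, clamped so
it never steps past end. The standalone demo below (plain integers instead
of page-table types) illustrates that idiom only; the 1 GiB "PGD" size and
the clamp expression are assumptions modelled on the open-coded
kvm_pgd_addr_end() referenced in the kvm_mmu.h hunk, not kernel code.

#include <stdint.h>
#include <stdio.h>

#define DEMO_PGDIR_SIZE	((uint64_t)1 << 30)	/* made-up entry coverage */
#define DEMO_PGDIR_MASK	(~(DEMO_PGDIR_SIZE - 1))

static uint64_t demo_pgd_addr_end(uint64_t addr, uint64_t end)
{
	uint64_t boundary = (addr + DEMO_PGDIR_SIZE) & DEMO_PGDIR_MASK;

	/* The '- 1' keeps the comparison safe if 'end' wraps to 0 at the
	 * top of a 64-bit address space. */
	return (boundary - 1 < end - 1) ? boundary : end;
}

int main(void)
{
	uint64_t addr = 0x3fe00000;	/* just below a 1 GiB boundary */
	uint64_t end  = 0x80000000;	/* write protect up to here */
	uint64_t next;

	do {
		next = demo_pgd_addr_end(addr, end);
		printf("visit entry covering [%#llx, %#llx)\n",
		       (unsigned long long)addr, (unsigned long long)next);
		/* a real walker descends to the next table level here */
	} while (addr = next, addr != end);

	return 0;
}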