From patchwork Thu Feb 6 03:08:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leonardo Bras X-Patchwork-Id: 11367595 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D0C5E14E3 for ; Thu, 6 Feb 2020 03:12:38 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9A2C820720 for ; Thu, 6 Feb 2020 03:12:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9A2C820720 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.ibm.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id CB2166B0003; Wed, 5 Feb 2020 22:12:37 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id C88F96B0006; Wed, 5 Feb 2020 22:12:37 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B77C86B000A; Wed, 5 Feb 2020 22:12:37 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0169.hostedemail.com [216.40.44.169]) by kanga.kvack.org (Postfix) with ESMTP id 9FF6B6B0003 for ; Wed, 5 Feb 2020 22:12:37 -0500 (EST) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 51038180AD804 for ; Thu, 6 Feb 2020 03:12:37 +0000 (UTC) X-FDA: 76458229554.20.boys91_6e1fc4999d602 X-Spam-Summary: 2,0,0,333198fbfc58649c,d41d8cd98f00b204,leonardo@linux.ibm.com,:benh@kernel.crashing.org:paulus@samba.org:mpe@ellerman.id.au:arnd@arndb.de:akpm@linux-foundation.org:aneesh.kumar@linux.ibm.com:npiggin@gmail.com:christophe.leroy@c-s.fr:steven.price@arm.com:robin.murphy@arm.com:leonardo@linux.ibm.com:mahesh@linux.vnet.ibm.com:bsingharora@gmail.com:arbab@linux.ibm.com:tglx@linutronix.de:allison@lohutok.net:gregkh@linuxfoundation.org:rppt@linux.ibm.com:msuchanek@suse.de:linuxppc-dev@lists.ozlabs.org:linux-kernel@vger.kernel.org:kvm-ppc@vger.kernel.org:linux-arch@vger.kernel.org:,RULES_HIT:1:2:41:69:355:379:541:800:960:968:973:988:989:1260:1261:1311:1314:1345:1359:1437:1515:1605:1730:1747:1777:1792:2198:2199:2393:2559:2562:2895:2897:3138:3139:3140:3141:3142:3865:3866:3867:3868:3870:3871:3872:3874:4049:4250:4321:5007:6119:6261:6737:7903:8603:8634:10004:11026:11658:11914:12043:12048:12291:12296:12297:12438:12555:12895:12986:13161:13229:13255:13894:14096:14394:21063:21 080:2132 X-HE-Tag: boys91_6e1fc4999d602 X-Filterd-Recvd-Size: 10375 Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by imf35.hostedemail.com (Postfix) with ESMTP for ; Thu, 6 Feb 2020 03:12:36 +0000 (UTC) Received: from pps.filterd (m0098413.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 01639jn8087190; Wed, 5 Feb 2020 22:12:26 -0500 Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 2xyhmhvk17-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 05 Feb 2020 22:12:26 -0500 Received: from m0098413.ppops.net (m0098413.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 0163AKjb088861; Wed, 5 Feb 2020 22:12:25 -0500 Received: from ppma03dal.us.ibm.com (b.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.11]) by mx0b-001b2d01.pphosted.com with ESMTP id 2xyhmhvk02-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 05 Feb 2020 22:12:25 -0500 Received: from pps.filterd (ppma03dal.us.ibm.com [127.0.0.1]) by ppma03dal.us.ibm.com (8.16.0.27/8.16.0.27) with SMTP id 0163BlgC028884; Thu, 6 Feb 2020 03:12:24 GMT Received: from b03cxnp08028.gho.boulder.ibm.com (b03cxnp08028.gho.boulder.ibm.com [9.17.130.20]) by ppma03dal.us.ibm.com with ESMTP id 2xykc9m8n0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 06 Feb 2020 03:12:24 +0000 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp08028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 0163CMBT65274112 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 6 Feb 2020 03:12:22 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A76D4BE04F; Thu, 6 Feb 2020 03:12:22 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C9E25BE054; Thu, 6 Feb 2020 03:11:52 +0000 (GMT) Received: from LeoBras.aus.stglabs.ibm.com (unknown [9.85.163.250]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 6 Feb 2020 03:11:52 +0000 (GMT) From: Leonardo Bras To: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Arnd Bergmann , Andrew Morton , "Aneesh Kumar K.V" , Nicholas Piggin , Christophe Leroy , Steven Price , Robin Murphy , Leonardo Bras , Mahesh Salgaonkar , Balbir Singh , Reza Arbab , Thomas Gleixner , Allison Randal , Greg Kroah-Hartman , Mike Rapoport , Michal Suchanek Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v6 08/11] powerpc/kvm/book3s_hv: Use functions to track lockless pgtbl walks Date: Thu, 6 Feb 2020 00:08:57 -0300 Message-Id: <20200206030900.147032-9-leonardo@linux.ibm.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200206030900.147032-1-leonardo@linux.ibm.com> References: <20200206030900.147032-1-leonardo@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.138,18.0.572 definitions=2020-02-05_06:2020-02-04,2020-02-05 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 phishscore=0 clxscore=1015 impostorscore=0 malwarescore=0 priorityscore=1501 adultscore=0 mlxlogscore=999 suspectscore=0 bulkscore=0 lowpriorityscore=0 spamscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2001150001 definitions=main-2002060022 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Applies the new functions for tracking all book3s_hv related functions that do lockless pagetable walks. Adds comments explaining that some lockless pagetable walks don't need protection due to guest pgd not being a target of THP collapse/split, or due to being called from Realmode + MSR_EE = 0 kvmppc_do_h_enter: Fixes where local_irq_restore() must be placed (after the last usage of ptep). Given that some of these functions can be called in real mode, and others always are, we use __{begin,end}_lockless_pgtbl_walk so we can decide when to disable interrupts. Signed-off-by: Leonardo Bras --- arch/powerpc/kvm/book3s_hv_nested.c | 22 ++++++++++++++++++++-- arch/powerpc/kvm/book3s_hv_rm_mmu.c | 28 ++++++++++++++++++---------- 2 files changed, 38 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c index dc97e5be76f6..a398061d5778 100644 --- a/arch/powerpc/kvm/book3s_hv_nested.c +++ b/arch/powerpc/kvm/book3s_hv_nested.c @@ -803,7 +803,11 @@ static void kvmhv_update_nest_rmap_rc(struct kvm *kvm, u64 n_rmap, if (!gp) return; - /* Find the pte */ + /* Find the pte: + * We are walking the nested guest (partition-scoped) page table here. + * We can do this without disabling irq because the Linux MM + * subsystem doesn't do THP splits and collapses on this tree. + */ ptep = __find_linux_pte(gp->shadow_pgtable, gpa, NULL, &shift); /* * If the pte is present and the pfn is still the same, update the pte. @@ -853,7 +857,11 @@ static void kvmhv_remove_nest_rmap(struct kvm *kvm, u64 n_rmap, if (!gp) return; - /* Find and invalidate the pte */ + /* Find and invalidate the pte: + * We are walking the nested guest (partition-scoped) page table here. + * We can do this without disabling irq because the Linux MM + * subsystem doesn't do THP splits and collapses on this tree. + */ ptep = __find_linux_pte(gp->shadow_pgtable, gpa, NULL, &shift); /* Don't spuriously invalidate ptes if the pfn has changed */ if (ptep && pte_present(*ptep) && ((pte_val(*ptep) & mask) == hpa)) @@ -921,6 +929,11 @@ static bool kvmhv_invalidate_shadow_pte(struct kvm_vcpu *vcpu, int shift; spin_lock(&kvm->mmu_lock); + /* + * We are walking the nested guest (partition-scoped) page table here. + * We can do this without disabling irq because the Linux MM + * subsystem doesn't do THP splits and collapses on this tree. + */ ptep = __find_linux_pte(gp->shadow_pgtable, gpa, NULL, &shift); if (!shift) shift = PAGE_SHIFT; @@ -1362,6 +1375,11 @@ static long int __kvmhv_nested_page_fault(struct kvm_run *run, /* See if can find translation in our partition scoped tables for L1 */ pte = __pte(0); spin_lock(&kvm->mmu_lock); + /* + * We are walking the secondary (partition-scoped) page table here. + * We can do this without disabling irq because the Linux MM + * subsystem doesn't do THP splits and collapses on this tree. + */ pte_p = __find_linux_pte(kvm->arch.pgtable, gpa, NULL, &shift); if (!shift) shift = PAGE_SHIFT; diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c index 220305454c23..fd4d8f174f09 100644 --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c @@ -210,7 +210,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, pte_t *ptep; unsigned int writing; unsigned long mmu_seq; - unsigned long rcbits, irq_flags = 0; + unsigned long rcbits, irq_mask = 0; if (kvm_is_radix(kvm)) return H_FUNCTION; @@ -252,8 +252,8 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, * If we had a page table table change after lookup, we would * retry via mmu_notifier_retry. */ - if (!realmode) - local_irq_save(irq_flags); + irq_mask = __begin_lockless_pgtbl_walk(!realmode); + /* * If called in real mode we have MSR_EE = 0. Otherwise * we disable irq above. @@ -272,8 +272,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, * to <= host page size, if host is using hugepage */ if (host_pte_size < psize) { - if (!realmode) - local_irq_restore(flags); + __end_lockless_pgtbl_walk(irq_mask, !realmode); return H_PARAMETER; } pte = kvmppc_read_update_linux_pte(ptep, writing); @@ -287,8 +286,6 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, pa |= gpa & ~PAGE_MASK; } } - if (!realmode) - local_irq_restore(irq_flags); ptel &= HPTE_R_KEY | HPTE_R_PP0 | (psize-1); ptel |= pa; @@ -302,8 +299,10 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, /*If we had host pte mapping then Check WIMG */ if (ptep && !hpte_cache_flags_ok(ptel, is_ci)) { - if (is_ci) + if (is_ci) { + __end_lockless_pgtbl_walk(irq_mask, !realmode); return H_PARAMETER; + } /* * Allow guest to map emulated device memory as * uncacheable, but actually make it cacheable. @@ -311,6 +310,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags, ptel &= ~(HPTE_R_W|HPTE_R_I|HPTE_R_G); ptel |= HPTE_R_M; } + __end_lockless_pgtbl_walk(irq_mask, !realmode); /* Find and lock the HPTEG slot to use */ do_insert: @@ -907,11 +907,19 @@ static int kvmppc_get_hpa(struct kvm_vcpu *vcpu, unsigned long gpa, /* Translate to host virtual address */ hva = __gfn_to_hva_memslot(memslot, gfn); - /* Try to find the host pte for that virtual address */ + /* Try to find the host pte for that virtual address : + * Called by hcall_real_table (real mode + MSR_EE=0) + * Interrupts are disabled here. + */ + __begin_lockless_pgtbl_walk(false); ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift); - if (!ptep) + if (!ptep) { + __end_lockless_pgtbl_walk(0, false); return H_TOO_HARD; + } pte = kvmppc_read_update_linux_pte(ptep, writing); + __end_lockless_pgtbl_walk(0, false); + if (!pte_present(pte)) return H_TOO_HARD;