From patchwork Wed Feb 5 15:09:41 2025
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13961279
From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas, Will Deacon, Muchun Song, Pasha Tatashin,
    Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Mark Rutland,
    Ard Biesheuvel, Anshuman Khandual, Dev Jain, Alexandre Ghiti,
    Steve Capper, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Huacai Chen, Thomas Bogendoerfer,
    "James E.J. Bottomley", Helge Deller, Madhavan Srinivasan,
    Michael Ellerman, Paul Walmsley, Palmer Dabbelt, Albert Ou,
    Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Gerald Schaefer,
    "David S. Miller", Andreas Larsson, stable@vger.kernel.org
Subject: [PATCH v1 01/16] mm: hugetlb: Add huge page size param to huge_ptep_get_and_clear()
Date: Wed, 5 Feb 2025 15:09:41 +0000
Message-ID: <20250205151003.88959-2-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
References: <20250205151003.88959-1-ryan.roberts@arm.com>

In order to fix a bug, arm64 needs to be told the size of the huge page
for which the huge_pte is being cleared in huge_ptep_get_and_clear().
Provide for this by adding an `unsigned long sz` parameter to the
function. This follows the same pattern as huge_pte_clear() and
set_huge_pte_at().

This commit makes the required interface modifications to the core mm
as well as all arches that implement this function (arm64, loongarch,
mips, parisc, powerpc, riscv, s390, sparc). The actual arm64 bug will
be fixed in a separate commit.
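As a minimal sketch of the interface change (reduced from the diff
below; the generic fallback simply accepts and ignores the new
parameter, and only arm64 will consume it in the next patch):

	/* Sketch of the generic fallback: sz is accepted but unused here. */
	static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
			unsigned long addr, pte_t *ptep, unsigned long sz)
	{
		return ptep_get_and_clear(mm, addr, ptep);
	}

	/* Core-mm callers now pass the huge page size explicitly, e.g.: */
	pte = huge_ptep_get_and_clear(mm, address, ptep, sz);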
Cc: <stable@vger.kernel.org>
Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/hugetlb.h     |  4 ++--
 arch/arm64/mm/hugetlbpage.c          |  8 +++++---
 arch/loongarch/include/asm/hugetlb.h |  6 ++++--
 arch/mips/include/asm/hugetlb.h      |  6 ++++--
 arch/parisc/include/asm/hugetlb.h    |  2 +-
 arch/parisc/mm/hugetlbpage.c         |  2 +-
 arch/powerpc/include/asm/hugetlb.h   |  6 ++++--
 arch/riscv/include/asm/hugetlb.h     |  3 ++-
 arch/riscv/mm/hugetlbpage.c          |  2 +-
 arch/s390/include/asm/hugetlb.h      | 12 ++++++++----
 arch/s390/mm/hugetlbpage.c           | 10 ++++++++--
 arch/sparc/include/asm/hugetlb.h     |  2 +-
 arch/sparc/mm/hugetlbpage.c          |  2 +-
 include/asm-generic/hugetlb.h        |  2 +-
 include/linux/hugetlb.h              |  4 +++-
 mm/hugetlb.c                         |  4 ++--
 16 files changed, 48 insertions(+), 27 deletions(-)

--
2.43.0

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index c6dff3e69539..03db9cb21ace 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -42,8 +42,8 @@ extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 				      unsigned long addr, pte_t *ptep,
 				      pte_t pte, int dirty);
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
-extern pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
-				     unsigned long addr, pte_t *ptep);
+extern pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
+				     pte_t *ptep, unsigned long sz);
 #define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
 extern void huge_ptep_set_wrprotect(struct mm_struct *mm,
 				    unsigned long addr, pte_t *ptep);
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 98a2a0e64e25..06db4649af91 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -396,8 +396,8 @@ void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 		__pte_clear(mm, addr, ptep);
 }
 
-pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
-			      unsigned long addr, pte_t *ptep)
+pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
+			      pte_t *ptep, unsigned long sz)
 {
 	int ncontig;
 	size_t pgsize;
@@ -549,6 +549,8 @@ bool __init arch_hugetlb_valid_size(unsigned long size)
 pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
 {
+	unsigned long psize = huge_page_size(hstate_vma(vma));
+
 	if (alternative_has_cap_unlikely(ARM64_WORKAROUND_2645198)) {
 		/*
 		 * Break-before-make (BBM) is required for all user space mappings
@@ -558,7 +560,7 @@ pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr
 		if (pte_user_exec(__ptep_get(ptep)))
 			return huge_ptep_clear_flush(vma, addr, ptep);
 	}
-	return huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
+	return huge_ptep_get_and_clear(vma->vm_mm, addr, ptep, psize);
 }
 
 void huge_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep,
diff --git a/arch/loongarch/include/asm/hugetlb.h b/arch/loongarch/include/asm/hugetlb.h
index c8e4057734d0..4dc4b3e04225 100644
--- a/arch/loongarch/include/asm/hugetlb.h
+++ b/arch/loongarch/include/asm/hugetlb.h
@@ -36,7 +36,8 @@ static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
-					    unsigned long addr, pte_t *ptep)
+					    unsigned long addr, pte_t *ptep,
+					    unsigned long sz)
 {
 	pte_t clear;
 	pte_t pte = ptep_get(ptep);
@@ -51,8 +52,9 @@ static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 					  unsigned long addr, pte_t *ptep)
 {
 	pte_t pte;
+	unsigned long sz = huge_page_size(hstate_vma(vma));
 
-	pte = huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
+	pte = huge_ptep_get_and_clear(vma->vm_mm, addr, ptep, sz);
 	flush_tlb_page(vma, addr);
 	return pte;
 }
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index d0a86ce83de9..fbc71ddcf0f6 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -27,7 +27,8 @@ static inline int prepare_hugepage_range(struct file *file,
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
-					    unsigned long addr, pte_t *ptep)
+					    unsigned long addr, pte_t *ptep,
+					    unsigned long sz)
 {
 	pte_t clear;
 	pte_t pte = *ptep;
@@ -42,13 +43,14 @@ static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 					  unsigned long addr, pte_t *ptep)
 {
 	pte_t pte;
+	unsigned long sz = huge_page_size(hstate_vma(vma));
 
 	/*
 	 * clear the huge pte entry firstly, so that the other smp threads will
 	 * not get old pte entry after finishing flush_tlb_page and before
 	 * setting new huge pte entry
 	 */
-	pte = huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
+	pte = huge_ptep_get_and_clear(vma->vm_mm, addr, ptep, sz);
 	flush_tlb_page(vma, addr);
 	return pte;
 }
diff --git a/arch/parisc/include/asm/hugetlb.h b/arch/parisc/include/asm/hugetlb.h
index 5b3a5429f71b..21e9ace17739 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -10,7 +10,7 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
-			      pte_t *ptep);
+			      pte_t *ptep, unsigned long sz);
 
 #define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
 static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
diff --git a/arch/parisc/mm/hugetlbpage.c b/arch/parisc/mm/hugetlbpage.c
index e9d18cf25b79..a94fe546d434 100644
--- a/arch/parisc/mm/hugetlbpage.c
+++ b/arch/parisc/mm/hugetlbpage.c
@@ -126,7 +126,7 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
-			      pte_t *ptep)
+			      pte_t *ptep, unsigned long sz)
 {
 	pte_t entry;
 
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index dad2e7980f24..86326587e58d 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -45,7 +45,8 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
-					    unsigned long addr, pte_t *ptep)
+					    unsigned long addr, pte_t *ptep,
+					    unsigned long sz)
 {
 	return __pte(pte_update(mm, addr, ptep, ~0UL, 0, 1));
 }
@@ -55,8 +56,9 @@ static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 					  unsigned long addr, pte_t *ptep)
 {
 	pte_t pte;
+	unsigned long sz = huge_page_size(hstate_vma(vma));
 
-	pte = huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
+	pte = huge_ptep_get_and_clear(vma->vm_mm, addr, ptep, sz);
 	flush_hugetlb_page(vma, addr);
 	return pte;
 }
diff --git a/arch/riscv/include/asm/hugetlb.h b/arch/riscv/include/asm/hugetlb.h
index faf3624d8057..446126497768 100644
--- a/arch/riscv/include/asm/hugetlb.h
+++ b/arch/riscv/include/asm/hugetlb.h
@@ -28,7 +28,8 @@ void set_huge_pte_at(struct mm_struct *mm,
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
-			      unsigned long addr, pte_t *ptep);
+			      unsigned long addr, pte_t *ptep,
+			      unsigned long sz);
 
 #define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
 pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index 42314f093922..b4a78a4b35cf 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -293,7 +293,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 			      unsigned long addr,
-			      pte_t *ptep)
+			      pte_t *ptep, unsigned long sz)
 {
 	pte_t orig_pte = ptep_get(ptep);
 	int pte_num;
diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h
index 7c52acaf9f82..420c74306779 100644
--- a/arch/s390/include/asm/hugetlb.h
+++ b/arch/s390/include/asm/hugetlb.h
@@ -26,7 +26,11 @@ void __set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep);
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
-pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep);
+pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
+			      unsigned long addr, pte_t *ptep,
+			      unsigned long sz);
+pte_t __huge_ptep_get_and_clear(struct mm_struct *mm,
+				unsigned long addr, pte_t *ptep);
 
 static inline void arch_clear_hugetlb_flags(struct folio *folio)
 {
@@ -48,7 +52,7 @@ static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 					  unsigned long address, pte_t *ptep)
 {
-	return huge_ptep_get_and_clear(vma->vm_mm, address, ptep);
+	return __huge_ptep_get_and_clear(vma->vm_mm, address, ptep);
 }
 
 #define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
@@ -59,7 +63,7 @@ static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	int changed = !pte_same(huge_ptep_get(vma->vm_mm, addr, ptep), pte);
 
 	if (changed) {
-		huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
+		__huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
 		__set_huge_pte_at(vma->vm_mm, addr, ptep, pte);
 	}
 	return changed;
@@ -69,7 +73,7 @@ static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 					   unsigned long addr, pte_t *ptep)
 {
-	pte_t pte = huge_ptep_get_and_clear(mm, addr, ptep);
+	pte_t pte = __huge_ptep_get_and_clear(mm, addr, ptep);
 
 	__set_huge_pte_at(mm, addr, ptep, pte_wrprotect(pte));
 }
diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
index d9ce199953de..52ee8e854195 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -188,8 +188,8 @@ pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 	return __rste_to_pte(pte_val(*ptep));
 }
 
-pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
-			      unsigned long addr, pte_t *ptep)
+pte_t __huge_ptep_get_and_clear(struct mm_struct *mm,
+				unsigned long addr, pte_t *ptep)
 {
 	pte_t pte = huge_ptep_get(mm, addr, ptep);
 	pmd_t *pmdp = (pmd_t *) ptep;
@@ -202,6 +202,12 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 	return pte;
 }
 
+pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
+			      unsigned long addr, pte_t *ptep, unsigned long sz)
+{
+	return __huge_ptep_get_and_clear(mm, addr, ptep);
+}
+
 pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
 		      unsigned long addr, unsigned long sz)
 {
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index c714ca6a05aa..e7a9cdd498dc 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -20,7 +20,7 @@ void __set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 
 #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
-			      pte_t *ptep);
+			      pte_t *ptep, unsigned long sz);
 
 #define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
 static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index eee601a0d2cf..80504148d8a5 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -260,7 +260,7 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 }
 
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
-			      pte_t *ptep)
+			      pte_t *ptep, unsigned long sz)
 {
 	unsigned int i, nptes, orig_shift, shift;
 	unsigned long size;
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index f42133dae68e..2afc95bf1655 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -90,7 +90,7 @@ static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 
 #ifndef __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
 static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
-		unsigned long addr, pte_t *ptep)
+		unsigned long addr, pte_t *ptep, unsigned long sz)
 {
 	return ptep_get_and_clear(mm, addr, ptep);
 }
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec8c0ccc8f95..bf5f7256bd28 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1004,7 +1004,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
 static inline pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma,
 						unsigned long addr, pte_t *ptep)
 {
-	return huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
+	unsigned long psize = huge_page_size(hstate_vma(vma));
+
+	return huge_ptep_get_and_clear(vma->vm_mm, addr, ptep, psize);
 }
 #endif
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 65068671e460..de9d49e521c1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5447,7 +5447,7 @@ static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
 	if (src_ptl != dst_ptl)
 		spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
 
-	pte = huge_ptep_get_and_clear(mm, old_addr, src_pte);
+	pte = huge_ptep_get_and_clear(mm, old_addr, src_pte, sz);
 
 	if (need_clear_uffd_wp && pte_marker_uffd_wp(pte))
 		huge_pte_clear(mm, new_addr, dst_pte, sz);
@@ -5622,7 +5622,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 			set_vma_resv_flags(vma, HPAGE_RESV_UNMAPPED);
 		}
 
-		pte = huge_ptep_get_and_clear(mm, address, ptep);
+		pte = huge_ptep_get_and_clear(mm, address, ptep, sz);
 		tlb_remove_huge_tlb_entry(h, tlb, ptep, address);
 		if (huge_pte_dirty(pte))
 			set_page_dirty(page);

From patchwork Wed Feb 5 15:09:42 2025
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13961280
From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas, Will Deacon, Muchun Song, Pasha Tatashin,
    Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Mark Rutland,
    Ard Biesheuvel, Anshuman Khandual, Dev Jain, Alexandre Ghiti,
    Steve Capper, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH v1 02/16] arm64: hugetlb: Fix huge_ptep_get_and_clear() for non-present ptes
Date: Wed, 5 Feb 2025 15:09:42 +0000
Message-ID: <20250205151003.88959-3-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
References: <20250205151003.88959-1-ryan.roberts@arm.com>

arm64 supports multiple huge_pte sizes. Some of the sizes are covered
by a single pte entry at a particular level (PMD_SIZE, PUD_SIZE), and
some are covered by multiple ptes at a particular level (CONT_PTE_SIZE,
CONT_PMD_SIZE). So the function has to figure out the size from the
huge_pte pointer. This was previously done by walking the pgtable to
determine the level, then using the PTE_CONT bit to determine the
number of ptes.

But the PTE_CONT bit is only valid when the pte is present. For
non-present pte values (e.g. markers, migration entries), the previous
implementation was therefore erroneously determining the size. There is
at least one known caller in core-mm, move_huge_pte(), which may call
huge_ptep_get_and_clear() for a non-present pte. So we must be robust
to this case. Additionally the "regular" ptep_get_and_clear() is robust
to being called for non-present ptes, so it makes sense to follow that
behaviour.

Fix this by using the new sz parameter which is now provided to the
function. Additionally, when clearing each pte in a contig range, don't
gather the access and dirty bits if the pte is not present.

An alternative approach that would not require API changes would be to
store the PTE_CONT bit in a spare bit in the swap entry pte. But it
felt cleaner to follow other APIs' lead and just pass in the size.

While we are at it, add some debug warnings in functions that require
the pte is present.

As an aside, PTE_CONT is bit 52, which corresponds to bit 40 in the
swap entry offset field (layout of non-present pte). Since hugetlb is
never swapped to disk, this field will only be populated for markers,
which always set this bit to 0, and hwpoison swap entries, which set
the offset field to a PFN. So it would only ever be 1 for a 52-bit PVA
system where memory in that high half was poisoned (I think!). So in
practice, this bit would almost always be zero for non-present ptes and
we would only clear the first entry if it was actually a contiguous
block. That's probably a less severe symptom than if it was always
interpreted as 1 and cleared out potentially-present neighboring PTEs.
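The fix hinges on the fact that the pte count and per-entry page size
can be derived purely from sz, without reading the (possibly
non-present) pte. A simplified sketch of that mapping, along the lines
of arm64's num_contig_ptes() used in the diff below (CONT_PTES and
CONT_PMDS are the architecture's contiguous-range entry counts):

	/* Sketch: derive (entry count, per-entry size) from the huge page size. */
	static int num_contig_ptes_sketch(unsigned long size, size_t *pgsize)
	{
		int contig_ptes = 1;	/* PMD_SIZE/PUD_SIZE: one entry */

		*pgsize = size;

		switch (size) {
		case CONT_PMD_SIZE:
			*pgsize = PMD_SIZE;	/* CONT_PMDS pmds back one huge page */
			contig_ptes = CONT_PMDS;
			break;
		case CONT_PTE_SIZE:
			*pgsize = PAGE_SIZE;	/* CONT_PTES ptes back one huge page */
			contig_ptes = CONT_PTES;
			break;
		}

		return contig_ptes;
	}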
Cc: <stable@vger.kernel.org>
Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/mm/hugetlbpage.c | 54 ++++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 06db4649af91..328eec4bfe55 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -163,24 +163,23 @@ static pte_t get_clear_contig(struct mm_struct *mm,
 			     unsigned long pgsize,
 			     unsigned long ncontig)
 {
-	pte_t orig_pte = __ptep_get(ptep);
-	unsigned long i;
-
-	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++) {
-		pte_t pte = __ptep_get_and_clear(mm, addr, ptep);
-
-		/*
-		 * If HW_AFDBM is enabled, then the HW could turn on
-		 * the dirty or accessed bit for any page in the set,
-		 * so check them all.
-		 */
-		if (pte_dirty(pte))
-			orig_pte = pte_mkdirty(orig_pte);
-
-		if (pte_young(pte))
-			orig_pte = pte_mkyoung(orig_pte);
+	pte_t pte, tmp_pte;
+	bool present;
+
+	pte = __ptep_get_and_clear(mm, addr, ptep);
+	present = pte_present(pte);
+	while (--ncontig) {
+		ptep++;
+		addr += pgsize;
+		tmp_pte = __ptep_get_and_clear(mm, addr, ptep);
+		if (present) {
+			if (pte_dirty(tmp_pte))
+				pte = pte_mkdirty(pte);
+			if (pte_young(tmp_pte))
+				pte = pte_mkyoung(pte);
+		}
 	}
-	return orig_pte;
+	return pte;
 }
 
 static pte_t get_clear_contig_flush(struct mm_struct *mm,
@@ -401,13 +400,8 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 {
 	int ncontig;
 	size_t pgsize;
-	pte_t orig_pte = __ptep_get(ptep);
-
-	if (!pte_cont(orig_pte))
-		return __ptep_get_and_clear(mm, addr, ptep);
-
-	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
 
+	ncontig = num_contig_ptes(sz, &pgsize);
 	return get_clear_contig(mm, addr, ptep, pgsize, ncontig);
 }
 
@@ -451,6 +445,8 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	pgprot_t hugeprot;
 	pte_t orig_pte;
 
+	VM_WARN_ON(!pte_present(pte));
+
 	if (!pte_cont(pte))
 		return __ptep_set_access_flags(vma, addr, ptep, pte, dirty);
 
@@ -461,6 +457,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 		return 0;
 
 	orig_pte = get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
+	VM_WARN_ON(!pte_present(orig_pte));
 
 	/* Make sure we don't lose the dirty or young state */
 	if (pte_dirty(orig_pte))
@@ -485,7 +482,10 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
 	size_t pgsize;
 	pte_t pte;
 
-	if (!pte_cont(__ptep_get(ptep))) {
+	pte = __ptep_get(ptep);
+	VM_WARN_ON(!pte_present(pte));
+
+	if (!pte_cont(pte)) {
 		__ptep_set_wrprotect(mm, addr, ptep);
 		return;
 	}
@@ -509,8 +509,12 @@ pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 	struct mm_struct *mm = vma->vm_mm;
 	size_t pgsize;
 	int ncontig;
+	pte_t pte;
 
-	if (!pte_cont(__ptep_get(ptep)))
+	pte = __ptep_get(ptep);
+	VM_WARN_ON(!pte_present(pte));
+
+	if (!pte_cont(pte))
 		return ptep_clear_flush(vma, addr, ptep);
 
 	ncontig = find_num_contig(mm, addr, ptep, &pgsize);

From patchwork Wed Feb 5 15:09:43 2025
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13961281
From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas, Will Deacon, Muchun Song, Pasha Tatashin,
    Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Mark Rutland,
    Ard Biesheuvel, Anshuman Khandual, Dev Jain, Alexandre Ghiti,
    Steve Capper, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH v1 03/16] arm64: hugetlb: Fix flush_hugetlb_tlb_range() invalidation level
Date: Wed, 5 Feb 2025 15:09:43 +0000
Message-ID: <20250205151003.88959-4-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
References: <20250205151003.88959-1-ryan.roberts@arm.com>

commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
FEAT_LPA2") changed the "invalidation level unknown" hint from 0 to
TLBI_TTL_UNKNOWN (INT_MAX). But the fallback "unknown level" path in
flush_hugetlb_tlb_range() was not updated. So as it stands, when trying
to invalidate CONT_PMD_SIZE or CONT_PTE_SIZE hugetlb mappings, we will
spuriously try to invalidate at level 0 on LPA2-enabled systems.

Fix this so that the fallback passes TLBI_TTL_UNKNOWN, and while we are
at it, explicitly use the correct stride and level for CONT_PMD_SIZE
and CONT_PTE_SIZE, which should provide a minor optimization.
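To make the mapping concrete, here is the before/after hinting as a
sketch, assuming a 4K granule (so CONT_PTE_SIZE = 64K and
CONT_PMD_SIZE = 32M; the CONT_* sizes differ for other granules):

	/*
	 *   huge page size     old hint                 new hint
	 *   PUD_SIZE  (1G)     level 1                  level 1
	 *   PMD_SIZE  (2M)     level 2                  level 2
	 *   CONT_PMD_SIZE      level 0 (wrong on LPA2)  level 2, stride PMD_SIZE
	 *   CONT_PTE_SIZE      level 0 (wrong on LPA2)  level 3, stride PAGE_SIZE
	 *   anything else      level 0 (wrong on LPA2)  TLBI_TTL_UNKNOWN
	 */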
Cc: <stable@vger.kernel.org>
Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/hugetlb.h | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 03db9cb21ace..8ab9542d2d22 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -76,12 +76,20 @@ static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
 {
 	unsigned long stride = huge_page_size(hstate_vma(vma));
 
-	if (stride == PMD_SIZE)
-		__flush_tlb_range(vma, start, end, stride, false, 2);
-	else if (stride == PUD_SIZE)
-		__flush_tlb_range(vma, start, end, stride, false, 1);
-	else
-		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
+	switch (stride) {
+	case PUD_SIZE:
+		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
+		break;
+	case CONT_PMD_SIZE:
+	case PMD_SIZE:
+		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
+		break;
+	case CONT_PTE_SIZE:
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
+		break;
+	default:
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
+	}
 }
 
 #endif /* __ASM_HUGETLB_H */

From patchwork Wed Feb 5 15:09:44 2025
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13961282
From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas, Will Deacon, Muchun Song, Pasha Tatashin,
    Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Mark Rutland,
    Ard Biesheuvel, Anshuman Khandual, Dev Jain, Alexandre Ghiti,
    Steve Capper, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH v1 04/16] arm64: hugetlb: Refine tlb maintenance scope
Date: Wed, 5 Feb 2025 15:09:44 +0000
Message-ID: <20250205151003.88959-5-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
References: <20250205151003.88959-1-ryan.roberts@arm.com>

When operating on contiguous blocks of ptes (or pmds) for some hugetlb
sizes, we must honour break-before-make requirements and clear down the
block to invalid state in the pgtable then invalidate the relevant tlb
entries before making the pgtable entries valid again.

However, the tlb maintenance is currently always done assuming the
worst case stride (PAGE_SIZE), last_level (false) and tlb_level
(TLBI_TTL_UNKNOWN). We can do much better with the hinting: in reality,
we know the stride from the huge_pte pgsize, we are always operating
only on the last level, and we always know the tlb_level, again based
on pgsize. So let's start providing these hints.

Additionally, avoid tlb maintenance in set_huge_pte_at().
Break-before-make is only required if we are transitioning the
contiguous pte block from valid -> valid. So let's elide the
clear-and-flush ("break") if the pte range was previously invalid.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/hugetlb.h | 29 +++++++++++++++++++----------
 arch/arm64/mm/hugetlbpage.c      |  9 ++++++---
 2 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 8ab9542d2d22..c38f2944c20d 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -69,27 +69,36 @@ extern void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
 
 #include <asm-generic/hugetlb.h>
 
-#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
-static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
-					   unsigned long start,
-					   unsigned long end)
+static inline void __flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+					     unsigned long start,
+					     unsigned long end,
+					     unsigned long stride,
+					     bool last_level)
 {
-	unsigned long stride = huge_page_size(hstate_vma(vma));
-
 	switch (stride) {
 	case PUD_SIZE:
-		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
+		__flush_tlb_range(vma, start, end, PUD_SIZE, last_level, 1);
 		break;
 	case CONT_PMD_SIZE:
 	case PMD_SIZE:
-		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
+		__flush_tlb_range(vma, start, end, PMD_SIZE, last_level, 2);
 		break;
 	case CONT_PTE_SIZE:
-		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, last_level, 3);
 		break;
 	default:
-		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, last_level, TLBI_TTL_UNKNOWN);
 	}
 }
 
+#define __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
+static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
+					   unsigned long start,
+					   unsigned long end)
+{
+	unsigned long stride = huge_page_size(hstate_vma(vma));
+
+	__flush_hugetlb_tlb_range(vma, start, end, stride, false);
+}
+
 #endif /* __ASM_HUGETLB_H */
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 328eec4bfe55..e870d01d12ea 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -190,8 +190,9 @@ static pte_t get_clear_contig_flush(struct mm_struct *mm,
 {
 	pte_t orig_pte = get_clear_contig(mm, addr, ptep, pgsize, ncontig);
 	struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
+	unsigned long end = addr + (pgsize * ncontig);
 
-	flush_tlb_range(&vma, addr, addr + (pgsize * ncontig));
+	__flush_hugetlb_tlb_range(&vma, addr, end, pgsize, true);
 	return orig_pte;
 }
 
@@ -216,7 +217,7 @@ static void clear_flush(struct mm_struct *mm,
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
 		__ptep_get_and_clear(mm, addr, ptep);
 
-	flush_tlb_range(&vma, saddr, addr);
+	__flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
 }
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
@@ -245,7 +246,9 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 	dpfn = pgsize >> PAGE_SHIFT;
 	hugeprot = pte_pgprot(pte);
 
-	clear_flush(mm, addr, ptep, pgsize, ncontig);
+	/* Only need to "break" if transitioning valid -> valid. */
+	if (pte_valid(__ptep_get(ptep)))
+		clear_flush(mm, addr, ptep, pgsize, ncontig);
 
 	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
 		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);

From patchwork Wed Feb 5 15:09:45 2025
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13961283
From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas, Will Deacon, Muchun Song, Pasha Tatashin,
    Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Mark Rutland,
    Ard Biesheuvel, Anshuman Khandual, Dev Jain, Alexandre Ghiti,
    Steve Capper, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH v1 05/16] mm/page_table_check: Batch-check pmds/puds just like ptes
Date: Wed, 5 Feb 2025 15:09:45 +0000
Message-ID: <20250205151003.88959-6-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
References: <20250205151003.88959-1-ryan.roberts@arm.com>

Convert page_table_check_p[mu]d_set(...) to
page_table_check_p[mu]ds_set(..., nr) to allow checking a contiguous
set of pmds/puds in a single batch. We retain
page_table_check_p[mu]d_set(...) as macros that call the new batch
functions with nr=1 for compatibility.

arm64 is about to reorganise its pte/pmd/pud helpers to reuse more code
and to allow the implementation for huge_pte to more efficiently set
ptes/pmds/puds in batches. We need these batch-helpers to make the
refactoring possible.
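As a usage sketch (taken from the pattern in the diff below): existing
single-entry callers compile unchanged because the old names become
nr=1 wrappers, while a batched caller passes the real count:

	#define page_table_check_pmd_set(mm, pmdp, pmd) \
		page_table_check_pmds_set(mm, pmdp, pmd, 1)

	/* e.g. a caller setting nr contiguous pmds in one go: */
	page_table_check_pmds_set(mm, pmdp, pmd, nr);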
Signed-off-by: Ryan Roberts --- include/linux/page_table_check.h | 30 +++++++++++++++++----------- mm/page_table_check.c | 34 +++++++++++++++++++------------- 2 files changed, 38 insertions(+), 26 deletions(-) diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h index 6722941c7cb8..289620d4aad3 100644 --- a/include/linux/page_table_check.h +++ b/include/linux/page_table_check.h @@ -19,8 +19,10 @@ void __page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd); void __page_table_check_pud_clear(struct mm_struct *mm, pud_t pud); void __page_table_check_ptes_set(struct mm_struct *mm, pte_t *ptep, pte_t pte, unsigned int nr); -void __page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd); -void __page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp, pud_t pud); +void __page_table_check_pmds_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd, + unsigned int nr); +void __page_table_check_puds_set(struct mm_struct *mm, pud_t *pudp, pud_t pud, + unsigned int nr); void __page_table_check_pte_clear_range(struct mm_struct *mm, unsigned long addr, pmd_t pmd); @@ -74,22 +76,22 @@ static inline void page_table_check_ptes_set(struct mm_struct *mm, __page_table_check_ptes_set(mm, ptep, pte, nr); } -static inline void page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp, - pmd_t pmd) +static inline void page_table_check_pmds_set(struct mm_struct *mm, + pmd_t *pmdp, pmd_t pmd, unsigned int nr) { if (static_branch_likely(&page_table_check_disabled)) return; - __page_table_check_pmd_set(mm, pmdp, pmd); + __page_table_check_pmds_set(mm, pmdp, pmd, nr); } -static inline void page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp, - pud_t pud) +static inline void page_table_check_puds_set(struct mm_struct *mm, + pud_t *pudp, pud_t pud, unsigned int nr) { if (static_branch_likely(&page_table_check_disabled)) return; - __page_table_check_pud_set(mm, pudp, pud); + __page_table_check_puds_set(mm, pudp, pud, nr); } static inline void page_table_check_pte_clear_range(struct mm_struct *mm, @@ -129,13 +131,13 @@ static inline void page_table_check_ptes_set(struct mm_struct *mm, { } -static inline void page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp, - pmd_t pmd) +static inline void page_table_check_pmds_set(struct mm_struct *mm, + pmd_t *pmdp, pmd_t pmd, unsigned int nr) { } -static inline void page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp, - pud_t pud) +static inline void page_table_check_puds_set(struct mm_struct *mm, + pud_t *pudp, pud_t pud, unsigned int nr) { } @@ -146,4 +148,8 @@ static inline void page_table_check_pte_clear_range(struct mm_struct *mm, } #endif /* CONFIG_PAGE_TABLE_CHECK */ + +#define page_table_check_pmd_set(mm, pmdp, pmd) page_table_check_pmds_set(mm, pmdp, pmd, 1) +#define page_table_check_pud_set(mm, pudp, pud) page_table_check_puds_set(mm, pudp, pud, 1) + #endif /* __LINUX_PAGE_TABLE_CHECK_H */ diff --git a/mm/page_table_check.c b/mm/page_table_check.c index 509c6ef8de40..dae4a7d776b3 100644 --- a/mm/page_table_check.c +++ b/mm/page_table_check.c @@ -234,33 +234,39 @@ static inline void page_table_check_pmd_flags(pmd_t pmd) WARN_ON_ONCE(swap_cached_writable(pmd_to_swp_entry(pmd))); } -void __page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd) +void __page_table_check_pmds_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd, + unsigned int nr) { + unsigned int i; + unsigned long stride = PMD_SIZE >> PAGE_SHIFT; + if (&init_mm == mm) return; page_table_check_pmd_flags(pmd); - 
__page_table_check_pmd_clear(mm, *pmdp); - if (pmd_user_accessible_page(pmd)) { - page_table_check_set(pmd_pfn(pmd), PMD_SIZE >> PAGE_SHIFT, - pmd_write(pmd)); - } + for (i = 0; i < nr; i++) + __page_table_check_pmd_clear(mm, *(pmdp + i)); + if (pmd_user_accessible_page(pmd)) + page_table_check_set(pmd_pfn(pmd), stride * nr, pmd_write(pmd)); } -EXPORT_SYMBOL(__page_table_check_pmd_set); +EXPORT_SYMBOL(__page_table_check_pmds_set); -void __page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp, pud_t pud) +void __page_table_check_puds_set(struct mm_struct *mm, pud_t *pudp, pud_t pud, + unsigned int nr) { + unsigned int i; + unsigned long stride = PUD_SIZE >> PAGE_SHIFT; + if (&init_mm == mm) return; - __page_table_check_pud_clear(mm, *pudp); - if (pud_user_accessible_page(pud)) { - page_table_check_set(pud_pfn(pud), PUD_SIZE >> PAGE_SHIFT, - pud_write(pud)); - } + for (i = 0; i < nr; i++) + __page_table_check_pud_clear(mm, *(pudp + i)); + if (pud_user_accessible_page(pud)) + page_table_check_set(pud_pfn(pud), stride * nr, pud_write(pud)); } -EXPORT_SYMBOL(__page_table_check_pud_set); +EXPORT_SYMBOL(__page_table_check_puds_set); void __page_table_check_pte_clear_range(struct mm_struct *mm, unsigned long addr, From patchwork Wed Feb 5 15:09:46 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13961284 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB32FC02194 for ; Wed, 5 Feb 2025 15:10:47 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 30CB8280019; Wed, 5 Feb 2025 10:10:47 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 26A53280001; Wed, 5 Feb 2025 10:10:47 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0E41D280019; Wed, 5 Feb 2025 10:10:47 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id E2EC1280001 for ; Wed, 5 Feb 2025 10:10:46 -0500 (EST) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 86F74C01AB for ; Wed, 5 Feb 2025 15:10:46 +0000 (UTC) X-FDA: 83086228092.19.C97202C Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf25.hostedemail.com (Postfix) with ESMTP id CDB24A0012 for ; Wed, 5 Feb 2025 15:10:44 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf25.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1738768244; a=rsa-sha256; cv=none; b=M11QGCz9tDOWg5yA3jRxByIC2IQwo0xYCHxbeb5k5woYfM6WCcGsHnaCy+yjo+vSt74pQ2 2+cLoRhRsjxVAFQcLp99///+jIgQbePqW8rKTSpGDwvNzeP58BgbrFU1qTwMwxGps8PUEi dGyE7BltcHySo2ZJhQtdYXOOFHlzB3g= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=arm.com; spf=pass (imf25.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1738768244; 
From patchwork Wed Feb 5 15:09:46 2025
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v1 06/16] arm64/mm: Refactor __set_ptes() and
 __ptep_get_and_clear()
Date: Wed, 5 Feb 2025 15:09:46 +0000
Message-ID: <20250205151003.88959-7-ryan.roberts@arm.com>
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>

Refactor __set_ptes(), set_pmd_at() and set_pud_at() so that they are all a
thin wrapper around a generic ___set_ptes(), which takes a pgsize parameter.
This cleans up the code to remove the confusing __set_pte_at() (which was
only ever used for pmd/pud) and will allow us to perform future barrier
optimizations in a single place.
Additionally, it will permit the huge_pte API to efficiently batch-set
pgtable entries and take advantage of the future barrier optimizations.

___set_ptes() calls the correct page_table_check_*_set() function based on
the pgsize. This means that huge_ptes will be able to get proper coverage
regardless of their size, once the huge_pte API is plumbed in. Currently the
huge_pte API always uses the pte API, which assumes an entry only covers a
single page.

While we are at it, refactor __ptep_get_and_clear() and
pmdp_huge_get_and_clear() to use a common ___ptep_get_and_clear() which also
takes a pgsize parameter. This will give the huge_pte API the means to clear
ptes in a way that corresponds to how they were set.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 108 +++++++++++++++++++------------
 1 file changed, 67 insertions(+), 41 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0b2a2ad1b9e8..3b55d9a15f05 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -420,23 +420,6 @@ static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
 	return pfn_pte(pte_pfn(pte) + nr, pte_pgprot(pte));
 }
 
-static inline void __set_ptes(struct mm_struct *mm,
-			      unsigned long __always_unused addr,
-			      pte_t *ptep, pte_t pte, unsigned int nr)
-{
-	page_table_check_ptes_set(mm, ptep, pte, nr);
-	__sync_cache_and_tags(pte, nr);
-
-	for (;;) {
-		__check_safe_pte_update(mm, ptep, pte);
-		__set_pte(ptep, pte);
-		if (--nr == 0)
-			break;
-		ptep++;
-		pte = pte_advance_pfn(pte, 1);
-	}
-}
-
 /*
  * Hugetlb definitions.
  */
@@ -641,30 +624,59 @@ static inline pgprot_t pud_pgprot(pud_t pud)
 	return __pgprot(pud_val(pfn_pud(pfn, __pgprot(0))) ^ pud_val(pud));
 }
 
-static inline void __set_pte_at(struct mm_struct *mm,
-				unsigned long __always_unused addr,
-				pte_t *ptep, pte_t pte, unsigned int nr)
+static inline void ___set_ptes(struct mm_struct *mm, pte_t *ptep, pte_t pte,
+			       unsigned int nr, unsigned long pgsize)
 {
-	__sync_cache_and_tags(pte, nr);
-	__check_safe_pte_update(mm, ptep, pte);
-	__set_pte(ptep, pte);
+	unsigned long stride = pgsize >> PAGE_SHIFT;
+
+	switch (pgsize) {
+	case PAGE_SIZE:
+		page_table_check_ptes_set(mm, ptep, pte, nr);
+		break;
+	case PMD_SIZE:
+		page_table_check_pmds_set(mm, (pmd_t *)ptep, pte_pmd(pte), nr);
+		break;
+	case PUD_SIZE:
+		page_table_check_puds_set(mm, (pud_t *)ptep, pte_pud(pte), nr);
+		break;
+	default:
+		VM_WARN_ON(1);
+	}
+
+	__sync_cache_and_tags(pte, nr * stride);
+
+	for (;;) {
+		__check_safe_pte_update(mm, ptep, pte);
+		__set_pte(ptep, pte);
+		if (--nr == 0)
+			break;
+		ptep++;
+		pte = pte_advance_pfn(pte, stride);
+	}
 }
 
-static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
-			      pmd_t *pmdp, pmd_t pmd)
+static inline void __set_ptes(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pte_t *ptep, pte_t pte, unsigned int nr)
 {
-	page_table_check_pmd_set(mm, pmdp, pmd);
-	return __set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd),
-						PMD_SIZE >> PAGE_SHIFT);
+	___set_ptes(mm, ptep, pte, nr, PAGE_SIZE);
 }
 
-static inline void set_pud_at(struct mm_struct *mm, unsigned long addr,
-			      pud_t *pudp, pud_t pud)
+static inline void __set_pmds(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pmd_t *pmdp, pmd_t pmd, unsigned int nr)
+{
+	___set_ptes(mm, (pte_t *)pmdp, pmd_pte(pmd), nr, PMD_SIZE);
+}
+#define set_pmd_at(mm, addr, pmdp, pmd) __set_pmds(mm, addr, pmdp, pmd, 1)
+
+static inline void __set_puds(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pud_t *pudp, pud_t pud, unsigned int nr)
 {
-	page_table_check_pud_set(mm, pudp, pud);
-	return __set_pte_at(mm, addr, (pte_t *)pudp, pud_pte(pud),
-						PUD_SIZE >> PAGE_SHIFT);
+	___set_ptes(mm, (pte_t *)pudp, pud_pte(pud), nr, PUD_SIZE);
 }
+#define set_pud_at(mm, addr, pudp, pud) __set_puds(mm, addr, pudp, pud, 1)
 
 #define __p4d_to_phys(p4d)	__pte_to_phys(p4d_pte(p4d))
 #define __phys_to_p4d_val(phys)	__phys_to_pte_val(phys)
@@ -1276,16 +1288,34 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */
 
-static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
-					 unsigned long address, pte_t *ptep)
+static inline pte_t ___ptep_get_and_clear(struct mm_struct *mm, pte_t *ptep,
+					  unsigned long pgsize)
 {
 	pte_t pte = __pte(xchg_relaxed(&pte_val(*ptep), 0));
 
-	page_table_check_pte_clear(mm, pte);
+	switch (pgsize) {
+	case PAGE_SIZE:
+		page_table_check_pte_clear(mm, pte);
+		break;
+	case PMD_SIZE:
+		page_table_check_pmd_clear(mm, pte_pmd(pte));
+		break;
+	case PUD_SIZE:
+		page_table_check_pud_clear(mm, pte_pud(pte));
+		break;
+	default:
+		VM_WARN_ON(1);
+	}
 
 	return pte;
 }
 
+static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
+					 unsigned long address, pte_t *ptep)
+{
+	return ___ptep_get_and_clear(mm, ptep, PAGE_SIZE);
+}
+
 static inline void __clear_full_ptes(struct mm_struct *mm, unsigned long addr,
 				     pte_t *ptep, unsigned int nr, int full)
 {
@@ -1322,11 +1352,7 @@ static inline pte_t __get_and_clear_full_ptes(struct mm_struct *mm,
 static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 					    unsigned long address, pmd_t *pmdp)
 {
-	pmd_t pmd = __pmd(xchg_relaxed(&pmd_val(*pmdp), 0));
-
-	page_table_check_pmd_clear(mm, pmd);
-
-	return pmd;
+	return pte_pmd(___ptep_get_and_clear(mm, (pte_t *)pmdp, PMD_SIZE));
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
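
[To make the new layering concrete, an illustrative view, not part of the
patch, of how the three entry points now funnel into one implementation;
the variables are placeholders:]

	/* All three calls perform the same core loop in ___set_ptes(),
	 * differing only in the pgsize used for dispatch and stride.
	 */
	__set_ptes(mm, addr, ptep, pte, 1); /* ___set_ptes(mm, ptep, pte, 1, PAGE_SIZE) */
	set_pmd_at(mm, addr, pmdp, pmd);    /* ___set_ptes(mm, (pte_t *)pmdp, pmd_pte(pmd), 1, PMD_SIZE) */
	set_pud_at(mm, addr, pudp, pud);    /* ___set_ptes(mm, (pte_t *)pudp, pud_pte(pud), 1, PUD_SIZE) */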
From patchwork Wed Feb 5 15:09:47 2025
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v1 07/16] arm64: hugetlb: Use ___set_ptes() and
 ___ptep_get_and_clear()
Date: Wed, 5 Feb 2025 15:09:47 +0000
Message-ID: <20250205151003.88959-8-ryan.roberts@arm.com>
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
Refactor the huge_pte helpers to use the new generic ___set_ptes() and
___ptep_get_and_clear() APIs.

This provides two benefits: first, when page_table_check=on, hugetlb is now
properly/fully checked; previously only the first page of a hugetlb folio
was checked. Second, instead of having to call __set_ptes(nr=1) for each pte
in a loop, the whole contiguous batch can now be set in one go, which
enables some efficiencies and cleans up the code.

One detail to note is that huge_ptep_clear_flush() was previously calling
ptep_clear_flush() for a non-contiguous pte (i.e. a pud or pmd block
mapping). This has a couple of disadvantages: first, ptep_clear_flush()
calls ptep_get_and_clear(), which transparently handles contpte. Given we
only call it for non-contiguous ptes, it would be safe, but a waste of
effort. It's preferable to go straight to the layer below. However, more
problematic is that ptep_get_and_clear() is for PAGE_SIZE entries, so it
calls page_table_check_pte_clear() and would not clear the whole hugetlb
folio. So let's stop special-casing the non-cont case and just rely on
get_clear_contig_flush() to do the right thing for non-cont entries.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/mm/hugetlbpage.c | 50 ++++++++-----------------------------
 1 file changed, 11 insertions(+), 39 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index e870d01d12ea..02afee31444e 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -166,12 +166,12 @@ static pte_t get_clear_contig(struct mm_struct *mm,
 	pte_t pte, tmp_pte;
 	bool present;
 
-	pte = __ptep_get_and_clear(mm, addr, ptep);
+	pte = ___ptep_get_and_clear(mm, ptep, pgsize);
 	present = pte_present(pte);
 	while (--ncontig) {
 		ptep++;
 		addr += pgsize;
-		tmp_pte = __ptep_get_and_clear(mm, addr, ptep);
+		tmp_pte = ___ptep_get_and_clear(mm, ptep, pgsize);
 		if (present) {
 			if (pte_dirty(tmp_pte))
 				pte = pte_mkdirty(pte);
@@ -215,7 +215,7 @@ static void clear_flush(struct mm_struct *mm,
 	unsigned long i, saddr = addr;
 
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
-		__ptep_get_and_clear(mm, addr, ptep);
+		___ptep_get_and_clear(mm, ptep, pgsize);
 
 	__flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
 }
@@ -226,32 +226,20 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 	size_t pgsize;
 	int i;
 	int ncontig;
-	unsigned long pfn, dpfn;
-	pgprot_t hugeprot;
 
 	ncontig = num_contig_ptes(sz, &pgsize);
 
 	if (!pte_present(pte)) {
 		for (i = 0; i < ncontig; i++, ptep++, addr += pgsize)
-			__set_ptes(mm, addr, ptep, pte, 1);
+			___set_ptes(mm, ptep, pte, 1, pgsize);
 		return;
 	}
 
-	if (!pte_cont(pte)) {
-		__set_ptes(mm, addr, ptep, pte, 1);
-		return;
-	}
-
-	pfn = pte_pfn(pte);
-	dpfn = pgsize >> PAGE_SHIFT;
-	hugeprot = pte_pgprot(pte);
-
	/* Only need to "break" if transitioning valid -> valid.
	 */
-	if (pte_valid(__ptep_get(ptep)))
+	if (pte_cont(pte) && pte_valid(__ptep_get(ptep)))
 		clear_flush(mm, addr, ptep, pgsize, ncontig);
 
-	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
+	___set_ptes(mm, ptep, pte, ncontig, pgsize);
 }
 
 pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
@@ -441,11 +429,9 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
			       unsigned long addr, pte_t *ptep,
			       pte_t pte, int dirty)
 {
-	int ncontig, i;
+	int ncontig;
 	size_t pgsize = 0;
-	unsigned long pfn = pte_pfn(pte), dpfn;
 	struct mm_struct *mm = vma->vm_mm;
-	pgprot_t hugeprot;
 	pte_t orig_pte;
 
 	VM_WARN_ON(!pte_present(pte));
@@ -454,7 +440,6 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 		return __ptep_set_access_flags(vma, addr, ptep, pte, dirty);
 
 	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
-	dpfn = pgsize >> PAGE_SHIFT;
 
 	if (!__cont_access_flags_changed(ptep, pte, ncontig))
 		return 0;
@@ -469,19 +454,14 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	if (pte_young(orig_pte))
 		pte = pte_mkyoung(pte);
 
-	hugeprot = pte_pgprot(pte);
-	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
-
+	___set_ptes(mm, ptep, pte, ncontig, pgsize);
 	return 1;
 }
 
 void huge_ptep_set_wrprotect(struct mm_struct *mm,
			     unsigned long addr, pte_t *ptep)
 {
-	unsigned long pfn, dpfn;
-	pgprot_t hugeprot;
-	int ncontig, i;
+	int ncontig;
 	size_t pgsize;
 	pte_t pte;
 
@@ -494,16 +474,11 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
 	}
 
 	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
-	dpfn = pgsize >> PAGE_SHIFT;
 
 	pte = get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
 	pte = pte_wrprotect(pte);
 
-	hugeprot = pte_pgprot(pte);
-	pfn = pte_pfn(pte);
-
-	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
+	___set_ptes(mm, ptep, pte, ncontig, pgsize);
 }
 
 pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
@@ -517,10 +492,7 @@ pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 	pte = __ptep_get(ptep);
 	VM_WARN_ON(!pte_present(pte));
 
-	if (!pte_cont(pte))
-		return ptep_clear_flush(vma, addr, ptep);
-
-	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
+	ncontig = num_contig_ptes(page_size(pte_page(pte)), &pgsize);
 	return get_clear_contig_flush(mm, addr, ptep, pgsize, ncontig);
 }
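
[A rough illustration, not from the patch: for a 64K contpte hugepage on a
4K-granule kernel, num_contig_ptes() yields 16 entries of 4K, so the old
per-pte loop collapses into one batched call:]

	/* Illustrative sketch: setting a present 64K (contpte) hugetlb entry. */
	size_t pgsize;
	int ncontig = num_contig_ptes(SZ_64K, &pgsize);	/* 16 entries of 4K */

	/* Before: 16 separate __set_ptes(nr=1) calls, each with its own
	 * checks and barriers. After: one batched call covering the whole
	 * contiguous range.
	 */
	___set_ptes(mm, ptep, pte, ncontig, pgsize);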
From patchwork Wed Feb 5 15:09:48 2025
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v1 08/16] arm64/mm: Hoist barriers out of ___set_ptes() loop
Date: Wed, 5 Feb 2025 15:09:48 +0000
Message-ID: <20250205151003.88959-9-ryan.roberts@arm.com>
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
___set_ptes() previously called __set_pte() for each PTE in the range, which
would conditionally issue a DSB and ISB to make the new PTE value
immediately visible to the table walker if the new PTE was valid and for
kernel space.

We can do better than this; let's hoist those barriers out of the loop so
that they are only issued once at the end of the loop. This reduces the
barrier cost by a factor of the number of PTEs in the range.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 3b55d9a15f05..1d428e9c0e5a 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -317,10 +317,8 @@ static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
 	WRITE_ONCE(*ptep, pte);
 }
 
-static inline void __set_pte(pte_t *ptep, pte_t pte)
+static inline void __set_pte_complete(pte_t pte)
 {
-	__set_pte_nosync(ptep, pte);
-
 	/*
	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
	 * or update_mmu_cache() have the necessary barriers.
@@ -331,6 +329,12 @@ static inline void __set_pte(pte_t *ptep, pte_t pte)
 	}
 }
 
+static inline void __set_pte(pte_t *ptep, pte_t pte)
+{
+	__set_pte_nosync(ptep, pte);
+	__set_pte_complete(pte);
+}
+
 static inline pte_t __ptep_get(pte_t *ptep)
 {
 	return READ_ONCE(*ptep);
@@ -647,12 +651,14 @@ static inline void ___set_ptes(struct mm_struct *mm, pte_t *ptep, pte_t pte,
 
 	for (;;) {
 		__check_safe_pte_update(mm, ptep, pte);
-		__set_pte(ptep, pte);
+		__set_pte_nosync(ptep, pte);
 		if (--nr == 0)
 			break;
 		ptep++;
 		pte = pte_advance_pfn(pte, stride);
 	}
+
+	__set_pte_complete(pte);
 }
 
 static inline void __set_ptes(struct mm_struct *mm,
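
[To see the effect, a rough sketch, not from the patch, of the barrier
behaviour when setting N contiguous valid kernel PTEs:]

	/*
	 * Before: each iteration paid the synchronisation cost.
	 *   for each pte: WRITE_ONCE(...); dsb(ishst); isb();  // N x (dsb + isb)
	 *
	 * After: the loop only stores; the barriers run once at the end.
	 *   for each pte: WRITE_ONCE(...);                     // N stores
	 *   dsb(ishst); isb();                                 // 1 x (dsb + isb)
	 */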
From patchwork Wed Feb 5 15:09:49 2025
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v1 09/16] arm64/mm: Avoid barriers for invalid or userspace
 mappings
Date: Wed, 5 Feb 2025 15:09:49 +0000
Message-ID: <20250205151003.88959-10-ryan.roberts@arm.com>
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>

__set_pte_complete(), set_pmd(), set_pud(), set_p4d() and set_pgd() are used
to write entries into pgtables, and they issue barriers (currently dsb and
isb) to ensure that the written values are observed by the table walker
prior to any program-order-future memory access to the mapped location.

Over the years some of these functions have received optimizations: in
particular, commit 7f0b1bf04511 ("arm64: Fix barriers used for page table
modifications") made it so that the barriers were only emitted for
valid-kernel mappings for set_pte() (now __set_pte_complete()). And commit
0795edaf3f1f ("arm64: pgtable: Implement p[mu]d_valid() and check in
set_p[mu]d()") made it so that set_pmd()/set_pud() only emitted the barriers
for valid mappings. set_p4d()/set_pgd() continue to emit the barriers
unconditionally.

This is all very confusing to the casual observer; surely the rules should
be invariant to the level? Let's change this so that every level
consistently emits the barriers only when setting valid, non-user entries
(both table and leaf).
It seems obvious that if it is ok to elide barriers for all but valid kernel
mappings at pte level, it must also be ok to do this for leaf entries at
other levels: if setting an entry to invalid, a TLB maintenance operation
must surely follow to synchronise the TLB and this contains the required
barriers. If setting a valid user mapping, the previous mapping must have
been invalid and there must have been a TLB maintenance operation (complete
with barriers) to honour break-before-make. So the worst that can happen is
we take an extra fault (which will imply the DSB + ISB) and conclude that
there is nothing to do. These are the arguments for doing this optimization
at pte level and they also apply to leaf mappings at other levels.

For table entries, the same arguments hold: if unsetting a table entry, a
TLB maintenance operation is required and this will emit the required
barriers. If setting a table entry, the previous value must have been
invalid and the table walker must already be able to observe that.
Additionally, the contents of the pgtable being pointed to in the newly set
entry must be visible before the entry is written, and this is enforced via
smp_wmb() (dmb) in the pgtable allocation functions and in
__split_huge_pmd_locked(). But this last part could never have been enforced
by the barriers in set_pXd() because they occur after updating the entry. So
ultimately, the worst that can happen by eliding these barriers for user
table entries is an extra fault.

I observe roughly the same number of page faults (107M) with and without
this change when compiling the kernel on Apple M2.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 60 ++++++++++++++++++++++++++++----
 1 file changed, 54 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1d428e9c0e5a..ff358d983583 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -767,6 +767,19 @@ static inline bool in_swapper_pgdir(void *addr)
 		((unsigned long)swapper_pg_dir & PAGE_MASK);
 }
 
+static inline bool pmd_valid_not_user(pmd_t pmd)
+{
+	/*
+	 * User-space table pmd entries always have (PXN && !UXN). All other
+	 * combinations indicate it's a table entry for kernel space.
+	 * Valid-not-user leaf entries follow the same rules as
+	 * pte_valid_not_user().
+	 */
+	if (pmd_table(pmd))
+		return !((pmd_val(pmd) & (PMD_TABLE_PXN | PMD_TABLE_UXN)) == PMD_TABLE_PXN);
+	return pte_valid_not_user(pmd_pte(pmd));
+}
+
 static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 {
 #ifdef __PAGETABLE_PMD_FOLDED
@@ -778,7 +791,7 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 
 	WRITE_ONCE(*pmdp, pmd);
 
-	if (pmd_valid(pmd)) {
+	if (pmd_valid_not_user(pmd)) {
 		dsb(ishst);
 		isb();
 	}
@@ -836,6 +849,17 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 
 static inline bool pgtable_l4_enabled(void);
 
+
+static inline bool pud_valid_not_user(pud_t pud)
+{
+	/*
+	 * Follows the same rules as pmd_valid_not_user().
+	 */
+	if (pud_table(pud))
+		return !((pud_val(pud) & (PUD_TABLE_PXN | PUD_TABLE_UXN)) == PUD_TABLE_PXN);
+	return pte_valid_not_user(pud_pte(pud));
+}
+
 static inline void set_pud(pud_t *pudp, pud_t pud)
 {
 	if (!pgtable_l4_enabled() && in_swapper_pgdir(pudp)) {
@@ -845,7 +869,7 @@ static inline void set_pud(pud_t *pudp, pud_t pud)
 
 	WRITE_ONCE(*pudp, pud);
 
-	if (pud_valid(pud)) {
+	if (pud_valid_not_user(pud)) {
 		dsb(ishst);
 		isb();
 	}
@@ -917,6 +941,16 @@ static inline bool mm_pud_folded(const struct mm_struct *mm)
 #define p4d_bad(p4d)		(pgtable_l4_enabled() && !(p4d_val(p4d) & P4D_TABLE_BIT))
 #define p4d_present(p4d)	(!p4d_none(p4d))
 
+static inline bool p4d_valid_not_user(p4d_t p4d)
+{
+	/*
+	 * User-space table p4d entries always have (PXN && !UXN). All other
+	 * combinations indicate it's a table entry for kernel space. p4d block
+	 * entries are not supported.
+	 */
+	return !((p4d_val(p4d) & (P4D_TABLE_PXN | P4D_TABLE_UXN)) == P4D_TABLE_PXN);
+}
+
 static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 {
 	if (in_swapper_pgdir(p4dp)) {
@@ -925,8 +959,11 @@ static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 	}
 
 	WRITE_ONCE(*p4dp, p4d);
-	dsb(ishst);
-	isb();
+
+	if (p4d_valid_not_user(p4d)) {
+		dsb(ishst);
+		isb();
+	}
 }
 
 static inline void p4d_clear(p4d_t *p4dp)
@@ -1044,6 +1081,14 @@ static inline bool mm_p4d_folded(const struct mm_struct *mm)
 #define pgd_bad(pgd)		(pgtable_l5_enabled() && !(pgd_val(pgd) & PGD_TABLE_BIT))
 #define pgd_present(pgd)	(!pgd_none(pgd))
 
+static inline bool pgd_valid_not_user(pgd_t pgd)
+{
+	/*
+	 * Follows the same rules as p4d_valid_not_user().
+	 */
+	return !((pgd_val(pgd) & (PGD_TABLE_PXN | PGD_TABLE_UXN)) == PGD_TABLE_PXN);
+}
+
 static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 {
 	if (in_swapper_pgdir(pgdp)) {
@@ -1052,8 +1097,11 @@ static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 	}
 
 	WRITE_ONCE(*pgdp, pgd);
-	dsb(ishst);
-	isb();
+
+	if (pgd_valid_not_user(pgd)) {
+		dsb(ishst);
+		isb();
+	}
 }
 
 static inline void pgd_clear(pgd_t *pgdp)
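
[For reference, a small illustrative truth table, not from the patch, of how
the PXN/UXN bits in a table descriptor map onto the user/kernel decision
used above:]

	/*
	 * Table-descriptor PXN/UXN combinations and how the new
	 * pXd_valid_not_user() helpers classify them:
	 *
	 *   PXN  UXN   meaning                   barriers emitted?
	 *   ---  ---   ------------------------  -----------------
	 *    1    0    user-space table entry    no (elided)
	 *    0    0    kernel table entry        yes
	 *    0    1    kernel table entry        yes
	 *    1    1    kernel table entry        yes
	 */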
From patchwork Wed Feb 5 15:09:50 2025
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v1 10/16] mm/vmalloc: Warn on improper use of vunmap_range()
Date: Wed, 5 Feb 2025 15:09:50 +0000
Message-ID: <20250205151003.88959-11-ryan.roberts@arm.com>
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
A call to vmalloc_huge() may cause memory blocks to be mapped at pmd or pud
level. But it is possible to subsequently call vunmap_range() on a sub-range
of the mapped memory, which partially overlaps a pmd or pud. In this case,
vmalloc unmaps the entire pmd or pud so that the non-overlapping portion is
also unmapped. Clearly that would have a bad outcome, but it's not something
that any callers do today as far as I can tell. So I guess it's just
expected that callers will not do this.

However, it would be useful to know if this happened in future; let's add a
warning to cover the eventuality.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 mm/vmalloc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a6e7acebe9ad..fcdf67d5177a 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -374,8 +374,10 @@ static void vunmap_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
 		if (cleared || pmd_bad(*pmd))
 			*mask |= PGTBL_PMD_MODIFIED;
 
-		if (cleared)
+		if (cleared) {
+			WARN_ON(next - addr < PMD_SIZE);
 			continue;
+		}
 		if (pmd_none_or_clear_bad(pmd))
 			continue;
 		vunmap_pte_range(pmd, addr, next, mask);
@@ -399,8 +401,10 @@ static void vunmap_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
 		if (cleared || pud_bad(*pud))
 			*mask |= PGTBL_PUD_MODIFIED;
 
-		if (cleared)
+		if (cleared) {
+			WARN_ON(next - addr < PUD_SIZE);
 			continue;
+		}
 		if (pud_none_or_clear_bad(pud))
 			continue;
 		vunmap_pmd_range(pud, addr, next, mask);
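
[To illustrate the case the warning catches, a hypothetical and deliberately
wrong caller, assuming a 4K page size where PMD_SIZE is 2M; not from the
patch:]

	/* Hypothetical misuse, for illustration only: do not do this. */
	void *p = vmalloc_huge(4 * SZ_2M, GFP_KERNEL);	/* may be pmd-mapped */

	/*
	 * Unmapping a 1M sub-range that sits inside a single 2M pmd block:
	 * vmalloc clears the whole pmd, so the surrounding memory is
	 * silently unmapped too. The new WARN_ON fires because
	 * next - addr < PMD_SIZE.
	 */
	vunmap_range((unsigned long)p + SZ_1M, (unsigned long)p + 2 * SZ_1M);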
From patchwork Wed Feb 5 15:09:51 2025
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v1 11/16] mm/vmalloc: Gracefully unmap huge ptes
Date: Wed, 5 Feb 2025 15:09:51 +0000
Message-ID: <20250205151003.88959-12-ryan.roberts@arm.com>
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
Commit f7ee1f13d606 ("mm/vmalloc: enable mapping of huge pages at pte level
in vmap") added this support by reusing the set_huge_pte_at() API, which is
otherwise only used for user mappings. But when unmapping those huge ptes,
it continued to call ptep_get_and_clear(), which is a layering violation. To
date, the only arch to implement this support is powerpc and it all happens
to work ok for it.

But arm64's implementation of ptep_get_and_clear() cannot be safely used to
clear a previous set_huge_pte_at(). So let's introduce a new arch opt-in
function, arch_vmap_pte_range_unmap_size(), which can provide the size of a
(present) pte. Then we can call huge_ptep_get_and_clear() to tear it down
properly.

Note that if vunmap_range() is called with a range that starts in the middle
of a huge pte-mapped page, we must unmap the entire huge page so the
behaviour is consistent with pmd and pud block mappings. In this case emit a
warning just like we do for pmd/pud mappings.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 include/linux/vmalloc.h |  8 ++++++++
 mm/vmalloc.c            | 18 ++++++++++++++++--
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 31e9ffd936e3..16dd4cba64f2 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -113,6 +113,14 @@ static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr, uns
 }
 #endif
 
+#ifndef arch_vmap_pte_range_unmap_size
+static inline unsigned long arch_vmap_pte_range_unmap_size(unsigned long addr,
+							    pte_t *ptep)
+{
+	return PAGE_SIZE;
+}
+#endif
+
 #ifndef arch_vmap_pte_supported_shift
 static inline int arch_vmap_pte_supported_shift(unsigned long size)
 {
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index fcdf67d5177a..6111ce900ec4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -350,12 +350,26 @@ static void vunmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
			     pgtbl_mod_mask *mask)
 {
 	pte_t *pte;
+	pte_t ptent;
+	unsigned long size = PAGE_SIZE;
 
 	pte = pte_offset_kernel(pmd, addr);
 	do {
-		pte_t ptent = ptep_get_and_clear(&init_mm, addr, pte);
+#ifdef CONFIG_HUGETLB_PAGE
+		size = arch_vmap_pte_range_unmap_size(addr, pte);
+		if (size != PAGE_SIZE) {
+			if (WARN_ON(!IS_ALIGNED(addr, size))) {
+				addr = ALIGN_DOWN(addr, size);
+				pte = PTR_ALIGN_DOWN(pte, sizeof(*pte) * (size >> PAGE_SHIFT));
+			}
+			ptent = huge_ptep_get_and_clear(&init_mm, addr, pte, size);
+			if (WARN_ON(end - addr < size))
+				size = end - addr;
+		} else
+#endif
+			ptent = ptep_get_and_clear(&init_mm, addr, pte);
 		WARN_ON(!pte_none(ptent) && !pte_present(ptent));
-	} while (pte++, addr += PAGE_SIZE, addr != end);
+	} while (pte += (size >> PAGE_SHIFT), addr += size, addr != end);
 	*mask |= PGTBL_PTE_MODIFIED;
 }
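
[A sketch of the opt-in contract, illustrative only and not the actual arm64
implementation (which follows in a later patch): an arch reports the extent
of the entry at ptep, and the generic unmap loop strides by that size. The
pte_is_huge_block() and HUGE_BLOCK_SIZE names are hypothetical:]

	/* Hypothetical arch override, for illustration only. */
	#define arch_vmap_pte_range_unmap_size arch_vmap_pte_range_unmap_size
	static inline unsigned long arch_vmap_pte_range_unmap_size(unsigned long addr,
								    pte_t *ptep)
	{
		/*
		 * Return the full extent of a huge pte so the generic unmap
		 * loop calls huge_ptep_get_and_clear() once for the block.
		 */
		return pte_is_huge_block(*ptep) ? HUGE_BLOCK_SIZE : PAGE_SIZE;
	}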
From patchwork Wed Feb 5 15:09:52 2025
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v1 12/16] arm64/mm: Support huge pte-mapped pages in vmap
Date: Wed, 5 Feb 2025 15:09:52 +0000
Message-ID: <20250205151003.88959-13-ryan.roberts@arm.com>
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
Implement the required arch functions to enable use of contpte in the vmap when VM_ALLOW_HUGE_VMAP is specified. This speeds up vmap operations because a DSB and ISB need only be issued per contpte block rather than per pte. It also reduces TLB pressure, since only a single TLB entry is needed for the whole contpte block.

Since vmap uses set_huge_pte_at() to set the contpte, that API is now used for kernel mappings for the first time. In the vmap case we never expect it to be called to modify a valid mapping, so clear_flush() should never be invoked; still, it is wise to make it robust for the kernel case, so amend the TLB flush function to use the kernel flush routine when the mm is init_mm.

Tested with vmalloc performance selftests:

  # kself/mm/test_vmalloc.sh \
	run_test_mask=1 \
	test_repeat_count=5 \
	nr_pages=256 \
	test_loop_count=100000 \
	use_huge=1

Duration reduced from 1274243 usec to 1083553 usec on Apple M2, a 15% reduction in time taken.

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/vmalloc.h | 40 ++++++++++++++++++++++++++++++++
 arch/arm64/mm/hugetlbpage.c      |  5 +++-
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/vmalloc.h b/arch/arm64/include/asm/vmalloc.h
index 38fafffe699f..fbdeb40f3857 100644
--- a/arch/arm64/include/asm/vmalloc.h
+++ b/arch/arm64/include/asm/vmalloc.h
@@ -23,6 +23,46 @@ static inline bool arch_vmap_pmd_supported(pgprot_t prot)
 	return !IS_ENABLED(CONFIG_PTDUMP_DEBUGFS);
 }
 
+#define arch_vmap_pte_range_map_size arch_vmap_pte_range_map_size
+static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr,
+						unsigned long end, u64 pfn,
+						unsigned int max_page_shift)
+{
+	if (max_page_shift < CONT_PTE_SHIFT)
+		return PAGE_SIZE;
+
+	if (end - addr < CONT_PTE_SIZE)
+		return PAGE_SIZE;
+
+	if (!IS_ALIGNED(addr, CONT_PTE_SIZE))
+		return PAGE_SIZE;
+
+	if (!IS_ALIGNED(PFN_PHYS(pfn), CONT_PTE_SIZE))
+		return PAGE_SIZE;
+
+	return CONT_PTE_SIZE;
+}
+
+#define arch_vmap_pte_range_unmap_size arch_vmap_pte_range_unmap_size
+static inline unsigned long arch_vmap_pte_range_unmap_size(unsigned long addr,
+							   pte_t *ptep)
+{
+	/*
+	 * The caller handles alignment so it's sufficient just to check
+	 * PTE_CONT.
+	 */
+	return pte_valid_cont(__ptep_get(ptep)) ? CONT_PTE_SIZE : PAGE_SIZE;
+}
+
+#define arch_vmap_pte_supported_shift arch_vmap_pte_supported_shift
+static inline int arch_vmap_pte_supported_shift(unsigned long size)
+{
+	if (size >= CONT_PTE_SIZE)
+		return CONT_PTE_SHIFT;
+
+	return PAGE_SHIFT;
+}
+
 #endif
 
 #define arch_vmap_pgprot_tagged arch_vmap_pgprot_tagged
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 02afee31444e..a74e43101dad 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -217,7 +217,10 @@ static void clear_flush(struct mm_struct *mm,
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
 		___ptep_get_and_clear(mm, ptep, pgsize);
 
-	__flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
+	if (mm == &init_mm)
+		flush_tlb_kernel_range(saddr, addr);
+	else
+		__flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
 }
 
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
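The map-size decision above reduces to a chain of cheap range and alignment checks. Here is a stand-alone sketch of those checks (made-up CONT_PTE_SHIFT and test addresses; arm64's real values depend on the configured page size):

/* Sketch of the contpte eligibility checks in arch_vmap_pte_range_map_size().
 * Constants and addresses are illustrative only. */
#include <stdio.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define CONT_PTE_SHIFT	(PAGE_SHIFT + 4)	/* 16 ptes per block */
#define CONT_PTE_SIZE	(1UL << CONT_PTE_SHIFT)
#define IS_ALIGNED(x, a)	(((x) & ((a) - 1)) == 0)

static unsigned long map_size(unsigned long addr, unsigned long end,
			      uint64_t phys, unsigned int max_page_shift)
{
	if (max_page_shift < CONT_PTE_SHIFT)
		return PAGE_SIZE;	/* caller forbids anything bigger */
	if (end - addr < CONT_PTE_SIZE)
		return PAGE_SIZE;	/* not enough range left */
	if (!IS_ALIGNED(addr, CONT_PTE_SIZE) || !IS_ALIGNED(phys, CONT_PTE_SIZE))
		return PAGE_SIZE;	/* VA and PA must both be aligned */
	return CONT_PTE_SIZE;
}

int main(void)
{
	/* aligned VA and PA, enough room: a contpte block is usable */
	printf("%#lx\n", map_size(0x10000000, 0x10010000, 0x40000000, 16));
	/* misaligned VA: fall back to a single page */
	printf("%#lx\n", map_size(0x10001000, 0x10011000, 0x40000000, 16));
	return 0;
}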
Cc: stable@vger.kernel.org
Fixes: 2ba3e6947aed ("mm/vmalloc: track which page-table levels were modified")
Fixes: 0c95cba49255 ("mm: apply_to_pte_range warn and fail if a large pte is encountered")
Signed-off-by: Ryan Roberts
---
 mm/memory.c  | 6 ++++--
 mm/vmalloc.c | 4 ++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 539c0f7c6d54..a15f7dd500ea 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3040,8 +3040,10 @@ static int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 		next = pgd_addr_end(addr, end);
 		if (pgd_none(*pgd) && !create)
 			continue;
-		if (WARN_ON_ONCE(pgd_leaf(*pgd)))
-			return -EINVAL;
+		if (WARN_ON_ONCE(pgd_leaf(*pgd))) {
+			err = -EINVAL;
+			break;
+		}
 		if (!pgd_none(*pgd) && WARN_ON_ONCE(pgd_bad(*pgd))) {
 			if (!create)
 				continue;
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6111ce900ec4..68950b1824d0 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -604,13 +604,13 @@ static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end,
 			mask |= PGTBL_PGD_MODIFIED;
 		err = vmap_pages_p4d_range(pgd, addr, next, prot, pages, &nr, &mask);
 		if (err)
-			return err;
+			break;
 	} while (pgd++, addr = next, addr != end);
 
 	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
 		arch_sync_kernel_mappings(start, end);
 
-	return 0;
+	return err;
 }
 
 /*
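The shape of the fix is the same in both callers: convert an early return into a break so that control falls through to the sync step. A minimal stand-alone model of that control flow (names are illustrative; -22 stands in for -EINVAL):

/* Model of the fixed control flow: on error we 'break' out of the loop and
 * fall through to the sync step, rather than returning early and skipping it. */
#include <stdio.h>

static int process_one(int i)
{
	return i == 2 ? -22 : 0;	/* fail on the third unit */
}

static int apply_range(int n)
{
	unsigned int mask = 0;
	int i, err = 0;

	for (i = 0; i < n; i++) {
		mask |= 1;		/* a pgtable level was modified */
		err = process_one(i);
		if (err)
			break;		/* NOT 'return err' */
	}

	if (mask)			/* earlier updates still need syncing */
		printf("sync kernel mappings\n");

	return err;
}

int main(void)
{
	printf("err=%d\n", apply_range(5));
	return 0;
}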
From patchwork Wed Feb 5 15:09:54 2025
From: Ryan Roberts
Subject: [PATCH v1 14/16] mm/vmalloc: Batch arch_sync_kernel_mappings() more efficiently
Date: Wed, 5 Feb 2025 15:09:54 +0000
Message-ID: <20250205151003.88959-15-ryan.roberts@arm.com>
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>

When page_shift is greater than PAGE_SHIFT, __vmap_pages_range_noflush() will call vmap_range_noflush() for each individual huge page. But vmap_range_noflush() would previously call arch_sync_kernel_mappings() directly, so this would end up being called for every huge page.
We can do better than this; refactor the call into the outer __vmap_pages_range_noflush() so that it is only called once for the entire batch operation. This will benefit performance for arm64, which is about to opt in to using the hook.

Signed-off-by: Ryan Roberts
---
 mm/vmalloc.c | 60 ++++++++++++++++++++++++++--------------------------
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 68950b1824d0..50fd44439875 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -285,40 +285,38 @@ static int vmap_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end,
 
 static int vmap_range_noflush(unsigned long addr, unsigned long end,
 			phys_addr_t phys_addr, pgprot_t prot,
-			unsigned int max_page_shift)
+			unsigned int max_page_shift, pgtbl_mod_mask *mask)
 {
 	pgd_t *pgd;
-	unsigned long start;
 	unsigned long next;
 	int err;
-	pgtbl_mod_mask mask = 0;
 
 	might_sleep();
 	BUG_ON(addr >= end);
 
-	start = addr;
 	pgd = pgd_offset_k(addr);
 	do {
 		next = pgd_addr_end(addr, end);
 		err = vmap_p4d_range(pgd, addr, next, phys_addr, prot,
-					max_page_shift, &mask);
+					max_page_shift, mask);
 		if (err)
 			break;
 	} while (pgd++, phys_addr += (next - addr), addr = next, addr != end);
 
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
-		arch_sync_kernel_mappings(start, end);
-
 	return err;
 }
 
 int vmap_page_range(unsigned long addr, unsigned long end,
 		    phys_addr_t phys_addr, pgprot_t prot)
 {
+	pgtbl_mod_mask mask = 0;
 	int err;
 
 	err = vmap_range_noflush(addr, end, phys_addr, pgprot_nx(prot),
-				 ioremap_max_page_shift);
+				 ioremap_max_page_shift, &mask);
+	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
+		arch_sync_kernel_mappings(addr, end);
+
 	flush_cache_vmap(addr, end);
 	if (!err)
 		err = kmsan_ioremap_page_range(addr, end, phys_addr, prot,
@@ -587,29 +585,24 @@ static int vmap_pages_p4d_range(pgd_t *pgd, unsigned long addr,
 }
 
 static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end,
-		pgprot_t prot, struct page **pages)
+		pgprot_t prot, struct page **pages, pgtbl_mod_mask *mask)
 {
-	unsigned long start = addr;
 	pgd_t *pgd;
 	unsigned long next;
 	int err = 0;
 	int nr = 0;
-	pgtbl_mod_mask mask = 0;
 
 	BUG_ON(addr >= end);
 	pgd = pgd_offset_k(addr);
 	do {
 		next = pgd_addr_end(addr, end);
 		if (pgd_bad(*pgd))
-			mask |= PGTBL_PGD_MODIFIED;
-		err = vmap_pages_p4d_range(pgd, addr, next, prot, pages, &nr, &mask);
+			*mask |= PGTBL_PGD_MODIFIED;
+		err = vmap_pages_p4d_range(pgd, addr, next, prot, pages, &nr, mask);
 		if (err)
 			break;
 	} while (pgd++, addr = next, addr != end);
 
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
-		arch_sync_kernel_mappings(start, end);
-
 	return err;
 }
 
@@ -626,26 +619,33 @@ int __vmap_pages_range_noflush(unsigned long addr, unsigned long end,
 		pgprot_t prot, struct page **pages, unsigned int page_shift)
 {
 	unsigned int i, nr = (end - addr) >> PAGE_SHIFT;
+	unsigned long start = addr;
+	pgtbl_mod_mask mask = 0;
+	int err = 0;
 
 	WARN_ON(page_shift < PAGE_SHIFT);
 
 	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMALLOC) ||
-			page_shift == PAGE_SHIFT)
-		return vmap_small_pages_range_noflush(addr, end, prot, pages);
-
-	for (i = 0; i < nr; i += 1U << (page_shift - PAGE_SHIFT)) {
-		int err;
-
-		err = vmap_range_noflush(addr, addr + (1UL << page_shift),
-					page_to_phys(pages[i]), prot,
-					page_shift);
-		if (err)
-			return err;
+			page_shift == PAGE_SHIFT) {
+		err = vmap_small_pages_range_noflush(addr, end, prot, pages,
+						     &mask);
+	} else {
+		for (i = 0; i < nr; i += 1U << (page_shift - PAGE_SHIFT)) {
+			err = vmap_range_noflush(addr,
+						 addr + (1UL << page_shift),
+						 page_to_phys(pages[i]), prot,
+						 page_shift, &mask);
+			if (err)
+				break;
 
-		addr += 1UL << page_shift;
+			addr += 1UL << page_shift;
+		}
 	}
 
-	return 0;
+	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
+		arch_sync_kernel_mappings(start, end);
+
+	return err;
 }
 
 int vmap_pages_range_noflush(unsigned long addr, unsigned long end,
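The pattern of the refactor: inner helpers accumulate modifications into a caller-provided pgtbl_mod_mask, and the single outer caller performs the sync once for the whole batch. A stand-alone sketch of that plumbing (names and the mask bit are illustrative, not the kernel's):

/* Sketch of the batching refactor: no sync inside the per-chunk helper,
 * one sync in the outer loop's caller. */
#include <stdio.h>

static int map_one_chunk(unsigned long addr, unsigned int *mask)
{
	(void)addr;		/* a real helper would map at 'addr' */
	*mask |= 1U << 4;	/* record: pte level modified */
	return 0;
}

static int map_all(unsigned long start, unsigned long end, unsigned long chunk)
{
	unsigned int mask = 0;
	unsigned long addr;
	int err = 0;

	for (addr = start; addr != end; addr += chunk) {
		err = map_one_chunk(addr, &mask);	/* no sync in here */
		if (err)
			break;
	}

	if (mask)	/* one sync for the whole batch, not per chunk */
		printf("arch_sync_kernel_mappings(%#lx, %#lx)\n", start, end);

	return err;
}

int main(void)
{
	return map_all(0x1000, 0x9000, 0x1000);
}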
From patchwork Wed Feb 5 15:09:55 2025
From: Ryan Roberts
Subject: [PATCH v1 15/16] mm: Generalize arch_sync_kernel_mappings()
Date: Wed, 5 Feb 2025 15:09:55 +0000
Message-ID: <20250205151003.88959-16-ryan.roberts@arm.com>
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>

arch_sync_kernel_mappings() is an optional hook that allows arches to synchronize certain levels of the kernel pgtables after modification. But arm64 could benefit from a hook similar to this, paired with a call prior to starting the batch of modifications.

So let's introduce arch_update_kernel_mappings_begin() and arch_update_kernel_mappings_end(). Both have a default implementation which can be overridden by the arch code. The default for the former is a nop, and the default for the latter is to call arch_sync_kernel_mappings(), so the latter replaces previous arch_sync_kernel_mappings() call sites. By default, the resulting behaviour is unchanged.

To avoid include hell, the pgtbl_mod_mask type and its associated macros are moved to their own header. In a future patch, arm64 will opt in to overriding both functions.
Signed-off-by: Ryan Roberts
---
 include/linux/pgtable.h         | 24 +----------------
 include/linux/pgtable_modmask.h | 32 ++++++++++++++++++++++
 include/linux/vmalloc.h         | 47 +++++++++++++++++++++++++++++++++
 mm/memory.c                     |  5 ++--
 mm/vmalloc.c                    | 15 ++++++-----
 5 files changed, 92 insertions(+), 31 deletions(-)
 create mode 100644 include/linux/pgtable_modmask.h

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 94d267d02372..7f70786a73b3 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -4,6 +4,7 @@
 
 #include
 #include
+#include
 
 #define PMD_ORDER	(PMD_SHIFT - PAGE_SHIFT)
 #define PUD_ORDER	(PUD_SHIFT - PAGE_SHIFT)
@@ -1786,29 +1787,6 @@ static inline bool arch_has_pfn_modify_check(void)
 # define PAGE_KERNEL_EXEC PAGE_KERNEL
 #endif
 
-/*
- * Page Table Modification bits for pgtbl_mod_mask.
- *
- * These are used by the p?d_alloc_track*() set of functions an in the generic
- * vmalloc/ioremap code to track at which page-table levels entries have been
- * modified. Based on that the code can better decide when vmalloc and ioremap
- * mapping changes need to be synchronized to other page-tables in the system.
- */
-#define __PGTBL_PGD_MODIFIED	0
-#define __PGTBL_P4D_MODIFIED	1
-#define __PGTBL_PUD_MODIFIED	2
-#define __PGTBL_PMD_MODIFIED	3
-#define __PGTBL_PTE_MODIFIED	4
-
-#define PGTBL_PGD_MODIFIED	BIT(__PGTBL_PGD_MODIFIED)
-#define PGTBL_P4D_MODIFIED	BIT(__PGTBL_P4D_MODIFIED)
-#define PGTBL_PUD_MODIFIED	BIT(__PGTBL_PUD_MODIFIED)
-#define PGTBL_PMD_MODIFIED	BIT(__PGTBL_PMD_MODIFIED)
-#define PGTBL_PTE_MODIFIED	BIT(__PGTBL_PTE_MODIFIED)
-
-/* Page-Table Modification Mask */
-typedef unsigned int pgtbl_mod_mask;
-
 #endif /* !__ASSEMBLY__ */
 
 #if !defined(MAX_POSSIBLE_PHYSMEM_BITS) && !defined(CONFIG_64BIT)
diff --git a/include/linux/pgtable_modmask.h b/include/linux/pgtable_modmask.h
new file mode 100644
index 000000000000..5a21b1bb8df3
--- /dev/null
+++ b/include/linux/pgtable_modmask.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_PGTABLE_MODMASK_H
+#define _LINUX_PGTABLE_MODMASK_H
+
+#ifndef __ASSEMBLY__
+
+/*
+ * Page Table Modification bits for pgtbl_mod_mask.
+ *
+ * These are used by the p?d_alloc_track*() set of functions an in the generic
+ * vmalloc/ioremap code to track at which page-table levels entries have been
+ * modified. Based on that the code can better decide when vmalloc and ioremap
+ * mapping changes need to be synchronized to other page-tables in the system.
+ */
+#define __PGTBL_PGD_MODIFIED	0
+#define __PGTBL_P4D_MODIFIED	1
+#define __PGTBL_PUD_MODIFIED	2
+#define __PGTBL_PMD_MODIFIED	3
+#define __PGTBL_PTE_MODIFIED	4
+
+#define PGTBL_PGD_MODIFIED	BIT(__PGTBL_PGD_MODIFIED)
+#define PGTBL_P4D_MODIFIED	BIT(__PGTBL_P4D_MODIFIED)
+#define PGTBL_PUD_MODIFIED	BIT(__PGTBL_PUD_MODIFIED)
+#define PGTBL_PMD_MODIFIED	BIT(__PGTBL_PMD_MODIFIED)
+#define PGTBL_PTE_MODIFIED	BIT(__PGTBL_PTE_MODIFIED)
+
+/* Page-Table Modification Mask */
+typedef unsigned int pgtbl_mod_mask;
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* _LINUX_PGTABLE_MODMASK_H */
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 16dd4cba64f2..cb5d8f1965a1 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -11,6 +11,7 @@
 #include	/* pgprot_t */
 #include
 #include
+#include
 
 #include
 
@@ -213,6 +214,26 @@ extern int remap_vmalloc_range(struct vm_area_struct *vma, void *addr,
 int vmap_pages_range(unsigned long addr, unsigned long end, pgprot_t prot,
 		struct page **pages, unsigned int page_shift);
 
+#ifndef arch_update_kernel_mappings_begin
+/**
+ * arch_update_kernel_mappings_begin - A batch of kernel pgtable mappings are
+ *                                     about to be updated.
+ * @start: Virtual address of start of range to be updated.
+ * @end: Virtual address of end of range to be updated.
+ *
+ * An optional hook to allow architecture code to prepare for a batch of kernel
+ * pgtable mapping updates. An architecture may use this to enter a lazy mode
+ * where some operations can be deferred until the end of the batch.
+ *
+ * Context: Called in task context and may be preemptible.
+ */
+static inline void arch_update_kernel_mappings_begin(unsigned long start,
+						     unsigned long end)
+{
+}
+#endif
+
+#ifndef arch_update_kernel_mappings_end
 /*
  * Architectures can set this mask to a combination of PGTBL_P?D_MODIFIED values
  * and let generic vmalloc and ioremap code know when arch_sync_kernel_mappings()
@@ -229,6 +250,32 @@ int vmap_pages_range(unsigned long addr, unsigned long end, pgprot_t prot,
  */
 void arch_sync_kernel_mappings(unsigned long start, unsigned long end);
 
+/**
+ * arch_update_kernel_mappings_end - A batch of kernel pgtable mappings have
+ *                                   been updated.
+ * @start: Virtual address of start of range that was updated.
+ * @end: Virtual address of end of range that was updated.
+ *
+ * An optional hook to inform architecture code that a batch update is complete.
+ * This balances a previous call to arch_update_kernel_mappings_begin().
+ *
+ * An architecture may override this for any purpose, such as exiting a lazy
+ * mode previously entered with arch_update_kernel_mappings_begin() or syncing
+ * kernel mappings to a secondary pgtable. The default implementation calls an
+ * arch-provided arch_sync_kernel_mappings() if any arch-defined pgtable level
+ * was updated.
+ *
+ * Context: Called in task context and may be preemptible.
+ */
+static inline void arch_update_kernel_mappings_end(unsigned long start,
+						   unsigned long end,
+						   pgtbl_mod_mask mask)
+{
+	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
+		arch_sync_kernel_mappings(start, end);
+}
+#endif
+
 /*
  * Lowlevel-APIs (not for driver use!)
 */
diff --git a/mm/memory.c b/mm/memory.c
index a15f7dd500ea..f80930bc19f6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3035,6 +3035,8 @@ static int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 	if (WARN_ON(addr >= end))
 		return -EINVAL;
 
+	arch_update_kernel_mappings_begin(start, end);
+
 	pgd = pgd_offset(mm, addr);
 	do {
 		next = pgd_addr_end(addr, end);
@@ -3055,8 +3057,7 @@ static int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 			break;
 	} while (pgd++, addr = next, addr != end);
 
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
-		arch_sync_kernel_mappings(start, start + size);
+	arch_update_kernel_mappings_end(start, end, mask);
 
 	return err;
 }
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 50fd44439875..c5c51d86ef78 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -312,10 +312,10 @@ int vmap_page_range(unsigned long addr, unsigned long end,
 	pgtbl_mod_mask mask = 0;
 	int err;
 
+	arch_update_kernel_mappings_begin(addr, end);
 	err = vmap_range_noflush(addr, end, phys_addr, pgprot_nx(prot),
 				 ioremap_max_page_shift, &mask);
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
-		arch_sync_kernel_mappings(addr, end);
+	arch_update_kernel_mappings_end(addr, end, mask);
 
 	flush_cache_vmap(addr, end);
 	if (!err)
@@ -463,6 +463,9 @@ void __vunmap_range_noflush(unsigned long start, unsigned long end)
 	pgtbl_mod_mask mask = 0;
 
 	BUG_ON(addr >= end);
+
+	arch_update_kernel_mappings_begin(start, end);
+
 	pgd = pgd_offset_k(addr);
 	do {
 		next = pgd_addr_end(addr, end);
@@ -473,8 +476,7 @@ void __vunmap_range_noflush(unsigned long start, unsigned long end)
 		vunmap_p4d_range(pgd, addr, next, &mask);
 	} while (pgd++, addr = next, addr != end);
 
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
-		arch_sync_kernel_mappings(start, end);
+	arch_update_kernel_mappings_end(start, end, mask);
 }
 
 void vunmap_range_noflush(unsigned long start, unsigned long end)
@@ -625,6 +627,8 @@ int __vmap_pages_range_noflush(unsigned long addr, unsigned long end,
 
 	WARN_ON(page_shift < PAGE_SHIFT);
 
+	arch_update_kernel_mappings_begin(start, end);
+
 	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMALLOC) ||
 			page_shift == PAGE_SHIFT) {
 		err = vmap_small_pages_range_noflush(addr, end, prot, pages,
@@ -642,8 +646,7 @@ int __vmap_pages_range_noflush(unsigned long addr, unsigned long end,
 		}
 	}
 
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
-		arch_sync_kernel_mappings(start, end);
+	arch_update_kernel_mappings_end(start, end, mask);
 
 	return err;
 }
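The contract of the two hooks can be modelled stand-alone. In this sketch the default end-hook behaviour mirrors the kernel-doc above; the ARCH_PAGE_TABLE_SYNC_MASK value is made up and the sync is just a printf:

/* Userspace sketch of the begin/end hook contract: begin may enter a lazy
 * mode, end must flush it; the default end calls the sync when the mask says
 * a tracked pgtable level changed. Illustrative only, not kernel code. */
#include <stdio.h>

typedef unsigned int pgtbl_mod_mask;
#define ARCH_PAGE_TABLE_SYNC_MASK	0x1f	/* pretend: any level */

static void arch_sync_kernel_mappings(unsigned long s, unsigned long e)
{
	printf("sync %#lx..%#lx\n", s, e);
}

static void update_begin(unsigned long s, unsigned long e)
{
	(void)s; (void)e;	/* default implementation: nop */
}

static void update_end(unsigned long s, unsigned long e, pgtbl_mod_mask mask)
{
	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)	/* default implementation */
		arch_sync_kernel_mappings(s, e);
}

int main(void)
{
	pgtbl_mod_mask mask = 0;

	update_begin(0x1000, 0x5000);
	mask |= 1;			/* modify entries, record the level */
	update_end(0x1000, 0x5000, mask);
	return 0;
}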
From patchwork Wed Feb 5 15:09:56 2025
From: Ryan Roberts
Subject: [PATCH v1 16/16] arm64/mm: Defer barriers when updating kernel mappings
Date: Wed, 5 Feb 2025 15:09:56 +0000
Message-ID: <20250205151003.88959-17-ryan.roberts@arm.com>
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>

Because the kernel can't tolerate page faults for kernel mappings, when setting a valid, kernel space pte (or pmd/pud/p4d/pgd), it emits a dsb(ishst) to ensure that the store to the pgtable is observed by the table walker immediately. Additionally it emits an isb() to ensure that any already speculatively determined invalid mapping fault gets canceled.

We can improve the performance of vmalloc operations by batching these barriers until the end of a set of entry updates. The newly added arch_update_kernel_mappings_begin() / arch_update_kernel_mappings_end() hooks provide the required mechanism. vmalloc improves by up to 30% as a result.

Two new TIF_ flags are created: TIF_KMAP_UPDATE_ACTIVE tells us if we are in batch mode and can therefore defer any barriers until the end of the batch; TIF_KMAP_UPDATE_PENDING tells us if barriers are queued to be emitted at the end of the batch.
Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h     | 65 +++++++++++++++++++---------
 arch/arm64/include/asm/thread_info.h |  2 +
 arch/arm64/kernel/process.c          | 20 +++++++--
 3 files changed, 63 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index ff358d983583..1ee9b9588502 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -39,6 +39,41 @@
 #include
 #include
 #include
+#include
+
+static inline void emit_pte_barriers(void)
+{
+	dsb(ishst);
+	isb();
+}
+
+static inline void queue_pte_barriers(void)
+{
+	if (test_thread_flag(TIF_KMAP_UPDATE_ACTIVE)) {
+		if (!test_thread_flag(TIF_KMAP_UPDATE_PENDING))
+			set_thread_flag(TIF_KMAP_UPDATE_PENDING);
+	} else
+		emit_pte_barriers();
+}
+
+#define arch_update_kernel_mappings_begin arch_update_kernel_mappings_begin
+static inline void arch_update_kernel_mappings_begin(unsigned long start,
+						     unsigned long end)
+{
+	set_thread_flag(TIF_KMAP_UPDATE_ACTIVE);
+}
+
+#define arch_update_kernel_mappings_end arch_update_kernel_mappings_end
+static inline void arch_update_kernel_mappings_end(unsigned long start,
+						   unsigned long end,
+						   pgtbl_mod_mask mask)
+{
+	if (test_thread_flag(TIF_KMAP_UPDATE_PENDING))
+		emit_pte_barriers();
+
+	clear_thread_flag(TIF_KMAP_UPDATE_PENDING);
+	clear_thread_flag(TIF_KMAP_UPDATE_ACTIVE);
+}
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
@@ -323,10 +358,8 @@ static inline void __set_pte_complete(pte_t pte)
 	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
 	 * or update_mmu_cache() have the necessary barriers.
 	 */
-	if (pte_valid_not_user(pte)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pte_valid_not_user(pte))
+		queue_pte_barriers();
 }
 
 static inline void __set_pte(pte_t *ptep, pte_t pte)
@@ -791,10 +824,8 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 
 	WRITE_ONCE(*pmdp, pmd);
 
-	if (pmd_valid_not_user(pmd)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pmd_valid_not_user(pmd))
+		queue_pte_barriers();
 }
 
 static inline void pmd_clear(pmd_t *pmdp)
@@ -869,10 +900,8 @@ static inline void set_pud(pud_t *pudp, pud_t pud)
 
 	WRITE_ONCE(*pudp, pud);
 
-	if (pud_valid_not_user(pud)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pud_valid_not_user(pud))
+		queue_pte_barriers();
 }
 
 static inline void pud_clear(pud_t *pudp)
@@ -960,10 +989,8 @@ static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 
 	WRITE_ONCE(*p4dp, p4d);
 
-	if (p4d_valid_not_user(p4d)) {
-		dsb(ishst);
-		isb();
-	}
+	if (p4d_valid_not_user(p4d))
+		queue_pte_barriers();
 }
 
 static inline void p4d_clear(p4d_t *p4dp)
@@ -1098,10 +1125,8 @@ static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 
 	WRITE_ONCE(*pgdp, pgd);
 
-	if (pgd_valid_not_user(pgd)) {
-		dsb(ishst);
-		isb();
-	}
+	if (pgd_valid_not_user(pgd))
+		queue_pte_barriers();
 }
 
 static inline void pgd_clear(pgd_t *pgdp)
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 1114c1c3300a..382d2121261e 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -82,6 +82,8 @@ void arch_setup_new_exec(void);
 #define TIF_SME_VL_INHERIT	28	/* Inherit SME vl_onexec across exec */
 #define TIF_KERNEL_FPSTATE	29	/* Task is in a kernel mode FPSIMD section */
 #define TIF_TSC_SIGSEGV		30	/* SIGSEGV on counter-timer access */
+#define TIF_KMAP_UPDATE_ACTIVE	31	/* kernel map update in progress */
+#define TIF_KMAP_UPDATE_PENDING	32	/* kernel map updated with deferred barriers */
 
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 42faebb7b712..1367ec6407d1 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -680,10 +680,10 @@ struct task_struct *__switch_to(struct task_struct *prev,
 	gcs_thread_switch(next);
 
 	/*
-	 * Complete any pending TLB or cache maintenance on this CPU in case
-	 * the thread migrates to a different CPU.
-	 * This full barrier is also required by the membarrier system
-	 * call.
+	 * Complete any pending TLB or cache maintenance on this CPU in case the
+	 * thread migrates to a different CPU. This full barrier is also
+	 * required by the membarrier system call. Additionally it is required
+	 * for TIF_KMAP_UPDATE_PENDING, see below.
 	 */
 	dsb(ish);
 
@@ -696,6 +696,18 @@ struct task_struct *__switch_to(struct task_struct *prev,
 	/* avoid expensive SCTLR_EL1 accesses if no change */
 	if (prev->thread.sctlr_user != next->thread.sctlr_user)
 		update_sctlr_el1(next->thread.sctlr_user);
+	else if (unlikely(test_thread_flag(TIF_KMAP_UPDATE_PENDING))) {
+		/*
+		 * In the unlikely event that a kernel map update is on-going
+		 * when preemption occurs, we must emit the pending barriers.
+		 * emit_pte_barriers() consists of "dsb(ishst); isb();". The dsb
+		 * is already handled above. The isb() is handled if
+		 * update_sctlr_el1() was called. So we only need to emit isb()
+		 * here if it wasn't called.
+		 */
+		isb();
+		clear_thread_flag(TIF_KMAP_UPDATE_PENDING);
+	}
 
 	/* the actual thread switch */
 	last = cpu_switch_to(prev, next);
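The batching scheme can be modelled with two booleans standing in for the TIF_ flags. This stand-alone sketch (plain C, not kernel code; the barrier is just a printf) shows why sixteen pte writes inside a batch cost one barrier pair instead of sixteen:

/* Model of the deferred-barrier scheme: inside a batch, a pte write only
 * sets a 'pending' flag; the barriers (dsb/isb on arm64) are emitted once
 * at arch_update_kernel_mappings_end(). */
#include <stdio.h>
#include <stdbool.h>

static bool kmap_update_active;		/* stands in for TIF_KMAP_UPDATE_ACTIVE */
static bool kmap_update_pending;	/* stands in for TIF_KMAP_UPDATE_PENDING */

static void emit_pte_barriers(void)
{
	printf("dsb(ishst); isb();\n");	/* what arm64 would emit */
}

static void queue_pte_barriers(void)
{
	if (kmap_update_active)
		kmap_update_pending = true;	/* defer to end of batch */
	else
		emit_pte_barriers();		/* not batching: emit now */
}

int main(void)
{
	kmap_update_active = true;		/* ..._begin() */
	for (int i = 0; i < 16; i++)
		queue_pte_barriers();		/* 16 pte writes, 0 barriers */

	if (kmap_update_pending)		/* ..._end() */
		emit_pte_barriers();		/* one barrier pair in total */
	kmap_update_pending = kmap_update_active = false;
	return 0;
}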