From patchwork Wed Nov 15 16:30:05 2023
X-Patchwork-Id: 13457058
From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas, Will Deacon, Ard Biesheuvel, Marc Zyngier,
    Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
    Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
    Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
    Anshuman Khandual, Matthew Wilcox, Yu Zhao, Mark Rutland,
    David Hildenbrand, Kefeng Wang, John Hubbard, Zi Yan
Cc: Ryan Roberts,
    linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH v2 01/14] mm: Batch-copy PTE ranges during fork()
Date: Wed, 15 Nov 2023 16:30:05 +0000
Message-Id: <20231115163018.1303287-2-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>
References: <20231115163018.1303287-1-ryan.roberts@arm.com>

Convert copy_pte_range() to copy a set of ptes in a batch. A given batch
maps a physically contiguous block of memory, all belonging to the same
folio, with the same permissions, and for shared mappings, the same
dirty state.

This will likely improve performance by a tiny amount due to batching
the folio reference count management and calling set_ptes() rather than
making individual calls to set_pte_at(). However, the primary motivation
for this change is to reduce the number of TLB maintenance operations
that the arm64 backend has to perform during fork, as it is about to add
transparent support for the "contiguous bit" in its ptes. By
write-protecting the parent using the new ptep_set_wrprotects() (note
the 's' at the end) function, the backend can avoid having to unfold
contig ranges of ptes, which is expensive, when all ptes in the range
are being write-protected. Similarly, by using set_ptes() rather than
set_pte_at() to set up ptes in the child, the backend does not need to
fold a contiguous range once they are all populated - they can be
initially populated as a contiguous range in the first place.

This change addresses the core-mm refactoring only, and introduces
ptep_set_wrprotects() with a default implementation that calls
ptep_set_wrprotect() for each pte in the range. A separate change will
implement ptep_set_wrprotects() in the arm64 backend to realize the
performance improvement as part of the work to enable contpte mappings.
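As an illustration, the before/after calling pattern in the fork path is
roughly the following sketch (schematic only: rmap, refcounting, uffd-wp
and error handling are omitted, and the variable names are borrowed from
copy_pte_range()):

	/*
	 * Before (schematic): one call per pte; on arm64 each
	 * ptep_set_wrprotect() would have to unfold a contpte range and
	 * each set_pte_at() would have to consider re-folding one.
	 */
	for (i = 0; i < nr; i++, addr += PAGE_SIZE, src_pte++, dst_pte++) {
		ptep_set_wrprotect(src_mm, addr, src_pte);
		set_pte_at(dst_mm, addr, dst_pte,
			   pte_wrprotect(ptep_get(src_pte)));
	}

	/*
	 * After (schematic): one call per batch; set_ptes() advances the
	 * pfn for each successive entry itself, so the whole range can be
	 * kept contiguous by the arch backend.
	 */
	ptep_set_wrprotects(src_mm, addr, src_pte, nr);
	set_ptes(dst_mm, addr, dst_pte, pte_wrprotect(ptep_get(src_pte)), nr);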
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 include/linux/pgtable.h |  13 +++
 mm/memory.c             | 175 +++++++++++++++++++++++++++++++---------
 2 files changed, 150 insertions(+), 38 deletions(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index af7639c3b0a3..1c50f8a0fdde 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -622,6 +622,19 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
 }
 #endif
 
+#ifndef ptep_set_wrprotects
+struct mm_struct;
+static inline void ptep_set_wrprotects(struct mm_struct *mm,
+				unsigned long address, pte_t *ptep,
+				unsigned int nr)
+{
+	unsigned int i;
+
+	for (i = 0; i < nr; i++, address += PAGE_SIZE, ptep++)
+		ptep_set_wrprotect(mm, address, ptep);
+}
+#endif
+
 /*
  * On some architectures hardware does not set page access bit when accessing
  * memory page, it is responsibility of software setting this bit. It brings
diff --git a/mm/memory.c b/mm/memory.c
index 1f18ed4a5497..b7c8228883cf 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -921,46 +921,129 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
 	/* Uffd-wp needs to be delivered to dest pte as well */
 	pte = pte_mkuffd_wp(pte);
 	set_pte_at(dst_vma->vm_mm, addr, dst_pte, pte);
-	return 0;
+	return 1;
+}
+
+static inline unsigned long page_cont_mapped_vaddr(struct page *page,
+				struct page *anchor, unsigned long anchor_vaddr)
+{
+	unsigned long offset;
+	unsigned long vaddr;
+
+	offset = (page_to_pfn(page) - page_to_pfn(anchor)) << PAGE_SHIFT;
+	vaddr = anchor_vaddr + offset;
+
+	if (anchor > page) {
+		if (vaddr > anchor_vaddr)
+			return 0;
+	} else {
+		if (vaddr < anchor_vaddr)
+			return ULONG_MAX;
+	}
+
+	return vaddr;
+}
+
+static int folio_nr_pages_cont_mapped(struct folio *folio,
+				      struct page *page, pte_t *pte,
+				      unsigned long addr, unsigned long end,
+				      pte_t ptent, bool *any_dirty)
+{
+	int floops;
+	int i;
+	unsigned long pfn;
+	pgprot_t prot;
+	struct page *folio_end;
+
+	if (!folio_test_large(folio))
+		return 1;
+
+	folio_end = &folio->page + folio_nr_pages(folio);
+	end = min(page_cont_mapped_vaddr(folio_end, page, addr), end);
+	floops = (end - addr) >> PAGE_SHIFT;
+	pfn = page_to_pfn(page);
+	prot = pte_pgprot(pte_mkold(pte_mkclean(ptent)));
+
+	*any_dirty = pte_dirty(ptent);
+
+	pfn++;
+	pte++;
+
+	for (i = 1; i < floops; i++) {
+		ptent = ptep_get(pte);
+		ptent = pte_mkold(pte_mkclean(ptent));
+
+		if (!pte_present(ptent) || pte_pfn(ptent) != pfn ||
+		    pgprot_val(pte_pgprot(ptent)) != pgprot_val(prot))
+			break;
+
+		if (pte_dirty(ptent))
+			*any_dirty = true;
+
+		pfn++;
+		pte++;
+	}
+
+	return i;
 }
 
 /*
- * Copy one pte.  Returns 0 if succeeded, or -EAGAIN if one preallocated page
- * is required to copy this pte.
+ * Copy set of contiguous ptes.  Returns number of ptes copied if succeeded
+ * (always gte 1), or -EAGAIN if one preallocated page is required to copy the
+ * first pte.
 */
 static inline int
-copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
-		 pte_t *dst_pte, pte_t *src_pte, unsigned long addr, int *rss,
-		 struct folio **prealloc)
+copy_present_ptes(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
+		  pte_t *dst_pte, pte_t *src_pte,
+		  unsigned long addr, unsigned long end,
+		  int *rss, struct folio **prealloc)
 {
 	struct mm_struct *src_mm = src_vma->vm_mm;
 	unsigned long vm_flags = src_vma->vm_flags;
 	pte_t pte = ptep_get(src_pte);
 	struct page *page;
 	struct folio *folio;
+	int nr = 1;
+	bool anon;
+	bool any_dirty = pte_dirty(pte);
+	int i;
 
 	page = vm_normal_page(src_vma, addr, pte);
-	if (page)
+	if (page) {
 		folio = page_folio(page);
-	if (page && folio_test_anon(folio)) {
-		/*
-		 * If this page may have been pinned by the parent process,
-		 * copy the page immediately for the child so that we'll always
-		 * guarantee the pinned page won't be randomly replaced in the
-		 * future.
-		 */
-		folio_get(folio);
-		if (unlikely(page_try_dup_anon_rmap(page, false, src_vma))) {
-			/* Page may be pinned, we have to copy. */
-			folio_put(folio);
-			return copy_present_page(dst_vma, src_vma, dst_pte, src_pte,
-						 addr, rss, prealloc, page);
+		anon = folio_test_anon(folio);
+		nr = folio_nr_pages_cont_mapped(folio, page, src_pte, addr,
+						end, pte, &any_dirty);
+
+		for (i = 0; i < nr; i++, page++) {
+			if (anon) {
+				/*
+				 * If this page may have been pinned by the
+				 * parent process, copy the page immediately for
+				 * the child so that we'll always guarantee the
+				 * pinned page won't be randomly replaced in the
+				 * future.
+				 */
+				if (unlikely(page_try_dup_anon_rmap(
+						page, false, src_vma))) {
+					if (i != 0)
+						break;
+					/* Page may be pinned, we have to copy. */
+					return copy_present_page(
+						dst_vma, src_vma, dst_pte,
+						src_pte, addr, rss, prealloc,
+						page);
+				}
+				rss[MM_ANONPAGES]++;
+				VM_BUG_ON(PageAnonExclusive(page));
+			} else {
+				page_dup_file_rmap(page, false);
+				rss[mm_counter_file(page)]++;
+			}
 		}
-		rss[MM_ANONPAGES]++;
-	} else if (page) {
-		folio_get(folio);
-		page_dup_file_rmap(page, false);
-		rss[mm_counter_file(page)]++;
+
+		nr = i;
+		folio_ref_add(folio, nr);
 	}
 
 	/*
@@ -968,24 +1051,28 @@ copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
 	 * in the parent and the child
 	 */
 	if (is_cow_mapping(vm_flags) && pte_write(pte)) {
-		ptep_set_wrprotect(src_mm, addr, src_pte);
+		ptep_set_wrprotects(src_mm, addr, src_pte, nr);
 		pte = pte_wrprotect(pte);
 	}
-	VM_BUG_ON(page && folio_test_anon(folio) && PageAnonExclusive(page));
 
 	/*
-	 * If it's a shared mapping, mark it clean in
-	 * the child
+	 * If it's a shared mapping, mark it clean in the child. If its a
+	 * private mapping, mark it dirty in the child if _any_ of the parent
+	 * mappings in the block were marked dirty. The contiguous block of
+	 * mappings are all backed by the same folio, so if any are dirty then
+	 * the whole folio is dirty. This allows us to determine the batch size
+	 * without having to ever consider the dirty bit. See
+	 * folio_nr_pages_cont_mapped().
 	 */
-	if (vm_flags & VM_SHARED)
-		pte = pte_mkclean(pte);
-	pte = pte_mkold(pte);
+	pte = pte_mkold(pte_mkclean(pte));
+	if (!(vm_flags & VM_SHARED) && any_dirty)
+		pte = pte_mkdirty(pte);
 
 	if (!userfaultfd_wp(dst_vma))
 		pte = pte_clear_uffd_wp(pte);
 
-	set_pte_at(dst_vma->vm_mm, addr, dst_pte, pte);
-	return 0;
+	set_ptes(dst_vma->vm_mm, addr, dst_pte, pte, nr);
+	return nr;
 }
 
 static inline struct folio *page_copy_prealloc(struct mm_struct *src_mm,
@@ -1087,15 +1174,28 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
 			 */
 			WARN_ON_ONCE(ret != -ENOENT);
 		}
-		/* copy_present_pte() will clear `*prealloc' if consumed */
-		ret = copy_present_pte(dst_vma, src_vma, dst_pte, src_pte,
-				       addr, rss, &prealloc);
+		/* copy_present_ptes() will clear `*prealloc' if consumed */
+		ret = copy_present_ptes(dst_vma, src_vma, dst_pte, src_pte,
+					addr, end, rss, &prealloc);
+
 		/*
 		 * If we need a pre-allocated page for this pte, drop the
 		 * locks, allocate, and try again.
 		 */
 		if (unlikely(ret == -EAGAIN))
 			break;
+
+		/*
+		 * Positive return value is the number of ptes copied.
+		 */
+		VM_WARN_ON_ONCE(ret < 1);
+		progress += 8 * ret;
+		ret--;
+		dst_pte += ret;
+		src_pte += ret;
+		addr += ret << PAGE_SHIFT;
+		ret = 0;
+
 		if (unlikely(prealloc)) {
 			/*
 			 * pre-alloc page cannot be reused by next time so as
@@ -1106,7 +1206,6 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
 			folio_put(prealloc);
 			prealloc = NULL;
 		}
-		progress += 8;
 	} while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
 
 	arch_leave_lazy_mmu_mode();
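To make the new return-value bookkeeping in copy_pte_range() above
concrete, here is a worked example as an annotated sketch, assuming
copy_present_ptes() just copied a batch of 4 ptes:

	/*
	 * ret = 4;                   copy_present_ptes() copied 4 ptes
	 * progress += 8 * ret;       progress still advances 8 per pte
	 * ret--;                     ret = 3
	 * dst_pte += ret;            step over 3 of the 4 copied entries
	 * src_pte += ret;
	 * addr += ret << PAGE_SHIFT;
	 * ret = 0;                   "success" for the checks that follow
	 * ...
	 * } while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
	 *          ^ the loop's continuation expression supplies the 4th step
	 */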
From patchwork Wed Nov 15 16:30:06 2023
X-Patchwork-Id: 13457056
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v2 02/14] arm64/mm: set_pte(): New layer to manage contig bit
Date: Wed, 15 Nov 2023 16:30:06 +0000
Message-Id: <20231115163018.1303287-3-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>

Create a new layer for the in-table PTE manipulation APIs. For now, the
existing API is prefixed with double underscore to become the
arch-private API, and the public API is just a simple wrapper that calls
the private API.

The public API implementation will subsequently be used to transparently
manipulate the contiguous bit where appropriate. But since there are
already some contig-aware users (e.g. hugetlb, kernel mapper), we must
first ensure those users use the private API directly so that the future
contig-bit manipulations in the public API do not interfere with those
existing uses.
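A minimal sketch of the layering (illustrative only; it mirrors what the
diff below does for set_pte()):

	/* Arch-private API: always operates on a single real pte entry. */
	static inline void __set_pte(pte_t *ptep, pte_t pte)
	{
		WRITE_ONCE(*ptep, pte);
		/* ... barriers and cache/tag maintenance as today ... */
	}

	/*
	 * Public API: a plain alias for now; a later patch turns it into a
	 * wrapper that transparently folds/unfolds the contiguous bit.
	 */
	#define set_pte			__set_pte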
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 12 ++++++++----
 arch/arm64/kernel/efi.c          |  2 +-
 arch/arm64/mm/fixmap.c           |  2 +-
 arch/arm64/mm/kasan_init.c       |  4 ++--
 arch/arm64/mm/mmu.c              |  2 +-
 arch/arm64/mm/pageattr.c         |  2 +-
 arch/arm64/mm/trans_pgd.c        |  4 ++--
 7 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b19a8aee684c..650d4f4bb6dc 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -93,7 +93,8 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
 	__pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
 
 #define pte_none(pte)		(!pte_val(pte))
-#define pte_clear(mm,addr,ptep)	set_pte(ptep, __pte(0))
+#define pte_clear(mm, addr, ptep) \
+				__set_pte(ptep, __pte(0))
 #define pte_page(pte)		(pfn_to_page(pte_pfn(pte)))
 
 /*
@@ -261,7 +262,7 @@ static inline pte_t pte_mkdevmap(pte_t pte)
 	return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL));
 }
 
-static inline void set_pte(pte_t *ptep, pte_t pte)
+static inline void __set_pte(pte_t *ptep, pte_t pte)
 {
 	WRITE_ONCE(*ptep, pte);
 
@@ -350,7 +351,7 @@ static inline void set_ptes(struct mm_struct *mm,
 
 	for (;;) {
 		__check_safe_pte_update(mm, ptep, pte);
-		set_pte(ptep, pte);
+		__set_pte(ptep, pte);
 		if (--nr == 0)
 			break;
 		ptep++;
@@ -534,7 +535,7 @@ static inline void __set_pte_at(struct mm_struct *mm,
 {
 	__sync_cache_and_tags(pte, nr);
 	__check_safe_pte_update(mm, ptep, pte);
-	set_pte(ptep, pte);
+	__set_pte(ptep, pte);
 }
 
 static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
@@ -1118,6 +1119,9 @@ extern pte_t ptep_modify_prot_start(struct vm_area_struct *vma,
 extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
 				    unsigned long addr, pte_t *ptep,
 				    pte_t old_pte, pte_t new_pte);
+
+#define set_pte			__set_pte
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_PGTABLE_H */
diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 0228001347be..44288a12fc6c 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -111,7 +111,7 @@ static int __init set_permissions(pte_t *ptep, unsigned long addr, void *data)
 		pte = set_pte_bit(pte, __pgprot(PTE_PXN));
 	else if (system_supports_bti_kernel() && spd->has_bti)
 		pte = set_pte_bit(pte, __pgprot(PTE_GP));
-	set_pte(ptep, pte);
+	__set_pte(ptep, pte);
 	return 0;
 }
 
diff --git a/arch/arm64/mm/fixmap.c b/arch/arm64/mm/fixmap.c
index c0a3301203bd..51cd4501816d 100644
--- a/arch/arm64/mm/fixmap.c
+++ b/arch/arm64/mm/fixmap.c
@@ -121,7 +121,7 @@ void __set_fixmap(enum fixed_addresses idx,
 	ptep = fixmap_pte(addr);
 
 	if (pgprot_val(flags)) {
-		set_pte(ptep, pfn_pte(phys >> PAGE_SHIFT, flags));
+		__set_pte(ptep, pfn_pte(phys >> PAGE_SHIFT, flags));
 	} else {
 		pte_clear(&init_mm, addr, ptep);
 		flush_tlb_kernel_range(addr, addr+PAGE_SIZE);
 	}
 }
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 555285ebd5af..5eade712e9e5 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -112,7 +112,7 @@ static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr,
 		if (!early)
 			memset(__va(page_phys), KASAN_SHADOW_INIT, PAGE_SIZE);
 		next = addr + PAGE_SIZE;
-		set_pte(ptep, pfn_pte(__phys_to_pfn(page_phys), PAGE_KERNEL));
+		__set_pte(ptep, pfn_pte(__phys_to_pfn(page_phys), PAGE_KERNEL));
 	} while (ptep++, addr = next, addr != end && pte_none(READ_ONCE(*ptep)));
 }
 
@@ -266,7 +266,7 @@ static void __init kasan_init_shadow(void)
 	 * so we should make sure that it maps the zero page read-only.
 	 */
 	for (i = 0; i < PTRS_PER_PTE; i++)
-		set_pte(&kasan_early_shadow_pte[i],
+		__set_pte(&kasan_early_shadow_pte[i],
 			pfn_pte(sym_to_pfn(kasan_early_shadow_page),
 				PAGE_KERNEL_RO));
 
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 15f6347d23b6..e884279b268e 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -178,7 +178,7 @@ static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end,
 	do {
 		pte_t old_pte = READ_ONCE(*ptep);
 
-		set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot));
+		__set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot));
 
 		/*
 		 * After the PTE entry has been populated once, we
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 8e2017ba5f1b..057097acf9e0 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -41,7 +41,7 @@ static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
 	pte = clear_pte_bit(pte, cdata->clear_mask);
 	pte = set_pte_bit(pte, cdata->set_mask);
 
-	set_pte(ptep, pte);
+	__set_pte(ptep, pte);
 	return 0;
 }
 
diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index 7b14df3c6477..230b607cf881 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -41,7 +41,7 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr)
 		 * read only (code, rodata). Clear the RDONLY bit from
 		 * the temporary mappings we use during restore.
 		 */
-		set_pte(dst_ptep, pte_mkwrite_novma(pte));
+		__set_pte(dst_ptep, pte_mkwrite_novma(pte));
 	} else if ((debug_pagealloc_enabled() ||
 		   is_kfence_address((void *)addr)) && !pte_none(pte)) {
 		/*
@@ -55,7 +55,7 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr)
 		 */
 		BUG_ON(!pfn_valid(pte_pfn(pte)));
 
-		set_pte(dst_ptep, pte_mkpresent(pte_mkwrite_novma(pte)));
+		__set_pte(dst_ptep, pte_mkpresent(pte_mkwrite_novma(pte)));
 	}
 }

From patchwork Wed Nov 15 16:30:07 2023
X-Patchwork-Id: 13457057
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v2 03/14] arm64/mm: set_ptes()/set_pte_at(): New layer to
 manage contig bit
Date: Wed, 15 Nov 2023 16:30:07 +0000
Message-Id: <20231115163018.1303287-4-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>

Create a new layer for the in-table PTE manipulation APIs. For now, the
existing API is prefixed with double underscore to become the
arch-private API, and the public API is just a simple wrapper that calls
the private API.

The public API implementation will subsequently be used to transparently
manipulate the contiguous bit where appropriate. But since there are
already some contig-aware users (e.g. hugetlb, kernel mapper), we must
first ensure those users use the private API directly so that the future
contig-bit manipulations in the public API do not interfere with those
existing uses.

set_pte_at() is a core macro that forwards to set_ptes() (with nr=1).
Instead of creating a __set_pte_at() internal macro, convert all arch
users to use set_ptes()/__set_ptes() directly, as appropriate. Callers
in hugetlb may benefit from calling __set_ptes() once for their whole
range rather than managing their own loop. This is left for future
improvement.
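For reference, the core-mm forwarding referred to above looks roughly
like this (paraphrased from include/linux/pgtable.h; check the tree for
the exact form):

	/* set_pte_at() is just the nr == 1 case of set_ptes(). */
	#define set_pte_at(mm, addr, ptep, pte)	set_ptes(mm, addr, ptep, pte, 1)

	/*
	 * So the conversions in this patch are mechanical and
	 * behaviour-preserving:
	 *	set_pte_at(mm, addr, ptep, pte);	// old
	 *	__set_ptes(mm, addr, ptep, pte, 1);	// new, arch-private layer
	 */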
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 10 +++++-----
 arch/arm64/kernel/mte.c          |  2 +-
 arch/arm64/kvm/guest.c           |  2 +-
 arch/arm64/mm/fault.c            |  2 +-
 arch/arm64/mm/hugetlbpage.c      | 10 +++++-----
 5 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 650d4f4bb6dc..323ec91add60 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -342,9 +342,9 @@ static inline void __sync_cache_and_tags(pte_t pte, unsigned int nr_pages)
 		mte_sync_tags(pte, nr_pages);
 }
 
-static inline void set_ptes(struct mm_struct *mm,
-			    unsigned long __always_unused addr,
-			    pte_t *ptep, pte_t pte, unsigned int nr)
+static inline void __set_ptes(struct mm_struct *mm,
+			      unsigned long __always_unused addr,
+			      pte_t *ptep, pte_t pte, unsigned int nr)
 {
 	page_table_check_ptes_set(mm, ptep, pte, nr);
 	__sync_cache_and_tags(pte, nr);
@@ -358,7 +358,6 @@ static inline void set_ptes(struct mm_struct *mm,
 		pte_val(pte) += PAGE_SIZE;
 	}
 }
-#define set_ptes set_ptes
 
 /*
  * Huge pte definitions.
@@ -1067,7 +1066,7 @@ static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
 #endif /* CONFIG_ARM64_MTE */
 
 /*
- * On AArch64, the cache coherency is handled via the set_pte_at() function.
+ * On AArch64, the cache coherency is handled via the __set_ptes() function.
 */
 static inline void update_mmu_cache_range(struct vm_fault *vmf,
 		struct vm_area_struct *vma, unsigned long addr, pte_t *ptep,
@@ -1121,6 +1120,7 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
 				    pte_t old_pte, pte_t new_pte);
 
 #define set_pte			__set_pte
+#define set_ptes		__set_ptes
 
 #endif /* !__ASSEMBLY__ */
 
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index a41ef3213e1e..dcdcccd40891 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -67,7 +67,7 @@ int memcmp_pages(struct page *page1, struct page *page2)
 	/*
 	 * If the page content is identical but at least one of the pages is
 	 * tagged, return non-zero to avoid KSM merging. If only one of the
-	 * pages is tagged, set_pte_at() may zero or change the tags of the
+	 * pages is tagged, __set_ptes() may zero or change the tags of the
 	 * other page via mte_sync_tags().
 	 */
 	if (page_mte_tagged(page1) || page_mte_tagged(page2))
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index aaf1d4939739..629145fd3161 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -1072,7 +1072,7 @@ int kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
 		} else {
 			/*
 			 * Only locking to serialise with a concurrent
-			 * set_pte_at() in the VMM but still overriding the
+			 * __set_ptes() in the VMM but still overriding the
 			 * tags, hence ignoring the return value.
 			 */
 			try_page_mte_tagging(page);
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 460d799e1296..a287c1dea871 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -205,7 +205,7 @@ static void show_pte(unsigned long addr)
 *
 * It needs to cope with hardware update of the accessed/dirty state by other
 * agents in the system and can safely skip the __sync_icache_dcache() call as,
- * like set_pte_at(), the PTE is never changed from no-exec to exec here.
+ * like __set_ptes(), the PTE is never changed from no-exec to exec here.
 *
 * Returns whether or not the PTE actually changed.
 */
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index f5aae342632c..741cb53672fd 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -254,12 +254,12 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 
 	if (!pte_present(pte)) {
 		for (i = 0; i < ncontig; i++, ptep++, addr += pgsize)
-			set_pte_at(mm, addr, ptep, pte);
+			__set_ptes(mm, addr, ptep, pte, 1);
 		return;
 	}
 
 	if (!pte_cont(pte)) {
-		set_pte_at(mm, addr, ptep, pte);
+		__set_ptes(mm, addr, ptep, pte, 1);
 		return;
 	}
 
@@ -270,7 +270,7 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 	clear_flush(mm, addr, ptep, pgsize, ncontig);
 
 	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot));
+		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
 }
 
 pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
@@ -478,7 +478,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	hugeprot = pte_pgprot(pte);
 	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot));
+		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
 
 	return 1;
 }
@@ -507,7 +507,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
 	pfn = pte_pfn(pte);
 
 	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
-		set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot));
+		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);
 }
 
 pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
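The hugetlb loops converted above could, as the commit message notes,
later collapse into a single ranged call. A hypothetical follow-up (not
part of this series) might look like this sketch:

	/* Today: one __set_ptes(..., 1) per entry of the contiguous range. */
	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);

	/*
	 * Hypothetical future form, valid for the pgsize == PAGE_SIZE case
	 * only: __set_ptes() already advances the pfn by one page per entry,
	 * so it could walk the whole range itself.
	 */
	if (pgsize == PAGE_SIZE)
		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), ncontig);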
From patchwork Wed Nov 15 16:30:08 2023
X-Patchwork-Id: 13457059
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v2 04/14] arm64/mm: pte_clear(): New layer to manage contig bit
Date: Wed, 15 Nov 2023 16:30:08 +0000
Message-Id: <20231115163018.1303287-5-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>
Create a new layer for the in-table PTE manipulation APIs. For now, the
existing API is prefixed with double underscore to become the
arch-private API, and the public API is just a simple wrapper that calls
the private API.

The public API implementation will subsequently be used to transparently
manipulate the contiguous bit where appropriate. But since there are
already some contig-aware users (e.g. hugetlb, kernel mapper), we must
first ensure those users use the private API directly so that the future
contig-bit manipulations in the public API do not interfere with those
existing uses.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 3 ++-
 arch/arm64/mm/fixmap.c           | 2 +-
 arch/arm64/mm/hugetlbpage.c      | 2 +-
 arch/arm64/mm/mmu.c              | 2 +-
 4 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 323ec91add60..1464e990580a 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -93,7 +93,7 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
 	__pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
 
 #define pte_none(pte)		(!pte_val(pte))
-#define pte_clear(mm, addr, ptep) \
+#define __pte_clear(mm, addr, ptep) \
 				__set_pte(ptep, __pte(0))
 #define pte_page(pte)		(pfn_to_page(pte_pfn(pte)))
 
@@ -1121,6 +1121,7 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
 
 #define set_pte			__set_pte
 #define set_ptes		__set_ptes
+#define pte_clear		__pte_clear
 
 #endif /* !__ASSEMBLY__ */
 
diff --git a/arch/arm64/mm/fixmap.c b/arch/arm64/mm/fixmap.c
index 51cd4501816d..bfc02568805a 100644
--- a/arch/arm64/mm/fixmap.c
+++ b/arch/arm64/mm/fixmap.c
@@ -123,7 +123,7 @@ void __set_fixmap(enum fixed_addresses idx,
 	if (pgprot_val(flags)) {
 		__set_pte(ptep, pfn_pte(phys >> PAGE_SHIFT, flags));
 	} else {
-		pte_clear(&init_mm, addr, ptep);
+		__pte_clear(&init_mm, addr, ptep);
 		flush_tlb_kernel_range(addr, addr+PAGE_SIZE);
 	}
 }
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 741cb53672fd..510b2d4b89a9 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -400,7 +400,7 @@ void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 	ncontig = num_contig_ptes(sz, &pgsize);
 
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
-		pte_clear(mm, addr, ptep);
+		__pte_clear(mm, addr, ptep);
 }
 
 pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index e884279b268e..080e9b50f595 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -859,7 +859,7 @@ static void unmap_hotplug_pte_range(pmd_t *pmdp, unsigned long addr,
 			continue;
 
 		WARN_ON(!pte_present(pte));
-		pte_clear(&init_mm, addr, ptep);
+		__pte_clear(&init_mm, addr, ptep);
 		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
 		if (free_mapped)
 			free_hotplug_page_range(pte_page(pte),

From patchwork Wed Nov 15 16:30:09 2023
X-Patchwork-Id: 13457060
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v2 05/14] arm64/mm: ptep_get_and_clear(): New layer to manage
 contig bit
Date: Wed, 15 Nov 2023 16:30:09 +0000
Message-Id: <20231115163018.1303287-6-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>
Create a new layer for the in-table PTE manipulation APIs. For now, the
existing API is prefixed with double underscore to become the
arch-private API, and the public API is just a simple wrapper that calls
the private API.

The public API implementation will subsequently be used to transparently
manipulate the contiguous bit where appropriate. But since there are
already some contig-aware users (e.g. hugetlb, kernel mapper), we must
first ensure those users use the private API directly so that the future
contig-bit manipulations in the public API do not interfere with those
existing uses.
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 5 +++--
 arch/arm64/mm/hugetlbpage.c      | 6 +++---
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1464e990580a..994597a0bb0f 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -941,8 +941,7 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
-static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
+static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address, pte_t *ptep)
 {
 	pte_t pte = __pte(xchg_relaxed(&pte_val(*ptep), 0));
@@ -1122,6 +1121,8 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
 #define set_pte			__set_pte
 #define set_ptes		__set_ptes
 #define pte_clear		__pte_clear
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+#define ptep_get_and_clear	__ptep_get_and_clear
 
 #endif /* !__ASSEMBLY__ */
 
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 510b2d4b89a9..c2a753541d13 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -188,7 +188,7 @@ static pte_t get_clear_contig(struct mm_struct *mm,
 	unsigned long i;
 
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++) {
-		pte_t pte = ptep_get_and_clear(mm, addr, ptep);
+		pte_t pte = __ptep_get_and_clear(mm, addr, ptep);
 
 		/*
 		 * If HW_AFDBM is enabled, then the HW could turn on
@@ -236,7 +236,7 @@ static void clear_flush(struct mm_struct *mm,
 	unsigned long i, saddr = addr;
 
 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
-		ptep_clear(mm, addr, ptep);
+		__ptep_get_and_clear(mm, addr, ptep);
 
 	flush_tlb_range(&vma, saddr, addr);
 }
@@ -411,7 +411,7 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 	pte_t orig_pte = ptep_get(ptep);
 
 	if (!pte_cont(orig_pte))
-		return ptep_get_and_clear(mm, addr, ptep);
+		return __ptep_get_and_clear(mm, addr, ptep);
 
 	ncontig = find_num_contig(mm, addr, ptep, &pgsize);

From patchwork Wed Nov 15 16:30:10 2023
X-Patchwork-Id: 13457061
From: Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v2 06/14] arm64/mm: ptep_test_and_clear_young(): New layer to
 manage contig bit
Date: Wed, 15 Nov 2023 16:30:10 +0000
Message-Id: <20231115163018.1303287-7-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>
Create a new layer for the in-table PTE manipulation APIs. For now, the
existing API is prefixed with a double underscore to become the
arch-private API, and the public API is just a simple wrapper that calls
the private API.

The public API implementation will subsequently be used to transparently
manipulate the contiguous bit where appropriate. But since there are
already some contig-aware users (e.g. hugetlb, kernel mapper), we must
first ensure those users call the private API directly so that the future
contig-bit manipulations in the public API do not interfere with those
existing uses.

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 994597a0bb0f..9b4a9909fd5b 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -887,8 +887,9 @@ static inline bool pud_user_accessible_page(pud_t pud)
 /*
  * Atomic pte/pmd modifications.
  */
-#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
-static inline int __ptep_test_and_clear_young(pte_t *ptep)
+static inline int __ptep_test_and_clear_young(struct vm_area_struct *vma,
+					      unsigned long address,
+					      pte_t *ptep)
 {
 	pte_t old_pte, pte;
@@ -903,18 +904,11 @@ static inline int __ptep_test_and_clear_young(pte_t *ptep)
 	return pte_young(pte);
 }

-static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
-					    unsigned long address,
-					    pte_t *ptep)
-{
-	return __ptep_test_and_clear_young(ptep);
-}
-
 #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
 static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
 					 unsigned long address, pte_t *ptep)
 {
-	int young = ptep_test_and_clear_young(vma, address, ptep);
+	int young = __ptep_test_and_clear_young(vma, address, ptep);

 	if (young) {
 		/*
@@ -937,7 +931,7 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 					    unsigned long address,
 					    pmd_t *pmdp)
 {
-	return ptep_test_and_clear_young(vma, address, (pte_t *)pmdp);
+	return __ptep_test_and_clear_young(vma, address, (pte_t *)pmdp);
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
@@ -1123,6 +1117,8 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
 #define pte_clear __pte_clear
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 #define ptep_get_and_clear __ptep_get_and_clear
+#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
+#define ptep_test_and_clear_young __ptep_test_and_clear_young

 #endif /* !__ASSEMBLY__ */
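The body of __ptep_test_and_clear_young(), elided by the hunk context
above, clears the access flag with a compare-and-exchange loop so that a
concurrent hardware update of the entry is never lost. The idiom, lightly
simplified and reconstructed from the surrounding context, looks roughly
like this:

	pte_t old_pte, pte;

	pte = READ_ONCE(*ptep);
	do {
		old_pte = pte;
		pte = pte_mkold(pte);	/* clear the AF (young) bit */
		/* retry if the entry changed underneath us */
		pte_val(pte) = cmpxchg_relaxed(&pte_val(*ptep),
					       pte_val(old_pte), pte_val(pte));
	} while (pte_val(pte) != pte_val(old_pte));

	return pte_young(old_pte);	/* young state before clearing */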
From patchwork Wed Nov 15 16:30:11 2023
From: Ryan Roberts
Subject: [PATCH v2 07/14] arm64/mm: ptep_clear_flush_young(): New layer to manage contig bit
Date: Wed, 15 Nov 2023 16:30:11 +0000
Message-Id: <20231115163018.1303287-8-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>
Create a new layer for the in-table PTE manipulation APIs. For now, the
existing API is prefixed with a double underscore to become the
arch-private API, and the public API is just a simple wrapper that calls
the private API.

The public API implementation will subsequently be used to transparently
manipulate the contiguous bit where appropriate. But since there are
already some contig-aware users (e.g. hugetlb, kernel mapper), we must
first ensure those users call the private API directly so that the future
contig-bit manipulations in the public API do not interfere with those
existing uses.

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 9b4a9909fd5b..fc1005222ee4 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -138,7 +138,7 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
  * so that we don't erroneously return false for pages that have been
  * remapped as PROT_NONE but are yet to be flushed from the TLB.
  * Note that we can't make any assumptions based on the state of the access
- * flag, since ptep_clear_flush_young() elides a DSB when invalidating the
+ * flag, since __ptep_clear_flush_young() elides a DSB when invalidating the
  * TLB.
  */
 #define pte_accessible(mm, pte)	\
@@ -904,8 +904,7 @@ static inline int __ptep_test_and_clear_young(struct vm_area_struct *vma,
 	return pte_young(pte);
 }

-#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
-static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
+static inline int __ptep_clear_flush_young(struct vm_area_struct *vma,
 					 unsigned long address, pte_t *ptep)
 {
 	int young = __ptep_test_and_clear_young(vma, address, ptep);
@@ -1119,6 +1118,8 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
 #define ptep_get_and_clear __ptep_get_and_clear
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
 #define ptep_test_and_clear_young __ptep_test_and_clear_young
+#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
+#define ptep_clear_flush_young __ptep_clear_flush_young

 #endif /* !__ASSEMBLY__ */
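For reference, the renamed function composes the young-clearing helper
with a per-page TLB invalidation whose trailing DSB is deliberately
elided. Reconstructed in outline from the hunk context above (the body
and comment are a sketch, paraphrasing the rationale the pte_accessible()
comment refers to):

	static inline int __ptep_clear_flush_young(struct vm_area_struct *vma,
						   unsigned long address,
						   pte_t *ptep)
	{
		int young = __ptep_test_and_clear_young(vma, address, ptep);

		if (young) {
			/*
			 * The trailing DSB can be elided: the worst case is
			 * that a CPU briefly keeps using a stale young=1 TLB
			 * entry, which is harmless for reclaim decisions.
			 */
			flush_tlb_page_nosync(vma, address);
		}

		return young;
	}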
From patchwork Wed Nov 15 16:30:12 2023
From: Ryan Roberts
Subject: [PATCH v2 08/14] arm64/mm: ptep_set_wrprotect(): New layer to manage contig bit
Date: Wed, 15 Nov 2023 16:30:12 +0000
Message-Id: <20231115163018.1303287-9-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>
Create a new layer for the in-table PTE manipulation APIs. For now, the
existing API is prefixed with a double underscore to become the
arch-private API, and the public API is just a simple wrapper that calls
the private API.

The public API implementation will subsequently be used to transparently
manipulate the contiguous bit where appropriate. But since there are
already some contig-aware users (e.g. hugetlb, kernel mapper), we must
first ensure those users call the private API directly so that the future
contig-bit manipulations in the public API do not interfere with those
existing uses.

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h | 10 ++++++----
 arch/arm64/mm/hugetlbpage.c | 2 +-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index fc1005222ee4..423cc32b2777 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -958,11 +958,11 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */

 /*
- * ptep_set_wrprotect - mark read-only while trasferring potential hardware
+ * __ptep_set_wrprotect - mark read-only while trasferring potential hardware
  * dirty status (PTE_DBM && !PTE_RDONLY) to the software PTE_DIRTY bit.
  */
-#define __HAVE_ARCH_PTEP_SET_WRPROTECT
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *ptep)
+static inline void __ptep_set_wrprotect(struct mm_struct *mm,
+					unsigned long address, pte_t *ptep)
 {
 	pte_t old_pte, pte;
@@ -980,7 +980,7 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
 static inline void pmdp_set_wrprotect(struct mm_struct *mm,
 				      unsigned long address, pmd_t *pmdp)
 {
-	ptep_set_wrprotect(mm, address, (pte_t *)pmdp);
+	__ptep_set_wrprotect(mm, address, (pte_t *)pmdp);
 }

 #define pmdp_establish pmdp_establish
@@ -1120,6 +1120,8 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
 #define ptep_test_and_clear_young __ptep_test_and_clear_young
 #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
 #define ptep_clear_flush_young __ptep_clear_flush_young
+#define __HAVE_ARCH_PTEP_SET_WRPROTECT
+#define ptep_set_wrprotect __ptep_set_wrprotect

 #endif /* !__ASSEMBLY__ */

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index c2a753541d13..952462820d9d 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -493,7 +493,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
 	pte_t pte;

 	if (!pte_cont(READ_ONCE(*ptep))) {
-		ptep_set_wrprotect(mm, addr, ptep);
+		__ptep_set_wrprotect(mm, addr, ptep);
 		return;
 	}
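The "transfer" the comment above describes is worth making concrete. With
hardware dirty-bit management (DBM), a writable-clean entry has PTE_DBM
set and PTE_RDONLY clear, and hardware marks it dirty by clearing
PTE_RDONLY; write-protecting must therefore capture that state first. A
sketch of the transfer logic with a hypothetical example_ name (the real
kernel folds this into pte_wrprotect(), and details may differ):

	static inline pte_t example_wrprotect(pte_t pte)
	{
		/*
		 * Hardware-dirty: DBM set and RDONLY clear. Preserve the
		 * dirty state in the software PTE_DIRTY bit before making
		 * the entry read-only.
		 */
		if (pte_hw_dirty(pte))
			pte = pte_mkdirty(pte);

		pte = clear_pte_bit(pte, __pgprot(PTE_WRITE)); /* PTE_WRITE aliases PTE_DBM */
		pte = set_pte_bit(pte, __pgprot(PTE_RDONLY));
		return pte;
	}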
From patchwork Wed Nov 15 16:30:13 2023
From: Ryan Roberts
Subject: [PATCH v2 09/14] arm64/mm: ptep_set_access_flags(): New layer to manage contig bit
Date: Wed, 15 Nov 2023 16:30:13 +0000
Message-Id: <20231115163018.1303287-10-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>
Create a new layer for the in-table PTE manipulation APIs. For now, the
existing API is prefixed with a double underscore to become the
arch-private API, and the public API is just a simple wrapper that calls
the private API.

The public API implementation will subsequently be used to transparently
manipulate the contiguous bit where appropriate. But since there are
already some contig-aware users (e.g. hugetlb, kernel mapper), we must
first ensure those users call the private API directly so that the future
contig-bit manipulations in the public API do not interfere with those
existing uses.

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h | 10 ++++++----
 arch/arm64/mm/fault.c | 6 +++---
 arch/arm64/mm/hugetlbpage.c | 2 +-
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 423cc32b2777..85010c2d4dfa 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -312,7 +312,7 @@ static inline void __check_safe_pte_update(struct mm_struct *mm, pte_t *ptep,
 	/*
 	 * Check for potential race with hardware updates of the pte
-	 * (ptep_set_access_flags safely changes valid ptes without going
+	 * (__ptep_set_access_flags safely changes valid ptes without going
 	 * through an invalid entry).
 	 */
 	VM_WARN_ONCE(!pte_young(pte),
@@ -842,8 +842,7 @@ static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
 	return pte_pmd(pte_modify(pmd_pte(pmd), newprot));
 }

-#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
-extern int ptep_set_access_flags(struct vm_area_struct *vma,
+extern int __ptep_set_access_flags(struct vm_area_struct *vma,
 				 unsigned long address, pte_t *ptep,
 				 pte_t entry, int dirty);

@@ -853,7 +852,8 @@ static inline int pmdp_set_access_flags(struct vm_area_struct *vma,
 					unsigned long address, pmd_t *pmdp,
 					pmd_t entry, int dirty)
 {
-	return ptep_set_access_flags(vma, address, (pte_t *)pmdp, pmd_pte(entry), dirty);
+	return __ptep_set_access_flags(vma, address, (pte_t *)pmdp,
+				       pmd_pte(entry), dirty);
 }

 static inline int pud_devmap(pud_t pud)
@@ -1122,6 +1122,8 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
 #define ptep_clear_flush_young __ptep_clear_flush_young
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
 #define ptep_set_wrprotect __ptep_set_wrprotect
+#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
+#define ptep_set_access_flags __ptep_set_access_flags

 #endif /* !__ASSEMBLY__ */

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index a287c1dea871..7cebd9847aae 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -209,9 +209,9 @@ static void show_pte(unsigned long addr)
  *
  * Returns whether or not the PTE actually changed.
  */
-int ptep_set_access_flags(struct vm_area_struct *vma,
-			  unsigned long address, pte_t *ptep,
-			  pte_t entry, int dirty)
+int __ptep_set_access_flags(struct vm_area_struct *vma,
+			    unsigned long address, pte_t *ptep,
+			    pte_t entry, int dirty)
 {
 	pteval_t old_pteval, pteval;
 	pte_t pte = READ_ONCE(*ptep);

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 952462820d9d..627a9717e98c 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -459,7 +459,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	pte_t orig_pte;

 	if (!pte_cont(pte))
-		return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
+		return __ptep_set_access_flags(vma, addr, ptep, pte, dirty);

 	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
 	dpfn = pgsize >> PAGE_SHIFT;
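Nothing changes for generic code: the core-mm keeps calling
ptep_set_access_flags(), which now resolves to the private implementation
through the macro alias above. An illustrative caller in the style of the
generic fault path (a sketch, not code from this patch):

	/* Upgrade access/dirty bits on a present pte during a fault. */
	if (ptep_set_access_flags(vma, address, ptep, entry, dirty))
		update_mmu_cache(vma, address, ptep);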
From patchwork Wed Nov 15 16:30:14 2023
From: Ryan Roberts
Subject: [PATCH v2 10/14] arm64/mm: ptep_get(): New layer to manage contig bit
Date: Wed, 15 Nov 2023 16:30:14 +0000
Message-Id: <20231115163018.1303287-11-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>

Create a new layer for the in-table PTE manipulation APIs. For now, the
existing API is prefixed with a double underscore to become the
arch-private API, and the public API is just a simple wrapper that calls
the private API.

The public API implementation will subsequently be used to transparently
manipulate the contiguous bit where appropriate. But since there are
already some contig-aware users (e.g. hugetlb, kernel mapper), we must
first ensure those users call the private API directly so that the future
contig-bit manipulations in the public API do not interfere with those
existing uses.
arm64 did not previously define an arch-specific ptep_get(), so override
the default version in the arch code, and also define the private
__ptep_get() version. Currently they both do the same thing as the
default version (a READ_ONCE()). Some arch users (hugetlb) were already
using ptep_get(), so convert those to the private API. Other callsites
were doing a direct READ_ONCE(), so convert those to use the appropriate
(public/private) API too.

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h | 12 +++++++++---
 arch/arm64/kernel/efi.c | 2 +-
 arch/arm64/mm/fault.c | 4 ++--
 arch/arm64/mm/hugetlbpage.c | 18 +++++++++---------
 arch/arm64/mm/kasan_init.c | 2 +-
 arch/arm64/mm/mmu.c | 12 ++++++------
 arch/arm64/mm/pageattr.c | 4 ++--
 arch/arm64/mm/trans_pgd.c | 2 +-
 8 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 85010c2d4dfa..6930c14f062f 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -276,6 +276,11 @@ static inline void __set_pte(pte_t *ptep, pte_t pte)
 	}
 }

+static inline pte_t __ptep_get(pte_t *ptep)
+{
+	return READ_ONCE(*ptep);
+}
+
 extern void __sync_icache_dcache(pte_t pteval);
 bool pgattr_change_is_safe(u64 old, u64 new);
@@ -303,7 +308,7 @@ static inline void __check_safe_pte_update(struct mm_struct *mm, pte_t *ptep,
 	if (!IS_ENABLED(CONFIG_DEBUG_VM))
 		return;

-	old_pte = READ_ONCE(*ptep);
+	old_pte = __ptep_get(ptep);

 	if (!pte_valid(old_pte) || !pte_valid(pte))
 		return;
@@ -893,7 +898,7 @@ static inline int __ptep_test_and_clear_young(struct vm_area_struct *vma,
 {
 	pte_t old_pte, pte;

-	pte = READ_ONCE(*ptep);
+	pte = __ptep_get(ptep);
 	do {
 		old_pte = pte;
 		pte = pte_mkold(pte);
@@ -966,7 +971,7 @@ static inline void __ptep_set_wrprotect(struct mm_struct *mm,
 {
 	pte_t old_pte, pte;

-	pte = READ_ONCE(*ptep);
+	pte = __ptep_get(ptep);
 	do {
 		old_pte = pte;
 		pte = pte_wrprotect(pte);
@@ -1111,6 +1116,7 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
 				    unsigned long addr, pte_t *ptep,
 				    pte_t old_pte, pte_t new_pte);

+#define ptep_get __ptep_get
 #define set_pte __set_pte
 #define set_ptes __set_ptes
 #define pte_clear __pte_clear

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 44288a12fc6c..9afcc690fe73 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -103,7 +103,7 @@ static int __init set_permissions(pte_t *ptep, unsigned long addr, void *data)
 {
 	struct set_perm_data *spd = data;
 	const efi_memory_desc_t *md = spd->md;
-	pte_t pte = READ_ONCE(*ptep);
+	pte_t pte = __ptep_get(ptep);

 	if (md->attribute & EFI_MEMORY_RO)
 		pte = set_pte_bit(pte, __pgprot(PTE_RDONLY));

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 7cebd9847aae..d63f3a0a7251 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -191,7 +191,7 @@ static void show_pte(unsigned long addr)
 		if (!ptep)
 			break;

-		pte = READ_ONCE(*ptep);
+		pte = __ptep_get(ptep);
 		pr_cont(", pte=%016llx", pte_val(pte));
 		pte_unmap(ptep);
 	} while(0);
@@ -214,7 +214,7 @@ int __ptep_set_access_flags(struct vm_area_struct *vma,
 			    pte_t entry, int dirty)
 {
 	pteval_t old_pteval, pteval;
-	pte_t pte = READ_ONCE(*ptep);
+	pte_t pte = __ptep_get(ptep);

 	if (pte_same(pte, entry))
 		return 0;

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 627a9717e98c..52fb767607e0 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -152,14 +152,14 @@ pte_t huge_ptep_get(pte_t *ptep)
 {
 	int ncontig, i;
 	size_t pgsize;
-	pte_t orig_pte = ptep_get(ptep);
+	pte_t orig_pte = __ptep_get(ptep);

 	if (!pte_present(orig_pte) || !pte_cont(orig_pte))
 		return orig_pte;

 	ncontig = num_contig_ptes(page_size(pte_page(orig_pte)), &pgsize);
 	for (i = 0; i < ncontig; i++, ptep++) {
-		pte_t pte = ptep_get(ptep);
+		pte_t pte = __ptep_get(ptep);

 		if (pte_dirty(pte))
 			orig_pte = pte_mkdirty(orig_pte);
@@ -184,7 +184,7 @@ static pte_t get_clear_contig(struct mm_struct *mm,
 				 unsigned long pgsize,
 				 unsigned long ncontig)
 {
-	pte_t orig_pte = ptep_get(ptep);
+	pte_t orig_pte = __ptep_get(ptep);
 	unsigned long i;

 	for (i = 0; i < ncontig; i++, addr += pgsize, ptep++) {
@@ -408,7 +408,7 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 {
 	int ncontig;
 	size_t pgsize;
-	pte_t orig_pte = ptep_get(ptep);
+	pte_t orig_pte = __ptep_get(ptep);

 	if (!pte_cont(orig_pte))
 		return __ptep_get_and_clear(mm, addr, ptep);
@@ -431,11 +431,11 @@ static int __cont_access_flags_changed(pte_t *ptep, pte_t pte, int ncontig)
 {
 	int i;

-	if (pte_write(pte) != pte_write(ptep_get(ptep)))
+	if (pte_write(pte) != pte_write(__ptep_get(ptep)))
 		return 1;

 	for (i = 0; i < ncontig; i++) {
-		pte_t orig_pte = ptep_get(ptep + i);
+		pte_t orig_pte = __ptep_get(ptep + i);

 		if (pte_dirty(pte) != pte_dirty(orig_pte))
 			return 1;
@@ -492,7 +492,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
 	size_t pgsize;
 	pte_t pte;

-	if (!pte_cont(READ_ONCE(*ptep))) {
+	if (!pte_cont(__ptep_get(ptep))) {
 		__ptep_set_wrprotect(mm, addr, ptep);
 		return;
 	}
@@ -517,7 +517,7 @@ pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 	size_t pgsize;
 	int ncontig;

-	if (!pte_cont(READ_ONCE(*ptep)))
+	if (!pte_cont(__ptep_get(ptep)))
 		return ptep_clear_flush(vma, addr, ptep);

 	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
@@ -550,7 +550,7 @@ pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr
 	 * when the permission changes from executable to non-executable
 	 * in cases where cpu is affected with errata #2645198.
 	 */
-	if (pte_user_exec(READ_ONCE(*ptep)))
+	if (pte_user_exec(__ptep_get(ptep)))
 		return huge_ptep_clear_flush(vma, addr, ptep);
 	}
 	return huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);

diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 5eade712e9e5..5274c317d775 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -113,7 +113,7 @@ static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr,
 		memset(__va(page_phys), KASAN_SHADOW_INIT, PAGE_SIZE);
 		next = addr + PAGE_SIZE;
 		__set_pte(ptep, pfn_pte(__phys_to_pfn(page_phys), PAGE_KERNEL));
-	} while (ptep++, addr = next, addr != end && pte_none(READ_ONCE(*ptep)));
+	} while (ptep++, addr = next, addr != end && pte_none(__ptep_get(ptep)));
 }

 static void __init kasan_pmd_populate(pud_t *pudp, unsigned long addr,

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 080e9b50f595..784f1e312447 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -176,7 +176,7 @@ static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end,
 	ptep = pte_set_fixmap_offset(pmdp, addr);
 	do {
-		pte_t old_pte = READ_ONCE(*ptep);
+		pte_t old_pte = __ptep_get(ptep);

 		__set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot));

@@ -185,7 +185,7 @@ static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end,
 		 * only allow updates to the permission attributes.
 		 */
 		BUG_ON(!pgattr_change_is_safe(pte_val(old_pte),
-					      READ_ONCE(pte_val(*ptep))));
+					      pte_val(__ptep_get(ptep))));

 		phys += PAGE_SIZE;
 	} while (ptep++, addr += PAGE_SIZE, addr != end);
@@ -854,7 +854,7 @@ static void unmap_hotplug_pte_range(pmd_t *pmdp, unsigned long addr,
 	do {
 		ptep = pte_offset_kernel(pmdp, addr);
-		pte = READ_ONCE(*ptep);
+		pte = __ptep_get(ptep);
 		if (pte_none(pte))
 			continue;
@@ -987,7 +987,7 @@ static void free_empty_pte_table(pmd_t *pmdp, unsigned long addr,
 	do {
 		ptep = pte_offset_kernel(pmdp, addr);
-		pte = READ_ONCE(*ptep);
+		pte = __ptep_get(ptep);

 		/*
 		 * This is just a sanity check here which verifies that
@@ -1006,7 +1006,7 @@ static void free_empty_pte_table(pmd_t *pmdp, unsigned long addr,
 	 */
 	ptep = pte_offset_kernel(pmdp, 0UL);
 	for (i = 0; i < PTRS_PER_PTE; i++) {
-		if (!pte_none(READ_ONCE(ptep[i])))
+		if (!pte_none(__ptep_get(&ptep[i])))
 			return;
 	}
@@ -1475,7 +1475,7 @@ pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr, pte
 	 * when the permission changes from executable to non-executable
 	 * in cases where cpu is affected with errata #2645198.
 	 */
-	if (pte_user_exec(READ_ONCE(*ptep)))
+	if (pte_user_exec(ptep_get(ptep)))
 		return ptep_clear_flush(vma, addr, ptep);
 	}
 	return ptep_get_and_clear(vma->vm_mm, addr, ptep);

diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 057097acf9e0..624b0b0982e3 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -36,7 +36,7 @@ static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
 {
 	struct page_change_data *cdata = data;
-	pte_t pte = READ_ONCE(*ptep);
+	pte_t pte = __ptep_get(ptep);

 	pte = clear_pte_bit(pte, cdata->clear_mask);
 	pte = set_pte_bit(pte, cdata->set_mask);
@@ -246,5 +246,5 @@ bool kernel_page_present(struct page *page)
 		return true;

 	ptep = pte_offset_kernel(pmdp, addr);
-	return pte_valid(READ_ONCE(*ptep));
+	return pte_valid(__ptep_get(ptep));
 }

diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index 230b607cf881..5139a28130c0 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -33,7 +33,7 @@ static void *trans_alloc(struct trans_pgd_info *info)

 static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr)
 {
-	pte_t pte = READ_ONCE(*src_ptep);
+	pte_t pte = __ptep_get(src_ptep);

 	if (pte_valid(pte)) {
 		/*
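With ptep_get() routed through an explicit #define, the public entry
point becomes the hook where contpte handling can later be attached. A
sketch of the shape the wrapper takes once patch 12 lands (anticipating
that patch; pte_valid_cont() and contpte_ptep_get() are introduced there,
so this is illustrative rather than code from this patch):

	static inline pte_t ptep_get(pte_t *ptep)
	{
		pte_t pte = __ptep_get(ptep);

		/* Non-contiguous entries behave exactly as before. */
		if (!pte_valid_cont(pte))
			return pte;

		/* Gather access/dirty state from the whole contpte block. */
		return contpte_ptep_get(ptep, pte);
	}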
From patchwork Wed Nov 15 16:30:15 2023
From: Ryan Roberts
Subject: [PATCH v2 11/14] arm64/mm: Split __flush_tlb_range() to elide trailing DSB
Date: Wed, 15 Nov 2023 16:30:15 +0000
Message-Id: <20231115163018.1303287-12-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>
Split __flush_tlb_range() into __flush_tlb_range_nosync() +
__flush_tlb_range(), in the same way as the existing flush_tlb_page()
arrangement. This allows calling __flush_tlb_range_nosync() to elide the
trailing DSB. Forthcoming "contpte" code will take advantage of this when
clearing the young bit from a contiguous range of ptes.
Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/tlbflush.h | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index bb2c2833a987..925ef3bdf9ed 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -399,7 +399,7 @@ do {									\
 #define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
 	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false)

-static inline void __flush_tlb_range(struct vm_area_struct *vma,
+static inline void __flush_tlb_range_nosync(struct vm_area_struct *vma,
 				     unsigned long start, unsigned long end,
 				     unsigned long stride, bool last_level,
 				     int tlb_level)
@@ -431,10 +431,19 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 	else
 		__flush_tlb_range_op(vae1is, start, pages, stride, asid, tlb_level, true);

-	dsb(ish);
 	mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end);
 }

+static inline void __flush_tlb_range(struct vm_area_struct *vma,
+				     unsigned long start, unsigned long end,
+				     unsigned long stride, bool last_level,
+				     int tlb_level)
+{
+	__flush_tlb_range_nosync(vma, start, end, stride,
+				 last_level, tlb_level);
+	dsb(ish);
+}
+
 static inline void flush_tlb_range(struct vm_area_struct *vma,
 				   unsigned long start, unsigned long end)
 {
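The point of the split is that a caller can batch several invalidations
behind a single barrier. An illustrative (hypothetical) caller, in the
spirit of the forthcoming contpte young-bit code; the stride and level
arguments here are assumptions for the sake of the example:

	/* Issue the TLBI instructions for the range, but do not wait. */
	__flush_tlb_range_nosync(vma, start, end, PAGE_SIZE, true, 3);

	/* ...potentially issue further invalidations here... */

	/* One DSB completes all of the invalidations issued above. */
	dsb(ish);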
From patchwork Wed Nov 15 16:30:16 2023
From: Ryan Roberts
Subject: [PATCH v2 12/14] arm64/mm: Wire up PTE_CONT for user mappings
Date: Wed, 15 Nov 2023 16:30:16 +0000
Message-Id: <20231115163018.1303287-13-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>

With the ptep API sufficiently refactored, we can now introduce a new
"contpte" API layer, which transparently manages the
PTE_CONT bit for user mappings. Whenever it detects a set of PTEs that
meet the requirements for a contiguous range, the PTEs are re-painted
with the PTE_CONT bit.

Use of contpte mappings is intended to be transparent to the core-mm,
which continues to interact with individual ptes. Since a contpte block
only has a single access and dirty bit, the semantic here changes
slightly; when getting a pte (e.g. ptep_get()) that is part of a contpte
mapping, the access and dirty information are pulled from the block (so
all ptes in the block return the same access/dirty info). When changing
the access/dirty info on a pte (e.g. ptep_set_access_flags()) that is
part of a contpte mapping, this change will affect the whole contpte
block. This works fine in practice since we guarantee that only a single
folio is mapped by a contpte block, and the core-mm tracks access/dirty
information per folio.

This initial change provides a baseline that can be optimized in future
commits. That said, fold/unfold operations (which imply tlb
invalidation) are avoided where possible with a few tricks for
access/dirty bit management. Write-protect modifications for contpte
mappings are currently non-optimal, and incur a regression in fork()
performance. This will be addressed in follow-up changes.

In order for the public functions, which used to be pure inline, to
continue to be callable by modules, export all the contpte_* symbols
that are now called by those public inline functions.

The feature is enabled/disabled with the ARM64_CONTPTE Kconfig parameter
at build time. It defaults to enabled as long as its dependency,
TRANSPARENT_HUGEPAGE, is also enabled. The core-mm depends upon
TRANSPARENT_HUGEPAGE to be able to allocate large folios, so if it is
not enabled, then there is no chance of meeting the physical contiguity
requirement for contpte mappings.
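For concreteness: with a 4K translation granule, CONT_PTES is 16, so a contpte
block covers CONT_PTE_SIZE (64K) of virtual address space and must map 16
physically contiguous, naturally aligned pages of a single folio. A minimal
eligibility check along those lines (an illustrative sketch only; the real
checks live in contpte_try_fold()/__contpte_try_fold() in the diff below):

-------
/*
 * Sketch: could the nr pages mapped at addr, starting at pfn, be
 * painted with PTE_CONT? (4K granule assumed, so CONT_PTES == 16.)
 */
static bool could_be_contpte(unsigned long addr, unsigned long pfn,
			     unsigned int nr)
{
	/* whole block, va 64K-aligned, pa aligned to 16 pages */
	return nr == CONT_PTES &&
	       (addr & (CONT_PTE_SIZE - 1)) == 0 &&
	       (pfn & (CONT_PTES - 1)) == 0;
}
-------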
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/Kconfig               |  10 +-
 arch/arm64/include/asm/pgtable.h | 202 ++++++++++++++++++
 arch/arm64/mm/Makefile           |   1 +
 arch/arm64/mm/contpte.c          | 351 +++++++++++++++++++++++++++++++
 4 files changed, 563 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/mm/contpte.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7b071a00425d..de76e484ff3a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2209,6 +2209,15 @@ config UNWIND_PATCH_PAC_INTO_SCS
 	select UNWIND_TABLES
 	select DYNAMIC_SCS
 
+config ARM64_CONTPTE
+	bool "Contiguous PTE mappings for user memory" if EXPERT
+	depends on TRANSPARENT_HUGEPAGE
+	default y
+	help
+	  When enabled, user mappings are configured using the PTE contiguous
+	  bit, for any mappings that meet the size and alignment requirements.
+	  This reduces TLB pressure and improves performance.
+
 endmenu # "Kernel Features"
 
 menu "Boot options"
@@ -2318,4 +2327,3 @@ endmenu # "CPU Power Management"
 source "drivers/acpi/Kconfig"
 
 source "arch/arm64/kvm/Kconfig"
-

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 6930c14f062f..15bc9cf1eef4 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -133,6 +133,10 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
  */
 #define pte_valid_not_user(pte) \
 	((pte_val(pte) & (PTE_VALID | PTE_USER | PTE_UXN)) == (PTE_VALID | PTE_UXN))
+/*
+ * Returns true if the pte is valid and has the contiguous bit set.
+ */
+#define pte_valid_cont(pte)	(pte_valid(pte) && pte_cont(pte))
 /*
  * Could the pte be present in the TLB? We must check mm_tlb_flush_pending
  * so that we don't erroneously return false for pages that have been
@@ -1116,6 +1120,202 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
 				    unsigned long addr, pte_t *ptep,
 				    pte_t old_pte, pte_t new_pte);
 
+#ifdef CONFIG_ARM64_CONTPTE
+
+/*
+ * The contpte APIs are used to transparently manage the contiguous bit in ptes
+ * where it is possible and makes sense to do so. The PTE_CONT bit is considered
+ * a private implementation detail of the public ptep API (see below).
+ */
+extern void __contpte_try_fold(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, pte_t pte);
+extern void __contpte_try_unfold(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, pte_t pte);
+extern pte_t contpte_ptep_get(pte_t *ptep, pte_t orig_pte);
+extern pte_t contpte_ptep_get_lockless(pte_t *orig_ptep);
+extern void contpte_set_ptes(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, pte_t pte, unsigned int nr);
+extern int contpte_ptep_test_and_clear_young(struct vm_area_struct *vma,
+				unsigned long addr, pte_t *ptep);
+extern int contpte_ptep_clear_flush_young(struct vm_area_struct *vma,
+				unsigned long addr, pte_t *ptep);
+extern int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
+				unsigned long addr, pte_t *ptep,
+				pte_t entry, int dirty);
+
+static inline pte_t *contpte_align_down(pte_t *ptep)
+{
+	return (pte_t *)(ALIGN_DOWN((unsigned long)ptep >> 3, CONT_PTES) << 3);
+}
+
+static inline bool contpte_is_enabled(struct mm_struct *mm)
+{
+	/*
+	 * Don't attempt to apply the contig bit to kernel mappings, because
+	 * dynamically adding/removing the contig bit can cause page faults.
+	 * These racing faults are ok for user space, since they get serialized
+	 * on the PTL. But kernel mappings can't tolerate faults.
+	 */
+
+	return mm != &init_mm;
+}
+
+static inline void contpte_try_fold(struct mm_struct *mm, unsigned long addr,
+				    pte_t *ptep, pte_t pte)
+{
+	/*
+	 * Only bother trying if both the virtual and physical addresses are
+	 * aligned and correspond to the last entry in a contig range. The core
+	 * code mostly modifies ranges from low to high, so this is likely the
+	 * last modification in the contig range, so a good time to fold.
+	 * We can't fold special mappings, because there is no associated folio.
+	 */
+
+	bool valign = ((unsigned long)ptep >> 3) % CONT_PTES == CONT_PTES - 1;
+	bool palign = pte_pfn(pte) % CONT_PTES == CONT_PTES - 1;
+
+	if (contpte_is_enabled(mm) && valign && palign &&
+	    pte_valid(pte) && !pte_cont(pte) && !pte_special(pte))
+		__contpte_try_fold(mm, addr, ptep, pte);
+}
+
+static inline void contpte_try_unfold(struct mm_struct *mm, unsigned long addr,
+				      pte_t *ptep, pte_t pte)
+{
+	if (contpte_is_enabled(mm) && pte_valid_cont(pte))
+		__contpte_try_unfold(mm, addr, ptep, pte);
+}
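The pointer arithmetic in contpte_align_down() above relies on sizeof(pte_t)
being 8 bytes on arm64: shifting the pointer right by 3 turns it into a pte
index, ALIGN_DOWN rounds that index down to a CONT_PTES boundary, and shifting
left by 3 turns it back into a pointer. An equivalent formulation, as a sketch
under that same assumption:

-------
static inline pte_t *contpte_align_down_alt(pte_t *ptep)
{
	/*
	 * A page table is page aligned, so the index of ptep within its
	 * table, modulo CONT_PTES, is the entry's offset into its block.
	 */
	unsigned long idx = (unsigned long)ptep / sizeof(pte_t);

	return ptep - (idx % CONT_PTES);
}
-------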
+
+/*
+ * The below functions constitute the public API that arm64 presents to the
+ * core-mm to manipulate PTE entries within their page tables (or at least
+ * this is the subset of the API that arm64 needs to implement). These public
+ * versions will automatically and transparently apply the contiguous bit where
+ * it makes sense to do so. Therefore any users that are contig-aware (e.g.
+ * hugetlb, kernel mapper) should NOT use these APIs, but instead use the
+ * private versions, which are prefixed with double underscore. All of these
+ * APIs except for ptep_get_lockless() are expected to be called with the PTL
+ * held.
+ */
+
+#define ptep_get ptep_get
+static inline pte_t ptep_get(pte_t *ptep)
+{
+	pte_t pte = __ptep_get(ptep);
+
+	if (!pte_valid_cont(pte))
+		return pte;
+
+	return contpte_ptep_get(ptep, pte);
+}
+
+#define ptep_get_lockless ptep_get_lockless
+static inline pte_t ptep_get_lockless(pte_t *ptep)
+{
+	pte_t pte = __ptep_get(ptep);
+
+	if (!pte_valid_cont(pte))
+		return pte;
+
+	return contpte_ptep_get_lockless(ptep);
+}
+
+static inline void set_pte(pte_t *ptep, pte_t pte)
+{
+	/*
+	 * We don't have the mm or vaddr so cannot unfold or fold contig entries
+	 * (since it requires tlb maintenance). set_pte() is not used in core
+	 * code, so this should never even be called. Regardless, do our best to
+	 * service any call and emit a warning if there is any attempt to set a
+	 * pte on top of an existing contig range.
+	 */
+	pte_t orig_pte = __ptep_get(ptep);
+
+	WARN_ON_ONCE(pte_valid_cont(orig_pte));
+	__set_pte(ptep, pte_mknoncont(pte));
+}
+
+#define set_ptes set_ptes
+static inline void set_ptes(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, pte_t pte, unsigned int nr)
+{
+	pte = pte_mknoncont(pte);
+
+	if (!contpte_is_enabled(mm))
+		__set_ptes(mm, addr, ptep, pte, nr);
+	else if (nr == 1) {
+		contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep));
+		__set_ptes(mm, addr, ptep, pte, nr);
+		contpte_try_fold(mm, addr, ptep, pte);
+	} else
+		contpte_set_ptes(mm, addr, ptep, pte, nr);
+}
+
+static inline void pte_clear(struct mm_struct *mm,
+				unsigned long addr, pte_t *ptep)
+{
+	contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep));
+	__pte_clear(mm, addr, ptep);
+}
+
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
+				unsigned long addr, pte_t *ptep)
+{
+	contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep));
+	return __ptep_get_and_clear(mm, addr, ptep);
+}
+
+#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
+static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
+				unsigned long addr, pte_t *ptep)
+{
+	pte_t orig_pte = __ptep_get(ptep);
+
+	if (!pte_valid_cont(orig_pte))
+		return __ptep_test_and_clear_young(vma, addr, ptep);
+
+	return contpte_ptep_test_and_clear_young(vma, addr, ptep);
+}
+
+#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
+static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
+				unsigned long addr, pte_t *ptep)
+{
+	pte_t orig_pte = __ptep_get(ptep);
+
+	if (!pte_valid_cont(orig_pte))
+		return __ptep_clear_flush_young(vma, addr, ptep);
+
+	return contpte_ptep_clear_flush_young(vma, addr, ptep);
+}
+
+#define __HAVE_ARCH_PTEP_SET_WRPROTECT
+static inline void ptep_set_wrprotect(struct mm_struct *mm,
+				unsigned long addr, pte_t *ptep)
+{
+	contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep));
+	__ptep_set_wrprotect(mm, addr, ptep);
+	contpte_try_fold(mm, addr, ptep, __ptep_get(ptep));
+}
+
+#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
+static inline int ptep_set_access_flags(struct vm_area_struct *vma,
+				unsigned long addr, pte_t *ptep,
+				pte_t entry, int dirty)
+{
+	pte_t orig_pte = __ptep_get(ptep);
+
+	entry = pte_mknoncont(entry);
+
+	if (!pte_valid_cont(orig_pte))
+		return __ptep_set_access_flags(vma, addr, ptep, entry, dirty);
+
+	return contpte_ptep_set_access_flags(vma, addr, ptep, entry, dirty);
+}
+
+#else /* CONFIG_ARM64_CONTPTE */
+
 #define ptep_get				__ptep_get
 #define set_pte					__set_pte
 #define set_ptes				__set_ptes
@@ -1131,6 +1331,8 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
 #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
 #define ptep_set_access_flags			__ptep_set_access_flags
 
+#endif /* CONFIG_ARM64_CONTPTE */
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_PGTABLE_H */
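From the core-mm side the dispatch above is invisible; callers just set a
batch of ptes and the wrapper decides whether the block qualifies for
PTE_CONT. A hypothetical caller (map_folio() and its arguments are made up for
illustration):

-------
static void map_folio(struct mm_struct *mm, unsigned long addr,
		      pte_t *ptep, struct folio *folio, pgprot_t prot)
{
	pte_t pte = pfn_pte(folio_pfn(folio), prot);

	/*
	 * For a 16-page folio with addr and pfn both 64K-aligned, this
	 * lands in contpte_set_ptes(), which writes the entries with
	 * PTE_CONT set; otherwise 16 ordinary ptes are written.
	 */
	set_ptes(mm, addr, ptep, pte, folio_nr_pages(folio));
}
-------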
diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
index dbd1bc95967d..60454256945b 100644
--- a/arch/arm64/mm/Makefile
+++ b/arch/arm64/mm/Makefile
@@ -3,6 +3,7 @@ obj-y				:= dma-mapping.o extable.o fault.o init.o \
 				   cache.o copypage.o flush.o \
 				   ioremap.o mmap.o pgd.o mmu.o \
 				   context.o proc.o pageattr.o fixmap.o
+obj-$(CONFIG_ARM64_CONTPTE)	+= contpte.o
 obj-$(CONFIG_HUGETLB_PAGE)	+= hugetlbpage.o
 obj-$(CONFIG_PTDUMP_CORE)	+= ptdump.o
 obj-$(CONFIG_PTDUMP_DEBUGFS)	+= ptdump_debugfs.o

diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
new file mode 100644
index 000000000000..667bcf7c3260
--- /dev/null
+++ b/arch/arm64/mm/contpte.c
@@ -0,0 +1,351 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023 ARM Ltd.
+ */
+
+#include <linux/mm.h>
+#include <linux/export.h>
+#include <asm/tlbflush.h>
+
+static void ptep_clear_flush_range(struct mm_struct *mm, unsigned long addr,
+				   pte_t *ptep, int nr)
+{
+	struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
+	unsigned long start_addr = addr;
+	int i;
+
+	for (i = 0; i < nr; i++, ptep++, addr += PAGE_SIZE)
+		__pte_clear(mm, addr, ptep);
+
+	__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
+}
+
+static bool ptep_any_valid(pte_t *ptep, int nr)
+{
+	int i;
+
+	for (i = 0; i < nr; i++, ptep++) {
+		if (pte_valid(__ptep_get(ptep)))
+			return true;
+	}
+
+	return false;
+}
+
+static void contpte_fold(struct mm_struct *mm, unsigned long addr,
+			 pte_t *ptep, pte_t pte, bool fold)
+{
+	struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
+	unsigned long start_addr;
+	pte_t *start_ptep;
+	int i;
+
+	start_ptep = ptep = contpte_align_down(ptep);
+	start_addr = addr = ALIGN_DOWN(addr, CONT_PTE_SIZE);
+	pte = pfn_pte(ALIGN_DOWN(pte_pfn(pte), CONT_PTES), pte_pgprot(pte));
+	pte = fold ? pte_mkcont(pte) : pte_mknoncont(pte);
+
+	for (i = 0; i < CONT_PTES; i++, ptep++, addr += PAGE_SIZE) {
+		pte_t ptent = __ptep_get_and_clear(mm, addr, ptep);
+
+		if (pte_dirty(ptent))
+			pte = pte_mkdirty(pte);
+
+		if (pte_young(ptent))
+			pte = pte_mkyoung(pte);
+	}
+
+	__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
+
+	__set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);
+}
+
+void __contpte_try_fold(struct mm_struct *mm, unsigned long addr,
+			pte_t *ptep, pte_t pte)
+{
+	/*
+	 * We have already checked that the virtual and physical addresses are
+	 * correctly aligned for a contpte mapping in contpte_try_fold() so the
+	 * remaining checks are to ensure that the contpte range is fully
+	 * covered by a single folio, and ensure that all the ptes are valid
+	 * with contiguous PFNs and matching prots. We ignore the state of the
+	 * access and dirty bits for the purpose of deciding if it is a
+	 * contiguous range; the folding process will generate a single contpte
+	 * entry which has a single access and dirty bit. Those 2 bits are the
+	 * logical OR of their respective bits in the constituent pte entries.
+	 * In order to ensure the contpte range is covered by a single folio, we
+	 * must recover the folio from the pfn, but special mappings don't have
+	 * a folio backing them. Fortunately contpte_try_fold() already checked
+	 * that the pte is not special - we never try to fold special mappings.
+	 * Note we can't use vm_normal_page() for this since we don't have the
+	 * vma.
+	 */
+
+	struct page *page = pte_page(pte);
+	struct folio *folio = page_folio(page);
+	unsigned long folio_saddr = addr - (page - &folio->page) * PAGE_SIZE;
+	unsigned long folio_eaddr = folio_saddr + folio_nr_pages(folio) * PAGE_SIZE;
+	unsigned long cont_saddr = ALIGN_DOWN(addr, CONT_PTE_SIZE);
+	unsigned long cont_eaddr = cont_saddr + CONT_PTE_SIZE;
+	unsigned long pfn;
+	pgprot_t prot;
+	pte_t subpte;
+	pte_t *orig_ptep;
+	int i;
+
+	if (folio_saddr > cont_saddr || folio_eaddr < cont_eaddr)
+		return;
+
+	pfn = pte_pfn(pte) - ((addr - cont_saddr) >> PAGE_SHIFT);
+	prot = pte_pgprot(pte_mkold(pte_mkclean(pte)));
+	orig_ptep = ptep;
+	ptep = contpte_align_down(ptep);
+
+	for (i = 0; i < CONT_PTES; i++, ptep++, pfn++) {
+		subpte = __ptep_get(ptep);
+		subpte = pte_mkold(pte_mkclean(subpte));
+
+		if (!pte_valid(subpte) ||
+		    pte_pfn(subpte) != pfn ||
+		    pgprot_val(pte_pgprot(subpte)) != pgprot_val(prot))
+			return;
+	}
+
+	contpte_fold(mm, addr, orig_ptep, pte, true);
+}
+EXPORT_SYMBOL(__contpte_try_fold);
+
+void __contpte_try_unfold(struct mm_struct *mm, unsigned long addr,
+			  pte_t *ptep, pte_t pte)
+{
+	/*
+	 * We have already checked that the ptes are contiguous in
+	 * contpte_try_unfold(), so we can unfold unconditionally here.
+	 */
+
+	contpte_fold(mm, addr, ptep, pte, false);
+}
+EXPORT_SYMBOL(__contpte_try_unfold);
+
+pte_t contpte_ptep_get(pte_t *ptep, pte_t orig_pte)
+{
+	/*
+	 * Gather access/dirty bits, which may be populated in any of the ptes
+	 * of the contig range. We are guaranteed to be holding the PTL, so any
+	 * contiguous range cannot be unfolded or otherwise modified under our
+	 * feet.
+	 */
+
+	pte_t pte;
+	int i;
+
+	ptep = contpte_align_down(ptep);
+
+	for (i = 0; i < CONT_PTES; i++, ptep++) {
+		pte = __ptep_get(ptep);
+
+		if (pte_dirty(pte))
+			orig_pte = pte_mkdirty(orig_pte);
+
+		if (pte_young(pte))
+			orig_pte = pte_mkyoung(orig_pte);
+	}
+
+	return orig_pte;
+}
+EXPORT_SYMBOL(contpte_ptep_get);
+
+pte_t contpte_ptep_get_lockless(pte_t *orig_ptep)
+{
+	/*
+	 * Gather access/dirty bits, which may be populated in any of the ptes
+	 * of the contig range. We may not be holding the PTL, so any contiguous
+	 * range may be unfolded/modified/refolded under our feet. Therefore we
+	 * ensure we read a _consistent_ contpte range by checking that all ptes
+	 * in the range are valid and have PTE_CONT set, that all pfns are
+	 * contiguous and that all pgprots are the same (ignoring access/dirty).
+	 * If we find a pte that is not consistent, then we must be racing with
+	 * an update so start again. If the target pte does not have PTE_CONT
+	 * set then that is considered consistent on its own because it is not
+	 * part of a contpte range.
+	 */
+
+	pte_t orig_pte;
+	pgprot_t orig_prot;
+	pte_t *ptep;
+	unsigned long pfn;
+	pte_t pte;
+	pgprot_t prot;
+	int i;
+
+retry:
+	orig_pte = __ptep_get(orig_ptep);
+
+	if (!pte_valid_cont(orig_pte))
+		return orig_pte;
+
+	orig_prot = pte_pgprot(pte_mkold(pte_mkclean(orig_pte)));
+	ptep = contpte_align_down(orig_ptep);
+	pfn = pte_pfn(orig_pte) - (orig_ptep - ptep);
+
+	for (i = 0; i < CONT_PTES; i++, ptep++, pfn++) {
+		pte = __ptep_get(ptep);
+		prot = pte_pgprot(pte_mkold(pte_mkclean(pte)));
+
+		if (!pte_valid_cont(pte) ||
+		    pte_pfn(pte) != pfn ||
+		    pgprot_val(prot) != pgprot_val(orig_prot))
+			goto retry;
+
+		if (pte_dirty(pte))
+			orig_pte = pte_mkdirty(orig_pte);
+
+		if (pte_young(pte))
+			orig_pte = pte_mkyoung(orig_pte);
+	}
+
+	return orig_pte;
+}
+EXPORT_SYMBOL(contpte_ptep_get_lockless);
+
+void contpte_set_ptes(struct mm_struct *mm, unsigned long addr,
+			pte_t *ptep, pte_t pte, unsigned int nr)
+{
+	unsigned long next;
+	unsigned long end = addr + (nr << PAGE_SHIFT);
+	unsigned long pfn = pte_pfn(pte);
+	pgprot_t prot = pte_pgprot(pte);
+	pte_t orig_pte;
+
+	do {
+		next = pte_cont_addr_end(addr, end);
+		nr = (next - addr) >> PAGE_SHIFT;
+		pte = pfn_pte(pfn, prot);
+
+		if (((addr | next | (pfn << PAGE_SHIFT)) & ~CONT_PTE_MASK) == 0)
+			pte = pte_mkcont(pte);
+		else
+			pte = pte_mknoncont(pte);
+
+		/*
+		 * If operating on a partial contiguous range then we must first
+		 * unfold the contiguous range if it was previously folded.
+		 * Otherwise we could end up with overlapping tlb entries.
+		 */
+		if (nr != CONT_PTES)
+			contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep));
+
+		/*
+		 * If we are replacing ptes that were contiguous or if the new
+		 * ptes are contiguous and any of the ptes being replaced are
+		 * valid, we need to clear and flush the range to prevent
+		 * overlapping tlb entries.
+		 */
+		orig_pte = __ptep_get(ptep);
+		if (pte_valid_cont(orig_pte) ||
+		    (pte_cont(pte) && ptep_any_valid(ptep, nr)))
+			ptep_clear_flush_range(mm, addr, ptep, nr);
+
+		__set_ptes(mm, addr, ptep, pte, nr);
+
+		addr = next;
+		ptep += nr;
+		pfn += nr;
+
+	} while (addr != end);
+}
+EXPORT_SYMBOL(contpte_set_ptes);
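contpte_set_ptes() above splits the incoming batch at CONT_PTE_SIZE boundaries
with pte_cont_addr_end(), which is defined elsewhere in arm64's pgtable.h and
not visible in this hunk. Conceptually it is the usual *_addr_end() idiom,
something like the following sketch (mirroring pmd_addr_end()):

-------
/* Sketch: clamp to the next CONT_PTE_SIZE boundary, or to end. */
#define pte_cont_addr_end(addr, end)					\
({	unsigned long __boundary = ((addr) + CONT_PTE_SIZE) & CONT_PTE_MASK; \
	(__boundary - 1 < (end) - 1) ? __boundary : (end);		\
})
-------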
+
+int contpte_ptep_test_and_clear_young(struct vm_area_struct *vma,
+					unsigned long addr, pte_t *ptep)
+{
+	/*
+	 * ptep_clear_flush_young() technically requires us to clear the access
+	 * flag for a _single_ pte. However, the core-mm code actually tracks
+	 * access/dirty per folio, not per page. And since we only create a
+	 * contig range when the range is covered by a single folio, we can get
+	 * away with clearing young for the whole contig range here, so we avoid
+	 * having to unfold.
+	 */
+
+	int i;
+	int young = 0;
+
+	ptep = contpte_align_down(ptep);
+	addr = ALIGN_DOWN(addr, CONT_PTE_SIZE);
+
+	for (i = 0; i < CONT_PTES; i++, ptep++, addr += PAGE_SIZE)
+		young |= __ptep_test_and_clear_young(vma, addr, ptep);
+
+	return young;
+}
+EXPORT_SYMBOL(contpte_ptep_test_and_clear_young);
+
+int contpte_ptep_clear_flush_young(struct vm_area_struct *vma,
+					unsigned long addr, pte_t *ptep)
+{
+	int young;
+
+	young = contpte_ptep_test_and_clear_young(vma, addr, ptep);
+
+	if (young) {
+		/*
+		 * See comment in __ptep_clear_flush_young(); same rationale for
+		 * eliding the trailing DSB applies here.
+		 */
+		addr = ALIGN_DOWN(addr, CONT_PTE_SIZE);
+		__flush_tlb_range_nosync(vma, addr, addr + CONT_PTE_SIZE,
+					 PAGE_SIZE, true, 3);
+	}
+
+	return young;
+}
+EXPORT_SYMBOL(contpte_ptep_clear_flush_young);
+
+int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
+					unsigned long addr, pte_t *ptep,
+					pte_t entry, int dirty)
+{
+	pte_t orig_pte;
+	int i;
+	unsigned long start_addr;
+
+	/*
+	 * Gather the access/dirty bits for the contiguous range. If nothing has
+	 * changed, it's a no-op.
+	 */
+	orig_pte = ptep_get(ptep);
+	if (pte_val(orig_pte) == pte_val(entry))
+		return 0;
+
+	/*
+	 * We can fix up access/dirty bits without having to unfold/fold the
+	 * contig range. But if the write bit is changing, we need to go through
+	 * the full unfold/fold cycle.
+	 */
+	if (pte_write(orig_pte) == pte_write(entry)) {
+		/*
+		 * For HW access management, we technically only need to update
+		 * the flag on a single pte in the range. But for SW access
+		 * management, we need to update all the ptes to prevent extra
+		 * faults. Avoid per-page tlb flush in __ptep_set_access_flags()
+		 * and instead flush the whole range at the end.
+		 */
+		ptep = contpte_align_down(ptep);
+		start_addr = addr = ALIGN_DOWN(addr, CONT_PTE_SIZE);
+
+		for (i = 0; i < CONT_PTES; i++, ptep++, addr += PAGE_SIZE)
+			__ptep_set_access_flags(vma, addr, ptep, entry, 0);
+
+		if (dirty)
+			__flush_tlb_range(vma, start_addr, addr,
+					  PAGE_SIZE, true, 3);
+	} else {
+		__contpte_try_unfold(vma->vm_mm, addr, ptep, orig_pte);
+		__ptep_set_access_flags(vma, addr, ptep, entry, dirty);
+		contpte_try_fold(vma->vm_mm, addr, ptep, entry);
+	}
+
+	return 1;
+}
+EXPORT_SYMBOL(contpte_ptep_set_access_flags);
From patchwork Wed Nov 15 16:30:17 2023
From: Ryan Roberts <ryan.roberts@arm.com>
Cc: linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 13/14] arm64/mm: Implement ptep_set_wrprotects() to optimize fork()
Date: Wed, 15 Nov 2023 16:30:17 +0000
Message-Id: <20231115163018.1303287-14-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>
With the core-mm changes in place to batch-copy ptes during fork, we can
take advantage of this in arm64 to greatly reduce the number of tlbis we
have to issue, and recover the lost fork performance incurred when
adding support for transparent contiguous ptes.

If we are write-protecting a whole contig range, we can apply the
write-protection to the whole range and know that it won't change
whether the range should have the contiguous bit set or not. For ranges
smaller than the contig range, we will still have to unfold, apply the
write-protection, then fold if the change now means the range is
foldable.

This optimization is possible thanks to the tightening of the Arm ARM
with respect to the definition and behaviour when 'Misprogramming the
Contiguous bit'. See section D21194 at
https://developer.arm.com/documentation/102105/latest/

Performance tested with the following test written for the will-it-scale
framework (the headers and the SZ_128M definition below are not part of
the harness and are shown here only so the listing is self-contained):

-------
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

#define SZ_128M (128 * 1024 * 1024)

char *testcase_description = "fork and exit";

void testcase(unsigned long long *iterations, unsigned long nr)
{
	int pid;
	char *mem;

	mem = malloc(SZ_128M);
	assert(mem);
	memset(mem, 1, SZ_128M);

	while (1) {
		pid = fork();
		assert(pid >= 0);

		if (!pid)
			exit(0);

		waitpid(pid, NULL, 0);

		(*iterations)++;
	}
}
-------

I saw a huge performance regression when PTE_CONT support was added;
that regression is then mostly fixed with the addition of this change.
The following shows the regression relative to before PTE_CONT was
enabled (a bigger negative value is a bigger regression):

| cpus |  before opt |  after opt |
|-----:|------------:|-----------:|
|    1 |      -10.4% |      -5.2% |
|    8 |      -15.4% |      -3.5% |
|   16 |      -38.7% |      -3.7% |
|   24 |      -57.0% |      -4.4% |
|   32 |      -65.8% |      -5.4% |

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 30 ++++++++++++++++++++---
 arch/arm64/mm/contpte.c          | 42 ++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 15bc9cf1eef4..9bd2f57a9e11 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -984,6 +984,16 @@ static inline void __ptep_set_wrprotect(struct mm_struct *mm,
 	} while (pte_val(pte) != pte_val(old_pte));
 }
 
+static inline void __ptep_set_wrprotects(struct mm_struct *mm,
+				unsigned long address, pte_t *ptep,
+				unsigned int nr)
+{
+	unsigned int i;
+
+	for (i = 0; i < nr; i++, address += PAGE_SIZE, ptep++)
+		__ptep_set_wrprotect(mm, address, ptep);
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define __HAVE_ARCH_PMDP_SET_WRPROTECT
 static inline void pmdp_set_wrprotect(struct mm_struct *mm,
@@ -1139,6 +1149,8 @@ extern int contpte_ptep_test_and_clear_young(struct vm_area_struct *vma,
 				unsigned long addr, pte_t *ptep);
 extern int contpte_ptep_clear_flush_young(struct vm_area_struct *vma,
 				unsigned long addr, pte_t *ptep);
+extern void contpte_set_wrprotects(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, unsigned int nr);
 extern int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
 				unsigned long addr, pte_t *ptep,
 				pte_t entry, int dirty);
@@ -1290,13 +1302,25 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
 	return contpte_ptep_clear_flush_young(vma, addr, ptep);
 }
 
+#define ptep_set_wrprotects ptep_set_wrprotects
+static inline void ptep_set_wrprotects(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, unsigned int nr)
+{
+	if (!contpte_is_enabled(mm))
+		__ptep_set_wrprotects(mm, addr, ptep, nr);
+	else if (nr == 1) {
+		contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep));
+		__ptep_set_wrprotects(mm, addr, ptep, 1);
+		contpte_try_fold(mm, addr, ptep, __ptep_get(ptep));
+	} else
+		contpte_set_wrprotects(mm, addr, ptep, nr);
+}
+
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
 static inline void ptep_set_wrprotect(struct mm_struct *mm,
 				unsigned long addr, pte_t *ptep)
 {
-	contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep));
-	__ptep_set_wrprotect(mm, addr, ptep);
-	contpte_try_fold(mm, addr, ptep, __ptep_get(ptep));
+	ptep_set_wrprotects(mm, addr, ptep, 1);
 }
 
 #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
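From the caller's side, the point of the nr parameter is that a fully folded
contpte block can be write-protected in a single call, with no unfold and
therefore no tlbi. A hypothetical batching caller (illustrative only; the real
caller is the core-mm fork path from earlier in this series):

-------
static void wrprotect_block(struct mm_struct *src_mm, unsigned long addr,
			    pte_t *src_ptep)
{
	/* Whole block: fast path in contpte_set_wrprotects(), no tlbi. */
	ptep_set_wrprotects(src_mm, addr, src_ptep, CONT_PTES);
}
-------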
diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index 667bcf7c3260..426be9cd4dea 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -302,6 +302,48 @@ int contpte_ptep_clear_flush_young(struct vm_area_struct *vma,
 }
 EXPORT_SYMBOL(contpte_ptep_clear_flush_young);
 
+void contpte_set_wrprotects(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, unsigned int nr)
+{
+	unsigned long next;
+	unsigned long end = addr + (nr << PAGE_SHIFT);
+
+	do {
+		next = pte_cont_addr_end(addr, end);
+		nr = (next - addr) >> PAGE_SHIFT;
+
+		/*
+		 * If wrprotecting an entire contig range, we can avoid
+		 * unfolding. Just set wrprotect and wait for the later
+		 * mmu_gather flush to invalidate the tlb. Until the flush, the
+		 * page may or may not be wrprotected. After the flush, it is
+		 * guaranteed wrprotected. If it is a partial range though, we
+		 * must unfold, because we can't have a case where PTE_CONT is
+		 * set but wrprotect applies to a subset of the PTEs; this would
+		 * cause it to continue to be unpredictable after the flush.
+		 */
+		if (nr != CONT_PTES)
+			contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep));
+
+		__ptep_set_wrprotects(mm, addr, ptep, nr);
+
+		addr = next;
+		ptep += nr;
+
+		/*
+		 * If applying to a partial contig range, the change could have
+		 * made the range foldable. Use the last pte in the range we
+		 * just set for comparison, since contpte_try_fold() only
+		 * triggers when acting on the last pte in the contig range.
+		 */
+		if (nr != CONT_PTES)
+			contpte_try_fold(mm, addr - PAGE_SIZE, ptep - 1,
+					 __ptep_get(ptep - 1));
+
+	} while (addr != end);
+}
+EXPORT_SYMBOL(contpte_set_wrprotects);
+
 int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
 				unsigned long addr, pte_t *ptep,
 				pte_t entry, int dirty)
From patchwork Wed Nov 15 16:30:18 2023
From: Ryan Roberts <ryan.roberts@arm.com>
Cc: linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 14/14] arm64/mm: Add ptep_get_and_clear_full() to optimize process teardown
Date: Wed, 15 Nov 2023 16:30:18 +0000
Message-Id: <20231115163018.1303287-15-ryan.roberts@arm.com>
In-Reply-To: <20231115163018.1303287-1-ryan.roberts@arm.com>

ptep_get_and_clear_full() adds a 'full' parameter which is not present
for the fallback ptep_get_and_clear() function. 'full' is set to 1 when
a full address space teardown is in progress. We use this information to
optimize arm64_sys_exit_group() by avoiding unfolding (and therefore
tlbi) contiguous ranges. Instead we just clear the PTE but allow all the
contiguous neighbours to keep their contig bit set, because we know we
are about to clear the rest too.
Before this optimization, the cost of arm64_sys_exit_group() exploded to
32x what it was before PTE_CONT support was wired up, when compiling the
kernel. With this optimization in place, we are back down to the
original cost.

This approach is not perfect though, as for the duration between
returning from the first call to ptep_get_and_clear_full() and making
the final call, the contpte block is in an intermediate state, where
some ptes are cleared and others are still set with the PTE_CONT bit. If
any other APIs are called for the ptes in the contpte block during that
time, we have to be very careful. The core code currently interleaves
calls to ptep_get_and_clear_full() with ptep_get() and so ptep_get()
must be careful to ignore the cleared entries when accumulating the
access and dirty bits - the same goes for ptep_get_lockless(). The only
other calls we might reasonably expect are to set markers in the
previously cleared ptes. (We shouldn't see valid entries being set until
after the tlbi, at which point we are no longer in the intermediate
state). Since markers are not valid, this is safe; set_ptes() will see
the old, invalid entry and will not attempt to unfold. And the new pte
is also invalid so it won't attempt to fold. We shouldn't see this for
the 'full' case anyway.

The last remaining issue is returning the access/dirty bits. That info
could be present in any of the ptes in the contpte block. ptep_get()
will gather those bits from across the contpte block. We don't bother
doing that here, because we know that the information is used by the
core-mm to mark the underlying folio as accessed/dirty. And since the
same folio must be underpinning the whole block (that was a requirement
for folding in the first place), that information will make it to the
folio eventually once all the ptes have been cleared. This approach
means we don't have to play games with accumulating and storing the
bits. It does mean that any interleaved calls to ptep_get() may lack
correct access/dirty information if we have already cleared the pte that
happened to store it. The core code does not rely on this though.
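To make the calling convention concrete, here is a sketch of a teardown-style
loop (hypothetical caller; in reality this is the core-mm zap path). With
full == 1, each pte of a folded block is cleared individually with no unfold
and no tlbi per entry; the mmu_gather flush at the end invalidates the lot:

-------
/* Clear nr ptes during full address space teardown (sketch only). */
static void clear_ptes_full(struct mm_struct *mm, unsigned long addr,
			    pte_t *ptep, unsigned int nr)
{
	unsigned int i;

	for (i = 0; i < nr; i++, ptep++, addr += PAGE_SIZE) {
		/*
		 * The returned pte carries (possibly partial) access and
		 * dirty bits, which the real caller feeds into folio
		 * accounting; see the commit message above for why
		 * partial information is acceptable here.
		 */
		ptep_get_and_clear_full(mm, addr, ptep, 1);
	}
}
-------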
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 18 +++++++++--
 arch/arm64/mm/contpte.c          | 54 ++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 9bd2f57a9e11..ea58a9f4e700 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1145,6 +1145,8 @@ extern pte_t contpte_ptep_get(pte_t *ptep, pte_t orig_pte);
 extern pte_t contpte_ptep_get_lockless(pte_t *orig_ptep);
 extern void contpte_set_ptes(struct mm_struct *mm, unsigned long addr,
 				pte_t *ptep, pte_t pte, unsigned int nr);
+extern pte_t contpte_ptep_get_and_clear_full(struct mm_struct *mm,
+				unsigned long addr, pte_t *ptep);
 extern int contpte_ptep_test_and_clear_young(struct vm_area_struct *vma,
 				unsigned long addr, pte_t *ptep);
 extern int contpte_ptep_clear_flush_young(struct vm_area_struct *vma,
@@ -1270,12 +1272,24 @@ static inline void pte_clear(struct mm_struct *mm,
 	__pte_clear(mm, addr, ptep);
 }
 
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL
+static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
+				unsigned long addr, pte_t *ptep, int full)
+{
+	pte_t orig_pte = __ptep_get(ptep);
+
+	if (!pte_valid_cont(orig_pte) || !full) {
+		contpte_try_unfold(mm, addr, ptep, orig_pte);
+		return __ptep_get_and_clear(mm, addr, ptep);
+	} else
+		return contpte_ptep_get_and_clear_full(mm, addr, ptep);
+}
+
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				unsigned long addr, pte_t *ptep)
 {
-	contpte_try_unfold(mm, addr, ptep, __ptep_get(ptep));
-	return __ptep_get_and_clear(mm, addr, ptep);
+	return ptep_get_and_clear_full(mm, addr, ptep, 0);
 }
 
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG

diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index 426be9cd4dea..5d1aaed82d32 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -144,6 +144,14 @@ pte_t contpte_ptep_get(pte_t *ptep, pte_t orig_pte)
 	for (i = 0; i < CONT_PTES; i++, ptep++) {
 		pte = __ptep_get(ptep);
 
+		/*
+		 * Deal with the partial contpte_ptep_get_and_clear_full() case,
+		 * where some of the ptes in the range may be cleared but others
+		 * are still to do. See contpte_ptep_get_and_clear_full().
+		 */
+		if (pte_val(pte) == 0)
+			continue;
+
 		if (pte_dirty(pte))
 			orig_pte = pte_mkdirty(orig_pte);
 
@@ -256,6 +264,52 @@ void contpte_set_ptes(struct mm_struct *mm, unsigned long addr,
 }
 EXPORT_SYMBOL(contpte_set_ptes);
 
+pte_t contpte_ptep_get_and_clear_full(struct mm_struct *mm,
+					unsigned long addr, pte_t *ptep)
+{
+	/*
+	 * When doing a full address space teardown, we can avoid unfolding the
+	 * contiguous range, and therefore avoid the associated tlbi. Instead,
+	 * just get and clear the pte. The caller is promising to call us for
+	 * every pte, so every pte in the range will be cleared by the time the
+	 * tlbi is issued.
+	 *
+	 * This approach is not perfect though, as for the duration between
+	 * returning from the first call to ptep_get_and_clear_full() and making
+	 * the final call, the contpte block is in an intermediate state, where
+	 * some ptes are cleared and others are still set with the PTE_CONT bit.
+	 * If any other APIs are called for the ptes in the contpte block during
+	 * that time, we have to be very careful. The core code currently
+	 * interleaves calls to ptep_get_and_clear_full() with ptep_get() and so
+	 * ptep_get() must be careful to ignore the cleared entries when
+	 * accumulating the access and dirty bits - the same goes for
+	 * ptep_get_lockless(). The only other calls we might reasonably expect
+	 * are to set markers in the previously cleared ptes. (We shouldn't see
+	 * valid entries being set until after the tlbi, at which point we are
+	 * no longer in the intermediate state). Since markers are not valid,
+	 * this is safe; set_ptes() will see the old, invalid entry and will not
+	 * attempt to unfold. And the new pte is also invalid so it won't
+	 * attempt to fold. We shouldn't see this for the 'full' case anyway.
+	 *
+	 * The last remaining issue is returning the access/dirty bits. That
+	 * info could be present in any of the ptes in the contpte block.
+	 * ptep_get() will gather those bits from across the contpte block. We
+	 * don't bother doing that here, because we know that the information is
+	 * used by the core-mm to mark the underlying folio as accessed/dirty.
+	 * And since the same folio must be underpinning the whole block (that
+	 * was a requirement for folding in the first place), that information
+	 * will make it to the folio eventually once all the ptes have been
+	 * cleared. This approach means we don't have to play games with
+	 * accumulating and storing the bits. It does mean that any interleaved
+	 * calls to ptep_get() may lack correct access/dirty information if we
+	 * have already cleared the pte that happened to store it. The core code
+	 * does not rely on this though.
+	 */
+
+	return __ptep_get_and_clear(mm, addr, ptep);
+}
+EXPORT_SYMBOL(contpte_ptep_get_and_clear_full);
+
 int contpte_ptep_test_and_clear_young(struct vm_area_struct *vma,
 				unsigned long addr, pte_t *ptep)
 {