From patchwork Wed Feb 5 15:09:49 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13961287
From: Ryan Roberts <ryan.roberts@arm.com>
To: Catalin Marinas, Will Deacon, Muchun Song, Pasha Tatashin,
	Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Mark Rutland,
	Ard Biesheuvel, Anshuman Khandual, Dev Jain, Alexandre Ghiti,
	Steve Capper, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v1 09/16] arm64/mm: Avoid barriers for invalid or userspace mappings
Date: Wed, 5 Feb 2025 15:09:49 +0000
Message-ID: <20250205151003.88959-10-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
References: <20250205151003.88959-1-ryan.roberts@arm.com>

__set_pte_complete(), set_pmd(), set_pud(), set_p4d() and set_pgd() are
used to write entries into pgtables. They issue barriers (currently dsb
and isb) to ensure that the written values are observed by the table
walker prior to any program-order-future memory access to the mapped
location.

Over the years some of these functions have received optimizations: In
particular, commit 7f0b1bf04511 ("arm64: Fix barriers used for page
table modifications") made it so that the barriers were only emitted
for valid-kernel mappings for set_pte() (now __set_pte_complete()). And
commit 0795edaf3f1f ("arm64: pgtable: Implement p[mu]d_valid() and
check in set_p[mu]d()") made it so that set_pmd()/set_pud() only
emitted the barriers for valid mappings. set_p4d()/set_pgd() continue
to emit the barriers unconditionally.

This is all very confusing to the casual observer; surely the rules
should be invariant to the level? Let's change this so that every level
consistently emits the barriers only when setting valid, non-user
entries (both table and leaf).

It seems obvious that if it is ok to elide barriers for all but valid
kernel mappings at pte level, it must also be ok to do this for leaf
entries at other levels: If setting an entry to invalid, a TLB
maintenance operation must surely follow to synchronise the TLB, and
that operation contains the required barriers. If setting a valid user
mapping, the previous mapping must have been invalid and there must
have been a TLB maintenance operation (complete with barriers) to
honour break-before-make. So the worst that can happen is we take an
extra fault (which will imply the DSB + ISB) and conclude that there is
nothing to do.
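To make that rule concrete, here is a minimal standalone sketch of the
decision the pte-level path already makes: barriers are only worth
emitting when a valid kernel entry is being installed. This is not
kernel code; the bit positions and the example_needs_barriers() helper
are made up for illustration only.

/*
 * Illustrative userspace model: emit dsb/isb only for valid kernel
 * entries. Bit positions are invented for the example; they are not
 * the real arm64 descriptor layout.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define EX_VALID	(1ULL << 0)	/* entry is valid */
#define EX_USER		(1ULL << 6)	/* entry is accessible from user space */

/* Simplified stand-in for a "valid and not user" check. */
static bool example_needs_barriers(uint64_t desc)
{
	return (desc & (EX_VALID | EX_USER)) == EX_VALID;
}

int main(void)
{
	/* 0: invalid entry; the next access faults, TLBI path has barriers. */
	printf("invalid entry      -> %d\n", example_needs_barriers(0));
	/* 0: valid user entry; break-before-make already synchronised. */
	printf("valid user entry   -> %d\n", example_needs_barriers(EX_VALID | EX_USER));
	/* 1: valid kernel entry; dsb(ishst) + isb() required. */
	printf("valid kernel entry -> %d\n", example_needs_barriers(EX_VALID));
	return 0;
}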
These are the arguments for doing this optimization at pte level and
they also apply to leaf mappings at other levels.

For table entries, the same arguments hold: If unsetting a table entry,
TLB maintenance is required and this will emit the required barriers.
If setting a table entry, the previous value must have been invalid and
the table walker must already be able to observe that. Additionally the
contents of the pgtable being pointed to in the newly set entry must be
visible before the entry is written and this is enforced via smp_wmb()
(dmb) in the pgtable allocation functions and in
__split_huge_pmd_locked(). But this last part could never have been
enforced by the barriers in set_pXd() because they occur after updating
the entry. So ultimately, the worst that can happen by eliding these
barriers for user table entries is an extra fault.

I observe roughly the same number of page faults (107M) with and
without this change when compiling the kernel on Apple M2.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 60 ++++++++++++++++++++++++++++----
 1 file changed, 54 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1d428e9c0e5a..ff358d983583 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -767,6 +767,19 @@ static inline bool in_swapper_pgdir(void *addr)
 	       ((unsigned long)swapper_pg_dir & PAGE_MASK);
 }
 
+static inline bool pmd_valid_not_user(pmd_t pmd)
+{
+	/*
+	 * User-space table pmd entries always have (PXN && !UXN). All other
+	 * combinations indicate it's a table entry for kernel space.
+	 * Valid-not-user leaf entries follow the same rules as
+	 * pte_valid_not_user().
+	 */
+	if (pmd_table(pmd))
+		return !((pmd_val(pmd) & (PMD_TABLE_PXN | PMD_TABLE_UXN)) == PMD_TABLE_PXN);
+	return pte_valid_not_user(pmd_pte(pmd));
+}
+
 static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 {
 #ifdef __PAGETABLE_PMD_FOLDED
@@ -778,7 +791,7 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 
 	WRITE_ONCE(*pmdp, pmd);
 
-	if (pmd_valid(pmd)) {
+	if (pmd_valid_not_user(pmd)) {
 		dsb(ishst);
 		isb();
 	}
@@ -836,6 +849,17 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 
 static inline bool pgtable_l4_enabled(void);
 
+
+static inline bool pud_valid_not_user(pud_t pud)
+{
+	/*
+	 * Follows the same rules as pmd_valid_not_user().
+	 */
+	if (pud_table(pud))
+		return !((pud_val(pud) & (PUD_TABLE_PXN | PUD_TABLE_UXN)) == PUD_TABLE_PXN);
+	return pte_valid_not_user(pud_pte(pud));
+}
+
 static inline void set_pud(pud_t *pudp, pud_t pud)
 {
 	if (!pgtable_l4_enabled() && in_swapper_pgdir(pudp)) {
@@ -845,7 +869,7 @@ static inline void set_pud(pud_t *pudp, pud_t pud)
 
 	WRITE_ONCE(*pudp, pud);
 
-	if (pud_valid(pud)) {
+	if (pud_valid_not_user(pud)) {
 		dsb(ishst);
 		isb();
 	}
@@ -917,6 +941,16 @@ static inline bool mm_pud_folded(const struct mm_struct *mm)
 #define p4d_bad(p4d)		(pgtable_l4_enabled() && !(p4d_val(p4d) & P4D_TABLE_BIT))
 #define p4d_present(p4d)	(!p4d_none(p4d))
 
+static inline bool p4d_valid_not_user(p4d_t p4d)
+{
+	/*
+	 * User-space table p4d entries always have (PXN && !UXN). All other
+	 * combinations indicate it's a table entry for kernel space. p4d block
+	 * entries are not supported.
+	 */
+	return !((p4d_val(p4d) & (P4D_TABLE_PXN | P4D_TABLE_UXN)) == P4D_TABLE_PXN);
+}
+
 static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 {
 	if (in_swapper_pgdir(p4dp)) {
@@ -925,8 +959,11 @@ static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 	}
 
 	WRITE_ONCE(*p4dp, p4d);
-	dsb(ishst);
-	isb();
+
+	if (p4d_valid_not_user(p4d)) {
+		dsb(ishst);
+		isb();
+	}
 }
 
 static inline void p4d_clear(p4d_t *p4dp)
@@ -1044,6 +1081,14 @@ static inline bool mm_p4d_folded(const struct mm_struct *mm)
 #define pgd_bad(pgd)		(pgtable_l5_enabled() && !(pgd_val(pgd) & PGD_TABLE_BIT))
 #define pgd_present(pgd)	(!pgd_none(pgd))
 
+static inline bool pgd_valid_not_user(pgd_t pgd)
+{
+	/*
+	 * Follows the same rules as p4d_valid_not_user().
+	 */
+	return !((pgd_val(pgd) & (PGD_TABLE_PXN | PGD_TABLE_UXN)) == PGD_TABLE_PXN);
+}
+
 static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 {
 	if (in_swapper_pgdir(pgdp)) {
@@ -1052,8 +1097,11 @@ static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 	}
 
 	WRITE_ONCE(*pgdp, pgd);
-	dsb(ishst);
-	isb();
+
+	if (pgd_valid_not_user(pgd)) {
+		dsb(ishst);
+		isb();
+	}
 }
 
 static inline void pgd_clear(pgd_t *pgdp)
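
For reference, below the patch itself: a self-contained userspace
sketch of the table-entry classification that the pXd_valid_not_user()
helpers above rely on. A user-space table entry is written with
PXNTable set and UXNTable clear, so any other combination is treated as
a kernel table entry and gets the barriers. The bit positions and
example_* names are illustrative only, not the kernel definitions.

/*
 * Standalone model of the (PXN && !UXN) => user-space table rule.
 * Bit positions are invented for this example.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define EX_TABLE_PXN	(1ULL << 59)	/* no privileged exec below this table */
#define EX_TABLE_UXN	(1ULL << 60)	/* no user exec below this table */

/* Returns true for kernel table entries (barriers needed); user table
 * entries are exactly those with PXN set and UXN clear. */
static bool example_table_valid_not_user(uint64_t table_desc)
{
	return (table_desc & (EX_TABLE_PXN | EX_TABLE_UXN)) != EX_TABLE_PXN;
}

int main(void)
{
	/* 0: user table entry (PXNTable only), barriers can be elided. */
	printf("user table   -> %d\n", example_table_valid_not_user(EX_TABLE_PXN));
	/* 1: kernel table entry (UXNTable only), barriers are emitted. */
	printf("kernel table -> %d\n", example_table_valid_not_user(EX_TABLE_UXN));
	/* 1: kernel table entry (both set), barriers are emitted. */
	printf("kernel table -> %d\n", example_table_valid_not_user(EX_TABLE_PXN | EX_TABLE_UXN));
	return 0;
}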