From patchwork Tue Mar 4 15:04:36 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 14000870
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Pasha Tatashin, Andrew Morton,
 Uladzislau Rezki, Christoph Hellwig, David Hildenbrand,
 "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual,
 Alexandre Ghiti, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 06/11] arm64/mm: Hoist barriers out of set_ptes_anysz() loop
Date: Tue, 4 Mar 2025 15:04:36 +0000
Message-ID: <20250304150444.3788920-7-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250304150444.3788920-1-ryan.roberts@arm.com>
References: <20250304150444.3788920-1-ryan.roberts@arm.com>

set_ptes_anysz() previously called __set_pte() for each PTE in the
range. If the new PTE was valid and mapped kernel space, each call
issued a DSB and ISB to make the new PTE value immediately visible to
the table walker.

We can do better than this; let's hoist those barriers out of the loop
so that they are issued only once, after the last PTE has been written.
This reduces the barrier cost by a factor equal to the number of PTEs in
the range.

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index e255a36380dc..1898c3069c43 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -317,13 +317,11 @@ static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
 	WRITE_ONCE(*ptep, pte);
 }
 
-static inline void __set_pte(pte_t *ptep, pte_t pte)
+static inline void __set_pte_complete(pte_t pte)
 {
-	__set_pte_nosync(ptep, pte);
-
 	/*
 	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
-	 * or update_mmu_cache() have the necessary barriers.
+	 * has the necessary barriers.
 	 */
 	if (pte_valid_not_user(pte)) {
 		dsb(ishst);
@@ -331,6 +329,12 @@ static inline void __set_pte(pte_t *ptep, pte_t pte)
 	}
 }
 
+static inline void __set_pte(pte_t *ptep, pte_t pte)
+{
+	__set_pte_nosync(ptep, pte);
+	__set_pte_complete(pte);
+}
+
 static inline pte_t __ptep_get(pte_t *ptep)
 {
 	return READ_ONCE(*ptep);
@@ -647,12 +651,14 @@ static inline void set_ptes_anysz(struct mm_struct *mm, pte_t *ptep, pte_t pte,
 
 	for (;;) {
 		__check_safe_pte_update(mm, ptep, pte);
-		__set_pte(ptep, pte);
+		__set_pte_nosync(ptep, pte);
 		if (--nr == 0)
 			break;
 		ptep++;
 		pte = pte_advance_pfn(pte, stride);
 	}
+
+	__set_pte_complete(pte);
 }
 
 static inline void __set_ptes(struct mm_struct *mm,
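
The effect of the change can be illustrated with a small userspace model.
This is only a sketch, not kernel code: every model_* name, the flag bit
values and the PFN shift are assumptions made up for illustration, and the
DSB/ISB pair is replaced by a counter. It shows that the per-entry variant
issues one barrier pair per valid kernel entry, while the hoisted variant
issues at most one for the whole range.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names are kernel definitions. */
typedef unsigned long model_pte_t;

#define MODEL_PTE_VALID  0x1UL  /* assumed "valid" flag bit */
#define MODEL_PTE_USER   0x2UL  /* assumed "user-accessible" flag bit */
#define MODEL_PFN_SHIFT  12     /* assumed position of the frame number */

static unsigned long model_barrier_pairs;

/* Stand-in for the dsb(ishst); isb(); pair: it only counts invocations. */
static void model_barriers(void)
{
	model_barrier_pairs++;
}

/* Models __set_pte_nosync(): a plain store with no barrier. */
static void model_set_nosync(model_pte_t *ptep, model_pte_t pte)
{
	*ptep = pte;
}

/* Models __set_pte_complete(): barriers only for valid, non-user entries. */
static void model_set_complete(model_pte_t pte)
{
	if ((pte & (MODEL_PTE_VALID | MODEL_PTE_USER)) == MODEL_PTE_VALID)
		model_barriers();
}

/*
 * Writes nr consecutive entries and returns how many barrier pairs were
 * issued. hoisted == 0 models the old per-entry behaviour; hoisted != 0
 * models the loop shape after this patch.
 */
static unsigned long model_set_range(int hoisted, model_pte_t *ptep,
				     model_pte_t pte, unsigned int nr)
{
	assert(nr > 0);	/* the loop shape requires at least one entry */
	model_barrier_pairs = 0;

	for (;;) {
		model_set_nosync(ptep, pte);
		if (!hoisted)
			model_set_complete(pte);
		if (--nr == 0)
			break;
		ptep++;
		pte += 1UL << MODEL_PFN_SHIFT;	/* next frame, same flag bits */
	}

	/* The flag bits are identical across the range, so checking the
	 * last value written is enough to decide whether to sync. */
	if (hoisted)
		model_set_complete(pte);

	return model_barrier_pairs;
}
```

Under these assumptions, writing 8 valid kernel entries issues 8 barrier
pairs with hoisted == 0 and a single pair with hoisted != 0. An entry with
the user bit set issues none in either mode.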