From patchwork Wed Feb 5 15:09:48 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13961286
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Muchun Song, Pasha Tatashin,
 Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Mark Rutland,
 Ard Biesheuvel, Anshuman Khandual, Dev Jain, Alexandre Ghiti,
 Steve Capper, Kevin Brodsky
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v1 08/16] arm64/mm: Hoist barriers out of ___set_ptes() loop
Date: Wed, 5 Feb 2025 15:09:48 +0000
Message-ID: <20250205151003.88959-9-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250205151003.88959-1-ryan.roberts@arm.com>
References: <20250205151003.88959-1-ryan.roberts@arm.com>

___set_ptes() previously called __set_pte() for each PTE in the range,
which would conditionally issue a DSB and ISB to make the new PTE value
immediately visible to the table walker if the new PTE was valid and
mapped kernel space.

We can do better than this; let's hoist those barriers out of the loop
so that they are only issued once, after the last PTE has been written.
This cuts the number of barriers from one DSB/ISB pair per PTE to at
most one pair per call.
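
For illustration only, here is a small user-space model of the change.
It is a sketch under stated assumptions, not kernel code: the toy PTE
layout, the model_* names and the barrier counter are all hypothetical
and merely stand in for the real pte_t, __set_pte_nosync(),
__set_pte_complete() and the DSB/ISB pair.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy PTE (assumption): bit 0 = valid, bit 1 = kernel mapping. */
typedef uint64_t model_pte_t;
#define MODEL_PTE_VALID   (1ull << 0)
#define MODEL_PTE_KERNEL  (1ull << 1)
#define MODEL_PFN_STEP    (1ull << 12)   /* advance to the next page */
#define MODEL_MAX_NR      64

/* Stands in for one DSB + ISB pair; we only count them. */
static unsigned long model_barriers;

static void model_set_pte_nosync(model_pte_t *ptep, model_pte_t pte)
{
	*ptep = pte;
}

/* Issue the (modelled) barriers only for valid kernel PTEs. */
static void model_set_pte_complete(model_pte_t pte)
{
	model_pte_t mask = MODEL_PTE_VALID | MODEL_PTE_KERNEL;

	if ((pte & mask) == mask)
		model_barriers++;
}

/*
 * Write nr consecutive PTEs and return how many barrier pairs were
 * issued. hoisted == false models the old per-PTE behaviour,
 * hoisted == true models a single completion after the loop.
 */
static unsigned long model_set_ptes(bool hoisted, model_pte_t pte,
				    unsigned int nr)
{
	model_pte_t table[MODEL_MAX_NR];
	model_pte_t *ptep = table;

	if (nr == 0 || nr > MODEL_MAX_NR)
		return 0;

	model_barriers = 0;
	for (;;) {
		model_set_pte_nosync(ptep, pte);
		if (!hoisted)
			model_set_pte_complete(pte);
		if (--nr == 0)
			break;
		ptep++;
		pte += MODEL_PFN_STEP;
	}
	if (hoisted)
		model_set_pte_complete(pte);

	return model_barriers;
}
```

In this model, writing 16 valid kernel PTEs costs 16 barrier pairs with
the per-PTE variant and one with the hoisted variant; PTEs that are not
valid kernel mappings cost none in either case. The valid and kernel
bits do not change as the PFN advances, so checking only the final pte
value after the loop gives the same answer as checking any earlier one.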

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/pgtable.h | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 3b55d9a15f05..1d428e9c0e5a 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -317,10 +317,8 @@ static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
 	WRITE_ONCE(*ptep, pte);
 }
 
-static inline void __set_pte(pte_t *ptep, pte_t pte)
+static inline void __set_pte_complete(pte_t pte)
 {
-	__set_pte_nosync(ptep, pte);
-
 	/*
 	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
 	 * or update_mmu_cache() have the necessary barriers.
@@ -331,6 +329,12 @@ static inline void __set_pte(pte_t *ptep, pte_t pte)
 	}
 }
 
+static inline void __set_pte(pte_t *ptep, pte_t pte)
+{
+	__set_pte_nosync(ptep, pte);
+	__set_pte_complete(pte);
+}
+
 static inline pte_t __ptep_get(pte_t *ptep)
 {
 	return READ_ONCE(*ptep);
@@ -647,12 +651,14 @@ static inline void ___set_ptes(struct mm_struct *mm, pte_t *ptep, pte_t pte,
 
 	for (;;) {
 		__check_safe_pte_update(mm, ptep, pte);
-		__set_pte(ptep, pte);
+		__set_pte_nosync(ptep, pte);
 		if (--nr == 0)
 			break;
 		ptep++;
 		pte = pte_advance_pfn(pte, stride);
 	}
+
+	__set_pte_complete(pte);
 }
 
 static inline void __set_ptes(struct mm_struct *mm,