From patchwork Fri Feb 2 08:07:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13542359
From: Ryan Roberts
To: Catalin Marinas, Will Deacon, Ard Biesheuvel, Marc Zyngier,
 James Morse, Andrey Ryabinin, Andrew Morton, Matthew Wilcox,
 Mark Rutland, David Hildenbrand, Kefeng Wang, John Hubbard, Zi Yan,
 Barry Song <21cnbao@gmail.com>, Alistair Popple, Yang Shi,
 Nicholas Piggin, Christophe Leroy, "Aneesh Kumar K.V",
 "Naveen N. Rao", Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, "H. Peter Anvin"
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org, x86@kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v5 24/25] arm64/mm: __always_inline to improve fork() perf
Date: Fri, 2 Feb 2024 08:07:55 +0000
Message-Id: <20240202080756.1453939-25-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20240202080756.1453939-1-ryan.roberts@arm.com>
References: <20240202080756.1453939-1-ryan.roberts@arm.com>

As set_ptes() and wrprotect_ptes() become a bit more complex, the
compiler may choose not to inline them. But this is critical for fork()
performance. So mark the functions, along with contpte_try_unfold()
which is called by them, as __always_inline. This is worth ~1% on the
fork() microbenchmark with order-0 folios (the common case).

Signed-off-by: Ryan Roberts
Acked-by: Mark Rutland
---
 arch/arm64/include/asm/pgtable.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 353ea67b5d75..cdc310880a3b 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1213,8 +1213,8 @@ extern int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
 				unsigned long addr, pte_t *ptep,
 				pte_t entry, int dirty);
 
-static inline void contpte_try_unfold(struct mm_struct *mm, unsigned long addr,
-					pte_t *ptep, pte_t pte)
+static __always_inline void contpte_try_unfold(struct mm_struct *mm,
+				unsigned long addr, pte_t *ptep, pte_t pte)
 {
 	if (unlikely(pte_valid_cont(pte)))
 		__contpte_try_unfold(mm, addr, ptep, pte);
@@ -1279,7 +1279,7 @@ static inline void set_pte(pte_t *ptep, pte_t pte)
 }
 
 #define set_ptes set_ptes
-static inline void set_ptes(struct mm_struct *mm, unsigned long addr,
+static __always_inline void set_ptes(struct mm_struct *mm, unsigned long addr,
 				pte_t *ptep, pte_t pte, unsigned int nr)
 {
 	pte = pte_mknoncont(pte);
@@ -1361,8 +1361,8 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
 }
 
 #define wrprotect_ptes wrprotect_ptes
-static inline void wrprotect_ptes(struct mm_struct *mm, unsigned long addr,
-				pte_t *ptep, unsigned int nr)
+static __always_inline void wrprotect_ptes(struct mm_struct *mm,
+				unsigned long addr, pte_t *ptep, unsigned int nr)
 {
 	if (likely(nr == 1)) {
 		/*