From patchwork Fri Apr 14 13:02:47 2023
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, "Matthew Wilcox (Oracle)", Yu Zhao, "Yin, Fengwei"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: [RFC v2 PATCH 01/17] mm: Expose clear_huge_page() unconditionally
Date: Fri, 14 Apr 2023 14:02:47 +0100
Message-Id: <20230414130303.2345383-2-ryan.roberts@arm.com>
In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com>
References: <20230414130303.2345383-1-ryan.roberts@arm.com>
In preparation for extending vma_alloc_zeroed_movable_folio() to
allocate an arbitrary-order folio, expose clear_huge_page()
unconditionally, so that it can be used to zero the allocated folio.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 include/linux/mm.h | 3 ++-
 mm/memory.c        | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

--
2.25.1

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1f79667824eb..cdb8c6031d0f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3538,10 +3538,11 @@ enum mf_action_page_type {
  */
 extern const struct attribute_group memory_failure_attr_group;

-#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
 extern void clear_huge_page(struct page *page,
			    unsigned long addr_hint,
			    unsigned int pages_per_huge_page);
+
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
 extern void copy_user_huge_page(struct page *dst, struct page *src,
				unsigned long addr_hint,
				struct vm_area_struct *vma,
diff --git a/mm/memory.c b/mm/memory.c
index 01a23ad48a04..3e2eee8c66a7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5642,7 +5642,6 @@ void __might_fault(const char *file, int line)
 EXPORT_SYMBOL(__might_fault);
 #endif

-#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
 /*
  * Process all subpages of the specified huge page with the specified
  * operation.  The target subpage will be processed last to keep its
@@ -5730,6 +5729,8 @@ void clear_huge_page(struct page *page,
	process_huge_page(addr_hint, pages_per_huge_page, clear_subpage, page);
 }

+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
+
 static void copy_user_gigantic_page(struct page *dst, struct page *src,
				    unsigned long addr,
				    struct vm_area_struct *vma,

From patchwork Fri Apr 14 13:02:48 2023
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, "Matthew Wilcox (Oracle)", Yu Zhao, "Yin, Fengwei"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: [RFC v2 PATCH 02/17] mm: pass gfp flags and order to vma_alloc_zeroed_movable_folio()
Date: Fri, 14 Apr 2023 14:02:48 +0100
Message-Id: <20230414130303.2345383-3-ryan.roberts@arm.com>
In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com>
References: <20230414130303.2345383-1-ryan.roberts@arm.com>

Allow allocation of large folios with vma_alloc_zeroed_movable_folio().
This prepares the ground for large anonymous folios. The generic
implementation of vma_alloc_zeroed_movable_folio() now uses
clear_huge_page() to zero the allocated folio, since it may now be of
non-0 order.

Currently the function is always called with order 0 and no extra gfp
flags, so no functional change is intended.
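As an illustrative sketch only (not part of this patch), a caller could
use the extended interface to opportunistically request a larger zeroed
folio and fall back to order-0 itself; the extra gfp flags and the
fallback policy shown here are assumptions, not something this patch
mandates:

	struct folio *folio;

	/* Hypothetical caller: try an order-2 (4-page) zeroed folio first. */
	folio = vma_alloc_zeroed_movable_folio(vma, vmf->address,
					       __GFP_NORETRY | __GFP_NOWARN, 2);
	if (!folio)
		/* Fall back to a single zeroed page, as before this change. */
		folio = vma_alloc_zeroed_movable_folio(vma, vmf->address, 0, 0);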
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/alpha/include/asm/page.h   |  5 +++--
 arch/arm64/include/asm/page.h   |  3 ++-
 arch/arm64/mm/fault.c           |  7 ++++---
 arch/ia64/include/asm/page.h    |  5 +++--
 arch/m68k/include/asm/page_no.h |  7 ++++---
 arch/s390/include/asm/page.h    |  5 +++--
 arch/x86/include/asm/page.h     |  5 +++--
 include/linux/highmem.h         | 23 +++++++++++++----------
 mm/memory.c                     |  5 +++--
 9 files changed, 38 insertions(+), 27 deletions(-)

--
2.25.1

diff --git a/arch/alpha/include/asm/page.h b/arch/alpha/include/asm/page.h
index 4db1ebc0ed99..6fc7fe91b6cb 100644
--- a/arch/alpha/include/asm/page.h
+++ b/arch/alpha/include/asm/page.h
@@ -17,8 +17,9 @@ extern void clear_page(void *page);
 #define clear_user_page(page, vaddr, pg)	clear_page(page)

-#define vma_alloc_zeroed_movable_folio(vma, vaddr) \
-	vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false)
+#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \
+	vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \
+			order, vma, vaddr, false)

 extern void copy_page(void * _to, void * _from);
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 2312e6ee595f..47710852f872 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -30,7 +30,8 @@ void copy_highpage(struct page *to, struct page *from);
 #define __HAVE_ARCH_COPY_HIGHPAGE

 struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
-						unsigned long vaddr);
+						unsigned long vaddr,
+						gfp_t gfp, int order);
 #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio

 void tag_clear_highpage(struct page *to);
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index f4cb0f85ccf4..3b4cc04f7a23 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -926,9 +926,10 @@ NOKPROBE_SYMBOL(do_debug_exception);
  * Used during anonymous page fault handling.
  */
 struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
-						unsigned long vaddr)
+						unsigned long vaddr,
+						gfp_t gfp, int order)
 {
-	gfp_t flags = GFP_HIGHUSER_MOVABLE | __GFP_ZERO;
+	gfp_t flags = GFP_HIGHUSER_MOVABLE | __GFP_ZERO | gfp;

	/*
	 * If the page is mapped with PROT_MTE, initialise the tags at the
@@ -938,7 +939,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
	if (vma->vm_flags & VM_MTE)
		flags |= __GFP_ZEROTAGS;

-	return vma_alloc_folio(flags, 0, vma, vaddr, false);
+	return vma_alloc_folio(flags, order, vma, vaddr, false);
 }

 void tag_clear_highpage(struct page *page)
diff --git a/arch/ia64/include/asm/page.h b/arch/ia64/include/asm/page.h
index 310b09c3342d..ebdf04274023 100644
--- a/arch/ia64/include/asm/page.h
+++ b/arch/ia64/include/asm/page.h
@@ -82,10 +82,11 @@ do {						\
 } while (0)

-#define vma_alloc_zeroed_movable_folio(vma, vaddr)			\
+#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order)		\
 ({									\
	struct folio *folio = vma_alloc_folio(				\
-		GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false); \
+		GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp),		\
+		order, vma, vaddr, false);				\
	if (folio)							\
		flush_dcache_folio(folio);				\
	folio;								\
diff --git a/arch/m68k/include/asm/page_no.h b/arch/m68k/include/asm/page_no.h
index 060e4c0e7605..4a2fe57fef5e 100644
--- a/arch/m68k/include/asm/page_no.h
+++ b/arch/m68k/include/asm/page_no.h
@@ -3,7 +3,7 @@
 #define _M68K_PAGE_NO_H

 #ifndef __ASSEMBLY__
- 
+
 extern unsigned long memory_start;
 extern unsigned long memory_end;

@@ -13,8 +13,9 @@ extern unsigned long memory_end;
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)

-#define vma_alloc_zeroed_movable_folio(vma, vaddr) \
-	vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false)
+#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \
+	vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \
+			order, vma, vaddr, false)

 #define __pa(vaddr)		((unsigned long)(vaddr))
 #define __va(paddr)		((void *)((unsigned long)(paddr)))
diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
index 8a2a3b5d1e29..b749564140f1 100644
--- a/arch/s390/include/asm/page.h
+++ b/arch/s390/include/asm/page.h
@@ -73,8 +73,9 @@ static inline void copy_page(void *to, void *from)
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)

-#define vma_alloc_zeroed_movable_folio(vma, vaddr) \
-	vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false)
+#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \
+	vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \
+			order, vma, vaddr, false)

 /*
  * These are used to make use of C type-checking..
diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index d18e5c332cb9..34deab1a8dae 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -34,8 +34,9 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
	copy_page(to, from);
 }

-#define vma_alloc_zeroed_movable_folio(vma, vaddr) \
-	vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false)
+#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \
+	vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \
+			order, vma, vaddr, false)

 #ifndef __pa
 #define __pa(x)	__phys_addr((unsigned long)(x))
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 8fc10089e19e..54e68deae5ef 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -209,26 +209,29 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr)

 #ifndef vma_alloc_zeroed_movable_folio
 /**
- * vma_alloc_zeroed_movable_folio - Allocate a zeroed page for a VMA.
- * @vma: The VMA the page is to be allocated for.
- * @vaddr: The virtual address the page will be inserted into.
- *
- * This function will allocate a page suitable for inserting into this
- * VMA at this virtual address.  It may be allocated from highmem or
+ * vma_alloc_zeroed_movable_folio - Allocate a zeroed folio for a VMA.
+ * @vma: The start VMA the folio is to be allocated for.
+ * @vaddr: The virtual address the folio will be inserted into.
+ * @gfp: Additional gfp flags to mix in or 0.
+ * @order: The order of the folio (2^order pages).
+ *
+ * This function will allocate a folio suitable for inserting into this
+ * VMA starting at this virtual address.  It may be allocated from highmem or
  * the movable zone.  An architecture may provide its own implementation.
  *
- * Return: A folio containing one allocated and zeroed page or NULL if
+ * Return: A folio containing 2^order allocated and zeroed pages or NULL if
  * we are out of memory.
  */
 static inline struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
-				   unsigned long vaddr)
+				   unsigned long vaddr, gfp_t gfp, int order)
 {
	struct folio *folio;

-	folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, vaddr, false);
+	folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE | gfp,
+				order, vma, vaddr, false);
	if (folio)
-		clear_user_highpage(&folio->page, vaddr);
+		clear_huge_page(&folio->page, vaddr, 1U << order);

	return folio;
 }
diff --git a/mm/memory.c b/mm/memory.c
index 3e2eee8c66a7..9d5e8be49f3b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3061,7 +3061,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
		goto oom;

	if (is_zero_pfn(pte_pfn(vmf->orig_pte))) {
-		new_folio = vma_alloc_zeroed_movable_folio(vma, vmf->address);
+		new_folio = vma_alloc_zeroed_movable_folio(vma, vmf->address,
+							   0, 0);
		if (!new_folio)
			goto oom;
	} else {
@@ -4063,7 +4064,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
	/* Allocate our own private page. */
	if (unlikely(anon_vma_prepare(vma)))
		goto oom;
-	folio = vma_alloc_zeroed_movable_folio(vma, vmf->address);
+	folio = vma_alloc_zeroed_movable_folio(vma, vmf->address, 0, 0);
	if (!folio)
		goto oom;

From patchwork Fri Apr 14 13:02:49 2023
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, "Matthew Wilcox (Oracle)", Yu Zhao, "Yin, Fengwei"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: [RFC v2 PATCH 03/17] mm: Introduce try_vma_alloc_movable_folio()
Date: Fri, 14 Apr 2023 14:02:49 +0100
Message-Id: <20230414130303.2345383-4-ryan.roberts@arm.com>
In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com>
References: <20230414130303.2345383-1-ryan.roberts@arm.com>

Opportunistically attempt to allocate high-order folios in highmem,
optionally zeroed. Retry with lower orders all the way to order-0,
until success. Note that order-1 allocations are skipped, since a large
folio must be at least order-2 to work with the THP machinery. The user
must check what they got with folio_order().

This will be used to opportunistically allocate large folios for
anonymous memory with a sensible fallback under memory pressure.

For attempts to allocate non-0 orders, we set __GFP_NORETRY to prevent
high latency due to reclaim, instead preferring to just try for a lower
order. The same approach is used by the readahead code when allocating
large folios.
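As a rough usage sketch (illustrative only; it mirrors how a later
patch in this series calls this helper from the fault path), the caller
asks for an upper bound and then inspects what it actually got; order-1
is never returned by construction:

	struct folio *folio;
	int order, nr;

	folio = try_vma_alloc_movable_folio(vma, vmf->address, 4, true);
	if (!folio)
		return VM_FAULT_OOM;

	order = folio_order(folio);	/* may be any of 4, 3, 2 or 0 */
	nr = 1 << order;		/* number of pages actually allocated */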
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 mm/memory.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

--
2.25.1

diff --git a/mm/memory.c b/mm/memory.c
index 9d5e8be49f3b..ca32f59acef2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2989,6 +2989,39 @@ static vm_fault_t fault_dirty_shared_page(struct vm_fault *vmf)
	return 0;
 }

+static inline struct folio *vma_alloc_movable_folio(struct vm_area_struct *vma,
+				unsigned long vaddr, int order, bool zeroed)
+{
+	gfp_t gfp = order > 0 ? __GFP_NORETRY | __GFP_NOWARN : 0;
+
+	if (zeroed)
+		return vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order);
+	else
+		return vma_alloc_folio(GFP_HIGHUSER_MOVABLE | gfp, order, vma,
+							vaddr, false);
+}
+
+/*
+ * Opportunistically attempt to allocate high-order folios, retrying with lower
+ * orders all the way to order-0, until success. order-1 allocations are skipped
+ * since a folio must be at least order-2 to work with the THP machinery. The
+ * user must check what they got with folio_order(). vaddr can be any virtual
+ * address that will be mapped by the allocated folio.
+ */
+static struct folio *try_vma_alloc_movable_folio(struct vm_area_struct *vma,
+				unsigned long vaddr, int order, bool zeroed)
+{
+	struct folio *folio;
+
+	for (; order > 1; order--) {
+		folio = vma_alloc_movable_folio(vma, vaddr, order, zeroed);
+		if (folio)
+			return folio;
+	}
+
+	return vma_alloc_movable_folio(vma, vaddr, 0, zeroed);
+}
+
 /*
  * Handle write page faults for pages that can be reused in the current vma
  *

From patchwork Fri Apr 14 13:02:50 2023
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, "Matthew Wilcox (Oracle)", Yu Zhao, "Yin, Fengwei"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: [RFC v2 PATCH 04/17] mm: Implement folio_add_new_anon_rmap_range()
Date: Fri, 14 Apr 2023 14:02:50 +0100
Message-Id: <20230414130303.2345383-5-ryan.roberts@arm.com>
In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com>
References: <20230414130303.2345383-1-ryan.roberts@arm.com>

Like folio_add_new_anon_rmap() but batch-rmaps a range of pages
belonging to a folio, for efficiency savings. All pages are accounted
as small pages.
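For context, a sketch of the intended call pattern (this mirrors the
do_anonymous_page() changes later in this series; the surrounding vma,
addr and folio variables are assumed to exist):

	int nr = folio_nr_pages(folio);

	/* Take one extra ref per additional page, then account and map the batch. */
	folio_ref_add(folio, nr - 1);
	add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr);
	folio_add_new_anon_rmap_range(folio, &folio->page, nr, vma, addr);
	folio_add_lru_vma(folio, vma);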
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 include/linux/rmap.h |  2 ++
 mm/rmap.c            | 43 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

--
2.25.1

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index b87d01660412..5c707f53d7b5 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -196,6 +196,8 @@ void page_add_new_anon_rmap(struct page *, struct vm_area_struct *,
		unsigned long address);
 void folio_add_new_anon_rmap(struct folio *, struct vm_area_struct *,
		unsigned long address);
+void folio_add_new_anon_rmap_range(struct folio *folio, struct page *page,
+		int nr, struct vm_area_struct *vma, unsigned long address);
 void page_add_file_rmap(struct page *, struct vm_area_struct *,
		bool compound);
 void page_remove_rmap(struct page *, struct vm_area_struct *,
diff --git a/mm/rmap.c b/mm/rmap.c
index 8632e02661ac..d563d979c005 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1302,6 +1302,49 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
	__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
 }

+/**
+ * folio_add_new_anon_rmap_range - Add mapping to a set of pages within a new
+ *	anonymous potentially large folio.
+ * @folio: The folio containing the pages to be mapped
+ * @page: First page in the folio to be mapped
+ * @nr: Number of pages to be mapped
+ * @vma: the vm area in which the mapping is added
+ * @address: the user virtual address of the first page to be mapped
+ *
+ * Like folio_add_new_anon_rmap() but batch-maps a range of pages within a
+ * folio using non-THP accounting. Like folio_add_new_anon_rmap(), the
+ * inc-and-test is bypassed and the folio does not have to be locked. All pages
+ * in the folio are individually accounted.
+ *
+ * As the folio is new, it's assumed to be mapped exclusively by a single
+ * process.
+ */
+void folio_add_new_anon_rmap_range(struct folio *folio, struct page *page,
+		int nr, struct vm_area_struct *vma, unsigned long address)
+{
+	int i;
+
+	VM_BUG_ON_VMA(address < vma->vm_start ||
+		      address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
+	__folio_set_swapbacked(folio);
+
+	if (folio_test_large(folio)) {
+		/* increment count (starts at 0) */
+		atomic_set(&folio->_nr_pages_mapped, nr);
+	}
+
+	for (i = 0; i < nr; i++) {
+		/* increment count (starts at -1) */
+		atomic_set(&page->_mapcount, 0);
+		__page_set_anon_rmap(folio, page, vma, address, 1);
+		page++;
+		address += PAGE_SIZE;
+	}
+
+	__lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr);
+
+}
+
 /**
  * page_add_file_rmap - add pte mapping to a file page
  * @page: the page to add the mapping to

From patchwork Fri Apr 14 13:02:51 2023
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, "Matthew Wilcox (Oracle)", Yu Zhao, "Yin, Fengwei"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: [RFC v2 PATCH 05/17] mm: Routines to determine max anon folio allocation order
Date: Fri, 14 Apr 2023 14:02:51 +0100
Message-Id: <20230414130303.2345383-6-ryan.roberts@arm.com>
In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com>
References: <20230414130303.2345383-1-ryan.roberts@arm.com>

For variable-order anonymous folios, we want to tune the order that we
prefer to allocate based on the vma. Add the routines to manage that
heuristic.

TODO: Currently we always use the global maximum. Add per-vma logic!
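(For scale: with 4 KiB base pages, an order-4 maximum corresponds to
2^4 = 16 pages, i.e. folios of up to 64 KiB.)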
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 include/linux/mm.h | 5 +++++
 mm/memory.c        | 8 ++++++++
 2 files changed, 13 insertions(+)

--
2.25.1

diff --git a/include/linux/mm.h b/include/linux/mm.h
index cdb8c6031d0f..cc8d0b239116 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3674,4 +3674,9 @@ madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
 }
 #endif

+/*
+ * TODO: Should this be set per-architecture?
+ */
+#define ANON_FOLIO_ORDER_MAX	4
+
 #endif /* _LINUX_MM_H */
diff --git a/mm/memory.c b/mm/memory.c
index ca32f59acef2..d7e34a8c46aa 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3022,6 +3022,14 @@ static struct folio *try_vma_alloc_movable_folio(struct vm_area_struct *vma,
	return vma_alloc_movable_folio(vma, vaddr, 0, zeroed);
 }

+static inline int max_anon_folio_order(struct vm_area_struct *vma)
+{
+	/*
+	 * TODO: Policy for maximum folio order should likely be per-vma.
+	 */
+	return ANON_FOLIO_ORDER_MAX;
+}
+
 /*
  * Handle write page faults for pages that can be reused in the current vma
  *

From patchwork Fri Apr 14 13:02:52 2023
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, "Matthew Wilcox (Oracle)", Yu Zhao, "Yin, Fengwei"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: [RFC v2 PATCH 06/17] mm: Allocate large folios for anonymous memory
Date: Fri, 14 Apr 2023 14:02:52 +0100
Message-Id: <20230414130303.2345383-7-ryan.roberts@arm.com>
In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com>
References: <20230414130303.2345383-1-ryan.roberts@arm.com>

Add the machinery to determine what order of folio to allocate within
do_anonymous_page() and deal with racing faults to the same region.

For now, the maximum order is set to 4. This should probably be set
per-vma based on factors, and adjusted dynamically.
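As a concrete, hypothetical example of the sizing rules implemented
below: with 4 KiB pages and a fault at address 0x7f1234567000, an
order-3 candidate spans 8 pages (32 KiB), so the folio would have to
start at ALIGN_DOWN(0x7f1234567000, 0x8000) = 0x7f1234560000. That
32 KiB range must lie entirely within the VMA, within a single PMD, and
currently contain only pte_none() entries; otherwise the order is
reduced (skipping order-1) until it reaches order-0.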
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 mm/memory.c | 154 ++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 138 insertions(+), 16 deletions(-)

--
2.25.1

diff --git a/mm/memory.c b/mm/memory.c
index d7e34a8c46aa..f92a28064596 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3030,6 +3030,90 @@ static inline int max_anon_folio_order(struct vm_area_struct *vma)
	return ANON_FOLIO_ORDER_MAX;
 }

+/*
+ * Returns index of first pte that is not none, or nr if all are none.
+ */
+static inline int check_ptes_none(pte_t *pte, int nr)
+{
+	int i;
+
+	for (i = 0; i < nr; i++) {
+		if (!pte_none(*pte++))
+			return i;
+	}
+
+	return nr;
+}
+
+static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order)
+{
+	/*
+	 * The aim here is to determine what size of folio we should allocate
+	 * for this fault. Factors include:
+	 * - Order must not be higher than `order` upon entry
+	 * - Folio must be naturally aligned within VA space
+	 * - Folio must not breach boundaries of vma
+	 * - Folio must be fully contained inside one pmd entry
+	 * - Folio must not overlap any non-none ptes
+	 *
+	 * Additionally, we do not allow order-1 since this breaks assumptions
+	 * elsewhere in the mm; THP pages must be at least order-2 (since they
+	 * store state up to the 3rd struct page subpage), and these pages must
+	 * be THP in order to correctly use pre-existing THP infrastructure such
+	 * as folio_split().
+	 *
+	 * As a consequence of relying on the THP infrastructure, if the system
+	 * does not support THP, we always fall back to order-0.
+	 *
+	 * Note that the caller may or may not choose to lock the pte. If
+	 * unlocked, the calculation should be considered an estimate that will
+	 * need to be validated under the lock.
+	 */
+
+	struct vm_area_struct *vma = vmf->vma;
+	int nr;
+	unsigned long addr;
+	pte_t *pte;
+	pte_t *first_set = NULL;
+	int ret;
+
+	if (has_transparent_hugepage()) {
+		order = min(order, PMD_SHIFT - PAGE_SHIFT);
+
+		for (; order > 1; order--) {
+			nr = 1 << order;
+			addr = ALIGN_DOWN(vmf->address, nr << PAGE_SHIFT);
+			pte = vmf->pte - ((vmf->address - addr) >> PAGE_SHIFT);
+
+			/* Check vma bounds. */
+			if (addr < vma->vm_start ||
+			    addr + (nr << PAGE_SHIFT) > vma->vm_end)
+				continue;
+
+			/* Ptes covered by order already known to be none. */
+			if (pte + nr <= first_set)
+				break;
+
+			/* Already found set pte in range covered by order. */
+			if (pte <= first_set)
+				continue;
+
+			/* Need to check if all the ptes are none. */
+			ret = check_ptes_none(pte, nr);
+			if (ret == nr)
+				break;
+
+			first_set = pte + ret;
+		}
+
+		if (order == 1)
+			order = 0;
+	} else
+		order = 0;
+
+	return order;
+}
+
 /*
  * Handle write page faults for pages that can be reused in the current vma
  *
@@ -4058,6 +4142,9 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
	struct folio *folio;
	vm_fault_t ret = 0;
	pte_t entry;
+	unsigned long addr;
+	int order = max_anon_folio_order(vma);
+	int pgcount = BIT(order);

	/* File mapping without ->vm_ops ? */
	if (vma->vm_flags & VM_SHARED)
@@ -4099,24 +4186,42 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
			pte_unmap_unlock(vmf->pte, vmf->ptl);
			return handle_userfault(vmf, VM_UFFD_MISSING);
		}
-		goto setpte;
+		set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
+
+		/* No need to invalidate - it was non-present before */
+		update_mmu_cache(vma, vmf->address, vmf->pte);
+		goto unlock;
	}

-	/* Allocate our own private page. */
+retry:
+	/*
+	 * Estimate the folio order to allocate. We are not under the ptl here
+	 * so this estimate needs to be re-checked later once we have the lock.
+	 */
+	vmf->pte = pte_offset_map(vmf->pmd, vmf->address);
+	order = calc_anon_folio_order_alloc(vmf, order);
+	pte_unmap(vmf->pte);
+
+	/* Allocate our own private folio. */
	if (unlikely(anon_vma_prepare(vma)))
		goto oom;
-	folio = vma_alloc_zeroed_movable_folio(vma, vmf->address, 0, 0);
+	folio = try_vma_alloc_movable_folio(vma, vmf->address, order, true);
	if (!folio)
		goto oom;

+	/* We may have been granted less than we asked for. */
+	order = folio_order(folio);
+	pgcount = BIT(order);
+	addr = ALIGN_DOWN(vmf->address, pgcount << PAGE_SHIFT);
+
	if (mem_cgroup_charge(folio, vma->vm_mm, GFP_KERNEL))
		goto oom_free_page;
-	cgroup_throttle_swaprate(&folio->page, GFP_KERNEL);
+	folio_throttle_swaprate(folio, GFP_KERNEL);

	/*
	 * The memory barrier inside __folio_mark_uptodate makes sure that
-	 * preceding stores to the page contents become visible before
-	 * the set_pte_at() write.
+	 * preceding stores to the folio contents become visible before
+	 * the set_ptes() write.
	 */
	__folio_mark_uptodate(folio);
@@ -4125,11 +4230,26 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
	if (vma->vm_flags & VM_WRITE)
		entry = pte_mkwrite(pte_mkdirty(entry));

-	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
-			&vmf->ptl);
-	if (!pte_none(*vmf->pte)) {
-		update_mmu_tlb(vma, vmf->address, vmf->pte);
-		goto release;
+	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
+
+	/*
+	 * Ensure our estimate above is still correct; we could have raced with
+	 * another thread to service a fault in the region.
+	 */
+	if (unlikely(check_ptes_none(vmf->pte, pgcount) != pgcount)) {
+		pte_t *pte = vmf->pte + ((vmf->address - addr) >> PAGE_SHIFT);
+
+		/* If faulting pte was allocated by another, exit early. */
+		if (order == 0 || !pte_none(*pte)) {
+			update_mmu_tlb(vma, vmf->address, pte);
+			goto release;
+		}
+
+		/* Else try again, with a lower order. */
+		pte_unmap_unlock(vmf->pte, vmf->ptl);
+		folio_put(folio);
+		order--;
+		goto retry;
	}

	ret = check_stable_address_space(vma->vm_mm);
@@ -4143,14 +4263,16 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
		return handle_userfault(vmf, VM_UFFD_MISSING);
	}

-	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
-	folio_add_new_anon_rmap(folio, vma, vmf->address);
+	folio_ref_add(folio, pgcount - 1);
+
+	add_mm_counter(vma->vm_mm, MM_ANONPAGES, pgcount);
+	folio_add_new_anon_rmap_range(folio, &folio->page, pgcount, vma, addr);
	folio_add_lru_vma(folio, vma);
-setpte:
-	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
+
+	set_ptes(vma->vm_mm, addr, vmf->pte, entry, pgcount);

	/* No need to invalidate - it was non-present before */
-	update_mmu_cache(vma, vmf->address, vmf->pte);
+	update_mmu_cache_range(vma, addr, vmf->pte, pgcount);
 unlock:
	pte_unmap_unlock(vmf->pte, vmf->ptl);
	return ret;

From patchwork Fri Apr 14 13:02:53 2023
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, "Matthew Wilcox (Oracle)", Yu Zhao, "Yin, Fengwei"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: [RFC v2 PATCH 07/17] mm: Allow deferred splitting of arbitrary large anon folios
Date: Fri, 14 Apr 2023 14:02:53 +0100
Message-Id: <20230414130303.2345383-8-ryan.roberts@arm.com>
In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com>
References: <20230414130303.2345383-1-ryan.roberts@arm.com>
U2FsdGVkX1+rd4KHSjVnGVwmyOX4e6fTbgabUv3TTxYrvcYnwdUI0fUsweIcI/VHAhhb2Qr4hjjvX7kAfV6DhfnX3IXPk2Wkfd9Hmg80SOhLPQqAJTqmDoUX3a2k95Cj9/pTyFrdpQQ5ziC663599VNB/hdkafrH+gDkAByFX6mABwBiu7eSaWP3gO3LPHLQqsFBwwtBvNMopDFa+2l5450ViN4VZFlOV7S012YCsg5f7zgGnSZkYr2fHhxgkw4NwjElCQ69C1710AkU5e5Gb2WZD7R+2mS5Ej1E68Q8IXabKms0BwIf5Q4YdahWWCr1tEHygPJN82S3Y/juuQ8WZL4z7ciAe7LHLNhGxqTB2uRcZG66i76srKCaOWLq86VGSioqMrtDQu/l/ZNf8ritdP87ZKy1KLiTDEKc59CXBYyI1vI5xZy/jNNFKfqsTrWBlbP1GOLd+aaz78lyJtbb28l1PgZWI4+bh9gbEjp+mQHCo4oPJM5pPzkD5AoHiZ3+9xemye8gfNcvkM4x+XJECAqKKzmMGqdgU9+CoqzIQe/7WrIMPQNi3sxLcyampIPYbrNqwV20m2iTE/2jD3xKfkLKfaF4TD54DXWU3A3z/vsbymEtX4Oc2KW9yE8jwAQ4Z0oWcPT1koShLuCkjuY8QGyae01HAZEjH/voVPZqVPA1fVLnd1FFekyR3xwgTCKU71TmMh1q6sJaFv6tKAZXLNHL8rsD3dwqRoIzc9l/raTwWMRMJvw2b4iDMinI/RA1t9e7b37mIf5o+3cK2c0uZcm9q3HiTZI6ld+aQ/PZpEGhzUSJppt63mj4y/7Lmb5npKP9abz2ziUw2QetDTsgPAMKeML2r3zNH7fO+b75WrXHJCw6hlfedS3ll69g7goxn49FhrHSQbnU9UwET9H3KcgbSMSc774Q+YvCquRmjiSqpi/gfgo+AURcadFnF0WdnIeGoNw1OwElMlrO4Mr 4DceoEzN t3DtgV/6jhz1Q7lNrtqVCWYSlZe9gIO4L0FbDhGVUnI6IdaLarId32gJqkKB4Wbn6A6Z6a+tsqu986aT9FqiRgCAW0xdGlBsj6t5yJJzeMkBGxQNbmj3yGc+lxi95UgVdgsRXz8zS14zfTgWyEY8jEEbzk7oNR2e1SgQ9SKGk4l+6mOs= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: With the introduction of large folios for anonymous memory, we would like to be able to split them when they have unmapped subpages, in order to free those unused pages under memory pressure. So remove the artificial requirement that the large folio needed to be at least PMD-sized. Signed-off-by: Ryan Roberts --- mm/rmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- 2.25.1 diff --git a/mm/rmap.c b/mm/rmap.c index d563d979c005..5148a484f915 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1470,7 +1470,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, * page of the folio is unmapped and at least one page * is still mapped. 
*/ - if (folio_test_pmd_mappable(folio) && folio_test_anon(folio)) + if (folio_test_large(folio) && folio_test_anon(folio)) if (!compound || nr < nr_pmdmapped) deferred_split_folio(folio); } From patchwork Fri Apr 14 13:02:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13211460 From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , Yu Zhao , "Yin, Fengwei" Cc: Ryan Roberts , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org Subject: [RFC v2 PATCH 08/17] mm: Implement
folio_move_anon_rmap_range() Date: Fri, 14 Apr 2023 14:02:54 +0100 Message-Id: <20230414130303.2345383-9-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com> References: <20230414130303.2345383-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Queue-Id: 32BC21C003B X-Rspamd-Server: rspam01 X-Stat-Signature: wwqohtkacsdf8ab1mop4ngekfbxhmkw7 X-HE-Tag: 1681477409-598271 X-HE-Meta: U2FsdGVkX1/AiWvmBhUWb0fQav2HL9ITHBf8dKlDnzi05OHPgK5SOyj3pS6MMrIPYj6aR6u06Y3Gnck2lGO7m5xFq4e6Z2rkR9AfeL+unpxJPYWY7JZ/Td6QFHFu56r0L/2Ii12xu3HOODLygEoO/HDzD9FAQc7cfr1VNM9PDuszttJJfBjI7fkKB/SedmoXIR9rgoxJrRPc9SgWEMEFuUgrahp4V+VaAeJWIGq0bTybc6Up1DDmr89EpDKCLGCAAJt/7sax88ggfB4v3KUuakesdaL0S2Kw1L/thNPUDHZPNRvfeFCqGaBw22zd1chVY8W2xUcvzC7033Go4BbjU1QTKqYfL5LWaC4WnpFWjbC5wpcILD5CwLSTYDiB3+lPGF7hx9ZiW17o3k/1yd1C0o1UhntIclCin5QosoKxpz/VDS1ssYjdg6af7AnGMoT+iMIyZGesVYFq0PWzhn2CKuLJOWuzbLwJOI8hGOmRzapejShzQhHlJrz5C50VuE69he8tOe8wy6NPCOybr98FBhFr451mty9n7+w3yJmtF2teyFTYM7PrFswyOxlVKSUrt76gLkgyu2LzBwJ/nAUG4X791mPfpdcDJ70Hb+JLucGZFtW6/cEQr3i0gBQ5F5C4tsJ6DGWpi2S5DHim3Kr7pS4+ExuZGwBsC7usVdVCDeOe17QnP8xSclRNtvCOPrwC++fNv39Yzo2L3yuh3CSzmiL/WvS/JLP2Ylrj3OAm1ZAEhuISYHs2twqSCorsfHageJPXq3dxWcpP5zIAUH8VY14CaP+DNIMQjhO1jWldiUqOg6Z/780aUFdeBbQXfuQPgei3DeMjIdM/SGPx9/PG0aGlhcubDZYM7oTYrxns6jjLpJruzIs2lYvKLaoqzFr+V5jKF2+cs+1/FpSzVkH81duqWpF/Q7Tjwi6i+yD70nuNuc+aE5tORV/PrHq9KGLozNgo278xKe/varDFbhJ ElvNqDJ+ NPjwRBwwYKy33YOmc6xRQjQFeIG/tGTfPdrbnyYefyU0DL1PiisOp0RsUzAgXemB9OwTVOBT8mkk1twJwNuqotr3sNofW8+MgYkr7R5vgR9rs8mFXTVpqyhRIMT685Axn/hsB6oQ/uQ4bdECwHPv6QQnOHAcy9SxCE71znNATW3gtxAOt2s6hcuMjLla+DYu2g9Gi4cVYV5Dj+vBeXddupGHWRNLa87IUcXRZ X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Similar to page_move_anon_rmap() except it can batch-move a range of pages within a folio for increased efficiency. Will be used to enable reusing multiple pages from a large anonymous folio in one go. Signed-off-by: Ryan Roberts --- include/linux/rmap.h | 2 ++ mm/rmap.c | 40 ++++++++++++++++++++++++++++++---------- 2 files changed, 32 insertions(+), 10 deletions(-) -- 2.25.1 diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 5c707f53d7b5..8cb0ba48d58f 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -190,6 +190,8 @@ typedef int __bitwise rmap_t; * rmap interfaces called when adding or removing pte of page */ void page_move_anon_rmap(struct page *, struct vm_area_struct *); +void folio_move_anon_rmap_range(struct folio *folio, struct page *page, + int nr, struct vm_area_struct *vma); void page_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long address, rmap_t flags); void page_add_new_anon_rmap(struct page *, struct vm_area_struct *, diff --git a/mm/rmap.c b/mm/rmap.c index 5148a484f915..1cd8fb0b929f 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1103,19 +1103,22 @@ int folio_total_mapcount(struct folio *folio) } /** - * page_move_anon_rmap - move a page to our anon_vma - * @page: the page to move to our anon_vma - * @vma: the vma the page belongs to + * folio_move_anon_rmap_range - batch-move a range of pages within a folio to + * our anon_vma; a more efficient version of page_move_anon_rmap(). 
+ * @folio: folio that owns the range of pages + * @page: the first page to move to our anon_vma + * @nr: number of pages to move to our anon_vma + * @vma: the vma the page belongs to * - * When a page belongs exclusively to one process after a COW event, - * that page can be moved into the anon_vma that belongs to just that - * process, so the rmap code will not search the parent or sibling - * processes. + * When a range of pages belongs exclusively to one process after a COW event, + * those pages can be moved into the anon_vma that belongs to just that process, + * so the rmap code will not search the parent or sibling processes. */ -void page_move_anon_rmap(struct page *page, struct vm_area_struct *vma) +void folio_move_anon_rmap_range(struct folio *folio, struct page *page, + int nr, struct vm_area_struct *vma) { void *anon_vma = vma->anon_vma; - struct folio *folio = page_folio(page); + int i; VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio); VM_BUG_ON_VMA(!anon_vma, vma); @@ -1127,7 +1130,24 @@ void page_move_anon_rmap(struct page *page, struct vm_area_struct *vma) * folio_test_anon()) will not see one without the other. */ WRITE_ONCE(folio->mapping, anon_vma); - SetPageAnonExclusive(page); + + for (i = 0; i < nr; i++) + SetPageAnonExclusive(page++); +} + +/** + * page_move_anon_rmap - move a page to our anon_vma + * @page: the page to move to our anon_vma + * @vma: the vma the page belongs to + * + * When a page belongs exclusively to one process after a COW event, + * that page can be moved into the anon_vma that belongs to just that + * process, so the rmap code will not search the parent or sibling + * processes. + */ +void page_move_anon_rmap(struct page *page, struct vm_area_struct *vma) +{ + folio_move_anon_rmap_range(page_folio(page), page, 1, vma); } /** From patchwork Fri Apr 14 13:02:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13211461 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A05F8C77B6E for ; Fri, 14 Apr 2023 13:03:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9D146900005; Fri, 14 Apr 2023 09:03:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8BBF3900002; Fri, 14 Apr 2023 09:03:32 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6E821900005; Fri, 14 Apr 2023 09:03:32 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 5C670900002 for ; Fri, 14 Apr 2023 09:03:32 -0400 (EDT) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 2D160AAF52 for ; Fri, 14 Apr 2023 13:03:32 +0000 (UTC) X-FDA: 80680013064.15.8882DE6 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf21.hostedemail.com (Postfix) with ESMTP id 5BE001C0011 for ; Fri, 14 Apr 2023 13:03:30 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=none; spf=pass (imf21.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; 
From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , Yu Zhao , "Yin, Fengwei" Cc: Ryan Roberts , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org Subject: [RFC v2 PATCH 09/17] mm: Update wp_page_reuse() to operate on range of pages Date: Fri, 14 Apr 2023 14:02:55 +0100 Message-Id: <20230414130303.2345383-10-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com> References: <20230414130303.2345383-1-ryan.roberts@arm.com> MIME-Version: 1.0
We will shortly be updating do_wp_page() to be able to reuse a range of pages from a
large anon folio. As an enabling step, modify wp_page_reuse() to operate on a range of pages, if a struct anon_folio_range is passed in. Batching in this way allows us to batch up the cache maintenance and event counting for small performance improvements. Currently all callsites pass range=NULL, so no functional changes intended. Signed-off-by: Ryan Roberts --- mm/memory.c | 80 +++++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 60 insertions(+), 20 deletions(-) -- 2.25.1 diff --git a/mm/memory.c b/mm/memory.c index f92a28064596..83835ff5a818 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3030,6 +3030,14 @@ static inline int max_anon_folio_order(struct vm_area_struct *vma) return ANON_FOLIO_ORDER_MAX; } +struct anon_folio_range { + unsigned long va_start; + pte_t *pte_start; + struct page *pg_start; + int nr; + bool exclusive; +}; + /* * Returns index of first pte that is not none, or nr if all are none. */ @@ -3122,31 +3130,63 @@ static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order) * case, all we need to do here is to mark the page as writable and update * any related book-keeping. */ -static inline void wp_page_reuse(struct vm_fault *vmf) +static inline void wp_page_reuse(struct vm_fault *vmf, + struct anon_folio_range *range) __releases(vmf->ptl) { struct vm_area_struct *vma = vmf->vma; - struct page *page = vmf->page; + unsigned long addr; + pte_t *pte; + struct page *page; + int nr; pte_t entry; + int change = 0; + int i; VM_BUG_ON(!(vmf->flags & FAULT_FLAG_WRITE)); - VM_BUG_ON(page && PageAnon(page) && !PageAnonExclusive(page)); - /* - * Clear the pages cpupid information as the existing - * information potentially belongs to a now completely - * unrelated process. - */ - if (page) - page_cpupid_xchg_last(page, (1 << LAST_CPUPID_SHIFT) - 1); + if (range) { + addr = range->va_start; + pte = range->pte_start; + page = range->pg_start; + nr = range->nr; + } else { + addr = vmf->address; + pte = vmf->pte; + page = vmf->page; + nr = 1; + } + + if (page) { + for (i = 0; i < nr; i++, page++) { + VM_BUG_ON(PageAnon(page) && !PageAnonExclusive(page)); + + /* + * Clear the pages cpupid information as the existing + * information potentially belongs to a now completely + * unrelated process. 
+ */ + page_cpupid_xchg_last(page, + (1 << LAST_CPUPID_SHIFT) - 1); + } + } + + flush_cache_range(vma, addr, addr + (nr << PAGE_SHIFT)); + + for (i = 0; i < nr; i++) { + entry = pte_mkyoung(pte[i]); + entry = maybe_mkwrite(pte_mkdirty(entry), vma); + change |= ptep_set_access_flags(vma, + addr + (i << PAGE_SHIFT), + pte + i, + entry, 1); + } + + if (change) + update_mmu_cache_range(vma, addr, pte, nr); - flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte)); - entry = pte_mkyoung(vmf->orig_pte); - entry = maybe_mkwrite(pte_mkdirty(entry), vma); - if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1)) - update_mmu_cache(vma, vmf->address, vmf->pte); pte_unmap_unlock(vmf->pte, vmf->ptl); - count_vm_event(PGREUSE); + count_vm_events(PGREUSE, nr); } /* @@ -3359,7 +3399,7 @@ vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf) pte_unmap_unlock(vmf->pte, vmf->ptl); return VM_FAULT_NOPAGE; } - wp_page_reuse(vmf); + wp_page_reuse(vmf, NULL); return 0; } @@ -3381,7 +3421,7 @@ static vm_fault_t wp_pfn_shared(struct vm_fault *vmf) return ret; return finish_mkwrite_fault(vmf); } - wp_page_reuse(vmf); + wp_page_reuse(vmf, NULL); return 0; } @@ -3410,7 +3450,7 @@ static vm_fault_t wp_page_shared(struct vm_fault *vmf) return tmp; } } else { - wp_page_reuse(vmf); + wp_page_reuse(vmf, NULL); lock_page(vmf->page); } ret |= fault_dirty_shared_page(vmf); @@ -3534,7 +3574,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) pte_unmap_unlock(vmf->pte, vmf->ptl); return 0; } - wp_page_reuse(vmf); + wp_page_reuse(vmf, NULL); return 0; } copy: From patchwork Fri Apr 14 13:02:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13211462 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29EA4C77B6E for ; Fri, 14 Apr 2023 13:03:47 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C3F87900002; Fri, 14 Apr 2023 09:03:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BC8D5280001; Fri, 14 Apr 2023 09:03:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A1EA3900006; Fri, 14 Apr 2023 09:03:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 77C59900002 for ; Fri, 14 Apr 2023 09:03:34 -0400 (EDT) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 8FF48802AF for ; Fri, 14 Apr 2023 13:03:33 +0000 (UTC) X-FDA: 80680013106.17.7D7A464 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf18.hostedemail.com (Postfix) with ESMTP id B94C21C002A for ; Fri, 14 Apr 2023 13:03:31 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=none; spf=pass (imf18.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1681477412; a=rsa-sha256; cv=none; b=NZ5Z8yeAJI8LMv1r9W4RNq49kJXNBfnDgy7dcU4jI3j04xmsFyqO2EbujGS5n0eby68wfJ RQmZYjuLuJqbnblvt4jo/1qhSh8ddGhoLK+/jpFHRaK1Rb9S3y3xWiY8SSS4wc1IiKtWR6 y4uquuHX0KZdi2V2V4pukCMBpoERe7E= ARC-Authentication-Results: i=1; 
From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , Yu Zhao , "Yin, Fengwei" Cc: Ryan Roberts , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org Subject: [RFC v2 PATCH 10/17] mm: Reuse large folios for anonymous memory Date: Fri, 14 Apr 2023 14:02:56 +0100 Message-Id: <20230414130303.2345383-11-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com> References: <20230414130303.2345383-1-ryan.roberts@arm.com> MIME-Version: 1.0
When taking a write fault on an anonymous page, attempt to reuse as much of the folio as possible if it is exclusive to the process. This avoids a problem where an exclusive, PTE-mapped THP would previously have all of its pages except the last one CoWed, then the last page would be reused, causing the whole original folio to hang around as well as all the CoWed pages.
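To make the intended reuse calculation concrete, here is a minimal, self-contained sketch of the idea (not kernel code: the toy_pte type, calc_reuse_range() and the example values are invented for illustration). It walks outwards from the faulting entry and stops at the first entry that is absent, already writable or not physically contiguous, which mirrors how the candidate reuse range is bounded before the exclusivity check:

    #include <stdio.h>
    #include <stdbool.h>

    struct toy_pte {
            bool present;
            bool write;
            unsigned long pfn;
    };

    /*
     * Walk outwards from 'fault' while entries stay present, read-only and
     * physically contiguous; [*start, *end) is the candidate reuse range.
     */
    static void calc_reuse_range(const struct toy_pte *ptes, int nr, int fault,
                                 int *start, int *end)
    {
            int i;

            *start = fault;
            *end = fault + 1;

            for (i = fault - 1; i >= 0; i--) {
                    if (!ptes[i].present || ptes[i].write ||
                        ptes[i].pfn + 1 != ptes[i + 1].pfn)
                            break;
                    *start = i;
            }

            for (i = fault + 1; i < nr; i++) {
                    if (!ptes[i].present || ptes[i].write ||
                        ptes[i].pfn != ptes[i - 1].pfn + 1)
                            break;
                    *end = i + 1;
            }
    }

    int main(void)
    {
            /*
             * Four entries backed by pfns 100..103; entry 2 is already
             * writable, so it terminates the candidate range.
             */
            struct toy_pte ptes[4] = {
                    { true, false, 100 }, { true, false, 101 },
                    { true, true,  102 }, { true, false, 103 },
            };
            int start, end;

            calc_reuse_range(ptes, 4, 1, &start, &end);
            printf("candidate reuse range: ptes [%d, %d)\n", start, end);
            return 0;
    }

In the patch itself the range is additionally clamped to the folio, the VMA and the containing PMD, and it is only reused if the folio's reference count matches the number of PTEs found.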
This problem is exacerbated now that we are allocating variable-order folios for anonymous memory. The reason for this behaviour is that a PTE-mapped THP has a reference for each PTE and the old code thought that meant it was not exclusively mapped, and therefore could not be reused. We now take care to find the region that intersects the underlying folio, the VMA and the PMD entry, and treat the presence of exactly that number of references as indicating exclusivity. Note that we are not guaranteed that this region will cover the whole folio due to munmap and mremap. The aim is to reuse as much as possible in one go in order to: - reduce memory consumption - reduce number of CoWs - reduce time spent in fault handler Signed-off-by: Ryan Roberts --- mm/memory.c | 169 +++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 160 insertions(+), 9 deletions(-) -- 2.25.1 diff --git a/mm/memory.c b/mm/memory.c index 83835ff5a818..7e2af54fe2e0 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3038,6 +3038,26 @@ struct anon_folio_range { bool exclusive; }; +static inline unsigned long page_addr(struct page *page, + struct page *anchor, unsigned long anchor_addr) +{ + unsigned long offset; + unsigned long addr; + + offset = (page_to_pfn(page) - page_to_pfn(anchor)) << PAGE_SHIFT; + addr = anchor_addr + offset; + + if (anchor > page) { + if (addr > anchor_addr) + return 0; + } else { + if (addr < anchor_addr) + return ULONG_MAX; + } + + return addr; +} + /* * Returns index of first pte that is not none, or nr if all are none. */ @@ -3122,6 +3142,122 @@ static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order) return order; } +static void calc_anon_folio_range_reuse(struct vm_fault *vmf, + struct folio *folio, + struct anon_folio_range *range_out) +{ + /* + * The aim here is to determine the biggest range of pages that can be + * reused for this CoW fault if the identified range is responsible for + * all the references on the folio (i.e. it is exclusive) such that: + * - All pages are contained within folio + * - All pages are within VMA + * - All pages are within the same pmd entry as vmf->address + * - vmf->page is contained within the range + * - All covered ptes must be present, physically contiguous and RO + * + * Note that the folio itself may not be naturally aligned in VA space + * due to mremap. We take the largest range we can in order to increase + * our chances of being the exclusive user of the folio, therefore + * meaning we can reuse. Its possible that the folio crosses a pmd + * boundary, in which case we don't follow it into the next pte because + * this complicates the locking. + * + * Note that the caller may or may not choose to lock the pte. If + * unlocked, the calculation should be considered an estimate that will + * need to be validated under the lock. + */ + + struct vm_area_struct *vma = vmf->vma; + struct page *page; + pte_t *ptep; + pte_t pte; + bool excl = true; + unsigned long start, end; + int bloops, floops; + int i; + unsigned long pfn; + + /* + * Iterate backwards, starting with the page immediately before the + * anchor page. On exit from the loop, start is the inclusive start + * virtual address of the range.
+ */ + + start = page_addr(&folio->page, vmf->page, vmf->address); + start = max(start, vma->vm_start); + start = max(start, ALIGN_DOWN(vmf->address, PMD_SIZE)); + bloops = (vmf->address - start) >> PAGE_SHIFT; + + page = vmf->page - 1; + ptep = vmf->pte - 1; + pfn = page_to_pfn(vmf->page) - 1; + + for (i = 0; i < bloops; i++) { + pte = *ptep; + + if (!pte_present(pte) || + pte_write(pte) || + pte_protnone(pte) || + pte_pfn(pte) != pfn) { + start = vmf->address - (i << PAGE_SHIFT); + break; + } + + if (excl && !PageAnonExclusive(page)) + excl = false; + + pfn--; + ptep--; + page--; + } + + /* + * Iterate forward, starting with the anchor page. On exit from the + * loop, end is the exclusive end virtual address of the range. + */ + + end = page_addr(&folio->page + folio_nr_pages(folio), + vmf->page, vmf->address); + end = min(end, vma->vm_end); + end = min(end, ALIGN_DOWN(vmf->address, PMD_SIZE) + PMD_SIZE); + floops = (end - vmf->address) >> PAGE_SHIFT; + + page = vmf->page; + ptep = vmf->pte; + pfn = page_to_pfn(vmf->page); + + for (i = 0; i < floops; i++) { + pte = *ptep; + + if (!pte_present(pte) || + pte_write(pte) || + pte_protnone(pte) || + pte_pfn(pte) != pfn) { + end = vmf->address + (i << PAGE_SHIFT); + break; + } + + if (excl && !PageAnonExclusive(page)) + excl = false; + + pfn++; + ptep++; + page++; + } + + /* + * Fixup vmf to point to the start of the range, and return number of + * pages in range. + */ + + range_out->va_start = start; + range_out->pg_start = vmf->page - ((vmf->address - start) >> PAGE_SHIFT); + range_out->pte_start = vmf->pte - ((vmf->address - start) >> PAGE_SHIFT); + range_out->nr = (end - start) >> PAGE_SHIFT; + range_out->exclusive = excl; +} + /* * Handle write page faults for pages that can be reused in the current vma * @@ -3528,13 +3664,23 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) /* * Private mapping: create an exclusive anonymous page copy if reuse * is impossible. We might miss VM_WRITE for FOLL_FORCE handling. + * For anonymous memory, we attempt to copy/reuse in folios rather than + * page-by-page. We always prefer reuse above copy, even if we can only + * reuse a subset of the folio. Note that when reusing pages in a folio, + * due to munmap, mremap and friends, the folio isn't guarranteed to be + * naturally aligned in virtual memory space. */ if (folio && folio_test_anon(folio)) { + struct anon_folio_range range; + int swaprefs; + + calc_anon_folio_range_reuse(vmf, folio, &range); + /* - * If the page is exclusive to this process we must reuse the - * page without further checks. + * If the pages have already been proven to be exclusive to this + * process we must reuse the pages without further checks. */ - if (PageAnonExclusive(vmf->page)) + if (range.exclusive) goto reuse; /* @@ -3544,7 +3690,10 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) * * KSM doesn't necessarily raise the folio refcount. */ - if (folio_test_ksm(folio) || folio_ref_count(folio) > 3) + swaprefs = folio_test_swapcache(folio) ? + folio_nr_pages(folio) : 0; + if (folio_test_ksm(folio) || + folio_ref_count(folio) > range.nr + swaprefs + 1) goto copy; if (!folio_test_lru(folio)) /* @@ -3552,29 +3701,31 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) * remote LRU pagevecs or references to LRU folios. 
*/ lru_add_drain(); - if (folio_ref_count(folio) > 1 + folio_test_swapcache(folio)) + if (folio_ref_count(folio) > range.nr + swaprefs) goto copy; if (!folio_trylock(folio)) goto copy; if (folio_test_swapcache(folio)) folio_free_swap(folio); - if (folio_test_ksm(folio) || folio_ref_count(folio) != 1) { + if (folio_test_ksm(folio) || + folio_ref_count(folio) != range.nr) { folio_unlock(folio); goto copy; } /* - * Ok, we've got the only folio reference from our mapping + * Ok, we've got the only folio references from our mapping * and the folio is locked, it's dark out, and we're wearing * sunglasses. Hit it. */ - page_move_anon_rmap(vmf->page, vma); + folio_move_anon_rmap_range(folio, range.pg_start, + range.nr, vma); folio_unlock(folio); reuse: if (unlikely(unshare)) { pte_unmap_unlock(vmf->pte, vmf->ptl); return 0; } - wp_page_reuse(vmf, NULL); + wp_page_reuse(vmf, &range); return 0; } copy: From patchwork Fri Apr 14 13:02:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13211463 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2F11C77B72 for ; Fri, 14 Apr 2023 13:03:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AB8CF280002; Fri, 14 Apr 2023 09:03:35 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A42F5280001; Fri, 14 Apr 2023 09:03:35 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8BBA1280002; Fri, 14 Apr 2023 09:03:35 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 6E980280001 for ; Fri, 14 Apr 2023 09:03:35 -0400 (EDT) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id CD98880282 for ; Fri, 14 Apr 2023 13:03:34 +0000 (UTC) X-FDA: 80680013148.27.57C4156 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf21.hostedemail.com (Postfix) with ESMTP id D67C91C000E for ; Fri, 14 Apr 2023 13:03:32 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=none; spf=pass (imf21.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1681477413; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kmVsoaJZusWox27eQw7cLPcmlW527fQj6sB0NN1MuqM=; b=FhwS/dhcJmKLO+e5sfnmTDRBoFlUw8RNZwD7ucRYSC+kIPW+H/Kb1KPqZpdyb2vxAAZ10a YFQOhCweoef05/ZuwG5XTPY/sJxDF9G5bufmpjpX7fqslyAm778c0SUS0JR6pRsLG3pkyW xaawZKUSnHGXbcajHmOCNN6qPwRdiq8= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=none; spf=pass (imf21.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1681477413; a=rsa-sha256; cv=none; 
From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , Yu Zhao , "Yin, Fengwei" Cc: Ryan Roberts , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org Subject: [RFC v2 PATCH 11/17] mm: Split __wp_page_copy_user() into 2 variants Date: Fri, 14 Apr 2023 14:02:57 +0100 Message-Id: <20230414130303.2345383-12-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com> References: <20230414130303.2345383-1-ryan.roberts@arm.com> MIME-Version: 1.0
We will soon support CoWing large folios, so we will need support for copying a contiguous range of pages in the case where there is a source folio. Therefore, split __wp_page_copy_user() into 2 variants: __wp_page_copy_user_pfn() copies a single pfn to a destination page. This is used when CoWing from a source without a folio, and is always only a single page copy. __wp_page_copy_user_range() copies a range of pages from source to destination and is used when the source has an underlying folio. For now it is only used to copy a single page, but this will change in a future commit. In both cases, kmsan_copy_page_meta() is moved into these helper functions so that the caller does not need to be concerned with calling it multiple times for the range case. No functional changes intended.
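As a rough, standalone illustration of the shape of the range variant (the buffer-based copy_one()/copy_range() helpers below are invented stand-ins for copy_mc_user_highpage() and are not the kernel API), the loop copies page-sized chunks and backs out on the first failure:

    #include <stdio.h>
    #include <string.h>

    #define TOY_PAGE_SIZE 16

    /* Stand-in for a per-page copy that can fail (e.g. poisoned source). */
    static int copy_one(char *dst, const char *src)
    {
            memcpy(dst, src, TOY_PAGE_SIZE);
            return 0;
    }

    /* Copy 'nr' contiguous pages; report an error on the first failure. */
    static int copy_range(char *dst, const char *src, int nr)
    {
            for (; nr != 0; nr--, dst += TOY_PAGE_SIZE, src += TOY_PAGE_SIZE) {
                    if (copy_one(dst, src))
                            return -1;
            }
            return 0;
    }

    int main(void)
    {
            char src[4 * TOY_PAGE_SIZE] = "contents of a four page folio..";
            char dst[4 * TOY_PAGE_SIZE] = { 0 };

            if (copy_range(dst, src, 4) == 0)
                    printf("copied: %s\n", dst);
            return 0;
    }

The pfn variant keeps the existing single-page, best-effort copy path for sources that have no struct page.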
Signed-off-by: Ryan Roberts --- mm/memory.c | 41 +++++++++++++++++++++++++++++------------ 1 file changed, 29 insertions(+), 12 deletions(-) -- 2.25.1 diff --git a/mm/memory.c b/mm/memory.c index 7e2af54fe2e0..f2b7cfb2efc0 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2786,14 +2786,34 @@ static inline int pte_unmap_same(struct vm_fault *vmf) return same; } +/* + * Return: + * 0: copied succeeded + * -EHWPOISON: copy failed due to hwpoison in source page + */ +static inline int __wp_page_copy_user_range(struct page *dst, struct page *src, + int nr, unsigned long addr, + struct vm_area_struct *vma) +{ + for (; nr != 0; nr--, dst++, src++, addr += PAGE_SIZE) { + if (copy_mc_user_highpage(dst, src, addr, vma)) { + memory_failure_queue(page_to_pfn(src), 0); + return -EHWPOISON; + } + kmsan_copy_page_meta(dst, src); + } + + return 0; +} + /* * Return: * 0: copied succeeded * -EHWPOISON: copy failed due to hwpoison in source page * -EAGAIN: copied failed (some other reason) */ -static inline int __wp_page_copy_user(struct page *dst, struct page *src, - struct vm_fault *vmf) +static inline int __wp_page_copy_user_pfn(struct page *dst, + struct vm_fault *vmf) { int ret; void *kaddr; @@ -2803,14 +2823,6 @@ static inline int __wp_page_copy_user(struct page *dst, struct page *src, struct mm_struct *mm = vma->vm_mm; unsigned long addr = vmf->address; - if (likely(src)) { - if (copy_mc_user_highpage(dst, src, addr, vma)) { - memory_failure_queue(page_to_pfn(src), 0); - return -EHWPOISON; - } - return 0; - } - /* * If the source page was a PFN mapping, we don't have * a "struct page" for it. We do a best-effort copy by @@ -2879,6 +2891,7 @@ static inline int __wp_page_copy_user(struct page *dst, struct page *src, } } + kmsan_copy_page_meta(dst, NULL); ret = 0; pte_unlock: @@ -3372,7 +3385,12 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) if (!new_folio) goto oom; - ret = __wp_page_copy_user(&new_folio->page, vmf->page, vmf); + if (likely(old_folio)) + ret = __wp_page_copy_user_range(&new_folio->page, + vmf->page, + 1, vmf->address, vma); + else + ret = __wp_page_copy_user_pfn(&new_folio->page, vmf); if (ret) { /* * COW failed, if the fault was solved by other, @@ -3388,7 +3406,6 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) delayacct_wpcopy_end(); return ret == -EHWPOISON ? 
VM_FAULT_HWPOISON : 0; } - kmsan_copy_page_meta(&new_folio->page, vmf->page); } if (mem_cgroup_charge(new_folio, mm, GFP_KERNEL)) From patchwork Fri Apr 14 13:02:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13211464 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62E38C77B76 for ; Fri, 14 Apr 2023 13:03:52 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C8D7C280003; Fri, 14 Apr 2023 09:03:36 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C1749280001; Fri, 14 Apr 2023 09:03:36 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A42CF280003; Fri, 14 Apr 2023 09:03:36 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 7A790280001 for ; Fri, 14 Apr 2023 09:03:36 -0400 (EDT) Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 5A0EC1C5D44 for ; Fri, 14 Apr 2023 13:03:36 +0000 (UTC) X-FDA: 80680013232.25.82DD15F Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf18.hostedemail.com (Postfix) with ESMTP id E32161C002F for ; Fri, 14 Apr 2023 13:03:33 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=none; spf=pass (imf18.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1681477414; a=rsa-sha256; cv=none; b=w00nxDLcUBAz0eMpDne0JIODxQlroGOGLJplhXyq2lC78yZ9xmb4Pey2phtYvVcRePnov9 Q9Q8hhaSek6O1vGJa3VvTRe9NZpJuSaN5LWnDMw9rRL9zBGSUOz3ZDIuzf49NzG1lBHJyL nFtJFKL/s5uPZ46B4WvvVkGNdKhTh/Q= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=none; spf=pass (imf18.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1681477414; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=32I9Jpfj7syNSZBFmF24OmcYCe67NGKuiEzjjsNeU1Q=; b=u5wLmWyKaXu0gv3EO2IM1oDzjKCukGuWkmZQWU8DmOARgr0U56MhqgN8riogIj8iUChLDN sTZWr33+7ir3K7Bqhm1NAY9A9yo01eCPFIyrNvXWIes076eiTiVlQ0onL5fItE/Sw0O8FI wr7a2/3byXeJGXpiVs7t0v6LnycR8TQ= Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1C74816F8; Fri, 14 Apr 2023 06:04:18 -0700 (PDT) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.26]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 980903F6C4; Fri, 14 Apr 2023 06:03:32 -0700 (PDT) From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , Yu Zhao , "Yin, Fengwei" Cc: Ryan Roberts , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org Subject: [RFC v2 PATCH 12/17] mm: ptep_clear_flush_range_notify() macro for batch operation Date: Fri, 14 
Apr 2023 14:02:58 +0100 Message-Id: <20230414130303.2345383-13-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com> References: <20230414130303.2345383-1-ryan.roberts@arm.com> MIME-Version: 1.0
We will soon add support for CoWing large anonymous folios, so create a ranged version of the ptep_clear_flush_notify() macro in preparation for that. It is able to call mmu_notifier_invalidate_range() once for the entire range, but still calls ptep_clear_flush() per page since there is no arch support for a batched version of this API yet. No functional change intended.
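The batching pattern the macro follows is roughly the one sketched below (a standalone model: clear_one() and notify_range() are invented stand-ins for ptep_clear_flush() and mmu_notifier_invalidate_range()); the per-page work stays in a loop while the notifier is called once for the whole range:

    #include <stdio.h>

    #define TOY_PAGE_SHIFT 12

    /* Stand-in for the per-pte clear + TLB flush. */
    static void clear_one(unsigned long addr)
    {
            printf("clear+flush pte at 0x%lx\n", addr);
    }

    /* Stand-in for the single range-wide mmu notifier invalidation. */
    static void notify_range(unsigned long start, unsigned long end)
    {
            printf("invalidate [0x%lx, 0x%lx)\n", start, end);
    }

    static void clear_flush_range_notify(unsigned long addr, int nr)
    {
            int i;

            for (i = 0; i < nr; i++)
                    clear_one(addr + ((unsigned long)i << TOY_PAGE_SHIFT));

            /* One notifier call covers all nr pages. */
            notify_range(addr, addr + ((unsigned long)nr << TOY_PAGE_SHIFT));
    }

    int main(void)
    {
            clear_flush_range_notify(0x400000, 4);
            return 0;
    }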
Signed-off-by: Ryan Roberts --- include/linux/mmu_notifier.h | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) -- 2.25.1 diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h index 64a3e051c3c4..527aa89959b4 100644 --- a/include/linux/mmu_notifier.h +++ b/include/linux/mmu_notifier.h @@ -595,6 +595,24 @@ static inline void mmu_notifier_range_init_owner( ___pte; \ }) +#define ptep_clear_flush_range_notify(__vma, __address, __ptep, __nr) \ +({ \ + struct vm_area_struct *___vma = (__vma); \ + unsigned long ___addr = (__address) & PAGE_MASK; \ + pte_t *___ptep = (__ptep); \ + int ___nr = (__nr); \ + struct mm_struct *___mm = ___vma->vm_mm; \ + int ___i; \ + \ + for (___i = 0; ___i < ___nr; ___i++) \ + ptep_clear_flush(___vma, \ + ___addr + (___i << PAGE_SHIFT), \ + ___ptep + ___i); \ + \ + mmu_notifier_invalidate_range(___mm, ___addr, \ + ___addr + (___nr << PAGE_SHIFT)); \ +}) + #define pmdp_huge_clear_flush_notify(__vma, __haddr, __pmd) \ ({ \ unsigned long ___haddr = __haddr & HPAGE_PMD_MASK; \ @@ -736,6 +754,19 @@ static inline void mmu_notifier_subscriptions_destroy(struct mm_struct *mm) #define ptep_clear_young_notify ptep_test_and_clear_young #define pmdp_clear_young_notify pmdp_test_and_clear_young #define ptep_clear_flush_notify ptep_clear_flush +#define ptep_clear_flush_range_notify(__vma, __address, __ptep, __nr) \ +({ \ + struct vm_area_struct *___vma = (__vma); \ + unsigned long ___addr = (__address) & PAGE_MASK; \ + pte_t *___ptep = (__ptep); \ + int ___nr = (__nr); \ + int ___i; \ + \ + for (___i = 0; ___i < ___nr; ___i++) \ + ptep_clear_flush(___vma, \ + ___addr + (___i << PAGE_SHIFT), \ + ___ptep + ___i); \ +}) #define pmdp_huge_clear_flush_notify pmdp_huge_clear_flush #define pudp_huge_clear_flush_notify pudp_huge_clear_flush #define set_pte_at_notify set_pte_at From patchwork Fri Apr 14 13:02:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13211465 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E728C77B72 for ; Fri, 14 Apr 2023 13:03:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4C610280004; Fri, 14 Apr 2023 09:03:38 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 44F08280001; Fri, 14 Apr 2023 09:03:38 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2A4A4280004; Fri, 14 Apr 2023 09:03:38 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 11857280001 for ; Fri, 14 Apr 2023 09:03:38 -0400 (EDT) Received: from smtpin23.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id D300B120241 for ; Fri, 14 Apr 2023 13:03:37 +0000 (UTC) X-FDA: 80680013274.23.B9A8BD2 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf21.hostedemail.com (Postfix) with ESMTP id 3141E1C000B for ; Fri, 14 Apr 2023 13:03:35 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=none; spf=pass (imf21.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com 
From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , Yu Zhao , "Yin, Fengwei" Cc: Ryan Roberts , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org Subject: [RFC v2 PATCH 13/17] mm: Implement folio_remove_rmap_range() Date: Fri, 14 Apr 2023 14:02:59 +0100 Message-Id: <20230414130303.2345383-14-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com> References: <20230414130303.2345383-1-ryan.roberts@arm.com> MIME-Version: 1.0
Like page_remove_rmap() but batch-removes
the rmap for a range of pages belonging to a folio, for efficiency savings. All pages are accounted as small pages. Signed-off-by: Ryan Roberts --- include/linux/rmap.h | 2 ++ mm/rmap.c | 62 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+) -- 2.25.1 diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 8cb0ba48d58f..7daf25887049 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -204,6 +204,8 @@ void page_add_file_rmap(struct page *, struct vm_area_struct *, bool compound); void page_remove_rmap(struct page *, struct vm_area_struct *, bool compound); +void folio_remove_rmap_range(struct folio *folio, struct page *page, + int nr, struct vm_area_struct *vma); void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long address, rmap_t flags); diff --git a/mm/rmap.c b/mm/rmap.c index 1cd8fb0b929f..954e44054d5c 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1419,6 +1419,68 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma, mlock_vma_folio(folio, vma, compound); } +/** + * folio_remove_rmap_range - take down pte mappings from a range of pages + * belonging to a folio. All pages are accounted as small pages. + * @folio: folio that all pages belong to + * @page: first page in range to remove mapping from + * @nr: number of pages in range to remove mapping from + * @vma: the vm area from which the mapping is removed + * + * The caller needs to hold the pte lock. + */ +void folio_remove_rmap_range(struct folio *folio, struct page *page, + int nr, struct vm_area_struct *vma) +{ + atomic_t *mapped = &folio->_nr_pages_mapped; + int nr_unmapped = 0; + int nr_mapped; + bool last; + enum node_stat_item idx; + + VM_BUG_ON_FOLIO(folio_test_hugetlb(folio), folio); + + if (!folio_test_large(folio)) { + /* Is this the page's last map to be removed? */ + last = atomic_add_negative(-1, &page->_mapcount); + nr_unmapped = last; + } else { + for (; nr != 0; nr--, page++) { + /* Is this the page's last map to be removed? */ + last = atomic_add_negative(-1, &page->_mapcount); + if (last) { + /* Page still mapped if folio mapped entirely */ + nr_mapped = atomic_dec_return_relaxed(mapped); + if (nr_mapped < COMPOUND_MAPPED) + nr_unmapped++; + } + } + } + + if (nr_unmapped) { + idx = folio_test_anon(folio) ? NR_ANON_MAPPED : NR_FILE_MAPPED; + __lruvec_stat_mod_folio(folio, idx, -nr_unmapped); + + /* + * Queue anon THP for deferred split if we have just unmapped at + * least 1 page, while at least 1 page remains mapped. + */ + if (folio_test_large(folio) && folio_test_anon(folio)) + if (nr_mapped) + deferred_split_folio(folio); + } + + /* + * It would be tidy to reset folio_test_anon mapping when fully + * unmapped, but that might overwrite a racing page_add_anon_rmap + * which increments mapcount after us but sets mapping before us: + * so leave the reset to free_pages_prepare, and remember that + * it's only reliable while mapped.
+ */ + + munlock_vma_folio(folio, vma, false); +} + /** * page_remove_rmap - take down pte mapping from a page * @page: page to remove mapping from From patchwork Fri Apr 14 13:03:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13211466 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 26555C77B72 for ; Fri, 14 Apr 2023 13:03:58 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 38CFF280005; Fri, 14 Apr 2023 09:03:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3168D280001; Fri, 14 Apr 2023 09:03:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 141B2280005; Fri, 14 Apr 2023 09:03:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id EC2A9280001 for ; Fri, 14 Apr 2023 09:03:38 -0400 (EDT) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id CC1DE402A2 for ; Fri, 14 Apr 2023 13:03:38 +0000 (UTC) X-FDA: 80680013316.02.9A4E507 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf18.hostedemail.com (Postfix) with ESMTP id 9349F1C0004 for ; Fri, 14 Apr 2023 13:03:36 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=none; spf=pass (imf18.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1681477417; a=rsa-sha256; cv=none; b=kJVIH9cwdaR027HiEWeto5dLH33zDsuaP/POPTvA+DPtbba41emHRxo0VXYKQCR+khN3fn 09ngcCirTrwQudnN5rN1+ErlAtQjpwqK1b3PjUOg1oKFuORfL0py9aWa7ng5Ag5pE9tqHP Vaw0U4XpPNeI/az3ApKlS33lwLiOzQM= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=none; spf=pass (imf18.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1681477417; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=afVdFqOV/o0K20kJgZSx3fjvhEQ2eP1wsk2Vl6NEh1U=; b=JvqiPcDZP39jllKBOXSL8wBpL/78OYLmmqoRLlq5sZCk+3JR0xa32sWBF2WWE7CD9BX9MB 0MQQJaFemrBcCat3YfR21xWfv+3uSPTxJ80stuELIRRzOTZ2FB+8b3zwWGAnR8nr2qoYV7 Z7zUnPB0G0aYw/suA7+2UIClbG641K4= Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9B3761713; Fri, 14 Apr 2023 06:04:20 -0700 (PDT) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.26]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 082AF3F6C4; Fri, 14 Apr 2023 06:03:34 -0700 (PDT) From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , Yu Zhao , "Yin, Fengwei" Cc: Ryan Roberts , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org Subject: [RFC v2 PATCH 14/17] mm: Copy large folios for anonymous memory Date: Fri, 14 
Apr 2023 14:03:00 +0100 Message-Id: <20230414130303.2345383-15-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com> References: <20230414130303.2345383-1-ryan.roberts@arm.com> MIME-Version: 1.0
When taking a write fault on an anonymous page, if we are unable to reuse the folio (due to it being mapped by others), do CoW for the entire folio instead of just a single page. We assume that the size of the anonymous folio chosen at allocation time is still a good choice and therefore it is better to copy the entire folio rather than a single page. It does not seem wise to do this for file-backed folios, since the folio size chosen there is related to the system-wide usage of the file. So we continue to CoW a single page for file-backed mappings. There are edge cases where the original mapping has been mremapped or partially munmapped; in those cases the source folio may not be naturally aligned in the virtual address space, so we CoW an aligned power-of-2 portion of the source folio. A similar effect happens when allocation of a high-order destination folio fails; in that case we reduce the order and retry until the allocation succeeds, falling back to order-0 if necessary. Signed-off-by: Ryan Roberts --- mm/memory.c | 242 ++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 207 insertions(+), 35 deletions(-) -- 2.25.1 diff --git a/mm/memory.c b/mm/memory.c index f2b7cfb2efc0..61cec97a57f3 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3086,6 +3086,30 @@ static inline int check_ptes_none(pte_t *pte, int nr) return nr; } +/* + * Returns index of first pte that is not mapped RO and physically contiguous + * starting at pfn, or nr if all are correct.
+ */ +static inline int check_ptes_contig_ro(pte_t *pte, int nr, unsigned long pfn) +{ + int i; + pte_t entry; + + for (i = 0; i < nr; i++) { + entry = *pte++; + + if (!pte_present(entry) || + pte_write(entry) || + pte_protnone(entry) || + pte_pfn(entry) != pfn) + return i; + + pfn++; + } + + return nr; +} + static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order) { /* @@ -3155,6 +3179,94 @@ static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order) return order; } +static int calc_anon_folio_order_copy(struct vm_fault *vmf, + struct folio *old_folio, int order) +{ + /* + * The aim here is to determine what size of folio we should allocate as + * the destination for this CoW fault. Factors include: + * - Order must not be higher than `order` upon entry + * - Folio must be naturally aligned within VA space + * - Folio must not breach boundaries of vma + * - Folio must be fully contained inside one pmd entry + * - All covered ptes must be present, physically contiguous and RO + * - All covered ptes must be mapped to old_folio + * + * Additionally, we do not allow order-1 since this breaks assumptions + * elsewhere in the mm; THP pages must be at least order-2 (since they + * store state up to the 3rd struct page subpage), and these pages must + * be THP in order to correctly use pre-existing THP infrastructure such + * as folio_split(). + * + * As a consequence of relying on the THP infrastructure, if the system + * does not support THP, we always fallback to order-0. + * + * Note that old_folio may not be naturally aligned in VA space due to + * mremap. We deliberately force alignment of the new folio to simplify + * fallback, so in this unaligned case we will end up only copying a + * portion of old_folio. + * + * Note that the caller may or may not choose to lock the pte. If + * unlocked, the calculation should be considered an estimate that will + * need to be validated under the lock. + */ + + struct vm_area_struct *vma = vmf->vma; + int nr; + unsigned long addr; + pte_t *pte; + pte_t *first_bad = NULL; + int ret; + unsigned long start, end; + unsigned long offset; + unsigned long pfn; + + if (has_transparent_hugepage()) { + order = min(order, PMD_SHIFT - PAGE_SHIFT); + + start = page_addr(&old_folio->page, vmf->page, vmf->address); + start = max(start, vma->vm_start); + + end = page_addr(&old_folio->page + folio_nr_pages(old_folio), + vmf->page, vmf->address); + end = min(end, vma->vm_end); + + for (; order > 1; order--) { + nr = 1 << order; + addr = ALIGN_DOWN(vmf->address, nr << PAGE_SHIFT); + offset = ((vmf->address - addr) >> PAGE_SHIFT); + pfn = page_to_pfn(vmf->page) - offset; + pte = vmf->pte - offset; + + /* Check vma and folio bounds. */ + if (addr < start || + addr + (nr << PAGE_SHIFT) > end) + continue; + + /* Ptes covered by order already known to be good. */ + if (pte + nr <= first_bad) + break; + + /* Already found bad pte in range covered by order. */ + if (pte <= first_bad) + continue; + + /* Need to check if all the ptes are good. 
*/ + ret = check_ptes_contig_ro(pte, nr, pfn); + if (ret == nr) + break; + + first_bad = pte + ret; + } + + if (order == 1) + order = 0; + } else + order = 0; + + return order; +} + static void calc_anon_folio_range_reuse(struct vm_fault *vmf, struct folio *folio, struct anon_folio_range *range_out) @@ -3366,6 +3478,14 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) int page_copied = 0; struct mmu_notifier_range range; int ret; + pte_t orig_pte; + unsigned long addr = vmf->address; + int order = 0; + int pgcount = BIT(order); + unsigned long offset = 0; + unsigned long pfn; + struct page *page; + int i; delayacct_wpcopy_start(); @@ -3375,20 +3495,39 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) goto oom; if (is_zero_pfn(pte_pfn(vmf->orig_pte))) { - new_folio = vma_alloc_zeroed_movable_folio(vma, vmf->address, - 0, 0); + new_folio = vma_alloc_movable_folio(vma, vmf->address, 0, true); if (!new_folio) goto oom; } else { - new_folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, - vmf->address, false); + if (old_folio && folio_test_anon(old_folio)) { + order = min_t(int, folio_order(old_folio), + max_anon_folio_order(vma)); +retry: + /* + * Estimate the folio order to allocate. We are not + * under the ptl here so this estimate needs to be + * re-checked later once we have the lock. + */ + vmf->pte = pte_offset_map(vmf->pmd, vmf->address); + order = calc_anon_folio_order_copy(vmf, old_folio, order); + pte_unmap(vmf->pte); + } + + new_folio = try_vma_alloc_movable_folio(vma, vmf->address, + order, false); if (!new_folio) goto oom; + /* We may have been granted less than we asked for. */ + order = folio_order(new_folio); + pgcount = BIT(order); + addr = ALIGN_DOWN(vmf->address, pgcount << PAGE_SHIFT); + offset = ((vmf->address - addr) >> PAGE_SHIFT); + if (likely(old_folio)) ret = __wp_page_copy_user_range(&new_folio->page, - vmf->page, - 1, vmf->address, vma); + vmf->page - offset, + pgcount, addr, vma); else ret = __wp_page_copy_user_pfn(&new_folio->page, vmf); if (ret) { @@ -3410,39 +3549,31 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) if (mem_cgroup_charge(new_folio, mm, GFP_KERNEL)) goto oom_free_new; - cgroup_throttle_swaprate(&new_folio->page, GFP_KERNEL); + folio_throttle_swaprate(new_folio, GFP_KERNEL); __folio_mark_uptodate(new_folio); mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, mm, - vmf->address & PAGE_MASK, - (vmf->address & PAGE_MASK) + PAGE_SIZE); + addr, addr + (pgcount << PAGE_SHIFT)); mmu_notifier_invalidate_range_start(&range); /* - * Re-check the pte - we dropped the lock + * Re-check the pte(s) - we dropped the lock */ - vmf->pte = pte_offset_map_lock(mm, vmf->pmd, vmf->address, &vmf->ptl); - if (likely(pte_same(*vmf->pte, vmf->orig_pte))) { + vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl); + pfn = pte_pfn(vmf->orig_pte) - offset; + if (likely(check_ptes_contig_ro(vmf->pte, pgcount, pfn) == pgcount)) { if (old_folio) { if (!folio_test_anon(old_folio)) { + VM_BUG_ON(order != 0); dec_mm_counter(mm, mm_counter_file(&old_folio->page)); inc_mm_counter(mm, MM_ANONPAGES); } } else { + VM_BUG_ON(order != 0); inc_mm_counter(mm, MM_ANONPAGES); } - flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte)); - entry = mk_pte(&new_folio->page, vma->vm_page_prot); - entry = pte_sw_mkyoung(entry); - if (unlikely(unshare)) { - if (pte_soft_dirty(vmf->orig_pte)) - entry = pte_mksoft_dirty(entry); - if (pte_uffd_wp(vmf->orig_pte)) - entry = pte_mkuffd_wp(entry); - } else { - entry = maybe_mkwrite(pte_mkdirty(entry), vma); - } + 
flush_cache_range(vma, addr, addr + (pgcount << PAGE_SHIFT)); /* * Clear the pte entry and flush it first, before updating the @@ -3451,17 +3582,40 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) * that left a window where the new PTE could be loaded into * some TLBs while the old PTE remains in others. */ - ptep_clear_flush_notify(vma, vmf->address, vmf->pte); - folio_add_new_anon_rmap(new_folio, vma, vmf->address); + ptep_clear_flush_range_notify(vma, addr, vmf->pte, pgcount); + folio_ref_add(new_folio, pgcount - 1); + folio_add_new_anon_rmap_range(new_folio, &new_folio->page, + pgcount, vma, addr); folio_add_lru_vma(new_folio, vma); /* * We call the notify macro here because, when using secondary * mmu page tables (such as kvm shadow page tables), we want the * new page to be mapped directly into the secondary page table. */ - BUG_ON(unshare && pte_write(entry)); - set_pte_at_notify(mm, vmf->address, vmf->pte, entry); - update_mmu_cache(vma, vmf->address, vmf->pte); + page = &new_folio->page; + for (i = 0; i < pgcount; i++, page++) { + entry = mk_pte(page, vma->vm_page_prot); + entry = pte_sw_mkyoung(entry); + if (unlikely(unshare)) { + orig_pte = vmf->pte[i]; + if (pte_soft_dirty(orig_pte)) + entry = pte_mksoft_dirty(entry); + if (pte_uffd_wp(orig_pte)) + entry = pte_mkuffd_wp(entry); + } else { + entry = maybe_mkwrite(pte_mkdirty(entry), vma); + } + /* + * TODO: Batch for !unshare case. Could use set_ptes(), + * but currently there is no arch-agnostic way to + * increment pte values by pfn so can't do the notify + * part. So currently stuck creating the pte from + * scratch every iteration. + */ + set_pte_at_notify(mm, addr + (i << PAGE_SHIFT), + vmf->pte + i, entry); + } + update_mmu_cache_range(vma, addr, vmf->pte, pgcount); if (old_folio) { /* * Only after switching the pte to the new page may @@ -3473,10 +3627,10 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) * threads. * * The critical issue is to order this - * page_remove_rmap with the ptp_clear_flush above. - * Those stores are ordered by (if nothing else,) + * folio_remove_rmap_range with the ptp_clear_flush + * above. Those stores are ordered by (if nothing else,) * the barrier present in the atomic_add_negative - * in page_remove_rmap. + * in folio_remove_rmap_range. * * Then the TLB flush in ptep_clear_flush ensures that * no process can access the old page before the @@ -3485,14 +3639,30 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) * mapcount is visible. So transitively, TLBs to * old page will be flushed before it can be reused. */ - page_remove_rmap(vmf->page, vma, false); + folio_remove_rmap_range(old_folio, + vmf->page - offset, + pgcount, vma); } /* Free the old page.. */ new_folio = old_folio; page_copied = 1; } else { - update_mmu_tlb(vma, vmf->address, vmf->pte); + pte_t *pte = vmf->pte + ((vmf->address - addr) >> PAGE_SHIFT); + + /* + * If faulting pte was serviced by another, exit early. Else try + * again, with a lower order. 
+ */ + if (order > 0 && pte_same(*pte, vmf->orig_pte)) { + pte_unmap_unlock(vmf->pte, vmf->ptl); + mmu_notifier_invalidate_range_only_end(&range); + folio_put(new_folio); + order--; + goto retry; + } + + update_mmu_tlb(vma, vmf->address, pte); } if (new_folio) @@ -3505,9 +3675,11 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) */ mmu_notifier_invalidate_range_only_end(&range); if (old_folio) { - if (page_copied) + if (page_copied) { free_swap_cache(&old_folio->page); - folio_put(old_folio); + folio_put_refs(old_folio, pgcount); + } else + folio_put(old_folio); } delayacct_wpcopy_end();
From patchwork Fri Apr 14 13:03:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13211467 From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , Yu Zhao , "Yin, Fengwei" Cc: Ryan Roberts , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org Subject: [RFC v2 PATCH 15/17] mm: Convert zero page to large folios on write Date: Fri, 14 Apr 2023 14:03:01 +0100 Message-Id: <20230414130303.2345383-16-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com> References: <20230414130303.2345383-1-ryan.roberts@arm.com> MIME-Version: 1.0
A read fault causes the zero page to be mapped read-only. A subsequent write fault causes the zero page to be replaced with a zero-filled private anonymous page. Change the write fault behaviour to replace the zero page with a large anonymous folio, allocated using the same policy as if the write fault had happened without the previous read fault. Experimentation shows that reading multiple contiguous pages is extremely rare without interleaved writes, so we don't bother to map a large zero page. We just use the small zero page as a marker and expand the allocation at the write fault. Signed-off-by: Ryan Roberts --- mm/memory.c | 115 ++++++++++++++++++++++++++++++++++++---------------- 1 file changed, 80 insertions(+), 35 deletions(-) -- 2.25.1 diff --git a/mm/memory.c b/mm/memory.c index 61cec97a57f3..fac686e9f895 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3110,6 +3110,23 @@ static inline int check_ptes_contig_ro(pte_t *pte, int nr, unsigned long pfn) return nr; } +/* + * Checks that all ptes are none except for the pte at offset, which should be + * entry. Returns index of first pte that does not meet expectations, or nr if + * all are correct.
+ */ +static inline int check_ptes_none_or_entry(pte_t *pte, int nr, + pte_t entry, unsigned long offset) +{ + int ret; + + ret = check_ptes_none(pte, offset); + if (ret == offset && pte_same(pte[offset], entry)) + ret += 1 + check_ptes_none(pte + offset + 1, nr - offset - 1); + + return ret; +} + static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order) { /* @@ -3141,6 +3158,7 @@ static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order) pte_t *pte; pte_t *first_set = NULL; int ret; + unsigned long offset; if (has_transparent_hugepage()) { order = min(order, PMD_SHIFT - PAGE_SHIFT); @@ -3148,7 +3166,8 @@ static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order) for (; order > 1; order--) { nr = 1 << order; addr = ALIGN_DOWN(vmf->address, nr << PAGE_SHIFT); - pte = vmf->pte - ((vmf->address - addr) >> PAGE_SHIFT); + offset = ((vmf->address - addr) >> PAGE_SHIFT); + pte = vmf->pte - offset; /* Check vma bounds. */ if (addr < vma->vm_start || @@ -3163,8 +3182,9 @@ static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order) if (pte <= first_set) continue; - /* Need to check if all the ptes are none. */ - ret = check_ptes_none(pte, nr); + /* Need to check if all the ptes are none or entry. */ + ret = check_ptes_none_or_entry(pte, nr, + vmf->orig_pte, offset); if (ret == nr) break; @@ -3479,13 +3499,15 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) struct mmu_notifier_range range; int ret; pte_t orig_pte; - unsigned long addr = vmf->address; - int order = 0; - int pgcount = BIT(order); - unsigned long offset = 0; + unsigned long addr; + int order; + int pgcount; + unsigned long offset; unsigned long pfn; struct page *page; int i; + bool zero; + bool anon; delayacct_wpcopy_start(); @@ -3494,36 +3516,54 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) if (unlikely(anon_vma_prepare(vma))) goto oom; + /* + * Set the upper bound of the folio allocation order. If we hit a zero + * page, we allocate a folio with the same policy as allocation upon + * write fault. If we are copying an anon folio, then limit ourself to + * its order as we don't want to copy from multiple folios. For all + * other cases (e.g. file-mapped) CoW a single page. + */ if (is_zero_pfn(pte_pfn(vmf->orig_pte))) { - new_folio = vma_alloc_movable_folio(vma, vmf->address, 0, true); - if (!new_folio) - goto oom; - } else { - if (old_folio && folio_test_anon(old_folio)) { - order = min_t(int, folio_order(old_folio), + zero = true; + anon = false; + order = max_anon_folio_order(vma); + } else if (old_folio && folio_test_anon(old_folio)) { + zero = false; + anon = true; + order = min_t(int, folio_order(old_folio), max_anon_folio_order(vma)); + } else { + zero = false; + anon = false; + order = 0; + } + retry: - /* - * Estimate the folio order to allocate. We are not - * under the ptl here so this estimate needs to be - * re-checked later once we have the lock. - */ - vmf->pte = pte_offset_map(vmf->pmd, vmf->address); - order = calc_anon_folio_order_copy(vmf, old_folio, order); - pte_unmap(vmf->pte); - } + /* + * Estimate the folio order to allocate. We are not under the ptl here + * so this estimate needs to be re-checked later once we have the lock. + */ + if (zero || anon) { + vmf->pte = pte_offset_map(vmf->pmd, vmf->address); + order = zero ? 
calc_anon_folio_order_alloc(vmf, order) : + calc_anon_folio_order_copy(vmf, old_folio, order); + pte_unmap(vmf->pte); + } - new_folio = try_vma_alloc_movable_folio(vma, vmf->address, - order, false); - if (!new_folio) - goto oom; + /* Allocate the new folio. */ + new_folio = try_vma_alloc_movable_folio(vma, vmf->address, order, zero); + if (!new_folio) + goto oom; - /* We may have been granted less than we asked for. */ - order = folio_order(new_folio); - pgcount = BIT(order); - addr = ALIGN_DOWN(vmf->address, pgcount << PAGE_SHIFT); - offset = ((vmf->address - addr) >> PAGE_SHIFT); + /* We may have been granted less than we asked for. */ + order = folio_order(new_folio); + pgcount = BIT(order); + addr = ALIGN_DOWN(vmf->address, pgcount << PAGE_SHIFT); + offset = ((vmf->address - addr) >> PAGE_SHIFT); + pfn = pte_pfn(vmf->orig_pte) - offset; + /* Copy contents. */ + if (!zero) { if (likely(old_folio)) ret = __wp_page_copy_user_range(&new_folio->page, vmf->page - offset, @@ -3561,8 +3601,14 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) * Re-check the pte(s) - we dropped the lock */ vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl); - pfn = pte_pfn(vmf->orig_pte) - offset; - if (likely(check_ptes_contig_ro(vmf->pte, pgcount, pfn) == pgcount)) { + + if (zero) + ret = check_ptes_none_or_entry(vmf->pte, pgcount, + vmf->orig_pte, offset); + else + ret = check_ptes_contig_ro(vmf->pte, pgcount, pfn); + + if (likely(ret == pgcount)) { if (old_folio) { if (!folio_test_anon(old_folio)) { VM_BUG_ON(order != 0); @@ -3570,8 +3616,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) dec_mm_counter(mm, mm_counter_file(&old_folio->page)); inc_mm_counter(mm, MM_ANONPAGES); } } else { - VM_BUG_ON(order != 0); - inc_mm_counter(mm, MM_ANONPAGES); + add_mm_counter(mm, MM_ANONPAGES, pgcount); } flush_cache_range(vma, addr, addr + (pgcount << PAGE_SHIFT));
From patchwork Fri Apr 14 13:03:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13211468 From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , Yu Zhao , "Yin, Fengwei" Cc: Ryan Roberts , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org Subject: [RFC v2 PATCH 16/17] mm: mmap: Align unhinted maps to highest anon folio order Date: Fri, 14 Apr 2023 14:03:02 +0100 Message-Id: <20230414130303.2345383-17-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com> References: <20230414130303.2345383-1-ryan.roberts@arm.com> MIME-Version: 1.0
When allocating large anonymous folios, we want to maximize our chances of being able to use
the highest order we support. Since one of the constraints is that a folio has to be mapped naturally aligned, let's have mmap default to that alignment when user space does not provide a hint. With this in place, when compiling the kernel, an extra 2% of all allocated anonymous memory belongs to a folio of the highest order. Signed-off-by: Ryan Roberts --- mm/mmap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- 2.25.1 diff --git a/mm/mmap.c b/mm/mmap.c index ff68a67a2a7c..e7652001a32e 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1627,7 +1627,7 @@ generic_get_unmapped_area(struct file *filp, unsigned long addr, info.length = len; info.low_limit = mm->mmap_base; info.high_limit = mmap_end; - info.align_mask = 0; + info.align_mask = BIT(PAGE_SHIFT + ANON_FOLIO_ORDER_MAX) - 1; info.align_offset = 0; return vm_unmapped_area(&info); } @@ -1677,7 +1677,7 @@ generic_get_unmapped_area_topdown(struct file *filp, unsigned long addr, info.length = len; info.low_limit = max(PAGE_SIZE, mmap_min_addr); info.high_limit = arch_get_mmap_base(addr, mm->mmap_base); - info.align_mask = 0; + info.align_mask = BIT(PAGE_SHIFT + ANON_FOLIO_ORDER_MAX) - 1; info.align_offset = 0; addr = vm_unmapped_area(&info);
From patchwork Fri Apr 14 13:03:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 13211469 From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , Yu Zhao , "Yin, Fengwei" Cc: Ryan Roberts , linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org Subject: [RFC v2 PATCH 17/17] mm: Batch-zap large anonymous folio PTE mappings Date: Fri, 14 Apr 2023 14:03:03 +0100 Message-Id: <20230414130303.2345383-18-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230414130303.2345383-1-ryan.roberts@arm.com> References: <20230414130303.2345383-1-ryan.roberts@arm.com> MIME-Version: 1.0
This allows the rmap removal to be batched with folio_remove_rmap_range(), which means we avoid spuriously adding a partially unmapped folio to the deferred split queue in the common case, reducing split queue lock contention. Previously each page was removed from the rmap individually with page_remove_rmap(). If the first page belonged to a large folio, this would cause page_remove_rmap() to conclude that the folio was now partially mapped and add the folio to the deferred split queue. But subsequent calls would cause the folio to become fully unmapped, meaning there is no value in adding it to the split queue.
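To make the accounting difference concrete, here is a minimal userspace model of the behaviour described above. It is only a sketch, not kernel code: NR_PAGES, the mapcount array and the queueing rule are simplified stand-ins for the folio's per-page _mapcount / _nr_pages_mapped accounting and for deferred_split_folio().

#include <stdio.h>

#define NR_PAGES 16			/* pages in the model folio */

static int mapcount[NR_PAGES];		/* one mapping per page initially */
static int deferred_split_queued;	/* how often the folio would be queued */

static int nr_pages_mapped(void)
{
	int nr = 0;

	for (int i = 0; i < NR_PAGES; i++)
		nr += mapcount[i] > 0;
	return nr;
}

static void reset(void)
{
	for (int i = 0; i < NR_PAGES; i++)
		mapcount[i] = 1;
	deferred_split_queued = 0;
}

/* Old scheme: page_remove_rmap()-style, one page at a time. */
static void unmap_per_page(void)
{
	for (int i = 0; i < NR_PAGES; i++) {
		mapcount[i]--;
		/* The folio looks partially mapped after every page but the last. */
		if (nr_pages_mapped() > 0)
			deferred_split_queued++;
	}
}

/* New scheme: folio_remove_rmap_range()-style, one check for the whole range. */
static void unmap_range(void)
{
	for (int i = 0; i < NR_PAGES; i++)
		mapcount[i]--;
	if (nr_pages_mapped() > 0)
		deferred_split_queued++;
}

int main(void)
{
	reset();
	unmap_per_page();
	printf("per-page removal: queued for split %d times\n", deferred_split_queued);

	reset();
	unmap_range();
	printf("batched removal:  queued for split %d times\n", deferred_split_queued);
	return 0;
}

For a 16-page folio this prints 15 spurious queue events for the per-page scheme and none for the batched scheme, which is the saving the patch is after.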
Signed-off-by: Ryan Roberts --- mm/memory.c | 139 ++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 119 insertions(+), 20 deletions(-) -- 2.25.1 diff --git a/mm/memory.c b/mm/memory.c index fac686e9f895..e1cb4bf6fd5d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1351,6 +1351,95 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, pte_install_uffd_wp_if_needed(vma, addr, pte, pteval); } +static inline unsigned long page_addr(struct page *page, + struct page *anchor, unsigned long anchor_addr) +{ + unsigned long offset; + unsigned long addr; + + offset = (page_to_pfn(page) - page_to_pfn(anchor)) << PAGE_SHIFT; + addr = anchor_addr + offset; + + if (anchor > page) { + if (addr > anchor_addr) + return 0; + } else { + if (addr < anchor_addr) + return ULONG_MAX; + } + + return addr; +} + +static int calc_anon_folio_map_pgcount(struct folio *folio, + struct page *page, pte_t *pte, + unsigned long addr, unsigned long end) +{ + pte_t ptent; + int floops; + int i; + unsigned long pfn; + + end = min(page_addr(&folio->page + folio_nr_pages(folio), page, addr), + end); + floops = (end - addr) >> PAGE_SHIFT; + pfn = page_to_pfn(page); + pfn++; + pte++; + + for (i = 1; i < floops; i++) { + ptent = *pte; + + if (!pte_present(ptent) || + pte_pfn(ptent) != pfn) { + return i; + } + + pfn++; + pte++; + } + + return floops; +} + +static unsigned long zap_anon_pte_range(struct mmu_gather *tlb, + struct vm_area_struct *vma, + struct page *page, pte_t *pte, + unsigned long addr, unsigned long end, + bool *full_out) +{ + struct folio *folio = page_folio(page); + struct mm_struct *mm = tlb->mm; + pte_t ptent; + int pgcount; + int i; + bool full; + + pgcount = calc_anon_folio_map_pgcount(folio, page, pte, addr, end); + + for (i = 0; i < pgcount;) { + ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); + tlb_remove_tlb_entry(tlb, pte, addr); + full = __tlb_remove_page(tlb, page, 0); + + if (unlikely(page_mapcount(page) < 1)) + print_bad_pte(vma, addr, ptent, page); + + i++; + page++; + pte++; + addr += PAGE_SIZE; + + if (unlikely(full)) + break; + } + + folio_remove_rmap_range(folio, page - i, i, vma); + + *full_out = full; + return i; +} + static unsigned long zap_pte_range(struct mmu_gather *tlb, struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, @@ -1387,6 +1476,36 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, page = vm_normal_page(vma, addr, ptent); if (unlikely(!should_zap_page(details, page))) continue; + + /* + * Batch zap large anonymous folio mappings. This allows + * batching the rmap removal, which means we avoid + * spuriously adding a partially unmapped folio to the + * deferrred split queue in the common case, which + * reduces split queue lock contention. Require the VMA + * to be anonymous to ensure that none of the PTEs in + * the range require zap_install_uffd_wp_if_needed(). 
+ */ + if (page && PageAnon(page) && vma_is_anonymous(vma)) { + bool full; + int pgcount; + + pgcount = zap_anon_pte_range(tlb, vma, + page, pte, addr, end, &full); + + rss[mm_counter(page)] -= pgcount; + pgcount--; + pte += pgcount; + addr += pgcount << PAGE_SHIFT; + + if (unlikely(full)) { + force_flush = 1; + addr += PAGE_SIZE; + break; + } + continue; + } + ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); tlb_remove_tlb_entry(tlb, pte, addr); @@ -3051,26 +3170,6 @@ struct anon_folio_range { bool exclusive; }; -static inline unsigned long page_addr(struct page *page, - struct page *anchor, unsigned long anchor_addr) -{ - unsigned long offset; - unsigned long addr; - - offset = (page_to_pfn(page) - page_to_pfn(anchor)) << PAGE_SHIFT; - addr = anchor_addr + offset; - - if (anchor > page) { - if (addr > anchor_addr) - return 0; - } else { - if (addr < anchor_addr) - return ULONG_MAX; - } - - return addr; -} - /* * Returns index of first pte that is not none, or nr if all are none. */
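The order-selection logic used by calc_anon_folio_order_alloc() and calc_anon_folio_order_copy() in the earlier patches can be summarised with a small userspace sketch. This is an illustration only: it keeps just the alignment and VMA-bounds checks, omits the pte-contents checks (check_ptes_none(), check_ptes_contig_ro()) and the PMD-boundary clamp, and assumes 4K pages with made-up VMA bounds.

#include <stdio.h>

#define PAGE_SHIFT	12UL	/* assumed 4K pages, illustration only */

static int calc_order(unsigned long addr, unsigned long vm_start,
		      unsigned long vm_end, int max_order)
{
	/*
	 * Walk down from the requested order until a naturally aligned
	 * power-of-2 run of pages containing addr fits inside
	 * [vm_start, vm_end). Order-1 is skipped, as in the patches,
	 * because THPs must be at least order-2; if nothing fits we fall
	 * back to a single page.
	 */
	for (int order = max_order; order > 1; order--) {
		unsigned long size = 1UL << (order + PAGE_SHIFT);
		unsigned long start = addr & ~(size - 1);	/* ALIGN_DOWN */

		if (start >= vm_start && start + size <= vm_end)
			return order;
	}
	return 0;
}

int main(void)
{
	unsigned long vm_start = 0x400000, vm_end = 0x43a000;	/* 232K VMA */

	/* A fault near the start of the VMA can use the full order-4 run... */
	printf("order @ 0x401000: %d\n",
	       calc_order(0x401000, vm_start, vm_end, 4));
	/* ...but near the unaligned end of the VMA no aligned run fits. */
	printf("order @ 0x439000: %d\n",
	       calc_order(0x439000, vm_start, vm_end, 4));
	return 0;
}

Built with any C compiler, the first fault reports order 4 while the fault near the unaligned tail of the VMA falls back to order 0, mirroring the fallback behaviour described in the patches above.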