From patchwork Thu Feb 29 00:37:49 2024
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13576145
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org, ryan.roberts@arm.com, chrisl@kernel.org
Cc: 21cnbao@gmail.com, linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com, steven.price@arm.com, surenb@google.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yuzhao@google.com, kasong@tencent.com, yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev, hannes@cmpxchg.org, linux-arm-kernel@lists.infradead.org, Barry Song, Catalin Marinas, Will Deacon, Mark Rutland, Kemeng Shi, Anshuman Khandual, Peter Collingbourne, Peter Xu, Lorenzo Stoakes, "Mike Rapoport (IBM)", Hugh Dickins, "Aneesh Kumar K.V", Rick Edgecombe
Subject: [PATCH RFC v2 1/5] arm64: mm: swap: support THP_SWAP on hardware with MTE
Date: Thu, 29 Feb 2024 13:37:49 +1300
Message-Id: <20240229003753.134193-2-21cnbao@gmail.com>
In-Reply-To: <20240229003753.134193-1-21cnbao@gmail.com>
References: <20240229003753.134193-1-21cnbao@gmail.com>

From: Barry Song

Commit d0637c505f8a1 ("arm64: enable THP_SWAP for arm64") brings up
THP_SWAP on ARM64, but it doesn't enable THP_SWAP on hardware with
MTE, because the MTE code works on the assumption that tag save and
restore always handle a folio with only one page. This limitation
should be removed, as more and more ARM64 SoCs ship with MTE and the
co-existence of MTE and THP_SWAP is becoming increasingly important.

This patch makes MTE tag saving support large folios, so we no longer
need to split large folios into base pages for swap-out on ARM64 SoCs
with MTE.

arch_prepare_to_swap() now takes a folio rather than a page as its
parameter, because we support THP swap-out as a whole: it saves the
tags for all pages in a large folio.

As we now restore tags based on the folio, arch_swap_restore() may add
some extra loops and early exits when refaulting a large folio that is
still in the swapcache in do_swap_page(). If a large folio has nr
pages, do_swap_page() only sets the PTE of the particular page that
caused the fault, so do_swap_page() runs nr times, and each time
arch_swap_restore() loops over the nr subpages of the folio. The
algorithmic complexity is therefore O(nr^2) for now. Once we support
mapping large folios in do_swap_page(), these extra loops and early
exits will decrease, though they cannot be removed completely, because
a large folio might be only partially tagged in corner cases such as:
1. a large folio in the swapcache can be partially unmapped, so the
   MTE tags for the unmapped pages will be invalidated;
2. users might use mprotect() to set MTE on only part of a large
   folio.

arch_thp_swp_supported() is dropped, since ARM64 MTE was its only
user.
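[Editor's illustration: the round-down mentioned above, by which each refault
derives the folio's first swap entry from the faulting one, is a power-of-two
alignment step. A minimal userspace sketch of it follows; folio_first_slot()
is not a kernel function, and the kernel works on swp_entry_t.val rather than
a bare offset.]

#include <stdio.h>

/*
 * Toy model of the round-down in the new arch_swap_restore():
 *     entry.val -= swp_offset(entry) & (nr - 1);
 * Assumes the folio's nr slots are naturally aligned and nr is a
 * power of two, which is what the swap allocator provides for mTHP.
 */
static unsigned long folio_first_slot(unsigned long offset, unsigned long nr)
{
    return offset - (offset & (nr - 1));
}

int main(void)
{
    unsigned long nr = 16;            /* e.g. a 64KiB folio of 4KiB pages */
    unsigned long fault = 0x1230 + 5; /* refault on subpage 5 */

    printf("first slot: %#lx\n", folio_first_slot(fault, nr)); /* 0x1230 */
    /*
     * Each of the nr refaults restores tags for all nr subpages starting
     * from this first slot, which is where the O(nr^2) in the changelog
     * comes from.
     */
    return 0;
}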
Cc: Catalin Marinas
Cc: Will Deacon
Cc: Ryan Roberts
Cc: Mark Rutland
Cc: David Hildenbrand
Cc: Kemeng Shi
Cc: "Matthew Wilcox (Oracle)"
Cc: Anshuman Khandual
Cc: Peter Collingbourne
Cc: Steven Price
Cc: Yosry Ahmed
Cc: Peter Xu
Cc: Lorenzo Stoakes
Cc: "Mike Rapoport (IBM)"
Cc: Hugh Dickins
Cc: "Aneesh Kumar K.V"
Cc: Rick Edgecombe
Signed-off-by: Barry Song
Reviewed-by: Steven Price
Acked-by: Chris Li
---
 arch/arm64/include/asm/pgtable.h | 19 ++-------------
 arch/arm64/mm/mteswap.c          | 43 ++++++++++++++++++++++++++++++++
 include/linux/huge_mm.h          | 12 ------------
 include/linux/pgtable.h          |  2 +-
 mm/page_io.c                     |  2 +-
 mm/swap_slots.c                  |  2 +-
 6 files changed, 48 insertions(+), 32 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 401087e8a43d..7a54750770b8 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -45,12 +45,6 @@
     __flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-static inline bool arch_thp_swp_supported(void)
-{
-    return !system_supports_mte();
-}
-#define arch_thp_swp_supported arch_thp_swp_supported
-
 /*
  * Outside of a few very special situations (e.g. hibernation), we always
  * use broadcast TLB invalidation instructions, therefore a spurious page
@@ -1095,12 +1089,7 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 #ifdef CONFIG_ARM64_MTE
 
 #define __HAVE_ARCH_PREPARE_TO_SWAP
-static inline int arch_prepare_to_swap(struct page *page)
-{
-    if (system_supports_mte())
-        return mte_save_tags(page);
-    return 0;
-}
+extern int arch_prepare_to_swap(struct folio *folio);
 
 #define __HAVE_ARCH_SWAP_INVALIDATE
 static inline void arch_swap_invalidate_page(int type, pgoff_t offset)
@@ -1116,11 +1105,7 @@ static inline void arch_swap_invalidate_area(int type)
 }
 
 #define __HAVE_ARCH_SWAP_RESTORE
-static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
-{
-    if (system_supports_mte())
-        mte_restore_tags(entry, &folio->page);
-}
+extern void arch_swap_restore(swp_entry_t entry, struct folio *folio);
 
 #endif /* CONFIG_ARM64_MTE */
 
diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
index a31833e3ddc5..295836fef620 100644
--- a/arch/arm64/mm/mteswap.c
+++ b/arch/arm64/mm/mteswap.c
@@ -68,6 +68,13 @@ void mte_invalidate_tags(int type, pgoff_t offset)
     mte_free_tag_storage(tags);
 }
 
+static inline void __mte_invalidate_tags(struct page *page)
+{
+    swp_entry_t entry = page_swap_entry(page);
+
+    mte_invalidate_tags(swp_type(entry), swp_offset(entry));
+}
+
 void mte_invalidate_tags_area(int type)
 {
     swp_entry_t entry = swp_entry(type, 0);
@@ -83,3 +90,39 @@ void mte_invalidate_tags_area(int type)
     }
     xa_unlock(&mte_pages);
 }
+
+int arch_prepare_to_swap(struct folio *folio)
+{
+    long i, nr;
+    int err;
+
+    if (!system_supports_mte())
+        return 0;
+
+    nr = folio_nr_pages(folio);
+
+    for (i = 0; i < nr; i++) {
+        err = mte_save_tags(folio_page(folio, i));
+        if (err)
+            goto out;
+    }
+    return 0;
+
+out:
+    while (i--)
+        __mte_invalidate_tags(folio_page(folio, i));
+    return err;
+}
+
+void arch_swap_restore(swp_entry_t entry, struct folio *folio)
+{
+    if (system_supports_mte()) {
+        long i, nr = folio_nr_pages(folio);
+
+        entry.val -= swp_offset(entry) & (nr - 1);
+        for (i = 0; i < nr; i++) {
+            mte_restore_tags(entry, folio_page(folio, i));
+            entry.val++;
+        }
+    }
+}
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index de0c89105076..e04b93c43965 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -535,16 +535,4 @@ static inline int split_folio_to_order(struct folio *folio, int new_order)
 #define split_folio_to_list(f, l) split_folio_to_list_to_order(f, l, 0)
 #define split_folio(f) split_folio_to_order(f, 0)
 
-/*
- * archs that select ARCH_WANTS_THP_SWAP but don't support THP_SWP due to
- * limitations in the implementation like arm64 MTE can override this to
- * false
- */
-#ifndef arch_thp_swp_supported
-static inline bool arch_thp_swp_supported(void)
-{
-    return true;
-}
-#endif
-
 #endif /* _LINUX_HUGE_MM_H */
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index a36cf4e124b0..ec7efce0f3f0 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1052,7 +1052,7 @@ static inline int arch_unmap_one(struct mm_struct *mm,
  * prototypes must be defined in the arch-specific asm/pgtable.h file.
  */
 #ifndef __HAVE_ARCH_PREPARE_TO_SWAP
-static inline int arch_prepare_to_swap(struct page *page)
+static inline int arch_prepare_to_swap(struct folio *folio)
 {
     return 0;
 }
diff --git a/mm/page_io.c b/mm/page_io.c
index ae2b49055e43..a9a7c236aecc 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -189,7 +189,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
      * Arch code may have to preserve more data than just the page
      * contents, e.g. memory tags.
      */
-    ret = arch_prepare_to_swap(&folio->page);
+    ret = arch_prepare_to_swap(folio);
     if (ret) {
         folio_mark_dirty(folio);
         folio_unlock(folio);
diff --git a/mm/swap_slots.c b/mm/swap_slots.c
index 90973ce7881d..53abeaf1371d 100644
--- a/mm/swap_slots.c
+++ b/mm/swap_slots.c
@@ -310,7 +310,7 @@ swp_entry_t folio_alloc_swap(struct folio *folio)
     entry.val = 0;
 
     if (folio_test_large(folio)) {
-        if (IS_ENABLED(CONFIG_THP_SWAP) && arch_thp_swp_supported())
+        if (IS_ENABLED(CONFIG_THP_SWAP))
             get_swap_pages(1, &entry, folio_nr_pages(folio));
         goto out;
     }
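[Editor's illustration: the error path of arch_prepare_to_swap() above uses a
reverse-unwind idiom; if saving tags fails at page i, every page already saved
is invalidated before returning, so the operation is all-or-nothing. A
standalone sketch of the shape; save_one() and undo_one() are placeholders,
not kernel APIs, and the failure at i == 3 is forced for the demo.]

#include <stdio.h>

/* Stand-ins for mte_save_tags() and __mte_invalidate_tags(). */
static int save_one(long i)  { return i == 3 ? -12 /* -ENOMEM */ : 0; }
static void undo_one(long i) { printf("undo %ld\n", i); }

/* Same shape as arch_prepare_to_swap(): either every page's tags are
 * saved, or none are left behind. */
static int save_all(long nr)
{
    long i;
    int err;

    for (i = 0; i < nr; i++) {
        err = save_one(i);
        if (err)
            goto out;
    }
    return 0;
out:
    while (i--)        /* unwind pages i-1 .. 0 in reverse */
        undo_one(i);
    return err;
}

int main(void)
{
    /* Fails at i == 3, so pages 2, 1, 0 get undone. */
    printf("save_all: %d\n", save_all(8));
    return 0;
}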
From patchwork Thu Feb 29 00:37:50 2024
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13576146
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org, ryan.roberts@arm.com, chrisl@kernel.org
Cc: 21cnbao@gmail.com, linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com, steven.price@arm.com, surenb@google.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yuzhao@google.com, kasong@tencent.com, yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev, hannes@cmpxchg.org, linux-arm-kernel@lists.infradead.org, Chuanhua Han, Barry Song
Subject: [PATCH RFC v2 2/5] mm: swap: introduce swap_nr_free() for batched swap_free()
Date: Thu, 29 Feb 2024 13:37:50 +1300
Message-Id: <20240229003753.134193-3-21cnbao@gmail.com>
In-Reply-To: <20240229003753.134193-1-21cnbao@gmail.com>
References: <20240229003753.134193-1-21cnbao@gmail.com>

From: Chuanhua Han

While swapping in a large folio, we need to free the swap entries for
the whole folio. To avoid frequently acquiring and releasing swap
locks, it is better to introduce an API for batched free.
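[Editor's illustration: the batching that the diff below implements can be
modelled in isolation: decrement all entries under one cluster lock, record
in a bitmap which ones still have users, and free only the others after the
lock is dropped. A userspace sketch; a plain array and a uint64_t stand in
for the kernel's swap_map and DECLARE_BITMAP, and no real locking is shown.]

#include <stdio.h>
#include <stdint.h>

#define NR 8

/* Toy swap_map: per-slot reference counts; slots 1 and 4 have extra users. */
static unsigned char swap_map[NR] = { 1, 2, 1, 1, 3, 1, 1, 1 };

int main(void)
{
    uint64_t usage = 0;    /* stands in for DECLARE_BITMAP(usage, ...) */
    int i;

    /* Phase 1: one pass under the (imaginary) cluster lock. */
    for (i = 0; i < NR; i++) {
        if (--swap_map[i])            /* models __swap_entry_free_locked() */
            usage |= 1ULL << i;       /* slot still referenced */
    }

    /* Phase 2: lock released; free the slots that dropped to zero. */
    for (i = 0; i < NR; i++) {
        if (!(usage & (1ULL << i)))
            printf("free slot %d\n", i);    /* models free_swap_slot() */
    }
    return 0;
}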
Signed-off-by: Chuanhua Han
Co-developed-by: Barry Song
Signed-off-by: Barry Song
---
 include/linux/swap.h |  6 ++++++
 mm/swapfile.c        | 35 +++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 25f6368be078..b3581c976e5f 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -481,6 +481,7 @@ extern void swap_shmem_alloc(swp_entry_t);
 extern int swap_duplicate(swp_entry_t);
 extern int swapcache_prepare(swp_entry_t);
 extern void swap_free(swp_entry_t);
+extern void swap_nr_free(swp_entry_t entry, int nr_pages);
 extern void swapcache_free_entries(swp_entry_t *entries, int n);
 extern int free_swap_and_cache(swp_entry_t);
 int swap_type_of(dev_t device, sector_t offset);
@@ -561,6 +562,11 @@ static inline void swap_free(swp_entry_t swp)
 {
 }
 
+static inline void swap_nr_free(swp_entry_t entry, int nr_pages)
+{
+
+}
+
 static inline void put_swap_folio(struct folio *folio, swp_entry_t swp)
 {
 }
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2b3a2d85e350..c0c058ee7b69 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1340,6 +1340,41 @@ void swap_free(swp_entry_t entry)
         __swap_entry_free(p, entry);
 }
 
+/*
+ * Called after swapping in a large folio; batch-free the swap entries
+ * for the whole folio. 'entry' must be for the first subpage, with its
+ * offset aligned to nr_pages.
+ */
+void swap_nr_free(swp_entry_t entry, int nr_pages)
+{
+    int i;
+    struct swap_cluster_info *ci;
+    struct swap_info_struct *p;
+    unsigned type = swp_type(entry);
+    unsigned long offset = swp_offset(entry);
+    DECLARE_BITMAP(usage, SWAPFILE_CLUSTER) = { 0 };
+
+    /* all swap entries are within a cluster for mTHP */
+    VM_BUG_ON(offset % SWAPFILE_CLUSTER + nr_pages > SWAPFILE_CLUSTER);
+
+    if (nr_pages == 1) {
+        swap_free(entry);
+        return;
+    }
+
+    p = _swap_info_get(entry);
+
+    ci = lock_cluster(p, offset);
+    for (i = 0; i < nr_pages; i++) {
+        if (__swap_entry_free_locked(p, offset + i, 1))
+            __bitmap_set(usage, i, 1);
+    }
+    unlock_cluster(ci);
+
+    for_each_clear_bit(i, usage, nr_pages)
+        free_swap_slot(swp_entry(type, offset + i));
+}
+
 /*
  * Called after dropping swapcache to decrease refcnt to swap entries.
  */

From patchwork Thu Feb 29 00:37:51 2024
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13576147
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org, ryan.roberts@arm.com, chrisl@kernel.org
Cc: 21cnbao@gmail.com, linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com, steven.price@arm.com, surenb@google.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yuzhao@google.com, kasong@tencent.com, yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev, hannes@cmpxchg.org, linux-arm-kernel@lists.infradead.org, Chuanhua Han, Barry Song
Subject: [PATCH RFC v2 3/5] mm: swap: make should_try_to_free_swap() support large-folio
Date: Thu, 29 Feb 2024 13:37:51 +1300
Message-Id: <20240229003753.134193-4-21cnbao@gmail.com>
In-Reply-To: <20240229003753.134193-1-21cnbao@gmail.com>
References: <20240229003753.134193-1-21cnbao@gmail.com>
From: Chuanhua Han

should_try_to_free_swap() currently works on the assumption that
swap-in always happens at normal page granularity, i.e.
folio_nr_pages() == 1. To support large folio swap-in, this patch
removes that assumption.

Signed-off-by: Chuanhua Han
Co-developed-by: Barry Song
Signed-off-by: Barry Song
Acked-by: Chris Li
---
 mm/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 319b3be05e75..90b08b7cbaac 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3904,7 +3904,7 @@ static inline bool should_try_to_free_swap(struct folio *folio,
      * reference only in case it's likely that we'll be the exlusive user.
      */
     return (fault_flags & FAULT_FLAG_WRITE) && !folio_test_ksm(folio) &&
-        folio_ref_count(folio) == 2;
+        folio_ref_count(folio) == (1 + folio_nr_pages(folio));
 }
 
 static vm_fault_t pte_marker_clear(struct vm_fault *vmf)
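[Editor's illustration: the arithmetic behind the new check, spelled out. On
this fault path the folio is pinned once by the swap cache plus once per page
by the fault handler, so "we are the only user" means 1 + folio_nr_pages()
references, and nr_pages == 1 recovers the old '== 2' test. A trivial sketch
of that reading; expected_refs() is our name, not kernel code.]

#include <stdio.h>

/* One reference held by the swap cache plus one per page taken by the
 * fault path; nr_pages == 1 gives the old small-folio value of 2. */
static long expected_refs(long nr_pages)
{
    return 1 + nr_pages;
}

int main(void)
{
    long nr;

    for (nr = 1; nr <= 16; nr *= 4)
        printf("nr_pages=%2ld -> exclusive when refcount == %ld\n",
               nr, expected_refs(nr));
    return 0;
}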
From patchwork Thu Feb 29 00:37:52 2024
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13576148
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org, ryan.roberts@arm.com, chrisl@kernel.org
Cc: 21cnbao@gmail.com, linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com, steven.price@arm.com, surenb@google.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yuzhao@google.com, kasong@tencent.com, yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev, hannes@cmpxchg.org, linux-arm-kernel@lists.infradead.org, Barry Song, Hugh Dickins, Minchan Kim, SeongJae Park
Subject: [PATCH RFC v2 4/5] mm: swap: introduce swapcache_prepare_nr and swapcache_clear_nr for large folios swap-in
Date: Thu, 29 Feb 2024 13:37:52 +1300
Message-Id: <20240229003753.134193-5-21cnbao@gmail.com>
In-Reply-To: <20240229003753.134193-1-21cnbao@gmail.com>
References: <20240229003753.134193-1-21cnbao@gmail.com>

From: Barry Song

Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
supports one swap entry only. To support large folio swap-in, we need
to handle multiple swap entries.
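[Editor's illustration: the batched variant below keeps __swap_duplicate()'s
per-entry checks but separates them from the write-back: validate all nr
slots first, and only write the new counts once no slot has failed, so an
error mid-range leaves the swap map untouched. A standalone sketch of that
validate-then-commit shape; the flag and errno values are toy constants, and
only the SWAP_HAS_CACHE case is modelled.]

#include <stdio.h>

#define SWAP_HAS_CACHE 0x40
#define NR 4

static unsigned char swap_map[NR] = { 1, 1, 1, 1 }; /* in use, no cache */

/* Validate every slot, then commit every slot. */
static int cache_prepare_range(void)
{
    unsigned char count[NR], has_cache[NR];
    int i;

    for (i = 0; i < NR; i++) {
        count[i] = swap_map[i] & ~SWAP_HAS_CACHE;
        has_cache[i] = swap_map[i] & SWAP_HAS_CACHE;
        if (has_cache[i])
            return -17;    /* -EEXIST: someone else added the cache flag */
        if (!count[i])
            return -2;     /* -ENOENT: unused slot */
        has_cache[i] = SWAP_HAS_CACHE;
    }
    for (i = 0; i < NR; i++)    /* commit only after all checks pass */
        swap_map[i] = count[i] | has_cache[i];
    return 0;
}

int main(void)
{
    printf("prepare: %d, slot0=%#x\n", cache_prepare_range(), swap_map[0]);
    return 0;
}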
Cc: Kairui Song
Cc: "Huang, Ying"
Cc: Yu Zhao
Cc: David Hildenbrand
Cc: Chris Li
Cc: Hugh Dickins
Cc: Johannes Weiner
Cc: "Matthew Wilcox (Oracle)"
Cc: Michal Hocko
Cc: Minchan Kim
Cc: Yosry Ahmed
Cc: SeongJae Park
Signed-off-by: Barry Song
---
 include/linux/swap.h |   1 +
 mm/swap.h            |   1 +
 mm/swapfile.c        | 117 ++++++++++++++++++++++++++-----------------
 3 files changed, 72 insertions(+), 47 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index b3581c976e5f..2691c739d9a4 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -480,6 +480,7 @@ extern int add_swap_count_continuation(swp_entry_t, gfp_t);
 extern void swap_shmem_alloc(swp_entry_t);
 extern int swap_duplicate(swp_entry_t);
 extern int swapcache_prepare(swp_entry_t);
+extern int swapcache_prepare_nr(swp_entry_t, int nr);
 extern void swap_free(swp_entry_t);
 extern void swap_nr_free(swp_entry_t entry, int nr_pages);
 extern void swapcache_free_entries(swp_entry_t *entries, int n);
diff --git a/mm/swap.h b/mm/swap.h
index fc2f6ade7f80..1cec991efcda 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -42,6 +42,7 @@ void delete_from_swap_cache(struct folio *folio);
 void clear_shadow_from_swap_cache(int type, unsigned long begin,
         unsigned long end);
 void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry);
+void swapcache_clear_nr(struct swap_info_struct *si, swp_entry_t entry, int nr);
 struct folio *swap_cache_get_folio(swp_entry_t entry,
         struct vm_area_struct *vma, unsigned long addr);
 struct folio *filemap_get_incore_folio(struct address_space *mapping,
diff --git a/mm/swapfile.c b/mm/swapfile.c
index c0c058ee7b69..c8c8b6dbaeda 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3308,7 +3308,7 @@ void si_swapinfo(struct sysinfo *val)
 }
 
 /*
- * Verify that a swap entry is valid and increment its swap map count.
+ * Verify that nr swap entries are valid and increment their swap map count.
  *
  * Returns error code in following case.
  * - success -> 0
@@ -3318,66 +3318,73 @@ void si_swapinfo(struct sysinfo *val)
  * - swap-cache reference is requested but the entry is not used. -> ENOENT
  * - swap-mapped reference requested but needs continued swap count. -> ENOMEM
 */
-static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
+static int __swap_duplicate_nr(swp_entry_t entry, int nr, unsigned char usage)
 {
     struct swap_info_struct *p;
     struct swap_cluster_info *ci;
     unsigned long offset;
-    unsigned char count;
-    unsigned char has_cache;
-    int err;
+    unsigned char count[SWAPFILE_CLUSTER];
+    unsigned char has_cache[SWAPFILE_CLUSTER];
+    int err, i;
 
     p = swp_swap_info(entry);
 
     offset = swp_offset(entry);
     ci = lock_cluster_or_swap_info(p, offset);
 
-    count = p->swap_map[offset];
-
-    /*
-     * swapin_readahead() doesn't check if a swap entry is valid, so the
-     * swap entry could be SWAP_MAP_BAD. Check here with lock held.
-     */
-    if (unlikely(swap_count(count) == SWAP_MAP_BAD)) {
-        err = -ENOENT;
-        goto unlock_out;
-    }
+    for (i = 0; i < nr; i++) {
+        count[i] = p->swap_map[offset + i];
 
-    has_cache = count & SWAP_HAS_CACHE;
-    count &= ~SWAP_HAS_CACHE;
-    err = 0;
-
-    if (usage == SWAP_HAS_CACHE) {
-
-        /* set SWAP_HAS_CACHE if there is no cache and entry is used */
-        if (!has_cache && count)
-            has_cache = SWAP_HAS_CACHE;
-        else if (has_cache)        /* someone else added cache */
-            err = -EEXIST;
-        else                /* no users remaining */
+        /*
+         * swapin_readahead() doesn't check if a swap entry is valid, so the
+         * swap entry could be SWAP_MAP_BAD. Check here with lock held.
+         */
+        if (unlikely(swap_count(count[i]) == SWAP_MAP_BAD)) {
             err = -ENOENT;
+            goto unlock_out;
+        }
 
-    } else if (count || has_cache) {
-
-        if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX)
-            count += usage;
-        else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX)
-            err = -EINVAL;
-        else if (swap_count_continued(p, offset, count))
-            count = COUNT_CONTINUED;
-        else
-            err = -ENOMEM;
-    } else
-        err = -ENOENT;            /* unused swap entry */
-
-    if (!err)
-        WRITE_ONCE(p->swap_map[offset], count | has_cache);
+        has_cache[i] = count[i] & SWAP_HAS_CACHE;
+        count[i] &= ~SWAP_HAS_CACHE;
+        err = 0;
+
+        if (usage == SWAP_HAS_CACHE) {
+
+            /* set SWAP_HAS_CACHE if there is no cache and entry is used */
+            if (!has_cache[i] && count[i])
+                has_cache[i] = SWAP_HAS_CACHE;
+            else if (has_cache[i])    /* someone else added cache */
+                err = -EEXIST;
+            else            /* no users remaining */
+                err = -ENOENT;
+        } else if (count[i] || has_cache[i]) {
+
+            if ((count[i] & ~COUNT_CONTINUED) < SWAP_MAP_MAX)
+                count[i] += usage;
+            else if ((count[i] & ~COUNT_CONTINUED) > SWAP_MAP_MAX)
+                err = -EINVAL;
+            else if (swap_count_continued(p, offset + i, count[i]))
+                count[i] = COUNT_CONTINUED;
+            else
+                err = -ENOMEM;
+        } else
+            err = -ENOENT;        /* unused swap entry */
+
+        if (err)
+            goto unlock_out;
+    }
+
+    if (!err) {
+        for (i = 0; i < nr; i++)
+            WRITE_ONCE(p->swap_map[offset + i], count[i] | has_cache[i]);
+    }
 
 unlock_out:
     unlock_cluster_or_swap_info(p, ci);
     return err;
 }
 
+static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
+{
+    return __swap_duplicate_nr(entry, 1, usage);
+}
+
 /*
  * Help swapoff by noting that swap entry belongs to shmem/tmpfs
  * (in which case its reference count is never incremented).
@@ -3416,17 +3423,33 @@ int swapcache_prepare(swp_entry_t entry)
     return __swap_duplicate(entry, SWAP_HAS_CACHE);
 }
 
-void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+int swapcache_prepare_nr(swp_entry_t entry, int nr)
+{
+    return __swap_duplicate_nr(entry, nr, SWAP_HAS_CACHE);
+}
+
+void swapcache_clear_nr(struct swap_info_struct *si, swp_entry_t entry, int nr)
 {
     struct swap_cluster_info *ci;
     unsigned long offset = swp_offset(entry);
-    unsigned char usage;
+    unsigned char usage[SWAPFILE_CLUSTER];
+    int i;
 
     ci = lock_cluster_or_swap_info(si, offset);
-    usage = __swap_entry_free_locked(si, offset, SWAP_HAS_CACHE);
+    for (i = 0; i < nr; i++)
+        usage[i] = __swap_entry_free_locked(si, offset + i, SWAP_HAS_CACHE);
     unlock_cluster_or_swap_info(si, ci);
-    if (!usage)
-        free_swap_slot(entry);
+    for (i = 0; i < nr; i++) {
+        if (!usage[i])
+            free_swap_slot(entry);
+        entry.val++;
+    }
+}
+
+void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry)
+{
+    swapcache_clear_nr(si, entry, 1);
 }
 
 struct swap_info_struct *swp_swap_info(swp_entry_t entry)
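[Editor's illustration: before the final patch, it may help to see how the
pair just introduced is meant to be used. swapcache_prepare_nr() claims all
nr slots before the folio is read from swap; a concurrent fault on any slot
of the range must back off; swapcache_clear_nr() releases the claim once the
PTEs are mapped or the fault bails out. A toy model of that hand-off;
booleans stand in for the per-slot SWAP_HAS_CACHE bits and no locking is
shown.]

#include <stdio.h>
#include <stdbool.h>

#define NR 4
static bool claimed[NR];

/* Models swapcache_prepare_nr(): all-or-nothing claim on NR slots. */
static int prepare_nr(void)
{
    int i;

    for (i = 0; i < NR; i++) {
        if (claimed[i])
            return -17;    /* -EEXIST: raced with another fault */
    }
    for (i = 0; i < NR; i++)
        claimed[i] = true;
    return 0;
}

/* Models swapcache_clear_nr(): release the claim on all NR slots. */
static void clear_nr(void)
{
    int i;

    for (i = 0; i < NR; i++)
        claimed[i] = false;
}

int main(void)
{
    printf("first fault:  %d\n", prepare_nr());    /* 0: wins the claim */
    printf("second fault: %d\n", prepare_nr());    /* -17: must back off */
    clear_nr();                                    /* first fault done */
    printf("retry:        %d\n", prepare_nr());    /* 0: proceeds now */
    return 0;
}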
From patchwork Thu Feb 29 00:37:53 2024
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13576153
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org, ryan.roberts@arm.com, chrisl@kernel.org
Cc: 21cnbao@gmail.com, linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com, steven.price@arm.com, surenb@google.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yuzhao@google.com, kasong@tencent.com, yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev, hannes@cmpxchg.org, linux-arm-kernel@lists.infradead.org, Chuanhua Han, Barry Song
Subject: [PATCH RFC v2 5/5] mm: support large folios swapin as a whole
Date: Thu, 29 Feb 2024 13:37:53 +1300
Message-Id: <20240229003753.134193-6-21cnbao@gmail.com>
In-Reply-To: <20240229003753.134193-1-21cnbao@gmail.com>
References: <20240229003753.134193-1-21cnbao@gmail.com>

From: Chuanhua Han

On an embedded system like Android, more than half of anonymous memory
is actually in swap devices such as zRAM. For example, while an app is
switched to the background, most of its memory might be swapped out.
Now we have mTHP features; unfortunately, if we don't support large
folio swap-in, then once those large folios are swapped out, we
immediately lose the performance gain we got through large folios and
hardware optimizations such as CONT-PTE.

This patch brings up mTHP swap-in support. Right now, we limit mTHP
swap-in to those contiguous swap entries which were likely swapped out
from an mTHP as a whole.

The current implementation only covers the SWAP_SYNCHRONOUS case; it
doesn't support swapin_readahead() for large folios yet.

Right now, we refault large folios which are still in the swapcache as
a whole. This effectively decreases the extra loops and early exits we
added in arch_swap_restore() when supporting MTE restore for folios
rather than pages, and it also reduces the number of do_swap_page()
runs, since PTEs used to be set one by one even when we hit a large
folio in the swapcache.

Signed-off-by: Chuanhua Han
Co-developed-by: Barry Song
Signed-off-by: Barry Song
---
 mm/memory.c | 191 ++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 157 insertions(+), 34 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 90b08b7cbaac..471689ce4e91 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -104,9 +104,16 @@ struct page *mem_map;
 EXPORT_SYMBOL(mem_map);
 #endif
 
+/* A choice of behaviors for alloc_anon_folio() */
+enum behavior {
+    DO_SWAP_PAGE,
+    DO_ANON_PAGE,
+};
+
 static vm_fault_t do_fault(struct vm_fault *vmf);
 static vm_fault_t do_anonymous_page(struct vm_fault *vmf);
 static bool vmf_pte_changed(struct vm_fault *vmf);
+static struct folio *alloc_anon_folio(struct vm_fault *vmf, enum behavior behavior);
 
 /*
  * Return true if the original pte was a uffd-wp pte marker (so the pte was
@@ -3974,6 +3981,52 @@ static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
     return VM_FAULT_SIGBUS;
 }
 
+/*
+ * Check that a range of PTEs are all swap entries with contiguous swap
+ * offsets and the same SWAP_HAS_CACHE state. 'pte' must be the first
+ * PTE in the range.
+ */
+static bool is_pte_range_contig_swap(pte_t *pte, int nr_pages)
+{
+    int i;
+    struct swap_info_struct *si;
+    swp_entry_t entry;
+    unsigned type;
+    pgoff_t start_offset;
+    char has_cache;
+
+    entry = pte_to_swp_entry(ptep_get_lockless(pte));
+    if (non_swap_entry(entry))
+        return false;
+    start_offset = swp_offset(entry);
+    if (start_offset % nr_pages)
+        return false;
+
+    si = swp_swap_info(entry);
+    type = swp_type(entry);
+    has_cache = si->swap_map[start_offset] & SWAP_HAS_CACHE;
+    for (i = 1; i < nr_pages; i++) {
+        entry = pte_to_swp_entry(ptep_get_lockless(pte + i));
+        if (non_swap_entry(entry))
+            return false;
+        if (swp_offset(entry) != start_offset + i)
+            return false;
+        if (swp_type(entry) != type)
+            return false;
+        /*
+         * While allocating a large folio and doing swap_read_folio() for
+         * the SWP_SYNCHRONOUS_IO path, the faulting pte doesn't have
+         * swapcache. We need to ensure all PTEs have no cache as well;
+         * otherwise, we might go to swap devices while the content is in
+         * swapcache.
+         */
+        if ((si->swap_map[start_offset + i] & SWAP_HAS_CACHE) != has_cache)
+            return false;
+    }
+
+    return true;
+}
+
 /*
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
@@ -3995,6 +4048,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	pte_t pte;
 	vm_fault_t ret = 0;
 	void *shadow = NULL;
+	int nr_pages = 1;
+	unsigned long start_address;
+	pte_t *start_pte;
 
 	if (!pte_unmap_same(vmf))
 		goto out;
@@ -4058,28 +4114,32 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	if (!folio) {
 		if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
 		    __swap_count(entry) == 1) {
-			/*
-			 * Prevent parallel swapin from proceeding with
-			 * the cache flag. Otherwise, another thread may
-			 * finish swapin first, free the entry, and swapout
-			 * reusing the same entry. It's undetectable as
-			 * pte_same() returns true due to entry reuse.
-			 */
-			if (swapcache_prepare(entry)) {
-				/* Relax a bit to prevent rapid repeated page faults */
-				schedule_timeout_uninterruptible(1);
-				goto out;
-			}
-			need_clear_cache = true;
-
 			/* skip swapcache */
-			folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
-						vma, vmf->address, false);
+			folio = alloc_anon_folio(vmf, DO_SWAP_PAGE);
 			page = &folio->page;
 			if (folio) {
 				__folio_set_locked(folio);
 				__folio_set_swapbacked(folio);
 
+				if (folio_test_large(folio)) {
+					nr_pages = folio_nr_pages(folio);
+					entry.val = ALIGN_DOWN(entry.val, nr_pages);
+				}
+
+				/*
+				 * Prevent parallel swapin from proceeding with
+				 * the cache flag. Otherwise, another thread may
+				 * finish swapin first, free the entry, and swapout
+				 * reusing the same entry. It's undetectable as
+				 * pte_same() returns true due to entry reuse.
+				 */
+				if (swapcache_prepare_nr(entry, nr_pages)) {
+					/* Relax a bit to prevent rapid repeated page faults */
+					schedule_timeout_uninterruptible(1);
+					goto out;
+				}
+				need_clear_cache = true;
+
 				if (mem_cgroup_swapin_charge_folio(folio,
 							vma->vm_mm, GFP_KERNEL,
 							entry)) {
@@ -4185,6 +4245,42 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	 */
 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
 			&vmf->ptl);
+
+	start_address = vmf->address;
+	start_pte = vmf->pte;
+	if (folio_test_large(folio)) {
+		unsigned long nr = folio_nr_pages(folio);
+		unsigned long addr = ALIGN_DOWN(vmf->address, nr * PAGE_SIZE);
+		pte_t *aligned_pte = vmf->pte - (vmf->address - addr) / PAGE_SIZE;
+
+		/*
+		 * Case 1: we are allocating a large folio; try to map it as a
+		 * whole iff the swap entries are still entirely mapped.
+		 * Case 2: we hit a large folio in the swapcache, and all swap
+		 * entries are still entirely mapped; try to map the large folio
+		 * as a whole. Otherwise, map only the faulting page within the
+		 * large folio which is in the swapcache.
+		 */
+		if (!is_pte_range_contig_swap(aligned_pte, nr)) {
+			if (nr_pages > 1) /* ptes have changed for case 1 */
+				goto out_nomap;
+			goto check_pte;
+		}
+
+		start_address = addr;
+		start_pte = aligned_pte;
+		/*
+		 * The below has already been done before
+		 * swap_read_folio() for case 1.
+		 */
+		if (unlikely(folio == swapcache)) {
+			nr_pages = nr;
+			entry.val = ALIGN_DOWN(entry.val, nr_pages);
+			page = &folio->page;
+		}
+	}
+
+check_pte:
 	if (unlikely(!vmf->pte || !pte_same(ptep_get(vmf->pte), vmf->orig_pte)))
 		goto out_nomap;
 
@@ -4252,12 +4348,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	 * We're already holding a reference on the page but haven't mapped it
 	 * yet.
 	 */
-	swap_free(entry);
+	swap_nr_free(entry, nr_pages);
 	if (should_try_to_free_swap(folio, vma, vmf->flags))
 		folio_free_swap(folio);
 
-	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
-	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
+	folio_ref_add(folio, nr_pages - 1);
+	add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
+	add_mm_counter(vma->vm_mm, MM_SWAPENTS, -nr_pages);
+
 	pte = mk_pte(page, vma->vm_page_prot);
 
 	/*
@@ -4267,14 +4365,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	 * exclusivity.
 	 */
 	if (!folio_test_ksm(folio) &&
-	    (exclusive || folio_ref_count(folio) == 1)) {
+	    (exclusive || folio_ref_count(folio) == nr_pages)) {
 		if (vmf->flags & FAULT_FLAG_WRITE) {
 			pte = maybe_mkwrite(pte_mkdirty(pte), vma);
 			vmf->flags &= ~FAULT_FLAG_WRITE;
 		}
 		rmap_flags |= RMAP_EXCLUSIVE;
 	}
-	flush_icache_page(vma, page);
+	flush_icache_pages(vma, page, nr_pages);
 	if (pte_swp_soft_dirty(vmf->orig_pte))
 		pte = pte_mksoft_dirty(pte);
 	if (pte_swp_uffd_wp(vmf->orig_pte))
@@ -4283,17 +4381,19 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 
 	/* ksm created a completely new copy */
 	if (unlikely(folio != swapcache && swapcache)) {
-		folio_add_new_anon_rmap(folio, vma, vmf->address);
+		folio_add_new_anon_rmap(folio, vma, start_address);
 		folio_add_lru_vma(folio, vma);
+	} else if (!folio_test_anon(folio)) {
+		folio_add_new_anon_rmap(folio, vma, start_address);
 	} else {
-		folio_add_anon_rmap_pte(folio, page, vma, vmf->address,
+		folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, start_address,
 					rmap_flags);
 	}
 
 	VM_BUG_ON(!folio_test_anon(folio) ||
 			(pte_write(pte) && !PageAnonExclusive(page)));
-	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
-	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
+	set_ptes(vma->vm_mm, start_address, start_pte, pte, nr_pages);
+	arch_do_swap_page(vma->vm_mm, vma, start_address, pte, vmf->orig_pte);
 
 	folio_unlock(folio);
 	if (folio != swapcache && swapcache) {
@@ -4310,6 +4410,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	}
 
 	if (vmf->flags & FAULT_FLAG_WRITE) {
+		if (nr_pages > 1)
+			vmf->orig_pte = ptep_get(vmf->pte);
+
 		ret |= do_wp_page(vmf);
 		if (ret & VM_FAULT_ERROR)
 			ret &= VM_FAULT_ERROR;
@@ -4317,14 +4420,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	}
 
 	/* No need to invalidate - it was non-present before */
-	update_mmu_cache_range(vmf, vma, vmf->address, vmf->pte, 1);
+	update_mmu_cache_range(vmf, vma, start_address, start_pte, nr_pages);
 unlock:
 	if (vmf->pte)
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 out:
 	/* Clear the swap cache pin for direct swapin after PTL unlock */
 	if (need_clear_cache)
-		swapcache_clear(si, entry);
+		swapcache_clear_nr(si, entry, nr_pages);
 	if (si)
 		put_swap_device(si);
 	return ret;
@@ -4340,7 +4443,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		folio_put(swapcache);
 	}
 	if (need_clear_cache)
-		swapcache_clear(si, entry);
+		swapcache_clear_nr(si, entry, nr_pages);
 	if (si)
 		put_swap_device(si);
 	return ret;
@@ -4358,7 +4461,7 @@ static bool pte_range_none(pte_t *pte, int nr_pages)
 	return true;
 }
 
-static struct folio *alloc_anon_folio(struct vm_fault *vmf)
+static struct folio *alloc_anon_folio(struct vm_fault *vmf, enum behavior behavior)
 {
 	struct vm_area_struct *vma = vmf->vma;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -4376,6 +4479,19 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
 	if (unlikely(userfaultfd_armed(vma)))
 		goto fallback;
 
+	/*
+	 * A large folio being swapped in could be partially in zswap
+	 * and partially in swap devices. zswap doesn't support large
+	 * folios yet, and we might get corrupted zero-filled data by
+	 * reading all subpages from the swap devices while some of
+	 * them are actually in zswap.
+	 */
+	if (behavior == DO_SWAP_PAGE && is_zswap_enabled())
+		goto fallback;
+
+	if (unlikely(behavior != DO_ANON_PAGE && behavior != DO_SWAP_PAGE))
+		return ERR_PTR(-EINVAL);
+
 	/*
 	 * Get a list of all the (large) orders below PMD_ORDER that are enabled
 	 * for this vma. Then filter out the orders that can't be allocated over
@@ -4393,15 +4509,22 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
 		return ERR_PTR(-EAGAIN);
 
 	/*
-	 * Find the highest order where the aligned range is completely
-	 * pte_none(). Note that all remaining orders will be completely
+	 * For do_anonymous_page, find the highest order where the aligned range is
+	 * completely pte_none(). Note that all remaining orders will be completely
 	 * pte_none().
+	 * For do_swap_page, find the highest order where the aligned range is
+	 * completely swap entries with contiguous swap offsets.
 	 */
 	order = highest_order(orders);
 	while (orders) {
 		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
-		if (pte_range_none(pte + pte_index(addr), 1 << order))
-			break;
+		if (behavior == DO_ANON_PAGE) {
+			if (pte_range_none(pte + pte_index(addr), 1 << order))
+				break;
+		} else {
+			if (is_pte_range_contig_swap(pte + pte_index(addr), 1 << order))
+				break;
+		}
 		order = next_order(&orders, order);
 	}
 
@@ -4485,7 +4608,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	if (unlikely(anon_vma_prepare(vma)))
 		goto oom;
 	/* Returns NULL on OOM or ERR_PTR(-EAGAIN) if we must retry the fault */
-	folio = alloc_anon_folio(vmf, DO_ANON_PAGE);
+	folio = alloc_anon_folio(vmf, DO_ANON_PAGE);
 	if (IS_ERR(folio))
 		return 0;
 	if (!folio)
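[One possible way to exercise the new path from userspace, as a rough sketch under stated assumptions: zRAM is the active swap device (so the SWP_SYNCHRONOUS_IO branch is taken), zswap is disabled so alloc_anon_folio() doesn't fall back, and a 64KiB mTHP order has been enabled, e.g. through /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled. MADV_PAGEOUT forces the folio out to swap; touching the range afterwards should fault through do_swap_page() and, with this patch, map the folio back in as a whole:

	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>

	#ifndef MADV_PAGEOUT
	#define MADV_PAGEOUT 21		/* reclaim hint, available since Linux 5.4 */
	#endif

	#define SZ (64 * 1024)

	int main(void)
	{
		char *p = mmap(NULL, SZ, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED)
			return 1;

		memset(p, 0x5a, SZ);		/* fault in, ideally as a single mTHP */

		if (madvise(p, SZ, MADV_PAGEOUT))	/* push the whole folio to swap */
			return 1;

		/* re-touch: with this patch applied, the fault can map the folio whole */
		printf("first byte after swap-in: 0x%x\n", (unsigned char)p[0]);
		return 0;
	}
]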