From patchwork Thu Jan 18 11:10:31 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13522709
From: Barry Song <21cnbao@gmail.com>
To: ryan.roberts@arm.com, akpm@linux-foundation.org, david@redhat.com,
    linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com,
    wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org,
    ying.huang@intel.com, yuzhao@google.com, surenb@google.com,
    steven.price@arm.com, Barry Song
Subject: [PATCH RFC 1/6] arm64: mm: swap: support THP_SWAP on hardware with MTE
Date: Fri, 19 Jan 2024 00:10:31 +1300
Message-Id: <20240118111036.72641-2-21cnbao@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240118111036.72641-1-21cnbao@gmail.com>
References: <20231025144546.577640-1-ryan.roberts@arm.com>
    <20240118111036.72641-1-21cnbao@gmail.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Barry Song

Commit d0637c505f8a1 ("arm64: enable THP_SWAP for arm64") brings up
THP_SWAP on ARM64,
but it doesn't enable THP_SWAP on hardware with MTE, because the MTE
code assumes that tag save/restore always handles a folio with only one
page.

This limitation should be removed: more and more ARM64 SoCs support
MTE, so the co-existence of MTE and THP_SWAP is becoming increasingly
important.

This patch makes MTE tag saving support large folios, so we no longer
need to split large folios into base pages when swapping out on ARM64
SoCs with MTE.

arch_prepare_to_swap() should take a folio rather than a page as its
parameter because we support THP swap-out as a whole. It saves tags for
all pages in a large folio.

As we now restore tags per folio in arch_swap_restore(), we may add
some extra loops and early exits while refaulting a large folio that is
still in the swapcache in do_swap_page(). If a large folio has nr pages,
do_swap_page() only sets the PTE of the particular page that caused the
page fault. Thus do_swap_page() runs nr times, and each time
arch_swap_restore() loops over all nr subpages in the folio, so the
algorithmic complexity is currently O(nr^2).

Once we support mapping large folios in do_swap_page(), the extra loops
and early exits will decrease, though not disappear completely, as a
large folio might be only partially tagged in corner cases such as:
1. a large folio in the swapcache can be partially unmapped, so the MTE
   tags for the unmapped pages will be invalidated;
2. users might use mprotect() to set MTE on part of a large folio.

arch_thp_swp_supported() is dropped since ARM64 MTE was the only user
that needed it.

Reviewed-by: Steven Price
Signed-off-by: Barry Song
Acked-by: Chris Li
---
 arch/arm64/include/asm/pgtable.h | 21 +++-------------
 arch/arm64/mm/mteswap.c          | 42 ++++++++++++++++++++++++++++++++
 include/linux/huge_mm.h          | 12 ---------
 include/linux/pgtable.h          |  2 +-
 mm/page_io.c                     |  2 +-
 mm/swap_slots.c                  |  2 +-
 6 files changed, 49 insertions(+), 32 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 79ce70fbb751..9902395ca426 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -45,12 +45,6 @@
 	__flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-static inline bool arch_thp_swp_supported(void)
-{
-	return !system_supports_mte();
-}
-#define arch_thp_swp_supported arch_thp_swp_supported
-
 /*
  * Outside of a few very special situations (e.g. hibernation), we always
  * use broadcast TLB invalidation instructions, therefore a spurious page
@@ -1042,12 +1036,8 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 #ifdef CONFIG_ARM64_MTE
 
 #define __HAVE_ARCH_PREPARE_TO_SWAP
-static inline int arch_prepare_to_swap(struct page *page)
-{
-	if (system_supports_mte())
-		return mte_save_tags(page);
-	return 0;
-}
+#define arch_prepare_to_swap arch_prepare_to_swap
+extern int arch_prepare_to_swap(struct folio *folio);
 
 #define __HAVE_ARCH_SWAP_INVALIDATE
 static inline void arch_swap_invalidate_page(int type, pgoff_t offset)
@@ -1063,11 +1053,8 @@ static inline void arch_swap_invalidate_area(int type)
 }
 
 #define __HAVE_ARCH_SWAP_RESTORE
-static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
-{
-	if (system_supports_mte())
-		mte_restore_tags(entry, &folio->page);
-}
+#define arch_swap_restore arch_swap_restore
+extern void arch_swap_restore(swp_entry_t entry, struct folio *folio);
 
 #endif /* CONFIG_ARM64_MTE */
 
diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
index a31833e3ddc5..b9ca1b35902f 100644
--- a/arch/arm64/mm/mteswap.c
+++ b/arch/arm64/mm/mteswap.c
@@ -68,6 +68,13 @@ void mte_invalidate_tags(int type, pgoff_t offset)
 	mte_free_tag_storage(tags);
 }
 
+static inline void __mte_invalidate_tags(struct page *page)
+{
+	swp_entry_t entry = page_swap_entry(page);
+
+	mte_invalidate_tags(swp_type(entry), swp_offset(entry));
+}
+
 void mte_invalidate_tags_area(int type)
 {
 	swp_entry_t entry = swp_entry(type, 0);
@@ -83,3 +90,38 @@ void mte_invalidate_tags_area(int type)
 	}
 	xa_unlock(&mte_pages);
 }
+
+int arch_prepare_to_swap(struct folio *folio)
+{
+	int err;
+	long i;
+
+	if (system_supports_mte()) {
+		long nr = folio_nr_pages(folio);
+
+		for (i = 0; i < nr; i++) {
+			err = mte_save_tags(folio_page(folio, i));
+			if (err)
+				goto out;
+		}
+	}
+	return 0;
+
+out:
+	while (i--)
+		__mte_invalidate_tags(folio_page(folio, i));
+	return err;
+}
+
+void arch_swap_restore(swp_entry_t entry, struct folio *folio)
+{
+	if (system_supports_mte()) {
+		long i, nr = folio_nr_pages(folio);
+
+		entry.val -= swp_offset(entry) & (nr - 1);
+		for (i = 0; i < nr; i++) {
+			mte_restore_tags(entry, folio_page(folio, i));
+			entry.val++;
+		}
+	}
+}
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 5adb86af35fc..67219d2309dd 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -530,16 +530,4 @@ static inline int split_folio(struct folio *folio)
 	return split_folio_to_list(folio, NULL);
 }
 
-/*
- * archs that select ARCH_WANTS_THP_SWAP but don't support THP_SWP due to
- * limitations in the implementation like arm64 MTE can override this to
- * false
- */
-#ifndef arch_thp_swp_supported
-static inline bool arch_thp_swp_supported(void)
-{
-	return true;
-}
-#endif
-
 #endif /* _LINUX_HUGE_MM_H */
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index f6d0e3513948..37fe83b0c358 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -925,7 +925,7 @@ static inline int arch_unmap_one(struct mm_struct *mm,
  * prototypes must be defined in the arch-specific asm/pgtable.h file.
  */
 #ifndef __HAVE_ARCH_PREPARE_TO_SWAP
-static inline int arch_prepare_to_swap(struct page *page)
+static inline int arch_prepare_to_swap(struct folio *folio)
 {
 	return 0;
 }
diff --git a/mm/page_io.c b/mm/page_io.c
index ae2b49055e43..a9a7c236aecc 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -189,7 +189,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
 	 * Arch code may have to preserve more data than just the page
 	 * contents, e.g. memory tags.
 	 */
-	ret = arch_prepare_to_swap(&folio->page);
+	ret = arch_prepare_to_swap(folio);
 	if (ret) {
 		folio_mark_dirty(folio);
 		folio_unlock(folio);
diff --git a/mm/swap_slots.c b/mm/swap_slots.c
index 0bec1f705f8e..2325adbb1f19 100644
--- a/mm/swap_slots.c
+++ b/mm/swap_slots.c
@@ -307,7 +307,7 @@ swp_entry_t folio_alloc_swap(struct folio *folio)
 	entry.val = 0;
 
 	if (folio_test_large(folio)) {
-		if (IS_ENABLED(CONFIG_THP_SWAP) && arch_thp_swp_supported())
+		if (IS_ENABLED(CONFIG_THP_SWAP))
 			get_swap_pages(1, &entry, folio_nr_pages(folio));
 		goto out;
 	}
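
Illustrative note, not part of the patch: the following userspace C
sketch mirrors three ideas from the changelog and the mteswap.c hunk,
namely saving tags page by page with rollback on failure, rounding a
faulting swap offset down to the folio's first slot, and the O(nr^2)
restore count. Every name in it (tag_saved, save_one, save_folio_tags,
first_slot, restore_calls, count_saved) is hypothetical and only stands
in for kernel state. The rounding arithmetic appears to rely on nr being
a power of two and on the folio's swap slots being naturally aligned;
both are treated here as assumptions.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLOTS 64

/* Hypothetical stand-in for the kernel's per-swap-slot tag storage. */
static bool tag_saved[MAX_SLOTS];

/*
 * Hypothetical stand-in for mte_save_tags(): records that a slot has
 * saved tags, and fails for slot == fail_at (pass -1 to never fail).
 */
static int save_one(long slot, long fail_at)
{
	if (slot == fail_at)
		return -1;
	tag_saved[slot] = true;
	return 0;
}

/*
 * Save tags for nr consecutive slots starting at first. On failure,
 * undo the slots saved so far, as the while (i--) loop in the patch
 * does for the pages before the failing one.
 */
static int save_folio_tags(long first, long nr, long fail_at)
{
	long i;
	int err;

	for (i = 0; i < nr; i++) {
		err = save_one(first + i, fail_at);
		if (err)
			goto out;
	}
	return 0;

out:
	while (i--)
		tag_saved[first + i] = false;
	return err;
}

/*
 * Round a faulting offset down to the folio's first slot.
 * Assumption: nr is a power of two and the slots are naturally aligned.
 */
static long first_slot(long offset, long nr)
{
	assert(nr > 0 && (nr & (nr - 1)) == 0);
	return offset - (offset & (nr - 1));
}

/*
 * Count per-page restore attempts when each of the nr page faults
 * walks the whole folio, which is the O(nr^2) case in the changelog.
 */
static long restore_calls(long nr)
{
	long calls = 0;

	for (long fault = 0; fault < nr; fault++)
		for (long i = 0; i < nr; i++)
			calls++;
	return calls;
}

/* Number of slots that currently hold saved tags. */
static long count_saved(void)
{
	long n = 0;

	for (long s = 0; s < MAX_SLOTS; s++)
		n += tag_saved[s];
	return n;
}
```

With these helpers, first_slot(37, 16) evaluates to 32 and
restore_calls(16) to 256. A call such as save_folio_tags(0, 8, 5) fails
at the sixth page and leaves none of slots 0..7 marked as saved.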