From patchwork Fri Aug 30 10:03:38 2024
X-Patchwork-Submitter: Usama Arif
X-Patchwork-Id: 13784870
From: Usama Arif
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    npache@redhat.com, baohua@kernel.org, ryan.roberts@arm.com,
    rppt@kernel.org, willy@infradead.org, cerasuolodomenico@gmail.com,
    ryncsn@gmail.com, corbet@lwn.net, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, kernel-team@meta.com, Usama Arif
Subject: [PATCH v5 4/6] mm: Introduce a pageflag for partially mapped folios
Date: Fri, 30 Aug 2024 11:03:38 +0100
Message-ID: <20240830100438.3623486-5-usamaarif642@gmail.com>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20240830100438.3623486-1-usamaarif642@gmail.com>
References: <20240830100438.3623486-1-usamaarif642@gmail.com>
MIME-Version: 1.0
Currently, folio->_deferred_list is used to keep track of partially mapped
folios that are going to be split under memory pressure. In the next patch,
all THPs that are faulted in and collapsed by khugepaged will also be
tracked using _deferred_list.

This patch introduces a pageflag to distinguish partially mapped folios
from the other folios on the deferred list at split time in
deferred_split_scan. It is needed because __folio_remove_rmap decrements
_mapcount, _large_mapcount and _entire_mapcount, so those counts can no
longer be used to tell partially mapped folios apart in
deferred_split_scan.

Even though this patch introduces an extra flag to track whether a folio is
partially mapped, there is no functional change intended: the flag is not
useful in this patch itself, and only becomes useful in the next patch,
when _deferred_list also holds folios that are not partially mapped.
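To make the intent concrete, here is a minimal sketch (illustration only,
not part of the patch) of how split-time code is expected to use the new
flag once non-partially-mapped folios share the deferred list; the helper
name example_should_split_under_pressure() is hypothetical and only mirrors
the shrink_folio_list()/migrate_pages_batch() hunks below:

#include <linux/mm.h>
#include <linux/page-flags.h>

/*
 * Illustrative sketch only. Once the deferred split queue also holds
 * folios that were never partially mapped, the mapcounts (already
 * dropped by __folio_remove_rmap()) cannot be used to tell the two
 * cases apart; the PG_partially_mapped test is used instead.
 */
static bool example_should_split_under_pressure(struct folio *folio)
{
        /* hypothetical helper, shown only to illustrate the check */
        return !list_empty(&folio->_deferred_list) &&
               folio_test_partially_mapped(folio);
}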
Signed-off-by: Usama Arif
---
 include/linux/huge_mm.h    |  4 ++--
 include/linux/page-flags.h | 13 +++++++++++-
 mm/huge_memory.c           | 41 ++++++++++++++++++++++++++++----------
 mm/memcontrol.c            |  3 ++-
 mm/migrate.c               |  3 ++-
 mm/page_alloc.c            |  5 +++--
 mm/rmap.c                  |  5 +++--
 mm/vmscan.c                |  3 ++-
 8 files changed, 56 insertions(+), 21 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 4da102b74a8c..0b0539f4ee1a 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -333,7 +333,7 @@ static inline int split_huge_page(struct page *page)
 {
         return split_huge_page_to_list_to_order(page, NULL, 0);
 }
-void deferred_split_folio(struct folio *folio);
+void deferred_split_folio(struct folio *folio, bool partially_mapped);
 
 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
                 unsigned long address, bool freeze, struct folio *folio);
@@ -502,7 +502,7 @@ static inline int split_huge_page(struct page *page)
 {
         return 0;
 }
-static inline void deferred_split_folio(struct folio *folio) {}
+static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
 #define split_huge_pmd(__vma, __pmd, __address) \
         do { } while (0)
 
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 2175ebceb41c..1b3a76710487 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -186,6 +186,7 @@ enum pageflags {
         /* At least one page in this folio has the hwpoison flag set */
         PG_has_hwpoisoned = PG_active,
         PG_large_rmappable = PG_workingset, /* anon or file-backed */
+        PG_partially_mapped = PG_reclaim, /* was identified to be partially mapped */
 };
 
 #define PAGEFLAGS_MASK          ((1UL << NR_PAGEFLAGS) - 1)
@@ -859,8 +860,18 @@ static inline void ClearPageCompound(struct page *page)
         ClearPageHead(page);
 }
 FOLIO_FLAG(large_rmappable, FOLIO_SECOND_PAGE)
+FOLIO_TEST_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
+/*
+ * PG_partially_mapped is protected by deferred_split split_queue_lock,
+ * so its safe to use non-atomic set/clear.
+ */
+__FOLIO_SET_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
+__FOLIO_CLEAR_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
 #else
 FOLIO_FLAG_FALSE(large_rmappable)
+FOLIO_TEST_FLAG_FALSE(partially_mapped)
+__FOLIO_SET_FLAG_NOOP(partially_mapped)
+__FOLIO_CLEAR_FLAG_NOOP(partially_mapped)
 #endif
 
 #define PG_head_mask                    ((1UL << PG_head))
@@ -1171,7 +1182,7 @@ static __always_inline void __ClearPageAnonExclusive(struct page *page)
  */
 #define PAGE_FLAGS_SECOND                                               \
         (0xffUL /* order */             | 1UL << PG_has_hwpoisoned |   \
-         1UL << PG_large_rmappable)
+         1UL << PG_large_rmappable | 1UL << PG_partially_mapped)
 
 #define PAGE_FLAGS_PRIVATE                              \
         (1UL << PG_private | 1UL << PG_private_2)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index af60684e7c70..166f8810f3c6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3503,7 +3503,11 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
         if (folio_order(folio) > 1 &&
             !list_empty(&folio->_deferred_list)) {
                 ds_queue->split_queue_len--;
-                mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
+                if (folio_test_partially_mapped(folio)) {
+                        __folio_clear_partially_mapped(folio);
+                        mod_mthp_stat(folio_order(folio),
+                                        MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
+                }
                 /*
                  * Reinitialize page_deferred_list after removing the
                  * page from the split_queue, otherwise a subsequent
@@ -3570,13 +3574,18 @@ void __folio_undo_large_rmappable(struct folio *folio)
         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
         if (!list_empty(&folio->_deferred_list)) {
                 ds_queue->split_queue_len--;
-                mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
+                if (folio_test_partially_mapped(folio)) {
+                        __folio_clear_partially_mapped(folio);
+                        mod_mthp_stat(folio_order(folio),
+                                        MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
+                }
                 list_del_init(&folio->_deferred_list);
         }
         spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
 }
 
-void deferred_split_folio(struct folio *folio)
+/* partially_mapped=false won't clear PG_partially_mapped folio flag */
+void deferred_split_folio(struct folio *folio, bool partially_mapped)
 {
         struct deferred_split *ds_queue = get_deferred_split_queue(folio);
 #ifdef CONFIG_MEMCG
@@ -3604,15 +3613,21 @@ void deferred_split_folio(struct folio *folio)
         if (folio_test_swapcache(folio))
                 return;
 
-        if (!list_empty(&folio->_deferred_list))
-                return;
-
         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
+        if (partially_mapped) {
+                if (!folio_test_partially_mapped(folio)) {
+                        __folio_set_partially_mapped(folio);
+                        if (folio_test_pmd_mappable(folio))
+                                count_vm_event(THP_DEFERRED_SPLIT_PAGE);
+                        count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
+                        mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1);
+
+                }
+        } else {
+                /* partially mapped folios cannot become non-partially mapped */
+                VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
+        }
         if (list_empty(&folio->_deferred_list)) {
-                if (folio_test_pmd_mappable(folio))
-                        count_vm_event(THP_DEFERRED_SPLIT_PAGE);
-                count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
-                mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, 1);
                 list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
                 ds_queue->split_queue_len++;
 #ifdef CONFIG_MEMCG
@@ -3660,7 +3675,11 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
                         list_move(&folio->_deferred_list, &list);
                 } else {
                         /* We lost race with folio_put() */
-                        mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
+                        if (folio_test_partially_mapped(folio)) {
+                                __folio_clear_partially_mapped(folio);
+                                mod_mthp_stat(folio_order(folio),
+                                                MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
+                        }
                         list_del_init(&folio->_deferred_list);
                         ds_queue->split_queue_len--;
                 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 087a8cb1a6d8..e66da58a365d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4629,7 +4629,8 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
         VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
         VM_BUG_ON_FOLIO(folio_order(folio) > 1 &&
                         !folio_test_hugetlb(folio) &&
-                        !list_empty(&folio->_deferred_list), folio);
+                        !list_empty(&folio->_deferred_list) &&
+                        folio_test_partially_mapped(folio), folio);
 
         /*
          * Nobody should be changing or seriously looking at
diff --git a/mm/migrate.c b/mm/migrate.c
index d039863e014b..35cc9d35064b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1766,7 +1766,8 @@ static int migrate_pages_batch(struct list_head *from,
                          * use _deferred_list.
                          */
                         if (nr_pages > 2 &&
-                            !list_empty(&folio->_deferred_list)) {
+                            !list_empty(&folio->_deferred_list) &&
+                            folio_test_partially_mapped(folio)) {
                                 if (!try_split_folio(folio, split_folios, mode)) {
                                         nr_failed++;
                                         stats->nr_thp_failed += is_thp;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c2ffccf9d213..a82c221b7c2e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -962,8 +962,9 @@ static int free_tail_page_prepare(struct page *head_page, struct page *page)
                 break;
         case 2:
                 /* the second tail page: deferred_list overlaps ->mapping */
-                if (unlikely(!list_empty(&folio->_deferred_list))) {
-                        bad_page(page, "on deferred list");
+                if (unlikely(!list_empty(&folio->_deferred_list) &&
+                             folio_test_partially_mapped(folio))) {
+                        bad_page(page, "partially mapped folio on deferred list");
                         goto out;
                 }
                 break;
diff --git a/mm/rmap.c b/mm/rmap.c
index 78529cf0fd66..a8797d1b3d49 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1579,8 +1579,9 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
          * Check partially_mapped first to ensure it is a large folio.
          */
         if (partially_mapped && folio_test_anon(folio) &&
-            list_empty(&folio->_deferred_list))
-                deferred_split_folio(folio);
+            !folio_test_partially_mapped(folio))
+                deferred_split_folio(folio, true);
+
         __folio_mod_stat(folio, -nr, -nr_pmdmapped);
 
         /*
diff --git a/mm/vmscan.c b/mm/vmscan.c
index f27792e77a0f..4ca612f7e473 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1238,7 +1238,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
                          * Split partially mapped folios right away.
                          * We can free the unmapped pages without IO.
                          */
-                        if (data_race(!list_empty(&folio->_deferred_list)) &&
+                        if (data_race(!list_empty(&folio->_deferred_list) &&
+                            folio_test_partially_mapped(folio)) &&
                             split_folio_to_list(folio, folio_list))
                                 goto activate_locked;
                 }
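For context, a hedged sketch of the calling convention the new bool enables
(illustration only, not part of this patch): the commit message says the
next patch will track freshly faulted and khugepaged-collapsed THPs on
_deferred_list as well, which would queue them with partially_mapped=false.
The helper name example_track_new_thp() is hypothetical.

/*
 * Illustrative sketch only. A folio queued this way sits on the
 * deferred list but keeps PG_partially_mapped clear; it is only marked
 * partially mapped later, when __folio_remove_rmap() observes a partial
 * unmap and calls deferred_split_folio(folio, true).
 */
static void example_track_new_thp(struct folio *folio)
{
        /* hypothetical caller, named only for this example */
        deferred_split_folio(folio, false);     /* queued, flag untouched */
}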