From patchwork Sun Jun 18 06:57:44 2023
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13283770
Date: Sun, 18 Jun 2023 06:57:44 +0000
Message-ID: <20230618065744.1363948-1-yosryahmed@google.com>
Subject: [RFC PATCH 1/5] mm/mlock: rework mlock_count to use _mapcount for order-0 folios
From: Yosry Ahmed
To: Andrew Morton
Cc: Yu Zhao, "Jan Alexander Steffens (heftig)", Steven Barrett, Brian Geffon,
 "T.J. Alumbaugh", Gaosheng Cui, Suren Baghdasaryan,
 "Matthew Wilcox (Oracle)", "Liam R. Howlett", David Hildenbrand,
 Jason Gunthorpe, Mathieu Desnoyers, David Howells, Hugh Dickins,
 Greg Thelen, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Yosry Ahmed
Howlett" , David Hildenbrand , Jason Gunthorpe , Mathieu Desnoyers , David Howells , Hugh Dickins , Greg Thelen , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Yosry Ahmed X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 2DADA40004 X-Stat-Signature: 6z175hudh9493zcef794kskymak5esho X-HE-Tag: 1687071468-742151 X-HE-Meta: U2FsdGVkX18HuGOGA0QR/wQ/nmLjJZayebwo4WT+hQsHdLM5KPcSq5v4kztvniaLKrtKyOE+Mgyc4xOJIP7v+5d9E3kxjn9RnvKf/k4/ctecB8F2ZKPowV8/AeK5u2tCM7QMrFpn6ohi5Go88kqQrBJQBzufHXU0tVqOu0She8qW2W/uNuuVBnGYX0gN4WlAlX0aAmob1lX7EPTLIytaxUuYXSKC7dW5NAeBttpEawzsjGjNj5ltWRScNziYwVCi/9XmriM2ucg+bRKiGGg/mmUQgTgn3i1EMXrsVck5ozz4UpbBz2vCFVCgnmyBtB/aZMiGX4i+LffkmgGo+HCQfp+AA/5t4/vgtqqkxsU+5l82BCRtWo0mvUg0QTfOdZLZ+CWu0t3O4ea1BHf5GgPdxNzuFytkxXTNEk9UWV1v6ymNPQ1ssR9PBbxKtKXXHOxugGsUfo0IcMwoBrtp2KDWJDvvJNUBlNWG7scXGA0Z3GyppTqONF3T3ZeOwAYo0fkP1Gjd8sKhxYXIRt28zLYu9RphM5lILitrxjQr31FkeCQL/rsxb2EUTqItVEKLQek4vD2V0cCdefOZ7UnLfdD7Q399MxuZMQ32Z6LBcr/7spK861nX0cUVe/hsh3R/hkjwELeIt43Gv0JHe9i10zdhAF8ZPq+mj5vmOuUPvJl8O+OxzbRhWz2u+99hrOzKoQ6wKFCBC22ZSyrXAge3D3xaF06QSJHcXQ6d8rgVeyIIeWOG8jjO634Vhl27+XAV/LoiXqKTBEsbkWELcWYojSv1KpYTCwLyAAomev5LvgThi1keguSHX5BAwyYIF6uUg0DcGVSVgBQoV876wGeBugNs+NW2eDP5Q+zuM+WYLKla3oEkUTbxwUk0GoxbYZPAFvactpA3IYV1z4FVLotkH/AvTdgwshSygMz+SsxYnjxcjd6cBZAsGq7QIGf8yoMHvONm3qJvkKxiPpRNufI6B9R zRJI4rQI 0+ocreT8L2o4VRmR4RWWoseGS6oC6n9RhTQmQPwHH+qZgHoZ/RR4GsJxifZeCh9n0Vgnq5K5Z4Pf0b+L2bla+/Ndh5+gIh4RgOZEabGc58P+rQ48zBeVjmqI5XqsINyDTIQoISuT6Qymt/6gwTkvF7BWr095t/tmarLQxXnl/5XAbt7xnUrGE4KLBmuZ3X26mGbHewqXpV0eKFM/bsPt++Z74C1sdlCJ+uRt0SycgwtKtCIDIHXEqBN4R9+fTKM+gk4i2JG3GExt4Y3GUEIehMIj8rfPpBUZMiE85tWJYCdS5CcobE/ZgxX+oV+bF5SQEnMw3dHqnOx3fJclCBGWiWlWXyVW/HWxR5Iu8HmX5kbZLh14xfwcmJvIa/tLbk3GZafSWrk/v45XfrJlUChdld9jWOetahC6edN+6axx/zdC08Kik7YF5kRMWwXi/L5X5REMoQpqW1j1ie7El9654P6uMvzSp346XLxbVB0cmKMoQ8xHZX/xwaWZSKQXl682N3u2j5vbYKiks4r2DgJxxtsf5StkFqseim0QxvomXfV7XM7TYpVL+HRkRCTb4jlhyYrDDEzzQ3E/ec3n+a+TyySxJ4bvWFSYvHYfw4G28v/xJE7WuGLzesg1dfgsxXpqekG/nTS4km5ra/Mxd4fH6BLn+sReaxv8LtDL0zToNKGk6nZShNLTT5Um1Qw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Commit 07ca76067308 ("mm/munlock: maintain page->mlock_count while unevictable") introduced mlock_count to keep track of how many times a folio was mlocked to avoid having to walk the rmap during munlock() to find out whether or not the folio is mlocked() by a different mapping. page->mlock_count currently overlays page->lru for unevictable mlocked folios. A side effect of this is that we lose the mlock_count every time we isolate a folio. We have to re-initialize the mlock_count every time we realize we notice that it's gone, and the initialization is different in different places. Furthermore, we lose the unevictable LRU (it became imaginary). While we never currently scan this LRU, it is useful to have ~all user folios charged to a memcg in the LRUs for future work. Rework the mlock_count scheme. For order-0 pages, piggyback the mlock_count with the mapcount for mlocked folios only. For mlocked folios, we leave the 20 lower bits for the mapcount, and use the 11 next bits for the mlock_count (we leave the highermost bit as the counter is signed). We do not allow the mlock_count to overflow, we cap it at 2^11 - 1 (2047 mlocks). It is fine to underestimate the mlock_count, we might end up clearing PG_mlocked early, but that's fine. Reclaim will fix it for us. 
The max mapcount allowed for mlocked order-0 pages is 2^20 - 1, a little
over a million. In the rare event that this is not enough, the mapcount
will become incorrect (underestimated) -- but we never allow it to fall
to 0 or 1, as these can have special meanings (unmapped or not shared).
Once the mapcount falls below 2^20 - 1 again, it becomes correct. For
what it's worth, most code paths checking the mapcount either do not
apply to mlocked order-0 folios (MADV_FREE, MADV_COLD, reclaim, ...),
just compare the mapcount to 1 (which we make sure doesn't happen by
mistake), or compare the mapcount to the refcount to estimate whether
there are extra refs on the page (in which case we will mistakenly
report extra refs).

For higher-order folios, we add a new atomic counter, _mlock_count, to
the second struct page, and do not need to handle the above complexity.

The complexity described above is hidden within mm/mlock.c; a new
helper, folio_mlocked_mapcount(), is introduced to read the mapcount of
an mlocked order-0 folio. The mlock_count is only modified within
mm/mlock.c, so this is already hidden from the rest of mm.

We increment the mlock_count when PG_mlocked is set (in
mlock_folio()/mlock_new_folio()), before the batched mlock operation,
and we decrement it in the batched munlock operation when PG_mlocked is
cleared. This correctly follows the pattern of setting/clearing
PG_mlocked, which we couldn't do before because modifying mlock_count
required holding the lruvec lock -- we don't need that anymore.

The inc/dec/initialize logic is also simpler now: the mlock_count is
always maintained, so there is no need to check whether it is, no need
to re-initialize it every time we add an isolated folio to the
unevictable LRU, and no need to re-initialize it every time we find that
it was cleared during mlock/munlock. Furthermore, we get to update the
mlock_count even if we fail to isolate the folio, which we couldn't do
when mlock_count overlaid page->lru.
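To make the layout concrete, here is a minimal userspace sketch of the
encoding (the MLOCK_* constants mirror the ones this patch adds to
mm/mlock.c; demo_pack()/demo_unpack() are illustrative helpers, not
kernel API):

	#include <assert.h>
	#include <limits.h>
	#include <stdio.h>

	/* Constants as defined by this patch in mm/mlock.c. */
	#define MLOCK_COUNT_SHIFT   20
	#define MLOCK_COUNT_BIAS    (1U << MLOCK_COUNT_SHIFT)
	#define MLOCK_MAPCOUNT_MASK (MLOCK_COUNT_BIAS - 1U)
	#define MLOCK_COUNT_MAX     (INT_MAX >> MLOCK_COUNT_SHIFT)

	/*
	 * What _mapcount holds for an order-0 mlocked folio: _mapcount is
	 * stored biased by -1, and each mlock adds MLOCK_COUNT_BIAS.
	 */
	static int demo_pack(int mapcount, int mlock_count)
	{
		return (mlock_count << MLOCK_COUNT_SHIFT) + mapcount - 1;
	}

	/* The read side, mirroring folio_mlocked_mapcount(). */
	static void demo_unpack(int raw, int *mapcount, int *mlock_count)
	{
		int v = raw + 1;	/* always add 1 when parsing _mapcount */

		*mlock_count = v >> MLOCK_COUNT_SHIFT;
		*mapcount = v & MLOCK_MAPCOUNT_MASK;
	}

	int main(void)
	{
		int raw = demo_pack(3, 2);	/* mapped 3 times, mlocked twice */
		int mapcount, mlock_count;

		demo_unpack(raw, &mapcount, &mlock_count);
		assert(mapcount == 3 && mlock_count == 2);
		printf("raw=%#x mapcount=%d mlock_count=%d\n",
		       raw, mapcount, mlock_count);
		return 0;
	}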
Signed-off-by: Yosry Ahmed
---
 include/linux/mm.h       |  23 ++++++--
 include/linux/mm_types.h |  24 +-------
 mm/huge_memory.c         |   4 +-
 mm/mlock.c               | 121 ++++++++++++++++++++++++++++++++-------
 mm/swap.c                |   8 ---
 5 files changed, 122 insertions(+), 58 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 27ce77080c79..3994580772b3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1049,6 +1049,7 @@ unsigned long vmalloc_to_pfn(const void *addr);
 #ifdef CONFIG_MMU
 extern bool is_vmalloc_addr(const void *x);
 extern int is_vmalloc_or_module_addr(const void *x);
+extern int folio_mlocked_mapcount(struct folio *folio);
 #else
 static inline bool is_vmalloc_addr(const void *x)
 {
@@ -1058,6 +1059,10 @@ static inline int is_vmalloc_or_module_addr(const void *x)
 {
 	return 0;
 }
+static inline int folio_mlocked_mapcount(struct folio *folio)
+{
+	return 0;
+}
 #endif
 
 /*
@@ -1100,6 +1105,8 @@ static inline int page_mapcount(struct page *page)
 
 	if (unlikely(PageCompound(page)))
 		mapcount += folio_entire_mapcount(page_folio(page));
+	else if (unlikely(PageMlocked(page)))
+		mapcount = folio_mlocked_mapcount(page_folio(page));
 
 	return mapcount;
 }
@@ -1119,16 +1126,20 @@ int folio_total_mapcount(struct folio *folio);
  */
 static inline int folio_mapcount(struct folio *folio)
 {
-	if (likely(!folio_test_large(folio)))
-		return atomic_read(&folio->_mapcount) + 1;
-	return folio_total_mapcount(folio);
+	if (unlikely(folio_test_large(folio)))
+		return folio_total_mapcount(folio);
+	if (unlikely(folio_test_mlocked(folio)))
+		return folio_mlocked_mapcount(folio);
+	return atomic_read(&folio->_mapcount) + 1;
 }
 
 static inline int total_mapcount(struct page *page)
 {
-	if (likely(!PageCompound(page)))
-		return atomic_read(&page->_mapcount) + 1;
-	return folio_total_mapcount(page_folio(page));
+	if (unlikely(PageCompound(page)))
+		return folio_total_mapcount(page_folio(page));
+	if (unlikely(PageMlocked(page)))
+		return folio_mlocked_mapcount(page_folio(page));
+	return atomic_read(&page->_mapcount) + 1;
 }
 
 static inline bool folio_large_is_mapped(struct folio *folio)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 306a3d1a0fa6..8c8d524fb263 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -89,15 +89,6 @@ struct page {
 		 */
 		union {
 			struct list_head lru;
-
-			/* Or, for the Unevictable "LRU list" slot */
-			struct {
-				/* Always even, to negate PageTail */
-				void *__filler;
-				/* Count page's or folio's mlocks */
-				unsigned int mlock_count;
-			};
-
 			/* Or, free page */
 			struct list_head buddy_list;
 			struct list_head pcp_list;
@@ -266,7 +257,6 @@ static inline struct page *encoded_page_ptr(struct encoded_page *page)
  * struct folio - Represents a contiguous set of bytes.
  * @flags: Identical to the page flags.
  * @lru: Least Recently Used list; tracks how recently this folio was used.
- * @mlock_count: Number of times this folio has been pinned by mlock().
  * @mapping: The file this page belongs to, or refers to the anon_vma for
  *	anonymous memory.
  * @index: Offset within the file, in units of pages. For anonymous memory,
@@ -283,6 +273,7 @@ static inline struct page *encoded_page_ptr(struct encoded_page *page)
  * @_entire_mapcount: Do not use directly, call folio_entire_mapcount().
  * @_nr_pages_mapped: Do not use directly, call folio_mapcount().
  * @_pincount: Do not use directly, call folio_maybe_dma_pinned().
+ * @_mlock_count: Do not use directly, used exclusively in mm/mlock.c.
  * @_folio_nr_pages: Do not use directly, call folio_nr_pages().
  * @_hugetlb_subpool: Do not use directly, use accessor in hugetlb.h.
  * @_hugetlb_cgroup: Do not use directly, use accessor in hugetlb_cgroup.h.
@@ -304,17 +295,7 @@ struct folio {
 		struct {
 	/* public: */
 			unsigned long flags;
-			union {
-				struct list_head lru;
-	/* private: avoid cluttering the output */
-				struct {
-					void *__filler;
-	/* public: */
-					unsigned int mlock_count;
-	/* private: */
-				};
-	/* public: */
-			};
+			struct list_head lru;
 			struct address_space *mapping;
 			pgoff_t index;
 			void *private;
@@ -337,6 +318,7 @@ struct folio {
 			atomic_t _entire_mapcount;
 			atomic_t _nr_pages_mapped;
 			atomic_t _pincount;
+			atomic_t _mlock_count;
 #ifdef CONFIG_64BIT
 			unsigned int _folio_nr_pages;
 #endif
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 624671aaa60d..0e5b58ca603f 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2407,9 +2407,7 @@ static void lru_add_page_tail(struct page *head, struct page *tail,
 	} else {
 		/* head is still on lru (and we have it frozen) */
 		VM_WARN_ON(!PageLRU(head));
-		if (PageUnevictable(tail))
-			tail->mlock_count = 0;
-		else
+		if (!PageUnevictable(tail))
 			list_add_tail(&tail->lru, &head->lru);
 		SetPageLRU(tail);
 	}
diff --git a/mm/mlock.c b/mm/mlock.c
index 40b43f8740df..5c5462627391 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -47,6 +47,98 @@ bool can_do_mlock(void)
 }
 EXPORT_SYMBOL(can_do_mlock);
 
+/*
+ * Keep track of the number of times an order-0 folio was mlock()'d by adding
+ * MLOCK_COUNT_BIAS to folio->_mapcount for each mlock operation.
+ * We leave the lower 20 bits for the mapcount, and we leave the highest bit
+ * untouched as the counter is signed. This means that we can count up to
+ * 2047 mlock()'s before overflowing. We do not allow overflowing to avoid
+ * making the mapcount negative. Instead, we cap the mlock_count at
+ * (INT_MAX >> MLOCK_COUNT_SHIFT). This means that the number of mlock()'s may
+ * be underestimated, but this is fine. Reclaim will fix it for us.
+ *
+ * For large folios, we have a dedicated _mlock_count field and don't need to
+ * worry about this.
+ *
+ * When reading the mapcount of an order-0 folio, if the folio is mlock()'d,
+ * we only look at the lower 20 bits (MLOCK_MAPCOUNT_MASK). For the rare case
+ * where the number of mappings >= MLOCK_COUNT_BIAS:
+ * (1) The mapcount will be incorrect (underestimated). It will be correct again
+ *     once the number of mappings falls below MLOCK_COUNT_BIAS.
+ * (2) munlock() can misinterpret the large number of mappings as an mlock_count
+ *     and leave PG_mlocked set.
+ */
+#define MLOCK_COUNT_SHIFT	20
+#define MLOCK_COUNT_BIAS	(1U << MLOCK_COUNT_SHIFT)
+#define MLOCK_MAPCOUNT_MASK	(MLOCK_COUNT_BIAS - 1U)
+#define MLOCK_COUNT_MAX		(INT_MAX >> MLOCK_COUNT_SHIFT)
+
+int folio_mlocked_mapcount(struct folio *folio)
+{
+	int mapcount, mlock_count;
+
+	VM_BUG_ON(!folio_test_mlocked(folio) || folio_test_large(folio));
+	/* always add 1 to folio->_mapcount when parsing it */
+	mapcount = atomic_read(&folio->_mapcount) + 1;
+	mlock_count = mapcount >> MLOCK_COUNT_SHIFT;
+	mapcount &= MLOCK_MAPCOUNT_MASK;
+
+	/*
+	 * If the mapcount overflows beyond the lower 20 bits, we will see an
+	 * elevated mlock_count (at least 2), and an extremely underestimated
+	 * mapcount (potentially 0 or 1). Make sure we at least return a value
+	 * higher than 1 (a mapcount of 1 usually signifies exclusive mapping).
+	 */
+	if (mlock_count > mapcount)
+		return mlock_count;
+	return mapcount;
+}
+
+static void folio_mlock_count_inc(struct folio *folio)
+{
+	int old, new, mlock_count;
+
+	if (folio_test_large(folio)) {
+		atomic_inc(&folio->_mlock_count);
+		return;
+	}
+
+	/*
+	 * When using the upper bits of _mapcount, make sure we do not overflow
+	 * into the sign bit. If we underestimate, reclaim will fix it.
+	 */
+	old = atomic_read(&folio->_mapcount);
+	do {
+		/* always add 1 to folio->_mapcount when parsing it */
+		mlock_count = (old + 1) >> MLOCK_COUNT_SHIFT;
+		if (mlock_count == MLOCK_COUNT_MAX)
+			return;
+		new = old + MLOCK_COUNT_BIAS;
+	} while (!atomic_try_cmpxchg(&folio->_mapcount, &old, new));
+}
+
+static int folio_mlock_count_dec(struct folio *folio)
+{
+	int old, new, mlock_count;
+
+	if (folio_test_large(folio))
+		return atomic_dec_return(&folio->_mlock_count);
+
+	/*
+	 * When using the upper bits of _mapcount, we may have underestimated
+	 * the mlock count before. Do not underflow.
+	 */
+	old = atomic_read(&folio->_mapcount);
+	do {
+		/* always add 1 to folio->_mapcount when parsing it */
+		mlock_count = (old + 1) >> MLOCK_COUNT_SHIFT;
+		if (mlock_count == 0)
+			return 0;
+		new = old - MLOCK_COUNT_BIAS;
+	} while (!atomic_try_cmpxchg(&folio->_mapcount, &old, new));
+	return mlock_count - 1;
+}
+
 /*
  * Mlocked folios are marked with the PG_mlocked flag for efficient testing
  * in vmscan and, possibly, the fault path; and to support semi-accurate
@@ -83,16 +175,12 @@ static struct lruvec *__mlock_folio(struct folio *folio, struct lruvec *lruvec)
 		goto out;
 	}
 
-	if (folio_test_unevictable(folio)) {
-		if (folio_test_mlocked(folio))
-			folio->mlock_count++;
+	if (folio_test_unevictable(folio))
 		goto out;
-	}
 
 	lruvec_del_folio(lruvec, folio);
 	folio_clear_active(folio);
 	folio_set_unevictable(folio);
-	folio->mlock_count = !!folio_test_mlocked(folio);
 	lruvec_add_folio(lruvec, folio);
 	__count_vm_events(UNEVICTABLE_PGCULLED, folio_nr_pages(folio));
 out:
@@ -111,7 +199,6 @@ static struct lruvec *__mlock_new_folio(struct folio *folio, struct lruvec *lruvec)
 		goto out;
 
 	folio_set_unevictable(folio);
-	folio->mlock_count = !!folio_test_mlocked(folio);
 	__count_vm_events(UNEVICTABLE_PGCULLED, folio_nr_pages(folio));
 out:
 	lruvec_add_folio(lruvec, folio);
@@ -124,22 +211,14 @@ static struct lruvec *__munlock_folio(struct folio *folio, struct lruvec *lruvec)
 	int nr_pages = folio_nr_pages(folio);
 	bool isolated = false;
 
-	if (!folio_test_clear_lru(folio))
-		goto munlock;
-
-	isolated = true;
-	lruvec = folio_lruvec_relock_irq(folio, lruvec);
-
-	if (folio_test_unevictable(folio)) {
-		/* Then mlock_count is maintained, but might undercount */
-		if (folio->mlock_count)
-			folio->mlock_count--;
-		if (folio->mlock_count)
-			goto out;
+	if (folio_test_clear_lru(folio)) {
+		isolated = true;
+		lruvec = folio_lruvec_relock_irq(folio, lruvec);
 	}
-	/* else assume that was the last mlock: reclaim will fix it if not */
-munlock:
+
+	if (folio_mlock_count_dec(folio) > 0)
+		goto out;
+
 	if (folio_test_clear_mlocked(folio)) {
 		__zone_stat_mod_folio(folio, NR_MLOCK, -nr_pages);
 		if (isolated || !folio_test_unevictable(folio))
@@ -254,6 +333,7 @@ void mlock_folio(struct folio *folio)
 		__count_vm_events(UNEVICTABLE_PGMLOCKED, nr_pages);
 	}
 
+	folio_mlock_count_inc(folio);
 	folio_get(folio);
 	if (!folio_batch_add(fbatch, mlock_lru(folio)) ||
 	    folio_test_large(folio) || lru_cache_disabled())
@@ -273,6 +353,7 @@ void mlock_new_folio(struct folio *folio)
 	local_lock(&mlock_fbatch.lock);
 	fbatch = this_cpu_ptr(&mlock_fbatch.fbatch);
 	folio_set_mlocked(folio);
+	folio_mlock_count_inc(folio);
 
 	zone_stat_mod_folio(folio, NR_MLOCK, nr_pages);
 	__count_vm_events(UNEVICTABLE_PGMLOCKED, nr_pages);
diff --git a/mm/swap.c b/mm/swap.c
index 423199ee8478..8b6f6e2fdc24 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -184,14 +184,6 @@ static void lru_add_fn(struct lruvec *lruvec, struct folio *folio)
 	} else {
 		folio_clear_active(folio);
 		folio_set_unevictable(folio);
-		/*
-		 * folio->mlock_count = !!folio_test_mlocked(folio)?
-		 * But that leaves __mlock_folio() in doubt whether another
-		 * actor has already counted the mlock or not. Err on the
-		 * safe side, underestimate, let page reclaim fix it, rather
-		 * than leaving a page on the unevictable LRU indefinitely.
-		 */
-		folio->mlock_count = 0;
 		if (!was_unevictable)
 			__count_vm_events(UNEVICTABLE_PGCULLED, nr_pages);
 	}

From patchwork Sun Jun 18 06:57:56 2023
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13283771
Date: Sun, 18 Jun 2023 06:57:56 +0000
Message-ID: <20230618065756.1364399-1-yosryahmed@google.com>
Subject: [RFC PATCH 2/5] mm/mlock: fixup mlock_count during unmap
From: Yosry Ahmed
To: Andrew Morton
Cc: Yu Zhao, "Jan Alexander Steffens (heftig)", Steven Barrett, Brian Geffon,
 "T.J. Alumbaugh", Gaosheng Cui, Suren Baghdasaryan,
 "Matthew Wilcox (Oracle)", "Liam R. Howlett", David Hildenbrand,
 Jason Gunthorpe, Mathieu Desnoyers, David Howells, Hugh Dickins,
 Greg Thelen, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Yosry Ahmed
Howlett" , David Hildenbrand , Jason Gunthorpe , Mathieu Desnoyers , David Howells , Hugh Dickins , Greg Thelen , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Yosry Ahmed X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 5E9B840007 X-Stat-Signature: 5q7wohm5r1h47p3kfrgdz58tj16xt9t5 X-Rspam-User: X-HE-Tag: 1687071481-645635 X-HE-Meta: U2FsdGVkX183Be5B9hLE0Yyfh5Vwx+HDqZ/tJMUHEbN/GoabfY/b85uBoqrumgfRIeT/zhn5yQfeVzb7A8uE6Fja1SWxArrUa3Cia8D+jEUkM2nWW4AZEcGhPs7BJqrg7U9id7VRZi9+lQEjQtt6YUUGKWnusFP2XpVu5LfVUZAvxQxKJA7zfRJ6+GhEKsm+iMM0ofBPyzNAA2H3r8a6ARDuRzuaDukJ0OUHmpMFiwNJ/HGaKsO6nctMLuGkf6aJyShdce/NnTBr6sIMFNoqeNQsATB+Xm2SXP4nmIKUkNFys8YC5j3j4pJy/78wbz1vziQAAKvbtkCRUp9ih/GrJ2SZdPd4owuve/TXIDKJcbnRf4hiqfx3IQeBwlPkmvQ1euCLUfXUEkgHEuJhKfr6837+Qax48XJ2YUeOSRiFUtdTYHK2mryVQX3hMNfEYzWuOaLoSTsJpU/X266ryr3Mi8CRJAsyswzvo4Sjmg2H2sRTMTmOmlIM7V/81dY8cs2Os0OxCV6VWTDh10rpB/oON7Q5O+rn6N0jYAm3YeImze9kGM8jf+xWtPgIQCXSrFG3OO6Q5xCZ1G6mL+9I9zP6YVGpoX1Iy3Tu2DHpjXhGaSfqgWyTMyb9njjrJega0N3uS7llYM3wTckzGbgQcC+zFG2VIDMEiTO8Kw7mk7r377WwsF9fbS6j6dPoNKtR0L8ShSQ4dbUeooQh9RpmGFlpyz83L7LZTY/EM3tcMW1leFF9+5sdeDIY+buI1bB6p6PUU9yUMPWwQ0oj34jdQ72d5HAMuv8IN5LMURbqG6wiTbbpUzGETepnDZZhmn/GyTcp5irEO92Qf/V+dnSSOgveLJ+To0UfFIs07xUThOCGQAtpWdATDbCV8a1Pu4852jiRBaaHfnvT5c/RFWCUp6m6+qf3Cn+NgWE75nQV871dF3wfogha1Rzj8xLkz1zDH5C6UY1ZDJzzCSpeQZNe79c IoZcPvwY tH33t6eqcvK6eI93JS9nE91mAxRYUHkN9bWfte1ZqKvHtfGJE1ny2s5MzEBYeQbngEZSV5SAMnUoZLsngr1SfTPPlX8cQcSqg7NHxax8ScuRO4rXV1pU2yjYLqXlP9pdv99wUcJGZKqW0tf06x06S4xt/rehFEBcw7i4Wi3bMU1Zy77higriR0Wzt8Gv/NRx48CDJmaDIpjs0BHw1A8raBPUZcZHt/XVCwDpc59ZwobPzwyZkgTiauwUPpUXSr9xXR0ABDZBQsz/OFCPyCebnYmchURGdPttHWhgFhLCgAtQU8edhEjW9xCjbyueuUw1P0v2U7QJwYRaIg/z3Pw4TL+exnS9wevod+8Cc+Vt1/soVAJCJRgexGTpOq4jv+kCoUFR9OpKZoWqe5usyDnqZ+WmqXJiqZdkOsW+NBPYLtuaYQ0GoL9BXP8QI0SDvXS8cbje3hLSWck3vh92LyeOWndteCc8ZvFai35AZmbd88K4XnkD/yGQr2GJXoA7C63cBlByulVXSrrSLy2V0WmxMQxhjiWv3e9whZZW9knN6r6gL6+b2W/ZkEWuunUogSRLWm29YwdnioE5Xv4lz81Prpd2TxoqaIkPySRY7uI2fX4IISW6z7dAJfRMoBZpgswXD//WA93vDcRbPIajYsg+jyccxs10oMuP2b0iENKQyxKHS3ygJhA9Q/6zUVg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: In the rare case where an mlocked order-0 folio is mapped 2^20 or more times, the high mapcount can be interpreted mistakenly by munlock() as an mlock_count, causing PG_mlocked to not be cleared, possibly leaving the folio stranded as unevictable endlessly. To fix this, add a hook during unmapping to check if the bits used for mlock_count are 0s yet PG_mlocked is set. In this case, call make sure to perform the missed munlock operation. 
Signed-off-by: Yosry Ahmed
---
 include/linux/mm.h |  4 ++++
 mm/mlock.c         | 18 +++++++++++++++++-
 mm/rmap.c          |  1 +
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3994580772b3..b341477a83e8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1050,6 +1050,7 @@ unsigned long vmalloc_to_pfn(const void *addr);
 extern bool is_vmalloc_addr(const void *x);
 extern int is_vmalloc_or_module_addr(const void *x);
 extern int folio_mlocked_mapcount(struct folio *folio);
+extern void folio_mlock_unmap_check(struct folio *folio);
 #else
 static inline bool is_vmalloc_addr(const void *x)
 {
@@ -1063,6 +1064,9 @@ static inline int folio_mlocked_mapcount(struct folio *folio)
 {
 	return 0;
 }
+static inline void folio_mlock_unmap_check(struct folio *folio)
+{
+}
 #endif
 
 /*
diff --git a/mm/mlock.c b/mm/mlock.c
index 5c5462627391..8261df11d6a6 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -66,7 +66,8 @@ EXPORT_SYMBOL(can_do_mlock);
  * (1) The mapcount will be incorrect (underestimated). It will be correct again
  *     once the number of mappings falls below MLOCK_COUNT_BIAS.
  * (2) munlock() can misinterpret the large number of mappings as an mlock_count
- *     and leave PG_mlocked set.
+ *     and leave PG_mlocked set. This will be fixed when the number of mappings
+ *     falls below MLOCK_COUNT_BIAS by folio_mlock_unmap_check().
  */
 #define MLOCK_COUNT_SHIFT	20
 #define MLOCK_COUNT_BIAS	(1U << MLOCK_COUNT_SHIFT)
@@ -139,6 +140,21 @@ static int folio_mlock_count_dec(struct folio *folio)
 	return mlock_count - 1;
 }
 
+/*
+ * Call after decrementing the mapcount. If the mapcount previously overflowed
+ * beyond the lower 20 bits for an order-0 mlocked folio, munlock() may have
+ * mistakenly left the folio mlocked. Fix it here.
+ */
+void folio_mlock_unmap_check(struct folio *folio)
+{
+	int mapcount = atomic_read(&folio->_mapcount) + 1;
+	int mlock_count = mapcount >> MLOCK_COUNT_SHIFT;
+
+	if (unlikely(!folio_test_large(folio) && folio_test_mlocked(folio) &&
+		     mlock_count == 0))
+		munlock_folio(folio);
+}
+
 /*
  * Mlocked folios are marked with the PG_mlocked flag for efficient testing
  * in vmscan and, possibly, the fault path; and to support semi-accurate
diff --git a/mm/rmap.c b/mm/rmap.c
index 19392e090bec..02e558551f15 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1392,6 +1392,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 			nr = atomic_dec_return_relaxed(mapped);
 			nr = (nr < COMPOUND_MAPPED);
 		}
+		folio_mlock_unmap_check(folio);
 	} else if (folio_test_pmd_mappable(folio)) {
 		/* That test is redundant: it's for safety or to optimize out */

From patchwork Sun Jun 18 06:58:09 2023
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13283772
Date: Sun, 18 Jun 2023 06:58:09 +0000
Message-ID: <20230618065809.1364900-1-yosryahmed@google.com>
Subject: [RFC PATCH 3/5] mm/mlock: WARN_ON() if mapcount overflows into mlock_count
From: Yosry Ahmed
To: Andrew Morton
Cc: Yu Zhao, "Jan Alexander Steffens (heftig)", Steven Barrett, Brian Geffon,
 "T.J. Alumbaugh", Gaosheng Cui, Suren Baghdasaryan,
 "Matthew Wilcox (Oracle)", "Liam R. Howlett", David Hildenbrand,
 Jason Gunthorpe, Mathieu Desnoyers, David Howells, Hugh Dickins,
 Greg Thelen, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Yosry Ahmed

Add a hook when a new mapping is added to an order-0 mlocked folio to
check if the mapcount overflowed beyond the allocated 20 bits into the
mlock_count. This is useful as an alarm, to know whether this happens
frequently enough to cause a problem. We detect the overflow by checking
whether the lower 20 bits of the mapcount are all 0s, as sketched below.
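A minimal userspace model of that check (the constants are from patch 1;
demo_overflowed() is illustrative, not kernel code):

	#include <assert.h>

	#define MLOCK_COUNT_SHIFT   20
	#define MLOCK_MAPCOUNT_MASK ((1U << MLOCK_COUNT_SHIFT) - 1U)

	/* After adding a mapping, all-zero low 20 bits mean the increment
	 * just carried into the mlock_count bits. */
	static int demo_overflowed(int raw_mapcount)
	{
		/* _mapcount is stored biased by -1, so add 1 when parsing */
		return ((raw_mapcount + 1) & MLOCK_MAPCOUNT_MASK) == 0;
	}

	int main(void)
	{
		assert(!demo_overflowed((1 << 20) - 2)); /* 2^20 - 1 mappings */
		assert(demo_overflowed((1 << 20) - 1));  /* 2^20 mappings: WARN */
		return 0;
	}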
For file-backed folios, we do not hold the folio lock while adding a new
mapping, so there is a chance that two mappings are added in quick
succession such that the warning doesn't fire. Don't sweat it.

Signed-off-by: Yosry Ahmed
---
 include/linux/mm.h |  4 ++++
 mm/mlock.c         | 13 +++++++++++++
 mm/rmap.c          |  2 ++
 3 files changed, 19 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index b341477a83e8..917f81996e22 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1050,6 +1050,7 @@ unsigned long vmalloc_to_pfn(const void *addr);
 extern bool is_vmalloc_addr(const void *x);
 extern int is_vmalloc_or_module_addr(const void *x);
 extern int folio_mlocked_mapcount(struct folio *folio);
+extern void folio_mlock_map_check(struct folio *folio);
 extern void folio_mlock_unmap_check(struct folio *folio);
 #else
 static inline bool is_vmalloc_addr(const void *x)
@@ -1064,6 +1065,9 @@ static inline int folio_mlocked_mapcount(struct folio *folio)
 {
 	return 0;
 }
+static inline void folio_mlock_map_check(struct folio *folio)
+{
+}
 static inline void folio_mlock_unmap_check(struct folio *folio)
 {
 }
diff --git a/mm/mlock.c b/mm/mlock.c
index 8261df11d6a6..f8b3fb1b2986 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -140,6 +140,19 @@ static int folio_mlock_count_dec(struct folio *folio)
 	return mlock_count - 1;
 }
 
+/*
+ * Call after incrementing the mapcount. WARN_ON() if the mapcount overflows
+ * beyond the lower 20 bits for order-0 mlocked folios.
+ */
+void folio_mlock_map_check(struct folio *folio)
+{
+	int mapcount = atomic_read(&folio->_mapcount) + 1;
+
+	/* WARN if we overflow beyond the lower 20 bits */
+	if (unlikely(!folio_test_large(folio) && folio_test_mlocked(folio)))
+		WARN_ON((mapcount & MLOCK_MAPCOUNT_MASK) == 0);
+}
+
 /*
  * Call after decrementing the mapcount. If the mapcount previously overflowed
  * beyond the lower 20 bits for an order-0 mlocked folio, munlock() may have
diff --git a/mm/rmap.c b/mm/rmap.c
index 02e558551f15..092529319782 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1228,6 +1228,7 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma,
 			nr = atomic_inc_return_relaxed(mapped);
 			nr = (nr < COMPOUND_MAPPED);
 		}
+		folio_mlock_map_check(folio);
 	} else if (folio_test_pmd_mappable(folio)) {
 		/* That test is redundant: it's for safety or to optimize out */
 
@@ -1330,6 +1331,7 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
 			nr = atomic_inc_return_relaxed(mapped);
 			nr = (nr < COMPOUND_MAPPED);
 		}
+		folio_mlock_map_check(folio);
 	} else if (folio_test_pmd_mappable(folio)) {
 		/* That test is redundant: it's for safety or to optimize out */

From patchwork Sun Jun 18 06:58:16 2023
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13283773
Date: Sun, 18 Jun 2023 06:58:16 +0000
Message-ID: <20230618065816.1365301-1-yosryahmed@google.com>
Subject: [RFC PATCH 4/5] mm/vmscan: revive the unevictable LRU
From: Yosry Ahmed
To: Andrew Morton
Cc: Yu Zhao, "Jan Alexander Steffens (heftig)", Steven Barrett, Brian Geffon,
 "T.J. Alumbaugh", Gaosheng Cui, Suren Baghdasaryan,
 "Matthew Wilcox (Oracle)", "Liam R. Howlett", David Hildenbrand,
 Jason Gunthorpe, Mathieu Desnoyers, David Howells, Hugh Dickins,
 Greg Thelen, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Yosry Ahmed

Now that mlock_count no longer overlays page->lru, revive the
unevictable LRU: there is no need to special-case it when adding or
removing a folio to/from the LRUs. This also enables future work that
will use the LRUs to find all user folios charged to a memcg; having the
unevictable LRU makes sure we are not missing a significant chunk of
those.
Signed-off-by: Yosry Ahmed
---
 include/linux/mm_inline.h | 11 +++--------
 mm/huge_memory.c          |  3 +--
 mm/mmzone.c               |  8 --------
 3 files changed, 4 insertions(+), 18 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 0e1d239a882c..203b8db6b4a2 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -319,8 +319,7 @@ void lruvec_add_folio(struct lruvec *lruvec, struct folio *folio)
 
 	update_lru_size(lruvec, lru, folio_zonenum(folio),
 			folio_nr_pages(folio));
-	if (lru != LRU_UNEVICTABLE)
-		list_add(&folio->lru, &lruvec->lists[lru]);
+	list_add(&folio->lru, &lruvec->lists[lru]);
 }
 
 static __always_inline void add_page_to_lru_list(struct page *page,
@@ -339,21 +338,17 @@ void lruvec_add_folio_tail(struct lruvec *lruvec, struct folio *folio)
 
 	update_lru_size(lruvec, lru, folio_zonenum(folio),
 			folio_nr_pages(folio));
-	/* This is not expected to be used on LRU_UNEVICTABLE */
 	list_add_tail(&folio->lru, &lruvec->lists[lru]);
 }
 
 static __always_inline
 void lruvec_del_folio(struct lruvec *lruvec, struct folio *folio)
 {
-	enum lru_list lru = folio_lru_list(folio);
-
 	if (lru_gen_del_folio(lruvec, folio, false))
 		return;
 
-	if (lru != LRU_UNEVICTABLE)
-		list_del(&folio->lru);
-	update_lru_size(lruvec, lru, folio_zonenum(folio),
+	list_del(&folio->lru);
+	update_lru_size(lruvec, folio_lru_list(folio), folio_zonenum(folio),
 			-folio_nr_pages(folio));
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0e5b58ca603f..4aa2f4ad8da7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2407,8 +2407,7 @@ static void lru_add_page_tail(struct page *head, struct page *tail,
 	} else {
 		/* head is still on lru (and we have it frozen) */
 		VM_WARN_ON(!PageLRU(head));
-		if (!PageUnevictable(tail))
-			list_add_tail(&tail->lru, &head->lru);
+		list_add_tail(&tail->lru, &head->lru);
 		SetPageLRU(tail);
 	}
 }
diff --git a/mm/mmzone.c b/mm/mmzone.c
index 68e1511be12d..7678177bd639 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -81,14 +81,6 @@ void lruvec_init(struct lruvec *lruvec)
 	for_each_lru(lru)
 		INIT_LIST_HEAD(&lruvec->lists[lru]);
 
-	/*
-	 * The "Unevictable LRU" is imaginary: though its size is maintained,
-	 * it is never scanned, and unevictable pages are not threaded on it
-	 * (so that their lru fields can be reused to hold mlock_count).
-	 * Poison its list head, so that any operations on it would crash.
-	 */
-	list_del(&lruvec->lists[LRU_UNEVICTABLE]);
-
 	lru_gen_init_lruvec(lruvec);
 }

From patchwork Sun Jun 18 06:58:24 2023
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13283774
Date: Sun, 18 Jun 2023 06:58:24 +0000
Message-ID: <20230618065824.1365750-1-yosryahmed@google.com>
Subject: [RFC PATCH 5/5] Revert "mm/migrate: __unmap_and_move() push good newpage to LRU"
From: Yosry Ahmed
To: Andrew Morton
Cc: Yu Zhao, "Jan Alexander Steffens (heftig)", Steven Barrett, Brian Geffon,
 "T.J. Alumbaugh", Gaosheng Cui, Suren Baghdasaryan,
 "Matthew Wilcox (Oracle)", "Liam R. Howlett", David Hildenbrand,
 Jason Gunthorpe, Mathieu Desnoyers, David Howells, Hugh Dickins,
 Greg Thelen, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Yosry Ahmed
Howlett" , David Hildenbrand , Jason Gunthorpe , Mathieu Desnoyers , David Howells , Hugh Dickins , Greg Thelen , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Yosry Ahmed X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: 94D7C14000A X-Stat-Signature: 6a5a6k43hbuac6cb6tmegkjtm7bhj8c5 X-Rspam-User: X-HE-Tag: 1687071508-923536 X-HE-Meta: U2FsdGVkX1+wY5zclu01cmLZwctQtElGT3rTT525thZjDx2gcYAOzC1wQ/Sp6IOBrSGIGOyR5wXhg6+R8AVBRIitFvPme+ttr4vsQ9GzHRiIQRjlx1bya+ZhrIJCF9G5Q89TtvhTLQYcQWSKkhbKT8EWENgmmZ3i7C0If+9B8gCwrQbyIuMBWAgCreRQb7HIAZE2iRSAbd340w4oV0Yt6m4vgb0FsK6LDcRa3kKFvWZBWGNPlHzcPi2h4qrbq7YkyeeqPVE8gP2WK8Mjx3iYox0g3ZRQ6rx7eUYEwS2oMo2KXyfhEceT/VFA5r4yFUZkT/egzc0Tyc+SZ4xcmVskkYe5oR5JPAiHobKVElfnDdXTBXb4QA/fW8AtV+5nkXIjCXnx860oZubiUiN2uRa4bWH0ZbmUoqWOVsAKWbHH+CFyFy8AcfFeJGgShr45xzNhElU8IkfQBLXKhKqKBJGtXFH5P+HiBvT3nxX0kGfZiY00K0C1MADSnxMt2lnFA1jcTcS64ULt292XPn+Q/h+P/IvmSE/u/WygoEFPVepYxB+jCk8r6/wc6HNKCanL4HiGpgK5TDDxbvse1Z/PJ17l5C2l6xj38vWKIMBLBKc3TIXpQI3I13mBL09myQNLlxOK/U/WPjW5yQ4oENUA0/eSSxprzSyAWnrZ8wp6o2zeUyYVlIbxUG6eXbxJdyMQtUuinhOWFt2asfBeZA8OlvefVZgw//gCs2T35U9dDMfmVon8s6sDDnT0CcQj+RjkQyMVLAPjU+W+HPReM+9fbd3EGjjDqzlMAtUR9gO4Q4+mlv7cNi4CfN1BSdUZiuxfESoSDttlkMouOf3GJsEMjsy72yOHIKfUuVXPHWrSWm9zm/NKHOSYkU0QdDGOmjfkw54HItx+WRoK0Mzu8HAgbFImhMf5HgySsC2+VC437jTGfyfVA7KK3iL0YEDrg1FmER5wc6cafDGRortDqJf476g sBGN+CCv 8PilpMqiBG49HEv421D9ALIvI4Fs1XnNAEtQrFnXR/RdYttBmnXJx5ykjQkGE9NTgByQFCC7W0KBf4P5pAp2Vo/BhH/97TtsQMSE+m7Nxt5xGIQlgLpbznOsDlRGb6jblYJ2mJDrquIKSnRgLqQZY3npistlmPfFQm19nYi3JwpKb6218TRz+zmBJovu2/cyFkF4mkIK/yWN2vs5FiEz87tqTY7t0fam9eXDVO/mJYlWr80kkMKC0ZTjwRPPYtaASKNnAEfWWkeFvhMOPFIzFKqwWkCtLgEuNJfqbFTVeYkyXeAhPaaQiargRmrrlZkIvI1yWBJ+K58ncGHDvbMHtfF25uqPKCJ93bxe2KLrtxp8aSeLCdUopxALuMy+m4UgQbAuARJR1AF2IVuwv276AsATnzxR7LPOAcADAMR6bkrpWwnpH8XmoU1KeSdC0g2LDxzAE9PYXW+BTpFnoyQYEc91mZyG8RdUuAwjgWhZEj9HO8Ncc7N1Um5q+LiuiM8nGZu0ZaF1X23ZZ/p8AsObCjVZFgr7fvfA6hy2yeJ1mLxhMYZyVTArchiGk/ot3ghASLIcIO3rwRCqFQmSI1qPYqXrdJARRSVyhiKqJuYNWJHGEPMEvWNOSZ1bU1Mf9CrrcRLfhNuCii+h5ysLR6y24XcEINd6QBs/I5xm0asckYvVErzQ+Y86RWUnWFw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This reverts commit c3096e6782b733158bf34f6bbb4567808d4e0740. That commit made sure we immediately add the new page to the LRU before remove_migration_ptes() is called in migrate_move_folio() (used to be __unmap_and_move() back then), such that the rmap walk will rebuild the correct mlock_count for the page again. This was needed because the mlock_count was lost when the page is isolated. This is no longer the case since mlock_count no longer overlays page->lru. Revert the commit (the code was foliated afterward the commit, so the revert is updated as such). Signed-off-by: Yosry Ahmed Reviewed-by: "Huang, Ying" --- mm/migrate.c | 24 +++++++++--------------- 1 file changed, 9 insertions(+), 15 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 01cac26a3127..68f693731865 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1279,19 +1279,6 @@ static int migrate_folio_move(free_page_t put_new_page, unsigned long private, if (unlikely(!is_lru)) goto out_unlock_both; - /* - * When successful, push dst to LRU immediately: so that if it - * turns out to be an mlocked page, remove_migration_ptes() will - * automatically build up the correct dst->mlock_count for it. 
- * - * We would like to do something similar for the old page, when - * unsuccessful, and other cases when a page has been temporarily - * isolated from the unevictable LRU: but this case is the easiest. - */ - folio_add_lru(dst); - if (page_was_mapped) - lru_add_drain(); - if (page_was_mapped) remove_migration_ptes(src, dst, false); @@ -1301,9 +1288,16 @@ static int migrate_folio_move(free_page_t put_new_page, unsigned long private, /* * If migration is successful, decrease refcount of dst, * which will not free the page because new page owner increased - * refcounter. + * refcounter. As well, if it is LRU folio, add the folio to LRU + * list in here. Use the old state of the isolated source folio to + * determine if we migrated a LRU folio. dst was already unlocked + * and possibly modified by its owner - don't rely on the folio + * state. */ - folio_put(dst); + if (unlikely(!is_lru)) + folio_put(dst); + else + folio_putback_lru(dst); /* * A folio that has been migrated has all references removed