From patchwork Fri Dec 20 01:09:31 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chen Ridong <chenridong@huaweicloud.com>
X-Patchwork-Id: 13915965
From: Chen Ridong <chenridong@huaweicloud.com>
To: akpm@linux-foundation.org, mhocko@suse.com, hannes@cmpxchg.org,
	yosryahmed@google.com, yuzhao@google.com, david@redhat.com,
	willy@infradead.org, ryan.roberts@arm.com, baohua@kernel.org,
	21cnbao@gmail.com, wangkefeng.wang@huawei.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	chenridong@huawei.com, wangweiyang2@huawei.com, xieym_ict@hotmail.com
Subject: [PATCH -next v5] mm: vmscan: retry folios written back while isolated for traditional LRU
Date: Fri, 20 Dec 2024 01:09:31 +0000
Message-Id: <20241220010931.3603111-1-chenridong@huaweicloud.com>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0

From: Chen Ridong <chenridong@huawei.com>

The page reclaim isolates a batch of folios from the tail of one of the
LRU lists and works on those folios one by one. For a suitable
swap-backed folio, if the swap device is async, it queues that folio for
writeback. After the page reclaim finishes an entire batch, it puts back
the folios it queued for writeback to the head of the original LRU list.

In the meantime, the page writeback flushes the queued folios also by
batches. Its batching logic is independent from that of the page
reclaim. For each of the folios it writes back, the page writeback calls
folio_rotate_reclaimable(), which tries to rotate a folio to the tail.

folio_rotate_reclaimable() only works for a folio after the page reclaim
has put it back. If an async swap device is fast enough, the page
writeback can finish with that folio while the page reclaim is still
working on the rest of the batch containing it. In this case, that folio
will remain at the head and the page reclaim will not retry it before
reaching there.

Commit 359a5e1416ca ("mm: multi-gen LRU: retry folios written back while
isolated") fixed this issue only for MGLRU; the traditional
active/inactive LRU has the same problem. The problem is worse when a
THP is split, since the split makes the list longer and a batch of
folios takes longer to reclaim.

Fix it in the same way for the traditional LRU: first extract the common
logic into a new helper, find_folios_written_back(), then reuse it in
shrink_inactive_list(), and finally retry reclaiming the folios that may
have missed the rotation.
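To make the race and the fix concrete, here is a minimal, self-contained
userspace C sketch of the collect-and-retry pattern. It is an
illustration only: struct folio, the clean flag and find_written_back()
below are toy stand-ins for the kernel's folio state checks and the new
find_folios_written_back() helper, not real kernel APIs.

/* Toy model of the race fix: folios whose writeback finished while they
 * were isolated are collected on a separate list, spliced back, and the
 * reclaim pass is retried exactly once. */
#include <stdbool.h>
#include <stdio.h>

struct folio {
	int id;
	bool clean;			/* writeback finished while isolated */
	struct folio *next;
};

/* Unlink folios whose writeback already finished and return them. */
static struct folio *find_written_back(struct folio **list, bool is_retried)
{
	struct folio *clean_list = NULL, **pp = list;

	while (*pp) {
		struct folio *f = *pp;

		if (!is_retried && f->clean) {
			*pp = f->next;		/* unlink from the batch */
			f->next = clean_list;	/* push onto the clean list */
			clean_list = f;
			continue;
		}
		pp = &f->next;
	}
	return clean_list;
}

int main(void)
{
	struct folio folios[4] = {
		{ 0, true,  NULL },	/* writeback beat the batch */
		{ 1, false, NULL },
		{ 2, true,  NULL },	/* writeback beat the batch */
		{ 3, false, NULL },
	};
	struct folio *batch = NULL, *clean, *tail;
	bool is_retried = false;
	int i;

	for (i = 3; i >= 0; i--) {	/* isolate a batch of folios */
		folios[i].next = batch;
		batch = &folios[i];
	}

retry:
	/* a real shrink_folio_list() pass over "batch" would run here */
	clean = find_written_back(&batch, is_retried);

	if (clean) {			/* splice back and retry once */
		for (tail = clean; tail->next; tail = tail->next)
			;
		tail->next = batch;
		batch = clean;
		is_retried = true;
		goto retry;
	}

	for (struct folio *f = batch; f; f = f->next)
		printf("folio %d reclaimed or put back (clean=%d)\n",
		       f->id, f->clean);
	return 0;
}

On the second pass is_retried is true, so nothing is collected again;
this mirrors how the patch bounds the extra work to one more
shrink_folio_list() call.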
Link: https://lore.kernel.org/linux-kernel/20241010081802.290893-1-chenridong@huaweicloud.com/
Link: https://lore.kernel.org/linux-kernel/CAGsJ_4zqL8ZHNRZ44o_CC69kE7DBVXvbZfvmQxMGiFqRxqHQdA@mail.gmail.com/
Signed-off-by: Chen Ridong
Reviewed-by: Barry Song
---
 mm/vmscan.c | 108 ++++++++++++++++++++++++++++++++++------------------
 1 file changed, 70 insertions(+), 38 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 39886f435ec5..e67e446540ba 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -283,6 +283,39 @@ static void set_task_reclaim_state(struct task_struct *task,
 	task->reclaim_state = rs;
 }
 
+/**
+ * find_folios_written_back - Find and move the written back folios to a new list.
+ * @list: folios list
+ * @clean: the written back folios list
+ * @is_retried: whether the list has already been retried.
+ */
+static inline void find_folios_written_back(struct list_head *list,
+		struct list_head *clean, bool is_retried)
+{
+	struct folio *folio;
+	struct folio *next;
+
+	list_for_each_entry_safe_reverse(folio, next, list, lru) {
+		if (!folio_evictable(folio)) {
+			list_del(&folio->lru);
+			folio_putback_lru(folio);
+			continue;
+		}
+
+		/* retry folios that may have missed folio_rotate_reclaimable() */
+		if (!is_retried && !folio_test_active(folio) && !folio_mapped(folio) &&
+		    !folio_test_dirty(folio) && !folio_test_writeback(folio)) {
+			list_move(&folio->lru, clean);
+			continue;
+		}
+
+		/* don't add rejected folios to the oldest generation */
+		if (lru_gen_enabled() && !lru_gen_distance(folio, false))
+			set_mask_bits(&folio->flags, LRU_REFS_FLAGS, BIT(PG_active));
+	}
+
+}
+
 /*
  * flush_reclaim_state(): add pages reclaimed outside of LRU-based reclaim to
  * scan_control->nr_reclaimed.
@@ -1959,14 +1992,18 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
 		enum lru_list lru)
 {
 	LIST_HEAD(folio_list);
+	LIST_HEAD(clean_list);
 	unsigned long nr_scanned;
-	unsigned int nr_reclaimed = 0;
+	unsigned int nr_reclaimed, total_reclaimed = 0;
+	unsigned int nr_pageout = 0;
+	unsigned int nr_unqueued_dirty = 0;
 	unsigned long nr_taken;
 	struct reclaim_stat stat;
 	bool file = is_file_lru(lru);
 	enum vm_event_item item;
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
 	bool stalled = false;
+	bool is_retried = false;
 
 	while (unlikely(too_many_isolated(pgdat, file, sc))) {
 		if (stalled)
@@ -2000,22 +2037,47 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
 	if (nr_taken == 0)
 		return 0;
 
+retry:
 	nr_reclaimed = shrink_folio_list(&folio_list, pgdat, sc, &stat, false);
 
+	sc->nr.dirty += stat.nr_dirty;
+	sc->nr.congested += stat.nr_congested;
+	sc->nr.unqueued_dirty += stat.nr_unqueued_dirty;
+	sc->nr.writeback += stat.nr_writeback;
+	sc->nr.immediate += stat.nr_immediate;
+	total_reclaimed += nr_reclaimed;
+	nr_pageout += stat.nr_pageout;
+	nr_unqueued_dirty += stat.nr_unqueued_dirty;
+
+	trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
+			nr_scanned, nr_reclaimed, &stat, sc->priority, file);
+
+	find_folios_written_back(&folio_list, &clean_list, is_retried);
+
 	spin_lock_irq(&lruvec->lru_lock);
 	move_folios_to_lru(lruvec, &folio_list);
 
 	__mod_lruvec_state(lruvec, PGDEMOTE_KSWAPD + reclaimer_offset(),
 					stat.nr_demoted);
-	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	item = PGSTEAL_KSWAPD + reclaimer_offset();
 	if (!cgroup_reclaim(sc))
 		__count_vm_events(item, nr_reclaimed);
 	__count_memcg_events(lruvec_memcg(lruvec), item, nr_reclaimed);
 	__count_vm_events(PGSTEAL_ANON + file, nr_reclaimed);
+
+	if (!list_empty(&clean_list)) {
+		list_splice_init(&clean_list, &folio_list);
+		is_retried = true;
+		spin_unlock_irq(&lruvec->lru_lock);
+		goto retry;
+	}
+	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	spin_unlock_irq(&lruvec->lru_lock);
+	sc->nr.taken += nr_taken;
+	if (file)
+		sc->nr.file_taken += nr_taken;
 
-	lru_note_cost(lruvec, file, stat.nr_pageout, nr_scanned - nr_reclaimed);
+	lru_note_cost(lruvec, file, nr_pageout, nr_scanned - total_reclaimed);
 
 	/*
 	 * If dirty folios are scanned that are not queued for IO, it
@@ -2028,7 +2090,7 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
 	 * the flushers simply cannot keep up with the allocation
 	 * rate. Nudge the flusher threads in case they are asleep.
 	 */
-	if (stat.nr_unqueued_dirty == nr_taken) {
+	if (nr_unqueued_dirty == nr_taken) {
 		wakeup_flusher_threads(WB_REASON_VMSCAN);
 		/*
 		 * For cgroupv1 dirty throttling is achieved by waking up
@@ -2043,18 +2105,7 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
 			reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
 	}
 
-	sc->nr.dirty += stat.nr_dirty;
-	sc->nr.congested += stat.nr_congested;
-	sc->nr.unqueued_dirty += stat.nr_unqueued_dirty;
-	sc->nr.writeback += stat.nr_writeback;
-	sc->nr.immediate += stat.nr_immediate;
-	sc->nr.taken += nr_taken;
-	if (file)
-		sc->nr.file_taken += nr_taken;
-
-	trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
-			nr_scanned, nr_reclaimed, &stat, sc->priority, file);
-	return nr_reclaimed;
+	return total_reclaimed;
 }
 
 /*
@@ -4585,12 +4636,10 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
 	int reclaimed;
 	LIST_HEAD(list);
 	LIST_HEAD(clean);
-	struct folio *folio;
-	struct folio *next;
 	enum vm_event_item item;
 	struct reclaim_stat stat;
 	struct lru_gen_mm_walk *walk;
-	bool skip_retry = false;
+	bool is_retried = false;
 	struct lru_gen_folio *lrugen = &lruvec->lrugen;
 	struct mem_cgroup *memcg = lruvec_memcg(lruvec);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
@@ -4616,24 +4665,7 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
 			scanned, reclaimed, &stat, sc->priority,
 			type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
 
-	list_for_each_entry_safe_reverse(folio, next, &list, lru) {
-		if (!folio_evictable(folio)) {
-			list_del(&folio->lru);
-			folio_putback_lru(folio);
-			continue;
-		}
-
-		/* retry folios that may have missed folio_rotate_reclaimable() */
-		if (!skip_retry && !folio_test_active(folio) && !folio_mapped(folio) &&
-		    !folio_test_dirty(folio) && !folio_test_writeback(folio)) {
-			list_move(&folio->lru, &clean);
-			continue;
-		}
-
-		/* don't add rejected folios to the oldest generation */
-		if (!lru_gen_distance(folio, false))
-			set_mask_bits(&folio->flags, LRU_REFS_FLAGS, BIT(PG_active));
-	}
+	find_folios_written_back(&list, &clean, is_retried);
 
 	spin_lock_irq(&lruvec->lru_lock);
 
@@ -4656,7 +4688,7 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
 	list_splice_init(&clean, &list);
 
 	if (!list_empty(&list)) {
-		skip_retry = true;
+		is_retried = true;
 		goto retry;
 	}
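A note on the shrink_inactive_list() rework above: shrink_folio_list()
rewrites 'stat' on every pass, so the per-pass numbers (nr_reclaimed,
stat.nr_pageout, stat.nr_unqueued_dirty) are accumulated into
total_reclaimed, nr_pageout and nr_unqueued_dirty before a possible
retry, and the sc->nr.* updates plus the tracepoint are issued before
the retry check so that each pass is accounted exactly once. The retry
is bounded to a single extra pass: is_retried is set before jumping back
to the retry label, and find_folios_written_back() collects no further
candidates once it is true.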