From patchwork Wed Sep 29 10:09:10 2021
X-Patchwork-Submitter: Mel Gorman <mgorman@techsingularity.net>
X-Patchwork-Id: 12525199
From: Mel Gorman <mgorman@techsingularity.net>
To: Linux-MM <linux-mm@kvack.org>
Cc: NeilBrown, Theodore Ts'o, Andreas Dilger, "Darrick J. Wong",
    Matthew Wilcox, Michal Hocko, Dave Chinner, Rik van Riel,
    Vlastimil Babka, Johannes Weiner, Jonathan Corbet, Linux-fsdevel,
    LKML, Mel Gorman
Subject: [PATCH 1/5] mm/vmscan: Throttle reclaim until some writeback completes if congested
Date: Wed, 29 Sep 2021 11:09:10 +0100
Message-Id: <20210929100914.14704-2-mgorman@techsingularity.net>
In-Reply-To: <20210929100914.14704-1-mgorman@techsingularity.net>
References: <20210929100914.14704-1-mgorman@techsingularity.net>

Page reclaim throttles on wait_iff_congested under the following
conditions:

o kswapd is encountering pages under writeback and marked for immediate
  reclaim, implying that pages are cycling through the LRU faster than
  pages can be cleaned.

o Direct reclaim will stall if all dirty pages are backed by congested
  inodes.

wait_iff_congested is almost completely broken, with few exceptions. This
patch adds a new node-based waitqueue and tracks the number of throttled
tasks and pages written back since throttling started. If enough pages
belonging to the node are written back then the throttled tasks will wake
early. If not, the throttled tasks sleep until the timeout expires.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/backing-dev.h      |  1 -
 include/linux/mmzone.h           |  9 +++++
 include/trace/events/vmscan.h    | 34 +++++++++++++++++++
 include/trace/events/writeback.h |  7 ----
 mm/backing-dev.c                 | 48 --------------------------
 mm/filemap.c                     |  1 +
 mm/internal.h                    |  9 +++++
 mm/page_alloc.c                  |  1 +
 mm/vmscan.c                      | 58 +++++++++++++++++++++++++++-----
 mm/vmstat.c                      |  1 +
 10 files changed, 105 insertions(+), 64 deletions(-)

diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index ac7f231b8825..9fb1f0ae273c 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -154,7 +154,6 @@ static inline int wb_congested(struct bdi_writeback *wb, int cong_bits)
 }
 
 long congestion_wait(int sync, long timeout);
-long wait_iff_congested(int sync, long timeout);
 
 static inline bool mapping_can_writeback(struct address_space *mapping)
 {
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 6a1d79d84675..ef0a63ebd21d 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -199,6 +199,7 @@ enum node_stat_item {
         NR_VMSCAN_IMMEDIATE,    /* Prioritise for reclaim when writeback ends */
         NR_DIRTIED,             /* page dirtyings since bootup */
         NR_WRITTEN,             /* page writings since bootup */
+        NR_THROTTLED_WRITTEN,   /* NR_WRITTEN while reclaim throttled */
         NR_KERNEL_MISC_RECLAIMABLE,     /* reclaimable non-slab kernel pages */
         NR_FOLL_PIN_ACQUIRED,   /* via: pin_user_page(), gup flag: FOLL_PIN */
         NR_FOLL_PIN_RELEASED,   /* pages returned via unpin_user_page() */
@@ -272,6 +273,10 @@ enum lru_list {
         NR_LRU_LISTS
 };
 
+enum vmscan_throttle_state {
+        VMSCAN_THROTTLE_WRITEBACK,
+};
+
 #define for_each_lru(lru) for (lru = 0; lru < NR_LRU_LISTS; lru++)
 
 #define for_each_evictable_lru(lru) for (lru = 0; lru <= LRU_ACTIVE_FILE; lru++)
@@ -841,6 +846,10 @@ typedef struct pglist_data {
         int node_id;
         wait_queue_head_t kswapd_wait;
         wait_queue_head_t pfmemalloc_wait;
+        wait_queue_head_t reclaim_wait; /* wq for throttling reclaim */
+        atomic_t nr_reclaim_throttled;  /* nr of throttled tasks */
+        unsigned long nr_reclaim_start; /* nr pages written while throttled
+                                         * when throttling started. */
         struct task_struct *kswapd;     /* Protected by mem_hotplug_begin/end() */
         int kswapd_order;
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 88faf2400ec2..c317f9fe0d17 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -27,6 +27,14 @@
                 {RECLAIM_WB_ASYNC,      "RECLAIM_WB_ASYNC"}     \
                 ) : "RECLAIM_WB_NONE"
 
+#define _VMSCAN_THROTTLE_WRITEBACK      (1 << VMSCAN_THROTTLE_WRITEBACK)
+
+#define show_throttle_flags(flags)                              \
+        (flags) ? __print_flags(flags, "|",                     \
+                {_VMSCAN_THROTTLE_WRITEBACK,    "VMSCAN_THROTTLE_WRITEBACK"} \
+                ) : "VMSCAN_THROTTLE_NONE"
+
+
 #define trace_reclaim_flags(file) ( \
         (file ? RECLAIM_WB_FILE : RECLAIM_WB_ANON) | \
         (RECLAIM_WB_ASYNC) \
@@ -454,6 +462,32 @@ DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_node_reclaim_end,
         TP_ARGS(nr_reclaimed)
 );
 
+TRACE_EVENT(mm_vmscan_throttled,
+
+        TP_PROTO(int nid, int usec_timeout, int usec_delayed, int reason),
+
+        TP_ARGS(nid, usec_timeout, usec_delayed, reason),
+
+        TP_STRUCT__entry(
+                __field(int, nid)
+                __field(int, usec_timeout)
+                __field(int, usec_delayed)
+                __field(int, reason)
+        ),
+
+        TP_fast_assign(
+                __entry->nid = nid;
+                __entry->usec_timeout = usec_timeout;
+                __entry->usec_delayed = usec_delayed;
+                __entry->reason = 1U << reason;
+        ),
+
+        TP_printk("nid=%d usec_timeout=%d usec_delayed=%d reason=%s",
+                __entry->nid,
+                __entry->usec_timeout,
+                __entry->usec_delayed,
+                show_throttle_flags(__entry->reason))
+);
 #endif /* _TRACE_VMSCAN_H */
 
 /* This part must be outside protection */
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index 840d1ba84cf5..3bc759b81897 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -763,13 +763,6 @@ DEFINE_EVENT(writeback_congest_waited_template, writeback_congestion_wait,
         TP_ARGS(usec_timeout, usec_delayed)
 );
 
-DEFINE_EVENT(writeback_congest_waited_template, writeback_wait_iff_congested,
-
-        TP_PROTO(unsigned int usec_timeout, unsigned int usec_delayed),
-
-        TP_ARGS(usec_timeout, usec_delayed)
-);
-
 DECLARE_EVENT_CLASS(writeback_single_inode_template,
 
         TP_PROTO(struct inode *inode,
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 4a9d4e27d0d9..0ea1a105eae5 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -1041,51 +1041,3 @@ long congestion_wait(int sync, long timeout)
         return ret;
 }
 EXPORT_SYMBOL(congestion_wait);
-
-/**
- * wait_iff_congested - Conditionally wait for a backing_dev to become uncongested or a pgdat to complete writes
- * @sync: SYNC or ASYNC IO
- * @timeout: timeout in jiffies
- *
- * In the event of a congested backing_dev (any backing_dev) this waits
- * for up to @timeout jiffies for either a BDI to exit congestion of the
- * given @sync queue or a write to complete.
- *
- * The return value is 0 if the sleep is for the full timeout. Otherwise,
- * it is the number of jiffies that were still remaining when the function
- * returned. return_value == timeout implies the function did not sleep.
- */
-long wait_iff_congested(int sync, long timeout)
-{
-        long ret;
-        unsigned long start = jiffies;
-        DEFINE_WAIT(wait);
-        wait_queue_head_t *wqh = &congestion_wqh[sync];
-
-        /*
-         * If there is no congestion, yield if necessary instead
-         * of sleeping on the congestion queue
-         */
-        if (atomic_read(&nr_wb_congested[sync]) == 0) {
-                cond_resched();
-
-                /* In case we scheduled, work out time remaining */
-                ret = timeout - (jiffies - start);
-                if (ret < 0)
-                        ret = 0;
-
-                goto out;
-        }
-
-        /* Sleep until uncongested or a write happens */
-        prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
-        ret = io_schedule_timeout(timeout);
-        finish_wait(wqh, &wait);
-
-out:
-        trace_writeback_wait_iff_congested(jiffies_to_usecs(timeout),
-                                        jiffies_to_usecs(jiffies - start));
-
-        return ret;
-}
-EXPORT_SYMBOL(wait_iff_congested);
diff --git a/mm/filemap.c b/mm/filemap.c
index dae481293b5d..59187787fbfc 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1605,6 +1605,7 @@ void end_page_writeback(struct page *page)
 
         smp_mb__after_atomic();
         wake_up_page(page, PG_writeback);
+        acct_reclaim_writeback(page);
         put_page(page);
 }
 EXPORT_SYMBOL(end_page_writeback);
diff --git a/mm/internal.h b/mm/internal.h
index cf3cb933eba3..e25b3686bfab 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -34,6 +34,15 @@
 
 void page_writeback_init(void);
 
+void __acct_reclaim_writeback(pg_data_t *pgdat, struct page *page);
+static inline void acct_reclaim_writeback(struct page *page)
+{
+        pg_data_t *pgdat = page_pgdat(page);
+
+        if (atomic_read(&pgdat->nr_reclaim_throttled))
+                __acct_reclaim_writeback(pgdat, page);
+}
+
 vm_fault_t do_swap_page(struct vm_fault *vmf);
 
 void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b37435c274cf..d849ddfc1e51 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7396,6 +7396,7 @@ static void __meminit pgdat_init_internals(struct pglist_data *pgdat)
 
         init_waitqueue_head(&pgdat->kswapd_wait);
         init_waitqueue_head(&pgdat->pfmemalloc_wait);
+        init_waitqueue_head(&pgdat->reclaim_wait);
 
         pgdat_page_ext_init(pgdat);
         lruvec_init(&pgdat->__lruvec);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 74296c2d1fed..b58ea0b13286 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1006,6 +1006,47 @@ static void handle_write_error(struct address_space *mapping,
         unlock_page(page);
 }
 
+static void
+reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
+                                                        long timeout)
+{
+        wait_queue_head_t *wqh = &pgdat->reclaim_wait;
+        unsigned long start = jiffies;
+        long ret;
+        DEFINE_WAIT(wait);
+
+        atomic_inc(&pgdat->nr_reclaim_throttled);
+        WRITE_ONCE(pgdat->nr_reclaim_start,
+                node_page_state(pgdat, NR_THROTTLED_WRITTEN));
+
+        prepare_to_wait(wqh, &wait, TASK_INTERRUPTIBLE);
+        ret = schedule_timeout(timeout);
+        finish_wait(wqh, &wait);
+        atomic_dec(&pgdat->nr_reclaim_throttled);
+
+        trace_mm_vmscan_throttled(pgdat->node_id, jiffies_to_usecs(timeout),
+                                jiffies_to_usecs(jiffies - start),
+                                reason);
+}
+
+/*
+ * Account for pages written if tasks are throttled waiting on dirty
+ * pages to clean. If enough pages have been cleaned since throttling
+ * started then wakeup the throttled tasks.
+ */
+void __acct_reclaim_writeback(pg_data_t *pgdat, struct page *page)
+{
+        unsigned long nr_written;
+        int nr_throttled = atomic_read(&pgdat->nr_reclaim_throttled);
+
+        __inc_node_page_state(page, NR_THROTTLED_WRITTEN);
+        nr_written = node_page_state(pgdat, NR_THROTTLED_WRITTEN) -
+                READ_ONCE(pgdat->nr_reclaim_start);
+
+        if (nr_written > SWAP_CLUSTER_MAX * nr_throttled)
+                wake_up_interruptible_all(&pgdat->reclaim_wait);
+}
+
 /* possible outcome of pageout() */
 typedef enum {
         /* failed to write page out, page is locked */
@@ -1412,9 +1453,8 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 
                 /*
                  * The number of dirty pages determines if a node is marked
-                 * reclaim_congested which affects wait_iff_congested. kswapd
-                 * will stall and start writing pages if the tail of the LRU
-                 * is all dirty unqueued pages.
+                 * reclaim_congested. kswapd will stall and start writing
+                 * pages if the tail of the LRU is all dirty unqueued pages.
                  */
                 page_check_dirty_writeback(page, &dirty, &writeback);
                 if (dirty || writeback)
@@ -3180,19 +3220,20 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
                  * If kswapd scans pages marked for immediate
                  * reclaim and under writeback (nr_immediate), it
                  * implies that pages are cycling through the LRU
-                 * faster than they are written so also forcibly stall.
+                 * faster than they are written so forcibly stall
+                 * until some pages complete writeback.
                  */
                 if (sc->nr.immediate)
-                        congestion_wait(BLK_RW_ASYNC, HZ/10);
+                        reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK, HZ/10);
         }
 
         /*
          * Tag a node/memcg as congested if all the dirty pages
          * scanned were backed by a congested BDI and
-         * wait_iff_congested will stall.
+         * non-kswapd tasks will stall on reclaim_throttle.
          *
          * Legacy memcg will stall in page writeback so avoid forcibly
-         * stalling in wait_iff_congested().
+         * stalling in reclaim_throttle().
          */
         if ((current_is_kswapd() ||
              (cgroup_reclaim(sc) && writeback_throttling_sane(sc))) &&
@@ -3208,7 +3249,7 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
         if (!current_is_kswapd() && current_may_throttle() &&
             !sc->hibernation_mode &&
             test_bit(LRUVEC_CONGESTED, &target_lruvec->flags))
-                wait_iff_congested(BLK_RW_ASYNC, HZ/10);
+                reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK, HZ/10);
 
         if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
                                     sc))
@@ -4286,6 +4327,7 @@ static int kswapd(void *p)
 
         WRITE_ONCE(pgdat->kswapd_order, 0);
         WRITE_ONCE(pgdat->kswapd_highest_zoneidx, MAX_NR_ZONES);
+        atomic_set(&pgdat->nr_reclaim_throttled, 0);
         for ( ; ; ) {
                 bool ret;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 8ce2620344b2..9b2bc9d61d4b 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1225,6 +1225,7 @@ const char * const vmstat_text[] = {
         "nr_vmscan_immediate_reclaim",
         "nr_dirtied",
         "nr_written",
+        "nr_throttled_written",
         "nr_kernel_misc_reclaimable",
         "nr_foll_pin_acquired",
         "nr_foll_pin_released",
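
To make the wake-up arithmetic above concrete: end_page_writeback()
accounts each completion to NR_THROTTLED_WRITTEN, and
__acct_reclaim_writeback() wakes all throttled tasks once more than
SWAP_CLUSTER_MAX pages per throttled task have completed since
throttling began. The following is a minimal userspace sketch of that
arithmetic, not kernel code; SWAP_CLUSTER_MAX is 32UL in the kernel,
and the counter values are invented for illustration.

#include <stdbool.h>
#include <stdio.h>

#define SWAP_CLUSTER_MAX 32UL

struct pgdat_sim {
        unsigned long nr_throttled_written; /* NR_THROTTLED_WRITTEN */
        unsigned long nr_reclaim_start;     /* snapshot when throttling began */
        int nr_reclaim_throttled;           /* number of throttled tasks */
};

/* Mirrors the wake decision in __acct_reclaim_writeback(). */
static bool should_wake(const struct pgdat_sim *p)
{
        unsigned long nr_written = p->nr_throttled_written - p->nr_reclaim_start;

        return nr_written > SWAP_CLUSTER_MAX * p->nr_reclaim_throttled;
}

int main(void)
{
        struct pgdat_sim p = { .nr_reclaim_start = 1000, .nr_reclaim_throttled = 4 };

        /* Simulate pages completing writeback one at a time. */
        for (p.nr_throttled_written = p.nr_reclaim_start;
             !should_wake(&p); p.nr_throttled_written++)
                ;

        /* With 4 sleepers: woken after 4 * 32 + 1 = 129 completions. */
        printf("woken after %lu writeback completions\n",
               p.nr_throttled_written - p.nr_reclaim_start);
        return 0;
}

With four throttled tasks the sleepers are woken after 129 completed
writebacks, typically well before the HZ/10 timeout expires on a busy
node.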
From patchwork Wed Sep 29 10:09:11 2021
X-Patchwork-Submitter: Mel Gorman <mgorman@techsingularity.net>
X-Patchwork-Id: 12525201
From: Mel Gorman <mgorman@techsingularity.net>
To: Linux-MM <linux-mm@kvack.org>
Cc: NeilBrown, Theodore Ts'o, Andreas Dilger, "Darrick J. Wong",
    Matthew Wilcox, Michal Hocko, Dave Chinner, Rik van Riel,
    Vlastimil Babka, Johannes Weiner, Jonathan Corbet, Linux-fsdevel,
    LKML, Mel Gorman
Subject: [PATCH 2/5] mm/vmscan: Throttle reclaim and compaction when too many pages are isolated
Date: Wed, 29 Sep 2021 11:09:11 +0100
Message-Id: <20210929100914.14704-3-mgorman@techsingularity.net>
In-Reply-To: <20210929100914.14704-1-mgorman@techsingularity.net>
References: <20210929100914.14704-1-mgorman@techsingularity.net>

Page reclaim throttles on congestion if too many parallel reclaim
instances have isolated too many pages. This makes no sense; excessive
parallelisation has nothing to do with writeback or congestion.

This patch creates an additional waitqueue to sleep on when too many
pages are isolated. The throttled tasks are woken when the number of
isolated pages is reduced or a timeout occurs. There may be some false
positive wakeups for GFP_NOIO/GFP_NOFS callers, but the tasks will
throttle again if necessary.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/mmzone.h        |  4 +++-
 include/trace/events/vmscan.h |  4 +++-
 mm/compaction.c               |  2 +-
 mm/internal.h                 |  2 ++
 mm/page_alloc.c               |  6 +++++-
 mm/vmscan.c                   | 22 ++++++++++++++------
 6 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index ef0a63ebd21d..ca65d6a64bdd 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -275,6 +275,8 @@ enum lru_list {
 
 enum vmscan_throttle_state {
         VMSCAN_THROTTLE_WRITEBACK,
+        VMSCAN_THROTTLE_ISOLATED,
+        NR_VMSCAN_THROTTLE,
 };
 
 #define for_each_lru(lru) for (lru = 0; lru < NR_LRU_LISTS; lru++)
@@ -846,7 +848,7 @@ typedef struct pglist_data {
         int node_id;
         wait_queue_head_t kswapd_wait;
         wait_queue_head_t pfmemalloc_wait;
-        wait_queue_head_t reclaim_wait; /* wq for throttling reclaim */
+        wait_queue_head_t reclaim_wait[NR_VMSCAN_THROTTLE];
         atomic_t nr_reclaim_throttled;  /* nr of throttled tasks */
         unsigned long nr_reclaim_start; /* nr pages written while throttled
                                          * when throttling started. */
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index c317f9fe0d17..d4905bd9e9c4 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -28,10 +28,12 @@
         ) : "RECLAIM_WB_NONE"
 
 #define _VMSCAN_THROTTLE_WRITEBACK      (1 << VMSCAN_THROTTLE_WRITEBACK)
+#define _VMSCAN_THROTTLE_ISOLATED       (1 << VMSCAN_THROTTLE_ISOLATED)
 
 #define show_throttle_flags(flags)                              \
         (flags) ? __print_flags(flags, "|",                     \
-                {_VMSCAN_THROTTLE_WRITEBACK,    "VMSCAN_THROTTLE_WRITEBACK"} \
+                {_VMSCAN_THROTTLE_WRITEBACK,    "VMSCAN_THROTTLE_WRITEBACK"}, \
+                {_VMSCAN_THROTTLE_ISOLATED,     "VMSCAN_THROTTLE_ISOLATED"} \
                 ) : "VMSCAN_THROTTLE_NONE"
 
diff --git a/mm/compaction.c b/mm/compaction.c
index bfc93da1c2c7..221c9c10ad7e 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -822,7 +822,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
                 if (cc->mode == MIGRATE_ASYNC)
                         return -EAGAIN;
 
-                congestion_wait(BLK_RW_ASYNC, HZ/10);
+                reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10);
 
                 if (fatal_signal_pending(current))
                         return -EINTR;
diff --git a/mm/internal.h b/mm/internal.h
index e25b3686bfab..e6cd22fb5a43 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -118,6 +118,8 @@ extern unsigned long highest_memmap_pfn;
  */
 extern int isolate_lru_page(struct page *page);
 extern void putback_lru_page(struct page *page);
+extern void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
+                                                        long timeout);
 
 /*
  * in mm/rmap.c:
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d849ddfc1e51..78e538067651 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7389,6 +7389,8 @@ static void pgdat_init_kcompactd(struct pglist_data *pgdat) {}
 
 static void __meminit pgdat_init_internals(struct pglist_data *pgdat)
 {
+        int i;
+
         pgdat_resize_init(pgdat);
 
         pgdat_init_split_queue(pgdat);
@@ -7396,7 +7398,9 @@ static void __meminit pgdat_init_internals(struct pglist_data *pgdat)
 
         init_waitqueue_head(&pgdat->kswapd_wait);
         init_waitqueue_head(&pgdat->pfmemalloc_wait);
-        init_waitqueue_head(&pgdat->reclaim_wait);
+
+        for (i = 0; i < NR_VMSCAN_THROTTLE; i++)
+                init_waitqueue_head(&pgdat->reclaim_wait[i]);
 
         pgdat_page_ext_init(pgdat);
         lruvec_init(&pgdat->__lruvec);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b58ea0b13286..eb81dcac15b2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1006,11 +1006,10 @@ static void handle_write_error(struct address_space *mapping,
         unlock_page(page);
 }
 
-static void
-reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
+void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
                                                         long timeout)
 {
-        wait_queue_head_t *wqh = &pgdat->reclaim_wait;
+        wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason];
         unsigned long start = jiffies;
         long ret;
         DEFINE_WAIT(wait);
@@ -1044,7 +1043,7 @@ void __acct_reclaim_writeback(pg_data_t *pgdat, struct page *page)
                 READ_ONCE(pgdat->nr_reclaim_start);
 
         if (nr_written > SWAP_CLUSTER_MAX * nr_throttled)
-                wake_up_interruptible_all(&pgdat->reclaim_wait);
+                wake_up_interruptible_all(&pgdat->reclaim_wait[VMSCAN_THROTTLE_WRITEBACK]);
 }
 
 /* possible outcome of pageout() */
@@ -2159,6 +2158,7 @@ static int too_many_isolated(struct pglist_data *pgdat, int file,
                 struct scan_control *sc)
 {
         unsigned long inactive, isolated;
+        bool too_many;
 
         if (current_is_kswapd())
                 return 0;
@@ -2182,6 +2182,17 @@ static int too_many_isolated(struct pglist_data *pgdat, int file,
         if ((sc->gfp_mask & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS))
                 inactive >>= 3;
 
+        too_many = isolated > inactive;
+
+        /* Wake up tasks throttled due to too_many_isolated. */
+        if (!too_many) {
+                wait_queue_head_t *wqh;
+
+                wqh = &pgdat->reclaim_wait[VMSCAN_THROTTLE_ISOLATED];
+                if (waitqueue_active(wqh))
+                        wake_up_interruptible_all(wqh);
+        }
+
         return isolated > inactive;
 }
 
@@ -2291,8 +2302,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
                         return 0;
 
                 /* wait a bit for the reclaimer. */
-                msleep(100);
-                stalled = true;
+                reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10);
 
                 /* We are about to die and free our memory. Return now. */
                 if (fatal_signal_pending(current))
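
The condition this patch throttles on is too_many_isolated()'s
comparison of isolated pages against the inactive list, where callers
with both __GFP_IO and __GFP_FS have the inactive count divided by 8
before the comparison. A userspace sketch of just that comparison
follows; the GFP bit values and page counts are stand-ins for
illustration, not the kernel's.

#include <stdbool.h>
#include <stdio.h>

#define __GFP_IO (1u << 0)      /* illustrative bits, not the kernel's */
#define __GFP_FS (1u << 1)

static bool too_many_isolated_sim(unsigned long isolated,
                                  unsigned long inactive,
                                  unsigned int gfp_mask)
{
        /*
         * Callers with both __GFP_IO and __GFP_FS have the inactive
         * count divided by 8, so they throttle earlier; GFP_NOIO and
         * GFP_NOFS callers may isolate more to avoid circular deadlocks.
         */
        if ((gfp_mask & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS))
                inactive >>= 3;

        return isolated > inactive;
}

int main(void)
{
        /* A GFP_KERNEL-like caller trips the check much earlier... */
        printf("GFP_KERNEL-like: %d\n",
               too_many_isolated_sim(200, 1024, __GFP_IO | __GFP_FS));
        /* ...than a GFP_NOFS-like caller with the same counts. */
        printf("GFP_NOFS-like:   %d\n",
               too_many_isolated_sim(200, 1024, __GFP_IO));
        return 0;
}

When the check passes, the patch also wakes any tasks already sleeping
on VMSCAN_THROTTLE_ISOLATED, so sleepers need not always wait out the
full HZ/10 timeout.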
From patchwork Wed Sep 29 10:09:12 2021
X-Patchwork-Submitter: Mel Gorman <mgorman@techsingularity.net>
X-Patchwork-Id: 12525203
From: Mel Gorman <mgorman@techsingularity.net>
To: Linux-MM <linux-mm@kvack.org>
Cc: NeilBrown, Theodore Ts'o, Andreas Dilger, "Darrick J. Wong",
    Matthew Wilcox, Michal Hocko, Dave Chinner, Rik van Riel,
    Vlastimil Babka, Johannes Weiner, Jonathan Corbet, Linux-fsdevel,
    LKML, Mel Gorman
Subject: [PATCH 3/5] mm/vmscan: Throttle reclaim when no progress is being made
Date: Wed, 29 Sep 2021 11:09:12 +0100
Message-Id: <20210929100914.14704-4-mgorman@techsingularity.net>
In-Reply-To: <20210929100914.14704-1-mgorman@techsingularity.net>
References: <20210929100914.14704-1-mgorman@techsingularity.net>

Memcg reclaim throttles on congestion if no reclaim progress is made.
This makes little sense; the lack of progress may be due to writeback or
a host of other factors.

For !memcg reclaim, it's messy. Direct reclaim is primarily throttled in
the page allocator if it is failing to make progress. Kswapd throttles if
too many pages are under writeback and marked for immediate reclaim.

This patch explicitly throttles if reclaim is failing to make progress.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/mmzone.h        |  1 +
 include/trace/events/vmscan.h |  4 +++-
 mm/memcontrol.c               | 10 +--------
 mm/vmscan.c                   | 38 +++++++++++++++++++++++++++++++++++
 4 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index ca65d6a64bdd..7c08cc91d526 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -276,6 +276,7 @@ enum lru_list {
 enum vmscan_throttle_state {
         VMSCAN_THROTTLE_WRITEBACK,
         VMSCAN_THROTTLE_ISOLATED,
+        VMSCAN_THROTTLE_NOPROGRESS,
         NR_VMSCAN_THROTTLE,
 };
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index d4905bd9e9c4..f25a6149d3ba 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -29,11 +29,13 @@
 
 #define _VMSCAN_THROTTLE_WRITEBACK      (1 << VMSCAN_THROTTLE_WRITEBACK)
 #define _VMSCAN_THROTTLE_ISOLATED       (1 << VMSCAN_THROTTLE_ISOLATED)
+#define _VMSCAN_THROTTLE_NOPROGRESS     (1 << VMSCAN_THROTTLE_NOPROGRESS)
 
 #define show_throttle_flags(flags)                              \
         (flags) ? __print_flags(flags, "|",                     \
                 {_VMSCAN_THROTTLE_WRITEBACK,    "VMSCAN_THROTTLE_WRITEBACK"}, \
-                {_VMSCAN_THROTTLE_ISOLATED,     "VMSCAN_THROTTLE_ISOLATED"} \
+                {_VMSCAN_THROTTLE_ISOLATED,     "VMSCAN_THROTTLE_ISOLATED"}, \
+                {_VMSCAN_THROTTLE_NOPROGRESS,   "VMSCAN_THROTTLE_NOPROGRESS"} \
                 ) : "VMSCAN_THROTTLE_NONE"
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b762215d73eb..8479919a633c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3470,19 +3470,11 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
 
         /* try to free all pages in this cgroup */
         while (nr_retries && page_counter_read(&memcg->memory)) {
-                int progress;
-
                 if (signal_pending(current))
                         return -EINTR;
 
-                progress = try_to_free_mem_cgroup_pages(memcg, 1,
-                                                        GFP_KERNEL, true);
-                if (!progress) {
+                if (!try_to_free_mem_cgroup_pages(memcg, 1, GFP_KERNEL, true))
                         nr_retries--;
-                        /* maybe some writeback is necessary */
-                        congestion_wait(BLK_RW_ASYNC, HZ/10);
-                }
-
         }
 
         return 0;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index eb81dcac15b2..18b9826953a0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3307,6 +3307,33 @@ static inline bool compaction_ready(struct zone *zone, struct scan_control *sc)
         return zone_watermark_ok_safe(zone, 0, watermark, sc->reclaim_idx);
 }
 
+static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
+{
+        /* If reclaim is making progress, wake any throttled tasks. */
+        if (sc->nr_reclaimed) {
+                wait_queue_head_t *wqh;
+
+                wqh = &pgdat->reclaim_wait[VMSCAN_THROTTLE_NOPROGRESS];
+                if (waitqueue_active(wqh))
+                        wake_up_interruptible_all(wqh);
+
+                return;
+        }
+
+        /*
+         * Do not throttle kswapd on NOPROGRESS as it will throttle on
+         * VMSCAN_THROTTLE_WRITEBACK if there are too many pages under
+         * writeback and marked for immediate reclaim at the tail of
+         * the LRU.
+         */
+        if (current_is_kswapd())
+                return;
+
+        /* Throttle if making no progress at high priorities. */
+        if (sc->priority < DEF_PRIORITY - 2)
+                reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS, HZ/10);
+}
+
 /*
  * This is the direct reclaim path, for page-allocating processes.  We only
We only * try to reclaim pages from zones which will satisfy the caller's allocation @@ -3391,6 +3418,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc) continue; last_pgdat = zone->zone_pgdat; shrink_node(zone->zone_pgdat, sc); + consider_reclaim_throttle(zone->zone_pgdat, sc); } /* @@ -3765,6 +3793,16 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg, trace_mm_vmscan_memcg_reclaim_end(nr_reclaimed); set_task_reclaim_state(current, NULL); + if (!nr_reclaimed) { + struct zoneref *z; + pg_data_t *pgdat; + + z = first_zones_zonelist(zonelist, sc.reclaim_idx, sc.nodemask); + pgdat = zonelist_zone(z)->zone_pgdat; + + reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS, HZ/10); + } + return nr_reclaimed; } #endif From patchwork Wed Sep 29 10:09:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mel Gorman X-Patchwork-Id: 12525205 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7E329C433EF for ; Wed, 29 Sep 2021 10:10:11 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1A9ED61411 for ; Wed, 29 Sep 2021 10:10:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 1A9ED61411 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=techsingularity.net Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id ABB0F6B0081; Wed, 29 Sep 2021 06:10:10 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A6B6994001A; Wed, 29 Sep 2021 06:10:10 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 959BA940019; Wed, 29 Sep 2021 06:10:10 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0214.hostedemail.com [216.40.44.214]) by kanga.kvack.org (Postfix) with ESMTP id 8633A6B0081 for ; Wed, 29 Sep 2021 06:10:10 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 3FB332FD84 for ; Wed, 29 Sep 2021 10:10:10 +0000 (UTC) X-FDA: 78640190580.29.E1AB023 Received: from outbound-smtp35.blacknight.com (outbound-smtp35.blacknight.com [46.22.139.218]) by imf25.hostedemail.com (Postfix) with ESMTP id D3564B00018E for ; Wed, 29 Sep 2021 10:10:09 +0000 (UTC) Received: from mail.blacknight.com (pemlinmail06.blacknight.ie [81.17.255.152]) by outbound-smtp35.blacknight.com (Postfix) with ESMTPS id 951E21A3A for ; Wed, 29 Sep 2021 11:10:08 +0100 (IST) Received: (qmail 20883 invoked from network); 29 Sep 2021 10:10:08 -0000 Received: from unknown (HELO stampy.112glenside.lan) (mgorman@techsingularity.net@[84.203.17.29]) by 81.17.254.9 with ESMTPA; 29 Sep 2021 10:10:07 -0000 From: Mel Gorman To: Linux-MM Cc: NeilBrown , Theodore Ts'o , Andreas Dilger , "Darrick J . 
From patchwork Wed Sep 29 10:09:13 2021
X-Patchwork-Submitter: Mel Gorman <mgorman@techsingularity.net>
X-Patchwork-Id: 12525205
From: Mel Gorman <mgorman@techsingularity.net>
To: Linux-MM <linux-mm@kvack.org>
Cc: NeilBrown, Theodore Ts'o, Andreas Dilger, "Darrick J. Wong",
    Matthew Wilcox, Michal Hocko, Dave Chinner, Rik van Riel,
    Vlastimil Babka, Johannes Weiner, Jonathan Corbet, Linux-fsdevel,
    LKML, Mel Gorman
Subject: [PATCH 4/5] mm/writeback: Throttle based on page writeback instead of congestion
Date: Wed, 29 Sep 2021 11:09:13 +0100
Message-Id: <20210929100914.14704-5-mgorman@techsingularity.net>
In-Reply-To: <20210929100914.14704-1-mgorman@techsingularity.net>
References: <20210929100914.14704-1-mgorman@techsingularity.net>

do_writepages throttles on congestion if the writepages() call fails due
to a lack of memory, but congestion_wait() is partially broken as the
congestion state is not updated for all BDIs.

This patch stalls waiting for a number of pages to complete writeback on
the local node. The main weakness is that there is no correlation between
the location of the inode's pages and the local node, but it is still
better than congestion_wait.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/page-writeback.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 4812a17b288c..f34f54fcd5b4 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2366,8 +2366,15 @@ int do_writepages(struct address_space *mapping, struct writeback_control *wbc)
                         ret = generic_writepages(mapping, wbc);
                 if ((ret != -ENOMEM) || (wbc->sync_mode != WB_SYNC_ALL))
                         break;
-                cond_resched();
-                congestion_wait(BLK_RW_ASYNC, HZ/50);
+
+                /*
+                 * Lacking an allocation context or the locality or writeback
+                 * state of any of the inode's pages, throttle based on
+                 * writeback activity on the local node. It's as good a
+                 * guess as any.
+                 */
+                reclaim_throttle(NODE_DATA(numa_node_id()),
+                        VMSCAN_THROTTLE_WRITEBACK, HZ/50);
         }
         /*
          * Usually few pages are written by now from those we've just submitted
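
The retry shape of do_writepages() after this change: only integrity
writeback (WB_SYNC_ALL) retries on -ENOMEM, and each retry is now
preceded by throttling on the local node's writeback rather than
congestion_wait(). A self-contained userspace sketch with mocked
helpers follows; none of the helper names here are the kernel's API.

#include <errno.h>
#include <stdio.h>

enum sync_mode { WB_SYNC_NONE, WB_SYNC_ALL };

static int attempts;

/* Mock: fail twice with -ENOMEM, then succeed. */
static int mock_writepages(void)
{
        return ++attempts < 3 ? -ENOMEM : 0;
}

/* Stands in for reclaim_throttle(NODE_DATA(numa_node_id()),
 * VMSCAN_THROTTLE_WRITEBACK, HZ/50). */
static void mock_reclaim_throttle(void)
{
        printf("throttle on local-node writeback before attempt %d\n",
               attempts + 1);
}

static int do_writepages_sim(enum sync_mode sync_mode)
{
        int ret;

        for (;;) {
                ret = mock_writepages();
                /* Only integrity writeback retries on -ENOMEM. */
                if (ret != -ENOMEM || sync_mode != WB_SYNC_ALL)
                        break;
                mock_reclaim_throttle();
        }
        return ret;
}

int main(void)
{
        printf("final ret = %d\n", do_writepages_sim(WB_SYNC_ALL));
        return 0;
}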
From patchwork Wed Sep 29 10:09:14 2021
X-Patchwork-Submitter: Mel Gorman <mgorman@techsingularity.net>
X-Patchwork-Id: 12525207
From: Mel Gorman <mgorman@techsingularity.net>
To: Linux-MM <linux-mm@kvack.org>
Cc: NeilBrown, Theodore Ts'o, Andreas Dilger, "Darrick J. Wong",
    Matthew Wilcox, Michal Hocko, Dave Chinner, Rik van Riel,
    Vlastimil Babka, Johannes Weiner, Jonathan Corbet, Linux-fsdevel,
    LKML, Mel Gorman
Subject: [PATCH 5/5] mm/page_alloc: Remove the throttling logic from the page allocator
Date: Wed, 29 Sep 2021 11:09:14 +0100
Message-Id: <20210929100914.14704-6-mgorman@techsingularity.net>
In-Reply-To: <20210929100914.14704-1-mgorman@techsingularity.net>
References: <20210929100914.14704-1-mgorman@techsingularity.net>

The page allocator stalls based on the number of pages that are waiting
for writeback to start, but this should now be redundant.
shrink_inactive_list() will wake flusher threads if the tail of the LRU
consists of unqueued dirty pages, so the flusher should be active. If it
fails to make progress due to pages under writeback not being completed
quickly, then it should stall on VMSCAN_THROTTLE_WRITEBACK.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/page_alloc.c | 21 +--------------------
 1 file changed, 1 insertion(+), 20 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 78e538067651..8fa0109ff417 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4795,30 +4795,11 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
                 trace_reclaim_retry_zone(z, order, reclaimable,
                                 available, min_wmark, *no_progress_loops, wmark);
                 if (wmark) {
-                        /*
-                         * If we didn't make any progress and have a lot of
-                         * dirty + writeback pages then we should wait for
-                         * an IO to complete to slow down the reclaim and
-                         * prevent from pre mature OOM
-                         */
-                        if (!did_some_progress) {
-                                unsigned long write_pending;
-
-                                write_pending = zone_page_state_snapshot(zone,
-                                                        NR_ZONE_WRITE_PENDING);
-
-                                if (2 * write_pending > reclaimable) {
-                                        congestion_wait(BLK_RW_ASYNC, HZ/10);
-                                        return true;
-                                }
-                        }
-
                         ret = true;
-                        goto out;
+                        break;
                 }
         }
 
-out:
         /*
          * Memory allocation/reclaim might be called from a WQ context and the
          * current implementation of the WQ concurrency control doesn't
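
For reference, the heuristic removed above stalled the allocator when no
progress had been made and more than half of the reclaimable pages were
pending writeback. A userspace sketch of that removed check, with
invented counts for illustration:

#include <stdbool.h>
#include <stdio.h>

/* The check that used to live in should_reclaim_retry(). */
static bool old_allocator_stall(bool did_some_progress,
                                unsigned long write_pending,
                                unsigned long reclaimable)
{
        return !did_some_progress && 2 * write_pending > reclaimable;
}

int main(void)
{
        /* Mostly-dirty node with no progress: the old code slept here. */
        printf("stall? %d\n", old_allocator_stall(false, 600, 1000));
        /* After this series, the equivalent stall happens in vmscan via
         * reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK, HZ/10). */
        return 0;
}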