
[5/5] mm/vmscan: don't forcibly shrink active anon lru list

Message ID 20190222174337.26390-5-aryabinin@virtuozzo.com (mailing list archive)
State New, archived
Series [1/5] mm/workingset: remove unused @mapping argument in workingset_eviction()

Commit Message

Andrey Ryabinin Feb. 22, 2019, 5:43 p.m. UTC
shrink_node_memcg() always forcibly shrinks the active anon list.
This doesn't seem like correct behavior. If the system/memcg has no swap,
it's absolutely pointless to rebalance the anon lru lists.
And in case we did scan the active anon list above, it's unclear why we
would need this additional forced scan. If there are cases where we want a
more aggressive scan of the anon lru, we should just change the scan target
in get_scan_count() (and better explain such cases in the comments).

Remove this forced shrink and let get_scan_count() decide how much of the
active anon list we want to shrink.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 mm/vmscan.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

Comments

Johannes Weiner Feb. 22, 2019, 6:22 p.m. UTC | #1
On Fri, Feb 22, 2019 at 08:43:37PM +0300, Andrey Ryabinin wrote:
> shrink_node_memcg() always forcibly shrinks the active anon list.
> This doesn't seem like correct behavior. If the system/memcg has no swap,
> it's absolutely pointless to rebalance the anon lru lists.
> And in case we did scan the active anon list above, it's unclear why we
> would need this additional forced scan. If there are cases where we want a
> more aggressive scan of the anon lru, we should just change the scan target
> in get_scan_count() (and better explain such cases in the comments).
> 
> Remove this forced shrink and let get_scan_count() decide how much of the
> active anon list we want to shrink.

This change breaks the anon pre-aging.

The idea behind this is that the VM maintains a small batch of anon
reclaim candidates with recent access information. On every reclaim,
whether we just trim cache (the most common reclaim mode) or we just
swapped out some pages and shrank the inactive anon list, we make sure
at the end that the list of potential anon candidates is refilled for
the next reclaim cycle.

The comments for this are above inactive_list_is_low() and the
age_active_anon() call from kswapd.

Re: no swap, you are correct. We should gate that rebalancing on
total_swap_pages, just like age_active_anon() does.
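
For reference, age_active_anon() in kernels of this era looks roughly like
the sketch below (paraphrased, not copied verbatim from the tree):

static void age_active_anon(struct pglist_data *pgdat,
			    struct scan_control *sc)
{
	struct mem_cgroup *memcg;

	/* No swap means anon pages cannot be reclaimed; skip the pre-aging. */
	if (!total_swap_pages)
		return;

	memcg = mem_cgroup_iter(NULL, NULL, NULL);
	do {
		struct lruvec *lruvec = mem_cgroup_lruvec(pgdat, memcg);

		/* Keep a small batch of aged candidates on inactive_anon. */
		if (inactive_list_is_low(lruvec, false, memcg, sc, true))
			shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
					   sc, LRU_ACTIVE_ANON);

		memcg = mem_cgroup_iter(NULL, memcg, NULL);
	} while (memcg);
}
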
Andrey Ryabinin Feb. 26, 2019, 12:04 p.m. UTC | #2
On 2/22/19 9:22 PM, Johannes Weiner wrote:
> On Fri, Feb 22, 2019 at 08:43:37PM +0300, Andrey Ryabinin wrote:
>> shrink_node_memcg() always forcibly shrinks the active anon list.
>> This doesn't seem like correct behavior. If the system/memcg has no swap,
>> it's absolutely pointless to rebalance the anon lru lists.
>> And in case we did scan the active anon list above, it's unclear why we
>> would need this additional forced scan. If there are cases where we want a
>> more aggressive scan of the anon lru, we should just change the scan target
>> in get_scan_count() (and better explain such cases in the comments).
>>
>> Remove this forced shrink and let get_scan_count() decide how much of the
>> active anon list we want to shrink.
> 
> This change breaks the anon pre-aging.
> 
> The idea behind this is that the VM maintains a small batch of anon
> reclaim candidates with recent access information. On every reclaim,
> whether we just trim cache (the most common reclaim mode) or we just
> swapped out some pages and shrank the inactive anon list, we make sure
> at the end that the list of potential anon candidates is refilled for
> the next reclaim cycle.
> 
> The comments for this are above inactive_list_is_low() and the
> age_active_anon() call from kswapd.
> 
> Re: no swap, you are correct. We should gate that rebalancing on
> total_swap_pages, just like age_active_anon() does.
> 


I think we should leave anon aging only for the !SCAN_FILE cases.
At least, aging was definitely invented for the SCAN_FRACT mode, which was
the main mode at the time it was added by this commit:

	556adecba110bf5f1db6c6b56416cfab5bcab698
	Author: Rik van Riel <riel@redhat.com>
	Date:   Sat Oct 18 20:26:34 2008 -0700

	    vmscan: second chance replacement for anonymous pages


Later we got more SCAN_FILE mode usage with this commit:

e9868505987a03a26a3979f27b82911ccc003752
Author: Rik van Riel <riel@redhat.com>
Date:   Tue Dec 11 16:01:10 2012 -0800

    mm,vmscan: only evict file pages when we have plenty


and I think it would be reasonable to avoid the anon aging in the SCAN_FILE case.
Because if the workload generates enough inactive file pages, we never go to
SCAN_FRACT, so aging is just as useless as in the no-swap case.

So, how about something like below, on top of the patch?

---
 mm/vmscan.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index efd10d6b9510..6c63adfee37b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2525,6 +2525,15 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 
 		nr[lru] = scan;
 	}
+
+	/*
+	 * Even if we did not try to evict anon pages at all, we want to
+	 * rebalance the anon lru active/inactive ratio to maintain
+	 * enough reclaim candidates for the next reclaim cycle.
+	 */
+	if (scan_balance != SCAN_FILE && inactive_list_is_low(lruvec,
+						false, memcg, sc, false))
+		nr[LRU_ACTIVE_ANON] += SWAP_CLUSTER_MAX;
 }
 
 /*
Rik van Riel Feb. 26, 2019, 2:42 p.m. UTC | #3
On Tue, 2019-02-26 at 15:04 +0300, Andrey Ryabinin wrote:

> I think we should leave anon aging only for the !SCAN_FILE cases.
> At least, aging was definitely invented for the SCAN_FRACT mode, which
> was the main mode at the time it was added by this commit:

> and I think it would be reasonable to avoid the anon aging in the
> SCAN_FILE case.
> Because if the workload generates enough inactive file pages, we never
> go to SCAN_FRACT, so aging is just as useless as in the no-swap case.

There are a few different cases here.

If you NEVER end up scanning or evicting anonymous
pages, scanning them is indeed a waste of time.

However, if you occasionally end up pushing something
into swap, it is very useful to know that the pages
that did get pushed to swap had been sitting on the
inactive list for a very long time, and had not been
used in that time.

To limit the amount of wasted work, only SWAP_CLUSTER_MAX
pages are moved from the active_anon list to the inactive_anon
list at a time.

I suppose that could be gated behind a check whether or
not the system has swap space configured, so no anon
pages are ever scanned if the system has no swap space.
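
Something like the sketch below, reusing the hunk this patch removes but
gating it on total_swap_pages (illustrative only, not a tested patch):

	/*
	 * Keep the anon pre-aging, but only when swap is configured, so
	 * swapless systems never scan the anon lists here. Moving at most
	 * SWAP_CLUSTER_MAX pages bounds the amount of work per cycle.
	 */
	if (total_swap_pages &&
	    inactive_list_is_low(lruvec, false, memcg, sc, true))
		shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
				   sc, LRU_ACTIVE_ANON);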

Patch

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 07f74e9507b6..efd10d6b9510 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2563,8 +2563,8 @@  static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 			 sc->priority == DEF_PRIORITY);
 
 	blk_start_plug(&plug);
-	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
-					nr[LRU_INACTIVE_FILE]) {
+	while (nr[LRU_ACTIVE_ANON] || nr[LRU_INACTIVE_ANON] ||
+		nr[LRU_ACTIVE_FILE] || nr[LRU_INACTIVE_FILE]) {
 		unsigned long nr_anon, nr_file, percentage;
 		unsigned long nr_scanned;
 
@@ -2636,14 +2636,6 @@  static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 	}
 	blk_finish_plug(&plug);
 	sc->nr_reclaimed += nr_reclaimed;
-
-	/*
-	 * Even if we did not try to evict anon pages at all, we want to
-	 * rebalance the anon lru active/inactive ratio.
-	 */
-	if (inactive_list_is_low(lruvec, false, memcg, sc, true))
-		shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
-				   sc, LRU_ACTIVE_ANON);
 }
 
 /* Use reclaim/compaction for costly allocs or under memory pressure */