Message ID | 20190222174337.26390-5-aryabinin@virtuozzo.com (mailing list archive)
---|---
State | New, archived
Series | [1/5] mm/workingset: remove unused @mapping argument in workingset_eviction()
On Fri, Feb 22, 2019 at 08:43:37PM +0300, Andrey Ryabinin wrote:
> shrink_node_memcg() always forcely shrink active anon list.
> This doesn't seem like correct behavior. If system/memcg has no swap, it's
> absolutely pointless to rebalance anon lru lists.
> And in case we did scan the active anon list above, it's unclear why would
> we need this additional force scan. If there are cases when we want more
> aggressive scan of the anon lru we should just change the scan target
> in get_scan_count() (and better explain such cases in the comments).
>
> Remove this force shrink and let get_scan_count() to decide how
> much of active anon we want to shrink.

This change breaks the anon pre-aging.

The idea behind this is that the VM maintains a small batch of anon
reclaim candidates with recent access information. On every reclaim,
even when we just trim cache, which is the most common reclaim mode,
but also when we just swapped out some pages and shrunk the inactive
anon list, at the end of it we make sure that the list of potential
anon candidates is refilled for the next reclaim cycle.

The comments for this are above inactive_list_is_low() and the
age_active_anon() call from kswapd.

Re: no swap, you are correct. We should gate that rebalancing on
total_swap_pages, just like age_active_anon() does.
On 2/22/19 9:22 PM, Johannes Weiner wrote:
> On Fri, Feb 22, 2019 at 08:43:37PM +0300, Andrey Ryabinin wrote:
>> shrink_node_memcg() always forcely shrink active anon list.
>> This doesn't seem like correct behavior. If system/memcg has no swap, it's
>> absolutely pointless to rebalance anon lru lists.
>> And in case we did scan the active anon list above, it's unclear why would
>> we need this additional force scan. If there are cases when we want more
>> aggressive scan of the anon lru we should just change the scan target
>> in get_scan_count() (and better explain such cases in the comments).
>>
>> Remove this force shrink and let get_scan_count() to decide how
>> much of active anon we want to shrink.
>
> This change breaks the anon pre-aging.
>
> The idea behind this is that the VM maintains a small batch of anon
> reclaim candidates with recent access information. On every reclaim,
> even when we just trim cache, which is the most common reclaim mode,
> but also when we just swapped out some pages and shrunk the inactive
> anon list, at the end of it we make sure that the list of potential
> anon candidates is refilled for the next reclaim cycle.
>
> The comments for this are above inactive_list_is_low() and the
> age_active_anon() call from kswapd.
>
> Re: no swap, you are correct. We should gate that rebalancing on
> total_swap_pages, just like age_active_anon() does.

I think we should leave anon aging only for !SCAN_FILE cases.
At least aging was definitely invented for the SCAN_FRACT mode, which was
the main mode at the time it was added by the commit:

    commit 556adecba110bf5f1db6c6b56416cfab5bcab698
    Author: Rik van Riel <riel@redhat.com>
    Date:   Sat Oct 18 20:26:34 2008 -0700

        vmscan: second chance replacement for anonymous pages

Later we've got more of the SCAN_FILE mode usage, commit:

    commit e9868505987a03a26a3979f27b82911ccc003752
    Author: Rik van Riel <riel@redhat.com>
    Date:   Tue Dec 11 16:01:10 2012 -0800

        mm,vmscan: only evict file pages when we have plenty

and I think it would be reasonable to avoid the anon aging in the SCAN_FILE
case. Because if the workload generates enough inactive file pages we never
go to SCAN_FRACT, so aging is just as useless as in the no-swap case.

So, how about something like below on top of the patch?

---
 mm/vmscan.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index efd10d6b9510..6c63adfee37b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2525,6 +2525,15 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 		nr[lru] = scan;
 	}
+
+	/*
+	 * Even if we did not try to evict anon pages at all, we want to
+	 * rebalance the anon lru active/inactive ratio to maintain
+	 * enough reclaim candidates for the next reclaim cycle.
+	 */
+	if (scan_balance != SCAN_FILE && inactive_list_is_low(lruvec,
+				false, memcg, sc, false))
+		nr[LRU_ACTIVE_ANON] += SWAP_CLUSTER_MAX;
 }

 /*
On Tue, 2019-02-26 at 15:04 +0300, Andrey Ryabinin wrote:
> I think we should leave anon aging only for !SCAN_FILE cases.
> At least aging was definitely invented for the SCAN_FRACT mode which
> was the main mode at the time it was added by the commit:
> and I think would be reasonable to avoid the anon aging in the
> SCAN_FILE case.
> Because if workload generates enough inactive file pages we never go
> to the SCAN_FRACT, so aging is just as useless as with no swap case.

There are a few different cases here.

If you NEVER end up scanning or evicting anonymous pages, scanning them
is indeed a waste of time. However, if you occasionally end up pushing
something into swap, it is very useful to know that the pages that did
get pushed to swap had been sitting on the inactive list for a very
long time, and had not been used in that time.

To limit the amount of wasted work, only SWAP_CLUSTER_MAX pages are
moved from the active_anon list to the inactive_anon list at a time.

I suppose that could be gated behind a check whether or not the system
has swap space configured, so no anon pages are ever scanned if the
system has no swap space.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 07f74e9507b6..efd10d6b9510 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2563,8 +2563,8 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 			sc->priority == DEF_PRIORITY);

 	blk_start_plug(&plug);
-	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
-					nr[LRU_INACTIVE_FILE]) {
+	while (nr[LRU_ACTIVE_ANON] || nr[LRU_INACTIVE_ANON] ||
+	       nr[LRU_ACTIVE_FILE] || nr[LRU_INACTIVE_FILE]) {
 		unsigned long nr_anon, nr_file, percentage;
 		unsigned long nr_scanned;

@@ -2636,14 +2636,6 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 	}
 	blk_finish_plug(&plug);
 	sc->nr_reclaimed += nr_reclaimed;
-
-	/*
-	 * Even if we did not try to evict anon pages at all, we want to
-	 * rebalance the anon lru active/inactive ratio.
-	 */
-	if (inactive_list_is_low(lruvec, false, memcg, sc, true))
-		shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
-				   sc, LRU_ACTIVE_ANON);
 }

 /* Use reclaim/compaction for costly allocs or under memory pressure */
shrink_node_memcg() always forcibly shrinks the active anon list. This
doesn't seem like correct behavior. If the system/memcg has no swap, it's
absolutely pointless to rebalance the anon lru lists. And in case we did
scan the active anon list above, it's unclear why we would need this
additional force scan. If there are cases when we want a more aggressive
scan of the anon lru we should just change the scan target in
get_scan_count() (and better explain such cases in the comments).

Remove this force shrink and let get_scan_count() decide how much of the
active anon list we want to shrink.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 mm/vmscan.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)