Message ID | TYCP286MB1108D012DA436CA72029ACA7C5DF9@TYCP286MB1108.JPNP286.PROD.OUTLOOK.COM (mailing list archive) |
---|---|
State | New |
Series | mm:vmscan: fix extra adjustment for lruvec's nonresident_age in case of reactivation |
On Sun, Sep 19, 2021 at 11:25:09PM +0800, Yongmei Xie wrote:
> Before commit #31d8fcac, the VM didn't increase nonresident_age (AKA inactive age for
> file pages) in shrink_page_list. When putback_inactive_pages was converged with
> move_pages_to_lru, both shrink_active_list and shrink_page_list came to use the same
> function to move pages to the appropriate lru under the lru lock's protection.
>
> In those days, the VM didn't increase nonresident_age for second chance promotion.
> Commit #31d8fcac fixed the problem. Definitely, we should account the activation
> for second chance. But move_pages_to_lru is also used for reactivation in the
> active lru, to protect code sections. So I suggest adding another variable to
> tell whether it is a reactivation or not.

This looks incorrect to me. We *should* count reactivations/rotations
on the active list toward nonresident age.

The nonresident age tracks the number of in-memory references in order
to later calculate the (minimum) reuse distance of refaulting pages. If
a page on the active list gets reactivated due to a reference, that
reference contributes to the distance of yet-to-refault pages.
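The reuse-distance argument above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct lru_model`, its `workingset_size` field, and all `model_*` helpers are hypothetical names invented for illustration, and the logic is a simplification of the idea described in the reply, not the kernel's workingset code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-lruvec state. */
struct lru_model {
	unsigned long nonresident_age;	/* advances on evictions and on references to resident pages */
	unsigned long workingset_size;	/* assumed number of pages the workingset may occupy */
};

/* A resident page was seen referenced (activation or active-list rotation). */
static void model_note_reference(struct lru_model *lru, unsigned long nr_pages)
{
	assert(lru != NULL);
	lru->nonresident_age += nr_pages;
}

/* Evict one page; the returned snapshot plays the role of its shadow entry. */
static unsigned long model_evict(struct lru_model *lru)
{
	assert(lru != NULL);
	lru->nonresident_age += 1;
	return lru->nonresident_age;
}

/* Reuse distance at refault time: how far the clock moved since eviction. */
static unsigned long model_refault_distance(const struct lru_model *lru,
					    unsigned long shadow)
{
	return lru->nonresident_age - shadow;
}

/* Activate the refaulting page only if its distance fits in the workingset. */
static bool model_refault_should_activate(const struct lru_model *lru,
					  unsigned long shadow)
{
	return model_refault_distance(lru, shadow) <= lru->workingset_size;
}
```

In this model, every reference counted through `model_note_reference()` pushes already-evicted pages further away. If rotations on the active list were not counted, the distance of a refaulting page would be underestimated and it would look hotter than the resident pages that were actually being referenced.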
On Wed, Sep 22, 2021 at 02:27:24PM +0000, 解 咏梅 wrote:
> But now, move_pages_to_lru doesn't increase nonresident age for active rotation.
> Back to inactive age, the VM ONLY cares about the pages that left the inactive lru, AKA activation and reclaiming.
> Anyway, reactivation is a rare case, so whether it contributes to nonresident age or not is ok with me.
>
> But I am interested in the logic of how to guess that pages will be referenced again in the future.
> If active reactivation does matter to nonresident age, why not active rotation? But, currently it doesn't.

Can you point me to the code you're referring to?

Looking at move_pages_to_lru(), any pages with PageActive() set count
toward the non-resident age. That means activations from the inactive
list, as well as rotations on the active list, increase the
nonresident age.

As to your question which one is right: the original workingset patch
was wrong not to count activations and reactivations. If we see a page
referenced in memory, it means it's hotter than the page that's not
refaulting -> nonresident age increases.

So the code as it is now looks correct to me.

Thanks
Hello Yongmei,

On Thu, Oct 07, 2021 at 02:35:44PM +0000, 解 咏梅 wrote:
> You're right. I checked with the commit 264e90cc07f177adec17ee7cc154ddaa132f0b2d
>
> I was saying that because, 1 or 2 years ago, the VM used reclaim_stat's rotation/scan as the factor to balance the ratio between fs page cache and anonymous pages.
> It used the side effect of working set activation (it raised the rotation count) to challenge the other side's memory: file vs anon
> And in shrink_active_list, deactivation also contributes to the rotation count.
>
> So I came to the conclusion that active list rotation refers to deactivation.
> I checked with commit #264e90c, only executable code sections contribute to active list rotation. Thank you for pointing out my misunderstanding.
> But deactivation contributes to the PGROTATED event. I'm still sort of confused:(

Yeah PGROTATED is a little strange! I'm not sure people use it much.

> 1 more question:
> why do activation/deactivation/deactive_fn no longer contribute to lru cost? because those are cpu intensive, not I/O intensive, right?
>
> So for now, if we'd like to balance the ratio between fs page cache and anonymous pages, we only take I/O (in the allocation path and reclaim path) into consideration.

Yes, correct. The idea is to have the algorithm serve the workingset
with the least amount of paging IO.

Actually, the first version of the patch accounted for CPU overhead,
since anon and file do have different aging rules with different CPU
requirements. However, it didn't seem to matter in my testing, and
it's a bit awkward to compare IO cost and CPU cost since it depends
heavily on the underlying hardware, so I deleted that code. It's
possible we may need to add it back if a workload shows up that cares.

> From my observation, the VM doesn't take the fs periodic dirty flush as I/O cost.

Correct. The thinking is: writeback IO needs to happen with or without
reclaim, because of data integrity.
Whereas swapping only happens under memory pressure - without anon
reclaim we would not do any swap writes.

Of course, reclaim can trigger accelerated dirty flushing, which
*could* result in increased IO and thus higher LRU cost: two buffered
writes to the same page within the dirty expiration window would
result in one disk write but could result in two under pressure. It's
a pain to track this properly, though, so the compromise is that when
kswapd gets in enough trouble that it needs to flush pages one by one
(NR_VMSCAN_WRITE), those writes are counted toward the LRU cost. This
seems to work reasonably well in practice.

> Looking forward to your reply!
>
> Thanks again. I get a clearer view of the VM:)
>
>
> It is the Chinese national holiday, sorry for my late response.

Happy Golden Week!

Johannes
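As a rough illustration of the IO-only cost accounting discussed in this reply, the sketch below keeps one paging-IO counter per LRU type and directs more scan pressure at the cheaper side. Everything here is an assumption made for illustration: `struct lru_cost_model`, the `model_*` helpers, and in particular the inverse-proportional split are hypothetical and are not the kernel's formula.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-LRU cost counters; only paging IO is charged. */
struct lru_cost_model {
	unsigned long anon_cost;	/* e.g. swap reads and writes caused by anon reclaim */
	unsigned long file_cost;	/* e.g. refault reads and reclaim-driven page writes */
};

enum model_lru_type { MODEL_LRU_ANON, MODEL_LRU_FILE };

/*
 * Charge IO to one side. CPU-only work such as activation or deactivation
 * is deliberately never passed in here, and neither is periodic writeback,
 * which would happen with or without reclaim.
 */
static void model_note_cost(struct lru_cost_model *cost,
			    enum model_lru_type type,
			    unsigned long nr_io_pages)
{
	assert(cost != NULL);
	if (type == MODEL_LRU_FILE)
		cost->file_cost += nr_io_pages;
	else
		cost->anon_cost += nr_io_pages;
}

/*
 * Percentage of scan pressure aimed at the file LRU. Assumed rule: the
 * more IO anon reclaim has caused, the more we lean on file pages, and
 * vice versa. With no recorded cost the split is even.
 */
static unsigned int model_file_scan_percent(const struct lru_cost_model *cost)
{
	unsigned long total;

	assert(cost != NULL);
	total = cost->anon_cost + cost->file_cost;
	if (total == 0)
		return 50;
	/* Fine for small illustrative values; a real version must avoid overflow. */
	return (unsigned int)(100 * cost->anon_cost / total);
}
```

For example, after charging 30 pages of IO to anon and 10 to file, `model_file_scan_percent()` yields 75: the side that has been cheaper to reclaim receives most of the pressure.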
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 74296c2d1fed..85ccafcd4912 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2152,7 +2152,8 @@ static int too_many_isolated(struct pglist_data *pgdat, int file,
  * Returns the number of pages moved to the given lruvec.
  */
 static unsigned int move_pages_to_lru(struct lruvec *lruvec,
-						     struct list_head *list)
+						     struct list_head *list,
+						     bool reactivation)
 {
 	int nr_pages, nr_moved = 0;
 	LIST_HEAD(pages_to_free);
@@ -2203,7 +2204,7 @@ static unsigned int move_pages_to_lru(struct lruvec *lruvec,
 		add_page_to_lru_list(page, lruvec);
 		nr_pages = thp_nr_pages(page);
 		nr_moved += nr_pages;
-		if (PageActive(page))
+		if (PageActive(page) && !reactivation)
 			workingset_age_nonresident(lruvec, nr_pages);
 	}
 
@@ -2281,7 +2282,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	nr_reclaimed = shrink_page_list(&page_list, pgdat, sc, &stat, false);
 
 	spin_lock_irq(&lruvec->lru_lock);
-	move_pages_to_lru(lruvec, &page_list);
+	move_pages_to_lru(lruvec, &page_list, false);
 
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	item = current_is_kswapd() ? PGSTEAL_KSWAPD : PGSTEAL_DIRECT;
@@ -2418,8 +2419,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	 */
 	spin_lock_irq(&lruvec->lru_lock);
 
-	nr_activate = move_pages_to_lru(lruvec, &l_active);
-	nr_deactivate = move_pages_to_lru(lruvec, &l_inactive);
+	nr_activate = move_pages_to_lru(lruvec, &l_active, true);
+	nr_deactivate = move_pages_to_lru(lruvec, &l_inactive, false);
 	/* Keep all free pages in l_active list */
 	list_splice(&l_inactive, &l_active);
 
Before commit #31d8fcac, the VM didn't increase nonresident_age (AKA inactive age for
file pages) in shrink_page_list. When putback_inactive_pages was converged with
move_pages_to_lru, both shrink_active_list and shrink_page_list came to use the same
function to move pages to the appropriate lru under the lru lock's protection.

In those days, the VM didn't increase nonresident_age for second chance promotion.
Commit #31d8fcac fixed the problem. Definitely, we should account the activation
for second chance. But move_pages_to_lru is also used for reactivation in the
active lru, to protect code sections. So I suggest adding another variable to
tell whether it is a reactivation or not.

Signed-off-by: Yongmei Xie <yongmeixie@hotmail.com>
---
 mm/vmscan.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)
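To make the behavioral difference of the proposed `reactivation` flag concrete, here is a minimal user-space sketch that mirrors only the changed condition in the putback loop. `struct model_page` and `model_age_after_putback()` are hypothetical names, and the `patched` parameter exists purely to compare both variants side by side; none of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an isolated page being put back on an LRU. */
struct model_page {
	bool active;		/* models PageActive() */
	unsigned long nr_pages;	/* models thp_nr_pages(): 1, or more for a huge page */
};

/*
 * Return how far the nonresident age would advance while putting the
 * given pages back. With patched == false every active page counts;
 * with patched == true, active pages are skipped when the caller says
 * the putback is a reactivation.
 */
static unsigned long model_age_after_putback(const struct model_page *pages,
					     size_t n, bool reactivation,
					     bool patched)
{
	unsigned long age = 0;

	assert(pages != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!pages[i].active)
			continue;
		if (patched && reactivation)
			continue;
		age += pages[i].nr_pages;
	}
	return age;
}
```

Called on a list holding one active base page, one active 512-page huge page, and one inactive page, the unpatched variant advances the age by 513 for a reactivation putback while the patched variant advances it by 0. That gap is the set of in-memory references the review argues should still be counted.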