mm:vmscan: fix extra adjustment for lruvec's nonresident_age in case of reactivation

Message ID TYCP286MB1108D012DA436CA72029ACA7C5DF9@TYCP286MB1108.JPNP286.PROD.OUTLOOK.COM (mailing list archive)
State New

Commit Message

Xie Yongmei Sept. 19, 2021, 3:25 p.m. UTC
Before commit #31d8fcac, the VM didn't increase nonresident_age (AKA inactive
age for file pages) in shrink_page_list. When putback_inactive_pages was merged
into move_pages_to_lru, both shrink_active_list and shrink_page_list began using
the same function to move pages to the appropriate lru under the lru lock.

In those days, the VM didn't increase nonresident_age for second-chance
promotion. Commit #31d8fcac fixed that problem: we should definitely account the
activation for second chance. But move_pages_to_lru is also used for
reactivation on the active lru, to protect executable code sections. So I
suggest adding another parameter to tell whether the call is a reactivation.

Signed-off-by: Yongmei Xie <yongmeixie@hotmail.com>
---
 mm/vmscan.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Comments

Johannes Weiner Sept. 21, 2021, 2:48 p.m. UTC | #1
On Sun, Sep 19, 2021 at 11:25:09PM +0800, Yongmei Xie wrote:
> Before commit #31d8fcac, the VM didn't increase nonresident_age (AKA inactive
> age for file pages) in shrink_page_list. When putback_inactive_pages was merged
> into move_pages_to_lru, both shrink_active_list and shrink_page_list began using
> the same function to move pages to the appropriate lru under the lru lock.
> 
> In those days, the VM didn't increase nonresident_age for second-chance
> promotion. Commit #31d8fcac fixed that problem: we should definitely account the
> activation for second chance. But move_pages_to_lru is also used for
> reactivation on the active lru, to protect executable code sections. So I
> suggest adding another parameter to tell whether the call is a reactivation.

This looks incorrect to me. We *should* count reactivations/rotations
on the active list toward nonresident age.

The nonresident age tracks the number of in-memory references in order
to later calculate the (minimum) reuse distance of refaulting pages.

If a page on the active list gets reactivated due to a reference, that
reference contributes to the distance of yet-to-refault pages.
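
The relationship between nonresident age and reuse distance can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct lru_model and the model_* helpers are hypothetical names invented for illustration, not kernel APIs, and the activation threshold (workingset_size) is a simplified stand-in for whatever the kernel compares the distance against.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct lru_model {
	uint64_t nonresident_age;  /* advances on evictions and in-memory references */
	uint64_t workingset_size;  /* assumed threshold: pages the workingset may occupy */
};

/* A page was seen referenced in memory (activation or rotation). */
static void model_age_nonresident(struct lru_model *m, uint64_t nr_pages)
{
	m->nonresident_age += nr_pages;
}

/* Evict a page: advance the age and return the snapshot stored for it. */
static uint64_t model_evict(struct lru_model *m, uint64_t nr_pages)
{
	model_age_nonresident(m, nr_pages);
	return m->nonresident_age;
}

/* On refault, the distance is how far the age moved since the eviction. */
static uint64_t model_refault_distance(const struct lru_model *m,
				       uint64_t snapshot)
{
	assert(m->nonresident_age >= snapshot);
	return m->nonresident_age - snapshot;
}

/* Treat the refaulting page as hot only if the distance fits in memory. */
static bool model_should_activate(const struct lru_model *m, uint64_t snapshot)
{
	return model_refault_distance(m, snapshot) <= m->workingset_size;
}
```

In this model, every reference counted through model_age_nonresident() between an eviction and the matching refault lengthens that page's distance, which is why skipping reactivations would make refaulting pages look hotter than they are.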

Johannes Weiner Oct. 1, 2021, 4:43 p.m. UTC | #2
On Wed, Sep 22, 2021 at 02:27:24PM +0000, 解 咏梅 wrote:
> But currently, move_pages_to_lru doesn't increase nonresident age for active rotation.
> Back to inactive age: the VM ONLY cares about the pages that left the inactive lru, AKA activation and reclaiming.
> Anyway, reactivation is a rare case, so whether or not it contributes to nonresident age is OK with me.
> 
> But I am interested in the logic for guessing whether pages will be referenced again in the future.
> If active reactivation matters to nonresident age, why not active rotation? Currently it doesn't.

Can you point me to the code you're referring to? Looking at
move_pages_to_lru(), any pages with PageActive() set count toward the
non-resident age. That means activations from the inactive list, as
well as rotations on the active list, increase the nonresident age.
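
To make the accounting rule concrete, here is a minimal userspace sketch of the check in move_pages_to_lru() as it appears in the diff, with and without the proposed parameter. struct model_page and model_age_increment() are hypothetical names made up for this illustration; the real function also handles locking, freeing, and list manipulation, none of which is modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only what the rule needs. */
struct model_page {
	bool active;            /* models PageActive(page) */
	unsigned int nr_pages;  /* models thp_nr_pages(page) */
};

/*
 * Sum the nonresident-age increment for one putback list.
 * skip_reactivation models the "reactivation" parameter proposed in the
 * patch; passing false reproduces the existing rule, where every page
 * with the active flag set counts.
 */
static unsigned long model_age_increment(const struct model_page *pages,
					 size_t n, bool skip_reactivation)
{
	unsigned long age = 0;

	for (size_t i = 0; i < n; i++) {
		if (pages[i].active && !skip_reactivation)
			age += pages[i].nr_pages;
	}
	return age;
}
```

With the existing rule, an l_active list holding one base page and one 512-page huge page advances the age by 513; with the proposed flag set for that list it advances by 0, which is the behavior change being questioned in this reply.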

As to your question which one is right: the original workingset patch
was wrong not to count activations and reactivations. If we see a page
referenced in memory, it means it's hotter than the page that's not
refaulting -> nonresident age increases.

So the code as it is now looks correct to me.

Thanks

Johannes Weiner Oct. 7, 2021, 4:34 p.m. UTC | #3
Hello Yongmei,

On Thu, Oct 07, 2021 at 02:35:44PM +0000, 解 咏梅 wrote:
> You're right. I checked commit 264e90cc07f177adec17ee7cc154ddaa132f0b2d
> 
> I said that because, 1 or 2 years ago, the VM used reclaim_stat's rotation/scan as the factor to balance the ratio between fs page cache and anonymous pages.
> It used a side effect of workingset activation (which raised the rotation count) to challenge the other type of memory: file vs anon.
> And in shrink_active_list, deactivation also contributed to the rotation count.
> 
> So I concluded that active list rotation refers to deactivation.
> I checked commit #264e90c: only executable code sections contribute to active list rotation. Thank you for pointing out my misunderstanding.
> But deactivation contributes to the PGROTATED event. I'm still sort of confused :(

Yeah PGROTATED is a little strange! I'm not sure people use it much.

> One more question:
> why do activation/deactivation/deactive_fn no longer contribute to lru cost? Because those are CPU intensive, not I/O intensive, right?
> 
> So for now, if we'd like to balance the ratio between fs page cache and anonymous pages, we only take I/O (in allocation path and reclaim path) into consideration.

Yes, correct. The idea is to have the algorithm serve the workingset
with the least amount of paging IO.

Actually, the first version of the patch accounted for CPU overhead,
since anon and file do have different aging rules with different CPU
requirements. However, it didn't seem to matter in my testing, and
it's a bit awkward to compare IO cost and CPU cost since it depends
heavily on the underlying hardware, so I deleted that code. It's
possible we may need to add it back if a workload shows up that cares.
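
As a rough illustration of balancing by IO cost alone, the sketch below keeps one cost counter per LRU type and turns the two counters into relative scan weights, so the type that has caused more paging IO gets scanned less. Everything here is an assumption made for illustration: struct lru_cost, model_note_io_cost(), and model_scan_weights() are hypothetical names, and the inverse-proportional formula is a deliberate simplification rather than the kernel's actual calculation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-lruvec IO cost counters (illustrative only). */
struct lru_cost {
	uint64_t anon;  /* e.g. swap-ins and swap-outs */
	uint64_t file;  /* e.g. refault reads and reclaim-driven writeback */
};

struct scan_weights {
	uint64_t anon;
	uint64_t file;
};

/*
 * Record paging IO against one LRU type. Activations and deactivations
 * are intentionally not recorded: in this model they cost CPU, not IO.
 */
static void model_note_io_cost(struct lru_cost *c, bool file, uint64_t nr_pages)
{
	if (file)
		c->file += nr_pages;
	else
		c->anon += nr_pages;
}

/*
 * Weight each type inversely to its share of the IO cost. The +1 terms
 * avoid division by zero before any cost has been recorded.
 */
static struct scan_weights model_scan_weights(const struct lru_cost *c)
{
	uint64_t total = c->anon + c->file;
	struct scan_weights w;

	w.anon = (total + 1) * 100 / (c->anon + 1);
	w.file = (total + 1) * 100 / (c->file + 1);
	return w;
}
```

With equal costs the two weights are equal; if anon reclaim has caused three times the IO of file reclaim, the file weight comes out larger and file pages would take more of the scan pressure.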

> From my observation, the VM doesn't count the fs periodic dirty flush as I/O cost.

Correct.

The thinking is: writeback IO needs to happen with or without reclaim,
because of data integrity. Whereas swapping only happens under memory
pressure - without anon reclaim we would not do any swap writes.

Of course, reclaim can trigger accelerated dirty flushing, which
*could* result in increased IO and thus higher LRU cost: two buffered
writes to the same page within the dirty expiration window would
result in one disk write but could result in two under pressure. It's
a pain to track this properly, though, so the compromise is to count
it only when kswapd gets in enough trouble that it needs to flush pages
one by one (NR_VMSCAN_WRITE). This seems to work reasonably well in practice.
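
The "one disk write versus two" case can be worked through with a toy model of a single page. This is a sketch under assumptions: model_disk_writes() is a hypothetical helper, time is in arbitrary seconds, the flusher is assumed to write the page exactly when it has been dirty for the whole expiration window, and an expire value of 0 stands in for reclaim flushing each dirty page immediately.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count disk writes for one page, given the sorted times at which
 * buffered writes dirtied it. A flush happens `expire` seconds after
 * the page first became dirty; buffered writes that land before that
 * flush are absorbed into the same disk write.
 */
static unsigned int model_disk_writes(const unsigned int *dirtied_at,
				      size_t n, unsigned int expire)
{
	unsigned int writes = 0;
	size_t i = 0;

	while (i < n) {
		unsigned int flush_at = dirtied_at[i] + expire;

		writes++;
		i++;
		/* Skip buffered writes covered by this flush. */
		while (i < n && dirtied_at[i] < flush_at)
			i++;
	}
	return writes;
}
```

For buffered writes at t=0 and t=10 with a 30-second window, the model yields one disk write; with immediate flushing it yields two, which is the extra IO that is hard to attribute precisely.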

> Looking forward to your reply!
> 
> Thanks again. I have a clearer view of the VM now :)
> 
> 
> It is the Chinese national holiday; sorry for my late response.

Happy Golden Week!

Johannes

Patch

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 74296c2d1fed..85ccafcd4912 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2152,7 +2152,8 @@  static int too_many_isolated(struct pglist_data *pgdat, int file,
  * Returns the number of pages moved to the given lruvec.
  */
 static unsigned int move_pages_to_lru(struct lruvec *lruvec,
-				      struct list_head *list)
+				      struct list_head *list,
+				      bool reactivation)
 {
 	int nr_pages, nr_moved = 0;
 	LIST_HEAD(pages_to_free);
@@ -2203,7 +2204,7 @@  static unsigned int move_pages_to_lru(struct lruvec *lruvec,
 		add_page_to_lru_list(page, lruvec);
 		nr_pages = thp_nr_pages(page);
 		nr_moved += nr_pages;
-		if (PageActive(page))
+		if (PageActive(page) && !reactivation)
 			workingset_age_nonresident(lruvec, nr_pages);
 	}
 
@@ -2281,7 +2282,7 @@  shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	nr_reclaimed = shrink_page_list(&page_list, pgdat, sc, &stat, false);
 
 	spin_lock_irq(&lruvec->lru_lock);
-	move_pages_to_lru(lruvec, &page_list);
+	move_pages_to_lru(lruvec, &page_list, false);
 
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	item = current_is_kswapd() ? PGSTEAL_KSWAPD : PGSTEAL_DIRECT;
@@ -2418,8 +2419,8 @@  static void shrink_active_list(unsigned long nr_to_scan,
 	 */
 	spin_lock_irq(&lruvec->lru_lock);
 
-	nr_activate = move_pages_to_lru(lruvec, &l_active);
-	nr_deactivate = move_pages_to_lru(lruvec, &l_inactive);
+	nr_activate = move_pages_to_lru(lruvec, &l_active, true);
+	nr_deactivate = move_pages_to_lru(lruvec, &l_inactive, false);
 	/* Keep all free pages in l_active list */
 	list_splice(&l_inactive, &l_active);