[RFC,1/1] mm: only use old generation and stable tier for madv_pageout

Message ID 20231013113028.2720996-1-zhaoyang.huang@unisoc.com (mailing list archive)
State New

Commit Message

zhaoyang.huang Oct. 13, 2023, 11:30 a.m. UTC
From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

Dropping pages of young generation or unstable tier via madvise could
make the system experience heavy page thrashing and IO pressure.
Furthermore, it could cause the tier's PID controller to fail, which
affects normal reclaim. I would like to suggest skipping these pages in
madv_pageout.

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
---
 include/linux/swap.h |  1 +
 mm/madvise.c         | 12 ++++++++++++
 mm/vmscan.c          |  3 ++-
 3 files changed, 15 insertions(+), 1 deletion(-)
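
For readers who want to reason about the proposed check outside the kernel,
the skip condition from the mm/madvise.c hunk can be restated as a small
user-space sketch. Everything named sketch_* below is hypothetical and exists
only for illustration. The generation count of 4 and the modulo mapping from
sequence number to generation index are assumptions modelled on MAX_NR_GENS
and lru_gen_from_seq(). The tier threshold that get_tier_idx() would return is
passed in as a plain integer rather than computed.

```c
#include <stdbool.h>

/* Assumption: stands in for the kernel's MAX_NR_GENS, taken to be 4 here. */
#define SKETCH_MAX_NR_GENS 4

/*
 * Simplified stand-in for lru_gen_from_seq(): map a sequence number to a
 * generation index by wrapping it modulo the number of generations.
 */
int sketch_gen_from_seq(unsigned long seq)
{
	return (int)(seq % SKETCH_MAX_NR_GENS);
}

/*
 * Hypothetical restatement of the patch's condition. A folio is skipped by
 * madv_pageout when its generation index is more than one above the index
 * of the oldest generation (min_seq), or when its tier is above the tier
 * threshold (tier_st, which the patch obtains from get_tier_idx()).
 */
bool sketch_skip_pageout(int gen, unsigned long min_seq, int tier, int tier_st)
{
	return gen > sketch_gen_from_seq(min_seq) + 1 || tier > tier_st;
}
```

With min_seq = 4 (index 0) and tier_st = 1, a folio with gen = 3 and tier = 0
is skipped, a folio with gen = 1 and tier = 0 is not, and a folio with gen = 0
and tier = 2 is skipped because of its tier. Note that, as in the patch, the
comparison is made on wrapped generation indices rather than on sequence
numbers.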

Comments

Matthew Wilcox (Oracle) Oct. 13, 2023, 3:38 p.m. UTC | #1
On Fri, Oct 13, 2023 at 07:30:28PM +0800, zhaoyang.huang wrote:
> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> 
> Dropping pages of young generation or unstable tier via madvise could
> make the system experience heavy page thrashing and IO pressure.

... then userspace should not do that?

> @@ -5091,6 +5091,7 @@ static int get_tier_idx(struct lruvec *lruvec, int type)
>  
>  	return tier - 1;
>  }
> +EXPORT_SYMBOL_GPL(get_tier_idx);

Why would this need to be exported to modules in order to be used by
madvise?  Is this patch just a trojan horse so you can use get_tier_idx
in your own module?  NAK.

Patch

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 493487ed7c38..d09c859ccc45 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -496,6 +496,7 @@  extern int init_swap_address_space(unsigned int type, unsigned long nr_pages);
 extern void exit_swap_address_space(unsigned int type);
 extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
 sector_t swap_page_sector(struct page *page);
+extern int get_tier_idx(struct lruvec *lruvec, int type);
 
 static inline void put_swap_device(struct swap_info_struct *si)
 {
diff --git a/mm/madvise.c b/mm/madvise.c
index 4dded5d27e7e..324d76096ca5 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -452,6 +452,18 @@  static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		if (!folio || folio_is_zone_device(folio))
 			continue;
 
+		if (lru_gen_enabled() && pageout) {
+			int gen = folio_lru_gen(folio);
+			struct lruvec *lruvec = folio_lruvec(folio);
+			int type = folio_is_file_lru(folio);
+			int refs = folio_lru_refs(folio);
+			int tier = lru_tier_from_refs(refs);
+			int tier_st = get_tier_idx(lruvec, type);
+
+			if (gen > lru_gen_from_seq(lruvec->lrugen.min_seq[type]) + 1
+				|| tier > tier_st)
+				continue;
+		}
 		/*
 		 * Creating a THP page is expensive so split it only if we
 		 * are sure it's worth. Split it if we are only owner.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6f13394b112e..16900a8c13e0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5072,7 +5072,7 @@  static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 	return isolated || !remaining ? scanned : 0;
 }
 
-static int get_tier_idx(struct lruvec *lruvec, int type)
+int get_tier_idx(struct lruvec *lruvec, int type)
 {
 	int tier;
 	struct ctrl_pos sp, pv;
@@ -5091,6 +5091,7 @@  static int get_tier_idx(struct lruvec *lruvec, int type)
 
 	return tier - 1;
 }
+EXPORT_SYMBOL_GPL(get_tier_idx);
 
 static int get_type_to_scan(struct lruvec *lruvec, int swappiness, int *tier_idx)
 {