mm: fix LRU balancing effect of new transparent huge pages

Message ID 20200509141946.158892-1-shakeelb@google.com
State New

Commit Message

Shakeel Butt May 9, 2020, 2:19 p.m. UTC
From: Johannes Weiner <hannes@cmpxchg.org>

Currently, THP are counted as single pages until they are split right
before being swapped out. However, at that point the VM is already in
the middle of reclaim, and adjusting the LRU balance then is useless.

Always account THP by the number of basepages, and remove the fixup
from the splitting path.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Shakeel Butt <shakeelb@google.com>
---
Revived the patch from https://lore.kernel.org/patchwork/patch/685703/

 mm/swap.c | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

Comments

Andrew Morton May 11, 2020, 9:11 p.m. UTC | #1
On Sat,  9 May 2020 07:19:46 -0700 Shakeel Butt <shakeelb@google.com> wrote:

> Currently, THP are counted as single pages until they are split right
> before being swapped out. However, at that point the VM is already in
> the middle of reclaim, and adjusting the LRU balance then is useless.
> 
> Always account THP by the number of basepages, and remove the fixup
> from the splitting path.

Confused.  What kernel is this applicable to?

Shakeel Butt May 11, 2020, 9:38 p.m. UTC | #2
On Mon, May 11, 2020 at 2:11 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Sat,  9 May 2020 07:19:46 -0700 Shakeel Butt <shakeelb@google.com> wrote:
>
> > Currently, THP are counted as single pages until they are split right
> > before being swapped out. However, at that point the VM is already in
> > the middle of reclaim, and adjusting the LRU balance then is useless.
> >
> > Always account THP by the number of basepages, and remove the fixup
> > from the splitting path.
>
> Confused.  What kernel is this applicable to?

It is still applicable to the latest Linux kernel. Basically, the
lruvec->reclaim_stat.recent_[scanned|rotated] counters are used as a
heuristic in get_scan_count() to decide how much of the file and anon
LRUs the current reclaim should scan. Without this patch, swap.c counts
a huge page as a single page when updating the
recent_[scanned|rotated] counters, while vmscan.c correctly accounts
for it by its number of base pages.
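
The accounting difference described above can be illustrated with a small
userspace sketch. This is not kernel code: struct reclaim_stat_sketch and
update_reclaim_stat_sketch() are hypothetical stand-ins for
struct zone_reclaim_stat and the patched update_page_reclaim_stat(), and
the figure of 512 base pages per THP assumes a 2 MB huge page with 4 KB
base pages.

```c
/*
 * Hypothetical userspace stand-in for the kernel's zone_reclaim_stat.
 * Index 0 is the anon LRU, index 1 is the file LRU.
 */
struct reclaim_stat_sketch {
	unsigned long recent_scanned[2];
	unsigned long recent_rotated[2];
};

/*
 * Sketch of the patched helper: every update is weighted by the number
 * of base pages, so a THP counts as nr_pages pages rather than as one.
 */
static void update_reclaim_stat_sketch(struct reclaim_stat_sketch *rs,
				       int file, int rotated, int nr_pages)
{
	rs->recent_scanned[file] += nr_pages;
	if (rotated)
		rs->recent_rotated[file] += nr_pages;
}

/* Assumption: 2 MB THP / 4 KB base page = 512 base pages. */
enum { THP_NR_PAGES_SKETCH = 512 };
```

Activating one anon THP with nr_pages = THP_NR_PAGES_SKETCH adds 512 to
both anon counters, whereas passing 1 (the old behaviour in swap.c) adds
only 1, which under-weights THP activity relative to what vmscan.c
records for the same page.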

Andrew Morton May 11, 2020, 9:58 p.m. UTC | #3
On Mon, 11 May 2020 14:38:23 -0700 Shakeel Butt <shakeelb@google.com> wrote:

> On Mon, May 11, 2020 at 2:11 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Sat,  9 May 2020 07:19:46 -0700 Shakeel Butt <shakeelb@google.com> wrote:
> >
> > > Currently, THP are counted as single pages until they are split right
> > > before being swapped out. However, at that point the VM is already in
> > > the middle of reclaim, and adjusting the LRU balance then is useless.
> > >
> > > Always account THP by the number of basepages, and remove the fixup
> > > from the splitting path.
> >
> > Confused.  What kernel is this applicable to?
> 
> It is still applicable to the latest Linux kernel.

The patch has

> @@ -288,7 +288,7 @@ static void __activate_page(struct page *page, struct lruvec *lruvec,
>  
>  		__count_vm_events(PGACTIVATE, nr_pages);
>  		__count_memcg_events(lruvec_memcg(lruvec), PGACTIVATE, nr_pages);
> -		update_page_reclaim_stat(lruvec, file, 1);
> +		update_page_reclaim_stat(lruvec, file, 1, nr_pages);
>  	}
>  }

but current mainline is quite different:

static void __activate_page(struct page *page, struct lruvec *lruvec,
			    void *arg)
{
	if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
		int file = page_is_file_lru(page);
		int lru = page_lru_base_type(page);

		del_page_from_lru_list(page, lruvec, lru);
		SetPageActive(page);
		lru += LRU_ACTIVE;
		add_page_to_lru_list(page, lruvec, lru);
		trace_mm_lru_activate(page);

		__count_vm_event(PGACTIVATE);
		update_page_reclaim_stat(lruvec, file, 1);
	}
}

q:/usr/src/linux-5.7-rc5> patch -p1 --dry-run < ~/x.txt
checking file mm/swap.c
Hunk #2 FAILED at 288.
Hunk #3 FAILED at 546.
Hunk #4 FAILED at 564.
Hunk #5 FAILED at 590.
Hunk #6 succeeded at 890 (offset -9 lines).
Hunk #7 succeeded at 915 (offset -9 lines).
Hunk #8 succeeded at 958 with fuzz 2 (offset -10 lines).
4 out of 8 hunks FAILED

Shakeel Butt May 11, 2020, 10:04 p.m. UTC | #4
On Mon, May 11, 2020 at 2:58 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 11 May 2020 14:38:23 -0700 Shakeel Butt <shakeelb@google.com> wrote:
>
> > On Mon, May 11, 2020 at 2:11 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > On Sat,  9 May 2020 07:19:46 -0700 Shakeel Butt <shakeelb@google.com> wrote:
> > >
> > > > Currently, THP are counted as single pages until they are split right
> > > > before being swapped out. However, at that point the VM is already in
> > > > the middle of reclaim, and adjusting the LRU balance then is useless.
> > > >
> > > > Always account THP by the number of basepages, and remove the fixup
> > > > from the splitting path.
> > >
> > > Confused.  What kernel is this applicable to?
> >
> > It is still applicable to the latest Linux kernel.
>
> The patch has
>
> > @@ -288,7 +288,7 @@ static void __activate_page(struct page *page, struct lruvec *lruvec,
> >
> >               __count_vm_events(PGACTIVATE, nr_pages);
> >               __count_memcg_events(lruvec_memcg(lruvec), PGACTIVATE, nr_pages);
> > -             update_page_reclaim_stat(lruvec, file, 1);
> > +             update_page_reclaim_stat(lruvec, file, 1, nr_pages);
> >       }
> >  }
>
> but current mainline is quite different:
>
> static void __activate_page(struct page *page, struct lruvec *lruvec,
>                             void *arg)
> {
>         if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
>                 int file = page_is_file_lru(page);
>                 int lru = page_lru_base_type(page);
>
>                 del_page_from_lru_list(page, lruvec, lru);
>                 SetPageActive(page);
>                 lru += LRU_ACTIVE;
>                 add_page_to_lru_list(page, lruvec, lru);
>                 trace_mm_lru_activate(page);
>
>                 __count_vm_event(PGACTIVATE);
>                 update_page_reclaim_stat(lruvec, file, 1);
>         }
> }
>
> q:/usr/src/linux-5.7-rc5> patch -p1 --dry-run < ~/x.txt
> checking file mm/swap.c
> Hunk #2 FAILED at 288.
> Hunk #3 FAILED at 546.
> Hunk #4 FAILED at 564.
> Hunk #5 FAILED at 590.
> Hunk #6 succeeded at 890 (offset -9 lines).
> Hunk #7 succeeded at 915 (offset -9 lines).
> Hunk #8 succeeded at 958 with fuzz 2 (offset -10 lines).
> 4 out of 8 hunks FAILED
>

Oh sorry, my mistake. It depends on the first two patches at [1].
Basically, I replaced the third patch of the series with this one. I
should have resent them all together.

[1] http://lkml.kernel.org/r/20200508212215.181307-1-shakeelb@google.com

Patch

diff --git a/mm/swap.c b/mm/swap.c
index 4eb179ee0b72..b75c0ce90418 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -262,14 +262,14 @@  void rotate_reclaimable_page(struct page *page)
 	}
 }
 
-static void update_page_reclaim_stat(struct lruvec *lruvec,
-				     int file, int rotated)
+static void update_page_reclaim_stat(struct lruvec *lruvec, int file,
+				     int rotated, int nr_pages)
 {
 	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
 
-	reclaim_stat->recent_scanned[file]++;
+	reclaim_stat->recent_scanned[file] += nr_pages;
 	if (rotated)
-		reclaim_stat->recent_rotated[file]++;
+		reclaim_stat->recent_rotated[file] += nr_pages;
 }
 
 static void __activate_page(struct page *page, struct lruvec *lruvec,
@@ -288,7 +288,7 @@  static void __activate_page(struct page *page, struct lruvec *lruvec,
 
 		__count_vm_events(PGACTIVATE, nr_pages);
 		__count_memcg_events(lruvec_memcg(lruvec), PGACTIVATE, nr_pages);
-		update_page_reclaim_stat(lruvec, file, 1);
+		update_page_reclaim_stat(lruvec, file, 1, nr_pages);
 	}
 }
 
@@ -546,7 +546,7 @@  static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec,
 		__count_vm_events(PGDEACTIVATE, nr_pages);
 		__count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, nr_pages);
 	}
-	update_page_reclaim_stat(lruvec, file, 0);
+	update_page_reclaim_stat(lruvec, file, 0, nr_pages);
 }
 
 static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec,
@@ -564,7 +564,7 @@  static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec,
 
 		__count_vm_events(PGDEACTIVATE, nr_pages);
 		__count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, nr_pages);
-		update_page_reclaim_stat(lruvec, file, 0);
+		update_page_reclaim_stat(lruvec, file, 0, nr_pages);
 	}
 }
 
@@ -590,7 +590,7 @@  static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec,
 
 		__count_vm_events(PGLAZYFREE, nr_pages);
 		__count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE, nr_pages);
-		update_page_reclaim_stat(lruvec, 1, 0);
+		update_page_reclaim_stat(lruvec, 1, 0, nr_pages);
 	}
 }
 
@@ -899,8 +899,6 @@  EXPORT_SYMBOL(__pagevec_release);
 void lru_add_page_tail(struct page *page, struct page *page_tail,
 		       struct lruvec *lruvec, struct list_head *list)
 {
-	const int file = 0;
-
 	VM_BUG_ON_PAGE(!PageHead(page), page);
 	VM_BUG_ON_PAGE(PageCompound(page_tail), page);
 	VM_BUG_ON_PAGE(PageLRU(page_tail), page);
@@ -926,9 +924,6 @@  void lru_add_page_tail(struct page *page, struct page *page_tail,
 		add_page_to_lru_list_tail(page_tail, lruvec,
 					  page_lru(page_tail));
 	}
-
-	if (!PageUnevictable(page))
-		update_page_reclaim_stat(lruvec, file, PageActive(page_tail));
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
@@ -973,7 +968,7 @@  static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
 	if (page_evictable(page)) {
 		lru = page_lru(page);
 		update_page_reclaim_stat(lruvec, page_is_file_lru(page),
-					 PageActive(page));
+					 PageActive(page), nr_pages);
 		if (was_unevictable)
 			__count_vm_events(UNEVICTABLE_PGRESCUED, nr_pages);
 	} else {