[09/14] mm: deactivations shouldn't bias the LRU balance

Message ID 20200520232525.798933-10-hannes@cmpxchg.org (mailing list archive)
State New, archived
Series mm: balance LRU lists based on relative thrashing v2

Commit Message

Johannes Weiner May 20, 2020, 11:25 p.m. UTC
Operations like MADV_FREE, FADV_DONTNEED, etc. currently move any
affected active pages to the inactive list to accelerate their reclaim
(good) but also steer page reclaim toward that LRU type, or away from
the other (bad).

The reason why this is undesirable is that such operations are not
part of the regular page aging cycle, but rather a fluke that doesn't
say much about the remaining pages on that list; they might all be in
heavy use, and once the chunk of easy victims has been purged, the VM
continues to apply elevated pressure on those remaining hot pages. The
other LRU, meanwhile, might have easily reclaimable pages, and there
was never a need to steer away from it in the first place.
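
For reference, these hints come from userspace roughly as follows; a
minimal sketch in C (error handling omitted, and the file path is
just a placeholder):

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t len = 1 << 20;

	/* Anonymous mapping whose pages we then mark disposable. */
	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	buf[0] = 1;	/* touch a page so there is something to free */

	/* MADV_FREE: pages become lazily reclaimable; this is one of
	 * the deactivating operations discussed above. */
	madvise(buf, len, MADV_FREE);

	/* POSIX_FADV_DONTNEED: deactivate/drop page cache for a file
	 * range (len 0 means "to end of file"). */
	int fd = open("/tmp/somefile", O_RDONLY);
	posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
	close(fd);

	munmap(buf, len);
	return 0;
}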

As the previous patch outlined, we should focus on recording actually
observed cost to steer the balance rather than speculating about the
potential value of one LRU list over the other. In that spirit, leave
explicitly deactivated pages to the LRU algorithm to pick up, and let
rotations decide which list is the easiest to reclaim.
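
To illustrate the intended model, here is a toy userspace program
(not kernel code; the sample costs are made up and the kernel's real
balancing formula differs): each list is charged only for rotations
reclaim actually observed, and scan pressure shifts toward the
cheaper list.

#include <stdio.h>

struct lru {
	const char *name;
	unsigned long rotated;	/* observed rotations = reclaim cost */
};

int main(void)
{
	struct lru anon = { "anon", 300 };	/* hypothetical costs */
	struct lru file = { "file", 100 };
	unsigned long total = anon.rotated + file.rotated;

	/* The list that cost less to reclaim from gets more of the
	 * scan pressure; deactivations no longer factor in at all. */
	printf("scan share: anon=%lu%% file=%lu%%\n",
	       100 * file.rotated / total,
	       100 * anon.rotated / total);
	return 0;
}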

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 mm/swap.c | 4 ----
 1 file changed, 4 deletions(-)

Comments

Qian Cai May 22, 2020, 1:33 p.m. UTC | #1
On Wed, May 20, 2020 at 07:25:20PM -0400, Johannes Weiner wrote:
> Operations like MADV_FREE, FADV_DONTNEED, etc. currently move any
> affected active pages to the inactive list to accelerate their reclaim
> (good) but also steer page reclaim toward that LRU type, or away from
> the other (bad).
> 
> The reason why this is undesirable is that such operations are not
> part of the regular page aging cycle, but rather a fluke that doesn't
> say much about the remaining pages on that list; they might all be in
> heavy use, and once the chunk of easy victims has been purged, the VM
> continues to apply elevated pressure on those remaining hot pages. The
> other LRU, meanwhile, might have easily reclaimable pages, and there
> was never a need to steer away from it in the first place.
> 
> As the previous patch outlined, we should focus on recording actually
> observed cost to steer the balance rather than speculating about the
> potential value of one LRU list over the other. In that spirit, leave
> explicitly deactivated pages to the LRU algorithm to pick up, and let
> rotations decide which list is the easiest to reclaim.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Minchan Kim <minchan@kernel.org>
> Acked-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/swap.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 5d62c5a0c651..d7912bfb597f 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -515,14 +515,12 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec,
>  
>  	if (active)
>  		__count_vm_event(PGDEACTIVATE);
> -	lru_note_cost(lruvec, !file, hpage_nr_pages(page));
>  }
>
[]

mm/swap.c: In function 'lru_deactivate_file_fn':
mm/swap.c:504:11: warning: variable 'file' set but not used
[-Wunused-but-set-variable]
  int lru, file;
           ^~~~  

This?

diff --git a/mm/swap.c b/mm/swap.c
index fedf5847dfdb..9c38c1b545af 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -501,7 +501,7 @@ void lru_cache_add_active_or_unevictable(struct page *page,
 static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec,
 			      void *arg)
 {
-	int lru, file;
+	int lru;
 	bool active;
 
 	if (!PageLRU(page))
@@ -515,7 +515,6 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec,
 		return;
 
 	active = PageActive(page);
-	file = page_is_file_lru(page);
 	lru = page_lru_base_type(page);
 
 	del_page_from_lru_list(page, lruvec, lru + active);
Johannes Weiner May 26, 2020, 3:55 p.m. UTC | #2
On Fri, May 22, 2020 at 09:33:35AM -0400, Qian Cai wrote:
> On Wed, May 20, 2020 at 07:25:20PM -0400, Johannes Weiner wrote:
> > Operations like MADV_FREE, FADV_DONTNEED, etc. currently move any
> > affected active pages to the inactive list to accelerate their reclaim
> > (good) but also steer page reclaim toward that LRU type, or away from
> > the other (bad).
> > 
> > The reason why this is undesirable is that such operations are not
> > part of the regular page aging cycle, but rather a fluke that doesn't
> > say much about the remaining pages on that list; they might all be in
> > heavy use, and once the chunk of easy victims has been purged, the VM
> > continues to apply elevated pressure on those remaining hot pages. The
> > other LRU, meanwhile, might have easily reclaimable pages, and there
> > was never a need to steer away from it in the first place.
> > 
> > As the previous patch outlined, we should focus on recording actually
> > observed cost to steer the balance rather than speculating about the
> > potential value of one LRU list over the other. In that spirit, leave
> > explicitly deactivated pages to the LRU algorithm to pick up, and let
> > rotations decide which list is the easiest to reclaim.
> > 
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > Acked-by: Minchan Kim <minchan@kernel.org>
> > Acked-by: Michal Hocko <mhocko@suse.com>
> > ---
> >  mm/swap.c | 4 ----
> >  1 file changed, 4 deletions(-)
> > 
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 5d62c5a0c651..d7912bfb597f 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -515,14 +515,12 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec,
> >  
> >  	if (active)
> >  		__count_vm_event(PGDEACTIVATE);
> > -	lru_note_cost(lruvec, !file, hpage_nr_pages(page));
> >  }
> >
> []
> 
> mm/swap.c: In function 'lru_deactivate_file_fn':
> mm/swap.c:504:11: warning: variable 'file' set but not used
> [-Wunused-but-set-variable]
>   int lru, file;
>            ^~~~ 

Oops, my gcc doesn't warn about that, but yes, it's clearly dead code.

$ make mm/swap.o
  GEN     Makefile
  CALL    /home/hannes/src/linux/linux/scripts/checksyscalls.sh
  CALL    /home/hannes/src/linux/linux/scripts/atomic/check-atomics.sh
  DESCEND  objtool
  CC      mm/swap.o
$
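
The warning probably comes from a W=1 build; the kernel disables
-Wunused-but-set-variable by default, so reproducing it should take
something like:

$ make W=1 mm/swap.o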
 
> This?
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index fedf5847dfdb..9c38c1b545af 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -501,7 +501,7 @@ void lru_cache_add_active_or_unevictable(struct page *page,
>  static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec,
>  			      void *arg)
>  {
> -	int lru, file;
> +	int lru;
>  	bool active;
>  
>  	if (!PageLRU(page))
> @@ -515,7 +515,6 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec,
>  		return;
>  
>  	active = PageActive(page);
> -	file = page_is_file_lru(page);
>  	lru = page_lru_base_type(page);
>  
>  	del_page_from_lru_list(page, lruvec, lru + active);

Looks good, and it appears Andrew has already queued it. Would you
mind providing the Signed-off-by: for it?

Patch

diff --git a/mm/swap.c b/mm/swap.c
index 5d62c5a0c651..d7912bfb597f 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -515,14 +515,12 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec,
 
 	if (active)
 		__count_vm_event(PGDEACTIVATE);
-	lru_note_cost(lruvec, !file, hpage_nr_pages(page));
 }
 
 static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec,
 			    void *arg)
 {
 	if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) {
-		int file = page_is_file_lru(page);
 		int lru = page_lru_base_type(page);
 
 		del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
@@ -531,7 +529,6 @@ static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec,
 		add_page_to_lru_list(page, lruvec, lru);
 
 		__count_vm_events(PGDEACTIVATE, hpage_nr_pages(page));
-		lru_note_cost(lruvec, !file, hpage_nr_pages(page));
 	}
 }
 
@@ -556,7 +553,6 @@ static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec,
 
 		__count_vm_events(PGLAZYFREE, hpage_nr_pages(page));
 		count_memcg_page_event(page, PGLAZYFREE);
-		lru_note_cost(lruvec, 0, hpage_nr_pages(page));
 	}
 }