
[v14,08/39] mm/swap: Add folio_mark_accessed()

Message ID 20210715200030.899216-9-willy@infradead.org (mailing list archive)
State New
Series Memory folios: Pagecache edition

Commit Message

Matthew Wilcox July 15, 2021, 7:59 p.m. UTC
Convert mark_page_accessed() to folio_mark_accessed().  It already
operated on the entire compound page, but now we can avoid calling
compound_head() quite so many times.  Shrinks the function from 424
bytes to 295 bytes (a saving of 129 bytes).  The compatibility wrapper
is 30 bytes, and with the 8 bytes for the exported symbol the kernel
shrinks by a net 91 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/swap.h |  3 ++-
 mm/folio-compat.c    |  7 +++++++
 mm/swap.c            | 34 ++++++++++++++++------------------
 3 files changed, 25 insertions(+), 19 deletions(-)

Comments

Gregory Price Oct. 8, 2023, 3:34 p.m. UTC | #1
On Thu, 15 Jul 2021 20:59:59 +0100, Matthew Wilcox wrote
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 989d8f78c256..c7a4c0a5863d 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -352,7 +352,8 @@ extern void lru_note_cost(struct lruvec *lruvec, bool file,
>  			  unsigned int nr_pages);
>  extern void lru_note_cost_page(struct page *);
>  extern void lru_cache_add(struct page *);
> -extern void mark_page_accessed(struct page *);
> +void mark_page_accessed(struct page *);
> +void folio_mark_accessed(struct folio *);
> 
>  extern atomic_t lru_disable_count;
> 
> diff --git a/mm/folio-compat.c b/mm/folio-compat.c
> index 7044fcc8a8aa..a374747ae1c6 100644
> --- a/mm/folio-compat.c
> +++ b/mm/folio-compat.c
> @@ -5,6 +5,7 @@
>   */
> 
>  #include <linux/pagemap.h>
> +#include <linux/swap.h>
> 
>  struct address_space *page_mapping(struct page *page)
>  {
> @@ -41,3 +42,9 @@ bool page_mapped(struct page *page)
>  	return folio_mapped(page_folio(page));
>  }
>  EXPORT_SYMBOL(page_mapped);
> +
> +void mark_page_accessed(struct page *page)
> +{
> +	folio_mark_accessed(page_folio(page));
> +}
> +EXPORT_SYMBOL(mark_page_accessed);
... snip ...
>
> @@ -430,36 +430,34 @@ static void __lru_cache_activate_page(struct page *page)
>   * When a newly allocated page is not yet visible, so safe for non-atomic ops,
>   * __SetPageReferenced(page) may be substituted for mark_page_accessed(page).
>   */
> -void mark_page_accessed(struct page *page)
> +void folio_mark_accessed(struct folio *folio)
>  {
> -	page = compound_head(page);
> -
> -	if (!PageReferenced(page)) {
> -		SetPageReferenced(page);
> -	} else if (PageUnevictable(page)) {
> +	if (!folio_test_referenced(folio)) {
> +		folio_set_referenced(folio);
> +	} else if (folio_test_unevictable(folio)) {

Hi Matthew,

My colleagues and I have been testing some page-tracking algorithms, and
we tried tracking the reference bit using /proc/pid/clear_refs,
/proc/pid/pagemap, and /proc/kpageflags.
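
For reference, this is roughly how we sample the bit for one virtual
address (a minimal sketch, assuming root so pagemap exposes PFNs, 4K
pages, and the KPF_* bit numbers from
include/uapi/linux/kernel-page-flags.h); between samples we clear the
bits by writing "1" to /proc/pid/clear_refs:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if KPF_REFERENCED is set for the page backing vaddr,
 * 0 if clear, -1 on error or if the page is not present. */
static int page_referenced(pid_t pid, uintptr_t vaddr)
{
	char path[64];
	uint64_t ent, flags;
	int fd, ret = -1;

	snprintf(path, sizeof(path), "/proc/%d/pagemap", (int)pid);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	/* pagemap: one u64 per virtual page; bit 63 = present,
	 * bits 0-54 = PFN. */
	if (pread(fd, &ent, 8, (vaddr / 4096) * 8) == 8 && (ent >> 63)) {
		close(fd);
		fd = open("/proc/kpageflags", O_RDONLY);
		if (fd < 0)
			return -1;
		/* kpageflags: one u64 per PFN; KPF_REFERENCED is bit 2. */
		if (pread(fd, &flags, 8, (ent & ((1ULL << 55) - 1)) * 8) == 8)
			ret = (flags >> 2) & 1;
	}
	close(fd);
	return ret;
}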

I'm not 100% sure of the cause, but we're seeing some inconsistencies
when tracking page reference bits, and this seems to be at least one of
the patches involved.

From reading up on folios, it seems like this change prevents each
individual page in a folio from being marked referenced, and instead
prefers to simply mark the entire folio containing the page as accessed.

Looking at the /proc interface, it still reads flags from the individual
page rather than from that page's folio (see below).

Our current suspicion is that since only the folio head gets marked
referenced, any pages inside the folio that aren't the head will
basically never be marked referenced, leading to an inconsistent view.

I'm curious about your thoughts on which of these (if either) is right:

a) should we update kpageflags_read to use page_folio() and report the
   folio (head page) flags,

or

b) should the above change to mark_page_accessed() mark the individual
   page referenced as well as the folio head?

Thanks for your time.
Gregory Price


(relevant fs/proc/page.c code:)


static ssize_t kpageflags_read(struct file *file, char __user *buf,
                             size_t count, loff_t *ppos)
{
... snip ...
                ppage = pfn_to_online_page(pfn);

                if (put_user(stable_page_flags(ppage), out)) {
                        ret = -EFAULT;
                        break;
                }
... snip ...
}


u64 stable_page_flags(struct page *page)
{
... snip ...
        k = page->flags;
... snip ...
}
Matthew Wilcox Oct. 10, 2023, 6:08 p.m. UTC | #2
On Sun, Oct 08, 2023 at 11:34:45AM -0400, Gregory Price wrote:
> On Thu, 15 Jul 2021 20:59:59 +0100, Matthew Wilcox wrote
> > +void mark_page_accessed(struct page *page)
> > +{
> > +	folio_mark_accessed(page_folio(page));
> > +}
> > +EXPORT_SYMBOL(mark_page_accessed);
> ... snip ...
> >
> > @@ -430,36 +430,34 @@ static void __lru_cache_activate_page(struct page *page)
> >   * When a newly allocated page is not yet visible, so safe for non-atomic ops,
> >   * __SetPageReferenced(page) may be substituted for mark_page_accessed(page).
> >   */
> > -void mark_page_accessed(struct page *page)
> > +void folio_mark_accessed(struct folio *folio)
> >  {
> > -	page = compound_head(page);
> > -
> > -	if (!PageReferenced(page)) {
> > -		SetPageReferenced(page);
> > -	} else if (PageUnevictable(page)) {
> > +	if (!folio_test_referenced(folio)) {
> > +		folio_set_referenced(folio);
> > +	} else if (folio_test_unevictable(folio)) {
> 
> Hi Matthew,
> 
> My colleagues and I have been testing some page-tracking algorithms, and
> we tried tracking the reference bit using /proc/pid/clear_refs,
> /proc/pid/pagemap, and /proc/kpageflags.
> 
> I'm not 100% sure of the cause, but we're seeing some inconsistencies
> when tracking page reference bits, and this seems to be at least one of
> the patches involved.
> 
> From reading up on folios, it seems like this change prevents each
> individual page in a folio from being marked referenced, and instead
> prefers to simply mark the entire folio containing the page as accessed.
> 
> Looking at the /proc interface, it still reads flags from the individual
> page rather than from that page's folio (see below).
> 
> Our current suspicion is that since only the folio head gets marked
> referenced, any pages inside the folio that aren't the head will
> basically never be marked referenced, leading to an inconsistent view.
> 
> I'm curious about your thoughts on which of these (if either) is right:
> 
> a) should we update kpageflags_read to use page_folio() and report the
>    folio (head page) flags,
> 
> or
> 
> b) should the above change to mark_page_accessed() mark the individual
>    page referenced as well as the folio head?

Hi Greg, thanks for reaching out.

The referenced bit has been tracked on what is now known as a per-folio
basis since commit 8cb38fabb6bc in 2016.  That is, if you tried to
SetPageReferenced / ClearPageReferenced / test PageReferenced on a tail
page, it would redirect to the head page in order to set/clear/test
the bit in the head page's flags field.
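
Roughly what PAGEFLAG(Referenced, referenced, PF_HEAD) in
include/linux/page-flags.h expands to (a simplified sketch; the real
expansion also carries poisoned-page checks):

static inline int PageReferenced(struct page *page)
{
	return test_bit(PG_referenced, &compound_head(page)->flags);
}

static inline void SetPageReferenced(struct page *page)
{
	set_bit(PG_referenced, &compound_head(page)->flags);
}

static inline void ClearPageReferenced(struct page *page)
{
	clear_bit(PG_referenced, &compound_head(page)->flags);
}

So a tail page's own flags word never carries PG_referenced; anything
reading page->flags directly on a tail page will always see the bit
clear.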

That's not to say that all the code which sets/checks/tests this
really understands that!  We've definitely found bugs during the folio
work where code has mistakenly assumed that operations on a tail
page actually affect that page rather than the whole allocation.

There's also code which is now being exposed to compound pages for the
first time, and is not holding up well.

I hope that's helpful in finding the bug you're chasing.

> Thanks for your time.
> Gregory Price
> 
> 
> (relevant fs/proc/page.c code:)
> 
> 
> static ssize_t kpageflags_read(struct file *file, char __user *buf,
>                              size_t count, loff_t *ppos)
> {
> ... snip ...
>                 ppage = pfn_to_online_page(pfn);
> 
>                 if (put_user(stable_page_flags(ppage), out)) {
>                         ret = -EFAULT;
>                         break;
>                 }
> ... snip ...
> }
> 
> 
> u64 stable_page_flags(struct page *page)
> {
> ... snip ...
>         k = page->flags;
> ... snip ...
> }
Gregory Price Oct. 13, 2023, 4:38 p.m. UTC | #3
On Tue, Oct 10, 2023 at 07:08:15PM +0100, Matthew Wilcox wrote:
> On Sun, Oct 08, 2023 at 11:34:45AM -0400, Gregory Price wrote:
> > a) should we update kpageflags_read to use page_folio() and report the
> >    folio (head page) flags,
> > 
> > or
> > 
> > b) should the above change to mark_page_accessed() mark the individual
> >    page referenced as well as the folio head?
> 
> Hi Greg, thanks for reaching out.
> 
> The referenced bit has been tracked on what is now known as a per-folio
> basis since commit 8cb38fabb6bc in 2016.  That is, if you tried to
> SetPageReferenced / ClearPageReferenced / test PageReferenced on a tail
> page, it would redirect to the head page in order to set/clear/test
> the bit in the head page's flags field.
> 
> That's not to say that all the code which sets/checks/tests this
> really understands that!  We've definitely found bugs during the folio
> work where code has mistakenly assumed that operations on a tail
> page actually affect that page rather than the whole allocation.
> 
> There's also code which is now being exposed to compound pages for the
> first time, and is not holding up well.
> 
> I hope that's helpful in finding the bug you're chasing.
> 

Ah, very interesting.  So the kpageflags code doesn't appear to respect
the head-page flag placement defined in 8cb38fabb6bc.

We tested the hack of a patch at the end of this mail, which swaps in
the folio head's flags for the bits that page-flags.h defines as
per-head.
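
For reference, folio_flags(folio, 0) as used in the hack is just the
head page's flags word (a simplified sketch of the helper from
include/linux/page-flags.h, with its debug checks dropped):

static unsigned long *folio_flags(struct folio *folio, unsigned n)
{
	/* n == 0 is the head page; the per-folio bits live there. */
	return &(&folio->page)[n].flags;
}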

The result was that the referenced bit was almost *never* set on
anonymous pages, no matter where we looked, even with a program
constantly touching every page in a large chunk of memory.

Head scratcher.

Thank you for the response.  We're still very much confused, but it
looks like there are potentially many problems (unless we're
misunderstanding the reference bit).

(cc Naoya, since I saw you were actively working on kpageflags folio
 support; not sure what else you're seeing.)

~Gregory


diff --git a/fs/proc/page.c b/fs/proc/page.c
index 195b077c0fac..8022a41d7899 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -110,6 +110,7 @@ static inline u64 kpf_copy_bit(u64 kflags, int ubit, int kbit)
 u64 stable_page_flags(struct page *page)
 {
        u64 k;
+       u64 fh;
        u64 u;

        /*
@@ -119,7 +120,8 @@ u64 stable_page_flags(struct page *page)
        if (!page)
                return 1 << KPF_NOPAGE;

        k = page->flags;
+       fh = *folio_flags(page_folio(page), 0);
        u = 0;

        /*
@@ -188,20 +190,20 @@ u64 stable_page_flags(struct page *page)
                u |= 1 << KPF_SLAB;

        u |= kpf_copy_bit(k, KPF_ERROR,         PG_error);
-       u |= kpf_copy_bit(k, KPF_DIRTY,         PG_dirty);
+       u |= kpf_copy_bit(fh, KPF_DIRTY,        PG_dirty);
        u |= kpf_copy_bit(k, KPF_UPTODATE,      PG_uptodate);
        u |= kpf_copy_bit(k, KPF_WRITEBACK,     PG_writeback);

-       u |= kpf_copy_bit(k, KPF_LRU,           PG_lru);
-       u |= kpf_copy_bit(k, KPF_REFERENCED,    PG_referenced);
-       u |= kpf_copy_bit(k, KPF_ACTIVE,        PG_active);
+       u |= kpf_copy_bit(fh, KPF_LRU,          PG_lru);
+       u |= kpf_copy_bit(fh, KPF_REFERENCED,   PG_referenced);
+       u |= kpf_copy_bit(fh, KPF_ACTIVE,       PG_active);
        u |= kpf_copy_bit(k, KPF_RECLAIM,       PG_reclaim);

        if (PageSwapCache(page))
                u |= 1 << KPF_SWAPCACHE;
        u |= kpf_copy_bit(k, KPF_SWAPBACKED,    PG_swapbacked);

-       u |= kpf_copy_bit(k, KPF_UNEVICTABLE,   PG_unevictable);
+       u |= kpf_copy_bit(fh, KPF_UNEVICTABLE,  PG_unevictable);
        u |= kpf_copy_bit(k, KPF_MLOCKED,       PG_mlocked);

 #ifdef CONFIG_MEMORY_FAILURE

Patch

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 989d8f78c256..c7a4c0a5863d 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -352,7 +352,8 @@  extern void lru_note_cost(struct lruvec *lruvec, bool file,
 			  unsigned int nr_pages);
 extern void lru_note_cost_page(struct page *);
 extern void lru_cache_add(struct page *);
-extern void mark_page_accessed(struct page *);
+void mark_page_accessed(struct page *);
+void folio_mark_accessed(struct folio *);
 
 extern atomic_t lru_disable_count;
 
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 7044fcc8a8aa..a374747ae1c6 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -5,6 +5,7 @@ 
  */
 
 #include <linux/pagemap.h>
+#include <linux/swap.h>
 
 struct address_space *page_mapping(struct page *page)
 {
@@ -41,3 +42,9 @@  bool page_mapped(struct page *page)
 	return folio_mapped(page_folio(page));
 }
 EXPORT_SYMBOL(page_mapped);
+
+void mark_page_accessed(struct page *page)
+{
+	folio_mark_accessed(page_folio(page));
+}
+EXPORT_SYMBOL(mark_page_accessed);
diff --git a/mm/swap.c b/mm/swap.c
index c3137e4e1cd8..d32007fe23b3 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -390,7 +390,7 @@  static void folio_activate(struct folio *folio)
 }
 #endif
 
-static void __lru_cache_activate_page(struct page *page)
+static void __lru_cache_activate_folio(struct folio *folio)
 {
 	struct pagevec *pvec;
 	int i;
@@ -411,8 +411,8 @@  static void __lru_cache_activate_page(struct page *page)
 	for (i = pagevec_count(pvec) - 1; i >= 0; i--) {
 		struct page *pagevec_page = pvec->pages[i];
 
-		if (pagevec_page == page) {
-			SetPageActive(page);
+		if (pagevec_page == &folio->page) {
+			folio_set_active(folio);
 			break;
 		}
 	}
@@ -430,36 +430,34 @@  static void __lru_cache_activate_page(struct page *page)
  * When a newly allocated page is not yet visible, so safe for non-atomic ops,
  * __SetPageReferenced(page) may be substituted for mark_page_accessed(page).
  */
-void mark_page_accessed(struct page *page)
+void folio_mark_accessed(struct folio *folio)
 {
-	page = compound_head(page);
-
-	if (!PageReferenced(page)) {
-		SetPageReferenced(page);
-	} else if (PageUnevictable(page)) {
+	if (!folio_test_referenced(folio)) {
+		folio_set_referenced(folio);
+	} else if (folio_test_unevictable(folio)) {
 		/*
 		 * Unevictable pages are on the "LRU_UNEVICTABLE" list. But,
 		 * this list is never rotated or maintained, so marking an
 		 * unevictable page accessed has no effect.
 		 */
-	} else if (!PageActive(page)) {
+	} else if (!folio_test_active(folio)) {
 		/*
 		 * If the page is on the LRU, queue it for activation via
 		 * lru_pvecs.activate_page. Otherwise, assume the page is on a
 		 * pagevec, mark it active and it'll be moved to the active
 		 * LRU on the next drain.
 		 */
-		if (PageLRU(page))
-			folio_activate(page_folio(page));
+		if (folio_test_lru(folio))
+			folio_activate(folio);
 		else
-			__lru_cache_activate_page(page);
-		ClearPageReferenced(page);
-		workingset_activation(page_folio(page));
+			__lru_cache_activate_folio(folio);
+		folio_clear_referenced(folio);
+		workingset_activation(folio);
 	}
-	if (page_is_idle(page))
-		clear_page_idle(page);
+	if (folio_test_idle(folio))
+		folio_clear_idle(folio);
 }
-EXPORT_SYMBOL(mark_page_accessed);
+EXPORT_SYMBOL(folio_mark_accessed);
 
 /**
  * lru_cache_add - add a page to a page list