
[v14,08/39] mm/swap: Add folio_mark_accessed()

Message ID 20210715200030.899216-9-willy@infradead.org (mailing list archive)
State New
Series Memory folios: Pagecache edition

Commit Message

Matthew Wilcox July 15, 2021, 7:59 p.m. UTC
Convert mark_page_accessed() to folio_mark_accessed().  It already
operated on the entire compound page, but now we can avoid calling
compound_head() quite so many times.  Shrinks the function from 424
bytes to 295 bytes (a saving of 129 bytes).  The compatibility wrapper
is 30 bytes, and with the 8 bytes for the exported symbol the kernel
shrinks by a net 91 bytes.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/swap.h |  3 ++-
 mm/folio-compat.c    |  7 +++++++
 mm/swap.c            | 34 ++++++++++++++++------------------
 3 files changed, 25 insertions(+), 19 deletions(-)

Comments

Gregory Price Oct. 8, 2023, 3:34 p.m. UTC | #1
On Thu, 15 Jul 2021 20:59:59 +0100, Matthew Wilcox wrote
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 989d8f78c256..c7a4c0a5863d 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -352,7 +352,8 @@ extern void lru_note_cost(struct lruvec *lruvec, bool file,
>  			  unsigned int nr_pages);
>  extern void lru_note_cost_page(struct page *);
>  extern void lru_cache_add(struct page *);
> -extern void mark_page_accessed(struct page *);
> +void mark_page_accessed(struct page *);
> +void folio_mark_accessed(struct folio *);
> 
>  extern atomic_t lru_disable_count;
> 
> diff --git a/mm/folio-compat.c b/mm/folio-compat.c
> index 7044fcc8a8aa..a374747ae1c6 100644
> --- a/mm/folio-compat.c
> +++ b/mm/folio-compat.c
> @@ -5,6 +5,7 @@
>   */
> 
>  #include <linux/pagemap.h>
> +#include <linux/swap.h>
> 
>  struct address_space *page_mapping(struct page *page)
>  {
> @@ -41,3 +42,9 @@ bool page_mapped(struct page *page)
>  	return folio_mapped(page_folio(page));
>  }
>  EXPORT_SYMBOL(page_mapped);
> +
> +void mark_page_accessed(struct page *page)
> +{
> +	folio_mark_accessed(page_folio(page));
> +}
> +EXPORT_SYMBOL(mark_page_accessed);
... snip ...
>
> @@ -430,36 +430,34 @@ static void __lru_cache_activate_page(struct page *page)
>   * When a newly allocated page is not yet visible, so safe for non-atomic ops,
>   * __SetPageReferenced(page) may be substituted for mark_page_accessed(page).
>   */
> -void mark_page_accessed(struct page *page)
> +void folio_mark_accessed(struct folio *folio)
>  {
> -	page = compound_head(page);
> -
> -	if (!PageReferenced(page)) {
> -		SetPageReferenced(page);
> -	} else if (PageUnevictable(page)) {
> +	if (!folio_test_referenced(folio)) {
> +		folio_set_referenced(folio);
> +	} else if (folio_test_unevictable(folio)) {

Hi Matthew,

My colleagues and I have been testing some page-tracking algorithms, and
we tried tracking the reference bit using /proc/pid/clear_refs,
/proc/pid/pagemap, and /proc/kpageflags.
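
For reference, this is roughly how we sample the bit for one virtual
address (a minimal sketch, assuming root so pagemap exposes PFNs, 4K
pages, and the KPF_* bit numbers from
include/uapi/linux/kernel-page-flags.h); between samples we clear the
bits by writing "1" to /proc/pid/clear_refs:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if KPF_REFERENCED is set for the page backing vaddr,
 * 0 if clear, -1 on error or if the page is not present. */
static int page_referenced(pid_t pid, uintptr_t vaddr)
{
	char path[64];
	uint64_t ent, flags;
	int fd, ret = -1;

	snprintf(path, sizeof(path), "/proc/%d/pagemap", (int)pid);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	/* pagemap: one u64 per virtual page; bit 63 = present,
	 * bits 0-54 = PFN. */
	if (pread(fd, &ent, 8, (vaddr / 4096) * 8) == 8 && (ent >> 63)) {
		close(fd);
		fd = open("/proc/kpageflags", O_RDONLY);
		if (fd < 0)
			return -1;
		/* kpageflags: one u64 per PFN; KPF_REFERENCED is bit 2. */
		if (pread(fd, &flags, 8, (ent & ((1ULL << 55) - 1)) * 8) == 8)
			ret = (flags >> 2) & 1;
	}
	close(fd);
	return ret;
}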

I'm not 100% sure of the cause, but we're seeing some inconsistencies
when tracking page reference bits, and this seems to be at least one of
the patches involved.

From reading up on folios, it seems like this change prevents each
individual page in a folio from being marked referenced, and instead
prefers to simply mark the entire folio containing the page as accessed.

Looking at the /proc interface, it still reads flags from the individual
page rather than from that page's folio (see below).

Our current suspicion is that since only the folio head gets marked
referenced, any pages inside the folio that aren't the head will
basically never be marked referenced, leading to an inconsistent view.

I'm curious about your thoughts on which of these (if either) is right:

a) should we update kpageflags_read to use page_folio() and report the
   folio (head page) flags,

or

b) should the above change to mark_page_accessed() mark the individual
   page referenced as well as the folio head?

Thanks for your time.
Gregory Price


(relevant fs/proc/page.c code:)


static ssize_t kpageflags_read(struct file *file, char __user *buf,
                             size_t count, loff_t *ppos)
{
... snip ...
                ppage = pfn_to_online_page(pfn);

                if (put_user(stable_page_flags(ppage), out)) {
                        ret = -EFAULT;
                        break;
                }
... snip ...
}


u64 stable_page_flags(struct page *page)
{
... snip ...
        k = page->flags;
... snip ...
}
Matthew Wilcox Oct. 10, 2023, 6:08 p.m. UTC | #2
On Sun, Oct 08, 2023 at 11:34:45AM -0400, Gregory Price wrote:
> On Thu, 15 Jul 2021 20:59:59 +0100, Matthew Wilcox wrote
> > +void mark_page_accessed(struct page *page)
> > +{
> > +	folio_mark_accessed(page_folio(page));
> > +}
> > +EXPORT_SYMBOL(mark_page_accessed);
> ... snip ...
> >
> > @@ -430,36 +430,34 @@ static void __lru_cache_activate_page(struct page *page)
> >   * When a newly allocated page is not yet visible, so safe for non-atomic ops,
> >   * __SetPageReferenced(page) may be substituted for mark_page_accessed(page).
> >   */
> > -void mark_page_accessed(struct page *page)
> > +void folio_mark_accessed(struct folio *folio)
> >  {
> > -	page = compound_head(page);
> > -
> > -	if (!PageReferenced(page)) {
> > -		SetPageReferenced(page);
> > -	} else if (PageUnevictable(page)) {
> > +	if (!folio_test_referenced(folio)) {
> > +		folio_set_referenced(folio);
> > +	} else if (folio_test_unevictable(folio)) {
> 
> Hi Matthew,
> 
> My colleagues and I have been testing some page-tracking algorithms, and
> we tried tracking the reference bit using /proc/pid/clear_refs,
> /proc/pid/pagemap, and /proc/kpageflags.
> 
> I'm not 100% sure of the cause, but we're seeing some inconsistencies
> when tracking page reference bits, and this seems to be at least one of
> the patches involved.
> 
> From reading up on folios, it seems like this change prevents each
> individual page in a folio from being marked referenced, and instead
> prefers to simply mark the entire folio containing the page as accessed.
> 
> Looking at the /proc interface, it still reads flags from the individual
> page rather than from that page's folio (see below).
> 
> Our current suspicion is that since only the folio head gets marked
> referenced, any pages inside the folio that aren't the head will
> basically never be marked referenced, leading to an inconsistent view.
> 
> I'm curious about your thoughts on which of these (if either) is right:
> 
> a) should we update kpageflags_read to use page_folio() and report the
>    folio (head page) flags,
> 
> or
> 
> b) should the above change to mark_page_accessed() mark the individual
>    page referenced as well as the folio head?

Hi Greg, thanks for reaching out.

The referenced bit has been tracked on what is now known as a per-folio
basis since commit 8cb38fabb6bc in 2016.  That is, if you tried to
SetPageReferenced / ClearPageReferenced / test PageReferenced on a tail
page, it would redirect to the head page in order to set/clear/test
the bit in the head page's flags field.
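
Roughly what PAGEFLAG(Referenced, referenced, PF_HEAD) in
include/linux/page-flags.h expands to (a simplified sketch; the real
expansion also carries poisoned-page checks):

static inline int PageReferenced(struct page *page)
{
	return test_bit(PG_referenced, &compound_head(page)->flags);
}

static inline void SetPageReferenced(struct page *page)
{
	set_bit(PG_referenced, &compound_head(page)->flags);
}

static inline void ClearPageReferenced(struct page *page)
{
	clear_bit(PG_referenced, &compound_head(page)->flags);
}

So a tail page's own flags word never carries PG_referenced; anything
reading page->flags directly on a tail page will always see the bit
clear.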

That's not to say that all the code which sets/checks/tests this
really understands that!  We've definitely found bugs during the folio
work where code has mistakenly assumed that operations on a tail
page actually affect that page rather than the whole allocation.

There's also code which is now being exposed to compound pages for the
first time, and is not holding up well.

I hope that's helpful in finding the bug you're chasing.

> Thanks for your time.
> Gregory Price
> 
> 
> (relevant fs/proc/page.c code:)
> 
> 
> static ssize_t kpageflags_read(struct file *file, char __user *buf,
>                              size_t count, loff_t *ppos)
> {
> ... snip ...
>                 ppage = pfn_to_online_page(pfn);
> 
>                 if (put_user(stable_page_flags(ppage), out)) {
>                         ret = -EFAULT;
>                         break;
>                 }
> ... snip ...
> }
> 
> 
> u64 stable_page_flags(struct page *page)
> {
> ... snip ...
>         k = page->flags;
> ... snip ...
> }
Gregory Price Oct. 13, 2023, 4:38 p.m. UTC | #3
On Tue, Oct 10, 2023 at 07:08:15PM +0100, Matthew Wilcox wrote:
> On Sun, Oct 08, 2023 at 11:34:45AM -0400, Gregory Price wrote:
> > a) should we update kpageflags_read to use page_folio() and report the
> >    folio (head page) flags,
> > 
> > or
> > 
> > b) should the above change to mark_page_accessed() mark the individual
> >    page referenced as well as the folio head?
> 
> Hi Greg, thanks for reaching out.
> 
> The referenced bit has been tracked on what is now known as a per-folio
> basis since commit 8cb38fabb6bc in 2016.  That is, if you tried to
> SetPageReferenced / ClearPageReferenced / test PageReferenced on a tail
> page, it would redirect to the head page in order to set/clear/test
> the bit in the head page's flags field.
> 
> That's not to say that all the code which sets/checks/tests this
> really understands that!  We've definitely found bugs during the folio
> work where code has mistakenly assumed that operations on a tail
> page actually affect that page rather than the whole allocation.
> 
> There's also code which is now being exposed to compound pages for the
> first time, and is not holding up well.
> 
> I hope that's helpful in finding the bug you're chasing.
> 

Ah, very interesting.  So the kpageflags code doesn't appear to respect
the head-page flag placement defined in 8cb38fabb6bc.

We tested the hack of a patch at the end of this mail, which swaps in
the folio head's flags for the bits that page-flags.h defines as
per-head.
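
For reference, folio_flags(folio, 0) as used in the hack is just the
head page's flags word (a simplified sketch of the helper from
include/linux/page-flags.h, with its debug checks dropped):

static unsigned long *folio_flags(struct folio *folio, unsigned n)
{
	/* n == 0 is the head page; the per-folio bits live there. */
	return &(&folio->page)[n].flags;
}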

The result was that the referenced bit was almost *never* set on
anonymous pages, no matter where we looked, even with a program
constantly touching every page in a large chunk of memory.

Head scratcher.

Thank you for the response.  We're still very much confused, but it
looks like there are potentially many problems (unless we're
misunderstanding the reference bit).

(cc Naoya, since I saw you were actively working on kpageflags folio
 support; not sure what else you're seeing.)

~Gregory


diff --git a/fs/proc/page.c b/fs/proc/page.c
index 195b077c0fac..8022a41d7899 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -110,6 +110,7 @@ static inline u64 kpf_copy_bit(u64 kflags, int ubit, int kbit)
 u64 stable_page_flags(struct page *page)
 {
        u64 k;
+       u64 fh;
        u64 u;

        /*
@@ -119,7 +120,8 @@ u64 stable_page_flags(struct page *page)
        if (!page)
                return 1 << KPF_NOPAGE;

        k = page->flags;
+       fh = *folio_flags(page_folio(page), 0);
        u = 0;

        /*
@@ -188,20 +190,20 @@ u64 stable_page_flags(struct page *page)
                u |= 1 << KPF_SLAB;

        u |= kpf_copy_bit(k, KPF_ERROR,         PG_error);
-       u |= kpf_copy_bit(k, KPF_DIRTY,         PG_dirty);
+       u |= kpf_copy_bit(fh, KPF_DIRTY,        PG_dirty);
        u |= kpf_copy_bit(k, KPF_UPTODATE,      PG_uptodate);
        u |= kpf_copy_bit(k, KPF_WRITEBACK,     PG_writeback);

-       u |= kpf_copy_bit(k, KPF_LRU,           PG_lru);
-       u |= kpf_copy_bit(k, KPF_REFERENCED,    PG_referenced);
-       u |= kpf_copy_bit(k, KPF_ACTIVE,        PG_active);
+       u |= kpf_copy_bit(fh, KPF_LRU,          PG_lru);
+       u |= kpf_copy_bit(fh, KPF_REFERENCED,   PG_referenced);
+       u |= kpf_copy_bit(fh, KPF_ACTIVE,       PG_active);
        u |= kpf_copy_bit(k, KPF_RECLAIM,       PG_reclaim);

        if (PageSwapCache(page))
                u |= 1 << KPF_SWAPCACHE;
        u |= kpf_copy_bit(k, KPF_SWAPBACKED,    PG_swapbacked);

-       u |= kpf_copy_bit(k, KPF_UNEVICTABLE,   PG_unevictable);
+       u |= kpf_copy_bit(fh, KPF_UNEVICTABLE,  PG_unevictable);
        u |= kpf_copy_bit(k, KPF_MLOCKED,       PG_mlocked);

 #ifdef CONFIG_MEMORY_FAILURE

Patch

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 989d8f78c256..c7a4c0a5863d 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -352,7 +352,8 @@  extern void lru_note_cost(struct lruvec *lruvec, bool file,
 			  unsigned int nr_pages);
 extern void lru_note_cost_page(struct page *);
 extern void lru_cache_add(struct page *);
-extern void mark_page_accessed(struct page *);
+void mark_page_accessed(struct page *);
+void folio_mark_accessed(struct folio *);
 
 extern atomic_t lru_disable_count;
 
diff --git a/mm/folio-compat.c b/mm/folio-compat.c
index 7044fcc8a8aa..a374747ae1c6 100644
--- a/mm/folio-compat.c
+++ b/mm/folio-compat.c
@@ -5,6 +5,7 @@ 
  */
 
 #include <linux/pagemap.h>
+#include <linux/swap.h>
 
 struct address_space *page_mapping(struct page *page)
 {
@@ -41,3 +42,9 @@  bool page_mapped(struct page *page)
 	return folio_mapped(page_folio(page));
 }
 EXPORT_SYMBOL(page_mapped);
+
+void mark_page_accessed(struct page *page)
+{
+	folio_mark_accessed(page_folio(page));
+}
+EXPORT_SYMBOL(mark_page_accessed);
diff --git a/mm/swap.c b/mm/swap.c
index c3137e4e1cd8..d32007fe23b3 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -390,7 +390,7 @@  static void folio_activate(struct folio *folio)
 }
 #endif
 
-static void __lru_cache_activate_page(struct page *page)
+static void __lru_cache_activate_folio(struct folio *folio)
 {
 	struct pagevec *pvec;
 	int i;
@@ -411,8 +411,8 @@  static void __lru_cache_activate_page(struct page *page)
 	for (i = pagevec_count(pvec) - 1; i >= 0; i--) {
 		struct page *pagevec_page = pvec->pages[i];
 
-		if (pagevec_page == page) {
-			SetPageActive(page);
+		if (pagevec_page == &folio->page) {
+			folio_set_active(folio);
 			break;
 		}
 	}
@@ -430,36 +430,34 @@  static void __lru_cache_activate_page(struct page *page)
  * When a newly allocated page is not yet visible, so safe for non-atomic ops,
  * __SetPageReferenced(page) may be substituted for mark_page_accessed(page).
  */
-void mark_page_accessed(struct page *page)
+void folio_mark_accessed(struct folio *folio)
 {
-	page = compound_head(page);
-
-	if (!PageReferenced(page)) {
-		SetPageReferenced(page);
-	} else if (PageUnevictable(page)) {
+	if (!folio_test_referenced(folio)) {
+		folio_set_referenced(folio);
+	} else if (folio_test_unevictable(folio)) {
 		/*
 		 * Unevictable pages are on the "LRU_UNEVICTABLE" list. But,
 		 * this list is never rotated or maintained, so marking an
 		 * unevictable page accessed has no effect.
 		 */
-	} else if (!PageActive(page)) {
+	} else if (!folio_test_active(folio)) {
 		/*
 		 * If the page is on the LRU, queue it for activation via
 		 * lru_pvecs.activate_page. Otherwise, assume the page is on a
 		 * pagevec, mark it active and it'll be moved to the active
 		 * LRU on the next drain.
 		 */
-		if (PageLRU(page))
-			folio_activate(page_folio(page));
+		if (folio_test_lru(folio))
+			folio_activate(folio);
 		else
-			__lru_cache_activate_page(page);
-		ClearPageReferenced(page);
-		workingset_activation(page_folio(page));
+			__lru_cache_activate_folio(folio);
+		folio_clear_referenced(folio);
+		workingset_activation(folio);
 	}
-	if (page_is_idle(page))
-		clear_page_idle(page);
+	if (folio_test_idle(folio))
+		folio_clear_idle(folio);
 }
-EXPORT_SYMBOL(mark_page_accessed);
+EXPORT_SYMBOL(folio_mark_accessed);
 
 /**
  * lru_cache_add - add a page to a page list