[RESEND,v7,6/8] mm: Add APIs to free a folio directly to the buddy bypassing pcp

Message ID 20240208062608.44351-7-byungchul@sk.com (mailing list archive)
State New
Series Reduce TLB flushes by 94% by improving folio migration

Commit Message

Byungchul Park Feb. 8, 2024, 6:26 a.m. UTC
This is a preparation for migrc mechanism that frees folios at a better
time later, rather than the moment migrating folios. The folios freed by
migrc are too old to keep in pcp.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 include/linux/mm.h | 23 +++++++++++++++++++++++
 mm/internal.h      |  1 +
 mm/page_alloc.c    | 10 ++++++++++
 mm/swap.c          |  7 +++++++
 4 files changed, 41 insertions(+)

Comments

Andrew Morton Feb. 8, 2024, 8:49 p.m. UTC | #1
On Thu,  8 Feb 2024 15:26:06 +0900 Byungchul Park <byungchul@sk.com> wrote:

> This is a preparation for migrc mechanism that frees folios at a better

The term "migrc" appears in various places but I don't think we're told
what it actually means?

> time later, rather than the moment migrating folios. The folios freed by
> migrc are too old to keep in pcp.

How do we define "too old" and what causes you to believe this is the case?

Byungchul Park Feb. 13, 2024, 2:03 a.m. UTC | #2
On Thu, Feb 08, 2024 at 12:49:19PM -0800, Andrew Morton wrote:
> On Thu,  8 Feb 2024 15:26:06 +0900 Byungchul Park <byungchul@sk.com> wrote:
> 
> > This is a preparation for migrc mechanism that frees folios at a better
> 
> The term "migrc" appears in various places but I don't think we're told
> what it actually means?
> 
> > time later, rather than the moment migrating folios. The folios freed by
> > migrc are too old to keep in pcp.
> 
> How do we define "too old" and what causes you to believe this is the case?

Migrc defers folio_put() for the source folios of a migration, which are
unlikely to be used again, and frees a batch of them at once later.
Freeing that batch through pcp would pollute it: fresher folios might get
free_pcppages_bulk()ed, and the effort to keep pcp at its best size would
become unstable. I wanted to avoid that situation.

	Byungchul
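
To make the pcp pollution argument concrete, here is a minimal userspace
sketch in C. It is not kernel code. It assumes a toy pcp list with a
hypothetical cap PCP_HIGH that drains its oldest entry to the buddy list
when full, which is a simplification of what free_pcppages_bulk() does.
The names struct model, free_via_pcp(), free_nopcp() and pcp_contains()
are made up for illustration; free_nopcp() stands in for the
free_pages_nopcp() path added by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PCP_HIGH  4   /* hypothetical cap on pages cached in the toy pcp */
#define BUDDY_MAX 64  /* hypothetical capacity of the toy buddy list */

/* Toy model: pages are plain integer IDs, not struct page. */
struct model {
	int pcp[PCP_HIGH];     /* index 0 is the oldest cached page */
	size_t pcp_count;
	int buddy[BUDDY_MAX];  /* index 0 is the head, the last entry the tail */
	size_t buddy_count;
};

/* Append a page to the tail of the toy buddy free list. */
static void buddy_add_tail(struct model *m, int page)
{
	assert(m->buddy_count < BUDDY_MAX);
	m->buddy[m->buddy_count++] = page;
}

/* Free through the toy pcp; a full pcp first drains its oldest page. */
static void free_via_pcp(struct model *m, int page)
{
	if (m->pcp_count == PCP_HIGH) {
		buddy_add_tail(m, m->pcp[0]);
		memmove(&m->pcp[0], &m->pcp[1],
			(PCP_HIGH - 1) * sizeof(m->pcp[0]));
		m->pcp_count--;
	}
	m->pcp[m->pcp_count++] = page;
}

/* Bypass the toy pcp and free straight to the buddy tail. */
static void free_nopcp(struct model *m, int page)
{
	buddy_add_tail(m, page);
}

/* Return 1 if the page is still cached in the toy pcp, 0 otherwise. */
static int pcp_contains(const struct model *m, int page)
{
	for (size_t i = 0; i < m->pcp_count; i++)
		if (m->pcp[i] == page)
			return 1;
	return 0;
}
```

In this model, freeing four hot pages and then four cold ones through
free_via_pcp() pushes every hot page out of the toy pcp, whereas freeing
the cold ones with free_nopcp() leaves the hot pages cached and puts the
cold ones at the buddy tail.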

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index da5219b48d52..fc0581cce3a7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1284,6 +1284,7 @@  static inline struct folio *virt_to_folio(const void *x)
 }
 
 void __folio_put(struct folio *folio);
+void __folio_put_small_nopcp(struct folio *folio);
 
 void put_pages_list(struct list_head *pages);
 
@@ -1483,6 +1484,28 @@  static inline void folio_put(struct folio *folio)
 		__folio_put(folio);
 }
 
+/**
+ * folio_put_small_nopcp - Decrement the reference count on a folio.
+ * @folio: The folio.
+ *
+ * Use this only for a single-page folio.  The page is released directly
+ * to the buddy allocator, bypassing pcp.
+ *
+ * If the folio's reference count reaches zero, the memory will be
+ * released back to the page allocator and may be used by another
+ * allocation immediately.  Do not access the memory or the struct folio
+ * after calling folio_put_small_nopcp() unless you can be sure that it
+ * wasn't the last reference.
+ *
+ * Context: May be called in process or interrupt context, but not in NMI
+ * context.  May be called while holding a spinlock.
+ */
+static inline void folio_put_small_nopcp(struct folio *folio)
+{
+	if (folio_put_testzero(folio))
+		__folio_put_small_nopcp(folio);
+}
+
 /**
  * folio_put_refs - Reduce the reference count on a folio.
  * @folio: The folio.
diff --git a/mm/internal.h b/mm/internal.h
index b880f1e78700..3be8fd5604e8 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -451,6 +451,7 @@  extern int user_min_free_kbytes;
 
 extern void free_unref_page(struct page *page, unsigned int order);
 extern void free_unref_page_list(struct list_head *list);
+extern void free_pages_nopcp(struct page *page, unsigned int order);
 
 extern void zone_pcp_reset(struct zone *zone);
 extern void zone_pcp_disable(struct zone *zone);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 733732e7e0ba..21b8c8cd1673 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -565,6 +565,16 @@  static inline void free_the_page(struct page *page, unsigned int order)
 		__free_pages_ok(page, order, FPI_NONE);
 }
 
+void free_pages_nopcp(struct page *page, unsigned int order)
+{
+	/*
+	 * Use this when the pages are too cold to keep in pcp, e.g.
+	 * those freed by the migrc mechanism. Such pages are better
+	 * released to the tail of the buddy free list.
+	 */
+	__free_pages_ok(page, order, FPI_TO_TAIL);
+}
+
 /*
  * Higher-order pages are called "compound pages".  They are structured thusly:
  *
diff --git a/mm/swap.c b/mm/swap.c
index cd8f0150ba3a..3f37496a1184 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -106,6 +106,13 @@  static void __folio_put_small(struct folio *folio)
 	free_unref_page(&folio->page, 0);
 }
 
+void __folio_put_small_nopcp(struct folio *folio)
+{
+	__page_cache_release(folio);
+	mem_cgroup_uncharge(folio);
+	free_pages_nopcp(&folio->page, 0);
+}
+
 static void __folio_put_large(struct folio *folio)
 {
 	/*