
[v2] mm/page_alloc: remove prefetchw() on freeing page to buddy system

Message ID: 20240704015906.18437-1-richard.weiyang@gmail.com

Commit Message

Wei Yang July 4, 2024, 1:59 a.m. UTC
prefetchw() was introduced by an ancient patch[1].

The change log says:

    The basic idea is to free higher order pages instead of going
    through every single one.  Also, some unnecessary atomic operations
    are done away with and replaced with non-atomic equivalents, and
    prefetching is done where it helps the most.  For a more in-depth
    discussion of this patch, please see the linux-ia64 archives (topic
    is "free bootmem feedback patch").

So the patch made several changes to improve bootmem freeing, the most
basic of which is freeing higher-order pages. And, as Matthew notes,
"Itanium CPUs of this era had no prefetchers."

I ran 10 rounds of bootup tests before and after this change, and the
data does not show that prefetchw() speeds up bootmem freeing. The sum
of the 10 rounds' bootmem freeing time after removing prefetchw() is
even 5.2% faster than before.
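
For reference, one way such numbers could be collected (the posting
does not spell out the method, so treat this as a purely hypothetical
sketch): wrap the freeing path and accumulate the elapsed time across
calls during boot.

    /*
     * Hypothetical instrumentation, not part of this patch: sum the
     * time spent freeing pages to the buddy system during boot.
     * Assumes a usable clocksource at this point in boot.
     */
    #include <linux/ktime.h>

    static u64 __initdata bootmem_free_ns;

    static void __init timed_free_pages_core(struct page *page,
                                             unsigned int order,
                                             enum meminit_context context)
    {
            u64 start = ktime_get_ns();

            __free_pages_core(page, order, context);
            bootmem_free_ns += ktime_get_ns() - start;
    }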

[1]: https://lore.kernel.org/linux-ia64/40F46962.4090604@sgi.com/

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
CC: David Hildenbrand <david@redhat.com>

---
v2: slightly adjust the loop based on David's comment

The patch is based on mm-stable with David's change applied:
commit 3dadec1babf9eee0c67c967df931d6f0cb124a04

  mm: pass meminit_context to __free_pages_core()
---
 mm/page_alloc.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)
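
To make the v2 loop adjustment easier to read outside of diff context,
here is the shape of the old and new loops (a sketch using the
MEMINIT_EARLY branch; the authoritative code is in the diff below):

    /*
     * Old: prefetch one page ahead, which forces the loop to stop at
     * nr_pages - 1 and handle the last page as a duplicated tail.
     */
    prefetchw(p);
    for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
            prefetchw(p + 1);
            __ClearPageReserved(p);
            set_page_count(p, 0);
    }
    __ClearPageReserved(p);
    set_page_count(p, 0);

    /* New: no prefetch, so one loop body covers all nr_pages pages. */
    for (;;) {
            __ClearPageReserved(p);
            set_page_count(p, 0);
            if (++loop >= nr_pages)
                    break;
            p++;
    }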

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 116ee33fd1ce..5235015eba3d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1224,7 +1224,7 @@  void __meminit __free_pages_core(struct page *page, unsigned int order,
 {
 	unsigned int nr_pages = 1 << order;
 	struct page *p = page;
-	unsigned int loop;
+	unsigned int loop = 0;
 
 	/*
 	 * When initializing the memmap, __init_single_page() sets the refcount
@@ -1236,16 +1236,14 @@  void __meminit __free_pages_core(struct page *page, unsigned int order,
 	 */
 	if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG) &&
 	    unlikely(context == MEMINIT_HOTPLUG)) {
-		prefetchw(p);
-		for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
-			prefetchw(p + 1);
+		for (;;) {
 			VM_WARN_ON_ONCE(PageReserved(p));
 			__ClearPageOffline(p);
 			set_page_count(p, 0);
+			if (++loop >= nr_pages)
+				break;
+			p++;
 		}
-		VM_WARN_ON_ONCE(PageReserved(p));
-		__ClearPageOffline(p);
-		set_page_count(p, 0);
 
 		/*
 		 * Freeing the page with debug_pagealloc enabled will try to
@@ -1255,14 +1253,13 @@  void __meminit __free_pages_core(struct page *page, unsigned int order,
 		debug_pagealloc_map_pages(page, nr_pages);
 		adjust_managed_page_count(page, nr_pages);
 	} else {
-		prefetchw(p);
-		for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
-			prefetchw(p + 1);
+		for (;;) {
 			__ClearPageReserved(p);
 			set_page_count(p, 0);
+			if (++loop >= nr_pages)
+				break;
+			p++;
 		}
-		__ClearPageReserved(p);
-		set_page_count(p, 0);
 
 		/* memblock adjusts totalram_pages() manually. */
 		atomic_long_add(nr_pages, &page_zone(page)->managed_pages);