
[1/7] mm: Store compound_nr as well as compound_order

Message ID 20200629151959.15779-2-willy@infradead.org
State New, archived
Series THP prep patches

Commit Message

Matthew Wilcox (Oracle) June 29, 2020, 3:19 p.m. UTC
This removes a few instructions from functions which need to know how many
pages are in a compound page.  The storage used is either page->mapping
on 64-bit or page->index on 32-bit.  Both of these are fine to overlay
on tail pages.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h       | 5 ++++-
 include/linux/mm_types.h | 1 +
 mm/page_alloc.c          | 5 +++--
 3 files changed, 8 insertions(+), 3 deletions(-)
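
As a quick illustration of the "functions which need to know how many pages
are in a compound page" mentioned above, here is a hypothetical caller sketch
(not part of this series; the helper name and the per-page work are made up,
clear_highpage() is the usual helper from <linux/highmem.h>):

	static void example_clear_compound(struct page *page)
	{
		unsigned long i, nr = compound_nr(page);	/* 1 for a base page */

		for (i = 0; i < nr; i++)
			clear_highpage(page + i);
	}

With this patch, the compound_nr() call becomes a single load from the first
tail page instead of a load of compound_order plus a shift.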

Comments

Kirill A. Shutemov July 6, 2020, 10:29 a.m. UTC | #1
On Mon, Jun 29, 2020 at 04:19:53PM +0100, Matthew Wilcox (Oracle) wrote:
> This removes a few instructions from functions which need to know how many
> pages are in a compound page.  The storage used is either page->mapping
> on 64-bit or page->index on 32-bit.  Both of these are fine to overlay
> on tail pages.

I'm not a fan of redundant data in struct page, even if it's in a less busy
tail page. We tend to find more uses for the space over time.

Any numbers on what it gives for a typical kernel? Is it really worth it?

Matthew Wilcox (Oracle) Aug. 11, 2020, 10:53 p.m. UTC | #2
On Mon, Jul 06, 2020 at 01:29:25PM +0300, Kirill A. Shutemov wrote:
> On Mon, Jun 29, 2020 at 04:19:53PM +0100, Matthew Wilcox (Oracle) wrote:
> > This removes a few instructions from functions which need to know how many
> > pages are in a compound page.  The storage used is either page->mapping
> > on 64-bit or page->index on 32-bit.  Both of these are fine to overlay
> > on tail pages.
> 
> I'm not a fan of redundant data in struct page, even if it's in a less busy
> tail page. We tend to find more uses for the space over time.
> 
> Any numbers on what it gives for a typical kernel? Is it really worth it?

Oops, I overlooked this email.  Sorry.  Thanks to Andrew for the reminder.

I haven't analysed the performance win for this.  The assembly is
two instructions (11 bytes) shorter:

before:
    206c:       a9 00 00 01 00          test   $0x10000,%eax
    2071:       0f 84 af 02 00 00       je     2326 <shmem_add_to_page_cache.isra.0+0x3b6>
    2077:       41 0f b6 4c 24 51       movzbl 0x51(%r12),%ecx
    207d:       41 bd 01 00 00 00       mov    $0x1,%r13d
    2083:       49 8b 44 24 08          mov    0x8(%r12),%rax
    2088:       49 d3 e5                shl    %cl,%r13
    208b:       a8 01                   test   $0x1,%al

after:
    2691:       a9 00 00 01 00          test   $0x10000,%eax
    2696:       0f 84 95 01 00 00       je     2831 <shmem_add_to_page_cache.isra.0+0x291>
    269c:       49 8b 47 08             mov    0x8(%r15),%rax
    26a0:       45 8b 77 58             mov    0x58(%r15),%r14d
    26a4:       a8 01                   test   $0x1,%al

(there are other changes in these files, so the addresses aren't
meaningful).
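
In C terms (a simplified view; the "after" body is exactly the mm.h hunk in
the patch below), the saving is the compound_nr() computation itself: the old
code had to load compound_order and build the count with a shift, the new
code loads the precomputed count:

	/* before: movzbl of compound_order, mov $0x1, shl -- 3 insns, 15 bytes */
	static inline unsigned long compound_nr(struct page *page)
	{
		return 1UL << compound_order(page);
	}

	/* after: a single 4-byte load of page[1].compound_nr */
	static inline unsigned long compound_nr(struct page *page)
	{
		if (!PageHead(page))	/* the test $0x10000/je pair, present in both versions */
			return 1;
		return page[1].compound_nr;
	}

That matches the "two instructions (11 bytes)" figure above: 15 bytes of
movzbl+mov+shl replaced by one 4-byte load.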

If we need the space, we can always revert this patch.  It's all hidden
behind the macro anyway.

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index dc7b87310c10..af0305ad090f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -911,12 +911,15 @@  static inline int compound_pincount(struct page *page)
 static inline void set_compound_order(struct page *page, unsigned int order)
 {
 	page[1].compound_order = order;
+	page[1].compound_nr = 1U << order;
 }
 
 /* Returns the number of pages in this potentially compound page. */
 static inline unsigned long compound_nr(struct page *page)
 {
-	return 1UL << compound_order(page);
+	if (!PageHead(page))
+		return 1;
+	return page[1].compound_nr;
 }
 
 /* Returns the number of bytes in this potentially compound page. */
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 64ede5f150dc..561ed987ab44 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -134,6 +134,7 @@  struct page {
 			unsigned char compound_dtor;
 			unsigned char compound_order;
 			atomic_t compound_mapcount;
+			unsigned int compound_nr; /* 1 << compound_order */
 		};
 		struct {	/* Second tail page of compound page */
 			unsigned long _compound_pad_1;	/* compound_head */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 48eb0f1410d4..c7beb5f13193 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -673,8 +673,6 @@  void prep_compound_page(struct page *page, unsigned int order)
 	int i;
 	int nr_pages = 1 << order;
 
-	set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
-	set_compound_order(page, order);
 	__SetPageHead(page);
 	for (i = 1; i < nr_pages; i++) {
 		struct page *p = page + i;
@@ -682,6 +680,9 @@  void prep_compound_page(struct page *page, unsigned int order)
 		p->mapping = TAIL_MAPPING;
 		set_compound_head(p, page);
 	}
+
+	set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
+	set_compound_order(page, order);
 	atomic_set(compound_mapcount_ptr(page), -1);
 	if (hpage_pincount_available(page))
 		atomic_set(compound_pincount_ptr(page), 0);
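
One detail the diff doesn't spell out (my reading, not stated in the commit
message): set_compound_order() now also writes page[1].compound_nr, and per
the commit message that field shares storage with page->mapping on 64-bit,
so the dtor/order setup is moved below the tail-page loop to keep the loop's
"p->mapping = TAIL_MAPPING" store on page[1] from clobbering the freshly
written count.  Condensed, the reordered prep_compound_page() reads roughly
as follows (context reconstructed around the hunks above):

	void prep_compound_page(struct page *page, unsigned int order)
	{
		int i;
		int nr_pages = 1 << order;

		__SetPageHead(page);
		for (i = 1; i < nr_pages; i++) {
			struct page *p = page + i;
			set_page_count(p, 0);
			p->mapping = TAIL_MAPPING;	/* aliases compound_nr for page[1] on 64-bit */
			set_compound_head(p, page);
		}

		/* now safe: nothing below this point rewrites page[1]->mapping */
		set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);
		set_compound_order(page, order);	/* also stores page[1].compound_nr */
		atomic_set(compound_mapcount_ptr(page), -1);
		if (hpage_pincount_available(page))
			atomic_set(compound_pincount_ptr(page), 0);
	}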