[v3,01/25] mm: Introduce struct folio

Message ID	20210128070404.1922318-2-willy@infradead.org (mailing list archive)
State	New, archived
Headers
Return-Path: <SRS0=KvBV=G7=kvack.org=owner-linux-mm@kernel.org>
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1267B64DCE
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
To: linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>,
	linux-kernel@vger.kernel.org
Subject: [PATCH v3 01/25] mm: Introduce struct folio
Date: Thu, 28 Jan 2021 07:03:40 +0000
Message-Id: <20210128070404.1922318-2-willy@infradead.org>
In-Reply-To: <20210128070404.1922318-1-willy@infradead.org>
References: <20210128070404.1922318-1-willy@infradead.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Sender: owner-linux-mm@kvack.org
Precedence: bulk
Series	Page folios
[v3,00/25] Page folios
[v3,01/25] mm: Introduce struct folio
[v3,02/25] mm: Add folio_pgdat
[v3,03/25] mm/vmstat: Add folio stat wrappers
[v3,04/25] mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO
[v3,05/25] mm: Add put_folio
[v3,06/25] mm: Add get_folio
[v3,07/25] mm: Create FolioFlags
[v3,08/25] mm: Handle per-folio private data
[v3,09/25] mm: Add folio_index, folio_page and folio_contains
[v3,10/25] mm/util: Add folio_mapping and folio_file_mapping
[v3,11/25] mm/memcg: Add folio_memcg, lock_folio_memcg and unlock_folio_memcg
[v3,12/25] mm/memcg: Add mem_cgroup_folio_lruvec
[v3,13/25] mm: Add unlock_folio
[v3,14/25] mm: Add lock_folio
[v3,15/25] mm: Add lock_folio_killable
[v3,16/25] mm: Convert lock_page_async to lock_folio_async
[v3,17/25] mm/filemap: Convert end_page_writeback to end_folio_writeback
[v3,18/25] mm: Convert wait_on_page_bit to wait_on_folio_bit
[v3,19/25] mm: Add wait_for_stable_folio and wait_on_folio_writeback
[v3,20/25] mm: Add wait_on_folio_locked & wait_on_folio_locked_killable
[v3,21/25] mm: Convert lock_page_or_retry to lock_folio_or_retry
[v3,22/25] mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit
[v3,23/25] mm: Convert test_clear_page_writeback to test_clear_folio_writeback
[v3,24/25] mm/filemap: Convert page wait queues to be folios
[v3,25/25] cachefiles: Switch to wait_page_key

Commit Message

Matthew Wilcox Jan. 28, 2021, 7:03 a.m. UTC

We have trouble keeping track of whether we've already called
compound_head() to ensure we're not operating on a tail page.  Further,
it's never clear whether we intend a struct page to refer to PAGE_SIZE
bytes or page_size(compound_head(page)).

Introduce a new type 'struct folio' that always refers to an entire
(possibly compound) page, and points to the head page (or base page).

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h       | 26 ++++++++++++++++++++++++++
 include/linux/mm_types.h | 17 +++++++++++++++++
 2 files changed, 43 insertions(+)
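
For readers following along, here is a minimal user-space sketch of the head-page lookup that the commit message describes. It is not the kernel code: the one-field struct page, model_prep_compound() and model_page_folio() are hypothetical stand-ins, and the only thing assumed from the patch is that a tail page stores its head page's address with bit 0 set.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for struct page: only the field this sketch needs. */
struct page {
	uintptr_t compound_head;	/* tail page: head address | 1; else bit 0 clear */
};

/* As in the patch, a folio simply wraps the head (or base) page. */
struct folio {
	struct page page;
};

/* Hypothetical helper: make pages[1..nr-1] tail pages of pages[0]. */
static void model_prep_compound(struct page *pages, unsigned int nr)
{
	/* Bit 0 is only usable as a tag if struct page is at least 2-byte aligned. */
	assert(((uintptr_t)&pages[0] & 1) == 0);

	pages[0].compound_head = 0;
	for (unsigned int i = 1; i < nr; i++)
		pages[i].compound_head = (uintptr_t)&pages[0] | 1;
}

/* Same shape as page_folio() in the patch, minus READ_ONCE() and unlikely(). */
static struct folio *model_page_folio(struct page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)
		return (struct folio *)(head - 1);	/* tail page: strip the tag */
	return (struct folio *)page;			/* head or base page */
}
```

Whichever page of the compound page a caller starts from, the lookup yields the same folio, which is the property that lets later code stop asking whether compound_head() has already been called.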

Comments

Zi Yan March 1, 2021, 8:26 p.m. UTC | #1

On 28 Jan 2021, at 2:03, Matthew Wilcox (Oracle) wrote:

> We have trouble keeping track of whether we've already called
> compound_head() to ensure we're not operating on a tail page.  Further,
> it's never clear whether we intend a struct page to refer to PAGE_SIZE
> bytes or page_size(compound_head(page)).
>
> Introduce a new type 'struct folio' that always refers to an entire
> (possibly compound) page, and points to the head page (or base page).
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/mm.h       | 26 ++++++++++++++++++++++++++
>  include/linux/mm_types.h | 17 +++++++++++++++++
>  2 files changed, 43 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 2d6e715ab8ea..f20504017adf 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -924,6 +924,11 @@ static inline unsigned int compound_order(struct page *page)
>  	return page[1].compound_order;
>  }
>
> +static inline unsigned int folio_order(struct folio *folio)
> +{
> +	return compound_order(&folio->page);
> +}
> +
>  static inline bool hpage_pincount_available(struct page *page)
>  {
>  	/*
> @@ -975,6 +980,26 @@ static inline unsigned int page_shift(struct page *page)
>
>  void free_compound_page(struct page *page);
>
> +static inline unsigned long folio_nr_pages(struct folio *folio)
> +{
> +	return compound_nr(&folio->page);
> +}
> +
> +static inline struct folio *next_folio(struct folio *folio)
> +{
> +	return folio + folio_nr_pages(folio);

Are you planning to make hugetlb use folio too?

If yes, this might not work if we have CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP
with a hugetlb folio > MAX_ORDER, because struct page might not be virtually contiguous.
See the experiment I did in [1].


[1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/


—
Best Regards,
Yan Zi

Matthew Wilcox March 1, 2021, 8:53 p.m. UTC | #2

On Mon, Mar 01, 2021 at 03:26:11PM -0500, Zi Yan wrote:
> > +static inline struct folio *next_folio(struct folio *folio)
> > +{
> > +	return folio + folio_nr_pages(folio);
> 
> Are you planning to make hugetlb use folio too?

Eventually, probably.  It's not my focus.

> If yes, this might not work if we have CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP
> with a hugetlb folio > MAX_ORDER, because struct page might not be virtually contiguous.
> See the experiment I did in [1].
> 
> [1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/

I thought we were going to forbid that configuration?  ie no pages
larger than MAX_ORDER with (SPARSEMEM && !SPARSEMEM_VMEMMAP)

https://lore.kernel.org/linux-mm/312AECBD-CA6D-4E93-A6C1-1DF87BABD92D@nvidia.com/

is somewhere else we were discussing this.
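
To make the concern above concrete, here is a toy user-space model of the (SPARSEMEM && !SPARSEMEM_VMEMMAP) case, where each memory section has its own struct page array. Everything in it is an assumption made for illustration: the section size, the single backing array with a gap between the two section maps, and the model_* and next_by_* names. It only shows why stepping by pointer arithmetic and stepping by pfn can disagree once a step crosses a section boundary.

```c
#include <assert.h>
#include <stddef.h>

#define PAGES_PER_SECTION 4UL	/* hypothetical; real sections are far larger */
#define NR_SECTIONS 2UL

struct page {
	unsigned long pfn;	/* toy field so each entry can find its own pfn */
};

/*
 * One backing array holding two per-section maps with a gap between them,
 * standing in for section maps that were allocated separately.
 */
static struct page backing[3 * PAGES_PER_SECTION];

static struct page *section_mem_map(unsigned long section)
{
	/* Section 0 starts at backing[0], section 1 at backing[8]. */
	return &backing[section * 2 * PAGES_PER_SECTION];
}

static struct page *model_pfn_to_page(unsigned long pfn)
{
	assert(pfn < NR_SECTIONS * PAGES_PER_SECTION);
	return section_mem_map(pfn / PAGES_PER_SECTION) + pfn % PAGES_PER_SECTION;
}

static void model_init(void)
{
	for (unsigned long pfn = 0; pfn < NR_SECTIONS * PAGES_PER_SECTION; pfn++)
		model_pfn_to_page(pfn)->pfn = pfn;
}

/* Step by pointer arithmetic, as next_folio() in the patch does. */
static struct page *next_by_pointer(struct page *page, unsigned long nr)
{
	return page + nr;
}

/* Step by pfn, which goes through the per-section lookup again. */
static struct page *next_by_pfn(struct page *page, unsigned long nr)
{
	return model_pfn_to_page(page->pfn + nr);
}
```

In this model, stepping 4 pages from pfn 2 by pointer lands in the gap between the two maps, while the pfn-based step lands on the entry for pfn 6. Within a single section the two agree, which is why the restriction discussed above (no pages larger than MAX_ORDER in this configuration) would make the pointer form safe.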

Zi Yan March 1, 2021, 9:03 p.m. UTC | #3

On 1 Mar 2021, at 15:53, Matthew Wilcox wrote:

> On Mon, Mar 01, 2021 at 03:26:11PM -0500, Zi Yan wrote:
>>> +static inline struct folio *next_folio(struct folio *folio)
>>> +{
>>> +	return folio + folio_nr_pages(folio);
>>
>> Are you planning to make hugetlb use folio too?
>
> Eventually, probably.  It's not my focus.
>
>> If yes, this might not work if we have CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP
>> with a hugetlb folio > MAX_ORDER, because struct page might not be virtually contiguous.
>> See the experiment I did in [1].
>>
>> [1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/
>
> I thought we were going to forbid that configuration?  ie no pages
> larger than MAX_ORDER with (SPARSEMEM && !SPARSEMEM_VMEMMAP)
>
> https://lore.kernel.org/linux-mm/312AECBD-CA6D-4E93-A6C1-1DF87BABD92D@nvidia.com/
>
> is somewhere else we were discussing this.

That is my plan for 1GB THP, making it depend on SPARSEMEM_VMEMMAP,
otherwise the THP code will be too complicated to read. My concern
is just about using folio in hugetlb, since

If hugetlb is not using folio soon, the patch looks good to me.

Reviewed-by: Zi Yan <ziy@nvidia.com>


—
Best Regards,
Yan Zi

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2d6e715ab8ea..f20504017adf 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -924,6 +924,11 @@  static inline unsigned int compound_order(struct page *page)
 	return page[1].compound_order;
 }
 
+static inline unsigned int folio_order(struct folio *folio)
+{
+	return compound_order(&folio->page);
+}
+
 static inline bool hpage_pincount_available(struct page *page)
 {
 	/*
@@ -975,6 +980,26 @@  static inline unsigned int page_shift(struct page *page)
 
 void free_compound_page(struct page *page);
 
+static inline unsigned long folio_nr_pages(struct folio *folio)
+{
+	return compound_nr(&folio->page);
+}
+
+static inline struct folio *next_folio(struct folio *folio)
+{
+	return folio + folio_nr_pages(folio);
+}
+
+static inline unsigned int folio_shift(struct folio *folio)
+{
+	return PAGE_SHIFT + folio_order(folio);
+}
+
+static inline size_t folio_size(struct folio *folio)
+{
+	return PAGE_SIZE << folio_order(folio);
+}
+
 #ifdef CONFIG_MMU
 /*
  * Do pte_mkwrite, but only if the vma says VM_WRITE.  We do this when
@@ -1618,6 +1643,7 @@  extern void pagefault_out_of_memory(void);
 
 #define offset_in_page(p)	((unsigned long)(p) & ~PAGE_MASK)
 #define offset_in_thp(page, p)	((unsigned long)(p) & (thp_size(page) - 1))
+#define offset_in_folio(folio, p) ((unsigned long)(p) & (folio_size(folio) - 1))
 
 /*
  * Flags passed to show_mem() and show_free_areas() to suppress output in
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 07d9acb5b19c..875dc6cd6ad2 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -223,6 +223,23 @@  struct page {
 #endif
 } _struct_page_alignment;
 
+/*
+ * A struct folio is either a base (order-0) page or the head page of
+ * a compound page.
+ */
+struct folio {
+	struct page page;
+};
+
+static inline struct folio *page_folio(struct page *page)
+{
+	unsigned long head = READ_ONCE(page->compound_head);
+
+	if (unlikely(head & 1))
+		return (struct folio *)(head - 1);
+	return (struct folio *)page;
+}
+
 static inline atomic_t *compound_mapcount_ptr(struct page *page)
 {
 	return &page[1].compound_mapcount;
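
The size helpers in the patch all reduce to shifts by the folio order. The sketch below redoes that arithmetic in user space, assuming 4 KiB base pages (MODEL_PAGE_SHIFT of 12); the model_* names are hypothetical and take the order directly rather than a struct folio pointer.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */
#define MODEL_PAGE_SIZE ((size_t)1 << MODEL_PAGE_SHIFT)

/* Mirrors folio_shift(): PAGE_SHIFT + order. */
static unsigned int model_folio_shift(unsigned int order)
{
	return MODEL_PAGE_SHIFT + order;
}

/* Mirrors folio_size(): PAGE_SIZE << order. */
static size_t model_folio_size(unsigned int order)
{
	/* Guard the shift so the sketch never overflows size_t. */
	assert(model_folio_shift(order) < sizeof(size_t) * 8);
	return MODEL_PAGE_SIZE << order;
}

/* Number of base pages covered by a folio of the given order. */
static unsigned long model_folio_nr_pages(unsigned int order)
{
	return 1UL << order;
}

/* Mirrors offset_in_folio(): mask the address with (size - 1). */
static size_t model_offset_in_folio(unsigned int order, unsigned long addr)
{
	return addr & (model_folio_size(order) - 1);
}
```

Under that assumption an order-9 folio spans 512 base pages and 2 MiB, and masking address 0x201234 against it leaves offset 0x1234, whereas the same address masked against an order-0 folio leaves 0x234.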