diff mbox series

[v3,01/25] mm: Introduce struct folio

Message ID 20210128070404.1922318-2-willy@infradead.org (mailing list archive)
State New, archived
Series Page folios

Commit Message

Matthew Wilcox Jan. 28, 2021, 7:03 a.m. UTC
We have trouble keeping track of whether we've already called
compound_head() to ensure we're not operating on a tail page.  Further,
it's never clear whether we intend a struct page to refer to PAGE_SIZE
bytes or page_size(compound_head(page)).

Introduce a new type 'struct folio' that always refers to an entire
(possibly compound) page, and points to the head page (or base page).

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h       | 26 ++++++++++++++++++++++++++
 include/linux/mm_types.h | 17 +++++++++++++++++
 2 files changed, 43 insertions(+)
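
To illustrate the commit message, here is a minimal user-space sketch of how page_folio() resolves any page, head or tail, to its folio. The one-field struct page and the helper model_make_compound() are assumptions made for this example; they are not the kernel's real layout or API. The kernel version also uses READ_ONCE() and unlikely(), which are left out here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified stand-in for struct page: only the field that page_folio()
 * inspects. This is NOT the kernel's real layout.
 */
struct page {
	uintptr_t compound_head;	/* tail pages: address of head page | 1 */
};

struct folio {
	struct page page;
};

/* Same logic as page_folio() in the patch, minus READ_ONCE()/unlikely(). */
static inline struct folio *page_folio(struct page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)			/* bit 0 set: this is a tail page */
		return (struct folio *)(head - 1);
	return (struct folio *)page;	/* head page or base page */
}

/* Hypothetical helper: make pages[1..nr-1] tail pages of pages[0]. */
static inline void model_make_compound(struct page *pages, unsigned int nr)
{
	assert(nr >= 1);
	pages[0].compound_head = 0;
	for (unsigned int i = 1; i < nr; i++)
		pages[i].compound_head = (uintptr_t)&pages[0] | 1;
}
```

In this model, calling model_make_compound() on an array of four pages makes page_folio() return the address of pages[0] for every element. A lone page whose compound_head is 0 maps to itself.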

Comments

Zi Yan March 1, 2021, 8:26 p.m. UTC | #1
On 28 Jan 2021, at 2:03, Matthew Wilcox (Oracle) wrote:

> We have trouble keeping track of whether we've already called
> compound_head() to ensure we're not operating on a tail page.  Further,
> it's never clear whether we intend a struct page to refer to PAGE_SIZE
> bytes or page_size(compound_head(page)).
>
> Introduce a new type 'struct folio' that always refers to an entire
> (possibly compound) page, and points to the head page (or base page).
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/mm.h       | 26 ++++++++++++++++++++++++++
>  include/linux/mm_types.h | 17 +++++++++++++++++
>  2 files changed, 43 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 2d6e715ab8ea..f20504017adf 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -924,6 +924,11 @@ static inline unsigned int compound_order(struct page *page)
>  	return page[1].compound_order;
>  }
>
> +static inline unsigned int folio_order(struct folio *folio)
> +{
> +	return compound_order(&folio->page);
> +}
> +
>  static inline bool hpage_pincount_available(struct page *page)
>  {
>  	/*
> @@ -975,6 +980,26 @@ static inline unsigned int page_shift(struct page *page)
>
>  void free_compound_page(struct page *page);
>
> +static inline unsigned long folio_nr_pages(struct folio *folio)
> +{
> +	return compound_nr(&folio->page);
> +}
> +
> +static inline struct folio *next_folio(struct folio *folio)
> +{
> +	return folio + folio_nr_pages(folio);

Are you planning to make hugetlb use folio too?

If yes, this might not work if we have CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP
with a hugetlb folio > MAX_ORDER, because struct page might not be virtually contiguous.
See the experiment I did in [1].


[1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/
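
The concern above can be sketched in user space. This is a toy model, not kernel code: the section size, the two separately defined mem_map arrays and every model_* name are assumptions chosen for illustration. It shows that plain pointer arithmetic on struct page is only meaningful while the step stays inside one section's map, whereas a lookup through the page frame number works across sections.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGES_PER_SECTION	4UL	/* tiny on purpose; real sections are far larger */
#define MODEL_NR_SECTIONS	2UL

struct page { unsigned long flags; };	/* stand-in, not the real struct page */

/* One mem_map per section, with no guarantee that they are adjacent. */
static struct page model_map0[MODEL_PAGES_PER_SECTION];
static struct page model_map1[MODEL_PAGES_PER_SECTION];
static struct page *const model_section_map[MODEL_NR_SECTIONS] = {
	model_map0, model_map1,
};

/* Look a page up by pfn: pick the section first, then index into its map. */
static inline struct page *model_pfn_to_page(unsigned long pfn)
{
	assert(pfn < MODEL_NR_SECTIONS * MODEL_PAGES_PER_SECTION);
	return &model_section_map[pfn / MODEL_PAGES_PER_SECTION]
				 [pfn % MODEL_PAGES_PER_SECTION];
}

/*
 * Returns 1 if "page + nr" stays inside the section that holds pfn, i.e.
 * if plain pointer arithmetic gives the same answer as a pfn lookup.
 */
static inline int model_same_section(unsigned long pfn, unsigned long nr)
{
	return pfn / MODEL_PAGES_PER_SECTION ==
	       (pfn + nr) / MODEL_PAGES_PER_SECTION;
}
```

With these toy sizes, stepping two pages from pfn 0 stays inside model_map0, so pointer arithmetic and the pfn lookup agree. Stepping four pages crosses into the second section, where only the pfn lookup is guaranteed to land on the right struct page.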


—
Best Regards,
Yan Zi

Matthew Wilcox March 1, 2021, 8:53 p.m. UTC | #2
On Mon, Mar 01, 2021 at 03:26:11PM -0500, Zi Yan wrote:
> > +static inline struct folio *next_folio(struct folio *folio)
> > +{
> > +	return folio + folio_nr_pages(folio);
> 
> Are you planning to make hugetlb use folio too?

Eventually, probably.  It's not my focus.

> If yes, this might not work if we have CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP
> with a hugetlb folio > MAX_ORDER, because struct page might not be virtually contiguous.
> See the experiment I did in [1].
> 
> [1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/

I thought we were going to forbid that configuration?  ie no pages
larger than MAX_ORDER with (SPARSEMEM && !SPARSEMEM_VMEMMAP)

https://lore.kernel.org/linux-mm/312AECBD-CA6D-4E93-A6C1-1DF87BABD92D@nvidia.com/

is somewhere else we were discussing this.

Zi Yan March 1, 2021, 9:03 p.m. UTC | #3
On 1 Mar 2021, at 15:53, Matthew Wilcox wrote:

> On Mon, Mar 01, 2021 at 03:26:11PM -0500, Zi Yan wrote:
>>> +static inline struct folio *next_folio(struct folio *folio)
>>> +{
>>> +	return folio + folio_nr_pages(folio);
>>
>> Are you planning to make hugetlb use folio too?
>
> Eventually, probably.  It's not my focus.
>
>> If yes, this might not work if we have CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP
>> with a hugetlb folio > MAX_ORDER, because struct page might not be virtually contiguous.
>> See the experiment I did in [1].
>>
>> [1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/
>
> I thought we were going to forbid that configuration?  ie no pages
> larger than MAX_ORDER with (SPARSEMEM && !SPARSEMEM_VMEMMAP)
>
> https://lore.kernel.org/linux-mm/312AECBD-CA6D-4E93-A6C1-1DF87BABD92D@nvidia.com/
>
> is somewhere else we were discussing this.

That is my plan for 1GB THP, making it depend on SPARSEMEM_VMEMMAP,
otherwise the THP code will be too complicated to read. My concern
is just about using folio in hugetlb, since

If hugetlb is not using folio soon, the patch looks good to me.

Reviewed-by: Zi Yan <ziy@nvidia.com>


—
Best Regards,
Yan Zi

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2d6e715ab8ea..f20504017adf 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -924,6 +924,11 @@  static inline unsigned int compound_order(struct page *page)
 	return page[1].compound_order;
 }
 
+static inline unsigned int folio_order(struct folio *folio)
+{
+	return compound_order(&folio->page);
+}
+
 static inline bool hpage_pincount_available(struct page *page)
 {
 	/*
@@ -975,6 +980,26 @@  static inline unsigned int page_shift(struct page *page)
 
 void free_compound_page(struct page *page);
 
+static inline unsigned long folio_nr_pages(struct folio *folio)
+{
+	return compound_nr(&folio->page);
+}
+
+static inline struct folio *next_folio(struct folio *folio)
+{
+	return folio + folio_nr_pages(folio);
+}
+
+static inline unsigned int folio_shift(struct folio *folio)
+{
+	return PAGE_SHIFT + folio_order(folio);
+}
+
+static inline size_t folio_size(struct folio *folio)
+{
+	return PAGE_SIZE << folio_order(folio);
+}
+
 #ifdef CONFIG_MMU
 /*
  * Do pte_mkwrite, but only if the vma says VM_WRITE.  We do this when
@@ -1618,6 +1643,7 @@  extern void pagefault_out_of_memory(void);
 
 #define offset_in_page(p)	((unsigned long)(p) & ~PAGE_MASK)
 #define offset_in_thp(page, p)	((unsigned long)(p) & (thp_size(page) - 1))
+#define offset_in_folio(folio, p) ((unsigned long)(p) & (folio_size(folio) - 1))
 
 /*
  * Flags passed to show_mem() and show_free_areas() to suppress output in
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 07d9acb5b19c..875dc6cd6ad2 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -223,6 +223,23 @@  struct page {
 #endif
 } _struct_page_alignment;
 
+/*
+ * A struct folio is either a base (order-0) page or the head page of
+ * a compound page.
+ */
+struct folio {
+	struct page page;
+};
+
+static inline struct folio *page_folio(struct page *page)
+{
+	unsigned long head = READ_ONCE(page->compound_head);
+
+	if (unlikely(head & 1))
+		return (struct folio *)(head - 1);
+	return (struct folio *)page;
+}
+
 static inline atomic_t *compound_mapcount_ptr(struct page *page)
 {
 	return &page[1].compound_mapcount;
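
As a worked example of the size helpers in this patch, the sketch below redoes the arithmetic of folio_shift(), folio_size() and offset_in_folio() in user space. It takes the folio order directly instead of a struct folio and assumes 4 KiB base pages (PAGE_SHIFT of 12). The MODEL_* and model_* names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT	12	/* assumption: 4 KiB base pages */
#define MODEL_PAGE_SIZE		((size_t)1 << MODEL_PAGE_SHIFT)

/* Mirrors folio_shift(): log2 of the folio size in bytes. */
static inline unsigned int model_folio_shift(unsigned int order)
{
	return MODEL_PAGE_SHIFT + order;
}

/* Mirrors folio_size(): number of bytes covered by the folio. */
static inline size_t model_folio_size(unsigned int order)
{
	/* Guard against shifting past the width of size_t. */
	assert(order < sizeof(size_t) * 8 - MODEL_PAGE_SHIFT);
	return MODEL_PAGE_SIZE << order;
}

/* Mirrors offset_in_folio(): byte offset of addr within its folio. */
static inline size_t model_offset_in_folio(unsigned int order,
					   unsigned long addr)
{
	return addr & (model_folio_size(order) - 1);
}
```

Under these assumptions an order-9 folio has a shift of 21 and a size of 2097152 bytes (2 MiB). The address 0x201234 is at offset 0x1234 within an order-9 folio, and at offset 0x234 within an order-0 folio.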