diff mbox series

[RFC,4/5] mm: truncate: split huge page cache page to a non-zero order if possible.

Message ID 20220321142128.2471199-5-zi.yan@sent.com (mailing list archive)
State New
Headers show
Series Split a huge page to any lower order pages | expand

Commit Message

Zi Yan March 21, 2022, 2:21 p.m. UTC
From: Zi Yan <ziy@nvidia.com>

To minimize the number of pages after a huge page truncation, we do not
need to split it all the way down to order-0. The huge page has at most
three parts, the part before offset, the part to be truncated, the part
remaining at the end. Find the greatest common power of two multiplier of
the non-zero values of them as the new order, so we can split the huge
page to this order and keep the remaining pages as large and as few as
possible.

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/huge_memory.c |  1 +
 mm/truncate.c    | 33 +++++++++++++++++++++++++++++++--
 2 files changed, 32 insertions(+), 2 deletions(-)

Comments

Roman Gushchin March 21, 2022, 10:32 p.m. UTC | #1
On Mon, Mar 21, 2022 at 10:21:27AM -0400, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
> 
> To minimize the number of pages after a huge page truncation, we do not
> need to split it all the way down to order-0. The huge page has at most
> three parts, the part before offset, the part to be truncated, the part
> remaining at the end. Find the greatest common power of two multiplier of
> the non-zero values of them as the new order, so we can split the huge
> page to this order and keep the remaining pages as large and as few as
> possible.

Would you mind please to describe the algorithm in more details?

Thanks!
Zi Yan March 22, 2022, 2:19 p.m. UTC | #2
On 21 Mar 2022, at 18:32, Roman Gushchin wrote:

> On Mon, Mar 21, 2022 at 10:21:27AM -0400, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> To minimize the number of pages after a huge page truncation, we do not
>> need to split it all the way down to order-0. The huge page has at most
>> three parts, the part before offset, the part to be truncated, the part
>> remaining at the end. Find the greatest common power of two multiplier of
>> the non-zero values of them as the new order, so we can split the huge
>> page to this order and keep the remaining pages as large and as few as
>> possible.
>
> Would you mind please to describe the algorithm in more details?

Sure.

During truncation, there can be three parts in a huge page:
1. the _offset_ from the beginning of the huge page,
2. the _length_ of the to-be-truncated part,
3. the _remaining_ part after the to-be-truncated part.

the size of the split huge page need to be the greatest common divisor
of the non-zero ones of three after being rounded down to power of two.

OK, I actually find there is a gcd function. I think the algorithm can
simplified to

new_order = ilog2(gcd(gcd(offset, length), remaining)) - PAGE_SHIFT;

I will update the code, the commit message, and the comment in the next
version.

Thank you for the comment.

--
Best Regards,
Yan, Zi
diff mbox series

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3617aa3ad0b1..76db0092a1e2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2349,6 +2349,7 @@  static void __split_huge_page_tail(struct page *head, int tail,
 		prep_compound_page(page_tail, new_order);
 		prep_transhuge_page(page_tail);
 	}
+	VM_BUG_ON_PAGE(PageTail(page_tail), page_tail);
 
 	/* Finally unfreeze refcount. Additional reference from page cache. */
 	page_ref_unfreeze(page_tail, 1 + ((!PageAnon(head) ||
diff --git a/mm/truncate.c b/mm/truncate.c
index ab50d0d59a2a..4f71e67dec09 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -197,6 +197,14 @@  int truncate_inode_folio(struct address_space *mapping, struct folio *folio)
 	return 0;
 }
 
+static unsigned int greatest_pow_of_two_multiplier(unsigned int num)
+{
+	if (num & 1)
+		return 0;
+	return min_t(unsigned int, ilog2(num),
+		ilog2(num - rounddown_pow_of_two(num)));
+}
+
 /*
  * Handle partial folios.  The folio may be entirely within the
  * range if a split has raced with us.  If not, we zero the part of the
@@ -211,7 +219,8 @@  int truncate_inode_folio(struct address_space *mapping, struct folio *folio)
 bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
 {
 	loff_t pos = folio_pos(folio);
-	unsigned int offset, length;
+	unsigned int offset, length, remaining;
+	unsigned int new_order = folio_order(folio);
 
 	if (pos < start)
 		offset = start - pos;
@@ -222,6 +231,7 @@  bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
 		length = length - offset;
 	else
 		length = end + 1 - pos - offset;
+	remaining = folio_size(folio) - offset - length;
 
 	folio_wait_writeback(folio);
 	if (length == folio_size(folio)) {
@@ -236,11 +246,30 @@  bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
 	 */
 	folio_zero_range(folio, offset, length);
 
+	/*
+	 * Find the greatest common power of two multiplier of the non-zero
+	 * offset, length, and remaining as the new order. So we can truncate
+	 * a subpage as large as possible.
+	 */
+	if (offset)
+		new_order = greatest_pow_of_two_multiplier(offset / PAGE_SIZE);
+	if (length)
+		new_order = min_t(unsigned int, new_order,
+			greatest_pow_of_two_multiplier(length / PAGE_SIZE));
+	if (remaining)
+		new_order = min_t(unsigned int, new_order,
+			greatest_pow_of_two_multiplier(remaining / PAGE_SIZE));
+
+	/* order-1 THP not supported, downgrade to order-0 */
+	if (new_order == 1)
+		new_order = 0;
+
+
 	if (folio_has_private(folio))
 		folio_invalidate(folio, offset, length);
 	if (!folio_test_large(folio))
 		return true;
-	if (split_huge_page(&folio->page) == 0)
+	if (split_huge_page_to_list_to_order(&folio->page, NULL, new_order) == 0)
 		return true;
 	if (folio_test_dirty(folio))
 		return false;