diff mbox series

[RESEND,v3,6/9] mm/truncate: use folio_split() for truncate operation.

Message ID 20241205001839.2582020-7-ziy@nvidia.com (mailing list archive)
State New
Series Buddy allocator like folio split

Commit Message

Zi Yan Dec. 5, 2024, 12:18 a.m. UTC
Instead of splitting the large folio uniformly during truncation, use a
buddy-allocator-like split at the start of the truncation range to minimize
the number of resulting folios.

For example, to truncate an order-4 folio
[0, 1, 2, 3, 4, 5, ..., 15]
between [3, 10] (inclusive), folio_split() splits the folio into
[0,1], [2], [3], [4..7], [8..15]; [3] and [4..7] can be dropped and
[8..15] is kept with zeros in [8..10].

It is possible to do a further folio_split() at 10 so that more of the
resulting folios can be dropped, but that is left as a possible future
optimization if needed.

Another possible optimization is to make folio_split() split a folio
based on a given range, like [3..10] above. But that complicates
folio_split(), so it will be investigated when necessary.

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 include/linux/huge_mm.h | 18 ++++++++++++++++++
 mm/truncate.c           |  5 ++++-
 2 files changed, 22 insertions(+), 1 deletion(-)

Comments

David Hildenbrand Dec. 10, 2024, 8:12 p.m. UTC | #1
On 05.12.24 01:18, Zi Yan wrote:
> Instead of splitting the large folio uniformly during truncation, use a
> buddy-allocator-like split at the start of the truncation range to minimize
> the number of resulting folios.
> 
> For example, to truncate an order-4 folio
> [0, 1, 2, 3, 4, 5, ..., 15]
> between [3, 10] (inclusive), folio_split() splits the folio into
> [0,1], [2], [3], [4..7], [8..15]; [3] and [4..7] can be dropped and
> [8..15] is kept with zeros in [8..10].

But isn't that making things worse than they are today? Imagine 
fallocate() on a shmem file, where we won't be freeing memory?
Zi Yan Dec. 10, 2024, 8:41 p.m. UTC | #2
On 10 Dec 2024, at 15:12, David Hildenbrand wrote:

> On 05.12.24 01:18, Zi Yan wrote:
>> Instead of splitting the large folio uniformly during truncation, use a
>> buddy-allocator-like split at the start of the truncation range to minimize
>> the number of resulting folios.
>>
>> For example, to truncate an order-4 folio
>> [0, 1, 2, 3, 4, 5, ..., 15]
>> between [3, 10] (inclusive), folio_split() splits the folio into
>> [0,1], [2], [3], [4..7], [8..15]; [3] and [4..7] can be dropped and
>> [8..15] is kept with zeros in [8..10].
>
> But isn't that making things worse than they are today? Imagine fallocate() on a shmem file, where we won't be freeing memory?

You mean [8..10] are kept? Yes, it is worse, and the solution would be
to split at both 3 and 10. For now folio_split() returns -EINVAL for
shmem mappings, but that means I have a bug in this patch: the newly added
split_folio_at() needs to fall back to a uniform split if the
buddy-allocator-like split returns -EINVAL; otherwise, shmem truncation
will no longer split folios after this patch.

Thank you for checking the patch. I will fix it in the next version.

In terms of [8..10] not being freed, I need to think about a proper
interface for passing more than one split point as a future improvement.

Best Regards,
Yan, Zi
Zi Yan Dec. 10, 2024, 8:50 p.m. UTC | #3
On 10 Dec 2024, at 15:41, Zi Yan wrote:

> On 10 Dec 2024, at 15:12, David Hildenbrand wrote:
>
>> On 05.12.24 01:18, Zi Yan wrote:
>>> Instead of splitting the large folio uniformly during truncation, use a
>>> buddy-allocator-like split at the start of the truncation range to minimize
>>> the number of resulting folios.
>>>
>>> For example, to truncate an order-4 folio
>>> [0, 1, 2, 3, 4, 5, ..., 15]
>>> between [3, 10] (inclusive), folio_split() splits the folio into
>>> [0,1], [2], [3], [4..7], [8..15]; [3] and [4..7] can be dropped and
>>> [8..15] is kept with zeros in [8..10].
>>
>> But isn't that making things worse than they are today? Imagine fallocate() on a shmem file, where we won't be freeing memory?
>
> You mean [8..10] are kept? Yes, it is worse, and the solution would be
> to split at both 3 and 10. For now folio_split() returns -EINVAL for
> shmem mappings, but that means I have a bug in this patch: the newly added
> split_folio_at() needs to fall back to a uniform split if the
> buddy-allocator-like split returns -EINVAL; otherwise, shmem truncation
> will no longer split folios after this patch.
>
> Thank you for checking the patch. I will fix it in the next version.

I am going to add two functions, split_huge_page_supported(folio, new_order)
and folio_split_support(folio, new_order), to perform the order and folio->mapping
checks at the beginning of __folio_split(), so truncate and other potential
callers can make the right function call.


Best Regards,
Yan, Zi

Patch

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index b94c2e8ee918..29accb5d93b8 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -339,6 +339,18 @@  int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		unsigned int new_order);
 int min_order_for_split(struct folio *folio);
 int split_folio_to_list(struct folio *folio, struct list_head *list);
+int folio_split(struct folio *folio, unsigned int new_order, struct page *page,
+		struct list_head *list);
+static inline int split_folio_at(struct folio *folio, struct page *page,
+		struct list_head *list)
+{
+	int ret = min_order_for_split(folio);
+
+	if (ret < 0)
+		return ret;
+
+	return folio_split(folio, ret, page, list);
+}
 static inline int split_huge_page(struct page *page)
 {
 	struct folio *folio = page_folio(page);
@@ -531,6 +543,12 @@  static inline int split_folio_to_list(struct folio *folio, struct list_head *lis
 	return 0;
 }
 
+static inline int split_folio_at(struct folio *folio, struct page *page,
+		struct list_head *list)
+{
+	return 0;
+}
+
 static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
 #define split_huge_pmd(__vma, __pmd, __address)	\
 	do { } while (0)
diff --git a/mm/truncate.c b/mm/truncate.c
index 7c304d2f0052..9f33d6821748 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -178,6 +178,7 @@  bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
 {
 	loff_t pos = folio_pos(folio);
 	unsigned int offset, length;
+	long in_folio_offset;
 
 	if (pos < start)
 		offset = start - pos;
@@ -207,7 +208,9 @@  bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
 		folio_invalidate(folio, offset, length);
 	if (!folio_test_large(folio))
 		return true;
-	if (split_folio(folio) == 0)
+
+	in_folio_offset = PAGE_ALIGN_DOWN(offset) / PAGE_SIZE;
+	if (split_folio_at(folio, folio_page(folio, in_folio_offset), NULL) == 0)
 		return true;
 	if (folio_test_dirty(folio))
 		return false;