
[v2,3/3] mm,thp,rmap: clean up the end of __split_huge_pmd_locked()

Message ID d43748aa-fece-e0b9-c4ab-f23c9ebc9011@google.com (mailing list archive)
State New
Series mm,thp,rmap: rework the use of subpages_mapcount

Commit Message

Hugh Dickins Nov. 22, 2022, 9:51 a.m. UTC
It's hard to add a page_add_anon_rmap() into __split_huge_pmd_locked()'s
HPAGE_PMD_NR set_pte_at() loop, without wincing at the "freeze" case's
HPAGE_PMD_NR page_remove_rmap() loop below it.

It's just a mistake to add rmaps in the "freeze" (insert migration entries
prior to splitting huge page) case: the pmd_migration case already avoids
doing that, so just follow its lead.  page_ref_add() versus put_page()
likewise.  But why is one more put_page() needed in the "freeze" case?
Because it is removing the pmd rmap here, which in the pmd_migration case
was already removed earlier (freeze and pmd_migration are mutually
exclusive cases).

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
v2: same as v1, plus Ack from Kirill

 mm/huge_memory.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)
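
To make the accounting in the commit message concrete, here is a toy
userspace model of the "freeze" case before and after the patch.  It is
only a sketch built on invented stand-ins (struct toy_page,
old_freeze_path(), new_freeze_path()), not kernel code: both paths leave
the page in the same state, but the new one skips the HPAGE_PMD_NR
spurious rmap add/remove pairs and the per-subpage reference churn.

/* toy_split_refs.c - build with: cc -o toy_split_refs toy_split_refs.c */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define HPAGE_PMD_NR 512        /* 2MB THP with 4kB base pages, e.g. x86_64 */

struct toy_page {
        int  refcount;          /* stand-in for page_count()        */
        int  pte_rmaps;         /* stand-in for pte-level mapcounts */
        bool pmd_mapped;        /* stand-in for the compound rmap   */
};

/* The "freeze" path as __split_huge_pmd_locked() ended before the patch. */
static void old_freeze_path(struct toy_page *p)
{
        int i;

        p->refcount += HPAGE_PMD_NR - 1;        /* page_ref_add()             */
        for (i = 0; i < HPAGE_PMD_NR; i++)
                p->pte_rmaps++;                 /* page_add_anon_rmap()       */
        p->pmd_mapped = false;                  /* page_remove_rmap(compound) */
        for (i = 0; i < HPAGE_PMD_NR; i++) {
                p->pte_rmaps--;                 /* page_remove_rmap()         */
                p->refcount--;                  /* put_page()                 */
        }
}

/* The "freeze" path after the patch: no per-subpage churn at all. */
static void new_freeze_path(struct toy_page *p)
{
        p->pmd_mapped = false;                  /* page_remove_rmap(compound) */
        p->refcount--;                          /* the single put_page()      */
}

int main(void)
{
        struct toy_page a = { .refcount = 2, .pte_rmaps = 0, .pmd_mapped = true };
        struct toy_page b = a;

        old_freeze_path(&a);
        new_freeze_path(&b);

        /* Same net effect: one reference dropped, pmd rmap gone, no pte rmaps. */
        assert(a.refcount == b.refcount);
        assert(a.pte_rmaps == b.pte_rmaps && a.pte_rmaps == 0);
        assert(a.pmd_mapped == b.pmd_mapped && !b.pmd_mapped);
        printf("freeze case: refcount 2 -> %d either way\n", b.refcount);
        return 0;
}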

Comments

Hugh Dickins Dec. 5, 2022, 1:38 a.m. UTC | #1
On Tue, 22 Nov 2022, Hugh Dickins wrote:

> It's hard to add a page_add_anon_rmap() into __split_huge_pmd_locked()'s
> HPAGE_PMD_NR set_pte_at() loop, without wincing at the "freeze" case's
> HPAGE_PMD_NR page_remove_rmap() loop below it.

No problem here, but I did later learn something worth sharing.

Comparing before and after vmstats for the series, I was worried to find
the thp_deferred_split_page count consistently much lower afterwards
(10%? 1%?), and thought maybe the COMPOUND_MAPPED patch had messed up
the accounting for when to call deferred_split_huge_page().

But no: that's as before.  We can debate sometime whether it could do a
better job - the vast majority of calls to deferred_split_huge_page() are
just repeats - but that's a different story, one I'm not keen to get into
at the moment.

> -	if (freeze) {
> -		for (i = 0; i < HPAGE_PMD_NR; i++) {
> -			page_remove_rmap(page + i, vma, false);
> -			put_page(page + i);
> -		}
> -	}

The reason for the lower thp_deferred_split_page (at least in the kind
of testing I was doing) was a very good thing: those page_remove_rmap()
calls from __split_huge_pmd_locked() had very often been adding the page
to the deferred split queue, precisely while it was already being split.

The list management is such that there was no corruption, and splitting
calls from the split queue itself did not reach the point of bumping up
the thp_deferred_split_page count; but off-queue splits would add the
page before deleting it again, adding lots of noise to the count and, I
presume, unnecessary contention on the queue lock.

Hugh
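
To illustrate the counter noise described above - the old per-subpage
page_remove_rmap() loop could newly add the page to the deferred split
queue in the middle of an off-queue split, bumping thp_deferred_split_page,
while splits driven from the queue itself never bump it because the page is
already queued - here is another toy userspace model.  The names (toy_thp,
queue_for_deferred_split(), off_queue_split()) are invented stand-ins; the
kernel's real list management and locking are not modelled.

/* toy_deferred_split.c - build with: cc -o toy_deferred_split toy_deferred_split.c */
#include <stdbool.h>
#include <stdio.h>

static long thp_deferred_split_page;    /* the vmstat counter being modelled */

struct toy_thp {
        bool queued;                    /* on the deferred split queue? */
};

/* Models deferred_split_huge_page(): only a genuinely new queueing of the
 * page bumps the counter; repeat calls while already queued do nothing.   */
static void queue_for_deferred_split(struct toy_thp *t)
{
        if (!t->queued) {
                t->queued = true;
                thp_deferred_split_page++;
        }
}

/* Models a split_huge_page() call not driven from the deferred split queue:
 * the split ends by taking the page off the queue either way, but the old
 * end of __split_huge_pmd_locked() queued it first, mid-split.             */
static void off_queue_split(struct toy_thp *t, bool old_behaviour)
{
        if (old_behaviour)
                queue_for_deferred_split(t);    /* the spurious add        */
        t->queued = false;                      /* split deletes it again  */
}

int main(void)
{
        struct toy_thp t = { .queued = false };
        int i;

        for (i = 0; i < 1000; i++)
                off_queue_split(&t, true);
        printf("old behaviour: %ld spurious bumps\n", thp_deferred_split_page);

        thp_deferred_split_page = 0;
        for (i = 0; i < 1000; i++)
                off_queue_split(&t, false);
        printf("new behaviour: %ld spurious bumps\n", thp_deferred_split_page);
        return 0;
}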

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3dee8665c585..ab5ab1a013e1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2135,7 +2135,6 @@  static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		uffd_wp = pmd_uffd_wp(old_pmd);
 
 		VM_BUG_ON_PAGE(!page_count(page), page);
-		page_ref_add(page, HPAGE_PMD_NR - 1);
 
 		/*
 		 * Without "freeze", we'll simply split the PMD, propagating the
@@ -2155,6 +2154,8 @@  static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		anon_exclusive = PageAnon(page) && PageAnonExclusive(page);
 		if (freeze && anon_exclusive && page_try_share_anon_rmap(page))
 			freeze = false;
+		if (!freeze)
+			page_ref_add(page, HPAGE_PMD_NR - 1);
 	}
 
 	/*
@@ -2210,27 +2211,21 @@  static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 				entry = pte_mksoft_dirty(entry);
 			if (uffd_wp)
 				entry = pte_mkuffd_wp(entry);
+			page_add_anon_rmap(page + i, vma, addr, false);
 		}
 		pte = pte_offset_map(&_pmd, addr);
 		BUG_ON(!pte_none(*pte));
 		set_pte_at(mm, addr, pte, entry);
-		if (!pmd_migration)
-			page_add_anon_rmap(page + i, vma, addr, false);
 		pte_unmap(pte);
 	}
 
 	if (!pmd_migration)
 		page_remove_rmap(page, vma, true);
+	if (freeze)
+		put_page(page);
 
 	smp_wmb(); /* make pte visible before pmd */
 	pmd_populate(mm, pmd, pgtable);
-
-	if (freeze) {
-		for (i = 0; i < HPAGE_PMD_NR; i++) {
-			page_remove_rmap(page + i, vma, false);
-			put_page(page + i);
-		}
-	}
 }
 
 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,