[v3,2/3] mm: Implement folio_remove_rmap_range()

Message ID 20230720112955.643283-3-ryan.roberts@arm.com (mailing list archive)
State New
Series Optimize large folio interaction with deferred split

Commit Message

Ryan Roberts July 20, 2023, 11:29 a.m. UTC
Like page_remove_rmap() but batch-removes the rmap for a range of pages
belonging to a folio. This can provide a small speedup due to less
manipulation of the various counters. But more crucially, if removing the
rmap for all pages of a folio in a batch, there is no need to
(spuriously) add it to the deferred split list, which saves significant
cost when there is contention for the split queue lock.

All contained pages are accounted using the order-0 folio (or base page)
scheme.

page_remove_rmap() is refactored so that it forwards to
folio_remove_rmap_range() for !compound cases, and both functions now
share a common epilogue function. The intention here is to avoid
duplication of code.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 include/linux/rmap.h |   2 +
 mm/rmap.c            | 125 ++++++++++++++++++++++++++++++++-----------
 2 files changed, 97 insertions(+), 30 deletions(-)
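
To illustrate the deferred split argument from the commit message, here is a
small userspace model (not kernel code). The model_* names, the fixed folio
size and the plain int counters are assumptions made for this sketch. It
follows the rule used in this patch: queue the folio only when a removal
leaves it partially mapped.

```c
#include <assert.h>

#define MODEL_NR_PAGES 16	/* hypothetical folio size for the model */

/* Toy stand-in for a large anon folio; not the kernel's struct folio. */
struct model_folio {
	int mapcount[MODEL_NR_PAGES];	/* per-page map count, 0 = unmapped */
	int nr_pages_mapped;		/* pages with at least one mapping */
	int split_queue_calls;		/* times deferred split queueing was attempted */
};

/* Map every page of the folio exactly once and reset the statistics. */
static void model_folio_map_all(struct model_folio *f)
{
	for (int i = 0; i < MODEL_NR_PAGES; i++)
		f->mapcount[i] = 1;
	f->nr_pages_mapped = MODEL_NR_PAGES;
	f->split_queue_calls = 0;
}

/* Remove one mapping from each page in [first, first + nr) as a batch. */
static void model_remove_range(struct model_folio *f, int first, int nr)
{
	int nr_unmapped = 0;

	assert(first >= 0 && nr > 0 && first + nr <= MODEL_NR_PAGES);

	for (int i = first; i < first + nr; i++)
		if (--f->mapcount[i] == 0)
			nr_unmapped++;

	f->nr_pages_mapped -= nr_unmapped;

	/* Queue only if the folio is left partially mapped. */
	if (nr_unmapped && f->nr_pages_mapped)
		f->split_queue_calls++;
}
```

In this model, unmapping a fully mapped folio one page at a time attempts to
queue it MODEL_NR_PAGES - 1 times, while a single call covering the whole
folio never does, because the folio is never observed partially mapped.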

Comments

Yu Zhao July 26, 2023, 5:53 a.m. UTC | #1
On Thu, Jul 20, 2023 at 5:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> Like page_remove_rmap() but batch-removes the rmap for a range of pages
> belonging to a folio. This can provide a small speedup due to less
> manipulation of the various counters. But more crucially, if removing the
> rmap for all pages of a folio in a batch, there is no need to
> (spuriously) add it to the deferred split list, which saves significant
> cost when there is contention for the split queue lock.
>
> All contained pages are accounted using the order-0 folio (or base page)
> scheme.
>
> page_remove_rmap() is refactored so that it forwards to
> folio_remove_rmap_range() for !compound cases, and both functions now
> share a common epilogue function. The intention here is to avoid
> duplication of code.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>  include/linux/rmap.h |   2 +
>  mm/rmap.c            | 125 ++++++++++++++++++++++++++++++++-----------
>  2 files changed, 97 insertions(+), 30 deletions(-)
>
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index b87d01660412..f578975c12c0 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -200,6 +200,8 @@ void page_add_file_rmap(struct page *, struct vm_area_struct *,
>                 bool compound);
>  void page_remove_rmap(struct page *, struct vm_area_struct *,
>                 bool compound);
> +void folio_remove_rmap_range(struct folio *folio, struct page *page,
> +               int nr, struct vm_area_struct *vma);

I prefer folio_remove_rmap_range(page, nr, vma). Passing both the
folio and the starting page seems redundant to me.

Matthew, is there a convention (function names, parameters, etc.) for
operations on a range of pages within a folio?

And regarding the refactor, what I have in mind is that
folio_remove_rmap_range() is the core API and page_remove_rmap() is
just a wrapper around it, i.e., folio_remove_rmap_range(page, 1, vma).

Let me post a diff later and see if it makes sense to you.
Ryan Roberts July 26, 2023, 6:42 a.m. UTC | #2
On 26/07/2023 06:53, Yu Zhao wrote:
> On Thu, Jul 20, 2023 at 5:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> Like page_remove_rmap() but batch-removes the rmap for a range of pages
>> belonging to a folio. This can provide a small speedup due to less
>> manipulation of the various counters. But more crucially, if removing the
>> rmap for all pages of a folio in a batch, there is no need to
>> (spuriously) add it to the deferred split list, which saves significant
>> cost when there is contention for the split queue lock.
>>
>> All contained pages are accounted using the order-0 folio (or base page)
>> scheme.
>>
>> page_remove_rmap() is refactored so that it forwards to
>> folio_remove_rmap_range() for !compound cases, and both functions now
>> share a common epilogue function. The intention here is to avoid
>> duplication of code.
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>>  include/linux/rmap.h |   2 +
>>  mm/rmap.c            | 125 ++++++++++++++++++++++++++++++++-----------
>>  2 files changed, 97 insertions(+), 30 deletions(-)
>>
>> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
>> index b87d01660412..f578975c12c0 100644
>> --- a/include/linux/rmap.h
>> +++ b/include/linux/rmap.h
>> @@ -200,6 +200,8 @@ void page_add_file_rmap(struct page *, struct vm_area_struct *,
>>                 bool compound);
>>  void page_remove_rmap(struct page *, struct vm_area_struct *,
>>                 bool compound);
>> +void folio_remove_rmap_range(struct folio *folio, struct page *page,
>> +               int nr, struct vm_area_struct *vma);
> 
> I prefer folio_remove_rmap_range(page, nr, vma). Passing both the
> folio and the starting page seems redundant to me.

I prefer to pass folio explicitly because it makes it clear that all pages in
the range must belong to the same folio.

> 
> Matthew, is there a convention (function names, parameters, etc.) for
> operations on a range of pages within a folio?
> 
> And regarding the refactor, what I have in mind is that
> folio_remove_rmap_range() is the core API and page_remove_rmap() is
> just a wrapper around it, i.e., folio_remove_rmap_range(page, 1, vma).

I tried to do it that way, but the existing page_remove_rmap() also takes a
'compound' parameter; it can operate on compound, thp pages and uses the
alternative accounting scheme in this case.

I could add a compound parameter to folio_remove_rmap_range() but in that case
the range parameters don't make sense - when compound is true we are implicitly
operating on the whole folio due to the way the accounting is done. So I felt it
was clearer for folio_remove_rmap_range() to deal with small page accounting
only. page_remove_rmap() forwards to folio_remove_rmap_range() when
compound=false and page_remove_rmap() directly deals with the thp accounting
when compound=true.
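
The forwarding described above can be sketched in userspace roughly as follows.
This is only an illustration: the model_* names, the fixed-size struct and the
simplified counters (0 means unmapped, with no statistics or locking) are
assumptions for the sketch, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_PAGES 8	/* hypothetical folio size for the model */

/* Toy folio with the two accounting schemes kept in separate counters. */
struct model_folio {
	int page_mapcount[MODEL_NR_PAGES];	/* small-page (PTE) accounting */
	int entire_mapcount;			/* whole-folio (PMD) accounting */
};

/* Range API: small-page accounting only, for pages [first, first + nr). */
static void model_folio_remove_rmap_range(struct model_folio *f,
					  int first, int nr)
{
	assert(first >= 0 && nr > 0 && first + nr <= MODEL_NR_PAGES);

	for (int i = first; i < first + nr; i++)
		f->page_mapcount[i]--;
}

/* Single-page API: a thin wrapper when compound is false. */
static void model_page_remove_rmap(struct model_folio *f, int page,
				   bool compound)
{
	if (!compound) {
		model_folio_remove_rmap_range(f, page, 1);
		return;
	}

	/* compound: the folio is accounted as a whole, no range involved. */
	f->entire_mapcount--;
}
```

With this shape, a range argument only ever reaches the small-page path, which
is the point being made about why compound does not fit the range API.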

> 
> Let me post a diff later and see if it makes sense to you.
Matthew Wilcox July 26, 2023, 4:44 p.m. UTC | #3
On Tue, Jul 25, 2023 at 11:53:26PM -0600, Yu Zhao wrote:
> > +void folio_remove_rmap_range(struct folio *folio, struct page *page,
> > +               int nr, struct vm_area_struct *vma);
> 
> I prefer folio_remove_rmap_range(page, nr, vma). Passing both the
> folio and the starting page seems redundant to me.
> 
> Matthew, is there a convention (function names, parameters, etc.) for
> operations on a range of pages within a folio?

We've been establishing that convention recently, yes.  It seems
pointless to re-derive the folio from the page when the caller already
has the folio.  I also like Ryan's point that it reinforces that all
pages must be from the same folio.

> And regarding the refactor, what I have in mind is that
> folio_remove_rmap_range() is the core API and page_remove_rmap() is
> just a wrapper around it, i.e., folio_remove_rmap_range(page, 1, vma).
> 
> Let me post a diff later and see if it makes sense to you.

I think that can make sense.  Because we limit to a single page table,
specifying 'nr = 1 << PMD_ORDER' is the same as 'compound = true'.
Just make it folio, page, nr, vma.  I'd actually prefer it as (vma,
folio, page, nr), but that isn't the convention we've had in rmap up
until now.
Huang, Ying July 27, 2023, 1:29 a.m. UTC | #4
Matthew Wilcox <willy@infradead.org> writes:

> On Tue, Jul 25, 2023 at 11:53:26PM -0600, Yu Zhao wrote:
>> > +void folio_remove_rmap_range(struct folio *folio, struct page *page,
>> > +               int nr, struct vm_area_struct *vma);
>> 
>> I prefer folio_remove_rmap_range(page, nr, vma). Passing both the
>> folio and the starting page seems redundant to me.
>> 
>> Matthew, is there a convention (function names, parameters, etc.) for
>> operations on a range of pages within a folio?
>
> We've been establishing that convention recently, yes.  It seems
> pointless to re-derive the folio from the page when the caller already
> has the folio.  I also like Ryan's point that it reinforces that all
> pages must be from the same folio.
>
>> And regarding the refactor, what I have in mind is that
>> folio_remove_rmap_range() is the core API and page_remove_rmap() is
>> just a wrapper around it, i.e., folio_remove_rmap_range(page, 1, vma).
>> 
>> Let me post a diff later and see if it makes sense to you.
>
> I think that can make sense.  Because we limit to a single page table,
> specifying 'nr = 1 << PMD_ORDER' is the same as 'compound = true'.
> Just make it folio, page, nr, vma.  I'd actually prefer it as (vma,
> folio, page, nr), but that isn't the convention we've had in rmap up
> until now.

IIUC, even if 'nr = 1 << PMD_ORDER', we may remove one PMD 'compound'
mapping, or 'nr' PTE mapping.  So, we will still need 'compound' (or
some better name) as parameter.

--
Best Regards,
Huang, Ying
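
A small userspace sketch of the ambiguity raised above, under stated
assumptions: MODEL_NR_PAGES stands in for 1 << PMD_ORDER, the counters are
plain ints where 0 means unmapped, and enum model_map_level is a made-up name
for the "compound (or some better name)" parameter. It is not kernel code.

```c
#include <assert.h>

#define MODEL_NR_PAGES 4	/* stands in for 1 << PMD_ORDER */

/* Hypothetical replacement for the 'compound' flag. */
enum model_map_level { MODEL_MAP_PTE, MODEL_MAP_PMD };

/* Toy folio that can be mapped by PTEs and by a PMD at the same time. */
struct model_folio {
	int page_mapcount[MODEL_NR_PAGES];	/* PTE mappings, per page */
	int entire_mapcount;			/* PMD mappings of the whole folio */
};

/* Remove mappings of pages [first, first + nr) at the given level. */
static void model_remove_rmap(struct model_folio *f, int first, int nr,
			      enum model_map_level level)
{
	assert(first >= 0 && nr > 0 && first + nr <= MODEL_NR_PAGES);

	if (level == MODEL_MAP_PMD) {
		/* A PMD entry maps the folio as a single unit. */
		assert(first == 0 && nr == MODEL_NR_PAGES);
		f->entire_mapcount--;
		return;
	}

	/* nr PTE entries: each page is accounted individually. */
	for (int i = first; i < first + nr; i++)
		f->page_mapcount[i]--;
}
```

Both calls below pass the same range, yet they update different counters, so
nr alone cannot tell the callee which kind of mapping is being removed:

    model_remove_rmap(&f, 0, MODEL_NR_PAGES, MODEL_MAP_PMD);
    model_remove_rmap(&f, 0, MODEL_NR_PAGES, MODEL_MAP_PTE);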
Matthew Wilcox July 27, 2023, 2:35 a.m. UTC | #5
On Thu, Jul 27, 2023 at 09:29:24AM +0800, Huang, Ying wrote:
> Matthew Wilcox <willy@infradead.org> writes:
> > I think that can make sense.  Because we limit to a single page table,
> > specifying 'nr = 1 << PMD_ORDER' is the same as 'compound = true'.
> > Just make it folio, page, nr, vma.  I'd actually prefer it as (vma,
> > folio, page, nr), but that isn't the convention we've had in rmap up
> > until now.
> 
> IIUC, even if 'nr = 1 << PMD_ORDER', we may remove one PMD 'compound'
> mapping, or 'nr' PTE mapping.  So, we will still need 'compound' (or
> some better name) as parameter.

Oh, this is removing ... so you're concerned with the case where we've
split the PMD into PTEs, but all the PTEs are still present in a single
page table?  OK, I don't have a good answer to that.  Maybe that torpedoes
the whole idea; I'll think about it.
Ryan Roberts July 27, 2023, 7:26 a.m. UTC | #6
On 27/07/2023 03:35, Matthew Wilcox wrote:
> On Thu, Jul 27, 2023 at 09:29:24AM +0800, Huang, Ying wrote:
>> Matthew Wilcox <willy@infradead.org> writes:
>>> I think that can make sense.  Because we limit to a single page table,
>>> specifying 'nr = 1 << PMD_ORDER' is the same as 'compound = true'.
>>> Just make it folio, page, nr, vma.  I'd actually prefer it as (vma,
>>> folio, page, nr), but that isn't the convention we've had in rmap up
>>> until now.
>>
>> IIUC, even if 'nr = 1 << PMD_ORDER', we may remove one PMD 'compound'
>> mapping, or 'nr' PTE mapping.  So, we will still need 'compound' (or
>> some better name) as parameter.
> 
> Oh, this is removing ... so you're concerned with the case where we've
> split the PMD into PTEs, but all the PTEs are still present in a single
> page table?  OK, I don't have a good answer to that.  Maybe that torpedoes
> the whole idea; I'll think about it.

This is exactly why I think the approach I've already taken is the correct one;
a 'range' makes no sense when you are dealing with 'compound' pages because you
are accounting the entire folio. So surely it's better to reflect that by only
accounting small pages in the range version of the API.
Yu Zhao July 27, 2023, 4:38 p.m. UTC | #7
On Thu, Jul 27, 2023 at 1:26 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 27/07/2023 03:35, Matthew Wilcox wrote:
> > On Thu, Jul 27, 2023 at 09:29:24AM +0800, Huang, Ying wrote:
> >> Matthew Wilcox <willy@infradead.org> writes:
> >>> I think that can make sense.  Because we limit to a single page table,
> >>> specifying 'nr = 1 << PMD_ORDER' is the same as 'compound = true'.
> >>> Just make it folio, page, nr, vma.  I'd actually prefer it as (vma,
> >>> folio, page, nr), but that isn't the convention we've had in rmap up
> >>> until now.
> >>
> >> IIUC, even if 'nr = 1 << PMD_ORDER', we may remove one PMD 'compound'
> >> mapping, or 'nr' PTE mapping.  So, we will still need 'compound' (or
> >> some better name) as parameter.
> >
> > Oh, this is removing ... so you're concerned with the case where we've
> > split the PMD into PTEs, but all the PTEs are still present in a single
> > page table?  OK, I don't have a good answer to that.  Maybe that torpedoes
> > the whole idea; I'll think about it.
>
> This is exactly why I think the approach I've already taken is the correct one;
> a 'range' makes no sense when you are dealing with 'compound' pages because you
> are accounting the entire folio. So surely it's better to reflect that by only
> accounting small pages in the range version of the API.

If the argument is the compound case is a separate one, then why not a
separate API for it?

I don't really care about whether we think 'range' makes sense for
'compound' or not. What I'm saying is:
1. if they are considered one general case, then one API with the
compound parameter.
2. if they are considered two specific cases, there should be two APIs.
This common design pattern is cleaner IMO.

Right now we have an overlap (redundancy) -- people would have to do
two code searches: one for page_remove_rmap() and the other for
folio_remove_rmap_range(nr=1), and this IMO is a bad design pattern.
Ryan Roberts July 28, 2023, 9 a.m. UTC | #8
On 27/07/2023 17:38, Yu Zhao wrote:
> On Thu, Jul 27, 2023 at 1:26 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> On 27/07/2023 03:35, Matthew Wilcox wrote:
>>> On Thu, Jul 27, 2023 at 09:29:24AM +0800, Huang, Ying wrote:
>>>> Matthew Wilcox <willy@infradead.org> writes:
>>>>> I think that can make sense.  Because we limit to a single page table,
>>>>> specifying 'nr = 1 << PMD_ORDER' is the same as 'compound = true'.
>>>>> Just make it folio, page, nr, vma.  I'd actually prefer it as (vma,
>>>>> folio, page, nr), but that isn't the convention we've had in rmap up
>>>>> until now.
>>>>
>>>> IIUC, even if 'nr = 1 << PMD_ORDER', we may remove one PMD 'compound'
>>>> mapping, or 'nr' PTE mapping.  So, we will still need 'compound' (or
>>>> some better name) as parameter.
>>>
>>> Oh, this is removing ... so you're concerned with the case where we've
>>> split the PMD into PTEs, but all the PTEs are still present in a single
>>> page table?  OK, I don't have a good answer to that.  Maybe that torpedoes
>>> the whole idea; I'll think about it.
>>
>> This is exactly why I think the approach I've already taken is the correct one;
>> a 'range' makes no sense when you are dealing with 'compound' pages because you
>> are accounting the entire folio. So surely it's better to reflect that by only
>> accounting small pages in the range version of the API.
> 
> If the argument is the compound case is a separate one, then why not a
> separate API for it?
> 
> I don't really care about whether we think 'range' makes sense for
> 'compound' or not. What I'm saying is:
> 1. if they are considered one general case, then one API with the
> compound parameter.
> 2. if they are considered two specific cases, there should be two APIs.
> This common design pattern is cleaner IMO.

Option 2 definitely makes sense to me and I agree that it would be cleaner to
have 2 separate APIs, one for small-page accounting (which can accept a range
within a folio) and one for large-page accounting (i.e. compound=true in today's
API).

But...

1) That's not how the rest of the rmap API does it

2) This would be a much bigger change since I'm removing an existing API and
replacing it with a completely new one (there are ~20 call sites to fix up). I
was trying to keep the change small and manageable by maintaining the current
API but moving all the small-page logic to the new API, so the old API is a
wrapper in that case.

3) You would also need an API for the hugetlb case, which page_remove_rmap()
handles today. Perhaps that could also be done by the new API that handles the
compound case. But then you are mixing and matching your API styles - one caters
for 1 specific case, and the other caters for 2 cases and figures out which one.

> 
> Right now we have an overlap (redundancy) -- people would have to do
> two code searches: one for page_remove_rmap() and the other for
> folio_remove_rmap_range(nr=1), and this IMO is a bad design pattern.

I'm open to doing the work to remove this redundancy, but I'd like to hear
consensus on this thread that it's the right approach first. Although personally
I don't see a problem with what I've already done: if you want to operate on a
page (inc the old concept of a "compound page" and a hugetlb page) call the old
one. If you want to operate on a range of pages in a folio, call the new one.

Thanks,
Ryan
Yu Zhao Aug. 1, 2023, 7:40 a.m. UTC | #9
On Fri, Jul 28, 2023 at 3:00 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 27/07/2023 17:38, Yu Zhao wrote:
> > On Thu, Jul 27, 2023 at 1:26 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
> >>
> >> On 27/07/2023 03:35, Matthew Wilcox wrote:
> >>> On Thu, Jul 27, 2023 at 09:29:24AM +0800, Huang, Ying wrote:
> >>>> Matthew Wilcox <willy@infradead.org> writes:
> >>>>> I think that can make sense.  Because we limit to a single page table,
> >>>>> specifying 'nr = 1 << PMD_ORDER' is the same as 'compound = true'.
> >>>>> Just make it folio, page, nr, vma.  I'd actually prefer it as (vma,
> >>>>> folio, page, nr), but that isn't the convention we've had in rmap up
> >>>>> until now.
> >>>>
> >>>> IIUC, even if 'nr = 1 << PMD_ORDER', we may remove one PMD 'compound'
> >>>> mapping, or 'nr' PTE mapping.  So, we will still need 'compound' (or
> >>>> some better name) as parameter.
> >>>
> >>> Oh, this is removing ... so you're concerned with the case where we've
> >>> split the PMD into PTEs, but all the PTEs are still present in a single
> >>> page table?  OK, I don't have a good answer to that.  Maybe that torpedoes
> >>> the whole idea; I'll think about it.
> >>
> >> This is exactly why I think the approach I've already taken is the correct one;
> >> a 'range' makes no sense when you are dealing with 'compound' pages because you
> >> are accounting the entire folio. So surely it's better to reflect that by only
> >> accounting small pages in the range version of the API.
> >
> > If the argument is the compound case is a separate one, then why not a
> > separate API for it?
> >
> > I don't really care about whether we think 'range' makes sense for
> > 'compound' or not. What I'm saying is:
> > 1. if they are considered one general case, then one API with the
> > compound parameter.
> > 2. if they are considered two specific cases, there should be two APIs.
> > This common design pattern is cleaner IMO.
>
> Option 2 definitely makes sense to me and I agree that it would be cleaner to
> have 2 separate APIs, one for small-page accounting (which can accept a range
> within a folio) and one for large-page accounting (i.e. compound=true in today's
> API).
>
> But...
>
> 1) That's not how the rest of the rmap API does it

Yes, but that's how we convert things: one step at a time.

> 2) This would be a much bigger change since I'm removing an existing API and
> replacing it with a completely new one (there are ~20 call sites to fix up). I
> was trying to keep the change small and manageable by maintaining the current
> API but moving all the small-page logic to the new API, so the old API is a
> wrapper in that case.

I don't get how it'd be "much bigger". Isn't it just a straightforward
replacement?

> 3) You would also need an API for the hugetlb case, which page_remove_rmap()
> handles today. Perhaps that could also be done by the new API that handles the
> compound case. But then you are mixing and matching your API styles - one caters
> for 1 specific case, and the other caters for 2 cases and figures out which one.

You are talking about cases *inside* the APIs, and that's irrelevant
to the number of APIs: we only need two -- one supports a range within
a folio and the other takes a folio as a single unit.

> > Right now we have an overlap (redundancy) -- people would have to do
> > two code searches: one for page_remove_rmap() and the other for
> > folio_remove_rmap_range(nr=1), and this IMO is a bad design pattern.
>
> I'm open to doing the work to remove this redundancy, but I'd like to hear
> consensus on this thread that it's the right approach first. Although personally
> I don't see a problem with what I've already done: if you want to operate on a
> page (inc the old concept of a "compound page" and a hugetlb page) call the old
> one. If you want to operate on a range of pages in a folio, call the new one.

Patch

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index b87d01660412..f578975c12c0 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -200,6 +200,8 @@  void page_add_file_rmap(struct page *, struct vm_area_struct *,
 		bool compound);
 void page_remove_rmap(struct page *, struct vm_area_struct *,
 		bool compound);
+void folio_remove_rmap_range(struct folio *folio, struct page *page,
+		int nr, struct vm_area_struct *vma);
 
 void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *,
 		unsigned long address, rmap_t flags);
diff --git a/mm/rmap.c b/mm/rmap.c
index eb0bb00dae34..c3ef56f7ec15 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1359,6 +1359,94 @@  void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
 	mlock_vma_folio(folio, vma, compound);
 }
 
+/**
+ * __remove_rmap_finish - common operations when taking down a mapping.
+ * @folio:	Folio containing all pages taken down.
+ * @vma:	The VM area containing the range.
+ * @compound:	True if pages were taken down from PMD or false if from PTE(s).
+ * @nr_unmapped: Number of pages within folio that are now unmapped.
+ * @nr_mapped:	Number of pages within folio that are still mapped.
+ */
+static void __remove_rmap_finish(struct folio *folio,
+				struct vm_area_struct *vma, bool compound,
+				int nr_unmapped, int nr_mapped)
+{
+	enum node_stat_item idx;
+
+	if (nr_unmapped) {
+		idx = folio_test_anon(folio) ? NR_ANON_MAPPED : NR_FILE_MAPPED;
+		__lruvec_stat_mod_folio(folio, idx, -nr_unmapped);
+
+		/*
+		 * Queue large anon folio for deferred split if at least one
+		 * page of the folio is unmapped and at least one page is still
+		 * mapped.
+		 */
+		if (folio_test_large(folio) &&
+		    folio_test_anon(folio) && nr_mapped)
+			deferred_split_folio(folio);
+	}
+
+	/*
+	 * It would be tidy to reset folio_test_anon mapping when fully
+	 * unmapped, but that might overwrite a racing page_add_anon_rmap
+	 * which increments mapcount after us but sets mapping before us:
+	 * so leave the reset to free_pages_prepare, and remember that
+	 * it's only reliable while mapped.
+	 */
+
+	munlock_vma_folio(folio, vma, compound);
+}
+
+/**
+ * folio_remove_rmap_range - Take down PTE mappings from a range of pages.
+ * @folio:	Folio containing all pages in range.
+ * @page:	First page in range to unmap.
+ * @nr:		Number of pages to unmap.
+ * @vma:	The VM area containing the range.
+ *
+ * All pages in the range must belong to the same VMA & folio. They must be
+ * mapped with PTEs, not a PMD.
+ *
+ * Context: Caller holds the pte lock.
+ */
+void folio_remove_rmap_range(struct folio *folio, struct page *page,
+					int nr, struct vm_area_struct *vma)
+{
+	atomic_t *mapped = &folio->_nr_pages_mapped;
+	int nr_unmapped = 0;
+	int nr_mapped = 0;
+	bool last;
+
+	if (unlikely(folio_test_hugetlb(folio))) {
+		VM_WARN_ON_FOLIO(1, folio);
+		return;
+	}
+
+	VM_WARN_ON_ONCE(page < &folio->page ||
+			page + nr > (&folio->page + folio_nr_pages(folio)));
+
+	if (!folio_test_large(folio)) {
+		/* Is this the page's last map to be removed? */
+		last = atomic_add_negative(-1, &page->_mapcount);
+		nr_unmapped = last;
+	} else {
+		for (; nr != 0; nr--, page++) {
+			/* Is this the page's last map to be removed? */
+			last = atomic_add_negative(-1, &page->_mapcount);
+			if (last)
+				nr_unmapped++;
+		}
+
+		/* Pages still mapped if folio mapped entirely */
+		nr_mapped = atomic_sub_return_relaxed(nr_unmapped, mapped);
+		if (nr_mapped >= COMPOUND_MAPPED)
+			nr_unmapped = 0;
+	}
+
+	__remove_rmap_finish(folio, vma, false, nr_unmapped, nr_mapped);
+}
+
 /**
  * page_remove_rmap - take down pte mapping from a page
  * @page:	page to remove mapping from
@@ -1385,15 +1473,13 @@  void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 		return;
 	}
 
-	/* Is page being unmapped by PTE? Is this its last map to be removed? */
+	/* Is page being unmapped by PTE? */
 	if (likely(!compound)) {
-		last = atomic_add_negative(-1, &page->_mapcount);
-		nr = last;
-		if (last && folio_test_large(folio)) {
-			nr = atomic_dec_return_relaxed(mapped);
-			nr = (nr < COMPOUND_MAPPED);
-		}
-	} else if (folio_test_pmd_mappable(folio)) {
+		folio_remove_rmap_range(folio, page, 1, vma);
+		return;
+	}
+
+	if (folio_test_pmd_mappable(folio)) {
 		/* That test is redundant: it's for safety or to optimize out */
 
 		last = atomic_add_negative(-1, &folio->_entire_mapcount);
@@ -1421,29 +1507,8 @@  void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 			idx = NR_FILE_PMDMAPPED;
 		__lruvec_stat_mod_folio(folio, idx, -nr_pmdmapped);
 	}
-	if (nr) {
-		idx = folio_test_anon(folio) ? NR_ANON_MAPPED : NR_FILE_MAPPED;
-		__lruvec_stat_mod_folio(folio, idx, -nr);
-
-		/*
-		 * Queue anon large folio for deferred split if at least one
-		 * page of the folio is unmapped and at least one page
-		 * is still mapped.
-		 */
-		if (folio_test_large(folio) && folio_test_anon(folio))
-			if (!compound || nr < nr_pmdmapped)
-				deferred_split_folio(folio);
-	}
-
-	/*
-	 * It would be tidy to reset folio_test_anon mapping when fully
-	 * unmapped, but that might overwrite a racing page_add_anon_rmap
-	 * which increments mapcount after us but sets mapping before us:
-	 * so leave the reset to free_pages_prepare, and remember that
-	 * it's only reliable while mapped.
-	 */
 
-	munlock_vma_folio(folio, vma, compound);
+	__remove_rmap_finish(folio, vma, compound, nr, nr_pmdmapped - nr);
 }
 
 /*