[2/2] madvise: don't use mapcount() against large folio for sharing check

Message ID 20230728161356.1784568-3-fengwei.yin@intel.com (mailing list archive)
State New
Series don't use mapcount() to check large folio sharing

Commit Message

Yin Fengwei July 28, 2023, 4:13 p.m. UTC
The commits
98b211d6415f ("madvise: convert madvise_free_pte_range() to use
a folio")
fc986a38b670 ("mm: huge_memory: convert madvise_free_huge_pmd to
use a folio")

replaced page_mapcount() with folio_mapcount() to check whether
the folio is shared by other mappings.

But that is not correct for a large folio. folio_mapcount() returns the
total mapcount of the large folio, which is not suitable for detecting
whether the folio is shared.

Use folio_estimated_sharers() instead, which returns an estimated
number of sharers. The estimate is not 100% correct, but it should be
OK for the madvise case here.

Fixes: 98b211d6415f ("madvise: convert madvise_free_pte_range() to use a folio")
Fixes: fc986a38b670 ("mm: huge_memory: convert madvise_free_huge_pmd to use a folio")
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Yu Zhao <yuzhao@google.com>
---
 mm/huge_memory.c | 2 +-
 mm/madvise.c     | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
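
To illustrate why the total mapcount misfires, here is a minimal userspace sketch, not kernel code. It assumes a large folio of four pages mapped through PTEs, that folio_mapcount() amounts to the sum of the per-page mapcounts in that case, and that folio_estimated_sharers() samples only the first page's mapcount. All model_* names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a PTE-mapped large folio (not kernel code). */
#define MODEL_NR_PAGES 4

struct model_folio {
	/* Per-page mapcount: how many PTEs map each page of the folio. */
	int page_mapcount[MODEL_NR_PAGES];
};

/* Model a folio whose every page is mapped once by each of nr_procs processes. */
static struct model_folio model_map_by(int nr_procs)
{
	struct model_folio f;

	assert(nr_procs > 0);
	for (size_t i = 0; i < MODEL_NR_PAGES; i++)
		f.page_mapcount[i] = nr_procs;
	return f;
}

/* Assumed behaviour of the total mapcount: sum over all pages. */
static int model_total_mapcount(const struct model_folio *f)
{
	int total = 0;

	for (size_t i = 0; i < MODEL_NR_PAGES; i++)
		total += f->page_mapcount[i];
	return total;
}

/* Assumed behaviour of the estimate: look at the first page only. */
static int model_estimated_sharers(const struct model_folio *f)
{
	return f->page_mapcount[0];
}
```

In this model, a folio mapped by a single process has a total mapcount of 4 but an estimated sharer count of 1, so a "!= 1" check against the total would treat an exclusively mapped large folio as shared and skip it.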

Comments

Andrew Morton July 28, 2023, 5:41 p.m. UTC | #1
On Sat, 29 Jul 2023 00:13:56 +0800 Yin Fengwei <fengwei.yin@intel.com> wrote:

> Fixes: 98b211d6415f ("madvise: convert madvise_free_pte_range() to use a folio")
> Fixes: fc986a38b670 ("mm: huge_memory: convert madvise_free_huge_pmd to use a folio")

Having two Fixes: for one patch presumably makes backporting more
complicated and adds risk of making mistakes.

So I have split this into a three-patch series and I've fixed up the patch naming:

Subject: madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check
Subject: madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check
Subject: madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check

I haven't added cc:stable at this time - that awaits the description of
user-visible effects.

Yin Fengwei July 29, 2023, 1:53 p.m. UTC | #2
Hi Andrew,

On 7/29/2023 1:41 AM, Andrew Morton wrote:
> On Sat, 29 Jul 2023 00:13:56 +0800 Yin Fengwei <fengwei.yin@intel.com> wrote:
> 
>> Fixes: 98b211d6415f ("madvise: convert madvise_free_pte_range() to use a folio")
>> Fixes: fc986a38b670 ("mm: huge_memory: convert madvise_free_huge_pmd to use a folio")
> 
> Having two Fixes: for one patch presumably makes backporting more
> complicated and adds risk of making mistakes.
> 
> So I have split this into a three-patch series and I've fixed up the patch naming:
> 
> Subject: madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check
> Subject: madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check
> Subject: madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check
Thanks a lot for your kind help. I will be more careful with future patches.

> 
> I haven't added cc:stable at this time - that awaits the description of
> user-visible effects.
The impact of the patch:
  Without the patch, when the user calls madvise() with MADV_COLD,
  MADV_PAGEOUT or MADV_FREE, it's likely that THP pages will be skipped.
  With the patch, it's likely that the THP pages will be split into pages,
  which will then be made cold, reclaimed, or freed, respectively.


Regards
Yin, Fengwei
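
For readers who want to exercise the paths discussed above from userspace, the following is a rough sketch under stated assumptions. The helper name advise_anon_region() is hypothetical. Whether the region is actually backed by THP depends on the system's THP settings and on alignment, and the program cannot observe whether the kernel skipped or split a large folio; it only shows the call sequence.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of private anonymous memory, hint
 * that THP is wanted, touch the memory, then apply the given advice.
 * Returns 0 on success or a negative errno value.
 */
static int advise_anon_region(size_t len, int advice)
{
	int ret;
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -errno;

	/* Best-effort hint only; THP may be disabled, so ignore failure. */
	(void)madvise(p, len, MADV_HUGEPAGE);

	/* Fault the pages in so there is something to advise on. */
	memset(p, 0x5a, len);

	/* Capture errno before munmap() can overwrite it. */
	ret = madvise(p, len, advice) == 0 ? 0 : -errno;

	munmap(p, len);
	return ret;
}
```

Calling advise_anon_region(4UL << 20, MADV_FREE) covers the MADV_FREE case; MADV_COLD and MADV_PAGEOUT can be passed the same way where the kernel and the libc headers provide them.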
Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index eb3678360b97..68c890875257 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1613,7 +1613,7 @@  bool madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	 * If other processes are mapping this folio, we couldn't discard
 	 * the folio unless they all do MADV_FREE so let's skip the folio.
 	 */
-	if (folio_mapcount(folio) != 1)
+	if (folio_estimated_sharers(folio) != 1)
 		goto out;
 
 	if (!folio_trylock(folio))
diff --git a/mm/madvise.c b/mm/madvise.c
index 148b46beb039..55bdf641abfa 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -678,7 +678,7 @@  static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 		if (folio_test_large(folio)) {
 			int err;
 
-			if (folio_mapcount(folio) != 1)
+			if (folio_estimated_sharers(folio) != 1)
 				break;
 			if (!folio_trylock(folio))
 				break;