diff mbox series

mm: huge_memory: add folio_mark_accessed() when zapping file THP

Message ID fc117f60d7b686f87067f36a0ef7cdbc3a78109c.1744190345.git.baolin.wang@linux.alibaba.com (mailing list archive)
State New
Headers show
Series mm: huge_memory: add folio_mark_accessed() when zapping file THP | expand

Commit Message

Baolin Wang April 9, 2025, 9:38 a.m. UTC
When investigating performance issues during file folio unmap, I noticed some
behavioral differences in handling non-PMD-sized folios and PMD-sized folios.
For non-PMD-sized file folios, it will call folio_mark_accessed() to mark the
folio as having seen activity, but this is not done for PMD-sized folios.

This might not cause obvious issues, but a potential problem could be that,
it might lead to reclaim hot file folios under memory pressure, as quoted
from Johannes:

"
Sometimes file contents are only accessed through relatively short-lived
mappings. But they can nevertheless be accessed a lot and be hot. It's
important to not lose that information on unmap, and end up kicking out a
frequently used cache page.
"

Therefore, we should also add folio_mark_accessed() for PMD-sized file
folios when unmapping.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Zi Yan <ziy@nvidia.com>
---
Changes from RFC:
 - Update the commit message, per Johannes.
 - Collect Acked tags from Johannes and Zi. Thanks.
---
 mm/huge_memory.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

David Hildenbrand April 9, 2025, 9:46 a.m. UTC | #1
On 09.04.25 11:38, Baolin Wang wrote:
> When investigating performance issues during file folio unmap, I noticed some
> behavioral differences in handling non-PMD-sized folios and PMD-sized folios.
> For non-PMD-sized file folios, it will call folio_mark_accessed() to mark the
> folio as having seen activity, but this is not done for PMD-sized folios.
> 
> This might not cause obvious issues, but a potential problem could be that,
> it might lead to reclaim hot file folios under memory pressure, as quoted
> from Johannes:
> 
> "
> Sometimes file contents are only accessed through relatively short-lived
> mappings. But they can nevertheless be accessed a lot and be hot. It's
> important to not lose that information on unmap, and end up kicking out a
> frequently used cache page.
> "
> 
> Therefore, we should also add folio_mark_accessed() for PMD-sized file
> folios when unmapping.
> 
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Zi Yan <ziy@nvidia.com>
> ---
> Changes from RFC:
>   - Update the commit message, per Johannes.
>   - Collect Acked tags from Johannes and Zi. Thanks.
> ---
>   mm/huge_memory.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 2a47682d1ab7..955781b4e946 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2260,6 +2260,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>   				zap_deposited_table(tlb->mm, pmd);
>   			add_mm_counter(tlb->mm, mm_counter_file(folio),
>   				       -HPAGE_PMD_NR);
> +
> +			if (flush_needed && pmd_young(orig_pmd) &&
> +			    likely(vma_has_recency(vma)))
> +				folio_mark_accessed(folio);

So the flush_needed check is really just a pmd_present() check. (the 
latter would be clearer, but I don't mind)

Acked-by: David Hildenbrand <david@redhat.com>
Baolin Wang April 9, 2025, 9:49 a.m. UTC | #2
On 2025/4/9 17:46, David Hildenbrand wrote:
> On 09.04.25 11:38, Baolin Wang wrote:
>> When investigating performance issues during file folio unmap, I 
>> noticed some
>> behavioral differences in handling non-PMD-sized folios and PMD-sized 
>> folios.
>> For non-PMD-sized file folios, it will call folio_mark_accessed() to 
>> mark the
>> folio as having seen activity, but this is not done for PMD-sized folios.
>>
>> This might not cause obvious issues, but a potential problem could be 
>> that,
>> it might lead to reclaim hot file folios under memory pressure, as quoted
>> from Johannes:
>>
>> "
>> Sometimes file contents are only accessed through relatively short-lived
>> mappings. But they can nevertheless be accessed a lot and be hot. It's
>> important to not lose that information on unmap, and end up kicking out a
>> frequently used cache page.
>> "
>>
>> Therefore, we should also add folio_mark_accessed() for PMD-sized file
>> folios when unmapping.
>>
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>> Acked-by: Zi Yan <ziy@nvidia.com>
>> ---
>> Changes from RFC:
>>   - Update the commit message, per Johannes.
>>   - Collect Acked tags from Johannes and Zi. Thanks.
>> ---
>>   mm/huge_memory.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 2a47682d1ab7..955781b4e946 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -2260,6 +2260,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct 
>> vm_area_struct *vma,
>>                   zap_deposited_table(tlb->mm, pmd);
>>               add_mm_counter(tlb->mm, mm_counter_file(folio),
>>                          -HPAGE_PMD_NR);
>> +
>> +            if (flush_needed && pmd_young(orig_pmd) &&
>> +                likely(vma_has_recency(vma)))
>> +                folio_mark_accessed(folio);
> 
> So the flush_needed check is really just a pmd_present() check. (the 
> latter would be clearer, but I don't mind)

Yes, we've already checked pmd_present() before, so I assume the 
flush_needed check is cheaper:)

> Acked-by: David Hildenbrand <david@redhat.com>

Thanks.
David Hildenbrand April 9, 2025, 9:53 a.m. UTC | #3
On 09.04.25 11:49, Baolin Wang wrote:
> 
> 
> On 2025/4/9 17:46, David Hildenbrand wrote:
>> On 09.04.25 11:38, Baolin Wang wrote:
>>> When investigating performance issues during file folio unmap, I
>>> noticed some
>>> behavioral differences in handling non-PMD-sized folios and PMD-sized
>>> folios.
>>> For non-PMD-sized file folios, it will call folio_mark_accessed() to
>>> mark the
>>> folio as having seen activity, but this is not done for PMD-sized folios.
>>>
>>> This might not cause obvious issues, but a potential problem could be
>>> that,
>>> it might lead to reclaim hot file folios under memory pressure, as quoted
>>> from Johannes:
>>>
>>> "
>>> Sometimes file contents are only accessed through relatively short-lived
>>> mappings. But they can nevertheless be accessed a lot and be hot. It's
>>> important to not lose that information on unmap, and end up kicking out a
>>> frequently used cache page.
>>> "
>>>
>>> Therefore, we should also add folio_mark_accessed() for PMD-sized file
>>> folios when unmapping.
>>>
>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>>> Acked-by: Zi Yan <ziy@nvidia.com>
>>> ---
>>> Changes from RFC:
>>>    - Update the commit message, per Johannes.
>>>    - Collect Acked tags from Johannes and Zi. Thanks.
>>> ---
>>>    mm/huge_memory.c | 4 ++++
>>>    1 file changed, 4 insertions(+)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 2a47682d1ab7..955781b4e946 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -2260,6 +2260,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct
>>> vm_area_struct *vma,
>>>                    zap_deposited_table(tlb->mm, pmd);
>>>                add_mm_counter(tlb->mm, mm_counter_file(folio),
>>>                           -HPAGE_PMD_NR);
>>> +
>>> +            if (flush_needed && pmd_young(orig_pmd) &&
>>> +                likely(vma_has_recency(vma)))
>>> +                folio_mark_accessed(folio);
>>
>> So the flush_needed check is really just a pmd_present() check. (the
>> latter would be clearer, but I don't mind)
> 
> Yes, we've already checked pmd_present() before, so I assume the
> flush_needed check is cheaper:)

Maybe :)
Oscar Salvador April 10, 2025, 8:45 a.m. UTC | #4
On Wed, Apr 09, 2025 at 05:38:58PM +0800, Baolin Wang wrote:
> When investigating performance issues during file folio unmap, I noticed some
> behavioral differences in handling non-PMD-sized folios and PMD-sized folios.
> For non-PMD-sized file folios, it will call folio_mark_accessed() to mark the
> folio as having seen activity, but this is not done for PMD-sized folios.
> 
> This might not cause obvious issues, but a potential problem could be that,
> it might lead to reclaim hot file folios under memory pressure, as quoted
> from Johannes:
> 
> "
> Sometimes file contents are only accessed through relatively short-lived
> mappings. But they can nevertheless be accessed a lot and be hot. It's
> important to not lose that information on unmap, and end up kicking out a
> frequently used cache page.
> "
> 
> Therefore, we should also add folio_mark_accessed() for PMD-sized file
> folios when unmapping.
> 
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Zi Yan <ziy@nvidia.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

Although I agree with David here that pmd_present would be more obvious than
flush_needed.
It was not obvious to be at first glance.
Baolin Wang April 11, 2025, 1:07 a.m. UTC | #5
On 2025/4/10 16:45, Oscar Salvador wrote:
> On Wed, Apr 09, 2025 at 05:38:58PM +0800, Baolin Wang wrote:
>> When investigating performance issues during file folio unmap, I noticed some
>> behavioral differences in handling non-PMD-sized folios and PMD-sized folios.
>> For non-PMD-sized file folios, it will call folio_mark_accessed() to mark the
>> folio as having seen activity, but this is not done for PMD-sized folios.
>>
>> This might not cause obvious issues, but a potential problem could be that,
>> it might lead to reclaim hot file folios under memory pressure, as quoted
>> from Johannes:
>>
>> "
>> Sometimes file contents are only accessed through relatively short-lived
>> mappings. But they can nevertheless be accessed a lot and be hot. It's
>> important to not lose that information on unmap, and end up kicking out a
>> frequently used cache page.
>> "
>>
>> Therefore, we should also add folio_mark_accessed() for PMD-sized file
>> folios when unmapping.
>>
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>> Acked-by: Zi Yan <ziy@nvidia.com>
> 
> Reviewed-by: Oscar Salvador <osalvador@suse.de>

Thanks.

> Although I agree with David here that pmd_present would be more obvious than
> flush_needed.
> It was not obvious to be at first glance.

How about adding some comments to make it clear?

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b3ade7ac5bbf..93abd1fcc4fb 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2263,6 +2263,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct 
vm_area_struct *vma,
                         add_mm_counter(tlb->mm, mm_counter_file(folio),
                                        -HPAGE_PMD_NR);

+                       /*
+                        * Use flush_needed to indicate whether the PMD 
entry is present,
+                        * instead of checking pmd_present() again.
+                        */
                         if (flush_needed && pmd_young(orig_pmd) &&
                             likely(vma_has_recency(vma)))
                                 folio_mark_accessed(folio);
Oscar Salvador April 11, 2025, 3:14 a.m. UTC | #6
On Fri, Apr 11, 2025 at 09:07:16AM +0800, Baolin Wang wrote:
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index b3ade7ac5bbf..93abd1fcc4fb 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2263,6 +2263,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct
> vm_area_struct *vma,
>                         add_mm_counter(tlb->mm, mm_counter_file(folio),
>                                        -HPAGE_PMD_NR);
> 
> +                       /*
> +                        * Use flush_needed to indicate whether the PMD
> entry is present,
> +                        * instead of checking pmd_present() again.
> +                        */
>                         if (flush_needed && pmd_young(orig_pmd) &&
>                             likely(vma_has_recency(vma)))
>                                 folio_mark_accessed(folio);
Yes, thanks, looks good to me, and I see that Andrew has already taken
it.
diff mbox series

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2a47682d1ab7..955781b4e946 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2260,6 +2260,10 @@  int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 				zap_deposited_table(tlb->mm, pmd);
 			add_mm_counter(tlb->mm, mm_counter_file(folio),
 				       -HPAGE_PMD_NR);
+
+			if (flush_needed && pmd_young(orig_pmd) &&
+			    likely(vma_has_recency(vma)))
+				folio_mark_accessed(folio);
 		}
 
 		spin_unlock(ptl);