Message ID | fc117f60d7b686f87067f36a0ef7cdbc3a78109c.1744190345.git.baolin.wang@linux.alibaba.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: huge_memory: add folio_mark_accessed() when zapping file THP | expand |
On 09.04.25 11:38, Baolin Wang wrote: > When investigating performance issues during file folio unmap, I noticed some > behavioral differences in handling non-PMD-sized folios and PMD-sized folios. > For non-PMD-sized file folios, it will call folio_mark_accessed() to mark the > folio as having seen activity, but this is not done for PMD-sized folios. > > This might not cause obvious issues, but a potential problem could be that, > it might lead to reclaim hot file folios under memory pressure, as quoted > from Johannes: > > " > Sometimes file contents are only accessed through relatively short-lived > mappings. But they can nevertheless be accessed a lot and be hot. It's > important to not lose that information on unmap, and end up kicking out a > frequently used cache page. > " > > Therefore, we should also add folio_mark_accessed() for PMD-sized file > folios when unmapping. > > Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> > Acked-by: Johannes Weiner <hannes@cmpxchg.org> > Acked-by: Zi Yan <ziy@nvidia.com> > --- > Changes from RFC: > - Update the commit message, per Johannes. > - Collect Acked tags from Johannes and Zi. Thanks. > --- > mm/huge_memory.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 2a47682d1ab7..955781b4e946 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -2260,6 +2260,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, > zap_deposited_table(tlb->mm, pmd); > add_mm_counter(tlb->mm, mm_counter_file(folio), > -HPAGE_PMD_NR); > + > + if (flush_needed && pmd_young(orig_pmd) && > + likely(vma_has_recency(vma))) > + folio_mark_accessed(folio); So the flush_needed check is really just a pmd_present() check. (the latter would be clearer, but I don't mind) Acked-by: David Hildenbrand <david@redhat.com>
On 2025/4/9 17:46, David Hildenbrand wrote: > On 09.04.25 11:38, Baolin Wang wrote: >> When investigating performance issues during file folio unmap, I >> noticed some >> behavioral differences in handling non-PMD-sized folios and PMD-sized >> folios. >> For non-PMD-sized file folios, it will call folio_mark_accessed() to >> mark the >> folio as having seen activity, but this is not done for PMD-sized folios. >> >> This might not cause obvious issues, but a potential problem could be >> that, >> it might lead to reclaim hot file folios under memory pressure, as quoted >> from Johannes: >> >> " >> Sometimes file contents are only accessed through relatively short-lived >> mappings. But they can nevertheless be accessed a lot and be hot. It's >> important to not lose that information on unmap, and end up kicking out a >> frequently used cache page. >> " >> >> Therefore, we should also add folio_mark_accessed() for PMD-sized file >> folios when unmapping. >> >> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> >> Acked-by: Johannes Weiner <hannes@cmpxchg.org> >> Acked-by: Zi Yan <ziy@nvidia.com> >> --- >> Changes from RFC: >> - Update the commit message, per Johannes. >> - Collect Acked tags from Johannes and Zi. Thanks. >> --- >> mm/huge_memory.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >> index 2a47682d1ab7..955781b4e946 100644 >> --- a/mm/huge_memory.c >> +++ b/mm/huge_memory.c >> @@ -2260,6 +2260,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct >> vm_area_struct *vma, >> zap_deposited_table(tlb->mm, pmd); >> add_mm_counter(tlb->mm, mm_counter_file(folio), >> -HPAGE_PMD_NR); >> + >> + if (flush_needed && pmd_young(orig_pmd) && >> + likely(vma_has_recency(vma))) >> + folio_mark_accessed(folio); > > So the flush_needed check is really just a pmd_present() check. (the > latter would be clearer, but I don't mind) Yes, we've already checked pmd_present() before, so I assume the flush_needed check is cheaper:) > Acked-by: David Hildenbrand <david@redhat.com> Thanks.
On 09.04.25 11:49, Baolin Wang wrote: > > > On 2025/4/9 17:46, David Hildenbrand wrote: >> On 09.04.25 11:38, Baolin Wang wrote: >>> When investigating performance issues during file folio unmap, I >>> noticed some >>> behavioral differences in handling non-PMD-sized folios and PMD-sized >>> folios. >>> For non-PMD-sized file folios, it will call folio_mark_accessed() to >>> mark the >>> folio as having seen activity, but this is not done for PMD-sized folios. >>> >>> This might not cause obvious issues, but a potential problem could be >>> that, >>> it might lead to reclaim hot file folios under memory pressure, as quoted >>> from Johannes: >>> >>> " >>> Sometimes file contents are only accessed through relatively short-lived >>> mappings. But they can nevertheless be accessed a lot and be hot. It's >>> important to not lose that information on unmap, and end up kicking out a >>> frequently used cache page. >>> " >>> >>> Therefore, we should also add folio_mark_accessed() for PMD-sized file >>> folios when unmapping. >>> >>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> >>> Acked-by: Johannes Weiner <hannes@cmpxchg.org> >>> Acked-by: Zi Yan <ziy@nvidia.com> >>> --- >>> Changes from RFC: >>> - Update the commit message, per Johannes. >>> - Collect Acked tags from Johannes and Zi. Thanks. >>> --- >>> mm/huge_memory.c | 4 ++++ >>> 1 file changed, 4 insertions(+) >>> >>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >>> index 2a47682d1ab7..955781b4e946 100644 >>> --- a/mm/huge_memory.c >>> +++ b/mm/huge_memory.c >>> @@ -2260,6 +2260,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct >>> vm_area_struct *vma, >>> zap_deposited_table(tlb->mm, pmd); >>> add_mm_counter(tlb->mm, mm_counter_file(folio), >>> -HPAGE_PMD_NR); >>> + >>> + if (flush_needed && pmd_young(orig_pmd) && >>> + likely(vma_has_recency(vma))) >>> + folio_mark_accessed(folio); >> >> So the flush_needed check is really just a pmd_present() check. (the >> latter would be clearer, but I don't mind) > > Yes, we've already checked pmd_present() before, so I assume the > flush_needed check is cheaper:) Maybe :)
On Wed, Apr 09, 2025 at 05:38:58PM +0800, Baolin Wang wrote: > When investigating performance issues during file folio unmap, I noticed some > behavioral differences in handling non-PMD-sized folios and PMD-sized folios. > For non-PMD-sized file folios, it will call folio_mark_accessed() to mark the > folio as having seen activity, but this is not done for PMD-sized folios. > > This might not cause obvious issues, but a potential problem could be that, > it might lead to reclaim hot file folios under memory pressure, as quoted > from Johannes: > > " > Sometimes file contents are only accessed through relatively short-lived > mappings. But they can nevertheless be accessed a lot and be hot. It's > important to not lose that information on unmap, and end up kicking out a > frequently used cache page. > " > > Therefore, we should also add folio_mark_accessed() for PMD-sized file > folios when unmapping. > > Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> > Acked-by: Johannes Weiner <hannes@cmpxchg.org> > Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Although I agree with David here that pmd_present would be more obvious than flush_needed. It was not obvious to be at first glance.
On 2025/4/10 16:45, Oscar Salvador wrote: > On Wed, Apr 09, 2025 at 05:38:58PM +0800, Baolin Wang wrote: >> When investigating performance issues during file folio unmap, I noticed some >> behavioral differences in handling non-PMD-sized folios and PMD-sized folios. >> For non-PMD-sized file folios, it will call folio_mark_accessed() to mark the >> folio as having seen activity, but this is not done for PMD-sized folios. >> >> This might not cause obvious issues, but a potential problem could be that, >> it might lead to reclaim hot file folios under memory pressure, as quoted >> from Johannes: >> >> " >> Sometimes file contents are only accessed through relatively short-lived >> mappings. But they can nevertheless be accessed a lot and be hot. It's >> important to not lose that information on unmap, and end up kicking out a >> frequently used cache page. >> " >> >> Therefore, we should also add folio_mark_accessed() for PMD-sized file >> folios when unmapping. >> >> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> >> Acked-by: Johannes Weiner <hannes@cmpxchg.org> >> Acked-by: Zi Yan <ziy@nvidia.com> > > Reviewed-by: Oscar Salvador <osalvador@suse.de> Thanks. > Although I agree with David here that pmd_present would be more obvious than > flush_needed. > It was not obvious to be at first glance. How about adding some comments to make it clear? diff --git a/mm/huge_memory.c b/mm/huge_memory.c index b3ade7ac5bbf..93abd1fcc4fb 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2263,6 +2263,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, add_mm_counter(tlb->mm, mm_counter_file(folio), -HPAGE_PMD_NR); + /* + * Use flush_needed to indicate whether the PMD entry is present, + * instead of checking pmd_present() again. + */ if (flush_needed && pmd_young(orig_pmd) && likely(vma_has_recency(vma))) folio_mark_accessed(folio);
On Fri, Apr 11, 2025 at 09:07:16AM +0800, Baolin Wang wrote: > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index b3ade7ac5bbf..93abd1fcc4fb 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -2263,6 +2263,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct > vm_area_struct *vma, > add_mm_counter(tlb->mm, mm_counter_file(folio), > -HPAGE_PMD_NR); > > + /* > + * Use flush_needed to indicate whether the PMD > entry is present, > + * instead of checking pmd_present() again. > + */ > if (flush_needed && pmd_young(orig_pmd) && > likely(vma_has_recency(vma))) > folio_mark_accessed(folio); Yes, thanks, looks good to me, and I see that Andrew has already taken it.
diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 2a47682d1ab7..955781b4e946 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2260,6 +2260,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, zap_deposited_table(tlb->mm, pmd); add_mm_counter(tlb->mm, mm_counter_file(folio), -HPAGE_PMD_NR); + + if (flush_needed && pmd_young(orig_pmd) && + likely(vma_has_recency(vma))) + folio_mark_accessed(folio); } spin_unlock(ptl);