Message ID | 20250224165603.1434404-19-david@redhat.com (mailing list archive) |
---|---|
State | New |
Series | mm: MM owner tracking for large folios (!hugetlb) + CONFIG_NO_PAGE_MAPCOUNT |
On Mon Feb 24, 2025 at 11:56 AM EST, David Hildenbrand wrote:
> Let's implement an alternative when per-page mapcounts in large folios are
> no longer maintained -- soon with CONFIG_NO_PAGE_MAPCOUNT.
>
> For calculating "mapmax", we now use the average per-page mapcount in
> a large folio instead of the per-page mapcount.
>
> For hugetlb folios and folios that are not partially mapped into MMs,
> there is no change.
>
> Likely, this change will not matter much in practice, and an alternative
> might be to simply remove this stat with CONFIG_NO_PAGE_MAPCOUNT.
> However, there might be value to it, so let's keep it like that and
> document the behavior.
>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  Documentation/filesystems/proc.rst | 5 +++++
>  fs/proc/task_mmu.c                 | 7 ++++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 09f0aed5a08ba..1aa190017f796 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -686,6 +686,11 @@ Where:
>  node locality page counters (N0 == node0, N1 == node1, ...) and the kernel page
>  size, in KB, that is backing the mapping up.
>
> +Note that some kernel configurations do not track the precise number of times
> +a page that is part of a larger allocation (e.g., THP) is mapped. In these
> +configurations, "mapmax" might correspond to the average number of mappings
> +per page in such a larger allocation instead.
> +
>  1.2 Kernel data
>  ---------------
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 80839bbf9657f..d7ee842367f0f 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -2862,7 +2862,12 @@ static void gather_stats(struct page *page, struct numa_maps *md, int pte_dirty,
>  		unsigned long nr_pages)
>  {
>  	struct folio *folio = page_folio(page);
> -	int count = folio_precise_page_mapcount(folio, page);
> +	int count;
> +
> +	if (IS_ENABLED(CONFIG_PAGE_MAPCOUNT))
> +		count = folio_precise_page_mapcount(folio, page);
> +	else
> +		count = min_t(int, folio_average_page_mapcount(folio), 1);

s/min/max ? Otherwise, count is at most 1.

Anyway, if you change folio_average_page_mapcount() as I indicated in
patch 16, this will become count = folio_average_page_mapcount(folio).

>
>  	md->pages += nr_pages;
>  	if (pte_dirty || folio_test_dirty(folio))
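To see the clamping problem Zi Yan raises, here is a self-contained userspace illustration. The `min_t()`/`max_t()` stand-ins and the sample average of 3 are purely illustrative; the real macros live in the kernel headers:

```c
#include <stdio.h>

/* Userspace stand-ins for the kernel's min_t()/max_t() helpers. */
#define min_t(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))
#define max_t(type, a, b) ((type)(a) > (type)(b) ? (type)(a) : (type)(b))

int main(void)
{
	int avg = 3; /* assume an average per-page mapcount of 3 */

	/* As posted: clamps the result down, so count never exceeds 1. */
	printf("min_t: %d\n", min_t(int, avg, 1)); /* prints 1 */

	/* As suggested: only raises a rounded-down 0 up to 1. */
	printf("max_t: %d\n", max_t(int, avg, 1)); /* prints 3 */
	return 0;
}
```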
On 24.02.25 21:45, Zi Yan wrote:
> On Mon Feb 24, 2025 at 11:56 AM EST, David Hildenbrand wrote:
>> Let's implement an alternative when per-page mapcounts in large folios are
>> no longer maintained -- soon with CONFIG_NO_PAGE_MAPCOUNT.
>>
>> For calculating "mapmax", we now use the average per-page mapcount in
>> a large folio instead of the per-page mapcount.
>>
>> For hugetlb folios and folios that are not partially mapped into MMs,
>> there is no change.
>>
>> Likely, this change will not matter much in practice, and an alternative
>> might be to simply remove this stat with CONFIG_NO_PAGE_MAPCOUNT.
>> However, there might be value to it, so let's keep it like that and
>> document the behavior.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  Documentation/filesystems/proc.rst | 5 +++++
>>  fs/proc/task_mmu.c                 | 7 ++++++-
>>  2 files changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
>> index 09f0aed5a08ba..1aa190017f796 100644
>> --- a/Documentation/filesystems/proc.rst
>> +++ b/Documentation/filesystems/proc.rst
>> @@ -686,6 +686,11 @@ Where:
>>  node locality page counters (N0 == node0, N1 == node1, ...) and the kernel page
>>  size, in KB, that is backing the mapping up.
>>
>> +Note that some kernel configurations do not track the precise number of times
>> +a page that is part of a larger allocation (e.g., THP) is mapped. In these
>> +configurations, "mapmax" might correspond to the average number of mappings
>> +per page in such a larger allocation instead.
>> +
>>  1.2 Kernel data
>>  ---------------
>>
>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>> index 80839bbf9657f..d7ee842367f0f 100644
>> --- a/fs/proc/task_mmu.c
>> +++ b/fs/proc/task_mmu.c
>> @@ -2862,7 +2862,12 @@ static void gather_stats(struct page *page, struct numa_maps *md, int pte_dirty,
>>  		unsigned long nr_pages)
>>  {
>>  	struct folio *folio = page_folio(page);
>> -	int count = folio_precise_page_mapcount(folio, page);
>> +	int count;
>> +
>> +	if (IS_ENABLED(CONFIG_PAGE_MAPCOUNT))
>> +		count = folio_precise_page_mapcount(folio, page);
>> +	else
>> +		count = min_t(int, folio_average_page_mapcount(folio), 1);
>
> s/min/max ?

Indeed, thanks!

>
> Otherwise, count is at most 1. Anyway, if you change
> folio_average_page_mapcount() as I indicated in patch 16, this
> will become count = folio_average_page_mapcount(folio).

No, the average should not be 1 just because a single subpage is mapped.
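David's point is easiest to see with numbers. Below is a hypothetical model of `folio_average_page_mapcount()` as total mapcount divided by the folio's page count; the real kernel helper may differ in detail, but under that assumption a large folio with a single mapped subpage averages to 0, not 1:

```c
#include <stdio.h>

/*
 * Hypothetical model of folio_average_page_mapcount(): the folio's total
 * number of mappings divided by its number of pages. Integer division
 * rounds down, so sparse mappings can average to 0.
 */
static int average_page_mapcount(long total_mapcount, long nr_pages)
{
	return (int)(total_mapcount / nr_pages);
}

int main(void)
{
	/* A 512-page (2 MiB) THP with one subpage mapped once: */
	printf("%d\n", average_page_mapcount(1, 512));       /* 0, not 1 */

	/* The same THP fully mapped into three processes: */
	printf("%d\n", average_page_mapcount(3 * 512, 512)); /* 3 */
	return 0;
}
```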
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 09f0aed5a08ba..1aa190017f796 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -686,6 +686,11 @@ Where:
 node locality page counters (N0 == node0, N1 == node1, ...) and the kernel page
 size, in KB, that is backing the mapping up.
 
+Note that some kernel configurations do not track the precise number of times
+a page that is part of a larger allocation (e.g., THP) is mapped. In these
+configurations, "mapmax" might correspond to the average number of mappings
+per page in such a larger allocation instead.
+
 1.2 Kernel data
 ---------------
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 80839bbf9657f..d7ee842367f0f 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -2862,7 +2862,12 @@ static void gather_stats(struct page *page, struct numa_maps *md, int pte_dirty,
 		unsigned long nr_pages)
 {
 	struct folio *folio = page_folio(page);
-	int count = folio_precise_page_mapcount(folio, page);
+	int count;
+
+	if (IS_ENABLED(CONFIG_PAGE_MAPCOUNT))
+		count = folio_precise_page_mapcount(folio, page);
+	else
+		count = min_t(int, folio_average_page_mapcount(folio), 1);
 
 	md->pages += nr_pages;
 	if (pte_dirty || folio_test_dirty(folio))
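Given David's ack of the s/min/max suggestion in the thread above, the expected follow-up would presumably be a one-character change to the new branch (a sketch; the actual revised hunk is not shown here):

```
-		count = min_t(int, folio_average_page_mapcount(folio), 1);
+		count = max_t(int, folio_average_page_mapcount(folio), 1);
```

With `max_t()`, a partially mapped large folio whose average rounds down to 0 still reports a mapcount of at least 1, while larger averages pass through unclamped.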
Let's implement an alternative when per-page mapcounts in large folios are
no longer maintained -- soon with CONFIG_NO_PAGE_MAPCOUNT.

For calculating "mapmax", we now use the average per-page mapcount in
a large folio instead of the per-page mapcount.

For hugetlb folios and folios that are not partially mapped into MMs,
there is no change.

Likely, this change will not matter much in practice, and an alternative
might be to simply remove this stat with CONFIG_NO_PAGE_MAPCOUNT.
However, there might be value to it, so let's keep it like that and
document the behavior.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 Documentation/filesystems/proc.rst | 5 +++++
 fs/proc/task_mmu.c                 | 7 ++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)
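For context, "mapmax" is reported per mapping in /proc/<pid>/numa_maps, along the lines of the following (illustrative path and values, not real output):

```
7f60c8800000 default file=/usr/lib/libc.so.6 mapped=340 mapmax=42 N0=340 kernelpagesize_kB=4
```

On a CONFIG_NO_PAGE_MAPCOUNT kernel, the mapmax value for a partially mapped large folio would be derived from the folio-average mapcount rather than from a precise per-page mapcount, as the documentation hunk above notes.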