Message ID | 20250224165603.1434404-20-david@redhat.com (mailing list archive)
---|---
State | New
Series | mm: MM owner tracking for large folios (!hugetlb) + CONFIG_NO_PAGE_MAPCOUNT
On Mon Feb 24, 2025 at 11:56 AM EST, David Hildenbrand wrote:
> Let's implement an alternative when per-page mapcounts in large folios are
> no longer maintained -- soon with CONFIG_NO_PAGE_MAPCOUNT.
>
> When computing the output for smaps / smaps_rollups, in particular when
> calculating the USS (Unique Set Size) and the PSS (Proportional Set Size),
> we still rely on per-page mapcounts.
>
> To determine private vs. shared, we'll use folio_likely_mapped_shared(),
> similar to how we handle PM_MMAP_EXCLUSIVE. Similarly, we might now
> under-estimate the USS and count pages towards "shared" that are
> actually "private" ("exclusively mapped").
>
> When calculating the PSS, we'll now also use the average per-page
> mapcount for large folios: this can result in both, an over-estimation
> and an under-estimation of the PSS. The difference is not expected to
> matter much in practice, but we'll have to learn as we go.
>
> We can now provide folio_precise_page_mapcount() only with
> CONFIG_PAGE_MAPCOUNT, and remove one of the last users of per-page
> mapcounts when CONFIG_NO_PAGE_MAPCOUNT is enabled.
>
> Document the new behavior.
>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  Documentation/filesystems/proc.rst | 13 +++++++++++++
>  fs/proc/internal.h                 |  8 ++++++++
>  fs/proc/task_mmu.c                 | 17 +++++++++++++++--
>  3 files changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 1aa190017f796..57d55274a1f42 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -506,6 +506,19 @@ Note that even a page which is part of a MAP_SHARED mapping, but has only
>  a single pte mapped, i.e. is currently used by only one process, is accounted
>  as private and not as shared.
>
> +Note that in some kernel configurations, all pages part of a larger allocation
> +(e.g., THP) might be considered "shared" if the large allocation is
> +considered "shared": if not all pages are exclusive to the same process.
> +Further, some kernel configurations might consider larger allocations "shared",
> +if they were at one point considered "shared", even if they would now be
> +considered "exclusive".
> +
> +Some kernel configurations do not track the precise number of times a page part
> +of a larger allocation is mapped. In this case, when calculating the PSS, the
> +average number of mappings per page in this larger allocation might be used
> +as an approximation for the number of mappings of a page. The PSS calculation
> +will be imprecise in this case.
> +
>  "Referenced" indicates the amount of memory currently marked as referenced or
>  accessed.
>
> diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> index 16aa1fd260771..70205425a2daa 100644
> --- a/fs/proc/internal.h
> +++ b/fs/proc/internal.h
> @@ -143,6 +143,7 @@ unsigned name_to_int(const struct qstr *qstr);
>  /* Worst case buffer size needed for holding an integer. */
>  #define PROC_NUMBUF 13
>
> +#ifdef CONFIG_PAGE_MAPCOUNT
>  /**
>   * folio_precise_page_mapcount() - Number of mappings of this folio page.
>   * @folio: The folio.
> @@ -173,6 +174,13 @@ static inline int folio_precise_page_mapcount(struct folio *folio,
>
>          return mapcount;
>  }
> +#else /* !CONFIG_PAGE_MAPCOUNT */
> +static inline int folio_precise_page_mapcount(struct folio *folio,
> +                struct page *page)
> +{
> +        BUILD_BUG();
> +}
> +#endif /* CONFIG_PAGE_MAPCOUNT */
>
>  /**
>   * folio_average_page_mapcount() - Average number of mappings per page in this
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index d7ee842367f0f..7ca0bc3bf417d 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -707,6 +707,8 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
>          struct folio *folio = page_folio(page);
>          int i, nr = compound ? compound_nr(page) : 1;
>          unsigned long size = nr * PAGE_SIZE;
> +        bool exclusive;
> +        int mapcount;
>
>          /*
>           * First accumulate quantities that depend only on |size| and the type
> @@ -747,18 +749,29 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
>                                dirty, locked, present);
>                  return;
>          }
> +
> +        if (IS_ENABLED(CONFIG_NO_PAGE_MAPCOUNT)) {
> +                mapcount = folio_average_page_mapcount(folio);

This seems inconsistent with how folio_average_page_mapcount() is used in
patches 16 and 18.

> +                exclusive = !folio_maybe_mapped_shared(folio);
> +        }
> +
>          /*
>           * We obtain a snapshot of the mapcount. Without holding the folio lock
>           * this snapshot can be slightly wrong as we cannot always read the
>           * mapcount atomically.
>           */
>          for (i = 0; i < nr; i++, page++) {
> -                int mapcount = folio_precise_page_mapcount(folio, page);
>                  unsigned long pss = PAGE_SIZE << PSS_SHIFT;
> +
> +                if (IS_ENABLED(CONFIG_PAGE_MAPCOUNT)) {
> +                        mapcount = folio_precise_page_mapcount(folio, page);
> +                        exclusive = mapcount < 2;
> +                }
> +
>                  if (mapcount >= 2)
>                          pss /= mapcount;
>                  smaps_page_accumulate(mss, folio, PAGE_SIZE, pss,
> -                              dirty, locked, mapcount < 2);
> +                              dirty, locked, exclusive);
>          }
>  }
>
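For readers following along, here is a minimal userspace sketch (not part of the
patch, and not kernel code) of the PSS arithmetic described in the commit
message above: it compares the precise per-page calculation with one that
applies a single averaged mapcount to every page of a 4-page folio. The 4 KiB
page size, the example mapcount values, and the round-to-nearest integer
averaging are assumptions made purely for illustration.

#include <stdio.h>

#define PAGE_SIZE 4096.0

/* Precise PSS: each page contributes PAGE_SIZE divided by its own mapcount. */
static double pss_precise(const int *mapcount, int nr)
{
        double pss = 0.0;
        int i;

        for (i = 0; i < nr; i++)
                pss += PAGE_SIZE / (mapcount[i] >= 2 ? mapcount[i] : 1);
        return pss;
}

/*
 * Approximated PSS: one integer average mapcount for the whole folio
 * (assumed here to round to the nearest integer), applied to every page.
 */
static double pss_averaged(const int *mapcount, int nr)
{
        int total = 0, avg, i;

        for (i = 0; i < nr; i++)
                total += mapcount[i];
        avg = (total + nr / 2) / nr;
        return nr * PAGE_SIZE / (avg >= 2 ? avg : 1);
}

int main(void)
{
        /* Made-up per-page mapcounts for two 4-page folios. */
        int over[4]  = { 1, 1, 1, 2 };  /* average rounds down to 1 */
        int under[4] = { 1, 3, 3, 3 };  /* average rounds up to 3 */

        printf("folio 1: precise PSS %.0f bytes, averaged PSS %.0f bytes\n",
               pss_precise(over, 4), pss_averaged(over, 4));
        printf("folio 2: precise PSS %.0f bytes, averaged PSS %.0f bytes\n",
               pss_precise(under, 4), pss_averaged(under, 4));
        return 0;
}

With these numbers, the first folio's PSS is over-estimated (the average rounds
down to 1, so no page gets divided) and the second folio's PSS is
under-estimated (a few heavily shared pages pull the average up), matching the
"both an over-estimation and an under-estimation" caveat in the commit message.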
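A similar hedged sketch for the private vs. shared split that feeds the USS:
with per-page mapcounts each page of a partially shared folio is classified on
its own, while without them a single folio-level answer covers all pages, which
is the "might be considered shared" behaviour the documentation hunk above
describes. The mapcount values and the simplified folio-level check are
illustrative assumptions only, not the kernel's folio_maybe_mapped_shared()
implementation.

#include <stdio.h>

#define PAGE_SIZE 4096UL

int main(void)
{
        /* Made-up mapcounts: three exclusive pages, one also mapped elsewhere. */
        const int mapcount[4] = { 1, 1, 1, 2 };
        const int nr = 4;
        unsigned long private_kb = 0, shared_kb = 0;
        int i, maybe_shared = 0;

        /* With per-page mapcounts, each page is classified on its own. */
        for (i = 0; i < nr; i++) {
                if (mapcount[i] < 2)
                        private_kb += PAGE_SIZE / 1024;
                else
                        shared_kb += PAGE_SIZE / 1024;
        }
        printf("per-page:  Private %lu kB, Shared %lu kB\n", private_kb, shared_kb);

        /*
         * Without per-page mapcounts there is only one answer for the whole
         * folio: if any page might be mapped by another process, all pages
         * are accounted as shared (a simplified stand-in for the folio-level
         * "maybe mapped shared" check).
         */
        for (i = 0; i < nr; i++)
                if (mapcount[i] >= 2)
                        maybe_shared = 1;

        private_kb = shared_kb = 0;
        if (maybe_shared)
                shared_kb = nr * PAGE_SIZE / 1024;
        else
                private_kb = nr * PAGE_SIZE / 1024;
        printf("per-folio: Private %lu kB, Shared %lu kB\n", private_kb, shared_kb);
        return 0;
}

With these numbers, 12 kB that a precise per-page walk would report as private
is reported as shared, which is the USS under-estimation the commit message
warns about.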