Message ID | 20240212213922.783301-19-surenb@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Memory allocation profiling | expand |
On 2/12/24 22:39, Suren Baghdasaryan wrote: > When a high-order page is split into smaller ones, each newly split > page should get its codetag. The original codetag is reused for these > pages but it's recorded as 0-byte allocation because original codetag > already accounts for the original high-order allocated page. Wouldn't it be possible to adjust the original's accounted size and redistribute to the split pages for more accuracy?
On Fri, Feb 16, 2024 at 6:33 AM Vlastimil Babka <vbabka@suse.cz> wrote: > > On 2/12/24 22:39, Suren Baghdasaryan wrote: > > When a high-order page is split into smaller ones, each newly split > > page should get its codetag. The original codetag is reused for these > > pages but it's recorded as 0-byte allocation because original codetag > > already accounts for the original high-order allocated page. > > Wouldn't it be possible to adjust the original's accounted size and > redistribute to the split pages for more accuracy? I can't recall why I didn't do it that way but I'll try to change and see if something non-obvious comes up. Thanks! >
On Fri, Feb 16, 2024 at 4:46 PM Suren Baghdasaryan <surenb@google.com> wrote: > > On Fri, Feb 16, 2024 at 6:33 AM Vlastimil Babka <vbabka@suse.cz> wrote: > > > > On 2/12/24 22:39, Suren Baghdasaryan wrote: > > > When a high-order page is split into smaller ones, each newly split > > > page should get its codetag. The original codetag is reused for these > > > pages but it's recorded as 0-byte allocation because original codetag > > > already accounts for the original high-order allocated page. > > > > Wouldn't it be possible to adjust the original's accounted size and > > redistribute to the split pages for more accuracy? > > I can't recall why I didn't do it that way but I'll try to change and > see if something non-obvious comes up. Thanks! Ok, now I recall what's happening here. alloc_tag_add() effectively does two things: 1. it sets reference to point to the tag (ref->ct = &tag->ct) 2. it increments tag->counters In pgalloc_tag_split() by calling alloc_tag_add(codetag_ref_from_page_ext(page_ext), tag, 0); we effectively set the reference from new page_ext to point to the original tag but we keep the tag->counters->bytes counter the same (incrementing by 0). It still increments tag->counters->calls but I think we need that because when freeing individual split pages we will be decrementing this counter for each individual page. We allocated many pages with one call, then split into smaller pages and will be freeing them with multiple calls. We need to balance out the call counter during the split. I can refactor the part of alloc_tag_add() that sets the reference into a separate alloc_tag_ref_set() and make it set the reference and increments tag->counters->calls (with a comment explaining why we need this increment here). Then I can call alloc_tag_ref_set() from inside alloc_tag_add() and when splitting pages. I think that will be a bit more clear. > > >
diff --git a/include/linux/pgalloc_tag.h b/include/linux/pgalloc_tag.h index a060c26eb449..0174aff5e871 100644 --- a/include/linux/pgalloc_tag.h +++ b/include/linux/pgalloc_tag.h @@ -62,11 +62,41 @@ static inline void pgalloc_tag_sub(struct page *page, unsigned int order) } } +static inline void pgalloc_tag_split(struct page *page, unsigned int nr) +{ + int i; + struct page_ext *page_ext; + union codetag_ref *ref; + struct alloc_tag *tag; + + if (!mem_alloc_profiling_enabled()) + return; + + page_ext = page_ext_get(page); + if (unlikely(!page_ext)) + return; + + ref = codetag_ref_from_page_ext(page_ext); + if (!ref->ct) + goto out; + + tag = ct_to_alloc_tag(ref->ct); + page_ext = page_ext_next(page_ext); + for (i = 1; i < nr; i++) { + /* New reference with 0 bytes accounted */ + alloc_tag_add(codetag_ref_from_page_ext(page_ext), tag, 0); + page_ext = page_ext_next(page_ext); + } +out: + page_ext_put(page_ext); +} + #else /* CONFIG_MEM_ALLOC_PROFILING */ static inline void pgalloc_tag_add(struct page *page, struct task_struct *task, unsigned int order) {} static inline void pgalloc_tag_sub(struct page *page, unsigned int order) {} +static inline void pgalloc_tag_split(struct page *page, unsigned int nr) {} #endif /* CONFIG_MEM_ALLOC_PROFILING */ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 94c958f7ebb5..86daae671319 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -38,6 +38,7 @@ #include <linux/sched/sysctl.h> #include <linux/memory-tiers.h> #include <linux/compat.h> +#include <linux/pgalloc_tag.h> #include <asm/tlb.h> #include <asm/pgalloc.h> @@ -2899,6 +2900,7 @@ static void __split_huge_page(struct page *page, struct list_head *list, /* Caller disabled irqs, so they are still disabled here */ split_page_owner(head, nr); + pgalloc_tag_split(head, nr); /* See comment in __split_huge_page_tail() */ if (PageAnon(head)) { diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 58c0e8b948a4..4bc5b4720fee 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2621,6 +2621,7 @@ void split_page(struct page *page, unsigned int order) for (i = 1; i < (1 << order); i++) set_page_refcounted(page + i); split_page_owner(page, 1 << order); + pgalloc_tag_split(page, 1 << order); split_page_memcg(page, 1 << order); } EXPORT_SYMBOL_GPL(split_page); @@ -4806,6 +4807,7 @@ static void *make_alloc_exact(unsigned long addr, unsigned int order, struct page *last = page + nr; split_page_owner(page, 1 << order); + pgalloc_tag_split(page, 1 << order); split_page_memcg(page, 1 << order); while (page < --last) set_page_refcounted(last);
When a high-order page is split into smaller ones, each newly split page should get its codetag. The original codetag is reused for these pages but it's recorded as 0-byte allocation because original codetag already accounts for the original high-order allocated page. Signed-off-by: Suren Baghdasaryan <surenb@google.com> --- include/linux/pgalloc_tag.h | 30 ++++++++++++++++++++++++++++++ mm/huge_memory.c | 2 ++ mm/page_alloc.c | 2 ++ 3 files changed, 34 insertions(+)