
[v3,1/2] mm: add per-order mTHP split counters

Message ID 20240704012905.42971-2-ioworker0@gmail.com
State New
Series mm: introduce per-order mTHP split counters

Commit Message

Lance Yang July 4, 2024, 1:29 a.m. UTC
Currently, the split counters in THP statistics do not include
PTE-mapped mTHP. Therefore, we propose introducing per-order mTHP split
counters to monitor the frequency of mTHP splits. This will help
developers better analyze and optimize system performance.

/sys/kernel/mm/transparent_hugepage/hugepages-<size>/stats
        split
        split_failed
        split_deferred

Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Barry Song <baohua@kernel.org>
Signed-off-by: Mingzhe Yang <mingzhe.yang@ly.com>
Signed-off-by: Lance Yang <ioworker0@gmail.com>
---
 include/linux/huge_mm.h |  3 +++
 mm/huge_memory.c        | 12 ++++++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)
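
The new counters can be read directly from sysfs. Below is a minimal
user-space sketch (not part of the patch); the "64kB" size directory is
an assumption, since the available hugepages-<size> entries depend on
the architecture and kernel configuration:

    #include <stdio.h>

    /* Read one per-order mTHP counter; returns -1 on error.
     * The path layout follows the sysfs interface added by this series. */
    static long read_mthp_stat(const char *size, const char *stat)
    {
    	char path[128];
    	long val = -1;
    	FILE *f;

    	snprintf(path, sizeof(path),
    		 "/sys/kernel/mm/transparent_hugepage/hugepages-%s/stats/%s",
    		 size, stat);
    	f = fopen(path, "r");
    	if (!f)
    		return -1;
    	if (fscanf(f, "%ld", &val) != 1)
    		val = -1;
    	fclose(f);
    	return val;
    }

    int main(void)
    {
    	/* "64kB" is a hypothetical example size. */
    	printf("split:          %ld\n", read_mthp_stat("64kB", "split"));
    	printf("split_failed:   %ld\n", read_mthp_stat("64kB", "split_failed"));
    	printf("split_deferred: %ld\n", read_mthp_stat("64kB", "split_deferred"));
    	return 0;
    }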

Comments

David Hildenbrand July 5, 2024, 9:08 a.m. UTC | #1
> @@ -3253,8 +3259,9 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>   		i_mmap_unlock_read(mapping);
>   out:
>   	xas_destroy(&xas);
> -	if (is_thp)
> +	if (order >= HPAGE_PMD_ORDER)

We likely should be using "== HPAGE_PMD_ORDER" here, to be safe for the 
future.

Acked-by: David Hildenbrand <david@redhat.com>
Barry Song July 5, 2024, 10:12 a.m. UTC | #2
On Fri, Jul 5, 2024 at 9:08 PM David Hildenbrand <david@redhat.com> wrote:
>
> > @@ -3253,8 +3259,9 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >               i_mmap_unlock_read(mapping);
> >   out:
> >       xas_destroy(&xas);
> > -     if (is_thp)
> > +     if (order >= HPAGE_PMD_ORDER)
>
> We likely should be using "== HPAGE_PMD_ORDER" here, to be safe for the
> future.

I feel this might need to be separate since all other places are using
folio_test_pmd_mappable() ?

>
> Acked-by: David Hildenbrand <david@redhat.com>
>
> --
> Cheers,
>
> David / dhildenb
>
David Hildenbrand July 5, 2024, 10:14 a.m. UTC | #3
On 05.07.24 12:12, Barry Song wrote:
> On Fri, Jul 5, 2024 at 9:08 PM David Hildenbrand <david@redhat.com> wrote:
>>
>>> @@ -3253,8 +3259,9 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>                i_mmap_unlock_read(mapping);
>>>    out:
>>>        xas_destroy(&xas);
>>> -     if (is_thp)
>>> +     if (order >= HPAGE_PMD_ORDER)
>>
>> We likely should be using "== HPAGE_PMD_ORDER" here, to be safe for the
>> future.
> 
> I feel this might need to be separate since all other places are using
> folio_test_pmd_mappable() ?

Likely, but as you are moving away from this ... this counter here does 
and will always only care about HPAGE_PMD_ORDER.
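
For context, folio_test_pmd_mappable() (include/linux/huge_mm.h) is an
order-threshold check, so it and ">= HPAGE_PMD_ORDER" agree today, but
both would diverge from "==" on a hypothetical future kernel with
folios larger than PMD size:

    static inline bool folio_test_pmd_mappable(struct folio *folio)
    {
    	/* True for PMD order and any (hypothetical) larger order. */
    	return folio_order(folio) >= HPAGE_PMD_ORDER;
    }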
Lance Yang July 5, 2024, 10:48 a.m. UTC | #4
Hi David and Barry,

Thanks a lot for paying attention!

On Fri, Jul 5, 2024 at 6:14 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 05.07.24 12:12, Barry Song wrote:
> > On Fri, Jul 5, 2024 at 9:08 PM David Hildenbrand <david@redhat.com> wrote:
> >>
> >>> @@ -3253,8 +3259,9 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >>>                i_mmap_unlock_read(mapping);
> >>>    out:
> >>>        xas_destroy(&xas);
> >>> -     if (is_thp)
> >>> +     if (order >= HPAGE_PMD_ORDER)
> >>
> >> We likely should be using "== HPAGE_PMD_ORDER" here, to be safe for the
> >> future.
> >
> > I feel this might need to be separate since all other places are using
> > folio_test_pmd_mappable() ?
>
> Likely, but as you are moving away from this ... this counter here does
> and will always only care about HPAGE_PMD_ORDER.

I appreciate the different opinions on whether we should use
">= HPAGE_PMD_ORDER" or "==" for this check.

In this context, let's leave it as is and stay consistent with
folio_test_pmd_mappable() by using ">= HPAGE_PMD_ORDER",
what do you think?

Thanks,
Lance

>
> --
> Cheers,
>
> David / dhildenb
>
David Hildenbrand July 5, 2024, 10:55 a.m. UTC | #5
On 05.07.24 12:48, Lance Yang wrote:
> Hi David and Barry,
> 
> Thanks a lot for paying attention!
> 
> On Fri, Jul 5, 2024 at 6:14 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 05.07.24 12:12, Barry Song wrote:
>>> On Fri, Jul 5, 2024 at 9:08 PM David Hildenbrand <david@redhat.com> wrote:
>>>>
>>>>> @@ -3253,8 +3259,9 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>>>                 i_mmap_unlock_read(mapping);
>>>>>     out:
>>>>>         xas_destroy(&xas);
>>>>> -     if (is_thp)
>>>>> +     if (order >= HPAGE_PMD_ORDER)
>>>>
>>>> We likely should be using "== HPAGE_PMD_ORDER" here, to be safe for the
>>>> future.
>>>
>>> I feel this might need to be separate since all other places are using
>>> folio_test_pmd_mappable() ?
>>
>> Likely, but as you are moving away from this ... this counter here does
>> and will always only care about HPAGE_PMD_ORDER.
> 
> I appreciate the different opinions on whether we should use
> ">= HPAGE_PMD_ORDER" or "==" for this check.
> 
> In this context, let's leave it as is and stay consistent with
> folio_test_pmd_mappable() by using ">= HPAGE_PMD_ORDER",
> what do you think?

I don't think it's a good idea to add more wrong code that is even 
harder to grep (folio_test_pmd_mappable would give you candidates that 
might need attention). But I don't care too much. Maybe someone here can 
volunteer to clean up these instances to make sure we check PMD-size and 
not PMD-mappable for these counters that are for PMD-sized folios only, 
even in the future with larger folios?
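
The cleanup suggested here would, roughly, replace the mappable check
with an exact-size check wherever a PMD-only counter is bumped. A
sketch against the deferred_split_folio() hunk from this patch (not
something posted in this thread):

    	if (list_empty(&folio->_deferred_list)) {
    		/* Count only exactly PMD-sized folios for the legacy
    		 * THP counter; ">=" semantics would overcount if
    		 * larger-than-PMD folios ever exist. */
    		if (folio_order(folio) == HPAGE_PMD_ORDER)
    			count_vm_event(THP_DEFERRED_SPLIT_PAGE);
    		count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
    		...
    	}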
Lance Yang July 5, 2024, 11:19 a.m. UTC | #6
On Fri, Jul 5, 2024 at 6:56 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 05.07.24 12:48, Lance Yang wrote:
> > Hi David and Barry,
> >
> > Thanks a lot for paying attention!
> >
> > On Fri, Jul 5, 2024 at 6:14 PM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 05.07.24 12:12, Barry Song wrote:
> >>> On Fri, Jul 5, 2024 at 9:08 PM David Hildenbrand <david@redhat.com> wrote:
> >>>>
> >>>>> @@ -3253,8 +3259,9 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >>>>>                 i_mmap_unlock_read(mapping);
> >>>>>     out:
> >>>>>         xas_destroy(&xas);
> >>>>> -     if (is_thp)
> >>>>> +     if (order >= HPAGE_PMD_ORDER)
> >>>>
> >>>> We likely should be using "== HPAGE_PMD_ORDER" here, to be safe for the
> >>>> future.
> >>>
> >>> I feel this might need to be separate since all other places are using
> >>> folio_test_pmd_mappable() ?
> >>
> >> Likely, but as you are moving away from this ... this counter here does
> >> and will always only care about HPAGE_PMD_ORDER.
> >
> > I appreciate the different opinions on whether we should use
> > ">= HPAGE_PMD_ORDER" or "==" for this check.
> >
> > In this context, let's leave it as is and stay consistent with
> > folio_test_pmd_mappable() by using ">= HPAGE_PMD_ORDER",
> > what do you think?
>
> I don't think it's a good idea to add more wrong code that is even
> harder to grep (folio_test_pmd_mappable would give you candidates that
> might need attention). But I don't care too much. Maybe someone here can
> volunteer to clean up these instances to make sure we check PMD-size and
> not PMD-mappable for these counters that are for PMD-sized folios only,
> even in the future with larger folios?

Thanks for clarifying! Yes, agreed. We should ensure we check PMD-size,
not PMD-mappable here, especially as we consider large folios in the future.

So, let's use "== HPAGE_PMD_ORDER" here ;)

Thanks,
Lance

>
> --
> Cheers,
>
> David / dhildenb
>
Lance Yang July 5, 2024, 11:31 a.m. UTC | #7
Hi Andrew,

Could you please fold the following changes into this patch?

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 993f0dd2a1ed..f8b5cbd4dd71 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3266,7 +3266,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		i_mmap_unlock_read(mapping);
 out:
 	xas_destroy(&xas);
-	if (order >= HPAGE_PMD_ORDER)
+	if (order == HPAGE_PMD_ORDER)
 		count_vm_event(!ret ? THP_SPLIT_PAGE : THP_SPLIT_PAGE_FAILED);
 	count_mthp_stat(order, !ret ? MTHP_STAT_SPLIT : MTHP_STAT_SPLIT_FAILED);
 	return ret;

Thanks,
Lance

Patch

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 212cca384d7e..cee3c5da8f0e 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -284,6 +284,9 @@  enum mthp_stat_item {
 	MTHP_STAT_FILE_ALLOC,
 	MTHP_STAT_FILE_FALLBACK,
 	MTHP_STAT_FILE_FALLBACK_CHARGE,
+	MTHP_STAT_SPLIT,
+	MTHP_STAT_SPLIT_FAILED,
+	MTHP_STAT_SPLIT_DEFERRED,
 	__MTHP_STAT_COUNT
 };
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 954c63575917..5cbd838e96e6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -559,6 +559,9 @@  DEFINE_MTHP_STAT_ATTR(swpout_fallback, MTHP_STAT_SWPOUT_FALLBACK);
 DEFINE_MTHP_STAT_ATTR(file_alloc, MTHP_STAT_FILE_ALLOC);
 DEFINE_MTHP_STAT_ATTR(file_fallback, MTHP_STAT_FILE_FALLBACK);
 DEFINE_MTHP_STAT_ATTR(file_fallback_charge, MTHP_STAT_FILE_FALLBACK_CHARGE);
+DEFINE_MTHP_STAT_ATTR(split, MTHP_STAT_SPLIT);
+DEFINE_MTHP_STAT_ATTR(split_failed, MTHP_STAT_SPLIT_FAILED);
+DEFINE_MTHP_STAT_ATTR(split_deferred, MTHP_STAT_SPLIT_DEFERRED);
 
 static struct attribute *stats_attrs[] = {
 	&anon_fault_alloc_attr.attr,
@@ -569,6 +572,9 @@  static struct attribute *stats_attrs[] = {
 	&file_alloc_attr.attr,
 	&file_fallback_attr.attr,
 	&file_fallback_charge_attr.attr,
+	&split_attr.attr,
+	&split_failed_attr.attr,
+	&split_deferred_attr.attr,
 	NULL,
 };
 
@@ -3068,7 +3074,7 @@  int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 	XA_STATE_ORDER(xas, &folio->mapping->i_pages, folio->index, new_order);
 	struct anon_vma *anon_vma = NULL;
 	struct address_space *mapping = NULL;
-	bool is_thp = folio_test_pmd_mappable(folio);
+	int order = folio_order(folio);
 	int extra_pins, ret;
 	pgoff_t end;
 	bool is_hzp;
@@ -3253,8 +3259,9 @@  int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		i_mmap_unlock_read(mapping);
 out:
 	xas_destroy(&xas);
-	if (is_thp)
+	if (order >= HPAGE_PMD_ORDER)
 		count_vm_event(!ret ? THP_SPLIT_PAGE : THP_SPLIT_PAGE_FAILED);
+	count_mthp_stat(order, !ret ? MTHP_STAT_SPLIT : MTHP_STAT_SPLIT_FAILED);
 	return ret;
 }
 
@@ -3307,6 +3314,7 @@  void deferred_split_folio(struct folio *folio)
 	if (list_empty(&folio->_deferred_list)) {
 		if (folio_test_pmd_mappable(folio))
 			count_vm_event(THP_DEFERRED_SPLIT_PAGE);
+		count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
 		list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
 		ds_queue->split_queue_len++;
 #ifdef CONFIG_MEMCG