diff mbox series

[RFC] mm/vmstat: Add events for THP migration without split

Message ID 1589257372-29576-1-git-send-email-anshuman.khandual@arm.com (mailing list archive)
State New, archived
Headers show
Series [RFC] mm/vmstat: Add events for THP migration without split | expand

Commit Message

Anshuman Khandual May 12, 2020, 4:22 a.m. UTC
Add the following new trace events which will help in validating migration
events involving PMD based THP pages.

1. THP_PMD_MIGRATION_ENTRY_SET
2. THP_PMD_MIGRATION_ENTRY_REMOVE

There are no clear method to confirm whether a THP migration happened with
out involving it's split. These trace events along with PGMIGRATE_SUCCESS
and PGMIGRATE_FAILURE will provide additional insights. After this change,

A single 2M THP (2K base page) when migrated

1. Without split

................
pgmigrate_success 1
pgmigrate_fail 0
................
thp_pmd_migration_entry_set 1
thp_pmd_migration_entry_remove 1
................

2. With split

................
pgmigrate_success 512
pgmigrate_fail 0
................
thp_pmd_migration_entry_set 0
thp_pmd_migration_entry_remove 0
................

pgmigrate_success as 1 instead of 512, provides a hint for possible THP
migration event. But then it gets mixed with normal page migrations over
time. These additional trace events provide required co-relation.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
This is an indirect way for counting PMD migrations without split. Is there
a better method possible ? Just request for comments at the moment.

 include/linux/vm_event_item.h | 4 ++++
 mm/migrate.c                  | 1 +
 mm/rmap.c                     | 1 +
 mm/vmstat.c                   | 4 ++++
 4 files changed, 10 insertions(+)

Comments

Zi Yan May 14, 2020, 2:28 p.m. UTC | #1
On 12 May 2020, at 0:22, Anshuman Khandual wrote:

> Add the following new trace events which will help in validating migration
> events involving PMD based THP pages.
>
> 1. THP_PMD_MIGRATION_ENTRY_SET
> 2. THP_PMD_MIGRATION_ENTRY_REMOVE
>
> There are no clear method to confirm whether a THP migration happened with
> out involving it's split. These trace events along with PGMIGRATE_SUCCESS
> and PGMIGRATE_FAILURE will provide additional insights. After this change,
>
> A single 2M THP (2K base page) when migrated
>
> 1. Without split
>
> ................
> pgmigrate_success 1
> pgmigrate_fail 0
> ................
> thp_pmd_migration_entry_set 1
> thp_pmd_migration_entry_remove 1
> ................
>
> 2. With split
>
> ................
> pgmigrate_success 512
> pgmigrate_fail 0
> ................
> thp_pmd_migration_entry_set 0
> thp_pmd_migration_entry_remove 0
> ................
>
> pgmigrate_success as 1 instead of 512, provides a hint for possible THP
> migration event. But then it gets mixed with normal page migrations over
> time. These additional trace events provide required co-relation.

To track successful THP migrations, the code below should work, right?

diff --git a/mm/migrate.c b/mm/migrate.c
index b1092876e537..d394f5331288 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1220,6 +1220,8 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
         * we want to retry.
         */
        if (rc == MIGRATEPAGE_SUCCESS) {
+               if (PageTransHuge(newpage))
+                       count_vm_event(THP_PMD_MIGRATION_SUCCESS);
                put_page(page);
                if (reason == MR_MEMORY_FAILURE) {
                        /*

Maybe you could give more details on why you want to add the THP migration event and
how you are going to use the event in your use case. That would be very helpful to this
code review. Are you going to do anything if you see THP migration failures?

Thanks.

—
Best Regards,
Yan Zi
John Hubbard May 14, 2020, 6:29 p.m. UTC | #2
On 2020-05-11 21:22, Anshuman Khandual wrote:
> Add the following new trace events which will help in validating migration
> events involving PMD based THP pages.
> 
> 1. THP_PMD_MIGRATION_ENTRY_SET
> 2. THP_PMD_MIGRATION_ENTRY_REMOVE
> 
> There are no clear method to confirm whether a THP migration happened with
> out involving it's split. These trace events along with PGMIGRATE_SUCCESS
> and PGMIGRATE_FAILURE will provide additional insights. After this change,
> 


Hi Anshuman,

It's very nice to see this work, and I think that reporting a bit more
about THP migration stats is going to make development and performance
debugging a lot more efficient (and pleasant).


> A single 2M THP (2K base page) when migrated
> 
> 1. Without split
> 
> ................
> pgmigrate_success 1
> pgmigrate_fail 0
> ................
> thp_pmd_migration_entry_set 1
> thp_pmd_migration_entry_remove 1
> ................
> 

I do think we should decouple the trace event name(s) just a *little* more,
from the mechanisms used to migrate THPs. In other words, let's report
the number of THP migration successes, and name it accordingly--rather
than "set" and "remove", which are pretty low-level and furthermore
depend on today's exact code.

Maybe Zi Yan's recommended name is exactly right, in fact:

     THP_PMD_MIGRATION_SUCCESS


thanks,
Anshuman Khandual May 15, 2020, 3:50 a.m. UTC | #3
On 05/14/2020 07:58 PM, Zi Yan wrote:
> On 12 May 2020, at 0:22, Anshuman Khandual wrote:
> 
>> Add the following new trace events which will help in validating migration
>> events involving PMD based THP pages.
>>
>> 1. THP_PMD_MIGRATION_ENTRY_SET
>> 2. THP_PMD_MIGRATION_ENTRY_REMOVE
>>
>> There are no clear method to confirm whether a THP migration happened with
>> out involving it's split. These trace events along with PGMIGRATE_SUCCESS
>> and PGMIGRATE_FAILURE will provide additional insights. After this change,
>>
>> A single 2M THP (2K base page) when migrated
>>
>> 1. Without split
>>
>> ................
>> pgmigrate_success 1
>> pgmigrate_fail 0
>> ................
>> thp_pmd_migration_entry_set 1
>> thp_pmd_migration_entry_remove 1
>> ................
>>
>> 2. With split
>>
>> ................
>> pgmigrate_success 512
>> pgmigrate_fail 0
>> ................
>> thp_pmd_migration_entry_set 0
>> thp_pmd_migration_entry_remove 0
>> ................
>>
>> pgmigrate_success as 1 instead of 512, provides a hint for possible THP
>> migration event. But then it gets mixed with normal page migrations over
>> time. These additional trace events provide required co-relation.
> 
> To track successful THP migrations, the code below should work, right?
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index b1092876e537..d394f5331288 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1220,6 +1220,8 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>          * we want to retry.
>          */
>         if (rc == MIGRATEPAGE_SUCCESS) {
> +               if (PageTransHuge(newpage))
> +                       count_vm_event(THP_PMD_MIGRATION_SUCCESS);

Thats right.

>                 put_page(page);
>                 if (reason == MR_MEMORY_FAILURE) {
>                         /*

Another THP_PMD_MIGRATION_FAILURE event in migrate_pages() when the THP gets
split as a huge page could not be allocated. Both THP_PMD_MIGRATION_SUCCESS
and THP_PMD_MIGRATION_FAILURE will provide a better understanding regarding
THP migration events on the system.

> 
> Maybe you could give more details on why you want to add the THP migration event and
> how you are going to use the event in your use case. That would be very helpful to this
> code review. Are you going to do anything if you see THP migration failures?

Not at the moment. Like other VM events these new ones will provide required
statistics (and better understanding) on THP migration which can be used to
improve it further. Follows the same good old principle, if we cannot measure
we cannot improve.
Anshuman Khandual May 15, 2020, 4:03 a.m. UTC | #4
On 05/14/2020 11:59 PM, John Hubbard wrote:
> On 2020-05-11 21:22, Anshuman Khandual wrote:
>> Add the following new trace events which will help in validating
>> migration events involving PMD based THP pages.
>> 
>> 1. THP_PMD_MIGRATION_ENTRY_SET 2. THP_PMD_MIGRATION_ENTRY_REMOVE
>> 
>> There are no clear method to confirm whether a THP migration
>> happened with out involving it's split. These trace events along
>> with PGMIGRATE_SUCCESS and PGMIGRATE_FAILURE will provide
>> additional insights. After this change,
>> 
> 
> 
> Hi Anshuman,
> 
> It's very nice to see this work, and I think that reporting a bit
> more about THP migration stats is going to make development and
> performance debugging a lot more efficient (and pleasant).

That is definitely one of the motivations for these events here.

> 
> 
>> A single 2M THP (2K base page) when migrated
>> 
>> 1. Without split
>> 
>> ................ pgmigrate_success 1 pgmigrate_fail 0 
>> ................ thp_pmd_migration_entry_set 1 
>> thp_pmd_migration_entry_remove 1 ................
>> 
> 
> I do think we should decouple the trace event name(s) just a *little*
> more, from the mechanisms used to migrate THPs. In other words, let's
> report the number of THP migration successes, and name it
> accordingly--rather than "set" and "remove", which are pretty
> low-level and furthermore depend on today's exact code.

Agreed, the events are low level and follows the implementation very
closely. Hence posted as a RFC instead, as I was not very sure about
these events.

> 
> Maybe Zi Yan's recommended name is exactly right, in fact:
> 
> THP_PMD_MIGRATION_SUCCESS

Will also add another THP_PMD_MIGRATION_FAILURE even in migrate_pages()
when a huge page could not be allocated and THP gets split.
diff mbox series

Patch

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index ffef0f279747..4b25102cf3ad 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -91,6 +91,10 @@  enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		THP_ZERO_PAGE_ALLOC_FAILED,
 		THP_SWPOUT,
 		THP_SWPOUT_FALLBACK,
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+		THP_PMD_MIGRATION_ENTRY_SET,
+		THP_PMD_MIGRATION_ENTRY_REMOVE,
+#endif
 #endif
 #ifdef CONFIG_MEMORY_BALLOON
 		BALLOON_INFLATE,
diff --git a/mm/migrate.c b/mm/migrate.c
index 7160c1556f79..8d50d55cbe97 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -228,6 +228,7 @@  static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		if (!pvmw.pte) {
 			VM_BUG_ON_PAGE(PageHuge(page) || !PageTransCompound(page), page);
 			remove_migration_pmd(&pvmw, new);
+			count_vm_event(THP_PMD_MIGRATION_ENTRY_REMOVE);
 			continue;
 		}
 #endif
diff --git a/mm/rmap.c b/mm/rmap.c
index f79a206b271a..3c1fe3f45cb5 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1418,6 +1418,7 @@  static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			VM_BUG_ON_PAGE(PageHuge(page) || !PageTransCompound(page), page);
 
 			set_pmd_migration_entry(&pvmw, page);
+			count_vm_event(THP_PMD_MIGRATION_ENTRY_SET);
 			continue;
 		}
 #endif
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 96d21a792b57..a5254b7ee531 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1274,6 +1274,10 @@  const char * const vmstat_text[] = {
 	"thp_zero_page_alloc_failed",
 	"thp_swpout",
 	"thp_swpout_fallback",
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+	"thp_pmd_migration_entry_set",
+	"thp_pmd_migration_entry_remove",
+#endif
 #endif
 #ifdef CONFIG_MEMORY_BALLOON
 	"balloon_inflate",