
mm/migrate: fix CPUHP state to update node demotion order

Message ID 20210918025849.88901-1-ying.huang@intel.com (mailing list archive)
State New
Series mm/migrate: fix CPUHP state to update node demotion order

Commit Message

Huang, Ying Sept. 18, 2021, 2:58 a.m. UTC
The node demotion order needs to be updated during CPU hotplug, because
whether a NUMA node has CPUs may influence the demotion order.  The
update function should be called during CPU online/offline, after
node_states[N_CPU] has been updated; that happens in CPUHP_AP_ONLINE_DYN
during CPU online and in CPUHP_MM_VMSTAT_DEAD during CPU offline.  But
in commit 884a6e5d1f93 ("mm/migrate: update node demotion order on
hotplug events"), the function that updates the node demotion order is
registered in CPUHP_AP_ONLINE_DYN for both CPU online and offline, which
doesn't satisfy this ordering requirement.  So this patch adds
CPUHP_AP_MM_DEMOTION_ONLINE and CPUHP_MM_DEMOTION_OFFLINE, whose
callbacks run after CPUHP_AP_ONLINE_DYN and CPUHP_MM_VMSTAT_DEAD during
CPU online/offline respectively, and registers the update function on
them.

Fixes: 884a6e5d1f93 ("mm/migrate: update node demotion order on hotplug events")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Keith Busch <kbusch@kernel.org>
---
 include/linux/cpuhotplug.h | 2 ++
 mm/migrate.c               | 8 +++++---
 2 files changed, 7 insertions(+), 3 deletions(-)
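
For reference, the callbacks being re-registered by this patch are thin
wrappers around the demotion-order rebuild.  A minimal sketch of how they look
in mm/migrate.c after commit 884a6e5d1f93 (illustrative, not part of this
diff):

/*
 * Both hotplug callbacks simply rebuild the demotion order.  That order is
 * derived starting from node_states[N_CPU], which is why the callbacks must
 * run only after node_states[N_CPU] has been updated.
 */
static int migration_online_cpu(unsigned int cpu)
{
	set_migration_target_nodes();
	return 0;
}

static int migration_offline_cpu(unsigned int cpu)
{
	set_migration_target_nodes();
	return 0;
}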

Comments

Mika Penttilä Sept. 18, 2021, 4:04 a.m. UTC | #1
Hi!

On 18.9.2021 5.58, Huang Ying wrote:
> The node demotion order needs to be updated during CPU hotplug.
> Because whether a NUMA node has CPU may influence the demotion order.
> The update function should be called during CPU online/offline after
> the node_states[N_CPU] has been updated.  That is done in
> CPUHP_AP_ONLINE_DYN during CPU online and in CPUHP_MM_VMSTAT_DEAD
> during CPU offline.  But in commit 884a6e5d1f93 ("mm/migrate: update
> node demotion order on hotplug events"), the function to update node
> demotion order is called in CPUHP_AP_ONLINE_DYN during CPU
> online/offline.  This doesn't satisfy the order requirement.  So in
> this patch, we added CPUHP_AP_MM_DEMOTION_ONLINE and
> CPUHP_MM_DEMOTION_OFFLINE to be called after CPUHP_AP_ONLINE_DYN and
> CPUHP_MM_VMSTAT_DEAD during CPU online/offline, and register the
> update function on them.
>
> Fixes: 884a6e5d1f93 ("mm/migrate: update node demotion order on hotplug events")
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Wei Xu <weixugc@google.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Greg Thelen <gthelen@google.com>
> Cc: Keith Busch <kbusch@kernel.org>
> ---
>   include/linux/cpuhotplug.h | 2 ++
>   mm/migrate.c               | 8 +++++---
>   2 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index 832d8a74fa59..5a92ea56f21b 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -72,6 +72,7 @@ enum cpuhp_state {
>   	CPUHP_SLUB_DEAD,
>   	CPUHP_DEBUG_OBJ_DEAD,
>   	CPUHP_MM_WRITEBACK_DEAD,
> +	CPUHP_MM_DEMOTION_OFFLINE,
>   	CPUHP_MM_VMSTAT_DEAD,
>   	CPUHP_SOFTIRQ_DEAD,
>   	CPUHP_NET_MVNETA_DEAD,
> @@ -240,6 +241,7 @@ enum cpuhp_state {
>   	CPUHP_AP_BASE_CACHEINFO_ONLINE,
>   	CPUHP_AP_ONLINE_DYN,
>   	CPUHP_AP_ONLINE_DYN_END		= CPUHP_AP_ONLINE_DYN + 30,
> +	CPUHP_AP_MM_DEMOTION_ONLINE,
>   	CPUHP_AP_X86_HPET_ONLINE,
>   	CPUHP_AP_X86_KVM_CLK_ONLINE,
>   	CPUHP_AP_DTPM_CPU_ONLINE,
> diff --git a/mm/migrate.c b/mm/migrate.c
> index a6a7743ee98f..77d107a4577f 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -3278,9 +3278,8 @@ static int __init migrate_on_reclaim_init(void)
>   {
>   	int ret;
>   
> -	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "migrate on reclaim",
> -				migration_online_cpu,
> -				migration_offline_cpu);
> +	ret = cpuhp_setup_state_nocalls(CPUHP_MM_DEMOTION_OFFLINE, "mm/demotion:offline",
> +					NULL, migration_offline_cpu);
>   	/*
>   	 * In the unlikely case that this fails, the automatic
>   	 * migration targets may become suboptimal for nodes
> @@ -3288,6 +3287,9 @@ static int __init migrate_on_reclaim_init(void)
>   	 * rare case, do not bother trying to do anything special.
>   	 */
>   	WARN_ON(ret < 0);
> +	ret = cpuhp_setup_state_nocalls(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online",
> +					migration_online_cpu, NULL);
>

You changed to the _nocalls variant; how does this handle initialization for
CPUs present at boot?

Thanks,
Mika
Huang, Ying Sept. 18, 2021, 6:56 a.m. UTC | #2
Mika Penttilä <mika.penttila@nextfour.com> writes:

> Hi!
>
> On 18.9.2021 5.58, Huang Ying wrote:
>> The node demotion order needs to be updated during CPU hotplug.
>> Because whether a NUMA node has CPU may influence the demotion order.
>> The update function should be called during CPU online/offline after
>> the node_states[N_CPU] has been updated.  That is done in
>> CPUHP_AP_ONLINE_DYN during CPU online and in CPUHP_MM_VMSTAT_DEAD
>> during CPU offline.  But in commit 884a6e5d1f93 ("mm/migrate: update
>> node demotion order on hotplug events"), the function to update node
>> demotion order is called in CPUHP_AP_ONLINE_DYN during CPU
>> online/offline.  This doesn't satisfy the order requirement.  So in
>> this patch, we added CPUHP_AP_MM_DEMOTION_ONLINE and
>> CPUHP_MM_DEMOTION_OFFLINE to be called after CPUHP_AP_ONLINE_DYN and
>> CPUHP_MM_VMSTAT_DEAD during CPU online/offline, and register the
>> update function on them.
>>
>> Fixes: 884a6e5d1f93 ("mm/migrate: update node demotion order on hotplug events")
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Cc: Dave Hansen <dave.hansen@linux.intel.com>
>> Cc: Yang Shi <shy828301@gmail.com>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Wei Xu <weixugc@google.com>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: David Rientjes <rientjes@google.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: David Hildenbrand <david@redhat.com>
>> Cc: Greg Thelen <gthelen@google.com>
>> Cc: Keith Busch <kbusch@kernel.org>
>> ---
>>   include/linux/cpuhotplug.h | 2 ++
>>   mm/migrate.c               | 8 +++++---
>>   2 files changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
>> index 832d8a74fa59..5a92ea56f21b 100644
>> --- a/include/linux/cpuhotplug.h
>> +++ b/include/linux/cpuhotplug.h
>> @@ -72,6 +72,7 @@ enum cpuhp_state {
>>   	CPUHP_SLUB_DEAD,
>>   	CPUHP_DEBUG_OBJ_DEAD,
>>   	CPUHP_MM_WRITEBACK_DEAD,
>> +	CPUHP_MM_DEMOTION_OFFLINE,
>>   	CPUHP_MM_VMSTAT_DEAD,
>>   	CPUHP_SOFTIRQ_DEAD,
>>   	CPUHP_NET_MVNETA_DEAD,
>> @@ -240,6 +241,7 @@ enum cpuhp_state {
>>   	CPUHP_AP_BASE_CACHEINFO_ONLINE,
>>   	CPUHP_AP_ONLINE_DYN,
>>   	CPUHP_AP_ONLINE_DYN_END		= CPUHP_AP_ONLINE_DYN + 30,
>> +	CPUHP_AP_MM_DEMOTION_ONLINE,
>>   	CPUHP_AP_X86_HPET_ONLINE,
>>   	CPUHP_AP_X86_KVM_CLK_ONLINE,
>>   	CPUHP_AP_DTPM_CPU_ONLINE,
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index a6a7743ee98f..77d107a4577f 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -3278,9 +3278,8 @@ static int __init migrate_on_reclaim_init(void)
>>   {
>>   	int ret;
>>   -	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "migrate on
>> reclaim",
>> -				migration_online_cpu,
>> -				migration_offline_cpu);
>> +	ret = cpuhp_setup_state_nocalls(CPUHP_MM_DEMOTION_OFFLINE, "mm/demotion:offline",
>> +					NULL, migration_offline_cpu);
>>   	/*
>>   	 * In the unlikely case that this fails, the automatic
>>   	 * migration targets may become suboptimal for nodes
>> @@ -3288,6 +3287,9 @@ static int __init migrate_on_reclaim_init(void)
>>   	 * rare case, do not bother trying to do anything special.
>>   	 */
>>   	WARN_ON(ret < 0);
>> +	ret = cpuhp_setup_state_nocalls(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online",
>> +					migration_online_cpu, NULL);
>>
>
> You changed to the _nocalls variant; how does this handle initialization
> for CPUs present at boot?

You are right! Thanks!

There is some discussion about CPUHP in another thread as follows,

https://lore.kernel.org/lkml/CAAPL-u_Tig1jK=mv_r=j-A-hR3Kpu7txiSFbPR3a8O1qhM1s-Q@mail.gmail.com/

I will wait for the discussion in that thread as well before taking the next step.

Best Regards,
Huang, Ying
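
A minimal sketch of one way to handle the CPUs that are already online at boot
(illustrative only, not the posted patch): keep the _nocalls variant for the
offline state, but use plain cpuhp_setup_state() for the online state, so that
migration_online_cpu() is invoked for every CPU that is already up and the
initial demotion order is built at init time.

static int __init migrate_on_reclaim_init(void)
{
	int ret;

	ret = cpuhp_setup_state_nocalls(CPUHP_MM_DEMOTION_OFFLINE, "mm/demotion:offline",
					NULL, migration_offline_cpu);
	WARN_ON(ret < 0);

	/*
	 * The calling variant invokes the startup callback for all CPUs that
	 * are already online, so the demotion order is also initialized for
	 * the boot CPUs.
	 */
	ret = cpuhp_setup_state(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online",
				migration_online_cpu, NULL);
	WARN_ON(ret < 0);

	hotplug_memory_notifier(migrate_on_reclaim_callback, 100);
	return 0;
}
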
Thomas Gleixner Sept. 20, 2021, 7:02 a.m. UTC | #3
On Sat, Sep 18 2021 at 10:58, Huang Ying wrote:
> @@ -72,6 +72,7 @@ enum cpuhp_state {
>  	CPUHP_SLUB_DEAD,
>  	CPUHP_DEBUG_OBJ_DEAD,
>  	CPUHP_MM_WRITEBACK_DEAD,
> +	CPUHP_MM_DEMOTION_OFFLINE,

Please keep the _DEAD convention in that section. The plugged CPU is
already gone.

>  	CPUHP_MM_VMSTAT_DEAD,
>  	CPUHP_SOFTIRQ_DEAD,
>  	CPUHP_NET_MVNETA_DEAD,
> @@ -240,6 +241,7 @@ enum cpuhp_state {
>  	CPUHP_AP_BASE_CACHEINFO_ONLINE,
>  	CPUHP_AP_ONLINE_DYN,
>  	CPUHP_AP_ONLINE_DYN_END		= CPUHP_AP_ONLINE_DYN + 30,
> +	CPUHP_AP_MM_DEMOTION_ONLINE,

Are there any ordering requirements of these states vs. other CPU
hotplug states?

If not, then please use the dynamically allocated states.

If so, then please add a comment:

+	/* Must be before CPUHP_XXX and after CPUHP_YYY */
+	CPUHP_MM_DEMOTION_OFFLINE,

Thanks,

        tglx
Huang, Ying Sept. 21, 2021, 6:41 a.m. UTC | #4
Thomas Gleixner <tglx@linutronix.de> writes:

> On Sat, Sep 18 2021 at 10:58, Huang Ying wrote:
>> @@ -72,6 +72,7 @@ enum cpuhp_state {
>>  	CPUHP_SLUB_DEAD,
>>  	CPUHP_DEBUG_OBJ_DEAD,
>>  	CPUHP_MM_WRITEBACK_DEAD,
>> +	CPUHP_MM_DEMOTION_OFFLINE,
>
> Please keep the _DEAD convention in that section. The plugged CPU is
> already gone.

Sure.  Will do that.

>>  	CPUHP_MM_VMSTAT_DEAD,
>>  	CPUHP_SOFTIRQ_DEAD,
>>  	CPUHP_NET_MVNETA_DEAD,
>> @@ -240,6 +241,7 @@ enum cpuhp_state {
>>  	CPUHP_AP_BASE_CACHEINFO_ONLINE,
>>  	CPUHP_AP_ONLINE_DYN,
>>  	CPUHP_AP_ONLINE_DYN_END		= CPUHP_AP_ONLINE_DYN + 30,
>> +	CPUHP_AP_MM_DEMOTION_ONLINE,
>
> Are there any ordering requirements of these states vs. other CPU
> hotplug states?
>
> If not, then please use the dynamically allocated states.
>
> If so, then please add a comment:
>
> +	/* Must be before CPUHP_XXX and after CPUHP_YYY */
> +	CPUHP_MM_DEMOTION_OFFLINE,

The callbacks need to be called after node_states[N_CPU] has been
updated during CPU online/offline, and node_states[N_CPU] is updated in
CPUHP_AP_ONLINE_DYN (online) and CPUHP_MM_VMSTAT_DEAD (offline).  So the
new state must be before CPUHP_MM_VMSTAT_DEAD for offline and after
CPUHP_AP_ONLINE_DYN for online.  I will update the patch and add the
comments.

Best Regards,
Huang, Ying
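
Putting the two review points together, the enum placement in a follow-up
could look roughly like the excerpt below (a sketch only: the _DEAD rename and
the ordering comments follow Thomas's request, the neighbouring states are
those touched by this patch, and elided entries are marked with "..."):

enum cpuhp_state {
	...
	CPUHP_MM_WRITEBACK_DEAD,
	/*
	 * Must be before CPUHP_MM_VMSTAT_DEAD: teardown callbacks run in
	 * reverse state order, so this one is invoked only after the
	 * CPUHP_MM_VMSTAT_DEAD callback has cleared node_states[N_CPU].
	 */
	CPUHP_MM_DEMOTION_DEAD,
	CPUHP_MM_VMSTAT_DEAD,
	...
	CPUHP_AP_ONLINE_DYN,
	CPUHP_AP_ONLINE_DYN_END		= CPUHP_AP_ONLINE_DYN + 30,
	/*
	 * Must be after CPUHP_AP_ONLINE_DYN: bringup callbacks run in state
	 * order, so this one is invoked only after the CPUHP_AP_ONLINE_DYN
	 * callback that sets node_states[N_CPU] has run.
	 */
	CPUHP_AP_MM_DEMOTION_ONLINE,
	...
};
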

Patch

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 832d8a74fa59..5a92ea56f21b 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -72,6 +72,7 @@  enum cpuhp_state {
 	CPUHP_SLUB_DEAD,
 	CPUHP_DEBUG_OBJ_DEAD,
 	CPUHP_MM_WRITEBACK_DEAD,
+	CPUHP_MM_DEMOTION_OFFLINE,
 	CPUHP_MM_VMSTAT_DEAD,
 	CPUHP_SOFTIRQ_DEAD,
 	CPUHP_NET_MVNETA_DEAD,
@@ -240,6 +241,7 @@  enum cpuhp_state {
 	CPUHP_AP_BASE_CACHEINFO_ONLINE,
 	CPUHP_AP_ONLINE_DYN,
 	CPUHP_AP_ONLINE_DYN_END		= CPUHP_AP_ONLINE_DYN + 30,
+	CPUHP_AP_MM_DEMOTION_ONLINE,
 	CPUHP_AP_X86_HPET_ONLINE,
 	CPUHP_AP_X86_KVM_CLK_ONLINE,
 	CPUHP_AP_DTPM_CPU_ONLINE,
diff --git a/mm/migrate.c b/mm/migrate.c
index a6a7743ee98f..77d107a4577f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -3278,9 +3278,8 @@  static int __init migrate_on_reclaim_init(void)
 {
 	int ret;
 
-	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "migrate on reclaim",
-				migration_online_cpu,
-				migration_offline_cpu);
+	ret = cpuhp_setup_state_nocalls(CPUHP_MM_DEMOTION_OFFLINE, "mm/demotion:offline",
+					NULL, migration_offline_cpu);
 	/*
 	 * In the unlikely case that this fails, the automatic
 	 * migration targets may become suboptimal for nodes
@@ -3288,6 +3287,9 @@  static int __init migrate_on_reclaim_init(void)
 	 * rare case, do not bother trying to do anything special.
 	 */
 	WARN_ON(ret < 0);
+	ret = cpuhp_setup_state_nocalls(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online",
+					migration_online_cpu, NULL);
+	WARN_ON(ret < 0);
 
 	hotplug_memory_notifier(migrate_on_reclaim_callback, 100);
 	return 0;