From patchwork Wed Sep 29 06:03:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Huang, Ying" X-Patchwork-Id: 12524677 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A01F2C433EF for ; Wed, 29 Sep 2021 06:04:15 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 33114613CD for ; Wed, 29 Sep 2021 06:04:15 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 33114613CD Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 6EC8594000E; Wed, 29 Sep 2021 02:04:14 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 69C3C940009; Wed, 29 Sep 2021 02:04:14 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 58A9D94000E; Wed, 29 Sep 2021 02:04:14 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0023.hostedemail.com [216.40.44.23]) by kanga.kvack.org (Postfix) with ESMTP id 462C5940009 for ; Wed, 29 Sep 2021 02:04:14 -0400 (EDT) Received: from smtpin34.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id F27AE1828621A for ; Wed, 29 Sep 2021 06:04:12 +0000 (UTC) X-FDA: 78639570744.34.391C8A6 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by imf02.hostedemail.com (Postfix) with ESMTP id 1C4AA7001A24 for ; Wed, 29 Sep 2021 06:04:11 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10121"; a="221663324" X-IronPort-AV: E=Sophos;i="5.85,331,1624345200"; d="scan'208";a="221663324" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2021 23:04:10 -0700 X-IronPort-AV: E=Sophos;i="5.85,331,1624345200"; d="scan'208";a="562981471" Received: from xuefeich-mobl1.ccr.corp.intel.com (HELO yhuang6-mobl1.ccr.corp.intel.com) ([10.254.215.118]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2021 23:04:06 -0700 From: Huang Ying To: Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying , Thomas Gleixner , Dave Hansen , Yang Shi , Zi Yan , Michal Hocko , Wei Xu , Oscar Salvador , David Rientjes , Dan Williams , David Hildenbrand , Greg Thelen , Keith Busch Subject: [PATCH -V3] mm/migrate: fix CPUHP state to update node demotion order Date: Wed, 29 Sep 2021 14:03:51 +0800 Message-Id: <20210929060351.7293-1-ying.huang@intel.com> X-Mailer: git-send-email 2.30.2 MIME-Version: 1.0 X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 1C4AA7001A24 X-Stat-Signature: 73emorayr3hgmbwi1huay4r97bmpy9zm Authentication-Results: imf02.hostedemail.com; dkim=none; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=intel.com (policy=none); spf=none (imf02.hostedemail.com: domain of ying.huang@intel.com has no SPF policy when checking 192.55.52.93) smtp.mailfrom=ying.huang@intel.com X-HE-Tag: 1632895451-18064 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The node demotion order needs to be updated during CPU hotplug. Because whether a NUMA node has CPU may influence the demotion order. The update function should be called during CPU online/offline after the node_states[N_CPU] has been updated. That is done in CPUHP_AP_ONLINE_DYN during CPU online and in CPUHP_MM_VMSTAT_DEAD during CPU offline. But in commit 884a6e5d1f93 ("mm/migrate: update node demotion order on hotplug events"), the function to update node demotion order is called in CPUHP_AP_ONLINE_DYN during CPU online/offline. This doesn't satisfy the order requirement. For example, there are 4 CPUs (P0, P1, P2, P3) in 2 sockets (P0, P1 in S0 and P2, P3 in S1), the demotion order is - S0 -> NUMA_NO_NODE - S1 -> NUMA_NO_NODE After P2 and P3 is offlined, because S1 has no CPU now, the demotion order should have been changed to - S0 -> S1 - S1 -> NO_NODE but it isn't changed, because the order updating callback for CPU hotplug doesn't see the new nodemask. After that, if P1 is offlined, the demotion order is changed to the expected order as above. So in this patch, we added CPUHP_AP_MM_DEMOTION_ONLINE and CPUHP_MM_DEMOTION_DEAD to be called after CPUHP_AP_ONLINE_DYN and CPUHP_MM_VMSTAT_DEAD during CPU online and offline, and register the update function on them. Fixes: 884a6e5d1f93 ("mm/migrate: update node demotion order on hotplug events") Signed-off-by: "Huang, Ying" Cc: Thomas Gleixner Cc: Dave Hansen Cc: Yang Shi Cc: Zi Yan Cc: Michal Hocko Cc: Wei Xu Cc: Oscar Salvador Cc: David Rientjes Cc: Dan Williams Cc: David Hildenbrand Cc: Greg Thelen Cc: Keith Busch Changes: v3: - Revise patch description per Andrew's comments. v2: - Revise state name to follow the naming convention per Thomas' comments. - Use cpuhp_setup_state() to initialize the initial order per Mika's comments. --- include/linux/cpuhotplug.h | 4 ++++ mm/migrate.c | 8 +++++--- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 832d8a74fa59..991911048857 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -72,6 +72,8 @@ enum cpuhp_state { CPUHP_SLUB_DEAD, CPUHP_DEBUG_OBJ_DEAD, CPUHP_MM_WRITEBACK_DEAD, + /* Must be after CPUHP_MM_VMSTAT_DEAD */ + CPUHP_MM_DEMOTION_DEAD, CPUHP_MM_VMSTAT_DEAD, CPUHP_SOFTIRQ_DEAD, CPUHP_NET_MVNETA_DEAD, @@ -240,6 +242,8 @@ enum cpuhp_state { CPUHP_AP_BASE_CACHEINFO_ONLINE, CPUHP_AP_ONLINE_DYN, CPUHP_AP_ONLINE_DYN_END = CPUHP_AP_ONLINE_DYN + 30, + /* Must be after CPUHP_AP_ONLINE_DYN for node_states[N_CPU] update */ + CPUHP_AP_MM_DEMOTION_ONLINE, CPUHP_AP_X86_HPET_ONLINE, CPUHP_AP_X86_KVM_CLK_ONLINE, CPUHP_AP_DTPM_CPU_ONLINE, diff --git a/mm/migrate.c b/mm/migrate.c index c14a55004fee..7769abac8aad 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -3284,9 +3284,8 @@ static int __init migrate_on_reclaim_init(void) { int ret; - ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "migrate on reclaim", - migration_online_cpu, - migration_offline_cpu); + ret = cpuhp_setup_state_nocalls(CPUHP_MM_DEMOTION_DEAD, "mm/demotion:offline", + NULL, migration_offline_cpu); /* * In the unlikely case that this fails, the automatic * migration targets may become suboptimal for nodes @@ -3294,6 +3293,9 @@ static int __init migrate_on_reclaim_init(void) * rare case, do not bother trying to do anything special. */ WARN_ON(ret < 0); + ret = cpuhp_setup_state(CPUHP_AP_MM_DEMOTION_ONLINE, "mm/demotion:online", + migration_online_cpu, NULL); + WARN_ON(ret < 0); hotplug_memory_notifier(migrate_on_reclaim_callback, 100); return 0;