From patchwork Thu Feb 3 10:52:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12734035 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D0B9DC433EF for ; Thu, 3 Feb 2022 10:52:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232705AbiBCKwm (ORCPT ); Thu, 3 Feb 2022 05:52:42 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:58911 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230016AbiBCKwl (ORCPT ); Thu, 3 Feb 2022 05:52:41 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1643885561; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=0nUhFDS2VVNBtWmr5UL6MxVLGL6ZahCobZOHDmafTBo=; b=NFtuoxXvi9F9Jp//tb6FzS6R3NPLIjTsmmeuIc3A8kIbUwk8TdPjs10jyt59BCMrB3Z6C4 XaFWZwJ34O7xQ2jMGHC2V82pKQnicCIR4Bft8M3IJRVYkmMOLSId2xPB2gpTHoMIE0fuIo HqLeYyH37O3Q9yZt/4P9en1gir5NXKo= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-601-qhQFYVz6OwyACj3fUT9yiQ-1; Thu, 03 Feb 2022 05:52:39 -0500 X-MC-Unique: qhQFYVz6OwyACj3fUT9yiQ-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 71F4A640A1; Thu, 3 Feb 2022 10:52:34 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.64]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9F12766E18; Thu, 3 Feb 2022 10:52:13 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: David Hildenbrand , Oscar Salvador , Anatoly Pugachev , Andrew Morton , Greg Kroah-Hartman , Michal Hocko , Mike Rapoport , Catalin Marinas , Will Deacon , Thomas Bogendoerfer , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , Paul Walmsley , Palmer Dabbelt , Albert Ou , Heiko Carstens , Vasily Gorbik , Yoshinori Sato , Rich Felker , "David S. Miller" , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "Rafael J. Wysocki" , x86@kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v1] drivers/base/node: consolidate node device subsystem initialization in node_dev_init() Date: Thu, 3 Feb 2022 11:52:12 +0100 Message-Id: <20220203105212.30385-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Precedence: bulk List-ID: X-Mailing-List: linux-mips@vger.kernel.org ... and call node_dev_init() after memory_dev_init() from driver_init(), so before any of the existing arch/subsys calls. All online nodes should be known at that point: early during boot, arch code determines node and zone ranges and sets the relevant nodes online; usually this happens in setup_arch(). This is in line with memory_dev_init(), which initializes the memory device subsystem and creates all memory block devices. Similar to memory_dev_init(), panic() if anything goes wrong, we don't want to continue with such basic initialization errors. The important part is that node_dev_init() gets called after memory_dev_init() and after cpu_dev_init(), but before any of the relevant archs call register_cpu() to register the new cpu device under the node device. The latter should be the case for the current users of topology_init(). Reviewed-by: Oscar Salvador Tested-by: Anatoly Pugachev (sparc64) Cc: Andrew Morton Cc: Greg Kroah-Hartman Cc: Michal Hocko Cc: Oscar Salvador Cc: Mike Rapoport Cc: Catalin Marinas Cc: Will Deacon Cc: Thomas Bogendoerfer Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Paul Walmsley Cc: Palmer Dabbelt Cc: Albert Ou Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Yoshinori Sato Cc: Rich Felker Cc: "David S. Miller" Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: "Rafael J. Wysocki" Cc: x86@kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-riscv@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: linux-mm@kvack.org Signed-off-by: David Hildenbrand --- RFC v1 -> v1: * Extended patch description slighly * Tested on x86-64, s390x, ppc64 and aarch64 * Added Rb and Tested-by. --- arch/arm64/kernel/setup.c | 3 --- arch/ia64/kernel/topology.c | 10 ---------- arch/mips/kernel/topology.c | 5 ----- arch/powerpc/kernel/sysfs.c | 17 ----------------- arch/riscv/kernel/setup.c | 3 --- arch/s390/kernel/numa.c | 7 ------- arch/sh/kernel/topology.c | 5 ----- arch/sparc/kernel/sysfs.c | 12 ------------ arch/x86/kernel/topology.c | 5 ----- drivers/base/init.c | 1 + drivers/base/node.c | 30 +++++++++++++++++------------- include/linux/node.h | 4 ++++ 12 files changed, 22 insertions(+), 80 deletions(-) base-commit: 26291c54e111ff6ba87a164d85d4a4e134b7315c diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index f70573928f1b..3505789cf4bd 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -406,9 +406,6 @@ static int __init topology_init(void) { int i; - for_each_online_node(i) - register_one_node(i); - for_each_possible_cpu(i) { struct cpu *cpu = &per_cpu(cpu_data.cpu, i); cpu->hotpluggable = cpu_can_disable(i); diff --git a/arch/ia64/kernel/topology.c b/arch/ia64/kernel/topology.c index e4992917a24b..94a848b06f15 100644 --- a/arch/ia64/kernel/topology.c +++ b/arch/ia64/kernel/topology.c @@ -70,16 +70,6 @@ static int __init topology_init(void) { int i, err = 0; -#ifdef CONFIG_NUMA - /* - * MCD - Do we want to register all ONLINE nodes, or all POSSIBLE nodes? - */ - for_each_online_node(i) { - if ((err = register_one_node(i))) - goto out; - } -#endif - sysfs_cpus = kcalloc(NR_CPUS, sizeof(struct ia64_cpu), GFP_KERNEL); if (!sysfs_cpus) panic("kzalloc in topology_init failed - NR_CPUS too big?"); diff --git a/arch/mips/kernel/topology.c b/arch/mips/kernel/topology.c index 08ad6371fbe0..9429d85a4703 100644 --- a/arch/mips/kernel/topology.c +++ b/arch/mips/kernel/topology.c @@ -12,11 +12,6 @@ static int __init topology_init(void) { int i, ret; -#ifdef CONFIG_NUMA - for_each_online_node(i) - register_one_node(i); -#endif /* CONFIG_NUMA */ - for_each_present_cpu(i) { struct cpu *c = &per_cpu(cpu_devices, i); diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c index d45a415d5374..2069bbb90a9a 100644 --- a/arch/powerpc/kernel/sysfs.c +++ b/arch/powerpc/kernel/sysfs.c @@ -1110,14 +1110,6 @@ EXPORT_SYMBOL_GPL(cpu_remove_dev_attr_group); /* NUMA stuff */ #ifdef CONFIG_NUMA -static void __init register_nodes(void) -{ - int i; - - for (i = 0; i < MAX_NUMNODES; i++) - register_one_node(i); -} - int sysfs_add_device_to_node(struct device *dev, int nid) { struct node *node = node_devices[nid]; @@ -1132,13 +1124,6 @@ void sysfs_remove_device_from_node(struct device *dev, int nid) sysfs_remove_link(&node->dev.kobj, kobject_name(&dev->kobj)); } EXPORT_SYMBOL_GPL(sysfs_remove_device_from_node); - -#else -static void __init register_nodes(void) -{ - return; -} - #endif /* Only valid if CPU is present. */ @@ -1155,8 +1140,6 @@ static int __init topology_init(void) { int cpu, r; - register_nodes(); - for_each_possible_cpu(cpu) { struct cpu *c = &per_cpu(cpu_devices, cpu); diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c index b42bfdc67482..834eb652a7b9 100644 --- a/arch/riscv/kernel/setup.c +++ b/arch/riscv/kernel/setup.c @@ -301,9 +301,6 @@ static int __init topology_init(void) { int i, ret; - for_each_online_node(i) - register_one_node(i); - for_each_possible_cpu(i) { struct cpu *cpu = &per_cpu(cpu_devices, i); diff --git a/arch/s390/kernel/numa.c b/arch/s390/kernel/numa.c index 51c5a9f6e525..23ab9f02f278 100644 --- a/arch/s390/kernel/numa.c +++ b/arch/s390/kernel/numa.c @@ -33,10 +33,3 @@ void __init numa_setup(void) NODE_DATA(0)->node_spanned_pages = memblock_end_of_DRAM() >> PAGE_SHIFT; NODE_DATA(0)->node_id = 0; } - -static int __init numa_init_late(void) -{ - register_one_node(0); - return 0; -} -arch_initcall(numa_init_late); diff --git a/arch/sh/kernel/topology.c b/arch/sh/kernel/topology.c index 76af6db9daa2..2d2a7509b565 100644 --- a/arch/sh/kernel/topology.c +++ b/arch/sh/kernel/topology.c @@ -46,11 +46,6 @@ static int __init topology_init(void) { int i, ret; -#ifdef CONFIG_NUMA - for_each_online_node(i) - register_one_node(i); -#endif - for_each_present_cpu(i) { struct cpu *c = &per_cpu(cpu_devices, i); diff --git a/arch/sparc/kernel/sysfs.c b/arch/sparc/kernel/sysfs.c index 6d60d416f0dd..f19487e4cc71 100644 --- a/arch/sparc/kernel/sysfs.c +++ b/arch/sparc/kernel/sysfs.c @@ -244,22 +244,10 @@ static void __init check_mmu_stats(void) mmu_stats_supported = 1; } -static void register_nodes(void) -{ -#ifdef CONFIG_NUMA - int i; - - for (i = 0; i < MAX_NUMNODES; i++) - register_one_node(i); -#endif -} - static int __init topology_init(void) { int cpu, ret; - register_nodes(); - check_mmu_stats(); for_each_possible_cpu(cpu) { diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c index bd83748e2bde..8617d1ed9d31 100644 --- a/arch/x86/kernel/topology.c +++ b/arch/x86/kernel/topology.c @@ -154,11 +154,6 @@ static int __init topology_init(void) { int i; -#ifdef CONFIG_NUMA - for_each_online_node(i) - register_one_node(i); -#endif - for_each_present_cpu(i) arch_register_cpu(i); diff --git a/drivers/base/init.c b/drivers/base/init.c index a9f57c22fb9e..d8d0fe687111 100644 --- a/drivers/base/init.c +++ b/drivers/base/init.c @@ -35,5 +35,6 @@ void __init driver_init(void) auxiliary_bus_init(); cpu_dev_init(); memory_dev_init(); + node_dev_init(); container_dev_init(); } diff --git a/drivers/base/node.c b/drivers/base/node.c index 87acc47e8951..a133981a12fc 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -1065,26 +1065,30 @@ static const struct attribute_group *cpu_root_attr_groups[] = { }; #define NODE_CALLBACK_PRI 2 /* lower than SLAB */ -static int __init register_node_type(void) +void __init node_dev_init(void) { - int ret; + static struct notifier_block node_memory_callback_nb = { + .notifier_call = node_memory_callback, + .priority = NODE_CALLBACK_PRI, + }; + int ret, i; BUILD_BUG_ON(ARRAY_SIZE(node_state_attr) != NR_NODE_STATES); BUILD_BUG_ON(ARRAY_SIZE(node_state_attrs)-1 != NR_NODE_STATES); ret = subsys_system_register(&node_subsys, cpu_root_attr_groups); - if (!ret) { - static struct notifier_block node_memory_callback_nb = { - .notifier_call = node_memory_callback, - .priority = NODE_CALLBACK_PRI, - }; - register_hotmemory_notifier(&node_memory_callback_nb); - } + if (ret) + panic("%s() failed to register subsystem: %d\n", __func__, ret); + + register_hotmemory_notifier(&node_memory_callback_nb); /* - * Note: we're not going to unregister the node class if we fail - * to register the node state class attribute files. + * Create all node devices, which will properly link the node + * to applicable memory block devices and already created cpu devices. */ - return ret; + for_each_online_node(i) { + ret = register_one_node(i); + if (ret) + panic("%s() failed to add node: %d\n", __func__, ret); + } } -postcore_initcall(register_node_type); diff --git a/include/linux/node.h b/include/linux/node.h index bb21fd631b16..f3be6ccfebed 100644 --- a/include/linux/node.h +++ b/include/linux/node.h @@ -112,6 +112,7 @@ static inline void link_mem_sections(int nid, unsigned long start_pfn, extern void unregister_node(struct node *node); #ifdef CONFIG_NUMA +extern void node_dev_init(void); /* Core of the node registration - only memory hotplug should use this */ extern int __register_one_node(int nid); @@ -149,6 +150,9 @@ extern void register_hugetlbfs_with_node(node_registration_func_t doregister, node_registration_func_t unregister); #endif #else +static inline void node_dev_init(void) +{ +} static inline int __register_one_node(int nid) { return 0;