From patchwork Mon Jan 25 06:08:16 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: tangchen X-Patchwork-Id: 8103851 Return-Path: X-Original-To: patchwork-linux-acpi@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 02E049F818 for ; Mon, 25 Jan 2016 06:09:54 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 6081F203C2 for ; Mon, 25 Jan 2016 06:09:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7E19E20260 for ; Mon, 25 Jan 2016 06:09:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755392AbcAYGJt (ORCPT ); Mon, 25 Jan 2016 01:09:49 -0500 Received: from cn.fujitsu.com ([59.151.112.132]:6584 "EHLO heian.cn.fujitsu.com" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1751031AbcAYGIf (ORCPT ); Mon, 25 Jan 2016 01:08:35 -0500 X-IronPort-AV: E=Sophos;i="5.20,346,1444665600"; d="scan'208";a="2930657" Received: from unknown (HELO cn.fujitsu.com) ([10.167.33.5]) by heian.cn.fujitsu.com with ESMTP; 25 Jan 2016 14:08:28 +0800 Received: from G08CNEXCHPEKD01.g08.fujitsu.local (unknown [10.167.33.80]) by cn.fujitsu.com (Postfix) with ESMTP id EB88A41824FF; Mon, 25 Jan 2016 14:07:45 +0800 (CST) Received: from tangchen.g08.fujitsu.local (10.167.226.71) by G08CNEXCHPEKD01.g08.fujitsu.local (10.167.33.89) with Microsoft SMTP Server (TLS) id 14.3.181.6; Mon, 25 Jan 2016 14:07:45 +0800 From: Tang Chen To: , , , , , , , , , , , , , CC: , , , , Subject: [PATCH v5 RESEND 1/5] x86, memhp, numa: Online memory-less nodes at boot time. Date: Mon, 25 Jan 2016 14:08:16 +0800 Message-ID: <1453702100-2597-2-git-send-email-tangchen@cn.fujitsu.com> X-Mailer: git-send-email 1.9.3 In-Reply-To: <1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.com> References: <1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.com> MIME-Version: 1.0 X-Originating-IP: [10.167.226.71] X-yoursite-MailScanner-ID: EB88A41824FF.AE27D X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: tangchen@cn.fujitsu.com X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP For now, x86 does not support memory-less node. A node without memory will not be onlined, and the cpus on it will be mapped to the other online nodes with memory in init_cpu_to_node(). The reason of doing this is to ensure each cpu has mapped to a node with memory, so that it will be able to allocate local memory for that cpu. But we don't have to do it in this way. In this series of patches, we are going to construct cpu <-> node mapping for all possible cpus at boot time, which is a 1-1 mapping. It means the cpu will be mapped to the node it belongs to, and will never be changed. If a node has only cpus but no memory, the cpus on it will be mapped to a memory-less node. And the memory-less node should be onlined. This patch allocate pgdats for all memory-less nodes and online them at boot time. Then build zonelists for these nodes. As a result, when cpus on these memory-less nodes try to allocate memory from local node, it will automatically fall back to the proper zones in the zonelists. --- arch/x86/mm/numa.c | 27 +++++++++++++-------------- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c index c3b3f65..010edb4 100644 --- a/arch/x86/mm/numa.c +++ b/arch/x86/mm/numa.c @@ -704,22 +704,19 @@ void __init x86_numa_init(void) numa_init(dummy_numa_init); } -static __init int find_near_online_node(int node) +static void __init init_memory_less_node(int nid) { - int n, val; - int min_val = INT_MAX; - int best_node = -1; + unsigned long zones_size[MAX_NR_ZONES] = {0}; + unsigned long zholes_size[MAX_NR_ZONES] = {0}; - for_each_online_node(n) { - val = node_distance(node, n); + /* Allocate and initialize node data. Memory-less node is now online.*/ + alloc_node_data(nid); + free_area_init_node(nid, zones_size, 0, zholes_size); - if (val < min_val) { - min_val = val; - best_node = n; - } - } - - return best_node; + /* + * All zonelists will be built later in start_kernel() after per cpu + * areas are initialized. + */ } /* @@ -748,8 +745,10 @@ void __init init_cpu_to_node(void) if (node == NUMA_NO_NODE) continue; + if (!node_online(node)) - node = find_near_online_node(node); + init_memory_less_node(node); + numa_set_node(cpu, node); } }