From patchwork Thu Apr 1 18:32:18 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 12179297
Subject: [PATCH 01/10] mm/numa: node demotion data structure and lookup
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Dave Hansen, shy828301@gmail.com, weixugc@google.com, rientjes@google.com, ying.huang@intel.com, dan.j.williams@intel.com, david@redhat.com, osalvador@suse.de
From: Dave Hansen
Date: Thu, 01 Apr 2021 11:32:18 -0700
References: <20210401183216.443C4443@viggo.jf.intel.com>
In-Reply-To: <20210401183216.443C4443@viggo.jf.intel.com>
Message-Id: <20210401183218.E7C9CE24@viggo.jf.intel.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Dave Hansen

Prepare for the kernel to auto-migrate pages to other memory nodes
with a user-defined node migration table.
This allows creating a single migration target for each NUMA node
to enable the kernel to do NUMA page migrations instead of simply
reclaiming colder pages.  A node with no target is a "terminal node",
so reclaim acts normally there.  The migration target does not
fundamentally _need_ to be a single node, but this implementation
starts there to limit complexity.

If you consider the migration path as a graph, cycles (loops) in the
graph are disallowed.  This avoids wasting resources by constantly
migrating (A->B, B->A, A->B ...).  The expectation is that cycles will
never be allowed.

Signed-off-by: Dave Hansen
Reviewed-by: Yang Shi
Cc: Wei Xu
Cc: David Rientjes
Cc: Huang Ying
Cc: Dan Williams
Cc: David Hildenbrand
Cc: osalvador
Reviewed-by: Oscar Salvador
Reviewed-by: Wei Xu

---

changes since 20200122:
 * Make node_demotion[] __read_mostly

changes in July 2020:
 - Remove loop from next_demotion_node() and get_online_mems().
   This means that the node returned by next_demotion_node()
   might now be offline, but the worst case is that the
   allocation fails.  That's fine since it is transient.

---

 b/mm/migrate.c |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff -puN mm/migrate.c~0006-node-Define-and-export-memory-migration-path mm/migrate.c
--- a/mm/migrate.c~0006-node-Define-and-export-memory-migration-path	2021-03-31 15:17:10.734000264 -0700
+++ b/mm/migrate.c	2021-03-31 15:17:10.742000264 -0700
@@ -1163,6 +1163,23 @@ out:
 	return rc;
 }
 
+static int node_demotion[MAX_NUMNODES] __read_mostly =
+	{[0 ...  MAX_NUMNODES - 1] = NUMA_NO_NODE};
+
+/**
+ * next_demotion_node() - Get the next node in the demotion path
+ * @node: The starting node to lookup the next node
+ *
+ * @returns: node id for next memory node in the demotion path hierarchy
+ * from @node; NUMA_NO_NODE if @node is terminal.  This does not keep
+ * @node online or guarantee that it *continues* to be the next demotion
+ * target.
+ */
+int next_demotion_node(int node)
+{
+	return node_demotion[node];
+}
+
 /*
  * Obtain the lock on page, remove all ptes and migrate the page
  * to the newly allocated page in newpage.
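
The "no cycles" rule from the description can be illustrated with a
small userspace sketch.  This is not kernel code and not part of this
patch: MAX_NUMNODES is set to 8 purely for illustration, and
demotion_table_init(), demotion_path_has_cycle() and
set_demotion_target() are hypothetical helpers.  Only the table layout
and next_demotion_node() mirror what the patch adds.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NUMNODES 8		/* assumption: small value for the sketch */
#define NUMA_NO_NODE (-1)

/* Userspace stand-in for the patch's node_demotion[] table. */
static int node_demotion[MAX_NUMNODES];

/* Hypothetical: mark every node terminal.  Must run before any lookup. */
static void demotion_table_init(void)
{
	for (int i = 0; i < MAX_NUMNODES; i++)
		node_demotion[i] = NUMA_NO_NODE;
}

/* Same lookup as in the patch: one table read, no locking. */
static int next_demotion_node(int node)
{
	return node_demotion[node];
}

/*
 * Hypothetical: follow the path from @start.  An acyclic path visits
 * at most MAX_NUMNODES nodes, so visiting more means a loop exists.
 */
static bool demotion_path_has_cycle(int start)
{
	int visited = 0;

	for (int node = start; node != NUMA_NO_NODE;
	     node = next_demotion_node(node)) {
		if (++visited > MAX_NUMNODES)
			return true;
	}
	return false;
}

/*
 * Hypothetical: install @node -> @target, refusing out-of-range ids
 * and any update that would close a loop (A->B, B->A, ...).
 */
static bool set_demotion_target(int node, int target)
{
	int old;

	if (node < 0 || node >= MAX_NUMNODES)
		return false;
	if (target != NUMA_NO_NODE && (target < 0 || target >= MAX_NUMNODES))
		return false;

	old = node_demotion[node];
	node_demotion[node] = target;
	if (demotion_path_has_cycle(node)) {
		node_demotion[node] = old;	/* roll back the rejected edge */
		return false;
	}
	return true;
}
```

With the table initialized, setting 0->1 and 1->2 succeeds, while a
later attempt to set 2->0 is rejected and node 2 stays terminal.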