From patchwork Thu Jun 13 23:29:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yang Shi X-Patchwork-Id: 10993827 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 11C54924 for ; Thu, 13 Jun 2019 23:30:19 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 03704268AE for ; Thu, 13 Jun 2019 23:30:19 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EBAA626E90; Thu, 13 Jun 2019 23:30:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE,UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6EFBF268AE for ; Thu, 13 Jun 2019 23:30:18 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C608D8E0004; Thu, 13 Jun 2019 19:30:15 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id B749F8E0002; Thu, 13 Jun 2019 19:30:15 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 928C58E0004; Thu, 13 Jun 2019 19:30:15 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-pg1-f197.google.com (mail-pg1-f197.google.com [209.85.215.197]) by kanga.kvack.org (Postfix) with ESMTP id 4E5B18E0002 for ; Thu, 13 Jun 2019 19:30:15 -0400 (EDT) Received: by mail-pg1-f197.google.com with SMTP id z10so433004pgf.15 for ; Thu, 13 Jun 2019 16:30:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:from:to:cc :subject:date:message-id:in-reply-to:references; bh=vaqAMivPZG8zRcB25WGSIeU5u9Z2PotIGlviXE6zM+I=; b=nNz/6P+IrONlFH/DD9nL+6nNdvzZi3q6w2vX+S3AH9gIpDpTRIGGoM7cvIOQTizBmZ +12zNM7SUwtagdh9KT0xrRljFGbKtnvlfv+K0bpFG8rj95y4DM0+3GI0ANLKli68P6MO 5IqE3WfAoJeD/4+ivS6UQOfZhIK2uuxQFJC5KxhX+doNqWMfwjvCJGyCdHFZrgysmEmX LdqLsPy1EgiI92EFrUv0iWQwBQgJHZ/IoP5FcCD0UjI9PRbVk+D+uFw9oapBk67Z0Njb CHf4gYcAZEd6ljSfZ0R8VNhOka8sHP9tWuSWtyp0wiWomN8oA34PXGWQLcGXbkstqRd3 Iz1A== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of yang.shi@linux.alibaba.com designates 47.88.44.36 as permitted sender) smtp.mailfrom=yang.shi@linux.alibaba.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=alibaba.com X-Gm-Message-State: APjAAAU2MPYG+NJQc3cqqOXh5C/fDMR5qUFOr+8HXJvrxnXEAyQJKE2g L5fUAbAuR/0iCgV5JndpmXQNTYDVKr3cwI5Dn9VOeOW/mHz73W42Jw82s8TYR8J08ON9Hccv7Rn h/x8/LCzcr2JoiyljwBE6AJMmPXlY2x66q5T/CwUf6FxFMcDMm3OyP4gAoQVbk+RFPg== X-Received: by 2002:a17:902:1125:: with SMTP id d34mr17968868pla.40.1560468614923; Thu, 13 Jun 2019 16:30:14 -0700 (PDT) X-Google-Smtp-Source: APXvYqwrsJ5EE1+mWH/VJSTag6OW2Pr9fGyGIIfHdZ+vSf2v2tzcuonDET12yXi3njL1LfYLY4nD X-Received: by 2002:a17:902:1125:: with SMTP id d34mr17968769pla.40.1560468613684; Thu, 13 Jun 2019 16:30:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560468613; cv=none; d=google.com; s=arc-20160816; b=N1a6BXuPf1cNZPWdpV4WRfaE7FdsRI44Um0vF86nB2xG/4yD2seWla7/kzqT8yTJKI wd2BWsa9W16ZJ72XuHM6lq/6G8tx/kL18tDx4BFrZpEQfipHE9dfHAs4KETHbUhdhtS2 ViN+5Ga7ylrxial6/xI8W62+ZBuRefLUKrAJogrYYqYewelqzrpPOv98/86Jg5QhNMSX QppAySb1iL7dxgqrsDYJgd0z+vFEqFdVZpxQJhPgSR09Et/erup2+RLSA6U1SMJARng+ GSXfaYIS6OYnmvbz4f3y0tvIFQQ4CBvn2JOdHTNorWiGf+7WcEbcw3bn+dA05J9GcNow eG3A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:cc:to:from; 
bh=vaqAMivPZG8zRcB25WGSIeU5u9Z2PotIGlviXE6zM+I=; b=AzHn8hofvBjMAkmVNaKEvGea1dOJ/BRDKtqKqH/lVPAR+8T431gnhiWMAGikpru+g1 T3SqQZQeWB0AI5xMjyFjOPfjtF0HR1ExJcgokn8oIcUPtJfm4rBwvV3Kio48jTTmnOzl v6EM1Z+nXQSwGJ1H7/P5W+H+TMLJ2uwCI5pf3aWZ0WIR39rzW6RPb5hEnRBJLaE1rfWN 0qt/s7s6V8gGq4olEhSESXFJkvv38Bahp2rhHP0j2piVbLlHyCLjw/ccRC2OgM5djw74 9tn0aC4my4JPk4VvGKkEj1YpFx5UwnRKxPuKvEP/LNxqPw/fKUFV1y9YJVfFTjaJUJRm 57TA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of yang.shi@linux.alibaba.com designates 47.88.44.36 as permitted sender) smtp.mailfrom=yang.shi@linux.alibaba.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: from out4436.biz.mail.alibaba.com (out4436.biz.mail.alibaba.com. [47.88.44.36]) by mx.google.com with ESMTPS id a14si881661pgm.206.2019.06.13.16.30.12 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 13 Jun 2019 16:30:13 -0700 (PDT) Received-SPF: pass (google.com: domain of yang.shi@linux.alibaba.com designates 47.88.44.36 as permitted sender) client-ip=47.88.44.36; Authentication-Results: mx.google.com; spf=pass (google.com: domain of yang.shi@linux.alibaba.com designates 47.88.44.36 as permitted sender) smtp.mailfrom=yang.shi@linux.alibaba.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R751e4;CH=green;DM=||false|;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04426;MF=yang.shi@linux.alibaba.com;NM=1;PH=DS;RN=15;SR=0;TI=SMTPD_---0TU6DYEz_1560468591; Received: from e19h19392.et15sqa.tbsite.net(mailfrom:yang.shi@linux.alibaba.com fp:SMTPD_---0TU6DYEz_1560468591) by smtp.aliyun-inc.com(127.0.0.1); Fri, 14 Jun 2019 07:29:58 +0800 From: Yang Shi To: mhocko@suse.com, mgorman@techsingularity.net, riel@surriel.com, hannes@cmpxchg.org, akpm@linux-foundation.org, dave.hansen@intel.com, keith.busch@intel.com, dan.j.williams@intel.com, fengguang.wu@intel.com, fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com Cc: yang.shi@linux.alibaba.com, 
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [v3 PATCH 1/9] mm: define N_CPU_MEM node states
Date: Fri, 14 Jun 2019 07:29:29 +0800
Message-Id: <1560468577-101178-2-git-send-email-yang.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:
X-Virus-Scanned: ClamAV using ClamSMTP

The kernel has some predefined node masks called node states, e.g.
N_MEMORY, N_CPU, etc. But there might be CPU-less nodes, e.g. PMEM
nodes, and some architectures, e.g. Power, may have memoryless nodes.
It is not very straightforward to get the nodes with both CPUs and
memory. So, define the N_CPU_MEM node state. The nodes with both CPUs
and memory are called "primary" nodes.
/sys/devices/system/node/primary shows the currently online "primary"
nodes.

Signed-off-by: Yang Shi
---
 drivers/base/node.c      |  2 ++
 include/linux/nodemask.h |  3 ++-
 mm/memory_hotplug.c      |  6 ++++++
 mm/page_alloc.c          |  1 +
 mm/vmstat.c              | 11 +++++++++--
 5 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 8598fcb..4d80fc8 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -984,6 +984,7 @@ static ssize_t show_node_state(struct device *dev,
 #endif
 	[N_MEMORY] = _NODE_ATTR(has_memory, N_MEMORY),
 	[N_CPU] = _NODE_ATTR(has_cpu, N_CPU),
+	[N_CPU_MEM] = _NODE_ATTR(primary, N_CPU_MEM),
 };

 static struct attribute *node_state_attrs[] = {
@@ -995,6 +996,7 @@ static ssize_t show_node_state(struct device *dev,
 #endif
 	&node_state_attr[N_MEMORY].attr.attr,
 	&node_state_attr[N_CPU].attr.attr,
+	&node_state_attr[N_CPU_MEM].attr.attr,
 	NULL
 };

diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 27e7fa3..66a8964 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -398,7 +398,8 @@ enum node_states {
 	N_HIGH_MEMORY = N_NORMAL_MEMORY,
 #endif
 	N_MEMORY,		/* The node has memory(regular, high, movable) */
-	N_CPU,		/* The node has one or more cpus */
+	N_CPU,			/* The node has one or more cpus */
+	N_CPU_MEM,		/* The node has both cpus and memory */
 	NR_NODE_STATES
 };

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 328878b..7c29282 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -709,6 +709,9 @@ static void node_states_set_node(int node, struct memory_notify *arg)

 	if (arg->status_change_nid >= 0)
 		node_set_state(node, N_MEMORY);
+
+	if (node_state(node, N_CPU))
+		node_set_state(node, N_CPU_MEM);
 }

 static void __meminit resize_zone_range(struct zone *zone, unsigned long start_pfn,
@@ -1526,6 +1529,9 @@ static void node_states_clear_node(int node, struct memory_notify *arg)

 	if (arg->status_change_nid >= 0)
 		node_clear_state(node, N_MEMORY);
+
+	if (node_state(node, N_CPU))
+		node_clear_state(node, N_CPU_MEM);
 }

 static int __ref __offline_pages(unsigned long start_pfn,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3b13d39..757db89e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -124,6 +124,7 @@ struct pcpu_drain {
 #endif
 	[N_MEMORY] = { { [0] = 1UL } },
 	[N_CPU] = { { [0] = 1UL } },
+	[N_CPU_MEM] = { { [0] = 1UL } },
 #endif	/* NUMA */
 };
 EXPORT_SYMBOL(node_states);
diff --git a/mm/vmstat.c b/mm/vmstat.c
index a7d4933..d876ac0 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1905,15 +1905,22 @@ static void __init init_cpu_node_state(void)
 	int node;

 	for_each_online_node(node) {
-		if (cpumask_weight(cpumask_of_node(node)) > 0)
+		if (cpumask_weight(cpumask_of_node(node)) > 0) {
 			node_set_state(node, N_CPU);
+			if (node_state(node, N_MEMORY))
+				node_set_state(node, N_CPU_MEM);
+		}
 	}
 }

 static int vmstat_cpu_online(unsigned int cpu)
 {
+	int node = cpu_to_node(cpu);
+
 	refresh_zone_stat_thresholds();
-	node_set_state(cpu_to_node(cpu), N_CPU);
+	node_set_state(node, N_CPU);
+	if (node_state(node, N_MEMORY))
+		node_set_state(node, N_CPU_MEM);
 	return 0;
 }

From patchwork Thu Jun 13 23:29:30 2019
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10993825
From: Yang Shi
To: mhocko@suse.com, mgorman@techsingularity.net, riel@surriel.com,
 hannes@cmpxchg.org, akpm@linux-foundation.org, dave.hansen@intel.com,
 keith.busch@intel.com, dan.j.williams@intel.com, fengguang.wu@intel.com,
 fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [v3 PATCH 2/9] mm: Introduce migrate target nodemask
Date: Fri, 14 Jun 2019 07:29:30 +0800
Message-Id: <1560468577-101178-3-git-send-email-yang.shi@linux.alibaba.com>
In-Reply-To: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>

As more memory types are invented, a system may have a heterogeneous
memory hierarchy, e.g. DRAM and PMEM.
Some of them are cheaper and slower than DRAM, so they may be good
candidates for use as secondary memory that holds data which is not
recently or frequently used.

Introduce the "migrate target" nodemask for such memory nodes. The
migrate target could be any memory type that is cheaper and/or slower
than DRAM. Currently PMEM is one such memory type.

Signed-off-by: Yang Shi
---
 drivers/acpi/numa.c      | 12 ++++++++++++
 drivers/base/node.c      |  2 ++
 include/linux/nodemask.h |  1 +
 mm/page_alloc.c          |  1 +
 4 files changed, 16 insertions(+)

diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
index 3099583..f75adba 100644
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -296,6 +296,18 @@ void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
 		goto out_err_bad_srat;
 	}

+	/*
+	 * The system may have memory hierarchy, some memory may be good
+	 * candidate for migration target, i.e. PMEM is one of them. Mark
+	 * such memory as migration target.
+	 *
+	 * It may be better to retrieve such information from HMAT, but
+	 * SRAT sounds good enough for now. May switch to HMAT in the
+	 * future.
+	 */
+	if (ma->flags & ACPI_SRAT_MEM_NON_VOLATILE)
+		node_set_state(node, N_MIGRATE_TARGET);
+
 	node_set(node, numa_nodes_parsed);

 	pr_info("SRAT: Node %u PXM %u [mem %#010Lx-%#010Lx]%s%s\n",
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 4d80fc8..351b694 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -985,6 +985,7 @@ static ssize_t show_node_state(struct device *dev,
 	[N_MEMORY] = _NODE_ATTR(has_memory, N_MEMORY),
 	[N_CPU] = _NODE_ATTR(has_cpu, N_CPU),
 	[N_CPU_MEM] = _NODE_ATTR(primary, N_CPU_MEM),
+	[N_MIGRATE_TARGET] = _NODE_ATTR(migrate_target, N_MIGRATE_TARGET),
 };

 static struct attribute *node_state_attrs[] = {
@@ -997,6 +998,7 @@ static ssize_t show_node_state(struct device *dev,
 	&node_state_attr[N_MEMORY].attr.attr,
 	&node_state_attr[N_CPU].attr.attr,
 	&node_state_attr[N_CPU_MEM].attr.attr,
+	&node_state_attr[N_MIGRATE_TARGET].attr.attr,
 	NULL
 };

diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 66a8964..411618c 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -400,6 +400,7 @@ enum node_states {
 	N_MEMORY,		/* The node has memory(regular, high, movable) */
 	N_CPU,			/* The node has one or more cpus */
 	N_CPU_MEM,		/* The node has both cpus and memory */
+	N_MIGRATE_TARGET,	/* The node is suitable migrate target */
 	NR_NODE_STATES
 };

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 757db89e..3b37c71 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -125,6 +125,7 @@ struct pcpu_drain {
 	[N_MEMORY] = { { [0] = 1UL } },
 	[N_CPU] = { { [0] = 1UL } },
 	[N_CPU_MEM] = { { [0] = 1UL } },
+	[N_MIGRATE_TARGET] = { { [0] = 1UL } },
 #endif	/* NUMA */
 };
 EXPORT_SYMBOL(node_states);

From patchwork Thu Jun 13 23:29:31 2019
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10993837
From: Yang Shi
To: mhocko@suse.com, mgorman@techsingularity.net, riel@surriel.com,
 hannes@cmpxchg.org, akpm@linux-foundation.org, dave.hansen@intel.com,
 keith.busch@intel.com, dan.j.williams@intel.com, fengguang.wu@intel.com,
 fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [v3 PATCH 3/9] mm: page_alloc: make find_next_best_node return migration target node
Date: Fri, 14 Jun 2019 07:29:31 +0800
Message-Id: <1560468577-101178-4-git-send-email-yang.shi@linux.alibaba.com>
In-Reply-To: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>

We need to find the closest migration target node in order to demote
DRAM pages. Add a "migration" parameter to find_next_best_node() so
that it can skip DRAM nodes on demand.

Signed-off-by: Yang Shi
---
 mm/internal.h   | 11 +++++++++++
 mm/page_alloc.c | 14 ++++++++++----
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 9eeaf2b..a3181e2 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -292,6 +292,17 @@ static inline bool is_data_mapping(vm_flags_t flags)
 	return (flags & (VM_WRITE | VM_SHARED | VM_STACK)) == VM_WRITE;
 }

+#ifdef CONFIG_NUMA
+extern int find_next_best_node(int node, nodemask_t *used_node_mask,
+			       bool migration);
+#else
+static inline int find_next_best_node(int node, nodemask_t *used_node_mask,
+				      bool migration)
+{
+	return 0;
+}
+#endif
+
 /* mm/util.c */
 void __vma_link_list(struct mm_struct *mm, struct vm_area_struct *vma,
 		struct vm_area_struct *prev, struct rb_node *rb_parent);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3b37c71..917f64d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5425,6 +5425,7 @@ int numa_zonelist_order_handler(struct ctl_table *table, int write,
  * find_next_best_node - find the next node that should appear in a given node's fallback list
  * @node: node whose fallback list we're appending
  * @used_node_mask: nodemask_t of already used nodes
+ * @migration: find next best migration target node
  *
  * We use a number of factors to determine which is the next node that should
  * appear on a given node's fallback list. The node should not have appeared
@@ -5436,7 +5437,8 @@ int numa_zonelist_order_handler(struct ctl_table *table, int write,
  *
  * Return: node id of the found node or %NUMA_NO_NODE if no node is found.
  */
-static int find_next_best_node(int node, nodemask_t *used_node_mask)
+int find_next_best_node(int node, nodemask_t *used_node_mask,
+			bool migration)
 {
 	int n, val;
 	int min_val = INT_MAX;
@@ -5444,13 +5446,18 @@ static int find_next_best_node(int node, nodemask_t *used_node_mask)
 	const struct cpumask *tmp = cpumask_of_node(0);

 	/* Use the local node if we haven't already */
-	if (!node_isset(node, *used_node_mask)) {
+	if (!node_isset(node, *used_node_mask) &&
+	    !migration) {
 		node_set(node, *used_node_mask);
 		return node;
 	}

 	for_each_node_state(n, N_MEMORY) {

+		/* Find next best migration target node */
+		if (migration && !node_state(n, N_MIGRATE_TARGET))
+			continue;
+
 		/* Don't want a node to appear more than once */
 		if (node_isset(n, *used_node_mask))
 			continue;
@@ -5482,7 +5489,6 @@ static int find_next_best_node(int node, nodemask_t *used_node_mask)
 	return best_node;
 }

-
 /*
  * Build zonelists ordered by node and zones within node.
  * This results in maximum locality--normal zone overflows into local
@@ -5544,7 +5550,7 @@ static void build_zonelists(pg_data_t *pgdat)
 	nodes_clear(used_mask);

 	memset(node_order, 0, sizeof(node_order));
-	while ((node = find_next_best_node(local_node, &used_mask)) >= 0) {
+	while ((node = find_next_best_node(local_node, &used_mask, false)) >= 0) {
 		/*
 		 * We don't want to pressure a particular node.
 		 * So adding penalty to the first node in same

From patchwork Thu Jun 13 23:29:32 2019
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10993835
From: Yang Shi
To: mhocko@suse.com, mgorman@techsingularity.net, riel@surriel.com, hannes@cmpxchg.org, akpm@linux-foundation.org, dave.hansen@intel.com, keith.busch@intel.com, dan.j.williams@intel.com, fengguang.wu@intel.com, fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com
Cc: yang.shi@linux.alibaba.com,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [v3 PATCH 4/9] mm: migrate: make migrate_pages() return nr_succeeded
Date: Fri, 14 Jun 2019 07:29:32 +0800
Message-Id: <1560468577-101178-5-git-send-email-yang.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>

migrate_pages() returns the number of pages that were not migrated, or
an error code. When it returns an error code, the caller has no way to
know how many pages were migrated and how many were not.

In the following patch, migrate_pages() is used to demote pages to a
PMEM node. We need to account for how many pages are reclaimed
(demoted), since page reclaim behavior depends on this count. Add a
*nr_succeeded parameter so that migrate_pages() reports how many pages
were migrated successfully in all cases.
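The calling convention described above can be sketched in plain user-space C.
This is only an illustration under stated assumptions: fake_migrate_pages(),
its movable[] array and its fail_at argument are hypothetical stand-ins and
are not part of the kernel. The point it shows is that the return value keeps
its old meaning (failure count or negative errno) while the success count
arrives through an out-parameter on every path, including the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for migrate_pages(), not the kernel function.
 * movable[i] != 0 means item i can be migrated. If fail_at >= 0, the
 * function aborts with -ENOMEM when it reaches that index.
 *
 * Returns the number of items that could not be migrated, or a negative
 * errno. In both cases *nr_succeeded has been incremented once per item
 * migrated before the function returned.
 */
static int fake_migrate_pages(const int *movable, int npages, int fail_at,
			      unsigned int *nr_succeeded)
{
	int nr_failed = 0;
	int i;

	/* The counter is mandatory in this sketch, as in the patch. */
	assert(nr_succeeded != NULL);

	for (i = 0; i < npages; i++) {
		if (i == fail_at)
			return -ENOMEM;	/* *nr_succeeded is still meaningful */
		if (movable[i])
			(*nr_succeeded)++;
		else
			nr_failed++;
	}
	return nr_failed;
}
```

As in the patch, the callee only increments the counter, so the caller is
expected to initialize it to 0 before the call. A caller that cares about the
count can then add it to its own total whether or not the call reported an
error.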
Signed-off-by: Yang Shi
---
 include/linux/migrate.h |  5 +++--
 mm/compaction.c         |  3 ++-
 mm/gup.c                |  4 +++-
 mm/memory-failure.c     |  7 +++++--
 mm/memory_hotplug.c     |  4 +++-
 mm/mempolicy.c          |  7 +++++--
 mm/migrate.c            | 18 ++++++++++--------
 mm/page_alloc.c         |  4 +++-
 8 files changed, 34 insertions(+), 18 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index e13d9bf..837fdd1 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -66,7 +66,8 @@ extern int migrate_page(struct address_space *mapping,
 			struct page *newpage, struct page *page,
 			enum migrate_mode mode);
 extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
-		unsigned long private, enum migrate_mode mode, int reason);
+		unsigned long private, enum migrate_mode mode, int reason,
+		unsigned int *nr_succeeded);
 extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
 extern void putback_movable_page(struct page *page);
 
@@ -84,7 +85,7 @@ extern int migrate_page_move_mapping(struct address_space *mapping,
 static inline void putback_movable_pages(struct list_head *l) {}
 static inline int migrate_pages(struct list_head *l, new_page_t new,
 		free_page_t free, unsigned long private, enum migrate_mode mode,
-		int reason)
+		int reason, unsigned int *nr_succeeded)
 	{ return -ENOSYS; }
 static inline int isolate_movable_page(struct page *page, isolate_mode_t mode)
 	{ return -EBUSY; }
diff --git a/mm/compaction.c b/mm/compaction.c
index 9febc8c..c1723e5 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2074,6 +2074,7 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
 	unsigned long last_migrated_pfn;
 	const bool sync = cc->mode != MIGRATE_ASYNC;
 	bool update_cached;
+	unsigned int nr_succeeded = 0;
 
 	cc->migratetype = gfpflags_to_migratetype(cc->gfp_mask);
 	ret = compaction_suitable(cc->zone, cc->order, cc->alloc_flags,
@@ -2182,7 +2183,7 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
 
 		err = migrate_pages(&cc->migratepages, compaction_alloc,
 				compaction_free, (unsigned long)cc, cc->mode,
-				MR_COMPACTION);
+				MR_COMPACTION, &nr_succeeded);
 
 		trace_mm_compaction_migratepages(cc->nr_migratepages, err,
 							&cc->migratepages);
diff --git a/mm/gup.c b/mm/gup.c
index 2c08248..446ce25 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1337,6 +1337,7 @@ static long check_and_migrate_cma_pages(struct task_struct *tsk,
 	long i;
 	bool drain_allow = true;
 	bool migrate_allow = true;
+	unsigned int nr_succeeded = 0;
 	LIST_HEAD(cma_page_list);
 
 check_again:
@@ -1377,7 +1378,8 @@ static long check_and_migrate_cma_pages(struct task_struct *tsk,
 				put_page(pages[i]);
 
 		if (migrate_pages(&cma_page_list, new_non_cma_page,
-				  NULL, 0, MIGRATE_SYNC, MR_CONTIG_RANGE)) {
+				  NULL, 0, MIGRATE_SYNC, MR_CONTIG_RANGE,
+				  &nr_succeeded)) {
 			/*
 			 * some of the pages failed migration. Do get_user_pages
 			 * without migration.
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index fc8b517..b5d8a8f 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1686,6 +1686,7 @@ static int soft_offline_huge_page(struct page *page, int flags)
 	int ret;
 	unsigned long pfn = page_to_pfn(page);
 	struct page *hpage = compound_head(page);
+	unsigned int nr_succeeded = 0;
 	LIST_HEAD(pagelist);
 
 	/*
@@ -1713,7 +1714,7 @@ static int soft_offline_huge_page(struct page *page, int flags)
 	}
 
 	ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
-				MIGRATE_SYNC, MR_MEMORY_FAILURE);
+				MIGRATE_SYNC, MR_MEMORY_FAILURE, &nr_succeeded);
 	if (ret) {
 		pr_info("soft offline: %#lx: hugepage migration failed %d, type %lx (%pGp)\n",
 			pfn, ret, page->flags, &page->flags);
@@ -1742,6 +1743,7 @@ static int __soft_offline_page(struct page *page, int flags)
 {
 	int ret;
 	unsigned long pfn = page_to_pfn(page);
+	unsigned int nr_succeeded = 0;
 
 	/*
 	 * Check PageHWPoison again inside page lock because PageHWPoison
@@ -1801,7 +1803,8 @@ static int __soft_offline_page(struct page *page, int flags)
 					page_is_file_cache(page));
 		list_add(&page->lru, &pagelist);
 		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
-					MIGRATE_SYNC, MR_MEMORY_FAILURE);
+					MIGRATE_SYNC, MR_MEMORY_FAILURE,
+					&nr_succeeded);
 		if (ret) {
 			if (!list_empty(&pagelist))
 				putback_movable_pages(&pagelist);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7c29282..1192d08 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1360,6 +1360,7 @@ static struct page *new_node_page(struct page *page, unsigned long private)
 	unsigned long pfn;
 	struct page *page;
 	int ret = 0;
+	unsigned int nr_succeeded = 0;
 	LIST_HEAD(source);
 
 	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
@@ -1416,7 +1417,8 @@ static struct page *new_node_page(struct page *page, unsigned long private)
 	if (!list_empty(&source)) {
 		/* Allocate a new page from the nearest neighbor node */
 		ret = migrate_pages(&source, new_node_page, NULL, 0,
-					MIGRATE_SYNC, MR_MEMORY_HOTPLUG);
+					MIGRATE_SYNC, MR_MEMORY_HOTPLUG,
+					&nr_succeeded);
 		if (ret) {
 			list_for_each_entry(page, &source, lru) {
 				pr_warn("migrating pfn %lx failed ret:%d ",
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 2219e74..b7bc60b 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -988,6 +988,7 @@ static int migrate_to_node(struct mm_struct *mm, int source, int dest,
 	nodemask_t nmask;
 	LIST_HEAD(pagelist);
 	int err = 0;
+	unsigned int nr_succeeded = 0;
 
 	nodes_clear(nmask);
 	node_set(source, nmask);
@@ -1003,7 +1004,7 @@ static int migrate_to_node(struct mm_struct *mm, int source, int dest,
 
 	if (!list_empty(&pagelist)) {
 		err = migrate_pages(&pagelist, alloc_new_node_page, NULL, dest,
-					MIGRATE_SYNC, MR_SYSCALL);
+					MIGRATE_SYNC, MR_SYSCALL, &nr_succeeded);
 		if (err)
 			putback_movable_pages(&pagelist);
 	}
@@ -1182,6 +1183,7 @@ static long do_mbind(unsigned long start, unsigned long len,
 	struct mempolicy *new;
 	unsigned long end;
 	int err;
+	unsigned int nr_succeeded = 0;
 	LIST_HEAD(pagelist);
 
 	if (flags & ~(unsigned long)MPOL_MF_VALID)
@@ -1254,7 +1256,8 @@ static long do_mbind(unsigned long start, unsigned long len,
 		if (!list_empty(&pagelist)) {
 			WARN_ON_ONCE(flags & MPOL_MF_LAZY);
 			nr_failed = migrate_pages(&pagelist, new_page, NULL,
-				start, MIGRATE_SYNC, MR_MEMPOLICY_MBIND);
+				start, MIGRATE_SYNC, MR_MEMPOLICY_MBIND,
+				&nr_succeeded);
 			if (nr_failed)
 				putback_movable_pages(&pagelist);
 		}
diff --git a/mm/migrate.c b/mm/migrate.c
index f2ecc28..bc4242a 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1392,6 +1392,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
  * @mode:		The migration mode that specifies the constraints for
  *			page migration, if any.
  * @reason:		The reason for page migration.
+ * @nr_succeeded:	The number of pages migrated successfully.
  *
  * The function returns after 10 attempts or if no pages are movable any more
  * because the list has become empty or no retryable pages exist any more.
@@ -1402,11 +1403,10 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
  */
 int migrate_pages(struct list_head *from, new_page_t get_new_page,
 		free_page_t put_new_page, unsigned long private,
-		enum migrate_mode mode, int reason)
+		enum migrate_mode mode, int reason, unsigned int *nr_succeeded)
 {
 	int retry = 1;
 	int nr_failed = 0;
-	int nr_succeeded = 0;
 	int pass = 0;
 	struct page *page;
 	struct page *page2;
@@ -1460,7 +1460,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				retry++;
 				break;
 			case MIGRATEPAGE_SUCCESS:
-				nr_succeeded++;
+				(*nr_succeeded)++;
 				break;
 			default:
 				/*
@@ -1477,11 +1477,11 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	nr_failed += retry;
 	rc = nr_failed;
 out:
-	if (nr_succeeded)
-		count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
+	if (*nr_succeeded)
+		count_vm_events(PGMIGRATE_SUCCESS, *nr_succeeded);
 	if (nr_failed)
 		count_vm_events(PGMIGRATE_FAIL, nr_failed);
-	trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason);
+	trace_mm_migrate_pages(*nr_succeeded, nr_failed, mode, reason);
 
 	if (!swapwrite)
 		current->flags &= ~PF_SWAPWRITE;
@@ -1506,12 +1506,13 @@ static int do_move_pages_to_node(struct mm_struct *mm,
 		struct list_head *pagelist, int node)
 {
 	int err;
+	unsigned int nr_succeeded = 0;
 
 	if (list_empty(pagelist))
 		return 0;
 
 	err = migrate_pages(pagelist, alloc_new_node_page, NULL, node,
-			MIGRATE_SYNC, MR_SYSCALL);
+			MIGRATE_SYNC, MR_SYSCALL, &nr_succeeded);
 	if (err)
 		putback_movable_pages(pagelist);
 	return err;
@@ -1944,6 +1945,7 @@ int migrate_misplaced_page(struct page *page, struct vm_area_struct *vma,
 	pg_data_t *pgdat = NODE_DATA(node);
 	int isolated;
 	int nr_remaining;
+	unsigned int nr_succeeded = 0;
 	LIST_HEAD(migratepages);
 
 	/*
@@ -1968,7 +1970,7 @@ int migrate_misplaced_page(struct page *page, struct vm_area_struct *vma,
 	list_add(&page->lru, &migratepages);
 	nr_remaining = migrate_pages(&migratepages, alloc_misplaced_dst_page,
 				     NULL, node, MIGRATE_ASYNC,
-				     MR_NUMA_MISPLACED);
+				     MR_NUMA_MISPLACED, &nr_succeeded);
 	if (nr_remaining) {
 		if (!list_empty(&migratepages)) {
 			list_del(&page->lru);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 917f64d..7e95a66 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8209,6 +8209,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 	unsigned long pfn = start;
 	unsigned int tries = 0;
 	int ret = 0;
+	unsigned int nr_succeeded = 0;
 
 	migrate_prep();
 
@@ -8236,7 +8237,8 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 		cc->nr_migratepages -= nr_reclaimed;
 
 		ret = migrate_pages(&cc->migratepages, alloc_migrate_target,
-				    NULL, 0, cc->mode, MR_CONTIG_RANGE);
+				    NULL, 0, cc->mode, MR_CONTIG_RANGE,
+				    &nr_succeeded);
 	}
 	if (ret < 0) {
 		putback_movable_pages(&cc->migratepages);

From patchwork Thu Jun 13 23:29:33 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10993823
From: Yang Shi
To: mhocko@suse.com, mgorman@techsingularity.net, riel@surriel.com, hannes@cmpxchg.org, akpm@linux-foundation.org, dave.hansen@intel.com, keith.busch@intel.com, dan.j.williams@intel.com, fengguang.wu@intel.com, fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [v3 PATCH 5/9] mm: vmscan: demote anon DRAM pages to migration target node
Date: Fri, 14 Jun 2019 07:29:33 +0800
Message-Id: <1560468577-101178-6-git-send-email-yang.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To:
<1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>

A migration target node (i.e. PMEM) typically provides larger capacity
than DRAM and much lower access latency than disk, so it is a good
choice as a middle tier between DRAM and disk in the page reclaim path.

With migration target nodes, the demotion path of anonymous pages could
be:

DRAM -> PMEM -> swap device

This patch demotes only anonymous pages for the time being, and demotes
a THP to the migration target node as a whole. To avoid expensive page
reclaim and/or compaction on the target node when it is under memory
pressure, the most conservative gfp flags are used: the allocation
fails quickly under memory pressure and just wakes up kswapd on
failure. migrate_pages() splits the THP and migrates it one base page
at a time if the THP allocation fails.

Demote pages to the closest migration target node even if the system is
swapless. The current page reclaim logic scans the anon LRU only when
swap is on and swappiness is set properly. Demoting to the migration
target does not need to care whether swap is available. However,
reclaiming from the migration target node still skips the anon LRU if
swap is not available.

Demotion only happens from a DRAM node to its closest migration target
node. Demoting to a remote migration target node, or migrating from the
target node to DRAM on the reclaim path, is not allowed.

Also define a new migration reason for demotion, called MR_DEMOTE.
Pages are demoted via async migration to avoid blocking.

Migration is only allowed via node reclaim. Introduce a new node
reclaim mode: migrate mode. The migrate mode is not compatible with
cpuset and mempolicy settings.
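The tiering decision described above can be sketched as a small user-space
model. It is a simplified illustration under stated assumptions, not the
kernel logic: struct node_info, enum anon_action, demote_ok() and
pick_anon_action() are hypothetical names, and the real code path in
shrink_page_list() checks considerably more state. Only the RECLAIM_* bit
values are taken from this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values follow the zone_reclaim_mode table in this patch. */
#define RECLAIM_ZONE	(1 << 0)
#define RECLAIM_WRITE	(1 << 1)
#define RECLAIM_UNMAP	(1 << 2)
#define RECLAIM_MIGRATE	(1 << 3)

/* Hypothetical, simplified description of a NUMA node. */
struct node_info {
	bool has_cpu;	/* true for a DRAM node with CPUs, false for PMEM */
};

/* Hypothetical outcome for a cold anonymous page. */
enum anon_action { ANON_KEEP, ANON_DEMOTE, ANON_SWAP };

/*
 * Models the three conditions the patch checks before demoting: migrate
 * mode is enabled, the current node has CPUs, and a migration target node
 * is online.
 */
static bool demote_ok(unsigned int reclaim_mode, const struct node_info *node,
		      bool target_online)
{
	if (!(reclaim_mode & RECLAIM_MIGRATE))
		return false;
	if (!node->has_cpu)
		return false;
	return target_online;
}

/* DRAM -> PMEM -> swap device: demotion is tried first, then swap. */
static enum anon_action pick_anon_action(unsigned int reclaim_mode,
					 const struct node_info *node,
					 bool target_online,
					 bool swap_available)
{
	if (demote_ok(reclaim_mode, node, target_online))
		return ANON_DEMOTE;
	if (swap_available)
		return ANON_SWAP;
	return ANON_KEEP;
}
```

In this model a swapless DRAM node still demotes, while a PMEM node without
swap leaves its anonymous pages alone, which matches the behavior the
description above asks for.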
Signed-off-by: Yang Shi
---
 Documentation/sysctl/vm.txt    |   6 ++
 include/linux/gfp.h            |  12 ++++
 include/linux/migrate.h        |   1 +
 include/trace/events/migrate.h |   3 +-
 mm/debug.c                     |   1 +
 mm/internal.h                  |  12 ++++
 mm/migrate.c                   |  15 +++-
 mm/vmscan.c                    | 157 +++++++++++++++++++++++++++++++++--------
 8 files changed, 175 insertions(+), 32 deletions(-)

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 7493220..4b76a55 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -919,6 +919,7 @@ This is value ORed together of
 1	= Zone reclaim on
 2	= Zone reclaim writes dirty pages out
 4	= Zone reclaim swaps pages
+8	= Zone reclaim migrate pages
 
 zone_reclaim_mode is disabled by default. For file servers or workloads
 that benefit from having their data cached, zone_reclaim_mode should be
@@ -943,4 +944,9 @@ Allowing regular swap effectively restricts allocations to the local
 node unless explicitly overridden by memory policies or cpuset
 configurations.
 
+Allowing zone reclaim to migrate pages to the migration target nodes, which
+are typically cheaper and slower than DRAM, but have larger capacity, i.e.
+NVDIMM nodes, if such nodes are present in the system. The migrate mode
+is not compatible with cpuset and mempolicy settings.
+
 ============ End of Document =================================
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index fb07b50..b294455 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -285,6 +285,14 @@
  * available and will not wake kswapd/kcompactd on failure. The _LIGHT
  * version does not attempt reclaim/compaction at all and is by default used
  * in page fault path, while the non-light is used by khugepaged.
+ *
+ * %GFP_DEMOTE is for migration on memory reclaim (a.k.a demotion) allocations.
+ * The allocation might happen in kswapd or direct reclaim, so assuming
+ * __GFP_IO and __GFP_FS are not allowed looks safer. Demotion happens for
+ * user pages (on LRU) only and on specific node. Generally it will fail
+ * quickly if memory is not available, but may wake up kswapd on failure.
+ *
+ * %GFP_TRANSHUGE_DEMOTE is used for THP demotion allocation.
  */
 #define GFP_ATOMIC	(__GFP_HIGH|__GFP_ATOMIC|__GFP_KSWAPD_RECLAIM)
 #define GFP_KERNEL	(__GFP_RECLAIM | __GFP_IO | __GFP_FS)
@@ -300,6 +308,10 @@
 #define GFP_TRANSHUGE_LIGHT	((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
 			 __GFP_NOMEMALLOC | __GFP_NOWARN) & ~__GFP_RECLAIM)
 #define GFP_TRANSHUGE	(GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM)
+#define GFP_DEMOTE	(__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_NORETRY | \
+			__GFP_NOMEMALLOC | __GFP_NOWARN | __GFP_THISNODE | \
+			GFP_NOWAIT)
+#define GFP_TRANSHUGE_DEMOTE	(GFP_DEMOTE | __GFP_COMP)
 
 /* Convert GFP flags to their corresponding migrate type */
 #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 837fdd1..cfb1f57 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -25,6 +25,7 @@ enum migrate_reason {
 	MR_MEMPOLICY_MBIND,
 	MR_NUMA_MISPLACED,
 	MR_CONTIG_RANGE,
+	MR_DEMOTE,
 	MR_TYPES
 };
 
diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h
index 705b33d..c1d5b36 100644
--- a/include/trace/events/migrate.h
+++ b/include/trace/events/migrate.h
@@ -20,7 +20,8 @@
 	EM( MR_SYSCALL,		"syscall_or_cpuset")		\
 	EM( MR_MEMPOLICY_MBIND,	"mempolicy_mbind")		\
 	EM( MR_NUMA_MISPLACED,	"numa_misplaced")		\
-	EMe(MR_CONTIG_RANGE,	"contig_range")
+	EM( MR_CONTIG_RANGE,	"contig_range")			\
+	EMe(MR_DEMOTE,		"demote")
 
 /*
  * First define the enums in the above macros to be exported to userspace
diff --git a/mm/debug.c b/mm/debug.c
index 8345bb6..0bcced8 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -25,6 +25,7 @@
 	"mempolicy_mbind",
 	"numa_misplaced",
 	"cma",
+	"demote",
 };
 
 const struct trace_print_flags pageflag_names[] = {
diff --git a/mm/internal.h b/mm/internal.h
index a3181e2..3d756f2 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -303,6 +303,18 @@ static inline int find_next_best_node(int node, nodemask_t *used_node_mask,
 }
 #endif
 
+static inline bool has_migration_target_node_online(void)
+{
+	int nid;
+
+	for_each_online_node(nid) {
+		if (node_state(nid, N_MIGRATE_TARGET))
+			return true;
+	}
+
+	return false;
+}
+
 /* mm/util.c */
 void __vma_link_list(struct mm_struct *mm, struct vm_area_struct *vma,
 		struct vm_area_struct *prev, struct rb_node *rb_parent);
diff --git a/mm/migrate.c b/mm/migrate.c
index bc4242a..9fb76a6 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1006,7 +1006,8 @@ static int move_to_new_page(struct page *newpage, struct page *page,
 }
 
 static int __unmap_and_move(struct page *page, struct page *newpage,
-				int force, enum migrate_mode mode)
+				int force, enum migrate_mode mode,
+				enum migrate_reason reason)
 {
 	int rc = -EAGAIN;
 	int page_was_mapped = 0;
@@ -1143,8 +1144,16 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	if (rc == MIGRATEPAGE_SUCCESS) {
 		if (unlikely(!is_lru))
 			put_page(newpage);
-		else
+		else {
+			/*
+			 * Put demoted pages on the target node's
+			 * active LRU.
+			 */
+			if (!PageUnevictable(newpage) &&
+			    reason == MR_DEMOTE)
+				SetPageActive(newpage);
 			putback_lru_page(newpage);
+		}
 	}
 
 	return rc;
@@ -1198,7 +1207,7 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 		goto out;
 	}
 
-	rc = __unmap_and_move(page, newpage, force, mode);
+	rc = __unmap_and_move(page, newpage, force, mode, reason);
 	if (rc == MIGRATEPAGE_SUCCESS)
 		set_page_owner_migrate_reason(newpage, reason);
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7acd0af..428a83b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1094,6 +1094,55 @@ static void page_check_dirty_writeback(struct page *page,
 		mapping->a_ops->is_dirty_writeback(page, dirty, writeback);
 }
 
+#ifdef CONFIG_NUMA
+#define RECLAIM_OFF 0
+#define RECLAIM_ZONE (1<<0)	/* Run shrink_inactive_list on the zone */
+#define RECLAIM_WRITE (1<<1)	/* Writeout pages during reclaim */
+#define RECLAIM_UNMAP (1<<2)	/* Unmap pages during reclaim */
+#define RECLAIM_MIGRATE (1<<3)	/* Migrate pages to migration target
+				 * node during reclaim */
+static struct page *alloc_demote_page(struct page *page, unsigned long node)
+{
+	if (unlikely(PageHuge(page)))
+		/* HugeTLB demotion is not supported for now */
+		BUG();
+	else if (PageTransHuge(page)) {
+		struct page *thp;
+
+		thp = alloc_pages_node(node, GFP_TRANSHUGE_DEMOTE,
+				       HPAGE_PMD_ORDER);
+		if (!thp)
+			return NULL;
+		prep_transhuge_page(thp);
+		return thp;
+	} else
+		return __alloc_pages_node(node, GFP_DEMOTE, 0);
+}
+#else
+static inline struct page *alloc_demote_page(struct page *page,
+					     unsigned long node)
+{
+	return NULL;
+}
+#endif
+
+static inline bool is_demote_ok(int nid)
+{
+	/* Just do demotion with migrate mode of node reclaim */
+	if (!(node_reclaim_mode & RECLAIM_MIGRATE))
+		return false;
+
+	/* Current node is cpuless node */
+	if (!node_state(nid, N_CPU_MEM))
+		return false;
+
+	/* No online migration target node */
+	if (!has_migration_target_node_online())
+		return false;
+
+	return true;
+}
+
 /*
  * shrink_page_list() returns the number of reclaimed pages
  */
@@ -1106,6 +1155,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 {
 	LIST_HEAD(ret_pages);
 	LIST_HEAD(free_pages);
+	LIST_HEAD(demote_pages);
 	unsigned nr_reclaimed = 0;
 	unsigned pgactivate = 0;
 
@@ -1269,6 +1319,18 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		 */
 		if (PageAnon(page) && PageSwapBacked(page)) {
 			if (!PageSwapCache(page)) {
+				/*
+				 * Demote anonymous pages only for now and
+				 * skip MADV_FREE pages.
+				 *
+				 * Demotion only happen from primary nodes
+				 * to cpuless nodes.
+				 */
+				if (is_demote_ok(page_to_nid(page))) {
+					list_add(&page->lru, &demote_pages);
+					unlock_page(page);
+					continue;
+				}
 				if (!(sc->gfp_mask & __GFP_IO))
 					goto keep_locked;
 				if (PageTransHuge(page)) {
@@ -1480,6 +1542,30 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page), page);
 	}
 
+	/* Demote pages to migration target */
+	if (!list_empty(&demote_pages)) {
+		int err, target_nid;
+		unsigned int nr_succeeded = 0;
+		nodemask_t used_mask;
+
+		nodes_clear(used_mask);
+		target_nid = find_next_best_node(pgdat->node_id, &used_mask,
+						 true);
+
+		/* Demotion would ignore all cpuset and mempolicy settings */
+		err = migrate_pages(&demote_pages, alloc_demote_page, NULL,
+				    target_nid, MIGRATE_ASYNC, MR_DEMOTE,
+				    &nr_succeeded);
+
+		nr_reclaimed += nr_succeeded;
+
+		if (err) {
+			putback_movable_pages(&demote_pages);
+
+			list_splice(&ret_pages, &demote_pages);
+		}
+	}
+
 	mem_cgroup_uncharge_list(&free_pages);
 	try_to_unmap_flush();
 	free_unref_page_list(&free_pages);
@@ -2136,10 +2222,11 @@ static bool inactive_list_is_low(struct lruvec *lruvec, bool file,
 	unsigned long gb;
 
 	/*
-	 * If we don't have swap space, anonymous page deactivation
-	 * is pointless.
+	 * If we don't have swap space or migtation target node online,
+	 * anonymous page deactivation is pointless.
 	 */
-	if (!file && !total_swap_pages)
+	if (!file && !total_swap_pages &&
+	    !is_demote_ok(pgdat->node_id))
 		return false;
 
 	inactive = lruvec_lru_size(lruvec, inactive_lru, sc->reclaim_idx);
@@ -2213,22 +2300,34 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 	unsigned long ap, fp;
 	enum lru_list lru;
 
-	/* If we have no swap space, do not bother scanning anon pages. */
-	if (!sc->may_swap || mem_cgroup_get_nr_swap_pages(memcg) <= 0) {
-		scan_balance = SCAN_FILE;
-		goto out;
-	}
-
 	/*
-	 * Global reclaim will swap to prevent OOM even with no
-	 * swappiness, but memcg users want to use this knob to
-	 * disable swapping for individual groups completely when
-	 * using the memory controller's swap limit feature would be
-	 * too expensive.
+	 * Anon pages can be demoted to PMEM. If there is PMEM node online,
+	 * still scan anonymous LRU even though the systme is swapless or
+	 * swapping is disabled by memcg.
+	 *
+	 * If current node is already PMEM node, demotion is not applicable.
 	 */
-	if (!global_reclaim(sc) && !swappiness) {
-		scan_balance = SCAN_FILE;
-		goto out;
+	if (!is_demote_ok(pgdat->node_id)) {
+		/*
+		 * If we have no swap space, do not bother scanning
+		 * anon pages.
+		 */
+		if (!sc->may_swap || mem_cgroup_get_nr_swap_pages(memcg) <= 0) {
+			scan_balance = SCAN_FILE;
+			goto out;
+		}
+
+		/*
+		 * Global reclaim will swap to prevent OOM even with no
+		 * swappiness, but memcg users want to use this knob to
+		 * disable swapping for individual groups completely when
+		 * using the memory controller's swap limit feature would be
+		 * too expensive.
+		 */
+		if (!global_reclaim(sc) && !swappiness) {
+			scan_balance = SCAN_FILE;
+			goto out;
+		}
 	}
 
 	/*
@@ -2577,7 +2676,7 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
 	 */
 	pages_for_compaction = compact_gap(sc->order);
 	inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
-	if (get_nr_swap_pages() > 0)
+	if (get_nr_swap_pages() > 0 || is_demote_ok(pgdat->node_id))
 		inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);
 	if (sc->nr_reclaimed < pages_for_compaction &&
 			inactive_lru_pages > pages_for_compaction)
@@ -3262,7 +3361,8 @@ static void age_active_anon(struct pglist_data *pgdat,
 {
 	struct mem_cgroup *memcg;
 
-	if (!total_swap_pages)
+	/* Aging anon page as long as demotion is fine */
+	if (!total_swap_pages && !is_demote_ok(pgdat->node_id))
 		return;
 
 	memcg = mem_cgroup_iter(NULL, NULL, NULL);
@@ -4003,11 +4103,6 @@ static int __init kswapd_init(void)
  */
 int node_reclaim_mode __read_mostly;
 
-#define RECLAIM_OFF 0
-#define RECLAIM_ZONE (1<<0)	/* Run shrink_inactive_list on the zone */
-#define RECLAIM_WRITE (1<<1)	/* Writeout pages during reclaim */
-#define RECLAIM_UNMAP (1<<2)	/* Unmap pages during reclaim */
-
 /*
  * Priority for NODE_RECLAIM. This determines the fraction of pages
  * of a node considered for each zone_reclaim. 4 scans 1/16th of
@@ -4084,8 +4179,10 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 		.gfp_mask = current_gfp_context(gfp_mask),
 		.order = order,
 		.priority = NODE_RECLAIM_PRIORITY,
-		.may_writepage = !!(node_reclaim_mode & RECLAIM_WRITE),
-		.may_unmap = !!(node_reclaim_mode & RECLAIM_UNMAP),
+		.may_writepage = !!((node_reclaim_mode & RECLAIM_WRITE) ||
+				    (node_reclaim_mode & RECLAIM_MIGRATE)),
+		.may_unmap = !!((node_reclaim_mode & RECLAIM_UNMAP) ||
+				(node_reclaim_mode & RECLAIM_MIGRATE)),
 		.may_swap = 1,
 		.reclaim_idx = gfp_zone(gfp_mask),
 	};
@@ -4105,7 +4202,8 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 	reclaim_state.reclaimed_slab = 0;
 	p->reclaim_state = &reclaim_state;
 
-	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages) {
+	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages ||
+	    (node_reclaim_mode & RECLAIM_MIGRATE)) {
 		/*
 		 * Free memory by calling shrink node with increasing
 		 * priorities until we have enough memory freed.
@@ -4138,9 +4236,12 @@ int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
 	 * thrown out if the node is overallocated. So we do not reclaim
 	 * if less than a specified percentage of the node is used by
 	 * unmapped file backed pages.
+	 *
+	 * Migrate mode doesn't care the above restrictions.
 	 */
 	if (node_pagecache_reclaimable(pgdat) <= pgdat->min_unmapped_pages &&
-	    node_page_state(pgdat, NR_SLAB_RECLAIMABLE) <= pgdat->min_slab_pages)
+	    node_page_state(pgdat, NR_SLAB_RECLAIMABLE) <= pgdat->min_slab_pages &&
+	    !(node_reclaim_mode & RECLAIM_MIGRATE))
 		return NODE_RECLAIM_FULL;
 
 	/*

From patchwork Thu Jun 13 23:29:34 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10993821
Received: from mail-pg1-f198.google.com
From: Yang Shi
To: mhocko@suse.com, mgorman@techsingularity.net, riel@surriel.com, hannes@cmpxchg.org, akpm@linux-foundation.org, dave.hansen@intel.com, keith.busch@intel.com, dan.j.williams@intel.com, fengguang.wu@intel.com, fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [v3 PATCH 6/9] mm: vmscan: don't demote for memcg reclaim
Date: Fri, 14 Jun 2019 07:29:34 +0800
Message-Id: <1560468577-101178-7-git-send-email-yang.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>

Memcg reclaim happens when the memcg limit is breached, but demotion
just migrates pages to another node instead of reclaiming them. Demotion
is therefore pointless for memcg reclaim, since the memcg's usage is not
reduced at all.

Signed-off-by: Yang Shi
---
 mm/vmscan.c | 38 +++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 428a83b..fb931ded 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1126,12 +1126,16 @@ static inline struct page *alloc_demote_page(struct page *page,
 }
 #endif
 
-static inline bool is_demote_ok(int nid)
+static inline bool is_demote_ok(int nid, struct scan_control *sc)
 {
 	/* Just do demotion with migrate mode of node reclaim */
 	if (!(node_reclaim_mode & RECLAIM_MIGRATE))
 		return false;
 
+	/* It is pointless to do demotion in memcg reclaim */
+	if (!global_reclaim(sc))
+		return false;
+
 	/* Current node is cpuless node */
 	if (!node_state(nid, N_CPU_MEM))
 		return false;
@@ -1326,7 +1330,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		 * Demotion only happen from primary nodes
 		 * to cpuless nodes.
 		 */
-		if (is_demote_ok(page_to_nid(page))) {
+		if (is_demote_ok(page_to_nid(page), sc)) {
 			list_add(&page->lru, &demote_pages);
 			unlock_page(page);
 			continue;
@@ -2226,7 +2230,7 @@ static bool inactive_list_is_low(struct lruvec *lruvec, bool file,
 	 * anonymous page deactivation is pointless.
 	 */
 	if (!file && !total_swap_pages &&
-	    !is_demote_ok(pgdat->node_id))
+	    !is_demote_ok(pgdat->node_id, sc))
 		return false;
 
 	inactive = lruvec_lru_size(lruvec, inactive_lru, sc->reclaim_idx);
@@ -2307,7 +2311,7 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 	 *
 	 * If current node is already PMEM node, demotion is not applicable.
 	 */
-	if (!is_demote_ok(pgdat->node_id)) {
+	if (!is_demote_ok(pgdat->node_id, sc)) {
 		/*
 		 * If we have no swap space, do not bother scanning
 		 * anon pages.
@@ -2316,18 +2320,18 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 			scan_balance = SCAN_FILE;
 			goto out;
 		}
+	}
 
-		/*
-		 * Global reclaim will swap to prevent OOM even with no
-		 * swappiness, but memcg users want to use this knob to
-		 * disable swapping for individual groups completely when
-		 * using the memory controller's swap limit feature would be
-		 * too expensive.
-		 */
-		if (!global_reclaim(sc) && !swappiness) {
-			scan_balance = SCAN_FILE;
-			goto out;
-		}
+	/*
+	 * Global reclaim will swap to prevent OOM even with no
+	 * swappiness, but memcg users want to use this knob to
+	 * disable swapping for individual groups completely when
+	 * using the memory controller's swap limit feature would be
+	 * too expensive.
+	 */
+	if (!global_reclaim(sc) && !swappiness) {
+		scan_balance = SCAN_FILE;
+		goto out;
 	}
 
 	/*
@@ -2676,7 +2680,7 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
 	 */
 	pages_for_compaction = compact_gap(sc->order);
 	inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
-	if (get_nr_swap_pages() > 0 || is_demote_ok(pgdat->node_id))
+	if (get_nr_swap_pages() > 0 || is_demote_ok(pgdat->node_id, sc))
 		inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);
 	if (sc->nr_reclaimed < pages_for_compaction &&
 			inactive_lru_pages > pages_for_compaction)
@@ -3362,7 +3366,7 @@ static void age_active_anon(struct pglist_data *pgdat,
 	struct mem_cgroup *memcg;
 
 	/* Aging anon page as long as demotion is fine */
-	if (!total_swap_pages && !is_demote_ok(pgdat->node_id))
+	if (!total_swap_pages && !is_demote_ok(pgdat->node_id, sc))
 		return;
 
 	memcg = mem_cgroup_iter(NULL, NULL, NULL);

From patchwork Thu Jun 13 23:29:35 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10993839
From: Yang Shi
To: mhocko@suse.com, mgorman@techsingularity.net, riel@surriel.com, hannes@cmpxchg.org, akpm@linux-foundation.org, dave.hansen@intel.com, keith.busch@intel.com, dan.j.williams@intel.com, fengguang.wu@intel.com, fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [v3 PATCH 7/9] mm: vmscan: check if the demote target node is contended or not
Date: Fri, 14 Jun 2019 07:29:35 +0800
Message-Id: <1560468577-101178-8-git-send-email-yang.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>

When demoting to the migration target node, the target node may be under
memory pressure, which may cause migrate_pages() to fail.

If the failure is caused by memory pressure (i.e. migrate_pages() returns
-ENOMEM), tag the node with PGDAT_CONTENDED. The tag is cleared once the
target node is balanced again.

Check whether the target node is PGDAT_CONTENDED; if it is, skip
demotion.

Signed-off-by: Yang Shi
---
 include/linux/mmzone.h |  3 +++
 mm/vmscan.c            | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 70394ca..d4e05c5 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -573,6 +573,9 @@ enum pgdat_flags {
 					 * many pages under writeback
 					 */
 	PGDAT_RECLAIM_LOCKED,		/* prevents concurrent reclaim */
+	PGDAT_CONTENDED,		/* the node has not enough free memory
+					 * available
+					 */
 };
 
 enum zone_flags {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index fb931ded..9ec55d7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1126,6 +1126,21 @@ static inline struct page *alloc_demote_page(struct page *page,
 }
 #endif
 
+static inline bool is_migration_target_contended(int nid)
+{
+	int node;
+	nodemask_t used_mask;
+
+
+	nodes_clear(used_mask);
+	node = find_next_best_node(nid, &used_mask, true);
+
+	if (test_bit(PGDAT_CONTENDED, &NODE_DATA(node)->flags))
+		return true;
+
+	return false;
+}
+
 static inline bool is_demote_ok(int nid, struct scan_control *sc)
 {
 	/* Just do demotion with migrate mode of node reclaim */
@@ -1144,6 +1159,10 @@ static inline bool is_demote_ok(int nid, struct scan_control *sc)
 	if (!has_migration_target_node_online())
 		return false;
 
+	/* Check if the demote target node is contended or not */
+	if (is_migration_target_contended(nid))
+		return false;
+
 	return true;
 }
 
@@ -1564,6 +1583,10 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		nr_reclaimed += nr_succeeded;
 
 		if (err) {
+			if (err == -ENOMEM)
+				set_bit(PGDAT_CONTENDED,
+					&NODE_DATA(target_nid)->flags);
+
 			putback_movable_pages(&demote_pages);
 
 			list_splice(&ret_pages, &demote_pages);
@@ -2597,6 +2620,19 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 		 * scan target and the percentage scanning already complete
 		 */
 		lru = (lru == LRU_FILE) ? LRU_BASE : LRU_FILE;
+
+		/*
+		 * The shrink_page_list() may find the demote target node is
+		 * contended, if so it doesn't make sense to scan anonymous
+		 * LRU again.
+		 *
+		 * Need check if swap is available or not too since demotion
+		 * may happen on swapless system.
+		 */
+		if (!is_demote_ok(pgdat->node_id, sc) &&
+		    (!sc->may_swap || mem_cgroup_get_nr_swap_pages(memcg) <= 0))
+			lru = LRU_FILE;
+
 		nr_scanned = targets[lru] - nr[lru];
 		nr[lru] = targets[lru] * (100 - percentage) / 100;
 		nr[lru] -= min(nr[lru], nr_scanned);
@@ -3447,6 +3483,7 @@ static void clear_pgdat_congested(pg_data_t *pgdat)
 	clear_bit(PGDAT_CONGESTED, &pgdat->flags);
 	clear_bit(PGDAT_DIRTY, &pgdat->flags);
 	clear_bit(PGDAT_WRITEBACK, &pgdat->flags);
+	clear_bit(PGDAT_CONTENDED, &pgdat->flags);
 }
 
 /*

From patchwork Thu Jun 13 23:29:36 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10993829
From: Yang Shi
To: mhocko@suse.com, mgorman@techsingularity.net, riel@surriel.com, hannes@cmpxchg.org, akpm@linux-foundation.org, dave.hansen@intel.com, keith.busch@intel.com, dan.j.williams@intel.com, fengguang.wu@intel.com, fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [v3 PATCH 8/9] mm: vmscan: add page demotion counter
Date: Fri, 14 Jun 2019 07:29:36 +0800
Message-Id: <1560468577-101178-9-git-send-email-yang.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>

Account the number of demoted pages in reclaim_stat->nr_demoted.

Add pgdemote_kswapd and pgdemote_direct VM counters, shown in
/proc/vmstat.

Signed-off-by: Yang Shi
---
 include/linux/vm_event_item.h | 2 ++
 include/linux/vmstat.h        | 1 +
 mm/vmscan.c                   | 8 ++++++++
 mm/vmstat.c                   | 2 ++
 4 files changed, 13 insertions(+)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 47a3441..499a3aa 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -32,6 +32,8 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		PGREFILL,
 		PGSTEAL_KSWAPD,
 		PGSTEAL_DIRECT,
+		PGDEMOTE_KSWAPD,
+		PGDEMOTE_DIRECT,
 		PGSCAN_KSWAPD,
 		PGSCAN_DIRECT,
 		PGSCAN_DIRECT_THROTTLE,
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index bdeda4b..00d53d4 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -29,6 +29,7 @@ struct reclaim_stat {
 	unsigned nr_activate[2];
 	unsigned nr_ref_keep;
 	unsigned nr_unmap_fail;
+	unsigned nr_demoted;
 };
 
 #ifdef CONFIG_VM_EVENT_COUNTERS
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9ec55d7..f65cd45 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -130,6 +130,7 @@ struct scan_control {
 		unsigned int immediate;
 		unsigned int file_taken;
 		unsigned int taken;
+		unsigned int demoted;
 	} nr;
 };
 
@@ -1582,6 +1583,12 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 
 		nr_reclaimed += nr_succeeded;
 
+		stat->nr_demoted = nr_succeeded;
+		if (current_is_kswapd())
+			__count_vm_events(PGDEMOTE_KSWAPD, stat->nr_demoted);
+		else
+			__count_vm_events(PGDEMOTE_DIRECT, stat->nr_demoted);
+
 		if (err) {
 			if (err == -ENOMEM)
 				set_bit(PGDAT_CONTENDED,
@@ -2097,6 +2104,7 @@ static int current_may_throttle(void)
 	sc->nr.unqueued_dirty += stat.nr_unqueued_dirty;
 	sc->nr.writeback += stat.nr_writeback;
 	sc->nr.immediate += stat.nr_immediate;
+	sc->nr.demoted += stat.nr_demoted;
 	sc->nr.taken += nr_taken;
 	if (file)
 		sc->nr.file_taken += nr_taken;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index d876ac0..eee29a9 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1192,6 +1192,8 @@ int fragmentation_index(struct zone *zone, unsigned int order)
 	"pgrefill",
 	"pgsteal_kswapd",
 	"pgsteal_direct",
+	"pgdemote_kswapd",
+	"pgdemote_direct",
 	"pgscan_kswapd",
 	"pgscan_direct",
 	"pgscan_direct_throttle",

From patchwork Thu Jun 13 23:29:37 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10993831
ESMTP id 56EE48E0002 for ; Thu, 13 Jun 2019 19:30:17 -0400 (EDT) Received: by mail-ot1-f71.google.com with SMTP id a17so301718otd.19 for ; Thu, 13 Jun 2019 16:30:17 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:from:to:cc :subject:date:message-id:in-reply-to:references; bh=qOFD3Vu5k5uvjb9pS72jZXaVfrvyzRyCNIQoGs68Qno=; b=eM14OFwMrSI12zucnDukAIR+zy9BfKoYSpISBCAOxYEEqYzvrnKa/G5zrTeY0sK9Zx FA5zoj2q4webSAs87GpsNU16fzcBNP3Sr0Ikm8rTCmCRT/jdqgxS1gVqnbkcZYRlxSCz mgY9bULuyujuaSqQziNoo/Y0BuxCVaLg57kAsywQVI3lO6mCYl+F/Dp/Sfd42cZNoaPL mFP5392sb2WbMZIbqhIk2DbrN1H3OjioKPtytjcFpRowYztQpQvq48m27bEP1BWKcQcU DHxcDl1qfkGAogCoHLeRhR0i67aiBwIxb5QX4f7qr4FwulmX9oVyJSXepcMvmnTTVyGh xxzg== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of yang.shi@linux.alibaba.com designates 115.124.30.54 as permitted sender) smtp.mailfrom=yang.shi@linux.alibaba.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=alibaba.com X-Gm-Message-State: APjAAAVMm6E7ept1gxARel9Sgl7ksGtaW173EzNd/aKLcUdtaa64iSgh XqH4YnByvgxHlt43rU7sTneajUUyNQGdxxMphOPRtuz7ILp64wQiMuEyTGg3JDdfoeE7r0AvTGn r+fa09f47m3UH+WTKXS5uKlwejy/EU0d/0T5EF5BYbhwTUVDWP4XOj1KBiRdBm1xaTQ== X-Received: by 2002:a9d:6d0e:: with SMTP id o14mr38000168otp.205.1560468617025; Thu, 13 Jun 2019 16:30:17 -0700 (PDT) X-Google-Smtp-Source: APXvYqznbLtCRUnirJa69qKiIDPu9RtT981NF3YQ06rSoJdTCtdHJ/6bu0ncal0h1tMrKtcOtj+k X-Received: by 2002:a9d:6d0e:: with SMTP id o14mr38000091otp.205.1560468615825; Thu, 13 Jun 2019 16:30:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560468615; cv=none; d=google.com; s=arc-20160816; b=YWGp0xXqxg9IK/3q8BLQ0fIJt9MNmol6G37MRpHyFXdnkIjIYIZBSt3sBY91wvHjhV HFNNSDwZcZvzl8PPvaD8XQCurjSpDbowFz4CyqZTLG2vgLA5IJsSft1+gLMl1sT0FZ9P 04orfwi4HtT0/+/HYFh8zmjESR4/6YiZGv92KV0snCypJbZTpu+gI1yBfZCCKBNzoRwq pcpsTEknrqR3zhUi8jNKRWXbGmTX1mpbgwcW+KkrIzyeKiQU/Siso898CKA7pMEzH4L2 
7tSZ5WSHvnl5jkZwt9dqp0nR25zCvrtmyw0KHb3Le9MmZ5JzDakpT3Xa21JzrjukC1nR 39WA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:cc:to:from; bh=qOFD3Vu5k5uvjb9pS72jZXaVfrvyzRyCNIQoGs68Qno=; b=QmKfBQ6XWfqaKxK2ZiCsXEbo9NFsszbaFGvM6HOIhvzxQRUYmL2jBFmLDkCu2aRXw7 zZFDRsPstiDZqGgLfhSjS5zniktqkk/Ur8mZNysGgYVTSdE/kLnn0hAyVwfat8RSZBuV BwVUF3BbfM0g0L9uMp1xnTqwT83rxKfOrIiX8B1gRMUPlnWzhDc+mssMv5jgHCq263Mt eJls00RUtTjrkDAVC41h7Gs4vY1CYDDAVOpiwaC6wdG1/2UBBWhEor3YK7zH+dFSyk4D z6U9aqRXj9lfeJMbKHeYzHva3EHasSggl2ZLGH6wgUK/j3VFQBfoOqnpXPbk72MZo/ZH xEyw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of yang.shi@linux.alibaba.com designates 115.124.30.54 as permitted sender) smtp.mailfrom=yang.shi@linux.alibaba.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: from out30-54.freemail.mail.aliyun.com (out30-54.freemail.mail.aliyun.com. [115.124.30.54]) by mx.google.com with ESMTPS id j5si531541oif.224.2019.06.13.16.30.14 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 13 Jun 2019 16:30:15 -0700 (PDT) Received-SPF: pass (google.com: domain of yang.shi@linux.alibaba.com designates 115.124.30.54 as permitted sender) client-ip=115.124.30.54; Authentication-Results: mx.google.com; spf=pass (google.com: domain of yang.shi@linux.alibaba.com designates 115.124.30.54 as permitted sender) smtp.mailfrom=yang.shi@linux.alibaba.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R151e4;CH=green;DM=||false|;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01f04391;MF=yang.shi@linux.alibaba.com;NM=1;PH=DS;RN=15;SR=0;TI=SMTPD_---0TU6DYEz_1560468591; Received: from e19h19392.et15sqa.tbsite.net(mailfrom:yang.shi@linux.alibaba.com fp:SMTPD_---0TU6DYEz_1560468591) by smtp.aliyun-inc.com(127.0.0.1); Fri, 14 Jun 2019 07:30:01 +0800 From: Yang Shi To: mhocko@suse.com, mgorman@techsingularity.net, 
	riel@surriel.com, hannes@cmpxchg.org, akpm@linux-foundation.org,
	dave.hansen@intel.com, keith.busch@intel.com, dan.j.williams@intel.com,
	fengguang.wu@intel.com, fan.du@intel.com, ying.huang@intel.com,
	ziy@nvidia.com
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: [v3 PATCH 9/9] mm: numa: add page promotion counter
Date: Fri, 14 Jun 2019 07:29:37 +0800
Message-Id: <1560468577-101178-10-git-send-email-yang.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1560468577-101178-1-git-send-email-yang.shi@linux.alibaba.com>
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:
X-Virus-Scanned: ClamAV using ClamSMTP

Add counter for page promotion for NUMA balancing.

Signed-off-by: Yang Shi
---
 include/linux/vm_event_item.h | 1 +
 mm/huge_memory.c              | 4 ++++
 mm/memory.c                   | 4 ++++
 mm/vmstat.c                   | 1 +
 4 files changed, 10 insertions(+)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 499a3aa..9f52a62 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -51,6 +51,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		NUMA_HINT_FAULTS,
 		NUMA_HINT_FAULTS_LOCAL,
 		NUMA_PAGE_MIGRATE,
+		NUMA_PAGE_PROMOTE,
 #endif
 #ifdef CONFIG_MIGRATION
 		PGMIGRATE_SUCCESS, PGMIGRATE_FAIL,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9f8bce9..01cfe29 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1638,6 +1638,10 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
 	migrated = migrate_misplaced_transhuge_page(vma->vm_mm, vma,
 				vmf->pmd, pmd, vmf->address, page, target_nid);
 	if (migrated) {
+		if (!node_state(page_nid, N_CPU_MEM) &&
+		    node_state(target_nid, N_CPU_MEM))
+			count_vm_numa_events(NUMA_PAGE_PROMOTE, HPAGE_PMD_NR);
+
 		flags |= TNF_MIGRATED;
 		page_nid = target_nid;
 	} else
diff --git a/mm/memory.c b/mm/memory.c
index 96f1d47..e554cd5 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3770,6 +3770,10 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 	/* Migrate to the requested node */
 	migrated = migrate_misplaced_page(page, vma, target_nid);
 	if (migrated) {
+		if (!node_state(page_nid, N_CPU_MEM) &&
+		    node_state(target_nid, N_CPU_MEM))
+			count_vm_numa_event(NUMA_PAGE_PROMOTE);
+
 		page_nid = target_nid;
 		flags |= TNF_MIGRATED;
 	} else
diff --git a/mm/vmstat.c b/mm/vmstat.c
index eee29a9..0140736 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1220,6 +1220,7 @@ int fragmentation_index(struct zone *zone, unsigned int order)
 	"numa_hint_faults",
 	"numa_hint_faults_local",
 	"numa_pages_migrated",
+	"numa_pages_promoted",
 #endif
 #ifdef CONFIG_MIGRATION
 	"pgmigrate_success",
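
For anyone testing the series, the new counters can be read from userspace once the patches are applied. The following is only a sketch and not part of the patches: the helper names vmstat_lookup() and read_text_file() are hypothetical, and it assumes /proc/vmstat keeps its usual "name value" one-entry-per-line layout. The counter names pgdemote_kswapd, pgdemote_direct and numa_pages_promoted are the strings added to mm/vmstat.c above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up one counter in a buffer laid out like /proc/vmstat
 * ("name value\n" per entry). Returns 0 and stores the value on
 * success, -1 if the counter is absent (for example on a kernel
 * that does not carry this series).
 */
int vmstat_lookup(const char *text, const char *name,
		  unsigned long long *value)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Match the full name, followed by the separating space. */
		if (strncmp(line, name, len) == 0 && line[len] == ' ')
			return sscanf(line + len, "%llu", value) == 1 ? 0 : -1;

		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next entry */
	}
	return -1;
}

/* Read up to size - 1 bytes of a file into buf; returns 0 on success. */
int read_text_file(const char *path, char *buf, size_t size)
{
	FILE *fp = fopen(path, "r");
	size_t n;

	if (!fp)
		return -1;
	n = fread(buf, 1, size - 1, fp);
	buf[n] = '\0';
	fclose(fp);
	return 0;
}
```

A caller would fill a buffer with read_text_file("/proc/vmstat", buf, sizeof(buf)) and then call vmstat_lookup() once per counter. Treating a missing counter as a distinct error, instead of reporting zero, should make it easier to tell "no demotions yet" apart from "kernel without the patches".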