From patchwork Fri Jul 10 14:01:45 2020
X-Patchwork-Submitter: Feng Tang
X-Patchwork-Id: 11656731
From: Feng Tang
To: Andrew Morton, Michal Hocko, Johannes Weiner, Matthew Wilcox,
 Mel Gorman, Kees Cook, Qian Cai, Dennis Zhou, andi.kleen@intel.com,
 tim.c.chen@intel.com, dave.hansen@intel.com, ying.huang@intel.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Feng Tang
Subject: [PATCH v6 1/4] proc/meminfo: avoid open coded reading of vm_committed_as
Date: Fri, 10 Jul 2020 22:01:45 +0800
Message-Id: <1594389708-60781-2-git-send-email-feng.tang@intel.com>
In-Reply-To: <1594389708-60781-1-git-send-email-feng.tang@intel.com>
References: <1594389708-60781-1-git-send-email-feng.tang@intel.com>

Use the existing vm_memory_committed() instead, which is also convenient
for future change.

Signed-off-by: Feng Tang
Acked-by: Michal Hocko
Cc: Matthew Wilcox (Oracle)
Cc: Johannes Weiner
Cc: Mel Gorman
Cc: Qian Cai
Cc: Kees Cook
Cc: Andi Kleen
Cc: Tim Chen
Cc: Dave Hansen
Cc: Huang Ying
---
 fs/proc/meminfo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 2a4c58f..887a553 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -41,7 +41,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 
 	si_meminfo(&i);
 	si_swapinfo(&i);
-	committed = percpu_counter_read_positive(&vm_committed_as);
+	committed = vm_memory_committed();
 
 	cached = global_node_page_state(NR_FILE_PAGES) -
 			total_swapcache_pages() - i.bufferram;

From patchwork Fri Jul 10 14:01:46 2020
X-Patchwork-Submitter: Feng Tang
X-Patchwork-Id: 11656733
From: Feng Tang
To: Andrew Morton, Michal Hocko, Johannes Weiner, Matthew Wilcox,
 Mel Gorman, Kees Cook, Qian Cai, Dennis Zhou, andi.kleen@intel.com,
 tim.c.chen@intel.com, dave.hansen@intel.com, ying.huang@intel.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Feng Tang, "K. Y. Srinivasan", Haiyang Zhang
Subject: [PATCH v6 2/4] mm/util.c: make vm_memory_committed() more accurate
Date: Fri, 10 Jul 2020 22:01:46 +0800
Message-Id: <1594389708-60781-3-git-send-email-feng.tang@intel.com>
In-Reply-To: <1594389708-60781-1-git-send-email-feng.tang@intel.com>
References: <1594389708-60781-1-git-send-email-feng.tang@intel.com>

percpu_counter_sum_positive() will provide more accurate info.

With percpu_counter_read_positive(), the worst-case deviation can be
'batch * nr_cpus', which is totalram_pages/256 for now and will grow
when the batch is enlarged.

Its time cost is about 800 nanoseconds on a 2C/4T platform and 2~3
microseconds on a 2S/36C/72T Skylake server in the normal case. In the
worst case, where vm_committed_as's spinlock is under severe contention,
it costs 30~40 microseconds on the 2S/36C/72T Skylake server, which
should be fine for its only two users: /proc/meminfo and the HyperV
balloon driver's per-second status trace.

Link: http://lkml.kernel.org/r/1592725000-73486-3-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang
Acked-by: Michal Hocko # for /proc/meminfo
Cc: "K. Y. Srinivasan"
Cc: Haiyang Zhang
Cc: Matthew Wilcox (Oracle)
Cc: Johannes Weiner
Cc: Mel Gorman
Cc: Qian Cai
Cc: Andi Kleen
Cc: Tim Chen
Cc: Dave Hansen
Cc: Huang Ying
Signed-off-by: Andrew Morton
---
 mm/util.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/util.c b/mm/util.c
index c856f5f..d076218 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -787,10 +787,15 @@ struct percpu_counter vm_committed_as ____cacheline_aligned_in_smp;
  * balancing memory across competing virtual machines that are hosted.
  * Several metrics drive this policy engine including the guest reported
  * memory commitment.
+ *
+ * The time cost of this is very low for small platforms, and for big
+ * platform like a 2S/36C/72T Skylake server, in worst case where
+ * vm_committed_as's spinlock is under severe contention, the time cost
+ * could be about 30~40 microseconds.
  */
 unsigned long vm_memory_committed(void)
 {
-	return percpu_counter_read_positive(&vm_committed_as);
+	return percpu_counter_sum_positive(&vm_committed_as);
 }
 EXPORT_SYMBOL_GPL(vm_memory_committed);
 

From patchwork Fri Jul 10 14:01:47 2020
X-Patchwork-Submitter: Feng Tang
X-Patchwork-Id: 11656727
From: Feng Tang
To: Andrew Morton, Michal Hocko, Johannes Weiner, Matthew Wilcox,
 Mel Gorman, Kees Cook, Qian Cai, Dennis Zhou, andi.kleen@intel.com,
 tim.c.chen@intel.com, dave.hansen@intel.com, ying.huang@intel.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Feng Tang, Tejun Heo, Christoph Lameter
Subject: [PATCH v6 3/4] percpu_counter: add percpu_counter_sync()
Date: Fri, 10 Jul 2020 22:01:47 +0800
Message-Id: <1594389708-60781-4-git-send-email-feng.tang@intel.com>
In-Reply-To: <1594389708-60781-1-git-send-email-feng.tang@intel.com>
References: <1594389708-60781-1-git-send-email-feng.tang@intel.com>

percpu_counter's accuracy is related to its batch size. For a
percpu_counter with a big batch, its deviation could be big, so when the
counter's batch is changed at runtime to a smaller value for better
accuracy, there could also be a requirement to reduce the big deviation.

So add a percpu-counter sync function to be run on each CPU.

Reported-by: kernel test robot
Signed-off-by: Feng Tang
Cc: Dennis Zhou
Cc: Tejun Heo
Cc: Christoph Lameter
Cc: Michal Hocko
Cc: Qian Cai
Cc: Andi Kleen
Cc: Huang Ying
---
 include/linux/percpu_counter.h |  4 ++++
 lib/percpu_counter.c           | 19 +++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h
index 0a4f54d..01861ee 100644
--- a/include/linux/percpu_counter.h
+++ b/include/linux/percpu_counter.h
@@ -44,6 +44,7 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount,
 			      s32 batch);
 s64 __percpu_counter_sum(struct percpu_counter *fbc);
 int __percpu_counter_compare(struct percpu_counter *fbc, s64 rhs, s32 batch);
+void percpu_counter_sync(struct percpu_counter *fbc);
 
 static inline int percpu_counter_compare(struct percpu_counter *fbc, s64 rhs)
 {
@@ -172,6 +173,9 @@ static inline bool percpu_counter_initialized(struct percpu_counter *fbc)
 	return true;
 }
 
+static inline void percpu_counter_sync(struct percpu_counter *fbc)
+{
+}
 #endif	/* CONFIG_SMP */
 
 static inline void percpu_counter_inc(struct percpu_counter *fbc)
diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
index a66595b..a2345de 100644
--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -99,6 +99,25 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
 EXPORT_SYMBOL(percpu_counter_add_batch);
 
 /*
+ * For percpu_counter with a big batch, the deviation of its count could
+ * be big, and there is requirement to reduce the deviation, like when the
+ * counter's batch could be runtime decreased to get a better accuracy,
+ * which can be achieved by running this sync function on each CPU.
+ */
+void percpu_counter_sync(struct percpu_counter *fbc)
+{
+	unsigned long flags;
+	s64 count;
+
+	raw_spin_lock_irqsave(&fbc->lock, flags);
+	count = __this_cpu_read(*fbc->counters);
+	fbc->count += count;
+	__this_cpu_sub(*fbc->counters, count);
+	raw_spin_unlock_irqrestore(&fbc->lock, flags);
+}
+EXPORT_SYMBOL(percpu_counter_sync);
+
+/*
  * Add up all the per-cpu counts, return the result.  This is a more accurate
  * but much slower version of percpu_counter_read_positive()
  */

From patchwork Fri Jul 10 14:01:48 2020
X-Patchwork-Submitter: Feng Tang
X-Patchwork-Id: 11656729
From: Feng Tang
To: Andrew Morton, Michal Hocko, Johannes Weiner, Matthew Wilcox,
 Mel Gorman, Kees Cook, Qian Cai, Dennis Zhou, andi.kleen@intel.com,
 tim.c.chen@intel.com, dave.hansen@intel.com, ying.huang@intel.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Feng Tang
Subject: [PATCH v6 4/4] mm: adjust vm_committed_as_batch according to vm overcommit policy
Date: Fri, 10 Jul 2020 22:01:48 +0800
Message-Id: <1594389708-60781-5-git-send-email-feng.tang@intel.com>
In-Reply-To: <1594389708-60781-1-git-send-email-feng.tang@intel.com>
References: <1594389708-60781-1-git-send-email-feng.tang@intel.com>

When checking a performance change for will-it-scale scalability mmap test
[1], we found very high lock contention for spinlock of percpu counter
'vm_committed_as':

    94.14%     0.35%  [kernel.kallsyms]         [k] _raw_spin_lock_irqsave
    48.21% _raw_spin_lock_irqsave;percpu_counter_add_batch;__vm_enough_memory;mmap_region;do_mmap;
    45.91% _raw_spin_lock_irqsave;percpu_counter_add_batch;__do_munmap;

Actually this heavy lock contention is not always necessary.
The 'vm_committed_as' needs to be very precise when the strict
OVERCOMMIT_NEVER policy is set, which requires a rather small batch
number for the percpu counter.

So keep 'batch' number unchanged for strict OVERCOMMIT_NEVER policy, and
lift it to 64X for OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS policies. Also
add a sysctl handler to adjust it when the policy is reconfigured.

Benchmark with the same testcase in [1] shows 53% improvement on an
8C/16T desktop, and 2097%(20X) on a 4S/72C/144T server. We tested with
test platforms in 0day (server, desktop and laptop), and 80%+ of the
platforms show improvements with that test. Whether a platform shows
improvements depends on whether the test mmap size is bigger than the
batch number computed.

And if the lift is 16X, 1/3 of the platforms will show improvements,
though it should help the mmap/unmap usage generally, as Michal Hocko
mentioned:

: I believe that there are non-synthetic workloads which would benefit from
: a larger batch. E.g. large in memory databases which do large mmaps
: during startups from multiple threads.

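The 64X figure above can be checked with a small userspace sketch of the
batch arithmetic. This is only a model under stated assumptions, not the
kernel code: model_compute_batch() and the MODEL_* names are hypothetical,
the divisors 256 and 4 and the floor max(nr_cpus*2, 32) are taken from the
mm/mm_init.c hunk of this patch, and the policy values mirror the kernel's
OVERCOMMIT_GUESS/ALWAYS/NEVER constants.

```c
#include <stdint.h>

/* Same numeric values as the kernel's overcommit policies (assumed model names). */
enum model_policy {
	MODEL_OVERCOMMIT_GUESS = 0,
	MODEL_OVERCOMMIT_ALWAYS = 1,
	MODEL_OVERCOMMIT_NEVER = 2,
};

/*
 * Hypothetical userspace model of the batch computation in this patch.
 * ram_pages is the total number of RAM pages; nr_cpus must be > 0.
 */
static int32_t model_compute_batch(uint64_t ram_pages, int32_t nr_cpus,
				   enum model_policy policy)
{
	/* Lower bound that does not depend on memory size. */
	int32_t floor_batch = nr_cpus * 2 > 32 ? nr_cpus * 2 : 32;
	/* 1/256 (about 0.4%) per CPU for NEVER, 1/4 (25%) per CPU otherwise. */
	uint64_t divisor = (policy == MODEL_OVERCOMMIT_NEVER) ? 256 : 4;
	uint64_t memsized = ram_pages / (uint64_t)nr_cpus / divisor;

	/* Clamp to the largest value a signed 32-bit batch can hold. */
	if (memsized > INT32_MAX)
		memsized = INT32_MAX;

	return (int32_t)memsized > floor_batch ? (int32_t)memsized : floor_batch;
}
```

For example, with 4194304 pages (16 GiB of 4 KiB pages) and 16 CPUs, this
model gives a batch of 1024 pages for OVERCOMMIT_NEVER and 65536 pages for
the other two policies, which is the 64X lift.
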
[1] https://lore.kernel.org/lkml/20200305062138.GI5972@shao2-debian/

Link: http://lkml.kernel.org/r/1589611660-89854-4-git-send-email-feng.tang@intel.com
Link: http://lkml.kernel.org/r/1592725000-73486-4-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang
Acked-by: Michal Hocko
Cc: Matthew Wilcox (Oracle)
Cc: Johannes Weiner
Cc: Mel Gorman
Cc: Qian Cai
Cc: Kees Cook
Cc: Andi Kleen
Cc: Tim Chen
Cc: Dave Hansen
Cc: Huang Ying
---
 include/linux/mm.h   |  2 ++
 include/linux/mman.h |  4 ++++
 kernel/sysctl.c      |  2 +-
 mm/mm_init.c         | 22 ++++++++++++++++------
 mm/util.c            | 41 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index e529e90..678ea25 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -208,6 +208,8 @@ int overcommit_ratio_handler(struct ctl_table *, int, void *, size_t *,
 		loff_t *);
 int overcommit_kbytes_handler(struct ctl_table *, int, void *, size_t *,
 		loff_t *);
+int overcommit_policy_handler(struct ctl_table *, int, void *, size_t *,
+		loff_t *);
 
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
 
diff --git a/include/linux/mman.h b/include/linux/mman.h
index 6733f2f..629cefc 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -57,8 +57,12 @@ extern struct percpu_counter vm_committed_as;
 
 #ifdef CONFIG_SMP
 extern s32 vm_committed_as_batch;
+extern void mm_compute_batch(int overcommit_policy);
 #else
 #define vm_committed_as_batch 0
+static inline void mm_compute_batch(int overcommit_policy)
+{
+}
 #endif
 
 unsigned long vm_memory_committed(void);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 8dca889..51ad1aa 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2664,7 +2664,7 @@ static struct ctl_table vm_table[] = {
 		.data		= &sysctl_overcommit_memory,
 		.maxlen		= sizeof(sysctl_overcommit_memory),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
+		.proc_handler	= overcommit_policy_handler,
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= &two,
 	},
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 435e5f7..b06a30f 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include "internal.h"
 
 #ifdef CONFIG_DEBUG_MEMORY_INIT
@@ -144,14 +145,23 @@ EXPORT_SYMBOL_GPL(mm_kobj);
 #ifdef CONFIG_SMP
 s32 vm_committed_as_batch = 32;
 
-static void __meminit mm_compute_batch(void)
+void mm_compute_batch(int overcommit_policy)
 {
 	u64 memsized_batch;
 	s32 nr = num_present_cpus();
 	s32 batch = max_t(s32, nr*2, 32);
-
-	/* batch size set to 0.4% of (total memory/#cpus), or max int32 */
-	memsized_batch = min_t(u64, (totalram_pages()/nr)/256, 0x7fffffff);
+	unsigned long ram_pages = totalram_pages();
+
+	/*
+	 * For policy OVERCOMMIT_NEVER, set batch size to 0.4% of
+	 * (total memory/#cpus), and lift it to 25% for other policies
+	 * to ease the possible lock contention for percpu_counter
+	 * vm_committed_as, while the max limit is INT_MAX
+	 */
+	if (overcommit_policy == OVERCOMMIT_NEVER)
+		memsized_batch = min_t(u64, ram_pages/nr/256, INT_MAX);
+	else
+		memsized_batch = min_t(u64, ram_pages/nr/4, INT_MAX);
 
 	vm_committed_as_batch = max_t(s32, memsized_batch, batch);
 }
@@ -162,7 +172,7 @@ static int __meminit mm_compute_batch_notifier(struct notifier_block *self,
 	switch (action) {
 	case MEM_ONLINE:
 	case MEM_OFFLINE:
-		mm_compute_batch();
+		mm_compute_batch(sysctl_overcommit_memory);
 	default:
 		break;
 	}
@@ -176,7 +186,7 @@ static struct notifier_block compute_batch_nb __meminitdata = {
 
 static int __init mm_compute_batch_init(void)
 {
-	mm_compute_batch();
+	mm_compute_batch(sysctl_overcommit_memory);
 	register_hotmemory_notifier(&compute_batch_nb);
 
 	return 0;
diff --git a/mm/util.c b/mm/util.c
index d076218..79c965d 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -746,6 +746,47 @@ int overcommit_ratio_handler(struct ctl_table *table, int write, void *buffer,
 	return ret;
 }
 
+static void sync_overcommit_as(struct work_struct *dummy)
+{
+	percpu_counter_sync(&vm_committed_as);
+}
+
+int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
+		size_t *lenp, loff_t *ppos)
+{
+	struct ctl_table t;
+	int new_policy;
+	int ret;
+
+	/*
+	 * The deviation of vm_committed_as could be big with loose policy
+	 * like OVERCOMMIT_ALWAYS/OVERCOMMIT_GUESS. When changing policy to
+	 * strict OVERCOMMIT_NEVER, we need to reduce the deviation to comply
+	 * with the strict "NEVER", and to avoid possible race condition (even
+	 * though user usually won't too frequently do the switching to policy
+	 * OVERCOMMIT_NEVER), the switch is done in the following order:
+	 *	1. changing the batch
+	 *	2. sync percpu count on each CPU
+	 *	3. switch the policy
+	 */
+	if (write) {
+		t = *table;
+		t.data = &new_policy;
+		ret = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
+		if (ret)
+			return ret;
+
+		mm_compute_batch(new_policy);
+		if (new_policy == OVERCOMMIT_NEVER)
+			schedule_on_each_cpu(sync_overcommit_as);
+		sysctl_overcommit_memory = new_policy;
+	} else {
+		ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+	}
+
+	return ret;
+}
+
 int overcommit_kbytes_handler(struct ctl_table *table, int write, void *buffer,
 		size_t *lenp, loff_t *ppos)
 {
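
For readers less familiar with percpu counters, the effect of the sync step
in the ordering above can be illustrated with a single-threaded userspace
model. This is a sketch under stated assumptions, not the kernel
implementation: struct model_counter and the model_*() helpers are
hypothetical names, there is no locking or real per-CPU data, and the fold
rule (push the local delta to the global count once it reaches the batch)
is a simplification of percpu_counter_add_batch().

```c
#include <stdint.h>

#define MODEL_NR_CPUS 4

/* Hypothetical model: one global count plus one unfolded delta per CPU. */
struct model_counter {
	int64_t count;
	int32_t local[MODEL_NR_CPUS];
};

/* Add on one CPU; fold into the global count only when the batch is reached. */
static void model_add(struct model_counter *c, int cpu, int32_t amount,
		      int32_t batch)
{
	int64_t v = (int64_t)c->local[cpu] + amount;

	if (v >= batch || v <= -(int64_t)batch) {
		c->count += v;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = (int32_t)v;
	}
}

/* Cheap read: global count only, may lag by the unfolded per-CPU deltas. */
static int64_t model_read(const struct model_counter *c)
{
	return c->count;
}

/* Accurate read: global count plus every per-CPU delta. */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->count;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}

/* Models the sync added by this series: fold one CPU's delta into the count. */
static void model_sync(struct model_counter *c, int cpu)
{
	c->count += c->local[cpu];
	c->local[cpu] = 0;
}

/* Models running the sync on each CPU. */
static void model_sync_all(struct model_counter *c)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_sync(c, cpu);
}
```

In this model, deltas smaller than the batch stay invisible to the cheap
read until model_sync_all() runs, after which the cheap read and the
accurate sum agree. That is why the handler shrinks the batch and syncs
every CPU before it switches the policy to OVERCOMMIT_NEVER.
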