From patchwork Fri Apr 12 09:24:41 2024
X-Patchwork-Submitter: "zhangpeng (AS)"
X-Patchwork-Id: 13627434
From: Peng Zhang
Subject: [RFC PATCH 3/3] mm: convert mm's rss stats into lazy_percpu_counter
Date: Fri, 12 Apr 2024 17:24:41 +0800
Message-ID: <20240412092441.3112481-4-zhangpeng362@huawei.com>
In-Reply-To: <20240412092441.3112481-1-zhangpeng362@huawei.com>
References: <20240412092441.3112481-1-zhangpeng362@huawei.com>
Sender: owner-linux-mm@kvack.org

From: ZhangPeng

Since commit f1a7941243c1 ("mm: convert mm's rss stats into
percpu_counter"), the rss_stats have been converted to percpu_counter,
which changes the error margin from (nr_threads * 64) to approximately
(nr_cpus ^ 2). However, the new percpu allocation in mm_init() causes a
performance regression on fork/exec/shell.
Even after commit 14ef95be6f55 ("kernel/fork: group allocation/free of
per-cpu counters for mm struct"), the performance of fork/exec/shell is
still poor compared to previous kernel versions.

To mitigate the performance regression, use lazy_percpu_counter to delay
the allocation of percpu memory for rss_stats. With this conversion,
lmbench shows a 3% ~ 6% performance improvement for
fork_proc/exec_proc/shell_proc. The test results are as follows:

             base      base+revert        base+lazy_percpu_counter
fork_proc    427.4ms   394.1ms  (7.8%)    413.9ms  (3.2%)
exec_proc    2205.1ms  2042.2ms (7.4%)    2072.0ms (6.0%)
shell_proc   3180.9ms  2963.7ms (6.8%)    3010.7ms (5.4%)

Signed-off-by: ZhangPeng
Signed-off-by: Kefeng Wang
---
 include/linux/mm.h          |  8 ++++----
 include/linux/mm_types.h    |  4 ++--
 include/trace/events/kmem.h |  4 ++--
 kernel/fork.c               | 12 ++++--------
 4 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 07c73451d42f..d1ea246b99c3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2631,28 +2631,28 @@ static inline bool get_user_page_fast_only(unsigned long addr,
  */
 static inline unsigned long get_mm_counter(struct mm_struct *mm, int member)
 {
-	return percpu_counter_read_positive(&mm->rss_stat[member]);
+	return lazy_percpu_counter_read_positive(&mm->rss_stat[member]);
 }
 
 void mm_trace_rss_stat(struct mm_struct *mm, int member);
 
 static inline void add_mm_counter(struct mm_struct *mm, int member, long value)
 {
-	percpu_counter_add(&mm->rss_stat[member], value);
+	lazy_percpu_counter_add(&mm->rss_stat[member], value);
 
 	mm_trace_rss_stat(mm, member);
 }
 
 static inline void inc_mm_counter(struct mm_struct *mm, int member)
 {
-	percpu_counter_inc(&mm->rss_stat[member]);
+	lazy_percpu_counter_add(&mm->rss_stat[member], 1);
 
 	mm_trace_rss_stat(mm, member);
 }
 
 static inline void dec_mm_counter(struct mm_struct *mm, int member)
 {
-	percpu_counter_dec(&mm->rss_stat[member]);
+	lazy_percpu_counter_sub(&mm->rss_stat[member], 1);
 
 	mm_trace_rss_stat(mm, member);
 }
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index c432add95913..bf44c3a6fc99 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -18,7 +18,7 @@
 #include
 #include
 #include
-#include
+#include
 
 #include
 
@@ -898,7 +898,7 @@ struct mm_struct {
 
 		unsigned long saved_auxv[AT_VECTOR_SIZE]; /* for /proc/PID/auxv */
 
-		struct percpu_counter rss_stat[NR_MM_COUNTERS];
+		struct lazy_percpu_counter rss_stat[NR_MM_COUNTERS];
 
 		struct linux_binfmt *binfmt;
 
diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
index 6e62cc64cd92..3a35d9a665b7 100644
--- a/include/trace/events/kmem.h
+++ b/include/trace/events/kmem.h
@@ -399,8 +399,8 @@ TRACE_EVENT(rss_stat,
 		__entry->mm_id = mm_ptr_to_hash(mm);
 		__entry->curr = !!(current->mm == mm);
 		__entry->member = member;
-		__entry->size = (percpu_counter_sum_positive(&mm->rss_stat[member])
-							    << PAGE_SHIFT);
+		__entry->size = (lazy_percpu_counter_sum_positive(&mm->rss_stat[member])
+							    << PAGE_SHIFT);
 	),
 
 	TP_printk("mm_id=%u curr=%d type=%s size=%ldB",
diff --git a/kernel/fork.c b/kernel/fork.c
index 99076dbe27d8..0a4efb436030 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -823,7 +823,7 @@ static void check_mm(struct mm_struct *mm)
 			 "Please make sure 'struct resident_page_types[]' is updated as well");
 
 	for (i = 0; i < NR_MM_COUNTERS; i++) {
-		long x = percpu_counter_sum(&mm->rss_stat[i]);
+		long x = lazy_percpu_counter_sum(&mm->rss_stat[i]);
 
 		if (unlikely(x))
 			pr_alert("BUG: Bad rss-counter state mm:%p type:%s val:%ld\n",
@@ -910,6 +910,8 @@ static void cleanup_lazy_tlbs(struct mm_struct *mm)
  */
 void __mmdrop(struct mm_struct *mm)
 {
+	int i;
+
 	BUG_ON(mm == &init_mm);
 	WARN_ON_ONCE(mm == current->mm);
 
@@ -924,7 +926,7 @@ void __mmdrop(struct mm_struct *mm)
 	put_user_ns(mm->user_ns);
 	mm_pasid_drop(mm);
 	mm_destroy_cid(mm);
-	percpu_counter_destroy_many(mm->rss_stat, NR_MM_COUNTERS);
+	lazy_percpu_counter_destroy_many(&mm->rss_stat[i], NR_MM_COUNTERS);
 
 	free_mm(mm);
 }
@@ -1301,16 +1303,10 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	if (mm_alloc_cid(mm))
 		goto fail_cid;
 
-	if (percpu_counter_init_many(mm->rss_stat, 0, GFP_KERNEL_ACCOUNT,
-				     NR_MM_COUNTERS))
-		goto fail_pcpu;
-
 	mm->user_ns = get_user_ns(user_ns);
 	lru_gen_init_mm(mm);
 	return mm;
 
-fail_pcpu:
-	mm_destroy_cid(mm);
 fail_cid:
 	destroy_context(mm);
 fail_nocontext:
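
For readers unfamiliar with the idea behind the conversion, here is a minimal
single-threaded userspace sketch of a counter that delays allocating its
per-slot storage. It is not the lazy_percpu_counter implementation from this
series: the struct, the function names, NR_SLOTS, and the UPGRADE_AFTER
trigger are all hypothetical, and real kernel code would need atomics and
true per-CPU memory. It only illustrates why an mm that updates its counters
a handful of times would never pay for the extra allocation at mm_init() time.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_SLOTS	4	/* hypothetical stand-in for the number of CPUs */
#define UPGRADE_AFTER	8	/* hypothetical: shared-mode updates before upgrading */

struct lazy_counter {
	long count;		/* shared value; the only storage before the upgrade */
	unsigned int updates;	/* number of updates applied in shared mode */
	long *slots;		/* per-slot deltas; NULL until lazily allocated */
};

/* Add v on behalf of "slot" (think: the current CPU). */
static void lazy_counter_add(struct lazy_counter *c, int slot, long v)
{
	assert(slot >= 0 && slot < NR_SLOTS);

	if (c->slots) {
		/* Upgraded: touch only this slot's delta. */
		c->slots[slot] += v;
		return;
	}

	/* Not upgraded yet: update the shared value directly. */
	c->count += v;

	/*
	 * Allocate per-slot storage only once the counter proves to be busy.
	 * If calloc() fails, slots stays NULL and the counter simply keeps
	 * working in shared mode.
	 */
	if (++c->updates >= UPGRADE_AFTER)
		c->slots = calloc(NR_SLOTS, sizeof(*c->slots));
}

/* Exact sum: shared value plus every per-slot delta, if any exist. */
static long lazy_counter_sum(const struct lazy_counter *c)
{
	long sum = c->count;

	if (c->slots)
		for (int i = 0; i < NR_SLOTS; i++)
			sum += c->slots[i];
	return sum;
}

/* Release the per-slot storage; safe to call if it was never allocated. */
static void lazy_counter_destroy(struct lazy_counter *c)
{
	free(c->slots);
	c->slots = NULL;
}
```

In this model a zero-initialized struct lazy_counter is ready to use, so
nothing has to be allocated (or unwound on failure) when the owning object is
created, which mirrors the removal of the fail_pcpu path in mm_init() above.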