From patchwork Tue Mar 9 15:25:09 2021
X-Patchwork-Submitter: Xunlei Pang
X-Patchwork-Id: 12125817
From: Xunlei Pang
To: Christoph Lameter, Pekka Enberg, Vlastimil Babka, Roman Gushchin,
    Konstantin Khlebnikov, David Rientjes, Matthew Wilcox, Shu Ming,
    Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Wen Yang,
    James Wang, Xunlei Pang
Subject: [PATCH v3 1/4] mm/slub: Introduce two counters for partial objects
Date: Tue, 9 Mar 2021 23:25:09 +0800
Message-Id: <1615303512-35058-2-git-send-email-xlpang@linux.alibaba.com>
In-Reply-To: <1615303512-35058-1-git-send-email-xlpang@linux.alibaba.com>
References: <1615303512-35058-1-git-send-email-xlpang@linux.alibaba.com>
count_partial() can hold the node list_lock for a long time when the
partial page lists are large, and that in turn causes a thundering-herd
effect on list_lock contention. Our HSF RT (High-speed Service Framework
Response-Time) monitors showed response-time figures fluctuating at
random, so we deployed a tool that detects "irq off" and "preempt off"
periods and dumps the culprit's call trace; it caught "ss" holding
list_lock with irqs off for nearly 100ms, which also caused network
timeouts.

This patch introduces two counters that maintain the actual number of
partial objects dynamically, instead of iterating the partial page lists
with list_lock held. The new kmem_cache_node counters are
partial_free_objs and partial_total_objs. They are updated mainly under
list_lock in the slow paths, so the performance impact should be minimal,
except for the __slab_free() path, which is addressed later in this
series.

Tested-by: James Wang
Reviewed-by: Pekka Enberg
Signed-off-by: Xunlei Pang
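
To make the bookkeeping idea concrete, here is a minimal, self-contained
userspace sketch of the same pattern (illustrative only: node_stub,
page_stub and the helpers are invented for this example, and it updates
both counters in one place, whereas the patch hooks the updates into the
existing add/remove/free paths). The point is that a reader obtains the
free-object count in O(1) instead of walking the partial list with
list_lock held.

/* Toy model of "maintain counters next to the partial list". */
#include <stdio.h>

struct page_stub {
	struct page_stub *next;
	unsigned int objects;		/* object capacity of the slab page */
	unsigned int inuse;		/* objects currently allocated */
};

struct node_stub {
	struct page_stub *partial;	/* singly linked partial list */
	unsigned long nr_partial;
	long partial_free_objs;		/* maintained, never recomputed */
	unsigned long partial_total_objs;
};

static void add_partial_stub(struct node_stub *n, struct page_stub *page)
{
	page->next = n->partial;
	n->partial = page;
	n->nr_partial++;
	n->partial_total_objs += page->objects;
	n->partial_free_objs += page->objects - page->inuse;
}

/* O(1): what the maintained counter buys us. */
static long partial_free_fast(const struct node_stub *n)
{
	return n->partial_free_objs;
}

/* O(nr_partial): what count_partial() has to do today, under list_lock. */
static long partial_free_slow(const struct node_stub *n)
{
	long free = 0;

	for (const struct page_stub *p = n->partial; p; p = p->next)
		free += p->objects - p->inuse;
	return free;
}

int main(void)
{
	struct node_stub n = { 0 };
	struct page_stub a = { .objects = 32, .inuse = 10 };
	struct page_stub b = { .objects = 32, .inuse = 31 };

	add_partial_stub(&n, &a);
	add_partial_stub(&n, &b);
	printf("fast=%ld slow=%ld\n", partial_free_fast(&n),
	       partial_free_slow(&n));
	return 0;
}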
---
 mm/slab.h |  4 ++++
 mm/slub.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/mm/slab.h b/mm/slab.h
index 076582f..817bfa0 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -547,6 +547,10 @@ struct kmem_cache_node {
 #ifdef CONFIG_SLUB
 	unsigned long nr_partial;
 	struct list_head partial;
+#if defined(CONFIG_SLUB_DEBUG) || defined(CONFIG_SYSFS)
+	atomic_long_t partial_free_objs;
+	unsigned long partial_total_objs;
+#endif
 #ifdef CONFIG_SLUB_DEBUG
 	atomic_long_t nr_slabs;
 	atomic_long_t total_objects;
diff --git a/mm/slub.c b/mm/slub.c
index e26c274..4d02831 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1890,10 +1890,31 @@ static void discard_slab(struct kmem_cache *s, struct page *page)
 /*
  * Management of partially allocated slabs.
  */
+#if defined(CONFIG_SLUB_DEBUG) || defined(CONFIG_SYSFS)
+static inline void
+__update_partial_free(struct kmem_cache_node *n, long delta)
+{
+	atomic_long_add(delta, &n->partial_free_objs);
+}
+
+static inline void
+__update_partial_total(struct kmem_cache_node *n, long delta)
+{
+	n->partial_total_objs += delta;
+}
+#else
+static inline void
+__update_partial_free(struct kmem_cache_node *n, long delta) { }
+
+static inline void
+__update_partial_total(struct kmem_cache_node *n, long delta) { }
+#endif
+
 static inline void
 __add_partial(struct kmem_cache_node *n, struct page *page, int tail)
 {
 	n->nr_partial++;
+	__update_partial_total(n, page->objects);
 	if (tail == DEACTIVATE_TO_TAIL)
 		list_add_tail(&page->slab_list, &n->partial);
 	else
@@ -1913,6 +1934,7 @@ static inline void remove_partial(struct kmem_cache_node *n,
 	lockdep_assert_held(&n->list_lock);
 	list_del(&page->slab_list);
 	n->nr_partial--;
+	__update_partial_total(n, -page->objects);
 }
 
 /*
@@ -1957,6 +1979,7 @@ static inline void *acquire_slab(struct kmem_cache *s,
 		return NULL;
 
 	remove_partial(n, page);
+	__update_partial_free(n, -*objects);
 	WARN_ON(!freelist);
 	return freelist;
 }
@@ -2286,8 +2309,11 @@ static void deactivate_slab(struct kmem_cache *s, struct page *page,
 				"unfreezing slab"))
 		goto redo;
 
-	if (lock)
+	if (lock) {
+		if (m == M_PARTIAL)
+			__update_partial_free(n, new.objects - new.inuse);
 		spin_unlock(&n->list_lock);
+	}
 
 	if (m == M_PARTIAL)
 		stat(s, tail);
@@ -2353,6 +2379,7 @@ static void unfreeze_partials(struct kmem_cache *s,
 			discard_page = page;
 		} else {
 			add_partial(n, page, DEACTIVATE_TO_TAIL);
+			__update_partial_free(n, new.objects - new.inuse);
 			stat(s, FREE_ADD_PARTIAL);
 		}
 	}
@@ -3039,6 +3066,13 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 		head, new.counters,
 		"__slab_free"));
 
+	if (!was_frozen && prior) {
+		if (n)
+			__update_partial_free(n, cnt);
+		else
+			__update_partial_free(get_node(s, page_to_nid(page)), cnt);
+	}
+
 	if (likely(!n)) {
 
 		if (likely(was_frozen)) {
@@ -3069,6 +3103,7 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 	if (!kmem_cache_has_cpu_partial(s) && unlikely(!prior)) {
 		remove_full(s, n, page);
 		add_partial(n, page, DEACTIVATE_TO_TAIL);
+		__update_partial_free(n, cnt);
 		stat(s, FREE_ADD_PARTIAL);
 	}
 	spin_unlock_irqrestore(&n->list_lock, flags);
@@ -3080,6 +3115,7 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 		 * Slab on the partial list.
 		 */
 		remove_partial(n, page);
+		__update_partial_free(n, -page->objects);
 		stat(s, FREE_REMOVE_PARTIAL);
 	} else {
 		/* Slab must be on the full list */
@@ -3520,6 +3556,10 @@ static inline int calculate_order(unsigned int size)
 	n->nr_partial = 0;
 	spin_lock_init(&n->list_lock);
 	INIT_LIST_HEAD(&n->partial);
+#if defined(CONFIG_SLUB_DEBUG) || defined(CONFIG_SYSFS)
+	atomic_long_set(&n->partial_free_objs, 0);
+	n->partial_total_objs = 0;
+#endif
 #ifdef CONFIG_SLUB_DEBUG
 	atomic_long_set(&n->nr_slabs, 0);
 	atomic_long_set(&n->total_objects, 0);
@@ -3592,6 +3632,7 @@ static void early_kmem_cache_node_alloc(int node)
 	 * initialized and there is no concurrent access.
 	 */
 	__add_partial(n, page, DEACTIVATE_TO_HEAD);
+	__update_partial_free(n, page->objects - page->inuse);
 }
 
 static void free_kmem_cache_nodes(struct kmem_cache *s)
@@ -3922,6 +3963,7 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
 	list_for_each_entry_safe(page, h, &n->partial, slab_list) {
 		if (!page->inuse) {
 			remove_partial(n, page);
+			__update_partial_free(n, -page->objects);
 			list_add(&page->slab_list, &discard);
 		} else {
 			list_slab_objects(s, page,
@@ -4263,6 +4305,8 @@ int __kmem_cache_shrink(struct kmem_cache *s)
 		if (free == page->objects) {
 			list_move(&page->slab_list, &discard);
 			n->nr_partial--;
+			__update_partial_free(n, -free);
+			__update_partial_total(n, -free);
 		} else if (free <= SHRINK_PROMOTE_MAX)
 			list_move(&page->slab_list, promote + free - 1);
 	}

From patchwork Tue Mar 9 15:25:10 2021
X-Patchwork-Submitter: Xunlei Pang
X-Patchwork-Id: 12125813
From: Xunlei Pang
To: Christoph Lameter, Pekka Enberg, Vlastimil Babka, Roman Gushchin,
    Konstantin Khlebnikov, David Rientjes, Matthew Wilcox, Shu Ming,
    Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Wen Yang,
    James Wang, Xunlei Pang
Subject: [PATCH v3 2/4] mm/slub: Get rid of count_partial()
Date: Tue, 9 Mar 2021 23:25:10 +0800
Message-Id: <1615303512-35058-3-git-send-email-xlpang@linux.alibaba.com>
In-Reply-To: <1615303512-35058-1-git-send-email-xlpang@linux.alibaba.com>
References: <1615303512-35058-1-git-send-email-xlpang@linux.alibaba.com>

Now that the partial counters are in place, use them directly and get rid
of count_partial(), which walks the whole partial list under list_lock.

Tested-by: James Wang
Reviewed-by: Pekka Enberg
Signed-off-by: Xunlei Pang
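
One subtlety in partial_counter() below: PARTIAL_INUSE is derived from two
counters that are not updated or read under one common lock, so the
difference can transiently go negative and is clamped to zero. A toy,
single-threaded walk-through of one such interleaving (made-up numbers,
plain C, not kernel code):

#include <stdio.h>

int main(void)
{
	/* One partial page with 32 objects, 26 of them free. */
	long partial_total = 32, partial_free = 26;

	/* acquire_slab(): remove_partial() drops the total first... */
	partial_total -= 32;	/* __update_partial_total(n, -page->objects) */

	/* ...a reader samples both counters in this window... */
	long inuse = partial_total - partial_free;	/* 0 - 26 = -26 */

	if (inuse < 0)		/* the clamp in partial_counter() */
		inuse = 0;

	/* ...and only afterwards does the free counter catch up. */
	partial_free -= 26;	/* __update_partial_free(n, -*objects) */

	printf("derived inuse = %ld\n", inuse);
	return 0;
}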
---
 mm/slub.c | 54 ++++++++++++++++++++++--------------------------------
 1 file changed, 22 insertions(+), 32 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 4d02831..3f76b57 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2533,11 +2533,6 @@ static inline int node_match(struct page *page, int node)
 }
 
 #ifdef CONFIG_SLUB_DEBUG
-static int count_free(struct page *page)
-{
-	return page->objects - page->inuse;
-}
-
 static inline unsigned long node_nr_objs(struct kmem_cache_node *n)
 {
 	return atomic_long_read(&n->total_objects);
@@ -2545,19 +2540,26 @@ static inline unsigned long node_nr_objs(struct kmem_cache_node *n)
 #endif /* CONFIG_SLUB_DEBUG */
 
 #if defined(CONFIG_SLUB_DEBUG) || defined(CONFIG_SYSFS)
-static unsigned long count_partial(struct kmem_cache_node *n,
-					int (*get_count)(struct page *))
+enum partial_item { PARTIAL_FREE, PARTIAL_INUSE, PARTIAL_TOTAL };
+
+static unsigned long partial_counter(struct kmem_cache_node *n,
+				     enum partial_item item)
 {
-	unsigned long flags;
-	unsigned long x = 0;
-	struct page *page;
+	unsigned long ret = 0;
 
-	spin_lock_irqsave(&n->list_lock, flags);
-	list_for_each_entry(page, &n->partial, slab_list)
-		x += get_count(page);
-	spin_unlock_irqrestore(&n->list_lock, flags);
-	return x;
+	if (item == PARTIAL_FREE) {
+		ret = atomic_long_read(&n->partial_free_objs);
+	} else if (item == PARTIAL_TOTAL) {
+		ret = n->partial_total_objs;
+	} else if (item == PARTIAL_INUSE) {
+		ret = n->partial_total_objs - atomic_long_read(&n->partial_free_objs);
+		if ((long)ret < 0)
+			ret = 0;
+	}
+
+	return ret;
 }
+
 #endif /* CONFIG_SLUB_DEBUG || CONFIG_SYSFS */
 
 static noinline void
@@ -2587,7 +2589,7 @@ static unsigned long count_partial(struct kmem_cache_node *n,
 	unsigned long nr_objs;
 	unsigned long nr_free;
 
-	nr_free = count_partial(n, count_free);
+	nr_free = partial_counter(n, PARTIAL_FREE);
 	nr_slabs = node_nr_slabs(n);
 	nr_objs = node_nr_objs(n);
@@ -4643,18 +4645,6 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t gfpflags,
 EXPORT_SYMBOL(__kmalloc_node_track_caller);
 #endif
 
-#ifdef CONFIG_SYSFS
-static int count_inuse(struct page *page)
-{
-	return page->inuse;
-}
-
-static int count_total(struct page *page)
-{
-	return page->objects;
-}
-#endif
-
 #ifdef CONFIG_SLUB_DEBUG
 static void validate_slab(struct kmem_cache *s, struct page *page)
 {
@@ -5091,7 +5081,7 @@ static ssize_t show_slab_objects(struct kmem_cache *s,
 				x = atomic_long_read(&n->total_objects);
 			else if (flags & SO_OBJECTS)
 				x = atomic_long_read(&n->total_objects) -
-					count_partial(n, count_free);
+					partial_counter(n, PARTIAL_FREE);
 			else
 				x = atomic_long_read(&n->nr_slabs);
 			total += x;
@@ -5105,9 +5095,9 @@ static ssize_t show_slab_objects(struct kmem_cache *s,
 		for_each_kmem_cache_node(s, node, n) {
 			if (flags & SO_TOTAL)
-				x = count_partial(n, count_total);
+				x = partial_counter(n, PARTIAL_TOTAL);
 			else if (flags & SO_OBJECTS)
-				x = count_partial(n, count_inuse);
+				x = partial_counter(n, PARTIAL_INUSE);
 			else
 				x = n->nr_partial;
 			total += x;
@@ -5873,7 +5863,7 @@ void get_slabinfo(struct kmem_cache *s, struct slabinfo *sinfo)
 	for_each_kmem_cache_node(s, node, n) {
 		nr_slabs += node_nr_slabs(n);
 		nr_objs += node_nr_objs(n);
-		nr_free += count_partial(n, count_free);
+		nr_free += partial_counter(n, PARTIAL_FREE);
 	}
 
 	sinfo->active_objs = nr_objs - nr_free;

From patchwork Tue Mar 9 15:25:11 2021
X-Patchwork-Submitter: Xunlei Pang
X-Patchwork-Id: 12125819
From: Xunlei Pang
To: Christoph Lameter, Pekka Enberg, Vlastimil Babka, Roman Gushchin,
    Konstantin Khlebnikov, David Rientjes, Matthew Wilcox, Shu Ming,
    Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Wen Yang,
    James Wang, Xunlei Pang
Subject: [PATCH v3 3/4] percpu: Export per_cpu_sum()
Date: Tue, 9 Mar 2021 23:25:11 +0800
Message-Id: <1615303512-35058-4-git-send-email-xlpang@linux.alibaba.com>
In-Reply-To: <1615303512-35058-1-git-send-email-xlpang@linux.alibaba.com>
References: <1615303512-35058-1-git-send-email-xlpang@linux.alibaba.com>

per_cpu_sum() is generally useful and deserves to be exported: move it
from kernel/locking/percpu-rwsem.c to include/linux/percpu-defs.h so that
other users, such as the following SLUB patch, can call it.

Tested-by: James Wang
Signed-off-by: Xunlei Pang
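
As a point of reference, a kernel-style sketch of how per_cpu_sum() is
typically used once it lives in percpu-defs.h (the 'hits' counter and the
helpers are invented for this example and are not part of the series):
each CPU bumps its own slot cheaply, and only a reader pays for folding
the slots together.

#include <linux/percpu.h>

static DEFINE_PER_CPU(unsigned long, hits);

static void record_hit(void)
{
	/* Local-CPU update: no shared cacheline, no atomic RMW. */
	this_cpu_inc(hits);
}

static unsigned long total_hits(void)
{
	/*
	 * Sums the per-CPU slots; the result is approximate while writers
	 * are running, which is acceptable for statistics.
	 */
	return per_cpu_sum(hits);
}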
---
 include/linux/percpu-defs.h   | 10 ++++++++++
 kernel/locking/percpu-rwsem.c | 10 ----------
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
index dff7040..0e71b68 100644
--- a/include/linux/percpu-defs.h
+++ b/include/linux/percpu-defs.h
@@ -220,6 +220,16 @@
 	(void)__vpp_verify;						\
 } while (0)
 
+#define per_cpu_sum(var)						\
+({									\
+	typeof(var) __sum = 0;						\
+	int cpu;							\
+	compiletime_assert_atomic_type(__sum);				\
+	for_each_possible_cpu(cpu)					\
+		__sum += per_cpu(var, cpu);				\
+	__sum;								\
+})
+
 #ifdef CONFIG_SMP
 
 /*
diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index 70a32a5..0980e51 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -178,16 +178,6 @@ bool __percpu_down_read(struct percpu_rw_semaphore *sem, bool try)
 }
 EXPORT_SYMBOL_GPL(__percpu_down_read);
 
-#define per_cpu_sum(var)						\
-({									\
-	typeof(var) __sum = 0;						\
-	int cpu;							\
-	compiletime_assert_atomic_type(__sum);				\
-	for_each_possible_cpu(cpu)					\
-		__sum += per_cpu(var, cpu);				\
-	__sum;								\
-})
-
 /*
  * Return true if the modular sum of the sem->read_count per-CPU variable is
  * zero.  If this sum is zero, then it is stable due to the fact that if any

From patchwork Tue Mar 9 15:25:12 2021
X-Patchwork-Submitter: Xunlei Pang
X-Patchwork-Id: 12125815
From: Xunlei Pang
To: Christoph Lameter, Pekka Enberg, Vlastimil Babka, Roman Gushchin,
    Konstantin Khlebnikov, David Rientjes, Matthew Wilcox, Shu Ming,
    Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Wen Yang,
    James Wang, Xunlei Pang
Subject: [PATCH v3 4/4] mm/slub: Use percpu partial free counter
Date: Tue, 9 Mar 2021 23:25:12 +0800
Message-Id: <1615303512-35058-5-git-send-email-xlpang@linux.alibaba.com>
In-Reply-To: <1615303512-35058-1-git-send-email-xlpang@linux.alibaba.com>
References: <1615303512-35058-1-git-send-email-xlpang@linux.alibaba.com>
The only concern with introducing the partial counters is that
partial_free_objs may cause cacheline and atomic-operation contention
when concurrent __slab_free() calls hit the same SLUB cache. This patch
turns it into a percpu counter and also rearranges the counter fields in
struct kmem_cache_node to avoid cacheline issues.

Tested-by: James Wang
Reviewed-by: Pekka Enberg
Signed-off-by: Xunlei Pang
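
The percpu-counter pattern the patch switches to, reduced to its
essentials (kernel-style sketch; struct node_stats and the helper names
are invented for illustration, while the percpu calls mirror the ones in
the diff below): writers touch only their own CPU's slot, and only the
infrequent readers pay for the cross-CPU sum.

#include <linux/percpu.h>
#include <linux/errno.h>

struct node_stats {
	unsigned long __percpu *partial_free_objs;
};

static int node_stats_init(struct node_stats *st)
{
	st->partial_free_objs = alloc_percpu(unsigned long);
	return st->partial_free_objs ? 0 : -ENOMEM;
}

static void node_stats_add(struct node_stats *st, long delta)
{
	/* Per-CPU update: no atomic RMW, no cacheline bouncing. */
	this_cpu_add(*st->partial_free_objs, delta);
}

static unsigned long node_stats_read(struct node_stats *st)
{
	/* Slow path only: fold all CPU slots into one value. */
	return per_cpu_sum(*st->partial_free_objs);
}

static void node_stats_destroy(struct node_stats *st)
{
	free_percpu(st->partial_free_objs);
}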
---
 mm/slab.h |  6 ++++--
 mm/slub.c | 30 +++++++++++++++++++++++-------
 2 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index 817bfa0..c819597 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -546,16 +546,18 @@ struct kmem_cache_node {
 #ifdef CONFIG_SLUB
 	unsigned long nr_partial;
-	struct list_head partial;
 #if defined(CONFIG_SLUB_DEBUG) || defined(CONFIG_SYSFS)
-	atomic_long_t partial_free_objs;
 	unsigned long partial_total_objs;
 #endif
+	struct list_head partial;
 #ifdef CONFIG_SLUB_DEBUG
 	atomic_long_t nr_slabs;
 	atomic_long_t total_objects;
 	struct list_head full;
 #endif
+#if defined(CONFIG_SLUB_DEBUG) || defined(CONFIG_SYSFS)
+	unsigned long __percpu *partial_free_objs;
+#endif
 #endif
 };
 
diff --git a/mm/slub.c b/mm/slub.c
index 3f76b57..b6ec065 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1894,7 +1894,7 @@ static void discard_slab(struct kmem_cache *s, struct page *page)
 static inline void
 __update_partial_free(struct kmem_cache_node *n, long delta)
 {
-	atomic_long_add(delta, &n->partial_free_objs);
+	this_cpu_add(*n->partial_free_objs, delta);
 }
 
 static inline void
@@ -2548,11 +2548,16 @@ static unsigned long partial_counter(struct kmem_cache_node *n,
 	unsigned long ret = 0;
 
 	if (item == PARTIAL_FREE) {
-		ret = atomic_long_read(&n->partial_free_objs);
+		ret = per_cpu_sum(*n->partial_free_objs);
+		if ((long)ret < 0)
+			ret = 0;
 	} else if (item == PARTIAL_TOTAL) {
 		ret = n->partial_total_objs;
 	} else if (item == PARTIAL_INUSE) {
-		ret = n->partial_total_objs - atomic_long_read(&n->partial_free_objs);
+		ret = per_cpu_sum(*n->partial_free_objs);
+		if ((long)ret < 0)
+			ret = 0;
+		ret = n->partial_total_objs - ret;
 		if ((long)ret < 0)
 			ret = 0;
 	}
@@ -3552,14 +3557,16 @@ static inline int calculate_order(unsigned int size)
 	return -ENOSYS;
 }
 
-static void
+static int
 init_kmem_cache_node(struct kmem_cache_node *n)
 {
 	n->nr_partial = 0;
 	spin_lock_init(&n->list_lock);
 	INIT_LIST_HEAD(&n->partial);
 #if defined(CONFIG_SLUB_DEBUG) || defined(CONFIG_SYSFS)
-	atomic_long_set(&n->partial_free_objs, 0);
+	n->partial_free_objs = alloc_percpu(unsigned long);
+	if (!n->partial_free_objs)
+		return -ENOMEM;
 	n->partial_total_objs = 0;
 #endif
 #ifdef CONFIG_SLUB_DEBUG
@@ -3567,6 +3574,8 @@ static inline int calculate_order(unsigned int size)
 	atomic_long_set(&n->total_objects, 0);
 	INIT_LIST_HEAD(&n->full);
 #endif
+
+	return 0;
 }
 
 static inline int alloc_kmem_cache_cpus(struct kmem_cache *s)
@@ -3626,7 +3635,7 @@ static void early_kmem_cache_node_alloc(int node)
 	page->inuse = 1;
 	page->frozen = 0;
 	kmem_cache_node->node[node] = n;
-	init_kmem_cache_node(n);
+	BUG_ON(init_kmem_cache_node(n) < 0);
 	inc_slabs_node(kmem_cache_node, node, page->objects);
 
 	/*
@@ -3644,6 +3653,9 @@ static void free_kmem_cache_nodes(struct kmem_cache *s)
 
 	for_each_kmem_cache_node(s, node, n) {
 		s->node[node] = NULL;
+#if defined(CONFIG_SLUB_DEBUG) || defined(CONFIG_SYSFS)
+		free_percpu(n->partial_free_objs);
+#endif
 		kmem_cache_free(kmem_cache_node, n);
 	}
 }
@@ -3674,7 +3686,11 @@ static int init_kmem_cache_nodes(struct kmem_cache *s)
 			return 0;
 		}
 
-		init_kmem_cache_node(n);
+		if (init_kmem_cache_node(n) < 0) {
+			free_kmem_cache_nodes(s);
+			return 0;
+		}
+
 		s->node[node] = n;
 	}
 	return 1;