From patchwork Mon Apr 19 00:00:28 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 12210553
From: Waiman Long
To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
 Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
 Joonsoo Kim, Vlastimil Babka, Roman Gushchin
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 linux-mm@kvack.org, Shakeel Butt, Muchun Song, Alex Shi, Chris Down,
 Yafang Shao, Wei Yang, Masayoshi Mizuma, Xing Zhengjun,
 Matthew Wilcox, Waiman Long
Subject: [PATCH v4 1/5] mm/memcg: Move mod_objcg_state() to memcontrol.c
Date: Sun, 18 Apr 2021 20:00:28 -0400
Message-Id: <20210419000032.5432-2-longman@redhat.com>
In-Reply-To: <20210419000032.5432-1-longman@redhat.com>
References: <20210419000032.5432-1-longman@redhat.com>

The mod_objcg_state() function is moved from mm/slab.h to
mm/memcontrol.c so that further optimization can be done to it in
later patches without exposing unnecessary details to other mm
components.

Signed-off-by: Waiman Long
Acked-by: Johannes Weiner
Reviewed-by: Shakeel Butt
---
 mm/memcontrol.c | 13 +++++++++++++
 mm/slab.h       | 16 ++--------------
 2 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e064ac0d850a..dc9032f28f2e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3150,6 +3150,19 @@ void __memcg_kmem_uncharge_page(struct page *page, int order)
 	css_put(&memcg->css);
 }
 
+void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
+		     enum node_stat_item idx, int nr)
+{
+	struct mem_cgroup *memcg;
+	struct lruvec *lruvec = NULL;
+
+	rcu_read_lock();
+	memcg = obj_cgroup_memcg(objcg);
+	lruvec = mem_cgroup_lruvec(memcg, pgdat);
+	mod_memcg_lruvec_state(lruvec, idx, nr);
+	rcu_read_unlock();
+}
+
 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 {
 	struct memcg_stock_pcp *stock;
diff --git a/mm/slab.h b/mm/slab.h
index 076582f58f68..ae8b85875426 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -239,6 +239,8 @@ static inline bool kmem_cache_debug_flags(struct kmem_cache *s, slab_flags_t fla
 #ifdef CONFIG_MEMCG_KMEM
 int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s,
 				 gfp_t gfp, bool new_page);
+void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
+		     enum node_stat_item idx, int nr);
 
 static inline void memcg_free_page_obj_cgroups(struct page *page)
 {
@@ -283,20 +285,6 @@ static inline bool memcg_slab_pre_alloc_hook(struct kmem_cache *s,
 	return true;
 }
 
-static inline void mod_objcg_state(struct obj_cgroup *objcg,
-				   struct pglist_data *pgdat,
-				   enum node_stat_item idx, int nr)
-{
-	struct mem_cgroup *memcg;
-	struct lruvec *lruvec;
-
-	rcu_read_lock();
-	memcg = obj_cgroup_memcg(objcg);
-	lruvec = mem_cgroup_lruvec(memcg, pgdat);
-	mod_memcg_lruvec_state(lruvec, idx, nr);
-	rcu_read_unlock();
-}
-
 static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s,
 					      struct obj_cgroup *objcg,
 					      gfp_t flags, size_t size,

From patchwork Mon Apr 19 00:00:29 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 12210555
From: Waiman Long
To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
 Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
 Joonsoo Kim, Vlastimil Babka, Roman Gushchin
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 linux-mm@kvack.org, Shakeel Butt, Muchun Song, Alex Shi, Chris Down,
 Yafang Shao, Wei Yang, Masayoshi Mizuma, Xing Zhengjun,
 Matthew Wilcox, Waiman Long
Subject: [PATCH v4 2/5] mm/memcg: Cache vmstat data in percpu memcg_stock_pcp
Date: Sun, 18 Apr 2021 20:00:29 -0400
Message-Id: <20210419000032.5432-3-longman@redhat.com>
In-Reply-To: <20210419000032.5432-1-longman@redhat.com>
References: <20210419000032.5432-1-longman@redhat.com>

Before the new slab memory controller with per-object byte charging,
charging and vmstat data updates happened only when new slab pages
were allocated or freed. Now they are done with every
kmem_cache_alloc() and kmem_cache_free() call. This causes additional
overhead for workloads that generate a lot of alloc and free calls.

The memcg_stock_pcp is used to cache byte charge for a specific
obj_cgroup to reduce that overhead. To reduce it further, this patch
caches the vmstat data in the memcg_stock_pcp structure as well, until
it accumulates a page size worth of updates or other cached data
changes. Caching the vmstat data in the per-cpu stock replaces two
writes to non-hot cachelines, for memcg-specific as well as
memcg-lruvec-specific vmstat data, with a write to a hot local stock
cacheline.

On a 2-socket Cascade Lake server with instrumentation enabled and
this patch applied, about 20% (634400 out of 3243830) of the
mod_objcg_state() calls led to an actual call to __mod_objcg_state()
after initial boot. During a parallel kernel build, the figure was
about 17% (24329265 out of 142512465). So caching the vmstat data
reduces the number of calls to __mod_objcg_state() by more than 80%.

Signed-off-by: Waiman Long
Reviewed-by: Shakeel Butt
---
 mm/memcontrol.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 61 insertions(+), 3 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index dc9032f28f2e..693453f95d99 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2213,7 +2213,10 @@ struct memcg_stock_pcp {
 
 #ifdef CONFIG_MEMCG_KMEM
 	struct obj_cgroup *cached_objcg;
+	struct pglist_data *cached_pgdat;
 	unsigned int nr_bytes;
+	int vmstat_idx;
+	int vmstat_bytes;
 #endif
 
 	struct work_struct work;
@@ -3150,8 +3153,9 @@ void __memcg_kmem_uncharge_page(struct page *page, int order)
 	css_put(&memcg->css);
 }
 
-void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
-		     enum node_stat_item idx, int nr)
+static inline void __mod_objcg_state(struct obj_cgroup *objcg,
+				     struct pglist_data *pgdat,
+				     enum node_stat_item idx, int nr)
 {
 	struct mem_cgroup *memcg;
 	struct lruvec *lruvec = NULL;
@@ -3159,10 +3163,53 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 	rcu_read_lock();
 	memcg = obj_cgroup_memcg(objcg);
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
-	mod_memcg_lruvec_state(lruvec, idx, nr);
+	__mod_memcg_lruvec_state(lruvec, idx, nr);
 	rcu_read_unlock();
 }
 
+void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
+		     enum node_stat_item idx, int nr)
+{
+	struct memcg_stock_pcp *stock;
+	unsigned long flags;
+
+	local_irq_save(flags);
+	stock = this_cpu_ptr(&memcg_stock);
+
+	/*
+	 * Save vmstat data in stock and skip vmstat array update unless
+	 * accumulating over a page of vmstat data or when pgdat or idx
+	 * changes.
+	 */
+	if (stock->cached_objcg != objcg) {
+		/* Output the current data as is */
+	} else if (!stock->vmstat_bytes) {
+		/* Save the current data */
+		stock->vmstat_bytes = nr;
+		stock->vmstat_idx = idx;
+		stock->cached_pgdat = pgdat;
+		nr = 0;
+	} else if ((stock->cached_pgdat != pgdat) ||
+		   (stock->vmstat_idx != idx)) {
+		/* Output the cached data & save the current data */
+		swap(nr, stock->vmstat_bytes);
+		swap(idx, stock->vmstat_idx);
+		swap(pgdat, stock->cached_pgdat);
+	} else {
+		stock->vmstat_bytes += nr;
+		if (abs(stock->vmstat_bytes) > PAGE_SIZE) {
+			nr = stock->vmstat_bytes;
+			stock->vmstat_bytes = 0;
+		} else {
+			nr = 0;
+		}
+	}
+	if (nr)
+		__mod_objcg_state(objcg, pgdat, idx, nr);
+
+	local_irq_restore(flags);
+}
+
 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 {
 	struct memcg_stock_pcp *stock;
@@ -3213,6 +3260,17 @@ static void drain_obj_stock(struct memcg_stock_pcp *stock)
 		stock->nr_bytes = 0;
 	}
 
+	/*
+	 * Flush the vmstat data in current stock
+	 */
+	if (stock->vmstat_bytes) {
+		__mod_objcg_state(old, stock->cached_pgdat, stock->vmstat_idx,
+				  stock->vmstat_bytes);
+		stock->cached_pgdat = NULL;
+		stock->vmstat_bytes = 0;
+		stock->vmstat_idx = 0;
+	}
+
 	obj_cgroup_put(old);
 	stock->cached_objcg = NULL;
 }

From patchwork Mon Apr 19 00:00:30 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 12210557
From: Waiman Long
To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
 Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
 Joonsoo Kim, Vlastimil Babka, Roman Gushchin
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 linux-mm@kvack.org, Shakeel Butt, Muchun Song, Alex Shi, Chris Down,
 Yafang Shao, Wei Yang, Masayoshi Mizuma, Xing Zhengjun,
 Matthew Wilcox, Waiman Long
Subject: [PATCH v4 3/5] mm/memcg: Optimize user context object stock access
Date: Sun, 18 Apr 2021 20:00:30 -0400
Message-Id: <20210419000032.5432-4-longman@redhat.com>
In-Reply-To: <20210419000032.5432-1-longman@redhat.com>
References: <20210419000032.5432-1-longman@redhat.com>

Most kmem_cache_alloc() calls are from user context. With
instrumentation enabled, the measured number of kmem_cache_alloc()
calls from non-task context was about 0.01% of the total.

The irq disable/enable sequence used in this case to access content
from the object stock is slow. To optimize for user context access,
there are now two sets of object stocks (in the new obj_stock
structure) for task context and interrupt context access respectively.

The task context object stock can be accessed after disabling
preemption, which is cheap in a non-preempt kernel. The interrupt
context object stock can only be accessed after disabling interrupts.
User context code can access the interrupt object stock, but not vice
versa.

The downside of this change is that more data is stored in local
object stocks and not reflected in the charge counter and the vmstat
arrays. However, this is a small price to pay for better performance.
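
As a rough user-space illustration of the access rule described above
(not kernel code: the stock_model and cpu_stock_model types and both
helper names are hypothetical stand-ins, and the in_task flag replaces
the kernel's in_task() check), stock selection and draining could be
modeled like this:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct obj_stock. */
struct stock_model {
	unsigned int nr_bytes;
};

/* Hypothetical stand-in for the per-CPU memcg_stock_pcp. */
struct cpu_stock_model {
	struct stock_model task_obj;	/* guarded by disabling preemption */
	struct stock_model irq_obj;	/* guarded by disabling interrupts */
};

/* Pick a stock by execution context, in the spirit of get_obj_stock(). */
static struct stock_model *pick_stock(struct cpu_stock_model *cpu,
				      bool in_task)
{
	return in_task ? &cpu->task_obj : &cpu->irq_obj;
}

/*
 * Drain in the spirit of drain_local_stock(): irq_obj is always drained,
 * task_obj only when the caller runs in task context. Returns the number
 * of bytes flushed.
 */
static unsigned int drain_model(struct cpu_stock_model *cpu, bool in_task)
{
	unsigned int flushed = cpu->irq_obj.nr_bytes;

	cpu->irq_obj.nr_bytes = 0;
	if (in_task) {
		flushed += cpu->task_obj.nr_bytes;
		cpu->task_obj.nr_bytes = 0;
	}
	return flushed;
}
```

The model only captures which stock each context may touch; the actual
preemption and interrupt guards are in the patch itself.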

Signed-off-by: Waiman Long
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
---
 mm/memcontrol.c | 94 +++++++++++++++++++++++++++++++++++--------------
 1 file changed, 68 insertions(+), 26 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 693453f95d99..c13502eab282 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2207,17 +2207,23 @@ void unlock_page_memcg(struct page *page)
 }
 EXPORT_SYMBOL(unlock_page_memcg);
 
-struct memcg_stock_pcp {
-	struct mem_cgroup *cached; /* this never be root cgroup */
-	unsigned int nr_pages;
-
+struct obj_stock {
 #ifdef CONFIG_MEMCG_KMEM
 	struct obj_cgroup *cached_objcg;
 	struct pglist_data *cached_pgdat;
 	unsigned int nr_bytes;
 	int vmstat_idx;
 	int vmstat_bytes;
+#else
+	int dummy[0];
 #endif
+};
+
+struct memcg_stock_pcp {
+	struct mem_cgroup *cached; /* this never be root cgroup */
+	unsigned int nr_pages;
+	struct obj_stock task_obj;
+	struct obj_stock irq_obj;
 
 	struct work_struct work;
 	unsigned long flags;
@@ -2227,12 +2233,12 @@ static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock);
 static DEFINE_MUTEX(percpu_charge_mutex);
 
 #ifdef CONFIG_MEMCG_KMEM
-static void drain_obj_stock(struct memcg_stock_pcp *stock);
+static void drain_obj_stock(struct obj_stock *stock);
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 				     struct mem_cgroup *root_memcg);
 
 #else
-static inline void drain_obj_stock(struct memcg_stock_pcp *stock)
+static inline void drain_obj_stock(struct obj_stock *stock)
 {
 }
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
@@ -2242,6 +2248,40 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 }
 #endif
 
+/*
+ * Most kmem_cache_alloc() calls are from user context. The irq disable/enable
+ * sequence used in this case to access content from object stock is slow.
+ * To optimize for user context access, there are now two object stocks for
+ * task context and interrupt context access respectively.
+ *
+ * The task context object stock can be accessed by disabling preemption only
+ * which is cheap in non-preempt kernel. The interrupt context object stock
+ * can only be accessed after disabling interrupt. User context code can
+ * access interrupt object stock, but not vice versa.
+ */
+static inline struct obj_stock *get_obj_stock(unsigned long *pflags)
+{
+	struct memcg_stock_pcp *stock;
+
+	if (likely(in_task())) {
+		preempt_disable();
+		stock = this_cpu_ptr(&memcg_stock);
+		return &stock->task_obj;
+	} else {
+		local_irq_save(*pflags);
+		stock = this_cpu_ptr(&memcg_stock);
+		return &stock->irq_obj;
+	}
+}
+
+static inline void put_obj_stock(unsigned long flags)
+{
+	if (likely(in_task()))
+		preempt_enable();
+	else
+		local_irq_restore(flags);
+}
+
 /**
  * consume_stock: Try to consume stocked charge on this cpu.
  * @memcg: memcg to consume from.
@@ -2308,7 +2348,9 @@ static void drain_local_stock(struct work_struct *dummy)
 	local_irq_save(flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
-	drain_obj_stock(stock);
+	drain_obj_stock(&stock->irq_obj);
+	if (in_task())
+		drain_obj_stock(&stock->task_obj);
 	drain_stock(stock);
 	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
 
@@ -3153,6 +3195,10 @@ void __memcg_kmem_uncharge_page(struct page *page, int order)
 	css_put(&memcg->css);
 }
 
+/*
+ * __mod_objcg_state() may be called with irq enabled, so
+ * mod_memcg_lruvec_state() should be used.
+ */
 static inline void __mod_objcg_state(struct obj_cgroup *objcg,
 				     struct pglist_data *pgdat,
 				     enum node_stat_item idx, int nr)
@@ -3163,18 +3209,15 @@ static inline void __mod_objcg_state(struct obj_cgroup *objcg,
 	rcu_read_lock();
 	memcg = obj_cgroup_memcg(objcg);
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
-	__mod_memcg_lruvec_state(lruvec, idx, nr);
+	mod_memcg_lruvec_state(lruvec, idx, nr);
 	rcu_read_unlock();
 }
 
 void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 		     enum node_stat_item idx, int nr)
 {
-	struct memcg_stock_pcp *stock;
 	unsigned long flags;
-
-	local_irq_save(flags);
-	stock = this_cpu_ptr(&memcg_stock);
+	struct obj_stock *stock = get_obj_stock(&flags);
 
 	/*
 	 * Save vmstat data in stock and skip vmstat array update unless
@@ -3207,29 +3250,26 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 	if (nr)
 		__mod_objcg_state(objcg, pgdat, idx, nr);
 
-	local_irq_restore(flags);
+	put_obj_stock(flags);
 }
 
 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 {
-	struct memcg_stock_pcp *stock;
 	unsigned long flags;
+	struct obj_stock *stock = get_obj_stock(&flags);
 	bool ret = false;
 
-	local_irq_save(flags);
-
-	stock = this_cpu_ptr(&memcg_stock);
 	if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
 		stock->nr_bytes -= nr_bytes;
 		ret = true;
 	}
 
-	local_irq_restore(flags);
+	put_obj_stock(flags);
 
 	return ret;
 }
 
-static void drain_obj_stock(struct memcg_stock_pcp *stock)
+static void drain_obj_stock(struct obj_stock *stock)
 {
 	struct obj_cgroup *old = stock->cached_objcg;
 
@@ -3280,8 +3320,13 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 {
 	struct mem_cgroup *memcg;
 
-	if (stock->cached_objcg) {
-		memcg = obj_cgroup_memcg(stock->cached_objcg);
+	if (in_task() && stock->task_obj.cached_objcg) {
+		memcg = obj_cgroup_memcg(stock->task_obj.cached_objcg);
+		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
+			return true;
+	}
+	if (stock->irq_obj.cached_objcg) {
+		memcg = obj_cgroup_memcg(stock->irq_obj.cached_objcg);
 		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
 			return true;
 	}
@@ -3291,12 +3336,9 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 
 static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 {
-	struct memcg_stock_pcp *stock;
 	unsigned long flags;
+	struct obj_stock *stock = get_obj_stock(&flags);
 
-	local_irq_save(flags);
-
-	stock = this_cpu_ptr(&memcg_stock);
 	if (stock->cached_objcg != objcg) { /* reset if necessary */
 		drain_obj_stock(stock);
 		obj_cgroup_get(objcg);
@@ -3308,7 +3350,7 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 	if (stock->nr_bytes > PAGE_SIZE)
 		drain_obj_stock(stock);
 
-	local_irq_restore(flags);
+	put_obj_stock(flags);
 }
 
 int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size)

From patchwork Mon Apr 19 00:00:31 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 12210559
From: Waiman Long
To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
 Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
 Joonsoo Kim, Vlastimil Babka, Roman Gushchin
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 linux-mm@kvack.org, Shakeel Butt, Muchun Song, Alex Shi, Chris Down,
 Yafang Shao, Wei Yang, Masayoshi Mizuma, Xing Zhengjun,
 Matthew Wilcox, Waiman Long
Subject: [PATCH v4 4/5] mm/memcg: Save both reclaimable & unreclaimable bytes in object stock
Date: Sun, 18 Apr 2021 20:00:31 -0400
Message-Id: <20210419000032.5432-5-longman@redhat.com>
In-Reply-To: <20210419000032.5432-1-longman@redhat.com>
References: <20210419000032.5432-1-longman@redhat.com>

Currently, the object stock structure caches either reclaimable vmstat
bytes or unreclaimable vmstat bytes, but not both. The hit rate can be
improved if both types of vmstat data can be cached, especially on a
single-node system.

This patch supports the caching of both types of vmstat data, though
at the expense of slightly increased complexity in the caching code.
For large objects (>= PAGE_SIZE), the vmstat array is updated directly
without going through the stock caching step.
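
The caching rule described above can be sketched in user space as
follows. This is a simplified model under stated assumptions, not the
kernel implementation: it ignores the objcg and pgdat checks, assumes
4 KiB pages via the hypothetical MODEL_PAGE_SIZE, and the type and
function names (vmstat_cache_model, cache_update) are made up for
illustration.

```c
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096	/* assumption: 4 KiB pages */

/* Hypothetical indices standing in for the two slab vmstat items. */
enum stat_idx { RECLAIMABLE, UNRECLAIMABLE };

struct vmstat_cache_model {
	int bytes[2];	/* one pending byte counter per stat index */
};

/*
 * Returns the number of bytes that would have to be written to the
 * shared vmstat counter right now, or 0 if the cache absorbed the update.
 */
static int cache_update(struct vmstat_cache_model *c, enum stat_idx idx,
			int nr)
{
	int out;

	if (abs(nr) >= MODEL_PAGE_SIZE)
		return nr;	/* large object: bypass the cache */

	c->bytes[idx] += nr;
	if (abs(c->bytes[idx]) <= MODEL_PAGE_SIZE)
		return 0;	/* not over a page yet: keep caching */

	out = c->bytes[idx];
	c->bytes[idx] = 0;
	return out;
}
```

Keeping one counter per stat index is what lets interleaved reclaimable
and unreclaimable updates accumulate side by side instead of evicting
each other.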

On a 2-socket Cascade Lake server with instrumentation enabled, the
miss rates are shown in the table below.

  Initial bootup:

  Kernel          __mod_objcg_state    mod_objcg_state    %age
  ------          -----------------    ---------------    ----
  Before patch           634400            3243830        19.6%
  After patch            419810            3182424        13.2%

  Parallel kernel build:

  Kernel          __mod_objcg_state    mod_objcg_state    %age
  ------          -----------------    ---------------    ----
  Before patch         24329265          142512465        17.1%
  After patch          24051721          142445825        16.9%

There was a decrease of miss rate after initial system bootup. However,
the miss rate for parallel kernel build remained about the same probably
because most of the touched kmemcache objects were reclaimable inodes
and dentries.

Signed-off-by: Waiman Long
---
 mm/memcontrol.c | 79 +++++++++++++++++++++++++++++++------------------
 1 file changed, 51 insertions(+), 28 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c13502eab282..a6dd18f6d8a8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2212,8 +2212,8 @@ struct obj_stock {
 	struct obj_cgroup *cached_objcg;
 	struct pglist_data *cached_pgdat;
 	unsigned int nr_bytes;
-	int vmstat_idx;
-	int vmstat_bytes;
+	int reclaimable_bytes;		/* NR_SLAB_RECLAIMABLE_B */
+	int unreclaimable_bytes;	/* NR_SLAB_UNRECLAIMABLE_B */
 #else
 	int dummy[0];
 #endif
@@ -3217,40 +3217,56 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 		     enum node_stat_item idx, int nr)
 {
 	unsigned long flags;
-	struct obj_stock *stock = get_obj_stock(&flags);
+	struct obj_stock *stock;
+	int *bytes, *alt_bytes, alt_idx;
+
+	/*
+	 * Directly update vmstat array for big object.
+	 */
+	if (unlikely(abs(nr) >= PAGE_SIZE))
+		goto update_vmstat;
+
+	stock = get_obj_stock(&flags);
+	if (idx == NR_SLAB_RECLAIMABLE_B) {
+		bytes = &stock->reclaimable_bytes;
+		alt_bytes = &stock->unreclaimable_bytes;
+		alt_idx = NR_SLAB_UNRECLAIMABLE_B;
+	} else {
+		bytes = &stock->unreclaimable_bytes;
+		alt_bytes = &stock->reclaimable_bytes;
+		alt_idx = NR_SLAB_RECLAIMABLE_B;
+	}
 
 	/*
-	 * Save vmstat data in stock and skip vmstat array update unless
-	 * accumulating over a page of vmstat data or when pgdat or idx
+	 * Try to save vmstat data in stock and skip vmstat array update
+	 * unless accumulating over a page of vmstat data or when pgdat
 	 * changes.
 	 */
 	if (stock->cached_objcg != objcg) {
 		/* Output the current data as is */
-	} else if (!stock->vmstat_bytes) {
-		/* Save the current data */
-		stock->vmstat_bytes = nr;
-		stock->vmstat_idx = idx;
-		stock->cached_pgdat = pgdat;
-		nr = 0;
-	} else if ((stock->cached_pgdat != pgdat) ||
-		   (stock->vmstat_idx != idx)) {
-		/* Output the cached data & save the current data */
-		swap(nr, stock->vmstat_bytes);
-		swap(idx, stock->vmstat_idx);
+	} else if (stock->cached_pgdat != pgdat) {
+		/* Save the current data and output cached data, if any */
+		swap(nr, *bytes);
 		swap(pgdat, stock->cached_pgdat);
+		if (*alt_bytes) {
+			__mod_objcg_state(objcg, pgdat, alt_idx,
+					  *alt_bytes);
+			*alt_bytes = 0;
+		}
 	} else {
-		stock->vmstat_bytes += nr;
-		if (abs(stock->vmstat_bytes) > PAGE_SIZE) {
-			nr = stock->vmstat_bytes;
-			stock->vmstat_bytes = 0;
+		*bytes += nr;
+		if (abs(*bytes) > PAGE_SIZE) {
+			nr = *bytes;
+			*bytes = 0;
 		} else {
 			nr = 0;
 		}
 	}
-	if (nr)
-		__mod_objcg_state(objcg, pgdat, idx, nr);
-
 	put_obj_stock(flags);
+	if (!nr)
+		return;
+update_vmstat:
+	__mod_objcg_state(objcg, pgdat, idx, nr);
 }
 
 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
@@ -3303,12 +3319,19 @@ static void drain_obj_stock(struct obj_stock *stock)
 	/*
 	 * Flush the vmstat data in current stock
 	 */
-	if (stock->vmstat_bytes) {
-		__mod_objcg_state(old, stock->cached_pgdat, stock->vmstat_idx,
-				  stock->vmstat_bytes);
+	if (stock->reclaimable_bytes || stock->unreclaimable_bytes) {
+		int bytes;
+
+		if ((bytes = stock->reclaimable_bytes))
+			__mod_objcg_state(old, stock->cached_pgdat,
+					  NR_SLAB_RECLAIMABLE_B, bytes);
+		if ((bytes = stock->unreclaimable_bytes))
+			__mod_objcg_state(old, stock->cached_pgdat,
+					  NR_SLAB_UNRECLAIMABLE_B, bytes);
+
 		stock->cached_pgdat = NULL;
-		stock->vmstat_bytes = 0;
-		stock->vmstat_idx = 0;
+		stock->reclaimable_bytes = 0;
+		stock->unreclaimable_bytes = 0;
 	}
 
 	obj_cgroup_put(old);

From patchwork Mon Apr 19 00:00:32 2021
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 12210561
From: Waiman Long
To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
 Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
 Vlastimil Babka, Roman Gushchin
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org,
 Shakeel Butt, Muchun Song, Alex Shi, Chris Down, Yafang Shao, Wei Yang,
 Masayoshi Mizuma, Xing Zhengjun, Matthew Wilcox, Waiman Long
Subject: [PATCH v4 5/5] mm/memcg: Improve refill_obj_stock() performance
Date: Sun, 18 Apr 2021 20:00:32 -0400
Message-Id: <20210419000032.5432-6-longman@redhat.com>
In-Reply-To: <20210419000032.5432-1-longman@redhat.com>
References: <20210419000032.5432-1-longman@redhat.com>

There are two issues with the current refill_obj_stock() code. First of
all, when nr_bytes exceeds PAGE_SIZE, it calls drain_obj_stock() to
atomically flush out the remaining bytes to obj_cgroup, clear
cached_objcg and do an obj_cgroup_put(). It is likely that the same
obj_cgroup will be used again, which leads to another call to
drain_obj_stock() and obj_cgroup_get() as well as an atomic retrieval of
the available bytes from obj_cgroup. That is costly. Instead, we should
just uncharge the excess pages, reduce the stock bytes and be done with
it. The drain_obj_stock() function should only be called when obj_cgroup
changes.
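
The proposed handling of excess bytes can be sketched in userspace as
follows. This is a minimal model under assumed names: sketch_split_stock(),
SKETCH_PAGE_SHIFT and SKETCH_PAGE_SIZE are hypothetical, and a 4 KiB page is
assumed. It only shows the arithmetic of splitting the stock into whole
pages to uncharge and a sub-page remainder that stays cached.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12				/* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

/*
 * Return the number of whole pages that could be uncharged and trim
 * *nr_bytes down to the sub-page remainder. Nothing changes while the
 * stock holds at most one page worth of bytes.
 */
static unsigned int sketch_split_stock(unsigned int *nr_bytes)
{
	unsigned int nr_pages = 0;

	assert(nr_bytes != 0);
	if (*nr_bytes > SKETCH_PAGE_SIZE) {
		nr_pages = *nr_bytes >> SKETCH_PAGE_SHIFT;
		*nr_bytes &= SKETCH_PAGE_SIZE - 1;
	}
	return nr_pages;
}
```

The cached object cgroup is untouched by this step, so the next refill for
the same obj_cgroup needs no drain and no extra reference counting.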

Secondly, when charging an object of size not less than a page in
obj_cgroup_charge(), it is possible that the remaining bytes to be
refilled to the stock will overflow a page and cause refill_obj_stock()
to uncharge 1 page. To avoid the additional uncharge in this case, a new
overfill flag is added to refill_obj_stock() which will be set when
called from obj_cgroup_charge().

Signed-off-by: Waiman Long
---
 mm/memcontrol.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a6dd18f6d8a8..d13961352eef 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3357,23 +3357,34 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 	return false;
 }
 
-static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
+static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
+			     bool overfill)
 {
 	unsigned long flags;
 	struct obj_stock *stock = get_obj_stock(&flags);
+	unsigned int nr_pages = 0;
 
 	if (stock->cached_objcg != objcg) { /* reset if necessary */
-		drain_obj_stock(stock);
+		if (stock->cached_objcg)
+			drain_obj_stock(stock);
 		obj_cgroup_get(objcg);
 		stock->cached_objcg = objcg;
 		stock->nr_bytes = atomic_xchg(&objcg->nr_charged_bytes, 0);
 	}
 	stock->nr_bytes += nr_bytes;
 
-	if (stock->nr_bytes > PAGE_SIZE)
-		drain_obj_stock(stock);
+	if (!overfill && (stock->nr_bytes > PAGE_SIZE)) {
+		nr_pages = stock->nr_bytes >> PAGE_SHIFT;
+		stock->nr_bytes &= (PAGE_SIZE - 1);
+	}
 
 	put_obj_stock(flags);
+
+	if (nr_pages) {
+		rcu_read_lock();
+		__memcg_kmem_uncharge(obj_cgroup_memcg(objcg), nr_pages);
+		rcu_read_unlock();
+	}
 }
 
 int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size)
@@ -3410,7 +3421,7 @@ int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size)
 
 	ret = __memcg_kmem_charge(memcg, gfp, nr_pages);
 	if (!ret && nr_bytes)
-		refill_obj_stock(objcg, PAGE_SIZE - nr_bytes);
+		refill_obj_stock(objcg, PAGE_SIZE - nr_bytes, true);
 
 	css_put(&memcg->css);
 	return ret;
@@ -3418,7 +3429,7 @@ int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size)
 
 void obj_cgroup_uncharge(struct obj_cgroup *objcg, size_t size)
 {
-	refill_obj_stock(objcg, size);
+	refill_obj_stock(objcg, size, false);
 }
 
 #endif /* CONFIG_MEMCG_KMEM */
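
To illustrate the effect of the overfill flag, here is a small userspace
model of the two callers. It is a sketch under stated assumptions:
demo_stock, demo_refill() and demo_charge() are hypothetical stand-ins, the
page size is assumed to be 4 KiB, the round-up of the charge to whole pages
is an assumption about the part of obj_cgroup_charge() that the diff does
not show, and locking, obj_cgroup switching and the real memcg charge calls
are omitted.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_PAGE_SHIFT 12			/* assumed 4 KiB pages */
#define DEMO_PAGE_SIZE  (1u << DEMO_PAGE_SHIFT)

struct demo_stock {
	unsigned int nr_bytes;		/* pre-charged bytes held in the stock */
	unsigned int uncharged_pages;	/* pages handed back so far */
};

/* Model of the refill step: only trim whole pages when !overfill. */
static void demo_refill(struct demo_stock *stock, unsigned int nr_bytes,
			bool overfill)
{
	assert(stock != 0);
	stock->nr_bytes += nr_bytes;
	if (!overfill && stock->nr_bytes > DEMO_PAGE_SIZE) {
		stock->uncharged_pages += stock->nr_bytes >> DEMO_PAGE_SHIFT;
		stock->nr_bytes &= DEMO_PAGE_SIZE - 1;
	}
}

/*
 * Model of the charge path: round the request up to whole pages
 * (assumed behavior) and put the unused tail of the last page into
 * the stock with overfill set, so no page is uncharged right away.
 */
static unsigned int demo_charge(struct demo_stock *stock, unsigned int size)
{
	unsigned int nr_pages = size >> DEMO_PAGE_SHIFT;
	unsigned int nr_bytes = size & (DEMO_PAGE_SIZE - 1);

	if (nr_bytes) {
		nr_pages += 1;
		demo_refill(stock, DEMO_PAGE_SIZE - nr_bytes, true);
	}
	return nr_pages;
}
```

In this model a charge may leave more than a page of bytes in the stock,
and the excess is only trimmed by a later refill from the uncharge path,
which passes overfill as false.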