From: Waiman Long
To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
    Tejun Heo, Christoph Lameter, Pekka Enberg, David Rientjes,
    Joonsoo Kim, Vlastimil Babka, Roman Gushchin
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    linux-mm@kvack.org, Shakeel Butt, Muchun Song, Alex Shi, Chris Down,
    Yafang Shao, Alexander Duyck, Wei Yang, Masayoshi Mizuma, Waiman Long
Subject: [PATCH 0/5] mm/memcg: Reduce kmemcache memory accounting overhead
Date: Fri, 9 Apr 2021 19:18:37 -0400
Message-Id: <20210409231842.8840-1-longman@redhat.com>
X-Patchwork-Id: 12195211

With the recent introduction of the new slab memory controller, we
eliminate the need for separate kmemcaches for each memory cgroup and
reduce overall kernel memory usage. However, we also add memory
accounting overhead to each call of kmem_cache_alloc() and
kmem_cache_free().

Workloads that perform a lot of kmemcache allocations and
de-allocations may experience a performance regression, as illustrated
in [1].

With a simple kernel module that performs a repeated loop of
100,000,000 kmem_cache_alloc() and kmem_cache_free() calls on a 64-byte
object at module init, the execution times to load the kernel module
with and without memory accounting were:

  with accounting = 6.798s
  w/o  accounting = 1.758s

That is an increase of 5.04s (287%). With this patchset applied, the
execution time became 4.254s. So the memory accounting overhead is now
2.496s, which is a reduction of about 50%.

It was found that a major part of the memory accounting overhead is
caused by the local_irq_save()/local_irq_restore() sequences used when
updating the local stock charge bytes and the vmstat array, at least on
x86 systems. There are two such sequences in kmem_cache_alloc() and two
in kmem_cache_free(). This patchset tries to reduce the use of such
sequences as much as possible. In fact, it eliminates them in the
common case. Another part of this patchset is to cache the vmstat data
updates in the local stock as well, which also helps.

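The idea of caching vmstat updates in a local stock can be illustrated
with a small userspace model. This is only a sketch under stated
assumptions, not the code from the patchset: the names stock_sketch,
shared_stat_sketch, stock_mod_state(), stock_flush() and the
FLUSH_THRESHOLD value are all hypothetical, and the "expensive" shared
update stands in for the IRQ-disabled section described above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flush threshold in bytes; the value is an assumption. */
#define FLUSH_THRESHOLD 4096L

/* Stand-in for a shared statistic whose every update is expensive. */
struct shared_stat_sketch {
	long value;	/* published total, in bytes */
	long updates;	/* number of expensive updates performed */
};

/* Stand-in for one CPU's local stock (hypothetical layout). */
struct stock_sketch {
	long cached_delta;	/* bytes accumulated but not yet published */
};

/* Publish the cached delta; this models the only expensive step. */
static void stock_flush(struct stock_sketch *s, struct shared_stat_sketch *g)
{
	assert(s && g);
	if (s->cached_delta == 0)
		return;
	g->value += s->cached_delta;
	g->updates++;
	s->cached_delta = 0;
}

/* Record a change of nr_bytes locally; flush only past the threshold. */
static void stock_mod_state(struct stock_sketch *s,
			    struct shared_stat_sketch *g, long nr_bytes)
{
	assert(s && g);
	s->cached_delta += nr_bytes;
	if (labs(s->cached_delta) >= FLUSH_THRESHOLD)
		stock_flush(s, g);
}
```

In this model, an allocation immediately followed by a free (+64 then
-64 bytes) never reaches the threshold, so the shared counter is not
touched at all. A burst of 64-byte allocations with no frees triggers
one shared update per 64 objects instead of one per object.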
[1] https://lore.kernel.org/linux-mm/20210408193948.vfktg3azh2wrt56t@gabell/T/#u

Waiman Long (5):
  mm/memcg: Pass both memcg and lruvec to mod_memcg_lruvec_state()
  mm/memcg: Introduce obj_cgroup_uncharge_mod_state()
  mm/memcg: Cache vmstat data in percpu memcg_stock_pcp
  mm/memcg: Separate out object stock data into its own struct
  mm/memcg: Optimize user context object stock access

 include/linux/memcontrol.h |  14 ++-
 mm/memcontrol.c            | 198 ++++++++++++++++++++++++++++++++-----
 mm/percpu.c                |   9 +-
 mm/slab.h                  |  32 +++---
 4 files changed, 195 insertions(+), 58 deletions(-)
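
The timing figures quoted in the cover letter can be cross-checked with
a few lines of arithmetic. The following is a minimal userspace sketch;
the struct and function names are hypothetical, and only the inputs
(6.798s, 1.758s, 4.254s and 100,000,000 iterations) come from the cover
letter.

```c
#include <assert.h>

struct overhead_summary {
	double overhead_s;	/* accounted time minus unaccounted time */
	double overhead_pct;	/* overhead relative to the unaccounted time */
	double per_pair_ns;	/* overhead per alloc/free pair */
};

/* Derive the accounting overhead from two measured load times. */
static struct overhead_summary
summarize_overhead(double no_acct_s, double acct_s, double iterations)
{
	struct overhead_summary r;

	assert(no_acct_s > 0.0 && iterations > 0.0);
	r.overhead_s = acct_s - no_acct_s;
	r.overhead_pct = 100.0 * r.overhead_s / no_acct_s;
	r.per_pair_ns = 1e9 * r.overhead_s / iterations;
	return r;
}

/* Percentage of the original overhead removed by a change. */
static double overhead_reduction_pct(double before_s, double after_s)
{
	assert(before_s > 0.0);
	return 100.0 * (before_s - after_s) / before_s;
}
```

Calling summarize_overhead(1.758, 6.798, 1e8) gives an overhead of
5.04s, about 287% of the unaccounted time, or roughly 50.4 ns per
alloc/free pair. With the patched time of 4.254s the overhead drops to
2.496s (about 24.96 ns per pair), and overhead_reduction_pct(5.04,
2.496) comes out at roughly 50%.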