From patchwork Tue Aug 27 23:52:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shakeel Butt
X-Patchwork-Id: 13780260
From: Shakeel Butt
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
 Vlastimil Babka, David Rientjes, Hyeonggon Yoo <42.hyeyoo@gmail.com>,
 Eric Dumazet, "David S. Miller", Jakub Kicinski, Paolo Abeni,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Meta kernel team,
 cgroups@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH v2] memcg: add charging of already allocated slab objects
Date: Tue, 27 Aug 2024 16:52:28 -0700
Message-ID: <20240827235228.1591842-1-shakeel.butt@linux.dev>

At the moment, slab objects are charged to the memcg at allocation
time. However, there are cases where slab objects are allocated at a
time when the right target memcg to charge is not known. One such case
is network sockets for incoming connections, which are allocated in
softirq context.

A couple hundred thousand connections is quite normal on a large loaded
server, and almost all of the sockets underlying those connections get
allocated in softirq context and are thus not charged to any memcg.
However, later, at accept() time, we know the right target memcg to
charge. Let's add a new API to charge already allocated objects, so we
can have better accounting of memory usage.

To measure the performance impact of this change, tcp_crr from the
neper [1] performance suite is used. Basically, it is a network
ping-pong test with a new connection for each ping-pong.

The server and the client are run inside a 3-level cgroup hierarchy
using the following commands:

Server:
 $ tcp_crr -6

Client:
 $ tcp_crr -6 -c -H ${server_ip}

If the client and server run on different machines with a 50 Gbps NIC,
there is no visible impact from the change.

For the same-machine experiment, with v6.11-rc5 as the base:

          base (throughput)     with-patch
tcp_crr   14545 (+- 80)         14463 (+- 56)

It seems like the performance impact is within the noise.

Link: https://github.com/google/neper [1]
Signed-off-by: Shakeel Butt
Reviewed-by: Roman Gushchin
---
v1: https://lore.kernel.org/all/20240826232908.4076417-1-shakeel.butt@linux.dev/
Changes since v1:
- Correctly handle large allocations which bypass slab
- Rearrange code to avoid compilation errors for !CONFIG_MEMCG builds

RFC: https://lore.kernel.org/all/20240824010139.1293051-1-shakeel.butt@linux.dev/
Changes since the RFC:
- Added check for already charged slab objects.
- Added performance results from neper's tcp_crr

 include/linux/slab.h            |  1 +
 mm/slub.c                       | 51 +++++++++++++++++++++++++++++++++
 net/ipv4/inet_connection_sock.c |  5 ++--
 3 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index eb2bf4629157..05cfab107c72 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -547,6 +547,7 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru,
 			    gfp_t gfpflags) __assume_slab_alignment __malloc;
 #define kmem_cache_alloc_lru(...)	alloc_hooks(kmem_cache_alloc_lru_noprof(__VA_ARGS__))
 
+bool kmem_cache_charge(void *objp, gfp_t gfpflags);
 void kmem_cache_free(struct kmem_cache *s, void *objp);
 
 kmem_buckets *kmem_buckets_create(const char *name, slab_flags_t flags,
diff --git a/mm/slub.c b/mm/slub.c
index c9d8a2497fd6..8265ea5f25be 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2185,6 +2185,43 @@ void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
 
 	__memcg_slab_free_hook(s, slab, p, objects, obj_exts);
 }
+
+#define KMALLOC_TYPE (SLAB_KMALLOC | SLAB_CACHE_DMA | \
+		      SLAB_ACCOUNT | SLAB_RECLAIM_ACCOUNT)
+
+static __fastpath_inline
+bool memcg_slab_post_charge(void *p, gfp_t flags)
+{
+	struct slabobj_ext *slab_exts;
+	struct kmem_cache *s;
+	struct folio *folio;
+	struct slab *slab;
+	unsigned long off;
+
+	folio = virt_to_folio(p);
+	if (!folio_test_slab(folio)) {
+		return __memcg_kmem_charge_page(folio_page(folio, 0), flags,
+						folio_order(folio)) == 0;
+	}
+
+	slab = folio_slab(folio);
+	s = slab->slab_cache;
+
+	/* Ignore KMALLOC_NORMAL cache to avoid circular dependency. */
+	if ((s->flags & KMALLOC_TYPE) == SLAB_KMALLOC)
+		return true;
+
+	/* Ignore already charged objects. */
+	slab_exts = slab_obj_exts(slab);
+	if (slab_exts) {
+		off = obj_to_index(s, slab, p);
+		if (unlikely(slab_exts[off].objcg))
+			return true;
+	}
+
+	return __memcg_slab_post_alloc_hook(s, NULL, flags, 1, &p);
+}
+
 #else /* CONFIG_MEMCG */
 static inline bool memcg_slab_post_alloc_hook(struct kmem_cache *s,
 					      struct list_lru *lru,
@@ -2198,6 +2235,11 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
 					void **p, int objects)
 {
 }
+
+static inline bool memcg_slab_post_charge(void *p, gfp_t flags)
+{
+	return true;
+}
 #endif /* CONFIG_MEMCG */
 
 /*
@@ -4062,6 +4104,15 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru,
 }
 EXPORT_SYMBOL(kmem_cache_alloc_lru_noprof);
 
+bool kmem_cache_charge(void *objp, gfp_t gfpflags)
+{
+	if (!memcg_kmem_online())
+		return true;
+
+	return memcg_slab_post_charge(objp, gfpflags);
+}
+EXPORT_SYMBOL(kmem_cache_charge);
+
 /**
  * kmem_cache_alloc_node - Allocate an object on the specified node
  * @s: The cache to allocate from.
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 64d07b842e73..3c13ca8c11fb 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -715,6 +715,7 @@ struct sock *inet_csk_accept(struct sock *sk, struct proto_accept_arg *arg)
 	release_sock(sk);
 	if (newsk && mem_cgroup_sockets_enabled) {
 		int amt = 0;
+		gfp_t gfp = GFP_KERNEL | __GFP_NOFAIL;
 
 		/* atomically get the memory usage, set and charge the
 		 * newsk->sk_memcg.
@@ -731,8 +732,8 @@ struct sock *inet_csk_accept(struct sock *sk, struct proto_accept_arg *arg)
 		}
 
 		if (amt)
-			mem_cgroup_charge_skmem(newsk->sk_memcg, amt,
-						GFP_KERNEL | __GFP_NOFAIL);
+			mem_cgroup_charge_skmem(newsk->sk_memcg, amt, gfp);
+		kmem_cache_charge(newsk, gfp);
 
 		release_sock(newsk);
 	}
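
For readers who want to follow the order of checks in
memcg_slab_post_charge() outside the kernel, a minimal userspace sketch
in C is given here. It is only a model under stated assumptions, not
kernel code: enum model_cache_kind, struct model_cgroup, struct
model_obj and model_post_charge() are hypothetical names, the byte limit
merely stands in for whatever can make the real charge fail, and the
large-allocation (non-slab folio) path is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
enum model_cache_kind {
	MODEL_CACHE_KMALLOC_NORMAL,	/* plain kmalloc cache, never charged here */
	MODEL_CACHE_ACCOUNTED,		/* any cache whose objects may be charged */
};

struct model_cgroup {
	size_t charged;			/* bytes charged so far */
	size_t limit;			/* assumed hard limit, for illustration only */
};

struct model_obj {
	enum model_cache_kind kind;	/* which kind of cache the object came from */
	const struct model_cgroup *owner; /* NULL while the object is uncharged */
	size_t size;			/* object size in bytes */
};

/*
 * Charge an already allocated object to @cg, following the same order of
 * checks as the patch: skip plain kmalloc objects, skip objects that are
 * already charged, and only then attempt the charge.
 * Returns false only when the charge itself fails.
 */
bool model_post_charge(struct model_obj *obj, struct model_cgroup *cg)
{
	assert(obj && cg);

	/* Plain kmalloc caches are ignored and reported as success. */
	if (obj->kind == MODEL_CACHE_KMALLOC_NORMAL)
		return true;

	/* Already charged objects are left alone. */
	if (obj->owner)
		return true;

	/* Assumed failure mode: the charge would exceed the limit. */
	if (cg->charged + obj->size > cg->limit)
		return false;

	cg->charged += obj->size;
	obj->owner = cg;
	return true;
}
```

In this model, calling model_post_charge() twice on the same object
charges it only once, which is the property the "already charged" check
in the patch is there to provide.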