From patchwork Tue Apr 15 02:45:13 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 14051362
From: Muchun Song
To: hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
 shakeel.butt@linux.dev, muchun.song@linux.dev, akpm@linux-foundation.org,
 david@fromorbit.com, zhengqi.arch@bytedance.com, yosry.ahmed@linux.dev,
 nphamcs@gmail.com, chengming.zhou@linux.dev
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 linux-mm@kvack.org, hamzamahfooz@linux.microsoft.com,
 apais@linux.microsoft.com, Muchun Song
Subject: [PATCH RFC 09/28] mm: memcontrol: allocate object cgroup for non-kmem case
Date: Tue, 15 Apr 2025 10:45:13 +0800
Message-Id: <20250415024532.26632-10-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.39.5 (Apple Git-154)
In-Reply-To: <20250415024532.26632-1-songmuchun@bytedance.com>
References: <20250415024532.26632-1-songmuchun@bytedance.com>

Pagecache pages are charged at allocation time and hold a
reference to the original memory cgroup until reclaimed. Depending on
memory pressure, page sharing patterns between different cgroups and
cgroup creation/destruction rates, many dying memory cgroups can be
pinned by pagecache pages, reducing page reclaim efficiency and wasting
memory. Converting LRU folios and most other raw memory cgroup pins to
object cgroup references fixes this long-standing problem.

As a result, the objcg infrastructure is no longer solely applicable to
the kmem case. In this patch, we extend the scope of the objcg
infrastructure beyond the kmem case, enabling LRU folios to reuse it for
folio charging purposes.

It should be noted that LRU folios are not accounted at the root level,
yet their folio->memcg_data still points to root_mem_cgroup. Hence, the
folio->memcg_data of an LRU folio always holds a valid pointer. However,
root_mem_cgroup does not possess an object cgroup. Therefore, we also
allocate an object cgroup for root_mem_cgroup.

Signed-off-by: Muchun Song
---
 mm/memcontrol.c | 50 +++++++++++++++++++++++---------------------------
 1 file changed, 23 insertions(+), 27 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 0fc76d50bc23..a6362d11b46c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -193,10 +193,10 @@ static struct obj_cgroup *obj_cgroup_alloc(void)
 	return objcg;
 }
 
-static void memcg_reparent_objcgs(struct mem_cgroup *memcg,
-				  struct mem_cgroup *parent)
+static void memcg_reparent_objcgs(struct mem_cgroup *memcg)
 {
 	struct obj_cgroup *objcg, *iter;
+	struct mem_cgroup *parent = parent_mem_cgroup(memcg);
 
 	objcg = rcu_replace_pointer(memcg->objcg, NULL, true);
 
@@ -3156,30 +3156,17 @@ unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
 	return val;
 }
 
-static int memcg_online_kmem(struct mem_cgroup *memcg)
+static void memcg_online_kmem(struct mem_cgroup *memcg)
 {
-	struct obj_cgroup *objcg;
-
 	if (mem_cgroup_kmem_disabled())
-		return 0;
+		return;
 
 	if (unlikely(mem_cgroup_is_root(memcg)))
-		return 0;
-
-	objcg = obj_cgroup_alloc();
-	if (!objcg)
-		return -ENOMEM;
-
-	objcg->memcg = memcg;
-	rcu_assign_pointer(memcg->objcg, objcg);
-	obj_cgroup_get(objcg);
-	memcg->orig_objcg = objcg;
+		return;
 
 	static_branch_enable(&memcg_kmem_online_key);
 
 	memcg->kmemcg_id = memcg->id.id;
-
-	return 0;
 }
 
 static void memcg_offline_kmem(struct mem_cgroup *memcg)
@@ -3194,12 +3181,6 @@ static void memcg_offline_kmem(struct mem_cgroup *memcg)
 
 	parent = parent_mem_cgroup(memcg);
 	memcg_reparent_list_lrus(memcg, parent);
-
-	/*
-	 * Objcg's reparenting must be after list_lru's, make sure list_lru
-	 * helpers won't use parent's list_lru until child is drained.
-	 */
-	memcg_reparent_objcgs(memcg, parent);
 }
 
 #ifdef CONFIG_CGROUP_WRITEBACK
@@ -3711,9 +3692,9 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 static int mem_cgroup_css_online(struct cgroup_subsys_state *css)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+	struct obj_cgroup *objcg;
 
-	if (memcg_online_kmem(memcg))
-		goto remove_id;
+	memcg_online_kmem(memcg);
 
 	/*
 	 * A memcg must be visible for expand_shrinker_info()
@@ -3723,6 +3704,15 @@ static int mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	if (alloc_shrinker_info(memcg))
 		goto offline_kmem;
 
+	objcg = obj_cgroup_alloc();
+	if (!objcg)
+		goto free_shrinker;
+
+	objcg->memcg = memcg;
+	rcu_assign_pointer(memcg->objcg, objcg);
+	obj_cgroup_get(objcg);
+	memcg->orig_objcg = objcg;
+
 	if (unlikely(mem_cgroup_is_root(memcg)) && !mem_cgroup_disabled())
 		queue_delayed_work(system_unbound_wq, &stats_flush_dwork,
 				   FLUSH_TIME);
@@ -3745,9 +3735,10 @@ static int mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	xa_store(&mem_cgroup_ids, memcg->id.id, memcg, GFP_KERNEL);
 
 	return 0;
+free_shrinker:
+	free_shrinker_info(memcg);
 offline_kmem:
 	memcg_offline_kmem(memcg);
-remove_id:
 	mem_cgroup_id_remove(memcg);
 	return -ENOMEM;
 }
@@ -3764,6 +3755,11 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
 	zswap_memcg_offline_cleanup(memcg);
 
 	memcg_offline_kmem(memcg);
+	/*
+	 * Objcg's reparenting must be after list_lru's above, make sure list_lru
+	 * helpers won't use parent's list_lru until child is drained.
+	 */
+	memcg_reparent_objcgs(memcg);
 	reparent_shrinker_deferred(memcg);
 	wb_memcg_offline(memcg);
 	lru_gen_offline_memcg(memcg);
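
The indirection the changelog relies on can be illustrated with a small
userspace model. This is only a sketch under simplifying assumptions: the
struct layouts and the helpers memcg_online(), folio_charge(),
folio_memcg_name(), memcg_offline() and reparent_demo() are hypothetical
stand-ins, not the kernel's definitions, and refcounting, RCU and locking
are left out. It shows a folio pinning an object cgroup rather than the
memory cgroup itself, every group (root included) getting an object cgroup
when it comes online, and offlining handing that object cgroup to the
parent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct memcg;

struct objcg {
	struct memcg *memcg;	/* current owner, rewritten on reparenting */
};

struct memcg {
	const char *name;
	struct memcg *parent;	/* NULL for the root group */
	struct objcg *objcg;	/* NULL once the group has gone offline */
};

struct folio {
	struct objcg *objcg;	/* the folio pins the objcg, not the memcg */
};

/* Every group, the root included, gets an objcg when it comes online. */
static int memcg_online(struct memcg *cg, const char *name,
			struct memcg *parent)
{
	struct objcg *objcg = malloc(sizeof(*objcg));

	if (!objcg)
		return -1;

	objcg->memcg = cg;
	cg->name = name;
	cg->parent = parent;
	cg->objcg = objcg;
	return 0;
}

/* Charging records the group's objcg in the folio. */
static void folio_charge(struct folio *folio, struct memcg *cg)
{
	folio->objcg = cg->objcg;
}

/* Resolve the group a folio is charged to through its objcg. */
static const char *folio_memcg_name(const struct folio *folio)
{
	return folio->objcg->memcg->name;
}

/* Offlining hands the objcg to the parent; charged folios follow it. */
static void memcg_offline(struct memcg *cg)
{
	assert(cg->parent);	/* in this model only non-root groups go offline */
	cg->objcg->memcg = cg->parent;
	cg->objcg = NULL;
}

/* Returns 0 if the folio resolves to "child" before and "root" after. */
static int reparent_demo(void)
{
	struct memcg root, child;
	struct folio folio;
	int ok;

	if (memcg_online(&root, "root", NULL))
		return -1;
	if (memcg_online(&child, "child", &root)) {
		free(root.objcg);
		return -1;
	}

	folio_charge(&folio, &child);
	ok = strcmp(folio_memcg_name(&folio), "child") == 0;

	memcg_offline(&child);
	ok = ok && strcmp(folio_memcg_name(&folio), "root") == 0;
	ok = ok && child.objcg == NULL;

	free(folio.objcg);	/* the child's former objcg, now owned by root */
	free(root.objcg);
	return ok ? 0 : -1;
}
```

Calling reparent_demo() from a main() exercises the whole sequence. After
memcg_offline() nothing in the model points at the child group any more,
which is the property that lets a dying memory cgroup be released while
its folios stay charged to an ancestor.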