From patchwork Tue Apr 15 02:45:14 2025
X-Patchwork-Submitter: Muchun Song <songmuchun@bytedance.com>
X-Patchwork-Id: 14051363
From: Muchun Song <songmuchun@bytedance.com>
To: hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
 shakeel.butt@linux.dev, muchun.song@linux.dev, akpm@linux-foundation.org,
 david@fromorbit.com, zhengqi.arch@bytedance.com, yosry.ahmed@linux.dev,
 nphamcs@gmail.com, chengming.zhou@linux.dev
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org,
 hamzamahfooz@linux.microsoft.com, apais@linux.microsoft.com,
 Muchun Song <songmuchun@bytedance.com>
Subject: [PATCH RFC 10/28] mm: memcontrol: return root object cgroup for root memory cgroup
Date: Tue, 15 Apr 2025 10:45:14 +0800
Message-Id: <20250415024532.26632-11-songmuchun@bytedance.com>
In-Reply-To: <20250415024532.26632-1-songmuchun@bytedance.com>
References: <20250415024532.26632-1-songmuchun@bytedance.com>

Memory cgroup functions such as get_mem_cgroup_from_folio() and
get_mem_cgroup_from_mm() return a valid memory cgroup pointer,
even for the root memory cgroup. In contrast, the situation for
object cgroups has been different.

Previously, the root object cgroup couldn't be returned because
it didn't exist. Now that a valid root object cgroup exists, for
the sake of consistency, it's necessary to align the behavior of
object-cgroup-related operations with that of memory cgroup APIs.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 include/linux/memcontrol.h | 29 ++++++++++++++++++-------
 mm/memcontrol.c            | 44 ++++++++++++++++++++------------------
 mm/percpu.c                |  2 +-
 3 files changed, 45 insertions(+), 30 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index bb4f203733f3..e74922d5755d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -319,6 +319,7 @@ struct mem_cgroup {
 #define MEMCG_CHARGE_BATCH 64U
 
 extern struct mem_cgroup *root_mem_cgroup;
+extern struct obj_cgroup *root_obj_cgroup;
 
 enum page_memcg_data_flags {
 	/* page->memcg_data is a pointer to an slabobj_ext vector */
@@ -528,6 +529,11 @@ static inline bool mem_cgroup_is_root(struct mem_cgroup *memcg)
 	return (memcg == root_mem_cgroup);
 }
 
+static inline bool obj_cgroup_is_root(const struct obj_cgroup *objcg)
+{
+	return objcg == root_obj_cgroup;
+}
+
 static inline bool mem_cgroup_disabled(void)
 {
 	return !cgroup_subsys_enabled(memory_cgrp_subsys);
@@ -752,23 +758,26 @@ struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css){
 
 static inline bool obj_cgroup_tryget(struct obj_cgroup *objcg)
 {
+	if (obj_cgroup_is_root(objcg))
+		return true;
 	return percpu_ref_tryget(&objcg->refcnt);
 }
 
-static inline void obj_cgroup_get(struct obj_cgroup *objcg)
+static inline void obj_cgroup_get_many(struct obj_cgroup *objcg,
+				       unsigned long nr)
 {
-	percpu_ref_get(&objcg->refcnt);
+	if (!obj_cgroup_is_root(objcg))
+		percpu_ref_get_many(&objcg->refcnt, nr);
 }
 
-static inline void obj_cgroup_get_many(struct obj_cgroup *objcg,
-				       unsigned long nr)
+static inline void obj_cgroup_get(struct obj_cgroup *objcg)
 {
-	percpu_ref_get_many(&objcg->refcnt, nr);
+	obj_cgroup_get_many(objcg, 1);
 }
 
 static inline void obj_cgroup_put(struct obj_cgroup *objcg)
 {
-	if (objcg)
+	if (objcg && !obj_cgroup_is_root(objcg))
 		percpu_ref_put(&objcg->refcnt);
 }
 
@@ -1101,6 +1110,11 @@ static inline bool mem_cgroup_is_root(struct mem_cgroup *memcg)
 	return true;
 }
 
+static inline bool obj_cgroup_is_root(const struct obj_cgroup *objcg)
+{
+	return true;
+}
+
 static inline bool mem_cgroup_disabled(void)
 {
 	return true;
@@ -1684,8 +1698,7 @@ static inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 {
 	struct obj_cgroup *objcg = current_obj_cgroup();
 
-	if (objcg)
-		obj_cgroup_get(objcg);
+	obj_cgroup_get(objcg);
 
 	return objcg;
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a6362d11b46c..4aadc1b87db3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -81,6 +81,7 @@ struct cgroup_subsys memory_cgrp_subsys __read_mostly;
 EXPORT_SYMBOL(memory_cgrp_subsys);
 
 struct mem_cgroup *root_mem_cgroup __read_mostly;
+struct obj_cgroup *root_obj_cgroup __read_mostly;
 
 /* Active memory cgroup to use from an interrupt context */
 DEFINE_PER_CPU(struct mem_cgroup *, int_active_memcg);
@@ -2525,15 +2526,14 @@ struct mem_cgroup *mem_cgroup_from_slab_obj(void *p)
 
 static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
 {
-	struct obj_cgroup *objcg = NULL;
+	for (; memcg; memcg = parent_mem_cgroup(memcg)) {
+		struct obj_cgroup *objcg = rcu_dereference(memcg->objcg);
 
-	for (; !mem_cgroup_is_root(memcg); memcg = parent_mem_cgroup(memcg)) {
-		objcg = rcu_dereference(memcg->objcg);
 		if (likely(objcg && obj_cgroup_tryget(objcg)))
-			break;
-		objcg = NULL;
+			return objcg;
 	}
-	return objcg;
+
+	return NULL;
 }
 
 static struct obj_cgroup *current_objcg_update(void)
@@ -2604,18 +2604,17 @@ __always_inline struct obj_cgroup *current_obj_cgroup(void)
 		 * Objcg reference is kept by the task, so it's safe
 		 * to use the objcg by the current task.
 		 */
-		return objcg;
+		return objcg ? : root_obj_cgroup;
 	}
 
 	memcg = this_cpu_read(int_active_memcg);
 	if (unlikely(memcg))
 		goto from_memcg;
 
-	return NULL;
+	return root_obj_cgroup;
 
 from_memcg:
-	objcg = NULL;
-	for (; !mem_cgroup_is_root(memcg); memcg = parent_mem_cgroup(memcg)) {
+	for (; memcg; memcg = parent_mem_cgroup(memcg)) {
 		/*
 		 * Memcg pointer is protected by scope (see set_active_memcg())
 		 * and is pinning the corresponding objcg, so objcg can't go
@@ -2624,10 +2623,10 @@ __always_inline struct obj_cgroup *current_obj_cgroup(void)
 		 */
 		objcg = rcu_dereference_check(memcg->objcg, 1);
 		if (likely(objcg))
-			break;
+			return objcg;
 	}
 
-	return objcg;
+	return root_obj_cgroup;
 }
 
 struct obj_cgroup *get_obj_cgroup_from_folio(struct folio *folio)
@@ -2641,14 +2640,8 @@ struct obj_cgroup *get_obj_cgroup_from_folio(struct folio *folio)
 		objcg = __folio_objcg(folio);
 		obj_cgroup_get(objcg);
 	} else {
-		struct mem_cgroup *memcg;
-
 		rcu_read_lock();
-		memcg = __folio_memcg(folio);
-		if (memcg)
-			objcg = __get_obj_cgroup_from_memcg(memcg);
-		else
-			objcg = NULL;
+		objcg = __get_obj_cgroup_from_memcg(__folio_memcg(folio));
 		rcu_read_unlock();
 	}
 	return objcg;
@@ -2733,7 +2726,7 @@ int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order)
 	int ret = 0;
 
 	objcg = current_obj_cgroup();
-	if (objcg) {
+	if (!obj_cgroup_is_root(objcg)) {
 		ret = obj_cgroup_charge_pages(objcg, gfp, 1 << order);
 		if (!ret) {
 			obj_cgroup_get(objcg);
@@ -3036,7 +3029,7 @@ bool __memcg_slab_post_alloc_hook(struct kmem_cache *s, struct list_lru *lru,
 	 * obj_cgroup_get() is used to get a permanent reference.
 	 */
 	objcg = current_obj_cgroup();
-	if (!objcg)
+	if (obj_cgroup_is_root(objcg))
 		return true;
 
 	/*
@@ -3708,6 +3701,9 @@ static int mem_cgroup_css_online(struct cgroup_subsys_state *css)
 	if (!objcg)
 		goto free_shrinker;
 
+	if (unlikely(mem_cgroup_is_root(memcg)))
+		root_obj_cgroup = objcg;
+
 	objcg->memcg = memcg;
 	rcu_assign_pointer(memcg->objcg, objcg);
 	obj_cgroup_get(objcg);
@@ -5302,6 +5298,9 @@ void obj_cgroup_charge_zswap(struct obj_cgroup *objcg, size_t size)
 	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
 		return;
 
+	if (obj_cgroup_is_root(objcg))
+		return;
+
 	VM_WARN_ON_ONCE(!(current->flags & PF_MEMALLOC));
 
 	/* PF_MEMALLOC context, charging must succeed */
@@ -5329,6 +5328,9 @@ void obj_cgroup_uncharge_zswap(struct obj_cgroup *objcg, size_t size)
 	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
 		return;
 
+	if (obj_cgroup_is_root(objcg))
+		return;
+
 	obj_cgroup_uncharge(objcg, size);
 
 	rcu_read_lock();
diff --git a/mm/percpu.c b/mm/percpu.c
index b35494c8ede2..3e54c6fca9bd 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1616,7 +1616,7 @@ static bool pcpu_memcg_pre_alloc_hook(size_t size, gfp_t gfp,
 		return true;
 
 	objcg = current_obj_cgroup();
-	if (!objcg)
+	if (obj_cgroup_is_root(objcg))
 		return true;
 
 	if (obj_cgroup_charge(objcg, gfp, pcpu_obj_full_size(size)))
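
Illustration only, not part of the patch: the calling convention this change moves to (a non-NULL root object cgroup that callers test with obj_cgroup_is_root(), and on which get/put are no-ops) can be modelled in a few lines of userspace C. Everything below is a hypothetical stand-in: struct objcg_model, root_objcg, current_objcg() and charge() are invented names, and a plain counter replaces the kernel's percpu_ref.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct obj_cgroup. */
struct objcg_model {
	long refcnt;	/* plain counter instead of a percpu_ref */
};

/* Plays the role of root_obj_cgroup in this sketch. */
static struct objcg_model root_model;
static struct objcg_model *root_objcg = &root_model;

static bool objcg_is_root(const struct objcg_model *objcg)
{
	return objcg == root_objcg;
}

/* References on the root are never counted, as in obj_cgroup_get_many(). */
static void objcg_get_many(struct objcg_model *objcg, unsigned long nr)
{
	if (!objcg_is_root(objcg))
		objcg->refcnt += (long)nr;
}

static void objcg_get(struct objcg_model *objcg)
{
	objcg_get_many(objcg, 1);
}

/* Dropping a reference on NULL or on the root is a no-op. */
static void objcg_put(struct objcg_model *objcg)
{
	if (objcg && !objcg_is_root(objcg))
		objcg->refcnt--;
}

/* Model of current_obj_cgroup(): falls back to the root, never NULL. */
static struct objcg_model *current_objcg(struct objcg_model *task_objcg)
{
	return task_objcg ? task_objcg : root_objcg;
}

/* Caller pattern after the patch: test for root instead of NULL. */
static bool charge(struct objcg_model *task_objcg)
{
	struct objcg_model *objcg = current_objcg(task_objcg);

	if (objcg_is_root(objcg))
		return false;	/* nothing is charged to the root */

	objcg_get(objcg);	/* keep a reference for the charged object */
	return true;
}
```

In this model a caller never has to NULL-check the result of current_objcg(), which is the consistency with the memory cgroup helpers that the commit message describes.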