From patchwork Tue Dec 10 19:30:33 2024
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
To: Andrew Morton
Cc: "Matthew Wilcox (Oracle)", Christoph Hellwig, linux-mm@kvack.org,
	Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
	Muchun Song, cgroups@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH] vmalloc: Move memcg logic into memcg code
Date: Tue, 10 Dec 2024 19:30:33 +0000
Message-ID: <20241210193035.2667005-1-willy@infradead.org>
X-Mailer: git-send-email 2.47.1
MIME-Version: 1.0

Today we account each page individually to the memcg, which works well
enough, if a little inefficiently (N atomic operations per page instead
of N per allocation).
Unfortunately, the stats can get out of sync when i915 calls vmap()
with VM_MAP_PUT_PAGES.  The pages being passed were not allocated by
vmalloc, so the MEMCG_VMALLOC counter was never incremented.  But it
is decremented when the pages are freed with vfree().

Solve all of this by tracking the memcg at the vm_struct level.
This logic has to live in the memcontrol file as it calls several
functions which are currently static.

Fixes: b944afc9d64d ("mm: add a VM_MAP_PUT_PAGES flag for vmap")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/memcontrol.h |  7 ++++++
 include/linux/vmalloc.h    |  3 +++
 mm/memcontrol.c            | 46 ++++++++++++++++++++++++++++++++++++++
 mm/vmalloc.c               | 14 ++++++------
 4 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 5502aa8e138e..83ebcadebba6 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1676,6 +1676,10 @@ static inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 
 int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size);
 void obj_cgroup_uncharge(struct obj_cgroup *objcg, size_t size);
+int obj_cgroup_charge_vmalloc(struct obj_cgroup **objcgp,
+		unsigned int nr_pages, gfp_t gfp);
+void obj_cgroup_uncharge_vmalloc(struct obj_cgroup *objcg,
+		unsigned int nr_pages);
 
 extern struct static_key_false memcg_bpf_enabled_key;
 static inline bool memcg_bpf_enabled(void)
@@ -1756,6 +1760,9 @@ static inline void __memcg_kmem_uncharge_page(struct page *page, int order)
 {
 }
 
+/* Must be macros to avoid dereferencing objcg in vm_struct */
+#define obj_cgroup_charge_vmalloc(objcgp, nr_pages, gfp)	0
+#define obj_cgroup_uncharge_vmalloc(objcg, nr_pages)	do { } while (0)
 static inline struct obj_cgroup *get_obj_cgroup_from_folio(struct folio *folio)
 {
 	return NULL;
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 31e9ffd936e3..ec7c2d607382 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -60,6 +60,9 @@ struct vm_struct {
 #endif
 	unsigned int		nr_pages;
 	phys_addr_t		phys_addr;
+#ifdef CONFIG_MEMCG
+	struct obj_cgroup	*objcg;
+#endif
 	const void		*caller;
 };
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7b3503d12aaf..629bffc3e26d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5472,4 +5472,50 @@ static int __init mem_cgroup_swap_init(void)
 }
 subsys_initcall(mem_cgroup_swap_init);
 
+/**
+ * obj_cgroup_charge_vmalloc - Charge vmalloc memory
+ * @objcgp: Pointer to an object cgroup
+ * @nr_pages: Number of pages
+ * @gfp: Memory allocation flags
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int obj_cgroup_charge_vmalloc(struct obj_cgroup **objcgp,
+		unsigned int nr_pages, gfp_t gfp)
+{
+	struct obj_cgroup *objcg;
+	int err;
+
+	if (mem_cgroup_disabled() || !(gfp & __GFP_ACCOUNT))
+		return 0;
+
+	objcg = current_obj_cgroup();
+	if (!objcg)
+		return 0;
+
+	err = obj_cgroup_charge_pages(objcg, gfp, nr_pages);
+	if (err)
+		return err;
+	obj_cgroup_get(objcg);
+	mod_memcg_state(obj_cgroup_memcg(objcg), MEMCG_VMALLOC, nr_pages);
+	*objcgp = objcg;
+
+	return 0;
+}
+
+/**
+ * obj_cgroup_uncharge_vmalloc - Uncharge vmalloc memory
+ * @objcg: The object cgroup
+ * @nr_pages: Number of pages
+ */
+void obj_cgroup_uncharge_vmalloc(struct obj_cgroup *objcg,
+		unsigned int nr_pages)
+{
+	if (!objcg)
+		return;
+	mod_memcg_state(objcg->memcg, MEMCG_VMALLOC, 0L - nr_pages);
+	obj_cgroup_uncharge_pages(objcg, nr_pages);
+	obj_cgroup_put(objcg);
+}
+
 #endif /* CONFIG_SWAP */
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index f009b21705c1..438995d2f9f8 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3374,7 +3374,6 @@ void vfree(const void *addr)
 		struct page *page = vm->pages[i];
 
 		BUG_ON(!page);
-		mod_memcg_page_state(page, MEMCG_VMALLOC, -1);
 		/*
 		 * High-order allocs for huge vmallocs are split, so
 		 * can be freed as an array of order-0 allocations
@@ -3383,6 +3382,7 @@ void vfree(const void *addr)
 		cond_resched();
 	}
 	atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages);
+	obj_cgroup_uncharge_vmalloc(vm->objcg, vm->nr_pages);
 	kvfree(vm->pages);
 	kfree(vm);
 }
@@ -3536,6 +3536,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	struct page *page;
 	int i;
 
+	/* Accounting handled in caller */
+	gfp &= ~__GFP_ACCOUNT;
+
 	/*
 	 * For order-0 pages we make use of bulk allocator, if
 	 * the page array is partly or not at all populated due
@@ -3669,12 +3672,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 			node, page_order, nr_small_pages, area->pages);
 
 	atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
-	if (gfp_mask & __GFP_ACCOUNT) {
-		int i;
-
-		for (i = 0; i < area->nr_pages; i++)
-			mod_memcg_page_state(area->pages[i], MEMCG_VMALLOC, 1);
-	}
+	ret = obj_cgroup_charge_vmalloc(&area->objcg, area->nr_pages, gfp_mask);
+	if (ret)
+		goto fail;
 
 	/*
 	 * If not enough pages were obtained to accomplish an