From patchwork Wed Dec 11 04:32:50 2024
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
To: Andrew Morton
Cc: "Matthew Wilcox (Oracle)", Christoph Hellwig, linux-mm@kvack.org,
    Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
    Muchun Song, cgroups@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH 2/2] vmalloc: Account memcg per vmalloc
Date: Wed, 11 Dec 2024 04:32:50 +0000
Message-ID: <20241211043252.3295947-2-willy@infradead.org>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20241211043252.3295947-1-willy@infradead.org>
References: <20241211043252.3295947-1-willy@infradead.org>

Today we account each page individually to the memcg, which works well
enough, if a little inefficiently (N atomic operations per page instead
of N per allocation).

Unfortunately, the stats can get out of sync when i915 calls vmap()
with VM_MAP_PUT_PAGES.  The pages being passed were not allocated by
vmalloc, so the MEMCG_VMALLOC counter was never incremented.  But it is
decremented when the pages are freed with vfree().

Solve all of this by tracking the memcg at the vm_struct level.  This
logic has to live in the memcontrol file as it calls several functions
which are currently static.
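For illustration, a minimal sketch of the pattern that desyncs the
counter before this patch (hypothetical caller names, modelled on the
i915 usage described above; not part of the patch):

	/* The pages here were not allocated by vmalloc and carry no
	 * __GFP_ACCOUNT charge, so vmap() never increments
	 * MEMCG_VMALLOC for them. */
	static void *map_my_pages(struct page **pages, unsigned int count)
	{
		/* VM_MAP_PUT_PAGES: vfree() will also put the pages */
		return vmap(pages, count, VM_MAP | VM_MAP_PUT_PAGES,
				PAGE_KERNEL);
	}

	static void unmap_my_pages(void *addr)
	{
		/* Before this patch, vfree() decremented MEMCG_VMALLOC
		 * once per page here, driving the stat negative. */
		vfree(addr);
	}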
Fixes: b944afc9d64d ("mm: add a VM_MAP_PUT_PAGES flag for vmap")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/memcontrol.h |  7 +++++++
 include/linux/vmalloc.h    |  3 +++
 mm/memcontrol.c            | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 mm/vmalloc.c               | 14 ++++++------
 4 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 5502aa8e138e..83ebcadebba6 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1676,6 +1676,10 @@ static inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size);
 void obj_cgroup_uncharge(struct obj_cgroup *objcg, size_t size);
+int obj_cgroup_charge_vmalloc(struct obj_cgroup **objcgp,
+		unsigned int nr_pages, gfp_t gfp);
+void obj_cgroup_uncharge_vmalloc(struct obj_cgroup *objcgp,
+		unsigned int nr_pages);
 
 extern struct static_key_false memcg_bpf_enabled_key;
 
 static inline bool memcg_bpf_enabled(void)
@@ -1756,6 +1760,9 @@ static inline void __memcg_kmem_uncharge_page(struct page *page, int order)
 {
 }
 
+/* Must be macros to avoid dereferencing objcg in vm_struct */
+#define obj_cgroup_charge_vmalloc(objcgp, nr_pages, gfp) 0
+#define obj_cgroup_uncharge_vmalloc(objcg, nr_pages) do { } while (0)
 static inline struct obj_cgroup *get_obj_cgroup_from_folio(struct folio *folio)
 {
 	return NULL;
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 31e9ffd936e3..ec7c2d607382 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -60,6 +60,9 @@ struct vm_struct {
 #endif
 	unsigned int		nr_pages;
 	phys_addr_t		phys_addr;
+#ifdef CONFIG_MEMCG
+	struct obj_cgroup	*objcg;
+#endif
 	const void		*caller;
 };
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7b3503d12aaf..629bffc3e26d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5472,4 +5472,50 @@ static int __init mem_cgroup_swap_init(void)
 }
 subsys_initcall(mem_cgroup_swap_init);
 
+/**
+ * obj_cgroup_charge_vmalloc - Charge vmalloc memory
+ * @objcgp: Pointer to an object cgroup
+ * @nr_pages: Number of pages
+ * @gfp: Memory allocation flags
+ *
+ * Return: 0 on success, negative errno on failure.
+ */
+int obj_cgroup_charge_vmalloc(struct obj_cgroup **objcgp,
+		unsigned int nr_pages, gfp_t gfp)
+{
+	struct obj_cgroup *objcg;
+	int err;
+
+	if (mem_cgroup_disabled() || !(gfp & __GFP_ACCOUNT))
+		return 0;
+
+	objcg = current_obj_cgroup();
+	if (!objcg)
+		return 0;
+
+	err = obj_cgroup_charge_pages(objcg, gfp, nr_pages);
+	if (err)
+		return err;
+	obj_cgroup_get(objcg);
+	mod_memcg_state(obj_cgroup_memcg(objcg), MEMCG_VMALLOC, nr_pages);
+	*objcgp = objcg;
+
+	return 0;
+}
+
+/**
+ * obj_cgroup_uncharge_vmalloc - Uncharge vmalloc memory
+ * @objcg: The object cgroup
+ * @nr_pages: Number of pages
+ */
+void obj_cgroup_uncharge_vmalloc(struct obj_cgroup *objcg,
+		unsigned int nr_pages)
+{
+	if (!objcg)
+		return;
+	mod_memcg_state(objcg->memcg, MEMCG_VMALLOC, 0L - nr_pages);
+	obj_cgroup_uncharge_pages(objcg, nr_pages);
+	obj_cgroup_put(objcg);
+}
+
 #endif /* CONFIG_SWAP */
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index bc9c91f3b373..d5e9068d9091 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3374,7 +3374,6 @@ void vfree(const void *addr)
 		struct page *page = vm->pages[i];
 
 		BUG_ON(!page);
-		mod_memcg_page_state(page, MEMCG_VMALLOC, -1);
 		/*
 		 * High-order allocs for huge vmallocs are split, so
 		 * can be freed as an array of order-0 allocations
@@ -3384,6 +3383,7 @@ void vfree(const void *addr)
 	}
 	if (!(vm->flags & VM_MAP_PUT_PAGES))
 		atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages);
+	obj_cgroup_uncharge_vmalloc(vm->objcg, vm->nr_pages);
 	kvfree(vm->pages);
 	kfree(vm);
 }
@@ -3537,6 +3537,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	struct page *page;
 	int i;
 
+	/* Accounting handled in caller */
+	gfp &= ~__GFP_ACCOUNT;
+
 	/*
 	 * For order-0 pages we make use of bulk allocator, if
 	 * the page array is partly or not at all populated due
@@ -3670,12 +3673,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 			node, page_order, nr_small_pages, area->pages);
 
 	atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
-	if (gfp_mask & __GFP_ACCOUNT) {
-		int i;
-
-		for (i = 0; i < area->nr_pages; i++)
-			mod_memcg_page_state(area->pages[i], MEMCG_VMALLOC, 1);
-	}
+	ret = obj_cgroup_charge_vmalloc(&area->objcg, area->nr_pages, gfp_mask);
+	if (ret)
+		goto fail;
 
 	/*
 	 * If not enough pages were obtained to accomplish an
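As a usage note, a minimal sketch of the intended pairing of the new
helpers (hypothetical caller, not part of the patch; the real call
sites are __vmalloc_area_node() and vfree() above):

	static int example_charge(struct vm_struct *area, gfp_t gfp)
	{
		/* No-op returning 0 unless gfp includes __GFP_ACCOUNT;
		 * on success stores a referenced objcg in area->objcg. */
		return obj_cgroup_charge_vmalloc(&area->objcg,
				area->nr_pages, gfp);
	}

	static void example_uncharge(struct vm_struct *area)
	{
		/* Safe when area->objcg is NULL (nothing was charged). */
		obj_cgroup_uncharge_vmalloc(area->objcg, area->nr_pages);
	}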