From patchwork Fri Aug 7 06:21:37 2020
X-Patchwork-Submitter: Andrew Morton <akpm@linux-foundation.org>
X-Patchwork-Id: 11704833
Date: Thu, 06 Aug 2020 23:21:37 -0700
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org,
 linux-mm@kvack.org, mhocko@kernel.org, mm-commits@vger.kernel.org,
 shakeelb@google.com, torvalds@linux-foundation.org
Subject: [patch 081/163] mm: memcontrol: account kernel stack per node
Message-ID: <20200807062137.o44Vg87wz%akpm@linux-foundation.org>
In-Reply-To: <20200806231643.a2711a608dd0f18bff2caf2b@linux-foundation.org>

From: Shakeel Butt <shakeelb@google.com>
Subject: mm: memcontrol: account kernel stack per node

Currently the kernel stack is accounted per-zone.  There is no need to do
that.  Moreover, because the stat is per-zone, memcg has to keep a
separate MEMCG_KERNEL_STACK_KB counter.  Make the stat per-node and
deprecate MEMCG_KERNEL_STACK_KB, since memcg_stat_item is an extension of
node_stat_item.  Also localize the kernel stack stat updates to
account_kernel_stack().

Link: http://lkml.kernel.org/r/20200630161539.1759185-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/base/node.c        |    4 +-
 fs/proc/meminfo.c          |    4 +-
 include/linux/memcontrol.h |   21 +++++++++++++-
 include/linux/mmzone.h     |    8 ++---
 kernel/fork.c              |   51 +++++++++--------------------
 kernel/scs.c               |    2 -
 mm/memcontrol.c            |    2 -
 mm/page_alloc.c            |   16 +++++-----
 mm/vmstat.c                |    8 ++---
 9 files changed, 55 insertions(+), 61 deletions(-)
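As background for the patch below: the changelog's point that memcg_stat_item
"is an extension of node_stat_item" is what lets MEMCG_KERNEL_STACK_KB go
away.  Here is a minimal standalone sketch of that layout (illustrative
userspace C, not part of the patch; the elided node stats are collapsed into
a single placeholder entry):

/*
 * Because memcg_stat_item starts at NR_VM_NODE_STAT_ITEMS, the two enums
 * share one flat index space, so any node_stat_item index is also a valid
 * memcg stat index once NR_KERNEL_STACK_KB lives in node_stat_item.
 */
#include <stdio.h>

enum node_stat_item {
	NR_FOLL_PIN_RELEASED,	/* placeholder for the preceding node stats */
	NR_KERNEL_STACK_KB,	/* now a node counter, measured in KiB */
	NR_VM_NODE_STAT_ITEMS
};

enum memcg_stat_item {
	MEMCG_SWAP = NR_VM_NODE_STAT_ITEMS,	/* continues the node enum */
	MEMCG_SOCK,
	MEMCG_NR_STAT
};

int main(void)
{
	printf("NR_KERNEL_STACK_KB=%d < NR_VM_NODE_STAT_ITEMS=%d\n",
	       NR_KERNEL_STACK_KB, NR_VM_NODE_STAT_ITEMS);
	printf("MEMCG_SWAP=%d MEMCG_NR_STAT=%d\n", MEMCG_SWAP, MEMCG_NR_STAT);
	return 0;
}

This is why memory_stat_format() below can pass NR_KERNEL_STACK_KB straight
to memcg_page_state() with no dedicated memcg item.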
--- a/drivers/base/node.c~mm-memcontrol-account-kernel-stack-per-node
+++ a/drivers/base/node.c
@@ -440,9 +440,9 @@ static ssize_t node_read_meminfo(struct
 		       nid, K(node_page_state(pgdat, NR_FILE_MAPPED)),
 		       nid, K(node_page_state(pgdat, NR_ANON_MAPPED)),
 		       nid, K(i.sharedram),
-		       nid, sum_zone_node_page_state(nid, NR_KERNEL_STACK_KB),
+		       nid, node_page_state(pgdat, NR_KERNEL_STACK_KB),
 #ifdef CONFIG_SHADOW_CALL_STACK
-		       nid, sum_zone_node_page_state(nid, NR_KERNEL_SCS_KB),
+		       nid, node_page_state(pgdat, NR_KERNEL_SCS_KB),
 #endif
 		       nid, K(sum_zone_node_page_state(nid, NR_PAGETABLE)),
 		       nid, 0UL,

--- a/fs/proc/meminfo.c~mm-memcontrol-account-kernel-stack-per-node
+++ a/fs/proc/meminfo.c
@@ -101,10 +101,10 @@ static int meminfo_proc_show(struct seq_
 	show_val_kb(m, "SReclaimable:   ", sreclaimable);
 	show_val_kb(m, "SUnreclaim:     ", sunreclaim);
 	seq_printf(m, "KernelStack:    %8lu kB\n",
-		   global_zone_page_state(NR_KERNEL_STACK_KB));
+		   global_node_page_state(NR_KERNEL_STACK_KB));
 #ifdef CONFIG_SHADOW_CALL_STACK
 	seq_printf(m, "ShadowCallStack:%8lu kB\n",
-		   global_zone_page_state(NR_KERNEL_SCS_KB));
+		   global_node_page_state(NR_KERNEL_SCS_KB));
 #endif
 	show_val_kb(m, "PageTables:     ",
 		    global_zone_page_state(NR_PAGETABLE));

--- a/include/linux/memcontrol.h~mm-memcontrol-account-kernel-stack-per-node
+++ a/include/linux/memcontrol.h
@@ -32,8 +32,6 @@ struct kmem_cache;
 enum memcg_stat_item {
	MEMCG_SWAP = NR_VM_NODE_STAT_ITEMS,
	MEMCG_SOCK,
-	/* XXX: why are these zone and not node counters? */
-	MEMCG_KERNEL_STACK_KB,
	MEMCG_NR_STAT,
 };

@@ -729,8 +727,19 @@ void __mod_memcg_lruvec_state(struct lru
 void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
			int val);
 void __mod_lruvec_slab_state(void *p, enum node_stat_item idx, int val);
+
 void mod_memcg_obj_state(void *p, int idx, int val);

+static inline void mod_lruvec_slab_state(void *p, enum node_stat_item idx,
+					 int val)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	__mod_lruvec_slab_state(p, idx, val);
+	local_irq_restore(flags);
+}
+
 static inline void mod_memcg_lruvec_state(struct lruvec *lruvec,
					  enum node_stat_item idx, int val)
 {

@@ -1151,6 +1160,14 @@ static inline void __mod_lruvec_slab_sta
	__mod_node_page_state(page_pgdat(page), idx, val);
 }

+static inline void mod_lruvec_slab_state(void *p, enum node_stat_item idx,
+					 int val)
+{
+	struct page *page = virt_to_head_page(p);
+
+	mod_node_page_state(page_pgdat(page), idx, val);
+}
+
 static inline void mod_memcg_obj_state(void *p, int idx, int val)
 {
 }

--- a/include/linux/mmzone.h~mm-memcontrol-account-kernel-stack-per-node
+++ a/include/linux/mmzone.h
@@ -155,10 +155,6 @@ enum zone_stat_item {
	NR_ZONE_WRITE_PENDING,	/* Count of dirty, writeback and unstable pages */
	NR_MLOCK,		/* mlock()ed pages found and moved off LRU */
	NR_PAGETABLE,		/* used for pagetables */
-	NR_KERNEL_STACK_KB,	/* measured in KiB */
-#if IS_ENABLED(CONFIG_SHADOW_CALL_STACK)
-	NR_KERNEL_SCS_KB,	/* measured in KiB */
-#endif
	/* Second 128 byte cacheline */
	NR_BOUNCE,
 #if IS_ENABLED(CONFIG_ZSMALLOC)

@@ -203,6 +199,10 @@ enum node_stat_item {
	NR_KERNEL_MISC_RECLAIMABLE,	/* reclaimable non-slab kernel pages */
	NR_FOLL_PIN_ACQUIRED,	/* via: pin_user_page(), gup flag: FOLL_PIN */
	NR_FOLL_PIN_RELEASED,	/* pages returned via unpin_user_page() */
+	NR_KERNEL_STACK_KB,	/* measured in KiB */
+#if IS_ENABLED(CONFIG_SHADOW_CALL_STACK)
+	NR_KERNEL_SCS_KB,	/* measured in KiB */
+#endif
	NR_VM_NODE_STAT_ITEMS
 };
--- a/kernel/fork.c~mm-memcontrol-account-kernel-stack-per-node
+++ a/kernel/fork.c
@@ -276,13 +276,8 @@ static inline void free_thread_stack(str
	if (vm) {
		int i;

-		for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) {
-			mod_memcg_page_state(vm->pages[i],
-					     MEMCG_KERNEL_STACK_KB,
-					     -(int)(PAGE_SIZE / 1024));
-
+		for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
			memcg_kmem_uncharge_page(vm->pages[i], 0);
-		}

		for (i = 0; i < NR_CACHED_STACKS; i++) {
			if (this_cpu_cmpxchg(cached_stacks[i],

@@ -382,31 +377,14 @@ static void account_kernel_stack(struct
	void *stack = task_stack_page(tsk);
	struct vm_struct *vm = task_stack_vm_area(tsk);

-	BUILD_BUG_ON(IS_ENABLED(CONFIG_VMAP_STACK) && PAGE_SIZE % 1024 != 0);
-
-	if (vm) {
-		int i;
-
-		BUG_ON(vm->nr_pages != THREAD_SIZE / PAGE_SIZE);
-
-		for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) {
-			mod_zone_page_state(page_zone(vm->pages[i]),
-					    NR_KERNEL_STACK_KB,
-					    PAGE_SIZE / 1024 * account);
-		}
-	} else {
-		/*
-		 * All stack pages are in the same zone and belong to the
-		 * same memcg.
-		 */
-		struct page *first_page = virt_to_page(stack);
-
-		mod_zone_page_state(page_zone(first_page), NR_KERNEL_STACK_KB,
-				    THREAD_SIZE / 1024 * account);
-
-		mod_memcg_obj_state(stack, MEMCG_KERNEL_STACK_KB,
-				    account * (THREAD_SIZE / 1024));
-	}
+	/* All stack pages are in the same node. */
+	if (vm)
+		mod_lruvec_page_state(vm->pages[0], NR_KERNEL_STACK_KB,
+				      account * (THREAD_SIZE / 1024));
+	else
+		mod_lruvec_slab_state(stack, NR_KERNEL_STACK_KB,
+				      account * (THREAD_SIZE / 1024));
 }

 static int memcg_charge_kernel_stack(struct task_struct *tsk)
@@ -415,24 +393,23 @@ static int memcg_charge_kernel_stack(str
	struct vm_struct *vm = task_stack_vm_area(tsk);
	int ret;

+	BUILD_BUG_ON(IS_ENABLED(CONFIG_VMAP_STACK) && PAGE_SIZE % 1024 != 0);
+
	if (vm) {
		int i;

+		BUG_ON(vm->nr_pages != THREAD_SIZE / PAGE_SIZE);
+
		for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) {
			/*
			 * If memcg_kmem_charge_page() fails, page->mem_cgroup
-			 * pointer is NULL, and both memcg_kmem_uncharge_page()
-			 * and mod_memcg_page_state() in free_thread_stack()
-			 * will ignore this page. So it's safe.
+			 * pointer is NULL, and memcg_kmem_uncharge_page() in
+			 * free_thread_stack() will ignore this page.
			 */
			ret = memcg_kmem_charge_page(vm->pages[i], GFP_KERNEL, 0);
			if (ret)
				return ret;
-
-			mod_memcg_page_state(vm->pages[i],
-					     MEMCG_KERNEL_STACK_KB,
-					     PAGE_SIZE / 1024);
		}
	}
 #endif

--- a/kernel/scs.c~mm-memcontrol-account-kernel-stack-per-node
+++ a/kernel/scs.c
@@ -17,7 +17,7 @@ static void __scs_account(void *s, int a
 {
	struct page *scs_page = virt_to_page(s);

-	mod_zone_page_state(page_zone(scs_page), NR_KERNEL_SCS_KB,
+	mod_node_page_state(page_pgdat(scs_page), NR_KERNEL_SCS_KB,
			    account * (SCS_SIZE / SZ_1K));
 }

--- a/mm/memcontrol.c~mm-memcontrol-account-kernel-stack-per-node
+++ a/mm/memcontrol.c
@@ -1485,7 +1485,7 @@ static char *memory_stat_format(struct m
		       (u64)memcg_page_state(memcg, NR_FILE_PAGES) *
		       PAGE_SIZE);
	seq_buf_printf(&s, "kernel_stack %llu\n",
-		       (u64)memcg_page_state(memcg, MEMCG_KERNEL_STACK_KB) *
+		       (u64)memcg_page_state(memcg, NR_KERNEL_STACK_KB) *
		       1024);
	seq_buf_printf(&s, "slab %llu\n",
		       (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) +
"yes" : "no"); } @@ -5448,10 +5456,6 @@ void show_free_areas(unsigned int filter " present:%lukB" " managed:%lukB" " mlocked:%lukB" - " kernel_stack:%lukB" -#ifdef CONFIG_SHADOW_CALL_STACK - " shadow_call_stack:%lukB" -#endif " pagetables:%lukB" " bounce:%lukB" " free_pcp:%lukB" @@ -5473,10 +5477,6 @@ void show_free_areas(unsigned int filter K(zone->present_pages), K(zone_managed_pages(zone)), K(zone_page_state(zone, NR_MLOCK)), - zone_page_state(zone, NR_KERNEL_STACK_KB), -#ifdef CONFIG_SHADOW_CALL_STACK - zone_page_state(zone, NR_KERNEL_SCS_KB), -#endif K(zone_page_state(zone, NR_PAGETABLE)), K(zone_page_state(zone, NR_BOUNCE)), K(free_pcp), --- a/mm/vmstat.c~mm-memcontrol-account-kernel-stack-per-node +++ a/mm/vmstat.c @@ -1140,10 +1140,6 @@ const char * const vmstat_text[] = { "nr_zone_write_pending", "nr_mlock", "nr_page_table_pages", - "nr_kernel_stack", -#if IS_ENABLED(CONFIG_SHADOW_CALL_STACK) - "nr_shadow_call_stack", -#endif "nr_bounce", #if IS_ENABLED(CONFIG_ZSMALLOC) "nr_zspages", @@ -1194,6 +1190,10 @@ const char * const vmstat_text[] = { "nr_kernel_misc_reclaimable", "nr_foll_pin_acquired", "nr_foll_pin_released", + "nr_kernel_stack", +#if IS_ENABLED(CONFIG_SHADOW_CALL_STACK) + "nr_shadow_call_stack", +#endif /* enum writeback_stat_item counters */ "nr_dirty_threshold",