From patchwork Tue Jul 30 15:01:57 2024
X-Patchwork-Submitter: Pasha Tatashin
X-Patchwork-Id: 13747519
From: Pasha Tatashin
To: akpm@linux-foundation.org, jpoimboe@kernel.org,
 pasha.tatashin@soleen.com, kent.overstreet@linux.dev,
 peterz@infradead.org, nphamcs@gmail.com, cerasuolodomenico@gmail.com,
 surenb@google.com, lizhijian@fujitsu.com, willy@infradead.org,
 shakeel.butt@linux.dev, vbabka@suse.cz, ziy@nvidia.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, yosryahmed@google.com
Subject: [PATCH v6 2/3] vmstat: Kernel stack usage histogram
Date: Tue, 30 Jul 2024 15:01:57 +0000
Message-ID: <20240730150158.832783-3-pasha.tatashin@soleen.com>
In-Reply-To: <20240730150158.832783-1-pasha.tatashin@soleen.com>
References: <20240730150158.832783-1-pasha.tatashin@soleen.com>

As part of the dynamic kernel stack project, we need to know the amount
of data that can be saved by reducing the default kernel stack size [1].

Provide a kernel stack usage histogram to aid in optimizing kernel stack
sizes and minimizing memory waste in large-scale environments.

The histogram divides stack usage into power-of-two buckets and reports
the results in /proc/vmstat. This information is especially valuable in
environments with millions of machines, where even small optimizations
can have a significant impact.

The histogram data is presented in /proc/vmstat with entries like
"kstack_1k", "kstack_2k", and so on, indicating the number of threads
that exited with stack usage falling within each respective bucket.

Example outputs:
Intel:
$ grep kstack /proc/vmstat
kstack_1k 3
kstack_2k 188
kstack_4k 11391
kstack_8k 243
kstack_16k 0

ARM with 64K page_size:
$ grep kstack /proc/vmstat
kstack_1k 1
kstack_2k 340
kstack_4k 25212
kstack_8k 1659
kstack_16k 0
kstack_32k 0
kstack_64k 0

Note: once the dynamic kernel stack is implemented, the usefulness of
this feature will depend on the implementation. On hardware that
supports faults on kernel stacks, we will have other metrics that show
the total number of pages allocated for stacks. On hardware where faults
are not supported, we will most likely have some optimization where only
some threads are extended, and for those, these metrics will still be
very useful.

[1] https://lwn.net/Articles/974367

Signed-off-by: Pasha Tatashin
Reviewed-by: Kent Overstreet
Acked-by: Shakeel Butt
---
 include/linux/vm_event_item.h | 24 ++++++++++++++++++++++
 kernel/exit.c                 | 38 +++++++++++++++++++++++++++++++++++
 mm/vmstat.c                   | 24 ++++++++++++++++++++++
 3 files changed, 86 insertions(+)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 747943bc8cc2..37ad1c16367a 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -154,6 +154,30 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		VMA_LOCK_RETRY,
 		VMA_LOCK_MISS,
 #endif
+#ifdef CONFIG_DEBUG_STACK_USAGE
+		KSTACK_1K,
+#if THREAD_SIZE > 1024
+		KSTACK_2K,
+#endif
+#if THREAD_SIZE > 2048
+		KSTACK_4K,
+#endif
+#if THREAD_SIZE > 4096
+		KSTACK_8K,
+#endif
+#if THREAD_SIZE > 8192
+		KSTACK_16K,
+#endif
+#if THREAD_SIZE > 16384
+		KSTACK_32K,
+#endif
+#if THREAD_SIZE > 32768
+		KSTACK_64K,
+#endif
+#if THREAD_SIZE > 65536
+		KSTACK_REST,
+#endif
+#endif /* CONFIG_DEBUG_STACK_USAGE */
 		NR_VM_EVENT_ITEMS
 };
 
diff --git a/kernel/exit.c b/kernel/exit.c
index 7430852a8571..64bfc2bae55b 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -778,6 +778,43 @@ static void exit_notify(struct task_struct *tsk, int group_dead)
 }
 
 #ifdef CONFIG_DEBUG_STACK_USAGE
+/* Count the maximum pages reached in kernel stacks */
+static inline void kstack_histogram(unsigned long used_stack)
+{
+#ifdef CONFIG_VM_EVENT_COUNTERS
+	if (used_stack <= 1024)
+		count_vm_event(KSTACK_1K);
+#if THREAD_SIZE > 1024
+	else if (used_stack <= 2048)
+		count_vm_event(KSTACK_2K);
+#endif
+#if THREAD_SIZE > 2048
+	else if (used_stack <= 4096)
+		count_vm_event(KSTACK_4K);
+#endif
+#if THREAD_SIZE > 4096
+	else if (used_stack <= 8192)
+		count_vm_event(KSTACK_8K);
+#endif
+#if THREAD_SIZE > 8192
+	else if (used_stack <= 16384)
+		count_vm_event(KSTACK_16K);
+#endif
+#if THREAD_SIZE > 16384
+	else if (used_stack <= 32768)
+		count_vm_event(KSTACK_32K);
+#endif
+#if THREAD_SIZE > 32768
+	else if (used_stack <= 65536)
+		count_vm_event(KSTACK_64K);
+#endif
+#if THREAD_SIZE > 65536
+	else
+		count_vm_event(KSTACK_REST);
+#endif
+#endif /* CONFIG_VM_EVENT_COUNTERS */
+}
+
 static void check_stack_usage(void)
 {
 	static DEFINE_SPINLOCK(low_water_lock);
@@ -785,6 +822,7 @@ static void check_stack_usage(void)
 	unsigned long free;
 
 	free = stack_not_used(current);
+	kstack_histogram(THREAD_SIZE - free);
 
 	if (free >= lowest_to_date)
 		return;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 04a1cb6cc636..c7d52a9660c3 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1417,6 +1417,30 @@ const char * const vmstat_text[] = {
 	"vma_lock_retry",
 	"vma_lock_miss",
 #endif
+#ifdef CONFIG_DEBUG_STACK_USAGE
+	"kstack_1k",
+#if THREAD_SIZE > 1024
+	"kstack_2k",
+#endif
+#if THREAD_SIZE > 2048
+	"kstack_4k",
+#endif
+#if THREAD_SIZE > 4096
+	"kstack_8k",
+#endif
+#if THREAD_SIZE > 8192
+	"kstack_16k",
+#endif
+#if THREAD_SIZE > 16384
+	"kstack_32k",
+#endif
+#if THREAD_SIZE > 32768
+	"kstack_64k",
+#endif
+#if THREAD_SIZE > 65536
+	"kstack_rest",
+#endif
+#endif
 #endif /* CONFIG_VM_EVENT_COUNTERS || CONFIG_MEMCG */
 };
 #endif /* CONFIG_PROC_FS || CONFIG_SYSFS || CONFIG_NUMA || CONFIG_MEMCG */
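
For readers who want to reason about the buckets outside the kernel, the
following is a minimal userspace sketch of the selection logic in
kstack_histogram() together with a helper for summarizing the counters.
It is not part of the patch: kstack_bucket_kb() and kstack_share_within()
are hypothetical names, thread_size is passed as a runtime argument
rather than the compile-time THREAD_SIZE, and a loop replaces the
preprocessor-guarded if/else chain.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace model of the bucket selection (illustrative
 * only). Returns the bucket's upper bound in KiB (1, 2, 4, ... 64), or
 * 0 when the usage exceeds every bucket that exists for thread_size.
 * The patch counts that last case as kstack_rest, and only when
 * THREAD_SIZE > 65536.
 */
static unsigned long kstack_bucket_kb(unsigned long used_stack,
				      unsigned long thread_size)
{
	unsigned long limit;

	for (limit = 1024; limit <= 65536; limit *= 2) {
		/* A bucket above 1K exists only if thread_size > limit / 2. */
		if (limit > 1024 && thread_size <= limit / 2)
			break;
		if (used_stack <= limit)
			return limit / 1024;
	}
	return 0;
}

/*
 * Percentage of exited threads whose stack usage fell in buckets
 * 0..last (inclusive), given counters copied from /proc/vmstat.
 */
static double kstack_share_within(const unsigned long *counts, size_t n,
				  size_t last)
{
	unsigned long total = 0, within = 0;
	size_t i;

	assert(counts != NULL || n == 0);

	for (i = 0; i < n; i++) {
		total += counts[i];
		if (i <= last)
			within += counts[i];
	}
	return total ? 100.0 * (double)within / (double)total : 0.0;
}
```

Feeding the Intel example counters from the commit message
({3, 188, 11391, 243, 0}) to kstack_share_within() with last = 2 gives
about 97.9%, i.e. in that sample nearly all exited threads used at most
4K of kernel stack.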