From patchwork Wed Jul 24 20:33:21 2024
X-Patchwork-Submitter: Pasha Tatashin
X-Patchwork-Id: 13741344
From: Pasha Tatashin
To: akpm@linux-foundation.org, jpoimboe@kernel.org,
 pasha.tatashin@soleen.com, kent.overstreet@linux.dev,
 peterz@infradead.org, nphamcs@gmail.com, cerasuolodomenico@gmail.com,
 surenb@google.com, lizhijian@fujitsu.com, willy@infradead.org,
 shakeel.butt@linux.dev, vbabka@suse.cz, ziy@nvidia.com,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v5 2/3] vmstat: Kernel stack usage histogram
Date: Wed, 24 Jul 2024 20:33:21 +0000
Message-ID: <20240724203322.2765486-3-pasha.tatashin@soleen.com>
In-Reply-To: <20240724203322.2765486-1-pasha.tatashin@soleen.com>
References: <20240724203322.2765486-1-pasha.tatashin@soleen.com>

As part of the dynamic kernel stack project, we need to know the amount
of data that can be saved by reducing the default kernel stack size [1].

Provide a kernel stack usage histogram to aid in optimizing kernel stack
sizes and minimizing memory waste in large-scale environments.
The histogram divides stack usage into power-of-two buckets and reports
the results in /proc/vmstat. This information is especially valuable in
environments with millions of machines, where even small optimizations
can have a significant impact.

The histogram data is presented in /proc/vmstat with entries like
"kstack_1k", "kstack_2k", and so on, indicating the number of threads
that exited with stack usage falling within each respective bucket.

Example outputs:

Intel:
$ grep kstack /proc/vmstat
kstack_1k 3
kstack_2k 188
kstack_4k 11391
kstack_8k 243
kstack_16k 0

ARM with 64K page_size:
$ grep kstack /proc/vmstat
kstack_1k 1
kstack_2k 340
kstack_4k 25212
kstack_8k 1659
kstack_16k 0
kstack_32k 0
kstack_64k 0

Note: once the dynamic kernel stack is implemented, the usability of
this feature will depend on the implementation. On hardware that
supports faults on kernel stacks, we will have other metrics that show
the total number of pages allocated for stacks. On hardware where faults
are not supported, we will most likely have some optimization where only
some threads are extended, and for those, these metrics will still be
very useful.
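For illustration only, the power-of-two bucketing described above can be
sketched in userspace C. The helper name kstack_bucket() and the
KSTACK_MAX_BUCKET macro are hypothetical and do not appear in the patch;
the patch itself uses a preprocessor-guarded if/else chain so that only
buckets below THREAD_SIZE are compiled in, which this sketch does not
model.

```c
/* Largest bucket the patch names explicitly (kstack_64k), in bytes. */
#define KSTACK_MAX_BUCKET 65536UL

/*
 * Hypothetical helper, not part of the patch: return the upper bound in
 * bytes of the power-of-two bucket that used_stack falls into, or 0 when
 * the usage exceeds the largest named bucket (the case the patch counts
 * as kstack_rest).
 */
static unsigned long kstack_bucket(unsigned long used_stack)
{
	unsigned long limit;

	/* Buckets are inclusive: exactly 1024 bytes still counts as "1k". */
	for (limit = 1024; limit <= KSTACK_MAX_BUCKET; limit <<= 1) {
		if (used_stack <= limit)
			return limit;
	}
	return 0;
}
```

Under these assumptions, a thread that used 3000 bytes of stack would
land in the 4096-byte bucket, matching the "kstack_4k" counter.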

[1] https://lwn.net/Articles/974367

Signed-off-by: Pasha Tatashin
Reviewed-by: Kent Overstreet
Acked-by: Shakeel Butt
---
 include/linux/vm_event_item.h | 24 ++++++++++++++++++++++
 kernel/exit.c                 | 38 +++++++++++++++++++++++++++++++++++
 mm/vmstat.c                   | 24 ++++++++++++++++++++++
 3 files changed, 86 insertions(+)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 747943bc8cc2..37ad1c16367a 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -154,6 +154,30 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		VMA_LOCK_RETRY,
 		VMA_LOCK_MISS,
 #endif
+#ifdef CONFIG_DEBUG_STACK_USAGE
+		KSTACK_1K,
+#if THREAD_SIZE > 1024
+		KSTACK_2K,
+#endif
+#if THREAD_SIZE > 2048
+		KSTACK_4K,
+#endif
+#if THREAD_SIZE > 4096
+		KSTACK_8K,
+#endif
+#if THREAD_SIZE > 8192
+		KSTACK_16K,
+#endif
+#if THREAD_SIZE > 16384
+		KSTACK_32K,
+#endif
+#if THREAD_SIZE > 32768
+		KSTACK_64K,
+#endif
+#if THREAD_SIZE > 65536
+		KSTACK_REST,
+#endif
+#endif /* CONFIG_DEBUG_STACK_USAGE */
 		NR_VM_EVENT_ITEMS
 };
 
diff --git a/kernel/exit.c b/kernel/exit.c
index 7430852a8571..64bfc2bae55b 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -778,6 +778,43 @@ static void exit_notify(struct task_struct *tsk, int group_dead)
 }
 
 #ifdef CONFIG_DEBUG_STACK_USAGE
+/* Count the maximum pages reached in kernel stacks */
+static inline void kstack_histogram(unsigned long used_stack)
+{
+#ifdef CONFIG_VM_EVENT_COUNTERS
+	if (used_stack <= 1024)
+		count_vm_event(KSTACK_1K);
+#if THREAD_SIZE > 1024
+	else if (used_stack <= 2048)
+		count_vm_event(KSTACK_2K);
+#endif
+#if THREAD_SIZE > 2048
+	else if (used_stack <= 4096)
+		count_vm_event(KSTACK_4K);
+#endif
+#if THREAD_SIZE > 4096
+	else if (used_stack <= 8192)
+		count_vm_event(KSTACK_8K);
+#endif
+#if THREAD_SIZE > 8192
+	else if (used_stack <= 16384)
+		count_vm_event(KSTACK_16K);
+#endif
+#if THREAD_SIZE > 16384
+	else if (used_stack <= 32768)
+		count_vm_event(KSTACK_32K);
+#endif
+#if THREAD_SIZE > 32768
+	else if (used_stack <= 65536)
+		count_vm_event(KSTACK_64K);
+#endif
+#if THREAD_SIZE > 65536
+	else
+		count_vm_event(KSTACK_REST);
+#endif
+#endif /* CONFIG_VM_EVENT_COUNTERS */
+}
+
 static void check_stack_usage(void)
 {
 	static DEFINE_SPINLOCK(low_water_lock);
@@ -785,6 +822,7 @@ static void check_stack_usage(void)
 	unsigned long free;
 
 	free = stack_not_used(current);
+	kstack_histogram(THREAD_SIZE - free);
 
 	if (free >= lowest_to_date)
 		return;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 73d791d1caad..6e3347789eb2 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1417,6 +1417,30 @@ const char * const vmstat_text[] = {
 	"vma_lock_retry",
 	"vma_lock_miss",
 #endif
+#ifdef CONFIG_DEBUG_STACK_USAGE
+	"kstack_1k",
+#if THREAD_SIZE > 1024
+	"kstack_2k",
+#endif
+#if THREAD_SIZE > 2048
+	"kstack_4k",
+#endif
+#if THREAD_SIZE > 4096
+	"kstack_8k",
+#endif
+#if THREAD_SIZE > 8192
+	"kstack_16k",
+#endif
+#if THREAD_SIZE > 16384
+	"kstack_32k",
+#endif
+#if THREAD_SIZE > 32768
+	"kstack_64k",
+#endif
+#if THREAD_SIZE > 65536
+	"kstack_rest",
+#endif
+#endif
 #endif /* CONFIG_VM_EVENT_COUNTERS || CONFIG_MEMCG */
 };
 #endif /* CONFIG_PROC_FS || CONFIG_SYSFS || CONFIG_NUMA || CONFIG_MEMCG */
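
As a rough illustration of how the kstack_* entries exported through
/proc/vmstat might be consumed, the following userspace sketch parses
lines of the form "kstack_<N>k <count>" and reports how many exited
threads stayed within a given bucket. The function names
parse_kstack_line() and kstack_sum() are hypothetical and are not part of
this patch. The sketch takes an array of already-read lines; a real tool
would presumably read them with fgets() from /proc/vmstat. It also skips
"kstack_rest", which would need separate handling.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse one line of the form "kstack_<N>k <count>".
 * Returns 1 and fills both outputs on success, 0 for any other line
 * (including "kstack_rest", which has no numeric bucket size).
 */
static int parse_kstack_line(const char *line, unsigned long *limit_kb,
			     unsigned long long *count)
{
	return sscanf(line, "kstack_%luk %llu", limit_kb, count) == 2;
}

/*
 * Hypothetical helper: sum the counts of all numeric kstack buckets whose
 * upper bound is at most max_kb. The sum over every numeric bucket is
 * stored in *total so the caller can compute a ratio.
 */
static unsigned long long kstack_sum(const char *const lines[], size_t n,
				     unsigned long max_kb,
				     unsigned long long *total)
{
	unsigned long long within = 0;
	size_t i;

	*total = 0;
	for (i = 0; i < n; i++) {
		unsigned long limit_kb;
		unsigned long long count;

		if (!parse_kstack_line(lines[i], &limit_kb, &count))
			continue;	/* not a numeric kstack_* line */
		*total += count;
		if (limit_kb <= max_kb)
			within += count;
	}
	return within;
}
```

Applied to the Intel example output in the commit message with max_kb set
to 4, this sketch would count 11582 of 11825 exited threads as having
used at most 4K of stack.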