From patchwork Wed Dec 14 22:51:22 2022
X-Patchwork-Submitter: Yuanchu Xie
X-Patchwork-Id: 13073644
Date: Wed, 14 Dec 2022 14:51:22 -0800
In-Reply-To: <20221214225123.2770216-1-yuanchu@google.com>
References: <20221214225123.2770216-1-yuanchu@google.com>
Message-ID: <20221214225123.2770216-2-yuanchu@google.com>
Subject: [RFC PATCH 1/2] mm: multi-gen LRU: periodic aging
From: Yuanchu Xie
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Yu Zhao
Cc: Andrew Morton, Shakeel Butt, Muchun Song, linux-kernel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Yuanchu Xie

Periodically age MGLRU-enabled lruvecs to turn MGLRU generations into
time-based working set information. This includes an interface to set
the periodic aging interval and a new kthread to perform aging.

memory.periodic_aging: a new root-level only file in cgroupfs
Writing to memory.periodic_aging sets the aging interval and opts into
periodic aging.
kold: a new kthread that ages memcgs based on the set aging interval.

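For illustration only, the interface described above could be driven from userspace roughly as follows. This is a minimal sketch, not part of the patch: set_periodic_aging() is a hypothetical helper, and the path it would normally be given (/sys/fs/cgroup/memory.periodic_aging) assumes cgroup v2 is mounted at /sys/fs/cgroup on a kernel carrying this RFC. The value is in seconds, since kold multiplies the interval by HZ, and writing 0 stops the kthread.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): write the aging interval,
 * in seconds, to the given control file.
 * Returns 0 on success or a negative errno value.
 */
static int set_periodic_aging(const char *path, unsigned int interval_secs)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (!f)
		return -errno;
	if (fprintf(f, "%u\n", interval_secs) < 0)
		err = -EIO;
	/* stdio buffers the write, so a kernel-side error such as
	 * EOPNOTSUPP only becomes visible when the stream is flushed. */
	if (fclose(f) != 0 && !err)
		err = -errno;
	return err;
}
```

A caller would use something like set_periodic_aging("/sys/fs/cgroup/memory.periodic_aging", 60) to opt in with a 60 second interval, and pass 0 to opt out again.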
Signed-off-by: Yuanchu Xie
---
 include/linux/kold.h   |  44 ++++++++++++
 include/linux/mmzone.h |   4 +-
 mm/Makefile            |   3 +
 mm/kold.c              | 150 +++++++++++++++++++++++++++++++++++++++++
 mm/memcontrol.c        |  52 ++++++++++++++
 mm/vmscan.c            |  35 +++++++++-
 6 files changed, 286 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/kold.h
 create mode 100644 mm/kold.c

diff --git a/include/linux/kold.h b/include/linux/kold.h
new file mode 100644
index 000000000000..10b0dbe09a5c
--- /dev/null
+++ b/include/linux/kold.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * Periodic aging for multi-gen LRU
+ *
+ * Copyright (C) 2022 Yuanchu Xie
+ */
+#ifndef KOLD_H_
+#define KOLD_H_
+
+#include
+
+struct kold_stats {
+	/* late is defined as spending an entire interval aging without sleep
+	 * stat is aggregated every aging interval
+	 */
+	unsigned int late_count;
+};
+
+int kold_set_interval(unsigned int interval);
+unsigned int kold_get_interval(void);
+int kold_get_stats(struct kold_stats *stats);
+
+/* returns the creation timestamp of the youngest generation */
+unsigned long lru_gen_force_age_lruvec(struct mem_cgroup *memcg, int nid,
+				       unsigned long min_ttl);
+
+#ifndef CONFIG_MEMCG
+int kold_set_interval(unsigned int interval)
+{
+	return 0;
+}
+
+unsigned int kold_get_interval(void)
+{
+	return 0;
+}
+
+int kold_get_stats(struct kold_stats *stats)
+{
+	return -1;
+}
+#endif /* CONFIG_MEMCG */
+
+#endif /* KOLD_H_ */
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 5f74891556f3..929c777b826a 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1218,7 +1218,9 @@ typedef struct pglist_data {
 
 #ifdef CONFIG_LRU_GEN
 	/* kswap mm walk data */
-	struct lru_gen_mm_walk mm_walk;
+	struct lru_gen_mm_walk mm_walk;
+	/* kold periodic aging walk data */
+	struct lru_gen_mm_walk kold_mm_walk;
 #endif
 
 	CACHELINE_PADDING(_pad2_);
diff --git a/mm/Makefile b/mm/Makefile
index 8e105e5b3e29..8bd554a6eb7d 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -98,6 +98,9 @@ obj-$(CONFIG_DEVICE_MIGRATION) += migrate_device.o
 obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += huge_memory.o khugepaged.o
 obj-$(CONFIG_PAGE_COUNTER) += page_counter.o
 obj-$(CONFIG_MEMCG) += memcontrol.o vmpressure.o
+ifdef CONFIG_LRU_GEN
+obj-$(CONFIG_MEMCG) += kold.o
+endif
 ifdef CONFIG_SWAP
 obj-$(CONFIG_MEMCG) += swap_cgroup.o
 endif
diff --git a/mm/kold.c b/mm/kold.c
new file mode 100644
index 000000000000..094574177968
--- /dev/null
+++ b/mm/kold.c
@@ -0,0 +1,150 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2022 Yuanchu Xie
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+static struct task_struct *kold_thread __read_mostly;
+/* protects kold_thread */
+static DEFINE_MUTEX(kold_mutex);
+
+static unsigned int aging_interval __read_mostly;
+static unsigned int late_count;
+
+/* try to move to a cpu on the target node */
+static void try_move_current_to_node(int nid)
+{
+	struct cpumask node_cpus;
+
+	cpumask_and(&node_cpus, cpumask_of_node(nid), cpu_online_mask);
+	if (!cpumask_empty(&node_cpus))
+		set_cpus_allowed_ptr(current, &node_cpus);
+}
+
+static int kold_run(void *none)
+{
+	int nid;
+	unsigned int flags;
+	unsigned long last_interval_start_time = jiffies;
+	bool sleep_since_last_full_scan = false;
+	struct mem_cgroup *memcg;
+	struct reclaim_state reclaim_state = {};
+
+	while (!kthread_should_stop()) {
+		unsigned long interval =
+			(unsigned long)(READ_ONCE(aging_interval)) * HZ;
+		unsigned long next_wakeup_tick = jiffies + interval;
+		long timeout_ticks;
+
+		current->reclaim_state = &reclaim_state;
+		flags = memalloc_noreclaim_save();
+
+		for_each_node_state(nid, N_MEMORY) {
+			pg_data_t *pgdat = NODE_DATA(nid);
+
+			try_move_current_to_node(nid);
+			reclaim_state.mm_walk = &pgdat->kold_mm_walk;
+
+			memcg = mem_cgroup_iter(NULL, NULL, NULL);
+			do {
+				unsigned long young_timestamp =
+					lru_gen_force_age_lruvec(memcg, nid,
+								 interval);
+
+				if (time_before(young_timestamp + interval,
+						next_wakeup_tick)) {
+					next_wakeup_tick = young_timestamp + interval;
+				}
+			} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)));
+		}
+
+		memalloc_noreclaim_restore(flags);
+		current->reclaim_state = NULL;
+
+		/* late_count stats update */
+		if (time_is_before_jiffies(last_interval_start_time + interval)) {
+			last_interval_start_time += interval;
+			if (!sleep_since_last_full_scan) {
+				WRITE_ONCE(late_count,
+					   READ_ONCE(late_count) + 1);
+			}
+			sleep_since_last_full_scan = false;
+		}
+
+		/* sleep until next aging */
+		timeout_ticks = -(long)(jiffies - next_wakeup_tick);
+		if (timeout_ticks > 0 && timeout_ticks != MAX_SCHEDULE_TIMEOUT) {
+			sleep_since_last_full_scan = true;
+			schedule_timeout_idle(timeout_ticks);
+		}
+	}
+	return 0;
+}
+
+int kold_get_stats(struct kold_stats *stats)
+{
+	stats->late_count = READ_ONCE(late_count);
+	return 0;
+}
+
+unsigned int kold_get_interval(void)
+{
+	return READ_ONCE(aging_interval);
+}
+
+int kold_set_interval(unsigned int interval)
+{
+	int err = 0;
+
+	mutex_lock(&kold_mutex);
+	if (interval && !kold_thread) {
+		if (!lru_gen_enabled()) {
+			err = -EOPNOTSUPP;
+			goto cleanup;
+		}
+		kold_thread = kthread_create(kold_run, NULL, "kold");
+
+		if (IS_ERR(kold_thread)) {
+			pr_err("kold: kthread_run(kold_run) failed\n");
+			err = PTR_ERR(kold_thread);
+			kold_thread = NULL;
+			goto cleanup;
+		}
+		WRITE_ONCE(aging_interval, interval);
+		wake_up_process(kold_thread);
+	} else {
+		if (!interval && kold_thread) {
+			kthread_stop(kold_thread);
+			kold_thread = NULL;
+		}
+		WRITE_ONCE(aging_interval, interval);
+	}
+
+cleanup:
+	mutex_unlock(&kold_mutex);
+	return err;
+}
+
+static int __init kold_init(void)
+{
+	return 0;
+}
+
+module_init(kold_init);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2d8549ae1b30..7d2fb3fc4580 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -63,6 +63,7 @@
 #include
 #include
 #include
+#include
 #include "internal.h"
 #include
 #include
@@ -6569,6 +6570,49 @@ static ssize_t memory_oom_group_write(struct kernfs_open_file *of,
 	return nbytes;
 }
 
+#ifdef CONFIG_LRU_GEN
+static int memory_periodic_aging_show(struct seq_file *m, void *v)
+{
+	unsigned int interval = kold_get_interval();
+	struct kold_stats stats;
+	int err;
+
+	err = kold_get_stats(&stats);
+
+	if (err)
+		return err;
+
+	seq_printf(m, "aging_interval %u\n", interval);
+	seq_printf(m, "late_count %u\n", stats.late_count);
+	return 0;
+}
+
+static ssize_t memory_periodic_aging_write(struct kernfs_open_file *of,
+					   char *buf, size_t nbytes,
+					   loff_t off)
+{
+	unsigned int new_interval;
+	int err;
+
+	if (!lru_gen_enabled())
+		return -EOPNOTSUPP;
+
+	buf = strstrip(buf);
+	if (!buf)
+		return -EINVAL;
+
+	err = kstrtouint(buf, 0, &new_interval);
+	if (err)
+		return err;
+
+	err = kold_set_interval(new_interval);
+	if (err)
+		return err;
+
+	return nbytes;
+}
+#endif /* CONFIG_LRU_GEN */
+
 static ssize_t memory_reclaim(struct kernfs_open_file *of, char *buf,
 			      size_t nbytes, loff_t off)
 {
@@ -6679,6 +6723,14 @@ static struct cftype memory_files[] = {
 		.flags = CFTYPE_NS_DELEGATABLE,
 		.write = memory_reclaim,
 	},
+#ifdef CONFIG_LRU_GEN
+	{
+		.name = "periodic_aging",
+		.flags = CFTYPE_ONLY_ON_ROOT,
+		.seq_show = memory_periodic_aging_show,
+		.write = memory_periodic_aging_write,
+	},
+#endif
 	{ }	/* terminate */
 };
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 04d8b88e5216..0fea21366fc8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -54,6 +54,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -5279,8 +5280,10 @@ static void lru_gen_change_state(bool enabled)
 
 	if (enabled)
 		static_branch_enable_cpuslocked(&lru_gen_caps[LRU_GEN_CORE]);
-	else
+	else {
 		static_branch_disable_cpuslocked(&lru_gen_caps[LRU_GEN_CORE]);
+		kold_set_interval(0);
+	}
 
 	memcg = mem_cgroup_iter(NULL, NULL, NULL);
 	do {
@@ -5760,6 +5763,36 @@ static const struct file_operations lru_gen_ro_fops = {
 	.release = seq_release,
 };
 
+/******************************************************************************
+ *                          periodic aging (kold)
+ ******************************************************************************/
+
+/* age lruvec as long as it is older than min_ttl,
+ * return the timestamp of the youngest generation
+ */
+unsigned long lru_gen_force_age_lruvec(struct mem_cgroup *memcg, int nid,
+				       unsigned long min_ttl)
+{
+	struct scan_control sc = {
+		.may_writepage = true,
+		.may_unmap = true,
+		.may_swap = true,
+		.reclaim_idx = MAX_NR_ZONES - 1,
+		.gfp_mask = GFP_KERNEL,
+	};
+	struct lruvec *lruvec = get_lruvec(memcg, nid);
+	DEFINE_MAX_SEQ(lruvec);
+	int gen = lru_gen_from_seq(max_seq);
+	unsigned long birth_timestamp =
+		READ_ONCE(lruvec->lrugen.timestamps[gen]);
+
+	if (time_is_before_jiffies(birth_timestamp + min_ttl))
+		try_to_inc_max_seq(lruvec, max_seq, &sc, true, true);
+
+	return READ_ONCE(lruvec->lrugen.timestamps[lru_gen_from_seq(
+		READ_ONCE((lruvec)->lrugen.max_seq))]);
+}
+
 /******************************************************************************
  *                          initialization
  ******************************************************************************/

From patchwork Wed Dec 14 22:51:23 2022
X-Patchwork-Submitter: Yuanchu Xie
X-Patchwork-Id: 13073645
Date: Wed, 14 Dec 2022 14:51:23 -0800
In-Reply-To: <20221214225123.2770216-1-yuanchu@google.com>
References: <20221214225123.2770216-1-yuanchu@google.com>
Message-ID: <20221214225123.2770216-3-yuanchu@google.com>
Subject: [RFC PATCH 2/2] mm: multi-gen LRU: cgroup working set stats
From: Yuanchu Xie
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Yu Zhao
Cc: Andrew Morton, Shakeel Butt, Muchun Song, linux-kernel@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, Yuanchu Xie

Expose MGLRU generations as working set stats in cgroupfs as
memory.page_idle_age, where we group pages into idle age ranges, and
present the number of pages per node per pagetype in each range. This
aggregates the time information from MGLRU generations hierarchically.

Signed-off-by: Yuanchu Xie
---
 mm/memcontrol.c | 136 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 136 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7d2fb3fc4580..86554e17be58 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1655,6 +1655,130 @@ void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
 	pr_info("%s", buf);
 }
 
+#ifdef CONFIG_LRU_GEN
+static const unsigned long page_idle_age_ranges[] = {
+	1, 2, 5, 10, 20, 30, 45, 60, 90, 120, 180,
+	240, 360, 480, 720, 960, 1440, 1920, 2880, 3840, -1
+};
+
+#define PAGE_IDLE_AGE_NR_RANGES ARRAY_SIZE(page_idle_age_ranges)
+
+static unsigned int lru_gen_time_to_page_idle_age_range(unsigned long timestamp)
+{
+	unsigned int i;
+	unsigned long gen_age = jiffies_to_msecs(jiffies - timestamp) / MSEC_PER_SEC;
+
+	for (i = 0; i < PAGE_IDLE_AGE_NR_RANGES - 1; ++i)
+		if (gen_age <= page_idle_age_ranges[i])
+			return i;
+
+	return PAGE_IDLE_AGE_NR_RANGES - 1;
+}
+
+static void lru_gen_fill_page_idle_age_table(unsigned long *table,
+					     struct lru_gen_struct *lrugen,
+					     int nid)
+{
+	unsigned long max_seq = READ_ONCE(lrugen->max_seq);
+	unsigned long min_seq[ANON_AND_FILE] = {
+		READ_ONCE(lrugen->min_seq[LRU_GEN_ANON]),
+		READ_ONCE(lrugen->min_seq[LRU_GEN_FILE]),
+	};
+	unsigned long seq;
+	unsigned int pagetype;
+
+	/*
+	 * what do we want to do here?
+	 * iterate over all the generations, for each anon and file
+	 */
+
+	for (pagetype = LRU_GEN_ANON; pagetype < ANON_AND_FILE; ++pagetype) {
+		for (seq = min_seq[pagetype]; seq <= max_seq; ++seq) {
+			unsigned int zone;
+			unsigned int gen = lru_gen_from_seq(seq);
+			unsigned int idle_age = lru_gen_time_to_page_idle_age_range(
+				READ_ONCE(lrugen->timestamps[gen]));
+			unsigned long page_count = 0;
+
+			for (zone = 0; zone < MAX_NR_ZONES; ++zone) {
+				page_count += READ_ONCE(
+					lrugen->nr_pages[gen][pagetype][zone]);
+			}
+			table[pagetype * PAGE_IDLE_AGE_NR_RANGES *
+				      nr_node_ids +
+			      PAGE_IDLE_AGE_NR_RANGES * nid + idle_age] +=
+				page_count;
+		}
+	}
+}
+
+static void memory_page_idle_age_print(struct seq_file *m, unsigned long *table)
+{
+	static const char *type_str[ANON_AND_FILE] = { "anon", "file" };
+	unsigned int i, nid, pagetype;
+	unsigned int lower = 0;
+
+	for (i = 0; i < PAGE_IDLE_AGE_NR_RANGES; ++i) {
+		unsigned int upper = page_idle_age_ranges[i];
+
+		for (pagetype = LRU_GEN_ANON; pagetype < ANON_AND_FILE;
+		     ++pagetype) {
+			if (upper == -1)
+				seq_printf(m, "%u-inf %s", lower,
+					   type_str[pagetype]);
+			else
+				seq_printf(m, "%u-%u %s", lower, upper,
+					   type_str[pagetype]);
+			for_each_node_state(nid, N_MEMORY) {
+				unsigned long page_count = table
+					[pagetype *
+						 PAGE_IDLE_AGE_NR_RANGES *
+						 nr_node_ids +
+					 PAGE_IDLE_AGE_NR_RANGES * nid +
+					 i];
+				seq_printf(m, " N%u=%lu", nid, page_count);
+			}
+			seq_puts(m, "\n");
+		}
+
+		lower = upper;
+	}
+}
+
+static int memory_page_idle_age_format(struct mem_cgroup *root,
+				       struct seq_file *m)
+{
+	struct mem_cgroup *memcg;
+	unsigned long *table;
+
+	/*
+	 * table contains PAGE_IDLE_AGE_NR_RANGES entries
+	 * per node per pagetype
+	 */
+	table = kmalloc_array(PAGE_IDLE_AGE_NR_RANGES * nr_node_ids *
+				      ANON_AND_FILE,
+			      sizeof(*table), __GFP_ZERO | GFP_KERNEL);
+
+	if (!table)
+		return -ENOMEM;
+
+	memcg = mem_cgroup_iter(root, NULL, NULL);
+	do {
+		int nid;
+
+		for_each_node_state(nid, N_MEMORY) {
+			struct lru_gen_struct *lrugen =
+				&memcg->nodeinfo[nid]->lruvec.lrugen;
+
+			lru_gen_fill_page_idle_age_table(table, lrugen, nid);
+		}
+	} while ((memcg = mem_cgroup_iter(root, memcg, NULL)));
+
+	memory_page_idle_age_print(m, table);
+	return 0;
+}
+#endif /* CONFIG_LRU_GEN */
+
 /*
  * Return the memory (and swap, if configured) limit for a memcg.
  */
@@ -6571,6 +6695,13 @@ static ssize_t memory_oom_group_write(struct kernfs_open_file *of,
 }
 
 #ifdef CONFIG_LRU_GEN
+static int memory_page_idle_age_show(struct seq_file *m, void *v)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
+
+	return memory_page_idle_age_format(memcg, m);
+}
+
 static int memory_periodic_aging_show(struct seq_file *m, void *v)
 {
 	unsigned int interval = kold_get_interval();
@@ -6724,6 +6855,11 @@ static struct cftype memory_files[] = {
 		.write = memory_reclaim,
 	},
 #ifdef CONFIG_LRU_GEN
+	{
+		.name = "page_idle_age",
+		.flags = CFTYPE_NS_DELEGATABLE,
+		.seq_show = memory_page_idle_age_show,
+	},
 	{
 		.name = "periodic_aging",
 		.flags = CFTYPE_ONLY_ON_ROOT,
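
For readers following how memory.page_idle_age buckets generations, the range lookup and the flat table layout from patch 2/2 can be reproduced in userspace. This is only a sketch under stated assumptions: idle_age_range() and table_index() are hypothetical stand-ins, the age is passed in seconds directly rather than derived from jiffies, and the node count is an explicit parameter instead of nr_node_ids.

```c
#include <stddef.h>

/* Upper bounds of the idle age ranges in seconds, copied from the patch;
 * the final (unsigned long)-1 entry stands for "infinity". */
static const unsigned long page_idle_age_ranges[] = {
	1, 2, 5, 10, 20, 30, 45, 60, 90, 120, 180,
	240, 360, 480, 720, 960, 1440, 1920, 2880, 3840, (unsigned long)-1
};

#define PAGE_IDLE_AGE_NR_RANGES \
	(sizeof(page_idle_age_ranges) / sizeof(page_idle_age_ranges[0]))

/* Hypothetical stand-in for lru_gen_time_to_page_idle_age_range():
 * return the index of the first range whose upper bound covers the age. */
static unsigned int idle_age_range(unsigned long age_secs)
{
	unsigned int i;

	for (i = 0; i < PAGE_IDLE_AGE_NR_RANGES - 1; ++i)
		if (age_secs <= page_idle_age_ranges[i])
			return i;
	return PAGE_IDLE_AGE_NR_RANGES - 1;
}

/* Flat index into a [pagetype][node][range] table, using the same
 * layout as the table filled and printed in the patch. */
static size_t table_index(unsigned int pagetype, unsigned int nid,
			  unsigned int range, unsigned int nr_nodes)
{
	return (size_t)pagetype * PAGE_IDLE_AGE_NR_RANGES * nr_nodes +
	       (size_t)PAGE_IDLE_AGE_NR_RANGES * nid + range;
}
```

With this layout a generation that is 3 seconds old falls into the third range (index 2, printed as "2-5"), and anything older than 3840 seconds lands in the final "3840-inf" range.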