From patchwork Sun Sep 18 08:00:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Yu Zhao X-Patchwork-Id: 12979364 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4C1D4C32771 for ; Sun, 18 Sep 2022 08:27:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:Cc:To:From:Subject:References: Mime-Version:Message-Id:In-Reply-To:Date:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=l3rb6F6OavOF4SXoLZBeYJvRUhc0Idr+rBBiJHnfu+U=; b=BmBcXnsfkjDzTPUzKDQUjKJWWG jxo0KEoZUhZcmiWJAm33pYxqOYxy7uq7JSq0pLb87rI75KhbxqYDWY7lEvKVDpfdSRqWRg4RwWNTt GC3JoKXkjQls6jG0/Bj7l2Qrfmi/q26eNXng1C2jOZLpLTXFyGqg9jiSXG5RqZ/1m1UYAyAeRsAvf 3M8c8v6binjdklaZuVQqqDW7EVobsMh3CcyS7Z5rKYQok8rv6SJ61aLnnjn2jraZKIgzrdvIWkVUr WAEHF9VzzLUib8hEuA4pU80H8LEmahUu/YUSUM7JHbbw8jK93E4h99UUKYGX1yETdDKqykLXsqqDH D+1sJiSw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1oZpcn-00DmS9-AJ; Sun, 18 Sep 2022 08:26:01 +0000 Received: from casper.infradead.org ([2001:8b0:10b:1236::1]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oZpc8-00Dm7w-0u for linux-arm-kernel@bombadil.infradead.org; Sun, 18 Sep 2022 08:25:20 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:Content-Type: Cc:To:From:Subject:References:Mime-Version:Message-Id:In-Reply-To:Date:Sender :Reply-To:Content-ID:Content-Description; bh=EXgSguLZRobiNroyWI2fBVAqC0Qss/26Rhq8qK6YDnc=; b=D868vuotqn+0Di/NeKsLagJcfc FEuA/0Iky1AgitlSDvpuOao5ApGrNZkMkr4bkqP326riWaK8XPRiXRR+SG1rMmi0JRKrCCm69sgub MQKS02ZJKb/XqCYw5a/pl9q6YQ8PbuWdwLuHtyev5r4f0HFIaiU2Vr58QqhR3pq2z5OHAgVHVTemf UJ2cXyrY7fK8DFinC/8X1XFjmFNuGWTlami6If4W2feq4mxK9HZAxS4hCEVAU1WmcID9FpOj7zX1/ Pty9OgzbpT6WkJ8VGBkxTzcah/YmizlWez64LKuS3jF7yVsgp5kGJc9QZOG1f7wXvCaygVpHv/5jB 510EO8aQ==; Received: from mail-io1-xd49.google.com ([2607:f8b0:4864:20::d49]) by casper.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oZpEw-003s88-5N for linux-arm-kernel@lists.infradead.org; Sun, 18 Sep 2022 08:01:25 +0000 Received: by mail-io1-xd49.google.com with SMTP id b16-20020a5d8950000000b006891a850acfso13719439iot.19 for ; Sun, 18 Sep 2022 01:01:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=content-transfer-encoding:cc:to:from:subject:references :mime-version:message-id:in-reply-to:date:from:to:cc:subject:date; bh=EXgSguLZRobiNroyWI2fBVAqC0Qss/26Rhq8qK6YDnc=; b=llRmi0Os3YHma20iSdKa8ePymL93BOIFj37sUcmCGRt0DbvoAczQExbPPkDlXFs6YH P2L48gOqnAq5g0YUO5EtGMzG0WIqV1zS1PzoAroKaXNIHKUCqk7dPPZwjJhWgX1B0BX2 4sRCLoAy9aGfXEVMvbDuqvArHNjRLyp0cC6839dKb0/vVTlQUtzoxw3/1oOXIr5vfWzb jJvQSiL3tw0AU4A7AUzjW766Xb5WcHL/2BILeZZU5tTorsGpdvi1q6R5i0DyfpYjMSRT CElOnBxxjhbXz6gShHQK9bM7cL7OWnzKFPj2/M7HW0aumHBf0JT343lo4V6r8QmrtKPP 6aYA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:from:subject:references :mime-version:message-id:in-reply-to:date:x-gm-message-state:from:to :cc:subject:date; bh=EXgSguLZRobiNroyWI2fBVAqC0Qss/26Rhq8qK6YDnc=; b=jGq2+U1qsptLsO6LA9gnkEBKpgTs4YnlzC+Bk4ICiLUzt6iVsssW7ZNxjKw+CSS5oO mGLHKr29LuHBQffHmWffF4+mUXWO3UqMAj9zrQinTtIiF476A5yxmDeJ66cS48asN3aH 5FdawOzgigxZINNbBDOWPsek0LG/tSxm4MrQD4iAU97bV3rH7YEb89kyhKZo+Nd8a8VY 1uLyERSfqbcy5mhddvJQX0EjXx/FuCSd37vHZLnImj9vE0DIRSDgH6Bwtv0vNzflxcN4 lXVFYhXftXeMoI9/sCi0sq9BJixnH9/6wCjA08gsv4HNOW71gv+yDfl7DcgV8fWkaC2T akpQ== X-Gm-Message-State: ACrzQf3LuY0XxQFHJjIXwO48ULXot3ALgdmePDjTfKJTwZn+XNGq1Jib +NXH4xizNqWPXcIX5wjDIuRQINX0ucQ= X-Google-Smtp-Source: AMsMyM6QyOf9srlrwgCmoxKklddNTWk0zPTK7jfVWSVOprIq1fquKr7a+MN8qW5AkhXXVxw37pTHK9mQxnQ= X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:c05a:2e99:29cd:d157]) (user=yuzhao job=sendgmr) by 2002:a92:cdaa:0:b0:2f5:8f98:3ec2 with SMTP id g10-20020a92cdaa000000b002f58f983ec2mr569649ild.93.1663488076207; Sun, 18 Sep 2022 01:01:16 -0700 (PDT) Date: Sun, 18 Sep 2022 02:00:07 -0600 In-Reply-To: <20220918080010.2920238-1-yuzhao@google.com> Message-Id: <20220918080010.2920238-11-yuzhao@google.com> Mime-Version: 1.0 References: <20220918080010.2920238-1-yuzhao@google.com> X-Mailer: git-send-email 2.37.3.968.ga6b4b080e4-goog Subject: [PATCH mm-unstable v15 10/14] mm: multi-gen LRU: kill switch From: Yu Zhao To: Andrew Morton Cc: Andi Kleen , Aneesh Kumar , Catalin Marinas , Dave Hansen , Hillf Danton , Jens Axboe , Johannes Weiner , Jonathan Corbet , Linus Torvalds , Matthew Wilcox , Mel Gorman , Michael Larabel , Michal Hocko , Mike Rapoport , Peter Zijlstra , Tejun Heo , Vlastimil Babka , Will Deacon , linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org, page-reclaim@google.com, Yu Zhao , Brian Geffon , Jan Alexander Steffens , Oleksandr Natalenko , Steven Barrett , Suleiman Souhlal , Daniel Byrne , Donald Carr , " =?utf-8?q?Holger_Hoffst=C3=A4tte?= " , Konstantin Kharlamov , Shuang Zhai , Sofia Trinh , Vaibhav Jain X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220918_090122_324005_2B0EA3C3 X-CRM114-Status: GOOD ( 24.81 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Add /sys/kernel/mm/lru_gen/enabled as a kill switch. Components that can be disabled include: 0x0001: the multi-gen LRU core 0x0002: walking page table, when arch_has_hw_pte_young() returns true 0x0004: clearing the accessed bit in non-leaf PMD entries, when CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG=y [yYnN]: apply to all the components above E.g., echo y >/sys/kernel/mm/lru_gen/enabled cat /sys/kernel/mm/lru_gen/enabled 0x0007 echo 5 >/sys/kernel/mm/lru_gen/enabled cat /sys/kernel/mm/lru_gen/enabled 0x0005 NB: the page table walks happen on the scale of seconds under heavy memory pressure, in which case the mmap_lock contention is a lesser concern, compared with the LRU lock contention and the I/O congestion. So far the only well-known case of the mmap_lock contention happens on Android, due to Scudo [1] which allocates several thousand VMAs for merely a few hundred MBs. The SPF and the Maple Tree also have provided their own assessments [2][3]. However, if walking page tables does worsen the mmap_lock contention, the kill switch can be used to disable it. In this case the multi-gen LRU will suffer a minor performance degradation, as shown previously. Clearing the accessed bit in non-leaf PMD entries can also be disabled, since this behavior was not tested on x86 varieties other than Intel and AMD. [1] https://source.android.com/devices/tech/debug/scudo [2] https://lore.kernel.org/r/20220128131006.67712-1-michel@lespinasse.org/ [3] https://lore.kernel.org/r/20220426150616.3937571-1-Liam.Howlett@oracle.com/ Signed-off-by: Yu Zhao Acked-by: Brian Geffon Acked-by: Jan Alexander Steffens (heftig) Acked-by: Oleksandr Natalenko Acked-by: Steven Barrett Acked-by: Suleiman Souhlal Tested-by: Daniel Byrne Tested-by: Donald Carr Tested-by: Holger Hoffstätte Tested-by: Konstantin Kharlamov Tested-by: Shuang Zhai Tested-by: Sofia Trinh Tested-by: Vaibhav Jain --- include/linux/cgroup.h | 15 ++- include/linux/mm_inline.h | 15 ++- include/linux/mmzone.h | 9 ++ kernel/cgroup/cgroup-internal.h | 1 - mm/Kconfig | 6 + mm/vmscan.c | 228 +++++++++++++++++++++++++++++++- 6 files changed, 265 insertions(+), 9 deletions(-) diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index ac5d0515680e..9179463c3c9f 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -432,6 +432,18 @@ static inline void cgroup_put(struct cgroup *cgrp) css_put(&cgrp->self); } +extern struct mutex cgroup_mutex; + +static inline void cgroup_lock(void) +{ + mutex_lock(&cgroup_mutex); +} + +static inline void cgroup_unlock(void) +{ + mutex_unlock(&cgroup_mutex); +} + /** * task_css_set_check - obtain a task's css_set with extra access conditions * @task: the task to obtain css_set for @@ -446,7 +458,6 @@ static inline void cgroup_put(struct cgroup *cgrp) * as locks used during the cgroup_subsys::attach() methods. */ #ifdef CONFIG_PROVE_RCU -extern struct mutex cgroup_mutex; extern spinlock_t css_set_lock; #define task_css_set_check(task, __c) \ rcu_dereference_check((task)->cgroups, \ @@ -708,6 +719,8 @@ struct cgroup; static inline u64 cgroup_id(const struct cgroup *cgrp) { return 1; } static inline void css_get(struct cgroup_subsys_state *css) {} static inline void css_put(struct cgroup_subsys_state *css) {} +static inline void cgroup_lock(void) {} +static inline void cgroup_unlock(void) {} static inline int cgroup_attach_task_all(struct task_struct *from, struct task_struct *t) { return 0; } static inline int cgroupstats_build(struct cgroupstats *stats, diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index f2b2296a42f9..4949eda9a9a2 100644 --- a/include/linux/mm_inline.h +++ b/include/linux/mm_inline.h @@ -106,10 +106,21 @@ static __always_inline enum lru_list folio_lru_list(struct folio *folio) #ifdef CONFIG_LRU_GEN +#ifdef CONFIG_LRU_GEN_ENABLED static inline bool lru_gen_enabled(void) { - return true; + DECLARE_STATIC_KEY_TRUE(lru_gen_caps[NR_LRU_GEN_CAPS]); + + return static_branch_likely(&lru_gen_caps[LRU_GEN_CORE]); } +#else +static inline bool lru_gen_enabled(void) +{ + DECLARE_STATIC_KEY_FALSE(lru_gen_caps[NR_LRU_GEN_CAPS]); + + return static_branch_unlikely(&lru_gen_caps[LRU_GEN_CORE]); +} +#endif static inline bool lru_gen_in_fault(void) { @@ -222,7 +233,7 @@ static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio, VM_WARN_ON_ONCE_FOLIO(gen != -1, folio); - if (folio_test_unevictable(folio)) + if (folio_test_unevictable(folio) || !lrugen->enabled) return false; /* * There are three common cases for this page: diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index b1635c4020dc..95c58c7fbdff 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -387,6 +387,13 @@ enum { LRU_GEN_FILE, }; +enum { + LRU_GEN_CORE, + LRU_GEN_MM_WALK, + LRU_GEN_NONLEAF_YOUNG, + NR_LRU_GEN_CAPS +}; + #define MIN_LRU_BATCH BITS_PER_LONG #define MAX_LRU_BATCH (MIN_LRU_BATCH * 64) @@ -428,6 +435,8 @@ struct lru_gen_struct { /* can be modified without holding the LRU lock */ atomic_long_t evicted[NR_HIST_GENS][ANON_AND_FILE][MAX_NR_TIERS]; atomic_long_t refaulted[NR_HIST_GENS][ANON_AND_FILE][MAX_NR_TIERS]; + /* whether the multi-gen LRU is enabled */ + bool enabled; }; enum { diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h index 36b740cb3d59..63dc3e82be4f 100644 --- a/kernel/cgroup/cgroup-internal.h +++ b/kernel/cgroup/cgroup-internal.h @@ -164,7 +164,6 @@ struct cgroup_mgctx { #define DEFINE_CGROUP_MGCTX(name) \ struct cgroup_mgctx name = CGROUP_MGCTX_INIT(name) -extern struct mutex cgroup_mutex; extern spinlock_t css_set_lock; extern struct cgroup_subsys *cgroup_subsys[]; extern struct list_head cgroup_roots; diff --git a/mm/Kconfig b/mm/Kconfig index 5c5dcbdcfe34..ab6ef5115eb8 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1127,6 +1127,12 @@ config LRU_GEN help A high performance LRU implementation to overcommit memory. +config LRU_GEN_ENABLED + bool "Enable by default" + depends on LRU_GEN + help + This option enables the multi-gen LRU by default. + config LRU_GEN_STATS bool "Full stats for debugging" depends on LRU_GEN diff --git a/mm/vmscan.c b/mm/vmscan.c index 3f83325fdc71..10f31f3c5054 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -51,6 +51,7 @@ #include #include #include +#include #include #include @@ -3070,6 +3071,14 @@ static bool can_age_anon_pages(struct pglist_data *pgdat, #ifdef CONFIG_LRU_GEN +#ifdef CONFIG_LRU_GEN_ENABLED +DEFINE_STATIC_KEY_ARRAY_TRUE(lru_gen_caps, NR_LRU_GEN_CAPS); +#define get_cap(cap) static_branch_likely(&lru_gen_caps[cap]) +#else +DEFINE_STATIC_KEY_ARRAY_FALSE(lru_gen_caps, NR_LRU_GEN_CAPS); +#define get_cap(cap) static_branch_unlikely(&lru_gen_caps[cap]) +#endif + /****************************************************************************** * shorthand helpers ******************************************************************************/ @@ -3946,7 +3955,8 @@ static void walk_pmd_range_locked(pud_t *pud, unsigned long next, struct vm_area goto next; if (!pmd_trans_huge(pmd[i])) { - if (IS_ENABLED(CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG)) + if (IS_ENABLED(CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG) && + get_cap(LRU_GEN_NONLEAF_YOUNG)) pmdp_test_and_clear_young(vma, addr, pmd + i); goto next; } @@ -4044,10 +4054,12 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end, walk->mm_stats[MM_NONLEAF_TOTAL]++; #ifdef CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG - if (!pmd_young(val)) - continue; + if (get_cap(LRU_GEN_NONLEAF_YOUNG)) { + if (!pmd_young(val)) + continue; - walk_pmd_range_locked(pud, addr, vma, args, bitmap, &pos); + walk_pmd_range_locked(pud, addr, vma, args, bitmap, &pos); + } #endif if (!walk->force_scan && !test_bloom_filter(walk->lruvec, walk->max_seq, pmd + i)) continue; @@ -4309,7 +4321,7 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq, * handful of PTEs. Spreading the work out over a period of time usually * is less efficient, but it avoids bursty page faults. */ - if (!arch_has_hw_pte_young()) { + if (!(arch_has_hw_pte_young() && get_cap(LRU_GEN_MM_WALK))) { success = iterate_mm_list_nowalk(lruvec, max_seq); goto done; } @@ -5074,6 +5086,208 @@ static void lru_gen_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc blk_finish_plug(&plug); } +/****************************************************************************** + * state change + ******************************************************************************/ + +static bool __maybe_unused state_is_valid(struct lruvec *lruvec) +{ + struct lru_gen_struct *lrugen = &lruvec->lrugen; + + if (lrugen->enabled) { + enum lru_list lru; + + for_each_evictable_lru(lru) { + if (!list_empty(&lruvec->lists[lru])) + return false; + } + } else { + int gen, type, zone; + + for_each_gen_type_zone(gen, type, zone) { + if (!list_empty(&lrugen->lists[gen][type][zone])) + return false; + } + } + + return true; +} + +static bool fill_evictable(struct lruvec *lruvec) +{ + enum lru_list lru; + int remaining = MAX_LRU_BATCH; + + for_each_evictable_lru(lru) { + int type = is_file_lru(lru); + bool active = is_active_lru(lru); + struct list_head *head = &lruvec->lists[lru]; + + while (!list_empty(head)) { + bool success; + struct folio *folio = lru_to_folio(head); + + VM_WARN_ON_ONCE_FOLIO(folio_test_unevictable(folio), folio); + VM_WARN_ON_ONCE_FOLIO(folio_test_active(folio) != active, folio); + VM_WARN_ON_ONCE_FOLIO(folio_is_file_lru(folio) != type, folio); + VM_WARN_ON_ONCE_FOLIO(folio_lru_gen(folio) != -1, folio); + + lruvec_del_folio(lruvec, folio); + success = lru_gen_add_folio(lruvec, folio, false); + VM_WARN_ON_ONCE(!success); + + if (!--remaining) + return false; + } + } + + return true; +} + +static bool drain_evictable(struct lruvec *lruvec) +{ + int gen, type, zone; + int remaining = MAX_LRU_BATCH; + + for_each_gen_type_zone(gen, type, zone) { + struct list_head *head = &lruvec->lrugen.lists[gen][type][zone]; + + while (!list_empty(head)) { + bool success; + struct folio *folio = lru_to_folio(head); + + VM_WARN_ON_ONCE_FOLIO(folio_test_unevictable(folio), folio); + VM_WARN_ON_ONCE_FOLIO(folio_test_active(folio), folio); + VM_WARN_ON_ONCE_FOLIO(folio_is_file_lru(folio) != type, folio); + VM_WARN_ON_ONCE_FOLIO(folio_zonenum(folio) != zone, folio); + + success = lru_gen_del_folio(lruvec, folio, false); + VM_WARN_ON_ONCE(!success); + lruvec_add_folio(lruvec, folio); + + if (!--remaining) + return false; + } + } + + return true; +} + +static void lru_gen_change_state(bool enabled) +{ + static DEFINE_MUTEX(state_mutex); + + struct mem_cgroup *memcg; + + cgroup_lock(); + cpus_read_lock(); + get_online_mems(); + mutex_lock(&state_mutex); + + if (enabled == lru_gen_enabled()) + goto unlock; + + if (enabled) + static_branch_enable_cpuslocked(&lru_gen_caps[LRU_GEN_CORE]); + else + static_branch_disable_cpuslocked(&lru_gen_caps[LRU_GEN_CORE]); + + memcg = mem_cgroup_iter(NULL, NULL, NULL); + do { + int nid; + + for_each_node(nid) { + struct lruvec *lruvec = get_lruvec(memcg, nid); + + if (!lruvec) + continue; + + spin_lock_irq(&lruvec->lru_lock); + + VM_WARN_ON_ONCE(!seq_is_valid(lruvec)); + VM_WARN_ON_ONCE(!state_is_valid(lruvec)); + + lruvec->lrugen.enabled = enabled; + + while (!(enabled ? fill_evictable(lruvec) : drain_evictable(lruvec))) { + spin_unlock_irq(&lruvec->lru_lock); + cond_resched(); + spin_lock_irq(&lruvec->lru_lock); + } + + spin_unlock_irq(&lruvec->lru_lock); + } + + cond_resched(); + } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL))); +unlock: + mutex_unlock(&state_mutex); + put_online_mems(); + cpus_read_unlock(); + cgroup_unlock(); +} + +/****************************************************************************** + * sysfs interface + ******************************************************************************/ + +static ssize_t show_enabled(struct kobject *kobj, struct kobj_attribute *attr, char *buf) +{ + unsigned int caps = 0; + + if (get_cap(LRU_GEN_CORE)) + caps |= BIT(LRU_GEN_CORE); + + if (arch_has_hw_pte_young() && get_cap(LRU_GEN_MM_WALK)) + caps |= BIT(LRU_GEN_MM_WALK); + + if (IS_ENABLED(CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG) && get_cap(LRU_GEN_NONLEAF_YOUNG)) + caps |= BIT(LRU_GEN_NONLEAF_YOUNG); + + return snprintf(buf, PAGE_SIZE, "0x%04x\n", caps); +} + +static ssize_t store_enabled(struct kobject *kobj, struct kobj_attribute *attr, + const char *buf, size_t len) +{ + int i; + unsigned int caps; + + if (tolower(*buf) == 'n') + caps = 0; + else if (tolower(*buf) == 'y') + caps = -1; + else if (kstrtouint(buf, 0, &caps)) + return -EINVAL; + + for (i = 0; i < NR_LRU_GEN_CAPS; i++) { + bool enabled = caps & BIT(i); + + if (i == LRU_GEN_CORE) + lru_gen_change_state(enabled); + else if (enabled) + static_branch_enable(&lru_gen_caps[i]); + else + static_branch_disable(&lru_gen_caps[i]); + } + + return len; +} + +static struct kobj_attribute lru_gen_enabled_attr = __ATTR( + enabled, 0644, show_enabled, store_enabled +); + +static struct attribute *lru_gen_attrs[] = { + &lru_gen_enabled_attr.attr, + NULL +}; + +static struct attribute_group lru_gen_attr_group = { + .name = "lru_gen", + .attrs = lru_gen_attrs, +}; + /****************************************************************************** * initialization ******************************************************************************/ @@ -5084,6 +5298,7 @@ void lru_gen_init_lruvec(struct lruvec *lruvec) struct lru_gen_struct *lrugen = &lruvec->lrugen; lrugen->max_seq = MIN_NR_GENS + 1; + lrugen->enabled = lru_gen_enabled(); for_each_gen_type_zone(gen, type, zone) INIT_LIST_HEAD(&lrugen->lists[gen][type][zone]); @@ -5123,6 +5338,9 @@ static int __init init_lru_gen(void) BUILD_BUG_ON(MIN_NR_GENS + 1 >= MAX_NR_GENS); BUILD_BUG_ON(BIT(LRU_GEN_WIDTH) <= MAX_NR_GENS); + if (sysfs_create_group(mm_kobj, &lru_gen_attr_group)) + pr_err("lru_gen: failed to create sysfs group\n"); + return 0; }; late_initcall(init_lru_gen);