Message ID | ZcmaPqZ9HzoN0GFM@host1.jankratochvil.net (mailing list archive) |
---|---|
State | New |
Series | [v2] Port hierarchical_{memory,swap}_limit cgroup1->cgroup2 |
Hello.

Something like this would come quite handy.

On Mon, Feb 12, 2024 at 12:10:38PM +0800, "Jan Kratochvil (Azul)" <jkratochvil@azul.com> wrote:
> which are useful for userland to easily and performance-wise find out the
> effective cgroup limits being applied.

And the only way to figure out inside cgroupns.

> But for cgroup2 it has been missing so far, this is just a copy-paste of the
> cgroup1 code while changing s/memsw/swap/ as that is what cgroup1 vs. cgroup2
> tracks. I have added it to the end of "memory.stat" to prevent possible
> compatibility problems with existing code parsing that file.

I was thinking of memory.max.effective (and others).

- no need to (possibly flush) stats when reading memory.stat
- can be generalized also for pids controller (and other "limiting" controllers)
- analogous to precedent of cpuset.cpus.effective

Whereas, using v1 approach in v2:
- memory.stat mixes true stats and limits,
- memory.stat is hierarchical by default, no need for the prefix.

What do you think of the separate .effective file(s)?

Thanks
Michal

On 2/12/24 10:00, Michal Koutný wrote:
> Hello.
>
> Something like this would come quite handy.
>
> On Mon, Feb 12, 2024 at 12:10:38PM +0800, "Jan Kratochvil (Azul)" <jkratochvil@azul.com> wrote:
>> which are useful for userland to easily and performance-wise find out the
>> effective cgroup limits being applied.
> And the only way to figure out inside cgroupns.
>
>> But for cgroup2 it has been missing so far, this is just a copy-paste of the
>> cgroup1 code while changing s/memsw/swap/ as that is what cgroup1 vs. cgroup2
>> tracks. I have added it to the end of "memory.stat" to prevent possible
>> compatibility problems with existing code parsing that file.
> I was thinking of memory.max.effective (and others).
>
> - no need to (possibly flush) stats when reading memory.stat
> - can be generalized also for pids controller (and other "limiting" controllers)
> - analogous to precedent of cpuset.cpus.effective
>
> Whereas, using v1 approach in v2:
> - memory.stat mixes true stats and limits,
> - memory.stat is hierarchical by default, no need for the prefix.
>
> What do you think of the separate .effective file(s)?

This is certainly a good alternative.

Cheers,
Longman

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 46d8d0211..2631dd810 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1636,6 +1636,8 @@ static inline unsigned long memcg_page_state_local_output(
 static void memcg_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
 {
 	int i;
+	unsigned long memory, swap;
+	struct mem_cgroup *mi;
 
 	/*
 	 * Provide statistics on the state of the memory subsystem as
@@ -1682,6 +1684,17 @@ static void memcg_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
 			       memcg_events(memcg, memcg_vm_event_stat[i]));
 	}
 
+	/* Hierarchical information */
+	memory = swap = PAGE_COUNTER_MAX;
+	for (mi = memcg; mi; mi = parent_mem_cgroup(mi)) {
+		memory = min(memory, READ_ONCE(mi->memory.max));
+		swap = min(swap, READ_ONCE(mi->swap.max));
+	}
+	seq_buf_printf(s, "hierarchical_memory_limit %llu\n",
+		       (u64)memory * PAGE_SIZE);
+	seq_buf_printf(s, "hierarchical_swap_limit %llu\n",
+		       (u64)swap * PAGE_SIZE);
+
 	/* The above should easily fit into one page */
 	WARN_ON_ONCE(seq_buf_has_overflowed(s));
 }

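For illustration only, here is a minimal userland sketch of how a consumer might pick the two proposed lines out of the text of "memory.stat". It assumes the "key value" line format used in the diff above; the helper name memory_stat_lookup is hypothetical and not part of the patch or of any existing library.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: find "<key> <decimal value>" in the text of a
 * memory.stat file that has already been read into memory.
 * Returns 0 and stores the value in *out on success, -1 otherwise.
 */
static int memory_stat_lookup(const char *text, const char *key, uint64_t *out)
{
	size_t klen = strlen(key);
	const char *line = text;

	assert(text && key && out);

	while (line && *line) {
		/* The key must match the whole first word of the line. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ' ') {
			const char *p = line + klen + 1;
			char *end;
			unsigned long long v;

			if (*p < '0' || *p > '9')
				return -1;
			errno = 0;
			v = strtoull(p, &end, 10);
			if (errno || (*end != '\n' && *end != '\0'))
				return -1;
			*out = v;
			return 0;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A caller would read "memory.stat" once and look up "hierarchical_memory_limit" and "hierarchical_swap_limit". Note that with the diff as posted an unlimited cgroup is reported as PAGE_COUNTER_MAX * PAGE_SIZE rather than as "max", so the consumer has to treat that large value as "no limit".
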
Hello,

cgroup1 (by function memcg1_stat_format) already contains two lines
	hierarchical_memory_limit %llu
	hierarchical_memsw_limit %llu

which are useful for userland to easily and performance-wise find out the
effective cgroup limits being applied. Otherwise userland has to
open+read+close the file "memory.max" and/or "memory.swap.max" in multiple
parent directories of a nested cgroup.

For cgroup1 it was implemented by:
	memcg: show real limit under hierarchy mode
	https://github.com/torvalds/linux/commit/fee7b548e6f2bd4bfd03a1a45d3afd593de7d5e9
	Date:   Wed Jan 7 18:08:26 2009 -0800

But for cgroup2 it has been missing so far, this is just a copy-paste of the
cgroup1 code while changing s/memsw/swap/ as that is what cgroup1 vs. cgroup2
tracks. I have added it to the end of "memory.stat" to prevent possible
compatibility problems with existing code parsing that file.

Jan Kratochvil

Signed-off-by: Jan Kratochvil (Azul) <jkratochvil@azul.com>

 mm/memcontrol.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)
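
To make the userland cost concrete, the open+read+close walk over parent directories described above could look roughly like the sketch below. It is an illustration under stated assumptions, not code from the patch: the function names (parse_limit, read_limit, effective_limit) are hypothetical, the caller is assumed to pass the cgroup2 mount point and the leaf cgroup directory as absolute paths without trailing slashes, and a missing or unparsable limit file (the root cgroup has no "memory.max") is treated as "no limit at this level".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse the contents of a cgroup2 limit file: "max" or a decimal byte count. */
static int parse_limit(const char *text, uint64_t *out)
{
	char *end;
	unsigned long long v;

	assert(text && out);

	if (strncmp(text, "max", 3) == 0 &&
	    (text[3] == '\0' || text[3] == '\n')) {
		*out = UINT64_MAX;	/* "max" means unlimited */
		return 0;
	}
	if (text[0] < '0' || text[0] > '9')
		return -1;
	errno = 0;
	v = strtoull(text, &end, 10);
	if (errno || (*end != '\0' && *end != '\n'))
		return -1;
	*out = v;
	return 0;
}

/* Read <dir>/<file>; a missing or unparsable file counts as unlimited. */
static uint64_t read_limit(const char *dir, const char *file)
{
	char path[4096 + 64];
	char buf[64];
	uint64_t v = UINT64_MAX;
	FILE *f;

	if (snprintf(path, sizeof(path), "%s/%s", dir, file) >= (int)sizeof(path))
		return UINT64_MAX;
	f = fopen(path, "r");
	if (!f)
		return UINT64_MAX;
	if (!fgets(buf, sizeof(buf), f) || parse_limit(buf, &v) != 0)
		v = UINT64_MAX;
	fclose(f);
	return v;
}

/*
 * Walk from the leaf cgroup directory up to the cgroup2 mount point and
 * return the smallest limit seen, i.e. the effective limit in bytes.
 */
static uint64_t effective_limit(const char *root, const char *leaf,
				const char *file)
{
	char dir[4096];
	size_t root_len = strlen(root);
	uint64_t eff = UINT64_MAX;

	if (strlen(leaf) >= sizeof(dir))
		return eff;
	strcpy(dir, leaf);

	while (strlen(dir) >= root_len && strncmp(dir, root, root_len) == 0) {
		uint64_t cur = read_limit(dir, file);
		char *slash;

		if (cur < eff)
			eff = cur;
		if (strlen(dir) == root_len)
			break;			/* reached the mount point */
		slash = strrchr(dir, '/');
		if (!slash)
			break;
		*slash = '\0';			/* step to the parent directory */
	}
	return eff;
}
```

For example, effective_limit("/sys/fs/cgroup", "/sys/fs/cgroup/a/b", "memory.max") performs one open+read+close per level. Inside a cgroup namespace the ancestors above the namespace root are not visible, so a walk like this cannot see their limits, which is the point raised in the reply about cgroupns.
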