Message ID | 20240606152232.20253-1-mkoutny@suse.com (mailing list archive) |
---|---|
Series | Add memory.max.effective for application's allocators |

On Thu, Jun 06, 2024 at 05:22:29PM +0200, Michal Koutný wrote:
> Some applications use memory cgroup limits to scale their own memory
> needs. Reading of the immediate membership cgroup's memory.max is not
> sufficient because of possible ancestral limits. The application could
> traverse upwards to figure out the tightest limit but this would not
> work in cgroup namespace where the view of cgroup hierarchy is
> incomplete and the limit may apply from outer world.
> Additionally, applications should respond to limit changes.

If the goal is to detect how much memory would it be possible to allocate,
I'm not sure that knowing all memory.max limits upper in the hierarchy
really buys anything without knowing actual usages and a potential
for memory reclaim across the entire tree.

E.g.:

    A (max = 100G)
    | \
    B  C

C's effective max will come out as 100G, but if B.anon_usage = 100G and
there is no swap, the actual number is 0.

But if it's more about exploring the "invisible" part of the cgroup tree
configuration, it makes sense to me. Not sure about the naming, maybe
something like memory.tree.max or memory.parent.max or even
memory.hierarchical.max is a better fit.

Thanks!
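
The A/B/C example above can be replayed with a small model. This is only a sketch under stated assumptions, not the proposed kernel implementation: the `Cgroup` class and the `effective_max` and `headroom` helpers are hypothetical names, usage is treated as unreclaimable (anonymous memory with no swap, as in the example), and "effective max" is taken to mean the tightest memory.max on the path to the root.

```python
from dataclasses import dataclass
from typing import Optional

G = 1 << 30                 # one GiB, in bytes
UNLIMITED = float("inf")    # stands in for the literal "max" in memory.max


@dataclass
class Cgroup:
    """Hypothetical in-memory model of a cgroup, not a kernel structure."""
    name: str
    limit: float = UNLIMITED            # this cgroup's own memory.max
    local_usage: int = 0                # memory charged directly to this cgroup
    parent: Optional["Cgroup"] = None

    def __post_init__(self):
        self.children = []
        if self.parent is not None:
            self.parent.children.append(self)

    def usage(self):
        """Hierarchical usage: own charges plus those of all descendants."""
        return self.local_usage + sum(child.usage() for child in self.children)

    def self_and_ancestors(self):
        node = self
        while node is not None:
            yield node
            node = node.parent


def effective_max(cg):
    """Tightest memory.max on the path from cg up to the root."""
    return min(node.limit for node in cg.self_and_ancestors())


def headroom(cg):
    """Tightest (limit - usage) on the same path, assuming nothing is reclaimable."""
    return min(node.limit - node.usage() for node in cg.self_and_ancestors())


a = Cgroup("A", limit=100 * G)
b = Cgroup("B", local_usage=100 * G, parent=a)   # B.anon_usage = 100G
c = Cgroup("C", parent=a)

print(effective_max(c) // G)   # 100
print(headroom(c) // G)        # 0
```

The two numbers diverge exactly as described: C's limit-only view reports 100G, while the usage-aware view reports nothing left because sibling B has already consumed A's budget.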

On Fri, 07 Jun 2024 02:15:00 +0800, Roman Gushchin wrote:
> If the goal is to detect how much memory would it be possible to allocate,
> I'm not sure that knowing all memory.max limits upper in the hierarchy
> really buys anything without knowing actual usages and a potential
> for memory reclaim across the entire tree.
>
> E.g.:
>
>     A (max = 100G)
>     | \
>     B  C
>
> C's effective max will come out as 100G, but if B.anon_usage = 100G and
> there is no swap, the actual number is 0.

Yes, it would be better to subtract the used memory from ancestor (and thus
even current) cgroups.

The original use case of this feature is for cloud nodes running a single Java
JVM where the sibling cgroups are not an issue.

Jan Kratochvil

Hello.

On Sat, Aug 17, 2024 at 02:00:15PM GMT, Jan Kratochvil <jkratochvil@azul.com> wrote:
> Yes, it would be better to subtract the used memory from ancestor (and thus
> even current) cgroups.

Then it becomes a more dynamic characteristic and it leads to calculations
of available memory. I share a link [1] for completeness and to prevent
repeated discussions (that past one ended up with no memory.stat:avail).

> The original use case of this feature is for cloud nodes running a
> single Java JVM where the sibling cgroups are not an issue.

IIUC, it's a tree like this:

            O
          / | \
         A  B  C    // B:memory.max < O:memory.max
            |
           ...
            |
            W       // workload

This picture made me realize that memory controller may not be even enabled
all the way down from B to W, i.e. W would have no memory.max.effective, IOW
memory.* attribute would not be the right place for such a value.
That would even apply in the apparently purposeful case if there was a cgroup
NS boundary between B and W. (At least in the proposed implementation,
memory.* file would have to be decoupled from memory controller, similarly
to e.g. cpu.stat:usage_usec.)

Jan, do I get the tree shape right?
Are B and W in different cgroup namespaces?

Thanks,
Michal

[1] https://lore.kernel.org/all/alpine.DEB.2.23.453.2007142018150.2667860@chino.kir.corp.google.com/
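
For reference, the userspace workaround that the series wants to replace (walking upwards and taking the tightest memory.max) could look roughly like the sketch below. It is an illustration only: `read_limit` and `visible_effective_max` are hypothetical helper names, the cgroup v2 mount point `/sys/fs/cgroup` is an assumed default, and levels without a memory.max file (the root cgroup, or cgroups where the memory controller is not enabled, as in the B-to-W path above) are simply skipped. It can only see ancestors visible in the caller's cgroup namespace, so a limit applied outside a namespace boundary stays invisible to it.

```python
import os


def read_limit(path):
    """Parse a cgroup v2 limit file; the literal "max" means unlimited."""
    with open(path) as f:
        text = f.read().strip()
    return float("inf") if text == "max" else int(text)


def visible_effective_max(cgroup_dir, cgroup_root="/sys/fs/cgroup"):
    """Tightest memory.max among cgroup_dir and its visible ancestors.

    Returns None if no visited level exposes a memory.max file.
    """
    root = os.path.realpath(cgroup_root)
    node = os.path.realpath(cgroup_dir)
    tightest = None
    while True:
        limit_file = os.path.join(node, "memory.max")
        # The file is absent where the memory controller is not enabled.
        if os.path.isfile(limit_file):
            limit = read_limit(limit_file)
            tightest = limit if tightest is None else min(tightest, limit)
        # Stop at the cgroup root (or at "/" if cgroup_dir is not under it).
        if node == root or os.path.dirname(node) == node:
            break
        node = os.path.dirname(node)
    return tightest
```

A caller would pass the directory of its own cgroup; a `None` result means no limit file was reachable at all, which is different from every reachable limit being "max" (reported as infinity).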