diff mbox series

[resend] mm, memcg: fix inconsistent oom event behavior

Message ID 20200502141055.7378-1-laoar.shao@gmail.com (mailing list archive)
State New, archived

Commit Message

Yafang Shao May 2, 2020, 2:10 p.m. UTC
A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
memory.events") changed the behavior of memcg events so that
memory.events now accounts for subtrees. But the oom_kill event is a
special case, as it is used in both cgroup1 and cgroup2. In cgroup1, it
is displayed in memory.oom_control. The file memory.oom_control exists in
both the root memcg and non-root memcgs, unlike memory.events, which
exists only in non-root memcgs. That commit is fine for cgroup2, but not
for cgroup1, where it causes inconsistent behavior between the root memcg
and non-root memcgs.

Here's an example of why this behavior is inconsistent in cgroup1.
     root memcg
     /
  memcg foo
   /
memcg bar

Suppose there's an oom_kill in memcg bar; the oom_kill counts will then be:

     root memcg : memory.oom_control(oom_kill)  0
     /
  memcg foo : memory.oom_control(oom_kill)  1
   /
memcg bar : memory.oom_control(oom_kill)  1

For a non-root memcg, memory.oom_control(oom_kill) includes its
descendants' oom_kill events, but for the root memcg it does not. That
means memory.oom_control(oom_kill) has different meanings in different
memcgs, which is inconsistent: the user has to know whether a memcg is
the root or not in order to interpret the value.

If we can't fully support it in cgroup1, for example by adding
memory.events.local to cgroup1 as well, then let's not touch
its original behavior.

Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/memcontrol.h | 2 ++
 1 file changed, 2 insertions(+)

Comments

Johannes Weiner May 2, 2020, 2:45 p.m. UTC | #1
On Sat, May 02, 2020 at 10:10:55AM -0400, Yafang Shao wrote:
> A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
> memory.events") changes the behavior of memcg events, which will
> consider subtrees in memory.events. But oom_kill event is a special one
> as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
> in memory.oom_control. The file memory.oom_control is in both root memcg
> and non root memcg, that is different with memory.event as it only in
> non-root memcg. That commit is okay for cgroup2, but it is not okay for
> cgroup1 as it will cause inconsistent behavior between root memcg and
> non-root memcg.
> 
> Here's an example on why this behavior is inconsistent in cgroup1.
>      root memcg
>      /
>   memcg foo
>    /
> memcg bar
> 
> Suppose there's an oom_kill in memcg bar, then the oon_kill will be
> 
>      root memcg : memory.oom_control(oom_kill)  0
>      /
>   memcg foo : memory.oom_control(oom_kill)  1
>    /
> memcg bar : memory.oom_control(oom_kill)  1
> 
> For the non-root memcg, its memory.oom_control(oom_kill) includes its
> descendants' oom_kill, but for root memcg, it doesn't include its
> descendants' oom_kill. That means, memory.oom_control(oom_kill) has
> different meanings in different memcgs. That is inconsistent. Then the user
> has to know whether the memcg is root or not.
> 
> If we can't fully support it in cgroup1, for example by adding
> memory.events.local into cgroup1 as well, then let's don't touch
> its original behavior.
> 
> Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")
> Reviewed-by: Shakeel Butt <shakeelb@google.com>
> Cc: Chris Down <chris@chrisdown.name>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Chris Down May 2, 2020, 3:23 p.m. UTC | #2
Yafang Shao writes:
>A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
>memory.events") changes the behavior of memcg events, which will
>consider subtrees in memory.events. But oom_kill event is a special one
>as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
>in memory.oom_control. The file memory.oom_control is in both root memcg
>and non root memcg, that is different with memory.event as it only in
>non-root memcg. That commit is okay for cgroup2, but it is not okay for
>cgroup1 as it will cause inconsistent behavior between root memcg and
>non-root memcg.

Thanks!

Acked-by: Chris Down <chris@chrisdown.name>

Michal Hocko May 4, 2020, 7:54 a.m. UTC | #3
On Sat 02-05-20 10:10:55, Yafang Shao wrote:
> A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
> memory.events") changes the behavior of memcg events, which will
> consider subtrees in memory.events. But oom_kill event is a special one
> as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
> in memory.oom_control. The file memory.oom_control is in both root memcg
> and non root memcg, that is different with memory.event as it only in
> non-root memcg. That commit is okay for cgroup2, but it is not okay for
> cgroup1 as it will cause inconsistent behavior between root memcg and
> non-root memcg.
> 
> Here's an example on why this behavior is inconsistent in cgroup1.
>      root memcg
>      /
>   memcg foo
>    /
> memcg bar
> 
> Suppose there's an oom_kill in memcg bar, then the oon_kill will be
> 
>      root memcg : memory.oom_control(oom_kill)  0
>      /
>   memcg foo : memory.oom_control(oom_kill)  1
>    /
> memcg bar : memory.oom_control(oom_kill)  1
> 
> For the non-root memcg, its memory.oom_control(oom_kill) includes its
> descendants' oom_kill, but for root memcg, it doesn't include its
> descendants' oom_kill. That means, memory.oom_control(oom_kill) has
> different meanings in different memcgs. That is inconsistent. Then the user
> has to know whether the memcg is root or not.
> 
> If we can't fully support it in cgroup1, for example by adding
> memory.events.local into cgroup1 as well, then let's don't touch
> its original behavior.
> 
> Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")
> Reviewed-by: Shakeel Butt <shakeelb@google.com>
> Cc: Chris Down <chris@chrisdown.name>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>

Acked-by: Michal Hocko <mhocko@suse.com>

And sorry for sidetracking you toward a generic cgroup solution without
doing my homework and double-checking that it was possible.

> ---
>  include/linux/memcontrol.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index d275c72c4f8e..977edd3b7bd8 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -783,6 +783,8 @@ static inline void memcg_memory_event(struct mem_cgroup *memcg,
>  		atomic_long_inc(&memcg->memory_events[event]);
>  		cgroup_file_notify(&memcg->events_file);
>  
> +		if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +			break;
>  		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
>  			break;
>  	} while ((memcg = parent_mem_cgroup(memcg)) &&
> -- 
> 2.18.2

Andrew Morton May 4, 2020, 11:03 p.m. UTC | #4
On Sat,  2 May 2020 10:10:55 -0400 Yafang Shao <laoar.shao@gmail.com> wrote:

> A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
> memory.events") changes the behavior of memcg events, which will
> consider subtrees in memory.events. But oom_kill event is a special one
> as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
> in memory.oom_control. The file memory.oom_control is in both root memcg
> and non root memcg, that is different with memory.event as it only in
> non-root memcg. That commit is okay for cgroup2, but it is not okay for
> cgroup1 as it will cause inconsistent behavior between root memcg and
> non-root memcg.
> 
> Here's an example on why this behavior is inconsistent in cgroup1.
>      root memcg
>      /
>   memcg foo
>    /
> memcg bar
> 
> Suppose there's an oom_kill in memcg bar, then the oon_kill will be
> 
>      root memcg : memory.oom_control(oom_kill)  0
>      /
>   memcg foo : memory.oom_control(oom_kill)  1
>    /
> memcg bar : memory.oom_control(oom_kill)  1
> 
> For the non-root memcg, its memory.oom_control(oom_kill) includes its
> descendants' oom_kill, but for root memcg, it doesn't include its
> descendants' oom_kill. That means, memory.oom_control(oom_kill) has
> different meanings in different memcgs. That is inconsistent. Then the user
> has to know whether the memcg is root or not.
> 
> If we can't fully support it in cgroup1, for example by adding
> memory.events.local into cgroup1 as well, then let's don't touch
> its original behavior.
> 
> Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")

Nearly a year ago.  Should we backport this into earlier kernels?

> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -783,6 +783,8 @@ static inline void memcg_memory_event(struct mem_cgroup *memcg,
>  		atomic_long_inc(&memcg->memory_events[event]);
>  		cgroup_file_notify(&memcg->events_file);
>  
> +		if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +			break;
>  		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
>  			break;
>  	} while ((memcg = parent_mem_cgroup(memcg)) &&
Michal Hocko May 5, 2020, 7:29 a.m. UTC | #5
On Mon 04-05-20 16:03:45, Andrew Morton wrote:
> On Sat,  2 May 2020 10:10:55 -0400 Yafang Shao <laoar.shao@gmail.com> wrote:
> 
> > A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
> > memory.events") changes the behavior of memcg events, which will
> > consider subtrees in memory.events. But oom_kill event is a special one
> > as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
> > in memory.oom_control. The file memory.oom_control is in both root memcg
> > and non root memcg, that is different with memory.event as it only in
> > non-root memcg. That commit is okay for cgroup2, but it is not okay for
> > cgroup1 as it will cause inconsistent behavior between root memcg and
> > non-root memcg.
> > 
> > Here's an example on why this behavior is inconsistent in cgroup1.
> >      root memcg
> >      /
> >   memcg foo
> >    /
> > memcg bar
> > 
> > Suppose there's an oom_kill in memcg bar, then the oon_kill will be
> > 
> >      root memcg : memory.oom_control(oom_kill)  0
> >      /
> >   memcg foo : memory.oom_control(oom_kill)  1
> >    /
> > memcg bar : memory.oom_control(oom_kill)  1
> > 
> > For the non-root memcg, its memory.oom_control(oom_kill) includes its
> > descendants' oom_kill, but for root memcg, it doesn't include its
> > descendants' oom_kill. That means, memory.oom_control(oom_kill) has
> > different meanings in different memcgs. That is inconsistent. Then the user
> > has to know whether the memcg is root or not.
> > 
> > If we can't fully support it in cgroup1, for example by adding
> > memory.events.local into cgroup1 as well, then let's don't touch
> > its original behavior.
> > 
> > Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")
> 
> Nearly a year ago.  Should we backport this into earlier kernels?

It is a trivial change, so I do not see a problem with marking it for stable.

> 
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -783,6 +783,8 @@ static inline void memcg_memory_event(struct mem_cgroup *memcg,
> >  		atomic_long_inc(&memcg->memory_events[event]);
> >  		cgroup_file_notify(&memcg->events_file);
> >  
> > +		if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > +			break;
> >  		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
> >  			break;
> >  	} while ((memcg = parent_mem_cgroup(memcg)) &&

Patch

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d275c72c4f8e..977edd3b7bd8 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -783,6 +783,8 @@  static inline void memcg_memory_event(struct mem_cgroup *memcg,
 		atomic_long_inc(&memcg->memory_events[event]);
 		cgroup_file_notify(&memcg->events_file);
 
+		if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+			break;
 		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
 			break;
 	} while ((memcg = parent_mem_cgroup(memcg)) &&
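
The effect of the two added lines can be sketched in userspace. This is a toy model, not kernel code: struct toy_memcg, toy_memory_event() and toy_oom_kill_in_bar() are hypothetical names, the boolean arguments stand in for cgroup_subsys_on_dfl(memory_cgrp_subsys) and the CGRP_ROOT_MEMORY_LOCAL_EVENTS flag, and the loop is assumed to stop before the root memcg, which is what the root's count of 0 in the commit message example suggests.

```c
/* Userspace sketch only, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_memcg {
	struct toy_memcg *parent;	/* NULL for the root memcg */
	long oom_kill;			/* stands in for the oom_kill event counter */
};

/*
 * Models the propagation loop in memcg_memory_event().
 *   on_dfl:       true for cgroup2 (default hierarchy), false for cgroup1
 *   local_events: models the CGRP_ROOT_MEMORY_LOCAL_EVENTS flag
 *   patched:      whether the two lines added by this patch are present
 * Assumption: the walk stops before reaching the root memcg.
 */
static void toy_memory_event(struct toy_memcg *memcg, bool on_dfl,
			     bool local_events, bool patched)
{
	assert(memcg != NULL);
	do {
		memcg->oom_kill++;
		if (patched && !on_dfl)
			break;		/* cgroup1: count only the memcg itself */
		if (local_events)
			break;
	} while ((memcg = memcg->parent) && memcg->parent);
}

/* Builds root -> foo -> bar, raises one oom_kill in bar, reports the counters. */
static void toy_oom_kill_in_bar(bool on_dfl, bool patched, long counts[3])
{
	struct toy_memcg root = { NULL, 0 };
	struct toy_memcg foo = { &root, 0 };
	struct toy_memcg bar = { &foo, 0 };

	toy_memory_event(&bar, on_dfl, false, patched);
	counts[0] = root.oom_kill;
	counts[1] = foo.oom_kill;
	counts[2] = bar.oom_kill;
}
```

Under these assumptions, calling toy_oom_kill_in_bar(false, false, counts) for cgroup1 without the patch gives root 0, foo 1, bar 1, which is the inconsistent picture from the commit message. With patched set to true the result is root 0, foo 0, bar 1, so every cgroup1 memcg reports only its own events. For cgroup2 (on_dfl true) the result is the same with or without the patch.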