diff mbox

mm: memcg: fix use after free in mem_cgroup_iter()

Message ID 1531994807-25639-1-git-send-email-jing.xia@unisoc.com (mailing list archive)
State New, archived
Headers show

Commit Message

jing xia July 19, 2018, 10:06 a.m. UTC
It was reported that a kernel crash happened in mem_cgroup_iter(),
which can be triggered if the legacy cgroup-v1 non-hierarchical
mode is used.

Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
......
Call trace:
  mem_cgroup_iter+0x2e0/0x6d4
  shrink_zone+0x8c/0x324
  balance_pgdat+0x450/0x640
  kswapd+0x130/0x4b8
  kthread+0xe8/0xfc
  ret_from_fork+0x10/0x20

  mem_cgroup_iter():
      ......
      if (css_tryget(css))    <-- crash here
	    break;
      ......

The crashing reason is that mem_cgroup_iter() uses the memcg object
whose pointer is stored in iter->position, which has been freed before
and filled with POISON_FREE(0x6b).

And the root cause of the use-after-free issue is that
invalidate_reclaim_iterators() fails to reset the value of
iter->position to NULL when the css of the memcg is released in non-
hierarchical mode.

Signed-off-by: Jing Xia <jing.xia.mail@gmail.com>
---
 mm/memcontrol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Michal Hocko July 19, 2018, 10:43 a.m. UTC | #1
[CC Andrew]

On Thu 19-07-18 18:06:47, Jing Xia wrote:
> It was reported that a kernel crash happened in mem_cgroup_iter(),
> which can be triggered if the legacy cgroup-v1 non-hierarchical
> mode is used.
> 
> Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
> ......
> Call trace:
>   mem_cgroup_iter+0x2e0/0x6d4
>   shrink_zone+0x8c/0x324
>   balance_pgdat+0x450/0x640
>   kswapd+0x130/0x4b8
>   kthread+0xe8/0xfc
>   ret_from_fork+0x10/0x20
> 
>   mem_cgroup_iter():
>       ......
>       if (css_tryget(css))    <-- crash here
> 	    break;
>       ......
> 
> The crashing reason is that mem_cgroup_iter() uses the memcg object
> whose pointer is stored in iter->position, which has been freed before
> and filled with POISON_FREE(0x6b).
> 
> And the root cause of the use-after-free issue is that
> invalidate_reclaim_iterators() fails to reset the value of
> iter->position to NULL when the css of the memcg is released in non-
> hierarchical mode.

Well, spotted!

I suspect
Fixes: 6df38689e0e9 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")

but maybe it goes further into past. I also suggest
Cc: stable

even though the non-hierarchical mode is strongly discouraged. A lack of
reports for 3 years is encouraging that not many people really use this
mode.

> Signed-off-by: Jing Xia <jing.xia.mail@gmail.com>

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!
> ---
>  mm/memcontrol.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e6f0d5e..8c0280b 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -850,7 +850,7 @@ static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
>  	int nid;
>  	int i;
>  
> -	while ((memcg = parent_mem_cgroup(memcg))) {
> +	for (; memcg; memcg = parent_mem_cgroup(memcg)) {
>  		for_each_node(nid) {
>  			mz = mem_cgroup_nodeinfo(memcg, nid);
>  			for (i = 0; i <= DEF_PRIORITY; i++) {
> -- 
> 1.9.1
Shakeel Butt July 19, 2018, 4:23 p.m. UTC | #2
On Thu, Jul 19, 2018 at 3:43 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> [CC Andrew]
>
> On Thu 19-07-18 18:06:47, Jing Xia wrote:
> > It was reported that a kernel crash happened in mem_cgroup_iter(),
> > which can be triggered if the legacy cgroup-v1 non-hierarchical
> > mode is used.
> >
> > Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
> > ......
> > Call trace:
> >   mem_cgroup_iter+0x2e0/0x6d4
> >   shrink_zone+0x8c/0x324
> >   balance_pgdat+0x450/0x640
> >   kswapd+0x130/0x4b8
> >   kthread+0xe8/0xfc
> >   ret_from_fork+0x10/0x20
> >
> >   mem_cgroup_iter():
> >       ......
> >       if (css_tryget(css))    <-- crash here
> >           break;
> >       ......
> >
> > The crashing reason is that mem_cgroup_iter() uses the memcg object
> > whose pointer is stored in iter->position, which has been freed before
> > and filled with POISON_FREE(0x6b).
> >
> > And the root cause of the use-after-free issue is that
> > invalidate_reclaim_iterators() fails to reset the value of
> > iter->position to NULL when the css of the memcg is released in non-
> > hierarchical mode.
>
> Well, spotted!
>
> I suspect
> Fixes: 6df38689e0e9 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")
>
> but maybe it goes further into past. I also suggest
> Cc: stable
>
> even though the non-hierarchical mode is strongly discouraged.

Why not set root_mem_cgroup's use_hierarchy to true by default on
init? If someone wants non-hierarchical mode, they can explicitly set
it to false.

> A lack of
> reports for 3 years is encouraging that not many people really use this
> mode.
>
> > Signed-off-by: Jing Xia <jing.xia.mail@gmail.com>
>
> Acked-by: Michal Hocko <mhocko@suse.com>
>
> Thanks!
> > ---
> >  mm/memcontrol.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index e6f0d5e..8c0280b 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -850,7 +850,7 @@ static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
> >       int nid;
> >       int i;
> >
> > -     while ((memcg = parent_mem_cgroup(memcg))) {
> > +     for (; memcg; memcg = parent_mem_cgroup(memcg)) {
> >               for_each_node(nid) {
> >                       mz = mem_cgroup_nodeinfo(memcg, nid);
> >                       for (i = 0; i <= DEF_PRIORITY; i++) {
> > --
> > 1.9.1
>
> --
> Michal Hocko
> SUSE Labs
> --
> To unsubscribe from this list: send the line "unsubscribe cgroups" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michal Hocko July 23, 2018, 6:44 a.m. UTC | #3
On Thu 19-07-18 09:23:10, Shakeel Butt wrote:
> On Thu, Jul 19, 2018 at 3:43 AM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > [CC Andrew]
> >
> > On Thu 19-07-18 18:06:47, Jing Xia wrote:
> > > It was reported that a kernel crash happened in mem_cgroup_iter(),
> > > which can be triggered if the legacy cgroup-v1 non-hierarchical
> > > mode is used.
> > >
> > > Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
> > > ......
> > > Call trace:
> > >   mem_cgroup_iter+0x2e0/0x6d4
> > >   shrink_zone+0x8c/0x324
> > >   balance_pgdat+0x450/0x640
> > >   kswapd+0x130/0x4b8
> > >   kthread+0xe8/0xfc
> > >   ret_from_fork+0x10/0x20
> > >
> > >   mem_cgroup_iter():
> > >       ......
> > >       if (css_tryget(css))    <-- crash here
> > >           break;
> > >       ......
> > >
> > > The crashing reason is that mem_cgroup_iter() uses the memcg object
> > > whose pointer is stored in iter->position, which has been freed before
> > > and filled with POISON_FREE(0x6b).
> > >
> > > And the root cause of the use-after-free issue is that
> > > invalidate_reclaim_iterators() fails to reset the value of
> > > iter->position to NULL when the css of the memcg is released in non-
> > > hierarchical mode.
> >
> > Well, spotted!
> >
> > I suspect
> > Fixes: 6df38689e0e9 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")
> >
> > but maybe it goes further into past. I also suggest
> > Cc: stable
> >
> > even though the non-hierarchical mode is strongly discouraged.
> 
> Why not set root_mem_cgroup's use_hierarchy to true by default on
> init? If someone wants non-hierarchical mode, they can explicitly set
> it to false.

We do not change defaults under users feet usually.
Shakeel Butt July 23, 2018, 4:17 p.m. UTC | #4
On Sun, Jul 22, 2018 at 11:44 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Thu 19-07-18 09:23:10, Shakeel Butt wrote:
> > On Thu, Jul 19, 2018 at 3:43 AM Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > [CC Andrew]
> > >
> > > On Thu 19-07-18 18:06:47, Jing Xia wrote:
> > > > It was reported that a kernel crash happened in mem_cgroup_iter(),
> > > > which can be triggered if the legacy cgroup-v1 non-hierarchical
> > > > mode is used.
> > > >
> > > > Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
> > > > ......
> > > > Call trace:
> > > >   mem_cgroup_iter+0x2e0/0x6d4
> > > >   shrink_zone+0x8c/0x324
> > > >   balance_pgdat+0x450/0x640
> > > >   kswapd+0x130/0x4b8
> > > >   kthread+0xe8/0xfc
> > > >   ret_from_fork+0x10/0x20
> > > >
> > > >   mem_cgroup_iter():
> > > >       ......
> > > >       if (css_tryget(css))    <-- crash here
> > > >           break;
> > > >       ......
> > > >
> > > > The crashing reason is that mem_cgroup_iter() uses the memcg object
> > > > whose pointer is stored in iter->position, which has been freed before
> > > > and filled with POISON_FREE(0x6b).
> > > >
> > > > And the root cause of the use-after-free issue is that
> > > > invalidate_reclaim_iterators() fails to reset the value of
> > > > iter->position to NULL when the css of the memcg is released in non-
> > > > hierarchical mode.
> > >
> > > Well, spotted!
> > >
> > > I suspect
> > > Fixes: 6df38689e0e9 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")
> > >
> > > but maybe it goes further into past. I also suggest
> > > Cc: stable
> > >
> > > even though the non-hierarchical mode is strongly discouraged.
> >
> > Why not set root_mem_cgroup's use_hierarchy to true by default on
> > init? If someone wants non-hierarchical mode, they can explicitly set
> > it to false.
>
> We do not change defaults under users feet usually.

Then how non-hierarchical mode is being discouraged currently? I don't
see any comments in the docs.

Shakeel
Michal Hocko July 24, 2018, 7:28 a.m. UTC | #5
On Mon 23-07-18 09:17:28, Shakeel Butt wrote:
> On Sun, Jul 22, 2018 at 11:44 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Thu 19-07-18 09:23:10, Shakeel Butt wrote:
> > > On Thu, Jul 19, 2018 at 3:43 AM Michal Hocko <mhocko@kernel.org> wrote:
> > > >
> > > > [CC Andrew]
> > > >
> > > > On Thu 19-07-18 18:06:47, Jing Xia wrote:
> > > > > It was reported that a kernel crash happened in mem_cgroup_iter(),
> > > > > which can be triggered if the legacy cgroup-v1 non-hierarchical
> > > > > mode is used.
> > > > >
> > > > > Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
> > > > > ......
> > > > > Call trace:
> > > > >   mem_cgroup_iter+0x2e0/0x6d4
> > > > >   shrink_zone+0x8c/0x324
> > > > >   balance_pgdat+0x450/0x640
> > > > >   kswapd+0x130/0x4b8
> > > > >   kthread+0xe8/0xfc
> > > > >   ret_from_fork+0x10/0x20
> > > > >
> > > > >   mem_cgroup_iter():
> > > > >       ......
> > > > >       if (css_tryget(css))    <-- crash here
> > > > >           break;
> > > > >       ......
> > > > >
> > > > > The crashing reason is that mem_cgroup_iter() uses the memcg object
> > > > > whose pointer is stored in iter->position, which has been freed before
> > > > > and filled with POISON_FREE(0x6b).
> > > > >
> > > > > And the root cause of the use-after-free issue is that
> > > > > invalidate_reclaim_iterators() fails to reset the value of
> > > > > iter->position to NULL when the css of the memcg is released in non-
> > > > > hierarchical mode.
> > > >
> > > > Well, spotted!
> > > >
> > > > I suspect
> > > > Fixes: 6df38689e0e9 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")
> > > >
> > > > but maybe it goes further into past. I also suggest
> > > > Cc: stable
> > > >
> > > > even though the non-hierarchical mode is strongly discouraged.
> > >
> > > Why not set root_mem_cgroup's use_hierarchy to true by default on
> > > init? If someone wants non-hierarchical mode, they can explicitly set
> > > it to false.
> >
> > We do not change defaults under users feet usually.
> 
> Then how non-hierarchical mode is being discouraged currently? I don't
> see any comments in the docs.

css_create warns about non-hierarchical hierarchies. We've been running
with a similar warning in (even older) SLES kernels for years now and
quite some tools have been updated because they simply didn't know they
are doing something wrong.
diff mbox

Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e6f0d5e..8c0280b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -850,7 +850,7 @@  static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
 	int nid;
 	int i;
 
-	while ((memcg = parent_mem_cgroup(memcg))) {
+	for (; memcg; memcg = parent_mem_cgroup(memcg)) {
 		for_each_node(nid) {
 			mz = mem_cgroup_nodeinfo(memcg, nid);
 			for (i = 0; i <= DEF_PRIORITY; i++) {