[vfs/for-next,v4] cgroup: fix top cgroup refcnt leak

Message ID 20190102210659.3125-1-avagin@gmail.com (mailing list archive)
State New, archived

Commit Message

Andrei Vagin Jan. 2, 2019, 9:06 p.m. UTC
It looks like the c6b3d5bcd67c ("cgroup: fix top cgroup refcnt leak")
commit was reverted by mistake.

$ mkdir /tmp/cgroup
$ mkdir /tmp/cgroup2
$ mount -t cgroup -o none,name=test test /tmp/cgroup
$ mount -t cgroup -o none,name=test test /tmp/cgroup2
$ umount /tmp/cgroup
$ umount /tmp/cgroup2
$ cat /proc/self/cgroup | grep test
12:name=test:/

You can see the test cgroup was not freed.
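
The final check in the steps above can be scripted. What follows is only a
minimal sketch, not part of the patch: the helper name
named_hierarchy_present and its optional file argument are hypothetical and
exist just to make the check reusable. It assumes each line of the file has
the "ID:controllers:path" layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): succeed if a named cgroup v1
# hierarchy is still listed in the given file.
#   $1  hierarchy name, e.g. "test" for "name=test"
#   $2  file to scan; defaults to /proc/self/cgroup
named_hierarchy_present() {
    name=$1
    file=${2:-/proc/self/cgroup}
    # Each line looks like "12:name=test:/": hierarchy ID, controller list,
    # cgroup path. Compare the second field against "name=<name>" exactly.
    awk -F: -v want="name=$name" \
        '$2 == want { found = 1 } END { exit !found }' "$file"
}
```

After the two umount commands, running "named_hierarchy_present test" and
getting a zero exit status would mean the hierarchy is still listed, i.e.
the root cgroup was not freed.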

Cc: Li Zefan <lizefan@huawei.com>
Fixes: aea3f2676c83 ("kernfs, sysfs, cgroup, intel_rdt: Support fs_context")
Signed-off-by: Andrei Vagin <avagin@gmail.com>
---

v2: clean up code and add the vfs/for-next tag
v3: fix a reference leak when kernfs_node_dentry fails
v4: call deactivate_locked_super() in an error case
v5: don't dereference fc->root after dput()

 kernel/cgroup/cgroup.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

Comments

David Howells Jan. 3, 2019, 12:26 a.m. UTC | #1
Andrei Vagin <avagin@gmail.com> wrote:

> It looks like the c6b3d5bcd67c ("cgroup: fix top cgroup refcnt leak")
> commit was reverted by mistake.
> 
> $ mkdir /tmp/cgroup
> $ mkdir /tmp/cgroup2
> $ mount -t cgroup -o none,name=test test /tmp/cgroup
> $ mount -t cgroup -o none,name=test test /tmp/cgroup2
> $ umount /tmp/cgroup
> $ umount /tmp/cgroup2
> $ cat /proc/self/cgroup | grep test
> 12:name=test:/
> 
> You can see the test cgroup was not freed.
> 
> Cc: Li Zefan <lizefan@huawei.com>
> Fixes: aea3f2676c83 ("kernfs, sysfs, cgroup, intel_rdt: Support fs_context")
> Signed-off-by: Andrei Vagin <avagin@gmail.com>
> ---
> 
> v2: clean up code and add the vfs/for-next tag
> v3: fix a reference leak when kernfs_node_dentry fails
> v4: call deactivate_locked_super() in a error case
> v5: don't dereference fc->root after dput()
> 
>  kernel/cgroup/cgroup.c | 25 ++++++++++++++++++-------
>  1 file changed, 18 insertions(+), 7 deletions(-)

This patch doesn't work either.

	percpu ref (css_release) <= 0 (0) after switching to atomic
	RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x90/0x1a0

Btw, note that the subject says "v4" but the changelog says "v5".

David
Andrei Vagin Jan. 3, 2019, 12:43 a.m. UTC | #2
On Thu, Jan 03, 2019 at 12:26:23AM +0000, David Howells wrote:
> Andrei Vagin <avagin@gmail.com> wrote:
> 
> > It looks like the c6b3d5bcd67c ("cgroup: fix top cgroup refcnt leak")
> > commit was reverted by mistake.
> > 
> > $ mkdir /tmp/cgroup
> > $ mkdir /tmp/cgroup2
> > $ mount -t cgroup -o none,name=test test /tmp/cgroup
> > $ mount -t cgroup -o none,name=test test /tmp/cgroup2
> > $ umount /tmp/cgroup
> > $ umount /tmp/cgroup2
> > $ cat /proc/self/cgroup | grep test
> > 12:name=test:/
> > 
> > You can see the test cgroup was not freed.
> > 
> > Cc: Li Zefan <lizefan@huawei.com>
> > Fixes: aea3f2676c83 ("kernfs, sysfs, cgroup, intel_rdt: Support fs_context")
> > Signed-off-by: Andrei Vagin <avagin@gmail.com>
> > ---
> > 
> > v2: clean up code and add the vfs/for-next tag
> > v3: fix a reference leak when kernfs_node_dentry fails
> > v4: call deactivate_locked_super() in a error case
> > v5: don't dereference fc->root after dput()
> > 
> >  kernel/cgroup/cgroup.c | 25 ++++++++++++++++++-------
> >  1 file changed, 18 insertions(+), 7 deletions(-)
> 
> This patch doesn't work either.

I'm sorry, but we can't say anything about this patch now, because it
looks like recent changes in vfs-next break something else here...

> 
> 	percpu ref (css_release) <= 0 (0) after switching to atomic
> 	RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x90/0x1a0
> 
> Btw, note that the subject says "v4" but the changelog says "v5".

It is v5.

> 
> David
Andrei Vagin Jan. 3, 2019, 1 a.m. UTC | #3
On Wed, Jan 02, 2019 at 04:43:39PM -0800, Andrei Vagin wrote:
> On Thu, Jan 03, 2019 at 12:26:23AM +0000, David Howells wrote:
> > Andrei Vagin <avagin@gmail.com> wrote:
> > 
> > > It looks like the c6b3d5bcd67c ("cgroup: fix top cgroup refcnt leak")
> > > commit was reverted by mistake.
> > > 
> > > $ mkdir /tmp/cgroup
> > > $ mkdir /tmp/cgroup2
> > > $ mount -t cgroup -o none,name=test test /tmp/cgroup
> > > $ mount -t cgroup -o none,name=test test /tmp/cgroup2
> > > $ umount /tmp/cgroup
> > > $ umount /tmp/cgroup2
> > > $ cat /proc/self/cgroup | grep test
> > > 12:name=test:/
> > > 
> > > You can see the test cgroup was not freed.
> > > 
> > > Cc: Li Zefan <lizefan@huawei.com>
> > > Fixes: aea3f2676c83 ("kernfs, sysfs, cgroup, intel_rdt: Support fs_context")
> > > Signed-off-by: Andrei Vagin <avagin@gmail.com>
> > > ---
> > > 
> > > v2: clean up code and add the vfs/for-next tag
> > > v3: fix a reference leak when kernfs_node_dentry fails
> > > v4: call deactivate_locked_super() in a error case
> > > v5: don't dereference fc->root after dput()
> > > 
> > >  kernel/cgroup/cgroup.c | 25 ++++++++++++++++++-------
> > >  1 file changed, 18 insertions(+), 7 deletions(-)
> > 
> > This patch doesn't work either.
> 
> I'm sorry, but we can't say anything about this patch now, because it
> looks like recent changes in vfs-next break something else here...

I found a reason why this patch doesn't work on Al's vfs/for-next:

[avagin@laptop linux]$ git diff 40effd960becd8a355b7aafc789712afd64f5759..vfs/for-next  kernel/cgroup/cgroup-v1.c | grep -B 5 -A 5 cgroup_get 
 	/*
@@ -1280,8 +1285,8 @@ int cgroup1_get_tree(struct fs_context *fc)
 		mutex_lock(&cgroup_mutex);
 		percpu_ref_reinit(&root->cgrp.self.refcnt);
 		mutex_unlock(&cgroup_mutex);
+		cgroup_get(&root->cgrp);
 	}
-	cgroup_get(&root->cgrp);
 
 	/*
 	 * If @pinned_sb, we're reusing an existing root and holding an

40effd960becd8a355b7aafc789712afd64f5759 is the previous head of vfs/for-next.

I reverted this hunk and applied my patch, and all CRIU tests passed.

> 
> > 
> > 	percpu ref (css_release) <= 0 (0) after switching to atomic
> > 	RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x90/0x1a0
> > 
> > Btw, note that the subject says "v4" but the changelog says "v5".
> 
> It is v5.
> 
> > 
> > David
Patch

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index fb0717696895..53b730cf1f7b 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2019,7 +2019,7 @@  int cgroup_do_get_tree(struct fs_context *fc)
 
 	ret = kernfs_get_tree(fc);
 	if (ret < 0)
-		goto out_cgrp;
+		return ret;
 
 	/*
 	 * In non-init cgroup namespace, instead of root cgroup's dentry,
@@ -2038,19 +2038,30 @@  int cgroup_do_get_tree(struct fs_context *fc)
 		mutex_unlock(&cgroup_mutex);
 
 		nsdentry = kernfs_node_dentry(cgrp->kn, fc->root->d_sb);
-		if (IS_ERR(nsdentry))
-			return PTR_ERR(nsdentry);
+		if (IS_ERR(nsdentry)) {
+			ret = PTR_ERR(nsdentry);
+			goto out_cgrp;
+		}
 		dput(fc->root);
 		fc->root = nsdentry;
 	}
 
 	ret = 0;
-	if (ctx->kfc.new_sb_created)
-		goto out_cgrp;
-	apply_cgroup_root_flags(ctx->flags);
-	return 0;
+	if (!ctx->kfc.new_sb_created)
+		apply_cgroup_root_flags(ctx->flags);
 
 out_cgrp:
+	if (!ctx->kfc.new_sb_created)
+		cgroup_put(&ctx->root->cgrp);
+
+	if (unlikely(ret)) {
+		struct super_block *sb = fc->root->d_sb;
+
+		dput(fc->root);
+		deactivate_locked_super(sb);
+		fc->root = NULL;
+	}
+
 	return ret;
 }