[1/9] cgroup: add cgroup_subsys->post_create()

Message ID 1351931915-1701-2-git-send-email-tj@kernel.org (mailing list archive)
State Not Applicable, archived

Commit Message

Tejun Heo Nov. 3, 2012, 8:38 a.m. UTC
Currently, there's no way for a controller to find out whether a new
cgroup finished all ->create() allocations successfully and is
considered "live" by cgroup.

This becomes a problem later when we add generic descendant walking
to cgroup for use by controllers, as controllers don't have a
synchronization point where they can synchronize against new cgroups
appearing in such walks.

This patch adds ->post_create().  It's called after all ->create()
calls have succeeded and the cgroup is linked into the generic cgroup
hierarchy.  It is the counterpart of ->pre_destroy().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Glauber Costa <glommer@parallels.com>
---
 include/linux/cgroup.h |  1 +
 kernel/cgroup.c        | 12 ++++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)
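
The ordering described in the commit message can be sketched in userspace
as follows. This is only an illustration under stated assumptions, not the
kernel implementation: the struct layouts, the int-returning create(), and
the names mock_cgroup_create(), demo_create(), demo_post_create() and
failing_create() are all hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; not the real definitions. */
struct cgroup {
	struct cgroup *parent;
	int css_ready;	/* set by the mock ->create() */
	int linked;	/* set once the cgroup is part of the hierarchy */
	int online;	/* set by the mock ->post_create() */
};

struct cgroup_subsys {
	int (*create)(struct cgroup *cgrp);	  /* simplified: 0 or -errno */
	void (*post_create)(struct cgroup *cgrp); /* optional, may be NULL */
};

/* Hypothetical controller callbacks. */
static int demo_create(struct cgroup *cgrp)
{
	cgrp->css_ready = 1;
	return 0;
}

static int failing_create(struct cgroup *cgrp)
{
	(void)cgrp;
	return -ENOMEM;
}

static void demo_post_create(struct cgroup *cgrp)
{
	/* Allocation has succeeded and the cgroup is already linked. */
	assert(cgrp->css_ready && cgrp->linked);
	cgrp->online = 1;
}

/* Simplified creation path: all create() calls, link, then post_create(). */
static int mock_cgroup_create(struct cgroup *parent, struct cgroup *cgrp,
			      struct cgroup_subsys **subsys, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int err = subsys[i]->create(cgrp);

		if (err < 0)
			return err;	/* post_create() is never reached */
	}

	cgrp->parent = parent;
	cgrp->linked = 1;

	for (i = 0; i < n; i++)
		if (subsys[i]->post_create)
			subsys[i]->post_create(cgrp);

	return 0;
}
```

In this sketch a controller only ever sees post_create() for a cgroup whose
create() calls all succeeded, which is the property the changelog asks for.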

Comments

Glauber Costa Nov. 5, 2012, 1:42 p.m. UTC | #1
On 11/03/2012 09:38 AM, Tejun Heo wrote:
> Currently, there's no way for a controller to find out whether a new
> cgroup finished all ->create() allocations successfully and is
> considered "live" by cgroup.
> 
> This becomes a problem later when we add generic descendant walking
> to cgroup for use by controllers, as controllers don't have a
> synchronization point where they can synchronize against new cgroups
> appearing in such walks.
> 
> This patch adds ->post_create().  It's called after all ->create()
> calls have succeeded and the cgroup is linked into the generic cgroup
> hierarchy.  It is the counterpart of ->pre_destroy().
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Glauber Costa <glommer@parallels.com>

Tejun, if we do it this way, we end up with two callbacks that are
called after create: post_clone and post_create. I myself prefer the
approach I took, which converts post_clone into post_create, and would
prefer that you pick that up.

For me, post_clone is totally a glitch that should not exist. Merging
this with post_create gives the following semantics:

* A while after cgroup creation, you will get a callback. In that
callback, you do whatever initialization you need that you could not
do in create. Why is reacting to a flag being set any different?
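
A minimal sketch of that merged semantic, assuming a controller that copies
its parent's configuration when a clone-children style flag is set. The
struct fields and demo_post_create() are hypothetical stand-ins for
illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in; the fields are illustrative assumptions. */
struct cgroup {
	struct cgroup *parent;
	bool clone_children;	/* stand-in for the flag post_clone reacts to */
	unsigned long cpus;	/* hypothetical per-cgroup configuration */
};

/*
 * One callback for all late initialization: the flag check that used to
 * justify a separate post_clone becomes an ordinary branch.
 */
static void demo_post_create(struct cgroup *cgrp)
{
	struct cgroup *parent = cgrp->parent;

	if (!parent || !parent->clone_children)
		return;

	/* Inherit the parent's configuration. */
	cgrp->cpus = parent->cpus;
}
```

With this shape, reacting to the flag is just one more thing done in the
single post-creation callback.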

Michal Hocko Nov. 7, 2012, 3:25 p.m. UTC | #2
On Sat 03-11-12 01:38:27, Tejun Heo wrote:
> Currently, there's no way for a controller to find out whether a new
> cgroup finished all ->create() allocations successfully and is
> considered "live" by cgroup.
> 
> This becomes a problem later when we add generic descendant walking
> to cgroup for use by controllers, as controllers don't have a
> synchronization point where they can synchronize against new cgroups
> appearing in such walks.
> 
> This patch adds ->post_create().  It's called after all ->create()
> calls have succeeded and the cgroup is linked into the generic cgroup
> hierarchy.  It is the counterpart of ->pre_destroy().

Hmm, I had to look at "cgroup_freezer: implement proper hierarchy
support" to actually understand what the callback is good for. The above
sounds as if the callback is needed when a controller wants to use
the new iterators or when pre_destroy is defined.

I think it would be helpful if the changelog described that the callback
is needed when the controller keeps a mutable shared state for the
hierarchy. For example, the memory controller doesn't have any such strict
requirement, so we can safely use your new iterators without pre_destroy.

Anyway, I like this change because the shared state is now really easy
to implement.
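
To illustrate what "mutable shared state for the hierarchy" could look
like, here is a rough userspace sketch assuming a freezer-like controller.
The fixed-size children array and the names mock_link(),
demo_post_create() and demo_set_frozen() are hypothetical, and the locking
a real controller would need is only noted in comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Userspace stand-in for a cgroup with freezer-like state. */
struct cgroup {
	struct cgroup *parent;
	struct cgroup *children[MAX_CHILDREN];
	size_t nr_children;
	bool online;	/* set in post_create() */
	bool frozen;	/* state shared down the hierarchy */
};

/* Link a new cgroup under its parent: visible to walks, not yet online. */
static void mock_link(struct cgroup *parent, struct cgroup *cgrp)
{
	assert(parent->nr_children < MAX_CHILDREN);
	cgrp->parent = parent;
	parent->children[parent->nr_children++] = cgrp;
}

/* Synchronization point: inherit the parent's state, then go online. */
static void demo_post_create(struct cgroup *cgrp)
{
	/* A real controller would hold its own lock across both steps. */
	if (cgrp->parent)
		cgrp->frozen = cgrp->parent->frozen;
	cgrp->online = true;
}

/* Descendant walk: propagate state, skipping cgroups not yet online. */
static void demo_set_frozen(struct cgroup *cgrp, bool frozen)
{
	size_t i;

	if (!cgrp->online)
		return;

	cgrp->frozen = frozen;
	for (i = 0; i < cgrp->nr_children; i++)
		demo_set_frozen(cgrp->children[i], frozen);
}
```

A cgroup that the walk skipped still ends up consistent, because it picks
up the parent's state when its own post_create() runs.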

> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Glauber Costa <glommer@parallels.com>

Acked-by: Michal Hocko <mhocko@suse.cz>

> ---
>  include/linux/cgroup.h |  1 +
>  kernel/cgroup.c        | 12 ++++++++++--
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index fe876a7..b442122 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -438,6 +438,7 @@ int cgroup_taskset_size(struct cgroup_taskset *tset);
>  
>  struct cgroup_subsys {
>  	struct cgroup_subsys_state *(*create)(struct cgroup *cgrp);
> +	void (*post_create)(struct cgroup *cgrp);
>  	void (*pre_destroy)(struct cgroup *cgrp);
>  	void (*destroy)(struct cgroup *cgrp);
>  	int (*can_attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index e3045ad..f05d992 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -4060,10 +4060,15 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
>  	if (err < 0)
>  		goto err_remove;
>  
> -	/* each css holds a ref to the cgroup's dentry */
> -	for_each_subsys(root, ss)
> +	for_each_subsys(root, ss) {
> +		/* each css holds a ref to the cgroup's dentry */
>  		dget(dentry);
>  
> +		/* creation succeeded, notify subsystems */
> +		if (ss->post_create)
> +			ss->post_create(cgrp);
> +	}
> +
>  	/* The cgroup directory was pre-locked for us */
>  	BUG_ON(!mutex_is_locked(&cgrp->dentry->d_inode->i_mutex));
>  
> @@ -4281,6 +4286,9 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss)
>  
>  	ss->active = 1;
>  
> +	if (ss->post_create)
> +		ss->post_create(&ss->root->top_cgroup);
> +
>  	/* this function shouldn't be used with modular subsystems, since they
>  	 * need to register a subsys_id, among other things */
>  	BUG_ON(ss->module);
> -- 
> 1.7.11.7
>
Tejun Heo Nov. 7, 2012, 5:02 p.m. UTC | #3
Hello, Michal.

On Wed, Nov 07, 2012 at 04:25:16PM +0100, Michal Hocko wrote:
> > This patch adds ->post_create().  It's called after all ->create()
> > succeeded and the cgroup is linked into the generic cgroup hierarchy.
> > This plays the counterpart of ->pre_destroy().
> 
> Hmm, I had to look at "cgroup_freezer: implement proper hierarchy
> support" to actually understand what the callback is good for. The above
> sounds as if the callback is needed when a controller wants to use
> the new iterators or when pre_destroy is defined.
> 
> I think it would be helpful if the changelog described that the callback
> is needed when the controller keeps a mutable shared state for the
> hierarchy. For example, the memory controller doesn't have any such strict
> requirement, so we can safely use your new iterators without pre_destroy.

Hmm.... will try to explain it but I think it might be best to just
refer to the later patch for details.  It's a bit tricky to explain.

Thanks.

Patch

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index fe876a7..b442122 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -438,6 +438,7 @@ int cgroup_taskset_size(struct cgroup_taskset *tset);
 
 struct cgroup_subsys {
 	struct cgroup_subsys_state *(*create)(struct cgroup *cgrp);
+	void (*post_create)(struct cgroup *cgrp);
 	void (*pre_destroy)(struct cgroup *cgrp);
 	void (*destroy)(struct cgroup *cgrp);
 	int (*can_attach)(struct cgroup *cgrp, struct cgroup_taskset *tset);
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index e3045ad..f05d992 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4060,10 +4060,15 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry,
 	if (err < 0)
 		goto err_remove;
 
-	/* each css holds a ref to the cgroup's dentry */
-	for_each_subsys(root, ss)
+	for_each_subsys(root, ss) {
+		/* each css holds a ref to the cgroup's dentry */
 		dget(dentry);
 
+		/* creation succeeded, notify subsystems */
+		if (ss->post_create)
+			ss->post_create(cgrp);
+	}
+
 	/* The cgroup directory was pre-locked for us */
 	BUG_ON(!mutex_is_locked(&cgrp->dentry->d_inode->i_mutex));
 
@@ -4281,6 +4286,9 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss)
 
 	ss->active = 1;
 
+	if (ss->post_create)
+		ss->post_create(&ss->root->top_cgroup);
+
 	/* this function shouldn't be used with modular subsystems, since they
 	 * need to register a subsys_id, among other things */
 	BUG_ON(ss->module);