diff mbox

[1/9,v3] cgroup: add cgroup_subsys->post_create()

Message ID 509CE472.9040504@monom.org (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Daniel Wagner Nov. 9, 2012, 11:09 a.m. UTC
Hi Tejun,

On 08.11.2012 20:07, Tejun Heo wrote:> Subject: cgroup: add 
cgroup_subsys->post_create()
 >
 > Currently, there's no way for a controller to find out whether a new
 > cgroup finished all ->create() allocatinos successfully and is
 > considered "live" by cgroup.

I'd like add hierarchy support to net_prio and the first thing to
do is to get rid of get_prioidx(). It looks like it would be nice to be 
able to use use_id and post_create() for this but as I read the code 
this might not work because the netdev might access resources allocated 
between create() and post_create(). So my question is would it make 
sense to move

cgroup_create():

		if (ss->use_id) {
			err = alloc_css_id(ss, parent, cgrp);
			if (err)
				goto err_destroy;
		}

part before create() or add some protection between create() and 
post_create() callback in net_prio. I have a patch but I see
I could drop it completely if post_create() is there.

cheers,
daniel


 From 84fbbdf0dc5d3460389e39a00a3ee553ee55b563 Mon Sep 17 00:00:00 2001
From: Daniel Wagner <daniel.wagner@bmw-carit.de>
Date: Thu, 8 Nov 2012 17:17:21 +0100
Subject: [PATCH] cgroups: net_prio: Use IDR library to assign netprio index

get_prioidx() allocated a new id whenever it was called. put_prioidx()
gave an id back. get_pioidx() could can reallocate the id later on.
So that is exactly what IDR offers and so let's use it.

Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
---
  net/core/netprio_cgroup.c | 51 
+++++++++--------------------------------------
  1 file changed, 9 insertions(+), 42 deletions(-)

cgroup *cgrp)
@@ -39,34 +36,6 @@ static inline struct cgroup_netprio_state 
*cgrp_netprio_state(struct cgroup *cgr
  			    struct cgroup_netprio_state, css);
  }

-static int get_prioidx(u32 *prio)
-{
-	unsigned long flags;
-	u32 prioidx;
-
-	spin_lock_irqsave(&prioidx_map_lock, flags);
-	prioidx = find_first_zero_bit(prioidx_map, sizeof(unsigned long) * 
PRIOIDX_SZ);
-	if (prioidx == sizeof(unsigned long) * PRIOIDX_SZ) {
-		spin_unlock_irqrestore(&prioidx_map_lock, flags);
-		return -ENOSPC;
-	}
-	set_bit(prioidx, prioidx_map);
-	if (atomic_read(&max_prioidx) < prioidx)
-		atomic_set(&max_prioidx, prioidx);
-	spin_unlock_irqrestore(&prioidx_map_lock, flags);
-	*prio = prioidx;
-	return 0;
-}
-
-static void put_prioidx(u32 idx)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&prioidx_map_lock, flags);
-	clear_bit(idx, prioidx_map);
-	spin_unlock_irqrestore(&prioidx_map_lock, flags);
-}
-
  static int extend_netdev_table(struct net_device *dev, u32 new_len)
  {
  	size_t new_size = sizeof(struct netprio_map) +
@@ -120,9 +89,9 @@ static struct cgroup_subsys_state *cgrp_create(struct 
cgroup *cgrp)
  	if (cgrp->parent && cgrp_netprio_state(cgrp->parent)->prioidx)
  		goto out;

-	ret = get_prioidx(&cs->prioidx);
-	if (ret < 0) {
-		pr_warn("No space in priority index array\n");
+	cs->prioidx = ida_simple_get(&netprio_ida, 0, 0, GFP_KERNEL);
+	if (cs->prioidx < 0) {
+		ret = cs->prioidx;
  		goto out;
  	}

@@ -146,7 +115,7 @@ static void cgrp_destroy(struct cgroup *cgrp)
  			map->priomap[cs->prioidx] = 0;
  	}
  	rtnl_unlock();
-	put_prioidx(cs->prioidx);
+	ida_simple_remove(&netprio_ida, cs->prioidx);
  	kfree(cs);
  }

@@ -284,12 +253,10 @@ struct cgroup_subsys net_prio_subsys = {
  	.module		= THIS_MODULE,

  	/*
-	 * net_prio has artificial limit on the number of cgroups and
-	 * disallows nesting making it impossible to co-mount it with other
-	 * hierarchical subsystems.  Remove the artificially low PRIOIDX_SZ
-	 * limit and properly nest configuration such that children follow
-	 * their parents' configurations by default and are allowed to
-	 * override and remove the following.
+	 * net_prio has artificial limit on properly nest
+	 * configuration such that children follow their parents'
+	 * configurations by default and are allowed to override and
+	 * remove the following.
  	 */
  	.broken_hierarchy = true,
  };

Comments

Tejun Heo Nov. 9, 2012, 5:22 p.m. UTC | #1
Hey, Daniel.

On Fri, Nov 09, 2012 at 12:09:38PM +0100, Daniel Wagner wrote:
> On 08.11.2012 20:07, Tejun Heo wrote:> Subject: cgroup: add
> cgroup_subsys->post_create()
> >
> > Currently, there's no way for a controller to find out whether a new
> > cgroup finished all ->create() allocatinos successfully and is
> > considered "live" by cgroup.
> 
> I'd like add hierarchy support to net_prio and the first thing to
> do is to get rid of get_prioidx(). It looks like it would be nice to

Ooh, I'm already working on it.  I *think* I should be able to post
the patches later today or early next week.

> be able to use use_id and post_create() for this but as I read the
> code this might not work because the netdev might access resources
> allocated between create() and post_create(). So my question is
> would it make sense to move
> 
> cgroup_create():
> 
> 		if (ss->use_id) {
> 			err = alloc_css_id(ss, parent, cgrp);
> 			if (err)
> 				goto err_destroy;
> 		}
> 
> part before create() or add some protection between create() and
> post_create() callback in net_prio. I have a patch but I see
> I could drop it completely if post_create() is there.

Glauber had about similar question about css_id and I need to think
more about it but currently I think I want to phase out css IDs.  It's
an id of the wrong thing (CSSes don't need IDs, cgroups do) and
unnecessarily duplicates its own hierarchy when the hierarchy of
cgroups already exists.  Once memcontrol moves away from walking using
css_ids, I *think* I'll try to kill it.

I'll add cgroup ID (no hierarchy funnies, just a single ida allocated
number) so that it can be used for cgroup indexing.  Glauber, that
should solve your problem too, right?

Thanks.
Glauber Costa Nov. 10, 2012, 1:35 a.m. UTC | #2
On 11/09/2012 06:22 PM, Tejun Heo wrote:
> Hey, Daniel.
> 
> On Fri, Nov 09, 2012 at 12:09:38PM +0100, Daniel Wagner wrote:
>> On 08.11.2012 20:07, Tejun Heo wrote:> Subject: cgroup: add
>> cgroup_subsys->post_create()
>>>
>>> Currently, there's no way for a controller to find out whether a new
>>> cgroup finished all ->create() allocatinos successfully and is
>>> considered "live" by cgroup.
>>
>> I'd like add hierarchy support to net_prio and the first thing to
>> do is to get rid of get_prioidx(). It looks like it would be nice to
> 
> Ooh, I'm already working on it.  I *think* I should be able to post
> the patches later today or early next week.
> 
>> be able to use use_id and post_create() for this but as I read the
>> code this might not work because the netdev might access resources
>> allocated between create() and post_create(). So my question is
>> would it make sense to move
>>
>> cgroup_create():
>>
>> 		if (ss->use_id) {
>> 			err = alloc_css_id(ss, parent, cgrp);
>> 			if (err)
>> 				goto err_destroy;
>> 		}
>>
>> part before create() or add some protection between create() and
>> post_create() callback in net_prio. I have a patch but I see
>> I could drop it completely if post_create() is there.
> 
> Glauber had about similar question about css_id and I need to think
> more about it but currently I think I want to phase out css IDs.  It's
> an id of the wrong thing (CSSes don't need IDs, cgroups do) and
> unnecessarily duplicates its own hierarchy when the hierarchy of
> cgroups already exists.  Once memcontrol moves away from walking using
> css_ids, I *think* I'll try to kill it.

May I suggest doing something similar with what the scheduler does? I
had some code in the past that reused that code - but basically
duplicated it. If you want, I can try getting a version of that in
kernel/cgroup.c  that would serve as a general walker.

I like that walker a lot, because it happens in a sane order. memcg
basically walks in a random weird order, that makes hierarchical
computation of anything quite hard.

> 
> I'll add cgroup ID (no hierarchy funnies, just a single ida allocated
> number) so that it can be used for cgroup indexing.  Glauber, that
> should solve your problem too, right?
> 

Actually I went with a totally orthogonal solution. I am now using per
kmem-limited ids. Because they are not tied to the cgroup creation
workflow, I can allocate whenever it is more convenient.

I ended up liking this solution because it will do better in scenarios
where most of the memcgs are not kmem limited. So it had an edge here,
and also got rid of the create/post_create problem by breaking the
dependency.

But of course, if cgroups would gain some kind of sane indexing, it
could shift the balance towards reusing it.


--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Wagner Nov. 12, 2012, 1:04 p.m. UTC | #3
Hi Tejun,

On 09.11.2012 18:22, Tejun Heo wrote:
> Hey, Daniel.
>
> On Fri, Nov 09, 2012 at 12:09:38PM +0100, Daniel Wagner wrote:
>> On 08.11.2012 20:07, Tejun Heo wrote:> Subject: cgroup: add
>> cgroup_subsys->post_create()
>>>
>>> Currently, there's no way for a controller to find out whether a new
>>> cgroup finished all ->create() allocatinos successfully and is
>>> considered "live" by cgroup.
>>
>> I'd like add hierarchy support to net_prio and the first thing to
>> do is to get rid of get_prioidx(). It looks like it would be nice to
>
> Ooh, I'm already working on it.  I *think* I should be able to post
> the patches later today or early next week.

No problem. I didn't spend too much time in writing the patch. Getting
rid of get_prioidx() was simple. I just wanted to help out, but then
I might to slow to keep up with you :)

Though I have a question on the patch you are writing. When you disable
broken_hierarchy for the networking controllers, are you also 
considering to disable them on the root cgroup? Currently both
net_prio and net_cls will do work on the root cgroup. I think for
harmonizing the behavior or all controllers this should also be
disabled.

>> be able to use use_id and post_create() for this but as I read the
>> code this might not work because the netdev might access resources
>> allocated between create() and post_create(). So my question is
>> would it make sense to move
>>
>> cgroup_create():
>>
>> 		if (ss->use_id) {
>> 			err = alloc_css_id(ss, parent, cgrp);
>> 			if (err)
>> 				goto err_destroy;
>> 		}
>>
>> part before create() or add some protection between create() and
>> post_create() callback in net_prio. I have a patch but I see
>> I could drop it completely if post_create() is there.
>
> Glauber had about similar question about css_id and I need to think
> more about it but currently I think I want to phase out css IDs.  It's
> an id of the wrong thing (CSSes don't need IDs, cgroups do) and
> unnecessarily duplicates its own hierarchy when the hierarchy of
> cgroups already exists.  Once memcontrol moves away from walking using
> css_ids, I *think* I'll try to kill it.
>
> I'll add cgroup ID (no hierarchy funnies, just a single ida allocated
> number) so that it can be used for cgroup indexing.  Glauber, that
> should solve your problem too, right?

net_prio doesn't have any particular requirements on the indexing, 
unless there is one. So a global ida would work for net_prio.

cheers,
daniel

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 847c02b..3c1b612 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -27,10 +27,7 @@ 

  #include <linux/fdtable.h>

-#define PRIOIDX_SZ 128
-
-static unsigned long prioidx_map[PRIOIDX_SZ];
-static DEFINE_SPINLOCK(prioidx_map_lock);
+static DEFINE_IDA(netprio_ida);
  static atomic_t max_prioidx = ATOMIC_INIT(0);

  static inline struct cgroup_netprio_state *cgrp_netprio_state(struct