Message ID | 20210409122959.82264-1-songmuchun@bytedance.com (mailing list archive)
---|---
Series | Use obj_cgroup APIs to charge the LRU pages
On Fri, Apr 09, 2021 at 08:29:45PM +0800, Muchun Song wrote:
> We already have a helper lruvec_memcg() to get the memcg from lruvec, we
> do not need to do it ourselves in the lruvec_holds_page_lru_lock(). So use
> lruvec_memcg() instead. And if mem_cgroup_disabled() returns false, the
> page_memcg(page) (the LRU pages) cannot be NULL. So remove the odd logic
> of "memcg = page_memcg(page) ? : root_mem_cgroup". And use lruvec_pgdat
> to simplify the code. We can have a single definition for this function
> that works for !CONFIG_MEMCG, CONFIG_MEMCG + mem_cgroup_disabled() and
> CONFIG_MEMCG.
>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

Looks good to me.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

If you haven't done so yet, please make sure to explicitly test with
all three config combinations, just because the dummy abstractions for
memcg disabled or compiled out tend to be paper thin and don't always
behave the way you might expect when you do more complicated things.

Something like

	boot
	echo sparsefile >/dev/null	(> ram size to fill memory and reclaim)
	echo 1 >/proc/sys/vm/compact_memory

should exercise this new function in a couple of important scenarios.
On Fri, Apr 09, 2021 at 08:29:47PM +0800, Muchun Song wrote:
> Because memory allocations pinning memcgs for a long time - it exists
> at a larger scale and is causing recurring problems in the real world:
> page cache doesn't get reclaimed for a long time, or is used by the
> second, third, fourth, ... instance of the same job that was restarted
> into a new cgroup every time. Unreclaimable dying cgroups pile up,
> waste memory, and make page reclaim very inefficient.
>
> We can convert LRU pages and most other raw memcg pins to the objcg
> direction to fix this problem, and then the page->memcg will always
> point to an object cgroup pointer.
>
> Therefore, the infrastructure of objcg no longer only serves
> CONFIG_MEMCG_KMEM. In this patch, we move the infrastructure of the
> objcg out of the scope of the CONFIG_MEMCG_KMEM so that the LRU pages
> can reuse it to charge pages.

Just an observation on this:

We actually may want to remove CONFIG_MEMCG_KMEM altogether at this
point. It used to be an optional feature, but nowadays it's not
configurable anymore, and always on unless slob is configured.

We've also added more than just slab accounting to it, like kernel
stack pages, and it all gets disabled on slob configs just because it
doesn't support slab object tracking.

We could probably replace CONFIG_MEMCG_KMEM with CONFIG_MEMCG in most
places, and add a couple of !CONFIG_SLOB checks in the slab callbacks.

But that's beyond the scope of your patch series, so I'm also okay
with this patch here.

> We know that the LRU pages are not accounted at the root level. But the
> page->memcg_data points to the root_mem_cgroup. So the page->memcg_data
> of the LRU pages always points to a valid pointer. But the root_mem_cgroup
> does not have an object cgroup. If we use obj_cgroup APIs to charge the
> LRU pages, we should set the page->memcg_data to a root object cgroup. So
> we also allocate an object cgroup for the root_mem_cgroup and introduce
> root_obj_cgroup to cache its value just like root_mem_cgroup.
>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

Overall, the patch makes sense to me. A few comments below:

> @@ -252,9 +253,14 @@ struct cgroup_subsys_state *vmpressure_to_css(struct vmpressure *vmpr)
>  	return &container_of(vmpr, struct mem_cgroup, vmpressure)->css;
>  }
>
> -#ifdef CONFIG_MEMCG_KMEM
>  extern spinlock_t css_set_lock;
>
> +static inline bool obj_cgroup_is_root(struct obj_cgroup *objcg)
> +{
> +	return objcg == root_obj_cgroup;
> +}

This function, and by extension root_obj_cgroup, aren't used by this
patch. Please move them to the patch that adds users for them.

> @@ -298,6 +304,20 @@ static void obj_cgroup_release(struct percpu_ref *ref)
>  	percpu_ref_exit(ref);
>  	kfree_rcu(objcg, rcu);
>  }
> +#else
> +static void obj_cgroup_release(struct percpu_ref *ref)
> +{
> +	struct obj_cgroup *objcg = container_of(ref, struct obj_cgroup, refcnt);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&css_set_lock, flags);
> +	list_del(&objcg->list);
> +	spin_unlock_irqrestore(&css_set_lock, flags);
> +
> +	percpu_ref_exit(ref);
> +	kfree_rcu(objcg, rcu);
> +}
> +#endif

Having two separate functions for if and else is good when the else
branch is a completely empty dummy function. In this case you end up
duplicating code, so it's better to have just one function and put the
ifdef around the nr_charged_bytes handling in it.
> @@ -318,10 +338,14 @@ static struct obj_cgroup *obj_cgroup_alloc(void)
>  	return objcg;
>  }
>
> -static void memcg_reparent_objcgs(struct mem_cgroup *memcg,
> -				  struct mem_cgroup *parent)
> +static void memcg_reparent_objcgs(struct mem_cgroup *memcg)
>  {
>  	struct obj_cgroup *objcg, *iter;
> +	struct mem_cgroup *parent;
> +
> +	parent = parent_mem_cgroup(memcg);
> +	if (!parent)
> +		parent = root_mem_cgroup;
>
>  	objcg = rcu_replace_pointer(memcg->objcg, NULL, true);
>
> @@ -342,6 +366,27 @@ static void memcg_reparent_objcgs(struct mem_cgroup *memcg,
>  	percpu_ref_kill(&objcg->refcnt);
>  }
>
> +static int memcg_obj_cgroup_alloc(struct mem_cgroup *memcg)
> +{
> +	struct obj_cgroup *objcg;
> +
> +	objcg = obj_cgroup_alloc();
> +	if (!objcg)
> +		return -ENOMEM;
> +
> +	objcg->memcg = memcg;
> +	rcu_assign_pointer(memcg->objcg, objcg);
> +
> +	return 0;
> +}
> +
> +static void memcg_obj_cgroup_free(struct mem_cgroup *memcg)
> +{
> +	if (unlikely(memcg->objcg))
> +		memcg_reparent_objcgs(memcg);
> +}

It's confusing to have a 'free' function not free the object it's
called on.

But rather than search for a fitting name, I think it might be better
to just fold both of these short functions into their only callsites.

Also, since memcg->objcg is reparented, and the pointer cleared, on
offlining, when could this ever be non-NULL? This deserves a comment.
> @@ -3444,7 +3489,6 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
>  #ifdef CONFIG_MEMCG_KMEM
>  static int memcg_online_kmem(struct mem_cgroup *memcg)
>  {
> -	struct obj_cgroup *objcg;
>  	int memcg_id;
>
>  	if (cgroup_memory_nokmem)
> @@ -3457,14 +3501,6 @@ static int memcg_online_kmem(struct mem_cgroup *memcg)
>  	if (memcg_id < 0)
>  		return memcg_id;
>
> -	objcg = obj_cgroup_alloc();
> -	if (!objcg) {
> -		memcg_free_cache_id(memcg_id);
> -		return -ENOMEM;
> -	}
> -	objcg->memcg = memcg;
> -	rcu_assign_pointer(memcg->objcg, objcg);
> -
>  	static_branch_enable(&memcg_kmem_enabled_key);
>
>  	memcg->kmemcg_id = memcg_id;
> @@ -3488,7 +3524,7 @@ static void memcg_offline_kmem(struct mem_cgroup *memcg)
>  	if (!parent)
>  		parent = root_mem_cgroup;
>
> -	memcg_reparent_objcgs(memcg, parent);
> +	memcg_reparent_objcgs(memcg);

Since the objcg is no longer tied to kmem, this should move to
mem_cgroup_css_offline() instead.
On Fri, Apr 09, 2021 at 08:29:41PM +0800, Muchun Song wrote:
> Since the following patchsets applied. All the kernel memory are charged
> with the new APIs of obj_cgroup.
>
> 	[v17,00/19] The new cgroup slab memory controller
> 	[v5,0/7] Use obj_cgroup APIs to charge kmem pages
>
> But user memory allocations (LRU pages) pinning memcgs for a long time -
> it exists at a larger scale and is causing recurring problems in the real
> world: page cache doesn't get reclaimed for a long time, or is used by the
> second, third, fourth, ... instance of the same job that was restarted into
> a new cgroup every time. Unreclaimable dying cgroups pile up, waste memory,
> and make page reclaim very inefficient.
>
> We can convert LRU pages and most other raw memcg pins to the objcg direction
> to fix this problem, and then the LRU pages will not pin the memcgs.
>
> This patchset aims to make the LRU pages to drop the reference to memory
> cgroup by using the APIs of obj_cgroup. Finally, we can see that the number
> of the dying cgroups will not increase if we run the following test script.
>
> ```bash
> #!/bin/bash
>
> cat /proc/cgroups | grep memory
>
> cd /sys/fs/cgroup/memory
>
> for i in range{1..500}
> do
> 	mkdir test
> 	echo $$ > test/cgroup.procs
> 	sleep 60 &
> 	echo $$ > cgroup.procs
> 	echo `cat test/cgroup.procs` > cgroup.procs
> 	rmdir test
> done
>
> cat /proc/cgroups | grep memory
> ```
>
> Patch 1 aims to fix page charging in page replacement.
> Patch 2-5 are code cleanup and simplification.
> Patch 6-18 convert LRU pages pin to the objcg direction.
>
> Any comments are welcome. Thanks.

Indeed the problem exists for a long time and it would be nice to fix it.
However I'm against merging the patchset in the current form (there are some
nice fixes/clean-ups, which can/must be applied independently).
Let me explain my concerns:

Back to the new slab controller discussion obj_cgroup was suggested by Johannes
as a union of two concepts:
1) reparenting (basically an auto-pointer to a memcg in c++ terms)
2) byte-sized accounting

I was initially against this union because I anticipated that the reparenting
part will be useful separately. And the time told it was true.

I still think obj_cgroup API must be significantly reworked before being
applied outside of the kmem area: reparenting part must be separated
and moved to the cgroup core level to be used not only in the memcg
context but also for other controllers, which are facing similar problems.
Spilling obj_cgroup API in the current form over all memcg code will
make it more complicated and will delay it, given the amount of changes
and the number of potential code conflicts.

I'm working on the generalization of obj_cgroup API (as described above)
and expect to have some patches next week.

Thanks!
On Sat, Apr 10, 2021 at 12:00 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Fri, Apr 09, 2021 at 08:29:45PM +0800, Muchun Song wrote:
> > We already have a helper lruvec_memcg() to get the memcg from lruvec, we
> > do not need to do it ourselves in the lruvec_holds_page_lru_lock(). So use
> > lruvec_memcg() instead. And if mem_cgroup_disabled() returns false, the
> > page_memcg(page) (the LRU pages) cannot be NULL. So remove the odd logic
> > of "memcg = page_memcg(page) ? : root_mem_cgroup". And use lruvec_pgdat
> > to simplify the code. We can have a single definition for this function
> > that works for !CONFIG_MEMCG, CONFIG_MEMCG + mem_cgroup_disabled() and
> > CONFIG_MEMCG.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>
> Looks good to me.
>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Thanks for your review.

> If you haven't done so yet, please make sure to explicitly test with
> all three config combinations, just because the dummy abstractions for
> memcg disabled or compiled out tend to be paper thin and don't always
> behave the way you might expect when you do more complicated things.

I have tested. There is no problem. Thanks :-)

> Something like
>
> 	boot
> 	echo sparsefile >/dev/null	(> ram size to fill memory and reclaim)
> 	echo 1 >/proc/sys/vm/compact_memory
>
> should exercise this new function in a couple of important scenarios.
On Sat, Apr 10, 2021 at 12:56 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Fri, Apr 09, 2021 at 08:29:47PM +0800, Muchun Song wrote:
> > Because memory allocations pinning memcgs for a long time - it exists
> > at a larger scale and is causing recurring problems in the real world:
> > page cache doesn't get reclaimed for a long time, or is used by the
> > second, third, fourth, ... instance of the same job that was restarted
> > into a new cgroup every time. Unreclaimable dying cgroups pile up,
> > waste memory, and make page reclaim very inefficient.
> >
> > We can convert LRU pages and most other raw memcg pins to the objcg
> > direction to fix this problem, and then the page->memcg will always
> > point to an object cgroup pointer.
> >
> > Therefore, the infrastructure of objcg no longer only serves
> > CONFIG_MEMCG_KMEM. In this patch, we move the infrastructure of the
> > objcg out of the scope of the CONFIG_MEMCG_KMEM so that the LRU pages
> > can reuse it to charge pages.
>
> Just an observation on this:
>
> We actually may want to remove CONFIG_MEMCG_KMEM altogether at this
> point. It used to be an optional feature, but nowadays it's not
> configurable anymore, and always on unless slob is configured.
>
> We've also added more than just slab accounting to it, like kernel
> stack pages, and it all gets disabled on slob configs just because it
> doesn't support slab object tracking.
>
> We could probably replace CONFIG_MEMCG_KMEM with CONFIG_MEMCG in most
> places, and add a couple of !CONFIG_SLOB checks in the slab callbacks.
>
> But that's beyond the scope of your patch series, so I'm also okay
> with this patch here.
>
> > We know that the LRU pages are not accounted at the root level. But the
> > page->memcg_data points to the root_mem_cgroup. So the page->memcg_data
> > of the LRU pages always points to a valid pointer. But the root_mem_cgroup
> > does not have an object cgroup. If we use obj_cgroup APIs to charge the
> > LRU pages, we should set the page->memcg_data to a root object cgroup. So
> > we also allocate an object cgroup for the root_mem_cgroup and introduce
> > root_obj_cgroup to cache its value just like root_mem_cgroup.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>
> Overall, the patch makes sense to me. A few comments below:
>
> > @@ -252,9 +253,14 @@ struct cgroup_subsys_state *vmpressure_to_css(struct vmpressure *vmpr)
> >  	return &container_of(vmpr, struct mem_cgroup, vmpressure)->css;
> >  }
> >
> > -#ifdef CONFIG_MEMCG_KMEM
> >  extern spinlock_t css_set_lock;
> >
> > +static inline bool obj_cgroup_is_root(struct obj_cgroup *objcg)
> > +{
> > +	return objcg == root_obj_cgroup;
> > +}
>
> This function, and by extension root_obj_cgroup, aren't used by this
> patch. Please move them to the patch that adds users for them.

OK. Will do.

> > @@ -298,6 +304,20 @@ static void obj_cgroup_release(struct percpu_ref *ref)
> >  	percpu_ref_exit(ref);
> >  	kfree_rcu(objcg, rcu);
> >  }
> > +#else
> > +static void obj_cgroup_release(struct percpu_ref *ref)
> > +{
> > +	struct obj_cgroup *objcg = container_of(ref, struct obj_cgroup, refcnt);
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&css_set_lock, flags);
> > +	list_del(&objcg->list);
> > +	spin_unlock_irqrestore(&css_set_lock, flags);
> > +
> > +	percpu_ref_exit(ref);
> > +	kfree_rcu(objcg, rcu);
> > +}
> > +#endif
>
> Having two separate functions for if and else is good when the else
> branch is a completely empty dummy function. In this case you end up
> duplicating code, so it's better to have just one function and put the
> ifdef around the nr_charged_bytes handling in it.

Makes sense. I will rework the code here.

> > @@ -318,10 +338,14 @@ static struct obj_cgroup *obj_cgroup_alloc(void)
> >  	return objcg;
> >  }
> >
> > -static void memcg_reparent_objcgs(struct mem_cgroup *memcg,
> > -				  struct mem_cgroup *parent)
> > +static void memcg_reparent_objcgs(struct mem_cgroup *memcg)
> >  {
> >  	struct obj_cgroup *objcg, *iter;
> > +	struct mem_cgroup *parent;
> > +
> > +	parent = parent_mem_cgroup(memcg);
> > +	if (!parent)
> > +		parent = root_mem_cgroup;
> >
> >  	objcg = rcu_replace_pointer(memcg->objcg, NULL, true);
> >
> > @@ -342,6 +366,27 @@ static void memcg_reparent_objcgs(struct mem_cgroup *memcg,
> >  	percpu_ref_kill(&objcg->refcnt);
> >  }
> >
> > +static int memcg_obj_cgroup_alloc(struct mem_cgroup *memcg)
> > +{
> > +	struct obj_cgroup *objcg;
> > +
> > +	objcg = obj_cgroup_alloc();
> > +	if (!objcg)
> > +		return -ENOMEM;
> > +
> > +	objcg->memcg = memcg;
> > +	rcu_assign_pointer(memcg->objcg, objcg);
> > +
> > +	return 0;
> > +}
> > +
> > +static void memcg_obj_cgroup_free(struct mem_cgroup *memcg)
> > +{
> > +	if (unlikely(memcg->objcg))
> > +		memcg_reparent_objcgs(memcg);
> > +}
>
> It's confusing to have a 'free' function not free the object it's
> called on.
>
> But rather than search for a fitting name, I think it might be better
> to just fold both of these short functions into their only callsites.

OK. Will do.

> Also, since memcg->objcg is reparented, and the pointer cleared, on
> offlining, when could this ever be non-NULL? This deserves a comment.

If css_alloc() fails, offlining does not happen. In that case, memcg->objcg
could be non-NULL (just like memcg_free_kmem() does). I will move
memcg_obj_cgroup_alloc() to mem_cgroup_css_online() so that we do not
need memcg_obj_cgroup_free().

> > @@ -3444,7 +3489,6 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
> >  #ifdef CONFIG_MEMCG_KMEM
> >  static int memcg_online_kmem(struct mem_cgroup *memcg)
> >  {
> > -	struct obj_cgroup *objcg;
> >  	int memcg_id;
> >
> >  	if (cgroup_memory_nokmem)
> > @@ -3457,14 +3501,6 @@ static int memcg_online_kmem(struct mem_cgroup *memcg)
> >  	if (memcg_id < 0)
> >  		return memcg_id;
> >
> > -	objcg = obj_cgroup_alloc();
> > -	if (!objcg) {
> > -		memcg_free_cache_id(memcg_id);
> > -		return -ENOMEM;
> > -	}
> > -	objcg->memcg = memcg;
> > -	rcu_assign_pointer(memcg->objcg, objcg);
> > -
> >  	static_branch_enable(&memcg_kmem_enabled_key);
> >
> >  	memcg->kmemcg_id = memcg_id;
> > @@ -3488,7 +3524,7 @@ static void memcg_offline_kmem(struct mem_cgroup *memcg)
> >  	if (!parent)
> >  		parent = root_mem_cgroup;
> >
> > -	memcg_reparent_objcgs(memcg, parent);
> > +	memcg_reparent_objcgs(memcg);
>
> Since the objcg is no longer tied to kmem, this should move to
> mem_cgroup_css_offline() instead.

LGTM, will do. Thanks for all your suggestions.
On Fri, Apr 09, 2021 at 06:29:46PM -0700, Roman Gushchin wrote:
> On Fri, Apr 09, 2021 at 08:29:41PM +0800, Muchun Song wrote:
> > Since the following patchsets applied. All the kernel memory are charged
> > with the new APIs of obj_cgroup.
> >
> > 	[v17,00/19] The new cgroup slab memory controller
> > 	[v5,0/7] Use obj_cgroup APIs to charge kmem pages
> >
> > But user memory allocations (LRU pages) pinning memcgs for a long time -
> > it exists at a larger scale and is causing recurring problems in the real
> > world: page cache doesn't get reclaimed for a long time, or is used by the
> > second, third, fourth, ... instance of the same job that was restarted into
> > a new cgroup every time. Unreclaimable dying cgroups pile up, waste memory,
> > and make page reclaim very inefficient.
> >
> > We can convert LRU pages and most other raw memcg pins to the objcg direction
> > to fix this problem, and then the LRU pages will not pin the memcgs.
> >
> > This patchset aims to make the LRU pages to drop the reference to memory
> > cgroup by using the APIs of obj_cgroup. Finally, we can see that the number
> > of the dying cgroups will not increase if we run the following test script.
> >
> > ```bash
> > #!/bin/bash
> >
> > cat /proc/cgroups | grep memory
> >
> > cd /sys/fs/cgroup/memory
> >
> > for i in range{1..500}
> > do
> > 	mkdir test
> > 	echo $$ > test/cgroup.procs
> > 	sleep 60 &
> > 	echo $$ > cgroup.procs
> > 	echo `cat test/cgroup.procs` > cgroup.procs
> > 	rmdir test
> > done
> >
> > cat /proc/cgroups | grep memory
> > ```
> >
> > Patch 1 aims to fix page charging in page replacement.
> > Patch 2-5 are code cleanup and simplification.
> > Patch 6-18 convert LRU pages pin to the objcg direction.
> >
> > Any comments are welcome. Thanks.
>
> Indeed the problem exists for a long time and it would be nice to fix it.
> However I'm against merging the patchset in the current form (there are some
> nice fixes/clean-ups, which can/must be applied independently). Let me explain
> my concerns:
>
> Back to the new slab controller discussion obj_cgroup was suggested by Johannes
> as a union of two concepts:
> 1) reparenting (basically an auto-pointer to a memcg in c++ terms)
> 2) byte-sized accounting
>
> I was initially against this union because I anticipated that the reparenting
> part will be useful separately. And the time told it was true.

"The idea of moving stocks and leftovers to the memcg_ptr/obj_cgroup
level is really good."

https://lore.kernel.org/lkml/20191025200020.GA8325@castle.DHCP.thefacebook.com/

If you recall, the main concern was how the byte charging interface
was added to the existing page charging interface, instead of being
layered on top of it. I suggested to do that and, since there was no
other user for the indirection pointer, just include it in the API.

It made sense at the time, and you seemed to agree. But I also agree
it makes sense to factor it out now that more users are materializing.

> I still think obj_cgroup API must be significantly reworked before being
> applied outside of the kmem area: reparenting part must be separated
> and moved to the cgroup core level to be used not only in the memcg
> context but also for other controllers, which are facing similar problems.
> Spilling obj_cgroup API in the current form over all memcg code will
> make it more complicated and will delay it, given the amount of changes
> and the number of potential code conflicts.
>
> I'm working on the generalization of obj_cgroup API (as described above)
> and expect to have some patches next week.

Yeah, splitting the byte charging API from the reference API and
making the latter cgroup-generic makes sense. I'm looking forward to
your patches.

And yes, the conflicts between that work and Muchun's patches would be
quite large. However, most of them would come down to renames, since
the access rules and refcounting sites will remain the same, so it
shouldn't be too bad to rebase Muchun's patches on yours. And we can
continue reviewing his patches for correctness for now.

Thanks
On Mon, Apr 12, 2021 at 01:14:57PM -0400, Johannes Weiner wrote:
> On Fri, Apr 09, 2021 at 06:29:46PM -0700, Roman Gushchin wrote:
> > On Fri, Apr 09, 2021 at 08:29:41PM +0800, Muchun Song wrote:
> > > Since the following patchsets applied. All the kernel memory are charged
> > > with the new APIs of obj_cgroup.
> > >
> > > 	[v17,00/19] The new cgroup slab memory controller
> > > 	[v5,0/7] Use obj_cgroup APIs to charge kmem pages
> > >
> > > But user memory allocations (LRU pages) pinning memcgs for a long time -
> > > it exists at a larger scale and is causing recurring problems in the real
> > > world: page cache doesn't get reclaimed for a long time, or is used by the
> > > second, third, fourth, ... instance of the same job that was restarted into
> > > a new cgroup every time. Unreclaimable dying cgroups pile up, waste memory,
> > > and make page reclaim very inefficient.
> > >
> > > We can convert LRU pages and most other raw memcg pins to the objcg direction
> > > to fix this problem, and then the LRU pages will not pin the memcgs.
> > >
> > > This patchset aims to make the LRU pages to drop the reference to memory
> > > cgroup by using the APIs of obj_cgroup. Finally, we can see that the number
> > > of the dying cgroups will not increase if we run the following test script.
> > >
> > > ```bash
> > > #!/bin/bash
> > >
> > > cat /proc/cgroups | grep memory
> > >
> > > cd /sys/fs/cgroup/memory
> > >
> > > for i in range{1..500}
> > > do
> > > 	mkdir test
> > > 	echo $$ > test/cgroup.procs
> > > 	sleep 60 &
> > > 	echo $$ > cgroup.procs
> > > 	echo `cat test/cgroup.procs` > cgroup.procs
> > > 	rmdir test
> > > done
> > >
> > > cat /proc/cgroups | grep memory
> > > ```
> > >
> > > Patch 1 aims to fix page charging in page replacement.
> > > Patch 2-5 are code cleanup and simplification.
> > > Patch 6-18 convert LRU pages pin to the objcg direction.
> > >
> > > Any comments are welcome. Thanks.
> >
> > Indeed the problem exists for a long time and it would be nice to fix it.
> > However I'm against merging the patchset in the current form (there are some
> > nice fixes/clean-ups, which can/must be applied independently). Let me explain
> > my concerns:
> >
> > Back to the new slab controller discussion obj_cgroup was suggested by Johannes
> > as a union of two concepts:
> > 1) reparenting (basically an auto-pointer to a memcg in c++ terms)
> > 2) byte-sized accounting
> >
> > I was initially against this union because I anticipated that the reparenting
> > part will be useful separately. And the time told it was true.
>
> "The idea of moving stocks and leftovers to the memcg_ptr/obj_cgroup
> level is really good."
>
> https://lore.kernel.org/lkml/20191025200020.GA8325@castle.DHCP.thefacebook.com/
>
> If you recall, the main concern was how the byte charging interface
> was added to the existing page charging interface, instead of being
> layered on top of it. I suggested to do that and, since there was no
> other user for the indirection pointer, just include it in the API.
>
> It made sense at the time, and you seemed to agree. But I also agree
> it makes sense to factor it out now that more users are materializing.

Agreed.

> > I still think obj_cgroup API must be significantly reworked before being
> > applied outside of the kmem area: reparenting part must be separated
> > and moved to the cgroup core level to be used not only in the memcg
> > context but also for other controllers, which are facing similar problems.
> > Spilling obj_cgroup API in the current form over all memcg code will
> > make it more complicated and will delay it, given the amount of changes
> > and the number of potential code conflicts.
> >
> > I'm working on the generalization of obj_cgroup API (as described above)
> > and expect to have some patches next week.
>
> Yeah, splitting the byte charging API from the reference API and
> making the latter cgroup-generic makes sense. I'm looking forward to
> your patches.
>
> And yes, the conflicts between that work and Muchun's patches would be
> quite large. However, most of them would come down to renames, since
> the access rules and refcounting sites will remain the same, so it
> shouldn't be too bad to rebase Muchun's patches on yours. And we can
> continue reviewing his patches for correctness for now.

Sounds good to me!

Thanks