[bpf-next,v2] bpf: reparent bpf maps on memcg offlining

Message ID	20220711162827.184743-1-roman.gushchin@linux.dev (mailing list archive)
State	Accepted
Commit	cbddef2759b6bc3b591271d023bfc166012c25d5
Delegated to:	BPF
Headers	Return-Path: <bpf-owner@kernel.org>
	From: Roman Gushchin <roman.gushchin@linux.dev>
	To: bpf@vger.kernel.org
	Cc: Shakeel Butt <shakeelb@google.com>, Alexei Starovoitov <ast@kernel.org>, linux-kernel@vger.kernel.org, Roman Gushchin <roman.gushchin@linux.dev>
	Subject: [PATCH bpf-next v2] bpf: reparent bpf maps on memcg offlining
	Date: Mon, 11 Jul 2022 09:28:27 -0700
	Message-Id: <20220711162827.184743-1-roman.gushchin@linux.dev>
	MIME-Version: 1.0
	Content-Transfer-Encoding: 8bit
	Precedence: bulk
Series	[bpf-next,v2] bpf: reparent bpf maps on memcg offlining

Context	Check	Description
netdev/tree_selection	success	Clearly marked for bpf-next, async
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/subject_prefix	success	Link
netdev/cover_letter	success	Single patches do not need cover letters
netdev/patch_count	success	Link
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 1427 this patch: 1427
netdev/cc_maintainers	warning	10 maintainers not CCed: haoluo@google.com song@kernel.org daniel@iogearbox.net yhs@fb.com martin.lau@linux.dev john.fastabend@gmail.com jolsa@kernel.org andrii@kernel.org sdf@google.com kpsingh@kernel.org
netdev/build_clang	success	Errors and warnings before: 168 this patch: 168
netdev/module_param	success	Was 0 now: 0
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 1436 this patch: 1436
netdev/checkpatch	fail	ERROR: open brace '{' following function definitions go on the next line
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0
bpf/vmtest-bpf-next-PR	success	PR summary
bpf/vmtest-bpf-next-VM_Test-1	success	Logs for Kernel LATEST on ubuntu-latest with gcc
bpf/vmtest-bpf-next-VM_Test-2	success	Logs for Kernel LATEST on ubuntu-latest with llvm-15
bpf/vmtest-bpf-next-VM_Test-3	success	Logs for Kernel LATEST on z15 with gcc

Roman Gushchin July 11, 2022, 4:28 p.m. UTC

The memory consumed by a bpf map is always accounted to the memory
cgroup of the process which created the map. The map can outlive
the memory cgroup if it's used by processes in other cgroups or
is pinned on bpffs. In this case the map pins the original cgroup
in the dying state.

For other types of objects (slab objects, non-slab kernel allocations,
percpu objects and, recently, LRU pages) a reparenting process is
implemented: when a cgroup goes offline, its charged objects are
reassigned to the parent cgroup. Because all charges and statistics
are fully recursive, it's a fairly cheap operation.
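
To illustrate why recursive charging makes reparenting cheap, here is a minimal userspace sketch. It is a toy model, not kernel code: every `toy_*` name is hypothetical and the real memcg counters are considerably more involved. The point it tries to show is that the parent's counter already includes the child's charge, so reparenting only has to retarget a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, not kernel code. All toy_* names are hypothetical. */
struct toy_memcg {
	struct toy_memcg *parent;
	long usage;	/* bytes charged to this cgroup and its descendants */
};

/* An accounted object remembers which cgroup it is charged to. */
struct toy_object {
	struct toy_memcg *owner;
	long bytes;
};

/* Charge recursively: every ancestor's counter includes the charge. */
static void toy_charge(struct toy_memcg *cg, long bytes)
{
	for (; cg; cg = cg->parent)
		cg->usage += bytes;
}

/* Uncharge is the same walk with the sign flipped. */
static void toy_uncharge(struct toy_memcg *cg, long bytes)
{
	toy_charge(cg, -bytes);
}

/*
 * Reparent on offlining: only the owner pointer moves. The parent's
 * counter (and every ancestor's) already contains obj->bytes, so no
 * counter has to be touched here.
 */
static void toy_reparent(struct toy_object *obj)
{
	obj->owner = obj->owner->parent;
}
```

In this model a later `toy_uncharge(obj->owner, obj->bytes)` walks up from the parent, so the surviving counters return to zero without ever consulting the offlined child.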

For efficiency and consistency with other types of objects, let's do
the same for bpf maps. Fortunately, thanks to the objcg API, the
required changes are minimal.

Please note that individual allocations (slabs, percpu and large
kmallocs) already have the reparenting mechanism. This commit adds
it to the saved map->memcg pointer by replacing it with map->objcg.
Because dying cgroups are not visible to a user and all charges are
recursive, this commit doesn't change any user-visible behavior.
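
The effect of storing an objcg instead of a memcg can be sketched with another toy userspace model. Again, this is not kernel code: the `toy_*` names and the `pins` field are hypothetical, and the model ignores RCU and locking entirely. It assumes only what the patch describes: the map keeps a reference to an indirection object, a NULL pointer stands for the root cgroup, and offlining retargets the indirection object to the parent.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, not kernel code. All toy_* names are hypothetical. */
struct toy_memcg {
	struct toy_memcg *parent;
	int pins;	/* long-lived references that would keep a dying cgroup around */
};

/* Indirection object: long-lived users hold this instead of the memcg. */
struct toy_objcg {
	struct toy_memcg *memcg;	/* retargeted to the parent on offlining */
	int refs;
};

struct toy_map {
	struct toy_objcg *objcg;	/* NULL models "created in the root cgroup" */
};

/* Save the charging context at map creation: pin the objcg, not the memcg. */
static void toy_map_save(struct toy_map *map, struct toy_objcg *objcg)
{
	map->objcg = objcg;
	if (objcg)
		objcg->refs++;
}

/* Offlining: point the objcg at the parent; the map itself is untouched. */
static void toy_memcg_offline(struct toy_memcg *cg, struct toy_objcg *objcg)
{
	objcg->memcg = cg->parent;
}

/* Resolve the cgroup to charge right now, falling back to root for NULL. */
static struct toy_memcg *toy_map_memcg(const struct toy_map *map,
				       struct toy_memcg *root)
{
	return map->objcg ? map->objcg->memcg : root;
}
```

Because the map never increments `pins` on the child, nothing in this model keeps the offlined cgroup alive, and later allocations resolve to the parent.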

v2:
  added a missing const qualifier

Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
---
 include/linux/bpf.h  |  2 +-
 kernel/bpf/syscall.c | 35 +++++++++++++++++++++++++++--------
 2 files changed, 28 insertions(+), 9 deletions(-)

Alexei Starovoitov July 12, 2022, 9:48 p.m. UTC | #1

On Mon, Jul 11, 2022 at 9:28 AM Roman Gushchin <roman.gushchin@linux.dev> wrote:
>
> [...]
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 2b21f2a3452f..85a4db3e0536 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -221,7 +221,7 @@ struct bpf_map {
>         u32 btf_vmlinux_value_type_id;
>         struct btf *btf;
>  #ifdef CONFIG_MEMCG_KMEM
> -       struct mem_cgroup *memcg;
> +       struct obj_cgroup *objcg;
>  #endif
>         char name[BPF_OBJ_NAME_LEN];
>         struct bpf_map_off_arr *off_arr;
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index ab688d85b2c6..ef60dbc21b17 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -419,35 +419,52 @@ void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock)
>  #ifdef CONFIG_MEMCG_KMEM
>  static void bpf_map_save_memcg(struct bpf_map *map)
>  {
> -       map->memcg = get_mem_cgroup_from_mm(current->mm);
> +       /* Currently if a map is created by a process belonging to the root
> +        * memory cgroup, get_obj_cgroup_from_current() will return NULL.
> +        * So we have to check map->objcg for being NULL each time it's
> +        * being used.
> +        */
> +       map->objcg = get_obj_cgroup_from_current();
>  }
>
>  static void bpf_map_release_memcg(struct bpf_map *map)
>  {
> -       mem_cgroup_put(map->memcg);
> +       if (map->objcg)
> +               obj_cgroup_put(map->objcg);
> +}
> +
> +static struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map) {
> +       if (map->objcg)
> +               return get_mem_cgroup_from_objcg(map->objcg);
> +
> +       return root_mem_cgroup;
>  }
>
>  void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
>                            int node)
>  {
> -       struct mem_cgroup *old_memcg;
> +       struct mem_cgroup *memcg, *old_memcg;
>         void *ptr;
>
> -       old_memcg = set_active_memcg(map->memcg);
> +       memcg = bpf_map_get_memcg(map);
> +       old_memcg = set_active_memcg(memcg);
>         ptr = kmalloc_node(size, flags | __GFP_ACCOUNT, node);
>         set_active_memcg(old_memcg);
> +       mem_cgroup_put(memcg);

Here we might css_put root_mem_cgroup.
Should we css_get it when returning, or
is it marked as CSS_NO_REF?
But mem_cgroup_alloc() doesn't seem to be doing that marking.
I'm lost in that code.

Shakeel Butt July 12, 2022, 10:11 p.m. UTC | #2

On Tue, Jul 12, 2022 at 2:49 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Jul 11, 2022 at 9:28 AM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> >
> > [...]
> >
> >  void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
> >                            int node)
> >  {
> > -       struct mem_cgroup *old_memcg;
> > +       struct mem_cgroup *memcg, *old_memcg;
> >         void *ptr;
> >
> > -       old_memcg = set_active_memcg(map->memcg);
> > +       memcg = bpf_map_get_memcg(map);
> > +       old_memcg = set_active_memcg(memcg);
> >         ptr = kmalloc_node(size, flags | __GFP_ACCOUNT, node);
> >         set_active_memcg(old_memcg);
> > +       mem_cgroup_put(memcg);
>
> Here we might css_put root_mem_cgroup.
> Should we css_get it when returning or
> it's marked as CSS_NO_REF ?
> But mem_cgroup_alloc() doesn't seem to be doing that marking.
> I'm lost at that code.

CSS_NO_REF is set for root_mem_cgroup in cgroup_init_subsys().

Alexei Starovoitov July 12, 2022, 10:15 p.m. UTC | #3

On Tue, Jul 12, 2022 at 3:11 PM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Tue, Jul 12, 2022 at 2:49 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, Jul 11, 2022 at 9:28 AM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > >
> > > [...]
> > >
> > >  void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
> > >                            int node)
> > >  {
> > > -       struct mem_cgroup *old_memcg;
> > > +       struct mem_cgroup *memcg, *old_memcg;
> > >         void *ptr;
> > >
> > > -       old_memcg = set_active_memcg(map->memcg);
> > > +       memcg = bpf_map_get_memcg(map);
> > > +       old_memcg = set_active_memcg(memcg);
> > >         ptr = kmalloc_node(size, flags | __GFP_ACCOUNT, node);
> > >         set_active_memcg(old_memcg);
> > > +       mem_cgroup_put(memcg);
> >
> > Here we might css_put root_mem_cgroup.
> > Should we css_get it when returning or
> > it's marked as CSS_NO_REF ?
> > But mem_cgroup_alloc() doesn't seem to be doing that marking.
> > I'm lost at that code.
>
> CSS_NO_REF is set for root_mem_cgroup in cgroup_init_subsys().

Ahh. I see that
css = ss->css_alloc(NULL); css->flags |= CSS_NO_REF; now.
Thanks.
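
The exchange above comes down to a flag check in the get/put helpers. The following is a toy userspace sketch of that pattern, assuming a plain integer counter in place of the kernel's per-cpu refcount; the `toy_*` and `TOY_*` names are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>

/* Toy userspace model, not kernel code. toy_* and TOY_* names are hypothetical. */
#define TOY_CSS_NO_REF (1u << 0)

struct toy_css {
	unsigned int flags;
	long refcnt;
};

/* Take a reference unless the object is marked as not refcounted. */
static void toy_css_get(struct toy_css *css)
{
	if (!(css->flags & TOY_CSS_NO_REF))
		css->refcnt++;
}

/* Drop a reference; a no-op for objects marked as not refcounted. */
static void toy_css_put(struct toy_css *css)
{
	if (!(css->flags & TOY_CSS_NO_REF))
		css->refcnt--;
}
```

With the flag set on the root object, an unbalanced put such as the one in bpf_map_kmalloc_node() leaves the counter untouched in this model, which is why returning root_mem_cgroup without a matching get is harmless.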

patchwork-bot+netdevbpf@kernel.org July 12, 2022, 10:50 p.m. UTC | #4

Hello:

This patch was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Mon, 11 Jul 2022 09:28:27 -0700 you wrote:
> The memory consumed by a bpf map is always accounted to the memory
> cgroup of the process which created the map. The map can outlive
> the memory cgroup if it's used by processes in other cgroups or
> is pinned on bpffs. In this case the map pins the original cgroup
> in the dying state.
> 
> For other types of objects (slab objects, non-slab kernel allocations,
> percpu objects and recently LRU pages) there is a reparenting process
> implemented: on cgroup offlining charged objects are getting
> reassigned to the parent cgroup. Because all charges and statistics
> are fully recursive it's a fairly cheap operation.
> 
> [...]

Here is the summary with links:
  - [bpf-next,v2] bpf: reparent bpf maps on memcg offlining
    https://git.kernel.org/bpf/bpf-next/c/cbddef2759b6

You are awesome, thank you!

Roman Gushchin July 13, 2022, 2:13 a.m. UTC | #5

On Tue, Jul 12, 2022 at 03:15:27PM -0700, Alexei Starovoitov wrote:
> On Tue, Jul 12, 2022 at 3:11 PM Shakeel Butt <shakeelb@google.com> wrote:
> >
> > On Tue, Jul 12, 2022 at 2:49 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Mon, Jul 11, 2022 at 9:28 AM Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > > >
> > > > [...]
> > > >
> > > >  void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
> > > >                            int node)
> > > >  {
> > > > -       struct mem_cgroup *old_memcg;
> > > > +       struct mem_cgroup *memcg, *old_memcg;
> > > >         void *ptr;
> > > >
> > > > -       old_memcg = set_active_memcg(map->memcg);
> > > > +       memcg = bpf_map_get_memcg(map);
> > > > +       old_memcg = set_active_memcg(memcg);
> > > >         ptr = kmalloc_node(size, flags | __GFP_ACCOUNT, node);
> > > >         set_active_memcg(old_memcg);
> > > > +       mem_cgroup_put(memcg);
> > >
> > > Here we might css_put root_mem_cgroup.
> > > Should we css_get it when returning or
> > > it's marked as CSS_NO_REF ?
> > > But mem_cgroup_alloc() doesn't seem to be doing that marking.
> > > I'm lost at that code.
> >
> > CSS_NO_REF is set for root_mem_cgroup in cgroup_init_subsys().

Yeah, the root cgroups can't be deleted so we save on the refcounting.
> 
> Ahh. I see that
> css = ss->css_alloc(NULL); css->flags |= CSS_NO_REF; now.
> Thanks.

Thanks for applying the patch!
