diff mbox series

[bpf-next,05/15] bpf: Fix incorrect mem_cgroup_put

Message ID 20220810151840.16394-6-laoar.shao@gmail.com (mailing list archive)
State New
Headers show
Series bpf: Introduce selectable memcg for bpf map | expand

Commit Message

Yafang Shao Aug. 10, 2022, 3:18 p.m. UTC
The memcg may be the root_mem_cgroup, in which case we shouldn't put it.
So a new helper bpf_map_put_memcg() is introduced to pair with
bpf_map_get_memcg().

Fixes: 4201d9ab3e42 ("bpf: reparent bpf maps on memcg offlining")
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 kernel/bpf/syscall.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

Comments

Shakeel Butt Aug. 10, 2022, 5:07 p.m. UTC | #1
On Wed, Aug 10, 2022 at 03:18:30PM +0000, Yafang Shao wrote:
> The memcg may be the root_mem_cgroup, in which case we shouldn't put it.

No, it is ok to put root_mem_cgroup. css_put already handles the root
cgroups.
Yafang Shao Aug. 11, 2022, 2:49 a.m. UTC | #2
On Thu, Aug 11, 2022 at 1:07 AM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Wed, Aug 10, 2022 at 03:18:30PM +0000, Yafang Shao wrote:
> > The memcg may be the root_mem_cgroup, in which case we shouldn't put it.
>
> No, it is ok to put root_mem_cgroup. css_put already handles the root
> cgroups.
>

Ah, this commit log doesn't describe the issue clearly. I should improve it.
The issue is that in bpf_map_get_memcg() it doesn't get the objcg if
map->objcg is NULL (that can happen if the map belongs to the root
memcg), so we shouldn't put the objcg if map->objcg is NULL neither in
bpf_map_put_memcg().
Maybe the change below would be more reasonable ?

+static void bpf_map_put_memcg(const struct bpf_map *map, struct
mem_cgroup *memcg)
+{
+       if (map->objcg)
+           mem_cgroup_put(memcg);
+}
Shakeel Butt Aug. 11, 2022, 3:47 p.m. UTC | #3
On Thu, Aug 11, 2022 at 10:49:13AM +0800, Yafang Shao wrote:
> On Thu, Aug 11, 2022 at 1:07 AM Shakeel Butt <shakeelb@google.com> wrote:
> >
> > On Wed, Aug 10, 2022 at 03:18:30PM +0000, Yafang Shao wrote:
> > > The memcg may be the root_mem_cgroup, in which case we shouldn't put it.
> >
> > No, it is ok to put root_mem_cgroup. css_put already handles the root
> > cgroups.
> >
> 
> Ah, this commit log doesn't describe the issue clearly. I should improve it.
> The issue is that in bpf_map_get_memcg() it doesn't get the objcg if
> map->objcg is NULL (that can happen if the map belongs to the root
> memcg), so we shouldn't put the objcg if map->objcg is NULL neither in
> bpf_map_put_memcg().

Sorry I am still not understanding. We are not 'getting' objcg in
bpf_map_get_memcg(). We are 'getting' memcg from map->objcg and if that
is NULL the function is returning root memcg and putting root memcg is
totally fine. Or are you saying that root_mem_cgroup is NULL?

> Maybe the change below would be more reasonable ?
> 
> +static void bpf_map_put_memcg(const struct bpf_map *map, struct
> mem_cgroup *memcg)
> +{
> +       if (map->objcg)
> +           mem_cgroup_put(memcg);
> +}
> 
> -- 
> Regards
> Yafang
Roman Gushchin Aug. 11, 2022, 4:48 p.m. UTC | #4
On Wed, Aug 10, 2022 at 03:18:30PM +0000, Yafang Shao wrote:
> The memcg may be the root_mem_cgroup, in which case we shouldn't put it.
> So a new helper bpf_map_put_memcg() is introduced to pair with
> bpf_map_get_memcg().
> 
> Fixes: 4201d9ab3e42 ("bpf: reparent bpf maps on memcg offlining")
> Cc: Roman Gushchin <roman.gushchin@linux.dev>
> Cc: Shakeel Butt <shakeelb@google.com>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
>  kernel/bpf/syscall.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 83c7136..51ab8b1 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -441,6 +441,14 @@ static struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
>  	return root_mem_cgroup;
>  }
>  
> +static void bpf_map_put_memcg(struct mem_cgroup *memcg)
> +{
> +	if (mem_cgroup_is_root(memcg))
> +		return;
> +
> +	mem_cgroup_put(memcg);
> +}

+1 to what Shakeel said. mem_cgroup_put(root_mem_cgroup) is totally valid.
So this change does absolutely nothing.

The fixes tag assumes there is a bug in the existing code. If so, please,
describe the problem and how to reproduce it.

Also, if it's not related to the rest of the patchset, please, send it
separately.

Thanks!
Yafang Shao Aug. 12, 2022, 12:27 a.m. UTC | #5
On Thu, Aug 11, 2022 at 11:47 PM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Thu, Aug 11, 2022 at 10:49:13AM +0800, Yafang Shao wrote:
> > On Thu, Aug 11, 2022 at 1:07 AM Shakeel Butt <shakeelb@google.com> wrote:
> > >
> > > On Wed, Aug 10, 2022 at 03:18:30PM +0000, Yafang Shao wrote:
> > > > The memcg may be the root_mem_cgroup, in which case we shouldn't put it.
> > >
> > > No, it is ok to put root_mem_cgroup. css_put already handles the root
> > > cgroups.
> > >
> >
> > Ah, this commit log doesn't describe the issue clearly. I should improve it.
> > The issue is that in bpf_map_get_memcg() it doesn't get the objcg if
> > map->objcg is NULL (that can happen if the map belongs to the root
> > memcg), so we shouldn't put the objcg if map->objcg is NULL neither in
> > bpf_map_put_memcg().
>
> Sorry I am still not understanding. We are not 'getting' objcg in
> bpf_map_get_memcg(). We are 'getting' memcg from map->objcg and if that
> is NULL the function is returning root memcg and putting root memcg is
> totally fine.

When the map belongs to root_mem_cgroup, the map->objcg is NULL, right ?
See also bpf_map_save_memcg() and it describes clearly in the comment -

static void bpf_map_save_memcg(struct bpf_map *map)
{
        /* Currently if a map is created by a process belonging to the root
         * memory cgroup, get_obj_cgroup_from_current() will return NULL.
         * So we have to check map->objcg for being NULL each time it's
         * being used.
         */
        map->objcg = get_obj_cgroup_from_current();
}

So for the root_mem_cgroup case, bpf_map_get_memcg() will return
root_mem_cgroup directly without getting its css, right ? See below,

static struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
{

        if (map->objcg)
                return get_mem_cgroup_from_objcg(map->objcg);

        return root_mem_cgroup;   // without css_get(&memcg->css);
}

But it will put the css unconditionally.  See below,

memcg = bpf_map_get_memcg(map);
...
mem_cgroup_put(memcg);

So we should put it *conditionally* as well.

  memcg = bpf_map_get_memcg(map);
   ...
+ if (map->objcg)
       mem_cgroup_put(memcg);

Is it clear to you ?

> Or are you saying that root_mem_cgroup is NULL?
>

No
Yafang Shao Aug. 12, 2022, 12:31 a.m. UTC | #6
On Fri, Aug 12, 2022 at 12:48 AM Roman Gushchin
<roman.gushchin@linux.dev> wrote:
>
> On Wed, Aug 10, 2022 at 03:18:30PM +0000, Yafang Shao wrote:
> > The memcg may be the root_mem_cgroup, in which case we shouldn't put it.
> > So a new helper bpf_map_put_memcg() is introduced to pair with
> > bpf_map_get_memcg().
> >
> > Fixes: 4201d9ab3e42 ("bpf: reparent bpf maps on memcg offlining")
> > Cc: Roman Gushchin <roman.gushchin@linux.dev>
> > Cc: Shakeel Butt <shakeelb@google.com>
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > ---
> >  kernel/bpf/syscall.c | 14 +++++++++++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > index 83c7136..51ab8b1 100644
> > --- a/kernel/bpf/syscall.c
> > +++ b/kernel/bpf/syscall.c
> > @@ -441,6 +441,14 @@ static struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
> >       return root_mem_cgroup;
> >  }
> >
> > +static void bpf_map_put_memcg(struct mem_cgroup *memcg)
> > +{
> > +     if (mem_cgroup_is_root(memcg))
> > +             return;
> > +
> > +     mem_cgroup_put(memcg);
> > +}
>
> +1 to what Shakeel said. mem_cgroup_put(root_mem_cgroup) is totally valid.
> So this change does absolutely nothing.
>

Do you mean that we can mem_cgroup_put(root_mem_cgroup) without
mem_cgroup_get(root_mem_cgroup) ?
Am I missing something ?

> The fixes tag assumes there is a bug in the existing code. If so, please,
> describe the problem and how to reproduce it.
>

It is found by code review.  The root_mem_cgroup's css will break. But
I don't know what it may cause to the user.
If you think the fixes tag is unproper, I will remove it.

> Also, if it's not related to the rest of the patchset, please, send it
> separately.
>

I want to introduce a bpf_map_put_memcg() helper to pair with
bpf_map_get_memcg().
This new helper will be used by other patches.
Muchun Song Aug. 12, 2022, 5:33 a.m. UTC | #7
> On Aug 12, 2022, at 08:27, Yafang Shao <laoar.shao@gmail.com> wrote:
> 
> On Thu, Aug 11, 2022 at 11:47 PM Shakeel Butt <shakeelb@google.com> wrote:
>> 
>> On Thu, Aug 11, 2022 at 10:49:13AM +0800, Yafang Shao wrote:
>>> On Thu, Aug 11, 2022 at 1:07 AM Shakeel Butt <shakeelb@google.com> wrote:
>>>> 
>>>> On Wed, Aug 10, 2022 at 03:18:30PM +0000, Yafang Shao wrote:
>>>>> The memcg may be the root_mem_cgroup, in which case we shouldn't put it.
>>>> 
>>>> No, it is ok to put root_mem_cgroup. css_put already handles the root
>>>> cgroups.
>>>> 
>>> 
>>> Ah, this commit log doesn't describe the issue clearly. I should improve it.
>>> The issue is that in bpf_map_get_memcg() it doesn't get the objcg if
>>> map->objcg is NULL (that can happen if the map belongs to the root
>>> memcg), so we shouldn't put the objcg if map->objcg is NULL neither in
>>> bpf_map_put_memcg().
>> 
>> Sorry I am still not understanding. We are not 'getting' objcg in
>> bpf_map_get_memcg(). We are 'getting' memcg from map->objcg and if that
>> is NULL the function is returning root memcg and putting root memcg is
>> totally fine.
> 
> When the map belongs to root_mem_cgroup, the map->objcg is NULL, right ?
> See also bpf_map_save_memcg() and it describes clearly in the comment -
> 
> static void bpf_map_save_memcg(struct bpf_map *map)
> {
>        /* Currently if a map is created by a process belonging to the root
>         * memory cgroup, get_obj_cgroup_from_current() will return NULL.
>         * So we have to check map->objcg for being NULL each time it's
>         * being used.
>         */
>        map->objcg = get_obj_cgroup_from_current();
> }
> 
> So for the root_mem_cgroup case, bpf_map_get_memcg() will return
> root_mem_cgroup directly without getting its css, right ? See below,
> 
> static struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
> {
> 
>        if (map->objcg)
>                return get_mem_cgroup_from_objcg(map->objcg);
> 
>        return root_mem_cgroup;   // without css_get(&memcg->css);
> }
> 
> But it will put the css unconditionally.  See below,
> 
> memcg = bpf_map_get_memcg(map);
> ...
> mem_cgroup_put(memcg);
> 
> So we should put it *conditionally* as well.

Hi,

No. We could put root_mem_cgroup unconditionally since the root css
is treated as no reference css. See css_put().

static inline void css_put(struct cgroup_subsys_state *css)
{
	// The root memcg’s css has been set with CSS_NO_REF.
        if (!(css->flags & CSS_NO_REF))
                percpu_ref_put(&css->refcnt);
}

Muchun,
Thanks.

> 
>  memcg = bpf_map_get_memcg(map);
>   ...
> + if (map->objcg)
>       mem_cgroup_put(memcg);
> 
> Is it clear to you ?
> 
>> Or are you saying that root_mem_cgroup is NULL?
>> 
> 
> No
> 
> -- 
> Regards
> Yafang
Yafang Shao Aug. 12, 2022, 11:25 a.m. UTC | #8
On Fri, Aug 12, 2022 at 1:34 PM Muchun Song <muchun.song@linux.dev> wrote:
>
>
>
> > On Aug 12, 2022, at 08:27, Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > On Thu, Aug 11, 2022 at 11:47 PM Shakeel Butt <shakeelb@google.com> wrote:
> >>
> >> On Thu, Aug 11, 2022 at 10:49:13AM +0800, Yafang Shao wrote:
> >>> On Thu, Aug 11, 2022 at 1:07 AM Shakeel Butt <shakeelb@google.com> wrote:
> >>>>
> >>>> On Wed, Aug 10, 2022 at 03:18:30PM +0000, Yafang Shao wrote:
> >>>>> The memcg may be the root_mem_cgroup, in which case we shouldn't put it.
> >>>>
> >>>> No, it is ok to put root_mem_cgroup. css_put already handles the root
> >>>> cgroups.
> >>>>
> >>>
> >>> Ah, this commit log doesn't describe the issue clearly. I should improve it.
> >>> The issue is that in bpf_map_get_memcg() it doesn't get the objcg if
> >>> map->objcg is NULL (that can happen if the map belongs to the root
> >>> memcg), so we shouldn't put the objcg if map->objcg is NULL neither in
> >>> bpf_map_put_memcg().
> >>
> >> Sorry I am still not understanding. We are not 'getting' objcg in
> >> bpf_map_get_memcg(). We are 'getting' memcg from map->objcg and if that
> >> is NULL the function is returning root memcg and putting root memcg is
> >> totally fine.
> >
> > When the map belongs to root_mem_cgroup, the map->objcg is NULL, right ?
> > See also bpf_map_save_memcg() and it describes clearly in the comment -
> >
> > static void bpf_map_save_memcg(struct bpf_map *map)
> > {
> >        /* Currently if a map is created by a process belonging to the root
> >         * memory cgroup, get_obj_cgroup_from_current() will return NULL.
> >         * So we have to check map->objcg for being NULL each time it's
> >         * being used.
> >         */
> >        map->objcg = get_obj_cgroup_from_current();
> > }
> >
> > So for the root_mem_cgroup case, bpf_map_get_memcg() will return
> > root_mem_cgroup directly without getting its css, right ? See below,
> >
> > static struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
> > {
> >
> >        if (map->objcg)
> >                return get_mem_cgroup_from_objcg(map->objcg);
> >
> >        return root_mem_cgroup;   // without css_get(&memcg->css);
> > }
> >
> > But it will put the css unconditionally.  See below,
> >
> > memcg = bpf_map_get_memcg(map);
> > ...
> > mem_cgroup_put(memcg);
> >
> > So we should put it *conditionally* as well.
>
> Hi,
>
> No. We could put root_mem_cgroup unconditionally since the root css
> is treated as no reference css. See css_put().
>
> static inline void css_put(struct cgroup_subsys_state *css)
> {
>         // The root memcg’s css has been set with CSS_NO_REF.
>         if (!(css->flags & CSS_NO_REF))
>                 percpu_ref_put(&css->refcnt);
> }

Indeed. I missed that.
The call stack is so deep that I didn't look into it :(
Thanks for the information.
diff mbox series

Patch

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 83c7136..51ab8b1 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -441,6 +441,14 @@  static struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
 	return root_mem_cgroup;
 }
 
+static void bpf_map_put_memcg(struct mem_cgroup *memcg)
+{
+	if (mem_cgroup_is_root(memcg))
+		return;
+
+	mem_cgroup_put(memcg);
+}
+
 void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
 			   int node)
 {
@@ -451,7 +459,7 @@  void *bpf_map_kmalloc_node(const struct bpf_map *map, size_t size, gfp_t flags,
 	old_memcg = set_active_memcg(memcg);
 	ptr = kmalloc_node(size, flags | __GFP_ACCOUNT, node);
 	set_active_memcg(old_memcg);
-	mem_cgroup_put(memcg);
+	bpf_map_put_memcg(memcg);
 
 	return ptr;
 }
@@ -465,7 +473,7 @@  void *bpf_map_kzalloc(const struct bpf_map *map, size_t size, gfp_t flags)
 	old_memcg = set_active_memcg(memcg);
 	ptr = kzalloc(size, flags | __GFP_ACCOUNT);
 	set_active_memcg(old_memcg);
-	mem_cgroup_put(memcg);
+	bpf_map_put_memcg(memcg);
 
 	return ptr;
 }
@@ -480,7 +488,7 @@  void __percpu *bpf_map_alloc_percpu(const struct bpf_map *map, size_t size,
 	old_memcg = set_active_memcg(memcg);
 	ptr = __alloc_percpu_gfp(size, align, flags | __GFP_ACCOUNT);
 	set_active_memcg(old_memcg);
-	mem_cgroup_put(memcg);
+	bpf_map_put_memcg(memcg);
 
 	return ptr;
 }