diff mbox series

mm, memcg: assign shrinker_map before kvfree

Message ID 20190920122907.GG2507@uranus.lan (mailing list archive)
State New, archived
Headers show
Series mm, memcg: assign shrinker_map before kvfree | expand

Commit Message

Cyrill Gorcunov Sept. 20, 2019, 12:29 p.m. UTC
Currently there is a small gap between fetching pointer, calling
kvfree and assign its value to nil. In current callgraph it is
not a problem (since memcg_free_shrinker_maps is running from
memcg_alloc_shrinker_maps and mem_cgroup_css_free only) still
this looks suspicious and we can easily eliminate the gap at all.

Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 mm/memcontrol.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Kirill A . Shutemov Sept. 20, 2019, 1:21 p.m. UTC | #1
On Fri, Sep 20, 2019 at 03:29:07PM +0300, Cyrill Gorcunov wrote:
> Currently there is a small gap between fetching pointer, calling
> kvfree and assign its value to nil. In current callgraph it is
> not a problem (since memcg_free_shrinker_maps is running from
> memcg_alloc_shrinker_maps and mem_cgroup_css_free only) still
> this looks suspicious and we can easily eliminate the gap at all.

With this logic it will still look suspicious since you don't wait a grace
period before freeing the map.
Cyrill Gorcunov Sept. 20, 2019, 1:40 p.m. UTC | #2
On Fri, Sep 20, 2019 at 04:21:14PM +0300, Kirill A. Shutemov wrote:
> On Fri, Sep 20, 2019 at 03:29:07PM +0300, Cyrill Gorcunov wrote:
> > Currently there is a small gap between fetching pointer, calling
> > kvfree and assign its value to nil. In current callgraph it is
> > not a problem (since memcg_free_shrinker_maps is running from
> > memcg_alloc_shrinker_maps and mem_cgroup_css_free only) still
> > this looks suspicious and we can easily eliminate the gap at all.
> 
> With this logic it will still look suspicious since you don't wait
> a grace period before freeing the map.

Probably, but as far as I see we're using mutex here to order
requests. I'm not sure, maybe ktkhai@ made the code to use free
before the assign intentionally? As I said there is no bug it
the code right now just forced me to stop and reread it several
times due to this gap. If you look into other code places where
we use similar technique we always assign before free.
Kirill Tkhai Sept. 20, 2019, 2:11 p.m. UTC | #3
On 20.09.2019 15:29, Cyrill Gorcunov wrote:
> Currently there is a small gap between fetching pointer, calling
> kvfree and assign its value to nil. In current callgraph it is
> not a problem (since memcg_free_shrinker_maps is running from
> memcg_alloc_shrinker_maps and mem_cgroup_css_free only) still
> this looks suspicious and we can easily eliminate the gap at all.
> 
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
>  mm/memcontrol.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-tip.git/mm/memcontrol.c
> ===================================================================
> --- linux-tip.git.orig/mm/memcontrol.c
> +++ linux-tip.git/mm/memcontrol.c
> @@ -364,9 +364,9 @@ static void memcg_free_shrinker_maps(str
>  	for_each_node(nid) {
>  		pn = mem_cgroup_nodeinfo(memcg, nid);
>  		map = rcu_dereference_protected(pn->shrinker_map, true);
> +		rcu_assign_pointer(pn->shrinker_map, NULL);
>  		if (map)
>  			kvfree(map);
> -		rcu_assign_pointer(pn->shrinker_map, NULL);
>  	}
>  }

The current scheme is following. We allocate shrinker_map in css_online,
while normal freeing happens in css_free. The NULLifying of pointer is needed
in case of "abnormal freeing", when memcg_free_shrinker_maps() is called
from memcg_alloc_shrinker_maps(). The NULLifying guarantees, we won't free
pn->shrinker_map twice.

There are no races or problems with kvfree() and rcu_assign_pointer() order,
because of nobody can reference shrinker_map before memcg is online.

In case of this rcu_assign_pointer() confuses, we may just remove is from
the function, and call it only on css_free. Something like the below:

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1c4c08b45e44..454303c3aa0f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -372,7 +372,6 @@ static void memcg_free_shrinker_maps(struct mem_cgroup *memcg)
 		map = rcu_dereference_protected(pn->shrinker_map, true);
 		if (map)
 			kvfree(map);
-		rcu_assign_pointer(pn->shrinker_map, NULL);
 	}
 }
 
@@ -389,7 +388,6 @@ static int memcg_alloc_shrinker_maps(struct mem_cgroup *memcg)
 	for_each_node(nid) {
 		map = kvzalloc(sizeof(*map) + size, GFP_KERNEL);
 		if (!map) {
-			memcg_free_shrinker_maps(memcg);
 			ret = -ENOMEM;
 			break;
 		}
Kirill Tkhai Sept. 20, 2019, 2:20 p.m. UTC | #4
On 20.09.2019 16:21, Kirill A. Shutemov wrote:
> On Fri, Sep 20, 2019 at 03:29:07PM +0300, Cyrill Gorcunov wrote:
>> Currently there is a small gap between fetching pointer, calling
>> kvfree and assign its value to nil. In current callgraph it is
>> not a problem (since memcg_free_shrinker_maps is running from
>> memcg_alloc_shrinker_maps and mem_cgroup_css_free only) still
>> this looks suspicious and we can easily eliminate the gap at all.
> 
> With this logic it will still look suspicious since you don't wait a grace
> period before freeing the map.

This freeing occurs in the moment, when nobody can dereference shrinker_map
in parallel:

memcg is either not yet online or its css->refcnt is already dead.
This NULLifying is needed just to prevent double freeing of shrinker_map.

Please, see the explanation in my email to our namesake.

Kirill
Cyrill Gorcunov Sept. 20, 2019, 2:31 p.m. UTC | #5
On Fri, Sep 20, 2019 at 05:11:00PM +0300, Kirill Tkhai wrote:
> 
> The current scheme is following. We allocate shrinker_map in css_online,
> while normal freeing happens in css_free. The NULLifying of pointer is needed
> in case of "abnormal freeing", when memcg_free_shrinker_maps() is called
> from memcg_alloc_shrinker_maps(). The NULLifying guarantees, we won't free
> pn->shrinker_map twice.
> 
> There are no races or problems with kvfree() and rcu_assign_pointer() order,
> because of nobody can reference shrinker_map before memcg is online.
> 
> In case of this rcu_assign_pointer() confuses, we may just remove is from
> the function, and call it only on css_free. Something like the below:

Kirill, I know that there is no problem now (as I pointed in changelog),
simply a regular pattern of free after assign is being reversed, which
made me nervious. Anyway dropping assigns doesn't help much from my pov
so lets leave it as is. The good point is that we've this conversation
and if someone get a bit confused in future the google will reveal this
text. Which is enough I think.
diff mbox series

Patch

Index: linux-tip.git/mm/memcontrol.c
===================================================================
--- linux-tip.git.orig/mm/memcontrol.c
+++ linux-tip.git/mm/memcontrol.c
@@ -364,9 +364,9 @@  static void memcg_free_shrinker_maps(str
 	for_each_node(nid) {
 		pn = mem_cgroup_nodeinfo(memcg, nid);
 		map = rcu_dereference_protected(pn->shrinker_map, true);
+		rcu_assign_pointer(pn->shrinker_map, NULL);
 		if (map)
 			kvfree(map);
-		rcu_assign_pointer(pn->shrinker_map, NULL);
 	}
 }