Message ID | 20210319163821.20704-2-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Use obj_cgroup APIs to charge kmem pages | expand |
On Fri, Mar 19, 2021 at 9:38 AM Muchun Song <songmuchun@bytedance.com> wrote: > > The rcu_read_lock/unlock only can guarantee that the memcg will not be > freed, but it cannot guarantee the success of css_get (which is in the > refill_stock when cached memcg changed) to memcg. > > rcu_read_lock() > memcg = obj_cgroup_memcg(old) > __memcg_kmem_uncharge(memcg) > refill_stock(memcg) > if (stock->cached != memcg) > // css_get can change the ref counter from 0 back to 1. > css_get(&memcg->css) > rcu_read_unlock() > > This fix is very like the commit: > > eefbfa7fd678 ("mm: memcg/slab: fix use after free in obj_cgroup_charge") > > Fix this by holding a reference to the memcg which is passed to the > __memcg_kmem_uncharge() before calling __memcg_kmem_uncharge(). > > Fixes: 3de7d4f25a74 ("mm: memcg/slab: optimize objcg stock draining") > Signed-off-by: Muchun Song <songmuchun@bytedance.com> Good catch. Reviewed-by: Shakeel Butt <shakeelb@google.com>
On Sat, Mar 20, 2021 at 12:38:14AM +0800, Muchun Song wrote: > The rcu_read_lock/unlock only can guarantee that the memcg will not be > freed, but it cannot guarantee the success of css_get (which is in the > refill_stock when cached memcg changed) to memcg. > > rcu_read_lock() > memcg = obj_cgroup_memcg(old) > __memcg_kmem_uncharge(memcg) > refill_stock(memcg) > if (stock->cached != memcg) > // css_get can change the ref counter from 0 back to 1. > css_get(&memcg->css) > rcu_read_unlock() > > This fix is very like the commit: > > eefbfa7fd678 ("mm: memcg/slab: fix use after free in obj_cgroup_charge") > > Fix this by holding a reference to the memcg which is passed to the > __memcg_kmem_uncharge() before calling __memcg_kmem_uncharge(). > > Fixes: 3de7d4f25a74 ("mm: memcg/slab: optimize objcg stock draining") > Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Good catch! Did you trigger the WARN_ON() in percpu_ref_kill_and_confirm() during testing?
On Sat, Mar 20, 2021 at 12:38:14AM +0800, Muchun Song wrote: > The rcu_read_lock/unlock only can guarantee that the memcg will not be > freed, but it cannot guarantee the success of css_get (which is in the > refill_stock when cached memcg changed) to memcg. > > rcu_read_lock() > memcg = obj_cgroup_memcg(old) > __memcg_kmem_uncharge(memcg) > refill_stock(memcg) > if (stock->cached != memcg) > // css_get can change the ref counter from 0 back to 1. > css_get(&memcg->css) > rcu_read_unlock() > > This fix is very like the commit: > > eefbfa7fd678 ("mm: memcg/slab: fix use after free in obj_cgroup_charge") > > Fix this by holding a reference to the memcg which is passed to the > __memcg_kmem_uncharge() before calling __memcg_kmem_uncharge(). > > Fixes: 3de7d4f25a74 ("mm: memcg/slab: optimize objcg stock draining") > Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Roman Gushchin <guro@fb.com> Thanks!
On Mon, Mar 22, 2021 at 10:46 PM Johannes Weiner <hannes@cmpxchg.org> wrote: > > On Sat, Mar 20, 2021 at 12:38:14AM +0800, Muchun Song wrote: > > The rcu_read_lock/unlock only can guarantee that the memcg will not be > > freed, but it cannot guarantee the success of css_get (which is in the > > refill_stock when cached memcg changed) to memcg. > > > > rcu_read_lock() > > memcg = obj_cgroup_memcg(old) > > __memcg_kmem_uncharge(memcg) > > refill_stock(memcg) > > if (stock->cached != memcg) > > // css_get can change the ref counter from 0 back to 1. > > css_get(&memcg->css) > > rcu_read_unlock() > > > > This fix is very like the commit: > > > > eefbfa7fd678 ("mm: memcg/slab: fix use after free in obj_cgroup_charge") > > > > Fix this by holding a reference to the memcg which is passed to the > > __memcg_kmem_uncharge() before calling __memcg_kmem_uncharge(). > > > > Fixes: 3de7d4f25a74 ("mm: memcg/slab: optimize objcg stock draining") > > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > > Acked-by: Johannes Weiner <hannes@cmpxchg.org> > > Good catch! Did you trigger the WARN_ON() in > percpu_ref_kill_and_confirm() during testing? No. The race window is very small, it should be difficult to trigger. When I reviewed the code here, I suddenly realized that there might be a problem here. Very coincidental. Thanks.
diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 845eec01ef9d..2cda76ff0629 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3181,9 +3181,17 @@ static void drain_obj_stock(struct memcg_stock_pcp *stock) unsigned int nr_bytes = stock->nr_bytes & (PAGE_SIZE - 1); if (nr_pages) { + struct mem_cgroup *memcg; + rcu_read_lock(); - __memcg_kmem_uncharge(obj_cgroup_memcg(old), nr_pages); +retry: + memcg = obj_cgroup_memcg(old); + if (unlikely(!css_tryget(&memcg->css))) + goto retry; rcu_read_unlock(); + + __memcg_kmem_uncharge(memcg, nr_pages); + css_put(&memcg->css); } /*
The rcu_read_lock/unlock only can guarantee that the memcg will not be freed, but it cannot guarantee the success of css_get (which is in the refill_stock when cached memcg changed) to memcg. rcu_read_lock() memcg = obj_cgroup_memcg(old) __memcg_kmem_uncharge(memcg) refill_stock(memcg) if (stock->cached != memcg) // css_get can change the ref counter from 0 back to 1. css_get(&memcg->css) rcu_read_unlock() This fix is very like the commit: eefbfa7fd678 ("mm: memcg/slab: fix use after free in obj_cgroup_charge") Fix this by holding a reference to the memcg which is passed to the __memcg_kmem_uncharge() before calling __memcg_kmem_uncharge(). Fixes: 3de7d4f25a74 ("mm: memcg/slab: optimize objcg stock draining") Signed-off-by: Muchun Song <songmuchun@bytedance.com> --- mm/memcontrol.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)