Message ID | 20210302081823.9849-1-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm: memcontrol: fix root_mem_cgroup charging |
On Wed, Mar 3, 2021 at 2:58 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Tue, Mar 02, 2021 at 04:18:23PM +0800, Muchun Song wrote:
> > CPU0:                                   CPU1:
> >
> > objcg = get_obj_cgroup_from_current();
> > obj_cgroup_charge(objcg);
> >                                         memcg_reparent_objcgs();
> >                                             xchg(&objcg->memcg, root_mem_cgroup);
> >     // memcg == root_mem_cgroup
> >     memcg = obj_cgroup_memcg(objcg);
> >     __memcg_kmem_charge(memcg);
> >         // Do not charge to the root memcg
> >         try_charge(memcg);
> >
> > If the objcg->memcg is reparented to the root_mem_cgroup,
> > obj_cgroup_charge() can pass root_mem_cgroup as the first
> > parameter to here. The root_mem_cgroup is skipped in the
> > try_charge(). So the page counters of it do not update.
> >
> > When we uncharge this, we will decrease the page counters
> > (e.g. memory and memsw) of the root_mem_cgroup. This will
> > cause the page counters of the root_mem_cgroup to be out
> > of balance. Fix it by charging the page to the
> > root_mem_cgroup unconditional.
>
> Is this a problem? It seems that we do not expose root memcg's counters
> except kmem and tcp.

In page_counter_cancel(), there is a WARN_ON_ONCE() to catch this
issue. Yes, it is very hard to trigger this warning for the root
memcg, but it can actually happen, right? If we do not care about the
root memcg counters, we should not warn for the root memcg.

> It seems that the described problem is not
> applicable to the kmem counter. Please, explain.

The kmem counter of the root memcg is updated unconditionally, because
we do not check whether the memcg is root when we charge pages to the
kmem counter.

Thanks.

>
> Thanks!
>
> >
> > Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > ---
> >  mm/memcontrol.c | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 2db2aeac8a9e..edf604824d63 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -3078,6 +3078,19 @@ static int __memcg_kmem_charge(struct mem_cgroup *memcg, gfp_t gfp,
> >  	if (ret)
> >  		return ret;
> >
> > +	/*
> > +	 * If the objcg->memcg is reparented to the root_mem_cgroup,
> > +	 * obj_cgroup_charge() can pass root_mem_cgroup as the first
> > +	 * parameter to here. We should charge the page to the
> > +	 * root_mem_cgroup unconditional to keep it's page counters
> > +	 * balance.
> > +	 */
> > +	if (unlikely(mem_cgroup_is_root(memcg))) {
> > +		page_counter_charge(&memcg->memory, nr_pages);
> > +		if (do_memsw_account())
> > +			page_counter_charge(&memcg->memsw, nr_pages);
> > +	}
> > +
> >  	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
> >  	    !page_counter_try_charge(&memcg->kmem, nr_pages, &counter)) {
> >
> > --
> > 2.11.0
> >
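The exchange above turns on one property of the page counters: an uncharge without a matching charge drives the usage below zero, which is what the WARN_ON_ONCE() in page_counter_cancel() is said to catch. The following is a minimal userspace sketch of that property only. It is not kernel code: struct toy_counter and every toy_* function are hypothetical names invented for illustration, and atomics, hierarchy propagation and limits are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page counter: a usage value plus a flag
 * that records whether an underflow was ever observed. */
struct toy_counter {
	long usage;
	bool warned;
};

static void toy_charge(struct toy_counter *c, long nr_pages)
{
	assert(nr_pages >= 0);
	c->usage += nr_pages;
}

/* Loosely modeled on the check described for page_counter_cancel():
 * more uncharges than charges leaves usage negative and sets the flag. */
static void toy_cancel(struct toy_counter *c, long nr_pages)
{
	assert(nr_pages >= 0);
	c->usage -= nr_pages;
	if (c->usage < 0)
		c->warned = true;
}

/* Returns true if uncharging nr_pages trips the underflow check.
 * charge_was_skipped = true models the memory counter of the root memcg,
 * false models a counter that is always charged, such as kmem. */
bool toy_uncharge_warns(bool charge_was_skipped, long nr_pages)
{
	struct toy_counter c = { 0, false };

	if (!charge_was_skipped)
		toy_charge(&c, nr_pages);
	toy_cancel(&c, nr_pages);
	return c.warned;
}
```

Under these assumptions, toy_uncharge_warns(true, 1) reports an underflow while toy_uncharge_warns(false, 1) does not, which matches the distinction drawn in the reply between the memory counter and the kmem counter of the root memcg.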
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2db2aeac8a9e..edf604824d63 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3078,6 +3078,19 @@ static int __memcg_kmem_charge(struct mem_cgroup *memcg, gfp_t gfp,
 	if (ret)
 		return ret;
 
+	/*
+	 * If the objcg->memcg is reparented to the root_mem_cgroup,
+	 * obj_cgroup_charge() can pass root_mem_cgroup as the first
+	 * parameter to here. We should charge the page to the
+	 * root_mem_cgroup unconditional to keep it's page counters
+	 * balance.
+	 */
+	if (unlikely(mem_cgroup_is_root(memcg))) {
+		page_counter_charge(&memcg->memory, nr_pages);
+		if (do_memsw_account())
+			page_counter_charge(&memcg->memsw, nr_pages);
+	}
+
 	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
 	    !page_counter_try_charge(&memcg->kmem, nr_pages, &counter)) {
CPU0:                                   CPU1:

objcg = get_obj_cgroup_from_current();
obj_cgroup_charge(objcg);
                                        memcg_reparent_objcgs();
                                            xchg(&objcg->memcg, root_mem_cgroup);
    // memcg == root_mem_cgroup
    memcg = obj_cgroup_memcg(objcg);
    __memcg_kmem_charge(memcg);
        // Do not charge to the root memcg
        try_charge(memcg);

If the objcg->memcg is reparented to the root_mem_cgroup,
obj_cgroup_charge() can pass root_mem_cgroup as the first
parameter to here. The root_mem_cgroup is skipped in the
try_charge(). So the page counters of it do not update.

When we uncharge this, we will decrease the page counters
(e.g. memory and memsw) of the root_mem_cgroup. This will
cause the page counters of the root_mem_cgroup to be out
of balance. Fix it by charging the page to the
root_mem_cgroup unconditional.

Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/memcontrol.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)
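The interleaving in the commit message can be replayed sequentially in a small userspace model to see where the imbalance comes from and what the added charge changes. This is only a sketch under stated assumptions: struct toy_memcg, struct toy_objcg and all toy_* functions are hypothetical, the two CPUs are collapsed into one ordered sequence, only the memory counter is tracked (memsw and kmem are omitted), and the uncharge path is assumed to decrement the counter without a root check, as the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for struct mem_cgroup and struct obj_cgroup. */
struct toy_memcg {
	bool is_root;
	long memory;	/* models the usage of memcg->memory */
};

struct toy_objcg {
	struct toy_memcg *memcg;
};

/* Models the behaviour described for try_charge(): root is skipped. */
static void toy_try_charge(struct toy_memcg *memcg, long nr_pages)
{
	if (memcg->is_root)
		return;
	memcg->memory += nr_pages;
}

/* with_fix = true applies the extra root charge that the patch adds. */
static void toy_kmem_charge(struct toy_memcg *memcg, long nr_pages,
			    bool with_fix)
{
	toy_try_charge(memcg, nr_pages);
	if (with_fix && memcg->is_root)
		memcg->memory += nr_pages;
}

/* Assumption taken from the commit message: uncharge has no root check. */
static void toy_kmem_uncharge(struct toy_memcg *memcg, long nr_pages)
{
	memcg->memory -= nr_pages;
}

/* Replays the race in order and returns the root memory counter
 * after one charge/uncharge pair. */
long toy_replay_race(bool with_fix)
{
	struct toy_memcg root = { .is_root = true, .memory = 0 };
	struct toy_memcg child = { .is_root = false, .memory = 0 };
	struct toy_objcg objcg = { .memcg = &child };
	struct toy_memcg *memcg;

	/* CPU1: reparenting swaps objcg->memcg to the root memcg. */
	objcg.memcg = &root;

	/* CPU0: the charge path now resolves the objcg to root. */
	memcg = objcg.memcg;
	assert(memcg == &root);
	toy_kmem_charge(memcg, 1, with_fix);

	/* Later: the same page is uncharged from the same memcg. */
	toy_kmem_uncharge(memcg, 1);
	return root.memory;
}
```

In this model toy_replay_race(false) leaves the root counter at -1, while toy_replay_race(true) returns it to 0, which is the balance the patch aims to keep.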