Message ID | 20190529210617.GP374014@devbig004.ftw2.facebook.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [for-5.2-fixes] memcg: Don't loop on css_tryget_online() failure | expand |
On Wed 29-05-19 14:06:17, Tejun Heo wrote: > A PF_EXITING task may stay associated with an offline css. > get_mem_cgroup_from_mm() may deadlock if mm->owner is in such state. > All similar logics in memcg are falling back to root memcg on > tryget_online failure and get_mem_cgroup_from_mm() can do the same. > > A similar failure existed for task_get_css() and could be triggered > through BSD process accounting racing against memcg offlining. See > 18fa84a2db0e ("cgroup: Use css_tryget() instead of css_tryget_online() > in task_get_css()") for details. > > Signed-off-by: Tejun Heo <tj@kernel.org> Do we need to mark this patch for stable or this is too unlikely to happen? Acked-by: Michal Hocko <mhocko@suse.com> Thanks! > --- > mm/memcontrol.c | 24 ++++++++++-------------- > 1 file changed, 10 insertions(+), 14 deletions(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index e50a2db5b4ff..be1fa89db198 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -918,23 +918,19 @@ struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm) > > if (mem_cgroup_disabled()) > return NULL; > + /* > + * Page cache insertions can happen without an actual mm context, > + * e.g. during disk probing on boot, loopback IO, acct() writes. > + */ > + if (unlikely(!mm)) > + return root_mem_cgroup; > > rcu_read_lock(); > - do { > - /* > - * Page cache insertions can happen withou an > - * actual mm context, e.g. during disk probing > - * on boot, loopback IO, acct() writes etc. > - */ > - if (unlikely(!mm)) > - memcg = root_mem_cgroup; > - else { > - memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); > - if (unlikely(!memcg)) > - memcg = root_mem_cgroup; > - } > - } while (!css_tryget_online(&memcg->css)); > + memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); > + if (!css_tryget_online(&memcg->css)) > + memcg = root_mem_cgroup; > rcu_read_unlock(); > + > return memcg; > } > EXPORT_SYMBOL(get_mem_cgroup_from_mm);
On Wed, Jun 05, 2019 at 02:55:20PM +0200, Michal Hocko wrote: > On Wed 29-05-19 14:06:17, Tejun Heo wrote: > > A PF_EXITING task may stay associated with an offline css. > > get_mem_cgroup_from_mm() may deadlock if mm->owner is in such state. > > All similar logics in memcg are falling back to root memcg on > > tryget_online failure and get_mem_cgroup_from_mm() can do the same. > > > > A similar failure existed for task_get_css() and could be triggered > > through BSD process accounting racing against memcg offlining. See > > 18fa84a2db0e ("cgroup: Use css_tryget() instead of css_tryget_online() > > in task_get_css()") for details. > > > > Signed-off-by: Tejun Heo <tj@kernel.org> > > Do we need to mark this patch for stable or this is too unlikely to > happen? This one's a lot less likely than the one in task_get_css() which already is pretty low frequency. I don't think it warrants -stable tagging. Thanks.
diff --git a/mm/memcontrol.c b/mm/memcontrol.c index e50a2db5b4ff..be1fa89db198 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -918,23 +918,19 @@ struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm) if (mem_cgroup_disabled()) return NULL; + /* + * Page cache insertions can happen without an actual mm context, + * e.g. during disk probing on boot, loopback IO, acct() writes. + */ + if (unlikely(!mm)) + return root_mem_cgroup; rcu_read_lock(); - do { - /* - * Page cache insertions can happen withou an - * actual mm context, e.g. during disk probing - * on boot, loopback IO, acct() writes etc. - */ - if (unlikely(!mm)) - memcg = root_mem_cgroup; - else { - memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); - if (unlikely(!memcg)) - memcg = root_mem_cgroup; - } - } while (!css_tryget_online(&memcg->css)); + memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); + if (!css_tryget_online(&memcg->css)) + memcg = root_mem_cgroup; rcu_read_unlock(); + return memcg; } EXPORT_SYMBOL(get_mem_cgroup_from_mm);
A PF_EXITING task may stay associated with an offline css. get_mem_cgroup_from_mm() may deadlock if mm->owner is in such state. All similar logics in memcg are falling back to root memcg on tryget_online failure and get_mem_cgroup_from_mm() can do the same. A similar failure existed for task_get_css() and could be triggered through BSD process accounting racing against memcg offlining. See 18fa84a2db0e ("cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css()") for details. Signed-off-by: Tejun Heo <tj@kernel.org> --- mm/memcontrol.c | 24 ++++++++++-------------- 1 file changed, 10 insertions(+), 14 deletions(-)