Message ID | 20200117151533.12381-3-mkoutny@suse.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | cgroup: Iterate tasks that did not finish do_exit() | expand |
On Fri, Jan 17, 2020 at 04:15:32PM +0100, Michal Koutný wrote: > PF_EXITING is set earlier than actual removal from css_set when a task > is exitting. This can confuse cgroup.procs readers who see no PF_EXITING > tasks, however, rmdir is checking against css_set membership so it can > transitionally fail with EBUSY. > > Fix this by listing tasks that weren't unlinked from css_set active > lists. > It may happen that other users of the task iterator (without > CSS_TASK_ITER_PROCS) spot a PF_EXITING task before cgroup_exit(). This > is equal to the state before commit c03cd7738a83 ("cgroup: Include dying > leaders with live threads in PROCS iterations") but it may be reviewed > later. Yeah, this looks fine to me. Any chance you can order this before the clean up so that we can mark it for -stable. Thanks.
On Fri, Jan 17, 2020 at 9:28 AM Tejun Heo <tj@kernel.org> wrote: > > On Fri, Jan 17, 2020 at 04:15:32PM +0100, Michal Koutný wrote: > > PF_EXITING is set earlier than actual removal from css_set when a task > > is exitting. This can confuse cgroup.procs readers who see no PF_EXITING > > tasks, however, rmdir is checking against css_set membership so it can > > transitionally fail with EBUSY. > > > > Fix this by listing tasks that weren't unlinked from css_set active > > lists. > > It may happen that other users of the task iterator (without > > CSS_TASK_ITER_PROCS) spot a PF_EXITING task before cgroup_exit(). This > > is equal to the state before commit c03cd7738a83 ("cgroup: Include dying > > leaders with live threads in PROCS iterations") but it may be reviewed > > later. Tested-by: Suren Baghdasaryan <surenb@google.com> > > Yeah, this looks fine to me. Any chance you can order this before the > clean up so that we can mark it for -stable. > +1 for reordering. Makes it easier to backport. Thanks, Suren. > Thanks. > > -- > tejun > > -- > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com. >
On Fri, Jan 17, 2020 at 10:41:29AM -0800, Suren Baghdasaryan <surenb@google.com> wrote: > Tested-by: Suren Baghdasaryan <surenb@google.com> Thanks. > > Yeah, this looks fine to me. Any chance you can order this before the > > clean up so that we can mark it for -stable. > +1 for reordering. Makes it easier to backport. The grounds still need to be prepared for css_task_iter to store additional information. Let me see how the preceding changes can be minimized. Michal
Hi, thanks for your feedback. I'm sending updated solution that changes task iteration to be consistent with cgroup_is_populated() check and moving the responsibility to check PF_EXITING on the consumers of iterator API. Changes from v1 - v1: https://lore.kernel.org/lkml/20200117151533.12381-1-mkoutny@suse.com/ - Swap the fixing and refactoring patch (in order to make the fix easier for backport) - Don't introduce tasks lists array because of unnecessarily long access code - Fix comment in selftest - I leave the CC:stable on discretion of the maintainer Michal Koutný (2): cgroup: Iterate tasks that did not finish do_exit() cgroup: Clean up css_set task traversal Suren Baghdasaryan (1): kselftest/cgroup: add cgroup destruction test include/linux/cgroup.h | 4 +- kernel/cgroup/cgroup.c | 60 ++++++----- tools/testing/selftests/cgroup/test_core.c | 113 +++++++++++++++++++++ 3 files changed, 146 insertions(+), 31 deletions(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index b56283e13491..132d258e7172 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -4492,11 +4492,12 @@ static void css_task_iter_advance(struct css_task_iter *it) goto repeat; /* and dying leaders w/o live member threads */ - if (!atomic_read(&task->signal->live)) + if (it->cur_list == CSS_SET_TASKS_DYING && + !atomic_read(&task->signal->live)) goto repeat; } else { /* skip all dying ones */ - if (task->flags & PF_EXITING) + if (it->cur_list == CSS_SET_TASKS_DYING) goto repeat; } }
PF_EXITING is set earlier than actual removal from css_set when a task is exitting. This can confuse cgroup.procs readers who see no PF_EXITING tasks, however, rmdir is checking against css_set membership so it can transitionally fail with EBUSY. Fix this by listing tasks that weren't unlinked from css_set active lists. It may happen that other users of the task iterator (without CSS_TASK_ITER_PROCS) spot a PF_EXITING task before cgroup_exit(). This is equal to the state before commit c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") but it may be reviewed later. Reported-by: Suren Baghdasaryan <surenb@google.com> Fixes: c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") Signed-off-by: Michal Koutný <mkoutny@suse.com> --- kernel/cgroup/cgroup.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)