Message ID | 20230317151508.1225282-4-longman@redhat.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 6667439f51c446fead5d991ff49b842a811a6195 |
Series | cgroup/cpuset: Miscellaneous updates |
Hello.

On Fri, Mar 17, 2023 at 11:15:07AM -0400, Waiman Long <longman@redhat.com> wrote:
>   * Iterate through each task of @cs updating its cpus_allowed to the
>   * effective cpuset's. As this function is called with cpuset_rwsem held,
> - * cpuset membership stays stable.
> + * cpuset membership stays stable. For top_cpuset, task_cpu_possible_mask()
> + * is used instead of effective_cpus to make sure all offline CPUs are also
> + * included as hotplug code won't update cpumasks for tasks in top_cpuset.
>   */

On Wed, Mar 15, 2023 at 11:06:20AM +0100, Michal Koutný <mkoutny@suse.com> wrote:
> I see now that it returns offlined cpus to top cpuset's tasks.

I considered only the "base" set change cs->effective_cpus ->
possible_mask. (Apologies for that mistake.)

However, I now read the note about subparts_cpus

>  * effective_cpus contains only onlined CPUs, but subparts_cpus
>  * may have offlined ones.

So if subparts_cpus keeps offlined CPUs, they will be subtracted from
possible_mask and absent in the resulting new_cpus, i.e. undesirable for
the tasks in that cpuset :-/

Michal
On 3/17/23 14:01, Michal Koutný wrote:
> Hello.
>
> On Fri, Mar 17, 2023 at 11:15:07AM -0400, Waiman Long <longman@redhat.com> wrote:
>>   * Iterate through each task of @cs updating its cpus_allowed to the
>>   * effective cpuset's. As this function is called with cpuset_rwsem held,
>> - * cpuset membership stays stable.
>> + * cpuset membership stays stable. For top_cpuset, task_cpu_possible_mask()
>> + * is used instead of effective_cpus to make sure all offline CPUs are also
>> + * included as hotplug code won't update cpumasks for tasks in top_cpuset.
>>   */
> On Wed, Mar 15, 2023 at 11:06:20AM +0100, Michal Koutný <mkoutny@suse.com> wrote:
>> I see now that it returns offlined cpus to top cpuset's tasks.
> I considered only the "base" set change cs->effective_cpus ->
> possible_mask. (Apologies for that mistake.)
>
> However, I now read the note about subparts_cpus
>
>>  * effective_cpus contains only onlined CPUs, but subparts_cpus
>>  * may have offlined ones.
> So if subpart_cpus keeps offlined CPUs, they will be subtracted from
> possible_mask and absent in the resulting new_cpus, i.e. undesirable for
> the tasks in that cpuset :-/

A cpu will be in the subparts_cpus only if it has been given to the child
partition. So when it becomes online, it will become part of the scheduling
domain of that child partition. Only the tasks in that child partition will get
their cpumasks updated to use it, not those in the top cpuset.

Cheers,
Longman
On Fri, Mar 17, 2023 at 02:05:32PM -0400, Waiman Long <longman@redhat.com> wrote:
> A cpu will be in the subparts_cpus only if it has been given to the child
> partition. So when it becomes online, it will become part of the scheduling
> domain that child partition. Only the tasks in that child partition will get
> their cpumasks updated to use it, not those in the top cpuset.

Right, it's actually the difference between offlining a CPU and giving
it to a sub-partition (hence a removed child (or switched to member)
before CPU onlining).

It's clear to me now.

Thanks,
Michal
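The distinction settled above, between a CPU that is merely offline and a CPU that was given to a child partition, can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `mask_t`, `top_task_allowed()`, `runnable_now()` and the 4-CPU layout are hypothetical stand-ins invented for illustration and are not kernel APIs. Only the expression `possible & ~subparts_cpus` mirrors the top-cpuset branch of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit n represents CPU n. */
typedef uint64_t mask_t;

/* Top-cpuset branch of the patch: possible CPUs minus those in subparts_cpus. */
mask_t top_task_allowed(mask_t possible, mask_t subparts_cpus)
{
	return possible & ~subparts_cpus;
}

/* CPUs a task can run on right now: its allowed mask limited to online CPUs. */
mask_t runnable_now(mask_t allowed, mask_t online)
{
	return allowed & online;
}

/* Walk through the two cases discussed in the thread on a 4-CPU model. */
void demo_offline_vs_subpartition(void)
{
	const mask_t possible = 0xF;      /* CPUs 0-3 */
	const mask_t subparts = 1u << 3;  /* CPU 3 was given to a child partition */
	mask_t online = 0x3;              /* CPUs 2 and 3 are currently offline */

	mask_t allowed = top_task_allowed(possible, subparts);
	assert(allowed == 0x7);                       /* CPUs 0-2; offline CPU 2 is kept */
	assert(runnable_now(allowed, online) == 0x3); /* only CPUs 0-1 usable for now */

	/* Both CPUs come online; the model leaves the task's mask untouched. */
	online = 0xF;
	assert(runnable_now(allowed, online) == 0x7); /* CPU 2 usable, CPU 3 is not */
}
```

In this model the offline CPU 2 stays in the top-cpuset task's mask, so it becomes usable as soon as it comes online without any further update. CPU 3 stays excluded because, once online, it belongs to the child partition.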
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 5b8d763555b0..db8793a2082f 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1209,7 +1209,9 @@ void rebuild_sched_domains(void)
  *
  * Iterate through each task of @cs updating its cpus_allowed to the
  * effective cpuset's. As this function is called with cpuset_rwsem held,
- * cpuset membership stays stable.
+ * cpuset membership stays stable. For top_cpuset, task_cpu_possible_mask()
+ * is used instead of effective_cpus to make sure all offline CPUs are also
+ * included as hotplug code won't update cpumasks for tasks in top_cpuset.
  */
 static void update_tasks_cpumask(struct cpuset *cs, struct cpumask *new_cpus)
 {
@@ -1219,15 +1221,18 @@ static void update_tasks_cpumask(struct cpuset *cs, struct cpumask *new_cpus)
 
 	css_task_iter_start(&cs->css, 0, &it);
 	while ((task = css_task_iter_next(&it))) {
-		/*
-		 * Percpu kthreads in top_cpuset are ignored
-		 */
-		if (top_cs && (task->flags & PF_KTHREAD) &&
-		    kthread_is_per_cpu(task))
-			continue;
+		const struct cpumask *possible_mask = task_cpu_possible_mask(task);
 
-		cpumask_and(new_cpus, cs->effective_cpus,
-			    task_cpu_possible_mask(task));
+		if (top_cs) {
+			/*
+			 * Percpu kthreads in top_cpuset are ignored
+			 */
+			if ((task->flags & PF_KTHREAD) && kthread_is_per_cpu(task))
+				continue;
+			cpumask_andnot(new_cpus, possible_mask, cs->subparts_cpus);
+		} else {
+			cpumask_and(new_cpus, possible_mask, cs->effective_cpus);
+		}
 		set_cpus_allowed_ptr(task, new_cpus);
 	}
 	css_task_iter_end(&it);
Similar to commit 3fb906e7fabb ("cgroup/cpuset: Don't filter offline
CPUs in cpuset_cpus_allowed() for top cpuset tasks"), the whole set of
possible CPUs including offline ones should be used for setting cpumasks
for tasks in the top cpuset when a cpuset partition is modified as the
hotplug code won't update cpumasks for tasks in the top cpuset when
CPUs become online or offline.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)
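To make the behavioural change concrete, the mask computation before and after the patch can be compared in a simplified userspace model. This is a sketch, not kernel code: `mask_t`, `struct cpuset_model`, `mask_before_patch()` and `mask_after_patch()` are hypothetical names introduced here, 64-bit integers stand in for `struct cpumask`, and the per-CPU kthread skip is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit n represents CPU n. */
typedef uint64_t mask_t;

/* Hypothetical, reduced view of the two struct cpuset fields the patch reads. */
struct cpuset_model {
	mask_t effective_cpus;  /* contains only online CPUs */
	mask_t subparts_cpus;   /* CPUs given to child partitions */
};

/* Before the patch: every cpuset used effective_cpus & possible mask. */
mask_t mask_before_patch(const struct cpuset_model *cs, mask_t possible)
{
	return cs->effective_cpus & possible;
}

/*
 * After the patch: tasks in the top cpuset get all possible CPUs except
 * those in subparts_cpus; every other cpuset keeps the old computation.
 */
mask_t mask_after_patch(const struct cpuset_model *cs, bool top_cs,
			mask_t possible)
{
	if (top_cs)
		return possible & ~cs->subparts_cpus;
	return possible & cs->effective_cpus;
}
```

For example, with four possible CPUs (`0xF`), CPU 3 offline (`effective_cpus = 0x7`) and an empty `subparts_cpus`, the old computation yields `0x7` for a top-cpuset task, whereas the new one yields `0xF`. The task therefore keeps CPU 3 in its mask and needs no update when that CPU comes back online. For a non-top cpuset both computations give `0x7`.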