[v2,3/4] cgroup/cpuset: Include offline CPUs when tasks' cpumasks in top_cpuset are updated

Message ID 20230317151508.1225282-4-longman@redhat.com (mailing list archive)
State Accepted
Commit 6667439f51c446fead5d991ff49b842a811a6195
Series cgroup/cpuset: Miscellaneous updates

Commit Message

Waiman Long March 17, 2023, 3:15 p.m. UTC
Similar to commit 3fb906e7fabb ("cgroup/cpuset: Don't filter offline
CPUs in cpuset_cpus_allowed() for top cpuset tasks"), the whole set of
possible CPUs, including offline ones, should be used for setting cpumasks
for tasks in the top cpuset when a cpuset partition is modified, as the
hotplug code won't update cpumasks for tasks in the top cpuset when
CPUs become online or offline.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

Comments

Michal Koutný March 17, 2023, 6:01 p.m. UTC | #1
Hello.

On Fri, Mar 17, 2023 at 11:15:07AM -0400, Waiman Long <longman@redhat.com> wrote:
>   * Iterate through each task of @cs updating its cpus_allowed to the
>   * effective cpuset's.  As this function is called with cpuset_rwsem held,
> - * cpuset membership stays stable.
> + * cpuset membership stays stable. For top_cpuset, task_cpu_possible_mask()
> + * is used instead of effective_cpus to make sure all offline CPUs are also
> + * included as hotplug code won't update cpumasks for tasks in top_cpuset.
>   */

On Wed, Mar 15, 2023 at 11:06:20AM +0100, Michal Koutný <mkoutny@suse.com> wrote:
> I see now that it returns offlined cpus to top cpuset's tasks.

I considered only the "base" set change cs->effective_cpus ->
possible_mask. (Apologies for that mistake.)

However, I now read the note about subparts_cpus

>         * effective_cpus contains only onlined CPUs, but subparts_cpus
>         * may have offlined ones.

So if subparts_cpus keeps offlined CPUs, they will be subtracted from
possible_mask and absent in the resulting new_cpus, i.e. undesirable for
the tasks in that cpuset :-/

Michal
Waiman Long March 17, 2023, 6:05 p.m. UTC | #2
On 3/17/23 14:01, Michal Koutný wrote:
> Hello.
>
> On Fri, Mar 17, 2023 at 11:15:07AM -0400, Waiman Long <longman@redhat.com> wrote:
>>    * Iterate through each task of @cs updating its cpus_allowed to the
>>    * effective cpuset's.  As this function is called with cpuset_rwsem held,
>> - * cpuset membership stays stable.
>> + * cpuset membership stays stable. For top_cpuset, task_cpu_possible_mask()
>> + * is used instead of effective_cpus to make sure all offline CPUs are also
>> + * included as hotplug code won't update cpumasks for tasks in top_cpuset.
>>    */
> On Wed, Mar 15, 2023 at 11:06:20AM +0100, Michal Koutný <mkoutny@suse.com> wrote:
>> I see now that it returns offlined cpus to top cpuset's tasks.
> I considered only the "base" set change cs->effective_cpus ->
> possible_mask. (Apologies for that mistake.)
>
> However, I now read the note about subparts_cpus
>
>>          * effective_cpus contains only onlined CPUs, but subparts_cpus
>>          * may have offlined ones.
> So if subparts_cpus keeps offlined CPUs, they will be subtracted from
> possible_mask and absent in the resulting new_cpus, i.e. undesirable for
> the tasks in that cpuset :-/

A cpu will be in the subparts_cpus only if it has been given to the 
child partition. So when it becomes online, it will become part of the 
scheduling domain of that child partition. Only the tasks in that child 
partition will get their cpumasks updated to use it, not those in the 
top cpuset.

Cheers,
Longman
Michal Koutný March 20, 2023, 3:04 p.m. UTC | #3
On Fri, Mar 17, 2023 at 02:05:32PM -0400, Waiman Long <longman@redhat.com> wrote:
> A cpu will be in the subparts_cpus only if it has been given to the child
> partition. So when it becomes online, it will become part of the scheduling
> domain of that child partition. Only the tasks in that child partition will get
> their cpumasks updated to use it, not those in the top cpuset.

Right, it's actually the difference between offlining a CPU and giving it
to a sub-partition (hence a removed child (or switched to member) before
CPU onlining). It's clear to me now.

Thanks,
Michal
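
The mask arithmetic discussed in this thread can be sketched in userspace C.
This is only a toy model, not kernel code: toy_mask_t, toy_task_mask(),
toy_task_mask_old() and toy_example() are hypothetical names, a 64-bit
integer stands in for struct cpumask, and the CPU numbers are made up. It
assumes 8 possible CPUs, CPUs 6-7 given to a child partition, and CPUs 3
and 7 offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t toy_mask_t;

/* Mask handed to set_cpus_allowed_ptr() before the patch (all cpusets). */
static toy_mask_t toy_task_mask_old(toy_mask_t possible, toy_mask_t effective)
{
	return possible & effective;		/* mirrors cpumask_and() */
}

/* Mask handed to set_cpus_allowed_ptr() after the patch. */
static toy_mask_t toy_task_mask(bool top_cs, toy_mask_t possible,
				toy_mask_t effective, toy_mask_t subparts)
{
	if (top_cs)
		return possible & ~subparts;	/* mirrors cpumask_andnot() */
	return possible & effective;		/* mirrors cpumask_and() */
}

/* Walk through the assumed scenario described in the lead-in. */
void toy_example(void)
{
	const toy_mask_t possible  = 0xff;	/* CPUs 0-7 */
	const toy_mask_t subparts  = 0xc0;	/* CPUs 6-7 given to a child */
	const toy_mask_t effective = 0x37;	/* online and not given away */

	/* Before: offline CPU 3 is dropped from top cpuset tasks. */
	assert(toy_task_mask_old(possible, effective) == 0x37);

	/*
	 * After: offline CPU 3 is kept, while offline CPU 7 stays out
	 * because it belongs to the child partition.
	 */
	assert(toy_task_mask(true, possible, effective, subparts) == 0x3f);
}
```

In this model, CPU 3 coming back online needs no cpumask update for top
cpuset tasks, and CPU 7 coming back online only matters to tasks in the
child partition, which matches the explanation given in reply #2.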
Patch

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 5b8d763555b0..db8793a2082f 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1209,7 +1209,9 @@  void rebuild_sched_domains(void)
  *
  * Iterate through each task of @cs updating its cpus_allowed to the
  * effective cpuset's.  As this function is called with cpuset_rwsem held,
- * cpuset membership stays stable.
+ * cpuset membership stays stable. For top_cpuset, task_cpu_possible_mask()
+ * is used instead of effective_cpus to make sure all offline CPUs are also
+ * included as hotplug code won't update cpumasks for tasks in top_cpuset.
  */
 static void update_tasks_cpumask(struct cpuset *cs, struct cpumask *new_cpus)
 {
@@ -1219,15 +1221,18 @@  static void update_tasks_cpumask(struct cpuset *cs, struct cpumask *new_cpus)
 
 	css_task_iter_start(&cs->css, 0, &it);
 	while ((task = css_task_iter_next(&it))) {
-		/*
-		 * Percpu kthreads in top_cpuset are ignored
-		 */
-		if (top_cs && (task->flags & PF_KTHREAD) &&
-		    kthread_is_per_cpu(task))
-			continue;
+		const struct cpumask *possible_mask = task_cpu_possible_mask(task);
 
-		cpumask_and(new_cpus, cs->effective_cpus,
-			    task_cpu_possible_mask(task));
+		if (top_cs) {
+			/*
+			 * Percpu kthreads in top_cpuset are ignored
+			 */
+			if ((task->flags & PF_KTHREAD) && kthread_is_per_cpu(task))
+				continue;
+			cpumask_andnot(new_cpus, possible_mask, cs->subparts_cpus);
+		} else {
+			cpumask_and(new_cpus, possible_mask, cs->effective_cpus);
+		}
 		set_cpus_allowed_ptr(task, new_cpus);
 	}
 	css_task_iter_end(&it);