Message ID | Z5QNhsWw0P1iPd2q@slm.duckdns.org (mailing list archive) |
---|---|
State | New |
Series | [sched_ext/for-6.14-fixes] sched_ext: selftests/dsp_local_on: Fix sporadic failures |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |

On Fri, Jan 24, 2025 at 12:00:38PM -1000, Tejun Heo wrote:
> From e9fe182772dcb2630964724fd93e9c90b68ea0fd Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj@kernel.org>
> Date: Fri, 24 Jan 2025 10:48:25 -1000
>
> dsp_local_on has several incorrect assumptions, one of which is that
> p->nr_cpus_allowed always tracks p->cpus_ptr. This is not true when a task
> is scheduled out while migration is disabled - p->cpus_ptr is temporarily
> overridden to the previous CPU while p->nr_cpus_allowed remains unchanged.
>
> This led to sporadic test failures when dsp_local_on_dispatch() tries to put
> a migration disabled task to a different CPU. Fix it by keeping the previous
> CPU when migration is disabled.
>
> There are SCX schedulers that make use of p->nr_cpus_allowed. They should
> also implement explicit handling for p->migration_disabled.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
> Cc: Andrea Righi <arighi@nvidia.com>
> Cc: Changwoo Min <changwoo@igalia.com>
> ---
> Applying to sched_ext/for-6.14-fixes. Thanks.
>
>  tools/testing/selftests/sched_ext/dsp_local_on.bpf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
> index fbda6bf54671..758b479bd1ee 100644
> --- a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
> +++ b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
> @@ -43,7 +43,7 @@ void BPF_STRUCT_OPS(dsp_local_on_dispatch, s32 cpu, struct task_struct *prev)
>  		if (!p)
>  			return;
>
> -		if (p->nr_cpus_allowed == nr_cpus)
> +		if (p->nr_cpus_allowed == nr_cpus && !p->migration_disabled)

This doesn't work with !CONFIG_SMP. Maybe we can introduce a helper like:

static bool is_migration_disabled(const struct task_struct *p)
{
	if (bpf_core_field_exists(p->migration_disabled))
		return p->migration_disabled;
	return false;
}

>  			target = bpf_get_prandom_u32() % nr_cpus;
>  		else
>  			target = scx_bpf_task_cpu(p);
> --
> 2.48.1
>

Thanks,
-Andrea

diff --git a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
index fbda6bf54671..758b479bd1ee 100644
--- a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
+++ b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
@@ -43,7 +43,7 @@ void BPF_STRUCT_OPS(dsp_local_on_dispatch, s32 cpu, struct task_struct *prev)
 		if (!p)
 			return;

-		if (p->nr_cpus_allowed == nr_cpus)
+		if (p->nr_cpus_allowed == nr_cpus && !p->migration_disabled)
 			target = bpf_get_prandom_u32() % nr_cpus;
 		else
 			target = scx_bpf_task_cpu(p);
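
For readers who want to check the target-selection rule outside the kernel, here is a minimal userspace sketch of the patched condition. It is not the selftest code. `struct mock_task` and `pick_target()` are hypothetical names introduced only for illustration, the `cpu` field stands in for what `scx_bpf_task_cpu(p)` would return, and the random value is passed in as a parameter instead of coming from `bpf_get_prandom_u32()`. The `is_migration_disabled()` wrapper here reads the mock field directly; in a real BPF program it would need a guard such as the `bpf_core_field_exists()` check suggested in the reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few task_struct fields the test reads. */
struct mock_task {
	int nr_cpus_allowed;     /* size of the task's affinity mask */
	int cpu;                 /* assumption: what scx_bpf_task_cpu(p) would return */
	bool migration_disabled; /* true while the task must stay on its CPU */
};

/* Userspace stand-in for the helper proposed in the reply. */
static bool is_migration_disabled(const struct mock_task *p)
{
	return p->migration_disabled;
}

/*
 * Mirror of the patched condition: pick a random CPU only when the task
 * may run anywhere AND migration is not disabled; otherwise keep the
 * CPU the task is already on.
 */
static int pick_target(const struct mock_task *p, int nr_cpus, uint32_t rnd)
{
	assert(nr_cpus > 0); /* guard the modulo below */

	if (p->nr_cpus_allowed == nr_cpus && !is_migration_disabled(p))
		return (int)(rnd % (uint32_t)nr_cpus);
	return p->cpu;
}
```

With four CPUs, a task with `nr_cpus_allowed == 4` and `migration_disabled` set keeps its current CPU whatever the random value is. This is the case the unpatched condition got wrong.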