[sched_ext/for-6.13-fixes] sched_ext: Fix invalid irq restore in scx_ops_bypass()

Message ID: Z1n9v7Z6iNJ-wKmq@slm.duckdns.org (mailing list archive)
State: Not Applicable
Checks: netdev/tree_selection - success (Not a local patch)

Commit Message

Tejun Heo Dec. 11, 2024, 9:01 p.m. UTC
While adding outer irqsave/restore locking, 0e7ffff1b811 ("scx: Fix raciness
in scx_ops_bypass()") forgot to convert an inner rq_unlock_irqrestore() to
rq_unlock(), which could re-enable IRQs prematurely and lead to the following
warning:

  raw_local_irq_restore() called with IRQs enabled
  WARNING: CPU: 1 PID: 96 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x30/0x40
  ...
  Sched_ext: create_dsq (enabling)
  pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : warn_bogus_irq_restore+0x30/0x40
  lr : warn_bogus_irq_restore+0x30/0x40
  ...
  Call trace:
   warn_bogus_irq_restore+0x30/0x40 (P)
   warn_bogus_irq_restore+0x30/0x40 (L)
   scx_ops_bypass+0x224/0x3b8
   scx_ops_enable.isra.0+0x2c8/0xaa8
   bpf_scx_reg+0x18/0x30
  ...
  irq event stamp: 33739
  hardirqs last  enabled at (33739): [<ffff8000800b699c>] scx_ops_bypass+0x174/0x3b8
  hardirqs last disabled at (33738): [<ffff800080d48ad4>] _raw_spin_lock_irqsave+0xb4/0xd8

Drop the stray _irqrestore().

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
Link: http://lkml.kernel.org/r/qC39k3UsonrBYD_SmuxHnZIQLsuuccoCrkiqb_BT7DvH945A1_LZwE4g-5Pu9FcCtqZt4lY1HhIPi0homRuNWxkgo1rgP3bkxa0donw8kV4=@pm.me
Fixes: 0e7ffff1b811 ("scx: Fix raciness in scx_ops_bypass()")
Cc: stable@vger.kernel.org # v6.12
---
 kernel/sched/ext.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
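
A minimal sketch of the nesting involved, with abridged names (bypass_lock
stands in for the real lock; this is not the verbatim kernel/sched/ext.c
code): the outer lock keeps IRQs disabled for the whole section, so the
inner per-rq unlock must not restore the IRQ state.

  unsigned long flags;
  int cpu;

  raw_spin_lock_irqsave(&bypass_lock, flags);     /* IRQs off throughout */

  for_each_possible_cpu(cpu) {
          struct rq *rq = cpu_rq(cpu);
          struct rq_flags rf;

          rq_lock(rq, &rf);                       /* IRQs already disabled */
          if (!scx_enabled()) {
                  /*
                   * The buggy variant called rq_unlock_irqrestore() here,
                   * re-enabling IRQs while bypass_lock was still held.
                   */
                  rq_unlock(rq, &rf);
                  continue;
          }
          /* per-rq bypass work elided */
          rq_unlock(rq, &rf);
  }

  raw_spin_unlock_irqrestore(&bypass_lock, flags);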

Comments

Tejun Heo Dec. 11, 2024, 9:03 p.m. UTC | #1
On Wed, Dec 11, 2024 at 11:01:51AM -1000, Tejun Heo wrote:
> [...]

Applying to sched_ext/for-6.13-fixes.

Thanks.
Ihor Solodrai Dec. 12, 2024, 6:12 p.m. UTC | #2
On Wednesday, December 11th, 2024 at 1:01 PM, Tejun Heo <tj@kernel.org> wrote:

> [...]

Hi Tejun,

I tried this patch on BPF CI: the pipeline ran 3 times
successfully. That's 12 selftests/sched_ext runs in total.

https://github.com/kernel-patches/vmtest/actions/runs/12301284063

Tested-by: Ihor Solodrai <ihor.solodrai@pm.me>

Thanks for the fix!
Ihor Solodrai Dec. 17, 2024, 11:44 p.m. UTC | #3
Hi Tejun,

I re-enabled selftests/sched_ext on BPF CI today. The kernel on CI
includes this patch. Sometimes there is a failure when attempting to
attach the dsp_local_on scheduler.

Examples of failed jobs:

  * https://github.com/kernel-patches/bpf/actions/runs/12379720791/job/34555104994
  * https://github.com/kernel-patches/bpf/actions/runs/12382862660/job/34564648924
  * https://github.com/kernel-patches/bpf/actions/runs/12381361846/job/34560047798

Here is a piece of the log that is present in a failed run, but not in
a successful run:

2024-12-17T19:30:12.9010943Z [    5.285022] sched_ext: BPF scheduler "dsp_local_on" enabled
2024-12-17T19:30:13.9022892Z ERR: dsp_local_on.c:37
2024-12-17T19:30:13.9025841Z Expected skel->data->uei.kind == EXIT_KIND(SCX_EXIT_ERROR) (0 == 1024)
2024-12-17T19:30:13.9256108Z ERR: exit.c:30
2024-12-17T19:30:13.9256641Z Failed to attach scheduler
2024-12-17T19:30:13.9611443Z [    6.345087] smpboot: CPU 1 is now offline

Could you please investigate?

Thanks.
Tejun Heo Dec. 18, 2024, 6:34 p.m. UTC | #4
Hello,

On Tue, Dec 17, 2024 at 11:44:08PM +0000, Ihor Solodrai wrote:
> I re-enabled selftests/sched_ext on BPF CI today. The kernel on CI
> includes this patch. Sometimes there is a failure when attempting to
> attach the dsp_local_on scheduler.
> [...]
> Could you please investigate?

The test prog wrongly assumes that all possible CPUs are consecutive and
online, but I'm not sure whether that's what's making the test flaky. Do
you have a dmesg from a failed run?
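
A minimal sketch of the distinction, with do_something() standing in for
the per-CPU work (hypothetical, not the selftest code):

  int cpu;

  /* Fragile: assumes CPU ids 0..nr_cpu_ids-1 are all valid and online. */
  for (cpu = 0; cpu < nr_cpu_ids; cpu++)
          do_something(cpu);

  /* Safer: walk only the CPUs that are actually online. */
  for_each_online_cpu(cpu)
          do_something(cpu);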

Thanks.
Ihor Solodrai Dec. 19, 2024, 10:51 p.m. UTC | #5
On Wednesday, December 18th, 2024 at 10:34 AM, Tejun Heo <tj@kernel.org> wrote:

> [...]
> The test prog wrongly assumes that all possible CPUs are consecutive and
> online, but I'm not sure whether that's what's making the test flaky. Do
> you have a dmesg from a failed run?

Tejun, can you elaborate on what you're looking for in the logs?
My understanding is that QEMU prints some of the dmesg messages, and
the QEMU output is available in the raw logs.

Here is a link (you have to login to github to open):

https://productionresultssa1.blob.core.windows.net/actions-results/99cd995e-679f-4180-872b-d31e1f564837/workflow-job-run-7216a7c9-5129-5959-a45a-28d6f9b737e2/logs/job/job-logs.txt?rsct=text%2Fplain&se=2024-12-19T22%3A57%3A01Z&sig=z%2B%2FUtIIhli4VG%2FCCVxawBnubNwfIIsl9Q2FlTVvM8q0%3D&ske=2024-12-20T07%3A00%3A35Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2024-12-19T19%3A00%3A35Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2024-11-04&sp=r&spr=https&sr=b&st=2024-12-19T22%3A46%3A56Z&sv=2024-11-04

Generally, you can access raw logs by going to the job and clicking
the gear on the top right -> "View raw logs".

Ihor Solodrai Dec. 20, 2024, 7:26 p.m. UTC | #6
On Thursday, December 19th, 2024 at 2:51 PM, Ihor Solodrai <ihor.solodrai@pm.me> wrote:

> [...]
> 
> Tejun, can you elaborate on what you're looking for in the logs?
> My understanding is that QEMU prints some of the dmesg messages, and
> the QEMU output is available in the raw logs.

I made changes to the CI scripts to explicitly dump dmesg in case of a
failure. It looks like most of that log had already been printed.

Job: https://github.com/kernel-patches/bpf/actions/runs/12436924307/job/34726070343

Raw log: https://productionresultssa11.blob.core.windows.net/actions-results/a10f57cb-19e3-487a-9fb0-69742cfbef1b/workflow-job-run-4c580b44-6466-54d8-b922-6f707064e5ca/logs/job/job-logs.txt?rsct=text%2Fplain&se=2024-12-20T19%3A34%3A55Z&sig=kQ09k9r01VtP4p%2FgYvvCmm2FUuOHfsLjU3ARzks4xmE%3D&ske=2024-12-21T07%3A00%3A50Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2024-12-20T19%3A00%3A50Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2024-11-04&sp=r&spr=https&sr=b&st=2024-12-20T19%3A24%3A50Z&sv=2024-11-04 

Search for "dmesg start".

> [...]

Patch

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 7fff1d045477..98519e6d0dcd 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -4763,7 +4763,7 @@ static void scx_ops_bypass(bool bypass)
 		 * sees scx_rq_bypassing() before moving tasks to SCX.
 		 */
 		if (!scx_enabled()) {
-			rq_unlock_irqrestore(rq, &rf);
+			rq_unlock(rq, &rf);
 			continue;
 		}