rcu-tasks: Fix race in schedule and flush work

Message ID	20220217152520.18972-1-treasure4paddy@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org> From: Padmanabha Srinivasaiah <treasure4paddy@gmail.com> To: paulmck@kernel.org, josh@joshtriplett.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com, jiangshanlai@gmail.com, joel@joelfernandes.org, rcu@vger.kernel.org, linux-kernel@vger.kernel.org Cc: tj@kernel.org, treasure4paddy@gmail.com, linux-rpi-kernel@lists.infradead.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH] rcu-tasks: Fix race in schedule and flush work Date: Thu, 17 Feb 2022 16:25:19 +0100 Message-Id: <20220217152520.18972-1-treasure4paddy@gmail.com> Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org
Series	rcu-tasks: Fix race in schedule and flush work \| expand rcu-tasks: Fix race in schedule and flush work

Padmanabha Srinivasaiah Feb. 17, 2022, 3:25 p.m. UTC

While booting secondary CPUs, cpus_read_[lock/unlock] not keeping online
cpumask stable. The transient online mask results in below calltrace.

[    0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
[    0.346652] Detected PIPT I-cache on CPU2
[    0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
[    0.377255] Detected PIPT I-cache on CPU3
[    0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
[    0.379040] ------------[ cut here ]------------
[    0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
[    0.384850] Modules linked in:
[    0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
[    0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[    0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.388308] pc : __flush_work+0x12c/0x138
[    0.388970] lr : __flush_work+0x80/0x138
[    0.389620] sp : ffffffc00aaf3c60
[    0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
[    0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
[    0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
[    0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
[    0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
[    0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
[    0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
[    0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
[    0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
[    0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
[    0.401886] Call trace:
[    0.402309]  __flush_work+0x12c/0x138
[    0.402941]  schedule_on_each_cpu+0x228/0x278
[    0.403693]  rcu_tasks_rude_wait_gp+0x130/0x144
[    0.404502]  rcu_tasks_kthread+0x220/0x254
[    0.405264]  kthread+0x174/0x1ac
[    0.405837]  ret_from_fork+0x10/0x20
[    0.406456] irq event stamp: 102
[    0.406966] hardirqs last  enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
[    0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
[    0.409410] softirqs last  enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
[    0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
[    0.411890] ---[ end trace 0000000000000000 ]---
[    0.413000] smp: Brought up 1 node, 4 CPUs
[    0.413762] SMP: Total of 4 processors activated.
[    0.414566] CPU features: detected: 32-bit EL0 Support
[    0.415414] CPU features: detected: 32-bit EL1 Support
[    0.416278] CPU features: detected: CRC32 instructions
[    0.447021] Callback from call_rcu_tasks_rude() invoked.
[    0.506693] Callback from call_rcu_tasks() invoked.

Link: https://lore.kernel.org/lkml/20220210184319.25009-1-treasure4paddy@gmail.com/T/
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
---
 kernel/rcu/tasks.h | 3 +++
 1 file changed, 3 insertions(+)

Paul E. McKenney Feb. 17, 2022, 5:43 p.m. UTC | #1

On Thu, Feb 17, 2022 at 04:25:19PM +0100, Padmanabha Srinivasaiah wrote:
> While booting secondary CPUs, cpus_read_[lock/unlock] not keeping online
> cpumask stable. The transient online mask results in below calltrace.
> 
> [    0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
> [    0.346652] Detected PIPT I-cache on CPU2
> [    0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
> [    0.377255] Detected PIPT I-cache on CPU3
> [    0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
> [    0.379040] ------------[ cut here ]------------
> [    0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
> [    0.384850] Modules linked in:
> [    0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
> [    0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
> [    0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    0.388308] pc : __flush_work+0x12c/0x138
> [    0.388970] lr : __flush_work+0x80/0x138
> [    0.389620] sp : ffffffc00aaf3c60
> [    0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
> [    0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
> [    0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
> [    0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
> [    0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
> [    0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
> [    0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
> [    0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
> [    0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
> [    0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
> [    0.401886] Call trace:
> [    0.402309]  __flush_work+0x12c/0x138
> [    0.402941]  schedule_on_each_cpu+0x228/0x278
> [    0.403693]  rcu_tasks_rude_wait_gp+0x130/0x144
> [    0.404502]  rcu_tasks_kthread+0x220/0x254
> [    0.405264]  kthread+0x174/0x1ac
> [    0.405837]  ret_from_fork+0x10/0x20
> [    0.406456] irq event stamp: 102
> [    0.406966] hardirqs last  enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
> [    0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
> [    0.409410] softirqs last  enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
> [    0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
> [    0.411890] ---[ end trace 0000000000000000 ]---
> [    0.413000] smp: Brought up 1 node, 4 CPUs
> [    0.413762] SMP: Total of 4 processors activated.
> [    0.414566] CPU features: detected: 32-bit EL0 Support
> [    0.415414] CPU features: detected: 32-bit EL1 Support
> [    0.416278] CPU features: detected: CRC32 instructions
> [    0.447021] Callback from call_rcu_tasks_rude() invoked.
> [    0.506693] Callback from call_rcu_tasks() invoked.
> 
> Link: https://lore.kernel.org/lkml/20220210184319.25009-1-treasure4paddy@gmail.com/T/
> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>

Thank you, Padmanabha!  I have queued this with additional explanation
in the commit log as shown below.

If you have not already tested this by repeatedly rebooting, for
example continuously in an overnight test, could you please do so?
(I am assuming that your system can do an automated reboot loop, so no,
I am not asking you to stay up all night!)

Should this somehow recur with larger systems, in which the call starts
with (say) five CPUs online but has trouble with the sixth CPU, there
are several ways to address it:

1.	Make cpus_read_lock() act as expected.

2.	Make the first for_each_online_cpu() in schedule_on_each_cpu()
	instead be for_each_possible_cpu(), always do the INIT_WORK(),
	but invoke schedule_work_on() only if cpu_online(cpu).

3.	Other ideas here, when and if.

							Thanx, Paul

------------------------------------------------------------------------

commit f6823834aa2e2e581ca627238fa3ad3e0a727c08
Author: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
Date:   Thu Feb 17 16:25:19 2022 +0100

    rcu-tasks: Fix race in schedule and flush work
    
    While booting secondary CPUs, cpus_read_[lock/unlock] is not keeping
    online cpumask stable. The transient online mask results in below
    calltrace.
    
    [    0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
    [    0.346652] Detected PIPT I-cache on CPU2
    [    0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
    [    0.377255] Detected PIPT I-cache on CPU3
    [    0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
    [    0.379040] ------------[ cut here ]------------
    [    0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
    [    0.384850] Modules linked in:
    [    0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
    [    0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
    [    0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [    0.388308] pc : __flush_work+0x12c/0x138
    [    0.388970] lr : __flush_work+0x80/0x138
    [    0.389620] sp : ffffffc00aaf3c60
    [    0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
    [    0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
    [    0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
    [    0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
    [    0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
    [    0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
    [    0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
    [    0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
    [    0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
    [    0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
    [    0.401886] Call trace:
    [    0.402309]  __flush_work+0x12c/0x138
    [    0.402941]  schedule_on_each_cpu+0x228/0x278
    [    0.403693]  rcu_tasks_rude_wait_gp+0x130/0x144
    [    0.404502]  rcu_tasks_kthread+0x220/0x254
    [    0.405264]  kthread+0x174/0x1ac
    [    0.405837]  ret_from_fork+0x10/0x20
    [    0.406456] irq event stamp: 102
    [    0.406966] hardirqs last  enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
    [    0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
    [    0.409410] softirqs last  enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
    [    0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
    [    0.411890] ---[ end trace 0000000000000000 ]---
    [    0.413000] smp: Brought up 1 node, 4 CPUs
    [    0.413762] SMP: Total of 4 processors activated.
    [    0.414566] CPU features: detected: 32-bit EL0 Support
    [    0.415414] CPU features: detected: 32-bit EL1 Support
    [    0.416278] CPU features: detected: CRC32 instructions
    [    0.447021] Callback from call_rcu_tasks_rude() invoked.
    [    0.506693] Callback from call_rcu_tasks() invoked.
    
    This commit therefore fixes this issue by applying a single-CPU
    optimization to the RCU Tasks Rude grace-period process.  The key point
    here is that the purpose of this RCU flavor is to force a schedule on
    each online CPU since some past event.  But the rcu_tasks_rude_wait_gp()
    function runs in the context of the RCU Tasks Rude's grace-period kthread,
    so there must already have been a context switch on the current CPU since
    the call to either synchronize_rcu_tasks_rude() or call_rcu_tasks_rude().
    So if there is only a single CPU online, RCU Tasks Rude's grace-period
    kthread does not need to anything at all.
    
    It turns out that the rcu_tasks_rude_wait_gp() function's call to
    schedule_on_each_cpu() causes problems during early boot.  During that
    time, there is only one online CPU, namely the boot CPU.  Therefore,
    applying this single-CPU optimization fixes early-boot instances of
    this problem.
    
    Link: https://lore.kernel.org/lkml/20220210184319.25009-1-treasure4paddy@gmail.com/T/
    Suggested-by: Paul E. McKenney <paulmck@kernel.org>
    Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index d73e32d803438..f7fdb349f8b8e 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -955,6 +955,9 @@ static void rcu_tasks_be_rude(struct work_struct *work)
 // Wait for one rude RCU-tasks grace period.
 static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
 {
+	if (num_online_cpus() <= 1)
+		return;	// Fastpath for only one CPU.
+
 	rtp->n_ipis += cpumask_weight(cpu_online_mask);
 	schedule_on_each_cpu(rcu_tasks_be_rude);
 }

Padmanabha Srinivasaiah Feb. 20, 2022, 6:35 p.m. UTC | #2

On Thu, Feb 17, 2022 at 09:43:30AM -0800, Paul E. McKenney wrote:
> On Thu, Feb 17, 2022 at 04:25:19PM +0100, Padmanabha Srinivasaiah wrote:
> > While booting secondary CPUs, cpus_read_[lock/unlock] not keeping online
> > cpumask stable. The transient online mask results in below calltrace.
> > 
> > [    0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
> > [    0.346652] Detected PIPT I-cache on CPU2
> > [    0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
> > [    0.377255] Detected PIPT I-cache on CPU3
> > [    0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
> > [    0.379040] ------------[ cut here ]------------
> > [    0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
> > [    0.384850] Modules linked in:
> > [    0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
> > [    0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
> > [    0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [    0.388308] pc : __flush_work+0x12c/0x138
> > [    0.388970] lr : __flush_work+0x80/0x138
> > [    0.389620] sp : ffffffc00aaf3c60
> > [    0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
> > [    0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
> > [    0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
> > [    0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
> > [    0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
> > [    0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
> > [    0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
> > [    0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
> > [    0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
> > [    0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
> > [    0.401886] Call trace:
> > [    0.402309]  __flush_work+0x12c/0x138
> > [    0.402941]  schedule_on_each_cpu+0x228/0x278
> > [    0.403693]  rcu_tasks_rude_wait_gp+0x130/0x144
> > [    0.404502]  rcu_tasks_kthread+0x220/0x254
> > [    0.405264]  kthread+0x174/0x1ac
> > [    0.405837]  ret_from_fork+0x10/0x20
> > [    0.406456] irq event stamp: 102
> > [    0.406966] hardirqs last  enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
> > [    0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
> > [    0.409410] softirqs last  enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
> > [    0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
> > [    0.411890] ---[ end trace 0000000000000000 ]---
> > [    0.413000] smp: Brought up 1 node, 4 CPUs
> > [    0.413762] SMP: Total of 4 processors activated.
> > [    0.414566] CPU features: detected: 32-bit EL0 Support
> > [    0.415414] CPU features: detected: 32-bit EL1 Support
> > [    0.416278] CPU features: detected: CRC32 instructions
> > [    0.447021] Callback from call_rcu_tasks_rude() invoked.
> > [    0.506693] Callback from call_rcu_tasks() invoked.
> > 
> > Link: https://lore.kernel.org/lkml/20220210184319.25009-1-treasure4paddy@gmail.com/T/
> > Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> > Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
> 
> Thank you, Padmanabha!  I have queued this with additional explanation
> in the commit log as shown below.

Thanks paul.
> 
> If you have not already tested this by repeatedly rebooting, for
> example continuously in an overnight test, could you please do so?
> (I am assuming that your system can do an automated reboot loop, so no,
> I am not asking you to stay up all night!)
> 
Yes, tested and issue doesn't appear.

> Should this somehow recur with larger systems, in which the call starts
> with (say) five CPUs online but has trouble with the sixth CPU, there
> are several ways to address it:
> 
> 1.	Make cpus_read_lock() act as expected.
>
> 2.	Make the first for_each_online_cpu() in schedule_on_each_cpu()
> 	instead be for_each_possible_cpu(), always do the INIT_WORK(),
> 	but invoke schedule_work_on() only if cpu_online(cpu).
>
Yes, agree and was also suggested by Tejun Heo. Will also submit same, in
other thread.

> 3.	Other ideas here, when and if.
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit f6823834aa2e2e581ca627238fa3ad3e0a727c08
> Author: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
> Date:   Thu Feb 17 16:25:19 2022 +0100
> 
>     rcu-tasks: Fix race in schedule and flush work
>     
>     While booting secondary CPUs, cpus_read_[lock/unlock] is not keeping
>     online cpumask stable. The transient online mask results in below
>     calltrace.
>     
>     [    0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
>     [    0.346652] Detected PIPT I-cache on CPU2
>     [    0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
>     [    0.377255] Detected PIPT I-cache on CPU3
>     [    0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
>     [    0.379040] ------------[ cut here ]------------
>     [    0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
>     [    0.384850] Modules linked in:
>     [    0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
>     [    0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
>     [    0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>     [    0.388308] pc : __flush_work+0x12c/0x138
>     [    0.388970] lr : __flush_work+0x80/0x138
>     [    0.389620] sp : ffffffc00aaf3c60
>     [    0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
>     [    0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
>     [    0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
>     [    0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
>     [    0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
>     [    0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
>     [    0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
>     [    0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
>     [    0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
>     [    0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
>     [    0.401886] Call trace:
>     [    0.402309]  __flush_work+0x12c/0x138
>     [    0.402941]  schedule_on_each_cpu+0x228/0x278
>     [    0.403693]  rcu_tasks_rude_wait_gp+0x130/0x144
>     [    0.404502]  rcu_tasks_kthread+0x220/0x254
>     [    0.405264]  kthread+0x174/0x1ac
>     [    0.405837]  ret_from_fork+0x10/0x20
>     [    0.406456] irq event stamp: 102
>     [    0.406966] hardirqs last  enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
>     [    0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
>     [    0.409410] softirqs last  enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
>     [    0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
>     [    0.411890] ---[ end trace 0000000000000000 ]---
>     [    0.413000] smp: Brought up 1 node, 4 CPUs
>     [    0.413762] SMP: Total of 4 processors activated.
>     [    0.414566] CPU features: detected: 32-bit EL0 Support
>     [    0.415414] CPU features: detected: 32-bit EL1 Support
>     [    0.416278] CPU features: detected: CRC32 instructions
>     [    0.447021] Callback from call_rcu_tasks_rude() invoked.
>     [    0.506693] Callback from call_rcu_tasks() invoked.
>     
>     This commit therefore fixes this issue by applying a single-CPU
>     optimization to the RCU Tasks Rude grace-period process.  The key point
>     here is that the purpose of this RCU flavor is to force a schedule on
>     each online CPU since some past event.  But the rcu_tasks_rude_wait_gp()
>     function runs in the context of the RCU Tasks Rude's grace-period kthread,
>     so there must already have been a context switch on the current CPU since
>     the call to either synchronize_rcu_tasks_rude() or call_rcu_tasks_rude().
>     So if there is only a single CPU online, RCU Tasks Rude's grace-period
>     kthread does not need to anything at all.
>     
>     It turns out that the rcu_tasks_rude_wait_gp() function's call to
>     schedule_on_each_cpu() causes problems during early boot.  During that
>     time, there is only one online CPU, namely the boot CPU.  Therefore,
>     applying this single-CPU optimization fixes early-boot instances of
>     this problem.
>     
>     Link: https://lore.kernel.org/lkml/20220210184319.25009-1-treasure4paddy@gmail.com/T/
>     Suggested-by: Paul E. McKenney <paulmck@kernel.org>
>     Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> 
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index d73e32d803438..f7fdb349f8b8e 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -955,6 +955,9 @@ static void rcu_tasks_be_rude(struct work_struct *work)
>  // Wait for one rude RCU-tasks grace period.
>  static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
>  {
> +	if (num_online_cpus() <= 1)
> +		return;	// Fastpath for only one CPU.
> +
>  	rtp->n_ipis += cpumask_weight(cpu_online_mask);
>  	schedule_on_each_cpu(rcu_tasks_be_rude);
>  }

Paul E. McKenney Feb. 20, 2022, 8:04 p.m. UTC | #3

On Sun, Feb 20, 2022 at 07:35:17PM +0100, Padmanabha Srinivasaiah wrote:
> On Thu, Feb 17, 2022 at 09:43:30AM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 17, 2022 at 04:25:19PM +0100, Padmanabha Srinivasaiah wrote:
> > > While booting secondary CPUs, cpus_read_[lock/unlock] not keeping online
> > > cpumask stable. The transient online mask results in below calltrace.
> > > 
> > > [    0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
> > > [    0.346652] Detected PIPT I-cache on CPU2
> > > [    0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
> > > [    0.377255] Detected PIPT I-cache on CPU3
> > > [    0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
> > > [    0.379040] ------------[ cut here ]------------
> > > [    0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
> > > [    0.384850] Modules linked in:
> > > [    0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
> > > [    0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
> > > [    0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > [    0.388308] pc : __flush_work+0x12c/0x138
> > > [    0.388970] lr : __flush_work+0x80/0x138
> > > [    0.389620] sp : ffffffc00aaf3c60
> > > [    0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
> > > [    0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
> > > [    0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
> > > [    0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
> > > [    0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
> > > [    0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
> > > [    0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
> > > [    0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
> > > [    0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
> > > [    0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
> > > [    0.401886] Call trace:
> > > [    0.402309]  __flush_work+0x12c/0x138
> > > [    0.402941]  schedule_on_each_cpu+0x228/0x278
> > > [    0.403693]  rcu_tasks_rude_wait_gp+0x130/0x144
> > > [    0.404502]  rcu_tasks_kthread+0x220/0x254
> > > [    0.405264]  kthread+0x174/0x1ac
> > > [    0.405837]  ret_from_fork+0x10/0x20
> > > [    0.406456] irq event stamp: 102
> > > [    0.406966] hardirqs last  enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
> > > [    0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
> > > [    0.409410] softirqs last  enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
> > > [    0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
> > > [    0.411890] ---[ end trace 0000000000000000 ]---
> > > [    0.413000] smp: Brought up 1 node, 4 CPUs
> > > [    0.413762] SMP: Total of 4 processors activated.
> > > [    0.414566] CPU features: detected: 32-bit EL0 Support
> > > [    0.415414] CPU features: detected: 32-bit EL1 Support
> > > [    0.416278] CPU features: detected: CRC32 instructions
> > > [    0.447021] Callback from call_rcu_tasks_rude() invoked.
> > > [    0.506693] Callback from call_rcu_tasks() invoked.
> > > 
> > > Link: https://lore.kernel.org/lkml/20220210184319.25009-1-treasure4paddy@gmail.com/T/
> > > Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> > > Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
> > 
> > Thank you, Padmanabha!  I have queued this with additional explanation
> > in the commit log as shown below.
> 
> Thanks paul.
> > 
> > If you have not already tested this by repeatedly rebooting, for
> > example continuously in an overnight test, could you please do so?
> > (I am assuming that your system can do an automated reboot loop, so no,
> > I am not asking you to stay up all night!)
> > 
> Yes, tested and issue doesn't appear.

Very good, thank you!

May I add your Tested-by?

							Thanx, Paul

> > Should this somehow recur with larger systems, in which the call starts
> > with (say) five CPUs online but has trouble with the sixth CPU, there
> > are several ways to address it:
> > 
> > 1.	Make cpus_read_lock() act as expected.
> >
> > 2.	Make the first for_each_online_cpu() in schedule_on_each_cpu()
> > 	instead be for_each_possible_cpu(), always do the INIT_WORK(),
> > 	but invoke schedule_work_on() only if cpu_online(cpu).
> >
> Yes, agree and was also suggested by Tejun Heo. Will also submit same, in
> other thread.
> 
> > 3.	Other ideas here, when and if.
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > commit f6823834aa2e2e581ca627238fa3ad3e0a727c08
> > Author: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
> > Date:   Thu Feb 17 16:25:19 2022 +0100
> > 
> >     rcu-tasks: Fix race in schedule and flush work
> >     
> >     While booting secondary CPUs, cpus_read_[lock/unlock] is not keeping
> >     online cpumask stable. The transient online mask results in below
> >     calltrace.
> >     
> >     [    0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
> >     [    0.346652] Detected PIPT I-cache on CPU2
> >     [    0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
> >     [    0.377255] Detected PIPT I-cache on CPU3
> >     [    0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
> >     [    0.379040] ------------[ cut here ]------------
> >     [    0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
> >     [    0.384850] Modules linked in:
> >     [    0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
> >     [    0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
> >     [    0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> >     [    0.388308] pc : __flush_work+0x12c/0x138
> >     [    0.388970] lr : __flush_work+0x80/0x138
> >     [    0.389620] sp : ffffffc00aaf3c60
> >     [    0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
> >     [    0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
> >     [    0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
> >     [    0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
> >     [    0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
> >     [    0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
> >     [    0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
> >     [    0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
> >     [    0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
> >     [    0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
> >     [    0.401886] Call trace:
> >     [    0.402309]  __flush_work+0x12c/0x138
> >     [    0.402941]  schedule_on_each_cpu+0x228/0x278
> >     [    0.403693]  rcu_tasks_rude_wait_gp+0x130/0x144
> >     [    0.404502]  rcu_tasks_kthread+0x220/0x254
> >     [    0.405264]  kthread+0x174/0x1ac
> >     [    0.405837]  ret_from_fork+0x10/0x20
> >     [    0.406456] irq event stamp: 102
> >     [    0.406966] hardirqs last  enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
> >     [    0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
> >     [    0.409410] softirqs last  enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
> >     [    0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
> >     [    0.411890] ---[ end trace 0000000000000000 ]---
> >     [    0.413000] smp: Brought up 1 node, 4 CPUs
> >     [    0.413762] SMP: Total of 4 processors activated.
> >     [    0.414566] CPU features: detected: 32-bit EL0 Support
> >     [    0.415414] CPU features: detected: 32-bit EL1 Support
> >     [    0.416278] CPU features: detected: CRC32 instructions
> >     [    0.447021] Callback from call_rcu_tasks_rude() invoked.
> >     [    0.506693] Callback from call_rcu_tasks() invoked.
> >     
> >     This commit therefore fixes this issue by applying a single-CPU
> >     optimization to the RCU Tasks Rude grace-period process.  The key point
> >     here is that the purpose of this RCU flavor is to force a schedule on
> >     each online CPU since some past event.  But the rcu_tasks_rude_wait_gp()
> >     function runs in the context of the RCU Tasks Rude's grace-period kthread,
> >     so there must already have been a context switch on the current CPU since
> >     the call to either synchronize_rcu_tasks_rude() or call_rcu_tasks_rude().
> >     So if there is only a single CPU online, RCU Tasks Rude's grace-period
> >     kthread does not need to anything at all.
> >     
> >     It turns out that the rcu_tasks_rude_wait_gp() function's call to
> >     schedule_on_each_cpu() causes problems during early boot.  During that
> >     time, there is only one online CPU, namely the boot CPU.  Therefore,
> >     applying this single-CPU optimization fixes early-boot instances of
> >     this problem.
> >     
> >     Link: https://lore.kernel.org/lkml/20220210184319.25009-1-treasure4paddy@gmail.com/T/
> >     Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> >     Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
> >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> > index d73e32d803438..f7fdb349f8b8e 100644
> > --- a/kernel/rcu/tasks.h
> > +++ b/kernel/rcu/tasks.h
> > @@ -955,6 +955,9 @@ static void rcu_tasks_be_rude(struct work_struct *work)
> >  // Wait for one rude RCU-tasks grace period.
> >  static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
> >  {
> > +	if (num_online_cpus() <= 1)
> > +		return;	// Fastpath for only one CPU.
> > +
> >  	rtp->n_ipis += cpumask_weight(cpu_online_mask);
> >  	schedule_on_each_cpu(rcu_tasks_be_rude);
> >  }

Padmanabha Srinivasaiah Feb. 22, 2022, 10:06 p.m. UTC | #4

On Sun, Feb 20, 2022 at 12:04:25PM -0800, Paul E. McKenney wrote:
> On Sun, Feb 20, 2022 at 07:35:17PM +0100, Padmanabha Srinivasaiah wrote:
> > On Thu, Feb 17, 2022 at 09:43:30AM -0800, Paul E. McKenney wrote:
> > > On Thu, Feb 17, 2022 at 04:25:19PM +0100, Padmanabha Srinivasaiah wrote:
> > > > While booting secondary CPUs, cpus_read_[lock/unlock] not keeping online
> > > > cpumask stable. The transient online mask results in below calltrace.
> > > > 
> > > > [    0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
> > > > [    0.346652] Detected PIPT I-cache on CPU2
> > > > [    0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
> > > > [    0.377255] Detected PIPT I-cache on CPU3
> > > > [    0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
> > > > [    0.379040] ------------[ cut here ]------------
> > > > [    0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
> > > > [    0.384850] Modules linked in:
> > > > [    0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
> > > > [    0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
> > > > [    0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > [    0.388308] pc : __flush_work+0x12c/0x138
> > > > [    0.388970] lr : __flush_work+0x80/0x138
> > > > [    0.389620] sp : ffffffc00aaf3c60
> > > > [    0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
> > > > [    0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
> > > > [    0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
> > > > [    0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
> > > > [    0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
> > > > [    0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
> > > > [    0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
> > > > [    0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
> > > > [    0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
> > > > [    0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
> > > > [    0.401886] Call trace:
> > > > [    0.402309]  __flush_work+0x12c/0x138
> > > > [    0.402941]  schedule_on_each_cpu+0x228/0x278
> > > > [    0.403693]  rcu_tasks_rude_wait_gp+0x130/0x144
> > > > [    0.404502]  rcu_tasks_kthread+0x220/0x254
> > > > [    0.405264]  kthread+0x174/0x1ac
> > > > [    0.405837]  ret_from_fork+0x10/0x20
> > > > [    0.406456] irq event stamp: 102
> > > > [    0.406966] hardirqs last  enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
> > > > [    0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
> > > > [    0.409410] softirqs last  enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
> > > > [    0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
> > > > [    0.411890] ---[ end trace 0000000000000000 ]---
> > > > [    0.413000] smp: Brought up 1 node, 4 CPUs
> > > > [    0.413762] SMP: Total of 4 processors activated.
> > > > [    0.414566] CPU features: detected: 32-bit EL0 Support
> > > > [    0.415414] CPU features: detected: 32-bit EL1 Support
> > > > [    0.416278] CPU features: detected: CRC32 instructions
> > > > [    0.447021] Callback from call_rcu_tasks_rude() invoked.
> > > > [    0.506693] Callback from call_rcu_tasks() invoked.
> > > > 
> > > > Link: https://lore.kernel.org/lkml/20220210184319.25009-1-treasure4paddy@gmail.com/T/
> > > > Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> > > > Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
> > > 
> > > Thank you, Padmanabha!  I have queued this with additional explanation
> > > in the commit log as shown below.
> > 
> > Thanks paul.
> > > 
> > > If you have not already tested this by repeatedly rebooting, for
> > > example continuously in an overnight test, could you please do so?
> > > (I am assuming that your system can do an automated reboot loop, so no,
> > > I am not asking you to stay up all night!)
> > > 
> > Yes, tested and issue doesn't appear.
> 
> Very good, thank you!
> 
> May I add your Tested-by?
> 
> 							Thanx, Paul
>
Sure paul. Once again thanks for the suggestion.

Tested-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>

> > > Should this somehow recur with larger systems, in which the call starts
> > > with (say) five CPUs online but has trouble with the sixth CPU, there
> > > are several ways to address it:
> > > 
> > > 1.	Make cpus_read_lock() act as expected.
> > >
> > > 2.	Make the first for_each_online_cpu() in schedule_on_each_cpu()
> > > 	instead be for_each_possible_cpu(), always do the INIT_WORK(),
> > > 	but invoke schedule_work_on() only if cpu_online(cpu).
> > >
> > Yes, agree and was also suggested by Tejun Heo. Will also submit same, in
> > other thread.
> > 
> > > 3.	Other ideas here, when and if.
> > > 
> > > 							Thanx, Paul
> > > 
> > > ------------------------------------------------------------------------
> > > 
> > > commit f6823834aa2e2e581ca627238fa3ad3e0a727c08
> > > Author: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
> > > Date:   Thu Feb 17 16:25:19 2022 +0100
> > > 
> > >     rcu-tasks: Fix race in schedule and flush work
> > >     
> > >     While booting secondary CPUs, cpus_read_[lock/unlock] is not keeping
> > >     online cpumask stable. The transient online mask results in below
> > >     calltrace.
> > >     
> > >     [    0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
> > >     [    0.346652] Detected PIPT I-cache on CPU2
> > >     [    0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
> > >     [    0.377255] Detected PIPT I-cache on CPU3
> > >     [    0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
> > >     [    0.379040] ------------[ cut here ]------------
> > >     [    0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
> > >     [    0.384850] Modules linked in:
> > >     [    0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
> > >     [    0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
> > >     [    0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > >     [    0.388308] pc : __flush_work+0x12c/0x138
> > >     [    0.388970] lr : __flush_work+0x80/0x138
> > >     [    0.389620] sp : ffffffc00aaf3c60
> > >     [    0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
> > >     [    0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
> > >     [    0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
> > >     [    0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
> > >     [    0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
> > >     [    0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
> > >     [    0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
> > >     [    0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
> > >     [    0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
> > >     [    0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
> > >     [    0.401886] Call trace:
> > >     [    0.402309]  __flush_work+0x12c/0x138
> > >     [    0.402941]  schedule_on_each_cpu+0x228/0x278
> > >     [    0.403693]  rcu_tasks_rude_wait_gp+0x130/0x144
> > >     [    0.404502]  rcu_tasks_kthread+0x220/0x254
> > >     [    0.405264]  kthread+0x174/0x1ac
> > >     [    0.405837]  ret_from_fork+0x10/0x20
> > >     [    0.406456] irq event stamp: 102
> > >     [    0.406966] hardirqs last  enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
> > >     [    0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
> > >     [    0.409410] softirqs last  enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
> > >     [    0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
> > >     [    0.411890] ---[ end trace 0000000000000000 ]---
> > >     [    0.413000] smp: Brought up 1 node, 4 CPUs
> > >     [    0.413762] SMP: Total of 4 processors activated.
> > >     [    0.414566] CPU features: detected: 32-bit EL0 Support
> > >     [    0.415414] CPU features: detected: 32-bit EL1 Support
> > >     [    0.416278] CPU features: detected: CRC32 instructions
> > >     [    0.447021] Callback from call_rcu_tasks_rude() invoked.
> > >     [    0.506693] Callback from call_rcu_tasks() invoked.
> > >     
> > >     This commit therefore fixes this issue by applying a single-CPU
> > >     optimization to the RCU Tasks Rude grace-period process.  The key point
> > >     here is that the purpose of this RCU flavor is to force a schedule on
> > >     each online CPU since some past event.  But the rcu_tasks_rude_wait_gp()
> > >     function runs in the context of the RCU Tasks Rude's grace-period kthread,
> > >     so there must already have been a context switch on the current CPU since
> > >     the call to either synchronize_rcu_tasks_rude() or call_rcu_tasks_rude().
> > >     So if there is only a single CPU online, RCU Tasks Rude's grace-period
> > >     kthread does not need to anything at all.
> > >     
> > >     It turns out that the rcu_tasks_rude_wait_gp() function's call to
> > >     schedule_on_each_cpu() causes problems during early boot.  During that
> > >     time, there is only one online CPU, namely the boot CPU.  Therefore,
> > >     applying this single-CPU optimization fixes early-boot instances of
> > >     this problem.
> > >     
> > >     Link: https://lore.kernel.org/lkml/20220210184319.25009-1-treasure4paddy@gmail.com/T/
> > >     Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> > >     Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
> > >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > > 
> > > diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> > > index d73e32d803438..f7fdb349f8b8e 100644
> > > --- a/kernel/rcu/tasks.h
> > > +++ b/kernel/rcu/tasks.h
> > > @@ -955,6 +955,9 @@ static void rcu_tasks_be_rude(struct work_struct *work)
> > >  // Wait for one rude RCU-tasks grace period.
> > >  static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
> > >  {
> > > +	if (num_online_cpus() <= 1)
> > > +		return;	// Fastpath for only one CPU.
> > > +
> > >  	rtp->n_ipis += cpumask_weight(cpu_online_mask);
> > >  	schedule_on_each_cpu(rcu_tasks_be_rude);
> > >  }

Paul E. McKenney Feb. 23, 2022, 12:52 a.m. UTC | #5

On Tue, Feb 22, 2022 at 11:06:29PM +0100, Padmanabha Srinivasaiah wrote:
> On Sun, Feb 20, 2022 at 12:04:25PM -0800, Paul E. McKenney wrote:
> > On Sun, Feb 20, 2022 at 07:35:17PM +0100, Padmanabha Srinivasaiah wrote:
> > > On Thu, Feb 17, 2022 at 09:43:30AM -0800, Paul E. McKenney wrote:
> > > > On Thu, Feb 17, 2022 at 04:25:19PM +0100, Padmanabha Srinivasaiah wrote:
> > > > > While booting secondary CPUs, cpus_read_[lock/unlock] not keeping online
> > > > > cpumask stable. The transient online mask results in below calltrace.
> > > > > 
> > > > > [    0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
> > > > > [    0.346652] Detected PIPT I-cache on CPU2
> > > > > [    0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
> > > > > [    0.377255] Detected PIPT I-cache on CPU3
> > > > > [    0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
> > > > > [    0.379040] ------------[ cut here ]------------
> > > > > [    0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
> > > > > [    0.384850] Modules linked in:
> > > > > [    0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
> > > > > [    0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
> > > > > [    0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > > [    0.388308] pc : __flush_work+0x12c/0x138
> > > > > [    0.388970] lr : __flush_work+0x80/0x138
> > > > > [    0.389620] sp : ffffffc00aaf3c60
> > > > > [    0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
> > > > > [    0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
> > > > > [    0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
> > > > > [    0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
> > > > > [    0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
> > > > > [    0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
> > > > > [    0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
> > > > > [    0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
> > > > > [    0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
> > > > > [    0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
> > > > > [    0.401886] Call trace:
> > > > > [    0.402309]  __flush_work+0x12c/0x138
> > > > > [    0.402941]  schedule_on_each_cpu+0x228/0x278
> > > > > [    0.403693]  rcu_tasks_rude_wait_gp+0x130/0x144
> > > > > [    0.404502]  rcu_tasks_kthread+0x220/0x254
> > > > > [    0.405264]  kthread+0x174/0x1ac
> > > > > [    0.405837]  ret_from_fork+0x10/0x20
> > > > > [    0.406456] irq event stamp: 102
> > > > > [    0.406966] hardirqs last  enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
> > > > > [    0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
> > > > > [    0.409410] softirqs last  enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
> > > > > [    0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
> > > > > [    0.411890] ---[ end trace 0000000000000000 ]---
> > > > > [    0.413000] smp: Brought up 1 node, 4 CPUs
> > > > > [    0.413762] SMP: Total of 4 processors activated.
> > > > > [    0.414566] CPU features: detected: 32-bit EL0 Support
> > > > > [    0.415414] CPU features: detected: 32-bit EL1 Support
> > > > > [    0.416278] CPU features: detected: CRC32 instructions
> > > > > [    0.447021] Callback from call_rcu_tasks_rude() invoked.
> > > > > [    0.506693] Callback from call_rcu_tasks() invoked.
> > > > > 
> > > > > Link: https://lore.kernel.org/lkml/20220210184319.25009-1-treasure4paddy@gmail.com/T/
> > > > > Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> > > > > Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
> > > > 
> > > > Thank you, Padmanabha!  I have queued this with additional explanation
> > > > in the commit log as shown below.
> > > 
> > > Thanks paul.
> > > > 
> > > > If you have not already tested this by repeatedly rebooting, for
> > > > example continuously in an overnight test, could you please do so?
> > > > (I am assuming that your system can do an automated reboot loop, so no,
> > > > I am not asking you to stay up all night!)
> > > > 
> > > Yes, tested and issue doesn't appear.
> > 
> > Very good, thank you!
> > 
> > May I add your Tested-by?
> > 
> > 							Thanx, Paul
> >
> Sure paul. Once again thanks for the suggestion.
> 
> Tested-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>

Thank you!  I will apply this on the next rebase.

							Thanx, Paul

rcu-tasks: Fix race in schedule and flush work

Commit Message

Comments

Patch