diff mbox series

[04/32] rcu-tasks: Drive synchronous grace periods from calling task

Message ID 20220620225411.3842519-4-paulmck@kernel.org (mailing list archive)
State Mainlined
Commit 4a8cc433b8bf3106cf7b1173a936c62d77b40b40
Headers show
Series RCU Tasks updates for v5.20 | expand

Commit Message

Paul E. McKenney June 20, 2022, 10:53 p.m. UTC
This commit causes synchronous grace periods to be driven from the task
invoking synchronize_rcu_*(), allowing these functions to be invoked from
the mid-boot dead zone extending from when the scheduler was initialized
to to point that the various RCU tasks grace-period kthreads are spawned.
This change will allow the self-tests to run in a consistent manner.

Reported-by: Matthew Wilcox <willy@infradead.org>
Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/tasks.h | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

Comments

Sascha Hauer Sept. 1, 2022, 10:36 a.m. UTC | #1
Hi Paul,

On Mon, Jun 20, 2022 at 03:53:43PM -0700, Paul E. McKenney wrote:
> This commit causes synchronous grace periods to be driven from the task
> invoking synchronize_rcu_*(), allowing these functions to be invoked from
> the mid-boot dead zone extending from when the scheduler was initialized
> to to point that the various RCU tasks grace-period kthreads are spawned.
> This change will allow the self-tests to run in a consistent manner.
> 
> Reported-by: Matthew Wilcox <willy@infradead.org>
> Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

This commit (appeared in mainline as 4a8cc433b8bf) breaks booting my
ARMv7 based i.MX6ul board when CONFIG_PROVE_RCU is enabled. Reverting
this patch on v6.0-rc3 makes my board boot again. See below for a boot
log. The last message is "Running RCU-tasks wait API self tests", after
that the board hangs. Any idea what goes wrong here?

Sascha

----------------------------8<-----------------------------

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 5.19.0-rc3-00004-g4a8cc433b8bf (sha@dude02) (arm-v7a-linux-gnueabihf-gcc (OSELAS.Toolchain-2021.07.0 11-20210703) 11.1.1 20210703, GNU ld (GNU Binutils) 2.36.1) #229 SMP Thu Sep 1 12:00:07 CEST 2022
[    0.000000] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=10c5387d
[    0.000000] CPU: div instructions available: patching division code
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] OF: fdt: Machine model: IDS CU33X
[    0.000000] earlycon: ec_imx6q0 at MMIO 0x02020000 (options '')
[    0.000000] printk: bootconsole [ec_imx6q0] enabled
[    0.000000] Memory policy: Data cache writealloc
[    0.000000] cma: Reserved 64 MiB at 0x8c000000
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000080000000-0x000000008fffffff]
[    0.000000]   HighMem  empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000080000000-0x000000008fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x000000008fffffff]
[    0.000000] percpu: Embedded 17 pages/cpu s38068 r8192 d23372 u69632
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 65024
[    0.000000] Kernel command line: console=ttymxc0,115200n8 earlycon ip=dhcp root=/dev/nfs nfsroot=192.168.8.12:/hom
e/sha/nfsroot/cu33x,v3,tcp
[    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
[    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 162600K/262144K available (15360K kernel code, 2146K rwdata, 5472K rodata, 1024K init, 6681K b
ss, 34008K reserved, 65536K cma-reserved, 0K highmem)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] trace event string verifier disabled
[    0.000000] Running RCU self tests
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] rcu:     RCU event tracing is enabled.
[    0.000000] rcu:     RCU lockdep checking is enabled.
[    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=1.
[    0.000000]  Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
[    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.000000] Switching to timer-based delay loop, resolution 41ns
[    0.000003] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
[    0.007810] clocksource: mxc_timer1: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
[    0.021748] Console: colour dummy device 80x30
[    0.023488] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.032009] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.035282] ... MAX_LOCK_DEPTH:          48
[    0.039445] ... MAX_LOCKDEP_KEYS:        8192
[    0.043886] ... CLASSHASH_SIZE:          4096
[    0.048127] ... MAX_LOCKDEP_ENTRIES:     32768
[    0.052552] ... MAX_LOCKDEP_CHAINS:      65536
[    0.057069] ... CHAINHASH_SIZE:          32768
[    0.061405]  memory used by lock dependency info: 4061 kB
[    0.066788]  memory used for stack traces: 2112 kB
[    0.071645]  per task-struct memory footprint: 1536 bytes
[    0.077138] Calibrating delay loop (skipped), value calculated using timer frequency.. 48.00 BogoMIPS (lpj=240000)
[    0.087384] pid_max: default: 32768 minimum: 301
[    0.093527] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.099327] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.116798] CPU: Testing write buffer coherency: ok
[    0.122036] CPU0: update cpu_capacity 1024
[    0.123381] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.137390] cblist_init_generic: Setting adjustable number of callback queues.
[    0.142282] cblist_init_generic: Setting shift to 0 and lim to 1.
[    0.149333] Running RCU-tasks wait API self tests
Paul E. McKenney Sept. 1, 2022, 5:27 p.m. UTC | #2
On Thu, Sep 01, 2022 at 12:36:25PM +0200, Sascha Hauer wrote:
> Hi Paul,
> 
> On Mon, Jun 20, 2022 at 03:53:43PM -0700, Paul E. McKenney wrote:
> > This commit causes synchronous grace periods to be driven from the task
> > invoking synchronize_rcu_*(), allowing these functions to be invoked from
> > the mid-boot dead zone extending from when the scheduler was initialized
> > to to point that the various RCU tasks grace-period kthreads are spawned.
> > This change will allow the self-tests to run in a consistent manner.
> > 
> > Reported-by: Matthew Wilcox <willy@infradead.org>
> > Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> 
> This commit (appeared in mainline as 4a8cc433b8bf) breaks booting my
> ARMv7 based i.MX6ul board when CONFIG_PROVE_RCU is enabled. Reverting
> this patch on v6.0-rc3 makes my board boot again. See below for a boot
> log. The last message is "Running RCU-tasks wait API self tests", after
> that the board hangs. Any idea what goes wrong here?

New one on me!

Is it possible to get a stack trace of the hang, perhaps via
one form or another of sysrq-T?  Such a stack trace would likely
include synchronize_rcu_tasks(), synchronize_rcu_tasks_rude(), or
synchronize_rcu_tasks_trace() followed by synchronize_rcu_tasks_generic(),
rcu_tasks_one_gp(), and one of rcu_tasks_wait_gp(),
rcu_tasks_rude_wait_gp(), or rcu_tasks_wait_gp().

At this point in the boot sequence, there is only one online CPU,
correct?

I have seen recent non-boot hangs within synchronize_rcu_tasks()
due to some other task getting stuck in do_exit() between its calls
to exit_tasks_rcu_start() and exit_tasks_rcu_finish().  The symptom of
this is that the aforementioned stack trace includes synchronize_srcu().
I would not expect much in the way of exiting tasks that early in the
boot sequence, but who knows?

							Thanx, Paul

> Sascha
> 
> ----------------------------8<-----------------------------
> 
> [    0.000000] Booting Linux on physical CPU 0x0
> [    0.000000] Linux version 5.19.0-rc3-00004-g4a8cc433b8bf (sha@dude02) (arm-v7a-linux-gnueabihf-gcc (OSELAS.Toolchain-2021.07.0 11-20210703) 11.1.1 20210703, GNU ld (GNU Binutils) 2.36.1) #229 SMP Thu Sep 1 12:00:07 CEST 2022
> [    0.000000] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=10c5387d
> [    0.000000] CPU: div instructions available: patching division code
> [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> [    0.000000] OF: fdt: Machine model: IDS CU33X
> [    0.000000] earlycon: ec_imx6q0 at MMIO 0x02020000 (options '')
> [    0.000000] printk: bootconsole [ec_imx6q0] enabled
> [    0.000000] Memory policy: Data cache writealloc
> [    0.000000] cma: Reserved 64 MiB at 0x8c000000
> [    0.000000] Zone ranges:
> [    0.000000]   Normal   [mem 0x0000000080000000-0x000000008fffffff]
> [    0.000000]   HighMem  empty
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000080000000-0x000000008fffffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x000000008fffffff]
> [    0.000000] percpu: Embedded 17 pages/cpu s38068 r8192 d23372 u69632
> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 65024
> [    0.000000] Kernel command line: console=ttymxc0,115200n8 earlycon ip=dhcp root=/dev/nfs nfsroot=192.168.8.12:/hom
> e/sha/nfsroot/cu33x,v3,tcp
> [    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
> [    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> [    0.000000] Memory: 162600K/262144K available (15360K kernel code, 2146K rwdata, 5472K rodata, 1024K init, 6681K b
> ss, 34008K reserved, 65536K cma-reserved, 0K highmem)
> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> [    0.000000] trace event string verifier disabled
> [    0.000000] Running RCU self tests
> [    0.000000] rcu: Hierarchical RCU implementation.
> [    0.000000] rcu:     RCU event tracing is enabled.
> [    0.000000] rcu:     RCU lockdep checking is enabled.
> [    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=1.
> [    0.000000]  Tracing variant of Tasks RCU enabled.
> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
> [    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
> [    0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
> [    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
> [    0.000000] Switching to timer-based delay loop, resolution 41ns
> [    0.000003] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
> [    0.007810] clocksource: mxc_timer1: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
> [    0.021748] Console: colour dummy device 80x30
> [    0.023488] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
> [    0.032009] ... MAX_LOCKDEP_SUBCLASSES:  8
> [    0.035282] ... MAX_LOCK_DEPTH:          48
> [    0.039445] ... MAX_LOCKDEP_KEYS:        8192
> [    0.043886] ... CLASSHASH_SIZE:          4096
> [    0.048127] ... MAX_LOCKDEP_ENTRIES:     32768
> [    0.052552] ... MAX_LOCKDEP_CHAINS:      65536
> [    0.057069] ... CHAINHASH_SIZE:          32768
> [    0.061405]  memory used by lock dependency info: 4061 kB
> [    0.066788]  memory used for stack traces: 2112 kB
> [    0.071645]  per task-struct memory footprint: 1536 bytes
> [    0.077138] Calibrating delay loop (skipped), value calculated using timer frequency.. 48.00 BogoMIPS (lpj=240000)
> [    0.087384] pid_max: default: 32768 minimum: 301
> [    0.093527] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
> [    0.099327] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
> [    0.116798] CPU: Testing write buffer coherency: ok
> [    0.122036] CPU0: update cpu_capacity 1024
> [    0.123381] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
> [    0.137390] cblist_init_generic: Setting adjustable number of callback queues.
> [    0.142282] cblist_init_generic: Setting shift to 0 and lim to 1.
> [    0.149333] Running RCU-tasks wait API self tests
> 
> -- 
> Pengutronix e.K.                           |                             |
> Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
> 31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
Paul E. McKenney Sept. 1, 2022, 5:33 p.m. UTC | #3
On Thu, Sep 01, 2022 at 10:27:25AM -0700, Paul E. McKenney wrote:
> On Thu, Sep 01, 2022 at 12:36:25PM +0200, Sascha Hauer wrote:
> > Hi Paul,
> > 
> > On Mon, Jun 20, 2022 at 03:53:43PM -0700, Paul E. McKenney wrote:
> > > This commit causes synchronous grace periods to be driven from the task
> > > invoking synchronize_rcu_*(), allowing these functions to be invoked from
> > > the mid-boot dead zone extending from when the scheduler was initialized
> > > to to point that the various RCU tasks grace-period kthreads are spawned.
> > > This change will allow the self-tests to run in a consistent manner.
> > > 
> > > Reported-by: Matthew Wilcox <willy@infradead.org>
> > > Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
> > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > This commit (appeared in mainline as 4a8cc433b8bf) breaks booting my
> > ARMv7 based i.MX6ul board when CONFIG_PROVE_RCU is enabled. Reverting
> > this patch on v6.0-rc3 makes my board boot again. See below for a boot
> > log. The last message is "Running RCU-tasks wait API self tests", after
> > that the board hangs. Any idea what goes wrong here?
> 
> New one on me!
> 
> Is it possible to get a stack trace of the hang, perhaps via
> one form or another of sysrq-T?  Such a stack trace would likely
> include synchronize_rcu_tasks(), synchronize_rcu_tasks_rude(), or
> synchronize_rcu_tasks_trace() followed by synchronize_rcu_tasks_generic(),
> rcu_tasks_one_gp(), and one of rcu_tasks_wait_gp(),
> rcu_tasks_rude_wait_gp(), or rcu_tasks_wait_gp().

If there is no chance of sysrq-T, kernel debuggers, kernel crash
dumps, or any other source of the stack trace, please decorate the
code path with printk() or similar and let me know where it goes.
Under normal circumstances, this code path is not sensitive to performance
perturbations of the printk() persuasion.

							Thanx, Paul

> At this point in the boot sequence, there is only one online CPU,
> correct?
> 
> I have seen recent non-boot hangs within synchronize_rcu_tasks()
> due to some other task getting stuck in do_exit() between its calls
> to exit_tasks_rcu_start() and exit_tasks_rcu_finish().  The symptom of
> this is that the aforementioned stack trace includes synchronize_srcu().
> I would not expect much in the way of exiting tasks that early in the
> boot sequence, but who knows?
> 
> 							Thanx, Paul
> 
> > Sascha
> > 
> > ----------------------------8<-----------------------------
> > 
> > [    0.000000] Booting Linux on physical CPU 0x0
> > [    0.000000] Linux version 5.19.0-rc3-00004-g4a8cc433b8bf (sha@dude02) (arm-v7a-linux-gnueabihf-gcc (OSELAS.Toolchain-2021.07.0 11-20210703) 11.1.1 20210703, GNU ld (GNU Binutils) 2.36.1) #229 SMP Thu Sep 1 12:00:07 CEST 2022
> > [    0.000000] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=10c5387d
> > [    0.000000] CPU: div instructions available: patching division code
> > [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> > [    0.000000] OF: fdt: Machine model: IDS CU33X
> > [    0.000000] earlycon: ec_imx6q0 at MMIO 0x02020000 (options '')
> > [    0.000000] printk: bootconsole [ec_imx6q0] enabled
> > [    0.000000] Memory policy: Data cache writealloc
> > [    0.000000] cma: Reserved 64 MiB at 0x8c000000
> > [    0.000000] Zone ranges:
> > [    0.000000]   Normal   [mem 0x0000000080000000-0x000000008fffffff]
> > [    0.000000]   HighMem  empty
> > [    0.000000] Movable zone start for each node
> > [    0.000000] Early memory node ranges
> > [    0.000000]   node   0: [mem 0x0000000080000000-0x000000008fffffff]
> > [    0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x000000008fffffff]
> > [    0.000000] percpu: Embedded 17 pages/cpu s38068 r8192 d23372 u69632
> > [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 65024
> > [    0.000000] Kernel command line: console=ttymxc0,115200n8 earlycon ip=dhcp root=/dev/nfs nfsroot=192.168.8.12:/hom
> > e/sha/nfsroot/cu33x,v3,tcp
> > [    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
> > [    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
> > [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> > [    0.000000] Memory: 162600K/262144K available (15360K kernel code, 2146K rwdata, 5472K rodata, 1024K init, 6681K b
> > ss, 34008K reserved, 65536K cma-reserved, 0K highmem)
> > [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
> > [    0.000000] trace event string verifier disabled
> > [    0.000000] Running RCU self tests
> > [    0.000000] rcu: Hierarchical RCU implementation.
> > [    0.000000] rcu:     RCU event tracing is enabled.
> > [    0.000000] rcu:     RCU lockdep checking is enabled.
> > [    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=1.
> > [    0.000000]  Tracing variant of Tasks RCU enabled.
> > [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
> > [    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
> > [    0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
> > [    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
> > [    0.000000] Switching to timer-based delay loop, resolution 41ns
> > [    0.000003] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
> > [    0.007810] clocksource: mxc_timer1: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
> > [    0.021748] Console: colour dummy device 80x30
> > [    0.023488] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
> > [    0.032009] ... MAX_LOCKDEP_SUBCLASSES:  8
> > [    0.035282] ... MAX_LOCK_DEPTH:          48
> > [    0.039445] ... MAX_LOCKDEP_KEYS:        8192
> > [    0.043886] ... CLASSHASH_SIZE:          4096
> > [    0.048127] ... MAX_LOCKDEP_ENTRIES:     32768
> > [    0.052552] ... MAX_LOCKDEP_CHAINS:      65536
> > [    0.057069] ... CHAINHASH_SIZE:          32768
> > [    0.061405]  memory used by lock dependency info: 4061 kB
> > [    0.066788]  memory used for stack traces: 2112 kB
> > [    0.071645]  per task-struct memory footprint: 1536 bytes
> > [    0.077138] Calibrating delay loop (skipped), value calculated using timer frequency.. 48.00 BogoMIPS (lpj=240000)
> > [    0.087384] pid_max: default: 32768 minimum: 301
> > [    0.093527] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
> > [    0.099327] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
> > [    0.116798] CPU: Testing write buffer coherency: ok
> > [    0.122036] CPU0: update cpu_capacity 1024
> > [    0.123381] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
> > [    0.137390] cblist_init_generic: Setting adjustable number of callback queues.
> > [    0.142282] cblist_init_generic: Setting shift to 0 and lim to 1.
> > [    0.149333] Running RCU-tasks wait API self tests
> > 
> > -- 
> > Pengutronix e.K.                           |                             |
> > Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
> > 31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
> > Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
Sascha Hauer Sept. 2, 2022, 11:52 a.m. UTC | #4
On Thu, Sep 01, 2022 at 10:33:04AM -0700, Paul E. McKenney wrote:
> On Thu, Sep 01, 2022 at 10:27:25AM -0700, Paul E. McKenney wrote:
> > On Thu, Sep 01, 2022 at 12:36:25PM +0200, Sascha Hauer wrote:
> > > Hi Paul,
> > > 
> > > On Mon, Jun 20, 2022 at 03:53:43PM -0700, Paul E. McKenney wrote:
> > > > This commit causes synchronous grace periods to be driven from the task
> > > > invoking synchronize_rcu_*(), allowing these functions to be invoked from
> > > > the mid-boot dead zone extending from when the scheduler was initialized
> > > > to to point that the various RCU tasks grace-period kthreads are spawned.
> > > > This change will allow the self-tests to run in a consistent manner.
> > > > 
> > > > Reported-by: Matthew Wilcox <willy@infradead.org>
> > > > Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
> > > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > > 
> > > This commit (appeared in mainline as 4a8cc433b8bf) breaks booting my
> > > ARMv7 based i.MX6ul board when CONFIG_PROVE_RCU is enabled. Reverting
> > > this patch on v6.0-rc3 makes my board boot again. See below for a boot
> > > log. The last message is "Running RCU-tasks wait API self tests", after
> > > that the board hangs. Any idea what goes wrong here?
> > 
> > New one on me!
> > 
> > Is it possible to get a stack trace of the hang, perhaps via
> > one form or another of sysrq-T?  Such a stack trace would likely
> > include synchronize_rcu_tasks(), synchronize_rcu_tasks_rude(), or
> > synchronize_rcu_tasks_trace() followed by synchronize_rcu_tasks_generic(),
> > rcu_tasks_one_gp(), and one of rcu_tasks_wait_gp(),
> > rcu_tasks_rude_wait_gp(), or rcu_tasks_wait_gp().
> 
> If there is no chance of sysrq-T, kernel debuggers, kernel crash
> dumps, or any other source of the stack trace, please decorate the
> code path with printk() or similar and let me know where it goes.
> Under normal circumstances, this code path is not sensitive to performance
> perturbations of the printk() persuasion.

Some unrelated bug I was searching for made me turn on early console
output with the "earlycon" parameter. It turned out that when I remove
this parameter then my board boots fine.

I then realized that even with earlycon enabled my board boots fine
when I remove the call to

	pr_info("Running RCU-tasks wait API self tests\n");

Given that I am not sure how useful it is to add more printk. I did that
anyway like this:

> static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot)
> {
> 	int needgpcb;
> 
> 	printk("%s: mutex_lock... midboot: %d\n", __func__, midboot);
> 	mutex_lock(&rtp->tasks_gp_mutex);
> 	printk("%s: mutex_locked midboot: %d\n", __func__, midboot);
> 
> 	// If there were none, wait a bit and start over.
> 	if (unlikely(midboot)) {
> 		needgpcb = 0x2;
> 	} else {
> 		printk("%s: set_tasks_gp_state(RTGS_WAIT_CBS)...\n", __func__);
> 		set_tasks_gp_state(rtp, RTGS_WAIT_CBS);
> 		printk("%s: rcuwait_wait_event...\n", __func__);
> 		rcuwait_wait_event(&rtp->cbs_wait,
> 				   (needgpcb = rcu_tasks_need_gpcb(rtp)),
> 				   TASK_IDLE);
>		printk("%s: rcuwait_wait_event done\n", __func__);
> 	}
>

What I see then is:

[    0.156362] synchronize_rcu_tasks_generic: rcu_tasks_one_gp....
[    0.162087] rcu_tasks_one_gp: mutex_lock... midboot: 1
[    0.167386] rcu_tasks_one_gp: mutex_lock... midboot: 0
[    0.172489] rcu_tasks_one_gp: mutex_locked midboot: 0
[    0.177535] rcu_tasks_one_gp: set_tasks_gp_state(RTGS_WAIT_CBS)...
[    0.183525] rcu_tasks_one_gp: rcuwait_wait_event...

Here the board hangs. After some time I get:

[  254.493010] random: crng init done

But that's it.

> 
> > At this point in the boot sequence, there is only one online CPU,
> > correct?

Yes, it's a single core system.

Sascha
Paul E. McKenney Sept. 2, 2022, 12:04 p.m. UTC | #5
On Fri, Sep 02, 2022 at 01:52:28PM +0200, Sascha Hauer wrote:
> On Thu, Sep 01, 2022 at 10:33:04AM -0700, Paul E. McKenney wrote:
> > On Thu, Sep 01, 2022 at 10:27:25AM -0700, Paul E. McKenney wrote:
> > > On Thu, Sep 01, 2022 at 12:36:25PM +0200, Sascha Hauer wrote:
> > > > Hi Paul,
> > > > 
> > > > On Mon, Jun 20, 2022 at 03:53:43PM -0700, Paul E. McKenney wrote:
> > > > > This commit causes synchronous grace periods to be driven from the task
> > > > > invoking synchronize_rcu_*(), allowing these functions to be invoked from
> > > > > the mid-boot dead zone extending from when the scheduler was initialized
> > > > > to to point that the various RCU tasks grace-period kthreads are spawned.
> > > > > This change will allow the self-tests to run in a consistent manner.
> > > > > 
> > > > > Reported-by: Matthew Wilcox <willy@infradead.org>
> > > > > Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
> > > > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > > > 
> > > > This commit (appeared in mainline as 4a8cc433b8bf) breaks booting my
> > > > ARMv7 based i.MX6ul board when CONFIG_PROVE_RCU is enabled. Reverting
> > > > this patch on v6.0-rc3 makes my board boot again. See below for a boot
> > > > log. The last message is "Running RCU-tasks wait API self tests", after
> > > > that the board hangs. Any idea what goes wrong here?
> > > 
> > > New one on me!
> > > 
> > > Is it possible to get a stack trace of the hang, perhaps via
> > > one form or another of sysrq-T?  Such a stack trace would likely
> > > include synchronize_rcu_tasks(), synchronize_rcu_tasks_rude(), or
> > > synchronize_rcu_tasks_trace() followed by synchronize_rcu_tasks_generic(),
> > > rcu_tasks_one_gp(), and one of rcu_tasks_wait_gp(),
> > > rcu_tasks_rude_wait_gp(), or rcu_tasks_wait_gp().
> > 
> > If there is no chance of sysrq-T, kernel debuggers, kernel crash
> > dumps, or any other source of the stack trace, please decorate the
> > code path with printk() or similar and let me know where it goes.
> > Under normal circumstances, this code path is not sensitive to performance
> > perturbations of the printk() persuasion.
> 
> Some unrelated bug I was searching for made me turn on early console
> output with the "earlycon" parameter. It turned out that when I remove
> this parameter then my board boots fine.
> 
> I then realized that even with earlycon enabled my board boots fine
> when I remove the call to
> 
> 	pr_info("Running RCU-tasks wait API self tests\n");

Ah, there are some printk() fixes in the works.  Maybe this is one area
needed that.  Or maybe not.

> Given that I am not sure how useful it is to add more printk. I did that
> anyway like this:
> 
> > static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot)
> > {
> > 	int needgpcb;
> > 
> > 	printk("%s: mutex_lock... midboot: %d\n", __func__, midboot);
> > 	mutex_lock(&rtp->tasks_gp_mutex);
> > 	printk("%s: mutex_locked midboot: %d\n", __func__, midboot);
> > 
> > 	// If there were none, wait a bit and start over.
> > 	if (unlikely(midboot)) {
> > 		needgpcb = 0x2;
> > 	} else {
> > 		printk("%s: set_tasks_gp_state(RTGS_WAIT_CBS)...\n", __func__);
> > 		set_tasks_gp_state(rtp, RTGS_WAIT_CBS);
> > 		printk("%s: rcuwait_wait_event...\n", __func__);
> > 		rcuwait_wait_event(&rtp->cbs_wait,
> > 				   (needgpcb = rcu_tasks_need_gpcb(rtp)),
> > 				   TASK_IDLE);
> >		printk("%s: rcuwait_wait_event done\n", __func__);
> > 	}
> >
> 
> What I see then is:
> 
> [    0.156362] synchronize_rcu_tasks_generic: rcu_tasks_one_gp....
> [    0.162087] rcu_tasks_one_gp: mutex_lock... midboot: 1

So one task gets stuck either in mutex_lock() or the printk() above
and some other task below moves ahead?  Or might some printk()s have
been lost?

> [    0.167386] rcu_tasks_one_gp: mutex_lock... midboot: 0
> [    0.172489] rcu_tasks_one_gp: mutex_locked midboot: 0
> [    0.177535] rcu_tasks_one_gp: set_tasks_gp_state(RTGS_WAIT_CBS)...
> [    0.183525] rcu_tasks_one_gp: rcuwait_wait_event...

Given that everything works with printk()s turned off, my current
suspicion is a printk() issue.

> Here the board hangs. After some time I get:
> 
> [  254.493010] random: crng init done

This looks unrelated.

> But that's it.
> 
> > 
> > > At this point in the boot sequence, there is only one online CPU,
> > > correct?
> 
> Yes, it's a single core system.

OK, then we should be able to rule out SMP issues.  ;-)

								Thanx, Paul

> Sascha
> 
> -- 
> Pengutronix e.K.                           |                             |
> Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
> 31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
diff mbox series

Patch

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index ad993c4ed924f..bd9f2e24f5c73 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -495,17 +495,21 @@  static void rcu_tasks_invoke_cbs_wq(struct work_struct *wp)
 }
 
 // Wait for one grace period.
-static void rcu_tasks_one_gp(struct rcu_tasks *rtp)
+static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot)
 {
 	int needgpcb;
 
 	mutex_lock(&rtp->tasks_gp_mutex);
-	set_tasks_gp_state(rtp, RTGS_WAIT_CBS);
 
 	// If there were none, wait a bit and start over.
-	rcuwait_wait_event(&rtp->cbs_wait,
-			   (needgpcb = rcu_tasks_need_gpcb(rtp)),
-			   TASK_IDLE);
+	if (unlikely(midboot)) {
+		needgpcb = 0x2;
+	} else {
+		set_tasks_gp_state(rtp, RTGS_WAIT_CBS);
+		rcuwait_wait_event(&rtp->cbs_wait,
+				   (needgpcb = rcu_tasks_need_gpcb(rtp)),
+				   TASK_IDLE);
+	}
 
 	if (needgpcb & 0x2) {
 		// Wait for one grace period.
@@ -540,7 +544,7 @@  static int __noreturn rcu_tasks_kthread(void *arg)
 	for (;;) {
 		// Wait for one grace period and invoke any callbacks
 		// that are ready.
-		rcu_tasks_one_gp(rtp);
+		rcu_tasks_one_gp(rtp, false);
 
 		// Paranoid sleep to keep this from entering a tight loop.
 		schedule_timeout_idle(rtp->gp_sleep);
@@ -554,8 +558,12 @@  static void synchronize_rcu_tasks_generic(struct rcu_tasks *rtp)
 	RCU_LOCKDEP_WARN(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
 			 "synchronize_rcu_tasks called too soon");
 
-	/* Wait for the grace period. */
-	wait_rcu_gp(rtp->call_func);
+	// If the grace-period kthread is running, use it.
+	if (READ_ONCE(rtp->kthread_ptr)) {
+		wait_rcu_gp(rtp->call_func);
+		return;
+	}
+	rcu_tasks_one_gp(rtp, true);
 }
 
 /* Spawn RCU-tasks grace-period kthread. */