diff mbox series

[rcu,v2] 4/5] rcu-tasks: Move RCU Tasks self-tests to core_initcall()

Message ID 20250130185320.1651910-4-paulmck@kernel.org (mailing list archive)
State Handled Elsewhere
Headers show
Series [rcu,v2] 4/5] rcu-tasks: Move RCU Tasks self-tests to core_initcall() | expand

Commit Message

Paul E. McKenney Jan. 30, 2025, 6:53 p.m. UTC
The timer and hrtimer softirq processing has moved to dedicated threads
for kernels built with CONFIG_IRQ_FORCED_THREADING=y.  This results in
timers not expiring until later in early boot, which in turn causes the
RCU Tasks self-tests to hang in kernels built with CONFIG_PROVE_RCU=y,
which further causes the entire kernel to hang.  One fix would be to
make timers work during this time, but there are no known users of RCU
Tasks grace periods during that time, so no justification for the added
complexity.  Not yet, anyway.

This commit therefore moves the call to rcu_init_tasks_generic() from
kernel_init_freeable() to a core_initcall().  This works because the
timer and hrtimer kthreads are created at early_initcall() time.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: <linux-trace-kernel@vger.kernel.org>
---
 include/linux/rcupdate.h | 6 ------
 init/main.c              | 1 -
 kernel/rcu/tasks.h       | 5 ++++-
 3 files changed, 4 insertions(+), 8 deletions(-)

Comments

Sebastian Andrzej Siewior Feb. 4, 2025, 10:26 a.m. UTC | #1
On 2025-01-30 10:53:19 [-0800], Paul E. McKenney wrote:
> The timer and hrtimer softirq processing has moved to dedicated threads
> for kernels built with CONFIG_IRQ_FORCED_THREADING=y.  This results in
> timers not expiring until later in early boot, which in turn causes the
> RCU Tasks self-tests to hang in kernels built with CONFIG_PROVE_RCU=y,
> which further causes the entire kernel to hang.  One fix would be to
> make timers work during this time, but there are no known users of RCU
> Tasks grace periods during that time, so no justification for the added
> complexity.  Not yet, anyway.
> 
> This commit therefore moves the call to rcu_init_tasks_generic() from
> kernel_init_freeable() to a core_initcall().  This works because the
> timer and hrtimer kthreads are created at early_initcall() time.

Fixes: 49a17639508c3 ("softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.")
?

I played with it and I can reproduce the issue with !RT + threadirqs but
not with RT (which implies threadirqs).
Is there anything in RT that avoids the problem?

Thank you for debugging and the patch. 

Sebastian
Paul E. McKenney Feb. 4, 2025, 11:51 a.m. UTC | #2
On Tue, Feb 04, 2025 at 11:26:11AM +0100, Sebastian Andrzej Siewior wrote:
> On 2025-01-30 10:53:19 [-0800], Paul E. McKenney wrote:
> > The timer and hrtimer softirq processing has moved to dedicated threads
> > for kernels built with CONFIG_IRQ_FORCED_THREADING=y.  This results in
> > timers not expiring until later in early boot, which in turn causes the
> > RCU Tasks self-tests to hang in kernels built with CONFIG_PROVE_RCU=y,
> > which further causes the entire kernel to hang.  One fix would be to
> > make timers work during this time, but there are no known users of RCU
> > Tasks grace periods during that time, so no justification for the added
> > complexity.  Not yet, anyway.
> > 
> > This commit therefore moves the call to rcu_init_tasks_generic() from
> > kernel_init_freeable() to a core_initcall().  This works because the
> > timer and hrtimer kthreads are created at early_initcall() time.
> 
> Fixes: 49a17639508c3 ("softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.")
> ?

Quite possibly...  I freely confess that I was more focused on the fix
than on the bug's origin.  Would you be willing to try this commit and
its predecessor?

> I played with it and I can reproduce the issue with !RT + threadirqs but
> not with RT (which implies threadirqs).
> Is there anything in RT that avoids the problem?

Not that I know of, but then again I did not try it.  To your point,
I do need to make a -rt rcutorture scenario.  TREE03 has been intended to
approximate this, and it uses the following Kconfig options:

------------------------------------------------------------------------

CONFIG_SMP=y
CONFIG_NR_CPUS=16
CONFIG_PREEMPT_NONE=n
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_PREEMPT=y
#CHECK#CONFIG_PREEMPT_RCU=y
CONFIG_HZ_PERIODIC=y
CONFIG_NO_HZ_IDLE=n
CONFIG_NO_HZ_FULL=n
CONFIG_RCU_TRACE=y
CONFIG_HOTPLUG_CPU=y
CONFIG_RCU_FANOUT=2
CONFIG_RCU_FANOUT_LEAF=2
CONFIG_RCU_NOCB_CPU=n
CONFIG_DEBUG_LOCK_ALLOC=n
CONFIG_RCU_BOOST=y
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
CONFIG_RCU_EXPERT=y

------------------------------------------------------------------------

And the following kernel-boot parameters:

------------------------------------------------------------------------

rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30
rcutree.gp_preinit_delay=12
rcutree.gp_init_delay=3
rcutree.gp_cleanup_delay=3
rcutree.kthread_prio=2
threadirqs
rcutree.use_softirq=0
rcutorture.preempt_duration=10

------------------------------------------------------------------------

Some of these are for RCU's benefit, but what should I change to more
closely approximate a typical real-time deployment?

> Thank you for debugging and the patch. 

And to you for digging into it!

							Thanx, Paul
Sebastian Andrzej Siewior Feb. 4, 2025, 4:34 p.m. UTC | #3
On 2025-02-04 03:51:48 [-0800], Paul E. McKenney wrote:
> On Tue, Feb 04, 2025 at 11:26:11AM +0100, Sebastian Andrzej Siewior wrote:
> > On 2025-01-30 10:53:19 [-0800], Paul E. McKenney wrote:
> > > The timer and hrtimer softirq processing has moved to dedicated threads
> > > for kernels built with CONFIG_IRQ_FORCED_THREADING=y.  This results in
> > > timers not expiring until later in early boot, which in turn causes the
> > > RCU Tasks self-tests to hang in kernels built with CONFIG_PROVE_RCU=y,
> > > which further causes the entire kernel to hang.  One fix would be to
> > > make timers work during this time, but there are no known users of RCU
> > > Tasks grace periods during that time, so no justification for the added
> > > complexity.  Not yet, anyway.
> > > 
> > > This commit therefore moves the call to rcu_init_tasks_generic() from
> > > kernel_init_freeable() to a core_initcall().  This works because the
> > > timer and hrtimer kthreads are created at early_initcall() time.
> > 
> > Fixes: 49a17639508c3 ("softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.")
> > ?
> 
> Quite possibly...  I freely confess that I was more focused on the fix
> than on the bug's origin.  Would you be willing to try this commit and
> its predecessor?

Yes. Just verified.
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

> > I played with it and I can reproduce the issue with !RT + threadirqs but
> > not with RT (which implies threadirqs).
> > Is there anything in RT that avoids the problem?
> 
> Not that I know of, but then again I did not try it.  To your point,

The change looks fine.

> I do need to make a -rt rcutorture scenario.  TREE03 has been intended to
> approximate this, and it uses the following Kconfig options:
> 
> ------------------------------------------------------------------------
> 
> CONFIG_SMP=y
> CONFIG_NR_CPUS=16
> CONFIG_PREEMPT_NONE=n
> CONFIG_PREEMPT_VOLUNTARY=n
> CONFIG_PREEMPT=y
> #CHECK#CONFIG_PREEMPT_RCU=y
> CONFIG_HZ_PERIODIC=y
> CONFIG_NO_HZ_IDLE=n
> CONFIG_NO_HZ_FULL=n
> CONFIG_RCU_TRACE=y
> CONFIG_HOTPLUG_CPU=y
> CONFIG_RCU_FANOUT=2
> CONFIG_RCU_FANOUT_LEAF=2
> CONFIG_RCU_NOCB_CPU=n
> CONFIG_DEBUG_LOCK_ALLOC=n
> CONFIG_RCU_BOOST=y
> CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
> CONFIG_RCU_EXPERT=y

You could enable CONFIG_PREEMPT_RT ;)
CONFIG_PREEMPT_LAZY is probably also set a lot.

That should be it.

> ------------------------------------------------------------------------
> 
> And the following kernel-boot parameters:
> 
> ------------------------------------------------------------------------
> 
> rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30
> rcutree.gp_preinit_delay=12
> rcutree.gp_init_delay=3
> rcutree.gp_cleanup_delay=3
> rcutree.kthread_prio=2
> threadirqs
> rcutree.use_softirq=0
> rcutorture.preempt_duration=10
> 
> ------------------------------------------------------------------------
> 
> Some of these are for RCU's benefit, but what should I change to more
> closely approximate a typical real-time deployment?

See above.

> > Thank you for debugging and the patch. 
> 
> And to you for digging into it!

Thanks.

> 							Thanx, Paul

Sebastian
Paul E. McKenney Feb. 4, 2025, 8:20 p.m. UTC | #4
On Tue, Feb 04, 2025 at 05:34:09PM +0100, Sebastian Andrzej Siewior wrote:
> On 2025-02-04 03:51:48 [-0800], Paul E. McKenney wrote:
> > On Tue, Feb 04, 2025 at 11:26:11AM +0100, Sebastian Andrzej Siewior wrote:
> > > On 2025-01-30 10:53:19 [-0800], Paul E. McKenney wrote:
> > > > The timer and hrtimer softirq processing has moved to dedicated threads
> > > > for kernels built with CONFIG_IRQ_FORCED_THREADING=y.  This results in
> > > > timers not expiring until later in early boot, which in turn causes the
> > > > RCU Tasks self-tests to hang in kernels built with CONFIG_PROVE_RCU=y,
> > > > which further causes the entire kernel to hang.  One fix would be to
> > > > make timers work during this time, but there are no known users of RCU
> > > > Tasks grace periods during that time, so no justification for the added
> > > > complexity.  Not yet, anyway.
> > > > 
> > > > This commit therefore moves the call to rcu_init_tasks_generic() from
> > > > kernel_init_freeable() to a core_initcall().  This works because the
> > > > timer and hrtimer kthreads are created at early_initcall() time.
> > > 
> > > Fixes: 49a17639508c3 ("softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.")
> > > ?
> > 
> > Quite possibly...  I freely confess that I was more focused on the fix
> > than on the bug's origin.  Would you be willing to try this commit and
> > its predecessor?
> 
> Yes. Just verified.
> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Boqun, could you please apply Sebastian's tags, including the Fixes
tag above?

> > > I played with it and I can reproduce the issue with !RT + threadirqs but
> > > not with RT (which implies threadirqs).
> > > Is there anything in RT that avoids the problem?
> > 
> > Not that I know of, but then again I did not try it.  To your point,
> 
> The change looks fine.
> 
> > I do need to make a -rt rcutorture scenario.  TREE03 has been intended to
> > approximate this, and it uses the following Kconfig options:
> > 
> > ------------------------------------------------------------------------
> > 
> > CONFIG_SMP=y
> > CONFIG_NR_CPUS=16
> > CONFIG_PREEMPT_NONE=n
> > CONFIG_PREEMPT_VOLUNTARY=n
> > CONFIG_PREEMPT=y
> > #CHECK#CONFIG_PREEMPT_RCU=y
> > CONFIG_HZ_PERIODIC=y
> > CONFIG_NO_HZ_IDLE=n
> > CONFIG_NO_HZ_FULL=n
> > CONFIG_RCU_TRACE=y
> > CONFIG_HOTPLUG_CPU=y
> > CONFIG_RCU_FANOUT=2
> > CONFIG_RCU_FANOUT_LEAF=2
> > CONFIG_RCU_NOCB_CPU=n
> > CONFIG_DEBUG_LOCK_ALLOC=n
> > CONFIG_RCU_BOOST=y
> > CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
> > CONFIG_RCU_EXPERT=y
> 
> You could enable CONFIG_PREEMPT_RT ;)
> CONFIG_PREEMPT_LAZY is probably also set a lot.
> 
> That should be it.
> 
> > ------------------------------------------------------------------------
> > 
> > And the following kernel-boot parameters:
> > 
> > ------------------------------------------------------------------------
> > 
> > rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30
> > rcutree.gp_preinit_delay=12
> > rcutree.gp_init_delay=3
> > rcutree.gp_cleanup_delay=3
> > rcutree.kthread_prio=2
> > threadirqs
> > rcutree.use_softirq=0
> > rcutorture.preempt_duration=10
> > 
> > ------------------------------------------------------------------------
> > 
> > Some of these are for RCU's benefit, but what should I change to more
> > closely approximate a typical real-time deployment?
> 
> See above.

Which got me this diff:

------------------------------------------------------------------------

diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE03 b/tools/testing/selftests/rcutorture/configs/rcu/TREE03
index 2dc31b16e506..6158f5002497 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE03
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE03
@@ -2,7 +2,9 @@ CONFIG_SMP=y
 CONFIG_NR_CPUS=16
 CONFIG_PREEMPT_NONE=n
 CONFIG_PREEMPT_VOLUNTARY=n
-CONFIG_PREEMPT=y
+CONFIG_PREEMPT=n
+CONFIG_PREEMPT_LAZY=y
+CONFIG_PREEMPT_RT=y
 #CHECK#CONFIG_PREEMPT_RCU=y
 CONFIG_HZ_PERIODIC=y
 CONFIG_NO_HZ_IDLE=n
@@ -15,4 +17,5 @@ CONFIG_RCU_NOCB_CPU=n
 CONFIG_DEBUG_LOCK_ALLOC=n
 CONFIG_RCU_BOOST=y
 CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_EXPERT=y
 CONFIG_RCU_EXPERT=y

------------------------------------------------------------------------

But a 10-minute run got me the splat shown below, and in addition a
shutdown-time hang.

This is caused by RCU falling behind a callback-flooding kthread that
invokes call_rcu() in a semi-tight loop.  Setting rcutree.kthread_prio=40
avoids the splat, but still gets the shutdown-time hang.  Retrying with
the default rcutree.kthread_prio=2 failed to reproduce the splat, but
it did reproduce the shutdown-time hang.

OK, maybe printk buffers are not being flushed?  A 100-millisecond sleep
at the end of of rcu_torture_cleanup() got all of rcutorture's output
flushed, but lost the subsequent shutdown-time console traffic.  The
pr_flush(HZ/10,1) seems more sensible, but this is private to printk().

I would like to log the shutdown-time console traffic because RCU can
sometimes break things on that path.

Thoughts?

							Thanx, Paul

PS:  I will do longer runs in case that splat was not a one-off.
     My concern is that I might need to adjust something more in order
     to get a reliable callback-flooding test.

------------------------------------------------------------------------

[  133.107605] ------------[ cut here ]------------
[  133.107611] rcutorture_oom_notify invoked upon OOM during forward-progress testing.
[  133.107636] WARNING: CPU: 0 PID: 236 at kernel/rcu/rcutorture.c:3117 rcutorture_oom_notify+0x48/0x2e0
[  133.107651] Modules linked in:
[  133.107662] CPU: 0 UID: 0 PID: 236 Comm: rcu_torture_fwd Not tainted 6.13.0-rc2-00108-g56d12f072c66-dirty #1747
[  133.107667] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[  133.107669] RIP: 0010:rcutorture_oom_notify+0x48/0x2e0
[  133.107673] Code: e8 3d ca c3 00 48 8b 2d 5e 8e 17 02 48 85 ed 0f 84 82 02 00 00 90 48 c7 c6 a0 fc 20 a7 48 c7 c7 40 82 68 a7 e8 f9 38 f6 ff 90 <0f> 0b 90 90 8b 35 36 10 82 01 45 31 e4 85 f6 7e 3c 49 63 dc 41 83
[  133.107675] RSP: 0018:ffffa3b10081fab0 EFLAGS: 00010286
[  133.107679] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000027
[  133.107681] RDX: ffff8bf11f21c948 RSI: 0000000000000001 RDI: ffff8bf11f21c940
[  133.107682] RBP: ffff8bf101ff8000 R08: ffffffffa7b42608 R09: 00000000ffffdfff
[  133.107683] R10: ffffffffa7a62620 R11: ffffffffa7b12620 R12: 0000000000000000
[  133.107685] R13: ffffa3b10081fb60 R14: 00000000ffffffff R15: 0000000000000000
[  133.107688] FS:  0000000000000000(0000) GS:ffff8bf11f200000(0000) knlGS:0000000000000000
[  133.107690] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  133.107691] CR2: ffff8bf10fe01000 CR3: 0000000001f88000 CR4: 00000000000006f0
[  133.107693] Call Trace:
[  133.107709]  <TASK>
[  133.107713]  ? __warn+0x83/0x130
[  133.107716]  ? rcutorture_oom_notify+0x48/0x2e0
[  133.107720]  ? report_bug+0x18e/0x1a0
[  133.107723]  ? handle_bug+0x54/0x90
[  133.107727]  ? exc_invalid_op+0x18/0x70
[  133.107729]  ? asm_exc_invalid_op+0x1a/0x20
[  133.107733]  ? rcutorture_oom_notify+0x48/0x2e0
[  133.107736]  notifier_call_chain+0x59/0xc0
[  133.107744]  blocking_notifier_call_chain+0x3c/0x60
[  133.107746]  out_of_memory+0x1f3/0x710
[  133.107754]  __alloc_pages_noprof+0xf27/0x10b0
[  133.107764]  ? timerqueue_del+0x2e/0x50
[  133.107767]  alloc_pages_mpol_noprof+0x43/0xf0
[  133.107774]  new_slab+0x37d/0x440
[  133.107796]  ___slab_alloc+0x480/0x9a0
[  133.107801]  ? rcu_torture_fwd_prog+0x344/0x9e0
[  133.107803]  ? preempt_schedule_thunk+0x1a/0x30
[  133.107806]  ? rcu_torture_fwd_prog+0x344/0x9e0
[  133.107808]  __kmalloc_cache_noprof+0x74/0x1f0
[  133.107811]  rcu_torture_fwd_prog+0x344/0x9e0
[  133.107814]  ? __pfx_rcu_torture_fwd_prog+0x10/0x10
[  133.107816]  ? kthread+0xd1/0x100
[  133.107818]  ? __pfx_rcu_torture_fwd_prog+0x10/0x10
[  133.107819]  kthread+0xd1/0x100
[  133.107824]  ? __pfx_kthread+0x10/0x10
[  133.107826]  ret_from_fork+0x2f/0x50
[  133.107833]  ? __pfx_kthread+0x10/0x10
[  133.107835]  ret_from_fork_asm+0x1a/0x30
[  133.107842]  </TASK>
[  133.107843] ---[ end trace 0000000000000000 ]---
Paul E. McKenney Feb. 5, 2025, 2:50 p.m. UTC | #5
On Tue, Feb 04, 2025 at 12:20:30PM -0800, Paul E. McKenney wrote:
> On Tue, Feb 04, 2025 at 05:34:09PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2025-02-04 03:51:48 [-0800], Paul E. McKenney wrote:
> > > On Tue, Feb 04, 2025 at 11:26:11AM +0100, Sebastian Andrzej Siewior wrote:
> > > > On 2025-01-30 10:53:19 [-0800], Paul E. McKenney wrote:
> > > > > The timer and hrtimer softirq processing has moved to dedicated threads
> > > > > for kernels built with CONFIG_IRQ_FORCED_THREADING=y.  This results in
> > > > > timers not expiring until later in early boot, which in turn causes the
> > > > > RCU Tasks self-tests to hang in kernels built with CONFIG_PROVE_RCU=y,
> > > > > which further causes the entire kernel to hang.  One fix would be to
> > > > > make timers work during this time, but there are no known users of RCU
> > > > > Tasks grace periods during that time, so no justification for the added
> > > > > complexity.  Not yet, anyway.
> > > > > 
> > > > > This commit therefore moves the call to rcu_init_tasks_generic() from
> > > > > kernel_init_freeable() to a core_initcall().  This works because the
> > > > > timer and hrtimer kthreads are created at early_initcall() time.
> > > > 
> > > > Fixes: 49a17639508c3 ("softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.")
> > > > ?
> > > 
> > > Quite possibly...  I freely confess that I was more focused on the fix
> > > than on the bug's origin.  Would you be willing to try this commit and
> > > its predecessor?
> > 
> > Yes. Just verified.
> > Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> 
> Boqun, could you please apply Sebastian's tags, including the Fixes
> tag above?
> 
> > > > I played with it and I can reproduce the issue with !RT + threadirqs but
> > > > not with RT (which implies threadirqs).
> > > > Is there anything in RT that avoids the problem?
> > > 
> > > Not that I know of, but then again I did not try it.  To your point,
> > 
> > The change looks fine.
> > 
> > > I do need to make a -rt rcutorture scenario.  TREE03 has been intended to
> > > approximate this, and it uses the following Kconfig options:
> > > 
> > > ------------------------------------------------------------------------
> > > 
> > > CONFIG_SMP=y
> > > CONFIG_NR_CPUS=16
> > > CONFIG_PREEMPT_NONE=n
> > > CONFIG_PREEMPT_VOLUNTARY=n
> > > CONFIG_PREEMPT=y
> > > #CHECK#CONFIG_PREEMPT_RCU=y
> > > CONFIG_HZ_PERIODIC=y
> > > CONFIG_NO_HZ_IDLE=n
> > > CONFIG_NO_HZ_FULL=n
> > > CONFIG_RCU_TRACE=y
> > > CONFIG_HOTPLUG_CPU=y
> > > CONFIG_RCU_FANOUT=2
> > > CONFIG_RCU_FANOUT_LEAF=2
> > > CONFIG_RCU_NOCB_CPU=n
> > > CONFIG_DEBUG_LOCK_ALLOC=n
> > > CONFIG_RCU_BOOST=y
> > > CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
> > > CONFIG_RCU_EXPERT=y
> > 
> > You could enable CONFIG_PREEMPT_RT ;)
> > CONFIG_PREEMPT_LAZY is probably also set a lot.
> > 
> > That should be it.
> > 
> > > ------------------------------------------------------------------------
> > > 
> > > And the following kernel-boot parameters:
> > > 
> > > ------------------------------------------------------------------------
> > > 
> > > rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30
> > > rcutree.gp_preinit_delay=12
> > > rcutree.gp_init_delay=3
> > > rcutree.gp_cleanup_delay=3
> > > rcutree.kthread_prio=2
> > > threadirqs
> > > rcutree.use_softirq=0
> > > rcutorture.preempt_duration=10
> > > 
> > > ------------------------------------------------------------------------
> > > 
> > > Some of these are for RCU's benefit, but what should I change to more
> > > closely approximate a typical real-time deployment?
> > 
> > See above.
> 
> Which got me this diff:
> 
> ------------------------------------------------------------------------
> 
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE03 b/tools/testing/selftests/rcutorture/configs/rcu/TREE03
> index 2dc31b16e506..6158f5002497 100644
> --- a/tools/testing/selftests/rcutorture/configs/rcu/TREE03
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE03
> @@ -2,7 +2,9 @@ CONFIG_SMP=y
>  CONFIG_NR_CPUS=16
>  CONFIG_PREEMPT_NONE=n
>  CONFIG_PREEMPT_VOLUNTARY=n
> -CONFIG_PREEMPT=y
> +CONFIG_PREEMPT=n
> +CONFIG_PREEMPT_LAZY=y
> +CONFIG_PREEMPT_RT=y
>  #CHECK#CONFIG_PREEMPT_RCU=y
>  CONFIG_HZ_PERIODIC=y
>  CONFIG_NO_HZ_IDLE=n
> @@ -15,4 +17,5 @@ CONFIG_RCU_NOCB_CPU=n
>  CONFIG_DEBUG_LOCK_ALLOC=n
>  CONFIG_RCU_BOOST=y
>  CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
> +CONFIG_EXPERT=y
>  CONFIG_RCU_EXPERT=y
> 
> ------------------------------------------------------------------------
> 
> But a 10-minute run got me the splat shown below, and in addition a
> shutdown-time hang.
> 
> This is caused by RCU falling behind a callback-flooding kthread that
> invokes call_rcu() in a semi-tight loop.  Setting rcutree.kthread_prio=40
> avoids the splat, but still gets the shutdown-time hang.  Retrying with
> the default rcutree.kthread_prio=2 failed to reproduce the splat, but
> it did reproduce the shutdown-time hang.
> 
> OK, maybe printk buffers are not being flushed?  A 100-millisecond sleep
> at the end of of rcu_torture_cleanup() got all of rcutorture's output
> flushed, but lost the subsequent shutdown-time console traffic.  The
> pr_flush(HZ/10,1) seems more sensible, but this is private to printk().
> 
> I would like to log the shutdown-time console traffic because RCU can
> sometimes break things on that path.
> 
> Thoughts?

Longer rcutorture runs showed (not unexpectedly) that the 100-millisecond
sleep was not always sufficient, nor was a 500-milliseconds sleep.

There is a call to kmsg_dump(KMSG_DUMP_SHUTDOWN) in kernel_power_off()
that appears to be intended to dump out the printk() buffers, but it
does not seem to do so in kernels built with CONFIG_PREEMPT_RT=y.
Does there need to be a pr_flush() call prior to the call to
migrate_to_reboot_cpu()?  Or maybe even to do_kernel_power_off_prepare()
or kernel_shutdown_prepare()?

Adding John Ogness on CC so that he can tell me the error of my ways.

> PS:  I will do longer runs in case that splat was not a one-off.
>      My concern is that I might need to adjust something more in order
>      to get a reliable callback-flooding test.

And this was not a one-off.  Running 10 40-minute instances of the new-age
CONFIG_PREEMPT_RT=y TREE03 reliably triggers this.  At first glance,
this appears to be an interaction between testing of RCU priority
boosting and RCU-callback flooding forward-progress testing.  And disabling
testing of RCU priority boosting avoids these OOMs.  As does running
without CONFIG_PREEMPT_RT=y.

My next step is to run with rcutorture.preempt_duration=0, which disables
within-guest-OS random preempting of kthreads.  If that doesn't help,
I expect to play around with avoiding concurrent testing of RCU priority
boosting and RCU callback flooding forward progress.

Or is there a better way?

							Thanx, Paul
John Ogness Feb. 5, 2025, 7:54 p.m. UTC | #6
On 2025-02-05, "Paul E. McKenney" <paulmck@kernel.org> wrote:
>> This is caused by RCU falling behind a callback-flooding kthread that
>> invokes call_rcu() in a semi-tight loop.  Setting rcutree.kthread_prio=40
>> avoids the splat, but still gets the shutdown-time hang.  Retrying with
>> the default rcutree.kthread_prio=2 failed to reproduce the splat, but
>> it did reproduce the shutdown-time hang.
>> 
>> OK, maybe printk buffers are not being flushed?  A 100-millisecond sleep
>> at the end of of rcu_torture_cleanup() got all of rcutorture's output
>> flushed, but lost the subsequent shutdown-time console traffic.  The
>> pr_flush(HZ/10,1) seems more sensible, but this is private to printk().
>>
>> I would like to log the shutdown-time console traffic because RCU can
>> sometimes break things on that path.

pr_flush() was changed to private because there were no users. It would
not be a problem to make it available. Adding a pr_flush() to
rcu_torture_cleanup() would be an appropriate workaround for now (more
on this at the end).

> There is a call to kmsg_dump(KMSG_DUMP_SHUTDOWN) in kernel_power_off()
> that appears to be intended to dump out the printk() buffers,

It only dumps the buffers to the registered kmsg_dumpers. It is not
responsible for flushing console backlogs.

> but it
> does not seem to do so in kernels built with CONFIG_PREEMPT_RT=y.
> Does there need to be a pr_flush() call prior to the call to
> migrate_to_reboot_cpu()?  Or maybe even to do_kernel_power_off_prepare()
> or kernel_shutdown_prepare()?

With CONFIG_PREEMPT_RT=y, legacy consoles only print via a dedicated
kthread. Without a pr_flush() somewhere, there is basically no chance
that they will get backlogs flushed because noone is waitig for them.

The new console API (NBCON) provides support for "atomic consoles",
which _do_ flush by transitioning to synchronous printing during
shutdown/reboot. Unfortunately we still don't have any NBCON atomic
console implemented in the kernel. The 8250 UART will be our first
driver, most likely available in 6.15. (With the current PREEMPT_RT
patch applied, the 8250 NBCON atomic driver is used.)

Since only CONFIG_PREEMPT_RT=y has this issue, I am not sure if we want
to sprinkle pr_flush() calls on all sleepable shutdown/reboot paths,
although that is certainly one way to handle it. For your case, adding a
pr_flush() to rcu_torture_cleanup() and making pr_flush() non-private
would be an easy solution to avoid your problem.

John Ogness
Paul E. McKenney Feb. 5, 2025, 8:10 p.m. UTC | #7
On Wed, Feb 05, 2025 at 09:00:07PM +0106, John Ogness wrote:
> On 2025-02-05, "Paul E. McKenney" <paulmck@kernel.org> wrote:
> >> This is caused by RCU falling behind a callback-flooding kthread that
> >> invokes call_rcu() in a semi-tight loop.  Setting rcutree.kthread_prio=40
> >> avoids the splat, but still gets the shutdown-time hang.  Retrying with
> >> the default rcutree.kthread_prio=2 failed to reproduce the splat, but
> >> it did reproduce the shutdown-time hang.
> >> 
> >> OK, maybe printk buffers are not being flushed?  A 100-millisecond sleep
> >> at the end of of rcu_torture_cleanup() got all of rcutorture's output
> >> flushed, but lost the subsequent shutdown-time console traffic.  The
> >> pr_flush(HZ/10,1) seems more sensible, but this is private to printk().
> >>
> >> I would like to log the shutdown-time console traffic because RCU can
> >> sometimes break things on that path.
> 
> pr_flush() was changed to private because there were no users. It would
> not be a problem to make it available. Adding a pr_flush() to
> rcu_torture_cleanup() would be an appropriate workaround for now (more
> on this at the end).

Ah, got it.

> > There is a call to kmsg_dump(KMSG_DUMP_SHUTDOWN) in kernel_power_off()
> > that appears to be intended to dump out the printk() buffers,
> 
> It only dumps the buffers to the registered kmsg_dumpers. It is not
> responsible for flushing console backlogs.

That would explain its not doing much for me.  ;-)

> > but it
> > does not seem to do so in kernels built with CONFIG_PREEMPT_RT=y.
> > Does there need to be a pr_flush() call prior to the call to
> > migrate_to_reboot_cpu()?  Or maybe even to do_kernel_power_off_prepare()
> > or kernel_shutdown_prepare()?
> 
> With CONFIG_PREEMPT_RT=y, legacy consoles only print via a dedicated
> kthread. Without a pr_flush() somewhere, there is basically no chance
> that they will get backlogs flushed because noone is waitig for them.
> 
> The new console API (NBCON) provides support for "atomic consoles",
> which _do_ flush by transitioning to synchronous printing during
> shutdown/reboot. Unfortunately we still don't have any NBCON atomic
> console implemented in the kernel. The 8250 UART will be our first
> driver, most likely available in 6.15. (With the current PREEMPT_RT
> patch applied, the 8250 NBCON atomic driver is used.)
> 
> Since only CONFIG_PREEMPT_RT=y has this issue, I am not sure if we want
> to sprinkle pr_flush() calls on all sleepable shutdown/reboot paths,
> although that is certainly one way to handle it. For your case, adding a
> pr_flush() to rcu_torture_cleanup() and making pr_flush() non-private
> would be an easy solution to avoid your problem.

Would it make sense to put a call to pr_flush() at the beginning of the
kernel_power_off() function?  I suspect that would take care of much
more than just rcutorture.

Or maybe after the pr_emerg("Power down\n"), since that is normally
the last thing that shows up on rcutorture's console logs.

I will give that a shot and see what happens.  ;-)

							Thanx, Paul
Paul E. McKenney Feb. 5, 2025, 8:31 p.m. UTC | #8
On Wed, Feb 05, 2025 at 12:10:53PM -0800, Paul E. McKenney wrote:
> On Wed, Feb 05, 2025 at 09:00:07PM +0106, John Ogness wrote:
> > On 2025-02-05, "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > >> This is caused by RCU falling behind a callback-flooding kthread that
> > >> invokes call_rcu() in a semi-tight loop.  Setting rcutree.kthread_prio=40
> > >> avoids the splat, but still gets the shutdown-time hang.  Retrying with
> > >> the default rcutree.kthread_prio=2 failed to reproduce the splat, but
> > >> it did reproduce the shutdown-time hang.
> > >> 
> > >> OK, maybe printk buffers are not being flushed?  A 100-millisecond sleep
> > >> at the end of of rcu_torture_cleanup() got all of rcutorture's output
> > >> flushed, but lost the subsequent shutdown-time console traffic.  The
> > >> pr_flush(HZ/10,1) seems more sensible, but this is private to printk().
> > >>
> > >> I would like to log the shutdown-time console traffic because RCU can
> > >> sometimes break things on that path.
> > 
> > pr_flush() was changed to private because there were no users. It would
> > not be a problem to make it available. Adding a pr_flush() to
> > rcu_torture_cleanup() would be an appropriate workaround for now (more
> > on this at the end).
> 
> Ah, got it.
> 
> > > There is a call to kmsg_dump(KMSG_DUMP_SHUTDOWN) in kernel_power_off()
> > > that appears to be intended to dump out the printk() buffers,
> > 
> > It only dumps the buffers to the registered kmsg_dumpers. It is not
> > responsible for flushing console backlogs.
> 
> That would explain its not doing much for me.  ;-)
> 
> > > but it
> > > does not seem to do so in kernels built with CONFIG_PREEMPT_RT=y.
> > > Does there need to be a pr_flush() call prior to the call to
> > > migrate_to_reboot_cpu()?  Or maybe even to do_kernel_power_off_prepare()
> > > or kernel_shutdown_prepare()?
> > 
> > With CONFIG_PREEMPT_RT=y, legacy consoles only print via a dedicated
> > kthread. Without a pr_flush() somewhere, there is basically no chance
> > that they will get backlogs flushed because noone is waitig for them.
> > 
> > The new console API (NBCON) provides support for "atomic consoles",
> > which _do_ flush by transitioning to synchronous printing during
> > shutdown/reboot. Unfortunately we still don't have any NBCON atomic
> > console implemented in the kernel. The 8250 UART will be our first
> > driver, most likely available in 6.15. (With the current PREEMPT_RT
> > patch applied, the 8250 NBCON atomic driver is used.)
> > 
> > Since only CONFIG_PREEMPT_RT=y has this issue, I am not sure if we want
> > to sprinkle pr_flush() calls on all sleepable shutdown/reboot paths,
> > although that is certainly one way to handle it. For your case, adding a
> > pr_flush() to rcu_torture_cleanup() and making pr_flush() non-private
> > would be an easy solution to avoid your problem.
> 
> Would it make sense to put a call to pr_flush() at the beginning of the
> kernel_power_off() function?  I suspect that would take care of much
> more than just rcutorture.
> 
> Or maybe after the pr_emerg("Power down\n"), since that is normally
> the last thing that shows up on rcutorture's console logs.
> 
> I will give that a shot and see what happens.  ;-)

And that does the trick!  I specified 500ms, but maybe -1 would be
better?  You tell me!  ;-)

							Thanx, Paul

------------------------------------------------------------------------

commit 172b692a79ec5801eab57cb55d2f9c398f244c62
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Wed Feb 5 12:27:23 2025 -0800

    printk: Flush console log from kernel_power_off()
    
    Kernels built with CONFIG_PREEMPT_RT=y can lose significant console output
    and shutdown time, which hides shutdown-time RCU issues from rcutorture.
    Therefore, make pr_flush() public and invoke it after then last print
    in kernel_power_off().
    
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Cc: Petr Mladek <pmladek@suse.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: John Ogness <john.ogness@linutronix.de>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 4217a9f412b2..b3e88ff1ecc3 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -806,4 +806,6 @@ static inline void print_hex_dump_debug(const char *prefix_str, int prefix_type,
 #define print_hex_dump_bytes(prefix_str, prefix_type, buf, len)	\
 	print_hex_dump_debug(prefix_str, prefix_type, 16, 1, buf, len, true)
 
+
+bool pr_flush(int timeout_ms, bool reset_on_progress);
 #endif
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 80910bc3470c..e0c6e5a99897 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2461,7 +2461,7 @@ asmlinkage __visible int _printk(const char *fmt, ...)
 }
 EXPORT_SYMBOL(_printk);
 
-static bool pr_flush(int timeout_ms, bool reset_on_progress);
+bool pr_flush(int timeout_ms, bool reset_on_progress);
 static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress);
 
 #else /* CONFIG_PRINTK */
@@ -4466,7 +4466,7 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
  * Context: Process context. May sleep while acquiring console lock.
  * Return: true if all usable printers are caught up.
  */
-static bool pr_flush(int timeout_ms, bool reset_on_progress)
+bool pr_flush(int timeout_ms, bool reset_on_progress)
 {
 	return __pr_flush(NULL, timeout_ms, reset_on_progress);
 }
diff --git a/kernel/reboot.c b/kernel/reboot.c
index a701000bab34..3448e6ae3556 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -704,6 +704,7 @@ void kernel_power_off(void)
 	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	pr_emerg("Power down\n");
+	pr_flush(500, 1);
 	kmsg_dump(KMSG_DUMP_SHUTDOWN);
 	machine_power_off();
 }
John Ogness Feb. 5, 2025, 9:22 p.m. UTC | #9
On 2025-02-05, "Paul E. McKenney" <paulmck@kernel.org> wrote:
>> Would it make sense to put a call to pr_flush() at the beginning of the
>> kernel_power_off() function?  I suspect that would take care of much
>> more than just rcutorture.
>> 
>> Or maybe after the pr_emerg("Power down\n"), since that is normally
>> the last thing that shows up on rcutorture's console logs.

Yes, after the pr_emerg("Power down\n"), as in your patch below.

> And that does the trick!  I specified 500ms, but maybe -1 would be
> better?  You tell me!  ;-)

I probably would have chosen 1000ms, but if the kthread has made no
forward progress after a few max timeslices, it probably is dead or will
not be scheduled anyway.

If there is no backlog, pr_flush() returns immediately.

I recommend _not_ using timeout_ms=-1 here, in case the kthread cannot
be scheduled (for whatever reason).

> ------------------------------------------------------------------------
>
> commit 172b692a79ec5801eab57cb55d2f9c398f244c62
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Wed Feb 5 12:27:23 2025 -0800
>
>     printk: Flush console log from kernel_power_off()
>     
>     Kernels built with CONFIG_PREEMPT_RT=y can lose significant console output
>     and shutdown time, which hides shutdown-time RCU issues from rcutorture.
>     Therefore, make pr_flush() public and invoke it after then last print
>     in kernel_power_off().
>     
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>     Cc: Petr Mladek <pmladek@suse.com>
>     Cc: Steven Rostedt <rostedt@goodmis.org>
>     Cc: John Ogness <john.ogness@linutronix.de>
>     Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
>
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 4217a9f412b2..b3e88ff1ecc3 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -806,4 +806,6 @@ static inline void print_hex_dump_debug(const char *prefix_str, int prefix_type,
>  #define print_hex_dump_bytes(prefix_str, prefix_type, buf, len)	\
>  	print_hex_dump_debug(prefix_str, prefix_type, 16, 1, buf, len, true)
>  
> +
> +bool pr_flush(int timeout_ms, bool reset_on_progress);

This will fail to link for !CONFIG_PRINTK.

In kernel/printk/printk.c currently there is:

#ifdef CONFIG_PRINTK
static bool pr_flush(int timeout_ms, bool reset_on_progress);
#else
static bool pr_flush(int timeout_ms, bool reset_on_progress) { return true; }
#endif

This would need to move to include/linux/printk.h

There you will find an existing CONFIG_PRINTK if/else block to move
those to.

>  #endif
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 80910bc3470c..e0c6e5a99897 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2461,7 +2461,7 @@ asmlinkage __visible int _printk(const char *fmt, ...)
>  }
>  EXPORT_SYMBOL(_printk);
>  
> -static bool pr_flush(int timeout_ms, bool reset_on_progress);
> +bool pr_flush(int timeout_ms, bool reset_on_progress);
>  static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress);
>  
>  #else /* CONFIG_PRINTK */
> @@ -4466,7 +4466,7 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
>   * Context: Process context. May sleep while acquiring console lock.
>   * Return: true if all usable printers are caught up.
>   */
> -static bool pr_flush(int timeout_ms, bool reset_on_progress)
> +bool pr_flush(int timeout_ms, bool reset_on_progress)
>  {
>  	return __pr_flush(NULL, timeout_ms, reset_on_progress);
>  }
> diff --git a/kernel/reboot.c b/kernel/reboot.c
> index a701000bab34..3448e6ae3556 100644
> --- a/kernel/reboot.c
> +++ b/kernel/reboot.c
> @@ -704,6 +704,7 @@ void kernel_power_off(void)
>  	migrate_to_reboot_cpu();
>  	syscore_shutdown();
>  	pr_emerg("Power down\n");
> +	pr_flush(500, 1);

I would have chosen:

	pr_flush(1000, true);

but I really don't have an argument either way.

>  	kmsg_dump(KMSG_DUMP_SHUTDOWN);
>  	machine_power_off();
>  }

John Ogness
Paul E. McKenney Feb. 5, 2025, 9:55 p.m. UTC | #10
On Wed, Feb 05, 2025 at 10:28:06PM +0106, John Ogness wrote:
> On 2025-02-05, "Paul E. McKenney" <paulmck@kernel.org> wrote:
> >> Would it make sense to put a call to pr_flush() at the beginning of the
> >> kernel_power_off() function?  I suspect that would take care of much
> >> more than just rcutorture.
> >> 
> >> Or maybe after the pr_emerg("Power down\n"), since that is normally
> >> the last thing that shows up on rcutorture's console logs.
> 
> Yes, after the pr_emerg("Power down\n"), as in your patch below.
> 
> > And that does the trick!  I specified 500ms, but maybe -1 would be
> > better?  You tell me!  ;-)
> 
> I probably would have chosen 1000ms, but if the kthread has made no
> forward progress after a few max timeslices, it probably is dead or will
> not be scheduled anyway.

I picked 500ms randomly, so I am happy to make it be 1000ms.

> If there is no backlog, pr_flush() returns immediately.

OK, so I don't need to add "if (IS_ENABLED(CONFIG_PREEMPT_RT))" to guard
it, then?

> I recommend _not_ using timeout_ms=-1 here, in case the kthread cannot
> be scheduled (for whatever reason).

Sounds eminently reasonable to me!

> > ------------------------------------------------------------------------
> >
> > commit 172b692a79ec5801eab57cb55d2f9c398f244c62
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > Date:   Wed Feb 5 12:27:23 2025 -0800
> >
> >     printk: Flush console log from kernel_power_off()
> >     
> >     Kernels built with CONFIG_PREEMPT_RT=y can lose significant console output
> >     and shutdown time, which hides shutdown-time RCU issues from rcutorture.
> >     Therefore, make pr_flush() public and invoke it after then last print
> >     in kernel_power_off().
> >     
> >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >     Cc: Petr Mladek <pmladek@suse.com>
> >     Cc: Steven Rostedt <rostedt@goodmis.org>
> >     Cc: John Ogness <john.ogness@linutronix.de>
> >     Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
> >
> > diff --git a/include/linux/printk.h b/include/linux/printk.h
> > index 4217a9f412b2..b3e88ff1ecc3 100644
> > --- a/include/linux/printk.h
> > +++ b/include/linux/printk.h
> > @@ -806,4 +806,6 @@ static inline void print_hex_dump_debug(const char *prefix_str, int prefix_type,
> >  #define print_hex_dump_bytes(prefix_str, prefix_type, buf, len)	\
> >  	print_hex_dump_debug(prefix_str, prefix_type, 16, 1, buf, len, true)
> >  
> > +
> > +bool pr_flush(int timeout_ms, bool reset_on_progress);
> 
> This will fail to link for !CONFIG_PRINTK.
> 
> In kernel/printk/printk.c currently there is:
> 
> #ifdef CONFIG_PRINTK
> static bool pr_flush(int timeout_ms, bool reset_on_progress);
> #else
> static bool pr_flush(int timeout_ms, bool reset_on_progress) { return true; }
> #endif
> 
> This would need to move to include/linux/printk.h
> 
> There you will find an existing CONFIG_PRINTK if/else block to move
> those to.

Thank you, will do!

> >  #endif
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index 80910bc3470c..e0c6e5a99897 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -2461,7 +2461,7 @@ asmlinkage __visible int _printk(const char *fmt, ...)
> >  }
> >  EXPORT_SYMBOL(_printk);
> >  
> > -static bool pr_flush(int timeout_ms, bool reset_on_progress);
> > +bool pr_flush(int timeout_ms, bool reset_on_progress);
> >  static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress);
> >  
> >  #else /* CONFIG_PRINTK */
> > @@ -4466,7 +4466,7 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
> >   * Context: Process context. May sleep while acquiring console lock.
> >   * Return: true if all usable printers are caught up.
> >   */
> > -static bool pr_flush(int timeout_ms, bool reset_on_progress)
> > +bool pr_flush(int timeout_ms, bool reset_on_progress)
> >  {
> >  	return __pr_flush(NULL, timeout_ms, reset_on_progress);
> >  }
> > diff --git a/kernel/reboot.c b/kernel/reboot.c
> > index a701000bab34..3448e6ae3556 100644
> > --- a/kernel/reboot.c
> > +++ b/kernel/reboot.c
> > @@ -704,6 +704,7 @@ void kernel_power_off(void)
> >  	migrate_to_reboot_cpu();
> >  	syscore_shutdown();
> >  	pr_emerg("Power down\n");
> > +	pr_flush(500, 1);
> 
> I would have chosen:
> 
> 	pr_flush(1000, true);
> 
> but I really don't have an argument either way.

Hey, of the two of us, you are the printk() expert, so I will make it
be 1000.  ;-)

							Thanx, Paul

> >  	kmsg_dump(KMSG_DUMP_SHUTDOWN);
> >  	machine_power_off();
> >  }
> 
> John Ogness
John Ogness Feb. 5, 2025, 10:20 p.m. UTC | #11
On 2025-02-05, "Paul E. McKenney" <paulmck@kernel.org> wrote:
>> If there is no backlog, pr_flush() returns immediately.
>
> OK, so I don't need to add "if (IS_ENABLED(CONFIG_PREEMPT_RT))" to guard
> it, then?

For !CONFIG_PREEMPT_RT, if there are legacy consoles registered,
pr_flush() will additionally perform a blocking lock on the
console_lock. The kernel is in a clean shutdown path, so it should not
be a risk to block on the console_lock. But the console_lock is one of
the ugly global locks (semaphore, actually) we are working to remove
because sometimes it still causes problems.

@pmladek: What do you think? Should we have a CONFIG_PREEMPT_RT guard,
which would always be safe? Or do you feel comfortable performing a
console_lock() in the shutdown path for !CONFIG_PREEMPT_RT?

John
John Ogness Feb. 5, 2025, 10:31 p.m. UTC | #12
On 2025-02-05, John Ogness <john.ogness@linutronix.de> wrote:
>> OK, so I don't need to add "if (IS_ENABLED(CONFIG_PREEMPT_RT))" to guard
>> it, then?
>
> For !CONFIG_PREEMPT_RT, if there are legacy consoles registered,
> pr_flush() will additionally perform a blocking lock on the
> console_lock.

Sorry, I was just thinking about the flushing component of
pr_flush(). Later in the function it takes the console_lock even for
CONFIG_PREEMPT_RT. So please do _not_ have a guard.

pr_flush() should never hang on the console_lock during shutdown, but if
does, that is something that will need to be debugged and fixed.

@pmladek: Looking forward to reading your input on this.

John
Paul E. McKenney Feb. 5, 2025, 11:50 p.m. UTC | #13
On Wed, Feb 05, 2025 at 11:37:25PM +0106, John Ogness wrote:
> On 2025-02-05, John Ogness <john.ogness@linutronix.de> wrote:
> >> OK, so I don't need to add "if (IS_ENABLED(CONFIG_PREEMPT_RT))" to guard
> >> it, then?
> >
> > For !CONFIG_PREEMPT_RT, if there are legacy consoles registered,
> > pr_flush() will additionally perform a blocking lock on the
> > console_lock.
> 
> Sorry, I was just thinking about the flushing component of
> pr_flush(). Later in the function it takes the console_lock even for
> CONFIG_PREEMPT_RT. So please do _not_ have a guard.
> 
> pr_flush() should never hang on the console_lock during shutdown, but if
> does, that is something that will need to be debugged and fixed.
> 
> @pmladek: Looking forward to reading your input on this.

Here is an updated commit, hopefully applying your feedback correctly.

							Thanx, Paul

------------------------------------------------------------------------

commit 35679c18b062368855e183ee6712ca5c16145d8c
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Wed Feb 5 12:27:23 2025 -0800

    printk: Flush console log from kernel_power_off()
    
    Kernels built with CONFIG_PREEMPT_RT=y can lose significant console output
    and shutdown time, which hides shutdown-time RCU issues from rcutorture.
    Therefore, make pr_flush() public and invoke it after then last print
    in kernel_power_off().
    
    [ paulmck: Apply John Ogness feedback. ]
    
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Cc: Petr Mladek <pmladek@suse.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: John Ogness <john.ogness@linutronix.de>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 4217a9f412b2..d658d7e81bef 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -207,6 +207,7 @@ void printk_legacy_allow_panic_sync(void);
 extern bool nbcon_device_try_acquire(struct console *con);
 extern void nbcon_device_release(struct console *con);
 void nbcon_atomic_flush_unsafe(void);
+bool pr_flush(int timeout_ms, bool reset_on_progress);
 #else
 static inline __printf(1, 0)
 int vprintk(const char *s, va_list args)
@@ -315,6 +316,11 @@ static inline void nbcon_atomic_flush_unsafe(void)
 {
 }
 
+static bool pr_flush(int timeout_ms, bool reset_on_progress)
+{
+	return true;
+}
+
 #endif
 
 bool this_cpu_in_panic(void);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 80910bc3470c..b9699eaacf19 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2461,7 +2461,6 @@ asmlinkage __visible int _printk(const char *fmt, ...)
 }
 EXPORT_SYMBOL(_printk);
 
-static bool pr_flush(int timeout_ms, bool reset_on_progress);
 static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress);
 
 #else /* CONFIG_PRINTK */
@@ -2474,7 +2473,6 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
 
 static u64 syslog_seq;
 
-static bool pr_flush(int timeout_ms, bool reset_on_progress) { return true; }
 static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress) { return true; }
 
 #endif /* CONFIG_PRINTK */
@@ -4466,7 +4464,7 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
  * Context: Process context. May sleep while acquiring console lock.
  * Return: true if all usable printers are caught up.
  */
-static bool pr_flush(int timeout_ms, bool reset_on_progress)
+bool pr_flush(int timeout_ms, bool reset_on_progress)
 {
 	return __pr_flush(NULL, timeout_ms, reset_on_progress);
 }
diff --git a/kernel/reboot.c b/kernel/reboot.c
index a701000bab34..a84828d0757b 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -704,6 +704,7 @@ void kernel_power_off(void)
 	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	pr_emerg("Power down\n");
+	pr_flush(1000, 1);
 	kmsg_dump(KMSG_DUMP_SHUTDOWN);
 	machine_power_off();
 }
John Ogness Feb. 6, 2025, 8:28 a.m. UTC | #14
On 2025-02-05, "Paul E. McKenney" <paulmck@kernel.org> wrote:
> commit 35679c18b062368855e183ee6712ca5c16145d8c
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Wed Feb 5 12:27:23 2025 -0800
>
>     printk: Flush console log from kernel_power_off()
>     
>     Kernels built with CONFIG_PREEMPT_RT=y can lose significant console output
>     and shutdown time, which hides shutdown-time RCU issues from rcutorture.
>     Therefore, make pr_flush() public and invoke it after then last print
>     in kernel_power_off().
>     
>     [ paulmck: Apply John Ogness feedback. ]
>     
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

Reviewed-by: John Ogness <john.ogness@linutronix.de>
Sebastian Andrzej Siewior Feb. 6, 2025, 9:02 a.m. UTC | #15
On 2025-02-05 15:50:14 [-0800], Paul E. McKenney wrote:
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 4217a9f412b2..d658d7e81bef 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -207,6 +207,7 @@ void printk_legacy_allow_panic_sync(void);
>  extern bool nbcon_device_try_acquire(struct console *con);
>  extern void nbcon_device_release(struct console *con);
>  void nbcon_atomic_flush_unsafe(void);
> +bool pr_flush(int timeout_ms, bool reset_on_progress);> diff --git a/kernel/reboot.c b/kernel/reboot.c
> index a701000bab34..a84828d0757b 100644
> --- a/kernel/reboot.c
> +++ b/kernel/reboot.c
> @@ -704,6 +704,7 @@ void kernel_power_off(void)
>  	migrate_to_reboot_cpu();
>  	syscore_shutdown();
>  	pr_emerg("Power down\n");
> +	pr_flush(1000, 1);

this could be true instead 1 given that the type is bool

>  	kmsg_dump(KMSG_DUMP_SHUTDOWN);
>  	machine_power_off();
>  }

Sebastian
Paul E. McKenney Feb. 6, 2025, 9:58 a.m. UTC | #16
On Thu, Feb 06, 2025 at 09:34:01AM +0106, John Ogness wrote:
> On 2025-02-05, "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > commit 35679c18b062368855e183ee6712ca5c16145d8c
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > Date:   Wed Feb 5 12:27:23 2025 -0800
> >
> >     printk: Flush console log from kernel_power_off()
> >     
> >     Kernels built with CONFIG_PREEMPT_RT=y can lose significant console output
> >     and shutdown time, which hides shutdown-time RCU issues from rcutorture.
> >     Therefore, make pr_flush() public and invoke it after then last print
> >     in kernel_power_off().
> >     
> >     [ paulmck: Apply John Ogness feedback. ]
> >     
> >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> 
> Reviewed-by: John Ogness <john.ogness@linutronix.de>

Thank you!  I will apply this on my next rebase.

							Thanx, Paul
Paul E. McKenney Feb. 6, 2025, 9:59 a.m. UTC | #17
On Thu, Feb 06, 2025 at 10:02:49AM +0100, Sebastian Andrzej Siewior wrote:
> On 2025-02-05 15:50:14 [-0800], Paul E. McKenney wrote:
> > diff --git a/include/linux/printk.h b/include/linux/printk.h
> > index 4217a9f412b2..d658d7e81bef 100644
> > --- a/include/linux/printk.h
> > +++ b/include/linux/printk.h
> > @@ -207,6 +207,7 @@ void printk_legacy_allow_panic_sync(void);
> >  extern bool nbcon_device_try_acquire(struct console *con);
> >  extern void nbcon_device_release(struct console *con);
> >  void nbcon_atomic_flush_unsafe(void);
> > +bool pr_flush(int timeout_ms, bool reset_on_progress);
> …
> > diff --git a/kernel/reboot.c b/kernel/reboot.c
> > index a701000bab34..a84828d0757b 100644
> > --- a/kernel/reboot.c
> > +++ b/kernel/reboot.c
> > @@ -704,6 +704,7 @@ void kernel_power_off(void)
> >  	migrate_to_reboot_cpu();
> >  	syscore_shutdown();
> >  	pr_emerg("Power down\n");
> > +	pr_flush(1000, 1);
> 
> this could be true instead 1 given that the type is bool

Good point, and this matches other usage.  Will fix on my next rebase.

							Thanx, Paul

> >  	kmsg_dump(KMSG_DUMP_SHUTDOWN);
> >  	machine_power_off();
> >  }
> 
> Sebastian
Paul E. McKenney Feb. 6, 2025, 10:18 a.m. UTC | #18
On Thu, Feb 06, 2025 at 01:59:00AM -0800, Paul E. McKenney wrote:
> On Thu, Feb 06, 2025 at 10:02:49AM +0100, Sebastian Andrzej Siewior wrote:
> > On 2025-02-05 15:50:14 [-0800], Paul E. McKenney wrote:
> > > diff --git a/include/linux/printk.h b/include/linux/printk.h
> > > index 4217a9f412b2..d658d7e81bef 100644
> > > --- a/include/linux/printk.h
> > > +++ b/include/linux/printk.h
> > > @@ -207,6 +207,7 @@ void printk_legacy_allow_panic_sync(void);
> > >  extern bool nbcon_device_try_acquire(struct console *con);
> > >  extern void nbcon_device_release(struct console *con);
> > >  void nbcon_atomic_flush_unsafe(void);
> > > +bool pr_flush(int timeout_ms, bool reset_on_progress);
> > …
> > > diff --git a/kernel/reboot.c b/kernel/reboot.c
> > > index a701000bab34..a84828d0757b 100644
> > > --- a/kernel/reboot.c
> > > +++ b/kernel/reboot.c
> > > @@ -704,6 +704,7 @@ void kernel_power_off(void)
> > >  	migrate_to_reboot_cpu();
> > >  	syscore_shutdown();
> > >  	pr_emerg("Power down\n");
> > > +	pr_flush(1000, 1);
> > 
> > this could be true instead 1 given that the type is bool
> 
> Good point, and this matches other usage.  Will fix on my next rebase.

And here is the updated commit.  Thoughts?

							Thanx, Paul

------------------------------------------------------------------------

commit 04ccc09a6eb9a50f79c0a0e46fd273b5131b24f1
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Wed Feb 5 12:27:23 2025 -0800

    printk: Flush console log from kernel_power_off()
    
    Kernels built with CONFIG_PREEMPT_RT=y can lose significant console output
    and shutdown time, which hides shutdown-time RCU issues from rcutorture.
    Therefore, make pr_flush() public and invoke it after then last print
    in kernel_power_off().
    
    [ paulmck: Apply John Ogness feedback. ]
    [ paulmck: Appy Sebastian Andrzej Siewior feedback. ]
    [ paulmck: Apply kernel test robot feedback. ]
    
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: John Ogness <john.ogness@linutronix.de>
    Cc: Petr Mladek <pmladek@suse.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 4217a9f412b2..5b462029d03c 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -207,6 +207,7 @@ void printk_legacy_allow_panic_sync(void);
 extern bool nbcon_device_try_acquire(struct console *con);
 extern void nbcon_device_release(struct console *con);
 void nbcon_atomic_flush_unsafe(void);
+bool pr_flush(int timeout_ms, bool reset_on_progress);
 #else
 static inline __printf(1, 0)
 int vprintk(const char *s, va_list args)
@@ -315,6 +316,11 @@ static inline void nbcon_atomic_flush_unsafe(void)
 {
 }
 
+static inline bool pr_flush(int timeout_ms, bool reset_on_progress)
+{
+	return true;
+}
+
 #endif
 
 bool this_cpu_in_panic(void);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 80910bc3470c..b9699eaacf19 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2461,7 +2461,6 @@ asmlinkage __visible int _printk(const char *fmt, ...)
 }
 EXPORT_SYMBOL(_printk);
 
-static bool pr_flush(int timeout_ms, bool reset_on_progress);
 static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress);
 
 #else /* CONFIG_PRINTK */
@@ -2474,7 +2473,6 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
 
 static u64 syslog_seq;
 
-static bool pr_flush(int timeout_ms, bool reset_on_progress) { return true; }
 static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress) { return true; }
 
 #endif /* CONFIG_PRINTK */
@@ -4466,7 +4464,7 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
  * Context: Process context. May sleep while acquiring console lock.
  * Return: true if all usable printers are caught up.
  */
-static bool pr_flush(int timeout_ms, bool reset_on_progress)
+bool pr_flush(int timeout_ms, bool reset_on_progress)
 {
 	return __pr_flush(NULL, timeout_ms, reset_on_progress);
 }
diff --git a/kernel/reboot.c b/kernel/reboot.c
index a701000bab34..3ba15b2c4066 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -704,6 +704,7 @@ void kernel_power_off(void)
 	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	pr_emerg("Power down\n");
+	pr_flush(1000, true);
 	kmsg_dump(KMSG_DUMP_SHUTDOWN);
 	machine_power_off();
 }
Petr Mladek Feb. 6, 2025, 11:06 a.m. UTC | #19
On Wed 2025-02-05 15:50:14, Paul E. McKenney wrote:
> On Wed, Feb 05, 2025 at 11:37:25PM +0106, John Ogness wrote:
> > On 2025-02-05, John Ogness <john.ogness@linutronix.de> wrote:
> > >> OK, so I don't need to add "if (IS_ENABLED(CONFIG_PREEMPT_RT))" to guard
> > >> it, then?
> > >
> > > For !CONFIG_PREEMPT_RT, if there are legacy consoles registered,
> > > pr_flush() will additionally perform a blocking lock on the
> > > console_lock.
> > 
> > Sorry, I was just thinking about the flushing component of
> > pr_flush(). Later in the function it takes the console_lock even for
> > CONFIG_PREEMPT_RT. So please do _not_ have a guard.

It should be possible to avoid this console_lock() when only NBCON
consoles are registered. We take it in pr_flush() for reading con->seq
and potential race with register_console(). But it can be handled
another way, for example, by restarting the cycle when non-NBCON
console suddenly appears.

But it might be done later. We do not have NBCON consoles at the
moment so the console_lock() is needed anyway.

> > pr_flush() should never hang on the console_lock during shutdown, but if
> > does, that is something that will need to be debugged and fixed.
> > 
> > @pmladek: Looking forward to reading your input on this.

My only concern is that kernel_power_off() is called in many
code paths so it is not easy to review all scenarios.

But I haven't found any code path where it would be an obvious problem.
Especially, it seems that it is _not_ called in panic situations.

Also kernel_power_off() calls syscore_shutdown() which takes
syscore_ops_lock mutex. So that is must be called only in
a schedulable context even before this patch.

Summary: I do not see any obvious problem in the end.
	 Except, for some small details, see below.

> Here is an updated commit, hopefully applying your feedback correctly.
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> commit 35679c18b062368855e183ee6712ca5c16145d8c
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Wed Feb 5 12:27:23 2025 -0800
> 
>     printk: Flush console log from kernel_power_off()
>     
>     Kernels built with CONFIG_PREEMPT_RT=y can lose significant console output
>     and shutdown time, which hides shutdown-time RCU issues from rcutorture.
>     Therefore, make pr_flush() public and invoke it after then last print
>     in kernel_power_off().
>     
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -207,6 +207,7 @@ void printk_legacy_allow_panic_sync(void);
>  extern bool nbcon_device_try_acquire(struct console *con);
>  extern void nbcon_device_release(struct console *con);
>  void nbcon_atomic_flush_unsafe(void);
> +bool pr_flush(int timeout_ms, bool reset_on_progress);
>  #else
>  static inline __printf(1, 0)
>  int vprintk(const char *s, va_list args)
> @@ -315,6 +316,11 @@ static inline void nbcon_atomic_flush_unsafe(void)
>  {
>  }
>  
> +static bool pr_flush(int timeout_ms, bool reset_on_progress)

There is missing "inline":

static inline bool pr_flush(int timeout_ms, bool reset_on_progress)


Otherwise, I get

In file included from ./include/asm-generic/bug.h:22,
                 from ./arch/x86/include/asm/bug.h:99,
                 from ./include/linux/bug.h:5,
                 from ./include/linux/page-flags.h:10,
                 from kernel/bounds.c:10:
./include/linux/printk.h:319:13: warning: ‘pr_flush’ defined but not used [-Wunused-function]

> +{
> +	return true;
> +}
> +
>  #endif
>  
>  bool this_cpu_in_panic(void);
> --- a/kernel/reboot.c
> +++ b/kernel/reboot.c
> @@ -704,6 +704,7 @@ void kernel_power_off(void)
>  	migrate_to_reboot_cpu();
>  	syscore_shutdown();
>  	pr_emerg("Power down\n");
> +	pr_flush(1000, 1);

This should be pr_flush(1000, true) as suggested by Sebastian.

>  	kmsg_dump(KMSG_DUMP_SHUTDOWN);
>  	machine_power_off();
>  }

With the two changes:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Paul, I assume that you will pass this via the RCU tree together with
the other changes. Or would you prefer to route this particular
change via printk tree? Both ways are fine to me.

Best Regards,
Petr
Paul E. McKenney Feb. 6, 2025, 4:32 p.m. UTC | #20
On Thu, Feb 06, 2025 at 12:06:08PM +0100, Petr Mladek wrote:
> On Wed 2025-02-05 15:50:14, Paul E. McKenney wrote:
> > On Wed, Feb 05, 2025 at 11:37:25PM +0106, John Ogness wrote:
> > > On 2025-02-05, John Ogness <john.ogness@linutronix.de> wrote:
> > > >> OK, so I don't need to add "if (IS_ENABLED(CONFIG_PREEMPT_RT))" to guard
> > > >> it, then?
> > > >
> > > > For !CONFIG_PREEMPT_RT, if there are legacy consoles registered,
> > > > pr_flush() will additionally perform a blocking lock on the
> > > > console_lock.
> > > 
> > > Sorry, I was just thinking about the flushing component of
> > > pr_flush(). Later in the function it takes the console_lock even for
> > > CONFIG_PREEMPT_RT. So please do _not_ have a guard.

Thank you all for looking this over!

> It should be possible to avoid this console_lock() when only NBCON
> consoles are registered. We take it in pr_flush() for reading con->seq
> and potential race with register_console(). But it can be handled
> another way, for example, by restarting the cycle when non-NBCON
> console suddenly appears.
> 
> But it might be done later. We do not have NBCON consoles at the
> moment so the console_lock() is needed anyway.

Fair enough.

> > > pr_flush() should never hang on the console_lock during shutdown, but if
> > > does, that is something that will need to be debugged and fixed.
> > > 
> > > @pmladek: Looking forward to reading your input on this.
> 
> My only concern is that kernel_power_off() is called in many
> code paths so it is not easy to review all scenarios.
> 
> But I haven't found any code path where it would be an obvious problem.
> Especially, it seems that it is _not_ called in panic situations.
> 
> Also kernel_power_off() calls syscore_shutdown() which takes
> syscore_ops_lock mutex. So that is must be called only in
> a schedulable context even before this patch.
> 
> Summary: I do not see any obvious problem in the end.
> 	 Except, for some small details, see below.

The presence of that syscore_shutdown() call greatly reduced my level
of anxiety as well.  ;-)

> > Here is an updated commit, hopefully applying your feedback correctly.
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > commit 35679c18b062368855e183ee6712ca5c16145d8c
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > Date:   Wed Feb 5 12:27:23 2025 -0800
> > 
> >     printk: Flush console log from kernel_power_off()
> >     
> >     Kernels built with CONFIG_PREEMPT_RT=y can lose significant console output
> >     and shutdown time, which hides shutdown-time RCU issues from rcutorture.
> >     Therefore, make pr_flush() public and invoke it after then last print
> >     in kernel_power_off().
> >     
> > --- a/include/linux/printk.h
> > +++ b/include/linux/printk.h
> > @@ -207,6 +207,7 @@ void printk_legacy_allow_panic_sync(void);
> >  extern bool nbcon_device_try_acquire(struct console *con);
> >  extern void nbcon_device_release(struct console *con);
> >  void nbcon_atomic_flush_unsafe(void);
> > +bool pr_flush(int timeout_ms, bool reset_on_progress);
> >  #else
> >  static inline __printf(1, 0)
> >  int vprintk(const char *s, va_list args)
> > @@ -315,6 +316,11 @@ static inline void nbcon_atomic_flush_unsafe(void)
> >  {
> >  }
> >  
> > +static bool pr_flush(int timeout_ms, bool reset_on_progress)
> 
> There is missing "inline":
> 
> static inline bool pr_flush(int timeout_ms, bool reset_on_progress)
> 
> 
> Otherwise, I get
> 
> In file included from ./include/asm-generic/bug.h:22,
>                  from ./arch/x86/include/asm/bug.h:99,
>                  from ./include/linux/bug.h:5,
>                  from ./include/linux/page-flags.h:10,
>                  from kernel/bounds.c:10:
> ./include/linux/printk.h:319:13: warning: ‘pr_flush’ defined but not used [-Wunused-function]

Fixed, thank you both!

> > +{
> > +	return true;
> > +}
> > +
> >  #endif
> >  
> >  bool this_cpu_in_panic(void);
> > --- a/kernel/reboot.c
> > +++ b/kernel/reboot.c
> > @@ -704,6 +704,7 @@ void kernel_power_off(void)
> >  	migrate_to_reboot_cpu();
> >  	syscore_shutdown();
> >  	pr_emerg("Power down\n");
> > +	pr_flush(1000, 1);
> 
> This should be pr_flush(1000, true) as suggested by Sebastian.

Also fixed, also thank you both!

> >  	kmsg_dump(KMSG_DUMP_SHUTDOWN);
> >  	machine_power_off();
> >  }
> 
> With the two changes:
> 
> Reviewed-by: Petr Mladek <pmladek@suse.com>
> 
> Paul, I assume that you will pass this via the RCU tree together with
> the other changes. Or would you prefer to route this particular
> change via printk tree? Both ways are fine to me.

One advantage of carrying it via RCU is that I can arrange things so that
rcutorture users don't see this problem.  So unless a problem arises,
thank you, and I do plan to send this up via RCU.

							Thanx, Paul
diff mbox series

Patch

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 3bb554723074..c6c4aee47df2 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -121,12 +121,6 @@  void rcu_init(void);
 extern int rcu_scheduler_active;
 void rcu_sched_clock_irq(int user);
 
-#ifdef CONFIG_TASKS_RCU_GENERIC
-void rcu_init_tasks_generic(void);
-#else
-static inline void rcu_init_tasks_generic(void) { }
-#endif
-
 #ifdef CONFIG_RCU_STALL_COMMON
 void rcu_sysrq_start(void);
 void rcu_sysrq_end(void);
diff --git a/init/main.c b/init/main.c
index 00fac1170294..fb880b6fb524 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1562,7 +1562,6 @@  static noinline void __init kernel_init_freeable(void)
 
 	init_mm_internals();
 
-	rcu_init_tasks_generic();
 	do_pre_smp_initcalls();
 	lockup_detector_init();
 
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 59314da5eb60..466668eb4fad 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -2256,7 +2256,7 @@  void __init tasks_cblist_init_generic(void)
 #endif
 }
 
-void __init rcu_init_tasks_generic(void)
+static int __init rcu_init_tasks_generic(void)
 {
 #ifdef CONFIG_TASKS_RCU
 	rcu_spawn_tasks_kthread();
@@ -2272,7 +2272,10 @@  void __init rcu_init_tasks_generic(void)
 
 	// Run the self-tests.
 	rcu_tasks_initiate_self_tests();
+
+	return 0;
 }
+core_initcall(rcu_init_tasks_generic);
 
 #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
 static inline void rcu_tasks_bootup_oddness(void) {}