diff mbox series

[printk,v2,10/11] rcu: Add atomic write enforcement for rcu stalls

Message ID 20230919230856.661435-11-john.ogness@linutronix.de (mailing list archive)
State New, archived
Headers show
Series wire up nbcon atomic printing | expand

Commit Message

John Ogness Sept. 19, 2023, 11:08 p.m. UTC
Invoke the atomic write enforcement functions for rcu stalls to
ensure that the information gets out to the consoles.

It is important to note that if there are any legacy consoles
registered, they will be attempting to directly print from the
printk-caller context, which may jeopardize the reliability of
the atomic consoles. Optimally there should be no legacy
consoles registered.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/rcu/tree_stall.h | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Petr Mladek Sept. 27, 2023, 3 p.m. UTC | #1
On Wed 2023-09-20 01:14:55, John Ogness wrote:
> Invoke the atomic write enforcement functions for rcu stalls to
> ensure that the information gets out to the consoles.
> 
> It is important to note that if there are any legacy consoles
> registered, they will be attempting to directly print from the
> printk-caller context, which may jeopardize the reliability of
> the atomic consoles. Optimally there should be no legacy
> consoles registered.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>
> ---
>  kernel/rcu/tree_stall.h | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index 6f06dc12904a..0a58f8b233d8 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -8,6 +8,7 @@
>   */
>  
>  #include <linux/kvm_para.h>
> +#include <linux/console.h>
>  
>  //////////////////////////////////////////////////////////////////////////////
>  //
> @@ -582,6 +583,7 @@ static void rcu_check_gp_kthread_expired_fqs_timer(void)
>  
>  static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
>  {
> +	enum nbcon_prio prev_prio;
>  	int cpu;
>  	unsigned long flags;
>  	unsigned long gpa;
> @@ -597,6 +599,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
>  	if (rcu_stall_is_suppressed())
>  		return;
>  
> +	prev_prio = nbcon_atomic_enter(NBCON_PRIO_EMERGENCY);
> +
>  	/*
>  	 * OK, time to rat on our buddy...
>  	 * See Documentation/RCU/stallwarn.rst for info on how to debug
> @@ -651,6 +655,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
>  	panic_on_rcu_stall();
>  
>  	rcu_force_quiescent_state();  /* Kick them all. */
> +
> +	nbcon_atomic_exit(NBCON_PRIO_EMERGENCY, prev_prio);

The locations looks reasonable to me. I just hope that we would
use another API: nbcon_emergency_enter()/exit() in the end.

Note that the new API it would allow to flush the messages in
the emergency context immediately from printk().

In that case, we would to handle nmi_trigger_cpumask_backtrace()
some special way.

This function would be called from the emergency context but
the nmi_cpu_backtrace() callbacks would be called on other
CPUs in normal context.

For this case I would add something like:

void nbcon_flush_all_emergency(void)
{
	emum nbcon_prio = nbcon_get_default_prio();

	if (nbcon_prio >= NBCON_PRIO_EMERGENCY)
		nbcon_flush_all();
}

, where the POC of nbcon_get_default_prio() and nbcon_flush_all()
was in the replay to the 7th patch, see
https://lore.kernel.org/all/ZRLBxsXPCym2NC5Q@alley/


Best Regards,
Petr
diff mbox series

Patch

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 6f06dc12904a..0a58f8b233d8 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -8,6 +8,7 @@ 
  */
 
 #include <linux/kvm_para.h>
+#include <linux/console.h>
 
 //////////////////////////////////////////////////////////////////////////////
 //
@@ -582,6 +583,7 @@  static void rcu_check_gp_kthread_expired_fqs_timer(void)
 
 static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 {
+	enum nbcon_prio prev_prio;
 	int cpu;
 	unsigned long flags;
 	unsigned long gpa;
@@ -597,6 +599,8 @@  static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 	if (rcu_stall_is_suppressed())
 		return;
 
+	prev_prio = nbcon_atomic_enter(NBCON_PRIO_EMERGENCY);
+
 	/*
 	 * OK, time to rat on our buddy...
 	 * See Documentation/RCU/stallwarn.rst for info on how to debug
@@ -651,6 +655,8 @@  static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 	panic_on_rcu_stall();
 
 	rcu_force_quiescent_state();  /* Kick them all. */
+
+	nbcon_atomic_exit(NBCON_PRIO_EMERGENCY, prev_prio);
 }
 
 static void print_cpu_stall(unsigned long gps)