
rcu: Add cpu-exp indicator to expedited RCU CPU stall warnings

Message ID 20220518114310.1478091-1-qiang1.zhang@intel.com (mailing list archive)
State Accepted
Commit 178b9d47f3049e8122738c3166ee4975b75cba55

Commit Message

Zqiang May 18, 2022, 11:43 a.m. UTC
This commit adds a "D" indicator to expedited RCU CPU stall warnings.
When an expedited grace period begins and a CPU has had interrupts
disabled for too long, the IPI handler (rcu_exp_handler()) cannot
respond in time, and this debugging indicator is shown.

runqemu kvm slirp nographic qemuparams="-m 4096 -smp 4"  bootparams=
"isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3 rcutree.dump_tree=1
rcutorture.stall_cpu_holdoff=30 rcutorture.stall_cpu=40
rcutorture.stall_cpu_irqsoff=1 rcutorture.stall_cpu_block=0
rcutorture.stall_no_softlockup=1" -d

rcu_torture_stall start on CPU 1.
............
rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks:
{ 1-...D } 26467 jiffies s: 13317 root: 0x1/.
rcu: blocking rcu_node structures (internal RCU debug): l=1:0-1:0x2/.
Task dump for CPU 1:
task:rcu_torture_sta state:R  running task     stack:    0 pid:   76
ppid:     2 flags:0x00004008

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 kernel/rcu/tree_exp.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
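
For readers working from console logs, the per-CPU entry in the sample output above ("1-...D") can be decoded mechanically. The following is only a user-space sketch and is not part of the patch: struct stall_entry, flag_ok() and parse_stall_entry() are hypothetical names, and the comment on each field simply restates the kernel expression that the patch prints for that position.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical decoded form of one "<cpu>-XXXX" entry; not a kernel type. */
struct stall_entry {
	int cpu;
	bool saw_O;	/* cpu_online(cpu) was false */
	bool saw_o;	/* rdp->grpmask & rnp->expmaskinit was zero */
	bool saw_N;	/* rdp->grpmask & rnp->expmaskinitnext was zero */
	bool saw_D;	/* rdp->cpu_no_qs.b.exp was false */
};

/* Each flag position holds either its letter or a '.' placeholder. */
static bool flag_ok(char c, char letter)
{
	return c == letter || c == '.';
}

/* Parse one entry such as " 1-...D"; returns false on any mismatch. */
static bool parse_stall_entry(const char *text, struct stall_entry *out)
{
	char f[4];
	int cpu;

	/* %d skips leading blanks; each %c takes exactly one character. */
	if (sscanf(text, "%d-%c%c%c%c", &cpu, &f[0], &f[1], &f[2], &f[3]) != 5)
		return false;
	if (!flag_ok(f[0], 'O') || !flag_ok(f[1], 'o') ||
	    !flag_ok(f[2], 'N') || !flag_ok(f[3], 'D'))
		return false;

	out->cpu = cpu;
	out->saw_O = (f[0] == 'O');
	out->saw_o = (f[1] == 'o');
	out->saw_N = (f[2] == 'N');
	out->saw_D = (f[3] == 'D');
	return true;
}
```

Entries printed by kernels without this patch carry only three flag characters, so this sketch rejects them rather than guessing at a fourth.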

Comments

Paul E. McKenney May 18, 2022, 6:14 p.m. UTC | #1
On Wed, May 18, 2022 at 07:43:10PM +0800, Zqiang wrote:
> This commit adds a "D" indicator to expedited RCU CPU stall warnings.
> When an expedited grace period begins and a CPU has had interrupts
> disabled for too long, the IPI handler (rcu_exp_handler()) cannot
> respond in time, and this debugging indicator is shown.
> 
> runqemu kvm slirp nographic qemuparams="-m 4096 -smp 4"  bootparams=
> "isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3 rcutree.dump_tree=1
> rcutorture.stall_cpu_holdoff=30 rcutorture.stall_cpu=40
> rcutorture.stall_cpu_irqsoff=1 rcutorture.stall_cpu_block=0
> rcutorture.stall_no_softlockup=1" -d
> 
> rcu_torture_stall start on CPU 1.
> ............
> rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks:
> { 1-...D } 26467 jiffies s: 13317 root: 0x1/.
> rcu: blocking rcu_node structures (internal RCU debug): l=1:0-1:0x2/.
> Task dump for CPU 1:
> task:rcu_torture_sta state:R  running task     stack:    0 pid:   76
> ppid:     2 flags:0x00004008
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>

Nice!!!  I have queued this for v5.20 and for further testing and
review, thank you!

As usual, I could not resist the temptation to wordsmith the commit log,
so could you please check it in case I messed something up?

							Thanx, Paul

------------------------------------------------------------------------

commit 178b9d47f3049e8122738c3166ee4975b75cba55
Author: Zqiang <qiang1.zhang@intel.com>
Date:   Wed May 18 19:43:10 2022 +0800

    rcu: Add irqs-disabled indicator to expedited RCU CPU stall warnings
    
    If a CPU has interrupts disabled continuously starting before the
    beginning of a given expedited RCU grace period, that CPU will not
    execute that grace period's IPI handler.  This will in turn mean
    that the ->cpu_no_qs.b.exp field in that CPU's rcu_data structure
    will continue to contain the boolean value false.
    
    Knowing whether or not a CPU has had interrupts disabled can be helpful
    when debugging an expedited RCU CPU stall warning, so this commit
    adds a "D" indicator to expedited RCU CPU stall warnings that signifies
    that the corresponding CPU has had interrupts disabled throughout.
    
    This capability was tested as follows:
    
    runqemu kvm slirp nographic qemuparams="-m 4096 -smp 4"  bootparams=
    "isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3 rcutree.dump_tree=1
    rcutorture.stall_cpu_holdoff=30 rcutorture.stall_cpu=40
    rcutorture.stall_cpu_irqsoff=1 rcutorture.stall_cpu_block=0
    rcutorture.stall_no_softlockup=1" -d
    
    The rcu_torture_stall() function ran on CPU 1, which displays the "D"
    as expected given the rcutorture.stall_cpu_irqsoff=1 module parameter:
    
    ............
    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks:
    { 1-...D } 26467 jiffies s: 13317 root: 0x1/.
    rcu: blocking rcu_node structures (internal RCU debug): l=1:0-1:0x2/.
    Task dump for CPU 1:
    task:rcu_torture_sta state:R  running task     stack:    0 pid:   76  ppid:     2 flags:0x00004008
    
    Signed-off-by: Zqiang <qiang1.zhang@intel.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 4c7037b507032..f092c7f18a5f3 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -637,10 +637,11 @@ static void synchronize_rcu_expedited_wait(void)
 					continue;
 				ndetected++;
 				rdp = per_cpu_ptr(&rcu_data, cpu);
-				pr_cont(" %d-%c%c%c", cpu,
+				pr_cont(" %d-%c%c%c%c", cpu,
 					"O."[!!cpu_online(cpu)],
 					"o."[!!(rdp->grpmask & rnp->expmaskinit)],
-					"N."[!!(rdp->grpmask & rnp->expmaskinitnext)]);
+					"N."[!!(rdp->grpmask & rnp->expmaskinitnext)],
+					"D."[!!(rdp->cpu_no_qs.b.exp)]);
 			}
 		}
 		pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",

Patch

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 4c7037b50703..f092c7f18a5f 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -637,10 +637,11 @@  static void synchronize_rcu_expedited_wait(void)
 					continue;
 				ndetected++;
 				rdp = per_cpu_ptr(&rcu_data, cpu);
-				pr_cont(" %d-%c%c%c", cpu,
+				pr_cont(" %d-%c%c%c%c", cpu,
 					"O."[!!cpu_online(cpu)],
 					"o."[!!(rdp->grpmask & rnp->expmaskinit)],
-					"N."[!!(rdp->grpmask & rnp->expmaskinitnext)]);
+					"N."[!!(rdp->grpmask & rnp->expmaskinitnext)],
+					"D."[!!(rdp->cpu_no_qs.b.exp)]);
 			}
 		}
 		pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
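
The hunk above relies on indexing a two-character string literal with a value normalized to 0 or 1, so index 0 selects the letter and index 1 selects the '.' placeholder. A minimal user-space sketch of the same idiom follows. It is not kernel code: struct cpu_state and format_stall_entry() are hypothetical stand-ins, and snprintf() takes the place of pr_cont().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the state the kernel consults per CPU. */
struct cpu_state {
	bool online;			/* cpu_online(cpu) */
	bool in_expmaskinit;		/* rdp->grpmask & rnp->expmaskinit */
	bool in_expmaskinitnext;	/* rdp->grpmask & rnp->expmaskinitnext */
	bool cpu_no_qs_exp;		/* rdp->cpu_no_qs.b.exp */
};

/*
 * Format one "<cpu>-XXXX" entry into buf.  A false condition indexes
 * element 0 (the letter), a true one indexes element 1 (the '.').
 * The fields are already bool, so the "!!" normalization used in the
 * kernel is unnecessary here.  Returns what snprintf() returns.
 */
static int format_stall_entry(char *buf, size_t len, int cpu,
			      const struct cpu_state *s)
{
	return snprintf(buf, len, "%d-%c%c%c%c", cpu,
			"O."[s->online],
			"o."[s->in_expmaskinit],
			"N."[s->in_expmaskinitnext],
			"D."[s->cpu_no_qs_exp]);
}
```

With online, in_expmaskinit and in_expmaskinitnext all true and cpu_no_qs_exp false, calling format_stall_entry() for CPU 1 yields "1-...D", which matches the entry in the test output from the commit log.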