diff mbox

rcu self-detected stall messages on OMAP3, 4 boards

Message ID 20120922000537.GH2454@linux.vnet.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Paul E. McKenney Sept. 22, 2012, 12:05 a.m. UTC
On Fri, Sep 21, 2012 at 10:41:14PM +0000, Paul Walmsley wrote:
> On Fri, 21 Sep 2012, Paul E. McKenney wrote:
> 
> > On Fri, Sep 21, 2012 at 05:47:31PM +0000, Paul Walmsley wrote:
> >
> > > I built an OMAP kernel from Linus' commit 
> > > 4651afbbae968772efd6dc4ba461cba9b49bb9d8 ("Merge branch 'for-3.6-fixes' of 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq").  The config used 
> > > was 'omap2plus_defconfig', and enabled CONFIG_CPU_IDLE by hand.  Booted it 
> > > on a Pandaboard (OMAP4430ES2) into a very minimal Debian rootfs.
> > 
> > Did you have the patch at https://lkml.org/lkml/2012/8/30/290 applied?
> 
> No, it's just as described above.
> 
> > If not, could you please try it?  (This patch cleared up a similar
> > problem for Becky, also on OMAP.)
> 
> Did not seem to help, either with or without CONFIG_CPU_IDLE.

I was hoping!  ;-)

And my init=/bin/sh kernel ran idle for more than an hour without
any RCU CPU stall warnings...

I am wondering if your system somehow figured out how to start a grace
period that had no RCU callbacks waiting for it.  If that happened,
then a CONFIG_NO_HZ=y system could in theory get into a state where all
CPUs are in dyntick-idle mode, so that none of them is doing anything
to force the grace period to complete.

That should be easy to diagnose, anyway.  Please see below, which
includes the earlier diagnostic patch.

							Thanx, Paul

------------------------------------------------------------------------

Comments

Paul Walmsley Sept. 22, 2012, 6:16 p.m. UTC | #1
Hi Paul

On Fri, 21 Sep 2012, Paul E. McKenney wrote:

> I am wondering if your system somehow figured out how to start a grace
> period that had no RCU callbacks waiting for it.  If that happened,
> then a CONFIG_NO_HZ=y system could in theory get into a state where all
> CPUs are in dyntick-idle mode, so that none of them is doing anything
> to force the grace period to complete.
>
> That should be easy to diagnose, anyway.  Please see below, which
> includes the earlier diagnostic patch.

Here you go.
 
- Paul

[  248.902618] INFO: rcu_sched self-detected stall on CPU
[  248.905456]  0: (1 ticks this GP) idle=933/1/0 
[  248.907897]   (t=26570 jiffies g=11 c=10 q=0)
[  248.910339] [<c001bc90>] (unwind_backtrace+0x0/0xf0) from [<c00ad800>] (rcu_check_callbacks+0x220/0x714)
[  248.915527] [<c00ad800>] (rcu_check_callbacks+0x220/0x714) from [<c00532a0>] (update_process_times+0x38/0x68)
[  248.920928] [<c00532a0>] (update_process_times+0x38/0x68) from [<c008c9e8>] (tick_sched_timer+0x80/0xec)
[  248.926116] [<c008c9e8>] (tick_sched_timer+0x80/0xec) from [<c0068ed4>] (__run_hrtimer+0x7c/0x1e0)
[  248.930999] [<c0068ed4>] (__run_hrtimer+0x7c/0x1e0) from [<c0069cb8>] (hrtimer_interrupt+0x11c/0x2d0)
[  248.936035] [<c0069cb8>] (hrtimer_interrupt+0x11c/0x2d0) from [<c001a3cc>] (twd_handler+0x30/0x44)
[  248.940948] [<c001a3cc>] (twd_handler+0x30/0x44) from [<c00a7bd0>] (handle_percpu_devid_irq+0x90/0x13c)
[  248.946075] [<c00a7bd0>] (handle_percpu_devid_irq+0x90/0x13c) from [<c00a4344>] (generic_handle_irq+0x30/0x48)
[  248.951538] [<c00a4344>] (generic_handle_irq+0x30/0x48) from [<c0014e38>] (handle_IRQ+0x4c/0xac)
[  248.956329] [<c0014e38>] (handle_IRQ+0x4c/0xac) from [<c00084cc>] (gic_handle_irq+0x28/0x5c)
[  248.960937] [<c00084cc>] (gic_handle_irq+0x28/0x5c) from [<c04fb1a4>] (__irq_svc+0x44/0x5c)
[  248.965484] Exception stack(0xc0729f58 to 0xc0729fa0)
[  248.968231] 9f40:                                                       0003b832 00000001
[  248.972686] 9f60: 00000000 c074a8e8 c0728000 c07c42c8 c05065a0 c074bdc8 00000000 411fc092
[  248.977142] 9f80: c074bfe8 00000000 00000001 c0729fa0 0003b833 c0015130 20000113 ffffffff
[  248.981597] [<c04fb1a4>] (__irq_svc+0x44/0x5c) from [<c0015130>] (default_idle+0x20/0x44)
[  248.986083] [<c0015130>] (default_idle+0x20/0x44) from [<c001535c>] (cpu_idle+0x9c/0x114)
[  248.990539] [<c001535c>] (cpu_idle+0x9c/0x114) from [<c06d77b0>] (start_kernel+0x2b4/0x304)
Paul E. McKenney Sept. 22, 2012, 7:52 p.m. UTC | #2
On Sat, Sep 22, 2012 at 06:16:15PM +0000, Paul Walmsley wrote:
> Hi Paul
> 
> On Fri, 21 Sep 2012, Paul E. McKenney wrote:
> 
> > I am wondering if your system somehow figured out how to start a grace
> > period that had no RCU callbacks waiting for it.  If that happened,
> > then a CONFIG_NO_HZ=y system could in theory get into a state where all
> > CPUs are in dyntick-idle mode, so that none of them is doing anything
> > to force the grace period to complete.
> >
> > That should be easy to diagnose, anyway.  Please see below, which
> > includes the earlier diagnostic patch.
> 
> Here you go.
> 
> - Paul
> 
> [  248.902618] INFO: rcu_sched self-detected stall on CPU
> [  248.905456]  0: (1 ticks this GP) idle=933/1/0 
> [  248.907897]   (t=26570 jiffies g=11 c=10 q=0)

Bingo!!!  (q=0, in case you were wondering.  And thank you for testing this!)

Strangely enough, I believe that I have inadvertently fixed this in
my -rcu tree:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next

Nevertheless, if you get a chance to try it, I would be interested to
hear if my guess is correct.  The trick is that a kthread drives the
grace period in -rcu, regardless of whether or not there are callbacks.

However, the backport would not be something that -stable would be happy
with, so I will be putting together a fix for mainline.  This thing
has been in the kernel since about 2004, not sure why you didn't hit
it earlier.

							Thanx, Paul

> [  248.910339] [<c001bc90>] (unwind_backtrace+0x0/0xf0) from [<c00ad800>] (rcu_check_callbacks+0x220/0x714)
> [  248.915527] [<c00ad800>] (rcu_check_callbacks+0x220/0x714) from [<c00532a0>] (update_process_times+0x38/0x68)
> [  248.920928] [<c00532a0>] (update_process_times+0x38/0x68) from [<c008c9e8>] (tick_sched_timer+0x80/0xec)
> [  248.926116] [<c008c9e8>] (tick_sched_timer+0x80/0xec) from [<c0068ed4>] (__run_hrtimer+0x7c/0x1e0)
> [  248.930999] [<c0068ed4>] (__run_hrtimer+0x7c/0x1e0) from [<c0069cb8>] (hrtimer_interrupt+0x11c/0x2d0)
> [  248.936035] [<c0069cb8>] (hrtimer_interrupt+0x11c/0x2d0) from [<c001a3cc>] (twd_handler+0x30/0x44)
> [  248.940948] [<c001a3cc>] (twd_handler+0x30/0x44) from [<c00a7bd0>] (handle_percpu_devid_irq+0x90/0x13c)
> [  248.946075] [<c00a7bd0>] (handle_percpu_devid_irq+0x90/0x13c) from [<c00a4344>] (generic_handle_irq+0x30/0x48)
> [  248.951538] [<c00a4344>] (generic_handle_irq+0x30/0x48) from [<c0014e38>] (handle_IRQ+0x4c/0xac)
> [  248.956329] [<c0014e38>] (handle_IRQ+0x4c/0xac) from [<c00084cc>] (gic_handle_irq+0x28/0x5c)
> [  248.960937] [<c00084cc>] (gic_handle_irq+0x28/0x5c) from [<c04fb1a4>] (__irq_svc+0x44/0x5c)
> [  248.965484] Exception stack(0xc0729f58 to 0xc0729fa0)
> [  248.968231] 9f40:                                                       0003b832 00000001
> [  248.972686] 9f60: 00000000 c074a8e8 c0728000 c07c42c8 c05065a0 c074bdc8 00000000 411fc092
> [  248.977142] 9f80: c074bfe8 00000000 00000001 c0729fa0 0003b833 c0015130 20000113 ffffffff
> [  248.981597] [<c04fb1a4>] (__irq_svc+0x44/0x5c) from [<c0015130>] (default_idle+0x20/0x44)
> [  248.986083] [<c0015130>] (default_idle+0x20/0x44) from [<c001535c>] (cpu_idle+0x9c/0x114)
> [  248.990539] [<c001535c>] (cpu_idle+0x9c/0x114) from [<c06d77b0>] (start_kernel+0x2b4/0x304)
>
Paul Walmsley Sept. 22, 2012, 10:20 p.m. UTC | #3
Hi Paul

On Sat, 22 Sep 2012, Paul E. McKenney wrote:

> Strangely enough, I believe that I have inadvertently fixed this in
> my -rcu tree:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next
> 
> Nevertheless, if you get a chance to try it, I would be interested to
> hear if my guess is correct.

Yes, good news: the stall warnings go away with that branch.

> The trick is that a kthread drives the grace period in -rcu, regardless 
> of whether or not there are callbacks.

This is "rcu: Move quiescent-state forcing into kthread" ?

Added some debugging into rcu_gp_kthread() after that commit and can 
confirm that the quiescent-state forcing loop does start a few times when 
there are zero callbacks pending (modulo any races in my measurement 
code).

> However, the backport would not be something that -stable would be happy
> with, so I will be putting together a fix for mainline.  This thing
> has been in the kernel since about 2004, not sure why you didn't hit
> it earlier.

One other data point in that regard - noticed the warnings don't appear 
when the board is booted with:

commit 4fa3b6cb1bc8c14b81b4c8ffdfd3f2500a7e9367
Author: Paul E. McKenney <paul.mckenney@linaro.org>
Date:   Tue Jun 5 15:53:53 2012 -0700

    rcu: Fix qlen_lazy breakage

...


- Paul
Paul E. McKenney Sept. 22, 2012, 11:17 p.m. UTC | #4
On Sat, Sep 22, 2012 at 10:20:19PM +0000, Paul Walmsley wrote:
> Hi Paul
> 
> On Sat, 22 Sep 2012, Paul E. McKenney wrote:
> 
> > Strangely enough, I believe that I have inadvertently fixed this in
> > my -rcu tree:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next
> > 
> > Nevertheless, if you get a chance to try it, I would be interested to
> > hear if my guess is correct.
> 
> Yes, good news: the stall warnings go away with that branch.

Very good!

> > The trick is that a kthread drives the grace period in -rcu, regardless 
> > of whether or not there are callbacks.
> 
> This is "rcu: Move quiescent-state forcing into kthread" ?

Yep, plus the preceding commits moving grace-period initialization and
cleanup into that same kthread.  This was motivated by a bug report
last February complaining about 200-microsecond latency spikes from
RCU grace-period initialization.  On systems with 4096 CPUs.

Real-time response.  It is far bigger than I thought.  ;-)

> Added some debugging into rcu_gp_kthread() after that commit and can 
> confirm that the quiescent-state forcing loop does start a few times when 
> there are zero callbacks pending (modulo any races in my measurement 
> code).

Cool, thank you!  Assuming it works, that indicates that there is long-term
value to the fix for this problem.  On larger systems, extra grace periods
are not what you want, as their expense increases with the number of CPUs.

> > However, the backport would not be something that -stable would be happy
> > with, so I will be putting together a fix for mainline.  This thing
> > has been in the kernel since about 2004, not sure why you didn't hit
> > it earlier.
> 
> One other data point in that regard - noticed the warnings don't appear 
> when the board is booted with:
> 
> commit 4fa3b6cb1bc8c14b81b4c8ffdfd3f2500a7e9367
> Author: Paul E. McKenney <paul.mckenney@linaro.org>
> Date:   Tue Jun 5 15:53:53 2012 -0700
> 
>     rcu: Fix qlen_lazy breakage

You lost me on this one.  This is already in mainline, so if you were
using (say) 3.6-rc6, you would already have this commit applied.

							Thanx, Paul
Paul Walmsley Sept. 24, 2012, 9:54 p.m. UTC | #5
On Sat, 22 Sep 2012, Paul E. McKenney wrote:

> On Sat, Sep 22, 2012 at 10:20:19PM +0000, Paul Walmsley wrote:
> > On Sat, 22 Sep 2012, Paul E. McKenney wrote:
> >
> > > This thing has been in the kernel since about 2004, not sure why you 
> > > didn't hit it earlier.
> > 
> > One other data point in that regard - noticed the warnings don't appear 
> > when the board is booted with:
> > 
> > commit 4fa3b6cb1bc8c14b81b4c8ffdfd3f2500a7e9367
> > Author: Paul E. McKenney <paul.mckenney@linaro.org>
> > Date:   Tue Jun 5 15:53:53 2012 -0700
> > 
> >     rcu: Fix qlen_lazy breakage
> 
> You lost me on this one.  This is already in mainline, so if you were
> using (say) 3.6-rc6, you would already have this commit applied.

If I check out a kernel at this commit 
4fa3b6cb1bc8c14b81b4c8ffdfd3f2500a7e9367 + the 
zero-callback-in-tickless-idle diagnostic patch, build and boot it, then 
the stall warnings don't appear (in 25 minutes of testing).

Messages from the diagnostic patch indicate that the kernel is entering 
idle during a grace period with no RCU callbacks, though.

This is not a big deal and does not need any further attention.  Just 
wanted to place a time boundary on the point when these messages started 
appearing.  (It is unlikely to be an optimal bound: i.e., there are 
probably later commits where the warnings also don't appear.)


- Paul
Paul E. McKenney Sept. 24, 2012, 10 p.m. UTC | #6
On Mon, Sep 24, 2012 at 09:54:00PM +0000, Paul Walmsley wrote:
> On Sat, 22 Sep 2012, Paul E. McKenney wrote:
> 
> > On Sat, Sep 22, 2012 at 10:20:19PM +0000, Paul Walmsley wrote:
> > > On Sat, 22 Sep 2012, Paul E. McKenney wrote:
> > >
> > > > This thing has been in the kernel since about 2004, not sure why you 
> > > > didn't hit it earlier.
> > > 
> > > One other data point in that regard - noticed the warnings don't appear 
> > > when the board is booted with:
> > > 
> > > commit 4fa3b6cb1bc8c14b81b4c8ffdfd3f2500a7e9367
> > > Author: Paul E. McKenney <paul.mckenney@linaro.org>
> > > Date:   Tue Jun 5 15:53:53 2012 -0700
> > > 
> > >     rcu: Fix qlen_lazy breakage
> > 
> > You lost me on this one.  This is already in mainline, so if you were
> > using (say) 3.6-rc6, you would already have this commit applied.
> 
> If I check out a kernel at this commit 
> 4fa3b6cb1bc8c14b81b4c8ffdfd3f2500a7e9367 + the 
> zero-callback-in-tickless-idle diagnostic patch, build and boot it, then 
> the stall warnings don't appear (in 25 minutes of testing).
> 
> Messages from the diagnostic patch indicate that the kernel is entering 
> idle during a grace period with no RCU callbacks, though.
> 
> This is not a big deal and does not need any further attention.  Just 
> wanted to place a time boundary on the point when these messages started 
> appearing.  (It is unlikely to be an optimal bound: i.e., there are 
> probably later commits where the warnings also don't appear.)

Ah, got it, thank you!

							Thanx, Paul
diff mbox

Patch

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 307caf1..696f189 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -879,6 +879,7 @@  static void print_other_cpu_stall(struct rcu_state *rsp)
 	unsigned long flags;
 	int ndetected = 0;
 	struct rcu_node *rnp = rcu_get_root(rsp);
+	long totqlen = 0;
 
 	/* Only let one CPU complain about others per time interval. */
 
@@ -923,8 +924,11 @@  static void print_other_cpu_stall(struct rcu_state *rsp)
 	raw_spin_unlock_irqrestore(&rnp->lock, flags);
 
 	print_cpu_stall_info_end();
-	printk(KERN_CONT "(detected by %d, t=%ld jiffies)\n",
-	       smp_processor_id(), (long)(jiffies - rsp->gp_start));
+	for_each_possible_cpu(cpu)
+		totqlen += per_cpu_ptr(rsp->rda, cpu)->qlen;
+	pr_cont("(detected by %d, t=%ld jiffies, g=%lu, c=%lu, q=%lu)\n",
+	       smp_processor_id(), (long)(jiffies - rsp->gp_start),
+	       rsp->gpnum, rsp->completed, totqlen);
 	if (ndetected == 0)
 		printk(KERN_ERR "INFO: Stall ended before state dump start\n");
 	else if (!trigger_all_cpu_backtrace())
@@ -939,8 +943,10 @@  static void print_other_cpu_stall(struct rcu_state *rsp)
 
 static void print_cpu_stall(struct rcu_state *rsp)
 {
+	int cpu;
 	unsigned long flags;
 	struct rcu_node *rnp = rcu_get_root(rsp);
+	long totqlen = 0;
 
 	/*
 	 * OK, time to rat on ourselves...
@@ -951,7 +957,10 @@  static void print_cpu_stall(struct rcu_state *rsp)
 	print_cpu_stall_info_begin();
 	print_cpu_stall_info(rsp, smp_processor_id());
 	print_cpu_stall_info_end();
-	printk(KERN_CONT " (t=%lu jiffies)\n", jiffies - rsp->gp_start);
+	for_each_possible_cpu(cpu)
+		totqlen += per_cpu_ptr(rsp->rda, cpu)->qlen;
+	pr_cont(" (t=%lu jiffies g=%lu c=%lu q=%lu)\n",
+		jiffies - rsp->gp_start, rsp->gpnum, rsp->completed, totqlen);
 	if (!trigger_all_cpu_backtrace())
 		dump_stack();