diff mbox

Testing tracer wakeup_rt: .. no entries found ..FAILED!

Message ID 1343740419.27983.33.camel@gandalf.stny.rr.com (mailing list archive)
State New, archived
Headers show

Commit Message

Steven Rostedt July 31, 2012, 1:13 p.m. UTC
On Tue, 2012-07-31 at 15:50 +0300, Avi Kivity wrote:
> On 07/31/2012 03:43 PM, Steven Rostedt wrote:

> That would be better.  A hypervisor might be real-time capable (with
> some effort kvm can do this), so we don't want to turn off real time
> features just based on that.

It would only turn off if you enable selftests and the timing falied. If
the kvm had real time features, this most likely would fail anyway. But
that said, here's a patch that should solve this:

-- Steve





--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Fengguang Wu July 31, 2012, 11:43 p.m. UTC | #1
On Tue, Jul 31, 2012 at 09:13:39AM -0400, Steven Rostedt wrote:
> On Tue, 2012-07-31 at 15:50 +0300, Avi Kivity wrote:
> > On 07/31/2012 03:43 PM, Steven Rostedt wrote:
> 
> > That would be better.  A hypervisor might be real-time capable (with
> > some effort kvm can do this), so we don't want to turn off real time
> > features just based on that.
> 
> It would only turn off if you enable selftests and the timing falied. If
> the kvm had real time features, this most likely would fail anyway. But
> that said, here's a patch that should solve this:

No luck.. it still fails:

[    2.360068] Testing tracer irqsoff: [    2.854529] 
[    2.854828] ===============================
[    2.855560] [ INFO: suspicious RCU usage. ]
[    2.856266] 3.5.0-00024-g01ff5db-dirty #3 Not tainted
[    2.857182] -------------------------------
[    2.857933] /c/wfg/linux/include/linux/rcupdate.h:730 rcu_read_lock() used illegally while idle!
[    2.859450] 
[    2.859450] other info that might help us debug this:
[    2.859450] 
[    2.860874] 
[    2.860874] RCU used illegally from idle CPU!
[    2.860874] rcu_scheduler_active = 1, debug_locks = 0
[    2.862754] RCU used illegally from extended quiescent state!
[    2.863741] 2 locks held by swapper/0/0:

[    2.864377]  #0: [    2.864423]  (max_trace_lock){......}, at: [<814f6bfe>] check_critical_timing+0xd7/0x286
[    2.864423]  #1:  (rcu_read_lock){.+.+..}, at: [<8116f930>] __update_max_tr+0x0/0x430

[    2.864423] stack backtrace:
[    2.864423] Pid: 0, comm: swapper/0 Not tainted 3.5.0-00024-g01ff5db-dirty #3
[    2.864423] Call Trace:
[    2.864423]  [<81103a06>] lockdep_rcu_suspicious+0x1c6/0x210
[    2.864423]  [<8116fc9a>] __update_max_tr+0x36a/0x430
[    2.864423]  [<8116f930>] ? tracing_record_cmdline+0x200/0x200
[    2.864423]  [<8117186e>] update_max_tr_single+0x14e/0x2c0
[    2.864423]  [<81170baa>] ? __trace_stack+0x2a/0x40
[    2.864423]  [<814f6d22>] check_critical_timing+0x1fb/0x286
[    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
[    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
[    2.864423]  [<8110a0e7>] ? trace_hardirqs_on+0x27/0x40
[    2.864423]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
[    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
[    2.864423]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
[    2.864423]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
[    2.864423]  [<81013313>] default_idle+0x593/0xc30
[    2.864423]  [<8101692d>] cpu_idle+0x2dd/0x390
[    2.864423]  [<814eb841>] rest_init+0x2f5/0x314
[    2.864423]  [<814eb54c>] ? __read_lock_failed+0x14/0x14
[    2.864423]  [<817a43b4>] start_kernel+0x866/0x87a

Thanks,
Fengguang
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steven Rostedt July 31, 2012, 11:51 p.m. UTC | #2
On Wed, 2012-08-01 at 07:43 +0800, Fengguang Wu wrote:
> On Tue, Jul 31, 2012 at 09:13:39AM -0400, Steven Rostedt wrote:
> > On Tue, 2012-07-31 at 15:50 +0300, Avi Kivity wrote:
> > > On 07/31/2012 03:43 PM, Steven Rostedt wrote:
> > 
> > > That would be better.  A hypervisor might be real-time capable (with
> > > some effort kvm can do this), so we don't want to turn off real time
> > > features just based on that.
> > 
> > It would only turn off if you enable selftests and the timing falied. If
> > the kvm had real time features, this most likely would fail anyway. But
> > that said, here's a patch that should solve this:
> 
> No luck.. it still fails:

I bet you it didn't ;-)

> 
> [    2.360068] Testing tracer irqsoff: [    2.854529] 
> [    2.854828] ===============================
> [    2.855560] [ INFO: suspicious RCU usage. ]
> [    2.856266] 3.5.0-00024-g01ff5db-dirty #3 Not tainted
> [    2.857182] -------------------------------
> [    2.857933] /c/wfg/linux/include/linux/rcupdate.h:730 rcu_read_lock() used illegally while idle!
> [    2.859450] 
> [    2.859450] other info that might help us debug this:
> [    2.859450] 
> [    2.860874] 
> [    2.860874] RCU used illegally from idle CPU!
> [    2.860874] rcu_scheduler_active = 1, debug_locks = 0
> [    2.862754] RCU used illegally from extended quiescent state!
> [    2.863741] 2 locks held by swapper/0/0:
> 
> [    2.864377]  #0: [    2.864423]  (max_trace_lock){......}, at: [<814f6bfe>] check_critical_timing+0xd7/0x286
> [    2.864423]  #1:  (rcu_read_lock){.+.+..}, at: [<8116f930>] __update_max_tr+0x0/0x430
> 
> [    2.864423] stack backtrace:
> [    2.864423] Pid: 0, comm: swapper/0 Not tainted 3.5.0-00024-g01ff5db-dirty #3
> [    2.864423] Call Trace:
> [    2.864423]  [<81103a06>] lockdep_rcu_suspicious+0x1c6/0x210
> [    2.864423]  [<8116fc9a>] __update_max_tr+0x36a/0x430
> [    2.864423]  [<8116f930>] ? tracing_record_cmdline+0x200/0x200
> [    2.864423]  [<8117186e>] update_max_tr_single+0x14e/0x2c0
> [    2.864423]  [<81170baa>] ? __trace_stack+0x2a/0x40
> [    2.864423]  [<814f6d22>] check_critical_timing+0x1fb/0x286
> [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> [    2.864423]  [<8110a0e7>] ? trace_hardirqs_on+0x27/0x40
> [    2.864423]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
> [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> [    2.864423]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
> [    2.864423]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
> [    2.864423]  [<81013313>] default_idle+0x593/0xc30
> [    2.864423]  [<8101692d>] cpu_idle+0x2dd/0x390
> [    2.864423]  [<814eb841>] rest_init+0x2f5/0x314
> [    2.864423]  [<814eb54c>] ? __read_lock_failed+0x14/0x14
> [    2.864423]  [<817a43b4>] start_kernel+0x866/0x87a

What was the next lines? I bet you it was "PASSED". Which means it did
not fail. This is the second bug you found that has to do with RCU being
called in 'idle'. The one that Paul posted a patch for.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul E. McKenney July 31, 2012, 11:57 p.m. UTC | #3
On Tue, Jul 31, 2012 at 07:51:39PM -0400, Steven Rostedt wrote:
> On Wed, 2012-08-01 at 07:43 +0800, Fengguang Wu wrote:
> > On Tue, Jul 31, 2012 at 09:13:39AM -0400, Steven Rostedt wrote:
> > > On Tue, 2012-07-31 at 15:50 +0300, Avi Kivity wrote:
> > > > On 07/31/2012 03:43 PM, Steven Rostedt wrote:
> > > 
> > > > That would be better.  A hypervisor might be real-time capable (with
> > > > some effort kvm can do this), so we don't want to turn off real time
> > > > features just based on that.
> > > 
> > > It would only turn off if you enable selftests and the timing falied. If
> > > the kvm had real time features, this most likely would fail anyway. But
> > > that said, here's a patch that should solve this:
> > 
> > No luck.. it still fails:
> 
> I bet you it didn't ;-)
> 
> > 
> > [    2.360068] Testing tracer irqsoff: [    2.854529] 
> > [    2.854828] ===============================
> > [    2.855560] [ INFO: suspicious RCU usage. ]
> > [    2.856266] 3.5.0-00024-g01ff5db-dirty #3 Not tainted
> > [    2.857182] -------------------------------
> > [    2.857933] /c/wfg/linux/include/linux/rcupdate.h:730 rcu_read_lock() used illegally while idle!
> > [    2.859450] 
> > [    2.859450] other info that might help us debug this:
> > [    2.859450] 
> > [    2.860874] 
> > [    2.860874] RCU used illegally from idle CPU!
> > [    2.860874] rcu_scheduler_active = 1, debug_locks = 0
> > [    2.862754] RCU used illegally from extended quiescent state!
> > [    2.863741] 2 locks held by swapper/0/0:
> > 
> > [    2.864377]  #0: [    2.864423]  (max_trace_lock){......}, at: [<814f6bfe>] check_critical_timing+0xd7/0x286
> > [    2.864423]  #1:  (rcu_read_lock){.+.+..}, at: [<8116f930>] __update_max_tr+0x0/0x430
> > 
> > [    2.864423] stack backtrace:
> > [    2.864423] Pid: 0, comm: swapper/0 Not tainted 3.5.0-00024-g01ff5db-dirty #3
> > [    2.864423] Call Trace:
> > [    2.864423]  [<81103a06>] lockdep_rcu_suspicious+0x1c6/0x210
> > [    2.864423]  [<8116fc9a>] __update_max_tr+0x36a/0x430
> > [    2.864423]  [<8116f930>] ? tracing_record_cmdline+0x200/0x200
> > [    2.864423]  [<8117186e>] update_max_tr_single+0x14e/0x2c0
> > [    2.864423]  [<81170baa>] ? __trace_stack+0x2a/0x40
> > [    2.864423]  [<814f6d22>] check_critical_timing+0x1fb/0x286
> > [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.864423]  [<8110a0e7>] ? trace_hardirqs_on+0x27/0x40
> > [    2.864423]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
> > [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.864423]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
> > [    2.864423]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
> > [    2.864423]  [<81013313>] default_idle+0x593/0xc30
> > [    2.864423]  [<8101692d>] cpu_idle+0x2dd/0x390
> > [    2.864423]  [<814eb841>] rest_init+0x2f5/0x314
> > [    2.864423]  [<814eb54c>] ? __read_lock_failed+0x14/0x14
> > [    2.864423]  [<817a43b4>] start_kernel+0x866/0x87a
> 
> What was the next lines? I bet you it was "PASSED". Which means it did
> not fail. This is the second bug you found that has to do with RCU being
> called in 'idle'. The one that Paul posted a patch for.

Though it needs another patch to actually use it in the right place...

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Fengguang Wu July 31, 2012, 11:57 p.m. UTC | #4
On Tue, Jul 31, 2012 at 07:51:39PM -0400, Steven Rostedt wrote:
> On Wed, 2012-08-01 at 07:43 +0800, Fengguang Wu wrote:
> > On Tue, Jul 31, 2012 at 09:13:39AM -0400, Steven Rostedt wrote:
> > > On Tue, 2012-07-31 at 15:50 +0300, Avi Kivity wrote:
> > > > On 07/31/2012 03:43 PM, Steven Rostedt wrote:
> > > 
> > > > That would be better.  A hypervisor might be real-time capable (with
> > > > some effort kvm can do this), so we don't want to turn off real time
> > > > features just based on that.
> > > 
> > > It would only turn off if you enable selftests and the timing falied. If
> > > the kvm had real time features, this most likely would fail anyway. But
> > > that said, here's a patch that should solve this:
> > 
> > No luck.. it still fails:
> 
> I bet you it didn't ;-)
> 
> > 
> > [    2.360068] Testing tracer irqsoff: [    2.854529] 
> > [    2.854828] ===============================
> > [    2.855560] [ INFO: suspicious RCU usage. ]
> > [    2.856266] 3.5.0-00024-g01ff5db-dirty #3 Not tainted
> > [    2.857182] -------------------------------
> > [    2.857933] /c/wfg/linux/include/linux/rcupdate.h:730 rcu_read_lock() used illegally while idle!
> > [    2.859450] 
> > [    2.859450] other info that might help us debug this:
> > [    2.859450] 
> > [    2.860874] 
> > [    2.860874] RCU used illegally from idle CPU!
> > [    2.860874] rcu_scheduler_active = 1, debug_locks = 0
> > [    2.862754] RCU used illegally from extended quiescent state!
> > [    2.863741] 2 locks held by swapper/0/0:
> > 
> > [    2.864377]  #0: [    2.864423]  (max_trace_lock){......}, at: [<814f6bfe>] check_critical_timing+0xd7/0x286
> > [    2.864423]  #1:  (rcu_read_lock){.+.+..}, at: [<8116f930>] __update_max_tr+0x0/0x430
> > 
> > [    2.864423] stack backtrace:
> > [    2.864423] Pid: 0, comm: swapper/0 Not tainted 3.5.0-00024-g01ff5db-dirty #3
> > [    2.864423] Call Trace:
> > [    2.864423]  [<81103a06>] lockdep_rcu_suspicious+0x1c6/0x210
> > [    2.864423]  [<8116fc9a>] __update_max_tr+0x36a/0x430
> > [    2.864423]  [<8116f930>] ? tracing_record_cmdline+0x200/0x200
> > [    2.864423]  [<8117186e>] update_max_tr_single+0x14e/0x2c0
> > [    2.864423]  [<81170baa>] ? __trace_stack+0x2a/0x40
> > [    2.864423]  [<814f6d22>] check_critical_timing+0x1fb/0x286
> > [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.864423]  [<8110a0e7>] ? trace_hardirqs_on+0x27/0x40
> > [    2.864423]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
> > [    2.864423]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.864423]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
> > [    2.864423]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
> > [    2.864423]  [<81013313>] default_idle+0x593/0xc30
> > [    2.864423]  [<8101692d>] cpu_idle+0x2dd/0x390
> > [    2.864423]  [<814eb841>] rest_init+0x2f5/0x314
> > [    2.864423]  [<814eb54c>] ? __read_lock_failed+0x14/0x14
> > [    2.864423]  [<817a43b4>] start_kernel+0x866/0x87a
> 
> What was the next lines? I bet you it was "PASSED". Which means it did
> not fail. This is the second bug you found that has to do with RCU being
> called in 'idle'. The one that Paul posted a patch for.

Yeah, PASSED!

[    2.898070]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
[    2.898070]  [<81013313>] ? default_idle+0x593/0xc30
[    2.898070]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
[    2.898070]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
[    2.898070]  [<81013313>] default_idle+0x593/0xc30
[    2.898070]  [<8101692d>] cpu_idle+0x2dd/0x390
[    2.898070]  [<817fbe97>] start_secondary+0x44b/0x460
[    3.150115] PASSED
[    3.390079] Testing tracer function_graph: PASSED

I'll test Paul's patch on top of yours right away.

Thanks,
Fengguang
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steven Rostedt Aug. 7, 2012, 1:29 p.m. UTC | #5
On Wed, 2012-08-01 at 07:57 +0800, Fengguang Wu wrote:
> > 
> > What was the next lines? I bet you it was "PASSED". Which means it did
> > not fail. This is the second bug you found that has to do with RCU being
> > called in 'idle'. The one that Paul posted a patch for.
> 
> Yeah, PASSED!

I have this patch queued for 3.7. Can I add your 'Tested-by' for it.

Thanks,

-- Steve

> 
> [    2.898070]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
> [    2.898070]  [<81013313>] ? default_idle+0x593/0xc30
> [    2.898070]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
> [    2.898070]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
> [    2.898070]  [<81013313>] default_idle+0x593/0xc30
> [    2.898070]  [<8101692d>] cpu_idle+0x2dd/0x390
> [    2.898070]  [<817fbe97>] start_secondary+0x44b/0x460
> [    3.150115] PASSED
> [    3.390079] Testing tracer function_graph: PASSED
> 
> I'll test Paul's patch on top of yours right away.
> 
> Thanks,
> Fengguang


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Fengguang Wu Aug. 7, 2012, 1:32 p.m. UTC | #6
On Tue, Aug 07, 2012 at 09:29:33AM -0400, Steven Rostedt wrote:
> On Wed, 2012-08-01 at 07:57 +0800, Fengguang Wu wrote:
> > > 
> > > What was the next lines? I bet you it was "PASSED". Which means it did
> > > not fail. This is the second bug you found that has to do with RCU being
> > > called in 'idle'. The one that Paul posted a patch for.
> > 
> > Yeah, PASSED!
> 
> I have this patch queued for 3.7. Can I add your 'Tested-by' for it.

Yes, please. Thanks!

Thanks,
Fengguang

> > [    2.898070]  [<8117ea5e>] time_hardirqs_on+0x1de/0x220
> > [    2.898070]  [<81013313>] ? default_idle+0x593/0xc30
> > [    2.898070]  [<81109d6d>] trace_hardirqs_on_caller+0x2d/0x380
> > [    2.898070]  [<8110a0e7>] trace_hardirqs_on+0x27/0x40
> > [    2.898070]  [<81013313>] default_idle+0x593/0xc30
> > [    2.898070]  [<8101692d>] cpu_idle+0x2dd/0x390
> > [    2.898070]  [<817fbe97>] start_secondary+0x44b/0x460
> > [    3.150115] PASSED
> > [    3.390079] Testing tracer function_graph: PASSED
> > 
> > I'll test Paul's patch on top of yours right away.
> > 
> > Thanks,
> > Fengguang
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 1003a4d..2c00a69 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -1041,6 +1041,8 @@  static int trace_wakeup_test_thread(void *data)
 	set_current_state(TASK_INTERRUPTIBLE);
 	schedule();
 
+	complete(x);
+
 	/* we are awake, now wait to disappear */
 	while (!kthread_should_stop()) {
 		/*
@@ -1084,24 +1086,21 @@  trace_selftest_startup_wakeup(struct tracer *trace, struct trace_array *tr)
 	/* reset the max latency */
 	tracing_max_latency = 0;
 
-	/* sleep to let the RT thread sleep too */
-	msleep(100);
+	while (p->on_rq) {
+		/*
+		 * Sleep to make sure the RT thread is asleep too.
+		 * On virtual machines we can't rely on timings,
+		 * but we want to make sure this test still works.
+		 */
+		msleep(100);
+	}
 
-	/*
-	 * Yes this is slightly racy. It is possible that for some
-	 * strange reason that the RT thread we created, did not
-	 * call schedule for 100ms after doing the completion,
-	 * and we do a wakeup on a task that already is awake.
-	 * But that is extremely unlikely, and the worst thing that
-	 * happens in such a case, is that we disable tracing.
-	 * Honestly, if this race does happen something is horrible
-	 * wrong with the system.
-	 */
+	init_completion(&isrt);
 
 	wake_up_process(p);
 
-	/* give a little time to let the thread wake up */
-	msleep(100);
+	/* Wait for the task to wake up */
+	wait_for_completion(&isrt);
 
 	/* stop the tracing. */
 	tracing_stop();