Regression in next with use printk_safe buffers in printk

Message ID 20170214165645.GB10321@tigerII.localdomain (mailing list archive)
State Not Applicable, archived
Commit Message

Sergey Senozhatsky Feb. 14, 2017, 4:56 p.m. UTC
On (02/14/17 17:18), Peter Zijlstra wrote:
> On Wed, Feb 15, 2017 at 01:01:40AM +0900, Sergey Senozhatsky wrote:
> > 
> > but I'm a bit confused by rt_b->rt_runtime_lock in this unsafe lock
> > scenario (so it's not ABBA, but ABAD)
> > 
> > >   lock(hrtimer_bases.lock);
> > >                                lock(&rt_b->rt_runtime_lock);
> > >                                lock(hrtimer_bases.lock);
> > >   lock(tk_core);
> > >
> > >
> > > Chain exists of:
> > >
> > >	tk_core --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock
> > 
> > 
> > I'm lacking some knowledge here, sorry. where does the tk_core --> &rt_b->rt_runtime_lock
> > come from?
> 
> rt_b->rt_runtime_lock is one of the scheduler locks, since we do
> printk() under tk_core, which does semaphore muck, which then includes
> the entire scheduler chain of locks.

thanks, Peter.

that crossed my mind, but I kinda assumed that we do printk() from
under tk_core using sched fair, and rt_runtime_lock is from sched rt.


so something like below, perhaps. would be helpful if Tony can test it.

(I'll send out this patch 'in a proper way' tomorrow, after some sleep,
it's 2am here).

8< ====

From e1755b0bf7f8a0be5fdf4dd7303bf4cd150d9d20 Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Date: Wed, 15 Feb 2017 01:42:18 +0900
Subject: [PATCH] time/timekeeping_debug: use printk_deferred()

Do not call printk() from tk_debug_account_sleep_time(), because
tk_debug_account_sleep_time() is called under tk_core seq lock.
It's not safe to call printk() under tk_core, because console_sem
invokes the scheduler (via wake_up_process()->activate_task()), which,
in turn, can call timekeeping code again, for instance, via
get_time()->ktime_get(). This may result in an infinite loop on
tk_core.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 kernel/time/timekeeping_debug.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Tony Lindgren Feb. 14, 2017, 5:03 p.m. UTC | #1
* Sergey Senozhatsky <sergey.senozhatsky@gmail.com> [170214 08:58]:
> On (02/14/17 17:18), Peter Zijlstra wrote:
> > On Wed, Feb 15, 2017 at 01:01:40AM +0900, Sergey Senozhatsky wrote:
> > > 
> > > but I'm a bit confused by rt_b->rt_runtime_lock in this unsafe lock
> > > scenario (so it's not ABBA, but ABAD)
> > > 
> > > >   lock(hrtimer_bases.lock);
> > > >                                lock(&rt_b->rt_runtime_lock);
> > > >                                lock(hrtimer_bases.lock);
> > > >   lock(tk_core);
> > > >
> > > >
> > > > Chain exists of:
> > > >
> > > >	tk_core --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock
> > > 
> > > 
> > > I'm lacking some knowledge here, sorry. where does the tk_core --> &rt_b->rt_runtime_lock
> > > come from?
> > 
> > rt_b->rt_runtime_lock is one of the scheduler locks, since we do
> > printk() under tk_core, which does semaphore muck, which then includes
> > the entire scheduler chain of locks.
> 
> thanks, Peter.
> 
> that crossed my mind, but I kinda assumed that we do printk() from
> under tk_core using sched fair, and rt_runtime_lock is from sched rt.
> 
> 
> so something like below, perhaps. would be helpful if Tony can test it.
> 
> (I'll send out this patch 'in a proper way' tomorrow, after some sleep,
> it's 2am here).
> 
> 8< ====
> 
> From e1755b0bf7f8a0be5fdf4dd7303bf4cd150d9d20 Mon Sep 17 00:00:00 2001
> From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Date: Wed, 15 Feb 2017 01:42:18 +0900
> Subject: [PATCH] time/timekeeping_debug: use printk_deferred()
> 
> Do not call printk() from tk_debug_account_sleep_time(), because
> tk_debug_account_sleep_time() is called under tk_core seq lock.
> It's not safe to call printk() under tk_core, because console_sem
> invokes the scheduler (via wake_up_process()->activate_task()), which,
> in turn, can call timekeeping code again, for instance, via
> get_time()->ktime_get(). This may result in an infinite loop on
> tk_core.
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

Thanks yeah this fixes the issue for me:

Tested-by: Tony Lindgren <tony@atomide.com>

> ---
>  kernel/time/timekeeping_debug.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/time/timekeeping_debug.c b/kernel/time/timekeeping_debug.c
> index ca9fb800336b..b8f7146c3538 100644
> --- a/kernel/time/timekeeping_debug.c
> +++ b/kernel/time/timekeeping_debug.c
> @@ -75,7 +75,8 @@ void tk_debug_account_sleep_time(struct timespec64 *t)
>  	int bin = min(fls(t->tv_sec), NUM_BINS-1);
>  
>  	sleep_time_bin[bin]++;
> -	pr_info("Suspended for %lld.%03lu seconds\n", (s64)t->tv_sec,
> +	printk_deferred(KERN_INFO "Suspended for %lld.%03lu seconds\n",
> +			(s64)t->tv_sec,
>  			t->tv_nsec / NSEC_PER_MSEC);
>  }
>  
> -- 
> 2.11.1
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra Feb. 14, 2017, 6:29 p.m. UTC | #2
On Wed, Feb 15, 2017 at 01:56:45AM +0900, Sergey Senozhatsky wrote:
> that crossed my mind, but I kinda assumed that we do printk() from
> under tk_core using sched fair, and rt_runtime_lock is from sched rt.

That's all true; lockdep doesn't care :-) All it knows is that at some
point those locks nest.
Sergey Senozhatsky Feb. 15, 2017, 4:44 a.m. UTC | #3
On (02/14/17 09:03), Tony Lindgren wrote:
[..]
> > Do not call printk() from tk_debug_account_sleep_time(), because
> > tk_debug_account_sleep_time() is called under tk_core seq lock.
> > It's not safe to call printk() under tk_core, because console_sem
> > invokes the scheduler (via wake_up_process()->activate_task()), which,
> > in turn, can call timekeeping code again, for instance, via
> > get_time()->ktime_get(). This may result in an infinite loop on
> > tk_core.
> > 
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> 
> Thanks yeah this fixes the issue for me:
> 
> Tested-by: Tony Lindgren <tony@atomide.com>

thanks.

	-ss
Sergey Senozhatsky Feb. 15, 2017, 4:49 a.m. UTC | #4
On (02/14/17 19:29), Peter Zijlstra wrote:
> On Wed, Feb 15, 2017 at 01:56:45AM +0900, Sergey Senozhatsky wrote:
> > that crossed my mind, but I kinda assumed that we do printk() from
> > under tk_core using sched fair, and rt_runtime_lock is from sched rt.
> 
> That's all true; lockdep doesn't care :-) All it knows is that at some
> point those locks nest.

thanks.

I think I'll get more familiar with the lockdep splats in
coming months :) but it's good (well, so far) that now we
keep lockdep enabled in printk.

	-ss
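
Peter's point that lockdep only knows "at some point those locks nest"
can be illustrated with a small userspace sketch. This is not lockdep's
actual implementation, only a toy model under stated assumptions: lock
classes are nodes, each observed "B taken while A held" is an edge, and a
new edge is refused if it would close a cycle. The enum, dep[][],
reachable() and record_dependency() are hypothetical names made up for
this example; only the lock names come from the splat quoted above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical lock classes, named after the locks in the splat. */
enum lock_class { TK_CORE, RT_RUNTIME_LOCK, HRTIMER_BASES_LOCK, NR_CLASSES };

static const char *const class_name[NR_CLASSES] = {
	"tk_core", "&rt_b->rt_runtime_lock", "hrtimer_bases.lock",
};

/* dep[a][b] is true once "b was taken while a was held" has been seen. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (dep[from][next] && !seen[next] &&
		    reachable((enum lock_class)next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "taken was acquired while held was held".
 * Returns false, without recording the edge, if 'held' is already
 * reachable from 'taken', i.e. the new edge would close a cycle.
 * Which task or scheduling class created the older edges is irrelevant.
 */
static bool record_dependency(enum lock_class held, enum lock_class taken)
{
	bool seen[NR_CLASSES] = { false };

	if (reachable(taken, held, seen)) {
		fprintf(stderr, "possible circular locking dependency: %s -> %s\n",
			class_name[held], class_name[taken]);
		return false;
	}
	dep[held][taken] = true;
	return true;
}
```

Feeding it the chain from the report, record_dependency(TK_CORE,
RT_RUNTIME_LOCK) and record_dependency(RT_RUNTIME_LOCK,
HRTIMER_BASES_LOCK) are accepted, while a later
record_dependency(HRTIMER_BASES_LOCK, TK_CORE) is rejected. The two
halves of the cycle never have to occur in the same code path.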
Patch

diff --git a/kernel/time/timekeeping_debug.c b/kernel/time/timekeeping_debug.c
index ca9fb800336b..b8f7146c3538 100644
--- a/kernel/time/timekeeping_debug.c
+++ b/kernel/time/timekeeping_debug.c
@@ -75,7 +75,8 @@  void tk_debug_account_sleep_time(struct timespec64 *t)
 	int bin = min(fls(t->tv_sec), NUM_BINS-1);
 
 	sleep_time_bin[bin]++;
-	pr_info("Suspended for %lld.%03lu seconds\n", (s64)t->tv_sec,
+	printk_deferred(KERN_INFO "Suspended for %lld.%03lu seconds\n",
+			(s64)t->tv_sec,
 			t->tv_nsec / NSEC_PER_MSEC);
 }
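
The idea behind the patch, as the commit message describes it, is to
avoid doing console work while inside the tk_core write section. A
minimal userspace sketch of that pattern follows. It is not kernel code
and does not reproduce how printk_deferred() is implemented; the sequence
counter, deferred_print() and flush_deferred() are hypothetical stand-ins.
As a simplification, the write section is opened inside
account_sleep_time() itself, whereas in the kernel the caller already
holds tk_core.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

static unsigned int seq;	/* even: stable, odd: write in progress */
static long long sleep_sec;	/* data protected by the sequence counter */

static char deferred_buf[128];	/* message parked during the write section */
static bool deferred_pending;

static void write_begin(void) { seq++; }
static void write_end(void)   { seq++; }

/* A reader would have to keep retrying for as long as this is true. */
static bool write_in_progress(void)
{
	return (seq & 1) != 0;
}

/* Only format and store the message; no output, no wakeups. */
static void deferred_print(const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf(deferred_buf, sizeof(deferred_buf), fmt, args);
	va_end(args);
	deferred_pending = true;
}

/*
 * Emit the parked message. Refuses to run inside a write section,
 * because anything it calls might read the protected data and spin.
 */
static bool flush_deferred(void)
{
	if (!deferred_pending || write_in_progress())
		return false;
	fputs(deferred_buf, stdout);
	deferred_pending = false;
	return true;
}

static void account_sleep_time(long long sec, long msec)
{
	write_begin();
	sleep_sec = sec;
	/* Safe here: nothing below can re-enter a reader of 'seq'. */
	deferred_print("Suspended for %lld.%03ld seconds\n", sec, msec);
	write_end();
}
```

A caller would run account_sleep_time() and then flush_deferred() once
the write section has ended. In this sketch, a flush attempted between
write_begin() and write_end() returns false and leaves the message
pending.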