diff mbox series

[v2] timekeeping: Allow runtime PM from change_clocksource()

Message ID 20210211134318.323910-1-niklas.soderlund+renesas@ragnatech.se (mailing list archive)
State Mainlined
Commit d4c7c28806616809e3baa0b7cd8c665516b2726d
Delegated to: Geert Uytterhoeven
Headers show
Series [v2] timekeeping: Allow runtime PM from change_clocksource() | expand

Commit Message

Niklas Söderlund Feb. 11, 2021, 1:43 p.m. UTC
The struct clocksource callbacks enable() and disable() are described as
a way to allow clock sources to enter a power save mode [1]. But using
runtime PM from these callbacks triggers a cyclic lockdep warning when
switching clock source using change_clocksource().

This change allows the new clocksource to be enabled() and the old one
to be disabled() without holding the timekeeper_lock. This solution is
modeled on timekeeping_resume() and timekeeping_suspend() where the
struct clocksource resume() and suspend() callbacks are called without
holding the timekeeper_lock.

Triggering and log of the deadlock warning,

  # echo e60f0000.timer > /sys/devices/system/clocksource/clocksource0/current_clocksource
  [  118.081095] ======================================================
  [  118.087455] WARNING: possible circular locking dependency detected
  [  118.093821] 5.10.0-rc6-arm64-renesas-00002-ga0cf8106e584 #37 Not tainted
  [  118.100712] ------------------------------------------------------
  [  118.107069] migration/0/11 is trying to acquire lock:
  [  118.112269] ffff0000403ed220 (&dev->power.lock){-...}-{2:2}, at: __pm_runtime_resume+0x40/0x74
  [  118.121172]
  [  118.121172] but task is already holding lock:
  [  118.127170] ffff8000113c8f88 (tk_core.seq.seqcount){----}-{0:0}, at: multi_cpu_stop+0xa4/0x190
  [  118.136061]
  [  118.136061] which lock already depends on the new lock.
  [  118.136061]
  [  118.144461]
  [  118.144461] the existing dependency chain (in reverse order) is:
  [  118.152149]
  [  118.152149] -> #2 (tk_core.seq.seqcount){----}-{0:0}:
  [  118.158900]        lock_acquire.part.0+0x120/0x330
  [  118.163834]        lock_acquire+0x64/0x80
  [  118.167971]        seqcount_lockdep_reader_access.constprop.0+0x74/0x100
  [  118.174865]        ktime_get+0x28/0xa0
  [  118.178729]        hrtimer_start_range_ns+0x210/0x2dc
  [  118.183934]        generic_sched_clock_init+0x70/0x88
  [  118.189134]        sched_clock_init+0x40/0x64
  [  118.193622]        start_kernel+0x494/0x524
  [  118.197926]
  [  118.197926] -> #1 (hrtimer_bases.lock){-.-.}-{2:2}:
  [  118.204491]        lock_acquire.part.0+0x120/0x330
  [  118.209424]        lock_acquire+0x64/0x80
  [  118.213557]        _raw_spin_lock_irqsave+0x7c/0xc4
  [  118.218579]        hrtimer_start_range_ns+0x68/0x2dc
  [  118.223690]        rpm_suspend+0x308/0x5dc
  [  118.227909]        rpm_idle+0xc4/0x2a4
  [  118.231771]        pm_runtime_work+0x98/0xc0
  [  118.236171]        process_one_work+0x294/0x6f0
  [  118.240836]        worker_thread+0x70/0x45c
  [  118.245143]        kthread+0x154/0x160
  [  118.249007]        ret_from_fork+0x10/0x20
  [  118.253222]
  [  118.253222] -> #0 (&dev->power.lock){-...}-{2:2}:
  [  118.259607]        check_noncircular+0x128/0x140
  [  118.268774]        __lock_acquire+0x13b0/0x204c
  [  118.277780]        lock_acquire.part.0+0x120/0x330
  [  118.287001]        lock_acquire+0x64/0x80
  [  118.295375]        _raw_spin_lock_irqsave+0x7c/0xc4
  [  118.304623]        __pm_runtime_resume+0x40/0x74
  [  118.313644]        sh_cmt_start+0x1c4/0x260
  [  118.322275]        sh_cmt_clocksource_enable+0x28/0x50
  [  118.331891]        change_clocksource+0x9c/0x160
  [  118.340910]        multi_cpu_stop+0xa4/0x190
  [  118.349522]        cpu_stopper_thread+0x90/0x154
  [  118.358429]        smpboot_thread_fn+0x244/0x270
  [  118.367265]        kthread+0x154/0x160
  [  118.375158]        ret_from_fork+0x10/0x20
  [  118.383284]
  [  118.383284] other info that might help us debug this:
  [  118.383284]
  [  118.402810] Chain exists of:
  [  118.402810]   &dev->power.lock --> hrtimer_bases.lock --> tk_core.seq.seqcount
  [  118.402810]
  [  118.425597]  Possible unsafe locking scenario:
  [  118.425597]
  [  118.439130]        CPU0                    CPU1
  [  118.447413]        ----                    ----
  [  118.455641]   lock(tk_core.seq.seqcount);
  [  118.463335]                                lock(hrtimer_bases.lock);
  [  118.473507]                                lock(tk_core.seq.seqcount);
  [  118.483787]   lock(&dev->power.lock);
  [  118.491120]
  [  118.491120]  *** DEADLOCK ***
  [  118.491120]
  [  118.507666] 2 locks held by migration/0/11:
  [  118.515424]  #0: ffff8000113c9278 (timekeeper_lock){-.-.}-{2:2}, at: change_clocksource+0x2c/0x160
  [  118.528257]  #1: ffff8000113c8f88 (tk_core.seq.seqcount){----}-{0:0}, at: multi_cpu_stop+0xa4/0x190
  [  118.541248]
  [  118.541248] stack backtrace:
  [  118.553226] CPU: 0 PID: 11 Comm: migration/0 Not tainted 5.10.0-rc6-arm64-renesas-00002-ga0cf8106e584 #37
  [  118.566923] Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
  [  118.579051] Call trace:
  [  118.585649]  dump_backtrace+0x0/0x190
  [  118.593505]  show_stack+0x14/0x30
  [  118.601001]  dump_stack+0xe8/0x130
  [  118.608567]  print_circular_bug+0x1f0/0x200
  [  118.616930]  check_noncircular+0x128/0x140
  [  118.625231]  __lock_acquire+0x13b0/0x204c
  [  118.633451]  lock_acquire.part.0+0x120/0x330
  [  118.641958]  lock_acquire+0x64/0x80
  [  118.649630]  _raw_spin_lock_irqsave+0x7c/0xc4
  [  118.658186]  __pm_runtime_resume+0x40/0x74
  [  118.666483]  sh_cmt_start+0x1c4/0x260
  [  118.674327]  sh_cmt_clocksource_enable+0x28/0x50
  [  118.683170]  change_clocksource+0x9c/0x160
  [  118.691460]  multi_cpu_stop+0xa4/0x190
  [  118.699384]  cpu_stopper_thread+0x90/0x154
  [  118.707672]  smpboot_thread_fn+0x244/0x270
  [  118.715961]  kthread+0x154/0x160
  [  118.723360]  ret_from_fork+0x10/0x20
  [  118.731465] clocksource: Switched to clocksource e60f0000.timer

1. commit 4614e6adafa2c5e6 ("clocksource: add enable() and disable() callbacks")

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
---
 kernel/time/timekeeping.c | 36 +++++++++++++++++++++++-------------
 1 file changed, 23 insertions(+), 13 deletions(-)

Comments

Wolfram Sang March 3, 2021, 11:04 a.m. UTC | #1
On Thu, Feb 11, 2021 at 02:43:18PM +0100, Niklas Söderlund wrote:
> The struct clocksource callbacks enable() and disable() are described as
> a way to allow clock sources to enter a power save mode [1]. But using
> runtime PM from these callbacks triggers a cyclic lockdep warning when
> switching clock source using change_clocksource().
> 
> This change allows the new clocksource to be enabled() and the old one
> to be disabled() without holding the timekeeper_lock. This solution is
> modeled on timekeeping_resume() and timekeeping_suspend() where the
> struct clocksource resume() and suspend() callbacks are called without
> holding the timekeeper_lock.
> 
> Triggering and log of the deadlock warning,
> 
>   # echo e60f0000.timer > /sys/devices/system/clocksource/clocksource0/current_clocksource

...

I had the same issue when running 'clocksource-switch' from the Kernel
selftests with a Renesas Falcon board using a R-Car V3U. This patch
fixed the deadlock warning, so:

Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
diff mbox series

Patch

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6aee5768c86ff7db..26ef4c9ef5c57081 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1427,35 +1427,45 @@  static void __timekeeping_set_tai_offset(struct timekeeper *tk, s32 tai_offset)
 static int change_clocksource(void *data)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
-	struct clocksource *new, *old;
+	struct clocksource *new, *old = NULL;
 	unsigned long flags;
+	bool change = false;
 
 	new = (struct clocksource *) data;
 
-	raw_spin_lock_irqsave(&timekeeper_lock, flags);
-	write_seqcount_begin(&tk_core.seq);
-
-	timekeeping_forward_now(tk);
 	/*
 	 * If the cs is in module, get a module reference. Succeeds
 	 * for built-in code (owner == NULL) as well.
 	 */
 	if (try_module_get(new->owner)) {
-		if (!new->enable || new->enable(new) == 0) {
-			old = tk->tkr_mono.clock;
-			tk_setup_internals(tk, new);
-			if (old->disable)
-				old->disable(old);
-			module_put(old->owner);
-		} else {
+		if (!new->enable || new->enable(new) == 0)
+			change = true;
+		else
 			module_put(new->owner);
-		}
 	}
+
+	raw_spin_lock_irqsave(&timekeeper_lock, flags);
+	write_seqcount_begin(&tk_core.seq);
+
+	timekeeping_forward_now(tk);
+
+	if (change) {
+		old = tk->tkr_mono.clock;
+		tk_setup_internals(tk, new);
+	}
+
 	timekeeping_update(tk, TK_CLEAR_NTP | TK_MIRROR | TK_CLOCK_WAS_SET);
 
 	write_seqcount_end(&tk_core.seq);
 	raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
 
+	if (old) {
+		if (old->disable)
+			old->disable(old);
+
+		module_put(old->owner);
+	}
+
 	return 0;
 }