diff mbox series

[RESEND,v2] clocksource/arm_arch_timer: fix a lockdep warning

Message ID 20181210135228.49751-1-cai@lca.pw (mailing list archive)
State New, archived
Headers show
Series [RESEND,v2] clocksource/arm_arch_timer: fix a lockdep warning | expand

Commit Message

Qian Cai Dec. 10, 2018, 1:52 p.m. UTC
Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
warning.

[    0.000000]  lockdep_assert_cpus_held+0x50/0x60
[    0.000000]  static_key_enable_cpuslocked+0x30/0xe8
[    0.000000]  arch_timer_check_ool_workaround+0x128/0x2d0
[    0.000000]  arch_timer_acpi_init+0x274/0x6ac
[    0.000000]  acpi_table_parse+0x1ac/0x218
[    0.000000]  __acpi_probe_device_table+0x164/0x1ec
[    0.000000]  timer_probe+0x1bc/0x254
[    0.000000]  time_init+0x44/0x98
[    0.000000]  start_kernel+0x4ec/0x7d4

This is due to the commit cb538267ea1e ("jump_label/lockdep: Assert we hold
the hotplug lock for _cpuslocked() operations").

Since it is applying a global workaround to all CPUs here, it did not hold
any CPU locks in this path.

arch_timer_acpi_init
  arch_timer_check_ool_workaround(ate_match_acpi_oem_info, table)
    arch_timer_enable_workaround(wa, local = false)
      for_each_possible_cpu()
	per_cpu()

There is also another path did not have any CPU lock.

time_init
  clocksource_probe
    arch_timer_of_init
      arch_timer_check_ool_workaround(ate_match_dt, np)
        arch_timer_enable_workaround(wa, local = false)

When hot-adding a CPU, it will go with a slightly different route.

arch_timer_starting_cpu
  __arch_timer_setup
    arch_timer_check_ool_workaround(ate_match_local_cap_id, NULL)
      arch_timer_enable_workaround(wa, local = true)
        __this_cpu_write()

Hence, deal with them differently.

Fixes: 450f9689f294 (clocksource/arm_arch_timer: Use
static_branch_enable_cpuslocked())
Signed-off-by: Qian Cai <cai@lca.pw>
---

v2: fix the root cause instead of a workaround.

 drivers/clocksource/arm_arch_timer.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

Comments

Peter Zijlstra Dec. 10, 2018, 2:07 p.m. UTC | #1
On Mon, Dec 10, 2018 at 08:52:28AM -0500, Qian Cai wrote:
> Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
> warning.
> 
> [    0.000000]  lockdep_assert_cpus_held+0x50/0x60
> [    0.000000]  static_key_enable_cpuslocked+0x30/0xe8
> [    0.000000]  arch_timer_check_ool_workaround+0x128/0x2d0
> [    0.000000]  arch_timer_acpi_init+0x274/0x6ac
> [    0.000000]  acpi_table_parse+0x1ac/0x218
> [    0.000000]  __acpi_probe_device_table+0x164/0x1ec
> [    0.000000]  timer_probe+0x1bc/0x254
> [    0.000000]  time_init+0x44/0x98
> [    0.000000]  start_kernel+0x4ec/0x7d4

It seems to me we want something like:

---
 kernel/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 91d5c38eb7e5..e1ee8caf28b5 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -313,6 +313,8 @@ void cpus_write_unlock(void)
 
 void lockdep_assert_cpus_held(void)
 {
+	if (system_state < SYSTEM_RUNNING)
+		return;
 	percpu_rwsem_assert_held(&cpu_hotplug_lock);
 }
Valentin Schneider Dec. 10, 2018, 2:19 p.m. UTC | #2
Hi,

On 10/12/2018 14:07, Peter Zijlstra wrote:
[...]
> ---
>  kernel/cpu.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 91d5c38eb7e5..e1ee8caf28b5 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -313,6 +313,8 @@ void cpus_write_unlock(void)
>  
>  void lockdep_assert_cpus_held(void)
>  {
> +	if (system_state < SYSTEM_RUNNING)
> +		return;
>  	percpu_rwsem_assert_held(&cpu_hotplug_lock);
>  }
>  

Kinda hijacking the thread here - is that something we'd want to replace 

    40fa3780bac2 ("sched/core: Take the hotplug lock in sched_init_smp()")

with?
Peter Zijlstra Dec. 10, 2018, 2:47 p.m. UTC | #3
On Mon, Dec 10, 2018 at 02:19:53PM +0000, Valentin Schneider wrote:
> Hi,
> 
> On 10/12/2018 14:07, Peter Zijlstra wrote:
> [...]
> > ---
> >  kernel/cpu.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 91d5c38eb7e5..e1ee8caf28b5 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -313,6 +313,8 @@ void cpus_write_unlock(void)
> >  
> >  void lockdep_assert_cpus_held(void)
> >  {
> > +	if (system_state < SYSTEM_RUNNING)
> > +		return;
> >  	percpu_rwsem_assert_held(&cpu_hotplug_lock);
> >  }
> >  
> 
> Kinda hijacking the thread here - is that something we'd want to replace 
> 
>     40fa3780bac2 ("sched/core: Take the hotplug lock in sched_init_smp()")
> 
> with?
>  

Possibly; and I have vague memories of other people running into this
same thing too.
Marc Zyngier Dec. 10, 2018, 2:48 p.m. UTC | #4
On 10/12/2018 13:52, Qian Cai wrote:
> Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
> warning.
> 
> [    0.000000]  lockdep_assert_cpus_held+0x50/0x60
> [    0.000000]  static_key_enable_cpuslocked+0x30/0xe8
> [    0.000000]  arch_timer_check_ool_workaround+0x128/0x2d0
> [    0.000000]  arch_timer_acpi_init+0x274/0x6ac
> [    0.000000]  acpi_table_parse+0x1ac/0x218
> [    0.000000]  __acpi_probe_device_table+0x164/0x1ec
> [    0.000000]  timer_probe+0x1bc/0x254
> [    0.000000]  time_init+0x44/0x98
> [    0.000000]  start_kernel+0x4ec/0x7d4
> 
> This is due to the commit cb538267ea1e ("jump_label/lockdep: Assert we hold
> the hotplug lock for _cpuslocked() operations").
> 
> Since it is applying a global workaround to all CPUs here, it did not hold
> any CPU locks in this path.
> 
> arch_timer_acpi_init
>   arch_timer_check_ool_workaround(ate_match_acpi_oem_info, table)
>     arch_timer_enable_workaround(wa, local = false)
>       for_each_possible_cpu()
> 	per_cpu()
> 
> There is also another path did not have any CPU lock.
> 
> time_init
>   clocksource_probe
>     arch_timer_of_init
>       arch_timer_check_ool_workaround(ate_match_dt, np)
>         arch_timer_enable_workaround(wa, local = false)
> 
> When hot-adding a CPU, it will go with a slightly different route.
> 
> arch_timer_starting_cpu
>   __arch_timer_setup
>     arch_timer_check_ool_workaround(ate_match_local_cap_id, NULL)
>       arch_timer_enable_workaround(wa, local = true)
>         __this_cpu_write()
> 
> Hence, deal with them differently.
> 
> Fixes: 450f9689f294 (clocksource/arm_arch_timer: Use
> static_branch_enable_cpuslocked())
> Signed-off-by: Qian Cai <cai@lca.pw>
> ---
> 
> v2: fix the root cause instead of a workaround.
> 
>  drivers/clocksource/arm_arch_timer.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index 9a7d4dc00b6e..81dca7d31d13 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -492,17 +492,20 @@ void arch_timer_enable_workaround(const struct arch_timer_erratum_workaround *wa
>  
>  	if (local) {
>  		__this_cpu_write(timer_unstable_counter_workaround, wa);
> +
> +		/*
> +		 * Use the locked version, as we're called from the CPU
> +		 * hotplug framework. Otherwise, we end-up in
> +		 * deadlock-land.
> +		 */
> +		static_branch_enable_cpuslocked(&arch_timer_read_ool_enabled);

I have the ugly feeling that it breaks the (equally ugly) big-little
stuff where the boot CPU is affected. In this context, you'd end-up
without the lock being taken and you'll get the same splat.

We could start testing the CPU number, but Peter's approach seem much
more palatable to me.

Thanks,

	M.
Qian Cai Dec. 10, 2018, 3:47 p.m. UTC | #5
On 12/10/18 9:07 AM, Peter Zijlstra wrote:
> On Mon, Dec 10, 2018 at 08:52:28AM -0500, Qian Cai wrote:
>> Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
>> warning.
>>
>> [    0.000000]  lockdep_assert_cpus_held+0x50/0x60
>> [    0.000000]  static_key_enable_cpuslocked+0x30/0xe8
>> [    0.000000]  arch_timer_check_ool_workaround+0x128/0x2d0
>> [    0.000000]  arch_timer_acpi_init+0x274/0x6ac
>> [    0.000000]  acpi_table_parse+0x1ac/0x218
>> [    0.000000]  __acpi_probe_device_table+0x164/0x1ec
>> [    0.000000]  timer_probe+0x1bc/0x254
>> [    0.000000]  time_init+0x44/0x98
>> [    0.000000]  start_kernel+0x4ec/0x7d4
> 
> It seems to me we want something like:

Tested-by: Qian Cai <cai@lca.pw>

> 
> ---
>  kernel/cpu.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 91d5c38eb7e5..e1ee8caf28b5 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -313,6 +313,8 @@ void cpus_write_unlock(void)
>  
>  void lockdep_assert_cpus_held(void)
>  {
> +	if (system_state < SYSTEM_RUNNING)
> +		return;
>  	percpu_rwsem_assert_held(&cpu_hotplug_lock);
>  }
>  
>
Thomas Gleixner Dec. 10, 2018, 9:31 p.m. UTC | #6
On Mon, 10 Dec 2018, Peter Zijlstra wrote:
> On Mon, Dec 10, 2018 at 08:52:28AM -0500, Qian Cai wrote:
> > Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
> > warning.
> > 
> > [    0.000000]  lockdep_assert_cpus_held+0x50/0x60
> > [    0.000000]  static_key_enable_cpuslocked+0x30/0xe8
> > [    0.000000]  arch_timer_check_ool_workaround+0x128/0x2d0
> > [    0.000000]  arch_timer_acpi_init+0x274/0x6ac
> > [    0.000000]  acpi_table_parse+0x1ac/0x218
> > [    0.000000]  __acpi_probe_device_table+0x164/0x1ec
> > [    0.000000]  timer_probe+0x1bc/0x254
> > [    0.000000]  time_init+0x44/0x98
> > [    0.000000]  start_kernel+0x4ec/0x7d4
> 
> It seems to me we want something like:
> 
> ---
>  kernel/cpu.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 91d5c38eb7e5..e1ee8caf28b5 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -313,6 +313,8 @@ void cpus_write_unlock(void)
>  
>  void lockdep_assert_cpus_held(void)
>  {
> +	if (system_state < SYSTEM_RUNNING)
> +		return;
>  	percpu_rwsem_assert_held(&cpu_hotplug_lock);
>  }
 
Hmm, no. SYSTEM_SCHEDULING is what you want because RUNNING is set way too
late.

Thanks,

	tglx
Qian Cai Dec. 14, 2018, 3:05 p.m. UTC | #7
On 12/10/18 4:31 PM, Thomas Gleixner wrote:
> On Mon, 10 Dec 2018, Peter Zijlstra wrote:
>> On Mon, Dec 10, 2018 at 08:52:28AM -0500, Qian Cai wrote:
>>> Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
>>> warning.
>>>
>>> [    0.000000]  lockdep_assert_cpus_held+0x50/0x60
>>> [    0.000000]  static_key_enable_cpuslocked+0x30/0xe8
>>> [    0.000000]  arch_timer_check_ool_workaround+0x128/0x2d0
>>> [    0.000000]  arch_timer_acpi_init+0x274/0x6ac
>>> [    0.000000]  acpi_table_parse+0x1ac/0x218
>>> [    0.000000]  __acpi_probe_device_table+0x164/0x1ec
>>> [    0.000000]  timer_probe+0x1bc/0x254
>>> [    0.000000]  time_init+0x44/0x98
>>> [    0.000000]  start_kernel+0x4ec/0x7d4
>>
>> It seems to me we want something like:
>>
>> ---
>>  kernel/cpu.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/kernel/cpu.c b/kernel/cpu.c
>> index 91d5c38eb7e5..e1ee8caf28b5 100644
>> --- a/kernel/cpu.c
>> +++ b/kernel/cpu.c
>> @@ -313,6 +313,8 @@ void cpus_write_unlock(void)
>>  
>>  void lockdep_assert_cpus_held(void)
>>  {
>> +	if (system_state < SYSTEM_RUNNING)
>> +		return;
>>  	percpu_rwsem_assert_held(&cpu_hotplug_lock);
>>  }
>  
> Hmm, no. SYSTEM_SCHEDULING is what you want because RUNNING is set way too
> late.

SYSTEM_SCHEDULING works well here too.
diff mbox series

Patch

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 9a7d4dc00b6e..81dca7d31d13 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -492,17 +492,20 @@  void arch_timer_enable_workaround(const struct arch_timer_erratum_workaround *wa
 
 	if (local) {
 		__this_cpu_write(timer_unstable_counter_workaround, wa);
+
+		/*
+		 * Use the locked version, as we're called from the CPU
+		 * hotplug framework. Otherwise, we end-up in
+		 * deadlock-land.
+		 */
+		static_branch_enable_cpuslocked(&arch_timer_read_ool_enabled);
 	} else {
 		for_each_possible_cpu(i)
 			per_cpu(timer_unstable_counter_workaround, i) = wa;
+		/* A global workaround is not on the CPU hotplug path. */
+		static_branch_enable(&arch_timer_read_ool_enabled);
 	}
 
-	/*
-	 * Use the locked version, as we're called from the CPU
-	 * hotplug framework. Otherwise, we end-up in deadlock-land.
-	 */
-	static_branch_enable_cpuslocked(&arch_timer_read_ool_enabled);
-
 	/*
 	 * Don't use the vdso fastpath if errata require using the
 	 * out-of-line counter accessor. We may change our mind pretty