diff mbox

[08/51] arm, hw-breakpoint: Fix CPU hotplug callback registration

Message ID 20140205220603.19080.80971.stgit@srivatsabhat.in.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Srivatsa S. Bhat Feb. 5, 2014, 10:06 p.m. UTC
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

	get_online_cpus();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	register_cpu_notifier(&foobar_cpu_notifier);

	put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

	cpu_maps_update_begin();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);

	cpu_maps_update_done();


Fix the hw-breakpoint code in arm by using this latter form of callback
registration.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 arch/arm/kernel/hw_breakpoint.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Will Deacon Feb. 6, 2014, 10:57 a.m. UTC | #1
Hi Srivatsa,

On Wed, Feb 05, 2014 at 10:06:04PM +0000, Srivatsa S. Bhat wrote:
> Subsystems that want to register CPU hotplug callbacks, as well as perform
> initialization for the CPUs that are already online, often do it as shown
> below:
> 
> 	get_online_cpus();
> 
> 	for_each_online_cpu(cpu)
> 		init_cpu(cpu);
> 
> 	register_cpu_notifier(&foobar_cpu_notifier);
> 
> 	put_online_cpus();
> 
> This is wrong, since it is prone to ABBA deadlocks involving the
> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> with CPU hotplug operations).

Hmm, the code in question (for this patch) runs from an arch_initcall. How
can you generate CPU hotplug operations at that stage?

> Instead, the correct and race-free way of performing the callback
> registration is:
> 
> 	cpu_maps_update_begin();
> 
> 	for_each_online_cpu(cpu)
> 		init_cpu(cpu);
> 
> 	/* Note the use of the double underscored version of the API */
> 	__register_cpu_notifier(&foobar_cpu_notifier);
> 
> 	cpu_maps_update_done();
> 
> 
> Fix the hw-breakpoint code in arm by using this latter form of callback
> registration.

I guess you introduce __register_cpu_notifier somewhere earlier in the
series, so it's best if you take this all via your tree.

Will
Srivatsa S. Bhat Feb. 6, 2014, 11:25 a.m. UTC | #2
Hi Will,

On 02/06/2014 04:27 PM, Will Deacon wrote:
> Hi Srivatsa,
> 
> On Wed, Feb 05, 2014 at 10:06:04PM +0000, Srivatsa S. Bhat wrote:
>> Subsystems that want to register CPU hotplug callbacks, as well as perform
>> initialization for the CPUs that are already online, often do it as shown
>> below:
>>
>> 	get_online_cpus();
>>
>> 	for_each_online_cpu(cpu)
>> 		init_cpu(cpu);
>>
>> 	register_cpu_notifier(&foobar_cpu_notifier);
>>
>> 	put_online_cpus();
>>
>> This is wrong, since it is prone to ABBA deadlocks involving the
>> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
>> with CPU hotplug operations).
> 
> Hmm, the code in question (for this patch) runs from an arch_initcall. How
> can you generate CPU hotplug operations at that stage?
> 

You are right - in today's design of the init sequence, CPU hotplug
operations can't be triggered at that time during boot.

However, there have been proposals to boot CPUs in parallel along with the
rest of the kernel initialization [1] (and that would need full synchronization
with CPU hotplug even at the initcall stage). Of course this needs a lot of
auditing and modifications to the CPU hotplug notifiers of various subsystems
to make them robust enough to handle the parallel boot; so its not going to
happen very soon. But I felt that it would be a good idea to ensure that we
get the locking/synchronization right, even if the registrations happen very
early during boot today.. you know, just to be on the safer side and also to
make the job easier for whoever that is, who tries to implement parallel
CPU booting again in the future ;-)

[1]. http://thread.gmane.org/gmane.linux.kernel/1246209


>> Instead, the correct and race-free way of performing the callback
>> registration is:
>>
>> 	cpu_maps_update_begin();
>>
>> 	for_each_online_cpu(cpu)
>> 		init_cpu(cpu);
>>
>> 	/* Note the use of the double underscored version of the API */
>> 	__register_cpu_notifier(&foobar_cpu_notifier);
>>
>> 	cpu_maps_update_done();
>>
>>
>> Fix the hw-breakpoint code in arm by using this latter form of callback
>> registration.
> 
> I guess you introduce __register_cpu_notifier somewhere earlier in the
> series,

Yes, patch 1 adds that API..

> so it's best if you take this all via your tree.
>

Hmm.. I'm not a maintainer myself, so I'm hoping that either Oleg or Rusty
or any of the other CPU hotplug maintainers (Thomas/Peter/Ingo) would be
willing to take these patches through their tree.

Regards,
Srivatsa S. Bhat
Srivatsa S. Bhat Feb. 6, 2014, 11:38 a.m. UTC | #3
On 02/06/2014 05:09 PM, Will Deacon wrote:
> On Thu, Feb 06, 2014 at 11:25:46AM +0000, Srivatsa S. Bhat wrote:
>> Hi Will,
> 
> Hello,
> 
>> On 02/06/2014 04:27 PM, Will Deacon wrote:
>>> On Wed, Feb 05, 2014 at 10:06:04PM +0000, Srivatsa S. Bhat wrote:
>>>> Subsystems that want to register CPU hotplug callbacks, as well as perform
>>>> initialization for the CPUs that are already online, often do it as shown
>>>> below:
>>>>
>>>> 	get_online_cpus();
>>>>
>>>> 	for_each_online_cpu(cpu)
>>>> 		init_cpu(cpu);
>>>>
>>>> 	register_cpu_notifier(&foobar_cpu_notifier);
>>>>
>>>> 	put_online_cpus();
>>>>
>>>> This is wrong, since it is prone to ABBA deadlocks involving the
>>>> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
>>>> with CPU hotplug operations).
>>>
>>> Hmm, the code in question (for this patch) runs from an arch_initcall. How
>>> can you generate CPU hotplug operations at that stage?
>>>
>>
>> You are right - in today's design of the init sequence, CPU hotplug
>> operations can't be triggered at that time during boot.
> 
> Phew, so we don't have a bug as the code stands today.

Yes, that's right.

> 
>> However, there have been proposals to boot CPUs in parallel along with the
>> rest of the kernel initialization [1] (and that would need full synchronization
>> with CPU hotplug even at the initcall stage). Of course this needs a lot of
>> auditing and modifications to the CPU hotplug notifiers of various subsystems
>> to make them robust enough to handle the parallel boot; so its not going to
>> happen very soon. But I felt that it would be a good idea to ensure that we
>> get the locking/synchronization right, even if the registrations happen very
>> early during boot today.. you know, just to be on the safer side and also to
>> make the job easier for whoever that is, who tries to implement parallel
>> CPU booting again in the future ;-)
>>
>> [1]. http://thread.gmane.org/gmane.linux.kernel/1246209
> 
> Makes sense, and this seems like a good start.
> 
>>> so it's best if you take this all via your tree.
>>>
>>
>> Hmm.. I'm not a maintainer myself, so I'm hoping that either Oleg or Rusty
>> or any of the other CPU hotplug maintainers (Thomas/Peter/Ingo) would be
>> willing to take these patches through their tree.
> 
> Well, you can have my ack for this patch:
> 
>   Acked-by: Will Deacon <will.deacon@arm.com>
>

Great! Thanks a lot Will!

Regards,
Srivatsa S. Bhat
Will Deacon Feb. 6, 2014, 11:39 a.m. UTC | #4
On Thu, Feb 06, 2014 at 11:25:46AM +0000, Srivatsa S. Bhat wrote:
> Hi Will,

Hello,

> On 02/06/2014 04:27 PM, Will Deacon wrote:
> > On Wed, Feb 05, 2014 at 10:06:04PM +0000, Srivatsa S. Bhat wrote:
> >> Subsystems that want to register CPU hotplug callbacks, as well as perform
> >> initialization for the CPUs that are already online, often do it as shown
> >> below:
> >>
> >> 	get_online_cpus();
> >>
> >> 	for_each_online_cpu(cpu)
> >> 		init_cpu(cpu);
> >>
> >> 	register_cpu_notifier(&foobar_cpu_notifier);
> >>
> >> 	put_online_cpus();
> >>
> >> This is wrong, since it is prone to ABBA deadlocks involving the
> >> cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
> >> with CPU hotplug operations).
> > 
> > Hmm, the code in question (for this patch) runs from an arch_initcall. How
> > can you generate CPU hotplug operations at that stage?
> > 
> 
> You are right - in today's design of the init sequence, CPU hotplug
> operations can't be triggered at that time during boot.

Phew, so we don't have a bug as the code stands today.

> However, there have been proposals to boot CPUs in parallel along with the
> rest of the kernel initialization [1] (and that would need full synchronization
> with CPU hotplug even at the initcall stage). Of course this needs a lot of
> auditing and modifications to the CPU hotplug notifiers of various subsystems
> to make them robust enough to handle the parallel boot; so its not going to
> happen very soon. But I felt that it would be a good idea to ensure that we
> get the locking/synchronization right, even if the registrations happen very
> early during boot today.. you know, just to be on the safer side and also to
> make the job easier for whoever that is, who tries to implement parallel
> CPU booting again in the future ;-)
> 
> [1]. http://thread.gmane.org/gmane.linux.kernel/1246209

Makes sense, and this seems like a good start.

> > so it's best if you take this all via your tree.
> >
> 
> Hmm.. I'm not a maintainer myself, so I'm hoping that either Oleg or Rusty
> or any of the other CPU hotplug maintainers (Thomas/Peter/Ingo) would be
> willing to take these patches through their tree.

Well, you can have my ack for this patch:

  Acked-by: Will Deacon <will.deacon@arm.com>

Cheers,

Will
diff mbox

Patch

diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
index 3d44660..eaa7fcf 100644
--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -1072,6 +1072,8 @@  static int __init arch_hw_breakpoint_init(void)
 	core_num_brps = get_num_brps();
 	core_num_wrps = get_num_wrps();
 
+	cpu_maps_update_begin();
+
 	/*
 	 * We need to tread carefully here because DBGSWENABLE may be
 	 * driven low on this core and there isn't an architected way to
@@ -1088,6 +1090,7 @@  static int __init arch_hw_breakpoint_init(void)
 	if (!cpumask_empty(&debug_err_mask)) {
 		core_num_brps = 0;
 		core_num_wrps = 0;
+		cpu_maps_update_done();
 		return 0;
 	}
 
@@ -1107,7 +1110,10 @@  static int __init arch_hw_breakpoint_init(void)
 			TRAP_HWBKPT, "breakpoint debug exception");
 
 	/* Register hotplug and PM notifiers. */
-	register_cpu_notifier(&dbg_reset_nb);
+	__register_cpu_notifier(&dbg_reset_nb);
+
+	cpu_maps_update_done();
+
 	pm_init();
 	return 0;
 }