diff mbox series

[v1] drivers/acpi/scan.c: Fixup "acquire device_hotplug_lock in acpi_scan_init()"

Message ID 20190731123201.13893-1-david@redhat.com (mailing list archive)
State New, archived
Headers show
Series [v1] drivers/acpi/scan.c: Fixup "acquire device_hotplug_lock in acpi_scan_init()" | expand

Commit Message

David Hildenbrand July 31, 2019, 12:32 p.m. UTC
Let's document why we take the lock here. If we're going to overhaul
memory hotplug locking, we'll have to touch many places - this comment
will help to clairfy why it was added here.

Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/acpi/scan.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Michal Hocko July 31, 2019, 12:53 p.m. UTC | #1
On Wed 31-07-19 14:32:01, David Hildenbrand wrote:
> Let's document why we take the lock here. If we're going to overhaul
> memory hotplug locking, we'll have to touch many places - this comment
> will help to clairfy why it was added here.

And how exactly is "lock for consistency" comment going to help the poor
soul touching that code? How do people know that it is safe to remove it?
I am not going to repeat my arguments how/why I hate "locking for
consistency" (or fun or whatever but a real synchronization reasons)
but if you want to help then just explicitly state what should done to
remove this lock.

> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  drivers/acpi/scan.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index cbc9d64b48dd..9193f1d46148 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -2204,6 +2204,11 @@ int __init acpi_scan_init(void)
>  	acpi_gpe_apply_masked_gpes();
>  	acpi_update_all_gpes();
>  
> +	/*
> +	 * We end up calling __add_memory(), which expects the
> +	 * device_hotplug_lock to be held. Races with userspace and other
> +	 * hotplug activities are not really possible - lock for consistency.
> +	 */
>  	lock_device_hotplug();
>  	mutex_lock(&acpi_scan_lock);
>  
> -- 
> 2.21.0
David Hildenbrand July 31, 2019, 1:02 p.m. UTC | #2
On 31.07.19 14:53, Michal Hocko wrote:
> On Wed 31-07-19 14:32:01, David Hildenbrand wrote:
>> Let's document why we take the lock here. If we're going to overhaul
>> memory hotplug locking, we'll have to touch many places - this comment
>> will help to clairfy why it was added here.
> 
> And how exactly is "lock for consistency" comment going to help the poor
> soul touching that code? How do people know that it is safe to remove it?
> I am not going to repeat my arguments how/why I hate "locking for
> consistency" (or fun or whatever but a real synchronization reasons)
> but if you want to help then just explicitly state what should done to
> remove this lock.
> 

I know that you have a different opinion here. To remove the lock,
add_memory() locking has to be changed *completely* to the point where
we can drop the lock from the documentation of the function (*whoever
knows what we have to exactly change* - and I don't have time to do that
*right now*).
Michal Hocko July 31, 2019, 1:14 p.m. UTC | #3
On Wed 31-07-19 15:02:49, David Hildenbrand wrote:
> On 31.07.19 14:53, Michal Hocko wrote:
> > On Wed 31-07-19 14:32:01, David Hildenbrand wrote:
> >> Let's document why we take the lock here. If we're going to overhaul
> >> memory hotplug locking, we'll have to touch many places - this comment
> >> will help to clairfy why it was added here.
> > 
> > And how exactly is "lock for consistency" comment going to help the poor
> > soul touching that code? How do people know that it is safe to remove it?
> > I am not going to repeat my arguments how/why I hate "locking for
> > consistency" (or fun or whatever but a real synchronization reasons)
> > but if you want to help then just explicitly state what should done to
> > remove this lock.
> > 
> 
> I know that you have a different opinion here. To remove the lock,
> add_memory() locking has to be changed *completely* to the point where
> we can drop the lock from the documentation of the function (*whoever
> knows what we have to exactly change* - and I don't have time to do that
> *right now*).

Not really. To remove a lock in this particular path it would be
sufficient to add
	/*
	 * Although __add_memory used down the road is documented to
	 * require lock_device_hotplug, it is not necessary here because
	 * this is an early code when userspace or any other code path
	 * cannot trigger hotplug operations.
	 */

Now that is a useful comment because it documents an exception and gives
you reasoning. If the above statement ever turns out to be incorrect due
to later changes then you can replace it with the lock and the new
reasoning.

But "just for consistency argument" doesn't tell you much when
scratching your head in the future and trying to figure out whether that
consistency argument still applies or there are new reasons the lock is
still needed.
David Hildenbrand July 31, 2019, 1:18 p.m. UTC | #4
On 31.07.19 15:14, Michal Hocko wrote:
> On Wed 31-07-19 15:02:49, David Hildenbrand wrote:
>> On 31.07.19 14:53, Michal Hocko wrote:
>>> On Wed 31-07-19 14:32:01, David Hildenbrand wrote:
>>>> Let's document why we take the lock here. If we're going to overhaul
>>>> memory hotplug locking, we'll have to touch many places - this comment
>>>> will help to clairfy why it was added here.
>>>
>>> And how exactly is "lock for consistency" comment going to help the poor
>>> soul touching that code? How do people know that it is safe to remove it?
>>> I am not going to repeat my arguments how/why I hate "locking for
>>> consistency" (or fun or whatever but a real synchronization reasons)
>>> but if you want to help then just explicitly state what should done to
>>> remove this lock.
>>>
>>
>> I know that you have a different opinion here. To remove the lock,
>> add_memory() locking has to be changed *completely* to the point where
>> we can drop the lock from the documentation of the function (*whoever
>> knows what we have to exactly change* - and I don't have time to do that
>> *right now*).
> 
> Not really. To remove a lock in this particular path it would be
> sufficient to add
> 	/*
> 	 * Although __add_memory used down the road is documented to
> 	 * require lock_device_hotplug, it is not necessary here because
> 	 * this is an early code when userspace or any other code path
> 	 * cannot trigger hotplug operations.
> 	 */

Okay, let me phrase it like this: Are you 100% (!) sure that we don't
need the lock here. I am not -  I only know what I documented back then
and what I found out - could be that we are forgetting something else
the lock protects.

As I already said, I am fine with adding such a comment instead. But I
am not convinced that the absence of the lock is 100% safe. (I am 99.99%
sure ;) ).
Michal Hocko July 31, 2019, 1:33 p.m. UTC | #5
On Wed 31-07-19 15:18:42, David Hildenbrand wrote:
> On 31.07.19 15:14, Michal Hocko wrote:
> > On Wed 31-07-19 15:02:49, David Hildenbrand wrote:
> >> On 31.07.19 14:53, Michal Hocko wrote:
> >>> On Wed 31-07-19 14:32:01, David Hildenbrand wrote:
> >>>> Let's document why we take the lock here. If we're going to overhaul
> >>>> memory hotplug locking, we'll have to touch many places - this comment
> >>>> will help to clairfy why it was added here.
> >>>
> >>> And how exactly is "lock for consistency" comment going to help the poor
> >>> soul touching that code? How do people know that it is safe to remove it?
> >>> I am not going to repeat my arguments how/why I hate "locking for
> >>> consistency" (or fun or whatever but a real synchronization reasons)
> >>> but if you want to help then just explicitly state what should done to
> >>> remove this lock.
> >>>
> >>
> >> I know that you have a different opinion here. To remove the lock,
> >> add_memory() locking has to be changed *completely* to the point where
> >> we can drop the lock from the documentation of the function (*whoever
> >> knows what we have to exactly change* - and I don't have time to do that
> >> *right now*).
> > 
> > Not really. To remove a lock in this particular path it would be
> > sufficient to add
> > 	/*
> > 	 * Although __add_memory used down the road is documented to
> > 	 * require lock_device_hotplug, it is not necessary here because
> > 	 * this is an early code when userspace or any other code path
> > 	 * cannot trigger hotplug operations.
> > 	 */
> 
> Okay, let me phrase it like this: Are you 100% (!) sure that we don't
> need the lock here. I am not -  I only know what I documented back then
> and what I found out - could be that we are forgetting something else
> the lock protects.
> 
> As I already said, I am fine with adding such a comment instead. But I
> am not convinced that the absence of the lock is 100% safe. (I am 99.99%
> sure ;) ).

I am sorry but this is a shiny example of cargo cult programming. You do
not add locks just because you are not sure. Locks are protecting data
structures not code paths! If it is not clear what is actually protected
then that should be explored first before the lock is spread "just to be
sure"

Just look here. You have pushed that uncertainty to whoever is going
touch this code and guess what, they are going to follow that lead and
we are likely to grow the unjustified usage and any further changes will
be just harder. I have seen that pattern so many times that it is even
not funny. And that's why I pushed back here.

So let me repeat. If the lock is to stay then make sure that the comment
actually explains what has to be done to remove it because it is not
really required as of now.
David Hildenbrand July 31, 2019, 1:37 p.m. UTC | #6
On 31.07.19 15:33, Michal Hocko wrote:
> On Wed 31-07-19 15:18:42, David Hildenbrand wrote:
>> On 31.07.19 15:14, Michal Hocko wrote:
>>> On Wed 31-07-19 15:02:49, David Hildenbrand wrote:
>>>> On 31.07.19 14:53, Michal Hocko wrote:
>>>>> On Wed 31-07-19 14:32:01, David Hildenbrand wrote:
>>>>>> Let's document why we take the lock here. If we're going to overhaul
>>>>>> memory hotplug locking, we'll have to touch many places - this comment
>>>>>> will help to clairfy why it was added here.
>>>>>
>>>>> And how exactly is "lock for consistency" comment going to help the poor
>>>>> soul touching that code? How do people know that it is safe to remove it?
>>>>> I am not going to repeat my arguments how/why I hate "locking for
>>>>> consistency" (or fun or whatever but a real synchronization reasons)
>>>>> but if you want to help then just explicitly state what should done to
>>>>> remove this lock.
>>>>>
>>>>
>>>> I know that you have a different opinion here. To remove the lock,
>>>> add_memory() locking has to be changed *completely* to the point where
>>>> we can drop the lock from the documentation of the function (*whoever
>>>> knows what we have to exactly change* - and I don't have time to do that
>>>> *right now*).
>>>
>>> Not really. To remove a lock in this particular path it would be
>>> sufficient to add
>>> 	/*
>>> 	 * Although __add_memory used down the road is documented to
>>> 	 * require lock_device_hotplug, it is not necessary here because
>>> 	 * this is an early code when userspace or any other code path
>>> 	 * cannot trigger hotplug operations.
>>> 	 */
>>
>> Okay, let me phrase it like this: Are you 100% (!) sure that we don't
>> need the lock here. I am not -  I only know what I documented back then
>> and what I found out - could be that we are forgetting something else
>> the lock protects.
>>
>> As I already said, I am fine with adding such a comment instead. But I
>> am not convinced that the absence of the lock is 100% safe. (I am 99.99%
>> sure ;) ).
> 
> I am sorry but this is a shiny example of cargo cult programming. You do
> not add locks just because you are not sure. Locks are protecting data
> structures not code paths! If it is not clear what is actually protected
> then that should be explored first before the lock is spread "just to be
> sure"
> 
> Just look here. You have pushed that uncertainty to whoever is going
> touch this code and guess what, they are going to follow that lead and
> we are likely to grow the unjustified usage and any further changes will
> be just harder. I have seen that pattern so many times that it is even
> not funny. And that's why I pushed back here.
> 
> So let me repeat. If the lock is to stay then make sure that the comment
> actually explains what has to be done to remove it because it is not
> really required as of now.
> 

The other extreme I saw: People dropping locks because they think they
can be smart but end up making developers debug crashes for months (I am
not lying).

As I want to move on with this patch and have other stuff to work on, I
will adjust the comment you gave and add that instead of the lock.
Michal Hocko July 31, 2019, 1:44 p.m. UTC | #7
On Wed 31-07-19 15:37:56, David Hildenbrand wrote:
[...]
> The other extreme I saw: People dropping locks because they think they
> can be smart but end up making developers debug crashes for months (I am
> not lying).

Any lock removal should be accompanied with an explanation that is to be
a subject of the review. Sure people can make mistakes but there is no
way to around it I can see.
 
> As I want to move on with this patch and have other stuff to work on, I
> will adjust the comment you gave and add that instead of the lock.

Thanks!
diff mbox series

Patch

diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index cbc9d64b48dd..9193f1d46148 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -2204,6 +2204,11 @@  int __init acpi_scan_init(void)
 	acpi_gpe_apply_masked_gpes();
 	acpi_update_all_gpes();
 
+	/*
+	 * We end up calling __add_memory(), which expects the
+	 * device_hotplug_lock to be held. Races with userspace and other
+	 * hotplug activities are not really possible - lock for consistency.
+	 */
 	lock_device_hotplug();
 	mutex_lock(&acpi_scan_lock);