diff mbox

[3/3] Thermal: do thermal zone update after a cooling device registered

Message ID 1443332924-14028-1-git-send-email-yu.c.chen@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Zhang Rui
Headers show

Commit Message

Chen Yu Sept. 27, 2015, 5:48 a.m. UTC
From: Zhang Rui <rui.zhang@intel.com>

When a new cooling device is registered, we need to update the
thermal zone to set the new registered cooling device to a proper
state.

This fixes a problem that the system is cool, while the fan devices
are left running on full speed after boot, if fan device is registered
after thermal zone device.

CC: <stable@vger.kernel.org> #3.18+
Reference:https://bugzilla.kernel.org/show_bug.cgi?id=92431
Tested-by: Manuel Krause <manuelkrause@netscape.net>
Tested-by: szegad <szegadlo@poczta.onet.pl>
Tested-by: prash <prash.n.rao@gmail.com>
Tested-by: amish <ammdispose-arch@yahoo.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 drivers/thermal/thermal_core.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Javi Merino Sept. 28, 2015, 2:29 p.m. UTC | #1
On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:
> From: Zhang Rui <rui.zhang@intel.com>
> 
> When a new cooling device is registered, we need to update the
> thermal zone to set the new registered cooling device to a proper
> state.
> 
> This fixes a problem that the system is cool, while the fan devices
> are left running on full speed after boot, if fan device is registered
> after thermal zone device.
> 
> CC: <stable@vger.kernel.org> #3.18+
> Reference:https://bugzilla.kernel.org/show_bug.cgi?id=92431
> Tested-by: Manuel Krause <manuelkrause@netscape.net>
> Tested-by: szegad <szegadlo@poczta.onet.pl>
> Tested-by: prash <prash.n.rao@gmail.com>
> Tested-by: amish <ammdispose-arch@yahoo.com>
> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> ---
>  drivers/thermal/thermal_core.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index c3bdb48..09c78a4 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -1450,6 +1450,7 @@ __thermal_cooling_device_register(struct device_node *np,
>  				  const struct thermal_cooling_device_ops *ops)
>  {
>  	struct thermal_cooling_device *cdev;
> +	struct thermal_instance *pos, *next;
>  	int result;
>  
>  	if (type && strlen(type) >= THERMAL_NAME_LENGTH)
> @@ -1494,6 +1495,15 @@ __thermal_cooling_device_register(struct device_node *np,
>  	/* Update binding information for 'this' new cdev */
>  	bind_cdev(cdev);
>  

I think you need to hold cdev->lock here, to make sure that no thermal
zone is added or removed from cdev->thermal_instances while you are looping.

> +	list_for_each_entry_safe(pos, next, &cdev->thermal_instances, cdev_node) {

Why list_for_each_entry_safe() ?  You are not going to remove any
entry, so you can just use list_for_each_entry()

> +			if (next->cdev_node.next == &cdev->thermal_instances) {
> +				thermal_zone_device_update(next->tz);
> +				break;
> +			}
> +			if (pos->tz != next->tz)
> +				thermal_zone_device_update(pos->tz);
> +	}

Why is this so complicated?  Can't you just do:

	list_for_each_entry(pos, &cdev->thermal_instances, cdev_node)
        	thermal_zone_device_update(pos->tz);

Cheers,
Javi
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chen Yu Sept. 28, 2015, 5:52 p.m. UTC | #2
Hi, Javi,

> -----Original Message-----

> From: Javi Merino [mailto:javi.merino@arm.com]

> Sent: Monday, September 28, 2015 10:29 PM

> To: Chen, Yu C

> Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui; linux-

> kernel@vger.kernel.org; stable@vger.kernel.org

> Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling

> device registered

> 

> On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:

> > From: Zhang Rui <rui.zhang@intel.com>

> >

> >

> 

> I think you need to hold cdev->lock here, to make sure that no thermal zone

> is added or removed from cdev->thermal_instances while you are looping.

> 

Ah right, will add. If I add the cdev ->lock here, will there be a AB-BA lock with 
thermal_zone_unbind_cooling_device?

> 

> Why list_for_each_entry_safe() ?  You are not going to remove any entry, so

> you can just use list_for_each_entry()

> 

> 

> Why is this so complicated?  Can't you just do:

> 

> 	list_for_each_entry(pos, &cdev->thermal_instances, cdev_node)

>         	thermal_zone_device_update(pos->tz);

> 


This is an optimization here:
Ignore thermal instance that refers to the same thermal zone in this loop,
this works because bind_cdev() always binds the cooling device to one 
thermal zone first, and then binds to the next thermal zone.


Best Regards,
Yu
Javi Merino Sept. 29, 2015, 4:01 p.m. UTC | #3
Hi Yu,

On Mon, Sep 28, 2015 at 06:52:00PM +0100, Chen, Yu C wrote:
> Hi, Javi,
> 
> > -----Original Message-----
> > From: Javi Merino [mailto:javi.merino@arm.com]
> > Sent: Monday, September 28, 2015 10:29 PM
> > To: Chen, Yu C
> > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui; linux-
> > kernel@vger.kernel.org; stable@vger.kernel.org
> > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling
> > device registered
> > 
> > On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:
> > > From: Zhang Rui <rui.zhang@intel.com>
> > >
> > >
> > 
> > I think you need to hold cdev->lock here, to make sure that no thermal zone
> > is added or removed from cdev->thermal_instances while you are looping.
> > 
> Ah right, will add. If I add the cdev ->lock here, will there be a AB-BA lock with 
> thermal_zone_unbind_cooling_device?

You're right, it could lead to a deadlock.  The locks can't be
swapped because that won't work in step_wise.

The best way that I can think of accessing thermal_instances
atomically is by making it RCU protected instead of with mutexes.
What do you think?


> > Why list_for_each_entry_safe() ?  You are not going to remove any entry, so
> > you can just use list_for_each_entry()
> > 
> > 
> > Why is this so complicated?  Can't you just do:
> > 
> > 	list_for_each_entry(pos, &cdev->thermal_instances, cdev_node)
> >         	thermal_zone_device_update(pos->tz);
> > 
> 
> This is an optimization here:
> Ignore thermal instance that refers to the same thermal zone in this loop,
> this works because bind_cdev() always binds the cooling device to one 
> thermal zone first, and then binds to the next thermal zone.

It has taken me a while to understand this optimization.  Please
document both "if"s in the code.  For the first "if" maybe you can use
list_is_last() to make it easier to understand that you're looking for
the last element in the list:

		if (list_is_last(&pos->cdev_node, &cdev->thermal_instances)) {
			thermal_zone_device_update(pos->tz);

For the second "if" you can say that you only need to run
thermal_zone_device_update() once per thermal zone, even though
multiple thermal instances may refer to the same thermal zone.

Cheers,
Javi
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chen Yu Oct. 12, 2015, 9:23 a.m. UTC | #4
Hi, Javi
Sorry for my late response,

> -----Original Message-----

> From: Javi Merino [mailto:javi.merino@arm.com]

> Sent: Wednesday, September 30, 2015 12:02 AM

> To: Chen, Yu C

> Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui; linux-

> kernel@vger.kernel.org; stable@vger.kernel.org

> Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling

> device registered

> 

> Hi Yu,

> 

> On Mon, Sep 28, 2015 at 06:52:00PM +0100, Chen, Yu C wrote:

> > Hi, Javi,

> >

> > > -----Original Message-----

> > > From: Javi Merino [mailto:javi.merino@arm.com]

> > > Sent: Monday, September 28, 2015 10:29 PM

> > > To: Chen, Yu C

> > > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui;

> > > linux- kernel@vger.kernel.org; stable@vger.kernel.org

> > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a

> > > cooling device registered

> > >

> > > On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:

> > > > From: Zhang Rui <rui.zhang@intel.com>

> > > >

> > > >

> > >

> > > I think you need to hold cdev->lock here, to make sure that no

> > > thermal zone is added or removed from cdev->thermal_instances while

> you are looping.

> > >

> > Ah right, will add. If I add the cdev ->lock here, will there be a

> > AB-BA lock with thermal_zone_unbind_cooling_device?

> 

> You're right, it could lead to a deadlock.  The locks can't be swapped because

> that won't work in step_wise.

> 

> The best way that I can think of accessing thermal_instances atomically is by

> making it RCU protected instead of with mutexes.

> What do you think?

> 

RCU would need extra spinlocks to protect the list, and need to sync_rcu after we delete
one instance from thermal_instance list,  I think it is too complicated for me to rewrite: (
How about using thermal_list_lock instead of cdev ->lock?
This guy should be big enough to protect the device.thermal_instance list.

> 

> > > Why list_for_each_entry_safe() ?  You are not going to remove any

> > > entry, so you can just use list_for_each_entry()

> > >

> > >

> > > Why is this so complicated?  Can't you just do:

> > >

> > > 	list_for_each_entry(pos, &cdev->thermal_instances, cdev_node)

> > >         	thermal_zone_device_update(pos->tz);

> > >

> >

> > This is an optimization here:

> > Ignore thermal instance that refers to the same thermal zone in this

> > loop, this works because bind_cdev() always binds the cooling device

> > to one thermal zone first, and then binds to the next thermal zone.

> 

> It has taken me a while to understand this optimization.  Please document

> both "if"s in the code.  For the first "if" maybe you can use

> list_is_last() to make it easier to understand that you're looking for the last

> element in the list:

> 

> 		if (list_is_last(&pos->cdev_node, &cdev-

> >thermal_instances)) {

> 			thermal_zone_device_update(pos->tz);

> 

Sure, ok
> For the second "if" you can say that you only need to run

> thermal_zone_device_update() once per thermal zone, even though

> multiple thermal instances may refer to the same thermal zone.

> 

OK


Best Regards,
Yu
Javi Merino Oct. 14, 2015, 5:07 p.m. UTC | #5
On Mon, Oct 12, 2015 at 09:23:28AM +0000, Chen, Yu C wrote:
> Hi, Javi
> Sorry for my late response,
> 
> > -----Original Message-----
> > From: Javi Merino [mailto:javi.merino@arm.com]
> > Sent: Wednesday, September 30, 2015 12:02 AM
> > To: Chen, Yu C
> > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui; linux-
> > kernel@vger.kernel.org; stable@vger.kernel.org
> > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling
> > device registered
> > 
> > Hi Yu,
> > 
> > On Mon, Sep 28, 2015 at 06:52:00PM +0100, Chen, Yu C wrote:
> > > Hi, Javi,
> > >
> > > > -----Original Message-----
> > > > From: Javi Merino [mailto:javi.merino@arm.com]
> > > > Sent: Monday, September 28, 2015 10:29 PM
> > > > To: Chen, Yu C
> > > > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui;
> > > > linux- kernel@vger.kernel.org; stable@vger.kernel.org
> > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a
> > > > cooling device registered
> > > >
> > > > On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:
> > > > > From: Zhang Rui <rui.zhang@intel.com>
> > > > >
> > > > >
> > > >
> > > > I think you need to hold cdev->lock here, to make sure that no
> > > > thermal zone is added or removed from cdev->thermal_instances while
> > you are looping.
> > > >
> > > Ah right, will add. If I add the cdev ->lock here, will there be a
> > > AB-BA lock with thermal_zone_unbind_cooling_device?
> > 
> > You're right, it could lead to a deadlock.  The locks can't be swapped because
> > that won't work in step_wise.
> > 
> > The best way that I can think of accessing thermal_instances atomically is by
> > making it RCU protected instead of with mutexes.
> > What do you think?
> > 
> RCU would need extra spinlocks to protect the list, and need to sync_rcu after we delete
> one instance from thermal_instance list,  I think it is too complicated for me to rewrite: (
> How about using thermal_list_lock instead of cdev ->lock?
> This guy should be big enough to protect the device.thermal_instance list.

thermal_list_lock protects thermal_tz_list and thermal_cdev_list, but
it doesn't protect the thermal_instances list.  For example,
thermal_zone_bind_cooling_device() adds a cooling device to the
cdev->thermal_instances list without taking thermal_tz_list.

To sum up, you have to protect accessing the cdev->thermal_instances
list but with the current locking scheme, you would create an AB-BA
deadlock.  As I see it you would have to change the locking scheme to
either RCU or add a new mutex that protects the
cdev->thermal_instances and tz->thermal_instances lists and change all
accesses to them to make sure they comply with the new locking scheme.

Is there a better way of solving this?  Cheers,
Javi


--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chen Yu Oct. 14, 2015, 7:21 p.m. UTC | #6
SGkgSmF2aSwNCg0KDQo+IC0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQo+IEZyb206IEphdmkg
TWVyaW5vIFttYWlsdG86amF2aS5tZXJpbm9AYXJtLmNvbV0NCj4gU2VudDogVGh1cnNkYXksIE9j
dG9iZXIgMTUsIDIwMTUgMTowOCBBTQ0KPiBUbzogQ2hlbiwgWXUgQw0KPiBDYzogbGludXgtcG1A
dmdlci5rZXJuZWwub3JnOyBlZHViZXp2YWxAZ21haWwuY29tOyBaaGFuZywgUnVpOyBsaW51eC0N
Cj4ga2VybmVsQHZnZXIua2VybmVsLm9yZzsgc3RhYmxlQHZnZXIua2VybmVsLm9yZzsgUGFuZHJ1
dmFkYSwgU3Jpbml2YXMNCj4gU3ViamVjdDogUmU6IFtQQVRDSCAzLzNdIFRoZXJtYWw6IGRvIHRo
ZXJtYWwgem9uZSB1cGRhdGUgYWZ0ZXIgYSBjb29saW5nDQo+IGRldmljZSByZWdpc3RlcmVkDQo+
IA0KPiBPbiBNb24sIE9jdCAxMiwgMjAxNSBhdCAwOToyMzoyOEFNICswMDAwLCBDaGVuLCBZdSBD
IHdyb3RlOg0KPiA+IEhpLCBKYXZpDQo+ID4gU29ycnkgZm9yIG15IGxhdGUgcmVzcG9uc2UsDQo+
ID4NCj4gPiA+IC0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQo+ID4gPiBGcm9tOiBKYXZpIE1l
cmlubyBbbWFpbHRvOmphdmkubWVyaW5vQGFybS5jb21dDQo+ID4gPiBTZW50OiBXZWRuZXNkYXks
IFNlcHRlbWJlciAzMCwgMjAxNSAxMjowMiBBTQ0KPiA+ID4gVG86IENoZW4sIFl1IEMNCj4gPiA+
IENjOiBsaW51eC1wbUB2Z2VyLmtlcm5lbC5vcmc7IGVkdWJlenZhbEBnbWFpbC5jb207IFpoYW5n
LCBSdWk7DQo+ID4gPiBsaW51eC0ga2VybmVsQHZnZXIua2VybmVsLm9yZzsgc3RhYmxlQHZnZXIu
a2VybmVsLm9yZw0KPiA+ID4gU3ViamVjdDogUmU6IFtQQVRDSCAzLzNdIFRoZXJtYWw6IGRvIHRo
ZXJtYWwgem9uZSB1cGRhdGUgYWZ0ZXIgYQ0KPiA+ID4gY29vbGluZyBkZXZpY2UgcmVnaXN0ZXJl
ZA0KPiA+ID4NCj4gPiA+IEhpIFl1LA0KPiA+ID4NCj4gPiA+IE9uIE1vbiwgU2VwIDI4LCAyMDE1
IGF0IDA2OjUyOjAwUE0gKzAxMDAsIENoZW4sIFl1IEMgd3JvdGU6DQo+ID4gPiA+IEhpLCBKYXZp
LA0KPiA+ID4gPg0KPiA+ID4gPiA+IC0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQo+ID4gPiA+
ID4gRnJvbTogSmF2aSBNZXJpbm8gW21haWx0bzpqYXZpLm1lcmlub0Bhcm0uY29tXQ0KPiA+ID4g
PiA+IFNlbnQ6IE1vbmRheSwgU2VwdGVtYmVyIDI4LCAyMDE1IDEwOjI5IFBNDQo+ID4gPiA+ID4g
VG86IENoZW4sIFl1IEMNCj4gPiA+ID4gPiBDYzogbGludXgtcG1Admdlci5rZXJuZWwub3JnOyBl
ZHViZXp2YWxAZ21haWwuY29tOyBaaGFuZywgUnVpOw0KPiA+ID4gPiA+IGxpbnV4LSBrZXJuZWxA
dmdlci5rZXJuZWwub3JnOyBzdGFibGVAdmdlci5rZXJuZWwub3JnDQo+ID4gPiA+ID4gU3ViamVj
dDogUmU6IFtQQVRDSCAzLzNdIFRoZXJtYWw6IGRvIHRoZXJtYWwgem9uZSB1cGRhdGUgYWZ0ZXIg
YQ0KPiA+ID4gPiA+IGNvb2xpbmcgZGV2aWNlIHJlZ2lzdGVyZWQNCj4gPiA+ID4gPg0KPiA+ID4g
PiA+IE9uIFN1biwgU2VwIDI3LCAyMDE1IGF0IDA2OjQ4OjQ0QU0gKzAxMDAsIENoZW4gWXUgd3Jv
dGU6DQo+ID4gPiA+ID4gPiBGcm9tOiBaaGFuZyBSdWkgPHJ1aS56aGFuZ0BpbnRlbC5jb20+DQo+
ID4gPiA+ID4gPg0KPiA+ID4gPiA+ID4NCj4gPiA+ID4gPg0KPiA+ID4gPiA+IEkgdGhpbmsgeW91
IG5lZWQgdG8gaG9sZCBjZGV2LT5sb2NrIGhlcmUsIHRvIG1ha2Ugc3VyZSB0aGF0IG5vDQo+ID4g
PiA+ID4gdGhlcm1hbCB6b25lIGlzIGFkZGVkIG9yIHJlbW92ZWQgZnJvbSBjZGV2LT50aGVybWFs
X2luc3RhbmNlcw0KPiA+ID4gPiA+IHdoaWxlDQo+ID4gPiB5b3UgYXJlIGxvb3BpbmcuDQo+ID4g
PiA+ID4NCj4gPiA+ID4gQWggcmlnaHQsIHdpbGwgYWRkLiBJZiBJIGFkZCB0aGUgY2RldiAtPmxv
Y2sgaGVyZSwgd2lsbCB0aGVyZSBiZSBhDQo+ID4gPiA+IEFCLUJBIGxvY2sgd2l0aCB0aGVybWFs
X3pvbmVfdW5iaW5kX2Nvb2xpbmdfZGV2aWNlPw0KPiA+ID4NCj4gPiA+IFlvdSdyZSByaWdodCwg
aXQgY291bGQgbGVhZCB0byBhIGRlYWRsb2NrLiAgVGhlIGxvY2tzIGNhbid0IGJlDQo+ID4gPiBz
d2FwcGVkIGJlY2F1c2UgdGhhdCB3b24ndCB3b3JrIGluIHN0ZXBfd2lzZS4NCj4gPiA+DQo+ID4g
PiBUaGUgYmVzdCB3YXkgdGhhdCBJIGNhbiB0aGluayBvZiBhY2Nlc3NpbmcgdGhlcm1hbF9pbnN0
YW5jZXMNCj4gPiA+IGF0b21pY2FsbHkgaXMgYnkgbWFraW5nIGl0IFJDVSBwcm90ZWN0ZWQgaW5z
dGVhZCBvZiB3aXRoIG11dGV4ZXMuDQo+ID4gPiBXaGF0IGRvIHlvdSB0aGluaz8NCj4gPiA+DQo+
ID4gUkNVIHdvdWxkIG5lZWQgZXh0cmEgc3BpbmxvY2tzIHRvIHByb3RlY3QgdGhlIGxpc3QsIGFu
ZCBuZWVkIHRvDQo+ID4gc3luY19yY3UgYWZ0ZXIgd2UgZGVsZXRlIG9uZSBpbnN0YW5jZSBmcm9t
IHRoZXJtYWxfaW5zdGFuY2UgbGlzdCwgIEkNCj4gPiB0aGluayBpdCBpcyB0b28gY29tcGxpY2F0
ZWQgZm9yIG1lIHRvIHJld3JpdGU6ICggSG93IGFib3V0IHVzaW5nDQo+IHRoZXJtYWxfbGlzdF9s
b2NrIGluc3RlYWQgb2YgY2RldiAtPmxvY2s/DQo+ID4gVGhpcyBndXkgc2hvdWxkIGJlIGJpZyBl
bm91Z2ggdG8gcHJvdGVjdCB0aGUgZGV2aWNlLnRoZXJtYWxfaW5zdGFuY2UgbGlzdC4NCj4gDQo+
IHRoZXJtYWxfbGlzdF9sb2NrIHByb3RlY3RzIHRoZXJtYWxfdHpfbGlzdCBhbmQgdGhlcm1hbF9j
ZGV2X2xpc3QsIGJ1dCBpdA0KPiBkb2Vzbid0IHByb3RlY3QgdGhlIHRoZXJtYWxfaW5zdGFuY2Vz
IGxpc3QuICBGb3IgZXhhbXBsZSwNCj4gdGhlcm1hbF96b25lX2JpbmRfY29vbGluZ19kZXZpY2Uo
KSBhZGRzIGEgY29vbGluZyBkZXZpY2UgdG8gdGhlDQo+IGNkZXYtPnRoZXJtYWxfaW5zdGFuY2Vz
IGxpc3Qgd2l0aG91dCB0YWtpbmcgdGhlcm1hbF90el9saXN0Lg0KPiANCg0KQmVmb3JlIHRoZXJt
YWxfem9uZV9iaW5kX2Nvb2xpbmdfZGV2aWNlIGlzIGludm9rZWQsIA0KdGhlIHRoZXJtYWxfbGlz
dF9sb2NrIHdpbGwgYmUgZmlyc3RseSBncmlwcGVkOg0KDQpzdGF0aWMgdm9pZCBiaW5kX2NkZXYo
c3RydWN0IHRoZXJtYWxfY29vbGluZ19kZXZpY2UgKmNkZXYpDQp7DQoNCm11dGV4X2xvY2soJnRo
ZXJtYWxfbGlzdF9sb2NrKTsNCg0KZWl0aGVyIHR6LT5vcHMtPmJpbmQgICAgOiAgIHRoZXJtYWxf
em9uZV9iaW5kX2Nvb2xpbmdfZGV2aWNlDQoNCm9yIF9fYmluZCgpICA6ICAgdGhlcm1hbF96b25l
X2JpbmRfY29vbGluZ19kZXZpY2UNCm11dGV4X3VubG9jaygmdGhlcm1hbF9saXN0X2xvY2spOw0K
DQp9DQoNCkFuZCBpdCBpcyB0aGUgc2FtZSBhcyBpbiAgcGFzc2l2ZV9zdG9yZS4NCg0KU28gd2hl
biBjb2RlIGlzIHRyeWluZyB0byBhZGQvZGVsZXRlIHRoZXJtYWxfaW5zdGFuY2Ugb2YgY2Rldiwg
aGUgaGFzDQphbHJlYWR5IGhvbGQgdGhlcm1hbF9saXN0X2xvY2sgSU1PLiBPciBkbyBJIG1pc3Mg
YW55dGhpbmc/DQoNCkJlc3QgUmVnYXJkcywNCll1DQoNCg0K
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chen Yu Oct. 14, 2015, 7:23 p.m. UTC | #7
Hi,Javi

> -----Original Message-----

> From: Javi Merino [mailto:javi.merino@arm.com]

> Sent: Thursday, October 15, 2015 1:08 AM

> To: Chen, Yu C

> Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui; linux-

> kernel@vger.kernel.org; stable@vger.kernel.org; Pandruvada, Srinivas

> Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling

> device registered

> 

> On Mon, Oct 12, 2015 at 09:23:28AM +0000, Chen, Yu C wrote:

> > Hi, Javi

> > Sorry for my late response,

> >

> > > -----Original Message-----

> > > From: Javi Merino [mailto:javi.merino@arm.com]

> > > Sent: Wednesday, September 30, 2015 12:02 AM

> > > To: Chen, Yu C

> > > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui;

> > > linux- kernel@vger.kernel.org; stable@vger.kernel.org

> > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a

> > > cooling device registered

> > >

> > > Hi Yu,

> > >

> > > On Mon, Sep 28, 2015 at 06:52:00PM +0100, Chen, Yu C wrote:

> > > > Hi, Javi,

> > > >

> > > > > -----Original Message-----

> > > > > From: Javi Merino [mailto:javi.merino@arm.com]

> > > > > Sent: Monday, September 28, 2015 10:29 PM

> > > > > To: Chen, Yu C

> > > > > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui;

> > > > > linux- kernel@vger.kernel.org; stable@vger.kernel.org

> > > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a

> > > > > cooling device registered

> > > > >

> > > > > On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:

> > > > > > From: Zhang Rui <rui.zhang@intel.com>

> > > > > >

> > > > > >

> > > > >

> > > > > I think you need to hold cdev->lock here, to make sure that no

> > > > > thermal zone is added or removed from cdev->thermal_instances

> > > > > while

> > > you are looping.

> > > > >

> > > > Ah right, will add. If I add the cdev ->lock here, will there be a

> > > > AB-BA lock with thermal_zone_unbind_cooling_device?

> > >

> > > You're right, it could lead to a deadlock.  The locks can't be

> > > swapped because that won't work in step_wise.

> > >

> > > The best way that I can think of accessing thermal_instances

> > > atomically is by making it RCU protected instead of with mutexes.

> > > What do you think?

> > >

> > RCU would need extra spinlocks to protect the list, and need to

> > sync_rcu after we delete one instance from thermal_instance list,  I

> > think it is too complicated for me to rewrite: ( How about using

> thermal_list_lock instead of cdev ->lock?

> > This guy should be big enough to protect the device.thermal_instance list.

> 

> thermal_list_lock protects thermal_tz_list and thermal_cdev_list, but it

> doesn't protect the thermal_instances list.  For example,

> thermal_zone_bind_cooling_device() adds a cooling device to the

> cdev->thermal_instances list without taking thermal_tz_list.

> 

Before thermal_zone_bind_cooling_device is invoked,
the thermal_list_lock will be firstly gripped:

static void bind_cdev(struct thermal_cooling_device *cdev)
{
mutex_lock(&thermal_list_lock);
either tz->ops->bind    :   thermal_zone_bind_cooling_device
or __bind()  :   thermal_zone_bind_cooling_device
mutex_unlock(&thermal_list_lock);
}

And it is the same as in  passive_store.
So when code is trying to add/delete thermal_instance of cdev,
he has already hold thermal_list_lock IMO. Or do I miss anything?

Best Regards,
Yu
Javi Merino Oct. 15, 2015, 2:05 p.m. UTC | #8
On Wed, Oct 14, 2015 at 07:23:55PM +0000, Chen, Yu C wrote:
> > -----Original Message-----
> > From: Javi Merino [mailto:javi.merino@arm.com]
> > Sent: Thursday, October 15, 2015 1:08 AM
> > To: Chen, Yu C
> > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui; linux-
> > kernel@vger.kernel.org; stable@vger.kernel.org; Pandruvada, Srinivas
> > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling
> > device registered
> > 
> > On Mon, Oct 12, 2015 at 09:23:28AM +0000, Chen, Yu C wrote:
> > > Hi, Javi
> > > Sorry for my late response,
> > >
> > > > -----Original Message-----
> > > > From: Javi Merino [mailto:javi.merino@arm.com]
> > > > Sent: Wednesday, September 30, 2015 12:02 AM
> > > > To: Chen, Yu C
> > > > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui;
> > > > linux- kernel@vger.kernel.org; stable@vger.kernel.org
> > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a
> > > > cooling device registered
> > > >
> > > > Hi Yu,
> > > >
> > > > On Mon, Sep 28, 2015 at 06:52:00PM +0100, Chen, Yu C wrote:
> > > > > Hi, Javi,
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Javi Merino [mailto:javi.merino@arm.com]
> > > > > > Sent: Monday, September 28, 2015 10:29 PM
> > > > > > To: Chen, Yu C
> > > > > > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui;
> > > > > > linux- kernel@vger.kernel.org; stable@vger.kernel.org
> > > > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a
> > > > > > cooling device registered
> > > > > >
> > > > > > On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:
> > > > > > > From: Zhang Rui <rui.zhang@intel.com>
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > I think you need to hold cdev->lock here, to make sure that no
> > > > > > thermal zone is added or removed from cdev->thermal_instances
> > > > > > while
> > > > you are looping.
> > > > > >
> > > > > Ah right, will add. If I add the cdev ->lock here, will there be a
> > > > > AB-BA lock with thermal_zone_unbind_cooling_device?
> > > >
> > > > You're right, it could lead to a deadlock.  The locks can't be
> > > > swapped because that won't work in step_wise.
> > > >
> > > > The best way that I can think of accessing thermal_instances
> > > > atomically is by making it RCU protected instead of with mutexes.
> > > > What do you think?
> > > >
> > > RCU would need extra spinlocks to protect the list, and need to
> > > sync_rcu after we delete one instance from thermal_instance list,  I
> > > think it is too complicated for me to rewrite: ( How about using
> > thermal_list_lock instead of cdev ->lock?
> > > This guy should be big enough to protect the device.thermal_instance list.
> > 
> > thermal_list_lock protects thermal_tz_list and thermal_cdev_list, but it
> > doesn't protect the thermal_instances list.  For example,
> > thermal_zone_bind_cooling_device() adds a cooling device to the
> > cdev->thermal_instances list without taking thermal_tz_list.
> > 
> Before thermal_zone_bind_cooling_device is invoked,
> the thermal_list_lock will be firstly gripped:
> 
> static void bind_cdev(struct thermal_cooling_device *cdev)
> {
> mutex_lock(&thermal_list_lock);
> either tz->ops->bind    :   thermal_zone_bind_cooling_device
> or __bind()  :   thermal_zone_bind_cooling_device
> mutex_unlock(&thermal_list_lock);
> }
> 
> And it is the same as in  passive_store.
> So when code is trying to add/delete thermal_instance of cdev,
> he has already hold thermal_list_lock IMO. Or do I miss anything?

thermal_zone_bind_cooling_device() is exported, so you can't really
rely on the static thermal_list_lock being acquired in every single
call.

thermal_list_lock and protects the lists thermal_tz_list and
thermal_cdev_list.  Making it implicitly protect the cooling device's
and thermal zone device's instances list because no sensible code
would call thermal_zone_bind_cooling_device() outside of a bind
function is just asking for trouble.

Locking is hard to understand and easy to get wrong so let's keep it
simple.

Cheers,
Javi
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chen Yu Oct. 20, 2015, 1:05 a.m. UTC | #9
SGksDQo+IC0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQo+IEZyb206IEphdmkgTWVyaW5vIFtt
YWlsdG86amF2aS5tZXJpbm9AYXJtLmNvbV0NCj4gU2VudDogVGh1cnNkYXksIE9jdG9iZXIgMTUs
IDIwMTUgMTA6MDUgUE0NCj4gVG86IENoZW4sIFl1IEMNCj4gQ2M6IGxpbnV4LXBtQHZnZXIua2Vy
bmVsLm9yZzsgZWR1YmV6dmFsQGdtYWlsLmNvbTsgWmhhbmcsIFJ1aTsgbGludXgtDQo+IGtlcm5l
bEB2Z2VyLmtlcm5lbC5vcmc7IHN0YWJsZUB2Z2VyLmtlcm5lbC5vcmc7IFBhbmRydXZhZGEsIFNy
aW5pdmFzDQo+IFN1YmplY3Q6IFJlOiBbUEFUQ0ggMy8zXSBUaGVybWFsOiBkbyB0aGVybWFsIHpv
bmUgdXBkYXRlIGFmdGVyIGEgY29vbGluZw0KPiBkZXZpY2UgcmVnaXN0ZXJlZA0KPiANCj4gT24g
V2VkLCBPY3QgMTQsIDIwMTUgYXQgMDc6MjM6NTVQTSArMDAwMCwgQ2hlbiwgWXUgQyB3cm90ZToN
Cj4gPiA+IC0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQo+ID4gPiBGcm9tOiBKYXZpIE1lcmlu
byBbbWFpbHRvOmphdmkubWVyaW5vQGFybS5jb21dDQo+ID4gPiBTZW50OiBUaHVyc2RheSwgT2N0
b2JlciAxNSwgMjAxNSAxOjA4IEFNDQo+ID4gPiBUbzogQ2hlbiwgWXUgQw0KPiA+ID4gQ2M6IGxp
bnV4LXBtQHZnZXIua2VybmVsLm9yZzsgZWR1YmV6dmFsQGdtYWlsLmNvbTsgWmhhbmcsIFJ1aTsN
Cj4gPiA+IGxpbnV4LSBrZXJuZWxAdmdlci5rZXJuZWwub3JnOyBzdGFibGVAdmdlci5rZXJuZWwu
b3JnOyBQYW5kcnV2YWRhLA0KPiA+ID4gU3Jpbml2YXMNCj4gPiA+IFN1YmplY3Q6IFJlOiBbUEFU
Q0ggMy8zXSBUaGVybWFsOiBkbyB0aGVybWFsIHpvbmUgdXBkYXRlIGFmdGVyIGENCj4gPiA+IGNv
b2xpbmcgZGV2aWNlIHJlZ2lzdGVyZWQNCj4gPiA+DQo+ID4gPiBPbiBNb24sIE9jdCAxMiwgMjAx
NSBhdCAwOToyMzoyOEFNICswMDAwLCBDaGVuLCBZdSBDIHdyb3RlOg0KPiA+ID4gPiBIaSwgSmF2
aQ0KPiA+ID4gPiBTb3JyeSBmb3IgbXkgbGF0ZSByZXNwb25zZSwNCj4gPiA+ID4NCj4gPiA+ID4g
PiAtLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLQ0KPiA+ID4gPiA+IEZyb206IEphdmkgTWVyaW5v
IFttYWlsdG86amF2aS5tZXJpbm9AYXJtLmNvbV0NCj4gPiA+ID4gPiBTZW50OiBXZWRuZXNkYXks
IFNlcHRlbWJlciAzMCwgMjAxNSAxMjowMiBBTQ0KPiA+ID4gPiA+IFRvOiBDaGVuLCBZdSBDDQo+
ID4gPiA+ID4gQ2M6IGxpbnV4LXBtQHZnZXIua2VybmVsLm9yZzsgZWR1YmV6dmFsQGdtYWlsLmNv
bTsgWmhhbmcsIFJ1aTsNCj4gPiA+ID4gPiBsaW51eC0ga2VybmVsQHZnZXIua2VybmVsLm9yZzsg
c3RhYmxlQHZnZXIua2VybmVsLm9yZw0KPiA+ID4gPiA+IFN1YmplY3Q6IFJlOiBbUEFUQ0ggMy8z
XSBUaGVybWFsOiBkbyB0aGVybWFsIHpvbmUgdXBkYXRlIGFmdGVyIGENCj4gPiA+ID4gPiBjb29s
aW5nIGRldmljZSByZWdpc3RlcmVkDQo+ID4gPiA+ID4NCj4gPiA+ID4gPiBIaSBZdSwNCj4gPiA+
ID4gPg0KPiA+ID4gPiA+IE9uIE1vbiwgU2VwIDI4LCAyMDE1IGF0IDA2OjUyOjAwUE0gKzAxMDAs
IENoZW4sIFl1IEMgd3JvdGU6DQo+ID4gPiA+ID4gPiBIaSwgSmF2aSwNCj4gPiA+ID4gPiA+DQo+
ID4gPiA+ID4gPiA+IC0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQo+ID4gPiA+ID4gPiA+IEZy
b206IEphdmkgTWVyaW5vIFttYWlsdG86amF2aS5tZXJpbm9AYXJtLmNvbV0NCj4gPiA+ID4gPiA+
ID4gU2VudDogTW9uZGF5LCBTZXB0ZW1iZXIgMjgsIDIwMTUgMTA6MjkgUE0NCj4gPiA+ID4gPiA+
ID4gVG86IENoZW4sIFl1IEMNCj4gPiA+ID4gPiA+ID4gQ2M6IGxpbnV4LXBtQHZnZXIua2VybmVs
Lm9yZzsgZWR1YmV6dmFsQGdtYWlsLmNvbTsgWmhhbmcsDQo+ID4gPiA+ID4gPiA+IFJ1aTsNCj4g
PiA+ID4gPiA+ID4gbGludXgtIGtlcm5lbEB2Z2VyLmtlcm5lbC5vcmc7IHN0YWJsZUB2Z2VyLmtl
cm5lbC5vcmcNCj4gPiA+ID4gPiA+ID4gU3ViamVjdDogUmU6IFtQQVRDSCAzLzNdIFRoZXJtYWw6
IGRvIHRoZXJtYWwgem9uZSB1cGRhdGUNCj4gPiA+ID4gPiA+ID4gYWZ0ZXIgYSBjb29saW5nIGRl
dmljZSByZWdpc3RlcmVkDQo+ID4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gPiA+IE9uIFN1biwgU2Vw
IDI3LCAyMDE1IGF0IDA2OjQ4OjQ0QU0gKzAxMDAsIENoZW4gWXUgd3JvdGU6DQo+ID4gPiA+ID4g
PiA+ID4gRnJvbTogWmhhbmcgUnVpIDxydWkuemhhbmdAaW50ZWwuY29tPg0KPiA+ID4gPiA+ID4g
PiA+DQo+ID4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gSSB0aGlu
ayB5b3UgbmVlZCB0byBob2xkIGNkZXYtPmxvY2sgaGVyZSwgdG8gbWFrZSBzdXJlIHRoYXQNCj4g
PiA+ID4gPiA+ID4gbm8gdGhlcm1hbCB6b25lIGlzIGFkZGVkIG9yIHJlbW92ZWQgZnJvbQ0KPiA+
ID4gPiA+ID4gPiBjZGV2LT50aGVybWFsX2luc3RhbmNlcyB3aGlsZQ0KPiA+ID4gPiA+IHlvdSBh
cmUgbG9vcGluZy4NCj4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+IEFoIHJpZ2h0LCB3aWxsIGFk
ZC4gSWYgSSBhZGQgdGhlIGNkZXYgLT5sb2NrIGhlcmUsIHdpbGwgdGhlcmUNCj4gPiA+ID4gPiA+
IGJlIGEgQUItQkEgbG9jayB3aXRoIHRoZXJtYWxfem9uZV91bmJpbmRfY29vbGluZ19kZXZpY2U/
DQo+ID4gPiA+ID4NCj4gPiA+ID4gPiBZb3UncmUgcmlnaHQsIGl0IGNvdWxkIGxlYWQgdG8gYSBk
ZWFkbG9jay4gIFRoZSBsb2NrcyBjYW4ndCBiZQ0KPiA+ID4gPiA+IHN3YXBwZWQgYmVjYXVzZSB0
aGF0IHdvbid0IHdvcmsgaW4gc3RlcF93aXNlLg0KPiA+ID4gPiA+DQo+ID4gPiA+ID4gVGhlIGJl
c3Qgd2F5IHRoYXQgSSBjYW4gdGhpbmsgb2YgYWNjZXNzaW5nIHRoZXJtYWxfaW5zdGFuY2VzDQo+
ID4gPiA+ID4gYXRvbWljYWxseSBpcyBieSBtYWtpbmcgaXQgUkNVIHByb3RlY3RlZCBpbnN0ZWFk
IG9mIHdpdGggbXV0ZXhlcy4NCj4gPiA+ID4gPiBXaGF0IGRvIHlvdSB0aGluaz8NCj4gPiA+ID4g
Pg0KPiA+ID4gPiBSQ1Ugd291bGQgbmVlZCBleHRyYSBzcGlubG9ja3MgdG8gcHJvdGVjdCB0aGUg
bGlzdCwgYW5kIG5lZWQgdG8NCj4gPiA+ID4gc3luY19yY3UgYWZ0ZXIgd2UgZGVsZXRlIG9uZSBp
bnN0YW5jZSBmcm9tIHRoZXJtYWxfaW5zdGFuY2UgbGlzdCwNCj4gPiA+ID4gSSB0aGluayBpdCBp
cyB0b28gY29tcGxpY2F0ZWQgZm9yIG1lIHRvIHJld3JpdGU6ICggSG93IGFib3V0IHVzaW5nDQo+
ID4gPiB0aGVybWFsX2xpc3RfbG9jayBpbnN0ZWFkIG9mIGNkZXYgLT5sb2NrPw0KPiA+ID4gPiBU
aGlzIGd1eSBzaG91bGQgYmUgYmlnIGVub3VnaCB0byBwcm90ZWN0IHRoZSBkZXZpY2UudGhlcm1h
bF9pbnN0YW5jZQ0KPiBsaXN0Lg0KPiA+ID4NCj4gPiA+IHRoZXJtYWxfbGlzdF9sb2NrIHByb3Rl
Y3RzIHRoZXJtYWxfdHpfbGlzdCBhbmQgdGhlcm1hbF9jZGV2X2xpc3QsDQo+ID4gPiBidXQgaXQg
ZG9lc24ndCBwcm90ZWN0IHRoZSB0aGVybWFsX2luc3RhbmNlcyBsaXN0LiAgRm9yIGV4YW1wbGUs
DQo+ID4gPiB0aGVybWFsX3pvbmVfYmluZF9jb29saW5nX2RldmljZSgpIGFkZHMgYSBjb29saW5n
IGRldmljZSB0byB0aGUNCj4gPiA+IGNkZXYtPnRoZXJtYWxfaW5zdGFuY2VzIGxpc3Qgd2l0aG91
dCB0YWtpbmcgdGhlcm1hbF90el9saXN0Lg0KPiA+ID4NCj4gPiBCZWZvcmUgdGhlcm1hbF96b25l
X2JpbmRfY29vbGluZ19kZXZpY2UgaXMgaW52b2tlZCwgdGhlDQo+ID4gdGhlcm1hbF9saXN0X2xv
Y2sgd2lsbCBiZSBmaXJzdGx5IGdyaXBwZWQ6DQo+ID4NCj4gPiBzdGF0aWMgdm9pZCBiaW5kX2Nk
ZXYoc3RydWN0IHRoZXJtYWxfY29vbGluZ19kZXZpY2UgKmNkZXYpIHsNCj4gPiBtdXRleF9sb2Nr
KCZ0aGVybWFsX2xpc3RfbG9jayk7DQo+ID4gZWl0aGVyIHR6LT5vcHMtPmJpbmQgICAgOiAgIHRo
ZXJtYWxfem9uZV9iaW5kX2Nvb2xpbmdfZGV2aWNlDQo+ID4gb3IgX19iaW5kKCkgIDogICB0aGVy
bWFsX3pvbmVfYmluZF9jb29saW5nX2RldmljZQ0KPiA+IG11dGV4X3VubG9jaygmdGhlcm1hbF9s
aXN0X2xvY2spOw0KPiA+IH0NCj4gPg0KPiA+IEFuZCBpdCBpcyB0aGUgc2FtZSBhcyBpbiAgcGFz
c2l2ZV9zdG9yZS4NCj4gPiBTbyB3aGVuIGNvZGUgaXMgdHJ5aW5nIHRvIGFkZC9kZWxldGUgdGhl
cm1hbF9pbnN0YW5jZSBvZiBjZGV2LCBoZSBoYXMNCj4gPiBhbHJlYWR5IGhvbGQgdGhlcm1hbF9s
aXN0X2xvY2sgSU1PLiBPciBkbyBJIG1pc3MgYW55dGhpbmc/DQo+IA0KPiB0aGVybWFsX3pvbmVf
YmluZF9jb29saW5nX2RldmljZSgpIGlzIGV4cG9ydGVkLCBzbyB5b3UgY2FuJ3QgcmVhbGx5IHJl
bHkgb24NCj4gdGhlIHN0YXRpYyB0aGVybWFsX2xpc3RfbG9jayBiZWluZyBhY3F1aXJlZCBpbiBl
dmVyeSBzaW5nbGUgY2FsbC4NCj4gDQo+IHRoZXJtYWxfbGlzdF9sb2NrIGFuZCBwcm90ZWN0cyB0
aGUgbGlzdHMgdGhlcm1hbF90el9saXN0IGFuZCB0aGVybWFsX2NkZXZfbGlzdC4NCj4gTWFraW5n
IGl0IGltcGxpY2l0bHkgcHJvdGVjdCB0aGUgY29vbGluZyBkZXZpY2UncyBhbmQgdGhlcm1hbCB6
b25lIGRldmljZSdzDQo+IGluc3RhbmNlcyBsaXN0IGJlY2F1c2Ugbm8gc2Vuc2libGUgY29kZSB3
b3VsZCBjYWxsDQo+IHRoZXJtYWxfem9uZV9iaW5kX2Nvb2xpbmdfZGV2aWNlKCkgb3V0c2lkZSBv
ZiBhIGJpbmQgZnVuY3Rpb24gaXMganVzdCBhc2tpbmcNCj4gZm9yIHRyb3VibGUuDQo+IA0KWWVz
LCBmcm9tIHRoaXMgcG9pbnQgb2YgdmlldyxpdCBpcyB0cnVlLiANCg0KPiBMb2NraW5nIGlzIGhh
cmQgdG8gdW5kZXJzdGFuZCBhbmQgZWFzeSB0byBnZXQgd3Jvbmcgc28gbGV0J3Mga2VlcCBpdCBz
aW1wbGUuDQo+IA0KSG93IGFib3V0IHRoZSBmb2xsb3dpbmcgMiBtZXRob2RzOg0KMS4gYXZvaWQg
YWNjZXNzaW5nIGRldmljZSdzIHRoZXJtYWxfaW5zdGFuY2UsYnV0DQphY2Nlc3MgYWxsIHRoZXJt
YWxfem9uZV9kZXZpY2UgZGlyZWN0bHksIGFsdGhvdWdoIHRoZXJlIG1pZ2h0IGJlIHNvbWUgcmVk
dW5kYW5jeSwNCnNvbWUgdGhlcm1hbCB6b25lcyBkbyBub3QgbmVlZCB0byBiZSB1cGRhdGVkLCBi
dXQgd2UgY2FuIGF2b2lkIGdyaXBwaW5nIGRldi0+bG9jazoNCg0KCW11dGV4X2xvY2soJnRoZXJt
YWxfbGlzdF9sb2NrKTsNCglsaXN0X2Zvcl9lYWNoX2VudHJ5KHBvcywgJnRoZXJtYWxfdHpfbGlz
dCwgbm9kZSkNCgkJdGhlcm1hbF96b25lX2RldmljZV91cGRhdGUodHopOw0KCW11dGV4X3VubG9j
aygmdGhlcm1hbF9saXN0X2xvY2spOyANCg0Kb3IsIA0KMi4gT25jZSB3ZSBiaW5kIHRoZSBuZXcg
ZGV2aWNlIHdpdGggdGhlIHRoZXJtYWxfem9uZV9kZXZpY2UsIHdlIGNhbiByZWNvcmQgdGhhdA0K
dGhlcm1hbF96b25lX2RldmljZSwgYW5kIHVwZGF0ZSB0aGF0IHRoZXJtYWxfem9uZV9kZXZpY2Ug
YWxvbmUuDQoNCkJUVywgc2luY2UgdGhlcm1hbF96b25lX2RldmljZV91cGRhdGUgaXMgbm90IGF0
b21pYywgd2UgbWlnaHQgbmVlZCBhbm90aGVyIHBhdGNoDQp0byBtYWtlIGl0IGludG8gYXRvbWlj
IG9yIHNvbWV0aGluZyBsaWtlIHRoYXQsIGJ1dCBmb3Igbm93LCBJIHRoaW5rIHRoZXNlIHRocmVl
IHBhdGNoZXMgYXJlIA0KanVzdCBmb3IgZml4aW5nIHRoZSByZWdyZXNzaW9ucy4NCg0KDQpUaGFu
a3MNCg0KQmVzdCBSZWdhcmRzLA0KWXUNCg0K
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chen Yu Oct. 20, 2015, 1:44 a.m. UTC | #10
(resend for broken display)

Hi,
> -----Original Message-----

> From: Javi Merino [mailto:javi.merino@arm.com]

> Sent: Thursday, October 15, 2015 10:05 PM

> To: Chen, Yu C

> Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui; linux- 

> kernel@vger.kernel.org; stable@vger.kernel.org; Pandruvada, Srinivas

> Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a 

> cooling device registered

> 

> On Wed, Oct 14, 2015 at 07:23:55PM +0000, Chen, Yu C wrote:

> > > -----Original Message-----

> > > From: Javi Merino [mailto:javi.merino@arm.com]

> > > Sent: Thursday, October 15, 2015 1:08 AM

> > > To: Chen, Yu C

> > > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui;

> > > linux- kernel@vger.kernel.org; stable@vger.kernel.org; Pandruvada, 

> > > Srinivas

> > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a 

> > > cooling device registered

> > >

> > > On Mon, Oct 12, 2015 at 09:23:28AM +0000, Chen, Yu C wrote:

> > > > Hi, Javi

> > > > Sorry for my late response,

> > > >

> > > > > -----Original Message-----

> > > > > From: Javi Merino [mailto:javi.merino@arm.com]

> > > > > Sent: Wednesday, September 30, 2015 12:02 AM

> > > > > To: Chen, Yu C

> > > > > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui;

> > > > > linux- kernel@vger.kernel.org; stable@vger.kernel.org

> > > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after 

> > > > > a cooling device registered

> > > > >

> > > > > Hi Yu,

> > > > >

> > > > > On Mon, Sep 28, 2015 at 06:52:00PM +0100, Chen, Yu C wrote:

> > > > > > Hi, Javi,

> > > > > >

> > > > > > > -----Original Message-----

> > > > > > > From: Javi Merino [mailto:javi.merino@arm.com]

> > > > > > > Sent: Monday, September 28, 2015 10:29 PM

> > > > > > > To: Chen, Yu C

> > > > > > > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, 

> > > > > > > Rui;

> > > > > > > linux- kernel@vger.kernel.org; stable@vger.kernel.org

> > > > > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update 

> > > > > > > after a cooling device registered

> > > > > > >

> > > > > > > On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:

> > > > > > > > From: Zhang Rui <rui.zhang@intel.com>

> > > > > > > >

> > > > > > > >

> > > > > > >

> > > > > > > I think you need to hold cdev->lock here, to make sure 

> > > > > > > that no thermal zone is added or removed from

> > > > > > > cdev->thermal_instances while

> > > > > you are looping.

> > > > > > >

> > > > > > Ah right, will add. If I add the cdev ->lock here, will 

> > > > > > there be a AB-BA lock with thermal_zone_unbind_cooling_device?

> > > > >

> > > > > You're right, it could lead to a deadlock.  The locks can't be 

> > > > > swapped because that won't work in step_wise.

> > > > >

> > > > > The best way that I can think of accessing thermal_instances 

> > > > > atomically is by making it RCU protected instead of with mutexes.

> > > > > What do you think?

> > > > >

> > > > RCU would need extra spinlocks to protect the list, and need to 

> > > > sync_rcu after we delete one instance from thermal_instance 

> > > > list, I think it is too complicated for me to rewrite: ( How 

> > > > about using

> > > thermal_list_lock instead of cdev ->lock?

> > > > This guy should be big enough to protect the 

> > > > device.thermal_instance

> list.

> > >

> > > thermal_list_lock protects thermal_tz_list and thermal_cdev_list, 

> > > but it doesn't protect the thermal_instances list.  For example,

> > > thermal_zone_bind_cooling_device() adds a cooling device to the

> > > cdev->thermal_instances list without taking thermal_tz_list.

> > >

> > Before thermal_zone_bind_cooling_device is invoked, the 

> > thermal_list_lock will be firstly gripped:

> >

> > static void bind_cdev(struct thermal_cooling_device *cdev) { 

> > mutex_lock(&thermal_list_lock);

> > either tz->ops->bind    :   thermal_zone_bind_cooling_device

> > or __bind()  :   thermal_zone_bind_cooling_device

> > mutex_unlock(&thermal_list_lock);

> > }

> >

> > And it is the same as in  passive_store.

> > So when code is trying to add/delete thermal_instance of cdev, he 

> > has already hold thermal_list_lock IMO. Or do I miss anything?

> 

> thermal_zone_bind_cooling_device() is exported, so you can't really 

> rely on the static thermal_list_lock being acquired in every single call.

> 

> thermal_list_lock and protects the lists thermal_tz_list and thermal_cdev_list.

> Making it implicitly protect the cooling device's and thermal zone 

> device's instances list because no sensible code would call

> thermal_zone_bind_cooling_device() outside of a bind function is just 

> asking for trouble.

> 

Yes, from this point of view,it is true. 

> Locking is hard to understand and easy to get wrong so let's keep it simple.

> 

How about the following 2 methods:
1. avoid accessing device's thermal_instance,
but access all thermal_zone_device directly,
 although there might be some redundancy,
 some thermal zones do not need to be updated, 
but we can avoid gripping dev->lock:

	mutex_lock(&thermal_list_lock);
	list_for_each_entry(pos, &thermal_tz_list, node)
		thermal_zone_device_update(tz);
	mutex_unlock(&thermal_list_lock); 

or,
2. Once we bind the new device with the thermal_zone_device,
 we can record that thermal_zone_device, 
and update that thermal_zone_device alone,the the code would be:
	
	mutex_lock(&thermal_list_lock);
	list_for_each_entry(pos, &thermal_tz_list, node){
		if (tz->need_update)
			thermal_zone_device_update(tz);
	}
	mutex_unlock(&thermal_list_lock);


BTW, since thermal_zone_device_update is not atomic, 
we might need another patch to make it into atomic or 
something like that, but for now, I think these three patches 
are just for fixing the regressions.


Thanks

Best Regards,
Yu
Javi Merino Oct. 20, 2015, 9:47 a.m. UTC | #11
Hi Yu,

On Tue, Oct 20, 2015 at 01:44:20AM +0000, Chen, Yu C wrote:
> > -----Original Message-----
> > From: Javi Merino [mailto:javi.merino@arm.com]
> > Sent: Thursday, October 15, 2015 10:05 PM
> > To: Chen, Yu C
> > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui; linux- 
> > kernel@vger.kernel.org; stable@vger.kernel.org; Pandruvada, Srinivas
> > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a 
> > cooling device registered
> > 
> > On Wed, Oct 14, 2015 at 07:23:55PM +0000, Chen, Yu C wrote:
> > > > -----Original Message-----
> > > > From: Javi Merino [mailto:javi.merino@arm.com]
> > > > Sent: Thursday, October 15, 2015 1:08 AM
> > > > To: Chen, Yu C
> > > > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui;
> > > > linux- kernel@vger.kernel.org; stable@vger.kernel.org; Pandruvada, 
> > > > Srinivas
> > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a 
> > > > cooling device registered
> > > >
> > > > On Mon, Oct 12, 2015 at 09:23:28AM +0000, Chen, Yu C wrote:
> > > > > Hi, Javi
> > > > > Sorry for my late response,
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Javi Merino [mailto:javi.merino@arm.com]
> > > > > > Sent: Wednesday, September 30, 2015 12:02 AM
> > > > > > To: Chen, Yu C
> > > > > > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, Rui;
> > > > > > linux- kernel@vger.kernel.org; stable@vger.kernel.org
> > > > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after 
> > > > > > a cooling device registered
> > > > > >
> > > > > > Hi Yu,
> > > > > >
> > > > > > On Mon, Sep 28, 2015 at 06:52:00PM +0100, Chen, Yu C wrote:
> > > > > > > Hi, Javi,
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Javi Merino [mailto:javi.merino@arm.com]
> > > > > > > > Sent: Monday, September 28, 2015 10:29 PM
> > > > > > > > To: Chen, Yu C
> > > > > > > > Cc: linux-pm@vger.kernel.org; edubezval@gmail.com; Zhang, 
> > > > > > > > Rui;
> > > > > > > > linux- kernel@vger.kernel.org; stable@vger.kernel.org
> > > > > > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update 
> > > > > > > > after a cooling device registered
> > > > > > > >
> > > > > > > > On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:
> > > > > > > > > From: Zhang Rui <rui.zhang@intel.com>
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > > I think you need to hold cdev->lock here, to make sure 
> > > > > > > > that no thermal zone is added or removed from
> > > > > > > > cdev->thermal_instances while
> > > > > > you are looping.
> > > > > > > >
> > > > > > > Ah right, will add. If I add the cdev ->lock here, will 
> > > > > > > there be a AB-BA lock with thermal_zone_unbind_cooling_device?
> > > > > >
> > > > > > You're right, it could lead to a deadlock.  The locks can't be 
> > > > > > swapped because that won't work in step_wise.
> > > > > >
> > > > > > The best way that I can think of accessing thermal_instances 
> > > > > > atomically is by making it RCU protected instead of with mutexes.
> > > > > > What do you think?
> > > > > >
> > > > > RCU would need extra spinlocks to protect the list, and need to 
> > > > > sync_rcu after we delete one instance from thermal_instance 
> > > > > list, I think it is too complicated for me to rewrite: ( How 
> > > > > about using
> > > > thermal_list_lock instead of cdev ->lock?
> > > > > This guy should be big enough to protect the 
> > > > > device.thermal_instance
> > list.
> > > >
> > > > thermal_list_lock protects thermal_tz_list and thermal_cdev_list, 
> > > > but it doesn't protect the thermal_instances list.  For example,
> > > > thermal_zone_bind_cooling_device() adds a cooling device to the
> > > > cdev->thermal_instances list without taking thermal_tz_list.
> > > >
> > > Before thermal_zone_bind_cooling_device is invoked, the 
> > > thermal_list_lock will be firstly gripped:
> > >
> > > static void bind_cdev(struct thermal_cooling_device *cdev) { 
> > > mutex_lock(&thermal_list_lock);
> > > either tz->ops->bind    :   thermal_zone_bind_cooling_device
> > > or __bind()  :   thermal_zone_bind_cooling_device
> > > mutex_unlock(&thermal_list_lock);
> > > }
> > >
> > > And it is the same as in  passive_store.
> > > So when code is trying to add/delete thermal_instance of cdev, he 
> > > has already hold thermal_list_lock IMO. Or do I miss anything?
> > 
> > thermal_zone_bind_cooling_device() is exported, so you can't really 
> > rely on the static thermal_list_lock being acquired in every single call.
> > 
> > thermal_list_lock and protects the lists thermal_tz_list and thermal_cdev_list.
> > Making it implicitly protect the cooling device's and thermal zone 
> > device's instances list because no sensible code would call
> > thermal_zone_bind_cooling_device() outside of a bind function is just 
> > asking for trouble.
> > 
> Yes, from this point of view,it is true. 
> 
> > Locking is hard to understand and easy to get wrong so let's keep it simple.
> > 
> How about the following 2 methods:
> 1. avoid accessing device's thermal_instance,
> but access all thermal_zone_device directly,
>  although there might be some redundancy,
>  some thermal zones do not need to be updated, 
> but we can avoid gripping dev->lock:
> 
> 	mutex_lock(&thermal_list_lock);
> 	list_for_each_entry(pos, &thermal_tz_list, node)
> 		thermal_zone_device_update(tz);
> 	mutex_unlock(&thermal_list_lock); 
> 
> or,
> 2. Once we bind the new device with the thermal_zone_device,
>  we can record that thermal_zone_device, 
> and update that thermal_zone_device alone,the the code would be:
> 	
> 	mutex_lock(&thermal_list_lock);
> 	list_for_each_entry(pos, &thermal_tz_list, node){
> 		if (tz->need_update)
> 			thermal_zone_device_update(tz);
> 	}
> 	mutex_unlock(&thermal_list_lock);

This sounds like a better alternative to me.  I was thinking whether
we could add the thremal_zone_device_update() directly in bind_cdev()
to avoid the need_update field but I don't think it's any better: you
would have to put it in two places (for the bind() and tbp.match()
paths).

With the solution you propose above you only have to put it in
__thermal_cooling_device_register(), which is simpler.  I vote for
your solution (2) above.

> BTW, since thermal_zone_device_update is not atomic, 
> we might need another patch to make it into atomic or 
> something like that, but for now, I think these three patches 
> are just for fixing the regressions.

Yeah, we can fix that in another series.

Cheers,
Javi
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index c3bdb48..09c78a4 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1450,6 +1450,7 @@  __thermal_cooling_device_register(struct device_node *np,
 				  const struct thermal_cooling_device_ops *ops)
 {
 	struct thermal_cooling_device *cdev;
+	struct thermal_instance *pos, *next;
 	int result;
 
 	if (type && strlen(type) >= THERMAL_NAME_LENGTH)
@@ -1494,6 +1495,15 @@  __thermal_cooling_device_register(struct device_node *np,
 	/* Update binding information for 'this' new cdev */
 	bind_cdev(cdev);
 
+	list_for_each_entry_safe(pos, next, &cdev->thermal_instances, cdev_node) {
+			if (next->cdev_node.next == &cdev->thermal_instances) {
+				thermal_zone_device_update(next->tz);
+				break;
+			}
+			if (pos->tz != next->tz)
+				thermal_zone_device_update(pos->tz);
+	}
+
 	return cdev;
 }