
[2/2] thermal/drivers/thermal_helpers: Adjust output format

Message ID 20220408110920.3809225-2-alexander.stein@ew.tq-group.com (mailing list archive)
State New, archived
Series [1/2] thermal: imx8mm: Add hwmon support

Commit Message

Alexander Stein April 8, 2022, 11:09 a.m. UTC
Output like this, where -1 is printed as unsigned, is somewhat misleading:
 thermal thermal_zone1: Trip3[type=0,temp=48000]:trend=4,throttle=1
 thermal cooling_device3: cur_state=1
 thermal cooling_device3: old_target=-1, target=2
 thermal cooling_device3: zone1->target=1
 thermal cooling_device3: zone1->target=2
 thermal cooling_device3: zone1->target=18446744073709551615
 thermal cooling_device3: set to state 2

Since THERMAL_NO_TARGET assigns -1 to an unsigned type, it makes sense to print
the target as a signed integer, even though the type is actually unsigned.

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
---
An alternative would be to change thermal_instance::target from unsigned
long to long, but this would entail a lot of API and driver changes as well,
which looks less appealing.

 drivers/thermal/thermal_helpers.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
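
The effect described in the commit message can be reproduced outside the kernel. The following is a minimal user-space sketch, not kernel code: NO_TARGET, format_target() and show_sentinel() are hypothetical names introduced only for illustration, and the explicit cast to long assumes the usual two's-complement conversion that GCC and Clang implement.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's THERMAL_NO_TARGET sentinel. */
#define NO_TARGET (-1UL)

/*
 * Format a target value the way the debug message does: with %lu when
 * as_signed is 0, or with %ld after an explicit cast when it is nonzero.
 */
static void format_target(char *buf, size_t len, unsigned long target,
			  int as_signed)
{
	if (as_signed)
		snprintf(buf, len, "zone1->target=%ld", (long)target);
	else
		snprintf(buf, len, "zone1->target=%lu", target);
}

/* Print the sentinel both ways; call this from main() to try it out. */
static void show_sentinel(void)
{
	char buf[64];

	format_target(buf, sizeof(buf), NO_TARGET, 0);
	puts(buf);	/* largest unsigned long value */

	format_target(buf, sizeof(buf), NO_TARGET, 1);
	assert(strcmp(buf, "zone1->target=-1") == 0);
	puts(buf);
}
```

Only the formatting differs between the two lines that show_sentinel() prints; the stored value is the same bit pattern in both cases.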

Comments

Daniel Lezcano April 14, 2022, 7:35 a.m. UTC | #1
On 08/04/2022 13:09, Alexander Stein wrote:
> Output like this, where -1 is printed as unsigned, is somewhat misleading:
>   thermal thermal_zone1: Trip3[type=0,temp=48000]:trend=4,throttle=1
>   thermal cooling_device3: cur_state=1
>   thermal cooling_device3: old_target=-1, target=2
>   thermal cooling_device3: zone1->target=1
>   thermal cooling_device3: zone1->target=2
>   thermal cooling_device3: zone1->target=18446744073709551615
>   thermal cooling_device3: set to state 2
>
> Since THERMAL_NO_TARGET assigns -1 to an unsigned type, it makes sense to print
> the target as a signed integer, even though the type is actually unsigned.
>
> Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
> ---
> An alternative would be to change thermal_instance::target from unsigned
> long to long, but this would entail a lot of API and driver changes as well,
> which looks less appealing.
> 
>   drivers/thermal/thermal_helpers.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
> index 3edd047e144f..0d0da6670267 100644
> --- a/drivers/thermal/thermal_helpers.c
> +++ b/drivers/thermal/thermal_helpers.c
> @@ -199,7 +199,7 @@ void __thermal_cdev_update(struct thermal_cooling_device *cdev)
>   
>   	/* Make sure cdev enters the deepest cooling state */
>   	list_for_each_entry(instance, &cdev->thermal_instances, cdev_node) {
> -		dev_dbg(&cdev->device, "zone%d->target=%lu\n",
> +		dev_dbg(&cdev->device, "zone%d->target=%ld\n",
>   			instance->tz->id, instance->target);
>   		if (instance->target == THERMAL_NO_TARGET)
>   			continue;

Actually you pointed out something fuzzy in the target values.

The unsigned long type for the target and THERMAL_NO_TARGET are not 
compatible.

It would be much simpler to have THERMAL_NO_TARGET = 0 which 
semantically makes more sense than a negative value.
Nitin Garg May 10, 2022, 10:48 p.m. UTC | #2
On 08/04/2022 13:09, Alexander Stein wrote:
>> Output like this, where -1 is printed as unsigned, is somewhat misleading:
>>   thermal thermal_zone1: Trip3[type=0,temp=48000]:trend=4,throttle=1
>>   thermal cooling_device3: cur_state=1
>>   thermal cooling_device3: old_target=-1, target=2
>>   thermal cooling_device3: zone1->target=1
>>   thermal cooling_device3: zone1->target=2
>>   thermal cooling_device3: zone1->target=18446744073709551615
>>   thermal cooling_device3: set to state 2
>>
>> Since THERMAL_NO_TARGET assigns -1 to an unsigned type, it makes sense to print
>> the target as a signed integer, even though the type is actually unsigned.
>>
>> Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
>> ---
>> An alternative would be to change thermal_instance::target from unsigned
>> long to long, but this would entail a lot of API and driver changes as well,
>> which looks less appealing.
>> 
>>   drivers/thermal/thermal_helpers.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
>> index 3edd047e144f..0d0da6670267 100644
>> --- a/drivers/thermal/thermal_helpers.c
>> +++ b/drivers/thermal/thermal_helpers.c
>> @@ -199,7 +199,7 @@ void __thermal_cdev_update(struct thermal_cooling_device *cdev)
>>   
>>   	/* Make sure cdev enters the deepest cooling state */
>>   	list_for_each_entry(instance, &cdev->thermal_instances, cdev_node) {
>> -		dev_dbg(&cdev->device, "zone%d->target=%lu\n",
>> +		dev_dbg(&cdev->device, "zone%d->target=%ld\n",
>>   			instance->tz->id, instance->target);
>>   		if (instance->target == THERMAL_NO_TARGET)
>>   			continue;
>
>Actually you pointed out something fuzzy in the target values.
>
>The unsigned long type for the target and THERMAL_NO_TARGET are not 
>compatible.
>
>It would be much simpler to have THERMAL_NO_TARGET = 0 which 
>semantically makes more sense than a negative value.

Comparing an unsigned long with a negative int is a bad idea.
However, there is a serious problem introduced by the "thermal: core: Add notifications call in the framework" patch.
When the system resumes from mem suspend for the first time (this happens only on the first resume), the thermal notification is sent to drivers with a value of 0 (meaning the system is no longer hot).
This is because target is initialized to 0, and when there is only one cooling device, the code gets out of the loop (due to the continue;) with target still set to 0 and calls thermal_cdev_set_cur_state(cdev, target).
From there, thermal_notify_cdev_state_update is called with an argument of 0, which notifies drivers with a value of 0.

Maybe "unsigned long target" should be initialized to THERMAL_NO_TARGET instead of 0.

[   29.107048] OOM killer enabled.
[   29.110225] Restarting tasks ... done.
[   29.124816] thermal cooling_device0: zone0->target=18446744073709551615
[   29.138388] GPU0: Hot alarm is canceled. 
[   29.145399] thermal cooling_device0: set to state 0
[   29.198954] PM: suspend exit
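
The behaviour described above can be modelled in a few lines of user-space C. This is a simplified sketch under stated assumptions, not the kernel function: pick_target() and NO_TARGET are hypothetical names, the instance list is replaced by a plain array, and the "keep the largest request" step is inferred from the "deepest cooling state" comment in the quoted hunk. The initial value is a parameter so that both starting points (0 and the sentinel) can be compared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's THERMAL_NO_TARGET sentinel. */
#define NO_TARGET (-1UL)

/*
 * Simplified model of the selection loop: skip instances that request
 * nothing and keep the largest remaining request. "initial" is the value
 * the result starts from (0 today, NO_TARGET in the suggested variant).
 */
static unsigned long pick_target(const unsigned long *targets, size_t n,
				 unsigned long initial)
{
	unsigned long target = initial;
	size_t i;

	for (i = 0; i < n; i++) {
		if (targets[i] == NO_TARGET)
			continue;
		/* NO_TARGET is the largest value, so handle it explicitly. */
		if (target == NO_TARGET || targets[i] > target)
			target = targets[i];
	}
	return target;
}
```

With a single instance whose target is NO_TARGET, the model returns 0 when started from 0, which is indistinguishable from a genuine request for state 0. Started from NO_TARGET, it returns NO_TARGET, so a caller could detect that no instance asked for anything. What the caller should then do with that value is a separate question that this sketch does not answer.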
Alexander Stein May 11, 2022, 10:17 a.m. UTC | #3
Hello,

On Wednesday, 11 May 2022, 00:48:00 CEST, Nitin Garg wrote:
> On 08/04/2022 13:09, Alexander Stein wrote:
> >> Output like this, where -1 is printed as unsigned, is somewhat misleading:
> >> 
> >>   thermal thermal_zone1: Trip3[type=0,temp=48000]:trend=4,throttle=1
> >>   thermal cooling_device3: cur_state=1
> >>   thermal cooling_device3: old_target=-1, target=2
> >>   thermal cooling_device3: zone1->target=1
> >>   thermal cooling_device3: zone1->target=2
> >>   thermal cooling_device3: zone1->target=18446744073709551615
> >>   thermal cooling_device3: set to state 2
> >> 
> >> Since THERMAL_NO_TARGET assigns -1 to an unsigned type, it makes sense to print
> >> the target as a signed integer, even though the type is actually unsigned.
> >> 
> >> Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
> >> ---
> >> An alternative would be to change thermal_instance::target from unsigned
> >> long to long, but this would entail a lot of API and driver changes as well,
> >> which looks less appealing.
> >> 
> >>   drivers/thermal/thermal_helpers.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >> 
> >> diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
> >> index 3edd047e144f..0d0da6670267 100644
> >> --- a/drivers/thermal/thermal_helpers.c
> >> +++ b/drivers/thermal/thermal_helpers.c
> >> @@ -199,7 +199,7 @@ void __thermal_cdev_update(struct thermal_cooling_device *cdev)
> >> 
> >>   	/* Make sure cdev enters the deepest cooling state */
> >>   	list_for_each_entry(instance, &cdev->thermal_instances, cdev_node) {
> >> -		dev_dbg(&cdev->device, "zone%d->target=%lu\n",
> >> +		dev_dbg(&cdev->device, "zone%d->target=%ld\n",
> >>   			instance->tz->id, instance->target);
> >>   		if (instance->target == THERMAL_NO_TARGET)
> >>   			continue;
> >
> >Actually you pointed out something fuzzy in the target values.
> >
> >The unsigned long type for the target and THERMAL_NO_TARGET are not
> >compatible.
> >
> >It would be much simpler to have THERMAL_NO_TARGET = 0 which
> >semantically makes more sense than a negative value.

Is it identical? Apparently the target value is used differently in each governor.
At least for gov_bang_bang, 'THERMAL_NO_TARGET = 0' makes no difference. I'm not so
sure about gov_step_wise.

> Comparing an unsigned long with a negative int is a bad idea.

Well, THERMAL_NO_TARGET actually is an unsigned long (-1UL), so the comparison
is unsigned long to unsigned long and should not be an issue.
But this implies that printing the target as an unsigned value results in a huge
number, not immediately recognizable as -1, which I tried to address here.
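
The point about the comparison can be checked with a small user-space sketch. NO_TARGET, is_no_target() and the two helper predicates are hypothetical names used only for this illustration; the _Generic check assumes a C11 compiler.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's THERMAL_NO_TARGET sentinel. */
#define NO_TARGET (-1UL)

/* The sentinel test itself: both operands are unsigned long. */
static bool is_no_target(unsigned long target)
{
	return target == NO_TARGET;
}

/* -1UL is the negation of 1UL in unsigned arithmetic, i.e. ULONG_MAX. */
static bool sentinel_is_ulong_max(void)
{
	return NO_TARGET == ULONG_MAX;
}

/* Confirm at compile time that the constant has type unsigned long. */
static bool sentinel_is_unsigned_long(void)
{
	return _Generic((NO_TARGET), unsigned long: true, default: false);
}
```

Because the sentinel is simply the largest unsigned long, the equality test is well defined; the only surprise is in how the value looks when it is printed.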

> However, there is a serious problem introduced by the "thermal: core: Add
> notifications call in the framework" patch. When the system resumes from mem
> suspend for the first time (this happens only on the first resume), the
> thermal notification is sent to drivers with a value of 0 (meaning the system
> is no longer hot). This is because target is initialized to 0, and when there
> is only one cooling device, the code gets out of the loop (due to the
> continue;) with target still set to 0 and calls
> thermal_cdev_set_cur_state(cdev, target). From there,
> thermal_notify_cdev_state_update is called with an argument of 0, which
> notifies drivers with a value of 0.
> 
> Maybe "unsigned long target" should be initialized to THERMAL_NO_TARGET
> instead of 0.
> 
> [   29.107048] OOM killer enabled.
> [   29.110225] Restarting tasks ... done.
> [   29.124816] thermal cooling_device0: zone0->target=18446744073709551615
> [   29.138388] GPU0: Hot alarm is canceled.
> [   29.145399] thermal cooling_device0: set to state 0
> [   29.198954] PM: suspend exit

Is it legal to pass THERMAL_NO_TARGET to .set_cur_state()? At least pwm-fan 
will return -EINVAL in this case.
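
To illustrate why a driver with an upper-bound check would reject the sentinel, here is a hypothetical sketch. struct fake_cdev and fake_set_cur_state() are invented for this example and are not the pwm-fan code; the only assumption is that the callback compares the requested state against a maximum state.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel's THERMAL_NO_TARGET sentinel. */
#define NO_TARGET (-1UL)

/* Invented cooling device with a bounded range of states. */
struct fake_cdev {
	unsigned long max_state;
	unsigned long cur_state;
};

/*
 * Invented set_cur_state-style callback: any state above max_state is
 * rejected. NO_TARGET is the largest unsigned long, so it always fails
 * this check unless max_state is itself the largest unsigned long.
 */
static int fake_set_cur_state(struct fake_cdev *cdev, unsigned long state)
{
	if (!cdev || state > cdev->max_state)
		return -EINVAL;

	cdev->cur_state = state;
	return 0;
}
```

With max_state = 3, passing NO_TARGET returns -EINVAL and leaves cur_state unchanged, while passing 2 succeeds.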

Alexander

Patch

diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
index 3edd047e144f..0d0da6670267 100644
--- a/drivers/thermal/thermal_helpers.c
+++ b/drivers/thermal/thermal_helpers.c
@@ -199,7 +199,7 @@  void __thermal_cdev_update(struct thermal_cooling_device *cdev)
 
 	/* Make sure cdev enters the deepest cooling state */
 	list_for_each_entry(instance, &cdev->thermal_instances, cdev_node) {
-		dev_dbg(&cdev->device, "zone%d->target=%lu\n",
+		dev_dbg(&cdev->device, "zone%d->target=%ld\n",
 			instance->tz->id, instance->target);
 		if (instance->target == THERMAL_NO_TARGET)
 			continue;