diff mbox series

[v3] thermal/core: Clear all mitigation when thermal zone is disabled

Message ID 1641581806-32550-1-git-send-email-quic_manafm@quicinc.com (mailing list archive)
State Superseded, archived

Commit Message

Manaf Meethalavalappu Pallikunhi Jan. 7, 2022, 6:56 p.m. UTC
Whenever a thermal zone is in a trip-violated state, there is a chance
that the same thermal zone can be disabled, either via the thermal
core API or via the thermal zone sysfs. Once it is disabled, the framework
bails out of any re-evaluation of the thermal zone. If the zone is already
in a mitigation state at that point, it stays in that state
until it is re-enabled.

To avoid this issue, on a thermal zone disable request,
reset the thermal zone and clear mitigation for each trip explicitly.

Signed-off-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@quicinc.com>
---
 drivers/thermal/thermal_core.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Comments

Thara Gopinath Jan. 10, 2022, 5:55 p.m. UTC | #1
Hi Manaf,

On 1/7/22 1:56 PM, Manaf Meethalavalappu Pallikunhi wrote:
> Whenever a thermal zone is in trip violated state, there is a chance
> that the same thermal zone mode can be disabled either via thermal
> core API or via thermal zone sysfs. Once it is disabled, the framework
> bails out any re-evaluation of thermal zone. It leads to a case where
> if it is already in mitigation state, it will stay the same state
> until it is re-enabled.
> 
> To avoid above mentioned issue, on thermal zone disable request
> reset thermal zone and clear mitigation for each trip explicitly.
> 
> Signed-off-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@quicinc.com>
> ---
>   drivers/thermal/thermal_core.c | 12 ++++++++++--
>   1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 51374f4..e288c82 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -447,10 +447,18 @@ static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
>   
>   	thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
>   
> -	if (mode == THERMAL_DEVICE_ENABLED)
> +	if (mode == THERMAL_DEVICE_ENABLED) {
>   		thermal_notify_tz_enable(tz->id);
> -	else
> +	} else {
> +		int trip;
> +
> +		/* make sure all previous throttlings are cleared */
> +		thermal_zone_device_init(tz);

It looks weird to do an init when you are actually disabling the thermal
zone.


> +		for (trip = 0; trip < tz->trips; trip++)
> +			handle_thermal_trip(tz, trip);

So this is exactly what thermal_zone_device_update does except that 
thermal_zone_device_update checks for the mode and bails out if the zone 
is disabled.
This will work because as you explained in v2, the temperature is reset 
in thermal_zone_device_init and handle_thermal_trip will remove the 
mitigation if any.

My two cents here (Rafael and Daniel can comment more on this).

I think it will be cleaner if we can have a third mode 
THERMAL_DEVICE_DISABLING and have thermal_zone_device_update handle 
clearing the mitigation. So this will look like
if (mode == THERMAL_DEVICE_DISABLED)
	tz->mode = THERMAL_DEVICE_DISABLING;
else
	tz->mode = mode;

thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);

if (mode == THERMAL_DEVICE_DISABLED)
	tz->mode = mode;

You will have to update update_temperature to set tz->temperature = 
THERMAL_TEMP_INVALID and thermal_zone_set_trips to set 
tz->prev_low_trip = -INT_MAX and tz->prev_high_trip = INT_MAX for
THERMAL_DEVICE_DISABLING mode.
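
The transitional-mode idea above can be sketched as a small, self-contained model. This is only an illustration under stated assumptions: `struct zone`, `zone_update()`, `zone_set_mode()` and `TEMP_INVALID` are hypothetical stand-ins, not the kernel's `struct thermal_zone_device`, `thermal_zone_device_update()` or `THERMAL_TEMP_INVALID`, and the zone has a single trip point for brevity.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel definitions; not the real ones. */
enum zone_mode { DEV_DISABLED, DEV_ENABLED, DEV_DISABLING };

#define TEMP_INVALID INT_MIN	/* stand-in for THERMAL_TEMP_INVALID */

struct zone {
	enum zone_mode mode;
	int temperature;	/* last recorded temperature */
	int trip_temp;		/* single trip point, for brevity */
	int cooling_state;	/* 0 means no mitigation */
};

/* Models the update path: it bails out when the zone is disabled. */
static void zone_update(struct zone *z, int sensor_temp)
{
	assert(z != NULL);

	if (z->mode == DEV_DISABLED)
		return;

	/* In the transitional mode, forget the reading instead of sampling. */
	if (z->mode == DEV_DISABLING)
		z->temperature = TEMP_INVALID;
	else
		z->temperature = sensor_temp;

	/* An invalid temperature never counts as a trip violation. */
	if (z->temperature != TEMP_INVALID && z->temperature >= z->trip_temp)
		z->cooling_state = 1;
	else
		z->cooling_state = 0;
}

/* Pass through DEV_DISABLING so the update path clears the mitigation. */
static void zone_set_mode(struct zone *z, enum zone_mode mode, int sensor_temp)
{
	z->mode = (mode == DEV_DISABLED) ? DEV_DISABLING : mode;

	zone_update(z, sensor_temp);

	if (mode == DEV_DISABLED)
		z->mode = mode;
}
```

In this model, setting `mode` to `DEV_DISABLED` directly and then calling `zone_update()` leaves `cooling_state` untouched, which is the stuck-mitigation case from the commit message; going through `zone_set_mode()` clears it first.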
Manaf Meethalavalappu Pallikunhi Jan. 10, 2022, 8:45 p.m. UTC | #2
Hi Thara,

On 1/10/2022 11:25 PM, Thara Gopinath wrote:
> Hi Manaf,
>
> On 1/7/22 1:56 PM, Manaf Meethalavalappu Pallikunhi wrote:
>> Whenever a thermal zone is in trip violated state, there is a chance
>> that the same thermal zone mode can be disabled either via thermal
>> core API or via thermal zone sysfs. Once it is disabled, the framework
>> bails out any re-evaluation of thermal zone. It leads to a case where
>> if it is already in mitigation state, it will stay the same state
>> until it is re-enabled.
>>
>> To avoid above mentioned issue, on thermal zone disable request
>> reset thermal zone and clear mitigation for each trip explicitly.
>>
>> Signed-off-by: Manaf Meethalavalappu Pallikunhi 
>> <quic_manafm@quicinc.com>
>> ---
>>   drivers/thermal/thermal_core.c | 12 ++++++++++--
>>   1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/thermal/thermal_core.c 
>> b/drivers/thermal/thermal_core.c
>> index 51374f4..e288c82 100644
>> --- a/drivers/thermal/thermal_core.c
>> +++ b/drivers/thermal/thermal_core.c
>> @@ -447,10 +447,18 @@ static int thermal_zone_device_set_mode(struct 
>> thermal_zone_device *tz,
>>         thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
>>   -    if (mode == THERMAL_DEVICE_ENABLED)
>> +    if (mode == THERMAL_DEVICE_ENABLED) {
>>           thermal_notify_tz_enable(tz->id);
>> -    else
>> +    } else {
>> +        int trip;
>> +
>> +        /* make sure all previous throttlings are cleared */
>> +        thermal_zone_device_init(tz);
>
> It looks weird to do a init when you are actually disabling the 
> thermal zone.
>
>
>> +        for (trip = 0; trip < tz->trips; trip++)
>> +            handle_thermal_trip(tz, trip);
>
> So this is exactly what thermal_zone_device_update does except that 
> thermal_zone_device_update checks for the mode and bails out if the 
> zone is disabled.
> This will work because as you explained in v2, the temperature is 
> reset in thermal_zone_device_init and handle_thermal_trip will remove 
> the mitigation if any.
>
> My two cents here (Rafael and Daniel can comment more on this).
>
> I think it will be cleaner if we can have a third mode 
> THERMAL_DEVICE_DISABLING and have thermal_zone_device_update handle 
> clearing the mitigation. So this will look like
> if (mode == THERMAL_DEVICE_DISABLED)
>     tz->mode = THERMAL_DEVICE_DISABLING;
> else
>     tz->mode = mode;
>
> thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
>
> if (mode == THERMAL_DEVICE_DISABLED)
>     tz->mode = mode;
>
> You will have to update update_temperature to set tz->temperature = 
> THERMAL_TEMP_INVALID and thermal_zone_set_trips to set 
> tz->prev_low_trip = -INT_MAX and tz->prev_high_trip = INT_MAX for
> THERMAL_DEVICE_DISABLING mode.

I think just updating the above fields doesn't guarantee complete clearing
of mitigation for all governors. For the step_wise governor, to make sure
mitigation is removed completely, we have to set initialized = false for
each thermal instance as well.

If we add that to the above list of variables in update_temperature() under
if (mode == THERMAL_DEVICE_DISABLING), it is the same as what the
thermal_zone_device_init function does in the current patch. We are just
resetting the same fields in a different place under a new mode, right?

Thanks,

Manaf
Manaf Meethalavalappu Pallikunhi Jan. 19, 2022, 7:05 p.m. UTC | #3
Hi Rafael/Daniel,

Could you please check and comment?

Thanks,

Manaf

Daniel Lezcano Jan. 19, 2022, 7:12 p.m. UTC | #4
Hi Manaf,

On 19/01/2022 20:05, Manaf Meethalavalappu Pallikunhi wrote:
> Hi Rafael/Daniel,
> 
> Could you please check and comment  ?

It is on my todo list, I'll review it before the end of the week.

Regards

  -- Daniel

Rafael J. Wysocki Jan. 19, 2022, 8:03 p.m. UTC | #5
On Fri, Jan 7, 2022 at 7:57 PM Manaf Meethalavalappu Pallikunhi
<quic_manafm@quicinc.com> wrote:
>
> Whenever a thermal zone is in trip violated state, there is a chance
> that the same thermal zone mode can be disabled either via thermal
> core API or via thermal zone sysfs. Once it is disabled, the framework
> bails out any re-evaluation of thermal zone. It leads to a case where
> if it is already in mitigation state, it will stay the same state
> until it is re-enabled.
>
> To avoid above mentioned issue, on thermal zone disable request
> reset thermal zone and clear mitigation for each trip explicitly.
>
> Signed-off-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@quicinc.com>
> ---
>  drivers/thermal/thermal_core.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 51374f4..e288c82 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -447,10 +447,18 @@ static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
>
>         thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
>
> -       if (mode == THERMAL_DEVICE_ENABLED)
> +       if (mode == THERMAL_DEVICE_ENABLED) {
>                 thermal_notify_tz_enable(tz->id);
> -       else
> +       } else {
> +               int trip;
> +
> +               /* make sure all previous throttlings are cleared */
> +               thermal_zone_device_init(tz);
> +               for (trip = 0; trip < tz->trips; trip++)
> +                       handle_thermal_trip(tz, trip);
> +

It looks to me like this has the potential to confuse user space by
setting the temperature to invalid before notifying it that the zone
has been disabled.
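
The ordering concern above can be made concrete with a tiny model that logs what an observer would see. Everything here is a hypothetical illustration: `struct observer`, `zone_reset()` and `zone_notify_disable()` only stand in for the reset done by `thermal_zone_device_init()` and for `thermal_notify_tz_disable()`. The second ordering is one conceivable alternative, not something proposed in this thread.

```c
#include <limits.h>

#define TEMP_INVALID INT_MIN	/* stand-in for THERMAL_TEMP_INVALID */
#define MAX_EVENTS 8

/* Hypothetical event log standing in for what user space observes. */
struct observer {
	const char *events[MAX_EVENTS];
	int count;
};

struct zone {
	int temperature;
	struct observer *obs;
};

static void record(struct observer *o, const char *what)
{
	if (o->count < MAX_EVENTS)
		o->events[o->count++] = what;
}

/* Stand-in for the reset step: the temperature becomes invalid. */
static void zone_reset(struct zone *z)
{
	z->temperature = TEMP_INVALID;
	record(z->obs, "temperature-invalid");
}

/* Stand-in for the disable notification. */
static void zone_notify_disable(struct zone *z)
{
	record(z->obs, "zone-disabled");
}

/* Order used by the v3 patch: reset first, notify afterwards. */
static void disable_v3_order(struct zone *z)
{
	zone_reset(z);
	zone_notify_disable(z);
}

/* Hypothetical alternative: announce the disable before resetting. */
static void disable_notify_first(struct zone *z)
{
	zone_notify_disable(z);
	zone_reset(z);
}
```

With `disable_v3_order()`, the log starts with the invalid temperature and only then records the disable, which is the window in which an observer could be confused.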

>                 thermal_notify_tz_disable(tz->id);
> +       }
>
>         return ret;
>  }
>
Daniel Lezcano Jan. 23, 2022, 8:51 p.m. UTC | #6
Hi Manaf,

Semantically speaking, disabling a thermal zone would mean detaching the
thermal zone from its governor and stopping the monitoring.

Maybe add the functions

 - thermal_governor_attach(struct thermal_zone_device *tz)
   {
        ...
        if (tz->governor && tz->governor->bind_to_tz) {
                if (tz->governor->bind_to_tz(tz)) {
                        ...
                }
        }
        ...
   }

 - thermal_governor_detach(struct thermal_zone_device *tz)
   {
        ...
        if (tz->governor && tz->governor->unbind_from_tz)
                tz->governor->unbind_from_tz(tz);
        ...
   }

And in the step_wise and power_allocator governors, add the reset of the
governor's data as well as the cooling device instances to the
unbind_from_tz() callback.

Then, thermal_zone_device_enable() attaches and
thermal_zone_device_disable() detaches the governor.

Does it make sense?
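
The attach/detach proposal above can be modelled in a few lines. This is a minimal sketch under stated assumptions: `struct governor` and `struct zone` are reduced, hypothetical stand-ins for `struct thermal_governor` and `struct thermal_zone_device`, and `demo_bind()` / `demo_unbind()` are invented callbacks showing where a governor would reset its own data and the cooling device instances.

```c
#include <stddef.h>

struct zone;

/* Hypothetical, reduced governor ops; the real structure has more members. */
struct governor {
	int  (*bind_to_tz)(struct zone *z);
	void (*unbind_from_tz)(struct zone *z);
};

struct zone {
	const struct governor *governor;
	int enabled;
	int governor_bound;
	int cooling_state;	/* 0 means no mitigation */
};

/* Attach: call the governor's bind callback, if it has one. */
static int governor_attach(struct zone *z)
{
	if (z->governor && z->governor->bind_to_tz)
		return z->governor->bind_to_tz(z);
	return 0;
}

/* Detach: call the governor's unbind callback, if it has one. */
static void governor_detach(struct zone *z)
{
	if (z->governor && z->governor->unbind_from_tz)
		z->governor->unbind_from_tz(z);
}

/* Invented example callbacks for a governor. */
static int demo_bind(struct zone *z)
{
	z->governor_bound = 1;
	return 0;
}

static void demo_unbind(struct zone *z)
{
	z->cooling_state = 0;	/* drop any mitigation still applied */
	z->governor_bound = 0;
}

static const struct governor demo_governor = {
	.bind_to_tz = demo_bind,
	.unbind_from_tz = demo_unbind,
};

/* Enabling attaches the governor; a failed bind leaves the zone disabled. */
static int zone_enable(struct zone *z)
{
	int ret = governor_attach(z);

	if (ret)
		return ret;
	z->enabled = 1;
	return 0;
}

/* Disabling stops monitoring and detaches, which clears the mitigation. */
static void zone_disable(struct zone *z)
{
	z->enabled = 0;
	governor_detach(z);
}
```

The point of the design is that clearing mitigation becomes the governor's responsibility in its unbind callback, so the disable path does not need to re-run trip handling with an invalidated temperature.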


Pandruvada, Srinivas Jan. 24, 2022, 1:05 a.m. UTC | #7
On Sun, 2022-01-23 at 21:51 +0100, Daniel Lezcano wrote:
> 
> Hi Manaf,
> 
> semantically speaking disabling a thermal zone would be to detach the
> thermal zone from its governor and stop the monitoring.
> 
> May be add the functions
> 
>  - thermal_governor_attach(struct thermal_zone_device *tzd)
>    {
>         ...
>         if (tz->governor && tz->governor->bind_to_tz) {
>                 if (tz->governor->bind_to_tz(tz)) {
>         }
>         ...
>    }
> 
>  - thermal_governor_detach(struct thermal_zone_device *tzd)
>    {
>         ...
>         if (tz->governor && tz->governor->unbind_from_tz)
>                 tz->governor->unbind_from_tz(tz);
>         ...
>    }
> 
> And add in the step_wise and power_allocator the reset of the
> governor's
> data as well as the cooling device instances in the unbind_from_tz()
> callback
> 
> Then, thermal_zone_device_enable() attaches and
> thermal_zone_device_disable() detaches the governor.
> 
> Does it make sense ?
This is better.

Thanks,
Srinivas

Manaf Meethalavalappu Pallikunhi Jan. 25, 2022, 3:48 p.m. UTC | #8
Hi Daniel,

On 1/24/2022 6:35 AM, Pandruvada, Srinivas wrote:
> On Sun, 2022-01-23 at 21:51 +0100, Daniel Lezcano wrote:
>> Hi Manaf,
>>
>> semantically speaking disabling a thermal zone would be to detach the
>> thermal zone from its governor and stop the monitoring.
>>
>> May be add the functions
>>
>>   - thermal_governor_attach(struct thermal_zone_device *tzd)
>>     {
>>          ...
>>          if (tz->governor && tz->governor->bind_to_tz) {
>>                  if (tz->governor->bind_to_tz(tz)) {
>>          }
>>          ...
>>     }
>>
>>   - thermal_governor_detach(struct thermal_zone_device *tzd)
>>     {
>>          ...
>>          if (tz->governor && tz->governor->unbind_from_tz)
>>                  tz->governor->unbind_from_tz(tz);
>>          ...
>>     }
>>
>> And add in the step_wise and power_allocator the reset of the
>> governor's
>> data as well as the cooling device instances in the unbind_from_tz()
>> callback
>>
>> Then, thermal_zone_device_enable() attaches and
>> thermal_zone_device_disable() detaches the governor.
>>
>> Does it make sense ?
> This is better.
>
> Thanks,
> Srinivas
Yes, it makes sense. I will update it in v4.

Patch

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 51374f4..e288c82 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -447,10 +447,18 @@  static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
 
 	thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
 
-	if (mode == THERMAL_DEVICE_ENABLED)
+	if (mode == THERMAL_DEVICE_ENABLED) {
 		thermal_notify_tz_enable(tz->id);
-	else
+	} else {
+		int trip;
+
+		/* make sure all previous throttlings are cleared */
+		thermal_zone_device_init(tz);
+		for (trip = 0; trip < tz->trips; trip++)
+			handle_thermal_trip(tz, trip);
+
 		thermal_notify_tz_disable(tz->id);
+	}
 
 	return ret;
 }