Message ID | 20200408041917.2329-5-rui.zhang@intel.com (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | Zhang Rui |
Headers | show |
Series | [RFC,1/5] thermal: rename thermal_cooling_device_stats_update() | expand |
Hi Rui, On 08/04/2020 06:19, Zhang Rui wrote: > ACPI processor cooling device supports 1 cooling state before cpufreq > driver probed, and 4 cooling states after cpufreq driver probed. What is this one state ? > Thus update the statistics table when the cpufeq driver is > probed/unprobed. To be honest, the series seems to skirt a problem in the acpi processor. If there is a new policy, then there is a new cooling device. Why not unregister the old one and register the new one ? > This fixes an OOB issue when updating the statistics of the processor > cooling device. > > Fixes: 8ea229511e06 ("thermal: Add cooling device's statistics in sysfs") > Reported-by: Takashi Iwai <tiwai@suse.de> > Signed-off-by: Zhang Rui <rui.zhang@intel.com> > --- > drivers/acpi/processor_thermal.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c > index 41feb88ee92d..179d1b50ee2b 100644 > --- a/drivers/acpi/processor_thermal.c > +++ b/drivers/acpi/processor_thermal.c > @@ -142,6 +142,7 @@ void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy) > if (ret < 0) > pr_err("Failed to add freq constraint for CPU%d (%d)\n", > cpu, ret); > + thermal_cdev_stats_update_max(pr->cdev); > } > } > > @@ -154,6 +155,7 @@ void acpi_thermal_cpufreq_exit(struct cpufreq_policy *policy) > > if (pr) > freq_qos_remove_request(&pr->thermal_req); > + thermal_cdev_stats_update_max(pr->cdev); > } > } > #else /* ! CONFIG_CPU_FREQ */ >
On Thu, 2020-04-09 at 15:34 +0200, Daniel Lezcano wrote: > Hi Rui, > > > On 08/04/2020 06:19, Zhang Rui wrote: > > ACPI processor cooling device supports 1 cooling state before > > cpufreq > > driver probed, and 4 cooling states after cpufreq driver probed. > > What is this one state ? One state means its original state and we can not set it to others. ACPI processor cooling states are combined of p-state cooling states and a couple of optional t-state cooling states. Say, if we have a processor device supports 7 throttling states, it actually supports 8 cooling states with cpufreq driver unprobed, and 11 cooling states with cpufreq driver probed. > > > Thus update the statistics table when the cpufeq driver is > > probed/unprobed. > > To be honest, the series seems to skirt a problem in the acpi > processor. > > If there is a new policy, then there is a new cooling device. Why not > unregister the old one and register the new one ? > Good point. IMO, the real problem is that do we support dynamic max_cooling_state or not in the thermal framework. Previously, I thought we don't have a hard rule of static max_cooling_state because we always invoke .get_max_state() callback when we need it. But after a second thought, actually we do have this limitation. For example, when binding cooling devices to thermal zones, the upper limit is set based on the return value of .get_max_state() at the binding time, and we never update the upper limit later. So this ACPI processor issue is not just about statistics table update issue, we actually lose some of its cooling states. Thus, a new max_state means that all the previous setting of the cooling_device, i.e. all the thermal instances of the cooling device, needs to get updated. And to fix this, it's better to a. unregister and re-register the cooling device as you suggested. or b. introduce an API that updates the cooling device entirely instead of statistics table only. For either of the above solutions, we'd better to cleanup the code to invoke .get_max_state() during registration/max_state_reset phase, once, and then always use cached value later. And plus, if we want to follow solution a), it's better to remove .get_max_state() callback and use an integer instead so that every driver knows this limitation. I'd vote for solution a) if there is no soc thermal driver that may return dynamic max_states. Do I still miss something? thanks, rui > > > This fixes an OOB issue when updating the statistics of the > > processor > > cooling device. > > > > Fixes: 8ea229511e06 ("thermal: Add cooling device's statistics in > > sysfs") > > Reported-by: Takashi Iwai <tiwai@suse.de> > > Signed-off-by: Zhang Rui <rui.zhang@intel.com> > > --- > > drivers/acpi/processor_thermal.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/acpi/processor_thermal.c > > b/drivers/acpi/processor_thermal.c > > index 41feb88ee92d..179d1b50ee2b 100644 > > --- a/drivers/acpi/processor_thermal.c > > +++ b/drivers/acpi/processor_thermal.c > > @@ -142,6 +142,7 @@ void acpi_thermal_cpufreq_init(struct > > cpufreq_policy *policy) > > if (ret < 0) > > pr_err("Failed to add freq constraint for CPU%d > > (%d)\n", > > cpu, ret); > > + thermal_cdev_stats_update_max(pr->cdev); > > } > > } > > > > @@ -154,6 +155,7 @@ void acpi_thermal_cpufreq_exit(struct > > cpufreq_policy *policy) > > > > if (pr) > > freq_qos_remove_request(&pr->thermal_req); > > + thermal_cdev_stats_update_max(pr->cdev); > > } > > } > > #else /* ! CONFIG_CPU_FREQ */ > > > >
On 10/04/2020 10:02, Zhang Rui wrote: > On Thu, 2020-04-09 at 15:34 +0200, Daniel Lezcano wrote: >> Hi Rui, >> >> >> On 08/04/2020 06:19, Zhang Rui wrote: >>> ACPI processor cooling device supports 1 cooling state before >>> cpufreq >>> driver probed, and 4 cooling states after cpufreq driver probed. >> >> What is this one state ? > > One state means its original state and we can not set it to others. > > ACPI processor cooling states are combined of p-state cooling states > and a couple of optional t-state cooling states. > > Say, if we have a processor device supports 7 throttling states, it > actually supports 8 cooling states with cpufreq driver unprobed, and 11 > cooling states with cpufreq driver probed. > >> >>> Thus update the statistics table when the cpufeq driver is >>> probed/unprobed. >> >> To be honest, the series seems to skirt a problem in the acpi >> processor. >> >> If there is a new policy, then there is a new cooling device. Why not >> unregister the old one and register the new one ? >> > Good point. > IMO, the real problem is that do we support dynamic max_cooling_state > or not in the thermal framework. > Previously, I thought we don't have a hard rule of static > max_cooling_state because we always invoke .get_max_state() callback > when we need it. But after a second thought, actually we do have this > limitation. For example, when binding cooling devices to thermal zones, > the upper limit is set based on the return value of .get_max_state() at > the binding time, and we never update the upper limit later. > So this ACPI processor issue is not just about statistics table update > issue, we actually lose some of its cooling states. > > Thus, a new max_state means that all the previous setting of the > cooling_device, i.e. all the thermal instances of the cooling device, > needs to get updated. > > And to fix this, it's better to > a. unregister and re-register the cooling device as you suggested. > or > b. introduce an API that updates the cooling device entirely instead of > statistics table only. > > For either of the above solutions, we'd better to cleanup the code to > invoke .get_max_state() during registration/max_state_reset phase, > once, and then always use cached value later. > And plus, if we want to follow solution a), it's better to remove > .get_max_state() callback and use an integer instead so that every > driver knows this limitation. > I'd vote for solution a) if there is no soc thermal driver that may > return dynamic max_states. > > Do I still miss something? I agree for the a) solution too. But regarding the get_max_state() callback being converted to a integer, the driver int3406_thermal.c computes the upper and lower limits which are updated on a INT3406_BRIGHTNESS_LIMITS_CHANGED notification and get_max_state() does uppper - lower.
On Fri, Apr 10, 2020 at 10:02 AM Zhang Rui <rui.zhang@intel.com> wrote: > > On Thu, 2020-04-09 at 15:34 +0200, Daniel Lezcano wrote: > > Hi Rui, > > > > > > On 08/04/2020 06:19, Zhang Rui wrote: > > > ACPI processor cooling device supports 1 cooling state before > > > cpufreq > > > driver probed, and 4 cooling states after cpufreq driver probed. > > > > What is this one state ? > > One state means its original state and we can not set it to others. > > ACPI processor cooling states are combined of p-state cooling states > and a couple of optional t-state cooling states. > > Say, if we have a processor device supports 7 throttling states, it > actually supports 8 cooling states with cpufreq driver unprobed, and 11 > cooling states with cpufreq driver probed. > > > > > > Thus update the statistics table when the cpufeq driver is > > > probed/unprobed. > > > > To be honest, the series seems to skirt a problem in the acpi > > processor. > > > > If there is a new policy, then there is a new cooling device. Why not > > unregister the old one and register the new one ? > > > Good point. > IMO, the real problem is that do we support dynamic max_cooling_state > or not in the thermal framework. > Previously, I thought we don't have a hard rule of static > max_cooling_state because we always invoke .get_max_state() callback > when we need it. But after a second thought, actually we do have this > limitation. For example, when binding cooling devices to thermal zones, > the upper limit is set based on the return value of .get_max_state() at > the binding time, and we never update the upper limit later. > So this ACPI processor issue is not just about statistics table update > issue, we actually lose some of its cooling states. > > Thus, a new max_state means that all the previous setting of the > cooling_device, i.e. all the thermal instances of the cooling device, > needs to get updated. > > And to fix this, it's better to > a. unregister and re-register the cooling device as you suggested. > or > b. introduce an API that updates the cooling device entirely instead of > statistics table only. > > For either of the above solutions, we'd better to cleanup the code to > invoke .get_max_state() during registration/max_state_reset phase, > once, and then always use cached value later. > And plus, if we want to follow solution a), it's better to remove > .get_max_state() callback and use an integer instead so that every > driver knows this limitation. > I'd vote for solution a) if there is no soc thermal driver that may > return dynamic max_states. I believe I mentioned one more option, which would be to introduce an optional callback into struct thermal_cooling_device_ops to return the maximum possible return value of .get_max_state(), say .get_max_state_limit(). That would be used for the allocation of the stats table and the drivers where the .get_max_state() return value could not change might set the new callback to NULL. Then, upon a change of the .get_max_state() return value, the driver providing it would be responsible for rearranging the stats to reflect the new set of available states. Cheers!
On Fri, 2020-04-10 at 16:10 +0200, Rafael J. Wysocki wrote: > On Fri, Apr 10, 2020 at 10:02 AM Zhang Rui <rui.zhang@intel.com> > wrote: > > > > On Thu, 2020-04-09 at 15:34 +0200, Daniel Lezcano wrote: > > > Hi Rui, > > > > > > > > > On 08/04/2020 06:19, Zhang Rui wrote: > > > > ACPI processor cooling device supports 1 cooling state before > > > > cpufreq > > > > driver probed, and 4 cooling states after cpufreq driver > > > > probed. > > > > > > What is this one state ? > > > > One state means its original state and we can not set it to others. > > > > ACPI processor cooling states are combined of p-state cooling > > states > > and a couple of optional t-state cooling states. > > > > Say, if we have a processor device supports 7 throttling states, it > > actually supports 8 cooling states with cpufreq driver unprobed, > > and 11 > > cooling states with cpufreq driver probed. > > > > > > > > > Thus update the statistics table when the cpufeq driver is > > > > probed/unprobed. > > > > > > To be honest, the series seems to skirt a problem in the acpi > > > processor. > > > > > > If there is a new policy, then there is a new cooling device. Why > > > not > > > unregister the old one and register the new one ? > > > > > > > Good point. > > IMO, the real problem is that do we support dynamic > > max_cooling_state > > or not in the thermal framework. > > Previously, I thought we don't have a hard rule of static > > max_cooling_state because we always invoke .get_max_state() > > callback > > when we need it. But after a second thought, actually we do have > > this > > limitation. For example, when binding cooling devices to thermal > > zones, > > the upper limit is set based on the return value of > > .get_max_state() at > > the binding time, and we never update the upper limit later. > > So this ACPI processor issue is not just about statistics table > > update > > issue, we actually lose some of its cooling states. > > > > Thus, a new max_state means that all the previous setting of the > > cooling_device, i.e. all the thermal instances of the cooling > > device, > > needs to get updated. > > > > And to fix this, it's better to > > a. unregister and re-register the cooling device as you suggested. > > or > > b. introduce an API that updates the cooling device entirely > > instead of > > statistics table only. > > > > For either of the above solutions, we'd better to cleanup the code > > to > > invoke .get_max_state() during registration/max_state_reset phase, > > once, and then always use cached value later. > > And plus, if we want to follow solution a), it's better to remove > > .get_max_state() callback and use an integer instead so that every > > driver knows this limitation. > > I'd vote for solution a) if there is no soc thermal driver that may > > return dynamic max_states. > > I believe I mentioned one more option, which would be to introduce an > optional callback into struct thermal_cooling_device_ops to return > the > maximum possible return value of .get_max_state(), say > .get_max_state_limit(). That would be used for the allocation of the > stats table and the drivers where the .get_max_state() return value > could not change might set the new callback to NULL. > For a dynamic max_state cooling device, now the problem is not just about the statistics table. Take the ACPI processor cooling device for example, we set its internal limit based on the return value of .get_max_state() during the cooling device registration, thus if it returns 1, actually we can not use deep cooling state later at all. Plus, when a max_state changed, the meaning of each cooling state may also actually changed. Say, for ACPI processor, cooling state 2 means 60% of max_freq with cpufreq driver, but means ACPI processor T2 throttle state with cpuferq driver unprobed. That says, if we don't do a full unregistration and registration in this case, at least we need to keep the device node but re-evaluate .get_max_state() and update all its thermal instances, as long as the statistics table change. thanks, rui > Then, upon a change of the .get_max_state() return value, the driver > providing it would be responsible for rearranging the stats to > reflect > the new set of available states. > > Cheers!
On Fri, 2020-04-10 at 14:10 +0200, Daniel Lezcano wrote: > On 10/04/2020 10:02, Zhang Rui wrote: > > On Thu, 2020-04-09 at 15:34 +0200, Daniel Lezcano wrote: > > > Hi Rui, > > > > > > > > > On 08/04/2020 06:19, Zhang Rui wrote: > > > > ACPI processor cooling device supports 1 cooling state before > > > > cpufreq > > > > driver probed, and 4 cooling states after cpufreq driver > > > > probed. > > > > > > What is this one state ? > > > > One state means its original state and we can not set it to others. > > > > ACPI processor cooling states are combined of p-state cooling > > states > > and a couple of optional t-state cooling states. > > > > Say, if we have a processor device supports 7 throttling states, it > > actually supports 8 cooling states with cpufreq driver unprobed, > > and 11 > > cooling states with cpufreq driver probed. > > > > > > > > > Thus update the statistics table when the cpufeq driver is > > > > probed/unprobed. > > > > > > To be honest, the series seems to skirt a problem in the acpi > > > processor. > > > > > > If there is a new policy, then there is a new cooling device. Why > > > not > > > unregister the old one and register the new one ? > > > > > > > Good point. > > IMO, the real problem is that do we support dynamic > > max_cooling_state > > or not in the thermal framework. > > Previously, I thought we don't have a hard rule of static > > max_cooling_state because we always invoke .get_max_state() > > callback > > when we need it. But after a second thought, actually we do have > > this > > limitation. For example, when binding cooling devices to thermal > > zones, > > the upper limit is set based on the return value of > > .get_max_state() at > > the binding time, and we never update the upper limit later. > > So this ACPI processor issue is not just about statistics table > > update > > issue, we actually lose some of its cooling states. > > > > Thus, a new max_state means that all the previous setting of the > > cooling_device, i.e. all the thermal instances of the cooling > > device, > > needs to get updated. > > > > And to fix this, it's better to > > a. unregister and re-register the cooling device as you suggested. > > or > > b. introduce an API that updates the cooling device entirely > > instead of > > statistics table only. > > > > For either of the above solutions, we'd better to cleanup the code > > to > > invoke .get_max_state() during registration/max_state_reset phase, > > once, and then always use cached value later. > > And plus, if we want to follow solution a), it's better to remove > > .get_max_state() callback and use an integer instead so that every > > driver knows this limitation. > > I'd vote for solution a) if there is no soc thermal driver that may > > return dynamic max_states. > > > > Do I still miss something? > > I agree for the a) solution too. > > But regarding the get_max_state() callback being converted to a > integer, > the driver int3406_thermal.c computes the upper and lower limits > which > are updated on a INT3406_BRIGHTNESS_LIMITS_CHANGED notification and > get_max_state() does uppper - lower. Right, this is another case shows that it's better to support dynamic max_state. IMO, this is not difficult to do. We just need to introduce a new API, which reuses the current cdev device, and reset its every thermal instance, and update all the thermal zones the cdev is involved. what do you think? thanks, rui
Hi Rui, On 12/04/2020 08:13, Zhang Rui wrote: > On Fri, 2020-04-10 at 14:10 +0200, Daniel Lezcano wrote: [ ... ] >>> And to fix this, it's better to >>> a. unregister and re-register the cooling device as you suggested. >>> or >>> b. introduce an API that updates the cooling device entirely >>> instead of >>> statistics table only. >>> >>> For either of the above solutions, we'd better to cleanup the code >>> to >>> invoke .get_max_state() during registration/max_state_reset phase, >>> once, and then always use cached value later. >>> And plus, if we want to follow solution a), it's better to remove >>> .get_max_state() callback and use an integer instead so that every >>> driver knows this limitation. >>> I'd vote for solution a) if there is no soc thermal driver that may >>> return dynamic max_states. >>> >>> Do I still miss something? >> >> I agree for the a) solution too. >> >> But regarding the get_max_state() callback being converted to a >> integer, >> the driver int3406_thermal.c computes the upper and lower limits >> which >> are updated on a INT3406_BRIGHTNESS_LIMITS_CHANGED notification and >> get_max_state() does uppper - lower. > > Right, this is another case shows that it's better to support dynamic > max_state. > IMO, this is not difficult to do. We just need to introduce a new API, > which reuses the current cdev device, and reset its every thermal > instance, and update all the thermal zones the cdev is involved. > what do you think? I like how the thermal framework is designed but I think there are too many API for the thermal framework and it deserves a simplification rather than adding more of them. There is no place where the get_max_state is cached except in the stats structure. In the function thermal_cooling_device_stats_update(): Is it possible to just compare the 'new_state' parameter with stats->max_state and if it is greater increase the stats table and update max_state to the new_state ?
On Sun, 2020-04-12 at 12:07 +0200, Daniel Lezcano wrote: > > > > > Hi Rui, > > On 12/04/2020 08:13, Zhang Rui wrote: > > On Fri, 2020-04-10 at 14:10 +0200, Daniel Lezcano wrote: > > [ ... ] > > > > > And to fix this, it's better to > > > > a. unregister and re-register the cooling device as you > > > > suggested. > > > > or > > > > b. introduce an API that updates the cooling device entirely > > > > instead of > > > > statistics table only. > > > > > > > > For either of the above solutions, we'd better to cleanup the > > > > code > > > > to > > > > invoke .get_max_state() during registration/max_state_reset > > > > phase, > > > > once, and then always use cached value later. > > > > And plus, if we want to follow solution a), it's better to > > > > remove > > > > .get_max_state() callback and use an integer instead so that > > > > every > > > > driver knows this limitation. > > > > I'd vote for solution a) if there is no soc thermal driver that > > > > may > > > > return dynamic max_states. > > > > > > > > Do I still miss something? > > > > > > I agree for the a) solution too. > > > > > > But regarding the get_max_state() callback being converted to a > > > integer, > > > the driver int3406_thermal.c computes the upper and lower limits > > > which > > > are updated on a INT3406_BRIGHTNESS_LIMITS_CHANGED notification > > > and > > > get_max_state() does uppper - lower. > > > > Right, this is another case shows that it's better to support > > dynamic > > max_state. > > IMO, this is not difficult to do. We just need to introduce a new > > API, > > which reuses the current cdev device, and reset its every thermal > > instance, and update all the thermal zones the cdev is involved. > > what do you think? > > I like how the thermal framework is designed but I think there are > too > many API for the thermal framework and it deserves a simplification > rather than adding more of them. I agree. > > There is no place where the get_max_state is cached except in the > stats > structure. why we can not have a cdev->max_state field, and get it updated right after .get_max_state(). and .get_max_state() is only invoked a) during cooling device registration b) when cooling device update its max_state via the new API. > > In the function thermal_cooling_device_stats_update(): > > Is it possible to just compare the 'new_state' parameter with > stats->max_state and if it is greater increase the stats table and > update max_state to the new_state ? > the problem is that thermal_cooling_device_stats_update() is invoked only if thermal zone are updated or the cur_state sysfs attribute is changed. There is no way for a cooling device driver to tell thermal framework that it has changed. Say, for the problem on hand, the statistics table will not be updated in time when cpufreq driver probed. thanks, rui
On 13/04/2020 04:01, Zhang Rui wrote: > On Sun, 2020-04-12 at 12:07 +0200, Daniel Lezcano wrote: [ ... ] > why we can not have a cdev->max_state field, and get it updated right > after .get_max_state(). > and .get_max_state() is only invoked > a) during cooling device registration > b) when cooling device update its max_state via the new API. > >> >> In the function thermal_cooling_device_stats_update(): >> >> Is it possible to just compare the 'new_state' parameter with >> stats->max_state and if it is greater increase the stats table and >> update max_state to the new_state ? >> > the problem is that thermal_cooling_device_stats_update() is invoked > only if thermal zone are updated or the cur_state sysfs attribute is > changed. > There is no way for a cooling device driver to tell thermal framework > that it has changed. > Say, for the problem on hand, the statistics table will not be updated > in time when cpufreq driver probed. Except I'm missing something, the statistics are only read from userspace via sysfs. userspace is not notified about a stat change. Is it really a problem the table is not updated right at the probe time ? Does it really matter if the user sees the old table until an update happens on a new higher max state ? The table is always consistent whenever the userspace reads the content. A new entry will appear only if it is used, no?
On Mon, 2020-04-13 at 20:06 +0200, Daniel Lezcano wrote: > On 13/04/2020 04:01, Zhang Rui wrote: > > On Sun, 2020-04-12 at 12:07 +0200, Daniel Lezcano wrote: > > [ ... ] > > > why we can not have a cdev->max_state field, and get it updated > > right > > after .get_max_state(). > > and .get_max_state() is only invoked > > a) during cooling device registration > > b) when cooling device update its max_state via the new API. > > > > > > > > In the function thermal_cooling_device_stats_update(): > > > > > > Is it possible to just compare the 'new_state' parameter with > > > stats->max_state and if it is greater increase the stats table > > > and > > > update max_state to the new_state ? > > > > > > > the problem is that thermal_cooling_device_stats_update() is > > invoked > > only if thermal zone are updated or the cur_state sysfs attribute > > is > > changed. > > There is no way for a cooling device driver to tell thermal > > framework > > that it has changed. > > Say, for the problem on hand, the statistics table will not be > > updated > > in time when cpufreq driver probed. > > Except I'm missing something, the statistics are only read from > userspace via sysfs. I agree. > > userspace is not notified about a stat change. Is it really a problem > the table is not updated right at the probe time ? > Does it really matter > if the user sees the old table until an update happens on a new > higher > max state ? > > The table is always consistent whenever the userspace reads the > content. > > A new entry will appear only if it is used, no? > Hmm, IMO, stats table is not the biggest problem here. The problem is that thermal framework is not aware of the max_state change, and the thermal instances are never updated according to the new max_state. So, we should invoke .get_max_state() in thermal_zone_device_update() and update the thermal instances accordingly. And then, what we need to do is just to do stats update right after .get_max_state() being invoked. About how to update the stats table, I think adding new entries is not enough, because the meaning of each cooling state may change when max_state changes, thus I'd prefer a full reset/resizing of the table. thanks, rui > > > > >
On 16/04/2020 06:46, Zhang Rui wrote: > On Mon, 2020-04-13 at 20:06 +0200, Daniel Lezcano wrote: >> On 13/04/2020 04:01, Zhang Rui wrote: >>> On Sun, 2020-04-12 at 12:07 +0200, Daniel Lezcano wrote: >> >> [ ... ] >> >>> why we can not have a cdev->max_state field, and get it updated >>> right >>> after .get_max_state(). >>> and .get_max_state() is only invoked >>> a) during cooling device registration >>> b) when cooling device update its max_state via the new API. >>> >>>> >>>> In the function thermal_cooling_device_stats_update(): >>>> >>>> Is it possible to just compare the 'new_state' parameter with >>>> stats->max_state and if it is greater increase the stats table >>>> and >>>> update max_state to the new_state ? >>>> >>> >>> the problem is that thermal_cooling_device_stats_update() is >>> invoked >>> only if thermal zone are updated or the cur_state sysfs attribute >>> is >>> changed. >>> There is no way for a cooling device driver to tell thermal >>> framework >>> that it has changed. >>> Say, for the problem on hand, the statistics table will not be >>> updated >>> in time when cpufreq driver probed. >> >> Except I'm missing something, the statistics are only read from >> userspace via sysfs. > > I agree. >> >> userspace is not notified about a stat change. Is it really a problem >> the table is not updated right at the probe time ? > >> Does it really matter >> if the user sees the old table until an update happens on a new >> higher >> max state ? >> >> The table is always consistent whenever the userspace reads the >> content. > >> >> A new entry will appear only if it is used, no? >> > Hmm, IMO, stats table is not the biggest problem here. > The problem is that thermal framework is not aware of the max_state > change, and the thermal instances are never updated according to the > new max_state. > So, we should invoke .get_max_state() in thermal_zone_device_update() > and update the thermal instances accordingly. > And then, what we need to do is just to do stats update right after > .get_max_state() being invoked. > > About how to update the stats table, I think adding new entries is not > enough, because the meaning of each cooling state may change when > max_state changes, thus I'd prefer a full reset/resizing of the table. Ok, I understand. Are planning to use the existing API: thermal_zone_device_update(tz, THERMAL_TABLE_CHANGED); ?
On Thu, 2020-04-16 at 09:58 +0200, Daniel Lezcano wrote: > On 16/04/2020 06:46, Zhang Rui wrote: > > On Mon, 2020-04-13 at 20:06 +0200, Daniel Lezcano wrote: > > > On 13/04/2020 04:01, Zhang Rui wrote: > > > > On Sun, 2020-04-12 at 12:07 +0200, Daniel Lezcano wrote: > > > > > > [ ... ] > > > > > > > why we can not have a cdev->max_state field, and get it updated > > > > right > > > > after .get_max_state(). > > > > and .get_max_state() is only invoked > > > > a) during cooling device registration > > > > b) when cooling device update its max_state via the new API. > > > > > > > > > > > > > > In the function thermal_cooling_device_stats_update(): > > > > > > > > > > Is it possible to just compare the 'new_state' parameter > > > > > with > > > > > stats->max_state and if it is greater increase the stats > > > > > table > > > > > and > > > > > update max_state to the new_state ? > > > > > > > > > > > > > the problem is that thermal_cooling_device_stats_update() is > > > > invoked > > > > only if thermal zone are updated or the cur_state sysfs > > > > attribute > > > > is > > > > changed. > > > > There is no way for a cooling device driver to tell thermal > > > > framework > > > > that it has changed. > > > > Say, for the problem on hand, the statistics table will not be > > > > updated > > > > in time when cpufreq driver probed. > > > > > > Except I'm missing something, the statistics are only read from > > > userspace via sysfs. > > > > I agree. > > > > > > userspace is not notified about a stat change. Is it really a > > > problem > > > the table is not updated right at the probe time ? > > > Does it really matter > > > if the user sees the old table until an update happens on a new > > > higher > > > max state ? > > > > > > The table is always consistent whenever the userspace reads the > > > content. > > > > > > A new entry will appear only if it is used, no? > > > > > > > Hmm, IMO, stats table is not the biggest problem here. > > The problem is that thermal framework is not aware of the max_state > > change, and the thermal instances are never updated according to > > the > > new max_state. > > So, we should invoke .get_max_state() in > > thermal_zone_device_update() > > and update the thermal instances accordingly. > > And then, what we need to do is just to do stats update right after > > .get_max_state() being invoked. > > > > About how to update the stats table, I think adding new entries is > > not > > enough, because the meaning of each cooling state may change when > > max_state changes, thus I'd prefer a full reset/resizing of the > > table. > > Ok, I understand. Are planning to use the existing API: > > thermal_zone_device_update(tz, THERMAL_TABLE_CHANGED); we can not use THERMAL_TABLE_CHANGED because cooling device driver can not invoke thermal_zone_device_update(). So, the cooling device max_state is changed, but the thermal framework will not know this until the cooling device being used for a temperature change. For the cooling devices that are not used by any thermal zones, the stats files are probably out of date, but maybe nobody cares. Or we can fix this by updating stats table for every stats sysfs attribute access. thanks, rui
diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c index 41feb88ee92d..179d1b50ee2b 100644 --- a/drivers/acpi/processor_thermal.c +++ b/drivers/acpi/processor_thermal.c @@ -142,6 +142,7 @@ void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy) if (ret < 0) pr_err("Failed to add freq constraint for CPU%d (%d)\n", cpu, ret); + thermal_cdev_stats_update_max(pr->cdev); } } @@ -154,6 +155,7 @@ void acpi_thermal_cpufreq_exit(struct cpufreq_policy *policy) if (pr) freq_qos_remove_request(&pr->thermal_req); + thermal_cdev_stats_update_max(pr->cdev); } } #else /* ! CONFIG_CPU_FREQ */
ACPI processor cooling device supports 1 cooling state before cpufreq driver probed, and 4 cooling states after cpufreq driver probed. Thus update the statistics table when the cpufeq driver is probed/unprobed. This fixes an OOB issue when updating the statistics of the processor cooling device. Fixes: 8ea229511e06 ("thermal: Add cooling device's statistics in sysfs") Reported-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Zhang Rui <rui.zhang@intel.com> --- drivers/acpi/processor_thermal.c | 2 ++ 1 file changed, 2 insertions(+)