[v5,0/4] powercap/dtpm: Add the DTPM framework

Message ID 20201208164145.19493-1-daniel.lezcano@linaro.org (mailing list archive)

Message

Daniel Lezcano Dec. 8, 2020, 4:41 p.m. UTC
The density of components has greatly increased over the last decade,
bringing a large number of heat sources, which are monitored by more
than 20 sensors on recent SoCs. The skin temperature, which is the case
temperature of the device, must stay below approximately 45°C in order
to comply with the legal requirements.

The skin temperature is managed as a whole by a user space daemon,
which tracks the current application profile in order to allocate a
power budget to the different components, such that the resulting
heating effect complies with the skin temperature constraint.

This technique is called Dynamic Thermal Power Management.

The Linux kernel does not provide any unified interface to act on the
power of the different devices. Currently, the thermal framework is
being changed to artificially export the performance states of
different devices, as opaque values, via the cooling device software
component. This change is done regardless of the in-kernel logic that
mitigates the temperature. The user space daemon uses all the available
knobs to act on the power limit, and those differ from one platform to
another.

This series provides a Dynamic Thermal Power Management framework that
offers a unified way to act on the power of the devices.
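
To illustrate what allocating a power budget to the different components
can look like, here is a minimal sketch in plain C. It assumes a simple
proportional policy: each component gets a share of the total budget in
proportion to its maximum power, clamped to that maximum. The names
`struct component` and `split_budget()` are hypothetical and exist only
for this illustration; they are not part of this series or of any kernel
API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one power-manageable component. */
struct component {
	const char *name;
	uint64_t max_power_uw;	/* maximum power draw, in microwatts */
	uint64_t budget_uw;	/* result: allocated budget, in microwatts */
};

/*
 * Split total_uw among the n components in proportion to their maximum
 * power. A component never receives more than its own maximum, so part
 * of the budget stays unused when total_uw exceeds the sum of maxima.
 */
static void split_budget(struct component *c, size_t n, uint64_t total_uw)
{
	uint64_t sum_max_uw = 0;
	size_t i;

	assert(c != NULL || n == 0);

	for (i = 0; i < n; i++)
		sum_max_uw += c[i].max_power_uw;

	for (i = 0; i < n; i++) {
		if (sum_max_uw == 0) {
			c[i].budget_uw = 0;
			continue;
		}

		/* Integer division rounds the share down. */
		c[i].budget_uw = total_uw * c[i].max_power_uw / sum_max_uw;
		if (c[i].budget_uw > c[i].max_power_uw)
			c[i].budget_uw = c[i].max_power_uw;
	}
}
```

With a 3 W CPU and a 1 W GPU (both values made up for the example), a
2 W total budget is split into 1.5 W and 0.5 W respectively.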

Changelog:
 V5:
  - Fixed typos in documentation
  - Added a dtpm NULL pointer check in the dtpm_register() function
 V4:
  - Replaced fine-grained spinlocks with a global tree mutex lock
    - Dropped Tested-by tag from Lukasz
  - Fixed rollback routine in dtpm_cpu
  - Checked freq_qos_request_active() when releasing the dtpm_cpu node
 V3:
  - Fixed power-limit computation in combination with hotplugging
  - Improved the encapsulation
  - Added specific ops for the leaves of the tree
  - Simplified API and self-encapsulation
  - Fixed documentation and generated it to check the content
 V2:
  - Fixed indentation
  - Fixed typos in comments
  - Fixed missing kfree for dtpm_cpu
  - Capitalized letters in the Kconfig description
  - Reduced name description
  - Stringified section name
  - Added more debug traces in the code
  - Removed duplicate initialization in the dtpm cpu

Daniel Lezcano (4):
  units: Add Watt units
  Documentation/powercap/dtpm: Add documentation for dtpm
  powercap/drivers/dtpm: Add API for dynamic thermal power management
  powercap/drivers/dtpm: Add CPU energy model based support

 Documentation/power/index.rst         |   1 +
 Documentation/power/powercap/dtpm.rst | 212 ++++++++++++
 drivers/powercap/Kconfig              |  13 +
 drivers/powercap/Makefile             |   2 +
 drivers/powercap/dtpm.c               | 473 ++++++++++++++++++++++++++
 drivers/powercap/dtpm_cpu.c           | 257 ++++++++++++++
 include/asm-generic/vmlinux.lds.h     |  11 +
 include/linux/cpuhotplug.h            |   1 +
 include/linux/dtpm.h                  |  77 +++++
 include/linux/units.h                 |   4 +
 10 files changed, 1051 insertions(+)
 create mode 100644 Documentation/power/powercap/dtpm.rst
 create mode 100644 drivers/powercap/dtpm.c
 create mode 100644 drivers/powercap/dtpm_cpu.c
 create mode 100644 include/linux/dtpm.h

Cc: Thara Gopinath <thara.gopinath@linaro.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: Ram Chandrasekar <rkumbako@codeaurora.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Lukasz Luba <lukasz.luba@arm.com>

--
2.17.1

Comments

Daniel Lezcano Dec. 11, 2020, 10:39 a.m. UTC | #1
Hi Rafael,

I believe I took all the comments into account. Do you think it is
possible to merge this series?

Rafael J. Wysocki Dec. 11, 2020, 7:15 p.m. UTC | #2
On Fri, Dec 11, 2020 at 11:41 AM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
>
> Hi Rafael,
>
> I believe I took into account all the comments, do you think it is
> possible to merge this series ?

It should be, unless more changes are requested.

I will be taking care of it next week and, if all goes well, it should
be possible to push it during the second half of the merge window.

Thanks!
Rafael J. Wysocki Dec. 22, 2020, 6:52 p.m. UTC | #3
On Fri, Dec 11, 2020 at 8:15 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Fri, Dec 11, 2020 at 11:41 AM Daniel Lezcano
> <daniel.lezcano@linaro.org> wrote:
> >
> >
> > Hi Rafael,
> >
> > I believe I took into account all the comments, do you think it is
> > possible to merge this series ?
>
> It should be, unless more changes are requested.
>
> I will be taking care of it next week and, if all goes well, it should
> be possible to push it during the second half of the merge window.

Applied as 5.11-rc material now, sorry for the delay.

Thanks!
Daniel Lezcano Dec. 23, 2020, 12:34 p.m. UTC | #4
On 22/12/2020 19:52, Rafael J. Wysocki wrote:
> On Fri, Dec 11, 2020 at 8:15 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>
>> On Fri, Dec 11, 2020 at 11:41 AM Daniel Lezcano
>> <daniel.lezcano@linaro.org> wrote:
>>>
>>>
>>> Hi Rafael,
>>>
>>> I believe I took into account all the comments, do you think it is
>>> possible to merge this series ?
>>
>> It should be, unless more changes are requested.
>>
>> I will be taking care of it next week and, if all goes well, it should
>> be possible to push it during the second half of the merge window.
> 
> Applied as 5.11-rc material now, sorry for the delay.

No problem, thank you for taking care of the series.

I did not want to add another entry in the MAINTAINERS file as you are
the maintainer of the powercap framework and that is fine.

However, the get_maintainer script (and default cccmd) does not list me
as a maintainer/author of the dtpm or idle_inject code. I would like to
be at least Cc'ed to review the changes related to those files to make
sure they stay aligned with the direction we are taking.

Is it possible to be automatically Cc'ed for the proposed changes in these
files?
Pavel Machek Dec. 24, 2020, 6:46 p.m. UTC | #5
Hi!

> The density of components greatly increased the last decade bringing a
> numerous number of heating sources which are monitored by more than 20
> sensors on recent SoC. The skin temperature, which is the case
> temperature of the device, must stay below approximately 45°C in order
> to comply with the legal requirements.

What kind of device is that?

Does that mean that running fsck is now "illegal" because temperature
will not be managed during that time?
							Pavel
Daniel Lezcano Dec. 25, 2020, 11:54 a.m. UTC | #6
On 24/12/2020 19:46, Pavel Machek wrote:
> Hi!
> 
>> The density of components greatly increased the last decade bringing a
>> numerous number of heating sources which are monitored by more than 20
>> sensors on recent SoC. The skin temperature, which is the case
>> temperature of the device, must stay below approximately 45°C in order
>> to comply with the legal requirements.
> 
> What kind of device is that?

Any complex embedded device like a phone, a laptop or a tablet with
components like NPU, CPU, GPU, GPS, DSPs, Camera, ...

> Does that mean that running fsck is now "illegal" because temperature
> will not be managed during that time?

The heating effect of the different devices will be conducted through a
common dissipation device.

The 'skin' temperature or 'case' temperature has a dedicated sensor in
the path of this dissipation device. So the temperature will increase
more slowly at this sensor level because of a higher thermal capacity.
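
The effect of the thermal capacity can be sketched with a lumped
first-order model, dT/dt = (P - (T - T_ambient) / R) / C, integrated with
explicit Euler steps. This is a textbook simplification and an assumption
made here for illustration, not a model used by the series; the function
names and all parameter values are hypothetical.

```c
#include <assert.h>

/*
 * One explicit Euler step of a lumped thermal model.
 *   temp_c:    current temperature at the sensor (deg C)
 *   ambient_c: ambient temperature (deg C)
 *   power_w:   dissipated power (W)
 *   r_k_per_w: thermal resistance to ambient (K/W)
 *   c_j_per_k: thermal capacity (J/K)
 *   dt_s:      time step (s), should be small compared to R * C
 */
static double thermal_step(double temp_c, double ambient_c, double power_w,
			   double r_k_per_w, double c_j_per_k, double dt_s)
{
	double net_w;

	assert(r_k_per_w > 0.0 && c_j_per_k > 0.0);

	/* Heat flowing in minus heat leaking out to ambient. */
	net_w = power_w - (temp_c - ambient_c) / r_k_per_w;

	return temp_c + dt_s * net_w / c_j_per_k;
}

/* Run the model for a number of steps, starting at ambient temperature. */
static double simulate(double ambient_c, double power_w, double r_k_per_w,
		       double c_j_per_k, double dt_s, int steps)
{
	double temp_c = ambient_c;
	int i;

	for (i = 0; i < steps; i++)
		temp_c = thermal_step(temp_c, ambient_c, power_w,
				      r_k_per_w, c_j_per_k, dt_s);

	return temp_c;
}
```

For the same power and thermal resistance, a larger capacity gives a lower
temperature after the same elapsed time, while the steady-state value
(ambient + P * R) is unchanged.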

The 'skin' temperature will be the result of the different components
running at the same time (eg. GPS + CPU + GPU + DSPs).

In the case of fsck, the system is in degraded mode, thus the
applications using these components are not supposed to run, and the
'skin' temperature should stay below the limit.

If you are interested, here you can find some background to explain the
'skin' temperature [1] and the spreading of the heat [2].

Hope that helps

  -- Daniel

[1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4188373/

[2] https://nanoheat.stanford.edu/sites/default/files/publications/Electronics%20Cooling%20Article.pdf