[v3,0/4] Clarify abstract scale usage for power values in Energy Model, EAS and IPA

Message ID 20201019140601.3047-1-lukasz.luba@arm.com (mailing list archive)

Message

Lukasz Luba Oct. 19, 2020, 2:05 p.m. UTC
Hi all,

The Energy Model supports power values expressed in an abstract scale.
This has an impact on Intelligent Power Allocation (IPA) and should be
documented properly. Kernel sub-systems like EAS, IPA and DTPM
(the upcoming PowerCap framework) would use the new flag to catch
potential misconfiguration where devices have registered different
power scales and thus cannot operate together.

The discussion under v2 of this patch series [2] might help you
get the context of these changes.

The agreed approach is to have the DT as a source of power values always
expressed in milli-Watts; the only way to submit abstract scale values
is via the em_dev_register_perf_domain() API.
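
To make the intent of the flag concrete, here is a rough user-space sketch (not kernel code) of the kind of check a client combining several devices could do. The struct and function names are hypothetical and exist only for this illustration; only the idea of a per-domain boolean saying "real milli-Watts or abstract scale" comes from this series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a registered performance domain. */
struct perf_domain_sketch {
	const char *name;
	bool milliwatts; /* true: real mW values; false: abstract scale */
};

/*
 * Return true when every domain uses the same power scale, so a client
 * that combines them (for example a thermal governor) can compare values.
 * An empty or single-entry set is trivially consistent.
 */
bool scales_are_consistent(const struct perf_domain_sketch *pd, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (pd[i].milliwatts != pd[0].milliwatts)
			return false;
	}
	return true;
}
```

A client would refuse to operate (or at least warn) when this returns false, because summing milli-Watts from one device with abstract units from another gives a meaningless total.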

Changes:
v3:
- added boolean flag to struct em_perf_domain and registration function
  indicating if EM holds real power values in milli-Watts (suggested by
  Daniel and agreed with Quentin)
- updated documentation regarding this new flag
- dropped DT binding change for 'sustainable-power'
- added more maintainers on CC (due to patch 1/4 touching different things)
v2 [2]:
- updated sustainable power section in IPA documentation
- updated DT binding for the 'sustainable-power'
v1 [1]:
- simple documentation update with new 'abstract scale' in EAS, EM, IPA

Regards,
Lukasz Luba

[1] https://lore.kernel.org/linux-doc/20200929121610.16060-1-lukasz.luba@arm.com/
[2] https://lore.kernel.org/lkml/20201002114426.31277-1-lukasz.luba@arm.com/

Lukasz Luba (4):
  PM / EM: Add a flag indicating units of power values in Energy Model
  docs: Clarify abstract scale usage for power values in Energy Model
  PM / EM: update the comments related to power scale
  docs: power: Update Energy Model with new flag indicating power scale

 .../driver-api/thermal/power_allocator.rst    | 13 +++++++-
 Documentation/power/energy-model.rst          | 30 +++++++++++++++----
 Documentation/scheduler/sched-energy.rst      |  5 ++++
 drivers/cpufreq/scmi-cpufreq.c                |  3 +-
 drivers/opp/of.c                              |  2 +-
 include/linux/energy_model.h                  | 20 ++++++++-----
 kernel/power/energy_model.c                   | 26 ++++++++++++++--
 7 files changed, 81 insertions(+), 18 deletions(-)

Comments

Doug Anderson Oct. 20, 2020, 12:15 a.m. UTC | #1
Hi,

While I don't feel like I have enough skin in the game to make any
demands, I'm definitely not a huge fan of this series still.  I am a
fan of documenting reality, but (to me) trying to mix stuff like this
is just going to be adding needless complexity.  From where I'm
standing, it's a lot more of a pain to specify these types of numbers
in the firmware than it is to specify them in the device tree.  They
are harder to customize per board, harder to spin, and harder to
specify constraints for everything in the system (all heat generators,
all cooling devices, etc).  ...and since we already have a way to
specify this type of thing in the device tree and that's super easy
for people to do, we're going to end up with weird mixes / matches of
numbers coming from different locations and now we've got to figure
out which numbers we can use when and which to ignore.  Ick.

In my opinion the only way to allow for mixing and matching the
bogoWatts and real Watts would be to actually have units and the
ability to provide a conversion factor somewhere.  Presumably that
might give you a chance of mixing and matching if someone wants to
provide some stuff in device tree and get other stuff from the
firmware.  Heck, I guess you could even magically figure out a
conversion factor if someone provides device tree numbers for
something that was already registered in SCMI, assuming all the SCMI
numbers are consistent with each other...
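
A minimal sketch of the conversion-factor idea, assuming the abstract scale is linear in real power (that linearity is an assumption of this illustration, not something the thread establishes). The struct and function are hypothetical; the factor is kept as a ratio taken from one reference device whose power is known in both scales.

```c
#include <stdint.h>

/* Hypothetical scale factor, kept as a ratio to avoid floating point. */
struct scale_factor {
	uint64_t mw;       /* known power of the reference device, in mW */
	uint64_t abstract; /* the same operating point in abstract units */
};

/*
 * Convert an abstract-scale value to mW using the reference ratio.
 * Returns 0 when the factor is unusable (abstract reference is 0).
 */
uint64_t abstract_to_mw(struct scale_factor f, uint64_t abstract_value)
{
	if (f.abstract == 0)
		return 0;
	return abstract_value * f.mw / f.abstract;
}
```

With a reference device known to draw 900 mW at a point the firmware reports as 300 units, every other firmware value would be multiplied by 3 under this assumption.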

-Doug
Lukasz Luba Oct. 29, 2020, 12:37 p.m. UTC | #2
On 10/20/20 1:15 AM, Doug Anderson wrote:
> Hi,
> 
> While I don't feel like I have enough skin in the game to make any
> demands, I'm definitely not a huge fan of this series still.  I am a
> fan of documenting reality, but (to me) trying to mix stuff like this
> is just going to be adding needless complexity.  From where I'm
> standing, it's a lot more of a pain to specify these types of numbers
> in the firmware than it is to specify them in the device tree.  They

When you have SCMI, you receive power values from FW directly, not using
DT.

> are harder to customize per board, harder to spin, and harder to
> specify constraints for everything in the system (all heat generators,
> all cooling devices, etc).  ...and since we already have a way to
> specify this type of thing in the device tree and that's super easy
> for people to do, we're going to end up with weird mixes / matches of
> numbers coming from different locations and now we've got to figure
> out which numbers we can use when and which to ignore.  Ick.

This is not as bad as you describe. When you have SCMI and FW,
all your perf domains should be aligned to the same scale.
For example, you have 4 little CPUs, 3 big CPUs, 1 super big CPU,
1 GPU and 1 DSP. For all of them the SCMI get_power callback should return
consistent values. You don't have to specify or reverse-engineer anything else.
A client like EAS would then use the values from the CPUs to estimate
energy, and this works fine. Another client, IPA, would use
all of them, and that also works fine.
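
The reason a consistent abstract scale is enough for a client that only compares energy estimates can be sketched as follows. This is a simplified illustration with hypothetical names, not the EAS implementation: energy is modelled as power * util / capacity, so multiplying every power value by the same factor scales both sides of the comparison equally (integer rounding aside).

```c
#include <stdint.h>

/* Hypothetical operating point: power is in any one consistent scale. */
struct op_point {
	uint64_t capacity; /* must be non-zero */
	uint64_t power;
};

/* Estimated energy for running 'util' worth of work at this point. */
uint64_t estimate_energy(struct op_point op, uint64_t util)
{
	return op.power * util / op.capacity;
}

/* Return 0 if option a is at least as cheap as option b, otherwise 1. */
int cheaper_option(struct op_point a, struct op_point b, uint64_t util)
{
	return estimate_energy(a, util) <= estimate_energy(b, util) ? 0 : 1;
}
```

The choice stays the same whether the power values are real milli-Watts or the same numbers multiplied by an unknown constant, which is why mixing scales, and not the abstract scale itself, is the problem.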

> 
> In my opinion the only way to allow for mixing and matching the
> bogoWatts and real Watts would be to actually have units and the
> ability to provide a conversion factor somewhere.  Presumably that
> might give you a chance of mixing and matching if someone wants to
> provide some stuff in device tree and get other stuff from the
> firmware.  Heck, I guess you could even magically figure out a
> conversion factor if someone provides device tree numbers for
> something that was already registered in SCMI, assuming all the SCMI
> numbers are consistent with each other...

What you demand here is another code path, just to support
reverse-engineered power values for SCMI devices, stored in DT.
The SCMI protocol code and drivers would then have to take them into account,
abandon the standard implementation and use these values to provide
'hacked' power numbers to the EM. Am I right?
It is not going to happen.

Regards,
Lukasz


Doug Anderson Oct. 29, 2020, 3:39 p.m. UTC | #3
Hi,

On Thu, Oct 29, 2020 at 5:37 AM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
> On 10/20/20 1:15 AM, Doug Anderson wrote:
> > Hi,
> >
> > While I don't feel like I have enough skin in the game to make any
> > demands, I'm definitely not a huge fan of this series still.  I am a
> > fan of documenting reality, but (to me) trying to mix stuff like this
> > is just going to be adding needless complexity.  From where I'm
> > standing, it's a lot more of a pain to specify these types of numbers
> > in the firmware than it is to specify them in the device tree.  They
>
> When you have SCMI, you receive power values from FW directly, not using
> DT.
>
> > are harder to customize per board, harder to spin, and harder to
> > specify constraints for everything in the system (all heat generators,
> > all cooling devices, etc).  ...and since we already have a way to
> > specify this type of thing in the device tree and that's super easy
> > for people to do, we're going to end up with weird mixes / matches of
> > numbers coming from different locations and now we've got to figure
> > out which numbers we can use when and which to ignore.  Ick.
>
> This is not as bad as you describe. When you have SCMI and FW,
> all your perf domains should be aligned to the same scale.
> For example, you have 4 little CPUs, 3 big CPUs, 1 super big CPU,
> 1 GPU and 1 DSP. For all of them the SCMI get_power callback should return
> consistent values. You don't have to specify or reverse-engineer anything else.
> A client like EAS would then use the values from the CPUs to estimate
> energy, and this works fine. Another client, IPA, would use
> all of them, and that also works fine.

I guess I'm confused.  When using SCMI and FW, are there already code
paths to get the board-specific "sustainable-power" from SCMI and FW?

I know that "sustainable-power" is not truly necessary.  IIRC some of
the code assumes that the lowest power state of all components must be
sustainable and uses that.  However, though this makes the code work,
it's far from ideal.  I don't want to accept a mediocre solution here.

In any case, I'm saying that even if "sustainable-power" can come from
firmware, it's not as ideal of a place for it to live.  Maybe my
experience on Chromebooks is different from the rest of upstream, but
it's generally quite easy to adjust the device tree for a board and
much harder to convince firmware folks to put a board-specific table
of values.


> > In my opinion the only way to allow for mixing and matching the
> > bogoWatts and real Watts would be to actually have units and the
> > ability to provide a conversion factor somewhere.  Presumably that
> > might give you a chance of mixing and matching if someone wants to
> > provide some stuff in device tree and get other stuff from the
> > firmware.  Heck, I guess you could even magically figure out a
> > conversion factor if someone provides device tree numbers for
> > something that was already registered in SCMI, assuming all the SCMI
> > numbers are consistent with each other...
>
> What you demand here is another code path, just to support
> reverse-engineered power values for SCMI devices, stored in DT.
> The SCMI protocol code and drivers would then have to take them into account,
> abandon the standard implementation and use these values to provide
> 'hacked' power numbers to the EM. Am I right?
> It is not going to happen.

Quite honestly, all I want to be able to do is to provide a
board-specific "sustainable-power" and have it match with the
power-coefficients.  Thus:

* If device tree accepted abstract scale, we'd be done and I'd shut
up.  ...but Rob has made it quite clear that this is a no-go.

* If it was super easy to add all these values into firmware for a
board and we could totally remove these from the device tree, I'd
grumble a bit about firmware being a terrible place for this but at
least we'd have a solution and we'd be done and I'd shut up.  NOTE: I
don't know ATF terribly well, but I'd guess that this needs to go
there?  Presumably part of this is convincing firmware folks to add
this board-specific value there...

-Doug
Lukasz Luba Oct. 29, 2020, 4:15 p.m. UTC | #4
On 10/29/20 3:39 PM, Doug Anderson wrote:
> Hi,
> 
> On Thu, Oct 29, 2020 at 5:37 AM Lukasz Luba <lukasz.luba@arm.com> wrote:
>>
>> On 10/20/20 1:15 AM, Doug Anderson wrote:
>>> Hi,
>>>
>>> While I don't feel like I have enough skin in the game to make any
>>> demands, I'm definitely not a huge fan of this series still.  I am a
>>> fan of documenting reality, but (to me) trying to mix stuff like this
>>> is just going to be adding needless complexity.  From where I'm
>>> standing, it's a lot more of a pain to specify these types of numbers
>>> in the firmware than it is to specify them in the device tree.  They
>>
>> When you have SCMI, you receive power values from FW directly, not using
>> DT.
>>
>>> are harder to customize per board, harder to spin, and harder to
>>> specify constraints for everything in the system (all heat generators,
>>> all cooling devices, etc).  ...and since we already have a way to
>>> specify this type of thing in the device tree and that's super easy
>>> for people to do, we're going to end up with weird mixes / matches of
>>> numbers coming from different locations and now we've got to figure
>>> out which numbers we can use when and which to ignore.  Ick.
>>
>> This is not as bad as you describe. When you have SCMI and FW,
>> all your perf domains should be aligned to the same scale.
>> For example, you have 4 little CPUs, 3 big CPUs, 1 super big CPU,
>> 1 GPU and 1 DSP. For all of them the SCMI get_power callback should return
>> consistent values. You don't have to specify or reverse-engineer anything else.
>> A client like EAS would then use the values from the CPUs to estimate
>> energy, and this works fine. Another client, IPA, would use
>> all of them, and that also works fine.
> 
> I guess I'm confused.  When using SCMI and FW, are there already code
> paths to get the board-specific "sustainable-power" from SCMI and FW?
> 
> I know that "sustainable-power" is not truly necessary.  IIRC some of
> the code assumes that the lowest power state of all components must be
> sustainable and uses that.  However, though this makes the code work,
> it's far from ideal.  I don't want to accept a mediocre solution here.

As you said, sustainable power is estimated when it does not come
from DT. Currently that is done based on the lowest allowed OPPs. I am
trying to address this by marking an OPP as sustainable [1]. The estimation
would then be more accurate (as would the derived coefficients).
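
To illustrate the difference between the two estimation approaches, here is a simplified sketch with hypothetical types and names (it is not the IPA code): one function sums the power of each device's lowest OPP, the other sums the power at the OPP marked as sustainable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device description; all power values share one scale. */
struct device_opps {
	const uint32_t *power;  /* power per OPP, in ascending order */
	size_t nr_opps;
	size_t sustainable_idx; /* index of the OPP marked as sustainable */
};

/* Estimate based on the lowest OPP of every device. */
uint32_t estimate_from_lowest(const struct device_opps *d, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		if (d[i].nr_opps > 0)
			sum += d[i].power[0];
	}
	return sum;
}

/* Estimate based on the OPP marked sustainable; bad indices are skipped. */
uint32_t estimate_from_sustainable(const struct device_opps *d, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		if (d[i].sustainable_idx < d[i].nr_opps)
			sum += d[i].power[d[i].sustainable_idx];
	}
	return sum;
}
```

When the sustainable OPP sits above the lowest one, the first estimate is lower than the second, so the budget derived from lowest OPPs is more conservative than it needs to be.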

> 
> In any case, I'm saying that even if "sustainable-power" can come from
> firmware, it's not as ideal of a place for it to live.  Maybe my
> experience on Chromebooks is different from the rest of upstream, but
> it's generally quite easy to adjust the device tree for a board and
> much harder to convince firmware folks to put a board-specific table
> of values.

The sysfs interface (which is already there) makes this adjustment even easier than DT.
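
For illustration, a user-space helper for such an adjustment could look like the sketch below. The function name is hypothetical, and the caller supplies the path; on a real system it would typically be the sustainable_power attribute of the relevant thermal zone, e.g. /sys/class/thermal/thermal_zone0/sustainable_power, where the zone index is an assumption and depends on the board. The value must be in the same scale as the registered power values.

```c
#include <stdio.h>

/*
 * Write 'value' as a decimal number to the file at 'path'.
 * Returns 0 on success and -1 on any failure.
 */
int write_sustainable_power(const char *path, unsigned int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%u\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```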

> 
> 
>>> In my opinion the only way to allow for mixing and matching the
>>> bogoWatts and real Watts would be to actually have units and the
>>> ability to provide a conversion factor somewhere.  Presumably that
>>> might give you a chance of mixing and matching if someone wants to
>>> provide some stuff in device tree and get other stuff from the
>>> firmware.  Heck, I guess you could even magically figure out a
>>> conversion factor if someone provides device tree numbers for
>>> something that was already registered in SCMI, assuming all the SCMI
>>> numbers are consistent with each other...
>>
>> What you demand here is another code path, just to support
>> reverse-engineered power values for SCMI devices, stored in DT.
>> The SCMI protocol code and drivers would then have to take them into account,
>> abandon the standard implementation and use these values to provide
>> 'hacked' power numbers to the EM. Am I right?
>> It is not going to happen.
> 
> Quite honestly, all I want to be able to do is to provide a
> board-specific "sustainable-power" and have it match with the
> power-coefficients.  Thus:
> 
> * If device tree accepted abstract scale, we'd be done and I'd shut
> up.  ...but Rob has made it quite clear that this is a no-go.
> 
> * If it was super easy to add all these values into firmware for a
> board and we could totally remove these from the device tree, I'd
> grumble a bit about firmware being a terrible place for this but at
> least we'd have a solution and we'd be done and I'd shut up.  NOTE: I
> don't know ATF terribly well, but I'd guess that this needs to go
> there?  Presumably part of this is convincing firmware folks to add
> this board-specific value there...

The SCMI spec that we are talking about supports a 'sustained performance'
level for each performance domain. You can check table 11 of the doc [2]
for the definition. SCMI has no concept of 'sustainable-power'
that could substitute for the missing DT value, but we can estimate it
more accurately based on the sustainable OPP.
You can see how I am going to feed that FW value into the OPP in patch
4/4 of [3]. I am also working on v4 of the improved estimation patch set for
IPA (an issue is described under v2 [4]; the latest v3 is here [5]),
which uses the proposed sustainable OPP concept (Viresh mentioned
he would like to see a user of that).

As you can see, I am not going to leave you with this issue ;)

Regards,
Lukasz


[1] https://lore.kernel.org/linux-pm/20201028140847.1018-1-lukasz.luba@arm.com/
[2] https://developer.arm.com/documentation/den0056/b
[3] https://lore.kernel.org/linux-pm/20201028140847.1018-5-lukasz.luba@arm.com/
[4] https://lore.kernel.org/linux-pm/5f682bbb-b250-49e6-dbb7-aea522a58595@arm.com/
[5] https://lore.kernel.org/lkml/20201009135850.14727-1-lukasz.luba@arm.com/

Lukasz Luba Nov. 2, 2020, 8:54 a.m. UTC | #5
Gentle ping to Quentin and Daniel to share their opinions on this patch set.
If you are OK with it, I could use it as a base for further work.

As you probably know, I am also working on 'sustainable power' estimation,
which could be used when there is no DT value but sustainable levels come from FW.
That would meet Doug's requirement for the case where the DT cannot be used
but we have sustainable levels from FW [1].

Regards,
Lukasz

[1] https://lore.kernel.org/lkml/20201028140847.1018-5-lukasz.luba@arm.com/
Quentin Perret Nov. 2, 2020, 1:54 p.m. UTC | #6
On Monday 02 Nov 2020 at 08:54:38 (+0000), Lukasz Luba wrote:
> Gentle ping to Quentin and Daniel to share their opinions on this patch set.
> If you are OK with it, I could use it as a base for further work.

One or two small nits, but overall this LGTM. Thanks Lukasz.

> As you probably know, I am also working on 'sustainable power' estimation,
> which could be used when there is no DT value but sustainable levels come from FW.
> That would meet Doug's requirement for the case where the DT cannot be used
> but we have sustainable levels from FW [1].

Cool, and also, I'd be happy to hear from Doug if passing the sustained
power via sysfs is good enough for his use-case in the meantime?

Thanks,
Quentin
Doug Anderson Nov. 3, 2020, 12:41 a.m. UTC | #7
Hi,

On Mon, Nov 2, 2020 at 5:54 AM Quentin Perret <qperret@google.com> wrote:
>
> On Monday 02 Nov 2020 at 08:54:38 (+0000), Lukasz Luba wrote:
> > Gentle ping to Quentin and Daniel to share their opinions on this patch set.
> > If you are OK with it, I could use it as a base for further work.
>
> One or two small nits, but overall this LGTM. Thanks Lukasz.
>
> > As you probably know, I am also working on 'sustainable power' estimation,
> > which could be used when there is no DT value but sustainable levels come from FW.
> > That would meet Doug's requirement for the case where the DT cannot be used
> > but we have sustainable levels from FW [1].
>
> Cool, and also, I'd be happy to hear from Doug if passing the sustained
> power via sysfs is good enough for his use-case in the meantime?

It does sound like sysfs could be made to work for us, but it's
definitely a workaround.  If the normal way to set these values was
through sysfs then it would be fine, but I think most people expect
that these values are just set up properly by the kernel.  That means
anyone using our board with a different userspace (someone running
upstream on it) would need to figure out what mechanism they were
going to use to program them.  There's very little advantage here
compared to a downstream patch that just violates official upstream
policy by putting something bogoWatts based in the device tree.

My current plan of record (which I don't love) is basically:

1. Before devices are in consumers' hands, accept bogoWatts numbers in
our downstream kernel.

2. Once devices are in consumers' hands, run the script I sent out to
generate some numbers and post them upstream.

If, at some point, there's a better solution then I'll switch to it,
but until then that seems workable even if it makes me grumpy.


-Doug
Lukasz Luba Nov. 3, 2020, 8:29 a.m. UTC | #8
On 11/2/20 1:54 PM, Quentin Perret wrote:
> On Monday 02 Nov 2020 at 08:54:38 (+0000), Lukasz Luba wrote:
>> Gentle ping to Quentin and Daniel to share their opinions on this patch set.
>> If you are OK with it, I could use it as a base for further work.
> 
> One or two small nits, but overall this LGTM. Thanks Lukasz.

Thank you Quentin for the review. I am going to send v4 with these small
changes.

Regards,
Lukasz
