diff mbox series

[v3,4/4] drm/panfrost: Register to the Energy Model with devfreq device

Message ID 20200221194731.13814-5-lukasz.luba@arm.com (mailing list archive)
State New, archived
Headers show
Series Add support for devices in the Energy Model | expand

Commit Message

Lukasz Luba Feb. 21, 2020, 7:47 p.m. UTC
Add device to the Energy Model framework. It will create a dedicated
and unified data structures used i.e. in the thermal framework.
The power model used in dev_pm_opp subsystem is simplified and created
based on DT 'dynamic-power-coefficient', volatage and frequency. It is
similar to the CPU model used in Energy Aware Scheduler.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
 drivers/gpu/drm/panfrost/panfrost_devfreq.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Rob Herring (Arm) Feb. 25, 2020, 8:57 p.m. UTC | #1
On Fri, Feb 21, 2020 at 1:48 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
> Add device to the Energy Model framework. It will create a dedicated
> and unified data structures used i.e. in the thermal framework.
> The power model used in dev_pm_opp subsystem is simplified and created
> based on DT 'dynamic-power-coefficient', volatage and frequency. It is

typo.

> similar to the CPU model used in Energy Aware Scheduler.
>
> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_devfreq.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
> index 413987038fbf..d527a5113950 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
> @@ -105,6 +105,8 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev)
>         }
>         pfdev->devfreq.devfreq = devfreq;
>
> +       dev_pm_opp_of_register_em(dev, NULL);

Can't fail?

> +
>         cooling = of_devfreq_cooling_register(dev->of_node, devfreq);
>         if (IS_ERR(cooling))
>                 DRM_DEV_INFO(dev, "Failed to register cooling device\n");
> @@ -118,6 +120,7 @@ void panfrost_devfreq_fini(struct panfrost_device *pfdev)
>  {
>         if (pfdev->devfreq.cooling)
>                 devfreq_cooling_unregister(pfdev->devfreq.cooling);
> +       dev_pm_opp_of_unregister_em(&pfdev->pdev->dev);
>         dev_pm_opp_of_remove_table(&pfdev->pdev->dev);

Does it make sense to keep this (and the registration side) as
separate calls? Perhaps there's some ordering requirement with
everything between dev_pm_opp_of_add_table() and
dev_pm_opp_of_register_em()?

While you're just adding 2 lines, it seems there's a lot of complexity
exposed to the driver just to initialize devfreq/opp.

Rob
Lukasz Luba Feb. 26, 2020, 10:06 a.m. UTC | #2
Hi Rob,

On 2/25/20 8:57 PM, Rob Herring wrote:
> On Fri, Feb 21, 2020 at 1:48 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>>
>> Add device to the Energy Model framework. It will create a dedicated
>> and unified data structures used i.e. in the thermal framework.
>> The power model used in dev_pm_opp subsystem is simplified and created
>> based on DT 'dynamic-power-coefficient', volatage and frequency. It is
> 
> typo.

I'll fix it.

> 
>> similar to the CPU model used in Energy Aware Scheduler.
>>
>> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
>> ---
>>   drivers/gpu/drm/panfrost/panfrost_devfreq.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
>> index 413987038fbf..d527a5113950 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
>> @@ -105,6 +105,8 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev)
>>          }
>>          pfdev->devfreq.devfreq = devfreq;
>>
>> +       dev_pm_opp_of_register_em(dev, NULL);
> 
> Can't fail?

Yes, it can fail but the function does not return anything. It can
easily fail, it's looking for "dynamic-power-coefficient" in the device
node. The DT binding for the devfreq devices would also be good to add..

I would have to probably change it into returning 'int' and modify all
old cpufreq drivers.

> 
>> +
>>          cooling = of_devfreq_cooling_register(dev->of_node, devfreq);
>>          if (IS_ERR(cooling))
>>                  DRM_DEV_INFO(dev, "Failed to register cooling device\n");
>> @@ -118,6 +120,7 @@ void panfrost_devfreq_fini(struct panfrost_device *pfdev)
>>   {
>>          if (pfdev->devfreq.cooling)
>>                  devfreq_cooling_unregister(pfdev->devfreq.cooling);
>> +       dev_pm_opp_of_unregister_em(&pfdev->pdev->dev);
>>          dev_pm_opp_of_remove_table(&pfdev->pdev->dev);
> 
> Does it make sense to keep this (and the registration side) as
> separate calls? Perhaps there's some ordering requirement with
> everything between dev_pm_opp_of_add_table() and
> dev_pm_opp_of_register_em()?

Yes, dev_pm_opp_of_register_em() uses em_data_callback which operates
on OPPs to calculate power values and costs, so the the OPP table should
be already there.

> 
> While you're just adding 2 lines, it seems there's a lot of complexity
> exposed to the driver just to initialize devfreq/opp.

It depends, for example devfreq devices like buses would likely never
use the energy model. Potential clients would be GPUs, DSPs, ISPs.

Could you help me with defining a DT binding for this
"dynamic-power-coefficient" entry? It could be used in different types
of devices. Should it be placed in each of these devices documentation
file, or in some one common file?

Thank you for your comments.

Regards,
Lukasz
Robin Murphy Feb. 26, 2020, 1:55 p.m. UTC | #3
On 26/02/2020 10:06 am, Lukasz Luba wrote:
[...]
>>> @@ -118,6 +120,7 @@ void panfrost_devfreq_fini(struct panfrost_device 
>>> *pfdev)
>>>   {
>>>          if (pfdev->devfreq.cooling)
>>>                  devfreq_cooling_unregister(pfdev->devfreq.cooling);
>>> +       dev_pm_opp_of_unregister_em(&pfdev->pdev->dev);
>>>          dev_pm_opp_of_remove_table(&pfdev->pdev->dev);
>>
>> Does it make sense to keep this (and the registration side) as
>> separate calls? Perhaps there's some ordering requirement with
>> everything between dev_pm_opp_of_add_table() and
>> dev_pm_opp_of_register_em()?
> 
> Yes, dev_pm_opp_of_register_em() uses em_data_callback which operates
> on OPPs to calculate power values and costs, so the the OPP table should
> be already there.
> 
>>
>> While you're just adding 2 lines, it seems there's a lot of complexity
>> exposed to the driver just to initialize devfreq/opp.
> 
> It depends, for example devfreq devices like buses would likely never
> use the energy model. Potential clients would be GPUs, DSPs, ISPs.

Still, it seems less than ideal for every client to have to remember to 
make all these individual calls, all in the right order (especially when 
it comes to undoing them in failure paths).

I haven't quite grasped whether the energy model is conceptually "owned" 
by the OPP table or by the cooling device, but either way it would seem 
to be a much nicer API if there were simply an additional "with energy 
model" variant of the registration call, and the standard removal call 
just automatically cleaned up an energy model as well if one was present.

Robin.
Lukasz Luba Feb. 26, 2020, 2:39 p.m. UTC | #4
On 2/26/20 1:55 PM, Robin Murphy wrote:
> On 26/02/2020 10:06 am, Lukasz Luba wrote:
> [...]
>>>> @@ -118,6 +120,7 @@ void panfrost_devfreq_fini(struct 
>>>> panfrost_device *pfdev)
>>>>   {
>>>>          if (pfdev->devfreq.cooling)
>>>>                  devfreq_cooling_unregister(pfdev->devfreq.cooling);
>>>> +       dev_pm_opp_of_unregister_em(&pfdev->pdev->dev);
>>>>          dev_pm_opp_of_remove_table(&pfdev->pdev->dev);
>>>
>>> Does it make sense to keep this (and the registration side) as
>>> separate calls? Perhaps there's some ordering requirement with
>>> everything between dev_pm_opp_of_add_table() and
>>> dev_pm_opp_of_register_em()?
>>
>> Yes, dev_pm_opp_of_register_em() uses em_data_callback which operates
>> on OPPs to calculate power values and costs, so the the OPP table should
>> be already there.
>>
>>>
>>> While you're just adding 2 lines, it seems there's a lot of complexity
>>> exposed to the driver just to initialize devfreq/opp.
>>
>> It depends, for example devfreq devices like buses would likely never
>> use the energy model. Potential clients would be GPUs, DSPs, ISPs.
> 
> Still, it seems less than ideal for every client to have to remember to 
> make all these individual calls, all in the right order (especially when 
> it comes to undoing them in failure paths).

There are 3 things that register and unregister process must take
into account:
a) EM struct is populated based on OPPs of the device
    (OPPs must be there before EM tries to calculate per-OPP-cost)
b) EM is returned to subsystems like: scheduler or thermal
    (unregister must be done after removing cooling device)
c) EM might be created with driver specific callback function,
called for each OPP by the EM during setup
    (do not put default simple EM model into framework function)

So a) and b) shouldn't be hard to code, but I agree, it would be easier
for driver developer to not think about them.
Let me try to experiment and address this.

> 
> I haven't quite grasped whether the energy model is conceptually "owned" 
> by the OPP table or by the cooling device, but either way it would seem 
> to be a much nicer API if there were simply an additional "with energy 
> model" variant of the registration call, and the standard removal call 
> just automatically cleaned up an energy model as well if one was present.

This dev_pm_opp_of_* could potentially be used directly inside
devfreq_cooling_register() and then 'unregister' call done internally.
Just limiting devfreq_cooling to a call to:
dev_pm_opp_of_register_em()
would limit the EM for devfreq devices to use only this
simple DT model (which is based on "dynamic-power-coefficient") which
fails in i.e. GPU SCMI where OPP power comes from firmware.

That would require something like
devfreq_cooling_with_em_register(<old_args_here>, struct 
em_data_callback *em_cb)

and then if the *em_cb is set to null, it could call
dev_pm_opp_of_register_em() trying to use it's default em_cb
function, which seeks for "dynamic-power-coefficient".

This is doable inside devfreq_cooling, register and unregister
of EM would be avoided in drivers code.

Thank you Robin for your comments clarifying these things.

Regards,
Lukasz
diff mbox series

Patch

diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
index 413987038fbf..d527a5113950 100644
--- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c
+++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
@@ -105,6 +105,8 @@  int panfrost_devfreq_init(struct panfrost_device *pfdev)
 	}
 	pfdev->devfreq.devfreq = devfreq;
 
+	dev_pm_opp_of_register_em(dev, NULL);
+
 	cooling = of_devfreq_cooling_register(dev->of_node, devfreq);
 	if (IS_ERR(cooling))
 		DRM_DEV_INFO(dev, "Failed to register cooling device\n");
@@ -118,6 +120,7 @@  void panfrost_devfreq_fini(struct panfrost_device *pfdev)
 {
 	if (pfdev->devfreq.cooling)
 		devfreq_cooling_unregister(pfdev->devfreq.cooling);
+	dev_pm_opp_of_unregister_em(&pfdev->pdev->dev);
 	dev_pm_opp_of_remove_table(&pfdev->pdev->dev);
 }