mbox series

[0/4] CPUFreq statistics retrieved by drivers

Message ID 20200729151208.27737-1-lukasz.luba@arm.com (mailing list archive)
Headers show
Series CPUFreq statistics retrieved by drivers | expand

Message

Lukasz Luba July 29, 2020, 3:12 p.m. UTC
Hi all,

The existing CPUFreq framework does not tracks the statistics when the
'fast switch' is used or when firmware changes the frequency independently
due to e.g. thermal reasons. However, the firmware might track the frequency
changes and expose this to the kernel.

This patch set aims to introduce CPUfreq statistics gathered by firmware
and retrieved by CPUFreq driver. It would require a new API functions
in the CPUFreq, which allows to poke drivers to get these stats.

The needed CPUFreq infrastructure is in patch 1/4, patch 2/4 extends
ARM SCMI protocol layer, patches 3/4, 4/4  modify ARM SCMI CPUFreq driver.

Regards,
Lukasz Luba

Lukasz Luba (4):
  cpufreq: Add support for statistics read from drivers
  scmi: perf: Extend protocol to support performance statistics
  cpufreq: scmi: Move scmi_cpufreq_driver structure to the top
  cpufreq: scmi: Read statistics from FW shared memory

 drivers/cpufreq/cpufreq.c        |  22 ++++
 drivers/cpufreq/cpufreq_stats.c  |  38 +++---
 drivers/cpufreq/scmi-cpufreq.c   | 116 ++++++++++++++---
 drivers/firmware/arm_scmi/perf.c | 210 +++++++++++++++++++++++++++++++
 include/linux/cpufreq.h          |  32 +++++
 include/linux/scmi_protocol.h    |  11 ++
 6 files changed, 401 insertions(+), 28 deletions(-)

Comments

Viresh Kumar July 30, 2020, 8:53 a.m. UTC | #1
On 29-07-20, 16:12, Lukasz Luba wrote:
> The existing CPUFreq framework does not tracks the statistics when the
> 'fast switch' is used or when firmware changes the frequency independently
> due to e.g. thermal reasons. However, the firmware might track the frequency
> changes and expose this to the kernel.
> 
> This patch set aims to introduce CPUfreq statistics gathered by firmware
> and retrieved by CPUFreq driver. It would require a new API functions
> in the CPUFreq, which allows to poke drivers to get these stats.
> 
> The needed CPUFreq infrastructure is in patch 1/4, patch 2/4 extends
> ARM SCMI protocol layer, patches 3/4, 4/4  modify ARM SCMI CPUFreq driver.

Are you doing this for the fast switch case or because your platform
actually runs at frequencies which may be different from what cpufreq
core has requested ?

I am also not sure what these tables should represent, what the
cpufreq core has decided for the CPUs or the frequencies we actually
run at, as these two can be very different for example if the hardware
runs at frequencies which don't match exactly to what is there in the
freq table. I believe these are rather to show what cpufreq and its
governors are doing with the CPUs.

Over that I would like the userspace stats to work exactly as the way
they work right now, i.e. capture all transitions from one freq to
other, not just time-in-state. Also resetting of the stats from
userspace for example. All allocation and printing of the data must be
done from stats core, the only thing which the driver would do at the
end is updating the stats structure and nothing more. Instead of
reading all stats from the firmware, it will be much easier if you can
just get the information from the firmware whenever there is a
frequency switch and then we can update the stats the way it is done
right now. And that would be simple.
Sudeep Holla July 30, 2020, 9:10 a.m. UTC | #2
On Thu, Jul 30, 2020 at 02:23:33PM +0530, Viresh Kumar wrote:
> On 29-07-20, 16:12, Lukasz Luba wrote:
> > The existing CPUFreq framework does not tracks the statistics when the
> > 'fast switch' is used or when firmware changes the frequency independently
> > due to e.g. thermal reasons. However, the firmware might track the frequency
> > changes and expose this to the kernel.
> >
> > This patch set aims to introduce CPUfreq statistics gathered by firmware
> > and retrieved by CPUFreq driver. It would require a new API functions
> > in the CPUFreq, which allows to poke drivers to get these stats.
> >
> > The needed CPUFreq infrastructure is in patch 1/4, patch 2/4 extends
> > ARM SCMI protocol layer, patches 3/4, 4/4  modify ARM SCMI CPUFreq driver.
>
> Are you doing this for the fast switch case or because your platform
> actually runs at frequencies which may be different from what cpufreq
> core has requested ?
>

I think so.

> I am also not sure what these tables should represent, what the
> cpufreq core has decided for the CPUs or the frequencies we actually
> run at, as these two can be very different for example if the hardware
> runs at frequencies which don't match exactly to what is there in the
> freq table. I believe these are rather to show what cpufreq and its
> governors are doing with the CPUs.
>

Exactly, I raised similar point in internal discussion and asked Lukasz
to take up the same on the list. I assume it was always what cpufreq
requested rather than what was delivered. So will we break the userspace
ABI if we change that is the main question.

> Over that I would like the userspace stats to work exactly as the way
> they work right now, i.e. capture all transitions from one freq to
> other, not just time-in-state. Also resetting of the stats from
> userspace for example. All allocation and printing of the data must be
> done from stats core, the only thing which the driver would do at the
> end is updating the stats structure and nothing more. Instead of
> reading all stats from the firmware, it will be much easier if you can
> just get the information from the firmware whenever there is a
> frequency switch and then we can update the stats the way it is done
> right now. And that would be simple.
>

Good point, but notifications may not be lightweight. If that is no good,
alternatively, I suggested to keep these firmware stats in a separate
debugfs. Thoughts ?

--
Regards,
Sudeep
Lukasz Luba July 30, 2020, 9:36 a.m. UTC | #3
On 7/30/20 10:10 AM, Sudeep Holla wrote:
> On Thu, Jul 30, 2020 at 02:23:33PM +0530, Viresh Kumar wrote:
>> On 29-07-20, 16:12, Lukasz Luba wrote:
>>> The existing CPUFreq framework does not tracks the statistics when the
>>> 'fast switch' is used or when firmware changes the frequency independently
>>> due to e.g. thermal reasons. However, the firmware might track the frequency
>>> changes and expose this to the kernel.
>>>
>>> This patch set aims to introduce CPUfreq statistics gathered by firmware
>>> and retrieved by CPUFreq driver. It would require a new API functions
>>> in the CPUFreq, which allows to poke drivers to get these stats.
>>>
>>> The needed CPUFreq infrastructure is in patch 1/4, patch 2/4 extends
>>> ARM SCMI protocol layer, patches 3/4, 4/4  modify ARM SCMI CPUFreq driver.
>>
>> Are you doing this for the fast switch case or because your platform
>> actually runs at frequencies which may be different from what cpufreq
>> core has requested ?
>>
> 
> I think so.

For both cases, but fast switch is major and present. Thermal is not
currently implemented in SCP FW, but might be in future.

> 
>> I am also not sure what these tables should represent, what the
>> cpufreq core has decided for the CPUs or the frequencies we actually
>> run at, as these two can be very different for example if the hardware
>> runs at frequencies which don't match exactly to what is there in the
>> freq table. I believe these are rather to show what cpufreq and its
>> governors are doing with the CPUs.
>>
> 
> Exactly, I raised similar point in internal discussion and asked Lukasz
> to take up the same on the list. I assume it was always what cpufreq
> requested rather than what was delivered. So will we break the userspace
> ABI if we change that is the main question.

Thank you for confirmation. If that is the mechanism for tracking what
cpufreq governors are doing with the CPUs, then is clashes with
presented data in FW memory, because firmware is the governor.

> 
>> Over that I would like the userspace stats to work exactly as the way
>> they work right now, i.e. capture all transitions from one freq to
>> other, not just time-in-state. Also resetting of the stats from
>> userspace for example. All allocation and printing of the data must be
>> done from stats core, the only thing which the driver would do at the
>> end is updating the stats structure and nothing more. Instead of
>> reading all stats from the firmware, it will be much easier if you can
>> just get the information from the firmware whenever there is a
>> frequency switch and then we can update the stats the way it is done
>> right now. And that would be simple.
>>
> 
> Good point, but notifications may not be lightweight. If that is no good,
> alternatively, I suggested to keep these firmware stats in a separate
> debugfs. Thoughts ?

I agree that notifications might not be lightweight. Furthermore I think
this still clashes with the assumption that cpufreq governor decisions
are tracked in these statistics, not the firmware decision.

In this case I think we would have to create debugfs.
Sudeep do you think these debugfs should be exposed from the protocol
layer:
drivers/firmware/arm_scmi/perf.c
or maybe from the cpufreq scmi driver? I would probably be safer to have
it in the cpufreq driver because we have scmi_handle there.

Regards,
Lukasz
Sudeep Holla July 31, 2020, 3:56 p.m. UTC | #4
On Thu, Jul 30, 2020 at 10:36:51AM +0100, Lukasz Luba wrote:
> 
> In this case I think we would have to create debugfs.
> Sudeep do you think these debugfs should be exposed from the protocol
> layer:
> drivers/firmware/arm_scmi/perf.c

I prefer above over cpufreq as we can support for all the devices not
just cpus which avoids adding similar support elsewhere(mostly devfreq)

> or maybe from the cpufreq scmi driver? I would probably be safer to have
> it in the cpufreq driver because we have scmi_handle there.
>

Cristian was thinking if we can consolidate all such debugfs under one
device may be and that should eliminate your handle restriction. I would
like to see how that works out in implementation but I don't have any 
better suggestion ATM.

--
Regards,
Sudeep
Viresh Kumar Aug. 4, 2020, 5:35 a.m. UTC | #5
On 30-07-20, 10:36, Lukasz Luba wrote:
> On 7/30/20 10:10 AM, Sudeep Holla wrote:
> > On Thu, Jul 30, 2020 at 02:23:33PM +0530, Viresh Kumar wrote:
> > > On 29-07-20, 16:12, Lukasz Luba wrote:
> > > > The existing CPUFreq framework does not tracks the statistics when the
> > > > 'fast switch' is used or when firmware changes the frequency independently
> > > > due to e.g. thermal reasons. However, the firmware might track the frequency
> > > > changes and expose this to the kernel.
> > > > 
> > > > This patch set aims to introduce CPUfreq statistics gathered by firmware
> > > > and retrieved by CPUFreq driver. It would require a new API functions
> > > > in the CPUFreq, which allows to poke drivers to get these stats.
> > > > 
> > > > The needed CPUFreq infrastructure is in patch 1/4, patch 2/4 extends
> > > > ARM SCMI protocol layer, patches 3/4, 4/4  modify ARM SCMI CPUFreq driver.
> > > 
> > > Are you doing this for the fast switch case or because your platform
> > > actually runs at frequencies which may be different from what cpufreq
> > > core has requested ?
> > > 
> > 
> > I think so.
> 
> For both cases, but fast switch is major and present. Thermal is not
> currently implemented in SCP FW, but might be in future.

Okay, lets simplify things a bit and merge things slowly upstream and merge only
what is required right now.

IIUC, the only concern right now is to capture stats with fast switch ? Maybe we
can do something else in that case and brainstorm a bit..

> > > I am also not sure what these tables should represent, what the
> > > cpufreq core has decided for the CPUs or the frequencies we actually
> > > run at, as these two can be very different for example if the hardware
> > > runs at frequencies which don't match exactly to what is there in the
> > > freq table. I believe these are rather to show what cpufreq and its
> > > governors are doing with the CPUs.
> > > 
> > 
> > Exactly, I raised similar point in internal discussion and asked Lukasz
> > to take up the same on the list. I assume it was always what cpufreq
> > requested rather than what was delivered. So will we break the userspace
> > ABI if we change that is the main question.
> 
> Thank you for confirmation. If that is the mechanism for tracking what
> cpufreq governors are doing with the CPUs, then is clashes with
> presented data in FW memory, because firmware is the governor.

Why is firmware the governor here ? Aren't you talking about the simple fast
switch case only ?

Over that, I think this cpufreq stats information isn't parsed by any tool right
now and tweaking it a bit won't hurt anyone (like if we start capturing things a
bit differently). So we may not want to worry about breaking userspace ABI here,
if what we are looking to do is the right thing to do.

> > > Over that I would like the userspace stats to work exactly as the way
> > > they work right now, i.e. capture all transitions from one freq to
> > > other, not just time-in-state. Also resetting of the stats from
> > > userspace for example. All allocation and printing of the data must be
> > > done from stats core, the only thing which the driver would do at the
> > > end is updating the stats structure and nothing more. Instead of
> > > reading all stats from the firmware, it will be much easier if you can
> > > just get the information from the firmware whenever there is a
> > > frequency switch and then we can update the stats the way it is done
> > > right now. And that would be simple.
> > > 
> > 
> > Good point, but notifications may not be lightweight. If that is no good,
> > alternatively, I suggested to keep these firmware stats in a separate
> > debugfs. Thoughts ?
> 
> I agree that notifications might not be lightweight.

I am not sure what notifications are we talking about here.

> Furthermore I think
> this still clashes with the assumption that cpufreq governor decisions
> are tracked in these statistics, not the firmware decision.
> 
> In this case I think we would have to create debugfs.
> Sudeep do you think these debugfs should be exposed from the protocol
> layer:
> drivers/firmware/arm_scmi/perf.c
> or maybe from the cpufreq scmi driver? I would probably be safer to have
> it in the cpufreq driver because we have scmi_handle there.

For the CPUs it would be better if we can keep things in cpufreq only, lets see
how we go about it.
Lukasz Luba Aug. 4, 2020, 10:29 a.m. UTC | #6
On 8/4/20 6:35 AM, Viresh Kumar wrote:
> On 30-07-20, 10:36, Lukasz Luba wrote:
>> On 7/30/20 10:10 AM, Sudeep Holla wrote:
>>> On Thu, Jul 30, 2020 at 02:23:33PM +0530, Viresh Kumar wrote:
>>>> On 29-07-20, 16:12, Lukasz Luba wrote:
>>>>> The existing CPUFreq framework does not tracks the statistics when the
>>>>> 'fast switch' is used or when firmware changes the frequency independently
>>>>> due to e.g. thermal reasons. However, the firmware might track the frequency
>>>>> changes and expose this to the kernel.
>>>>>
>>>>> This patch set aims to introduce CPUfreq statistics gathered by firmware
>>>>> and retrieved by CPUFreq driver. It would require a new API functions
>>>>> in the CPUFreq, which allows to poke drivers to get these stats.
>>>>>
>>>>> The needed CPUFreq infrastructure is in patch 1/4, patch 2/4 extends
>>>>> ARM SCMI protocol layer, patches 3/4, 4/4  modify ARM SCMI CPUFreq driver.
>>>>
>>>> Are you doing this for the fast switch case or because your platform
>>>> actually runs at frequencies which may be different from what cpufreq
>>>> core has requested ?
>>>>
>>>
>>> I think so.
>>
>> For both cases, but fast switch is major and present. Thermal is not
>> currently implemented in SCP FW, but might be in future.
> 
> Okay, lets simplify things a bit and merge things slowly upstream and merge only
> what is required right now.
> 
> IIUC, the only concern right now is to capture stats with fast switch ? Maybe we
> can do something else in that case and brainstorm a bit..

Correct, the fast switch is the only concern right now and not tracked. 
We could fill in that information with statistics data from firmware
with a cpufreq driver help.

I could make the if from patch 1/4 covering narrowed case, when
fast switch is present, check for drivers stats.
Something like:
-----------8<------------------------------------------------------------
if (policy->fast_switch_enabled)
	if (policy->has_driver_stats)
		return cpufreq_stats_present_driver_data(policy, buf);
	else
		return 0;
-------------->8----------------------------------------------------------

> 
>>>> I am also not sure what these tables should represent, what the
>>>> cpufreq core has decided for the CPUs or the frequencies we actually
>>>> run at, as these two can be very different for example if the hardware
>>>> runs at frequencies which don't match exactly to what is there in the
>>>> freq table. I believe these are rather to show what cpufreq and its
>>>> governors are doing with the CPUs.
>>>>
>>>
>>> Exactly, I raised similar point in internal discussion and asked Lukasz
>>> to take up the same on the list. I assume it was always what cpufreq
>>> requested rather than what was delivered. So will we break the userspace
>>> ABI if we change that is the main question.
>>
>> Thank you for confirmation. If that is the mechanism for tracking what
>> cpufreq governors are doing with the CPUs, then is clashes with
>> presented data in FW memory, because firmware is the governor.
> 
> Why is firmware the governor here ? Aren't you talking about the simple fast
> switch case only ?

I used a term 'governor' for the firmware because it makes the final
set for the frequency. It (FW) should respect the frequency value
set using the fast switch. I don't know how other firmware (e.g. Intel)
treats this fast switch value or if they even expose FW stats, though.

You can read about this statistics region in [1] at:
4.5.5 Performance domain statistics shared memory region

> 
> Over that, I think this cpufreq stats information isn't parsed by any tool right
> now and tweaking it a bit won't hurt anyone (like if we start capturing things a
> bit differently). So we may not want to worry about breaking userspace ABI here,
> if what we are looking to do is the right thing to do.

So, there is some hope... IMHO it would be better to have this cpufreq
stats in normal location, rather then in scmi debugfs.

> 
>>>> Over that I would like the userspace stats to work exactly as the way
>>>> they work right now, i.e. capture all transitions from one freq to
>>>> other, not just time-in-state. Also resetting of the stats from
>>>> userspace for example. All allocation and printing of the data must be
>>>> done from stats core, the only thing which the driver would do at the
>>>> end is updating the stats structure and nothing more. Instead of
>>>> reading all stats from the firmware, it will be much easier if you can
>>>> just get the information from the firmware whenever there is a
>>>> frequency switch and then we can update the stats the way it is done
>>>> right now. And that would be simple.
>>>>
>>>
>>> Good point, but notifications may not be lightweight. If that is no good,
>>> alternatively, I suggested to keep these firmware stats in a separate
>>> debugfs. Thoughts ?
>>
>> I agree that notifications might not be lightweight.
> 
> I am not sure what notifications are we talking about here.

There is a notification mechanism described in the SCMI spec [1] at
4.5.4 Notifications.
We were referring to that mechanism.

> 
>> Furthermore I think
>> this still clashes with the assumption that cpufreq governor decisions
>> are tracked in these statistics, not the firmware decision.
>>
>> In this case I think we would have to create debugfs.
>> Sudeep do you think these debugfs should be exposed from the protocol
>> layer:
>> drivers/firmware/arm_scmi/perf.c
>> or maybe from the cpufreq scmi driver? I would probably be safer to have
>> it in the cpufreq driver because we have scmi_handle there.
> 
> For the CPUs it would be better if we can keep things in cpufreq only, lets see
> how we go about it.
> 

If that would be only ARM SCMI debugfs directory, then we would like to
keep it in the scmi. We could re-use the code for devfreq (GPU)
device, which is also exposed as performance domain.

Thank you Viresh for your comments.

Regards,
Lukasz

[1] 
https://static.docs.arm.com/den0056/b/DEN0056B_System_Control_and_Management_Interface_v2_0.pdf
Viresh Kumar Aug. 4, 2020, 10:38 a.m. UTC | #7
On 04-08-20, 11:29, Lukasz Luba wrote:
> On 8/4/20 6:35 AM, Viresh Kumar wrote:
> > IIUC, the only concern right now is to capture stats with fast switch ? Maybe we
> > can do something else in that case and brainstorm a bit..
> 
> Correct, the fast switch is the only concern right now and not tracked. We
> could fill in that information with statistics data from firmware
> with a cpufreq driver help.
> 
> I could make the if from patch 1/4 covering narrowed case, when
> fast switch is present, check for drivers stats.
> Something like:
> -----------8<------------------------------------------------------------
> if (policy->fast_switch_enabled)
> 	if (policy->has_driver_stats)
> 		return cpufreq_stats_present_driver_data(policy, buf);
> 	else
> 		return 0;
> -------------->8----------------------------------------------------------

I don't think doing it with help of firmware is the right thing to do
here then. For another platform we may not have a firmware which can
help us, we need something in the opp core itself for that. Lemme see
if I can do something about it.

> > Why is firmware the governor here ? Aren't you talking about the simple fast
> > switch case only ?
> 
> I used a term 'governor' for the firmware because it makes the final
> set for the frequency. It (FW) should respect the frequency value
> set using the fast switch. I don't know how other firmware (e.g. Intel)
> treats this fast switch value or if they even expose FW stats, though.

For Intel I think, Linux is one of the entities that vote for deciding
the frequency of the CPUs and the firmware (after taking all such
factors into account) chooses a frequency by its own, which must be >=
the frequency requested by Linux.

> You can read about this statistics region in [1] at:
> 4.5.5 Performance domain statistics shared memory region
> 
> > 
> > Over that, I think this cpufreq stats information isn't parsed by any tool right
> > now and tweaking it a bit won't hurt anyone (like if we start capturing things a
> > bit differently). So we may not want to worry about breaking userspace ABI here,
> > if what we are looking to do is the right thing to do.
> 
> So, there is some hope... IMHO it would be better to have this cpufreq
> stats in normal location, rather then in scmi debugfs.

I agree.

> > I am not sure what notifications are we talking about here.
> 
> There is a notification mechanism described in the SCMI spec [1] at
> 4.5.4 Notifications.
> We were referring to that mechanism.

Ahh, I see. All I was thinking was about the cpufreq specific
notifiers :)
Lukasz Luba Aug. 4, 2020, 10:44 a.m. UTC | #8
On 8/4/20 11:38 AM, Viresh Kumar wrote:
> On 04-08-20, 11:29, Lukasz Luba wrote:
>> On 8/4/20 6:35 AM, Viresh Kumar wrote:
>>> IIUC, the only concern right now is to capture stats with fast switch ? Maybe we
>>> can do something else in that case and brainstorm a bit..
>>
>> Correct, the fast switch is the only concern right now and not tracked. We
>> could fill in that information with statistics data from firmware
>> with a cpufreq driver help.
>>
>> I could make the if from patch 1/4 covering narrowed case, when
>> fast switch is present, check for drivers stats.
>> Something like:
>> -----------8<------------------------------------------------------------
>> if (policy->fast_switch_enabled)
>> 	if (policy->has_driver_stats)
>> 		return cpufreq_stats_present_driver_data(policy, buf);
>> 	else
>> 		return 0;
>> -------------->8----------------------------------------------------------
> 
> I don't think doing it with help of firmware is the right thing to do
> here then. For another platform we may not have a firmware which can
> help us, we need something in the opp core itself for that. Lemme see
> if I can do something about it.

OK, great, I will wait then with this patch series v2 which would change
into debugfs scmi only. Could you please add me on CC, I am very
interested in.

> 
>>> Why is firmware the governor here ? Aren't you talking about the simple fast
>>> switch case only ?
>>
>> I used a term 'governor' for the firmware because it makes the final
>> set for the frequency. It (FW) should respect the frequency value
>> set using the fast switch. I don't know how other firmware (e.g. Intel)
>> treats this fast switch value or if they even expose FW stats, though.
> 
> For Intel I think, Linux is one of the entities that vote for deciding
> the frequency of the CPUs and the firmware (after taking all such
> factors into account) chooses a frequency by its own, which must be >=
> the frequency requested by Linux.
> 
>> You can read about this statistics region in [1] at:
>> 4.5.5 Performance domain statistics shared memory region
>>
>>>
>>> Over that, I think this cpufreq stats information isn't parsed by any tool right
>>> now and tweaking it a bit won't hurt anyone (like if we start capturing things a
>>> bit differently). So we may not want to worry about breaking userspace ABI here,
>>> if what we are looking to do is the right thing to do.
>>
>> So, there is some hope... IMHO it would be better to have this cpufreq
>> stats in normal location, rather then in scmi debugfs.
> 
> I agree.
> 
>>> I am not sure what notifications are we talking about here.
>>
>> There is a notification mechanism described in the SCMI spec [1] at
>> 4.5.4 Notifications.
>> We were referring to that mechanism.
> 
> Ahh, I see. All I was thinking was about the cpufreq specific
> notifiers :)
>
Florian Fainelli Aug. 4, 2020, 5:19 p.m. UTC | #9
On 7/31/2020 8:56 AM, Sudeep Holla wrote:
> On Thu, Jul 30, 2020 at 10:36:51AM +0100, Lukasz Luba wrote:
>>
>> In this case I think we would have to create debugfs.
>> Sudeep do you think these debugfs should be exposed from the protocol
>> layer:
>> drivers/firmware/arm_scmi/perf.c
> 
> I prefer above over cpufreq as we can support for all the devices not
> just cpus which avoids adding similar support elsewhere(mostly devfreq)
> 
>> or maybe from the cpufreq scmi driver? I would probably be safer to have
>> it in the cpufreq driver because we have scmi_handle there.
>>
> 
> Cristian was thinking if we can consolidate all such debugfs under one
> device may be and that should eliminate your handle restriction. I would
> like to see how that works out in implementation but I don't have any 
> better suggestion ATM.

debugfs is not enabled in production kernels, and especially not with
Android kernels, so sticking those in sysfs like the existing cpufreq
subsystem statistics may be a better choice.
Florian Fainelli Aug. 4, 2020, 5:27 p.m. UTC | #10
On 7/29/2020 8:12 AM, Lukasz Luba wrote:
> Hi all,
> 
> The existing CPUFreq framework does not tracks the statistics when the
> 'fast switch' is used or when firmware changes the frequency independently
> due to e.g. thermal reasons. However, the firmware might track the frequency
> changes and expose this to the kernel.

Or the firmware might have changed the CPU frequency in response to a
request from the secure world for instance.

> 
> This patch set aims to introduce CPUfreq statistics gathered by firmware
> and retrieved by CPUFreq driver. It would require a new API functions
> in the CPUFreq, which allows to poke drivers to get these stats.

From a debugging perspective, it would be helpful if the firmware
maintained statistics were exposed as a super-set of the Linux cpufreq
statistics and aggregated into them such that you could view the normal
world vs. secure world residency of a given frequency point. This would
help because a lot of times, Linux requests freq X, but the secure world
requires freq Y (with X >= Y) and people do not really understand why
the resulting power usage is higher for instance.

What are your thoughts on this?
Lukasz Luba Aug. 5, 2020, 11:04 a.m. UTC | #11
On 8/4/20 6:27 PM, Florian Fainelli wrote:
> 
> 
> On 7/29/2020 8:12 AM, Lukasz Luba wrote:
>> Hi all,
>>
>> The existing CPUFreq framework does not tracks the statistics when the
>> 'fast switch' is used or when firmware changes the frequency independently
>> due to e.g. thermal reasons. However, the firmware might track the frequency
>> changes and expose this to the kernel.
> 
> Or the firmware might have changed the CPU frequency in response to a
> request from the secure world for instance.

Possible

> 
>>
>> This patch set aims to introduce CPUfreq statistics gathered by firmware
>> and retrieved by CPUFreq driver. It would require a new API functions
>> in the CPUFreq, which allows to poke drivers to get these stats.
> 
>  From a debugging perspective, it would be helpful if the firmware
> maintained statistics were exposed as a super-set of the Linux cpufreq
> statistics and aggregated into them such that you could view the normal
> world vs. secure world residency of a given frequency point. This would
> help because a lot of times, Linux requests freq X, but the secure world
> requires freq Y (with X >= Y) and people do not really understand why
> the resulting power usage is higher for instance.
> 
> What are your thoughts on this?
> 

I know that Viresh is going to develop patches and improve these
cpufreq stats framework. Maybe he also had this 'aggregation' in mind.
I will leave it him.

Thank you for your feedback.

Regards,
Lukasz
Sudeep Holla Aug. 5, 2020, 12:36 p.m. UTC | #12
On Tue, Aug 04, 2020 at 10:19:23AM -0700, Florian Fainelli wrote:
>
>
> On 7/31/2020 8:56 AM, Sudeep Holla wrote:
> > On Thu, Jul 30, 2020 at 10:36:51AM +0100, Lukasz Luba wrote:
> >>
> >> In this case I think we would have to create debugfs.
> >> Sudeep do you think these debugfs should be exposed from the protocol
> >> layer:
> >> drivers/firmware/arm_scmi/perf.c
> >
> > I prefer above over cpufreq as we can support for all the devices not
> > just cpus which avoids adding similar support elsewhere(mostly devfreq)
> >
> >> or maybe from the cpufreq scmi driver? I would probably be safer to have
> >> it in the cpufreq driver because we have scmi_handle there.
> >>
> >
> > Cristian was thinking if we can consolidate all such debugfs under one
> > device may be and that should eliminate your handle restriction. I would
> > like to see how that works out in implementation but I don't have any
> > better suggestion ATM.
>
> debugfs is not enabled in production kernels, and especially not with
> Android kernels, so sticking those in sysfs like the existing cpufreq
> subsystem statistics may be a better choice.

Fair enough. I was suggesting that only if we can't push this into existing
sysfs support. If we can, then we need not worry about it. If not, I don't
want a user ABI just for SCMI for this firmware stats, I would rather keep
it in debugfs for debug purposes. This will be useless once we start seeing
AMU in the hardware and hence I was pushing for debugfs.

--
Regards,
Sudeep
Viresh Kumar Aug. 5, 2020, 1:04 p.m. UTC | #13
On 05-08-20, 12:04, Lukasz Luba wrote:
> I know that Viresh is going to develop patches and improve these
> cpufreq stats framework. Maybe he also had this 'aggregation' in mind.
> I will leave it him.

I am only going to look at cpufreq's view of stats independently from
the firmware.
Sudeep Holla Aug. 5, 2020, 4:03 p.m. UTC | #14
On Wed, Aug 05, 2020 at 06:34:36PM +0530, Viresh Kumar wrote:
> On 05-08-20, 12:04, Lukasz Luba wrote:
> > I know that Viresh is going to develop patches and improve these
> > cpufreq stats framework. Maybe he also had this 'aggregation' in mind.
> > I will leave it him.
>
> I am only going to look at cpufreq's view of stats independently from
> the firmware.
>

+1, I agree with that. Kernel must avoid any logic to aggregate or
interpret the data in a generic way. The userspace tools can manage that
especially if this tend to be platform specific.

--
Regards,
Sudeep
Florian Fainelli Aug. 5, 2020, 5:33 p.m. UTC | #15
On 8/5/2020 9:03 AM, Sudeep Holla wrote:
> On Wed, Aug 05, 2020 at 06:34:36PM +0530, Viresh Kumar wrote:
>> On 05-08-20, 12:04, Lukasz Luba wrote:
>>> I know that Viresh is going to develop patches and improve these
>>> cpufreq stats framework. Maybe he also had this 'aggregation' in mind.
>>> I will leave it him.
>>
>> I am only going to look at cpufreq's view of stats independently from
>> the firmware.
>>
> 
> +1, I agree with that. Kernel must avoid any logic to aggregate or
> interpret the data in a generic way. The userspace tools can manage that
> especially if this tend to be platform specific.

We can probably standardize on how to expose the firmware maintained
statistics such that these tools do not have to widely vary from
platform to platform, right?
Sudeep Holla Aug. 6, 2020, 1:37 p.m. UTC | #16
On Wed, Aug 05, 2020 at 10:33:02AM -0700, Florian Fainelli wrote:
>
>
> On 8/5/2020 9:03 AM, Sudeep Holla wrote:
> > On Wed, Aug 05, 2020 at 06:34:36PM +0530, Viresh Kumar wrote:
> >> On 05-08-20, 12:04, Lukasz Luba wrote:
> >>> I know that Viresh is going to develop patches and improve these
> >>> cpufreq stats framework. Maybe he also had this 'aggregation' in mind.
> >>> I will leave it him.
> >>
> >> I am only going to look at cpufreq's view of stats independently from
> >> the firmware.
> >>
> >
> > +1, I agree with that. Kernel must avoid any logic to aggregate or
> > interpret the data in a generic way. The userspace tools can manage that
> > especially if this tend to be platform specific.
>
> We can probably standardize on how to expose the firmware maintained
> statistics such that these tools do not have to widely vary from
> platform to platform, right?

Ofcourse. I just don't want any logic that interpret/analyse the stats
comparing with the in-kernel stats or otherwise.

--
Regards,
Sudeep
Viresh Kumar Sept. 2, 2020, 7:26 a.m. UTC | #17
On 04-08-20, 11:44, Lukasz Luba wrote:
> On 8/4/20 11:38 AM, Viresh Kumar wrote:
> > I don't think doing it with help of firmware is the right thing to do
> > here then. For another platform we may not have a firmware which can
> > help us, we need something in the opp core itself for that. Lemme see
> > if I can do something about it.
> 
> OK, great, I will wait then with this patch series v2 which would change
> into debugfs scmi only. Could you please add me on CC, I am very
> interested in.

Here is an attempt.

http://lore.kernel.org/lkml/cover.1599031227.git.viresh.kumar@linaro.org