[RFCv2,0/8] Add imx8mm bus frequency switching

Message ID: cover.1561707104.git.leonard.crestez@nxp.com

Message

Leonard Crestez June 28, 2019, 7:39 a.m. UTC
This series attempts to add upstream DVFS support for imx8mm, covering dynamic
scaling of internal buses and dram. It uses the interconnect framework for
proactive scaling (in response to explicit bandwidth requests from devices) and
devfreq in order to expose the buses and eventually implement reactive scaling
(in response to measured traffic).

Actual scaling is performed through the clk framework: The NOC and main NICs
are driven by composite clks and a new 'imx8m-dram' clk is included for
scaling dram using firmware calls.
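
For reference, the dram leg boils down to asking firmware (TF-A) to perform the
actual switch. A rough sketch of that call path follows; the SIP function id,
argument layout and error convention below are placeholders, not the real ABI
used by the series:

    #include <linux/arm-smccc.h>
    #include <linux/errno.h>

    /* Placeholder SIP id, not the real one used by the patches. */
    #define EXAMPLE_SIP_DDR_FREQ	0xc2000000

    /*
     * Sketch only: the clk .set_rate op ends up delegating the dram
     * frequency switch to firmware running at EL3.
     */
    static int example_dram_fw_set_rate(unsigned int rate_index)
    {
            struct arm_smccc_res res;

            arm_smccc_smc(EXAMPLE_SIP_DDR_FREQ, rate_index,
                          0, 0, 0, 0, 0, 0, &res);
            return res.a0 ? -EIO : 0;	/* placeholder error convention */
    }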

The interconnect and devfreq parts do not communicate explicitly: they both
just call clk_set_min_rate and the clk core picks the minimum value that can
satisfy both. They are thus completely independent.
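
Roughly, both ends reduce to independent floor requests on the same clk
(illustrative only, not the actual code from the patches):

    #include <linux/clk.h>

    /*
     * Illustrative only: the icc provider and the devfreq driver each hold
     * their own struct clk reference for the NOC and only express a floor
     * on it; the clk core keeps the highest requested floor.
     */
    static int icc_side_apply(struct clk *noc_clk, unsigned long hz_for_bw)
    {
            /* proactive: derived from aggregated bandwidth requests */
            return clk_set_min_rate(noc_clk, hz_for_bw);
    }

    static int devfreq_side_apply(struct clk *noc_clk, unsigned long hz_for_load)
    {
            /* reactive: derived from measured traffic */
            return clk_set_min_rate(noc_clk, hz_for_load);
    }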

This is easily extensible to more members of the imx8m family, some of which
expose more detailed controls over interconnect fabric frequencies.

TODO:
* Clarify DT bindings
* Clarify interconnect OPP picking logic
* Implement devfreq_event for imx8m ddrc
* Expose more dram frequencies

The clk_set_min_rate approach does not mesh very well with the OPP framework.
Some of the interconnect nodes on imx8m can run at different voltages: OPP can
handle this well but not in response to a clk_set_min_rate from an unrelated
subsystem. Maybe set voltage on a clk notifier?
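
A minimal sketch of the notifier idea (the regulator handle and the
rate-to-voltage lookup below are made up):

    #include <linux/clk.h>
    #include <linux/kernel.h>
    #include <linux/notifier.h>
    #include <linux/regulator/consumer.h>

    static struct regulator *bus_reg;	/* hypothetical, obtained at probe time */

    static int example_uv_for_rate(unsigned long rate)
    {
            /* placeholder rate->voltage table */
            return rate > 400000000 ? 900000 : 800000;
    }

    static int bus_clk_notifier(struct notifier_block *nb, unsigned long event,
                                void *data)
    {
            struct clk_notifier_data *nd = data;
            int uv = example_uv_for_rate(nd->new_rate);

            /* raise the supply before going faster, relax it after going slower */
            if (event == PRE_RATE_CHANGE && nd->new_rate > nd->old_rate)
                    return notifier_from_errno(
                            regulator_set_voltage(bus_reg, uv, INT_MAX));
            if (event == POST_RATE_CHANGE && nd->new_rate < nd->old_rate)
                    return notifier_from_errno(
                            regulator_set_voltage(bus_reg, uv, INT_MAX));
            return NOTIFY_OK;
    }

This would be registered with clk_notifier_register() on the relevant bus clk;
handling ABORT_RATE_CHANGE to undo the pre-bump is left out of the sketch.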

The vendor tree does not support voltage switching, independent freqs for
different parts of the fabric, or any reactive scaling. I think it's important
to pick an upstreaming approach which can support as much as possible.

Feedback welcome.

Some objections were apparently raised to doing DRAM switch inside CLK:
perhaps ICC should make min_freq requests to devfreq instead?

Link to v1 (multiple chunks):
 * https://patchwork.kernel.org/patch/10976897/
 * https://patchwork.kernel.org/patch/10968303/
 * https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=91251

Also available as a github branch (with a few other changes):
    https://github.com/cdleonard/linux/tree/next_imx8mm_busfreq

Alexandre Bailon (2):
  interconnect: Add generic driver for imx
  interconnect: imx: Add platform driver for imx8mm

Leonard Crestez (6):
  clk: imx8mm: Add dram freq switch support
  clk: imx8m-composite: Switch to determine_rate
  arm64: dts: imx8mm: Add dram dvfs irqs to ccm node
  devfreq: Add imx-devfreq driver
  arm64: dts: imx8mm: Add interconnect node
  arm64: dts: imx8mm: Add devfreq-imx nodes

 arch/arm64/boot/dts/freescale/imx8mm.dtsi |  73 +++
 drivers/clk/imx/Makefile                  |   1 +
 drivers/clk/imx/clk-composite-8m.c        |  34 +-
 drivers/clk/imx/clk-imx8m-dram.c          | 357 ++++++++++++
 drivers/clk/imx/clk-imx8mm.c              |  12 +
 drivers/clk/imx/clk.h                     |  13 +
 drivers/devfreq/Kconfig                   |  10 +
 drivers/devfreq/Makefile                  |   1 +
 drivers/devfreq/imx-devfreq.c             | 142 +++++
 drivers/interconnect/Kconfig              |   1 +
 drivers/interconnect/Makefile             |   1 +
 drivers/interconnect/imx/Kconfig          |  17 +
 drivers/interconnect/imx/Makefile         |   2 +
 drivers/interconnect/imx/busfreq-imx8mm.c | 151 ++++++
 drivers/interconnect/imx/busfreq.c        | 628 ++++++++++++++++++++++
 drivers/interconnect/imx/busfreq.h        | 123 +++++
 include/dt-bindings/clock/imx8mm-clock.h  |   4 +-
 include/dt-bindings/interconnect/imx8mm.h |  49 ++
 18 files changed, 1606 insertions(+), 13 deletions(-)
 create mode 100644 drivers/clk/imx/clk-imx8m-dram.c
 create mode 100644 drivers/devfreq/imx-devfreq.c
 create mode 100644 drivers/interconnect/imx/Kconfig
 create mode 100644 drivers/interconnect/imx/Makefile
 create mode 100644 drivers/interconnect/imx/busfreq-imx8mm.c
 create mode 100644 drivers/interconnect/imx/busfreq.c
 create mode 100644 drivers/interconnect/imx/busfreq.h
 create mode 100644 include/dt-bindings/interconnect/imx8mm.h

Comments

Saravana Kannan July 3, 2019, 10:19 p.m. UTC | #1
Hi Leonard,

On Fri, Jun 28, 2019 at 12:40 AM Leonard Crestez
<leonard.crestez@nxp.com> wrote:
>
> This series attempts to add upstream DVFS support for imx8mm, covering dynamic
> scaling of internal buses and dram. It uses the interconnect framework for
> proactive scaling (in response to explicit bandwidth requests from devices) and
> devfreq in order to expose the buses and eventually implement reactive scaling
> (in response to measured traffic).

I'm assuming you took a glance at my patch series [1] adding BW OPP
tables and adding devfreq support for scaling interconnect paths.

Based on looking at your patch series, I think [1] would allow you to
use ICC framework for both proactive and reactive scaling. Proactive
scaling would use the ICC framework directly. Reactive scaling would
go through devfreq (so that you can use various governors/adjust
policy) before it goes to ICC framework.

> Actual scaling is performed through the clk framework: The NOC and main NICs
> are driven by composite clks and a new 'imx8m-dram' clk is included for
> scaling dram using firmware calls.

Seems reasonable. All hardware blocks in the end run using a clock.

> The interconnect and devfreq parts do not communicate explicitly: they both
> just call clk_set_min_rate and the clk core picks the minimum value that can
> satisfy both. They are thus completely independent.

Two different parts not talking to each other and just setting min
rate can cause problems for some concurrency use cases. ICC framework
is explicitly designed to handle cases like this and aggregate the
needs correctly. You might want to look into that more closely.

> This is easily extensible to more members of the imx8m family, some of which
> expose more detailed controls over interconnect fabric frequencies.
>
> TODO:
> * Clarify DT bindings
> * Clarify interconnect OPP picking logic
> * Implement devfreq_event for imx8m ddrc
> * Expose more dram frequencies
>
> The clk_set_min_rate approach does not mesh very well with the OPP framework.
> Some of interconnect nodes on imx8m can run at different voltages: OPP can
> handle this well but not in response to a clk_set_min_rate from an unrelated
> subsystem. Maybe set voltage on a clk notifier?

I think if you design it something like below, it might make your life
a whole lot easier.
Hopefully the formatting looks okay on your end. The arrow going up is
just connecting devfreq to ICC.

                        Proactive -> ICC -> clk/OPP API to set freq/volt
                                      ^
                                      |
HW measure -> governor -> devfreq ----+

That way, all voltage/frequency requirements are managed cleanly by
clk/OPP frameworks. The ICC deals with aggregating all the
requirements and devfreq lets you handle reactive scaling and policy.
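
In code terms the reactive leg would be roughly the following (names and the
frequency-to-bandwidth conversion are made up, not taken from [1]):

    #include <linux/devfreq.h>
    #include <linux/interconnect.h>

    static struct icc_path *ddr_path;	/* hypothetical, from of_icc_get() at probe */

    /* placeholder conversion from a devfreq "frequency" level to bandwidth */
    static u32 example_kbps_for_freq(unsigned long freq)
    {
            return freq / 1000;
    }

    /* devfreq .target(): the governor's decision becomes an ICC request */
    static int example_devfreq_target(struct device *dev, unsigned long *freq,
                                      u32 flags)
    {
            u32 kbps = example_kbps_for_freq(*freq);

            return icc_set_bw(ddr_path, kbps, kbps);
    }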

And in the beginning, while you get a governor going, you can use
"performance" governor for the "reactive scaling" users. That way,
your reactive paths will continue to have good performance while the
rest of the "proactive" clients change to use ICC APIs.

> Vendor tree does not support voltage switching, independent freqs for
> different parts of the fabric or any reactive scaling. I think it's important
> to pick an upstreaming approach which can support as much as possible.
>
> Feedback welcome.

If all of this makes sense, please take a look at [2] and provide your
thoughts. I've dropped a few patches from [1] to avoid confusion (too
much going on at once). I think BW OPP tables and having OPP tables
for interconnect paths will be something you'll need (if not now,
eventually) and something you can build on top of nicely.

Thanks,
Saravana

[1] - https://lore.kernel.org/lkml/20190614041733.120807-1-saravanak@google.com/
[2] - https://lore.kernel.org/lkml/20190703011020.151615-1-saravanak@google.com/



Leonard Crestez July 3, 2019, 11:30 p.m. UTC | #2
On 7/4/2019 1:20 AM, Saravana Kannan wrote:

>> The interconnect and devfreq parts do not communicate explicitly: they both
>> just call clk_set_min_rate and the clk core picks the minimum value that can
>> satisfy both. They are thus completely independent.
> 
> Two different parts not talking to each other and just setting min
> rate can cause problems for some concurrency use cases. ICC framework
> is explicitly designed to handle cases like this and aggregate the
> needs correctly. You might want to look into that more closely.

The clk framework aggregates the min_rate requests from multiple 
consumers under a big "prepare_lock" so I expect it will deal with 
concurrent requests correctly. As for performance: frequency switching 
shouldn't be a fast path.

>> The clk_set_min_rate approach does not mesh very well with the OPP framework.
>> Some of interconnect nodes on imx8m can run at different voltages: OPP can
>> handle this well but not in response to a clk_set_min_rate from an unrelated
>> subsystem. Maybe set voltage on a clk notifier?
> 
> I think if you design it something like below, it might make your life
> a whole lot easier.
> Hopefully the formatting looks okay on your end. The arrow going up is
> just connecting devfreq to ICC.
> 
>                          Proactive -> ICC -> clk/OPP API to set freq/volt
>                                        ^
>                                        |
> HW measure -> governor -> devfreq ----+
> 
> That way, all voltage/frequency requirements are managed cleanly by
> clk/OPP frameworks. The ICC deals with aggregating all the
> requirements and devfreq lets you handle reactive scaling and policy.

If icc and devfreq are to communicate directly it makes more sense to do 
it backwards: ICC should set a min_freq on nodes which have a devfreq 
instance attached and devfreq should implement actual freq switching.

HW measurement is done on individual nodes while ICC deals with requests 
along a path. In particular on imx we have a performance monitor 
attached to the ddr controller and I doubt it can distinguish between 
masters, so how could this be mapped usefully to an interconnect request?

As far as I understand, with devfreq the ddrc node could use "ondemand" 
while the other nodes would default to the "passive" governor and run at 
predefined default ratios relative to the DDRC.
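
Roughly like this (hypothetical devices and profiles, not the actual
imx-devfreq code):

    #include <linux/devfreq.h>
    #include <linux/err.h>

    static struct devfreq_passive_data nic_passive;	/* .parent set below */

    static int example_register(struct device *ddrc_dev,
                                struct devfreq_dev_profile *ddrc_profile,
                                struct device *nic_dev,
                                struct devfreq_dev_profile *nic_profile)
    {
            struct devfreq *ddrc, *nic;

            /* the ddrc scales based on its own load measurement */
            ddrc = devfreq_add_device(ddrc_dev, ddrc_profile,
                                      "simple_ondemand", NULL);
            if (IS_ERR(ddrc))
                    return PTR_ERR(ddrc);

            /* a NIC follows the ddrc passively, at ratios encoded in its OPPs */
            nic_passive.parent = ddrc;
            nic = devfreq_add_device(nic_dev, nic_profile, "passive",
                                     &nic_passive);
            return PTR_ERR_OR_ZERO(nic);
    }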

> If all of this makes sense, please take a look at [2] and provide your
> thoughts. I've dropped a few patches from [1] to avoid confusion (too
> much going on at once). I think BW OPP tables and having OPP tables
> for interconnect paths will be something you'll need (if not now,
> eventually) and something you can build on top of nicely.

I found it very confusing that you're assigning BW OPP tables to 
devices. My initial understanding was that BW OPP would map "bandwidth" 
to "frequency" so BW OPPs should be assigned to links along the 
interconnect graph. But maybe what you want is to have OPPs for the BW 
values requested by devices?

Looking at the sdm845 icc provider source, it seems that those 
"peak/avg" values are actually parameters which go into a firmware 
command!? It makes sense that you want interconnect to be "below" 
devfreq since icc_provider.set maps very closely to what firmware exposes.

> Interconnects and interconnect paths quantify their performance
> levels in terms of bandwidth and not in terms of frequency.

On i.MX we just have a bunch of interconnect IPs for which frequencies 
can be adjusted (in hz) so the above statement doesn't really hold. It 
is up to an icc provider to convert aggregate bandwidth values to 
frequencies along the path.

--
Regards,
Leonard
Saravana Kannan July 4, 2019, 3:02 a.m. UTC | #3
On Wed, Jul 3, 2019 at 4:30 PM Leonard Crestez <leonard.crestez@nxp.com> wrote:
>
> On 7/4/2019 1:20 AM, Saravana Kannan wrote:
>
> >> The interconnect and devfreq parts do not communicate explicitly: they both
> >> just call clk_set_min_rate and the clk core picks the minimum value that can
> >> satisfy both. They are thus completely independent.
> >
> > Two different parts not talking to each other and just setting min
> > rate can cause problems for some concurrency use cases. ICC framework
> > is explicitly designed to handle cases like this and aggregate the
> > needs correctly. You might want to look into that more closely.
>
> The clk framework aggregates the min_rate requests from multiple
> consumers under a big "prepare_lock" so I expect it will deal with
> concurrent requests correctly. As for performance: frequency switching
> shouldn't be a fast path.

Sorry I wasn't clear. I was not talking about locking issues or race
conditions when I said concurrency use cases. What I meant was if GPU
wants 5 GB/s and at the same time (concurrent) camera wants 5 GB/s
you'll need to configure the interconnect for 10 GB/s. If both of them
just try to set the min freq equivalent for 5 GB/s the performance
would be bad or functionality might break.
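
As a toy illustration (numbers made up, bandwidth in kBps):

    #include <linux/kernel.h>
    #include <linux/types.h>

    static void concurrency_example(void)
    {
            u32 gpu_kbps = 5000000, cam_kbps = 5000000;

            /* icc-style aggregation: average requests add up */
            u32 agg_kbps = gpu_kbps + cam_kbps;		/* worth 10 GB/s */

            /* two independent "min freq for 5 GB/s" floors: only the larger wins */
            u32 floor_kbps = max(gpu_kbps, cam_kbps);	/* still only 5 GB/s */

            (void)agg_kbps;
            (void)floor_kbps;
    }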

> >> The clk_set_min_rate approach does not mesh very well with the OPP framework.
> >> Some of interconnect nodes on imx8m can run at different voltages: OPP can
> >> handle this well but not in response to a clk_set_min_rate from an unrelated
> >> subsystem. Maybe set voltage on a clk notifier?
> >
> > I think if you design it something like below, it might make your life
> > a whole lot easier.
> > Hopefully the formatting looks okay on your end. The arrow going up is
> > just connecting devfreq to ICC.
> >
> >                          Proactive -> ICC -> clk/OPP API to set freq/volt
> >                                        ^
> >                                        |
> > HW measure -> governor -> devfreq ----+
> >
> > That way, all voltage/frequency requirements are managed cleanly by
> > clk/OPP frameworks. The ICC deals with aggregating all the
> > requirements and devfreq lets you handle reactive scaling and policy.
>
> If icc and devfreq are to directly communicate it makes more sense to do
> it backwards: ICC should set a min_freq on nodes which have a devfreq
> instance attached and devfreq should implement actual freq switching.
>
> HW measurement is done on individual nodes while ICC deals with requests
> along a path. In particular on imx we have a performance monitor
> attached to the ddr controller and I doubt it can distinguish between
> masters so how could this be mapped usefully to an interconnect request?

Ah, that was the missing piece for me -- you are trying to use a
central performance monitor. I see what you are trying to do.

So you are looking at system wide traffic at DDR and then using that
to scale the interconnect/DDRC. I don't know how complicated or not
the IMX interconnect topology is, so please pardon my questions. If
you are using a performance monitor at the DDR controller, why do you
need the "proactive" requests from other clients? Wouldn't the
performance monitor account for all the traffic to DDR?

> As far as I understand with devfreq the ddrc node could use "ondemand"
> while the other nodes would default to the "passive" governor and run at
> predefined default ratios relative to DDRC.

Yes, that approach would also work but I'm not sure why you need the
ICC framework in that case.

> > If all of this makes sense, please take a look at [2] and provide your
> > thoughts. I've dropped a few patches from [1] to avoid confusion (too
> > much going on at once). I think BW OPP tables and having OPP tables
> > for interconnect paths will be something you'll need (if not now,
> > eventually) and something you can build on top of nicely.
>
> I found it very confusing that you're assigning BW OPP tables to
> devices. My initial understanding was that BW OPP would map "bandwidth"
> to "frequency" so BW OPPs should be assigned to links along the
> interconnect graph. But maybe what you want is to have OPPs for the BW
> values requested by devices?

I want to have OPPs for bandwidths requested for paths. Each
interconnect node can also use BW OPPs if that makes sense for them,
but I think they'd be better served by using frequency OPPs.

> Looking at the sdm845 icc provider source and it seems that those
> "peak/avg" values are actually parameters which go into a firmware
> command!? It makes sense that you want interconnect to be "below"
> devfreq since icc_provider.set maps very closely to what firmware exposes.

Even without the firmware (it's mainly there to aggregate requests for
some system wide resources) or if interconnects are scaled directly
using clock APIs (older chips), sdm845 would still want ICC to be
below devfreq. It's because 845 doesn't try to do ICC scaling by
measuring at the DDR. Each master makes separate requests and then the
ICC aggregates and sets the frequency. They have their reasons (good
ones) for doing that.

> > Interconnects and interconnect paths quantify their performance
> > levels in terms of bandwidth and not in terms of frequency.
>
> On i.MX we just have a bunch of interconnect IPs for which frequencies
> can be adjusted (in hz)

This is true for every chip. In the end, they all set the Hz of hardware IPs.

> so the above statement doesn't really hold. It
> is up to an icc provider to convert aggregate bandwidth values to
> frequencies along the path.

That's the job of every ICC provider. In the case of i.MX you use
clock APIs and in the case of sdm845 they use some complicated
firmware interface. But in both cases, when the system is tuned for
performance/power everyone in the end cares about frequency and
voltage. If the frequency goes too high, they might reduce the
bandwidth to make sure the frequency remains reasonable for whatever
use case they are profiling.

I think the main difference is where the performance monitoring or
performance decision is done. If you don't have a central performance
monitor or don't want to use one, then the policy decision (which is
where devfreq fits in) will have to happen at the bandwidth decision
level.

-Saravana
Leonard Crestez July 4, 2019, 8:32 a.m. UTC | #4
On 7/4/2019 6:03 AM, Saravana Kannan wrote:
> On Wed, Jul 3, 2019 at 4:30 PM Leonard Crestez <leonard.crestez@nxp.com> wrote:

>> On 7/4/2019 1:20 AM, Saravana Kannan wrote:
>>
>>>> The interconnect and devfreq parts do not communicate explicitly: they both
>>>> just call clk_set_min_rate and the clk core picks the minimum value that can
>>>> satisfy both. They are thus completely independent.
>>>
>>> Two different parts not talking to each other and just setting min
>>> rate can cause problems for some concurrency use cases. ICC framework
>>> is explicitly designed to handle cases like this and aggregate the
>>> needs correctly. You might want to look into that more closely.
>>
>> The clk framework aggregates the min_rate requests from multiple
>> consumers under a big "prepare_lock" so I expect it will deal with
>> concurrent requests correctly. As for performance: frequency switching
>> shouldn't be a fast path.
> 
> Sorry I wasn't clear. I was not talking about locking issues or race
> conditions when I said concurrency use cases. What I meant was if GPU
> wants 5 GB/s and at the same time (concurrent) camera wants 5 GB/s
> you'll need to configure the interconnect for 10 GB/s. If both of them
> just try to set the min freq equivalent for 5 GB/s the performance
> would be bad or functionality might break.

I'm not calling clk_set_min_rate independently for each icc path; that 
would be obviously broken. The interconnect framework is still used to 
aggregate bandwidth requests, and in your scenario clk_set_min_rate for 
the main NOC would be called in a way that meets the combined 10 GB/s 
requests.

It is devfreq which calls clk_set_min_rate independently of 
interconnect; this results in CLK performing the final aggregation 
between the "proactive" and "reactive" scaling.
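
i.e. something along these lines on the provider side (the bus width and the
clk handle are made up):

    #include <linux/clk.h>
    #include <linux/math64.h>

    /*
     * Sketch: turn the bandwidth aggregated by the icc core (kBps) into a
     * frequency floor on the NOC clk; devfreq may independently raise the
     * same floor and the clk core keeps whichever is higher.
     */
    static int example_noc_apply_bw(struct clk *noc_clk, u64 agg_kbps,
                                    u32 bus_width_bytes)
    {
            unsigned long hz = div_u64(agg_kbps * 1000, bus_width_bytes);

            return clk_set_min_rate(noc_clk, hz);
    }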

>>>> The clk_set_min_rate approach does not mesh very well with the OPP framework.
>>>> Some of interconnect nodes on imx8m can run at different voltages: OPP can
>>>> handle this well but not in response to a clk_set_min_rate from an unrelated
>>>> subsystem. Maybe set voltage on a clk notifier?
>>>
>>> I think if you design it something like below, it might make your life
>>> a whole lot easier.
>>> Hopefully the formatting looks okay on your end. The arrow going up is
>>> just connecting devfreq to ICC.
>>>
>>>                           Proactive -> ICC -> clk/OPP API to set freq/volt
>>>                                         ^
>>>                                         |
>>> HW measure -> governor -> devfreq ----+
>>>
>>> That way, all voltage/frequency requirements are managed cleanly by
>>> clk/OPP frameworks. The ICC deals with aggregating all the
>>> requirements and devfreq lets you handle reactive scaling and policy.
>>
>> If icc and devfreq are to directly communicate it makes more sense to do
>> it backwards: ICC should set a min_freq on nodes which have a devfreq
>> instance attached and devfreq should implement actual freq switching.
>>
>> HW measurement is done on individual nodes while ICC deals with requests
>> along a path. In particular on imx we have a performance monitor
>> attached to the ddr controller and I doubt it can distinguish between
>> masters so how could this be mapped usefully to an interconnect request?
> 
> Ah, that was the missing piece for me -- you are trying to use a
> central performance monitor. I see what you are trying to do.
> 
> So you are looking at system wide traffic at DDR and then using that
> to scale the interconnect/DDRC. I don't know how complicated or not
> the IMX interconnect topology is, so please pardon my questions. If
> you are using a performance monitor at the DDR controller, why do you
> need the "proactive" requests from other clients? Wouldn't the
> performance monitor account for all the traffic to DDR?

Reactive scaling is too slow to ramp up in media playback scenarios and 
the first few frames would fail.

>> As far as I understand with devfreq the ddrc node could use "ondemand"
>> while the other nodes would default to the "passive" governor and run at
>> predefined default ratios relative to DDRC.
> 
> Yes, that approach would also work but I'm not sure why you need the
> ICC framework in that case.

For proactive scaling: to ensure bandwidth *before* traffic starts. In 
the imx vendor tree that's all that's implemented; for reactive scaling 
we just set busfreq to high as soon as the cpu leaves the minimum opp.

The IMX interconnect topology is not very complex, so mechanisms other than 
interconnect could be used. But ICC is the most powerful and expressive 
subsystem for proactive requests.

>>> If all of this makes sense, please take a look at [2] and provide your
>>> thoughts. I've dropped a few patches from [1] to avoid confusion (too
>>> much going on at once). I think BW OPP tables and having OPP tables
>>> for interconnect paths will be something you'll need (if not now,
>>> eventually) and something you can build on top of nicely.
>>
>> I found it very confusing that you're assigning BW OPP tables to
>> devices. My initial understanding was that BW OPP would map "bandwidth"
>> to "frequency" so BW OPPs should be assigned to links along the
>> interconnect graph. But maybe what you want is to have OPPs for the BW
>> values requested by devices?
> 
> I want to have OPPs for bandwidths requested for paths.

Right, this was not very obvious.

> Each interconnect node can also use BW OPPs if that makes sense for them,
> but I think they'd be better served by using frequency OPPs.

Each interconnect node is asked by the framework to ensure a certain BW 
is available in "bytes". The nodes could use OPPs with BW values to map 
the icc request to a frequency in "hz".

>> Looking at the sdm845 icc provider source and it seems that those
>> "peak/avg" values are actually parameters which go into a firmware
>> command!? It makes sense that you want interconnect to be "below"
>> devfreq since icc_provider.set maps very closely to what firmware exposes.
> 
> Even without the firmware (it's mainly there to aggregate requests for
> some system wide resources) or if interconnects are scaled directly
> using clock APIs (older chips), sdm845 would still want ICC to be
> below devfreq. It's because 845 doesn't try to do ICC scaling by
> measuring at the DDR. Each master makes separate requests and then the
> ICC aggregates and sets the frequency. They have their reasons (good
> ones) for doing that.

Maybe I'm confused about how devfreq is used in your scenario: you have 
devices which have their own OPPs (like a GPU) and expose this via 
devfreq. Then for each GPU OPP you want to pick the BW value to 
request from interconnect, right?

My idea was to use devfreq *after* icc so that you can do stuff like 
force a certain NOC to "max" via echo "performance" > $sysfs/governor. 
It also allows encapsulating complex freq switching (clk maintainers 
don't seem to like my dram clk).

On second examination there is no actual incompatibility here: devfreq 
could be used both below and above ICC.

--
Regards,
Leonard