[v1,00/30] Introduce core voltage scaling for NVIDIA Tegra20/30 SoCs

Message ID 20201104234427.26477-1-digetx@gmail.com

Message

Dmitry Osipenko Nov. 4, 2020, 11:43 p.m. UTC
Introduce core voltage scaling for NVIDIA Tegra20/30 SoCs, which reduces
power consumption and heating of the Tegra chips. The Tegra SoC has
multiple hardware units which belong to the core power domain of the SoC
and share the core voltage. The voltage must be selected in accordance
with the minimum requirement of every core hardware unit.

The minimum core voltage requirement depends on:

  1. Clock enable state of a hardware unit.
  2. Clock frequency.
  3. Unit's internal idling/active state.
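
The core voltage must therefore satisfy the highest of the per-device
minimums. A minimal C sketch of this aggregation rule (illustrative only,
not code from this series; all identifiers are made up):

  #include <linux/kernel.h>
  #include <linux/list.h>
  #include <linux/regulator/consumer.h>

  struct core_voter {
          struct list_head node;
          int min_uV;     /* minimum required by this device's current OPP */
  };

  /* Pick the highest vote and apply it to the shared core regulator. */
  static int core_apply_votes(struct regulator *core_reg,
                              struct list_head *voters)
  {
          struct core_voter *v;
          int max_uV = 0;

          list_for_each_entry(v, voters, node)
                  max_uV = max(max_uV, v->min_uV);

          /* The upper bound is left to the regulator constraints. */
          return regulator_set_voltage(core_reg, max_uV, INT_MAX);
  }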

This series is tested on Acer A500 (T20), AC100 (T20), Nexus 7 (T30) and
Ouya (T30) devices. I also added voltage scaling to the Ventana (T20) and
Cardhu (T30) boards, which are tested by NVIDIA's CI farm. Tegra30 is now
up to 5°C cooler on Nexus 7 and stays cool on Ouya (instead of becoming
burning hot) while the system is idling. It should be possible to improve
this further by implementing more advanced power management features in
the kernel drivers.

The DVFS support is opt-in for all boards, meaning that older DTBs will
continue to work as they did before this series. It should be possible to
easily add core voltage scaling support for Tegra114+ SoCs based on this
groundwork later on, if anyone wants to implement it.

WARNING(!) This series is made on top of the memory interconnect patches
           which are currently under review [1]. The Tegra EMC driver
           and devicetree-related patches need to be applied on top of
           the ICC series.

[1] https://patchwork.ozlabs.org/project/linux-tegra/list/?series=212196

Dmitry Osipenko (30):
  dt-bindings: host1x: Document OPP and voltage regulator properties
  dt-bindings: mmc: tegra: Document OPP and voltage regulator properties
  dt-bindings: pwm: tegra: Document OPP and voltage regulator properties
  media: dt: bindings: tegra-vde: Document OPP and voltage regulator
    properties
  dt-binding: usb: ci-hdrc-usb2:  Document OPP and voltage regulator
    properties
  dt-bindings: usb: tegra-ehci: Document OPP and voltage regulator
    properties
  soc/tegra: Add sync state API
  soc/tegra: regulators: Support Tegra SoC device sync state API
  soc/tegra: regulators: Fix lockup when voltage-spread is out of range
  regulator: Allow skipping disabled regulators in
    regulator_check_consumers()
  drm/tegra: dc: Support OPP and SoC core voltage scaling
  drm/tegra: gr2d: Correct swapped device-tree compatibles
  drm/tegra: gr2d: Support OPP and SoC core voltage scaling
  drm/tegra: gr3d: Support OPP and SoC core voltage scaling
  drm/tegra: hdmi: Support OPP and SoC core voltage scaling
  gpu: host1x: Support OPP and SoC core voltage scaling
  mmc: sdhci-tegra: Support OPP and core voltage scaling
  pwm: tegra: Support OPP and core voltage scaling
  media: staging: tegra-vde: Support OPP and SoC core voltage scaling
  usb: chipidea: tegra: Support OPP and SoC core voltage scaling
  usb: host: ehci-tegra: Support OPP and SoC core voltage scaling
  memory: tegra20-emc: Support Tegra SoC device state syncing
  memory: tegra30-emc: Support Tegra SoC device state syncing
  ARM: tegra: Add OPP tables for Tegra20 peripheral devices
  ARM: tegra: Add OPP tables for Tegra30 peripheral devices
  ARM: tegra: ventana: Add voltage supplies to DVFS-capable devices
  ARM: tegra: paz00: Add voltage supplies to DVFS-capable devices
  ARM: tegra: acer-a500: Add voltage supplies to DVFS-capable devices
  ARM: tegra: cardhu-a04: Add voltage supplies to DVFS-capable devices
  ARM: tegra: nexus7: Add voltage supplies to DVFS-capable devices

 .../display/tegra/nvidia,tegra20-host1x.txt   |  56 +++
 .../bindings/media/nvidia,tegra-vde.txt       |  12 +
 .../bindings/mmc/nvidia,tegra20-sdhci.txt     |  12 +
 .../bindings/pwm/nvidia,tegra20-pwm.txt       |  13 +
 .../devicetree/bindings/usb/ci-hdrc-usb2.txt  |   4 +
 .../bindings/usb/nvidia,tegra20-ehci.txt      |   2 +
 .../boot/dts/tegra20-acer-a500-picasso.dts    |  30 +-
 arch/arm/boot/dts/tegra20-paz00.dts           |  40 +-
 .../arm/boot/dts/tegra20-peripherals-opp.dtsi | 386 ++++++++++++++++
 arch/arm/boot/dts/tegra20-ventana.dts         |  65 ++-
 arch/arm/boot/dts/tegra20.dtsi                |  14 +
 .../tegra30-asus-nexus7-grouper-common.dtsi   |  23 +
 arch/arm/boot/dts/tegra30-cardhu-a04.dts      |  44 ++
 .../arm/boot/dts/tegra30-peripherals-opp.dtsi | 415 ++++++++++++++++++
 arch/arm/boot/dts/tegra30.dtsi                |  13 +
 drivers/gpu/drm/tegra/Kconfig                 |   1 +
 drivers/gpu/drm/tegra/dc.c                    | 138 +++++-
 drivers/gpu/drm/tegra/dc.h                    |   5 +
 drivers/gpu/drm/tegra/gr2d.c                  | 140 +++++-
 drivers/gpu/drm/tegra/gr3d.c                  | 136 ++++++
 drivers/gpu/drm/tegra/hdmi.c                  |  63 ++-
 drivers/gpu/host1x/Kconfig                    |   1 +
 drivers/gpu/host1x/dev.c                      |  87 ++++
 drivers/memory/tegra/tegra20-emc.c            |   8 +-
 drivers/memory/tegra/tegra30-emc.c            |   8 +-
 drivers/mmc/host/Kconfig                      |   1 +
 drivers/mmc/host/sdhci-tegra.c                |  70 ++-
 drivers/pwm/Kconfig                           |   1 +
 drivers/pwm/pwm-tegra.c                       |  84 +++-
 drivers/regulator/core.c                      |  12 +-
 .../soc/samsung/exynos-regulator-coupler.c    |   2 +-
 drivers/soc/tegra/common.c                    | 152 ++++++-
 drivers/soc/tegra/regulators-tegra20.c        |  25 +-
 drivers/soc/tegra/regulators-tegra30.c        |  30 +-
 drivers/staging/media/tegra-vde/Kconfig       |   1 +
 drivers/staging/media/tegra-vde/vde.c         | 127 ++++++
 drivers/staging/media/tegra-vde/vde.h         |   1 +
 drivers/usb/chipidea/Kconfig                  |   1 +
 drivers/usb/chipidea/ci_hdrc_tegra.c          |  79 ++++
 drivers/usb/host/Kconfig                      |   1 +
 drivers/usb/host/ehci-tegra.c                 |  79 ++++
 include/linux/regulator/coupler.h             |   6 +-
 include/soc/tegra/common.h                    |  22 +
 43 files changed, 2360 insertions(+), 50 deletions(-)

Comments

Michał Mirosław Nov. 5, 2020, 1:45 a.m. UTC | #1
On Thu, Nov 05, 2020 at 02:43:57AM +0300, Dmitry Osipenko wrote:
> Introduce core voltage scaling for NVIDIA Tegra20/30 SoCs, which reduces
> power consumption and heating of the Tegra chips. The Tegra SoC has
> multiple hardware units which belong to the core power domain of the SoC
> and share the core voltage. The voltage must be selected in accordance
> with the minimum requirement of every core hardware unit.
[...]

Just looked briefly through the series - it looks like there is a lot of
code duplication in *_init_opp_table() functions. Could this be made
more generic / data-driven?

Best Regards
Michał Mirosław
Ulf Hansson Nov. 5, 2020, 9:45 a.m. UTC | #2
+ Viresh

On Thu, 5 Nov 2020 at 00:44, Dmitry Osipenko <digetx@gmail.com> wrote:
>
> Introduce core voltage scaling for NVIDIA Tegra20/30 SoCs, which reduces
> power consumption and heating of the Tegra chips. The Tegra SoC has
> multiple hardware units which belong to the core power domain of the SoC
> and share the core voltage. The voltage must be selected in accordance
> with the minimum requirement of every core hardware unit.
>
> [...]

I need some more time to review this, but just a quick check found a
few potential issues...

The "core-supply", that you specify as a regulator for each
controller's device node, is not the way we describe power domains.
Instead, it seems like you should register a power-domain provider
(with the help of genpd) and implement the ->set_performance_state()
callback for it. Each device node should then be hooked up to this
power-domain, rather than to a "core-supply". For DT bindings, please
have a look at Documentation/devicetree/bindings/power/power-domain.yaml
and Documentation/devicetree/bindings/power/power_domain.txt.

In regards to the "sync state" problem (preventing performance-state
changes until all consumers have been attached), this can then be
managed by the genpd provider driver instead.
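
Roughly, such a provider could look like this (a hedged sketch, not a
complete driver; the "core" supply name and encoding the performance
state directly in microvolts are only assumptions for illustration):

  #include <linux/kernel.h>
  #include <linux/platform_device.h>
  #include <linux/pm_domain.h>
  #include <linux/regulator/consumer.h>

  struct tegra_core_domain {
          struct generic_pm_domain genpd;
          struct regulator *reg;
  };

  static int tegra_core_set_performance_state(struct generic_pm_domain *genpd,
                                              unsigned int state)
  {
          struct tegra_core_domain *pd =
                  container_of(genpd, struct tegra_core_domain, genpd);

          /* Here the state value encodes the core voltage in microvolts. */
          return regulator_set_voltage(pd->reg, state, INT_MAX);
  }

  static int tegra_core_domain_probe(struct platform_device *pdev)
  {
          struct tegra_core_domain *pd;
          int err;

          pd = devm_kzalloc(&pdev->dev, sizeof(*pd), GFP_KERNEL);
          if (!pd)
                  return -ENOMEM;

          pd->reg = devm_regulator_get(&pdev->dev, "core");
          if (IS_ERR(pd->reg))
                  return PTR_ERR(pd->reg);

          pd->genpd.name = "core";
          pd->genpd.flags = GENPD_FLAG_ALWAYS_ON;
          pd->genpd.set_performance_state = tegra_core_set_performance_state;

          err = pm_genpd_init(&pd->genpd, NULL, false);
          if (err)
                  return err;

          return of_genpd_add_provider_simple(pdev->dev.of_node, &pd->genpd);
  }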

Kind regards
Uffe
Viresh Kumar Nov. 5, 2020, 10:06 a.m. UTC | #3
On 05-11-20, 10:45, Ulf Hansson wrote:
> + Viresh

Thanks Ulf. I found a bug in OPP core because you cc'd me here :)

> On Thu, 5 Nov 2020 at 00:44, Dmitry Osipenko <digetx@gmail.com> wrote:
> I need some more time to review this, but just a quick check found a
> few potential issues...
> 
> The "core-supply", that you specify as a regulator for each
> controller's device node, is not the way we describe power domains.

Maybe I misunderstood your comment here, but there are two ways of
scaling the voltage of a device depending on if it is a regulator (and
can be modeled as one in the kernel) or a power domain.

In the case of Qcom earlier (when we added the performance-state stuff),
the eventual hardware was out of the kernel's control and we didn't want
(weren't allowed) to model it as a virtual regulator just to pass the
votes to the RPM. And so we did what we did.

But if the hardware (where the voltage is required to be changed) is
indeed a regulator and is modeled as one, then what Dmitry has done
looks okay. i.e. add a supply in the device's node and microvolt
property in the DT entries.
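
On the driver side that boils down to something like this (sketch; the
"core" supply name is just an example):

  #include <linux/pm_opp.h>

  static const char * const core_supplies[] = { "core" };

  static int tegra_enable_dvfs(struct device *dev, unsigned long rate)
  {
          struct opp_table *opp_table;
          int err;

          /* Let the OPP core manage the supply per the opp-microvolt values. */
          opp_table = dev_pm_opp_set_regulators(dev, core_supplies,
                                                ARRAY_SIZE(core_supplies));
          if (IS_ERR(opp_table))
                  return PTR_ERR(opp_table);

          err = dev_pm_opp_of_add_table(dev);
          if (err)
                  return err;

          /* Picks the OPP for @rate and sets clock + voltage together. */
          return dev_pm_opp_set_rate(dev, rate);
  }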
Ulf Hansson Nov. 5, 2020, 10:34 a.m. UTC | #4
On Thu, 5 Nov 2020 at 11:06, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 05-11-20, 10:45, Ulf Hansson wrote:
> > + Viresh
>
> Thanks Ulf. I found a bug in OPP core because you cc'd me here :)

Happy to help. :-)

>
> > On Thu, 5 Nov 2020 at 00:44, Dmitry Osipenko <digetx@gmail.com> wrote:
> > I need some more time to review this, but just a quick check found a
> > few potential issues...
> >
> > The "core-supply", that you specify as a regulator for each
> > controller's device node, is not the way we describe power domains.
>
> Maybe I misunderstood your comment here, but there are two ways of
> scaling the voltage of a device depending on if it is a regulator (and
> can be modeled as one in the kernel) or a power domain.

I am not objecting to scaling the voltage through a regulator,
that's fine to me. However, encoding a power domain as a regulator
(even if it may seem like a regulator) isn't. Well, unless Mark Brown
has changed his mind about this.

In this case, it seems like the regulator supply belongs in the
description of the power domain provider.

>
> In the case of Qcom earlier (when we added the performance-state stuff),
> the eventual hardware was out of the kernel's control and we didn't want
> (weren't allowed) to model it as a virtual regulator just to pass the
> votes to the RPM. And so we did what we did.
>
> But if the hardware (where the voltage is required to be changed) is
> indeed a regulator and is modeled as one, then what Dmitry has done
> looks okay. i.e. add a supply in the device's node and microvolt
> property in the DT entries.

I guess I haven't paid enough attention to how power domain regulators
are being described then. I was under the impression that the CPUfreq
case was a bit specific - and we had legacy bindings to stick with.

Can you point me to some other existing examples of where power domain
regulators are specified as a regulator in each device's node?

Kind regards
Uffe
Viresh Kumar Nov. 5, 2020, 10:40 a.m. UTC | #5
On 05-11-20, 11:34, Ulf Hansson wrote:
> I am not objecting to scaling the voltage through a regulator,
> that's fine to me. However, encoding a power domain as a regulator
> (even if it may seem like a regulator) isn't. Well, unless Mark Brown
> has changed his mind about this.
>
> In this case, it seems like the regulator supply belongs in the
> description of the power domain provider.

Okay, I wasn't sure if it is a power domain or a regulator here. Btw,
how do we identify if it is a power domain or a regulator?

> > In the case of Qcom earlier (when we added the performance-state stuff),
> > the eventual hardware was out of the kernel's control and we didn't want
> > (weren't allowed) to model it as a virtual regulator just to pass the
> > votes to the RPM. And so we did what we did.
> >
> > But if the hardware (where the voltage is required to be changed) is
> > indeed a regulator and is modeled as one, then what Dmitry has done
> > looks okay. i.e. add a supply in the device's node and microvolt
> > property in the DT entries.
> 
> I guess I haven't paid enough attention to how power domain regulators
> are being described then. I was under the impression that the CPUfreq
> case was a bit specific - and we had legacy bindings to stick with.
> 
> Can you point me to some other existing examples of where power domain
> regulators are specified as a regulator in each device's node?

No, I thought it was a regulator here and not a power domain.
Ulf Hansson Nov. 5, 2020, 10:56 a.m. UTC | #6
On Thu, 5 Nov 2020 at 11:40, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 05-11-20, 11:34, Ulf Hansson wrote:
> > I am not objecting to scaling the voltage through a regulator,
> > that's fine to me. However, encoding a power domain as a regulator
> > (even if it may seem like a regulator) isn't. Well, unless Mark Brown
> > has changed his mind about this.
> >
> > In this case, it seems like the regulator supply belongs in the
> > description of the power domain provider.
>
> Okay, I wasn't sure if it is a power domain or a regulator here. Btw,
> how do we identify if it is a power domain or a regulator?

Good question. It's not a crystal-clear line between them, I think.

A power domain to me, means that some part of a silicon (a group of
controllers or just a single piece, for example) needs some kind of
resource (typically a power rail) to be enabled to be functional, to
start with. If there are operating points involved, that's also a
clear indication to me, that it's not a regular regulator.

Maybe we should try to specify this more exactly in some
documentation, somewhere.

>
> > > In the case of Qcom earlier (when we added the performance-state stuff),
> > > the eventual hardware was out of the kernel's control and we didn't want
> > > (weren't allowed) to model it as a virtual regulator just to pass the
> > > votes to the RPM. And so we did what we did.
> > >
> > > But if the hardware (where the voltage is required to be changed) is
> > > indeed a regulator and is modeled as one, then what Dmitry has done
> > > looks okay. i.e. add a supply in the device's node and microvolt
> > > property in the DT entries.
> >
> > I guess I haven't paid enough attention to how power domain regulators
> > are being described then. I was under the impression that the CPUfreq
> > case was a bit specific - and we had legacy bindings to stick with.
> >
> > Can you point me to some other existing examples of where power domain
> > regulators are specified as a regulator in each device's node?
>
> No, I thought it is a regulator here and not a power domain.

Okay, thanks!

Kind regards
Uffe
Viresh Kumar Nov. 5, 2020, 11:13 a.m. UTC | #7
On 05-11-20, 11:56, Ulf Hansson wrote:
> On Thu, 5 Nov 2020 at 11:40, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > Btw, how do we identify if it is a power domain or a regulator?

To be honest, I was a bit afraid and embarrassed to ask this question,
and was expecting people to make fun of me in return :)

> Good question. It's not a crystal-clear line between them, I think.

And I was relieved after reading this :)

> A power domain, to me, means that some part of the silicon (a group of
> controllers or just a single piece, for example) needs some kind of
> resource (typically a power rail) to be enabled to be functional, to
> start with.

Isn't this part of what a regulator does as well? I.e.
enabling/disabling of the regulator or power to a group of
controllers.

On top of that, the regulator does voltage/current scaling as well,
which power domains normally don't do (though we did that in the
performance-state case).

> If there are operating points involved, that's also a
> clear indication to me that it's not a regular regulator.

Is there any example of that? I hope by OPP you meant both freq and
voltage here. I am not sure if I know of a case where a power domain
handles both of them.

> Maybe we should try to specify this more exactly in some
> documentation, somewhere.

I think yes, it is very much required. And in the absence of that, I
think many (or most) of the platforms that also need to scale the
voltage would have modeled their hardware as a regulator and not a PM
domain.

What I always thought was:

- Module that can just enable/disable power to a block of SoC is a
  power domain.

- Module that can enable/disable as well as scale voltage is a
  regulator.

And so I thought that this patchset has done the right thing. This
changed a bit with the qcom stuff, where the IP to be configured was
under the control of the RPM and not Linux, and so we couldn't add it
as a regulator. If it was controlled by Linux, it would have been a
regulator in the kernel for sure :)
Ulf Hansson Nov. 5, 2020, 12:52 p.m. UTC | #8
On Thu, 5 Nov 2020 at 12:13, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 05-11-20, 11:56, Ulf Hansson wrote:
> > On Thu, 5 Nov 2020 at 11:40, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > Btw, how do we identify if it is a power domain or a regulator ?
>
> To be honest, I was a bit afraid and embarrassed to ask this question,
> and was expecting people to make fun of me in return :)
>
> > Good question. It's not a crystal-clear line between them, I think.
>
> And I was relieved after reading this :)
>
> > A power domain, to me, means that some part of the silicon (a group of
> > controllers or just a single piece, for example) needs some kind of
> > resource (typically a power rail) to be enabled to be functional, to
> > start with.
>
> Isn't this part of what a regulator does as well? I.e.
> enabling/disabling of the regulator or power to a group of
> controllers.

It could, but it shouldn't.

>
> On top of that, the regulator does voltage/current scaling as well,
> which power domains normally don't do (though we did that in the
> performance-state case).
>
> > If there are operating points involved, that's also a
> > clear indication to me that it's not a regular regulator.
>
> Is there any example of that? I hope by OPP you meant both freq and
> voltage here. I am not sure if I know of a case where a power domain
> handles both of them.

It may be both voltage and frequency - but in some cases only voltage.
From a HW point of view, many legacy ARM platforms have power domains
that work like this.

As you know, the DVFS case has for many years not been solved in a
generic way, but mostly via platform-specific hacks.

The worst ones are probably those hacking clock drivers (which I
myself have also contributed to). Have a look at
clk_prcmu_opp_prepare(), for example, which is used by the UX500
platform. Another option has been to use the devfreq framework, but it
has limitations in this regard too.

That said, I am hoping that people start moving towards
deploying/implementing DVFS through the power-domain approach,
together with the OPPs. Maybe there are still some pieces missing from
an infrastructure point of view, but that should become more evident
as more people start using it.

>
> > Maybe we should try to specify this more exactly in some
> > documentation, somewhere.
>
> I think yes, it is very much required. And in the absence of that, I
> think many (or most) of the platforms that also need to scale the
> voltage would have modeled their hardware as a regulator and not a PM
> domain.
>
> What I always thought was:
>
> - Module that can just enable/disable power to a block of SoC is a
>   power domain.
>
> - Module that can enable/disable as well as scale voltage is a
>   regulator.
>
> And so I thought that this patchset has done the right thing. This
> changed a bit with the qcom stuff, where the IP to be configured was
> under the control of the RPM and not Linux, and so we couldn't add it
> as a regulator. If it was controlled by Linux, it would have been a
> regulator in the kernel for sure :)

In my view, DT bindings have consistently been pushed back on over the
years when they have tried to model power domains as regulator supplies
from consumer device nodes. Hence, people have tried other things, as
I mentioned above.

I definitely agree that we need to update some documentation,
explaining things more exactly. Additionally, it seems like a talk at
some conference would make sense, as a way to spread the word.

Kind regards
Uffe
Dmitry Osipenko Nov. 5, 2020, 1:57 p.m. UTC | #9
05.11.2020 04:45, Michał Mirosław wrote:
> On Thu, Nov 05, 2020 at 02:43:57AM +0300, Dmitry Osipenko wrote:
>> Introduce core voltage scaling for NVIDIA Tegra20/30 SoCs, which reduces
>> power consumption and heating of the Tegra chips. The Tegra SoC has
>> multiple hardware units which belong to the core power domain of the SoC
>> and share the core voltage. The voltage must be selected in accordance
>> with the minimum requirement of every core hardware unit.
> [...]
> 
> Just looked briefly through the series - it looks like there is a lot of
> code duplication in *_init_opp_table() functions. Could this be made
> more generic / data-driven?

Indeed, it should be possible to add a common helper. I had a quick
thought about doing it too, but then decided to defer it for now,
since there were some differences among the needs of the drivers. I'll
take a closer look for v2, thanks!
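
Something along these lines, perhaps (an untested sketch against the
current OPP API; the helper name and parameters are provisional):

  #include <linux/err.h>
  #include <linux/pm_opp.h>

  static int tegra_init_opp_table(struct device *dev, u32 hw_version)
  {
          struct opp_table *clk_table, *hw_table;
          int err;

          /* scale the first clock listed in the device's "clocks" property */
          clk_table = dev_pm_opp_set_clkname(dev, NULL);
          if (IS_ERR(clk_table))
                  return PTR_ERR(clk_table);

          /* filter out OPPs not supported by this SoC version */
          hw_table = dev_pm_opp_set_supported_hw(dev, &hw_version, 1);
          if (IS_ERR(hw_table)) {
                  err = PTR_ERR(hw_table);
                  goto put_clkname;
          }

          err = dev_pm_opp_of_add_table(dev);
          if (err)
                  goto put_supported_hw;

          return 0;

  put_supported_hw:
          dev_pm_opp_put_supported_hw(hw_table);
  put_clkname:
          dev_pm_opp_put_clkname(clk_table);
          return err;
  }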
Dmitry Osipenko Nov. 5, 2020, 3:22 p.m. UTC | #10
05.11.2020 12:45, Ulf Hansson wrote:
...
> I need some more time to review this, but just a quick check found a
> few potential issues...

Thank you for starting the review! I'm pretty sure it will take a couple
of revisions until all the questions are resolved :)

> The "core-supply", that you specify as a regulator for each
> controller's device node, is not the way we describe power domains.
> Instead, it seems like you should register a power-domain provider
> (with the help of genpd) and implement the ->set_performance_state()
> callback for it. Each device node should then be hooked up to this
> power-domain, rather than to a "core-supply". For DT bindings, please
> have a look at Documentation/devicetree/bindings/power/power-domain.yaml
> and Documentation/devicetree/bindings/power/power_domain.txt.
> 
> In regards to the "sync state" problem (preventing performance-state
> changes until all consumers have been attached), this can then be
> managed by the genpd provider driver instead.

I'll need to take a closer look at GENPD, thank you for the suggestion.

Sounds like a software GENPD driver which manages clocks and voltages
could be a good idea, but it also could be unnecessary
over-engineering. Let's see...
Dmitry Osipenko Nov. 8, 2020, 12:19 p.m. UTC | #11
05.11.2020 18:22, Dmitry Osipenko wrote:
> 05.11.2020 12:45, Ulf Hansson wrote:
> ...
>> I need some more time to review this, but just a quick check found a
>> few potential issues...
> 
> Thank you for starting the review! I'm pretty sure it will take a couple
> of revisions until all the questions are resolved :)
> 
>> The "core-supply", that you specify as a regulator for each
>> controller's device node, is not the way we describe power domains.
>> Instead, it seems like you should register a power-domain provider
>> (with the help of genpd) and implement the ->set_performance_state()
>> callback for it. Each device node should then be hooked up to this
>> power-domain, rather than to a "core-supply". For DT bindings, please
>> have a look at Documentation/devicetree/bindings/power/power-domain.yaml
>> and Documentation/devicetree/bindings/power/power_domain.txt.
>>
>> In regards to the "sync state" problem (preventing performance-state
>> changes until all consumers have been attached), this can then be
>> managed by the genpd provider driver instead.
> 
> I'll need to take a closer look at GENPD, thank you for the suggestion.
> 
> Sounds like a software GENPD driver which manages clocks and voltages
> could be a good idea, but it also could be an unnecessary
> over-engineering. Let's see..
> 

Hello Ulf and all,

I took a detailed look at the GENPD and tried to implement it. Here is
what was found:

1. GENPD framework doesn't aggregate performance requests from the
attached devices. This means that if deviceA requests performance state
10 and then deviceB requests state 3, then framework will set domain's
state to 3 instead of 10.

https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/base/power/domain.c#L376

2. GENPD framework has a sync() callback in the genpd.domain structure,
but this callback isn't allowed to be used by the GENPD implementation.
The GENPD framework always overrides that callback for its own needs.
Hence GENPD doesn't allow solving the bootstrapping
state-synchronization problem in a nice way.

https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/base/power/domain.c#L2606

3. Tegra doesn't have a dedicated hardware power-controller for the core
domain, instead there is only an external voltage regulator. Hence we
will need to create a phony device-tree node for the virtual power
domain, which is probably a wrong thing to do.

===

Perhaps it should be possible to create some hacks to work around
bullets 2 and 3 in order to achieve what we need for DVFS on Tegra, but
bullet 1 isn't solvable without changing how the GENPD core works.

Altogether, the GENPD in its current form is a wrong abstraction for a
system-wide DVFS in a case where multiple devices share a power domain and
this domain is a voltage regulator. The regulator framework is the
correct abstraction in this case for today.
Viresh Kumar Nov. 9, 2020, 4:43 a.m. UTC | #12
On 08-11-20, 15:19, Dmitry Osipenko wrote:
> I took a detailed look at the GENPD and tried to implement it. Here is
> what was found:
> 
> 1. GENPD framework doesn't aggregate performance requests from the
> attached devices. This means that if deviceA requests performance state
> 10 and then deviceB requests state 3, then framework will set domain's
> state to 3 instead of 10.

It does. Look at _genpd_reeval_performance_state().
Dmitry Osipenko Nov. 9, 2020, 4:47 a.m. UTC | #13
09.11.2020 07:43, Viresh Kumar wrote:
> On 08-11-20, 15:19, Dmitry Osipenko wrote:
>> I took a detailed look at the GENPD and tried to implement it. Here is
>> what was found:
>>
>> 1. GENPD framework doesn't aggregate performance requests from the
>> attached devices. This means that if deviceA requests performance state
>> 10 and then deviceB requests state 3, then framework will set domain's
>> state to 3 instead of 10.
> 
> It does. Look at _genpd_reeval_performance_state().
> 

Thanks, I probably had a bug in the quick prototype and then overlooked
that function.
Dmitry Osipenko Nov. 9, 2020, 5:10 a.m. UTC | #14
09.11.2020 07:47, Dmitry Osipenko wrote:
> 09.11.2020 07:43, Viresh Kumar wrote:
>> On 08-11-20, 15:19, Dmitry Osipenko wrote:
>>> I took a detailed look at the GENPD and tried to implement it. Here is
>>> what was found:
>>>
>>> 1. GENPD framework doesn't aggregate performance requests from the
>>> attached devices. This means that if deviceA requests performance state
>>> 10 and then deviceB requests state 3, then framework will set domain's
>>> state to 3 instead of 10.
>>
>> It does. Look at _genpd_reeval_performance_state().
>>
> 
> Thanks, I probably had a bug in the quick prototype and then overlooked
> that function.
> 

If a non-hardware device-tree node is okay to have for the domain, then
I can try again.

What I also haven't mentioned is that GENPD adds some extra complexity
to some drivers (3d, video decoder) because we will need to handle both
new GENPD and legacy Tegra specific pre-genpd era domains.

I'm also not exactly sure how the topology of domains should look,
because Tegra has a power-controller (PMC) which manages the power rail
of a few hardware units. Perhaps it should be

  device -> PMC domain -> CORE domain

but not exactly sure for now.
Viresh Kumar Nov. 9, 2020, 5:12 a.m. UTC | #15
On 09-11-20, 08:10, Dmitry Osipenko wrote:
> 09.11.2020 07:47, Dmitry Osipenko wrote:
> > 09.11.2020 07:43, Viresh Kumar wrote:
> >> On 08-11-20, 15:19, Dmitry Osipenko wrote:
> >>> I took a detailed look at the GENPD and tried to implement it. Here is
> >>> what was found:
> >>>
> >>> 1. GENPD framework doesn't aggregate performance requests from the
> >>> attached devices. This means that if deviceA requests performance state
> >>> 10 and then deviceB requests state 3, then framework will set domain's
> >>> state to 3 instead of 10.
> >>
> >> It does. Look at _genpd_reeval_performance_state().
> >>
> > 
> > Thanks, I probably had a bug in the quick prototype and then overlooked
> > that function.
> > 
> 
> If a non-hardware device-tree node is okay to have for the domain, then
> I can try again.
> 
> What I also haven't mentioned is that GENPD adds some extra complexity
> to some drivers (3d, video decoder) because we will need to handle both
> new GENPD and legacy Tegra specific pre-genpd era domains.
> 
> I'm also not exactly sure how the topology of domains should look,
> because Tegra has a power-controller (PMC) which manages the power rail
> of a few hardware units. Perhaps it should be
> 
>   device -> PMC domain -> CORE domain
> 
> but not exactly sure for now.

I am also confused about whether it should be a domain or a regulator,
but that is for Ulf to tell :)
Ulf Hansson Nov. 11, 2020, 11:38 a.m. UTC | #16
On Sun, 8 Nov 2020 at 13:19, Dmitry Osipenko <digetx@gmail.com> wrote:
>
> 05.11.2020 18:22, Dmitry Osipenko wrote:
> > 05.11.2020 12:45, Ulf Hansson wrote:
> > ...
> >> I need some more time to review this, but just a quick check found a
> >> few potential issues...
> >
> > Thank you for starting the review! I'm pretty sure it will take a couple
> > of revisions until all the questions are resolved :)
> >
> >> The "core-supply", that you specify as a regulator for each
> >> controller's device node, is not the way we describe power domains.
> >> Instead, it seems like you should register a power-domain provider
> >> (with the help of genpd) and implement the ->set_performance_state()
> >> callback for it. Each device node should then be hooked up to this
> >> power-domain, rather than to a "core-supply". For DT bindings, please
> >> have a look at Documentation/devicetree/bindings/power/power-domain.yaml
> >> and Documentation/devicetree/bindings/power/power_domain.txt.
> >>
> >> In regards to the "sync state" problem (preventing performance-state
> >> changes until all consumers have been attached), this can then be
> >> managed by the genpd provider driver instead.
> >
> > I'll need to take a closer look at GENPD, thank you for the suggestion.
> >
> > Sounds like a software GENPD driver which manages clocks and voltages
> > could be a good idea, but it also could be unnecessary
> > over-engineering. Let's see...
> >
>
> Hello Ulf and all,
>
> I took a detailed look at the GENPD and tried to implement it. Here is
> what was found:
>
> 1. GENPD framework doesn't aggregate performance requests from the
> attached devices. This means that if deviceA requests performance state
> 10 and then deviceB requests state 3, then framework will set domain's
> state to 3 instead of 10.
>
> https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/base/power/domain.c#L376

As Viresh also stated, genpd does aggregate the votes. It even
performs the aggregation hierarchically (a genpd is allowed to have
parent(s) to model a topology).

>
> 2. GENPD framework has a sync() callback in the genpd.domain structure,
> but this callback isn't allowed to be used by the GENPD implementation.
> The GENPD framework always overrides that callback for its own needs.
> Hence GENPD doesn't allow solving the bootstrapping
> state-synchronization problem in a nice way.
>
> https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/base/power/domain.c#L2606

That ->sync() callback isn't the callback you are looking for, it's a
PM domain specific callback - and has other purposes.

To solve the problem you refer to, your genpd provider driver (a
platform driver) should assign its ->sync_state() callback. The
->sync_state() callback will be invoked, when all consumer devices
have been attached (and probed) to their corresponding provider.

You may have a look at drivers/cpuidle/cpuidle-psci-domain.c, to see
an example of how this works. If there is anything unclear, just tell
me and I will try to help.
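
In code the pattern is roughly this (a sketch; the probe function and
the helper that drops the boot-time vote are hypothetical):

  #include <linux/platform_device.h>

  static int tegra_core_domain_probe(struct platform_device *pdev);
  /* hypothetical: releases the boot-time performance-state floor */
  static void tegra_core_domain_drop_boot_level(struct device *dev);

  static void tegra_core_domain_sync_state(struct device *dev)
  {
          /* All consumer devices have been attached and probed by now. */
          tegra_core_domain_drop_boot_level(dev);
  }

  static struct platform_driver tegra_core_domain_driver = {
          .driver = {
                  .name = "tegra-core-domain",
                  .sync_state = tegra_core_domain_sync_state,
          },
          .probe = tegra_core_domain_probe,
  };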

>
> 3. Tegra doesn't have a dedicated hardware power-controller for the core
> domain, instead there is only an external voltage regulator. Hence we
> will need to create a phony device-tree node for the virtual power
> domain, which is probably a wrong thing to do.

No, this is absolutely the correct thing to do.

This isn't a virtual power domain, it's a real power domain. You only
happen to model the control of it as a regulator, as it fits nicely
with that for *this* SoC. Don't get me wrong, that's fine as long as
the supply is specified only in the power-domain provider node.

On another SoC, you might have a different FW interface for the power
domain provider that doesn't fit well with the regulator. When that
happens, all you need to do is to implement a new power domain
provider and potentially re-define the power domain topology. More
importantly, you don't need to re-invent yet another slew of device
specific bindings - for each SoC.

>
> ===
>
> Perhaps it should be possible to create some hacks to work around
> bullets 2 and 3 in order to achieve what we need for DVFS on Tegra, but
> bullet 1 isn't solvable without changing how the GENPD core works.
>
> Altogether, the GENPD in its current form is a wrong abstraction for a
> system-wide DVFS in a case where multiple devices share a power domain and
> this domain is a voltage regulator. The regulator framework is the
> correct abstraction in this case for today.

Well, I admit it's a bit complex. But it solves the problem in a
nicely abstracted way that should work for everybody, at least in my
opinion.

Although, let's not exclude that there are pieces missing in genpd or
the opp layer, as this DVFS feature is rather new - but then we should
just extend/fix it.

Kind regards
Uffe
Dmitry Osipenko Nov. 12, 2020, 7:57 p.m. UTC | #17
11.11.2020 14:38, Ulf Hansson wrote:
> On Sun, 8 Nov 2020 at 13:19, Dmitry Osipenko <digetx@gmail.com> wrote:
>>
>> 05.11.2020 18:22, Dmitry Osipenko wrote:
>>> 05.11.2020 12:45, Ulf Hansson wrote:
>>> ...
>>>> I need some more time to review this, but just a quick check found a
>>>> few potential issues...
>>>
>>> Thank you for starting the review! I'm pretty sure it will take a couple
>>> of revisions until all the questions are resolved :)
>>>
>>>> The "core-supply", that you specify as a regulator for each
>>>> controller's device node, is not the way we describe power domains.
>>>> Instead, it seems like you should register a power-domain provider
>>>> (with the help of genpd) and implement the ->set_performance_state()
>>>> callback for it. Each device node should then be hooked up to this
>>>> power-domain, rather than to a "core-supply". For DT bindings, please
>>>> have a look at Documentation/devicetree/bindings/power/power-domain.yaml
>>>> and Documentation/devicetree/bindings/power/power_domain.txt.
>>>>
>>>> In regards to the "sync state" problem (preventing performance-state
>>>> changes until all consumers have been attached), this can then be
>>>> managed by the genpd provider driver instead.
>>>
>>> I'll need to take a closer look at GENPD, thank you for the suggestion.
>>>
>>> Sounds like a software GENPD driver which manages clocks and voltages
>>> could be a good idea, but it also could be unnecessary
>>> over-engineering. Let's see...
>>>
>>
>> Hello Ulf and all,
>>
>> I took a detailed look at the GENPD and tried to implement it. Here is
>> what was found:
>>
>> 1. GENPD framework doesn't aggregate performance requests from the
>> attached devices. This means that if deviceA requests performance state
>> 10 and then deviceB requests state 3, then framework will set domain's
>> state to 3 instead of 10.
>>
>> https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/base/power/domain.c#L376
> 
> As Viresh also stated, genpd does aggregate the votes. It even
> performs the aggregation hierarchically (a genpd is allowed to have
> parent(s) to model a topology).

Yes, I already found and fixed the bug which confused me previously and
it's working well now.

>> 2. GENPD framework has a sync() callback in the genpd.domain structure,
>> but this callback isn't allowed to be used by the GENPD implementation.
>> The GENPD framework always overrides that callback for its own needs.
>> Hence GENPD doesn't allow solving the bootstrapping
>> state-synchronization problem in a nice way.
>>
>> https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/base/power/domain.c#L2606
> 
> That ->sync() callback isn't the callback you are looking for, it's a
> PM domain specific callback - and has other purposes.
> 
> To solve the problem you refer to, your genpd provider driver (a
> platform driver) should assign its ->sync_state() callback. The
> ->sync_state() callback will be invoked, when all consumer devices
> have been attached (and probed) to their corresponding provider.
> 
> You may have a look at drivers/cpuidle/cpuidle-psci-domain.c, to see
> an example of how this works. If there is anything unclear, just tell
> me and I will try to help.

Indeed, thank you for the clarification. This variant works well.

>> 3. Tegra doesn't have a dedicated hardware power-controller for the core
>> domain, instead there is only an external voltage regulator. Hence we
>> will need to create a phony device-tree node for the virtual power
>> domain, which is probably a wrong thing to do.
> 
> No, this is absolutely the correct thing to do.
> 
> This isn't a virtual power domain, it's a real power domain. You only
> happen to model the control of it as a regulator, as it fits nicely
> with that for *this* SoC. Don't get me wrong, that's fine as long as
> the supply is specified only in the power-domain provider node.
> 
> On another SoC, you might have a different FW interface for the power
> domain provider that doesn't fit well with the regulator. When that
> happens, all you need to do is to implement a new power domain
> provider and potentially re-define the power domain topology. More
> importantly, you don't need to re-invent yet another slew of device
> specific bindings - for each SoC.
> 
>>
>> ===
>>
>> Perhaps it should be possible to create some hacks to work around
>> bullets 2 and 3 in order to achieve what we need for DVFS on Tegra, but
>> bullet 1 isn't solvable without changing how the GENPD core works.
>>
>> Altogether, the GENPD in its current form is a wrong abstraction for a
>> system-wide DVFS in a case where multiple devices share a power domain and
>> this domain is a voltage regulator. The regulator framework is the
>> correct abstraction in this case for today.
> 
> Well, I admit it's a bit complex. But it solves the problem in a
> nicely abstracted way that should work for everybody, at least in my
> opinion.

The OPP framework supports both voltage regulators and power domains,
hiding the implementation details from drivers. This means that the OPP
API usage will be the same regardless of which approach (regulator or
power domain) is used for a particular SoC.
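
I.e. the driver-facing call stays the same in both cases, roughly:

  /*
   * Picks the OPP for the target rate; the OPP core then either adjusts
   * the supply per opp-microvolt (regulator case) or votes on the
   * domain's performance state via required-opps (power-domain case).
   */
  err = dev_pm_opp_set_rate(dev, target_rate);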

> Although, let's not exclude that there are pieces missing in genpd or
> the opp layer, as this DVFS feature is rather new - but then we should
> just extend/fix it.

It will be nice to have per-device GENPD performance stats.

Thierry, could you please let me know what you think about replacing
the regulator with a power domain? Do you think it's a worthwhile change?

The difference in comparison to using voltage regulator directly is
minimal, basically the core-supply phandle is replaced with
a power-domain phandle in a device tree.

The only thing which makes me feel a bit uncomfortable is that there is
no real hardware node for the power domain node in a device-tree.
Thierry Reding Nov. 12, 2020, 8:43 p.m. UTC | #18
On Thu, Nov 12, 2020 at 10:57:27PM +0300, Dmitry Osipenko wrote:
> 11.11.2020 14:38, Ulf Hansson wrote:
> [...]
> > 
> > Well, I admit it's a bit complex. But it solves the problem in a
> > nicely abstracted way that should work for everybody, at least in my
> > opinion.
> 
> The OPP framework supports both voltage regulators and power domains,
> hiding the implementation details from drivers. This means that the OPP
> API usage will be the same regardless of which approach (regulator or
> power domain) is used for a particular SoC.
> 
> > Although, let's not exclude that there are pieces missing in genpd or
> > the opp layer, as this DVFS feature is rather new - but then we should
> > just extend/fix it.
> 
> It will be nice to have per-device GENPD performance stats.
> 
> Thierry, could you please let me know what you think about replacing
> the regulator with a power domain? Do you think it's a worthwhile change?
> 
> The difference in comparison to using voltage regulator directly is
> minimal, basically the core-supply phandle is replaced with
> a power-domain phandle in a device tree.

These new power-domain handles would have to be added to devices that
potentially already have a power-domain handle, right? Isn't that going
to cause issues? I vaguely recall that we already have multiple power
domains for the XUSB controller and we have to jump through extra hoops
to make that work.

> The only thing which makes me feel a bit uncomfortable is that there is
> no real hardware node for the power domain node in a device-tree.

Could we anchor the new power domain at the PMC for example? That would
allow us to avoid the "virtual" node. On the other hand, if we were to
use a regulator, we'd be adding a node for that, right? So isn't this
effectively going to be the same node if we use a power domain? Both
software constructs are using the same voltage regulator, so they should
be able to be described by the same device tree node, shouldn't they?

Thierry
Dmitry Osipenko Nov. 12, 2020, 10:14 p.m. UTC | #19
12.11.2020 23:43, Thierry Reding wrote:
>> The difference in comparison to using voltage regulator directly is
>> minimal, basically the core-supply phandle is replaced with
>> a power-domain phandle in a device tree.
> These new power-domain handles would have to be added to devices that
> potentially already have a power-domain handle, right? Isn't that going
> to cause issues? I vaguely recall that we already have multiple power
> domains for the XUSB controller and we have to jump through extra hoops
> to make that work.

I modeled the core PD as a parent of the PMC sub-domains, which
presumably is a correct way to represent the domains topology.

https://gist.github.com/digetx/dfd92c7f7e0aa6cef20403c4298088d7
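
In genpd terms, each PMC power-gate domain becomes a child of the core
domain, so that its votes propagate upwards (sketch; names made up):

  err = pm_genpd_add_subdomain(&core_genpd, &pmc_powergate_genpd);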

>> The only thing which makes me feel a bit uncomfortable is that there is
>> no real hardware node for the power domain node in a device-tree.
> Could we anchor the new power domain at the PMC for example? That would
> allow us to avoid the "virtual" node.

I had a thought about using PMC for the core domain, but not sure
whether it would be an entirely correct hardware description. Although
it would be nice to have it this way.

This is what Tegra TRM says about PMC:

"The Power Management Controller (PMC) block interacts with an external
or Power Manager Unit (PMU). The PMC mostly controls the entry and exit
of the system from different sleep modes. It provides power-gating
controllers for SOC and CPU power-islands and also provides scratch
storage to save some of the context during sleep modes (when CPU and/or
SOC power rails are off). Additionally, PMC interacts with the external
Power Manager Unit (PMU)."

The core voltage regulator is a part of the PMU.

Not all core SoC devices are behind PMC, IIUC.

> On the other hand, if we were to
> use a regulator, we'd be adding a node for that, right? So isn't this
> effectively going to be the same node if we use a power domain? Both
> software constructs are using the same voltage regulator, so they should
> be able to be described by the same device tree node, shouldn't they?

I'm not exactly sure what you mean by "use a regulator" and "we'd
be adding a node for that", could you please clarify? This v1 approach
uses a core-supply phandle (i.e. a regulator is used); it doesn't
require extra nodes.
Ulf Hansson Nov. 13, 2020, 2:45 p.m. UTC | #20
On Thu, 12 Nov 2020 at 23:14, Dmitry Osipenko <digetx@gmail.com> wrote:
>
> 12.11.2020 23:43, Thierry Reding wrote:
> >> The difference in comparison to using voltage regulator directly is
> >> minimal, basically the core-supply phandle is replaced with
> >> a power-domain phandle in a device tree.
> > These new power-domain handles would have to be added to devices that
> > potentially already have a power-domain handle, right? Isn't that going
> > to cause issues? I vaguely recall that we already have multiple power
> > domains for the XUSB controller and we have to jump through extra hoops
> > to make that work.
>
> I modeled the core PD as a parent of the PMC sub-domains, which
> presumably is a correct way to represent the domains topology.
>
> https://gist.github.com/digetx/dfd92c7f7e0aa6cef20403c4298088d7

That could make sense, it seems.

Anyway, this made me realize that
dev_pm_genpd_set_performance_state(dev) returns -EINVAL in case the
device's genpd doesn't have ->set_performance_state() assigned. This
may not be correct. Instead, we should likely treat a missing callback
as okay and continue to walk the topology upwards to the parent
domain, etc.

Just wanted to point this out. I intend to post a patch for this as
soon as I can.
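
Roughly, the change in behaviour would look something like this (an
illustrative sketch of the genpd internals, not the actual patch; the
propagate_to_parents() helper is hypothetical):

	static int genpd_set_performance_state(struct generic_pm_domain *genpd,
					       unsigned int state)
	{
		/* The domain implements the callback: use it as before. */
		if (genpd->set_performance_state)
			return genpd->set_performance_state(genpd, state);

		/*
		 * No callback here: instead of returning -EINVAL, treat
		 * it as okay and let a parent domain handle the aggregated
		 * state (hypothetical helper standing in for the real
		 * propagation logic).
		 */
		return propagate_to_parents(genpd, state);
	}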

[...]

Kind regards
Uffe
Dmitry Osipenko Nov. 13, 2020, 4 p.m. UTC | #21
13.11.2020 17:45, Ulf Hansson wrote:
> On Thu, 12 Nov 2020 at 23:14, Dmitry Osipenko <digetx@gmail.com> wrote:
>>
>> 12.11.2020 23:43, Thierry Reding wrote:
>>>> The difference in comparison to using a voltage regulator directly is
>>>> minimal; basically, the core-supply phandle is replaced with a
>>>> power-domain phandle in the device tree.
>>> These new power-domain handles would have to be added to devices that
>>> potentially already have a power-domain handle, right? Isn't that going
>>> to cause issues? I vaguely recall that we already have multiple power
>>> domains for the XUSB controller and we have to jump through extra hoops
>>> to make that work.
>>
>> I modeled the core PD as a parent of the PMC sub-domains, which
>> presumably is the correct way to represent the domain topology.
>>
>> https://gist.github.com/digetx/dfd92c7f7e0aa6cef20403c4298088d7
> 
> That could make sense, it seems.
> 
> Anyway, this made me realize that
> dev_pm_genpd_set_performance_state(dev) returns -EINVAL in case the
> device's genpd doesn't have ->set_performance_state() assigned. This
> may not be correct. Instead, we should likely treat a missing callback
> as okay and continue to walk the topology upwards to the parent
> domain, etc.
> 
> Just wanted to point this out. I intend to post a patch for this as
> soon as I can.

Thank you, I was going to make the same change myself but haven't
gotten around to it so far. Please feel free to CC me on the patch.
Thierry Reding Nov. 13, 2020, 4:35 p.m. UTC | #22
On Fri, Nov 13, 2020 at 01:14:45AM +0300, Dmitry Osipenko wrote:
> 12.11.2020 23:43, Thierry Reding wrote:
> >> The difference in comparison to using a voltage regulator directly is
> >> minimal; basically, the core-supply phandle is replaced with a
> >> power-domain phandle in the device tree.
> > These new power-domain handles would have to be added to devices that
> > potentially already have a power-domain handle, right? Isn't that going
> > to cause issues? I vaguely recall that we already have multiple power
> > domains for the XUSB controller and we have to jump through extra hoops
> > to make that work.
> 
> I modeled the core PD as a parent of the PMC sub-domains, which
> presumably is the correct way to represent the domain topology.
> 
> https://gist.github.com/digetx/dfd92c7f7e0aa6cef20403c4298088d7
> 
> >> The only thing which makes me feel a bit uncomfortable is that there is
> >> no real hardware node behind the power domain node in the device tree.
> > Could we anchor the new power domain at the PMC for example? That would
> > allow us to avoid the "virtual" node.
> 
> I had thought about using the PMC for the core domain, but I'm not sure
> whether that would be an entirely correct hardware description,
> although it would be nice to have it that way.
> 
> This is what the Tegra TRM says about the PMC:
> 
> "The Power Management Controller (PMC) block interacts with an external
> or Power Manager Unit (PMU). The PMC mostly controls the entry and exit
> of the system from different sleep modes. It provides power-gating
> controllers for SOC and CPU power-islands and also provides scratch
> storage to save some of the context during sleep modes (when CPU and/or
> SOC power rails are off). Additionally, PMC interacts with the external
> Power Manager Unit (PMU)."
> 
> The core voltage regulator is a part of the PMU.
> 
> Not all core SoC devices are behind PMC, IIUC.

There are usually some SoC devices that are always-on. Things like the
RTC, for example, can never be power-gated, as far as I recall. On newer
chips there are usually many more blocks that can't be power-gated at
all.

> > On the other hand, if we were to
> > use a regulator, we'd be adding a node for that, right? So isn't this
> > effectively going to be the same node if we use a power domain? Both
> > software constructs are using the same voltage regulator, so they should
> > be able to be described by the same device tree node, shouldn't they?
> 
> I'm not exactly sure what you mean by "use a regulator" and "we'd be
> adding a node for that"; could you please clarify? This v1 approach
> uses a core-supply phandle (i.e. a regulator is used), so it doesn't
> require extra nodes.

What I meant to say was that the actual supply voltage is generated by
some device (typically one of the SD outputs of the PMIC). Whether we
model this as a power domain or a regulator doesn't really matter,
right? So I'm wondering if the device that generates the voltage should
be the power domain provider, just as it would be the provider of the
regulator if this were modelled as a regulator.
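
For illustration, that would mean turning the PMIC output itself into
the provider, along these lines (purely hypothetical; no such binding
exists at this point, and the regulator names are made up):

	&pmic {
		regulators {
			vdd_core: sd0 {
				regulator-name = "vdd_core";
				regulator-min-microvolt = <950000>;
				regulator-max-microvolt = <1300000>;
				/* hypothetical: regulator doubles as genpd provider */
				#power-domain-cells = <0>;
			};
		};
	};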

Thierry
Dmitry Osipenko Nov. 15, 2020, 4:29 p.m. UTC | #23
13.11.2020 19:35, Thierry Reding wrote:
> On Fri, Nov 13, 2020 at 01:14:45AM +0300, Dmitry Osipenko wrote:
>> 12.11.2020 23:43, Thierry Reding wrote:
>>>> The difference in comparison to using a voltage regulator directly is
>>>> minimal; basically, the core-supply phandle is replaced with a
>>>> power-domain phandle in the device tree.
>>> These new power-domain handles would have to be added to devices that
>>> potentially already have a power-domain handle, right? Isn't that going
>>> to cause issues? I vaguely recall that we already have multiple power
>>> domains for the XUSB controller and we have to jump through extra hoops
>>> to make that work.
>>
>> I modeled the core PD as a parent of the PMC sub-domains, which
>> presumably is the correct way to represent the domain topology.
>>
>> https://gist.github.com/digetx/dfd92c7f7e0aa6cef20403c4298088d7
>>
>>>> The only thing which makes me feel a bit uncomfortable is that there is
>>>> no real hardware node behind the power domain node in the device tree.
>>> Could we anchor the new power domain at the PMC for example? That would
>>> allow us to avoid the "virtual" node.
>>
>> I had thought about using the PMC for the core domain, but I'm not sure
>> whether that would be an entirely correct hardware description,
>> although it would be nice to have it that way.
>>
>> This is what the Tegra TRM says about the PMC:
>>
>> "The Power Management Controller (PMC) block interacts with an external
>> or Power Manager Unit (PMU). The PMC mostly controls the entry and exit
>> of the system from different sleep modes. It provides power-gating
>> controllers for SOC and CPU power-islands and also provides scratch
>> storage to save some of the context during sleep modes (when CPU and/or
>> SOC power rails are off). Additionally, PMC interacts with the external
>> Power Manager Unit (PMU)."
>>
>> The core voltage regulator is a part of the PMU.
>>
>> Not all core SoC devices are behind PMC, IIUC.
> 
> There are usually some SoC devices that are always-on. Things like the
> RTC, for example, can never be power-gated, as far as I recall. On newer
> chips there are usually many more blocks that can't be power-gated at
> all.

The RTC is actually a special power domain on Tegra; it's not part of
the CORE domain, and the two are separate from each other.

We need to know which blocks belong to a power domain and what the
power topology of those blocks is. I think we already have this
knowledge, so it shouldn't be a problem.

>>> On the other hand, if we were to
>>> use a regulator, we'd be adding a node for that, right? So isn't this
>>> effectively going to be the same node if we use a power domain? Both
>>> software constructs are using the same voltage regulator, so they should
>>> be able to be described by the same device tree node, shouldn't they?
>>
>> I'm not exactly sure what you mean by "use a regulator" and "we'd be
>> adding a node for that"; could you please clarify? This v1 approach
>> uses a core-supply phandle (i.e. a regulator is used), so it doesn't
>> require extra nodes.
> 
> What I meant to say was that the actual supply voltage is generated by
> some device (typically one of the SD outputs of the PMIC). Whether we
> model this as a power domain or a regulator doesn't really matter,
> right? So I'm wondering if the device that generates the voltage should
> be the power domain provider, just as it would be the provider of the
> regulator if this were modelled as a regulator.

Technically this could be done, and it shouldn't be difficult to add
genpd support to the regulator framework, but I think it would be an
inaccurate hardware description.

It wouldn't be correct to describe internal SoC parts as directly
connected to an external voltage regulator. The core voltage regulator
is connected to one of several power rails of the Tegra chip. There is
no good way to describe this hardware purely in terms of voltage
regulators, which is why this v1 series adds a core-supply to each SoC
component in each board's DT individually.

This is actually one of the benefits of using a separate DT node for
the power domain: it describes the "Tegra Core" part of the Tegra SoC,
and thus it all stays within tegra.dtsi. This means the PD explicitly
belongs to the SoC internals, as opposed to describing the PD as if it
were an external/off-chip component.

Initially I didn't much like that there is no hardware address to back
the power domain node in the DT, but in fact there is no address for
the power rail either. Hence it seems better to describe the hardware
by keeping the PD internal to the SoC. Note that the PD may require
knowledge of the specifics of a particular SoC, while an external
regulator doesn't belong to the SoC. Also, technically there could be
multiple external regulators powering a single SoC rail.
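
Concretely, something along these lines, with only the supply
assignment left to the board DT (the labels and the compatible string
are illustrative assumptions):

	/* tegra20.dtsi: the domain is part of the SoC description */
	core_pd: core-domain {
		compatible = "nvidia,tegra20-core-domain"; /* illustrative */
		operating-points-v2 = <&core_opp_table>;
		#power-domain-cells = <0>;
	};

	/* board dts: only the board-specific regulator is attached */
	&core_pd {
		power-supply = <&vdd_core>;
	};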
Dmitry Osipenko Dec. 1, 2020, 2:17 p.m. UTC | #24
01.12.2020 16:57, Mark Brown wrote:
> On Thu, 5 Nov 2020 02:43:57 +0300, Dmitry Osipenko wrote:
>> Introduce core voltage scaling for NVIDIA Tegra20/30 SoCs, which reduces
>> power consumption and heating of the Tegra chips. Tegra SoC has multiple
>> hardware units which belong to a core power domain of the SoC and share
>> the core voltage. The voltage must be selected in accordance to a minimum
>> requirement of every core hardware unit.
>>
>> The minimum core voltage requirement depends on:
>>
>> [...]
> 
> Applied to
> 
>    https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
> 
> Thanks!
> 
> [1/1] regulator: Allow skipping disabled regulators in regulator_check_consumers()
>       (no commit info)
> 
> All being well this means that it will be integrated into the linux-next
> tree (usually sometime in the next 24 hours) and sent to Linus during
> the next merge window (or sooner if it is a bug fix), however if
> problems are discovered then the patch may be dropped or reverted.
> 
> You may get further e-mails resulting from automated or manual testing
> and review of the tree, please engage with people reporting problems and
> send followup patches addressing any issues that are reported if needed.
> 
> If any updates are required or you are submitting further changes they
> should be sent as incremental updates against current git, existing
> patches will not be replaced.
> 
> Please add any relevant lists and maintainers to the CCs when replying
> to this mail.

Hello Mark,

Could you please hold off on this patch? It won't be needed in v2,
which will use power domains.

Also, I'm not sure whether the "sound" tree is suitable for any of the
patches in this series.
Mark Brown Dec. 1, 2020, 2:34 p.m. UTC | #25
On Tue, Dec 01, 2020 at 05:17:20PM +0300, Dmitry Osipenko wrote:
> 01.12.2020 16:57, Mark Brown wrote:

> > [1/1] regulator: Allow skipping disabled regulators in regulator_check_consumers()
> >       (no commit info)

> Could you please hold off on this patch? It won't be needed in v2,
> which will use power domains.

> Also, I'm not sure whether the "sound" tree is suitable for any of the
> patches in this series.

It didn't actually get applied (note the "no commit info") - it looks
like b4's matching code got confused and decided to generate mails for
anything that I've ever downloaded and not posted.
Dmitry Osipenko Dec. 1, 2020, 2:44 p.m. UTC | #26
01.12.2020 17:34, Mark Brown wrote:
> On Tue, Dec 01, 2020 at 05:17:20PM +0300, Dmitry Osipenko wrote:
>> 01.12.2020 16:57, Mark Brown wrote:
> 
>>> [1/1] regulator: Allow skipping disabled regulators in regulator_check_consumers()
>>>       (no commit info)
> 
>> Could you please hold off on this patch? It won't be needed in v2,
>> which will use power domains.
> 
>> Also, I'm not sure whether the "sound" tree is suitable for any of the
>> patches in this series.
> 
> It didn't actually get applied (note the "no commit info") - it looks
> like b4's matching code got confused and decided to generate mails for
> anything that I've ever downloaded and not posted.
> 

Alright, thank you for the clarification.