Message ID: 52c403454c3b8fc201abe7ac74cf657638479311.1417691389.git.viresh.kumar@linaro.org (mailing list archive)
State: New, archived
Hi Viresh,

not commenting on the overall structure as I have to think a bit more about
this. But small comments below.

On Thursday, 04.12.2014, 16:44 +0530, Viresh Kumar wrote:
> Hi Rob, et al..
>
> Current OPP (Operating Performance Point) DT bindings have proven to be
> insufficient in multiple instances.
>
> There have been multiple band-aid approaches to get them fixed (the latest one
> being: http://www.mail-archive.com/devicetree@vger.kernel.org/msg53398.html).
> For obvious reasons Rob rejected them and showed the right path forward. And
> this is the first try to get those down with pen and paper.
>
> The shortcomings we are trying to solve here:
>
> - Some kind of compatibility string to probe the right cpufreq driver for
>   platforms when multiple drivers are available. For example: how to choose
>   between the cpufreq-dt and arm_big_little drivers.
>
> - Getting clock sharing information between CPUs: single shared clock vs.
>   independent clock per core vs. shared clock per cluster.
>
> - Support for turbo modes.
>
> - Other per-OPP settings: transition latencies, disabled status, etc.?
>
> The below document should be enough to describe how I am trying to fix these.
> Please let me know what all I need to fix; surely there will be lots of
> obstacles. I am prepared to get beaten up :)
>
> I accept in advance that the naming is extremely bad here, I need some
> suggestions for sure.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  Documentation/devicetree/bindings/power/opp.txt | 147 ++++++++++++++++++++++++
>  1 file changed, 147 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt
> index 74499e5..5efd8d4 100644
> --- a/Documentation/devicetree/bindings/power/opp.txt
> +++ b/Documentation/devicetree/bindings/power/opp.txt
> @@ -4,6 +4,153 @@ SoCs have a standard set of tuples consisting of frequency and
>  voltage pairs that the device will support per voltage domain. These
>  are called Operating Performance Points or OPPs.
>
> +This document defines OPP bindings with their required/optional properties.
> +OPPs can be defined for any device; this file uses the CPU device as an
> +example to illustrate how to define OPPs.
> +
> +linux,operating-points, opp-lists and opps:
> +
> +- linux,operating-points:
> +  Container of all OPP nodes.
> +
> +  Required properties:
> +  - opp nodes (explained below)
> +
> +  Optional properties:
> +  - compatible: allow OPPs to express their compatibility with devices
> +
> +
> +- opp-list@*:
> +  List of nodes defining performance points. The following properties belong
> +  to the nodes within the opp-lists.
> +
> +  Required properties:
> +  - frequency-kHz: Frequency in kHz
> +  - voltage-uV: voltage in micro Volts
> +
> +  Optional properties:
> +  - turbo-mode: Marks the volt-freq pair as a turbo pair.
> +  - status: Marks the node enabled/disabled.

What about devices with multiple different turbo states? We have seen CPUs
that boost to different states in the x86 world, surely we will encounter
something like this in the ARM world too. Do we just mark them all as turbo
OPPs and let the driver decide what to do? If we want to keep using cpufreq-dt
for as many devices as possible, is it really sufficient to know that this is
a turbo state, without knowing the conditions required for activating the
state?

> +
> +
> +- opp@*:
> +  Operating performance point node per device. Multiple devices sharing it
> +  can use its phandle in their 'opp' property.
> +
> +  Required properties:
> +  - opp-list: phandle to the opp-list defined above.
> +
> +  Optional properties:
> +  - clocks: Tuple of clock providers
> +  - clock-names: Clock names
> +  - opp-supply: phandle to the parent supply/regulator node
> +  - voltage-tolerance: Specify the CPU voltage tolerance in percentage.

This is extremely ill-defined. It doesn't say in which direction the tolerance
is to be applied. Can you go below or above the OPP-specified voltage? For now
everyone just assumes that it has to work both ways. Also, with this binding
the tolerance is applied to all OPPs, whereas it very much depends on the
individual OPP.

If you are going to redefine OPPs anyway I would really like to see this
property die and rather have a min/max voltage per OPP. That way you can
properly express the OPP constraints. Most OPPs will likely allow a much
higher voltage than their minimal specified one, except when you go over
thermal limits with a high clock/voltage combination.

> +  - clock-latency: Specify the possible maximum transition latency for the
> +    clock, in units of nanoseconds.

Why do we need this? This is a property of the clock. We should be able to
handle this completely internally in the kernel. I don't know if the clock API
has something like this right now, but it should be a trivial addition.

Regards,
Lucas
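[Editor's note: for illustration only, a minimal sketch of how two distinct
boost levels could be written with the proposed properties, assuming every
boost frequency is simply flagged turbo-mode and the activation conditions are
left to the driver. The frequencies and voltages are invented, and the
anonymous sub-node style follows the example in the patch.]

    opp-list@0 {
        /* highest non-boost OPP */
        {
            frequency-kHz = <1000000>;
            voltage-uV = <975000>;
        };
        /* two different boost levels, distinguishable only by frequency */
        {
            frequency-kHz = <1200000>;
            voltage-uV = <1025000>;
            turbo-mode;
        };
        {
            frequency-kHz = <1300000>;
            voltage-uV = <1075000>;
            turbo-mode;
        };
    };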
Hi Lucas,

On 4 December 2014 at 17:04, Lucas Stach <l.stach@pengutronix.de> wrote:
>> +- opp-list@*:
>> +  List of nodes defining performance points. The following properties
>> +  belong to the nodes within the opp-lists.
>> +
>> +  Required properties:
>> +  - frequency-kHz: Frequency in kHz
>> +  - voltage-uV: voltage in micro Volts
>> +
>> +  Optional properties:
>> +  - turbo-mode: Marks the volt-freq pair as a turbo pair.
>> +  - status: Marks the node enabled/disabled.
>
> What about devices with multiple different turbo states? We have seen

You mean that a state may or may not be turbo at some point in time?

> CPUs that boost to different states in the x86 world, surely we will
> encounter something like this in the ARM world too. Do we just mark them
> all as turbo OPPs and let the driver decide what to do? If we want to

Maybe yes. But the good thing about the binding this time is that it is
expandable. So, if there is a future need that we can't think of today, then
we can surely do incremental changes here.

> keep using cpufreq-dt for as many devices as possible, is it really

It's not about cpufreq-dt alone. We may be using other drivers as well.

> sufficient to know that this is a turbo state, without knowing the
> conditions required for activating the state?

Can you elaborate more on this? If something is required and we know what
exactly it is, then we can put up the right binding right now as well.

>> +- opp@*:
>> +  Operating performance point node per device. Multiple devices sharing it
>> +  can use its phandle in their 'opp' property.
>> +
>> +  Required properties:
>> +  - opp-list: phandle to the opp-list defined above.
>> +
>> +  Optional properties:
>> +  - clocks: Tuple of clock providers
>> +  - clock-names: Clock names
>> +  - opp-supply: phandle to the parent supply/regulator node
>> +  - voltage-tolerance: Specify the CPU voltage tolerance in percentage.
>
> This is extremely ill-defined. It doesn't say in which direction the
> tolerance is to be applied. Can you go below or above the OPP-specified
> voltage? For now everyone just assumes that it has to work both ways.

Yes, the binding is as per today's requirements (or rather implementations).
So it is both ways. But if everybody agrees on it, we can improve it.

> Also, with this binding the tolerance is applied to all OPPs, whereas it
> very much depends on the individual OPP.

Hmm, not only this, but the same is true for clock-latency as well. We *may*
need that per OPP node sometime.

> If you are going to redefine OPPs anyway I would really like to see this
> property die and rather have a min/max voltage per OPP. That way you can

Maybe yes.

> properly express the OPP constraints. Most OPPs will likely allow a much
> higher voltage than their minimal specified one, except when you go over
> thermal limits with a high clock/voltage combination.

Yes.

>> +  - clock-latency: Specify the possible maximum transition latency for the
>> +    clock, in units of nanoseconds.
>
> Why do we need this? This is a property of the clock. We should be able to
> handle this completely internally in the kernel. I don't know if the clock
> API has something like this right now, but it should be a trivial addition.

This is not only the clock's latency, but it is somehow named this way. It
should give the time it takes to change from frequency A to frequency B, which
includes the change in supplies as well. So this probably is dvfs-latency.

This is required by cpufreq right now, but would be useful for the energy
aware scheduler as well. So, yes, this is important.

Also, it might be required to be per OPP... Probably we can use
voltage-tolerance and clock-latency at both levels, list level and OPP level,
the list level being at higher priority?

Thanks for your quick comments :)
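[Editor's note: a rough sketch of that idea, with entirely hypothetical
property placement that is not part of the posted patch — the same property
given a default at the list level and a different value in an individual OPP
node, whichever priority is finally agreed on.]

    opp-list0: opp-list@0 {
        /* hypothetical list-level default */
        clock-latency = <300000>;

        {
            frequency-kHz = <1200000>;
            voltage-uV = <1025000>;
            /* hypothetical per-OPP value for this point only */
            clock-latency = <500000>;
        };
    };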
On Thu, Dec 04, 2014 at 12:34:28PM +0100, Lucas Stach wrote:
> On Thursday, 04.12.2014, 16:44 +0530, Viresh Kumar wrote:
> > +  - voltage-tolerance: Specify the CPU voltage tolerance in percentage.

> This is extremely ill-defined. It doesn't say in which direction the
> tolerance is to be applied. Can you go below or above the OPP-specified
> voltage? For now everyone just assumes that it has to work both ways.
> Also, with this binding the tolerance is applied to all OPPs, whereas it
> very much depends on the individual OPP.

Almost all specifications for voltages are done as either min/typ/max or +/- a
target voltage.

> If you are going to redefine OPPs anyway I would really like to see this
> property die and rather have a min/max voltage per OPP. That way you can
> properly express the OPP constraints. Most OPPs will likely allow a much
> higher voltage than their minimal specified one, except when you go over
> thermal limits with a high clock/voltage combination.

If you've got a minimum and maximum you also need to specify a target;
generally it's going to be better to go for the target voltage, which may not
be the midpoint and is unlikely to be one of the bounds. I do think it's
sensible to have the option of doing both to more closely match datasheets.

> > +  - clock-latency: Specify the possible maximum transition latency for the
> > +    clock, in units of nanoseconds.

> Why do we need this? This is a property of the clock. We should be able to
> handle this completely internally in the kernel. I don't know if the clock
> API has something like this right now, but it should be a trivial addition.

Or have it be part of the clock binding at any rate.
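[Editor's note: to make that suggestion concrete, one possible per-OPP shape —
purely illustrative, neither the triplet encoding nor the property name is
part of the posted binding — would replace the single voltage plus global
tolerance with a <target min max> triplet.]

    opp-list0: opp-list@0 {
        {
            frequency-kHz = <1000000>;
            /* hypothetical: <target min max>, in micro Volts */
            voltage-uV = <975000 960000 1050000>;
        };
    };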
On 4 December 2014 at 19:37, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> This is not only the clock's latency, but it is somehow named this way. It
> should give the time it takes to change from frequency A to frequency B,
> which includes the change in supplies as well. So this probably is
> dvfs-latency.

Oops. No, this is just clock-latency. We are calculating voltage-latency
separately.
diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt
index 74499e5..5efd8d4 100644
--- a/Documentation/devicetree/bindings/power/opp.txt
+++ b/Documentation/devicetree/bindings/power/opp.txt
@@ -4,6 +4,153 @@ SoCs have a standard set of tuples consisting of frequency and
 voltage pairs that the device will support per voltage domain. These
 are called Operating Performance Points or OPPs.
 
+This document defines OPP bindings with their required/optional properties.
+OPPs can be defined for any device; this file uses the CPU device as an
+example to illustrate how to define OPPs.
+
+linux,operating-points, opp-lists and opps:
+
+- linux,operating-points:
+  Container of all OPP nodes.
+
+  Required properties:
+  - opp nodes (explained below)
+
+  Optional properties:
+  - compatible: allow OPPs to express their compatibility with devices
+
+
+- opp-list@*:
+  List of nodes defining performance points. The following properties belong
+  to the nodes within the opp-lists.
+
+  Required properties:
+  - frequency-kHz: Frequency in kHz
+  - voltage-uV: voltage in micro Volts
+
+  Optional properties:
+  - turbo-mode: Marks the volt-freq pair as a turbo pair.
+  - status: Marks the node enabled/disabled.
+
+
+- opp@*:
+  Operating performance point node per device. Multiple devices sharing it
+  can use its phandle in their 'opp' property.
+
+  Required properties:
+  - opp-list: phandle to the opp-list defined above.
+
+  Optional properties:
+  - clocks: Tuple of clock providers
+  - clock-names: Clock names
+  - opp-supply: phandle to the parent supply/regulator node
+  - voltage-tolerance: Specify the CPU voltage tolerance in percentage.
+  - clock-latency: Specify the possible maximum transition latency for the
+    clock, in units of nanoseconds.
+
+Example: Multi-cluster system with separate clock lines per cluster. All CPUs
+  within a cluster share the same clock line.
+
+/ {
+	cpus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		linux,operating-points {
+			compatible = "linux,cpufreq-dt";
+
+			opp-list0: opp-list@0 {
+				{
+					frequency-kHz = <1000000>;
+					voltage-uV = <975000>;
+					status = "okay";
+				};
+				{
+					frequency-kHz = <1100000>;
+					voltage-uV = <1000000>;
+					status = "okay";
+				};
+				{
+					frequency-kHz = <1200000>;
+					voltage-uV = <1025000>;
+					status = "okay";
+					turbo-mode;
+				};
+			};
+
+			opp-list1: opp-list@1 {
+				{
+					frequency-kHz = <1300000>;
+					voltage-uV = <1050000>;
+					status = "okay";
+				};
+				{
+					frequency-kHz = <1400000>;
+					voltage-uV = <1075000>;
+					status = "disabled";
+				};
+				{
+					frequency-kHz = <1500000>;
+					voltage-uV = <1100000>;
+					status = "okay";
+					turbo-mode;
+				};
+			};
+
+			opp0: opp@0 {
+				clocks = <&clk-controller 0>;
+				clock-names = "cpu";
+				opp-supply = <&cpu-supply0>;
+				voltage-tolerance = <2>; /* percentage */
+				clock-latency = <300000>;
+				opp-list = <&opp-list0>;
+			};
+
+			opp1: opp@1 {
+				clocks = <&clk-controller 1>;
+				clock-names = "cpu";
+				opp-supply = <&cpu-supply1>;
+				voltage-tolerance = <2>; /* percentage */
+				clock-latency = <400000>;
+				opp-list = <&opp-list1>;
+			};
+		};
+
+		cpu@0 {
+			compatible = "arm,cortex-a7";
+			reg = <0>;
+			next-level-cache = <&L2>;
+			opps = <&opp0>;
+		};
+
+		cpu@1 {
+			compatible = "arm,cortex-a7";
+			reg = <1>;
+			next-level-cache = <&L2>;
+			opps = <&opp0>;
+		};
+
+		cpu@100 {
+			compatible = "arm,cortex-a15";
+			reg = <100>;
+			next-level-cache = <&L2>;
+			opps = <&opp1>;
+		};
+
+		cpu@101 {
+			compatible = "arm,cortex-a15";
+			reg = <101>;
+			next-level-cache = <&L2>;
+			opps = <&opp1>;
+		};
+	};
+};
+
+
+
+Deprecated Bindings
+-------------------
+
 Properties:
 - operating-points: An array of 2-tuples items, and each item consists
   of frequency and voltage like <freq-kHz vol-uV>.
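[Editor's note: for reference, the deprecated binding described above places a
flat array directly in the consuming device node. A minimal sketch, with the
frequencies and voltages invented for illustration.]

	cpu@0 {
		compatible = "arm,cortex-a9";
		reg = <0>;
		operating-points = <
			/* kHz    uV */
			792000  1100000
			396000   950000
		>;
	};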
Hi Rob, et al..

Current OPP (Operating Performance Point) DT bindings have proven to be
insufficient in multiple instances.

There have been multiple band-aid approaches to get them fixed (the latest one
being: http://www.mail-archive.com/devicetree@vger.kernel.org/msg53398.html).
For obvious reasons Rob rejected them and showed the right path forward. And
this is the first try to get those down with pen and paper.

The shortcomings we are trying to solve here:

- Some kind of compatibility string to probe the right cpufreq driver for
  platforms when multiple drivers are available. For example: how to choose
  between the cpufreq-dt and arm_big_little drivers.

- Getting clock sharing information between CPUs: single shared clock vs.
  independent clock per core vs. shared clock per cluster.

- Support for turbo modes.

- Other per-OPP settings: transition latencies, disabled status, etc.?

The below document should be enough to describe how I am trying to fix these.
Please let me know what all I need to fix; surely there will be lots of
obstacles. I am prepared to get beaten up :)

I accept in advance that the naming is extremely bad here, I need some
suggestions for sure.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 Documentation/devicetree/bindings/power/opp.txt | 147 ++++++++++++++++++++++++
 1 file changed, 147 insertions(+)