Message ID | 2a2eb28da9fecf129f6bc0ab3d3748d9f4d25a29.1527225682.git.viresh.kumar@linaro.org (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
On Fri, May 25, 2018 at 11:10:01AM +0530, Viresh Kumar wrote: > The cooling device properties, like "#cooling-cells" and > "dynamic-power-coefficient", should either be present for all the CPUs > of a cluster or none. If these are present only for a subset of CPUs of > a cluster then things will start falling apart as soon as the CPUs are > brought online in a different order. For example, this will happen > because the operating system looks for such properties in the CPU node > it is trying to bring up, so that it can register a cooling device. > > Add such missing properties. This seems awkward compared to just having one cooling-cells in the /cpus node instead. What's it used for? I don't see any properties in the device nodes on meson-gxm that have any cooling-foo cells in them? So why should #cooling-cells be needed? -Olof
Hi, On 25/05/2018 23:10, Olof Johansson wrote: > On Fri, May 25, 2018 at 11:10:01AM +0530, Viresh Kumar wrote: >> The cooling device properties, like "#cooling-cells" and >> "dynamic-power-coefficient", should either be present for all the CPUs >> of a cluster or none. If these are present only for a subset of CPUs of >> a cluster then things will start falling apart as soon as the CPUs are >> brought online in a different order. For example, this will happen >> because the operating system looks for such properties in the CPU node >> it is trying to bring up, so that it can register a cooling device. >> >> Add such missing properties. > > This seems awkward compared to just having one cooling-cells in the /cpus node > instead. > > What's it used for? I don't see any properties in the device nodes on meson-gxm > that have any cooling-foo cells in them? So why should #cooling-cells be > needed? There is no reason to have the cooling-cells on these other CPUs, the DVFS is controlled on the first CPU of each cluster, here cpu0 and cpu4 and only cpu0 and cpu4 are used as cooling-cells. Neil > > > -Olof > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >
On 25-05-18, 14:10, Olof Johansson wrote: > On Fri, May 25, 2018 at 11:10:01AM +0530, Viresh Kumar wrote: > > The cooling device properties, like "#cooling-cells" and > > "dynamic-power-coefficient", should either be present for all the CPUs > > of a cluster or none. If these are present only for a subset of CPUs of > > a cluster then things will start falling apart as soon as the CPUs are > > brought online in a different order. For example, this will happen > > because the operating system looks for such properties in the CPU node > > it is trying to bring up, so that it can register a cooling device. > > > > Add such missing properties. > > This seems awkward compared to just having one cooling-cells in the /cpus node > instead. Well, we don't allow that property to be present in /cpus node right now and it is per device. And then we may not want all the CPUs to be cooling devices really. > What's it used for? I don't see any properties in the device nodes on meson-gxm > that have any cooling-foo cells in them? So why should #cooling-cells be > needed? This property is required to declare a device as a cooling-device and the device here is CPU. We use it as a cooling device by limiting its higher range of frequencies, so that it doesn't generate too much heat. It is already there for CPU0 and CPU4, but it should really be there for all the CPUs, like we have clock, supply, caches, etc.
On 26-05-18, 10:37, Neil Armstrong wrote: > Hi, > > On 25/05/2018 23:10, Olof Johansson wrote: > > On Fri, May 25, 2018 at 11:10:01AM +0530, Viresh Kumar wrote: > >> The cooling device properties, like "#cooling-cells" and > >> "dynamic-power-coefficient", should either be present for all the CPUs > >> of a cluster or none. If these are present only for a subset of CPUs of > >> a cluster then things will start falling apart as soon as the CPUs are > >> brought online in a different order. For example, this will happen > >> because the operating system looks for such properties in the CPU node > >> it is trying to bring up, so that it can register a cooling device. > >> > >> Add such missing properties. > > > > This seems awkward compared to just having one cooling-cells in the /cpus node > > instead. > > > > What's it used for? I don't see any properties in the device nodes on meson-gxm > > that have any cooling-foo cells in them? So why should #cooling-cells be > > needed? > > > There is no reason to have the cooling-cells on these other CPUs, the DVFS is > controlled on the first CPU of each cluster, here cpu0 and cpu4 and only > cpu0 and cpu4 are used as cooling-cells. First, this is an incomplete definition of the hardware as all the CPUs are cooling-devices here and DT shouldn't be written assuming how OS will interpret it. And then it is broken right now. You can offline your second cluster (4567 CPUs) and bring CPU5 up first. You will see things breaking. I have explained more in detail here. https://marc.info/?l=linux-kernel&m=152750569414761
On Mon, May 28, 2018 at 04:43:58PM +0530, Viresh Kumar wrote: > On 25-05-18, 14:10, Olof Johansson wrote: > > On Fri, May 25, 2018 at 11:10:01AM +0530, Viresh Kumar wrote: > > > The cooling device properties, like "#cooling-cells" and > > > "dynamic-power-coefficient", should either be present for all the CPUs > > > of a cluster or none. If these are present only for a subset of CPUs of > > > a cluster then things will start falling apart as soon as the CPUs are > > > brought online in a different order. For example, this will happen > > > because the operating system looks for such properties in the CPU node > > > it is trying to bring up, so that it can register a cooling device. > > > > > > Add such missing properties. > > > > This seems awkward compared to just having one cooling-cells in the /cpus node > > instead. > > Well, we don't allow that property to be present in /cpus node right > now and it is per device. And then we may not want all the CPUs to be > cooling devices really. And what I am saying is that it sounds like a broken binding if you don't allow that, especially since it'll be a super common case that all CPUs will specify the same cooling-device specifier. > > What's it used for? I don't see any properties in the device nodes on meson-gxm > > that have any cooling-foo cells in them? So why should #cooling-cells be > > needed? > > This property is required to declare a device as a cooling-device and > the device here is CPU. We use it as a cooling device by limiting its > higher range of frequencies, so that it doesn't generate too much > heat. > > It is already there for CPU0 and CPU4, but it should really be there > for all the CPUs, like we have clock, supply, caches, etc. You have #cooling-cells in the cpu node, but the actual data is in the thermal-zones nodes. Why isn't #cooling-cells under thermal-zones, next to cooling-maps? -Olof
On 02-06-18, 01:14, Olof Johansson wrote: > And what I am saying is that it sounds like a broken binding if you don't allow > that, especially since it'll be a super common case that all CPUs will specify > the same cooling-device specifier. I am fine with allowing the #cooling-cells property in the cpus node if the DT maintainers are fine with it. @Rob: comments ? @Olof: What about other properties which are still going to be duplicated for the most common cases today, like: clocks, supply information, cache information, cpu-idle-states and others. When we can duplicate these properties, why not keep following the same for #cpu-cooling property ? Note that the OPP table doesn't really need to get duplicated (for new platforms) as the platforms use the v2 bindings now which just duplicates a phandle assignment for all CPUs. Its a mess with older platforms which use the earlier version of OPP table. > > This property is required to declare a device as a cooling-device and > > the device here is CPU. We use it as a cooling device by limiting its > > higher range of frequencies, so that it doesn't generate too much > > heat. > > > > It is already there for CPU0 and CPU4, but it should really be there > > for all the CPUs, like we have clock, supply, caches, etc. > > You have #cooling-cells in the cpu node, but the actual data is in the > thermal-zones nodes. Why isn't #cooling-cells under thermal-zones, next to > cooling-maps? Actually I thought about that when I worked on these patches initially and this is why I felt convinced that the CPU nodes are the right place for this. We add #interrupt-cells to an Interrupt controller's DT node, #gpio-cells to a GPIO controller's DT node, #clock-cells to a clock controller's DT node and that's exactly why we should (and we do) add #cooling-cells property to a cooling device's DT node. This information is used in two ways, first it enables the OS to know that the device is capable of being a cooling device and second it tells us how many arguments will be required with a phandle of this device. And so the cooling-maps always contain two arguments with the cooling device's phandle (which is mostly a CPU or a gpio fan) as the #cooling-cells currently is fixed to 2. And so I am not really sure if thermal-zones is the right place to define this thing. Is my understanding correct ?
diff --git a/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts b/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts index 0868da476e41..313f88f8759e 100644 --- a/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts +++ b/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts @@ -209,10 +209,34 @@ #cooling-cells = <2>; }; +&cpu1 { + #cooling-cells = <2>; +}; + +&cpu2 { + #cooling-cells = <2>; +}; + +&cpu3 { + #cooling-cells = <2>; +}; + &cpu4 { #cooling-cells = <2>; }; +&cpu5 { + #cooling-cells = <2>; +}; + +&cpu6 { + #cooling-cells = <2>; +}; + +&cpu7 { + #cooling-cells = <2>; +}; + ðmac { pinctrl-0 = <ð_pins>; pinctrl-names = "default";
The cooling device properties, like "#cooling-cells" and "dynamic-power-coefficient", should either be present for all the CPUs of a cluster or none. If these are present only for a subset of CPUs of a cluster then things will start falling apart as soon as the CPUs are brought online in a different order. For example, this will happen because the operating system looks for such properties in the CPU node it is trying to bring up, so that it can register a cooling device. Add such missing properties. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> --- .../boot/dts/amlogic/meson-gxm-khadas-vim2.dts | 24 ++++++++++++++++++++++ 1 file changed, 24 insertions(+)