Message ID | 1377299755-5134-3-git-send-email-eduardo.valentin@ti.com (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Zhang Rui |
Headers | show |
Hi, On Sat, Aug 24, 2013 at 12:15:43AM +0100, Eduardo Valentin wrote: > In order to be able to build thermal policies > based on generic sensors, like I2C device, that > can be places in different points on different boards, > there is a need to have a way to feed board dependent > data into the thermal framework. > > This patch introduces a thermal data parser for device > tree. The parsed data is used to build thermal zones > and thermal binding parameters. The output data > can then be used to deploy thermal policies. > > This patch adds also documentation regarding this > API and how to define tree nodes to use > this infrastructure. > > Cc: Zhang Rui <rui.zhang@intel.com> > Cc: linux-pm@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com> > --- > .../devicetree/bindings/thermal/thermal.txt | 160 +++++++ > drivers/thermal/Kconfig | 13 + > drivers/thermal/Makefile | 1 + > drivers/thermal/thermal_dt.c | 473 +++++++++++++++++++++ > include/dt-bindings/thermal/thermal.h | 27 ++ > include/linux/thermal.h | 3 + > 6 files changed, 677 insertions(+) > create mode 100644 Documentation/devicetree/bindings/thermal/thermal.txt > create mode 100644 drivers/thermal/thermal_dt.c > create mode 100644 include/dt-bindings/thermal/thermal.h > > diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt > new file mode 100644 > index 0000000..af20ab0 > --- /dev/null > +++ b/Documentation/devicetree/bindings/thermal/thermal.txt > @@ -0,0 +1,160 @@ > +---------------------------------------- > +Thermal Framework Device Tree descriptor > +---------------------------------------- > + > +This file describes how to define a thermal structure using device tree. > +A thermal structure includes thermal zones and their components, such > +as name, trip points, polling intervals and cooling devices binding > +descriptors. A binding descriptor may contain information on how to > +react, with a cooling stepped action or a weight on a fair share. > +The target of device tree thermal descriptors is to describe only > +the hardware thermal aspects, not how the system must control or which > +algorithm or policy must take place. It follows a description of each > +type of device tree node. > + > +**** > +trip > +**** > + > +The trip node is a node to describe a point in the temperature domain > +in which the system takes an action. This node describes just the point, > +not the action. > + > +A node describing a trip must contain: > +- temperature: the trip temperature level, in milliCelsius. > +- hysteresis: a (low) hysteresis value on 'temperature'. This is a relative > +value, in milliCelsius. Presumably this is a single (u32) cell? > +- type: the trip type. Here is the type mapping: > + THERMAL_TRIP_ACTIVE > + THERMAL_TRIP_PASSIVE > + THERMAL_TRIP_HOT > + THERMAL_TRIP_CRITICAL What type is this property? What are these values? Do you need to refer to a header somewhere? Symbolic constants should have an associated value for an ABI. > + > +********** > +bind_param > +********** > + > +The bind_param node is a node to describe how cooling devices get assigned > +to trip points of the zone. The cooling devices are expected to be loaded > +in the target system. > + > +A node describing a bind_param must contain: > +- cooling_device: A string with the cooling device name. Please use '-' rather than '_' in bindings. That's the normal devicetree convention, and there's no point gonig against it and causing more confusion. What are valid values of this string, and what do they mean? Why not a phandle + cells binding to cooling devices? > +- weight: The 'influence' of a particular cooling device on this zone. > + This is on a percentage scale. The sum of all these weights > + (for a particular zone) cannot exceed 100. This is a bit fuzzy, and certainly needs more guidance. > +- trip_mask: This is a bit mask that gives the binding relation between > + this thermal zone and cdev, for a particular trip point. > + If nth bit is set, then the cdev and thermal zone are bound > + for trip point n. What type is this? The nth bit from which end? Could you explain what "bound for trip point n" means? Just because this is a bitmask in Linux does not mean it needs to be so in the dt. > +- limits: An array integers consisting of 2-tuples items, and each item means > + lower and upper state limits, like <min-state max-state>. > + > +************************** > +temperature sensor devices > +************************** > + > +Temperature sensor devices have to be linked to a specific thermal zone. > +This is done by means of the 'monitored-zones' list. > +- monitored-zones: A list of thermal zones phandles, identifying which > +thermal zones the temperature sensor device is used to monitor. Why does this need to be described that way around? Can the zone not have a link to the sensor(s) it needs? > + > +************ > +thermal_zone > +************ > + > +The thermal_zone node is the node containing all the required info > +for describing a thermal zone, including its cdev bindings. The thermal_zone > +node must contain, apart from its own properties, one node containing > +trip nodes and one node containing all the zone bind parameters. > + > +Required properties: > +- mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable. What type is a bit string in devicetree? I'm aware of cells and (byte) strings, but not bit strings. What does 'writeable' mean in this context? There's presumably a better name than 'mask'. > + > +- passive_delay: number of milliseconds to wait between polls when > + performing passive cooling. > + > +- polling_delay: number of milliseconds to wait between polls when checking. These sound very much like configuration of the driver rather than hardware description. What if my temperature sensor can generate an interrupt at some trigger temperature? Why would I need polling times? Why does this need to be in the dt at all? Surely the OS can choose some sensible polling period, possibly dynamically? > + > +- #thermal-cells: An integer that is used to specify how many cells compose > + sensor ids. I don't see why this is necessary. If the zone itself described its related sensors rather than the other way around, we wouldn't need it at all. Most bindings so far have consumers link to their providers rather than the other way around. > + > +Below is an example: > +cpu_thermal: thermal_zone { > + #thermal-cells = <1>; > + mask = <0x03>; /* trips writability */ > + passive_delay = <250>; /* milliseconds */ > + polling_delay = <1000>; /* milliseconds */ > + trips { > + alert@100000{ > + temperature = <100000>; /* milliCelsius */ > + hysteresis = <0>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + crit@125000{ > + temperature = <125000>; /* milliCelsius */ > + hysteresis = <0>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + bind_params { > + action@0{ > + cooling_device = "thermal-cpufreq"; NAK. That value was not defined anywhere, and represents Linux infrastructure rather than hardware. For this case, we could just as easily describe the thermal points, and if no physical cooling device (e.g. a fan) is present, assume cpufreq-base cooling. Other OSs could choose to do otherwise, and we could change later... > + weight = <100>; /* percentage */ > + mask = <0x01>; This should presumably be trip-mask (assuming my suggested corrections). What is it supposed to mean here? > + limits = < > + /* lower-state upper-state */ > + 1 THERMAL_NO_LIMITS > + THERMAL_NO_LIMITS THERMAL_NO_LIMITS > + >; > + }; > + }; > +}; > + > +sensor: adc@0xFFFF { > + ... > + monitored-zones = <&cpu_thermal 0>; > +}; > + > +cpu@0 { > + ... > + cooling-zones = <&cpu_thermal>; > +}; > + > +In the example above, the ADC sensor at address 0xFFFF is used to monitor > +the zone 'cpu_thermal' using its the sensor 0. The cpu@0 device is also linked > +to the same thermal zone 'cpu_thermal' as a cooling device. Huh? That seems to imply the the '0' in '<&cpu_thermal 0>' on the sensor node describes a property of the sensor rather than the thermal node, unless I've misunderstood? > + > +*************** > +cpufreq-cooling > +*************** > + > +The generic cpu cooling (freq clipping) can be loaded by the > +generic cpufreq-cpu0 driver in case the device tree node informs > +this need. > + > +In order to load the cpufreq cooling device, it is possible to > +inform that the cpu needs cooling by adding the list of thermal zones > +in the 'cooling-zones' property at the cpu0 phandle. This should be reworded so as to describe the binding rather than the Linux behaviour based on the binding. > + > +Example: > + cpus { > + cpu@0 { > + operating-points = < > + /* kHz uV */ > + 800000 1313000 > + 1008000 1375000 > + >; > + cooling-zones = <&cpu_thermal>; > + }; > + ... > + }; > + > +The above will cause the cpufreq-cpu0 driver to understand that > +the cpu will need passive cooling and while probing that node it will > +also load the cpufreq cpu cooling device in that system. The above describe that the CPU is part of a cooling (thermal?) zone and requires cooling. > + > +However, be advised that this flag will not describe what needs to > +be done or how the cpufreq cooling device will be used. In order to > +accomplish this, one needs to write a phandle for a thermal zone, as > +described in the section 'thermal_zone'. The above already has a phandle to a thermal (cooling?) zone. Please choose either cooling zone or thermal zone, having the two names makes this more confusing than necessary. > diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig > index 7fb16bc..753f0dc 100644 > --- a/drivers/thermal/Kconfig > +++ b/drivers/thermal/Kconfig > @@ -29,6 +29,19 @@ config THERMAL_HWMON > Say 'Y' here if you want all thermal sensors to > have hwmon sysfs interface too. > > +config THERMAL_OF > + bool > + prompt "APIs to parse thermal data out of device tree" > + depends on OF > + default y > + help > + This options provides helpers to add the support to > + read and parse thermal data definitions out of the > + device tree blob. > + > + Say 'Y' here if you need to build thermal infrastructure > + based on device tree. > + > choice > prompt "Default Thermal governor" > default THERMAL_DEFAULT_GOV_STEP_WISE > diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile > index 24cb894..eedb273 100644 > --- a/drivers/thermal/Makefile > +++ b/drivers/thermal/Makefile > @@ -7,6 +7,7 @@ thermal_sys-y += thermal_core.o > > # interface to/from other layers providing sensors > thermal_sys-$(CONFIG_THERMAL_HWMON) += thermal_hwmon.o > +thermal_sys-$(CONFIG_THERMAL_OF) += thermal_dt.o > > # governors > thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o > diff --git a/drivers/thermal/thermal_dt.c b/drivers/thermal/thermal_dt.c > new file mode 100644 > index 0000000..cc35485 > --- /dev/null > +++ b/drivers/thermal/thermal_dt.c > @@ -0,0 +1,473 @@ > +/* > + * thermal_dt.c - Generic Thermal Management device tree support. > + * > + * Copyright (C) 2013 Texas Instruments > + * Copyright (C) 2013 Eduardo Valentin <eduardo.valentin@ti.com> > + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; version 2 of the License. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + * General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License along > + * with this program; if not, write to the Free Software Foundation, Inc., > + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. > + * > + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + */ > +#include <linux/thermal.h> > +#include <linux/slab.h> > +#include <linux/types.h> > +#include <linux/of_device.h> > +#include <linux/of_platform.h> > +#include <linux/err.h> > +#include <linux/export.h> > +#include <linux/string.h> > + > +struct __thermal_bind_params { > + char cooling_device[THERMAL_NAME_LENGTH]; > +}; > + > +static > +int thermal_of_match(struct thermal_zone_device *tz, > + struct thermal_cooling_device *cdev); > + > +static int thermal_of_populate_bind_params(struct device *dev, > + struct device_node *node, > + struct __thermal_bind_params *__tbp, > + struct thermal_bind_params *tbp, > + int ntrips) > +{ > + const char *cdev_name; > + u32 *limits; > + int ret; > + u32 prop; > + > + ret = of_property_read_u32(node, "weight", &prop); > + if (ret < 0) { > + dev_err(dev, "missing weight property\n"); > + return ret; > + } > + tbp->weight = prop; > + > + ret = of_property_read_u32(node, "mask", &prop); > + if (ret < 0) { > + dev_err(dev, "missing mask property\n"); > + return ret; > + } > + tbp->trip_mask = prop; > + > + /* this will allow us to bind with cooling devices */ > + tbp->match = thermal_of_match; > + > + ret = of_property_read_string(node, "cooling_device", &cdev_name); > + if (ret < 0) { > + dev_err(dev, "missing cooling_device property\n"); > + return ret; > + } > + strncpy(__tbp->cooling_device, cdev_name, > + sizeof(__tbp->cooling_device)); > + > + limits = kzalloc(ntrips * sizeof(*limits) * 2, GFP_KERNEL); Why do you use kzalloc here, but devm_kzalloc elsewhere? That'll lead to problems as the driver gets refactored over time. > + if (!limits) { > + dev_err(dev, "not enough mem for reading limits\n"); > + return -ENOMEM; > + } > + ret = of_property_read_u32_array(node, "limits", limits, 2 * ntrips); > + if (!ret) { > + int i = ntrips; > + > + tbp->binding_limits = devm_kzalloc(dev, > + ntrips * 2 * > + sizeof(*tbp->binding_limits), > + GFP_KERNEL); > + if (!tbp->binding_limits) { > + dev_err(dev, "not enough mem for storing limits\n"); > + return -ENOMEM; > + } > + while (i--) > + tbp->binding_limits[i] = limits[i]; > + } > + kfree(limits); > + > + return 0; What if you failed to read the array? Surely you should return an error? > +} > + > +struct __thermal_trip { > + unsigned long int temperature; > + unsigned long int hysteresis; > + enum thermal_trip_type type; > +}; > + > +static > +int thermal_of_populate_trip(struct device *dev, > + struct device_node *node, > + struct __thermal_trip *trip) > +{ > + int ret; > + int prop; > + > + ret = of_property_read_u32(node, "temperature", &prop); > + if (ret < 0) { > + dev_err(dev, "missing temperature property\n"); > + return ret; > + } > + trip->temperature = prop; > + > + ret = of_property_read_u32(node, "hysteresis", &prop); > + if (ret < 0) { > + dev_err(dev, "missing hysteresis property\n"); > + return ret; > + } > + trip->hysteresis = prop; > + > + ret = of_property_read_u32(node, "type", &prop); > + if (ret < 0) { > + dev_err(dev, "missing type property\n"); > + return ret; > + } > + trip->type = prop; This will require sanity checking. > + > + return 0; > +} > + > +struct __thermal_zone_device { > + enum thermal_device_mode mode; > + int passive_delay; > + int polling_delay; > + int mask; > + int ntrips; > + char type[THERMAL_NAME_LENGTH]; > + struct __thermal_trip *trips; > + struct __thermal_bind_params *bind_params; > + struct thermal_bind_params *tbps; > + struct thermal_zone_params zone_params; > + int (*get_temp)(void *, unsigned long *); > + void *devdata; > +}; > + > +static struct __thermal_zone_device * > +thermal_of_build_thermal_zone(struct device *dev, struct device_node *node) > +{ > + struct device_node *child, *gchild; > + struct __thermal_zone_device *tz; > + int ret, i; > + u32 prop; > + > + if (!node) { > + dev_err(dev, "no thermal zone node\n"); > + return ERR_PTR(-EINVAL); > + } > + > + tz = devm_kzalloc(dev, sizeof(*tz), GFP_KERNEL); > + if (!tz) { > + dev_err(dev, "not enough memory for thermal of zone\n"); > + return ERR_PTR(-ENOMEM); > + } > + > + ret = of_property_read_u32(node, "passive_delay", &prop); > + if (ret < 0) { > + dev_err(dev, "missing passive_delay property\n"); > + return ERR_PTR(ret); > + } > + tz->passive_delay = prop; > + > + ret = of_property_read_u32(node, "polling_delay", &prop); > + if (ret < 0) { > + dev_err(dev, "missing polling_delay property\n"); > + return ERR_PTR(ret); > + } > + tz->polling_delay = prop; > + > + ret = of_property_read_u32(node, "mask", &prop); > + if (ret < 0) { > + dev_err(dev, "missing mask property\n"); > + return ERR_PTR(ret); > + } > + tz->mask = prop; > + > + /* assume node name as our zone type */ > + strncpy(tz->type, node->name, sizeof(tz->type)); > + > + /* default policy */ > + strncpy(tz->zone_params.governor_name, "step_wise", > + sizeof(tz->zone_params.governor_name)); > + > + /* trips */ > + child = of_find_node_by_name(node, "trips"); That might not be a child node, as of_find_node_by_name walks over the list of all nodes. Use of_get_child_by_name. > + > + /* No trips provided */ > + if (!child) > + goto finish; > + > + tz->ntrips = of_get_child_count(child); > + tz->trips = devm_kzalloc(dev, tz->ntrips * sizeof(*tz->trips), > + GFP_KERNEL); > + if (!tz->trips) > + return ERR_PTR(-ENOMEM); > + i = 0; > + for_each_child_of_node(child, gchild) > + thermal_of_populate_trip(dev, gchild, &tz->trips[i++]); What if a child node was not a valid trip node? It seems thermal_of_populate_trip returns errors you're ignoring. > + > + /* bind_params */ > + child = of_find_node_by_name(node, "bind_params"); Another of_get_child_by_name candidate. > + > + /* No binding info */ > + if (!child) > + goto finish; > + > + tz->zone_params.num_tbps = of_get_child_count(child); > + tz->bind_params = devm_kzalloc(dev, > + tz->zone_params.num_tbps * > + sizeof(*tz->bind_params), > + GFP_KERNEL); > + if (!tz->bind_params) > + return ERR_PTR(-ENOMEM); > + tz->zone_params.tbp = devm_kzalloc(dev, > + tz->zone_params.num_tbps * > + sizeof(*tz->zone_params.tbp), > + GFP_KERNEL); > + if (!tz->zone_params.tbp) > + return ERR_PTR(-ENOMEM); > + i = 0; > + for_each_child_of_node(child, gchild) { > + thermal_of_populate_bind_params(dev, gchild, > + &tz->bind_params[i], > + &tz->zone_params.tbp[i], > + tz->ntrips); Similarly, you're ignoring error conditions here. What if a bind_params node was malformed? > + i++; > + } > + > +finish: > + tz->mode = THERMAL_DEVICE_ENABLED; > + > + return tz; > +} > + > +static > +int thermal_of_match(struct thermal_zone_device *tz, > + struct thermal_cooling_device *cdev) > +{ > + struct __thermal_zone_device *data = tz->devdata; > + int i; > + > + for (i = 0; i < data->zone_params.num_tbps; i++) { > + if (!strncmp(data->bind_params[i].cooling_device, > + cdev->type, > + strlen(data->bind_params[i].cooling_device)) && > + (data->zone_params.tbp[i].trip_mask & (1 << i))) > + return 0; > + } > + > + return -EINVAL; > +} I'm not keen on binding to cooling devices using a string, as suddenly a name to allow humans to inspect the system is turned into an ABI. > + > +static > +int of_thermal_get_temp(struct thermal_zone_device *tz, > + unsigned long *temp) > +{ > + struct __thermal_zone_device *data = tz->devdata; > + > + return data->get_temp(data->devdata, temp); > +} > + > +static > +int of_thermal_get_mode(struct thermal_zone_device *tz, > + enum thermal_device_mode *mode) > +{ > + struct __thermal_zone_device *data = tz->devdata; > + > + *mode = data->mode; > + > + return 0; > +} > + > +static > +int of_thermal_set_mode(struct thermal_zone_device *tz, > + enum thermal_device_mode mode) > +{ > + struct __thermal_zone_device *data = tz->devdata; > + > + mutex_lock(&tz->lock); > + > + if (mode == THERMAL_DEVICE_ENABLED) > + tz->polling_delay = data->polling_delay; > + else > + tz->polling_delay = 0; > + > + mutex_unlock(&tz->lock); > + > + data->mode = mode; > + thermal_zone_device_update(tz); > + > + return 0; > +} > + > +static > +int of_thermal_get_trip_type(struct thermal_zone_device *tz, int trip, > + enum thermal_trip_type *type) > +{ > + struct __thermal_zone_device *data = tz->devdata; > + > + if (trip >= data->ntrips || trip < 0) > + return -EDOM; It's far more common to use -EINVAL. > + > + *type = data->trips[trip].type; > + > + return 0; > +} > + > +static > +int of_thermal_get_trip_temp(struct thermal_zone_device *tz, int trip, > + unsigned long *temp) > +{ > + struct __thermal_zone_device *data = tz->devdata; > + > + if (trip >= data->ntrips || trip < 0) > + return -EDOM; > + > + *temp = data->trips[trip].temperature; > + > + return 0; > +} > + > +static > +int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip, > + unsigned long temp) > +{ > + struct __thermal_zone_device *data = tz->devdata; > + > + if (trip >= data->ntrips || trip < 0) > + return -EDOM; > + > + /* thermal fw should take care of data->mask & (1 << trip) */ > + data->trips[trip].temperature = temp; > + > + return 0; > +} > + > +static > +int of_thermal_get_trip_hyst(struct thermal_zone_device *tz, int trip, > + unsigned long *hyst) > +{ > + struct __thermal_zone_device *data = tz->devdata; > + > + if (trip >= data->ntrips || trip < 0) > + return -EDOM; > + > + *hyst = data->trips[trip].hysteresis; > + > + return 0; > +} > + > +static > +int of_thermal_set_trip_hyst(struct thermal_zone_device *tz, int trip, > + unsigned long hyst) > +{ > + struct __thermal_zone_device *data = tz->devdata; > + > + if (trip >= data->ntrips || trip < 0) > + return -EDOM; > + > + /* thermal fw should take care of data->mask & (1 << trip) */ Could you elaborate on that comment? > + data->trips[trip].hysteresis = hyst; > + > + return 0; > +} [...] Thanks, Mark. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello Mark, First of all, thanks for taking the time to review in such level of detail. Lets try to align and have a common understanding. Answers go inline. Please let me know if I missed something or if I made it more confusing :-) On 27-08-2013 06:22, Mark Rutland wrote: > Hi, > > On Sat, Aug 24, 2013 at 12:15:43AM +0100, Eduardo Valentin wrote: >> In order to be able to build thermal policies >> based on generic sensors, like I2C device, that >> can be places in different points on different boards, >> there is a need to have a way to feed board dependent >> data into the thermal framework. >> >> This patch introduces a thermal data parser for device >> tree. The parsed data is used to build thermal zones >> and thermal binding parameters. The output data >> can then be used to deploy thermal policies. >> >> This patch adds also documentation regarding this >> API and how to define tree nodes to use >> this infrastructure. >> >> Cc: Zhang Rui <rui.zhang@intel.com> >> Cc: linux-pm@vger.kernel.org >> Cc: linux-kernel@vger.kernel.org >> Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com> >> --- >> .../devicetree/bindings/thermal/thermal.txt | 160 +++++++ >> drivers/thermal/Kconfig | 13 + >> drivers/thermal/Makefile | 1 + >> drivers/thermal/thermal_dt.c | 473 +++++++++++++++++++++ >> include/dt-bindings/thermal/thermal.h | 27 ++ >> include/linux/thermal.h | 3 + >> 6 files changed, 677 insertions(+) >> create mode 100644 Documentation/devicetree/bindings/thermal/thermal.txt >> create mode 100644 drivers/thermal/thermal_dt.c >> create mode 100644 include/dt-bindings/thermal/thermal.h >> General question on device tree documentation. Assuming there is such an effort to make it Linux independent, is there other places out of Linux tree that these binding must be documented? >> diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt >> new file mode 100644 >> index 0000000..af20ab0 >> --- /dev/null >> +++ b/Documentation/devicetree/bindings/thermal/thermal.txt >> @@ -0,0 +1,160 @@ >> +---------------------------------------- >> +Thermal Framework Device Tree descriptor >> +---------------------------------------- >> + >> +This file describes how to define a thermal structure using device tree. >> +A thermal structure includes thermal zones and their components, such >> +as name, trip points, polling intervals and cooling devices binding >> +descriptors. A binding descriptor may contain information on how to >> +react, with a cooling stepped action or a weight on a fair share. >> +The target of device tree thermal descriptors is to describe only >> +the hardware thermal aspects, not how the system must control or which >> +algorithm or policy must take place. It follows a description of each >> +type of device tree node. >> + >> +**** >> +trip >> +**** >> + >> +The trip node is a node to describe a point in the temperature domain >> +in which the system takes an action. This node describes just the point, >> +not the action. >> + >> +A node describing a trip must contain: >> +- temperature: the trip temperature level, in milliCelsius. >> +- hysteresis: a (low) hysteresis value on 'temperature'. This is a relative >> +value, in milliCelsius. > > Presumably this is a single (u32) cell? Yes, both of them above are a single u32. > >> +- type: the trip type. Here is the type mapping: >> + THERMAL_TRIP_ACTIVE >> + THERMAL_TRIP_PASSIVE >> + THERMAL_TRIP_HOT >> + THERMAL_TRIP_CRITICAL > > What type is this property? > > What are these values? Do you need to refer to a header somewhere? There is a header, introduced in this patch. include/dt-bindings/thermal/thermal.h > Symbolic constants should have an associated value for an ABI. OK. Sounds reasonable to me. Shall we be descriptive and enlist the values in this doc too or just pointing to header is fine? > >> + >> +********** >> +bind_param >> +********** >> + >> +The bind_param node is a node to describe how cooling devices get assigned >> +to trip points of the zone. The cooling devices are expected to be loaded >> +in the target system. >> + >> +A node describing a bind_param must contain: >> +- cooling_device: A string with the cooling device name. > > Please use '-' rather than '_' in bindings. That's the normal devicetree > convention, and there's no point gonig against it and causing more > confusion. Agreed. > > What are valid values of this string, and what do they mean? These are the name of the cooling devices to match to. > > Why not a phandle + cells binding to cooling devices? > Yeah, this could work too. I am going to check how that would look like and come back to you. >> +- weight: The 'influence' of a particular cooling device on this zone. >> + This is on a percentage scale. The sum of all these weights >> + (for a particular zone) cannot exceed 100. > > This is a bit fuzzy, and certainly needs more guidance. OK. This property is used to describe the amount of cooling contribution when activating a cooling device at a certain trip point. I will work on a better guidance. > >> +- trip_mask: This is a bit mask that gives the binding relation between >> + this thermal zone and cdev, for a particular trip point. >> + If nth bit is set, then the cdev and thermal zone are bound >> + for trip point n. > > What type is this? > > The nth bit from which end? > > Could you explain what "bound for trip point n" means? > It is just a way to match a cooling device with a set of trip points. Say you want to match one specific cooling device with trips 0 and 1, you would then create a bind_param with trip_mask = 0x3. > Just because this is a bitmask in Linux does not mean it needs to be so > in the dt. Indeed. In case we work with phandle + cell binging parameters, the trip and weight could be passed as params. cooling-devices = <&devX tripY weightZ>, <... >; > >> +- limits: An array integers consisting of 2-tuples items, and each item means >> + lower and upper state limits, like <min-state max-state>. >> + >> +************************** >> +temperature sensor devices >> +************************** >> + >> +Temperature sensor devices have to be linked to a specific thermal zone. >> +This is done by means of the 'monitored-zones' list. >> +- monitored-zones: A list of thermal zones phandles, identifying which >> +thermal zones the temperature sensor device is used to monitor. > > Why does this need to be described that way around? > > Can the zone not have a link to the sensor(s) it needs? > Do you see any issues with it? Idea is that sensor nodes would list which zones they are used. Same idea goes to cooling devices. It can be the other way around, yes. It depends on what is the best way to describe these relations. Idea would be so that sensors and cooling devices would be composing a thermal zone. >> + >> +************ >> +thermal_zone >> +************ >> + >> +The thermal_zone node is the node containing all the required info >> +for describing a thermal zone, including its cdev bindings. The thermal_zone >> +node must contain, apart from its own properties, one node containing >> +trip nodes and one node containing all the zone bind parameters. >> + >> +Required properties: >> +- mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable. > > What type is a bit string in devicetree? I'm aware of cells and (byte) > strings, but not bit strings. > > What does 'writeable' mean in this context? > > There's presumably a better name than 'mask'. > This is used to describe if trip points are writeable under sysfs. I think this is a OS implementation leak. I will be removing this from device tree and making them RO by default. >> + >> +- passive_delay: number of milliseconds to wait between polls when >> + performing passive cooling. >> + >> +- polling_delay: number of milliseconds to wait between polls when checking. > > These sound very much like configuration of the driver rather than > hardware description. What if my temperature sensor can generate an > interrupt at some trigger temperature? Why would I need polling times? > > Why does this need to be in the dt at all? Surely the OS can choose some > sensible polling period, possibly dynamically? Well, let me answer this one with another question. How the OS will determine how fast is the temperature rise on a thermal zone described in device tree? > >> + >> +- #thermal-cells: An integer that is used to specify how many cells compose >> + sensor ids. > > I don't see why this is necessary. If the zone itself described its > related sensors rather than the other way around, we wouldn't need it at > all. Most bindings so far have consumers link to their providers rather > than the other way around. > I see. >> + >> +Below is an example: >> +cpu_thermal: thermal_zone { >> + #thermal-cells = <1>; >> + mask = <0x03>; /* trips writability */ >> + passive_delay = <250>; /* milliseconds */ >> + polling_delay = <1000>; /* milliseconds */ >> + trips { >> + alert@100000{ >> + temperature = <100000>; /* milliCelsius */ >> + hysteresis = <0>; /* milliCelsius */ >> + type = <THERMAL_TRIP_PASSIVE>; >> + }; >> + crit@125000{ >> + temperature = <125000>; /* milliCelsius */ >> + hysteresis = <0>; /* milliCelsius */ >> + type = <THERMAL_TRIP_CRITICAL>; >> + }; >> + }; >> + bind_params { >> + action@0{ >> + cooling_device = "thermal-cpufreq"; > > NAK. That value was not defined anywhere, and represents Linux > infrastructure rather than hardware. > Yes, this was another leak as discussed above in the upper part of the doc file. > For this case, we could just as easily describe the thermal points, and > if no physical cooling device (e.g. a fan) is present, assume > cpufreq-base cooling. Other OSs could choose to do otherwise, and we > could change later... > The way other specifications do (aka ACPI) is to have trip types. In case a trip type is passive cooling, it means one is supposed to modulate the dissipated power rather than activating a device to remove heat (active cooling). The part of 'assuming' things that bothers me. I would rather have a way to really say one wants to have passive cooling at a specific trip point. >> + weight = <100>; /* percentage */ >> + mask = <0x01>; > > This should presumably be trip-mask (assuming my suggested corrections). > > What is it supposed to mean here? As stated above, it means the trip points that this binding represents. > >> + limits = < >> + /* lower-state upper-state */ >> + 1 THERMAL_NO_LIMITS >> + THERMAL_NO_LIMITS THERMAL_NO_LIMITS >> + >; >> + }; >> + }; >> +}; >> + >> +sensor: adc@0xFFFF { >> + ... >> + monitored-zones = <&cpu_thermal 0>; >> +}; >> + >> +cpu@0 { >> + ... >> + cooling-zones = <&cpu_thermal>; >> +}; >> + >> +In the example above, the ADC sensor at address 0xFFFF is used to monitor >> +the zone 'cpu_thermal' using its the sensor 0. The cpu@0 device is also linked >> +to the same thermal zone 'cpu_thermal' as a cooling device. > > Huh? That seems to imply the the '0' in '<&cpu_thermal 0>' on the sensor > node describes a property of the sensor rather than the thermal node, > unless I've misunderstood? > It is essentially which sensor id is used to monitor a specific zone. In case of sensor IPs that feature several internal sensors, one needs to have a way to describe which sensor monitors which zone. Example of this type of IPs: lm90 (up to 3 sensors) and TI bandgap IP (can contain up to 5 internal sensors). The probe happens for the IP device and not for the internal sensors. >> + >> +*************** >> +cpufreq-cooling >> +*************** >> + >> +The generic cpu cooling (freq clipping) can be loaded by the >> +generic cpufreq-cpu0 driver in case the device tree node informs >> +this need. >> + >> +In order to load the cpufreq cooling device, it is possible to >> +inform that the cpu needs cooling by adding the list of thermal zones >> +in the 'cooling-zones' property at the cpu0 phandle. > > This should be reworded so as to describe the binding rather than the > Linux behaviour based on the binding. OK. > >> + >> +Example: >> + cpus { >> + cpu@0 { >> + operating-points = < >> + /* kHz uV */ >> + 800000 1313000 >> + 1008000 1375000 >> + >; >> + cooling-zones = <&cpu_thermal>; >> + }; >> + ... >> + }; >> + >> +The above will cause the cpufreq-cpu0 driver to understand that >> +the cpu will need passive cooling and while probing that node it will >> +also load the cpufreq cpu cooling device in that system. > > The above describe that the CPU is part of a cooling (thermal?) zone and > requires cooling. That says the cpu participates in a thermal zone as a cooling device/mechanism, thus cpu cooling (passive cooling) is required in target system. > >> + >> +However, be advised that this flag will not describe what needs to >> +be done or how the cpufreq cooling device will be used. In order to >> +accomplish this, one needs to write a phandle for a thermal zone, as >> +described in the section 'thermal_zone'. > > The above already has a phandle to a thermal (cooling?) zone. > > Please choose either cooling zone or thermal zone, having the two names > makes this more confusing than necessary. I see. Idea is to have thermal zone as a node. cooling-zone is just a list of thermal zones that the referred device will be used as cooling device. > >> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig >> index 7fb16bc..753f0dc 100644 >> --- a/drivers/thermal/Kconfig >> +++ b/drivers/thermal/Kconfig >> @@ -29,6 +29,19 @@ config THERMAL_HWMON >> Say 'Y' here if you want all thermal sensors to >> have hwmon sysfs interface too. >> >> +config THERMAL_OF >> + bool >> + prompt "APIs to parse thermal data out of device tree" >> + depends on OF >> + default y >> + help >> + This options provides helpers to add the support to >> + read and parse thermal data definitions out of the >> + device tree blob. >> + >> + Say 'Y' here if you need to build thermal infrastructure >> + based on device tree. >> + >> choice >> prompt "Default Thermal governor" >> default THERMAL_DEFAULT_GOV_STEP_WISE >> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile >> index 24cb894..eedb273 100644 >> --- a/drivers/thermal/Makefile >> +++ b/drivers/thermal/Makefile >> @@ -7,6 +7,7 @@ thermal_sys-y += thermal_core.o >> >> # interface to/from other layers providing sensors >> thermal_sys-$(CONFIG_THERMAL_HWMON) += thermal_hwmon.o >> +thermal_sys-$(CONFIG_THERMAL_OF) += thermal_dt.o >> >> # governors >> thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o >> diff --git a/drivers/thermal/thermal_dt.c b/drivers/thermal/thermal_dt.c >> new file mode 100644 >> index 0000000..cc35485 >> --- /dev/null >> +++ b/drivers/thermal/thermal_dt.c >> @@ -0,0 +1,473 @@ >> +/* >> + * thermal_dt.c - Generic Thermal Management device tree support. >> + * >> + * Copyright (C) 2013 Texas Instruments >> + * Copyright (C) 2013 Eduardo Valentin <eduardo.valentin@ti.com> >> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> + * >> + * This program is free software; you can redistribute it and/or modify >> + * it under the terms of the GNU General Public License as published by >> + * the Free Software Foundation; version 2 of the License. >> + * >> + * This program is distributed in the hope that it will be useful, but >> + * WITHOUT ANY WARRANTY; without even the implied warranty of >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> + * General Public License for more details. >> + * >> + * You should have received a copy of the GNU General Public License along >> + * with this program; if not, write to the Free Software Foundation, Inc., >> + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. >> + * >> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> + */ >> +#include <linux/thermal.h> >> +#include <linux/slab.h> >> +#include <linux/types.h> >> +#include <linux/of_device.h> >> +#include <linux/of_platform.h> >> +#include <linux/err.h> >> +#include <linux/export.h> >> +#include <linux/string.h> >> + >> +struct __thermal_bind_params { >> + char cooling_device[THERMAL_NAME_LENGTH]; >> +}; >> + >> +static >> +int thermal_of_match(struct thermal_zone_device *tz, >> + struct thermal_cooling_device *cdev); >> + >> +static int thermal_of_populate_bind_params(struct device *dev, >> + struct device_node *node, >> + struct __thermal_bind_params *__tbp, >> + struct thermal_bind_params *tbp, >> + int ntrips) >> +{ >> + const char *cdev_name; >> + u32 *limits; >> + int ret; >> + u32 prop; >> + >> + ret = of_property_read_u32(node, "weight", &prop); >> + if (ret < 0) { >> + dev_err(dev, "missing weight property\n"); >> + return ret; >> + } >> + tbp->weight = prop; >> + >> + ret = of_property_read_u32(node, "mask", &prop); >> + if (ret < 0) { >> + dev_err(dev, "missing mask property\n"); >> + return ret; >> + } >> + tbp->trip_mask = prop; >> + >> + /* this will allow us to bind with cooling devices */ >> + tbp->match = thermal_of_match; >> + >> + ret = of_property_read_string(node, "cooling_device", &cdev_name); >> + if (ret < 0) { >> + dev_err(dev, "missing cooling_device property\n"); >> + return ret; >> + } >> + strncpy(__tbp->cooling_device, cdev_name, >> + sizeof(__tbp->cooling_device)); >> + >> + limits = kzalloc(ntrips * sizeof(*limits) * 2, GFP_KERNEL); > > Why do you use kzalloc here, but devm_kzalloc elsewhere? That'll lead to > problems as the driver gets refactored over time. Because this memory is used only within this function and thus released at the bottom of it. > >> + if (!limits) { >> + dev_err(dev, "not enough mem for reading limits\n"); >> + return -ENOMEM; >> + } >> + ret = of_property_read_u32_array(node, "limits", limits, 2 * ntrips); >> + if (!ret) { >> + int i = ntrips; >> + >> + tbp->binding_limits = devm_kzalloc(dev, >> + ntrips * 2 * >> + sizeof(*tbp->binding_limits), >> + GFP_KERNEL); >> + if (!tbp->binding_limits) { >> + dev_err(dev, "not enough mem for storing limits\n"); >> + return -ENOMEM; >> + } >> + while (i--) >> + tbp->binding_limits[i] = limits[i]; >> + } >> + kfree(limits); here. >> + >> + return 0; > > What if you failed to read the array? Surely you should return an error? Attempting to make this property not mandatory. > >> +} >> + >> +struct __thermal_trip { >> + unsigned long int temperature; >> + unsigned long int hysteresis; >> + enum thermal_trip_type type; >> +}; >> + >> +static >> +int thermal_of_populate_trip(struct device *dev, >> + struct device_node *node, >> + struct __thermal_trip *trip) >> +{ >> + int ret; >> + int prop; >> + >> + ret = of_property_read_u32(node, "temperature", &prop); >> + if (ret < 0) { >> + dev_err(dev, "missing temperature property\n"); >> + return ret; >> + } >> + trip->temperature = prop; >> + >> + ret = of_property_read_u32(node, "hysteresis", &prop); >> + if (ret < 0) { >> + dev_err(dev, "missing hysteresis property\n"); >> + return ret; >> + } >> + trip->hysteresis = prop; >> + >> + ret = of_property_read_u32(node, "type", &prop); >> + if (ret < 0) { >> + dev_err(dev, "missing type property\n"); >> + return ret; >> + } >> + trip->type = prop; > > This will require sanity checking. OK. > >> + >> + return 0; >> +} >> + >> +struct __thermal_zone_device { >> + enum thermal_device_mode mode; >> + int passive_delay; >> + int polling_delay; >> + int mask; >> + int ntrips; >> + char type[THERMAL_NAME_LENGTH]; >> + struct __thermal_trip *trips; >> + struct __thermal_bind_params *bind_params; >> + struct thermal_bind_params *tbps; >> + struct thermal_zone_params zone_params; >> + int (*get_temp)(void *, unsigned long *); >> + void *devdata; >> +}; >> + >> +static struct __thermal_zone_device * >> +thermal_of_build_thermal_zone(struct device *dev, struct device_node *node) >> +{ >> + struct device_node *child, *gchild; >> + struct __thermal_zone_device *tz; >> + int ret, i; >> + u32 prop; >> + >> + if (!node) { >> + dev_err(dev, "no thermal zone node\n"); >> + return ERR_PTR(-EINVAL); >> + } >> + >> + tz = devm_kzalloc(dev, sizeof(*tz), GFP_KERNEL); >> + if (!tz) { >> + dev_err(dev, "not enough memory for thermal of zone\n"); >> + return ERR_PTR(-ENOMEM); >> + } >> + >> + ret = of_property_read_u32(node, "passive_delay", &prop); >> + if (ret < 0) { >> + dev_err(dev, "missing passive_delay property\n"); >> + return ERR_PTR(ret); >> + } >> + tz->passive_delay = prop; >> + >> + ret = of_property_read_u32(node, "polling_delay", &prop); >> + if (ret < 0) { >> + dev_err(dev, "missing polling_delay property\n"); >> + return ERR_PTR(ret); >> + } >> + tz->polling_delay = prop; >> + >> + ret = of_property_read_u32(node, "mask", &prop); >> + if (ret < 0) { >> + dev_err(dev, "missing mask property\n"); >> + return ERR_PTR(ret); >> + } >> + tz->mask = prop; >> + >> + /* assume node name as our zone type */ >> + strncpy(tz->type, node->name, sizeof(tz->type)); >> + >> + /* default policy */ >> + strncpy(tz->zone_params.governor_name, "step_wise", >> + sizeof(tz->zone_params.governor_name)); >> + >> + /* trips */ >> + child = of_find_node_by_name(node, "trips"); > > That might not be a child node, as of_find_node_by_name walks over the > list of all nodes. Use of_get_child_by_name. Right! This is a bug. > >> + >> + /* No trips provided */ >> + if (!child) >> + goto finish; >> + >> + tz->ntrips = of_get_child_count(child); >> + tz->trips = devm_kzalloc(dev, tz->ntrips * sizeof(*tz->trips), >> + GFP_KERNEL); >> + if (!tz->trips) >> + return ERR_PTR(-ENOMEM); >> + i = 0; >> + for_each_child_of_node(child, gchild) >> + thermal_of_populate_trip(dev, gchild, &tz->trips[i++]); > > What if a child node was not a valid trip node? It seems > thermal_of_populate_trip returns errors you're ignoring. OK. Adding proper treatment. > >> + >> + /* bind_params */ >> + child = of_find_node_by_name(node, "bind_params"); > > Another of_get_child_by_name candidate. OK. > >> + >> + /* No binding info */ >> + if (!child) >> + goto finish; >> + >> + tz->zone_params.num_tbps = of_get_child_count(child); >> + tz->bind_params = devm_kzalloc(dev, >> + tz->zone_params.num_tbps * >> + sizeof(*tz->bind_params), >> + GFP_KERNEL); >> + if (!tz->bind_params) >> + return ERR_PTR(-ENOMEM); >> + tz->zone_params.tbp = devm_kzalloc(dev, >> + tz->zone_params.num_tbps * >> + sizeof(*tz->zone_params.tbp), >> + GFP_KERNEL); >> + if (!tz->zone_params.tbp) >> + return ERR_PTR(-ENOMEM); >> + i = 0; >> + for_each_child_of_node(child, gchild) { >> + thermal_of_populate_bind_params(dev, gchild, >> + &tz->bind_params[i], >> + &tz->zone_params.tbp[i], >> + tz->ntrips); > > Similarly, you're ignoring error conditions here. What if a bind_params > node was malformed? Adding proper error handling. > >> + i++; >> + } >> + >> +finish: >> + tz->mode = THERMAL_DEVICE_ENABLED; >> + >> + return tz; >> +} >> + >> +static >> +int thermal_of_match(struct thermal_zone_device *tz, >> + struct thermal_cooling_device *cdev) >> +{ >> + struct __thermal_zone_device *data = tz->devdata; >> + int i; >> + >> + for (i = 0; i < data->zone_params.num_tbps; i++) { >> + if (!strncmp(data->bind_params[i].cooling_device, >> + cdev->type, >> + strlen(data->bind_params[i].cooling_device)) && >> + (data->zone_params.tbp[i].trip_mask & (1 << i))) >> + return 0; >> + } >> + >> + return -EINVAL; >> +} > > I'm not keen on binding to cooling devices using a string, as suddenly a > name to allow humans to inspect the system is turned into an ABI. OK. We need to align on how we describe the binding then. > >> + >> +static >> +int of_thermal_get_temp(struct thermal_zone_device *tz, >> + unsigned long *temp) >> +{ >> + struct __thermal_zone_device *data = tz->devdata; >> + >> + return data->get_temp(data->devdata, temp); >> +} >> + >> +static >> +int of_thermal_get_mode(struct thermal_zone_device *tz, >> + enum thermal_device_mode *mode) >> +{ >> + struct __thermal_zone_device *data = tz->devdata; >> + >> + *mode = data->mode; >> + >> + return 0; >> +} >> + >> +static >> +int of_thermal_set_mode(struct thermal_zone_device *tz, >> + enum thermal_device_mode mode) >> +{ >> + struct __thermal_zone_device *data = tz->devdata; >> + >> + mutex_lock(&tz->lock); >> + >> + if (mode == THERMAL_DEVICE_ENABLED) >> + tz->polling_delay = data->polling_delay; >> + else >> + tz->polling_delay = 0; >> + >> + mutex_unlock(&tz->lock); >> + >> + data->mode = mode; >> + thermal_zone_device_update(tz); >> + >> + return 0; >> +} >> + >> +static >> +int of_thermal_get_trip_type(struct thermal_zone_device *tz, int trip, >> + enum thermal_trip_type *type) >> +{ >> + struct __thermal_zone_device *data = tz->devdata; >> + >> + if (trip >= data->ntrips || trip < 0) >> + return -EDOM; > > It's far more common to use -EINVAL. hmm.. OK. > >> + >> + *type = data->trips[trip].type; >> + >> + return 0; >> +} >> + >> +static >> +int of_thermal_get_trip_temp(struct thermal_zone_device *tz, int trip, >> + unsigned long *temp) >> +{ >> + struct __thermal_zone_device *data = tz->devdata; >> + >> + if (trip >= data->ntrips || trip < 0) >> + return -EDOM; >> + >> + *temp = data->trips[trip].temperature; >> + >> + return 0; >> +} >> + >> +static >> +int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip, >> + unsigned long temp) >> +{ >> + struct __thermal_zone_device *data = tz->devdata; >> + >> + if (trip >= data->ntrips || trip < 0) >> + return -EDOM; >> + >> + /* thermal fw should take care of data->mask & (1 << trip) */ >> + data->trips[trip].temperature = temp; >> + >> + return 0; >> +} >> + >> +static >> +int of_thermal_get_trip_hyst(struct thermal_zone_device *tz, int trip, >> + unsigned long *hyst) >> +{ >> + struct __thermal_zone_device *data = tz->devdata; >> + >> + if (trip >= data->ntrips || trip < 0) >> + return -EDOM; >> + >> + *hyst = data->trips[trip].hysteresis; >> + >> + return 0; >> +} >> + >> +static >> +int of_thermal_set_trip_hyst(struct thermal_zone_device *tz, int trip, >> + unsigned long hyst) >> +{ >> + struct __thermal_zone_device *data = tz->devdata; >> + >> + if (trip >= data->ntrips || trip < 0) >> + return -EDOM; >> + >> + /* thermal fw should take care of data->mask & (1 << trip) */ > > Could you elaborate on that comment? In case the trip is not assigned to be writable, this function won't be called. > >> + data->trips[trip].hysteresis = hyst; >> + >> + return 0; >> +} > > [...] > > Thanks, > Mark. > >
On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote: > Hello Mark, > > > First of all, thanks for taking the time to review in such level of > detail. Lets try to align and have a common understanding. Answers go > inline. Please let me know if I missed something or if I made it more > confusing :-) > > On 27-08-2013 06:22, Mark Rutland wrote: > > Hi, > > > > On Sat, Aug 24, 2013 at 12:15:43AM +0100, Eduardo Valentin wrote: > >> In order to be able to build thermal policies > >> based on generic sensors, like I2C device, that > >> can be places in different points on different boards, > >> there is a need to have a way to feed board dependent > >> data into the thermal framework. > >> > >> This patch introduces a thermal data parser for device > >> tree. The parsed data is used to build thermal zones > >> and thermal binding parameters. The output data > >> can then be used to deploy thermal policies. > >> > >> This patch adds also documentation regarding this > >> API and how to define tree nodes to use > >> this infrastructure. > >> > >> Cc: Zhang Rui <rui.zhang@intel.com> > >> Cc: linux-pm@vger.kernel.org > >> Cc: linux-kernel@vger.kernel.org > >> Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com> > >> --- > >> .../devicetree/bindings/thermal/thermal.txt | 160 +++++++ > >> drivers/thermal/Kconfig | 13 + > >> drivers/thermal/Makefile | 1 + > >> drivers/thermal/thermal_dt.c | 473 +++++++++++++++++++++ > >> include/dt-bindings/thermal/thermal.h | 27 ++ > >> include/linux/thermal.h | 3 + > >> 6 files changed, 677 insertions(+) > >> create mode 100644 Documentation/devicetree/bindings/thermal/thermal.txt > >> create mode 100644 drivers/thermal/thermal_dt.c > >> create mode 100644 include/dt-bindings/thermal/thermal.h > >> > > > General question on device tree documentation. Assuming there is such an > effort to make it Linux independent, is there other places out of Linux > tree that these binding must be documented? At present, no. There is a plan to move the documentation out of the kernel, but I'm not sure on how that's currently progressing. Some general cleanup needs to occur before we can actually do that. > > > >> diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt > >> new file mode 100644 > >> index 0000000..af20ab0 > >> --- /dev/null > >> +++ b/Documentation/devicetree/bindings/thermal/thermal.txt > >> @@ -0,0 +1,160 @@ > >> +---------------------------------------- > >> +Thermal Framework Device Tree descriptor > >> +---------------------------------------- > >> + > >> +This file describes how to define a thermal structure using device tree. > >> +A thermal structure includes thermal zones and their components, such > >> +as name, trip points, polling intervals and cooling devices binding > >> +descriptors. A binding descriptor may contain information on how to > >> +react, with a cooling stepped action or a weight on a fair share. > >> +The target of device tree thermal descriptors is to describe only > >> +the hardware thermal aspects, not how the system must control or which > >> +algorithm or policy must take place. It follows a description of each > >> +type of device tree node. > >> + > >> +**** > >> +trip > >> +**** > >> + > >> +The trip node is a node to describe a point in the temperature domain > >> +in which the system takes an action. This node describes just the point, > >> +not the action. > >> + > >> +A node describing a trip must contain: > >> +- temperature: the trip temperature level, in milliCelsius. > >> +- hysteresis: a (low) hysteresis value on 'temperature'. This is a relative > >> +value, in milliCelsius. > > > > Presumably this is a single (u32) cell? > > Yes, both of them above are a single u32. Ok. This seems to be the default assumption in many bindings, so I shan't argue against it here, but in general we should be as precise as possible as to the format of the devicetree. > > > > >> +- type: the trip type. Here is the type mapping: > >> + THERMAL_TRIP_ACTIVE > >> + THERMAL_TRIP_PASSIVE > >> + THERMAL_TRIP_HOT > >> + THERMAL_TRIP_CRITICAL > > > > What type is this property? > > > > What are these values? Do you need to refer to a header somewhere? > > There is a header, introduced in this patch. > include/dt-bindings/thermal/thermal.h > > > Symbolic constants should have an associated value for an ABI. > > OK. Sounds reasonable to me. Shall we be descriptive and enlist the > values in this doc too or just pointing to header is fine? I believe that referring to the header is fine, though personally I'd prefer if we were explicit in binding docs. I'll leave Stephen to comment on what the preferred way of handling header constants with binding documents is. > > > > >> + > >> +********** > >> +bind_param > >> +********** > >> + > >> +The bind_param node is a node to describe how cooling devices get assigned > >> +to trip points of the zone. The cooling devices are expected to be loaded > >> +in the target system. > >> + > >> +A node describing a bind_param must contain: > >> +- cooling_device: A string with the cooling device name. > > > > Please use '-' rather than '_' in bindings. That's the normal devicetree > > convention, and there's no point gonig against it and causing more > > confusion. > > Agreed. > > > > > What are valid values of this string, and what do they mean? > > These are the name of the cooling devices to match to. Which are what exactly? What constitutes a "cooling device"? Is "black-body radiation" [1] a valid cooling device? How about "copper heat sink"? > > > > > Why not a phandle + cells binding to cooling devices? > > > > Yeah, this could work too. I am going to check how that would look like > and come back to you. Ok. > > >> +- weight: The 'influence' of a particular cooling device on this zone. > >> + This is on a percentage scale. The sum of all these weights > >> + (for a particular zone) cannot exceed 100. > > > > This is a bit fuzzy, and certainly needs more guidance. > > OK. This property is used to describe the amount of cooling contribution > when activating a cooling device at a certain trip point. I will work on > a better guidance. > > > > >> +- trip_mask: This is a bit mask that gives the binding relation between > >> + this thermal zone and cdev, for a particular trip point. > >> + If nth bit is set, then the cdev and thermal zone are bound > >> + for trip point n. > > > > What type is this? > > > > The nth bit from which end? > > > > Could you explain what "bound for trip point n" means? > > > > It is just a way to match a cooling device with a set of trip points. > Say you want to match one specific cooling device with trips 0 and 1, > you would then create a bind_param with trip_mask = 0x3. > > > Just because this is a bitmask in Linux does not mean it needs to be so > > in the dt. > > Indeed. > > In case we work with phandle + cell binging parameters, the trip and > weight could be passed as params. > > cooling-devices = <&devX tripY weightZ>, <... >; Something like that could certainly work. > > > > >> +- limits: An array integers consisting of 2-tuples items, and each item means > >> + lower and upper state limits, like <min-state max-state>. > >> + > >> +************************** > >> +temperature sensor devices > >> +************************** > >> + > >> +Temperature sensor devices have to be linked to a specific thermal zone. > >> +This is done by means of the 'monitored-zones' list. > >> +- monitored-zones: A list of thermal zones phandles, identifying which > >> +thermal zones the temperature sensor device is used to monitor. > > > > Why does this need to be described that way around? > > > > Can the zone not have a link to the sensor(s) it needs? > > > > Do you see any issues with it? > > Idea is that sensor nodes would list which zones they are used. Same > idea goes to cooling devices. It can be the other way around, yes. It > depends on what is the best way to describe these relations. Idea would > be so that sensors and cooling devices would be composing a thermal zone. I think that having a zone refer to the sensors that monitor it would be preferable. Otherwise we're encoding that way a devices output is to be consumed on the binding for the device itself, which feels odd (it's arguable as to whether it's a property of the device, however). > > >> + > >> +************ > >> +thermal_zone > >> +************ > >> + > >> +The thermal_zone node is the node containing all the required info > >> +for describing a thermal zone, including its cdev bindings. The thermal_zone > >> +node must contain, apart from its own properties, one node containing > >> +trip nodes and one node containing all the zone bind parameters. > >> + > >> +Required properties: > >> +- mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable. > > > > What type is a bit string in devicetree? I'm aware of cells and (byte) > > strings, but not bit strings. > > > > What does 'writeable' mean in this context? > > > > There's presumably a better name than 'mask'. > > > > This is used to describe if trip points are writeable under sysfs. I > think this is a OS implementation leak. I will be removing this from > device tree and making them RO by default. Ok. > > >> + > >> +- passive_delay: number of milliseconds to wait between polls when > >> + performing passive cooling. > >> + > >> +- polling_delay: number of milliseconds to wait between polls when checking. > > > > These sound very much like configuration of the driver rather than > > hardware description. What if my temperature sensor can generate an > > interrupt at some trigger temperature? Why would I need polling times? > > > > Why does this need to be in the dt at all? Surely the OS can choose some > > sensible polling period, possibly dynamically? > > Well, let me answer this one with another question. How the OS will > determine how fast is the temperature rise on a thermal zone described > in device tree? How will the devicetree writer determine this? It's possible for the thermal management code to dynamically adjust the rate between some extreme polling intervals based on how big a rise it detects between polls or possibly something more complex taking into account some metric for load. > > > > >> + > >> +- #thermal-cells: An integer that is used to specify how many cells compose > >> + sensor ids. > > > > I don't see why this is necessary. If the zone itself described its > > related sensors rather than the other way around, we wouldn't need it at > > all. Most bindings so far have consumers link to their providers rather > > than the other way around. > > > > I see. > > > >> + > >> +Below is an example: > >> +cpu_thermal: thermal_zone { > >> + #thermal-cells = <1>; > >> + mask = <0x03>; /* trips writability */ > >> + passive_delay = <250>; /* milliseconds */ > >> + polling_delay = <1000>; /* milliseconds */ > >> + trips { > >> + alert@100000{ > >> + temperature = <100000>; /* milliCelsius */ > >> + hysteresis = <0>; /* milliCelsius */ > >> + type = <THERMAL_TRIP_PASSIVE>; > >> + }; > >> + crit@125000{ > >> + temperature = <125000>; /* milliCelsius */ > >> + hysteresis = <0>; /* milliCelsius */ > >> + type = <THERMAL_TRIP_CRITICAL>; > >> + }; I didn't spot this on my first round, but the trips node will need #address-cells and #size-cells, and the sub nodes unit-addresses should match their reg properties. > >> + }; > >> + bind_params { > >> + action@0{ > >> + cooling_device = "thermal-cpufreq"; > > > > NAK. That value was not defined anywhere, and represents Linux > > infrastructure rather than hardware. > > > > Yes, this was another leak as discussed above in the upper part of the > doc file. > > > For this case, we could just as easily describe the thermal points, and > > if no physical cooling device (e.g. a fan) is present, assume > > cpufreq-base cooling. Other OSs could choose to do otherwise, and we > > could change later... > > > > The way other specifications do (aka ACPI) is to have trip types. In > case a trip type is passive cooling, it means one is supposed to > modulate the dissipated power rather than activating a device to remove > heat (active cooling). > > The part of 'assuming' things that bothers me. I would rather have a way > to really say one wants to have passive cooling at a specific trip point. I don't see the problem here. If you don't specify a device, your only options will be some sort of passive management. > > > >> + weight = <100>; /* percentage */ > >> + mask = <0x01>; > > > > This should presumably be trip-mask (assuming my suggested corrections). > > > > What is it supposed to mean here? > > As stated above, it means the trip points that this binding represents. If you have more than 32 trip points (which is admittedly unlikely), how do you represent them? I'm not sure a mask is the best representation here, as it's difficult for a human to match up with a list of entries. Additionally, the trip points don't have a logical contiguous numbering from zero, so how do the bits in the mask correspond to entries in the list? This seems to rely on the order that the subnodes are in, which AFAIK is not guaranteed. It would be far nicer to have an explicit linkage. > > > > >> + limits = < > >> + /* lower-state upper-state */ > >> + 1 THERMAL_NO_LIMITS > >> + THERMAL_NO_LIMITS THERMAL_NO_LIMITS > >> + >; > >> + }; > >> + }; > >> +}; > >> + > >> +sensor: adc@0xFFFF { > >> + ... > >> + monitored-zones = <&cpu_thermal 0>; > >> +}; > >> + > >> +cpu@0 { > >> + ... > >> + cooling-zones = <&cpu_thermal>; > >> +}; > >> + > >> +In the example above, the ADC sensor at address 0xFFFF is used to monitor > >> +the zone 'cpu_thermal' using its the sensor 0. The cpu@0 device is also linked > >> +to the same thermal zone 'cpu_thermal' as a cooling device. > > > > Huh? That seems to imply the the '0' in '<&cpu_thermal 0>' on the sensor > > node describes a property of the sensor rather than the thermal node, > > unless I've misunderstood? > > > > It is essentially which sensor id is used to monitor a specific zone. In > case of sensor IPs that feature several internal sensors, one needs to > have a way to describe which sensor monitors which zone. Example of this > type of IPs: lm90 (up to 3 sensors) and TI bandgap IP (can contain up to > 5 internal sensors). The probe happens for the IP device and not for the > internal sensors. Which is exactly why this should be the opposite way round, with thermal zones referring to sensors in this fashion. The way you're suggesting means that the phandle+cells specifier refers half to the consumer and half to the producer. Having thermal zones refer to sensors would make this refer solely to the producer (the sensor). > > > >> + > >> +*************** > >> +cpufreq-cooling > >> +*************** > >> + > >> +The generic cpu cooling (freq clipping) can be loaded by the > >> +generic cpufreq-cpu0 driver in case the device tree node informs > >> +this need. > >> + > >> +In order to load the cpufreq cooling device, it is possible to > >> +inform that the cpu needs cooling by adding the list of thermal zones > >> +in the 'cooling-zones' property at the cpu0 phandle. > > > > This should be reworded so as to describe the binding rather than the > > Linux behaviour based on the binding. > > OK. > > > > >> + > >> +Example: > >> + cpus { > >> + cpu@0 { > >> + operating-points = < > >> + /* kHz uV */ > >> + 800000 1313000 > >> + 1008000 1375000 > >> + >; > >> + cooling-zones = <&cpu_thermal>; > >> + }; > >> + ... > >> + }; > >> + > >> +The above will cause the cpufreq-cpu0 driver to understand that > >> +the cpu will need passive cooling and while probing that node it will > >> +also load the cpufreq cpu cooling device in that system. > > > > The above describe that the CPU is part of a cooling (thermal?) zone and > > requires cooling. > > That says the cpu participates in a thermal zone as a cooling > device/mechanism, thus cpu cooling (passive cooling) is required in > target system. I think you need the rest of the nodes in this example. You don't show whether or not there's a cooling-device, so it's not clear that the CPU is required to be passively cooled... > > > > >> + > >> +However, be advised that this flag will not describe what needs to > >> +be done or how the cpufreq cooling device will be used. In order to > >> +accomplish this, one needs to write a phandle for a thermal zone, as > >> +described in the section 'thermal_zone'. > > > > The above already has a phandle to a thermal (cooling?) zone. > > > > Please choose either cooling zone or thermal zone, having the two names > > makes this more confusing than necessary. > > I see. Idea is to have thermal zone as a node. cooling-zone is just a > list of thermal zones that the referred device will be used as cooling > device. While I can see a distinction, the nomenclature is horrendously confusing. > > > > >> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig > >> index 7fb16bc..753f0dc 100644 > >> --- a/drivers/thermal/Kconfig > >> +++ b/drivers/thermal/Kconfig > >> @@ -29,6 +29,19 @@ config THERMAL_HWMON > >> Say 'Y' here if you want all thermal sensors to > >> have hwmon sysfs interface too. > >> > >> +config THERMAL_OF > >> + bool > >> + prompt "APIs to parse thermal data out of device tree" > >> + depends on OF > >> + default y > >> + help > >> + This options provides helpers to add the support to > >> + read and parse thermal data definitions out of the > >> + device tree blob. > >> + > >> + Say 'Y' here if you need to build thermal infrastructure > >> + based on device tree. > >> + > >> choice > >> prompt "Default Thermal governor" > >> default THERMAL_DEFAULT_GOV_STEP_WISE > >> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile > >> index 24cb894..eedb273 100644 > >> --- a/drivers/thermal/Makefile > >> +++ b/drivers/thermal/Makefile > >> @@ -7,6 +7,7 @@ thermal_sys-y += thermal_core.o > >> > >> # interface to/from other layers providing sensors > >> thermal_sys-$(CONFIG_THERMAL_HWMON) += thermal_hwmon.o > >> +thermal_sys-$(CONFIG_THERMAL_OF) += thermal_dt.o > >> > >> # governors > >> thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o > >> diff --git a/drivers/thermal/thermal_dt.c b/drivers/thermal/thermal_dt.c > >> new file mode 100644 > >> index 0000000..cc35485 > >> --- /dev/null > >> +++ b/drivers/thermal/thermal_dt.c > >> @@ -0,0 +1,473 @@ > >> +/* > >> + * thermal_dt.c - Generic Thermal Management device tree support. > >> + * > >> + * Copyright (C) 2013 Texas Instruments > >> + * Copyright (C) 2013 Eduardo Valentin <eduardo.valentin@ti.com> > >> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > >> + * > >> + * This program is free software; you can redistribute it and/or modify > >> + * it under the terms of the GNU General Public License as published by > >> + * the Free Software Foundation; version 2 of the License. > >> + * > >> + * This program is distributed in the hope that it will be useful, but > >> + * WITHOUT ANY WARRANTY; without even the implied warranty of > >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > >> + * General Public License for more details. > >> + * > >> + * You should have received a copy of the GNU General Public License along > >> + * with this program; if not, write to the Free Software Foundation, Inc., > >> + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. > >> + * > >> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > >> + */ > >> +#include <linux/thermal.h> > >> +#include <linux/slab.h> > >> +#include <linux/types.h> > >> +#include <linux/of_device.h> > >> +#include <linux/of_platform.h> > >> +#include <linux/err.h> > >> +#include <linux/export.h> > >> +#include <linux/string.h> > >> + > >> +struct __thermal_bind_params { > >> + char cooling_device[THERMAL_NAME_LENGTH]; > >> +}; > >> + > >> +static > >> +int thermal_of_match(struct thermal_zone_device *tz, > >> + struct thermal_cooling_device *cdev); > >> + > >> +static int thermal_of_populate_bind_params(struct device *dev, > >> + struct device_node *node, > >> + struct __thermal_bind_params *__tbp, > >> + struct thermal_bind_params *tbp, > >> + int ntrips) > >> +{ > >> + const char *cdev_name; > >> + u32 *limits; > >> + int ret; > >> + u32 prop; > >> + > >> + ret = of_property_read_u32(node, "weight", &prop); > >> + if (ret < 0) { > >> + dev_err(dev, "missing weight property\n"); > >> + return ret; > >> + } > >> + tbp->weight = prop; > >> + > >> + ret = of_property_read_u32(node, "mask", &prop); > >> + if (ret < 0) { > >> + dev_err(dev, "missing mask property\n"); > >> + return ret; > >> + } > >> + tbp->trip_mask = prop; > >> + > >> + /* this will allow us to bind with cooling devices */ > >> + tbp->match = thermal_of_match; > >> + > >> + ret = of_property_read_string(node, "cooling_device", &cdev_name); > >> + if (ret < 0) { > >> + dev_err(dev, "missing cooling_device property\n"); > >> + return ret; > >> + } > >> + strncpy(__tbp->cooling_device, cdev_name, > >> + sizeof(__tbp->cooling_device)); > >> + > >> + limits = kzalloc(ntrips * sizeof(*limits) * 2, GFP_KERNEL); > > > > Why do you use kzalloc here, but devm_kzalloc elsewhere? That'll lead to > > problems as the driver gets refactored over time. > > Because this memory is used only within this function and thus released > at the bottom of it. > > > > >> + if (!limits) { > >> + dev_err(dev, "not enough mem for reading limits\n"); > >> + return -ENOMEM; > >> + } > >> + ret = of_property_read_u32_array(node, "limits", limits, 2 * ntrips); > >> + if (!ret) { > >> + int i = ntrips; > >> + > >> + tbp->binding_limits = devm_kzalloc(dev, > >> + ntrips * 2 * > >> + sizeof(*tbp->binding_limits), > >> + GFP_KERNEL); > >> + if (!tbp->binding_limits) { > >> + dev_err(dev, "not enough mem for storing limits\n"); > >> + return -ENOMEM; > >> + } > >> + while (i--) > >> + tbp->binding_limits[i] = limits[i]; > >> + } > >> + kfree(limits); > > here. I saw that, I did say that this may lead to problems in future if the function's reorganised, not now. :) > > >> + > >> + return 0; > > > > What if you failed to read the array? Surely you should return an error? > > Attempting to make this property not mandatory. That wasn't clear from the binding, and now I've re-read it, I'm still not entirely clear on what the limits represent, and why you may have multiple sets of pairs. > > > > >> +} > >> + > >> +struct __thermal_trip { > >> + unsigned long int temperature; > >> + unsigned long int hysteresis; > >> + enum thermal_trip_type type; > >> +}; > >> + > >> +static > >> +int thermal_of_populate_trip(struct device *dev, > >> + struct device_node *node, > >> + struct __thermal_trip *trip) > >> +{ > >> + int ret; > >> + int prop; > >> + > >> + ret = of_property_read_u32(node, "temperature", &prop); > >> + if (ret < 0) { > >> + dev_err(dev, "missing temperature property\n"); > >> + return ret; > >> + } > >> + trip->temperature = prop; > >> + > >> + ret = of_property_read_u32(node, "hysteresis", &prop); > >> + if (ret < 0) { > >> + dev_err(dev, "missing hysteresis property\n"); > >> + return ret; > >> + } > >> + trip->hysteresis = prop; > >> + > >> + ret = of_property_read_u32(node, "type", &prop); > >> + if (ret < 0) { > >> + dev_err(dev, "missing type property\n"); > >> + return ret; > >> + } > >> + trip->type = prop; > > > > This will require sanity checking. > > OK. > > > > >> + > >> + return 0; > >> +} > >> + > >> +struct __thermal_zone_device { > >> + enum thermal_device_mode mode; > >> + int passive_delay; > >> + int polling_delay; > >> + int mask; > >> + int ntrips; > >> + char type[THERMAL_NAME_LENGTH]; > >> + struct __thermal_trip *trips; > >> + struct __thermal_bind_params *bind_params; > >> + struct thermal_bind_params *tbps; > >> + struct thermal_zone_params zone_params; > >> + int (*get_temp)(void *, unsigned long *); > >> + void *devdata; > >> +}; > >> + > >> +static struct __thermal_zone_device * > >> +thermal_of_build_thermal_zone(struct device *dev, struct device_node *node) > >> +{ > >> + struct device_node *child, *gchild; > >> + struct __thermal_zone_device *tz; > >> + int ret, i; > >> + u32 prop; > >> + > >> + if (!node) { > >> + dev_err(dev, "no thermal zone node\n"); > >> + return ERR_PTR(-EINVAL); > >> + } > >> + > >> + tz = devm_kzalloc(dev, sizeof(*tz), GFP_KERNEL); > >> + if (!tz) { > >> + dev_err(dev, "not enough memory for thermal of zone\n"); > >> + return ERR_PTR(-ENOMEM); > >> + } > >> + > >> + ret = of_property_read_u32(node, "passive_delay", &prop); > >> + if (ret < 0) { > >> + dev_err(dev, "missing passive_delay property\n"); > >> + return ERR_PTR(ret); > >> + } > >> + tz->passive_delay = prop; > >> + > >> + ret = of_property_read_u32(node, "polling_delay", &prop); > >> + if (ret < 0) { > >> + dev_err(dev, "missing polling_delay property\n"); > >> + return ERR_PTR(ret); > >> + } > >> + tz->polling_delay = prop; > >> + > >> + ret = of_property_read_u32(node, "mask", &prop); > >> + if (ret < 0) { > >> + dev_err(dev, "missing mask property\n"); > >> + return ERR_PTR(ret); > >> + } > >> + tz->mask = prop; > >> + > >> + /* assume node name as our zone type */ > >> + strncpy(tz->type, node->name, sizeof(tz->type)); > >> + > >> + /* default policy */ > >> + strncpy(tz->zone_params.governor_name, "step_wise", > >> + sizeof(tz->zone_params.governor_name)); > >> + > >> + /* trips */ > >> + child = of_find_node_by_name(node, "trips"); > > > > That might not be a child node, as of_find_node_by_name walks over the > > list of all nodes. Use of_get_child_by_name. > > Right! This is a bug. > > > > >> + > >> + /* No trips provided */ > >> + if (!child) > >> + goto finish; > >> + > >> + tz->ntrips = of_get_child_count(child); > >> + tz->trips = devm_kzalloc(dev, tz->ntrips * sizeof(*tz->trips), > >> + GFP_KERNEL); > >> + if (!tz->trips) > >> + return ERR_PTR(-ENOMEM); > >> + i = 0; > >> + for_each_child_of_node(child, gchild) > >> + thermal_of_populate_trip(dev, gchild, &tz->trips[i++]); > > > > What if a child node was not a valid trip node? It seems > > thermal_of_populate_trip returns errors you're ignoring. > > OK. Adding proper treatment. > > > > >> + > >> + /* bind_params */ > >> + child = of_find_node_by_name(node, "bind_params"); > > > > Another of_get_child_by_name candidate. > > OK. > > > > >> + > >> + /* No binding info */ > >> + if (!child) > >> + goto finish; > >> + > >> + tz->zone_params.num_tbps = of_get_child_count(child); > >> + tz->bind_params = devm_kzalloc(dev, > >> + tz->zone_params.num_tbps * > >> + sizeof(*tz->bind_params), > >> + GFP_KERNEL); > >> + if (!tz->bind_params) > >> + return ERR_PTR(-ENOMEM); > >> + tz->zone_params.tbp = devm_kzalloc(dev, > >> + tz->zone_params.num_tbps * > >> + sizeof(*tz->zone_params.tbp), > >> + GFP_KERNEL); > >> + if (!tz->zone_params.tbp) > >> + return ERR_PTR(-ENOMEM); > >> + i = 0; > >> + for_each_child_of_node(child, gchild) { > >> + thermal_of_populate_bind_params(dev, gchild, > >> + &tz->bind_params[i], > >> + &tz->zone_params.tbp[i], > >> + tz->ntrips); > > > > Similarly, you're ignoring error conditions here. What if a bind_params > > node was malformed? > > > Adding proper error handling. > > > > >> + i++; > >> + } > >> + > >> +finish: > >> + tz->mode = THERMAL_DEVICE_ENABLED; > >> + > >> + return tz; > >> +} > >> + > >> +static > >> +int thermal_of_match(struct thermal_zone_device *tz, > >> + struct thermal_cooling_device *cdev) > >> +{ > >> + struct __thermal_zone_device *data = tz->devdata; > >> + int i; > >> + > >> + for (i = 0; i < data->zone_params.num_tbps; i++) { > >> + if (!strncmp(data->bind_params[i].cooling_device, > >> + cdev->type, > >> + strlen(data->bind_params[i].cooling_device)) && > >> + (data->zone_params.tbp[i].trip_mask & (1 << i))) > >> + return 0; > >> + } > >> + > >> + return -EINVAL; > >> +} > > > > I'm not keen on binding to cooling devices using a string, as suddenly a > > name to allow humans to inspect the system is turned into an ABI. > > OK. We need to align on how we describe the binding then. Definitely. We need to ensure that we fully define the relationship too cooling devices, and don't suddenly expose some piece of internal Linux infrastructure to the world. > > > > >> + > >> +static > >> +int of_thermal_get_temp(struct thermal_zone_device *tz, > >> + unsigned long *temp) > >> +{ > >> + struct __thermal_zone_device *data = tz->devdata; > >> + > >> + return data->get_temp(data->devdata, temp); > >> +} > >> + > >> +static > >> +int of_thermal_get_mode(struct thermal_zone_device *tz, > >> + enum thermal_device_mode *mode) > >> +{ > >> + struct __thermal_zone_device *data = tz->devdata; > >> + > >> + *mode = data->mode; > >> + > >> + return 0; > >> +} > >> + > >> +static > >> +int of_thermal_set_mode(struct thermal_zone_device *tz, > >> + enum thermal_device_mode mode) > >> +{ > >> + struct __thermal_zone_device *data = tz->devdata; > >> + > >> + mutex_lock(&tz->lock); > >> + > >> + if (mode == THERMAL_DEVICE_ENABLED) > >> + tz->polling_delay = data->polling_delay; > >> + else > >> + tz->polling_delay = 0; > >> + > >> + mutex_unlock(&tz->lock); > >> + > >> + data->mode = mode; > >> + thermal_zone_device_update(tz); > >> + > >> + return 0; > >> +} > >> + > >> +static > >> +int of_thermal_get_trip_type(struct thermal_zone_device *tz, int trip, > >> + enum thermal_trip_type *type) > >> +{ > >> + struct __thermal_zone_device *data = tz->devdata; > >> + > >> + if (trip >= data->ntrips || trip < 0) > >> + return -EDOM; > > > > It's far more common to use -EINVAL. > > hmm.. OK. > > > > >> + > >> + *type = data->trips[trip].type; > >> + > >> + return 0; > >> +} > >> + > >> +static > >> +int of_thermal_get_trip_temp(struct thermal_zone_device *tz, int trip, > >> + unsigned long *temp) > >> +{ > >> + struct __thermal_zone_device *data = tz->devdata; > >> + > >> + if (trip >= data->ntrips || trip < 0) > >> + return -EDOM; > >> + > >> + *temp = data->trips[trip].temperature; > >> + > >> + return 0; > >> +} > >> + > >> +static > >> +int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip, > >> + unsigned long temp) > >> +{ > >> + struct __thermal_zone_device *data = tz->devdata; > >> + > >> + if (trip >= data->ntrips || trip < 0) > >> + return -EDOM; > >> + > >> + /* thermal fw should take care of data->mask & (1 << trip) */ > >> + data->trips[trip].temperature = temp; > >> + > >> + return 0; > >> +} > >> + > >> +static > >> +int of_thermal_get_trip_hyst(struct thermal_zone_device *tz, int trip, > >> + unsigned long *hyst) > >> +{ > >> + struct __thermal_zone_device *data = tz->devdata; > >> + > >> + if (trip >= data->ntrips || trip < 0) > >> + return -EDOM; > >> + > >> + *hyst = data->trips[trip].hysteresis; > >> + > >> + return 0; > >> +} > >> + > >> +static > >> +int of_thermal_set_trip_hyst(struct thermal_zone_device *tz, int trip, > >> + unsigned long hyst) > >> +{ > >> + struct __thermal_zone_device *data = tz->devdata; > >> + > >> + if (trip >= data->ntrips || trip < 0) > >> + return -EDOM; > >> + > >> + /* thermal fw should take care of data->mask & (1 << trip) */ > > > > Could you elaborate on that comment? > > In case the trip is not assigned to be writable, this function won't be > called. Ah. I misunderstood "fw" as "firmware", rather than "framework", and got confused. It would be nice if you could expand "fw" to make this clearer. :) Thanks, Mark. [1] http://en.wikipedia.org/wiki/Black-body_radiation -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 27-08-2013 12:23, Mark Rutland wrote: > On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote: >> Hello Mark, >> >> >> First of all, thanks for taking the time to review in such level of >> detail. Lets try to align and have a common understanding. Answers go >> inline. Please let me know if I missed something or if I made it more >> confusing :-) >> >> On 27-08-2013 06:22, Mark Rutland wrote: >>> Hi, >>> >>> On Sat, Aug 24, 2013 at 12:15:43AM +0100, Eduardo Valentin wrote: >>>> In order to be able to build thermal policies >>>> based on generic sensors, like I2C device, that >>>> can be places in different points on different boards, >>>> there is a need to have a way to feed board dependent >>>> data into the thermal framework. >>>> >>>> This patch introduces a thermal data parser for device >>>> tree. The parsed data is used to build thermal zones >>>> and thermal binding parameters. The output data >>>> can then be used to deploy thermal policies. >>>> >>>> This patch adds also documentation regarding this >>>> API and how to define tree nodes to use >>>> this infrastructure. >>>> >>>> Cc: Zhang Rui <rui.zhang@intel.com> >>>> Cc: linux-pm@vger.kernel.org >>>> Cc: linux-kernel@vger.kernel.org >>>> Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com> >>>> --- >>>> .../devicetree/bindings/thermal/thermal.txt | 160 +++++++ >>>> drivers/thermal/Kconfig | 13 + >>>> drivers/thermal/Makefile | 1 + >>>> drivers/thermal/thermal_dt.c | 473 +++++++++++++++++++++ >>>> include/dt-bindings/thermal/thermal.h | 27 ++ >>>> include/linux/thermal.h | 3 + >>>> 6 files changed, 677 insertions(+) >>>> create mode 100644 Documentation/devicetree/bindings/thermal/thermal.txt >>>> create mode 100644 drivers/thermal/thermal_dt.c >>>> create mode 100644 include/dt-bindings/thermal/thermal.h >>>> >> >> >> General question on device tree documentation. Assuming there is such an >> effort to make it Linux independent, is there other places out of Linux >> tree that these binding must be documented? > > At present, no. There is a plan to move the documentation out of the > kernel, but I'm not sure on how that's currently progressing. Some > general cleanup needs to occur before we can actually do that. > OK. >> >> >>>> diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt >>>> new file mode 100644 >>>> index 0000000..af20ab0 >>>> --- /dev/null >>>> +++ b/Documentation/devicetree/bindings/thermal/thermal.txt >>>> @@ -0,0 +1,160 @@ >>>> +---------------------------------------- >>>> +Thermal Framework Device Tree descriptor >>>> +---------------------------------------- >>>> + >>>> +This file describes how to define a thermal structure using device tree. >>>> +A thermal structure includes thermal zones and their components, such >>>> +as name, trip points, polling intervals and cooling devices binding >>>> +descriptors. A binding descriptor may contain information on how to >>>> +react, with a cooling stepped action or a weight on a fair share. >>>> +The target of device tree thermal descriptors is to describe only >>>> +the hardware thermal aspects, not how the system must control or which >>>> +algorithm or policy must take place. It follows a description of each >>>> +type of device tree node. >>>> + >>>> +**** >>>> +trip >>>> +**** >>>> + >>>> +The trip node is a node to describe a point in the temperature domain >>>> +in which the system takes an action. This node describes just the point, >>>> +not the action. >>>> + >>>> +A node describing a trip must contain: >>>> +- temperature: the trip temperature level, in milliCelsius. >>>> +- hysteresis: a (low) hysteresis value on 'temperature'. This is a relative >>>> +value, in milliCelsius. >>> >>> Presumably this is a single (u32) cell? >> >> Yes, both of them above are a single u32. > > Ok. This seems to be the default assumption in many bindings, so I > shan't argue against it here, but in general we should be as precise as > possible as to the format of the devicetree. > Yeah, I followed the default assumption. It does not hurt to be descriptive though. >> >>> >>>> +- type: the trip type. Here is the type mapping: >>>> + THERMAL_TRIP_ACTIVE >>>> + THERMAL_TRIP_PASSIVE >>>> + THERMAL_TRIP_HOT >>>> + THERMAL_TRIP_CRITICAL >>> >>> What type is this property? >>> >>> What are these values? Do you need to refer to a header somewhere? >> >> There is a header, introduced in this patch. >> include/dt-bindings/thermal/thermal.h >> >>> Symbolic constants should have an associated value for an ABI. >> >> OK. Sounds reasonable to me. Shall we be descriptive and enlist the >> values in this doc too or just pointing to header is fine? > > I believe that referring to the header is fine, though personally I'd > prefer if we were explicit in binding docs. I'll leave Stephen to > comment on what the preferred way of handling header constants with > binding documents is. > OK. I will do both. >> >>> >>>> + >>>> +********** >>>> +bind_param >>>> +********** >>>> + >>>> +The bind_param node is a node to describe how cooling devices get assigned >>>> +to trip points of the zone. The cooling devices are expected to be loaded >>>> +in the target system. >>>> + >>>> +A node describing a bind_param must contain: >>>> +- cooling_device: A string with the cooling device name. >>> >>> Please use '-' rather than '_' in bindings. That's the normal devicetree >>> convention, and there's no point gonig against it and causing more >>> confusion. >> >> Agreed. >> >>> >>> What are valid values of this string, and what do they mean? >> >> These are the name of the cooling devices to match to. > > Which are what exactly? What constitutes a "cooling device"? > > Is "black-body radiation" [1] a valid cooling device? > > How about "copper heat sink"? > Those are example of cooling devices, yes. But in this context what is interesting is those devices that allow you to control the amount of dissipated power, i.e. operating frequency/voltage, lcd brightness, audio volume, radio transmission power, or if those that you burn power by turning them on but you remove heat out of your thermal zone, like a speed controllable fan. >> >>> >>> Why not a phandle + cells binding to cooling devices? >>> >> >> Yeah, this could work too. I am going to check how that would look like >> and come back to you. > > Ok. > >> >>>> +- weight: The 'influence' of a particular cooling device on this zone. >>>> + This is on a percentage scale. The sum of all these weights >>>> + (for a particular zone) cannot exceed 100. >>> >>> This is a bit fuzzy, and certainly needs more guidance. >> >> OK. This property is used to describe the amount of cooling contribution >> when activating a cooling device at a certain trip point. I will work on >> a better guidance. >> >>> >>>> +- trip_mask: This is a bit mask that gives the binding relation between >>>> + this thermal zone and cdev, for a particular trip point. >>>> + If nth bit is set, then the cdev and thermal zone are bound >>>> + for trip point n. >>> >>> What type is this? >>> >>> The nth bit from which end? >>> >>> Could you explain what "bound for trip point n" means? >>> >> >> It is just a way to match a cooling device with a set of trip points. >> Say you want to match one specific cooling device with trips 0 and 1, >> you would then create a bind_param with trip_mask = 0x3. >> >>> Just because this is a bitmask in Linux does not mean it needs to be so >>> in the dt. >> >> Indeed. >> >> In case we work with phandle + cell binging parameters, the trip and >> weight could be passed as params. >> >> cooling-devices = <&devX tripY weightZ>, <... >; > > Something like that could certainly work. I see. I will check that option. > >> >>> >>>> +- limits: An array integers consisting of 2-tuples items, and each item means >>>> + lower and upper state limits, like <min-state max-state>. >>>> + >>>> +************************** >>>> +temperature sensor devices >>>> +************************** >>>> + >>>> +Temperature sensor devices have to be linked to a specific thermal zone. >>>> +This is done by means of the 'monitored-zones' list. >>>> +- monitored-zones: A list of thermal zones phandles, identifying which >>>> +thermal zones the temperature sensor device is used to monitor. >>> >>> Why does this need to be described that way around? >>> >>> Can the zone not have a link to the sensor(s) it needs? >>> >> >> Do you see any issues with it? >> >> Idea is that sensor nodes would list which zones they are used. Same >> idea goes to cooling devices. It can be the other way around, yes. It >> depends on what is the best way to describe these relations. Idea would >> be so that sensors and cooling devices would be composing a thermal zone. > > I think that having a zone refer to the sensors that monitor it would be > preferable. Otherwise we're encoding that way a devices output is to be > consumed on the binding for the device itself, which feels odd (it's > arguable as to whether it's a property of the device, however). > I got confused now. How adding a monitored-zones property on sensor's nodes would imply that device's output is to be consumed on the binding for the device itself? >> >>>> + >>>> +************ >>>> +thermal_zone >>>> +************ >>>> + >>>> +The thermal_zone node is the node containing all the required info >>>> +for describing a thermal zone, including its cdev bindings. The thermal_zone >>>> +node must contain, apart from its own properties, one node containing >>>> +trip nodes and one node containing all the zone bind parameters. >>>> + >>>> +Required properties: >>>> +- mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable. >>> >>> What type is a bit string in devicetree? I'm aware of cells and (byte) >>> strings, but not bit strings. >>> >>> What does 'writeable' mean in this context? >>> >>> There's presumably a better name than 'mask'. >>> >> >> This is used to describe if trip points are writeable under sysfs. I >> think this is a OS implementation leak. I will be removing this from >> device tree and making them RO by default. > > Ok. > >> >>>> + >>>> +- passive_delay: number of milliseconds to wait between polls when >>>> + performing passive cooling. >>>> + >>>> +- polling_delay: number of milliseconds to wait between polls when checking. >>> >>> These sound very much like configuration of the driver rather than >>> hardware description. What if my temperature sensor can generate an >>> interrupt at some trigger temperature? Why would I need polling times? >>> >>> Why does this need to be in the dt at all? Surely the OS can choose some >>> sensible polling period, possibly dynamically? >> >> Well, let me answer this one with another question. How the OS will >> determine how fast is the temperature rise on a thermal zone described >> in device tree? > > How will the devicetree writer determine this? > > It's possible for the thermal management code to dynamically adjust the > rate between some extreme polling intervals based on how big a rise it > detects between polls or possibly something more complex taking into > account some metric for load. It is possible yes, by means of a couple of complex differential equations implementation. But still, determining the maximum dT/dt may require unnecessary polling on low activity use cases. Besides, people who do transient thermal simulation should have a better estimate on necessary polling requirement. I don't see why we should not use that handy info and impose hard software requirements. > >> >>> >>>> + >>>> +- #thermal-cells: An integer that is used to specify how many cells compose >>>> + sensor ids. >>> >>> I don't see why this is necessary. If the zone itself described its >>> related sensors rather than the other way around, we wouldn't need it at >>> all. Most bindings so far have consumers link to their providers rather >>> than the other way around. >>> >> >> I see. >> >> >>>> + >>>> +Below is an example: >>>> +cpu_thermal: thermal_zone { >>>> + #thermal-cells = <1>; >>>> + mask = <0x03>; /* trips writability */ >>>> + passive_delay = <250>; /* milliseconds */ >>>> + polling_delay = <1000>; /* milliseconds */ >>>> + trips { >>>> + alert@100000{ >>>> + temperature = <100000>; /* milliCelsius */ >>>> + hysteresis = <0>; /* milliCelsius */ >>>> + type = <THERMAL_TRIP_PASSIVE>; >>>> + }; >>>> + crit@125000{ >>>> + temperature = <125000>; /* milliCelsius */ >>>> + hysteresis = <0>; /* milliCelsius */ >>>> + type = <THERMAL_TRIP_CRITICAL>; >>>> + }; > > I didn't spot this on my first round, but the trips node will need > #address-cells and #size-cells, and the sub nodes unit-addresses should > match their reg properties. You mean, thermal zones must have address and size cells to describe trips? > >>>> + }; >>>> + bind_params { >>>> + action@0{ >>>> + cooling_device = "thermal-cpufreq"; >>> >>> NAK. That value was not defined anywhere, and represents Linux >>> infrastructure rather than hardware. >>> >> >> Yes, this was another leak as discussed above in the upper part of the >> doc file. >> >>> For this case, we could just as easily describe the thermal points, and >>> if no physical cooling device (e.g. a fan) is present, assume >>> cpufreq-base cooling. Other OSs could choose to do otherwise, and we >>> could change later... >>> >> >> The way other specifications do (aka ACPI) is to have trip types. In >> case a trip type is passive cooling, it means one is supposed to >> modulate the dissipated power rather than activating a device to remove >> heat (active cooling). >> >> The part of 'assuming' things that bothers me. I would rather have a way >> to really say one wants to have passive cooling at a specific trip point. > > I don't see the problem here. If you don't specify a device, your only > options will be some sort of passive management. But it does not need to be active all the time. You need to say when (in temperature domain) that is need. Besides, there are even other means of managing average power other than cpufreq cooling. idle injections is another passive cooling device, for instance. > >> >> >>>> + weight = <100>; /* percentage */ >>>> + mask = <0x01>; >>> >>> This should presumably be trip-mask (assuming my suggested corrections). >>> >>> What is it supposed to mean here? >> >> As stated above, it means the trip points that this binding represents. > > If you have more than 32 trip points (which is admittedly unlikely), how > do you represent them? I'm not sure a mask is the best representation > here, as it's difficult for a human to match up with a list of entries. > > Additionally, the trip points don't have a logical contiguous numbering > from zero, so how do the bits in the mask correspond to entries in the > list? This seems to rely on the order that the subnodes are in, which > AFAIK is not guaranteed. It would be far nicer to have an explicit > linkage. Do you mean some sort of list of trip points property inside binding params node? > >> >>> >>>> + limits = < >>>> + /* lower-state upper-state */ >>>> + 1 THERMAL_NO_LIMITS >>>> + THERMAL_NO_LIMITS THERMAL_NO_LIMITS >>>> + >; >>>> + }; >>>> + }; >>>> +}; >>>> + >>>> +sensor: adc@0xFFFF { >>>> + ... >>>> + monitored-zones = <&cpu_thermal 0>; >>>> +}; >>>> + >>>> +cpu@0 { >>>> + ... >>>> + cooling-zones = <&cpu_thermal>; >>>> +}; >>>> + >>>> +In the example above, the ADC sensor at address 0xFFFF is used to monitor >>>> +the zone 'cpu_thermal' using its the sensor 0. The cpu@0 device is also linked >>>> +to the same thermal zone 'cpu_thermal' as a cooling device. >>> >>> Huh? That seems to imply the the '0' in '<&cpu_thermal 0>' on the sensor >>> node describes a property of the sensor rather than the thermal node, >>> unless I've misunderstood? >>> >> >> It is essentially which sensor id is used to monitor a specific zone. In >> case of sensor IPs that feature several internal sensors, one needs to >> have a way to describe which sensor monitors which zone. Example of this >> type of IPs: lm90 (up to 3 sensors) and TI bandgap IP (can contain up to >> 5 internal sensors). The probe happens for the IP device and not for the >> internal sensors. > > Which is exactly why this should be the opposite way round, with thermal > zones referring to sensors in this fashion. > > The way you're suggesting means that the phandle+cells specifier refers > half to the consumer and half to the producer. Having thermal zones > refer to sensors would make this refer solely to the producer (the > sensor). > Are you suggesting having inside thermal zone node: sensors = <&sensor0 Y>, ...., <&sensorN Z> ; Where sensor0 and sensorN are sensor IPs nodes and Y and Z represent their internal sensor instances? >> >> >>>> + >>>> +*************** >>>> +cpufreq-cooling >>>> +*************** >>>> + >>>> +The generic cpu cooling (freq clipping) can be loaded by the >>>> +generic cpufreq-cpu0 driver in case the device tree node informs >>>> +this need. >>>> + >>>> +In order to load the cpufreq cooling device, it is possible to >>>> +inform that the cpu needs cooling by adding the list of thermal zones >>>> +in the 'cooling-zones' property at the cpu0 phandle. >>> >>> This should be reworded so as to describe the binding rather than the >>> Linux behaviour based on the binding. >> >> OK. >> >>> >>>> + >>>> +Example: >>>> + cpus { >>>> + cpu@0 { >>>> + operating-points = < >>>> + /* kHz uV */ >>>> + 800000 1313000 >>>> + 1008000 1375000 >>>> + >; >>>> + cooling-zones = <&cpu_thermal>; >>>> + }; >>>> + ... >>>> + }; >>>> + >>>> +The above will cause the cpufreq-cpu0 driver to understand that >>>> +the cpu will need passive cooling and while probing that node it will >>>> +also load the cpufreq cpu cooling device in that system. >>> >>> The above describe that the CPU is part of a cooling (thermal?) zone and >>> requires cooling. >> >> That says the cpu participates in a thermal zone as a cooling >> device/mechanism, thus cpu cooling (passive cooling) is required in >> target system. > > I think you need the rest of the nodes in this example. You don't show > whether or not there's a cooling-device, so it's not clear that the CPU > is required to be passively cooled... The property states the cpu is used as cooling device for those thermal zones in the list 'cooling-zones' > >> >>> >>>> + >>>> +However, be advised that this flag will not describe what needs to >>>> +be done or how the cpufreq cooling device will be used. In order to >>>> +accomplish this, one needs to write a phandle for a thermal zone, as >>>> +described in the section 'thermal_zone'. >>> >>> The above already has a phandle to a thermal (cooling?) zone. >>> >>> Please choose either cooling zone or thermal zone, having the two names >>> makes this more confusing than necessary. >> >> I see. Idea is to have thermal zone as a node. cooling-zone is just a >> list of thermal zones that the referred device will be used as cooling >> device. > > While I can see a distinction, the nomenclature is horrendously > confusing. OK. I will try harder to improve it. > >> >>> >>>> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig >>>> index 7fb16bc..753f0dc 100644 >>>> --- a/drivers/thermal/Kconfig >>>> +++ b/drivers/thermal/Kconfig >>>> @@ -29,6 +29,19 @@ config THERMAL_HWMON >>>> Say 'Y' here if you want all thermal sensors to >>>> have hwmon sysfs interface too. >>>> >>>> +config THERMAL_OF >>>> + bool >>>> + prompt "APIs to parse thermal data out of device tree" >>>> + depends on OF >>>> + default y >>>> + help >>>> + This options provides helpers to add the support to >>>> + read and parse thermal data definitions out of the >>>> + device tree blob. >>>> + >>>> + Say 'Y' here if you need to build thermal infrastructure >>>> + based on device tree. >>>> + >>>> choice >>>> prompt "Default Thermal governor" >>>> default THERMAL_DEFAULT_GOV_STEP_WISE >>>> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile >>>> index 24cb894..eedb273 100644 >>>> --- a/drivers/thermal/Makefile >>>> +++ b/drivers/thermal/Makefile >>>> @@ -7,6 +7,7 @@ thermal_sys-y += thermal_core.o >>>> >>>> # interface to/from other layers providing sensors >>>> thermal_sys-$(CONFIG_THERMAL_HWMON) += thermal_hwmon.o >>>> +thermal_sys-$(CONFIG_THERMAL_OF) += thermal_dt.o >>>> >>>> # governors >>>> thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o >>>> diff --git a/drivers/thermal/thermal_dt.c b/drivers/thermal/thermal_dt.c >>>> new file mode 100644 >>>> index 0000000..cc35485 >>>> --- /dev/null >>>> +++ b/drivers/thermal/thermal_dt.c >>>> @@ -0,0 +1,473 @@ >>>> +/* >>>> + * thermal_dt.c - Generic Thermal Management device tree support. >>>> + * >>>> + * Copyright (C) 2013 Texas Instruments >>>> + * Copyright (C) 2013 Eduardo Valentin <eduardo.valentin@ti.com> >>>> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>> + * >>>> + * This program is free software; you can redistribute it and/or modify >>>> + * it under the terms of the GNU General Public License as published by >>>> + * the Free Software Foundation; version 2 of the License. >>>> + * >>>> + * This program is distributed in the hope that it will be useful, but >>>> + * WITHOUT ANY WARRANTY; without even the implied warranty of >>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>>> + * General Public License for more details. >>>> + * >>>> + * You should have received a copy of the GNU General Public License along >>>> + * with this program; if not, write to the Free Software Foundation, Inc., >>>> + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. >>>> + * >>>> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>> + */ >>>> +#include <linux/thermal.h> >>>> +#include <linux/slab.h> >>>> +#include <linux/types.h> >>>> +#include <linux/of_device.h> >>>> +#include <linux/of_platform.h> >>>> +#include <linux/err.h> >>>> +#include <linux/export.h> >>>> +#include <linux/string.h> >>>> + >>>> +struct __thermal_bind_params { >>>> + char cooling_device[THERMAL_NAME_LENGTH]; >>>> +}; >>>> + >>>> +static >>>> +int thermal_of_match(struct thermal_zone_device *tz, >>>> + struct thermal_cooling_device *cdev); >>>> + >>>> +static int thermal_of_populate_bind_params(struct device *dev, >>>> + struct device_node *node, >>>> + struct __thermal_bind_params *__tbp, >>>> + struct thermal_bind_params *tbp, >>>> + int ntrips) >>>> +{ >>>> + const char *cdev_name; >>>> + u32 *limits; >>>> + int ret; >>>> + u32 prop; >>>> + >>>> + ret = of_property_read_u32(node, "weight", &prop); >>>> + if (ret < 0) { >>>> + dev_err(dev, "missing weight property\n"); >>>> + return ret; >>>> + } >>>> + tbp->weight = prop; >>>> + >>>> + ret = of_property_read_u32(node, "mask", &prop); >>>> + if (ret < 0) { >>>> + dev_err(dev, "missing mask property\n"); >>>> + return ret; >>>> + } >>>> + tbp->trip_mask = prop; >>>> + >>>> + /* this will allow us to bind with cooling devices */ >>>> + tbp->match = thermal_of_match; >>>> + >>>> + ret = of_property_read_string(node, "cooling_device", &cdev_name); >>>> + if (ret < 0) { >>>> + dev_err(dev, "missing cooling_device property\n"); >>>> + return ret; >>>> + } >>>> + strncpy(__tbp->cooling_device, cdev_name, >>>> + sizeof(__tbp->cooling_device)); >>>> + >>>> + limits = kzalloc(ntrips * sizeof(*limits) * 2, GFP_KERNEL); >>> >>> Why do you use kzalloc here, but devm_kzalloc elsewhere? That'll lead to >>> problems as the driver gets refactored over time. >> >> Because this memory is used only within this function and thus released >> at the bottom of it. >> >>> >>>> + if (!limits) { >>>> + dev_err(dev, "not enough mem for reading limits\n"); >>>> + return -ENOMEM; >>>> + } >>>> + ret = of_property_read_u32_array(node, "limits", limits, 2 * ntrips); >>>> + if (!ret) { >>>> + int i = ntrips; >>>> + >>>> + tbp->binding_limits = devm_kzalloc(dev, >>>> + ntrips * 2 * >>>> + sizeof(*tbp->binding_limits), >>>> + GFP_KERNEL); >>>> + if (!tbp->binding_limits) { >>>> + dev_err(dev, "not enough mem for storing limits\n"); >>>> + return -ENOMEM; >>>> + } >>>> + while (i--) >>>> + tbp->binding_limits[i] = limits[i]; >>>> + } >>>> + kfree(limits); >> >> here. > > I saw that, I did say that this may lead to problems in future if the > function's reorganised, not now. :) Ahh.. OK. I just didnt want o let the memory be managed and freed only when the struct device is release. > >> >>>> + >>>> + return 0; >>> >>> What if you failed to read the array? Surely you should return an error? >> >> Attempting to make this property not mandatory. > > That wasn't clear from the binding, and now I've re-read it, I'm still > not entirely clear on what the limits represent, and why you may have > multiple sets of pairs. Because you may assign cooling state limits to specific trip points, e.g., a fan with 5 different speeds would operate from speed 0 to 2 in the first trip point, while from speed 3 to 4 on last trip point. > >> >>> >>>> +} >>>> + >>>> +struct __thermal_trip { >>>> + unsigned long int temperature; >>>> + unsigned long int hysteresis; >>>> + enum thermal_trip_type type; >>>> +}; >>>> + >>>> +static >>>> +int thermal_of_populate_trip(struct device *dev, >>>> + struct device_node *node, >>>> + struct __thermal_trip *trip) >>>> +{ >>>> + int ret; >>>> + int prop; >>>> + >>>> + ret = of_property_read_u32(node, "temperature", &prop); >>>> + if (ret < 0) { >>>> + dev_err(dev, "missing temperature property\n"); >>>> + return ret; >>>> + } >>>> + trip->temperature = prop; >>>> + >>>> + ret = of_property_read_u32(node, "hysteresis", &prop); >>>> + if (ret < 0) { >>>> + dev_err(dev, "missing hysteresis property\n"); >>>> + return ret; >>>> + } >>>> + trip->hysteresis = prop; >>>> + >>>> + ret = of_property_read_u32(node, "type", &prop); >>>> + if (ret < 0) { >>>> + dev_err(dev, "missing type property\n"); >>>> + return ret; >>>> + } >>>> + trip->type = prop; >>> >>> This will require sanity checking. >> >> OK. >> >>> >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +struct __thermal_zone_device { >>>> + enum thermal_device_mode mode; >>>> + int passive_delay; >>>> + int polling_delay; >>>> + int mask; >>>> + int ntrips; >>>> + char type[THERMAL_NAME_LENGTH]; >>>> + struct __thermal_trip *trips; >>>> + struct __thermal_bind_params *bind_params; >>>> + struct thermal_bind_params *tbps; >>>> + struct thermal_zone_params zone_params; >>>> + int (*get_temp)(void *, unsigned long *); >>>> + void *devdata; >>>> +}; >>>> + >>>> +static struct __thermal_zone_device * >>>> +thermal_of_build_thermal_zone(struct device *dev, struct device_node *node) >>>> +{ >>>> + struct device_node *child, *gchild; >>>> + struct __thermal_zone_device *tz; >>>> + int ret, i; >>>> + u32 prop; >>>> + >>>> + if (!node) { >>>> + dev_err(dev, "no thermal zone node\n"); >>>> + return ERR_PTR(-EINVAL); >>>> + } >>>> + >>>> + tz = devm_kzalloc(dev, sizeof(*tz), GFP_KERNEL); >>>> + if (!tz) { >>>> + dev_err(dev, "not enough memory for thermal of zone\n"); >>>> + return ERR_PTR(-ENOMEM); >>>> + } >>>> + >>>> + ret = of_property_read_u32(node, "passive_delay", &prop); >>>> + if (ret < 0) { >>>> + dev_err(dev, "missing passive_delay property\n"); >>>> + return ERR_PTR(ret); >>>> + } >>>> + tz->passive_delay = prop; >>>> + >>>> + ret = of_property_read_u32(node, "polling_delay", &prop); >>>> + if (ret < 0) { >>>> + dev_err(dev, "missing polling_delay property\n"); >>>> + return ERR_PTR(ret); >>>> + } >>>> + tz->polling_delay = prop; >>>> + >>>> + ret = of_property_read_u32(node, "mask", &prop); >>>> + if (ret < 0) { >>>> + dev_err(dev, "missing mask property\n"); >>>> + return ERR_PTR(ret); >>>> + } >>>> + tz->mask = prop; >>>> + >>>> + /* assume node name as our zone type */ >>>> + strncpy(tz->type, node->name, sizeof(tz->type)); >>>> + >>>> + /* default policy */ >>>> + strncpy(tz->zone_params.governor_name, "step_wise", >>>> + sizeof(tz->zone_params.governor_name)); >>>> + >>>> + /* trips */ >>>> + child = of_find_node_by_name(node, "trips"); >>> >>> That might not be a child node, as of_find_node_by_name walks over the >>> list of all nodes. Use of_get_child_by_name. >> >> Right! This is a bug. >> >>> >>>> + >>>> + /* No trips provided */ >>>> + if (!child) >>>> + goto finish; >>>> + >>>> + tz->ntrips = of_get_child_count(child); >>>> + tz->trips = devm_kzalloc(dev, tz->ntrips * sizeof(*tz->trips), >>>> + GFP_KERNEL); >>>> + if (!tz->trips) >>>> + return ERR_PTR(-ENOMEM); >>>> + i = 0; >>>> + for_each_child_of_node(child, gchild) >>>> + thermal_of_populate_trip(dev, gchild, &tz->trips[i++]); >>> >>> What if a child node was not a valid trip node? It seems >>> thermal_of_populate_trip returns errors you're ignoring. >> >> OK. Adding proper treatment. >> >>> >>>> + >>>> + /* bind_params */ >>>> + child = of_find_node_by_name(node, "bind_params"); >>> >>> Another of_get_child_by_name candidate. >> >> OK. >> >>> >>>> + >>>> + /* No binding info */ >>>> + if (!child) >>>> + goto finish; >>>> + >>>> + tz->zone_params.num_tbps = of_get_child_count(child); >>>> + tz->bind_params = devm_kzalloc(dev, >>>> + tz->zone_params.num_tbps * >>>> + sizeof(*tz->bind_params), >>>> + GFP_KERNEL); >>>> + if (!tz->bind_params) >>>> + return ERR_PTR(-ENOMEM); >>>> + tz->zone_params.tbp = devm_kzalloc(dev, >>>> + tz->zone_params.num_tbps * >>>> + sizeof(*tz->zone_params.tbp), >>>> + GFP_KERNEL); >>>> + if (!tz->zone_params.tbp) >>>> + return ERR_PTR(-ENOMEM); >>>> + i = 0; >>>> + for_each_child_of_node(child, gchild) { >>>> + thermal_of_populate_bind_params(dev, gchild, >>>> + &tz->bind_params[i], >>>> + &tz->zone_params.tbp[i], >>>> + tz->ntrips); >>> >>> Similarly, you're ignoring error conditions here. What if a bind_params >>> node was malformed? >> >> >> Adding proper error handling. >> >>> >>>> + i++; >>>> + } >>>> + >>>> +finish: >>>> + tz->mode = THERMAL_DEVICE_ENABLED; >>>> + >>>> + return tz; >>>> +} >>>> + >>>> +static >>>> +int thermal_of_match(struct thermal_zone_device *tz, >>>> + struct thermal_cooling_device *cdev) >>>> +{ >>>> + struct __thermal_zone_device *data = tz->devdata; >>>> + int i; >>>> + >>>> + for (i = 0; i < data->zone_params.num_tbps; i++) { >>>> + if (!strncmp(data->bind_params[i].cooling_device, >>>> + cdev->type, >>>> + strlen(data->bind_params[i].cooling_device)) && >>>> + (data->zone_params.tbp[i].trip_mask & (1 << i))) >>>> + return 0; >>>> + } >>>> + >>>> + return -EINVAL; >>>> +} >>> >>> I'm not keen on binding to cooling devices using a string, as suddenly a >>> name to allow humans to inspect the system is turned into an ABI. >> >> OK. We need to align on how we describe the binding then. > > Definitely. We need to ensure that we fully define the relationship too > cooling devices, and don't suddenly expose some piece of internal Linux > infrastructure to the world. > Yeah, agreed, any suggestions? >> >>> >>>> + >>>> +static >>>> +int of_thermal_get_temp(struct thermal_zone_device *tz, >>>> + unsigned long *temp) >>>> +{ >>>> + struct __thermal_zone_device *data = tz->devdata; >>>> + >>>> + return data->get_temp(data->devdata, temp); >>>> +} >>>> + >>>> +static >>>> +int of_thermal_get_mode(struct thermal_zone_device *tz, >>>> + enum thermal_device_mode *mode) >>>> +{ >>>> + struct __thermal_zone_device *data = tz->devdata; >>>> + >>>> + *mode = data->mode; >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +static >>>> +int of_thermal_set_mode(struct thermal_zone_device *tz, >>>> + enum thermal_device_mode mode) >>>> +{ >>>> + struct __thermal_zone_device *data = tz->devdata; >>>> + >>>> + mutex_lock(&tz->lock); >>>> + >>>> + if (mode == THERMAL_DEVICE_ENABLED) >>>> + tz->polling_delay = data->polling_delay; >>>> + else >>>> + tz->polling_delay = 0; >>>> + >>>> + mutex_unlock(&tz->lock); >>>> + >>>> + data->mode = mode; >>>> + thermal_zone_device_update(tz); >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +static >>>> +int of_thermal_get_trip_type(struct thermal_zone_device *tz, int trip, >>>> + enum thermal_trip_type *type) >>>> +{ >>>> + struct __thermal_zone_device *data = tz->devdata; >>>> + >>>> + if (trip >= data->ntrips || trip < 0) >>>> + return -EDOM; >>> >>> It's far more common to use -EINVAL. >> >> hmm.. OK. >> >>> >>>> + >>>> + *type = data->trips[trip].type; >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +static >>>> +int of_thermal_get_trip_temp(struct thermal_zone_device *tz, int trip, >>>> + unsigned long *temp) >>>> +{ >>>> + struct __thermal_zone_device *data = tz->devdata; >>>> + >>>> + if (trip >= data->ntrips || trip < 0) >>>> + return -EDOM; >>>> + >>>> + *temp = data->trips[trip].temperature; >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +static >>>> +int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip, >>>> + unsigned long temp) >>>> +{ >>>> + struct __thermal_zone_device *data = tz->devdata; >>>> + >>>> + if (trip >= data->ntrips || trip < 0) >>>> + return -EDOM; >>>> + >>>> + /* thermal fw should take care of data->mask & (1 << trip) */ >>>> + data->trips[trip].temperature = temp; >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +static >>>> +int of_thermal_get_trip_hyst(struct thermal_zone_device *tz, int trip, >>>> + unsigned long *hyst) >>>> +{ >>>> + struct __thermal_zone_device *data = tz->devdata; >>>> + >>>> + if (trip >= data->ntrips || trip < 0) >>>> + return -EDOM; >>>> + >>>> + *hyst = data->trips[trip].hysteresis; >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +static >>>> +int of_thermal_set_trip_hyst(struct thermal_zone_device *tz, int trip, >>>> + unsigned long hyst) >>>> +{ >>>> + struct __thermal_zone_device *data = tz->devdata; >>>> + >>>> + if (trip >= data->ntrips || trip < 0) >>>> + return -EDOM; >>>> + >>>> + /* thermal fw should take care of data->mask & (1 << trip) */ >>> >>> Could you elaborate on that comment? >> >> In case the trip is not assigned to be writable, this function won't be >> called. > > Ah. I misunderstood "fw" as "firmware", rather than "framework", and got > confused. It would be nice if you could expand "fw" to make this > clearer. :) Sorry for that :-) > > Thanks, > Mark. > > [1] http://en.wikipedia.org/wiki/Black-body_radiation > >
Mark, Pawel and Stephen, On 27-08-2013 14:17, Eduardo Valentin wrote: > On 27-08-2013 12:23, Mark Rutland wrote: >> On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote: >>> Hello Mark, <cut> I believe now we need to align on thermal bindings, before I propose next version of this series. So, following the discussions on this thread, I believe a typical CPU thermal zone would look like the following: + #include <dt-bindings/thermal/thermal.h> + + cpu_thermal: cpu_thermal { + passive-delay = <250>; /* milliseconds */ + polling-delay = <1000>; /* milliseconds */ + /* + * sensor ID + */ + sensors = <&bandgap0 0>; + /* + * cooling + * device Usage Trip lower bound upper bound + */ + cooling-devices = <&cpu0 100 &cpu-alert THERMAL_NO_LIMITS THERMAL_NO_LIMITS>; + trips { + cpu-alert: cpu-alert { + temperature = <100000>; /* milliCelsius */ + hysteresis = <2000>; /* milliCelsius */ + type = <THERMAL_TRIP_PASSIVE>; + }; + cpu-crit: cpu-crit { + temperature = <125000>; /* milliCelsius */ + hysteresis = <2000>; /* milliCelsius */ + type = <THERMAL_TRIP_CRITICAL>; + }; + }; + }; *Note: THERMAL_NO_LIMIT means, it will be using the cooling device minimum and maximum limits. Couple of comments. 1. I am keeping the pooling intervals. A possible alternative way to describe that would be specifying the maximum dT/dt. Essentially because I am still convinced that this is part of hw specs. 2. The above follows your suggestion to use consumer pointing to producers. Although, I still need to figure out how this could be implemented for Linux. But I think that is another story, at least for now. We need to first align on the DT binding itself. 3. The link from cooling device specification to trip points, from my perspective, should be enough, and thus we shall not need the thermal cells and size for trip points, as you proposed. Let me know what you think. Below is another example, on a more complex scenario, board specific case (hypothetical, just for exemplification): + #include <dt-bindings/thermal/thermal.h> + + board_thermal: board_thermal { + passive-delay = <1000>; /* milliseconds */ + polling-delay = <2500>; /* milliseconds */ + /* + * sensor ID + */ + sensors = <&adc-dummy 0>, + <&adc-dummy 1>, + <&adc-dymmy 2>; + /* + * cooling + * device Usage Trip lower upper + */ + cooling-devices = <&cpu0 75 &cpu-trip 0 2>, + <&gpu0 40 &gpu-trip 0 2>; + <&lcd0 25 &lcd-trip 5 10>; + trips { + cpu-trip: cpu-trip { + temperature = <60000>; /* milliCelsius */ + hysteresis = <2000>; /* milliCelsius */ + type = <THERMAL_TRIP_PASSIVE>; + }; + gpu-trip: gpu-trip { + temperature = <55000>; /* milliCelsius */ + hysteresis = <2000>; /* milliCelsius */ + type = <THERMAL_TRIP_PASSIVE>; + } + lcd-trip: lcp-trip { + temperature = <53000>; /* milliCelsius */ + hysteresis = <2000>; /* milliCelsius */ + type = <THERMAL_TRIP_PASSIVE>; + }; + crit-trip: crit-trip { + temperature = <68000>; /* milliCelsius */ + hysteresis = <2000>; /* milliCelsius */ + type = <THERMAL_TRIP_CRITICAL>; + }; + }; + }; Now writing the above example, one might want to have a way to say the sensor correlation equation. Another example: + #include <dt-bindings/thermal/thermal.h> + + dsp_thermal: dsp_thermal { + passive-delay = <250>; /* milliseconds */ + polling-delay = <1000>; /* milliseconds */ + /* + * sensor ID + */ + sensors = <&bandgap0 1>; + /* + * cooling + * device Usage Trip lower bound upper bound + */ + cooling-devices = <&dsp0 100 &dsp-alert THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + trips { + dsp-alert: dsp-alert { + temperature = <100000>; /* milliCelsius */ + hysteresis = <2000>; /* milliCelsius */ + type = <THERMAL_TRIP_PASSIVE>; + }; + dsp-crit: cdsp-crit { + temperature = <125000>; /* milliCelsius */ + hysteresis = <2000>; /* milliCelsius */ + type = <THERMAL_TRIP_CRITICAL>; + }; + }; + }; + + gpu_thermal: gpu_thermal { + passive-delay = <500>; /* milliseconds */ + polling-delay = <1000>; /* milliseconds */ + /* + * sensor ID + */ + sensors = <&bandgap0 2>; + /* + * cooling + * device Usage Trip lower bound upper bound + */ + cooling-devices = <&gpu0 100 &gpu-alert THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + trips { + gpu-alert: gpu-alert { + temperature = <100000>; /* milliCelsius */ + hysteresis = <2000>; /* milliCelsius */ + type = <THERMAL_TRIP_PASSIVE>; + }; + gpu-crit: gpu-crit { + temperature = <125000>; /* milliCelsius */ + hysteresis = <2000>; /* milliCelsius */ + type = <THERMAL_TRIP_CRITICAL>; + }; + }; + }; Wei, I think the above would cover for the IPs with multiple internal sensors cases you were looking for, right? Durga, please also check if I am missing something to cover for your plans to, in case you will be using DT to describe your HW. Anyways, Mark and Pawel, let me know if I missed something from our discussion. I believe the above bindings would look more like regular standard DT bindings. And also they should be now slim enough, without Linux implementation details and also without policies. > >
On 08/30/2013 07:19 AM, Eduardo Valentin wrote: > * PGP Signed: 08/29/2013 at 04:19:46 PM > > Mark, Pawel and Stephen, > > > On 27-08-2013 14:17, Eduardo Valentin wrote: >> On 27-08-2013 12:23, Mark Rutland wrote: >>> On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote: >>>> Hello Mark, > > <cut> > > I believe now we need to align on thermal bindings, before I propose > next version of this series. So, following the discussions on this > thread, I believe a typical CPU thermal zone would look like the following: > + #include <dt-bindings/thermal/thermal.h> > + > + cpu_thermal: cpu_thermal { > + passive-delay = <250>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + /* > + * sensor ID > + */ > + sensors = <&bandgap0 0>; > + /* > + * cooling > + * device Usage Trip lower > bound upper bound > + */ > + cooling-devices = <&cpu0 100 &cpu-alert > THERMAL_NO_LIMITS THERMAL_NO_LIMITS>; > + trips { > + cpu-alert: cpu-alert { > + temperature = <100000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + cpu-crit: cpu-crit { > + temperature = <125000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + }; > > *Note: THERMAL_NO_LIMIT means, it will be using the cooling device > minimum and maximum limits. > > Couple of comments. > 1. I am keeping the pooling intervals. A possible alternative way to > describe that would be specifying the maximum dT/dt. Essentially because > I am still convinced that this is part of hw specs. > > 2. The above follows your suggestion to use consumer pointing to > producers. Although, I still need to figure out how this could be > implemented for Linux. But I think that is another story, at least for > now. We need to first align on the DT binding itself. > > 3. The link from cooling device specification to trip points, from my > perspective, should be enough, and thus we shall not need the thermal > cells and size for trip points, as you proposed. Let me know what you think. > > Below is another example, on a more complex scenario, board specific > case (hypothetical, just for exemplification): > > + #include <dt-bindings/thermal/thermal.h> > + > + board_thermal: board_thermal { > + passive-delay = <1000>; /* milliseconds */ > + polling-delay = <2500>; /* milliseconds */ > + /* > + * sensor ID > + */ > + sensors = <&adc-dummy 0>, > + <&adc-dummy 1>, > + <&adc-dymmy 2>; > + /* > + * cooling > + * device Usage Trip lower upper > + */ > + cooling-devices = <&cpu0 75 &cpu-trip 0 2>, > + <&gpu0 40 &gpu-trip 0 2>; > + <&lcd0 25 &lcd-trip 5 10>; > + trips { > + cpu-trip: cpu-trip { > + temperature = <60000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + gpu-trip: gpu-trip { > + temperature = <55000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + } > + lcd-trip: lcp-trip { > + temperature = <53000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + crit-trip: crit-trip { > + temperature = <68000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + }; > > Now writing the above example, one might want to have a way to say the > sensor correlation equation. > > Another example: > + #include <dt-bindings/thermal/thermal.h> > + > + dsp_thermal: dsp_thermal { > + passive-delay = <250>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + /* > + * sensor ID > + */ > + sensors = <&bandgap0 1>; > + /* > + * cooling > + * device Usage Trip lower > bound upper bound > + */ > + cooling-devices = <&dsp0 100 &dsp-alert > THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + trips { > + dsp-alert: dsp-alert { > + temperature = <100000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + dsp-crit: cdsp-crit { > + temperature = <125000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + }; > + > + gpu_thermal: gpu_thermal { > + passive-delay = <500>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + /* > + * sensor ID > + */ > + sensors = <&bandgap0 2>; > + /* > + * cooling > + * device Usage Trip lower > bound upper bound > + */ > + cooling-devices = <&gpu0 100 &gpu-alert > THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + trips { > + gpu-alert: gpu-alert { > + temperature = <100000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + gpu-crit: gpu-crit { > + temperature = <125000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + }; > > > Wei, I think the above would cover for the IPs with multiple internal > sensors cases you were looking for, right? Yes, I think so, and this series is what I'm looking for, thanks again. > > Durga, please also check if I am missing something to cover for your > plans to, in case you will be using DT to describe your HW. > > Anyways, Mark and Pawel, let me know if I missed something from our > discussion. I believe the above bindings would look more like regular > standard DT bindings. And also they should be now slim enough, without > Linux implementation details and also without policies. > >> >> > > -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 29-08-2013 19:19, Eduardo Valentin wrote: > Mark, Pawel and Stephen, > > > On 27-08-2013 14:17, Eduardo Valentin wrote: >> On 27-08-2013 12:23, Mark Rutland wrote: >>> On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote: >>>> Hello Mark, > > <cut> > > I believe now we need to align on thermal bindings, before I propose > next version of this series. So, following the discussions on this > thread, I believe a typical CPU thermal zone would look like the following: > + #include <dt-bindings/thermal/thermal.h> > + > + cpu_thermal: cpu_thermal { > + passive-delay = <250>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + /* > + * sensor ID > + */ > + sensors = <&bandgap0 0>; > + /* > + * cooling > + * device Usage Trip lower > bound upper bound > + */ > + cooling-devices = <&cpu0 100 &cpu-alert > THERMAL_NO_LIMITS THERMAL_NO_LIMITS>; > + trips { > + cpu-alert: cpu-alert { > + temperature = <100000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + cpu-crit: cpu-crit { > + temperature = <125000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + }; > > *Note: THERMAL_NO_LIMIT means, it will be using the cooling device > minimum and maximum limits. > > Couple of comments. > 1. I am keeping the pooling intervals. A possible alternative way to > describe that would be specifying the maximum dT/dt. Essentially because > I am still convinced that this is part of hw specs. > > 2. The above follows your suggestion to use consumer pointing to > producers. Although, I still need to figure out how this could be > implemented for Linux. But I think that is another story, at least for > now. We need to first align on the DT binding itself. > > 3. The link from cooling device specification to trip points, from my > perspective, should be enough, and thus we shall not need the thermal > cells and size for trip points, as you proposed. Let me know what you think. > > Below is another example, on a more complex scenario, board specific > case (hypothetical, just for exemplification): > > + #include <dt-bindings/thermal/thermal.h> > + > + board_thermal: board_thermal { > + passive-delay = <1000>; /* milliseconds */ > + polling-delay = <2500>; /* milliseconds */ > + /* > + * sensor ID > + */ > + sensors = <&adc-dummy 0>, > + <&adc-dummy 1>, > + <&adc-dymmy 2>; > + /* > + * cooling > + * device Usage Trip lower upper > + */ > + cooling-devices = <&cpu0 75 &cpu-trip 0 2>, > + <&gpu0 40 &gpu-trip 0 2>; > + <&lcd0 25 &lcd-trip 5 10>; > + trips { > + cpu-trip: cpu-trip { > + temperature = <60000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + gpu-trip: gpu-trip { > + temperature = <55000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + } > + lcd-trip: lcp-trip { > + temperature = <53000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + crit-trip: crit-trip { > + temperature = <68000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + }; > > Now writing the above example, one might want to have a way to say the > sensor correlation equation. > > Another example: > + #include <dt-bindings/thermal/thermal.h> > + > + dsp_thermal: dsp_thermal { > + passive-delay = <250>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + /* > + * sensor ID > + */ > + sensors = <&bandgap0 1>; > + /* > + * cooling > + * device Usage Trip lower > bound upper bound > + */ > + cooling-devices = <&dsp0 100 &dsp-alert > THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + trips { > + dsp-alert: dsp-alert { > + temperature = <100000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + dsp-crit: cdsp-crit { > + temperature = <125000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + }; > + > + gpu_thermal: gpu_thermal { > + passive-delay = <500>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + /* > + * sensor ID > + */ > + sensors = <&bandgap0 2>; > + /* > + * cooling > + * device Usage Trip lower > bound upper bound > + */ > + cooling-devices = <&gpu0 100 &gpu-alert > THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + trips { > + gpu-alert: gpu-alert { > + temperature = <100000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + gpu-crit: gpu-crit { > + temperature = <125000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + }; > > > Wei, I think the above would cover for the IPs with multiple internal > sensors cases you were looking for, right? > > Durga, please also check if I am missing something to cover for your > plans to, in case you will be using DT to describe your HW. > > Anyways, Mark and Pawel, let me know if I missed something from our > discussion. I believe the above bindings would look more like regular > standard DT bindings. And also they should be now slim enough, without > Linux implementation details and also without policies. Any objections on the above binding proposal? > >> >> > >
On Fri, Aug 30, 2013 at 12:19:43AM +0100, Eduardo Valentin wrote: > Mark, Pawel and Stephen, > > > On 27-08-2013 14:17, Eduardo Valentin wrote: > > On 27-08-2013 12:23, Mark Rutland wrote: > >> On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote: > >>> Hello Mark, Hi, sorry for the delay. > > <cut> > > I believe now we need to align on thermal bindings, before I propose > next version of this series. So, following the discussions on this > thread, I believe a typical CPU thermal zone would look like the following: > + #include <dt-bindings/thermal/thermal.h> > + > + cpu_thermal: cpu_thermal { > + passive-delay = <250>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ I'm still not keen on placing these intervals in the dt. Why can the kernel not choose sensible values? > + /* > + * sensor ID > + */ > + sensors = <&bandgap0 0>; > + /* > + * cooling > + * device Usage Trip lower > bound upper bound > + */ > + cooling-devices = <&cpu0 100 &cpu-alert > THERMAL_NO_LIMITS THERMAL_NO_LIMITS>; I'm not sure it makes sense to group this data within a property. We might need to extend this in future and it might not be possible to extend this unambiguously. It may make more sense as a sub-node or collection of properties. > + trips { > + cpu-alert: cpu-alert { > + temperature = <100000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + cpu-crit: cpu-crit { > + temperature = <125000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + }; How do the upper and lower bounds in the cooling-devices property interact with temperatures described in the trips node? > > *Note: THERMAL_NO_LIMIT means, it will be using the cooling device > minimum and maximum limits. I'm confused what are the cooling device min and max limits? Where are they defined if not in the lower bound and upper bound entries of the cooling-devices property? > > Couple of comments. > 1. I am keeping the pooling intervals. A possible alternative way to > describe that would be specifying the maximum dT/dt. Essentially because > I am still convinced that this is part of hw specs. I still don't see why it's not possible to do this dynamically. I am not entirely convinced this is part of the hw spec. > > 2. The above follows your suggestion to use consumer pointing to > producers. Although, I still need to figure out how this could be > implemented for Linux. But I think that is another story, at least for > now. We need to first align on the DT binding itself. I assume you mean the sensors property? That looks fine to me, though we may need to specify the way sensors relate to temperatures in trip points and cooling device properties -- do those temperatures correspond to an average of sensors readings, min/max, etc? > > 3. The link from cooling device specification to trip points, from my > perspective, should be enough, and thus we shall not need the thermal > cells and size for trip points, as you proposed. Let me know what you think. I'm not keen on the mixed bag of values in the cooling-device property, as it's difficult to extend in future if necessary. I think we may need cells for cooling devices -- what if a single fan controller controls fans in different thermal zones? I'm also not keen on devices being described as their own cooling device, as I think this is more difficult to support than not listing a cooling device -- it'll have to be special-cased anyway. > Below is another example, on a more complex scenario, board specific > case (hypothetical, just for exemplification): > > + #include <dt-bindings/thermal/thermal.h> > + > + board_thermal: board_thermal { > + passive-delay = <1000>; /* milliseconds */ > + polling-delay = <2500>; /* milliseconds */ > + /* > + * sensor ID > + */ > + sensors = <&adc-dummy 0>, > + <&adc-dummy 1>, > + <&adc-dymmy 2>; Here we have a set of sensors, but no relationship between these sensors and the trips or cooling-devices are described. How does that work? > + /* > + * cooling > + * device Usage Trip lower upper > + */ > + cooling-devices = <&cpu0 75 &cpu-trip 0 2>, > + <&gpu0 40 &gpu-trip 0 2>; > + <&lcd0 25 &lcd-trip 5 10>; > + trips { > + cpu-trip: cpu-trip { > + temperature = <60000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + gpu-trip: gpu-trip { > + temperature = <55000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + } > + lcd-trip: lcp-trip { > + temperature = <53000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + crit-trip: crit-trip { > + temperature = <68000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + }; > > Now writing the above example, one might want to have a way to say the > sensor correlation equation. > > Another example: > + #include <dt-bindings/thermal/thermal.h> > + > + dsp_thermal: dsp_thermal { > + passive-delay = <250>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + /* > + * sensor ID > + */ > + sensors = <&bandgap0 1>; > + /* > + * cooling > + * device Usage Trip lower > bound upper bound > + */ > + cooling-devices = <&dsp0 100 &dsp-alert > THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + trips { > + dsp-alert: dsp-alert { > + temperature = <100000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + dsp-crit: cdsp-crit { > + temperature = <125000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + }; > + > + gpu_thermal: gpu_thermal { > + passive-delay = <500>; /* milliseconds */ > + polling-delay = <1000>; /* milliseconds */ > + /* > + * sensor ID > + */ > + sensors = <&bandgap0 2>; > + /* > + * cooling > + * device Usage Trip lower > bound upper bound > + */ > + cooling-devices = <&gpu0 100 &gpu-alert > THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > + trips { > + gpu-alert: gpu-alert { > + temperature = <100000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_PASSIVE>; > + }; > + gpu-crit: gpu-crit { > + temperature = <125000>; /* milliCelsius */ > + hysteresis = <2000>; /* milliCelsius */ > + type = <THERMAL_TRIP_CRITICAL>; > + }; > + }; > + }; > > > Wei, I think the above would cover for the IPs with multiple internal > sensors cases you were looking for, right? > > Durga, please also check if I am missing something to cover for your > plans to, in case you will be using DT to describe your HW. > > Anyways, Mark and Pawel, let me know if I missed something from our > discussion. I believe the above bindings would look more like regular > standard DT bindings. And also they should be now slim enough, without > Linux implementation details and also without policies. I think that the above can describe that, but I'd like to see a binding document so we can consider it in more detail. Thanks, Mark. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Mark, On 03-09-2013 09:15, Mark Rutland wrote: > On Fri, Aug 30, 2013 at 12:19:43AM +0100, Eduardo Valentin wrote: >> Mark, Pawel and Stephen, >> >> >> On 27-08-2013 14:17, Eduardo Valentin wrote: >>> On 27-08-2013 12:23, Mark Rutland wrote: >>>> On Tue, Aug 27, 2013 at 02:44:40PM +0100, Eduardo Valentin wrote: >>>>> Hello Mark, > > Hi, sorry for the delay. > No issues, >> >> <cut> >> >> I believe now we need to align on thermal bindings, before I propose >> next version of this series. So, following the discussions on this >> thread, I believe a typical CPU thermal zone would look like the following: >> + #include <dt-bindings/thermal/thermal.h> >> + >> + cpu_thermal: cpu_thermal { >> + passive-delay = <250>; /* milliseconds */ >> + polling-delay = <1000>; /* milliseconds */ > > I'm still not keen on placing these intervals in the dt. Why can the > kernel not choose sensible values? > >> + /* >> + * sensor ID >> + */ >> + sensors = <&bandgap0 0>; >> + /* >> + * cooling >> + * device Usage Trip lower >> bound upper bound >> + */ >> + cooling-devices = <&cpu0 100 &cpu-alert >> THERMAL_NO_LIMITS THERMAL_NO_LIMITS>; > > I'm not sure it makes sense to group this data within a property. We > might need to extend this in future and it might not be possible to > extend this unambiguously. It may make more sense as a sub-node or > collection of properties. > I see. Something like the previous DT binding? My previous proposal was using sub-nodes. >> + trips { >> + cpu-alert: cpu-alert { >> + temperature = <100000>; /* milliCelsius */ >> + hysteresis = <2000>; /* milliCelsius */ >> + type = <THERMAL_TRIP_PASSIVE>; >> + }; >> + cpu-crit: cpu-crit { >> + temperature = <125000>; /* milliCelsius */ >> + hysteresis = <2000>; /* milliCelsius */ >> + type = <THERMAL_TRIP_CRITICAL>; >> + }; >> + }; >> + }; > > How do the upper and lower bounds in the cooling-devices property > interact with temperatures described in the trips node? By means of the trip phandle. It would mean once the trip is crossed, it is expected to modulate the cooling device between the specified limits. > >> >> *Note: THERMAL_NO_LIMIT means, it will be using the cooling device >> minimum and maximum limits. > > I'm confused what are the cooling device min and max limits? Where are > they defined if not in the lower bound and upper bound entries of the > cooling-devices property? They were not specified in this example, because previous discussions suggested that we shall not represent cooling devices as dt nodes as they are virtual. Those would be part of the cooling device node, in the case we want to have cooling devices nodes and property in DT. > >> >> Couple of comments. >> 1. I am keeping the pooling intervals. A possible alternative way to >> describe that would be specifying the maximum dT/dt. Essentially because >> I am still convinced that this is part of hw specs. > > I still don't see why it's not possible to do this dynamically. I am not > entirely convinced this is part of the hw spec. Because not all hardware have power sensors and deriving power vs. temperature vs. time is not a simple task, specially based on estimates, say on cpu load. Remember that we are not only talking about CPU temperature here, there are other heat sources. Same load has different consumption on different HW, how SW is supposed to find that out wihtout proper sensing? > >> >> 2. The above follows your suggestion to use consumer pointing to >> producers. Although, I still need to figure out how this could be >> implemented for Linux. But I think that is another story, at least for >> now. We need to first align on the DT binding itself. > > I assume you mean the sensors property? That looks fine to me, though we Sensors and cooling-devices. > may need to specify the way sensors relate to temperatures in trip > points and cooling device properties -- do those temperatures correspond > to an average of sensors readings, min/max, etc? Those are based on instant measurements. > >> >> 3. The link from cooling device specification to trip points, from my >> perspective, should be enough, and thus we shall not need the thermal >> cells and size for trip points, as you proposed. Let me know what you think. > > I'm not keen on the mixed bag of values in the cooling-device property, > as it's difficult to extend in future if necessary. I think we may need > cells for cooling devices -- what if a single fan controller controls > fans in different thermal zones? I believe it is just a matter of representation. Using either a list or subnodes would imply that one would need to add links to fan device twice, one in each thermal zone (it does not matter if inside a subnode or inside a list). > > I'm also not keen on devices being described as their own cooling > device, as I think this is more difficult to support than not listing a > cooling device -- it'll have to be special-cased anyway. Then we need to have cooling device nodes, even if they are virtual. > >> Below is another example, on a more complex scenario, board specific >> case (hypothetical, just for exemplification): >> >> + #include <dt-bindings/thermal/thermal.h> >> + >> + board_thermal: board_thermal { >> + passive-delay = <1000>; /* milliseconds */ >> + polling-delay = <2500>; /* milliseconds */ >> + /* >> + * sensor ID >> + */ >> + sensors = <&adc-dummy 0>, >> + <&adc-dummy 1>, >> + <&adc-dymmy 2>; > > Here we have a set of sensors, but no relationship between these sensors > and the trips or cooling-devices are described. How does that work? See my query below. > >> + /* >> + * cooling >> + * device Usage Trip lower upper >> + */ >> + cooling-devices = <&cpu0 75 &cpu-trip 0 2>, >> + <&gpu0 40 &gpu-trip 0 2>; >> + <&lcd0 25 &lcd-trip 5 10>; >> + trips { >> + cpu-trip: cpu-trip { >> + temperature = <60000>; /* milliCelsius */ >> + hysteresis = <2000>; /* milliCelsius */ >> + type = <THERMAL_TRIP_PASSIVE>; >> + }; >> + gpu-trip: gpu-trip { >> + temperature = <55000>; /* milliCelsius */ >> + hysteresis = <2000>; /* milliCelsius */ >> + type = <THERMAL_TRIP_PASSIVE>; >> + } >> + lcd-trip: lcp-trip { >> + temperature = <53000>; /* milliCelsius */ >> + hysteresis = <2000>; /* milliCelsius */ >> + type = <THERMAL_TRIP_PASSIVE>; >> + }; >> + crit-trip: crit-trip { >> + temperature = <68000>; /* milliCelsius */ >> + hysteresis = <2000>; /* milliCelsius */ >> + type = <THERMAL_TRIP_CRITICAL>; >> + }; >> + }; >> + }; >> >> Now writing the above example, one might want to have a way to say the >> sensor correlation equation. One way to go is to add a property to describe the linear relation between them. >> >> Another example: >> + #include <dt-bindings/thermal/thermal.h> >> + >> + dsp_thermal: dsp_thermal { >> + passive-delay = <250>; /* milliseconds */ >> + polling-delay = <1000>; /* milliseconds */ >> + /* >> + * sensor ID >> + */ >> + sensors = <&bandgap0 1>; >> + /* >> + * cooling >> + * device Usage Trip lower >> bound upper bound >> + */ >> + cooling-devices = <&dsp0 100 &dsp-alert >> THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; >> + trips { >> + dsp-alert: dsp-alert { >> + temperature = <100000>; /* milliCelsius */ >> + hysteresis = <2000>; /* milliCelsius */ >> + type = <THERMAL_TRIP_PASSIVE>; >> + }; >> + dsp-crit: cdsp-crit { >> + temperature = <125000>; /* milliCelsius */ >> + hysteresis = <2000>; /* milliCelsius */ >> + type = <THERMAL_TRIP_CRITICAL>; >> + }; >> + }; >> + }; >> + >> + gpu_thermal: gpu_thermal { >> + passive-delay = <500>; /* milliseconds */ >> + polling-delay = <1000>; /* milliseconds */ >> + /* >> + * sensor ID >> + */ >> + sensors = <&bandgap0 2>; >> + /* >> + * cooling >> + * device Usage Trip lower >> bound upper bound >> + */ >> + cooling-devices = <&gpu0 100 &gpu-alert >> THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; >> + trips { >> + gpu-alert: gpu-alert { >> + temperature = <100000>; /* milliCelsius */ >> + hysteresis = <2000>; /* milliCelsius */ >> + type = <THERMAL_TRIP_PASSIVE>; >> + }; >> + gpu-crit: gpu-crit { >> + temperature = <125000>; /* milliCelsius */ >> + hysteresis = <2000>; /* milliCelsius */ >> + type = <THERMAL_TRIP_CRITICAL>; >> + }; >> + }; >> + }; >> >> >> Wei, I think the above would cover for the IPs with multiple internal >> sensors cases you were looking for, right? >> >> Durga, please also check if I am missing something to cover for your >> plans to, in case you will be using DT to describe your HW. >> >> Anyways, Mark and Pawel, let me know if I missed something from our >> discussion. I believe the above bindings would look more like regular >> standard DT bindings. And also they should be now slim enough, without >> Linux implementation details and also without policies. > > I think that the above can describe that, but I'd like to see a binding > document so we can consider it in more detail. Well, I can write another RFC based on this discussion, for sure. I just want to know what really blocks this work, that is why I sent the examples above. But it does not seam we are progressing, looks like we are move discussing in circles. Cooling devices seams to be bouncing back and forward. I can send an RFC considering: i) addition of cooling devices virtual device nodes ii) description of used cooling devices inside thermal zones by means of sub-nodes, instead of lists Pooling intervals seams to be also a loose lace. To me it is a matter of choosing a good way to avoid missing events. We can either get the info from HW simulation / experimentation based on HW itself or we can rely on SW and assume SW can handle to see all events based on loose info (assume that sw can be smart to do a self adaptive closed loop control system based on no defined set of inputs). Both solutions are doable, except that one may be closer to reality than the other. Also, one is based on HW info and the other forces sw to deduce HW info. If you can provide info that your proposal has strong evidence to be closer to reality, I am find to signed it off. All best, Eduardo. > > Thanks, > Mark. > >
Hi Mark, Stephen and Pawel, On 03-09-2013 09:15, Mark Rutland wrote: > On Fri, Aug 30, 2013 at 12:19:43AM +0100, Eduardo Valentin wrote: <cut> > I think that the above can describe that, but I'd like to see a binding > document so we can consider it in more detail. Find below another proposal. It is the updated binding document, with the your comments applied (at least those I agree :-) ). It is obviously an work in progress, but I think it is getting closer to what we are trying to achieve, I believe. And of course, much better after using your suggestions. As I stated before, I believe it is crucial to first agree on the bindings, then I can go ahead and update the corresponding code. The change from the last binding examples I sent is basically on sensors and cooling devices. This time, as suggested by Mark, I am adding cooling device nodes (or at least properties to be embedded into existing nodes). At some point, I remember that Pawel was not so in favor on this type of node, but lets discuss on top of the document below. I also added the #cells properties, as needed. Hopefully we may end with an agreement. :-) So, the document would look like this: ----------------------------------------------------------------------- * Thermal Framework Device Tree descriptor Generic binding to provide a way of defining hardware thermal structure using device tree. A thermal structure includes thermal zones and their components, such as trip points, polling intervals, sensors and cooling devices binding descriptors. The target of device tree thermal descriptors is to describe only the hardware thermal aspects, not how the system must control or which algorithm or policy must be taken in place. There are five types of nodes involved to describe thermal bindings: - sensors: used to describe the device source of temperature sensing; - cooling devices: used to describe devices source of power dissipation control; - trip points: used to describe points in temperature domain defined to make the system aware of hardware limits; - cooling attachments: used to describe links between trip points and cooling devices; - thermal zones: used to describe thermal data within the hardware; It follows a description of each type of these device tree nodes. * Sensor devices Sensor devices are nodes providing temperature sensing capabilities on thermal zones. Typical devices are I2C ADC converters and bandgaps. Theses are nodes providing temperature data to thermal zones. Temperature sensor devices may control one or more internal sensors. Required property: - #sensor-cells: Used to provide sensor device specific information while referring to it. Must be at least 1, in order to identify uniquely the sensor instances within the IC. See thermal zone binding for more details on how consumers refer to sensor devices. * Cooling device nodes Cooling devices are nodes providing control on power dissipation. There are essentially two ways to provide control on power dissipation. First is by means of regulating device performance, which is known as passive cooling. Second is by means of activating devices in order to remove the dissipated heat, which is known as active cooling, e.g. regulating fan speeds. In both cases, cooling devices shall have a way to determine the level of cooling. Required property: - cooling-min-level: A unsigned integer indicating the smallest cooling level accepted. Typically 0. - cooling-max-level: An unsigned integer indicating the largest cooling level accepted. - #cooling-cells: Used to provide cooling device specific information while referring to it. Must be at least 2, in order to specify minimum and maximum cooling level used in the reference. See Cooling device attachments section below for more details on how consumers refer to cooling devices. * Trip points The trip node is a node to describe a point in the temperature domain in which the system takes an action. This node describes just the point, not the action. Required properties: - temperature: the trip temperature level, in milliCelsius. - hysteresis: a (low) hysteresis value on 'temperature'. This is a relative value, in milliCelsius. - type: the trip type. Here is the type mapping: THERMAL_TRIP_ACTIVE 0: A trip point to enable active cooling THERMAL_TRIP_PASSIVE 1: A trip point to enable passive cooling THERMAL_TRIP_HOT 2: A trip point to notify emergency THERMAL_TRIP_CRITICAL 3: Hardware not reliable. Refer to include/dt-bindings/thermal/thermal.h for definition of these consts. * Cooling device attachments The cooling device attachments node is a node to describe how cooling devices get assigned to trip points of the zone. The cooling devices are expected to be loaded in the target system. Required properties: - cooling-device: A phandle of a cooling device with its parameters, referring to which cooling device is used in this binding. The required parameters are: the minimum cooling level and the maximum cooling level used in this attach. - trip: A phandle of a trip point node within the same thermal zone. Optional property: - contribution: The cooling contribution to the thermal zone of the referred cooling device at the referred trip point. The contribution is a value from 0 to 100. The sum of all cooling contributions within a thermal zone must never exceed 100. Note: Using the THERMAL_NO_LIMIT (-1L) constant in the cooling-device phandle limit parameters means: (i) - minimum level allowed for minimum cooling level used in the reference. (ii) - maximum level allowed for maximum cooling level used in the reference. Refer to include/dt-bindings/thermal/thermal.h for definition of this constant. * Thermal zones The thermal-zone node is the node containing all the required info for describing a thermal zone, including its cdev bindings. The thermal_zone node must contain, apart from its own properties, one node containing trip nodes and one node containing all the zone cooling attachments. Required properties: - passive-delay: The maximum number of milliseconds to wait between polls when performing passive cooling. - polling-delay: The maximum number of milliseconds to wait between polls when checking this thermal zone. - sensors: A list of sensor phandles and their parameters. The required parameter is the sensor id, in order to identify internal sensors when the sensor IC features several sensing units. - trips: A sub-node containing several trip point nodes required to describe the thermal zone. - cooling-attachments A sub-node containing several cooling device attaches nodes, used to describe the relation between trips and cooling devices. Optional property: - coefficients: An array of integers (one signed cell) containing coefficients to compose a linear relation between the sensors described in the sensors property. Coefficients defaults to 1, in case this property is not specified. A simple linear polynomial is used: Z = c0 * x0 + c1 + x1 + ... + c(n-1) * x(n-1) + cn. The coefficients are ordered and they match with sensors by means of sensor ID. Additional coefficients are interpreted as constant offsets. Note: The delay properties are bound to the maximum dT/dt (temperature derivative over time) in two situations for a thermal zone: (i) - when active cooling is activated (passive-delay); and (ii) - when the zone just needs to be monitored (polling-delay). The maximum dT/dt is highly bound to hardware power consumption and dissipation capability. * Examples Below are several examples on how to use thermal data descriptors using device tree bindings: (a) - CPU thermal zone The CPU thermal zone example below describes how to setup one thermal zone using one single sensor as temperature source and many cooling devices and power dissipation control sources. #include <dt-bindings/thermal/thermal.h> cpus { cpu0: cpu@0 { ... cooling-min-level = 0; cooling-max-level = 3; #cooling-cells = <2>; /* min followed by max */ }; ... }; &i2c1 { ... fan0: fan@0x48 { ... cooling-min-level = 0; cooling-max-level = 9; #cooling-cells = <2>; /* min followed by max */ }; }; bandgap0: bandgap@0x0000ED00 { ... #sensor-cells = <1>; }; cpu-thermal: cpu-thermal { passive-delay = <250>; /* milliseconds */ polling-delay = <1000>; /* milliseconds */ /* sensor ID */ sensors = <&bandgap0 0>; trips { cpu-alert0: cpu-alert { temperature = <90000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_ACTIVE>; }; cpu-alert1: cpu-alert { temperature = <100000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_PASSIVE>; }; cpu-crit: cpu-crit { temperature = <125000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_CRITICAL>; }; }; cooling-attachments { attach0 { trip = <&cpu-alert0>; cooling-devices = <&fan0 THERMAL_NO_LIMITS 4>; }; attach1 { trip = <&cpu-alert1>; cooling-device = <&fan0 5 THERMAL_NO_LIMITS>; }; attach2 { trip = <&cpu-alert1>; cooling-device = <&cpu0 THERMAL_NO_LIMITS THERMAL_NO_LIMITS>; }; }; }; In the example above, the ADC sensor at address 0x0000ED00 is used to monitor the zone 'cpu-thermal' using its the sensor 0. The fan0, a fan device controlled via I2C bus 1, at adress 0x48, is used to remove the heat out of the thermal zone 'cpu-thermal' using its cooling levels from its minimum to 4, when it reaches trip point 'cpu-alert0' at 90C, as an example of active cooling. The same cooling device is used at 'cpu-alert1', but from 5 to its maximum level. The cpu@0 device is also linked to the same thermal zone, 'cpu-thermal', as a passive cooling device, using all its cooling levels at trip point 'cpu-alert1', which is a trip point at 100C. (b) - IC with several internal sensors The example below describes how to deploy several thermal zones based off a single sensor IC, assuming it has several internal sensors. This is a common case on SoC designs with several internal IPs that may need different thermal requirements, and thus may have their own sensor to monitor or detect internal hotspots in their silicon. #include <dt-bindings/thermal/thermal.h> bandgap0: bandgap@0x0000ED00 { ... #sensor-cells = <1>; }; cpu-thermal: cpu-thermal { passive-delay = <250>; /* milliseconds */ polling-delay = <1000>; /* milliseconds */ /* sensor ID */ sensors = <&bandgap0 0>; trips { /* each zone within the SoC may have its own trips */ cpu-alert: cpu-alert { temperature = <100000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_PASSIVE>; }; cpu-crit: cpu-crit { temperature = <125000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_CRITICAL>; }; }; cooling-attachments { /* each zone within the SoC may have its own cooling */ ... }; }; gpu-thermal: gpu-thermal { passive-delay = <120>; /* milliseconds */ polling-delay = <1000>; /* milliseconds */ /* sensor ID */ sensors = <&bandgap0 1>; trips { /* each zone within the SoC may have its own trips */ gpu-alert: gpu-alert { temperature = <90000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_PASSIVE>; }; gpu-crit: gpu-crit { temperature = <105000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_CRITICAL>; }; }; cooling-attachments { /* each zone within the SoC may have its own cooling */ ... }; }; dsp-thermal: dsp-thermal { passive-delay = <50>; /* milliseconds */ polling-delay = <1000>; /* milliseconds */ /* sensor ID */ sensors = <&bandgap0 2>; trips { /* each zone within the SoC may have its own trips */ dsp-alert: gpu-alert { temperature = <90000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_PASSIVE>; }; dsp-crit: gpu-crit { temperature = <135000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_CRITICAL>; }; }; cooling-attachments { /* each zone within the SoC may have its own cooling */ ... }; }; In the example above there is one bandgap IC which has the capability to monitor three sensors. The hardware has been designed so that sensors are placed on different places in the DIE to monitor different temperature hotspots: one for CPU thermal zone, one for GPU thermal zone and the other to monitor a DSP thermal zone. Thus, there is a need to assign each sensor provided by the bandgap IC to different thermal zones. This is achieved by means of using the #sensor-cells property and using the first parameter as sensor ID. In the example, then, bandgap.sensor0 is used to monitor CPU thermal zone, bandgap.sensor1 is used to monitor GPU thermal zone and bandgap.sensor2 is used to monitor DSP thermal zone. Each zone may be uncorrelated, having its own dT/dt requirements, trips and cooling attachments. (c) - Several sensors within one single thermal zone The example below illustrates how to use more than one sensor within one thermal zone. #include <dt-bindings/thermal/thermal.h> &i2c1 { ... adc: sensor@0x49 { ... #sensor-cells = <1>; }; }; bandgap0: bandgap@0x0000ED00 { ... #sensor-cells = <1>; }; cpu-thermal: cpu-thermal { passive-delay = <250>; /* milliseconds */ polling-delay = <1000>; /* milliseconds */ /* sensor ID */ sensors = <&bandgap0 0>, <&adc 0>; /* hotspot = 100 * bandgap - 120 * adc + 484 */ coefficients = <100 -120 484>; trips { ... }; cooling-attachments { ... }; }; In some cases, there is a need to use more than one sensor to extrapolate a thermal hotspot in the silicon. The above example illustrate this situation. For instance, it may be the case that a sensor external to CPU IP may be place close to CPU hotspot and together with internal CPU sensor, it is used to determine the hotspot. The hyppotetical extrapolation rule would be: hotspot = 100 * bandgap - 120 * adc + 484 The same idea can be used to add fixed offset: passive-delay = <1000>; /* milliseconds */ polling-delay = <2500>; /* milliseconds */ hotspot = 1 * adc + 6000 In the above equation, the hotspot is always 6C higher than what is read from the sensor ADC. The binding would be then: /* sensor ID */ sensors = <&adc 0>; /* hotspot = 1 * adc + 6000 */ coefficients = <1 6000>; (d) - Board thermal The board thermal example below illustrates how to setup one thermal zone with many sensors and many cooling devices. #include <dt-bindings/thermal/thermal.h> &i2c1 { ... adc-dummy: sensor@0x50 { ... #sensor-cells = <1>; /* sensor internal ID */ }; }; batt-thermal { passive-delay = <500>; /* milliseconds */ polling-delay = <2500>; /* milliseconds */ /* sensor ID */ sensors = <&adc-dummy 4>; trips { ... }; cooling-attachments { ... }; }; board-thermal: board-thermal { passive-delay = <1000>; /* milliseconds */ polling-delay = <2500>; /* milliseconds */ /* sensor ID */ sensors = <&adc-dummy 0>, <&adc-dummy 1>, <&adc-dymmy 2>; /* * An array of coefficients describing the sensor * linear relation. E.g.: * z = c1*x1 + c2*x2 + c3*x3 */ coefficients = <1200 -345 890>; trips { /* Trips are based on resulting linear equation */ cpu-trip: cpu-trip { temperature = <60000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_PASSIVE>; }; gpu-trip: gpu-trip { temperature = <55000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_PASSIVE>; } lcd-trip: lcp-trip { temperature = <53000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_PASSIVE>; }; crit-trip: crit-trip { temperature = <68000>; /* milliCelsius */ hysteresis = <2000>; /* milliCelsius */ type = <THERMAL_TRIP_CRITICAL>; }; }; cooling-attachments { attach0 { trip = <&cpu-trip>; cooling-device = <&cpu0 0 2>; contribution = <55>; }; attach1 { trip = <&gpu-trip>; cooling-device = <&gpu0 0 2>; contribution = <20>; }; attach2 { trip = <&lcd-trip>; cooling-device = <&lcd0 5 10>; contribution = <15>; }; }; }; The above example is a mix of previous examples, a sensor IP with several internal sensors used to monitor different zones, one of them is composed by several sensors and with different cooling devices. ----------------------------------------------------------------------- All best,
diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt b/Documentation/devicetree/bindings/thermal/thermal.txt new file mode 100644 index 0000000..af20ab0 --- /dev/null +++ b/Documentation/devicetree/bindings/thermal/thermal.txt @@ -0,0 +1,160 @@ +---------------------------------------- +Thermal Framework Device Tree descriptor +---------------------------------------- + +This file describes how to define a thermal structure using device tree. +A thermal structure includes thermal zones and their components, such +as name, trip points, polling intervals and cooling devices binding +descriptors. A binding descriptor may contain information on how to +react, with a cooling stepped action or a weight on a fair share. +The target of device tree thermal descriptors is to describe only +the hardware thermal aspects, not how the system must control or which +algorithm or policy must take place. It follows a description of each +type of device tree node. + +**** +trip +**** + +The trip node is a node to describe a point in the temperature domain +in which the system takes an action. This node describes just the point, +not the action. + +A node describing a trip must contain: +- temperature: the trip temperature level, in milliCelsius. +- hysteresis: a (low) hysteresis value on 'temperature'. This is a relative +value, in milliCelsius. +- type: the trip type. Here is the type mapping: + THERMAL_TRIP_ACTIVE + THERMAL_TRIP_PASSIVE + THERMAL_TRIP_HOT + THERMAL_TRIP_CRITICAL + +********** +bind_param +********** + +The bind_param node is a node to describe how cooling devices get assigned +to trip points of the zone. The cooling devices are expected to be loaded +in the target system. + +A node describing a bind_param must contain: +- cooling_device: A string with the cooling device name. +- weight: The 'influence' of a particular cooling device on this zone. + This is on a percentage scale. The sum of all these weights + (for a particular zone) cannot exceed 100. +- trip_mask: This is a bit mask that gives the binding relation between + this thermal zone and cdev, for a particular trip point. + If nth bit is set, then the cdev and thermal zone are bound + for trip point n. +- limits: An array integers consisting of 2-tuples items, and each item means + lower and upper state limits, like <min-state max-state>. + +************************** +temperature sensor devices +************************** + +Temperature sensor devices have to be linked to a specific thermal zone. +This is done by means of the 'monitored-zones' list. +- monitored-zones: A list of thermal zones phandles, identifying which +thermal zones the temperature sensor device is used to monitor. + +************ +thermal_zone +************ + +The thermal_zone node is the node containing all the required info +for describing a thermal zone, including its cdev bindings. The thermal_zone +node must contain, apart from its own properties, one node containing +trip nodes and one node containing all the zone bind parameters. + +Required properties: +- mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable. + +- passive_delay: number of milliseconds to wait between polls when + performing passive cooling. + +- polling_delay: number of milliseconds to wait between polls when checking. + +- #thermal-cells: An integer that is used to specify how many cells compose + sensor ids. + +Below is an example: +cpu_thermal: thermal_zone { + #thermal-cells = <1>; + mask = <0x03>; /* trips writability */ + passive_delay = <250>; /* milliseconds */ + polling_delay = <1000>; /* milliseconds */ + trips { + alert@100000{ + temperature = <100000>; /* milliCelsius */ + hysteresis = <0>; /* milliCelsius */ + type = <THERMAL_TRIP_PASSIVE>; + }; + crit@125000{ + temperature = <125000>; /* milliCelsius */ + hysteresis = <0>; /* milliCelsius */ + type = <THERMAL_TRIP_CRITICAL>; + }; + }; + bind_params { + action@0{ + cooling_device = "thermal-cpufreq"; + weight = <100>; /* percentage */ + mask = <0x01>; + limits = < + /* lower-state upper-state */ + 1 THERMAL_NO_LIMITS + THERMAL_NO_LIMITS THERMAL_NO_LIMITS + >; + }; + }; +}; + +sensor: adc@0xFFFF { + ... + monitored-zones = <&cpu_thermal 0>; +}; + +cpu@0 { + ... + cooling-zones = <&cpu_thermal>; +}; + +In the example above, the ADC sensor at address 0xFFFF is used to monitor +the zone 'cpu_thermal' using its the sensor 0. The cpu@0 device is also linked +to the same thermal zone 'cpu_thermal' as a cooling device. + +*************** +cpufreq-cooling +*************** + +The generic cpu cooling (freq clipping) can be loaded by the +generic cpufreq-cpu0 driver in case the device tree node informs +this need. + +In order to load the cpufreq cooling device, it is possible to +inform that the cpu needs cooling by adding the list of thermal zones +in the 'cooling-zones' property at the cpu0 phandle. + +Example: + cpus { + cpu@0 { + operating-points = < + /* kHz uV */ + 800000 1313000 + 1008000 1375000 + >; + cooling-zones = <&cpu_thermal>; + }; + ... + }; + +The above will cause the cpufreq-cpu0 driver to understand that +the cpu will need passive cooling and while probing that node it will +also load the cpufreq cpu cooling device in that system. + +However, be advised that this flag will not describe what needs to +be done or how the cpufreq cooling device will be used. In order to +accomplish this, one needs to write a phandle for a thermal zone, as +described in the section 'thermal_zone'. diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig index 7fb16bc..753f0dc 100644 --- a/drivers/thermal/Kconfig +++ b/drivers/thermal/Kconfig @@ -29,6 +29,19 @@ config THERMAL_HWMON Say 'Y' here if you want all thermal sensors to have hwmon sysfs interface too. +config THERMAL_OF + bool + prompt "APIs to parse thermal data out of device tree" + depends on OF + default y + help + This options provides helpers to add the support to + read and parse thermal data definitions out of the + device tree blob. + + Say 'Y' here if you need to build thermal infrastructure + based on device tree. + choice prompt "Default Thermal governor" default THERMAL_DEFAULT_GOV_STEP_WISE diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile index 24cb894..eedb273 100644 --- a/drivers/thermal/Makefile +++ b/drivers/thermal/Makefile @@ -7,6 +7,7 @@ thermal_sys-y += thermal_core.o # interface to/from other layers providing sensors thermal_sys-$(CONFIG_THERMAL_HWMON) += thermal_hwmon.o +thermal_sys-$(CONFIG_THERMAL_OF) += thermal_dt.o # governors thermal_sys-$(CONFIG_THERMAL_GOV_FAIR_SHARE) += fair_share.o diff --git a/drivers/thermal/thermal_dt.c b/drivers/thermal/thermal_dt.c new file mode 100644 index 0000000..cc35485 --- /dev/null +++ b/drivers/thermal/thermal_dt.c @@ -0,0 +1,473 @@ +/* + * thermal_dt.c - Generic Thermal Management device tree support. + * + * Copyright (C) 2013 Texas Instruments + * Copyright (C) 2013 Eduardo Valentin <eduardo.valentin@ti.com> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; version 2 of the License. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. + * + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + */ +#include <linux/thermal.h> +#include <linux/slab.h> +#include <linux/types.h> +#include <linux/of_device.h> +#include <linux/of_platform.h> +#include <linux/err.h> +#include <linux/export.h> +#include <linux/string.h> + +struct __thermal_bind_params { + char cooling_device[THERMAL_NAME_LENGTH]; +}; + +static +int thermal_of_match(struct thermal_zone_device *tz, + struct thermal_cooling_device *cdev); + +static int thermal_of_populate_bind_params(struct device *dev, + struct device_node *node, + struct __thermal_bind_params *__tbp, + struct thermal_bind_params *tbp, + int ntrips) +{ + const char *cdev_name; + u32 *limits; + int ret; + u32 prop; + + ret = of_property_read_u32(node, "weight", &prop); + if (ret < 0) { + dev_err(dev, "missing weight property\n"); + return ret; + } + tbp->weight = prop; + + ret = of_property_read_u32(node, "mask", &prop); + if (ret < 0) { + dev_err(dev, "missing mask property\n"); + return ret; + } + tbp->trip_mask = prop; + + /* this will allow us to bind with cooling devices */ + tbp->match = thermal_of_match; + + ret = of_property_read_string(node, "cooling_device", &cdev_name); + if (ret < 0) { + dev_err(dev, "missing cooling_device property\n"); + return ret; + } + strncpy(__tbp->cooling_device, cdev_name, + sizeof(__tbp->cooling_device)); + + limits = kzalloc(ntrips * sizeof(*limits) * 2, GFP_KERNEL); + if (!limits) { + dev_err(dev, "not enough mem for reading limits\n"); + return -ENOMEM; + } + ret = of_property_read_u32_array(node, "limits", limits, 2 * ntrips); + if (!ret) { + int i = ntrips; + + tbp->binding_limits = devm_kzalloc(dev, + ntrips * 2 * + sizeof(*tbp->binding_limits), + GFP_KERNEL); + if (!tbp->binding_limits) { + dev_err(dev, "not enough mem for storing limits\n"); + return -ENOMEM; + } + while (i--) + tbp->binding_limits[i] = limits[i]; + } + kfree(limits); + + return 0; +} + +struct __thermal_trip { + unsigned long int temperature; + unsigned long int hysteresis; + enum thermal_trip_type type; +}; + +static +int thermal_of_populate_trip(struct device *dev, + struct device_node *node, + struct __thermal_trip *trip) +{ + int ret; + int prop; + + ret = of_property_read_u32(node, "temperature", &prop); + if (ret < 0) { + dev_err(dev, "missing temperature property\n"); + return ret; + } + trip->temperature = prop; + + ret = of_property_read_u32(node, "hysteresis", &prop); + if (ret < 0) { + dev_err(dev, "missing hysteresis property\n"); + return ret; + } + trip->hysteresis = prop; + + ret = of_property_read_u32(node, "type", &prop); + if (ret < 0) { + dev_err(dev, "missing type property\n"); + return ret; + } + trip->type = prop; + + return 0; +} + +struct __thermal_zone_device { + enum thermal_device_mode mode; + int passive_delay; + int polling_delay; + int mask; + int ntrips; + char type[THERMAL_NAME_LENGTH]; + struct __thermal_trip *trips; + struct __thermal_bind_params *bind_params; + struct thermal_bind_params *tbps; + struct thermal_zone_params zone_params; + int (*get_temp)(void *, unsigned long *); + void *devdata; +}; + +static struct __thermal_zone_device * +thermal_of_build_thermal_zone(struct device *dev, struct device_node *node) +{ + struct device_node *child, *gchild; + struct __thermal_zone_device *tz; + int ret, i; + u32 prop; + + if (!node) { + dev_err(dev, "no thermal zone node\n"); + return ERR_PTR(-EINVAL); + } + + tz = devm_kzalloc(dev, sizeof(*tz), GFP_KERNEL); + if (!tz) { + dev_err(dev, "not enough memory for thermal of zone\n"); + return ERR_PTR(-ENOMEM); + } + + ret = of_property_read_u32(node, "passive_delay", &prop); + if (ret < 0) { + dev_err(dev, "missing passive_delay property\n"); + return ERR_PTR(ret); + } + tz->passive_delay = prop; + + ret = of_property_read_u32(node, "polling_delay", &prop); + if (ret < 0) { + dev_err(dev, "missing polling_delay property\n"); + return ERR_PTR(ret); + } + tz->polling_delay = prop; + + ret = of_property_read_u32(node, "mask", &prop); + if (ret < 0) { + dev_err(dev, "missing mask property\n"); + return ERR_PTR(ret); + } + tz->mask = prop; + + /* assume node name as our zone type */ + strncpy(tz->type, node->name, sizeof(tz->type)); + + /* default policy */ + strncpy(tz->zone_params.governor_name, "step_wise", + sizeof(tz->zone_params.governor_name)); + + /* trips */ + child = of_find_node_by_name(node, "trips"); + + /* No trips provided */ + if (!child) + goto finish; + + tz->ntrips = of_get_child_count(child); + tz->trips = devm_kzalloc(dev, tz->ntrips * sizeof(*tz->trips), + GFP_KERNEL); + if (!tz->trips) + return ERR_PTR(-ENOMEM); + i = 0; + for_each_child_of_node(child, gchild) + thermal_of_populate_trip(dev, gchild, &tz->trips[i++]); + + /* bind_params */ + child = of_find_node_by_name(node, "bind_params"); + + /* No binding info */ + if (!child) + goto finish; + + tz->zone_params.num_tbps = of_get_child_count(child); + tz->bind_params = devm_kzalloc(dev, + tz->zone_params.num_tbps * + sizeof(*tz->bind_params), + GFP_KERNEL); + if (!tz->bind_params) + return ERR_PTR(-ENOMEM); + tz->zone_params.tbp = devm_kzalloc(dev, + tz->zone_params.num_tbps * + sizeof(*tz->zone_params.tbp), + GFP_KERNEL); + if (!tz->zone_params.tbp) + return ERR_PTR(-ENOMEM); + i = 0; + for_each_child_of_node(child, gchild) { + thermal_of_populate_bind_params(dev, gchild, + &tz->bind_params[i], + &tz->zone_params.tbp[i], + tz->ntrips); + i++; + } + +finish: + tz->mode = THERMAL_DEVICE_ENABLED; + + return tz; +} + +static +int thermal_of_match(struct thermal_zone_device *tz, + struct thermal_cooling_device *cdev) +{ + struct __thermal_zone_device *data = tz->devdata; + int i; + + for (i = 0; i < data->zone_params.num_tbps; i++) { + if (!strncmp(data->bind_params[i].cooling_device, + cdev->type, + strlen(data->bind_params[i].cooling_device)) && + (data->zone_params.tbp[i].trip_mask & (1 << i))) + return 0; + } + + return -EINVAL; +} + +static +int of_thermal_get_temp(struct thermal_zone_device *tz, + unsigned long *temp) +{ + struct __thermal_zone_device *data = tz->devdata; + + return data->get_temp(data->devdata, temp); +} + +static +int of_thermal_get_mode(struct thermal_zone_device *tz, + enum thermal_device_mode *mode) +{ + struct __thermal_zone_device *data = tz->devdata; + + *mode = data->mode; + + return 0; +} + +static +int of_thermal_set_mode(struct thermal_zone_device *tz, + enum thermal_device_mode mode) +{ + struct __thermal_zone_device *data = tz->devdata; + + mutex_lock(&tz->lock); + + if (mode == THERMAL_DEVICE_ENABLED) + tz->polling_delay = data->polling_delay; + else + tz->polling_delay = 0; + + mutex_unlock(&tz->lock); + + data->mode = mode; + thermal_zone_device_update(tz); + + return 0; +} + +static +int of_thermal_get_trip_type(struct thermal_zone_device *tz, int trip, + enum thermal_trip_type *type) +{ + struct __thermal_zone_device *data = tz->devdata; + + if (trip >= data->ntrips || trip < 0) + return -EDOM; + + *type = data->trips[trip].type; + + return 0; +} + +static +int of_thermal_get_trip_temp(struct thermal_zone_device *tz, int trip, + unsigned long *temp) +{ + struct __thermal_zone_device *data = tz->devdata; + + if (trip >= data->ntrips || trip < 0) + return -EDOM; + + *temp = data->trips[trip].temperature; + + return 0; +} + +static +int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip, + unsigned long temp) +{ + struct __thermal_zone_device *data = tz->devdata; + + if (trip >= data->ntrips || trip < 0) + return -EDOM; + + /* thermal fw should take care of data->mask & (1 << trip) */ + data->trips[trip].temperature = temp; + + return 0; +} + +static +int of_thermal_get_trip_hyst(struct thermal_zone_device *tz, int trip, + unsigned long *hyst) +{ + struct __thermal_zone_device *data = tz->devdata; + + if (trip >= data->ntrips || trip < 0) + return -EDOM; + + *hyst = data->trips[trip].hysteresis; + + return 0; +} + +static +int of_thermal_set_trip_hyst(struct thermal_zone_device *tz, int trip, + unsigned long hyst) +{ + struct __thermal_zone_device *data = tz->devdata; + + if (trip >= data->ntrips || trip < 0) + return -EDOM; + + /* thermal fw should take care of data->mask & (1 << trip) */ + data->trips[trip].hysteresis = hyst; + + return 0; +} + +static int +of_thermal_get_crit_temp(struct thermal_zone_device *tz, unsigned long *temp) +{ + struct __thermal_zone_device *data = tz->devdata; + int i; + + for (i = 0; i < data->ntrips; i++) + if (data->trips[i].type == THERMAL_TRIP_CRITICAL) { + *temp = data->trips[i].temperature; + return 0; + } + + return -EINVAL; +} + +static const +struct thermal_zone_device_ops of_thermal_ops = { + .get_temp = of_thermal_get_temp, + .get_mode = of_thermal_get_mode, + .set_mode = of_thermal_set_mode, + .get_trip_type = of_thermal_get_trip_type, + .get_trip_temp = of_thermal_get_trip_temp, + .set_trip_temp = of_thermal_set_trip_temp, + .get_trip_hyst = of_thermal_get_trip_hyst, + .set_trip_hyst = of_thermal_set_trip_hyst, + .get_crit_temp = of_thermal_get_crit_temp, +}; + +/** + * thermal_zone_of_device_register() - registers a thermal zone based on DT + * @device: a valid struct device pointer containing device tree node + * with thermal_zone descriptor. + * @data: a private pointer (owned by the caller) that will be passed + * back, when a temperature reading is needed. + * @get_temp: a pointer to a function that returns the thermal zone + * temperature. + * @add_hwmon: a boolean to indicate if the thermal to hwmon sysfs interface + * is required. When add_hwmon == true, a hwmon sysfs interface + * will be created. When add_hwmon == false, nothing will be done + * + * This function will parse the device tree node associated with the + * @dev struct and will attempt to build a thermal zone device based + * upon the thermal data provided and described by DT. + * + * The thermal zone temperature is provided by the @get_temp function + * pointer. When called, it will have the private pointer @data back. + * + * Return: On success returns a valid struct thermal_zone_device, + * otherwise, it returns a corresponding ERR_PTR(). Caller must + * check the return value with help of IS_ERR() helper. + */ +struct thermal_zone_device * +thermal_zone_of_device_register(struct device *dev, int id, bool add_hwmon, + void *data, int (*get_temp)(void *, unsigned long *)) +{ + struct __thermal_zone_device *tz; + struct thermal_zone_device_ops *ops; + struct of_phandle_args thermal_spec; + int ret; + + ret = of_parse_phandle_with_args(dev->of_node, + "monitored-zones", + "#thermal-cells", + id, + &thermal_spec); + if (ret) { + dev_err(dev, "unable to find desired monitored zone %d\n", id); + return ERR_PTR(ret); + } + + tz = thermal_of_build_thermal_zone(dev, thermal_spec.np); + if (IS_ERR(tz)) { + dev_err(dev, "failed to build thermal zone\n"); + return ERR_CAST(tz); + } + ops = devm_kzalloc(dev, sizeof(*ops), GFP_KERNEL); + if (!ops) { + dev_err(dev, "no memory available for thermal ops\n"); + return ERR_PTR(-ENOMEM); + } + memcpy(ops, &of_thermal_ops, sizeof(*ops)); + tz->get_temp = get_temp; + tz->devdata = data; + + return thermal_zone_device_register(tz->type, tz->ntrips, tz->mask, tz, + ops, &tz->zone_params, + tz->passive_delay, + tz->polling_delay, + add_hwmon); +} +EXPORT_SYMBOL_GPL(thermal_zone_of_device_register); diff --git a/include/dt-bindings/thermal/thermal.h b/include/dt-bindings/thermal/thermal.h new file mode 100644 index 0000000..6dd6ccd --- /dev/null +++ b/include/dt-bindings/thermal/thermal.h @@ -0,0 +1,27 @@ +/* + * This header provides constants for most thermal bindings. + * + * Copyright (C) 2013 Texas Instruments + * Eduardo Valentin <eduardo.valentin@ti.com> + * + * GPLv2 only + */ + +#ifndef _DT_BINDINGS_THERMAL_THERMAL_H +#define _DT_BINDINGS_THERMAL_THERMAL_H + +/* + * Here are the thermal trip types. This must + * match with enum thermal_trip_type at + * include/linux/thermal.h + */ +#define THERMAL_TRIP_ACTIVE 0 +#define THERMAL_TRIP_PASSIVE 1 +#define THERMAL_TRIP_HOT 2 +#define THERMAL_TRIP_CRITICAL 3 + +/* On cooling devices upper and lower limits */ +#define THERMAL_NO_LIMIT (-1UL) + +#endif + diff --git a/include/linux/thermal.h b/include/linux/thermal.h index 7c2769c..501b2fe 100644 --- a/include/linux/thermal.h +++ b/include/linux/thermal.h @@ -234,6 +234,9 @@ struct thermal_genl_event { }; /* Function declarations */ +struct thermal_zone_device +*thermal_zone_of_device_register(struct device *, int id, bool add_hwmon, + void *data, int (*get_temp) (void *, unsigned long *)); struct thermal_zone_device *thermal_zone_device_register(const char *, int, int, void *, const struct thermal_zone_device_ops *, const struct thermal_zone_params *, int, int, bool);
In order to be able to build thermal policies based on generic sensors, like I2C device, that can be places in different points on different boards, there is a need to have a way to feed board dependent data into the thermal framework. This patch introduces a thermal data parser for device tree. The parsed data is used to build thermal zones and thermal binding parameters. The output data can then be used to deploy thermal policies. This patch adds also documentation regarding this API and how to define tree nodes to use this infrastructure. Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com> --- .../devicetree/bindings/thermal/thermal.txt | 160 +++++++ drivers/thermal/Kconfig | 13 + drivers/thermal/Makefile | 1 + drivers/thermal/thermal_dt.c | 473 +++++++++++++++++++++ include/dt-bindings/thermal/thermal.h | 27 ++ include/linux/thermal.h | 3 + 6 files changed, 677 insertions(+) create mode 100644 Documentation/devicetree/bindings/thermal/thermal.txt create mode 100644 drivers/thermal/thermal_dt.c create mode 100644 include/dt-bindings/thermal/thermal.h