Message ID | 20170301132028.25309-1-carlo@caione.org (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
On Wed, Mar 1, 2017 at 4:01 PM, Guenter Roeck <linux@roeck-us.net> wrote: > On 03/01/2017 05:20 AM, Carlo Caione wrote: >> >> From: Carlo Caione <carlo@endlessm.com> >> >> The temperature provided by the SCP sensors not always is expressed in >> millicelsius, whereas this is required by the thermal framework. This is >> for example the case for the Amlogic devices, where the SCP sensor >> readings are expressed in degree (and not milli degree) Celsius. >> > Are you saying that SCPI does not specify or provide the units to be used > when reading values, and thus effectively just reports a more or less > random number ? AFAICT the standard does not specify the units to be used for the values or a way to get that information. For the Amlogic case the SCP returns the value of the temperature in degree celsius and I'm not sure how common is that. [cut] >> - *temp = value; >> + slope = thermal_zone_get_slope(zone->z); >> + offset = thermal_zone_get_offset(zone->z); > > > This is conceptually wrong. The functions return -ENODEV if thermal is > disabled. > While a negative slope does not make sense, a negative offset does. Yeah, but in that case we would have already failed to register the thermal zone at all in devm_thermal_zone_of_sensor_register(). > The code in the thermal subsystem does not clarify well how slope and offset > are supposed to be used. Since coming from the thermal subsystem, one would > think > that the thermal subsystem would apply any corrections if they are supposed > to be software correction values, but that does not appear to be the case, > leaving it up to drivers to use or not use the provided values. Yes, this is my understanding also considering this comment here: http://lxr.free-electrons.com/source/drivers/thermal/of-thermal.c#L1006 So apparently the slope and offset values are left to be used by the driver, that's why I was changing the temperature driver to use those values. > This is kind of odd; the values are thermal subsystem attributes/properties, > and one would think that they are to be used there unless they are supposed > to be > written into hardware. Given that, I'd rather wait for the API to be > clarified > instead of jumping into using it. Zhang, Eduardo (+CC) can you clarify a bit this point? > There are secondary problems with this approach; other drivers such as lm90 > which > support the thermal subsystem have their own means to specify and use the > offset > (since it needs to be programmed into the chip registers). I fail to see how this is related to my patch. If no coefficient is defined in the thermal zone node in the DT the framework sets slope=1 and offset=0, so in the hwmon driver nothing changes. > Also, this would only solve the problem for temperatures and does not > address > the generic problem (ie voltage and current values). This problem is specific for the interaction between the hwmon SCPI driver and the thermal subsystem. Look for example at the board I'm working with one single thermal zone using the temperature data coming from SCP: scpi_sensors: sensors { compatible = "arm,scpi-sensors"; #thermal-sensor-cells = <1>; }; thermal-zones { soc_thermal { thermal-sensors = <&scpi_sensors 0>; coefficients = <1000 0>; trips { control: trip-point@1 { temperature = <80000>; hysteresis = <1000>; type = "passive"; }; }; cooling-maps { cpufreq_cooling_map { trip = <&control>; cooling-device = <&cpus THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; }; }; }; }; trip points are defined in milli degree Celsius while the temperature returned by the scpi_sensors node is in degree Celsius. Without using the coefficients property and my changes how can I make the cooling map working fine? Thanks,
Carlo Caione <carlo@endlessm.com> writes: > On Wed, Mar 1, 2017 at 4:01 PM, Guenter Roeck <linux@roeck-us.net> wrote: >> On 03/01/2017 05:20 AM, Carlo Caione wrote: >>> >>> From: Carlo Caione <carlo@endlessm.com> >>> >>> The temperature provided by the SCP sensors not always is expressed in >>> millicelsius, whereas this is required by the thermal framework. This is >>> for example the case for the Amlogic devices, where the SCP sensor >>> readings are expressed in degree (and not milli degree) Celsius. >>> >> Are you saying that SCPI does not specify or provide the units to be used >> when reading values, and thus effectively just reports a more or less >> random number ? > > AFAICT the standard does not specify the units to be used for the > values or a way to get that information. For the Amlogic case the SCP > returns the value of the temperature in degree celsius and I'm not > sure how common is that. The standard does specify the units, but the way it is written seems to suggest that the units are part of the platform implementation rather than part of the standard [0]. [0] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0922g/CABBCJGH.html > > [cut] >>> - *temp = value; >>> + slope = thermal_zone_get_slope(zone->z); >>> + offset = thermal_zone_get_offset(zone->z); >> >> >> This is conceptually wrong. The functions return -ENODEV if thermal is >> disabled. >> While a negative slope does not make sense, a negative offset does. > > Yeah, but in that case we would have already failed to register the > thermal zone at all in devm_thermal_zone_of_sensor_register(). > In addition to the thermal sub-subsystem, hwmon sysfs interface also expects temperature in millidegree Celsius. Ideally, any change should fix the reporting there as well. More below. >> The code in the thermal subsystem does not clarify well how slope and offset >> are supposed to be used. Since coming from the thermal subsystem, one would >> think >> that the thermal subsystem would apply any corrections if they are supposed >> to be software correction values, but that does not appear to be the case, >> leaving it up to drivers to use or not use the provided values. > > Yes, this is my understanding also considering this comment here: > http://lxr.free-electrons.com/source/drivers/thermal/of-thermal.c#L1006 > > So apparently the slope and offset values are left to be used by the > driver, that's why I was changing the temperature driver to use those > values. > >> This is kind of odd; the values are thermal subsystem attributes/properties, >> and one would think that they are to be used there unless they are supposed >> to be >> written into hardware. Given that, I'd rather wait for the API to be >> clarified >> instead of jumping into using it. > > Zhang, Eduardo (+CC) > can you clarify a bit this point? > Another way to fix this would be to add optional properties to the scpi sensors binding and use them instead. This could then be used to fixup values reported to both thermal and hwmon. >> There are secondary problems with this approach; other drivers such as lm90 >> which >> support the thermal subsystem have their own means to specify and use the >> offset >> (since it needs to be programmed into the chip registers). > > I fail to see how this is related to my patch. If no coefficient is > defined in the thermal zone node in the DT the framework sets slope=1 > and offset=0, so in the hwmon driver nothing changes. > >> Also, this would only solve the problem for temperatures and does not >> address >> the generic problem (ie voltage and current values). > > This problem is specific for the interaction between the hwmon SCPI > driver and the thermal subsystem. Look for example at the board I'm > working with one single thermal zone using the temperature data coming > from SCP: > > scpi_sensors: sensors { > compatible = "arm,scpi-sensors"; > #thermal-sensor-cells = <1>; > }; > > thermal-zones { > soc_thermal { > thermal-sensors = <&scpi_sensors 0>; > coefficients = <1000 0>; > > trips { > control: trip-point@1 { > temperature = <80000>; > hysteresis = <1000>; > type = "passive"; > }; > }; > > cooling-maps { > cpufreq_cooling_map { > trip = <&control>; > cooling-device = <&cpus THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > }; > }; > }; > }; > > trip points are defined in milli degree Celsius while the temperature > returned by the scpi_sensors node is in degree Celsius. Without using > the coefficients property and my changes how can I make the cooling > map working fine? > > Thanks, -- To unsubscribe from this list: send the line "unsubscribe linux-hwmon" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Mar 1, 2017 at 5:57 PM, Punit Agrawal <punit.agrawal@arm.com> wrote: > Carlo Caione <carlo@endlessm.com> writes: >>> Are you saying that SCPI does not specify or provide the units to be used >>> when reading values, and thus effectively just reports a more or less >>> random number ? >> >> AFAICT the standard does not specify the units to be used for the >> values or a way to get that information. For the Amlogic case the SCP >> returns the value of the temperature in degree celsius and I'm not >> sure how common is that. > > The standard does specify the units, but the way it is written seems to > suggest that the units are part of the platform implementation rather > than part of the standard [0]. > > [0] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0922g/CABBCJGH.html Oh, I missed that, odd indeed. Fantastic, one more non-compliancy. >> [cut] >>>> - *temp = value; >>>> + slope = thermal_zone_get_slope(zone->z); >>>> + offset = thermal_zone_get_offset(zone->z); >>> >>> >>> This is conceptually wrong. The functions return -ENODEV if thermal is >>> disabled. >>> While a negative slope does not make sense, a negative offset does. >> >> Yeah, but in that case we would have already failed to register the >> thermal zone at all in devm_thermal_zone_of_sensor_register(). >> > > In addition to the thermal sub-subsystem, hwmon sysfs interface also > expects temperature in millidegree Celsius. Ideally, any change should > fix the reporting there as well. More below. [cut] > Another way to fix this would be to add optional properties to the scpi > sensors binding and use them instead. This could then be used to fixup > values reported to both thermal and hwmon. Yeah, if we want to fix also the hwmon side probably it's the best solution. Cheers,
On Wed, Mar 01, 2017 at 05:16:36PM +0100, Carlo Caione wrote: > On Wed, Mar 1, 2017 at 4:01 PM, Guenter Roeck <linux@roeck-us.net> wrote: > > On 03/01/2017 05:20 AM, Carlo Caione wrote: > >> > >> From: Carlo Caione <carlo@endlessm.com> > >> > >> The temperature provided by the SCP sensors not always is expressed in > >> millicelsius, whereas this is required by the thermal framework. This is > >> for example the case for the Amlogic devices, where the SCP sensor > >> readings are expressed in degree (and not milli degree) Celsius. > >> > > Are you saying that SCPI does not specify or provide the units to be used > > when reading values, and thus effectively just reports a more or less > > random number ? > > AFAICT the standard does not specify the units to be used for the > values or a way to get that information. For the Amlogic case the SCP > returns the value of the temperature in degree celsius and I'm not > sure how common is that. > > [cut] > >> - *temp = value; > >> + slope = thermal_zone_get_slope(zone->z); > >> + offset = thermal_zone_get_offset(zone->z); > > > > > > This is conceptually wrong. The functions return -ENODEV if thermal is > > disabled. > > While a negative slope does not make sense, a negative offset does. > > Yeah, but in that case we would have already failed to register the > thermal zone at all in devm_thermal_zone_of_sensor_register(). > So a return value of -19 eigher means that the offset/slope is -19 or that the thermal subsystem is disabled. What does a return value of -17 mean ? How does one distinguish the two ? Either the functions return negative error codes or they don't. Or at least I think that is how it should be. How do you know in which order to apply slope and offset ? In other words, how do you know that the offset is scaled and not the raw offset ? > > The code in the thermal subsystem does not clarify well how slope and offset > > are supposed to be used. Since coming from the thermal subsystem, one would > > think > > that the thermal subsystem would apply any corrections if they are supposed > > to be software correction values, but that does not appear to be the case, > > leaving it up to drivers to use or not use the provided values. > > Yes, this is my understanding also considering this comment here: > http://lxr.free-electrons.com/source/drivers/thermal/of-thermal.c#L1006 > > So apparently the slope and offset values are left to be used by the > driver, that's why I was changing the temperature driver to use those > values. > The driver author is left wondering how to use those values. The thermal API specifies that temperatures shall be reported as absolute values in milli-degrees C. It doesn't specify if and how the driver should or may use slope and offset as provided by the thermal subsystem to correct its readings. It doesn't specify what to do if the driver has other means to correct slope and/or offset internally (such as the lm90 driver, or the ntc_thermistor driver), or what to do if the provided values don't fit what hardware provides. It thus adds a lot of ambiguity which will need to be clarified before the API can be used. > > This is kind of odd; the values are thermal subsystem attributes/properties, > > and one would think that they are to be used there unless they are supposed > > to be > > written into hardware. Given that, I'd rather wait for the API to be > > clarified > > instead of jumping into using it. > > Zhang, Eduardo (+CC) > can you clarify a bit this point? > > > There are secondary problems with this approach; other drivers such as lm90 > > which > > support the thermal subsystem have their own means to specify and use the > > offset > > (since it needs to be programmed into the chip registers). > > I fail to see how this is related to my patch. If no coefficient is > defined in the thermal zone node in the DT the framework sets slope=1 > and offset=0, so in the hwmon driver nothing changes. > If temperature units are not well defined in SCPI, I have to assume that other units are not well defined either. You only provide a thermal subsystem specific solution for the problem, and you don't solve the problem for the driver itself. On top of that, the solution is incomplete - it assumes that the raw units may have to be scaled upward, never downward, and it assumes that the scaling factor is an integer multiple of the raw value. Given all that, I'd prefer a more comprehensive solution which also works for the next hardware which decides to report temperatures in mico-degrees C (or maybe in multiples of 2.5 mC), and voltages/currents with some arbitrary resolution. Browsing through the SCPI specification, it looks like you are right; sensors are defined to return a 64-bit value, but it isn't even specified if the returned value is signed or unsigned (or floating point, for that matter), much less its units. I hope I am wrong, but if that is the case, we'll need a much more comprehensive solution, one that is tied to 'arm,scpi-sensors' and not to the thermal subsystem. Thanks, Guenter > > Also, this would only solve the problem for temperatures and does not > > address > > the generic problem (ie voltage and current values). > > This problem is specific for the interaction between the hwmon SCPI > driver and the thermal subsystem. Look for example at the board I'm > working with one single thermal zone using the temperature data coming > from SCP: > > scpi_sensors: sensors { > compatible = "arm,scpi-sensors"; > #thermal-sensor-cells = <1>; > }; > > thermal-zones { > soc_thermal { > thermal-sensors = <&scpi_sensors 0>; > coefficients = <1000 0>; > > trips { > control: trip-point@1 { > temperature = <80000>; > hysteresis = <1000>; > type = "passive"; > }; > }; > > cooling-maps { > cpufreq_cooling_map { > trip = <&control>; > cooling-device = <&cpus THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > }; > }; > }; > }; > > trip points are defined in milli degree Celsius while the temperature > returned by the scpi_sensors node is in degree Celsius. Without using > the coefficients property and my changes how can I make the cooling > map working fine? > > Thanks, > > -- > Carlo Caione | +39.340.80.30.096 | Endless > -- > To unsubscribe from this list: send the line "unsubscribe linux-hwmon" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-hwmon" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Mar 01, 2017 at 04:57:00PM +0000, Punit Agrawal wrote: > Carlo Caione <carlo@endlessm.com> writes: > > > On Wed, Mar 1, 2017 at 4:01 PM, Guenter Roeck <linux@roeck-us.net> wrote: > >> On 03/01/2017 05:20 AM, Carlo Caione wrote: > >>> > >>> From: Carlo Caione <carlo@endlessm.com> > >>> > >>> The temperature provided by the SCP sensors not always is expressed in > >>> millicelsius, whereas this is required by the thermal framework. This is > >>> for example the case for the Amlogic devices, where the SCP sensor > >>> readings are expressed in degree (and not milli degree) Celsius. > >>> > >> Are you saying that SCPI does not specify or provide the units to be used > >> when reading values, and thus effectively just reports a more or less > >> random number ? > > > > AFAICT the standard does not specify the units to be used for the > > values or a way to get that information. For the Amlogic case the SCP > > returns the value of the temperature in degree celsius and I'm not > > sure how common is that. > > The standard does specify the units, but the way it is written seems to > suggest that the units are part of the platform implementation rather > than part of the standard [0]. > > [0] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0922g/CABBCJGH.html > Wow, that is pretty well hidden. Table 3.87 doesn't specify units. I agree, this appears to imply that the units are intentionally platform specific. Unfortunately it doesn't specify the data format, not even for the development platform, so I guess we'll have to brace ourselves for inventive new data formats. > > > > [cut] > >>> - *temp = value; > >>> + slope = thermal_zone_get_slope(zone->z); > >>> + offset = thermal_zone_get_offset(zone->z); > >> > >> > >> This is conceptually wrong. The functions return -ENODEV if thermal is > >> disabled. > >> While a negative slope does not make sense, a negative offset does. > > > > Yeah, but in that case we would have already failed to register the > > thermal zone at all in devm_thermal_zone_of_sensor_register(). > > > > In addition to the thermal sub-subsystem, hwmon sysfs interface also > expects temperature in millidegree Celsius. Ideally, any change should > fix the reporting there as well. More below. > > >> The code in the thermal subsystem does not clarify well how slope and offset > >> are supposed to be used. Since coming from the thermal subsystem, one would > >> think > >> that the thermal subsystem would apply any corrections if they are supposed > >> to be software correction values, but that does not appear to be the case, > >> leaving it up to drivers to use or not use the provided values. > > > > Yes, this is my understanding also considering this comment here: > > http://lxr.free-electrons.com/source/drivers/thermal/of-thermal.c#L1006 > > > > So apparently the slope and offset values are left to be used by the > > driver, that's why I was changing the temperature driver to use those > > values. > > > >> This is kind of odd; the values are thermal subsystem attributes/properties, > >> and one would think that they are to be used there unless they are supposed > >> to be > >> written into hardware. Given that, I'd rather wait for the API to be > >> clarified > >> instead of jumping into using it. > > > > Zhang, Eduardo (+CC) > > can you clarify a bit this point? > > > > Another way to fix this would be to add optional properties to the scpi > sensors binding and use them instead. This could then be used to fixup > values reported to both thermal and hwmon. > Yes, this would be much more appropriate. Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-hwmon" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Mar 1, 2017 at 6:56 PM, Guenter Roeck <linux@roeck-us.net> wrote: > On Wed, Mar 01, 2017 at 04:57:00PM +0000, Punit Agrawal wrote: >> Another way to fix this would be to add optional properties to the scpi >> sensors binding and use them instead. This could then be used to fixup >> values reported to both thermal and hwmon. >> > > Yes, this would be much more appropriate. What about something like: sensor-coefficients = <x1 y1 z1>, <x2 y2 z2>, ...; Where in general `x` is the sensor-id, `y` the slope and `z` the offset for the sensor reading. So the reported reading to hwmon (and thermal for temperature) for sensor with id `x1` is `y1 * val + z1`, the same for sensor with id x2 and so on for all the sensors in the array.
On Wed, Mar 01, 2017 at 07:45:50PM +0100, Carlo Caione wrote: > On Wed, Mar 1, 2017 at 6:56 PM, Guenter Roeck <linux@roeck-us.net> wrote: > > On Wed, Mar 01, 2017 at 04:57:00PM +0000, Punit Agrawal wrote: > > >> Another way to fix this would be to add optional properties to the scpi > >> sensors binding and use them instead. This could then be used to fixup > >> values reported to both thermal and hwmon. > >> > > > > Yes, this would be much more appropriate. > > What about something like: > > sensor-coefficients = <x1 y1 z1>, > <x2 y2 z2>, > ...; As mentioned before, I don't think 'slope' and 'offset' are well enough defined, so I would rather not use that approach. I don't know if there are DT examples on how to specify a translation from an arbitrary value to well-defined units; we could use that if it exists. Otherwise, we might just go with a combination of sensor class and the unit scale used for that sensor class. A simple approach might be something like scpi,sensor-scale = // pick preferred property name <0 1000>, <1 1000>, <2 1000>, <3 1000000>, <4 1000000>; or even simpler scpi,sensor-scale = <1000, 1000, 1000, 1000000, 1000000>; since the class index is really redundant. This would describe the units for the Juno ARM development platform. In your case, that would be something like scpi,sensor-scale = <1, 1000, 1000, 1000000, 1000000>; assuming that the other units are as expected (which may not be the case). Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-hwmon" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/hwmon/scpi-hwmon.c b/drivers/hwmon/scpi-hwmon.c index 094f948f99ff..cdc05c60ba67 100644 --- a/drivers/hwmon/scpi-hwmon.c +++ b/drivers/hwmon/scpi-hwmon.c @@ -33,6 +33,7 @@ struct sensor_data { struct scpi_thermal_zone { int sensor_id; struct scpi_sensors *scpi_sensors; + struct thermal_zone_device *z; }; struct scpi_sensors { @@ -51,13 +52,17 @@ static int scpi_read_temp(void *dev, int *temp) struct scpi_ops *scpi_ops = scpi_sensors->scpi_ops; struct sensor_data *sensor = &scpi_sensors->data[zone->sensor_id]; u64 value; + int slope, offset; int ret; ret = scpi_ops->sensor_get_value(sensor->info.sensor_id, &value); if (ret) return ret; - *temp = value; + slope = thermal_zone_get_slope(zone->z); + offset = thermal_zone_get_offset(zone->z); + + *temp = value * slope + offset; return 0; } @@ -216,7 +221,6 @@ static int scpi_hwmon_probe(struct platform_device *pdev) INIT_LIST_HEAD(&scpi_sensors->thermal_zones); for (i = 0; i < nr_sensors; i++) { struct sensor_data *sensor = &scpi_sensors->data[i]; - struct thermal_zone_device *z; struct scpi_thermal_zone *zone; if (sensor->info.class != TEMPERATURE) @@ -228,17 +232,17 @@ static int scpi_hwmon_probe(struct platform_device *pdev) zone->sensor_id = i; zone->scpi_sensors = scpi_sensors; - z = devm_thermal_zone_of_sensor_register(dev, - sensor->info.sensor_id, - zone, - &scpi_sensor_ops); + zone->z = devm_thermal_zone_of_sensor_register(dev, + sensor->info.sensor_id, + zone, + &scpi_sensor_ops); /* * The call to thermal_zone_of_sensor_register returns * an error for sensors that are not associated with * any thermal zones or if the thermal subsystem is * not configured. */ - if (IS_ERR(z)) { + if (IS_ERR(zone->z)) { devm_kfree(dev, zone); continue; }