[04/13] thermal: Fix not emulating critical temperatures

Message ID 1427385240-6086-5-git-send-email-s.hauer@pengutronix.de
State Changes Requested
Delegated to: Eduardo Valentin

Commit Message

Sascha Hauer March 26, 2015, 3:53 p.m. UTC
commit e6e238c38 (thermal: sysfs: Add a new sysfs node emul_temp for
thermal emulation) promised not to emulate critical temperatures,
but the check for critical temperatures is broken in multiple ways:

- The code should only accept an emulated temperature when the emulated
  temperature is lower than the critical temperature. Instead the code
  accepts an emulated temperature whenever the real temperature is lower
  than the critical temperature. This makes no sense and trying to
  emulate a temperature higher than the critical temperature halts the
  system.
- When trying to emulate a higher-than-critical temperature we should either
  limit the emulated temperature to the maximum non critical temperature
  or refuse to emulate this temperature. Instead the code just silently
  ignores the emulated temperature and continues with the real temperature.

This patch moves the test for illegal emulated temperature to the sysfs
write function so that we can properly refuse illegal temperatures here.
Trying to write illegal temperatures results in an error message. While
at it use IS_ENABLED() instead of #ifdefs.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 drivers/thermal/thermal_core.c | 46 ++++++++++++++++++++++--------------------
 1 file changed, 24 insertions(+), 22 deletions(-)
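
The write-time check described in the commit message can be modelled
outside the kernel. The sketch below is only an illustration: the
struct trip type, the TRIP_* enumerators and validate_emul_temp() are
hypothetical stand-ins, not kernel APIs, and the error paths of
get_trip_type()/get_trip_temp() handled by the patch are left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's trip types (assumption). */
enum trip_type { TRIP_ACTIVE, TRIP_PASSIVE, TRIP_HOT, TRIP_CRITICAL };

struct trip {
	enum trip_type type;
	unsigned long temp;	/* millidegrees Celsius */
};

/*
 * Model of the check the patch adds to emul_temp_store(): refuse a
 * requested emulated temperature that is at or above any critical
 * trip point. Returns 0 if accepted, -EINVAL if refused.
 */
static int validate_emul_temp(const struct trip *trips, size_t ntrips,
			      unsigned long requested)
{
	size_t i;

	for (i = 0; i < ntrips; i++) {
		/* Only critical trip points limit the emulated value. */
		if (trips[i].type != TRIP_CRITICAL)
			continue;

		if (requested >= trips[i].temp)
			return -EINVAL;
	}

	return 0;
}
```

With a critical trip at 100000 (100 C), a request of 90000 would be
accepted and a request of 100000 or more would be refused. A zone
without a critical trip accepts any value.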

Comments

Carlos Hernandez March 26, 2015, 6:13 p.m. UTC | #1
On 03/26/2015 11:53 AM, Sascha Hauer wrote:
> - The code should only accept an emulated temperature when the emulated
>    temperature is lower than the critical temperature. Instead the code

Why?
Emulating temperatures higher than the critical temperature is useful for
testing. For instance, it allows one to validate that the critical action
(e.g. shutdown) is triggered.



--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sascha Hauer March 26, 2015, 6:55 p.m. UTC | #2
+Cc Amit who implemented the emulation code.

On Thu, Mar 26, 2015 at 02:13:47PM -0400, Carlos Hernandez wrote:
> On 03/26/2015 11:53 AM, Sascha Hauer wrote:
> >- The code should only accept an emulated temperature when the emulated
> >   temperature is lower than the critical temperature. Instead the code
> 
> Why?
> Emulating temperatures higher than the critical temperature is useful
> for testing. For instance, it allows one to validate that the critical
> action (e.g. shutdown) is triggered.

Not emulating critical temperatures was the intention of the original
patch adding the emulation code, but it was implemented wrongly. I am
just fixing it to match the intended behaviour.
However, I also find emulating critical temperatures useful. I could
also remove the check instead of fixing it if we can agree on that
behaviour.

Sascha
Amit Kachhap March 27, 2015, 3:05 a.m. UTC | #3
Hi Sascha,

On Thu, Mar 26, 2015 at 9:23 PM, Sascha Hauer <s.hauer@pengutronix.de> wrote:
> commit e6e238c38 (thermal: sysfs: Add a new sysfs node emul_temp for
> thermal emulation) promised not to emulate critical temperatures,
> but the check for critical temperatures is broken in multiple ways:
>
> - The code should only accept an emulated temperature when the emulated
>   temperature is lower than the critical temperature. Instead the code
>   accepts an emulated temperature whenever the real temperature is lower
>   than the critical temperature. This makes no sense and trying to
>   emulate a temperature higher than the critical temperature halts the
>   system.
Even temperatures higher than the critical temperature should be
accepted. See my further comments below.
> - When trying to emulate a higher-than-critical temperature we should either
>   limit the emulated temperature to the maximum non critical temperature
>   or refuse to emulate this temperature. Instead the code just silently
>   ignores the emulated temperature and continues with the real temperature.
>
> This patch moves the test for illegal emulated temperature to the sysfs
> write function so that we can properly refuse illegal temperatures here.
> Trying to write illegal temperatures results in an error message. While
> at it use IS_ENABLED() instead of #ifdefs.
>
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
>  drivers/thermal/thermal_core.c | 46 ++++++++++++++++++++++--------------------
>  1 file changed, 24 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index dcea909..ebca854 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -414,11 +414,6 @@ static void handle_thermal_trip(struct thermal_zone_device *tz, int trip)
>  int thermal_zone_get_temp(struct thermal_zone_device *tz, unsigned long *temp)
>  {
>         int ret = -EINVAL;
> -#ifdef CONFIG_THERMAL_EMULATION
> -       int count;
> -       unsigned long crit_temp = -1UL;
> -       enum thermal_trip_type type;
> -#endif
>
>         if (!tz || IS_ERR(tz) || !tz->ops->get_temp)
>                 goto exit;
> @@ -426,25 +421,10 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, unsigned long *temp)
>         mutex_lock(&tz->lock);
>
>         ret = tz->ops->get_temp(tz, temp);
> -#ifdef CONFIG_THERMAL_EMULATION
> -       if (!tz->emul_temperature)
> -               goto skip_emul;
> -
> -       for (count = 0; count < tz->trips; count++) {
> -               ret = tz->ops->get_trip_type(tz, count, &type);
> -               if (!ret && type == THERMAL_TRIP_CRITICAL) {
> -                       ret = tz->ops->get_trip_temp(tz, count, &crit_temp);
> -                       break;
> -               }
> -       }
> -
> -       if (ret)
> -               goto skip_emul;
>
> -       if (*temp < crit_temp)
I guess this check is confusing. Instead of returning the emulated
temperature, it returns the actual temperature, but only when the
actual temperature has reached the critical temperature. So this check
prevents the user from masking a real critical temperature with an
emulated one, and hence prevents the chip from burning up.
> +       if (IS_ENABLED(CONFIG_THERMAL_EMULATION) && tz->emul_temperature)
>                 *temp = tz->emul_temperature;
> -skip_emul:
> -#endif
> +
>         mutex_unlock(&tz->lock);
>  exit:
>         return ret;
> @@ -788,10 +768,32 @@ emul_temp_store(struct device *dev, struct device_attribute *attr,
>         struct thermal_zone_device *tz = to_thermal_zone(dev);
>         int ret = 0;
>         unsigned long temperature;
> +       int trip;
> +       unsigned long crit_temp;
> +       enum thermal_trip_type type;
>
>         if (kstrtoul(buf, 10, &temperature))
>                 return -EINVAL;
>
> +       for (trip = 0; trip < tz->trips; trip++) {
> +               ret = tz->ops->get_trip_type(tz, trip, &type);
> +               if (ret)
> +                       return ret;
> +
> +               if (type != THERMAL_TRIP_CRITICAL)
> +                       continue;
> +
> +               ret = tz->ops->get_trip_temp(tz, trip, &crit_temp);
> +               if (ret)
> +                       return ret;
> +
> +               if (temperature >= crit_temp) {
> +                       dev_err(&tz->device, "Will not emulate critical temperature %luC (tcrit=%luC)\n",
> +                                       temperature / 1000, crit_temp / 1000);
> +                       return -EINVAL;
> +               }
Emulating critical temperatures is very much needed.
> +       }
> +
>         if (!tz->ops->set_emul_temp) {
>                 mutex_lock(&tz->lock);
>                 tz->emul_temperature = temperature;
> --
> 2.1.4
>
Sascha Hauer March 27, 2015, 5:23 a.m. UTC | #4
Hi Amit,

On Fri, Mar 27, 2015 at 08:35:50AM +0530, amit daniel kachhap wrote:
> Hi Sascha,
> 
> > -#ifdef CONFIG_THERMAL_EMULATION
> > -       if (!tz->emul_temperature)
> > -               goto skip_emul;
> > -
> > -       for (count = 0; count < tz->trips; count++) {
> > -               ret = tz->ops->get_trip_type(tz, count, &type);
> > -               if (!ret && type == THERMAL_TRIP_CRITICAL) {
> > -                       ret = tz->ops->get_trip_temp(tz, count, &crit_temp);
> > -                       break;
> > -               }
> > -       }
> > -
> > -       if (ret)
> > -               goto skip_emul;
> >
> > -       if (*temp < crit_temp)
> I guess this check is confusing. Instead of returning the emulated
> temperature, it returns the actual temperature, but only when the
> actual temperature has reached the critical temperature. So this check
> prevents the user from masking a real critical temperature with an
> emulated one, and hence prevents the chip from burning up.

Indeed the check is confusing, but now it makes perfect sense. I'll
look at the patch again and maybe turn it into a patch that just adds a
comment to clarify this.

Sascha
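
For readers following the thread, the behaviour of the original check
as Amit explains it can be modelled in plain C. This is a minimal
sketch under stated assumptions: reported_temp() and NO_CRIT_TRIP are
hypothetical names, the zone and its ops are reduced to plain values,
and the error handling of the original loop is omitted. As in the
original code, an emulated value of 0 means emulation is off, and a
zone without a critical trip behaves as if the critical temperature
were -1UL.

```c
#include <limits.h>

/* Mirrors the original initialiser "crit_temp = -1UL" (assumption). */
#define NO_CRIT_TRIP ULONG_MAX

/*
 * Model of the original logic in thermal_zone_get_temp(): decide which
 * temperature is reported to the thermal core. All values are in
 * millidegrees Celsius.
 */
static unsigned long reported_temp(unsigned long real, unsigned long emul,
				   unsigned long crit)
{
	/* Emulation is disabled: always report the sensor reading. */
	if (!emul)
		return real;

	/*
	 * The real reading has already reached the critical trip point.
	 * Do not let an emulated value hide it, so that the critical
	 * action still runs.
	 */
	if (real >= crit)
		return real;

	/* Otherwise the emulated value wins, even if it is above crit. */
	return emul;
}
```

Under this model, a real reading of 45000 with an emulated 120000 and a
critical trip at 100000 reports 120000, so the critical action can be
tested, while a real reading of 105000 is reported as is regardless of
the emulated value.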
Eduardo Valentin April 7, 2015, 2:08 a.m. UTC | #5
On Fri, Mar 27, 2015 at 06:23:18AM +0100, Sascha Hauer wrote:
> Hi Amit,
> 
> On Fri, Mar 27, 2015 at 08:35:50AM +0530, amit daniel kachhap wrote:
> > Hi Sascha,
> > 
> > > -#ifdef CONFIG_THERMAL_EMULATION
> > > -       if (!tz->emul_temperature)
> > > -               goto skip_emul;
> > > -
> > > -       for (count = 0; count < tz->trips; count++) {
> > > -               ret = tz->ops->get_trip_type(tz, count, &type);
> > > -               if (!ret && type == THERMAL_TRIP_CRITICAL) {
> > > -                       ret = tz->ops->get_trip_temp(tz, count, &crit_temp);
> > > -                       break;
> > > -               }
> > > -       }
> > > -
> > > -       if (ret)
> > > -               goto skip_emul;
> > >
> > > -       if (*temp < crit_temp)
> > I guess this check is confusing. Instead of returning the emulated
> > temperature, it returns the actual temperature, but only when the
> > actual temperature has reached the critical temperature. So this check
> > prevents the user from masking a real critical temperature with an
> > emulated one, and hence prevents the chip from burning up.
> 
> Indeed the check is confusing, but now it makes perfect sense. I'll
> look at the patch again and maybe turn it into a patch that just adds a
> comment to clarify this.

That will be great. Thanks Sascha.

> 
> Sascha
> 
> -- 
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

Patch

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index dcea909..ebca854 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -414,11 +414,6 @@  static void handle_thermal_trip(struct thermal_zone_device *tz, int trip)
 int thermal_zone_get_temp(struct thermal_zone_device *tz, unsigned long *temp)
 {
 	int ret = -EINVAL;
-#ifdef CONFIG_THERMAL_EMULATION
-	int count;
-	unsigned long crit_temp = -1UL;
-	enum thermal_trip_type type;
-#endif
 
 	if (!tz || IS_ERR(tz) || !tz->ops->get_temp)
 		goto exit;
@@ -426,25 +421,10 @@  int thermal_zone_get_temp(struct thermal_zone_device *tz, unsigned long *temp)
 	mutex_lock(&tz->lock);
 
 	ret = tz->ops->get_temp(tz, temp);
-#ifdef CONFIG_THERMAL_EMULATION
-	if (!tz->emul_temperature)
-		goto skip_emul;
-
-	for (count = 0; count < tz->trips; count++) {
-		ret = tz->ops->get_trip_type(tz, count, &type);
-		if (!ret && type == THERMAL_TRIP_CRITICAL) {
-			ret = tz->ops->get_trip_temp(tz, count, &crit_temp);
-			break;
-		}
-	}
-
-	if (ret)
-		goto skip_emul;
 
-	if (*temp < crit_temp)
+	if (IS_ENABLED(CONFIG_THERMAL_EMULATION) && tz->emul_temperature)
 		*temp = tz->emul_temperature;
-skip_emul:
-#endif
+
 	mutex_unlock(&tz->lock);
 exit:
 	return ret;
@@ -788,10 +768,32 @@  emul_temp_store(struct device *dev, struct device_attribute *attr,
 	struct thermal_zone_device *tz = to_thermal_zone(dev);
 	int ret = 0;
 	unsigned long temperature;
+	int trip;
+	unsigned long crit_temp;
+	enum thermal_trip_type type;
 
 	if (kstrtoul(buf, 10, &temperature))
 		return -EINVAL;
 
+	for (trip = 0; trip < tz->trips; trip++) {
+		ret = tz->ops->get_trip_type(tz, trip, &type);
+		if (ret)
+			return ret;
+
+		if (type != THERMAL_TRIP_CRITICAL)
+			continue;
+
+		ret = tz->ops->get_trip_temp(tz, trip, &crit_temp);
+		if (ret)
+			return ret;
+
+		if (temperature >= crit_temp) {
+			dev_err(&tz->device, "Will not emulate critical temperature %luC (tcrit=%luC)\n",
+					temperature / 1000, crit_temp / 1000);
+			return -EINVAL;
+		}
+	}
+
 	if (!tz->ops->set_emul_temp) {
 		mutex_lock(&tz->lock);
 		tz->emul_temperature = temperature;