diff mbox series

[RESEND,v2] thermal: Fix a NULL pointer dereference

Message ID 1636070227-15909-1-git-send-email-quic_subbaram@quicinc.com (mailing list archive)
State Mainlined, archived
Delegated to: Rafael Wysocki

Commit Message

Subbaraman Narayanamurthy Nov. 4, 2021, 11:57 p.m. UTC
of_parse_thermal_zones() parses the thermal-zones node and registers a
thermal_zone device for each subnode. However, if a thermal zone is
consuming a thermal sensor and that thermal sensor device hasn't probed
yet, an attempt to set trip_point_*_temp for that thermal zone device
can cause a NULL pointer dereference. Fix it.

 console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp
 ...
 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
 ...
 Call trace:
  of_thermal_set_trip_temp+0x40/0xc4
  trip_point_temp_store+0xc0/0x1dc
  dev_attr_store+0x38/0x88
  sysfs_kf_write+0x64/0xc0
  kernfs_fop_write_iter+0x108/0x1d0
  vfs_write+0x2f4/0x368
  ksys_write+0x7c/0xec
  __arm64_sys_write+0x20/0x30
  el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc
  do_el0_svc+0x28/0xa0
  el0_svc+0x14/0x24
  el0_sync_handler+0x88/0xec
  el0_sync+0x1c0/0x200

While at it, fix the possible NULL pointer dereference in other
functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(),
of_thermal_get_trend().

Suggested-by: David Collins <quic_collinsd@quicinc.com>
Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>
---
 drivers/thermal/thermal_of.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Comments

Greg KH Nov. 5, 2021, 6:50 a.m. UTC | #1
On Thu, Nov 04, 2021 at 04:57:07PM -0700, Subbaraman Narayanamurthy wrote:
> of_parse_thermal_zones() parses the thermal-zones node and registers a
> thermal_zone device for each subnode. However, if a thermal zone is
> consuming a thermal sensor and that thermal sensor device hasn't probed
> yet, an attempt to set trip_point_*_temp for that thermal zone device
> can cause a NULL pointer dereference. Fix it.
> 
>  console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp
>  ...
>  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
>  ...
>  Call trace:
>   of_thermal_set_trip_temp+0x40/0xc4
>   trip_point_temp_store+0xc0/0x1dc
>   dev_attr_store+0x38/0x88
>   sysfs_kf_write+0x64/0xc0
>   kernfs_fop_write_iter+0x108/0x1d0
>   vfs_write+0x2f4/0x368
>   ksys_write+0x7c/0xec
>   __arm64_sys_write+0x20/0x30
>   el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc
>   do_el0_svc+0x28/0xa0
>   el0_svc+0x14/0x24
>   el0_sync_handler+0x88/0xec
>   el0_sync+0x1c0/0x200
> 
> While at it, fix the possible NULL pointer dereference in other
> functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(),
> of_thermal_get_trend().
> 
> Suggested-by: David Collins <quic_collinsd@quicinc.com>
> Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>
> ---
>  drivers/thermal/thermal_of.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>
Rafael J. Wysocki Nov. 5, 2021, 3:14 p.m. UTC | #2
On Fri, Nov 5, 2021 at 12:57 AM Subbaraman Narayanamurthy
<quic_subbaram@quicinc.com> wrote:
>
> of_parse_thermal_zones() parses the thermal-zones node and registers a
> thermal_zone device for each subnode. However, if a thermal zone is
> consuming a thermal sensor and that thermal sensor device hasn't probed
> yet, an attempt to set trip_point_*_temp for that thermal zone device
> can cause a NULL pointer dereference. Fix it.
>
>  console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp
>  ...
>  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
>  ...
>  Call trace:
>   of_thermal_set_trip_temp+0x40/0xc4
>   trip_point_temp_store+0xc0/0x1dc
>   dev_attr_store+0x38/0x88
>   sysfs_kf_write+0x64/0xc0
>   kernfs_fop_write_iter+0x108/0x1d0
>   vfs_write+0x2f4/0x368
>   ksys_write+0x7c/0xec
>   __arm64_sys_write+0x20/0x30
>   el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc
>   do_el0_svc+0x28/0xa0
>   el0_svc+0x14/0x24
>   el0_sync_handler+0x88/0xec
>   el0_sync+0x1c0/0x200
>
> While at it, fix the possible NULL pointer dereference in other
> functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(),
> of_thermal_get_trend().

Can the subject be more specific, please?

The issue appears to be limited to the of_thermal_ family of
functions, but the subject doesn't reflect that at all.

> Suggested-by: David Collins <quic_collinsd@quicinc.com>
> Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>

Daniel, any concerns regarding the code changes below?

> ---
>  drivers/thermal/thermal_of.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
> index 6379f26..9233f7e 100644
> --- a/drivers/thermal/thermal_of.c
> +++ b/drivers/thermal/thermal_of.c
> @@ -89,7 +89,7 @@ static int of_thermal_get_temp(struct thermal_zone_device *tz,
>  {
>         struct __thermal_zone *data = tz->devdata;
>
> -       if (!data->ops->get_temp)
> +       if (!data->ops || !data->ops->get_temp)
>                 return -EINVAL;
>
>         return data->ops->get_temp(data->sensor_data, temp);
> @@ -186,6 +186,9 @@ static int of_thermal_set_emul_temp(struct thermal_zone_device *tz,
>  {
>         struct __thermal_zone *data = tz->devdata;
>
> +       if (!data->ops || !data->ops->set_emul_temp)
> +               return -EINVAL;
> +
>         return data->ops->set_emul_temp(data->sensor_data, temp);
>  }
>
> @@ -194,7 +197,7 @@ static int of_thermal_get_trend(struct thermal_zone_device *tz, int trip,
>  {
>         struct __thermal_zone *data = tz->devdata;
>
> -       if (!data->ops->get_trend)
> +       if (!data->ops || !data->ops->get_trend)
>                 return -EINVAL;
>
>         return data->ops->get_trend(data->sensor_data, trip, trend);
> @@ -301,7 +304,7 @@ static int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip,
>         if (trip >= data->ntrips || trip < 0)
>                 return -EDOM;
>
> -       if (data->ops->set_trip_temp) {
> +       if (data->ops && data->ops->set_trip_temp) {
>                 int ret;
>
>                 ret = data->ops->set_trip_temp(data->sensor_data, trip, temp);
> --
> 2.7.4
>
Daniel Lezcano Nov. 5, 2021, 4:19 p.m. UTC | #3
On 05/11/2021 16:14, Rafael J. Wysocki wrote:
> On Fri, Nov 5, 2021 at 12:57 AM Subbaraman Narayanamurthy
> <quic_subbaram@quicinc.com> wrote:
>>
>> of_parse_thermal_zones() parses the thermal-zones node and registers a
>> thermal_zone device for each subnode. However, if a thermal zone is
>> consuming a thermal sensor and that thermal sensor device hasn't probed
>> yet, an attempt to set trip_point_*_temp for that thermal zone device
>> can cause a NULL pointer dereference. Fix it.
>>
>>  console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp
>>  ...
>>  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
>>  ...
>>  Call trace:
>>   of_thermal_set_trip_temp+0x40/0xc4
>>   trip_point_temp_store+0xc0/0x1dc
>>   dev_attr_store+0x38/0x88
>>   sysfs_kf_write+0x64/0xc0
>>   kernfs_fop_write_iter+0x108/0x1d0
>>   vfs_write+0x2f4/0x368
>>   ksys_write+0x7c/0xec
>>   __arm64_sys_write+0x20/0x30
>>   el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc
>>   do_el0_svc+0x28/0xa0
>>   el0_svc+0x14/0x24
>>   el0_sync_handler+0x88/0xec
>>   el0_sync+0x1c0/0x200
>>
>> While at it, fix the possible NULL pointer dereference in other
>> functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(),
>> of_thermal_get_trend().
> 
> Can the subject be more specific, please?
> 
> The issue appears to be limited to the of_thermal_ family of
> functions, but the subject doesn't reflect that at all.
> 
>> Suggested-by: David Collins <quic_collinsd@quicinc.com>
>> Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>
> 
> Daniel, any concerns regarding the code changes below?

I've a concern about the root cause but I did not have time to
investigate how to fix it nicely.

thermal_of is responsible for introducing itself between the thermal core
code and the backend. So it defines the ops which in turn call the
sensor ops leading us to this problem.

So, without a better solution, this fix can be applied until we rethink
the thermal_of approach.

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Rafael J. Wysocki Nov. 5, 2021, 4:37 p.m. UTC | #4
On Fri, Nov 5, 2021 at 5:19 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
> On 05/11/2021 16:14, Rafael J. Wysocki wrote:
> > On Fri, Nov 5, 2021 at 12:57 AM Subbaraman Narayanamurthy
> > <quic_subbaram@quicinc.com> wrote:
> >>
> >> of_parse_thermal_zones() parses the thermal-zones node and registers a
> >> thermal_zone device for each subnode. However, if a thermal zone is
> >> consuming a thermal sensor and that thermal sensor device hasn't probed
> >> yet, an attempt to set trip_point_*_temp for that thermal zone device
> >> can cause a NULL pointer dereference. Fix it.
> >>
> >>  console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp
> >>  ...
> >>  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
> >>  ...
> >>  Call trace:
> >>   of_thermal_set_trip_temp+0x40/0xc4
> >>   trip_point_temp_store+0xc0/0x1dc
> >>   dev_attr_store+0x38/0x88
> >>   sysfs_kf_write+0x64/0xc0
> >>   kernfs_fop_write_iter+0x108/0x1d0
> >>   vfs_write+0x2f4/0x368
> >>   ksys_write+0x7c/0xec
> >>   __arm64_sys_write+0x20/0x30
> >>   el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc
> >>   do_el0_svc+0x28/0xa0
> >>   el0_svc+0x14/0x24
> >>   el0_sync_handler+0x88/0xec
> >>   el0_sync+0x1c0/0x200
> >>
> >> While at it, fix the possible NULL pointer dereference in other
> >> functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(),
> >> of_thermal_get_trend().
> >
> > Can the subject be more specific, please?
> >
> > The issue appears to be limited to the of_thermal_ family of
> > functions, but the subject doesn't reflect that at all.
> >
> >> Suggested-by: David Collins <quic_collinsd@quicinc.com>
> >> Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>
> >
> > Daniel, any concerns regarding the code changes below?
>
> I've a concern about the root cause but I did not have time to
> investigate how to fix it nicely.
>
> thermal_of is responsible for introducing itself between the thermal core
> code and the backend. So it defines the ops which in turn call the
> sensor ops leading us to this problem.
>
> So, without a better solution, this fix can be applied until we rethink
> the thermal_of approach.
>
> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>

Thanks!

I've queued it up for 5.16-rc as "thermal: Fix NULL pointer
dereferences in of_thermal_ functions".
Subbaraman Narayanamurthy Nov. 5, 2021, 8:06 p.m. UTC | #5
On 11/4/21 11:50 PM, Greg KH wrote:
> On Thu, Nov 04, 2021 at 04:57:07PM -0700, Subbaraman Narayanamurthy wrote:
>> of_parse_thermal_zones() parses the thermal-zones node and registers a
>> thermal_zone device for each subnode. However, if a thermal zone is
>> consuming a thermal sensor and that thermal sensor device hasn't probed
>> yet, an attempt to set trip_point_*_temp for that thermal zone device
>> can cause a NULL pointer dereference. Fix it.
>>
>>  console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp
>>  ...
>>  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
>>  ...
>>  Call trace:
>>   of_thermal_set_trip_temp+0x40/0xc4
>>   trip_point_temp_store+0xc0/0x1dc
>>   dev_attr_store+0x38/0x88
>>   sysfs_kf_write+0x64/0xc0
>>   kernfs_fop_write_iter+0x108/0x1d0
>>   vfs_write+0x2f4/0x368
>>   ksys_write+0x7c/0xec
>>   __arm64_sys_write+0x20/0x30
>>   el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc
>>   do_el0_svc+0x28/0xa0
>>   el0_svc+0x14/0x24
>>   el0_sync_handler+0x88/0xec
>>   el0_sync+0x1c0/0x200
>>
>> While at it, fix the possible NULL pointer dereference in other
>> functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(),
>> of_thermal_get_trend().
>>
>> Suggested-by: David Collins <quic_collinsd@quicinc.com>
>> Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>
>> ---
>>  drivers/thermal/thermal_of.c | 9 ++++++---
>>  1 file changed, 6 insertions(+), 3 deletions(-)
>>
> <formletter>
>
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree.  Please read:
>     https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> for how to do this properly.
>
> </formletter>

Hi Greg,
In this case, is it because I cc'ed stable@vger.kernel.org directly instead of adding a "Cc: stable@vger.kernel.org" tag to the commit message itself?

Thanks,
Subbaraman
Subbaraman Narayanamurthy Nov. 5, 2021, 8:08 p.m. UTC | #6
On 11/5/21 9:37 AM, Rafael J. Wysocki wrote:
> On Fri, Nov 5, 2021 at 5:19 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>> On 05/11/2021 16:14, Rafael J. Wysocki wrote:
>>> On Fri, Nov 5, 2021 at 12:57 AM Subbaraman Narayanamurthy
>>> <quic_subbaram@quicinc.com> wrote:
>>>> of_parse_thermal_zones() parses the thermal-zones node and registers a
>>>> thermal_zone device for each subnode. However, if a thermal zone is
>>>> consuming a thermal sensor and that thermal sensor device hasn't probed
>>>> yet, an attempt to set trip_point_*_temp for that thermal zone device
>>>> can cause a NULL pointer dereference. Fix it.
>>>>
>>>>  console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp
>>>>  ...
>>>>  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
>>>>  ...
>>>>  Call trace:
>>>>   of_thermal_set_trip_temp+0x40/0xc4
>>>>   trip_point_temp_store+0xc0/0x1dc
>>>>   dev_attr_store+0x38/0x88
>>>>   sysfs_kf_write+0x64/0xc0
>>>>   kernfs_fop_write_iter+0x108/0x1d0
>>>>   vfs_write+0x2f4/0x368
>>>>   ksys_write+0x7c/0xec
>>>>   __arm64_sys_write+0x20/0x30
>>>>   el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc
>>>>   do_el0_svc+0x28/0xa0
>>>>   el0_svc+0x14/0x24
>>>>   el0_sync_handler+0x88/0xec
>>>>   el0_sync+0x1c0/0x200
>>>>
>>>> While at it, fix the possible NULL pointer dereference in other
>>>> functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(),
>>>> of_thermal_get_trend().
>>> Can the subject be more specific, please?
>>>
>>> The issue appears to be limited to the of_thermal_ family of
>>> functions, but the subject doesn't reflect that at all.
>>>
>>>> Suggested-by: David Collins <quic_collinsd@quicinc.com>
>>>> Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>
>>> Daniel, any concerns regarding the code changes below?
>> I've a concern about the root cause but I did not have time to
>> investigate how to fix it nicely.
>>
>> thermal_of is responsible for introducing itself between the thermal core
>> code and the backend. So it defines the ops which in turn call the
>> sensor ops leading us to this problem.
>>
>> So, without a better solution, this fix can be applied until we rethink
>> the thermal_of approach.
>>
>> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Thanks!
>
> I've queued it up for 5.16-rc as "thermal: Fix NULL pointer
> dereferences in of_thermal_ functions".

Thanks, Daniel and Rafael. So I guess I don't need to send a v3 with the commit subject fixed, right?

-Subbaraman
Rafael J. Wysocki Nov. 5, 2021, 8:19 p.m. UTC | #7
On Fri, Nov 5, 2021 at 9:08 PM Subbaraman Narayanamurthy
<quic_subbaram@quicinc.com> wrote:
>
> On 11/5/21 9:37 AM, Rafael J. Wysocki wrote:
> > On Fri, Nov 5, 2021 at 5:19 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
> >> On 05/11/2021 16:14, Rafael J. Wysocki wrote:
> >>> On Fri, Nov 5, 2021 at 12:57 AM Subbaraman Narayanamurthy
> >>> <quic_subbaram@quicinc.com> wrote:
> >>>> of_parse_thermal_zones() parses the thermal-zones node and registers a
> >>>> thermal_zone device for each subnode. However, if a thermal zone is
> >>>> consuming a thermal sensor and that thermal sensor device hasn't probed
> >>>> yet, an attempt to set trip_point_*_temp for that thermal zone device
> >>>> can cause a NULL pointer dereference. Fix it.
> >>>>
> >>>>  console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp
> >>>>  ...
> >>>>  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
> >>>>  ...
> >>>>  Call trace:
> >>>>   of_thermal_set_trip_temp+0x40/0xc4
> >>>>   trip_point_temp_store+0xc0/0x1dc
> >>>>   dev_attr_store+0x38/0x88
> >>>>   sysfs_kf_write+0x64/0xc0
> >>>>   kernfs_fop_write_iter+0x108/0x1d0
> >>>>   vfs_write+0x2f4/0x368
> >>>>   ksys_write+0x7c/0xec
> >>>>   __arm64_sys_write+0x20/0x30
> >>>>   el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc
> >>>>   do_el0_svc+0x28/0xa0
> >>>>   el0_svc+0x14/0x24
> >>>>   el0_sync_handler+0x88/0xec
> >>>>   el0_sync+0x1c0/0x200
> >>>>
> >>>> While at it, fix the possible NULL pointer dereference in other
> >>>> functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(),
> >>>> of_thermal_get_trend().
> >>> Can the subject be more specific, please?
> >>>
> >>> The issue appears to be limited to the of_thermal_ family of
> >>> functions, but the subject doesn't reflect that at all.
> >>>
> >>>> Suggested-by: David Collins <quic_collinsd@quicinc.com>
> >>>> Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>
> >>> Daniel, any concerns regarding the code changes below?
> >> I've a concern about the root cause but I did not have time to
> >> investigate how to fix it nicely.
> >>
> >> thermal_of is responsible for introducing itself between the thermal core
> >> code and the backend. So it defines the ops which in turn call the
> >> sensor ops leading us to this problem.
> >>
> >> So, without a better solution, this fix can be applied until we rethink
> >> the thermal_of approach.
> >>
> >> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> > Thanks!
> >
> > I've queued it up for 5.16-rc as "thermal: Fix NULL pointer
> > dereferences in of_thermal_ functions".
>
> Thanks, Daniel and Rafael. So I guess I don't need to send a v3 with the commit subject fixed, right?

Right.

Patch

diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 6379f26..9233f7e 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -89,7 +89,7 @@  static int of_thermal_get_temp(struct thermal_zone_device *tz,
 {
 	struct __thermal_zone *data = tz->devdata;
 
-	if (!data->ops->get_temp)
+	if (!data->ops || !data->ops->get_temp)
 		return -EINVAL;
 
 	return data->ops->get_temp(data->sensor_data, temp);
@@ -186,6 +186,9 @@  static int of_thermal_set_emul_temp(struct thermal_zone_device *tz,
 {
 	struct __thermal_zone *data = tz->devdata;
 
+	if (!data->ops || !data->ops->set_emul_temp)
+		return -EINVAL;
+
 	return data->ops->set_emul_temp(data->sensor_data, temp);
 }
 
@@ -194,7 +197,7 @@  static int of_thermal_get_trend(struct thermal_zone_device *tz, int trip,
 {
 	struct __thermal_zone *data = tz->devdata;
 
-	if (!data->ops->get_trend)
+	if (!data->ops || !data->ops->get_trend)
 		return -EINVAL;
 
 	return data->ops->get_trend(data->sensor_data, trip, trend);
@@ -301,7 +304,7 @@  static int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip,
 	if (trip >= data->ntrips || trip < 0)
 		return -EDOM;
 
-	if (data->ops->set_trip_temp) {
+	if (data->ops && data->ops->set_trip_temp) {
 		int ret;
 
 		ret = data->ops->set_trip_temp(data->sensor_data, trip, temp);