diff mbox series

thermal: core: Move initial num_trips assignment before memcpy()

Message ID 20240226-thermal-fix-fortify-panic-num_trips-v1-1-accc12a341d7@kernel.org (mailing list archive)
State Mainlined, archived
Headers show
Series thermal: core: Move initial num_trips assignment before memcpy() | expand

Commit Message

Nathan Chancellor Feb. 27, 2024, 12:54 a.m. UTC
When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
that supports __counted_by() (such as clang-18 and newer), there is a
panic on boot:

  [    2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
  [    2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
  ...
  [    3.039208] Call trace:
  [    3.041643]  __fortify_report+0x5c/0x74
  [    3.045469]  __fortify_panic+0x18/0x20
  [    3.049209]  thermal_zone_device_register_with_trips+0x4c8/0x4f8

This panic occurs because trips is counted by num_trips but num_trips is
assigned after the call to memcpy(), so the fortify checks think the
buffer size is zero because tz was allocated with kzalloc().

Move the num_trips assignment before the memcpy() to resolve the panic
and ensure that the fortify checks work properly.

Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
---
 drivers/thermal/thermal_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


---
base-commit: a85739c8c6894c3b9ff860e79e91db44cb59bd63
change-id: 20240226-thermal-fix-fortify-panic-num_trips-5f94094fb963

Best regards,

Comments

Kees Cook Feb. 27, 2024, 2:08 a.m. UTC | #1
On Mon, Feb 26, 2024 at 05:54:58PM -0700, Nathan Chancellor wrote:
> When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
> that supports __counted_by() (such as clang-18 and newer), there is a
> panic on boot:
> 
>   [    2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0

Yay, the "better details" output is working. :)

>   [    2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
>   ...
>   [    3.039208] Call trace:
>   [    3.041643]  __fortify_report+0x5c/0x74
>   [    3.045469]  __fortify_panic+0x18/0x20
>   [    3.049209]  thermal_zone_device_register_with_trips+0x4c8/0x4f8
> 
> This panic occurs because trips is counted by num_trips but num_trips is
> assigned after the call to memcpy(), so the fortify checks think the
> buffer size is zero because tz was allocated with kzalloc().
> 
> Move the num_trips assignment before the memcpy() to resolve the panic
> and ensure that the fortify checks work properly.
> 
> Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device")
> Signed-off-by: Nathan Chancellor <nathan@kernel.org>
> ---
>  drivers/thermal/thermal_core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index bb21f78b4bfa..1eabc8ebe27d 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -1354,8 +1354,8 @@ thermal_zone_device_register_with_trips(const char *type,
>  
>  	tz->device.class = thermal_class;
>  	tz->devdata = devdata;
> -	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
>  	tz->num_trips = num_trips;
> +	memcpy(tz->trips, trips, num_trips * sizeof(*trips));

Looks good to me; thanks for catching this!

Reviewed-by: Kees Cook <keescook@chromium.org>
Lukasz Luba Feb. 27, 2024, 9:58 a.m. UTC | #2
Hi Nathan,

On 2/27/24 00:54, Nathan Chancellor wrote:
> When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
> that supports __counted_by() (such as clang-18 and newer), there is a
> panic on boot:
> 
>    [    2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
>    [    2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
>    ...
>    [    3.039208] Call trace:
>    [    3.041643]  __fortify_report+0x5c/0x74
>    [    3.045469]  __fortify_panic+0x18/0x20
>    [    3.049209]  thermal_zone_device_register_with_trips+0x4c8/0x4f8
> 
> This panic occurs because trips is counted by num_trips but num_trips is
> assigned after the call to memcpy(), so the fortify checks think the
> buffer size is zero because tz was allocated with kzalloc().

I don't know this tool (yet), but this description doesn't help to
understand it better.

In the memcpy() the 'num_trips' is used not 'tz->num_trips' and
'num_trips' is set in the function argument.

> 
> Move the num_trips assignment before the memcpy() to resolve the panic
> and ensure that the fortify checks work properly.

I don't see how this change can impact the code not this tool.

> 
> Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device")
> Signed-off-by: Nathan Chancellor <nathan@kernel.org>
> ---
>   drivers/thermal/thermal_core.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index bb21f78b4bfa..1eabc8ebe27d 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -1354,8 +1354,8 @@ thermal_zone_device_register_with_trips(const char *type,
>   
>   	tz->device.class = thermal_class;
>   	tz->devdata = devdata;
> -	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
>   	tz->num_trips = num_trips;
> +	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
>   
>   	thermal_set_delay_jiffies(&tz->passive_delay_jiffies, passive_delay);
>   	thermal_set_delay_jiffies(&tz->polling_delay_jiffies, polling_delay);
> 
> ---
> base-commit: a85739c8c6894c3b9ff860e79e91db44cb59bd63
> change-id: 20240226-thermal-fix-fortify-panic-num_trips-5f94094fb963
> 
> Best regards,

Maybe it reports different issue. There is some corner case when the
num_trips is 0. It's called from
thermal_tripless_zone_device_register().
Does your code is triggered from that function?
(you've cut the call stack so I cannot see this there)

Regards,
Lukasz
Daniel Lezcano Feb. 27, 2024, 10:14 a.m. UTC | #3
On 27/02/2024 01:54, Nathan Chancellor wrote:
> When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
> that supports __counted_by() (such as clang-18 and newer), there is a
> panic on boot:
> 
>    [    2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
>    [    2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
>    ...
>    [    3.039208] Call trace:
>    [    3.041643]  __fortify_report+0x5c/0x74
>    [    3.045469]  __fortify_panic+0x18/0x20
>    [    3.049209]  thermal_zone_device_register_with_trips+0x4c8/0x4f8
> 
> This panic occurs because trips is counted by num_trips but num_trips is
> assigned after the call to memcpy(), so the fortify checks think the
> buffer size is zero because tz was allocated with kzalloc().
> 
> Move the num_trips assignment before the memcpy() to resolve the panic
> and ensure that the fortify checks work properly.
> 
> Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device")
> Signed-off-by: Nathan Chancellor <nathan@kernel.org>
> ---
>   drivers/thermal/thermal_core.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index bb21f78b4bfa..1eabc8ebe27d 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -1354,8 +1354,8 @@ thermal_zone_device_register_with_trips(const char *type,
>   
>   	tz->device.class = thermal_class;
>   	tz->devdata = devdata;
> -	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
>   	tz->num_trips = num_trips;
> +	memcpy(tz->trips, trips, num_trips * sizeof(*trips));

IIUC, clang-18 is used and supports __counted_by().

Is it possible sizeof(*trips) returns already the real trips array size 
and we are multiplying it again by num_trips ?

While with an older compiler, __counted_by() does nothing and we have to 
multiply by num_trips ?

IOW, the array size arithmetic is different depending if we have 
_counted_by supported or not ?



>   	thermal_set_delay_jiffies(&tz->passive_delay_jiffies, passive_delay);
>   	thermal_set_delay_jiffies(&tz->polling_delay_jiffies, polling_delay);
> 
> ---
> base-commit: a85739c8c6894c3b9ff860e79e91db44cb59bd63
> change-id: 20240226-thermal-fix-fortify-panic-num_trips-5f94094fb963
> 
> Best regards,
Rafael J. Wysocki Feb. 27, 2024, 11:07 a.m. UTC | #4
On Tue, Feb 27, 2024 at 3:08 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Mon, Feb 26, 2024 at 05:54:58PM -0700, Nathan Chancellor wrote:
> > When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
> > that supports __counted_by() (such as clang-18 and newer), there is a
> > panic on boot:
> >
> >   [    2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
>
> Yay, the "better details" output is working. :)
>
> >   [    2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
> >   ...
> >   [    3.039208] Call trace:
> >   [    3.041643]  __fortify_report+0x5c/0x74
> >   [    3.045469]  __fortify_panic+0x18/0x20
> >   [    3.049209]  thermal_zone_device_register_with_trips+0x4c8/0x4f8
> >
> > This panic occurs because trips is counted by num_trips but num_trips is
> > assigned after the call to memcpy(), so the fortify checks think the
> > buffer size is zero because tz was allocated with kzalloc().
> >
> > Move the num_trips assignment before the memcpy() to resolve the panic
> > and ensure that the fortify checks work properly.
> >
> > Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device")
> > Signed-off-by: Nathan Chancellor <nathan@kernel.org>
> > ---
> >  drivers/thermal/thermal_core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> > index bb21f78b4bfa..1eabc8ebe27d 100644
> > --- a/drivers/thermal/thermal_core.c
> > +++ b/drivers/thermal/thermal_core.c
> > @@ -1354,8 +1354,8 @@ thermal_zone_device_register_with_trips(const char *type,
> >
> >       tz->device.class = thermal_class;
> >       tz->devdata = devdata;
> > -     memcpy(tz->trips, trips, num_trips * sizeof(*trips));
> >       tz->num_trips = num_trips;
> > +     memcpy(tz->trips, trips, num_trips * sizeof(*trips));
>
> Looks good to me; thanks for catching this!
>
> Reviewed-by: Kees Cook <keescook@chromium.org>

Applied, thanks!
Rafael J. Wysocki Feb. 27, 2024, 11:09 a.m. UTC | #5
On Tue, Feb 27, 2024 at 11:14 AM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
> On 27/02/2024 01:54, Nathan Chancellor wrote:
> > When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
> > that supports __counted_by() (such as clang-18 and newer), there is a
> > panic on boot:
> >
> >    [    2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
> >    [    2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
> >    ...
> >    [    3.039208] Call trace:
> >    [    3.041643]  __fortify_report+0x5c/0x74
> >    [    3.045469]  __fortify_panic+0x18/0x20
> >    [    3.049209]  thermal_zone_device_register_with_trips+0x4c8/0x4f8
> >
> > This panic occurs because trips is counted by num_trips but num_trips is
> > assigned after the call to memcpy(), so the fortify checks think the
> > buffer size is zero because tz was allocated with kzalloc().
> >
> > Move the num_trips assignment before the memcpy() to resolve the panic
> > and ensure that the fortify checks work properly.
> >
> > Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device")
> > Signed-off-by: Nathan Chancellor <nathan@kernel.org>
> > ---
> >   drivers/thermal/thermal_core.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> > index bb21f78b4bfa..1eabc8ebe27d 100644
> > --- a/drivers/thermal/thermal_core.c
> > +++ b/drivers/thermal/thermal_core.c
> > @@ -1354,8 +1354,8 @@ thermal_zone_device_register_with_trips(const char *type,
> >
> >       tz->device.class = thermal_class;
> >       tz->devdata = devdata;
> > -     memcpy(tz->trips, trips, num_trips * sizeof(*trips));
> >       tz->num_trips = num_trips;
> > +     memcpy(tz->trips, trips, num_trips * sizeof(*trips));
>
> IIUC, clang-18 is used and supports __counted_by().
>
> Is it possible sizeof(*trips) returns already the real trips array size
> and we are multiplying it again by num_trips ?
>
> While with an older compiler, __counted_by() does nothing and we have to
> multiply by num_trips ?
>
> IOW, the array size arithmetic is different depending if we have
> _counted_by supported or not ?

IIUC it is just the instrumentation using the current value of
tz->num_trips (which is 0 before the initialization).
Daniel Lezcano Feb. 27, 2024, 3:37 p.m. UTC | #6
On 27/02/2024 12:09, Rafael J. Wysocki wrote:
> On Tue, Feb 27, 2024 at 11:14 AM Daniel Lezcano
> <daniel.lezcano@linaro.org> wrote:
>>
>> On 27/02/2024 01:54, Nathan Chancellor wrote:
>>> When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
>>> that supports __counted_by() (such as clang-18 and newer), there is a
>>> panic on boot:
>>>
>>>     [    2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
>>>     [    2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
>>>     ...
>>>     [    3.039208] Call trace:
>>>     [    3.041643]  __fortify_report+0x5c/0x74
>>>     [    3.045469]  __fortify_panic+0x18/0x20
>>>     [    3.049209]  thermal_zone_device_register_with_trips+0x4c8/0x4f8
>>>
>>> This panic occurs because trips is counted by num_trips but num_trips is
>>> assigned after the call to memcpy(), so the fortify checks think the
>>> buffer size is zero because tz was allocated with kzalloc().
>>>
>>> Move the num_trips assignment before the memcpy() to resolve the panic
>>> and ensure that the fortify checks work properly.
>>>
>>> Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device")
>>> Signed-off-by: Nathan Chancellor <nathan@kernel.org>
>>> ---
>>>    drivers/thermal/thermal_core.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
>>> index bb21f78b4bfa..1eabc8ebe27d 100644
>>> --- a/drivers/thermal/thermal_core.c
>>> +++ b/drivers/thermal/thermal_core.c
>>> @@ -1354,8 +1354,8 @@ thermal_zone_device_register_with_trips(const char *type,
>>>
>>>        tz->device.class = thermal_class;
>>>        tz->devdata = devdata;
>>> -     memcpy(tz->trips, trips, num_trips * sizeof(*trips));
>>>        tz->num_trips = num_trips;
>>> +     memcpy(tz->trips, trips, num_trips * sizeof(*trips));
>>
>> IIUC, clang-18 is used and supports __counted_by().
>>
>> Is it possible sizeof(*trips) returns already the real trips array size
>> and we are multiplying it again by num_trips ?
>>
>> While with an older compiler, __counted_by() does nothing and we have to
>> multiply by num_trips ?
>>
>> IOW, the array size arithmetic is different depending if we have
>> _counted_by supported or not ?
> 
> IIUC it is just the instrumentation using the current value of
> tz->num_trips (which is 0 before the initialization).

Right, but I am wondering if

	memcpy(tz->trips, trips, num_trips * sizeof(*trips));

	is still correct with __counted_by because:

  (1) if the compiler supports it:

	sizeof(*trips) == 24 bytes * num_trips

	then:

	memcpy(tz->trips, trips, num_trips * sizeof(*trips));

	memcpy(tz->trips, trips, num_trips * 24 * num_trips);

	==> memory size = 24 * num_trips^2

  (2) if the compiler does not support it:

	sizeof(*trips) == 24 bytes

	then:

	memcpy(tz->trips, trips, num_trips * sizeof(*trips));

	memcpy(tz->trips, trips, num_trips * 24);

	==> memory size = 24 * num_trips

Or did I misunderstand __counted_by ?
Kees Cook Feb. 27, 2024, 4:26 p.m. UTC | #7
On Tue, Feb 27, 2024 at 04:37:36PM +0100, Daniel Lezcano wrote:
> On 27/02/2024 12:09, Rafael J. Wysocki wrote:
> > On Tue, Feb 27, 2024 at 11:14 AM Daniel Lezcano
> > <daniel.lezcano@linaro.org> wrote:
> > > 
> > > On 27/02/2024 01:54, Nathan Chancellor wrote:
> > > > When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
> > > > that supports __counted_by() (such as clang-18 and newer), there is a
> > > > panic on boot:
> > > > 
> > > >     [    2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
> > > >     [    2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
> > > >     ...
> > > >     [    3.039208] Call trace:
> > > >     [    3.041643]  __fortify_report+0x5c/0x74
> > > >     [    3.045469]  __fortify_panic+0x18/0x20
> > > >     [    3.049209]  thermal_zone_device_register_with_trips+0x4c8/0x4f8
> > > > 
> > > > This panic occurs because trips is counted by num_trips but num_trips is
> > > > assigned after the call to memcpy(), so the fortify checks think the
> > > > buffer size is zero because tz was allocated with kzalloc().
> > > > 
> > > > Move the num_trips assignment before the memcpy() to resolve the panic
> > > > and ensure that the fortify checks work properly.
> > > > 
> > > > Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device")
> > > > Signed-off-by: Nathan Chancellor <nathan@kernel.org>
> > > > ---
> > > >    drivers/thermal/thermal_core.c | 2 +-
> > > >    1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> > > > index bb21f78b4bfa..1eabc8ebe27d 100644
> > > > --- a/drivers/thermal/thermal_core.c
> > > > +++ b/drivers/thermal/thermal_core.c
> > > > @@ -1354,8 +1354,8 @@ thermal_zone_device_register_with_trips(const char *type,
> > > > 
> > > >        tz->device.class = thermal_class;
> > > >        tz->devdata = devdata;
> > > > -     memcpy(tz->trips, trips, num_trips * sizeof(*trips));
> > > >        tz->num_trips = num_trips;
> > > > +     memcpy(tz->trips, trips, num_trips * sizeof(*trips));
> > > 
> > > IIUC, clang-18 is used and supports __counted_by().
> > > 
> > > Is it possible sizeof(*trips) returns already the real trips array size
> > > and we are multiplying it again by num_trips ?
> > > 
> > > While with an older compiler, __counted_by() does nothing and we have to
> > > multiply by num_trips ?
> > > 
> > > IOW, the array size arithmetic is different depending if we have
> > > _counted_by supported or not ?
> > 
> > IIUC it is just the instrumentation using the current value of
> > tz->num_trips (which is 0 before the initialization).
> 
> Right, but I am wondering if
> 
> 	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
> 
> 	is still correct with __counted_by because:
> 
>  (1) if the compiler supports it:
> 
> 	sizeof(*trips) == 24 bytes * num_trips

I think you're misunderstanding. The above sizeof() only evaluates a
single instance -- it has no idea how many more there may be.
Specifically:

	sizeof(*trips) == sizeof(struct thermal_trip)

> 	then:
> 
> 	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
> 
> 	memcpy(tz->trips, trips, num_trips * 24 * num_trips);
> 
> 	==> memory size = 24 * num_trips^2

It's not counted twice. Under CONFIG_FORTIFY_SOURCE=y, memcpy is a macro
that expands to a checking routine (see include/linux/fortify-string.h),
which is using __builtin_dynamic_object_size() to determine the
available size of the destination buffer (tz->trips). Specifically:

	__builtin_dynamic_object_size(tz->trips)

When __bdos evaluates a flexible array (i.e. tz->trips), it will see the
associated 'counted_by' attribute, and go look up the value of the
designated struct member (tz->num_trips). It then calculates:

	sizeof(*tz->trips) /* a single instance */
		*
	tz->num_trips

Before the patch, tz->num_trips is 0, so the destination buffer size
appears to be of size 0 bytes. After the patch, it contains the
same value as the "num_trips" function argument, so the destination
buffer appears to be the matching size of "num_trips * sizeof(struct
thermal_trip)".

Hopefully that helps! If not, I can try again. :)

-Kees
Nathan Chancellor Feb. 27, 2024, 4:26 p.m. UTC | #8
Hi Daniel and Lukasz,

On Tue, Feb 27, 2024 at 04:37:36PM +0100, Daniel Lezcano wrote:
> On 27/02/2024 12:09, Rafael J. Wysocki wrote:
> > On Tue, Feb 27, 2024 at 11:14 AM Daniel Lezcano
> > <daniel.lezcano@linaro.org> wrote:
> > > 
> > > On 27/02/2024 01:54, Nathan Chancellor wrote:
> > > > When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
> > > > that supports __counted_by() (such as clang-18 and newer), there is a
> > > > panic on boot:
> > > > 
> > > >     [    2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
> > > >     [    2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
> > > >     ...
> > > >     [    3.039208] Call trace:
> > > >     [    3.041643]  __fortify_report+0x5c/0x74
> > > >     [    3.045469]  __fortify_panic+0x18/0x20
> > > >     [    3.049209]  thermal_zone_device_register_with_trips+0x4c8/0x4f8
> > > > 
> > > > This panic occurs because trips is counted by num_trips but num_trips is
> > > > assigned after the call to memcpy(), so the fortify checks think the
> > > > buffer size is zero because tz was allocated with kzalloc().
> > > > 
> > > > Move the num_trips assignment before the memcpy() to resolve the panic
> > > > and ensure that the fortify checks work properly.
> > > > 
> > > > Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device")
> > > > Signed-off-by: Nathan Chancellor <nathan@kernel.org>
> > > > ---
> > > >    drivers/thermal/thermal_core.c | 2 +-
> > > >    1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> > > > index bb21f78b4bfa..1eabc8ebe27d 100644
> > > > --- a/drivers/thermal/thermal_core.c
> > > > +++ b/drivers/thermal/thermal_core.c
> > > > @@ -1354,8 +1354,8 @@ thermal_zone_device_register_with_trips(const char *type,
> > > > 
> > > >        tz->device.class = thermal_class;
> > > >        tz->devdata = devdata;
> > > > -     memcpy(tz->trips, trips, num_trips * sizeof(*trips));
> > > >        tz->num_trips = num_trips;
> > > > +     memcpy(tz->trips, trips, num_trips * sizeof(*trips));
> > > 
> > > IIUC, clang-18 is used and supports __counted_by().
> > > 
> > > Is it possible sizeof(*trips) returns already the real trips array size
> > > and we are multiplying it again by num_trips ?
> > > 
> > > While with an older compiler, __counted_by() does nothing and we have to
> > > multiply by num_trips ?
> > > 
> > > IOW, the array size arithmetic is different depending if we have
> > > _counted_by supported or not ?
> > 
> > IIUC it is just the instrumentation using the current value of
> > tz->num_trips (which is 0 before the initialization).
> 
> Right, but I am wondering if
> 
> 	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
> 
> 	is still correct with __counted_by because:
> 
>  (1) if the compiler supports it:
> 
> 	sizeof(*trips) == 24 bytes * num_trips

No, this is incorrect. sizeof(*trips) == sizeof(struct thermal_trip),
which is never affected by __counted_by. As far as I am aware,
sizeof() in general is never affected by __counted_by because calling
sizeof() on a flexible array member (which are the only things currently
allowed to have __counted_by) is invalid because a FAM is by definition
an unknown size at compile time.

> 	then:
> 
> 	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
> 
> 	memcpy(tz->trips, trips, num_trips * 24 * num_trips);
> 
> 	==> memory size = 24 * num_trips^2
> 
>  (2) if the compiler does not support it:
> 
> 	sizeof(*trips) == 24 bytes
> 
> 	then:
> 
> 	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
> 
> 	memcpy(tz->trips, trips, num_trips * 24);
> 
> 	==> memory size = 24 * num_trips
> 
> Or did I misunderstand __counted_by ?

I see there has been some confusion around this change from both
yourself and Lukasz, so I will try to make things as clear and concise
as I can.

The fortify routines in the kernel rely on __builtin_dynamic_object_size
to get the size of flexible array members (see the macro soup around
memcpy() in include/linux/fortify-string.h). Before __counted_by,
__builtin_dynamic_object_size() on a FAM would return -1. With
__counted_by, __builtin_dynamic_object_size() will be able to return the
correct size of the FAM by looking at the counted by member.

In this case, memcpy() was called with tz->num_trips (the counted by
member) == 0 (meaning that __bdos() will return 0) while trying to copy
'sizeof(*trips) * num_trips' bytes into it, triggering the fortify panic
routine. Moving the tz->num_trips assignment before the memcpy() call
ensures that __bdos() returns the correct amount for checking in the
fortify routines.

Hope that helps clear things up, I will make sure to write a clearer
commit message next time.

Cheers,
Nathan
Daniel Lezcano Feb. 27, 2024, 4:47 p.m. UTC | #9
On 27/02/2024 17:26, Kees Cook wrote:
> On Tue, Feb 27, 2024 at 04:37:36PM +0100, Daniel Lezcano wrote:
>> On 27/02/2024 12:09, Rafael J. Wysocki wrote:
>>> On Tue, Feb 27, 2024 at 11:14 AM Daniel Lezcano
>>> <daniel.lezcano@linaro.org> wrote:
>>>>
>>>> On 27/02/2024 01:54, Nathan Chancellor wrote:
>>>>> When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
>>>>> that supports __counted_by() (such as clang-18 and newer), there is a
>>>>> panic on boot:
>>>>>
>>>>>      [    2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
>>>>>      [    2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
>>>>>      ...
>>>>>      [    3.039208] Call trace:
>>>>>      [    3.041643]  __fortify_report+0x5c/0x74
>>>>>      [    3.045469]  __fortify_panic+0x18/0x20
>>>>>      [    3.049209]  thermal_zone_device_register_with_trips+0x4c8/0x4f8
>>>>>
>>>>> This panic occurs because trips is counted by num_trips but num_trips is
>>>>> assigned after the call to memcpy(), so the fortify checks think the
>>>>> buffer size is zero because tz was allocated with kzalloc().
>>>>>
>>>>> Move the num_trips assignment before the memcpy() to resolve the panic
>>>>> and ensure that the fortify checks work properly.
>>>>>
>>>>> Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device")
>>>>> Signed-off-by: Nathan Chancellor <nathan@kernel.org>
>>>>> ---
>>>>>     drivers/thermal/thermal_core.c | 2 +-
>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
>>>>> index bb21f78b4bfa..1eabc8ebe27d 100644
>>>>> --- a/drivers/thermal/thermal_core.c
>>>>> +++ b/drivers/thermal/thermal_core.c
>>>>> @@ -1354,8 +1354,8 @@ thermal_zone_device_register_with_trips(const char *type,
>>>>>
>>>>>         tz->device.class = thermal_class;
>>>>>         tz->devdata = devdata;
>>>>> -     memcpy(tz->trips, trips, num_trips * sizeof(*trips));
>>>>>         tz->num_trips = num_trips;
>>>>> +     memcpy(tz->trips, trips, num_trips * sizeof(*trips));
>>>>
>>>> IIUC, clang-18 is used and supports __counted_by().
>>>>
>>>> Is it possible sizeof(*trips) returns already the real trips array size
>>>> and we are multiplying it again by num_trips ?
>>>>
>>>> While with an older compiler, __counted_by() does nothing and we have to
>>>> multiply by num_trips ?
>>>>
>>>> IOW, the array size arithmetic is different depending if we have
>>>> _counted_by supported or not ?
>>>
>>> IIUC it is just the instrumentation using the current value of
>>> tz->num_trips (which is 0 before the initialization).
>>
>> Right, but I am wondering if
>>
>> 	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
>>
>> 	is still correct with __counted_by because:
>>
>>   (1) if the compiler supports it:
>>
>> 	sizeof(*trips) == 24 bytes * num_trips
> 
> I think you're misunderstanding. The above sizeof() only evaluates a
> single instance -- it has no idea how many more there may be.
> Specifically:
> 
> 	sizeof(*trips) == sizeof(struct thermal_trip)
> 
>> 	then:
>>
>> 	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
>>
>> 	memcpy(tz->trips, trips, num_trips * 24 * num_trips);
>>
>> 	==> memory size = 24 * num_trips^2
> 
> It's not counted twice. Under CONFIG_FORTIFY_SOURCE=y, memcpy is a macro
> that expands to a checking routine (see include/linux/fortify-string.h),
> which is using __builtin_dynamic_object_size() to determine the
> available size of the destination buffer (tz->trips). Specifically:
> 
> 	__builtin_dynamic_object_size(tz->trips)
> 
> When __bdos evaluates a flexible array (i.e. tz->trips), it will see the
> associated 'counted_by' attribute, and go look up the value of the
> designated struct member (tz->num_trips). It then calculates:
> 
> 	sizeof(*tz->trips) /* a single instance */
> 		*
> 	tz->num_trips

Ok my misunderstanding was I thought sizeof() was calling _bdos under 
the hood, so when calling sizeof(flex_array), it was returning the 
computed size inferring from the __counted_by field.


> Before the patch, tz->num_trips is 0, so the destination buffer size
> appears to be of size 0 bytes. After the patch, it contains the
> same value as the "num_trips" function argument, so the destination
> buffer appears to be the matching size of "num_trips * sizeof(struct
> thermal_trip)".
> 
> Hopefully that helps! If not, I can try again. :)

It is ok thanks for the clarification
Kees Cook Feb. 27, 2024, 5 p.m. UTC | #10
On Tue, Feb 27, 2024 at 05:47:44PM +0100, Daniel Lezcano wrote:
> Ok my misunderstanding was I thought sizeof() was calling _bdos under the
> hood, so when calling sizeof(flex_array), it was returning the computed size
> inferring from the __counted_by field.

Yeah, sizeof() has a very limited scope. __builtin_object_size() has
more flexibility (via the 2nd argument, "type"), but it was still
compile-time only. __builtin_dynamic_object_size() was added to bring
runtime evaluations into the mix (initially to support the alloc_size
attribute, and now includes the counted_by attribute too).
Lukasz Luba Feb. 28, 2024, 8:41 a.m. UTC | #11
Hi Nathan and Kees,

On 2/27/24 17:00, Kees Cook wrote:
> On Tue, Feb 27, 2024 at 05:47:44PM +0100, Daniel Lezcano wrote:
>> Ok my misunderstanding was I thought sizeof() was calling _bdos under the
>> hood, so when calling sizeof(flex_array), it was returning the computed size
>> inferring from the __counted_by field.
> 
> Yeah, sizeof() has a very limited scope. __builtin_object_size() has
> more flexibility (via the 2nd argument, "type"), but it was still
> compile-time only. __builtin_dynamic_object_size() was added to bring
> runtime evaluations into the mix (initially to support the alloc_size
> attribute, and now includes the counted_by attribute too).
> 

Thanks for your earlier emails explaining these stuff.
Do you have maybe some presentation about those features
for the kernel (ideally w/ a video from some conference)?

Regards,
Lukasz
Nathan Chancellor Feb. 28, 2024, 4:56 p.m. UTC | #12
On Wed, Feb 28, 2024 at 08:41:07AM +0000, Lukasz Luba wrote:
> Hi Nathan and Kees,
> 
> On 2/27/24 17:00, Kees Cook wrote:
> > On Tue, Feb 27, 2024 at 05:47:44PM +0100, Daniel Lezcano wrote:
> > > Ok my misunderstanding was I thought sizeof() was calling _bdos under the
> > > hood, so when calling sizeof(flex_array), it was returning the computed size
> > > inferring from the __counted_by field.
> > 
> > Yeah, sizeof() has a very limited scope. __builtin_object_size() has
> > more flexibility (via the 2nd argument, "type"), but it was still
> > compile-time only. __builtin_dynamic_object_size() was added to bring
> > runtime evaluations into the mix (initially to support the alloc_size
> > attribute, and now includes the counted_by attribute too).
> > 
> 
> Thanks for your earlier emails explaining these stuff.
> Do you have maybe some presentation about those features
> for the kernel (ideally w/ a video from some conference)?

I think Kees's 2022 and 2023 talks at LPC are a good place to start:

https://youtu.be/tQwv79i02ks?si=Nj9hpvmQwPB4K3Y4&t=452
https://youtu.be/OEFFqhP5sts?si=u6RnOP641S8FkouD&t=614

https://outflux.net/slides/2022/lpc/features.pdf
https://outflux.net/slides/2023/lpc/features.pdf

Cheers,
Nathan
Kees Cook Feb. 28, 2024, 5:48 p.m. UTC | #13
On Wed, Feb 28, 2024 at 09:56:51AM -0700, Nathan Chancellor wrote:
> On Wed, Feb 28, 2024 at 08:41:07AM +0000, Lukasz Luba wrote:
> > Hi Nathan and Kees,
> > 
> > On 2/27/24 17:00, Kees Cook wrote:
> > > On Tue, Feb 27, 2024 at 05:47:44PM +0100, Daniel Lezcano wrote:
> > > > Ok my misunderstanding was I thought sizeof() was calling _bdos under the
> > > > hood, so when calling sizeof(flex_array), it was returning the computed size
> > > > inferring from the __counted_by field.
> > > 
> > > Yeah, sizeof() has a very limited scope. __builtin_object_size() has
> > > more flexibility (via the 2nd argument, "type"), but it was still
> > > compile-time only. __builtin_dynamic_object_size() was added to bring
> > > runtime evaluations into the mix (initially to support the alloc_size
> > > attribute, and now includes the counted_by attribute too).
> > > 
> > 
> > Thanks for your earlier emails explaining these stuff.
> > Do you have maybe some presentation about those features
> > for the kernel (ideally w/ a video from some conference)?
> 
> I think Kees's 2022 and 2023 talks at LPC are a good place to start:
> 
> https://youtu.be/tQwv79i02ks?si=Nj9hpvmQwPB4K3Y4&t=452
> https://youtu.be/OEFFqhP5sts?si=u6RnOP641S8FkouD&t=614
> 
> https://outflux.net/slides/2022/lpc/features.pdf
> https://outflux.net/slides/2023/lpc/features.pdf

I've also got a write-up on the entire topic of array bounds, which ends
with some discussion of "the future" (which is now) involving the use of
the "counted_by" attribute:
https://people.kernel.org/kees/bounded-flexible-arrays-in-c#coming-soon-annotate-bounds-of-flexible-arrays
Lukasz Luba Feb. 29, 2024, 7:42 a.m. UTC | #14
On 2/28/24 17:48, Kees Cook wrote:
> On Wed, Feb 28, 2024 at 09:56:51AM -0700, Nathan Chancellor wrote:
>> On Wed, Feb 28, 2024 at 08:41:07AM +0000, Lukasz Luba wrote:
>>> Hi Nathan and Kees,
>>>
>>> On 2/27/24 17:00, Kees Cook wrote:
>>>> On Tue, Feb 27, 2024 at 05:47:44PM +0100, Daniel Lezcano wrote:
>>>>> Ok my misunderstanding was I thought sizeof() was calling _bdos under the
>>>>> hood, so when calling sizeof(flex_array), it was returning the computed size
>>>>> inferring from the __counted_by field.
>>>>
>>>> Yeah, sizeof() has a very limited scope. __builtin_object_size() has
>>>> more flexibility (via the 2nd argument, "type"), but it was still
>>>> compile-time only. __builtin_dynamic_object_size() was added to bring
>>>> runtime evaluations into the mix (initially to support the alloc_size
>>>> attribute, and now includes the counted_by attribute too).
>>>>
>>>
>>> Thanks for your earlier emails explaining these stuff.
>>> Do you have maybe some presentation about those features
>>> for the kernel (ideally w/ a video from some conference)?
>>
>> I think Kees's 2022 and 2023 talks at LPC are a good place to start:
>>
>> https://youtu.be/tQwv79i02ks?si=Nj9hpvmQwPB4K3Y4&t=452
>> https://youtu.be/OEFFqhP5sts?si=u6RnOP641S8FkouD&t=614
>>
>> https://outflux.net/slides/2022/lpc/features.pdf
>> https://outflux.net/slides/2023/lpc/features.pdf
> 
> I've also got a write-up on the entire topic of array bounds, which ends
> with some discussion of "the future" (which is now) involving the use of
> the "counted_by" attribute:
> https://people.kernel.org/kees/bounded-flexible-arrays-in-c#coming-soon-annotate-bounds-of-flexible-arrays
> 

Thank you guys!
diff mbox series

Patch

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index bb21f78b4bfa..1eabc8ebe27d 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1354,8 +1354,8 @@  thermal_zone_device_register_with_trips(const char *type,
 
 	tz->device.class = thermal_class;
 	tz->devdata = devdata;
-	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
 	tz->num_trips = num_trips;
+	memcpy(tz->trips, trips, num_trips * sizeof(*trips));
 
 	thermal_set_delay_jiffies(&tz->passive_delay_jiffies, passive_delay);
 	thermal_set_delay_jiffies(&tz->polling_delay_jiffies, polling_delay);