mbox series

[RFC,0/4] slab: Allow for type introspection during allocation

Message ID 20240708190924.work.846-kees@kernel.org (mailing list archive)
Headers show
Series slab: Allow for type introspection during allocation | expand

Message

Kees Cook July 8, 2024, 7:18 p.m. UTC
Hi,

This is an RFC for some changes I'd like to make to the kernel's
allocators (starting with slab) that allow for type introspection, which
has been a long-time gap in potential analysis capabilities available
at compile-time. The changes here are just a "first step" example that
updates kmalloc() and kzalloc() to show what I'm thinking we can do,
and shows an example conversion within the fs/pstore tree.

Repeating patch 3's commit log here:

    There is currently no way for the slab to know what type is being
    allocated, and this hampers the development of any logic that would need
    this information including basic type checking, alignment need analysis,
    etc.
    
    Allow the size argument to optionally be a variable, from which the
    type (and there by the size, alignment, or any other features) can be
    determined at compile-time. This allows for the incremental replacement
    of the classic code pattern:
    
            obj = kmalloc(sizeof(*obj), gfp);
    
    into:
    
            obj = kmalloc(obj, gfp);
    
    As an additional build-time safety feature, the return value of kmalloc()
    also becomes typed so that the assignment and first argument cannot drift,
    doing away with the other, more fragile, classic code pattern:
    
            obj = kmalloc(sizeof(struct the_object), gfp);
    
    into:
    
            obj = kmalloc(obj, gfp);
    
    And any accidental variable drift will not be masked by the traditional
    default "void *" return value:
    
            obj = kmalloc(something_else, gfp);
    
    error: assignment to 'struct the_object *' from incompatible pointer type 'struct foo *' [-Wincompatible-pointer-types]
       71 |     obj = kmalloc(something_else, gfp);
          |         ^
    
    This also opens the door for a proposed heap hardening feature that
    would randomize the starting offset of the allocated object within
    its power-of-2 bucket. Without being able to introspect the type for
    alignment needs, this can't be done safely (or cannot be done without
    significant memory usage overhead). For example, a 132 byte structure
    with an 8 byte alignment could be randomized into 15 locations within
    the 256 byte bucket: (256 - 132) / 8.


Thanks!

-Kees

Kees Cook (4):
  compiler_types: Add integral/pointer type helper macros
  slab: Detect negative size values and saturate
  slab: Allow for type introspection during allocation
  pstore: Replace classic kmalloc code pattern with typed argument

 fs/pstore/blk.c                |  2 +-
 fs/pstore/platform.c           |  2 +-
 fs/pstore/ram.c                |  3 +--
 fs/pstore/ram_core.c           |  2 +-
 fs/pstore/zone.c               |  2 +-
 include/linux/compiler_types.h | 23 +++++++++++++++++++++++
 include/linux/slab.h           | 32 +++++++++++++++++++++++++-------
 7 files changed, 53 insertions(+), 13 deletions(-)

Comments

Roman Gushchin July 9, 2024, 4:57 p.m. UTC | #1
On Mon, Jul 08, 2024 at 12:18:34PM -0700, Kees Cook wrote:
> Hi,
> 
> This is an RFC for some changes I'd like to make to the kernel's
> allocators (starting with slab) that allow for type introspection, which
> has been a long-time gap in potential analysis capabilities available
> at compile-time. The changes here are just a "first step" example that
> updates kmalloc() and kzalloc() to show what I'm thinking we can do,
> and shows an example conversion within the fs/pstore tree.
> 
> Repeating patch 3's commit log here:
> 
>     There is currently no way for the slab to know what type is being
>     allocated, and this hampers the development of any logic that would need
>     this information including basic type checking, alignment need analysis,
>     etc.
>     
>     Allow the size argument to optionally be a variable, from which the
>     type (and there by the size, alignment, or any other features) can be
>     determined at compile-time. This allows for the incremental replacement
>     of the classic code pattern:
>     
>             obj = kmalloc(sizeof(*obj), gfp);
>     
>     into:
>     
>             obj = kmalloc(obj, gfp);
>     
>     As an additional build-time safety feature, the return value of kmalloc()
>     also becomes typed so that the assignment and first argument cannot drift,
>     doing away with the other, more fragile, classic code pattern:
>     
>             obj = kmalloc(sizeof(struct the_object), gfp);
>     
>     into:
>     
>             obj = kmalloc(obj, gfp);

I like the idea, however it's not as simple and straightforward because
it's common for structures to have a variable part (usually at the end)
and also allocate more than one structure at once.

There are many allocations which look like
	kmalloc(sizeof(my_struct) * 2 + SOME_MAGIC_LENGTH, GFP_...)
or something like this, which you can't easily convert to your scheme.

The only option I see is to introduce the new set of functions/macros,
something like kmalloc_obj() or kmalloc_struct(). Or maybe tmalloc()?
(t for typed)

Thanks!
Christoph Lameter (Ampere) July 9, 2024, 5:26 p.m. UTC | #2
On Mon, 8 Jul 2024, Kees Cook wrote:

>
>            obj = kmalloc(obj, gfp);

Could we avoid repeating "obj" in this pattern?

F.e.

 	KMALLOC(obj, gfp);

instead?
Kees Cook July 9, 2024, 6:57 p.m. UTC | #3
On Tue, Jul 09, 2024 at 04:57:38PM +0000, Roman Gushchin wrote:
> On Mon, Jul 08, 2024 at 12:18:34PM -0700, Kees Cook wrote:
> > Hi,
> > 
> > This is an RFC for some changes I'd like to make to the kernel's
> > allocators (starting with slab) that allow for type introspection, which
> > has been a long-time gap in potential analysis capabilities available
> > at compile-time. The changes here are just a "first step" example that
> > updates kmalloc() and kzalloc() to show what I'm thinking we can do,
> > and shows an example conversion within the fs/pstore tree.
> > 
> > Repeating patch 3's commit log here:
> > 
> >     There is currently no way for the slab to know what type is being
> >     allocated, and this hampers the development of any logic that would need
> >     this information including basic type checking, alignment need analysis,
> >     etc.
> >     
> >     Allow the size argument to optionally be a variable, from which the
> >     type (and there by the size, alignment, or any other features) can be
> >     determined at compile-time. This allows for the incremental replacement
> >     of the classic code pattern:
> >     
> >             obj = kmalloc(sizeof(*obj), gfp);
> >     
> >     into:
> >     
> >             obj = kmalloc(obj, gfp);
> >     
> >     As an additional build-time safety feature, the return value of kmalloc()
> >     also becomes typed so that the assignment and first argument cannot drift,
> >     doing away with the other, more fragile, classic code pattern:
> >     
> >             obj = kmalloc(sizeof(struct the_object), gfp);
> >     
> >     into:
> >     
> >             obj = kmalloc(obj, gfp);
> 
> I like the idea, however it's not as simple and straightforward because
> it's common for structures to have a variable part (usually at the end)
> and also allocate more than one structure at once.
> 
> There are many allocations which look like
> 	kmalloc(sizeof(my_struct) * 2 + SOME_MAGIC_LENGTH, GFP_...)
> or something like this, which you can't easily convert to your scheme.

Right -- and with this we can leave those as-is initially (since a size
argument will still work).

> The only option I see is to introduce the new set of functions/macros,
> something like kmalloc_obj() or kmalloc_struct(). Or maybe tmalloc()?
> (t for typed)

Yeah, in a neighboring thread I was talking about a kmalloc_obj that
would handle fixed-sized structs, flexible array structs, and arrays. I
need to prove out the array part, but the first two should be trivial to
implement.
Kees Cook July 9, 2024, 8:28 p.m. UTC | #4
On Tue, Jul 09, 2024 at 10:26:32AM -0700, Christoph Lameter (Ampere) wrote:
> On Mon, 8 Jul 2024, Kees Cook wrote:
> 
> > 
> >            obj = kmalloc(obj, gfp);
> 
> Could we avoid repeating "obj" in this pattern?
> 
> F.e.
> 
> 	KMALLOC(obj, gfp);

This appears to be the common feedback, which is good! :) And we can
still have it return "obj" as well, so it could still be used in
"return" statements, etc. I will work up a new RFC...
Marco Elver July 9, 2024, 9:02 p.m. UTC | #5
On Tue, 9 Jul 2024 at 22:28, Kees Cook <kees@kernel.org> wrote:
>
> On Tue, Jul 09, 2024 at 10:26:32AM -0700, Christoph Lameter (Ampere) wrote:
> > On Mon, 8 Jul 2024, Kees Cook wrote:
> >
> > >
> > >            obj = kmalloc(obj, gfp);
> >
> > Could we avoid repeating "obj" in this pattern?
> >
> > F.e.
> >
> >       KMALLOC(obj, gfp);
>
> This appears to be the common feedback, which is good! :) And we can
> still have it return "obj" as well, so it could still be used in
> "return" statements, etc. I will work up a new RFC...

More macros like this only obfuscate the code further. The name would
become something that makes it really clear there's an assignment.

  assign_kmalloc(obj, gfp)

There may be better options. Also ALLCAPS could be avoided here, as we
have done with other language-like features (vs. pure constants).
Kees Cook July 9, 2024, 11:28 p.m. UTC | #6
On Tue, Jul 09, 2024 at 11:02:55PM +0200, Marco Elver wrote:
> On Tue, 9 Jul 2024 at 22:28, Kees Cook <kees@kernel.org> wrote:
> >
> > On Tue, Jul 09, 2024 at 10:26:32AM -0700, Christoph Lameter (Ampere) wrote:
> > > On Mon, 8 Jul 2024, Kees Cook wrote:
> > >
> > > >
> > > >            obj = kmalloc(obj, gfp);
> > >
> > > Could we avoid repeating "obj" in this pattern?
> > >
> > > F.e.
> > >
> > >       KMALLOC(obj, gfp);
> >
> > This appears to be the common feedback, which is good! :) And we can
> > still have it return "obj" as well, so it could still be used in
> > "return" statements, etc. I will work up a new RFC...
> 
> More macros like this only obfuscate the code further. The name would
> become something that makes it really clear there's an assignment.
> 
>   assign_kmalloc(obj, gfp)
> 
> There may be better options. Also ALLCAPS could be avoided here, as we
> have done with other language-like features (vs. pure constants).

So, in looking a code patterns, it seems what we really want more than
returning the object that was allocated is actually returning the size
of the allocation size requested. i.e.:

	info->size = struct_size(ptr, flex_member, count);
	info->obj = kmalloc(info->size, gfp);

would become:

	info->size = kmalloc(info->obj, flex_member, count, gfp);

-Kees
Przemek Kitszel July 10, 2024, 4:42 a.m. UTC | #7
On 7/10/24 01:28, Kees Cook wrote:
> On Tue, Jul 09, 2024 at 11:02:55PM +0200, Marco Elver wrote:
>> On Tue, 9 Jul 2024 at 22:28, Kees Cook <kees@kernel.org> wrote:
>>>
>>> On Tue, Jul 09, 2024 at 10:26:32AM -0700, Christoph Lameter (Ampere) wrote:
>>>> On Mon, 8 Jul 2024, Kees Cook wrote:
>>>>
>>>>>
>>>>>             obj = kmalloc(obj, gfp);
>>>>
>>>> Could we avoid repeating "obj" in this pattern?
>>>>
>>>> F.e.
>>>>
>>>>        KMALLOC(obj, gfp);
>>>
>>> This appears to be the common feedback, which is good! :) And we can
>>> still have it return "obj" as well, so it could still be used in
>>> "return" statements, etc. I will work up a new RFC...
>>
>> More macros like this only obfuscate the code further. The name would
>> become something that makes it really clear there's an assignment.
>>
>>    assign_kmalloc(obj, gfp)
>>
>> There may be better options. Also ALLCAPS could be avoided here, as we
>> have done with other language-like features (vs. pure constants).
> 
> So, in looking a code patterns, it seems what we really want more than
> returning the object that was allocated is actually returning the size
> of the allocation size requested. i.e.:
> 
> 	info->size = struct_size(ptr, flex_member, count);
> 	info->obj = kmalloc(info->size, gfp);
> 
> would become:
> 
> 	info->size = kmalloc(info->obj, flex_member, count, gfp);
> 
> -Kees
> 

that will work out also for the (IMO) most common case of checking if
the allocation succeeded:
	if (!kmalloc(my_foo, flex_part, count, gfp))
		return -ENOMEM;