[v2] overflow: optimize struct_size() calculation

Message ID	20240910024952.1590207-1-mailhol.vincent@wanadoo.fr (mailing list archive)
State	New, archived
Headers	show Received: from mail-pf1-f174.google.com (mail-pf1-f174.google.com [209.85.210.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ED65114D431; Tue, 10 Sep 2024 02:50:01 +0000 (UTC) Sender: Vincent Mailhol <vincent.mailhol@gmail.com> From: Vincent Mailhol <mailhol.vincent@wanadoo.fr> To: Kees Cook <kees@kernel.org>, David Laight <David.Laight@ACULAB.COM> Cc: "Gustavo A . R . Silva" <gustavoars@kernel.org>, linux-hardening@vger.kernel.org, linux-kernel@vger.kernel.org, Vincent Mailhol <mailhol.vincent@wanadoo.fr> Subject: [PATCH v2] overflow: optimize struct_size() calculation Date: Tue, 10 Sep 2024 11:49:52 +0900 Message-Id: <20240910024952.1590207-1-mailhol.vincent@wanadoo.fr> Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit
Series	[v2] overflow: optimize struct_size() calculation \| expand [v2] overflow: optimize struct_size() calculation

Vincent Mailhol Sept. 10, 2024, 2:49 a.m. UTC

If the offsetof() of a given flexible array member (fam) is smaller
than the sizeof() of the containing struct, then the struct_size()
macro reports a size which is too big.

This occurs when the two conditions below are met:

  - there are padding bytes after the penultimate member (the member
    preceding the fam)
  - the alignment of the fam is less than or equal to the penultimate
    member's alignment

In that case, the fam overlaps with the padding bytes of the
penultimate member. This behaviour is not captured in the current
struct_size() macro, potentially resulting in an overestimated size.

Below example illustrates the issue:

  struct s {
  	u64 foo;
  	u32 count;
  	u8 fam[] __counted_by(count);
  };

Assuming a 64 bits architecture:

  - there are 4 bytes of padding after s.count (the penultimate
    member)
  - sizeof(struct s) is 16 bytes
  - the offset of s.fam is 12 bytes
  - the alignment of s.fam is 1 byte

The sizes are as below:

   s.count	current struct_size()	actual size
  -----------------------------------------------------------------------
   0		16			16
   1		17			16
   2		18			16
   3		19			16
   4		20			16
   5		21			17
   .		.			.
   .		.			.
   .		.			.
   n		sizeof(struct s) + n	max(sizeof(struct s),
					    offsetof(struct s, fam) + n)

Change struct_size() from this pseudo code logic:

  sizeof(struct s) + sizeof(*s.fam) * s.count

to that pseudo code logic:

  max(sizeof(struct s), offsetof(struct s, fam) + sizeof(*s.fam) * s.count)

Here, the lowercase max*() macros can cause struct_size() to return a
non constant integer expression which would break the DEFINE_FLEX()
macro by making it declare a variable length array. Because of that,
use the unsafe MAX() macro only if the expression is constant and use
the safer max() otherwise.

Reference: ISO/IEC 9899:2018 §6.7.2.1 "Structure and union specifiers" ¶18

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
---

I also tried to think of whether the current struct_size() macro could
be a security issue.

The only example I can think of is if someone manually allocates the
exact size but then use the current struct_size() macro.

For example (reusing the struct s from above):

  u32 count = 5;
  struct s *s = kalloc(offsetof(typeof(*s), fam) + count);
  s->count = count;
  memset(s, 0, struct_size(s, fam, count)); /* 4 bytes buffer overflow */

If we have concerns that above pattern may actually exist, then this
patch should also go to stable. I personally think that the above is a
bit convoluted and, as such, I only suggest this patch to go to next.


Changelog:

  * v1 -> v2:

    - replace max_t() with max()
    - change description:

        the alignment of the fam is less than the penultimate member's
        alignment

      into:

        the alignment of the fam is less than or equal to the
        penultimate member's alignment

    Link: https://lore.kernel.org/linux-hardening/20240909115221.1298010-1-mailhol.vincent@wanadoo.fr/
---
 include/linux/overflow.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Kees Cook Sept. 11, 2024, 12:36 a.m. UTC | #1

On Tue, Sep 10, 2024 at 11:49:52AM +0900, Vincent Mailhol wrote:
> If the offsetof() of a given flexible array member (fam) is smaller
> than the sizeof() of the containing struct, then the struct_size()
> macro reports a size which is too big.
> 
> This occurs when the two conditions below are met:
> 
>   - there are padding bytes after the penultimate member (the member
>     preceding the fam)
>   - the alignment of the fam is less than or equal to the penultimate
>     member's alignment
> 
> In that case, the fam overlaps with the padding bytes of the
> penultimate member. This behaviour is not captured in the current
> struct_size() macro, potentially resulting in an overestimated size.
> 
> Below example illustrates the issue:
> 
>   struct s {
>   	u64 foo;
>   	u32 count;
>   	u8 fam[] __counted_by(count);
>   };
> 
> Assuming a 64 bits architecture:
> 
>   - there are 4 bytes of padding after s.count (the penultimate
>     member)
>   - sizeof(struct s) is 16 bytes
>   - the offset of s.fam is 12 bytes
>   - the alignment of s.fam is 1 byte
> 
> The sizes are as below:
> 
>    s.count	current struct_size()	actual size
>   -----------------------------------------------------------------------
>    0		16			16
>    1		17			16
>    2		18			16
>    3		19			16
>    4		20			16
>    5		21			17
>    .		.			.
>    .		.			.
>    .		.			.
>    n		sizeof(struct s) + n	max(sizeof(struct s),
> 					    offsetof(struct s, fam) + n)
> 
> Change struct_size() from this pseudo code logic:
> 
>   sizeof(struct s) + sizeof(*s.fam) * s.count
> 
> to that pseudo code logic:
> 
>   max(sizeof(struct s), offsetof(struct s, fam) + sizeof(*s.fam) * s.count)
> 
> Here, the lowercase max*() macros can cause struct_size() to return a
> non constant integer expression which would break the DEFINE_FLEX()
> macro by making it declare a variable length array. Because of that,
> use the unsafe MAX() macro only if the expression is constant and use
> the safer max() otherwise.
> 
> Reference: ISO/IEC 9899:2018 §6.7.2.1 "Structure and union specifiers" ¶18
> 
> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
> ---
> 
> I also tried to think of whether the current struct_size() macro could
> be a security issue.
> 
> The only example I can think of is if someone manually allocates the
> exact size but then use the current struct_size() macro.
> 
> For example (reusing the struct s from above):
> 
>   u32 count = 5;
>   struct s *s = kalloc(offsetof(typeof(*s), fam) + count);
>   s->count = count;
>   memset(s, 0, struct_size(s, fam, count)); /* 4 bytes buffer overflow */
> 
> If we have concerns that above pattern may actually exist, then this
> patch should also go to stable. I personally think that the above is a
> bit convoluted and, as such, I only suggest this patch to go to next.

Yeah, this "over-estimation" has come up before, and my thinking as been
that while the above situation is certainly possible (but unlikely),
I've worried that the reverse situation (after this patch) is
potentially worse, where we allocate very precisely, but then manually
index too far:

	struct s *s = kmalloc(struct_size(s, fam, count), gfp);
	typeof(*s->fam) *element;
	element = (void *)s + sizeof(*s);
	for (i = 0; i < count; i++)
		element[i] = ...;

And for a max 7 byte savings, I'm concerned we can get bit much worse in
the above situation. It *should* be unlikely, but I've especially seen a
lot of manually calculated games especially for structs that have
effectively multiple trailing flexible arrays (wifi likes to do this,
for example).

So while I don't have very concrete evidence, my sensation is that we're
in a more defensive position leaving it over-estimated.

Vincent Mailhol Sept. 11, 2024, 4:45 a.m. UTC | #2

On Wed. 11 Sep. 2024 at 09:36, Kees Cook <kees@kernel.org> wrote:
> On Tue, Sep 10, 2024 at 11:49:52AM +0900, Vincent Mailhol wrote:
> > If the offsetof() of a given flexible array member (fam) is smaller
> > than the sizeof() of the containing struct, then the struct_size()
> > macro reports a size which is too big.
> >
> > This occurs when the two conditions below are met:
> >
> >   - there are padding bytes after the penultimate member (the member
> >     preceding the fam)
> >   - the alignment of the fam is less than or equal to the penultimate
> >     member's alignment
> >
> > In that case, the fam overlaps with the padding bytes of the
> > penultimate member. This behaviour is not captured in the current
> > struct_size() macro, potentially resulting in an overestimated size.
> >
> > Below example illustrates the issue:
> >
> >   struct s {
> >       u64 foo;
> >       u32 count;
> >       u8 fam[] __counted_by(count);
> >   };
> >
> > Assuming a 64 bits architecture:
> >
> >   - there are 4 bytes of padding after s.count (the penultimate
> >     member)
> >   - sizeof(struct s) is 16 bytes
> >   - the offset of s.fam is 12 bytes
> >   - the alignment of s.fam is 1 byte
> >
> > The sizes are as below:
> >
> >    s.count    current struct_size()   actual size
> >   -----------------------------------------------------------------------
> >    0          16                      16
> >    1          17                      16
> >    2          18                      16
> >    3          19                      16
> >    4          20                      16
> >    5          21                      17
> >    .          .                       .
> >    .          .                       .
> >    .          .                       .
> >    n          sizeof(struct s) + n    max(sizeof(struct s),
> >                                           offsetof(struct s, fam) + n)
> >
> > Change struct_size() from this pseudo code logic:
> >
> >   sizeof(struct s) + sizeof(*s.fam) * s.count
> >
> > to that pseudo code logic:
> >
> >   max(sizeof(struct s), offsetof(struct s, fam) + sizeof(*s.fam) * s.count)
> >
> > Here, the lowercase max*() macros can cause struct_size() to return a
> > non constant integer expression which would break the DEFINE_FLEX()
> > macro by making it declare a variable length array. Because of that,
> > use the unsafe MAX() macro only if the expression is constant and use
> > the safer max() otherwise.
> >
> > Reference: ISO/IEC 9899:2018 §6.7.2.1 "Structure and union specifiers" ¶18
> >
> > Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
> > ---
> >
> > I also tried to think of whether the current struct_size() macro could
> > be a security issue.
> >
> > The only example I can think of is if someone manually allocates the
> > exact size but then use the current struct_size() macro.
> >
> > For example (reusing the struct s from above):
> >
> >   u32 count = 5;
> >   struct s *s = kalloc(offsetof(typeof(*s), fam) + count);
> >   s->count = count;
> >   memset(s, 0, struct_size(s, fam, count)); /* 4 bytes buffer overflow */
> >
> > If we have concerns that above pattern may actually exist, then this
> > patch should also go to stable. I personally think that the above is a
> > bit convoluted and, as such, I only suggest this patch to go to next.
>
> Yeah, this "over-estimation" has come up before, and my thinking as been
> that while the above situation is certainly possible (but unlikely),
> I've worried that the reverse situation (after this patch) is
> potentially worse, where we allocate very precisely, but then manually
> index too far:
>
>         struct s *s = kmalloc(struct_size(s, fam, count), gfp);
>         typeof(*s->fam) *element;
>         element = (void *)s + sizeof(*s);
>         for (i = 0; i < count; i++)
>                 element[i] = ...;

To me, this pointer arithmetic is just a bug. Anyone in his right mind
would write:

        typeof(*s->fam) *element = s->fam;

When I compare your example to my example, I think that both are
convoluted and unrealistic but with the *huge* difference that:

 - in my example, the code logic is correct and the thing breaks
   because of the current overestimation in struct_size(). See the
   clang documentation:

      https://clang.llvm.org/docs/AttributeReference.html#counted-by

    The clang guys are using the same calculation as in my patch. So I
    can think of a world in which someone with good intent would copy
    this example, and then (for example because of a collaborative
    work), someone else would use the struct_size() to do a copy or a
    memset resulting in an out of bound as illustrated  in my example.

 - in your example, the code is incorrect. If we start to apply that
   "what if users do random pointer arithmetic" reasoning, then
   anything can be proven to be a security risk.

To repeat, I think that my example was convoluted, but I see yours as
even more convoluted.

So, with this said, what do you think is the good trade-off? Return
the exact size so that code with correct logic works as intended or
keep the extra bytes in case someone does some crazy pointer
arithmetic?

> And for a max 7 byte savings, I'm concerned we can get bit much worse in
> the above situation. It *should* be unlikely, but I've especially seen a
> lot of manually calculated games especially for structs that have
> effectively multiple trailing flexible arrays (wifi likes to do this,
> for example).

How does a struct with "multiple trailing flexible arrays" look like? By
the C standard, structures can only have one single fam, so this isn't
something I am aware of.

Would the struct_size() macro still be relevant for those "multiple
fam" structures or would the structure size also be hand calculated?
Isn't this argument out of scope of this patch discussion?

> So while I don't have very concrete evidence, my sensation is that we're
> in a more defensive position leaving it over-estimated.

So far, I think that my example is more concrete than your pointer
arithmetic argument. Not by a lot, and this is why I just put that
security argument at the end after the --- cutter.


Yours sincerely,
Vincent Mailhol

David Laight Sept. 11, 2024, 10:22 a.m. UTC | #3

From: Vincent Mailhol
> Sent: 10 September 2024 03:50
> 
> If the offsetof() of a given flexible array member (fam) is smaller
> than the sizeof() of the containing struct, then the struct_size()
> macro reports a size which is too big.
> 
> This occurs when the two conditions below are met:
> 
>   - there are padding bytes after the penultimate member (the member
>     preceding the fam)
>   - the alignment of the fam is less than or equal to the penultimate
>     member's alignment
> 
> In that case, the fam overlaps with the padding bytes of the
> penultimate member. This behaviour is not captured in the current
> struct_size() macro, potentially resulting in an overestimated size.
...
> Change struct_size() from this pseudo code logic:
> 
>   sizeof(struct s) + sizeof(*s.fam) * s.count
> 
> to that pseudo code logic:
> 
>   max(sizeof(struct s), offsetof(struct s, fam) + sizeof(*s.fam) * s.count)

You are adding a third comparison [1] to every call - even though most don't
need it and the memory saving is marginal at best.
The total code size increase could easily exceed any savings.

With care you only do the max() when sizeof(struct) != offsetof() but I doubt
the complexity is worth it.

[1] Both the '+' and '*' have extra code to detect overflow and return
  a 'big' value that will cause kmalloc() to return NULL.
I've not looked at the generated code but it is likely to be horrid
  (especially the check for multiply overflowing).
In this case there are enough constants that the alternative check:
	if (count > (MAX_SIZE - sizeof (*s)) / sizeof (s->member))
		size = MAX_SIZE;
	else
		size = sizeof (*s) + count * sizeof (s->member);
is fine and only has one conditional in it.
In some cases the compiler may already know that the count is small enough.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

David Laight Sept. 11, 2024, 4:46 p.m. UTC | #4

...
> > [1] Both the '+' and '*' have extra code to detect overflow and return
> >   a 'big' value that will cause kmalloc() to return NULL.
> > I've not looked at the generated code but it is likely to be horrid
> >   (especially the check for multiply overflowing).
> > In this case there are enough constants that the alternative check:
> >         if (count > (MAX_SIZE - sizeof (*s)) / sizeof (s->member))
> >                 size = MAX_SIZE;
> >         else
> >                 size = sizeof (*s) + count * sizeof (s->member);
> > is fine and only has one conditional in it.
> > In some cases the compiler may already know that the count is small enough.
> 
> Indeed. If count is small enough, the code isn't that horrid. If I
> take this example:
> 
>   size_t foo(u32 count)
>   {
>           return struct_size_t(struct s, fam, count);
>   }
> 
> I got this code:

What happens if the flex-array is larger than 1 byte - so a multiply is needed.
Probably worth testing something that isn't a power of 2.
Also check 32bit archs -godbolt can help.

	David

> 
>   0000000000000010 <foo>:
>     10:   f3 0f 1e fa             endbr64
>     14:   89 f8                   mov    %edi,%eax
>     16:   48 83 c0 10             add    $0x10,%rax
>     1a:   e9 00 00 00 00          jmp    1f <foo+0xf>

Add -d to objdump to get the relocation printed.
(I think this is a build where 'ret' isn't allowed :-)

	David

> 
> Here, no SIZE_MAX because the multiplication can not wraparound
> regardless of the value of count. It is only after changing the type
> of count to u64 that the compiler will emit a comparison:
> 
> 0000000000000010 <foo>:
>   10:   f3 0f 1e fa             endbr64
>   14:   48 8d 47 10             lea    0x10(%rdi),%rax
>   18:   48 83 ff f0             cmp    $0xfffffffffffffff0,%rdi
>   1c:   48 c7 c2 ff ff ff ff    mov    $0xffffffffffffffff,%rdx
>   23:   48 0f 43 c2             cmovae %rdx,%rax
>   27:   e9 00 00 00 00          jmp    2c <foo+0x1c>
> 
> For reference, this is the code after applying my patch, with count as a u32:
> 
>   0000000000000010 <foo>:
>     10:   f3 0f 1e fa             endbr64
>     14:   89 f8                   mov    %edi,%eax
>     16:   ba 10 00 00 00          mov    $0x10,%edx
>     1b:   48 83 c0 0c             add    $0xc,%rax
>     1f:   48 39 d0                cmp    %rdx,%rax
>     22:   48 0f 42 c2             cmovb  %rdx,%rax
>     26:   e9 00 00 00 00          jmp    2b <foo+0x1b>
> 
> and with count as a u64:
> 
>   0000000000000010 <foo>:
>     10:   f3 0f 1e fa             endbr64
>     14:   48 83 c7 0c             add    $0xc,%rdi
>     18:   72 11                   jb     2b <foo+0x1b>
>     1a:   b8 10 00 00 00          mov    $0x10,%eax
>     1f:   48 39 c7                cmp    %rax,%rdi
>     22:   48 0f 43 c7             cmovae %rdi,%rax
>     26:   e9 00 00 00 00          jmp    2b <foo+0x1b>
>     2b:   48 83 c8 ff             or     $0xffffffffffffffff,%rax
>     2f:   e9 00 00 00 00          jmp    34 <foo+0x24>
> 
> Thank you for your comments!
> 
> 
> Yours sincerely,
> Vincent Mailhol

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Vincent Mailhol Sept. 13, 2024, 4:45 a.m. UTC | #5

[resend] I do not know why but below message got blocked for, I quote:

  Your message looked spammy to us. Please check
  https://subspace.kernel.org/etiquette.html and resend.

And I have no clue which part I violated. Maybe it is the gmail web
client? Resending with the git client, hoping this time it will work.

On Tue. 12 sept. 2024 at 01:46, David Laight <David.Laight@aculab.com> wrote:
> ...
> > > [1] Both the '+' and '*' have extra code to detect overflow and return
> > >   a 'big' value that will cause kmalloc() to return NULL.
> > > I've not looked at the generated code but it is likely to be horrid
> > >   (especially the check for multiply overflowing).
> > > In this case there are enough constants that the alternative check:
> > >         if (count > (MAX_SIZE - sizeof (*s)) / sizeof (s->member))
> > >                 size = MAX_SIZE;
> > >         else
> > >                 size = sizeof (*s) + count * sizeof (s->member);
> > > is fine and only has one conditional in it.
> > > In some cases the compiler may already know that the count is small enough.
> >
> > Indeed. If count is small enough, the code isn't that horrid. If I
> > take this example:
> >
> >   size_t foo(u32 count)
> >   {
> >           return struct_size_t(struct s, fam, count);
> >   }
> >
> > I got this code:
>
> What happens if the flex-array is larger than 1 byte - so a multiply is needed.
> Probably worth testing something that isn't a power of 2.
> Also check 32bit archs -godbolt can help.

Here it is:

  https://godbolt.org/z/dKdvW631e

The original is on the right, my patch v2 is on the left. I did not
add the optimization of only doing the max() if sizeof(struct) !=
offsetof().

It always takes time to copy the macros and make them work in godbolt,
so I was just a bit lazy in my previous message.

If the fam not being a power of 2 does not change things so much. At
the end, it is just a problem whether:

  sizeofof(*s) + sizeof(*s->fam) * type_max(count)

wraps around or not.

Of course, there are a few cases in which the multiplication will not
warp, but only the addition will. This always occurs when the fam
element is one byte and the element count has the same type width as
size_t, but it could also happen on degenerated cases as illustrated
in warp_only_on_add_fam32() (I can not think of a more realistic
example).

I will do a few more tests and maybe come up with a v3 depending on
what I found.

>         David
>
> >
> >   0000000000000010 <foo>:
> >     10:   f3 0f 1e fa             endbr64
> >     14:   89 f8                   mov    %edi,%eax
> >     16:   48 83 c0 10             add    $0x10,%rax
> >     1a:   e9 00 00 00 00          jmp    1f <foo+0xf>
>
> Add -d to objdump to get the relocation printed.

What I previously posted was the result of:

  objdump -d foo.o

> (I think this is a build where 'ret' isn't allowed :-)

I just did a defconfig build. I rarely try to tweak this part of the
config. Now that I did the gobolt, I will not investigate more on the
config.

Yours sincerely,
Vincent Mailhol

[v2] overflow: optimize struct_size() calculation

Commit Message

Comments

Patch