[04/64] stddef: Introduce struct_group() helper macro

Message ID 20210727205855.411487-5-keescook@chromium.org (mailing list archive)
State Not Applicable
Series Introduce strict memcpy() bounds checking

Checks

Context Check Description
netdev/tree_selection success Guessing tree name failed - patch did not apply

Commit Message

Kees Cook July 27, 2021, 8:57 p.m. UTC
Kernel code has a regular need to describe groups of members within a
structure, usually when they need to be copied or initialized separately
from the rest of the surrounding structure. The generally accepted design
pattern in C is to use a named sub-struct:

	struct foo {
		int one;
		struct {
			int two;
			int three;
		} thing;
		int four;
	};

This would allow for traditional references and sizing:

	memcpy(&dst.thing, &src.thing, sizeof(dst.thing));

However, doing this would mean that referencing struct members enclosed
by such named structs would always require including the sub-struct name
in identifiers:

	do_something(dst.thing.three);
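
As a minimal userspace sketch of the named sub-struct pattern described above (the helper name copy_thing() is hypothetical and not part of this patch), the group can be copied and sized through its name while the surrounding members stay untouched:

```c
#include <string.h>

struct foo {
	int one;
	struct {
		int two;
		int three;
	} thing;
	int four;
};

/* Copy only the grouped members; "one" and "four" in dst are not modified. */
static void copy_thing(struct foo *dst, const struct foo *src)
{
	memcpy(&dst->thing, &src->thing, sizeof(dst->thing));
}
```

The cost is visible at every call site: the members can only be reached as dst->thing.two and dst->thing.three.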

This has tended to be quite inflexible, especially when such groupings
need to be added to established code, which causes huge naming churn.
Three workarounds exist in the kernel for this problem, and each has
other negative properties.

To avoid the naming churn, there is a design pattern of adding macro
aliases for the named struct:

	#define f_three thing.three

This ends up polluting the global namespace, and makes it difficult to
search for identifiers.

Another common workaround in kernel code avoids the pollution by avoiding
the named struct entirely, instead identifying the group's boundaries using
either a pair of empty anonymous structs or a pair of zero-element arrays:

	struct foo {
		int one;
		struct { } start;
		int two;
		int three;
		struct { } finish;
		int four;
	};

	struct foo {
		int one;
		int start[0];
		int two;
		int three;
		int finish[0];
		int four;
	};

This allows code to avoid needing to use a sub-struct name for member
references within the surrounding structure, but loses the benefits of
being able to actually use such a struct, making it rather fragile. Using
these requires open-coded calculation of sizes and offsets. The efforts
made to avoid common mistakes include lots of comments, or adding various
BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason
about the boundaries (e.g. the "start" object looks like it's 0 bytes
in length and is not structurally associated with "finish"), making bounds
checking depend on open-coded calculations:

	if (length > offsetof(struct foo, finish) -
		     offsetof(struct foo, start))
		return -EINVAL;
	memcpy(&dst.start, &src.start, length);

However, the vast majority of places in the kernel that operate on
groups of members do so without any identification of the grouping,
relying either on comments or implicit knowledge of the struct contents,
which is even harder for the compiler to reason about, and results in
even more fragile manual sizing, usually depending on member locations
outside of the region (e.g. to copy "two" and "three", use the start of
"four" to find the size):

	BUILD_BUG_ON((offsetof(struct foo, four) <
		      offsetof(struct foo, two)) ||
		     (offsetof(struct foo, four) <
		      offsetof(struct foo, three)));
	if (length > offsetof(struct foo, four) -
		     offsetof(struct foo, two))
		return -EINVAL;
	memcpy(&dst.two, &src.two, length);

And both of the prior two idioms additionally appear to write beyond the
end of the referenced struct member, forcing the compiler to ignore any
attempt to perform bounds checking.
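
For illustration, the unmarked-group idiom can be sketched in userspace as follows. The helper name copy_two_three() is hypothetical, and the sketch copies through byte pointers to the whole struct rather than through &dst.two as in the example above, so that the sketch itself does not write past a single member. What it is meant to show is that the region's size exists only as hand-written offset arithmetic that depends on "four", a member outside the region:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct foo {
	int one;
	int two;
	int three;
	int four;
};

/*
 * Hypothetical helper: copy up to the region [two, four) by hand.
 * Nothing ties the size calculation to the members actually copied.
 */
static int copy_two_three(struct foo *dst, const struct foo *src, size_t length)
{
	const size_t start = offsetof(struct foo, two);
	const size_t end = offsetof(struct foo, four);

	if (length > end - start)
		return -EINVAL;
	memcpy((char *)dst + start, (const char *)src + start, length);
	return 0;
}
```

If a member is later inserted between "three" and "four", the check silently starts covering it as well.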

In order to have a regular programmatic way to describe a struct
region that can be used for references and sizing, can be examined for
bounds checking, avoids forcing the use of intermediate identifiers,
and avoids polluting the global namespace, introduce the struct_group()
macro. This macro wraps the member declarations to create an anonymous
union of an anonymous struct (no intermediate name) and a named struct
(for references and sizing):

	struct foo {
		int one;
		struct_group(thing,
			int two,
			int three,
		);
		int four;
	};

	if (length > sizeof(src.thing))
		return -EINVAL;
	memcpy(&dst.thing, &src.thing, length);
	do_something(dst.three);

There are some rare cases where the resulting struct_group() needs
attributes added, so struct_group_attr() is also introduced to allow
for specifying struct attributes (e.g. __aligned(x) or __packed).
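
A userspace sketch of the idea, assuming a compiler that supports C11 anonymous structs and unions. The macro below is a stand-in that drops the ATTRS argument, copy_thing() is a hypothetical helper, and the members are terminated with semicolons so that they form valid declarations:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the proposed macro, without the attributes argument. */
#define struct_group(NAME, MEMBERS) \
	union { \
		struct { MEMBERS }; \
		struct { MEMBERS } NAME; \
	}

struct foo {
	int one;
	struct_group(thing,
		int two;
		int three;
	);
	int four;
};

/* Bounds check and copy using the named view of the group. */
static int copy_thing(struct foo *dst, const struct foo *src, size_t length)
{
	if (length > sizeof(src->thing))
		return -EINVAL;
	memcpy(&dst->thing, &src->thing, length);
	return 0;
}
```

Both views alias the same storage, so dst->three and dst->thing.three name the same object, and sizeof(dst->thing) follows the member list automatically.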

Co-developed-by: Keith Packard <keithpac@amazon.com>
Signed-off-by: Keith Packard <keithpac@amazon.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/linux/stddef.h | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

Comments

Gustavo A. R. Silva July 28, 2021, 2:32 a.m. UTC | #1
On Tue, Jul 27, 2021 at 01:57:55PM -0700, Kees Cook wrote:
> Kernel code has a regular need to describe groups of members within a
> structure usually when they need to be copied or initialized separately
> from the rest of the surrounding structure. The generally accepted design
> pattern in C is to use a named sub-struct:
> 
> 	struct foo {
> 		int one;
> 		struct {
> 			int two;
> 			int three;
> 		} thing;
> 		int four;
> 	};
> 
> This would allow for traditional references and sizing:
> 
> 	memcpy(&dst.thing, &src.thing, sizeof(dst.thing));
> 
> However, doing this would mean that referencing struct members enclosed
> by such named structs would always require including the sub-struct name
> in identifiers:
> 
> 	do_something(dst.thing.three);
> 
> This has tended to be quite inflexible, especially when such groupings
> need to be added to established code which causes huge naming churn.
> Three workarounds exist in the kernel for this problem, and each has
> other negative properties.
> 
> To avoid the naming churn, there is a design pattern of adding macro
> aliases for the named struct:
> 
> 	#define f_three thing.three
> 
> This ends up polluting the global namespace, and makes it difficult to
> search for identifiers.
> 
> Another common work-around in kernel code avoids the pollution by avoiding
> the named struct entirely, instead identifying the group's boundaries using
> either a pair of empty anonymous structs or a pair of zero-element arrays:
> 
> 	struct foo {
> 		int one;
> 		struct { } start;
> 		int two;
> 		int three;
> 		struct { } finish;
> 		int four;
> 	};
> 
> 	struct foo {
> 		int one;
> 		int start[0];
> 		int two;
> 		int three;
> 		int finish[0];
> 		int four;
> 	};
> 
> This allows code to avoid needing to use a sub-struct name for member
> references within the surrounding structure, but loses the benefits of
> being able to actually use such a struct, making it rather fragile. Using
> these requires open-coded calculation of sizes and offsets. The efforts
> made to avoid common mistakes include lots of comments, or adding various
> BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason
> about the boundaries (e.g. the "start" object looks like it's 0 bytes
> in length and is not structurally associated with "finish"), making bounds
> checking depend on open-coded calculations:
> 
> 	if (length > offsetof(struct foo, finish) -
> 		     offsetof(struct foo, start))
> 		return -EINVAL;
> 	memcpy(&dst.start, &src.start, length);
> 
> However, the vast majority of places in the kernel that operate on
> groups of members do so without any identification of the grouping,
> relying either on comments or implicit knowledge of the struct contents,
> which is even harder for the compiler to reason about, and results in
> even more fragile manual sizing, usually depending on member locations
> outside of the region (e.g. to copy "two" and "three", use the start of
> "four" to find the size):
> 
> 	BUILD_BUG_ON((offsetof(struct foo, four) <
> 		      offsetof(struct foo, two)) ||
> 		     (offsetof(struct foo, four) <
> 		      offsetof(struct foo, three)));
> 	if (length > offsetof(struct foo, four) -
> 		     offsetof(struct foo, two))
> 		return -EINVAL;
> 	memcpy(&dst.two, &src.two, length);
> 
> And both of the prior two idioms additionally appear to write beyond the
> end of the referenced struct member, forcing the compiler to ignore any
> attempt to perform bounds checking.
> 
> In order to have a regular programmatic way to describe a struct
> region that can be used for references and sizing, can be examined for
> bounds checking, avoids forcing the use of intermediate identifiers,
> and avoids polluting the global namespace, introduce the struct_group()
> macro. This macro wraps the member declarations to create an anonymous
> union of an anonymous struct (no intermediate name) and a named struct
> (for references and sizing):
> 
> 	struct foo {
> 		int one;
> 		struct_group(thing,
> 			int two,
> 			int three,
> 		);
> 		int four;
> 	};
> 
> 	if (length > sizeof(src.thing))
> 		return -EINVAL;
> 	memcpy(&dst.thing, &src.thing, length);
> 	do_something(dst.three);
> 
> There are some rare cases where the resulting struct_group() needs
> attributes added, so struct_group_attr() is also introduced to allow
> for specifying struct attributes (e.g. __aligned(x) or __packed).
> 
> Co-developed-by: Keith Packard <keithpac@amazon.com>
> Signed-off-by: Keith Packard <keithpac@amazon.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>

Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>

Love it! :)

Thanks
--
Gustavo

> ---
>  include/linux/stddef.h | 34 ++++++++++++++++++++++++++++++++++
>  1 file changed, 34 insertions(+)
> 
> diff --git a/include/linux/stddef.h b/include/linux/stddef.h
> index 998a4ba28eba..cf7f866944f9 100644
> --- a/include/linux/stddef.h
> +++ b/include/linux/stddef.h
> @@ -36,4 +36,38 @@ enum {
>  #define offsetofend(TYPE, MEMBER) \
>  	(offsetof(TYPE, MEMBER)	+ sizeof_field(TYPE, MEMBER))
>  
> +/**
> + * struct_group_attr(NAME, ATTRS, MEMBERS)
> + *
> + * Used to create an anonymous union of two structs with identical
> + * layout and size: one anonymous and one named. The former can be
> + * used normally without sub-struct naming, and the latter can be
> + * used to reason about the start, end, and size of the group of
> + * struct members. Includes structure attributes argument.
> + *
> + * @NAME: The name of the mirrored sub-struct
> + * @ATTRS: Any struct attributes (normally empty)
> + * @MEMBERS: The member declarations for the mirrored structs
> + */
> +#define struct_group_attr(NAME, ATTRS, MEMBERS) \
> +	union { \
> +		struct { MEMBERS } ATTRS; \
> +		struct { MEMBERS } ATTRS NAME; \
> +	}
> +
> +/**
> + * struct_group(NAME, MEMBERS)
> + *
> + * Used to create an anonymous union of two structs with identical
> + * layout and size: one anonymous and one named. The former can be
> + * used normally without sub-struct naming, and the latter can be
> + * used to reason about the start, end, and size of the group of
> + * struct members.
> + *
> + * @NAME: The name of the mirrored sub-struct
> + * @MEMBERS: The member declarations for the mirrored structs
> + */
> +#define struct_group(NAME, MEMBERS)	\
> +	struct_group_attr(NAME, /* no attrs */, MEMBERS)
> +
>  #endif
> -- 
> 2.30.2
>
Rasmus Villemoes July 28, 2021, 10:54 a.m. UTC | #2
On 27/07/2021 22.57, Kees Cook wrote:

> In order to have a regular programmatic way to describe a struct
> region that can be used for references and sizing, can be examined for
> bounds checking, avoids forcing the use of intermediate identifiers,
> and avoids polluting the global namespace, introduce the struct_group()
> macro. This macro wraps the member declarations to create an anonymous
> union of an anonymous struct (no intermediate name) and a named struct
> (for references and sizing):
> 
> 	struct foo {
> 		int one;
> 		struct_group(thing,
> 			int two,
> 			int three,
> 		);
> 		int four;
> 	};

That example won't compile, the commas after two and three should be
semicolons.

And your implementation relies on MEMBERS not containing any comma
tokens, but as

  int a, b, c, d;

is a valid way to declare multiple members, consider making MEMBERS
variadic

#define struct_group(NAME, MEMBERS...)

to have it slurp up every subsequent argument and make that work.
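
A small sketch of the difference, under the assumption that standard C99 __VA_ARGS__ is an acceptable substitute for the GNU named form "MEMBERS..." suggested above (struct bar and its members are made up for illustration). With a single MEMBERS parameter, the commas in "int a, b, c, d;" would be parsed as extra macro arguments. The variadic form passes them through unchanged:

```c
#include <stddef.h>

/* Variadic stand-in: everything after NAME, commas included, is the member list. */
#define struct_group(NAME, ...) \
	union { \
		struct { __VA_ARGS__ }; \
		struct { __VA_ARGS__ } NAME; \
	}

struct bar {
	char kind;
	struct_group(coords,
		int a, b, c, d;
	);
};
```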

> 
> Co-developed-by: Keith Packard <keithpac@amazon.com>
> Signed-off-by: Keith Packard <keithpac@amazon.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  include/linux/stddef.h | 34 ++++++++++++++++++++++++++++++++++

Bikeshedding a bit, but do we need to add 34 lines that need to be
preprocessed to virtually each and every translation unit [as opposed to
adding a struct_group.h header]? Oh well, you need it for struct
skbuff.h, so it would be pulled in by a lot regardless :(

Rasmus
Kees Cook July 28, 2021, 9:59 p.m. UTC | #3
On Wed, Jul 28, 2021 at 12:54:18PM +0200, Rasmus Villemoes wrote:
> On 27/07/2021 22.57, Kees Cook wrote:
> 
> > In order to have a regular programmatic way to describe a struct
> > region that can be used for references and sizing, can be examined for
> > bounds checking, avoids forcing the use of intermediate identifiers,
> > and avoids polluting the global namespace, introduce the struct_group()
> > macro. This macro wraps the member declarations to create an anonymous
> > union of an anonymous struct (no intermediate name) and a named struct
> > (for references and sizing):
> > 
> > 	struct foo {
> > 		int one;
> > 		struct_group(thing,
> > 			int two,
> > 			int three,
> > 		);
> > 		int four;
> > 	};
> 
> That example won't compile, the commas after two and three should be
> semicolons.

Oops, yes, thanks. This is why I shouldn't write code that doesn't first
go through a compiler. ;)

> And your implementation relies on MEMBERS not containing any comma
> tokens, but as
> 
>   int a, b, c, d;
> 
> is a valid way to declare multiple members, consider making MEMBERS
> variadic
> 
> #define struct_group(NAME, MEMBERS...)
> 
> to have it slurp up every subsequent argument and make that work.

Ah! Perfect, thank you. I totally forgot I could do it that way.

> 
> > 
> > Co-developed-by: Keith Packard <keithpac@amazon.com>
> > Signed-off-by: Keith Packard <keithpac@amazon.com>
> > Signed-off-by: Kees Cook <keescook@chromium.org>
> > ---
> >  include/linux/stddef.h | 34 ++++++++++++++++++++++++++++++++++
> 
> Bikeshedding a bit, but do we need to add 34 lines that need to be
> preprocessed to virtually each and every translation unit [as opposed to
> adding a struct_group.h header]? Oh well, you need it for struct
> skbuff.h, so it would be pulled in by a lot regardless :(

My instinct is to make these kinds of helpers "always available" (like
sizeof_field(), etc), but I have no strong opinion on where it should
live. If the consensus is to move it, I certainly can! :)

-Kees
Dan Williams July 30, 2021, 10:19 p.m. UTC | #4
On Wed, 2021-07-28 at 14:59 -0700, Kees Cook wrote:
> On Wed, Jul 28, 2021 at 12:54:18PM +0200, Rasmus Villemoes wrote:
> > On 27/07/2021 22.57, Kees Cook wrote:
> > 
> > > In order to have a regular programmatic way to describe a struct
> > > region that can be used for references and sizing, can be examined for
> > > bounds checking, avoids forcing the use of intermediate identifiers,
> > > and avoids polluting the global namespace, introduce the struct_group()
> > > macro. This macro wraps the member declarations to create an anonymous
> > > union of an anonymous struct (no intermediate name) and a named struct
> > > (for references and sizing):
> > > 
> > >         struct foo {
> > >                 int one;
> > >                 struct_group(thing,
> > >                         int two,
> > >                         int three,
> > >                 );
> > >                 int four;
> > >         };
> > 
> > That example won't compile, the commas after two and three should be
> > semicolons.
> 
> Oops, yes, thanks. This is why I shouldn't write code that doesn't first
> go through a compiler. ;)
> 
> > And your implementation relies on MEMBERS not containing any comma
> > tokens, but as
> > 
> >   int a, b, c, d;
> > 
> > is a valid way to declare multiple members, consider making MEMBERS
> > variadic
> > 
> > #define struct_group(NAME, MEMBERS...)
> > 
> > to have it slurp up every subsequent argument and make that work.
> 
> Ah! Perfect, thank you. I totally forgot I could do it that way.

This is great, Kees. It just so happens it would clean up what we are
already doing in drivers/cxl/cxl.h for anonymous + named register block
pointers. However in the cxl case it also needs the named structure to
be typed. Any appetite for a typed version of this?

Here is a rough idea of the cleanup it would induce in drivers/cxl/:

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 53927f9fa77e..a2308c995654 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -75,52 +75,19 @@ static inline int cxl_hdm_decoder_count(u32 cap_hdr)
 #define CXLDEV_MBOX_BG_CMD_STATUS_OFFSET 0x18
 #define CXLDEV_MBOX_PAYLOAD_OFFSET 0x20
 
-#define CXL_COMPONENT_REGS() \
-       void __iomem *hdm_decoder
-
-#define CXL_DEVICE_REGS() \
-       void __iomem *status; \
-       void __iomem *mbox; \
-       void __iomem *memdev
-
-/* See note for 'struct cxl_regs' for the rationale of this organization */
 /*
- * CXL_COMPONENT_REGS - Common set of CXL Component register block base pointers
  * @hdm_decoder: CXL 2.0 8.2.5.12 CXL HDM Decoder Capability Structure
- */
-struct cxl_component_regs {
-       CXL_COMPONENT_REGS();
-};
-
-/* See note for 'struct cxl_regs' for the rationale of this organization */
-/*
- * CXL_DEVICE_REGS - Common set of CXL Device register block base pointers
  * @status: CXL 2.0 8.2.8.3 Device Status Registers
  * @mbox: CXL 2.0 8.2.8.4 Mailbox Registers
  * @memdev: CXL 2.0 8.2.8.5 Memory Device Registers
  */
-struct cxl_device_regs {
-       CXL_DEVICE_REGS();
-};
-
-/*
- * Note, the anonymous union organization allows for per
- * register-block-type helper routines, without requiring block-type
- * agnostic code to include the prefix.
- */
 struct cxl_regs {
-       union {
-               struct {
-                       CXL_COMPONENT_REGS();
-               };
-               struct cxl_component_regs component;
-       };
-       union {
-               struct {
-                       CXL_DEVICE_REGS();
-               };
-               struct cxl_device_regs device_regs;
-       };
+       struct_group_typed(cxl_component_regs, component,
+               void __iomem *hdm_decoder;
+       );
+       struct_group_typed(cxl_device_regs, device_regs,
+               void __iomem *status, *mbox, *memdev;
+       );
 };
 
 struct cxl_reg_map {
diff --git a/include/linux/stddef.h b/include/linux/stddef.h
index cf7f866944f9..84b7de24ffb5 100644
--- a/include/linux/stddef.h
+++ b/include/linux/stddef.h
@@ -49,12 +49,18 @@ enum {
  * @ATTRS: Any struct attributes (normally empty)
  * @MEMBERS: The member declarations for the mirrored structs
  */
-#define struct_group_attr(NAME, ATTRS, MEMBERS) \
+#define struct_group_attr(NAME, ATTRS, MEMBERS...) \
        union { \
                struct { MEMBERS } ATTRS; \
                struct { MEMBERS } ATTRS NAME; \
        }
 
+#define struct_group_attr_typed(TYPE, NAME, ATTRS, MEMBERS...) \
+       union { \
+               struct { MEMBERS } ATTRS; \
+               struct TYPE { MEMBERS } ATTRS NAME; \
+       }
+
 /**
  * struct_group(NAME, MEMBERS)
  *
@@ -67,7 +73,10 @@ enum {
  * @NAME: The name of the mirrored sub-struct
  * @MEMBERS: The member declarations for the mirrored structs
  */
-#define struct_group(NAME, MEMBERS)    \
+#define struct_group(NAME, MEMBERS...) \
        struct_group_attr(NAME, /* no attrs */, MEMBERS)
 
+#define struct_group_typed(TYPE, NAME, MEMBERS...) \
+       struct_group_attr_typed(TYPE, NAME, /* no attrs */, MEMBERS)
+
 #endif
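
To illustrate why a typed variant is useful, here is a userspace sketch. It assumes plain "void *" in place of "void __iomem *", uses standard __VA_ARGS__, and the names struct regs, struct device_regs and device_regs_mapped() are hypothetical. Because C struct tags declared inside another struct are visible in the enclosing scope, the named copy of the group can be passed to helpers by its own type:

```c
#include <stddef.h>

/* Stand-in for the typed variant, without the attributes argument. */
#define struct_group_typed(TYPE, NAME, ...) \
	union { \
		struct { __VA_ARGS__ }; \
		struct TYPE { __VA_ARGS__ } NAME; \
	}

struct regs {
	struct_group_typed(device_regs, device,
		void *status, *mbox, *memdev;
	);
};

/* A helper that only needs the device block can take the group's own type. */
static int device_regs_mapped(const struct device_regs *d)
{
	return d->status && d->mbox && d->memdev;
}
```

Code that does not care about the grouping can still write r.status, while block-specific helpers receive &r.device. Note that this relies on C scoping rules and would not work the same way in C++.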
Kees Cook July 31, 2021, 2:59 a.m. UTC | #5
On Fri, Jul 30, 2021 at 10:19:20PM +0000, Williams, Dan J wrote:
> On Wed, 2021-07-28 at 14:59 -0700, Kees Cook wrote:
> > On Wed, Jul 28, 2021 at 12:54:18PM +0200, Rasmus Villemoes wrote:
> > > On 27/07/2021 22.57, Kees Cook wrote:
> > > 
> > > > In order to have a regular programmatic way to describe a struct
> > > > region that can be used for references and sizing, can be examined for
> > > > bounds checking, avoids forcing the use of intermediate identifiers,
> > > > and avoids polluting the global namespace, introduce the struct_group()
> > > > macro. This macro wraps the member declarations to create an anonymous
> > > > union of an anonymous struct (no intermediate name) and a named struct
> > > > (for references and sizing):
> > > > 
> > > >         struct foo {
> > > >                 int one;
> > > >                 struct_group(thing,
> > > >                         int two,
> > > >                         int three,
> > > >                 );
> > > >                 int four;
> > > >         };
> > > 
> > > That example won't compile, the commas after two and three should be
> > > semicolons.
> > 
> > Oops, yes, thanks. This is why I shouldn't write code that doesn't first
> > go through a compiler. ;)
> > 
> > > And your implementation relies on MEMBERS not containing any comma
> > > tokens, but as
> > > 
> > >   int a, b, c, d;
> > > 
> > > is a valid way to declare multiple members, consider making MEMBERS
> > > variadic
> > > 
> > > #define struct_group(NAME, MEMBERS...)
> > > 
> > > to have it slurp up every subsequent argument and make that work.
> > 
> > Ah! Perfect, thank you. I totally forgot I could do it that way.
> 
> This is great Kees. It just so happens it would clean-up what we are
> already doing in drivers/cxl/cxl.h for anonymous + named register block
> pointers. However in the cxl case it also needs the named structure to
> be typed. Any appetite for a typed version of this?

Oh cool! Yeah, totally I can expand it. Thanks for the suggestion!

> 
> Here is a rough idea of the cleanup it would induce in drivers/cxl/:
> 
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 53927f9fa77e..a2308c995654 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -75,52 +75,19 @@ static inline int cxl_hdm_decoder_count(u32 cap_hdr)
>  #define CXLDEV_MBOX_BG_CMD_STATUS_OFFSET 0x18
>  #define CXLDEV_MBOX_PAYLOAD_OFFSET 0x20
>  
> -#define CXL_COMPONENT_REGS() \
> -       void __iomem *hdm_decoder
> -
> -#define CXL_DEVICE_REGS() \
> -       void __iomem *status; \
> -       void __iomem *mbox; \
> -       void __iomem *memdev
> -
> -/* See note for 'struct cxl_regs' for the rationale of this organization */
>  /*
> - * CXL_COMPONENT_REGS - Common set of CXL Component register block base pointers
>   * @hdm_decoder: CXL 2.0 8.2.5.12 CXL HDM Decoder Capability Structure
> - */
> -struct cxl_component_regs {
> -       CXL_COMPONENT_REGS();
> -};
> -
> -/* See note for 'struct cxl_regs' for the rationale of this organization */
> -/*
> - * CXL_DEVICE_REGS - Common set of CXL Device register block base pointers
>   * @status: CXL 2.0 8.2.8.3 Device Status Registers
>   * @mbox: CXL 2.0 8.2.8.4 Mailbox Registers
>   * @memdev: CXL 2.0 8.2.8.5 Memory Device Registers
>   */
> -struct cxl_device_regs {
> -       CXL_DEVICE_REGS();
> -};
> -
> -/*
> - * Note, the anonymous union organization allows for per
> - * register-block-type helper routines, without requiring block-type
> - * agnostic code to include the prefix.
> - */
>  struct cxl_regs {
> -       union {
> -               struct {
> -                       CXL_COMPONENT_REGS();
> -               };
> -               struct cxl_component_regs component;
> -       };
> -       union {
> -               struct {
> -                       CXL_DEVICE_REGS();
> -               };
> -               struct cxl_device_regs device_regs;
> -       };
> +       struct_group_typed(cxl_component_regs, component,
> +               void __iomem *hdm_decoder;
> +       );
> +       struct_group_typed(cxl_device_regs, device_regs,
> +               void __iomem *status, *mbox, *memdev;
> +       );
>  };
>  
>  struct cxl_reg_map {
> diff --git a/include/linux/stddef.h b/include/linux/stddef.h
> index cf7f866944f9..84b7de24ffb5 100644
> --- a/include/linux/stddef.h
> +++ b/include/linux/stddef.h
> @@ -49,12 +49,18 @@ enum {
>   * @ATTRS: Any struct attributes (normally empty)
>   * @MEMBERS: The member declarations for the mirrored structs
>   */
> -#define struct_group_attr(NAME, ATTRS, MEMBERS) \
> +#define struct_group_attr(NAME, ATTRS, MEMBERS...) \
>         union { \
>                 struct { MEMBERS } ATTRS; \
>                 struct { MEMBERS } ATTRS NAME; \
>         }
>  
> +#define struct_group_attr_typed(TYPE, NAME, ATTRS, MEMBERS...) \
> +       union { \
> +               struct { MEMBERS } ATTRS; \
> +               struct TYPE { MEMBERS } ATTRS NAME; \
> +       }
> +
>  /**
>   * struct_group(NAME, MEMBERS)
>   *
> @@ -67,7 +73,10 @@ enum {
>   * @NAME: The name of the mirrored sub-struct
>   * @MEMBERS: The member declarations for the mirrored structs
>   */
> -#define struct_group(NAME, MEMBERS)    \
> +#define struct_group(NAME, MEMBERS...) \
>         struct_group_attr(NAME, /* no attrs */, MEMBERS)
>  
> +#define struct_group_typed(TYPE, NAME, MEMBERS...) \
> +       struct_group_attr_typed(TYPE, NAME, /* no attrs */, MEMBERS)
> +
>  #endif

Awesome! My instinct is to expose the resulting API as:

__struct_group(type, name, attrs, members...)

struct_group(name, members...)
struct_group_attr(name, attrs, members...)
struct_group_typed(type, name, members...)
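
One way this layering could look, as a rough userspace sketch rather than the final kernel implementation. It assumes standard __VA_ARGS__, uses the GCC/Clang spelling __attribute__((packed)) in place of the kernel's __packed, and struct msg is made up for illustration. Empty macro arguments stand in for "no tag" and "no attrs":

```c
#include <stddef.h>

/* Single core macro; the public helpers only fix some of its arguments. */
#define __struct_group(TAG, NAME, ATTRS, ...) \
	union { \
		struct { __VA_ARGS__ } ATTRS; \
		struct TAG { __VA_ARGS__ } ATTRS NAME; \
	}

#define struct_group(NAME, ...) \
	__struct_group(/* no tag */, NAME, /* no attrs */, __VA_ARGS__)
#define struct_group_attr(NAME, ATTRS, ...) \
	__struct_group(/* no tag */, NAME, ATTRS, __VA_ARGS__)
#define struct_group_typed(TAG, NAME, ...) \
	__struct_group(TAG, NAME, /* no attrs */, __VA_ARGS__)

struct msg {
	/* Both mirrored structs receive the same attribute. */
	struct_group_attr(hdr, __attribute__((packed)),
		char kind;
		int len;
	);
	int payload;
};
```

Applying ATTRS to both the anonymous and the named struct keeps their layouts identical, which is what allows the two views to alias each other.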
Kees Cook July 31, 2021, 3:10 p.m. UTC | #6
On Sat, Jul 31, 2021 at 07:24:44AM +0200, Rasmus Villemoes wrote:
> On Sat, Jul 31, 2021, 04:59 Kees Cook <keescook@chromium.org> wrote:
> 
> > On Fri, Jul 30, 2021 at 10:19:20PM +0000, Williams, Dan J wrote:
> > > On Wed, 2021-07-28 at 14:59 -0700, Kees Cook wrote:
> >
> > >  /**
> > >   * struct_group(NAME, MEMBERS)
> > >   *
> > > @@ -67,7 +73,10 @@ enum {
> > >   * @NAME: The name of the mirrored sub-struct
> > >   * @MEMBERS: The member declarations for the mirrored structs
> > >   */
> > > -#define struct_group(NAME, MEMBERS)    \
> > > +#define struct_group(NAME, MEMBERS...) \
> > >         struct_group_attr(NAME, /* no attrs */, MEMBERS)
> > >
> > > +#define struct_group_typed(TYPE, NAME, MEMBERS...) \
> > > +       struct_group_attr_typed(TYPE, NAME, /* no attrs */, MEMBERS)
> > > +
> > >  #endif
> >
> > Awesome! My instinct is to expose the resulting API as:
> >
> > __struct_group(type, name, attrs, members...)
> >
> > struct_group(name, members...)
> > struct_group_attr(name, attrs, members...)
> > struct_group_typed(type, name, members...)
> 
> Bikeshed: can we use proper nomenclature please. s/type/tag/,
> s/typed/tagged.

Ah! Thank you. I went looking for the spec on what these are called and
couldn't find it. "struct $tag" is the type, then, yes? So IIUC now:

       |    type   | members  | name
       |       tag
	struct foo { int bar; } baz;

Patch

diff --git a/include/linux/stddef.h b/include/linux/stddef.h
index 998a4ba28eba..cf7f866944f9 100644
--- a/include/linux/stddef.h
+++ b/include/linux/stddef.h
@@ -36,4 +36,38 @@  enum {
 #define offsetofend(TYPE, MEMBER) \
 	(offsetof(TYPE, MEMBER)	+ sizeof_field(TYPE, MEMBER))
 
+/**
+ * struct_group_attr(NAME, ATTRS, MEMBERS)
+ *
+ * Used to create an anonymous union of two structs with identical
+ * layout and size: one anonymous and one named. The former can be
+ * used normally without sub-struct naming, and the latter can be
+ * used to reason about the start, end, and size of the group of
+ * struct members. Includes structure attributes argument.
+ *
+ * @NAME: The name of the mirrored sub-struct
+ * @ATTRS: Any struct attributes (normally empty)
+ * @MEMBERS: The member declarations for the mirrored structs
+ */
+#define struct_group_attr(NAME, ATTRS, MEMBERS) \
+	union { \
+		struct { MEMBERS } ATTRS; \
+		struct { MEMBERS } ATTRS NAME; \
+	}
+
+/**
+ * struct_group(NAME, MEMBERS)
+ *
+ * Used to create an anonymous union of two structs with identical
+ * layout and size: one anonymous and one named. The former can be
+ * used normally without sub-struct naming, and the latter can be
+ * used to reason about the start, end, and size of the group of
+ * struct members.
+ *
+ * @NAME: The name of the mirrored sub-struct
+ * @MEMBERS: The member declarations for the mirrored structs
+ */
+#define struct_group(NAME, MEMBERS)	\
+	struct_group_attr(NAME, /* no attrs */, MEMBERS)
+
 #endif