
openat2: switch to __attribute__((packed)) for open_how

Message ID 20191213222351.14071-1-cyphar@cyphar.com (mailing list archive)
State New, archived
Series openat2: switch to __attribute__((packed)) for open_how

Commit Message

Aleksa Sarai Dec. 13, 2019, 10:23 p.m. UTC
The design of the original open_how struct layout was such that it
ensured that there would be no un-labelled (and thus potentially
non-zero) padding to avoid issues with struct expansion, as well as
providing a uniform representation on all architectures (to avoid
complications with OPEN_HOW_SIZE versioning).

However, there were a few other desirable features which were not
fulfilled by the previous struct layout:

 * Adding new features (other than new flags) should always result in
   the struct getting larger. However, by including a padding field, it
   was possible for new fields to be added without expanding the
   structure. This would somewhat complicate version-number based
   checking of feature support.

 * A non-zero bit in __padding yielded -EINVAL when it should arguably
   have been -E2BIG (because the padding bits are effectively
   yet-to-be-used fields). However, the semantics are not entirely clear
   because userspace may expect -E2BIG to only signify that the
   structure is too big. It's much simpler to just provide the guarantee
   that new fields will always result in a struct size increase, and
   -E2BIG indicates you're using a field that's too recent for an older
   kernel.

 * While the alignment for u64s was manually backed by extra padding
   fields, some languages (such as Rust) do not currently support
   enforcing alignment of struct field members.

 * The padding wasted space needlessly, and would very likely not be
   used up entirely by future extensions for a long time (because it
   couldn't fit a u64).

While none of these outstanding issues are deal-breakers, we can iron
out these warts before openat2(2) lands in Linus's tree. Instead of
using alignment and padding, we simply pack the structure with
__attribute__((packed)). Rust supports #[repr(packed)] and it removes
all of the issues with having explicit padding.
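
As a rough illustration of the size change described above, the two layouts can be compared with compile-time checks. This is only a sketch: the struct names and the aligned_u64 typedef are stand-ins made up for the example (the uapi header uses __aligned_u64 and a single struct open_how), and it assumes GCC or Clang attribute syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's __aligned_u64 (an assumption for this sketch). */
typedef uint64_t aligned_u64 __attribute__((aligned(8)));

/* Layout before this patch: explicit padding keeps resolve 8-byte aligned. */
struct open_how_padded {
	aligned_u64 flags;
	uint16_t mode;
	uint16_t padding[3]; /* must be zeroed */
	aligned_u64 resolve;
};

/* Layout proposed by this patch: no padding at all. */
struct open_how_packed {
	uint64_t flags;
	uint64_t resolve;
	uint16_t mode;
} __attribute__((packed));

static_assert(sizeof(struct open_how_padded) == 24, "size before the patch");
static_assert(offsetof(struct open_how_padded, resolve) == 16, "resolve follows the padding");
static_assert(sizeof(struct open_how_packed) == 18, "size with packed");
static_assert(offsetof(struct open_how_packed, mode) == 16, "mode is the last member");
```

The 24 and 18 match the OPEN_HOW_SIZE_VER0 values on either side of the diff.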

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 fs/open.c                                      |  2 --
 include/uapi/linux/fcntl.h                     | 11 +++++------
 tools/testing/selftests/openat2/helpers.h      | 11 +++++------
 tools/testing/selftests/openat2/openat2_test.c | 18 +-----------------
 4 files changed, 11 insertions(+), 31 deletions(-)


base-commit: 912dfe068c43fa13c587b8d30e73d335c5ba7d44

Comments

Rasmus Villemoes Dec. 14, 2019, 10:17 p.m. UTC | #1
On 13/12/2019 23.23, Aleksa Sarai wrote:
> The design of the original open_how struct layout was such that it
> ensured that there would be no un-labelled (and thus potentially
> non-zero) padding to avoid issues with struct expansion, as well as
> providing a uniform representation on all architectures (to avoid
> complications with OPEN_HOW_SIZE versioning).
> 
> However, there were a few other desirable features which were not
> fulfilled by the previous struct layout:
> 
>  * Adding new features (other than new flags) should always result in
>    the struct getting larger. However, by including a padding field, it
>    was possible for new fields to be added without expanding the
>    structure. This would somewhat complicate version-number based
>    checking of feature support.
> 
>  * A non-zero bit in __padding yielded -EINVAL when it should arguably
>    have been -E2BIG (because the padding bits are effectively
>    yet-to-be-used fields). However, the semantics are not entirely clear
>    because userspace may expect -E2BIG to only signify that the
>    structure is too big. It's much simpler to just provide the guarantee
>    that new fields will always result in a struct size increase, and
>    -E2BIG indicates you're using a field that's too recent for an older
>    kernel.

And when the first extension adds another u64 field, that padding has to
be added back in and checked for being 0, at which point the padding is
again yet-to-be-used fields. So what exactly is the problem with
returning EINVAL now?

>  * The padding wasted space needlessly, and would very likely not be
>    used up entirely by future extensions for a long time (because it
>    couldn't fit a u64).

Who knows, it does fit a u32. And if the struct is to be 8-byte aligned
(see below), it doesn't actually waste space.

> diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
> index d886bdb585e4..0e070c7f568a 100644
> --- a/include/uapi/linux/fcntl.h
> +++ b/include/uapi/linux/fcntl.h
> @@ -109,17 +109,16 @@
>   * O_TMPFILE} are set.
>   *
>   * @flags: O_* flags.
> - * @mode: O_CREAT/O_TMPFILE file mode.
>   * @resolve: RESOLVE_* flags.
> + * @mode: O_CREAT/O_TMPFILE file mode.
>   */
>  struct open_how {
> -	__aligned_u64 flags;
> +	__u64 flags;
> +	__u64 resolve;
>  	__u16 mode;
> -	__u16 __padding[3]; /* must be zeroed */
> -	__aligned_u64 resolve;
> -};
> +} __attribute__((packed));

IIRC, gcc assumes such a struct has alignment 1, which means that it
will generate horrible code to access it. So if you do this (and I don't
think it's a good idea), I think you'd also want to include a
__attribute__((__aligned__(8))) - or perhaps that can be accomplished by
just keeping flags as an explicitly aligned member. But that will of
course bump its sizeof() back to 24, at which point it seems better to
just make the padding explicit.
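
The alignment and size effects described above can be checked at compile time. A minimal sketch, assuming GCC or Clang; the struct names are invented for the example:

```c
#include <assert.h>
#include <stdint.h>

/* Packed only: the compiler treats the whole struct as 1-byte aligned. */
struct how_packed {
	uint64_t flags;
	uint64_t resolve;
	uint16_t mode;
} __attribute__((packed));

/* Packed plus aligned(8): alignment is restored, and so is the tail padding. */
struct how_packed_aligned {
	uint64_t flags;
	uint64_t resolve;
	uint16_t mode;
} __attribute__((packed, aligned(8)));

static_assert(_Alignof(struct how_packed) == 1, "packed drops alignment to 1");
static_assert(sizeof(struct how_packed) == 18, "no padding when packed");
static_assert(_Alignof(struct how_packed_aligned) == 8, "aligned(8) restores alignment");
static_assert(sizeof(struct how_packed_aligned) == 24, "size is rounded back up to 24");
```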

Rasmus
Aleksa Sarai Dec. 15, 2019, 12:34 p.m. UTC | #2
On 2019-12-14, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> On 13/12/2019 23.23, Aleksa Sarai wrote:
> > The design of the original open_how struct layout was such that it
> > ensured that there would be no un-labelled (and thus potentially
> > non-zero) padding to avoid issues with struct expansion, as well as
> > providing a uniform representation on all architectures (to avoid
> > complications with OPEN_HOW_SIZE versioning).
> > 
> > However, there were a few other desirable features which were not
> > fulfilled by the previous struct layout:
> > 
> >  * Adding new features (other than new flags) should always result in
> >    the struct getting larger. However, by including a padding field, it
> >    was possible for new fields to be added without expanding the
> >    structure. This would somewhat complicate version-number based
> >    checking of feature support.
> > 
> >  * A non-zero bit in __padding yielded -EINVAL when it should arguably
> >    have been -E2BIG (because the padding bits are effectively
> >    yet-to-be-used fields). However, the semantics are not entirely clear
> >    because userspace may expect -E2BIG to only signify that the
> >    structure is too big. It's much simpler to just provide the guarantee
> >    that new fields will always result in a struct size increase, and
> >    -E2BIG indicates you're using a field that's too recent for an older
> >    kernel.
> 
> And when the first extension adds another u64 field, that padding has to
> be added back in and checked for being 0, at which point the padding is
> again yet-to-be-used fields.

Maybe I'm missing something, but what is the issue with

  struct open_how {
    u64 flags;
    u64 resolve;
    u16 mode;
    u64 next_extension;
  } __attribute__((packed));

It was my understanding that __aligned_u64 was used to ensure consistent
layouts, not that it was needed for safety against unaligned accesses.
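
For reference, this is where the fields of such a packed struct would land. The sketch below uses userspace fixed-width types and an invented struct name, and assumes GCC or Clang; next_extension is a hypothetical field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct open_how_ext_packed {
	uint64_t flags;
	uint64_t resolve;
	uint16_t mode;
	uint64_t next_extension; /* hypothetical future field */
} __attribute__((packed));

/* The layout is the same everywhere, but the new u64 is not 8-byte aligned. */
static_assert(offsetof(struct open_how_ext_packed, next_extension) == 18,
	      "extension starts right after mode");
static_assert(offsetof(struct open_how_ext_packed, next_extension) % 8 != 0,
	      "extension is not naturally aligned");
static_assert(sizeof(struct open_how_ext_packed) == 26, "no tail padding");
```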

> So what exactly is the problem with returning EINVAL now?

I would argue that -EINVAL was the wrong choice of return code from the
outset (and if we do keep the padding, I will send a patch to switch it
to -E2BIG -- see below). The purpose of -E2BIG for the newer
"extensible" syscalls is to differentiate between using an unsupported
extension field and an unsupported (or invalid) flag.

This will be useful for a few other extension ideas for these types of
syscalls (related to allowing userspace to more efficiently figure out
what flags are supported by the kernel without having to try each one
separately).

> >  * The padding wasted space needlessly, and would very likely not be
> >    used up entirely by future extensions for a long time (because it
> >    couldn't fit a u64).
> 
> Who knows, it does fit a u32. And if the struct is to be 8-byte aligned
> (see below), it doesn't actually waste space.

Yeah, though giving it some more thought I think this might be a better
layout to avoid this problem:

  struct open_how {
    __aligned_u64 flags;
    __aligned_u64 resolve;
    __u16 mode;
    __u16 __padding[3]; /* must be zero */
  };

That way, we won't end up with a u16 which we never use (and we won't
have multiple __padding fields in the future).
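
A quick sketch of that layout with the offsets spelled out, plus a userspace stand-in for the kernel-side padding check (fs/open.c uses memchr_inv() for this). The struct name, the aligned_u64 typedef and padding_is_zero() are all made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's __aligned_u64 (an assumption for this sketch). */
typedef uint64_t aligned_u64 __attribute__((aligned(8)));

struct open_how_tail_padding {
	aligned_u64 flags;
	aligned_u64 resolve;
	uint16_t mode;
	uint16_t padding[3]; /* must be zero */
};

static_assert(offsetof(struct open_how_tail_padding, mode) == 16, "mode after the u64s");
static_assert(offsetof(struct open_how_tail_padding, padding) == 18, "padding is last");
static_assert(sizeof(struct open_how_tail_padding) == 24, "size is unchanged");

/* Hypothetical helper: true when every padding element is zero. */
static bool padding_is_zero(const struct open_how_tail_padding *how)
{
	for (size_t i = 0; i < sizeof(how->padding) / sizeof(how->padding[0]); i++)
		if (how->padding[i] != 0)
			return false;
	return true;
}
```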

> > diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
> > index d886bdb585e4..0e070c7f568a 100644
> > --- a/include/uapi/linux/fcntl.h
> > +++ b/include/uapi/linux/fcntl.h
> > @@ -109,17 +109,16 @@
> >   * O_TMPFILE} are set.
> >   *
> >   * @flags: O_* flags.
> > - * @mode: O_CREAT/O_TMPFILE file mode.
> >   * @resolve: RESOLVE_* flags.
> > + * @mode: O_CREAT/O_TMPFILE file mode.
> >   */
> >  struct open_how {
> > -	__aligned_u64 flags;
> > +	__u64 flags;
> > +	__u64 resolve;
> >  	__u16 mode;
> > -	__u16 __padding[3]; /* must be zeroed */
> > -	__aligned_u64 resolve;
> > -};
> > +} __attribute__((packed));
> 
> IIRC, gcc assumes such a struct has alignment 1, which means that it
> will generate horrible code to access it. So if you do this (and I don't
> think it's a good idea), I think you'd also want to include a
> __attribute__((__aligned__(8))) - or perhaps that can be accomplished by
> just keeping flags as an explicitly aligned member. But that will of
> course bump its sizeof() back to 24, at which point it seems better to
> just make the padding explicit.

Yeah, you're quite right -- I was aware that GCC generated "less than
great" code for aligned(1) structures, but wasn't sure whether it would
be seen as being a serious enough issue to NACK the change.

There is an additional problem -- unfortunately, having the struct be
__attribute__((aligned(8))) doesn't solve the Rust representation
problem because Rust can't represent a struct as both being
#[repr(packed)] and #[repr(align(n))]. Obviously the kernel doesn't
really care about Rust language restrictions, but given one of the main
users of how->resolve will be libpathrs, I'd prefer to not make my own
life any harder if possible. ;)

So, given all of the above, I suggest that I instead send something like
this:

diff --git a/fs/open.c b/fs/open.c
index 50a46501bcc9..6c97f52453fe 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -994,7 +994,7 @@ static inline int build_open_flags(const struct open_how *how,
        if (how->resolve & ~VALID_RESOLVE_FLAGS)
                return -EINVAL;
        if (memchr_inv(how->__padding, 0, sizeof(how->__padding)))
-               return -EINVAL;
+               return -E2BIG;
 
        /* Deal with the mode. */
        if (WILL_CREATE(flags)) {
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index d886bdb585e4..c307640071c8 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -114,9 +114,9 @@
  */
 struct open_how {
        __aligned_u64 flags;
+       __aligned_u64 resolve;
        __u16 mode;
        __u16 __padding[3]; /* must be zeroed */
-       __aligned_u64 resolve;
 };
 
 #define OPEN_HOW_SIZE_VER0     24 /* sizeof first published struct */
diff --git a/tools/testing/selftests/openat2/openat2_test.c b/tools/testing/selftests/openat2/openat2_test.c
index 0b64fedc008b..88e3614cbb3a 100644
--- a/tools/testing/selftests/openat2/openat2_test.c
+++ b/tools/testing/selftests/openat2/openat2_test.c
@@ -61,15 +61,15 @@ void test_openat2_struct(void)
                { .name = "normal struct (non-zero padding[0])",
                  .arg.inner.flags = O_RDONLY,
                  .arg.inner.__padding = {0xa0, 0x00, 0x00},
-                 .size = sizeof(struct open_how_ext), .err = -EINVAL },
+                 .size = sizeof(struct open_how_ext), .err = -E2BIG },
                { .name = "normal struct (non-zero padding[1])",
                  .arg.inner.flags = O_RDONLY,
                  .arg.inner.__padding = {0x00, 0x1a, 0x00},
-                 .size = sizeof(struct open_how_ext), .err = -EINVAL },
+                 .size = sizeof(struct open_how_ext), .err = -E2BIG },
                { .name = "normal struct (non-zero padding[2])",
                  .arg.inner.flags = O_RDONLY,
                  .arg.inner.__padding = {0x00, 0x00, 0xef},
-                 .size = sizeof(struct open_how_ext), .err = -EINVAL },
+                 .size = sizeof(struct open_how_ext), .err = -E2BIG },
 
                /* TODO: Once expanded, check zero-padding. */
Florian Weimer Dec. 15, 2019, 7:48 p.m. UTC | #3
* Aleksa Sarai:

> diff --git a/tools/testing/selftests/openat2/helpers.h b/tools/testing/selftests/openat2/helpers.h
> index 43ca5ceab6e3..eb1535c8fa2e 100644
> --- a/tools/testing/selftests/openat2/helpers.h
> +++ b/tools/testing/selftests/openat2/helpers.h
> @@ -32,17 +32,16 @@
>   * O_TMPFILE} are set.
>   *
>   * @flags: O_* flags.
> - * @mode: O_CREAT/O_TMPFILE file mode.
>   * @resolve: RESOLVE_* flags.
> + * @mode: O_CREAT/O_TMPFILE file mode.
>   */
>  struct open_how {
> -	__aligned_u64 flags;
> +	__u64 flags;
> +	__u64 resolve;
>  	__u16 mode;
> -	__u16 __padding[3]; /* must be zeroed */
> -	__aligned_u64 resolve;
> -};
> +} __attribute__((packed));
>  
> -#define OPEN_HOW_SIZE_VER0	24 /* sizeof first published struct */
> +#define OPEN_HOW_SIZE_VER0	18 /* sizeof first published struct */
>  #define OPEN_HOW_SIZE_LATEST	OPEN_HOW_SIZE_VER0

A userspace ABI that depends on GCC extensions probably isn't a good
idea.  Even with GCC, it will not work well with some future
extensions because it pretty much rules out having arrays or other
members that are accessed through pointers.  Current GCC does not carry
over the packed-ness of the struct to addresses of its members.
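
To illustrate the pointer concern, here is a minimal sketch (GCC or Clang assumed). The struct, its next_extension field and both helpers are hypothetical names chosen for the example. Access through the struct type is fine because the compiler knows the struct is packed; a plain uint64_t pointer to the member would lose that information, so the second helper copies the bytes out instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct packed_how {
	uint64_t flags;
	uint64_t resolve;
	uint16_t mode;
	uint64_t next_extension; /* hypothetical field, lands at offset 18 */
} __attribute__((packed));

/* Access through the struct type: the compiler emits an unaligned-safe load. */
static uint64_t read_ext_direct(const struct packed_how *how)
{
	return how->next_extension;
}

/*
 * Passing &how->next_extension around as a plain uint64_t * would drop the
 * packed-ness, so copy the bytes into an aligned local instead.
 */
static uint64_t read_ext_copy(const struct packed_how *how)
{
	uint64_t value;

	memcpy(&value,
	       (const unsigned char *)how + offsetof(struct packed_how, next_extension),
	       sizeof(value));
	return value;
}
```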
Aleksa Sarai Dec. 15, 2019, 8:55 p.m. UTC | #4
On 2019-12-15, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Aleksa Sarai:
> 
> > diff --git a/tools/testing/selftests/openat2/helpers.h b/tools/testing/selftests/openat2/helpers.h
> > index 43ca5ceab6e3..eb1535c8fa2e 100644
> > --- a/tools/testing/selftests/openat2/helpers.h
> > +++ b/tools/testing/selftests/openat2/helpers.h
> > @@ -32,17 +32,16 @@
> >   * O_TMPFILE} are set.
> >   *
> >   * @flags: O_* flags.
> > - * @mode: O_CREAT/O_TMPFILE file mode.
> >   * @resolve: RESOLVE_* flags.
> > + * @mode: O_CREAT/O_TMPFILE file mode.
> >   */
> >  struct open_how {
> > -	__aligned_u64 flags;
> > +	__u64 flags;
> > +	__u64 resolve;
> >  	__u16 mode;
> > -	__u16 __padding[3]; /* must be zeroed */
> > -	__aligned_u64 resolve;
> > -};
> > +} __attribute__((packed));
> >  
> > -#define OPEN_HOW_SIZE_VER0	24 /* sizeof first published struct */
> > +#define OPEN_HOW_SIZE_VER0	18 /* sizeof first published struct */
> >  #define OPEN_HOW_SIZE_LATEST	OPEN_HOW_SIZE_VER0
> 
> A userspace ABI that depends on GCC extensions probably isn't a good
> idea.  Even with GCC, it will not work well with some future
> extensions because it pretty much rules out having arrays or other
> members that are accessed through pointers.  Current GCC does not carry
> over the packed-ness of the struct to addresses of its members.

Right, those are also good points.

Okay, I'm going to send a separate patch which changes the return value
for invalid __padding to -E2BIG, and moves the padding to the end of the
struct (along with open_how.mode). That should fix all of the warts I
raised, without running into the numerous problems with
__attribute__((packed)) of which I am now aware.
David Laight Dec. 16, 2019, 4:55 p.m. UTC | #5
From:  Aleksa Sarai
> Sent: 15 December 2019 12:35
> On 2019-12-14, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> > On 13/12/2019 23.23, Aleksa Sarai wrote:
> > > The design of the original open_how struct layout was such that it
> > > ensured that there would be no un-labelled (and thus potentially
> > > non-zero) padding to avoid issues with struct expansion, as well as
> > > providing a uniform representation on all architectures (to avoid
> > > complications with OPEN_HOW_SIZE versioning).
> > >
> > > However, there were a few other desirable features which were not
> > > fulfilled by the previous struct layout:
> > >
> > >  * Adding new features (other than new flags) should always result in
> > >    the struct getting larger. However, by including a padding field, it
> > >    was possible for new fields to be added without expanding the
> > >    structure. This would somewhat complicate version-number based
> > >    checking of feature support.
> > >
> > >  * A non-zero bit in __padding yielded -EINVAL when it should arguably
> > >    have been -E2BIG (because the padding bits are effectively
> > >    yet-to-be-used fields). However, the semantics are not entirely clear
> > >    because userspace may expect -E2BIG to only signify that the
> > >    structure is too big. It's much simpler to just provide the guarantee
> > >    that new fields will always result in a struct size increase, and
> > >    -E2BIG indicates you're using a field that's too recent for an older
> > >    kernel.
> >
> > And when the first extension adds another u64 field, that padding has to
> > be added back in and checked for being 0, at which point the padding is
> > again yet-to-be-used fields.
> 
> Maybe I'm missing something, but what is the issue with
> 
>   struct open_how {
>     u64 flags;
>     u64 resolve;
>     u16 mode;
>     u64 next_extension;
>   } __attribute__((packed));

Compile anything that accesses it for (say) sparc and look at the object code.
You really, really, REALLY, don't want to EVER use 'packed'.

Just use u64 for all the fields.
Use 'flags' bits to indicate whether the additional fields should be looked at.
Error if a 'flags' bit requires a value that isn't passed in the structure.

Then you can add an extra field and old source code recompiled with the
new headers will still work - because the 'junk' value isn't looked at.
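
A rough sketch of the flag-gated scheme described above. Everything here (the struct, the flag bit, the helper) is hypothetical and only meant to show the control flow; usize stands for the struct size userspace passed in.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_FLAG_HAS_EXTRA (1ULL << 0) /* hypothetical flag bit */

struct example_args {
	uint64_t flags;
	uint64_t mode;
	uint64_t extra; /* only meaningful when EXAMPLE_FLAG_HAS_EXTRA is set */
};

/* Hypothetical helper: fetch the optional field only if its flag asks for it. */
static int example_get_extra(const struct example_args *args, size_t usize,
			     uint64_t *out)
{
	*out = 0;
	if (!(args->flags & EXAMPLE_FLAG_HAS_EXTRA))
		return 0; /* field is not looked at, whatever it contains */
	if (usize < offsetof(struct example_args, extra) + sizeof(args->extra))
		return -EINVAL; /* flag set but the field was not passed */
	*out = args->extra;
	return 0;
}
```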

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Aleksa Sarai Dec. 17, 2019, 6:46 a.m. UTC | #6
On 2019-12-16, David Laight <David.Laight@ACULAB.COM> wrote:
> From:  Aleksa Sarai
> > Sent: 15 December 2019 12:35
> > On 2019-12-14, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> > > On 13/12/2019 23.23, Aleksa Sarai wrote:
> > > > The design of the original open_how struct layout was such that it
> > > > ensured that there would be no un-labelled (and thus potentially
> > > > non-zero) padding to avoid issues with struct expansion, as well as
> > > > providing a uniform representation on all architectures (to avoid
> > > > complications with OPEN_HOW_SIZE versioning).
> > > >
> > > > However, there were a few other desirable features which were not
> > > > fulfilled by the previous struct layout:
> > > >
> > > >  * Adding new features (other than new flags) should always result in
> > > >    the struct getting larger. However, by including a padding field, it
> > > >    was possible for new fields to be added without expanding the
> > > >    structure. This would somewhat complicate version-number based
> > > >    checking of feature support.
> > > >
> > > >  * A non-zero bit in __padding yielded -EINVAL when it should arguably
> > > >    have been -E2BIG (because the padding bits are effectively
> > > >    yet-to-be-used fields). However, the semantics are not entirely clear
> > > >    because userspace may expect -E2BIG to only signify that the
> > > >    structure is too big. It's much simpler to just provide the guarantee
> > > >    that new fields will always result in a struct size increase, and
> > > >    -E2BIG indicates you're using a field that's too recent for an older
> > > >    kernel.
> > >
> > > And when the first extension adds another u64 field, that padding has to
> > > be added back in and checked for being 0, at which point the padding is
> > > again yet-to-be-used fields.
> > 
> > Maybe I'm missing something, but what is the issue with
> > 
> >   struct open_how {
> >     u64 flags;
> >     u64 resolve;
> >     u16 mode;
> >     u64 next_extension;
> >   } __attribute__((packed));
> 
> Compile anything that accesses it for (say) sparc and look at the object code.
> You really, really, REALLY, don't want to EVER use 'packed'.

Right, so it's related to the "garbage code" problem. As mentioned
above, I wasn't aware it was as bad as folks in this thread have
mentioned.

> Just use u64 for all the fields.

That is an option (and is the one that clone3 went with), but it's a bit
awkward because umode_t is a u16 -- and it would be a waste of 6 bytes
to store it as a u64. Arguably it could be extended but I personally
find that to be very unlikely (and lots of other syscalls would need to be
updated).

I'm just going to move the padding to the end and change the error for
non-zero padding to -E2BIG.

> Use 'flags' bits to indicate whether the additional fields should be looked at.
> Error if a 'flags' bit requires a value that isn't passed in the structure.
> 
> Then you can add an extra field and old source code recompiled with the
> new headers will still work - because the 'junk' value isn't looked at.

This problem is already handled entirely by copy_struct_from_user().

It is true that for some new fields it will be necessary to add a new
flag (such as passing fds -- where 0 is a valid value) but for most new
fields (especially pointer or flag fields) it will not be necessary
because the 0 value is equivalent to the old behaviour. It also allows
us to entirely avoid accepting junk from userspace.
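
For readers unfamiliar with the helper, its size handling can be modelled in userspace roughly as follows. This is a sketch, not the kernel code: model_copy_struct() is a made-up name, plain memory stands in for the user pointer, and there is no -EFAULT path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace model of the size handling in copy_struct_from_user().
 * ksize is the struct size the kernel knows, usize the size userspace passed.
 */
static int model_copy_struct(void *dst, size_t ksize, const void *src,
			     size_t usize)
{
	const unsigned char *in = src;
	size_t size = ksize < usize ? ksize : usize;

	/* Newer userspace: trailing bytes the kernel does not know must be zero. */
	for (size_t i = ksize; i < usize; i++)
		if (in[i] != 0)
			return -E2BIG;

	/* Older userspace: fields it did not pass read as zero. */
	memset(dst, 0, ksize);
	memcpy(dst, in, size);
	return 0;
}
```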
David Laight Dec. 17, 2019, 10:14 a.m. UTC | #7
From Aleksa Sarai
> Sent: 17 December 2019 06:47
...
> > Just use u64 for all the fields.
> 
> That is an option (and is the one that clone3 went with), but it's a bit
> awkward because umode_t is a u16 -- and it would be a waste of 6 bytes
> to store it as a u64. Arguably it could be extended but I personally
> find that to be very unlikely (and lots of other syscalls would need to be
> updated).

6 bytes on interface structure will make almost no difference.
There is no reason to save more than 16 bits anywhere else.
You could return an error for values with high bits set.

> I'm just going to move the padding to the end and change the error for
> non-zero padding to -E2BIG.

The padding had to be after the u16 field.

> > Use 'flags' bits to indicate whether the additional fields should be looked at.
> > Error if a 'flags' bit requires a value that isn't passed in the structure.
> >
> > Then you can add an extra field and old source code recompiled with the
> > new headers will still work - because the 'junk' value isn't looked at.
> 
> This problem is already handled entirely by copy_struct_from_user().
> 
> It is true that for some new fields it will be necessary to add a new
> flag (such as passing fds -- where 0 is a valid value) but for most new
> fields (especially pointer or flag fields) it will not be necessary
> because the 0 value is equivalent to the old behaviour. It also allows
> us to entirely avoid accepting junk from userspace.

Only if userspace is guaranteed to memset the entire structure
before making the call - rather than just fill in all the fields it knows about.
If it doesn't use memset() then recompiling old code with new headers
will pass garbage to the kernel.
copy_struct_from_user() cannot solve that problem.
You'll never be able to guarantee that all code actually clears the
entire structure - so at some point extending it will break recompilations
of old code - annoying.

	David

Aleksa Sarai Dec. 18, 2019, 5:31 p.m. UTC | #8
On 2019-12-17, David Laight <David.Laight@ACULAB.COM> wrote:
> From Aleksa Sarai
> > Sent: 17 December 2019 06:47
> ...
> > > Just use u64 for all the fields.
> > 
> > That is an option (and is the one that clone3 went with), but it's a bit
> > awkward because umode_t is a u16 -- and it would be a waste of 6 bytes
> > to store it as a u64. Arguably it could be extended but I personally
> > find that to be very unlikely (and lots of other syscalls would need to be
> > updated).
> 
> 6 bytes on interface structure will make almost no difference.
> There is no reason to save more than 16 bits anywhere else.

You have a point, and clone3's way of dealing with it does make life
easier. It also removes the need to care about explicit padding and
padding holes entirely.

> You could return an error for values with high bits set.

Of course we'll give -EINVAL with invalid values, that's one of the
reasons openat2(2) exists after all. :P

> > I'm just going to move the padding to the end and change the error for
> > non-zero padding to -E2BIG.
> 
> The padding had to be after the u16 field.

Right, I was suggesting to move the u16 field later in the struct too.
But after thinking about it some more, it doesn't help with
extensibility at all (a subsequent non-u16 extension will leave holes).
So I'm probably just going to go with either the -E2BIG patch or switch
to u64s.

> > > Use 'flags' bits to indicate whether the additional fields should be looked at.
> > > Error if a 'flags' bit requires a value that isn't passed in the structure.
> > >
> > > Then you can add an extra field and old source code recompiled with the
> > > new headers will still work - because the 'junk' value isn't looked at.
> > 
> > This problem is already handled entirely by copy_struct_from_user().
> > 
> > It is true that for some new fields it will be necessary to add a new
> > flag (such as passing fds -- where 0 is a valid value) but for most new
> > fields (especially pointer or flag fields) it will not be necessary
> > because the 0 value is equivalent to the old behaviour. It also allows
> > us to entirely avoid accepting junk from userspace.
> 
> Only if userspace is guaranteed to memset the entire structure before
> making the call - rather than just fill in all the fields it knows
> about. If it doesn't use memset() then recompiling old code with new
> headers will pass garbage to the kernel. copy_struct_from_user()
> cannot solve that problem.

You don't need to /explicitly/ memset(), since

	struct open_how how = { .flags = O_RDWR, .resolve = RESOLVE_IN_ROOT };

or even

	struct open_how how = {}; /* or { 0 } if you prefer. */

will clear all of the unused fields.
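
A small self-contained check of that behaviour, as a sketch. It uses a local copy of the current layout under an invented name so that it builds without the new headers, and EXAMPLE_RESOLVE_BIT is a placeholder for RESOLVE_IN_ROOT rather than the real definition.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>

/* Stand-in for the kernel's __aligned_u64 (an assumption for this sketch). */
typedef uint64_t aligned_u64 __attribute__((aligned(8)));

/* Local copy of the current layout, renamed for the example. */
struct open_how_example {
	aligned_u64 flags;
	uint16_t mode;
	uint16_t padding[3]; /* must be zeroed */
	aligned_u64 resolve;
};

/* Placeholder bit; real code would use RESOLVE_IN_ROOT from the uapi header. */
#define EXAMPLE_RESOLVE_BIT 0x10

static struct open_how_example make_how(void)
{
	/* Members not named in the initializer are zero-initialized. */
	struct open_how_example how = {
		.flags = O_RDWR,
		.resolve = EXAMPLE_RESOLVE_BIT,
	};

	return how;
}
```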

But, I can add a NOTE to the man-page to clarify that this is how users
should fill their structs (or rather, that they should zero-fill them
somehow to avoid this problem).

While this might be a little annoying, I would argue that given the
openat2(2) man page explains how extensions work (in great detail) and
mentions several times that the structure may have new fields added to
it in the future -- programs which don't zero-fill the struct should be
simply seen as buggy. Note that those buggy programs *will still work*
on new kernels -- until you recompile them with new headers (because
they made an incorrect assumption about the structures they were using).

As an aside, the other downside from the uapi side is that we would
probably have to spend flag bits *that are shared with openat(2)* for
such extensions, so I'd like to avoid that as much as necessary.

> You'll never be able to guarantee that all code actually clears the
> entire structure - so at some point extending it will break recompilations
> of old code - annoying.

Only if they're explicitly doing something like

	struct open_how how;
	how.flags = O_RDWR;
	how.resolve = RESOLVE_IN_ROOT;
	memset(how.__padding, 0, sizeof(how.__padding));

As above, given the description of extensions in the man-page, I would
consider that style of struct initialisation to be eyebrow-raising at
best.

I'm sorry, but I'm simply against the idea of silently ignoring garbage
that userspace passes to the kernel -- even if it's tied to a flag. That
has proven to be an awful idea and in fact openat2(2) was written
precisely to fix this problem. To be honest, this reminds me of
(hypothetical) code like:

   int flags;
   flags |= O_PATH | O_CLOEXEC;
   open("foo", flags); /* yay, mystery fds! */

IMHO that shouldn't have ever worked, and the only way to stop userspace
from passing garbage is to always reject it.

Patch

diff --git a/fs/open.c b/fs/open.c
index 50a46501bcc9..8cdb2b675867 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -993,8 +993,6 @@  static inline int build_open_flags(const struct open_how *how,
 		return -EINVAL;
 	if (how->resolve & ~VALID_RESOLVE_FLAGS)
 		return -EINVAL;
-	if (memchr_inv(how->__padding, 0, sizeof(how->__padding)))
-		return -EINVAL;
 
 	/* Deal with the mode. */
 	if (WILL_CREATE(flags)) {
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index d886bdb585e4..0e070c7f568a 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -109,17 +109,16 @@ 
  * O_TMPFILE} are set.
  *
  * @flags: O_* flags.
- * @mode: O_CREAT/O_TMPFILE file mode.
  * @resolve: RESOLVE_* flags.
+ * @mode: O_CREAT/O_TMPFILE file mode.
  */
 struct open_how {
-	__aligned_u64 flags;
+	__u64 flags;
+	__u64 resolve;
 	__u16 mode;
-	__u16 __padding[3]; /* must be zeroed */
-	__aligned_u64 resolve;
-};
+} __attribute__((packed));
 
-#define OPEN_HOW_SIZE_VER0	24 /* sizeof first published struct */
+#define OPEN_HOW_SIZE_VER0	18 /* sizeof first published struct */
 #define OPEN_HOW_SIZE_LATEST	OPEN_HOW_SIZE_VER0
 
 /* how->resolve flags for openat2(2). */
diff --git a/tools/testing/selftests/openat2/helpers.h b/tools/testing/selftests/openat2/helpers.h
index 43ca5ceab6e3..eb1535c8fa2e 100644
--- a/tools/testing/selftests/openat2/helpers.h
+++ b/tools/testing/selftests/openat2/helpers.h
@@ -32,17 +32,16 @@ 
  * O_TMPFILE} are set.
  *
  * @flags: O_* flags.
- * @mode: O_CREAT/O_TMPFILE file mode.
  * @resolve: RESOLVE_* flags.
+ * @mode: O_CREAT/O_TMPFILE file mode.
  */
 struct open_how {
-	__aligned_u64 flags;
+	__u64 flags;
+	__u64 resolve;
 	__u16 mode;
-	__u16 __padding[3]; /* must be zeroed */
-	__aligned_u64 resolve;
-};
+} __attribute__((packed));
 
-#define OPEN_HOW_SIZE_VER0	24 /* sizeof first published struct */
+#define OPEN_HOW_SIZE_VER0	18 /* sizeof first published struct */
 #define OPEN_HOW_SIZE_LATEST	OPEN_HOW_SIZE_VER0
 
 bool needs_openat2(const struct open_how *how);
diff --git a/tools/testing/selftests/openat2/openat2_test.c b/tools/testing/selftests/openat2/openat2_test.c
index 0b64fedc008b..cbf95d160b1b 100644
--- a/tools/testing/selftests/openat2/openat2_test.c
+++ b/tools/testing/selftests/openat2/openat2_test.c
@@ -40,7 +40,7 @@  struct struct_test {
 	int err;
 };
 
-#define NUM_OPENAT2_STRUCT_TESTS 10
+#define NUM_OPENAT2_STRUCT_TESTS 7
 #define NUM_OPENAT2_STRUCT_VARIATIONS 13
 
 void test_openat2_struct(void)
@@ -57,22 +57,6 @@  void test_openat2_struct(void)
 		  .arg.inner.flags = O_RDONLY,
 		  .size = sizeof(struct open_how_ext) },
 
-		/* Normal struct with broken padding. */
-		{ .name = "normal struct (non-zero padding[0])",
-		  .arg.inner.flags = O_RDONLY,
-		  .arg.inner.__padding = {0xa0, 0x00, 0x00},
-		  .size = sizeof(struct open_how_ext), .err = -EINVAL },
-		{ .name = "normal struct (non-zero padding[1])",
-		  .arg.inner.flags = O_RDONLY,
-		  .arg.inner.__padding = {0x00, 0x1a, 0x00},
-		  .size = sizeof(struct open_how_ext), .err = -EINVAL },
-		{ .name = "normal struct (non-zero padding[2])",
-		  .arg.inner.flags = O_RDONLY,
-		  .arg.inner.__padding = {0x00, 0x00, 0xef},
-		  .size = sizeof(struct open_how_ext), .err = -EINVAL },
-
-		/* TODO: Once expanded, check zero-padding. */
-
 		/* Smaller than version-0 struct. */
 		{ .name = "zero-sized 'struct'",
 		  .arg.inner.flags = O_RDONLY, .size = 0, .err = -EINVAL },