diff mbox series

[v4] memfd: `MFD_NOEXEC_SEAL` should not imply `MFD_ALLOW_SEALING`

Message ID 20240630184912.37335-1-pobrn@protonmail.com (mailing list archive)
State New
Headers show
Series [v4] memfd: `MFD_NOEXEC_SEAL` should not imply `MFD_ALLOW_SEALING` | expand

Commit Message

Barnabás Pőcze June 30, 2024, 6:49 p.m. UTC
`MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC`
to prevent further modifications to the executable bits as per the comment
in the uapi header file:

  not executable and sealed to prevent changing to executable

However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets
`F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`.

Nothing implies that it should be so, and indeed up until the second version
of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`,
`F_SEAL_SEAL` was not removed, however, it was changed in the third revision
of the patchset[1] without a clear explanation.

This behaviour is surprising for application developers, there is no
documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional
effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2`
it has the effect of making all memfds initially sealable.

So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested,
thereby returning to the pre-Linux 6.3 behaviour of only allowing
sealing when `MFD_ALLOW_SEALING` is specified.

Now, this is technically a uapi break. However, the damage is expected
to be minimal. To trigger user visible change, a program has to do the
following steps:

 - create memfd:
   - with `MFD_NOEXEC_SEAL`,
   - without `MFD_ALLOW_SEALING`;
 - try to add seals / check the seals.

But that seems unlikely to happen intentionally since this change
essentially reverts the kernel's behaviour to that of Linux <6.3,
so if a program worked correctly on those older kernels, it will
likely work correctly after this change.

I have used Debian Code Search and GitHub to try to find potential
breakages, and I could only find a single one. dbus-broker's
memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING`
behaviour, and tries to work around it[2]. This workaround will
break. Luckily, this only affects the test suite, it does not affect
the normal operations of dbus-broker. There is a PR with a fix[3].

I also carried out a smoke test by building a kernel with this change
and booting an Arch Linux system into GNOME and Plasma sessions.

There was also a previous attempt to address this peculiarity by
introducing a new flag[4].

[0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/
[1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/
[2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
[3]: https://github.com/bus1/dbus-broker/pull/366
[4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/

Cc: stable@vger.kernel.org
Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
---

* v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/
* v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/
* v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/

This fourth version returns to removing the inconsistency as opposed to documenting
its existence, with the same code change as v1 but with a somewhat extended commit
message. This is sent because I believe it is worth at least a try; it can be easily
reverted if bigger application breakages are discovered than initially imagined.

---
 mm/memfd.c                                 | 9 ++++-----
 tools/testing/selftests/memfd/memfd_test.c | 2 +-
 2 files changed, 5 insertions(+), 6 deletions(-)

Comments

Jeff Xu July 1, 2024, 9:31 p.m. UTC | #1
Hi

On Sun, Jun 30, 2024 at 11:49 AM Barnabás Pőcze <pobrn@protonmail.com> wrote:
>
> `MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC`
> to prevent further modifications to the executable bits as per the comment
> in the uapi header file:
>
>   not executable and sealed to prevent changing to executable
>
> However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
> that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets
> `F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`.
>
> Nothing implies that it should be so, and indeed up until the second version
> of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`,
> `F_SEAL_SEAL` was not removed, however, it was changed in the third revision
> of the patchset[1] without a clear explanation.
>
> This behaviour is surprising for application developers, there is no
> documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional
> effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2`
> it has the effect of making all memfds initially sealable.
>
The documentation is in linux main (653c5c75115c), I hope this gives
clarity to the usage of MFD_NOEXEC_SEAL flag to application
developers, furthermore I'm working on man page for memfd_create.

> So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested,
> thereby returning to the pre-Linux 6.3 behaviour of only allowing
> sealing when `MFD_ALLOW_SEALING` is specified.
>
> Now, this is technically a uapi break. However, the damage is expected
> to be minimal. To trigger user visible change, a program has to do the
> following steps:
>
>  - create memfd:
>    - with `MFD_NOEXEC_SEAL`,
>    - without `MFD_ALLOW_SEALING`;
>  - try to add seals / check the seals.
>
> But that seems unlikely to happen intentionally since this change
> essentially reverts the kernel's behaviour to that of Linux <6.3,
> so if a program worked correctly on those older kernels, it will
> likely work correctly after this change.
>
During V3 patch discussion,  I sent my reasoning, but here are summaries:

- As one might have noticed, unlike other flags in memfd_create,
MFD_NOEXEC_SEAL is actually a combination of multiple flags. The idea
is to make it easier to use memfd in the most common way, which is
NOEXEC + F_SEAL_EXEC + MFD_ALLOW_SEALING.

- The new sysctl vm.noexec = 1 helps existing applications move to a
more secure way of using memfd. IMO, MFD_ALLOW_SEALING is included by
default because most applications would rather have it than not. In
any case, an app can set F_SEAL_SEAL to disable the sealing.

- MFD_NOEXEC_SEAL has been added for more than one year, multiple
applications and distributions have backported and utilized it.
Altering ABI now presents a degree of risk and may lead to
disruptions.

Best regards,
-Jeff

> I have used Debian Code Search and GitHub to try to find potential
> breakages, and I could only find a single one. dbus-broker's
> memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING`
> behaviour, and tries to work around it[2]. This workaround will
> break. Luckily, this only affects the test suite, it does not affect
> the normal operations of dbus-broker. There is a PR with a fix[3].
>
> I also carried out a smoke test by building a kernel with this change
> and booting an Arch Linux system into GNOME and Plasma sessions.
>
> There was also a previous attempt to address this peculiarity by
> introducing a new flag[4].
>
> [0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/
> [1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/
> [2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
> [3]: https://github.com/bus1/dbus-broker/pull/366
> [4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
> ---
>
> * v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/
> * v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/
> * v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/
>
> This fourth version returns to removing the inconsistency as opposed to documenting
> its existence, with the same code change as v1 but with a somewhat extended commit
> message. This is sent because I believe it is worth at least a try; it can be easily
> reverted if bigger application breakages are discovered than initially imagined.
>
> ---
>  mm/memfd.c                                 | 9 ++++-----
>  tools/testing/selftests/memfd/memfd_test.c | 2 +-
>  2 files changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 7d8d3ab3fa37..8b7f6afee21d 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create,
>
>                 inode->i_mode &= ~0111;
>                 file_seals = memfd_file_seals_ptr(file);
> -               if (file_seals) {
> -                       *file_seals &= ~F_SEAL_SEAL;
> +               if (file_seals)
>                         *file_seals |= F_SEAL_EXEC;
> -               }
> -       } else if (flags & MFD_ALLOW_SEALING) {
> -               /* MFD_EXEC and MFD_ALLOW_SEALING are set */
> +       }
> +
> +       if (flags & MFD_ALLOW_SEALING) {
>                 file_seals = memfd_file_seals_ptr(file);
>                 if (file_seals)
>                         *file_seals &= ~F_SEAL_SEAL;
> diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
> index 95af2d78fd31..7b78329f65b6 100644
> --- a/tools/testing/selftests/memfd/memfd_test.c
> +++ b/tools/testing/selftests/memfd/memfd_test.c
> @@ -1151,7 +1151,7 @@ static void test_noexec_seal(void)
>                             mfd_def_size,
>                             MFD_CLOEXEC | MFD_NOEXEC_SEAL);
>         mfd_assert_mode(fd, 0666);
> -       mfd_assert_has_seals(fd, F_SEAL_EXEC);
> +       mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
>         mfd_fail_chmod(fd, 0777);
>         close(fd);
>  }
> --
> 2.45.2
>
>
Aleksa Sarai July 2, 2024, 6:24 a.m. UTC | #2
On 2024-06-30, Barnabás Pőcze <pobrn@protonmail.com> wrote:
> `MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC`
> to prevent further modifications to the executable bits as per the comment
> in the uapi header file:
> 
>   not executable and sealed to prevent changing to executable
> 
> However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
> that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets
> `F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`.
> 
> Nothing implies that it should be so, and indeed up until the second version
> of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`,
> `F_SEAL_SEAL` was not removed, however, it was changed in the third revision
> of the patchset[1] without a clear explanation.
> 
> This behaviour is surprising for application developers, there is no
> documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional
> effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2`
> it has the effect of making all memfds initially sealable.
> 
> So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested,
> thereby returning to the pre-Linux 6.3 behaviour of only allowing
> sealing when `MFD_ALLOW_SEALING` is specified.

This behaviour makes sense, I'm a little sad I didn't catch it when I
was fixing vm.memfd_noexec. There is a possibility for breakage, but we
should give it a shot, given how new the API is (and the API itself was
also broken until Linux 6.6 anyway)...

Feel free to take my

Reviewed-by: Aleksa Sarai <cyphar@cyphar.com>

Thanks.

> Now, this is technically a uapi break. However, the damage is expected
> to be minimal. To trigger user visible change, a program has to do the
> following steps:
> 
>  - create memfd:
>    - with `MFD_NOEXEC_SEAL`,
>    - without `MFD_ALLOW_SEALING`;
>  - try to add seals / check the seals.
> 
> But that seems unlikely to happen intentionally since this change
> essentially reverts the kernel's behaviour to that of Linux <6.3,
> so if a program worked correctly on those older kernels, it will
> likely work correctly after this change.
> 
> I have used Debian Code Search and GitHub to try to find potential
> breakages, and I could only find a single one. dbus-broker's
> memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING`
> behaviour, and tries to work around it[2]. This workaround will
> break. Luckily, this only affects the test suite, it does not affect
> the normal operations of dbus-broker. There is a PR with a fix[3].
> 
> I also carried out a smoke test by building a kernel with this change
> and booting an Arch Linux system into GNOME and Plasma sessions.
> 
> There was also a previous attempt to address this peculiarity by
> introducing a new flag[4].
> 
> [0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/
> [1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/
> [2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
> [3]: https://github.com/bus1/dbus-broker/pull/366
> [4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
> ---
> 
> * v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/
> * v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/
> * v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/
> 
> This fourth version returns to removing the inconsistency as opposed to documenting
> its existence, with the same code change as v1 but with a somewhat extended commit
> message. This is sent because I believe it is worth at least a try; it can be easily
> reverted if bigger application breakages are discovered than initially imagined.
> 
> ---
>  mm/memfd.c                                 | 9 ++++-----
>  tools/testing/selftests/memfd/memfd_test.c | 2 +-
>  2 files changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 7d8d3ab3fa37..8b7f6afee21d 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create,
>  
>  		inode->i_mode &= ~0111;
>  		file_seals = memfd_file_seals_ptr(file);
> -		if (file_seals) {
> -			*file_seals &= ~F_SEAL_SEAL;
> +		if (file_seals)
>  			*file_seals |= F_SEAL_EXEC;
> -		}
> -	} else if (flags & MFD_ALLOW_SEALING) {
> -		/* MFD_EXEC and MFD_ALLOW_SEALING are set */
> +	}
> +
> +	if (flags & MFD_ALLOW_SEALING) {
>  		file_seals = memfd_file_seals_ptr(file);
>  		if (file_seals)
>  			*file_seals &= ~F_SEAL_SEAL;
> diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
> index 95af2d78fd31..7b78329f65b6 100644
> --- a/tools/testing/selftests/memfd/memfd_test.c
> +++ b/tools/testing/selftests/memfd/memfd_test.c
> @@ -1151,7 +1151,7 @@ static void test_noexec_seal(void)
>  			    mfd_def_size,
>  			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
>  	mfd_assert_mode(fd, 0666);
> -	mfd_assert_has_seals(fd, F_SEAL_EXEC);
> +	mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
>  	mfd_fail_chmod(fd, 0777);
>  	close(fd);
>  }
> -- 
> 2.45.2
> 
>
diff mbox series

Patch

diff --git a/mm/memfd.c b/mm/memfd.c
index 7d8d3ab3fa37..8b7f6afee21d 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -356,12 +356,11 @@  SYSCALL_DEFINE2(memfd_create,
 
 		inode->i_mode &= ~0111;
 		file_seals = memfd_file_seals_ptr(file);
-		if (file_seals) {
-			*file_seals &= ~F_SEAL_SEAL;
+		if (file_seals)
 			*file_seals |= F_SEAL_EXEC;
-		}
-	} else if (flags & MFD_ALLOW_SEALING) {
-		/* MFD_EXEC and MFD_ALLOW_SEALING are set */
+	}
+
+	if (flags & MFD_ALLOW_SEALING) {
 		file_seals = memfd_file_seals_ptr(file);
 		if (file_seals)
 			*file_seals &= ~F_SEAL_SEAL;
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 95af2d78fd31..7b78329f65b6 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -1151,7 +1151,7 @@  static void test_noexec_seal(void)
 			    mfd_def_size,
 			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
 	mfd_assert_mode(fd, 0666);
-	mfd_assert_has_seals(fd, F_SEAL_EXEC);
+	mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
 	mfd_fail_chmod(fd, 0777);
 	close(fd);
 }