[v14,0/3] Add trusted_for(2) (was O_MAYEXEC)

Message ID 20211008104840.1733385-1-mic@digikod.net (mailing list archive)

Message

Mickaël Salaün Oct. 8, 2021, 10:48 a.m. UTC
Hi,

This patch series is mainly a rebase on v5.15-rc4 with some cosmetic
changes suggested by Kees Cook.  Andrew, could you please consider
merging this into your tree?

Overview
========

The final goal of this patch series is to enable the kernel to be a
global policy manager by entrusting processes with access control at
their level.  To reach this goal, two complementary parts are required:
* user space needs to be able to know if it can trust some file
  descriptor content for a specific usage;
* and the kernel needs to make available some part of the policy
  configured by the system administrator.

Primary goal of trusted_for(2)
==============================

This new syscall enables user space to ask the kernel: is this file
descriptor's content trusted to be used for this purpose?  The set of
usages currently contains only "execution", but others may follow (e.g.
"configuration", "sensitive_data").  If the kernel identifies the file
descriptor as trustworthy for this usage, user space should then take
this information into account.  The "execution" usage means that the
content of the file descriptor is trusted according to the system policy
to be executed by user space, that is, user space interprets the content
or (tries to) map it as executable memory.

A simple system-wide security policy can be enforced by the system
administrator through a sysctl configuration consistent with the mount
points or the file access rights.  The documentation patch explains the
prerequisites.
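
To make the idea of a policy "consistent with the mount points or the
file access rights" more concrete, here is a small user-space model of
such a decision.  It is not the kernel implementation: the bit names,
their values and the struct are assumptions made for illustration, and
the documentation patch remains the reference for the real sysctl.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed policy bits, for illustration only. */
#define POLICY_EXEC_MOUNT 0x1u	/* deny files living on a noexec mount */
#define POLICY_EXEC_FILE  0x2u	/* deny files without execute permission */

/* Hypothetical summary of what the kernel knows about the opened file. */
struct file_facts {
	bool mount_noexec;	/* the mount point has the noexec option */
	bool file_executable;	/* the caller has execute permission */
};

/* Returns 0 if the usage is allowed by the policy, -EACCES otherwise. */
static int check_exec_policy(unsigned int policy, const struct file_facts *f)
{
	assert(f != NULL);
	if ((policy & POLICY_EXEC_MOUNT) && f->mount_noexec)
		return -EACCES;
	if ((policy & POLICY_EXEC_FILE) && !f->file_executable)
		return -EACCES;
	return 0;
}
```

With a policy of 0 nothing is denied, which models a system where the
administrator has not enabled any enforcement.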

It is important to note that this can only extend the access control
managed by the kernel.  Hence it enables current access control
mechanisms to be extended and to become a superset of what they can
currently control.  Indeed, the security policy could also be delegated
to an LSM, either a MAC system or an integrity system.  For instance,
this is required to close a major IMA measurement/appraisal interpreter
integrity gap by bringing the ability to check the use of scripts [1].
Other uses are expected, such as for magic-links [2], SGX integration
[3], bpffs [4].

Complementary W^X protections can be brought by SELinux, IPE [5] and
trampfd [6].

Prerequisite of its use
=======================

User space needs to adapt to take advantage of this new feature.  For
example, PEP 578 [7] (Runtime Audit Hooks) enables Python 3.8 to be
extended with policy enforcement points related to code interpretation,
which can be used to align with the PowerShell audit features.
Additional Python security improvements (e.g. a limited interpreter
without -c, stdin piping of code) are on their way [8].

Examples
========

The initial idea comes from CLIP OS 4 and the original implementation
has been used for more than 13 years:
https://github.com/clipos-archive/clipos4_doc
Chrome OS has a similar approach:
https://chromium.googlesource.com/chromiumos/docs/+/master/security/noexec_shell_scripts.md

Userland patches can be found here:
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC
There is actually more than the O_MAYEXEC changes matched by this search,
e.g. patches to prevent Python interactive execution.  There are patches for
Bash, Wine, Java (Icedtea), Busybox's ash, Perl and Python. There are
also some related patches which do not directly rely on O_MAYEXEC but
which restrict the use of browser plugins and extensions, which may be
seen as scripts too:
https://github.com/clipos-archive/clipos4_portage-overlay/tree/master/www-client

An introduction to O_MAYEXEC was given at the Linux Security Summit
Europe 2018 - Linux Kernel Security Contributions by ANSSI:
https://www.youtube.com/watch?v=chNjCRtPKQY&t=17m15s
The "write xor execute" principle was explained at Kernel Recipes 2018 -
CLIP OS: a defense-in-depth OS:
https://www.youtube.com/watch?v=PjRE0uBtkHU&t=11m14s
See also a first LWN article about O_MAYEXEC and a new one about
trusted_for(2) and its background:
* https://lwn.net/Articles/820000/
* https://lwn.net/Articles/832959/

This patch series can be applied on top of v5.15-rc4.  This can be
tested with CONFIG_SYSCTL.  I would really appreciate constructive
comments on this patch series.

Previous series:
https://lore.kernel.org/r/20211007182321.872075-1-mic@digikod.net/

[1] https://lore.kernel.org/lkml/1544647356.4028.105.camel@linux.ibm.com/
[2] https://lore.kernel.org/lkml/20190904201933.10736-6-cyphar@cyphar.com/
[3] https://lore.kernel.org/lkml/CALCETrVovr8XNZSroey7pHF46O=kj_c5D9K8h=z2T_cNrpvMig@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CALCETrVeZ0eufFXwfhtaG_j+AdvbzEWE0M3wjXMWVEO7pj+xkw@mail.gmail.com/
[5] https://lore.kernel.org/lkml/20200406221439.1469862-12-deven.desai@linux.microsoft.com/
[6] https://lore.kernel.org/lkml/20200922215326.4603-1-madvenka@linux.microsoft.com/
[7] https://www.python.org/dev/peps/pep-0578/
[8] https://lore.kernel.org/lkml/0c70debd-e79e-d514-06c6-4cd1e021fa8b@python.org/

Regards,

Mickaël Salaün (3):
  fs: Add trusted_for(2) syscall implementation and related sysctl
  arch: Wire up trusted_for(2)
  selftest/interpreter: Add tests for trusted_for(2) policies

 Documentation/admin-guide/sysctl/fs.rst       |  50 +++
 arch/alpha/kernel/syscalls/syscall.tbl        |   1 +
 arch/arm/tools/syscall.tbl                    |   1 +
 arch/arm64/include/asm/unistd.h               |   2 +-
 arch/arm64/include/asm/unistd32.h             |   2 +
 arch/ia64/kernel/syscalls/syscall.tbl         |   1 +
 arch/m68k/kernel/syscalls/syscall.tbl         |   1 +
 arch/microblaze/kernel/syscalls/syscall.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl     |   1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl     |   1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl     |   1 +
 arch/parisc/kernel/syscalls/syscall.tbl       |   1 +
 arch/powerpc/kernel/syscalls/syscall.tbl      |   1 +
 arch/s390/kernel/syscalls/syscall.tbl         |   1 +
 arch/sh/kernel/syscalls/syscall.tbl           |   1 +
 arch/sparc/kernel/syscalls/syscall.tbl        |   1 +
 arch/x86/entry/syscalls/syscall_32.tbl        |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl        |   1 +
 arch/xtensa/kernel/syscalls/syscall.tbl       |   1 +
 fs/open.c                                     |  78 ++++
 include/linux/fs.h                            |   1 +
 include/linux/syscalls.h                      |   2 +
 include/uapi/asm-generic/unistd.h             |   4 +-
 include/uapi/linux/trusted-for.h              |  18 +
 kernel/sysctl.c                               |  12 +-
 tools/testing/selftests/Makefile              |   1 +
 .../testing/selftests/interpreter/.gitignore  |   2 +
 tools/testing/selftests/interpreter/Makefile  |  21 +
 tools/testing/selftests/interpreter/config    |   1 +
 .../selftests/interpreter/trust_policy_test.c | 362 ++++++++++++++++++
 30 files changed, 568 insertions(+), 4 deletions(-)
 create mode 100644 include/uapi/linux/trusted-for.h
 create mode 100644 tools/testing/selftests/interpreter/.gitignore
 create mode 100644 tools/testing/selftests/interpreter/Makefile
 create mode 100644 tools/testing/selftests/interpreter/config
 create mode 100644 tools/testing/selftests/interpreter/trust_policy_test.c


base-commit: 9e1ff307c779ce1f0f810c7ecce3d95bbae40896

Comments

Kees Cook Oct. 8, 2021, 10:47 p.m. UTC | #1
On Fri, Oct 08, 2021 at 12:48:37PM +0200, Mickaël Salaün wrote:
> This patch series is mainly a rebase on v5.15-rc4 with some cosmetic
> changes suggested by Kees Cook.  Andrew, can you please consider to
> merge this into your tree?

Thanks for staying on this series! This is a good step in the right
direction for finally plugging the "interpreter" noexec hole. I'm pretty
sure Chrome OS will immediately use this as they've been carrying
similar functionality for a long time.

Andrew Morton Oct. 10, 2021, 9:48 p.m. UTC | #2
On Fri,  8 Oct 2021 12:48:37 +0200 Mickaël Salaün <mic@digikod.net> wrote:

> The final goal of this patch series is to enable the kernel to be a
> global policy manager by entrusting processes with access control at
> their level.  To reach this goal, two complementary parts are required:
> * user space needs to be able to know if it can trust some file
>   descriptor content for a specific usage;
> * and the kernel needs to make available some part of the policy
>   configured by the system administrator.

Apologies if I missed this...

It would be nice to see a description of the proposed syscall interface
in these changelogs!  Then a few questions I have will be answered...

long trusted_for(const int fd,
		 const enum trusted_for_usage usage,
		 const u32 flags)

- `usage' must be equal to TRUSTED_FOR_EXECUTION, so why does it
  exist?  Some future modes are planned?  Please expand on this.

- `flags' is unused (must be zero).  So why does it exist?  What are
  the plans here?

- what values does the syscall return and what do they mean?

Mickaël Salaün Oct. 11, 2021, 8:47 a.m. UTC | #3
On 10/10/2021 23:48, Andrew Morton wrote:
> On Fri,  8 Oct 2021 12:48:37 +0200 Mickaël Salaün <mic@digikod.net> wrote:
> 
>> The final goal of this patch series is to enable the kernel to be a
>> global policy manager by entrusting processes with access control at
>> their level.  To reach this goal, two complementary parts are required:
>> * user space needs to be able to know if it can trust some file
>>   descriptor content for a specific usage;
>> * and the kernel needs to make available some part of the policy
>>   configured by the system administrator.
> 
> Apologies if I missed this...
> 
> It would be nice to see a description of the proposed syscall interface
> in these changelogs!  Then a few questions I have will be answered...

I described this syscall and its semantics in the first patch, in
Documentation/admin-guide/sysctl/fs.rst.
Do you want me to copy-paste this content into the cover letter?

> 
> long trusted_for(const int fd,
> 		 const enum trusted_for_usage usage,
> 		 const u32 flags)
> 
> - `usage' must be equal to TRUSTED_FOR_EXECUTION, so why does it
>   exist?  Some future modes are planned?  Please expand on this.

Indeed, the current use case is to check if the kernel would allow
execution of a file. But as Florian pointed out, we may want to add more
context in the future, e.g. to enforce signature verification, to check
if this is a legitimate (system) library, to check if the file is
allowed to be used as (trusted) configuration…

> 
> - `flags' is unused (must be zero).  So why does it exist?  What are
>   the plans here?

This is mostly to follow good syscall practices for extensibility. It
could be used in combination with the usage argument (which defines the
user-space semantics), e.g. to check for extra properties such as
cryptographic or integrity requirements, origin of the file…

> 
> - what values does the syscall return and what do they mean?
> 

It returns 0 on success, or -EACCES if the kernel policy denies the
specified usage.
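
Seen from user space through the libc syscall(2) wrapper, this becomes a
return value of 0, or -1 with errno set to EACCES.  A caller might map
the outcome to a decision as sketched below.  The enum and function names
are hypothetical, and treating ENOSYS (a kernel without this syscall) as
a separate case is an assumption about what an interpreter may want.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical decision type for an interpreter. */
enum script_decision {
	SCRIPT_ALLOW,		/* the kernel policy trusts the content */
	SCRIPT_DENY,		/* the kernel policy (or an error) says no */
	SCRIPT_UNSUPPORTED,	/* the kernel does not know the syscall */
};

/*
 * ret is the value returned by the syscall wrapper (0 or -1) and err is
 * the errno value observed right after the call.
 */
static enum script_decision classify_result(long ret, int err)
{
	assert(ret == 0 || ret == -1);
	if (ret == 0)
		return SCRIPT_ALLOW;
	if (err == ENOSYS)
		return SCRIPT_UNSUPPORTED;
	/* EACCES, and any unexpected error, is treated as a denial. */
	return SCRIPT_DENY;
}
```

Whether SCRIPT_UNSUPPORTED should end up allowing or refusing the script
is a user-space policy choice and is left open here.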

Andrew Morton Oct. 11, 2021, 9:07 p.m. UTC | #4
On Mon, 11 Oct 2021 10:47:04 +0200 Mickaël Salaün <mic@digikod.net> wrote:

> 
> On 10/10/2021 23:48, Andrew Morton wrote:
> > On Fri,  8 Oct 2021 12:48:37 +0200 Mickaël Salaün <mic@digikod.net> wrote:
> > 
> >> The final goal of this patch series is to enable the kernel to be a
> >> global policy manager by entrusting processes with access control at
> >> their level.  To reach this goal, two complementary parts are required:
> >> * user space needs to be able to know if it can trust some file
> >>   descriptor content for a specific usage;
> >> * and the kernel needs to make available some part of the policy
> >>   configured by the system administrator.
> > 
> > Apologies if I missed this...
> > 
> > It would be nice to see a description of the proposed syscall interface
> > in these changelogs!  Then a few questions I have will be answered...
> 
> I described this syscall and it's semantic in the first patch in
> Documentation/admin-guide/sysctl/fs.rst

Well, kinda.  It didn't explain why the `usage' and `flags' arguments
exist and what are the plans for them.

> Do you want me to copy-paste this content in the cover letter?

That would be best please.  It's basically the most important thing
when reviewing the implementation.

> > 
> > long trusted_for(const int fd,
> > 		 const enum trusted_for_usage usage,
> > 		 const u32 flags)
> > 
> > - `usage' must be equal to TRUSTED_FOR_EXECUTION, so why does it
> >   exist?  Some future modes are planned?  Please expand on this.
> 
> Indeed, the current use case is to check if the kernel would allow
> execution of a file. But as Florian pointed out, we may want to add more
> context in the future, e.g. to enforce signature verification, to check
> if this is a legitimate (system) library, to check if the file is
> allowed to be used as (trusted) configuration…
> 
> > 
> > - `flags' is unused (must be zero).  So why does it exist?  What are
> >   the plans here?
> 
> This is mostly to follow syscall good practices for extensibility. It
> could be used in combination with the usage argument (which defines the
> user space semantic), e.g. to check for extra properties such as
> cryptographic or integrity requirements, origin of the file…
> 
> > 
> > - what values does the syscall return and what do they mean?
> > 
> 
> It returns 0 on success, or -EACCES if the kernel policy denies the
> specified usage.

And please document all of this in the changelog also.